Conformalized semi-supervised random forest for classification and abnormality detection

02/04/2023
by Yujin Han, et al.

Traditional classifiers infer labels under the premise that the training and test samples are generated from the same distribution. This assumption can be problematic for safety-critical applications such as medical diagnosis and network attack detection. In this paper, we consider the multi-class classification problem when the training data and the test data may have different distributions. We propose the conformalized semi-supervised random forest (CSForest), which constructs set-valued predictions C(x) that include the correct class label with a desired probability while detecting outliers efficiently. We compare the proposed method with other state-of-the-art methods on both a synthetic example and a real-data application to demonstrate the strength of our proposal.
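
To make the idea of a set-valued prediction C(x) concrete, the following is a minimal sketch of generic split conformal prediction. It is not the CSForest procedure, whose details are not given in this abstract. The function names, the nonconformity score (1 minus the estimated class probability), the calibration scores, and the convention of reading an empty set as a possible outlier are all assumptions made for illustration.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: every label is kept.
        return float("inf")
    return sorted(cal_scores)[k - 1]


def prediction_set(class_probs, threshold):
    """Keep each label whose score 1 - p does not exceed the threshold."""
    return {label for label, p in class_probs.items() if 1.0 - p <= threshold}


# Hypothetical calibration scores: 1 - (estimated probability of the true
# class) on held-out labeled samples. These numbers are made up.
cal_scores = [0.05, 0.10, 0.20, 0.30, 0.45, 0.55, 0.70]
threshold = conformal_threshold(cal_scores, alpha=0.25)

# Hypothetical class-probability estimates for three test points.
confident = prediction_set({"a": 0.90, "b": 0.07, "c": 0.03}, threshold)
ambiguous = prediction_set({"a": 0.48, "b": 0.47, "c": 0.05}, threshold)
diffuse = prediction_set({"a": 0.40, "b": 0.35, "c": 0.25}, threshold)

# Assumption of this sketch: an empty set marks the point as a possible outlier.
for name, labels in [("confident", confident), ("ambiguous", ambiguous), ("diffuse", diffuse)]:
    flag = " (possible outlier)" if not labels else ""
    print(name, sorted(labels), flag)
```

Under the usual exchangeability assumption, a threshold chosen this way yields sets that contain the true label with probability at least 1 - alpha. The abstract's setting, in which the test distribution may differ from the training distribution, is exactly where this plain version is no longer sufficient.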
