Concept Drift Detection and Adaptation with Hierarchical Hypothesis Testing
In a streaming environment, there is often a need for statistical prediction models to detect and adapt to concept drifts (i.e., changes in the joint distribution between predictor and response variables) so as to mitigate deteriorating predictive performance over time. Various concept drift detection approaches have been proposed in the past decades. However, they do not perform well across different concept drift types (e.g., gradual or abrupt, recurrent or irregular) and different data stream distributions (e.g., balanced and imbalanced labels). This paper presents a novel framework that can detect and also adapt to the various concept drift types, even in the presence of imbalanced data labels. The framework leverages a hierarchical set of hypothesis tests in an online fashion to detect concept drifts and employs an adaptive training strategy to significantly boost its adaptation capability. The performance of the proposed framework is compared to benchmark approaches using both simulated and real-world datasets spanning the breadth of concept drift types. The proposed approach significantly outperforms benchmark solutions in terms of precision, delay of detection as well as the adaptability across different concepts.
READ FULL TEXT