Ensuring Learning Guarantees on Concept Drift Detection with Statistical Learning Theory
Concept Drift (CD) detection intends to continuously identify changes in data stream behaviors, supporting researchers in the study and modeling of real-world phenomena. Motivated by the lack of learning guarantees in current CD algorithms, we decided to take advantage of the Statistical Learning Theory (SLT) to formalize the necessary requirements to ensure probabilistic learning bounds, so drifts would refer to actual changes in data rather than by chance. As discussed along this paper, a set of mathematical assumptions must be held in order to rely on SLT bounds, which are especially controversial in CD scenarios. Based on this issue, we propose a methodology to address those assumptions in CD scenarios and therefore ensure learning guarantees. Complementary, we assessed a set of relevant and known CD algorithms from the literature in light of our methodology. As main contribution, we expect this work to support researchers while designing and evaluating CD algorithms on different domains.
READ FULL TEXT