Data-Efficient and Interpretable Tabular Anomaly Detection

03/03/2022
by Chun-Hao Chang, et al.

Anomaly detection (AD) plays an important role in numerous applications. We focus on two understudied aspects of AD that are critical for integration into real-world applications. First, most AD methods cannot incorporate labeled data, which are often available in practice in small quantities and can be crucial for achieving high AD accuracy. Second, most AD methods are not interpretable, a bottleneck that prevents stakeholders from understanding why a sample is flagged as anomalous. In this paper, we propose a novel AD framework, DIAD, that adapts a white-box model class, Generalized Additive Models, to detect anomalies using a partial identification objective, which naturally handles noisy or heterogeneous features. In addition, DIAD can incorporate a small amount of labeled data to further boost anomaly detection performance in semi-supervised settings. We demonstrate the superiority of our framework over previous work in both unsupervised and semi-supervised settings on diverse tabular datasets. For example, with 5 labeled anomalies, DIAD improves AUC from 86.2% to 89.4% by learning AD from unlabeled data. We also present insightful interpretations that explain why DIAD deems certain samples anomalous.
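To make the interpretability claim concrete, here is a minimal sketch of an additive anomaly score, where the total score is a sum of per-feature terms that can be inspected one by one. This is not DIAD and does not implement its partial identification objective: it assumes a much simpler stand-in in which each feature's term is the negative log of a smoothed histogram density. The function names `fit_additive_scorer` and `feature_contributions`, the bin count, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_additive_scorer(X, n_bins=10):
    """Fit one histogram per feature; returns a list of (edges, neg_log_density)."""
    shape_fns = []
    for j in range(X.shape[1]):
        counts, edges = np.histogram(X[:, j], bins=n_bins)
        # Laplace smoothing so that empty bins get a finite, large score.
        probs = (counts + 1.0) / (counts.sum() + n_bins)
        shape_fns.append((edges, -np.log(probs)))
    return shape_fns

def feature_contributions(shape_fns, X):
    """Per-sample, per-feature terms; the anomaly score is their row sum."""
    contribs = np.empty(X.shape, dtype=float)
    for j, (edges, scores) in enumerate(shape_fns):
        # Map each value to its bin; out-of-range values clip to the end bins.
        idx = np.searchsorted(edges, X[:, j], side="right") - 1
        idx = np.clip(idx, 0, len(scores) - 1)
        contribs[:, j] = scores[idx]
    return contribs

# Synthetic, assumed data: 500 "normal" rows with 3 standard-normal features.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 3))
shape_fns = fit_additive_scorer(X_train)

# One typical row and one row that is extreme in feature 1 only.
x_query = np.array([[0.0, 0.0, 0.0], [0.0, 8.0, 0.0]])
contribs = feature_contributions(shape_fns, x_query)
scores = contribs.sum(axis=1)  # additive score: sum of per-feature terms
```

Because the score is a plain sum, `contribs[1]` shows directly which feature drives the second row's higher score, which is the kind of per-feature explanation an additive model makes possible.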


research
08/31/2022

Deep Anomaly Detection and Search via Reinforcement Learning

Semi-supervised Anomaly Detection (AD) is a kind of data mining task whi...
research
11/30/2022

SPADE: Semi-supervised Anomaly Detection under Distribution Mismatch

Semi-supervised anomaly detection is a common problem, as often the data...
research
03/08/2018

A New Model for Evaluating Range-Based Anomaly Detection Algorithms

Classical anomaly detection (AD) is principally concerned with point-bas...
research
04/20/2021

What is Wrong with One-Class Anomaly Detection?

From a safety perspective, a machine learning method embedded in real-wo...
research
02/15/2023

Deep Anomaly Detection under Labeling Budget Constraints

Selecting informative data points for expert feedback can significantly ...
research
03/10/2023

Learning Global-Local Correspondence with Semantic Bottleneck for Logical Anomaly Detection

This paper presents a novel framework, named Global-Local Correspondence...
research
06/06/2023

Efficient Anomaly Detection with Budget Annotation Using Semi-Supervised Residual Transformer

Anomaly Detection is challenging as usually only the normal samples are ...
