Influential Sample Selection: A Graph Signal Processing Approach

by Rushil Anirudh, et al.

With the growing complexity of machine learning techniques, understanding the functioning of black-box models is more important than ever. A recently popular strategy for interpretability is to generate explanations based on examples -- called influential samples -- that have the largest influence on the model's observed behavior. However, such an analysis confronts us with a plethora of influence metrics. While each of these metrics provides varying levels of representativeness and diversity, existing approaches implicitly couple the definition of influence to their sample selection algorithm, making it challenging to generalize to specific analysis needs. In this paper, we propose a generic approach to influential sample selection, which analyzes the influence metric as a function on a graph constructed from the samples. We show that the samples critical to recovering the high-frequency content of this function correspond to the most influential samples. Our approach decouples the influence metric from the actual sample selection technique, and can hence be used with any type of task-specific influence. Through experiments in prototype selection and semi-supervised classification, we show that, even with popularly used influence metrics, our approach produces superior results compared to state-of-the-art methods. Furthermore, we demonstrate how a novel influence metric can be used to characterize the decision surface and to recover corrupted labels efficiently.
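The graph-signal view in the abstract can be illustrated with a minimal sketch. This is a hypothetical illustration, not the authors' algorithm: it builds a kNN graph over the samples, treats a given influence metric as a signal f on that graph, and ranks each sample by its node-wise contribution to the signal's high-frequency energy f^T L f (the graph Laplacian quadratic form, a standard measure of signal variation). The function name, parameters, and the particular ranking rule are all assumptions made for the sketch.

```python
import numpy as np

def select_influential(X, influence, k=5, n_select=3):
    """Rank samples by their contribution to the high-frequency
    content of an influence signal defined on a kNN graph.

    Hypothetical sketch: X is an (n, d) array of samples,
    `influence` is any task-specific influence score per sample.
    """
    n = len(X)
    # Pairwise distances and Gaussian edge weights
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    sigma = np.median(d) + 1e-12
    W = np.exp(-(d / sigma) ** 2)
    np.fill_diagonal(W, 0.0)
    # Keep only the k strongest edges per node, then symmetrize
    mask = np.zeros_like(W, dtype=bool)
    idx = np.argsort(-W, axis=1)[:, :k]
    mask[np.arange(n)[:, None], idx] = True
    W = np.where(mask | mask.T, W, 0.0)
    # Combinatorial graph Laplacian L = D - W
    L = np.diag(W.sum(axis=1)) - W
    f = np.asarray(influence, dtype=float)
    # Node-wise terms of f^T L f: each entry measures how much
    # that node contributes to the signal's high-frequency energy
    variation = f * (L @ f)
    return np.argsort(-variation)[:n_select]
```

Samples whose influence values differ sharply from their graph neighbors contribute the most local variation, matching the intuition that high-frequency content identifies the most influential samples.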




