Shapley Homology: Topological Analysis of Sample Influence for Neural Networks

by Kaixuan Zhang, et al.

Data samples collected for training machine learning models are typically assumed to be independent and identically distributed (iid). Recent research has shown that this assumption can be problematic because it oversimplifies the manifold of structured data. This observation has motivated research in areas such as data poisoning, model improvement, and the explanation of machine learning models. In this work, we study the influence of a sample on determining the intrinsic topological features of its underlying manifold. We propose the Shapley Homology framework, which provides a quantitative metric for the influence of a sample on the homology of a simplicial complex. By interpreting the influence as a probability measure, we further define an entropy that reflects the complexity of the data manifold. Our empirical studies use the 0-dimensional homology and show two results: on neighboring graphs, samples with higher influence scores have more impact on the accuracy of neural networks in determining graph connectivity; and on several regular grammars, higher entropy values imply greater difficulty in being learned.
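To make the idea concrete, the following is a minimal sketch of how an influence score and entropy of this kind could be computed on a small graph. It rests on assumptions that the abstract does not confirm: the characteristic function is taken to be the 0-dimensional Betti number (the number of connected components) of the subgraph induced by a subset of vertices, the influence of a vertex is its absolute Shapley value normalized to sum to one, and the entropy is the Shannon entropy of that distribution. The function names `betti0`, `shapley_values`, and `influence_and_entropy` are hypothetical, and the exact enumeration over all orderings is only feasible for very small graphs.

```python
from itertools import permutations
from math import factorial, log


def betti0(vertices, edges):
    """Number of connected components of the subgraph induced by `vertices`."""
    vertices = set(vertices)
    parent = {v: v for v in vertices}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = len(vertices)
    for a, b in edges:
        if a in vertices and b in vertices:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[ra] = rb
                components -= 1
    return components


def shapley_values(vertices, edges):
    """Exact Shapley values: average marginal change in betti0 over all orderings."""
    totals = {v: 0.0 for v in vertices}
    for order in permutations(vertices):
        seen = []
        previous = 0  # betti0 of the empty vertex set
        for v in order:
            seen.append(v)
            current = betti0(seen, edges)
            totals[v] += current - previous
            previous = current
    n_orders = factorial(len(vertices))
    return {v: t / n_orders for v, t in totals.items()}


def influence_and_entropy(vertices, edges):
    """Assumed normalization: absolute Shapley values rescaled to sum to one."""
    phi = shapley_values(vertices, edges)
    total = sum(abs(x) for x in phi.values())
    influence = {v: abs(x) / total for v, x in phi.items()}
    entropy = -sum(p * log(p) for p in influence.values() if p > 0)
    return influence, entropy


if __name__ == "__main__":
    # Path graph 0 - 1 - 2.
    influence, entropy = influence_and_entropy([0, 1, 2], [(0, 1), (1, 2)])
    print(influence, entropy)
```

Under these assumptions, on the path graph 0 - 1 - 2 the two endpoints each receive influence 0.5 and the middle vertex receives 0, while on a triangle all three vertices receive equal influence and the entropy is correspondingly higher.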




