Quantifying Overfitting: Evaluating Neural Network Performance through Analysis of Null Space

by Hossein Rezaei, et al.

Machine learning models that are overfitted or overtrained are more vulnerable to knowledge leakage, which poses a privacy risk. Suppose we download or receive a model from a third-party collaborator without knowing its training accuracy: how can we determine whether it has been overfitted or overtrained on its training data? The model may even have been intentionally overtrained to make it vulnerable at test time. While an overfitted or overtrained model may perform well on testing data and even on some generalization tests, we still cannot be certain it is not overfitted, and conducting a comprehensive generalization test is expensive. The goal of this paper is to address these issues and to assess a model's privacy and generalization using only testing data. To achieve this, we analyze the null space of the last layer of neural networks, which enables us to quantify overfitting without access to the training data or knowledge of the model's accuracy on them. We evaluated our approach on various architectures and datasets and observed a distinct pattern in the null-space angle when models are overfitted. Furthermore, we show that models with poor generalization exhibit specific characteristics in this space. To the best of our knowledge, our work is the first attempt to quantify overfitting without access to the training data or any knowledge of the training samples.
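The abstract does not spell out the paper's exact procedure, but the core idea of analyzing the null space of the last layer can be sketched roughly as follows: take the final linear layer's weight matrix, compute an orthonormal basis of its null space, and measure the angle between test-time feature vectors and that subspace. The matrix shapes and random stand-in data below are illustrative assumptions, not the authors' setup.

```python
import numpy as np

# Stand-ins (hypothetical): a trained last-layer weight matrix W (classes x features)
# and penultimate-layer activations for a batch of test samples.
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 64))          # 10 classes, 64-dim features
features = rng.normal(size=(128, 64))  # 128 test samples

# Orthonormal basis of the (right) null space of W via SVD:
# rows of Vt beyond the numerical rank span {x : Wx = 0}.
_, s, Vt = np.linalg.svd(W)
rank = int(np.sum(s > 1e-10))
null_basis = Vt[rank:].T               # shape: (64, 64 - rank)

# Angle between each feature vector and its projection onto the null space.
# A small angle means the feature lies mostly in the null space, i.e. it
# contributes little to the layer's output.
proj = features @ null_basis @ null_basis.T
cos = np.linalg.norm(proj, axis=1) / np.linalg.norm(features, axis=1)
angles = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
print(angles.mean())
```

Under this sketch, the distribution of `angles` on testing data is the kind of signal one could inspect for the distinct pattern the paper reports for overfitted models.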
