Assessing the Reliability of Deep Learning Classifiers Through Robustness Evaluation and Operational Profiles

by Xingyu Zhao, et al.

The utilisation of Deep Learning (DL) is advancing into increasingly sophisticated applications. While it shows great potential to provide transformational capabilities, DL also raises new challenges regarding its reliability in critical functions. In this paper, we present a model-agnostic reliability assessment method for DL classifiers, based on evidence from robustness evaluation and the operational profile (OP) of a given application. We partition the input space into small cells and then "assemble" their robustness (to the ground truth) according to the OP, providing estimators for both the cells' robustness and the OP. Reliability estimates in terms of the probability of misclassification per input (pmi) can then be derived, together with confidence levels. A prototype tool is demonstrated with simplified case studies, and model assumptions and the extension to real-world applications are discussed. While our model readily exposes the inherent difficulties of assessing DL dependability (e.g., lack of data with ground truth and scalability issues), we provide preliminary/compromised solutions to advance in this research direction.
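The "assembling" step in the abstract can be sketched as a weighted sum: each cell contributes its estimated unrobustness (probability of misclassification within the cell), weighted by the probability, under the OP, that an operational input falls in that cell. The following is a minimal illustrative sketch, not the authors' tool; the cell weights and unrobustness values are hypothetical numbers chosen for illustration.

```python
# Hypothetical cells of the input space, each as a pair:
#   (OP weight: probability an operational input falls in this cell,
#    unrobustness: estimated probability of misclassification within the cell)
cells = [
    (0.60, 0.001),
    (0.30, 0.010),
    (0.10, 0.050),
]

def pmi(cells):
    """Probability of misclassification per input: OP-weighted sum of
    per-cell unrobustness estimates."""
    total_weight = sum(w for w, _ in cells)
    # The OP is a probability distribution over cells, so weights must sum to 1.
    assert abs(total_weight - 1.0) < 1e-9, "OP weights must sum to 1"
    return sum(w * u for w, u in cells)

print(pmi(cells))  # 0.6*0.001 + 0.3*0.010 + 0.1*0.050 = 0.0086
```

In the paper itself, the per-cell estimates come with confidence bounds rather than point values, so the overall pmi is reported together with a confidence level rather than as a single number as above.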



Code Repositories


The Reliability Assessment Model for Deep Learning systems

