Learn to Accumulate Evidence from All Training Samples: Theory and Practice

by Deep Pandey, et al.

Evidential deep learning, built upon belief theory and subjective logic, offers a principled and computationally efficient way to make a deterministic neural network uncertainty-aware. The resulting evidential models can quantify fine-grained uncertainty using the learned evidence. For an evidential model to be theoretically sound, the evidence must be non-negative, which requires special activation functions for model training and inference. This constraint often leads to inferior predictive performance compared to standard softmax models, making it challenging to extend evidential models to many large-scale datasets. To unveil the real cause of this undesired behavior, we theoretically investigate evidential models and identify a fundamental limitation that explains the inferior performance: existing evidential activation functions create zero-evidence regions, which prevent the model from learning from training samples that fall into such regions. A deeper analysis of evidential activation functions based on our theoretical underpinning inspires the design of a novel regularizer that effectively alleviates this fundamental limitation. Extensive experiments over many challenging real-world datasets and settings confirm our theoretical findings and demonstrate the effectiveness of our proposed approach.
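The zero-evidence limitation can be illustrated with a small numerical sketch. It is not the authors' code: it assumes a ReLU evidence function, the common subjective-logic convention alpha = evidence + 1 with vacuity K / sum(alpha), and a Type II maximum likelihood (log) loss, none of which are specified in the abstract. The function names and the example logits are hypothetical.

```python
import numpy as np

def relu_evidence(logits):
    # Assumed evidential activation: ReLU keeps evidence non-negative.
    return np.maximum(logits, 0.0)

def log_loss(logits, onehot):
    # Assumed convention: Dirichlet parameters alpha = evidence + 1.
    alpha = relu_evidence(logits) + 1.0
    strength = alpha.sum()
    # Type II maximum likelihood loss for a one-hot label.
    return float(np.sum(onehot * (np.log(strength) - np.log(alpha))))

def numeric_grad(logits, onehot, eps=1e-6):
    # Central finite differences of the loss with respect to each logit.
    grad = np.zeros_like(logits)
    for k in range(logits.size):
        step = np.zeros_like(logits)
        step[k] = eps
        grad[k] = (log_loss(logits + step, onehot)
                   - log_loss(logits - step, onehot)) / (2 * eps)
    return grad

def vacuity(logits):
    # Vacuity (lack of evidence): number of classes over Dirichlet strength.
    alpha = relu_evidence(logits) + 1.0
    return logits.size / alpha.sum()

onehot = np.array([1.0, 0.0, 0.0])
dead_logits = np.array([-2.0, -1.0, -3.0])  # every logit negative: zero evidence
live_logits = np.array([2.0, -1.0, -3.0])   # true class has positive evidence

dead_grad = numeric_grad(dead_logits, onehot)
live_grad = numeric_grad(live_logits, onehot)

print("zero-evidence sample:", dead_grad, "vacuity", vacuity(dead_logits))
print("positive-evidence sample:", live_grad, "vacuity", vacuity(live_logits))
```

Under these assumptions, the sample whose logits are all negative yields zero evidence and a zero gradient for every logit, so a gradient step cannot use that sample, while the sample with positive evidence for the true class still produces a nonzero gradient.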




