Explaining the Deep Natural Language Processing by Mining Textual Interpretable Features

by Francesco Ventura et al.

Despite the high accuracy offered by state-of-the-art deep natural-language models (e.g., LSTM, BERT), their application in real-life settings is still widely limited, as they behave like black boxes to the end user. Hence, explainability is rapidly becoming a fundamental requirement of future-generation data-driven systems based on deep-learning approaches. Several attempts have been made to bridge the gap between accuracy and interpretability. However, robust and specialized xAI (Explainable Artificial Intelligence) solutions tailored to deep natural-language models are still missing. We propose a new framework, named T-EBAnO, which provides innovative prediction-local and class-based model-global explanation strategies tailored to black-box deep natural-language models. Given a deep NLP model and the textual input data, T-EBAnO provides an objective, human-readable, domain-specific assessment of the reasons behind the automatic decision-making process. Specifically, the framework extracts sets of interpretable features by mining the inner knowledge of the model. Then, it quantifies the influence of each feature during the prediction process by exploiting the novel normalized Perturbation Influence Relation index at the local level and the novel Global Absolute Influence and Global Relative Influence indexes at the global level. The effectiveness and quality of the local and global explanations obtained with T-EBAnO are demonstrated on (i) a sentiment analysis task performed by a fine-tuned BERT model, and (ii) a toxic comment classification task performed by an LSTM model.
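To make the perturbation-influence idea concrete, the following is a minimal, hypothetical sketch: a feature's local influence is estimated by removing it from the input and measuring the normalized change in the model's predicted probability. Note that the toy model, the `influence` function, and the normalization used here are illustrative assumptions, not the paper's actual nPIR definition, which is given formally in the original work.

```python
import math

def toy_model(tokens):
    """Dummy sentiment 'model' (an assumption, stand-in for BERT/LSTM):
    positive-class probability from simple word counts."""
    pos = {"great", "good", "excellent"}
    neg = {"bad", "awful", "terrible"}
    score = sum(t in pos for t in tokens) - sum(t in neg for t in tokens)
    return 1.0 / (1.0 + math.exp(-score))  # logistic squashing

def influence(tokens, feature, model=toy_model):
    """Perturbation influence of a feature (a set of tokens): normalized
    difference between the original and the perturbed prediction.
    A positive value means the feature pushed the prediction up."""
    p_orig = model(tokens)
    perturbed = [t for t in tokens if t not in feature]
    p_pert = model(perturbed)
    return (p_orig - p_pert) / max(p_orig, p_pert)

text = "the movie was great and the acting excellent".split()
print(influence(text, {"great", "excellent"}))  # > 0: feature drove the positive prediction
```

Replacing `toy_model` with a wrapper around a real fine-tuned classifier, and the token sets with the interpretable features T-EBAnO mines from the model's inner layers, yields the same local-explanation loop the abstract describes.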

