Anupam Datta

research

∙ 08/28/2023

Identifying and Mitigating the Security Risks of Generative AI

Every major technical invention resurfaces the dual-use dilemma – the ne...

0 Clark Barrett, et al. ∙

research

∙ 06/01/2022

Order-sensitive Shapley Values for Evaluating Conceptual Soundness of NLP Models

Previous works show that deep NLP models are not always conceptually sou...

0 Kaiji Lu, et al. ∙

research

∙ 05/24/2022

Faithful Explanations for Deep Graph Models

This paper studies faithful explanations for Graph Neural Networks (GNNs...

1 Zifan Wang, et al. ∙

research

∙ 10/06/2021

Consistent Counterfactuals for Deep Models

Counterfactual examples are one of the most commonly-cited methods for e...

0 Emily Black, et al. ∙

research

∙ 03/20/2021

Boundary Attributions Provide Normal (Vector) Explanations

Recent work on explaining Deep Neural Networks (DNNs) focuses on attribu...

8 Zifan Wang, et al. ∙

research

∙ 11/02/2020

Abstracting Influence Paths for Explaining (Contextualization of) BERT Models

While "attention is all you need" may be proving true, we do not yet kno...

0 Kaiji Lu, et al. ∙

research

∙ 09/17/2020

Towards Behavior-Level Explanation for Deep Reinforcement Learning

While Deep Neural Networks (DNNs) are becoming the state-of-the-art for ...

23 Xuan Chen, et al. ∙

research

∙ 06/14/2020

Fairness Under Feature Exemptions: Counterfactual and Observational Measures

With the growing use of AI in highly consequential domains, the quantifi...

0 Sanghamitra Dutta, et al. ∙

research

∙ 06/11/2020

Smoothed Geometry for Robust Attribution

Feature attributions are a popular tool for explaining the behavior of D...

6 Zifan Wang, et al. ∙

research

∙ 05/03/2020

Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models

LSTM-based recurrent neural networks are the state-of-the-art for many n...

0 Kaiji Lu, et al. ∙

research

∙ 02/19/2020

Interpreting Interpretations: Organizing Attribution Methods by Criteria

Attribution methods that explain the behaviour of machine learning mode...

0 Zifan Wang, et al. ∙

research

∙ 12/21/2018

Feature-Wise Bias Amplification

We study the phenomenon of bias amplification in classifiers, wherein a ...

0 Klas Leino, et al. ∙

research

∙ 10/16/2018

Hunting for Discriminatory Proxies in Linear Regression Models

A machine learning model may exhibit discrimination when used to make de...

0 Samuel Yeom, et al. ∙

research

∙ 08/06/2018

Correspondences between Privacy and Nondiscrimination: Why They Should Be Studied Together

Privacy and nondiscrimination are related but different. We make this ob...

0 Anupam Datta, et al. ∙

research

∙ 07/31/2018

Gender Bias in Neural Natural Language Processing

We examine whether neural natural language processing (NLP) systems refl...

0 Kaiji Lu, et al. ∙

research

∙ 03/28/2018

Supervising Feature Influence

Causal influence measures for machine learnt classifiers shed light on t...

0 Shayak Sen, et al. ∙

research

∙ 02/11/2018

Influence-Directed Explanations for Deep Convolutional Networks

We study the problem of explaining a rich class of behavioral properties...

0 Klas Leino, et al. ∙

research

∙ 11/29/2017

Latent Factor Interpretations for Collaborative Filtering

Many machine learning systems utilize latent factors as internal represe...

0 Anupam Datta, et al. ∙

research

∙ 09/27/2017

Case Study: Explaining Diabetic Retinopathy Detection Deep CNNs via Integrated Gradients

In this report, we applied integrated gradients to explaining a neural n...

0 Linyi Li, et al. ∙

