
Are Models Trained on Indian Legal Data Fair?

by Sahil Girhepuje, et al.

Recent advances in language technology and artificial intelligence have enabled success across domains such as law, medicine, and mental health. AI-based language models for tasks such as judgement prediction have recently been proposed for the legal sector. However, these models are rife with encoded social biases picked up from the training data. While bias and fairness have been studied across NLP, most studies situate themselves within a Western context. In this work, we present an initial investigation of fairness from the Indian perspective in the legal domain. We highlight the propagation of learnt algorithmic biases in the bail prediction task for models trained on Hindi legal documents. We evaluate the fairness gap using demographic parity and show that a decision tree model trained for the bail prediction task has an overall fairness disparity of 0.237 between input features associated with Hindus and Muslims. Additionally, we highlight the need for further research on fairness and bias in applying AI to the legal sector, with a specific focus on the Indian context.
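
Demographic parity, the metric named in the abstract, compares how often a model predicts the favourable outcome for each group. The sketch below illustrates the standard definition only: the abstract does not spell out how the authors compute their "overall fairness disparity", so the function name `demographic_parity_gap`, the group labels, and the toy predictions are all assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def demographic_parity_gap(predictions, groups, positive=1):
    """Return (gap, rates) for a set of predictions.

    rates maps each group to its share of positive predictions;
    gap is the largest rate minus the smallest rate.
    """
    counts = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        counts[group] += 1
        if pred == positive:
            positives[group] += 1
    rates = {g: positives[g] / counts[g] for g in counts}
    gap = max(rates.values()) - min(rates.values())
    return gap, rates


# Hypothetical toy data: 1 = bail granted, 0 = bail denied.
preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

gap, rates = demographic_parity_gap(preds, groups)
print(rates, gap)
```

A gap of 0 would mean both groups receive the favourable prediction at the same rate; a larger gap indicates a larger disparity under this definition.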


Identifying biases in legal data: An algorithmic fairness perspective

The need to address representation biases and sentencing disparities in ...

Extracting Fairness Policies from Legal Documents

Machine Learning community is recently exploring the implications of bia...

On the Fairness of 'Fake' Data in Legal AI

The economics of smaller budgets and larger case numbers necessitates th...

Equality before the Law: Legal Judgment Consistency Analysis for Fairness

In a legal system, judgment consistency is regarded as one of the most i...

Re-contextualizing Fairness in NLP: The Case of India

Recent research has revealed undesirable biases in NLP data and models. ...

Lex Rosetta: Transfer of Predictive Models Across Languages, Jurisdictions, and Legal Domains

In this paper, we examine the use of multi-lingual sentence embeddings t...