Zero-shot Code-Mixed Offensive Span Identification through Rationale Extraction

05/12/2022
by   Manikandan Ravikiran, et al.
0

This paper investigates the effectiveness of sentence-level transformers for zero-shot offensive span identification on a code-mixed Tamil dataset. More specifically, we evaluate rationale extraction methods of Local Interpretable Model Agnostic Explanations (LIME) <cit.> and Integrated Gradients (IG) <cit.> for adapting transformer based offensive language classification models for zero-shot offensive span identification. To this end, we find that LIME and IG show baseline F_1 of 26.35% and 44.83%, respectively. Besides, we study the effect of data set size and training process on the overall accuracy of span identification. As a result, we find both LIME and IG to show significant improvement with Masked Data Augmentation and Multilabel Training, with F_1 of 50.23% and 47.38% respectively. Disclaimer : This paper contains examples that may be considered profane, vulgar, or offensive. The examples do not represent the views of the authors or their employers/graduate schools towards any person(s), group(s), practice(s), or entity/entities. Instead they are used to emphasize only the linguistic research challenges.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset

Sign in with Google

×

Use your Google Account to sign in to DeepAI

×

Consider DeepAI Pro