The State of NLP Literature: A Diachronic Analysis of the ACL Anthology
The ACL Anthology (AA) is a digital repository of tens of thousands of articles on Natural Language Processing (NLP). This paper examines the literature as a whole to identify broad trends in productivity, focus, and impact. It presents the analyses in a sequence of questions and answers. The goal is to record the state of the AA literature: Who and how many of us are publishing? What are we publishing on? Where and in what form are we publishing? And what is the impact of our publications? The answers are usually in the form of numbers, graphs, and inter-connected visualizations. Special emphasis is laid on the demographics and inclusiveness of NLP publishing. Notably, we find that only about 30% of first authors are female, and that this percentage has not improved since the year 2000. We also show that, on average, female first authors are cited less than male first authors, even when controlling for experience. We hope that recording citation and participation gaps across demographic groups will encourage more inclusiveness and fairness in research.