Quality change: norm or exception? Measurement, Analysis and Detection of Quality Change in Wikipedia

11/02/2021
by   Paramita Das, et al.

Wikipedia has become an immensely popular crowd-sourced encyclopedia that disseminates information on a wide range of topics as subscription-free content. It allows anyone to contribute so that articles remain comprehensive and up to date. To enrich content without compromising standards, the Wikipedia community maintains a detailed set of guidelines that contributors are expected to follow. Based on these guidelines, Wikipedia editors categorize articles into several quality classes that reflect increasing adherence to the guidelines. This quality assessment task is laborious and demands platform expertise. As a first objective, in this paper, we study the evolution of a Wikipedia article with respect to these quality scales. Our results show novel, non-intuitive patterns emerging from this exploration. As a second objective, we attempt to develop an automated, data-driven approach for detecting the early signals that influence the quality change of articles. We pose this as a change point detection problem in which we represent an article as a time series of consecutive revisions and encode every revision by a set of intuitive features. Finally, various change point detection algorithms are used to efficiently and accurately detect the future change points. We also perform ablation studies to understand which groups of features are most effective in identifying the change points. To the best of our knowledge, this is the first work that rigorously explores the quality life cycle of English Wikipedia articles from the perspective of quality indicators and provides a novel unsupervised page-level approach to detect quality switches, which can help in automatic content monitoring in Wikipedia, thus contributing significantly to the CSCW community.
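
To make the change point framing concrete, the following is a minimal sketch in Python, not the authors' method. It assumes a single hypothetical feature per revision (here, a made-up reference count) and uses a basic least-squares split to locate one change point; the function names, the feature, and the numbers are all illustrative assumptions, and the paper itself evaluates several change point detection algorithms over a richer feature set.

```python
from typing import List, Optional


def segment_cost(values: List[float]) -> float:
    """Sum of squared deviations from the segment mean (least-squares cost)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_single_change_point(series: List[float], min_size: int = 2) -> Optional[int]:
    """Return the index k that best splits the series into series[:k] and
    series[k:], or None if no split lowers the cost of the unsplit series.

    This is a simple offline, single change point search; it is an
    illustrative stand-in, not an algorithm taken from the paper.
    """
    n = len(series)
    if n < 2 * min_size:
        return None
    best_idx: Optional[int] = None
    best_cost = segment_cost(series)  # cost with no change point
    for k in range(min_size, n - min_size + 1):
        cost = segment_cost(series[:k]) + segment_cost(series[k:])
        if cost < best_cost:
            best_cost = cost
            best_idx = k
    return best_idx


# Hypothetical per-revision feature values (e.g. reference counts), invented
# for illustration only: a stable phase followed by a jump.
revision_feature = [100, 102, 101, 99, 100, 300, 305, 298, 302, 301]
change_at = best_single_change_point(revision_feature)
print(change_at)  # index of the first revision in the new regime
```

In this toy series the detected index marks the first revision after the jump. A series with no variation yields `None`, since no split improves on the unsplit cost.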


research
10/14/2020

NwQM: A neural quality assessment framework for Wikipedia

Millions of people irrespective of socioeconomic and demographic backgro...
research
02/26/2021

Language-agnostic Topic Classification for Wikipedia

A major challenge for many analyses of Wikipedia dynamics – e.g., imbala...
research
09/18/2018

Mind Your POV: Convergence of Articles and Editors Towards Wikipedia's Neutrality Norm

Wikipedia has a strong norm of writing in a 'neutral point of view' (NPO...
research
06/11/2019

StRE: Self Attentive Edit Quality Prediction in Wikipedia

Wikipedia can easily be justified as a behemoth, considering the sheer v...
research
04/02/2015

Eliciting Disease Data from Wikipedia Articles

Traditional disease surveillance systems suffer from several disadvantag...
research
08/15/2021

Measuring Wikipedia Article Quality in One Dimension by Extending ORES with Ordinal Regression

Organizing complex peer production projects and advancing scientific kno...
research
03/23/2017

TokTrack: A Complete Token Provenance and Change Tracking Dataset for the English Wikipedia

We present a dataset that contains every instance of all tokens ( words...
