Inference of Media Bias and Content Quality Using Natural-Language Processing

by Zehan Chao, et al.

Media bias can significantly impact the formation and development of opinions and sentiments in a population. It is thus important to study the emergence and development of partisan media and political polarization. However, it is challenging to quantitatively infer the ideological positions of media outlets. In this paper, we present a quantitative framework to infer both the political bias and the content quality of media outlets from text, and we illustrate this framework with experiments on real-world data. We apply a bidirectional long short-term memory (LSTM) neural network to a data set of more than 1 million tweets to generate a two-dimensional ideological-bias and content-quality measurement for each tweet. We then infer a “media-bias chart” of (bias, quality) coordinates for the media outlets by integrating the (bias, quality) measurements of each outlet's tweets. We also apply a variety of baseline machine-learning methods, such as a naive-Bayes method and a support-vector machine (SVM), to infer the bias and quality values for each tweet. All of these baselines use a bag-of-words representation. We find that the LSTM-network approach has the best performance of the examined methods. Our results illustrate the importance of incorporating word order into machine-learning methods for text analysis.
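The abstract's two central ideas can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the tuple layout, and the numeric scale of the scores are hypothetical, and the "integrating" step is assumed here to be an unweighted mean of the per-tweet values. The first function shows why a bag-of-words representation cannot see word order; the second shows one way per-tweet (bias, quality) measurements could be combined into outlet coordinates.

```python
from collections import Counter, defaultdict
from statistics import fmean


def bag_of_words(text):
    """Count lowercase whitespace-separated tokens; word order is discarded."""
    return Counter(text.lower().split())


def outlet_coordinates(tweet_scores):
    """Average per-tweet (bias, quality) pairs for each outlet.

    tweet_scores: iterable of (outlet, bias, quality) tuples.
    Assumption: an unweighted mean stands in for the aggregation step.
    """
    grouped = defaultdict(list)
    for outlet, bias, quality in tweet_scores:
        grouped[outlet].append((bias, quality))
    return {
        outlet: (fmean(b for b, _ in pairs), fmean(q for _, q in pairs))
        for outlet, pairs in grouped.items()
    }
```

Under these assumptions, `bag_of_words("dog bites man")` and `bag_of_words("man bites dog")` are equal, which is the kind of information loss that a sequence model such as an LSTM can avoid. Calling `outlet_coordinates` on a list of scored tweets returns one (bias, quality) point per outlet, which is the shape of data a media-bias chart needs.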

