Storywrangler: A massive exploratorium for sociolinguistic, cultural, socioeconomic, and political timelines using Twitter

by Thayer Alshaabi, et al.

In real time, Twitter strongly imprints world events, popular culture, and the day-to-day; Twitter records an ever-growing compendium of language use and change; and Twitter has been shown to enable certain kinds of prediction. Vitally, and absent from many standard corpora such as books and news archives, Twitter also encodes popularity and spreading through retweets. Here, we describe Storywrangler, an ongoing, day-scale curation of over 100 billion tweets containing around 1 trillion 1-grams from 2008 to 2020. For each day, we break tweets into 1-, 2-, and 3-grams across 150+ languages, record usage frequencies, and generate Zipf distributions. We make the data set available through an interactive time series viewer, and as downloadable time series and daily distributions. We showcase a few examples of the many possible avenues of study we aim to enable, including how social amplification can be visualized through 'contagiograms'.
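
To make the daily processing step concrete, the following is a minimal sketch of how one day's tweets might be broken into n-grams, counted, and ranked into a Zipf distribution. It is not the Storywrangler pipeline itself: the function names `ngrams` and `daily_zipf` are hypothetical, and lowercasing plus whitespace splitting is an assumed stand-in for a real multilingual tokenizer.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def daily_zipf(tweets, n):
    """Count n-grams over one day's tweets and rank them by frequency.

    Returns a list of (rank, ngram, count, relative_frequency) tuples.
    """
    counts = Counter()
    for text in tweets:
        # Assumption: lowercase + whitespace split replaces a real tokenizer.
        counts.update(ngrams(text.lower().split(), n))
    total = sum(counts.values())
    # Sort by descending count; break ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        (rank, gram, count, count / total)
        for rank, (gram, count) in enumerate(ranked, start=1)
    ]


if __name__ == "__main__":
    day = ["the cat sat", "the cat ran"]  # toy stand-in for one day of tweets
    for n in (1, 2, 3):
        print(n, daily_zipf(day, n))
```

Running the same function once per day and per value of n would yield the kind of daily rank-frequency tables the abstract describes. Language identification and the separation of retweets from original tweets are left out of this sketch.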




