Storywrangler: A massive exploratorium for sociolinguistic, cultural, socioeconomic, and political timelines using Twitter.

Thayer Alshaabi^1,2,3, Jane L Adams^4,2, Michael V Arnold^4,2, Joshua R Minot^4,2, David R Dewhurst^4,2,5, Andrew J Reagan^6, Christopher M Danforth^4,2,3, Peter Sheridan Dodds^1,2,3.

Abstract

In real time, Twitter strongly imprints world events, popular culture, and the day-to-day, recording an ever-growing compendium of language change. Vitally, and absent from many standard corpora such as books and news archives, Twitter also encodes popularity and spreading through retweets. Here, we describe Storywrangler, an ongoing curation of over 100 billion tweets containing 1 trillion 1-grams from 2008 to 2021. For each day, we break tweets into 1-, 2-, and 3-grams across 100+ languages, generating frequencies for words, hashtags, handles, numerals, symbols, and emojis. We make the dataset available through an interactive time series viewer and as downloadable time series and daily distributions. Although Storywrangler leverages Twitter data, our method of tracking dynamic changes in n-grams can be extended to any temporally evolving corpus. Illustrating the instrument's potential, we present example use cases including social amplification, the sociotechnical dynamics of famous individuals, box office success, and social unrest.
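The per-day n-gram counting described in the abstract can be illustrated with a minimal Python sketch. It is not the Storywrangler pipeline: the whitespace tokenizer, the lowercasing, the function names (`ngrams`, `daily_ngram_counts`), and the sample tweets are all assumptions made for illustration, and the real system would need a tokenizer that handles hashtags, handles, emojis, and 100+ languages.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return contiguous n-grams from a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def daily_ngram_counts(tweets, n):
    """Map each day to a Counter of n-gram frequencies.

    `tweets` is an iterable of (day, text) pairs (hypothetical input format).
    """
    counts = {}
    for day, text in tweets:
        # Assumption: naive lowercase + whitespace tokenization.
        tokens = text.lower().split()
        counts.setdefault(day, Counter()).update(ngrams(tokens, n))
    return counts


# Made-up sample data, for illustration only.
sample = [
    ("2021-01-01", "happy new year"),
    ("2021-01-01", "Happy New Year everyone"),
    ("2021-01-02", "back to work"),
]

one_grams = daily_ngram_counts(sample, 1)
two_grams = daily_ngram_counts(sample, 2)
```

Keeping one `Counter` per day mirrors the idea of daily distributions: the frequency of a given n-gram can then be read off day by day to form a time series.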

Copyright © 2021 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution License 4.0 (CC BY).


Year: 2021 PMID: 34272243 DOI: 10.1126/sciadv.abe6534

Source DB: PubMed Journal: Sci Adv ISSN: 2375-2548 Impact factor: 14.136

3 in total

1. Augmenting Semantic Lexicons Using Word Embeddings and Transfer Learning.

Authors: Thayer Alshaabi; Colin M Van Oort; Mikaela Irene Fudolig; Michael V Arnold; Christopher M Danforth; Peter Sheridan Dodds
Journal: Front Artif Intell Date: 2022-01-24

2. Quantifying Changes in the Language Used Around Mental Health on Twitter Over 10 Years: Observational Study.

Authors: Anne Marie Stupinski; Thayer Alshaabi; Michael V Arnold; Jane Lydia Adams; Joshua R Minot; Matthew Price; Peter Sheridan Dodds; Christopher M Danforth
Journal: JMIR Ment Health Date: 2022-03-30

3. Computational timeline reconstruction of the stories surrounding Trump: Story turbulence, narrative control, and collective chronopathy.

Authors: Peter Sheridan Dodds; Joshua R Minot; Michael V Arnold; Thayer Alshaabi; Jane Lydia Adams; Andrew J Reagan; Christopher M Danforth
Journal: PLoS One Date: 2021-12-08 Impact factor: 3.240
