
Reply to Schmidt et al.: A robust surge of cognitive distortions in historical language.

Johan Bollen, Marijn Ten Thij, Fritz Breithaupt, Alexander T J Barron, Lauren A Rutter, Lorenzo Lorenzo-Luaces, Marten Scheffer.

Year:  2021        PMID: 34725168      PMCID: PMC8609227          DOI: 10.1073/pnas.2115842118

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


In their critique, Schmidt et al. (1) claim that our analysis of book language (2) cannot meaningfully reflect society. Their arguments bear no relevance to our paper. The statement that “words in books are not clinical interviews, and word frequencies are not psychiatric assessments” is irrelevant because we make no attempt at clinical diagnosis. Their observation that “Derrida” is a more frequent book word than “The Beatles” is also a red herring: We do not compare between words, but instead follow the dynamics of phrases over time. Lastly, our tracking of cognitive distortion schemata (CDS) markers is not an attempt to identify “negative thoughts” but rather to detect markers of language involved in the expression of distorted thinking.

Furthermore, Schmidt et al. (1) claim that our results are explained by a composition shift of the Google Books data toward more fiction since 2000. We have to disagree. We made our observations relative to a null model that specifically controls for such changes in corpus composition and other recency effects, and reported a robust signal well above that baseline (2). Schmidt et al. (ref. 1, figure 1) perform a linear regression analysis that shows a correlation between a word’s relative frequency in fiction and its rise in prevalence. Because our CDS n-grams (2, 3) are about 43% more prevalent in fiction than in English overall, a shift toward more fiction does increase CDS n-gram prevalence. However, our analyses indicate that the observed rise of fiction in the data would only cause CDS prevalence to increase by 16% from 1980 to 2019. This is much less than the magnitude of the observed shift, and it is an effect already accounted for by our null model.
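The composition argument above can be made concrete with a small mixture calculation. The sketch below is illustrative only. It assumes that the 43% figure is the ratio of CDS prevalence in fiction to CDS prevalence in the full corpus at the baseline year, and that prevalence within fiction and within non-fiction stays fixed while only fiction's share changes. The fiction shares used (10% and 30%) are hypothetical, are not taken from the Google Books data, and are not meant to reproduce the 16% estimate. The function name `composition_only_change` is also an assumption.

```python
def composition_only_change(fiction_ratio, share_before, share_after):
    """Relative change in overall prevalence caused only by a shift in fiction share.

    fiction_ratio: prevalence in fiction divided by prevalence in the full
        corpus at baseline (1.43 means "43% more prevalent in fiction").
    share_before, share_after: fiction's share of the corpus (hypothetical inputs).
    Assumes prevalence inside each genre does not change over time.
    """
    if not (0 <= share_before < 1 and 0 <= share_after <= 1):
        raise ValueError("shares must lie in [0, 1], with share_before < 1")
    if fiction_ratio * share_before > 1:
        raise ValueError("inputs imply a negative non-fiction prevalence")

    overall_before = 1.0  # normalise the baseline prevalence to 1
    fiction = fiction_ratio * overall_before
    # Solve overall = share * fiction + (1 - share) * nonfiction for nonfiction.
    nonfiction = (overall_before - share_before * fiction) / (1 - share_before)
    # Re-mix the two genres with the new fiction share.
    overall_after = share_after * fiction + (1 - share_after) * nonfiction
    return overall_after / overall_before - 1


# Hypothetical shares, for illustration only.
change = composition_only_change(1.43, share_before=0.10, share_after=0.30)
print(f"composition-only change: {change:.1%}")
```

Under these assumptions the change simplifies to (fiction_ratio - 1) * (share_after - share_before) / (1 - share_before), so the size of a pure composition effect is limited by how far fiction's share actually moves.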
Fig. 1.

(Left) Original results published in Bollen et al. (2). (Right) The same analysis with n-gram counts in the Fiction corpus subtracted from the English corpus. The comparison reveals that the original results are robust against the removal of Fiction, and can thus not be explained by the growth of Fiction in the Google Books sample.
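The subtraction described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It assumes per-year counts held in plain dictionaries and that the Fiction corpus is a subset of the English corpus, so counts can be subtracted year by year. All numbers are made up, and `subtract_corpus` and `prevalence` are hypothetical helper names.

```python
def subtract_corpus(full_counts, subset_counts):
    """Per-year counts of the full corpus with the subset corpus removed."""
    remaining = {}
    for year, count in full_counts.items():
        diff = count - subset_counts.get(year, 0)
        if diff < 0:
            # Would mean the "subset" is not actually contained in the full corpus.
            raise ValueError(f"subset count exceeds full count in {year}")
        remaining[year] = diff
    return remaining


def prevalence(ngram_counts, total_counts):
    """Per-year relative frequency: marker n-gram count / total n-gram count."""
    return {
        year: ngram_counts[year] / total_counts[year]
        for year in ngram_counts
        if total_counts.get(year, 0) > 0
    }


# Toy data (hypothetical): CDS marker counts and total n-gram counts per year.
english_cds = {1980: 100, 2019: 300}
fiction_cds = {1980: 20, 2019: 120}
english_total = {1980: 1_000_000, 2019: 2_000_000}
fiction_total = {1980: 100_000, 2019: 600_000}

nonfiction_cds = subtract_corpus(english_cds, fiction_cds)
nonfiction_total = subtract_corpus(english_total, fiction_total)

original = prevalence(english_cds, english_total)
without_fiction = prevalence(nonfiction_cds, nonfiction_total)

for year in sorted(original):
    print(year, f"{original[year]:.2e}", f"{without_fiction[year]:.2e}")
```

Comparing the two prevalence series year by year is the toy analogue of comparing the left and right panels: if the trend survives in the series without fiction, growth of fiction alone cannot account for it.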

Schmidt et al. (ref. 1, figure 2) make another inferential error when they draw conclusions from a correspondence between the prevalence of CDS n-grams and the sum of the log prevalence of their constituent words. These observations are not only compatible with our results but predicted by them: Changes in n-gram prevalence should match those of their constituent words. One cannot write “completely bad” without “completely” and “bad.” Furthermore, both terms individually mark similar cognitive distortion types, and will thus follow a similar trajectory.

Instead of such indirect inferences or speculations, there is a more direct way to test whether our results are caused by a rise of fiction in the database. We remove the entire Fiction corpus from English by subtracting Fiction n-gram counts from those in the English corpus. This analysis (Fig. 1) shows that the dynamics of CDS markers hardly differ from our original results. Along with the null model, this confirms that our results are unlikely to be driven by the growth of fiction in the Google Books sample.

Overly harsh critiques of the emerging field of culturomics carry the risk of throwing the baby out with the bathwater. The millions of books produced over the past centuries are not unbiased reflections of natural language. Yet, they are not uncoupled from social, cultural, and psycholinguistic changes (4–8). This implies a treasure trove of information when interpreted with care.
  8 in total

1.  Uncontrolled corpus composition drives an apparent surge in cognitive distortions.

Authors:  Benjamin Schmidt; Steven T Piantadosi; Kyle Mahowald
Journal:  Proc Natl Acad Sci U S A       Date:  2021-11-09       Impact factor: 11.205

2.  Historical language records reveal a surge of cognitive distortions in recent decades.

Authors:  Johan Bollen; Marijn Ten Thij; Fritz Breithaupt; Alexander T J Barron; Lauren A Rutter; Lorenzo Lorenzo-Luaces; Marten Scheffer
Journal:  Proc Natl Acad Sci U S A       Date:  2021-07-27       Impact factor: 11.205

3.  Individuals with depression express more distorted thinking on social media.

Authors:  Krishna C Bathina; Marijn Ten Thij; Lorenzo Lorenzo-Luaces; Lauren A Rutter; Johan Bollen
Journal:  Nat Hum Behav       Date:  2021-02-11

4.  Linguistic positivity in historical texts reflects dynamic environmental and psychological factors.

Authors:  Rumen Iliev; Joe Hoover; Morteza Dehghani; Robert Axelrod
Journal:  Proc Natl Acad Sci U S A       Date:  2016-11-21       Impact factor: 11.205

5.  Historical analysis of national subjective wellbeing using millions of digitized books.

Authors:  Thomas T Hills; Eugenio Proto; Daniel Sgroi; Chanuki Illushka Seresinhe
Journal:  Nat Hum Behav       Date:  2019-10-14

6.  Quantitative analysis of culture using millions of digitized books.

Authors:  Jean-Baptiste Michel; Yuan Kui Shen; Aviva Presser Aiden; Adrian Veres; Matthew K Gray; Joseph P Pickett; Dale Hoiberg; Dan Clancy; Peter Norvig; Jon Orwant; Steven Pinker; Martin A Nowak; Erez Lieberman Aiden
Journal:  Science       Date:  2010-12-16       Impact factor: 47.728

7.  The changing psychology of culture from 1800 through 2000.

Authors:  Patricia M Greenfield
Journal:  Psychol Sci       Date:  2013-08-07

8.  Human language reveals a universal positivity bias.

Authors:  Peter Sheridan Dodds; Eric M Clark; Suma Desu; Morgan R Frank; Andrew J Reagan; Jake Ryland Williams; Lewis Mitchell; Kameron Decker Harris; Isabel M Kloumann; James P Bagrow; Karine Megerdoomian; Matthew T McMahon; Brian F Tivnan; Christopher M Danforth
Journal:  Proc Natl Acad Sci U S A       Date:  2015-02-09       Impact factor: 11.205
