Marten Scheffer, Ingrid van de Leemput, Els Weinans, Johan Bollen.
Abstract
The surge of post-truth political argumentation suggests that we are living in a special historical period when it comes to the balance between emotion and reasoning. To explore if this is indeed the case, we analyze language in millions of books covering the period from 1850 to 2019 represented in Google nGram data. We show that the use of words associated with rationality, such as "determine" and "conclusion," rose systematically after 1850, while words related to human experience such as "feel" and "believe" declined. This pattern reversed over the past decades, paralleled by a shift from a collectivistic to an individualistic focus as reflected, among other things, by the ratio of singular to plural pronouns such as "I"/"we" and "he"/"they." Interpreting this synchronous sea change in book language remains challenging. However, as we show, the nature of this reversal occurs in fiction as well as nonfiction. Moreover, the pattern of change in the ratio between sentiment and rationality flag words since 1850 also occurs in New York Times articles, suggesting that it is not an artifact of the book corpora we analyzed. Finally, we show that word trends in books parallel trends in corresponding Google search terms, supporting the idea that changes in book language do in part reflect changes in interest. All in all, our results suggest that over the past decades, there has been a marked shift in public interest from the collective to the individual, and from rationality toward emotion.
Keywords: collectivity; individuality; language; rationality; sentiment
Year: 2021 PMID: 34916287 PMCID: PMC8713757 DOI: 10.1073/pnas.2107848118
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 12.779
Fig. 1. Dynamics of four characteristics of English and Spanish book language represented in Google n-gram data. (A, E, I, and M) Second principal component of change in z-scores of frequencies of the 5,000 most-used words. (B, F, J, and N) Relative level of arousal (black), positive sentiment (blue), and negative sentiment (red). (C, G, K, and O) Z-scores of frequencies of flag words related to intuition, believing, spirituality, and sapience: spirit, imagine, wisdom, wise, hunch, mind, suspicion, believe, think, trust, faith, truth, true, belief, doubt, hope, fear, life, soul, heaven, eternal, mortal, holy, god, pray, mystery, sense, feel, soft, hard, cold, hot, smell, foul, taste, sweet, bitter, hear, sound, silence, loud, see, light, dark, bright (for Spanish: espíritu, imaginar, sabiduría, mente, sospecha, creer, pensar, fe, verdad, duda, esperanza, miedo, vida, alma, cielo, santo, dios, misterio, sentido, sensación, sentir, suave, duro, frío, caliente, gusto, dulce, oír, silencio, fuerte, ver, mirar, oscuro, brillante). The black central line represents the mean and the gray shaded area the 95% confidence interval of the mean. (D, H, L, and P) Similar but for flag words related to rationality, science, and quantification: science, technology, scientific, chemistry, chemicals, physics, medicine, model, method, fact, data, math, analysis, conclusion, limit, result, determine, transmission, assuming, system, size, unit, pressure, area, percent (for Spanish: ciencia, tecnología, científico, química, productos, física, medicina, modelo, método, dato, datos, hipótesis, estadísticas, cálculo, análisis, conclusión, límite, resultado, determinar, transmisión, sistema, tamaño, unidad, presión, área, densidad, porcentaje).
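The z-score curves with a mean line and a 95% confidence band described in this caption could be computed roughly as follows. This is a minimal sketch, not the authors' code: it assumes each word's yearly frequency series is standardized with the sample standard deviation, and that the confidence interval is a normal approximation across flag words (1.96 standard errors). The function names and the toy frequencies are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def zscores(series):
    """Standardize one word's yearly frequencies (zero mean, unit variance)."""
    m = mean(series)
    s = stdev(series)  # sample standard deviation: an assumption
    return [(x - m) / s for x in series]


def mean_with_ci(freq_by_word, z_crit=1.96):
    """Per-year mean of the z-scored series with a normal-approximation CI.

    Returns one (mean, lower, upper) tuple per year.
    """
    z = [zscores(series) for series in freq_by_word.values()]
    n_words = len(z)
    bands = []
    for year_values in zip(*z):  # all words' z-scores for a single year
        m = mean(year_values)
        half_width = z_crit * stdev(year_values) / sqrt(n_words)
        bands.append((m, m - half_width, m + half_width))
    return bands


# Hypothetical relative frequencies for three flag words over four years.
toy = {
    "feel": [1.0, 2.0, 3.0, 4.0],
    "believe": [2.0, 4.0, 6.0, 8.0],
    "hope": [4.0, 3.0, 2.0, 1.0],
}
bands = mean_with_ci(toy)
```

Standardizing each word first keeps very common words from dominating the average, so the curve reflects shared direction of change rather than raw frequency.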
Contrasting classes of concepts related to a personal (top row) vs. societal view of the world (bottom row) emerge by ranking words according to their correlation with principal components, overall sentiment, and the hockey-stick pattern.
Listed are the words that score highest vs. lowest on the second PCA axis depicted in Fig. 1, the words that correlate most positively vs. negatively with positive sentiment, and the words that increased most clearly after 1980 while declining between 1850 and 1980 vs. words that show the opposite pattern (ranked by the absolute difference in Kendall tau between those periods). We used positive sentiment for computing the correlations in the second column, but it is closely correlated with negative sentiment and arousal. Longer lists (5%) of English, and the analogous analysis of Spanish words, English fiction, and English excluding fiction are presented in .
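The hockey-stick ranking described above can be illustrated with a small sketch. It assumes the score for a word is the Kendall tau of frequency against year after 1980 minus the tau before 1980, so that words declining and then rising score high and the opposite pattern scores low. The tau here is the simple tau-a variant without tie correction, and the function names, the `split_year` parameter, and the toy series are hypothetical.

```python
def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(xs)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
            total += (prod > 0) - (prod < 0)
    return total / (n * (n - 1) / 2)


def hockey_stick_score(years, freqs, split_year=1980):
    """Trend after split_year minus trend before it (range -2 to 2)."""
    early = [(y, f) for y, f in zip(years, freqs) if y <= split_year]
    late = [(y, f) for y, f in zip(years, freqs) if y >= split_year]
    tau_early = kendall_tau(*zip(*early))
    tau_late = kendall_tau(*zip(*late))
    return tau_late - tau_early


# Hypothetical frequency series sampled at six years.
years = [1850, 1900, 1950, 1980, 2000, 2019]
toy = {
    "feel": [6, 5, 4, 3, 4, 5],   # declines, then rises
    "data": [1, 2, 3, 4, 3, 2],   # rises, then declines
    "table": [1, 2, 3, 4, 5, 6],  # rises throughout
}
scores = {w: hockey_stick_score(years, f) for w, f in toy.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

With these toy series, "feel" lands at the top of the ranking and "data" at the bottom, while the steadily rising "table" scores zero because its trend does not change at the split year.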
Fig. 2. (A–D) Ratio of the relative frequencies of singular to corresponding plural pronouns in various book corpora represented in the Google n-gram database.
Fig. 3. Ratio of intuition- to rationality-related words in the New York Times (A) and various book corpora represented in the Google n-gram database (B–E). The graphs depict the ratio of the mean relative frequencies of the sets of rationality-related and intuition-related flag words presented in Fig. 1, right-hand columns.
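As a worked illustration of the quantity plotted here, the sketch below computes, for each year, the mean relative frequency of one flag-word set divided by the mean of the other. The direction (intuition over rationality) is an assumption taken from the caption title, and the function name, word subsets, and frequencies are hypothetical.

```python
from statistics import mean


def flag_ratio(freqs_by_year, intuition_words, rationality_words):
    """Per-year ratio of mean relative frequencies of the two word sets."""
    ratios = {}
    for year, freqs in freqs_by_year.items():
        intuition = mean(freqs[w] for w in intuition_words)
        rationality = mean(freqs[w] for w in rationality_words)
        ratios[year] = intuition / rationality  # assumed direction
    return ratios


# Hypothetical relative frequencies for two words per set.
toy = {
    1850: {"feel": 0.004, "believe": 0.002, "determine": 0.001, "conclusion": 0.001},
    1980: {"feel": 0.002, "believe": 0.002, "determine": 0.002, "conclusion": 0.002},
}
ratios = flag_ratio(toy, ["feel", "believe"], ["determine", "conclusion"])
```

Averaging within each set before dividing means a single very frequent word can dominate its set's mean, which is worth keeping in mind when comparing corpora.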
Fig. 4. (A–D) Relationship between trends in the use of words in Google search queries and use of the same words in books for the period 2004 to 2019. Blue bars represent frequency distributions of Spearman rank correlations between each word in books and that same word in Google searches after subtracting average frequencies predicted by a null model of randomly matched words (50 bins). To construct the null model, we matched the 5,000 words in books with randomly picked words from the same word list in Google searches and calculated the Spearman rank correlation. We ran this 1,000 times, resulting in a frequency distribution for each correlation bin. We subtracted the mean from the resulting distribution, such that the null line represents the average frequency of correlations (for more details see ). Black lines represent the 5th and 95th percentiles of the frequency distributions of the null model per correlation bin. For each of the corpora, positive correlations are found more than expected by chance, while negative correlations are found less than expected.
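The null-model procedure in this caption can be sketched as below. It is an illustration under stated assumptions rather than the authors' implementation: random matching is done by shuffling the word list (so a word can occasionally be matched with itself), percentiles use a simple nearest-rank index, and the Spearman correlation is a small stdlib version that fails on constant series. All function names and the toy data are hypothetical.

```python
import random
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def histogram(corrs, n_bins):
    """Count correlations in n_bins equal-width bins spanning [-1, 1]."""
    counts = [0] * n_bins
    for c in corrs:
        counts[min(int((c + 1) / 2 * n_bins), n_bins - 1)] += 1
    return counts


def excess_over_null(books, searches, n_bins=50, n_runs=1000, seed=0):
    """Observed histogram minus null mean, plus centered 5th/95th bands."""
    words = list(books)
    observed = histogram(
        [spearman(books[w], searches[w]) for w in words], n_bins)
    rng = random.Random(seed)
    null_runs = []
    for _ in range(n_runs):
        shuffled = words[:]
        rng.shuffle(shuffled)  # random word-to-word matching
        null_runs.append(histogram(
            [spearman(books[w], searches[v]) for w, v in zip(words, shuffled)],
            n_bins))
    per_bin = [sorted(col) for col in zip(*null_runs)]
    null_mean = [mean(col) for col in per_bin]
    i_lo, i_hi = int(0.05 * (n_runs - 1)), int(0.95 * (n_runs - 1))
    lo = [col[i_lo] - m for col, m in zip(per_bin, null_mean)]
    hi = [col[i_hi] - m for col, m in zip(per_bin, null_mean)]
    excess = [o - m for o, m in zip(observed, null_mean)]
    return excess, lo, hi


# Toy data: 20 hypothetical words over 16 years (2004 to 2019); searches
# copy books exactly, so every matched correlation falls in the top bin.
data_rng = random.Random(1)
books = {f"w{i}": [data_rng.random() for _ in range(16)] for i in range(20)}
searches = {w: list(s) for w, s in books.items()}
excess, lo, hi = excess_over_null(books, searches, n_runs=200)
```

In this toy setup the top correlation bin sits well above the null band, which is the kind of excess of positive correlations the caption reports. Because each histogram contains the same number of words, the excess values sum to zero across bins.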