Abstract
Academic writing is becoming more positive, and this linguistic positivity bias has been confirmed across disciplines and genres. The current research used sentiment analysis to examine diachronic change in linguistic positivity in the full texts of 2,556 research articles published in Science over 25 years (1997 to 2021). The results showed that the language of research articles in Science became significantly more positive over that period, which confirms linguistic positivity bias in academic writing with empirical data from Science. Reasons for the increasingly positive language in science articles might include the popularization of science, the growing number of researchers, and the difficulty of publishing in high-impact journals. Finally, the study discusses the implications of these findings for researchers, editors, and peer reviewers. © Akadémiai Kiadó, Budapest, Hungary 2022, Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
Keywords: Academic writing; Linguistic positivity bias; Research articles; Science; Sentiment analysis
Year: 2022 PMID: 36212768 PMCID: PMC9526210 DOI: 10.1007/s11192-022-04515-2
Source DB: PubMed Journal: Scientometrics ISSN: 0138-9130 Impact factor: 3.801
Descriptive statistics of the corpus used in the study
| Year | Number of Articles | Number of Words in Full Texts | Mean Word Count | Number of Sentences Per Year | Mean Sentence Count Per Article |
|---|---|---|---|---|---|
| 1997 | 35 | 148,055 | 4,230 | 8,712 | 248 |
| 1998 | 45 | 174,071 | 3,868 | 11,561 | 256 |
| 1999 | 44 | 169,413 | 3,850 | 10,800 | 245 |
| 2000 | 59 | 227,902 | 3,862 | 15,980 | 270 |
| 2001 | 67 | 252,738 | 3,772 | 15,894 | 237 |
| 2002 | 67 | 248,088 | 3,702 | 15,471 | 230 |
| 2003 | 60 | 226,624 | 3,777 | 14,663 | 244 |
| 2004 | 60 | 228,106 | 3,801 | 14,071 | 234 |
| 2005 | 72 | 266,953 | 3,707 | 16,793 | 233 |
| 2006 | 64 | 236,720 | 3,698 | 14,455 | 225 |
| 2007 | 58 | 226,335 | 3,902 | 14,653 | 252 |
| 2008 | 57 | 211,828 | 3,716 | 13,318 | 233 |
| 2009 | 69 | 275,520 | 3,993 | 19,008 | 275 |
| 2010 | 67 | 268,991 | 4,014 | 19,143 | 285 |
| 2011 | 75 | 292,817 | 3,904 | 20,723 | 276 |
| 2012 | 54 | 213,215 | 3,948 | 14,809 | 274 |
| 2013 | 90 | 386,548 | 4,294 | 26,018 | 289 |
| 2014 | 126 | 551,805 | 4,379 | 37,231 | 295 |
| 2015 | 114 | 507,989 | 4,456 | 33,636 | 295 |
| 2016 | 149 | 669,659 | 4,494 | 45,393 | 304 |
| 2017 | 172 | 780,369 | 4,537 | 52,982 | 308 |
| 2018 | 192 | 875,819 | 4,561 | 60,291 | 314 |
| 2019 | 258 | 1,179,668 | 4,572 | 86,932 | 336 |
| 2020 | 278 | 1,256,443 | 4,519 | 93,645 | 336 |
| 2021 | 224 | 1,039,839 | 4,642 | 79,431 | 354 |
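The per-article means in this table can be cross-checked against the yearly totals. The sketch below assumes the means are totals divided by the article count and rounded down; that rounding rule is inferred from the figures (for example, 875,819 / 192 ≈ 4,561.6 is listed as 4,561) and is not stated in the source. It also checks that the yearly article counts add up to the 2,556 articles named in the abstract.

```python
# Article counts per year, 1997-2021, copied from the table above.
articles = [35, 45, 44, 59, 67, 67, 60, 60, 72, 64, 58, 57, 69,
            67, 75, 54, 90, 126, 114, 149, 172, 192, 258, 278, 224]

total_articles = sum(articles)  # corpus size across all 25 years

# (year, articles, words, sentences) for three sample rows of the table.
sample_rows = [
    (1997, 35, 148_055, 8_712),
    (2013, 90, 386_548, 26_018),
    (2021, 224, 1_039_839, 79_431),
]


def per_article_means(n_articles, n_words, n_sentences):
    """Return (mean words, mean sentences) per article, rounded down.

    Rounding down is an assumption inferred from the table values.
    """
    return n_words // n_articles, n_sentences // n_articles


for year, n, words, sents in sample_rows:
    print(year, per_article_means(n, words, sents))
print("total articles:", total_articles)
```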
Distribution of sentiment in the full texts across 25 years
| Year | Jockers sentiment lexicon (SA1) | NRC sentiment lexicon (SA1) | Jockers sentiment lexicon (SA2) | NRC sentiment lexicon (SA2) | SenticNet sentiment lexicon (SA2) |
|---|---|---|---|---|---|
| 1997 | 9.521331 | 9.527533 | 10.080014 | 10.661952 | 11.987791 |
| 1998 | 9.396531 | 9.406549 | 9.870166 | 10.315908 | 11.814906 |
| 1999 | 7.546121 | 7.578637 | 7.999714 | 8.21836 | 9.868191 |
| 2000 | 8.571431 | 8.608957 | 9.173225 | 9.526734 | 10.960807 |
| 2001 | 8.383277 | 8.433924 | 8.998861 | 9.42009 | 10.892345 |
| 2002 | 7.863707 | 7.898525 | 8.511171 | 8.917346 | 10.222025 |
| 2003 | 7.777968 | 7.813237 | 8.350494 | 8.488094 | 10.203304 |
| 2004 | 8.431991 | 8.462608 | 8.979032 | 9.344791 | 10.948697 |
| 2005 | 7.446901 | 7.439231 | 7.924671 | 8.110029 | 9.693471 |
| 2006 | 7.580008 | 7.58766 | 8.227586 | 8.489964 | 10.051596 |
| 2007 | 8.735593 | 8.775876 | 9.38116 | 9.682491 | 11.204487 |
| 2008 | 8.054496 | 8.105745 | 8.538112 | 8.94146 | 10.547337 |
| 2009 | 10.108575 | 10.153245 | 10.777659 | 11.069634 | 12.64925 |
| 2010 | 10.257882 | 10.277272 | 11.071934 | 11.166507 | 12.782353 |
| 2011 | 10.295619 | 10.318279 | 11.033113 | 11.376831 | 12.711169 |
| 2012 | 10.101425 | 10.115826 | 10.786342 | 11.124611 | 12.645313 |
| 2013 | 9.904227 | 9.947269 | 10.574727 | 10.815863 | 12.312222 |
| 2014 | 10.466433 | 10.50242 | 11.111915 | 11.368122 | 12.987398 |
| 2015 | 11.020506 | 11.050246 | 11.775518 | 11.977221 | 13.53401 |
| 2016 | 11.330756 | 11.350055 | 12.023898 | 12.192563 | 13.852181 |
| 2017 | 11.591671 | 11.617402 | 12.245307 | 12.457971 | 14.068735 |
| 2018 | 11.225818 | 11.246149 | 11.846535 | 12.016217 | 13.699407 |
| 2019 | 10.603499 | 10.620283 | 11.202951 | 11.355169 | 13.117856 |
| 2020 | 10.812 | 10.83708 | 11.454397 | 11.648572 | 13.344973 |
| 2021 | 10.413977 | 10.451182 | 10.988131 | 11.142709 | 12.929841 |
Descriptive statistics of standardized sentiment scores in SA1
| Lexicon | Mean^a | Standard deviation^a | Maximum | Minimum |
|---|---|---|---|---|
| Jockers | | | 38.412861 | −17.274664 |
| NRC | | | 38.418873 | −17.316931 |
^a Mean and standard deviation by year and full text
Fig. 1 Diachronic trajectory of linguistic positivity based on SA1
Detailed statistics of simple linear regression for SA1
| Model | Variable | Estimate | Standard error | t value | p value |
|---|---|---|---|---|---|
| Jockers | (Intercept) | −273.34805 | 48.40457 | −5.647 | 9.51e−06 *** |
| | Year | 0.14079 | 0.02409 | 5.843 | 5.91e−06 *** |
| NRC | (Intercept) | −273.42479 | 48.29356 | −5.662 | 9.18e−06 *** |
| | Year | 0.14084 | 0.02404 | 5.859 | 5.69e−06 *** |
*p < 0.05; **p < 0.01; ***p < 0.001
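The Jockers estimates can be checked with an ordinary least squares fit of yearly score on year. The sketch below assumes the regression was run on the 25 yearly values in the first score column of the yearly sentiment table (taken here to be the Jockers SA1 series). The helper `fit_line` is illustrative and not the authors' code. Under that assumption the fit should land close to the reported intercept of about −273.35 and slope of about 0.1408.

```python
# Yearly scores for 1997-2021, copied from the first score column
# of the yearly sentiment table (assumed to be Jockers, SA1).
years = list(range(1997, 2022))
jockers_sa1 = [
    9.521331, 9.396531, 7.546121, 8.571431, 8.383277,
    7.863707, 7.777968, 8.431991, 7.446901, 7.580008,
    8.735593, 8.054496, 10.108575, 10.257882, 10.295619,
    10.101425, 9.904227, 10.466433, 11.020506, 11.330756,
    11.591671, 11.225818, 10.603499, 10.812, 10.413977,
]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


intercept, slope = fit_line(years, jockers_sa1)
print(f"intercept = {intercept:.5f}, slope = {slope:.5f}")
```

A positive slope here means the yearly score rises by roughly 0.14 per calendar year over the period.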
Descriptive statistics of the standardized sentiment scores in SA2
| Lexicon | Mean^a | Standard deviation^a | Maximum | Minimum |
|---|---|---|---|---|
| Jockers | 10.492273 | 1.628299 | 39.64096 | −18.735906 |
| NRC | 10.617342 | 1.689519 | 40.348968 | −19.887274 |
| SenticNet | 10.729428 | 1.835543 | 40.884626 | −17.719674 |
^a Mean and standard deviation by year and full text
Fig. 2 Diachronic trajectory of linguistic positivity based on SA2
Detailed statistics of simple linear regression for SA2
| Model | Variable | Estimate | Standard error | t value | p value |
|---|---|---|---|---|---|
| Jockers | (Intercept) | −283.97819 | 50.24481 | −5.652 | 9.40e−06 *** |
| | Year | 0.14639 | 0.02501 | 5.853 | 5.77e−06 *** |
| NRC | (Intercept) | −262.86763 | 52.22310 | −5.034 | 4.29e−05 *** |
| | Year | 0.14639 | 0.02501 | 5.853 | 5.77e−06 *** |
| SenticNet | (Intercept) | −281.32789 | 49.83593 | −5.645 | 9.56e−06 *** |
| | Year | 0.14599 | 0.02481 | 5.885 | 5.35e−06 *** |
*p < 0.05; **p < 0.01; ***p < 0.001
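The fifth column of the SA2 regression table is consistent with a t statistic, that is, the estimate divided by its standard error. The sketch below recomputes that ratio for each row as a sanity check. Reading the column as a t value is an assumption based on this arithmetic, and small differences in the third decimal are expected because the table values are rounded.

```python
# (model, variable, estimate, standard error, reported fifth-column value)
# copied from the SA2 regression table above.
rows = [
    ("Jockers", "(Intercept)", -283.97819, 50.24481, -5.652),
    ("Jockers", "Year", 0.14639, 0.02501, 5.853),
    ("NRC", "(Intercept)", -262.86763, 52.22310, -5.034),
    ("NRC", "Year", 0.14639, 0.02501, 5.853),
    ("SenticNet", "(Intercept)", -281.32789, 49.83593, -5.645),
    ("SenticNet", "Year", 0.14599, 0.02481, 5.885),
]


def t_statistic(estimate, std_error):
    """t statistic of a regression coefficient: estimate / standard error."""
    return estimate / std_error


for model, variable, est, se, reported in rows:
    t = t_statistic(est, se)
    print(f"{model:<10} {variable:<12} t = {t:7.3f} (reported {reported})")
```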
Pearson’s correlation test for standardized sentiment scores in SA1 and SA2
| Lexicon | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| 1. Jockers (SA1) | – | ||||
| 2. NRC (SA1) | 0.9999409*** | – | |||
| 3. Jockers (SA2) | 0.9986053*** | 0.9986208*** | – | ||
| 4. NRC (SA2) | 0.9957922*** | 0.9958604*** | 0.9966371*** | – | |
| 5. SenticNet (SA2) | 0.9990544*** | 0.9991685*** | 0.9981632*** | 0.9959328*** | – |
*p < 0.05; **p < 0.01; ***p < 0.001
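As an illustration of how one cell of this matrix can be obtained, the sketch below computes Pearson's r between two yearly series from the yearly sentiment table. It assumes that the first and third score columns are the Jockers SA1 and Jockers SA2 series and that the correlation was computed over the 25 yearly values. Under those assumptions the result should land near the reported 0.9986. The helper `pearson_r` is illustrative, not the authors' code.

```python
import math

# Yearly scores for 1997-2021 from the yearly sentiment table
# (first and third score columns, assumed Jockers SA1 and Jockers SA2).
jockers_sa1 = [
    9.521331, 9.396531, 7.546121, 8.571431, 8.383277,
    7.863707, 7.777968, 8.431991, 7.446901, 7.580008,
    8.735593, 8.054496, 10.108575, 10.257882, 10.295619,
    10.101425, 9.904227, 10.466433, 11.020506, 11.330756,
    11.591671, 11.225818, 10.603499, 10.812, 10.413977,
]
jockers_sa2 = [
    10.080014, 9.870166, 7.999714, 9.173225, 8.998861,
    8.511171, 8.350494, 8.979032, 7.924671, 8.227586,
    9.38116, 8.538112, 10.777659, 11.071934, 11.033113,
    10.786342, 10.574727, 11.111915, 11.775518, 12.023898,
    12.245307, 11.846535, 11.202951, 11.454397, 10.988131,
]


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length series."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(jockers_sa1, jockers_sa2)
print(f"r = {r:.7f}")
```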
List of predefined positive and negative words (Vinkers et al., 2015)
| Category | Words |
|---|---|
| Positive | amazing, assuring, astonishing, bright, creative, encouraging, enormous, excellent, favourable, groundbreaking, hopeful, innovative, inspiring, inventive, novel, phenomenal, prominent, promising, reassuring, remarkable, robust, spectacular, supportive, unique, unprecedented |
| Negative | detrimental, disappointing, disconcerting, discouraging, disheartening, disturbing, frustrating, futile, hopeless, impossible, inadequate, ineffective, insignificant, insufficient, irrelevant, mediocre, pessimistic, substandard, unacceptable, unpromising, unsatisfactory, unsatisfying, useless, weak, worrisome |
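Counting these words in a text can be sketched as follows. The tokenization (lowercased alphabetic tokens, exact matches only, no lemmatization or spelling variants such as "favorable") is an assumption, since the source does not say how matches were made. The function `count_listed_words` and the sample sentence are hypothetical.

```python
import re
from collections import Counter

# Word lists copied from the table above (Vinkers et al., 2015).
POSITIVE = {
    "amazing", "assuring", "astonishing", "bright", "creative",
    "encouraging", "enormous", "excellent", "favourable", "groundbreaking",
    "hopeful", "innovative", "inspiring", "inventive", "novel",
    "phenomenal", "prominent", "promising", "reassuring", "remarkable",
    "robust", "spectacular", "supportive", "unique", "unprecedented",
}
NEGATIVE = {
    "detrimental", "disappointing", "disconcerting", "discouraging",
    "disheartening", "disturbing", "frustrating", "futile", "hopeless",
    "impossible", "inadequate", "ineffective", "insignificant",
    "insufficient", "irrelevant", "mediocre", "pessimistic", "substandard",
    "unacceptable", "unpromising", "unsatisfactory", "unsatisfying",
    "useless", "weak", "worrisome",
}


def count_listed_words(text):
    """Return (positive hits, negative hits, total tokens) for a text.

    Matching is case-insensitive and exact on whole alphabetic tokens.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    positive_hits = sum(counts[word] for word in POSITIVE)
    negative_hits = sum(counts[word] for word in NEGATIVE)
    return positive_hits, negative_hits, len(tokens)


# Hypothetical sample sentence, not taken from the corpus.
sample = ("These novel and robust results are promising, "
          "although the earlier model was inadequate.")
pos, neg, n_tokens = count_listed_words(sample)
print(pos, neg, n_tokens)
```

Dividing the hit counts by the token count would give a relative frequency that is comparable across years with different corpus sizes.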
Fig. 3 Frequency of Vinkers et al.’s (2015) positive and negative words between 1999 and 2008
Fig. 4 Frequency of Vinkers et al.’s (2015) positive and negative words between 2009 and 2018
Fig. 5 Overall distribution of Vinkers et al.’s (2015) positive and negative words between 1997 and 2021