Tai Wang1,2, Zongkui Zhou2,3, Xiangen Hu2,3,4, Zhi Liu1, Yi Ding5, Zhiqiang Cai4.
Abstract
Resonance is commonly used as a metaphor for the way information from different sources is combined. Although it is an attractive and fundamental phenomenon in studies of human behavior, most studies have observed semantic resonance only at the word level and in well-controlled experimental settings. To supply the missing link between word-level and document-level resonance, we study topic resonance in a novel and natural setting: academic commentaries. Ninety-three academic commentaries by ninety-three authors, along with their references and the original papers, are analyzed with a natural language processing approach based on latent Dirichlet allocation (LDA). This approach decomposes a corpus written and read by an author into several topics with different weights, which can reveal phenomena that are missed at the word or document level. We found that (1) topic resonance commonly exists between commenters' fundamental input and output topics; (2) commenters re-allocate output words to echo salient input topics; (3) commenters are more prone to associate references that focus on the non-dominant input topics; and (4) topic resonance can even be predicted by a Hebbian-like model that matches the aforementioned findings. These findings enrich our understanding of the relationship among probe, feedback and context.
Keywords: Information science; Linguistics; Psychology
Year: 2018 PMID: 29955672 PMCID: PMC6019969 DOI: 10.1016/j.heliyon.2018.e00659
Source DB: PubMed Journal: Heliyon ISSN: 2405-8440
Major terminology used in the commentary scenario.
| Terms | Descriptions | Examples |
|---|---|---|
| Output paper | Main body | Blaszczynski, A. (2008). Commentary: a response to “problems with the concept of video game ‘addiction’: some case study examples”. International Journal of Mental Health and Addiction, 6(2): 179–181. |
| Prior knowledge | Main body | Griffiths, M. D. (1993). Fruit machine gambling: The importance of structural characteristics. Journal of Gambling Studies, 9, 101–120. |
| Input paper | Main body | Wood, R. T. A. (2008). Problems with the concept of video game “addiction”: some case study examples, International Journal of Mental Health and Addiction, 6(2): 169–178. |
Main body refers to the full text of a paper excluding its authors' information, graphics and reference lists.
Fig. 1. The quasi-experiment setting.
Fig. 2. Seniority histogram.
Fig. 3. Disciplines' proportions.
Fig. 4. Analysis paradigm.
Fig. 5. Topic distributions for a sample document set.
Top 10 words for each topic in the illustrative example (see the note below the table).
| Topic index | A | B | C | D | E |
|---|---|---|---|---|---|
| Topic label | excessive time | adolescent lottery | people case | research factors | structural characteristics |
| The 1st word | video | adolescents | video | internet | characteristics |
| The 2nd word | time | lottery | problems | factors | structural |
| The 3rd word | game | machine | people | many | machines |
| The 4th word | excessive | machines | game | use | machine |
| The 5th word | criteria | played | case | research | fruit |
| The 6th word | playing | slot | cause | risk | winning |
| The 7th word | videogames | found | behaviour | forms | pay |
| The 8th word | videogame | players | individuals | money | player |
| The 9th word | behaviour | fruit | playing | slot | near |
| The 10th word | addictive | adolescent | games | adolescent | psychological |
We use a TF-IDF approach (Salton et al., 1975) to re-order the top words delivered by the extended SCVB0. Here TF (term frequency) is replaced by the delivered token weight, and IDF (inverse document frequency) is replaced by the inverse topic frequency: the number of documents is replaced by the number of topics, and the number of documents containing a given term is replaced by the number of topics to which a given token belongs. From a system point of view, the LDA-based topic model is followed by an independent plug-in TF-IDF module, rather than embedding a TF-IDF formula in the model from the outset as in Wilson and Chew (2010) and Nikolenko et al. (2015).
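The re-ordering described in this note can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `reorder_top_words`, the logarithmic form of the inverse topic frequency, the rule that a token "belongs to" a topic when it appears in that topic's top-word list, and all token weights are assumptions made for the example.

```python
import math
from collections import defaultdict


def reorder_top_words(topic_top_words):
    """Re-order each topic's top words by weight * inverse topic frequency.

    topic_top_words maps a topic label to a dict of {token: weight}.
    Assumption: a token "belongs to" a topic if it is in that topic's list.
    """
    n_topics = len(topic_top_words)

    # Count how many topics each token belongs to.
    topic_freq = defaultdict(int)
    for words in topic_top_words.values():
        for token in words:
            topic_freq[token] += 1

    reordered = {}
    for topic, words in topic_top_words.items():
        # Token weight plays the role of TF; log(T / t_f) plays the role of IDF.
        scores = {
            token: weight * math.log(n_topics / topic_freq[token])
            for token, weight in words.items()
        }
        reordered[topic] = sorted(scores, key=lambda t: scores[t], reverse=True)
    return reordered


# Hypothetical weights, chosen only to show the effect of the re-ordering.
topics = {
    "A": {"video": 0.30, "time": 0.20, "excessive": 0.10},
    "C": {"video": 0.25, "people": 0.15, "case": 0.12},
    "E": {"structural": 0.22, "machine": 0.18, "fruit": 0.05},
}
result = reorder_top_words(topics)
print(result["A"])
```

In this toy input, "video" is shared by two of the three topics, so it is down-weighted relative to tokens that are specific to a single topic.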
Fig. 6. Correlation coefficients of input and output topic distributions vs. p value.
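As a rough illustration of the quantity plotted in this figure, the sketch below computes a correlation coefficient between an input and an output topic distribution. It assumes a Pearson correlation over five topic weights; the record does not state which coefficient is used, and the two weight vectors are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical topic weights for one commenter (each vector sums to 1).
input_topics = [0.40, 0.25, 0.15, 0.12, 0.08]
output_topics = [0.35, 0.30, 0.10, 0.15, 0.10]

r = pearson(input_topics, output_topics)
print(round(r, 3))
```

A coefficient near 1 would indicate that the output topic weights closely track the input topic weights for that commenter.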
Independence test on fundamental topic resonance.
| | Gender | Discipline | Seniority | Commentary length |
|---|---|---|---|---|
| Resonance | 0.42 | 22.10 | 37.21 | 16.97 |
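The table does not name its test statistic. As a hedged check, the sketch below computes a Pearson chi-square statistic (without continuity correction) for a 2×2 table built from the gender counts reported in the one-way ANOVA summary on gender (57 of 71 male and 19 of 22 female samples counted in the Sum column). The helper name `chi_square_2x2` and the reading of the Sum column as a count of resonant samples are assumptions.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Rows: male, female. Columns: resonance observed, not observed (assumed reading).
male_yes, male_no = 57, 71 - 57
female_yes, female_no = 19, 22 - 19

chi2_gender = chi_square_2x2(male_yes, male_no, female_yes, female_no)
print(round(chi2_gender, 2))  # 0.42
```

Under these assumptions the result agrees with the Gender entry of the table, which suggests, but does not confirm, that the entries are chi-square statistics.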
One-way ANOVA on gender.
| Summary | ||||||
|---|---|---|---|---|---|---|
| Groups | Samples | Sum | Mean | Variance | ||
| Male | 71 | 57 | 0.803 | 0.16 | ||
| Female | 22 | 19 | 0.863 | 0.12 | ||
| ANOVA | ||||||
| Source of variance | SS | df | MS | F | p-value | F_crit |
| Between groups | 0.062 | 1 | 0.062 | 0.409 | 0.524 | 3.946 |
| Within group | 13.830 | 91 | 0.152 | |||
| Total | 13.892 | 92 | ||||
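The ANOVA rows above can be recomputed from the summary rows alone if each sample is a 0/1 indicator, so that Sum is the number of ones in a group. That reading is an assumption, as is the helper name `one_way_anova_binary`; the sketch uses only the standard library and therefore omits the p-value.

```python
def one_way_anova_binary(groups):
    """One-way ANOVA for 0/1 data given as (n_samples, n_ones) per group.

    Returns (ss_between, ss_within, f_statistic).
    """
    total_n = sum(n for n, _ in groups)
    total_ones = sum(s for _, s in groups)
    grand_mean = total_ones / total_n

    # Between-group sum of squares: group size times squared mean offset.
    ss_between = sum(n * (s / n - grand_mean) ** 2 for n, s in groups)
    # Within-group sum of squares for 0/1 data: sum(x^2) - n * mean^2 = s - s^2 / n.
    ss_within = sum(s - s * s / n for n, s in groups)

    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat


# Gender summary: Male (71 samples, sum 57), Female (22 samples, sum 19).
ss_b, ss_w, f_stat = one_way_anova_binary([(71, 57), (22, 19)])
print(round(ss_b, 3), round(ss_w, 3), round(f_stat, 3))
```

The same function accepts any number of groups, so it can be applied to the discipline, seniority, and commentary-length summaries as well.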
One-way ANOVA on discipline.
| Summary | ||||||
|---|---|---|---|---|---|---|
| Groups | Samples | Sum | Mean | Variance | ||
| Surgery | 35 | 25 | 0.714 | 0.21 | ||
| Medicine | 15 | 13 | 0.867 | 0.12 | ||
| Psychology | 14 | 12 | 0.857 | 0.13 | ||
| etc | 29 | 26 | 0.897 | 0.10 | ||
| ANOVA | ||||||
| Source of variance | SS | df | MS | F | p-value | F_crit |
| Between groups | 0.612 | 3 | 0.204 | 1.37 | 0.258 | 2.707 |
| Within group | 13.280 | 89 | 0.149 | |||
| Total | 13.892 | 92 | ||||
The etc group consists of samples from the 21 disciplines other than the three listed above.
One-way ANOVA on seniority.
| Summary | ||||||
|---|---|---|---|---|---|---|
| Groups | Samples | Sum | Mean | Variance | ||
| High | 47 | 39 | 0.830 | 0.14 | ||
| Low | 46 | 37 | 0.804 | 0.16 | ||
| ANOVA | ||||||
| Source of variance | SS | df | MS | F | p-value | F_crit |
| Between groups | 0.015 | 1 | 0.015 | 0.099 | 0.754 | 3.946 |
| Within group | 13.877 | 91 | 0.152 | |||
| Total | 13.892 | 92 | ||||
The full sample is divided into a High Seniority group and a Low Seniority group at the median seniority (23.5 years).
One-way ANOVA on commentary length.
| Summary | ||||||
|---|---|---|---|---|---|---|
| Groups | Samples | Sum | Mean | Variance | ||
| Long | 46 | 41 | 0.891 | 0.099 | ||
| Short | 47 | 35 | 0.744 | 0.194 | ||
| ANOVA | ||||||
| Source of variance | SS | df | MS | F | p-value | F_crit |
| Between groups | 0.015 | 1 | 0.015 | 0.099 | 0.754 | 3.946 |
| Within group | 13.877 | 91 | 0.152 | |||
| Total | 13.892 | 92 | ||||
The full sample is divided into a Long Commentary group and a Short Commentary group at the median commentary length (391 tokens).
Fig. 7. Boxplots of correlation coefficients between topic distributions of input, prior knowledge and output.
Independence test for being more biased on output topics.
| | Gender | Discipline | Seniority | Length |
|---|---|---|---|---|
| More biased | 2.88 | 23.30 | 29.57 | 21.32 |
One-way analysis of variance (ANOVA) on gender.
| Summary | ||||||
|---|---|---|---|---|---|---|
| Groups | Samples | Sum | Mean | Variance | ||
| Male | 71 | 55 | 0.775 | 0.18 | ||
| Female | 22 | 13 | 0.591 | 0.25 | ||
| ANOVA | ||||||
| Source of Variance | SS | df | MS | F | p-value | F_crit |
| Between Groups | 0.567 | 1 | 0.567 | 2.913 | 0.091 | 3.946 |
| Within Group | 17.713 | 91 | 0.195 | |||
| Total | 18.280 | 92 | ||||
One-way analysis of variance (ANOVA) on discipline.
| Summary | ||||||
|---|---|---|---|---|---|---|
| Groups | Samples | Sum | Mean | Variance | ||
| Surgery | 35 | 27 | 0.771 | 0.18 | ||
| Medicine | 15 | 11 | 0.733 | 0.21 | ||
| Psychology | 14 | 8 | 0.571 | 0.26 | ||
| etc | 29 | 22 | 0.759 | 0.19 | ||
| ANOVA | ||||||
| Source of variance | SS | df | MS | F | p-value | F_crit |
| Between groups | 0.436 | 3 | 0.145 | 0.725 | 0.540 | 2.707 |
| Within group | 17.844 | 89 | 0.200 | |||
| Total | 18.280 | 92 | ||||
The etc group consists of samples from the 21 disciplines other than the three listed above.
One-way analysis of variance (ANOVA) on seniority.
| Summary | ||||||
|---|---|---|---|---|---|---|
| Groups | Samples | Sum | Mean | Variance | ||
| High | 47 | 35 | 0.745 | 0.19 | ||
| Low | 46 | 33 | 0.717 | 0.21 | ||
| ANOVA | ||||||
| Source of variance | SS | df | MS | F | p-value | F_crit |
| Between groups | 0.017 | 1 | 0.017 | 0.086 | 0.770 | 3.946 |
| Within group | 18.262 | 91 | 0.200 | |||
| Total | 18.280 | 92 | ||||
The full sample is divided into a High Seniority group and a Low Seniority group at the median seniority (23.5 years).
One-way analysis of variance (ANOVA) on commentary length.
| Summary | ||||||
|---|---|---|---|---|---|---|
| Groups | Samples | Sum | Mean | Variance | ||
| Long | 46 | 30 | 0.652 | 0.232 | ||
| Short | 47 | 38 | 0.808 | 0.158 | ||
| ANOVA | ||||||
| Source of variance | SS | df | MS | F | p-value | F_crit |
| Between groups | 0.568 | 1 | 0.568 | 2.919 | 0.091 | 3.946 |
| Within group | 17.711 | 91 | 0.194 | |||
| Total | 18.280 | 92 | ||||
The full sample is divided into a Long Commentary group and a Short Commentary group at the median commentary length (391 tokens).
Fig. 8. Correlation coefficients between input and pseudo prior knowledge topic distributions vs. correlation coefficients between input and real prior knowledge topic distributions.
Fig. 9. Calculated sorted in ascending order.
Detailed model assessments on the maximum weighted input topic dimension.
| Subgroup | Assessment indicator | Hebbian-like model | Simple linear model |
|---|---|---|---|
| 1 | Model parameter | ( | |
| 1 | R² | 0.282 | 0.288 |
| 1 | F-statistic vs. zero model | 17 | 6.48 |
| 1 | p-value | <0.001 | 0.0216 |
| 2 | Model parameter | ( | |
| 2 | R² | 0.697 | 0.697 |
| 2 | F-statistic vs. zero model | 189 | 36.9 |
| 2 | p-value | <0.001 | <0.001 |
| 3 | Model parameter | ( | |
| 3 | R² | 0.795 | 0.818 |
| 3 | F-statistic vs. zero model | 679 | 76.4 |
| 3 | p-value | <0.001 | <0.001 |
| 4 | Model parameter | ( | |
| 4 | R² | 0.528 | 0.662 |
| 4 | F-statistic vs. zero model | 328 | 35.3 |
| 4 | p-value | <0.001 | <0.001 |
| 5 | Model parameter | ( | |
| 5 | R² | 0.662 | 0.747 |
| 5 | F-statistic vs. zero model | 170 | 23.6 |
| 5 | p-value | <0.001 | 0.00126 |
Model assessment on all five input topic dimensions (one column per input topic dimension).
| Item | Model | 1st weighted | 2nd weighted | 3rd weighted | 4th weighted | 5th weighted |
|---|---|---|---|---|---|---|
| Final samples | | 85 | 88 | 86 | 86 | 85 |
| Subgroups | | 5 | 6 | 6 | 7 | 10 |
| SGNF subgroups | Hebbian | 5 | 4 | 4 | 4 | 6 |
| SGNF subgroups | Linear | 3 | 2 | 1 | 2 | 3 |
| Samples in SGNF model | Hebbian | 85 | 66 | 67 | 62 | 47 |
| Samples in SGNF model | Linear | 47 | 37 | 17 | 51 | 25 |
| R² range in SGNF model | Hebbian | [0.282, 0.795] | [0.256, 0.537] | [0.290, 0.806] | [0.287, 0.905] | [0.453, 0.950] |
| R² range in SGNF model | Linear | [0.697, 0.818] | [0.574, 0.690] | 0.695 | [0.519, 0.754] | [0.748, 0.998] |
SGNF is short for significant. An SGNF subgroup is a subgroup whose model is statistically significant. The significance level is p < 0.001 for the first four dimensions and p < 0.01 for the fifth weighted input topic dimension.
Top 12 references cited by B in his 39 journal papers.
| Author | Times cited | Rank |
|---|---|---|
| Productivity Commission (1999) | 15 | 1 |
| Lesieur, Blume (1987) | 15 | 1 |
| National Research Council (1999) | 14 | 3 |
| Ladouceur, Walker (1996) | 11 | 4 |
| American Psychiatric Association (1994) | 11 | 4 |
| American Psychiatric Association (2000) | 9 | 6 |
| Sylvain, Ladouceur, Boisvert (1997) | 8 | 7 |
| Blaszczynski, Nower (2002) | 8 | 7 |
| American Psychiatric Association (1987) | 8 | 7 |
| Petry (2004) | 7 | 10 |
| Jacobs (1986) | 7 | 10 |
| Walker (1992) | 6 | 12 |
| Shaffer, Hall (1996) | 6 | 12 |
| McConaghy, Armstrong, Blaszczynski, Allcock (1983) | 6 | 12 |
| Blaszczynski, Steel, McConaghy (1997) | 6 | 12 |
Only the family names of individual authors are retained, out of respect for privacy.
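The Rank column in the table above is consistent with standard competition ranking, in which tied entries share a rank and the next rank is skipped accordingly; this is why a "top 12" list contains 15 rows. The sketch below reproduces the column from the citation counts; the function name `competition_ranks` is illustrative.

```python
def competition_ranks(counts):
    """Standard competition ranking for counts sorted in descending order.

    Tied counts share the rank of the first tied entry; the rank after a tie
    is the 1-based position in the list.
    """
    ranks = []
    for position, count in enumerate(counts):
        if position > 0 and count == counts[position - 1]:
            ranks.append(ranks[-1])      # same count as previous row: share rank
        else:
            ranks.append(position + 1)   # new count: rank equals position
    return ranks


# Citation counts from the table, in the order listed.
times_cited = [15, 15, 14, 11, 11, 9, 8, 8, 8, 7, 7, 6, 6, 6, 6]
ranks = competition_ranks(times_cited)
print(ranks)
```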
Top four authors of references in B's 35 refined references.
| Author | Appearances in reference sections | Rank |
|---|---|---|
| Griffiths, M. D. | 8 | 1 |
| Phillips, J. G. | 4 | 2 |
| Young, K. | 2 | 3 |
| Dickerson, M. G. | 2 | 3 |
The remaining authors each appear only once in the reference sections.