Chao Xu, Chen Xu, Wenjing Tian, Anqing Hu, Rui Jiang.
Abstract
In this study, Wikipedia page views for four selected topics (education, economy/finance, medicine, and nature/environment) from 2016 to 2018 are collected, and the sample entropies of each year's page views are estimated using a short-time series multiscale entropy (sMSE) algorithm to gain a comprehensive understanding of the complexity of human website searching activities. The sample entropies of the selected topics are found to exhibit different temporal variations. Over the three years, the temporal characteristics of the sample entropies are clearly revealed, and the sample entropies of the selected topics follow the same tendencies and can be quantitatively ranked. Taking the 95% confidence interval into account, the temporal variations of the sample entropies are further validated by non-parametric statistical analysis, including the Wilcoxon signed-rank test and the Mann-Whitney U-test. The results suggest that the sample entropies estimated by the sMSE algorithm are feasible for analyzing the temporal variations of complexity for certain topics, whereas the regular variations of the estimated sample entropies across the selected topics cannot simply be accepted as is. Potential explanations and directions for forthcoming studies are also discussed.
Keywords: Wikipedia; complexity; human behavior; multiscale entropy; page view; sample entropy
Year: 2019 PMID: 33266944 PMCID: PMC7514710 DOI: 10.3390/e21030229
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. The calculated sample entropies of the white noise and 1/f noise series using the short-time series multiscale entropy (sMSE) algorithm (series length: 1024, m = 2, r = 0.15 × std).
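As a rough illustration of the quantity plotted in Figure 1, the sketch below estimates single-scale sample entropy in plain Python with m = 2 and r = 0.15 × std, the parameters quoted in the caption. It is not the sMSE algorithm itself. The function name `sample_entropy`, the use of the population standard deviation, the `<=` tolerance convention, and the 300-point Gaussian series are all assumptions made for illustration.

```python
import math
import random
import statistics


def sample_entropy(series, m=2, r_factor=0.15):
    """Single-scale sample entropy, -ln(A / B) (illustrative sketch).

    B counts pairs of length-m templates whose Chebyshev distance is
    within r; A counts the same for length m + 1. Self-matches are
    excluded. r = r_factor * population std (an assumed convention).
    """
    n = len(series)
    r = r_factor * statistics.pstdev(series)

    def count_matches(length):
        # Use the first n - m start indices for both template lengths,
        # so that A and B are counted over the same set of templates.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b)
                           for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return -math.log(a / b)


# A perfectly periodic series is fully predictable: entropy is zero.
periodic = [0.0, 1.0] * 50
# Hypothetical white-noise series (shorter than the 1024 points of Figure 1
# to keep the O(n^2) pure-Python loop fast).
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(300)]

print(sample_entropy(periodic))
print(sample_entropy(white_noise))
```

The quadratic pair loop is kept for readability; for year-long daily series of about 365 points it remains fast enough in pure Python.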
Figure 2. Data classification and the multilevel structure of Wikipedia.
Characteristics of the collected data.
| Topics | Number of Subcategories | Year | Length | Mean Value |
|---|---|---|---|---|
| Medicine | 28 | 2016 | 366 | 212,281 |
| | | 2017 | 365 | 199,207 |
| | | 2018 | 359 | 198,467 |
| Education | 27 | 2016 | 366 | 312,365 |
| | | 2017 | 365 | 319,447 |
| | | 2018 | 359 | 298,771 |
| Economy/finance | 32 | 2016 | 366 | 354,896 |
| | | 2017 | 365 | 323,208 |
| | | 2018 | 359 | 293,831 |
| Nature/environment | 16 | 2016 | 366 | 145,095 |
| | | 2017 | 365 | 148,254 |
| | | 2018 | 359 | 110,283 |
Figure 3. Page views of the four selected topics in 2016, 2017 and 2018: (a) Education; (b) Economy/Finance; (c) Medicine; (d) Nature/Environment.
Figure 4. Sample entropies of page views of the four selected topics: (a) Education; (b) Economy/Finance; (c) Medicine; (d) Nature/Environment.
Figure 5. Comparison of sample entropies between the four selected topics: (a) 2016; (b) 2017; and (c) 2018.
p values of the Wilcoxon signed-rank test for results in Figure 4.
| Year Pairs | (2016,2017) | (2016,2018) | (2017,2018) |
|---|---|---|---|
| | 0.0273 | 0.0488 | 0.0137 |
| | 0.4922 | 0.0840 | 0.0039 |
| | 0.4922 | 0.6250 | 1.0000 |
| | 0.8457 | 0.4316 | 0.9219 |
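For readers unfamiliar with how such p values arise, the following is a minimal sketch of an exact two-sided Wilcoxon signed-rank test in plain Python. It assumes no zero differences and no tied absolute differences. The function name `wilcoxon_signed_rank_exact` and the paired data are hypothetical and are not taken from the study.

```python
def wilcoxon_signed_rank_exact(x, y):
    """Exact two-sided Wilcoxon signed-rank test (illustrative sketch).

    Assumes the paired differences have no zeros and no tied magnitudes.
    Returns (W, p), where W = min(W+, W-).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    # Rank the differences by absolute value (rank 1 = smallest).
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    w_plus = sum(rank for rank, i in enumerate(order, start=1)
                 if diffs[i] > 0)
    total = n * (n + 1) // 2
    w = min(w_plus, total - w_plus)

    # counts[s] = number of the 2**n sign assignments whose positive
    # ranks sum to s (subset-sum dynamic programme over ranks 1..n).
    counts = [1] + [0] * total
    for rank in range(1, n + 1):
        for s in range(total, rank - 1, -1):
            counts[s] += counts[s - rank]

    p = 2 * sum(counts[:w + 1]) / 2 ** n
    return w, min(1.0, p)


# Hypothetical paired values, e.g. one value per scale for two years.
year_a = [0.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
year_b = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
w, p = wilcoxon_signed_rank_exact(year_a, year_b)
print(w, p)  # W = 1, p = 4/1024 = 0.00390625
```

An exact enumeration is used here because, with only a handful of pairs, the normal approximation can be noticeably off; a statistics library would normally be preferred in practice.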
p values of the Mann-Whitney U-test for results in Figure 5a.
| Topic Pairs | (Edu, E/F) | (Edu, Med) | (Edu, N/E) | (E/F, Med) | (E/F, N/E) | (Med, N/E) |
|---|---|---|---|---|---|---|
| p value | 0.4727 | 0.1859 | 0.5205 | 0.0757 | 0.2730 | 0.3847 |
p values of the Mann-Whitney U-test for results in Figure 5b.
| Topic Pairs | (Edu, E/F) | (Edu, Med) | (Edu, N/E) | (E/F, Med) | (E/F, N/E) | (Med, N/E) |
|---|---|---|---|---|---|---|
| p value | 0.2123 | 0.3447 | 0.5708 | 0.1041 | 0.2703 | 0.2413 |
p values of the Mann-Whitney U-test for results in Figure 5c.
| Topic Pairs | (Edu, E/F) | (Edu, Med) | (Edu, N/E) | (E/F, Med) | (E/F, N/E) | (Med, N/E) |
|---|---|---|---|---|---|---|
| p value | 0.4274 | 0.2413 | 0.7913 | 0.1212 | 0.2447 | 0.3847 |
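The Mann-Whitney U-test used for the topic-pair comparisons can be sketched as follows. This is a simplified version using the normal approximation without tie or continuity corrections, which may differ from the procedure behind the tabulated values. The function name `mann_whitney_u` and the two samples are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U-test via the normal approximation.

    Illustrative sketch: no tie correction and no continuity correction.
    Returns (U, p), where U = min(U_x, U_y).
    """
    n1, n2 = len(x), len(y)
    # U_x counts pairs (a, b) with a > b; ties contribute one half.
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0
              for a in x for b in y)
    u = min(u_x, n1 * n2 - u_x)

    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    # Two-sided p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p


# Hypothetical, fully separated samples (not data from the study).
sample_a = [1.0, 2.0, 3.0]
sample_b = [4.0, 5.0, 6.0]
u, p = mann_whitney_u(sample_a, sample_b)
print(u, round(p, 4))
```

For samples as small as these, an exact test would be more appropriate than the normal approximation; the sketch is meant only to show how the U statistic is formed.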