Iassen Halatchliyski, Ulrike Cress.
Abstract
Using a longitudinal network analysis approach, we investigate the structural development of the knowledge base of Wikipedia in order to explain the appearance of new knowledge. The data consists of the articles in two adjacent knowledge domains: psychology and education. We analyze the development of networks of knowledge consisting of interlinked articles at seven snapshots from 2006 to 2012 with an interval of one year between them. Longitudinal data on the topological position of each article in the networks is used to model the appearance of new knowledge over time. Thus, the structural dimension of knowledge is related to its dynamics. Using multilevel modeling as well as eigenvector and betweenness measures, we explain the significance of pivotal articles that are either central within one of the knowledge domains or boundary-crossing between the two domains at a given point in time for the future development of new knowledge in the knowledge base.
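As a rough illustration of the two centrality measures named in the abstract, the following minimal sketch computes eigenvector and betweenness centrality on a small undirected link network. The article names and links are hypothetical toy data, and the pure-Python routines (power iteration and Brandes' algorithm) are assumptions chosen for self-containment, not a reconstruction of the authors' tooling.

```python
from collections import deque

# Hypothetical toy network: two triangles of articles joined by one link.
EDGES = [
    ("Memory", "Attention"), ("Memory", "Perception"), ("Attention", "Perception"),
    ("Perception", "Learning"),  # the only link between the two clusters
    ("Learning", "Curriculum"), ("Learning", "Pedagogy"), ("Curriculum", "Pedagogy"),
]

def build_adjacency(edges):
    """Return an undirected adjacency mapping: article -> set of neighbors."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def eigenvector_centrality(adj, iterations=200):
    """Power iteration on A + I (same eigenvectors as A, but avoids oscillation)."""
    score = {n: 1.0 for n in adj}
    for _ in range(iterations):
        new = {n: score[n] + sum(score[m] for m in adj[n]) for n in adj}
        norm = max(new.values())
        score = {n: v / norm for n, v in new.items()}
    return score

def betweenness_centrality(adj):
    """Unnormalized betweenness for an unweighted, undirected graph (Brandes)."""
    bc = {n: 0.0 for n in adj}
    for s in adj:
        dist = {s: 0}
        sigma = {n: 0 for n in adj}   # number of shortest paths from s
        sigma[s] = 1
        preds = {n: [] for n in adj}  # predecessors on shortest paths
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {n: 0.0 for n in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {n: b / 2 for n, b in bc.items()}

adjacency = build_adjacency(EDGES)
eigenvector = eigenvector_centrality(adjacency)
betweenness = betweenness_centrality(adjacency)
for name in sorted(adjacency):
    print(f"{name}: eigenvector={eigenvector[name]:.3f}, betweenness={betweenness[name]:.1f}")
```

In this toy network the two articles at either end of the connecting link score highest on both measures, which mirrors the abstract's distinction between articles that are central within a domain and articles that cross the boundary between domains.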
Year: 2014 PMID: 25365319 PMCID: PMC4218828 DOI: 10.1371/journal.pone.0111958
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Development of the number of categorized articles and of authors.
| snapshot year | specialized psychology articles | specialized education articles | intersection articles | Σ articles | Σ authors |
| 2006 | 2176 | 1357 | 325 | 3858 | 1776 |
| 2007 | 2911 | 1980 | 450 | 5341 | 3113 |
| 2008 | 3472 | 2556 | 526 | 6554 | 4265 |
| 2009 | 3908 | 3108 | 581 | 7597 | 5251 |
| 2010 | 4262 | 3595 | 626 | 8483 | 6104 |
| 2011 | 4660 | 4166 | 686 | 9512 | 7047 |
| 2012 | 5085 | 4696 | 731 | 10512 | 8002 |
Yearly growth in the total number of articles and authors.
| | 2002/03 | 2003/04 | 2004/05 | 2005/06 | 2006/07 | 2007/08 | 2008/09 | 2009/10 | 2010/11 | 2011/12 |
| articles | 119 | 574 | 1658 | 1486 | 1483 | 1213 | 1043 | 886 | 1029 | 1000 |
| authors | 12 | 135 | 677 | 952 | 1337 | 1152 | 986 | 853 | 943 | 955 |
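The growth figures from 2006/07 onward can be cross-checked against the snapshot totals in the first table: each is the difference between two consecutive yearly totals. A minimal sketch of that arithmetic follows; the variable and function names are illustrative only, and the earlier periods (2002/03 to 2005/06) are left out because no snapshot totals before 2006 are listed.

```python
# Totals copied from the snapshot table (Σ articles and Σ authors columns).
SNAPSHOT_YEARS = [2006, 2007, 2008, 2009, 2010, 2011, 2012]
TOTAL_ARTICLES = [3858, 5341, 6554, 7597, 8483, 9512, 10512]
TOTAL_AUTHORS = [1776, 3113, 4265, 5251, 6104, 7047, 8002]

def yearly_growth(totals):
    """First differences between consecutive snapshot totals."""
    return [later - earlier for earlier, later in zip(totals, totals[1:])]

article_growth = yearly_growth(TOTAL_ARTICLES)
author_growth = yearly_growth(TOTAL_AUTHORS)

periods = zip(SNAPSHOT_YEARS, SNAPSHOT_YEARS[1:])
for (start, end), articles, authors in zip(periods, article_growth, author_growth):
    print(f"{start}/{str(end)[2:]}: +{articles} articles, +{authors} authors")
```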
Development of the number of articles with new contributions per period.
| | 2006–2007 | 2007–2008 | 2008–2009 | 2009–2010 | 2010–2011 | 2011–2012 |
| psychology | 1285 | 1561 | 1541 | 1428 | 1451 | 1515 |
| education | 739 | 955 | 1026 | 1049 | 1071 | 1161 |
| intersection | 198 | 269 | 252 | 210 | 233 | 231 |
| Σ | 2222 | 2785 | 2819 | 2687 | 2755 | 2907 |
Figure 1. Degree distribution in the combined network of psychology and education articles in the seven snapshot years.
On a log-log scale, the colored points display the frequency of articles with a given number of neighbors over the years.
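A degree distribution of this kind can be tabulated from a list of links between articles. The sketch below is a minimal illustration on a hypothetical five-article edge list; the function names are assumptions, and it shows only how points for a log-log plot could be derived, not how the figure itself was produced.

```python
from collections import Counter
import math

# Hypothetical undirected links between articles A..E.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("B", "C")]

def degree_distribution(edges):
    """Map each degree (number of neighbors) to the number of articles having it."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return Counter(degree.values())

def log_log_points(distribution):
    """Return (log10 degree, log10 frequency) pairs, sorted by degree."""
    return sorted((math.log10(k), math.log10(count)) for k, count in distribution.items())

distribution = degree_distribution(EDGES)
points = log_log_points(distribution)
print(dict(distribution))
print(points)
```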
Figure 2. Preferential attachment in the combined network of psychology and education articles in the seven snapshot years.
On a log-log scale, the colored lines show the relationship between the degree of an article and its probability of being linked to a newly created article for each of the yearly snapshots.
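One simple way to estimate such a relationship is to group articles by their degree at one snapshot and compute, per degree, the share of articles that are linked to an article created before the next snapshot. The sketch below illustrates this on hypothetical data; the article identifiers, the input structures, and the estimator itself are assumptions for illustration and may differ from the procedure behind the figure.

```python
from collections import defaultdict

# Hypothetical data: degree of each existing article at snapshot t ...
DEGREE_AT_T = {"a": 1, "b": 1, "c": 1, "d": 1, "e": 2, "f": 2, "g": 8}
# ... and the existing articles linked to an article created by snapshot t+1.
LINKED_TO_NEW = {"a", "e", "g"}

def attachment_probability(degree_at_t, linked_to_new):
    """Per degree k, the fraction of articles with degree k that gained a new neighbor."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for article, k in degree_at_t.items():
        totals[k] += 1
        if article in linked_to_new:
            hits[k] += 1
    return {k: hits[k] / totals[k] for k in sorted(totals)}

probabilities = attachment_probability(DEGREE_AT_T, LINKED_TO_NEW)
print(probabilities)
```

Under preferential attachment the estimated probability rises with degree, as it does in this toy example.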
Multilevel logistic models of receiving newly created articles as neighbors.
| Variable | Level of variable | Estimate | Std. Error | z value | Pr(>|z|) |
| Combined | | | | | |
| (Intercept) | | 2.56 | 0.11 | 23.19 | <2e-16*** |
| creation year | article | −0.33 | 0.01 | −25.27 | <2e-16*** |
| article age | period | −0.22 | 0.01 | −21.57 | <2e-16*** |
| t-1 log betweenness | period | 0.31 | 0.01 | 33.30 | <2e-16*** |
| t-1 log edit count | period | 0.26 | 0.02 | 11.40 | <2e-16*** |
| education article | article | −0.17 | 0.04 | −4.28 | 1.9e-05*** |
| intersection article | article | 0.19 | 0.07 | 2.87 | 0.0041** |
| featured article | article | 0.04 | 0.19 | 0.20 | 0.8399 |
| log controversiality | article | 0.05 | 0.01 | 4.26 | 2.0e-05*** |
| Psychology | | | | | |
| (Intercept) | | 1.55 | 0.10 | 15.33 | <2e-16*** |
| creation year | article | −0.47 | 0.02 | −27.58 | <2e-16*** |
| article age | period | −0.33 | 0.01 | −26.14 | <2e-16*** |
| t-1 log eigenvector | period | 0.51 | 0.02 | 26.82 | <2e-16*** |
| t-1 log edit count | period | 0.37 | 0.03 | 12.894 | <2e-16*** |
| intersection article | article | 0.68 | 0.07 | 9.50 | <2e-16*** |
| featured article | article | −0.14 | 0.22 | −0.61 | 0.5430 |
| log controversiality | article | 0.06 | 0.01 | 4.32 | 1.5e-05*** |
| Education | | | | | |
| (Intercept) | | −0.20 | 0.10 | −1.89 | 0.0586. |
| creation year | article | −0.38 | 0.02 | −19.97 | <2e-16*** |
| article age | period | −0.18 | 0.01 | −12.47 | <2e-16*** |
| t-1 log eigenvector | period | 0.27 | 0.02 | 15.69 | <2e-16*** |
| t-1 log edit count | period | 0.47 | 0.03 | 13.80 | <2e-16*** |
| intersection article | article | 0.70 | 0.08 | 9.04 | <2e-16*** |
| featured article | article | 0.37 | 0.40 | 0.91 | 0.3605 |
| log controversiality | article | 0.08 | 0.02 | 4.15 | 3.3e-05*** |
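Because these are logistic models, the estimates are on the log-odds scale, and exponentiating an estimate gives an odds ratio. The sketch below does this for a few rows of the combined model, assuming the standard logit link; the dictionary and function names are illustrative. The z values recomputed here from the rounded table entries only approximate the reported ones.

```python
import math

# (estimate, standard error) copied from the combined model in the table above.
COMBINED = {
    "t-1 log betweenness": (0.31, 0.01),
    "t-1 log edit count": (0.26, 0.02),
    "education article": (-0.17, 0.04),
    "intersection article": (0.19, 0.07),
}

def odds_ratio(estimate):
    """Convert a log-odds coefficient into an odds ratio."""
    return math.exp(estimate)

for name, (estimate, std_error) in COMBINED.items():
    ratio = odds_ratio(estimate)
    z_approx = estimate / std_error  # rough, because the inputs are rounded
    print(f"{name}: odds ratio = {ratio:.2f}, approx. z = {z_approx:.1f}")
```

An odds ratio above 1 (for example for t-1 log betweenness) indicates higher odds of receiving a newly created article as a neighbor; a ratio below 1 (education article) indicates lower odds, holding the other variables fixed.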
Multilevel linear models of the change in the edit count of the neighboring articles.
| Variable | Level of variable | Estimate | Std. Error | t value | Pr(>|z|) |
| Combined | | | | | |
| (Intercept) | | 131.95 | 4.09 | 32.29 | <2e-16*** |
| creation year | article | −7.73 | 0.48 | −15.97 | <2e-16*** |
| article age | period | −8.13 | 0.35 | −23.32 | <2e-16*** |
| Δ neighbors since t-1 | period | 20.42 | 0.11 | 180.30 | <2e-16*** |
| t-1 log betweenness | period | 5.86 | 0.33 | 17.82 | <2e-16*** |
| t-1 log edit count | period | 9.98 | 0.90 | 11.82 | <2e-16*** |
| education article | article | −37.04 | 1.67 | −22.19 | <2e-16*** |
| intersection article | article | −9.28 | 2.91 | −3.19 | 0.0014** |
| featured article | article | 53.85 | 8.55 | 6.30 | 3.0e-10*** |
| log controversiality | article | 6.28 | 0.50 | 12.68 | <2e-16*** |
| Psychology | | | | | |
| (Intercept) | | 116.88 | 3.47 | 33.66 | <2e-16*** |
| creation year | article | −9.02 | 0.57 | −15.79 | <2e-16*** |
| article age | period | −9.06 | 0.43 | −21.28 | <2e-16*** |
| Δ neighbors since t-1 | period | 23.66 | 0.14 | 171.27 | <2e-16*** |
| t-1 log eigenvector | period | 13.30 | 0.58 | 22.80 | <2e-16*** |
| t-1 log edit count | period | 13.19 | 1.04 | 12.74 | <2e-16*** |
| intersection article | article | 0.79 | 2.73 | 0.29 | 0.7732 |
| featured article | article | 58.11 | 9.01 | 6.45 | 1.1e-10*** |
| log controversiality | article | 6.61 | 0.55 | 11.92 | <2e-16*** |
| Education | | | | | |
| (Intercept) | | 53.14 | 2.43 | 21.90 | <2e-16*** |
| creation year | article | −6.38 | 0.43 | −14.79 | <2e-16*** |
| article age | period | −5.85 | 0.33 | −17.66 | <2e-16*** |
| Δ neighbors since t-1 | period | 14.87 | 0.13 | 112.62 | <2e-16*** |
| t-1 log eigenvector | period | 4.52 | 0.36 | 12.70 | <2e-16*** |
| t-1 log edit count | period | 10.54 | 0.83 | 12.63 | <2e-16*** |
| intersection article | article | 36.77 | 2.09 | 17.56 | <2e-16*** |
| featured article | article | 1.10 | 11.27 | 0.10 | 0.9225 |
| log controversiality | article | 4.04 | 0.56 | 7.23 | 4.9e-13*** |
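To read the estimates together with their standard errors, an approximate 95% interval can be formed as estimate ± 1.96 × standard error. The sketch below applies this to two rows of the psychology model; the normal approximation and the helper name `wald_interval` are assumptions for illustration, not part of the reported analysis.

```python
def wald_interval(estimate, std_error, z=1.96):
    """Approximate 95% interval under a normal approximation."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width

# (estimate, standard error) copied from the psychology model in the table above.
ROWS = {
    "intersection article": (0.79, 2.73),
    "featured article": (58.11, 9.01),
}

intervals = {name: wald_interval(est, se) for name, (est, se) in ROWS.items()}
for name, (low, high) in intervals.items():
    print(f"{name}: [{low:.2f}, {high:.2f}]")
```

The interval for the intersection indicator spans zero, in line with its non-significant p value, while the interval for featured articles lies well above zero.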
Multilevel logistic models of an article receiving new edits.
| Variable | Level of variable | Estimate | Std. Error | z value | Pr(>|z|) |
| Combined | | | | | |
| (Intercept) | | 0.63 | 0.09 | 7.11 | 1.2e-12*** |
| creation year | article | −0.39 | 0.01 | −34.40 | <2e-16*** |
| article age | period | −0.31 | 0.01 | −32.86 | <2e-16*** |
| t-1 log betweenness | period | 0.09 | 0.01 | 12.10 | <2e-16*** |
| t-1 log edit count | period | 0.65 | 0.02 | 33.02 | <2e-16*** |
| education article | article | 0.01 | 0.03 | 0.20 | 0.8420 |
| intersection article | article | 0.01 | 0.06 | 0.34 | 0.7350 |
| featured article | article | 0.18 | 0.19 | 0.94 | 0.35 |
| log controversiality | article | 0.20 | 0.01 | 16.34 | <2e-16*** |
| Psychology | | | | | |
| (Intercept) | | −0.02 | 0.07 | −0.21 | 0.8303 |
| creation year | article | −0.43 | 0.01 | −32.38 | <2e-16*** |
| article age | period | −0.33 | 0.01 | −30.33 | <2e-16*** |
| t-1 log eigenvector | period | 0.05 | 0.01 | 4.43 | 9.6e-06*** |
| t-1 log edit count | period | 0.68 | 0.02 | 29.94 | <2e-16*** |
| intersection article | article | 0.11 | 0.05 | 1.98 | 0.0475* |
| featured article | article | 0.17 | 0.21 | 0.83 | 0.4093 |
| log controversiality | article | 0.19 | 0.01 | 14.08 | <2e-16*** |
| Education | | | | | |
| (Intercept) | | −0.25 | 0.08 | −3.31 | 0.0009*** |
| creation year | article | −0.37 | 0.01 | −26.28 | <2e-16*** |
| article age | period | −0.31 | 0.01 | −24.51 | <2e-16*** |
| t-1 log eigenvector | period | 0.03 | 0.01 | 2.92 | 0.0035** |
| t-1 log edit count | period | 0.72 | 0.03 | 27.67 | <2e-16*** |
| intersection article | article | 0.10 | 0.06 | 1.83 | 0.0678. |
| featured article | article | 0.42 | 0.34 | 1.23 | 0.2160 |
| log controversiality | article | 0.22 | 0.02 | 11.34 | <2e-16*** |