Andrej Kastrin, Thomas C Rindflesch, Dimitar Hristovski.
Abstract
Concept associations can be represented by a network that consists of a set of nodes representing concepts and a set of edges representing their relationships. Complex networks exhibit some common topological features, including small diameter, high degree of clustering, power-law degree distribution, and modularity. We investigated the topological properties of a network constructed from co-occurrences between MeSH descriptors in the MEDLINE database. We conducted the analysis on two networks, one constructed from all MeSH descriptors and another using only major descriptors. Network reduction was performed using Pearson's chi-square test for independence. To characterize the topological properties of the network we adopted several specific measures, including diameter, average path length, clustering coefficient, and degree distribution. For the full MeSH network the average path length was 1.95 edges, with a diameter of three edges and a clustering coefficient of 0.26. The Kolmogorov-Smirnov test rejected the power law as a plausible model for the degree distribution. For the major MeSH network the average path length was 2.63 edges, with a diameter of seven edges and a clustering coefficient of 0.15. The Kolmogorov-Smirnov test failed to reject the power law as a plausible model; the power-law exponent was 5.07. In both networks, nodes with a lower degree exhibited higher clustering than those with a higher degree. After a simulated attack in which we removed the 10% of nodes with the highest degrees, the giant component of each of the two networks still contained about 90% of all nodes. Because of its small average path length and high degree of clustering, the MeSH network is small-world. A power-law distribution is not a plausible model for the degree distribution. The network is highly modular, highly resistant to targeted and random attack, and shows minimal disassortativity.
Year: 2014 PMID: 25006672 PMCID: PMC4090190 DOI: 10.1371/journal.pone.0102188
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Toy example of the constructed network.
Nodes represent MeSH descriptors. An edge between two MeSH descriptors is defined if they appear together in the same MEDLINE citation. Frequency of co-occurrence is represented by edge width. For example, the pair “Medical Informatics” – “Gene Expression” occurs in many more citations than does the pair “Principal Component Analysis” – “Pluripotent Stem Cells”. Note that we use frequency information only for network reduction purposes (i.e., to obtain a statistic which indicates whether a pair of descriptors occurs together more often than by chance).
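The construction described in this caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the function name `cooccurrence_edges` and the toy citations are hypothetical, and each citation is assumed to be available as a plain list of descriptor strings.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(citations):
    """Count how many citations each unordered descriptor pair shares."""
    edge_counts = Counter()
    for descriptors in citations:
        # Deduplicate and sort so every pair has one canonical orientation.
        for u, v in combinations(sorted(set(descriptors)), 2):
            edge_counts[(u, v)] += 1
    return edge_counts


# Hypothetical toy citations, each a list of MeSH descriptors.
toy_citations = [
    ["Medical Informatics", "Gene Expression"],
    ["Medical Informatics", "Gene Expression", "Principal Component Analysis"],
    ["Principal Component Analysis", "Pluripotent Stem Cells"],
]

edges = cooccurrence_edges(toy_citations)
for (u, v), weight in sorted(edges.items()):
    print(f"{u} - {v}: {weight}")
```

The counts play the role of the edge widths in the figure. In line with the caption, they would serve only as input to the reduction step and not as edge weights in the topological analysis.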
Contingency table of observed frequencies for pairs of MeSH descriptors.
| | V | not V | Total |
|---|---|---|---|
| U | O11 | O12 | R1 |
| not U | O21 | O22 | R2 |
| Total | C1 | C2 | N |
Note: U = MeSH descriptor u, V = MeSH descriptor v, O = observed frequency, R = row total, C = column total, N = grand total. For example, cell O12 refers to the observed frequency of pairs in which descriptor u occurs, but descriptor v does not occur.
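As a rough illustration of how the four observed frequencies could be tallied, the sketch below counts them for one descriptor pair. It assumes the counting unit is the citation and that each citation is a set of descriptor strings; the function name `observed_table` and the toy data are hypothetical and not taken from the paper.

```python
def observed_table(citations, u, v):
    """Return [[O11, O12], [O21, O22]] for descriptors u and v."""
    o11 = o12 = o21 = o22 = 0
    for descriptors in citations:
        has_u = u in descriptors
        has_v = v in descriptors
        if has_u and has_v:
            o11 += 1  # u and v occur together
        elif has_u:
            o12 += 1  # u occurs, v does not
        elif has_v:
            o21 += 1  # v occurs, u does not
        else:
            o22 += 1  # neither occurs
    return [[o11, o12], [o21, o22]]


# Hypothetical toy corpus: each citation is a set of descriptors.
toy_corpus = [{"A", "B"}, {"A"}, {"B"}, {"C"}, {"A", "B"}]
table = observed_table(toy_corpus, "A", "B")
print(table)
```

The row totals, column totals, and grand total follow by summing the returned cells.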
Calculation of expected frequencies for pairs of MeSH descriptors.
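| | V | not V |
|---|---|---|
| U | E11 = (R1 × C1) / N | E12 = (R1 × C2) / N |
| not U | E21 = (R2 × C1) / N | E22 = (R2 × C2) / N |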
Note: U = MeSH descriptor u, V = MeSH descriptor v, E = expected frequency, R = row total of observed frequencies, C = column total of observed frequencies, N = grand total of observed frequencies.
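The expected frequencies and the resulting Pearson chi-square statistic can be computed directly from a 2 × 2 table of observed counts. The following is a minimal sketch under stated assumptions: the function name `chi_square_2x2` and the example counts are hypothetical, no continuity correction is applied, and every row and column total is assumed to be non-zero.

```python
def chi_square_2x2(observed):
    """Return (expected, statistic) for a 2 x 2 table of observed counts."""
    row_totals = [sum(row) for row in observed]        # R1, R2
    col_totals = [sum(col) for col in zip(*observed)]  # C1, C2
    n = sum(row_totals)                                # grand total N
    # Expected frequency under independence: E_ij = R_i * C_j / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Pearson statistic: sum over cells of (O - E)^2 / E.
    statistic = sum(
        (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    return expected, statistic


# Hypothetical observed counts [[O11, O12], [O21, O22]].
expected, statistic = chi_square_2x2([[30, 10], [20, 40]])
print(expected)   # [[20.0, 20.0], [30.0, 30.0]]
print(statistic)
```

For a 2 × 2 table the statistic has one degree of freedom. The significance threshold used to keep or drop an edge is not given in this text, so any cut-off applied to `statistic` would be an additional assumption.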
Figure 2. Complementary cumulative degree distribution.
The plot shows the complementary cumulative degree distribution for the full (left) and major (right) MeSH networks.
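A complementary cumulative degree distribution gives, for each degree k, the fraction of nodes whose degree is at least k. The sketch below shows one way to compute it from a list of node degrees; the function name `degree_ccdf` and the toy degree sequence are hypothetical, and the "at least k" convention (rather than "greater than k") is an assumption.

```python
from collections import Counter


def degree_ccdf(degrees):
    """Return sorted (k, P(K >= k)) pairs for a list of node degrees."""
    n = len(degrees)
    counts = Counter(degrees)
    remaining = n  # number of nodes with degree >= current k
    ccdf = []
    for k in sorted(counts):
        ccdf.append((k, remaining / n))
        remaining -= counts[k]
    return ccdf


# Hypothetical degree sequence for four nodes.
ccdf = degree_ccdf([1, 1, 2, 3])
print(ccdf)  # [(1, 1.0), (2, 0.5), (3, 0.25)]
```

Plotting these pairs on log-log axes yields a curve of the kind the figure describes; an approximately straight line would be consistent with a power law, although a formal goodness-of-fit test is still required.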
Figure 3. Word cloud.
Visual summary of the 50 highest-degree MeSH descriptors for the full (left) and major (right) MeSH networks. Text size is proportional to node degree.
Figure 4. Average clustering per degree.
The plot shows the average clustering coefficient of nodes per degree for the full (left) and major (right) MeSH networks. Nodes with a smaller degree exhibit higher clustering than those with a larger degree. The decay can be approximated by a power-law dependency.
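The quantity plotted here can be sketched as follows: compute the local clustering coefficient of every node, then average it over all nodes that share the same degree. This is an illustrative sketch only. The function names and the toy graph are hypothetical, the graph is assumed to be undirected and stored as a dictionary of neighbor sets, and nodes with fewer than two neighbors are assigned a coefficient of 0 by convention.

```python
from collections import defaultdict
from itertools import combinations


def local_clustering(adjacency, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention assumed here for degree 0 or 1
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
    return 2.0 * links / (k * (k - 1))


def average_clustering_per_degree(adjacency):
    """Map each degree k to the mean local clustering of degree-k nodes."""
    by_degree = defaultdict(list)
    for node, neighbors in adjacency.items():
        by_degree[len(neighbors)].append(local_clustering(adjacency, node))
    return {k: sum(vals) / len(vals) for k, vals in sorted(by_degree.items())}


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to a.
toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
clustering_by_degree = average_clustering_per_degree(toy_graph)
print(clustering_by_degree)
```

In this toy graph the degree-3 node has lower clustering than the degree-2 nodes, a small-scale echo of the trend the caption reports.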