Wesley D Maciel, Alessandra C Faria-Campos, Marcos A Gonçalves, Sérgio V A Campos.
Abstract
BACKGROUND: Biological systems are commonly described as networks of interacting entities. Some interactions are already known and form part of current knowledge in the life sciences. Others remain unknown for long periods and are frequently discovered by chance. In this work we present a model that predicts these unknown interactions from a textual collection using the vector space model (VSM), a well-known and established information retrieval model. We extend the VSM's retrieval ability with a transitive closure approach. Our objective is to use the VSM to identify known interactions in the literature and construct a network from them. Based on the interactions established in the network, our model applies the transitive closure to predict and rank new interactions.
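The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy documents, the entity names, the use of raw term counts, and the scoring rule for predicted interactions (the mean of the two known levels along the path, keeping the best intermediate) are all assumptions made for illustration. Only a single transitive step is applied (A–B and B–C known, A–C unknown).

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def known_interactions(docs, entities):
    """Level of each known pair: mean similarity over documents naming both."""
    vectors = [Counter(d.lower().split()) for d in docs]
    known = {}
    for a, b in combinations(sorted(entities), 2):
        query = Counter([a, b])
        sims = [cosine(query, v) for v in vectors if a in v and b in v]
        if sims:
            known[(a, b)] = sum(sims) / len(sims)
    return known


def predict_new(known, entities):
    """One transitive step: a-c and c-b known, a-b unknown (assumed scoring)."""

    def level(x, y):
        return known.get((min(x, y), max(x, y)))

    predicted = {}
    for a, b in combinations(sorted(entities), 2):
        if (a, b) in known:
            continue
        scores = []
        for c in entities:
            if c in (a, b):
                continue
            ac, cb = level(a, c), level(c, b)
            if ac is not None and cb is not None:
                scores.append((ac + cb) / 2)  # assumed: mean of the two levels
        if scores:
            predicted[(a, b)] = max(scores)  # assumed: keep best intermediate
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy collection, not data from the paper.
docs = [
    "aspirin inhibits cox1",
    "cox1 linked to inflammation",
    "aspirin reduces fever",
]
entities = ["aspirin", "cox1", "inflammation", "fever"]

known = known_interactions(docs, entities)
ranked = predict_new(known, entities)
for pair, score in ranked:
    print(pair, round(score, 4))
```

In this toy collection, `aspirin`–`inflammation` is never stated in a single document but is reachable through `cox1`, so it appears among the ranked predictions.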
Year: 2011 PMID: 22369514 PMCID: PMC3287578 DOI: 10.1186/1471-2164-12-S4-S1
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Representations of entity interactions in a 2-dimensional subnetwork.
Figure 2. Network construction.
Categories and web sources of the biological entities.
| Category | Number of Entities | Number of Clusters | Web Source |
|---|---|---|---|
| Disease | 52 | 22 | Karolinska Institute [ |
| Drug | 44 | 22 | Drugs.com [ |
| Gene | 43 | 20 | Kyoto Encyclopedia of Genes and Genomes [ |
| Target | 50 | 23 | The Free Dictionary [ |
| Total | 189 | 87 | |
The subnetworks.
| Subnetwork | Dimensional Space |
|---|---|
| 1 | |
| 2 | |
| 3 | |
| 4 | |
| 5 | |
| 6 | |
| 7 | |
| 8 | |
| 9 | |
| 10 | |
| 11 | |
The subnetworks and their number of known and new interactions.
| Subnetwork | Dimensional Space | Known Interaction | New Interaction | Total |
|---|---|---|---|---|
| 1 | | 192 | 270 | 462 |
| 2 | | 76 | 184 | 260 |
| 3 | | 138 | 346 | 484 |
| 4 | | 38 | 130 | 168 |
| 5 | | 105 | 294 | 399 |
| 6 | | 50 | 175 | 225 |
| 7 | | 71 | 304 | 375 |
| 8 | | 199 | 958 | 1 157 |
| 9 | | 55 | 269 | 324 |
| 10 | | 34 | 76 | 110 |
| 11 | | 69 | 189 | 258 |
| Total | | 1 027 | 3 195 | 4 222 |
Ranking of subnetworks based on their best new interactions and number of dimensions. The interaction level of known interactions was determined by the arithmetic average of all similarities returned by the vector space model.
| Subnetwork | Dimensional Space | New Interaction | Level of Interaction |
|---|---|---|---|
| 6 | gene | androgen receptor | 0.9757 |
| 2 | disease | HIV | 0.9738 |
| 5 | drug | verapamil | 0.9597 |
| 1 | disease | erectile dysfunction | 0.9470 |
| 3 | disease | arrhythmia | 0.9272 |
| 4 | drug | ciclosporin | 0.8211 |
| 8 | disease | alzheimer dementia | 0.8807 |
| 10 | drug | acarbose | 0.8723 |
| 9 | disease | parkinson disease | 0.8695 |
| 7 | disease | gout | 0.8357 |
| 11 | disease | breast adenocarcinoma | 0.7826 |
Figure 3. Interactions with confirmation patent claims by year.
The top 5 interactions that were known with a high interaction level in 2005 and had been new (predicted) interactions in 2004. The interaction level of known interactions was determined by the arithmetic average of all similarities returned by the vector space model.
| Subnetwork | Dimensional Space | Interaction | Level in 2005 | Level in 2004 | Patents in 2005 | Patents in 2004 |
|---|---|---|---|---|---|---|
| 5 | disease | heart attack | 0.9999 | 0.8324 | 1 | 61 |
| 1 | target | adrenaline | 0.9866 | 0.8676 | 1 | 36 |
| 11 | target | hmg coa reduct. | 0.9190 | 0.6383 | 1 | 5 |
| 2 | target | gp iib/iiia | 0.9137 | 0.8354 | 1 | 103 |
| 4 | disease | HIV | 0.9041 | 0.8825 | 1 | 30 |
Figure 4. Distribution of confirmation patent claims filed in 2005 across the levels of the ranking constructed with patents issued up to 2004 for the subnetwork drug × target. The number of new interactions predicted in this subnetwork in 2004 is 282. We found 4 confirmation patent claims filed in 2005 for the new interactions predicted in 2004. The positions of these 4 confirmation patent claims in the ranking of new interactions predicted in 2004 are 3, 14, 39, and 159, respectively. Thus, 3 confirmation patent claims were among the top 100 best-ranked indications of subnetwork drug × target.
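The top-100 count stated in the caption can be checked with a short sketch. The rank positions and the number of predictions are taken from the caption; the variable names are illustrative.

```python
# Rank positions (in the 2004 ranking) of the claims confirmed in 2005.
ranks_2004 = [3, 14, 39, 159]
n_predicted = 282  # new interactions predicted for this subnetwork in 2004

# Confirmations that fall within the 100 best-ranked predictions.
in_top_100 = [r for r in ranks_2004 if r <= 100]
print(len(in_top_100), "of", len(ranks_2004), "confirmations are in the top 100")

# Relative position of each confirmation within the full ranking.
relative = [round(r / n_predicted, 3) for r in ranks_2004]
print(relative)
```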
The subnetworks and their number of confirmation patent claims at the top 100 new interactions predicted in 2004.
| Subnetwork | Dimensional Space | New Interactions in 2004 | Confirmations Issued in 2005 | Top 100 AVG | Top 100 MAX | Top 100 SUM |
|---|---|---|---|---|---|---|
| 1 | | 275 | 5 | 3 | 4 | 4 |
| 2 | | 167 | 2 | 2 | 2 | 2 |
| 3 | | 348 | 2 | 1 | 1 | 1 |
| 4 | | 119 | 0 | 0 | 0 | 0 |
| 5 | | 282 | 4 | 3 | 3 | 3 |
| 6 | | 152 | 3 | 2 | 2 | 3 |
| 7 | | 308 | 4 | 1 | 1 | 4 |
| 8 | | 786 | 9 | 2 | 2 | 3 |
| 9 | | 242 | 2 | 2 | 2 | 2 |
| 10 | | 76 | 0 | 0 | 0 | 0 |
| 11 | | 175 | 1 | 1 | 1 | 0 |
| Total | | 2 930 | 32 | 17 | 18 | 22 |
| % | | | | 53 | 56 | 69 |
Confirmation papers for the first new interaction predicted in 2005 for each subnetwork.
| Subnetwork | Dimensional Space | First Interaction | Confirmation Papers |
|---|---|---|---|
| 1 | | impotence | [ |
| 2 | | acquired immunodeficiency syndrome | [ |
| 3 | | arrhythmia | [ |
| 4 | | ciclosporin | [ |
| 5 | | verapamil | none |
| 6 | | androgen receptor | [ |
| 7 | | gout | none |
| 8 | | alzheimer’s disease | none |
| 9 | | parkinson’s disease | none |
| 10 | | acarbose | none |
| 11 | | breast cancer | none |
Figure 5. Research space for some possible interactions with the drug aspirin. Gray lines are known interactions and green lines are new interactions. The interaction level of known interactions was determined by the arithmetic average of all similarities returned by the vector space model.
Figure 6. History of the inference process. Entities in blue are genes and entities in red are targets. The interaction level of known interactions was determined by the arithmetic average of all similarities returned by the vector space model.