Rob Jelier, Guido Jenster, Lambert C J Dorssers, Bas J Wouters, Peter J M Hendriksen, Barend Mons, Ruud Delwel, Jan A Kors.
Abstract
BACKGROUND: High-throughput experiments, such as with DNA microarrays, typically result in hundreds of genes potentially relevant to the process under study, rendering the interpretation of these experiments problematic. Here, we propose and evaluate an approach to find functional associations between large numbers of genes and other biomedical concepts from free-text literature. For each gene, a profile of related concepts is constructed that summarizes the context in which the gene is mentioned in literature. We assign a weight to each concept in the profile based on a likelihood ratio measure. Gene concept profiles can then be clustered to find related genes and other concepts.
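The weighting step can be illustrated with a small sketch. The abstract does not state which likelihood ratio measure is used, so the code below assumes Dunning's log-likelihood ratio (G²) on a 2x2 document co-occurrence table as a stand-in. The function names, the document-ID sets, and the toy counts are all hypothetical.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """Dunning's G^2 for a 2x2 co-occurrence table (an assumed measure).

    k11: documents mentioning both the gene and the concept
    k12: documents mentioning the gene but not the concept
    k21: documents mentioning the concept but not the gene
    k22: documents mentioning neither
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing to the sum
                expected = rows[i] * cols[j] / n
                g2 += o * math.log(o / expected)
    return 2.0 * g2


def concept_profile(gene_docs, concept_docs, n_docs):
    """Build {concept: weight} for one gene from sets of document IDs."""
    profile = {}
    for concept, docs in concept_docs.items():
        k11 = len(gene_docs & docs)
        if k11 == 0:
            continue  # keep only concepts that co-occur with the gene
        k12 = len(gene_docs) - k11
        k21 = len(docs) - k11
        k22 = n_docs - k11 - k12 - k21
        profile[concept] = log_likelihood_ratio(k11, k12, k21, k22)
    return profile


# Toy corpus of 20 documents (hypothetical IDs).
toy_profile = concept_profile(
    gene_docs={1, 2, 3},
    concept_docs={
        "Melanosomes": {1, 2, 3, 4},
        "Actins": {3, 5, 6, 7, 8},
        "Exocytosis": {9, 10},
    },
    n_docs=20,
)
```

In this toy corpus the concept that appears in nearly all of the gene's documents receives a much larger weight than the one that overlaps in a single document. Note that G² is symmetric, so a production version would also need to check the direction of the association.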
Year: 2007 PMID: 17233900 PMCID: PMC1784107 DOI: 10.1186/1471-2105-8-14
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Area under the curve scores for individual genes per group for the concept profile method (open boxes) and the ACS (open circles). An asterisk above a group indicates that the difference in performance of the two methods is statistically significant (at the 0.05 level).
Occurrence of monocyte-specific clusters in patient groups.
| MHC 2 | - | - | 4↑ | 13↓ | 9↓ | - | 3↑ | - | 7↓ | 4↑ | - |
| Cathepsins | - | 11↓ | 9↑ | - | 3↓ | - | - | 4↓ | 3* | 3↓ | - |
| NADPH oxidase/respiratory burst | - | - | 4↑ | - | 4↓ | - | 6↑ | - | - | - | - |
| CCR1 | ↓ | ↓ | ↑ | - | - | - | ↑ | - | - | - | - |
| CCR2 | ↓ | - | ↑ | - | - | - | ↑ | - | - | - | - |
| CD14 | - | - | ↑ | - | - | - | ↑ | - | - | - | - |
| LILRB4 | - | - | ↑ | - | - | - | - | - | - | - | ↑ |
* The cathepsins of group 12 include 1 down-regulated and 2 up-regulated genes.
The upper half of the table shows, for each patient group, the presence of the gene clusters that were discussed for patient group 5. Several patient groups are not shown because the SAM analysis yielded very few distinguishing genes for them. The size of each cluster is indicated, and the arrows indicate whether the genes are up- or down-regulated. The lower half of the table shows the presence of the genes that were discussed in the text.
Figure 2. Fragment of the hierarchical clustering tree and heatmap based on the concept profiles for the genes differentially expressed following the agonistic stimulation of the androgen receptor. The tight cluster associated with melanosomes is highlighted.
Concepts representative of the cluster RAB27B, MYRIP, MLPH, RAB27A as given by Anni.
| Concept | Contribution (%) | RAB27B | MYRIP | MLPH | RAB27A |
|---|---|---|---|---|---|
| RAB27A | 52.17 | 0.61 | 0.74 | 0.73 | 1 |
| MLPH | 11.16 | - | 0.44 | 1 | 0.29 |
| Myosin Type V | 7.22 | 0.04 | 0.68 | 0.4 | 0.22 |
| Melanosomes | 6.7 | 0.12 | 0.3 | 0.47 | 0.27 |
| RAB27B | 4.06 | 1 | 0.14 | - | 0.11 |
| MYRIP | 2.98 | 0.07 | 1 | 0.09 | 0.06 |
| Melanocytes | 2.73 | 0.13 | 0.14 | 0.28 | 0.17 |
| Myosins | 2.33 | 0.04 | 0.38 | 0.22 | 0.12 |
| Myosin Heavy Chains | 1.72 | - | 0.46 | 0.18 | 0.09 |
| GTP Phosphohydrolases | 1.31 | 0.17 | 0.23 | 0.04 | 0.08 |
| Actins | 1.17 | 0.05 | 0.32 | 0.12 | 0.06 |
| Exocytosis | 0.87 | 0.08 | 0.12 | 0.08 | 0.12 |
| Secretory Vesicles | 0.68 | 0.07 | 0.16 | 0.06 | 0.09 |
| Carrier Proteins | 0.59 | - | 0.11 | 0.17 | 0.09 |
| Organelles | 0.54 | 0.11 | - | 0.12 | 0.09 |
| rab GTP-Binding Proteins | 0.52 | 0.16 | - | 0.04 | 0.12 |
The first column lists the concept names; the second gives each concept's percentage contribution to the average cosine score (0.57) of this group. Only concepts that contribute at least 0.5% to the average cosine score are listed. The remaining columns show the weight of each concept in the concept profiles of the genes named in the column headings. These weights form the basis for clustering the 4 genes.
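A per-concept percentage contribution to an average cosine score could be computed as sketched below. This is one plausible reading of the description, not necessarily how Anni computes it: each gene pair's cosine is split into per-concept terms, the terms are summed over all pairs, and each sum is expressed as a share of the total. The function names and toy profiles are hypothetical.

```python
import math
from itertools import combinations


def norm(profile):
    """Euclidean length of a {concept: weight} profile."""
    return math.sqrt(sum(w * w for w in profile.values()))


def concept_contributions(profiles):
    """Return (average pairwise cosine, {concept: percentage contribution}).

    profiles maps gene name -> {concept: weight}; at least two genes expected.
    """
    pairs = list(combinations(profiles.values(), 2))
    per_concept = {}
    for a, b in pairs:
        denom = norm(a) * norm(b)
        if denom == 0:
            continue  # an empty profile has no defined cosine
        # Only concepts present in both profiles add to the dot product.
        for concept in a.keys() & b.keys():
            term = a[concept] * b[concept] / denom
            per_concept[concept] = per_concept.get(concept, 0.0) + term
    total = sum(per_concept.values())
    if total == 0:
        return 0.0, {}
    avg_cosine = total / len(pairs)
    percentages = {c: 100.0 * v / total for c, v in per_concept.items()}
    return avg_cosine, percentages


# Toy example with two identical profiles (hypothetical weights).
avg, pct = concept_contributions({
    "GENE_A": {"Melanosomes": 3.0, "Actins": 4.0},
    "GENE_B": {"Melanosomes": 3.0, "Actins": 4.0},
})
# Keep only concepts above a 0.5% cutoff, as in the table description.
shown = {c: p for c, p in pct.items() if p >= 0.5}
```

Because the per-concept terms of a pair sum to that pair's cosine, the percentages always add up to 100 across all concepts; a cutoff such as 0.5% then trims the long tail for display.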