Irena Spasić, Daniel Schober, Susanna-Assunta Sansone, Dietrich Rebholz-Schuhmann, Douglas B Kell, Norman W Paton.
Abstract
BACKGROUND: Many bioinformatics applications rely on controlled vocabularies or ontologies to consistently interpret and seamlessly integrate information scattered across public resources. Experimental data sets from metabolomics studies need to be integrated with one another, but also with data produced by other types of omics studies in the spirit of systems biology, hence the pressing need for vocabularies and ontologies in metabolomics. However, constructing these resources manually is time-consuming and non-trivial.
Year: 2008; PMID: 18460187; PMCID: PMC2367623; DOI: 10.1186/1471-2105-9-S5-S5
Source DB: PubMed; Journal: BMC Bioinformatics; ISSN: 1471-2105; Impact factor: 3.169
Figure 1. The flow of data in a text mining (TM) approach to controlled vocabulary (CV) expansion. The information retrieval module gathers a corpus of documents relevant to a given CV from the literature databases. Automatic term recognition is then applied to the corpus to extract terms as domain-specific lexical units. Extracted terms that are not directly related to the CV are filtered out using knowledge about the types of terms that typically co-occur.
Figure 2. A sub-tree of the MeSH hierarchy. The part of the MeSH hierarchy shown is the one relevant to the two CVs considered, i.e. nuclear magnetic resonance (NMR) and gas chromatography (GC).
Figure 3. An HTML report summarising CV expansion results
Figure 4. Citation details of the retrieved documents
Figure 5. A full-text document retrieved from PMC
Figure 6. A corpus of “Materials and Methods” sections
Figure 7. A list of automatically extracted terms with links to their concordances
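The three stages named in the Figure 1 caption (retrieval, term recognition, filtering) can be illustrated with a minimal Python sketch. It is not the authors' implementation: keyword matching stands in for the information retrieval module, frequency-ranked n-grams stand in for automatic term recognition, and sentence-level co-occurrence with already known CV terms stands in for the filtering step. All function names, the stopword list and the thresholds are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "was", "were",
             "with", "by", "for", "on", "at", "is", "are", "using", "used"}


def split_sentences(text):
    """Naive sentence splitter on common punctuation (an assumption)."""
    return [s for s in re.split(r"[.;:!?]", text.lower()) if s.strip()]


def retrieve(documents, query_terms):
    """Toy retrieval: keep documents that mention any query term."""
    query = [q.lower() for q in query_terms]
    return [d for d in documents if any(q in d.lower() for q in query)]


def candidate_terms(text, max_len=3):
    """Yield word n-grams that neither start nor end with a stopword."""
    for sentence in split_sentences(text):
        words = re.findall(r"[a-z][a-z0-9-]*", sentence)
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                yield " ".join(gram)


def extract_terms(corpus, min_freq=2):
    """Stand-in for term recognition: keep n-grams seen at least min_freq times."""
    counts = Counter(t for doc in corpus for t in candidate_terms(doc))
    return {t: c for t, c in counts.items() if c >= min_freq}


def filter_by_cooccurrence(terms, corpus, known_terms):
    """Keep candidates that share a sentence with a different known CV term."""
    known = [k.lower() for k in known_terms]
    sentences = [s for doc in corpus for s in split_sentences(doc)]
    kept = {}
    for term, count in terms.items():
        for s in sentences:
            if term in s and any(k in s for k in known if k != term):
                kept[term] = count
                break
    return kept
```

With a handful of toy documents, `retrieve(docs, ["NMR"])` would build the corpus, `extract_terms` would propose candidates, and `filter_by_cooccurrence(..., known_terms=["NMR"])` would discard candidates that never appear in a sentence alongside a known term. Substring matching is used throughout for brevity; a more careful version would match on token boundaries.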
Term acquisition results for NMR
| 122,867 | 6,125 (141) | 1,613 | 758 (29) | ||
| 113,191 | 663 | 2,047 | 270 | ||
| 5,602 | 6,215 | 124 | 2,601 | ||
| 2,298 | 3,257 | 61 | 1,385 | ||
Term acquisition results for GC
| 60,338 | 1,351 (79) | 3,948 | 1,383 (58) | ||
| 42,418 | 68 | 3,012 | 97 | ||
| 2,708 | 811 | 2,442 | 1,114 | ||
| 567 | 348 | 1,323 | 526 | ||
Evaluation of term acquisition results for NMR
| 3.81 | 3.19 | 3.5 | 0.88 | |
| 4 | 3 | 3.5 | 1 |
Evaluation of term acquisition results for GC
| 3.06 | 3.79 | 3.425 | 0.93 | |
| 4 | 4 | 4 | 1 |
Figure 8. Distribution of evaluation scores for NMR
Figure 9. Distribution of evaluation scores for GC