Fleur Mougin, Anita Burgun, Olivier Bodenreider.
Abstract
BACKGROUND: Data integration is a crucial task in the biomedical domain and integrating data sources is one approach to integrating data. Data elements (DEs) in particular play an important role in data integration. We combine schema- and instance-based approaches to mapping DEs to terminological resources in order to facilitate data sources integration.
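The combination of schema- and instance-based mapping mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy terminology, its `TOY:` identifiers, the function names, and the `min_share` threshold are all hypothetical, and a real run would query a resource such as the UMLS instead of an in-memory dictionary. The schema-based step looks up the name of the data element; if that fails, the instance-based step looks up its values and keeps the concept shared by most of them.

```python
from collections import Counter

# Hypothetical toy terminology (normalized term -> concept identifier).
# These identifiers are invented placeholders, not real UMLS concepts.
TOY_TERMINOLOGY = {
    "chromosome": "TOY:chromosome",
    "tnxb": "TOY:gene",
    "hfe": "TOY:gene",
    "brca1": "TOY:gene",
}


def normalize(text):
    """Lowercase, turn underscores into spaces, and collapse whitespace."""
    return " ".join(text.lower().replace("_", " ").split())


def map_by_name(de_name, terminology):
    """Schema-based step: look up the data element's own name."""
    return terminology.get(normalize(de_name))


def map_by_values(values, terminology, min_share=0.5):
    """Instance-based step: look up each value and keep the dominant concept.

    min_share is an assumed threshold: the share of all values that must
    agree on one concept before that concept is accepted.
    """
    hits = Counter(
        terminology[normalize(v)] for v in values if normalize(v) in terminology
    )
    if not hits:
        return None
    concept, count = hits.most_common(1)[0]
    return concept if count / len(values) >= min_share else None


def map_data_element(de_name, values, terminology):
    """Try the schema-based step first, then fall back to the values."""
    concept = map_by_name(de_name, terminology)
    if concept is not None:
        return concept, "schema"
    concept = map_by_values(values, terminology)
    if concept is not None:
        return concept, "instance"
    return None, "unmapped"


# Data element name and values taken from the Figure 1 caption.
result = map_data_element("Approved Symbol", ["TNXB", "HFE", "BRCA1"], TOY_TERMINOLOGY)
```

In this toy setting the name "Approved Symbol" has no entry in the terminology, so the mapping falls back to the values, all three of which resolve to the same placeholder concept. Trying the name first and the values second is a design choice of the sketch, not something stated in the abstract.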
Year: 2006 PMID: 17134479 PMCID: PMC1764450 DOI: 10.1186/1471-2105-7-S3-S6
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Example of the three Genew Web pages for the TNXB, HFE, and BRCA1 genes. Examples of data elements are encircled (Approved Symbol, Approved Name).
Figure 2. Examples of the exploitation of the values of two data elements: (a) using the UMLS as a terminological resource, (b) using heuristics.
Mapping steps of data elements in the UMLS Metathesaurus
| 139/204 | Molecular Weight (C0026385) |
| 20/23 | cellular_component (C1166607) |
| 232/333 | Genes (C0017337) |
Repartition of the data elements under UMLS semantic types
| 37 | Names (C0027365) |
| 34 | Biological process (C1184743) |
| 26 | Context (C0542559) |
| 25 | Type (C0332307) |
| 19 | Site (C0205145) |
| 17 | malignant neoplasms (C0006826) |
| 17 | Statistical sensitivity (C0036667) |
| 16 | Drugs (C0013227) |
| 14 | immune system (C0020962) |
| 14 | Disease (C0012634) |
Results of the direct mapping of data elements to the NCI caDSR
| 10 | 10 |
Correct : | |
| 22 | 285 |
Correct : | |
| 10 | 10 | Partial : | |
| 273 | 2,467 |
Not useful : | |
| 39 | 218 |
Not useful : |
PN: Preferred Name, LN: Long Name, CDEs: common data elements
Results of the indirect mapping through data element values and heuristics
| 36 (6.6%) |
| 18 (3.3%) |
| 6 (1.1%) |
| 2 (0.3%) |
| 486 (88.7%) |