Anaïs Mottaz, Yum L Yip, Patrick Ruch, Anne-Lise Veuthey.
Abstract
BACKGROUND: Although the UniProt KnowledgeBase is not a medically oriented database, it contains information on more than 2,000 human proteins involved in pathologies. However, these annotations are not standardized, which impairs interoperability between biological and clinical resources. To make these data easily accessible to clinical researchers, we have developed a procedure to link diseases described in UniProtKB/Swiss-Prot entries to the MeSH disease terminology.
Year: 2008 PMID: 18460185 PMCID: PMC2367626 DOI: 10.1186/1471-2105-9-S5-S3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Procedure for mapping UniProtKB/Swiss-Prot disease comment lines to MeSH terms.
Figure 2. Disease comment lines in a UniProtKB/Swiss-Prot entry.
Evaluation of the mapping of 200 UniProtKB/Swiss-Prot disease lines (173 with a reference to OMIM)
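| Source | Exact: mapped | Exact: correct | Exact: precision | Partial: mapped | Partial: correct | Partial: precision | Total: mapped | Total: correct | Total: precision |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SP | 35 (17.5%) | 35 (17.5%) | 100.0% | 91 (45.5%) | 73 (36.5%) | 80.0% | 126 (63%) | 108 (54%) | 86.0% |
| OMIM | 43 (21.5%) | 40 (20%) | 93.0% | 84 (42%) | 68 (34%) | 81.0% | 127 (63.5%) | 108 (54%) | 85.0% |
| SP ∩ OMIM | 23 (11.5%) | 23 (11.5%) | 100.0% | 58 (29%) | 51 (25.5%) | 88.0% | 93 (46.5%) | 86 (43%) | 92.5% |
| SP ∪ OMIM | 54 (27%) | 52 (26%) | 96.5% | 95 (47.5%) | 76 (38%) | 80.0% | 149 (74.5%) | 128 (64%) | 86.0% |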
SP: UniProtKB/Swiss-Prot
SP ∩ OMIM: both mappings correspond to the same MeSH descriptor.
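The precision figures in the evaluation table can be re-derived from the counts. The sketch below is illustrative only: it assumes that each triple of cells in a row reads as (lines mapped, mappings judged correct, precision), an interpretation inferred from the arithmetic rather than stated in the record. The function name `precision` is hypothetical.

```python
def precision(mapped: int, correct: int) -> float:
    """Percentage of mapped disease lines whose MeSH mapping was judged correct."""
    if mapped == 0:
        raise ValueError("precision is undefined when nothing was mapped")
    return 100.0 * correct / mapped


# Counts taken from the first row of the evaluation table,
# read as (mapped, correct) per match type (assumed column meaning).
exact = (35, 35)
partial = (91, 73)
total = (exact[0] + partial[0], exact[1] + partial[1])

for label, (mapped, correct) in [("exact", exact), ("partial", partial), ("total", total)]:
    print(f"{label}: {precision(mapped, correct):.1f}%")
```

Under this reading the recomputed values are 100.0%, 80.2% and 85.7%, which agree with the tabulated 100.0%, 80.0% and 86.0% up to rounding.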
Figure 3. Recall-precision curves for partial matches of Swiss-Prot disease names (A) and OMIM titles and alternative titles (B) to the disease MeSH terms, with term normalisation (blue squares), without normalisation (green empty squares), and with the method developed by Ha-Thuc (red triangles). The data are ordered by score, and precision is calculated at increasing recall intervals.
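The construction described in the Figure 3 caption (order candidate mappings by score, then compute precision at increasing recall) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name, the input format, and the toy data are all hypothetical.

```python
from typing import List, Tuple


def recall_precision_points(
    scored: List[Tuple[float, bool]], n_relevant: int
) -> List[Tuple[float, float]]:
    """Return (recall, precision) after each candidate, best score first.

    scored: (matching score, whether the mapping was judged correct).
    n_relevant: number of correct mappings that could be found in total.
    """
    if n_relevant <= 0:
        raise ValueError("n_relevant must be positive")
    ranked = sorted(scored, key=lambda item: item[0], reverse=True)
    points = []
    correct_so_far = 0
    for rank, (_score, is_correct) in enumerate(ranked, start=1):
        if is_correct:
            correct_so_far += 1
        points.append((correct_so_far / n_relevant, correct_so_far / rank))
    return points


# Hypothetical toy data, for illustration only.
toy = [(0.9, True), (0.4, False), (0.8, True), (0.6, False), (0.5, True)]
points = recall_precision_points(toy, n_relevant=4)
```

Each point pairs the recall reached so far with the precision among the candidates accepted up to that score, so lowering the score threshold moves along the curve from left to right.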
Figure 4. F-measure as a function of the score of partial matching to MeSH terms, for Swiss-Prot disease names (blue triangles) or OMIM terms (red squares).
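The Figure 4 caption does not say which F-measure variant is plotted. Assuming the usual balanced F-measure (the harmonic mean of precision and recall), a score cut-off could be chosen as sketched below. The threshold values and the helper name `f_measure` are hypothetical and are not taken from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical (score threshold, precision, recall) triples, for illustration only.
toy_curve = [(0.2, 0.60, 0.90), (0.5, 0.80, 0.70), (0.8, 0.95, 0.30)]

# Pick the threshold whose precision/recall pair maximises the F-measure.
best_threshold = max(toy_curve, key=lambda row: f_measure(row[1], row[2]))[0]
```

Plotting the F-measure against the score in this way shows where the trade-off between precision and recall is best balanced.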
Mapping on MeSH of the 3408 UniProtKB/Swiss-Prot disease lines (2601 with a corresponding OMIM entry)
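| Source | Exact matches | Partial matches | Total |
| --- | --- | --- | --- |
| SP | 637 (18.7%) | 1332 (39%) | 1969 (57.8%) |
| OMIM | 745 (21.9%) | 1063 (31.2%) | 1808 (53.1%) |
| SP ∩ OMIM | 397 (11.6%) | 645 (18.9%) | 1289 (37.8%) |
| SP ∪ OMIM | 968 (28.4%) | 1362 (40%) | 2330 (68.4%) |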
SP: UniProtKB/Swiss-Prot
SP ∩ OMIM: both mappings correspond to the same MeSH descriptor.