Jinmeng Jia1, Zhongxin An1, Yue Ming1, Yongli Guo2, Wei Li3, Yunxiang Liang1, Dongming Guo1, Xin Li1, Jun Tai2, Geng Chen1, Yaqiong Jin2, Zhimei Liu2, Xin Ni2, Tieliu Shi1.
Abstract
Rare diseases affect over a hundred million people worldwide, yet most of these patients are not accurately diagnosed or effectively treated. Limited knowledge of rare diseases is the biggest obstacle to improving their treatment. Detailed clinical phenotyping is considered a keystone of deciphering genes and realizing precision medicine for rare diseases. Here, we present a standardized system for various types of rare diseases, called the encyclopedia of Rare disease Annotations for Precision Medicine (eRAM). eRAM was built by text-mining nearly 10 million scientific publications and electronic medical records, and by integrating data from existing recognized databases (such as the Unified Medical Language System (UMLS), Human Phenotype Ontology, Orphanet, OMIM and GWAS). eRAM systematically incorporates currently available data on clinical manifestations and molecular mechanisms of rare diseases and uncovers many novel associations among diseases. eRAM provides enriched annotations for 15 942 rare diseases, yielding 6147 human disease-related phenotype terms, 31 661 mammalian phenotype terms, 10 202 symptoms from UMLS, 18 815 genes and 92 580 genotypes. eRAM can not only provide information about rare disease mechanisms but also help clinicians make accurate diagnostic and therapeutic decisions for rare diseases. eRAM can be freely accessed at http://www.unimd.org/eram/.
Year: 2018 PMID: 29106618 PMCID: PMC5753383 DOI: 10.1093/nar/gkx1062
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. (A) Overlaps among the major disease sources. Venn diagram for the major sources of disease names in eRAM. (B) Overlaps among all primary disease resources. The symmetric matrix shows the number of overlapping diseases between all pairs of primary name sources according to eRAM mapping. Colors and numerals represent the degree of overlap in disease counts. Source abbreviations: DOID – Disease Ontology (Identifier), GARD – NIH Rare Diseases. (C) (a) Overlap of diseases with phenotypic annotations (phenotypes and symptoms) between existing databases and text-mining results. (b) Statistics of diseases and D-M pairs with phenotypic annotations in existing databases and text-mined literature. (c) Distributions of disease-manifestation associations in existing databases and text-mined literature. The x-axis represents the diseases with phenotypic annotations from both existing databases and literature, while the y-axis represents the proportion of manifestations corresponding to each disease. (D) Coverage of the text-mined results. Comparison of text-mined sentences containing phenotypes. The y-axis represents the number of sentences containing disease-related phenotypes for each disease. DS, Disease-Symptom pairs; DP, Disease-Phenotype pairs. (E) Distributions of disease similarity scores. Blue bins represent phenotypes (existing in 6147 diseases), and red bins represent genes (existing in 5593 diseases).
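The symmetric overlap matrix described in panel (B) can be illustrated with a minimal sketch. This is not the eRAM pipeline: the function name `overlap_matrix`, the toy disease names, and the assumption that names have already been mapped to a shared vocabulary are all hypothetical and introduced only for illustration.

```python
from itertools import combinations_with_replacement


def overlap_matrix(sources):
    """Count shared disease names for every pair of sources.

    `sources` maps a source label to a set of disease names that are
    assumed (hypothetically) to be normalized to a common vocabulary.
    Returns a dict keyed by (source_a, source_b); the result is symmetric,
    and diagonal entries hold each source's own size.
    """
    labels = sorted(sources)
    matrix = {}
    for a, b in combinations_with_replacement(labels, 2):
        shared = len(sources[a] & sources[b])  # size of the set intersection
        matrix[(a, b)] = shared
        matrix[(b, a)] = shared  # mirror the entry to keep the matrix symmetric
    return matrix


# Made-up toy data; these are not real eRAM counts or disease names.
toy_sources = {
    "DOID": {"disease a", "disease b", "disease c"},
    "GARD": {"disease b", "disease c", "disease d"},
    "Orphanet": {"disease c", "disease e"},
}
toy_matrix = overlap_matrix(toy_sources)
```

With this toy input, `toy_matrix[("DOID", "GARD")]` and `toy_matrix[("GARD", "DOID")]` hold the same count, which is the property the colored matrix in panel (B) relies on.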