Feichen Shen, Yiqing Zhao, Liwei Wang, Majid Rastegar Mojarad, Yanshan Wang, Sijia Liu, Hongfang Liu.
Abstract
BACKGROUND: Existing resources to assist the diagnosis of rare diseases are usually curated from the literature, which can limit their clinical use. Substantial effort is often required before a rare disease is even suspected and those resources are consulted. The primary goal of this study was to apply a data-driven approach to enrich existing rare disease resources by mining phenotype-disease associations from electronic medical records (EMRs).
Keywords: Data-driven approach; Differential diagnosis; Knowledge enrichment; Rare disease
Year: 2019 PMID: 30764825 PMCID: PMC6376651 DOI: 10.1186/s12911-019-0752-9
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1 System workflow
Confusion matrix for performance evaluation
| | Differential diagnosis candidates in the eRAM gold standard | Differential diagnosis candidates not in the eRAM gold standard |
|---|---|---|
| Differential diagnosis candidates generated by each graph | True Positive (TP) | False Positive (FP) |
| Differential diagnosis candidates not generated by each graph | False Negative (FN) | True Negative (TN) |
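
The four cells of the confusion matrix above can be counted with plain set operations. The following is a minimal sketch, not the authors' implementation: the function name `confusion_counts`, the `universe` argument (all diseases that could be suggested), and the toy disease labels are hypothetical.

```python
def confusion_counts(generated, gold, universe):
    """Count TP/FP/FN/TN for one graph's candidate list.

    generated: candidates produced by the graph (hypothetical input)
    gold:      candidates listed in the gold standard
    universe:  every disease that could have been suggested (assumption)
    """
    generated, gold, universe = set(generated), set(gold), set(universe)
    return {
        "TP": len(generated & gold),             # generated and in gold standard
        "FP": len(generated - gold),             # generated but not in gold standard
        "FN": len(gold - generated),             # in gold standard but not generated
        "TN": len(universe - generated - gold),  # neither generated nor in gold standard
    }


# Toy example with placeholder disease labels.
example_counts = confusion_counts(
    generated={"disease_a", "disease_b"},
    gold={"disease_b", "disease_c"},
    universe={"disease_a", "disease_b", "disease_c", "disease_d", "disease_e"},
)
print(example_counts)
```

The true-negative count depends entirely on how the universe of possible candidates is defined, which this section does not specify.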
Fig. 2 Plotted curve between association ranking and two metrics
Fig. 3 Characterization of associations
Statistics between EMR and HPO-Orphanet on the number of rare diseases, phenotypes, and phenotype-disease associations
| | Number of Unique Rare Diseases | Number of Unique Phenotypes | Number of Unique Associations |
|---|---|---|---|
| HPO-Orphanet | 2664 | 4577 | 7529 |
| EMR | 476 | 1337 | 1973 |
| HPO-Orphanet and EMR | 97 | 1013 | 198 |
| In EMR but not in HPO-Orphanet | 379 | 324 | 1775 |
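
The four counts reported per category above correspond to two set sizes, their intersection, and a set difference. A minimal sketch, assuming each resource is available as a set of identifiers; the function name `overlap_stats` is hypothetical, and the final loop only checks the arithmetic of the reported counts.

```python
def overlap_stats(emr, reference):
    """Summarize the overlap between EMR-derived items and a reference resource."""
    emr, reference = set(emr), set(reference)
    return {
        "reference": len(reference),
        "emr": len(emr),
        "shared": len(emr & reference),
        "emr_only": len(emr - reference),
    }


# Consistency check on the reported counts: EMR total = shared + EMR-only.
reported = {
    "rare diseases": (476, 97, 379),
    "phenotypes": (1337, 1013, 324),
    "associations": (1973, 198, 1775),
}
for name, (emr_total, shared, emr_only) in reported.items():
    assert emr_total == shared + emr_only, name
```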
Graph characterization for bipartite graphs generated from the HPO-Orphanet, EMR, and HPO-Orphanet+ (based on 97 shared diseases)
| | HPO-Orphanet Graph | EMR Graph | HPO-Orphanet+ Graph |
|---|---|---|---|
| # of Disease Nodes | 97 | 97 | 97 |
| # of Phenotype Nodes | 722 | 670 | 1194 |
| # of Edges | 1973 | 2071 | 3914 |
| Density | 0.006 | 0.007 | 0.005 |
| Average Degree | 4.818 | 5.4 | 6.064 |
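
The density and average degree rows can be recomputed from the node and edge counts. This section does not state which formulas were used; the sketch below assumes the standard definitions for an undirected simple graph, density = 2E / (N(N - 1)) and average degree = 2E / N with N the total number of nodes, which reproduce the reported values after rounding. The function name `graph_stats` is hypothetical.

```python
def graph_stats(n_disease, n_phenotype, n_edges):
    """Return (density, average degree), rounded to 3 decimals.

    Assumes the general undirected-graph formulas, not a bipartite-specific
    density (which would divide by n_disease * n_phenotype instead).
    """
    n = n_disease + n_phenotype
    density = 2 * n_edges / (n * (n - 1))
    avg_degree = 2 * n_edges / n
    return round(density, 3), round(avg_degree, 3)


print(graph_stats(97, 722, 1973))   # HPO-Orphanet graph
print(graph_stats(97, 670, 2071))   # EMR graph
print(graph_stats(97, 1194, 3914))  # HPO-Orphanet+ graph
```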
Top 15 diseases with the highest degree in bipartite graphs generated from the HPO-Orphanet, EMR, and HPO-Orphanet+
| HPO-Orphanet Graph | EMR Graph | HPO-Orphanet+ Graph |
|---|---|---|
| 22q11.2 deletion syndrome | multiple myeloma | multiple myeloma |
| melas | hodgkin lymphoma | hodgkin lymphoma |
| granulomatosis with polyangiitis | follicular lymphoma | giant cell arteritis |
| marfan syndrome | giant cell arteritis | follicular lymphoma |
| neurofibromatosis type 1 | primary sclerosing cholangitis | primary sclerosing cholangitis |
| trisomy 18 | myasthenia gravis | 22q11.2 deletion syndrome |
| eosinophilic granulomatosis with polyangiitis | granulomatosis with polyangiitis | granulomatosis with polyangiitis |
| giant cell arteritis | pulmonary arterial hypertension | melas |
| acromegaly | liposarcoma | myasthenia gravis |
| primary sclerosing cholangitis | eosinophilic esophagitis | rheumatic fever |
| systemic sclerosis | rheumatic fever | marfan syndrome |
| dermatomyositis | klatskin tumor | dermatomyositis |
| osteogenesis imperfecta | tetralogy of fallot | pulmonary arterial hypertension |
| addison disease | cystic fibrosis | craniopharyngioma |
| cushing syndrome | craniopharyngioma | neurofibromatosis type 1 |
Fig. 4 Comparison on differential diagnostic suggestion performance for Hodgkin Lymphoma
Fig. 5 Interactive web-based tool for differential diagnostic suggestion (CD stands for Common Disease, and RD stands for Rare Disease)
Top 15 differential diagnostic candidates for Hodgkin lymphoma from the HPO-Orphanet+ graph, Phenomizer, and eRAM. Scores in columns 1 and 3 are Jaccard similarities; scores in column 2 are the IC-based scores calculated by Phenomizer (CD stands for common disease, and RD stands for rare disease)
| HPO-Orphanet+ Graph | Phenomizer | eRAM |
|---|---|---|
| B-cell lymphoma (RD): 0.626 | Classic hodgkin lymphoma (RD): 3.986 | Nodular lymphocyte predominant hodgkin lymphoma (RD): 0.458 |
| Diffuse large b-cell lymphoma (RD): 0.62 | Behcet syndrome (RD): 3.189 | Schnitzler syndrome (RD): 0.273 |
| Chronic Obstructive Airway Disease (CD): 0.595 | Aggressive systemic mastocytosis (RD): 3.176 | Mantle cell lymphoma (RD): 0.25 |
| Dilated cardiomyopathy (RD): 0.594 | Alveolar echinococcosis (RD): 3.085 | Pulmonary blastoma (RD): 0.25 |
| Abdominal aortic aneurysm (RD): 0.592 | Systemic lupus erythematosus (RD): 2.997 | Aggressive systemic mastocytosis (RD): 0.22 |
| Glomerulonephritis (RD): 0.591 | Legionellosis (RD): 2.878 | Anemia, autoimmune hemolytic (RD): 0.219 |
| Diabetes Mellitus, Non-Insulin-Dependent (CD): 0.588 | Takayasu arteritis (RD): 2.731 | Hughes syndrome (RD): 0.219 |
| Multiple myeloma (RD): 0.588 | Cystic echinococcosis (RD): 2.648 | Follicular lymphoma (RD): 0.214 |
| Atrial Fibrillation (CD): 0.585 | Eosinophilic granuloma (RD): 2.647 | Thymic carcinoma (RD): 0.214 |
| Glaucoma (CD): 0.58 | Whipple disease (RD): 2.638 | Mast cell sarcoma (RD): 0.2 |
| Myeloid leukemia (RD): 0.58 | Familial thrombocytosis (RD): 2.632 | American trypanosomiasis (CD): 0.2 |
| Coronary heart disease (CD): 0.58 | Systemic mastocytosis (RD): 2.622 | Alpha-heavy chain disease (RD): 0.194 |
| Degenerative polyarthritis (CD): 0.573 | Emberger syndrome (RD): 2.549 | Klatskin tumor (RD): 0.192 |
| Lung adenocarcinoma (RD): 0.572 | Hypocomplementemic urticarial vasculitis (RD): 2.548 | Legionellosis (RD): 0.189 |
| Chronic Kidney Insufficiency (CD): 0.571 | Babesiosis (RD): 2.499 | Babesiasis (RD): 0.182 |
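
The Jaccard similarity used for columns 1 and 3 is the size of the intersection of two sets divided by the size of their union. This section does not say which sets were compared, so the sketch below assumes each disease is represented by its set of associated phenotypes; the graph contents, the disease labels, and the function names `jaccard` and `rank_candidates` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 0.0 for two empty sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_candidates(query, graph, top_k=15):
    """Rank all other diseases by phenotype-set similarity to the query disease."""
    query_phenotypes = graph[query]
    scores = [
        (disease, jaccard(query_phenotypes, phenotypes))
        for disease, phenotypes in graph.items()
        if disease != query
    ]
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-item[1], item[0]))[:top_k]


# Toy disease -> phenotype-set mapping (placeholder data, not from the study).
toy_graph = {
    "disease_x": {"fever", "fatigue", "weight loss"},
    "disease_y": {"fever", "fatigue"},
    "disease_z": {"rash"},
}
ranking = rank_candidates("disease_x", toy_graph)
print(ranking)
```

With the toy data, `disease_y` shares two of three phenotypes with the query and ranks first, while `disease_z` shares none and scores zero.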