Juan M Banda, Yoni Halpern, David Sontag, Nigam H Shah.
Abstract
The widespread use of electronic health records (EHRs) for clinical research has produced multiple electronic phenotyping approaches. Methods for electronic phenotyping range from those needing extensive specialized medical expert supervision to those based on semi-supervised learning techniques. We present Automated PHenotype Routine for Observational Definition, Identification, Training and Evaluation (APHRODITE), an R-package phenotyping framework that combines noisy labeling and anchor learning. APHRODITE makes these cutting-edge phenotyping approaches available for use with the Observational Health Data Sciences and Informatics (OHDSI) data model for standardized and scalable deployment. APHRODITE uses EHR data available in the OHDSI Common Data Model to build classification models for electronic phenotyping. We demonstrate the utility of APHRODITE by comparing its performance with that of traditional rule-based phenotyping approaches. Finally, the resulting phenotype models and model construction workflows built with APHRODITE can be shared between multiple OHDSI sites. Such sharing allows their application to large and diverse patient populations.
Year: 2017 PMID: 28815104 PMCID: PMC5543379
Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc
Figure 1. APHRODITE phenotype development/deployment framework schematics. Phenotype definitions are initially learned at development sites and exported for deployment. At deployment sites, users can either use the final keyword list to learn their own site-specific models or use the pre-built classifier.
Performance of noisy labeling process
| Cases | Controls | Accuracy | Recall | PPV | Cases | Controls | Accuracy | Recall | PPV | |
|---|---|---|---|---|---|---|---|---|---|---|
| Source | Myocardial Infarction (MI) | Type 2 Diabetes Mellitus (T2DM) | ||||||||
| 94 | 94 | 0.85 | 0.8 | 152 | 152 | 0.89 | 0.81 | |||
| 94 | 94 | 0.87 | 152 | 152 | 0.91 | 0.98 | 0.87 | |||
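The tables in this record report accuracy, recall, and positive predictive value (PPV). As a reminder of how these three quantities relate to a confusion matrix, here is a minimal Python sketch. It is not part of APHRODITE (which is an R package); the `ConfusionCounts` class, the function names, and the example counts are all hypothetical and are not taken from the paper's results.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from comparing predicted labels against gold-standard labels."""
    tp: int  # cases correctly flagged as cases
    fp: int  # controls incorrectly flagged as cases
    tn: int  # controls correctly flagged as controls
    fn: int  # cases incorrectly flagged as controls


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all patients labeled correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def recall(c: ConfusionCounts) -> float:
    """Fraction of true cases that were flagged as cases."""
    return c.tp / (c.tp + c.fn)


def ppv(c: ConfusionCounts) -> float:
    """Fraction of flagged cases that are true cases."""
    return c.tp / (c.tp + c.fp)


# Hypothetical counts for a balanced set of 50 cases and 50 controls
# (illustrative only, not data from the paper).
example = ConfusionCounts(tp=45, fp=15, tn=35, fn=5)
print(accuracy(example), recall(example), ppv(example))
```

With balanced cases and controls, as in the evaluation sets listed here, accuracy weights errors on cases and on controls equally.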
First 5 noisy labeling keywords
| Myocardial Infarction (MI) | Type 2 diabetes mellitus (T2DM) |
|---|---|
| Old myocardial infarction | Type 2 diabetes mellitus with hyperosmolar coma |
| True posterior myocardial infarction | Type 2 diabetes mellitus |
| Myocardial infarction with complication | Pre-existing type 2 diabetes mellitus |
| Myocardial infarction in recovery phase | Type 2 diabetes mellitus with multiple complications |
| Microinfarct of heart | Type 2 diabetes mellitus in non-obese |
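To illustrate the general idea of keyword-based noisy labeling, the following Python sketch marks a patient as a noisy-positive case when any recorded concept name matches one of the keywords. This is an assumption-laden illustration, not APHRODITE's actual procedure: the `noisy_label` function, the exact-match rule, and the patient records are hypothetical. Only the keyword strings come from the table above.

```python
def noisy_label(patient_concepts, keywords):
    """Return 1 if any recorded concept name matches a keyword, else 0.

    Matching is case-insensitive and exact; this rule is an assumption
    made for illustration.
    """
    wanted = {k.lower() for k in keywords}
    return int(any(name.lower() in wanted for name in patient_concepts))


# First five MI keywords, copied from the table above.
MI_KEYWORDS = [
    "Old myocardial infarction",
    "True posterior myocardial infarction",
    "Myocardial infarction with complication",
    "Myocardial infarction in recovery phase",
    "Microinfarct of heart",
]

# Hypothetical patient records (concept names per patient).
patients = {
    "p1": ["Old myocardial infarction", "Palpitations"],
    "p2": ["Hypercholesterolemia"],
    "p3": ["microinfarct of heart"],
}

labels = {pid: noisy_label(concepts, MI_KEYWORDS)
          for pid, concepts in patients.items()}
print(labels)
```

Labels produced this way are imperfect by construction, which is why they are called noisy; the classifier trained on them is then evaluated separately against a gold standard.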
Performance of classifiers trained with noisy labeled training data
| Cases | Cont. | Acc. | Recall | PPV | Acc. | Recall | PPV | |
|---|---|---|---|---|---|---|---|---|
| Myocardial Infarction (MI) | Type 2 Diabetes Mellitus (T2DM) | |||||||
| 750 | 750 | 0.86 | 0.89 | 0.84 | 0.88 | 0.89 | 0.87 | |
| 750 | 750 | 0.9 | 0.92 | 0.89 | 0.89 | 0.92 | 0.87 | |
| 1,500 | 1,500 | 0.9 | 0.9 | 0.91 | 0.93 | 0.88 | ||
| 10,000 | 10,000 | 0.91 | 0.91 | 0.92 | 0.89 | |||
Performance assessment of classifiers trained with noisy labeled training data using a gold-standard
| Cases | Cont. | Acc. | Recall | PPV | Cases | Cont. | Acc. | Recall | PPV | |
|---|---|---|---|---|---|---|---|---|---|---|
| Source | Myocardial Infarction (MI) | Type 2 Diabetes Mellitus (T2DM) | ||||||||
| 94 | 94 | 152 | 152 | |||||||
| 94 | 94 | 0.89 | 0.93 | 0.86 | 152 | 152 | 0.89 | 0.88 | 0.9 | |
| 94 | 94 | 0.91 | 0.93 | 0.90 | 152 | 152 | 0.91 | 0.95 | 0.88 | |
| 94 | 94 | 0.92 | 0.93 | 0.91 | 152 | 152 | 0.92 | 0.95 | 0.89 | |
| 94 | 94 | 0.92 | 0.91 | 152 | 152 | 0.93 | 0.89 | |||
Anchors suggested for the MI phenotype and T2DM phenotype
| Myocardial Infarction (MI) | Type 2 diabetes mellitus (T2DM) | ||||||
|---|---|---|---|---|---|---|---|
| Source | Importance | Rank | concept_name | Source | Importance | Rank | concept_name |
| lab | 0.5637 | 1 | Serum HDL/non-HDL cholesterol ratio measurement | ||||
| lab | 0.5466 | 2 | Glucose measurement | ||||
| obs | 1.7831 | 3 | Chest CT | obs | 0.5328 | 3 | Lipid panel |
| obs | 1.7108 | 4 | Cataract | ||||
| obs | 1.6386 | 5 | Hypercholesterolemia | obs | 0.4984 | 5 | Diabetes mellitus |
| obs | 1.4699 | 7 | Palpitations | ||||
| obs | 0.4366 | 8 | Obesity | ||||
| obs | 1.3012 | 9 | Tightness sensation quality | obs | 0.4228 | 9 | Hypoglycemia |
| obs | 0.3987 | 10 | Metformin | ||||
| obs | 1.012 | 11 | Deficiency | ||||
| drugEx | 0.8916 | 12 | ferrous sulfate | ||||
| drugEx | 0.7711 | 13 | Dobutamine | visit | 0.3541 | 13 | Obesity |
| obs | 0.7229 | 14 | Fibrovascular | lab | 0.3334 | 14 | Glucose |
| lab | 0.5542 | 15 | Sodium [Moles/volume] in Blood | lab | 0.3059 | 15 | Hemoglobin A1c |
| drugEx | 0.4819 | 16 | clopidogrel | ||||
| lab | 0.3855 | 17 | Lactic acid measurement | ||||
| obs | 0.2131 | 18 | Cholesterol | ||||
| drugEx | 0.1928 | 19 | zolpidem | drugEx | 0.1925 | 19 | Insulin, Regular, Human |
| obs | 0.0964 | 20 | Aspirin 81 MG Enteric Coated Tablet | ||||
Performance results for anchored experiments
| Cases | Cont. | Acc. | Recall | PPV | Cases | Cont. | Acc. | Recall | PPV | |
|---|---|---|---|---|---|---|---|---|---|---|
| Myocardial Infarction (MI) | Type 2 Diabetes Mellitus (T2DM) | |||||||||
| 94 | 94 | 0.89 | 0.93 | 0.86 | 152 | 152 | 0.89 | 0.99 | 0.9 | |
| 94 | 94 | 0.91 | 0.93 | 0.9 | 152 | 152 | 0.91 | 0.98 | 0.88 | |
| 94 | 94 | 0.92 | 0.97 | 0.89 | 152 | 152 | 0.92 | 0.95 | 0.9 | |
| 94 | 94 | 0.96 | 0.91 | 152 | 152 | 0.95 | 0.91 | |||
Performance results for data removal experiments
| Cases | Cont. | Acc. | Recall | PPV | Cases | Cont. | Acc. | Recall | PPV | |
|---|---|---|---|---|---|---|---|---|---|---|
| Source | Myocardial Infarction (MI) | Type 2 Diabetes Mellitus (T2DM) | ||||||||
| 94 | 94 | 0.93 | 0.9 | 152 | 152 | 0.91 | 0.98 | 0.88 | ||
| Observations removed | 94 | 94 | 0.75 | 0.84 | 0.78 | 152 | 152 | 0.67 | 0.76 | 0.71 |
| Labs removed | 94 | 94 | 0.87 | 0.85 | 0.82 | 152 | 152 | 0.69 | 0.78 | 0.72 |
| Drugs removed | 94 | 94 | 0.85 | 0.85 | 0.82 | 152 | 152 | 0.83 | 0.9 | 0.84 |
| Visits removed | 94 | 94 | 0.89 | 0.88 | 0.86 | 152 | 152 | 0.86 | 0.91 | 0.86 |
| 94 | 94 | 0.97 | 0.89 | 152 | 152 | 0.92 | 0.95 | 0.9 | ||
| Observations removed | 94 | 94 | 0.77 | 0.89 | 0.79 | 152 | 152 | 0.7 | 0.77 | 0.73 |
| Labs removed | 94 | 94 | 0.89 | 0.88 | 0.81 | 152 | 152 | 0.71 | 0.79 | 0.75 |
| Drugs removed | 94 | 94 | 0.86 | 0.87 | 0.84 | 152 | 152 | 0.86 | 0.91 | 0.85 |
| Visits removed | 94 | 94 | 0.91 | 0.9 | 0.87 | 152 | 152 | 0.88 | 0.9 | 0.84 |
| 94 | 94 | 0.96 | 0.91 | 152 | 152 | 0.93 | 0.95 | 0.91 | ||
| Observations removed | 94 | 94 | 0.76 | 0.87 | 0.77 | 152 | 152 | 0.74 | 0.8 | 0.76 |
| Labs removed | 94 | 94 | 0.87 | 0.84 | 0.86 | 152 | 152 | 0.75 | 0.83 | 0.77 |
| Drugs removed | 94 | 94 | 0.86 | 0.86 | 0.82 | 152 | 152 | 0.88 | 0.92 | 0.87 |
| Visits removed | 94 | 94 | 0.91 | 0.91 | 0.89 | 152 | 152 | 0.89 | 0.92 | 0.86 |
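The data removal experiments drop one category of features (observations, labs, drugs, or visits) at a time. A minimal Python sketch of that kind of ablation follows, assuming features are stored in a dictionary keyed by a `source:concept` string. The key format, the `remove_domain` helper, and the example values are hypothetical and do not reflect how APHRODITE stores features.

```python
def remove_domain(features, domain):
    """Return a copy of `features` without entries whose source prefix is `domain`."""
    return {
        name: value
        for name, value in features.items()
        if name.split(":", 1)[0] != domain
    }


# Hypothetical feature counts for one patient, keyed as "source:concept".
patient_features = {
    "obs:Palpitations": 1,
    "lab:Glucose measurement": 3,
    "drugEx:clopidogrel": 2,
    "visit:Outpatient": 5,
}

# Simulate the "Labs removed" condition for this patient.
without_labs = remove_domain(patient_features, "lab")
print(sorted(without_labs))
```

Retraining and re-evaluating a classifier on each reduced feature set then shows how much each data category contributes to performance.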