Laura K Wiley, Jeremy D Moretz, Joshua C Denny, Josh F Peterson, William S Bush.
Abstract
It is unclear to what extent best practices for phenotyping disease states from electronic medical records (EMRs) translate to phenotyping adverse drug events. Here we use statin-induced myotoxicity as a case study to identify best practices in this area. We compared multiple phenotyping algorithms using administrative codes, laboratory measurements, and full-text keyword matching to identify statin-related myopathy from EMRs. Manual review of 300 deidentified EMRs with exposure to at least one statin created a gold standard set of 124 cases and 176 controls. We tested algorithms using ICD-9 billing codes, laboratory measurements of creatine kinase (CK), and keyword searches of clinical notes and allergy lists. The combined keyword algorithms were the most accurate (PPV=86%, NPV=91%). Unlike in most disease phenotyping algorithms, the addition of ICD-9 codes or laboratory data did not appreciably increase algorithm accuracy. We conclude that phenotype algorithms for adverse drug events should consider text-based approaches.
Year: 2015 PMID: 26306287 PMCID: PMC4525276
Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc
Statin Names Used for Searches
| Generic | Trade Name |
|---|---|
| Atorvastatin | Lipitor, Caduet |
| Fluvastatin | Lescol |
| Lovastatin | Mevacor, Altocor, Altoprev |
| Pitavastatin | Livalo, Pitava |
| Pravastatin | Pravachol |
| Rosuvastatin | Crestor |
| Simvastatin | Zocor, Vytorin, Simcor |
Also: statin, statins, & hmg
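As an illustration of how a term list like this can drive a keyword search, the following minimal sketch compiles the generic names, trade names, and additional terms above into a single pattern. The function name `find_statin_mentions` is hypothetical, and whole-word, case-insensitive matching is an assumption rather than a detail reported by the authors.

```python
import re

# Terms copied from the table above (generic and trade names),
# plus the additional search terms "statin", "statins", and "hmg".
STATIN_TERMS = [
    "atorvastatin", "lipitor", "caduet",
    "fluvastatin", "lescol",
    "lovastatin", "mevacor", "altocor", "altoprev",
    "pitavastatin", "livalo", "pitava",
    "pravastatin", "pravachol",
    "rosuvastatin", "crestor",
    "simvastatin", "zocor", "vytorin", "simcor",
    "statin", "statins", "hmg",
]

# Assumption: whole-word, case-insensitive matching. Longer terms are listed
# first so the alternation tries "pitavastatin" before "pitava".
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(sorted(map(re.escape, STATIN_TERMS), key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def find_statin_mentions(text: str) -> list[str]:
    """Return the lower-cased statin terms found in text, in order of appearance."""
    return [match.group(0).lower() for match in _PATTERN.finditer(text)]
```

With whole-word matching, a note mentioning "nystatin" is not flagged, while "HMG-CoA reductase inhibitor" is flagged through the term "hmg".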
Review Population Statins (n=300)
| Statin Ever Prescribed | Review population (n=300), n (%) | Myopathy events (n=124), n (%) |
|---|---|---|
| Atorvastatin | 150 (50%) | 64 (51.6%) |
| Fluvastatin | 25 (8.3%) | 5 (4.0%) |
| Lovastatin | 42 (14.0%) | 11 (8.9%) |
| Pitavastatin | 1 (0.3%) | 0 (0%) |
| Pravastatin | 99 (33.0%) | 31 (25%) |
| Rosuvastatin | 60 (20.0%) | 19 (15.3%) |
| Simvastatin | 235 (78.3%) | 74 (59.7%) |
| Not Specified | 0 (0%) | 8 (6.5%) |
Counts include overlap (i.e., individuals on multiple statins).
Percentages in the second column are of the 124 myopathy events.
Performance of ICD9 and Creatine Kinase Algorithms
| Algorithm | n | Sensitivity (95% CI) | Specificity (95% CI) | PPV (95% CI) | NPV (95% CI) |
|---|---|---|---|---|---|
| Drug Related | 4 | 0.02 (0.01–0.07) | 0.99 (0.97–1.00) | 0.75 (0.19–0.99) | 0.59 (0.53–0.65) |
| Muscle Related | 82 | 0.47 (0.38–0.56) | 0.86 (0.80–0.91) | 0.71 (0.60–0.80) | 0.70 (0.63–0.76) |
| All Codes | 84 | 0.48 (0.39–0.57) | 0.86 (0.80–0.91) | 0.70 (0.59–0.80) | 0.70 (0.63–0.76) |
| CK > 500 IU/L | 72 | 0.32 (0.24–0.41) | 0.81 (0.75–0.87) | 0.54 (0.42–0.66) | 0.64 (0.57–0.70) |
| CK (any value), no troponin | 224 | 0.89 (0.82–0.94) | 0.35 (0.28–0.43) | 0.49 (0.42–0.55) | 0.83 (0.73–0.91) |
| CK > 500 IU/L, no troponin | 61 | 0.30 (0.22–0.38) | 0.86 (0.80–0.91) | 0.59 (0.46–0.71) | 0.64 (0.58–0.70) |
| CK (any value), single measurement | 55 | 0.30 (0.22–0.38) | 0.89 (0.84–0.93) | 0.65 (0.51–0.78) | 0.65 (0.59–0.71) |
| CK > 500 IU/L, single measurement | 15 | 0.07 (0.03–0.14) | 0.97 (0.93–0.99) | 0.60 (0.32–0.84) | 0.60 (0.54–0.66) |
PPV (Positive Predictive Value), NPV (Negative Predictive Value).
Any time.
After first statin mention.
Single measurement: the only CK measurement in a 5-day period.
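The four reported metrics all follow from a 2×2 confusion matrix against the gold standard of 124 cases and 176 controls. The sketch below recomputes the "Muscle Related" row as a worked example. It assumes that the (n) column is the number of records flagged by the algorithm, and it uses a true-positive count of 58 that is back-calculated from n = 82 and PPV = 0.71; neither is stated in the source. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute sensitivity, specificity, PPV, and NPV from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),  # flagged cases / all cases
        "specificity": tn / (tn + fp),  # unflagged controls / all controls
        "ppv": tp / (tp + fp),          # flagged cases / all flagged
        "npv": tn / (tn + fn),          # unflagged controls / all unflagged
    }


# Gold standard sizes from the abstract.
CASES, CONTROLS = 124, 176

# "Muscle Related" ICD9 row: n = 82, assumed to be the number of records flagged.
flagged = 82
tp = 58                # assumption: back-calculated from n and PPV, not reported
fp = flagged - tp      # flagged controls
fn = CASES - tp        # cases the algorithm missed
tn = CONTROLS - fp     # controls correctly left unflagged

metrics = diagnostic_metrics(tp, fp, fn, tn)
rounded = {name: round(value, 2) for name, value in metrics.items()}
print(rounded)
```

Under these assumptions the rounded values agree with the table row (sensitivity 0.47, specificity 0.86, PPV 0.71, NPV 0.70), which supports reading (n) as the number of flagged records.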
Performance of Allergy and Keyword Algorithms
| Algorithm | Source | Version | n | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|---|---|
| Allergy | Problem List | Original | 75 | 0.58 (0.49–0.67) | 0.98 (0.94–0.99) | 0.95 (0.87–0.99) | 0.77 (0.71–0.83) |
| Allergy | Problem List | Corrected | 71 | 0.57 (0.48–0.66) | 1.00 (0.97–1.00) | 1.00 (0.93–1.00) | 0.77 (0.71–0.82) |
| Allergy | High Value Documents | Original | 86 | 0.62 (0.53–0.71) | 0.94 (0.90–0.97) | 0.88 (0.80–0.94) | 0.79 (0.72–0.84) |
| Allergy | High Value Documents | Corrected | 78 | 0.61 (0.52–0.70) | 0.99 (0.96–1.00) | 0.97 (0.91–1.00) | 0.78 (0.72–0.84) |
| Allergy | Combined | Original | 89 | 0.64 (0.55–0.72) | 0.94 (0.89–0.97) | 0.88 (0.79–0.94) | 0.79 (0.73–0.84) |
| Allergy | Combined | Corrected | 80 | 0.63 (0.54–0.71) | 0.99 (0.96–1.00) | 0.97 (0.91–1.00) | 0.79 (0.73–0.84) |
| Keyword | Clinical Comm. | Original | 65 | 0.45 (0.36–0.54) | 0.94 (0.90–0.97) | 0.85 (0.74–0.92) | 0.71 (0.65–0.77) |
| Keyword | Clinical Comm. | Corrected | 60 | 0.46 (0.37–0.55) | 0.98 (0.95–1.00) | 0.98 (0.95–1.00) | 0.72 (0.66–0.78) |
| Keyword | High Value Documents | Original | 104 | 0.68 (0.59–0.76) | 0.88 (0.83–0.93) | 0.80 (0.71–0.87) | 0.80 (0.74–0.85) |
| Keyword | High Value Documents | Corrected | 108 | 0.75 (0.66–0.82) | 0.91 (0.86–0.95) | 0.86 (0.78–0.92) | 0.84 (0.78–0.89) |
| Keyword | Combined | Original | 122 | 0.78 (0.69–0.85) | 0.85 (0.79–0.90) | 0.78 (0.69–0.85) | 0.85 (0.79–0.90) |
| Keyword | Combined | Corrected | 117 | 0.81 (0.73–0.87) | 0.90 (0.85–0.94) | 0.85 (0.78–0.91) | 0.87 (0.81–0.91) |
PPV (Positive Predictive Value), NPV (Negative Predictive Value);
Original: algorithm built on training dataset. Corrected: revised algorithm with text extracted from the gold standard set.
Figure 1. Receiver Operator Characteristic Graph Comparison of Phenotyping Algorithms. Comparison of true positive rate (TPR) vs. false positive rate (FPR). Points in the upper left are better classifiers.
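Each algorithm contributes one point to such a graph, with TPR equal to sensitivity and FPR equal to 1 minus specificity. The sketch below converts a few sensitivity/specificity pairs copied from the performance tables into ROC coordinates and ranks them by Euclidean distance to the ideal corner (FPR = 0, TPR = 1). The distance ranking is an illustrative assumption about how "upper left" could be quantified, not a method described by the authors, and the dictionary labels are abbreviations chosen here.

```python
import math

# (sensitivity, specificity) point estimates copied from the performance tables.
ALGORITHMS = {
    "ICD9 all codes": (0.48, 0.86),
    "CK > 500 IU/L": (0.32, 0.81),
    "CK any value, no troponin": (0.89, 0.35),
    # Final row of the allergy and keyword table (combined, corrected).
    "Combined keyword (final row)": (0.81, 0.90),
}


def roc_point(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (FPR, TPR) for one algorithm."""
    return (1.0 - specificity, sensitivity)


def distance_to_ideal(fpr: float, tpr: float) -> float:
    """Euclidean distance from an ROC point to the ideal classifier at (0, 1)."""
    return math.hypot(fpr, 1.0 - tpr)


# Rank algorithms from closest to the upper-left corner to farthest.
ranked = sorted(
    ALGORITHMS,
    key=lambda name: distance_to_ideal(*roc_point(*ALGORITHMS[name])),
)

for name in ranked:
    fpr, tpr = roc_point(*ALGORITHMS[name])
    print(f"{name}: FPR={fpr:.2f}, TPR={tpr:.2f}")
```

On these four points the combined keyword row lies closest to the upper-left corner, in line with the abstract's conclusion that the combined keyword algorithms were the most accurate.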