
Creation and Validation of an EMR-based Algorithm for Identifying Major Adverse Cardiac Events while on Statins.

Wei-Qi Wei1, Qiping Feng2, Peter Weeke2, William Bush3, Magarya S Waitara2, Otito F Iwuchukwu2, Dan M Roden4, Russell A Wilke5, Charles M Stein2, Joshua C Denny1.   

Abstract

Statin medications are often prescribed to reduce a patient's risk of cardiovascular events, due in part to cholesterol reduction. We developed and evaluated an algorithm that accurately identifies subjects with major adverse cardiac events (MACE) while on statins using electronic medical record (EMR) data. The algorithm also identifies subjects experiencing their first MACE while on statins for primary prevention. The algorithm achieved positive predictive values (PPVs) of 90% to 97% in identifying MACE cases when compared against physician review. By applying the algorithm to EMR data in BioVU, we identified cases and controls and used them to replicate known associations with eight genetic variants. We replicated 6/8 previously reported genetic associations with cardiovascular diseases or lipid metabolism disorders. Our results demonstrate that the algorithm can be used to accurately identify subjects with MACE and MACE while on statins. Consequently, future studies can be conducted to investigate and validate the relationship between statins and MACE using real-world clinical data.


Year:  2014        PMID: 25717410      PMCID: PMC4333709     

Source DB:  PubMed          Journal:  AMIA Jt Summits Transl Sci Proc


Introduction

Cardiovascular disease (CVD) is the leading cause of death worldwide. Recent mortality data show that CVD accounted for 32.8% of all deaths in the U.S.1 Many randomized clinical trials (RCTs) have shown that HMG-CoA reductase inhibitors (“statins”) significantly reduce the frequency of major adverse cardiac events (MACE) in patients at risk.2–7 Statins are among the most commonly prescribed medications and are generally well tolerated.8 Given their clinical importance, they have been a frequent focus of investigation in electronic medical records (EMRs). We sought to develop a highly accurate algorithm to enable the study of statin efficacy, measured as MACE while on statins, in EMRs. This algorithm can be used for later clinical and genomic studies. Since 2000, EMRs have been widely implemented throughout the U.S.9 The deployment of EMRs not only improves patient care but also generates large clinical practice-based datasets well suited to evaluating previous findings from RCTs.10–13 Although useful for research, EMR data often require carefully constructed algorithms to accurately identify phenotypes for clinical and genomic study10,14–16; this is especially true for pharmacogenomic studies in the EMR, since they require knowledge of the temporal relationship between exposures and outcomes. Once accurate algorithms are available, studies can be conducted to investigate relevant relationships, e.g., between statins and MACE, using real-world clinical data.

Background

MACE can be defined as cardiac death, nonfatal acute myocardial infarction (AMI), or target lesion revascularization. Several investigators have previously explored the possibility of identifying MACE subjects using EMR data. In 1996, Pladevall et al. reported that the accuracy of using the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) code 410 to identify definite MI was 92%.17 Similarly, Petersen et al. in 1999 found that the positive predictive value (PPV) of AMI codes in the primary position was 96%. They also reported that the sensitivity and specificity of Current Procedural Terminology (CPT) coding were, respectively, 96% and 99% for coronary catheterization, 95% and 100% for coronary artery bypass graft surgery, and 90% and 99% for percutaneous transluminal coronary angioplasty.18 In 2002, Austin et al. examined the use of a discharge diagnosis of AMI and found a PPV of 88%.19 In 2004, Kiyota et al. additionally required a hospitalization lasting at least 3 days, which improved the PPV to 94%.20 Two more recent studies, by Varas-Lorenzo et al. in 200821 and by Preciosa et al. in 201322, reported that ICD-9-CM codes had PPVs of 95% and 96%, respectively. Overall, these results suggest that ICD-9-CM codes have been widely used for MACE subject identification and yield PPVs of roughly 88% to 96%.23

However, all of these studies were performed on primary/secondary discharge codes only (thus representing inpatient-generated codes, which are typically assigned by professional coders). Such information is not available in many de-identified EMR datasets, i.e., it may not be clear whether a code is the principal or discharge diagnosis. Thus, the approach of simply using ICD-9-CM codes may not generalize to a broad clinical research setting. Another important issue is identifying first MACE events. Recognizing such events allows researchers to evaluate the effectiveness of a treatment for either primary or secondary prevention of MACE, and therefore has a foreseeable and significant impact on clinical practice.

Recent studies have begun using EMR data for pharmacological studies. Drug response phenotypes can be challenging to identify accurately, as they require the presence of a medication at the time of an event.24 In a recent paper, we described our methods for extracting information and constructing full dose-response curves for simvastatin and atorvastatin using EMR data.25 Advanced techniques, e.g., natural language processing (NLP) and ontologies, were used to retrieve medication and laboratory data from structured and unstructured EMRs. Other examples of pharmacological studies include pharmacogenetic studies of clopidogrel and CYP2C19 variants, in which manual review was ultimately required to achieve an acceptable PPV,26 and a study of the effect of common variants on warfarin stable-dose international normalized ratios (INRs), which was performed entirely using informatics techniques.27 Other clinical studies have used NLP, sometimes with laboratory data, to replicate known adverse drug events and suggest some others, though formal assessments of the PPV of each drug-event pair were not provided.28,29

In this manuscript, we introduce an algorithm to identify subjects with MACE while on statins from EMRs. We report its performance compared to manual chart review and in a genetic validation study. Compared to other efforts, our algorithm uses all diagnosis codes as well as laboratory data and simple NLP, instead of just primary discharge codes; it also assesses concurrent statin use and includes a determination of first MACE.

Methods

MACE Algorithm development

We used commonly captured EMR data, including ICD-9-CM codes, CPT codes, and laboratory test results, to develop an approach to identify MACE. We used all diagnosis codes rather than primary discharge codes alone so that our approach would be widely generalizable. We categorized a MACE event as either AMI or revascularization. Qualifying cases of AMI while on statins were required to have ≥2 AMI-relevant ICD-9-CM codes (410.* or 411.*) within a 5-day window and an abnormal laboratory test (Table 1). An abnormal laboratory test was defined as either troponin ≥0.10 ng/mL or both a creatine kinase (CK) MB fraction to CK ratio ≥3.0 and CK-MB ≥10.0 ng/mL. We chose slightly higher laboratory thresholds than usual to ensure the accuracy of the algorithm. In addition, a statin must have been prescribed at least 180 days before the AMI event (Figure 1). The duration of 180 days was chosen empirically to represent a time course over which a patient would have significant statin exposure before the event and to make it easier to ascertain whether the patient had remained on the medication. Statins were simvastatin (Zocor), fluvastatin (Lescol, Canef, Vastin), atorvastatin (Lipitor), pravastatin (Pravachol, Selektine), lovastatin (Mevacor), cerivastatin (Baycol, Lipobay), or rosuvastatin (Crestor). Medications were identified using records from electronic prescribing tools and by processing free-text notes with MedEx.30
Table 1. Algorithm for identifying subjects with MACE while on statins.

AMI on statin
• ≥ 2 AMI codes (410.* or 411.*) within a 5-day window
• Abnormal lab within the same time window, defined by
  ○ Troponin-I ≥ 0.10 ng/mL,
  ○ or Troponin-T ≥ 0.10 ng/mL,
  ○ or CK-MB/CK ratio ≥ 3.0 and CK-MB ≥ 10.0 ng/mL
• Statin prescribed prior to the AMI event ≥180 days

1st AMI on statin
• AMI on statin
• No AMI codes (410 – 412) assigned before the AMI event
• No MACE history defined by NLP

Revascularization while on statin
• Any CPT code for angioplasty, stent, or CABG
• Statin prescribed prior to the procedure ≥180 days

1st Revascularization while on statin
• Revascularization while on statin
• No revascularization codes assigned before the AMI event
• No MACE history defined by NLP

Figure 1.

Overview of algorithm for determining AMI on statins
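
The AMI-on-statin criteria in Table 1 can be sketched in code as follows. This is a minimal illustration, not the authors' implementation: the `Lab` record, the function names, and the input layout (dated ICD-9-CM codes, dated laboratory results, and the date of the first statin prescription) are assumptions made for the example. The sketch also assumes that the 5-day window runs from an AMI code through five days later, and it does not attempt to verify that the patient remained on the statin.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional, Tuple


@dataclass
class Lab:
    """One dated laboratory result (hypothetical record layout)."""
    day: date
    troponin: Optional[float] = None     # ng/mL, troponin-I or troponin-T
    ck_mb: Optional[float] = None        # ng/mL
    ck_mb_ratio: Optional[float] = None  # CK-MB/CK ratio, on the scale of the 3.0 cutoff


def is_ami_code(code: str) -> bool:
    """AMI-relevant ICD-9-CM codes: 410.* or 411.*."""
    return code.startswith("410") or code.startswith("411")


def lab_is_abnormal(lab: Lab) -> bool:
    """Troponin >= 0.10, or both CK-MB/CK ratio >= 3.0 and CK-MB >= 10.0."""
    if lab.troponin is not None and lab.troponin >= 0.10:
        return True
    return (
        lab.ck_mb is not None
        and lab.ck_mb_ratio is not None
        and lab.ck_mb >= 10.0
        and lab.ck_mb_ratio >= 3.0
    )


def find_ami_on_statin(
    dx: List[Tuple[date, str]],
    labs: List[Lab],
    first_statin: Optional[date],
    window_days: int = 5,
    min_exposure_days: int = 180,
) -> Optional[date]:
    """Return the date of the earliest qualifying AMI event, or None."""
    ami_days = sorted(day for day, code in dx if is_ami_code(code))
    for i, start in enumerate(ami_days):
        end = start + timedelta(days=window_days)
        # Criterion 1: at least two AMI codes inside the window.
        if sum(1 for day in ami_days[i:] if day <= end) < 2:
            continue
        # Criterion 2: an abnormal lab inside the same window.
        if not any(start <= lab.day <= end and lab_is_abnormal(lab) for lab in labs):
            continue
        # Criterion 3: statin prescribed at least 180 days before the event.
        if first_statin is None or (start - first_statin).days < min_exposure_days:
            continue
        return start
    return None
```

Counting code instances rather than distinct encounter days is a design choice of this sketch; a stricter variant could require the two codes to fall on different dates.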

Among qualified subjects with an AMI while on statins, we identified individuals with 1st AMI events as those with no AMI codes (410 – 412) prior to the qualifying statin exposure-AMI event and with no other MACE history found by applying NLP to previous notes. We used the KnowledgeMap Concept Indexer (KMCI)31,32, a general-purpose NLP engine, to parse a patient’s notes. Any non-negated occurrence of the keywords AMI, MI, acute myocardial infarction, myocardial infarction, CABG, coronary artery bypass, cypher, taxus, BMS, DES, or stent was considered an indication of positive MACE history, and the patient was excluded from the first-event group.

Revascularization includes percutaneous coronary intervention (PCI) and coronary artery bypass grafting (CABG). To qualify as a subject with revascularization while on statins, a patient must have a revascularization CPT code, and a statin must have been prescribed at least 180 days before the procedure (Table 1). The CPT codes that we used included coronary artery bypass (33533-33536, 33510-33523), angioplasty (92980-92982, 92984, 92995, 92996), and stent (C1874-C1877). Individuals with 1st revascularizations while on statins were those who met the above criteria and had no revascularization CPT codes and no revascularization history found by NLP prior to the MACE on statin event.

We similarly developed an algorithm to identify control subjects without MACE while on statins. We excluded patients with any AMI diagnosis codes or revascularization CPT codes, as well as patients with a previous history of AMI or revascularization found by NLP. We also required controls to have statin exposure in their EMRs similar to that of the matched cases.
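
The revascularization code lists above can be turned into a small lookup, sketched here under stated assumptions: the hyphenated ranges are treated as inclusive and expanded to every code in between, and the function names and inputs are hypothetical. The free-text history check is represented only by a boolean flag, since the KMCI step itself is not reproduced.

```python
from datetime import date
from typing import Iterable, Optional

# Code lists taken from the text; ranges are assumed to be inclusive.
_BYPASS_RANGES = [(33533, 33536), (33510, 33523)]
_ANGIOPLASTY = {92980, 92981, 92982, 92984, 92995, 92996}
_STENT = {"C1874", "C1875", "C1876", "C1877"}


def revascularization_type(code: str) -> Optional[str]:
    """Map a procedure code to its revascularization category, or None."""
    code = code.strip().upper()
    if code in _STENT:
        return "stent"
    if len(code) == 5 and code.isdecimal():
        number = int(code)
        if any(low <= number <= high for low, high in _BYPASS_RANGES):
            return "coronary artery bypass"
        if number in _ANGIOPLASTY:
            return "angioplasty"
    return None


def is_first_revascularization(
    event_day: date,
    revascularization_days: Iterable[date],
    history_found_in_notes: bool,
) -> bool:
    """First event: no earlier revascularization code and no history in notes."""
    if history_found_in_notes:
        return False
    return not any(day < event_day for day in revascularization_days)
```

For example, `revascularization_type("92984")` falls in the angioplasty list, while `revascularization_type("92983")` is not in any list and yields `None`.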

Manual chart review

We applied the algorithm to BioVU individuals at Vanderbilt University Medical Center (VUMC)33 to identify possible cases. In brief, BioVU links a de-identified image of the Vanderbilt EMR to DNA extracted from blood samples (obtained during routine clinical care and about to be discarded). Each record and its associated DNA sample are linked by a unique identifier generated by a one-way hash function. The resource has been considered as containing data for nonhuman subjects in accordance with the provisions of Title 45 of the Code of Federal Regulations part 46, as have the individual research studies utilizing the resource.33 As of 09/2013, BioVU contained > 170,000 unique individuals, including their dense longitudinal clinical records and associated blood samples. From each category (AMI on statin, 1st AMI on statin, revascularization on statin, and 1st revascularization on statin), a group of 30 randomly selected cases was manually reviewed by one of two physicians. AMI on statin and 1st AMI on statin cases were reviewed by JCD, an internist. Revascularization on statin and 1st revascularization on statin cases were reviewed by PW, a cardiologist.

Genetic validation

To further illustrate the application of our algorithm, we performed a genotype-phenotype association study, also leveraging BioVU resources. The study population consisted of the first 7,747 European Americans accrued into BioVU. The only selection criteria were that they met the general conditions for eligibility for BioVU; no clinical inclusion or exclusion criteria were applied. These subjects had already been genotyped in previous studies.34 In the current analysis, we identified 533 MACE cases and 2,642 MACE-free controls and compared, between cases and controls, the frequency of eight selected SNPs with previously known associations with cardiovascular diseases or lipid metabolism (Table 3): rs1045642 [pharmacogenetic predictor of lipid-lowering response to atorvastatin]35, rs440446 [ApoE gene; variations in ApoE affect cholesterol metabolism, which in turn alters the risk of heart disease, in particular heart attack or stroke]36, rs2200733 [atrial fibrillation (AF) and ischemic stroke]37,38, rs405509 [CAD]39, rs1333049 [CAD]40,41, rs1800795 [CAD]42, rs1800888 [MACE after PCI]43, and rs1048101 [hypertension]44. SNPs were genotyped in DNA samples from these subjects. Genotyping was conducted using commercial TaqMan allelic discrimination assays available through Applied Biosystems, Inc. (ABI, Foster City, CA, USA). The case-control analysis was performed using PLINK, a free, open-source genetic analysis toolset (http://pngu.mgh.harvard.edu/~purcell/plink/).45 This platform was selected for its efficiency, flexibility, and ease of application. The primary goal of this validation was to replicate these associations using our MACE algorithm.
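
The text does not say which PLINK test produced the p-values, so the following is only an illustrative sketch of the simplest kind of case-control comparison: a 1-degree-of-freedom chi-square test on a 2x2 table of minor and major allele counts. The function name is hypothetical, and the allele counts in the usage line are invented for illustration; they are not data from this study.

```python
import math
from typing import Tuple


def allelic_chi_square(
    case_minor: int, case_major: int, control_minor: int, control_major: int
) -> Tuple[float, float]:
    """Pearson chi-square (1 df) on a 2x2 allele-count table.

    Returns (chi-square statistic, p-value). Assumes every row and
    column total is non-zero.
    """
    table = [[case_minor, case_major], [control_minor, control_major]]
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts: 533 cases give 1,066 alleles and
# 2,642 controls give 5,284 alleles.
chi2, p = allelic_chi_square(540, 526, 2450, 2834)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

Using `math.erfc` keeps the sketch free of third-party dependencies; a statistics library would normally be used for anything beyond a 2x2 table.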
Table 3. Association between eight SNPs previously reported to be associated with CV disease with MACE on statins in our population.

Chr. | SNP | Gene/Association | Minor Allele Frequency | p-value
7 | rs1045642 | ABCB1/predictors of lipid-lowering response to atorvastatin | 0.472 | 0.001
8 | rs1048101 | ADRA1A/hypertension | 0.457 | 0.006
19 | rs440446 | ApoE/cholesterol metabolism, heart disease, AMI, and stroke | 0.348 | 0.009
4 | rs2200733 | 4q25/atrial fibrillation (AF), ischemic stroke | 0.119 | 0.016
19 | rs405509 | ApoE/CAD | 0.474 | 0.032
5 | rs1800888 | ADRB2/MACE after undertaking PCI | 0.012 | 0.040
9 | rs1333049 | CDKN2B/CAD | 0.479 | 0.121
7 | rs1800795 | IL6/CAD | 0.418 | 0.940

Results

Table 2 summarizes the manual chart review results, in which the positive predictive value (PPV) for MACE case identification ranged from 90% to 97%. Some false positives were caused by system coding errors, e.g., a “stent” code assigned for an esophageal stent placement. The algorithm performed well in identifying the 1st event (PPV ~90%). Some previous major events were missed because they happened a long time ago (before 1990) and were not recorded in our current system.
Table 2. Results of manual chart review.

Category | PPV
Any AMI event | 96.67%
1st AMI event | 96.67%
Any AMI while on Statin | 90.00%
1st AMI event while on Statin | 90.00%
Any revascularization event | 96.55%
1st revascularization event | 89.66%
Any revascularization while on Statin | 96.55%
1st revascularization event while on Statin | 89.66%

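
PPV is the number of algorithm-identified cases confirmed on chart review divided by the number reviewed. The confirmed and reviewed counts are not reported in the text; the counts in the sketch below are assumptions, chosen only because they are simple ratios near the stated sample size of 30 that round to the percentages in Table 2.

```python
def ppv(confirmed: int, reviewed: int) -> float:
    """Positive predictive value: confirmed cases / reviewed cases."""
    if reviewed <= 0:
        raise ValueError("reviewed must be positive")
    return confirmed / reviewed


# Assumed (confirmed, reviewed) pairs; not reported in the source.
for confirmed, reviewed in [(29, 30), (27, 30), (28, 29), (26, 29)]:
    print(f"{confirmed}/{reviewed} = {ppv(confirmed, reviewed):.2%}")
```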
A total of 533 MACE cases and 2,642 MACE-free controls were identified from the 7,747 subjects of the demonstration cohort. Eight pre-selected SNPs were genotyped for all 3,175 subjects. Variants with a call rate below 99% were removed from the final analyses. Case-control analysis replicated six of the eight previously reported associations with cardiovascular diseases or lipid metabolism disorders. The validation results are shown in Table 3. The strongest association was observed for the variant located in the ABCB1 gene (rs1045642). This SNP has been reported to influence the response to atorvastatin35 and may therefore affect our cardiovascular endpoint, MACE. Two SNPs located in the ApoE gene (rs440446 and rs405509) were replicated. Both play a critical role in cholesterol metabolism, which in turn affects the development of heart disease.36,39 Rs2200733 is another important cardiovascular-relevant SNP that we replicated. Numerous studies have reported that it is strongly associated with CAD regardless of race.37,38,46–50 We also validated the associations between MACE and two adrenergic receptor SNPs, rs1048101 and rs1800888. The former has been reported to alter alpha1-adrenergic receptor autoantibody production in hypertensive patients,44 while the latter was associated with more aggressive CAD and worse prognosis in a study of 330 patients undergoing PCI.43
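
The count of six replicated associations can be checked against the p-values in Table 3. The significance threshold is not stated in the text; the sketch below assumes a nominal cutoff of 0.05 with no correction for multiple testing, which is the cutoff under which the table yields six of eight.

```python
# p-values copied from Table 3.
P_VALUES = {
    "rs1045642": 0.001,
    "rs1048101": 0.006,
    "rs440446": 0.009,
    "rs2200733": 0.016,
    "rs405509": 0.032,
    "rs1800888": 0.040,
    "rs1333049": 0.121,
    "rs1800795": 0.940,
}

ALPHA = 0.05  # assumed nominal threshold; not stated in the source

replicated = [snp for snp, p in P_VALUES.items() if p < ALPHA]
not_replicated = [snp for snp, p in P_VALUES.items() if p >= ALPHA]

print(f"replicated {len(replicated)} of {len(P_VALUES)}: {replicated}")
print(f"not replicated: {not_replicated}")
```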

Discussion

In this paper, we report a novel algorithm for use in EMRs to accurately identify cases with MACE and 1st MACE while on statins. The algorithm achieved PPVs of 90% to 97% for the identification of MACE cases as compared to clinician review. By applying the algorithm to EMR data from the demonstration cohort in BioVU, we identified cases and controls and used them to replicate six of eight associations with known genetic variants. Our results demonstrate that the algorithm can be used to accurately identify cases with MACE while on statins.
References (10 of 50 listed)

Review 1.  Electronic medical records as a tool in clinical pharmacology: opportunities and challenges.

Authors:  D M Roden; H Xu; J C Denny; R A Wilke
Journal:  Clin Pharmacol Ther       Date:  2012-06       Impact factor: 6.875

2.  Facilitating pharmacogenetic studies using electronic health records and natural-language processing: a case study of warfarin.

Authors:  Hua Xu; Min Jiang; Matt Oetjens; Erica A Bowton; Andrea H Ramirez; Janina M Jeff; Melissa A Basford; Jill M Pulley; James D Cowan; Xiaoming Wang; Marylyn D Ritchie; Daniel R Masys; Dan M Roden; Dana C Crawford; Joshua C Denny
Journal:  J Am Med Inform Assoc       Date:  2011 Jul-Aug       Impact factor: 4.497

3.  The rs2200733 variant on chromosome 4q25 is a risk factor for cardioembolic stroke related to atrial fibrillation in Polish patients.

Authors:  Marcin Wnuk; Joanna Pera; Jeremiasz Jagiełła; Elżbieta Szczygieł; Antoni Ferens; Karolina Spisak; Paweł Wołkow; Maria Kmieć; Jacek Burkot; Joanna Chrzanowska-Waśko; Wojciech Turaj; Agnieszka Słowik
Journal:  Neurol Neurochir Pol       Date:  2011 Mar-Apr       Impact factor: 1.621

4.  Positive predictive value of the diagnosis of acute myocardial infarction in an administrative database.

Authors:  L A Petersen; S Wright; S L Normand; J Daley
Journal:  J Gen Intern Med       Date:  1999-09       Impact factor: 5.128

5.  Characterization of statin dose response in electronic medical records.

Authors:  W-Q Wei; Q Feng; L Jiang; M S Waitara; O F Iwuchukwu; D M Roden; M Jiang; H Xu; R M Krauss; J I Rotter; D A Nickerson; R L Davis; R L Berg; P L Peissig; C A McCarty; R A Wilke; J C Denny
Journal:  Clin Pharmacol Ther       Date:  2013-10-04       Impact factor: 6.875

6.  Risk variants for atrial fibrillation on chromosome 4q25 associate with ischemic stroke.

Authors:  Solveig Gretarsdottir; Gudmar Thorleifsson; Andrei Manolescu; Unnur Styrkarsdottir; Anna Helgadottir; Andreas Gschwendtner; Konstantinos Kostulas; Gregor Kuhlenbäumer; Steve Bevan; Thorbjorg Jonsdottir; Hjordis Bjarnason; Jona Saemundsdottir; Stefan Palsson; David O Arnar; Hilma Holm; Gudmundur Thorgeirsson; Einar Mar Valdimarsson; Sigurlaug Sveinbjörnsdottir; Christian Gieger; Klaus Berger; H-Erich Wichmann; Jan Hillert; Hugh Markus; Jeffrey Robert Gulcher; E Bernd Ringelstein; Augustine Kong; Martin Dichgans; Daniel Fannar Gudbjartsson; Unnur Thorsteinsdottir; Kari Stefansson
Journal:  Ann Neurol       Date:  2008-10       Impact factor: 10.422

7.  Effect of early use of low-dose pravastatin on major adverse cardiac events in patients with acute myocardial infarction: the OACIS-LIPID Study.

Authors:  Hiroshi Sato; Kunihiro Kinjo; Hiroshi Ito; Atsushi Hirayama; Shinsuke Nanto; Masatake Fukunami; Masami Nishino; Young-Jae Lim; Yoshiyuki Kijima; Yukihiro Koretsune; Daisaku Nakatani; Hiroya Mizuno; Masahiko Shimizu; Masatsugu Hori
Journal:  Circ J       Date:  2008-01       Impact factor: 2.993

8.  PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations.

Authors:  Joshua C Denny; Marylyn D Ritchie; Melissa A Basford; Jill M Pulley; Lisa Bastarache; Kristin Brown-Gentry; Deede Wang; Dan R Masys; Dan M Roden; Dana C Crawford
Journal:  Bioinformatics       Date:  2010-03-24       Impact factor: 6.937

9.  Genetic determinants of serum lipid levels in Chinese subjects: a population-based study in Shanghai, China.

Authors:  Gabriella Andreotti; Idan Menashe; Jinbo Chen; Shih-Chen Chang; Asif Rashid; Yu-Tang Gao; Tian-Quan Han; Lori C Sakoda; Stephen Chanock; Philip S Rosenberg; Ann W Hsing
Journal:  Eur J Epidemiol       Date:  2009-11-04       Impact factor: 8.082

10.  Variation in GYS1 interacts with exercise and gender to predict cardiovascular mortality.

Authors:  Jenny Fredriksson; Dragi Anevski; Peter Almgren; Marketa Sjögren; Valeriya Lyssenko; Joyce Carlson; Bo Isomaa; Marja-Riitta Taskinen; Leif Groop; Marju Orho-Melander
Journal:  PLoS One       Date:  2007-03-14       Impact factor: 3.240
