OBJECTIVE: There is increasing interest in using electronic health records (EHRs) to identify subjects for genomic association studies, due in part to the availability of large amounts of clinical data and the expected cost efficiencies of subject identification. We describe the construction and validation of an EHR-based algorithm to identify subjects with age-related cataracts.

MATERIALS AND METHODS: We used a multi-modal strategy consisting of structured database querying, natural language processing on free-text documents, and optical character recognition on scanned clinical images to identify cataract subjects and related cataract attributes. Extensive validation on 3657 subjects compared the multi-modal results to manual chart review. The algorithm was also implemented at participating electronic MEdical Records and GEnomics (eMERGE) institutions.

RESULTS: An EHR-based cataract phenotyping algorithm was successfully developed and validated, resulting in positive predictive values (PPVs) >95%. The multi-modal approach increased the identification of cataract subject attributes by a factor of three compared to single-mode approaches while maintaining high PPV. Components of the cataract algorithm were successfully deployed at three other institutions with similar accuracy.

DISCUSSION: A multi-modal strategy incorporating optical character recognition and natural language processing may increase the number of cases identified while maintaining similar PPVs. Such algorithms, however, require that the needed information be embedded within clinical documents.

CONCLUSION: We have demonstrated that algorithms to identify and characterize cataracts can be developed utilizing data collected via the EHR. These algorithms provide a high level of accuracy even when implemented across multiple EHRs and institutional boundaries.
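
To make the validation metric concrete, the following is a minimal Python sketch of how subjects flagged by the three modes (structured query, natural language processing, optical character recognition) might be pooled and then scored against manual chart review using positive predictive value. It is an illustration only, not the authors' implementation: the names `ModeResults`, `combine_modes`, and `positive_predictive_value` are hypothetical, pooling by simple set union is an assumption, and the subject IDs are toy values rather than study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeResults:
    """Subject IDs flagged as cataract cases by each mode (hypothetical container)."""
    structured: frozenset  # structured database query
    nlp: frozenset         # natural language processing on free text
    ocr: frozenset         # optical character recognition on scanned images


def combine_modes(results: ModeResults) -> frozenset:
    """Pool the modes by set union (an assumption; the abstract does not give the rule)."""
    return results.structured | results.nlp | results.ocr


def positive_predictive_value(predicted: frozenset, confirmed: frozenset) -> float:
    """PPV = true positives / all predicted positives."""
    if not predicted:
        raise ValueError("PPV is undefined when no subjects are predicted positive")
    true_positives = len(predicted & confirmed)
    return true_positives / len(predicted)


# Toy subject IDs for illustration only; these are NOT results from the study.
toy = ModeResults(
    structured=frozenset({1, 2}),
    nlp=frozenset({2, 3, 4}),
    ocr=frozenset({5, 6}),
)
chart_confirmed = frozenset({1, 2, 3, 4, 5})  # cases confirmed by manual chart review

combined = combine_modes(toy)
print("subjects flagged by structured query alone:", len(toy.structured))
print("subjects flagged by all modes combined:", len(combined))
print("PPV of combined result:", positive_predictive_value(combined, chart_confirmed))
```

A union maximizes the number of subjects identified, but any false positive from one mode lowers the pooled PPV, which is why each mode would need to be validated against chart review in its own right.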