Ning Shang1, Cong Liu1, Luke V Rasmussen2, Casey N Ta1, Robert J Carroll3, Barbara Benoit4, Todd Lingren5, Ozan Dikilitas6, Frank D Mentch7, David S Carrell8, Wei-Qi Wei3, Yuan Luo2, Vivian S Gainer4, Iftikhar J Kullo6, Jennifer A Pacheco2, Hakon Hakonarson7, Theresa L Walunas2, Joshua C Denny3, Ken Wiley9, Shawn N Murphy4, George Hripcsak10, Chunhua Weng11. 1. Department of Biomedical Informatics, Columbia University, New York, NY, United States. 2. Northwestern University Feinberg School of Medicine, Chicago, IL, United States. 3. Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States. 4. Research Information Science and Computing, Partners Healthcare, Boston, MA, United States. 5. Department of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, United States. 6. Department of Cardiovascular Medicine, Mayo Clinic, Rochester, MN, United States. 7. Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, United States. 8. Kaiser Permanente Washington Health Research Institute, Seattle, WA, United States. 9. National Human Genome Research Institute, NIH, Bethesda, MD, United States. 10. Department of Biomedical Informatics, Columbia University, New York, NY, United States; Medical Informatics Services, NewYork-Presbyterian Hospital, New York, NY, United States. Electronic address: hripcsak@columbia.edu. 11. Department of Biomedical Informatics, Columbia University, New York, NY, United States. Electronic address: chunhua@columbia.edu.
Abstract
BACKGROUND: Implementing a phenotype algorithm requires phenotype engineers to interpret a human-readable description (text and flowcharts) and translate it into a computable phenotype, a process that can be labor-intensive and error-prone. Portable algorithms are needed to reduce this implementation effort.
METHODS: We conducted a retrospective analysis of phenotype algorithms developed in the Electronic Medical Records and Genomics (eMERGE) network and identified the common customization tasks required for implementation. We developed a novel scoring system that quantifies portability along three aspects: Knowledge conversion, clause Interpretation, and Programming (KIP). Tasks were grouped into twenty representative categories. For each category, experienced phenotype engineers estimated the average time spent and evaluated the time savings enabled by a common data model (CDM), specifically the Observational Medical Outcomes Partnership (OMOP) model.
RESULTS: A total of 485 distinct clauses (phenotype criteria) were identified from 55 phenotype algorithms, corresponding to 1153 customization tasks. In addition to 25 non-phenotype-specific tasks, 46 tasks were related to interpretation, 613 to knowledge conversion, and 469 to programming. Each aspect is assigned a score between 0 and 2 (0 for easy, 1 for moderate, and 2 for difficult portability), yielding a total KIP score ranging from 0 to 6. The average clause-wise KIP score is 1.37 ± 1.38. Specifically, the average knowledge (K) score is 0.64 ± 0.66, the average interpretation (I) score is 0.33 ± 0.55, and the average programming (P) score is 0.40 ± 0.64. 5% of the categories can be completed within one hour (median), whereas 70% of the categories take from days to months to complete. The OMOP model can assist with vocabulary mapping tasks.
CONCLUSION: This study presents firsthand knowledge of the substantial implementation effort involved in phenotyping and introduces a novel metric (KIP) that measures the portability of phenotype algorithms, quantifying such effort across the eMERGE Network. Phenotype developers are encouraged to analyze and optimize portability with regard to knowledge, interpretation, and programming. CDMs can be used to improve portability for some 'knowledge-oriented' tasks.
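The KIP arithmetic described in the abstract can be illustrated with a short Python sketch. It assumes only what the abstract states: each clause receives a score of 0, 1, or 2 for each of the three aspects, and the clause-wise KIP score is their sum (0 to 6). The class name `ClauseScore`, the helper `mean_kip`, and the three example clauses are hypothetical and are not taken from the study's data or code.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ClauseScore:
    """Hypothetical container for one clause's K, I, and P scores."""
    knowledge: int        # K: knowledge conversion
    interpretation: int   # I: clause interpretation
    programming: int      # P: programming

    def __post_init__(self):
        # Each aspect is 0 (easy), 1 (moderate), or 2 (difficult).
        for name in ("knowledge", "interpretation", "programming"):
            value = getattr(self, name)
            if value not in (0, 1, 2):
                raise ValueError(f"{name} must be 0, 1, or 2, got {value!r}")

    @property
    def kip(self) -> int:
        """Total KIP score for the clause, in the range 0 to 6."""
        return self.knowledge + self.interpretation + self.programming


def mean_kip(clauses):
    """Average clause-wise KIP score over a collection of clauses."""
    return mean(c.kip for c in clauses)


# Made-up example clauses, for illustration only.
clauses = [
    ClauseScore(knowledge=0, interpretation=0, programming=0),
    ClauseScore(knowledge=1, interpretation=0, programming=1),
    ClauseScore(knowledge=2, interpretation=1, programming=1),
]
print([c.kip for c in clauses])
print(mean_kip(clauses))
```

Because the total is a plain sum, the mean KIP score equals the sum of the per-aspect means, which matches the reported figures: 0.64 + 0.33 + 0.40 = 1.37.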