Jie Na (1), Nansu Zong (2), Chen Wang (1), David E Midthun (3), Yuan Luo (4), Ping Yang (1), Guoqian Jiang (2)
1. Department of Quantitative Health Sciences, Mayo Clinic, Rochester, Minnesota, USA.
2. Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, Minnesota, USA.
3. Department of Internal Medicine, Mayo Clinic, Rochester, Minnesota, USA.
4. Department of Preventive Medicine, Northwestern University, Chicago, Illinois, USA.
Abstract
OBJECTIVE: The study sought to test the feasibility of conducting a phenome-wide association study to characterize phenotypic abnormalities associated with individuals at high risk for lung cancer using electronic health records.
MATERIALS AND METHODS: We used the beta release of the All of Us Researcher Workbench, with clinical and survey data from a population of 225 000 subjects. We identified 3 cohorts of individuals at high risk of developing lung cancer based on (1) the 2013 U.S. Preventive Services Task Force criteria, (2) the long-term quitters of cigarette smoking criteria, and (3) the younger age of onset criteria. We applied logistic regression analysis to identify significant associations between individuals' phenotypes and their risk categories. We validated our findings against a lung cancer cohort from the same population and conducted an expert review to determine whether these associations are known or potentially novel.
RESULTS: We found a total of 214 statistically significant associations (P < .05 with a Bonferroni correction and odds ratio > 1.5) enriched in the high-risk individuals across the 3 cohorts, and 15 enriched in the low-risk individuals. Forty significant associations enriched in the high-risk individuals and 13 enriched in the low-risk individuals were validated in the cancer cohort. Expert review identified 15 potentially new associations enriched in the high-risk individuals.
CONCLUSIONS: It is feasible to conduct a phenome-wide association study to characterize phenotypic abnormalities associated with individuals at high risk of developing lung cancer using electronic health records. The All of Us Researcher Workbench is a promising resource for research studies that evaluate and optimize lung cancer screening criteria.
Keywords:
electronic health records (EHRs); phenome-wide association study (PheWAS); All of Us Research Program; common data model; lung cancer screening
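The significance filter stated in the abstract (Bonferroni-corrected P < .05 together with an odds ratio > 1.5) can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes an unadjusted analysis with one binary phenotype at a time, in which case the logistic regression coefficient equals the log odds ratio of the 2x2 table and its Wald standard error is the square root of the summed reciprocal cell counts. The study's models may have included covariates that this sketch omits. The function names, phenotype labels, and counts are all hypothetical.

```python
import math

def wald_test_2x2(a, b, c, d):
    """Odds ratio and two-sided Wald p-value for a 2x2 table.

    a: high-risk with phenotype,  b: high-risk without phenotype
    c: low-risk with phenotype,   d: low-risk without phenotype
    Assumes all four cells are non-zero (no continuity correction).
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio; identical to the Wald SE of the
    # coefficient in a logistic regression with a single binary predictor.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = math.log(odds_ratio) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return odds_ratio, p_value

def screen(counts, alpha=0.05, min_or=1.5):
    """Keep phenotypes passing both the Bonferroni threshold and the OR cutoff."""
    threshold = alpha / len(counts)  # Bonferroni: divide alpha by number of tests
    hits = {}
    for name, cells in counts.items():
        odds_ratio, p_value = wald_test_2x2(*cells)
        if p_value < threshold and odds_ratio > min_or:
            hits[name] = (odds_ratio, p_value)
    return hits

# Hypothetical counts, invented purely for illustration.
example_counts = {
    "phenotype_A": (120, 880, 60, 940),  # large effect, ample counts
    "phenotype_B": (100, 900, 95, 905),  # odds ratio close to 1
    "phenotype_C": (12, 988, 5, 995),    # large odds ratio, too few cases
}
hits = screen(example_counts)
```

With these invented counts only `phenotype_A` passes both filters: `phenotype_B` fails the odds ratio cutoff, and `phenotype_C` has an odds ratio above 1.5 but too few cases to reach the corrected threshold. This is why the two criteria are applied jointly.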