Cailey I Kerley, Shikha Chaganti, Tin Q Nguyen, Camilo Bermudez, Laurie E Cutting, Lori L Beason-Held, Thomas Lasko, Bennett A Landman.
Abstract
With the increasing availability of electronic medical record (EMR) data, phenome-wide association studies (PheWAS) and phenome-disease association studies (PheDAS) have become a prominent, first-line method for analyzing EMR data. Despite this growth, approachable software tools for conducting these analyses on large-scale EMR cohorts are lacking. In this article, we introduce pyPheWAS, an open-source Python package for conducting PheDAS and related analyses. The toolkit includes 1) data preparation, such as cohort censoring and age-matching; 2) traditional PheDAS analysis of ICD-9 and ICD-10 billing codes; 3) PheDAS analysis applied to a novel EMR phenotype mapping: Current Procedural Terminology (CPT) codes; and 4) novelty analysis of significant disease-phenotype associations found through PheDAS. The pyPheWAS toolkit is approachable and comprehensive, covering data preparation through result visualization within a simple command-line interface. The toolkit is designed for the ever-growing scale of available EMR data and can analyze cohorts of 100,000+ patients in less than 2 h. Through a case study of Down Syndrome and other intellectual developmental disabilities, we demonstrate the ability of pyPheWAS to discover both known and potentially novel disease-phenotype associations across different experiment designs and disease groups. The software and user documentation are available as open source at https://github.com/MASILab/pyPheWAS.
Keywords: Electronic Medical Records; ICD; PheDAS; PheWAS; Phenotype
Year: 2022 PMID: 34981404 PMCID: PMC9250547 DOI: 10.1007/s12021-021-09553-4
Source DB: PubMed Journal: Neuroinformatics ISSN: 1539-2791