Jodell E Linder, Lisa Bastarache, Jacob J Hughey, Josh F Peterson.
Abstract
Recent advances in genomic technology and widespread adoption of electronic health records (EHRs) have accelerated the development of genomic medicine, bringing promising research findings from genome science into clinical practice. Genomic and phenomic data, accrued across large populations through biobanks linked to EHRs, have enabled the study of genetic variation at a phenome-wide scale. Through new quantitative techniques, pleiotropy can be explored with phenome-wide association studies, the occurrence of common complex diseases can be predicted using the cumulative influence of many genetic variants (polygenic risk scores), and undiagnosed Mendelian syndromes can be identified using EHR-based phenotypic signatures (phenotype risk scores). In this review, we trace the role of EHRs from the development of genome-wide analytic techniques to translational efforts to test these new interventions in the clinic. Throughout, we describe the challenges that remain when combining EHRs with genetics to improve clinical care.
Keywords: GWAS; PheRS; PheWAS; electronic health records; phenome; translational genomics
Year: 2021 PMID: 34038146 PMCID: PMC9297710 DOI: 10.1146/annurev-genom-121120-125204
Source DB: PubMed Journal: Annu Rev Genomics Hum Genet ISSN: 1527-8204 Impact factor: 9.340
Figure 1. Advancing translational genomics relies on research across the genome and phenome. Progress relies both on enabling resources and on analytic methods and tools to capitalize on those resources. Discovery research utilizing new technologies built off large-scale EHR and genomic data has led to clinical translation and implementation and to eventual changes in clinical practice. Abbreviations: EHR, electronic health record; e-phenotyping, electronic phenotyping; GWAS, genome-wide association study; PheRS, phenotype risk score; PheWAS, phenome-wide association study; PRS, polygenic risk score.
Commonly utilized data elements and tools for extracting phenotypes from EHRs for phenome-wide research
| Data element | Description | Utility for phenome science |
|---|---|---|
| Claims data | Billing claims data used for diagnosis and procedures; examples include ICD, phecodes (derived from ICD), and CPT | Structured data to extrapolate patient diagnoses, symptoms, findings, and procedures |
| Demographics | Age, sex/gender, race, ethnicity, date of birth, date of death | Covariate adjustments, cohort definition, structured data |
| Indexed concepts from clinical narratives | Terms may be mapped to SNOMED-CT and the HPO | Standardizing phenotype concepts to index and merge narrative text |
| Semistructured documents | Problem lists, family history, flow sheets, radiology, pathology, procedures, cytology reports | Natural language processing for complex e-phenotyping |
| Encounters | Admission-discharge-transfer, provider and clinic assignments | Severity stratification, healthcare utilization |
| Laboratory | Laboratory name, value, unit, date; standardized by LOINC in some EHRs | Criteria for detecting diagnoses, cohort definitions, covariate adjustments |
| Medications | Medication name, dosing, frequency, route, duration, form, strength standardized to RxNorm standard | Criteria for detecting exposures, cohort definitions, covariate adjustments |
| Tumor (cancer) registry | Cancer registry data standardized across public and private organizations by bodies such as the North American Association of Central Cancer Registries | Cancer-related e-phenotyping |
| Vital signs | Blood pressure, BMI, height, weight, temperature | Covariate adjustments, structured data |
Abbreviations: BMI, body mass index; CPT, Current Procedural Terminology; EHR, electronic health record; e-phenotyping, electronic phenotyping; HPO, Human Phenotype Ontology; ICD, International Classification of Diseases; LOINC, Logical Observation Identifiers Names and Codes; SNOMED-CT, Systematized Nomenclature of Medicine–Clinical Terms.
Figure 2. Milestones enabling translational research. EHR (top) and genomic data (bottom) technologies facilitated advancements in medical genomics, increasing the understanding of common complex diseases. Abbreviations: EHR, electronic health record; e-phenotyping, electronic phenotyping; GWAS, genome-wide association study; HITECH, Health Information Technology for Economic and Clinical Health; HLA, human leukocyte antigen; PheRS, phenotype risk score; PheWAS, phenome-wide association study.
Figure 3. Creating a PheRS (see also the sidebar titled Creating a Phenotype Risk Score). (a) Summing the weights of each feature present in an EHR to calculate the PheRS. (b) An abbreviated version of OMIM’s clinical description for cystic fibrosis. (c) An example PheRS plot for a patient diagnosed with cystic fibrosis late in life. Before the diagnosis, this patient had a cystic fibrosis PheRS in the 99th percentile. Abbreviations: EHR, electronic health record; HPO, Human Phenotype Ontology; OMIM, Online Mendelian Inheritance in Man; PheRS, phenotype risk score.
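The summation described in panel (a) and the percentile reported in panel (c) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the feature labels, prevalences, and cohort scores are hypothetical, and the weighting scheme (negative log of each feature's cohort prevalence, so that rarer features count more) is an assumption, since the caption states only that feature weights are summed.

```python
from math import log


def feature_weights(prevalence):
    """Assumed weighting: w = -log(prevalence), so rarer features weigh more."""
    return {feature: -log(p) for feature, p in prevalence.items()}


def phers(patient_features, weights):
    """Sum the weights of the syndrome features present in the patient's record."""
    return sum(w for feature, w in weights.items() if feature in patient_features)


def percentile_rank(score, cohort_scores):
    """Percentage of cohort scores that are less than or equal to `score`."""
    at_or_below = sum(1 for s in cohort_scores if s <= score)
    return 100.0 * at_or_below / len(cohort_scores)


# Hypothetical syndrome features with hypothetical cohort prevalences.
prevalence = {"bronchiectasis": 0.01, "pancreatitis": 0.02, "infertility": 0.05}
weights = feature_weights(prevalence)

# Hypothetical patient record; "hypertension" is not a syndrome feature,
# so it contributes nothing to the score.
patient = {"bronchiectasis", "infertility", "hypertension"}
score = phers(patient, weights)

# Hypothetical scores for other individuals in the cohort.
cohort = [0.0, 0.0, 0.0, 3.0, 3.9, 4.6, 0.0, 0.0, 3.0, score]
rank = percentile_rank(score, cohort)
```

Features outside the syndrome's description are ignored by design, so the score reflects only how much of the expected phenotypic signature appears in the record.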
Figure 4. Cumulative number of publications that included terms for common large-scale analytic methods (GWAS, genome-wide association study; PheWAS, phenome-wide association study, phenome wide; GRS, genomic risk score, genetic risk score, polygenic risk score) in the title, abstract, or MeSH term. Enabling methods such as GWAS and PheWAS in combination with the availability of large-scale EHR data laid the foundation for translational research such as PRSs and GRSs. Abbreviations: EHR, electronic health record; GRS, genomic risk score; GWAS, genome-wide association study; MeSH, Medical Subject Headings; PheWAS, phenome-wide association study; PRS, polygenic risk score.