V Eric Kerchberger, Josh F Peterson, Wei-Qi Wei.
Abstract
OBJECTIVE: COVID-19 survivors are at risk for long-term health effects, but assessing the sequelae of COVID-19 at large scales is challenging. High-throughput methods to efficiently identify new medical problems arising after acute medical events using the electronic health record (EHR) could improve surveillance for long-term consequences of acute medical problems like COVID-19.
Entities:
Keywords: COVID-19/complications; Terms: COVID-19; cohort study; electronic health records; phenome-wide association study
Year: 2022 PMID: 36005898 PMCID: PMC9452157 DOI: 10.1093/jamia/ocac159
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 7.942
Figure 1. Graphical timeline of data collection from the electronic health record and phenotype encoding schematic. Timeline of index SARS-CoV-2 test, recovery, and phenotype case and control definitions for (A) patients who were not hospitalized or (B) patients who were hospitalized around the time of the index SARS-CoV-2 test. Index date was defined as the date of either the first positive SARS-CoV-2 polymerase chain reaction (PCR) test or the first negative test for never-infected patients. Recovery date was defined as either (A) 30 days after the index SARS-CoV-2 test in nonhospitalized patients or (B) 30 days after hospital discharge in hospitalized patients. (C) Schematic of temporal-informed phenotype feature engineering. The source EHR database was queried for diagnostic billing codes and the dataset was separated based on occurrence of codes before or after the temporal event (recovery date). Phecode feature engineering was applied to both “pre-event” and “postevent” datasets separately, then recombined to generate the final temporal-informed phenotypes. In this illustration, the patient is a temporal-informed case for phenotypes 359.2 and 427.21 (denoted as “T”) as they had the corresponding diagnosis codes entered into the medical record on at least 2 separate dates after the temporal event, and did not have the diagnosis codes on a visit either before SARS-CoV-2 testing or during the acute phase. The patient is excluded from analyses of phenotypes 401.1, 480.2, 496.1, and 521.8 (denoted as “–”) as they had those phecodes prior to the recovery date. The patient is a control for all phenotypes where they had zero codes in both the pre- and postevent datasets (eg, 204, 1001, and others; denoted as “F”). If the patient had a diagnosis-specific exclusion for a phecode in either dataset, the patient was excluded for that phecode in the temporal-informed phenotypes (Supplementary Table S1 and Appendix).
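The case, control, and exclusion rules in panel (C) can be sketched for a single patient and a single phecode as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, diagnosis-specific exclusions are omitted, counting a code dated on the recovery date itself as "postevent" is an assumption, and treating a patient with only one postevent code date as excluded is also an assumption, since the caption does not describe that situation.

```python
from datetime import date, timedelta
from typing import Optional


def recovery_date(index_test: date, discharge: Optional[date] = None) -> date:
    """30 days after discharge if hospitalized, else 30 days after the index test."""
    anchor = discharge if discharge is not None else index_test
    return anchor + timedelta(days=30)


def temporal_status(code_dates: list[date], recovery: date,
                    min_post_dates: int = 2) -> Optional[bool]:
    """Return True (case, "T"), False (control, "F"), or None (excluded)."""
    # Distinct dates on which the phecode was recorded, split at the recovery date.
    pre = {d for d in code_dates if d < recovery}
    post = {d for d in code_dates if d >= recovery}  # assumption: boundary counts as post
    if pre:
        return None   # any pre-event code: excluded for this phecode
    if len(post) >= min_post_dates:
        return True   # codes on at least 2 separate postevent dates
    if not post:
        return False  # zero codes in both datasets
    return None       # assumption: a single postevent date is neither case nor control


rec = recovery_date(date(2020, 7, 1))  # nonhospitalized patient
print(rec, temporal_status([date(2020, 9, 1), date(2020, 10, 15)], rec))
```

Returning `None` for excluded patients keeps them out of both the case and control counts, which is why the number of controls differs from phecode to phecode.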
Characteristics of registry cohort
| Characteristic | Never infected | SARS-CoV-2 positive | Overall |
|---|---|---|---|
| Number in cohort | 156 017 | 30 088 | 186 105 |
| Age, median [IQR], years | 46 [32, 62] | 43 [30, 57] | 46 [32, 62] |
| Sex (%) | | | |
| Female | 89 547 (57.4) | 16 718 (55.6) | 106 265 (57.1) |
| Male | 66 470 (42.6) | 13 370 (44.4) | 79 840 (42.9) |
| Race (%) | | | |
| Black | 17 106 (11.0) | 3274 (10.9) | 20 380 (11.0) |
| Other race or multiracial | 7901 (5.1) | 1714 (5.7) | 9615 (5.2) |
| Unknown/not reported | 18 996 (12.2) | 5924 (19.7) | 24 920 (13.4) |
| White | 112 014 (71.8) | 19 176 (63.7) | 131 190 (70.5) |
| Ethnicity (%) | | | |
| Hispanic/Latino | 4759 (3.1) | 1217 (4.0) | 5976 (3.2) |
| Non-Hispanic/Non-Latino | 128 049 (82.1) | 21 936 (72.9) | 149 985 (80.6) |
| Unknown/not reported | 23 209 (14.9) | 6935 (23.0) | 30 144 (16.2) |
| Received care at VUMC prior to SARS-CoV-2 test (%) | 106 839 (68.5) | 20 860 (69.3) | 127 699 (68.6) |
| SARS-CoV-2 testing indication (%) | | | |
| Asymptomatic screening | 89 727 (57.5) | 6095 (20.3) | 95 822 (51.5) |
| Symptomatic testing | 66 290 (42.5) | 23 993 (79.7) | 90 283 (48.5) |
| EHR observation time | | | |
| After SARS-CoV-2 test, median [IQR], days | 420 [267, 533] | 392 [317, 459] | 412 [274, 528] |
| After recovery, median [IQR], days | 378 [215, 495] | 361 [285, 427] | 374 [224, 489] |
| Hospitalization associated with SARS-CoV-2 test (%) | 43 146 (27.7) | 3393 (11.3) | 46 539 (25.0) |
| Severe COVID-19 (%) | – | 2358 (7.8) | – |
| Follow-up visit type (%) | | | |
| Any follow-up visit | 96 615 (61.9) | 16 583 (55.1) | 113 198 (60.8) |
| Office visit | 89 559 (57.4) | 15 593 (51.8) | 105 152 (56.5) |
| Laboratory/anticoagulation visit | 42 646 (27.3) | 7216 (24.0) | 49 862 (26.8) |
| Inpatient surgery or procedure | 27 213 (17.4) | 4091 (13.6) | 31 304 (16.8) |
| Telemedicine visit | 16 617 (10.7) | 2478 (8.2) | 19 095 (10.3) |
| Outpatient surgery or procedure | 19 725 (12.6) | 2728 (9.1) | 22 453 (12.1) |
| Allied health practitioner visit | 14 821 (9.5) | 2580 (8.6) | 17 401 (9.4) |
| Infusion/radiation care | 4043 (2.6) | 542 (1.8) | 4585 (2.5) |
| Maternity care | 3899 (2.5) | 482 (1.6) | 4381 (2.4) |
| Outpatient observation in Emergency Department | 2403 (1.5) | 422 (1.4) | 2825 (1.5) |
| Inpatient medical admission | 1197 (0.8) | 1239 (4.1) | 2436 (1.3) |
| Time from SARS-CoV-2 test to first follow-up visit, median [IQR], days | 66 [44, 139] | 86 [48, 181] | 69 [44, 145] |
| Pregnant during study observation period (%) | 7565 (4.8) | 609 (2.0) | 8174 (4.4) |
| Pregnant around time of SARS-CoV-2 test (%) | 4488 (2.9) | 189 (0.6) | 4677 (2.5) |
| Died during postacute phase (%) | 1535 (1.0) | 158 (0.5) | 1693 (0.9) |
Defined as having at least 2 visits at VUMC prior to SARS-CoV-2 test separated by at least 180 days.
Reasons for asymptomatic screening included: asymptomatic admission to the hospital for another diagnosis, preprocedural or presurgical screening, known SARS-CoV-2 exposure, prereceipt of immunosuppressive or antineoplastic therapy, pretransplant evaluation, or requirement for placement in postacute care or long-term nursing care.
SARS-CoV-2 test performed within 15 days prior to a hospital admission or during a hospital admission.
Severe COVID-19: admitted to hospital and received supplemental oxygen.
Some patients had more than 1 visit type.
Allied health practitioner visits included visits coded as being nurse-only visits, dietitian or nutritionist visits, and clinical support or educational visits.
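The first footnote's definition of prior care (at least 2 visits before the SARS-CoV-2 test, separated by at least 180 days) reduces to a check on the earliest and latest prior visit. A minimal sketch, assuming visits are available as plain dates and that "prior" means strictly before the test date; the function name is hypothetical:

```python
from datetime import date


def received_prior_care(visit_dates: list[date], index_test: date,
                        min_gap_days: int = 180) -> bool:
    """True if two visits before the index test are at least min_gap_days apart."""
    prior = sorted(d for d in visit_dates if d < index_test)
    # Some pair is >= min_gap_days apart exactly when the first and last visits are.
    return len(prior) >= 2 and (prior[-1] - prior[0]).days >= min_gap_days


visits = [date(2019, 1, 10), date(2019, 3, 2), date(2019, 9, 15)]
print(received_prior_care(visits, date(2020, 7, 1)))
```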
Figure 2. Phecode case retention by temporal-informed phenotyping. Histograms of phenotype case retention per PheWAS code (phecode) using temporal-informed phenotyping. Individual histograms indicate each chapter within the phecode hierarchy. The number of phecodes per chapter is shown on the x axis, and case retention per phecode is shown on the y axis. Labels indicate the number of phenotypes with ≥10 cases and the median [interquartile range] of the per-phecode case retention in each chapter.
Figure 3. Temporal-informed phenome scan of postacute COVID-19. PheWAS plot of new postacute phenotypes identified by temporal-informed phenotyping for COVID-19 survivors vs never-infected patients as the referent group (n=186 105, phenotypes available for testing=902). The x axis represents phecodes grouped by chapter within the phecode hierarchy. The y axis represents the negative log-transformed P values obtained using logistic regression after adjusting for age, sex, race, ethnicity, length of EHR observation after recovery, indication for testing, and medical comorbidities prior to testing. Upward triangles represent phenotypes with odds ratio >1.0 for COVID-19 survivors and downward triangles represent phenotypes with odds ratio <1.0. The horizontal red line indicates the phenome-wide significance threshold using a Bonferroni correction (P=5.54 × 10⁻⁵).
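The significance threshold in this caption can be reproduced from the number of phenotypes tested. The sketch below assumes a family-wise alpha of 0.05 (not stated in the caption, but 0.05/902 matches the reported value) and a base-10 logarithm for the y axis:

```python
import math

ALPHA = 0.05        # assumed family-wise error rate
N_PHENOTYPES = 902  # phenotypes available for testing (from the caption)

threshold = ALPHA / N_PHENOTYPES       # Bonferroni-corrected P value threshold
line_height = -math.log10(threshold)   # position of the red line, assuming log base 10

print(f"threshold = {threshold:.2e}")             # threshold = 5.54e-05
print(f"-log10(threshold) = {line_height:.2f}")   # -log10(threshold) = 4.26
```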
Summary of temporal-informed PheWAS in postacute COVID-19
| Phecode | Description | Odds ratio | 95% CI | P value | No. cases | No. controls |
|---|---|---|---|---|---|---|
| 512.9 | Other dyspnea | 3.04 | (2.52–3.68) | 5.54 × 10⁻³¹ | 811 | 93 936 |
| 512.7 | Shortness of breath | 2.49 | (2.09–2.96) | 2.73 × 10⁻²⁴ | 988 | 93 936 |
| 569.2 | Gastrointestinal complications of surgery | 6.54 | (4.38–9.75) | 3.32 × 10⁻²⁰ | 116 | 166 825 |
| 278.11 | Morbid obesity | 2.35 | (1.93–2.86) | 1.49 × 10⁻¹⁷ | 624 | 154 861 |
| 649 | Conditions of the mother complicating pregnancy, childbirth, or the puerperium | 3.85 | (2.76–5.38) | 2.66 × 10⁻¹⁵ | 169 | 95 518 |
| 509.1 | Respiratory failure | 7.09 | (4.35–11.6) | 3.89 × 10⁻¹⁵ | 101 | 157 792 |
| 136 | Other infectious and parasitic diseases | 9.20 | (5.14–16.5) | 8.43 × 10⁻¹⁴ | 54 | 181 966 |
| 359.2 | Myopathy | 20.5 | (9.24–45.4) | 9.99 × 10⁻¹⁴ | 33 | 174 863 |
| 427.9 | Palpitations | 2.14 | (1.75–2.61) | 1.40 × 10⁻¹³ | 628 | 137 086 |
| 418.1 | Precordial pain | 3.21 | (2.35–4.39) | 2.71 × 10⁻¹³ | 278 | 138 537 |
| 418 | Nonspecific chest pain | 2.01 | (1.66–2.43) | 1.19 × 10⁻¹² | 746 | 138 537 |
| 646 | Other complications of pregnancy NEC | 5.91 | (3.55–9.83) | 7.89 × 10⁻¹² | 69 | 99 542 |
| 585.1 | Acute renal failure | 3.15 | (2.26–4.38) | 9.49 × 10⁻¹² | 309 | 157 475 |
| 427.21 | Atrial fibrillation | 2.62 | (1.98–3.48) | 2.56 × 10⁻¹¹ | 443 | 137 086 |
| 1010 | Other tests | 3.17 | (2.19–4.60) | 1.21 × 10⁻⁹ | 155 | 169 347 |
| 644 | Anemia during pregnancy | 7.43 | (3.74–14.7) | 9.91 × 10⁻⁹ | 38 | 101 761 |
| 1010.6 | Reproductive and maternal health services | 1.75 | (1.44–2.12) | 9.99 × 10⁻⁹ | 591 | 172 787 |
| 638 | Other high-risk pregnancy | 2.19 | (1.67–2.86) | 1.34 × 10⁻⁸ | 312 | 178 757 |
| 350.1 | Abnormal involuntary movements | 2.53 | (1.83–3.48) | 1.46 × 10⁻⁸ | 256 | 170 487 |
| 671 | Venous/cerebrovascular complications & embolism in pregnancy and the puerperium | 21.5 | (7.25–63.7) | 3.10 × 10⁻⁸ | 17 | 103 586 |
| 649.1 | Diabetes or abnormal glucose tolerance complicating pregnancy | 4.73 | (2.68–8.34) | 7.77 × 10⁻⁸ | 57 | 95 518 |
| 782.3 | Edema | 2.08 | (1.59–2.73) | 8.34 × 10⁻⁸ | 424 | 168 184 |
| 452.2 | Deep vein thrombosis [DVT] | 3.23 | (2.09–4.99) | 1.26 × 10⁻⁷ | 138 | 162 711 |
| 285 | Other anemias | 2.05 | (1.56–2.68) | 1.85 × 10⁻⁷ | 473 | 146 505 |
| 781 | Symptoms involving nervous and musculoskeletal systems | 3.07 | (2.01–4.68) | 1.88 × 10⁻⁷ | 151 | 180 070 |
| 1013 | Asphyxia and hypoxemia | 5.51 | (2.89–10.5) | 2.07 × 10⁻⁷ | 52 | 175 439 |
| 292 | Neurological deficits | 2.39 | (1.72–3.32) | 2.31 × 10⁻⁷ | 242 | 162 234 |
| 599.2 | Retention of urine | 2.93 | (1.95–4.41) | 2.45 × 10⁻⁷ | 184 | 149 134 |
| 514 | Abnormal findings examination of lungs | 2.29 | (1.64–3.20) | 9.86 × 10⁻⁷ | 350 | 163 569 |
| 587 | Kidney replaced by transplant | 32.4 | (7.99–131) | 1.12 × 10⁻⁶ | 22 | 157 475 |
| 401.1 | Essential hypertension | 1.42 | (1.23–1.64) | 2.17 × 10⁻⁶ | 1698 | 122 907 |
| 278.1 | Obesity | 1.70 | (1.36–2.12) | 2.33 × 10⁻⁶ | 566 | 154 861 |
| 327.32 | Obstructive sleep apnea | 1.69 | (1.36–2.11) | 2.51 × 10⁻⁶ | 669 | 150 608 |
| 420.1 | Myocarditis | 10.0 | (3.83–26.2) | 2.67 × 10⁻⁶ | 20 | 177 003 |
| 250.2 | Type 2 diabetes | 1.77 | (1.38–2.25) | 4.75 × 10⁻⁶ | 572 | 148 033 |
| 348.8 | Encephalopathy, not elsewhere classified | 6.23 | (2.76–14.1) | 1.10 × 10⁻⁵ | 32 | 160 519 |
| 653 | Problems associated with amniotic cavity and membranes | 8.04 | (3.15–20.5) | 1.32 × 10⁻⁵ | 19 | 97 532 |
| 502 | Postinflammatory pulmonary fibrosis | 5.47 | (2.49–12.0) | 2.26 × 10⁻⁵ | 40 | 157 792 |
| 284.1 | Pancytopenia | 3.25 | (1.87–5.66) | 2.96 × 10⁻⁵ | 94 | 146 505 |
| 38.3 | Bacteremia | 8.03 | (2.95–21.9) | 4.54 × 10⁻⁵ | 19 | 166 009 |
| 292.3 | Memory loss | 1.99 | (1.43–2.77) | 5.09 × 10⁻⁵ | 287 | 162 234 |
| 285.21 | Anemia in chronic kidney disease | 3.10 | (1.79–5.36) | 5.22 × 10⁻⁵ | 104 | 146 505 |
| 54 | Herpes simplex | 3.66 | (1.95–6.85) | 5.22 × 10⁻⁵ | 54 | 149 827 |
A list of ICD-10-CM codes included in each phecode is available at: https://phewascatalog.org/phecodes_icd10cm.
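As a rough consistency check on a row of this table, an approximate two-sided P value can be recovered from the odds ratio and its 95% CI. The sketch assumes the intervals are Wald intervals on the log-odds scale, which is the usual output of logistic regression but is not stated in the source; because the table values are rounded, the result should only agree with the reported P value to within an order of magnitude. The function name is hypothetical.

```python
import math


def wald_p_from_ci(odds_ratio: float, ci_low: float, ci_high: float,
                   z_crit: float = 1.959964) -> float:
    """Approximate two-sided P value implied by an odds ratio and its 95% CI."""
    # Standard error of log(OR): a Wald CI spans 2 * z_crit standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Essential hypertension (phecode 401.1): OR 1.42, 95% CI 1.23-1.64, reported P 2.17e-6
print(f"{wald_p_from_ci(1.42, 1.23, 1.64):.1e}")
```

Using `math.erfc` rather than `1 - cdf` avoids losing precision for the very small P values at the top of the table.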
Figure 4. Temporal-informed phenome scans of postacute COVID-19 by demographic subgroups and timing of postacute diagnoses. PheWAS results for new postacute phenotypes identified by temporal-informed phenotyping among all adults tested for SARS-CoV-2 (left column, n=186 105), stratified by demographic subgroups (male sex, female sex, White non-Hispanic, Black non-Hispanic), and stratified by onset of the new diagnoses (“Early” diagnoses: within 60 days after recovery; “Late” diagnoses: later than 60 days after recovery). The y axis represents phecodes grouped by chapter within the phecode hierarchy. Cell color intensity illustrates adjusted P values by logistic regression. Text in cells shows point estimates of the odds ratios. Text in bold/italic and with a “*” indicates PheWAS associations that were statistically significant using a Bonferroni correction. Results for phecodes with a statistically significant association in any subgroup analysis are displayed. Empty cells indicate analyses with insufficient phenotype cases (<10) to perform the analysis for that phenotype in the subgroup.
Summary of temporal-informed PheWAS for severe COVID-19 survivors
| Phecode | Description | Odds ratio | 95% CI | P value | No. cases | No. controls |
|---|---|---|---|---|---|---|
| 509.1 | Respiratory failure | 225 | (62.7–808) | 1.02 × 10⁻¹⁵ | 31 | 25 204 |
| 401.1 | Essential hypertension | 3.71 | (2.55–5.39) | 6.72 × 10⁻¹² | 243 | 21 801 |
| 514 | Abnormal findings examination of lungs | 10.7 | (4.93–23.4) | 2.30 × 10⁻⁹ | 42 | 25 588 |
| 504 | Other interstitial lung disease | 142 | (24.7–818) | 1.55 × 10⁻⁶ | 10 | 25 204 |
| 507 | Pleurisy or pleural effusion | 28.5 | (7.92–103) | 1.76 × 10⁻⁶ | 14 | 25 204 |
| 427.21 | Atrial fibrillation | 4.26 | (2.38–7.63) | 6.11 × 10⁻⁶ | 68 | 23 263 |
| 798 | Malaise and fatigue | 2.91 | (1.87–4.52) | 1.95 × 10⁻⁶ | 162 | 19 803 |
| 276.13 | Hyperpotassemia | 12.0 | (4.15–34.7) | 4.45 × 10⁻⁶ | 24 | 24 600 |
| 502 | Postinflammatory pulmonary fibrosis | 47.5 | (8.11–278) | 1.86 × 10⁻⁵ | 10 | 25 204 |
| 250.22 | Type 2 diabetes with renal manifestations | 45.7 | (7.79–268) | 2.30 × 10⁻⁵ | 32 | 24 221 |
| 1013 | Asphyxia and hypoxia | 11.8 | (3.45–40.5) | 8.59 × 10⁻⁵ | 15 | 26 963 |
A list of ICD-10-CM codes included in each phecode is available at: https://phewascatalog.org/phecodes_icd10cm.
Changes in outpatient vital signs or laboratory studies for select temporal-informed phenotypes
| Postacute phenotype(s) | Vital sign/lab (units) | Subgroup | Never infected, mean change from pretesting to postacute (SD) | SARS-CoV-2 positive, mean change from pretesting to postacute (SD) | Adjusted mean difference (95% CI) | P value |
|---|---|---|---|---|---|---|
| Obesity, morbid obesity | BMI (kg/m²) | Nonobese ( | 0.01 (1.6) | 0.21 (1.4) | 0.16 (0.12–0.21) | 2.00 × 10⁻¹³ |
| Essential hypertension | Systolic blood pressure (mmHg) | Normal blood pressure or prehypertension ( | −0.2 (13.0) | 0.4 (12.0) | 0.5 (0.1–1.0) | 0.015 |
| Palpitations, atrial fibrillation | Heart rate (bpm) | Normal heart rate, no arrhythmia diagnoses ( | 0.1 (12) | 1.1 (12) | 1.0 (0.6–1.3) | 3.81 × 10⁻⁷ |
| Respiratory failure | Respiratory rate (min⁻¹) | Normal respiratory rate, no lung disorders ( | −0.1 (2.2) | 0.1 (2.3) | 0.2 (0.1–0.3) | 3.89 × 10⁻⁵ |
| Pancytopenia | White blood cell (10³/µL) | Normal WBC, no hematologic disorders ( | 0.0 (1.9) | 0.2 (1.9) | 0.2 (0.1–0.3) | 5.72 × 10⁻⁶ |
| Acute renal failure | Estimated GFR (mL/min) | No renal failure or kidney transplant ( | 0 (13) | 1 (12) | 1 (0–1) | 0.008 |
Among patients with the vital sign or lab value recorded both within 180 days prior to SARS-CoV-2 testing and within 365 days following recovery.
Prior to SARS-CoV-2 testing.
Calculated for each patient as Y_postacute − Y_pretesting, where Y is the vital sign value or laboratory value. Negative values indicate a decrease in the vital sign/lab value from the pretesting to the postacute phases, and positive values indicate an increase in the vital sign/lab value.
Mean difference and 95% CI between groups adjusted for age, sex, race, ethnicity, and time between pre-SARS-CoV-2 test value and postacute value.
Adjusted P values using linear regression.
Figure 5. Changes in select vital signs and laboratory test values in postacute COVID-19. COVID-19 survivors had more substantial changes in (A) body mass index, (B) heart rate, (C) respiratory rate, and (D) white blood cell count from pretesting to postrecovery compared with never-infected controls. For each patient we used the median pretesting values obtained during outpatient visits occurring within 180 days before the index SARS-CoV-2 test, and the median postrecovery values obtained during outpatient visits occurring within 365 days after recovery from illness. Dots represent mean values in each exposure group, bars represent standard errors of the mean. Labels represent the adjusted mean difference between COVID-19 survivors and never-infected controls, number of patients with data for each analysis, and P values obtained by multiple linear regression.
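The per-patient change described here (median postrecovery value minus median pretesting value) can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: the function name and the `(date, value)` input format are hypothetical, and whether measurements taken exactly on the window boundaries are included is an assumption.

```python
from datetime import date, timedelta
from statistics import median
from typing import Optional


def change_from_pretesting(measurements: list[tuple[date, float]],
                           index_test: date, recovery: date,
                           pre_days: int = 180,
                           post_days: int = 365) -> Optional[float]:
    """Median postrecovery value minus median pretesting value for one patient."""
    # Outpatient values within 180 days before the index test (test date excluded).
    pre = [v for d, v in measurements
           if index_test - timedelta(days=pre_days) <= d < index_test]
    # Outpatient values within 365 days after recovery (recovery date excluded).
    post = [v for d, v in measurements
            if recovery < d <= recovery + timedelta(days=post_days)]
    if not pre or not post:
        return None  # patient lacks a value in one of the two windows
    return median(post) - median(pre)


heart_rate = [(date(2020, 5, 1), 70), (date(2020, 6, 1), 74),
              (date(2020, 9, 1), 76), (date(2020, 12, 1), 80)]
print(change_from_pretesting(heart_rate, date(2020, 7, 1), date(2020, 7, 31)))  # 6.0
```

Patients for whom the function returns `None` correspond to those without a value recorded in both windows, who are not part of the comparison.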