Donna Tjandra1, Raymond Q Migrino2,3, Bruno Giordani4, Jenna Wiens1.
Abstract
BACKGROUND: We sought to leverage data routinely collected in electronic health records (EHRs), with the goal of developing patient risk stratification tools for predicting risk of developing Alzheimer's disease (AD).
Keywords: cohort discovery; early prediction; electronic health record; machine learning
Year: 2020 PMID: 32548236 PMCID: PMC7293993 DOI: 10.1002/trc2.12035
Source DB: PubMed Journal: Alzheimers Dement (N Y) ISSN: 2352-8737
FIGURE 1 Comparing Michigan Alzheimer's Disease Research Center (Michigan‐ADRC) and Michigan Medicine's Research Data Warehouse (RDW) encounters for a sample patient. Each row represents a timeline for the respective dataset, and encounters are indicated with squares. Shading along the Michigan‐ADRC timeline indicates consensus‐based diagnoses. A true positive is counted if at least one identified Alzheimer's disease (AD) RDW encounter overlaps with the Michigan‐ADRC defined AD window (eg, the encounters in the blue circles)
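The true-positive rule in the caption can be illustrated with a minimal sketch. The function name, the use of calendar dates, and the inclusive window boundaries are assumptions made for illustration; the source does not specify how the overlap is computed.

```python
from datetime import date

def is_true_positive(ad_encounter_dates, window_start, window_end):
    """Return True if at least one flagged AD encounter falls inside the reference AD window.

    Assumption: the window is treated as inclusive at both ends.
    """
    return any(window_start <= d <= window_end for d in ad_encounter_dates)

# Hypothetical patient: two encounters flagged as AD in the EHR data.
flagged = [date(2015, 3, 1), date(2016, 7, 15)]
print(is_true_positive(flagged, date(2016, 1, 1), date(2017, 1, 1)))
```

A patient with no flagged encounter inside the window would not count as a true positive under this rule.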
FIGURE 2 Cohort discovery results. Comparison of results from cohort discovery tools that tested a single electronic health record (EHR) component, were previously published, or whose median F1 score was >0.5. Each color corresponds to the identification tool indicated in the figure legend. Complexity in medical decisions was measured by the amount and variety of patient data examined by a physician, patient risk, and treatment options. A “*” in the figure legend denotes criteria whose F1 score was significantly worse than the best cohort discovery tool
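For reference, the F1 score used to compare cohort discovery tools is the harmonic mean of precision and recall. The sketch below computes it from confusion counts; the function name and the choice to return 0.0 when there are no positives at all are illustrative assumptions, not details taken from the source.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    # Assumption: define F1 as 0.0 when there are no true or predicted positives.
    return 0.0 if denom == 0 else 2 * tp / denom

# Hypothetical counts for one cohort discovery tool.
print(f1_score(tp=8, fp=2, fn=2))
```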
FIGURE 3 Applying inclusion/exclusion criteria. We begin with all patients in Michigan Medicine's Research Data Warehouse (RDW). Numbers in each box correspond to the number of patients included/excluded
TABLE 1 Select characteristics of study cohort
| Patient demographics | RDW, N = 8,474 |
|---|---|
| Number of encounters per patient pre‐alignment (IQR) | 11 (4‐25) |
| Number of encounters per patient post‐alignment (IQR) | 84 (36‐172) |
| Female (%) | 54.94 |
Obtained from the inclusion/exclusion criteria in Figure 3.
Abbreviations: AD, Alzheimer's disease; IQR, interquartile range; RDW, Michigan Medicine's Research Data Warehouse.
FIGURE 4 Comparison of electronic health record (EHR) data contributions. A, Analysis of individual EHR data fields. Comparison of model performance when trained with specific fields of EHR data. In this experiment, all data up to 1000 days prior to alignment were used. Error bars represent 95% confidence intervals. B, Analysis of longitudinal data. Comparison of model performance when trained on information from all encounters up to 1000 days prior to alignment versus training on information from up to 500 days before alignment and information from alignment only. In this experiment, data from all EHR components were used. Error bars represent 95% confidence intervals. The black dashed line represents the receiver operating characteristic curve for random predictions. C, Analysis of individual features. Broad categories in which the features from Table 2 can fall. Numbers correspond to those found in Table 2
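The three training conditions in panel B differ only in how far back from alignment encounters are kept. A minimal sketch of that selection follows; the function name, the representation of each encounter as a count of days before alignment, and the inclusive boundaries are all assumptions for illustration.

```python
def encounters_in_window(days_before_alignment, max_days_back):
    """Keep encounters that occur between alignment (day 0) and max_days_back days earlier.

    Assumption: both ends of the window are inclusive.
    """
    return [d for d in days_before_alignment if 0 <= d <= max_days_back]

# Hypothetical patient timeline, expressed as days before alignment.
days = [0, 120, 480, 730, 990, 1400]
print(encounters_in_window(days, 1000))  # all encounters up to 1000 days back
print(encounters_in_window(days, 500))   # up to 500 days back
print(encounters_in_window(days, 0))     # alignment only
```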
TABLE 2 Important features
| Feature group | Description | Drop in AUROC (95% CI) |
|---|---|---|
| 1. Age between 59 and 68 | Maximum age between 59 and 68; age between 59 and 68 | 0.0400 (0.0251‐0.0675) |
| 2. Visit type – outpatient between 250 and 500 days before alignment | Patient has an outpatient visit; time between visits is in (0, 2] days | 0.0180 (0.0060‐0.0360) |
| 3. Age between 71 and 72 | Maximum age between 71 and 72; age between 71 and 72 | 0.0070 (0.0015‐0.0161) |
| 4. Religion value NON | Patient does not report a religious association | 0.0047 (0.0015‐0.0128) |
| 5. Laboratory test 32623‐1 with value in (5.30, 7.4]; 21000‐5 with value in (11.099, 12.9]; 4544‐3 with value in (16.799, 36.8]; 777‐3 with value in (25.999, 190.0]; 785‐6 with value in (15.699, 29.5]; 786‐4 with value in (29.799, 33.7]; 787‐2 with value in (52.499, 86.3]; 789‐8 with value in (2.149, 4.09] between 750 and 1000 days of alignment | Blood measurements of platelet mean volume, erythrocyte distribution, hematocrit, erythrocyte mean corpuscular hemoglobin | 0.0041 (0.0026‐0.0074) |
| 6. Laboratory test 736‐9 with value in (0.399, 16.6]; 5905‐5 with value in (0.099, 6.1]; 704‐7 with value in (0.000, 0.7]; 731‐0 with value in (0.099, 1.1]; 742‐7 with value in (0.000, 0.4]; 751‐8 with value in (0.099, 3.0] between 500 and 750 days of alignment | Blood measurements of lymphocytes, monocytes, basophils, neutrophils | 0.0037 (0.0005‐0.0093) |
| 7. Diagnosis code V04.8 along with procedures 9065x and G000x between 250 and 500 days before alignment | Vaccines for influenza, pneumococcal disease; revision mastoidectomy; injection of samarium lexidronam | 0.0028 (0.0006‐0.0073) |
| 8. Non‐invasive systolic blood pressure in (127, 136] between 500 and 750 days before alignment | Elevated blood pressure/hypertension | 0.0023 (0.0004‐0.0041) |
| 9. Procedure 8260x and lab test 2132‐9 with value in (89.999, 382.8] between 0 and 250 days before alignment | Measurements of blood cyanide, vitamin B12, transcobalamin | 0.0021 (0.0012‐0.0031) |
| 10. Laboratory test 50557‐8 with value negative; 27297‐1 with value negative; 50561‐0 with value negative; 50563‐6 with value < 1 mg/dl; 53327‐3 with value negative; 53328‐1 with value negative; 57747‐8 with value negative between 250 and 500 days of alignment | Urine measurements of ketones, leukocyte esterase, protein, urobilinogen, total bilirubin, glucose, erythrocytes | 0.0021 (0.0009‐0.0044) |
Summary of the top 10 most important feature groups, as determined by permutation importance. The letter “x” is used to denote any character. Laboratory tests, diagnoses, and procedures are represented as LOINC, ICD9, and CPT codes respectively.
Abbreviations: AUROC, area under the receiver operating characteristics curve; CI, confidence interval; CPT, current procedural terminology; ICD9, International Classification of Diseases, Ninth Revision; LOINC, Logical Observation Identifiers Names and Codes.
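Permutation importance for a feature group can be sketched as follows: shuffle the group's columns together across patients, re-score the model, and record the drop in AUROC. This is a minimal illustration only. The function names, the joint shuffling of grouped columns, the number of repeats, and the toy model are assumptions; the source does not describe the exact procedure or how the 95% confidence intervals were obtained.

```python
import random

def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def group_permutation_drop(predict, X, y, group_cols, n_repeats=20, seed=0):
    """Mean drop in AUROC after jointly shuffling the columns in group_cols across rows."""
    rng = random.Random(seed)
    baseline = auroc(y, [predict(row) for row in X])
    drops = []
    for _ in range(n_repeats):
        order = list(range(len(X)))
        rng.shuffle(order)
        X_perm = [list(row) for row in X]
        for i, j in enumerate(order):
            for c in group_cols:
                X_perm[i][c] = X[j][c]  # row i receives the group's values from row j
        drops.append(baseline - auroc(y, [predict(row) for row in X_perm]))
    return sum(drops) / n_repeats

# Toy data: columns 0 and 1 form an informative group, column 2 is unused noise.
X = [[0, 0, 5], [0, 0, 1], [0, 0, 3], [0, 0, 7],
     [1, 1, 2], [1, 1, 6], [1, 1, 4], [1, 1, 0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
predict = lambda row: row[0] + row[1]  # hypothetical stand-in for a trained model

print(group_permutation_drop(predict, X, y, group_cols=[0, 1]))
print(group_permutation_drop(predict, X, y, group_cols=[2]))
```

Shuffling a group the model does not use leaves the AUROC unchanged, so its drop is zero; larger drops indicate groups the model relies on more heavily.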