Zhenxing Xu1, Fei Wang1, Prakash Adekkanattu1, Budhaditya Bose1, Veer Vekaria1, Pascal Brandt2, Guoqian Jiang3, Richard C Kiefer3, Yuan Luo4, Jennifer A Pacheco4, Luke V Rasmussen4, Jie Xu1, George Alexopoulos1, Jyotishman Pathak1.
Abstract
OBJECTIVE: To identify depression subphenotypes from Electronic Health Records (EHRs) using machine learning methods, and to analyze their characteristics with respect to patient demographics, comorbidities, and medications.
Keywords: depression; electronic health records; machine learning; phenotyping
Year: 2020 PMID: 33083540 PMCID: PMC7556423 DOI: 10.1002/lrh2.10241
Source DB: PubMed Journal: Learn Health Syst ISSN: 2379-6146
FIGURE 1. Exclusion cascade to identify the depression cohort from the INSIGHT CRN dataset
Characteristics of case (depressed) and control (non‐depressed) groups
| Item | Depressed (n = 11 275) | Non‐depressed (n = 11 275) |
|---|---|---|
| Age (years), mean (SD) | 62.6 (19.5) | 63.7 (20.1) |
| 18 to 24 | 234 (2.1%) | 249 (2.2%) |
| 25 to 44 | 2134 (18.9%) | 2101 (18.6%) |
| 45 to 64 | 3729 (33.1%) | 3340 (29.6%) |
| ≥65 | 5178 (45.9%) | 5585 (49.5%) |
| Sex | | |
| Female | 7777 (69.0%) | 7698 (68.3%) |
| Race | | |
| White | 3590 (31.8%) | 2475 (22.0%) |
| Black or African American | 981 (8.7%) | 3260 (28.9%) |
| Asian | 456 (4.0%) | 253 (2.2%) |
| American Indian or Alaska Native | 26 (0.2%) | 39 (0.3%) |
| Native Hawaiian or Other Pacific Islander | 17 (0.2%) | 9 (0.1%) |
| Ethnicity | | |
| Not Hispanic or Latino | 6359 (56.4%) | 8220 (72.9%) |
| Hispanic or Latino | 1502 (13.3%) | 631 (5.6%) |
Performance of machine learning models for current classification of depression
| Model | Precision | Recall | AUC |
|---|---|---|---|
| | 0.8511 ± 0.0078 | 0.6802 ± 0.0068 | 0.857 ± 0.0053 |
| | 0.8855 ± 0.0088 | 0.5815 ± 0.0075 | 0.8376 ± 0.0052 |
| | 0.6055 ± 0.0067 | 0.9074 ± 0.0072 | 0.8066 ± 0.0081 |
| | 0.8583 ± 0.0084 | 0.6919 ± 0.0097 | |
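For readers unfamiliar with the metrics in the table above, the following is a minimal sketch of the standard definitions of precision and recall for a binary classifier. The function name and the toy labels are hypothetical and are not taken from the study; the study's own evaluation pipeline is not described in this record.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 = depressed."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels for illustration only (not study data).
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
precision, recall = precision_recall(y_true, y_pred)
```

A model that flags more patients as depressed tends to raise recall at the cost of precision, which is the pattern visible in the third row of the table.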
FIGURE 2. Heatmap obtained from Clustergram based on the selected variables. The x and y axes both represent patients' unique IDs. The similarity between individual patients was computed using the Jaccard index. The green rectangles mark the three depression subphenotypes. The smaller the distance between two patients, the darker the color and the greater their similarity. The clusters can be approximately outlined on the clustermap by observing the distribution of colors along the diagonal of the distance matrix
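The Jaccard index mentioned in the Figure 2 caption can be sketched as follows. This is an illustrative example only: it assumes each patient is represented as a set of binary features (for example, recorded comorbidities and medication classes), and the patient IDs and feature names are hypothetical.

```python
def jaccard_similarity(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty feature sets are identical
    return len(a & b) / len(union)


def jaccard_distance_matrix(patients):
    """Return sorted patient IDs and the pairwise distance matrix (1 - similarity)."""
    ids = sorted(patients)
    matrix = [
        [1.0 - jaccard_similarity(patients[p], patients[q]) for q in ids]
        for p in ids
    ]
    return ids, matrix


# Hypothetical patients, not study data.
patients = {
    "p1": {"hypertension", "diabetes", "hyperlipidemia"},
    "p2": {"hypertension", "diabetes"},
    "p3": {"anxiety", "tobacco_use"},
}
ids, dist = jaccard_distance_matrix(patients)
```

Patients who share most of their features receive a small distance, which corresponds to the darker cells along the diagonal blocks of the heatmap.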
FIGURE 3. The percentage of patients with each comorbidity, by phenotype. The x and y axes represent comorbidity and percentage, respectively. AH: Acquired Hypothyroidism; AD: Alzheimer's Disease and Related Disorders or Senile Dementia; AMI: Acute Myocardial Infarction; RAOA: Rheumatoid Arthritis/Osteoarthritis; AF: Atrial Fibrillation; BC: Breast Cancer; CKD: Chronic Kidney Disease; CC: Colorectal Cancer; COPD: Chronic Obstructive Pulmonary Disease and Bronchiectasis; EC: Endometrial Cancer; HF: Heart Failure; HIP: Hip/Pelvic Fracture; ADHD: Attention‐Deficit/Hyperactivity Disorder; AUD: Alcohol Use Disorders; ASD: Autism Spectrum Disorders; TBI: Traumatic Brain Injury and Nonpsychotic Mental Disorders due to Brain Damage; CP: Cerebral Palsy; CFMDD: Cystic Fibrosis and Other Metabolic Developmental Disorders; DUD: Drug Use Disorders; CPF: Chronic Pain and Fatigue, Fibromyalgia; SDHI: Sensory ‐ Deafness and Hearing Impairment; VH: Viral Hepatitis; AIDS: Acquired Immunodeficiency Syndrome; IDRC: Intellectual Disabilities and Related Conditions; LD: Learning Disabilities; LL: Leukemias and Lymphomas; LD: Liver Disease; MD: Muscular Dystrophy; MCH: Migraine and Chronic Headache; MI: Mobility Impairments; MSTM: Multiple Sclerosis and Transverse Myelitis; ODD: Other Developmental Delays; OUD: Opioid Use Disorder; PD: Personality Disorders; SPD: Schizophrenia and Other Psychotic Disorders; PTSD: Post‐Traumatic Stress Disorder; PVD: Peripheral Vascular Disease; SCD: Sickle Cell Disease; SCI: Spinal Cord Injury; SBCANS: Spina Bifida and Other Congenital Anomalies of the Nervous System; TU: Tobacco Use; PCU: Pressure and Chronic Ulcers; SBVI: Sensory ‐ Blindness and Visual Impairment
FIGURE 4. The percentage of patients with each medication, by phenotype. The x and y axes represent medication and percentage, respectively. SSRI: Selective Serotonin Reuptake Inhibitors; OA: Other Antidepressants; NSMRI: Non‐Selective Monoamine Reuptake Inhibitors; SPN: Solutions for Parenteral Nutrition; OAD: Opium Alkaloids and Derivatives; BRD: Benzodiazepine Related Drugs; TGC: Third‐Generation Cephalosporins; HG: Heparin Group; OASU: Other Antihistamines for Systemic Use; ES: Electrolyte Solutions; SE: Softeners Emollients; SBAA: Selective Beta‐2‐Adrenoreceptor Agonists; VDA: Vitamin D and Analogues; CL: Contact Laxatives; H2RA: H2‐Receptor Antagonists; BDBA: Benzodiazepine Derivatives (N05BA); OAP: Other Antiepileptics; NNRTI: Nucleoside and Nucleotide Reverse Transcriptase Inhibitors; PAD: Propionic Acid Derivatives; OAA: Oxytocin and Analogues; DUED: Drugs Used in Erectile Dysfunction; DD: Dihydropyridine Derivatives; BDAE: Benzodiazepine Derivatives (N03AE); NOA: Natural Opium Alkaloids; NNERTI: Nucleosides and Nucleotides Exclude Reverse Transcriptase Inhibitors; PES: Penicillins with Extended Spectrum; OPSA: Other Potassium‐Sparing Agents; AA: Aldosterone Antagonists; PPI: Proton Pump Inhibitors; AC: Aluminium Compounds; OO: Other Opioids; OQAC: Other Quaternary Ammonium Compounds; SI: Selective Immunosuppressants; AE: Aminoalkyl Ethers; OCS: Other Cough Suppressants; LRA: Leukotriene Receptor Antagonists; IAS: Intermediate‐Acting Sulfonamides; TD: Trimethoprim and Derivatives; TH: Thyroid Hormones; NSEP: Natural and Semisynthetic Estrogens Plain; PEFC: Progestogens and Estrogens, Fixed Combinations; OAAD: Other Antiseptics and Disinfectants; CPG: Corticosteroids, very Potent (group IV); OATU: Other Antibiotics for Topical Use; HMGRI: HMG CoA Reductase Inhibitors; ABBA: Alpha and Beta Blocking Agents; BBAS: Beta Blocking Agents, Selective; SP: Sulfonamides Plain; TV: Tetanus Vaccines; DV: Diphtheria Vaccines; VKA: Vitamin K Antagonists; IAIL: Insulins and Analogues for Injection, Long‐acting; HD: Hydrazinophthalazine Derivatives
Characteristics of the three depression subphenotypes
| Characteristic | Phenotype A | Phenotype B | Phenotype C | Unadjusted | Adjusted ANCOVA |
|---|---|---|---|---|---|
| No. of patients (%), total 8903 patients | 2791 (31.35) | 4687 (52.65) | 1452 (16.31) | | |
| Age, mean (SD) | 72.55 (14.93) | 68.44 (19.09) | 63.47 (18.81) | .526 | ‐ |
| Sex | | | | | |
| Female | 1716 (61.48) | 3341 (71.29) | 866 (59.66) | .456 | 0.643 |
| Male | 1075 (38.52) | 1346 (28.71) | 586 (40.34) | | |
| Comorbidities | | | | | |
| Hypertension | 1798 (64.41) | 1403 (29.93) | 203 (13.96) | | |
| Diabetes | 1176 (42.17) | 714 (15.23) | 64 (4.42) | | |
| Hyperlipidemia | 1596 (57.18) | 1379 (29.42) | 198 (13.61) | | |
| RAOA | 623 (22.32) | 788 (16.81) | 60 (4.14) | .213 | 0.368 |
| Anemia | 632 (22.64) | 1014 (21.63) | 90 (6.18) | .564 | 0.482 |
| Asthma | 459 (16.45) | 1229 (26.22) | 96 (6.6) | | |
| CPF | 667 (23.9) | 1842 (39.29) | 181 (12.49) | | |
| Anxiety | 448 (16.05) | 1564 (32.13) | 620 (42.7) | | |
| TU | 231 (8.28) | 478 (10.2) | 232 (15.96) | | |
| Obesity | 572 (20.49) | 769 (16.41) | 111 (7.65) | .642 | 0.775 |
| Medications | | | | | |
| Selective serotonin reuptake inhibitors | 1064 (38.11) | 926 (19.75) | 191 (13.12) | | |
| Beta blocking agents, selective | 1069 (38.3) | 1262 (26.92) | 176 (12.11) | .321 | 0.535 |
| Insulins and analogues for injection, long‐acting | 961 (34.42) | 613 (13.08) | 136 (9.4) | | |
| Natural opium alkaloids | 571 (20.46) | 1370 (29.22) | 232 (15.98) | | |
| Proton pump inhibitors | 935 (33.5) | 1171 (24.98) | 214 (14.77) | .225 | 0.327 |
| Selective beta‐2‐adrenoreceptor agonists | 290 (10.38) | 980 (20.9) | 98 (6.77) | | |
| Benzodiazepine derivatives | 287 (10.28) | 963 (20.54) | 489 (33.65) | | |
| Benzodiazepine related drugs | 310 (11.12) | 571 (12.19) | 406 (27.93) | .381 | 0.499 |
| Other antidepressants | 266 (9.54) | 300 (6.41) | 415 (28.59) | .568 | 0.768 |
| Expectorants | 302 (10.82) | 967 (20.64) | 87 (5.98) | | |
Abbreviations: CPF, Chronic Pain and Fatigue, Fibromyalgia; RAOA, Rheumatoid Arthritis/Osteoarthritis; TU, Tobacco Use.
ANCOVA was performed to adjust significance for age. Age, the only continuous variable, was tested using the Kruskal‐Wallis H‐test; the binary variables were tested using the chi‐square test.
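As an illustration of the chi‐square test named in the footnote, the sketch below computes the Pearson chi‐square statistic for a binary variable across the three phenotypes, using the hypertension counts from the table above. It is not the authors' code, the function names are hypothetical, and it is not expected to reproduce the published P values. The closed-form P value used here holds only for 2 degrees of freedom, which is the case for three groups and a binary variable.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


def p_value_dof2(stat):
    """Upper-tail P value of the chi-square distribution with exactly 2 df."""
    return math.exp(-stat / 2.0)


# Rows: phenotypes A, B, C. Columns: with hypertension, without hypertension.
# "Without" counts are derived as group size minus the reported count.
hypertension = [
    [1798, 2791 - 1798],
    [1403, 4687 - 1403],
    [203, 1452 - 203],
]
stat, dof = chi_square_independence(hypertension)
p_value = p_value_dof2(stat)
```

For tables with other shapes, a general chi‐square survival function from a statistics library would be needed instead of the 2-df shortcut.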