Duc Thanh Anh Luong1, Dinh Tran1, Wilson D Pace2, Miriam Dickinson2, Joseph Vassalotti3, Jennifer Carroll2, Matthew Withiam-Leitch1, Min Yang1, Nikhil Satchidanand1, Elizabeth Staton2, Linda S Kahn1, Varun Chandola1, Chester H Fox1.
Abstract
INTRODUCTION: Chronic kidney disease (CKD) is among the most prevalent chronic diseases in the world, and its rate of progression varies widely among patients. Identifying its phenotypic subtypes is therefore important for improving risk stratification and for providing more targeted therapy to patients whose disease follows different trajectories.

PROBLEM DEFINITION AND DATA: The rapid growth and adoption of electronic health record (EHR) technology has created a unique opportunity to use the abundant clinical data available in EHRs to find meaningful phenotypic subtypes of CKD. In this study, we focus on extracting disease severity profiles for CKD while accounting for confounding factors.

PROBABILISTIC SUBTYPING MODEL: We employ a probabilistic model to identify precise phenotypes from the EHR data of patients who have CKD. Under this model, a patient's estimated glomerular filtration rate (eGFR) trajectory is decomposed into four components: a disease subtype effect, a covariate effect, an individual long-term effect, and an individual short-term effect.

EXPERIMENTAL
Year: 2017 PMID: 29930957 PMCID: PMC5983069 DOI: 10.5334/egems.226
Source DB: PubMed Journal: EGEMS (Wash DC) ISSN: 2327-9214
List of Data Elements Available in CKD Dataset
| BASELINE CHARACTERISTICS | SUMMARY STATISTICS: “PREPROCESSED” DARTNET PATIENTS | SUMMARY STATISTICS: CKD COHORT |
|---|---|---|
| Number of patients | 63209 | 17314 |
| Age at baseline (years) | 66.32 (57.50, 74.55) | 70.20 (63.13, 76.56) |
| Male | 26034 (41.19) | 6856 (39.60) |
| Female | 37175 (58.81) | 10458 (60.40) |
| Smoking status | ||
| Serum creatinine | 1.1 (0.9, 1.3) | 1.2 (1.0, 1.5) |
| Years of last serum creatinine measure | 2.76 (0.77, 4.75) | 4.48 (3.07, 5.85) |
| Albumin-to-creatinine ratio | 26.5 (8.0, 55.6) | 22.3 (7.5, 41.6) |
| Hemoglobin A1c | 6.6 (6.1, 7.4) | 6.6 (6.1, 7.3) |
| Alanine aminotransferase | 22 (15, 33) | 21 (15, 32) |
| Aspartate aminotransferase | 21 (17, 26) | 20 (17, 25) |
| Fasting Blood Glucose | 101 (92, 119) | 102 (92, 120) |
| Non-Fasting Blood Glucose | 102 (91, 123) | 103 (91, 123) |
| Triglyceride level | 128 (91, 184) | 131 (93, 185) |
| High Density Lipoprotein | 47 (39, 58) | 47 (39, 57) |
| Low Density Lipoprotein | 95 (75, 120) | 91 (72, 115) |
| Phosphorus | 3.6 (3.2, 4.2) | 3.5 (3.2, 3.9) |
| Parathyroid hormone | 61.0 (35.6, 111.0) | 56.3 (34.4, 92.0) |
| Height (inch) | 66 (63, 69) | 66 (63, 69) |
| Weight (lb) | 184.0 (155.7, 217.0) | 184.0 (156.0, 215.0) |
| Systolic blood pressure | 130 (120, 140) | 130 (120, 140) |
| Diastolic blood pressure | 76 (70, 82) | 74 (68, 80) |
Note: “Preprocessed” DARTNet patients are extracted by the procedure shown in Figure 1.
Continuous variables are summarized by the median, with the 25th and 75th percentiles in parentheses. Categorical variables are summarized by the number of patients in each category, with the percentage in parentheses.
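The summary convention in the note can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the percentile interpolation rule (`method="inclusive"`) is an assumption, since the text does not say which rule was used.

```python
from statistics import quantiles


def summarize_continuous(values, digits=1):
    """Format a continuous variable as 'median (25th, 75th percentile)'."""
    # ASSUMPTION: inclusive interpolation; the source does not name a method.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return f"{median:.{digits}f} ({q1:.{digits}f}, {q3:.{digits}f})"


def summarize_categorical(count, total):
    """Format a category as 'count (percentage of total)'."""
    return f"{count} ({100 * count / total:.2f})"


# Counts taken from the "Male" row of the table above.
print(summarize_categorical(26034, 63209))  # 26034 (41.19)
print(summarize_categorical(6856, 17314))   # 6856 (39.60)
```

Applied to the counts in the table, the categorical formatter reproduces the reported percentages, which suggests that each percentage is taken within its own column (all preprocessed patients, or the CKD cohort).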
Figure 1. Flowchart of Preprocessing DARTNet Data
The CKD-EPI Equation for Estimating GFR
| RACE AND SEX | SERUM CREATININE (mg/dL) | EQUATION |
|---|---|---|
| Female | ≤0.7 | |
| Female | >0.7 | |
| Male | ≤0.9 | |
| Male | >0.9 | |
| Female | ≤0.7 | |
| Female | >0.7 | |
| Male | ≤0.9 | |
| Male | >0.9 | |
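For reference, the sex-specific creatinine thresholds in this table (0.7 and 0.9) match the 2009 CKD-EPI creatinine equation, which can be written in a single closed form. The sketch below assumes that this is the equation intended here; the function and argument names are illustrative, and serum creatinine is assumed to be in mg/dL.

```python
def ckd_epi_2009(scr_mg_dl, age_years, female, black=False):
    """Estimate GFR (mL/min/1.73 m^2) with the 2009 CKD-EPI creatinine equation.

    ASSUMPTION: this is the equation the table refers to; the table's
    equation cells are empty in this copy of the text.
    """
    kappa = 0.7 if female else 0.9        # sex-specific creatinine threshold
    alpha = -0.329 if female else -0.411  # exponent used at or below the threshold
    ratio = scr_mg_dl / kappa
    egfr = (
        141.0
        * min(ratio, 1.0) ** alpha
        * max(ratio, 1.0) ** -1.209
        * 0.993 ** age_years
    )
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


# Example call with the CKD cohort's median creatinine (1.2) and a 70-year-old male.
example_egfr = ckd_epi_2009(1.2, 70, female=False)
```

The `min`/`max` form covers both creatinine ranges of each table row pair: at or below the threshold only the `alpha` exponent is active, and above it only the −1.209 exponent is.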
Figure 2. Probabilistic Subtyping Model
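The abstract describes the model as decomposing a patient's eGFR trajectory into four components. The sketch below illustrates that decomposition under several assumptions that the text here does not state: the components combine additively, the subtype and long-term effects are linear in time, and the short-term effect is independent Gaussian noise. All names and the covariate and long-term numbers are hypothetical; the subtype values are the Subtype 3 prototype from the trajectory table (baseline 48.77, slope –1.07 per year).

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Decomposition:
    subtype: List[float]     # curve shared by all patients of one subtype
    covariate: List[float]   # shift explained by patient covariates
    long_term: List[float]   # persistent patient-specific deviation
    short_term: List[float]  # transient patient-specific deviation

    def total(self) -> List[float]:
        # ASSUMPTION: the four effects add up to the observed trajectory.
        return [
            sum(parts)
            for parts in zip(self.subtype, self.covariate, self.long_term, self.short_term)
        ]


def decompose_example(years, rng):
    """Build an illustrative decomposition at the given time points (years)."""
    subtype = [48.77 - 1.07 * t for t in years]        # Subtype 3 prototype
    covariate = [-2.0 for _ in years]                  # hypothetical offset
    long_term = [1.5 - 0.2 * t for t in years]         # hypothetical drift
    short_term = [rng.gauss(0.0, 1.0) for _ in years]  # hypothetical noise
    return Decomposition(subtype, covariate, long_term, short_term)


parts = decompose_example([0.0, 1.0, 2.0, 3.0], random.Random(0))
trajectory = parts.total()
```

Keeping the components as separate lists makes it easy to inspect how much of a patient's decline is attributed to the subtype rather than to individual effects.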
Figure 3. Subtype Trajectories
Description of Subtypes in Terms of Their Trajectories
| | | SUBTYPE 1 | SUBTYPE 2 | SUBTYPE 3 | SUBTYPE 4 | SUBTYPE 5 |
|---|---|---|---|---|---|---|
| Patient records | Average rate of change of eGFR per year | 4.08 | 0.54 | –0.93 | –1.54 | –1.80 |
| Patient records | Average baseline eGFR value | 54.71 | 53.38 | 48.93 | 40.90 | 26.45 |
| Prototype’s trajectory | Rate of change of eGFR per year | 2.03 | 0.04 | –1.07 | –1.45 | –1.13 |
| Prototype’s trajectory | Baseline eGFR value | 53.97 | 53.13 | 48.77 | 40.69 | 25.69 |
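A rate of change per year and a baseline value can be read off a straight line fitted to a trajectory. The sketch below uses an ordinary least-squares fit, with the slope as the annual rate and the intercept at time zero as the baseline. This is one common choice and an assumption: the text here does not state how the table's values were computed, and the function name is hypothetical.

```python
def fit_linear_trajectory(years, egfr):
    """Least-squares line through (years, egfr); returns (slope, intercept)."""
    n = len(years)
    if n < 2 or len(egfr) != n:
        raise ValueError("need at least two paired measurements")
    mean_t = sum(years) / n
    mean_y = sum(egfr) / n
    sxx = sum((t - mean_t) ** 2 for t in years)
    if sxx == 0:
        raise ValueError("all measurements share the same time point")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(years, egfr))
    slope = sxy / sxx                     # eGFR change per year
    intercept = mean_y - slope * mean_t   # fitted eGFR at time zero
    return slope, intercept


# Noise-free points on the Subtype 3 prototype line from the table above.
years = [0.0, 1.0, 2.5, 4.0]
egfr = [48.77 - 1.07 * t for t in years]
rate_per_year, baseline = fit_linear_trajectory(years, egfr)
```

On noise-free points the fit recovers the prototype's slope and baseline. On real records the irregular timing of measurements matters, which is why the fit uses the measurement times rather than the visit order.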
Figure 4. Distribution of Gender for Each Subtype
Figure 5. Distribution of Baseline Age of Patients for Each Subtype
Rate of Change and Baseline eGFR of Each Subtype, Broken Down by Gender
| SUBTYPE | GENDER | AVERAGE RATE OF CHANGE OF EGFR PER YEAR | AVERAGE BASELINE EGFR VALUE |
|---|---|---|---|
| Subtype 1 | Female | 4.28 | 54.52 |
| Subtype 1 | Male | 3.83 | 54.96 |
| Subtype 2 | Female | 0.60 | 53.15 |
| Subtype 2 | Male | 0.44 | 53.72 |
| Subtype 3 | Female | –0.87 | 48.87 |
| Subtype 3 | Male | –1.03 | 49.02 |
| Subtype 4 | Female | –1.42 | 40.72 |
| Subtype 4 | Male | –1.76 | 41.22 |
| Subtype 5 | Female | –1.78 | 27.00 |
| Subtype 5 | Male | –1.83 | 25.63 |
Rate of Change and Baseline eGFR of Each Subtype, Broken Down by Age
| SUBTYPE | AGE GROUP | AVERAGE RATE OF CHANGE OF EGFR PER YEAR | AVERAGE BASELINE EGFR VALUE |
|---|---|---|---|
| Subtype 1 | < 45 | 9.10 | 52.71 |
| Subtype 1 | 45-65 | 4.13 | 54.97 |
| Subtype 1 | > 65 | 3.91 | 54.56 |
| Subtype 2 | < 45 | 0.46 | 53.50 |
| Subtype 2 | 45-65 | 0.62 | 53.65 |
| Subtype 2 | > 65 | 0.50 | 53.24 |
| Subtype 3 | < 45 | –1.68 | 49.14 |
| Subtype 3 | 45-65 | –0.84 | 49.11 |
| Subtype 3 | > 65 | –0.96 | 48.86 |
| Subtype 4 | < 45 | –2.32 | 42.52 |
| Subtype 4 | 45-65 | –2.24 | 42.58 |
| Subtype 4 | > 65 | –1.34 | 40.43 |
| Subtype 5 | < 45 | –1.20 | 13.46 |
| Subtype 5 | 45-65 | –2.22 | 27.84 |
| Subtype 5 | > 65 | –1.73 | 27.61 |
Figure 6. Distribution of Patients for Each Subtype
Summarization of Relevant Clinical Measures with Respect to Each Subtype
| LAB MEASURES | SUBTYPE 1 | SUBTYPE 2 | SUBTYPE 3 | SUBTYPE 4 | SUBTYPE 5 |
|---|---|---|---|---|---|
| Albumin-to-creatinine ratio | 16.5 (6.8, 30.0) | 14.7 (6.1, 30.0) | 23.1 (7.7, 40.2) | 30.0 (11.0, 93.8) | 46.5 (17.0, 318.0) |
| Hemoglobin A1c | 6.5 (6.0, 7.2) | 6.5 (6.0, 7.2) | 6.6 (6.1, 7.4) | 6.7 (6.1, 7.5) | 6.7 (6.1, 7.7) |
| Alanine aminotransferase | 23 (16, 34) | 22 (15, 33) | 21 (15, 32) | 19 (13, 29) | 18 (12, 27) |
| Aspartate aminotransferase | 21 (17, 26) | 21 (17, 26) | 20 (17, 25) | 20 (16, 24) | 19 (15, 24) |
| Fasting Blood Glucose | 102 (92, 116) | 101 (92, 117) | 103 (92, 120) | 104 (91, 127) | 105 (92, 132) |
| Non-Fasting Blood Glucose | 102 (91, 120) | 101 (91, 118) | 103 (92, 126) | 105 (91, 132) | 106 (91, 138) |
| Triglyceride level | 128 (90, 180) | 126 (90, 177) | 133 (93, 190) | 143 (103, 200) | 144 (101, 201) |
| High Density Lipoprotein | 47 (39, 58) | 48 (40, 58) | 46 (38.5, 57) | 45 (37, 55) | 43 (36, 53) |
| Low Density Lipoprotein | 92 (72, 116) | 93 (73, 117) | 90 (71, 114) | 89 (70, 113) | 88 (68, 113) |
| Phosphorus | 3.4 (3.1, 3.8) | 3.4 (3.0, 3.7) | 3.4 (3.1, 3.8) | 3.6 (3.2, 4.0) | 3.8 (3.3, 4.5) |
| Parathyroid hormone | 42.4 (23.0, 59.0) | 48.0 (31.7, 75.7) | 53.9 (32.0, 84.4) | 60.0 (40.5, 102.0) | 114.0 (61.0, 203.0) |
| Systolic blood pressure | 128 (120, 140) | 130 (120, 140) | 130 (120, 140) | 130 (120, 142) | 130 (120, 142) |
| Diastolic blood pressure | 76 (70, 82) | 76 (70, 80) | 74 (68, 80) | 72 (66, 80) | 72 (66, 80) |
Note: Clinical measures are summarized by the median, with the 25th and 75th percentiles in parentheses.
P-value of Hypothesis Testing for Each Lab Measure and Each Subtype
| LAB MEASURES | SUBTYPE 1 (%) | SUBTYPE 2 (%) | SUBTYPE 3 (%) | SUBTYPE 4 (%) | SUBTYPE 5 (%) |
|---|---|---|---|---|---|
| Albumin-to-creatinine ratio | 0.000 | 0.000 | 0.001 | 0.000 | |
| Hemoglobin A1c | 0.000 | 0.000 | 0.000 | 0.000 | |
| Alanine aminotransferase | 0.000 | 0.000 | 0.000 | 0.000 | |
| Aspartate aminotransferase | 1.419 | 0.002 | 0.076 | ||
| Fasting Blood Glucose | 0.232 | 0.647 | 0.594 | 0.135 | |
| Non-Fasting Blood Glucose | 0.000 | 0.000 | 1.705 | 0.000 | 0.000 |
| Triglyceride level | 0.193 | 0.000 | 1.785 | 0.000 | 0.000 |
| High Density Lipoprotein | 0.096 | 0.000 | 0.000 | 0.000 | |
| Low Density Lipoprotein | 0.537 | 0.000 | 0.037 | 0.311 | |
| Phosphorus | 3.624 | 0.000 | 0.590 | 1.747 | 0.000 |
| Parathyroid hormone | 0.000 | 0.000 | 0.005 | 0.000 | |
| Systolic blood pressure | 1.015 | ||||
| Diastolic blood pressure | 3.205 | 0.000 | 0.000 | ||