Chao-Yu Guo, Min-Yang Wu, Hao-Min Cheng.
Abstract
Background: Early detection of heart failure is the basis for better medical treatment and prognosis. Over the last decades, both the prevalence and the incidence of heart failure have increased worldwide, making it a significant global public health issue. However, early diagnosis is difficult because the symptoms of heart failure are usually non-specific. This study therefore aims to develop a machine learning-based risk prediction model for incident heart failure. Although African Americans have a higher risk of incident heart failure than other populations, few studies have developed a heart failure risk prediction model for African Americans.
Keywords: LASSO logistic regression; XGBoost; heart failure; machine learning; prediction model; random forest; support vector machine
Year: 2021 PMID: 34066464 PMCID: PMC8124765 DOI: 10.3390/ijerph18094943
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. Scenarios considered.
Descriptive statistics of the study population.
| Baseline | Total Population | Non-HF | HF |
|---|---|---|---|
| Age | 54.96 (12.59) | 54.24 (12.37) | 63.91 (11.98) |
| BMI | 31.82 (7.2) | 31.72 (7.18) | 33.02 (7.31) |
| Waist | 100.83 (16.03) | 100.4 (15.93) | 106.9 (16.14) |
| High School Graduate | 2761 (83.26%) | 2607 (84.92%) | 154 (62.60%) |
| Gender | |||
| Male | 1228 (36.91%) | 1132 (36.74%) | 96 (39.02%) |
| Female | 2099 (63.09%) | 1949 (63.26%) | 150 (60.98%) |
| Current Smoker | 406 (12.31%) | 374 (12.25%) | 32 (13.11%) |
| Hypertension (HTN) | 1845 (55.47%) | 1644 (53.38%) | 201 (81.71%) |
| Diabetes Mellitus (DM) | 710 (21.5%) | 593 (19.39%) | 117 (47.95%) |
Figure 2. Study flow chart.
Figure 3. The training process of XGBoost.
Figure 4. AUC for complete cases.
Figure 5. AUC with simple imputation.
Figure 6. AUC with KNN imputation.
Figure 7. AUC with RF imputation.
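The figures above compare models by AUC. As a reminder of what that number measures, here is a minimal sketch that computes AUC as the probability that a randomly chosen HF case receives a higher predicted score than a randomly chosen non-case, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the study's code or data.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 outcomes (1 = event, e.g. incident HF).
    scores: iterable of predicted risks, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two non-cases, two cases.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the sample size, which is fine for illustration; a rank-based formulation gives the same value more efficiently on large cohorts.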
The selected parameters of the final HF prediction model for XGBoost using complete cases (feature importance is given in parentheses).
| <1% | <3% | <5% | <10% | <20% | <30% | <40% |
|---|---|---|---|---|---|---|
| age | age | DMmeds | DMmeds | dmMeds | CVDHx | frs_chdtenyrrisk |
| DMmeds | RepolarAntLat | age | Diabetes | MIHx | ascvd_tenyrrisk | ALDOSTERONE (0.0367) |
| Diabetes | DMmeds | BP3cat | CVDHx | EF | rrs_tenyrrisk (0.0220) | eGFRmdrd (0.0352) |
| eGFRckdepi | Diabetes | HTN | bpjnc7_3 | HbA1cIFCC (0.0150) | numbnessEver | occupation (0.0331) |
| MIecg | CVDHx | sbp | CHDHx | HbA1c | nutrition3cat (0.0194) | abi |
| RepolarAntLat | eGFRmdrd | eGFRckdepi | eGFRckdepi | strokeHx | FEV1PP | sbp |
| antiArythMeds | eGFRckdepi | CVDHx | FPG | Diabetes | totchol | calBlkMeds (0.0292) |
| RepolarAnt | statinMeds | QTcFrid | age | visionLossEver | asthma | FVC |
| statinMeds | edu3cat | BPmeds | sbp | statinMeds (0.0129) | BPmeds | eGFRckdepi (0.0253) |
| CVDHx | CardiacProcHx | waist | waist | frs_chdtenyrrisk | HTN | SCrCC |
Note: The above abbreviations are available in Table A1.
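The column headers in the table above (<1% through <40%) read as missing-rate thresholds for the candidate variables. A minimal sketch of that kind of scenario construction follows, assuming a variable is kept when its fraction of missing values is strictly below the threshold and that the complete-case analysis then drops any participant with a missing value among the kept variables. The helper names and the toy rows are hypothetical; the variable names are taken from the codebook only for readability.

```python
def missing_fraction(rows, column):
    """Fraction of rows in which `column` is missing (None)."""
    return sum(1 for r in rows if r.get(column) is None) / len(rows)


def select_columns(rows, columns, threshold):
    """Keep columns whose missing fraction is strictly below `threshold`."""
    return [c for c in columns if missing_fraction(rows, c) < threshold]


def complete_cases(rows, columns):
    """Keep rows with no missing value in any of `columns`."""
    return [r for r in rows if all(r.get(c) is not None for c in columns)]


# Toy data (hypothetical values, not from the study).
rows = [
    {"age": 54, "sbp": 120, "EF": None},
    {"age": 63, "sbp": None, "EF": None},
    {"age": 47, "sbp": 135, "EF": 60},
    {"age": 70, "sbp": 142, "EF": None},
]
columns = ["age", "sbp", "EF"]

kept = select_columns(rows, columns, threshold=0.40)  # EF is 75% missing
analysis_rows = complete_cases(rows, kept)
```

Raising the threshold admits more variables but, under complete-case analysis, usually leaves fewer participants, which is the trade-off the imputation scenarios are meant to avoid.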
The selected parameters of the final HF prediction model for XGBoost using simple imputations.
| <1% | <3% | <5% | <10% | <20% | <30% | <40% |
|---|---|---|---|---|---|---|
| Diabetes | DMmeds | DMmeds | DMmeds | DMmeds | DMmeds | DMmeds |
| DMmeds | age | Diabetes | age | age | age | DialysisEver (0.0252) |
| age | CVDHx | DialysisEver (0.0243) | eGFRckdepi (0.0185) | Diabetes | Diabetes | CVDHx |
| HTN | eGFRckdepi (0.0263) | MIAntLat | Diabetes | DialysisEver (0.0190) | BPmeds | age |
| CVDHx | HTN | age | FEV1 | ConductionDefect (0.0183) | CVDHx | Afib |
| HSgrad | HSgrad | HSgrad | CVDHx | sex | age | MIHx |
| BPmeds | Diabetes | Afib | FVC | occupation (0.0159) | ConductionDefect (0.0141) | eGFRckdepi (0.0129) |
| eGFRckdepi | eGFRmdrd (0.0225) | edu3cat | CHDHx | MIant | FEV1 | SystLVdia |
| RepolarAntLat | MIHx | CVDHx | eGFRmdrd (0.0151) | CVDHx | eGFRckdepi (0.0139) | EF |
| ecgHR | sbp | EF | HbA1cIFCC (0.0148) | idealHealthSMK (0.0125) | CHDHx | ConductionDefect (0.0115) |
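The record does not define "simple imputation". One common reading, assumed here, is to fill continuous variables with the observed mean and categorical variables with the most frequent observed category. The sketch below shows that reading only; `simple_impute` and the toy rows are illustrative and not the authors' implementation.

```python
from collections import Counter
from statistics import mean


def simple_impute(rows, continuous, categorical):
    """Return a copy of `rows` with None replaced by mean (continuous)
    or mode (categorical). Assumes each column has at least one observed value."""
    filled = [dict(r) for r in rows]  # leave the input untouched

    for col in continuous:
        observed = [r[col] for r in rows if r[col] is not None]
        fill = mean(observed)
        for r in filled:
            if r[col] is None:
                r[col] = fill

    for col in categorical:
        observed = [r[col] for r in rows if r[col] is not None]
        fill = Counter(observed).most_common(1)[0][0]
        for r in filled:
            if r[col] is None:
                r[col] = fill

    return filled


# Toy data (hypothetical values).
rows = [
    {"sbp": 120, "HTN": "Y"},
    {"sbp": None, "HTN": "Y"},
    {"sbp": 130, "HTN": None},
]
filled = simple_impute(rows, continuous=["sbp"], categorical=["HTN"])
```

When the model is evaluated on held-out data, the fill values would normally be computed on the training portion only and then applied to the test portion, so that no information leaks across the split.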
The selected parameters of the final HF prediction model for XGBoost using KNN imputations.
| <1% | <3% | <5% | <10% | <20% | <30% | <40% |
|---|---|---|---|---|---|---|
| Diabetes | age | DMmeds | DMmeds (0.0261) | DMmeds (0.0443) | DMmeds (0.0145) | DMmeds (0.0318) |
| DMmeds | DMmeds | age | age | age | age | DialysisEver (0.0294) |
| age | MIHx | MIant | CVDHx | CVDHx | eGFRmdrd (0.0138) | age |
| HTN | Diabetes | CVDHx | eGFRckdepi (0.0173) | EF | Diabetes | Diabetes |
| CVDHx | CVDHx | Diabetes | Diabetes | eGFRckdepi (0.0182) | SCrCC | Afib |
| BPmeds | eGFRckdepi (0.0205) | HSgrad | eGFRmdrd (0.0153) | FEV1 | eGFRckdepi (0.0118) | CVDHx |
| HSgrad | HSgrad | ConductionDefect (0.0201) | FEV1 | ConductionDefect (0.0174) | statinMeds (0.0115) | eGFRckdepi (0.0126) |
| eGFRckdepi (0.0222) | HTN | antiArythMedsSelf | CHDHx | MIHx | CVDHx | MIHx |
| RepolarAntLat (0.0209) | antiArythMeds (0.0171) | CHDHx | edu3cat | MajorScarAnt (0.0172) | everSmoker (0.0111) | calBlkMeds (0.0116) |
| ecgHR | eGFRmdrd (0.0164) | AntiArythMeds (0.1374) | DialysisEver (0.0139) | eGFRmdrd (0.0170) | rrs_tenyrrisk (0.0108) | FEV1 |
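For readers unfamiliar with KNN imputation, the following is a minimal sketch of the general idea for numeric data: each missing value is replaced by the average of that variable among the k most similar participants. The distance (root mean squared difference over the variables both rows have observed), the choice k=2, and the toy matrix are assumptions made for illustration; the record does not state which settings the authors used.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of the k nearest donors' values.

    rows: list of equal-length lists containing floats or None.
    Returns a new list of lists; the input is not modified.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # no common variables, cannot compare
        # Normalise by the number of shared variables so rows with
        # different amounts of overlap remain comparable.
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows with variable j observed.
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d < math.inf]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Toy data (hypothetical): second row is missing its second variable.
rows = [
    [1.0, 2.0],
    [1.1, None],
    [5.0, 10.0],
    [1.2, 2.4],
]
filled = knn_impute(rows, k=2)
```

In practice the variables would usually be standardised first, since an unscaled distance is dominated by whichever variable has the largest numeric range.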
The selected parameters of the final HF prediction model for XGBoost using MissForest imputations.
| <1% | <3% | <5% | <10% | <20% | <30% | <40% |
|---|---|---|---|---|---|---|
| Diabetes | antiArythMeds | dmMeds | DMmeds (0.0263) | DMmeds | DMmeds | DMmeds |
| DMmeds | DMmeds | age | age | DialysisEver (0.0323) | Diabetes | ascvd_tenyrrisk (0.0255) |
| age | age | Diabetes | Diabetes | MIAntLat | rrs_tenyrrisk (0.0148) | age |
| HTN | eGFRckdepi | eGFRckdepi (0.0269) | CVDHx | Diabetes | age | eGFRckdepi (0.0210) |
| CVDHx | HTN | CVDHx | CHDHx | age | ascvd_tenyrrisk (0.0130) | rrs_tenyrrisk (0.0191) |
| HSgrad | SCrIDMS | eGFRmdrd (0.0212) | eGFRmdrd (0.0192) | Afib | MIant | frs_cvdtenyrrisk (0.0179) |
| eGFRckdepi (0.0243) | MIHx | HSgrad | eGFRckdepi (0.0162) | calBlkMeds (0.0154) | eGFRckdepi (0.0125) | MIHx |
| CHDHx | CVDHx | SCrIDMS | HSgrad | CVDHx | CVDHx | LEPTIN |
| RepolarAntLat (0.0238) | eGFRmdrd (0.0197) | BPMeds | FEV1 | eGFRckdepi (0.0149) | FEV1 | calBlkMeds (0.0135) |
| QTcBaz | Diabetes | HbA1c | SCrIDMS | EF | CHDHx | CardiacProcHx (0.0127) |
Codebook for the variables included in this research.
| Variable Name | Variable Types | Variable Description | |
|---|---|---|---|
| 1. Demographics | |||
| age | Continuous | Age in Years | |
| sex | Categorical | Participant Sex | |
| alc | Categorical | Alcohol drinking in the past 12 months (Y/N) | |
| alcw | Continuous | Average number of drinks per week | |
| currentSmoker | Categorical | Self-Reported Cigarette Smoking Status | |
| everSmoker | Categorical | Self-Reported History of Cigarette Smoking | |
| 2. Anthropometrics | |||
| weight | Continuous | Weight (kg) | |
| height | Continuous | Height (cm) | |
| BMI | Continuous | Body Mass Index (kg/m2) | |
| waist | Continuous | Waist Circumference (cm) | |
| neck | Continuous | Neck Circumference (cm) | |
| bsa | Continuous | Calculated Body Surface Area (m2) | |
| obesity3cat | Categorical | Ideal Health: BMI < 25 (Normal) | |
| 3. Medications | |||
| medAcct | Categorical | Medication Accountability | |
| BPmedsSelf | Categorical | Self-Reported Blood Pressure Medication Status (Y/N) | |
| BPmeds | Categorical | Blood Pressure Medication Status (Y/N) | |
| DMmedsIns | Categorical | Diabetic Insulin Medication Status (Y/N) | |
| DMmedType | Categorical | Diabetes Medication Type | |
| dmMedsSelf | Categorical | Defined as Yes (Treated), if the participant reported being on diabetic medication. | |
| DMmeds | Categorical | Diabetic Medication Status (Y/N) | |
| statinMedsSelf | Categorical | Defined as Yes (Treated), if the participant reported being on statin medication. | |
| statinMeds | Categorical | Statin Medication Status (Y/N) | |
| hrtMedsSelfEver | Categorical | Self Reported HRT Medication Status (Y/N) | |
| hrtMedsSelf | Categorical | Self Reported Current HRT Medication Status (Y/N) | |
| hrtMeds | Categorical | HRT Medication Status (Y/N) | |
| betaBlkMeds | Categorical | Beta Blocker Medication Status (Y/N) | |
| calBlkMeds | Categorical | Calcium Channel Blocker Medication Status (Y/N) | |
| diureticMeds | Categorical | Diuretic Medication Status (Y/N) | |
| antiArythMedsSelf | Categorical | Defined as Yes (Treated), if the participant reported being on antiarrhythmic medication. | |
| antiArythMeds | Categorical | Antiarrhythmic Medication Status (Y/N) | |
| 4. Hypertension | |||
| sbp | Continuous | Systolic Blood Pressure (mmHg) | |
| dbp | Continuous | Diastolic Blood Pressure (mmHg) | |
| BPjnc7 | Categorical | JNC 7 BP Classification | |
| HTN | Categorical | Hypertension Status | |
| ABI | Continuous | Ankle Brachial Index | |
| 5. Diabetes | |||
| FPG | Continuous | Fasting Plasma Glucose Level (mg/dL) | |
| FPG3cat | Categorical | Fasting Plasma Glucose Categorization | |
| HbA1c | Continuous | NGSP Hemoglobin HbA1c (%) | |
| HbA1c3cat | Categorical | NGSP Hemoglobin HbA1c (%) Categorization | |
| HbA1cIFCC | Continuous | IFCC Hemoglobin HbA1c in SI units (mmol/mol) | |
| HbA1cIFCC3cat | Categorical | IFCC Hemoglobin HbA1c in SI units (mmol/mol) Categorization | |
| fastingInsulin | Continuous | Fasting Insulin (Plasma IU/mL) | |
| HOMA-B | Continuous | HOMA-B | |
| HOMA-IR | Continuous | HOMA-IR | |
| Diabetes | Categorical | Diabetes Status (ADA 2010) | |
| diab3cat | Categorical | Diabetes Categorization | |
| 6. Lipids | |||
| ldl | Continuous | Fasting LDL Cholesterol Level (mg/dL) | |
| ldl5cat | Categorical | Fasting LDL Categorization | |
| hdl | Continuous | Fasting HDL Cholesterol Level (mg/dL) | |
| hdl3cat | Categorical | Fasting HDL Categorization | |
| trigs | Continuous | Fasting Triglyceride Level (mg/dL) | |
| trigs4cat | Categorical | Fasting Triglyceride Categorization | |
| totChol | Continuous | Fasting Total Cholesterol (mg/dL) | |
| 7. Biomarkers | |||
| hsCRP | Continuous | High Sensitivity C-Reactive Protein | |
| endothelin | Continuous | Endothelin-1 | |
| sCort | Continuous | Concentration of Cortisol Levels | |
| reninRIA | Continuous | Renin Activity RIA (Plasma ng/mL/hr) | |
| reninIRMA | Continuous | Renin Mass IRMA (Plasma pg/mL) | |
| aldosterone | Continuous | Concentration of Aldosterone (Serum ng/dL) | |
| leptin | Continuous | Concentration of Leptin (Serum ng/mL) | |
| adiponectin | Continuous | | |
| 8. Renal | |||
| SCrCC | Continuous | CC Calibrated Serum Creatinine (mg/dL) | |
| SCrIDMS | Continuous | IDMS Traceable Serum Creatinine (mg/dL) | |
| eGFRmdrd | Continuous | eGFR MDRD | |
| eGFRckdepi | Continuous | eGFR CKD-Epi | |
| CreatinineU24hr | Continuous | 24-hour urine creatinine (g/24hr) | |
| CreatinineUSpot | Continuous | Random spot urine creatinine (mg/dL) | |
| AlbuminUSpot | Continuous | Random spot urine albumin (mg/dL) | |
| AlbuminU24hr | Continuous | 24-hour urine albumin (mg/24hr) | |
| DialysisEver | Categorical | Self-reported dialysis | |
| DialysisDuration | Continuous | Self-reported duration on dialysis (years) | |
| CKDHx | Categorical | Chronic Kidney Disease History | |
| 9. Respiratory | |||
| asthma | Categorical | Physician-Diagnosed Asthma | |
| maneuvers | Continuous | Successful Spirometry Maneuvers | |
| FVC | Continuous | Forced Vital Capacity (L) | |
| FEV1 | Continuous | Forced Expiratory Volume in 1 s (L) | |
| FEV6 | Continuous | Forced Expiratory Volume in 6 s (L) | |
| FEV1PP | Continuous | FEV1 % Predicted | |
| FVCPP | Continuous | FVC % Predicted | |
| 10. Echocardiogram | |||
| LVMecho | Continuous | Left Ventricular Mass (g) from Echo | |
| LVMindex | Continuous | Left Ventricular Mass Indexed by Height(m)^2.7 | |
| LVH | Categorical | Left Ventricular Hypertrophy | |
| EF | Continuous | Ejection Fraction | |
| EF3cat | Categorical | Ejection Fraction Categorization | |
| DiastLVdia | Continuous | Diastolic LV Diameter (mm) | |
| SystLVdia | Continuous | Systolic LV Diameter (mm) | |
| FS | Categorical | Fractional Shortening | |
| RWT | Continuous | Relative Wall Thickness | |
| 11. Electrocardiogram | |||
| ConductionDefect | Categorical | Conduction Defect | |
| MajorScarAnt | Categorical | Anterior QnQs Major Scar | |
| MinorScarAnt | Categorical | Anterior QnQs Minor Scar | |
| RepolarAnt | Categorical | Anterior Repolarization Abnormality | |
| MIAnt | Categorical | Anterior ECG defined MI | |
| MajorScarPost | Categorical | Posterior QnQs Major Scar | |
| MinorScarPost | Categorical | Posterior QnQs Minor Scar | |
| RepolarPost | Categorical | Posterior Repolarization Abnormality | |
| MIPost | Categorical | Posterior ECG defined MI | |
| MajorScarAntLat | Categorical | Anterolateral QnQs Major Scar | |
| MinorScarAntLat | Categorical | Anterolateral QnQs Minor Scar | |
| RepolarAntLat | Categorical | Anterolateral Repolarization Abnormality | |
| MIAntLat | Categorical | Anterolateral ECG defined MI | |
| MIecg | Categorical | ECG determined MI | |
| ecgHR | Continuous | Heart Rate (bpm) | |
| Afib | Categorical | Atrial Fibrillation | |
| Aflutter | Categorical | Atrial Flutter | |
| QRS | Continuous | QRS Interval (msec) | |
| QT | Continuous | QT Interval (msec) | |
| QTcFram | Continuous | Framingham Corrected QT Interval (msec) | |
| QTcBaz | Continuous | Bazett Corrected QT Interval (msec) | |
| QTcHod | Continuous | Hodge Corrected QT Interval (msec) | |
| QTcFrid | Continuous | Fridericia Corrected QT Interval (msec) | |
| CV | Continuous | Cornell Voltage (microvolts) | |
| LVHcv | Categorical | Cornell Voltage Criteria | |
| 12. Stroke History | |||
| speechLossEver | Categorical | History of Speech Loss | |
| visionLossEver | Categorical | History of Sudden Loss of Vision | |
| doubleVisionEver | Categorical | History of Double Vision | |
| numbnessEver | Categorical | History of Numbness | |
| paralysisEver | Categorical | History of Paralysis | |
| dizzynessEver | Categorical | History of Dizziness | |
| strokeHx | Categorical | History of Stroke | |
| 13. CVD History | |||
| MIHx | Categorical | Self-Reported History of MI | |
| CardiacProcHx | Categorical | Self-Reported history of Cardiac Procedures | |
| CHDHx | Categorical | Coronary Heart Disease Status/History | |
| CarotidAngioHx | Categorical | Self-Reported history of Carotid Angioplasty | |
| CVDHx | Categorical | Cardiovascular Disease History | |
| 14. Healthcare Access | |||
| Insured | Categorical | Visit 1 Health Insurance Status | |
| 15. Psychosocial | |||
| Income | Categorical | Income Status | |
| occupation | Categorical | Occupational Status | |
| edu3cat | Categorical | Education Attainment Categorization | |
| HSgrad | Categorical | High School Graduate | |
| dailyDiscr | Continuous | Everyday Discrimination Experiences | |
| lifetimeDiscrm | Continuous | Major Life Events Discrimination | |
| discrmBurden | Continuous | Discrimination Burden | |
| depression | Continuous | Total Depressive Symptoms Score | |
| weeklyStress | Continuous | Total Weekly Stress Score | |
| perceivedStress | Continuous | Total Global Stress Score | |
| 16. Life’s Simple 7 | |||
| SMK3cat | Categorical | AHA Smoking Categorization | |
| idealHealthSMK | Categorical | Indicator for Ideal Health via Smoking Status | |
| BMI3cat | Categorical | AHA BMI Categorization | |
| idealHealthBMI | Categorical | Indicator for Ideal Health via BMI | |
| PA3cat | Categorical | AHA Physical Activity Categorization | |
| idealHealthPA | Categorical | Indicator for Ideal Health via Physical Activity | |
| nutrition3cat | Categorical | AHA Nutrition Categorization | |
| idealHealthNutrition | Categorical | Indicator for Ideal Health via Nutrition | |
| totChol3cat | Categorical | AHA Total Cholesterol Categorization | |
| idealHealthChol | Categorical | Indicator for Ideal Health via Total Cholesterol | |
| BP3cat | Categorical | AHA BP Categorization | |
| idealHealthBP | Categorical | Indicator for Ideal Health via BP | |
| glucose3cat | Categorical | AHA Glucose Categorization | |
| idealHealthDM | Categorical | Indicator for Ideal Health via Glucose | |
| 17. Nutrition | |||
| vitaminD2 | Continuous | 25(OH) Vitamin D2 (ng/mL) | |
| vitaminD3 | Continuous | 25(OH) Vitamin D3 (ng/mL) | |
| vitaminD3epimer | Continuous | ep-25(OH) Vitamin D3 (ng/mL) | |
| darkgrnVeg | Continuous | Dark-green Vegetables | |
| eggs | Continuous | Eggs | |
| fish | Continuous | Fish | |
| 18. Physical Activity | |||
| sportIndex | Continuous | Sport Index | |
| hyIndex | Continuous | Home/Yard Index | |
| activeIndex | Continuous | Active Living Index | |
| 19. Risk Scores | |||
| frs_chdtenyrrisk | Continuous | Framingham Risk Score-Coronary Heart Disease | |
| frs_cvdtenyrrisk | Continuous | Framingham Risk Score-Cardiovascular Disease | |
| frs_atpiii_tenyrrisk | Continuous | Framingham Risk Score-Adult Treatment Panel (III)—Coronary Heart Disease | |
| rrs_tenyrrisk | Continuous | Reynolds Risk Score | |
| ascvd_tenyrrisk | Continuous | American College of Cardiology—American Heart Association—Atherosclerotic Cardiovascular Disease |
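Several codebook entries are derived quantities rather than direct measurements (BMI, LVMindex, and the corrected QT intervals). The sketch below writes out the standard textbook formulas for a few of them, using the units given in the codebook. It is an illustration of the definitions only; whether the study dataset computed its variables in exactly this way is an assumption, and the function names are hypothetical.

```python
def bmi(weight_kg, height_cm):
    """Body mass index in kg/m^2."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def lvm_index(lvm_g, height_cm):
    """Left ventricular mass indexed by height (m) raised to the power 2.7."""
    height_m = height_cm / 100.0
    return lvm_g / height_m ** 2.7


def rr_seconds(heart_rate_bpm):
    """RR interval in seconds from heart rate in beats per minute."""
    return 60.0 / heart_rate_bpm


def qtc_bazett(qt_msec, heart_rate_bpm):
    """Bazett correction: QT / sqrt(RR)."""
    return qt_msec / rr_seconds(heart_rate_bpm) ** 0.5


def qtc_fridericia(qt_msec, heart_rate_bpm):
    """Fridericia correction: QT / cube root of RR."""
    return qt_msec / rr_seconds(heart_rate_bpm) ** (1.0 / 3.0)


def qtc_framingham(qt_msec, heart_rate_bpm):
    """Framingham correction: QT + 154 * (1 - RR), in msec."""
    return qt_msec + 154.0 * (1.0 - rr_seconds(heart_rate_bpm))


# At 60 bpm the RR interval is 1 s, so every correction leaves QT unchanged.
qt_at_60 = (
    qtc_bazett(400.0, 60.0),
    qtc_fridericia(400.0, 60.0),
    qtc_framingham(400.0, 60.0),
)
```

Because these variables are deterministic functions of others in the table (for example, BMI of weight and height), they are strongly correlated with their inputs, which is worth keeping in mind when reading feature-importance rankings.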