Lei Wang1, Jian Guo1,2, Zhuang Tian1, Samuel Seery3, Ye Jin2, Shuyang Zhang1.
Abstract
Background: Familial hypercholesterolemia (FH) is an autosomal-dominant genetic disorder that carries a high risk of premature atherosclerotic cardiovascular disease (ASCVD). Many risk assessment tools exist, for example the Dutch Lipid Clinic Network (DLCN) criteria, but their sensitivity and specificity vary across populations. We aimed to assess the risk discovery performance of a hybrid model combining existing FH risk assessment tools with machine learning (ML) methods in Chinese patients with ASCVD.
Keywords: DLCN; early detection and prevention; familial hypercholesterolemia (FH); hybrid diagnosis; risk assessment
Year: 2022 PMID: 35990942 PMCID: PMC9381985 DOI: 10.3389/fcvm.2022.893986
Source DB: PubMed Journal: Front Cardiovasc Med ISSN: 2297-055X
FIGURE 1. The exclusion process of the first-ever ASCVD dataset.
Clinical characteristics for familial hypercholesterolemia (FH) patients identified by DLCN.
| Variables | Total participants (%) | DLCN: Unlikely FH (%) | DLCN: Possible FH (%) | DLCN: Probable FH (%) | DLCN: Definite FH (%) | P-value |
| Number of participants | 5,597 | 4,521 | 932 | 115 | 29 | |
| Age (years) | 63.02 ± 11.44 | 64.33 ± 10.8 | 58.55 ± 12.28 | 53.2 ± 10.78 | 42.41 ± 9.19 | <0.001 |
| Gender/Male | 3,993 (71.34) | 3,239 (71.64) | 658 (70.6) | 73 (63.48) | 23 (79.31) | 0.185 |
| Body mass index (kg/m2) | 25.46 ± 3.32 | 25.38 ± 3.31 | 25.78 ± 3.27 | 26.01 ± 3.47 | 25.34 ± 3.96 | 0.002 |
| Tendon xanthomata/Yes | 6 (0.11) | 0 (0) | 0 (0) | 4 (3.48) | 2 (6.9) | <0.001 |
| HDL-C (mmol/L) | 0.98 ± 0.25 | 0.97 ± 0.25 | 1.01 ± 0.24 | 1.03 ± 0.24 | 1.02 ± 0.26 | <0.001 |
| LDL-C (mmol/L) | 2.45 ± 0.91 | 2.23 ± 0.7 | 3.23 ± 0.8 | 4.2 ± 1.02 | 6.32 ± 1.98 | <0.001 |
| Lp(a) (mg/L) | 177.79 ± 215.07 | 172.89 ± 212.57 | 194.33 ± 221.28 | 227.11 ± 245.52 | 214.97 ± 228.57 | 0.002 |
| TC (mmol/L) | 4.17 ± 1.1 | 3.92 ± 0.92 | 5.01 ± 1 | 6.14 ± 1.2 | 7.57 ± 1.62 | <0.001 |
| TG (mmol/L) | 1.73 ± 1.63 | 1.7 ± 1.73 | 1.84 ± 1.06 | 2.14 ± 1.28 | 2.23 ± 1.46 | 0.001 |
| Smoking status | | | | | | <0.001 |
| Non-smoker | 2,417 (43.18) | 1,991 (44.04) | 377 (40.45) | 45 (39.13) | 4 (13.79) | |
| Ex-smoker | 1,322 (23.62) | 1,099 (24.31) | 191 (20.49) | 27 (23.48) | 5 (17.24) | |
| Current smoker | 1,858 (33.2) | 1,431 (31.65) | 364 (39.06) | 43 (37.39) | 20 (68.97) | |
| Drinking status | | | | | | 0.03 |
| Non-drinker | 3,437 (61.41) | 2,799 (61.91) | 558 (59.87) | 69 (60) | 11 (37.93) | |
| Ex-drinker | 355 (6.34) | 299 (6.61) | 48 (5.15) | 6 (5.22) | 2 (6.9) | |
| Current drinker | 1,805 (32.25) | 1,423 (31.48) | 326 (34.98) | 40 (34.78) | 16 (55.17) | |
| Medical history | | | | | | |
| Hyperlipemia/Yes | 1,962 (35.05) | 1,498 (33.13) | 402 (43.13) | 49 (42.61) | 13 (44.83) | <0.001 |
| Diabetes/Yes | 2,056 (36.73) | 1,680 (37.16) | 329 (35.3) | 35 (30.43) | 12 (41.38) | 0.333 |
| Hypertension/Yes | 3,649 (65.2) | 2,987 (66.07) | 577 (61.91) | 70 (60.87) | 15 (51.72) | <0.001 |
| Premature CHD/Yes | 1,383 (24.71) | 863 (19.09) | 407 (43.67) | 84 (73.04) | 29 (100) | <0.001 |
| Stroke/Yes | 628 (11.22) | 494 (10.93) | 119 (12.77) | 15 (13.04) | 0 (0) | 0.083 |
| Family history | | | | | | |
| Family history of pCHD/Yes | 70 (1.25) | 28 (0.62) | 30 (3.22) | 10 (8.7) | 2 (6.9) | <0.001 |
| Family history of Hyperlipemia/Yes | 76 (1.36) | 49 (1.08) | 16 (1.72) | 6 (5.22) | 5 (17.24) | <0.001 |
| Family history of Stroke/Yes | 709 (12.67) | 562 (12.43) | 130 (13.95) | 15 (13.04) | 2 (6.9) | 0.475 |
Head-to-head comparison among 11 risk assessment tools.
| Head of the tools | SBR [1] | DLCN [2] | MEDPED [3] | JFHMC [4] | LDL-C/TC [5] | AHA [6] | SCCFH [7] | Lp(a)+DLCN [8] | mDLCN [9] | TW [10] | CHC [11] |
[Table body lost in extraction: the per-tool marks for the rows Lipid levels, Physical examination, Family history, Clinical history, and Genetic test did not survive. The Definite FH row retains only the fragments "Homozygous FH:" and "Any 2:1".]
SBR [1], Simon Broome Register; DLCN [2], Dutch Lipid Clinic Network; MEDPED [3], Make Early Diagnosis to Prevent Early Deaths; JFHMC [4], Japanese FH Management Criteria; LDL-C/TC [5], TC and LDL-C; AHA [6], American Heart Association; SCCFH [7], Simplified Chinese Criteria for Familial Hypercholesterolemia; Lp(a)+DLCN [8], Lp(a) added to DLCN; mDLCN [9], modified DLCN for China; TW [10], Taiwan FH diagnostic criteria; CHC [11], 2018 Chinese criteria.
FIGURE 2. The roadmap of identifier setting.
The prevalence and the performance of 10 different criteria compared with DLCN.
| Criteria | Level | Less-risk (n) | Risky (n) | Prevalence (%) | Sen (%) | Spe (%) | PPV (%) | NPV (%) | AUC (%) |
| DLCN | 4 | 5453 | 144 | 2.57% | | | | | |
| mDLCN/risky | 4 | 387 | 140 | 9.42% | 97.22% | 92.90% | 26.57% | 99.92% | 95.06% |
| TW/risky | 4 | 0 | 80 | 1.43% | 55.56% | 100% | 100% | 98.84% | 77.78% |
| SCCFH/risky | 3 | 261 | 71 | 5.93% | 49.31% | 95.21% | 21.39% | 98.61% | 72.26% |
| SBR/risky | 3 | 18 | 23 | 0.73% | 15.97% | 99.67% | 56.10% | 97.82% | 57.82% |
| AHA/risky | 3 | 0 | 18 | 0.32% | 12.50% | 100% | 100% | 97.74% | 56.25% |
| LDL-C/TC/risky | 2 | 681 | 140 | 14.67% | 97.22% | 87.51% | 17.05% | 99.92% | 92.37% |
| Lp(a)+DLCN/risky | 2 | 140 | 91 | 4.13% | 63.19% | 97.43% | 39.39% | 99.01% | 80.31% |
| MEDPED/risky | 2 | 98 | 113 | 3.77% | 78.47% | 98.20% | 53.55% | 99.42% | 88.34% |
| CHC/risky | 2 | 8 | 14 | 0.39% | 9.72% | 99.85% | 63.64% | 97.67% | 54.79% |
| JFHMC/risky | 2 | 0 | 0 | 0 | 0 | 100% | 100% | 0 | 0 |
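The metric columns in the table above can be reproduced from its two count columns, assuming that "Less-risk" and "Risky" hold the number of patients each criterion flags within the DLCN less-risk group (5,453) and the DLCN risky group (144). That reading, and the observation that the AUC column matches (Sen + Spe) / 2, are inferences from the numbers rather than statements from the source. A minimal Python sketch, using the hypothetical helper `criterion_metrics`:

```python
import math

DLCN_RISKY = 144        # DLCN probable + definite FH (reference positives)
DLCN_LESS_RISK = 5453   # DLCN unlikely + possible FH (reference negatives)


def criterion_metrics(flagged_less_risk, flagged_risky,
                      n_neg=DLCN_LESS_RISK, n_pos=DLCN_RISKY):
    """Return screening metrics (in percent) for one criterion against DLCN."""
    tp = flagged_risky          # flagged and risky by DLCN
    fp = flagged_less_risk      # flagged but less-risk by DLCN
    fn = n_pos - tp             # missed DLCN-risky patients
    tn = n_neg - fp             # correctly unflagged patients

    def pct(num, den):
        # Undefined ratios (zero denominator) are reported as NaN.
        return 100.0 * num / den if den else math.nan

    sen = pct(tp, n_pos)
    spe = pct(tn, n_neg)
    return {
        "prevalence": pct(tp + fp, n_pos + n_neg),
        "sen": sen,
        "spe": spe,
        "ppv": pct(tp, tp + fp),
        "npv": pct(tn, tn + fn),
        # AUC of a single-threshold classifier (assumed, see lead-in).
        "auc": (sen + spe) / 2,
    }


mdlcn = criterion_metrics(flagged_less_risk=387, flagged_risky=140)
print({name: round(value, 2) for name, value in mdlcn.items()})
```

For the mDLCN counts (387 and 140) the results round to the tabulated 9.42, 97.22, 92.90, 26.57, 99.92, and 95.06. When a criterion flags no patients, PPV is undefined and this sketch returns NaN.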
The nine variables for machine learning (ML) model setting.
| Variables | Logistic regression | RF | Elastic net | Lasso |
| LDL-c | 5.71 | 569.48 | 0.2184 | 0.1991 |
| pCHDTW | 2.9 | 86.01 | 0.1588 | 0.1663 |
| pCHD_fhTW | 2.18 | – | 0.1422 | 0.1658 |
| pStroke_fh | 3.03 | – | 0.0961 | 0.1365 |
| pStroke | 7.37 | – | 0.3683 | 0.386 |
| pPVD | 8.63 | – | 0.1007 | 0.1338 |
| Tendon xanthomas | 23.83 | – | 0.2784 | 0.3512 |
| Age | – | 218.98 | −0.0003 | −0.0003 |
| Lipid-low treat | 4.47 | 540.8 | – | 0.1492 |
For logistic regression, the inclusion cutoff was 0.1 and the exclusion cutoff was 0.2. The odds ratios and their 95% confidence intervals are displayed for each variable. RF stands for random forest, for which the decrease in Gini score was used for variable selection. For Lasso and Elastic net, the coefficients are displayed. LDL-c, the highest low-density lipoprotein cholesterol during admission; pCHDTW, premature coronary heart disease as identified in the Taiwan FH diagnostic criteria; pCHD_fhTW, family history of premature coronary heart disease as identified in the Taiwan FH diagnostic criteria; pStroke_fh, family history of premature stroke; pStroke, premature stroke; pPVD, premature peripheral vascular disease; Lipid-low treat, lipid-lowering medication.
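An odds ratio from logistic regression is the exponentiated model coefficient. As a minimal sketch of that relationship, the hypothetical helper `odds_ratio_ci` below converts a coefficient and its standard error into an odds ratio with a Wald 95% confidence interval. Whether the study used Wald intervals is not stated, and the standard error in the example is an illustrative assumption; only the odds ratio of 5.71 for LDL-c comes from the table.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence interval from a logit coefficient."""
    point = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return point, lower, upper


# Coefficient implied by the tabulated LDL-c odds ratio of 5.71.
beta_ldl = math.log(5.71)
se_assumed = 0.10  # hypothetical standard error, for illustration only

or_ldl, ci_low, ci_high = odds_ratio_ci(beta_ldl, se_assumed)
print(f"beta={beta_ldl:.3f} OR={or_ldl:.2f} 95% CI=({ci_low:.2f}, {ci_high:.2f})")
```

The interval is symmetric on the log-odds scale, so it is asymmetric around the odds ratio itself.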
The performance of familial hypercholesterolemia (FH) risk models.
| Methods | Classifier | | | | | | Probability model | | | | | |
| | Accuracy (%) | Sensitivity (%) | F | AUC_class (%) | RMSE_class | G score | AUC_prob (%) | RMSE_prob | Calmean | BS_1 | BS_0 | BS_ALL |
| XGBoost | 71.79 ± 1.84 | 95.34 ± 1.52 | 0.5738 ± 0.0167 | 80.65 ± 1.25 | 0.5308 ± 0.0176 | 0.7928 ± 0.0141 | 92.96 ± 1.03 | 0.2787 ± 0.0274 | 0.0854 ± 0.0227 | 0.2962 ± 0.1276 | 0.0244 ± 0.0128 | 0.0784 ± 0.0163 |
| RF | 94.12 ± 0.52 | 86.09 ± 2.79 | 0.8532 ± 0.0137 | 91.1 ± 1.25 | 0.2423 ± 0.0106 | 0.9095 ± 0.0135 | 50 ± 0 | 0.4773 ± 0.0105 | 0.4606 ± 0.0122 | 0.2918 ± 0.021 | 0.2121 ± 0.0177 | 0.2279 ± 0.01 |
| SVM | 94.2 ± 0.45 | 82.82 ± 2.12 | 0.8503 ± 0.0121 | 89.93 ± 0.99 | 0.2406 ± 0.0094 | 0.8964 ± 0.0108 | 49.99 ± 0.03 | 0.3994 ± 0.0005 | 0.3025 ± 0.0007 | 0.6341 ± 0.0017 | 0.0418 ± 0.0006 | 0.1595 ± 0.0004 |
| SVMBoost | 94.28 ± 0.43 | 84.11 ± 2.49 | 0.8538 ± 0.0117 | 90.45 ± 1.1 | 0.2391 ± 0.0091 | 0.9022 ± 0.0119 | 49.98 ± 0.02 | 0.3994 ± 0.0004 | 0.3025 ± 0.0007 | 0.6341 ± 0.0017 | 0.0418 ± 0.0006 | 0.1596 ± 0.0003 |
| BPANN | 94.31 ± 0.48 | 84.56 ± 2.19 | 0.8552 ± 0.0126 | 90.65 ± 1.04 | 0.2384 ± 0.01 | 0.9043 ± 0.0112 | 98.4 ± 0.27 | 0.1972 ± 0.0076 | 0.0152 ± 0.0043 | 0.1168 ± 0.0143 | 0.0196 ± 0.0027 | 0.039 ± 0.003 |
| BPANNBoost | 94.38 ± 0.43 | 85.08 ± 2.12 | 0.8574 ± 0.0114 | 90.88 ± 0.98 | 0.237 ± 0.009 | 0.9069 ± 0.0105 | 50 ± 0.01 | 0.3999 ± 0.0026 | 0.3082 ± 0.0116 | 0.6193 ± 0.0287 | 0.0459 ± 0.0096 | 0.1599 ± 0.0021 |
| LOG | 93.77 ± 0.47 | 81.12 ± 2.25 | 0.838 ± 0.0132 | 89.01 ± 1.07 | 1.043 ± 0.0072 | 0.8865 ± 0.0118 | 87.42 ± 17.57 | 0.374 ± 0.2029 | 0.1909 ± 0.232 | 0.2797 ± 0.2448 | 0.1561 ± 0.2656 | 0.1807 ± 0.2012 |
| LOGBoost | 93.63 ± 0.65 | 82.18 ± 6.04 | 0.8361 ± 0.0207 | 89.32 ± 2.49 | 0.2522 ± 0.0128 | 0.8896 ± 0.0277 | 50 ± 0 | 0.4139 ± 0.0075 | 0.3635 ± 0.0195 | 0.4853 ± 0.0451 | 0.0935 ± 0.0187 | 0.1714 ± 0.0062 |
| STACK | 93.52 ± 0.47 | 97.06 ± 0.86 | 0.8564 ± 0.0092 | 94.85 ± 0.47 | 0.2543 ± 0.0092 | 0.9483 ± 0.0047 | 98.66 ± 0.27 | 0.2381 ± 0.0325 | 0.0683 ± 0.0299 | 0.2502 ± 0.103 | 0.01 ± 0.0079 | 0.0577 ± 0.0153 |
XGBoost, eXtreme Gradient Boosting; RF, random forest; SVM, support vector machine; LOG, logistic regression; BPANN, backpropagation artificial neural network; BPANNBoost, an AdaBoost model with BPANN as the base model; SVMBoost, an AdaBoost model with SVM as the base model; LOGBoost, an AdaBoost model with logistic regression as the base model; STACK, the stacking model with RF, SVM, BPANN, and LOG as the base classifiers and logistic regression as the meta-classifier. F, precision-recall F measure; AUC_class, area under the curve for the classifier; AUC_prob, area under the curve for the probability model; RMSE_class, root-mean-squared error of the classifier; RMSE_prob, root-mean-squared error of the probability model; Calmean, mean of calibration error; BS_1, Brier score for the samples with label 1; BS_0, Brier score for the samples with label 0; BS_ALL, Brier score for all samples.
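The probability-model metrics defined above can be illustrated with a short sketch. It assumes the standard Brier score (the mean squared difference between the predicted probability and the 0/1 label) and takes RMSE_prob as its square root. The source does not spell out its formulas, so these definitions and the helper name `brier_by_class` are assumptions, and the labels and probabilities are toy values.

```python
import math


def brier_by_class(y_true, p_pred):
    """Brier score for positives, negatives, and all samples, plus RMSE."""
    squared = [(p - y) ** 2 for y, p in zip(y_true, p_pred)]
    positives = [s for s, y in zip(squared, y_true) if y == 1]
    negatives = [s for s, y in zip(squared, y_true) if y == 0]

    def mean(values):
        # An empty class has no defined score.
        return sum(values) / len(values) if values else math.nan

    bs_all = mean(squared)
    return {
        "BS_1": mean(positives),
        "BS_0": mean(negatives),
        "BS_ALL": bs_all,
        "RMSE_prob": math.sqrt(bs_all),
    }


# Toy labels (1 = FH risk) and predicted probabilities, not study data.
labels = [1, 0, 0, 0, 1]
probs = [0.9, 0.2, 0.1, 0.4, 0.6]
scores = brier_by_class(labels, probs)
print({name: round(value, 4) for name, value in scores.items()})
```

Under these definitions BS_ALL is the prevalence-weighted average of BS_1 and BS_0, so a model can score well overall while fitting the positive class poorly.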
FIGURE 3. The interpretation of age, LDL-c, and lipid-lowering therapy in relation to FH risk.