Matej Pičulin1, Tim Smole1, Bojan Žunkovič1, Enja Kokalj1, Marko Robnik-Šikonja1, Matjaž Kukar1, Dimitrios I Fotiadis2, Vasileios C Pezoulas2, Nikolaos S Tachos2, Fausto Barlocco3,4, Francesco Mazzarotto3,4,5, Dejana Popović6, Lars S Maier7, Lazar Velicki8,9, Iacopo Olivotto3,4, Guy A MacGowan10, Djordje G Jakovljević10,11, Nenad Filipović12, Zoran Bosnić1.
Abstract
BACKGROUND: Cardiovascular disorders are responsible for 30% of deaths worldwide. Among them, hypertrophic cardiomyopathy (HCM) is a genetic cardiac disease that is present in about 1 in 500 young adults and can cause sudden cardiac death (SCD).
Keywords: AI; ML; SCD; artificial intelligence; cardiomyopathy; cardiovascular disease; disease progression; hypertrophic cardiomyopathy; machine learning; prediction; prediction model; sudden cardiac death; validation
Year: 2022 PMID: 35107432 PMCID: PMC8851344 DOI: 10.2196/30483
Source DB: PubMed Journal: JMIR Med Inform
Figure 1. Overview of the proposed disease progression system. The system receives clinical data and disease-related events of a patient as input, uses virtual patient data and semisupervised learning for self-improvement, and returns the predictions and their explanation for 6 target variables.
Figure 2. Relationship between the amount of labeled and unlabeled data. The bars for Yes and No values are stacked, visually revealing the ratio between labeled and unlabeled data. Note that the rightmost columns do not have 10-year follow-up data because those records are less than 10 years old.
Basic characteristics of patients for basic continuous parameters (N=10,318).
| Continuous parameter | Mean (SD) | Missing data, n (%) |
| Age (years) | 52.1 (18.6) | 4 (0.04) |
| Weight (kg) | 73.4 (14.6) | 2381 (23.08) |
| Height (cm) | 169 (10.3) | 2273 (22.03) |
| Body mass index (BMI) | 25.6 (4.09) | 2423 (23.48) |
| NYHAa | 1.69 (0.73) | 983 (9.53) |
aNYHA: New York Heart Association.
Basic characteristics of patients for basic binary parameters (N=10,318).
| Binary parameter | 1-value, n (%) | 0-value, n (%) | Missing, n (%) |
| Alcohol | Yes, 103 (0.99) | No, 10,215 (99) | 0 |
| Drug | Yes, 18 (0.17) | No, 10,300 (99.83) | 0 |
| Smoking | Yes, 3437 (33.31) | No, 6881 (66.69) | 0 |
| Pregnancy | Yes, 443 (4.29) | No, 9875 (95.71) | 2515 (24.37) |
| Gender | Male, 6400 (62.03) | Female, 3918 (37.97) | 0 |
Basic characteristics for groups of parameters (N=10,318)a.
| Procedure | Parameters, n | Total missing values, n (%) |
| ECGb | 9 | 45,839 (49.36) |
| Echoc | 26 | 98,191 (36.60) |
| CMRd | 10 | 81,174 (78.67) |
aThe table shows aggregated statistics for several parameters obtained from the same procedure. The percentage for each procedure is obtained as follows: [Total missing values/(Parameter × N)] × 100.
bECG: electrocardiogram.
cEcho: echocardiogram.
dCMR: cardiovascular magnetic resonance.
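The percentage formula in footnote a can be checked directly against the table. The following is a minimal sketch; the function name `missing_percentage` is chosen here for illustration and does not come from the source.

```python
def missing_percentage(total_missing: int, n_parameters: int, n_patients: int) -> float:
    """Share of missing cells among all parameter-by-patient cells, in percent."""
    return total_missing / (n_parameters * n_patients) * 100


N = 10_318  # number of patients, from the table caption

# (number of parameters, total missing values) per procedure, from the table
groups = {"ECG": (9, 45_839), "Echo": (26, 98_191), "CMR": (10, 81_174)}

for name, (n_params, total_missing) in groups.items():
    print(f"{name}: {missing_percentage(total_missing, n_params, N):.2f}%")
# ECG: 49.36%
# Echo: 36.60%
# CMR: 78.67%
```

The computed values match the percentages reported in the table.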
Absolute number and percentage of missing values of target variables as class and as input (N=10,318).
| | LA_da, n (%) | LVEFb, n (%) | NYHAc, n (%) | LVIDdd, n (%) | LVIDse, n (%) | LA_Volf, n (%) |
| Target | 8569 (83.05) | 8481 (82.19) | 8313 (80.57) | 8607 (83.42) | 9336 (90.48) | 8631 (83.65) |
| Input | 2691 (26.08) | 2399 (23.25) | 983 (9.53) | 2517 (24.39) | 5329 (51.65) | 3680 (35.67) |
aLA_d: left atrial diameter.
bLVEF: left ventricular ejection fraction.
cNYHA: New York Heart Association.
dLVIDd: left ventricular internal diameter at end diastole.
eLVIDs: left ventricular internal diameter at end systole.
fLA_Vol: left atrial volume.
Selected attributes using RReliefF.a
| Variableb | LA_dc score | LVEFd score | NYHAe score | LVIDdf score | LVIDsg score | LA_Volh score | Average rank |
| | 0.198 | 0.194 | 0.166 | 0.142 | 0.166 | 0.158 | 1.000 |
| | 0.051 | 0.037 | 0.043 | 0.055 | 0.058 | 0.022 | 12.500 |
| | 0.057 | 0.064 | 0.045 | 0.075 | 0.051 | 0.029 | 9.167 |
| | 0.075 | 0.073 | 0.053 | 0.095 | 0.085 | 0.045 | 4.167 |
| | 0.063 | 0.046 | 0.052 | 0.032 | 0.069 | 0.082 | 7.500 |
| | 0.072 | 0.042 | 0.052 | 0.039 | 0.044 | 0.056 | 9.667 |
| History of syncope | 0.026 | 0.036 | 0.029 | 0.022 | 0.029 | 0.048 | 20.000 |
| | 0.056 | 0.060 | 0.061 | 0.047 | 0.052 | 0.066 | 5.833 |
| Family history of SCDk | 0.027 | 0.051 | 0.032 | 0.031 | 0.051 | 0.049 | 14.667 |
| NYHA | 0.011 | 0.017 | 0.069 | 0.007 | 0.027 | 0.022 | 33.000 |
| Presence of atrial fibrillation | 0.055 | 0.036 | 0.048 | 0.018 | 0.026 | 0.068 | 16.333 |
| QRS duration | 0.035 | 0.046 | 0.029 | 0.039 | 0.026 | 0.039 | 17.167 |
| | 0.043 | 0.052 | 0.049 | 0.041 | 0.057 | 0.052 | 8.167 |
| LA_d | 0.078 | 0.037 | 0.036 | 0.018 | 0.031 | 0.070 | 15.000 |
| LA_Vol | 0.055 | 0.029 | 0.026 | 0.012 | 0.025 | 0.059 | 24.000 |
| LVIDs | 0.017 | 0.022 | 0.027 | 0.029 | 0.043 | 0.031 | 25.167 |
| LVIDd | 0.021 | 0.017 | 0.017 | 0.036 | 0.044 | 0.026 | 27.667 |
| LVEF | 0.018 | 0.051 | 0.019 | 0.014 | 0.050 | 0.013 | 27.833 |
| | 0.045 | 0.041 | 0.039 | 0.051 | 0.052 | 0.059 | 9.667 |
| | 0.037 | 0.044 | 0.034 | 0.040 | 0.066 | 0.023 | 14.667 |
| Negative genetics | 0.036 | 0.037 | 0.027 | 0.043 | 0.030 | 0.031 | 18.667 |
aThe table shows RReliefF feature scores and the average ranks for each target variable.
bNames of the 10 highest-ranked variables are italicized.
cLA_d: left atrial diameter.
dLVEF: left ventricular ejection fraction.
eNYHA: New York Heart Association.
fLVIDd: left ventricular internal diameter at end diastole.
gLVIDs: left ventricular internal diameter at end systole.
hLA_Vol: left atrial volume.
iBSA: body surface area.
jHCM: hypertrophic cardiomyopathy.
kSCD: sudden cardiac death.
lECG: electrocardiogram.
mEcho: echocardiogram.
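The Average rank column aggregates each variable's per-target position. A minimal sketch of one way to compute such a value is shown below, assuming that variables are ranked within each target by descending score (1 = highest) and that the ranks are then averaged across targets. The function name, the tie handling, and the toy scores are illustrative assumptions, not the authors' procedure or data.

```python
def average_ranks(scores: dict[str, list[float]]) -> dict[str, float]:
    """Rank variables per target (1 = highest score) and average the ranks.

    `scores` maps a variable name to its list of scores, one per target.
    Ties are broken by input order (an assumption made for simplicity).
    """
    names = list(scores)
    n_targets = len(next(iter(scores.values())))
    totals = {name: 0.0 for name in names}
    for t in range(n_targets):
        # Sort variables by their score for target t, best first.
        ordered = sorted(names, key=lambda name: scores[name][t], reverse=True)
        for rank, name in enumerate(ordered, start=1):
            totals[name] += rank
    return {name: totals[name] / n_targets for name in names}


# Hypothetical scores for three variables over two targets.
toy_scores = {"x": [0.3, 0.1], "y": [0.2, 0.4], "z": [0.1, 0.2]}
print(average_ranks(toy_scores))
# {'x': 2.0, 'y': 1.5, 'z': 2.5}
```

Under this scheme, a variable with the highest score for every target gets an average rank of exactly 1, which is consistent with the first row of the table.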
Comparison of the best-performing models for each target variable.
| Target | Model and parameter | MAEa | RMSEb | RRMSEcmean | RRMSEconst |
| LA_dd | RFe: Sf+VPg+Subset | 3.4 | 4.73 | 0.54 | 0.46 |
| LA_Volh | RF: S+VP+Subset | 18.4 | 26.73 | 0.56 | 0.47 |
| LVEFi | GBj: S+Subset | 4.92 | 6.73 | 0.67 | 0.61 |
| LVIDdk | RF: S+VP+Subset | 3.53 | 5.26 | 0.68 | 0.64 |
| LVIDsl | RF: S+VP+Subset | 3.42 | 4.81 | 0.66 | 0.56 |
| NYHAm | RF: S+VP+Subset | 0.39 | 0.5 | 0.67 | 0.66 |
aMAE: mean absolute error.
bRMSE: root-mean-square error.
cRRMSE: relative root-mean-square error.
dLA_d: left atrial diameter.
eRF: random forest.
fS: application of semisupervised learning.
gVP: addition of virtual patients' data into the learning data set.
hLA_Vol: left atrial volume.
iLVEF: left ventricular ejection fraction.
jGB: gradient boosted.
kLVIDd: left ventricular internal diameter at end diastole.
lLVIDs: left ventricular internal diameter at end systole.
mNYHA: New York Heart Association.
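The error measures in the table can be sketched as follows. MAE and RMSE follow their standard definitions. For RRMSE, the sketch assumes the model's RMSE is divided by the RMSE of a baseline predictor (for example, one that always predicts the mean); the exact baselines behind the two RRMSE columns are not stated in this excerpt, so that part is an assumption, and the numbers below are hypothetical.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def rrmse(y_true, y_pred, y_baseline):
    """RMSE of the model relative to the RMSE of a baseline (assumed definition)."""
    return rmse(y_true, y_pred) / rmse(y_true, y_baseline)


# Hypothetical values, not taken from the study.
y_true = [40, 44, 38, 46]
y_pred = [41, 43, 40, 45]
mean_baseline = [sum(y_true) / len(y_true)] * len(y_true)  # always predicts 42

print(mae(y_true, y_pred))   # 1.25
print(round(rmse(y_true, y_pred), 4))
print(round(rrmse(y_true, y_pred, mean_baseline), 4))
```

Under this definition, an RRMSE below 1 means the model beats the baseline, which is how values such as 0.54 in the table would be read. In practice the baseline mean would come from the training data rather than from the evaluated values used here for brevity.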
Figure 3. Plotted results for the R² statistic for each target variable using different sets (input parameters). Note that VP, S, and S + VP are used on feature subsets. LA_d: left atrial diameter; LA_Vol: left atrial volume; LVEF: left ventricular ejection fraction; LVIDd: left ventricular internal diameter at end diastole; LVIDs: left ventricular internal diameter at end systole; NYHA: New York Heart Association; S: application of semisupervised learning; VP: addition of virtual patients' data into the learning data set.
Figure 4. Example of an explanation of the prediction for the target variable LA_d. LA_d: left atrial diameter; LA_Vol: left atrial volume; LVIDs: left ventricular internal diameter at end systole.
Mean absolute error (MAE) of the discretized model predictions (MD), individual experts (E), and the entire consortium (C).
| Target/prediction | Model (MD), MAE (SD) | Expert (E), MAE (SD) | Consortium (C), MAE (SD) |
| NYHAa | | 0.84 (0.69) | 0.56 (0.34) |
| LA_d c | 1.70 (0.82) | 1.69 (0.97) | |
| LA_Vold | | 1.25 (0.98) | 1.13 (0.63) |
| LVIDde | | 1.09 (0.91) | 1.00 (0.77) |
| LVIDsf | | 1.02 (0.86) | 0.88 (0.68) |
| LVEFg | | 1.32 (0.90) | 1.28 (0.79) |
aNYHA: New York Heart Association.
bThe lowest achieved errors are italicized.
cLA_d: left atrial diameter.
dLA_Vol: left atrial volume.
eLVIDd: left ventricular internal diameter at end diastole.
fLVIDs: left ventricular internal diameter at end systole.
gLVEF: left ventricular ejection fraction.
Agreement ratios between experts and prediction explanations for parameters that contribute to predicting each target variable. The last two columns provide summary statistics.
| Target variable and parameters | Expert agreement | Ratio of agreed features from at least 50% of experts, n | Average agreement, n |
| | | | |
| | | 1.00 (4/4) | 0.73 |
| | | | |
| BSAf | 0.15 | | |
| | | 0.67 (4/6) | 0.52 |
| LVEFg | 0.23 | | |
| | | | |
| QRS duration | 0.38 | | |
| Syncope | 0.46 | 0.40 (2/5) | 0.49 |
| NYHA | 0.38 | | |
| | | | |
| | | 0.75 (3/4) | 0.48 |
| Age | 0.15 | | |
| | | | |
| LA_d | 0.38 | | |
| LVIDd | 0.38 | | |
| | | 0.43 (3/7) | 0.47 |
| Interventricular septum (IVS) | 0.38 | | |
| Family history of HCMh | 0.08 | | |
| | | | |
| Atrial fibrillation | 0.15 | | |
| BSA | 0.08 | 0.17 (1/6) | 0.36 |
| IVS | 0.38 | | |
| Age | 0.31 | | |
| LVEF | 0.38 | | |
aNYHA: New York Heart Association.
bLA_d: left atrial diameter.
cNames of parameters with agreement higher than 50% are italicized.
dLA_Vol: left atrial volume.
eLVIDd: left ventricular internal diameter at end diastole.
fBSA: body surface area.
gLVEF: left ventricular ejection fraction.
hHCM: hypertrophic cardiomyopathy.
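The two summary columns can be derived from the per-parameter agreement values. The sketch below assumes that a parameter counts as agreed when its agreement is at least 0.5 (the column header says "at least 50%", whereas footnote c says "higher than 50%", so the exact cutoff is an assumption) and that the average is a plain mean over the parameters of one target. The function name, parameter names, and values are hypothetical.

```python
def summarize_agreement(agreement: dict[str, float], threshold: float = 0.5):
    """Return (agreed parameter names, ratio of agreed parameters, mean agreement).

    `agreement` maps a parameter name to the fraction of experts who agreed
    that the parameter contributes to the prediction.
    """
    agreed = [name for name, value in agreement.items() if value >= threshold]
    ratio = len(agreed) / len(agreement)
    average = sum(agreement.values()) / len(agreement)
    return agreed, ratio, average


# Hypothetical agreement values for one target variable.
example = {"feature_a": 0.62, "feature_b": 0.46, "feature_c": 0.38, "feature_d": 0.54}
agreed, ratio, average = summarize_agreement(example)
print(agreed)             # ['feature_a', 'feature_d']
print(f"{ratio:.2f} ({len(agreed)}/{len(example)})")  # 0.50 (2/4)
print(f"{average:.2f}")   # 0.50
```

The printed ratio uses the same "value (agreed/total)" layout as the summary cells in the table, such as 0.40 (2/5).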