Yuya Shinkawa, Takashi Yoshida, Yohei Onaka, Makoto Ichinose, Kazuo Ishii.
Abstract
Cerebral white matter lesions are ischemic symptoms caused mainly by microangiopathy; they are diagnosed by MRI because they appear as abnormalities on MRI images. Because patients with white matter lesions do not have any symptoms, the lesions are often detected for the first time by MRI. In Japan, head MRI for the diagnosis and grading of cerebral white matter lesions is generally performed as an optional part of medical checkups. In this study, we develop a mathematical model for predicting white matter lesions using data from routine medical evaluations that do not include a head MRI. Linear discriminant analysis, logistic discrimination, Naive Bayes classifier, support vector machine, and random forest were investigated and evaluated by ten-fold cross-validation, using clinical data for 1,904 examinees (988 males and 916 females) from medical checkups that did include a head MRI. The logistic regression model was selected based on a comparison of accuracy and interpretability. The model variables consisted of age, gender, plaque score (PS), LDL, systolic blood pressure (SBP), and administration of antihypertensive medication (odds ratios: 2.99, 1.57, 1.18, 1.06, 1.12, and 1.52, respectively). The model showed an area under the ROC curve (AUC) of 0.805, with a sensitivity of 72.0% and a specificity of 75.1% at the most appropriate cutoff value, 0.579, as given by the Youden Index. This model has been shown to be useful for identifying patients at high risk of cerebral white matter lesions, who can then be diagnosed with a head MRI examination in order to prevent dementia, cerebral infarction, and stroke.
Year: 2019 PMID: 30990827 PMCID: PMC6467420 DOI: 10.1371/journal.pone.0215142
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Typical examples of the presence or absence of cerebral white matter lesions.
The upper three images are those of a subject with cerebral white matter lesions, while the lower three images are those of a subject without cerebral white matter lesions. T1 weighted images (T1WI), T2 weighted images (T2WI), and Fluid Attenuated Inversion Recovery (FLAIR) images were obtained using MRI equipment (MAGNETOM Symphony and MAGNETOM ESSENZA; Siemens Healthineers Global), but only the FLAIR images are shown here.
Assessment of the clinical examination data and questionnaire responses.
| factor | level | no white matter lesions | white matter lesions | p-value |
|---|---|---|---|---|
| n | | 860 | 1044 | |
| PS, mean (sd) | | 0.59 (1.3) | 1.49 (2.3) | <0.001 |
| age, mean (sd) | | 49.96 (10.8) | 61.73 (9.2) | <0.001 |
| LDL, mean (sd) | | 119.64 (31.6) | 121.88 (29.4) | 0.109 |
| HDL, mean (sd) | | 59.47 (14.7) | 62.49 (15.8) | <0.001 |
| LH, mean (sd) | | 2.14 (0.8) | 2.08 (0.7) | 0.068 |
| TG, mean (sd) | | 115.59 (124.4) | 108.71 (73.6) | 0.134 |
| HbA1c, mean (sd) | | 5.66 (0.6) | 5.86 (0.7) | <0.001 |
| BS, mean (sd) | | 102.41 (16.8) | 105.45 (19.2) | <0.001 |
| SBP, mean (sd) | | 119.96 (16.6) | 127.1 (19.3) | <0.001 |
| DBP, mean (sd) | | 72.43 (11.7) | 75.05 (12.5) | <0.001 |
| number of plaques, mean (sd) | | 0.34 (0.7) | 0.84 (1.2) | <0.001 |
| BMI, mean (sd) | | 23.13 (3.4) | 23.17 (3.4) | 0.774 |
| gender, n (%) | male | 497 (57.8) | 491 (47.0) | <0.001 |
| | female | 363 (42.2) | 553 (53.0) | |
| metabolic syndrome, n (%) | no | 687 (79.9) | 742 (71.1) | <0.001 |
| | reserve | 79 (9.2) | 117 (11.2) | |
| | yes | 94 (10.9) | 185 (17.7) | |
| smoking habit, n (%) | yes | 197 (22.9) | 139 (13.3) | <0.001 |
| | no | 663 (77.1) | 905 (86.7) | |
| medication to reduce blood pressure, n (%) | yes | 109 (12.7) | 353 (33.8) | <0.001 |
| | no | 751 (87.3) | 691 (66.2) | |
| medication to reduce blood sugar or insulin injection, n (%) | yes | 32 (3.7) | 107 (10.2) | <0.001 |
| | no | 828 (96.3) | 937 (89.8) | |
| medication to reduce cholesterol level, n (%) | yes | 82 (9.5) | 233 (22.3) | <0.001 |
| | no | 778 (90.5) | 811 (77.7) | |
| amount of drinking per day (in terms of sake), n (%) | less than 180 mL | 504 (58.6) | 720 (69.0) | <0.001 |
| | 180-360 mL | 237 (27.6) | 230 (22.0) | |
| | 360-540 mL | 89 (10.3) | 69 (6.6) | |
| | more than 540 mL | 30 (3.5) | 25 (2.4) | |
| drink habit, n (%) | rarely drink | 331 (38.5) | 465 (44.5) | 0.028 |
| | sometimes | 269 (31.3) | 296 (28.4) | |
| | everyday | 260 (30.2) | 283 (27.1) | |
P-values were calculated with Student's t-test for continuous variables and with Fisher's exact test for categorical variables. “LH” denotes the LH ratio. The other abbreviations are defined in the Materials and Methods section.
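To illustrate how a p-value for a categorical variable can arise, here is a minimal sketch of a two-sided Fisher's exact test applied to the gender counts in the table (497 and 491 males, 363 and 553 females). The function name `fisher_exact_two_sided` is hypothetical and uses only the Python standard library; it is not the authors' code, and the software used for their tests is not stated here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-7)
    )


# Rows: male, female. Columns: no white matter lesions, white matter lesions.
p_gender = fisher_exact_two_sided(497, 491, 363, 553)
print(f"gender p-value: {p_gender:.2e}")
```

The resulting p-value is far below 0.001, in line with the table entry for gender.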
Fig 2. Variable selection by graphical modeling.
The graphical modeling was performed using the R packages “corpcor” [26] and “qgraph” [27].
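The quantity that typically underlies such a graph is the partial correlation, sketched below for three variables. This assumes that the edges in the figure represent partial correlations, which the caption does not state. The function name `partial_corr` and all input values are hypothetical and are not taken from the study.

```python
from math import sqrt


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical pairwise correlations (not study data).
r = partial_corr(r_xy=0.30, r_xz=0.50, r_yz=0.60)
print(f"partial correlation of x and y given z: {r:.3f}")
```

In this toy case the marginal correlation of 0.30 between x and y is fully accounted for by z, so no edge would connect x and y in a partial-correlation graph.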
Performance comparison of each model by 10-fold cross-validation.
| model | AUC | cutoff point | accuracy | error | TPR | TNR | PPV | NPV |
|---|---|---|---|---|---|---|---|---|
| LogReg | 0.799 | 0.566 | 71.1% | 28.9% | 64.4% | 79.4% | 79.1% | 64.8% |
| NB | 0.776 | 0.382 | 72.0% | 28.0% | 76.5% | 66.6% | 73.8% | 69.8% |
| SVM | 0.787 | 0.679 | 70.7% | 29.3% | 64.5% | 78.7% | 48.8% | 28.0% |
| RF | 0.790 | 0.428 | 71.7% | 28.3% | 83.1% | 58.0% | 39.8% | 30.6% |
The four models, logistic regression (LogReg), Naive Bayes classifier (NB), support vector machine (SVM), and random forest (RF), were compared using seven indices: accuracy, error rate (error), true positive rate (TPR, also called sensitivity), true negative rate (TNR, also called specificity), positive predictive value (PPV), negative predictive value (NPV), and area under the ROC curve (AUC).
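The indices are linked through the confusion matrix. The sketch below derives accuracy, error rate, PPV, and NPV from TPR, TNR, and the prevalence of lesions. It assumes the prevalence equals the overall proportion in the data (1,044 of 1,904 examinees), which may differ slightly from fold to fold; `derived_metrics` is a hypothetical helper name.

```python
def derived_metrics(tpr, tnr, prevalence):
    """Derive accuracy, error, PPV and NPV from TPR, TNR and prevalence."""
    tp = tpr * prevalence              # true positives, as a fraction of all cases
    fn = (1 - tpr) * prevalence        # false negatives
    tn = tnr * (1 - prevalence)        # true negatives
    fp = (1 - tnr) * (1 - prevalence)  # false positives
    accuracy = tp + tn
    return {
        "accuracy": accuracy,
        "error": 1 - accuracy,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }


# Assumption: prevalence taken from the overall group sizes (1,044 of 1,904).
prevalence = 1044 / 1904
logreg = derived_metrics(tpr=0.644, tnr=0.794, prevalence=prevalence)
for name, value in logreg.items():
    print(f"{name}: {value:.1%}")
```

With the LogReg values for TPR and TNR, this gives figures within rounding of the accuracy, error, PPV, and NPV listed for that model.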
Fig 3. ROC curves using all the investigated models.
LogReg: logistic regression analysis. NB: Naive Bayes classifier. RF: random forest. SVM: support vector machine. The ROC curves were plotted using the R packages “plotROC” [32] and “ggplot2” [33].
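For readers unfamiliar with how an ROC curve and its AUC are obtained from predicted scores, the following is a minimal sketch in plain Python. The scores and labels are hypothetical toy values, and `roc_points` and `auc` are illustrative names; the authors produced their curves in R.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by lowering the threshold step by step."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Process tied scores together so they yield a single ROC point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC points by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical predicted scores and true labels (1 = lesion present).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
toy_auc = auc(roc_points(scores, labels))
print(f"toy AUC: {toy_auc:.4f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.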
Odds ratio of each variable estimated by the logistic regression model.
| variable | level | odds ratio | 95% CI lower | 95% CI upper |
|---|---|---|---|---|
| age | | 2.99 | 2.61 | 3.45 |
| gender | male | reference | | |
| | female | 1.57 | 1.22 | 2.03 |
| PS | | 1.18 | 1.03 | 1.35 |
| LDL | | 1.06 | 0.95 | 1.19 |
| SBP | | 1.12 | 1.00 | 1.26 |
| HbA1c | | 0.97 | 0.84 | 1.11 |
| metabolic syndrome | no | reference | | |
| | reserve | 1.53 | 1.06 | 2.23 |
| | yes | 1.16 | 0.81 | 1.67 |
| medication to reduce blood pressure | no | reference | | |
| | yes | 1.52 | 1.13 | 2.06 |
| medication to reduce blood sugar or insulin injection | no | reference | | |
| | yes | 1.45 | 0.83 | 2.61 |
| medication to reduce cholesterol level | no | reference | | |
| | yes | 1.16 | 0.83 | 1.64 |
| drink habit | rarely drink | reference | | |
| | sometimes | 1.28 | 0.98 | 1.67 |
| | everyday | 1.04 | 0.78 | 1.40 |
(*) indicates that the confidence interval does not include 1.
Summary of the logistic regression model using selected variables.
| parameter | coefficient | std | p-value |
|---|---|---|---|
| (Intercept) | -0.28 | 0.14 | 0.04 |
| age | 1.10 | 0.07 | <0.001 |
| gender: female | 0.45 | 0.13 | <0.001 |
| PS | 0.16 | 0.07 | 0.02 |
| LDL | 0.06 | 0.06 | 0.30 |
| SBP | 0.12 | 0.06 | 0.05 |
| HbA1c | -0.04 | 0.07 | 0.32 |
| metabolic syndrome: reserve | 0.43 | 0.19 | 0.03 |
| metabolic syndrome: yes | 0.15 | 0.18 | 0.41 |
| medication to reduce blood pressure: yes | 0.42 | 0.15 | 0.01 |
| medication to reduce blood sugar or insulin injection: yes | 0.37 | 0.29 | 0.20 |
| medication to reduce cholesterol level: yes | 0.15 | 0.17 | 0.38 |
| drink habit: sometimes | 0.24 | 0.14 | 0.08 |
| drink habit: everyday | 0.04 | 0.15 | 0.79 |
“std” represents the standard error in the output of glm().
(*) indicates a significant variable (p<0.05).
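The coefficients and standard errors can be related to odds ratios by exponentiation. The sketch below uses a Wald-type interval, exp(coefficient ± 1.96 × std), which is an assumption; the source does not say how the intervals were computed, so small differences from the reported intervals are expected, also because the tabulated coefficients are rounded. The helper name `odds_ratio_with_wald_ci` is hypothetical.

```python
from math import exp


def odds_ratio_with_wald_ci(coef, se, z=1.96):
    """Return (odds ratio, lower, upper) using a Wald-type 95% interval."""
    return exp(coef), exp(coef - z * se), exp(coef + z * se)


# (coefficient, standard error) pairs copied from the summary table.
coefficients = {
    "age": (1.10, 0.07),
    "gender: female": (0.45, 0.13),
    "PS": (0.16, 0.07),
}

results = {
    name: odds_ratio_with_wald_ci(coef, se)
    for name, (coef, se) in coefficients.items()
}
for name, (odds, lower, upper) in results.items():
    print(f"{name}: OR {odds:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

For age, this gives an odds ratio of about 3.00 with an interval of roughly 2.62 to 3.45, close to the reported 2.99 (2.61 to 3.45).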
Fig 4. ROC curve using the final logistic discriminant model.
The point marked on the ROC curve shows the most appropriate cutoff value given by the Youden Index.
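The Youden Index is J = sensitivity + specificity − 1, and the chosen cutoff is the one that maximizes J. A minimal sketch of this selection follows. The toy probabilities and labels are hypothetical, the function names are illustrative, and treating a probability at or above the cutoff as positive is an assumed convention.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic."""
    return sensitivity + specificity - 1


def best_cutoff(probabilities, labels):
    """Return (cutoff, J) maximizing J; predict positive when p >= cutoff."""
    pos = sum(labels)
    neg = len(labels) - pos
    best = None
    for cutoff in sorted(set(probabilities)):
        tp = sum(1 for p, y in zip(probabilities, labels) if p >= cutoff and y == 1)
        tn = sum(1 for p, y in zip(probabilities, labels) if p < cutoff and y == 0)
        j = youden_index(tp / pos, tn / neg)
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


# J implied by the sensitivity and specificity reported in the abstract.
reported_j = youden_index(0.720, 0.751)
print(f"J at the reported cutoff: {reported_j:.3f}")

# Hypothetical predicted probabilities and true labels (1 = lesion present).
probabilities = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
toy_cutoff, toy_j = best_cutoff(probabilities, labels)
print(f"toy cutoff: {toy_cutoff}, J = {toy_j}")
```

When several cutoffs tie on J, this sketch keeps the lowest one; other tie-breaking rules are equally defensible.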