Qing Tian1,2,3, Ning-Bo Yang4, Yu Fan2,3, Fang Dong3, Qi-Jing Bo3, Fu-Chun Zhou3, Ji-Cong Zhang5, Liang Li6, Guang-Zhong Yin2, Chuan-Yue Wang3,7, Ming Fan1,8.
Abstract
Background: The search for a biomarker-based method to distinguish patients with schizophrenia from healthy individuals has occupied researchers for decades. However, no single indicator performs well enough to be used in clinical practice. We aim to develop a comprehensive machine learning pipeline that combines neurocognitive and electrophysiological features to distinguish patients with schizophrenia from healthy people.
Keywords: biomarker; classification; electroencephalography; electrophysiology; machine learning; neurocognition; prepulse inhibition (PPI); schizophrenia
Year: 2022 PMID: 35449564 PMCID: PMC9016153 DOI: 10.3389/fpsyt.2022.810362
Source DB: PubMed Journal: Front Psychiatry ISSN: 1664-0640 Impact factor: 5.435
Demographic and clinical characteristics of healthy control and schizophrenia group.
| Factor | CON (n = 50) | SCZ (n = 69) | χ2/t | P |
| Gender (male/female) | 38/12 | 49/20 | 0.37 | 0.545 |
| Age (year) | 42.2 ± 8.8 | 44.8 ± 7.0 | −1.80 | 0.074 |
| Education (years of schooling) | 10.9 ± 3.1 | 10.8 ± 2.5 | 0.15 | 0.882 |
| Smoking (yes/no) | 24/26 | 35/34 | 0.09 | 0.769 |
| Duration of illness (year) | | 19.7 ± 8.3 | | |
| Age at onset (year) | | 24.5 ± 6.6 | | |
| CPZe (mg/day) | | 292.7 ± 265.3 | | |
| PANSS score | | 63.3 ± 13.1 | | |
| Positive Symptoms | | 12.8 ± 4.5 | | |
| Negative Symptoms | | 20.0 ± 6.4 | | |
| General Psychopathology | | 30.5 ± 5.6 | | |
Mean ± SD are reported for age, education, duration of illness, age at onset, CPZe, and all PANSS scores.
CON, Healthy Control Group; SCZ, Schizophrenia Group; CPZe, Chlorpromazine Equivalent Doses; PANSS, Positive and Negative Syndrome Scale.
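The χ2 values reported for the categorical rows can be checked directly from the counts in the table. The sketch below assumes an uncorrected Pearson chi-square test on a 2 × 2 table (the source does not state whether a continuity correction was applied); the function name `pearson_chi2_2x2` is illustrative and not taken from the paper.

```python
import math

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Assumption: the uncorrected test is used; the source does not say so explicitly.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the chi-square survival function reduces to erfc.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Counts from the table: gender (male/female) and smoking (yes/no), CON row then SCZ row.
gender_chi2, gender_p = pearson_chi2_2x2(38, 12, 49, 20)
smoking_chi2, smoking_p = pearson_chi2_2x2(24, 26, 35, 34)
print(f"gender:  chi2 = {gender_chi2:.2f}, P = {gender_p:.3f}")
print(f"smoking: chi2 = {smoking_chi2:.2f}, P = {smoking_p:.3f}")
```

Under this assumption the results agree with the tabulated values (0.37 and 0.545 for gender, 0.09 and 0.769 for smoking), which suggests that no continuity correction was used.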
Statistical comparison of neurocognitive and electrophysiological features.
| Features | CON (n = 50) | SCZ (n = 69) | t | P |
| IMM | 89.24 ± 19.73 | 54.70 ± 15.30 | 10.75 | 0.000 |
| VC | 89.92 ± 20.65 | 78.81 ± 15.47 | 3.36 | 0.003 |
| LAN | 91.88 ± 16.30 | 77.68 ± 12.51 | 5.38 | 0.000 |
| ATT | 104.34 ± 15.94 | 91.09 ± 12.26 | 5.13 | 0.000 |
| DEM | 90.66 ± 19.16 | 64.39 ± 18.32 | 7.57 | 0.000 |
| INT-C | 4.21 ± 3.88 | 7.39 ± 5.95 | –3.30 | 0.003 |
| INT-W | 16.78 ± 9.27 | 24.69 ± 12.85 | –3.71 | 0.001 |
| PSC-PPI (%) | 31.70 ± 26.15 | 11.26 ± 29.07 | 3.95 | 0.001 |
| PSS-PPI (%) | 50.65 ± 25.92 | 14.70 ± 25.30 | 7.57 | 0.000 |
| Abs-D (μV2) | 10.43 ± 12.57 | 11.01 ± 13.61 | –0.23 | 0.824 |
| Abs-T (μV2) | 3.30 ± 3.60 | 6.67 ± 7.96 | –2.79 | 0.014 |
| Abs-A (μV2) | 7.19 ± 8.18 | 9.78 ± 10.56 | –1.45 | 0.264 |
| Abs-B (μV2) | 0.80 ± 0.73 | 0.99 ± 1.07 | –1.09 | 0.376 |
| Abs-A/T | 2.79 ± 3.25 | 2.25 ± 2.41 | 1.03 | 0.393 |
| Abs-A/B | 9.17 ± 7.99 | 10.77 ± 9.32 | –0.98 | 0.397 |
| Abs-(D + T)/(A + B) | 3.67 ± 6.58 | 2.27 ± 2.55 | 1.61 | 0.202 |
| Abs-(D + T)L/(D + T)R | 1.00 ± 0.26 | 1.05 ± 0.33 | –1.01 | 0.393 |
| Abs-AL/AR | 0.98 ± 0.33 | 0.97 ± 0.28 | 0.22 | 0.824 |
| Abs-AFp/AO | 1.49 ± 1.75 | 0.69 ± 0.61 | 3.51 | 0.002 |
| Rel-D | 3.64 ± 1.54 | 2.98 ± 1.31 | 2.53 | 0.028 |
| Rel-T | 1.37 ± 0.65 | 1.85 ± 1.01 | –2.94 | 0.010 |
| Rel-A | 2.32 ± 1.16 | 2.55 ± 0.89 | –1.22 | 0.376 |
| Rel-B | 0.36 ± 0.15 | 0.34 ± 0.18 | 0.72 | 0.550 |
| Rel-A/T | 2.46 ± 2.45 | 2.01 ± 1.82 | 1.14 | 0.376 |
| Rel-A/B | 8.35 ± 6.62 | 9.71 ± 6.61 | –1.11 | 0.376 |
| Rel-(D + T)/(A + B) | 2.69 ± 2.53 | 2.18 ± 2.36 | 1.13 | 0.376 |
| Rel-(D + T)L/(D + T)R | 1.02 ± 0.13 | 1.05 ± 0.14 | –1.10 | 0.376 |
| Rel-AL/AR | 0.98 ± 0.13 | 0.98 ± 0.12 | 0.37 | 0.754 |
| Rel-AFp/AO | 0.76 ± 0.42 | 0.80 ± 0.24 | –0.61 | 0.592 |
| DFA-D | 0.73 ± 0.04 | 0.75 ± 0.07 | –2.06 | 0.082 |
| DFA-T | 0.68 ± 0.05 | 0.70 ± 0.06 | –2.38 | 0.039 |
| DFA-A | 0.77 ± 0.09 | 0.71 ± 0.09 | 3.42 | 0.003 |
| DFA-B | 0.66 ± 0.07 | 0.61 ± 0.06 | 4.21 | 0.000 |
| FD | 1.60 ± 0.04 | 1.61 ± 0.04 | –0.70 | 0.550 |
Mean ± SD are reported for all features.
CON, Healthy Control Group; SCZ, Schizophrenia Group; IMM, immediate memory score; VC, visuospatial/constructional score; LAN, language score; ATT, attention score; DEM, delayed memory score; INT-C, color interference time; INT-W, word interference time; PPI, prepulse inhibition; PSC-PPI, perceived spatial co-location PPI; PSS-PPI, perceived spatial separation PPI; Abs, absolute power spectra; Rel, relative power spectra; D, T, A, B denote delta, theta, alpha, and beta frequency band, respectively; L, left; R, right; Fp, frontal pole; O, occipital; DFA, detrended fluctuation analysis; FD, fractal dimension.
*P < 0.05; **P < 0.01; ***P < 0.001.
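The t statistics in this table can be recomputed from the reported means and standard deviations. The sketch below assumes a pooled-variance (Student) two-sample t-test and group sizes of 50 controls and 69 patients, taken from the gender counts in the demographic table (38 + 12 and 49 + 20); the function name `pooled_t` is illustrative.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled variance, from summary statistics.

    Assumption: equal-variance Student test; the source does not name the variant.
    Returns the t statistic and its degrees of freedom.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df

N_CON, N_SCZ = 50, 69  # assumed from the gender counts in the demographic table

t_imm, df = pooled_t(89.24, 19.73, N_CON, 54.70, 15.30, N_SCZ)      # IMM row
t_pss, _ = pooled_t(50.65, 25.92, N_CON, 14.70, 25.30, N_SCZ)       # PSS-PPI row
print(f"IMM:     t({df}) = {t_imm:.2f}")
print(f"PSS-PPI: t({df}) = {t_pss:.2f}")
```

With these assumptions the IMM and PSS-PPI rows give t ≈ 10.75 and t ≈ 7.57, matching the table; a Welch test would give a smaller value for IMM, so the pooled form appears to be the one reported.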
Cohen’s d and the classification performance of each single feature.
| Features | Cohen’s d | Accuracy (%) | Sensitivity (%) | Specificity (%) | AUC (%) |
| IMM | 1.42 | 84.03 | 73.91 | 98.00 | 91.87 |
| VC | 0.60 | 69.75 | 95.65 | 34.00 | 64.65 |
| LAN | 0.90 | 76.47 | 75.36 | 78.00 | 75.74 |
| ATT | 0.86 | 77.31 | 85.51 | 66.00 | 77.84 |
| DEM | 1.16 | 83.19 | 82.61 | 84.00 | 86.58 |
| INT-C | 0.59 | 64.71 | 53.62 | 80.00 | 66.00 |
| INT-W | 0.65 | 67.23 | 53.62 | 86.00 | 71.26 |
| PSC-PPI | 0.69 | 70.59 | 76.81 | 62.00 | 70.61 |
| PSS-PPI | 1.16 | 80.67 | 79.71 | 82.00 | 84.32 |
| Abs-D | 0.04 | 53.78 | 49.28 | 60.00 | 50.00 |
| Abs-T | 0.50 | 67.23 | 62.32 | 74.00 | 67.13 |
| Abs-A | 0.27 | 68.07 | 86.96 | 42.00 | 61.71 |
| Abs-B | 0.20 | 60.5 | 72.46 | 44.00 | 57.88 |
| Abs-A/T | 0.19 | 37.82 | 7.25 | 80.00 | 48.7 |
| Abs-A/B | 0.18 | 67.23 | 91.3 | 34.00 | 58.43 |
| Abs-(D + T)/(A + B) | 0.30 | 63.87 | 81.16 | 40.00 | 56.96 |
| Abs-(D + T)L/(D + T)R | 0.19 | 53.78 | 23.19 | 96.00 | 54.26 |
| Abs-AL/AR | 0.04 | 42.86 | 43.48 | 42.00 | 51.54 |
| Abs-AFp/AO | 0.62 | 68.07 | 76.81 | 56.00 | 67.68 |
| Rel-D | 0.46 | 66.39 | 82.61 | 44.00 | 63.1 |
| Rel-T | 0.53 | 63.03 | 62.32 | 64.00 | 64.35 |
| Rel-A | 0.23 | 63.87 | 78.26 | 44.00 | 56.23 |
| Rel-B | 0.13 | 53.78 | 36.23 | 78.00 | 56.12 |
| Rel-A/T | 0.21 | 56.3 | 52.17 | 62.00 | 52.87 |
| Rel-A/B | 0.21 | 65.55 | 89.86 | 32.00 | 58.35 |
| Rel-(D + T)/(A + B) | 0.21 | 63.03 | 79.71 | 40.00 | 55.59 |
| Rel-(D + T)L/(D + T)R | 0.20 | 58.82 | 56.52 | 62.00 | 56.61 |
| Rel-AL/AR | 0.07 | 47.06 | 46.38 | 48.00 | 47.8 |
| Rel-AFp/AO | 0.11 | 60.5 | 66.67 | 52.00 | 59.13 |
| DFA-D | 0.38 | 57.14 | 30.43 | 94.00 | 58.52 |
| DFA-T | 0.43 | 61.34 | 62.32 | 60.00 | 61.57 |
| DFA-A | 0.61 | 68.07 | 63.77 | 74.00 | 70.1 |
| DFA-B | 0.73 | 71.43 | 75.36 | 66.00 | 71.94 |
| FD | 0.13 | 54.62 | 46.38 | 66.00 | 53.46 |
AUC, area under receiver operating characteristic curve; IMM, immediate memory score; VC, visuospatial/constructional score; LAN, language score; ATT, attention score; DEM, delayed memory score; INT-C, color interference time; INT-W, word interference time; PPI, prepulse inhibition; PSC-PPI, perceived spatial co-location PPI; PSS-PPI, perceived spatial separation PPI; Abs, absolute power spectra; Rel, relative power spectra; D, T, A, B denote delta, theta, alpha, and beta frequency band, respectively; L, left; R, right; Fp, frontal pole; O, occipital; DFA, detrended fluctuation analysis; FD, fractal dimension.
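Accuracy, sensitivity, and specificity in this table are linked through the group sizes. The sketch below recovers the confusion-matrix counts from the reported percentages, assuming schizophrenia is the positive class and group sizes of 69 patients and 50 controls (from the gender counts in the demographic table); the function names are illustrative.

```python
def confusion_counts(sensitivity_pct, specificity_pct, n_patients, n_controls):
    """Recover (TP, FN, TN, FP) from percentages, assuming patients are the positive class."""
    tp = round(sensitivity_pct / 100.0 * n_patients)
    tn = round(specificity_pct / 100.0 * n_controls)
    return tp, n_patients - tp, tn, n_controls - tn

def accuracy_pct(tp, fn, tn, fp):
    """Overall accuracy in percent."""
    return 100.0 * (tp + tn) / (tp + fn + tn + fp)

N_SCZ, N_CON = 69, 50  # assumed from the gender counts in the demographic table

# IMM row: sensitivity 73.91 %, specificity 98.00 %.
tp, fn, tn, fp = confusion_counts(73.91, 98.00, N_SCZ, N_CON)
accuracy = accuracy_pct(tp, fn, tn, fp)
print(f"TP={tp} FN={fn} TN={tn} FP={fp} accuracy={accuracy:.2f}%")
```

For the IMM row this gives 51 of 69 patients and 49 of 50 controls correctly classified, that is 100 of 119 participants, consistent with the tabulated accuracy of 84.03%.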
Classification performances of combined features.
| Feature set | Accuracy (%) | Sensitivity (%) | Specificity (%) | AUC (%) |
| NSF subset | | | | |
| Logistic regression algorithm | 82.35 | 88.41 | 74.00 | 89.88 |
| Random forest algorithm | 88.24 | 82.61 | 96.00 | 96.59 |
| XGBoost algorithm | 89.08 | 89.86 | 88.00 | 93.99 |
| ESF subset | | | | |
| Logistic regression algorithm | 82.35 | 86.96 | 76.00 | 90.84 |
| Random forest algorithm | 84.87 | 91.30 | 76.00 | 91.88 |
| XGBoost algorithm | 88.24 | 89.86 | 86.00 | 90.52 |
| ASF set | | | | |
| Logistic regression algorithm | 87.39 | 92.75 | 80.00 | 92.54 |
| Random forest algorithm | 93.28 | 94.20 | 92.00 | 97.36 |
| XGBoost algorithm | 93.28 | 91.30 | 96.00 | 97.91 |
NSF subset, the Neurocognitive Selected Features subset, includes the IMM, LAN, ATT, DEM, INT-C, and INT-W features; ESF subset, the Electrophysiological Selected Features subset, includes PSC-PPI, PSS-PPI, Abs-T, Abs-A, Abs-AFp/AO, Abs-(D + T)/(A + B), Rel-D, Rel-T, Rel-A/B, DFA-A, and DFA-B; ASF set, the All Selected Features set, includes the NSF subset and the ESF subset.
FIGURE 1 Receiver operating characteristic (ROC) curves for classification of schizophrenia patients and controls based on different combinations of features using the logistic regression (A), random forest (B), and XGBoost (C) algorithms. NSF subset, the Neurocognitive Selected Features subset, includes the IMM, LAN, ATT, DEM, INT-C, and INT-W features; ESF subset, the Electrophysiological Selected Features subset, includes PSC-PPI, PSS-PPI, Abs-T, Abs-A, Abs-AFp/AO, Abs-(D + T)/(A + B), Rel-D, Rel-T, Rel-A/B, DFA-A, and DFA-B; ASF set, the All Selected Features set, includes the NSF subset and the ESF subset. The red, green, and blue lines show the ROC curves for the NSF subset, ESF subset, and ASF set, respectively. The AUCs of the logistic regression models based on the NSF subset, ESF subset, and ASF set were 89.88%, 90.84%, and 92.54%, respectively. The AUCs of the random forest models based on the NSF subset, ESF subset, and ASF set were 96.59%, 91.88%, and 97.36%, respectively. The AUCs of the XGBoost models based on the NSF subset, ESF subset, and ASF set were 93.99%, 90.52%, and 97.91%, respectively.
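For readers unfamiliar with the AUC values quoted in the caption, the sketch below computes an AUC as the probability that a randomly chosen patient receives a higher model score than a randomly chosen control (the Mann-Whitney formulation). The scores are hypothetical and are not data from the study; the function name `roc_auc` is illustrative.

```python
def roc_auc(patient_scores, control_scores):
    """AUC as the fraction of (patient, control) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in patient_scores:
        for c in control_scores:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(patient_scores) * len(control_scores))

# Hypothetical predicted probabilities of schizophrenia (not from the paper).
example_patients = [0.9, 0.8, 0.4]
example_controls = [0.7, 0.3, 0.2]
example_auc = roc_auc(example_patients, example_controls)
print(f"AUC = {100 * example_auc:.2f}%")
```

In this toy example 8 of the 9 patient-control pairs are ordered correctly, so the AUC is 8/9. The pairwise loop is quadratic in the sample size, which is acceptable for a cohort of this scale.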
FIGURE 2 The horizontal and vertical axes represent the probability of schizophrenia (%). NSF subset, the Neurocognitive Selected Features subset, includes the IMM, LAN, ATT, DEM, INT-C, and INT-W features; ESF subset, the Electrophysiological Selected Features subset, includes PSC-PPI, PSS-PPI, Abs-T, Abs-A, Abs-AFp/AO, Abs-(D + T)/(A + B), Rel-D, Rel-T, Rel-A/B, DFA-A, and DFA-B; ASF set, the All Selected Features set, includes the NSF subset and the ESF subset. (A–C) Scatter plots from the two feature sets (NSF subset and ESF subset) using the logistic regression, random forest, and XGBoost models.
Correlations of all features with CPZe and duration of illness in patients.
| Features | CPZe: r | CPZe: P | Duration of illness: r | Duration of illness: P |
| IMM | 0.100 | 0.398 | –0.050 | 0.671 |
| VC | 0.000 | 0.991 | –0.020 | 0.861 |
| LAN | –0.100 | 0.432 | 0.140 | 0.245 |
| ATT | –0.090 | 0.478 | 0.070 | 0.563 |
| DEM | 0.080 | 0.518 | 0.020 | 0.849 |
| INT-C | –0.010 | 0.955 | –0.010 | 0.955 |
| INT-W | –0.180 | 0.132 | –0.010 | 0.926 |
| PSC-PPI | –0.150 | 0.233 | 0.080 | 0.507 |
| PSS-PPI | –0.200 | 0.101 | 0.190 | 0.115 |
| Abs-D | 0.050 | 0.677 | 0.040 | 0.768 |
| Abs-T | –0.050 | 0.708 | 0.120 | 0.330 |
| Abs-A | 0.050 | 0.667 | 0.140 | 0.264 |
| Abs-B | 0.140 | 0.254 | 0.130 | 0.283 |
| Abs-A/T | 0.150 | 0.222 | –0.010 | 0.912 |
| Abs-A/B | –0.050 | 0.674 | 0.050 | 0.705 |
| Abs-(D + T)/(A + B) | –0.090 | 0.456 | –0.010 | 0.937 |
| Abs-(D + T)L/(D + T)R | –0.010 | 0.952 | 0.120 | 0.321 |
| Abs-AL/AR | –0.080 | 0.496 | 0.100 | 0.421 |
| Abs-AFp/AO | 0.030 | 0.799 | –0.010 | 0.953 |
| Rel-D | 0.040 | 0.763 | –0.060 | 0.610 |
| Rel-T | –0.210 | 0.078 | 0.070 | 0.547 |
| Rel-A | 0.080 | 0.513 | 0.030 | 0.803 |
| Rel-B | 0.010 | 0.926 | –0.020 | 0.879 |
| Rel-A/T | 0.150 | 0.211 | –0.020 | 0.893 |
| Rel-A/B | –0.040 | 0.740 | 0.070 | 0.544 |
| Rel-(D + T)/(A + B) | –0.100 | 0.405 | –0.020 | 0.880 |
| Rel-(D + T)L/(D + T)R | 0.130 | 0.280 | 0.050 | 0.677 |
| Rel-AL/AR | –0.170 | 0.159 | 0.080 | 0.519 |
| Rel-AFp/AO | –0.020 | 0.850 | 0.090 | 0.478 |
| DFA-D | 0.080 | 0.503 | 0.160 | 0.197 |
| DFA-T | –0.010 | 0.947 | 0.080 | 0.496 |
| DFA-A | –0.040 | 0.720 | –0.040 | 0.752 |
| DFA-B | 0.040 | 0.740 | 0.180 | 0.137 |
| FD | 0.130 | 0.281 | –0.080 | 0.519 |
P-values are from Spearman rank correlation analysis and were adjusted using the false discovery rate (FDR).
CPZe, Chlorpromazine Equivalent Doses; IMM, immediate memory score; VC, visuospatial/constructional score; LAN, language score; ATT, attention score; DEM, delayed memory score; INT-C, color interference time; INT-W, word interference time; PPI, prepulse inhibition; PSC-PPI, perceived spatial co-location PPI; PSS-PPI, perceived spatial separation PPI; Abs, absolute power spectra; Rel, relative power spectra; D, T, A, B denote delta, theta, alpha, and beta frequency band, respectively; L, left; R, right; Fp, frontal pole; O, occipital; DFA, detrended fluctuation analysis; FD, fractal dimension.
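The table note states that P-values were FDR-adjusted but does not name the procedure. The sketch below shows the Benjamini-Hochberg step-up adjustment, which is a common choice and is assumed here for illustration only; the input P-values are hypothetical and the function name `benjamini_hochberg` is not from the paper.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the original order.

    Assumption: this is the FDR procedure meant by the table note; the source
    does not specify which one was used.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity of adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw P-values (not from the paper).
raw_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(raw_p)
print([round(p, 4) for p in adjusted_p])
```

Each raw P-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values grow. This is why several features in a table can share the same adjusted P-value.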