Yuyang Luo, Tara L Alvarez, Jeffrey M Halperin, Xiaobo Li.
Abstract
Attention-deficit/hyperactivity disorder (ADHD) is a highly prevalent and heterogeneous neurodevelopmental disorder that is diagnosed using subjective symptom reports. Machine learning classifiers have been used to help develop neuroimaging-based biomarkers for objective diagnosis of ADHD. However, existing ADHD studies that rely on basic models report suboptimal classification performance and inconclusive results, mainly because each type of basic classifier has limited flexibility to handle multi-dimensional source features with varying properties. This study applied ensemble learning techniques (ELTs), meta-algorithms that combine several basic machine learning models into one predictive model in order to decrease variance, decrease bias, or improve predictions, to multimodal neuroimaging data collected from 72 young adults, including 36 probands (18 remitters and 18 persisters of childhood ADHD) and 36 group-matched controls. All currently available optimization strategies for ELTs (i.e., voting, bagging, boosting and stacking techniques) were tested in a pool of semifinal classification results generated by seven basic classifiers. The high-dimensional neuroimaging features for classification included regional cortical gray matter (GM) thickness and surface area, GM volume of subcortical structures, volume and fractional anisotropy of major white matter fiber tracts, and pair-wise regional connectivity and global/nodal topological properties of the functional brain network for the cue-evoked attention process. The bagging-based ELT with a support vector machine as the base model achieved the best results, with significant improvement of the area under the receiver operating characteristic curve (0.89 for ADHD vs. controls and 0.9 for ADHD persisters vs. remitters). Features of nodal efficiency in right inferior frontal gyrus, right middle frontal (MFG)-inferior parietal (IPL) functional connectivity, and right amygdala volume significantly contributed to accurate discrimination between ADHD probands and controls; higher nodal efficiency of right MFG greatly contributed to inattentive and hyperactive/impulsive symptom remission, while higher right MFG-IPL functional connectivity was strongly linked to symptom persistence in adults with childhood ADHD. Given their greater robustness compared with the commonly implemented basic classifiers, these findings suggest that ELTs may have the potential to identify more reliable neurobiological markers for neurodevelopmental disorders.
Keywords: ADHD; Classification; Ensemble learning; Machine learning; Persistence; Remission
Year: 2020; PMID: 32182578; PMCID: PMC7076568; DOI: 10.1016/j.nicl.2020.102238
Source DB: PubMed; Journal: Neuroimage Clin; ISSN: 2213-1582; Impact factor: 4.881
Demographic and clinical characteristics in groups of controls and ADHD probands (and further in the sub-groups of remitters and persisters of the ADHD probands).
| | Controls (n = 36) | ADHD (n = 36) | p | Remitted (n = 18) | Persistent (n = 18) | p |
|---|---|---|---|---|---|---|
| | Mean (SD) | Mean (SD) | p | Mean (SD) | Mean (SD) | p |
| Age | 24.3 (2.3) | 24.66 (2.0) | 0.48 | 24.79 (2.2) | 24.52 (2.0) | 0.7 |
| Full-scale IQ | 103.83 (15.4) | 97.96 (14.1) | 0.1 | 99.22 (14.9) | 96.71 (13.6) | 0.6 |
| Conners’ Adult ADHD Rating Scale (T score) | ||||||
| Inattentive | 45.75 (8.8) | 56.5 (13.2) | <0.001 | 49.83 (10.9) | 63.17 (12.0) | 0.001 |
| Hyperactive/impulsive | 42.97 (6.2) | 53.64 (12.9) | <0.001 | 46.17 (9.0) | 61.11 (12.0) | <0.001 |
| ADHD Total | 43.89 (8.2) | 56.5 (14.7) | <0.001 | 42.61 (7.5) | 54.33 (8.8) | <0.001 |
| ADHD semistructured interview (number of symptoms) | 0.79 (1.6) | 6.17 (5.2) | <0.001 | 2.64 (2.0) | 10.24 (3.6) | <0.01 |
| | N (%) | N (%) | p | N (%) | N (%) | p |
| Male | 31 (86.1) | 30 (83.3) | 0.74 | 16 (88.9) | 14 (77.8) | 0.37 |
| Right-handed | 32 (88.9) | 32 (88.9) | 1 | 15 (83.3) | 16 (88.9) | 0.63 |
| Race | | | 0.17 | | | 0.59 |
| Caucasian | 15 (41.7) | 21 (58.3) | | 9 (50.0) | 12 (66.7) | |
| African American | 13 (36.1) | 7 (19.4) | | 4 (22.2) | 3 (16.7) | |
| More than one race | 6 (16.7) | 8 (22.2) | | 5 (27.8) | 3 (16.7) | |
| Asian | 2 (5.6) | 0 (0.0) | | 0 (0.0) | 0 (0.0) | |
| Ethnicity | | | 0.09 | | | 0.74 |
| Hispanic/Latino | 10 (27.8) | 17 (47.2) | | 8 (44.4) | 9 (50.0) | |
| Task performance measures | Mean (SD) | Mean (SD) | p | Mean (SD) | Mean (SD) | p |
| Reaction time average | 395.8 (53.1) | 422.8 (74.3) | 0.08 | 431.1 (67.0) | 439.1 (107.8) | 0.79 |
| Reaction time std | 129.6 (24.8) | 137.2 (29.9) | 0.25 | 136.2 (27.6) | 138.2 (32.8) | 0.84 |
| Anticipation error | 1.86 (2.1) | 1.74 (1.6) | 0.78 | 1.69 (1.6) | 1.78 (1.7) | 0.88 |
| Commission error | 0.33 (0.8) | 0.85 (1.4) | 0.07 | 0.75 (1.6) | 0.94 (1.3) | 0.7 |
| Omission error | 4.97 (5.8) | 8 (10.8) | 0.15 | 4.38 (4.0) | 11.22 (13.8) | 0.06 |
Fig. 1. The ensemble learning flowchart. (sMRI: structural MRI; DTI: diffusion tensor imaging; fMRI: functional MRI; CV: cross-validation; LOOCV: leave-one-out cross-validation; AUC: the area under the receiver operating characteristic curve; ELTs: ensemble learning techniques).
The hyperparameters of 7 basic models and 4 ELTs-based models. (ELTs: ensemble learning techniques; KNN: k-nearest neighbors; SVM: support vector machine; LR: logistic regression; RF: random forest; LDA: linear discriminant analysis; MLP: multilayer perceptron).
| Classifiers | Hyperparameters |
|---|---|
| KNN | n_neighbors: [1, 3, 5, 7, 9]; algorithm: [‘auto’, ‘ball_tree’, ‘kd_tree’, ‘brute’]; p: [1, 2, 3] |
| SVM | C: [0.001, 0.01, 0.1, 1, 10, 100, 1000]; gamma: [‘auto’, ‘scale’]; kernel: [‘linear’, ‘rbf’, ‘poly’, ‘sigmoid’] |
| LR | solver: [‘newton-cg’, ‘lbfgs’, ‘sag’, ‘saga’]; multi_class: [‘ovr’, ‘multinomial’, ‘auto’] |
| RF | n_estimators: list(range(3, 60, 5)); criterion: [‘gini’, ‘entropy’]; min_samples_leaf: [3, 5, 10]; max_depth: [3, 4, 5, 6]; min_samples_split: [3, 5, 10]; bootstrap: [True, False] |
| LDA | solver: [‘svd’, ‘lsqr’, ‘eigen’] |
| MLP | activation: [‘identity’, ‘logistic’, ‘tanh’, ‘relu’]; solver: [‘lbfgs’, ‘sgd’, ‘adam’]; hidden_layer_sizes: np.arange(1, 72, 10); max_iter: [4000] |
| ELT-Voting | estimators; voting: [‘hard’, ‘soft’] |
| ELT-Bagging | base_estimator; n_estimators: list(range(10, 150, 10)); max_samples: [0.2, 0.3, 0.4, 0.5]; max_features: [0.5, 0.6, 0.7, 0.8, 0.9, 1.0] |
| ELT-Boosting | base_estimator; n_estimators: list(range(10, 150, 10)); learning_rate: np.arange(0.01, 1, 0.01) |
| ELT-Stacking | classifiers; meta_classifiers |
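
As an illustration of how a bagging grid like the one in the table could be searched, the following is a minimal sketch using scikit-learn. It is not the authors' pipeline: the synthetic data from `make_classification`, the fixed linear SVM base model, the 5-fold stratified cross-validation, and the reduced grid values are all assumptions chosen to keep the example small. The base model is passed positionally because the keyword was renamed from `base_estimator` to `estimator` in newer scikit-learn releases.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# Synthetic stand-in for the imaging features (assumption): 72 subjects,
# two roughly balanced groups, 40 features.
X, y = make_classification(
    n_samples=72, n_features=40, n_informative=8, random_state=0
)

# Bagging ensemble with an SVM base model. The base model is given
# positionally so the code does not depend on the keyword name.
bagged_svm = BaggingClassifier(SVC(kernel="linear", C=1.0), random_state=0)

# Reduced subset of the bagging hyperparameters listed in the table
# (assumption: fewer values, to keep the search fast).
param_grid = {
    "n_estimators": [10, 30],
    "max_samples": [0.4, 0.5],
    "max_features": [0.5, 1.0],
}

# 5-fold stratified CV scored by AUC (assumption, not the study's CV scheme).
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(bagged_svm, param_grid, scoring="roc_auc", cv=cv)
search.fit(X, y)

best_params = search.best_params_
best_auc = search.best_score_
print(best_params, round(best_auc, 3))
```

Each bagged SVM is trained on a random fraction of subjects (`max_samples`) and features (`max_features`), and the ensemble aggregates their outputs, which is the variance-reduction idea behind bagging.
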
The results of 7 basic and 4 ELT-based classifications between the groups of ADHD and normal controls (Part I) as well as between the groups of ADHD persisters and ADHD remitters (Part II). (ELT: ensemble learning technique; KNN: k-nearest neighbors; SVM: support vector machine; LR: logistic regression; NB: naive Bayes; RF: random forest; LDA: linear discriminant analysis; MLP: multilayer perceptron; AUC: the area under the receiver operating characteristic curve; ADHD: attention deficit/hyperactivity disorder; NC: normal controls; ADHD-R: ADHD remitters; ADHD-P: ADHD persisters).
| Classifiers | Specificity | Sensitivity | Accuracy | AUC |
|---|---|---|---|---|
| Part I: ADHD vs. NC | ||||
| KNN | 0.72 | 0.66 | 0.689 | 0.69 |
| SVM | 0.942 | 0.69 | 0.816 | 0.87 |
| LR | 0.756 | 0.742 | 0.75 | 0.85 |
| NB | 0.778 | 0.718 | 0.748 | 0.86 |
| RF | 0.866 | 0.75 | 0.705 | 0.82 |
| LDA | 0.734 | 0.774 | 0.754 | 0.78 |
| MLP | 0.782 | 0.746 | 0.764 | 0.84 |
| ELT-Voting | 0.808 | 0.718 | 0.763 | 0.87 |
| ELT-Bagging | 0.734 | 0.798 | 0.766 | 0.89 |
| ELT-Boosting | 0.67 | 0.77 | 0.721 | 0.88 |
| ELT-Stacking | 0.756 | 0.742 | 0.75 | 0.82 |
| Part II: ADHD-P vs. ADHD-R | ||||
| KNN | 0.4 | 0.934 | 0.667 | 0.72 |
| SVM | 0.65 | 0.75 | 0.7 | 0.85 |
| LR | 0.6 | 0.682 | 0.642 | 0.85 |
| NB | 0.734 | 0.65 | 0.692 | 0.77 |
| RF | 0.734 | 0.6 | 0.667 | 0.76 |
| LDA | 0.568 | 0.518 | 0.542 | 0.63 |
| MLP | 0.634 | 0.75 | 0.692 | 0.84 |
| ELT-Voting | 0.8 | 0.65 | 0.725 | 0.82 |
| ELT-Bagging | 0.75 | 0.582 | 0.67 | 0.90 |
| ELT-Boosting | 0.75 | 0.682 | 0.717 | 0.86 |
| ELT-Stacking | 0.884 | 0.684 | 0.783 | 0.82 |
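
For readers unfamiliar with the four metrics reported in the table, the sketch below shows how specificity, sensitivity, accuracy, and AUC can be computed from true labels and classifier scores. It uses only the Python standard library; the toy labels and scores, the 0.5 decision threshold, and the function names are illustrative assumptions and do not come from the study.

```python
def classification_metrics(y_true, scores, threshold=0.5):
    """Return (specificity, sensitivity, accuracy) at a score threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    specificity = tn / (tn + fp)   # true negative rate
    sensitivity = tp / (tp + fn)   # true positive rate
    accuracy = (tp + tn) / len(y_true)
    return specificity, sensitivity, accuracy


def auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (assumption): 4 positives and 4 negatives with made-up scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

spec, sens, acc = classification_metrics(y_true, scores)
area = auc(y_true, scores)
print(spec, sens, acc, area)  # 0.75 0.75 0.75 0.875
```

Sensitivity, specificity, and accuracy depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds. When the two groups are the same size, accuracy equals the mean of sensitivity and specificity.
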
Fig. 2. The AUC of each classification procedure for discrimination between ADHD probands and normal controls (A), and between ADHD persisters and remitters (B). (KNN: k-nearest neighbors; SVM: support vector machine; LR: logistic regression; RF: random forest; LDA: linear discriminant analysis; MLP: multilayer perceptron; HC: hierarchical clustering; ROC: the receiver operating characteristic curve; AUC: the area under the ROC curve).
The importance scores of top three features of classifications between ADHD probands and normal controls, as well as between ADHD persisters and ADHD remitters. (FC: functional connectivity; NC: normal controls; ADHD: attention deficit/hyperactivity disorder; ADHD-P: ADHD persisters; ADHD-R: ADHD remitters).
| Feature | Importance Score |
|---|---|
| ADHD vs. NC | |
| Nodal efficiency of right Inferior Frontal gyrus | 0.134 |
| FC between right Middle Frontal gyrus and right Inferior Parietal lobule | 0.111 |
| Volume of right amygdala | 0.1 |
| ADHD-P vs. ADHD-R | |
| Nodal efficiency of right Middle Frontal gyrus | 1.028 |
| FC between right Middle Frontal gyrus and right Inferior Parietal lobule | 0.852 |
| Betweenness-centrality of left putamen | 0.677 |
Pearson correlation coefficient and mean squared error performance of regression models.
| Regression | Pearson Correlation Coefficient | MSE |
|---|---|---|
| T-Inattentive | | |
| OLS | | 126.3 |
| LASSO | | 124.6 |
| Ridge | | 126.1 |
| Elastic Net | | 121.1 |
| T-Hyperactive/Impulsive | | |
| OLS | | 126.5 |
| LASSO | | 123.3 |
| Ridge | | 126.3 |
| Elastic Net | | 119.8 |
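
The comparison in this table can be sketched as follows: fit each of the four linear regression models, obtain out-of-sample predictions by cross-validation, and report the Pearson correlation and mean squared error between observed and predicted values. This is a minimal sketch under stated assumptions, not the study's procedure: the data come from `make_regression`, the regularization strengths (`alpha`, `l1_ratio`) are arbitrary, and 5-fold cross-validation is assumed.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_predict

# Synthetic stand-in for features and symptom T-scores (assumption).
X, y = make_regression(
    n_samples=72, n_features=40, n_informative=5, noise=10.0, random_state=0
)

# The four model families named in the table; penalty strengths are
# illustrative defaults, not tuned values from the study.
models = {
    "OLS": LinearRegression(),
    "LASSO": Lasso(alpha=1.0, max_iter=10000),
    "Ridge": Ridge(alpha=1.0),
    "Elastic Net": ElasticNet(alpha=1.0, l1_ratio=0.5, max_iter=10000),
}

cv = KFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    # Out-of-sample prediction for every subject.
    y_pred = cross_val_predict(model, X, y, cv=cv)
    r = float(np.corrcoef(y, y_pred)[0, 1])     # Pearson correlation
    mse = float(np.mean((y - y_pred) ** 2))     # mean squared error
    results[name] = (r, mse)
    print(f"{name}: r = {r:.3f}, MSE = {mse:.1f}")
```

Elastic Net combines the L1 penalty of LASSO with the L2 penalty of Ridge, which can help when many features are correlated and only a few are informative.
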
The importance scores of top three features of Elastic Net regression for inattentive and hyperactive/impulsive symptom T-scores. (FC: functional connectivity).
| Feature | r | p | Importance Score |
|---|---|---|---|
| Inattentive | |||
| Nodal efficiency of right Inferior Frontal gyrus | −0.399 | 0.001 | 3.471 |
| FC between right Middle Frontal gyrus and right Inferior Parietal lobule | 0.405 | <0.001 | 2.126 |
| Volume of right Amygdala | −0.011 | 0.928 | 1.819 |
| Hyperactive/Impulsive | |||
| Nodal efficiency of right Inferior Frontal gyrus | −0.345 | 0.003 | 2.289 |
| FC between right Middle Frontal gyrus and right Inferior Parietal lobule | 0.361 | 0.002 | 2.134 |
| Nodal efficiency of right Middle Frontal gyrus | −0.333 | 0.004 | 1.997 |