Weizheng Yan1, Vince Calhoun2, Ming Song1, Yue Cui1, Hao Yan3, Shengfeng Liu1, Lingzhong Fan1, Nianming Zuo1, Zhengyi Yang1, Kaibin Xu1, Jun Yan3, Luxian Lv4, Jun Chen5, Yunchun Chen6, Hua Guo7, Peng Li3, Lin Lu3, Ping Wan7, Huaning Wang6, Huiling Wang5, Yongfeng Yang8, Hongxing Zhang9, Dai Zhang10, Tianzi Jiang11, Jing Sui12.
Abstract
BACKGROUND: Current fMRI-based classification approaches mostly use functional connectivity or spatial maps as input, instead of exploring the dynamic time courses directly, which does not leverage the full temporal information.Entities:
Keywords: Cerebellum; Deep learning; Multi-site classification; Recurrent neural network (RNN); Schizophrenia; Striatum; fMRI
Year: 2019 PMID: 31420302 PMCID: PMC6796503 DOI: 10.1016/j.ebiom.2019.08.023
Source DB: PubMed Journal: EBioMedicine ISSN: 2352-3964 Impact factor: 8.143
Fig. 1 The framework of the multi-scale RNN (MsRNN) model for distinguishing schizophrenia patients from healthy controls. (a) Data preprocessing and feature selection. All rsfMRI data were preprocessed using a standard procedure, and time courses were then extracted using group ICA, the AAL atlas, or the Brainnetome Atlas, respectively. (b) The TC/FNC data were randomly split into training, validation, and testing sets. In multi-site pooling classification, all seven datasets were pooled and k-fold cross-validation was used to evaluate classification performance. In leave-one-site-out transfer prediction, the samples of a given imaging site were held out for testing, and the samples of the other sites were used for training. Conventional methods (Adaboost, Random Forest, and SVM) and various RNN-based models were used for comparison, and the most discriminative components were identified with the leave-one-IC-out method. (c) Details of the MsRNN classification model. Convolutional filters at three different scales extract spatial features from the time courses; the extracted features are then concatenated, pooled, and fed into a stacked GRU module.
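Panel (c) describes the core MsRNN pipeline: convolutions at several temporal scales over the IC time courses, feature concatenation and pooling, then a GRU whose per-step outputs are averaged before classification. Below is a minimal numpy sketch of that forward pass. The kernel widths (3/5/7), filter count, hidden size, and the use of a single GRU layer (the paper stacks two) are illustrative assumptions, not the paper's actual settings:

```python
import numpy as np

# Hypothetical sizes -- the paper's real kernel widths, filter counts and
# GRU hidden size are not given in this excerpt.
T, C, F, H = 140, 50, 8, 16   # timepoints, ICs, filters per scale, GRU units
rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv1d_relu(x, w):
    """Valid 1-D convolution over time. x: (T, C), w: (k, C, F) -> (T-k+1, F)."""
    k = w.shape[0]
    out = np.stack([np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1]))
                    for t in range(x.shape[0] - k + 1)])
    return np.maximum(out, 0.0)

def gru_step(x, h, p):
    """Standard GRU cell. p = (Wz, Uz, Wr, Ur, Wh, Uh)."""
    Wz, Uz, Wr, Ur, Wh, Uh = p
    z = sigmoid(x @ Wz + h @ Uz)            # update gate
    r = sigmoid(x @ Wr + h @ Ur)            # reset gate
    h_new = np.tanh(x @ Wh + (r * h) @ Uh)  # candidate state
    return (1.0 - z) * h + z * h_new

def msrnn_forward(x, kernels, gru_p, w_out):
    # (1) convolutions at three temporal scales, cropped to a common length
    feats = [conv1d_relu(x, w) for w in kernels]
    L = min(f.shape[0] for f in feats)
    feat = np.concatenate([f[:L] for f in feats], axis=1)
    # (2) temporal max-pooling (stride 2)
    feat = feat[:(L // 2) * 2].reshape(L // 2, 2, -1).max(axis=1)
    # (3) GRU over the pooled sequence; the "_ave" variant averages all steps
    h = np.zeros(H)
    hs = []
    for t in range(feat.shape[0]):
        h = gru_step(feat[t], h, gru_p)
        hs.append(h)
    return sigmoid(np.mean(hs, axis=0) @ w_out)   # predicted P(patient)

# Random weights just to exercise the forward pass.
kernels = [rng.normal(0, 0.1, (k, C, F)) for k in (3, 5, 7)]
D = 3 * F
gru_p = tuple(rng.normal(0, 0.1, s) for s in
              [(D, H), (H, H), (D, H), (H, H), (D, H), (H, H)])
w_out = rng.normal(0, 0.1, H)

p = msrnn_forward(rng.normal(size=(T, C)), kernels, gru_p, w_out)
```

Averaging all GRU step outputs (rather than taking only the last step) matches the large gap the tables show between `GRU_1_last` and the `_ave` variants: with noisy rsfMRI time courses, the last hidden state alone carries little usable signal.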
Fig. 3 Comparison of different atlases and the leave-one-IC-out method. (a) MsRNN classification results using three different feature selection methods. * P < .05 (two-sample t-test). (b) The leave-one-IC-out method for estimating the contribution of each IC. (c) The top two discriminative independent components discovered using the leave-one-IC-out method.
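The leave-one-IC-out idea in panel (b) is to re-evaluate the classifier with each independent component's time course removed in turn and score a component by the resulting accuracy drop. A minimal sketch of that loop, using a toy nearest-centroid classifier as a stand-in for retraining the MsRNN (the classifier, data sizes, and signal placement here are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_train_eval(X, y):
    """Stand-in for retraining the real classifier: nearest-centroid on the
    temporal mean of each component's time course."""
    feats = X.mean(axis=1)                      # (subjects, n_ic)
    c0 = feats[y == 0].mean(axis=0)
    c1 = feats[y == 1].mean(axis=0)
    pred = (np.linalg.norm(feats - c1, axis=1)
            < np.linalg.norm(feats - c0, axis=1)).astype(int)
    return float((pred == y).mean())

def leave_one_ic_out(X, y, train_eval):
    """Contribution of each IC = accuracy drop when its time course is removed."""
    base = train_eval(X, y)
    return np.array([base - train_eval(np.delete(X, ic, axis=2), y)
                     for ic in range(X.shape[2])])

# Toy data: 40 subjects x 20 timepoints x 5 ICs; only IC 0 carries group signal.
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 20, 5))
X[:, :, 0] += 3.0 * y[:, None]

contrib = leave_one_ic_out(X, y, toy_train_eval)
```

The component whose removal hurts accuracy most is ranked as most discriminative, which is how the striatal and cerebellar components in panel (c) were selected.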
Performance comparison in multi-site pooling classification.
| | Methods | ACC | SEN | SPE | F1 | AUC |
|---|---|---|---|---|---|---|
| CON | Adaboost | 75.6(3.8)** | 77.0(4.4)** | 74.2(4.4)** | 76.2(3.8)** | 84.2(3.6)** |
| CON | Random Forest | 76.0(3.5)** | 81.0(3.9)○ | 71.4(5.5)** | 77.4(3.5)** | 84.0(3.4)** |
| CON | SVM | 79.4(3.1)* | 80.4(3.5)○ | 78.4(3.9)* | 79.6(3.3)* | 86.8(3.2)* |
| RNN | GRU_1_last | 51.6(3.6)** | 52.0(5.3)** | 51.2(4.3)** | 52.0(3.8)** | 51.2(3.6)** |
| RNN | GRU_1_ave | 77.8(3.4)** | 78.4(3.8)** | 77.0(3.5)** | 78.2(3.4)** | 86.8(3.5)* |
| RNN | GRU_2_ave | 78.0(3.9)** | 80.8(5.1)○ | 76.0(4.2)** | 78.8(3.9)* | 86.8(4.1)* |
| CMLP | Multi_CNN_MLP | 77.8(3.4)** | 76.2(4.0)** | 79.2(4.8)○ | 77.2(3.4)** | 86.4(3.1)** |
| CRNN | Simple_CNN_GRU_2_ave | 80.8(3.0)○ | 80.2(4.3)○ | 82.0(3.5)○ | 80.8(3.1)○ | 89.2(2.8)○ |
| CRNN | Multi_CNN_GRU_1_ave | 80.6(3.5)○ | 80.8(4.1)○ | 80.6(4.3)○ | 80.8(3.3)○ | 88.2(3.6)○ |
| CRNN | Multi_CNN_GRU_2_ave | 81.2(3.4)○ | 81.4(4.1)○ | 81.0(4.9)○ | 81.0(3.5)○ | 88.6(3.7)○ |
| CRNN | Multi_CNN_LSTM_2_ave | 81.6(2.9)○ | 82.6(3.6)○ | 80.4(3.8)○ | 82.0(2.7)○ | 89.4(2.8)○ |
| CRNN | MsRNN(Proposed) | 83.2(3.2) | 83.1(3.7) | 83.5(3.7) | 83.3(3.2) | 90.6(3.0) |
CON: conventional classification methods; RNN: RNN-based methods; CMLP: CNN linked with a multi-layer perceptron; CRNN: CNN-RNN-based methods; SVM: support vector machine with Gaussian kernel; LSTM: long short-term memory network; GRU: gated recurrent unit. GRU_1: one GRU layer; GRU_2: two-layer stacked GRU; #_last: the output of the last GRU step is connected to the next layer; #_ave: the average of the outputs of all GRU steps is connected to the next layer. Simple_CNN: convolutional layer with a fixed kernel size; Multi_CNN: convolutional layer with multiple kernel sizes. ○ denotes no significant difference from the proposed model (two-sample t-test); */** denote that the method is significantly worse than the proposed model at P < .05/P < .01, respectively. Details of all these architectures are shown in Supplementary Fig. S2. The last row is the proposed method.
Fig. 2 Classification results of multi-site pooling and leave-one-site-out transfer classification. (a) 5-fold multi-site pooling classification results. ** P < .01, * P < .05 (two-sample t-test). (b) Comparison of receiver operating characteristic curves of different methods. (c) Leave-one-site-out transfer classification results. (d) t-SNE visualization of the last hidden-layer representation in the MsRNN for SZ/HC classification. The MsRNN's internal representation of SZ and HC is shown by applying t-SNE, a method for visualizing high-dimensional data, to the last hidden layer for the training (Sites 1–6: 951 subjects) and testing (Site 7: 149 subjects) samples.
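The leave-one-site-out transfer protocol in panel (c) holds out all subjects from one imaging site for testing and trains on the remaining sites, one fold per site. A minimal sketch of that split logic (the toy site labels are illustrative):

```python
import numpy as np

def leave_one_site_out_splits(site_labels):
    """Yield (held_out_site, train_idx, test_idx): one fold per imaging site."""
    site_labels = np.asarray(site_labels)
    for s in np.unique(site_labels):
        yield s, np.flatnonzero(site_labels != s), np.flatnonzero(site_labels == s)

# Toy example: 10 subjects drawn from 3 sites.
sites = [1, 1, 2, 2, 2, 3, 3, 1, 2, 3]
folds = list(leave_one_site_out_splits(sites))
```

Unlike k-fold pooling, this split guarantees the test site's scanner and acquisition protocol are never seen during training, so it measures cross-site generalization directly.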
Performance comparison in leave-one-site-out classification.
| | Methods | ACC | SEN | SPE | F1 | AUC |
|---|---|---|---|---|---|---|
| CON | Adaboost | 72.9(3.0)** | 76.6(7.4)○ | 70.1(6.7)* | 73.7(2.8)** | 81.3(2.4)** |
| CON | Random Forest | 72.6(4.4)** | 79.6(8.9)○ | 66.7(10.7)○ | 74.3(3.2)** | 82.7(3.6)** |
| CON | SVM | 76.0(3.1)* | 80.0(7.5)○ | 73.3(9.5)○ | 77.4(2.2)* | 85.0(2.9)* |
| RNN | GRU_1_last | 47.7(3.2)** | 50.6(6.8)** | 44.7(7.1)** | 49.3(3.7)** | 46.7(2.4)** |
| RNN | GRU_1_ave | 78.7(2.8)○ | 80.9(7.3)○ | 77.4(7.4)○ | 79.4(1.9)○ | 86.9(2.3)* |
| RNN | GRU_2_ave | 77.9(3.9)○ | 79.0(9.2)○ | 77.9(7.5)○ | 78.1(2.7)○ | 87.7(3.0)○ |
| CMLP | Multi_CNN_MLP | 76.1(3.2)* | 79.7(8.2)○ | 73.4(9.8)○ | 77.0(2.1)** | 85.4(2.7)* |
| CRNN | Simple_CNN_GRU_2_ave | 79.1(3.7)○ | 82.4(7.9)○ | 76.7(10.7)○ | 80.1(2.3)○ | 89.1(2.3)○ |
| CRNN | Multi_CNN_GRU_1_ave | 80.3(3.0)○ | 82.9(7.3)○ | 79.0(9.4)○ | 81.1(1.8)○ | 88.7(2.3)○ |
| CRNN | Multi_CNN_GRU_2_ave | 79.7(3.0)○ | 80.4(7.2)○ | 79.6(7.7)○ | 79.9(2.7)○ | 88.6(2.3)○ |
| CRNN | Multi_CNN_LSTM_2_ave | 78.7(3.9)○ | 83.1(8.3)○ | 75.3(9.7)○ | 79.7(2.6)○ | 89.6(3.0)○ |
| CRNN | MsRNN(Proposed) | 80.2(3.0) | 82.5(7.7) | 79.0(8.4) | 80.8(2.0) | 89.4(2.1) |
Abbreviations and significance markers are as in the previous table: ○ denotes no significant difference from the proposed model (two-sample t-test); */** denote that the method is significantly worse than the proposed model at P < .05/P < .01, respectively. Details of all these architectures are shown in Supplementary Fig. S2. The last row is the proposed method.
Talairach labels of the peak activations in spatial maps of selected ICs.
| Area | Brodmann area | Volume (cc) | Random effects: Max Value (x, y, z) |
|---|---|---|---|
| IC_4 | | | |
| Putamen | | 4.2/4.9 | 1.4 (−24, 12, 15)/1.4 (29, −8, 14) |
| Lentiform nucleus | | 1.6/1.3 | 1.4 (−28, −17, 13)/1.4 (14, −1, −2) |
| Parahippocampal Gyrus | 34 | 0.8/0.8 | 1.4 (−23, −8, −16)/1.4 (32, −10, −13) |
| Claustrum | | 0.8/1.0 | 1.4 (−36, −13, 2)/1.4 (34, 1, 9) |
| Inferior Frontal Gyrus | 13, 47 | 0.6/0.1 | 1.4 (−32, 10, −15)/1.4 (30, 13, −12) |
| Caudate | | 1.7/1.8 | 1.4 (−11, 17, 7)/1.4 (16, −8, 19) |
| IC_2 | | | |
| Declive | | 2.9/3.0 | 1.9 (−27, −71, −22)/1.9 (21, −71, −22) |
| Uvula | | 0.5/0.8 | 1.6 (−27, −71, −25)/1.8 (24, −71, 24) |
| Pyramis | | 0.0/0.1 | NA/1.6 (27, −71, −27) |