Jinlong Hu, Yuezhen Kuang, Bin Liao, Lijie Cao, Shoubin Dong, Ping Li.
Abstract
Deep learning models have been successfully applied to the analysis of various functional MRI data. Convolutional neural networks (CNNs), a class of deep neural networks, excel at extracting locally meaningful features thanks to their shared-weights architecture and spatial-invariance properties. In this study, we propose M2D CNN, a novel multichannel 2D CNN model, to classify 3D fMRI data. The model takes sliced 2D fMRI data as input and integrates the multichannel information learned by 2D CNN networks. We experimentally compared the proposed M2D CNN against several widely used models, including SVM, 1D CNN, 2D CNN, 3D CNN, and 3D separable CNN, with respect to their performance in classifying task-based fMRI data. We tested M2D CNN against six benchmark models in classifying a large number of time-series whole-brain imaging data from a motor task in the Human Connectome Project (HCP). The results of our experiments demonstrate the following: (i) the convolution operations in the CNN models are advantageous for classifying high-dimensional whole-brain imaging data, as all CNN models outperformed SVM; (ii) the 3D CNN models achieved higher accuracy than the 2D CNN and 1D CNN models but are computationally costly because of the extra dimension in the input; (iii) the proposed M2D CNN model achieved the highest accuracy and alleviated overfitting, given its smaller number of parameters compared with 3D CNN.
Year: 2019 PMID: 32082370 PMCID: PMC7012272 DOI: 10.1155/2019/5065214
Source DB: PubMed Journal: Comput Intell Neurosci
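Claim (ii) in the abstract, that 3D CNNs are computationally costly because of the extra input dimension, can be illustrated with back-of-the-envelope arithmetic: a convolution kernel gains a full extra spatial axis when moving from 2D to 3D, so per-layer parameter counts grow by roughly the kernel width. The kernel size and channel counts below are hypothetical, chosen only for the arithmetic:

```python
# Hypothetical per-layer parameter counts for a 2D vs. a 3D convolution.
# Kernel size k and channel counts are illustrative, not from the paper.

def conv2d_params(k, c_in, c_out):
    # Each 2D kernel has k*k*c_in weights, plus one bias per output channel.
    return (k * k * c_in + 1) * c_out

def conv3d_params(k, c_in, c_out):
    # A 3D kernel adds a third spatial axis: k*k*k*c_in weights.
    return (k * k * k * c_in + 1) * c_out

k, c_in, c_out = 3, 16, 32
p2d = conv2d_params(k, c_in, c_out)   # (9*16 + 1) * 32  = 4640
p3d = conv3d_params(k, c_in, c_out)   # (27*16 + 1) * 32 = 13856
print(p2d, p3d)  # the 3D layer is roughly k = 3 times larger
```

The same factor applies to the multiply-accumulate count at every output position, which is why the 3D models below train noticeably slower than the 2D ones.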
Figure 1. Architecture of the M2D CNN model.
Figure 2. The process of transforming a 3D brain image into three multichannel 2D images.
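The transformation in Figure 2 can be sketched in a few lines: the volume is cut along each of its three axes, and the slices along one axis become the channels of a single 2D image. The volume shape below is hypothetical, not the actual HCP image dimensions:

```python
import numpy as np

# A minimal sketch of Figure 2's transformation: one 3D brain volume
# becomes three multichannel 2D images, one per anatomical axis.
volume = np.random.rand(64, 64, 40)  # hypothetical (x, y, z) voxel grid

# Axial stack: 40 slices of size 64x64, channels-first layout.
axial = np.moveaxis(volume, 2, 0)    # shape (40, 64, 64)
# Coronal stack: 64 slices of size 64x40.
coronal = np.moveaxis(volume, 1, 0)  # shape (64, 64, 40)
# Sagittal stack: 64 slices of size 64x40; axis 0 is already slice-first.
sagittal = volume                    # shape (64, 64, 40)

print(axial.shape, coronal.shape, sagittal.shape)
```

Each stack can then be fed to an ordinary 2D CNN branch, with the slice axis playing the role of the channel axis.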
Figure 3. Architecture of mv2D CNN.
Figure 4. Architecture of 1D CNN.
Figure 5. Architecture of 3D CNN.
Figure 6. Architecture of s2D CNN.
Classification results for all models on data from 995 subjects (mean ± std).
| Model | Accuracy (%) | Precision (%) | F1-score |
|---|---|---|---|
| PCA + SVM | 48.94 ± 2.36 | 48.17 ± 2.48 | 0.4779 ± 0.0232 |
| mv2D CNN | 63.36 ± 2.19 | 63.59 ± 2.27 | 0.6306 ± 0.0222 |
| 3D CNN | 82.34 ± 1.27 | 82.68 ± 1.39 | 0.8239 ± 0.0130 |
| 3D SepConv | 80.44 ± 1.16 | 80.88 ± 1.24 | 0.8043 ± 0.0116 |
| 1D CNN | 80.76 ± 1.69 | 80.94 ± 1.73 | 0.8068 ± 0.0178 |
| s2D CNN | 81.80 ± 0.89 | 81.95 ± 0.97 | 0.8179 ± 0.0094 |
| M2D CNN | 83.20 ± 2.29 | | |
Note: chance-level accuracy is 20% (i.e., given 5 types of movement behavior).
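The precision and F1 columns can be reproduced from a per-class tally of true and false positives; macro averaging over the five movement classes is an assumption here, and the labels are toy data, not HCP results:

```python
import numpy as np

# A sketch of computing per-class precision and F1, macro-averaged over
# the five motor-task classes (the averaging scheme is an assumption).
def macro_precision_f1(y_true, y_pred, n_classes=5):
    precisions, f1s = [], []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        precisions.append(p)
        f1s.append(2 * p * r / (p + r) if p + r else 0.0)
    return np.mean(precisions), np.mean(f1s)

y_true = np.array([0, 1, 2, 3, 4, 0, 1, 2, 3, 4])
y_pred = np.array([0, 1, 2, 3, 4, 0, 1, 2, 0, 3])  # two mistakes
prec, f1 = macro_precision_f1(y_true, y_pred)
```

Averaging these fold-level scores over 5-fold cross-validation gives the mean ± std entries shown in the table.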
Figure 7. Box plots of accuracy, precision, and F1-score for the classification task on 995 subjects for the different learning models over 5-fold cross-validation. The middle line in each box represents the median value; circles represent outliers.
Classification accuracy (mean ± std, %) over different sample sizes.
| Model | 2000 samples (200 subjects) | 5000 samples (500 subjects) | 9950 samples (995 subjects) |
|---|---|---|---|
| mv2D CNN | 53.70 ± 4.20 | 60.88 ± 2.28 | 63.36 ± 2.19 |
| 3D CNN | 72.70 ± 2.54 | 77.36 ± 1.95 | 82.34 ± 1.27 |
| 3D SepConv | 73.60 ± 1.77 | 77.24 ± 2.79 | 80.44 ± 1.16 |
| 1D CNN | 67.40 ± 2.92 | 76.52 ± 1.09 | 80.76 ± 1.69 |
| s2D CNN | 66.20 ± 3.19 | 76.64 ± 1.96 | 81.80 ± 0.89 |
| M2D CNN | 71.70 ± 1.81 | 79.44 ± 1.70 | 83.20 ± 2.29 |
A comparison of model units, parameters, and training time.
| Model | Number of units input to the fully connected layer | Total number of parameters | Training time (s) (mean ± std) | Total number of epochs (mean ± std) |
|---|---|---|---|---|
| mv2D CNN | 16,800 | 2,223,877 | 909 ± 134 | 54 ± 8 |
| 3D CNN | 352,800 | 45,174,181 | 1156 ± 185 | 39 ± 6 |
| 3D SepConv | 352,800 | 45,161,301 | 1601 ± 196 | 41 ± 5 |
| 1D CNN | 79,296 | 10,474,501 | 834 ± 157 | 39 ± 7 |
| s2D CNN | 16,800 | 2,236,837 | 565 ± 102 | 31 ± 6 |
| M2D CNN | 47,712 | 6,355,717 | 1074 ± 348 | 39 ± 13 |
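The parameter counts above track the size of the fully connected layer's input almost exactly. As a rough check, assuming a 128-unit hidden dense layer (a hypothetical width, not given in this excerpt), the product units × 128 accounts for the bulk of every model's total, which explains why the 3D models are an order of magnitude larger than their 2D counterparts:

```python
# Rough decomposition of the table's parameter counts, assuming a
# hypothetical 128-unit dense layer after the flattened feature map.
# (fc_in, total) pairs are taken directly from the table above.
models = {
    "mv2D CNN":   (16_800,  2_223_877),
    "3D CNN":     (352_800, 45_174_181),
    "3D SepConv": (352_800, 45_161_301),
    "1D CNN":     (79_296,  10_474_501),
    "s2D CNN":    (16_800,  2_236_837),
    "M2D CNN":    (47_712,  6_355_717),
}
HIDDEN = 128  # assumed dense-layer width, not stated in this excerpt

for name, (fc_in, total) in models.items():
    dense = fc_in * HIDDEN  # weights of the assumed flatten -> dense layer
    print(f"{name}: dense layer ~{dense / total:.1%} of all parameters")
```

Under this assumption, shrinking the flattened feature map, as the multichannel 2D designs do, is what keeps M2D CNN at roughly one-seventh the size of 3D CNN while reaching the highest accuracy in the tables above.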
Figure 8. Mean training loss (solid lines) and mean validation loss (dashed lines) for 3D CNN (red) and M2D CNN (blue). The shaded area indicates the standard deviation. (a) 2000 samples. (b) 5000 samples.