Roman Vyškovský¹, Daniel Schwarz¹, Vendula Churová¹, Tomáš Kašpárek².
Abstract
Schizophrenia is a severe neuropsychiatric disease whose diagnosis still lacks an objective tool to support a thorough psychiatric examination of the patient. We took advantage of today's computational abilities, structural magnetic resonance imaging, and modern machine learning methods, such as stacked autoencoders (SAE) and 3D convolutional neural networks (3D CNN), to classify 52 patients with schizophrenia and 52 healthy controls. The main aim of this study was to explore whether complex feature extraction methods can improve the accuracy of deep learning-based classifiers compared to minimally preprocessed data. Our experiments employed three commonly used preprocessing pipelines to extract three different feature types: voxel-based morphometry (VBM), deformation-based morphometry (DBM), and simple spatial normalization of brain tissue. In addition to the classifier models, features, and their combinations, other model parameters such as network depth, number of neurons, number of convolutional filters, and input data size were also investigated. Autoencoders were trained on feature pools of 1000 and 5000 voxels selected by Mann-Whitney tests, whereas 3D CNNs were trained on whole images. The most successful model architecture (autoencoders) achieved the highest average accuracy of 69.62% (sensitivity 68.85%, specificity 70.38%). The results of all experiments were statistically compared using the Mann-Whitney test. In conclusion, SAE outperformed 3D CNN, and preprocessing with VBM helped SAE improve its results.
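The voxel selection mentioned in the abstract (feature pools of 1000 and 5000 voxels chosen by Mann-Whitney tests) can be sketched as below. This is a minimal pure-Python illustration, not the authors' implementation; the function names, the tuple-of-lists data layout, and ranking voxels by the deviation of U from its null mean are assumptions for the sake of the example.

```python
def mann_whitney_u(xs, ys):
    """Mann-Whitney U statistic for sample xs versus ys, using midranks for ties."""
    pooled = sorted(xs + ys)
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2.0  # midrank of positions i+1 .. j
        i = j
    n1 = len(xs)
    rank_sum = sum(ranks[x] for x in xs)
    return rank_sum - n1 * (n1 + 1) / 2.0

def select_feature_pool(patients, controls, pool_size):
    """Keep the pool_size voxels whose U statistic deviates most from its
    null expectation n1*n2/2, i.e. the most group-discriminative voxels."""
    n_vox = len(patients[0])
    null_mean = len(patients) * len(controls) / 2.0
    scores = []
    for v in range(n_vox):
        u = mann_whitney_u([p[v] for p in patients], [c[v] for c in controls])
        scores.append((abs(u - null_mean), v))
    scores.sort(reverse=True)
    return [v for _, v in scores[:pool_size]]
```

In practice one would compute a p-value per voxel (e.g. via a normal approximation) and threshold it, but ranking by |U − n1·n2/2| yields the same ordering for equal sample sizes.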
Keywords: 3D CNN; autoencoders; classification; deep learning; deformation-based morphometry; schizophrenia; voxel-based morphometry
Year: 2022 PMID: 35625002 PMCID: PMC9139344 DOI: 10.3390/brainsci12050615
Source DB: PubMed Journal: Brain Sci ISSN: 2076-3425
Figure 1. Scheme of the performed experiments with SAE-based and 3D CNN-based classifiers trained on different types of brain imaging features. Three distinct feature types are extracted using different image preprocessing pipelines: two complex pipelines taken from automated morphometry methods (VBM, DBM) and a simple one that includes only registration and skull-stripping operations. For the SAE classifier, the features are pooled using univariate testing, sorting, and thresholding, whereas the 3D CNN classifier takes in all features. The experiments further involve changes in the architectures and the optional combination of different feature types.
The 3D CNN architectures used. The numbers in the chart represent the number of kernels in a particular layer. Each dash stands for a sequence of batch normalization, ReLU, and max-pooling layers. There is a fully connected layer at the end of each network with 10 hidden and 2 output neurons.
| 3D CNN Name | 3D CNN Architecture |
|---|---|
| CNN-1 | 10-50-100 |
| CNN-2 | 20-40-60-80-100 |
| CNN-3 | 20-50-100-150-200 |
| CNN-4 | 5-10-20-40-60-80-100 |
| CNN-5 | 10-20-30-40-50-60-70-80-90 |
| CNN-6 | 50-100-150-200-250-300-350 |
| CNN-7 | 50-100-150-200-250-300-350-400-450 |
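Given the table above, each architecture string expands mechanically into a layer sequence: one 3D convolution per number, each followed by batch normalization, ReLU, and max-pooling, capped by the fully connected head with 10 hidden and 2 output neurons. A minimal sketch of that expansion (the helper name and the tuple encoding of layers are illustrative, not from the paper):

```python
def parse_cnn_architecture(spec, fc_hidden=10, n_classes=2):
    """Expand an architecture string such as "20-40-60-80-100" into an
    ordered layer list: each number becomes a 3D convolution with that
    many kernels, followed by batch normalization, ReLU, and max-pooling,
    with a fully connected head (10 hidden, 2 output neurons) at the end."""
    layers = []
    for n_kernels in map(int, spec.split("-")):
        layers += [("conv3d", n_kernels), ("batchnorm",), ("relu",), ("maxpool",)]
    layers += [("fc", fc_hidden), ("fc", n_classes)]
    return layers
```

Such a spec list can then be translated one-to-one into modules of any deep learning framework.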
Results for SAE trained on the three feature types (T1, GMD, and LVC). The metrics listed represent accuracy (sensitivity/specificity) in percent.
| Feature Pool | Neurons | T1 [%] | GMD [%] | LVC [%] |
|---|---|---|---|---|
| 1000 | 500 | 48.75 (58.27/39.23) | 67.6 (64.42/70.77) | 56.44 (58.27/54.62) |
| 1000 | 50 | 54.13 (47.88/60.38) | 65.48 (61.15/69.81) | 54.71 (55.96/53.46) |
| 1000 | 500-50 | 48.65 (46.73/50.58) | 69.62 (68.85/70.38) | 52.12 (58.27/45.96) |
| 1000 | 1000-500-100 | 52.31 (46.54/58.08) | 68.37 (67.69/69.04) | 52.98 (54.81/51.15) |
| 1000 | 100-50-10 | 52.12 (39.81/64.42) | 67.12 (67.12/67.12) | 56.63 (52.5/60.77) |
| 1000 | 1000-500-100-50-10 | 50.87 (39.42/62.31) | 51.44 (90.19/12.69) | 56.15 (60/52.31) |
| 5000 | 500 | 56.35 (65.77/46.92) | 50.77 (96.92/4.62) | 63.37 (62.88/63.85) |
| 5000 | 50 | 54.52 (58.85/50.19) | 50.77 (97.31/4.23) | 57.4 (51.92/62.88) |
| 5000 | 500-50 | 54.13 (71.73/36.54) | 59.04 (54.04/64.04) | 61.06 (55.77/66.35) |
| 5000 | 1000-500-100 | 52.79 (54.81/50.77) | 53.08 (54.23/51.92) | 58.65 (55.96/61.35) |
| 5000 | 100-50-10 | 50.58 (30/71.15) | 65 (64.04/65.96) | 57.31 (45.19/69.42) |
| 5000 | 1000-500-100-50-10 | 51.25 (39.42/63.08) | 66.25 (67.12/65.38) | 55.1 (49.42/60.77) |
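The "accuracy (sensitivity/specificity)" cells in these tables follow the standard confusion-matrix definitions, with patients as the positive class. A small sketch of how such a cell is computed and formatted (the function names are illustrative, and treating patients as positives is an assumption):

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity (patient recall), and specificity
    (control recall) as percentages."""
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return accuracy, sensitivity, specificity

def table_cell(tp, tn, fp, fn):
    """Format one result in the tables' 'accuracy (sensitivity/specificity)' style."""
    acc, sens, spec = classification_metrics(tp, tn, fp, fn)
    return f"{acc:.2f} ({sens:.2f}/{spec:.2f})"
```

Note that with 52 patients and 52 controls the accuracy is simply the mean of sensitivity and specificity.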
Results for SAE trained on all combinations of the three feature types (T1, GMD, and LVC). The metrics listed represent accuracy (sensitivity/specificity) in percent.
| Feature Pool | Neurons | T1/GMD [%] | T1/LVC [%] | GMD/LVC [%] | T1/GMD/LVC [%] |
|---|---|---|---|---|---|
| 1000 | 500 | 53.17 (55.19/51.15) | 50.77 (49.23/52.31) | 58.46 (58.46/58.46) | 49.71 (47.31/52.12) |
| 1000 | 50 | 52.5 (47.12/57.88) | 48.56 (53.08/44.04) | 53.46 (57.5/49.42) | 50.38 (49.04/51.73) |
| 1000 | 500-50 | 55.29 (50.19/60.38) | 49.42 (44.62/54.23) | 59.23 (64.62/53.85) | 51.44 (52.69/50.19) |
| 1000 | 1000-500-100 | 53.46 (47.5/59.42) | 52.79 (50.77/54.81) | 60 (61.73/58.27) | 52.21 (47.12/57.31) |
| 1000 | 100-50-10 | 52.98 (36.15/69.81) | 51.15 (47.69/54.62) | 56.63 (66.15/47.12) | 53.94 (45/62.88) |
| 1000 | 1000-500-100-50-10 | 52.5 (37.31/67.69) | 52.02 (40/64.04) | 59.9 (62.5/57.31) | 55.19 (50.96/59.42) |
| 5000 | 500 | 57.12 (59.23/55) | 56.06 (59.62/52.5) | 61.54 (69.62/53.46) | 59.62 (55/64.23) |
| 5000 | 50 | 57.21 (55.38/59.04) | 56.15 (57.12/55.19) | 60.87 (67.69/54.04) | 55.19 (54.42/55.96) |
| 5000 | 500-50 | 56.35 (47.31/65.38) | 54.42 (53.65/55.19) | 60.58 (66.92/54.23) | 55.87 (58.08/53.65) |
| 5000 | 1000-500-100 | 54.9 (54.04/55.77) | 55.1 (57.69/52.5) | 59.9 (59.81/60) | 55.1 (52.69/57.5) |
| 5000 | 100-50-10 | 58.46 (43.65/73.27) | 56.06 (50/62.12) | 59.13 (60.19/58.08) | 58.27 (56.92/59.62) |
| 5000 | 1000-500-100-50-10 | 54.62 (47.88/61.35) | 53.08 (53.85/52.31) | 59.9 (66.15/53.65) | 56.92 (63.46/50.38) |
Results for 3D CNNs trained on the three feature types (T1, GMD, and LVC). The metrics listed represent accuracy (sensitivity/specificity) in percent.
| 3D CNN Architecture | T1 [%] | GMD [%] | LVC [%] |
|---|---|---|---|
| 10-50-100 | 43.37 (43.85/42.86) | 42.40 (46.15/38.65) | 45.58 (41.15/50.00) |
| 20-40-60-80-100 | 54.62 (57.5/51.73) | 61.15 (62.31/60.00) | 52.98 (56.15/49.81) |
| 20-50-100-150-200 | 60.15 (63.25/57.05) | 61.65 (61.11/62.18) | 51.92 (52.12/51.73) |
| 5-10-20-40-60-80-100 | 50.29 (48.85/51.73) | 56.73 (57.89/55.58) | 51.73 (52.69/50.77) |
| 50-100-150-200-250-300-350 | 60.39 (61.54/59.23) | 63.08 (63.85/55.58) | 53.75 (51.15/56.35) |
| 10-20-30-40-50-60-70-80-90 | 53.27 (52.89/53.66) | 59.52 (59.81/59.23) | 52.89 (51.54/54.23) |
| 50-100-150-200-250-300-350-400-450 | 60.19 (60.77/59.62) | 62.6 (60.00/65.19) | 50.86 (52.78/48.93) |
Results for 3D CNNs trained on all combinations of the three feature types (T1, GMD, and LVC). The metrics listed represent accuracy (sensitivity/specificity) in percent.
| 3D CNN Architecture | T1/GMD [%] | T1/LVC [%] | GMD/LVC [%] | T1/GMD/LVC [%] |
|---|---|---|---|---|
| 20-40-60-80-100 | 60.67 (59.81/61.54) | 54.42 (52.69/56.15) | 58.65 (57.69/59.62) | 58.46 (60.00/56.92) |
| 50-100-150-200-250-300-350 | 62.31 (60.19/64.42) | 51.64 (49.23/54.04) | 59.14 (58.27/60.00) | 59.90 (57.69/62.12) |
p-values of the Mann-Whitney test comparing the accuracies obtained from 10 repetitions of the experiments for both classifiers (SAE, CNN) and all input feature types (T1, GMD, LVC). Significant results (α = 0.05) in favor of the classifier listed on the left are marked with an asterisk (*); those in favor of the classifier listed above are marked with two asterisks (**).
| Classifier-Input Data | GMD-SAE | LVC-SAE | T1-3D CNN | GMD-3D CNN | LVC-3D CNN |
|---|---|---|---|---|---|
| T1-SAE | 3.6580 × 10⁻⁴ ** | 0.0018 ** | 0.0811 | 0.0018 ** | 0.0799 |
| GMD-SAE | - | 0.0098 * | 0.0031 * | 0.0031 * | 1.7265 × 10⁻⁴ * |
| LVC-SAE | - | - | 0.4459 | 0.9694 | 2.3313 × 10⁻⁴ * |
| T1-3D CNN | - | - | - | 0.4919 | 0.0254 * |
| GMD-3D CNN | - | - | - | - | 2.3313 × 10⁻⁴ * |