| Literature DB >> 35888037 |
Andrea Termine, Carlo Fabrizio, Carlo Caltagirone, Laura Petrosini.
Abstract
Despite Artificial Intelligence (AI) being a leading technology in biomedical research, real-life implementation of AI-based Computer-Aided Diagnosis (CAD) tools in the clinical setting is still remote due to unstandardized practices during development. Moreover, few or no attempts have been made to propose a reproducible CAD development workflow for 3D MRI data. In this paper, we present the development of an easily reproducible and reliable CAD tool using the Clinica and MONAI frameworks, which were developed to introduce standardized practices in medical imaging. A Deep Learning (DL) algorithm was trained to detect frontotemporal dementia (FTD) on data from the NIFD database to ensure reproducibility. The DL model yielded 0.80 accuracy (95% confidence interval: 0.64, 0.91), 1 sensitivity, 0.6 specificity, 0.83 F1-score, and 0.86 AUC, achieving performance comparable with other FTD classification approaches. Explainable AI methods were applied to understand AI behavior and to identify regions of the images where the DL model misbehaves. Attention maps highlighted that its decision was driven by hallmarking brain areas for FTD and helped us to understand how to improve FTD detection. The proposed standardized methodology could be useful for benchmark comparison in FTD classification. AI-based CAD tools should be developed with the goal of standardizing pipelines, as varying pre-processing and training methods, along with the absence of model behavior explanations, negatively impact regulators' attitudes towards CAD. The adoption of common best practices for neuroimaging data analysis is a step toward fast evaluation of the efficacy and safety of CAD and may accelerate the adoption of AI products in the healthcare system.
Keywords: 3D MRI; Clinica; MONAI; artificial intelligence; computer aided diagnosis; deep learning; frontotemporal dementia; neurodegenerative diseases; neuroimaging
Year: 2022 PMID: 35888037 PMCID: PMC9323676 DOI: 10.3390/life12070947
Source DB: PubMed Journal: Life (Basel) ISSN: 2075-1729
Figure 1. Graphical representation of the main steps of the workflow.
Data splitting before augmentation.
| Group | Train (n) | Validation (n) | Test (n) |
|---|---|---|---|
| FTD | 143 | 19 | 20 |
| NC | 91 | 19 | 20 |
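The split above keeps the validation and test sets balanced (19/20 scans per class) while the remainder of each class goes to training. A minimal sketch of such a per-class (stratified) split, using hypothetical subject IDs in place of the NIFD scans:

```python
import random

def stratified_split(items, n_val, n_test, seed=0):
    """Shuffle one class's items, carve out validation and test sets;
    the remainder becomes the training set."""
    items = list(items)
    random.Random(seed).shuffle(items)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test

# Hypothetical subject IDs standing in for the NIFD scans.
ftd_ids = [f"FTD_{i:03d}" for i in range(182)]  # 143 + 19 + 20
nc_ids = [f"NC_{i:03d}" for i in range(130)]    # 91 + 19 + 20

ftd_train, ftd_val, ftd_test = stratified_split(ftd_ids, n_val=19, n_test=20)
nc_train, nc_val, nc_test = stratified_split(nc_ids, n_val=19, n_test=20)

print(len(ftd_train), len(ftd_val), len(ftd_test))  # 143 19 20
print(len(nc_train), len(nc_val), len(nc_test))     # 91 19 20
```

Splitting before augmentation, as done here, prevents augmented copies of the same scan from leaking across the train/validation/test boundary.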
Transformations applied to perform data augmentation.
| Transformation | Description | Probability of Application | Specs |
|---|---|---|---|
| Translation | Translate the image along each spatial dimension. | 1 | ±2 voxels |
| Rotation | Randomly rotate the image. | 1 | ±5 degrees on each spatial axis |
| Gaussian noise | Add Gaussian noise to the image. | 0.5 | Mean = 0; standard deviation = 2.5% of the range of activation values in the image |
| Contrast | Randomly adjusts each voxel intensity by a gamma factor. | 1 | Gamma range = (0.0, 3.0) |
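In MONAI these transforms correspond to `RandAffine`, `RandGaussianNoise`, and `RandAdjustContrast`. As a framework-free illustration of what they do to a volume, here is a plain-NumPy sketch of the table's translation, noise, and gamma-contrast steps (rotation omitted for brevity; the function and parameter choices are illustrative, not the paper's exact implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(volume):
    """Apply a subset of the table's transforms to a 3D volume."""
    # Translation (p = 1): shift up to +/-2 voxels along each spatial axis.
    shifts = rng.integers(-2, 3, size=3)
    out = np.roll(volume, tuple(shifts), axis=(0, 1, 2))
    # Gaussian noise (p = 0.5): mean 0, std = 2.5% of the intensity range.
    if rng.random() < 0.5:
        out = out + rng.normal(0.0, 0.025 * np.ptp(out), size=out.shape)
    # Contrast (p = 1): gamma drawn from (0, 3), applied to min-max
    # scaled intensities, then mapped back to the original range.
    gamma = rng.uniform(0.0, 3.0)
    lo, span = out.min(), np.ptp(out)
    out = ((out - lo) / (span + 1e-8)) ** gamma * span + lo
    return out

vol = rng.random((32, 32, 32))
aug = augment(vol)
print(aug.shape)  # (32, 32, 32)
```

Each call produces a different augmented copy of the input, which is how 234 original training scans were expanded to 800.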
Train set after data augmentation.
| Group | Original Images (n) | Augmented Images (n) | Total |
|---|---|---|---|
| FTD | 143 | 257 | 400 |
| NC | 91 | 309 | 400 |
| Total | 234 | 566 | 800 |
Figure 2. Schematic representation of DenseNet121. Its structure is similar to a classical convolutional neural network, yet DenseNet121 features dense blocks that concatenate the outputs of multiple connected layers. In fact, within a dense block each layer is directly connected to every other layer in a feed-forward fashion.
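The dense-connectivity idea can be sketched without any deep-learning framework: each layer receives the concatenation of the block input and all previous layers' outputs, and contributes a fixed number of new channels (the growth rate). The toy code below uses a random channel-mixing matrix as a stand-in for the real BN-ReLU-conv layer; the 6-layer/growth-32 configuration matches DenseNet121's first dense block:

```python
import numpy as np

def dense_block(x, n_layers, growth_rate, rng):
    """Toy dense block: each 'layer' sees the concatenation of the block
    input and every previous layer's output, and adds growth_rate channels."""
    features = [x]  # channel-first layout: (C, D, H, W)
    for _ in range(n_layers):
        inp = np.concatenate(features, axis=0)        # dense connectivity
        # Stand-in for BN -> ReLU -> conv: random channel mixing.
        w = rng.standard_normal((growth_rate, inp.shape[0]))
        out = np.einsum("oc,cdhw->odhw", w, inp)
        features.append(out)
    return np.concatenate(features, axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 8, 8, 8))                # 64 input channels
y = dense_block(x, n_layers=6, growth_rate=32, rng=rng)
print(y.shape[0])  # 64 + 6 * 32 = 256
```

This concatenation-based feature reuse is why DenseNets reach competitive accuracy with comparatively few parameters. In MONAI, the actual 3D model is available as `monai.networks.nets.DenseNet121`.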
Figure 3. (A) Confusion matrix of the classification results for both classes; rows indicate true labels and columns indicate predicted labels. (B) Receiver Operating Characteristic (ROC) curve of the DenseNet121 classifier obtained when predicting disease status (FTD/NC) from 3D T1w MRI. The Area Under the Curve (AUC), calculated as the definite integral of the ROC curve between 0 and 1 on the x-axis, provides an aggregate measure of performance.
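The reported metrics are mutually consistent: with 20 FTD and 20 NC test scans, sensitivity 1 and specificity 0.6 imply 20 true positives, 0 false negatives, 12 true negatives, and 8 false positives (assuming FTD is the positive class; counts back-solved from the abstract, not read off Figure 3A). A quick check:

```python
# Confusion-matrix counts consistent with the reported test metrics
# (20 FTD + 20 NC scans), assuming FTD is the positive class.
tp, fn = 20, 0   # FTD scans: correctly / incorrectly classified
tn, fp = 12, 8   # NC scans: correctly / incorrectly classified

accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
precision = tp / (tp + fp)
f1 = 2 * precision * sensitivity / (precision + sensitivity)

print(f"acc={accuracy:.2f} sens={sensitivity:.1f} spec={specificity:.1f} f1={f1:.2f}")
# acc=0.80 sens=1.0 spec=0.6 f1=0.83
```

Note that the AUC of 0.86 cannot be recovered from the confusion matrix alone; it requires the model's continuous prediction scores.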
FTD classification results.
| Citation | Comparison | Sample Size | Classification Method | Features | Metric |
|---|---|---|---|---|---|
| Proposed application | FTD vs. NC | 182 FTD | HOTS DenseNet121 | 3D T1 MRI scans | Acc = 0.80 |
| Hu et al., 2020 | FTD vs. NC | 552 FTD | HOTS CNN | Raw 3D T1 MRI images | Acc = 0.93 |
| Bron et al., 2017 | FTD vs. NC | 33 FTD | 4-fold CV SVM | Whole-brain VBM volume of GM | AUC = 0.95 |
| Zhang et al., 2013 | FTD vs. NC | 25 FTD | 4-fold CV SVM | VBM GM volume on frontotemporal ROI | Acc = 0.66 |
| Muñoz-Ruiz et al., 2012 | FTD vs. NC | 37 FTD | HOTS regression | VBM GM volume | Acc = 0.85 |
| Dukart et al., 2011 | FTD vs. NC | 14 FTD | LOOCV SVM | ROIs GM | Acc = 0.85 |
| Davatzikos et al., 2008 | FTD vs. NC | 12 FTD | LOOCV SVM | PCA on RAVENS GM and WM volume | Acc = 1 |
| Du et al., 2007 | FTD vs. NC | 19 FTD | LOOCV LR | Frontal | Acc = 0.89 |
| Chagué et al., 2021 | FTD vs. Late Onset AD | 39 FTD | 10-fold CV SVM | GM and WM volumes | Acc = 0.72 |
| Chagué et al., 2021 | FTD vs. Early Onset AD | 39 FTD | 10-fold CV SVM | GM and WM volumes | Acc = 0.80 |
| Bron et al., 2017 | FTD vs. AD | 33 FTD | 4-fold CV SVM | Whole-brain VBM volume of GM | AUC = 0.78 |
| McMillan et al., 2014 | FTD vs. AD | 72 FTD | HOTS linear regression | Global ventricles volume | AUC = 0.83 |
| Dukart et al., 2011 | FTD vs. AD | 14 FTD | LOOCV SVM | ROIs GM | Acc = 0.60 |
| Lehmann et al., 2010 | FTD vs. AD | 23 FTD | CV SVM | Whole brain cortical thickness | Acc = 0.79 |
| Davatzikos et al., 2008 | FTD vs. AD | 12 FTD | LOOCV SVM | PCA on RAVENS GM and WM volume | Acc = 0.84 |
| Klöppel et al., 2008 | FTD vs. AD | 19 FTD | LOOCV SVM | GM volume | Acc = 0.89 |
Acc: Accuracy; AUC: Area Under the Curve; CV: Cross-Validation; GM: Grey Matter; HOTS: Held-Out Test Set; LOOCV: Leave-One-Out Cross-Validation; LR: Logistic Regression; SVM: Support Vector Machine; VBM: Voxel-Based Morphometry; WM: White Matter. Where a paper presents multiple classification results based on different feature sets (e.g., classification on hippocampus volume vs. VBM GM volume), only the best result is reported in this table, and only results obtained with brain morphometry data were considered. Accuracy is reported where available; otherwise, AUC is reported. The table is arranged by comparison (descending) and year of publication (descending).
Figure 4. Coronal view of the brain for one FTD and one NC subject. The original brain scans used for testing are in the upper row, while the attention maps are in the lower row.