| Literature DB >> 35893083 |
Andronicus A Akinyelu1,2, Fulvio Zaccagna3,4, James T Grist5,6,7,8, Mauro Castelli1, Leonardo Rundo9.
Abstract
Management of brain tumors is based on clinical and radiological information, with the presumed grade dictating treatment. Hence, a non-invasive assessment of tumor grade is of paramount importance to choose the best treatment plan. Convolutional Neural Networks (CNNs) represent one of the effective Deep Learning (DL)-based techniques that have been used for brain tumor diagnosis. However, they are unable to handle input modifications effectively. Capsule neural networks (CapsNets) are a novel type of machine learning (ML) architecture that was recently developed to address the drawbacks of CNNs. CapsNets are resistant to rotations and affine translations, which is beneficial when processing medical imaging datasets. Moreover, Vision Transformer (ViT)-based solutions have been very recently proposed to address the issue of long-range dependency in CNNs. This survey provides a comprehensive overview of brain tumor classification and segmentation techniques, with a focus on ML-based, CNN-based, CapsNet-based, and ViT-based techniques. The survey highlights the fundamental contributions of recent studies and the performance of state-of-the-art techniques. Moreover, we present an in-depth discussion of crucial issues and open challenges. We also identify some key limitations and promising future research directions. We envisage that this survey shall serve as a good springboard for further study.
Keywords: brain cancer; capsule neural networks; deep learning; machine learning; magnetic resonance imaging; vision transformers
Year: 2022 PMID: 35893083 PMCID: PMC9331677 DOI: 10.3390/jimaging8080205
Source DB: PubMed Journal: J Imaging ISSN: 2313-433X
Figure 1General architecture of a Convolutional Neural Network (CNN).
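To make the building blocks in Figure 1 concrete, the following minimal PyTorch sketch stacks convolutional, pooling, and fully connected layers into a binary tumor/non-tumor classifier. All layer sizes and the 224 × 224 single-channel input are illustrative assumptions, not taken from any surveyed paper.

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    """Illustrative CNN: two conv/pool blocks followed by a fully connected head."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # single-channel MRI slice in
            nn.ReLU(),
            nn.MaxPool2d(2),                              # downsample by 2
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 56 * 56, num_classes),         # assumes a 224 x 224 input
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

logits = TinyCNN()(torch.randn(1, 1, 224, 224))           # -> shape (1, 2)
```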
Figure 2Overview of a Vision Transformer (ViT) model. The image is partitioned into N small patches (e.g., 9 patches), each containing n × n pixels (e.g., 16 × 16 pixels). After partitioning, each image patch is flattened and fed into a linear projection layer to obtain a lower-dimensional linear embedding. Positional embeddings are added to the sequence of patch embeddings so that each patch retains its positional information. The resulting sequence is fed into a standard transformer encoder for training, which can be conducted with an MLP or CNN head stacked on top of the transformer. The “*” symbol refers to an additional learnable (class) embedding that is prepended to the sequence of patch embeddings; after self-attention updates it, this class embedding is used to predict the class of the input image.
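As a sketch of the patch-embedding step described above (our own illustrative code, with all dimensions assumed): a strided convolution splits the image into 16 × 16 patches and linearly projects each one, a learnable class token is prepended, and positional embeddings are added before the sequence enters the transformer encoder.

```python
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """ViT-style patch embedding: project patches, prepend class token, add positions."""
    def __init__(self, img_size=224, patch=16, in_ch=1, dim=256):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        # A conv with kernel = stride = patch size is equivalent to flattening
        # each patch and applying a shared linear projection.
        self.proj = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))    # the learnable "*" embedding
        self.pos_embed = nn.Parameter(torch.zeros(1, n_patches + 1, dim))

    def forward(self, x):
        x = self.proj(x).flatten(2).transpose(1, 2)              # (B, N, dim)
        cls = self.cls_token.expand(x.shape[0], -1, -1)
        x = torch.cat([cls, x], dim=1)                           # prepend class token
        return x + self.pos_embed                                # add positional information

tokens = PatchEmbed()(torch.randn(1, 1, 224, 224))               # -> (1, 197, 256)
```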
Figure 3General scheme of a capsule neural network (CapsNet). A basic CapsNet is a three-layer network composed of a convolutional layer, a primary capsule layer, and a class capsule layer. The convolutional layer comes first and is used to extract features, which are then transmitted to the primary capsule layer; the primary capsule layer may in turn be followed by an undetermined number of intermediate capsule layers before the class capsule layer. The primary capsules perform a series of operations and transmit the resulting feature map to the class (digit) capsules.
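Capsules output vectors rather than scalars, and the vector length encodes the probability that the entity a capsule represents is present. Below is a minimal sketch of the squashing nonlinearity from Sabour et al.'s original CapsNet, together with an assumed primary-capsule reshaping step (all tensor sizes are illustrative):

```python
import torch

def squash(s: torch.Tensor, dim: int = -1, eps: float = 1e-8) -> torch.Tensor:
    """Capsule squashing: v = (|s|^2 / (1 + |s|^2)) * (s / |s|).
    Output length lies in (0, 1) and acts as an existence probability."""
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * s / torch.sqrt(sq_norm + eps)

# A primary capsule layer reshapes conv feature maps into 8-D capsule vectors.
feats = torch.randn(1, 32 * 8, 6, 6)                  # illustrative conv output
caps = feats.view(1, 32, 8, 6, 6).permute(0, 1, 3, 4, 2).reshape(1, -1, 8)
caps = squash(caps)                                    # (1, 1152, 8) capsule vectors
```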
Figure 4Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) diagram of the proposed review on AI applications to brain tumor MRI.
Figure 5(a) Percentage of reviewed articles; (b) number of reviewed articles per publication year.
Summary of datasets used in the literature.
| Dataset Name | Dataset Details | Reference |
|---|---|---|
| BraTS 2012 | 30 MRI, 50 simulated images (25 Low Grade Glioma (LGG) and 25 High Grade Glioma (HGG)) | [ |
| BraTS 2013 | 30 MRI (20 HGG and 10 LGG), 50 simulated images (25 LGG and 25 HGG) | [ |
| BraTS 2014 | 190 HGG and 26 LGG MRI | [ |
| BraTS 2015 | 220 HGG and 54 LGG MRI | [ |
| BraTS 2016 | 220 HGG and 54 LGG; Testing: 191 images with unknown grades | [ |
| BraTS 2017 | 285 MRI scans with full masks for brain tumors | [ |
| BraTS 2018 | Training dataset: 210 HGG and 75 LGG MRI scans. The validation dataset includes 66 different MRI scans | [ |
| BraTS 2019 | 259 HGG and 76 LGG MRI scans from 19 institutions | [ |
| BraTS 2020 | 2640 MRI scans from 369 patients with ground truth in four sequences (T1-weighted (T1w), T2-weighted (T2w), post-gadolinium-based contrast agent (GBCA) T1w (T1w post GBCA), Fluid Attenuated Inversion Recovery (FLAIR)) | [ |
| BraTS 2021 | 8000 MRI scans from 2000 cases | [ |
| TCIA | 3929 MRI scans from 110 patients. 1373 tumor images and 2556 normal images. | [ |
| Radiopedia | 121 MRI | [ |
| Contrast Enhanced Magnetic Resonance Images (CE-MRI) dataset | 3064 MRI T1w post GBCA images from 233 patients | [ |
| Brain MRI Images | 253 MRI images, 155 tumor images, 98 non-tumor images | [ |
| Br35H dataset | 3000 MRI images, 1500 tumor images, and 1500 non-tumor images | [ |
| MSD dataset | 484 multi-modal multi-site MRI scans (FLAIR, T1w, T1w post GBCA, T2w) | [ |
Summary of classical ML-based techniques.
| Ref. | Year | Method | Classes Considered | Main Highlight | Dataset | Performance |
|---|---|---|---|---|---|---|
| [ | 2021 | 13 pre-trained CNN models and nine ML classifiers | Normal and tumor images | Concatenated three deep features from pre-trained CNN models and trained nine ML classifiers. | Three brain MRI datasets [ | DenseNet-169, Inception-v3, and ResNeXt-50 produced the best deep features, achieving an accuracy of 96.08%, 92.16%, and 94.12%, respectively. |
| [ | 2021 | SVM, k-NN, and modified GoogLeNet pre-trained architecture | Glioma, meningioma, and pituitary tumor | Extracted features from a modified GoogLeNet pre-trained architecture and used them to train SVM and k-NN classifiers | CE-MRI dataset | SVM and k-NN produced a specificity of 98.93% and 98.63%, respectively. |
| [ | 2022 | SVM, k-NN, binary decision tree, RF, ensemble methods. | FLAIR, T1w, T1w post GBCA, and T2w | Developed multiple ML models on six texture-based features. Applied a hybrid of k-NN and C-means clustering for tumor segmentation. | BraTS 2017 and BraTS 2019 | Classification accuracy of 96.98% and 97.01% for BraTS 2017 and 2019, respectively. DSC and accuracy of 90.16% and 98.4%, respectively. |
| [ | 2022 | NB, RNN, bat and lion optimization, PCA, ICA, and cuckoo search. | Tumor and normal images | Designed hybrid techniques model using the combination of metaheuristics and ML algorithms. | TCIA | Classification accuracy of 98.61%. |
| [ | 2021 | SVM and CNN | Glioma, meningioma, and pituitary tumor | Proposed a hybrid technique using CNN-based features and SVM. | FigShare dataset | 95.82% accuracy |
Figure 6Building blocks of a typical ML-based brain tumor classification and segmentation model.
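As a hedged illustration of this pipeline, the sketch below extracts GLCM texture descriptors (the kind of handcrafted features used by several of the surveyed ML methods) from grayscale slices and trains an SVM. The toy random data, feature choice, and parameters are our own assumptions.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.svm import SVC

def glcm_features(slice_2d: np.ndarray) -> np.ndarray:
    """Six GLCM texture descriptors for one grayscale slice (uint8)."""
    glcm = graycomatrix(slice_2d, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    props = ["contrast", "dissimilarity", "homogeneity", "energy", "correlation", "ASM"]
    return np.array([graycoprops(glcm, p)[0, 0] for p in props])

# Hypothetical toy data standing in for pre-processed MRI slices.
rng = np.random.default_rng(0)
X = np.stack([glcm_features(rng.integers(0, 256, (64, 64), dtype=np.uint8))
              for _ in range(20)])
y = rng.integers(0, 2, 20)                      # 0 = normal, 1 = tumor (dummy labels)

clf = SVC(kernel="rbf").fit(X, y)               # classifier trained on handcrafted features
print(clf.predict(X[:3]))
```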
Figure 7Workflow of deep learning (CNN)-based brain tumor classification techniques.
Summary of CNN-based classification techniques.
| Ref. | Year | Classes Considered | Method | Main Highlight | Dataset | Performance |
|---|---|---|---|---|---|---|
| [ | 2019 | Grades I–IV | InputCascadeCNN + data augmentation + VGG-19 | Adopted four data augmentation techniques. Also used the InputCascadeCNN architecture for data augmentation and VGG-19 for fine-tuning. | Radiopedia and Brain tumor dataset | Classification accuracy of 95.5%, 92.66%, 87.77%, and 86.71% for Grades I–IV, respectively, on the Radiopedia dataset. Sensitivity and specificity of 88.41% and 96.12%, respectively, on the brain tumor dataset. |
| [ | 2021 | T1w, T2w, and FLAIR images | Differential deep CNN + data augmentation | Applied user-defined hyperparameter values and a differential operator to generate feature maps for the CNN. Proposed several data augmentation techniques. | TUCMD (17,600 MR brain images) | Classification accuracy, sensitivity, and specificity of 99.25%, 95.89%, and 93.75%, respectively |
| [ | 2019 | Glioma, meningioma, and pituitary tumor | VGG-19 | Introduced a block-wise fine-tuning technique for multi-class brain tumor MRI images. | CE-MRI [ | Classification accuracy: 94.82% |
| [ | 2020 | LGG and HGG | 3D CNN | Proposed a multi-scale 3D CNN architecture for grade classification capable of learning both local and global brain tumor features. Applied two image pre-processing techniques for reducing thermal noise and scanner-related artifacts in brain MRI. Used data augmentation. | BraTS2018: training: 209 HGG and 75 LGG from 284 patients; validation: 67 mixed grades. | Classification accuracy: 96.49% |
| [ | 2020 | T1w, T1w post GBCA, T2w, FLAIR | U-Net architecture, GANs | GANs were used to generate synthetic images for four MRI sequences: T1w, T1w post GBCA, T2w, and FLAIR. | TCGA-GBM [ | Average classification accuracy, sensitivity, and specificity of 88.82%, 81.81%, and 92.17%, respectively |
| [ | 2019 | Complete, core, and enhancing tumors | Custom CNN architecture, Bat algorithm | Used the Bat algorithm to optimize the loss function of the CNN. In addition, used skull stripping and image enhancement techniques for image pre-processing. | BraTS2015 | Accuracy, recall (or sensitivity), and precision of 92%, 87%, and 90%, respectively |
Figure 8Workflow for CNN-based brain tumor segmentation.
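The encoder-decoder design that dominates this workflow can be sketched as follows (a minimal, illustrative PyTorch model with one skip connection; channel counts and the four-sequence input are assumptions, not a surveyed architecture):

```python
import torch
import torch.nn as nn

class MiniUNet(nn.Module):
    """Minimal U-Net-style encoder-decoder emitting per-pixel class logits."""
    def __init__(self, in_ch=4, n_classes=4):     # e.g., 4 MRI sequences in, 4 labels out
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU())
        self.down = nn.Sequential(nn.MaxPool2d(2),
                                  nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)      # back to input resolution
        self.dec = nn.Sequential(nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(32, n_classes, 1))  # 1x1 conv -> logits

    def forward(self, x):
        e = self.enc(x)
        d = self.up(self.down(e))
        return self.dec(torch.cat([e, d], dim=1))              # skip connection

logits = MiniUNet()(torch.randn(1, 4, 128, 128))               # -> (1, 4, 128, 128)
```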
Summary of CNN-based segmentation techniques.
| Ref. | Year | Classes Considered | Method | Main Highlight | Dataset | Performance |
|---|---|---|---|---|---|---|
| [ | 2018 | Metastases, meningiomas, and gliomas | CNN | Designed a patching-based technique for brain tumor segmentation. Evaluated the impact of inter-institutional datasets. | TCIA | DSC (same institution): 0.72 ± 0.17 and 0.76 ± 0.12; (different institution): 0.68 ± 0.19 and 0.59 ± 0.19 |
| [ | 2017 | Necrosis, edema, non-ET, ET. | CNN | Designed a two-pathway architecture for capturing global and local features. Also designed three cascade architectures. | BraTS2013 | DSC: 0.88 |
| [ | 2020 | WT, TC, and ET | Triple CNN architecture for multi-class segmentation | Developed a triple network architecture to simplify the multi-class segmentation problem to a single binary segmentation problem. | BraTS2018 dataset | DSC: 0.90, 0.82, and 0.79 for WT, TC, and ET, respectively |
| [ | 2020 | T1w, T1w post GBCA, T2w, and FLAIR | Modified U-Net architecture, data augmentation, and intensity normalization using the N3 bias correction tool [ | Developed an encoder-decoder architecture for brain tumor segmentation. | BraTS 2019 | DSC, sensitivity, and specificity of 0.814, 0.783, and 0.999, respectively. |
| [ | 2021 | HGG and LGG | Inception-v3, NSGA, LDA, SVM, k-NN, softmax, CART, YOLOv2, and McCulloch’s Kapur entropy | Designed a CNN-based hybrid framework for tumor enhancement, feature extraction and selection, localization, and tumor segmentation | BraTS2018, BraTS2019, and BraTS2020 | Classification accuracy of 98%, 99%, and 99% for BraTS2018, BraTS2019, and BraTS2020, respectively. |
| [ | 2019 | Complete, core, and enhancing tumors | Custom CNN architecture, Bat algorithm | Used the Bat algorithm to optimize the loss function of the CNN. In addition, used skull stripping and image enhancement techniques for image pre-processing. | BraTS2015 | Accuracy, recall (or sensitivity), and precision of 92%, 87%, and 90%, respectively |
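The Dice similarity coefficient (DSC) reported throughout these tables measures the overlap between a predicted mask P and the ground truth T as DSC = 2|P ∩ T| / (|P| + |T|). A minimal NumPy implementation (our own sketch, not from any surveyed paper) is:

```python
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-8) -> float:
    """Dice similarity coefficient for binary masks: 2|P ∩ T| / (|P| + |T|)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum() + eps)

p = np.zeros((4, 4), dtype=bool); p[1:3, 1:3] = True   # predicted tumor mask (4 px)
t = np.zeros((4, 4), dtype=bool); t[1:4, 1:4] = True   # ground-truth mask (9 px)
print(round(dice(p, t), 3))                            # 2 * 4 / (4 + 9) ≈ 0.615
```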
Figure 9General workflow for ViT-based brain tumor segmentation techniques.
Summary of ViT-based techniques.
| Ref. | Year | Classes Considered | Method | Dataset | Main Highlight | Performance |
|---|---|---|---|---|---|---|
| [ | 2021 | ET, WT, and TC | Transformers and 3D CNN | BraTS2019 and BraTS2020 | Developed a transformer-based network for 3D brain tumor segmentation. | BraTS2020: DSC of 90.09%, 78.73%, and 81.73% for WT, ET, and TC, respectively. |
| [ | 2022 | ET, WT, and TC | Swin transformers and CNN | BraTS2021 | Developed a segmentation technique for multi-modal brain tumor images using Swin transformers and CNN. | DSC of 0.891, 0.933, and 0.917 for ET, WT, and TC, respectively. |
| [ | 2022 | WT, ET, and TC | Transformers and CNN | MSD dataset | Developed a segmentation technique for multi-modal brain tumor images using transformers and CNN. | DSC of 0.789, 0.585, and 0.761 for WT, ET, and TC, respectively. |
| [ | 2021 | WT, ET, and TC | Transformers and 3D CNN | BraTS2021 | Designed a CNN-transformer technique for multi-modal brain MRI scan segmentation. | DSC of 0.823, 0.908, and 0.839 for ET, WT, and TC, respectively. |
| [ | 2021 | WT, ET, and TC | Transformers and 3D CNN | BraTS2021 | Developed a U-Net shaped encoder-decoder technique using only transformers. The transformer encoder can capture local and global information. The decoder block allows parallel computation of cross- and self-attention. | DSC of 85.59%, 87.41%, and 91.20% for ET, TC, and WT, respectively |
Figure 10Workflow for a Capsule Network (CapsNet)-based brain tumor segmentation technique.
Summary of CapsNet-based brain tumor classification and segmentation techniques.
| Ref. | Year | Classes Considered | Method | Dataset | Main Highlight | Performance |
|---|---|---|---|---|---|---|
| [ | 2020 | Meningioma, | CapsNet and Bayesian theory. | Cancer dataset [ | Designed a DL technique that can model uncertainty associated with predictions of CapsNet models. | Classification accuracy: 68.3% |
| [ | 2021 | Meningioma, Glioma, Pituitary, normal | CapsNet | Brain tumor dataset. Meningioma (937 images), Glioma (926 images), Pituitary (901 images), normal (500) | Introduced a new activation function for CapsNet, called PSTanh activation function. | Classification accuracy of 96.70%. |
| [ | 2019 | Meningioma, Glioma, Pituitary, normal | CapsNet, dilated convolution | Brain tumor dataset [ | Developed a CapsNet-based technique using dilated convolution with the objective of maintaining the high resolution of the images for accurate classification. | Classification accuracy: 95.54%. |
| [ | 2019 | Meningioma, Glioma, Pituitary | CapsNet; classification; data pre-processing | Brain tumor dataset: 3064 [ | Presented a performance analysis of the effect of image pre-processing on CapsNet-based brain tumor classification. | Classification accuracy: 92.6% |
| [ | 2021 | T1w, T2w, T1w post GBCA, and FLAIR | SegCaps capsule network; brain tumor segmentation | BraTS 2020 | Designed a modified version of CapsNet using the SegCaps network. | DSC of 87.96%. |
| [ | 2018 | Meningioma, Pituitary, and Glioma | Capsule network | Brain tumor dataset proposed by [ | Developed different CapsNet architectures for brain tumor classification. Investigated the effect of the input data on CapsNet performance. Developed a visualization paradigm for CapsNet outputs. | Classification accuracy: 86.5% with segmented tumor inputs and 78% with whole brain images |
Figure 11Summarized results for classical ML-based brain tumor classification techniques. For studies that used the same algorithm, we selected the algorithm that produced the best performance. The SVM model designed by Sekhar et al. [14] produced the best performance for ML-based tumor classification. The RNN constructed by Kaur et al. [94] generated the second-best result for ML-based brain tumor classification. The remaining competitors include the DenseNet-169 proposed in Kang et al. [84], the ensemble approach exploited by Jena et al. [15], and the work of Deepak and Ameer [13] that combined SVM with CNN.
Figure 12Performance overview of CNN-based brain tumor classification techniques. The technique developed by Isselmou et al. [95] achieved the best performance for CNN-based brain tumor classification. The technique proposed by Mzoughi et al. [62] yielded the second-best classification accuracy. The remaining competitors include the work of Sajjad et al. [16], the VGG-19-based approach of Swati et al. [97], the work of Ge et al. [75] based on U-Net and GAN, and, finally, the combination of CNN and the Bat algorithm proposed by Thaha et al. [17].
Figure 13Performance overview of CNN-based brain tumor segmentation techniques. The technique developed by Havaei et al. [10] outperformed all the CNN-based brain tumor segmentation techniques presented in this study. The remaining competitors include the patching-based technique developed by AlBadawy et al. [73], the encoder-decoder architecture proposed by Zeineldin et al. [41], and the triple network architecture developed by Yogananda et al. [74].
Figure 14Performance overview of ViT-based techniques. The ViT-based technique developed by Hatamizadeh et al. [32] produced the best result, achieving DSC values of 93.3%, 89.1%, and 91.7% for WT, ET, and TC, respectively. The remaining competitors include the ViT-based technique developed by Jia et al. [104], the work of Peiris et al. [107], the study of Wang et al. [35], and the method proposed by Ali et al. [105].
Figure 15Performance overview of CapsNet-based brain tumor techniques. The technique proposed by Thaha et al. [17] in 2020 yielded the best result for CapsNet-based brain tumor segmentation and classification. The method developed by Adu et al. [19] produced the second-best result. The remaining competitors include the work of Afshar et al. [40], the studies of Kurup et al. [110] and Aziz et al. [37], and the method proposed by Afshar et al. [18] in 2018.