João Pedro Mazuco Rodriguez1,2, Rubens Rodriguez3, Vitor Werneck Krauss Silva2, Felipe Campos Kitamura2, Gustavo Cesar Antônio Corradi2, Ana Carolina Bertoletti de Marchi1, Rafael Rieder1.
Abstract
Digital pathology has grown recently, stimulated by the adoption of digital whole slide images (WSIs) in clinical practice, while the field has faced a shortage of pathologists in the last few years. This scenario opened lines of research applying artificial intelligence (AI) to help pathologists. One of them is automated diagnosis, which supports clinical decisions and can increase the efficiency and quality of diagnosis. However, the complex nature of WSIs requires special treatment to create a reliable AI model for diagnosis. Therefore, we systematically reviewed the literature to analyze and discuss the methods and results of AI in digital pathology applied to H&E-stained WSIs, investigating the capacity of AI as a diagnostic support tool for the pathologist in the routine real-world scenario. This review analyzes 26 studies, reporting in detail the best methods for applying AI as a diagnostic tool, as well as the main limitations, and suggests new ideas to improve the AI field in digital pathology as a whole. We hope that this study leads to a better use of AI as a diagnostic tool in pathology, helping future researchers in the development of new studies and projects.
Keywords: Artificial intelligence; Diagnosis; Pathology; Whole slide images
Year: 2022 PMID: 36268059 PMCID: PMC9577128 DOI: 10.1016/j.jpi.2022.100138
Source DB: PubMed Journal: J Pathol Inform
Fig. 1. Selection process of the studies.
Summary of the studies in all aspects analyzed in this review.
| Author | Year | Sample | Number of classes | Disease | Dataset | Training set | Test set | External test set | Pre-processing | Model (Patch level) | Model (Slide level) |  | Training approach | Results | Results of the external test set |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Lucas et al. | 2019 | Prostate | 4 | Cancer | Private | 268 000 patches | 89 000 patches | – | Data Augmentation | InceptionV3 + SVM | Percentages of GPs used for final Gleason grade | No | Supervised | Kappa: 0.70 | - |
| Pantanowitz et al. | 2020 | Prostate | 18 | Cancer | Private | 549 WSIs | 2501 WSIs | 1627 WSIs | Tissue segmentation and data augmentation | InceptionV1, InceptionV3 and ResNet101 | Maximum score | Yes | Supervised | AUC: 0.997 | AUC: 0.991, 0.941, 0.971, and 0.957 |
| Ström et al. | 2020 | Prostate | 2 and 4 | Cancer | Private | 1069 WSIs | 246 WSIs | 73 WSIs | Tissue segmentation and data augmentation | 30 InceptionV3 models | Boosted tree | Yes | Supervised | Kappa: 0.83 | Kappa: 0.70 |
| BenTaieb et al. | 2017 | Ovary | 5 | Cancer | Public | 68 WSIs | 65 WSIs | – | – | K-means | LSVM | Yes | Weakly Supervised | Kappa: 0.89 | – |
| Barker et al. | 2016 | Central nervous system | 2 | Cancer | Public | 302 WSIs | 45 WSIs | 302 WSIs | Tissue segmentation, color deconvolution and nuclei segmentation | – | Feature Extraction + Elastic Net (Regression) | No | Weakly supervised | Accuracy: 1.0 | Accuracy: 0.93 |
| Xu et al. | 2017 | Central nervous system | 2 | Cancer | Public | 55 WSIs | 40 WSIs | - | Tissue segmentation, resize and data augmentation | Customized AlexNet | Feature Pooling + SVM | Yes | Supervised | Accuracy: 0.975 | – |
| Bulten et al. | 2020 | Prostate | 7 | Cancer | Private | 933 WSIs | 210 WSIs | – | Tissue segmentation and data augmentation | Own CNN to detect tumor and U-Net to final label | Normalized percentage of the volume of each class | No | Supervised (with a semi-automatic annotation) | Kappa: 0.819 on Gleason score | – |
| Gecer et al. | 2018 | Breast | 5 | Cancer | Private | 180 WSIs | 60 WSIs | – | Color Normalization | RoI detector and an own proposed CNN | Majority voting | No | Weakly supervised | Accuracy: 0.55 | – |
| Silva-Rodríguez et al. | 2020 | Prostate | 4 and 1 | Cancer | Public | 155 WSIs | 2122 patches | - | Tissue segmentation and data augmentation | Own CNN | MLP | No and yes | Supervised | Kappa: 0.732 | – |
| Tokunaga et al. | 2019 | Gastric | 4 | Cancer | – | 29 WSIs | – | – | Data augmentation | AWMF-CNN | Aggregating CNN | No | Supervised | IoU (Mean): 0.536 | – |
| Sali et al. | 2019 | Small intestine | 4 | Celiac disease | Private | 336 WSIs | 120 WSIs | - | Tissue segmentation, color normalization, resize and data augmentation | Customized ResNet50 | Sum of all labels and majority | No | Weakly Supervised | Accuracy: 1.0 | - |
| Xu et al. | 2020 | Prostate | 3 | Cancer | Public | 312 WSIs | 49,883 patches | – | Grayscale and tissue segmentation | Feature extractor | PCA and SVM | No | Weakly Supervised | Accuracy: 0.771 | – |
| Mercan et al. | 2018 | Breast | 14 | Cancer | Private | 240 WSIs | 60 WSIs | – | – | Feature extractor + Linear classifier | PCA and SVM | No | Weakly supervised | Average precision: 0.737 | – |
| Adnan et al. | 2020 | Lung | 2 | Cancer | Public | 1026 WSIs | – | – | RoI selection | Feature extractor | GCN | No and yes | Weakly supervised | 0.89 AUC | – |
| van Zon et al. | 2020 | Skin | 3 | Cancer | Private | 232 WSIs | 331 WSIs | - | Tissue segmentation and data augmentation | U-Net | Own CNN | No | Supervised | 0.954 Accuracy | – |
| Wang et al. | 2019 | Lung | 4 | Cancer | Private | 754 WSIs | 185 WSIs | - | Tissue segmentation, resize and data augmentation | ScanNet | Aggregation of patch prediction values + Random forest | No | Weakly supervised | Accuracy: 0.973 | – |
| Syrykh et al. | 2020 | Lymph node | 2 | Cancer | Private | 75% of 378 WSIs | 25% of 378 WSIs | 48 Cases | Tissue segmentation | CNN | Average of patch inferences | – | Weakly supervised | AUC: 0.99 | AUC: 0.69 |
| Wei et al. | 2019 | Small intestine | 3 | Celiac disease | Private | 1,018 WSIs | 212 WSIs | - | Data augmentation and color normalization | ResNet50 | Threshold to discard low confidence + Most frequent predicted class | Yes | – | Average F1 score: 0.872 | – |
| Korbar et al. | 2017 | Small intestine | 6 | Colorectal polyps | Private | 458 WSIs | 239 WSIs | - | Data augmentation, color normalization and resize | ResNet-D | At least 5 positive class patches with 70% of confidence | No | Supervised | Overall F1 score: 0.888 | – |
| Nagpal et al. | 2019 | Prostate | 4 | Cancer | Public and private | 1,226 WSIs | 331 WSIs | - | Data augmentation | Customized inception V3 | K-nearest neighbor model from patch prediction | No | Supervised | Gleason Score Accuracy: 0.70 | – |
| Olsen et al. | 2018 | Skin | 3 models with 2 classes | Cancer | Private | Study 1: 300 WSIs; Study 2: 225 WSIs; Study 3: 225 WSIs | Study 1: 126 WSIs; Study 2: 114 WSIs; Study 3: 123 WSIs | – | Tissue segmentation | Derivative VGG + Rule-based discriminator | Classification model trained with the segmented areas | No | Supervised | Study 1 Accuracy: 0.9945; Study 2 Accuracy: 0.994; Study 3 Accuracy: 1.0 | – |
| Wei et al. | 2019 | Lung | 6 | Cancer | Private | RoIs from 279 WSIs | 143 WSIs | – | Tissue segmentation, data augmentation and color normalization | ResNet18 | Threshold to discard low confidence + Most frequent predicted class | Yes | Supervised | Kappa Score: 0.525 | - |
| Ianni et al. | 2020 | Skin | 4 | Cancer | Private | 85% of 5,070 WSIs | 15% of 5,070 WSIs | 13,537 WSIs | – | Own Encoder-Decoder CNN + U-Net | Own CNN | No | Supervised (Patch) and Weakly Supervised (Slide) | – | Accuracy: 0.98 |
| Iizuka et al. | 2020 | Stomach & Small intestine | 2 models with 3 classes | Cancer | Private | Stomach: 3,628 WSIs; Colon: 3,536 WSIs | Stomach & Colon: 500 WSIs | Stomach & Colon: 500 WSIs | Tissue segmentation and data augmentation | Customized Inception V3 | RNN using the last but one layer from the previous model as input | No | Supervised | AUC, Stomach: 0.97 and 0.99; Colon: 0.96 and 0.99 | AUC, Stomach: 0.98 and 0.93; Colon: 0.97 and 0.96 |
| Campanella et al. | 2019 | Skin | 2 | Cancer | Private | 8387 WSIs | 1575 WSIs | – | – | ResNet34 | RNN using the last but one layer from the previous model as input | No | Weakly supervised | AUC: 0.994 | – |
| Chuang et al. | 2020 | Larynx, lip and oral cavity, esophagus, pharynx | 3 | Cancer | Private | 626 Cases | 100 Cases | – | – | ResNeXt | ResNet using the probability map as input | Yes | Supervised | AUC: 0.985 | – |
Captions – Not mentioned or not performed
Details can be found in the Supplementary Table
The training and validation sets used during training were counted together as the training set in this column
Not clearly specified; only the test set size and the whole dataset size were reported, so this number was estimated from those two values
The authors reported no metrics for the final diagnosis; we calculated this metric from their misclassification comparison table
AUC of adenocarcinoma and adenoma compared to benign, respectively
This study used the same model in 2 different tasks of lung carcinoma, one in a private set with 4 classes, and another in the TCGA differentiating 2 classes. We considered the most complex task.
The authors reported only the Benign vs. Cancer AUC for the internal test set.
Metrics representing: Benign vs Cancer, Gleason score 6 or ASAP vs Gleason score 7–10, ASAP or Gleason pattern 3 or 4 vs Gleason pattern 5, Cancer without vs with perineural invasion, respectively
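Several of the slide-level models in the table are simple rules that aggregate patch-level predictions into one slide-level decision (majority voting, maximum score, average of patch inferences, discarding low-confidence patches before taking the most frequent class, and a minimum count of confident positive patches). The following is a minimal Python sketch of such rules, not the implementation of any reviewed study. All function names and the 0.5 default threshold are hypothetical; the defaults of 5 patches and 70% confidence are taken from the Korbar et al. row.

```python
from collections import Counter
from statistics import mean


def majority_vote(patch_labels):
    """Slide label = most frequent patch label (ties go to the label seen first)."""
    return Counter(patch_labels).most_common(1)[0][0]


def max_score(patch_probs):
    """Slide score = highest patch-level probability of the positive class."""
    return max(patch_probs)


def mean_score(patch_probs):
    """Slide score = average of the patch-level probabilities."""
    return mean(patch_probs)


def confident_majority(patch_preds, min_conf=0.5, fallback=None):
    """Drop patches below min_conf, then take the most frequent remaining label.

    patch_preds is a list of (label, confidence) pairs. min_conf=0.5 is an
    assumed value for illustration only. Returns fallback if nothing is kept.
    """
    kept = [label for label, conf in patch_preds if conf >= min_conf]
    return majority_vote(kept) if kept else fallback


def count_rule(patch_probs, min_patches=5, min_conf=0.70):
    """Slide is positive if at least min_patches patches reach min_conf."""
    return sum(p >= min_conf for p in patch_probs) >= min_patches
```

For example, with patch probabilities `[0.2, 0.9, 0.8, 0.75, 0.71, 0.95, 0.1]`, `count_rule` flags the slide as positive because five patches reach 0.70, while `max_score` returns 0.95. Rules based on a maximum are sensitive to a single false-positive patch, which is one reason some studies in the table train a second-stage model (SVM, random forest, RNN) on the patch outputs instead.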