Yongzhong Li, Donglai Chen, Xuejie Wu, Wentao Yang, Yongbing Chen.
Abstract
OBJECTIVE: To summarize the current evidence regarding the applications, workflow, and limitations of artificial intelligence (AI) in the management of patients pathologically diagnosed with lung cancer.
Keywords: Lung cancer; artificial intelligence (AI); decision-making; pathologic diagnosis
Year: 2021 PMID: 35070383 PMCID: PMC8743410 DOI: 10.21037/jtd-21-806
Source DB: PubMed Journal: J Thorac Dis ISSN: 2072-1439 Impact factor: 2.895
Figure 1. Architecture of the deep CNN used for discriminating NSCLC subtypes on the pathologic image slides. CNN, convolutional neural network; NSCLC, non-small cell lung cancer.
The characteristics of the included studies using AIs to distinguish subsets of NSCLC based on pathological image slides
| Authors | Publication year | Number of datasets | Number of cases | Number of images | Subtypes (images number) | Training set (images) | Validation set (images) | Test set (images) | Independent test datasets (images) | Classifier | ACC | AUC | SN | SP | Conclusion |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Coudray | 2018 | 2 | NR | 2075 | TCGA: | 1,145 | 245 | 245 | 340 | Inception v3 | Normal: 0.984; | 0.968 | 89% | 93% | CNN model could be a useful tool for classification of ADCs and SCCs, depending on whole-slide images and mutational gene status of NSCLCs |
| Teramoto | 2017 | 1 | 76 | 298 | ADC (n=82), | 96 | 98 | 104 | 0 | CNN | ADC: 89.0%; | NR | NR | NR | Approximately 71% of the images were classified correctly, which was on par with the accuracy of cytotechnologists and pathologists |
| Yu | 2020 | 2 | 1009 | 1600 | TCGA: ADC (n=427), SCC (n=457), normal (n=514); ICGC: ADC (n=87), SCC (n=38), normal (n=77) | 1,280 | 0 | 320 | 202 | AlexNet, GoogLeNet, VGGNet-16, and ResNet | NR | VGGNet: 0.891; | NR | NR | The utility of CNNs in classifying the histopathology images of the major types of NSCLCs obtained promising performance |
| Khosravi | 2017 | 2 | NR | 4009 | TCGA: ADC (n=1,606), SCC (n=1,543); TMAD: ADC (n=637), SCC (n=223) | 1,629 | 0 | 1,520 | 860 | CNN-basic, Inception V3, Inception V1, Inception-ResNet V2 | CNN-basic: 73%; | CNN-basic: 64%; | CNN-basic: 73%; | NR | Fine-tuned inception architectures provided promising accuracies for distinguishing ADCs and SCCs in both datasets, significantly superior to the other four CNNs with various configurations |
| Koh | 2014 | 2 | 400 | 400 | SNUH 1: ADC (n=108), SCC (n=59), other (n=17); SNUH 2: NR; SNUH 3 and SNUBH: NR | 184 | 186 | 30 | 0 | DT and SVM | DT: 72.2%; | NR | ADC: 83.3%; | ADC: 83.3%; | Machine learning algorithms were effective for subtyping NSCLCs in small biopsies using p63 and/or CK5/6 in addition to the 3-marker IHC panel |
NR, not reported; AI, artificial intelligence; CNN, convolutional neural networks; SVM, Support Vector Machine; DT, Decision Tree; ADC, adenocarcinoma; SCC, squamous-cell carcinoma; NSCLC, non-small cell lung cancer; SCLC, small cell lung cancer; TCGA, The Cancer Genome Atlas cohort; ICGC, the International Cancer Genome Consortium; TMAD, the Stanford Tissue Microarray Dataset; SNUH, Seoul National University Hospital; SNUBH, Seoul National University Bundang Hospital; IHC, immunohistochemistry; ACC, accuracy; SN, sensitivity; SP, specificity.
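The ACC, SN, and SP columns above follow the standard confusion-matrix definitions. As a minimal sketch of how the three relate, the function below computes them from raw counts. The function name and the example counts are hypothetical and are not taken from any of the included studies.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "ACC": (tp + tn) / total,   # fraction of all cases classified correctly
        "SN": tp / (tp + fn),       # fraction of true positives that were detected
        "SP": tn / (tn + fp),       # fraction of true negatives that were rejected
    }


# Hypothetical counts for illustration only (e.g. ADC as the positive class).
metrics = confusion_metrics(tp=45, fp=5, tn=40, fn=10)
print(metrics)
```

Because a study can report high accuracy while sensitivity or specificity lags, the rows marked NR for SN and SP are harder to compare than the ACC column alone suggests.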
The characteristics of the included studies for diagnosing NSCLC through the gene profiles analyzed by AI models
| Authors | Publication year | Number of datasets | Number of cases | Number of genes (total) | Subtypes (cases) | Training set (cases) | Validation set (cases) | Test set (cases) | Independent test datasets (cases) | Classifier | ACC | SN | SP | AUC | Precision | Conclusion |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Xiao | 2017 | 1 | 162 | 1,385 | TCGA: ADC (n=162) | NR | NR | NR | 0 | DL-based multi-model (KNN, SVM, DT, RF, GBDT) | KNN: 88.00%; SVM: 97.20%; DT: 96.80%; RF: 93.20%; GBDT: 96.80%; majority voting: 97.20%; DL-based method: 99.20% | DT: 97.37% | NR | NR | DT: 98.46% | The DL-based multi-model algorithm could obtain more information to achieve the accuracy of 99.20% for distinguishing ADCs from normal |
| Yuan | 2020 | 1 | 150 | 1,100, 260, 43 (n=20,502) | GEO: ADC (n=77), SCC (n=73) | NR | NR | NR | 0 | SVM, RF, RIPPER | SVM: 0.867; RF: 0.880; RIPPER: 0.867 | SVM: 0.987; RF: 0.974; RIPPER: 0.867 | SVM: 0.740; RF: 0.781; RIPPER: 0.872 | NR | SVM: 0.800; RF: 0.772; RIPPER: 0.877 | On the gene expression dataset of NSCLC subtypes, the RIPPER algorithm performed almost as well as the SVM and RF classifiers in subtyping NSCLCs |
| Podolsky | 2016 | 3 | 480 | NR | DFCI: ADC (n=139), SCC (n=21), other (n=26), normal (n=17); UMD: ADC (n=86), normal (n=10); BWHD: ADC (n=150), other (n=31) | 235 | 96 | 149 | 0 | KNN, NB, SVM, DT | NR | NR | NR | KNN, k=1: 0.87; KNN, k=5: 0.96; KNN, k=10: 0.97; NB_normal: 0.85; NB_histogram: 0.84; SVM: 0.91; C4.5 DT: 0.92 | NR | Compared with other machine learning algorithms, SVM was the optimal tool in NSCLC morphology classification based on gene expression level evaluation |
| Cai | 2015 | 2 | 1,099 | 16 (n=45) | TCGC: ADC (n=126), SCC (n=134); GEO: SCLC (n=28); TCGA: ADC (n=452), SCC (n=359) | 288 | 0 | 811 | 0 | RF and multi-SVMs | Training datasets: 86.54%; Independent datasets: 84.60% | Training datasets: 84.37%; Independent datasets: 85.52% | NR | NR | Training datasets: 66.79%; Independent datasets: 85.94% | The accuracies of multi-SVM model with such 16 top features for diagnosing NSCLC subtypes were 86.54% and 84.6% in the training and test set, respectively |
| Li | 2018 | 2 | 853 | 20 (n=107) | TCGA: ADC (n=286), normal (n=59); GEO: ADC (n=387), normal (n=121) | 2/3 of each dataset | 0 | 1/3 of each dataset | 0 | RF, SVM, and ANN | TCGA: 98.68%; GSE68465: 99.51%; GSE10072: 97.91%. | TCGA: 99.28%; GSE68465: 99.95%; GSE10072: 98.05% | TCGA: 95.68%; GSE68465: 92.83%; GSE10072: 97.75% | NR | NR | Machine learning models with twenty ADC signature genes were robust for early ADC diagnosis |
| Dong | 2019 | 1 | 369 | 699 | TCGA: ADC (n=369) | NR | NR | NR | 0 | SVM, KNN, LR, RF, gcForest and the ensemble MLW-gcForest | Methylation: 0.751; RNA: 0.689; CNV: 0.645; Multi-modal: 0.908 | Methylation: 0.763; RNA: 0.679; CNV: 0.677; Multi-modal: 0.882 | NR | Multi-modal: 0.96 | Methylation: 0.771; RNA: 0.659; CNV: 0.675; Multi-modal: 0.896 | MLW-gcForest algorithm had an AUC of 0.96 and an accuracy of 0.908 for ADC staging, better than those achieved by traditional machine learning algorithms |
| Yang | 2020 | 2 | 600 | 42, 26, 16 (n=528) | TCGA: ADC (n=470); GSE62182: ADC (n=94); GSE83527: ADC (n=36) | 376 | 94 | 0 | 130 | SVM | NR | NR | NR | TCGA: 0.62; GSE62182: 0.66; GSE83527: 0.63 | NR | The 16‑miRNA signature analyzed by LIBSVM algorithm showed a similar ability to classify ADC pathological stages to that of the combinations of 42 or 26 miRNAs |
NR, not reported; AI, artificial intelligence; DL, deep learning; SVM, Support Vector Machine; KNN, K-nearest neighbors; GBDT, gradient boosting decision trees; LR, logistic regression; RF, Random Forest; DT, Decision Tree; ANN, artificial neural networks; NB, Naive Bayes; RIPPER, Repeated Incremental Pruning to Produce Error Reduction algorithm; ADC, adenocarcinoma; SCC, squamous-cell carcinoma; NSCLC, non-small cell lung cancer; SCLC, small cell lung cancer; TCGA, The Cancer Genome Atlas; GEO, Gene Expression Omnibus; DFCI, Dana-Farber Cancer Institute; UMD, University of Michigan Dataset; BWHD, Brigham and Women’s Hospital Dataset; CNV, copy number variation; AUC, area under the receiver-operating characteristic (ROC) curve; ACC, accuracy; SN, sensitivity; SP, specificity.
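The AUC and Precision columns in the table above measure different things: AUC summarizes how well a classifier ranks positive cases above negative ones across all thresholds, while precision is computed at one fixed threshold. The sketch below illustrates both using the pairwise-ranking formulation of AUC. The function names, labels, scores, and the 0.5 threshold are hypothetical and are not drawn from the included studies.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Each positive/negative pair counts 1 for a win and 0.5 for a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision(labels, preds):
    """Fraction of predicted positives that are truly positive."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return tp / (tp + fp)


# Hypothetical data: 1 = ADC, 0 = normal; scores are model outputs.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

auc_value = roc_auc(labels, scores)
precision_value = precision(labels, preds)
print(auc_value, precision_value)
```

The pairwise form is quadratic in the number of cases, which is acceptable for cohorts of the sizes listed here; library implementations use a sort-based equivalent.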
The characteristics of the included studies regarding DL models for identifying tumor patterns or variety of cells on digital slides
| Authors | Publication year | Number of datasets | Number of cases | Number of images | Subtypes (images) | Training set (images/cells) | Validation set (images/cells) | Test set (images/cells) | Independent test datasets | Classifier | ACC | SN | SP | AUC | Precision | Conclusion |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gertych | 2019 | 3 | 110 | 206 | CSMC: ADC (n=91); | 78 | 19 | 109 | 0 | GoogLeNet, ResNet-50 and AlexNet | FT-AlexNet: 75.3%; | NR | NR | NR | NR | One of the DN-AlexNets outperformed the other CNNs, with an accuracy of 89.90% for classification of the five tissue classes in the test set |
| Wei | 2019 | 1 | NR | 422 | DHMC: ADC (n=422) | 245 | 34 | 143 | 0 | ResNet | NR | NR | NR | Lepidic: 0.988; | NR | CNN could improve classification accuracy of ADC patterns by automatically pre-screening, superior to pathologists |
| Wang | 2020 | 2 | 507 | 639 | TCGA: ADC (n=208); | 12,000 cell nuclei | 1,227 cell nuclei | 1,086 cell nuclei | 0 | Mask R-CNN, Cox proportional hazard prognostic model | 88% in the validation set; 90% in the testing set. | NR | NR | NR | NR | Mask R-CNN extracted and identified 48 cell spatial features, which could identify a high-risk group with significantly worse survival than the low-risk group |
| AbdulJabbar | 2020 | 2 | 1,070 | 4,599 | TRACERx: NSCLC (n=275); | 16790 H&E cells and 9333 IHC cells | 4219 H&E cells | 5951 H&E cells and 5028 IHC cells | 5082 H&E cells | SCCNN | Lymphocyte: 0.942; | Lymphocyte: 0.902; | Lymphocyte: 0.982; | NR | NR | SCCNN for NSCLCs exhibited high accuracy of single-cell classification in H&E digital slides and T-cell identification in the IHC image slides, respectively |
| Wang | 2019 | 3 | NR | 159 | TCGA and NLST: ADC (n=29); | 29 | 130 | 0 | 0 | DL-based ConvPath software | Lymphocytes: 99.3%; | NR | NR | NR | NR | The overall classification accuracies of the CNN in both datasets were 99.3% for lymphocytes, 87.9% for stromal cells, and 91.6% for tumor cells, respectively |
| Teramoto | 2020 | 1 | 60 | 793 | Normal (n=25); | NR | 173 | NR | 0 | PGGAN, DCGAN, ImageNet | ImageNet: 0.810; | ImageNet: 0.850; | ImageNet: 0.768; | NR | NR | PGGAN for cytological specimens improved the classification specificity by 8.5% and the total classification accuracy by approximately 4.3% compared to a CNN model |
| Saha | 2021 | 1 | 712 | 712 | TCGA: ADC (n=356); | 356 | 160 | 160 | 0 | TilGAN | 0.98 | 0.96 | NR | NR | 0.98 | TilGAN generated high-quality synthetic pathology images and could efficiently classify real TIL and non-TIL patches with improved accuracy |
NR, not reported; CNN, convolutional neural networks; DL, deep learning; SCCNN, spatially constrained convolutional neural networks; ADC, adenocarcinoma; NSCLC, non-small cell lung cancer; TCGA, The Cancer Genome Atlas; CSMC, Cedars-Sinai Medical Center; MIMW, the Military Institute of Medicine in Warsaw; DHMC, the Dartmouth-Hitchcock Medical Center; NLST, the National Lung Screening Trial project; SPORE, the University of Texas Special Program of Research Excellence; H&E, hematoxylin and eosin; IHC, immunohistochemistry; ACC, accuracy; SN, sensitivity; SP, specificity; GAN, generative adversarial network; PGGAN, progressive growing of GAN; DCGAN, deep convolutional generative adversarial network; TIL, tumor-infiltrating lymphocyte.
The characteristics of the included studies for prognosis-predicting models of AI based on the features of the image slides or genes profiles
| Authors | Publication year | Number of datasets | Number of cases | Number of images | Features/genes (total) | Subtypes (cases) | Training set (cases) | Validation set (cases) | Test set (cases) | Independent test datasets (cases) | Classifier | High | ACC | AUC | SN | Conclusion |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Wang | 2018 | 2 | 539 | 824 | 22 F (n=22) | NLST: ADC (n=150); TCGA: ADC (n=389) | 150 | 0 | 0 | 389 | Inception (V3), univariate Cox proportional hazard model | 2.25 (1.34–3.77) | Tumor: 88.1%; | NR | NR | The prognostic prediction model with 22 shape features extracted by CNN was considered an objective prognostic model of ADCs, superior to clinical variables |
| Yu | 2016 | 2 | 1,311 | 2,480 | 240, 15 F (n=9,879) | TCGA: ADC (n=515), SCC (n=502); TMAD: ADC (n=227), SCC (n=67) | 70% of TCGA | 0 | 30% of TCGA | 294 | NB, SVM, BT, RF; net-Cox proportional hazards models | NR | NR | Bagging: 0.74; Naive bayes: 0.63; RF: 0.75; RF with CITs: 0.73; SVMs with gaussian kernel: 0.75; SVMs with linear kernel: 0.70; SVMs with polynomial kernel: 0.74 | NR | Histopathological classifiers could successfully predict survival outcomes of NSCLCs, superior to pathologists |
| Luo | 2017 | 1 | 1,034 | 3,186 | 18 F (n=943) | TCGA: ADC (n=523), SCC (n=511) | 2/3 of TCGA | 0 | 1/3 of TCGA | 0 | RF prediction model | ADC: 2.34 (1.12–4.91); | NR | NR | NR | The RF model with morphological features of digital slides showed the ability to predict prognosis in NSCLCs |
| Wang | 2017 | 3 | 305 | NR | 7 F (n=242) | Cohort 1: ADC (n=17), SCC (n=44), Other (n=9); Cohort 2: ADC (n=51), SCC (n=21), Other (n=47); Cohort 3: ADC (n=54), SCC (n=20), Other (n=41) | 70 | 119 | 0 | 116 | QDA, LDA, SVM | NR | Cohort 1: 81%; | Cohort 2: 0.84; | NR | QDA with nuclear feature of digitized slides of NSCLC biopsies yielded an accuracy of 81%, 82% and 75% for recurrence prediction in cohort 1, 2 and 3 respectively |
| Li | 2019 | 2 | 1,463 | – | 16 G (n=2,472) | TCGA: ADC (n=492); GEO: ADC (n=971) | 492 | 232 | 347 | 386 | LASSO; Cox regression | 3.32 (2.11–5.21) | NR | 1-year: 0.822; | NR | The 16-gene-based LASSO model served as a practical and reliable tool for predicting prognosis in ADCs |
| Yu | 2019 | 1 | 371 | – | 28, 85 G | TCGA: ADC (n=371) | 297 | 0 | 74 | 0 | SVM | NR | EBT_0.10: 73.6%; | EBT_0.10: 0.710; | EBT_0.10: 93.8%; | SVM model with the genetic features could well predict the ADC prognosis, much better than the conventional TNM staging system |
NR, not reported; AI, artificial intelligence; LASSO, least absolute shrinkage and selection operator; NB, Naive Bayes; RF, random forest; BT, bagging for classification trees; QDA, Quadratic discriminant analysis; LDA, linear discriminant analysis; SVM, support vector machine; ADC, adenocarcinoma; SCC, squamous-cell carcinoma; NSCLC, non-small cell lung cancer; SCLC, small cell lung cancer; TCGA, The Cancer Genome Atlas; NLST, the National Lung Screening Trial project; TMAD, the Stanford Tissue Microarray dataset; GEO, Gene Expression Omnibus; TNM, tumor, nodes, and metastasis; ACC, accuracy; SN, sensitivity; SP, specificity.
The characteristics of the included studies with the concordance rate between WFO and MDT in different stages and subtypes
| Authors | Publication year | Country | Number of cases (M/F) | Median age (range), years | Subtypes (cases) | Stage I NSCLC | Stage II NSCLC | Stage III NSCLC | Stage IV NSCLC | ADC | SCC | SCLC | Overall |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Kim | 2020 | Korea | 405 (340/65) | 71 (37–88) | ADC (n=157); | NR | NR | NR | NR | 94.90% | 90.20% | 97.90% | 92.40% |
| You | 2020 | China | 310 (215/95) | NR | ADC (n=217); | NR | NR | NR | NR | 87.56% | 79.12% | – | 85.16% |
| Yao | 2020 | China | 165 (109/56) | NR | ADC (n=121); | Stage ≤III: 77.8% | Stage ≤III: 77.8% | Stage ≤III: 77.8% | 93.50% | 90.50% | 90.70% | – | 73.30% |
| Liu | 2018 | China | 149 (124/25) | 60 (26–83) | ADC (n=61); | 83% | 59% | 42% | 89% | NSCLC: 61.1% | NSCLC: 61.1% | 83% | 81.90% |
NR, not reported; MDT, the multidisciplinary team; WFO, Watson for Oncology; ADC, adenocarcinoma; SCC, squamous-cell carcinoma; NSCLC, non-small cell lung cancer; SCLC, small cell lung cancer; ASC, adenosquamous carcinoma; LC, large cell lung cancer.
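The concordance rates in the table above are proportions of cases in which the WFO recommendation agreed with the MDT decision, reported overall and by subgroup. The sketch below shows one way such rates could be tabulated. It assumes agreement means an exact match between the two treatment labels, which may be stricter than the definition used in the included studies. The function name, treatment labels, and cases are all hypothetical.

```python
from collections import defaultdict


def concordance_by_group(cases):
    """Return per-group and overall agreement rates.

    `cases` is an iterable of (group, wfo_recommendation, mdt_decision) tuples.
    Agreement is assumed to be an exact label match (an illustrative choice).
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [agreements, total]
    for group, wfo, mdt in cases:
        counts[group][1] += 1
        if wfo == mdt:
            counts[group][0] += 1
    rates = {group: agree / total for group, (agree, total) in counts.items()}
    all_agree = sum(agree for agree, _ in counts.values())
    all_total = sum(total for _, total in counts.values())
    rates["Overall"] = all_agree / all_total
    return rates


# Hypothetical cases for illustration only.
cases = [
    ("ADC", "chemotherapy", "chemotherapy"),
    ("ADC", "surgery", "surgery"),
    ("ADC", "chemotherapy", "targeted therapy"),
    ("SCC", "chemotherapy", "chemotherapy"),
    ("SCC", "radiotherapy", "chemotherapy"),
]
rates = concordance_by_group(cases)
print(rates)
```

Note that the overall rate is weighted by subgroup size, so it is not the simple mean of the subgroup rates.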