Hang Qiu, Shuhan Ding, Jianbo Liu, Liya Wang, Xiaodong Wang.
Abstract
Colorectal cancer (CRC) is one of the most common cancers worldwide. Accurate early detection and diagnosis, comprehensive assessment of treatment response, and precise prediction of prognosis are essential to improving patient survival. In recent years, owing to the explosion of clinical and omics data and to groundbreaking research in machine learning, artificial intelligence (AI) has shown great potential in the clinical management of CRC, providing clinicians with new auxiliary approaches to identify high-risk patients, select precise and personalized treatment plans, and predict prognoses. This review comprehensively analyzes and summarizes the research progress and clinical application value of AI technologies in CRC screening, diagnosis, treatment, and prognosis, showing the current status of AI in each of the main clinical stages. The limitations, challenges, and future perspectives of the clinical implementation of AI are also discussed.
Keywords: artificial intelligence; colorectal cancer; deep learning; diagnosis; machine learning; prognosis; screening; treatment
Year: 2022 PMID: 35323346 PMCID: PMC8947571 DOI: 10.3390/curroncol29030146
Source DB: PubMed Journal: Curr Oncol ISSN: 1198-0052 Impact factor: 3.677
Figure 1. Clinical applications of AI for CRC. The inner circle represents the main data types in CRC research, including radiological images (e.g., computed tomography (CT) and magnetic resonance imaging (MRI)), endoscopic images, pathological images, clinical data, and omics data; the outer circle represents the four key clinical stages of CRC, i.e., screening, diagnosis, treatment, and prognosis; for each clinical stage, the specific AI tasks are shown in the boxes outside the circle. OS, overall survival; DFS, disease-free survival; nCRT, neoadjuvant chemoradiotherapy.
Figure 2. Basic concepts of AI, ML, and DL. (a) The relationships among AI, ML, and DL; (b) the workflows of ML and DL.
Figure 3. Common CRC image types. (a) Endoscopy; (b) CT; (c) MRI; (d) pathology image (hematoxylin and eosin (HE)-stained slide). (Image courtesy: West China Hospital, Sichuan University).
Summary of AI applications for CRC screening. (DNN, deep neural network; SSD, single shot multibox detector; RF, random forest; LMT, logistic model trees; SVM, support vector machine; LR, logistic regression; NB, naïve Bayes; DT, decision tree; CNN, convolutional neural network; PPV, positive predictive value; Faster R-CNN, faster region-based CNN; AUC, area under the curve).
| Topic | Task | Dataset | Model | Performance | Year | Ref. |
|---|---|---|---|---|---|---|
| CRC | High-risk patient detection | 111 patients’ microarray data including 22,278 features | LightGBM, DNN | Accuracy: 100% | 2021 | [ |
| | Polyp classification | 47,555 endoscopy images for 24 patients | SSD | Accuracy: 0.9067, | 2021 | [ |
| | Serum biomarker detection | 186 blood serum samples (39 advanced adenomas, 90 CRC and 57 healthy controls) | RF, Random Tree, LMT, SVM | Accuracy: 75% | 2021 | [ |
| | Serum biomarker detection | 263 blood serum protein samples (213 individuals undergoing screening endoscopy and 50 non-metastatic CRC) | LR, SVM, Gaussian NB, DT, RF, and extremely randomized trees | AUC: 0.75, | 2020 | [ |
| | Polyp detection and classification | 27,508 endoscopy images | CNN | Detection: | 2020 | [ |
| | Polyp localization | EAD2019, CVC-ClinicDB, ETIS-Larib, in-house dataset, Kvasir-SEG | RetinaNet | Precision: 0.537 | 2020 | [ |
| | Polyp detection | CVC-CLINIC, ASU-Mayo Clinic, CVC-ClinicVideoDB | Faster R-CNN, SSD | Sensitivity: 0.9086, | 2020 | [ |
| | Polyp detection and classification | 871 endoscopy images from 218 patients | ResNet50, | F1: 0.6872, F2: 0.6607 | 2019 | [ |
| | Polyp detection | 8641 endoscopy images | CNN | Sensitivity: 90.0%, | 2018 | [ |
| | Polyp segmentation | CVC-ColonDB | CNN | Specificity: 74.8%, | 2018 | [ |
| | High-risk patient prediction | Colon cancer screening center data (EMRs) | Colonflag | The odds of Colonflag and normal colonoscopies: 2.0 | 2018 | [ |
| | Polyp classification | 1930 NBI images | CNN | Accuracy: 85.9%, | 2017 | [ |
| | High-risk patient detection | 112,584,133 US community-based insured data | Colonflag | AUC: 0.80 ± 0.01 | 2017 | [ |
| | High-risk patient detection | 17,095 patients from KPNW (EMRs) | Mescore | Top 3% score > 97.02 | 2017 | [ |
| | Polyp detection | 24 endoscopy videos | Energy map | AUC: 0.79, | 2016 | [ |
| | High-risk patient detection | 606,403 Israelis and 25,613 UK dataset (EMRs) | Mescore | AUC: 0.82 ± 0.01 and 0.81 for validation sets | 2016 | [ |
| | Polyp classification | 1890 NBI endoscopic images | HuPAS version 3.1 | Accuracy: 98.7% | 2012 | [ |
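
The Performance column above mixes several metrics (accuracy, sensitivity, specificity, PPV, F1, F2), which makes the studies hard to compare directly. As a reminder of how these quantities relate, the following is a minimal Python sketch that derives them from binary confusion-matrix counts. The class `ConfusionCounts`, the function names, and the example counts are hypothetical and chosen only for illustration; they are not taken from any study in the table.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical container for this sketch)."""
    tp: int  # true positives, e.g. polyps correctly flagged
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives, e.g. missed polyps


def _ratio(num: float, den: float) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else math.nan


def sensitivity(c: ConfusionCounts) -> float:
    """Recall: fraction of actual positives that were detected."""
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of actual negatives that were correctly rejected."""
    return _ratio(c.tn, c.tn + c.fp)


def ppv(c: ConfusionCounts) -> float:
    """Positive predictive value (precision)."""
    return _ratio(c.tp, c.tp + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all cases classified correctly."""
    return _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)


def f_beta(c: ConfusionCounts, beta: float = 1.0) -> float:
    """F-beta score; beta=1 gives F1, beta=2 weights sensitivity more heavily."""
    p, r = ppv(c), sensitivity(c)
    b2 = beta * beta
    return _ratio((1 + b2) * p * r, b2 * p + r)


if __name__ == "__main__":
    # Illustrative counts only, not data from any cited study.
    counts = ConfusionCounts(tp=90, fp=30, tn=170, fn=10)
    print(f"sensitivity={sensitivity(counts):.3f}")
    print(f"specificity={specificity(counts):.3f}")
    print(f"PPV={ppv(counts):.3f}")
    print(f"accuracy={accuracy(counts):.3f}")
    print(f"F1={f_beta(counts, 1.0):.3f}  F2={f_beta(counts, 2.0):.3f}")
```

Because accuracy depends on class balance while sensitivity and specificity do not, two rows reporting only "Accuracy" are comparable only when their datasets have similar proportions of positive cases.
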
Summary of AI models for CRC diagnosis and staging. (CNN, convolutional neural network; AUC, area under the curve; SVM, support vector machine; PNN, probabilistic neural network; NL, normal mucosa; AD, adenoma; ADC, adenocarcinoma; WSI, whole slide images; RNN, recurrent neural network; TCGA, The Cancer Genome Atlas; ResNet, residual network architecture; HP, hyperplastic polyp; VGG, visual geometry group; RF, random forest; PET-CT, positron emission tomography/computed tomography; LR, logistic regression; NN, neural network; XGBoost, extreme gradient boosting; CT, computed tomography; MRI, magnetic resonance imaging; Faster R-CNN, faster region-based CNN; CAD, computer aided diagnosis).
| Topic | Task | Dataset | Model | Performance | Year | Ref. |
|---|---|---|---|---|---|---|
| Pathological | Tumor mutational burden-high prediction | 278 HE slides | CNN | AUC: 0.934 | 2021 | [ |
| | Low/high-grade classification | Immunohistochemically stained biopsy of 67 patients | hDL-system (VGG16, SVM) | hDL-system accuracy: 99.1%; sML-system accuracy: 92.5% | 2021 | [ |
| | NL/AD/ADC classification | 4036 WSI | CNN, RNN | AUC: 0.96 for ADC; 0.99 for AD | 2020 | [ |
| | Tumor immune microenvironment analysis | 404 CRC and 20 adjacent non-tumorous tissues | CIBERSORT | C-index: stage I-II 0.69; stage III-IV 0.71; AUC: 0.67 | 2019 | [ |
| | NL/Tumor classification | 94 WSI, 370 TCGA-KR, 378 TCGA-DX | ResNet18 | AUC > 0.99 | 2019 | [ |
| | NL/HP/AD/ADC | 393 WSI (12,565 patches) | CNN | Accuracy: 80% | 2019 | [ |
| | NL/Tumor classification | 57 WSI | VGG | Accuracy: 93.5%, | 2018 | [ |
| | NL/AD/ADC classification | 27 WSI | VGG16 | Accuracy: 96%, Specificity: 92.8% | 2018 | [ |
| | NL/AD/ADC classification | 30 multispectral image patches | CNN | Accuracy: 99.2% | 2017 | [ |
| | Cancer subtypes classification | 717 patches | AlexNet | Accuracy: 97.5% | 2017 | [ |
| | Polyp subtypes classification | 2074 patches 936 WSI | ResNet | Accuracy: 93.0% | 2017 | [ |
| Radiological | Metastatic CRC prediction | MRI from 55 stage IV patients with known hepatic metastasis | RF | AUC: 0.94 (Add imaging-based heterogeneity features) | 2021 | [ |
| | Metastatic lymph node prediction | PET-CT scan images from 199 CRC patients | LR, SVM, RF, NN, and XGBoost | AUC of LR: 0.866; AUC of XGBoost: 0.903 | 2021 | [ |
| | Colorectal liver metastasis prediction | 103 metastasis samples and 80 non-cancer tissues | Probe electrospray ionization-mass spectrometry, and LR | Accuracy: 99.5%, | 2021 | [ |
| | Colorectal liver metastasis prediction | CT scan images from 91 patients | Bayesian-optimized RF with wrapper feature selection | AUC of radiomics features model: 86%; | 2021 | [ |
| | KRAS mutations detection | CT scan images from 47 patients | Haralick texture analysis, SVM, LightGBM, NN, and RF | Accuracy: 83%, kappa: 64.7% | 2020 | [ |
| | Classification of T2 and T3 | 290 MRI images from 133 patients | CNN | Accuracy: 0.94 | 2019 | [ |
| | Metastatic lymph node prediction | MRI images from 414 patients | Faster R-CNN | r-radiologist-Faster R-CNN 0.912 | 2019 | [ |
| | Polyp detection | 825 CT scan images | CNN | Accuracy: 0.87, | 2017 | [ |
| | Polyp detection | 154 CT scan images | CNN | Accuracy: 0.971 | 2017 | [ |
| | Polyp classification | 1035 endomicroscopy images | Mathworks “NAVICAD” system | Accuracy: 84.5% | 2016 | [ |
| | Polyp detection and classification | 148 CT scan images | Haralick texture analysis, SVM | ROC: 0.85 | 2014 | [ |
| | CAD system for polyp detection | 24 T1 stage patients’ CT scan images | Coloncad API 4.0, Medicsight plc | True positives rate >96.1% | 2008 | [ |
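
Many rows above report AUC, the area under the receiver operating characteristic curve. One way to read that number: it is the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes AUC directly from that pairwise definition (the Mann-Whitney formulation) in plain Python. The function name `roc_auc` and the example scores are assumptions made for illustration, not the evaluation code of any cited study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for positive cases (e.g. metastasis present), 0 for negatives.
    scores: model outputs, where a higher score means "more likely positive".
    Tied scores count as half a correctly ranked pair.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


if __name__ == "__main__":
    # Illustrative labels and scores only, not data from any cited study.
    y = [1, 1, 0, 1, 0, 0]
    s = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    print(f"AUC = {roc_auc(y, s):.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch; for large cohorts a sort-based rank computation or a library routine would be the usual choice.
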
Summary of AI applications for CRC treatment. (nCRT, neoadjuvant chemoradiotherapy; ANN, artificial neural network; AUC, area under curve; KNN, K-nearest neighbors; SVM, support vector machine; NBC, naïve Bayesian classifier; MLR, mixed logistic regression; LR, logistic regression; NN, neural network; BN, Bayesian network; RF, random forest; CPT-11, Irinotecan; IC50, half maximal inhibitory concentration).
| Topic | Task | Dataset | Model | Performance | Year | Ref. |
|---|---|---|---|---|---|---|
| nCRT | nCRT response prediction | Medical records from 282 patients (248 training and 34 validation) | ANN, KNN, SVM, NBC, MLR | ANN model outperformed others: | 2020 | [ |
| | nCRT response prediction | 6555 patients’ records from the SEER | LR | 3-year OS rate: 92.4% with pCR; 88.2% without pCR | 2019 | [ |
| | nCRT response prediction | 98 patients MRI (53 training set and 45 validation set) | SVM, NN, BN, KNN | Test: AUC: 97.8%, | 2019 | [ |
| | nCRT response prediction | 55 patients MRI | RF | Mean AUC: 0.83 | 2019 | [ |
| Chemotherapy | CPT-11 toxicity prediction | Demographic data, liver function blood tests, and tumor markers from 20 advanced CRC patients | SVM | Accuracy: 91% for diarrhea, | 2019 | [ |
| | Drug IC50 detection | 18,850 organic compounds | KNN, RF, SVM | Accuracy: over 63% | 2018 | [ |
Summary of AI models for CRC prognosis. (C-index, concordance index; LR, logistic regression; DT, decision tree; GB, gradient boosting; LightGBM, light gradient boosting machine; CNN, convolutional neural network; AUC, area under curve; PET-CT, positron emission tomography/computed tomography; HR, hazard ratio; GSEA, gene set enrichment analysis; PPI, protein-protein interaction; HE, hematoxylin and eosin; WSI, whole slide image; MLP, multilayer perceptron; AdaBoost, adaptive boosting; LSTM, long short-term memory; EHR, electronic health record; SVM, support vector machine; NB, naïve Bayesian; KNN, K-nearest neighbors; NN, neural network; RF, random forest).
| Topic | Task | Dataset | Model | Performance | Year | Ref. |
|---|---|---|---|---|---|---|
| Recurrence | Recurrence prediction of stage II CRC | Clinicopathological data of 350 patients after curative resection for stage II CRC | Nomogram | C-index: 0.585 in the validation set | 2020 | [ |
| | Recurrence prediction of Stage IV CRC after tumor resection | EHR data from 999 patients of stage IV CRC | LR, DT, GB and LightGBM | LightGBM: AUC: 0.761 | 2020 | [ |
| | Recurrence prediction of local tumor | PET-CT images from 84 patients | CNN, Proportional hazards model | C-index: 0.64 | 2019 | [ |
| | Risk prediction of recurrence of gastrointestinal stromal tumor | Clinical data of 2560 patients | Proportional hazards, Non-linear model | AUC: 0.88 | 2012 | [ |
| | Recurrence prediction after surgery | Clinicopathological data of 1320 nonmetastatic CRC patients | Nomogram, Cox regression | C-index: 0.77 | 2008 | [ |
| Survival | Genetic risk factors Identification | National Center for Biotechnology Information Gene Expression Omnibus | GSEA, PPI network, Cox Proportional Hazard regression | 4 sub-networks and 8 hub genes as potential therapeutic targets | 2021 | [ |
| | Prognostic prediction for stage III CRC | Clinicopathological data of 215 patients | CNN, GB | HR: 8.976 and 10.273 | 2020 | [ |
| | Outcome prediction | 12,000,000 HE images | CNN | HR: 3.84 and 3.04 with established prognostic markers | 2020 | [ |
| | Survival prediction | 7180 HE images of 25 patients | CNN | Nine-class accuracy: >94% | 2019 | [ |
| | Survival prediction | PET-CT images of 84 patients | CNN, proportional hazards model | C-index: 0.64 | 2019 | [ |
| | Outcome prediction, and remaining lifespan prediction | SEER | tree-based ensemble model | Accuracy: 0.7069, | 2019 | [ |
| | Outcome prediction | 75 WSIs from stage I and II CRC patients with surgical resection | CNN | F1: 0.67 | 2019 | [ |
| | Outcome prediction | EHR data of 58,152 patients | CNN | AUC: 0.922, Sensitivity: 0.837, specificity: 0.867, PPV: 0.532 | 2019 | [ |
| | Prediction of Stages and Survival Period | Clinicopathological data of 4021 patients | RF, SVM, LR, MLP, KNN, and AdaBoost | RF: F-measure: 0.89, Accuracy: 84%, AUC: 0.82 ± 0.10 | 2019 | [ |
| | 1/2/5 years Survival prediction | SEER data | DNN | AUC: 0.87 | 2019 | [ |
| | Outcome prediction | Digitized HE tumor tissue microarray samples of 420 patients | CNN, LSTM | LSTM: AUC: 0.69, histological grade AUC: 0.57, the visual risk score AUC: 0.58 | 2018 | [ |
| | 5-year survival prediction | EHR data of 1127 CRC patients | Ensemble (bagging and voting) classifier | Ensemble voting model AUC: 0.96 | 2017 | [ |
| | 5-year survival prediction | EHR data of 334,583 cases from Robert Koch Institute | SVM, LR, NB, DT, KNN, LR, NN, RF | Average accuracy of the clinicians: 59%, ML: 67.7% | 2015 | [ |
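
Several prognosis rows above report a C-index (concordance index), which extends the idea of AUC to censored survival data: among patient pairs whose outcome order is known, it measures how often the patient with the higher predicted risk experienced the event first. A minimal Python sketch of Harrell's C-index follows. The function name `concordance_index` and the toy follow-up data are assumptions for illustration only, and the sketch ignores tied event times, which a production implementation would need to handle.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored survival data.

    times:  follow-up time for each patient.
    events: 1 if the event (e.g. recurrence or death) was observed, 0 if censored.
    risks:  predicted risk scores, where higher means an earlier expected event.
    A pair (i, j) is comparable when patient i had an observed event strictly
    before patient j's follow-up time. Tied risk scores count as half.
    """
    if not (len(times) == len(events) == len(risks)):
        raise ValueError("times, events and risks must have the same length")

    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy follow-up data (months), not taken from any cited study.
    t = [2, 4, 6, 8]
    e = [1, 1, 0, 1]
    r = [0.9, 0.4, 0.5, 0.1]
    print(f"C-index = {concordance_index(t, e, r):.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a reported C-index of 0.585 indicates only slightly better than chance discrimination, whereas 0.77 indicates moderate discrimination.
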