| Literature DB >> 34602811 |
Yogesh Kumar, Surbhi Gupta, Ruchi Singla, Yu-Chen Hu.
Abstract
Artificial intelligence has aided the advancement of healthcare research. The availability of open-source healthcare statistics has prompted researchers to create applications that aid cancer detection and prognosis. In these circumstances, deep learning and machine learning models provide a reliable, rapid, and effective means of dealing with such challenging diseases. PRISMA guidelines were used to select articles published on Web of Science, EBSCO, and EMBASE between 2009 and 2021. In this study, we performed a systematic search and included research articles that employed AI-based learning approaches for cancer prediction. A total of 185 papers were considered impactful for cancer prediction using conventional machine learning and deep learning-based classification. In addition, the survey discussed the work done by different researchers, highlighted the limitations of the existing literature, and compared approaches using parameters such as prediction rate, accuracy, sensitivity, specificity, Dice score, detection rate, area under the curve, precision, recall, and F1-score. Five investigations were designed, and solutions to them were explored. Although many techniques recommended in the literature have achieved strong prediction results, cancer mortality has still not been reduced; thus, more extensive research is required to address the remaining challenges in cancer prediction. © CIMNE, Barcelona, Spain 2021.
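The comparison parameters listed in the abstract (accuracy, sensitivity, specificity, precision, recall, F1-score) all derive from the four confusion-matrix counts. The sketch below is not from the paper; it simply shows the standard formulas used across the surveyed studies.

```python
# Standard classification metrics from confusion-matrix counts:
# tp = true positives, fp = false positives, tn = true negatives, fn = false negatives.

def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute the evaluation parameters listed in the abstract."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # recall / true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": sensitivity,
        "f1": f1,
    }
```

For example, a classifier with 90 true positives, 10 false positives, 80 true negatives, and 20 false negatives has accuracy 0.85 and precision 0.90.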
Year: 2021 PMID: 34602811 PMCID: PMC8475374 DOI: 10.1007/s11831-021-09648-w
Source DB: PubMed Journal: Arch Comput Methods Eng ISSN: 1134-3060 Impact factor: 8.171
Fig. 1Estimated number of new cases and deaths in 2020 for common cancer types (www.cancer.net)
Fig. 2PRISMA flow chart
Fig. 3Causes of cancers [26]
Fig. 4Types of imaging for cancer test
Fig. 5Deep learning process for cancer diagnosis [1]
Fig. 6Working of auto-encoder method [126]
Fig. 7Transfer learning-based snapshot ensemble method [37]
Fig. 8Deep learning-based CNN model for segmentation of MRI imaging [1]
Fig. 9Evaluation parameters
Comparative analysis using AI techniques for different cancers
| Authors | Cancer types | Training data | Techniques | Challenges | Reported outcomes |
|---|---|---|---|---|---|
| Sudharani et al. [ | Brain | MRI images | Fuzzy C-Means | Small and unstructured data were not used in the system, restricting its generality and clinical applicability | Accuracy = 89.2% Sensitivity = 88.9% Specificity = 90% |
| Mohsen et al. [ | Brain | Brain MR images | DNN with PCA and DWT (discrete wavelet transform) | The present technique is complex as it requires a large number of processors to execute the data | Prediction rate = 96.7% Precision = 97% |
| Dong et al. [ | Brain | BRATS 2015 | Deep-CNN | The system can be improvised by adding multi-institutional and longitudinal datasets in the future | Complete tumor region = 88% |
| Sobhaninia et al. [ | Brain | Brain MR images | CNN | The technique can be extended by using instance segmentation for detecting the tumor in the image | Dice Score = 79% |
| Malathi et al. [ | Brain | BRATS 2015 | CNN with TensorFlow | New methodologies need to be used to segment the tumor images and perform the accurate delineation in radiotherapy | Dice coefficient = 0.73 Advancing tumor = 0.76 Sensitivity = 0.82 |
| Alam et al. [ | Brain | MRI images | Template-based K-means | The features used for enhancing the accuracy and detection can be improved in the future | Tumor detection = 97.43% |
| Devi et al. [ | Brain | MRI images | Radial basis function network (RBFN) | The technique cannot predict the progressive growth of tumor cells | Energy = 0.1743 Homogeneity = 0.9300 Contrast = 0.2450 |
| Kalaiselvi et al. [ | Brain | MRI brain images | Modified MET (minimum error thresholding technique) | The system can be improved by incorporating more datasets in the future | Predictive Accuracy (PA) = 97.6% Dice coefficient (DC) = 67.9% |
| Al-Ayyoub et al. [ | Brain | MRI images | Neural Network J48 Naïve Bayes Lazy-IBk | The current system failed to predict complex features which need to be solved in the future | Accuracy = 66.6% for NN, 59.2% J48, 59.2% for Naïve Bayes, 62.9% for Lazy-IBk |
| Kaur et al. [ | Breast | Mammogram breast images | Support vector machine Deep Neural Network K-mean clustering | The system can improve its accuracy by working on large-scale deep learning internal layers, which will help radiologists validate data in less time in the future | Accuracy = 92% Specificity = 90% Sensitivity = 93% F-score = 96% |
| Bidard et al. [ | Breast | Mammogram breast images | CTC cell search system | The system can enhance its work by developing and validating new bio clinical prognostic indices by pooling future trials | Sensitivity = 55% Specificity = 81% Accuracy = 77% |
| Patil et al. [ | Breast | Mammogram images | CRNN FC-CSO | The current system did not work with blurred images, which should be addressed by using a Wiener filter | Accuracy = 98.4% Specificity = 99.9% F1-score = 74.5% |
| Eleyan et al. [ | Breast | Wisconsin Breast Cancer Datasets | KNN | The present system failed to work with large datasets, which should be improved in the future | Accuracy = 97.51% |
| Nallamala et al. [ | Breast | Mammogram images | CNN Logistic Regression | The system can be improved by working on a larger number of datasets in the future | Precision = 98.5% |
| Assiri et al. [ | Breast | Wisconsin Breast Cancer Dataset | Multilayer Perceptron, Logistic Regression, Stochastic Gradient Descent | This technique failed to perform the segmentation accurately, which could be solved by applying semantic or instance segmentation | Accuracy = 99.42% Precision = 0.9940 |
| Saha et al. [ | Breast | DCE-MR images | Multivariate machine learning models | The system needed to work on its algorithms in image-controlled conditions with uniform scanning and contrast protocol | AUC = 0.771 |
| Abdallah et al. [ | Breast | Mammography images | Segmentation Techniques | The segmentation techniques should be escalated to improve its accuracy | Matching ratio = 96.3 ± 8.5 |
| Mejia et al. [ | Breast | Mammography images | KNN | The current system's classification accuracy should be enhanced | Accuracy = 94.44% |
| Qayyum et al. [ | Breast | Digital mammograms | SVM Gray level co-occurrence matrix (GLCM) Features | It has been challenging for the system to interpret the final model because of its high dimensionality matrix | Accuracy = 96.55% Sensitivity = 96.97% Specificity = 96.29% |
| Ragab et al. [ | Breast | Mammography images | SVM | The current system should improve its accuracy by working with a large number of datasets | Accuracy = 87.2% AUC = 94% |
| Win et al. [ | Cervical | Pap Smear images | Bagging Ensemble | The system produces false-negative results because it failed to detect specific abnormalities in Pap smear images | Accuracy = 98.27% |
| Wu et al. [ | Cervical | Pathological images | CNN | The accuracy of the system can be improved by incorporating more training datasets | Accuracy = 93.33% |
| Alyafeai et al. [ | Cervical | Cervigram images | CNN | The accuracy should be improved to increase the efficiency of the system | AUC score = 0.82 Accuracy = 0.68 |
| Gupta et al. [ | Cervical | Pap Smear images | ANN | The accuracy can be improved further | Accuracy = 78% |
| Kurnianingsih et al. [ | Cervical | Herlev Pap Smear dataset | R-CNN | The current technique required higher processing power which should be extended with the deeper network in order to improve the performance results | Accuracy = 95% Sensitivity = 96% |
| Rudra et al. [ | Cervical | Pap Smear images | K-nearest Neighbor | The present system failed to classify and detect the abnormalities in the image | Accuracy = 98.31% |
| Sajenna et al. [ | Cervical | Pap Smear images | SVM | The present system's classification technique did not include high-dimensional data that should be improved in the future to increase its accuracy | Accuracy = 93.78% Sensitivity = 98.96% Specificity = 96.69% |
| Hoerter et al. [ | Colorectal | ImageNet database | CNN | The current system is restricted to detecting polyps smaller than 10 mm | per-polyp sensitivity = 71% |
| Shin et al. [ | Colorectal | Polyp images and videos | Deep-CNN | The current system showed much detection processing time, which should be improved in the future | Detection processing time = 0.39 s |
| Figueiredo et al. [ | Colorectal | PillCam COLON2 capsule-based images and videos | Image processing approach | The present system worked with a limited number of videos and frames, which should be increased for better prediction outcomes | P-value higher than 500 |
| Godkhindi et al. [ | Colorectal | CT images | CNN | The polyp detection accuracy needs to be improved for the better working of the system | polyp detection accuracy = 88% |
| Zhang et al. [ | Colorectal | Endoscopic images | CNN | The current system required manual selection of the RoI of each polyp, which should be automated in the future to improve accuracy | Accuracy = 85.9% Precision = 87.3% Recall = 87.6% |
| Yamada et al. [ | Colorectal | polyp images and videos | Deep learning | The present system lacked robustness, limiting the utility of a computer-aided diagnosis system | Specificity = 97.3% |
| Santini et al. [ | Kidney | KiTS19 | CNN | New training strategies will be designed to differentiate between the data, and a different stage will be added for more detailed local features for escalating the current system's efficiency | Mean Dice score = 0.96 |
| Tabibu et al. [ | Kidney | Renal Cell Carcinoma | CNN | The current system had data imbalance issues that should be addressed in the future | Accuracy = 92.61% |
| Ali et al. [ | Kidney | miRNA Dataset | LSTM | Further clinical studies must validate the effectiveness of the selected miRNAs by the current system | Accuracy = 95% |
| Han et al. [ | Kidney | Renal Cell Carcinoma | DNN | The accuracy, sensitivity, and specificity of the system should be improved further | Accuracy = 85% Sensitivity = 64% to 98% Specificity = 83% to 93% |
| Skalski [ | Kidney | CT images | Vascular Tree (RUSBoost and Decision Trees) | Needed improvement regarding feature selection and segmentation of the image | Accuracy = 92.1% |
| Chlebus et al. [ | Liver | CT images | Deep-CNN | The present system requires more work to be done to match the performance of human expertise | Detection rate = 77% |
| Le et al. [ | Brain | BraTS 2018 | CNN Random Forest Regression | The current system should add more datasets to increase the prediction rate in the future | Predict the survival rate |
| Wang et al. [ | Liver | CT images | RT-PCR (Polymerase chain reaction) | The sensitivity and specificity can be improved further | Area under curve = 80.3% Sensitivity = 75% Specificity = 75% |
| Das et al. [ | Liver | CT images | DNN | The current system failed to calculate the lesion's volumetric size, which hampered its efficiency | Accuracy = 99.38% |
| Raj et al. [ | Liver | CT images | SVM | The model had restricted access to large datasets, which hindered the system's efficiency | Accuracy more than 80% |
| Rajkumar et al. [ | Liver | CT images | SVM | Future work will strive to improve the system's accuracy, precision, computational speed, and automation, and to reduce manual interaction | Accuracy = 98% |
| Bach et al. [ | Liver | CT images | LDCT | The system showed uncertainty about the potential harms of screening and the generalizability of the results | Accuracy = 80% |
| Kang et al. [ | Liver | CT images | Neural Network Fuzzy Neural Network | The current system lacked sufficient accuracy for the clinical application that should be further improved | Accuracy = 79.19% |
| Gruber et al. [ | Liver | CT Liver images | DNN | The current system lacked an accurate minimization strategy for the joint loss function and an improved deep learning classification algorithm, which future work should develop | Accuracy = 99.9% |
| Shakeel et al. [ | Lung | CT images | Improved-DNN | The present system can be improved by adding more datasets to it | Accuracy = 96.2% Specificity = 98.4% Precision = 97.4% |
| Asuntha and Srinivasan [ | Lung | CT images | CNN Fuzzy Particle Swarm Optimization (FPSO) | The current framework failed to classify the disease as benign or malignant, which should be improved in the future | Accuracy = 94.97% Sensitivity = 96.68% Specificity = 95.89% |
| Riquelme et al. (2020) | Lung | CT images | DBN | The present work can incorporate the improved version of convolutional architectures to enhance the efficiency of lung cancer detection | Sensitivity = 0.734 Specificity = 0.822 |
| Ausawalaithong et al. [ | Lung | Chest X-ray dataset | CNN | The current system required more features to enhance its accuracy, specificity, and sensitivity | Accuracy = 84.02% Specificity = 85.34% Sensitivity = 82.71% |
| Nasrullah et al. [ | Lung | LIDC-IDRI datasets | CNN, MixNet | The present system should incorporate shifted additions in the future to reduce the redundancy of data | Sensitivity = 94% Specificity = 91% |
| Senthil et al. [ | Lung | CT scan images | Guaranteed convergence particle swarm optimization (GCPSO) | The framework needs additional optimization algorithms to improve its precision | Accuracy = 95.89% |
| Bur et al. [ | Oral | NCDB dataset | Tumor depth of invasion (DOI) Model | The current system needs improved predictive algorithms to enhance accuracy in detecting oral cancer in patients | Sensitivity = 86.6% |
| Lavanya and Chandra [ | Oral | Oral Leukoplakia dataset | Decision Tree | The accuracy needs to be improved | Accuracy = 83.703% |
| Liu et al. [ | Prostate | MRI images | CNN | The current system worked on a limited dataset that should be increased to improve its efficiency | AUC = 0.84 |
| Yoo et al. [ | Prostate | MRI images | CNN | The current system should be extended by 3DCNN’s and recurrent neural networks for improving the work in the future | AUC = 0.87 Confidence level = 95% |
| Zhang et al. [ | Skin | DermIS Digital, Dermaquest Database | CNN/WOA Method | The optimization technique of the current system needs to be improvised in the future for better exploration ability | Sensitivity = 95% Specificity = 92% PPV = 84% NPV = 95% Accuracy = 91% |
| Mane et al. [ | Skin | Dermoscopy images | SVM linear kernel | The present work is invasive, painful, and time-consuming, which needs to be improved in the future | Sensitivity = 90% Specificity = 90.90% Accuracy = 90.47% |
| Hasan et al. [ | Skin | Dermoscopy images | CNN | The technique shrank the size of the image, which led to the loss of information | Accuracy = 89.5% |
| Marka et al. [ | Skin | Dermoscopy images | Machine Learning, computer-aided design | The present system can be extended by testing the viability of the models in a clinical setting | AUC = 0.832 |
| Hasan et al.[ | Skin | PH2 dataset | ANN | The current system can be improved by using large datasets in the future | Accuracy = 95% |
| Khan et al. [ | Skin | DERMIS dataset | SVM | The proposed system failed to classify the data accurately, which should be improved in the future | Accuracy = 96% Sensitivity = 97% |
| Radu et al. [ | Skin | Clinical images | CNN | Though the system has maximized classification accuracy, its computation is too high because of its complex nature | Accuracy = 81% Sensitivity = 72% Specificity = 89% |
| Udrea et al. [ | Skin | Dermoscopy images | ANN, Generative Adversarial Neural Network | The system can be improved by enlarging the training and testing data based on skin lesions images | Accuracy = 92% |
| Kloeckner et al. [ | Stomach | Gastric cancer Images | CNN | The system's limitation lies in the selection and classification of gastric images | ROC curves above 0.9 |
| Khryashchev et al. [ | Stomach | Endoscopic images | CNN | The system can be improved by adding many endoscopic image datasets to increase generalizing ability | mAP metric = 0.875 |
| Shibata et al. [ | Stomach | Endoscopic images | RNN | The present work should incorporate additional image data, such as screening endoscopic images and videos, in the future | Dice index = 71% Sensitivity = 96% |
| Hirasawa et al. [ | Stomach | Endoscopic images | CNN | Worked on less training and high-quality data | Sensitivity = 92.2% |
| Leon et al. [ | Stomach | Histopathological Samples | Deep-CNN | The current system needs more samples for the better classification of data | Detection accuracy = 89.72% |
| Sakai et al. [ | Stomach | Endoscopic images | CNN | The detection accuracy can be improved further | Detection accuracy = 82.8% |
| Thapa et al. [ | Stomach | Gastroscopy samples | Random Forest | The presented work used a minimal sample size, which affected the validity of the model | Sensitivity = 86% Specificity = 79% |
| Dov et al. [ | Thyroid | Cytopathology images | multiple-instance learning (MIL) | Due to limited memory, the present system is not able to access the large-sized database | AUC Score = 0.87 Average precision = 0.743 |
| Ma et al. [ | Thyroid | SPECT images | CNN | The system can be improved by adding more SPECT images of the Thyroid in the future | Accuracy = 99.08% Precision = 98.82% Specificity = 99.61% |
| Poudel et al. [ | Thyroid | 3D thyroid images | CNN | The current system only worked with a limited dataset which should be improved in the future by adding more training data | Dice coefficient = 0.876 |
| Sokoutil et al. [ | Thyroid | MRI images | Hill climbing | The present system was largely manual; it can be made more automatic by incorporating algorithms | Accuracy = 98.96% |
| Guan et al. [ | Thyroid | Ultrasound images | CNN, Inception v3 | The present work can be improved and can be extended to use Doppler images in the future work | Sensitivity = 93.3% Specificity = 87.4% Confidence interval = 95% |
| Hu et al. [ | Breast | Breast magnetic resonance imaging | Dichotomous Technique | The reported work can be enhanced further by adding more MRI images to screen the breast cancer for avoiding any future complications | Confidence Interval (CI) = 95% |
| Song et al. [ | Neuroendocrine tumors | Contrast enhanced (CE)-MRI | Logistic regression analysis | The diagnostic accuracy can be improved further using clinical decision support | AUC = 0.900 Validation cohort = 0.978 Confidence Interval = 95% |
| Chillakuru, et al. [ | Lung | Chest CT | Neural network | The performance can be improved on ground glass for lung cancer detection | Precision = 0.962 Recall = 0.573 |
| Iuga et al. [ | Tumors | lymph nodes (LNs) in computed tomography (CT) | CNN | Quantitative features of LNs can be improved to accelerate diagnosis | Detection rate = 76.9% Detection rate = 91.6% |
| Weng et al. [ | Lung | Magnetic resonance imaging | CNN | Deep learning-based lung segmentation from MRI images is a time-consuming process; further enhancement can reduce the segmentation time | Mean difference in Lung = 0.032 ± 0.048 L |
| Gupta et al. [ | Cervical | Pap Smear images | Stacking Model | The oversampling technique used in the study may lead to over-fitting | AUC = 99.7% |
| Gupta et al. [ | Breast | Wisconsin Breast Cancer Dataset | Neural Ensemble stacking | Neural Ensemble stacking performed the best prediction | Accuracy = 99.8% |
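Several segmentation studies in the table (e.g. Dong et al., Sobhaninia et al., Santini et al.) report a Dice score or Dice coefficient. As a hedged sketch, not drawn from any of the cited implementations, this is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between predicted and ground-truth masks:

```python
# Dice similarity coefficient between two segmentation masks, each represented
# here as a set of pixel indices marked as tumor.

def dice_score(pred: set, truth: set) -> float:
    """Return 2|A ∩ B| / (|A| + |B|); 1.0 means perfect overlap."""
    if not pred and not truth:
        return 1.0  # both masks empty: perfect agreement by convention
    overlap = len(pred & truth)
    return 2 * overlap / (len(pred) + len(truth))
```

For instance, masks {1, 2, 3, 4} and {3, 4, 5, 6} share two pixels, giving a Dice score of 2·2 / (4 + 4) = 0.5.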
Fig. 10AI-Based Prediction Models
Fig. 11Cancer site-wise distribution of papers
Fig. 12Distribution of papers based on the type of training data
Fig. 13Year-wise distribution of papers