Wai Tong Ng, Barton But, Horace C W Choi, Remco de Bree, Anne W M Lee, Victor H F Lee, Fernando López, Antti A Mäkitie, Juan P Rodrigo, Nabil F Saba, Raymond K Y Tsang, Alfio Ferlito.
Abstract
INTRODUCTION: Nasopharyngeal carcinoma (NPC) is endemic to Eastern and South-Eastern Asia, and, in 2020, 77% of global cases were diagnosed in these regions. Apart from its distinct epidemiology, the natural behavior, treatment, and prognosis are different from other head and neck cancers. With the growing trend of artificial intelligence (AI), especially deep learning (DL), in head and neck cancer care, we sought to explore the unique clinical application and implementation direction of AI in the management of NPC.
Keywords: auto contouring; deep learning; diagnosis; machine learning; neural network; prognosis
Year: 2022; PMID: 35115832; PMCID: PMC8801370; DOI: 10.2147/CMAR.S341583
Source DB: PubMed; Journal: Cancer Manag Res; ISSN: 1179-1322; Impact factor: 3.989
Data Extraction Table
| Authors, Year and Country | Site, No. of Cases (Data Type) | AI Subfield (Application) | Artificial Intelligence Methods and Their Application | Study Aim | Performance Metric(s) | Results | Conclusion | Limitations |
|---|---|---|---|---|---|---|---|---|
| Wang et al (2010) | NPC and other types | Machine learning (Diagnosis) | 1. Classification: Non-linear regulatory network (Hopfield-like network) | To find relationships between protein biomarkers and classify different disease groups | 1. Classification performance | The developed regulatory network out-performed Fisher linear discriminant, KNN, linear SVM and radial basis function SVM in classification performance. | The proposed technique has promise in assisting disease diagnosis by finding protein regulation relationships. | N/A |
| Aussem et al (2012) | NPC | Machine learning (Miscellaneous applications - Risk factor identification) | 1. Feature selection: Markov boundary discovery algorithm | To extract relevant dietary, social and environmental risk factors related to increasing risk of NPC | 1. Identification of potential risk factors | The proposed model had a better performance in recognizing risk factors associated with NPC than other algorithms. | The proposed techniques can integrate experts’ knowledge and information extracted from data to analyze epidemiologic data | N/A |
| Kumdee, Bhongmakapat, and Ritthipravat (2012) | NPC | Machine learning (Prognosis) | 1. Prediction: Generalized neural network-type SIRM (G-NN-SIRM) | To predict NPC recurrence | 1. Classification performance | G-NN-SIRM had a significantly higher performance than the other techniques in NPC recurrence prediction | The G-NN-SIRM can be applied to NPC recurrence prediction | N/A |
| Ritthipravat, Kumdee, and Bhongmakapat (2013) | NPC | Machine learning (Prognosis) | 1. Missing data technique to complete data for model training: | To predict NPC recurrence | 1. Predictive performance | The models using the expectation-maximization imputation technique, particularly with the sequential neural network, performed closest to the Kaplan-Meier model. | Missing data techniques cross-combined with ANNs were investigated for predicting NPC recurrence. | N/A |
| Zhu and Kan (2014) | NPC | Machine learning (Prognosis) | 1. Data transformation, data integration, or prediction output: | To assess cancer prognosis | 1. Risk prediction performance | The neural network cascade out-performed both the transformed and untransformed neural network models. | The study proposed a potential method for constructing a microRNA biomarker selection and prediction model | N/A |
| Jiang et al (2016) | NPC | Machine learning (Prognosis) | 1. Feature Selection: | To predict the survival of NPC patients with synchronous metastases | 1. Prognostic performance | The ML model had a better prognostic performance than classifiers using clinical indexes alone or with haematological markers | The model has the potential to help clinicians choose the most appropriate treatment strategy for metastatic NPC patients | 1. Not all clinical indexes and haematological markers were included in the study. |
| Liu et al (2016) | NPC | Machine learning (Prognosis) | 1. Classification: | To predict NPC response to chemoradiotherapy | 1. Classification performance | The model using parameters extracted from T1 sequence had a better classification performance than the other two models | Integrating texture parameters to ML algorithms can act as imaging biomarkers for NPC tumor response to chemoradiotherapy | 1. Relatively small sample size |
| Wang et al (2016) | NPC | Machine learning (Diagnosis) | 1. Classification: | To assess the diagnostic accuracy of adding additional nodal parameters and PET/CT | 1. Diagnostic efficacy | ANN demonstrated that combining three (and four) of the proposed parameters yielded good results. | Additional parameters in MRI and PET/CT were found which can improve prediction accuracy. | 1. Possible diagnostic errors from not using histopathology. |
| Men et al (2017) | NPC | Deep learning (Auto-contouring) | 1. Segmentation: | To assess the segmentation ability of the developed model | 1. Segmentation performance | The deep deconvolutional neural network had a much better performance when compared with VGG-16 model. | The proposed model has promise in enhancing the consistency of delineation and streamlining the radiotherapy workflows. | 1. The model may be hard to converge due to having N0 and N+ patients in both the training and testing set. |
| Mohammed et al (2017) | NPC | Machine learning (Auto-contouring/ Diagnosis) | 1. Segmentation: | To evaluate the segmentation and identification performance of the developed models | 1. Classification performance | The proposed method had an improved classification and segmentation performance over SVM. | Texture features can assist in differentiating benign and malignant tumors. Thus, the fully automated proposed model can help with doctors’ diagnosis and support them. | 1. Disproportionate distribution between benign and malignant tumors in the dataset. |
| Zhang et al (2017) | NPC | Machine learning (Prognosis) | 1. Classification: | To predict local and distant failure of advanced NPC patients prior to treatment | 1. Prognostic performance | Using RF for both feature selection and classification had the best prognostic performance | The optimal ML methods for local and distant failure prediction in advanced NPC can improve precision oncology and clinical practice | N/A |
| Zhang et al (2017) | NPC | Machine learning (Prognosis) | 1. Analysis | Individualized progression-free survival evaluation of advanced NPC patients prior to treatment | 1. Prognostic performance | Integrating the radiomics signature with other factors, such as the TNM staging system or clinical data, within a nomogram improved its performance. | The use of quantitative radiomics models can be useful in precision medicine and assist with the treatment strategies for NPC patients | 1. The analysis did not consider two-way or higher order interactions between features. |
| Li et al (2018) | NPC | Deep learning (Auto-contouring/ Diagnosis) | 1. Detection: Fully CNN | To evaluate the performance of the developed model to segment and detect nasopharyngeal malignancies in endoscopic images | 1. Detection performance | The developed model had a better performance than oncologists in nasopharyngeal mass differentiation. | The proposed method has potential in guided biopsy for nasopharyngeal malignancies. | 1. Limited diversity due to all samples being acquired from the same institution, which leads to over-fitting. |
| Mohammed et al (2018) | NPC | Artificial intelligence and machine learning (Diagnosis) | 1. Feature selection: | To evaluate the proposed method in detecting NPC from endoscopic images | 1. Classification performance | The detection performance of the trained ANN was close to that done manually by ear, nose and throat specialists. | The study demonstrated the feasibility of using ANNs for NPC identification in endoscopic images. | N/A |
| Mohammed et al (2018) | NPC | Artificial intelligence and machine learning (Auto-contouring/ Diagnosis) | 1. Feature selection: | To evaluate the proposed model in diagnosing NPC from endoscopic images | 1. Segmentation performance | The developed model yielded similar results to that of ENT specialists in segmentation performance. The classification performance achieved high results but the training dataset had a better performance. | The study demonstrated the effectiveness and accuracy of the proposed method. | N/A |
| Du et al (2019) | NPC | Machine learning (Prognosis) | 1. Feature selection: | To predict early progression of nonmetastatic NPC | 1. Model performance | The proposed model trained with five clinical features and radiomic features had the best performance over the other models. Tumor shape sphericity, first-order mean absolute deviation, T stage, and overall stage were important factors affecting 3-year disease progression. | Radiomics can be used for tumor diagnosis and risk assessment. Shapley additive explanations helped to find relationships between features in the model. | 1. The association between Epstein-Barr virus and progression-free survival was not explored. |
| Jiao et al (2019) | NPC | Machine learning (Miscellaneous applications - radiotherapy planning) | 1. Prediction: | To predict dose-volume histograms of OARs from IMRT plan | 1. DVH prediction accuracy (OARs) | The addition of dosimetric information improved the DVH prediction of the developed model. | The study showed the prediction capability of the model when patient dosimetric information was added to geometric information. | N/A |
| Jing et al (2019) | NPC and other types | Deep learning (Prognosis) | 1. Prediction: | Compare RankDeepSurv with other survival models and clinical experts in the analysis and prognosis of four public medical datasets and NPC | 1. Predictive performance of survival analysis | The RankDeepSurv had a better performance than the other three referenced methods in four public medical clinical datasets and in the NPC dataset versus clinical experts | The proposed model can assist clinicians in providing more accurate predictions for NPC recurrence | N/A |
| Li et al (2019) | NPC | Deep learning (Auto-contouring) | 1. Segmentation: | To assess the developed model’s accuracy in delineating CT images | 1. Model performance | The modified U-net had a higher consistency and performance than manual contouring, while using less time per patient. | The developed model has the potential to help lighten clinicians’ workload and improve NPC treatment outcomes. | N/A |
| Liang et al (2019) | NPC | Deep learning (Auto-contouring/Diagnosis) | 1. Recognition and classification: | To assess the performance of the model in detecting and segmenting OARs | 1. Detection performance | The ODS net had good results in both detection and segmentation performance. | The fully automatic model may help to facilitate therapy planning. | 1. The manual segmentation of images by a radiologist may not be consistent and may not be the true standard of reference. |
| Lin et al (2019) | NPC | Deep learning (Auto-contouring) | 1. Feature extraction: | To evaluate the developed model in auto-contouring of primary gross target volume | 1. Model performance | The model yielded good accuracy. It also helped improve the contouring accuracy and time of practitioners in the study. | The model has the potential to help tumor control and patient survival by enhancing the delineation accuracy and lowering both the contouring variation between practitioners and the time required. | 1. Low statistical power due to small number of events. |
| Liu et al (2019) | NPC | Deep learning (Miscellaneous applications - radiotherapy planning) | 1. Prediction: | To predict the three-dimensional dose distribution of helical tomotherapy | 1. Predictive performance | The U-ResNet-D model yielded good results in predicting 3D dose distribution. | The developed method has the potential to increase the quality and consistency of treatment plans. | 1. Can only predict one type of dose distribution. |
| Ma et al (2019) | NPC | Deep learning (Auto-contouring) | 1. Segmentation: | To use MRI and CT in NPC segmentation with the proposed models | 1. Segmentation performance | The proposed model out-performed the same model but without multi-modal information fusion and other existing CNN-based multi-modality segmentation models. | The proposed model was the first CNN-based method to solve the challenge of performing multi-modality tumor contouring on NPCs. | N/A |
| Peng et al (2019) | NPC | Deep learning (Prognosis) | 1. Feature extraction: | To assess risk and guide induction chemotherapy for patients | 1. Prognostic performance | The DL-based radiomics nomogram out-performed the EBV DNA-based model in risk stratification and induction chemotherapy guiding | The DL-based radiomics nomogram can be used for individualized treatment strategies. | 1. The follow-up period was too short |
| Rehioui et al (2019) | NPC | Machine learning (Miscellaneous applications - Risk factor identification) | 1. Prediction: | Compare the prognosis performance of different algorithms | 1. Model performance | The density-based algorithms (DENCLUE and its variants) obtained a better result than partitioning or statistical models | Familial history of cancer, living conditions and tobacco consumption are all associated with advanced stage of NPC. | N/A |
| Zhong et al (2019) | NPC | Deep learning (Auto-contouring) | 1. Segmentation: | To assess the proposed model in delineating OARs | 1. Segmentation performance | The proposed cascaded method gave a significantly better performance than other existing single network architecture or segmentation algorithms. | The study showed the effectiveness of the developed model when auto-contouring OARs and the benefits of using the cascaded DL structure. | N/A |
| Zou et al (2019) | NPC | Deep learning (Miscellaneous applications - Image registration) | 1. Image registration: | To develop a model for image registration | 1. Image registration | The proposed method with the additional use of transfer learning and fine-tuning out-performed both the proposed method without them and scale invariant feature transform (SIFT). | The use of transfer learning and fine-tuning in the proposed model is promising in improving image registration. | N/A |
| Abd Ghani et al (2020) | NPC | Machine learning (Diagnosis) | 1. Classification: | To develop a model with endoscopic images for NPC identification | 1. Classification performance | The majority rule for decision-based fusion technique had a significantly lower performance than using a single best performing feature scheme for the SVM classifier, which uses pair-wise fusion of only two features. | A fully automated NPC detection model with good accuracy was developed. Although the proposed method had a high accuracy, the single best performing feature scheme for the SVM classifier outperforms it. | N/A |
| Bai et al (2020) | NPC | Machine learning (Miscellaneous applications - radiotherapy planning) | 1. Prediction: | To explore viability of a model for knowledge-based automated intensity-modulated radiation therapy planning | 1. Plan quality | The proposed model had a similar performance but with a higher efficiency in treatment planning when compared with manual planning. | The proposed technique can significantly reduce the treatment planning time while maintaining the same plan quality. | N/A |
| Chen et al (2020) | NPC | Deep learning (Auto-contouring) | 1. Segmentation: | To evaluate the segmentation performance of a model which uses T1, T2 and contrast-enhanced T1 MRI | 1. Segmentation performance | In comparison with other existing DL algorithms, the proposed model had the best segmentation performance. | Multi-modality MRI is useful to the proposed model for NPC delineation. | N/A |
| Chen et al (2020) | NPC | Machine learning (Miscellaneous applications - radiotherapy planning) | 1. Prediction: | To develop models for quality control of radiotherapy planning | 1. Plan quality | The developed ANN model had a lower capability than the junior physicist in designing radiotherapy plans. | The proposed model enhances the quality and stability of individualized radiotherapy planning. | N/A |
| Chuang et al (2020) | NPC | Deep learning (Diagnosis) | 1. Classification: | To assess proposed model in detecting NPC in biopsies | 1. Diagnostic performance | The slide-level model had a better performance than pathology residents. However, its diagnostic ability is slightly worse than both attending pathologists and the chief resident. | The study demonstrated for the first time that DL algorithms can identify NPC in biopsies. | N/A |
| Cui et al (2020) | NPC | Machine learning and deep learning (Prognosis) | 1. Feature selection: Generalized linear model (ridge/lasso), XRT, Gradient boosting machine, RF & DL (Unknown) | To assign prediction scores to NPC patients and compare with the current clinical staging system | 1. Prognostic performance | The new scoring system had a better prognostic performance than the TNM/AJCC system in predicting treatment outcome for NPC | The new scoring system has the potential to improve image data-based clinical predictions and precision oncology | 1. The time to event was not considered. |
| Diao et al (2020) | NPC | Deep learning (Diagnosis) | 1. Classification: | To assess the pathologic diagnosis of NPC with the proposed model | 1. Diagnostic performance | Inception-v3 performed better than the junior and intermediate pathologists, but was worse than the senior pathologist in accuracy, specificity, sensitivity, AUC and consistency. | The proposed model can be used to support pathologists in clinical diagnosis by acting as a diagnostic reference. | 1. Improvement in the model’s design is required. |
| Du et al (2020) | NPC | Machine learning (Diagnosis) | 1. Classifications: | To evaluate and compare different machine learning methods in differentiating local recurrence and inflammation | 1. Diagnostic performance | The combination of fisher score with KNN, FSCR with support vector machines with RBF-SVM, fisher score with RF, and minimum redundancy maximum relevance with RBF-SVM had significantly better performance in accuracy, sensitivity, specificity and reliability than other combination of techniques. | Several methods to integrate ML algorithms with radiomics have the potential to improve NPC diagnostics. | 1. Limited by the retrospective nature and small sample size from one source. |
| Guo et al (2020) | NPC | Deep learning (Auto-contouring) | 1. Segmentation: | To evaluate the segmentation performance of a model | 1. Segmentation performance | The developed model out-performed the other DL models. Furthermore, the Jaccard loss function improved the segmentation performance of all models substantially. | The Jaccard loss function solved the issue of extreme foreground and background imbalance in image segmentation. However, further validation is required. | N/A |
| Jing et al (2020) | NPC | Deep learning (Prognosis) | 1. Prediction: | To predict and categorize the risk scores of NPC patients | 1. Model performance | The end-to-end MDSN had a better performance than the other four survival methods. | MDSN has the potential to support clinicians in making treatment decisions. | N/A |
| Ke et al (2020) | NPC | Deep learning (Auto-contouring/ Diagnosis) | 1. Classification and segmentation: | To assess the detection and segmentation ability of the developed model | 1. Diagnostic performance | The model had encouraging segmentation ability and the diagnostic performance of the proposed model obtained a better result than that of experienced radiologists. | The developed model may be able to improve the diagnostic efficiency and assist in clinical practice. | 1. No external validation. |
| Liu et al (2020) | NPC | Deep learning (Prognosis) | 1. Risk score calculation: | To assess the survival risk of NPC patients in order to make treatment decisions | 1. Survival risk assessment | The DeepSurv analysis of pathological microscopic features was a stronger independent prognostic risk factor than EBV DNA copies and N stage | Using DeepSurv to analyze pathological microscopic features can be a reliable tool for assessing survival risk in NPC patients. | 1. Decreased generalizability when applied to other centers or populations. |
| Men et al (2020) | NPC | Deep learning (Auto-contouring) | 1. Segmentation: | To assess the proposed method to improve segmentation constantly with less labelling effort | 1. Classification Performance | The proposed method could improve segmentation performance, while reducing the amount of labelling required. | The developed model decreased the amount of labelling and boosted segmentation performance by constantly obtaining, fine-tuning and transferring knowledge over long periods of time. | 1. The effect of the number of locked layers was not investigated. |
| Mohammed et al (2020) | NPC | Machine learning (Diagnosis) | 1. Classification: | To detect NPC from endoscopic images | 1. Classification performance | The developed models yielded good results and ANN,50–50-A, had the best performance. | The study was the first to consolidate diverse features into one fully automated classifier. | 1. Insufficient sample size and limited changeability. |
| Wang et al (2020) | NPC | Machine learning (Radiation-induced injury diagnosis) | 1. Prediction: | To assess the feasibility in developing a model for predicting radiation-related fibrosis | 1. Predictive performance | The proposed model trained with CT images had a better diagnostic accuracy than when using MRI features. | The proposed technique can be used to perform patient specific treatments by adjusting the administered dose on the neck, which can minimize the side effects. | 1. There is subject bias in fibrosis grading. |
| Wang et al (2020) | NPC | Deep learning (Auto-contouring) | 1. Feature extraction: | To develop a model for automatic delineation of NPC in computed tomography | 1. Delineation accuracy | The proposed model out-performed the other methods in the experiment. In addition, using CT combined with contrast-enhanced-CT instead of CT alone improves the performance of all models. | The study showed that the proposed fully automated model has promise in helping clinicians in 3D delineation of tumour during radiotherapy planning by minimizing delineation variability. | 1. The patient samples were all from one medical center. |
| Xie et al (2020) | NPC | Machine learning (Prognosis) | 1. Prediction: | To investigate the effect of re-sampling technique and machine learning classifiers on radiomics-based model | 1. Predictive performance | The combination of adaptive synthetic re-sampling technique and SVM classifier gave the best performance | Re-sampling technique significantly improved the prediction performance of imbalanced datasets | 1. The relatively small number of instances and features in the retrospective dataset may reduce the generalizability to other kinds of cancer. |
| Xue et al (2020) | NPC | Deep learning (Auto-contouring) | 1. Segmentation: | To evaluate the performance of the proposed model in segmenting high risk tumors | 1. Segmentation accuracy | The developed model had a better performance when compared with the U-net model. Its results were closer to manual contouring. | The developed model has promise in increasing the effectiveness and consistency of primary tumour gross target volume delineation for NPC patients. | 1. Insufficient training data. |
| Xue et al (2020) | NPC | Deep learning (Auto-contouring) | 1. Segmentation: | To assess the model’s ability to segment high-risk tumors | 1. Segmentation performance | The SI-Net model had a better segmentation performance than the U-net model. The mean contouring time of the model is also less than when performed manually. | The proposed model has the potential to help with treatment planning by improving the efficiency and consistency of CTVp1 segmentation. | 1. Insufficient training data. |
| Yang et al (2020) | NPC | Deep learning (Prognosis) | 1. Prediction: | To evaluate an automatic T staging system that requires no additional annotation | 1. Prognostic performance | The proposed model had a similar performance to the TNM staging system | The model had a good prognostic performance in fully automated T staging of NPC. | 1. Some imaging information may be ignored as contrast-enhanced-T1 weighted images in the coronal plane and T1 weighted images in the sagittal plane were not included in the model construction. |
| Yang et al (2020) | NPC | Deep learning (Auto-contouring) | 1. Feature extraction: | To investigate the segmentation accuracy of OARs | 1. Segmentation performance | There was no statistical significance between the results obtained from the proposed model and manual contouring of the OARs except for the optic nerves and chiasm. | The developed model can be used for auto-contouring of OARs. | N/A |
| Zhang et al (2020) | NPC | Machine learning (Radiation-induced injury diagnosis) | 1. Radiomic analysis: | To develop a model for early detection of radiation-induced temporal lobe injury | 1. Predictive performance | The use of texture features in feature selection improved the performance of the prediction model. | The developed models have the potential to support in providing early detection and taking preventive measures against radiation-induced temporal lobe injury. | 1. Insufficient sample size. |
| Zhang et al (2020) | NPC | Deep learning (Prognosis) | 1. Prediction: | To explore the use of magnetic resonance imaging and microscopic whole-slide images to improve the prognostic performance of the model | 1. Prognostic performance | The established nomogram had a much higher performance compared to the clinical model. | The developed multi-scale nomogram has the potential to be a non-invasive, cost-effective tool for assisting in individualized treatment and decision making on NPC. | 1. The study was retrospective and the sample size was relatively small |
| Zhao et al (2020) | NPC | Machine learning (Prognosis) | 1. Prediction: | To investigate an MRI-based radiomics nomogram in predicting induction chemotherapy response and survival | 1. Prediction performance | The proposed nomogram had a better performance than the clinical nomogram. | The constructed nomogram could be used for personalized risk stratification and for treating NPC patients that received induction chemotherapy. | 1. Small sample size due to the strict inclusion criteria. |
| Zhong et al (2020) | NPC | Deep learning (Prognosis) | 1. Prediction: | To predict the survival of stage T3N1M0 NPC patients treated with induction chemotherapy and concurrent chemoradiotherapy | 1. Model performance | The DL-based radiomics model had a higher predictive performance than the clinical model. | It has the potential to be a useful non-invasive tool for risk stratification and prognostic prediction | 1. Only the basilar region was used for analysis, while nasopharyngeal and other regions were not considered. |
| Bai et al (2021) | NPC | Deep learning (Auto-contouring) | 1. Segmentation: | To use computed tomography for the segmentation of NPC | 1. Segmentation performance | The developed DL algorithm had a significantly better performance than three existing DL models | An NPC-seg algorithm was developed and won 9th place on the StructSeg | N/A |
| Cai et al (2021) | NPC | Deep learning (Auto-contouring) | 1. Segmentation: | To use image and T-staging information to improve NPC tumor delineation accuracy | 1. Segmentation performance | Having the attention module and T-channel improved the effectiveness of the model. The proposed model had the best performance over four other state-of-the-art methods. | Integrating both the attention and the T-channel module can improve the delineation performance of a model substantially | 1. Small batch size due to GPU memory limitation |
| Tang et al (2021) | NPC | Deep learning (Auto-contouring) | 1. Segmentation: | To develop a model for NPC segmentation using MRI | 1. Segmentation performance | The developed network had a higher performance than three other segmentation methods | The proposed model can help clinicians by delineating the tumor in order to provide accurate staging and radiotherapy planning of NPC. | 1. Insufficient training data. |
| Wen et al (2021) | NPC | Machine learning (Radiation-induced injury diagnosis) | 1. Dosimetric factors selection: | To predict temporal lobe injury after intensity-modulated radiotherapy in NPC | 1. Identification of dosimetric factors associated with temporal lobe injury incidence | The nomogram that included dosimetric and clinical factors had a better prediction performance than the nomogram with only DVH. | The proposed method was able to predict temporal lobe injury accurately and can be used to help provide individualized follow-up management. | 1. Selection bias due to the retrospective nature of the study. |
| Wong et al (2021) | NPC | Deep learning (Diagnosis) | 1. Classification: | To differentiate early stage NPC from benign hyperplasia using T2-weighted MRI | 1. Diagnostic performance | The CNN obtained a good result in discriminating NPC and benign hyperplasia. | The proposed fully automatic network model demonstrated the prospect of CNN in identifying NPC at an early stage. | 1. There is limited generalizability as only MRI scans of the head and neck region with the field of view centered on the nasopharynx can be used. |
| Wong et al (2021) | NPC | Deep learning (Auto-contouring) | 1. Delineation: | To evaluate the delineation performance of a model using non-contrast-enhanced MRI | 1. Delineation performance | The performance of CNN using fs-T2W images was similar to that of CNNs using contrast-enhanced-T1 weighted and contrast-enhanced-fat-suppressed-T1 weighted images. | Although using contrast-enhanced sequence for head and neck MRI is still recommended, when avoiding use of contrast agent is preferred, CNN is a potential future option. | 1. Limited generalizability to other CNN architectures due to variations in tissue contrasts. |
| Wu et al (2021) | NPC and other types | Machine learning and deep learning (Prognosis) | 1. Classification: | To assess the predictive value of peritumoral regions and explore the effects of different peritumoral sizes in learning models | 1. Model performance | Radiomics is more suitable than DL for modelling peritumoral regions | The peritumoral models, together with ML and DL, helped improve the prediction performance. | 1. Datasets were small. |
| Zhang et al (2021) | NPC | Machine learning and deep learning (Prognosis) | 1. Prediction: | To predict DMFS and to investigate the influence of additional chemotherapy to concurrent chemoradiotherapy for different risk groups. | 1. Prediction performance of Distant metastasis-free survival | By integrating DL signature with N stage, EBV DNA and treatment regimen, the MRI-based combined model had a better predictive performance than the DL signature-based, radiomic signature-based and clinical-based model | The MRI-based combined model could be used as a complementary tool for making treatment decisions by assessing the risk of DMFS in locoregionally advanced NPC patients | 1. The value of the deep learning model and the collected information were limited. |
Notes: (a) Indicates performance metric presented in graph and not as a numerical value. (b) Values found in publication.
Abbreviations: NPC, nasopharyngeal carcinoma; MRI, magnetic resonance imaging; SVM, support vector machines; KNN, k-nearest neighbor; ANN, artificial neural network; AUC, area under the receiver operating characteristic curve; ML, machine learning; PET, positron emission tomography; CT, computed tomography; PPV, positive predictive values; NPV, negative predictive values; HD, Hausdorff distance; LR, logistic regression; RF, random forest; C-index, concordance index; CNN, convolutional neural network; IMRT, intensity-modulated radiation therapy; DVH, dose-volume histogram; MAE, mean absolute error; OAR, organ-at-risk; EBV DNA, Epstein–Barr Virus DNA; DL, deep learning; VMAT, volumetric modulated arc therapy; WSI, whole slide image; LASSO, least absolute shrinkage and selection operator; OS, Overall survival; DMFS, distant metastasis-free survival; LRFS, local-region relapse-free survival; AJCC, American Joint Committee on Cancer; PFS, progression-free survival.
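Several auto-contouring rows above report segmentation performance, and the Guo et al (2020) row credits a Jaccard loss with handling extreme foreground and background imbalance. The sketch below is only an illustration of that idea under stated assumptions, not the implementation used in any of the reviewed studies: the function names (`jaccard_index`, `soft_jaccard_loss`, `pixel_accuracy`), the flat-list mask representation, and the toy 100-voxel example are all hypothetical.

```python
def jaccard_index(pred, truth):
    """Jaccard index (intersection over union) of two binary masks.

    Masks are flat lists of 0/1 values; this layout is an assumption
    made for the illustration, not a format used by the cited studies.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Two empty masks agree perfectly, so define the index as 1.0.
    return 1.0 if union == 0 else intersection / union


def soft_jaccard_loss(probs, truth, eps=1e-6):
    """1 - soft Jaccard index, computed on foreground probabilities.

    `eps` (hypothetical default) avoids division by zero on empty masks.
    """
    if len(probs) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(p * t for p, t in zip(probs, truth))
    total = sum(probs) + sum(truth)
    return 1.0 - (intersection + eps) / (total - intersection + eps)


def pixel_accuracy(pred, truth):
    """Fraction of voxels labelled correctly, for comparison."""
    return sum(1 for p, t in zip(pred, truth) if p == t) / len(truth)


# Toy volume: 4 tumour voxels among 100 (strong class imbalance).
truth = [1] * 4 + [0] * 96
all_background = [0] * 100          # predicts "no tumour" everywhere
half_found = [1, 1, 0, 0] + [0] * 96  # finds 2 of the 4 tumour voxels

print(pixel_accuracy(all_background, truth))  # 0.96
print(jaccard_index(all_background, truth))   # 0.0
print(jaccard_index(half_found, truth))       # 0.5
```

In this toy case a prediction that misses the tumour entirely still scores 96% voxel accuracy but a Jaccard index of 0, which is one way to see why an overlap-based loss can be less sensitive to a dominant background class.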
Quality Assessment via the QUADAS-2 Tool
| Authors (Publication Year) | Risk of Biasᵃ: Patient Selection | Risk of Biasᵃ: Index Test | Risk of Biasᵃ: Reference Standard | Risk of Biasᵃ: Flow and Timing | Applicability Concernsᵃ: Patient Selection | Applicability Concernsᵃ: Index Test | Applicability Concernsᵃ: Reference Standard | At Riskᵇ: Risk of Bias | At Riskᵇ: Applicability |
|---|---|---|---|---|---|---|---|---|---|
| Wang et al (2010) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Aussem et al (2012) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Kumdee, Bhongmakapat and Ritthipravat (2012) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Ritthipravat, Kumdee, and Bhongmakapat (2013) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Zhu and Kan (2014) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Jiang et al (2016) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Liu et al (2016) | ✓ | ✓ | ✓ | ? | ✓ | ✓ | ✓ | Yes | No |
| Wang et al (2016) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Men et al (2017) | ✓ | ✓ | ✓ | ? | ✓ | ✓ | ✓ | Yes | No |
| Mohammed et al (2017) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Zhang et al (2017) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Zhang et al (2017) | ✓ | ✓ | ✓ | ? | ✓ | ✓ | ✓ | Yes | No |
| Li et al (2018) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Mohammed et al (2018) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Mohammed et al (2018) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Du et al (2019) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Jiao et al (2019) | ✓ | ✓ | ? | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Jing et al (2019) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Li et al (2019) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Liang et al (2019) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Lin et al (2019) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Liu et al (2019) | ✓ | ✓ | ? | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Ma et al (2019) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Peng et al (2019) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Rehioui et al (2019) | ✓ | ✓ | ✓ | ? | ✓ | ✓ | ✓ | Yes | No |
| Zhong et al (2019) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Zou et al (2019) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Abd Ghani et al (2020) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Bai et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Chen et al (2020) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Chen et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Chuang et al (2020) | ✓ | ✓ | ✓ | ? | ✓ | ✓ | ✓ | Yes | No |
| Cui et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ? | ✓ | No | Yes |
| Diao et al (2020) | ✓ | ✓ | ✓ | ? | ✓ | ✓ | ✓ | Yes | No |
| Du et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Guo et al (2020) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Jing et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Ke et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Liu et al (2020) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Men et al (2020) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Mohammed et al (2020) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Wang et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Wang et al (2020) | ✓ | ✓ | ✓ | ? | ✓ | ✓ | ✓ | Yes | No |
| Xie et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Xue et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Xue et al (2020) | ✓ | ✓ | ✓ | ? | ✓ | ✓ | ✓ | Yes | No |
| Yang et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Yang et al (2020) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Zhang et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Zhang et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Zhao et al (2020) | ✓ | ✓ | ✓ | ? | ✓ | ✓ | ✓ | Yes | No |
| Zhong et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Bai et al (2021) | ✓ | ✓ | ✓ | ? | ✓ | ✓ | ✓ | Yes | No |
| Cai et al (2021) | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Tang et al (2021) | ? | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Wen et al (2021) | ✓ | ✓ | ? | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Wong et al (2021) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Wong et al (2021) | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | Yes | No |
| Wu et al (2021) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | No | No |
| Zhang et al (2021) | ✓ | ✓ | ✓ | ? | ✓ | ✓ | ✓ | Yes | No |
Notes: ᵃA check mark (✓) indicates that the criterion was passed (ie, absence of risk); a cross mark (✘) indicates that it was not passed (ie, presence of risk); and a question mark (?) indicates that the information needed to assess the criterion was missing. ᵇThe domains “risk of bias” and “applicability” were rated as not at risk (ie, “No”) if all of the corresponding criteria were passed (ie, all ✓), and as at risk (ie, “Yes”) if at least one corresponding criterion was not passed or could not be assessed (ie, at least one ? or ✘).
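The rule in note b can be written down as a small function. The following is a minimal sketch, assuming each row is given as seven marks in table order (four risk-of-bias criteria followed by three applicability criteria); the names `at_risk` and `summarize` are illustrative and do not come from the review.

```python
from typing import Dict, Sequence

PASS, FAIL, UNCLEAR = "✓", "✘", "?"


def at_risk(marks: Sequence[str]) -> str:
    """Return "No" only when every criterion passed; otherwise "Yes"."""
    unknown = set(marks) - {PASS, FAIL, UNCLEAR}
    if unknown:
        raise ValueError(f"unexpected marks: {sorted(unknown)}")
    return "No" if all(mark == PASS for mark in marks) else "Yes"


def summarize(row: Sequence[str]) -> Dict[str, str]:
    """Split a seven-mark row into its two domains and rate each one.

    Assumed column order: four risk-of-bias marks, then three
    applicability marks, as in the table above.
    """
    if len(row) != 7:
        raise ValueError("expected 4 risk-of-bias and 3 applicability marks")
    return {
        "risk_of_bias": at_risk(row[:4]),
        "applicability": at_risk(row[4:]),
    }


# Marks copied from the Cui et al (2020) row of the table.
cui_2020 = ["✓", "✓", "✓", "✓", "✓", "?", "✓"]
print(summarize(cui_2020))
```

For the Cui et al (2020) row this yields "No" for risk of bias and "Yes" for applicability, matching the last two columns of the table.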
Quality Assessment Guidelines
| Article Sections | Parameters | Explanation |
|---|---|---|
| Title and Abstract | Title (Nature of study) | Introduce predictive model |
| | Abstract (Structured summary) | Include background, objectives, data sources, performance metrics of predictive models and conclusion about model value |
| Introduction | Rationale | Define the clinical goal, and review the current practice and performance of existing models |
| | Objectives | Identify how the proposed method can benefit the clinical target |
| Method | Describe the setting | Describe the data source, sample size, year and duration of the data |
| | Define the prediction problem | Define the nature of the study (retrospective/prospective), model function (prognosis, diagnosis, etc.) and performance metrics |
| | Prepare data for model building | Describe the inclusion and exclusion criteria of the data, data pre-processing method, performance metrics for validation, and define the training and testing set. External validation is recommended |
| | Build the predictive model | Describe how the model was built, including the AI modelling techniques used (eg, random forest, ANN, CNN) |
| Results | Report the final model and performance | Report the performance of the final proposed model, and compare it with other models and human performance. It is recommended to include confidence intervals |
| Discussion | Clinical implications | Discuss any significant findings |
| | Limitations of the model | Discuss any possible limitations found |
| Conclusion | | Discuss the clinical benefit of the model and summarize the result and findings |
Note: Data from the guideline of Luo et al.⁸
Quality Scores of the Finalized Articles
| Studies | Title | Abstract | Rationale | Objectives | Setting Description | Problem Definition | Data Preparation | Build Model | Report Performance | Clinical Implications | Limitations | Scores (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Wang et al (2010) | ✓ | ✘ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | 73% |
| Aussem et al (2012) | ✓ | ✘ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | 73% |
| Kumdee, Bhongmakapat and Ritthipravat (2012) | ✓ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | 82% |
| Ritthipravat, Kumdee and Bhongmakapat (2013) | ✓ | ✘ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | 73% |
| Zhu and Kan (2014) | ✓ | ✓ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | 82% |
| Jiang et al (2016) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | ✘ | ✓ | 82% |
| Liu et al (2016) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Wang et al (2016) | ✘ | ✓ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 82% |
| Men et al (2017) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Mohammed et al (2017) | ✓ | ✘ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 82% |
| Zhang et al (2017) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | 91% |
| Zhang et al (2017) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Li et al (2018) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Mohammed et al (2018) | ✓ | ✘ | ✓ | ✓ | ✘ | ✘ | ✓ | ✓ | ✓ | ✘ | ✘ | 55% |
| Mohammed et al (2018) | ✓ | ✓ | ✓ | ✓ | ✘ | ✘ | ✓ | ✓ | ✓ | ✘ | ✘ | 64% |
| Du et al (2019) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Jiao et al (2019) | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | ✘ | 73% |
| Jing et al (2019) | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | ✘ | ✓ | ✘ | ✘ | ✘ | 55% |
| Li et al (2019) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | 91% |
| Liang et al (2019) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Lin et al (2019) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Liu et al (2019) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | ✓ | ✓ | 91% |
| Ma et al (2019) | ✓ | ✘ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | 73% |
| Peng et al (2019) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Rehioui et al (2019) | ✓ | ✘ | ✘ | ✘ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✘ | 55% |
| Zhong et al (2019) | ✓ | ✓ | ✓ | ✘ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | 73% |
| Zou et al (2019) | ✓ | ✘ | ✓ | ✓ | ✘ | ✘ | ✓ | ✓ | ✓ | ✓ | ✘ | 64% |
| Abd Ghani et al (2020) | ✓ | ✘ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | ✘ | 64% |
| Bai et al (2020) | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | 82% |
| Chen et al (2020) | ✓ | ✘ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | 73% |
| Chen et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | 91% |
| Chuang et al (2020) | ✓ | ✓ | ✘ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | ✘ | 64% |
| Cui et al (2020) | ✓ | ✓ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 91% |
| Diao et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | 91% |
| Du et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | ✓ | 91% |
| Guo et al (2020) | ✓ | ✓ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | 82% |
| Jing et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | 91% |
| Ke et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Liu et al (2020) | ✓ | ✓ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 91% |
| Men et al (2020) | ✘ | ✓ | ✓ | ✓ | ✘ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | 73% |
| Mohammed et al (2020) | ✓ | ✘ | ✓ | ✓ | ✘ | ✘ | ✓ | ✓ | ✘ | ✘ | ✓ | 55% |
| Wang et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | ✘ | ✓ | 82% |
| Wang et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Xie et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Xue et al (2020) | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | 82% |
| Xue et al (2020) | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | 82% |
| Yang et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Yang et al (2020) | ✓ | ✓ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | 82% |
| Zhang et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Zhang et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Zhao et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Zhong et al (2020) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Bai et al (2021) | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | ✘ | 73% |
| Cai et al (2021) | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 91% |
| Tang et al (2021) | ✓ | ✓ | ✓ | ✓ | ✘ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 91% |
| Wen et al (2021) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Wong et al (2021) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Wong et al (2021) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
| Wu et al (2021) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✘ | ✓ | 91% |
| Zhang et al (2021) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 100% |
Notes: A check mark (✓) indicates that the criterion was passed; a cross mark (✘) indicates that it was not. The score is the proportion of criteria passed by that publication. Assessment parameters are based on the guideline of Luo et al.⁸
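As a worked illustration of the score defined in the notes, the sketch below counts the passed criteria among the 11 columns and expresses the result as a whole-number percentage. The function name `quality_score` is hypothetical, and rounding to the nearest integer is an assumption inferred from the values shown in the table.

```python
from typing import Sequence


def quality_score(marks: Sequence[str]) -> int:
    """Percentage of criteria marked as passed, rounded to an integer.

    Assumption: the table reports round(100 * passed / total).
    """
    if not marks:
        raise ValueError("at least one criterion is required")
    passed = sum(1 for mark in marks if mark == "✓")
    return round(100 * passed / len(marks))


# Marks copied from the Wang et al (2010) row: 8 of 11 criteria passed.
wang_2010 = ["✓", "✘", "✓", "✓", "✘", "✓", "✓", "✓", "✓", "✓", "✘"]
print(f"{quality_score(wang_2010)}%")
```

With 8 of 11 criteria passed, the function returns 73, which agrees with the 73% reported for Wang et al (2010).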
Figure 1. PRISMA 2020 flow diagram.
Figure 2. Comparison of studies on AI application for NPC management. (A) Application types of AI and its subfields on NPC; (B) Main performance metrics of application types on NPC.
Figure 3. Performance metric boxplots of AI application types on NPC. (A) Prognosis and diagnosis: accuracy, AUC, sensitivity and specificity metrics; (B) Auto-contouring: DSC metric; (C) Auto-contouring: ASSD metric.
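For readers unfamiliar with the DSC metric named in Figure 3, the Dice similarity coefficient of two contours A and B is conventionally defined as 2|A ∩ B| / (|A| + |B|). The sketch below is only illustrative: it represents each contour as a set of voxel coordinates, the name `dice_similarity` is hypothetical, and returning 1.0 when both contours are empty is an assumed convention. The reviewed studies may have computed the metric differently.

```python
from typing import Iterable, Tuple

Voxel = Tuple[int, ...]


def dice_similarity(auto: Iterable[Voxel], manual: Iterable[Voxel]) -> float:
    """Dice similarity coefficient between two sets of voxel coordinates."""
    auto_set, manual_set = set(auto), set(manual)
    if not auto_set and not manual_set:
        # Assumed convention: two empty contours agree perfectly.
        return 1.0
    overlap = len(auto_set & manual_set)
    return 2 * overlap / (len(auto_set) + len(manual_set))


# Toy 2D example: the contours share two of their three voxels each.
auto_contour = {(0, 0), (0, 1), (1, 1)}
manual_contour = {(0, 1), (1, 1), (1, 2)}
print(round(dice_similarity(auto_contour, manual_contour), 3))
```

In this toy case the overlap is two voxels and each contour has three, so the coefficient is 4/6, or about 0.667. A value of 1 indicates identical contours and 0 indicates no overlap.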