Alberto Eugenio Tozzi1, Francesco Fabozzi2,3, Megan Eckley2, Ileana Croci1, Vito Andrea Dell'Anna2, Erica Colantonio2, Angela Mastronuzzi2.
Abstract
Artificial intelligence (AI) systems have been adopted in many fields in recent years, driven by the increased computing power available at lower cost. Although AI has many promising applications in various branches of medicine, including pediatric oncology, its use is still at an embryonic stage. The aim of this paper is to provide an overview of the state of the art of AI applications in pediatric oncology through a systematic review of systematic reviews, and to analyze current trends in Europe through a bibliometric analysis of publications written by European authors. Of 330 records found, 25 were included in the systematic review. All papers have been published since 2017, showing that attention to this field is recent. The selected reviews included a total of 674 studies, a third of which had an author with a European affiliation. In the bibliometric analysis, 304 of the 978 records found were included. Similarly, the number of publications began to increase dramatically from 2017. The most explored AI applications involve diagnostic images, particularly radiomics, and the group of neoplasms most often studied is central nervous system tumors. No evidence was found regarding the use of AI for process mining, clinical pathway modeling, or computer-interpreted guidelines to improve the healthcare process. No robust evidence is yet available in any of the domains investigated by systematic reviews. However, the scientific production in Europe is significant and consistent with the topics covered in systematic reviews at the global level. The use of AI in pediatric oncology is developing rapidly with promising results, but numerous gaps and challenges must still be addressed before its use in clinical practice can be validated. An important limitation is the need for large datasets for training algorithms, which calls for international collaborative studies.
Keywords: CNS tumors; artificial intelligence; childhood cancer; deep learning; machine learning; pediatric oncology
Year: 2022 PMID: 35712463 PMCID: PMC9194810 DOI: 10.3389/fonc.2022.905770
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 5.738
Figure 1. PRISMA flowchart describing the selection process for systematic reviews and for original papers included in the bibliometric analysis.
Summary of reviews included in the systematic review.
| Reference | Year | Number of Included Studies | Cancer Group | Data Source | AI Application | Aim | Key Findings |
|---|---|---|---|---|---|---|---|
| Katsila T, et al. | 2017 | 220 | CNS tumors | MRI | Radiomics, pharmacometabolomics | Diagnosis | Radiomics: (i) tumor grading needs to be better refined, (ii) diagnostic precision should be improved, (iii) standardization in radiomics is lacking, and (iv) quantitative radiomics needs to prove clinical implementation. Pharmacometabolomics: data regarding this topic in CNS tumors are scarce. |
| Nguyen A V, et al. | 2018 | 8 | CNS tumors | MRI | Machine learning algorithms | Differential diagnosis | In differentiation of primary central nervous system lymphoma from glioblastoma on imaging, ML performed well with the lowest reported AUC being 0.878. In studies in which ML was directly compared with radiologists, ML performed better than or as well as the radiologists. However, when ML was applied to an external data set, it performed more poorly. |
| Sarkiss CA, et al. | 2019 | 29 | CNS tumors | Diagnostic images, gene expression | Machine learning algorithms | Diagnosis, prognosis | ML can predict patient outcomes, with a sensitivity range of 78%–98% and specificity range of 76%–95%. ML-based algorithms show accuracy ranging from 80% to 93% in diagnosing low-grade versus high-grade gliomas, and 90% in diagnosing high-grade glioma versus lymphoma. |
| Sohn CK, et al. | 2020 | 5 | CNS tumors | MRI | Radiomics | Diagnosis | The pooled sensitivity when diagnosing HGG was higher (96% (95% CI: 0.93, 0.98)) than the specificity when diagnosing LGG (90% (95% CI 0.85, 0.93)). Heterogeneity was observed in both sensitivity and specificity. Metaregression confirmed the heterogeneity in sample sizes (p = 0.05), imaging sequence types (p = 0.02), and data sources (p = 0.01), but not for the inclusion of the testing set (p = 0.19), feature extraction number (p = 0.36), and selection of feature number (p = 0.18). The results of subgroup analysis indicate that sample sizes of more than 100 and feature selection numbers less than the total sample size positively affected the diagnostic performance in differentiating HGG from LGG. |
| Park JE, et al. | 2020 | 51 | CNS tumors | MRI | Radiomics | Quality of radiomics studies | Prognostic/predictive studies received higher scores than diagnostic studies in comparison to gold standard (P < .001), use of calibration (P = .02), and cut-off analysis (P = .001). The quality of reporting of radiomics studies in neuro-oncology is currently insufficient. |
| Bhandari AP, et al. | 2020 | 9 | CNS tumors | MRI | Convolutional Neural Networks algorithms | Brain tumors segmentation | Only one study used a training set from their own institution. Specifics of convolution layers (i.e. filtration of images) were not detailed extensively. The majority of overfitting was done |
| Bhandari AP, et al. | 2020 | 14 | CNS tumors | MRI | Radiomics | Classification | The best classifier of IDH status was with conventional radiomics in combination with convolutional neural network–derived features (AUC = 0.95, 94.4% sensitivity, 86.7% specificity). Optimal classification of 1p19q status occurred with texture-based radiomics (AUC = 0.96, 90% sensitivity, 89% specificity). A meta-analysis showed high heterogeneity due to the uniqueness of radiomic pipelines. |
| Booth TC, et al. | 2020 | 20 | CNS tumors | MRI, PET | Radiomics | Diagnosis, prognosis and treatment response | Much research is applied to determining molecular profiles, histological tumor grade, and prognosis using MRI images acquired at the time that patients first present with a brain tumor. Although pioneering, most of the evidence is of a low level, having been obtained retrospectively and in single centers. |
| Tewarie IA, et al. | 2021 | 27 | CNS tumors | Genomics, MRI, clinical information, histopathology, pharmacokinetics | Algorithmic prognostic models | Prognosis | The included studies developed and evaluated 59 models, of which only seven were externally validated in a different patient cohort. The predictive performance among these studies varied widely according to the AUC (0.58–0.98), accuracy (0.69–0.98), and C-index (0.66–0.70). However, none of these models has been implemented into clinical care. |
| van Kempen EJ, et al. | 2021 | 17 | CNS tumors | MRI | Machine learning algorithms | Prediction of glioma genotype | Meta-analysis showed excellent accuracy for all subgroups, with the classification of 1p/19q codeletion status scoring significantly poorer than other subgroups (AUC: 0.748, p = 0.132). Classification of IDH mutation shows an overall AUC of 0.909 (95%-CI: 0.867–0.951). AUC of MGMT promoter methylation status was estimated as 0.866 (95%-CI: 0.812–0.921). There was considerable heterogeneity among some of the included studies. |
| van Kempen EJ, et al. | 2021 | 8 | CNS tumors | MRI | Machine learning algorithm-based glioma segmentation tools | Accuracy of brain tumor segmentation | The MLAs from the included studies showed an overall Dice similarity coefficient (DSC) score of 0.84 (95% CI: 0.82–0.86). In addition, a DSC score of 0.83 (95% CI: 0.80–0.87) and 0.82 (95% CI: 0.78–0.87) was observed for the automated glioma segmentation of the high-grade and low-grade gliomas, respectively. However, heterogeneity was considerably high between included studies, and publication bias was observed. |
| Al-Galal SAY, et al. | 2021 | 92 | CNS tumors | MRI | Deep learning techniques for classification and segmentation | Brain tumors segmentation, classification | A significant advantage of DL techniques is their computability and consistency with many conventional techniques. DL methods focusing on convolutional neural networks (CNN) are more applicable to all sub-fields of medical image processing, such as classification, identification, and segmentation. |
| Buchlak QD, et al. | 2021 | 153 | CNS tumors | MRI | Machine learning for diagnosis and classification | Diagnosis, classification | Model performance of machine learning was generally strong (AUC = 0.87 ± 0.09; sensitivity = 0.87 ± 0.10; specificity = 0.86 ± 0.10; precision = 0.88 ± 0.11). Convolutional neural network, support vector machine and random forest algorithms were top performers. |
| Jian A, et al. | 2021 | 44 | CNS tumors | MRI | Radiomics | Diagnosis, prediction of molecular markers | The pooled sensitivity and specificity for predicting isocitrate dehydrogenase (IDH) mutation in training datasets were 0.88 (95% CI 0.83-0.91) and 0.86 (95% CI 0.79-0.91), respectively, and 0.83 to 0.85 in validation sets. Use of data augmentation and MRI sequence type were weakly associated with heterogeneity. Both O6-methylguanine-DNA methyltransferase (MGMT) gene promoter methylation and 1p/19q codeletion could be predicted with a pooled sensitivity and specificity between 0.76 and 0.83 in training datasets. |
| Tabatabaei M, et al. | 2021 | 18 | CNS tumors | MRI | Radiomics | Classification | Results appear promising for grade prediction from MR images using the radiomics techniques. However, there is no agreement about the radiomics pipeline, and the prior studies are very heterogeneous regarding the software used, the number of extracted features, MR sequences, and machine learning technique. Before the clinical implementation of glioma grading by radiomics, more standardized research is needed. |
| d’Este SH, et al. | 2021 | 14 | CNS tumors | PET | Combination of multimodality imaging with AI | Defining tumor infiltration by imaging | All studies concluded their findings to be of significant value for future clinical practice. Diagnostic test accuracy reached an area under the curve of 0.74–0.91 reported in six studies. When AUC is not provided, a sensitivity of 80.0%–100% and a specificity of 69.2%–100%, an accuracy of 78%–81.8% and a Pearson’s correlation coefficient of 0.74–0.88 were found. |
| Zhong J, et al. | 2020 | 12 | Bone and soft tissue sarcomas | MRI, PET | Radiomics | Quality of radiomics studies; prognosis | Median Radiomics Quality Score: 5 (– |
| Crombé A, et al. | 2020 | 52 | Bone and soft tissue sarcomas | MRI, PET, CT, Ultrasound | Radiomics | Quality of radiomics studies | None of the studies performed a test-retest analysis of the radiomic features or a phantom study. Only two studies were prospectively designed. Thirty-eight out of 52 (73.1%) studies did not validate their results on an independent cohort. Median Radiomics Quality Score: 4.5 (– |
| Gitto S, et al. | 2021 | 49 | Bone and soft tissue sarcomas | MRI, CT | Radiomics | Reproducibility and prediction of diagnosis | Eighteen (37%) of the 49 studies included a reproducibility analysis of the radiomic features in their workflow. The intraclass correlation coefficient (ICC) was the statistical method used in most of the papers reporting a reproducibility analysis. At least one machine learning validation technique was used in 25 (51%) of the 49 papers. A clinical validation of the radiomics-based prediction model was reported in 19 (39%) of the 49 papers. The quality of sarcoma radiomics studies is low, which may hamper performance generalizability of radiomic models on independent cohorts and, consequently, their practical application. |
| Wang H, et al. | 2020 | 45 | Lymphomas | MRI, CT, PET | Radiomics | Diagnosis, prognosis, quality of radiomics studies | Radiomics features can be used to effectively differentiate lymphoma from another disease (AUC values of the studies ranged from 0.730 to 1.000). Radiomics features are prognostic predictors for the outcome of patients with several types of lymphoma. However, the quality of published radiomics studies in lymphoma has been suboptimal to date. |
| Frood R, et al. | 2021 | 41 | Lymphomas | PET; CT | Quantitative imaging parameters derived from pretreatment FDG PET/CT; radiomics | Prognosis, treatment outcome | Significant predictive ability was reported in 5/20 DLBCL studies assessing SUVmax (PFS: HR 0.13–7.35, OS: HR 0.83–11.23), 17/19 assessing metabolic tumor volume (MTV) (PFS: HR 2.09–11.20, OS: HR 2.40–10.32) and 10/13 assessing total lesion glycolysis (TLG) (PFS: HR 1.078–11.21, OS: HR 2.40–4.82). Significant predictive ability was reported in 1/4 HL studies assessing SUVmax (HR not reported), 6/8 assessing MTV (PFS: HR 1.2–10.71, OS: HR 1.00–13.20) and 2/3 assessing TLG (HR not reported). There are 7/41 studies assessing the use of radiomics (4 DLBCL, 2 HL); 5/41 studies had internal validation and 2/41 included external validation. All studies had overall moderate or high risk of bias. |
| Badrigilan S, et al. | 2021 | 30 | Head and neck cancer | MRI | AI assisted classification and segmentation | Tumor segmentation, classification | The overall performance of DL models for the complete tumor in terms of the pooled Dice score, sensitivity, and specificity was 0.8965 (95% confidence interval (95% CI): 0.76–0.9994), 0.9132 (95% CI: 0.71–0.994) and 0.9164 (95% CI: 0.78–1.00), respectively. The DL methods achieved the highest performance for classifying three types of gliomas, meningioma, and pituitary tumors with overall accuracies of 96.01%, 99.73%, and 96.58%, respectively. Stratification of glioma tumors by high and low grading revealed overall accuracies of 94.32% and 94.23% for the DL methods, respectively. |
| Carbonara R, et al. | 2021 | 8 | Head and neck cancers | MRI, PET, CT | Radiomics | Prediction of radiation-induced side effects | Published radiomic studies provide encouraging but still limited and preliminary data that require further validation to improve the decision-making processes in preventing and managing radiation-induced toxicities. |
| Gupta V, et al. | 2020 | 27 | HSCT | Clinical data, imaging, genomic and demographic data | Machine learning techniques | Prognosis | The majority of studies used supervised ML, related to post-HSCT complications, but were limited by small numbers of patients. None of the studies provided robust evidence to determine an optimal ML technique for HSCT or minimal number of variables required to build predictive models. However, our results suggest that ADT could be applicable in HSCT settings due to their interpretability. |
| Salah HT, et al. | 2019 | 23 | Leukemia | Microscopy, flow cytometry | Machine learning techniques | Diagnosis | Multiple studies have applied ML tools on leukemia diagnosis. Some studies have reached high classification accuracy. Nevertheless, literature presented in this review illustrates the need for multiple future directions. |
AUC, Area Under the Curve; DL, Deep Learning; DLBCL, Diffuse Large B Cell Lymphoma; HL, Hodgkin Lymphoma; HR, Hazard Ratio; CI, Confidence Interval; HGG, High Grade Glioma; LGG, Low Grade Glioma; MLAs, Machine Learning Algorithms; PFS, Progression-Free Survival.
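
Several of the reviews summarized in this table report sensitivity, specificity, and the Dice similarity coefficient (DSC). As a minimal sketch of how these quantities are usually computed from binary labels or segmentation masks, the Python below uses hypothetical function names and toy data; none of it is taken from the reviewed studies, and pooled estimates in meta-analyses involve additional statistical modeling not shown here.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 means positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(y_true, y_pred):
    """Fraction of true positives that the model detects: TP / (TP + FN)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true, y_pred):
    """Fraction of true negatives that the model rejects: TN / (TN + FP)."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def dice(mask_a, mask_b):
    """Dice similarity coefficient 2|A and B| / (|A| + |B|) for flattened 0/1 masks."""
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a == 1 and b == 1)
    total = sum(mask_a) + sum(mask_b)
    # Two empty masks are treated as a perfect match (an assumption of this sketch).
    return 2 * overlap / total if total else 1.0


# Toy data (illustrative only): 4 positives and 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
print(sensitivity(y_true, y_pred), specificity(y_true, y_pred))

# Toy segmentation masks flattened to one dimension.
reference = [1, 1, 1, 1, 0, 0]
predicted = [0, 1, 1, 1, 1, 0]
print(dice(reference, predicted))
```

A DSC of 1 means the automated and reference segmentations overlap perfectly, and 0 means they do not overlap at all.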
Figure 2. Number of original papers included in the bibliometric analysis by year of publication.
Scientific journals hosting original papers on AI in pediatric oncology.
| Sources | n. articles | h_index | g_index | m_index | TC | NP | PY_start |
|---|---|---|---|---|---|---|---|
| PLOS ONE | 10 | 1 | 1 | 0.250 | 107 | 1 | 2018 |
| CANCERS | 8 | 1 | 1 | 0.250 | 13 | 1 | 2018 |
| IEEE ACCESS | 8 | 2 | 2 | 0.250 | 63 | 2 | 2014 |
| JOURNAL OF MEDICAL IMAGING | 8 | 1 | 1 | 0.167 | 35 | 1 | 2016 |
| IEEE TRANSACTIONS ON MEDICAL IMAGING | 7 | 1 | 1 | 0.059 | 9 | 1 | 2005 |
| MEDICAL HYPOTHESES | 7 | 1 | 1 | 0.500 | 1 | 1 | 2020 |
| SCIENTIFIC REPORTS | 7 | 1 | 1 | 0.077 | 55 | 1 | 2009 |
| APPLIED SCIENCES-BASEL | 6 | 4 | 6 | 1.000 | 86 | 6 | 2018 |
| COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE | 6 | 1 | 1 | 0.071 | 120 | 1 | 2008 |
| SENSORS | 6 | 4 | 4 | 0.222 | 270 | 4 | 2004 |
| BMC BIOINFORMATICS | 5 | 1 | 1 | 0.250 | 5 | 1 | 2018 |
| COMPUTERIZED MEDICAL IMAGING AND GRAPHICS | 5 | 1 | 1 | 1.000 | 3 | 1 | 2021 |
| COMPUTERS IN BIOLOGY AND MEDICINE | 5 | 2 | 2 | 0.125 | 156 | 2 | 2006 |
| CYTOMETRY PART A | 5 | 1 | 1 | 0.333 | 1 | 1 | 2019 |
| ARTIFICIAL INTELLIGENCE IN MEDICINE | 4 | 3 | 3 | 0.750 | 48 | 3 | 2018 |
| BIOMEDICAL SIGNAL PROCESSING AND CONTROL | 4 | 1 | 1 | 0.250 | 5 | 1 | 2018 |
| DIAGNOSTICS | 4 | 2 | 2 | 0.333 | 23 | 2 | 2016 |
| NMR IN BIOMEDICINE | 4 | 1 | 1 | 0.333 | 8 | 1 | 2019 |
| AMERICAN JOURNAL OF NEURORADIOLOGY | 3 | 1 | 1 | 0.500 | 1 | 1 | 2020 |
| BIOLOGY DIRECT | 3 | 2 | 3 | 0.182 | 19 | 3 | 2011 |
The table includes the first 20 journals in order of number of publications. H-index: the Hirsch index (H-index) is the number h of a journal's published articles that have each been cited in other papers at least h times. m-index: the m-index is defined as H/n, where H is the H-index and n is the number of years since the first published paper of the journal. g-index: a refinement of the H-index, defined as the largest number g such that the g most cited articles together received at least g² citations. TC, Total number of Citations; NP, Net Production; PY_start, starting year of the journal.
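
The three indices defined in this footnote can be sketched in a few lines of Python. The function names and the toy citation counts are hypothetical. The inclusive year count in `m_index` is an assumption: it appears consistent with the tabulated values if 2021 is taken as the reference year (for example, 4 / (2021 - 2018 + 1) = 1.000 for Applied Sciences-Basel), but the source does not state the convention used.

```python
def h_index(citations):
    """Largest h such that h articles each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the g most cited articles total at least g*g citations.

    This sketch caps g at the number of articles, which is one common convention.
    """
    ranked = sorted(citations, reverse=True)
    running_total = 0
    g = 0
    for rank, cites in enumerate(ranked, start=1):
        running_total += cites
        if running_total >= rank * rank:
            g = rank
    return g


def m_index(citations, first_year, reference_year):
    """H-index divided by the number of years since the first paper.

    Years are counted inclusively (an assumption, see the lead-in above).
    """
    years = reference_year - first_year + 1
    return h_index(citations) / years


# Toy citation counts for five articles (illustrative only).
toy = [10, 8, 5, 4, 3]
print(h_index(toy), g_index(toy), m_index(toy, 2018, 2021))
```

Because the g-index sums citations over the top-ranked articles, a few highly cited papers can raise it above the H-index, as in this toy example.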
Figure 3. Country collaboration network based on country of authors in the original papers included in the bibliometric analysis.
Distribution of diagnosis and source of information for the AI application in publications selected for the bibliometric analysis.
| Diagnosis | Images (n) | Images (%) | Omics (n) | Omics (%) | Histopathology/Blood (n) | Histopathology/Blood (%) | Other (n) | Other (%) |
|---|---|---|---|---|---|---|---|---|
| CNS tumor (n=186) | 146 | 78.5 | 20 | 10.8 | 17 | 9.1 | 6 | 3.2 |
| Leukemia (n=55) | 1 | 1.8 | 22 | 40.0 | 27 | 49.1 | 7 | 12.7 |
| Lymphoma (n=24) | 6 | 25.0 | 10 | 41.7 | 6 | 25.0 | 2 | 8.3 |
| Neuroblastoma (n=15) | 4 | 26.7 | 10 | 66.7 | 2 | 13.3 | 0 | – |
| Bone and soft-tissue sarcoma (n=13) | 0 | – | 4 | 30.8 | 8 | 61.5 | 1 | 7.7 |
| Wilms Tumor (n=11) | 7 | 63.6 | 3 | 27.3 | 1 | 9.1 | 0 | – |
| Hematopoietic stem cells transplantation (n=3) | 0 | – | 1 | 33.3 | 0 | – | 2 | 66.7 |
| Other tumors (n=27) | 6 | 22.2 | 16 | 59.3 | 5 | 18.5 | 1 | 3.7 |
Each publication may include different data sources for the development of AI applications and may focus on more than one disease group.
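
Because one publication can contribute to several data-source columns, the counts in a row can add up to more than the row's n, and the percentages can exceed 100% in total. A minimal sketch of this arithmetic, using the CNS tumor and leukemia rows from the table (the function name and dictionary layout are illustrative assumptions):

```python
def row_percentages(n_publications, source_counts):
    """Percentage of publications using each data source, rounded to one decimal."""
    return {
        source: round(100 * count / n_publications, 1)
        for source, count in source_counts.items()
    }


# Counts copied from the table; a publication may appear under several sources.
cns = {"Images": 146, "Omics": 20, "Histopathology/Blood": 17, "Other": 6}
leukemia = {"Images": 1, "Omics": 22, "Histopathology/Blood": 27, "Other": 7}

cns_pct = row_percentages(186, cns)
leukemia_pct = row_percentages(55, leukemia)
print(cns_pct)
print(leukemia_pct)

# The source counts exceed the number of publications in both rows,
# which is expected when categories are not mutually exclusive.
print(sum(cns.values()), "source mentions across 186 CNS tumor publications")
print(sum(leukemia.values()), "source mentions across 55 leukemia publications")
```

Each percentage is computed against the number of publications in the disease group, not against the sum of the row.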