Risa K Kawaguchi1,2,3, Masamichi Takahashi4, Mototaka Miyake5, Manabu Kinoshita6, Satoshi Takahashi2,7, Koichi Ichimura8, Ryuji Hamamoto2,7, Yoshitaka Narita4, Jun Sese2,9. 1. Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology, 2-3-26 Aomi, Koto-ku, Tokyo 135-0064, Japan. 2. Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan. 3. Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, NY 11724, USA. 4. Department of Neurosurgery and Neuro-Oncology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan. 5. Department of Diagnostic Radiology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan. 6. Department of Neurosurgery, Osaka University Graduate School of Medicine, 2-2 Yamadaoka, Suita 565-0871, Japan. 7. Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan. 8. Division of Brain Tumor Translational Research, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan. 9. Humanome Laboratory, 2-4-10 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan.
Abstract
Radiogenomics uses non-invasively obtained imaging data, such as magnetic resonance imaging (MRI), to predict critical biomarkers of patients. Developing an accurate machine learning (ML) technique for MRI requires data from hundreds of patients, which cannot be gathered from any single local hospital. Hence, a model universally applicable to multiple cohorts/hospitals is required. We applied various ML and image pre-processing procedures to a glioma dataset from The Cancer Imaging Archive (TCIA, n = 159). The models that showed a high level of accuracy in predicting glioblastoma or WHO Grade II and III glioma on the TCIA dataset were then tested on data from the National Cancer Center Hospital, Japan (NCC, n = 166) to determine whether they maintained a similarly high level of accuracy. Results: we confirmed that our ML procedure achieved a level of accuracy (AUROC = 0.904) comparable to that shown previously by deep-learning methods using TCIA. However, when we directly applied the model to the NCC dataset, its AUROC dropped to 0.383. Introducing standardization and dimension reduction procedures before classification, without re-training, improved the prediction accuracy on the NCC dataset (AUROC = 0.804) without a loss in prediction accuracy for the TCIA dataset. Furthermore, we confirmed the same tendency in a model for IDH1/2 mutation prediction: with standardization and dimension reduction, it was also applicable to multiple hospitals. Our results demonstrate that overfitting may occur when an ML method providing the highest accuracy on a small training dataset is applied to different heterogeneous datasets, and suggest a promising process for developing an ML method applicable to multiple cohorts.
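The effect of standardization on cross-hospital AUROC can be illustrated with a small synthetic sketch. This is not the authors' pipeline: the two features, the centroid-difference linear scorer, the cohort sizes, and the 100-fold scale mismatch are all assumptions made for illustration, and the dimension reduction step is omitted. The abstract does not specify how standardization was performed; here it is assumed to mean z-scoring each feature within each cohort.

```python
import random
import statistics


def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def zscore(values):
    """Standardize one feature to zero mean and unit variance within a cohort."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def make_cohort(n, scale2, rng):
    """Synthetic cohort (hypothetical): feature 1 is strongly informative,
    feature 2 is weakly informative and recorded in cohort-specific units."""
    labels = [i % 2 for i in range(n)]
    x1 = [2.0 * y + rng.gauss(0, 1) for y in labels]
    x2 = [scale2 * (0.5 * y + rng.gauss(0, 1)) for y in labels]
    return x1, x2, labels


def fit_centroid_weights(z1, z2, labels):
    """Toy linear model: weight each feature by the gap between class means."""
    def class_gap(z):
        pos = [v for v, y in zip(z, labels) if y == 1]
        neg = [v for v, y in zip(z, labels) if y == 0]
        return statistics.fmean(pos) - statistics.fmean(neg)
    return class_gap(z1), class_gap(z2)


def run_demo(seed=0):
    rng = random.Random(seed)
    a1, a2, ya = make_cohort(200, 1.0, rng)     # training cohort
    b1, b2, yb = make_cohort(200, 100.0, rng)   # external cohort, feature 2 rescaled
    # Fit once on the standardized training cohort; no re-training afterwards.
    w1, w2 = fit_centroid_weights(zscore(a1), zscore(a2), ya)
    raw = [w1 * u + w2 * v for u, v in zip(b1, b2)]
    std = [w1 * u + w2 * v for u, v in zip(zscore(b1), zscore(b2))]
    return auroc(raw, yb), auroc(std, yb)


auc_raw, auc_std = run_demo()
print(f"external cohort, raw features:          AUROC = {auc_raw:.3f}")
print(f"external cohort, standardized features: AUROC = {auc_std:.3f}")
```

In this toy setting the rescaled feature dominates the raw score, so the fixed model ranks external patients poorly until each feature is brought back to a common scale. A linear scorer is used because AUROC is unchanged by shifting or rescaling a single score, so the degradation only appears when several features with mismatched scales are combined.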