Sidong Liu, Zubair Shah, Aydin Sav, Carlo Russo, Shlomo Berkovsky, Yi Qian, Enrico Coiera, Antonio Di Ieva
Abstract
Mutations in the isocitrate dehydrogenase genes IDH1 and IDH2 are frequently found in diffuse and anaplastic astrocytic and oligodendroglial tumours, as well as in secondary glioblastomas. As IDH is a highly important prognostic, diagnostic and therapeutic biomarker for glioma, determining its mutational status is of paramount importance. Haematoxylin and eosin (H&E) staining is a valuable tool in precision oncology, as it guides histopathology-based diagnosis and subsequent patient treatment. However, H&E staining alone does not determine the IDH mutational status of a tumour. Deep learning methods applied to MRI data have been shown to be useful for IDH status prediction; however, the effectiveness of deep learning on H&E slides in the clinical setting has not yet been investigated. Furthermore, the performance of deep learning methods in medical imaging has been limited in practice by the small sample sizes currently available. Here we propose a data augmentation method based on Generative Adversarial Networks (GANs) to improve the prediction of IDH mutational status from H&E slides. The H&E slides were acquired from 266 grade II-IV glioma patients from a mixture of public and private databases, comprising 130 IDH-wildtype and 136 IDH-mutant patients. A baseline deep learning model without data augmentation achieved an accuracy of 0.794 (AUC = 0.920). With GAN-based data augmentation, the accuracy of IDH mutational status prediction improved to 0.853 (AUC = 0.927) when 3,000 GAN-generated training samples were added to the original training set of 24,000 samples. By also integrating patients' age into the model, the accuracy improved further to 0.882 (AUC = 0.931). Our findings show that deep learning, enhanced by GAN-based data augmentation, can support physicians in predicting the IDH status of gliomas.
Year: 2020 PMID: 32382048 PMCID: PMC7206037 DOI: 10.1038/s41598-020-64588-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Samples of GAN-synthesised images from coarse to fine scales.
Classification performance of different DNN models on the validation and test sets.
| | DNN model | Sensitivity | Specificity | Accuracy | AUC |
|---|---|---|---|---|---|
| On validation set | ResNet50 | 0.722 | 0.813 | 0.765 | 0.823 |
| | Inception_V3 | 0.722 | 0.750 | 0.735 | 0.858 |
| | IncepResNet_V2 | 0.688 | | | |
| | VGG19 | 0.688 | | | 0.851 |
| On test set | ResNet50 | 0.778 | 0.813 | 0.794 | 0.920 |
| | Inception_V3 | 0.778 | 0.750 | 0.765 | 0.872 |
| | IncepResNet_V2 | 0.750 | | | 0.844 |
| | VGG19 | 0.778 | | | 0.809 |
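For reference, the sensitivity, specificity, accuracy and AUC values reported in these tables can be reproduced from patient-level scores with a few lines of plain Python. The function below is an illustrative helper, not code from the paper; it assumes a 0.5 decision threshold and computes AUC via the Mann-Whitney U statistic.

```python
def classification_metrics(y_true, y_score, threshold=0.5):
    """Compute sensitivity, specificity, accuracy and AUC for binary labels.

    y_true: 1 = IDH-mutant (positive class), 0 = IDH-wildtype.
    y_score: predicted probability of the positive class.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(y_true)
    # AUC as the probability that a randomly chosen positive case is
    # scored higher than a randomly chosen negative case (ties count 0.5).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    return sensitivity, specificity, accuracy, auc
```

Note that AUC is threshold-free, which is why a model can rank patients well (high AUC) while its thresholded accuracy lags, as in the baseline row of the next table.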
Figure 2. Receiver operating characteristic (ROC) curves of the different DNN models on the validation set (a) and test set (b).
Classification performance of ResNet50 on the validation and test sets with different numbers of GAN-augmented training samples. The highest values, reached by augmenting the dataset with 3,000 synthetic images (a 12.5% addition to the original training set), are shown in bold.
| | Samples | Sensitivity | Specificity | Accuracy | AUC |
|---|---|---|---|---|---|
| On validation set | Baseline | 0.722 | 0.813 | 0.765 | 0.823 |
| | +1 K (+4.2%) | 0.778 | 0.750 | 0.765 | 0.847 |
| | +2 K (+8.3%) | 0.722 | | 0.765 | 0.851 |
| | +3 K (+12.5%) | | | | |
| | +4 K (+16.7%) | 0.778 | | 0.794 | 0.854 |
| | +5 K (+20.8%) | 0.833 | 0.750 | 0.794 | 0.840 |
| On test set | Baseline | 0.778 | 0.813 | 0.794 | 0.920 |
| | +1 K (+4.2%) | | 0.813 | | 0.924 |
| | +2 K (+8.3%) | 0.833 | | | 0.910 |
| | +3 K (+12.5%) | **0.889** | 0.813 | **0.853** | **0.927** |
| | +4 K (+16.7%) | 0.778 | 0.813 | 0.794 | 0.893 |
| | +5 K (+20.8%) | 0.778 | 0.813 | 0.794 | 0.882 |
Figure 3. ROC curves of the models trained with different numbers of GAN-generated image samples, on the validation set (a) and test set (b).
Classification performance of different predictors: age, image (based on the model trained with 3,000 additional GAN samples), and age and image combined.
| | Predictor | Sensitivity | Specificity | Accuracy | AUC |
|---|---|---|---|---|---|
| On validation set | Age | 0.667 | 0.563 | 0.618 | 0.724 |
| | Image | | | | 0.868 |
| | Age & image | | | | |
| On test set | Age | 0.722 | 0.750 | 0.735 | 0.786 |
| | Image | 0.889 | 0.813 | 0.853 | 0.927 |
| | Age & image | | | 0.882 | 0.931 |
Figure 4. ROC curves of the models with different predictors: age, image, and age and image combined.
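The age-and-image predictor suggests a simple late-fusion step. This record does not spell out the fusion model, so the following is a hedged sketch assuming logistic regression over two features (the image model's predicted probability and standardised age), run on synthetic data whose age distributions roughly mirror the cohort statistics; all names, parameters and data here are illustrative.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, steps=5000):
    """Plain gradient-descent logistic regression; returns weights and bias."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid of the linear score
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

# Synthetic stand-in cohort: 1 = IDH-mutant, 0 = IDH-wildtype.
rng = np.random.default_rng(0)
n = 200
mutant = rng.integers(0, 2, n)
# Hypothetical image-model probability: higher for mutants, with noise.
img_prob = np.clip(0.5 + 0.3 * (2 * mutant - 1) + rng.normal(0, 0.15, n), 0, 1)
# Ages echo the cohort tables: mutants ~45 years, wildtype ~55 years.
age = np.where(mutant == 1, rng.normal(45, 12, n), rng.normal(55, 13, n))

X = np.column_stack([img_prob, (age - age.mean()) / age.std()])
w, b = fit_logistic(X, mutant.astype(float))
pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) >= 0.5).astype(int)
accuracy = (pred == mutant).mean()
```

On this toy data the fused model can only match or modestly improve on the image probability alone; in the paper, combining age with the image predictor lifted test accuracy from 0.853 to 0.882.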
Performance on different age groups. The Age, Image, and Age & Image rows report the number of misclassified cases in each group.
| Age group | Predictor | Group I: < 55 | Group II: ≥ 55 |
|---|---|---|---|
| Total | | 44 (28:17) | 24 (8:16) |
| Misclassified cases | Age | 14 (3:11) | 8 (8:0) |
| | Image | 8 (4:4) | 1 (0:1) |
| | Age & image | 7 (3:4) | 1 (0:1) |
| Accuracy | Age | 0.682 | 0.667 |
| | Image | 0.818 | 0.958 |
| | Age & image | 0.841 | 0.958 |
The sample distribution of the TCGA patients (SD = standard deviation).
| Tumour type (WHO Grade) | IDH-wildtype | IDH-mutant | Total |
|---|---|---|---|
| Glioblastoma (IV)* | 70 | 30 | 100 |
| Diffuse astrocytoma (II)# | 4 | 14 | 18 |
| Anaplastic astrocytoma (III)# | 26 | 16 | 42 |
| Oligodendroglioma (II)# | 0 | 29 | 29 |
| Anaplastic oligodendroglioma (III)# | 0 | 11 | 11 |
| Total | 100 | 100 | 200 |
| Age (mean±SD) | 55.5±13.3 | 44.2±12.7 | 49.9±14.1 |
| Gender (male:female) | 58:42 | 59:41 | 117:83 |
*from the TCGA GBM project.
#from the TCGA LGG project.
The sample distribution of the cases from a local hospital (SD = standard deviation).
| Tumour type (WHO Grade) | IDH-wildtype | IDH-mutant | Total |
|---|---|---|---|
| Glioblastoma (IV) | 21 | 10 | 31 |
| Diffuse astrocytoma (II) | 4 | 3 | 7 |
| Anaplastic astrocytoma (III) | 4 | 0 | 4 |
| Oligodendroglioma (II) | 0 | 7 | 7 |
| Anaplastic oligodendroglioma (III) | 0 | 16 | 16 |
| Diffuse midline glioma (IV) | 1 | 0 | 1 |
| Total | 30 | 36 | 66 |
| Age (mean±SD) | 53.3±14.7 | 45.9±9.0 | 49.3±12.4 |
| Gender (male:female) | 14:16 | 11:25 | 25:41 |
Figure 5. A generalized overview of the proposed method, including two GAN models for data augmentation and a ResNet50 model for image classification.
Figure 6. TCGA data pre-processing. (a) Whole-slide image partition; (b) tissue-proportion-based tile selection; (c) distribution of whole-slide image sizes; (d) distribution of the number of selected tiles per image.
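The tissue-proportion-based tile selection in Figure 6(b) can be sketched as follows: partition the slide into non-overlapping tiles and keep only those with enough stained (non-background) pixels. The tile size and thresholds below are illustrative assumptions, not the paper's reported parameters, and the sketch operates on a greyscale array rather than a real whole-slide image format.

```python
import numpy as np

def select_tiles(slide, tile=256, min_tissue=0.5, white_thresh=220):
    """Partition a greyscale whole-slide array into non-overlapping tiles
    and return the (row, col) origins of tiles whose tissue proportion
    (fraction of pixels darker than the background threshold) is high enough.
    """
    h, w = slide.shape
    kept = []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            patch = slide[y:y + tile, x:x + tile]
            tissue = np.mean(patch < white_thresh)  # fraction of stained pixels
            if tissue >= min_tissue:
                kept.append((y, x))
    return kept
```

Filtering on tissue proportion discards mostly-white background tiles, so the classifier trains only on patches that actually contain tissue.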