Alexandre Carré, Guillaume Klausner, Myriam Edjlali, Marvin Lerousseau, Jade Briend-Diop, Roger Sun, Samy Ammari, Sylvain Reuzé, Emilie Alvarez Andres, Théo Estienne, Stéphane Niyoteka, Enzo Battistella, Maria Vakalopoulou, Frédéric Dhermain, Nikos Paragios, Eric Deutsch, Catherine Oppenheim, Johan Pallud, Charlotte Robert.
Abstract
Radiomics relies on the extraction of a wide variety of quantitative image-based features to provide decision support. Magnetic resonance imaging (MRI) contributes to the personalization of patient care but suffers from being highly dependent on acquisition and reconstruction parameters. Today, there are no guidelines regarding the optimal pre-processing of MR images in the context of radiomics, which is crucial for the generalization of published image-based signatures. This study aims to assess the impact of three different intensity normalization methods (Nyul, WhiteStripe, Z-Score) typically used in MRI together with two methods for intensity discretization (fixed bin size and fixed bin number). The impact of these methods was evaluated on first- and second-order radiomics features extracted from brain MRI, establishing a unified methodology for future radiomics studies. Two independent MRI datasets were used. The first one (DATASET1) included 20 institutional patients with WHO grade II and III gliomas who underwent post-contrast 3D axial T1-weighted (T1w-gd) and axial T2-weighted fluid attenuation inversion recovery (T2w-flair) sequences on two different MR devices (1.5 T and 3.0 T) with a 1-month delay. Jensen-Shannon divergence was used to compare pairs of intensity histograms before and after normalization. The stability of first-order and second-order features across the two acquisitions was analysed using the concordance correlation coefficient and the intra-class correlation coefficient. The second dataset (DATASET2) was extracted from the public TCIA database and included 108 patients with WHO grade II and III gliomas and 135 patients with WHO grade IV glioblastomas. The impact of normalization and discretization methods was evaluated based on a tumour grade classification task (balanced accuracy measurement) using five well-established machine learning algorithms. 
Intensity normalization markedly improved the robustness of first-order features and the performance of subsequent classification models. For the T1w-gd sequence, the mean balanced accuracy for tumour grade classification increased from 0.67 (95% CI 0.61-0.73) without normalization to 0.82 (95% CI 0.79-0.84, P = .006), 0.79 (95% CI 0.76-0.82, P = .021) and 0.82 (95% CI 0.80-0.85, P = .005) using the Nyul, WhiteStripe and Z-Score normalization methods, respectively. Relative discretization makes intensity normalization unnecessary for second-order radiomics features. Although the number of bins used for discretization had only a small impact on classification performance, 32 bins offered a good compromise across both T1w-gd and T2w-flair sequences. Feature selection did not significantly improve classification performance. A standardized pre-processing pipeline is proposed for the use of radiomics in MRI of brain tumours. For models based on first- and second-order features, we recommend normalizing images with the Z-Score method and adopting an absolute discretization approach. For second-order feature-based signatures, relative discretization can be used without prior normalization. In both cases, 32 bins for discretization are recommended. This study may pave the way for the multicentric development and validation of MR-based radiomics biomarkers.
Year: 2020 PMID: 32704007 PMCID: PMC7378556 DOI: 10.1038/s41598-020-69298-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
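The pre-processing steps named in the abstract (Z-Score normalization, then relative or absolute discretization) can be illustrated with a minimal sketch. It uses only the Python standard library and flat lists of intensities; the function names, the `origin` parameter and the toy intensities are assumptions made for illustration, not the authors' implementation, and the exact bin-edge convention varies between radiomics packages.

```python
import math
import statistics


def zscore_normalize(intensities):
    """Centre on the mean and scale by the standard deviation.

    Assumption: statistics are taken over the values passed in and the
    values are not all identical (otherwise the deviation is zero).
    """
    mu = statistics.fmean(intensities)
    sigma = statistics.pstdev(intensities)
    return [(v - mu) / sigma for v in intensities]


def discretize_fbn(values, n_bins=32):
    """Fixed bin number (relative): split the region's own [min, max] into n_bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    # The maximum would land in bin n_bins + 1, so it is clipped into the last bin.
    return [min(int((v - lo) / width) + 1, n_bins) for v in values]


def discretize_fbs(values, bin_width, origin=0.0):
    """Fixed bin size (absolute): bin edges are fixed on the intensity scale.

    `origin` is a hypothetical parameter marking where bin 1 starts.
    """
    return [math.floor((v - origin) / bin_width) + 1 for v in values]


# Toy region of interest and the same region on a rescaled intensity scale.
roi = [100, 130, 175, 220, 410, 415, 500]
rescaled = [2 * v + 50 for v in roi]

fbn_raw = discretize_fbn(roi, n_bins=4)
fbn_rescaled = discretize_fbn(rescaled, n_bins=4)
fbn_zscore = discretize_fbn(zscore_normalize(roi), n_bins=4)
fbs_raw = discretize_fbs(roi, bin_width=100)
fbs_rescaled = discretize_fbs(rescaled, bin_width=100)
```

In this toy case the fixed-bin-number labels are identical for the raw, rescaled and Z-Score-normalized intensities, whereas the fixed-bin-size labels change with the intensity scale. This is consistent with the abstract's statement that relative discretization can be used without prior normalization, while absolute discretization is paired with normalization.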
Normalization methods and grey level discretization applied in recent radiomics studies dedicated to brain tumors.
| References | Multicenter | Number of patients | MRI sequences | Normalization technique | Grey-level discretization | Radiomics software | Features | Objective |
|---|---|---|---|---|---|---|---|---|
| Su et al.[ | No | 100 | T2w-flair | – | – | Pyradiomics | 18 first-order, 13 shape, 54 texture | Investigate the feasibility of predicting H3 K27M mutation status by applying an automated machine learning approach to the MR radiomics features of patients with midline gliomas |
| Liu et al.[ | Yes | 130 | T1w, T2w-flair | ComBat | – | Artificial Intelligence Kit (GE) | First-order, texture | Develop and validate a model that can be used to predict the individualized treatment response in children with cerebral palsy |
| Bologna et al.[ | – | Phantom | T1w, T2w | Z-Score | 32 FBN | Pyradiomics | 18 first-order, 14 shape, 75 texture | Analysis of virtual phantom for preprocessing evaluation and detection of a robust feature set for MRI-radiomics of the brain |
| Elsheikh et al.[ | Yes | 135 | T1w, T1w-gd, T2w, T2w-flair | – | – | – | First-order, texture | Analysis of multi-stage association of glioblastoma gene expressions with texture and spatial patterns |
| Tixier et al.[ | Yes | 90 | T1w-gd, T2w-flair | – | 128 FBN | CERR | 72 features (first-order, texture, shape) | Study the impact of tumor segmentation variability on the robustness of MRI radiomics features |
| Ortiz-Ramón et al.[ | No | 200 | T1w, T2w, T2w-flair | – | 32 FBN | MATLAB | 114 textures | Identify the presence of ischaemic stroke lesions by means of texture analysis on brain MRI |
| Vamvakas et al.[ | No | 40 | T1w, T1w-gd, T2w, T2w-flair | – | – | MATLAB | 11 first-order, 16 texture | Investigate the value of advanced multiparametric MRI biomarker analysis based on radiomics features and machine learning classification for glioma grading |
| Tixier et al.[ | Yes | 159 | T1w, T1w-gd, T2w-flair | – | 128 FBN | CERR | 286 features (first-order, shape, texture) | Evaluate the capacity of radiomics features to add complementary information to MGMT status, to improve the ability to predict prognosis |
| Wu et al.[ | Yes | 126 | T1w, T1w-gd, T2w, T2w-flair | – | – | – | 704 features (first-order, shape, texture) | Identify the optimal radiomics-based machine learning method for isocitrate dehydrogenase genotype prediction in diffuse gliomas |
| Artzi et al.[ | No | 439 | T1w-gd | WhiteStripe | – | MATLAB | 757 features (first-order, shape, texture) | Differentiate between glioblastoma and brain metastasis subtypes using radiomics analysis |
| Kniep et al.[ | No | 189 | T1w, T1w-gd, T2w-flair | WhiteStripe | – | Pyradiomics | 18 first-order, 17 shape, 56 texture | Investigate the feasibility of tumor type prediction with MRI radiomics image features of different brain metastases in a multiclass machine learning approach for patients with unknown primary lesion at the time of diagnosis |
| Sanghani et al.[ | Yes | 163 | T1w, T1w-gd, T2w, T2w-flair | – | – | Pyradiomics | 2200 features (first-order, shape, texture) | Predict overall survival in glioblastoma multiforme patients from volumetric, shape and texture features using machine learning |
| Liu et al.[ | Yes | 84 | T2w | Z-Score | – | MATLAB | 131 features (first-order, shape, texture) | Develop a radiomics signature for prediction of progression-free survival (PFS) in lower-grade gliomas and investigate the genetic background behind the radiomics signature |
| Peng et al.[ | No | 66 | T1w-gd, T2w-flair | – | 64 FBN | MATLAB | 51 features (first-order, shape, texture) | Distinguish true progression from radionecrosis after stereotactic radiation therapy for brain metastases with machine learning and radiomics |
| Bae et al.[ | No | 217 | T1w-gd, T2w-flair | WhiteStripe | – | Pyradiomics | 796 features (first-order, shape, texture) | Investigate whether radiomics features based on MRI improve survival prediction in patients with glioblastoma multiforme (GBM) when they are integrated with clinical and genetic profiles |
| Chen et al.[ | Yes | 220 | T1w, T1w-gd, T2w, T2w-flair | Nyul | – | Pyradiomics | 420 features (first-order, shape, texture) | Classify gliomas combining automatic segmentation and radiomics |
Jensen–Shannon divergences on DATASET1 compared using a Tukey HSD test.
| Pair | Comparison | T1w-gd (Tukey HSD mean difference) | T2w-flair (Tukey HSD mean difference) |
|---|---|---|---|
| Pair 1 | No normalization-Nyul | −0.469 | −0.284 |
| Pair 2 | No normalization-WhiteStripe | −0.446 | −0.237 |
| Pair 3 | No normalization-Z-score | −0.433 | −0.241 |
| Pair 4 | Nyul-WhiteStripe | 0.024 | 0.048 |
| Pair 5 | Nyul-Z-score | 0.036 | 0.043 |
| Pair 6 | WhiteStripe-Z-score | 0.012 | −0.005 |
| ANOVA | | < 0.001 | < 0.001 |
*Significant (P < .05).
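As an illustration of the metric compared in this table, the Jensen–Shannon divergence between two intensity histograms can be sketched as follows. This is a minimal standard-library version; the base-2 logarithm, the shared histogram range and the function names are assumptions, since the record does not state which convention was used.

```python
import math


def histogram(values, n_bins, lo, hi):
    """Normalized histogram (probabilities summing to 1) over a shared range.

    Assumption: every value lies within [lo, hi].
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # put v == hi in the last bin
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; empty bins of p contribute zero."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """JSD(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M = (P + Q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

With base-2 logarithms the divergence is bounded between 0 (identical histograms) and 1 (histograms with no overlap), so a lower value after normalization indicates that the two acquisitions have more similar intensity distributions.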
Number of first-order features with ICCs and CCCs > 0.80 on DATASET1.
| Normalization method | T1w-gd | T2w-flair |
|---|---|---|
| No normalization | 0/18 | 0/18 |
| Nyul | 16/18 | 8/18 |
| WhiteStripe | 5/18 | 1/18 |
| Z-Score | 9/18 | 1/18 |
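The counts above rely on a 0.80 threshold applied to agreement coefficients. A minimal sketch of the concordance correlation coefficient, assuming Lin's standard definition with population (1/n) moments, shows why agreement is a stricter criterion than correlation. The function name and toy values are illustrative only.

```python
import statistics


def concordance_correlation_coefficient(x, y):
    """Lin's CCC: 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2)."""
    n = len(x)
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)


# Toy feature values for the same patients on two scanners.
scan_1 = [1.0, 2.0, 3.0, 4.0]
scan_2_shifted = [v + 1.0 for v in scan_1]  # perfectly correlated, but offset

ccc_identical = concordance_correlation_coefficient(scan_1, scan_1)
ccc_shifted = concordance_correlation_coefficient(scan_1, scan_2_shifted)
```

In this toy example the shifted feature is perfectly correlated with the original, yet its CCC is about 0.71 and would fall below the 0.80 threshold, because the coefficient penalizes a systematic offset between the two acquisitions.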
Figure 1. Balanced accuracies obtained for the tumour grade classification task using the 18 first-order features only. Bars and error bars show the mean balanced accuracy and its 95% CI, respectively, computed over all 5 test folds of the cross-validation of the 5 machine learning models, as a function of the normalization method. (A) T1w-gd MRI sequence only, (B) T2w-flair MRI sequence only.
Figure 2. Percentages of the 73 textural features showing ICC and CCC values > 0.8 depending on the intensity normalization and the discretization method. (A) FBN T1w-gd, (B) FBN T2w-flair, (C) FBS T1w-gd, (D) FBS T2w-flair. FBN fixed bin number (relative discretization), FBS fixed bin size (absolute discretization), ICC intra-class correlation coefficient, CCC concordance correlation coefficient. In (A) and (B), the No Normalization, WhiteStripe and Z-Score line plots are superimposed. In (C) and (D), the No Normalization, WhiteStripe and Z-Score line plots are separated.
Figure 3. Balanced accuracies obtained for the tumour grade classification task using the 73 textural features only. Bars and error bars show the mean balanced accuracy and its 95% CI, respectively, computed over all 5 test folds of the cross-validation of the 5 machine learning models, as a function of the normalization method and number of bins. (A) FBN T1w-gd. (B) FBN T2w-flair. (C) FBS T1w-gd. (D) FBS T2w-flair. FBN fixed bin number (relative discretization). FBS fixed bin size (absolute discretization).
Summary of the average balanced accuracies and the corresponding 95% CI (DATASET2) obtained using all 5 test folds of the cross-validation of the 5 machine learning models (neural network, random forest, support vector machine, logistic regression, naïve Bayes) as a function of the normalization method. For both intensity discretization methods (FBN and FBS), 32 bins were used. For model 4, numbers of robust features as defined using DATASET1 are written in square brackets. BAC balanced accuracy, ROC-AUC area under the receiver operating characteristic curve.
| Discretization | Normalization | T1w-gd Model 1 BAC | T1w-gd Model 1 ROC-AUC | T1w-gd Model 2 BAC | T1w-gd Model 2 ROC-AUC | T1w-gd Model 3 BAC | T1w-gd Model 3 ROC-AUC | T1w-gd Model 4 BAC | T1w-gd Model 4 ROC-AUC | T2w-flair Model 1 BAC | T2w-flair Model 1 ROC-AUC | T2w-flair Model 2 BAC | T2w-flair Model 2 ROC-AUC | T2w-flair Model 3 BAC | T2w-flair Model 3 ROC-AUC | T2w-flair Model 4 BAC | T2w-flair Model 4 ROC-AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FBN | No normalization | 0.67 (0.61–0.73) | 0.74 (0.68–0.80) | 0.80 (0.76–0.83) | 0.86 (0.82–0.89) | 0.76 (0.71–0.81) | 0.83 (0.77–0.88) | 0.73 (0.70–0.77) [23] | 0.83 (0.80–0.86) [23] | 0.62 (0.59–0.64) | 0.64 (0.60–0.68) | 0.65 (0.62–0.68) | 0.70 (0.67–0.73) | 0.63 (0.60–0.65) | 0.70 (0.66–0.74) | 0.60 (0.57–0.63) [23] | 0.66 (0.63–0.70) [23] |
| FBN | Nyul | 0.82 (0.79–0.84) | 0.90 (0.87–0.92) | 0.76 (0.72–0.79) | 0.83 (0.80–0.86) | 0.81 (0.77–0.84) | 0.88 (0.86–0.91) | 0.81 (0.78–0.84) [43] | 0.89 (0.86–0.92) [43] | 0.56 (0.52–0.59) | 0.61 (0.58–0.65) | 0.67 (0.64–0.69) | 0.72 (0.70–0.74) | 0.66 (0.64–0.69) | 0.71 (0.69–0.74) | 0.62 (0.59–0.66) [27] | 0.67 (0.63–0.70) [27] |
| FBN | WhiteStripe | 0.79 (0.77–0.82) | 0.88 (0.86–0.90) | 0.80 (0.76–0.83) | 0.86 (0.83–0.90) | 0.80 (0.77–0.84) | 0.89 (0.86–0.92) | 0.79 (0.76–0.83) [28] | 0.89 (0.87–0.91) [28] | 0.57 (0.54–0.60) | 0.63 (0.60–0.67) | 0.65 (0.62–0.68) | 0.70 (0.67–0.73) | 0.65 (0.62–0.67) | 0.70 (0.67–0.73) | 0.62 (0.59–0.65) [24] | 0.67 (0.64–0.71) [24] |
| FBN | Z-Score | 0.82 (0.80–0.85) | 0.91 (0.89–0.93) | 0.80 (0.76–0.83) | 0.86 (0.83–0.90) | 0.82 (0.80–0.86) | 0.90 (0.88–0.93) | 0.81 (0.78–0.84) [32] | 0.91 (0.89–0.94) [32] | 0.60 (0.57–0.63) | 0.65 (0.62–0.69) | 0.65 (0.62–0.68) | 0.70 (0.66–0.73) | 0.67 (0.64–0.70) | 0.72 (0.69–0.75) | 0.63 (0.60–0.66) [24] | 0.68 (0.65–0.72) [24] |
| FBS | No normalization | 0.67 (0.61–0.73) | 0.74 (0.68–0.80) | 0.68 (0.62–0.72) | 0.75 (0.70–0.79) | 0.69 (0.63–0.74) | 0.75 (0.68–0.81) | 0.58 (0.54–0.61) [9] | 0.64 (0.59–0.69) [23] | 0.62 (0.59–0.64) | 0.64 (0.60–0.68) | 0.60 (0.58–0.63) | 0.64 (0.61–0.68) | 0.59 (0.56–0.62) | 0.64 (0.60–0.67) | 0.56 (0.54–0.59) [7] | 0.61 (0.58–0.65) [7] |
| FBS | Nyul | 0.82 (0.79–0.84) | 0.90 (0.87–0.92) | 0.76 (0.74–0.79) | 0.83 (0.80–0.86) | 0.81 (0.78–0.84) | 0.88 (0.85–0.91) | 0.82 (0.79–0.85) [40] | 0.89 (0.86–0.91) [43] | 0.56 (0.52–0.59) | 0.61 (0.58–0.65) | 0.64 (0.60–0.68) | 0.71 (0.67–0.75) | 0.62 (0.59–0.66) | 0.70 (0.66–0.73) | 0.59 (0.55–0.62) [50] | 0.64 (0.61–0.68) [50] |
| FBS | WhiteStripe | 0.79 (0.77–0.82) | 0.88 (0.86–0.90) | 0.76 (0.72–0.79) | 0.84 (0.81–0.87) | 0.79 (0.76–0.82) | 0.87 (0.84–0.90) | 0.79 (0.76–0.82) [20] | 0.88 (0.86–0.90) [28] | 0.57 (0.54–0.60) | 0.63 (0.60–0.67) | 0.63 (0.60–0.66) | 0.69 (0.67–0.73) | 0.61 (0.58–0.64) | 0.69 (0.65–0.72) | 0.61 (0.58–0.64) [36] | 0.68 (0.65–0.71) [36] |
| FBS | Z-Score | 0.82 (0.80–0.85) | 0.91 (0.89–0.93) | 0.78 (0.75–0.82) | 0.86 (0.83–0.89) | 0.80 (0.77–0.83) | 0.90 (0.87–0.93) | 0.83 (0.80–0.85) [45] | 0.91 (0.88–0.93) [32] | 0.60 (0.57–0.63) | 0.65 (0.62–0.69) | 0.64 (0.61–0.67) | 0.70 (0.67–0.73) | 0.64 (0.60–0.67) | 0.71 (0.68–0.74) | 0.61 (0.58–0.63) [36] | 0.66 (0.62–0.69) [36] |
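The BAC columns report balanced accuracy, which matters here because the two tumour grade classes in DATASET2 are of unequal size (108 versus 135 patients). A minimal sketch, assuming the usual definition as the mean of per-class recall; the label coding and function name are illustrative assumptions.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; for two classes, (sensitivity + specificity) / 2."""
    recalls = []
    for label in sorted(set(y_true)):
        # Predictions made for the samples that truly belong to this class.
        preds_for_class = [p for t, p in zip(y_true, y_pred) if t == label]
        correct = sum(p == label for p in preds_for_class)
        recalls.append(correct / len(preds_for_class))
    return sum(recalls) / len(recalls)


# Class sizes taken from the abstract: 108 grade II/III and 135 grade IV.
# Assumed coding: 0 = lower grade, 1 = grade IV.
y_true = [0] * 108 + [1] * 135
always_high_grade = [1] * len(y_true)

plain_accuracy = sum(t == p for t, p in zip(y_true, always_high_grade)) / len(y_true)
bac = balanced_accuracy(y_true, always_high_grade)
```

A trivial classifier that always predicts the larger class reaches a plain accuracy of about 0.56 on these class sizes but a balanced accuracy of exactly 0.5, which is why values in the table should be read against a chance level of 0.5.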
Datasets description including MR acquisition parameters.
| Parameters | DATASET1 | DATASET1 | DATASET1 | DATASET1 | DATASET2a | DATASET2a | DATASET2a | DATASET2a |
|---|---|---|---|---|---|---|---|---|
| Sequence | T1w-gd | T1w-gd | T2w-flair | T2w-flair | T1w-gd | T1w-gd | T2w-flair | T2w-flair |
| Cohort | – | – | – | – | LGG | HGG | LGG | HGG |
| Manufacturer model | GE Signa HDxt | GE Discovery MR750 | GE Signa HDxt | GE Discovery MR750 | All patients: Philips AchievaSiemens (17), GE Signa Genesis (52), GE Signa Excite (71), GE Signa HDx (3), GE Signa HDxt (8), Siemens Magnetom Vision (10), Hitachi Oasis (1), Philips Ingenia (6), Philips Intera (6), Philips Intera Achieva (1), Siemens Avanto (9), Siemens Skyra (1), Siemens Symphony (10), Siemens Trio (2), Siemens TrioTim (3), Siemens Verio (5), Undefined (38) | | | |
| Magnetic field strength (T) | 1.5 | 3.0 | 1.5 | 3.0 | 1.16 (N = 1), 1.5 (N = 51), 3.0 (N = 47), undefined (N = 9) | 0.5 (N = 2), 1 (N = 1), 1.5 (N = 82), 3.0 (N = 44), undefined (N = 6) | 1.16 (N = 1), 1.5 (N = 51), 3.0 (N = 47), undefined (N = 9) | 0.5 (N = 2), 1 (N = 1), 1.5 (N = 82), 3.0 (N = 44), undefined (N = 6) |
| TR (ms) | 11 | 10 | 9802 | 8000 | 1106 [6–5500] | 890 [5–3286] | 9686 [6000–11,000] | 9581 [1000–11,000] |
| TE (ms) | 4 | 3 | 157 | 123 | 7 [3–17] | 9 [2–105] | 128 [94–158] | 135 [74–355] |
| Slice thickness (mm) | 1.4 | 1.2 | 5.0 | 3.5 | 2.4 [1.0–5.0] | 3.2 [1.0–6.0] | 3.8 [2.0–5.0] | 4.14 [1.2–6.0] |
| Pixel spacing (mm) | 0.49 × 0.49 | 0.47 × 0.47 | 0.47 × 0.47 | 0.43 × 0.43 | 0.68 × 0.68 [0.39 × 0.39–1.02 × 1.02] | 0.77 × 0.77 [0.43 × 0.43–1.02 × 1.02] | 0.74 × 0.74 [0.39 × 0.39–1.01 × 1.01] | 0.77 × 0.77 [0.43 × 0.43–1.01 × 1.01] |
| Matrix dimensions | 288 × 288 | 320 × 288 | 256 × 192 | 352 × 192 | 303 × 2130 [224 × 134–512 × 300] | 283 × 204 [224 × 134–512 × 300] | 306 × 214 [256 × 112–512 × 256] | 283 × 194 [192 × 98–512 × 320] |
| FOV (mm) | 250 | 240 | 240 | 220 | 244 [200–260] | 235 [200–260] | 237 [200–260] | 228 [200–260] |
| Pixel bandwidth (Hz/px) | 65.12 | 65.12 | 122 | 195 | 166 [81–250] | 162 [61–355] | 153 [61–358] | 170 [61–750] |
| Flip angle (°) | 17 | 15 | 90 | 90 | 53 [8–90] | 70 [8–90] | 100 [90–180] | 102 [90–180] |
TR repetition time, TE echo time, FOV field of view.
a Some metadata are missing (< 10% of all patients). For DATASET2, values are reported as mean [min–max]. The number of patients for each MR system is indicated in parentheses. Additional information about DATASET2 is available in Bakas et al. [34,35].
Figure 4. Design of the study.