
Machine-learning based radiogenomics analysis of MRI features and metagenes in glioblastoma multiforme patients with different survival time.

Xin Liao1, Bo Cai2, Bin Tian1, Yilin Luo1, Wen Song1, Yinglong Li3.   

Abstract

BACKGROUND: This study aimed to examine how well multi‐dimensional MRI features predict survival outcome, and how they are associated with differentially expressed genes (RNA sequencing), in groups of glioblastoma multiforme (GBM) patients.
METHODS: Radiomics features were extracted from segmented lesions in T2‐FLAIR MRI data of 137 GBM patients. Intensity, shape and textural features from seven feature classes were included in the analysis. Patients were divided into two groups according to survival time (shorter or longer than 1 year). Four machine learning algorithms were implemented to construct prediction models. Features with top importance (importance >0.04) were selected to construct the final prediction model with the best‐performing algorithm. The associations between image features and genomics were then analysed with Pearson's correlation analysis.
RESULTS: The GBDT model built on the 72 features with the highest importance had the highest accuracy, 0.81, on both the short and long survival classes, and the areas under the receiver operating characteristic (ROC) curve (AUC) for the short and long survival classes were 0.79 and 0.81, respectively. Six metagenes showed a significant interactive effect (P < 0.05), and Pearson's correlation analysis revealed that three of these metagenes (TIMP1, ROS1, EREG) showed moderate (0.3 < |r| < 0.5) or high (|r| > 0.5) correlation with image features.
CONCLUSION: Radiogenomics analysis shows that MRI features are predictive of survival outcomes, and image features are highly associated with selective metagenes. Radiogenomics analysis is a useful method for optimizing clinical diagnosis and selecting effective treatments.
© 2019 The Authors. Journal of Cellular and Molecular Medicine published by John Wiley & Sons Ltd and Foundation for Cellular and Molecular Medicine.

Keywords: EREG; ROS1; TIMP1; death day to diagnosis; glioblastoma multiforme; machine learning; radiogenomics

Year:  2019        PMID: 31001929      PMCID: PMC6533509          DOI: 10.1111/jcmm.14328

Source DB:  PubMed          Journal:  J Cell Mol Med        ISSN: 1582-1838            Impact factor:   5.310


INTRODUCTION

Glioblastoma multiforme (GBM), one of the most invasive and fatal brain tumours, develops from glial cells and can severely affect the central nervous system and general health [1]. The 5‐year survival rate was estimated to be 33.2% between 2008 and 2014 according to statistics from the Surveillance, Epidemiology and End Results (SEER) database and the Centers for Disease Control and Prevention's National Center for Health Statistics (https://seer.cancer.gov/csr/1975_2015/). Because of the heterogeneous nature of GBM, the relatively high age at disease onset and the migration of malignant cells into surrounding tissue, treatment outcomes for GBM are highly variable, with an average survival time of 12.6 months [2]. Current clinical practice for treating GBM mostly involves tumour resection and chemotherapy [3]. Genomics is an essential approach to studying GBM, examining alterations in genomic pathways and identifying relevant biomarkers. Gene studies of tissues, plasma or cell lines have used protein expression data to reveal that common alterations in GBM include mutations of specific genes and proteins such as RTKs, TP53 and RB1, and increased expression of EGFR and PDGFRA [4, 5]. However, tissue samples are usually acquired through biopsy, which may not be suitable for all patients, especially for early diagnosis. Neuroimaging is a non‐invasive tool for diagnosing GBM and monitoring treatment outcome. A wide range of MR techniques, including T1, T2 and FLAIR imaging, are used to capture GBM characteristics. Typically, GBM appears as a heterogeneously enhancing region with non‐enhancing necrosis in the centre [6]. FLAIR sequences have the advantage of showing abnormalities more clearly [7]. MRI‐based features have been shown to be highly predictive of tumour grading in GBM [8], and textural image features were associated with CD3 T cell infiltration status in GBM [9].
In recent years, the emergence of radiogenomics, which combines radiomics image features with genomics, has allowed GBM to be studied more comprehensively. For example, MRI parameters revealed that haemodynamic abnormalities were associated with the expression level of the mTOR‐EGFR pathway [10]. Building on these findings, we aimed to combine machine learning methods with radiogenomics to study the associations among MRI features, genomics and survival in GBM patients. Computer‐assisted methods allow a more comprehensive characterization of imaging data and a more sophisticated way to predict disease outcome. We hypothesize that radiomics features of FLAIR imaging data are predictive of patients’ survival, and that radiogenomics analysis can reveal links between image features and known genes in previously defined molecular pathways.

MATERIALS AND METHODS

Dataset

MRI data were obtained from The Cancer Imaging Archive (TCIA) (https://wiki.cancerimagingarchive.net/display/Public/TCGA-GBM), and the corresponding genomics data were acquired from the Genomic Data Commons (GDC) Data Portal. A total of 137 patients with MRI data and 129 patients with genomic data were included in the analysis; 46 patients had both MRI and genomic data. Patient characteristics are summarized in Table 1. Because the average survival time of GBM patients was reported to be 12.6 months [2], and all patients in our cohort had died during follow‐up, we used 1 year as the classification threshold and divided the patients into short (<1 year) and long (>1 year) survival groups. Figure 1 shows the workflow of this study.
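Table 1

Clinical characteristics of the cohort. This table shows the clinical information of the data analysed in this study. Gene∩MRI means that the dataset has both genetic data and MRI data

| Dataset | Gender: men | Gender: female | Death days to diagnosis: long (>1 year) | Death days to diagnosis: short (<1 year) | Number: total | Age: mean (range) | Age: SD |
|---|---|---|---|---|---|---|---|
| Gene | 85 | 44 | 68 | 61 | 129 | 62.05 (25‐89) | 12.55 |
| MRI | 85 | 52 | 71 | 66 | 137 | 61.24 (16‐86) | 13.53 |
| Gene∩MRI | 27 | 19 | 25 | 21 | 46 | 61.86 (32‐86) | 12.04 |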
Figure 1

The workflow of this study. Lesions were segmented from untreated MR images. Features were extracted from the lesions with Pyradiomics. Radiomics features were then selected to construct the classifier model, which was evaluated with a confusion matrix and ROC curves


Image preprocessing and lesion segmentation

Lesion segmentation is required before feature extraction. Lesion volumes were manually delineated by an experienced radiologist using 3D Slicer (https://www.slicer.org/). All MRI images were loaded in their original DICOM format. After loading the MRI data into 3D Slicer, we used the Segment Editor module to segment the lesion.

Feature extraction

Feature extraction was performed using the Python package Pyradiomics [11]. First‐order and multi‐dimensional features were extracted from seven feature classes: First Order Features, Shape Features, Gray Level Co‐occurrence Matrix (GLCM) Features, Gray Level Size Zone Matrix (GLSZM) Features, Gray Level Run Length Matrix (GLRLM) Features, Neighboring Gray Tone Difference Matrix (NGTDM) Features and Gray Level Dependence Matrix (GLDM) Features. The number of features in each class is listed in Table S1.
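
To make the textural feature classes more concrete, the following minimal sketch builds a gray level co‐occurrence matrix for a tiny 2D patch and derives two GLCM features that are reported later in this study, Contrast and Inverse Difference (Id). It is an illustration only, not the Pyradiomics implementation: the two 4x4 patches are made up, only one horizontal offset is used, and no gray level discretisation or 3D averaging is performed.

```python
from collections import Counter

def glcm_horizontal(patch):
    """Symmetric, normalised co-occurrence probabilities for a one-pixel horizontal offset."""
    counts = Counter()
    for row in patch:
        for left, right in zip(row, row[1:]):
            counts[(left, right)] += 1
            counts[(right, left)] += 1  # count each pair in both directions
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def glcm_contrast(p):
    """Contrast: sum of p(i, j) * (i - j)^2; large when neighbouring gray levels differ."""
    return sum(prob * (i - j) ** 2 for (i, j), prob in p.items())

def glcm_inverse_difference(p):
    """Inverse Difference: sum of p(i, j) / (1 + |i - j|); large for homogeneous texture."""
    return sum(prob / (1 + abs(i - j)) for (i, j), prob in p.items())

# Hypothetical patches (gray levels 1 to 3), for illustration only
smooth = [[1, 1, 1, 1],
          [1, 1, 1, 1],
          [2, 2, 2, 2],
          [2, 2, 2, 2]]
rough = [[1, 3, 1, 3],
         [3, 1, 3, 1],
         [1, 3, 1, 3],
         [3, 1, 3, 1]]

for name, patch in (("smooth", smooth), ("rough", rough)):
    p = glcm_horizontal(patch)
    print(name, round(glcm_contrast(p), 3), round(glcm_inverse_difference(p), 3))
```

The homogeneous patch gives zero Contrast and an Inverse Difference of 1, while the alternating patch gives a high Contrast and a low Inverse Difference, which is why the two features tend to move in opposite directions.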

Machine learning model construction and evaluation

The MRI dataset was divided into training and testing sets at a ratio of 7:3. Four machine learning algorithms were tested: GBDT (gradient boosting decision tree), logistic regression, support vector machine (SVM) and KNN (k‐nearest neighbours). Each is representative of its own category. GBDT is a tree‐based ensemble model that can achieve state‐of‐the‐art accuracy in classification and regression. Logistic regression is a classic probabilistic model. SVM is another widely used model, characterized by the kernel trick [12]. KNN is a typical lazy‐learning method and is frequently treated as a benchmark in predictive modelling [13]. Feature importance was computed using GBDT (https://doi.org/10.2307/2699986) as implemented in the Python package scikit-learn (https://scikit-learn.org/stable/index.html). For the final prediction model, features with an importance value below 0.04 were treated as unimportant and excluded. This threshold was chosen after manually inspecting the distribution of feature importance. Confusion matrices and receiver operating characteristic (ROC) curves were computed to evaluate and compare the performance of all four machine learning models. The model most predictive of GBM patients’ survival time was chosen for further radiogenomics analysis.
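
The data split and the importance‐based feature selection described above can be sketched as follows. This is a standard‐library illustration under stated assumptions, not the code used in the study: the split is stratified by survival class (the text only gives the 7:3 ratio), the random seed is arbitrary, and the importance values and the feature name `shape-Sphericity` are hypothetical.

```python
import random

def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx), splitting each class separately (assumed stratification)."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

def select_features(importances, threshold=0.04):
    """Keep features whose importance exceeds the threshold, most important first."""
    kept = [(name, value) for name, value in importances.items() if value > threshold]
    return [name for name, _ in sorted(kept, key=lambda item: item[1], reverse=True)]

# Class sizes of the MRI cohort from Table 1: 71 long and 66 short survivors
labels = ["long"] * 71 + ["short"] * 66
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))

# Hypothetical importance values, for illustration only
importances = {"glcm-Contrast": 0.09, "glszm-ZV": 0.05, "shape-Sphericity": 0.01}
print(select_features(importances))
```

With scikit-learn, the same split is usually obtained with `train_test_split(..., test_size=0.3)`; the hand‐written version is shown here only to keep the sketch free of dependencies.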

Relevant gene selection

Differentially expressed gene (DEG) analysis was performed in R using the DESeq2 package. A gene is declared a DEG if the difference or change observed in its read counts or expression is statistically significant. Fold change and the t test are widely used in practice to estimate gene variance [14]. Our screening criteria for DEGs were |log2(fold change)| > 1 and adjusted P < 0.05. The same DEG analysis was applied to the Gene dataset and to the dataset with both MRI and gene data. DEGs are treated as metagenes in our analysis. After screening for DEGs, the number of samples was reduced while individual differences among groups were enhanced. To obtain efficient DEGs, we took the intersection of the DEGs from the Gene dataset and those from the dataset containing both MRI and gene data.
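
The screening rule and the intersection step can be written down in a few lines. The sketch below is only an illustration of the thresholds stated above: the WDR72 and TIMP1 values are taken from the two column blocks of Table 3, `GENE_X` and `GENE_Y` are hypothetical genes added to show rejected cases, and the names `dataset_a` and `dataset_b` are placeholders (which block of Table 3 belongs to which dataset is an assumption not made here).

```python
def is_deg(log2_fold_change, adjusted_p, lfc_cutoff=1.0, alpha=0.05):
    """Screening rule: |log2(fold change)| > 1 and adjusted P < 0.05."""
    return abs(log2_fold_change) > lfc_cutoff and adjusted_p < alpha

def screen(results):
    """Return the set of genes that pass the DEG rule in one analysis."""
    return {gene for gene, (lfc, padj) in results.items() if is_deg(lfc, padj)}

# gene -> (log2 fold change, adjusted P)
dataset_a = {
    "WDR72": (-1.53057, 0.00327),   # Table 3, first block
    "TIMP1": (1.06274, 0.00109),    # Table 3, first block
    "GENE_X": (0.40, 0.20),         # hypothetical: fails both criteria
    "GENE_Y": (1.80, 0.01),         # hypothetical: passes here only
}
dataset_b = {
    "WDR72": (-2.66113, 0.00678),   # Table 3, second block
    "TIMP1": (1.53657, 0.03445),    # Table 3, second block
    "GENE_X": (0.35, 0.60),         # hypothetical
    "GENE_Y": (1.20, 0.30),         # hypothetical: adjusted P too large
}

efficient_degs = sorted(screen(dataset_a) & screen(dataset_b))
print(efficient_degs)
```

Taking the intersection keeps only genes that pass in both analyses, so a gene such as the hypothetical `GENE_Y`, significant in one dataset only, is dropped.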

Correlations between image features and genomics

To survey potential correlations between the important image features of the classification model and the efficient DEGs, we performed Pearson correlation analysis. An absolute Pearson correlation coefficient between 0.3 and 0.5 indicates a moderate correlation, and a value greater than 0.5 indicates a high correlation. We also filtered out weak correlations based on Benjamini‐Hochberg adjusted P‐values [15]. For generalization purposes, we used the Pearson correlation coefficient as the final correlation selection metric.
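
A minimal sketch of the three ingredients named above: the Pearson coefficient, the moderate/high labelling, and the Benjamini‐Hochberg adjustment. It assumes that raw P‐values are already available (computing them requires a t distribution, which is omitted here); the input vectors and P‐values are made up, while the two coefficients passed to `strength` are taken from Table 4.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def strength(r):
    """Label |r| using the cut-offs stated in the text (0.3 and 0.5)."""
    magnitude = abs(r)
    if magnitude > 0.5:
        return "high"
    if magnitude > 0.3:
        return "moderate"
    return "weak"

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from the largest P-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))   # perfectly linear toy data
print(strength(-0.56), strength(0.41))                 # coefficients from Table 4
print([round(p, 3) for p in benjamini_hochberg([0.01, 0.04, 0.03, 0.005])])
```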

Risk stratification of metagenes

To survey the prognostic power of the identified metagenes, we used maximally selected rank statistics [16], implemented in the R package maxstat, to find the optimal cut point for risk stratification based on the expression value of each metagene. We then used the Kaplan‐Meier (KM) estimator to measure patients’ survival rates in the high and low gene expression strata and plotted the results with the R package survminer.
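
The stratify‐then‐estimate step can be illustrated with a small Kaplan‐Meier implementation. This sketch is written in Python for illustration and is not the R workflow used in the study; the survival times, expression values and the cut point `6.0` are hypothetical, and the cut point is simply given rather than found by maximally selected rank statistics.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct time with a death.

    events[i] is 1 if the death was observed and 0 if the patient was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:   # group patients sharing a time
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Hypothetical survival times in days; all deaths observed
times = [100, 200, 200, 400]
events = [1, 1, 1, 1]
expression = [5.0, 9.0, 2.0, 7.0]   # hypothetical expression values
cut_point = 6.0                     # hypothetical cut point

high = [i for i, e in enumerate(expression) if e > cut_point]
low = [i for i, e in enumerate(expression) if e <= cut_point]

print([(t, round(s, 3)) for t, s in kaplan_meier(times, events)])
print(kaplan_meier([times[i] for i in high], [events[i] for i in high]))
print(kaplan_meier([times[i] for i in low], [events[i] for i in low]))
```

Each stratum gets its own curve; comparing the curves (for example with a log‐rank test, not shown) is what decides whether the cut point separates patients with different prognosis.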

RESULTS

Selected radiomics features

Thresholding on feature importance (importance index >0.04) yielded a total of 72 features for constructing the final prediction model. The threshold was chosen after manually inspecting the feature importance distribution (Figure S1A). Table 2 lists the full names and abbreviations of these 72 features.
Table 2

Detailed names and abbreviations of 72 features

| Full name | Short name |
|---|---|
| log‐sigma‐3‐0‐mm‐3D_gldm_SmallDependenceEmphasis | gldm‐SDE |
| wavelet‐HHL_gldm_DependenceNonUniformityNormalized | gldm‐DNUN |
| log‐sigma‐3‐0‐mm‐3D_firstorder_Uniformity | firstorder‐Uniformity |
| wavelet‐HHH_glszm_GrayLevelNonUniformityNormalized | glszm‐GLNUN |
| wavelet‐LLL_glcm_InverseVariance | glcm‐IV |
| wavelet‐LLH_glszm_ZonePercentage | glszm‐ZP |
| wavelet‐LLH_glcm_Idm | glcm‐LLH‐Idm |
| wavelet‐HLH_glcm_InverseVariance | glcm‐HLH‐IV |
| log‐sigma‐4‐0‐mm‐3D_glcm_Idm | glcm‐Idm |
| wavelet‐HLH_glcm_SumSquares | glcm‐HLH‐SS |
| wavelet‐HLH_gldm_GrayLevelVariance | gldm‐HLH‐GLV |
| wavelet‐LLL_glszm_ZoneVariance | glszm‐LLL‐ZV |
| log‐sigma‐4‐0‐mm‐3D_glcm_Id | glcm‐Id |
| wavelet‐HHL_glrlm_RunLengthNonUniformityNormalized | glrlm‐HHL‐RLNUN |
| wavelet‐HLL_glrlm_RunLengthNonUniformityNormalized | glrlm‐HLL‐RLNUN |
| wavelet‐HLH_glszm_SmallAreaEmphasis | glszm‐HLH‐SAE |
| wavelet‐LLL_glcm_Correlation | glcm‐LLL‐Correlation |
| wavelet‐HHH_glszm_SmallAreaEmphasis | glszm‐SAE |
| wavelet‐HHL_glcm_DifferenceAverage | glcm‐DA |
| log‐sigma‐5‐0‐mm‐3D_glcm_Correlation | glcm‐Correlation |
| log‐sigma‐4‐0‐mm‐3D_glrlm_ShortRunEmphasis | glrlm‐SRE |
| original_glrlm_RunLengthNonUniformityNormalized | glrlm‐RLNUN |
| wavelet‐LHL_glcm_Idn | glcm‐LHL‐Idn |
| wavelet‐HLH_glcm_Idn | glcm‐HLH‐Idn |
| wavelet‐HHL_glcm_Idn | glcm‐HHL‐Idn |
| wavelet‐LLL_glrlm_RunLengthNonUniformityNormalized | glrlm‐LLL‐RLNUN |
| wavelet‐LLL_glcm_Imc2 | glcm‐LLL‐Imc2 |
| log‐sigma‐5‐0‐mm‐3D_glcm_Idn | glcm‐Idn |
| log‐sigma‐4‐0‐mm‐3D_glcm_Idmn | glcm‐Idmn |
| wavelet‐HLH_glcm_ClusterProminence | glcm‐HLH‐CP |
| wavelet‐HHL_glcm_DifferenceEntropy | glcm‐HHL‐DE |
| wavelet‐HHH_firstorder_InterquartileRange | firstorder‐HHH‐IR |
| wavelet‐HHL_firstorder_InterquartileRange | firstorder‐HHL‐IR |
| log‐sigma‐3‐0‐mm‐3D_firstorder_Entropy | firstorder‐Entropy |
| wavelet‐LLH_gldm_LargeDependenceEmphasis | gldm‐LLH‐LDE |
| wavelet‐LLH_glcm_DifferenceEntropy | glcm‐LLH‐DE |
| wavelet‐HLH_firstorder_InterquartileRange | firstorder‐HLH‐IR |
| wavelet‐LHL_gldm_LargeDependenceEmphasis | gldm‐LHL‐LDE |
| original_gldm_LargeDependenceEmphasis | gldm‐LDE |
| wavelet‐LHH_glcm_SumEntropy | glcm‐LHH‐SE |
| wavelet‐LHH_glszm_LargeAreaEmphasis | glszm‐LHH‐LAE |
| log‐sigma‐5‐0‐mm‐3D_glcm_SumSquares | glcm‐SS |
| log‐sigma‐2‐0‐mm‐3D_glcm_Contrast | glcm‐Contrast |
| wavelet‐LHH_gldm_LargeDependenceEmphasis | gldm‐LHH‐LDE |
| log‐sigma‐5‐0‐mm‐3D_glrlm_RunEntropy | glrlm‐RE |
| log‐sigma‐5‐0‐mm‐3D_glszm_ZoneVariance | glszm‐ZV |
| wavelet‐HHL_glcm_JointEntropy | glcm‐JointEntropy |
| log‐sigma‐3‐0‐mm‐3D_glszm_LargeAreaEmphasis | glszm‐LAE |
| wavelet‐LLH_gldm_GrayLevelNonUniformity | gldm‐GLNU |
| wavelet‐HLL_glszm_GrayLevelNonUniformity | glszm‐GLNU |
| log‐sigma‐5‐0‐mm‐3D_glszm_ZoneEntropy | glszm‐ZE |
| wavelet‐HLL_gldm_GrayLevelNonUniformity | gldm‐HLL‐GLNU |
| wavelet‐LLL_glcm_SumEntropy | glcm‐LLL‐SE |
| wavelet‐HHH_glrlm_HighGrayLevelRunEmphasis | glrlm‐HHH‐HGLRE |
| wavelet‐HHH_firstorder_Maximum | firstorder‐Max |
| wavelet‐LLH_gldm_GrayLevelVariance | gldm‐LLH‐GLV |
| wavelet‐LLH_glcm_SumSquares | glcm‐LLH‐SS |
| original_firstorder_MeanAbsoluteDeviation | firstorder‐MAD |
| wavelet‐LLH_glcm_JointAverage | glcm‐JointAverage |
| wavelet‐LLH_glrlm_GrayLevelVariance | glrlm‐GLV |
| log‐sigma‐2‐0‐mm‐3D_gldm_DependenceNonUniformity | gldm‐DNU |
| log‐sigma‐5‐0‐mm‐3D_glrlm_HighGrayLevelRunEmphasis | glrlm‐HGLRE |
| wavelet‐HLL_firstorder_Variance | firstorder‐Variance |
| wavelet‐LHL_firstorder_RootMeanSquared | firstorder‐RMS |
| original_glrlm_ShortRunHighGrayLevelEmphasis | glrlm‐SRHGLE |
| original_glszm_SmallAreaHighGrayLevelEmphasis | glszm‐SAHGLE |
| log‐sigma‐2‐0‐mm‐3D_glcm_ClusterProminence | glcm‐CP |
| original_glszm_HighGrayLevelZoneEmphasis | glszm‐HGLZE |
| wavelet‐LLL_gldm_SmallDependenceHighGrayLevelEmphasis | gldm‐SDHGLE |
| original_glszm_LargeAreaHighGrayLevelEmphasis | glszm‐LAHGLE |
| log‐sigma‐2‐0‐mm‐3D_gldm_LargeDependenceHighGrayLevelEmphasis | gldm‐LDHGLE |
| wavelet‐LLL_gldm_LargeDependenceHighGrayLevelEmphasis | gldm‐LLL‐LDHGLE |

Model comparison

Among the four machine learning algorithms, GBDT had the highest accuracy, 0.81, for discriminating patients with short or long survival in the testing set, while the accuracies of logistic regression, SVM and KNN were 0.69, 0.76 and 0.79, respectively. Figure 2 shows the performance of the GBDT classifier. Figure 2A is the confusion matrix showing the proportion of correct and incorrect predictions in each survival class. Figure 2B shows the ROC curves for predicting patients with short and long survival, yielding an AUC of 0.79 for the short‐survival class and 0.81 for the long‐survival class.
Figure 2

The performance of the GBDT classifier. A, Confusion matrix (the horizontal axis gives the number predicted in each group; the vertical axis gives the actual number in each group. The leading diagonal represents correct predictions; the minor diagonal represents incorrect predictions). B, Receiver operating characteristic (ROC) curve (the X axis is the false positive rate and the Y axis is the true positive rate)
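
The evaluation quantities used in this comparison (confusion matrix, accuracy and AUC) can be computed as in the sketch below. It is an illustration only: the six labels and scores are hypothetical, the 0.5 decision threshold is an assumption, and the AUC is computed through its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case), which equals the area under the ROC curve.

```python
def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    index = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Share of cases on the leading diagonal."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / sum(map(sum, matrix))

def roc_auc(actual, scores, positive):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for a, s in zip(actual, scores) if a == positive]
    neg = [s for a, s in zip(actual, scores) if a != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical test cases: true class and predicted probability of "short"
actual = ["short", "short", "short", "long", "long", "long"]
scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]
predicted = ["short" if s >= 0.5 else "long" for s in scores]  # assumed threshold

matrix = confusion_matrix(actual, predicted, ["short", "long"])
print(matrix)
print(round(accuracy(matrix), 3), round(roc_auc(actual, scores, "short"), 3))
```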


Metagenes selection

Six metagenes (WDR72, C14orf39, TIMP1, CHIT1, ROS1 and EREG) were found to have significantly different expression levels between patients with short and long survival time (Figure 3). The results of the differential expression analysis of these six genes between the long and short groups are shown in Table 3.
Figure 3

Expression of the six genes. The distribution of the expression levels of the six genes among patients with short vs. long survival time. The expression levels of all six genes differed significantly between the two survival classes

Table 3

Intersection of difference analysis between group long and short. Threshold of difference analysis: adjusted P < 0.05 and |log2FoldChange| > 1

| mRNA | Base mean | log2FC | P value | P.adj | Base mean | log2FC | P value | P.adj |
|---|---|---|---|---|---|---|---|---|
| WDR72 | 22.54202 | −1.53057 | 0.00001 | 0.00327 | 22.10411 | −2.66113 | <0.00001 | 0.00678 |
| C14orf39 | 9.74465 | −1.03545 | 0.00051 | 0.04247 | 13.31277 | −2.21750 | 0.00002 | 0.02407 |
| TIMP1 | 25087.60318 | 1.06274 | <0.00001 | 0.00109 | 28495.57870 | 1.53657 | 0.00004 | 0.03445 |
| CHIT1 | 345.04844 | 1.40483 | <0.00001 | 0.00109 | 329.45162 | 2.05879 | 0.00001 | 0.02339 |
| ROS1 | 16.00196 | 1.42552 | <0.00001 | 0.00178 | 21.31838 | 2.24119 | 0.00003 | 0.03445 |
| EREG | 57.47073 | 2.63671 | <0.00001 | <0.00001 | 69.45137 | 2.75592 | 0.00002 | 0.00678 |

FC: fold change; P.adj: adjusted P value.


Relationship between genes and image features

Figure 4A is the matrix showing the correlations between top image features and metagenes. A threshold of 0.4 was applied to filter out features that had weak correlations with the corresponding metagenes (Figure 4B). A total of nine image features (eight textural features and one intensity‐based feature) were correlated above this threshold with three metagenes (TIMP1, ROS1, EREG). EREG is positively associated with Dependence Non‐Uniformity Normalized (gldm‐DNUN), Difference Average (glcm‐DA), Contrast (glcm‐Contrast) and Cluster Prominence (glcm‐CP), and negatively associated with Inverse Difference (glcm‐Id), Zone Variance (glszm‐ZV), Large Area Emphasis (glszm‐LAE) and Root Mean Squared (firstorder‐RMS). ROS1 is negatively associated with Inverse Difference Moment (glcm‐LLH‐Idm). TIMP1 is positively associated with Contrast (glcm‐Contrast) and Cluster Prominence (glcm‐CP), and negatively associated with Inverse Difference (glcm‐Id), Zone Variance (glszm‐ZV) and Large Area Emphasis (glszm‐LAE). Correlation thresholding based on Benjamini‐Hochberg adjusted P‐values is shown in Figure S1B. The correlations between image features and metagenes are shown in Figure 5.
Figure 4

Correlation between genes and image features. A, The matrix showing the correlations between top image features and genes. B, The correlations between top image features and genes after a threshold of 0.4 was applied to filter out features that had weak correlations with the corresponding genes

Figure 5

Correlation between three genes and nine image features. The solid line represents a positive correlation, and the dotted line represents a negative correlation


DISCUSSION

Associations between image features and survival outcome

Our results indicate that prediction models using radiomics features can discriminate patients with under or over 1‐year survival time, suggesting that MR image features are predictive of survival outcome in GBM. Textural features such as large dependence emphasis and entropy are especially indicative of clinical outcome. Similarly, Gutman et al. showed that contrast‐enhanced tumour volume was strongly correlated with poor survival [17]. Lao et al. used a deep learning method to correlate radiomics features with survival in GBM [18]. Our study provides additional evidence for using computer‐assisted learning methods to examine the relevant information contained in image features. Compared with conventional manual analysis, radiomics analysis can provide more efficient and unbiased quantification.

Differentially expressed genes in different survival groups

We identified six genes (WDR72, C14orf39, TIMP1, CHIT1, ROS1 and EREG) with significantly different levels of expression between the short and long survival groups. To reveal the relationship between the expression levels of the six genes and patient prognosis, a survival analysis was performed. We used the Kaplan‐Meier (KM) estimator to measure patients’ survival rates in the high and low gene expression groups [19]. Figure 6 shows the KM survival curves for the six genes. The KM survival curves showed significant differences in overall survival between patients with high and low expression levels of the six genes. The association between the expression levels of the six genes and patient survival was significant (P < 0.05). The C‐indices of the six genes (WDR72, C14orf39, TIMP1, CHIT1, ROS1 and EREG) are 0.59, 0.55, 0.47, 0.46, 0.55 and 0.45, respectively. EGFR has long been identified as an important therapeutic target for the treatment of GBM, and elevated levels of EREG expression have been found in patients with low overall survival time [20]. EREG can initiate the signalling cascade, and in gastric cancer EREG is up‐regulated [21]. Previous studies have shown that EGFR ligands such as epiregulin (EREG) have a receptor‐stabilizing effect that influences differentiation‐associated functions in breast cancer cells [22]. Altered TIMP‐1 expression has been identified as a biomarker in GBM, with decreased TIMP‐1 linked to longer survival [23]. ROS1, which belongs to a subfamily of tyrosine kinase insulin receptor genes, is a proto‐oncogene that is highly expressed in a variety of tumour cells. This gene is often altered in lung cancer, but its effects on the progression of GBM remain to be elucidated [24].
Figure 6

The Kaplan‐Meier survival curves of the six genes. The KM survival curves show significant differences in overall survival between patients with higher and lower expression levels. In all subplots, ‘group 1’, coloured yellow, is the higher‐expression group at the optimal cut point identified by maximally selected rank statistics
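
For readers unfamiliar with the C‐index reported above, the sketch below shows one common definition, Harrell's concordance index, for fully observed survival times. How the C‐index was computed in this study is not stated, so the choice of Harrell's definition, the use of a single score per patient as the risk measure, and the toy survival times are all assumptions.

```python
def concordance_index(times, risk_scores):
    """Harrell's C for uncensored data: share of comparable pairs ordered correctly.

    A pair is concordant when the patient who died earlier has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(i + 1, n):
            if times[i] == times[j]:
                continue                      # tied survival times are not comparable
            comparable += 1
            shorter, longer = (i, j) if times[i] < times[j] else (j, i)
            if risk_scores[shorter] > risk_scores[longer]:
                concordant += 1
            elif risk_scores[shorter] == risk_scores[longer]:
                concordant += 0.5             # tied scores count as half
    return concordant / comparable

# Hypothetical survival times in days and risk scores
times = [100, 200, 300, 400]
print(concordance_index(times, [4, 3, 2, 1]))   # perfectly ordered
print(concordance_index(times, [1, 2, 3, 4]))   # perfectly reversed
print(round(concordance_index(times, [4, 2, 3, 1]), 3))
```

Under this definition 0.5 corresponds to random ordering, and values below 0.5 arise when the score is ordered inversely to risk.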


Associations between image features and genes

Associating genes and microRNAs with high FLAIR volumes enables researchers to screen for molecular cancer subtypes and the genomic correlates of cellular invasion [25]. We found that TIMP‐1 and EREG showed similar correlations with textural features (Table 4). Similar to our finding about EREG, Hu et al. indicated that six genes including EGFR were significantly correlated with imaging features in GBM [26]. Grossmann et al. showed that volumetric image features were associated with homoeostasis and cell cycling pathways, concluding that oedema in FLAIR images was most predictive of GBM subtypes and overall survival [27]. Other relevant genes, such as POSTN, were found to play important roles in the regulatory pathways through radiogenomics analysis [25].
Table 4

Associations between image features and metagenes. This table shows the associations between nine image features and three metagenes, and the last column is the value of the Pearson correlation coefficient

| Efficient DEGs | Important image features | PCC |
|---|---|---|
| EREG | wavelet‐HHL_gldm_DependenceNonUniformityNormalized | 0.41 |
| EREG | log‐sigma‐4‐0‐mm‐3D_glcm_Id | −0.46 |
| EREG | wavelet‐HHL_glcm_DifferenceAverage | 0.42 |
| EREG | log‐sigma‐2‐0‐mm‐3D_glcm_Contrast | 0.49 |
| EREG | log‐sigma‐5‐0‐mm‐3D_glszm_ZoneVariance | −0.56 |
| EREG | log‐sigma‐3‐0‐mm‐3D_glszm_LargeAreaEmphasis | −0.51 |
| EREG | wavelet‐LHL_firstorder_RootMeanSquared | −0.41 |
| EREG | log‐sigma‐2‐0‐mm‐3D_glcm_ClusterProminence | 0.46 |
| TIMP1 | log‐sigma‐4‐0‐mm‐3D_glcm_Id | −0.43 |
| TIMP1 | log‐sigma‐2‐0‐mm‐3D_glcm_Contrast | 0.42 |
| TIMP1 | log‐sigma‐5‐0‐mm‐3D_glszm_ZoneVariance | −0.47 |
| TIMP1 | log‐sigma‐3‐0‐mm‐3D_glszm_LargeAreaEmphasis | −0.49 |
| TIMP1 | log‐sigma‐2‐0‐mm‐3D_glcm_ClusterProminence | 0.43 |
| ROS1 | wavelet‐LLH_glcm_Idm | −0.40 |

DEG: differentially expressed genes; PCC: Pearson correlation coefficient.


Limitations and suggestions

In this study, we used MRI data from 137 patients to identify radiomics features, but only a subset of them (46) also had genomics data. For future analysis, a larger sample of patients with both imaging and genomics data may help detect more correlated genes. In addition to FLAIR data, other sequences and imaging modalities can be combined for multimodal analysis, which would allow comparison across methods. We selected 72 features to construct the prediction model; more advanced dimensionality reduction methods could be implemented to potentially improve classification performance. Our study validates radiogenomics analysis as a method for studying the correlations among gene variables, imaging features and survival outcome in GBM. Our findings provide useful information for further examination of the corresponding genes, which may potentially serve as biomarkers for GBM diagnosis and as treatment indicators.

CONFLICT OF INTEREST

The authors declare that they have no conflict of interest with the contents of this article.

AUTHOR CONTRIBUTION

XL performed the research and wrote the paper; BC designed the research study and wrote the paper; YL (Luo) and WS analysed the data; YL (Li) contributed essential reagents and analysed the data.

REFERENCES (10 of 28 listed)

1.  ROS1 Fusions Rarely Overlap with Other Oncogenic Drivers in Non-Small Cell Lung Cancer.

Authors:  Jessica J Lin; Lauren L Ritterhouse; Siraj M Ali; Mark Bailey; Alexa B Schrock; Justin F Gainor; Lorin A Ferris; Mari Mino-Kenudson; Vincent A Miller; Anthony J Iafrate; Jochen K Lennerz; Alice T Shaw
Journal:  J Thorac Oncol       Date:  2017-01-11       Impact factor: 15.609

2.  Generalized maximally selected statistics.

Authors:  Torsten Hothorn; Achim Zeileis
Journal:  Biometrics       Date:  2008-03-05       Impact factor: 2.571

3.  Introduction to machine learning: k-nearest neighbors.

Authors:  Zhongheng Zhang
Journal:  Ann Transl Med       Date:  2016-06

4.  Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1.

Authors:  Roel G W Verhaak; Katherine A Hoadley; Elizabeth Purdom; Victoria Wang; Yuan Qi; Matthew D Wilkerson; C Ryan Miller; Li Ding; Todd Golub; Jill P Mesirov; Gabriele Alexe; Michael Lawrence; Michael O'Kelly; Pablo Tamayo; Barbara A Weir; Stacey Gabriel; Wendy Winckler; Supriya Gupta; Lakshmi Jakkula; Heidi S Feiler; J Graeme Hodgson; C David James; Jann N Sarkaria; Cameron Brennan; Ari Kahn; Paul T Spellman; Richard K Wilson; Terence P Speed; Joe W Gray; Matthew Meyerson; Gad Getz; Charles M Perou; D Neil Hayes
Journal:  Cancer Cell       Date:  2010-01-19       Impact factor: 31.743

5.  Radiogenomics to characterize regional genetic heterogeneity in glioblastoma.

Authors:  Leland S Hu; Shuluo Ning; Jennifer M Eschbacher; Leslie C Baxter; Nathan Gaw; Sara Ranjbar; Jonathan Plasencia; Amylou C Dueck; Sen Peng; Kris A Smith; Peter Nakaji; John P Karis; C Chad Quarles; Teresa Wu; Joseph C Loftus; Robert B Jenkins; Hugues Sicotte; Thomas M Kollmeyer; Brian P O'Neill; William Elmquist; Joseph M Hoxworth; David Frakes; Jann Sarkaria; Kristin R Swanson; Nhan L Tran; Jing Li; J Ross Mitchell
Journal:  Neuro Oncol       Date:  2016-08-08       Impact factor: 12.300

6.  MR imaging predictors of molecular profile and survival: multi-institutional study of the TCGA glioblastoma data set.

Authors:  David A Gutman; Lee A D Cooper; Scott N Hwang; Chad A Holder; Jingjing Gao; Tarun D Aurora; William D Dunn; Lisa Scarpace; Tom Mikkelsen; Rajan Jain; Max Wintermark; Manal Jilwan; Prashant Raghavan; Erich Huang; Robert J Clifford; Pattanasak Mongkolwat; Vladimir Kleper; John Freymann; Justin Kirby; Pascal O Zinn; Carlos S Moreno; Carl Jaffe; Rivka Colen; Daniel L Rubin; Joel Saltz; Adam Flanders; Daniel J Brat
Journal:  Radiology       Date:  2013-02-07       Impact factor: 11.105

7.  Tumor image-derived texture features are associated with CD3 T-cell infiltration status in glioblastoma.

Authors:  Shivali Narang; Donnie Kim; Sathvik Aithala; Amy B Heimberger; Salmaan Ahmed; Dinesh Rao; Ganesh Rao; Arvind Rao
Journal:  Oncotarget       Date:  2017-09-05

8.  Identify clear cell renal cell carcinoma related genes by gene network.

Authors:  Fangrong Yan; Yue Wang; Chunhui Liu; Huiling Zhao; Liya Zhang; Xiaofan Lu; Chen Chen; Yaoyan Wang; Tao Lu; Fei Wang
Journal:  Oncotarget       Date:  2017-11-30

9.  The 2007 WHO classification of tumours of the central nervous system (review).

Authors:  David N Louis; Hiroko Ohgaki; Otmar D Wiestler; Webster K Cavenee; Peter C Burger; Anne Jouvet; Bernd W Scheithauer; Paul Kleihues
Journal:  Acta Neuropathol       Date:  2007-07-06       Impact factor: 17.088

10.  Expression of the EGF family in gastric cancer: downregulation of HER4 and its activating ligand NRG4.

Authors:  Trine Ostergaard Nielsen; Lennart Friis-Hansen; Steen Seier Poulsen; Birgitte Federspiel; Boe Sandahl Sorensen
Journal:  PLoS One       Date:  2014-04-11       Impact factor: 3.240
