Yun Liu1,2, Fu Liu2, Xintong Hu1, Jiaxue He1, Yanfang Jiang1. 1. Key Laboratory of Organ Regeneration & Transplantation of the Ministry of Education, Genetic Diagnosis Center, The First Hospital of Jilin University, Changchun, China. 2. College of Communication Engineering, Jilin University, Changchun, China.
Abstract
MOTIVATION: Although several prognostic signatures for lung adenocarcinoma (LUAD) have been developed, they are mainly based on a single-omics data set. This article aims to develop a novel set of prognostic signatures by combining genetic mutation and expression profiles of LUAD patients. METHODS: The genetic mutation and expression profiles, together with the clinical profiles of a cohort of LUAD patients from The Cancer Genome Atlas (TCGA), were downloaded. Patients were separated into 2 groups, namely, the high-risk and low-risk groups, according to their overall survivals. Then, differential analysis was performed to determine differentially expressed genes (DEGs) and mutated genes (DMGs) in the expression and mutation profiles, respectively, between the 2 groups. Finally, a prognostic model based on the support vector machine (SVM) algorithm was developed by combining the expression values of the DEGs and the mutation times of the DMGs. RESULTS: A total of 13 DEGs and 7 DMGs were recognized between the 2 groups. Their prognostic values were validated using independent cohorts. Compared with several existing signatures, the proposed prognostic signatures exhibited better prediction performance in the testing set. In addition, it is found that 1 of the 7 DMGs, GRIN2B, is mutated much more frequently in the high-risk group, showing a potential value as a therapy target. CONCLUSIONS: Combining multi-omics data sets is an applicable manner to identify novel prognostic signatures and to improve the prognostic prediction for LUAD, which will be heuristic to other types of cancers.
MOTIVATION: Although several prognostic signatures for lung adenocarcinoma (LUAD) have been developed, they are mainly based on a single-omics data set. This article aims to develop a novel set of prognostic signatures by combining genetic mutation and expression profiles of LUAD patients. METHODS: The genetic mutation and expression profiles, together with the clinical profiles of a cohort of LUAD patients from The Cancer Genome Atlas (TCGA), were downloaded. Patients were separated into 2 groups, namely, the high-risk and low-risk groups, according to their overall survivals. Then, differential analysis was performed to determine differentially expressed genes (DEGs) and mutated genes (DMGs) in the expression and mutation profiles, respectively, between the 2 groups. Finally, a prognostic model based on the support vector machine (SVM) algorithm was developed by combining the expression values of the DEGs and the mutation times of the DMGs. RESULTS: A total of 13 DEGs and 7 DMGs were recognized between the 2 groups. Their prognostic values were validated using independent cohorts. Compared with several existing signatures, the proposed prognostic signatures exhibited better prediction performance in the testing set. In addition, it is found that 1 of the 7 DMGs, GRIN2B, is mutated much more frequently in the high-risk group, showing a potential value as a therapy target. CONCLUSIONS: Combining multi-omics data sets is an applicable manner to identify novel prognostic signatures and to improve the prognostic prediction for LUAD, which will be heuristic to other types of cancers.
In China, about 3.9 million new cancer cases were reported in 2015,
of which lung cancer ranks the first place and accounts for nearly 20% of
total cases. Lung adenocarcinoma (LUAD), a subtype of non-small cell lung carcinoma
(NSCLC), accounts for 40% of all lung cancers.[2,3] LUAD patients are usually
diagnosed at a relatively late stage and suffer poor survivals.
Therefore, it is of great value to further improve the long-term survival
rate of LUAD,[5,6] which can be
achieved by developing individual therapy based on prognostic signatures.There have been several efforts to identify prognostic signatures for lung cancer in
the genomics era since 2002.[7-13] In recent years, Shukla et al
developed a 4-gene signature set based on the univariate Cox analysis on an
LUAD cohort from The Cancer Genome Atlas (TCGA). A 20-gene-based signature set was
identified from differentially expressed genes in LUAD compared with adjacent normal
lung tissues in Zhao et al.
Chen et al
constructed a multistep bioinformatics analysis pipeline and identified 27
genes that are significantly related to overall survival in LUAD patients. Songyang
et al
identified a set of robust prognostic signatures containing 25 genes by the
meta-analysis-based Cox analysis on 10 gene expression data sets. However, these
studies are all based on single-omics data set, namely, genetic expression data set.
As the multi-omics data sets of lung cancer are available in TCGA, it is possible to
explore prognostic signatures by integrating different types of omics data sets. In
some applications of machine learning, combining different types of features will
result in better prediction performance.The patterns of the somatic mutations in NSCLC have been extensively studied to
reveal mutation characteristics from different aspects, such as the distinct genetic
mutations in LUAD and other subtypes of NSCLC,
in different races with LUAD,[20,21] in younger patients compared
with elderly,[22,23] and in never-smoking patients.[24-26] Thus, it is necessary to
identify prognostic signatures from the mutation profiles of LUAD patients. Till to
now, there have been several studies that try to combine genetic expression and
mutation profiles to improve the outcome prediction of some diseases, including
myelodysplastic syndromes.
To our knowledge, the study by Song et al
is the first trail of survival prediction of LUAD by integrating genetic
mutation and expression profiles. It has been verified that the predictive accuracy
was improved by the contributions of genetic mutations.In this article, a novel set of prognostic signatures of LUAD was identified by
integrating genetic mutation and expression profiles. An LUAD cohort from TCGA was
downloaded and used to identify prognostic signatures. The patients of the cohort
were separated into the high-risk and low-risk groups according to their overall
survivals. Differential analysis between the 2 groups produced 20 prognostic genes,
including 13 differentially expressed genes (DEGs) and 7 differentially mutated
genes (DMGs). A prognostic model based on the support vector machine (SVM) algorithm
was then built by combining the expression values of the DEGs and the mutation times
of the DMGs. When training the prognostic model, the 10-fold cross-validation
strategy was used to find the optimal hyper-parameters. The validation results in
the testing set have showed that the identified prognostic signatures are effective
for the stratification of LUAD samples, and the prediction performance has been
improved by the contributions of the DMGs. The main contribution of this article is
the identification of DMGs between the high-risk and low-risk groups, and the
combination of the features of the DEGs and DMGs, which can be applied to the
survival prediction of other types of cancers.
Materials and Methods
The pipeline of the proposed method is showed in Figure 1. First, 272 samples of the TCGA LUAD
cohort were selected according to the overall survival and were partitioned into the
training set and testing set. Then, DEGs and DMGs were determined from the training
set, and the features of them were combined to train a prognostic model based on the
SVM algorithm. The 10-fold cross validation was used to find the best
hyper-parameters. Finally, this model was used to stratify the samples in the
testing set to evaluate its performance.
Figure 1.
Pipeline of the proposed method.
DEG indicates differentially expressed genes; DMG, differentially mutated
genes; LUAD, lung adenocarcinoma; SVM, support vector machine; TCGA, The
Cancer Genome Atlas.
Pipeline of the proposed method.DEG indicates differentially expressed genes; DMG, differentially mutated
genes; LUAD, lung adenocarcinoma; SVM, support vector machine; TCGA, The
Cancer Genome Atlas.
Data collection and grouping
A cohort of LUAD from TCGA was used in this article. The genetic mutation and
expression profiles, and their corresponding clinical profiles, were downloaded
on October 2019, including 522 samples.
Table 1 lists the
clinical information of them. The overall survival is the only considered factor
to group these samples into 3 subgroups. A total of 137 samples were partitioned
into the low-risk group as their overall survivals are larger than 36 months.
For the high-risk group, a more rigorous standard was used, and a sample was
determined to be high risk if its vital status is dead and its overall survival
is less than 36 months; 135 patients were grouped into the high-risk group. The
remaining samples were classified into the unknown group as the days to last
follow-up are less than 36 months and the vital status was alive; their exact
overall survivals cannot be determined. Totally, 272 of 522 samples were
selected and randomly separated into the training set (75%) and testing set
(25%). The training set contains 200 samples, comprising 100 high-risk and 100
low-risk samples, respectively. The remaining samples were used as the testing
set. The training set was used to identify the prognostic signatures and to
build the prognostic model, and the testing set was used to evaluate them.
Table 1.
Clinical information of 522 LUAD samples from TCGA.
Statistics
N
Sex
Male
242
Female
280
Stage
I
279
II
124
III
85
IV
26
Not available
8
Vital status
Alive
334
Dead
188
Overall survival
>36 months
137
⩽36 months
135
Unknown
242
Not available
8
Abbreviations: LUAD, lung adenocarcinoma; TCGA, The Cancer Genome
Atlas.
Clinical information of 522 LUAD samples from TCGA.Abbreviations: LUAD, lung adenocarcinoma; TCGA, The Cancer Genome
Atlas.
Identification of differentially expressed genes
GDCRNATools
was used to identify the DEGs in the high-risk samples compared with the
low-risk samples. The profiles of message RNA (mRNA) were used for this
analysis. The gdcDEAnalysis function of GDCRNATools with the limma method
selected was used to determine the DEGs. The criteria are false discovery rate
(FDR)-adjusted P value <.05 and the absolute value of
log2-based fold change >1.
Identification of differentially mutated genes
The MafCompare function in Maftools package
was used to detect the DMGs between the high-risk and low-risk samples.
The parameter “minMut” was set to be 10, meaning that the number of samples with
a DMG mutated in one group must be at least 10 more than that in another
group.
External validation of the DEGs and DMGs
The prognostic values of the DEGs and DMGs were validated by Kaplan-Meier (KM) plot
and International Cancer Genome Consortium (ICGA) Data Portal (https://dcc.icgc.org/), respectively. When using KM plot, the
Jetset was used to select the optimal probe set for each prognostic gene.
Prognostic prediction model
First, a feature matrix was established by combining the features of the DEGs and
DMGs. The expression values of the DEGs and the mutation times of the DMGs of
all the samples in the training set were integrated as a combined feature
matrix. The risk of a sample was used as its label.Then, a prognostic model based on the SVM algorithm was built; the principle of
the SVM algorithm can be found in Maldonado et al.
The feature matrix and the labels of the training set were inputted into
the SVM model. The 10-fold cross validation was used to determine the optimal
hyper-parameters. In this article, the e1071 package in R was used to build and
train this model.
Results
Differentially expressed genes
By performing the gdcDEAnalysis function with defined criteria in the GDCRNATools package,
13 DEGs were identified and are listed in Table 2 and Supplementary File 1. Among these genes,
FAM83A-AS1 and AC005077.4 belong to long
noncoding RNA and pseudogene, respectively, and the rest are protein-coding
genes, as depicted in Figure
2A. Three DEGs, SFTA3, KLRG2, and
BMP5, are downregulated in the high-risk samples compared
with the low-risk samples, while other 10 genes are all upregulated, as depicted
in Figure 2B.
Table 2.
Information of identified DEGs (sorted by the value of logFC).
Information of identified DEGs (sorted by the value of logFC).Abbreviations: DEG, differentially expressed genes; FDR, false
discovery rate.(A) Bar plot of identified DEGs and (B) volcanic plot of identified DEGs,
in which FDR represents FDR-adjusted P values.DEG indicates differentially expressed genes; FDR, false discovery
rate.
Differentially mutated genes
By performing the MafCompare function in the Maftools package,
7 DMGs were identified and are listed in Supplementary File 1. Three samples in the training set do not
contain genetic mutation profiles, so 0197 samples were included in this
analysis. Figures 3 and
4 depict the forest
and co-onco plots of the DMGs. Six DMGs are mutated more frequently in the
low-risk samples. One gene, GRIN2B, is mutated in 19% of the
high-risk samples, while only 6% of the low-risk samples have mutations of this
gene. The locations of the somatic mutations of the DMGs are showed in Supplemental Figure S1 to S7.
Figure 3.
Forest plot of 7 DMGs.
DMG indicates differentially mutated genes.
**P < .01. ***P < .001.
Figure 4.
Co-oncoplot of 7 DMGs.
DMG indicates differentially mutated genes.
Forest plot of 7 DMGs.DMG indicates differentially mutated genes.**P < .01. ***P < .001.Co-oncoplot of 7 DMGs.DMG indicates differentially mutated genes.The expression levels of the DMGs between patients in the high-risk and low-risk
groups were then analyzed, as showed in Figure 5. Based on the P
values of the DMGs, there are not significant differences between the 2 groups.
However, it is found that the mutations of DMXL1, FAT2, GRIN2B,
and THSD7A may impact their mRNA levels because their
P values are smaller than that of other genes. Compared
with the expression levels of the high-risk samples, DMXL1 and
FAT2 are downregulated in the low-risk samples with them
mutated, while THSD7A is upregulated.
Figure 5.
Expression levels of 7 DMGs of the high-risk and the low-risk samples:
(A) DMXL1, (B) CHD6, (C)
THSD7A, (D) FAT2, (E)
SPATA31A6, (F) GRIN2B, and (G)
ADGRL3.
DMG indicates differentially mutated genes.
Expression levels of 7 DMGs of the high-risk and the low-risk samples:
(A) DMXL1, (B) CHD6, (C)
THSD7A, (D) FAT2, (E)
SPATA31A6, (F) GRIN2B, and (G)
ADGRL3.DMG indicates differentially mutated genes.Figures 6 and 7 show the results of the
univariate Cox regression analysis of the DEGs and DMGs. Three DEGs and one DMG
are not found in KM plot and ICGA Data Portal. In Figure 6, it is found that 9 DEGs, except
DENR and LARC, are significantly related
to the overall survival of LUAD samples; the relationships between their
expression levels and the survival rate of LUAD samples are consistent with the
results of this article. The validation results of DNER and
LARC are not so promising. SFTA3 and
BMP5 could be tumor suppressor genes because high
expression levels of them relate to better survival, while others could be
oncogenes.
Figure 6.
KM curves of 10 DEGs on 1926 LUAD patients, who were separated into the
high-expression and low-expression groups. SFTA3 and
BMP5 could be tumor suppressor genes because higher
expression levels of them relate to better survival, while others could
be oncogenes: (A) SFTA3, (B) BMP5, (C)
FAM83A, (D) TFAP2A, (E)
PKP2, (F) CCL20, (G)
RHOV, (H) DNER, (I)
TNS4, and (J) ABCC2.
DEG indicates differentially expressed genes; KM, Kaplan-Meier; LUAD,
lung adenocarcinoma.
Figure 7.
KM curves of 6 DMGs on 195 LUAD patients, who were separated into the
mutated and not mutated groups. GRIN2B is significantly
associated with the survival rate of LUAD patients and it could be a
tumor suppressor as the risk of patients with it mutated is higher: (A)
DMXL1, (B) CHD6, (C)
THSD7A, (D) FAT2, (E)
SPATA31A6, and (F) GRIN2B.
DMG indicates differentially mutated genes; KM, Kaplan-Meier; LUAD, lung
adenocarcinoma.
KM curves of 10 DEGs on 1926 LUAD patients, who were separated into the
high-expression and low-expression groups. SFTA3 and
BMP5 could be tumor suppressor genes because higher
expression levels of them relate to better survival, while others could
be oncogenes: (A) SFTA3, (B) BMP5, (C)
FAM83A, (D) TFAP2A, (E)
PKP2, (F) CCL20, (G)
RHOV, (H) DNER, (I)
TNS4, and (J) ABCC2.DEG indicates differentially expressed genes; KM, Kaplan-Meier; LUAD,
lung adenocarcinoma.KM curves of 6 DMGs on 195 LUAD patients, who were separated into the
mutated and not mutated groups. GRIN2B is significantly
associated with the survival rate of LUAD patients and it could be a
tumor suppressor as the risk of patients with it mutated is higher: (A)
DMXL1, (B) CHD6, (C)
THSD7A, (D) FAT2, (E)
SPATA31A6, and (F) GRIN2B.DMG indicates differentially mutated genes; KM, Kaplan-Meier; LUAD, lung
adenocarcinoma.In Figure 7, it can be
found that the mutations of GRIN2B are significantly associated
with the survival rate of LUAD samples, which is consistent with the results of
this article. The patients with mutations of GRIN2B show worse
outcomes compared with these who do not have mutations of
GRIN2B. Therefore, GRIN2B could be a tumor
suppressor gene as the risk of a patient with it mutated is higher.
Validation of the prognostic model
A combined feature matrix was constructed by integrating the normalized
expression values of the DEGs and the mutation times of the DMGs. The dimension
of the feature matrix is 200 × 20, in which each column represents an LUAD
sample of the training set. Then, the prognostic model based on the SVM
algorithm was trained using the 10-fold cross validation, and the optimal
hyper-parameters were obtained.Sensitivity, specificity, and the area under the ROC (receiver operating
characteristic) curve (AUC) were used to evaluate the prognostic performance.
They are defined by 4 terms, namely, true positive (TP), true negative (TN),
false positive (FP), and false negative (FN). TP is the number of the high-risk
samples that are predicted as high risk, while FP is the number of the high-risk
samples but predicted as low risk incorrectly; TN is the number of the low-risk
samples that are predicted as low risk, while FP is the number of low-risk
samples but predicted as high risk. Sensitivity, specificity, and AUC are
defined as follows:andSensitivity mainly evaluates the ability to recognize high-risk samples, whereas
specificity mainly focuses on the prediction performance of the low-risk
samples. The greater values of these criteria indicate better classification
result.The prediction performance of the DEGs and DMGs with patients in different stages
was first evaluated by the prognostic model. The values of sensitivity,
specificity, and AUC of stage I to IV are listed in Table 3. The prognostic genes worked
the best with the patients in stage III, where 7 of 8 high-risk patients were
stratified correctly. There are only 2 high-risk patients in stage IV, so the
specificity is not available.
Table 3.
Evaluation metrics of stratification results based on DEGs and DMGs of
the training set with different stages.
Stage I
Stage II
Stage III
Stage IV
Overall
Sensitivity
0.69
0.58
0.875
0.5
0.531
Specificity
0.55
0.66
0
–
0.743
AUC
0.62
0.62
0.438
–
0.637
Abbreviations: AUC, area under the ROC curve; DEG, differentially
expressed genes; DMG, differentially mutated genes; ROC, receiver
operating characteristic.
Statistically significant values were represented in bold.
Evaluation metrics of stratification results based on DEGs and DMGs of
the training set with different stages.Abbreviations: AUC, area under the ROC curve; DEG, differentially
expressed genes; DMG, differentially mutated genes; ROC, receiver
operating characteristic.Statistically significant values were represented in bold.The values of sensitivity, specificity, and AUC by using single (DEGs) and
integrated (DEGs and DMGs) features are listed Table 4. It can be found that by
integrating the features of the DEGs and DMGs, the performance of the proposed
prognostic model was significantly improved, with the specificity and AUC
increasing from 0.543 to 0.743 and from 0.537 to 0.637 respectively.
Table 4.
Evaluation metrics of stratification results based on different signature
sets with the best metrics written in bold.
Shukla et al14
Chen et al16
Zhao et al15
Songyang et al17
DEGs
DEGs + DMGs
Sensitivity
0.629
0.543
0.657
1
0.531
0.531
Specificity
0.5
0.594
0.594
0
0.543
0.743
AUC
0.564
0.568
0.625
0.451
0.537
0.637
Abbreviations: AUC, area under the ROC curve; DEG, differentially
expressed genes; DMG, differentially mutated genes; ROC, receiver
operating characteristic.
Statistically significant values were represented in bold.
Evaluation metrics of stratification results based on different signature
sets with the best metrics written in bold.Abbreviations: AUC, area under the ROC curve; DEG, differentially
expressed genes; DMG, differentially mutated genes; ROC, receiver
operating characteristic.Statistically significant values were represented in bold.The proposed signature set was then compared with 4 most recent sets of
prognostic signatures for LUAD, which are all based on genetic expression
profiles. The prediction experiments of all the signature sets were performed on
the same training and testing sets used in this article. In the training set,
the expression values of genes in each prognostic signature set were selected
and used to train the SVM model. The 10-fold cross validation was also used in
the training progress of them. Finally, the trained model was evaluated by the
testing set and the prediction results are listed in Table 4. It can be found that the
proposed signatures achieved the greatest values of specificity and AUC. Figure 8 depicts the ROC
curves of the stratification results of the samples in the testing set by
different prognostic signature sets and their corresponding prognostic models;
it can be found that the proposed prognostic signatures stood out on top
compared with others.
Figure 8.
ROC curves of proposed prognosis model by using our signatures and
others.
ROC curves of proposed prognosis model by using our signatures and
others.DEG indicates differentially expressed genes; DMG, differentially mutated
genes; ROC, receiver operating characteristic.
Discussion
This study aims to integrate the genetic mutation and expression profiles to predict
overall survival (OS) of LUAD using a TCGA data set. Patients in this data set were
separated into the high-risk and low-risk groups according to the overall survival.
Differential analysis between the 2 groups produced a novel set of prognostic genes,
containing 13 DEGs and 7 DMGs. Finally, a prognostic model was constructed using the
integrated features of the DEGs and DMGs. The validation results have showed the
prognostic value of the DEGs and DMGs and have showed the power of the DMGs on the
survival prediction. The most significant contribution of this article is the
integration of the genetic mutation and expression profiles to determine prognostic
genes for LUAD patients. If genetic expression and mutation profiles are available,
the pipeline of determining DEGs and DMGs in this article can be applied to other
types of cancers.The functions of the DEGs and DMGs were searched from https://www.uniprot.org/ and
are listed in Supplementary File 2. The DEGs and DMGs were also searched on PubMed
using the gene name and lung cancer. The number of related papers and functions of
the DEGs and DMGs in lung cancer are listed in Supplementary File 2. SFAT3, BMP5,
FAM83A, TSN4, ABCC2, and
FAT2 have been suggested to be biomarkers in lung
cancer.[34-39] The other genes were
identified as biomarkers for lung cancer for the first time in this article. In
addition, there is no related study about KLRG2,
DMXL1, CHD6, ADGRL3, and
SPATA31A6, suggesting that they act independently as biomarkers
in lung cancer.The relationships between the DEGs and DMGs and some known driver genes of lung
cancer were analyzed. FAM83A has been indicated as a proto-oncogene
that functions in the epidermal growth factor receptor (EGFR) signaling pathway.
EGFR is a well-known driver gene in lung cancer, and it has been
suggested that FAM83A lies downstream of EGFR/PI3K and upstream of
MEK. In breast cancer cells, it has been revealed that downmodulation of
FAM83A led to decreased proliferation and invasiveness in cell
cultures as well as to decreased tumor growth in vivo.
In lung cancer, several studies about FAM83A have been
published in recent years. Overexpression of FAM83A has been
indicated to be related to poor clinical outcomes in LUAD,[41-44] and FAM83A
promotes the progression and tumorigenicity in non-small cell lung cancer by
regulating Wnt and Hippo signaling pathways.[44,45] Therefore,
FAM83A is an effective prognostic biomarker and a potential new
therapeutic target in lung cancer.[36,46] In this article, it is found
that FAM83A is overexpressed in the high-risk samples compared with
the low-risk samples.Enrichment analysis of the gene set, comprising the DEGs, DMGs,
EGFR, and KRAS, was performed using KEGG pathways
database in this article. The results illustrated that GRIN2B and
EGFR were enriched into Rap1 signaling pathway
(P = .002) and Ras signaling pathway
(P = .003), suggesting a potential relationship between them. The 2
genes were then searched in PubMed and only one paper was obtained,
in which both GRIN2B and EGFR were
determined as biomarkers for gastric cancer but the relationship between them was
not discussed. Therefore, it is a promising study to explore the relationship
between GRIN2B and EGFR in lung cancer.Genetic mutations can affect gene expressions.
However, the interactive mechanism between the genetic mutation and
expression is still not well understood.
It is found that DMXL1 and FAT2 are
downregulated in the low-risk patients with these genes mutated, while
THSD7A is upregulated. To reveal how the DMGs impact their
expression levels is one of our future works. From the KM plots of the DMGs,
GRIN2B is a potential tumor suppressor gene in LUAD, which has
been confirmed in some types of cancer, such as diffuse large B-cell lymphoma,
gastric cancer,
and LUAD.There are 2 main limitations of this study. First, in the validation results of DMGs,
only GRIN2B showed a promising prognostic performance. The reason
is that the DMGs only mutated in a small portion of whole samples. For example, only
19% of the high-risk samples have mutations of GRIN2B. Second, the
performance of stratification is far from satisfactory. Although several prognostic
models can be used to predict the overall survival of LUAD,[15,17] the AUC values
are only 0.615 and 0.637 in Zhao et al
and in this article, respectively. New signatures, such as microbial
biomarkers,[50,51] and prognostic models based on advanced machine learning
algorithms, such as deep learning methods,
are needed to further improve the performance of the survival prediction.
Conclusion
In this article, 13 DEGs and 7 DMGs were identified in an LUAD cohort, and a
prognostic model was constructed by combining the features of the DEGs and DMGs. The
validation results on the testing set have showed the superiority of the proposed
prognostic signatures and model compared with others. The main contribution of this
article is that the prognostic signatures are determined by a new manner, in which
patients with LUAD were partitioned into the high-risk and low-risk groups.
Differential analysis of the expression and mutation profiles between the 2 groups
identified a new set of prognostic signatures. This pipeline can be applied to other
types of cancers to determine novel prognostic signatures and potential therapeutic
targets.There are several promising extensions of this study. First, the overall survival of
LUAD patients is only considered when identifying DEGs and DMGs in this article. In
future study, more factors, such as the therapy, should be included to determine
prognostic signatures. Second, with multi-omics data sets available, it is a
practical manner to improve the prognostic prediction performance by combining the
features from the genetic to epigenomic profiles of cancer samples.Click here for additional data file.Supplemental material, SupplementaryFigures for Combining Genetic Mutation and
Expression Profiles Identifies Novel Prognostic Biomarkers of Lung
Adenocarcinoma by Yun Liu, Fu Liu, Xintong Hu, Jiaxue He and Yanfang Jiang in
Clinical Medicine Insights: Oncology
Authors: Sudhanshu Shukla; Joseph R Evans; Rohit Malik; Felix Y Feng; Saravana M Dhanasekaran; Xuhong Cao; Guoan Chen; David G Beer; Hui Jiang; Arul M Chinnaiyan Journal: J Natl Cancer Inst Date: 2016-10-05 Impact factor: 13.506
Authors: En-Guo Chen; Pin Wang; Haizhou Lou; Yunshan Wang; Hong Yan; Lei Bi; Liang Liu; Bin Li; Antoine M Snijders; Jian-Hua Mao; Bo Hang Journal: Oncotarget Date: 2017-12-15
Authors: Sarah Richtmann; Dennis Wilkens; Arne Warth; Felix Lasitschka; Hauke Winter; Petros Christopoulos; Felix J F Herth; Thomas Muley; Michael Meister; Marc A Schneider Journal: Cancers (Basel) Date: 2019-05-11 Impact factor: 6.639