
Combining Genetic Mutation and Expression Profiles Identifies Novel Prognostic Biomarkers of Lung Adenocarcinoma.

Yun Liu¹,², Fu Liu², Xintong Hu¹, Jiaxue He¹, Yanfang Jiang¹.

Abstract

MOTIVATION: Although several prognostic signatures for lung adenocarcinoma (LUAD) have been developed, they are mainly based on a single-omics data set. This article aims to develop a novel set of prognostic signatures by combining genetic mutation and expression profiles of LUAD patients.
METHODS: The genetic mutation and expression profiles, together with the clinical profiles, of a cohort of LUAD patients were downloaded from The Cancer Genome Atlas (TCGA). Patients were separated into 2 groups, namely, the high-risk and low-risk groups, according to their overall survival. Then, differential analysis was performed to determine differentially expressed genes (DEGs) and differentially mutated genes (DMGs) in the expression and mutation profiles, respectively, between the 2 groups. Finally, a prognostic model based on the support vector machine (SVM) algorithm was developed by combining the expression values of the DEGs and the mutation counts of the DMGs.
RESULTS: A total of 13 DEGs and 7 DMGs were identified between the 2 groups. Their prognostic values were validated using independent cohorts. Compared with several existing signatures, the proposed prognostic signatures exhibited better prediction performance in the testing set. In addition, 1 of the 7 DMGs, GRIN2B, was found to be mutated much more frequently in the high-risk group, showing potential value as a therapeutic target.
CONCLUSIONS: Combining multi-omics data sets is a practical way to identify novel prognostic signatures and to improve prognostic prediction for LUAD, and the approach may be instructive for other types of cancer.


Keywords: Lung adenocarcinoma; SVM; differentially expressed genes; differentially mutated genes; prognostic model; prognostic signatures

Year: 2020 PMID: 35153523 PMCID: PMC8826273 DOI: 10.1177/1179554920966260

Source DB: PubMed Journal: Clin Med Insights Oncol ISSN: 1179-5549

Introduction

In China, about 3.9 million new cancer cases were reported in 2015, of which lung cancer ranked first and accounted for nearly 20% of the total. Lung adenocarcinoma (LUAD), a subtype of non-small cell lung carcinoma (NSCLC), accounts for 40% of all lung cancers.[2,3] LUAD patients are usually diagnosed at a relatively late stage and have poor survival. It is therefore of great value to further improve the long-term survival rate of LUAD,[5,6] which can be achieved by developing individualized therapy based on prognostic signatures. Several efforts have been made to identify prognostic signatures for lung cancer in the genomics era since 2002.[7-13] In recent years, Shukla et al developed a 4-gene signature set based on univariate Cox analysis of an LUAD cohort from The Cancer Genome Atlas (TCGA). Zhao et al identified a 20-gene signature set from genes differentially expressed in LUAD compared with adjacent normal lung tissues. Chen et al constructed a multistep bioinformatics analysis pipeline and identified 27 genes that are significantly related to overall survival in LUAD patients. Songyang et al identified a robust set of prognostic signatures containing 25 genes by meta-analysis-based Cox analysis of 10 gene expression data sets. However, these studies are all based on a single-omics data set, namely, gene expression data. Because multi-omics data sets of lung cancer are available in TCGA, it is possible to explore prognostic signatures by integrating different types of omics data. In some applications of machine learning, combining different types of features results in better prediction performance.
The patterns of somatic mutations in NSCLC have been extensively studied to reveal mutation characteristics from different aspects, such as the distinct genetic mutations in LUAD and other subtypes of NSCLC, in different races with LUAD,[20,21] in younger patients compared with elderly patients,[22,23] and in never-smoking patients.[24-26] Thus, it is necessary to identify prognostic signatures from the mutation profiles of LUAD patients. To date, several studies have tried to combine gene expression and mutation profiles to improve the outcome prediction of some diseases, including myelodysplastic syndromes. To our knowledge, the study by Song et al is the first attempt at survival prediction of LUAD by integrating genetic mutation and expression profiles. It has been verified that predictive accuracy was improved by the contribution of genetic mutations. In this article, a novel set of prognostic signatures of LUAD was identified by integrating genetic mutation and expression profiles. An LUAD cohort from TCGA was downloaded and used to identify prognostic signatures. The patients of the cohort were separated into the high-risk and low-risk groups according to their overall survival. Differential analysis between the 2 groups produced 20 prognostic genes, including 13 differentially expressed genes (DEGs) and 7 differentially mutated genes (DMGs). A prognostic model based on the support vector machine (SVM) algorithm was then built by combining the expression values of the DEGs and the mutation counts of the DMGs. When training the prognostic model, 10-fold cross validation was used to find the optimal hyper-parameters. The validation results in the testing set have shown that the identified prognostic signatures are effective for the stratification of LUAD samples and that the prediction performance is improved by the contribution of the DMGs.
The main contribution of this article is the identification of DMGs between the high-risk and low-risk groups, and the combination of the features of the DEGs and DMGs, which can be applied to the survival prediction of other types of cancers.

Materials and Methods

The pipeline of the proposed method is shown in Figure 1. First, 272 samples of the TCGA LUAD cohort were selected according to overall survival and partitioned into a training set and a testing set. Then, DEGs and DMGs were determined from the training set, and their features were combined to train a prognostic model based on the SVM algorithm. Ten-fold cross validation was used to find the best hyper-parameters. Finally, this model was used to stratify the samples in the testing set to evaluate its performance.

Figure 1.

Pipeline of the proposed method.

DEG indicates differentially expressed genes; DMG, differentially mutated genes; LUAD, lung adenocarcinoma; SVM, support vector machine; TCGA, The Cancer Genome Atlas.


Data collection and grouping

A cohort of LUAD from TCGA was used in this article. The genetic mutation and expression profiles, and their corresponding clinical profiles, were downloaded in October 2019 and include 522 samples. Table 1 lists their clinical information. Overall survival was the only factor considered when grouping these samples into 3 subgroups. A total of 137 samples were assigned to the low-risk group because their overall survival is longer than 36 months. For the high-risk group, a more rigorous standard was used: a sample was determined to be high risk if its vital status is dead and its overall survival is less than 36 months; 135 patients were assigned to the high-risk group. The remaining samples were classified into the unknown group because their days to last follow-up are less than 36 months and their vital status is alive, so their exact overall survival cannot be determined. In total, 272 of 522 samples were selected and randomly separated into the training set (75%) and testing set (25%). The training set contains 200 samples, comprising 100 high-risk and 100 low-risk samples. The remaining samples were used as the testing set. The training set was used to identify the prognostic signatures and to build the prognostic model, and the testing set was used to evaluate them.
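The grouping rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the record layout, and the example records are hypothetical, and a death at exactly 36 months is treated as high risk, following the "⩽36 months" row of Table 1.

```python
from typing import Optional

THRESHOLD_MONTHS = 36.0  # cutoff stated in the text


def assign_risk_group(os_months: Optional[float], vital_status: Optional[str]) -> str:
    """Assign one sample to 'low', 'high', 'unknown', or 'not_available'."""
    if os_months is None or vital_status is None:
        return "not_available"  # clinical fields missing
    if os_months > THRESHOLD_MONTHS:
        return "low"  # survival or follow-up longer than 36 months
    if vital_status.strip().lower() == "dead":
        return "high"  # died within 36 months
    return "unknown"  # alive, follow-up too short to decide


# Hypothetical records: (overall survival or follow-up in months, vital status)
records = [(48.0, "Alive"), (12.5, "Dead"), (20.0, "Alive"), (None, "Alive")]
groups = [assign_risk_group(months, status) for months, status in records]
print(groups)  # ['low', 'high', 'unknown', 'not_available']
```

Only samples labeled low or high would enter the training and testing sets; the unknown and not-available samples are set aside.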

Table 1.

Clinical information of 522 LUAD samples from TCGA.

Statistics	N
Sex
Male	242
Female	280
Stage
I	279
II	124
III	85
IV	26
Not available	8
Vital status
Alive	334
Dead	188
Overall survival
>36 months	137
⩽36 months	135
Unknown	242
Not available	8

Abbreviations: LUAD, lung adenocarcinoma; TCGA, The Cancer Genome Atlas.


Identification of differentially expressed genes

GDCRNATools was used to identify the DEGs in the high-risk samples compared with the low-risk samples. The profiles of messenger RNA (mRNA) were used for this analysis. The gdcDEAnalysis function of GDCRNATools, with the limma method selected, was used to determine the DEGs. The criteria were a false discovery rate (FDR)-adjusted P value <.05 and an absolute log2 fold change >1.
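The two selection criteria can be expressed as a simple filter. The sketch below is in Python rather than R and does not call GDCRNATools; the function name and the two "GENE_" rows are hypothetical, while the RHOV and SFTA3 values are the ones reported in Table 2.

```python
from typing import Dict, List, Tuple

FDR_CUTOFF = 0.05      # FDR-adjusted P value threshold from the text
LOG2FC_CUTOFF = 1.0    # absolute log2 fold-change threshold from the text


def select_degs(results: Dict[str, Tuple[float, float]]) -> Dict[str, List[str]]:
    """Split genes passing both cutoffs into up- and downregulated lists.

    `results` maps gene symbol -> (log2 fold change, FDR-adjusted P value),
    with fold change computed as high-risk relative to low-risk.
    """
    selected: Dict[str, List[str]] = {"up": [], "down": []}
    for gene, (log_fc, fdr) in results.items():
        if fdr < FDR_CUTOFF and abs(log_fc) > LOG2FC_CUTOFF:
            selected["up" if log_fc > 0 else "down"].append(gene)
    return selected


toy_results = {
    "RHOV": (1.3313045, 0.020341),     # reported in Table 2
    "SFTA3": (-1.269826599, 0.04701),  # reported in Table 2
    "GENE_A": (0.80, 0.010),           # hypothetical: fold change too small
    "GENE_B": (1.50, 0.200),           # hypothetical: not significant
}
print(select_degs(toy_results))  # {'up': ['RHOV'], 'down': ['SFTA3']}
```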

Identification of differentially mutated genes

The MafCompare function in the Maftools package was used to detect the DMGs between the high-risk and low-risk samples. The parameter “minMut” was set to 10, meaning that the number of samples with a DMG mutated in one group must be at least 10 more than that in the other group.
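To our understanding, the cohort comparison in Maftools rests on a per-gene Fisher's exact test on a 2 × 2 table of mutated versus non-mutated samples in the two groups. The following pure-Python sketch shows that test in isolation; it is not the Maftools implementation, the function name is hypothetical, and the example counts are invented for illustration rather than taken from the article.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]].

    a, b: mutated / non-mutated samples in group 1
    c, d: mutated / non-mutated samples in group 2
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x mutated samples in group 1
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts, NOT taken from the article: 19 of 100 vs 6 of 100 mutated
p_value = fisher_exact_two_sided(19, 81, 6, 94)
print(f"P = {p_value:.4f}")
```

A gene would be reported as differentially mutated when this P value falls below the chosen significance level, after any minimum-mutation filter has been applied.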

External validation of the DEGs and DMGs

The prognostic values of the DEGs and DMGs were validated by Kaplan-Meier (KM) plot and the International Cancer Genome Consortium (ICGC) Data Portal (https://dcc.icgc.org/), respectively. When using KM plot, Jetset was used to select the optimal probe set for each prognostic gene.

Prognostic prediction model

First, a feature matrix was established by combining the features of the DEGs and DMGs: the expression values of the DEGs and the mutation counts of the DMGs of all the samples in the training set were integrated into a combined feature matrix. The risk group of a sample was used as its label. Then, a prognostic model based on the SVM algorithm was built; the principle of the SVM algorithm can be found in Maldonado et al. The feature matrix and the labels of the training set were input into the SVM model. Ten-fold cross validation was used to determine the optimal hyper-parameters. In this article, the e1071 package in R was used to build and train this model.
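The construction of the combined feature matrix and the 10-fold split can be sketched as follows. This is a Python illustration, not the e1071 workflow used in the article: all function names, sample IDs, and values are hypothetical, filling a missing mutation entry with 0 is an assumption, and the SVM fit itself is left out.

```python
import random
from typing import Dict, Iterator, List, Sequence, Tuple


def build_feature_matrix(
    samples: Sequence[str],
    expression: Dict[str, Dict[str, float]],
    mutation_counts: Dict[str, Dict[str, int]],
    degs: Sequence[str],
    dmgs: Sequence[str],
) -> List[List[float]]:
    """One row per sample: DEG expression values, then DMG mutation counts."""
    matrix = []
    for sample in samples:
        row = [expression[sample][gene] for gene in degs]
        # Assumption: a gene or sample absent from the mutation table counts as 0
        sample_mutations = mutation_counts.get(sample, {})
        row += [float(sample_mutations.get(gene, 0)) for gene in dmgs]
        matrix.append(row)
    return matrix


def k_fold_indices(
    n_samples: int, k: int = 10, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation


# Toy data with hypothetical sample IDs and values
expression = {"S1": {"RHOV": 3.9, "BMP5": 2.1}, "S2": {"RHOV": 2.7, "BMP5": 3.4}}
mutation_counts = {"S1": {"GRIN2B": 2}}
X = build_feature_matrix(["S1", "S2"], expression, mutation_counts, ["RHOV", "BMP5"], ["GRIN2B"])
print(X)  # [[3.9, 2.1, 2.0], [2.7, 3.4, 0.0]]
print([len(val) for _, val in k_fold_indices(200)])  # ten folds of 20
```

For each candidate hyper-parameter setting, a model would be fitted on the training indices of every fold and scored on the held-out indices, and the setting with the best average score would be kept.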

Results

Differentially expressed genes

By running the gdcDEAnalysis function of the GDCRNATools package with the defined criteria, 13 DEGs were identified; they are listed in Table 2 and Supplementary File 1. Among these genes, FAM83A-AS1 is a long noncoding RNA and AC005077.4 is a pseudogene; the rest are protein-coding genes, as depicted in Figure 2A. Three DEGs, SFTA3, KLRG2, and BMP5, are downregulated in the high-risk samples compared with the low-risk samples, while the other 10 genes are all upregulated, as depicted in Figure 2B.

Table 2.

Information of identified DEGs (sorted by the value of logFC).

Symbol	Group	logFC	AveExpr	P value	FDR
SFTA3	protein_coding	−1.269826599	5.417458	.000265	0.04701
KLRG2	protein_coding	−1.157861235	−0.04149	5.26E-05	0.028722
BMP5	protein_coding	−1.157194445	2.649085	.000175	0.042865
FAM83A-AS1	long_non_coding	1.044754193	1.091472	.00015	0.040683
TFAP2A	protein_coding	1.047712916	3.277663	3.06E-05	0.023603
PKP2	protein_coding	1.068768408	3.276142	.000192	0.042865
FAM83A	protein_coding	1.141982369	6.422601	3.12E-05	0.023603
CCL20	protein_coding	1.159603789	3.570545	.000284	0.048067
AC005077.4	pseudogene	1.180426431	1.043103	2.27E-05	0.023603
RHOV	protein_coding	1.3313045	3.822388	9.23E-06	0.020341
DNER	protein_coding	1.462348178	1.156925	1.27E-05	0.021316
TNS4	protein_coding	1.521092422	3.270374	8.69E-05	0.03118
ABCC2	protein_coding	1.537791588	1.063846	.000242	0.045465

Abbreviations: DEG, differentially expressed genes; FDR, false discovery rate.

Figure 2.

(A) Bar plot of identified DEGs and (B) volcano plot of identified DEGs, in which FDR represents FDR-adjusted P values.

DEG indicates differentially expressed genes; FDR, false discovery rate.


Differentially mutated genes

By running the MafCompare function in the Maftools package, 7 DMGs were identified; they are listed in Supplementary File 1. Three samples in the training set do not have genetic mutation profiles, so 197 samples were included in this analysis. Figures 3 and 4 depict the forest plot and co-oncoplot of the DMGs. Six DMGs are mutated more frequently in the low-risk samples. One gene, GRIN2B, is mutated in 19% of the high-risk samples, whereas only 6% of the low-risk samples carry mutations of this gene. The locations of the somatic mutations of the DMGs are shown in Supplemental Figures S1 to S7.

Figure 3.

Forest plot of 7 DMGs.

DMG indicates differentially mutated genes.

**P < .01. ***P < .001.

Figure 4.

Co-oncoplot of 7 DMGs.

DMG indicates differentially mutated genes.

The expression levels of the DMGs in the high-risk and low-risk groups were then analyzed, as shown in Figure 5. Based on the P values of the DMGs, there are no significant differences between the 2 groups. However, the mutations of DMXL1, FAT2, GRIN2B, and THSD7A may affect their mRNA levels because their P values are smaller than those of the other genes. Compared with the high-risk samples, DMXL1 and FAT2 are downregulated in the low-risk samples in which they are mutated, while THSD7A is upregulated.

Figure 5.

Expression levels of 7 DMGs of the high-risk and the low-risk samples: (A) DMXL1, (B) CHD6, (C) THSD7A, (D) FAT2, (E) SPATA31A6, (F) GRIN2B, and (G) ADGRL3.

DMG indicates differentially mutated genes.

Figures 6 and 7 show the results of the univariate Cox regression analysis of the DEGs and DMGs. Three DEGs and one DMG were not found in KM plot and the ICGC Data Portal. In Figure 6, it is found that 9 DEGs, except DNER and LARC, are significantly related to the overall survival of LUAD samples; the relationships between their expression levels and the survival rate of LUAD samples are consistent with the results of this article. The validation results of DNER and LARC are less promising. SFTA3 and BMP5 could be tumor suppressor genes because their high expression levels relate to better survival, while the others could be oncogenes.

Figure 6.

KM curves of 10 DEGs on 1926 LUAD patients, who were separated into the high-expression and low-expression groups. SFTA3 and BMP5 could be tumor suppressor genes because higher expression levels of them relate to better survival, while others could be oncogenes: (A) SFTA3, (B) BMP5, (C) FAM83A, (D) TFAP2A, (E) PKP2, (F) CCL20, (G) RHOV, (H) DNER, (I) TNS4, and (J) ABCC2.

DEG indicates differentially expressed genes; KM, Kaplan-Meier; LUAD, lung adenocarcinoma.

Figure 7.

KM curves of 6 DMGs on 195 LUAD patients, who were separated into the mutated and not mutated groups. GRIN2B is significantly associated with the survival rate of LUAD patients and it could be a tumor suppressor as the risk of patients with it mutated is higher: (A) DMXL1, (B) CHD6, (C) THSD7A, (D) FAT2, (E) SPATA31A6, and (F) GRIN2B.

DMG indicates differentially mutated genes; KM, Kaplan-Meier; LUAD, lung adenocarcinoma.

In Figure 7, the mutations of GRIN2B are significantly associated with the survival rate of LUAD samples, which is consistent with the results of this article. Patients with mutations of GRIN2B show worse outcomes than those without mutations of GRIN2B. Therefore, GRIN2B could be a tumor suppressor gene, as the risk of a patient with it mutated is higher.

Validation of the prognostic model

A combined feature matrix was constructed by integrating the normalized expression values of the DEGs and the mutation counts of the DMGs. The dimension of the feature matrix is 200 × 20, in which each row represents an LUAD sample of the training set and each column represents a prognostic gene. Then, the prognostic model based on the SVM algorithm was trained using 10-fold cross validation, and the optimal hyper-parameters were obtained. Sensitivity, specificity, and the area under the ROC (receiver operating characteristic) curve (AUC) were used to evaluate the prognostic performance. They are defined by 4 terms, namely, true positive (TP), true negative (TN), false positive (FP), and false negative (FN). TP is the number of high-risk samples that are predicted as high risk, while FN is the number of high-risk samples incorrectly predicted as low risk; TN is the number of low-risk samples that are predicted as low risk, while FP is the number of low-risk samples incorrectly predicted as high risk. Sensitivity, specificity, and AUC are defined as follows:

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}, \qquad \mathrm{Specificity} = \frac{TN}{TN + FP},$$

and

$$\mathrm{AUC} = \frac{1}{2}\left(\frac{TP}{TP + FN} + \frac{TN}{TN + FP}\right).$$

Sensitivity mainly evaluates the ability to recognize high-risk samples, whereas specificity mainly focuses on the prediction performance for the low-risk samples. Greater values of these criteria indicate better classification results. The prediction performance of the DEGs and DMGs for patients in different stages was first evaluated by the prognostic model. The values of sensitivity, specificity, and AUC for stages I to IV are listed in Table 3. The prognostic genes worked best for the patients in stage III, where 7 of 8 high-risk patients were stratified correctly. There are only 2 high-risk patients in stage IV, so the specificity is not available.
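The evaluation metrics can be computed directly from the four confusion-matrix counts. In the sketch below, FN denotes high-risk samples predicted as low risk and FP denotes low-risk samples predicted as high risk. The AUC is taken as the mean of sensitivity and specificity, which is what the area under a one-point ROC curve reduces to; that the article computed AUC this way is an assumption, although it agrees with most entries of Tables 3 and 4. The function names and the counts are hypothetical, since the article does not report its confusion matrices.

```python
from typing import Optional


def sensitivity(tp: int, fn: int) -> Optional[float]:
    """Fraction of high-risk samples predicted as high risk."""
    return tp / (tp + fn) if (tp + fn) else None


def specificity(tn: int, fp: int) -> Optional[float]:
    """Fraction of low-risk samples predicted as low risk."""
    return tn / (tn + fp) if (tn + fp) else None


def single_threshold_auc(tp: int, fn: int, tn: int, fp: int) -> Optional[float]:
    """Area under a one-point ROC curve: mean of sensitivity and specificity."""
    sens, spec = sensitivity(tp, fn), specificity(tn, fp)
    if sens is None or spec is None:
        return None  # undefined when one class has no samples
    return (sens + spec) / 2


# Hypothetical confusion-matrix counts (the article does not report them)
tp, fn, tn, fp = 17, 15, 26, 9
print(round(sensitivity(tp, fn), 3))   # 0.531
print(round(specificity(tn, fp), 3))   # 0.743
print(round(single_threshold_auc(tp, fn, tn, fp), 3))  # 0.637
```

Returning None when a class is empty mirrors the situation in which a metric cannot be reported because a subgroup contains no samples of one risk class.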

Table 3.

Evaluation metrics of stratification results based on DEGs and DMGs of the training set with different stages.

	Stage I	Stage II	Stage III	Stage IV	Overall
Sensitivity	0.69	0.58	0.875	0.5	0.531
Specificity	0.55	0.66	0	–	0.743
AUC	0.62	0.62	0.438	–	0.637

Abbreviations: AUC, area under the ROC curve; DEG, differentially expressed genes; DMG, differentially mutated genes; ROC, receiver operating characteristic.

Statistically significant values were represented in bold.

The values of sensitivity, specificity, and AUC obtained using single (DEGs) and integrated (DEGs and DMGs) features are listed in Table 4. By integrating the features of the DEGs and DMGs, the performance of the proposed prognostic model was significantly improved, with the specificity and AUC increasing from 0.543 to 0.743 and from 0.537 to 0.637, respectively.

Table 4.

Evaluation metrics of stratification results based on different signature sets with the best metrics written in bold.

	Shukla et al¹⁴	Chen et al¹⁶	Zhao et al¹⁵	Songyang et al¹⁷	DEGs	DEGs + DMGs
Sensitivity	0.629	0.543	0.657	1	0.531	0.531
Specificity	0.5	0.594	0.594	0	0.543	0.743
AUC	0.564	0.568	0.625	0.451	0.537	0.637

Abbreviations: AUC, area under the ROC curve; DEG, differentially expressed genes; DMG, differentially mutated genes; ROC, receiver operating characteristic.

Statistically significant values were represented in bold.

The proposed signature set was then compared with the 4 most recent sets of prognostic signatures for LUAD, which are all based on gene expression profiles. The prediction experiments for all the signature sets were performed on the same training and testing sets used in this article. In the training set, the expression values of the genes in each prognostic signature set were selected and used to train the SVM model. Ten-fold cross validation was also used in their training process. Finally, each trained model was evaluated on the testing set, and the prediction results are listed in Table 4. The proposed signatures achieved the greatest values of specificity and AUC. Figure 8 depicts the ROC curves of the stratification results of the samples in the testing set obtained by the different prognostic signature sets and their corresponding prognostic models; the proposed prognostic signatures outperformed the others.

Figure 8.

ROC curves of proposed prognosis model by using our signatures and others.

DEG indicates differentially expressed genes; DMG, differentially mutated genes; ROC, receiver operating characteristic.


Discussion

This study aims to integrate genetic mutation and expression profiles to predict the overall survival (OS) of LUAD using a TCGA data set. Patients in this data set were separated into the high-risk and low-risk groups according to overall survival. Differential analysis between the 2 groups produced a novel set of prognostic genes, containing 13 DEGs and 7 DMGs. Finally, a prognostic model was constructed using the integrated features of the DEGs and DMGs. The validation results have shown the prognostic value of the DEGs and DMGs and the contribution of the DMGs to survival prediction. The most significant contribution of this article is the integration of genetic mutation and expression profiles to determine prognostic genes for LUAD patients. If gene expression and mutation profiles are available, the pipeline for determining DEGs and DMGs in this article can be applied to other types of cancers. The functions of the DEGs and DMGs were retrieved from https://www.uniprot.org/ and are listed in Supplementary File 2. The DEGs and DMGs were also searched on PubMed using the gene name and "lung cancer". The number of related papers and the functions of the DEGs and DMGs in lung cancer are listed in Supplementary File 2. SFTA3, BMP5, FAM83A, TNS4, ABCC2, and FAT2 have been suggested to be biomarkers in lung cancer.[34-39] The other genes were identified as biomarkers for lung cancer for the first time in this article. In addition, no related studies were found for KLRG2, DMXL1, CHD6, ADGRL3, and SPATA31A6, suggesting that they act independently as biomarkers in lung cancer. The relationships between the DEGs and DMGs and some known driver genes of lung cancer were analyzed. FAM83A has been indicated as a proto-oncogene that functions in the epidermal growth factor receptor (EGFR) signaling pathway. EGFR is a well-known driver gene in lung cancer, and it has been suggested that FAM83A lies downstream of EGFR/PI3K and upstream of MEK.
In breast cancer cells, it has been revealed that downmodulation of FAM83A led to decreased proliferation and invasiveness in cell cultures as well as to decreased tumor growth in vivo. In lung cancer, several studies about FAM83A have been published in recent years. Overexpression of FAM83A has been indicated to be related to poor clinical outcomes in LUAD,[41-44] and FAM83A promotes progression and tumorigenicity in non-small cell lung cancer by regulating the Wnt and Hippo signaling pathways.[44,45] Therefore, FAM83A is an effective prognostic biomarker and a potential new therapeutic target in lung cancer.[36,46] In this article, FAM83A was found to be overexpressed in the high-risk samples compared with the low-risk samples. Enrichment analysis of the gene set comprising the DEGs, DMGs, EGFR, and KRAS was performed using the KEGG pathway database. The results showed that GRIN2B and EGFR were enriched in the Rap1 signaling pathway (P = .002) and the Ras signaling pathway (P = .003), suggesting a potential relationship between them. The 2 genes were then searched in PubMed and only one paper was obtained, in which both GRIN2B and EGFR were determined to be biomarkers for gastric cancer but the relationship between them was not discussed. Exploring the relationship between GRIN2B and EGFR in lung cancer is therefore a promising direction. Genetic mutations can affect gene expression. However, the interactive mechanism between genetic mutation and expression is still not well understood. It was found that DMXL1 and FAT2 are downregulated in the low-risk patients with these genes mutated, while THSD7A is upregulated. Revealing how mutations of the DMGs affect their expression levels is part of our future work. From the KM plots of the DMGs, GRIN2B is a potential tumor suppressor gene in LUAD, which has been confirmed in some types of cancer, such as diffuse large B-cell lymphoma, gastric cancer, and LUAD. There are 2 main limitations of this study.
First, in the validation results of the DMGs, only GRIN2B showed promising prognostic performance. The reason is that the DMGs are mutated in only a small portion of all samples. For example, only 19% of the high-risk samples have mutations of GRIN2B. Second, the performance of stratification is far from satisfactory. Although several prognostic models can be used to predict the overall survival of LUAD,[15,17] the AUC values are only 0.615 and 0.637 in Zhao et al and in this article, respectively. New signatures, such as microbial biomarkers,[50,51] and prognostic models based on advanced machine learning algorithms, such as deep learning methods, are needed to further improve the performance of survival prediction.

Conclusion

In this article, 13 DEGs and 7 DMGs were identified in an LUAD cohort, and a prognostic model was constructed by combining the features of the DEGs and DMGs. The validation results on the testing set have shown the superiority of the proposed prognostic signatures and model compared with others. The main contribution of this article is that the prognostic signatures are determined in a new manner, in which patients with LUAD were partitioned into the high-risk and low-risk groups. Differential analysis of the expression and mutation profiles between the 2 groups identified a new set of prognostic signatures. This pipeline can be applied to other types of cancers to determine novel prognostic signatures and potential therapeutic targets. There are several promising extensions of this study. First, only the overall survival of LUAD patients was considered when identifying DEGs and DMGs in this article. In future studies, more factors, such as the therapy, should be included to determine prognostic signatures. Second, with multi-omics data sets available, a practical way to improve prognostic prediction performance is to combine features from the genetic to epigenomic profiles of cancer samples.
