Jing Zhou1, Xinyue Wang1, Zhaona Li1, Richeng Jiang1. 1. Department of Thoracic Oncology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer; Key Laboratory of Cancer Prevention and Therapy, Tianjin; Tianjin's Clinical Research Center for Cancer, Tianjin 300060, China.
Abstract
BACKGROUND: Autophagy related genes (ARGs) regulate lysosomal degradation to induce autophagy and are involved in the occurrence and development of a variety of cancers. The expression of ARGs in tumor tissue shows considerable promise for predicting patient survival. The aim of this study was to construct a prognostic risk score model for lung adenocarcinoma (LUAD) based on ARGs.
METHODS: A total of 5,786 ARGs were obtained from the GeneCards database. Gene expression profiles and clinical data of 395 LUAD patients were collected from The Cancer Genome Atlas (TCGA) database. Expression data for all ARGs were extracted, and differentially expressed ARGs were identified using R software. Survival analysis of the differentially expressed ARGs was performed to screen for ARGs with prognostic value, followed by functional enrichment analysis. Least absolute shrinkage and selection operator (LASSO) regression and a Cox regression model were used to construct an ARG-based prognostic risk score model. A receiver operating characteristic (ROC) curve was drawn to obtain the optimal cut-off value of the risk score, and patients were divided into a high-risk group and a low-risk group according to this cut-off. The area under the curve (AUC) was calculated and Kaplan-Meier survival curves were plotted to evaluate model performance, which was then verified in external data sets. Finally, univariate and multivariate Cox regression analyses were applied to evaluate the independent prognostic value of the model, and its clinical relevance was analyzed.
RESULTS: Survival analysis, LASSO regression and Cox regression analysis were used to construct a LUAD prognostic risk score model with five ARGs (ADAM12, CAMP, DKK1, STRIP2 and TFAP2A). Patients with a low risk score survived significantly longer than patients with a high risk score (P<0.001). The model showed good predictive performance for LUAD in both the training set (AUCmax=0.78) and two external validation sets (AUCmax=0.88). Risk score was significantly associated with the prognosis of LUAD patients in univariate and multivariate Cox regression analyses, suggesting that it could be a potential independent prognostic factor for LUAD. Correlation analysis of clinical characteristics showed that a high risk score was closely associated with high T stage, advanced tumor stage and poor prognosis.
CONCLUSIONS: We constructed a LUAD risk score model consisting of five ARGs, which can provide a reference for predicting the prognosis of LUAD patients and may in future be used in combination with tumor node metastasis (TNM) staging for prognosis prediction.
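As an illustration of the scoring and grouping steps described in METHODS, the following is a minimal sketch rather than the authors' pipeline. It assumes the conventional form of a Cox-based risk score (the sum of each gene's coefficient multiplied by its expression) and picks the cut-off by maximizing the Youden index on a plain ROC that ignores censoring. The coefficients in `COEFS` and the function names are hypothetical; the fitted values are not given in this text.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical coefficients for illustration only (NOT the fitted values).
COEFS: Dict[str, float] = {
    "ADAM12": 0.10, "CAMP": -0.08, "DKK1": 0.07, "STRIP2": 0.12, "TFAP2A": 0.09,
}


def risk_score(expr: Dict[str, float], coefs: Dict[str, float] = COEFS) -> float:
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coefs[gene] * expr[gene] for gene in coefs)


def youden_cutoff(scores: Sequence[float], events: Sequence[int]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A patient is called high-risk when score >= cutoff. `events` holds
    1 (death observed) or 0 (no death observed); censoring is ignored here.
    """
    pos = sum(events)
    neg = len(events) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one event and one non-event")
    best_cut, best_j = scores[0], -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, e in zip(scores, events) if s >= cut and e == 1)
        tn = sum(1 for s, e in zip(scores, events) if s < cut and e == 0)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


def assign_groups(scores: Sequence[float], cutoff: float) -> List[str]:
    """Label each patient 'high' or 'low' risk relative to the cut-off."""
    return ["high" if s >= cutoff else "low" for s in scores]
```

With real data, `scores` would be `risk_score` applied to each patient's expression values, and the resulting groups would feed the Kaplan-Meier comparison. A time-dependent ROC, as used in the study, would additionally account for censored follow-up.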
Keywords:
Autophagy related genes; Cox regression model; LASSO regression; Lung neoplasms; Prognostic model
Survival analysis of the five signature genes. A: Survival curve of ADAM12; B: Survival curve of CAMP; C: Survival curve of DKK1; D: Survival curve of STRIP2; E: Survival curve of TFAP2A.
Performance evaluation of the prognostic risk score model. A: Risk curve; B: Survival status chart; C: Heatmap of the expression profiles of the five signature genes; D: Kaplan-Meier survival curve; E: Time-dependent ROC curve.
Correlation analysis of clinical characteristics. A: Association between risk score and survival status; B: Association between risk score and tumor stage; C: Association between risk score and T stage. *P < 0.05, ***P < 0.001.
Performance evaluation of the prognostic risk score model in the external validation sets. A: Time-dependent ROC curve of GSE31210; B: Kaplan-Meier survival curve of GSE31210; C: Time-dependent ROC curve of GSE72094; D: Kaplan-Meier survival curve of GSE72094.
Discussion
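Studies have shown that autophagy participates in the occurrence and development of lung adenocarcinoma: it helps meet the high metabolic demands of tumor cells and plays an important role in tumor growth and invasion. Autophagy is also a key regulator of drug resistance during the treatment of lung adenocarcinoma. Inhibiting autophagy enhances the antitumor activity of afatinib in lung adenocarcinoma with activating EGFR mutations, and it also improves the efficacy of the Shh inhibitor vismodegib in LUAD. In this study, ARGs were collected from the GeneCards database, and lung adenocarcinoma RNA-seq data and survival information from TCGA were used to screen out 52 ARGs with prognostic value. GO and KEGG enrichment analyses indicated that these genes were mainly enriched in functions such as cell cycle regulation, autophagy, the HIF-1 signaling pathway and the p53 signaling pathway. Univariate Cox regression, LASSO regression and multivariate Cox regression then identified five key ARGs (ADAM12, CAMP, DKK1, STRIP2 and TFAP2A), from which a prognostic risk score model for lung adenocarcinoma was constructed.
The secreted form of ADAM12 is highly expressed in lung cancer and promotes the proliferation, migration and invasion of tumor cells. Silencing ADAM12 promotes apoptosis by activating autophagy in human choriocarcinoma JEG-3 cells, and inhibiting ADAM12 reduces proliferation and promotes apoptosis in small cell lung cancer cells. CAMP is a host immune defense peptide with antitumor effects. LL-37, the C-terminal peptide of CAMP, is the only human antimicrobial peptide of the cathelicidin family and plays important roles in cell chemotaxis, angiogenesis, induction of immune mediators and regulation of inflammatory responses. One study found that LL-37 is strongly expressed in normal colonic mucosa but downregulated in colon cancer tissue, and that LL-37 can induce apoptosis and autophagic death of colon cancer cells, giving it a distinctive antitumorigenic effect. DKK1 is a negative regulator of Wnt signaling and a target of the β-catenin/TCF pathway; DKK1 can induce autophagy by inhibiting the Wnt-CTNNB1 signaling pathway. STRIP2 regulates the growth and migration of many types of tumor cells. It is highly expressed in lung adenocarcinoma and promotes the proliferation and invasion of lung tumors by regulating the AKT/mTOR pathway and epithelial-mesenchymal transition (EMT). TFAP2A is aberrantly expressed in many cancers; for example, it is overexpressed in human nasopharyngeal carcinoma, where it promotes tumorigenesis by regulating the HIF-1α-mediated VEGF/PEDF signaling pathway. A previous study found that TFAP2A induces KRT16 overexpression and promotes the occurrence and development of lung adenocarcinoma through EMT.
Survival analysis of these five risk ARGs, together with the risk score distribution plot, indicated that patients with low CAMP expression and high ADAM12, DKK1, STRIP2 and TFAP2A expression had higher risk scores and were more likely to have a poor prognosis (P < 0.05). The risk score distribution and Kaplan-Meier survival curves showed that patients with high risk scores had a worse prognosis than those with low risk scores, and the time-dependent AUCs at 1, 2, 3, 5 and 7 years indicated that the model has good sensitivity and specificity for predicting lung adenocarcinoma prognosis. These findings were validated in the external data sets GSE31210 and GSE72094, indicating that the model's predictions are reasonably accurate. We also performed univariate and multivariate Cox regression analyses of the risk score and other clinical predictors, which showed that the risk score has independent prognostic value and can serve as an independent prognostic predictor for LUAD patients.
TNM staging is an internationally recognized clinical prognostic indicator. Although the risk score (P < 0.001) appeared stronger than tumor stage (P=0.018) in the univariate and multivariate Cox analyses, this does not establish that the model predicts better than TNM staging. The model is at a preliminary stage, the study is retrospective and the sample size is small, so large-scale prospective clinical trial data are still needed to verify whether its predictive ability exceeds that of TNM staging. Once the basic experiments are complete, the model might be combined with TNM staging in clinical practice to predict the prognosis of lung adenocarcinoma patients. Correlation analysis between the risk score and clinical characteristics indicated that the risk score is closely associated with T stage, tumor stage and poor prognosis, but the causal relationships among them need further exploration. In addition, the presence or absence of EGFR, ALK or KRAS mutations showed no significant difference in the univariate and multivariate Cox regression analyses or in the correlation analysis with the risk score, a result that may not agree with clinical observations. The likely reason is the small number of samples with mutation information.
In summary, this study used LASSO and Cox regression analysis to construct a prognostic risk score model for lung adenocarcinoma based on an autophagy-related gene signature. The model's predictive performance is stable, and it has independent prognostic value and clinical relevance, so it may help inform individualized diagnosis and treatment of LUAD patients.
Compared with similar studies, the LUAD prognostic risk score model in this study has the following features. First, many similar studies take immunity as their context and construct immune related gene (IRG) risk models to predict the prognosis of LUAD patients. However, few prognostic models have been built on ARGs. Many ARGs regulate the occurrence and development of tumors, the expression of ARGs in tumor tissue holds considerable promise for predicting survival, and these ARGs may serve as new molecular targets. This study therefore took autophagy as its context, screened for ARGs significantly associated with prognosis, and constructed a risk score model containing multiple ARGs to predict the survival of LUAD patients. The risk genes used for modeling may also serve as subjects for basic research on LUAD and as potential therapeutic targets. This study thus helps fill a gap in research on ARG risk score models in LUAD, with the aim of more precise prognostic assessment for LUAD patients and a reference for their personalized treatment. Second, this study validated the reliability of the risk model in two external data sets. Validation of its predictive performance (mean AUC > 0.600) showed that the model also has moderate predictive performance in other independent data sets, whereas the previous studies mentioned above did not validate their models in multiple data sets.
Regrettably, our study still has some limitations. First, all data analyzed in this study came from public databases, and the risk model still needs large-scale clinical trials to evaluate its predictive performance. Second, the risk genes used for modeling have not yet been further validated by in vivo and in vitro experiments.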
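The Kaplan-Meier comparison of high- and low-risk groups that underlies the survival results discussed above can be sketched as follows. This is a minimal pure-Python product-limit estimator written for illustration; it is not the authors' code (the study used R), and the function name and the toy follow-up data in the usage note are hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct event time (product-limit estimator).

    `events` holds 1 for an observed death and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= removed
    return curve
```

Calling `kaplan_meier` once per risk group, for example on made-up follow-up times `[1, 2, 3, 4]` with events `[1, 0, 1, 1]`, yields one step curve per group. A log-rank test, not shown here, would be the usual way to compare the two curves.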