Literature DB >> 34854250

A functional methylation signature to predict the prognosis of Chinese lung adenocarcinoma based on TCGA.

Ke Wang1,2, Ying Liu2, Guanzhong Lu1, Jinrong Xiao2, Jiao Huang2, Lin Lei3, Ji Peng3, Yangkai Li4, Sheng Wei2.   

Abstract

BACKGROUND: Lung cancer is the leading cause of cancer morbidity and mortality worldwide, however, the individualized treatment is still unsatisfactory. DNA methylation can affect gene regulation and may be one of the most valuable biomarkers in predicting the prognosis of lung adenocarcinoma. This study was aimed to identify methylation CpG sites that may be used to predict lung adenocarcinoma prognosis.
METHODS: The Cancer Genome Atlas (TCGA) database was used to detect methylation CpG sites associated with lung adenocarcinoma prognosis and construct a methylation signature model. Then, a Chinese cohort was carried out to estimate the association between methylation and lung adenocarcinoma prognosis. Biological function studies, including demethylation treatment, cell proliferative capacity, and gene expression changes in lung adenocarcinoma cell lines, were further performed.
RESULTS: In the TCGA set, three methylation CpG sites were selected that were associated with lung adenocarcinoma prognosis (cg14517217, cg15386964, and cg18878992). The risk of mortality was increased in lung adenocarcinoma patients with the gradual increase level of methylation signature based on three methylation sites levels (HR = 45.30, 95% CI = 26.69-66.83; p < 0.001). The C-statistic value increased to 0.77 when age, gender, and other clinical variables were added to the signature to prediction model. A similar situation was confirmed in Chinese lung adenocarcinoma cohort. In the biological function studies, the proliferative capacity of cell lines was inhibited when the cells were demethylated with 5-aza-2'-deoxycytidine (5-aza-2dC). The mRNA and protein expression levels of SEPT9 and HIST1H2BH (cg14517217 and cg15386964) were downregulated with different concentrations of 5-aza-2dC treatment, while cg18878992 showed the opposite result.
CONCLUSION: This study is the first to develop a three-CpG-based model for lung adenocarcinoma, which is a practical and useful tool for prognostic prediction that has been validated in a Chinese population.
© 2021 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.

Entities:  

Keywords:  DNA methylation; biological mechanism; lung adenocarcinoma; prediction model; prognosis

Mesh:

Substances:

Year:  2021        PMID: 34854250      PMCID: PMC8704183          DOI: 10.1002/cam4.4431

Source DB:  PubMed          Journal:  Cancer Med        ISSN: 2045-7634            Impact factor:   4.452


INTRODUCTION

Lung cancer is the leading cause of cancer morbidity and mortality worldwide, contributing to 13.0% of newly diagnosed cancers and 19.4% of cancer‐related deaths.  Moreover, the percent has risen to 17.1% of new cases and 21.7% of cancer deaths due to lung cancer in China. As one of the most important pathological types of lung cancer, in recent years, individualized lung adenocarcinoma treatments, including EGFR TKIs and EML4‐ALK, have been highly effective for the treatment of patients with EGFR mutations or harboring EML4‐ALK fusions. , Although these recent developments in targeted treatment have been used for lung adenocarcinoma, patients without such mutations generally lack sensitivity to these targeted treatments compared to those with mutations, leading to an unsatisfactory 5‐year survival rate.  Thus, continued studies to improve the efficacy of treatment for lung adenocarcinoma are crucial. Exploring the molecular and epigenetic features of lung adenocarcinoma during tumor occurrence and development may be beneficial for lung adenocarcinoma therapy. The unsatisfactory nature of individualized therapy reflects the need to elucidate underlying mechanisms other than gene alterations to promote the treatment for lung cancer. There have been studies reported that epigenetic changes are closely associated with lung cancer tumorigenesis besides genetic alteration. , As one important mechanism of epigenetic changes, DNA methylation can not only affect chromatin construction, but also closely related to gene expression regulation system. Abnormal methylation status can lead to disturbance of several genes, and has been shown to correlate with the potential aggressiveness of tumors. Recent studies have reported that aberrant DNA methylation often happens in early lung cancer. , , In addition, some studies have found a phenomenon of abnormal methylation status at certain CpG sites in some lung cancer tissues. , ,  Tang et al. found that compared with lung cancer patients with an unmethylated DAPK gene promoter, patients with abnormal hypermethylation had a worse prognosis. Other studies have showed that the levels of methylation CpG sites located on the gene promoter CpG islands in CHD13, RASSF1A, and BCAT1 are higher in lung cancer tumor tissue than in adjacent normal tissue. ,  Thus, the methylation level of certain CpG sites may represent potential molecular markers that may have an important effect in prognosis prediction of lung cancer patients. However, considering the lack of genome‐wide DNA methylation analyses with multiple samples, there are only a few studies that have explored which DNA methylation markers are most valuable in predicting the prognosis of lung adenocarcinoma. In consideration that TCGA database can provide large omics data, this study first analyzed the DNA methylation patterns of lung adenocarcinoma patients, identified differentially methylation CpG sites (DMCs), and then screened for methylation CpG sites associated with lung adenocarcinoma prognosis. A methylation signature was subsequently identified that can be directly incorporated into a clinical test for prognostic prediction. A Chinese lung adenocarcinoma cohort was used to validate the correlation between DNA methylation previously identified in TCGA set and lung adenocarcinoma prognosis. Moreover, the biological function of methylation CpG sites was further performed in cell lines.

METHODS

TCGA data of lung adenocarcinoma patients

The omics data in TCGA for lung adenocarcinoma were used in this study, including methylation, mRNA expression, and clinical information (https://portal.gdc.cancer.gov/). The information of DNA methylation (level three) was generated using the Illumina Human Methylation 450 platform for all samples on 1 October 2017. The gene expression data (level three) were generated from the Illumina HiSeq RNA‐seq V2 platform. The clinical data were downloaded for survival analysis at the same time. In total, 418 patients for whom both TCGA methylation array and clinical data were available and 26 adjacent normal samples from the TCGA dataset were included in this study.

Screening of methylation CpG sites associated with the prognosis of lung adenocarcinoma

According to TCGA database, 48,2421 CpG probes were collected in the Illumina Human Methylation 450 of lung adenocarcinoma in total. And we removed 8,8621 probes showed “NA” values in more than 25% samples and ultimately obtained 39,3800 CpG probes. The Illumina Human Methylation 450 data of tumor and adjacent normal samples in lung adenocarcinoma were gathered at the same time by the same standards. DMCs between tumor tissue and adjacent tissue were defined as false discovery rate (FDR) <0.05 for the comparison between tumor tissue methylation level and adjacent tissue methylation level, and the absolute value of log2 (fold change) >2.5. Bioinformatics analysis including GO enrichment analysis and cluster analysis were also performed. LASSO Cox regression method was analyzed to confirm methylation CpG sites associated with lung adenocarcinoma prognosis based on DMCs. A methylation signature was built according to the sum of each methylation CpG level times the regression coefficient of LASSO Cox regression. ,

Chinese lung adenocarcinoma cohort for the validation set

To validate the effect of methylation CpG sites found in the TCGA database on Chinese lung adenocarcinoma prognosis, information for people with pathologically confirmed primary lung adenocarcinoma was collected from Wuhan Tongji Hospital from January 2014 to December 2014. A systematic questionnaire survey was performed to collect basic characteristics information of lung adenocarcinoma patients, such as age, gender, history of smoking, and drinking et al. Clinicopathologic information acquisition including cancer stage, degree of tumor differentiation, histology, chemotherapy, and radiation treatment history before operation were from the individual medical records of each patient. The standard of tumor staging was carried out according to the UICC/AJCC in terms of tumor‐node‐metastasis staging system.  The tumor tissues and some adjacent normal tissues of participants were collected. Patient selection criteria for enrollment are as follows: (1) the diagnosis of histological and pathological for lung adenocarcinoma criteria must meet the criteria of WHO, (2) the patient had no record of having received any other anti‐cancer treatment before surgery, (3) the basic characteristics information and clinicopathologic information is available, and (4) the patients can be kept in touch during follow‐up.

Follow‐up of lung adenocarcinoma patients

All the participants included in this study were followed up every 3 to 4 months in the first 2 years and every 6 to 7 months thereafter. The follow‐up process was mainly conducted by trained and experienced doctors, which included the state of life (alive or dead), the treatment (chemotherapy or radiation therapy) and recurrence and metastasis after operation. Follow‐up information on the therapeutic efficacy and patient prognosis was collected through interviews, telephone calls, and questionnaires. The interval between diagnosis date and death date or the date of last follow‐up was defined as survival time. Participants were followed until December 2017.

DNA isolation and pyrosequencing

Total DNA from tumor samples and adjacent normal samples were isolated referring to the instructions of DNA isolation kit (TIANGEN, Beijing, China). DNA Bisulfite Conversion Kit (QIAGEN, Hilden, Germany) was used for sodium bisulfite treatment of extracted DNA and then pyrosequenced. PyroMark Assay Design 2.0 (QIAGEN, Hilden, Germany) was used to design primers for each methylation CpG site. The experiment conditions of PCR amplification were 40 cycles containing denaturation at 98°C for 10 s, annealing at 55.0°C for 30 s, and extension at 72°C for 30 s, with PyroMark PCR Kit (QIAGEN, Hilden, Germany). Pyrosequencing was used to quantify CpG methylation levels based on the PCR products with manufacturer's instructions of PyroMark Q96 instrument (QIAGEN, Hilden, Germany).

RNA isolation and quantitative RT‐PCR

Total RNA from patient samples was isolated referring to the instructions of TRIzol reagent (Invitrogen, Carlsbad, USA). Primer design of target genes was performed with Primer Premier 5.0. Pre‐designed random primers were used to reverse transcribe total RNA (500 ng) with a volume of 10 µl, referring to PrimeScript RT Reagent Kit instructions (TaKaRa, Dalian, China). SYBR Green Real‐time PCR (RT‐PCR) was applied to measure gene (ABI [Foster, USA] 7500). The mRNA expression level of target gene was calculated with the 2 × 2−△△Ct method. In addition, △Ct was defined as the difference between the level in target gene and reference gene (GAPDH) in threshold cycles, and △△Ct value for one gene was calculated by the difference value between △Ct of sample and △Ct of standard. The specific primer sequences of the target genes and GAPDH are listed in Supplementary materials.

Protein isolation and Western blotting

Samples of lung adenocarcinoma participants were gathered for protein detection. RIPA buffer with protease inhibitor cocktail (Roche, Basel, Switzerland) was used for sample preparation and the supernatants were then resuspended in sample buffer. After denaturation of the samples by the way of boiling for 10 min, 30% polyacrylamide gel was used for protein electrophoresis with the same amount (40 μg). Whereafter, PVDF membrane was used for protein transfer (Millipore, Massachusetts, USA). Blocking buffer was used to block membranes for 2 h, then the following antibodies were used SEPT9 (Abcam [Cambridge, USA] ab38314, 1:500), HIST1H2BH (Abcam ab1790, 1:1000); MAPT (Abcam ab64193, 1:500), GAPDH (Goodhere [Hangzhou, China] AB‐P‐R 001, 1:1000), and finally overnight under the condition of 4°C. The membranes were in a state of HRP‐linked secondary antibodies (Boster [Wuhan, China] BA1054, 1:50,000) for 2 h after cleaned by TBST for three times on the following day. An electrochemiluminescence (ECL) detection kit (Thermo [Waltham, USA] NCI5079) was used for immunoreactive protein bands observation at the completion of washing. Azure c300 Gel Imaging System (Azure Biosystems, Dublin, USA) and BandScan software was used to scan and quantify protein bands. Three times have been done for all of the experiments.

Cell lines

Lung adenocarcinoma cell lines A549 and NCI‐H1975, as well as immortalized human bronchial epithelial cells MRC‐5 were acquired from American Type Culture Collection (ATCC, Manassas, USA), and then identified not longer than 6 months after cell passage by ATCC for this study. A549, NCI‐H1975, and MRC‐5 were maintained in Dulbeccoʼs modified Eagle medium (Invitrogen, Carlsbad, USA) including fetal bovine serum (10%) and penicillin–streptomycin (0.1 mmol/L, 5%) at an environment of 37 °C and 5% CO2.

Treatment with 5‐aza‐2dC

Cells were seeded at a density of 1.0×104/ml 24 h before treatment. Cells were demethylated with 5‐aza‐2dC (Sigma‐Aldrich, St Louis, USA) at a concentration gradient of 0, 1, 5, and 10 μmol/L for 6 days. Culture medium containing 5‐aza‐2dC was changed every 24 h. Total RNA was isolated and used for RT‐qPCR. For cell growth assays, the cell lines were then dealt with different concentration gradients of 5‐aza‐2dC, and an MTT assay were performed with three multiple holes.

Statistical analysis

Population characteristics are showed as the mean ± standard deviation (SD) and counts, respectively. In order to compare the differences of quantitative data between groups, Student's t‐test, one‐way ANOVA as well as Mann–Whitney U test were performed and for qualitative data comparison, Pearson chi‐squared test was analyzed. To assess the association between DNA methylation and lung adenocarcinoma prognosis, unconditional Cox regression method was performed. The methylation signature was analyzed in all lung adenocarcinoma patients of TCGA dataset and Chinese dataset, and the participants were defined as high‐ and low‐risk groups based on a cut off value of median. Grouping analyses were performed to evaluate the association between methylation CpG sites and lung adenocarcinoma risk. The prognostic prediction efficiency of the methylation signature was estimated with Harrell's C‐statistic. The correlation between DNA methylation and gene expression or gene expression level and protein level was analyzed using a linear correlation model. p value less than 0.05 was supposed to be significantly different (two sided). GraphPad Prism 5 software was performed to generate pictures. SAS version 9.4 (SAS Institute, Cary, USA) was used for data analyses.

RESULTS

The discovery set (TCGA database)

Characteristics of the lung adenocarcinoma patients and screening for methylated CpG sites

The flow chart indicating the study design in the discovery set based on the TCGA set is shown in Figure 1. A total of 418 lung adenocarcinoma patients with information on both TCGA methylation array and clinical data were included. The clinical characteristics of lung adenocarcinoma based on TCGA set are presented in Table 1. As shown, 195 patients were males (46.7%) and the mean age was 65.10 (± 10.14) years. A total of 339 subjects reported a history of smoking (84.8%). During the follow‐up, 102 (24.4%) patients died, and 316 (75.6%) patients were still alive. The mean survival time of all participants was 6.75 (1.60–22.28) months.
FIGURE 1

Flow chart of the study design. Twenty‐six paired tumor and adjacent normal samples of lung adenocarcinoma patients were used for candidate methylation CpG islands screening with Illumina Human Methylation 450 platform in TCGA set. Moreover, LASSO Cox regression method was used to build a methylation signature model in the validation set. In addition, the three‐CpG‐based prognostic signature was validated in a Chinese dataset. Biological function study of methylation CpG sites in lung adenocarcinoma cell lines was further performed

TABLE 1

Baseline characteristics of lung adenocarcinoma based on TCGA set (n = 418)

Characteristic a Results
Age (years)65.10 ± 10.14
Gender
Male195 (46.7%)
Female223 (53.3%)
Race
White336 (80.4%)
Others82 (19.6%)
History of smoking
Yes339 (84.8%)
No61 (15.2%)
Clinical stage
Stage I227 (55.0%)
Stage II99 (24.0%)
Stage III67 (16.2%)
Stage IV20 (4.8%)
Chemotherapy history
Yes152 (36.4%)
No266 (63.6%)
Radiotherapy history
Yes88 (21.1%)
No330 (78.9%)
Death
Yes102 (24.4%)
No316 (75.6%)
Survival time (month)6.75 (1.60–22.28)

Abbreviation: TCGA, The Cancer Genome Atlas.

Characteristics are showed as the mean ± standard deviation or median (quartile1–quartile3) for quantitative data, and number (percentage) for categorical data.

Flow chart of the study design. Twenty‐six paired tumor and adjacent normal samples of lung adenocarcinoma patients were used for candidate methylation CpG islands screening with Illumina Human Methylation 450 platform in TCGA set. Moreover, LASSO Cox regression method was used to build a methylation signature model in the validation set. In addition, the three‐CpG‐based prognostic signature was validated in a Chinese dataset. Biological function study of methylation CpG sites in lung adenocarcinoma cell lines was further performed Baseline characteristics of lung adenocarcinoma based on TCGA set (n = 418) Abbreviation: TCGA, The Cancer Genome Atlas. Characteristics are showed as the mean ± standard deviation or median (quartile1–quartile3) for quantitative data, and number (percentage) for categorical data. The screening of methylation CpG sites was based on the TCGA set. A total of 14,9486 methylated CpG sites had an FDR < 0.05 in the comparison between the tumor tissue methylation level and adjacent tissue methylation level. Moreover, 1291 methylation CpG sites defined as DMCs had an absolute value of log2 (fold change) greater than 2.5. A heatmap showing the top 100 differentially methylation CpGs in lung adenocarcinoma tumor and adjacent nontumor tissues (26 samples) is presented in Figure 2A. The distributions between hyper‐ and hypomethylated CpG sites appeared to be asymmetric, and the tumor samples tended to have hypermethylated CpG sites. GO enrichment analysis was performed using the top 100 DMCs, and the results of enrichment in biological process (BP), cellular component (CC), and molecular function (MF) are presented in Figure 2B.
FIGURE 2

Bioinformatics analysis based on the TCGA set. (A) Heatmap showing the top 100 differentially methylation CpGs in 26 paired tumor and adjacent normal tissues of lung adenocarcinoma. (B) GO enrichment analysis using the top 100 differentially methylation CpGs in 26 paired samples. (C) and (D) LASSO Cox regression analysis based on the TCGA set. (C) LASSO coefficient information of candidate methylation CpG islands. (D) Two dotted vertical lines were drawn in the optimal criteria of 1‐s.e. and with a result of three nonzero coefficients. Three CpGs, cg14517217, cg15386964, and cg18878992, with coefficients of 0.45, 0.04, and −0.17, were eventually incorporated into the model

Bioinformatics analysis based on the TCGA set. (A) Heatmap showing the top 100 differentially methylation CpGs in 26 paired tumor and adjacent normal tissues of lung adenocarcinoma. (B) GO enrichment analysis using the top 100 differentially methylation CpGs in 26 paired samples. (C) and (D) LASSO Cox regression analysis based on the TCGA set. (C) LASSO coefficient information of candidate methylation CpG islands. (D) Two dotted vertical lines were drawn in the optimal criteria of 1‐s.e. and with a result of three nonzero coefficients. Three CpGs, cg14517217, cg15386964, and cg18878992, with coefficients of 0.45, 0.04, and −0.17, were eventually incorporated into the model

Methylation CpG sites associated with lung adenocarcinoma prognosis

LASSO Cox regression method was analyzed to confirm methylation CpG sites associated with lung adenocarcinoma prognosis. Three significant methylation CpG sites were included in the final model: cg14517217, cg15386964, and cg18878992 (Figure 2C,D), the meaningful target gene of each methylation CpG site was SEPT9, HIST1H2BH, MAPT, respectively. The gene annotation information of three methylation CpG islands selected by LASSO Cox regression in the Human Methylation 450 Array of TCGA datasets is shown in Table S1. The regression coefficients of three methylation CpG islands in the optimal LASSO Cox regression model were 0.45, 0.04, and −0.17 (Table S2). According to the optimal LASSO Cox regression model, a methylation signature for every lung adenocarcinoma participant was built: methylation signature = (0.45 × cg14517217) + (0.04 × cg15386964) − (0.17 × cg18878992). The participants were grouped into high‐risk and low‐risk, based on a cut off value defined as the median value of methylation signature. In TCGA set, the cut off value was 0.02. The identified CpG islands and methylation signature levels grouped by clinical and pathological characteristics of lung adenocarcinoma patients in TCGA set are shown in Table S3. As shown, compared with the subjects without a history of smoking, the level of methylation signature was significantly higher in smokers (0.06 ± 0.10 vs. 0.02 ± 0.07, p = 0.010). The similar trend was shown in objects with different clinical disease stages (0.04 ± 0.09 vs. 0.07 ± 0.11, p = 0.007).

Association between methylation signature and lung adenocarcinoma prognosis

Cox regression model was used to analyze the association between methylation signature and lung adenocarcinoma prognosis (Table 2). After adjustment for basic clinical variables, the risk of mortality was increased with the gradual increase level of methylation signature (HR = 45.30, 95% CI = 26.69–66.83; p < 0.001). When the subjects were grouped into high‐risk and low‐risk, subjects in high‐risk group showed a 123% higher risk of mortality in comparison with low‐risk subjects (HR = 2.23, 95% CI = 1.41–3.54; p < 0.001).
TABLE 2

The association between methylation signature level and the prognosis of lung adenocarcinoma in TCGA set and Chinese cohort

PopulationCrude HR (95% CI)Adjusted HR (95% CI) a Adjusted HR (95% CI) b
TCGA set
Methylation signature33.80 (20.43–47.54)**54.90 (38.72–75.60)*45.30 (26.69, 66.83)**
Subgroup
Low riskRefRefRef
High risk2.33 (1.55–3.51)*2.35 (1.51–3.67)**2.23 (1.41, 3.54)**
Chinese cohort
Methylation signature5.06 (2.08–8.34)**4.85 (1.84–9.87)*8.81 (4.72, 18.51)**
Subgroup
Low riskRefRefRef
High risk1.26 (1.16–3.87)*1.50 (1.16–4.91)*8.52 (5.52, 10.58)*

*p < 0.05; **p < 0.001.

Abbreviations: 95% CI, 95% confidence interval; HR, hazard ratio; TCGA, The Cancer Genome Atlas.

Adjustment for age, gender, race, and smoking status.

Adjustment for age, gender, race, history of smoking, chemotherapy history, radiotherapy history, and clinical stage.

The association between methylation signature level and the prognosis of lung adenocarcinoma in TCGA set and Chinese cohort *p < 0.05; **p < 0.001. Abbreviations: 95% CI, 95% confidence interval; HR, hazard ratio; TCGA, The Cancer Genome Atlas. Adjustment for age, gender, race, and smoking status. Adjustment for age, gender, race, history of smoking, chemotherapy history, radiotherapy history, and clinical stage. In subgroup analysis, the methylation signature showed a more pronounced protective effect on the younger group, females, nonsmokers, and those who had received postoperative chemotherapy (Table 3). We also demonstrated that the association between methylation signature and the prognosis of lung adenocarcinoma had some differences due to the characteristics of tumor. The methylation signature showed a stronger protective effect on early‐stage patients (stage I) than late‐stage patients (stage II/III/IV).
TABLE 3

The association between methylation signature and the prognosis of lung adenocarcinoma in TCGA set and Chinese cohort by subgroup

CharacteristicsHR (95% CI) a HR (95% CI) b
Age (years) c
<6528.36 (22.17–48.99)*1.77 (0.87–3.96)
≥6538.52 (32.60–40.08)**6.61 (1.85–9.59)*
Gender d
Male19.02 (15.12–24.18)*8.01 (5.87–14.33)*
Female20.53 (11.56–29.71)*10.99 (3.27–17.51)*
History of smoking d
Yes49.49 (37.15–62.38)*57.47 (43.51–71.24)**
No27.13 (10.02–32.41)*15.84 (10.08–21.48)*
Clinical stage d
Stage I49.71 (31.68–67.40)*4.26 (1.24–7.59)*
Stage II/III/IV58.58 (35.14–67.90)**12.64 (10.21–15.88)**
Chemotherapy history d
Yes10.42 (6.29–17.97)**1.34 (0.38–4.49)
No26.07 (22.37–35.00)*9.46 (3.60–14.87)*

*p < 0.05; **p < 0.001.

Abbreviations: 95% CI, 95% confidence interval; HR, hazard ratio; TCGA, The Cancer Genome Atlas.

HR (95%CI) in TCGA set.

HR (95%CI) in Chinese lung adenocarcinoma patients.

Adjustment for gender, race, history of smoking, chemotherapy history, radiotherapy history, and clinical stage.

Adjustment for other variables.

The association between methylation signature and the prognosis of lung adenocarcinoma in TCGA set and Chinese cohort by subgroup *p < 0.05; **p < 0.001. Abbreviations: 95% CI, 95% confidence interval; HR, hazard ratio; TCGA, The Cancer Genome Atlas. HR (95%CI) in TCGA set. HR (95%CI) in Chinese lung adenocarcinoma patients. Adjustment for gender, race, history of smoking, chemotherapy history, radiotherapy history, and clinical stage. Adjustment for other variables.

The prognostic prediction efficiency of the methylation signature

The prognostic prediction efficiency of the methylation signature was estimated with Harrell's C‐statistic (Table 4). It was shown that the C‐statistic value was 0.66 (95% CI = 0.62–0.69) with the methylation signature alone used to evaluate the prediction efficiency on the prognosis of lung adenocarcinoma. When age, gender, race, and other clinical variables were included to make a prediction model, the C‐statistic value showed a marked increase to 0.77 (95% CI = 0.73–0.81). The C‐statistic value increased by 16.7 percent overall.
TABLE 4

The prognostic prediction efficiency of methylation signature and prediction model in TCGA set and Chinese cohort

PopulationC‐statistic95% CI
TCGA set
Methylation signature0.660.62–0.69
Prediction model a 0.770.73–0.81
Chinese cohort
Methylation signature0.610.57–0.68
Prediction model a 0.730.70–0.79

Abbreviations: 95% CI, 95% confidence interval; TCGA, The Cancer Genome Atlas.

Prediction model was built based on methylation signature and some clinical variables (age, gender, race, history of smoking, chemotherapy history, radiotherapy history, and clinical stage).

The prognostic prediction efficiency of methylation signature and prediction model in TCGA set and Chinese cohort Abbreviations: 95% CI, 95% confidence interval; TCGA, The Cancer Genome Atlas. Prediction model was built based on methylation signature and some clinical variables (age, gender, race, history of smoking, chemotherapy history, radiotherapy history, and clinical stage). In addition, we analyzed the correlation between DNA methylation and target gene expression (Figure S1). mRNA levels of SEPT9 and HIST1H2BH in tumor tissues were significantly higher than those in adjacent normal tissues, while that of MAPT was lower (p < 0.05). The correlation coefficients between DNA methylation and gene expression of these three methylation CpG sites were 0.32, 0.71, and −0.35, respectively (all p values < 0.001).

The validation set (Chinese lung adenocarcinoma cohort)

Characteristics of lung adenocarcinoma patients in the Chinese population

One hundred and sixteen Chinese participants diagnosed with lung adenocarcinoma were incorporated into validation study set. One hundred and sixteen tumor tissues and 30 adjacent normal tissues were collected and used in the subsequent study. The basic characteristics of lung adenocarcinoma patients are presented in Table S4. As shown, 56 patients were males (48.3%) with a mean (± SD) age of 60.27 (± 8.57) years. Forty‐five subjects reported a history of smoking (38.8%). During the follow‐up, 20 (17.2%) participants died, and 96 (82.8%) participants were still alive. The mean survival time of all participants was 24.18 (16.71–27.00) months.

The level of methylation CpG sites associated with lung adenocarcinoma prognosis

Pyrosequencing was used to quantify the CpG site methylation levels associated with lung adenocarcinoma prognosis found in TCGA set. The sequences of all primers are shown in Table S5. The methylation levels of cg14517217, cg15386964, and cg18878992 were detected using pyrosequencing in 116 tumor tissues and 30 adjacent normal tissues (Figure S2). Compared with adjacent normal tissue, DNA methylation levels of three CpG sites were differently higher in tumor tissue (0.09 ± 0.12 vs. 0.01 ± 0.00, p < 0.001 for cg14517217; 0.15 ± 0.13 vs. 0.03 ± 0.01, p < 0.001 for cg15386964 and 0.08 ± 0.09 vs. 0.03 ± 0.01, p = 0.005 for cg18878992, respectively), as shown in Figure S3 (left panel). The identified CpG islands and methylation signature levels grouped by the basic characteristics of Chinese lung adenocarcinoma are shown in Table S6. As shown, compared with the subjects without a history of smoking, the level of methylation signature was significantly higher in smokers (0.15 ± 0.46 vs. −0.02 ± 0.28, p = 0.013). The similar trend was also shown for patients in different age groups and clinical stages. The association between methylation signature and lung adenocarcinoma prognosis in Chinese cohort is shown in Table 2. After adjustment for basic clinical variables, the prognosis of Chinese patients worsened with increasing methylation signature levels (HR = 8.81, 95% CI = 4.72–18.51; p < 0.001). When subjects were grouped into high‐risk and low‐risk, subjects in high‐risk group had a 7.52 times higher risk of death than low‐risk subjects (HR = 8.52, 95% CI = 5.52–10.58; p = 0.002). We also performed a stratified analysis in the validation set, as shown in Table 3. Similar to the results in the discovery set, the methylation signature showed a more pronounced protective effect on the younger group, nonsmokers, and those who had received postoperative chemotherapy. Moreover, the association between methylation signature and the prognosis of lung adenocarcinoma had some differences due to the characteristics of tumor. The methylation signature showed a stronger protective effect on early‐stage patients (stage I) than late‐stage patients (stage II/III/IV). The prognostic prediction efficiency of the methylation signature was also estimated in Chinese lung adenocarcinoma cohort (Table 4). It was shown that the C‐statistic value was 0.61 (95% CI = 0.57–0.68) when only methylation signature was used to evaluate the prediction efficiency on lung adenocarcinoma prognosis. When age, gender, and other clinical variables were included in the prediction model, the C‐statistic value showed a marked increase to 0.73 (95% CI = 0.70–0.79). The C‐statistic increased by 19.7 percent overall. In addition, RT‐PCR as well as Western blotting were conducted to measure the level of gene expression and protein expression of target genes (SEPT9, HIST1H2BH, and MAPT). The specific primer sequences of the target genes and GAPDH are listed in Table S7. In the analyses to validate the correlation between DNA methylation and gene expression as well as gene expression and protein expression, it was found that mRNA levels of SEPT9 and HIST1H2BH in tumor tissues were significantly higher than those in adjacent normal tissue, while that of MAPT was lower (p < 0.05). The correlation coefficients between DNA methylation and gene expression of these three methylated CpG sites were 0.31 for cg14517217, 0.41 for cg15386964, and −0.28 for cg18878992. The same result was seen in the correlation between gene expression and protein expression (data not shown). Furthermore, we have found that the mRNA levels of SEPT9 and HIST1H2BH were higher in tumor tissue in comparison with adjacent tissue in 30 lung adenocarcinoma patients and lower in MAPT (Figure S3, right panel).

Biological function studies

The influence of 5‐aza‐2dC on the methylation level

Human lung adenocarcinoma A549 cells, NCI‐H1975 cells, and immortalized human bronchial epithelia MRC‐5 cells were cultured in vitro. Different concentrations of 5‐aza‐2dC were added to these three cell lines: 0 μmol/L (control group), 1 μmol/L (low‐demethylated group), 5 μmol/L (middle‐demethylated group), and 10 μmol/L (high‐demethylated group). Pyrosequencing was conducted to validate CpG methylation levels in cell lines. As shown in Figure 3, in the control groups, the MRC‐5 cell line was slightly methylated with low methylation levels of cg14517217, cg15386964, and cg18878992, while the A549 cell line was highly methylated, with high methylation levels of cg14517217 and cg15386964. The NCI‐H1975 cell line was also highly methylated, with high methylation levels of cg14517217 and cg18878992. Without treatment, cg18878992 was slightly methylated in A549 and the methylation level of cg15386964 was slightly methylated in NCI‐H1975. Thus, the following study focused on cg14517217 in A549, NCI‐H1975, and MRC‐5, cg15386964 in A549 and MRC‐5, and cg18878992 in NCI‐H1975 and MRC‐5. The effect of demethylation via 5‐aza‐2dC treatment was not obvious in MRC‐5, while in A549 and NCI‐H1975, the methylation level of the three methylated CpG sites decreased with the increasing concentration of 5‐aza‐2dC.
FIGURE 3

The influence of 5‐aza‐2dC on the methylation level in different cell lines. Human lung adenocarcinoma A549 cells, NCI‐H1975 cells, and immortalized human bronchial epithelia MRC‐5 cells were cultured in vitro. Different concentrations of 5‐aza‐2dC were added to these three cell lines: 0 μmol/L (control group), 1 μmol/L (low‐demethylated group), 5 μmol/L (middle‐demethylated group), and 10 μmol/L (high‐demethylated group). Pyrosequencing was conducted to validate the CpG methylation levels

The influence of 5‐aza‐2dC on the methylation level in different cell lines. Human lung adenocarcinoma A549 cells, NCI‐H1975 cells, and immortalized human bronchial epithelia MRC‐5 cells were cultured in vitro. Different concentrations of 5‐aza‐2dC were added to these three cell lines: 0 μmol/L (control group), 1 μmol/L (low‐demethylated group), 5 μmol/L (middle‐demethylated group), and 10 μmol/L (high‐demethylated group). Pyrosequencing was conducted to validate the CpG methylation levels

The proliferative capacity of cells after 5‐aza‐2dC treatment

MTT was performed to detect the proliferative capacity after dealing with 5‐aza‐2dC in different concentration gradients. The proliferative capacity of cell lines was inhibited when the cells were demethylated with 5‐aza‐2dC, especially in A549 and NCI‐H1975 cells (Figure 4). The proliferation rate of cell lines significantly decreased with the concentration of 10 μmol/L compared with 5 μmol/L, 1 μmol/L 5‐aza‐2dC treatment and control group after 6 days.
FIGURE 4

Cell growth analysis via the MTT assay. A549, NCI‐H1975, and MRC‐5 cells were demethylated with 5‐aza‐2dC at different concentrations (0 μmol/L, 1 μmol/L, 5 μmol/L, and 10 μmol/L). MTT was performed to evaluate cell growth for 6 days. *p < 0.05

Cell growth analysis via the MTT assay. A549, NCI‐H1975, and MRC‐5 cells were demethylated with 5‐aza‐2dC at different concentrations (0 μmol/L, 1 μmol/L, 5 μmol/L, and 10 μmol/L). MTT was performed to evaluate cell growth for 6 days. *p < 0.05

Gene expression and protein expression of target genes with 5‐aza‐2dC treatment

RT‐PCR as well as western blotting were conducted to measure the gene and protein levels in cell lines. As shown, 5‐Aza‐2dC treatment had no obvious effect on mRNA level or protein expression level in the MRC‐5 cell line. In A549 and NCI‐H1975 cells, mRNA and protein levels of SEPT9 and HIST1H2BH (cg14517217 and cg15386964) were downregulated with a concentration gradient of 5‐aza‐2dC treatment, while the mRNA and protein levels of MAPT (cg18878992) were upregulated (Figure 5 and Figure S4).
FIGURE 5

The gene expression of three methylation CpG islands in three cell lines by RT‐PCR. (A) RT‐PCR showing the gene expression of three methylation CpG islands in cells demethylated with 5‐aza‐2dC at different concentrations. (B) The gene expression of three methylation CpG islands in cells demethylated with 5‐aza‐2dC. **p < 0.001

The gene expression of three methylation CpG islands in three cell lines by RT‐PCR. (A) RT‐PCR showing the gene expression of three methylation CpG islands in cells demethylated with 5‐aza‐2dC at different concentrations. (B) The gene expression of three methylation CpG islands in cells demethylated with 5‐aza‐2dC. **p < 0.001

DISCUSSION

In this study, we first found three methylation CpG sites which were associated with lung adenocarcinoma prognosis (cg14517217, cg15386964, and cg18878992) based on the TCGA database. The association between methylation signature and lung adenocarcinoma prognosis was explored, and the outcome of multivariate Cox regression model revealed that the risk of mortality increased with the gradual increase level of methylation signature in lung adenocarcinoma patients. Moreover, the prognostic prediction model based on methylation signature showed a satisfactory efficiency for lung adenocarcinoma. Interestingly, DNA methylation of these three CpG sites was significantly correlated with target gene expression. In validation set, the same conclusion was obtained and confirmed in a Chinese lung adenocarcinoma cohort. Exploration of the biological mechanism underlying the methylation of these CpG sites in human lung adenocarcinoma cell lines revealed that changes in the methylation level could affect mRNA and protein expression of target gene. The biological mechanism of methylation in the development of lung adenocarcinoma may occur through the causal relationship between DNA methylation and target gene/protein expression. Building a model by combining multiple biomarkers may significantly improve the predictive ability in comparison with single biomarker. Recently, few genome‐wide studies have focused on the DNA methylation model in lung adenocarcinoma. , ,  Wang et al. have found a nine‑gene methylation signature for lung adenocarcinoma prognosis with the data from TCGA and Gene Expression Omnibus (GEO). Cai et al. have defined a methylation signature including 16 CpG sites for lung adenocarcinoma prognosis with the data from TCGA. Cox regression method was used to confirm methylation CpG sites associated with the prognosis of lung adenocarcinoma.  Wang Y et al. have built a methylation signature including 16 CpG sites for lung adenocarcinoma prognosis with the data from TCGA, regardless of the difference between tumor tissue and adjacent tissue as well as the important clinical characteristics, and with a predict value of 0.68 for the prediction model. However, no population validation and functional experiments were performed in these studies. Moreover, the methods of statistics for sequencing data analysis in some early researches may not suitable. As the most commonly used statistical method for survival analysis, Cox regression method is not applicable when analyzing high‐dimensional sequencing data. , In this study, LASSO Cox model was performed as an effective way to solve this problem. , In our study, a three‐CpG‐based model was built and can efficiently predict the prognosis of lung adenocarcinoma based on TCGA set. Given the three biomarkers, our model may be more acceptable and more likely to be applied in clinical practice in comparison with other studies. The prediction model in this study has a kind of ability to evaluate the prognosis of lung adenocarcinoma patients accurately. We further performed a validation study in a Chinese cohort, and the prognostic accuracy of the prediction model was found to be still effective. Therefore, the prediction model may have potential prognostic value and is feasible for use in the clinic. This study is the first to report three methylation sites (cg14517217, cg15386964, and cg18878992) associated with lung adenocarcinoma prognosis in Chinese set. cg14517217 is located in the CpG island sequence on chromosome 17 in the gene body region of the SEPT9 (septin9) gene. As a protein‐coding gene, SEPT9 is a member of the membrane protein family and encodes a GTP‐binding protein. Moreover, SEPT9 is associated with the regulation of cell division and the cell cycle and has significant effect on tumor occurrence and development as a tumor oncogene or suppressor gene. Recent studies have found abnormal methylation status of SEPT9 promoter in several cancers and this status can affect the condition of prognosis, especially for colorectal cancer patients. , , However, the effect of aberrant SEPT9 hypermethylation on gene mRNA expression is inconsistent across different studies. , , In this study, cg14517217 showed hypermethylation in the tumor tissue of lung adenocarcinoma, which was significantly correlated with SEPT9 expression. The biological function studies indicated that mRNA and protein expression of SEPT9 were downregulated in lung adenocarcinoma cell lines after dealing with 5‐aza‐2dC in different concentration gradients. This suggests that cg14517217 may have a certain effect on the occurrence and development of lung adenocarcinoma through positively regulating the expression of SEPT9. A similar trend was seen in cg15386964, which may play a vital role in lung adenocarcinoma by upregulating the expression of HIST1H2BH. , , In contrast, cg18878992 was found to be negatively correlated with MAPT mRNA expression, showing that cg18878992 may influence the development of lung adenocarcinoma by downregulating the expression of MAPT , , . The functional relationship between these three methylated CpG sites and lung adenocarcinoma remains to be explored. In this study, systematic exploration of methylated CpG sites was performed to predict lung adenocarcinoma prognosis through TCGA analysis, and the results obtained from the Chinese population and biological function studies provide effective evidence that DNA methylation biomarker can affect the process of occurrence and development of lung adenocarcinoma. As this methylation signature can be performed to evaluate the prognosis both in TCGA and Chinese lung adenocarcinoma, suggesting that this model may provide new genetic factors of lung adenocarcinoma and can help to prognostic predictions in the clinic. However, there are still some limitations in this study. First, only DNA methylation effect was considered for lung adenocarcinoma prognosis study. The development of lung adenocarcinoma may be related to other types of variants. Other types of variants included in subsequent studies will be helpful to know more about the effect of DNA methylation in cancer. Second, the population for validation with Chinese cohort was not large enough. The sample size needs to be increased for further research in order to verify the role of DNA methylation in lung adenocarcinoma.

CONCLUSIONS

In conclusion, this is the first study to employ a three‐CpG‐based model for lung adenocarcinoma, which is a feasible and useful prognostic method in prognostic prediction which was validated in a Chinese cohort. Additional Chinese population studies with larger sizes and biological function studies in animal models are necessary for our finding verification.

CONFLICT OF INTEREST

The authors have no conflict of interest.

ETHICS APPROVAL

This project was approved by the Institutional Review Board of Huazhong University of Science and Technology. Supplementary Material Click here for additional data file.
  42 in total

1.  Cancer statistics, 2019.

Authors:  Rebecca L Siegel; Kimberly D Miller; Ahmedin Jemal
Journal:  CA Cancer J Clin       Date:  2019-01-08       Impact factor: 508.702

2.  Hypermethylation of the death-associated protein (DAP) kinase promoter and aggressiveness in stage I non-small-cell lung cancer.

Authors:  X Tang; F R Khuri; J J Lee; B L Kemp; D Liu; W K Hong; L Mao
Journal:  J Natl Cancer Inst       Date:  2000-09-20       Impact factor: 13.506

Review 3.  Empirical extensions of the lasso penalty to reduce the false discovery rate in high-dimensional Cox regression models.

Authors:  Nils Ternès; Federico Rotolo; Stefan Michiels
Journal:  Stat Med       Date:  2016-03-10       Impact factor: 2.373

4.  The American Joint Committee on Cancer: the 7th edition of the AJCC cancer staging manual and the future of TNM.

Authors:  Stephen B Edge; Carolyn C Compton
Journal:  Ann Surg Oncol       Date:  2010-06       Impact factor: 5.344

5.  Identification of DLC-1 expression and methylation status in patients with non-small-cell lung cancer.

Authors:  Hongxiang Feng; Zhenrong Zhang; Xiaowei Wang; Deruo Liu
Journal:  Mol Clin Oncol       Date:  2015-11-20

6.  Promoter methylation of APC and RAR-β genes as prognostic markers in non-small cell lung cancer (NSCLC).

Authors:  Hongxiang Feng; Zhenrong Zhang; Xin Qing; Xiaowei Wang; Chaoyang Liang; Deruo Liu
Journal:  Exp Mol Pathol       Date:  2015-12-08       Impact factor: 3.362

7.  miR-186 regulates chemo-sensitivity to paclitaxel via targeting MAPT in non-small cell lung cancer (NSCLC).

Authors:  Jinjun Ye; Zhi Zhang; Lei Sun; Ying Fang; Xinyu Xu; Guoren Zhou
Journal:  Mol Biosyst       Date:  2016-10-18

8.  The role of mSEPT9 in screening, diagnosis, and recurrence monitoring of colorectal cancer.

Authors:  Jie Sun; Fei Fei; Mingqing Zhang; Yuwei Li; Xipeng Zhang; Siwei Zhu; Shiwu Zhang
Journal:  BMC Cancer       Date:  2019-05-14       Impact factor: 4.430

9.  Diagnostic and prognostic value of SHOX2 and SEPT9 DNA methylation and cytology in benign, paramalignant, and malignant ascites.

Authors:  Maria Jung; Svenja Pützer; Heidrun Gevensleben; Sebastian Meller; Glen Kristiansen; Dimo Dietrich
Journal:  Clin Epigenetics       Date:  2016-03-01       Impact factor: 6.551

10.  Appraising the causal relevance of DNA methylation for risk of lung cancer.

Authors:  Thomas Battram; Rebecca C Richmond; Laura Baglietto; Philip C Haycock; Vittorio Perduca; Stig E Bojesen; Tom R Gaunt; Gibran Hemani; Florence Guida; Robert Carreras-Torres; Rayjean Hung; Christopher I Amos; Joshua R Freeman; Torkjel M Sandanger; Therese H Nøst; Børge G Nordestgaard; Andrew E Teschendorff; Silvia Polidoro; Paolo Vineis; Gianluca Severi; Allison M Hodge; Graham G Giles; Kjell Grankvist; Mikael B Johansson; Mattias Johansson; George Davey Smith; Caroline L Relton
Journal:  Int J Epidemiol       Date:  2019-10-01       Impact factor: 7.196

View more
  2 in total

1.  The DDR-related gene signature with cell cycle checkpoint function predicts prognosis, immune activity, and chemoradiotherapy response in lung adenocarcinoma.

Authors:  Quan Li; Pan Zhang; Huixiao Hu; Hang Huang; Dong Pan; Guangyun Mao; Burong Hu
Journal:  Respir Res       Date:  2022-07-15

2.  A functional methylation signature to predict the prognosis of Chinese lung adenocarcinoma based on TCGA.

Authors:  Ke Wang; Ying Liu; Guanzhong Lu; Jinrong Xiao; Jiao Huang; Lin Lei; Ji Peng; Yangkai Li; Sheng Wei
Journal:  Cancer Med       Date:  2021-12-02       Impact factor: 4.452

  2 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.