Nan Shen, Jun Du, Hui Zhou, Nan Chen, Yi Pan, Jörg D Hoheisel, Zonghui Jiang, Ling Xiao, Yue Tao, Xi Mo.
Abstract
Lung adenocarcinoma (LUAD) is one of the most common cancers and one of the most lethal diseases worldwide. Recognizing undetermined lung nodules at an early stage favors a good prognosis. However, no reliable method exists to identify undetermined lung nodules or to predict their clinical outcome. Altered DNA methylation is frequently observed in LUAD and may play important roles in carcinogenesis, diagnosis, and prediction. This study used publicly available methylation profiling resources and a machine learning method to investigate methylation differences between LUAD and adjacent non-malignant tissue. The prediction panel was first constructed using 338 tissue samples from LUAD patients, 149 of which were non-malignant. The model was then validated with data from The Cancer Genome Atlas (TCGA) database and with clinical samples. The methylation status of four CpG loci, located in homeobox A9 (HOXA9), keratin-associated protein 8-1 (KRTAP8-1), cyclin D1 (CCND1), and tubby-like protein 2 (TULP2), was highlighted as informative. A random forest classification model with an accuracy of 94.57% and a kappa of 88.96% was obtained. To evaluate this panel for LUAD, the methylation levels of the four CpG loci were tested in tumor samples and matched adjacent lung samples from 25 patients with LUAD. In these patients, methylation of HOXA9 was significantly higher in tumor samples than in adjacent tissues, whereas methylation of KRTAP8-1, CCND1, and TULP2 was markedly lower. Our study demonstrates that the methylation of HOXA9, KRTAP8-1, CCND1, and TULP2 has great potential for the early recognition of LUAD in undetermined lung nodules. The findings also indicate that improved mathematical algorithms can yield accurate, robust, and widely applicable marker panels. This approach could greatly facilitate biomarker discovery in various fields.
Keywords: CCND1; DNA methylation; HOXA9; KRTAP8-1; TULP2; biomarker; lung adenocarcinoma; random forest
Year: 2019 PMID: 31850197 PMCID: PMC6901798 DOI: 10.3389/fonc.2019.01281
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
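The abstract reports both accuracy and kappa for the random forest model. As a minimal sketch of how these two metrics relate for a two-class problem, the following computes them from a 2x2 confusion matrix. The function name and the example counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def accuracy_and_kappa(tp, fp, fn, tn):
    """Return (accuracy, Cohen's kappa) for a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement between prediction and truth (this is the accuracy).
    observed = (tp + tn) / total
    # Agreement expected by chance, from the row and column marginals.
    pred_pos, pred_neg = tp + fp, fn + tn
    true_pos, true_neg = tp + fn, fp + tn
    expected = (pred_pos * true_pos + pred_neg * true_neg) / total**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for illustration only (not data from the study).
acc, kappa = accuracy_and_kappa(tp=45, fp=5, fn=5, tn=45)
print(f"accuracy={acc:.2%}, kappa={kappa:.2%}")
```

Kappa discounts the agreement that would be expected by chance, so it is lower than accuracy whenever chance agreement is above zero.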
Summary of the Gene Expression Omnibus (GEO) datasets.
| GEO dataset | LUAD tumor samples | Non-malignant samples |
| --- | --- | --- |
| GSE32861 | 59 | 59 |
| GSE32866 | 28 | 27 |
| GSE62948 | 28 | 28 |
| GSE63384 | 35 | 35 |
| GSE83845 | 39 | 0 |
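As a quick consistency check, the per-dataset counts above can be summed and compared with the totals given in the abstract (338 samples, 149 of them non-malignant). The sketch below assumes that the first numeric column holds tumor samples and the second holds non-malignant samples; that reading is inferred from the abstract's totals rather than stated in the table itself.

```python
# (tumor, non-malignant) sample counts per GEO series, copied from the table.
# The column meaning is an assumption inferred from the abstract's totals.
geo_counts = {
    "GSE32861": (59, 59),
    "GSE32866": (28, 27),
    "GSE62948": (28, 28),
    "GSE63384": (35, 35),
    "GSE83845": (39, 0),
}

total_tumor = sum(tumor for tumor, _ in geo_counts.values())
total_normal = sum(normal for _, normal in geo_counts.values())
total_samples = total_tumor + total_normal

print(total_tumor, total_normal, total_samples)  # prints: 189 149 338
```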
Differentially methylated probes detected in the Gene Expression Omnibus (GEO) datasets.
| cg26521404 | 1.41 | 7.07E−24 | cg04490714 | 1.03 | 3.67E−17 |
| cg01354473 | 1.12 | 6.87E−23 | cg01009664 | 1.08 | 4.21E−17 |
| cg25720804 | 1.42 | 8.39E−23 | cg25875213 | 1.37 | 1.05E−16 |
| cg15540820 | 1.04 | 2.58E−20 | cg01295203 | 1.04 | 1.28E−16 |
| cg22660578 | 1.10 | 3.78E−20 | cg14458834 | 1.32 | 1.80E−16 |
| cg01381846 | 1.01 | 4.57E−20 | cg07307078 | 1.07 | 3.09E−16 |
| cg12374721 | 1.17 | 4.57E−20 | cg12680609 | 1.02 | 5.88E−16 |
| cg08089301 | 1.55 | 6.89E−20 | cg06151165 | 1.07 | 6.28E−16 |
| cg23290344 | 1.28 | 7.04E−20 | cg05436658 | 1.29 | 1.00E−15 |
| cg07533148 | 1.54 | 7.04E−20 | cg17619823 | 1.09 | 1.85E−15 |
| cg12880658 | 1.29 | 7.96E−20 | cg26963271 | 1.04 | 2.25E−15 |
| cg19456540 | 1.42 | 2.06E−19 | cg21529533 | 1.06 | 3.08E−15 |
| cg13323752 | 1.34 | 2.31E−19 | cg10303487 | 1.13 | 3.28E−15 |
| cg04534765 | 1.23 | 2.50E−19 | cg25691167 | 1.14 | 3.31E−15 |
| cg08118311 | 1.19 | 2.67E−19 | cg26721264 | 1.05 | 3.40E−15 |
| cg02164046 | 1.02 | 2.92E−19 | cg21546671 | 1.16 | 3.76E−15 |
| cg14859460 | 1.09 | 3.29E−19 | cg25574024 | 1.02 | 4.55E−15 |
| cg02008154 | 1.00 | 3.96E−19 | cg15520279 | 1.04 | 1.38E−14 |
| cg18952647 | 1.30 | 4.06E−19 | cg00949442 | 1.02 | 2.37E−14 |
| cg07778029 | 1.05 | 4.13E−19 | cg09516965 | 1.26 | 2.91E−14 |
| cg20959866 | 1.05 | 1.32E−18 | cg16731240 | 1.04 | 3.74E−14 |
| cg22471346 | 1.36 | 2.02E−18 | cg10883303 | 1.03 | 4.78E−14 |
| cg14991487 | 1.21 | 4.14E−18 | cg13912117 | 1.05 | 5.27E−14 |
| cg22881914 | 1.42 | 4.55E−18 | cg00848728 | 1.09 | 5.27E−14 |
| cg24423088 | −1.07 | 4.72E−18 | cg21790626 | 1.13 | 1.20E−13 |
| cg23432345 | 1.20 | 4.72E−18 | cg01805540 | 1.01 | 3.98E−12 |
| cg06760035 | 1.40 | 5.67E−18 | cg09229912 | 1.06 | 4.84E−12 |
| cg18722841 | 1.04 | 5.68E−18 | cg18349835 | 1.10 | 6.70E−12 |
| cg15191648 | 1.04 | 6.75E−18 | cg00062776 | −1.00 | 2.08E−11 |
| cg04048259 | 1.10 | 1.32E−17 | cg19332710 | 1.00 | 1.67E−10 |
| cg17525406 | 1.22 | 1.90E−17 | cg02723533 | −1.04 | 2.70E−10 |
Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment results of differentially methylated genes.
| Term | Gene count | P value |
| --- | --- | --- |
| Negative regulation of transcription from RNA polymerase II promoter | 10 | 2.90E−02 |
| Transcription, DNA-templated | 17 | 3.60E−02 |
| Adenylate cyclase-activating G-protein coupled receptor signaling pathway | 4 | 4.40E−02 |
| Nucleus | 28 | 2.80E−02 |
| Sequence-specific DNA binding | 11 | 2.10E−04 |
| RNA polymerase II regulatory region sequence-specific DNA binding | 8 | 1.20E−04 |
| Transcription factor activity, sequence-specific DNA binding | 12 | 2.80E−03 |
| G alpha (s) signaling events | 5 | 3.00E−02 |
Figure 1. Relationship between the model performance and the feature numbers based on both accuracy and kappa value.
Figure 2. Methylation level (beta value) for the four selected probes in lung adenocarcinomas and non-malignant samples from GEO dataset measured by Illumina Infinium HumanMethylation27 BeadChip.
Figure 3. Analysis of the methylation level of the four selected probes. (A) The beta value for the four selected probes was calculated from the TCGA dataset representing lung adenocarcinomas and non-malignant samples, with the measurements being done on the Illumina Infinium HumanMethylation27 BeadChip platform. (B) Analysis of the TCGA data generated on the Illumina Infinium HumanMethylation450 BeadChip platform.
Figure 4. The methylation level (%) for the four selected probes of the tumors and non-malignant samples validated in 25 patients with LUAD using pyrosequencing (**p < 0.001, ***p < 0.0001).
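The captions for Figures 2 and 3 report methylation as beta values. For readers unfamiliar with the term, the sketch below shows the conventional definition used for Illumina methylation arrays: the methylated signal intensity divided by the sum of methylated and unmethylated intensities plus a small stabilizing offset (commonly 100). It also shows the logit-style M-value that is often used for statistical testing. The function names and example intensities are hypothetical, and the record itself does not state which preprocessing the authors applied.

```python
import math


def beta_value(methylated, unmethylated, offset=100.0):
    """Beta value: fraction of signal from the methylated probe (0 to 1)."""
    if methylated < 0 or unmethylated < 0:
        raise ValueError("intensities must be non-negative")
    # The offset keeps the ratio stable when both intensities are low.
    return methylated / (methylated + unmethylated + offset)


def m_value(beta):
    """M-value: log2 ratio of methylated to unmethylated fraction."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


# Hypothetical intensities for illustration only (not data from the study).
b = beta_value(methylated=3000.0, unmethylated=900.0)
print(round(b, 3), round(m_value(b), 3))
```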