Yan-Mei Dong1, Ming Li2, Qi-En He1, Yi-Fan Tong1, Hong-Zhi Gao3, Yi-Zhi Zhang4, Ya-Meng Wu2, Jun Hu4, Ning Zhang2, Kai Song1.
Abstract
Tobacco exposure is one of the major risk factors for the initiation and progression of lung cancer. The exact underlying mechanisms, however, remain largely unknown. Recently, a growing body of evidence has supported the involvement of DNA methylation in the regulation of gene expression in cancer cells. Identifying tobacco-related signature methylation probes and analyzing their regulatory networks at different molecular levels may be of great help in understanding tobacco-related tumorigenesis. Three independent lung adenocarcinoma (LUAD) datasets were used to train and validate the tobacco exposure pattern classification model. A deep selecting method was proposed and used to identify methylation signature probes from hundreds of thousands of whole-epigenome probes. Then, the BIMC (biweight midcorrelation coefficient) algorithm and SRC (Spearman's rank correlation) analysis were used to identify associated genes at the gene regulation level, and a shortest path tracing method was used at the protein-protein interaction (PPI) level. Afterwards, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis and GO (Gene Ontology) enrichment analysis were used to analyze their molecular functions and associated pathways. In total, 105 probes were identified as tobacco-related DNA methylation signatures. They belong to 95 genes, which are involved in hsa04512 (ECM-receptor interaction), hsa04151 (PI3K-Akt signaling pathway), and other important pathways. At the gene regulation level, 33 genes were found to be highly related to the signature probes by both the BIMC and SRC methods. Among them, FARSB and eight other genes were identified as hub genes in the gene regulatory network. Meanwhile, the PPI network of these 33 genes showed that MAGOH, FYN, and five other genes were the most connected core genes. These results may provide clues toward a clear biological interpretation of the molecular mechanism of tumorigenesis. Moreover, the identified signature probes may serve as potential drug targets for the precision medicine of LUAD.
Year: 2020 PMID: 32420331 PMCID: PMC7201762 DOI: 10.1155/2020/2471915
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. Flowchart of the current research. Cited from “Genetic and Environmental Contributions to Functional Connectivity Architecture of the Human Brain.”
The summary of the clinical information of LUAD samples in all datasets.
| LUAD | Discovery cohort (GSE39279) | Validation cohort (TCGA) | Validation cohort (GSE66836) |
|---|---|---|---|
| Number of samples | 145 | 172 | 121 |
| Age | 40–90 | 33–84 | 39–84 |
| Gender | | | |
| Females | 76 | 91 | 66 |
| Males | 69 | 81 | 55 |
| Smoking history | | | |
| Current | 102 | 105 | 105 |
| Never | 43 | 67 | 16 |
| Stage | | | |
| Stage I | 84 | 89 | 70 |
| Stage II | 24 | 49 | 24 |
| Stage III | 30 | 23 | 25 |
| Stage IV | 7 | 10 | 2 |
| Not applicable | | 1 | |
| Smoking pack-years | | | |
| 0–50 | 99 | 58 | Unknown |
| 51–100 | 35 | 19 | Unknown |
| 101–125 | 2 | 8 | Unknown |
| Not applicable | 9 | 87 | Unknown |
The ages of 4 smokers and 5 nonsmokers in TCGA data are unknown.
List of 105 signature methylation probes.
| Number | Probe | Chromosome | Gene symbol | Feature type |
|---|---|---|---|---|
| 1 | cg00334821 | chr7 | LIMK1 | — |
| 2 | cg00370022 | chr15 | CYP1A1 | N_Shelf |
| 3 | cg00688979 | chr6 | CCHCR1 | — |
| 4 | cg00702638 | chr3 | KIAA1143; KIF15 | Island |
| 5 | cg00976097 | chr5 | AHRR | Island |
| 6 | cg00993400 | chr1 | AL672294.1; ZNF692 | S_Shore |
| 7 | cg01049916 | chr10 | FGFR2 | — |
| 8 | cg01637537 | chr22 | FAM83F | N_Shore |
| 9 | cg01971034 | chr1 | — | N_Shelf |
| 10 | cg02050426 | chr18 | CDH20 | Island |
| 11 | cg02387679 | chr5 | IQGAP2 | Island |
| 12 | cg02498206 | chr3 | BOC | — |
| 13 | cg02826525 | chr2 | ARHGEF33; RP11-173C1.1 | Island |
| 14 | cg02988118 | chr2 | — | N_Shelf |
| 15 | cg03078488 | chr7 | IGF2BP3 | N_Shore |
| 16 | cg03277049 | chr3 | LINC00886 | Island |
| 17 | cg03642695 | chr17 | TEKT3 | — |
| 18 | cg03789372 | chr8 | — | — |
| 19 | cg03806812 | chr2 | — | — |
| 20 | cg03945895 | chr1 | PRDM2 | — |
| 21 | cg03985801 | chr1 | LGR6 | N_Shore |
| 22 | cg04267214 | chr1 | — | Island |
| 23 | cg04616529 | chr16 | CLEC16A | — |
| 24 | cg04865290 | chr3 | TMEM110; TMEM110-MUSTN1 | N_Shelf |
| 25 | cg05033369 | chr1 | FCRLA | — |
| 26 | cg05559381 | chr5 | — | — |
| 27 | cg05575921 | chr5 | AHRR | N_Shore |
| 28 | cg05752786 | chr1 | SYT2 | Island |
| 29 | cg05787209 | chr16 | STX1B | S_Shore |
| 30 | cg05951221 | chr2 | ECEL1P1 | Island |
| 31 | cg06010163 | chr6 | — | — |
| 32 | cg06227763 | chr18 | — | — |
| 33 | cg06540950 | chr5 | — | — |
| 34 | cg06637330 | chr5 | — | N_Shelf |
| 35 | cg07160783 | chr16 | — | — |
| 36 | cg07325233 | chr21 | AP000295.9; IL10RB; IL10RB-AS1 | Island |
| 37 | cg07709148 | chr8 | RP11-486M23.1 | — |
| 38 | cg07813142 | chr2 | SP5 | Island |
| 39 | cg08008475 | chr13 | RNY1P1 | — |
| 40 | cg08374798 | chr20 | COL9A3 | N_Shore |
| 41 | cg08733957 | chr1 | GALE | N_Shore |
| 42 | cg08894131 | chr1 | GJA5 | — |
| 43 | cg09194449 | chr7 | PTPRN2 | Island |
| 44 | cg09278187 | chr1 | FOXJ3 | — |
| 45 | cg09370982 | chr16 | RP11-20I23.1; TBC1D24 | S_Shore |
| 46 | cg09799983 | chr2 | CYP1B1; CYP1B1-AS1 | Island |
| 47 | cg10076730 | chr13 | COL4A2 | N_Shore |
| 48 | cg10354195 | chr10 | LRRC27 | — |
| 49 | cg10385390 | chr1 | PARK7 | S_Shore |
| 50 | cg10413224 | chr7 | BMPER | Island |
| 51 | cg10650290 | chr7 | PTPRN2 | — |
| 52 | cg11545521 | chr11 | PTPRJ | — |
| 53 | cg11751707 | chr2 | CYP1B1; CYP1B1-AS1 | Island |
| 54 | cg11954332 | chr1 | PRRX1 | — |
| 55 | cg12020590 | chr11 | — | — |
| 56 | cg12387247 | chr19 | FCER2 | — |
| 57 | cg13563863 | chr19 | FZR1 | Island |
| 58 | cg13654445 | chr9 | NTRK2 | — |
| 59 | cg13990746 | chr1 | ANKRD45 | Island |
| 60 | cg14270346 | chr9 | RP11-613M10.9; SHB | — |
| 61 | cg14320852 | chr9 | — | — |
| 62 | cg14373988 | chr1 | PEX10 | N_Shore |
| 63 | cg14419740 | chr7 | PTPRN2 | Island |
| 64 | cg15233380 | chr13 | SHISA2 | — |
| 65 | cg15513657 | chr14 | MEG3 | — |
| 66 | cg15585555 | chr1 | RASSF5 | S_Shore |
| 67 | cg15680620 | chr1 | ANKRD45 | S_Shore |
| 68 | cg15922705 | chr6 | COL9A1 | N_Shore |
| 69 | cg16315376 | chr1 | SYT2 | Island |
| 70 | cg16322479 | chr5 | EXOC3; EXOC3-AS1 | S_Shore |
| 71 | cg16377959 | chr5 | LINC01019 | — |
| 72 | cg16840978 | chr8 | RAB11FIP1 | N_Shelf |
| 73 | cg16884847 | chr2 | PRKCE | — |
| 74 | cg17211612 | chr12 | DNAH10 | — |
| 75 | cg17373442 | chr3 | CHST2 | Island |
| 76 | cg17676618 | chr10 | — | — |
| 77 | cg18001059 | chr7 | MRPL32; PSMA2 | Island |
| 78 | cg18713316 | chr1 | KCNN3 | Island |
| 79 | cg18883807 | chr14 | — | — |
| 80 | cg18919659 | chr22 | PACSIN2 | — |
| 81 | cg19341901 | chr7 | — | — |
| 82 | cg19859270 | chr3 | CPOX; GPR15 | — |
| 83 | cg20439473 | chr17 | VEZF1 | N_Shelf |
| 84 | cg20459495 | chr2 | — | — |
| 85 | cg20538211 | chr4 | IGFBP7-AS1 | N_Shelf |
| 86 | cg20546279 | chr7 | — | S_Shore |
| 87 | cg20628376 | chr10 | RP11-351M16.3 | — |
| 88 | cg21012061 | chr1 | ELK4; MFSD4 | — |
| 89 | cg21083936 | chr11 | — | S_Shelf |
| 90 | cg21500300 | chr12 | BCAT1; RP11-662I13.2 | S_Shore |
| 91 | cg21885107 | chr14 | PAPLN; RP4-647C14.2 | Island |
| 92 | cg23369748 | chr6 | SASH1 | — |
| 93 | cg23501962 | chr11 | RP4-683L5.1; SLC1A2 | N_Shore |
| 94 | cg23854567 | chr12 | PXN | — |
| 95 | cg24203542 | chr11 | NAV2 | — |
| 96 | cg24279017 | chr12 | ETV6 | — |
| 97 | cg24772753 | chr2 | SP5 | Island |
| 98 | cg25192619 | chr6 | CCDC167 | Island |
| 99 | cg26005485 | chr8 | FAM135B | N_Shore |
| 100 | cg26029292 | chr8 | ZNF7 | S_Shelf |
| 101 | cg26076054 | chr5 | AHRR | Island |
| 102 | cg26582784 | chr7 | AC002454.1; CDK6 | Island |
| 103 | cg26799398 | chr1 | ECM1; TARS2 | — |
| 104 | cg26972614 | chr11 | IL18BP | — |
| 105 | cg27052537 | chr7 | — | — |
The classification performance obtained by the methylation values of 105 identified signature probes.
| Data | Database | SN | SP | ACC |
|---|---|---|---|---|
| Training cohort | GSE39279 | 0.9804 | 1 | 0.9862 |
| Validation cohort | TCGA | 0.8476 | 0.806 | 0.8314 |
| Validation cohort | GSE66836 | 0.8381 | 0.8125 | 0.8347 |
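The three metrics in this table are linked: overall accuracy is the class-size-weighted average of the two per-class rates. The sketch below checks that relationship against the reported numbers. It assumes that SN, SP, and ACC stand for sensitivity, specificity, and accuracy, and that smokers ("Current" in the clinical table) are the positive class; the record itself does not state either point. The function name and dictionary layout are illustrative only.

```python
def accuracy_from_rates(sn, sp, n_pos, n_neg):
    """Accuracy implied by sensitivity, specificity, and class sizes."""
    return (sn * n_pos + sp * n_neg) / (n_pos + n_neg)


# (SN, SP, smokers, never-smokers, reported ACC), copied from the two tables.
# Treating smokers as the positive class is an assumption of this sketch.
cohorts = {
    "GSE39279": (0.9804, 1.0, 102, 43, 0.9862),
    "TCGA": (0.8476, 0.806, 105, 67, 0.8314),
    "GSE66836": (0.8381, 0.8125, 105, 16, 0.8347),
}

for name, (sn, sp, n_pos, n_neg, reported) in cohorts.items():
    implied = accuracy_from_rates(sn, sp, n_pos, n_neg)
    print(f"{name}: implied ACC = {implied:.4f}, reported ACC = {reported:.4f}")
```

Under these assumptions the implied and reported accuracies agree to four decimal places for all three cohorts, which suggests the table and the cohort sizes are mutually consistent.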
The strongly related genes and their regulatory network at different levels.
Figure 2. The gene weight network of important genes. A soft threshold was used to calculate the connectivity of the genes. Gene connectivity is expressed by the color of the nodes (each node represents a gene). (a) A gene weight network constructed from 34 genes. (b) A gene weight network of the subset of genes with significant weights, shown as a map of gene weights.
Figure 3. The protein-protein interaction (PPI) networks. (a) PPI network corresponding to the 95 signature genes (showing genes whose proteins have a betweenness greater than or equal to 100). (b) PPI network corresponding to the 84 BIMC genes (showing genes whose proteins have a betweenness greater than or equal to 100). (c) PPI network corresponding to the 134 SRC genes (showing genes whose proteins have a betweenness greater than or equal to 200).
Figure 4. Distribution of feature types of the 105 signature probes.
Figure 5. Beeswarm plot of gene AHRR (cg05575921).