Xiaomin Chen1,2, Ruoyu Wang3, Tianze Xu4,5,6, Yajing Zhang1,2, Hongyan Li1,2, Chengcheng Du1,2, Kun Wang1,2, Zairong Gao1,2. 1. Department of Nuclear Medicine, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China. 2. Hubei Province Key Laboratory of Molecular Imaging, Wuhan, China. 3. Department of Orthopaedics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China. 4. College of ACES, University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA. 5. Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA. 6. Department of Pediatrics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.
Abstract
BACKGROUND: The genes and genetic mechanisms underlying the occurrence and progression of papillary thyroid carcinoma (PTC) are still unknown. This study aimed to find candidate genes related to the pathogenesis and progression of PTC. METHODS: RNA sequencing (RNA-seq) data of PTC and normal tissues were downloaded from The Cancer Genome Atlas (TCGA) database with clinical stage data to form a test, validation, and clinical-stage data matrix. We used the test data set to analyze differentially expressed genes (DEGs) and weighted gene co-expression network analysis (WGCNA) to find those gene clusters highly correlated with PTC. We then verified the expression of genes in the interested modules using the validation matrix. The quantitative real-time polymerase chain reaction (qRT-PCR) was used to verify the reliability of the expression of selected genes. Five key genes (GDF15, LCN2, KCNN4, SH3BGRL3, and MMP2) were used to analyze the connection between gene expression and the American Joint Committee on Cancer (AJCC) stage. The upregulated and downregulated DEGs, along with the modules of interest, were subjected to Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). RESULTS: We used WGCNA to find two modules of interest, the yellow module, which was positively associated with PTC, and the blue module, which was negatively correlated with PTC. Four genes (GDF15, LCN2, KCNN4, and SH3BGRL3) from the yellow module were determined to be highly expressed in PTC in the test data matrix and were verified in both the validation data matrix and quantitative real-time PCR, which indicated that these four genes were highly correlated with the occurrence of the PTC. Furthermore, these four genes also had a significantly higher expression in the advanced levels of pathological T, N, and AJCC stage, meaning that they were correlated with the progression of PTC. Genes in the yellow module and upregulated DEGs were significantly enriched in three vital signaling pathways, including focal adhesion, extracellular matrix (ECM)-receptor interaction, and the PI3K-Akt signaling pathway. CONCLUSIONS: Four candidate genes (GDF15, LCN2, KCNN4, and SH3BGRL3) may be potential biomarkers for the PTC's pathogenesis and may be useful for predicting the disease stage. 2021 Translational Cancer Research. All rights reserved.
BACKGROUND: The genes and genetic mechanisms underlying the occurrence and progression of papillary thyroid carcinoma (PTC) are still unknown. This study aimed to find candidate genes related to the pathogenesis and progression of PTC. METHODS: RNA sequencing (RNA-seq) data of PTC and normal tissues were downloaded from The Cancer Genome Atlas (TCGA) database with clinical stage data to form a test, validation, and clinical-stage data matrix. We used the test data set to analyze differentially expressed genes (DEGs) and weighted gene co-expression network analysis (WGCNA) to find those gene clusters highly correlated with PTC. We then verified the expression of genes in the interested modules using the validation matrix. The quantitative real-time polymerase chain reaction (qRT-PCR) was used to verify the reliability of the expression of selected genes. Five key genes (GDF15, LCN2, KCNN4, SH3BGRL3, and MMP2) were used to analyze the connection between gene expression and the American Joint Committee on Cancer (AJCC) stage. The upregulated and downregulated DEGs, along with the modules of interest, were subjected to Gene Ontology (GO) enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). RESULTS: We used WGCNA to find two modules of interest, the yellow module, which was positively associated with PTC, and the blue module, which was negatively correlated with PTC. Four genes (GDF15, LCN2, KCNN4, and SH3BGRL3) from the yellow module were determined to be highly expressed in PTC in the test data matrix and were verified in both the validation data matrix and quantitative real-time PCR, which indicated that these four genes were highly correlated with the occurrence of the PTC. Furthermore, these four genes also had a significantly higher expression in the advanced levels of pathological T, N, and AJCC stage, meaning that they were correlated with the progression of PTC. Genes in the yellow module and upregulated DEGs were significantly enriched in three vital signaling pathways, including focal adhesion, extracellular matrix (ECM)-receptor interaction, and the PI3K-Akt signaling pathway. CONCLUSIONS: Four candidate genes (GDF15, LCN2, KCNN4, and SH3BGRL3) may be potential biomarkers for the PTC's pathogenesis and may be useful for predicting the disease stage. 2021 Translational Cancer Research. All rights reserved.
The incidence of thyroid cancer has increased in recent years. According to data from the National Cancer Center, the 2018 incidence of thyroid cancer in China was 10.1 per 100,000 individuals, while the global incidence of thyroid cancer was 6.7 per 100,000 individuals; during the same period, the 5-year survival rate was 84.3% in China and was 98.1% in the United States (1). Papillary thyroid carcinoma (PTC) is the primary thyroid cancer type, accounting for approximately 80% of all thyroid cancers (2). Although the majority of patients with PTC have a good prognosis with surgery, adjuvant radio-iodine therapy, and thyroid-stimulating hormone therapy (3,4), the prognosis of advanced PTC is worse, with a significantly reduced 5-year survival rate of less than 60% and a markedly increased recurrence rate of more than 30% (5). It is therefore critical to explore the genes and potential mechanisms related to the occurrence and development of PTC, and to guide the molecular diagnosis and treatment of PTC patients.In recent years, many studies have used bioinformatics analysis to analyze genes related to the incidence of PTC. Li’s research has indicated that the survivin gene mutation might be an excellent diagnostic criterion for PTC (6). Meanwhile, Han et al. (7) identified four differentially expressed genes (DEGs), including COMP, COL3A1, ZAP70, and CD247, to be related to PTC clinical phenotypes, that might be promising biomarkers for early-stage PTC. Furthermore, Ao et al. reported that 5 candidate genes (LRP4, KLK7, PRICKLE1, ETV4, and ETV5) could be used to predict the pathogenesis of PTC (8). These important findings have identified several biomarkers for early-stage PTC, primarily through the Gene Expression Omnibus (GEO) database; however, The Cancer Genome Atlas (TCGA), which includes a large number of PTC sequencing samples, has yet to be the subject of such an investigation to find genes related to the pathogenesis and progression of PTC. Therefore, exploring the molecular mechanisms and searching for novel biomarkers of PTC in the TCGA database may have considerable clinical value.Weighted gene co-expression network analysis (WGCNA) is a novel method that has been used to effectively screen for new biomarkers in cancers (9). The WGCNA algorithm can group genes into modules based on the gene co-expression similarities across the samples, yielding a cluster of genes with similar functions that can be useful for relating modules and external sample traits. From these correlation networks, tissue-specific biomarkers and pathophysiological-related pathways can then be identified (10). This study used WGCNA to select core network genes and modules with high-expression oncogenes, to find candidate genes related to the pathogenesis and progression of PTC.We present the following article in accordance with the MDAR reporting checklist (available at http://dx.doi.org/10.21037/tcr-20-2866).
Methods
Data acquisition
We downloaded the RNA sequencing (RNA-seq) expression data of 560 samples, including 58 normal and 502 PTC samples, and the corresponding clinical information of the 502 cases from the TCGA database (https://portal.gdc.cancer.gov/). The flowchart of the study design and the process of sample enrollment is illustrated in . After using the random number method, 116 PTC samples were randomly selected from the dataset of the 502 PTC samples for pairing with normal tissues. We merged the count matrix of 116 PTC samples and 58 normal samples to form a test data matrix; we merged the count matrix of 386 PTC samples and 58 normal samples to form a validation data matrix; we merged the count matrix of 502 PTC samples and clinical-stage data to form a clinical-stage matrix.
Figure 1
The detailed flow chart of this study. TGCA, The Cancer Genome Atlas; PTC, papillary thyroid carcinoma; DEGs, differentially expressed genes; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; GEO, Gene Expression Omnibus; qRT-PCR, quantitative real-time polymerase chain reaction.
The detailed flow chart of this study. TGCA, The Cancer Genome Atlas; PTC, papillary thyroid carcinoma; DEGs, differentially expressed genes; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; GEO, Gene Expression Omnibus; qRT-PCR, quantitative real-time polymerase chain reaction.We also downloaded the gene expression dataset (accession number: GSE33630), which contained microarray data of 49 PTC samples and 45 adjacent normal thyroid samples from the NCBI GEO database (http://www.nibi.nih.gov/geo/) for further verification of the test data matrix.The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013).
DEG analysis
We used the test data matrix (RNA-seq data of 116 PTC and 58 normal samples) to analyze the DEGs between the PTC and normal samples using the “DESeq2” R package. We set the false discovery rate (FDR) to 0.05, and fold change was considered the differential expression ratio between PTC and normal samples. The criteria for DEGs were |log2(fold change)| >1 and FDR <0.05.
Construction of weighted gene co-expression networks and module identification
We analyzed the weighted co-expression network by using the test data matrix. To identify biologically meaningful gene modules containing highly correlated genes, a gene co-expression network was constructed using the “WGCNA” package in R software. We detected modules by setting MEDissThres to 0.25 in order to merge similar modules. We used average linkage hierarchical clustering based on topology overlap measurement (TOM) to identify gene co-expression modules, consisting of a group of genes with similar expression patterns. The soft threshold power was set to four based on the criterion of approximate scale-free topology. We used gene significance (GS) to qualify the link of individual genes to the trait of interest (PTC). In each module, we used module membership (MM) to evaluate the gene expression profile correlation and the module eigengene (11).
Gene interaction network construction and key gene identification
When the correlations between modules and traits (PTC and normal samples, respectively) were >0.5 and the P value was <0.05, and we considered this to be a module of interest. Topological data of the modules of interest (yellow and blue modules) were imported from the R program to Cytoscape (version 3.80) to visualize gene interaction network construction. The hub genes and subnetworks were identified using CytoHubba in Cytoscape. The analysis of DEGs was matched to the gene interaction network: genes that had a lower expression in PTC samples [log2 (fold change) <–1 and FDR <0.05] were labeled in green, while genes that had a higher expression in PTC samples [log2 (fold change) >1, and adjusted P<0.05] were labeled in red. The top 10 DEGs and the top 10 hub genes in each module of interest were selected for validation.
Validation by the RNA-seq data and GEO data in the modules of interest
We exported gene interactive networks of the yellow and blue modules from the R program to Cytoscape for visualization with the thresholds set as 0.17 and 0.27, respectively, to screen highly centralized genes (hub genes) in these modules. The validation data matrix (RNA-seq data of 386 PTC and 58 normal samples) and the GEO data were used to validate selected genes’ expression.
Cell culture and quantitative real-time polymerase chain reaction (qRT-PCR)
We used the human PTC-derived cell line (B-CPAP) and normal thyroid epithelial cell line (Nthy Ori3-1) from the American Type Culture Collection (ATCC, Manassas, VA, USA), and cultured them in DMEM/F-12 (Gibco, Grand Island, NY, USA) with 10% fetal bovine serum and 1% penicillin G and streptomycin in a 37 °C environment containing 5% CO2.Ten core genes (the top five DEGs: LCN2, PDZK1IP1, GDF15, SLPI, and KCNN4; the top five of hub genes: MVP, OCIAD2, MMP2, SH3BGRL3, and S100A11) of interest in the yellow module were further verified in the B-CPAP and Nthy Ori3-1 cell lines by qRT-PCR. The primers of the 10 genes are listed in . We set β-actin as a control and the Nthy Ori-3-1 cell line as a calibrator sample for the PTC-derived cell line, and then calculated the relative expressions using the 2–ΔΔCt method. The Student’s t-test was used to compare gene expression levels, PTC-derived cell lines, and normal thyroid epithelial cell lines.
We used the clinical-stage matrix to analyze the connection between gene expression and the clinical TNM stage. The five genes (GDF15, LCN2, KCNN4, MMP2, and SH3BGRL3) verified by qRT-PCR were stratified by pathologic T, N, M, and American Joint Committee on Cancer (AJCC) stage, respectively. We used the Kruskal-Wallis test to analyze the omnibus difference and used the Wilcoxon test to analyze post-hoc pairwise comparison.
Enrichment analysis of DEGs and modules of interest
The Database for Annotation, Visualization, and Integrated Discovery (DAVID, http://david.abcc.ncifcrf.gov) was used for Gene Ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment. We uploaded the genes from the critical co-expression modules to identify the potential functions. The Benjamini correction was performed to calibrate the confidence level, with a q value <0.05 being considered the cutoff criterion. The upregulated and downregulated DEGs, and the modules of interest were enriched in GO and KEGG functions by DAVID. GO analysis results indicated enrichment in biological process (BP), cellular component (CC), and molecular function (MF).
Statistical analysis
Data from validation data matrix, clinical stage matrix and GEO dataset were displayed by median (also from minimum to maximum) due to that gene expression from matrix and dataset did not conform to a normal distribution. For validation in silicon, we used Wilcox test to verify the gene difference between normal and PTC groups. For examining gene expressions relating to TNM stage, we used the Kruskal-Wallis test to analyze the omnibus difference and used the Wilcoxon test to analyze post-hoc pairwise comparison. We set α as 0.05 and used Bonferroni correction to calibrate the confidence level, with a q value <0.05 being considered the cutoff criterion (12). Data from validation in vitro were displayed by mean ± standard deviation and Student t-test was used to analyze the difference between normal and PTC groups; P<0.05 was regarded as being statistically significant.
Results
Sample characteristics and DEG screening
Among the 502 samples, the mean age was 47.35 (range, 15 to 89) years. Male were 135, and female were 367. The main PTC clinical data are shown in . The RNA-seq count matrix contained 13,693 genes. The filtering criteria required that the row sums of gene counts be bigger than the upper quantile of the row sums, and 4,387 highly expressed genes in 560 samples were thus selected for test, validation, and clinical-stage analysis. We finally identified 499 DEGs, 316 of which were upregulated and 183 were downregulated in the PTC samples compared with normal samples (). The expression of DEGs was highly clustered in both PTC and normal samples ().
Table 2
Characteristics of PTC patients
Characteristics
PTC
Quantity
Gender
Male
135
Female
367
T stage
T1
142
T2
164
T3
171
T4
23
Tx*
2
N stage
N0
228
N1
224
Nx*
50
M stage
M0
283
M1
9
Mx*
210
AJCC TNM stage
I
281
II
52
III
113
IV
55
NA*
1
*, as we could not evaluate pathologic Tx representing papillary tumor, pathologic Nx representing regional (nearby) lymph nodes, or pathologic Mx representing distant metastasis, we treated Tx, Nx and Mx, as not applicable. PTC, papillary thyroid carcinoma.
Figure 2
Screening DEGs between normal (N) and tumor (T) samples. (A) A volcano map of differentially DEGs showing the fold change and FDR of each gene (dot): red dots represent significantly upregulated genes (FDR <0.05) and green dots represent significantly downregulated genes (FDR <0.05); grey dots represent no significance (FDR >0.05). (B) A heatmap of DEGs showing both PTC and normal samples clustered independently. DEGs, differentially expressed genes; FDR, false discovery rate; PTC, papillary thyroid carcinoma.
*, as we could not evaluate pathologic Tx representing papillary tumor, pathologic Nx representing regional (nearby) lymph nodes, or pathologic Mx representing distant metastasis, we treated Tx, Nx and Mx, as not applicable. PTC, papillary thyroid carcinoma.Screening DEGs between normal (N) and tumor (T) samples. (A) A volcano map of differentially DEGs showing the fold change and FDR of each gene (dot): red dots represent significantly upregulated genes (FDR <0.05) and green dots represent significantly downregulated genes (FDR <0.05); grey dots represent no significance (FDR >0.05). (B) A heatmap of DEGs showing both PTC and normal samples clustered independently. DEGs, differentially expressed genes; FDR, false discovery rate; PTC, papillary thyroid carcinoma.
The yellow module was positively correlated with PTC, while the blue module was negatively correlated with PTC
Eight modules were selected: red (76 genes), turquoise (1,135 genes), yellow (94 genes), green (86 genes), black (73 genes), blue (896 genes), brown (468 genes), and grey (244 genes). The TOM of all genes is shown in . As shown in , seven modules (grey module excluded) were categorized into two main clusters: one cluster covered three modules (red, turquoise, and yellow module), while the other cluster involved four modules (green, black, blue, and brown module). Three pairs of modules had higher adjacencies: yellow and turquoise, blue and black, and brown and blue modules (). We selected 2 modules of interest: the yellow module (R2=0.61, P=2E–19) and the blue module (R2=–0.94, P=3E–85) (). The yellow module's correlation analysis revealed a positive interrelationship between gene expression and PTC, while that of the blue module showed a negative correspondence between gene expression and PTC. In the yellow module, GS and MM’s association was 0.84 (P=3.7E–26; ), while that in the blue module was 0.96 (P<1E–200; ).
Figure 3
Co-expression network construction. (A) A heatmap depicts the TOM for all genes. Lighter yellow color indicates a low overlap while a darker red color indicates a high overlap. The gene dendrogram and modules are also aligned on the left and top. (B) Hierarchical clustering of modules indicated an interconnection among modules. (C) The interaction relationship analysis of co-expressed genes. All modules are assigned on the left and top. The grids in light or dark red color in the middle indicate the connectivity of different modules. (D) Correlation of gene modules between normal and PTC samples. Each row corresponds to a module eigengene, each column corresponds to a trait, and each cell contains the correlation (R2) and p value (in brackets). The table is colored by correlation according to the color scale. (E,F) A scatterplot of GS for PTC versus MM in the yellow and blue module, indicating a highly significant correlation between GS and MM in the yellow and blue modules. TOM, topological overlap matrix; PTC, papillary thyroid carcinoma; GS, gene significance; MM, module membership.
Co-expression network construction. (A) A heatmap depicts the TOM for all genes. Lighter yellow color indicates a low overlap while a darker red color indicates a high overlap. The gene dendrogram and modules are also aligned on the left and top. (B) Hierarchical clustering of modules indicated an interconnection among modules. (C) The interaction relationship analysis of co-expressed genes. All modules are assigned on the left and top. The grids in light or dark red color in the middle indicate the connectivity of different modules. (D) Correlation of gene modules between normal and PTC samples. Each row corresponds to a module eigengene, each column corresponds to a trait, and each cell contains the correlation (R2) and p value (in brackets). The table is colored by correlation according to the color scale. (E,F) A scatterplot of GS for PTC versus MM in the yellow and blue module, indicating a highly significant correlation between GS and MM in the yellow and blue modules. TOM, topological overlap matrix; PTC, papillary thyroid carcinoma; GS, gene significance; MM, module membership.
Network construction and key gene identification in PTC of the yellow and blue modules
After screening with a 0.17 threshold in the yellow module, 64 genes were visualized, with 31 of these being matched to DEGs (). After screening with a 0.27 threshold in the blue module, 111 genes were visualized, with 84 of these genes being matched to DEGs (). The top 10 DEGs in the yellow and blue modules are shown in , respectively, and the top 10 hub genes in the core network in the yellow and blue modules are shown in .
Figure 4
The network of hub genes in the two modules of interest. (A) The gene interaction network in the yellow module showing that 31 of 64 genes were matched to DEGs (red circular nodes); the unmatched genes are the square nodes in yellow. (B) The top 10 hub genes in the subnetwork of the yellow module. (C) The gene interaction network in the blue module showing that 84 of 111 genes were matched to DEGs (green circular nodes); the unmatched genes are the square nodes in blue. (D) The top 10 hub genes in the subnetwork of the blue module. DEGs, differentially expressed genes.
Table 3
The top 10 DEGs in the yellow module
Gene ID
Gene name
Log2 fold change
Adjusted P value
ENSG00000148346
LCN2
5.553841607
2.95E–91
ENSG00000162366
PDZK1IP1
4.530363669
2.63E–58
ENSG00000130513
GDF15
4.319358493
1.57E–92
ENSG00000124107
SLPI
4.15074815
3.09E–55
ENSG00000104783
KCNN4
4.029339699
1.14E–60
ENSG00000131435
PDLIM4
3.990102564
1.68E–164
ENSG00000171345
KRT19
3.194147299
2.67E–92
ENSG00000133110
POSTN
3.161981456
1.91E–33
ENSG00000108821
COL1A1
2.845246535
4.43E–29
ENSG00000132470
ITGB4
2.35982993
1.11E–59
DEG, differentially expressed gene.
Table 4
The top 10 DEGs in the blue module
Gene
Gene name
Log2 fold change
Adjusted P value
ENSG00000074706
IPCEF1
–4.55752
1.33E–151
ENSG00000115112
TFCP2L1
–4.24682
8.54E–91
ENSG00000153246
PLA2R1
–3.85242
3.68E–150
ENSG00000080493
SLC4A4
–3.85126
6.26E–78
ENSG00000157680
DGKI
–3.80398
3.17E–90
ENSG00000147606
SLC26A7
–3.80141
3.00E–62
ENSG00000157404
KIT
–3.48921
6.25E–81
ENSG00000112562
SMOC2
–3.29676
2.92E–76
ENSG00000132561
MATN2
–3.23942
4.00E–132
ENSG00000091137
SLC26A4
–3.23634
4.50E–40
DEG, differentially expressed gene.
The network of hub genes in the two modules of interest. (A) The gene interaction network in the yellow module showing that 31 of 64 genes were matched to DEGs (red circular nodes); the unmatched genes are the square nodes in yellow. (B) The top 10 hub genes in the subnetwork of the yellow module. (C) The gene interaction network in the blue module showing that 84 of 111 genes were matched to DEGs (green circular nodes); the unmatched genes are the square nodes in blue. (D) The top 10 hub genes in the subnetwork of the blue module. DEGs, differentially expressed genes.DEG, differentially expressed gene.DEG, differentially expressed gene.
Among the 36 genes selected, 31 genes were consistent with the test data matrix, while five genes (COL1A2, COL6A3, MMP2, LUM and ABI3BP) were not consistent with the test data matrix
All the top 10 DEGs and the top 10 hub genes in the yellow and blue modules were selected for validation. COL1A1 was repeated both in the DEGs and hub genes in the yellow module, and thus, we finally selected 19 genes in the yellow module for validation. In all, 16 of 19 selected genes were highly expressed in the PTC group in the yellow module (q<0.05), while the expression of 3 genes (COL1A2, COL6A3, and MMP2) showed no significant difference between the PTC and normal group (q>0.05) in the validation data matrix (). DGKI, IPCEF1, and SLC4A4 were the repeat genes in the blue module’s DEGs and hub genes. Thus, 17 genes were finally chosen to be validated in the blue module. All of the selected genes in the blue module had a low expression in the PTC group in the validation data matrix (q<0.05) ().
Figure 5
Genes in the yellow module were verified by the validation data matrix. (A-J) The top 10 DEGs. (J-S) The top 10 hub genes. (J) COL1A1 was repeated in the DEGs and hub genes in the yellow module; 16 of 19 selected genes were highly expressed in the PTC group (q<0.05), while COL1A2, COL6A3, and MMP2 expression were not different across the PTC and normal groups (q>0.05). *, indicates P<0.05 and ns indicates P>0.05. DEGs, differentially expressed genes; PTC, papillary thyroid carcinoma.
Figure 6
Genes in the blue module were verified by the validation data matrix. (A-J) The top 10 DEGs. (H-Q) The top 10 hub genes. (H-J) DGKI, IPCEF1, and SLC4A4 were the repeated genes in the DEGs and hub genes in the blue module. All the selected genes in the blue module had a lower expression in the PTC group (q<0.05). *, indicates P<0.05. DEGs, differentially expressed genes; PTC, papillary thyroid carcinoma.
Genes in the yellow module were verified by the validation data matrix. (A-J) The top 10 DEGs. (J-S) The top 10 hub genes. (J) COL1A1 was repeated in the DEGs and hub genes in the yellow module; 16 of 19 selected genes were highly expressed in the PTC group (q<0.05), while COL1A2, COL6A3, and MMP2 expression were not different across the PTC and normal groups (q>0.05). *, indicates P<0.05 and ns indicates P>0.05. DEGs, differentially expressed genes; PTC, papillary thyroid carcinoma.Genes in the blue module were verified by the validation data matrix. (A-J) The top 10 DEGs. (H-Q) The top 10 hub genes. (H-J) DGKI, IPCEF1, and SLC4A4 were the repeated genes in the DEGs and hub genes in the blue module. All the selected genes in the blue module had a lower expression in the PTC group (q<0.05). *, indicates P<0.05. DEGs, differentially expressed genes; PTC, papillary thyroid carcinoma.Because the array data from GEO did not contain ITBG4, we selected 35 genes from the test data matrix to verify in the GEO data. Among the 35 genes selected for validation, the expression of 32 genes was consistent with the test matrix (q<0.05), while the expression of three genes (MMP2 and LUM in the yellow module, and ABI3BP in the blue module) were inconsistent with the test matrix (q>0.05) ().
Figure 7
Genes in the yellow module were verified by the GEO data. (A-P) Sixteen of 18 selected genes were significantly different across the PTC and normal groups (q<0.05), (Q,R) while the MMP2 and LUM expression did not have a significant difference between the PTC and normal groups (q>0.05). *, indicates P<0.05 and ns indicates P>0.05. GEO, Gene Expression Omnibus; PTC, papillary thyroid carcinoma.
Figure 8
Genes in the yellow module were verified by the GEO data. (A-P) Sixteen of 17 selected genes were significantly different across the PTC and normal groups (q<0.05), while (Q) the ABI3BP expression did not have a significant difference between the PTC and normal groups (q>0.05). *, indicates P<0.05 and ns indicates P>0.05. GEO, Gene Expression Omnibus; PTC, papillary thyroid carcinoma.
Genes in the yellow module were verified by the GEO data. (A-P) Sixteen of 18 selected genes were significantly different across the PTC and normal groups (q<0.05), (Q,R) while the MMP2 and LUM expression did not have a significant difference between the PTC and normal groups (q>0.05). *, indicates P<0.05 and ns indicates P>0.05. GEO, Gene Expression Omnibus; PTC, papillary thyroid carcinoma.Genes in the yellow module were verified by the GEO data. (A-P) Sixteen of 17 selected genes were significantly different across the PTC and normal groups (q<0.05), while (Q) the ABI3BP expression did not have a significant difference between the PTC and normal groups (q>0.05). *, indicates P<0.05 and ns indicates P>0.05. GEO, Gene Expression Omnibus; PTC, papillary thyroid carcinoma.
The expression of GDF15, MMP2, SH3BGRL3, KCNN4, and LCN2 were higher in the BCPAP cell line compared with the Nthy Ori3-1 cell line
Subsequently, we selected the 10 PTC-related genes in the yellow module and then validated the mRNA levels using qRT-PCR. We discovered that the expression levels of GDF15 (P=0.0004), MMP2 (P=0.0003), SH3BGRL3 (P=0.0003), KCNN4 (P=0.0228), and LCN2 (P=0.0389) were higher in the BCPAP cell line compared with the Nthy Ori3-1 cell line. In contrast, the expression of SLPI (P<0.0001), S100A11 (P<0.0001), OCIAD2 (P<0.0001), MVP (P=0.0004), and PDZK1IP1 (P=0.0013) were downregulated in the BCPAP cell line ().
Figure 9
Genes in the yellow module were verified by qRT-PCR. (A-E) GDF15, MMP2, SH3BGRL3, KCNN4, and LCN2 were highly expressed (P<0.05) while (F-J) SLPI, S100A11, OCIAD2, MVP, and PDZK1IP1 were weakly expressed (P<0.05) in the BCPAP cell line compared with the Nthy-ori 3-1 cell line. *, indicates P<0.05; **, indicates P<0.01; ***, indicates P<0.001. qRT-PCR, quantitative real-time polymerase chain reaction.
Genes in the yellow module were verified by qRT-PCR. (A-E) GDF15, MMP2, SH3BGRL3, KCNN4, and LCN2 were highly expressed (P<0.05) while (F-J) SLPI, S100A11, OCIAD2, MVP, and PDZK1IP1 were weakly expressed (P<0.05) in the BCPAP cell line compared with the Nthy-ori 3-1 cell line. *, indicates P<0.05; **, indicates P<0.01; ***, indicates P<0.001. qRT-PCR, quantitative real-time polymerase chain reaction.
The expression of GDF15, LCN2, KCNN4, MMP2, and SH3BGRL3 was highly associated with TNM stage
As shown in , four genes (GDF15, LCN2, KCNN4, and SH3BGRL3) showed a much higher expression in the T3–T4 stages than in the T1–T2 stages (q<0.05). In contrast, the expression of MMP2 showed no difference across the T1–T4 stages (q>0.05), which indicated that the high expression of GDF15, LCN2, KCNN4, and SH3BGRL3 was associated with large tumor size and extensive growth beyond the thyroid gland. In pathological N analysis, the five genes were all more highly expressed in N1 than in N0 (q<0.05), indicating that the high expression of the five genes was associated with spread to nearby lymph nodes (). However, these five genes did not demonstrate a difference between M0 and M1 (q>0.05) (). Finally, we discovered that these five genes had a higher expression in stage IV than in other stages (q<0.05), which suggested that these genes might be correlated with a higher stage ().
Figure 10
Genes expression in TNM stage. (A-E) Gene expression in pathologic T stage. (F-J) Gene expression in pathologic N stage. (K-O) Gene expression in pathologic M stage. (P-T) Gene expression in TNM stage. *, indicates q<0.05 and ns indicates q>0.05.
Genes expression in TNM stage. (A-E) Gene expression in pathologic T stage. (F-J) Gene expression in pathologic N stage. (K-O) Gene expression in pathologic M stage. (P-T) Gene expression in TNM stage. *, indicates q<0.05 and ns indicates q>0.05.
Functional enrichment in DEGs and modules of interest
The upregulated DEGs were significantly enriched in BP, CC, and MF showed in . As shown in , the downregulated DEGs that were significantly enriched in BP, CC, and MF. In the KEGG analysis, the upregulated DEGs were significantly enriched in the PI3K-Akt signaling pathway, focal adhesion, extracellular matrix (ECM)-receptor interaction, and proteoglycans in cancer. The downregulated DEGs were not significantly enriched in any pathways (q>0.05).
Figure 11
GO analysis and KEGG pathway enrichment results for genes in the upregulated DEGs, downregulated DEGs, and the yellow module. (A) The upregulated DEGs, (B) the downregulated DEGs, and (C) the yellow module. The bar represents the q value (Benjamini-adjusted), with a q value <0.05 being considered statistically significant. GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; DEGs, differentially expressed genes; BP, biological process; MF, molecular function; CC, cellular component.
GO analysis and KEGG pathway enrichment results for genes in the upregulated DEGs, downregulated DEGs, and the yellow module. (A) The upregulated DEGs, (B) the downregulated DEGs, and (C) the yellow module. The bar represents the q value (Benjamini-adjusted), with a q value <0.05 being considered statistically significant. GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; DEGs, differentially expressed genes; BP, biological process; MF, molecular function; CC, cellular component.Genes in the yellow module were mainly enriched in 13 GO terms closely related to cancer, shown in . Moreover, in the yellow module, KEGG pathway enrichment analysis showed notable enrichment in the PI3K-Akt signaling pathway, focal adhesion, protein digestion and absorption, and ECM-receptor interaction. No GO or KEGG terms were significant (q>0.05) in the blue module, and no GO or KEGG pathways were enriched.
Discussion
In this study, we used WGCNA to identify two modules of interest: the yellow module, which was positively associated with PTC, and the blue module, which was negatively correlated with PTC. Four genes (GDF15, LCN2, KCNN4, and SH3BGRL3) that were identified from the yellow module, were highly correlated with the PTC occurrence. We also found that the four genes had significantly higher expression in cases of larger tumor size, lymph node metastasis, and higher AJCC stage, meaning that they were correlated with PTC progression.Abundant research has found these four genes to be associated with the occurrence and progression of cancer. Mimeault et al. reported that growth differentiation factor 15 (GDF15) was significantly increased in injury, inflammation, cancer, and other diseases compared with normal physiological conditions (13), and higher expression of GDF15 was related to tumor tumorigenesis and progression. Studies by Yang et al. and Urakawa et al. showed that the high expression of GDF15 promoted tumorigenesis and progression in oral squamous cell carcinoma (14) and esophageal squamous cell carcinoma, respectively (15). Moreover, overexpression of GDF15 was shown to stimulate MEK/ERK and PI3K/Akt/mTOR signaling (16-18) and to assist in epithelial-mesenchymal transition (EMT) in ovarian and colorectal carcinomas cells (19,20). Many studies reported that lipocalin 2 (LCN2) was overexpressed in various cancers and associated with vascular invasion, TNM stage, and tumor recurrence (21-25). Wu et al. identified SH3 domain-binding glutamate-rich protein-like 3 (SH3BGRL3) as a highly relevant potential PTC candidate biomarker, as it was more highly expressed in PTC than in normal tissues (26); it was shown to promote the EMT, proliferation, and cell migration of urothelial carcinoma, and was also found to be highly connected with the epidermal growth factor receptor (EGFR) in bladder cancer (27). Potassium calcium-activated channel subfamily N member 4 (KCNN4) also demonstrated elevated expression in various types of cancer and was shown to be involved in promoting cell invasion and cell proliferation. Furthermore, the KCNN4-induced calcium signaling process also reported to be associated with the malignant phenotypes of numerous tumors, such as glioblastoma, pancreatic cancer, melanoma, and prostate cancer (28-31).Our data indicated that a higher expression of these four genes was significantly related to the PTC’s pathogenesis and correlated with a higher stage. Consistent with our findings, recent studies have reported that the mRNA levels of GDF15 (32), LCN2 (24), and H3BGRL3 (26) display a notable increase in thyroid cancer, especially in PTC. KCNN4 may regulate cancer proliferation and invasion, and increased expression of KCNN4 has been found to be an indicator of poor clinical outcomes in various tumors (27,33-37). Collectively, these findings highlight this gene’s potentially significant role in cancer development. In this study, KCNN4 was screened out in PTC, implying that it might play an essential role in this malignancy.By analyzing gene function and signaling pathways, we confirmed that genes in the yellow module and the upregulated DEGs were significantly enriched in the three vital signaling pathways of focal adhesion, ECM-receptor interaction, and PI3K-Akt signaling, which play critical roles in PTC. In line with our results, several studies have also reported focal adhesion (38,39), ECM-receptor interaction (39-41), and the PI3K-AKT signaling pathway (42,43) to be associated with the proliferation, metastasis, and development of PTC.Several limitations of our study should also be mentioned. First, the test and validation data matrix used the same normal samples, which would lead to data independence. Thus, we used data from the GEO database for independent verification in the article and got the same result that four genes (GDF15, LCN2, KCNN4, and SH3BGRL3) from the yellow module that were highly correlated with the occurrence and progression of PTC. Second, we only used a single PTC cell line for verification, and more PTC cell lines would provide more substantial verification. Finally, future studies with in vivo animal models are required to further validate the present study’s conclusions.
Conclusions
We systematically identified four genes (GDF15, LCN2, KCNN4, and SH3BGRL3) from the yellow module that were highly correlated with the occurrence and progression of PTC. These findings provide insights into the mechanisms underlying the pathogenesis and progression of PTC and may serve as potential targets for the diagnosis and treatment of patients with this disease.
Authors: C Z Yang; J Ma; D W Zhu; Y Liu; B Montgomery; L Z Wang; J Li; Z Y Zhang; C P Zhang; L P Zhong Journal: Ann Oncol Date: 2014-03-24 Impact factor: 32.976
Authors: Josena K Stephen; Dhananjay Chitale; Vinod Narra; Kang Mei Chen; Raja Sawhney; Maria J Worsham Journal: Cancers (Basel) Date: 2011-06-01 Impact factor: 6.639
Authors: Irene Ferrer; Álvaro Quintanal-Villalonga; Sonia Molina-Pinelo; Jose Manuel Garcia-Heredia; Marco Perez; Rocío Suárez; Santiago Ponce-Aix; Luis Paz-Ares; Amancio Carnero Journal: J Exp Clin Cancer Res Date: 2018-08-17