
Identification of potential functional genes in papillary thyroid cancer by co-expression network analysis.

Zeng-Xin Ao1, Yuan-Cheng Chen1, Jun-Min Lu1, Jie Shen1, Lin-Ping Peng1, Xu Lin1, Cheng Peng1, Chun-Ping Zeng1, Xia-Fang Wang1, Rou Zhou1, Zhi Chen1, Hong-Mei Xiao2, Hong-Wen Deng1,2,3.   

Abstract

Interactions between multiple genes are involved in the development of complex diseases. However, there are few analyses of gene interactions associated with papillary thyroid cancer (PTC). Weighted gene co-expression network analysis (WGCNA) is a novel and powerful method that detects gene interactions according to their co-expression similarities. In the present study, WGCNA was performed using the WGCNA R package in order to identify functional genes associated with PTC. First, differential gene expression analysis was conducted in order to identify the differentially expressed genes (DEGs) between PTC and normal samples. Subsequently, co-expression networks of the DEGs were constructed separately for the two sample groups. The two networks were compared in order to identify a poorly preserved module. Concentrating on the significant module, validation analysis was performed to confirm the identified genes, and combined functional enrichment analysis was conducted in order to identify more functional associations of these genes with PTC. As a result, 1062 DEGs were identified for network construction. A brown module containing 118 highly related genes was selected as it exhibited the lowest module preservation. After validation analysis, 61 genes in the module were confirmed to be associated with PTC. Following the enrichment analysis, two PTC-related pathways were identified: Wnt signal pathway and transcriptional misregulation in cancer. LRP4, KLK7, PRICKLE1, ETV4 and ETV5 were predicted to be candidate genes regulating the pathogenesis of PTC. These results provide novel insights into the etiology of PTC and the identification of potential functional genes.

Keywords:  ETV4; ETV5; KLK7; LRP4; PRICKLE1; co-expression; gene interactions; papillary thyroid cancer

Year:  2018        PMID: 30250553      PMCID: PMC6144229          DOI: 10.3892/ol.2018.9306

Source DB:  PubMed          Journal:  Oncol Lett        ISSN: 1792-1074            Impact factor:   2.967


Introduction

Thyroid cancer is the most common type of endocrine malignancy. The incidence of thyroid cancer has increased markedly worldwide in the last decade (1). Papillary thyroid cancer (PTC) is the most common type of thyroid cancer, accounting for approximately 80% of all cases (2). Understanding of the genetic mechanisms underlying PTC has improved remarkably in recent years, which provides more support for molecular diagnosis of the disease (3,4). However, predicting PTC as a result of the complicated molecular interactions participating in the pathogenesis of this type of cancer remains a clinical challenge (5). In addition, a number of patients with PTC demonstrate recurrence, invasion and distant metastasis due to resistance to surgical and radioiodine treatment. Novel molecular-targeted treatments hold potential for these cases (6). Therefore, investigation of the molecular mechanisms involved is expected to aid the identification of candidate biomarkers and improve molecular diagnosis and treatment of patients with PTC. Previous studies of the genetic basis of PTC mainly focus on a single susceptibility gene, which does not take into account the interactions of multiple genes. However, complex traits are modified by the cumulative effect of interactive genes (7). Elucidation of the interactions between genes and molecules is essential for understanding the molecular basis underlying complex diseases. Correlation network analysis identifies the correlation patterns among genes according to their co-expression similarities by grouping highly correlated genes into one module (8). Another advantage of network-assisted analysis is that transcriptional data has a more direct impact on protein expression than genomic data, which provides more biological information regarding the interactive genes (9). 
The method applied in the present study is weighted gene co-expression network analysis (WGCNA) (8), which has been successfully used to detect genetic determinants in many human diseases, such as osteoporosis (10), obesity (11), familial combined hyperlipidemia (12) and metaphyseal chondrodysplasia (13). A gene whose interactions with other genes differ between diseased and normal conditions may play a crucial role in the pathology of the disease. Based on this hypothesis, Riquelme Medina and Lubovac-Pilav successfully employed WGCNA to identify genes and pathways involved in the development of type 1 diabetes (14). In the present study, WGCNA was employed to generate two networks, derived from PTC and normal tissue control samples, respectively. The two networks were compared in order to identify the module with the lowest preservation, which contains genes whose interactions change between the two conditions. To the best of our knowledge, the present study is the first to apply WGCNA in this way in order to identify the genetic mechanisms of PTC.

Materials and methods

Gene expression data and processing

The gene expression dataset (accession number: GSE33630) was downloaded from the NCBI Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/geo/). Microarray data were obtained from RNA samples comprising 49 PTC samples and 45 adjacent normal thyroid samples. The research was approved by the Ethics Committees of the Institute of Endocrinology and Metabolism in Kiev, Imperial College in London and the Medical Radiological Research Centre in Obninsk. Details about the data can be found in the original publication (15). The platform used for profiling was the Affymetrix U-133 Plus v.2.0 array. The raw CEL data were processed and normalized using the affy package in R software 3.3.0 (16). The probe IDs were converted into gene symbols using the probe annotation file of the platform. When several probe IDs corresponded to the same gene, the median of their expression levels was taken as the final expression value of that gene.
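The probe-to-gene collapsing step can be illustrated with a minimal Python sketch. It is not the authors' R code: the function name, probe IDs and expression values are hypothetical, and the median is assumed to be taken per sample across all probes that map to the same gene symbol.

```python
from collections import defaultdict
from statistics import median


def collapse_probes_to_genes(probe_expr, probe_to_gene):
    """Collapse probe-level rows to gene-level rows using the per-sample median.

    probe_expr:    dict probe_id -> list of expression values (one per sample)
    probe_to_gene: dict probe_id -> gene symbol (probes without a symbol are dropped)
    """
    grouped = defaultdict(list)
    for probe_id, values in probe_expr.items():
        gene = probe_to_gene.get(probe_id)
        if gene:  # skip unannotated probes
            grouped[gene].append(values)
    # zip(*rows) walks sample by sample across all probes of one gene
    return {gene: [median(col) for col in zip(*rows)]
            for gene, rows in grouped.items()}


# Toy data: hypothetical probe IDs, three samples
probe_expr = {
    "p1": [5.0, 6.0, 7.0],
    "p2": [7.0, 8.0, 9.0],
    "p3": [6.0, 10.0, 8.0],
    "p4": [3.0, 3.5, 4.0],
    "p5": [1.0, 1.0, 1.0],  # no annotation, will be dropped
}
probe_to_gene = {"p1": "GENE_A", "p2": "GENE_A", "p3": "GENE_A", "p4": "GENE_B"}

gene_expr = collapse_probes_to_genes(probe_expr, probe_to_gene)
print(gene_expr)
```

Taking the median rather than the mean makes the gene-level value less sensitive to a single poorly performing probe.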

Differential gene expression analysis

In order to identify differentially expressed genes (DEGs) between the PTC and normal samples, the empirical Bayes method was applied using the limma package (17) in R. P-values were adjusted using the Benjamini and Hochberg method to control the false discovery rate arising from multiple testing. Genes were selected as DEGs for subsequent analysis according to the following criteria: |logFC| >1 and adjusted P-value <0.05, so that both up-regulated and down-regulated genes were retained. Fold change (FC) was calculated as the ratio of expression between the PTC and normal samples.
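The selection criteria can be sketched as follows. This is an illustrative Python sketch, not the limma workflow itself: the moderated test is not reproduced, so raw P-values and log fold changes are taken as given inputs, and all gene names and numbers are hypothetical. The filter is applied to the absolute log fold change, an assumption made here because the Results report both up- and down-regulated genes.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log_fc, p_values, lfc_cut=1.0, fdr_cut=0.05):
    """Keep genes with |logFC| > lfc_cut and adjusted p-value < fdr_cut."""
    adj = benjamini_hochberg(p_values)
    return [g for g, lfc, q in zip(genes, log_fc, adj)
            if abs(lfc) > lfc_cut and q < fdr_cut]


# Toy input (hypothetical values)
genes = ["G1", "G2", "G3", "G4"]
log_fc = [2.1, -1.6, 0.4, 1.3]
p_values = [0.001, 0.004, 0.0005, 0.2]

adj_p = benjamini_hochberg(p_values)
degs = select_degs(genes, log_fc, p_values)
print(degs)
```

In this toy input, G3 is excluded by the fold-change cut despite its small P-value, and G4 is excluded by the adjusted P-value cut despite its fold change.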

Construction of weighted gene co-expression networks

Network construction was performed using the WGCNA R package (8,18). First, sample clustering was conducted for both the PTC and normal samples based on the expression levels of the DEGs to identify and remove potential outliers. The soft-thresholding power beta used for network construction was then set to 8 according to the scale-free topology criterion described in a previous study (8), so that the established network satisfied approximate scale-free topology (linear regression model-fitting index R^2 >0.9). WGCNA identifies modules of highly correlated genes; these genes usually have similar connectivity patterns with other genes, which can be quantified by the topological overlap measure (TOM). Genes were hierarchically clustered and visualized in a dendrogram according to the TOM-based dissimilarity (1-TOM) (18). The branches of the tree representing highly correlated genes were grouped into one module marked by a particular color. Grey denotes background genes that belonged to none of the modules. Different numbers of modules were established from the dendrograms of the PTC and normal samples.
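To make the roles of the power beta and the topological overlap concrete, the following Python sketch computes a soft-thresholded adjacency matrix and the 1-TOM dissimilarity for a toy expression matrix. It assumes an unsigned network (adjacency equal to the absolute Pearson correlation raised to the power beta), which the text does not state explicitly. The function names and data are hypothetical, and the WGCNA R package should be used for real analyses.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (non-zero variance assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def adjacency(expr, beta=8):
    """Unsigned soft-threshold adjacency a_ij = |cor(x_i, x_j)|**beta, zero diagonal."""
    n = len(expr)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a[i][j] = a[j][i] = abs(pearson(expr[i], expr[j])) ** beta
    return a


def tom_dissimilarity(a):
    """Return 1 - TOM, where TOM_ij = (sum_u a_iu*a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(a)
    k = [sum(row) for row in a]  # connectivity of each gene (diagonal is zero)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared = sum(a[i][u] * a[u][j] for u in range(n) if u not in (i, j))
            tom = (shared + a[i][j]) / (min(k[i], k[j]) + 1.0 - a[i][j])
            d[i][j] = 1.0 - tom
    return d


# Toy expression matrix: rows are genes, columns are samples (hypothetical values)
expr = [
    [1.0, 2.0, 3.0, 4.0],    # gene 0
    [2.0, 4.0, 6.0, 8.0],    # gene 1, perfectly correlated with gene 0
    [1.0, -1.0, 1.0, -1.0],  # gene 2, weakly correlated with the others
]
adj = adjacency(expr, beta=8)
diss = tom_dissimilarity(adj)
print(diss[0][1], diss[0][2])
```

Raising the correlation to the power 8 pushes the weak correlation of about 0.45 down to an adjacency of 0.0016 while leaving the perfect correlation at 1, so genes 0 and 1 end up far closer in 1-TOM than either is to gene 2.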

Module preservation between PTC and normal networks

Module preservation analysis assesses whether a module defined in a reference dataset (the PTC sample network in the present study) is also present in the test dataset (the normal sample network in the present study). Two composite preservation statistics, Zsummary (Eq1) and medianRank (Eq2), were used to compare relative preservation among multiple modules; both consider density and connectivity preservation (19). Density preservation is based on the extent of gene interconnectivity, while connectivity preservation is based on the connectivity pattern of genes. Studies have shown that Zsummary thresholds distinguish module preservation as follows (19): Zsummary<2, no preservation; Zsummary>10, strong evidence of module preservation; and 2<Zsummary<10, weak to moderate evidence of module preservation.
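The composite statistics and their interpretation can be sketched in Python as follows. The averaging of the density and connectivity components follows the definitions commonly given for these statistics in the cited module preservation study (19), and the middle band (values between 2 and 10, read as weak to moderate evidence) is likewise taken from that study. The function names and input values are hypothetical.

```python
def z_summary(z_density, z_connectivity):
    """Composite Zsummary: mean of the density and connectivity Z statistics."""
    return (z_density + z_connectivity) / 2.0


def median_rank(rank_density, rank_connectivity):
    """Composite medianRank: mean of the density and connectivity median ranks."""
    return (rank_density + rank_connectivity) / 2.0


def preservation_label(z):
    """Map a Zsummary value onto the evidence bands described in the text."""
    if z < 2:
        return "no evidence of preservation"
    if z > 10:
        return "strong evidence of preservation"
    return "weak to moderate evidence of preservation"


# Hypothetical component values for three modules
modules = {
    "module_a": (1.0, 2.0),
    "module_b": (6.0, 4.0),
    "module_c": (14.0, 11.0),
}
for name, (z_dens, z_conn) in modules.items():
    z = z_summary(z_dens, z_conn)
    print(name, z, preservation_label(z))
```

Zsummary tends to grow with module size, which is one reason the rank-based medianRank is reported alongside it when modules of different sizes are compared.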

Intramodule gene interaction analysis

K.in, also known as intramodule connectivity, describes the connectivity of a gene with other genes in the same module (10). The k.in rank of a gene is the ranking of k.in among all of the genes in a particular module. K.in rank change refers to the difference of the k.in ranks between the PTC and normal sample networks. A gene with a large k.in rank change represents a large difference in its interactions with other genes in PTC and normal samples and indicates that it has an important role in the development of PTC.
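The k.in rank change can be illustrated with a small Python sketch. Several details are assumptions rather than statements from the text: rank 1 is given to the gene with the highest k.in, ties are broken alphabetically, and the change is reported as an absolute difference. Gene names and adjacency values are hypothetical.

```python
def symmetric(pairs, genes):
    """Build a symmetric adjacency lookup (dict of dicts) from pairwise weights."""
    adj = {g: {h: 0.0 for h in genes} for g in genes}
    for (g, h), w in pairs.items():
        adj[g][h] = adj[h][g] = w
    return adj


def intramodular_connectivity(adj, module):
    """k.in of each gene: sum of its adjacencies to the other genes in the module."""
    return {g: sum(adj[g][h] for h in module if h != g) for g in module}


def rank_descending(scores):
    """Rank 1 = highest score; ties broken by gene name (an assumption)."""
    ordered = sorted(scores, key=lambda g: (-scores[g], g))
    return {g: r for r, g in enumerate(ordered, start=1)}


def kin_rank_change(adj_case, adj_control, module):
    """Absolute difference between k.in ranks in the two networks."""
    r_case = rank_descending(intramodular_connectivity(adj_case, module))
    r_ctrl = rank_descending(intramodular_connectivity(adj_control, module))
    return {g: abs(r_case[g] - r_ctrl[g]) for g in module}


module = ["A", "B", "C"]
# Gene A is a hub in the disease network but peripheral in the normal network
adj_ptc = symmetric({("A", "B"): 0.9, ("A", "C"): 0.8, ("B", "C"): 0.1}, module)
adj_normal = symmetric({("A", "B"): 0.1, ("A", "C"): 0.2, ("B", "C"): 0.9}, module)

change = kin_rank_change(adj_ptc, adj_normal, module)
print(change)
```

In a module of n genes the largest possible rank change is n-1, which gives a scale for reading the values reported for a 118-gene module.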

Validation analysis using another gene expression dataset

In order to partially validate the identified genes in the significant module, a gene expression validation analysis was performed using another gene expression profile. The expression array for the validation was downloaded from the NCBI Gene Expression Omnibus database (accession number: GSE3678) and was derived from 7 PTC samples and 7 normal thyroid control samples. The point-biserial correlation coefficient was employed to estimate the relationship between PTC and gene expression levels. After 1000 bootstrap replications, the average of the correlation coefficients was calculated as the final point-biserial correlation coefficient of each gene.
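The bootstrap estimate of the point-biserial correlation can be sketched in Python as follows. The point-biserial coefficient is computed as the Pearson correlation between a 0/1 group label and the expression values. Resampling samples with replacement, skipping degenerate resamples (a single group or zero variance) and fixing the random seed are assumptions of this sketch, and the expression values are hypothetical.

```python
import random
from math import sqrt


def point_biserial(labels, values):
    """Pearson correlation between a 0/1 label and a continuous variable.

    Returns None when either variable has zero variance.
    """
    n = len(labels)
    ml, mv = sum(labels) / n, sum(values) / n
    sl = sum((l - ml) ** 2 for l in labels)
    sv = sum((v - mv) ** 2 for v in values)
    if sl == 0 or sv == 0:
        return None
    cov = sum((l - ml) * (v - mv) for l, v in zip(labels, values))
    return cov / sqrt(sl * sv)


def bootstrap_point_biserial(labels, values, n_boot=1000, seed=0):
    """Average the coefficient over n_boot valid resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        r = point_biserial([labels[i] for i in idx], [values[i] for i in idx])
        if r is not None:  # skip degenerate resamples
            estimates.append(r)
    return sum(estimates) / len(estimates)


# Toy data: 7 PTC (label 1) and 7 normal (label 0) samples, hypothetical values
labels = [1] * 7 + [0] * 7
values = [8.1, 7.9, 8.4, 8.0, 8.6, 7.7, 8.3,
          6.2, 6.5, 6.0, 6.8, 6.1, 6.4, 6.6]

r_full = point_biserial(labels, values)
r_boot = bootstrap_point_biserial(labels, values, n_boot=1000, seed=0)
print(r_full, r_boot)
```

With only 14 samples a single estimate can be unstable, so averaging over resamples gives a smoother summary of the association, at the cost of depending on how degenerate resamples are handled.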

Functional and pathway enrichment analyses of the significant module

Functional annotation of the genes in the significant module was performed using the online tool Database for Annotation, Visualization and Integrated Discovery (DAVID) (http://david.abcc.ncifcrf.gov/) (20), based on the GO term and pathway databases. Pathway enrichment of the module of interest was performed with another online tool, ConsensusPathDB (http://cpdb.molgen.mpg.de/) (21), based on the KEGG database. This step evaluated the potential function of the genes related to thyroid carcinomas according to known databases and known molecular pathways.

Results

A total of 1062 genes were identified as DEGs for network construction based on the aforementioned criteria. The 1062 genes consisted of 653 up-regulated genes and 409 down-regulated genes in the PTC samples compared with the normal samples.

Network construction for the PTC and normal samples with WGCNA

WGCNA R package was applied in order to generate two different co-expression networks for the two sample groups with the 1062 DEGs. Hierarchically clustered dendrograms were generated and modules containing highly interconnected genes were detected. Seven modules, represented by blue, brown, green, grey, red, turquoise and yellow, were detected in the PTC samples, with module sizes of 230, 118, 65, 198, 59, 276 and 116 genes, respectively. Three modules were identified in the normal samples, represented by blue, grey and turquoise, which contained 358, 240 and 464 genes, respectively. The gene cluster dendrograms and detected modules are presented in Fig. 1.
Figure 1.

The cluster dendrograms for (A) the PTC sample network and (B) the normal sample network. Each branch (vertical line) represents an individual gene. Each module, which contains highly correlated genes, is marked by a particular color. PTC, papillary thyroid cancer.

Comparison of networks between the PTC and normal samples

In order to visualize the difference between the two networks, the same color of the module that each gene belonged to in the PTC sample network was applied to the corresponding gene in the normal sample network, without changing the hierarchically clustered dendrogram of the normal sample network. Visual change of the module structure is presented in Fig. 2. Integrated modules in the PTC sample network were found to be divided into several parts in the normal sample network.
Figure 2.

Visual change between PTC sample network and normal sample network. As a result, integrated modules in PTC sample network were divided into several parts in normal sample network. PTC, papillary thyroid cancer.

To objectively estimate whether a module is preserved, Zsummary and medianRank were employed in order to investigate the module preservation quantitatively. Visual results of the module preservation analysis, including the statistics medianRank and Zsummary, are presented in Fig. 3. The brown module has the highest medianRank as well as the lowest Zsummary among all the modules, indicating that it has the weakest preservation of all the modules. In addition, the Zsummary value of the brown module is <3, which points to little or no preservation according to the Zsummary criterion. Taking both its medianRank and Zsummary statistics into account, the brown module was selected for further analysis.
Figure 3.

Preservation of modules in normal sample network compared with PTC sample network. The left panel shows the statistic medianRank vs. module size. The right panel shows the composite statistic Zsummary vs. module size. PTC, papillary thyroid cancer.

Identification of genes associated with PTC in the brown module

The k.in rank change was calculated for each gene in the brown module in order to detect gene interaction changes when PTC occurs. After validation of the 118 genes in the brown module using the point-biserial correlation coefficient, 61 genes were confirmed to be significantly relevant to PTC (P-value <0.05). The validation results of the significant genes with the top 20 k.in rank change values are listed in Table I.
Table I.

Validation results of the significant genes with the top 20 k.in rank change values.

| Gene symbol | r [a] | P-value | k.in rank change [b] |
|---|---|---|---|
| LRP4 | 0.744359743 | 8.71×10^-05 | 108 |
| TMEM108 | 0.663751001 | 0.001446 | 94 |
| ETV4 | 0.801399494 | 5.13×10^-06 | 89 |
| CAPN3 | 0.751756977 | 6.33×10^-05 | 79 |
| KLK7 | 0.808517499 | 3.36×10^-06 | 76 |
| TNRC6C-AS1 | 0.864710246 | 5.44×10^-08 | 75 |
| FXYD5 | 0.676025828 | 0.001009 | 74 |
| KCNK5 | 0.673939248 | 0.001074 | 73 |
| ST3GAL5 | 0.761898241 | 4.00×10^-05 | 72 |
| KLK10 | 0.789168637 | 1.02×10^-05 | 71 |
| PRICKLE1 | 0.747557852 | 7.60×10^-05 | 69 |
| BCHE | -0.514997779 | 0.031851 | 68 |
| NELL2 | 0.589338467 | 0.008726 | 68 |
| IGFBP6 | 0.565886466 | 0.013735 | 61 |
| LOC102724312 | 0.718083931 | 0.000247 | 61 |
| LPL | 0.833774975 | 6.39×10^-07 | 61 |
| SLIT1 | 0.568912151 | 0.012987 | 61 |
| MYH10 | 0.785351749 | 1.25×10^-05 | 59 |
| LGALS1 | 0.500150827 | 0.039436 | 57 |
| LRRK2 | 0.843001019 | 3.24×10^-07 | 57 |

[a] Point-biserial correlation coefficient.
[b] The difference in k.in rank between the PTC and normal sample networks. PTC, papillary thyroid cancer.

Functional annotation and enrichment analyses

The top 10 significantly enriched terms in DAVID are shown in Table II; they include extracellular region, integral component of plasma membrane, negative regulation of ERK1 and ERK2 cascade, extracellular space and cell-cell signaling.
Table II.

Module functional enrichment in DAVID.

| Category | GO ID | GO term | No. of genes | P-value |
|---|---|---|---|---|
| CC | GO:0005576 | Extracellular region | 21 | 5.4×10^-04 |
| CC | GO:0005887 | Integral component of plasma membrane | 19 | 8.1×10^-04 |
| BP | GO:0070373 | Negative regulation of ERK1 and ERK2 cascade | 4 | 4.1×10^-03 |
| BP | GO:0051965 | Positive regulation of synapse assembly | 4 | 4.9×10^-03 |
| CC | GO:0005615 | Extracellular space | 16 | 7.9×10^-03 |
| BP | GO:0060976 | Coronary vasculature development | 3 | 8.4×10^-03 |
| BP | GO:0016042 | Lipid catabolic process | 4 | 1.2×10^-02 |
| BP | GO:0007411 | Axon guidance | 5 | 1.2×10^-02 |
| BP | GO:0007267 | Cell-cell signaling | 6 | 1.3×10^-02 |
| MF | GO:0005198 | Structural molecule activity | 6 | 1.5×10^-02 |

GO, gene ontology; CC, cellular components; BP, biological processes; MF, molecular functions.

Table III shows the results of pathway enrichment based on the KEGG database, listing the top 5 enriched pathways. Several significantly enriched pathways are expected to be involved in PTC, such as transcriptional misregulation in cancer and the Wnt signaling pathway (22,23).
Table III.

Pathway enrichment in ConsensusPathDB.

| Pathway name | No. of genes | Genes | P-value |
|---|---|---|---|
| Taste transduction | 3 | KCNK5, CHRM3, GABBR2 | 0.0104 |
| Transcriptional misregulation in cancer | 4 | ETV4, ETV5, PLAU, DUSP6 | 0.0164 |
| Regulation of actin cytoskeleton | 4 | MYH10, RRAS, CHRM3, CYFIP2 | 0.0288 |
| Wnt signal pathway | 3 | CCND1, PRICKLE1, WIF1 | 0.0424 |
| Renin secretion | 2 | ADORA1, KCNJ2 | 0.0489 |

Discussion

Although studies of genomic events concerning the genetic mechanisms of PTC have achieved great progress in recent years (3,4), few of them have examined interactions among correlated genes thus far. Correlation network-based analysis is a novel method for investigating biologically interactive genes associated with complex disease. Network analysis groups interactive genes into one module in which genes are enriched for specific biological pathways. Moreover, the expression data used for gene grouping are from tissues and cells involved in the disease, so the interconnections between genes are revealed only under the specific disease conditions. Previous studies using expression data mainly focus on the DEGs in order to identify potential genes involved in PTC (24–27). In the present study, in addition to performing differential gene expression analysis, co-expression networks were established separately for the PTC and normal samples using the 1062 DEGs identified. Comparison of gene-gene interactions between normal and disease networks is useful for identifying dysregulated pathways in the process of disease. This method enhances the ability to detect functional genes with regard to changing crosstalk with other genes rather than merely changing expression levels. Module preservation analysis was applied in the present study and a poorly preserved module comprising 118 interactive genes was identified. For the genes in this brown module, the k.in rank change was calculated in order to detect gene interaction changes when PTC occurs. Subsequently, 61 genes were confirmed to be significantly relevant to PTC in the validation analysis. LRP4 is a promising candidate gene for PTC as it ranks first based on the k.in rank changes among the 118 genes in the brown module and also has a high correlation coefficient (r=0.7444) with PTC. In addition, it has been demonstrated to be associated with PTC in previous studies.
LRP4 is reported to be overexpressed in PTC samples (28) and is included in a molecular classifier that is able to discriminate thyroids with PTC accurately from normal thyroids (29). KLK7 is also a potential oncogene for PTC. KLK7 has the fifth largest k.in rank change, as shown in Table I, as well as a significant correlation coefficient (r=0.8085) with PTC in the validation analysis. KLK7 has highly increased levels of expression in PTC compared with those in normal tissue (30). Furthermore, a previous study suggested that KLK7 promotes tumorigenesis and progression of PTC in in vitro and in vivo experiments (31). To further investigate the functional relevance of the identified genes in PTC, functional annotation and enrichment analyses were performed and two significant pathways which may play a role in the pathogenesis of PTC were identified. One of the significant pathways is the Wnt signal pathway, which has been implicated in the genesis and development of PTC in previous studies (22,23). CCND1 and PRICKLE1, which are enriched in this pathway in the enrichment analysis, have both been shown to be significantly relevant to PTC with high correlation coefficient values (r=0.8263 and r=0.7476, respectively) in the validation analysis. CCND1 has been reported to be a vital gene in PTC. Studies have shown that CCND1 expression is significantly associated with PTC growth and lymph node metastases through aberrant activation of the Wnt/beta-catenin pathway (32,33). PRICKLE1 is a potential gene associated with PTC. PRICKLE1 encodes prickle planar cell polarity (PCP) protein 1 and is a core signaling molecule in the Wnt/PCP signaling pathway (34,35). Previous studies indicate that the Wnt/PCP signaling pathway plays critical roles in the proliferation and migration of tumor cells (36,37). The other pathway identified to be involved in PTC is transcriptional misregulation in cancer.
ETV4, ETV5 and DUSP6, which are enriched in this pathway, were all confirmed to be relevant to PTC in the validation analysis. DUSP6 has been reported to play an essential role in the initiation and progression of PTC in many functional experiments (38–40). ETV4 and ETV5 are both oncogenic E26 transformation-specific family transcription factors (41) and are implicated in many other types of cancer, such as colorectal (42), prostatic (43), endometrial (44) and ovarian (45) cancer. The results of the present study provide a novel direction for better understanding the molecular mechanisms involved in PTC. The present study has several limitations. Firstly, not all of the genes associated with PTC that have been reported in previous genetic studies were identified, as the present study is only a supplement to the current investigations of the genetic mechanism of PTC. However, previous studies often focus on the effect of a single gene on PTC, while the present study highlights the impact of gene-gene interactions on the tumor pathological process, which provides a novel insight for detection of potential functional genes. Secondly, the exact effects of the identified genes require corroboration with biological experiments. However, validation and enrichment analyses were used to confirm the identified genes in the present study. Certain of the confirmed genes, such as CCND1 and DUSP6, have been shown to be involved in the pathology of PTC by previous functional and experimental studies. In conclusion, the present study applied correlation network analysis in order to concentrate on gene interactions. Identified genes were validated using another set of expression data and then functional enrichment analysis was used to identify more biological associations. A number of novel genes, such as LRP4, KLK7, PRICKLE1, ETV4 and ETV5, and pathways (Wnt signal pathway and transcriptional misregulation in cancer) were identified that may exert important functions in PTC.
These results contribute to the understanding of the genetic basis of PTC and provide novel insights into the identification of potential functional genes in PTC.
  45 in total

1.  Cyclin D1 overexpression in thyroid papillary microcarcinoma: its association with tumour size and aberrant beta-catenin expression.

Authors:  D Lantsov; S Meirmanov; M Nakashima; H Kondo; V Saenko; Y Naruke; H Namba; M Ito; A Abrosimov; E Lushnikov; I Sekine; Sh Yamashita
Journal:  Histopathology       Date:  2005-09       Impact factor: 5.087

2.  Co-expression network analysis identifies transcriptional modules in the mouse liver.

Authors:  Wei Liu; Hua Ye
Journal:  Mol Genet Genomics       Date:  2014-05-11       Impact factor: 3.291

3.  Gene expression profile of papillary thyroid cancer: sources of variability and diagnostic implications.

Authors:  Barbara Jarzab; Malgorzata Wiench; Krzysztof Fujarewicz; Krzysztof Simek; Michal Jarzab; Malgorzata Oczko-Wojciechowska; Jan Wloch; Agnieszka Czarniecka; Ewa Chmielik; Dariusz Lange; Agnieszka Pawlaczek; Sylwia Szpak; Elzbieta Gubala; Andrzej Swierniak
Journal:  Cancer Res       Date:  2005-02-15       Impact factor: 12.701

4.  Integrated genomic characterization of papillary thyroid carcinoma.

Authors: 
Journal:  Cell       Date:  2014-10-23       Impact factor: 41.582

5.  MicroRNA-145 inhibits human papillary cancer TPC1 cell proliferation by targeting DUSP6.

Authors:  Yifan Gu; Dengfeng Li; Qifeng Luo; Chuankui Wei; Hongming Song; Kaiyao Hua; Jialu Song; Yi Luo; Xiaoyu Li; Lin Fang
Journal:  Int J Clin Exp Med       Date:  2015-06-15

6.  Is my network module preserved and reproducible?

Authors:  Peter Langfelder; Rui Luo; Michael C Oldham; Steve Horvath
Journal:  PLoS Comput Biol       Date:  2011-01-20       Impact factor: 4.475

7.  Wnt/β-catenin signaling pathway is a direct enhancer of thyroid transcription factor-1 in human papillary thyroid carcinoma cells.

Authors:  Marie Gilbert-Sirieix; Joelle Makoukji; Shioko Kimura; Monique Talbot; Bernard Caillou; Charbel Massaad; Liliane Massaad-Massade
Journal:  PLoS One       Date:  2011-07-21       Impact factor: 3.240

8.  Identification of potential biomarkers and drugs for papillary thyroid cancer based on gene expression profile analysis.

Authors:  Ting Qu; Yan-Ping Li; Xiao-Hong Li; Yan Chen
Journal:  Mol Med Rep       Date:  2016-10-19       Impact factor: 2.952

9.  Identification of key genes associated with Schmid-type metaphyseal chondrodysplasia based on microarray data.

Authors:  Bing Wang; Li He; Wusheng Miao; Ge Wu; Hai Jiang; Yongtao Wu; Jining Qu; Min Li
Journal:  Int J Mol Med       Date:  2017-04-19       Impact factor: 4.101

10.  The genetic landscape of benign thyroid nodules revealed by whole exome and transcriptome sequencing.

Authors:  Lei Ye; Xiaoyi Zhou; Fengjiao Huang; Weixi Wang; Yicheng Qi; Heng Xu; Shu Yang; Liyun Shen; Xiaochun Fei; Jing Xie; Min Cao; Yulin Zhou; Wei Zhu; Shu Wang; Guang Ning; Weiqing Wang
Journal:  Nat Commun       Date:  2017-06-05       Impact factor: 14.919
