
Network analyses elucidate the role of SMYD3 in esophageal squamous cell carcinoma.

Xinning Liu1,2, Zhoude Zheng1,2, Chuhong Chen1,2, Simin Guo1,2, Zhennan Liao1,2, Yue Li1,2, Ying Zhu1,2, Haiying Zou1,2, Jianyi Wu1,2, Wenming Xie3, Pixian Zhang1,2, Liyan Xu1,4, Bingli Wu1,2, Enmin Li1,2.   

Abstract

SMYD3 is a member of the SET and myeloid-Nervy-DEAF-1 (MYND) domain-containing protein family of methyltransferases, which are known to play critical roles in carcinogenesis. Expression of SMYD3 is elevated in various cancers, including esophageal squamous cell carcinoma (ESCC), and is correlated with the survival time of patients with ESCC. Here, we integrate gene expression data from a previously described SMYD3-knockdown KYSE150 ESCC cell line with the protein-protein interaction (PPI) network to identify new potential biological roles of SMYD3 and its downstream target genes. Constructing a specific PPI network showed that the differentially expressed genes (DEGs) following SMYD3 knockdown interact with thousands of neighboring proteins. Enrichment analyses with the DAVID Functional Annotation Chart found significant Gene Ontology (GO) terms associated with transcription activities, which are closely related to SMYD3 function. For example, YAP1 and GATA3 might be target genes through which SMYD3 regulates transcription. Enrichment annotation of the total DEG PPI network by GO 'Biological Process' generated a connected functional map and found 532 significant terms, including known and potential biological roles of SMYD3 protein, such as expression regulation, signal transduction, cell cycle, cell metastasis, and invasion. Subcellular localization analyses found that DEGs and their interacting proteins were distributed in multiple layers, which might reflect the intricate biological processes at the spatial level. Our analysis of the PPI network provides important clues for future investigation of the biological roles, mechanisms, and target genes of SMYD3.


Keywords:  SMYD3; functional enrichment annotation; protein–protein interaction network

Year:  2017        PMID: 28781952      PMCID: PMC5536995          DOI: 10.1002/2211-5463.12251

Source DB:  PubMed          Journal:  FEBS Open Bio        ISSN: 2211-5463            Impact factor:   2.693


Abbreviations: DEGs, differentially expressed genes; GO, Gene Ontology; PPI, protein–protein interaction; SMYD3, SET and MYND domain‐containing 3.

Methylation of histone proteins plays a pivotal role in the regulation of a wide range of biological processes. The SET and myeloid‐Nervy‐DEAF‐1 (MYND) domain‐containing protein (SMYD) family of methyltransferases includes SMYD1, SMYD2, SMYD3, SMYD4, and SMYD5 and has been found to play critical roles in human carcinogenesis. Altered expression of SMYD3 is associated with the progression of several solid tumors, including bladder cancer 1, glioma 2, gastric cancer 3, and prostate cancer 4. Sarris et al. 5 found a significant correlation between elevated expression of SMYD3 and the incidence of both hepatocellular carcinoma and colorectal cancer. Several studies have explored the effects of SMYD3 overexpression on cancer cell proliferation, viability, migration, and invasion 6, 7, 8, 9, 10. These experiments suggested that SMYD3 could serve as a potential biomarker for clinically aggressive disease and an attractive therapeutic target. The biological roles of SMYD3 are numerous and extend far beyond its transactivation activity, so its other roles deserve more attention.

Protein–protein interaction (PPI) refers to the association among protein molecules, and network‐based analysis studies these associations from the aspects of biochemistry, signal transduction, and biomolecular networks. Proteins do not work alone; they interact with other proteins to carry out specific functions in their biological context 11. In recent years, the integrated analysis of large‐scale gene expression data, or other high‐throughput data, with the PPI network has received great attention as a way to elucidate biological processes 12.
Integration analyses with the PPI network have a number of applications, such as protein interaction prediction, disease candidate gene identification, protein function prediction, functional protein module identification, protein complex identification, and drug target identification 13. In our previous study, we found that SMYD3 expression is frequently upregulated in human esophageal squamous cell carcinoma (ESCC) tissues and correlates with the overall survival of patients with ESCC 14. RNAi‐mediated knockdown of SMYD3 suppressed ESCC cell proliferation, migration, and invasion in vitro and inhibited local tumor invasion in vivo 14. Analyses of microarray or other high‐throughput experiments should go beyond merely listing differentially expressed genes (DEGs). It is helpful to explain the biological roles or biological phenotypes of target genes, especially as more and more spatial or temporal interactions between proteins become available from public databases. In this study, we analyzed the DEGs that follow SMYD3 knockdown in ESCC cells by applying PPI network analysis.

Materials and methods

Expression of SMYD3 in ESCC from TCGA data

Expression data for esophageal carcinoma (TCGA_ESCA_exp_HiSeq‐2015‐02‐24), containing level 3 RNA‐seq expression data for 89 cases of ESCC, were downloaded from TCGA (https://cancergenome.nih.gov/). The X‐tile program 3.6.1 15 was used to define the optimal cutoff point for the SMYD3 expression level and to classify patients with ESCC into two groups, high and low SMYD3 expression. Kaplan–Meier survival analysis and the log‐rank test were then performed with SPSS software (version 13.0; SPSS, Inc., Chicago, IL, USA).
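As an illustration of the survival analysis step, the following is a minimal sketch of a Kaplan–Meier estimate for one expression group. It is not the X‐tile or SPSS procedure used in this study; the function name `kaplan_meier` and the toy follow‐up data are hypothetical and serve only to show how the survival probability is updated at each observed death.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  : follow-up time per patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # patients leaving the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve

# Hypothetical toy data (months, event flag); not taken from TCGA.
toy_times = [5, 8, 8, 12, 20]
toy_events = [1, 1, 0, 1, 0]
km_curve = kaplan_meier(toy_times, toy_events)
```

Computing one such curve per group (high and low SMYD3 expression) gives the two curves that a log‐rank test would then compare.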

The expression profile and differentially expressed genes

SMYD3 was knocked down as described in our previous study 14. Briefly, shRNA sequences targeting SMYD3 were ligated into the pGLV3/H1/GFP/+ Puro vector and transfected into the KYSE150 ESCC cell line. KYSE150 cells transfected with an empty plasmid served as a control. SMYD3 knockdown was confirmed by both qRT‐PCR and western blot, and the mRNA expression profile was analyzed with the GeneChip® PrimeView™ Human Gene Expression Array (Affymetrix, Santa Clara, CA, USA). Raw data were normalized and log‐transformed. Both raw and processed expression data have been submitted to the NCBI GEO database (http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE85419. A threshold of twofold change was set for DEGs in this study.
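The twofold threshold can be sketched as follows. This is only an illustration under stated assumptions: expression values are taken to be log2‐transformed, one value per gene and condition, and the cutoff is inclusive. The function name `call_degs` and the gene labels and values are hypothetical and are not taken from GSE85419.

```python
import math

def call_degs(log2_knockdown, log2_control, fold=2.0):
    """Return (upregulated, downregulated) gene lists at a fold-change cutoff.

    Both arguments map gene symbol -> log2 expression value.
    """
    cutoff = math.log2(fold)  # twofold corresponds to a log2 difference of 1
    up, down = [], []
    for gene in sorted(log2_knockdown.keys() & log2_control.keys()):
        diff = log2_knockdown[gene] - log2_control[gene]
        if diff >= cutoff:
            up.append(gene)
        elif diff <= -cutoff:
            down.append(gene)
    return up, down

# Hypothetical values for illustration only.
knockdown = {"GENE_A": 9.5, "GENE_B": 6.0, "GENE_C": 7.2}
control = {"GENE_A": 8.0, "GENE_B": 7.5, "GENE_C": 7.0}
up_genes, down_genes = call_degs(knockdown, control)
```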

Construction of the PPI network

Currently known human PPI data are available from the newest releases of the following databases: HPRD (http://www.hprd.org/), BioGRID (http://thebiogrid.org/), DIP (http://dip.doe-mbi.ucla.edu/dip/Main.cgi), and IntAct (http://www.ebi.ac.uk/intact/). These physical protein interactions were collected from published low‐throughput and high‐throughput experimental results, providing high confidence for subsequent analyses, such as disease research integrated with the human PPI network 16, 17. We first manually integrated the data to obtain a unique dataset of interactions for Homo sapiens. This dataset was treated as a curated parental PPI network, containing 18 644 unique proteins and 199 411 interactions, from which new (child) PPI networks were constructed. Cytoscape software has been widely applied for visualization, data integration, and analysis of PPI networks, as it provides updated plugins to meet the needs of large‐scale data analyses 18. In Cytoscape, PPI networks are visualized as graphs, with proteins as nodes and their interactions as edges. First, the total DEGs, downregulated DEGs, and upregulated DEGs were each mapped to the parental PPI network, and their first interacting neighbors were extracted to construct three PPI subnetworks. To increase reliability and reduce unnecessary connections, network reconstruction was limited to the first interacting protein neighbors of these DEGs. Second, to detect the axis of SMYD3‐neighboring proteins, SMYD3 was used as the query node to construct an SMYD3‐central PPI network. Third, a subnetwork was created to detect the internal interactions between DEGs by mapping all DEGs to the parental PPI network.
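The first‐neighbor extraction described above can be sketched in a few lines. This is a minimal illustration, not the Cytoscape workflow used in this study. It assumes that the parental network is an undirected edge list and that interactions among the retained neighbors are kept as well; the function name `first_neighbor_subnetwork` and the node labels are hypothetical.

```python
def first_neighbor_subnetwork(edges, seeds):
    """Return (nodes, edges) for the seeds plus their direct neighbors.

    edges : iterable of (protein_a, protein_b) pairs from the parental network
    seeds : query proteins, e.g. the proteins encoded by DEGs
    Seeds without any interaction in the parental network are dropped.
    """
    edges = list(edges)
    seeds = set(seeds)
    nodes = set()
    for a, b in edges:
        if a in seeds or b in seeds:
            nodes.update((a, b))
    # Keep every parental interaction whose two ends are both retained.
    sub_edges = {
        frozenset((a, b))
        for a, b in edges
        if a != b and a in nodes and b in nodes
    }
    return nodes, sub_edges

# Hypothetical toy network: D1 is a DEG with interactions, DX has none.
parental = [("D1", "P1"), ("P1", "P2"), ("D1", "P2"), ("P2", "P3")]
sub_nodes, sub_edges = first_neighbor_subnetwork(parental, {"D1", "DX"})
```

In this toy case P3 is excluded because it is two steps away from the only mapped seed, which mirrors the restriction to first interacting neighbors.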

Analyses of PPI network topological parameters

The analyses of multiple topological parameters of the networks were carried out with NetworkAnalyzer 19. Network topological parameters are important characteristics for understanding the organization of complex networks, such as the PPI network 20. The degree of a node is defined as the number of its directly interacting protein neighbors in the PPI network. One of the most important network topological characteristics is that the degree distribution follows a scale‐free power law distribution for many large networks, such as PPI networks or social networks. In this study, the power law distribution of node degrees was analyzed by the method in our previous report 21. The three new PPI subnetworks were treated as undirected. The degree distribution P(k) of a large‐scale network is defined as the fraction of nodes in the network with degree k. The dependency of P(k) on k can be visualized by fitting a line to the node degree distribution data. NetworkAnalyzer fits the line using the data points with positive coordinate values, which corresponds to a power law curve of the form y = βx^a. The R² value is a statistical measure of the linearity of the curve fit and is used to quantify the fit to the power law. When the fit is good, the R² value is very close to 1. Several other topological parameters were also analyzed and are reported, as they indicate the network properties.
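To make the power law fit concrete, the sketch below computes a degree distribution from an edge list and fits y = βx^a by ordinary least squares on log‐transformed data, reporting R² on the same log scale. It is an assumption that this mirrors the fitting procedure closely enough for illustration; the function names and toy inputs are hypothetical and this is not NetworkAnalyzer's code.

```python
import math
from collections import Counter

def degree_distribution(edges):
    """Return {k: fraction of nodes with degree k} for an undirected edge list."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    counts = Counter(degree.values())
    n_nodes = len(degree)
    return {k: c / n_nodes for k, c in counts.items()}

def fit_power_law(points):
    """Fit y = beta * x**a on log-log axes; return (beta, a, r_squared)."""
    logs = [(math.log(x), math.log(y)) for x, y in points if x > 0 and y > 0]
    n = len(logs)
    mean_x = sum(x for x, _ in logs) / n
    mean_y = sum(y for _, y in logs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in logs)
    syy = sum((y - mean_y) ** 2 for _, y in logs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in logs)
    a = sxy / sxx
    beta = math.exp(mean_y - a * mean_x)
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return beta, a, r_squared

# Toy star graph: one hub with three leaves.
p_k = degree_distribution([("H", "L1"), ("H", "L2"), ("H", "L3")])

# Toy points lying exactly on y = 100 * x**-2.
beta, a, r2 = fit_power_law([(1, 100.0), (2, 25.0), (4, 6.25), (5, 4.0)])
```

A negative exponent a corresponds to the decreasing trend expected of a scale‐free degree distribution.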

Gene functional enrichment analyses

The Functional Annotation Chart in DAVID (http://david.abcc.ncifcrf.gov/) is able to examine the significance of gene‐term enrichment by the application of a modified Fisher's exact test 22. The chart covers 40 annotation categories, including GO terms, protein functional domains, pathways, sequence features, disease associations, homology, gene functional summaries, and literature. Terms from the Functional Annotation Chart with P < 0.05 were visualized by Enrichment Map 23, which organizes the enriched overlapped gene sets into a network.
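The idea behind a modified Fisher's exact test can be sketched with a hypergeometric tail probability. The sketch below is an approximation in the spirit of the EASE score: it assumes that the modification consists of removing one annotated gene from the list before computing the one‐sided probability. The exact contingency table that DAVID builds may differ, and the function names and numbers are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    if k <= 0:
        return 1.0
    total = comb(N, n)
    top = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, top + 1)) / total

def ease_like_score(list_hits, list_size, pop_hits, pop_size):
    """Conservative variant: drop one annotated gene from the list first."""
    return hypergeom_upper_tail(list_hits - 1, list_size - 1, pop_hits, pop_size)

# Hypothetical toy numbers: 2 of 4 listed genes carry a term that
# annotates 3 of the 10 genes in the background.
fisher_p = hypergeom_upper_tail(2, 4, 3, 10)
ease_p = ease_like_score(2, 4, 3, 10)
```

Because one supporting gene is removed, the modified score is never smaller than the unmodified one‐sided probability, which penalizes terms supported by very few genes.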

Generation of the functional annotation map

The total DEG PPI network was annotated by Gene Ontology (GO) to mine enriched GO ‘Biological Process’ terms using the ClueGO plugin, which creates a functionally and hierarchically organized GO/pathway term network 24. Enriched annotated terms with a P‐value < 0.001 were defined as significant. To visualize the relationship between GO terms, the threshold for the kappa score, which reflects the overlap of genes between terms, was set to 0.3.
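A kappa score between two terms can be illustrated with Cohen's kappa computed on gene membership. This is a sketch under the assumption that each gene in a common universe is scored as belonging or not belonging to each term; ClueGO's exact implementation may differ, and the function name and gene labels are hypothetical.

```python
def kappa_score(term_a, term_b, universe):
    """Cohen's kappa for the agreement of two gene sets over a gene universe."""
    universe = set(universe)
    a_set = set(term_a) & universe
    b_set = set(term_b) & universe
    n = len(universe)
    both = len(a_set & b_set)
    only_a = len(a_set - b_set)
    only_b = len(b_set - a_set)
    neither = n - both - only_a - only_b
    observed = (both + neither) / n
    expected = (
        (both + only_a) * (both + only_b)
        + (neither + only_b) * (neither + only_a)
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: both sets empty or both equal to the universe
    return (observed - expected) / (1.0 - expected)

# Hypothetical universe of ten genes and two partially overlapping terms.
genes = ["g%d" % i for i in range(1, 11)]
kappa = kappa_score(genes[0:4], genes[2:6], genes)
```

In this toy case the two terms share two of their four genes and the score is about 0.17, so they would not be linked at a threshold of 0.3.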

Subcellular classes of the total DEG PPI network

The subcellular localization of proteins in the total DEG PPI network was obtained from the ‘GENE‐ONTOLOGY annotation file’ in the HPRD database, which was curated and imported into the network as a node attribute. For proteins annotated with multiple locations, including proteins that might translocate into the nucleus (e.g., found in both the cytoplasm and the nucleus), the localizations were merged (e.g., cytoplasm/nucleus). The Cerebral program separates the nodes of the total DEG PPI network into multiple layers according to their subcellular localization while retaining their interactions, generating a pathway‐like graph 25.
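The merging of multiple localization annotations into one layer label can be sketched as below. The layer names follow those reported in the Results; the ordering from outside to inside, the fallback label `unknown`, and the function name are assumptions made for illustration, not part of the HPRD file or the Cerebral program.

```python
# Layers ordered from the outside of the cell to the nucleus (assumed order).
LAYER_ORDER = ["secreted", "membrane", "cytoskeleton", "cytoplasm", "nucleus"]

def merge_localizations(annotations):
    """Collapse a protein's localization annotations into one layer label."""
    found = {a.strip().lower() for a in annotations}
    ordered = [layer for layer in LAYER_ORDER if layer in found]
    # "unknown" is a hypothetical placeholder for unannotated proteins.
    return "/".join(ordered) if ordered else "unknown"

merged = merge_localizations(["Nucleus", "Cytoplasm"])
single = merge_localizations(["membrane"])
```

A protein with more than two annotated locations would produce a longer label than any of the layers reported here, so such cases would need a separate rule.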

Results

Correlation of SMYD3 with survival of patients with ESCC

We previously found that SMYD3 is overexpressed in patients with ESCC from China; here, we used the TCGA dataset as a validation cohort. Using X‐tile, patients with ESCC were classified into two groups according to the expression level of SMYD3 (P < 0.001; Fig. 1A). Compared with patients with high SMYD3 expression, patients with lower SMYD3 expression had a longer survival time (P = 0.014), consistent with our previous report (Fig. 1B).
Figure 1

Expression of SMYD3 correlates with survival of patients with ESCC. (A) The significant difference between low and high expression levels of SMYD3. (B) A lower expression level of SMYD3 favors a longer survival time for patients with ESCC.


The construction of three DEG PPI networks

Using twofold as the threshold, we identified 238 DEGs (85 upregulated genes and 153 downregulated genes) from the mRNA expression microarray of SMYD3‐knockdown ESCC cells. To provide a landscape of which cellular biological activities the DEGs participate in, and how, we constructed a full view of their interacting proteins to shed light on their functions, as well as on the biological role of SMYD3 in ESCC. The PPI datasets from several acknowledged interaction databases, collected from the literature, provide original and credible data for this kind of research. By mapping the total, downregulated, and upregulated DEGs to the parental PPI network and extracting their first interacting neighbors, three sub‐PPI networks were generated. As shown in Fig. 2A, the total DEG PPI subnetwork contained 4426 nodes and 82 039 edges, including 204 DEGs. Second, 129 downregulated DEGs had interacting proteins reported in the literature and formed a sub‐PPI network composed of 2963 nodes and 58 833 edges (Fig. 2B). Third, 85 upregulated proteins formed a PPI subnetwork containing 2176 nodes and 37 238 edges (Fig. 2C). These three PPI subnetworks suggest that the knockdown of SMYD3 substantially disturbs protein activities in ESCC, as more than 200 DEGs resulting from SMYD3 knockdown can broadly influence biological processes through interactions with thousands of other proteins.
Figure 2

PPI network generation by mapping DEGs to the parental PPI network. (A) PPI network of total DEGs. (B) PPI network of downregulated DEGs. (C) PPI network of upregulated DEGs. (D) SMYD3‐central PPI subnetwork. (E) Internal interactions of DEGs. Different node colors indicate the types of proteins represented. Green and red nodes represent proteins encoded by down‐ and upregulated genes, respectively. Blue nodes represent interacting proteins that were not significantly differentially expressed.

Currently, 23 SMYD3‐interacting proteins have been identified. Nevertheless, the expression levels of these 23 proteins were not significantly changed, except for the gene mesoderm‐specific transcript (MEST), which was downregulated in the mRNA profile of SMYD3 knockdown in ESCC (Fig. 2D). The DEG–DEG interactions were acquired from the parental PPI network to detect their internal interactions. The DEG–DEG network contained 77 nodes (48 downregulated and eight upregulated DEGs) and 120 edges, and included a large module containing 59 DEGs (39 downregulated and 20 upregulated; Fig. 2E).

Network topological characteristics analyses

Based on specific distinguishing principles (e.g., power law distribution of node degree), a real biological network, such as the PPI network, can be clearly discriminated from random networks 26, 27. For the total, downregulated, and upregulated DEG networks, the node degree distributions approximately followed a power law distribution, with R² = 0.854, 0.831, and 0.845, respectively (Fig. 3A–C). These results indicate that the three DEG PPI subnetworks constructed in this study are real complex biological networks characterized by scale‐free topology 28. They also suggest that a small number of important proteins act as hub nodes with a large number of interactions. Three topological metrics proposed to understand the structure of a complex network, specifically network density, network centralization, and clustering coefficient, are shown in Table 1. Several other important network characteristics, for example, average clustering coefficient distribution, closeness centrality, neighborhood connectivity distribution, and topological coefficients, are shown in Fig. 3D–G.
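The three summary metrics named above can be illustrated on a small graph. The sketch assumes commonly used definitions for a simple undirected network with N nodes and E edges: density 2E/(N(N−1)), centralization N/(N−2) × (max degree/(N−1) − density), and the clustering coefficient averaged over all nodes, with nodes of degree below two counted as zero. Whether these match the exact conventions behind Table 1 is an assumption; the function name and toy graph are hypothetical.

```python
from itertools import combinations

def topology_summary(edges):
    """Return (density, centralization, mean clustering coefficient).

    Treats the edge list as a simple undirected graph; needs more than 2 nodes.
    """
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    n = len(neighbors)
    m = sum(len(v) for v in neighbors.values()) // 2
    density = 2 * m / (n * (n - 1))
    max_degree = max(len(v) for v in neighbors.values())
    centralization = n / (n - 2) * (max_degree / (n - 1) - density)
    coefficients = []
    for nbrs in neighbors.values():
        k = len(nbrs)
        if k < 2:
            coefficients.append(0.0)
            continue
        # Count the pairs of neighbors that are themselves connected.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
        coefficients.append(2 * links / (k * (k - 1)))
    clustering = sum(coefficients) / n
    return density, centralization, clustering

# Toy graph: a triangle A-B-C with one pendant node D attached to C.
density, centralization, clustering = topology_summary(
    [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
)
```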
Figure 3

Power law distribution of node degree of the total DEG PPI network (A), downregulated DEG PPI network (B), and upregulated DEG PPI subnetwork (C). The graphs show the degree distribution decreasing as the number of links increases, indicating scale‐free topology. The average clustering coefficient distribution (D), closeness centrality (E), neighborhood connectivity distribution (F), and topological coefficients (G) are also shown.

Table 1

Topological parameters of three DEGs PPI subnetwork

PPI subnetwork | y = βx^a | R² | Correlation | Clustering coefficient | Network centralization | Network density
Total DEGs | y = 2000.9x^−1.289 | 0.854 | 0.514 | 0.188 | 0.206 | 0.009
Downregulated DEGs | y = 1183.6x^−1.231 | 0.831 | 0.511 | 0.200 | 0.275 | 0.013
Upregulated DEGs | y = 764.59x^−1.194 | 0.814 | 0.619 | 0.240 | 0.254 | 0.013

Functional annotation map of DEGs

To gain a full view of the functions and categories of the DEGs, the DEGs were annotated by Functional Annotation Chart and visualized by Enrichment Map. As shown in Fig. 4, one node represents one functional annotation term. The more significant the enriched term, the larger it is. Nodes from the same kind of functional category are shown in the same color. The edge width, defined by the overlap coefficient between the enriched terms (overlap coefficient cutoff was set as 0.6), is wider when more of the same genes overlap in two nodes.
Figure 4

Enrichment map for the DEGs to identify significant biological functions (P < 0.05). One node represented a significant functional term from the Functional Annotation chart. Node size represents enrichment significance. Edges indicate overlap between gene sets, whereas the thickness indicates the size of the overlap.

The Functional Annotation Chart results contained another 94 terms from the following annotation categories: 21 KEGG_PATHWAY, 13 INTERPRO, 26 UP_SEQ_FEATURE, 1 SMART, 29 SP_PIR_KEYWORDS, and 3 BIOCARTA (Appendix S1). These results provide more information than GO enrichment alone. Accumulated evidence suggests that SMYD3 can influence distinct oncogenic processes by acting as a gene‐specific transcriptional regulator. This is supported by the involvement of the DEGs from its knockdown in transcription activities, as shown by two terms from GOTERM_BP_DIRECT: ‘GO:0010468~regulation of gene expression’ (PTHLH, BCL2, PHGDH, MYC, PHLDA2) and ‘GO:0010628~positive regulation of gene expression’ (PLSCR1, INHBA, EZR, TNC, TP53, MAPK9, IL1B, HMGA2, KDM5B, HNRNPU, IL1A, FN1). SMYD3 is also a histone lysine methyltransferase, and two significant terms related to transcriptional regulation at the chromosome level were found. GO:0003682, ‘chromatin binding’, contained 10 enriched genes: DLX2, NUPR1, SMARCE1, GATA3, ANKRD2, TP53, BCL6, yes‐associated protein 1 (YAP1), LOXL2, and PLAC8. GO:0000979, ‘RNA polymerase II core promoter sequence‐specific DNA binding’, contained four enriched genes: GATA3, TP53, SMYD3, and YAP1. The term ‘domain: Leucine‐zipper’ from UP_SEQ_FEATURE contained five DEGs: MAX, TSC22D3, ATF3, CREB5, and MYC, indicating that their protein sequence features are involved in DNA binding during transcription regulation. These results are consistent with SMYD3 functioning as a histone methyltransferase and might explain why its knockdown in tumor cells extensively affects gene expression regulation.
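The overlap coefficient used to draw edges in the Enrichment Map can be illustrated with the two GO terms listed above. The sketch assumes the usual definition, the size of the intersection divided by the size of the smaller gene set; the function name is hypothetical, and the example does not claim that these two terms are connected in Fig. 4.

```python
def overlap_coefficient(set_a, set_b):
    """Shared genes divided by the size of the smaller gene set."""
    set_a, set_b = set(set_a), set(set_b)
    smaller = min(len(set_a), len(set_b))
    if smaller == 0:
        return 0.0
    return len(set_a & set_b) / smaller

# Gene lists as reported for the two enriched terms in this section.
chromatin_binding = {
    "DLX2", "NUPR1", "SMARCE1", "GATA3", "ANKRD2",
    "TP53", "BCL6", "YAP1", "LOXL2", "PLAC8",
}
pol2_core_promoter_binding = {"GATA3", "TP53", "SMYD3", "YAP1"}

shared = chromatin_binding & pol2_core_promoter_binding
coefficient = overlap_coefficient(chromatin_binding, pol2_core_promoter_binding)
```

Three of the four genes in the smaller term are shared, so under this definition the pair would exceed a cutoff of 0.6.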

Generation of functional annotation map

Cellular activities might be affected by the DEGs through cascades of interactions in the network to perform their multiple biological functions. To find potential cellular activities perturbed by SMYD3 protein through its DEGs and their interacting proteins, GO ‘Biological Process’ enrichment analyses of the total DEG PPI network were performed. We generated 532 enriched GO terms to construct a functional annotation map, in which the nodes were no longer proteins, but rather their enriched GO terms, with the edges suggesting significant overlapping of enriched proteins between two GO terms (Fig. 5).
Figure 5

Functional map of the total DEG PPI network. In this functionally grouped network, terms are shown as nodes and linked based on their kappa score (≥ 0.3). Functional groups with overlapping enriched genes are linked by an edge. Similar GO terms are labeled in the same color. GO terms of interest, related or potentially related to known and potential functions of SMYD3, are grouped by Roman numerals. I: gene expression regulation‐associated terms; II: cell cycle‐associated terms; III: protein synthesis and RNA processing; IV: cancer cell metastasis and invasion; V: signal regulation or signal transduction.

Interestingly, many groups of GO terms possibly associated with SMYD3 functions were found, as indicated by capitalized Roman numerals. As SMYD3 plays important roles in histone modification, a group of gene expression regulation‐associated terms was identified, such as ‘gene expression’, ‘regulation of gene expression’, ‘positive regulation of transcription, DNA‐dependent’, and ‘transcription from RNA polymerase II promoter’. The second interesting result was that many of the proteins from the total DEG PPI subnetwork participate in signal regulation or signal transduction, for example, ‘signal transduction’, ‘intracellular signal transduction’, ‘regulation of cell communication’, and ‘regulation of intracellular protein kinase cascade’. A large functional group contained several cell cycle‐associated GO terms, for example, ‘cell cycle process’, ‘mitotic cell cycle’, ‘regulation of cell cycle’, ‘cell cycle arrest’, and ‘response to DNA damage stimulus’, suggesting that SMYD3 regulates the cell cycle directly or indirectly through a cascade of protein interactions. Three terms reflecting the known biological roles of SMYD3 in promoting cancer cell metastasis and invasion were also found: ‘cell–substrate adhesion’, ‘cell–matrix adhesion’, and ‘positive regulation of cell adhesion’. The significant GO terms of interest are listed in Table 2.
Table 2

Interesting significant GO terms for SMYD3‐knockdown PPI network

GO ID | Term name | P‐value corrected with Bonferroni
mRNA translation‐related terms
GO:0043933 | Macromolecular complex subunit organization | 1.36E‐56
GO:0016071 | mRNA metabolic process | 9.51E‐50
GO:0070727 | Cellular macromolecule localization | 1.18E‐44
GO:0034613 | Cellular protein localization | 2.27E‐44
GO:0006886 | Intracellular protein transport | 4.69E‐42
GO:0033365 | Protein localization to organelle | 1.16E‐37
GO:0006412 | Translation | 7.26E‐35
GO:0006413 | Translational initiation | 1.16E‐32
GO:0034623 | Cellular macromolecular complex disassembly | 2.36E‐32
GO:0006414 | Translational elongation | 8.80E‐32
GO:0006415 | Translational termination | 9.32E‐32
GO:0045047 | Protein targeting to ER | 9.92E‐32
GO:0072599 | Establishment of protein localization in endoplasmic reticulum | 9.92E‐32
GO:0043241 | Protein complex disassembly | 1.00E‐27
GO:0015031 | Protein transport | 7.98E‐26
GO:0045184 | Establishment of protein localization | 9.67E‐26
GO:0048610 | Cellular process involved in reproduction | 1.07E‐24
GO:0006401 | RNA catabolic process | 1.94E‐24
GO:0010608 | Post‐transcriptional regulation of gene expression | 6.50E‐18
GO:0006396 | RNA processing | 1.08E‐17
GO:0072594 | Establishment of protein localization to organelle | 3.65E‐17
GO:0008380 | RNA splicing | 6.43E‐15
GO:0006397 | mRNA processing | 4.23E‐14
GO:2000241 | Regulation of reproductive process | 1.84E‐05
GO:0006417 | Regulation of translation | 2.13E‐05
Signal pathway‐related terms
GO:0016310 | Phosphorylation | 5.89E‐51
GO:0035556 | Intracellular signal transduction | 1.15E‐38
GO:0009966 | Regulation of signal transduction | 1.34E‐37
GO:0010646 | Regulation of cell communication | 2.61E‐37
GO:0007165 | Signal transduction | 1.07E‐33
GO:0043549 | Regulation of kinase activity | 1.32E‐30
GO:0051338 | Regulation of transferase activity | 4.04E‐30
GO:0048585 | Negative regulation of response to stimulus | 9.24E‐24
GO:0000165 | MAPK cascade | 1.14E‐19
GO:0010627 | Regulation of intracellular protein kinase cascade | 1.31E‐19
GO:0080135 | Regulation of cellular response to stress | 2.68E‐19
GO:0051347 | Positive regulation of transferase activity | 2.49E‐18
GO:0009967 | Positive regulation of signal transduction | 6.47E‐17
GO:0023056 | Positive regulation of signaling | 1.56E‐15
GO:0043408 | Regulation of MAPK cascade | 6.18E‐13
GO:0043405 | Regulation of MAP kinase activity | 3.37E‐11
GO:0043406 | Positive regulation of MAP kinase activity | 2.86E‐07
GO:0010741 | Negative regulation of intracellular protein kinase cascade | 4.25E‐07
Cell cycle‐related terms
GO:0051726 | Regulation of cell cycle | 5.94E‐48
GO:0022402 | Cell cycle process | 7.59E‐43
GO:0000278 | Mitotic cell cycle | 1.28E‐41
GO:0051329 | Interphase of mitotic cell cycle | 5.92E‐36
GO:0051329 | Interphase of mitotic cell cycle | 5.93E‐36
GO:0045786 | Negative regulation of cell cycle | 1.92E‐33
GO:0022403 | Cell cycle phase | 1.05E‐31
GO:0007050 | Cell cycle arrest | 1.27E‐29
GO:0010564 | Regulation of cell cycle process | 1.19E‐27
GO:0000082 | G1/S transition of mitotic cell cycle | 1.00E‐26
GO:0007346 | Regulation of mitotic cell cycle | 4.10E‐24
GO:0071156 | Regulation of cell cycle arrest | 5.40E‐23
GO:0000075 | Cell cycle checkpoint | 1.39E‐20
GO:2000602 | Regulation of interphase of mitotic cell cycle | 1.49E‐18
GO:2000045 | Regulation of G1/S transition of mitotic cell cycle | 1.49E‐16
GO:2000045 | Regulation of G1/S transition of mitotic cell cycle | 1.49E‐16
GO:0000084 | S phase of mitotic cell cycle | 4.75E‐13
GO:0071158 | Positive regulation of cell cycle arrest | 6.00E‐13
GO:0031571 | Mitotic cell cycle G1/S transition DNA damage checkpoint | 1.53E‐12
GO:0031575 | Mitotic cell cycle G1/S transition checkpoint | 2.71E‐12
GO:0090068 | Positive regulation of cell cycle process | 2.97E‐12
GO:0000086 | G2/M transition of mitotic cell cycle | 1.56E‐11
GO:0000087 | M phase of mitotic cell cycle | 4.06E‐10
GO:0045787 | Positive regulation of cell cycle | 9.05E‐08
Gene expression regulation‐related terms
GO:0010467 | Gene expression | 8.82E‐71
GO:0006139 | Nucleobase‐containing compound metabolic process | 7.11E‐69
GO:0090304 | Nucleic acid metabolic process | 4.05E‐67
GO:0034641 | Cellular nitrogen compound metabolic process | 1.94E‐57
GO:0016070 | RNA metabolic process | 1.11E‐54
GO:0009059 | Macromolecule biosynthetic process | 1.93E‐44
GO:0034645 | Cellular macromolecule biosynthetic process | 1.97E‐44
GO:2000113 | Negative regulation of cellular macromolecule biosynthetic process | 5.72E‐41
GO:0010558 | Negative regulation of macromolecule biosynthetic process | 9.56E‐41
GO:0010629 | Negative regulation of gene expression | 1.54E‐40
GO:0032774 | RNA biosynthetic process | 9.47E‐40
GO:0010468 | Regulation of gene expression | 1.09E‐39
GO:0010628 | Positive regulation of gene expression | 4.79E‐39
GO:0051171 | Regulation of nitrogen compound metabolic process | 1.66E‐38
GO:0009891 | Positive regulation of biosynthetic process | 1.76E‐37
GO:0051254 | Positive regulation of RNA metabolic process | 4.72E‐37
GO:0045934 | Negative regulation of nucleobase‐containing compound metabolic process | 8.42E‐37
GO:0009890 | Negative regulation of biosynthetic process | 8.80E‐37
GO:0031328 | Positive regulation of cellular biosynthetic process | 1.68E‐36
GO:0010557 | Positive regulation of macromolecule biosynthetic process | 4.85E‐36
GO:0031327 | Negative regulation of cellular biosynthetic process | 8.05E‐36
GO:0051253 | Negative regulation of RNA metabolic process | 2.26E‐35
GO:0010556 | Regulation of macromolecule biosynthetic process | 2.72E‐35
GO:2000112 | Regulation of cellular macromolecule biosynthetic process | 5.58E‐35
GO:0045893 | Positive regulation of transcription, DNA‐dependent | 1.40E‐33
GO:0031326 | Regulation of cellular biosynthetic process | 2.14E‐33
GO:0045892 | Negative regulation of transcription, DNA‐dependent | 2.67E‐33
GO:0044249 | Cellular biosynthetic process | 2.93E‐32
GO:0051252 | Regulation of RNA metabolic process | 1.43E‐30
GO:2001141 | Regulation of RNA biosynthetic process | 1.91E‐29
GO:0006355 | Regulation of transcription, DNA‐dependent | 2.01E‐28
GO:0006351 | Transcription, DNA‐dependent | 1.86E‐27
GO:0019219 | Regulation of nucleobase‐containing compound metabolic process |
GO:0006366 | Transcription from RNA polymerase II promoter |
Cell adhesion‐related terms
GO:0031589 | Cell–substrate adhesion | 3.19E‐6
GO:0007160 | Cell–matrix adhesion | 1.85E‐4
GO:0045785 | Positive regulation of cell adhesion | 9.38E‐4
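The Bonferroni correction applied to the P‐values in Table 2 can be sketched as follows: each raw P‐value is multiplied by the number of tests and capped at 1. The function name and the raw P‐values are hypothetical and are not taken from the enrichment results.

```python
def bonferroni(p_values):
    """Multiply each P-value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical raw P-values for three tested terms.
corrected = bonferroni([0.01, 0.04, 0.5])
```

With many tested terms the correction is strict, so only terms with very small raw P‐values remain below a threshold such as 0.001.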

Subcellular layers of proteins in the PPI subnetwork

Proper subcellular localization of proteins is crucial because the appropriate location provides the physiological context for their functions, such as signal transduction, transcription regulation, protein modification, and complex formation. The Cerebral program arranges the nodes of a PPI network into different subcellular layers while maintaining their interactions. The total DEG PPI network was separated into 10 layers in the following percentages: secreted (6.5%), membrane (13.2%), cytoskeleton (0.5%), cytoskeleton/cytoplasm (0.3%), cytoplasm (26.4%), secreted/nucleus (0.8%), membrane/nucleus (0.3%), cytoskeleton/nucleus (0.45%), cytoplasm/nucleus (28.6%), and nucleus (23%; Fig. 6A). These results suggest that the proteins in the total DEG PPI network are distributed from the extracellular space through the intracellular compartments to the nucleus.
Figure 6

Subcellular layers illustrating the PPI network. (A) The complete DEG PPI network. (B) SMYD3‐central PPI subnetwork. Proteins were distributed according to their subcellular location without changing their interactions.

There are currently 23 SMYD3‐interacting proteins reported and annotated in the PPI databases. The best‐recognized function of SMYD3 is as a histone methyltransferase, suggesting that this protein is mainly localized in the nucleus. To examine whether SMYD3 and its interacting proteins could play roles in the nucleus, the proteins of the SMYD3‐central PPI network were also arranged according to their subcellular locations. Many SMYD3‐interacting proteins, such as EZH2, HDAC1, NFYB, and POLR2A, were located in the nucleus, suggesting that SMYD3 might form functional protein complexes with these proteins to regulate gene expression (Fig. 6B). These results might reflect the intricate biological processes at the spatial level.

Discussion

Esophageal cancer is the eighth most common malignancy and the sixth most common fatal cancer worldwide. It has two histological types, adenocarcinoma and squamous cell carcinoma; the latter is among the four most common causes of death in China 29. A growing body of research has shown that systems biology approaches, such as network‐based methods, can be successfully applied to elucidate the molecular mechanisms of diseases 30, 31. SMYD3 protein interacts with H3K4Me3‐modified histone tails, which facilitates its recruitment to the core promoter of transcriptionally active genes 32. To explore the potential roles or functions of SMYD3, a systems approach was applied by integrating public protein interaction data with the DEGs resulting from SMYD3 knockdown to provide a comprehensive view. As shown in the three PPI subnetworks generated (downregulated, upregulated, and total DEGs), thousands of proteins interact with the DEGs. This suggests that SMYD3 affects the expression of other proteins, and that its knockdown affects cellular activities by perturbing the cellular protein network. Of the 23 proteins that interact directly with SMYD3, only MEST is downregulated. MEST is an imprinted gene with a hypermethylated promoter and is associated with cell invasion, as well as being a risk factor for cervical cancer and hepatocellular carcinoma 33, 34, 35. It is presumed that the knockdown of SMYD3 might affect the expression of its directly interacting protein MEST. The wide coverage of the Functional Annotation Chart provides a powerful tool to facilitate large‐scale gene function analysis from a network viewpoint. The enriched functional terms are presumed to be significantly mediated by SMYD3 through its DEGs and could also be applied to explore the molecular roles of SMYD3 in ESCC tumor initiation and growth.
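The integration of DEGs with public interaction data can be illustrated with a small sketch that keeps every interaction touching at least one DEG, so that the subnetwork contains the DEGs and their direct partners. This first‐neighbor rule is an assumption made for illustration, and the function name `deg_subnetwork`, the edge list, and the gene names are hypothetical placeholders rather than the pipeline or data of this study.

```python
def deg_subnetwork(edges, degs):
    """Keep interactions in which at least one partner is a DEG.

    edges: iterable of (protein_a, protein_b) pairs from a PPI resource.
    degs:  iterable of differentially expressed gene symbols.
    Returns (nodes, sub_edges) for the DEG-centered subnetwork.
    """
    degs = set(degs)
    sub_edges = [(a, b) for a, b in edges if a in degs or b in degs]
    nodes = {protein for edge in sub_edges for protein in edge}
    return nodes, sub_edges


# Hypothetical toy interactome: G* are DEGs, P* are other proteins.
toy_edges = [
    ("G1", "P1"),
    ("G1", "P2"),
    ("P2", "P3"),
    ("G2", "P3"),
    ("P4", "P5"),  # no DEG involved, so this edge is dropped
]
toy_degs = {"G1", "G2"}

nodes, sub_edges = deg_subnetwork(toy_edges, toy_degs)
print(sorted(nodes))
print(sub_edges)
```

Running the same function on the downregulated, upregulated, and combined DEG lists would give three subnetworks analogous to those discussed above.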
The two GO terms GO:0003682 (‘chromatin binding’) and GO:0000979 (‘RNA polymerase II core promoter sequence‐specific DNA binding’) share three genes, GATA3, TP53, and YAP1, with fold changes of −2.71, −2.02, and −2.08, respectively, following knockdown of SMYD3 in ESCC. Yes‐associated protein 1 (YAP1), a key gene in the Hippo signaling pathway, is a crucial regulator pervasively activated in human malignancies 36. High levels of nuclear YAP1 are correlated with increased chromosome instability and aneuploidy in hepatocellular carcinoma 37. In breast cancer, Theodorou et al. found that GATA3 (GATA binding protein 3) is pivotal in mediating enhancer accessibility at regulatory regions involved in ESR1‐mediated transcription. GATA3 silencing results in a global redistribution of cofactors and active histone marks prior to estrogen stimulation. These results indicate that GATA3, when present on the chromatin, may serve as a licensing factor for estrogen–ESR1‐mediated interactions between cis‐regulatory elements 38. In this light, our data suggest that YAP1 and GATA3 are important target genes for future analysis of SMYD3‐mediated regulation of tumor‐associated genes. To better understand the biological roles of the DEGs through the interactions with their protein partners, the total DEG PPI network was subjected to functional enrichment annotation by GO, which was also illustrated as a network. We show that the total DEG PPI network, perturbed by the knockdown of SMYD3, involves various biological activities, including the acknowledged and potential functions of SMYD3. Interestingly, the functional enrichment annotation map contains several cell motility‐related GO terms, for example, ‘cell–substrate adhesion’ and ‘cell–matrix adhesion’, indicating a function for SMYD3 in cancer cell metastasis.
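Finding genes shared by two enriched GO terms is a set intersection over the gene lists of each term. The sketch below uses the three shared genes and fold changes reported above; the remaining members of each term are not given in this section, so `GENE_A` and `GENE_B` are hypothetical placeholders, as is the function name `shared_genes`.

```python
def shared_genes(term_genes, term_a, term_b):
    """Return the sorted genes annotated to both GO terms."""
    return sorted(term_genes[term_a] & term_genes[term_b])


# Gene sets per GO term; GENE_A and GENE_B are hypothetical fillers.
term_genes = {
    "GO:0003682": {"GATA3", "TP53", "YAP1", "GENE_A"},
    "GO:0000979": {"GATA3", "TP53", "YAP1", "GENE_B"},
}

# Fold changes after SMYD3 knockdown, as reported in the text.
fold_change = {"GATA3": -2.71, "TP53": -2.02, "YAP1": -2.08}

shared = shared_genes(term_genes, "GO:0003682", "GO:0000979")
for gene in shared:
    print(gene, fold_change[gene])
```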
Direct evidence for SMYD3 participation in cancer cell metastasis has been found in our previous report on ESCC 14, as well as in bladder and colon cancer in vitro and in vivo 39, suggesting that SMYD3 is one of the key players stimulating migration and invasiveness of these cancer cells. In addition to its well‐known role in regulating gene expression, SMYD3 can regulate cell signal transduction, the cell cycle, and various other biological effects through a cascade of PPIs. A direct role for SMYD3 in the regulation of signaling pathways has been reported: SMYD3 mediates the methylation of MAP3K2 at lysine 260, which potentiates activation of the Ras/Raf/MEK/ERK signaling module 40. These results provide critical clues for exploring the multiple functions of SMYD3 in the future. Studies indicate that classical signaling pathways are composed of a series of genes or proteins, each linked in the order in which it participates in signal transduction and response 41. We presume that many canonical and noncanonical signals are also transduced by sequential protein interactions, arranged in the proper layers. Moreover, the roles or functions of a protein might vary according to its subcellular localization. For example, proteins located in the plasma membrane are primarily involved in cell adhesion, the cytoskeleton, and cell signaling, whereas proteins in the nucleus are mainly involved in transcription, ribosomal assembly, or chromatin remodeling 42. SMYD3 protein is distributed in both the cytoplasm and the nucleus, or translocates from the cytoplasm into the nucleus, enabling its multiple functions in different subcellular localizations 43. Based on subcellular localization information, a pathway‐like view of the total DEG PPI network was created, displaying the cellular locations of proteins and making it easier to understand the direction of information flow.
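A pathway‐like view of this kind can be approximated by placing nodes on horizontal rows ordered from the extracellular space to the nucleus. The sketch below is a deliberately simplified stand‐in and is not the layout algorithm used by Cerebral; the layer order, the function name `layered_positions`, and the toy nodes are assumptions introduced for illustration.

```python
# Assumed top-to-bottom ordering of layers (extracellular first, nucleus last).
LAYER_ORDER = ["secreted", "membrane", "cytoskeleton", "cytoplasm", "nucleus"]


def layered_positions(localization, layer_order=LAYER_ORDER):
    """Assign (x, y) grid positions: y is the layer row, x the slot within it.

    Every layer label in `localization` must appear in `layer_order`;
    an unknown label raises KeyError.
    """
    rows = {layer: [] for layer in layer_order}
    for node in sorted(localization):  # sorted for a reproducible layout
        rows[localization[node]].append(node)
    positions = {}
    for y, layer in enumerate(layer_order):
        for x, node in enumerate(rows[layer]):
            positions[node] = (x, y)
    return positions


# Hypothetical toy nodes, not proteins from the study.
toy_localization = {
    "A": "membrane",
    "B": "nucleus",
    "C": "cytoplasm",
    "D": "nucleus",
}

positions = layered_positions(toy_localization)
print(positions)
```

Because interactions are stored separately from coordinates, rearranging nodes this way leaves the edges of the network unchanged, which is the property the layered figures rely on.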
In summary, knockdown of SMYD3 alters the expression of its target genes, as indicated by the PPI network, which directly or indirectly affects the signaling of extracellular membrane–cytoskeleton/cytoplasm–nucleus cascades, alters the expression of other DEGs, and consequently changes cell cycling, signal transduction, invasion, and metastasis.

Data Accessibility

Both raw and treated expression data described in this study have been submitted to the NCBI GEO database (http://www.ncbi.nlm.nih.gov/geo/) and assigned the accession number GSE85419.

Author contributions

XL and ZZ analyzed the data, interpreted the data, and wrote the manuscript. CC, SG, ZL, YL, YZ, HZ, JW, WX, and PZ analyzed the data and prepared the figures. LX, BW, and EL conceived and designed the study, and were involved in supervision and funding acquisition. All the authors edited the manuscript prior to submission.

Appendix S1. Functional Annotation Chart of differentially expressed genes following SMYD3 knockdown in ESCC.
  43 in total

Review 1.  Network biology: understanding the cell's functional organization.

Authors:  Albert-László Barabási; Zoltán N Oltvai
Journal:  Nat Rev Genet       Date:  2004-02       Impact factor: 53.242

2.  Cerebral: a Cytoscape plugin for layout of and interaction with biological networks using subcellular localization annotation.

Authors:  Aaron Barsky; Jennifer L Gardy; Robert E W Hancock; Tamara Munzner
Journal:  Bioinformatics       Date:  2007-02-19       Impact factor: 6.937

3.  SMYD3 as an oncogenic driver in prostate cancer by stimulation of androgen receptor transcription.

Authors:  Cheng Liu; Chang Wang; Kun Wang; Li Liu; Qi Shen; Keqiang Yan; Xiaoqing Sun; Jie Chen; Jikai Liu; Hongbo Ren; Hainan Liu; Zhonghua Xu; Sanyuan Hu; Dawei Xu; Yidong Fan
Journal:  J Natl Cancer Inst       Date:  2013-10-30       Impact factor: 13.506

Review 4.  Human Protein Reference Database and Human Proteinpedia as resources for phosphoproteome analysis.

Authors:  Renu Goel; H C Harsha; Akhilesh Pandey; T S Keshava Prasad
Journal:  Mol Biosyst       Date:  2011-12-08

5.  Enrichment map: a network-based method for gene-set enrichment visualization and interpretation.

Authors:  Daniele Merico; Ruth Isserlin; Oliver Stueker; Andrew Emili; Gary D Bader
Journal:  PLoS One       Date:  2010-11-15       Impact factor: 3.240

6.  SET and MYND domain-containing protein 3 is overexpressed in human glioma and contributes to tumorigenicity.

Authors:  Bin Dai; Weiqing Wan; Peng Zhang; Yisong Zhang; Changcun Pan; Guolu Meng; Xinru Xiao; Zhen Wu; Wang Jia; Junting Zhang; Liwei Zhang
Journal:  Oncol Rep       Date:  2015-09-01       Impact factor: 3.906

7.  Towards the identification of protein complexes and functional modules by integrating PPI network and gene expression data.

Authors:  Min Li; Xuehong Wu; Jianxin Wang; Yi Pan
Journal:  BMC Bioinformatics       Date:  2012-05-23       Impact factor: 3.169

8.  The BioGRID interaction database: 2015 update.

Authors:  Andrew Chatr-Aryamontri; Bobby-Joe Breitkreutz; Rose Oughtred; Lorrie Boucher; Sven Heinicke; Daici Chen; Chris Stark; Ashton Breitkreutz; Nadine Kolas; Lara O'Donnell; Teresa Reguly; Julie Nixon; Lindsay Ramage; Andrew Winter; Adnane Sellam; Christie Chang; Jodi Hirschman; Chandra Theesfeld; Jennifer Rust; Michael S Livstone; Kara Dolinski; Mike Tyers
Journal:  Nucleic Acids Res       Date:  2014-11-26       Impact factor: 19.160

9.  GATA3 acts upstream of FOXA1 in mediating ESR1 binding by shaping enhancer accessibility.

Authors:  Vasiliki Theodorou; Rory Stark; Suraj Menon; Jason S Carroll
Journal:  Genome Res       Date:  2012-11-21       Impact factor: 9.043

10.  PEG1/MEST and IGF2 DNA methylation in CIN and in cervical cancer.

Authors:  A C Vidal; N M Henry; S K Murphy; O Oneko; M Nye; J A Bartlett; F Overcash; Z Huang; F Wang; P Mlay; J Obure; J Smith; B Vasquez; B Swai; B Hernandez; C Hoyo
Journal:  Clin Transl Oncol       Date:  2013-06-18       Impact factor: 3.405
