Dynamic protein-protein interaction subnetworks of lung cancer in cases with smoking history.

Wei Yu, Li-Ran He, Yan-Chao Zhao, Man-Him Chan, Meng Zhang, Miao He.

Abstract

Smoking is the primary cause of lung cancer and is linked to 85% of lung cancer cases. However, how lung cancer develops in patients with smoking history remains unclear. Systems approaches that combine human protein-protein interaction (PPI) networks and gene expression data are superior to traditional methods. We applied such an approach to determine the role that smoking plays in lung cancer development and used the support vector machine (SVM) model to predict PPIs. By defining expression variance (EV), we found 520 dynamic proteins (EV > 0.4) using data from the Human Protein Reference Database and the Gene Expression Omnibus Database, and built 7 dynamic PPI subnetworks of lung cancer in patients with smoking history. We also determined the primary functions of each subnetwork: signal transduction, apoptosis, and cell migration and adhesion for subnetwork A; cell-sustained angiogenesis for subnetwork B; apoptosis for subnetwork C; and, finally, signal transduction and cell replication and proliferation for subnetworks D–G. The degree distributions of dynamic and static proteins differed, indicating that the dynamic proteins were not core proteins widely connected with their neighbor proteins. There were high correlations among the dynamic proteins, suggesting that the dynamic proteins tend to form specific dynamic modules. We also found that, when cancer occurred, the dynamic proteins were correlated with the expression of only selected neighbor proteins rather than all of them.


Year:  2012        PMID: 23149315      PMCID: PMC3845612          DOI: 10.5732/cjc.012.10099

Source DB:  PubMed          Journal:  Chin J Cancer        ISSN: 1944-446X


Lung cancer is the leading cause of cancer death worldwide. Smoking is the primary cause of lung cancer and has been linked to 85% of cases. The occurrence and development of this disease is a multi-gene, multi-stage, and extremely complex process that involves several changes, including oncogene activation, tumor suppressor gene mutation and deletion, suppression of tumor cell apoptosis, and microsatellite instability[1]–[3]. Generally, many factors within the tumor microenvironment can influence cellular metabolism, signal transduction, and gene expression. However, most studies on lung cancer pathogenesis focus on a single gene or a limited number of genes and on simple functional annotation using standard research methods. Only at the system level can the molecular mechanisms of cancer be revealed effectively. In the protein-protein interaction (PPI) network, the dynamic modules or subnetworks of proteins may have leading roles in cancer development and metastasis. The static modules of proteins may be inherent components of a PPI network; these modules tend to be associated with the "noise" of protein expression, genetic modification, and genetic evolution. The static modules may buffer variation in the PPI network, making cells that have these proteins robust[4]. Thus, it is very important to explore the dynamic PPI subnetworks of lung cancer in patients with smoking history at the cellular level.

Data and Methods

Data sources

All human protein sequence data (9289 in total) and interaction data (37 066 pairs in total) were downloaded from the Human Protein Reference Database (HPRD)[5], Release 8, 1 Sep 2009. Gene expression data for lung cancer patients with smoking history and for non-cancer samples were downloaded from the Gene Expression Omnibus (GEO) Database of the National Center for Biotechnology Information (NCBI)[6]. The original data, GSE94115, were obtained from the smoking groups[7]. A total of 72 patients with lung cancer and 72 individuals without lung cancer were randomly selected as the training set. Another 17 patients with lung cancer and 17 individuals without lung cancer were randomly selected as the test set (Table 1).
Table 1.

The original data and classes from the Gene Expression Omnibus (GEO) Database that were used to predict protein-protein interactions with the support vector machine (SVM) model

| Serial number | Number of samples | Training/testing | Cancer/non-cancer |
|---|---|---|---|
| GSM94020-GSM94075, GSM94155-GSM94172 | 72 | Training | Cancer |
| GSM94767-GSM94784 | 17 | Testing | Cancer |
| GSM940100-GSM940148, GSM94077-GSM94099 | 72 | Training | Non-cancer |
| GSM94785-GSM94801 | 17 | Testing | Non-cancer |

Protein expression variance

Expression variance (EV) can be used to measure the dynamic expression of genes in the genome. The EV of a gene is its expression variance expressed as a percentage of the genome expression variance. If EV is small, the difference in expression between two genes in the genome is also small[7]. Thus, by defining the EV value, we could also classify the proteins encoded by the genes as dynamic or static. Here, a protein was classified as dynamic if EV > 0.4 and as static if EV < 0.2.
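The classification rule above can be illustrated with a minimal sketch. The text does not spell out how the genome expression variance is computed or normalized, so this example assumes it is the variance pooled over all genes and samples; the gene names, expression values, and function names are hypothetical. Only the thresholds (0.4 and 0.2) come from the text.

```python
from statistics import pvariance

DYNAMIC_THRESHOLD = 0.4  # EV > 0.4 -> dynamic (threshold from the text)
STATIC_THRESHOLD = 0.2   # EV < 0.2 -> static (threshold from the text)

def expression_variance(gene_values, genome_variance):
    """Ratio of one gene's expression variance to the genome-wide variance.

    ASSUMPTION: the normalization used in the paper is not specified;
    a plain ratio of population variances is used here.
    """
    return pvariance(gene_values) / genome_variance

def classify(ev):
    """Label a protein by its EV; values in [0.2, 0.4] get no label."""
    if ev > DYNAMIC_THRESHOLD:
        return "dynamic"
    if ev < STATIC_THRESHOLD:
        return "static"
    return "unclassified"

# Hypothetical toy expression profiles (gene -> values across samples).
profiles = {
    "GENE_A": [1.0, 5.0, 1.0, 5.0],
    "GENE_B": [3.0, 3.1, 2.9, 3.0],
    "GENE_C": [2.3, 3.7, 2.3, 3.7],
}

# ASSUMPTION: genome variance = variance pooled over every gene and sample.
all_values = [v for values in profiles.values() for v in values]
genome_variance = pvariance(all_values)

labels = {
    gene: classify(expression_variance(values, genome_variance))
    for gene, values in profiles.items()
}
print(labels)
```

Under this pooled-variance assumption the ratio is not bounded by 1, so a different normalization (for example, a percentile rank) would be needed if EV is meant to lie strictly between 0 and 1.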

Pearson correlation

The correlation of expression between two proteins was measured by the Pearson correlation coefficient (PCC). The smaller the absolute value of PCC, the lower the correlation of expression. We referred to the screening criteria of Goh et al.[8] to define the values. If the absolute value of PCC was greater than 0.6 and the EV value was simultaneously greater than 0.4, the related proteins were selected to compose the dynamic PPI subnetwork. PCC was calculated with the following formula:

$$\mathrm{PCC} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^{2}}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^{2}}}$$

Here, the vectors $x$ and $y$ represent the expression of the interacting proteins A and B, respectively; $\bar{x}$ and $\bar{y}$ represent the average expression of proteins A and B, respectively, in the $n = 72$ samples.
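The screening step described above can be sketched as follows. This is an illustrative example, not the authors' code: the function names, protein labels, and toy data are hypothetical, and it assumes that "the EV value was simultaneously greater than 0.4" applies to both proteins of an interacting pair.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles.

    Assumes neither profile is constant (non-zero variance).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

def select_dynamic_edges(interactions, expression, ev, pcc_cut=0.6, ev_cut=0.4):
    """Keep interacting pairs with |PCC| > pcc_cut and EV > ev_cut.

    ASSUMPTION: the EV cutoff must hold for both proteins in the pair.
    """
    kept = []
    for a, b in interactions:
        if ev[a] > ev_cut and ev[b] > ev_cut:
            r = pearson(expression[a], expression[b])
            if abs(r) > pcc_cut:
                kept.append((a, b, r))
    return kept

# Hypothetical toy data: expression profiles and precomputed EV values.
expression = {
    "P1": [1, 2, 3, 4],
    "P2": [2, 4, 6, 8],
    "P3": [4, 1, 3, 2],
    "P4": [1, 3, 2, 2],
}
ev = {"P1": 0.5, "P2": 0.6, "P3": 0.1, "P4": 0.5}
interactions = [("P1", "P2"), ("P1", "P3"), ("P1", "P4")]

edges = select_dynamic_edges(interactions, expression, ev)
print(edges)
```

In this toy example only the P1-P2 pair passes: P3 fails the EV cutoff, and P4 is only weakly correlated with P1.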

PPI prediction and subnetwork visualization

The support vector machine (SVM) model was used to predict PPIs. Subnetworks were visualized using Cytoscape software with the Cerebral plug-in, which could locate proteins interacting in cells[9].

Results

Dynamic PPI subnetworks

Based on the data in Table 1, we calculated the parameters (C, g) of the SVM model to predict PPIs. The selected (C, g) was (2, 0.03125), and the 5-fold cross-validation accuracy was 79%. The prediction accuracy on the test data was 70.58%. In total, we identified 520 dynamic proteins (EV > 0.4) and 2754 static proteins (EV < 0.2), and we successfully built dynamic PPI subnetworks of lung cancer in patients with smoking history (Figure 1).
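The reported parameters are both powers of two (2 = 2^1 and 0.03125 = 2^-5), which would be consistent with a grid search over power-of-two values scored by 5-fold cross-validation. The sketch below only enumerates such a grid and partitions the training samples into folds; the exponent ranges are an assumption (the text does not state the search range), the function names are hypothetical, and the SVM training itself is deliberately left out.

```python
from itertools import product

def power_of_two_grid(c_exponents, g_exponents):
    """All (C, g) pairs with C = 2**c and g = 2**g over the given exponents."""
    return [(2.0 ** c, 2.0 ** g) for c, g in product(c_exponents, g_exponents)]

def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    folds = []
    start = 0
    for i in range(k):
        # The first (n_samples % k) folds receive one extra sample.
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

# ASSUMPTION: exponent ranges chosen for illustration only.
grid = power_of_two_grid(range(-5, 16, 2), range(-15, 4, 2))

# 72 cancer + 72 non-cancer training samples (Table 1).
folds = k_fold_indices(144, k=5)

print(len(grid), (2.0, 0.03125) in grid)
print([len(fold) for fold in folds])
```

In practice the samples would be shuffled (and ideally stratified by class) before splitting, and each (C, g) pair would be scored by training on four folds and validating on the fifth. As a side note, with 34 test samples an accuracy of 24/34 is about 70.59%, which is close to the reported 70.58%.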
Figure 1.

Dynamic protein-protein interaction subnetworks A–G.

A, the functions of subnetwork A are mainly signal transduction, apoptosis, and cell migration and adhesion; B, the function of subnetwork B is mainly cell-sustained angiogenesis; C, the primary function of subnetwork C is apoptosis; D–G, the functions of subnetworks D–G are mainly signal transduction, cell replication and proliferation.

Functions of the dynamic PPI subnetworks

Using the Gene Ontology database, we determined that the majority of proteins in the subnetworks functioned in the physiological processes of cell migration and adhesion, apoptosis, signal transduction, cell-sustained angiogenesis, and cell replication and proliferation. The primary collective function of each PPI subnetwork could be characterized by its key protein nodes or by a group of proteins with similar functions in specific physiological processes. Analyzing the PPIs and pathways shown in Figure 1A, we determined that the functions of subnetwork A were mainly signal transduction, apoptosis, and cell migration and adhesion (Table 2). This suggests that the pathways in subnetwork A play a central role in cancer cell-to-cell communication, control of cancer cell apoptosis, and cancer cell adhesion to and invasion of normal tissues.
Table 2.

Functional classification of proteins in protein-protein interaction subnetworks A–G

| Subnetwork | Function | Proteins |
|---|---|---|
| A | Cell-sustained angiogenesis | LMO2 |
| | Cell adhesion and migration | RAF1, TJP1, TJP2, CTNNA1, CSDA, PPP1CC |
| | Apoptosis | HSPA1B, HSPA1A, RAF1, STAT1, YWHAZ, BAG3, YWHAQ |
| | Signal transduction | NMI, STAT1, IFNGR1, IRF2, ISGF3G, RAF1, HSPA1B, RHEB, SHOC2 |
| | Cell replication and proliferation | RAF1 |
| B | Cell-sustained angiogenesis | PAFAH1B1, HIF1A, RAB1A, GOLGA5 |
| | Cell adhesion and migration | PAFAH1B1, PGL1, HIF1A |
| | Apoptosis | MIF |
| | Signal transduction | HIF1A |
| | Cell replication and proliferation | PGK1, MIF, CAPNS |
| C | Cell-sustained angiogenesis | ACTG1, DYNLL1 |
| | Cell adhesion and migration | ACTG1, PAPBPC4 |
| | Apoptosis | FXR1, DYNLL1, ANXA5, HSPE1, HBXIP |
| | Signal transduction | HSPD1, APLP2 |
| | Cell replication and proliferation | PPIA, HSPD1 |
| D | Signal transduction | PDCD6IP |
| | Cell replication and proliferation | PDCD6 |
| E | Cell adhesion and migration | TUBB |
| | Apoptosis | TUBB |
| | Cell replication and proliferation | FTH1 |
| F | Signal transduction | PRKAR1A, AKAP11 |
| G | Cell adhesion and migration | IQGAP1 |
| | Signal transduction | CALM1, RRAD |
| | Cell replication and proliferation | DDX5 |

The function of subnetwork B (Figure 1B) was mainly cell-sustained angiogenesis (Table 2). Cancer cells require nutrients to grow, and these nutrients reach cancer cells through blood vessels. Therefore, blood vessel and tissue generation are important factors in cancer development. Subnetwork B showed some key pathways of blood vessel generation.

The primary function of subnetwork C was apoptosis (Table 2, Figure 1C). Proteins in this subnetwork such as fragile X mental retardation related protein 1 (FXR1), dynein light chain LC8-type 1 (DYNLL1), and heat shock 10-kDa protein 1 (HSPE1) positively regulated programmed cell death, whereas annexin A5 (ANXA5) and hepatitis B virus x interacting protein (HBXIP) negatively regulated programmed cell death[10].

Subnetworks D–G (Figures 1D–G) functioned mainly in signal transduction, cell replication, and proliferation (Table 2). Generally, tumor cells generate many of their own growth signals, thereby reducing their dependence on stimulation from their normal tissue microenvironment. While most soluble mitogenic growth factors (GFs) are made by one cell type to stimulate proliferation of another, many of the growth signals can drive the proliferation of carcinoma cells, thus allowing cell proliferation in specific phases of tumor development[11]. Subnetworks D–G presented part of the pathway of tumor proliferation.

Evaluation of the dynamic proteins

Because whole human PPI networks are incomplete for lung cancer patients with smoking history, we could not clearly determine the effects of each lung cancer-related protein and pathway. Although we built several dynamic PPI subnetworks, we still did not know the role that these subnetworks play in tumor development. Thus, to evaluate the effects of dynamic proteins, we measured the probability distributions of the degree of the dynamic proteins and of the static proteins. The degree distributions of the dynamic proteins (Figure 2A) and the static proteins (Figure 2B) were very different, and the maximum density of the degree of the dynamic proteins was relatively lower, suggesting that the dynamic proteins were not core proteins widely connected with their neighbor proteins.
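The degree comparison described above can be sketched with a small example. Here the degree of a protein is the number of interaction partners it has in the PPI network, and the distribution is taken as the empirical frequency of each degree value; the paper may have used a smoothed density estimate instead, which is not reproduced here. The edge list, protein labels, and function names are hypothetical.

```python
from collections import Counter

def degrees(edges):
    """Number of interaction partners per protein in an undirected edge list."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

def degree_distribution(deg, proteins):
    """Empirical probability of each degree value among the given proteins."""
    counts = Counter(deg[p] for p in proteins)
    total = len(proteins)
    return {k: counts[k] / total for k in sorted(counts)}

# Hypothetical toy network: H is a hub, D is a peripheral protein.
edges = [("H", "A"), ("H", "B"), ("H", "C"), ("A", "B"), ("D", "C")]
deg = degrees(edges)

dynamic_proteins = ["A", "D"]      # hypothetical labels
static_proteins = ["H", "B", "C"]  # hypothetical labels

print(degree_distribution(deg, dynamic_proteins))
print(degree_distribution(deg, static_proteins))
```

In this toy network the dynamic set is concentrated at low degrees while the static set contains the hub, which mirrors the qualitative pattern reported for Figure 2A and 2B.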
Figure 2.

Dynamic and static proteins' probability distributions of three main parameters—the degree of proteins, the average EV of neighbor proteins, and the average PCC of neighbor proteins.

A, C, E, dynamic proteins; B, D, F, static proteins.

The probability distributions of the average EV of neighbor proteins did not differ significantly between the dynamic proteins (Figure 2C) and the static proteins (Figure 2D). However, the probability distributions of the average PCC of neighbor proteins differed between the dynamic proteins (Figure 2E) and the static proteins (Figure 2F). For the average EV of neighbor proteins of the dynamic proteins, the maximum density of the probability distribution was low, suggesting that the expression difference between two neighbor proteins was also small. For the average PCC of neighbor proteins of the dynamic proteins, the mean value of the probability distribution was large, suggesting that there were high correlations among the dynamic proteins. The results showed that three main parameters (the degree of proteins, the average EV of neighbor proteins, and the average PCC of neighbor proteins) could basically reflect the relationship between proteins and surrounding proteins and were related to changes in protein expression. Moreover, the dynamic proteins, as well as the static proteins, might have a similar correlation with surrounding proteins. These results suggest that, when cancer occurred, the dynamic proteins were correlated with the expression of only selected neighbor proteins rather than all of them. We retrieved the functional annotations of selected dynamic proteins from the Gene Ontology Database and assessed their functional link with lung cancer pathogenesis (Table 3). Each of these genes and their protein products is related to cancer pathogenesis, though not specifically to lung cancer.
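The two neighbor-based parameters discussed above (average EV and average PCC of neighbor proteins) can be sketched as follows. This is an illustration under stated assumptions: the average PCC of a protein is taken as the mean of the signed PCC between that protein and each of its direct interaction partners, which is one plausible reading of the text. The function names and toy data are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient; assumes non-constant profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

def neighbor_averages(edges, expression, ev):
    """Map each protein to (average EV, average PCC) over its neighbors.

    ASSUMPTION: PCC is computed between the protein and each neighbor,
    and the signed (not absolute) values are averaged.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    result = {}
    for protein, nbrs in neighbors.items():
        ordered = sorted(nbrs)  # fixed order for reproducible sums
        avg_ev = sum(ev[n] for n in ordered) / len(ordered)
        avg_pcc = sum(
            pearson(expression[protein], expression[n]) for n in ordered
        ) / len(ordered)
        result[protein] = (avg_ev, avg_pcc)
    return result

# Hypothetical toy data.
expression = {"P1": [1, 2, 3, 4], "P2": [2, 4, 6, 8], "P3": [4, 3, 2, 1]}
ev = {"P1": 0.5, "P2": 0.3, "P3": 0.1}
edges = [("P1", "P2"), ("P1", "P3")]

averages = neighbor_averages(edges, expression, ev)
print(averages)
```

Collecting these per-protein pairs separately for the dynamic and static sets gives the values whose distributions are compared in Figure 2C–F.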
Table 3.

Functional annotations of other partial dynamic proteins retrieved from Gene Ontology (GO) Database

| Function from GO annotation | Proteins |
|---|---|
| Signal transduction | TOLLIP, HINT1, MAPK11, TNFAIP3, IL22, SOCS5, CORO2A, PDZD3, EEF1E1, RANBP2, PDPK1, MAPK6, PTEN, MPP3 |
| Ion/glucose/transmembrane/vesicle-mediated/intracellular transport | NUDT9, SLC5A1, NDUFV2, MYO5A, COX7A2L, SRP54, PDZD3, ARCN1, SLC25A11, CPNE3 |
| Response to stimulus | TOLLIP, IL22, EIF2B3, PDZD3, PEF1, IL1R2 |
| Regulation of apoptosis | EEF1E1, PTEN, DAD1, NDUFS1, TNFAIP3 |
| Regulation of transcription via RNA polymerase II promoter | ORC2L, ECD, SOX9, KLF9, EGR1, HSBP1, SAP30 |
| Ubiquitin-dependent protein catabolic process | CDC16, PSMD6, PSMD10, TSG101, UCHL3 |
| Tissue/organism development | DDX1, KRT85, NDUFV2 |
| DNA replication and damage response | ORC2L, ORC5L, EEF1E1 |
| Regulation of cell proliferation | CDC16, EEF1E1, PTEN |
| Regulation of cell cycle process | CDC16, PSMD6 |
| Establishment of vesicle location/Golgi transport vesicle coating | TMED10, COPB2, ARCN |
| Macromolecular complex assembly | EIF2B3, EPRS, DDX1 |
| Cell adhesion and motion | PTEN |
| Cellular respiration and homeostasis | NDUFS1 |


Discussion

In this study, we found 520 dynamic proteins and 2754 static proteins using data from HPRD and the GEO database. We also built 7 dynamic PPI subnetworks of lung cancer in patients with smoking history. Initial analysis revealed the main functions of each PPI subnetwork: signal transduction, apoptosis, and cell migration and adhesion for subnetwork A; cell-sustained angiogenesis for subnetwork B; apoptosis for subnetwork C; and signal transduction and cell replication and proliferation for subnetworks D–G. These subnetworks reveal potential mechanisms underlying lung cancer development[12]. The degree distributions of dynamic and static proteins differed, indicating that dynamic proteins were not core proteins widely connected with their neighbor proteins. There were high correlations among the dynamic proteins, suggesting that the dynamic proteins tend to form specific dynamic modules. Systems approaches that combine human PPI networks and gene expression data are superior to traditional methods, which can analyze only small amounts of gene expression data. Here, we compared high-throughput microarray expression data of 72 healthy smokers and 72 smokers with lung cancer, and we built several human dynamic PPI subnetworks. The gene expression data were then mapped onto the dynamic PPI subnetworks. We calculated each protein's EV value from the gene expression changes between lung cancer patients with smoking history and healthy samples. The EV value was used to define dynamic proteins, which showed the largest expression changes, and static proteins, which showed the smallest. Based on the relationship between protein expression and the PPI network, we analyzed the functions of the dynamic PPI subnetworks.
By analyzing the degree of connection between dynamic proteins, static proteins, and their neighbor proteins, as well as the average EV and average PCC of neighbor proteins, we were able to evaluate the effects of the dynamic PPI subnetworks in cancer development. The dynamic proteins that we identified represented all essential functions of lung cancer development, which included maintenance of intracellular dynamic balance and regulation of programmed cell death, cell movement and localization, cell proliferation, immunoreactions, and transcription initiation via RNA polymerase II. In particular, these essential functions were related to inflammatory reactions induced by chemical or physical injury and to reactions to chemical stimuli, which suggests that the cellular damage caused by smoking was the critical factor leading to lung cancer[13],[14]. In other words, lung cancer in patients with smoking history may be caused by proteins with a high EV value that function in the transition from the precancerous stage to the metastatic stage. Because the dynamic proteins linked to lung cancer did not show different degrees, and because their neighbor proteins showed average EV and PCC values different from those of the static proteins, we concluded that not all dynamic proteins occupied core positions in the PPI network or acted as hub nodes. Our finding that dynamic proteins did not tend toward higher degrees than static proteins was in keeping with the previous conclusions of Goh et al.[8], which indicated that the vast majority of disease genes are nonessential and show no tendency to encode hub proteins. Our discovery verified the feasibility of seeking cancer pathways. However, the proteins that we used to build the dynamic PPI subnetworks were only a part of the complete human PPI network; thus, the accuracy of PPI prediction with the SVM model needs to be further improved.
In conclusion, by analyzing dynamic protein subnetworks, we found that the proteins in these subnetworks took part in cancer-related biological processes. These proteins related to lung cancer could be used to predict cancer-related pathways.
References

1. D Hanahan; R A Weinberg. The hallmarks of cancer. Cell. 2000-01-07.

2. Hiroyuki Marusawa; Shu-Ichi Matsuzawa; Kate Welsh; Hua Zou; Robert Armstrong; Ingo Tamm; John C Reed. HBXIP functions as a cofactor of survivin in apoptosis suppression. EMBO J. 2003-06-02.

3. Aaron Barsky; Jennifer L Gardy; Robert E W Hancock; Tamara Munzner. Cerebral: a Cytoscape plugin for layout of and interaction with biological networks using subcellular localization annotation. Bioinformatics. 2007-02-19.

4. Muhammed A Yildirim; Kwang-Il Goh; Michael E Cusick; Albert-László Barabási; Marc Vidal. Drug-target network. Nat Biotechnol. 2007-10.

5. Stefan Vielhaber; Sibylle Jakubiczka; Charly Gaul; Mircea Ariel Schoenfeld; Grazyna Debska-Vielhaber; Stefan Zierz; Hans-Jochen Heinze; Heiko G Niessen; Jörn Kaufmann. Brain 1H magnetic resonance spectroscopic differences in myotonic dystrophy type 2 and type 1. Muscle Nerve. 2006-08.

6. Ting-Ting Zhou. Network systems biology for targeted cancer therapies. Chin J Cancer. 2011-12-16.

7. T S Keshava Prasad; Renu Goel; Kumaran Kandasamy; Shivakumar Keerthikumar; Sameer Kumar; Suresh Mathivanan; Deepthi Telikicherla; Rajesh Raju; Beema Shafreen; Abhilash Venugopal; Lavanya Balakrishnan; Arivusudar Marimuthu; Sutopa Banerjee; Devi S Somanathan; Aimy Sebastian; Sandhya Rani; Somak Ray; C J Harrys Kishore; Sashi Kanth; Mukhtar Ahmed; Manoj K Kashyap; Riaz Mohmood; Y L Ramachandra; V Krishna; B Abdul Rahiman; Sujatha Mohan; Prathibha Ranganathan; Subhashri Ramabadran; Raghothama Chaerkady; Akhilesh Pandey. Human Protein Reference Database--2009 update. Nucleic Acids Res. 2008-11-06.

8. Kakajan Komurov; Michael White. Revealing static and dynamic modular architecture of the eukaryotic protein interaction network. Mol Syst Biol. 2007-04-24.

9. M Plebani; D Basso; F Navaglia; M De Paoli; A Tommasini; A Cipriani. Clinical evaluation of seven tumour markers in lung cancer diagnosis: can any combination improve the results? Br J Cancer. 1995-07.

10. Jennifer Beane; Paola Sebastiani; Gang Liu; Jerome S Brody; Marc E Lenburg; Avrum Spira. Reversible and permanent effects of tobacco smoke exposure on airway epithelial gene expression. Genome Biol. 2007.