
Protein-protein interaction analysis highlights additional loci of interest for multiple sclerosis.

Giammario Ragnedda, Giulio Disanto, Gavin Giovannoni, George C Ebers, Stefano Sotgiu, Sreeram V Ramagopalan.

Abstract

Genetic factors play an important role in determining the risk of multiple sclerosis (MS). The strongest genetic association in MS is located within the major histocompatibility complex (MHC) class II region, but more than 50 MS loci of modest effect located outside the MHC have now been identified. However, the relevant candidate genes that underlie these associations and their functions are largely unknown. We conducted a protein-protein interaction (PPI) analysis of gene products coded in loci recently reported to be MS associated at the genome-wide significance level and in loci suggestive of MS association. Our aim was to identify which suggestive regions are more likely to be truly associated, which genes are most implicated in the PPI network, and what their expression profiles are. SNPs from three recent independent association studies were divided into significant and suggestive depending on the strength of the statistical association. Using the Disease Association Protein-Protein Link Evaluator tool we found that direct interactions among gene products were significantly higher than expected by chance when considering both significant regions alone (p<0.0002) and significant plus suggestive regions (p<0.007). The number of genes involved in the network was 43. Of these, 23 were located within suggestive regions and many of them directly interacted with proteins coded within significant regions. These included genes such as SYK, IL-6, CSF2RB, FCRL3, EIF4EBP2 and CHST12. Using the gene portal BioGPS, we tested the expression of these genes in 24 different tissues and found the highest values among immune-related cells as compared to non-immune tissues (p<0.001). A gene ontology analysis confirmed the immune-related functions of these genes. In conclusion, loci currently suggestive of MS association interact with, and have similar expression profiles and functions to, those significantly associated, highlighting the fact that more common variants associated with MS remain to be found.

Year:  2012        PMID: 23094030      PMCID: PMC3475710          DOI: 10.1371/journal.pone.0046730

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Multiple Sclerosis (MS) is the most common inflammatory disease of the central nervous system (CNS) affecting young adults [1]. It is widely acknowledged that genetic factors play an important role in determining the risk of MS [2]. Several epidemiological studies demonstrated an increased frequency of MS among biological relatives of affected individuals [3], [4]. Family-based and association studies have shown that the strongest genetic association in MS is located within the major histocompatibility complex (MHC) class II region [5]. In particular the HLA-DRB1*1501 allele confers an approximate odds ratio of 3 [6]. However, during the last few years Genome Wide Association Studies (GWAS) have identified many other MS-associated loci of modest effect located outside the MHC (now more than 50) [7]–[11]. Despite the recent advances in the understanding of the genetic architecture of MS, several questions remain to be answered. For example, due to stringent correction criteria many genetic variants fail to reach genome-wide significance but can still be considered as suggestive of genetic association. Furthermore, once a SNP is found to be associated with a particular disease, the relevant candidate gene (or genes) mediating that association is usually unknown. Analysis of protein-protein interaction (PPI) networks is being increasingly recognized as an important tool to characterize the underlying biology of genes associated with complex diseases, in particular immune-mediated ones [12], [13]. It is logical to hypothesize that genes which are truly associated with the same trait will be involved in similar biological processes. For example, Rossin et al. found that proteins encoded in genomic regions associated with rheumatoid arthritis and Crohn's disease physically interact more than would be expected by chance and that the genes encoding these proteins are highly expressed in immune tissues [12].
Studying such PPIs can ultimately elucidate which suggestive regions are more likely to be truly associated and greatly aid the identification of the genes that mediate the GWAS findings. We conducted a PPI analysis of gene products coded in loci recently reported to be MS associated and in loci suggestive of MS association. Our aim was to identify which suggestive regions are more likely to be truly associated, which genes are most implicated in the MS PPI network, and what their expression profiles and functions are.

Methods

Three recent independent association studies were considered for our analysis [14]–[16]. In Sawcer et al. and Patsopoulos et al., SNPs were divided into significant and suggestive depending on the strength of the statistical association [14], [15]. From Sawcer et al. we defined as suggestive those SNPs with p values in the discovery phase of less than 1×10⁻⁴, and as significant those that either were replications of previous GWAS findings or had a replication p<0.05 and a combined p<5×10⁻⁷ [14]. In Patsopoulos et al., significant SNPs were defined as either those with a p value<5×10⁻⁸ or replications of previously identified associated SNPs. Suggestive SNPs were those with p values between 5×10⁻⁸ and 1×10⁻⁶ [15]. We also included in the analyses the top 82 SNPs (with a log p value>4.91) from Wang et al. [16]. All SNPs from this study were considered as suggestive, because the study was not designed to meet currently accepted criteria for genome-wide significance. After removing duplicate SNPs, 67 significant and 133 suggestive SNPs were obtained. Protein-protein interaction assessment was conducted using the Disease Association Protein-Protein Link Evaluator (DAPPLE) tool [12]. This bioinformatics tool investigates physical interactions among gene products encoded within given genomic regions by building a PPI network. Interactions are extracted from the database “InWeb”, which combines data from a variety of public PPI sources including MINT, BIND, IntAct and KEGG and defines high-confidence interactions as those seen in multiple independent experiments. The region around a given SNP is extended to the genomic interval defined by SNPs in moderate linkage disequilibrium (r² ≥ 0.5) and then to the nearest recombination hotspots [12]. Connections can be direct (two proteins are physically linked to each other) or indirect (the interaction is mediated by a common interactor).
The extent of the PPI network is assessed using the following parameters: the number of direct interactions between proteins from different loci, the mean associated protein direct and indirect connectivities (the mean number of distinct loci a protein is directly or indirectly connected to) and the mean common interactor connectivity (average number of proteins in separate loci bound by common interactors) [12]. The non-randomness of the network and the significance of the interaction parameters are tested using a permutation method that compares the original network with thousands of networks created by randomly re-assigning the protein names while keeping the overall structure (size and number of interactions) of the original network. Those genes that participate in the network more than expected by chance are defined as genes to prioritize (corrected p<0.05) [12]. Expression data were gathered from BioGPS, an online gene annotation database that reports individual gene expression levels for a number of human tissues and cell types [17]. Analyses were performed using non-parametric tests (Kruskal-Wallis and Mann-Whitney tests). Gene ontology terms were investigated using The Database for Annotation, Visualization and Integrated Discovery (DAVID) v6.7, an online tool that identifies the functional categories and biological processes which are most represented within a list of genes [18], [19].
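The permutation logic described above can be illustrated with a minimal sketch. It is not DAPPLE's implementation: it assumes a plain shuffle of protein names over a fixed topology, and the toy network, the locus assignments and the function names are all hypothetical.

```python
import random

def count_cross_locus_edges(edges, locus_of):
    """Count edges that join two proteins assigned to different loci."""
    total = 0
    for a, b in edges:
        la, lb = locus_of.get(a), locus_of.get(b)
        if la is not None and lb is not None and la != lb:
            total += 1
    return total

def permutation_p_value(edges, locus_of, n_perm=1000, seed=0):
    """Empirical p-value for the observed number of cross-locus edges."""
    rng = random.Random(seed)
    nodes = sorted({node for edge in edges for node in edge})
    observed = count_cross_locus_edges(edges, locus_of)
    hits = 0
    for _ in range(n_perm):
        shuffled = nodes[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(nodes, shuffled))
        # The topology stays fixed; only the protein names on the nodes move.
        permuted = [(relabel[a], relabel[b]) for a, b in edges]
        if count_cross_locus_edges(permuted, locus_of) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical toy network: A-D sit in disease loci, E-H are background proteins.
toy_edges = [("A", "C"), ("C", "D"), ("A", "D"), ("B", "E"),
             ("E", "F"), ("F", "G"), ("G", "H"), ("H", "E")]
toy_locus_of = {"A": "locus1", "B": "locus1", "C": "locus2", "D": "locus3"}

observed, p_value = permutation_p_value(toy_edges, toy_locus_of, n_perm=2000, seed=1)
print(observed, p_value)
```

With this kind of estimate the smallest reportable p-value is bounded by the number of permutations, which is why such results are usually quoted as "p<" some threshold.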

Results

DAPPLE analysis of significant SNPs

Our first aim was to assess the extent of PPIs among genes located within genomic regions with definite association with MS susceptibility. We therefore submitted to DAPPLE the 67 SNPs with genome-wide significant association with MS risk. There were a total of 75 proteins participating in the direct network with 104 direct interactions (expected direct interactions = 61, p<0.0002) (Table 1, Figure 1 and Table S1). The mean associated protein direct connectivity was 2.7 (expected = 1.7, p<0.0002). The mean associated protein indirect connectivity was 52.2 (expected = 43.8, p = 0.04) and the mean common interactor connectivity was 4.5 (expected = 3.9, p = 0.0002). The total number of genes implicated in the network was 215 (Table S1). The total number of genes that had more connections than expected by chance (genes to prioritize) was 22 and included previously reported putative candidate genes such as IL-12A, SOCS-1, CBLB, MALT-1, IL-22RA, MAPK-1 and IL-7R.
Table 1

Summary of DAPPLE analysis of significant and significant plus suggestive SNPs.

Parameter                           Significant   Significance   Significant + suggestive   Significance
Number of proteins in the network   75            -              189                        -
Direct interactions                 104           p<0.0002       281                        p<0.007
MAPDC *                             2.7           p<0.0002       2.9                        p = 0.0008
MAPIC **                            52.2          p = 0.04       93                         p = 0.34
Mean CI connectivity ***            4.5           p = 0.0002     5.05                       p = 0.05
Genes to prioritize                 39            -              22                         -

* Mean associated protein direct connectivity;
** Mean associated protein indirect connectivity;
*** Mean Common Interactor connectivity.

Figure 1

Direct connections among gene products from MS significant regions.

Colours indicate significance of participation in the PPI network.


DAPPLE analysis of significant plus suggestive SNPs

When suggestive SNPs were included in the analysis, the number of proteins participating in the network increased from 75 to 189 and the number of direct interactions from 104 to 281 (expected direct interactions = 242, p<0.007) (Table 1, Figure 2 and Table S2). The mean associated protein direct connectivity was also higher than expected (observed = 2.9, expected = 2.4, p = 0.0008). The mean associated protein indirect connectivity was 93 (expected = 91, p = 0.34). The mean common interactor connectivity was 5.05 (expected = 4.8, p = 0.05). The total number of genes analyzed was 445 (Table S2), while there were 43 genes to prioritize, of which 23 were located within suggestive regions. These included genes such as SYK, IL-6, CSF2RB, FCRL3, EIF4EBP2 and CHST12 (Table 2).
Figure 2

Direct connections among gene products from MS significant plus suggestive regions.

Colours indicate significance of participation in the PPI network.

Table 2

List of candidate genes (genes to prioritize) obtained from DAPPLE analysis of significant plus suggestive SNPs.

SIGNIFICANT                      SUGGESTIVE
GENE       SNP          STUDY    GENE              SNP          STUDY
BCL9L      rs630923     14       ARAP3             rs2302103    14
CARD11     rs11581062   14       CHST12            rs6952809    14
CBLB       rs2028597    14       CSF2RB            rs2072711    14
IL12A      rs2243123    14       FCRL3             rs3761959    14
IL12B      rs2546890    14,15    MAP3K14           rs4792814    14
IL20RA     rs17066096   14       NDFIP1            rs1062158    14
IL22RA2    rs17066096   14       SLC30A7           rs12048904   14
IL2RA      rs3118470    14       SYK               rs290986     14
IL7R       rs6897932    14,15    UBASH3B           rs7941030    14
MALT1      rs7238078    14       IQCB1             rs2681424    15
MAPK1      rs2283792    14       ANGPT2            rs2515585    16
RPS25      rs630923     14       C12orf51          rs11065987   16
SOCS1      rs7200786    14       CDH2              rs528438     16
SP110      rs10201872   14       CUX2              rs11065987   16
SP140      rs10201872   14       EIF4EBP2          rs10762363   16
STAT3      rs9891119    14       ENSG00000205175   rs1611715    16
TMEM87B    rs17174870   14       ENSG00000204600   rs434496     16
TYK2       rs8112449    14       ENSG00000205173   rs434446     16
YPEL2      rs180515     14       IL6               rs10244467   16
C12orf65   rs1790100    15       RBM45             rs10203141   16
                                 SLC30A6           rs13029809   16
                                 TRAFD1            rs11065987   16
                                 Wdr23(DCAF11)     rs10146906   16


Tissue-specific expression and gene ontology terms of candidate genes

To further investigate the nature of our findings we assessed in which tissues these genes were most highly expressed. We used the gene portal BioGPS, which contains gene expression data on a variety of human tissues and cell types [17]. For our analysis we considered 10 immune cell types and 14 non-immune tissues. We submitted the full list of candidate genes (n = 43) obtained from the significant plus suggestive DAPPLE analysis and for each gene we obtained a gene expression value in every tissue or cell type tested. Because background characteristics differ between probe sets, a direct comparison of expression across different genes was not possible. Therefore, we standardized the expression values of each gene across tissues and used the resulting z-values for all subsequent analyses. Figure 3 shows the standardized expression values in the 24 tissues and cell types tested. Expression appeared particularly high in whole blood as well as in most immune-related cell types (in particular B cells, plasmacytoid dendritic cells (pDCs), natural killer (NK) cells, CD4+ and CD8+ T cells). An independent-sample Kruskal-Wallis test confirmed that gene expression was significantly different across tissues (p<0.001). When tissues were divided into immune and non-immune, expression differed significantly between the two groups (p<0.001) (Figure 4). When compared to average expression across tissues, candidate genes were significantly overexpressed in B-lymphoblasts, pDCs, monocytes, B cells, NK cells, CD4+ T cells (p<0.001), CD34+ hematopoietic cells (p = 0.001) and CD8+ T cells (p = 0.003). Expression patterns were similar for significantly and suggestively associated loci.
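The per-gene standardization and the immune versus non-immune comparison described above can be sketched as follows. This is an illustrative sketch only: the gene names and expression values are made up, the use of the population standard deviation is an assumption (the text does not say which estimator was used), and only the Mann-Whitney U statistic is computed, not its p-value.

```python
from statistics import mean, pstdev

def zscore_per_gene(expression):
    """Standardize each gene across tissues: {gene: {tissue: value}} -> z-values."""
    standardized = {}
    for gene, by_tissue in expression.items():
        values = list(by_tissue.values())
        mu, sd = mean(values), pstdev(values)  # assumes the gene is not constant
        standardized[gene] = {t: (v - mu) / sd for t, v in by_tissue.items()}
    return standardized

def mann_whitney_u(x, y):
    """U statistic for x: each pair with x > y counts 1, each tie counts 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)

# Hypothetical expression values on very different scales per probe set.
toy_expression = {
    "GENE_A": {"B cells": 900.0, "NK cells": 700.0, "liver": 100.0, "heart": 300.0},
    "GENE_B": {"B cells": 30.0, "NK cells": 50.0, "liver": 10.0, "heart": 10.0},
}
immune_tissues = {"B cells", "NK cells"}

z = zscore_per_gene(toy_expression)
immune_z = [v for g in z.values() for t, v in g.items() if t in immune_tissues]
other_z = [v for g in z.values() for t, v in g.items() if t not in immune_tissues]
u_stat = mann_whitney_u(immune_z, other_z)
print(u_stat)
```

Standardizing within each gene removes the probe-specific scale, so z-values from different genes can be pooled before the rank-based comparison.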
Figure 3

Expression values of candidate genes (genes to prioritize) in all 24 tissues and cell types tested.

Figure 4

Expression values of candidate genes (genes to prioritize) in immune and non immune tissues.

We further confirmed the immunological nature of these candidate genes using DAVID [18], [19], a bioinformatics tool that identifies the biological processes in which a group of genes is involved. Candidate genes were significantly enriched for immune-related processes such as regulation of leukocyte activation (p = 3.10×10⁻⁸), regulation of T cell proliferation (p = 3.25×10⁻⁸), positive regulation of immune system processes (p = 7.7×10⁻⁷), regulation of protein kinase cascade (p = 5.46×10⁻⁴) and regulation of cytokine production (p = 0.001459) (see Table S3 for the full list). GO enrichment was similar for significantly and suggestively associated loci.
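As a rough illustration of what an enrichment p-value of this kind measures, the sketch below computes a plain hypergeometric upper tail. It is not DAVID's procedure, whose own statistic may differ, and the counts used in the example (term size, background size, number of annotated candidate genes) are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

# Hypothetical numbers: 8 of 43 candidate genes carry a term that annotates
# 300 genes in a background of 20000.
p_enrich = hypergeom_enrichment_p(k=8, n=43, K=300, N=20000)
print(p_enrich)
```

The smaller the fraction of the background carrying a term, the fewer annotated candidate genes are needed before the overlap becomes unlikely under random sampling.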

Discussion

We showed that gene products coded in loci strongly associated with MS risk substantially interact with each other. Both direct and indirect interactions were significantly higher than would be expected by chance alone. When the PPI analysis was extended to suggestive SNPs, we found an increased number of total proteins participating in the network and of direct interactions (Figures 1 and 2). The only parameter that did not reach significance was the number of indirect interactions. This finding could be explained by the possible lack of real MS association among several suggestive SNPs. However, including suggestive SNPs in the PPI analysis increased the number of genes to prioritize from 22 to 43. Interestingly, more than half of these genes (n = 23) were located within suggestive regions and many of them directly interacted with proteins coded within significant regions (e.g. CSF2RB-CBLB, IL6-IL2RA, MAP3K14-NFKB1, SYK-STAT3, see Table S2). Taken together, the suggestive statistical evidence of genetic association and the functional evidence of protein-protein interaction support the hypothesis that these genes could play an important role in the pathogenesis of MS. We validated our results by looking at tissue-specific expression of these candidate genes. Using the BioGPS database we were able to show that the suggestively associated genes identified by DAPPLE were largely and specifically expressed in immune cells as compared to other tissues. A gene ontology analysis also confirmed the immune-related functions of these genes. More generally, these findings provide additional support for the immunological nature of MS [20]. Notably, candidate gene expression was particularly high among CD8+ and CD4+ T cells, B cells, NK cells and pDCs. Interestingly, all these cell types have been implicated in the pathogenesis of MS. Several immune-specific genes are located within MS suggestive regions.
For example, a SNP located near the gene encoding the Spleen Tyrosine Kinase (SYK) was found suggestive of association in Sawcer et al. Notably SYK was particularly highly expressed in B cells, DCs, monocytes, CD33+ myeloid cells and NK cells. This protein has a central role in adaptive immune receptor signalling by phosphorylation of the immunoreceptor tyrosine-based activation motifs (ITAMs) [21]. SYK-mediated ITAM phosphorylation activates signalling intermediates such as NF-κB, JNK and PYK2 that ultimately lead to lymphocyte activation [22]. ITAM signals mediated by SYK can also induce expansion of NK cells [23]. Interestingly, the SYK inhibitor R788 (fostamatinib) has beneficial effects in patients affected by rheumatoid arthritis (RA) when compared to placebo [24].

CSF2RB is another gene particularly highly expressed in B cells, DCs, monocytes, CD33+ myeloid cells and NK cells. It codes for the β-subunit (βc) of the granulocyte-macrophage colony-stimulating factor (GM-CSF), IL-3 and IL-5 receptors that are expressed by peripheral leucocytes and blood DCs [25]. This gene appears to play an important role in allergic inflammation [26]. Interestingly, associations between CSF2RB and schizophrenia [27] and bipolar disorder [28] have been recently found.

EIF4EBP2 encodes the Eukaryotic Translation Initiation Factor 4E Binding Protein 2. The members of this family of proteins (4EBPs) can inhibit translation initiation through binding eIF4E [29]. 4EBPs regulate cell proliferation by interaction with the mTORC1 pathway [30]. In addition, EIF4EBP1 knock-out mice showed type I IFN overproduction in pDCs [31]. We found over-expression of EIF4EBP2 in pDCs, CD4 cells, CD8 cells and NK cells.

CHST12 encodes the carbohydrate (chondroitin 4-O) sulfotransferase 2, a protein located in the membrane of the Golgi apparatus which is implicated in chondroitin and dermatan sulphate (DS) synthesis in different tissues [32]. DS proteoglycans participate in various biological events such as extracellular matrix assembly, cell adhesion, migration and proliferation [33]. We found high expression of CHST12 in pDCs, CD4 cells, CD8 cells and NK cells.

To conclude, a number of proteins coded by genes located within MS-associated genomic regions are implicated in the same PPI networks. The extent of this interaction substantially increases when genomic regions with suggestive evidence of association are included in the analysis. This suggests that at least some of these suggestive GWAS hits represent truly associated loci, and thus that more common variants associated with MS remain to be found. Finally, we further confirmed the immunological nature of MS and showed that a single cell type cannot explain the complexity of this disease. Future functional studies should investigate how and in which cell types the suggestive candidate genes are acting. This will improve our knowledge of this complex disease and hopefully provide future strategies for disease prevention and treatment.

Table S1. Direct connections and list of genes from DAPPLE analysis of significant SNPs. (XLSX)
Table S2. Direct connections and list of genes from DAPPLE analysis of significant plus suggestive SNPs. (XLSX)
Table S3. DAVID gene ontology. (XLSX)
References (33 in total)

Review 1.  Multiple sclerosis: risk factors, prodromes, and potential causal pathways.

Authors:  Sreeram V Ramagopalan; Ruth Dobson; Ute C Meier; Gavin Giovannoni
Journal:  Lancet Neurol       Date:  2010-07       Impact factor: 44.182

2.  Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources.

Authors:  Da Wei Huang; Brad T Sherman; Richard A Lempicki
Journal:  Nat Protoc       Date:  2009       Impact factor: 13.491

3.  Genome-wide meta-analysis identifies novel multiple sclerosis susceptibility loci.

Authors:  Nikolaos A Patsopoulos; Federica Esposito; Joachim Reischl; Stephan Lehr; David Bauer; Jürgen Heubach; Rupert Sandbrink; Christoph Pohl; Gilles Edan; Ludwig Kappos; David Miller; Javier Montalbán; Chris H Polman; Mark S Freedman; Hans-Peter Hartung; Barry G W Arnason; Giancarlo Comi; Stuart Cook; Massimo Filippi; Douglas S Goodin; Douglas Jeffery; Paul O'Connor; George C Ebers; Dawn Langdon; Anthony T Reder; Anthony Traboulsee; Frauke Zipp; Sebastian Schimrigk; Jan Hillert; Melanie Bahlo; David R Booth; Simon Broadley; Matthew A Brown; Brian L Browning; Sharon R Browning; Helmut Butzkueven; William M Carroll; Caron Chapman; Simon J Foote; Lyn Griffiths; Allan G Kermode; Trevor J Kilpatrick; Jeanette Lechner-Scott; Mark Marriott; Deborah Mason; Pablo Moscato; Robert N Heard; Michael P Pender; Victoria M Perreau; Devindri Perera; Justin P Rubio; Rodney J Scott; Mark Slee; Jim Stankovich; Graeme J Stewart; Bruce V Taylor; Niall Tubridy; Ernest Willoughby; James Wiley; Paul Matthews; Filippo M Boneschi; Alastair Compston; Jonathan Haines; Stephen L Hauser; Jacob McCauley; Adrian Ivinson; Jorge R Oksenberg; Margaret Pericak-Vance; Stephen J Sawcer; Philip L De Jager; David A Hafler; Paul I W de Bakker
Journal:  Ann Neurol       Date:  2011-12       Impact factor: 10.422

4.  Translational control of the innate immune response through IRF-7.

Authors:  Rodney Colina; Mauro Costa-Mattioli; Ryan J O Dowling; Maritza Jaramillo; Lee-Hwa Tai; Caroline J Breitbach; Yvan Martineau; Ola Larsson; Liwei Rong; Yuri V Svitkin; Andrew P Makrigiannis; John C Bell; Nahum Sonenberg
Journal:  Nature       Date:  2008-02-13       Impact factor: 49.962

5.  Parent-of-origin effect in multiple sclerosis: observations in half-siblings.

Authors:  G C Ebers; A D Sadovnick; D A Dyment; I M L Yee; C J Willer; Neil Risch
Journal:  Lancet       Date:  2004-05-29       Impact factor: 79.321

6.  The IL-3/IL-5/GM-CSF common receptor plays a pivotal role in the regulation of Th2 immunity and allergic airway inflammation.

Authors:  Kelly L Asquith; Hayley S Ramshaw; Philip M Hansbro; Kenneth W Beagley; Angel F Lopez; Paul S Foster
Journal:  J Immunol       Date:  2008-01-15       Impact factor: 5.422

7.  Multiple sclerosis immunology: The healthy immune system vs the MS immune system.

Authors:  Lloyd H Kasper; Jennifer Shoemaker
Journal:  Neurology       Date:  2010-01-05       Impact factor: 9.910

8.  Proteins encoded in genomic regions associated with immune-mediated disease physically interact and suggest underlying biology.

Authors:  Elizabeth J Rossin; Kasper Lage; Soumya Raychaudhuri; Ramnik J Xavier; Diana Tatar; Yair Benita; Chris Cotsapas; Mark J Daly
Journal:  PLoS Genet       Date:  2011-01-13       Impact factor: 5.917

9.  Modeling the cumulative genetic risk for multiple sclerosis from genome-wide association data.

Authors:  Joanne H Wang; Derek Pappas; Philip L De Jager; Daniel Pelletier; Paul Iw de Bakker; Ludwig Kappos; Chris H Polman; Lori B Chibnik; David A Hafler; Paul M Matthews; Stephen L Hauser; Sergio E Baranzini; Jorge R Oksenberg
Journal:  Genome Med       Date:  2011-01-18       Impact factor: 11.117

10.  Gene-wide analyses of genome-wide association data sets: evidence for multiple common risk alleles for schizophrenia and bipolar disorder and for overlap in genetic risk.

Authors:  V Moskvina; N Craddock; P Holmans; I Nikolov; J S Pahwa; E Green; M J Owen; M C O'Donovan
Journal:  Mol Psychiatry       Date:  2008-12-09       Impact factor: 15.992
