
Tissue specificity and the human protein interaction network.

Alice Bossi, Ben Lehner.

Abstract

A protein interaction network describes a set of physical associations that can occur between proteins. However, within any particular cell or tissue only a subset of proteins is expressed and so only a subset of interactions can occur. Integrating interaction and expression data, we analyze here this interplay between protein expression and physical interactions in humans. Proteins only expressed in restricted cell types, like recently evolved proteins, make few physical interactions. Most tissue-specific proteins do, however, bind to universally expressed proteins, and so can function by recruiting or modifying core cellular processes. Conversely, most 'housekeeping' proteins that are expressed in all cells also make highly tissue-specific protein interactions. These results suggest a model for the evolution of tissue-specific biology, and show that most, and possibly all, 'housekeeping' proteins actually have important tissue-specific molecular interactions.

Year: 2009; PMID: 19357639; PMCID: PMC2683721; DOI: 10.1038/msb.2009.17

Source DB: PubMed; Journal: Mol Syst Biol; ISSN: 1744-4292; Impact factor: 11.429


Introduction

Nearly all processes in biology are dependent on the precise physical interactions among many individual proteins. These range from the maintenance of cellular architecture and the propagation of the genetic material, to the ability of cells to process and respond to environmental information. Defining a near-complete map of the physical interactions that can occur between human proteins—the human protein ‘interactome'—is an important ambition of current research. Similar to the sequence of the human genome, the human interactome serves as a resource for researchers and can be used to understand how proteins are organized to perform functions within a cell (Bork ; Cusick ). Protein interactome mapping projects were pioneered in model organisms (Uetz ; Walhout ; Ito ; Ho ; Li ; Gavin ; Krogan ), with initial efforts in humans focused on particular pathways or genomic regions (Bouwmeester ; Lehner and Sanderson, 2004; Lehner ; Jeronimo ). More recently, the cloning of large sets of human open reading frames and improvements in interaction assays have allowed these efforts to be expanded by an order of magnitude to the scale of the human proteome (Rual ; Stelzl ; Ewing ). These data, combined with extensive efforts to collate known interactions from the scientific literature (Bader ; Xenarios ; Pagel ; Persico ; Stark ; Kerrien ; Vastrik ; Ruepp ), mean that there is now a reasonably extensive resource of known human protein interactions (Hart ). A global interactome network provides an overview of all of the physical interactions that can occur between human proteins. However, very little is known about when and where each of these interactions can occur. Within any particular cell or tissue of the human body not all protein interactions can occur. Most simply, if two genes are not expressed in a cell, then an interaction between their protein products cannot occur. 
In unicellular organisms, one approach that has been used to investigate the dynamics of interaction networks between cellular states has been to integrate interactome data with expression data. This approach has been used to identify co-regulated interaction modules (Ihmels ; Komurov and White, 2007) or to investigate the relationships between interaction network topology and gene co-expression (Han ). Additional studies have used gene expression (Luscombe ; de Lichtenberg ) or functional information (Rachlin ) to investigate the cellular conditions (or ‘context') under which interactions can occur, and to distinguish between condition-dependent and condition-independent interactions. In the present study, we apply a similar approach to the human protein interaction network, using global gene expression data to identify the human cells and tissues in which each interaction can or cannot occur. By performing this analysis, we are able to investigate the relationship between the tissue specificity of a protein and its number of interaction partners. Moreover, and strikingly, we find extensive communication between universally expressed proteins and those with tissue-specific expression. Even the most tissue-specific proteins normally interact directly with components of the core cellular machinery. Conversely, nearly all universally expressed ‘housekeeping' proteins have protein interactions that can only occur in a restricted subset of cells. Our results suggest a model for the evolution of tissue-specific functions through the modification and re-use of core cellular processes, and that most ‘housekeeping' proteins should probably be considered as important for tissue-specific processes.

Results

Construction of a global human protein interaction network

To construct a global human physical protein interaction network, we integrated data from 21 different sources to define a network of 80 922 physical interactions that can occur between 10 229 human proteins. We only included interactions supported by at least one piece of direct experimental evidence demonstrating physical association between two human proteins (see Materials and methods; Supplementary Table 1). Moreover, to account for differences in interaction assay reliability, throughout this work, we also consider a high-confidence subset of this global network that consists of interactions reported in at least two independent primary research publications. There are a total of 13 102 of these multiple publication-supported interactions that connect 4750 human proteins.

Determining the tissue specificity of human protein interactions

We then used gene expression data (Su ) to determine the cells and tissues of the human body in which each of these interactions can occur (Figure 1A). If two genes are co-expressed in a cell, then under some condition their products can physically interact in that cell. However, if the two proteins are not both expressed in a tissue, then the interaction cannot occur in this tissue. The complete set of interactions, their supporting evidence, and the cells and tissues in which each interaction can occur are provided as Supplementary Table 1 as a resource for researchers interested in the biology of any particular human cell or tissue.
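
The co-expression rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function name `interaction_tissues`, the gene names, and the tissue names are all hypothetical, and the tissues in which an interaction can occur are taken to be the intersection of the tissues in which its two partners are expressed.

```python
from typing import Dict, FrozenSet, Set, Tuple


def interaction_tissues(
    interactions: Set[Tuple[str, str]],
    expressed_in: Dict[str, Set[str]],
) -> Dict[Tuple[str, str], FrozenSet[str]]:
    """Map each interaction to the tissues in which both partners are expressed."""
    result = {}
    for a, b in interactions:
        # Pairs without expression data for either partner are skipped.
        if a not in expressed_in or b not in expressed_in:
            continue
        # An interaction can occur only where both proteins are present.
        result[(a, b)] = frozenset(expressed_in[a] & expressed_in[b])
    return result


# Toy data with hypothetical gene and tissue names.
expressed_in = {
    "GENE_A": {"liver", "brain", "heart"},
    "GENE_B": {"brain"},
    "GENE_C": {"kidney"},
}
interactions = {("GENE_A", "GENE_B"), ("GENE_A", "GENE_C")}
local = interaction_tissues(interactions, expressed_in)
```

In this toy example the GENE_A/GENE_B interaction is possible only in brain, and the GENE_A/GENE_C interaction has an empty tissue set because the two proteins are never co-expressed.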
Figure 1

Tissue-specific and recently evolved proteins make few protein interactions. (A) Integrating protein interaction and expression data to construct ‘local' interactomes for human cells and tissues. (B, C) The relationship between protein interaction degree and protein expression breadth (the number of tissues in which a protein is expressed) for the complete human protein interaction network (B), and (C) for ancestral (pre-metazoan) proteins (blue) and for metazoan-specific proteins (red). P<10e−15 in all cases, Kolmogorov–Smirnov test. Bars indicate one standard error. Interaction degree is the maximum number of co-expressed interaction partners. The same analysis is performed for the multiple-support network and for a network without protein complex-derived interactions in Supplementary Figure 1.
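
The two quantities compared in Figure 1B and C can be illustrated with a short sketch. It assumes one reading of the caption: that a protein's interaction degree is the largest number of its partners co-expressed with it in any single tissue, and that expression breadth is the number of tissues in which the protein is expressed. The function names and toy data are hypothetical.

```python
from collections import defaultdict


def max_coexpressed_degree(interactions, expressed_in):
    """Largest number of partners co-expressed with each protein in one tissue."""
    partners = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # self-interactions do not add a partner
        partners[a].add(b)
        partners[b].add(a)

    degree = {}
    for protein, neighbours in partners.items():
        tissues = expressed_in.get(protein, set())
        # For every tissue where the protein is expressed, count the
        # partners that are also expressed there, then keep the maximum.
        degree[protein] = max(
            (
                sum(1 for n in neighbours if t in expressed_in.get(n, set()))
                for t in tissues
            ),
            default=0,
        )
    return degree


def expression_breadth(expressed_in):
    """Number of tissues in which each protein is expressed."""
    return {protein: len(tissues) for protein, tissues in expressed_in.items()}


# Toy data with hypothetical names.
expr = {
    "P1": {"liver", "brain"},
    "P2": {"brain"},
    "P3": {"brain"},
    "P4": {"liver"},
}
edges = {("P1", "P2"), ("P1", "P3"), ("P1", "P4")}
degree = max_coexpressed_degree(edges, expr)
breadth = expression_breadth(expr)
```

Here P1 has three partners in the global network but at most two in any one tissue, which is the distinction the caption draws.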

Tissue-specific and recently evolved proteins make few protein interactions

We first examined the relationship between the tissue specificity of a protein and the number of interactions that it makes (a protein's interaction degree). We find that more tissue-specific proteins make fewer interactions than widely expressed proteins (Figure 1B, Spearman's rho=0.19, P<2.2e−16). This is true both for the complete and for the multiple-support interaction dataset (Supplementary Figure 1A), and when excluding all protein complexes (Supplementary Figure 1B). It has been shown earlier that tissue-specific proteins are more likely to be recent evolutionary innovations than universally expressed proteins (Lehner and Fraser, 2004b). We find that more-recently evolved proteins have fewer interactions than ancient proteins, but that the relationship between tissue specificity and interaction degree is seen for both sets of proteins (Figure 1C). That is, the older a protein is, and the more tissues in which it is expressed, the more protein interactions it is likely to have.

The most tissue-specific proteins normally interact with core cellular components

We next analyzed the extent to which tissue-specific proteins interact with the most widely expressed proteins. We find that even when only considering the most tissue-restricted proteins (proteins expressed in ⩽10/79 tissues), most of them are known to interact directly with universally expressed human proteins (Figure 2A). The same result is seen when only considering high-confidence human protein interactions (Supplementary Figure 2A), and when using diverse definitions of universally expressed proteins (Figure 2A). Thus, most tissue-specific proteins can function by directly contacting components of the core cellular machinery.
Figure 2

Most tissue-specific proteins interact with core cellular components, and most housekeeping proteins have tissue-specific physical interactions. (A) The proportion of the most tissue-specific proteins (proteins expressed in only 1–10/79 tissues) that interact with universally expressed housekeeping proteins. (B) The percentage of housekeeping proteins that interact with non-housekeeping proteins. These data are for the complete network. The same analysis is shown for the high-confidence multiple-support network in Supplementary Figure 2. Housekeeping proteins are defined by 10 criteria: (1) this study 79/79 tissues, (2) this study 71–79 tissues, (3) this study 79/79 tissues with reduced expression stringency, (4) this study 71–79 tissues with reduced stringency, (5) this study 79/79 tissues with increased stringency, (6) this study 71–79 tissues with increased stringency, (7) Zhu et al microarray data 18/18 tissues, (8) Zhu et al microarray data 16–18 tissues, (9) Zhu et al EST data 18/18 tissues, (10) Zhu et al EST data 16–18 tissues (Zhu ). (C) Many proteins make interactions that can only occur in a subset of the tissues in which they are expressed. The number of tissues in which the interactions of a protein can occur is compared with the number of tissues in which a protein is expressed for proteins falling into each of the eight bins of tissue specificity. Data are shown for the complete network. Data for the filtered multiple-support network and reduced and increased stringency expression thresholds are shown in Supplementary Figure 3.

Most universally expressed proteins have tissue-specific protein interactions

Constitutively expressed proteins are often considered as important for ‘housekeeping' biological processes that are required in all cells. However, nearly all of the most widely expressed proteins have interactions with other proteins that are not themselves universally expressed (Figure 2B). That is, most universally expressed proteins have physical interactions that can only occur in a restricted subset of cells and tissues. The same result is seen when using the complete interaction dataset, when only considering high-confidence interactions described in multiple independent publications (Supplementary Figure 2B), or when using diverse definitions of universally expressed proteins (Figure 2B). Thus most, and possibly all, universally expressed proteins have tissue-specific molecular interactions. Proteins that themselves have restricted expression patterns also have many interactions that can only occur in a subset of the tissues in which they are expressed (Figure 2C). That is, as a consequence of interactions between more and less widely expressed proteins, human protein interactions are often more tissue specific than proteins (P<10^−16).

Extensive re-use of housekeeping proteins for tissue-specific biological processes

To further illustrate how housekeeping proteins are widely re-used for tissue-specific biological processes, we considered neuronal protein complexes that function in synaptic transmission, learning, and memory. The subunits of these complexes have been identified by extensive proteomic approaches, and the importance of individual subunits for learning and memory has been validated by genetic studies in mice and by clinical studies in humans (Pocklington ). We estimate that ∼20–60% of the subunits of these neuronal-specific complexes are actually universally expressed housekeeping proteins (Figure 3A and B). Moreover, in ∼30% of cases, these housekeeping subunits have genetically verified roles in learning and memory (Figure 3C). Thus, universally expressed proteins, through their tissue-specific interactions, can be re-used for, and can be essential to, highly tissue-specific biological processes.
Figure 3

The re-use of housekeeping proteins for tissue-specific functions. Here we use the example of neurotransmitter receptor protein complexes identified by affinity purification followed by mass spectrometry (Pocklington ). (A) A section of the binary protein interaction network of neurotransmitter receptor complexes, with subunits marked as universally expressed (housekeeping) proteins (yellow) and non-housekeeping (blue). The housekeeping and non-housekeeping interaction partners of the housekeeping protein Rac1 are highlighted and labeled as examples. (B) The percentage of subunits of neurotransmitter receptor protein complexes considered as universally expressed housekeeping proteins is shown for 10 different criteria of housekeeping proteins, as described in Figure 2. Criterion 10 is used in panel A. (C) The proportion of these housekeeping subunits that have been experimentally verified as essential for learning and memory in mouse models or that are implicated in psychiatric disease in humans is shown for the same 10 criteria of housekeeping proteins. Protein complex subunits, binary protein–protein interactions, and genetic data are all from Pocklington ). The network in (A) was visualized using Biolayout Express (3D) (Freeman ).

Discussion

The evolution of tissue-specific biological processes

Taken together, our findings suggest the following model for the evolution of tissue-specific functions. Many (but not all) tissue-specific proteins are recent evolutionary innovations (Lehner ). In general, these tissue-specific proteins initially make few interactions, and these interactions are frequently with much more widely expressed and ‘housekeeping' components of the cell. Thus, many tissue-specific proteins probably function by directly recruiting or modifying the activities of core cellular components. There are, however, exceptions to this trend, with some tissue-specific proteins acting as ‘local' hubs in the interaction network of a particular tissue (our unpublished observation).

Frequent re-use of housekeeping proteins for tissue-specific biology

Universally expressed ‘housekeeping' proteins tend to make many interactions. Many of these interactions (∼50–60%, Supplementary Figure 3) are with other housekeeping proteins. However, the majority of universally expressed proteins also make interactions that can only occur in a subset of the tissues in which they are expressed. Therefore, there appears to be very frequent, and possibly universal, re-use of ‘housekeeping' proteins to perform tissue-specific biological processes. That is, most housekeeping proteins can be considered to be important for different (or at least modified) biological processes in different tissues. In summary, our results suggest that it might be better to consider the biology of any particular tissue in the terms of the particular interactions that can occur in that tissue, rather than simply in the terms of the unique proteins that are expressed there.

The importance of interaction network dynamics

In unicellular yeast, broadly expressed proteins can have precisely temporally regulated activities because of their interactions with proteins with restricted expression profiles (de Lichtenberg ). We show here that a similar process may be widely used in multicellular organisms to restrict and modify the activities of a protein to a subset of the tissues in which it is expressed. Together with earlier analyses in yeast (Han ; Luscombe ; de Lichtenberg ), this work highlights the importance of considering global interaction networks as having dynamic, not static, structures and topologies. Additional work analyzing how the networks of molecular interactions change between cell types, states, and conditions should prove a fruitful approach for understanding living systems.

Materials and methods

Protein interaction data

We compiled human protein interactions from a total of 21 different databases, as listed in Table 1. We required that each interaction be supported by at least one piece of direct experimental evidence demonstrating physical association between two human proteins, and removed all interactions that did not meet these criteria. All interactions were mapped to common Ensembl gene identifiers. The complete network (‘CRG-all') consists of 80 922 interactions between 10 229 human proteins (approximately half the human proteome) and is available as Supplementary Table 1.
Table 1

Human protein interaction datasets used to construct or support the integrated human interaction network

Dataset (a) | Description | Reference
BIND | Literature curation | Bader et al (2001)
BIOGRID | Literature curation | Stark et al (2006)
Bioverse | Data integration | McDermott et al (2005)
CCSB-HI1 | Yeast two-hybrid | Rual et al (2005)
Co-citation | Text-mining | Ramani et al (2005)
Co-expression conservation | Conserved co-expression relationships | Ramani et al (2008)
CORUM | Literature curation | Ruepp et al (2008)
DIP | Literature curation | Xenarios et al (2002)
HPRD | Literature curation | Peri et al (2003)
Intact | Literature curation | Kerrien et al (2007)
IntNetDB | Data integration | Xia et al (2006)
MDC | Yeast two-hybrid | Stelzl et al (2005)
MINT | Literature curation, orthology | Persico et al (2005)
MIPS | Literature curation | Pagel et al (2005)
OPHID | Orthology | Brown and Jurisica (2005)
Ottowa | Affinity purification-mass spectrometry | Ewing et al (2007)
PC/Ataxia | Yeast two-hybrid | Lim et al (2006)
Reactome | Literature curation | Vastrik et al (2007)
Sanger | Orthology | Lehner and Fraser (2004a)
Transcription complexes | Affinity purification-mass spectrometry | Jeronimo et al (2007)
Unilever | Data integration | http://www.cytoscape.org/cgi-bin/moin.cgi/Data_Sets

(a) Conserved co-expression, co-citation, or evolutionary conservation data are only used in the final network as additional supporting evidence. All interactions must have at least one piece of experimental binding evidence to be included in the final dataset, or physical binding evidence from at least two publications to be included in the multiple-support network.

Filtered interaction dataset

In total, 13 102 of the interactions in our network between 4750 proteins are supported by experimental evidence of physical binding reported in at least two different primary research publications. Given the multiple lines of evidence supporting these interactions, we use this subset of interactions (‘CRG-filtered') as high-confidence interactions to confirm that our conclusions are not affected by interaction data quality or sampling (see Supplementary Figures).
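
The filtering step can be sketched as follows. This is a minimal illustration, assuming each interaction is stored as an unordered pair mapped to the set of primary publications that report physical binding; the function name `multiple_support` and the identifiers in the toy data are hypothetical.

```python
def multiple_support(evidence, min_publications=2):
    """Keep interactions reported in at least `min_publications` publications.

    `evidence` maps an unordered protein pair (a frozenset) to the set of
    publication identifiers that report physical binding for that pair.
    """
    return {
        pair
        for pair, publications in evidence.items()
        if len(publications) >= min_publications
    }


# Toy evidence table with hypothetical protein and publication identifiers.
evidence = {
    frozenset({"P1", "P2"}): {"pub_1", "pub_2"},
    frozenset({"P1", "P3"}): {"pub_1"},
    frozenset({"P2", "P3"}): {"pub_2", "pub_3", "pub_4"},
}
filtered = multiple_support(evidence)
```

Storing each pair as a frozenset means the same interaction reported as A-B in one source and B-A in another is counted once, with its publications pooled.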

Expression data

To identify which protein interactions can occur in a particular cell or tissue type, we used global gene expression data. Although interactions can be regulated by localization, phosphorylation, etc., we aim to distinguish the proteins that can interact under some condition in a tissue from those that cannot, and mRNA expression is a reasonable indicator of this potential. We used expression data from the GNF Atlas project that measured expression across 79 different human cell or tissue types (Su ). The MAS5 normalized expression levels were averaged between experimental replicates, and in cases where more than one probe set was present for a gene, the more sensitive probe set was used. In this dataset, a gene is considered as present in a tissue if its normalized expression level is >200 (Su ). However, our conclusions remain the same when this stringency is increased or decreased (see Supplementary information). At this threshold, >98% of the interaction partners in our global network for which expression information is available are co-expressed in at least one human tissue.
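
The presence call described in this section might be sketched as below. Two points are assumptions rather than details from the text: the input layout (probe set, then tissue, then a list of replicate values) and the reading of "more sensitive probe set" as the probe set with the highest mean signal across tissues. The function name and the toy values are hypothetical; only the threshold of 200 comes from the text.

```python
from statistics import mean

PRESENT_THRESHOLD = 200  # normalized expression level stated in the text


def call_expression(probe_values, threshold=PRESENT_THRESHOLD):
    """Return the tissues in which one gene is called present.

    `probe_values` has the form {probe_set: {tissue: [replicate values]}}.
    """
    # Average the replicate measurements for each probe set and tissue.
    averaged = {
        probe: {tissue: mean(values) for tissue, values in tissues.items()}
        for probe, tissues in probe_values.items()
    }
    # Assumption: the "more sensitive" probe set is the one with the
    # highest mean signal across tissues.
    best = max(averaged, key=lambda probe: mean(averaged[probe].values()))
    # A gene is present where its averaged level exceeds the threshold.
    return {t for t, level in averaged[best].items() if level > threshold}


# Toy measurements with hypothetical probe-set and tissue names.
gene_probes = {
    "probe_1": {"liver": [300, 500], "brain": [100, 120]},
    "probe_2": {"liver": [150, 170], "brain": [90, 80]},
}
present = call_expression(gene_probes)
```

Passing a different `threshold` reproduces the kind of stringency check mentioned in the text, where the cut-off is raised or lowered to test whether conclusions change.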

Housekeeping proteins

We identified universally expressed housekeeping proteins using a total of 10 different criteria. First, we used the GNF Atlas data, and considered housekeeping proteins as those with an expression level above 200 in all 79 tissues, or in more than 70/79 tissues (i.e. allowing for some false-negatives). Second, we used the same two tissue criteria, but increased (250) or decreased (150) the stringency at which a gene is considered expressed. Third, we used four additional sets defined in an earlier publication—genes identified as expressed in 18/18 or at least 16/18 tissues using microarray data, and genes with the same tissue criteria but defined using expressed sequence tag (EST) data (Zhu ).
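
The expression-based criteria can be written as one parameterized rule. The sketch below is an illustration only: the function name `housekeeping`, the input layout, and the toy data are hypothetical, and the toy example uses three tissues instead of 79. For the GNF Atlas data, the criteria in the text would correspond to thresholds of 150, 200, or 250 combined with either all 79 tissues or at least 71 of 79.

```python
def housekeeping(levels, threshold=200, min_tissues=None):
    """Return genes expressed above `threshold` in enough tissues.

    `levels` maps gene -> {tissue: normalized expression level}.
    If `min_tissues` is None, the gene must be present in every tissue.
    """
    selected = set()
    for gene, by_tissue in levels.items():
        if not by_tissue:
            continue  # no measurements, so no call can be made
        n_present = sum(1 for level in by_tissue.values() if level > threshold)
        required = len(by_tissue) if min_tissues is None else min_tissues
        if n_present >= required:
            selected.add(gene)
    return selected


# Toy data with hypothetical gene and tissue names.
levels = {
    "GENE_X": {"t1": 300, "t2": 260, "t3": 210},
    "GENE_Y": {"t1": 300, "t2": 180, "t3": 400},
}
strict_all = housekeeping(levels)                     # present in all tissues
relaxed_all = housekeeping(levels, threshold=150)     # reduced stringency
most_tissues = housekeeping(levels, min_tissues=2)    # tolerate a false negative
```

Allowing a gene to be absent in a few tissues, as the `min_tissues` argument does, mirrors the text's allowance for false negatives in the expression calls.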

Neurotransmitter receptor complexes

Components of N-methyl-D-aspartate receptor and metabotropic receptor complexes were identified by extensive proteomic studies as described (Pocklington ). We used the 215 subunits of these complexes that could be mapped to human Ensembl gene identifiers, of which 77 have demonstrated roles in learning and memory through genetic studies in mice or are implicated in psychiatric disorders in humans (Pocklington ). We used the sets of housekeeping proteins described above to identify how many of these subunits represent universally expressed proteins.

Protein evolution

Proteins were classified as metazoan specific or pre-metazoan using the analysis of Freilich .

Conflict of interest

The authors declare that they have no conflict of interest.
References

1.  BIND--The Biomolecular Interaction Network Database.

Authors:  G D Bader; I Donaldson; C Wolting; B F Ouellette; T Pawson; C W Hogue
Journal:  Nucleic Acids Res       Date:  2001-01-01       Impact factor: 16.971

2.  A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae.

Authors:  P Uetz; L Giot; G Cagney; T A Mansfield; R S Judson; J R Knight; D Lockshon; V Narayan; M Srinivasan; P Pochart; A Qureshi-Emili; Y Li; B Godwin; D Conover; T Kalbfleisch; G Vijayadamodar; M Yang; M Johnston; S Fields; J M Rothberg
Journal:  Nature       Date:  2000-02-10       Impact factor: 49.962

3.  Protein interaction mapping in C. elegans using proteins involved in vulval development.

Authors:  A J Walhout; R Sordella; X Lu; J L Hartley; G F Temple; M A Brasch; N Thierry-Mieg; M Vidal
Journal:  Science       Date:  2000-01-07       Impact factor: 47.728

4.  Systematic analysis of the protein interaction network for the human transcription machinery reveals the identity of the 7SK capping enzyme.

Authors:  Célia Jeronimo; Diane Forget; Annie Bouchard; Qintong Li; Gordon Chua; Christian Poitras; Cynthia Thérien; Dominique Bergeron; Sylvie Bourassa; Jack Greenblatt; Benoit Chabot; Guy G Poirier; Timothy R Hughes; Mathieu Blanchette; David H Price; Benoit Coulombe
Journal:  Mol Cell       Date:  2007-07-20       Impact factor: 17.970

5.  Revealing static and dynamic modular architecture of the eukaryotic protein interaction network.

Authors:  Kakajan Komurov; Michael White
Journal:  Mol Syst Biol       Date:  2007-04-24       Impact factor: 11.429

6.  CORUM: the comprehensive resource of mammalian protein complexes.

Authors:  Andreas Ruepp; Barbara Brauner; Irmtraud Dunger-Kaltenbach; Goar Frishman; Corinna Montrone; Michael Stransky; Brigitte Waegele; Thorsten Schmidt; Octave Noubibou Doudieu; Volker Stümpflen; H Werner Mewes
Journal:  Nucleic Acids Res       Date:  2007-10-26       Impact factor: 16.971

7.  Construction, visualisation, and clustering of transcription networks from microarray expression data.

Authors:  Tom C Freeman; Leon Goldovsky; Markus Brosch; Stijn van Dongen; Pierre Mazière; Russell J Grocock; Shiri Freilich; Janet Thornton; Anton J Enright
Journal:  PLoS Comput Biol       Date:  2007-10       Impact factor: 4.475

8.  A first-draft human protein-interaction map.

Authors:  Ben Lehner; Andrew G Fraser
Journal:  Genome Biol       Date:  2004-08-13       Impact factor: 13.583

9.  How many human genes can be defined as housekeeping with current expression data?

Authors:  Jiang Zhu; Fuhong He; Shuhui Song; Jing Wang; Jun Yu
Journal:  BMC Genomics       Date:  2008-04-16       Impact factor: 3.969

10.  A map of human protein interactions derived from co-expression of human mRNAs and their orthologs.

Authors:  Arun K Ramani; Zhihua Li; G Traver Hart; Mark W Carlson; Daniel R Boutz; Edward M Marcotte
Journal:  Mol Syst Biol       Date:  2008-04-15       Impact factor: 11.429
