
Being in the right location at the right time.

Abstract

Taking each coding sequence from the human genome in turn and identifying the subcellular localization of the corresponding protein would be a significant contribution to understanding the function of each of these genes and to deciphering functional networks. This article highlights current approaches aimed at achieving this goal.

Year: 2001 PMID: 11574061 PMCID: PMC138962 DOI: 10.1186/gb-2001-2-9-reviews1024

Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583

The spatial and temporal regulation of biochemical reactions in eukaryotic cells is achieved by a high degree of compartmentalization. Each protein is part of a functional biochemical network, and all proteins within a particular network are localized close to each other, within (or at) a particular organelle or compartment, at least once in their lifetime. This facilitates interactions and yet allows the segregation of different networks. Exchange of information between different organelles, and of proteins between networks, is essential for the proper function of the cell as a whole and is achieved by the active transport of material. One of the best examples of such an assembly of networks is the secretory pathway. Secretory proteins move sequentially through the distinct membrane-bounded organelles of this pathway, receiving at each step the specific enzymatic modifications necessary for their quality control and proper function. The communication and specific transfer of material between membrane organelles is mediated by distinct small membrane-bounded transport carrier vesicles containing a myriad of regulatory proteins. A key feature of any protein functionally involved in the secretory pathway is its permanent or transient localization to one of the appropriate transport carriers or organelles. Extending this concept to the whole cell, determining the subcellular localization of a novel protein is one of the essential steps in resolving its function. This includes imaging not only the protein's steady-state distribution but also the changes in localization that can occur in response to environmental conditions or during specific stages of the cell cycle or of cell differentiation. Indeed, changes in localization can also be caused by the breakdown of remote but functionally related organelles and/or cellular structures, such as Golgi fragmentation resulting from microtubule reorganization (see, for example, Figure 1c,d).

Figure 1

Highly dynamic and interdependent organization of distinct subcellular structures. The Vero cells in (a) show the normal arrangement of microtubules (green) radiating from the microtubule-organizing center. The Golgi complex (indicated by the arrow), a membrane-bounded organelle through which all secretory proteins pass en route to the cell surface, is stained with antibodies against the coat protein complex COPI (red; where red and green staining coincide they appear yellow). The Golgi complex resides as a tight structure at a central perinuclear location. The cell in (b) has been treated with the drug brefeldin A, which causes rapid removal of the COPI coat from Golgi membranes into a cytoplasmic pool, followed by disassembly of the Golgi apparatus. The microtubule network, however, remains unaffected by this treatment. (c) Treatment of a cell with the drug nocodazole causes disassembly of the microtubules into soluble cytoplasmic tubulin subunits. This breakdown of the microtubule network, a key component of cell architecture, also results in the breakdown of the Golgi complex into distinct fragments spread throughout the cell (as indicated by the arrowheads). The cell in (d) has been transfected with a GFP-tagged novel cDNA, which when expressed localizes along the entire microtubule network (green). As the expression level of this protein increases, however, it interferes with the microtubule network, with the concomitant result that the Golgi is fragmented in a manner similar to that observed in (c) (as indicated by the arrowheads). This phenotypic effect illustrates the dynamic interdependency of organelles, exemplified here by Golgi morphology and the microtubule network. The nuclei of all the cells have also been stained with the DNA-binding dye 4',6-diamidino-2-phenylindole (DAPI; blue), showing that this organelle appears not to be affected by the various treatments. The bar indicates 10 μm.

Although following these dynamic events was difficult in the past, the availability of green fluorescent protein (GFP) and its spectral variants has now facilitated localization experiments, particularly those aimed at observing protein dynamics in living cells [1,2,3,4]. The cDNA encoding GFP was cloned several years ago; it encodes a 27 kDa protein that emits green fluorescence when excited with blue light, without the need for any co-factors. Thus, any cDNA can be fused with the coding sequence of GFP, and the localization of the expressed GFP fusion can be followed in living cells. This unique feature of GFP has led to the development of a number of 'localization screening assays', which can be performed in the systematic 'high-throughput' manner typically required for large-scale post-genome projects.

GFP-based techniques

Most GFP-based techniques fuse either fragments of genomic libraries or individual clones from cDNA libraries to the coding sequence of GFP, then express the fusions in cells or tissues and determine their subcellular localizations by microscopic inspection. Subsequently, the respective cDNAs or genes are rescued from the cells or tissues, cloned and sequenced. Such strategies have already been conducted on a genome-wide scale in yeast [5,6] and have identified the localization of so-far uncharacterized proteins, or fragments thereof. The GFP-tagged proteins can be immediately followed in living cells by time-lapse microscopy to determine their cellular dynamics, which adds a further level of information to such screens. However, at least 50% of the cDNAs isolated in this way are already known and well characterized [6,7,8,9]. Furthermore, the same cDNA clones are isolated several times over in one screen, as the primary criterion for selection is simply localization [5]. These aspects are major disadvantages of such morphological screens and make them inefficient. For example, in an attempt to isolate novel nuclear-envelope proteins, 550,000 starting cDNA clones were required to identify 27 clones localizing to this compartment, of which only two proved to be novel [9].

When tagging cDNA libraries with GFP, consideration must also be given to the possibility that the reporter masks targeting signals contained within the expressed proteins. Amino-terminal fusions of GFP to target proteins potentially block signal sequences associated with import into mitochondria or the endoplasmic reticulum, for example. Conversely, when using either random DNA fragments or even non-full-length cDNAs (of which there are significant numbers in cDNA libraries), the expressed proteins may appear to localize clearly, but the recorded localization may be aberrant, resulting simply from exposing a peptide sequence normally hidden in the full-length protein. This was clearly demonstrated by the 'motif-trap method', by which a large number of cryptic mitochondrial targeting signals were isolated - many corresponding to sequences derived from non-coding genomic DNA [10]. In an attempt to circumvent the problem of hidden amino-terminal targeting sequences, in one study [11] cDNAs were cloned from a library containing cDNA fragments upstream of GFP, and a retrovirus-mediated expression system was used to determine the cellular localizations of the encoded fusion products. Although this expression system is highly effective, the authors themselves concede that none of their cDNAs was full-length, and that the interpretation of the localization results is dependent upon the targeting sequences being present in the partial cDNA [11]. Thus, strategies using GFP tagging of whole cDNA or genomic libraries generate significant amounts of redundant or inaccurate data, which are time-consuming, and therefore expensive, to eliminate.

Methods are therefore now being devised to focus more rapidly on those localizations of interest. For example, one possibility is first to isolate GFP-positive cells from the non-fluorescent cells using fluorescence-activated cell sorting (FACS), which is able to sort thousands of GFP-expressing cells within minutes into individual wells of multiwell plates, and subsequently to clone them. In this way only GFP-expressing cells have to be examined microscopically, which increases the speed of analysis. An improved variant of such an approach was described recently [7] with the aim of identifying proteins localizing to the nucleus. Pichon and co-workers first mildly permeabilized intact cells with detergent, in order to remove cytosolic but not nuclear GFP-fusion proteins, and then sorted the remaining GFP-positive cells using FACS. This resulted in a 70-fold enrichment of cells expressing GFP-fusion proteins in the nucleus compared to cultures that had not been treated and sorted.

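
To make the efficiency argument concrete, the figures quoted above (550,000 starting clones, 27 localized clones, 2 novel ones, and a 70-fold enrichment by FACS) can be turned into a small worked calculation. The sketch below is illustrative only: the function names are invented here, and the 0.5% baseline fraction of nuclear-localizing cells is a hypothetical value, since the text does not report one.

```python
# Screen figures quoted in the text for the nuclear-envelope screen [9].
CLONES_SCREENED = 550_000
CLONES_LOCALIZED = 27   # clones localizing to the nuclear envelope
CLONES_NOVEL = 2        # of those, clones that proved to be novel


def clones_per_event(screened: int, events: int) -> float:
    """Average number of starting clones needed per observed event."""
    if events <= 0:
        raise ValueError("events must be positive")
    return screened / events


def enriched_fraction(baseline_fraction: float, fold: float) -> float:
    """Fraction of positive cells after a fold enrichment, capped at 1.0."""
    if not 0.0 <= baseline_fraction <= 1.0:
        raise ValueError("baseline_fraction must lie in [0, 1]")
    return min(1.0, baseline_fraction * fold)


if __name__ == "__main__":
    per_hit = clones_per_event(CLONES_SCREENED, CLONES_LOCALIZED)
    per_novel = clones_per_event(CLONES_SCREENED, CLONES_NOVEL)
    print(f"clones screened per localized hit:  {per_hit:,.0f}")
    print(f"clones screened per novel protein:  {per_novel:,.0f}")
    print(f"novel fraction among hits:          {CLONES_NOVEL / CLONES_LOCALIZED:.1%}")

    # HYPOTHETICAL baseline: the source gives only the 70-fold figure,
    # not the starting fraction of nuclear-localizing cells.
    baseline = 0.005
    after = enriched_fraction(baseline, fold=70)
    print(f"nuclear fraction after 70-fold enrichment: {after:.0%}")
```

On these figures, roughly 20,000 clones had to be screened per localized hit and 275,000 per novel protein, which is the inefficiency that enrichment before microscopy is meant to reduce.
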
Clearly, tagging sequenced full-length cDNAs on an individual basis retains the advantages but overcomes many drawbacks of the approaches described above [12,13]. One advantage is the availability of a large clone resource from genome projects, the cDNA sequences of which can be prescreened for already-known genes or species variants, so that only novel cDNAs need to be GFP-tagged and screened. In addition, different versions of full-length GFP fusions - tagged at either the amino or the carboxyl terminus - can be generated and compared, helping to circumvent the risk of masking targeting sequences. Indeed, as expected, often only one version of a GFP-tagged protein shows proper subcellular localization [13]. Although the tagging of full-length cDNAs is a relatively low-throughput process and relies upon the identification of novel cDNAs by other means such as systematic sequencing [14], it has the further clear advantage that no additional cloning is required once an interesting localization has been identified. Until recently, tagging full-length cDNAs required conventional restriction-enzyme-based cloning, which is tedious and virtually impossible to apply to any large set of molecules [12]. To overcome this problem, we have recently devised a method that uses a recombination-based cloning system to systematically tag with GFP the open reading frames of full-length cDNAs that have been identified and sequenced by large-scale genome projects [13,14]. The whole procedure is amenable to automation, and other characterization studies (for example, mutagenesis, protein dynamics and identification of interacting partners) can follow the localization screen immediately without further generation of new reagents or lengthy cloning procedures to identify the full-length cDNAs.
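
The two selection steps described above, prescreening sequenced cDNAs so that only novel ones are tagged and then building both an amino-terminal and a carboxy-terminal fusion for each, can be sketched as a simple planning routine. This is a minimal illustration, not the authors' pipeline: the `Clone` record, the `matches_known_gene` flag and the construct labels are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Hypothetical labels: GFP at the amino terminus vs. the carboxyl terminus.
ORIENTATIONS = ("GFP-ORF", "ORF-GFP")


@dataclass(frozen=True)
class Clone:
    """Hypothetical record for one sequenced full-length cDNA."""
    clone_id: str
    matches_known_gene: bool  # result of an upstream sequence prescreen


def plan_constructs(clones: Iterable[Clone]) -> List[Tuple[str, str]]:
    """List (clone_id, orientation) pairs for novel cDNAs only.

    Known genes are skipped; every remaining clone gets both fusion
    orientations so that a masked targeting signal in one construct
    can be detected by comparison with the other.
    """
    return [
        (clone.clone_id, orientation)
        for clone in clones
        if not clone.matches_known_gene
        for orientation in ORIENTATIONS
    ]


if __name__ == "__main__":
    library = [
        Clone("cdna_001", matches_known_gene=True),
        Clone("cdna_002", matches_known_gene=False),
    ]
    for clone_id, orientation in plan_constructs(library):
        print(clone_id, orientation)
```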

In silico methods

Several bioinformatic tools have been developed with the aim of predicting protein localization on the basis of sequence features within the respective gene or cDNA. One of the early methods, PSORT [15,16], detects in sequences the signals required for sorting proteins to particular subcellular compartments. Although PSORT is a frequently accessed program and is widely applicable to different organisms, its overall accuracy - at best, for yeast - is still in the region of 50%. Others have used phylogenetic profiles [17], more careful use of annotated databases such as the Meta-A evaluation of SWISS-PROT entries [18], or expression levels [19] as means to tap into the knowledge that can be gained from determining localization. More profitable, perhaps, is to concentrate on specific organelles and the sequence motifs that direct proteins to them. For example, defined signals for directing proteins to mitochondria, the secretory pathway or chloroplasts are now well characterized, and the success rate of prediction can be as high as 90%. Even the correct prediction of cleavage sites for the signal sequences is possible with a success rate of more than 50% [20]. Certainly the speed and cost of these methods are currently unsurpassed. As more genome sequencing projects are completed, more data become available for comparison, and so the quality of results from screening algorithms based on sequence homologies rises steadily. More databases that integrate all this information are therefore being implemented [21,22]. Experimental data gathered for individual genes, and ideally proteins, are also fed into such databases, where they become accessible to in silico tools. For many novel proteins, however, these tools remain at present suggestive at best, and for these molecules there is still no alternative to actual experimental verification.

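
As a rough illustration of what "detecting a sorting signal in a sequence" can mean, the sketch below scans the amino-terminal region of a protein for a hydrophobic stretch, one hallmark of secretory signal peptides, using the Kyte-Doolittle hydropathy scale. It is a toy heuristic and not PSORT or any published predictor: the window size, the 30-residue region, the threshold of 2.0 and both example sequences are assumptions made up for this example.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def max_window_hydropathy(seq: str, window: int = 8, n_term: int = 30) -> float:
    """Highest mean hydropathy of any window in the first n_term residues.

    Unknown residue codes score 0.0. Returns -inf if the region is
    shorter than the window.
    """
    region = seq[:n_term].upper()
    if len(region) < window:
        return float("-inf")
    return max(
        sum(KD.get(aa, 0.0) for aa in region[i:i + window]) / window
        for i in range(len(region) - window + 1)
    )


def has_hydrophobic_n_terminus(seq: str, threshold: float = 2.0) -> bool:
    """Toy rule: flag sequences with a strongly hydrophobic N-terminal window."""
    return max_window_hydropathy(seq) >= threshold


if __name__ == "__main__":
    # Both sequences are invented for illustration, not real proteins.
    secretory_like = "MKRSLLLLALLAVLLAGSAQADEKTPRSEQK"
    cytosolic_like = "MSDEEKKRRSDEEKKGSDNQTEEKRKSDEE"
    for name, seq in [("secretory_like", secretory_like),
                      ("cytosolic_like", cytosolic_like)]:
        print(name, round(max_window_hydropathy(seq), 2),
              has_hydrophobic_n_terminus(seq))
```

A single-feature rule like this misclassifies many real proteins, which is consistent with the point made above that such predictions remain suggestive until verified experimentally.
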
In summary, a protein's localization and its subcellular dynamics are important parameters to know when trying to determine its function. With the availability of GFP and its variants, new in vivo approaches have been made possible, and these have already identified novel proteins in various desired locations. In due course, these techniques will undoubtedly be applied and perfected on a genome-wide scale. Furthermore, the reagents generated during the course of such projects (such as GFP-tagged proteins) are extremely useful for subsequent microscope-based functional studies with different foci - for example, the analysis of a protein's posttranslational modifications or the dynamics of interactions with binding partners in living cells [4]. This will ultimately allow us to identify functional networks of proteins in a morphological context and will greatly contribute to our understanding of whole-cell function.

References (10 of 20 listed)

1. PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization.

Authors: K Nakai; P Horton
Journal: Trends Biochem Sci Date: 1999-01 Impact factor: 13.807

2. Dual-colour imaging with GFP variants. (Review)

Authors: J Ellenberg; J Lippincott-Schwartz; J F Presley
Journal: Trends Cell Biol Date: 1999-02 Impact factor: 20.808

3. Libraries of green fluorescent protein fusions generated by transposition in vitro.

Authors: G V Merkulov; J D Boeke
Journal: Gene Date: 1998-11-19 Impact factor: 3.688

4. Wanted: subcellular localization of proteins based on sequence. (Review)

Authors: F Eisenhaber; P Bork
Journal: Trends Cell Biol Date: 1998-04 Impact factor: 20.808

5. Toward a catalog of human genes and proteins: sequencing and analysis of 500 novel complete protein coding human cDNAs.

Authors: S Wiemann; B Weil; R Wellenreuther; J Gassenhuber; S Glassl; W Ansorge; M Böcher; H Blöcker; S Bauersachs; H Blum; J Lauber; A Düsterhöft; A Beyer; K Köhrer; N Strack; H W Mewes; B Ottenwälder; B Obermaier; J Tampe; D Heubner; R Wambutt; B Korn; M Klein; A Poustka
Journal: Genome Res Date: 2001-03 Impact factor: 9.043

6. Green fluorescent protein as a marker for gene expression.

Authors: M Chalfie; Y Tu; G Euskirchen; W W Ward; D C Prasher
Journal: Science Date: 1994-02-11 Impact factor: 47.728

7. Identification of fission yeast nuclear markers using random polypeptide fusions with green fluorescent protein.

Authors: K E Sawin; P Nurse
Journal: Proc Natl Acad Sci U S A Date: 1996-12-24 Impact factor: 11.205

8. A method to identify cDNAs based on localization of green fluorescent protein fusion products.

Authors: K Misawa; T Nosaka; S Morita; A Kaneko; T Nakahata; S Asano; T Kitamura
Journal: Proc Natl Acad Sci U S A Date: 2000-03-28 Impact factor: 11.205

9. Motif trap: a rapid method to clone motifs that can target proteins to defined subcellular localisations.

Authors: L A Bejarano; C González
Journal: J Cell Sci Date: 1999-12 Impact factor: 5.285

10. A visual screen of a GFP-fusion library identifies a new type of nuclear envelope membrane protein.

Authors: M M Rolls; P A Stein; S S Taylor; E Ha; F McKeon; T A Rapoport
Journal: J Cell Biol Date: 1999-07-12 Impact factor: 10.539
