Organelle DB: an updated resource of eukaryotic protein localization and function.

Nuwee Wiwatwattana, Christopher M Landau, G Jamie Cope, Gabriel A Harp, Anuj Kumar.

Abstract

Organelle DB (http://organelledb.lsi.umich.edu) is a web-accessible relational database presenting a supplemented catalog of organelle-localized proteins and major protein complexes. Since its release in 2004, Organelle DB has grown by 20% to encompass over 30,000 proteins from 138 eukaryotic organisms. Each protein in Organelle DB is presented with its subcellular localization, primary sequence and a detailed description of its function, as available. All records in Organelle DB have been annotated using controlled vocabulary from the Gene Ontology consortium. Protein localization data are inherently visual, and Organelle DB is a significant repository of biological images, housing 1500 micrographs of yeast cells carrying stained proteins. Furthermore, we report here the development of Organelle View, an extension of Organelle DB for the interactive visualization of organelles and subcellular structures in the budding yeast Saccharomyces cerevisiae. Organelle View offers a three-dimensional representation of a yeast cell; users can search Organelle View for proteins of interest, and the organelles housing these proteins will be highlighted in the cell image. Among other applications, Organelle View may serve as an educational aid engaging introductory biology students through a visually 'fun' interface. Organelle View can be accessed from the Organelle DB home page or directly at http://organelleview.lsi.umich.edu.

Year:  2006        PMID: 17130152      PMCID: PMC1716721          DOI: 10.1093/nar/gkl1000

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


OVERVIEW

Since its inception in 2004, Organelle DB has provided a freely accessible information resource cataloging eukaryotic proteins that are known components of an organelle or major protein complex (1). Organelle DB presents a list of proteins organized essentially by subcellular localization and/or by organism. Each protein record housed within Organelle DB presents systematic and common gene/protein names, gene descriptions, phenotypic information (as available), biological terms from the Gene Ontology (GO) consortium, amino acid sequence and, in some cases, micrograph images (Figure 1A). To facilitate data interoperability, we have taken care to describe all protein localizations using the controlled vocabulary established by the GO consortium. In total, Organelle DB encompasses ∼60 GO localization terms; these terms have been described previously (1).
Figure 1. Sample protein reports from Organelle DB and Organelle View. (A) Output from Organelle DB in response to a query for the yeast protein YOL090w/Msh2p, a nuclear protein involved in mismatch repair. YOL090w is the systematic name for this gene product; Msh2p is the standard or common name for this protein. (B) Output from Organelle View queried for four yeast proteins as shown. The cell image is highlighted for Msh2p/YOL090w and, thus, the nucleus appears in blue (color-coded for Msh2p).

Organelle DB has been populated in two ways. First, we have extracted protein localization data from each major model organism database [i.e. the Saccharomyces Genome Database SGD (2), the Drosophila melanogaster database FlyBase (3), the Caenorhabditis elegans database WormBase (4), the Mouse Genome Database MGD (5) and the Arabidopsis Information Resource TAIR (6)]. Localization data for human proteins and for other proteins outside of the standard model organisms have been extracted from SWISS-PROT (7) and from GO (8). Second, we have manually compiled protein localization data from large-scale and systematic studies in the budding yeast Saccharomyces cerevisiae (9–11) in supplement to localization data deposited in SGD. Since localization data have been drawn from several databases and studies, the particular source of a given protein localization record is indicated within each protein data report in the ‘Data Source’ field. By maintaining updated localization data from these sources, we have grown Organelle DB to encompass over 31 000 proteins spanning 138 organisms across the eukaryotic kingdom. Numerical listings of protein localizations for major organisms of study are presented in Table 1. Note that we now present these data tallies by specific organism rather than by broad organismal groupings (e.g. Arabidopsis thaliana rather than ‘plants’).
Table 1. Protein localization records in Organelle DB (as of September 2006)

| Organism | Nucleus | Mitochondria | ER | Membrane protein | Protein complex | Miscellaneous | Total |
|---|---|---|---|---|---|---|---|
| Saccharomyces cerevisiae | 2223 | 989 | 359 | 889 | 623 | 435 | 5518 |
| Arabidopsis thaliana | 1168 | 596 | 70 | 764 | 558 | 381 | 3537 |
| Drosophila melanogaster | 1383 | 498 | 101 | 863 | 554 | 260 | 3659 |
| Caenorhabditis elegans | 140 | 280 | 7 | 80 | 7 | 29 | 543 |
| Mus musculus | 1443 | 455 | 156 | 1225 | 201 | 346 | 3826 |
| Homo sapiens | 1691 | 320 | 227 | 1911 | 265 | 438 | 4852 |
| Others (132 in total) | 1847 | 8178 | 253 | 1309 | 630 | 481 | 12 698 |
| Total records | 9895 | 11 316 | 1173 | 7041 | 2838 | 2370 | 34 633 |

Note that records do not correspond exactly with proteins; one protein may have more than one record if it has been found within more than one organelle.
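
The tallies in Table 1 can be cross-checked with a few lines of Python. The sketch below simply copies the counts from the table and confirms that each row and each column adds up to its reported total; the variable and function names are illustrative and are not part of Organelle DB.

```python
# Counts copied from Table 1; each list follows the column order in COLUMNS.
COLUMNS = ["Nucleus", "Mitochondria", "ER", "Membrane protein",
           "Protein complex", "Miscellaneous"]

# organism -> (per-localization counts, reported row total)
ROWS = {
    "Saccharomyces cerevisiae": ([2223, 989, 359, 889, 623, 435], 5518),
    "Arabidopsis thaliana": ([1168, 596, 70, 764, 558, 381], 3537),
    "Drosophila melanogaster": ([1383, 498, 101, 863, 554, 260], 3659),
    "Caenorhabditis elegans": ([140, 280, 7, 80, 7, 29], 543),
    "Mus musculus": ([1443, 455, 156, 1225, 201, 346], 3826),
    "Homo sapiens": ([1691, 320, 227, 1911, 265, 438], 4852),
    "Others (132 in total)": ([1847, 8178, 253, 1309, 630, 481], 12698),
}
COLUMN_TOTALS = [9895, 11316, 1173, 7041, 2838, 2370]
GRAND_TOTAL = 34633


def table_is_consistent(rows, column_totals, grand_total):
    """Return True if every row and column sums to its reported total."""
    rows_ok = all(sum(counts) == total for counts, total in rows.values())
    # Transpose the per-organism counts to sum each localization column.
    summed = [sum(col) for col in zip(*(counts for counts, _ in rows.values()))]
    return rows_ok and summed == column_totals and sum(summed) == grand_total


print(table_is_consistent(ROWS, COLUMN_TOTALS, GRAND_TOTAL))  # True
```

The record total (34 633) is larger than the protein count quoted in the text because a protein found in more than one organelle contributes more than one record.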

VISUALIZING PROTEIN LOCALIZATION DATA

Because localization data are inherently visual, text-based representations of protein localization can be difficult to understand in some cases. For example, in S. cerevisiae, a number of polarity proteins have been found localized to the bud tip; however, this descriptive term is likely unfamiliar and difficult to visualize for any researcher working with an organism other than yeast. From a simple micrograph of yeast cells, the ‘bud tip’ localization becomes clearly understood as the extreme tip of the growing bud at the opposite end of the larger, so-called mother cell. Moreover, the subtleties of a given protein localization can be lost in a simple text-based classification scheme: cytoplasmic staining can be ‘patchy’ (possibly cytoskeletal) or diffuse (from a soluble protein), with very different implications regarding protein function in each instance (9). Thus, the subcellular distribution of a protein is often best considered by viewing an image of a cell in which the protein of interest is visualized either as a fusion to a fluorescent protein (10) or by indirect immunofluorescence staining (9). We have included this type of primary localization data in Organelle DB whenever possible; specifically, Organelle DB presents ∼1500 fluorescent micrographs of yeast cells visualized with antibodies directed against epitope-tagged proteins (indirect immunofluorescence) from our own studies of protein localization in S. cerevisiae (9,11). In addition, we welcome submissions from the scientific community of any such images for any protein reported in Organelle DB.
To further facilitate the visualization of protein localization data, we have developed an extension of Organelle DB called Organelle View. Organelle View is a scientific visualization application allowing users to dynamically generate a visual interpretation of data from Organelle DB. It presents a searchable interface with a three-dimensional representation of an archetypical cell (Figure 1B). Rather than representing organelles and subcellular structures by text, Organelle View offers an artist's rendering of a cell and its major organelles. At present, we have chosen a budding yeast cell (S. cerevisiae) as the model for Organelle View, largely because protein localization has been studied quite extensively in yeast (9,10); future versions of Organelle View will incorporate additional cell types from other organisms. Users can search Organelle View for any yeast protein, and the organelle to which that protein localizes will be highlighted in the cell image. An additional text-based summary of gene function is also presented for each searched protein. Organelle View, therefore, offers an alternative mode of presentation for the information housed in Organelle DB; it also stands as a useful educational tool, providing an easily accessible and engaging platform from which introductory biology students can explore the basics of cell biology.

DESIGN AND IMPLEMENTATION

Organelle DB was developed using the PHP server-side scripting language version 4.3.9 on a Linux server running the MySQL database version 5.0.18. We populated the most recent protein localization data from the GO database and major model organism databases [the databases described above plus the Rat Genome Database RGD (12), the Dictyostelium Database dictyBase (13) and the Zebrafish Information Network ZFIN (14)]. The scripts we implemented were configured to automatically add new genes, delete obsolete genes and update the gene information obtained from each of the source databases. We also developed a facility to add/delete/edit a particular gene per curator request. The size of our current database is 324 MB.
The Organelle View application is a web-based Java applet. This applet interfaces with the existing Organelle DB database and renders a three-dimensional model of a cell with accompanying text and dynamic functionality. The rendering code was provided by the program WireFusion (Demicron). All functionality code is written in Java and JavaScript and is provided by Nformation Design (Philadelphia, PA). All buttons and text areas outside of the applet were created using the PHP and HTML languages. Models for Organelle View were created using the open-source three-dimensional modeling program Blender. The Organelle View applet requires Java 1.4.1 to function correctly.
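
The add/delete/update behavior described for the population scripts can be illustrated with a small sketch. The original scripts were written in PHP against MySQL and are not reproduced here; the Python below is only an assumed outline of the comparison step, with gene records held in dictionaries keyed by gene identifier. The function name, the record fields and the placeholder identifiers are all hypothetical.

```python
from typing import Dict, List, NamedTuple


class SyncPlan(NamedTuple):
    to_add: List[str]      # genes present in the source but not yet stored locally
    to_delete: List[str]   # genes stored locally but no longer in the source
    to_update: List[str]   # genes present in both whose information differs


def plan_sync(local: Dict[str, dict], source: Dict[str, dict]) -> SyncPlan:
    """Compare local records with a fresh source snapshot, keyed by gene ID."""
    to_add = sorted(source.keys() - local.keys())
    to_delete = sorted(local.keys() - source.keys())
    to_update = sorted(
        gene for gene in local.keys() & source.keys()
        if local[gene] != source[gene]
    )
    return SyncPlan(to_add, to_delete, to_update)


# Hypothetical snapshots; only the Msh2p/nucleus pairing comes from the article.
local_records = {
    "YOL090W": {"name": "Msh2p", "localization": "nucleus"},
    "GENE_OBSOLETE": {"name": "old entry", "localization": "unknown"},
    "GENE_CHANGED": {"name": "example", "localization": "cytoplasm"},
}
source_records = {
    "YOL090W": {"name": "Msh2p", "localization": "nucleus"},
    "GENE_CHANGED": {"name": "example", "localization": "mitochondrion"},
    "GENE_NEW": {"name": "new entry", "localization": "nucleus"},
}

plan = plan_sync(local_records, source_records)
```

Separating the planning step from the database writes keeps the comparison easy to test and leaves room for a curator to review changes before they are applied.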

USING ORGANELLE DB

Organelle DB is fully searchable and presents users with a variety of options for convenient data access and retrieval. From the Organelle DB home page, users may specifically search for proteins localized to a given organelle, subcellular structure or protein complex. Additional options are provided in the Quick Search form such that users may alternatively browse records related to a single organism or gene/protein. The Quick Search form on the Organelle DB home page provides six broad protein localization groupings as follows: endoplasmic reticulum (ER), nucleus, membrane protein, mitochondrion, protein complex and others. Detailed subcategories of organelles, protein complexes and organisms may be directly accessed from our Advanced Search forms (on the ‘Search’ page at Organelle DB). These Advanced Search options offer a full list of organelles and organisms contained within Organelle DB; for example, through our Advanced Search, users may select an organelle (e.g. endoplasmic reticulum) and further select a subcategory of that organelle (e.g. integral to endoplasmic reticulum membrane). In addition, users may specify an organelle and organism, thereby limiting output to only those organelle-localized proteins from the indicated organism. Search results are presented as a list, with protein names and a brief description of each protein indicated. By clicking on a protein name, users are taken to a full protein report (Figure 1A) containing the gene's systematic name and standard/common name, gene description including phenotypic information as available, GO classifications, amino acid sequence and any captured images supporting the reported protein localization (available for some yeast proteins in Organelle DB). We have taken particular care to maintain proper nomenclature for a given organism in presenting gene names. In cases where multiple isoforms of a given protein are reported, the amino acid sequence of each isoform is presented in Organelle DB. 
As an alternative to individual search queries, users may download datasets from Organelle DB in bulk. Specifically, all data in Organelle DB may be downloaded as tab-delimited text files. In total, we offer three such files. Protein localization records from Organelle DB may be downloaded in a single file. GO annotations for each protein presented in Organelle DB are provided in a separate file. A third file provides amino acid sequences in the FASTA format for all protein entries. Multiple sequences are available for certain proteins in Organelle DB; these protein sequences can be correlated to a single protein entry in the tab-delimited text file described above through the Accession ID field.
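
A minimal sketch of how the bulk files might be combined is given below. It assumes that the tab-delimited file has a header row containing an `Accession ID` column and that each FASTA header begins with that accession; both the column layout and the header format are assumptions made for illustration, as are the placeholder accessions and sequences.

```python
import csv
import io
from collections import defaultdict


def read_records(handle):
    """Read tab-delimited localization records into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def read_fasta(handle):
    """Map accession -> list of sequences (several isoforms may share one)."""
    sequences = defaultdict(list)
    accession, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if accession is not None:
                sequences[accession].append("".join(chunks))
            # Assumed header format: ">ACCESSION optional description"
            accession, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if accession is not None:
        sequences[accession].append("".join(chunks))
    return dict(sequences)


def attach_sequences(records, sequences, key="Accession ID"):
    """Return copies of the records with their matching sequences attached."""
    return [dict(rec, sequences=sequences.get(rec[key], [])) for rec in records]


# Made-up file contents standing in for the downloaded files.
RECORDS_TSV = (
    "Accession ID\tGene\tLocalization\n"
    "ACC1\tGENE1\tnucleus\n"
    "ACC2\tGENE2\tmitochondrion\n"
)
SEQUENCES_FASTA = ">ACC1 isoform 1\nMKT\nAAA\n>ACC1 isoform 2\nMKTG\n>ACC2\nMSS\n"

joined = attach_sequences(
    read_records(io.StringIO(RECORDS_TSV)),
    read_fasta(io.StringIO(SEQUENCES_FASTA)),
)
```

Keeping a list of sequences per accession reflects the statement that multiple sequences are available for certain proteins.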

USING ORGANELLE VIEW

Like Organelle DB, Organelle View is fully searchable. Users can enter up to four proteins in the ‘Quick Search’ form to the left of our home page (Figure 1B). The proteins with localizations will be displayed in text to the right of the search form boxes, color-coded as indicated (the first protein name and localization printed in red, etc.). The corresponding organelle for each protein will also be colored accordingly in the cell image and can be highlighted by rolling over the protein name and its localization. The cell image provided in Organelle View is an artist's rendering of a budding yeast cell; descriptive terms and organelle names related to this image are presented in the ‘Localization Key’ at the bottom left of our home page. A brief description of each cellular landmark and/or organelle is provided here; the text may be viewed by scrolling over the desired Localization Key image. The cell image can be manipulated by the cursor; e.g. the cell image can be rotated by clicking/dragging the image. Also, by clicking with the right mouse button, users can zoom in and out of the cell. Organelle View also provides much of the protein function information presented in Organelle DB. Users can view summary information regarding the function of any selected protein by clicking on the protein's corresponding number in the ‘Information’ box. The resulting text display presents systematic and standard gene/protein names, a brief functional description of the protein and any comments related to the protein's function or localization. The color scheme described above can also be ‘animated’ if multiple proteins sharing a common localization are entered into Organelle View. By this feature, the organelle common to both proteins will shift in color in the cell image, transitioning, e.g. from red (for Protein 1) to orange (for Protein 2). This automatic color shift can be toggled on/off by clicking the ‘Animate’ button to the lower right of the cell image. 
A complete tutorial describing the use of Organelle View may be accessed on-line by clicking the ‘About/How To’ button from the Organelle View home page.

APPLICATIONS AND SIGNIFICANCE

Organelle DB is a cross-species information resource for researchers utilizing nearly any eukaryotic organism of study. Data from 138 organisms are encompassed in Organelle DB. In particular, we are taking significant care to ensure that Organelle DB is fully integrated with major model organism sites and relevant external databases. Each protein report in Organelle DB is linked to the appropriate external database (i.e. the model organism sites SGD, TAIR, MGD, FlyBase, WormBase or the protein database SWISS-PROT). Thus, users can quickly drill deeper into specific proteins of interest. Furthermore, we have maintained a controlled vocabulary as much as possible in annotating Organelle DB in order to ensure significant data interoperability. We have carefully utilized proper gene/protein names for each record in Organelle DB; whenever possible, we have drawn protein names from the appropriate model organism sites in order to comply with all naming conventions for each respective organism. Users can then easily navigate between our information and relevant information in other external sites. Collectively, the data in Organelle DB may be used by researchers in a broad swath of disciplines ranging from evolutionary biology to molecular/cellular biology and genomics/bioinformatics. Through the datasets in Organelle DB, evolutionary biologists will be able to consider organelle evolution through the eukaryota, particularly as we generate more complete datasets of eukaryotic protein localization. By integrating localization data with protein–protein interaction data, molecular, cellular and developmental biologists can ferret out higher-confidence subsets of protein interactions for further study. Researchers applying genomics and bioinformatics can use the data in Organelle DB to investigate the organelle as a functional unit, profiling and cataloging the dynamics of the organelle (i.e. its known constituent proteins) in response to cell growth and cell stress. 
Thus, Organelle DB is a cross-species and cross-discipline resource of general interest to the greater scientific community. In its present form, Organelle View is a first step toward the development of non-text-based resources for the presentation of protein localization data. In addition, we view it as a particularly useful resource in the instruction of younger students (e.g. high school biology students and college undergraduates), introducing them to complicated concepts in cellular and molecular biology through an interface that is visually arresting and ‘fun’.
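
As one illustration of integrating localization data with protein–protein interaction data, the sketch below keeps only those interaction pairs whose two proteins share at least one localization term. It is an assumed, simplified filter rather than a method described by the authors; the function name and the placeholder proteins (`PROT_A`, `PROT_B`, `PROT_C`) are hypothetical, and only the nuclear localization of Msh2p is taken from the article.

```python
def colocalized_interactions(interactions, localizations):
    """Keep interaction pairs whose proteins share a localization term.

    interactions:  iterable of (protein_a, protein_b) pairs
    localizations: dict mapping protein name -> set of localization terms
    Returns a list of (protein_a, protein_b, sorted shared terms).
    """
    kept = []
    for a, b in interactions:
        shared = localizations.get(a, set()) & localizations.get(b, set())
        if shared:
            kept.append((a, b, sorted(shared)))
    return kept


# Hypothetical inputs for illustration only.
localizations = {
    "Msh2p": {"nucleus"},
    "PROT_A": {"nucleus", "cytoplasm"},
    "PROT_B": {"mitochondrion"},
}
interactions = [
    ("Msh2p", "PROT_A"),   # both nuclear: retained
    ("Msh2p", "PROT_B"),   # no shared compartment: dropped
    ("PROT_A", "PROT_C"),  # PROT_C has no localization record: dropped
]

supported = colocalized_interactions(interactions, localizations)
```

Pairs without a shared compartment are not necessarily false; the filter only ranks co-localized pairs as better supported candidates for further study.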

FUTURE DIRECTIONS

As part of our ongoing maintenance of Organelle DB, we intend to update protein localization records, placing a particular emphasis upon proteins differentially localized during cell development and/or during cell stress responses. We also plan to modify Organelle View. We envision Organelle View as a true complement to Organelle DB: an interface for the graphical visualization of proteins within organelles and protein complexes in both a static and a dynamic model of the cell. Thus, we are currently working to generate a dynamic representation of the yeast cell cycle, such that differentially localized proteins can be represented during cell cycle progression. The current prototype for Organelle View presents a budding yeast cell, but once this platform is well established, we expect to develop similar three-dimensional models for other organisms as well, providing both a broader range of cell types and a more detailed cell view with finer subcellular resolution.

DATABASE ACCESS

Organelle DB may be accessed freely at http://organelledb.lsi.umich.edu through the Life Sciences Institute at the University of Michigan. User support may be obtained by contacting anujk@umich.edu; please direct all technical questions and concerns to this address as well. When referencing Organelle DB and/or Organelle View, please cite this article.

REFERENCES (14 in total)

1.  Global analysis of protein localization in budding yeast.

Authors:  Won-Ki Huh; James V Falvo; Luke C Gerke; Adam S Carroll; Russell W Howson; Jonathan S Weissman; Erin K O'Shea
Journal:  Nature       Date:  2003-10-16       Impact factor: 49.962

2.  The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003.

Authors:  Brigitte Boeckmann; Amos Bairoch; Rolf Apweiler; Marie-Claude Blatter; Anne Estreicher; Elisabeth Gasteiger; Maria J Martin; Karine Michoud; Claire O'Donovan; Isabelle Phan; Sandrine Pilbout; Michel Schneider
Journal:  Nucleic Acids Res       Date:  2003-01-01       Impact factor: 16.971

3.  Large-scale analysis of the yeast genome by transposon tagging and gene disruption.

Authors:  P Ross-Macdonald; P S Coelho; T Roemer; S Agarwal; A Kumar; R Jansen; K H Cheung; A Sheehan; D Symoniatis; L Umansky; M Heidtman; F K Nelson; H Iwasaki; K Hager; M Gerstein; P Miller; G S Roeder; M Snyder
Journal:  Nature       Date:  1999-11-25       Impact factor: 49.962

4.  WormBase: better software, richer content.

Authors:  Erich M Schwarz; Igor Antoshechkin; Carol Bastiani; Tamberlyn Bieri; Darin Blasiar; Payan Canaran; Juancarlos Chan; Nansheng Chen; Wen J Chen; Paul Davis; Tristan J Fiedler; Lisa Girard; Todd W Harris; Eimear E Kenny; Ranjana Kishore; Dan Lawson; Raymond Lee; Hans-Michael Müller; Cecilia Nakamura; Phil Ozersky; Andrei Petcherski; Anthony Rogers; Will Spooner; Mary Ann Tuli; Kimberly Van Auken; Daniel Wang; Richard Durbin; John Spieth; Lincoln D Stein; Paul W Sternberg
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

5.  The Zebrafish Information Network: the zebrafish model organism database.

Authors:  Judy Sprague; Leyla Bayraktaroglu; Dave Clements; Tom Conlin; David Fashena; Ken Frazer; Melissa Haendel; Douglas G Howe; Prita Mani; Sridhar Ramachandran; Kevin Schaper; Erik Segerdell; Peiran Song; Brock Sprunger; Sierra Taylor; Ceri E Van Slyke; Monte Westerfield
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

6.  The Mouse Genome Database (MGD): updates and enhancements.

Authors:  Judith A Blake; Janan T Eppig; Carol J Bult; James A Kadin; Joel E Richardson
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

7.  dictyBase, the model organism database for Dictyostelium discoideum.

Authors:  Rex L Chisholm; Pascale Gaudet; Eric M Just; Karen E Pilcher; Petra Fey; Sohel N Merchant; Warren A Kibbe
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

8.  Genome Snapshot: a new resource at the Saccharomyces Genome Database (SGD) presenting an overview of the Saccharomyces cerevisiae genome.

Authors:  Jodi E Hirschman; Rama Balakrishnan; Karen R Christie; Maria C Costanzo; Selina S Dwight; Stacia R Engel; Dianna G Fisk; Eurie L Hong; Michael S Livstone; Robert Nash; Julie Park; Rose Oughtred; Marek Skrzypek; Barry Starr; Chandra L Theesfeld; Jennifer Williams; Rey Andrada; Gail Binkley; Qing Dong; Christopher Lane; Stuart Miyasato; Anand Sethuraman; Mark Schroeder; Mayank K Thanawala; Shuai Weng; Kara Dolinski; David Botstein; J Michael Cherry
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

9.  Organelle DB: a cross-species database of protein localization and function.

Authors:  Nuwee Wiwatwattana; Anuj Kumar
Journal:  Nucleic Acids Res       Date:  2005-01-01       Impact factor: 16.971

10.  The Rat Genome Database (RGD): developments towards a phenome database.

Authors:  Norberto de la Cruz; Susan Bromberg; Dean Pasko; Mary Shimoyama; Simon Twigger; Jiali Chen; Chin-Fu Chen; Chunyu Fan; Cindy Foote; Gopal R Gopinath; Glenn Harris; Aubrey Hughes; Yuan Ji; Weihong Jin; Dawei Li; Jedidiah Mathis; Natalya Nenasheva; Jeff Nie; Rajni Nigam; Victoria Petri; Dorothy Reilly; Weiye Wang; Wenhua Wu; Angela Zuniga-Meyer; Lan Zhao; Anne Kwitek; Peter Tonellato; Howard Jacob
Journal:  Nucleic Acids Res       Date:  2005-01-01       Impact factor: 16.971
