The Yeast Resource Center Public Data Repository.

Michael Riffle, Lars Malmström, Trisha N Davis.

Abstract

The Yeast Resource Center Public Data Repository (YRC PDR) serves as a single point of access for the experimental data produced from many collaborations typically studying Saccharomyces cerevisiae (baker's yeast). The experimental data include large amounts of mass spectrometry results from protein co-purification experiments, yeast two-hybrid interaction experiments, fluorescence microscopy images and protein structure predictions. All of the data are accessible via searching by gene or protein name, and are available on the Web at http://www.yeastrc.org/pdr/.

Year:  2005        PMID: 15608220      PMCID: PMC540027          DOI: 10.1093/nar/gki073

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

The Yeast Resource Center (YRC) is an NCRR Biomedical Technology Resource Center that provides expertise and otherwise costly research tools to scientists and students worldwide. This is accomplished via collaborations and technology development projects, with 231 such collaborations having been submitted since the beginning of 2002. The collaborations focus mainly on the study of Saccharomyces cerevisiae via four primary areas of expertise provided by the YRC: mass spectrometry, yeast two-hybrid arrays, deconvolution fluorescence microscopy and protein structure prediction. The YRC investigators who have been responsible for fulfilling collaboration requests are Dr John Yates and Dr Ruedi Aebersold (mass spectrometry), Dr Stanley Fields (yeast two-hybrid), Dr Trisha Davis and Dr Eric Muller (fluorescence microscopy) and Dr David Baker (protein structure prediction). Collaborative projects can involve multiple experiments carried out in one or more of these four areas. All four areas can produce large amounts of data, not all of which are necessarily used in the course of publication by the collaborator. In addition, not all collaborations lead to a publication, but the data they produce may still be valuable and useful. The YRC makes both the published and unpublished data available to the community at large through the YRC Public Data Repository (PDR). Perhaps the most significant aspect of the YRC PDR is that it releases all of the data at a single point of access, bringing together the experimental data from many research projects into one consolidated, searchable database accessible through the Web. Instead of going from website to website supporting individual papers, one can search the experimental data for multiple papers at once and view the results in a single interface.
As more datasets from research collaborations with the YRC become public, the database will continue to grow and become an increasingly significant asset to the research community.

THE CONTENTS OF THE YRC PDR DATABASE

At the time of this writing, the YRC PDR includes data from six collaborative projects, including four publications (1–4). These data comprise mass spectrometry data collected through protein co-purification experiments, yeast two-hybrid protein interaction data, fluorescence microscopy images and protein structure prediction data. Protein structures are predicted for protein domains, as parsed by the Ginzu algorithm (5). Ab initio structure predictions, generated using the Rosetta de novo structure prediction method (7–10), are available as Protein Data Bank (PDB) (6) formatted text. In addition, the database includes images of silver-stained polyacrylamide gels of samples produced from protein co-purification experiments, and links to descriptions of the protocol used for the purification. The amount of data presently included in the database is summarized in Table 1.
Table 1. A summary of the quantity of the different types of data currently available in the database

Mass spectrometry data
 Total runs: 119
 Total unique proteins identified: 3138
 Total peptides identified: 41 397
 Total gel images: 45
Yeast two-hybrid data
 Total baits with significant hits: 409
 Total unique ORFs with significant hits: 1373
 Total unique significant interactions: 2031
Fluorescence microscopy data
 Total unique proteins localized: 122
 Total full-field images: 767
 Total selected region images: 877
Protein structure prediction
 Total ORFs with structure data: 145
 Total domains with structure data: 255
 Total ab initio structures: 850 (for 86 domains from 63 proteins)

THE YRC PDR WEB INTERFACE

We have developed a simple-to-use web interface to the YRC PDR database. The primary means of interacting with the data is to perform searches based on systematic open reading frame (ORF) or gene names. Gene names are mapped onto systematic ORF names through the publicly available Saccharomyces Genome Database (11,12). Searching will bring the user to a page displaying an overall summary of all the experimental data we have for a given ORF. An example search result is given in Figure 1. This ‘ORF Overview Page’ is separated into five sections, from which the user can view the Gene Ontology (13) description for the ORF and jump to experimental data view pages for each of the four types of data. Each of these data view pages is tailored to a specific kind of data and each has its own features that are described below. All data are clearly labeled according to publication(s) for which they were produced. In addition, data not used in any publication are clearly labeled as unpublished.
Figure 1

A screen capture of the ‘ORF Overview Page’ for the S.cerevisiae gene NSL1. This screen illustrates the result of searching for NSL1 or YPL233w. The page is separated into five distinct sections—Gene Ontology annotations, mass spectrometry, localization, yeast two-hybrid and protein structure prediction. Each section contains a summary of the experimental data relevant to NSL1 and provides links to the data.
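The search behaviour described above, where a gene name or a systematic ORF name leads to the same overview page, can be illustrated with a minimal Python sketch. This is not the YRC PDR implementation: the function name `resolve_orf` and the one-entry lookup table are hypothetical, and a real table would be built from the Saccharomyces Genome Database. The only mapping used here is the NSL1/YPL233w pair mentioned in the Figure 1 caption.

```python
# Hypothetical lookup table; a real one would be loaded from SGD.
# The single entry is the example from the Figure 1 caption.
GENE_TO_ORF = {"NSL1": "YPL233W"}


def resolve_orf(query, gene_to_orf=GENE_TO_ORF):
    """Return the systematic ORF name for a gene-name or ORF-name query.

    Matching is case-insensitive. Returns None when the query is unknown.
    """
    key = query.strip().upper()
    if key in gene_to_orf:
        # The query was a gene name.
        return gene_to_orf[key]
    if key in gene_to_orf.values():
        # The query was already a systematic ORF name.
        return key
    return None


if __name__ == "__main__":
    for q in ("NSL1", "ypl233w", "unknown"):
        print(q, "->", resolve_orf(q))
```

Normalizing the query to upper case is an assumption made so that both "YPL233w" and "YPL233W" resolve to the same entry.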

Mass spectrometry data

From the ORF Overview Page's mass spectrometry section, the user is presented with several links for viewing the mass spectrometry data:
View Protocol link: This provides the user with a text description of the protocol used for a particular protein purification, if the protocol is available.
Bait ORF link: This lists the actual purified protein and a link to that protein's ORF Overview Page. Whenever the name of an ORF is given in the website, it is linked to that ORF's overview page.
View Gel link: If the protein sample was subjected to electrophoresis on an SDS polyacrylamide gel, this link will be present and will provide an image of the silver-stained gel.
View Run link: This is a link to the results from the analysis of the protein by mass spectrometry.
The mass spectrometry results include a filtered and formatted listing produced by the DTASelect algorithm (14). The data are presented as a list of systematic ORF names for proteins that co-purified with the bait protein, each listed with its sequence coverage, number of peptides, spectrum count and molecular weight. A guideline for interpretation of these columns is provided on this page. For each ORF listed, there is a link for viewing the peptides that were used to make that identification. The list of ORFs and the peptide lists may be downloaded as tab-delimited text files from the site. An example of the page displaying mass spectrometry data is provided in Figure 2.
Figure 2

A screen capture of the mass spectrometry data view page. Listed are the ORFs identified through mass spectrometry as having co-purified with the bait ORF, along with experimental data and links to peptide information.
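To make the "sequence coverage" column concrete, the sketch below computes coverage under its common definition: the fraction of residues in a protein sequence that fall within at least one identified peptide. This is an illustrative assumption, not the DTASelect implementation, which may treat modifications, ambiguous matches or repeated peptides differently. The function name and the toy sequence are made up for the example.

```python
def sequence_coverage(protein_seq, peptides):
    """Fraction of residues in protein_seq covered by at least one peptide.

    Every exact occurrence of each peptide is marked as covered, so
    overlapping peptides are not double counted.
    """
    if not protein_seq:
        return 0.0
    covered = [False] * len(protein_seq)
    for pep in peptides:
        if not pep:
            continue  # ignore empty strings
        start = protein_seq.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            # Look for further (possibly overlapping) occurrences.
            start = protein_seq.find(pep, start + 1)
    return sum(covered) / len(protein_seq)


if __name__ == "__main__":
    # Toy sequence and peptides, not data from the repository.
    print(sequence_coverage("MKTAYIAKQR", ["MKT", "AKQ"]))
```

In the toy example, six of the ten residues are covered, so the function returns 0.6.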

Fluorescence microscopy (localization) data

The ORF Overview Page's localization section allows the user to view fluorescence microscopy images of each protein tagged with a fluorescent protein such as green fluorescent protein. All localization experiments involving this ORF are clearly listed here. The ‘View Images’ link provides the means to view all images from the localization experiment, the experimental parameters used to create these images and the localization determination expressed as a cellular component term from Gene Ontology.

Yeast two-hybrid data

The ORF Overview Page's yeast two-hybrid section provides the means to quickly jump to and view results from all yeast two-hybrid screens in which the ORF of interest was bait or prey. Screen results display the prey ORF as well as the number of hits. Interactions with more than one hit are considered significant, but single hits are shown for completeness. The results from these screens are also available for download as a tab-delimited text file.
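The significance rule described above (more than one hit) is simple to apply to a downloaded tab-delimited file. The Python sketch below is a hedged illustration only: the column names `bait`, `prey` and `hits` are assumptions, since the actual header of the YRC PDR download is not given here, and the rows are placeholders rather than data from the repository.

```python
import csv
import io


def significant_interactions(tsv_text, min_hits=2):
    """Return (bait, prey, hits) tuples with at least min_hits hits.

    Assumes a tab-delimited table with hypothetical columns
    'bait', 'prey' and 'hits'.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    results = []
    for row in reader:
        hits = int(row["hits"])
        if hits >= min_hits:
            results.append((row["bait"], row["prey"], hits))
    return results


if __name__ == "__main__":
    # Placeholder rows for illustration; not real screen results.
    example = (
        "bait\tprey\thits\n"
        "BAIT1\tPREY_A\t3\n"
        "BAIT1\tPREY_B\t1\n"
    )
    print(significant_interactions(example))
```

Keeping the threshold as a parameter (`min_hits`) lets a reader include the single-hit rows that the site shows for completeness by passing `min_hits=1`.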

Protein structure prediction data

If structure prediction data are available for an ORF, the protein structure prediction section provides a list of computationally derived domains for the ORF. This section gives the start and stop residue for each domain, the source of the structure in the database and a link to structural information for that domain. The information in these structure links is tailored to how the structure was derived. Domains whose structures were obtained through ab initio prediction have links to the top ten predicted structures. These structures are viewable in the site itself via the WebMol Java applet (15). The structures are also downloadable as PDB text files.
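Because the predicted structures are distributed as PDB-formatted text, they can be read with a few lines of code. The sketch below extracts alpha-carbon coordinates from ATOM records using the fixed column positions of the PDB format, and then restricts them to a domain's start and stop residues. The function names are hypothetical, and the sketch assumes a single-model file with standard ATOM records; it is not part of the YRC PDR or Rosetta software.

```python
def ca_coordinates(pdb_text):
    """Return (residue_number, x, y, z) for each alpha carbon.

    Uses fixed PDB columns: atom name in 13-16, residue number in
    23-26, and x, y, z in 31-38, 39-46 and 47-54 (1-based).
    """
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            res_seq = int(line[22:26])
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            coords.append((res_seq, x, y, z))
    return coords


def residues_in_domain(coords, start, stop):
    """Keep coordinates whose residue number is in [start, stop]."""
    return [c for c in coords if start <= c[0] <= stop]
```

A typical use would be to read a downloaded file into a string, call `ca_coordinates` on it, and pass the result to `residues_in_domain` with the start and stop residues listed for the domain on the site.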

AVAILABILITY

The contents of the YRC PDR are available on the Web at http://www.yeastrc.org/pdr/. From this URL the contents of the database can be viewed as HTML pages, as well as tab-delimited text files when applicable. The entire published datasets of yeast two-hybrid and mass spectrometry run results are available as tab-delimited text files, linked from the front page. The unpublished datasets are available upon request. These tab-delimited text files can easily be imported into Microsoft Excel, as well as other spreadsheet and data software.

FUTURE DIRECTIONS

The YRC will likely expand beyond providing collaborations and technology development in only these four current areas of expertise. As a result, the types of experimental data available in the YRC PDR database will also expand. Currently, the YRC PDR only includes experimental data covering S.cerevisiae. The YRC has broadened its scope and has begun participating in collaborations involving other organisms. As a result, the YRC PDR will contain data from protein experiments involving multiple organisms. Given these two points, and the fact that the YRC PDR will continue to grow as data from more collaborations are added, the functionality of the interface will be expanded to include more sophisticated searching tools, such as searching only published data, searching by species and searching by protein or gene sequence. User-controlled filters will be added to the mass spectrometry results to help the user identify more meaningful results. In addition, a probability-based algorithm for analyzing multiple mass spectrometry runs simultaneously will be added to the site, allowing the user to discover probable protein complexes.
  15 in total

1.  The Protein Data Bank.

Authors:  H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  Rosetta in CASP4: progress in ab initio protein structure prediction.

Authors:  R Bonneau; J Tsai; I Ruczinski; D Chivian; C Rohl; C E Strauss; D Baker
Journal:  Proteins       Date:  2001

3.  Automated prediction of CASP-5 structures using the Robetta server.

Authors:  Dylan Chivian; David E Kim; Lars Malmström; Philip Bradley; Timothy Robertson; Paul Murphy; Charles E M Strauss; Richard Bonneau; Carol A Rohl; David Baker
Journal:  Proteins       Date:  2003

4.  DTASelect and Contrast: tools for assembling and comparing protein identifications from shotgun proteomics.

Authors:  David L Tabb; W Hayes McDonald; John R Yates
Journal:  J Proteome Res       Date:  2002 Jan-Feb       Impact factor: 4.466

5.  A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae.

Authors:  P Uetz; L Giot; G Cagney; T A Mansfield; R S Judson; J R Knight; D Lockshon; V Narayan; M Srinivasan; P Pochart; A Qureshi-Emili; Y Li; B Godwin; D Conover; T Kalbfleisch; G Vijayadamodar; M Yang; M Johnston; S Fields; J M Rothberg
Journal:  Nature       Date:  2000-02-10       Impact factor: 49.962

6.  Saccharomyces Genome Database (SGD) provides tools to identify and analyze sequences from Saccharomyces cerevisiae and related sequences from other organisms.

Authors:  Karen R Christie; Shuai Weng; Rama Balakrishnan; Maria C Costanzo; Kara Dolinski; Selina S Dwight; Stacia R Engel; Becket Feierbach; Dianna G Fisk; Jodi E Hirschman; Eurie L Hong; Laurie Issel-Tarver; Robert Nash; Anand Sethuraman; Barry Starr; Chandra L Theesfeld; Rey Andrada; Gail Binkley; Qing Dong; Christopher Lane; Mark Schroeder; David Botstein; J Michael Cherry
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

7.  Localization of proteins that are coordinately expressed with Cln2 during the cell cycle.

Authors:  Bryan A Sundin; Chun-Hwei Chiu; Michael Riffle; Trisha N Davis; Eric G D Muller
Journal:  Yeast       Date:  2004-07-15       Impact factor: 3.239

8.  Saccharomyces genome database: underlying principles and organisation.

Authors:  Selina S Dwight; Rama Balakrishnan; Karen R Christie; Maria C Costanzo; Kara Dolinski; Stacia R Engel; Becket Feierbach; Dianna G Fisk; Jodi Hirschman; Eurie L Hong; Laurie Issel-Tarver; Robert S Nash; Anand Sethuraman; Barry Starr; Chandra L Theesfeld; Rey Andrada; Gail Binkley; Qing Dong; Christopher Lane; Mark Schroeder; Shuai Weng; David Botstein; J Michael Cherry
Journal:  Brief Bioinform       Date:  2004-03       Impact factor: 11.622

9.  Assigning function to yeast proteins by integration of technologies.

Authors:  Tony R Hazbun; Lars Malmström; Scott Anderson; Beth J Graczyk; Bethany Fox; Michael Riffle; Bryan A Sundin; J Derringer Aranda; W Hayes McDonald; Chun-Hwei Chiu; Brian E Snydsman; Phillip Bradley; Eric G D Muller; Stanley Fields; David Baker; John R Yates; Trisha N Davis
Journal:  Mol Cell       Date:  2003-12       Impact factor: 17.970

10.  A protein interaction map for cell polarity development.

Authors:  B L Drees; B Sundin; E Brazeau; J P Caviston; G C Chen; W Guo; K G Kozminski; M W Lau; J J Moskow; A Tong; L R Schenkman; A McKenzie; P Brennwald; M Longtine; E Bi; C Chan; P Novick; C Boone; J R Pringle; T N Davis; S Fields; D G Drubin
Journal:  J Cell Biol       Date:  2001-08-06       Impact factor: 10.539

