The protein structure initiative structural genomics knowledgebase.

Helen M Berman, John D Westbrook, Margaret J Gabanyi, Wendy Tao, Raship Shah, Andrei Kouranov, Torsten Schwede, Konstantin Arnold, Florian Kiefer, Lorenza Bordoli, Jürgen Kopp, Michael Podvinec, Paul D Adams, Lester G Carter, Wladek Minor, Rajesh Nair, Joshua LaBaer.

Abstract

The Protein Structure Initiative Structural Genomics Knowledgebase (PSI SGKB, http://kb.psi-structuralgenomics.org) has been created to turn the products of the PSI structural genomics effort into knowledge that can be used by the biological research community to understand living systems and disease. This resource provides central access to structures in the Protein Data Bank (PDB), along with functional annotations, associated homology models, worldwide protein target tracking information, available protocols and the potential to obtain DNA materials for many of the targets. It also offers the ability to search all of the structural and methodological publications and the innovative technologies that were catalyzed by the PSI's high-throughput research efforts. In collaboration with the Nature Publishing Group, the PSI SGKB provides a research library, editorials about new research advances, news and an events calendar to present a broader view of structural biology and structural genomics. By making these resources freely available, the PSI SGKB serves as a bridge to connect the structural biology and the greater biomedical communities.

Year:  2008        PMID: 19010965      PMCID: PMC2686438          DOI: 10.1093/nar/gkn790

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

The goal of the worldwide structural genomics initiative is to determine the 3D structures of proteins on a genomic scale. Since 2001, these efforts have resulted in more than 6772 structure depositions to the PDB (1), 3251 of which are from the National Institutes of Health-sponsored Protein Structure Initiative Centers. In order to determine these structures in a high-throughput manner, the PSI Centers have developed advanced technologies to facilitate these processes, including protein production, crystallization, structure determination, refinement and analysis. The Protein Structure Initiative Structural Genomics Knowledgebase (PSI SGKB) (http://kb.psi-structuralgenomics.org) was launched in early 2008 with the goal of making the results of the PSI initiative widely available to the broad community of biologists (2). The PSI SGKB provides access to the annotated protein structures, the models that can be leveraged from them, associated functional predictions, experimental protocols and tracking information. The new resource integrates this structural information with relevant scientific information from external data resources in a coherent and contextual format. The PSI SGKB also provides the descriptions of a vast array of technologies, protein production protocols and software applications. By making all of these products accessible to the greater community, it is anticipated that the PSI SGKB will become an enabling resource for biologists, biochemists, functional genomicists, pharmacologists, educators and physicians.

ARCHITECTURE

The PSI SGKB is a portal that provides integrated access to the PSI Centers, external biological databases, the PDB and a set of widely distributed resources, some of which are portals themselves. These resources include:

Experimental data tracking: TargetDB (http://targetdb.rcsb.org) (3) and PepcDB (http://pepcdb.rcsb.org) (4) were established to track the progress of targets that are being worked on by the worldwide structural genomics centers. TargetDB gives the status of each target and PepcDB provides information about the protocols used for protein production and the reasons for stopping work on any one target. Data are regularly collected, tracked and made available via the TargetDB and PepcDB web sites. These resources have query and report functionality. PepcDB now provides access to the general protocols for the most successful experimental trial as well as any specific details associated with the trial. Cross-references are provided to the Research Collaboratory for Structural Biology (RCSB) PDB (5), Pfam (6), Superfamily (7), TIGR Families (8), ProDom (9), iProClass (10) and Prosite (11). Sequence family (e.g. BIG and MEGA assignments) and target classification details (e.g. biomedical, community-nominated) are now collected from PSI centers. As PepcDB grows, this resource will become an indispensable and truly unique resource for biologists who are expressing and purifying proteins for their own experiments.

Materials repository: The PSI Materials Repository (PSI MR) (http://www.hip.harvard.edu/PSIMR/index.htm) has been established at Harvard University. A mechanism for storing and distributing clones is in place. Material transfer agreements have been established between the PSI centers and the PSI MR. The specifications for the data to accompany physical samples have been set. These data include the information required to ensure the interoperability of this repository with TargetDB and PepcDB. The information collected for each PSI clone will also be stored in a searchable database. Using this database, researchers will be able to select and order clones online for a minimal fee that covers processing, handling and shipping.

Homology modeling: For every structure determined by the PSI Centers, hundreds of models could be made using a variety of established methods. This has been done by all of the PSI centers. At a workshop held in 2005 at Rutgers University, it was proposed that a portal for models should be launched (12). This would allow access to a variety of models predicted by different methods for any target protein. The Protein Modeling Portal [PMP; http://www.proteinmodelportal.org; (13)] has been established at the Swiss Institute of Bioinformatics (Biozentrum University of Basel) and is headed by Torsten Schwede. The PMP currently provides access to several million pre-built models from the four PSI centers and publicly available model databases [i.e. ModBase (14) and SWISS-MODEL Repository (15)]. The PMP can be accessed from the PSI SGKB through web service queries, or by directly searching the portal for models of specific or similar protein sequences or models built on specific template structures. The PSI SGKB links to the PMP pages which display information on individual models (or sets of models), as well as functional annotation of the target protein sequence.

Annotation: Many different annotations are possible for every target, including structure determination and validation details; sequence information, including possible family and domain assignments; structure information, including surface characteristics; cavities; potential and actual active sites; fold classifications; protein–protein and protein–ligand interactions; and structure–function relationships. The PSI centers have created services that provide access to many of these annotations. The PSI SGKB links to the PSI interactive services, summaries and galleries of annotation information, with several of these resources integrated directly into PSI SGKB search reports. In addition, the search reports link to approximately 50 additional annotation resources of sequence, structure and function such as UniProt (16), National Center for Biotechnology Information (NCBI) (17,18), Class, Architecture, Topology, and Homologous Superfamily (CATH) (19), Structural Classification of Proteins (SCOP) (20), and Gene Ontology (GO) (21). A complete list of annotation sources is maintained on an Annotation Resources page (http://kb.psi-structuralgenomics.org/KB/annotation-resources.html). The ‘Workshop on the Biological Annotation of Novel Proteins’ (7 and 8 March 2008; http://annotation-workshop.rutgers.edu/) was convened to collect detailed recommendations and requirements for future annotation information to be incorporated into the PSI SGKB.

Technology development: Each PSI center has developed a variety of cutting edge technologies for all stages of the structure determination pipeline (22). These technologies have been critical to the success of the PSI program. Additionally, the descriptions of these technologies are an important resource for the broader biological community for use in other research. Established at Lawrence Berkeley Laboratory under the leadership of Paul Adams, the Technology Portal (https://isswprod.lbl.gov/PSIKBPortal/) currently provides access to summaries of key PSI technologies with links to the responsible PSI center and/or related publication. Information within the Technology Portal is accessible through keyword searches of the PSI SGKB.

Metrics: A quantitative assessment of the productivity of the PSI Centers is available via metrics, such as the numbers of distinct and novel structures. The metrics were articulated in a PSI Steering Subcommittee on Goals and Milestones report (http://targetdb.rcsb.org/Metrics/Milestones.html). Tabulations of key metrics from these recommendations are updated regularly at http://targetdb.rcsb.org/Metrics/SummaryTable.html and http://targetdb.rcsb.org/Metrics/MilestonesTables.html.

Publications: All articles published by PSI scientists are collected into a central Publications Resource (http://olenka.med.virginia.edu/psi) developed by Wladek Minor at the University of Virginia. Lists of citations are categorized as structural or methodological, and include the PubMed identifier and the number of times the article has been cited. With this final piece of information, the resource calculates and tracks the following (with current values at the time this manuscript was prepared): total number of articles published by all Centers (1036), number of articles cited more than five times (561), total number of citations (16 077), average number of times a structural (10.7) or methodological (20.5) paper has been cited and total impact of the articles by journal impact rating (4985). These values are also available by individual PSI center. The resource additionally charts the number of publications by year and by impact factor. These statistics are essential to track the impact of the PSI efforts, since outreach and the dissemination of PSI-catalyzed research are central to the PSI mission.
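
The summary values tracked by the Publications Resource can be illustrated with a minimal Python sketch. It is not the resource's actual implementation: the `Article` record, the `summarize` function and the sample records are hypothetical, and the "total impact" is assumed here to be a simple sum of journal impact ratings over all articles, which the text does not specify.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Article:
    """Hypothetical record for one article in a publications list."""
    pmid: str             # PubMed identifier
    category: str         # "structural" or "methodological"
    citations: int        # number of times the article has been cited
    impact_factor: float  # impact rating of the publishing journal


def summarize(articles, threshold=5):
    """Compute summary values of the kind listed in the text."""
    citations_by_category = {}
    for article in articles:
        citations_by_category.setdefault(article.category, []).append(article.citations)
    return {
        "total_articles": len(articles),
        # Articles cited more than `threshold` times (five in the text).
        "cited_more_than_threshold": sum(1 for a in articles if a.citations > threshold),
        "total_citations": sum(a.citations for a in articles),
        "mean_citations_by_category": {
            category: mean(values) for category, values in citations_by_category.items()
        },
        # Assumption: total impact is the sum of journal impact ratings.
        "total_impact": sum(a.impact_factor for a in articles),
    }


# Made-up records, for illustration only.
sample = [
    Article("PMID-A", "structural", 12, 5.0),
    Article("PMID-B", "structural", 4, 3.0),
    Article("PMID-C", "methodological", 20, 10.0),
]
summary = summarize(sample)
print(summary)
```

Grouping the same records by center before calling `summarize` would give the per-center values mentioned above.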

SEARCH CAPABILITIES

Integrating information from each PSI center with external resources is a key focus of the PSI SGKB. This integration already exists in resources like the RCSB PDB, which contain annotations about folds from CATH (19) and SCOP (20), function via Gene Ontology (21) and sequence via UniProt (16). TargetDB and PepcDB are cross-linked to sequence family and domain databases. The PSI SGKB takes this integration a large step further. In the current PSI SGKB release, it is possible to perform a single query to extract information from the PDB, PepcDB, TargetDB, the PMP, the Technology Portal, the Publications Resource, the MR and other biological annotation resources (Figure 1). For example, a user can type in a sequence or a PDB ID code to extract reports containing structural information and annotations, protein production protocols, predicted models, available DNA clone materials and related target status, without having to go to each individual module. Keyword searches query technology information, publication information (including title and abstracts for structural genomics-related target and structure publications), the MR and all indexed pages from the PSI Centers.

Figure 1. A functional view of the PSI SGKB resource. The PSI SGKB portal database is comprised of ID codes, sequences, external URLs and annotations. PDB ID, sequence, or keyword queries made to the PSI SGKB access other PSI data portals (TargetDB, PepcDB, PMP, Technology Portal, Publications Resource, MR, PSI Center websites), PDB and 50 other external biological resources (a selection is shown).
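
The single-query behavior described in this section can be sketched in Python as a small dispatcher. This is only an illustration under stated assumptions, not the PSI SGKB's code: the query classification rules, the `ROUTES` table and the stub backends are hypothetical, and the routing simply mirrors which resources the text says are reached by PDB ID or sequence queries versus keyword queries.

```python
import re
from typing import Callable, Dict, List

# Heuristic patterns (assumptions): a classic PDB ID is a digit followed by
# three alphanumeric characters; a sequence is taken to be a run of at least
# 20 one-letter amino acid codes. Real input handling would be more careful.
PDB_ID = re.compile(r"^[0-9][A-Za-z0-9]{3}$")
SEQUENCE = re.compile(r"^[ACDEFGHIKLMNPQRSTVWY]{20,}$", re.IGNORECASE)

# Hypothetical routing table based on the description in the text.
ROUTES: Dict[str, List[str]] = {
    "pdb_id": ["PDB", "TargetDB", "PepcDB", "PMP", "MR"],
    "sequence": ["PDB", "TargetDB", "PepcDB", "PMP", "MR"],
    "keyword": ["Technology Portal", "Publications Resource", "MR", "PSI Center pages"],
}


def classify_query(text: str) -> str:
    """Guess whether the input is a PDB ID, a protein sequence or a keyword."""
    query = "".join(text.split())  # drop spaces and line breaks
    if PDB_ID.match(query):
        return "pdb_id"
    if SEQUENCE.match(query):
        return "sequence"
    return "keyword"


def search(text: str, backends: Dict[str, Callable[[str], List[str]]]) -> Dict[str, List[str]]:
    """Send one query to every resource routed for its type and merge the results."""
    kind = classify_query(text)
    return {name: backends[name](text) for name in ROUTES[kind] if name in backends}


def make_stub(name: str) -> Callable[[str], List[str]]:
    """Build a placeholder backend; a real one would call the remote resource."""
    return lambda query: [f"{name} hit for {query!r}"]


backends = {name: make_stub(name) for names in ROUTES.values() for name in names}

for example in ("1abc", "MKTAYIAKQRQISFVKSHFSRQLE", "protein production"):
    print(classify_query(example), sorted(search(example, backends)))
```

The point of the design is that the user never chooses a module: the portal decides from the form of the query which resources to consult and returns one combined report.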

THE NPG PSI SGKB GATEWAY

The scope and potential outreach of the PSI SGKB was expanded recently when it joined forces with the Nature Publishing Group to create a PSI SGKB Gateway. The Gateway enhances the PSI SGKB's powerful query feature with articles and resources that highlight research findings, new technologies, and general structural biology news. A Research Library catalogs articles relevant to structural biology and genomics. A description of a particularly interesting molecule is featured monthly. The Functional Sleuth section presents information about molecules of unknown function to challenge biologists to provide further insights about these molecules. News alerts and Really Simple Syndication (RSS) feeds will help alert the scientific community about the progress of structural genomics.

FUTURE DEVELOPMENTS

The core resources for the PSI SGKB are in place. New developments include the construction of a pipeline to collect and calculate an expanded set of annotations. A simplified matrix presentation will be provided for the new annotations and will highlight the structures requiring further functional characterization. Another planned capability will be the use of data mining to fully exploit these annotations.

FUNDING

National Institute of General Medical Sciences. Funding for open access charge: The PSI SGKB is supported by the National Institutes of Health (NIH) as a sub-grant under Prime Agreement Award Number: 3U54GM074958-04S1.

Conflict of interest statement. None declared.

  20 in total

1.  TargetDB: a target registration database for structural genomics projects.

Authors:  Li Chen; Rose Oughtred; Helen M Berman; John Westbrook
Journal:  Bioinformatics       Date:  2004-05-06       Impact factor: 6.937

2.  Outcome of a workshop on archiving structural models of biological macromolecules.

Authors:  Helen M Berman; Stephen K Burley; Wah Chiu; Andrej Sali; Alexei Adzhubei; Philip E Bourne; Stephen H Bryant; Roland L Dunbrack; Krzysztof Fidelis; Joachim Frank; Adam Godzik; Kim Henrick; Andrzej Joachimiak; Bernard Heymann; David Jones; John L Markley; John Moult; Gaetano T Montelione; Christine Orengo; Michael G Rossmann; Burkhard Rost; Helen Saibil; Torsten Schwede; Daron M Standley; John D Westbrook
Journal:  Structure       Date:  2006-08       Impact factor: 5.006

3.  Harnessing knowledge from structural genomics.

Authors:  Helen M Berman
Journal:  Structure       Date:  2008-01       Impact factor: 5.006

4.  The ProDom database of protein domain families.

Authors:  F Corpet; J Gouzy; D Kahn
Journal:  Nucleic Acids Res       Date:  1998-01-01       Impact factor: 16.971

5.  The SWISS-MODEL Repository: new features and functionalities.

Authors:  Jürgen Kopp; Torsten Schwede
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

6.  The RCSB PDB information portal for structural genomics.

Authors:  Andrei Kouranov; Lei Xie; Joanna de la Cruz; Li Chen; John Westbrook; Philip E Bourne; Helen M Berman
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

7.  The Universal Protein Resource (UniProt).

Authors:  UniProt Consortium
Journal:  Nucleic Acids Res       Date:  2006-11-16       Impact factor: 16.971

8.  The SUPERFAMILY database in 2007: families and functions.

Authors:  Derek Wilson; Martin Madera; Christine Vogel; Cyrus Chothia; Julian Gough
Journal:  Nucleic Acids Res       Date:  2006-11-10       Impact factor: 16.971

9.  The 20 years of PROSITE.

Authors:  Nicolas Hulo; Amos Bairoch; Virginie Bulliard; Lorenzo Cerutti; Béatrice A Cuche; Edouard de Castro; Corinne Lachaize; Petra S Langendijk-Genevaux; Christian J A Sigrist
Journal:  Nucleic Acids Res       Date:  2007-11-14       Impact factor: 16.971

10.  Database resources of the National Center for Biotechnology Information.

Authors:  David L Wheeler; Tanya Barrett; Dennis A Benson; Stephen H Bryant; Kathi Canese; Vyacheslav Chetvernin; Deanna M Church; Michael Dicuccio; Ron Edgar; Scott Federhen; Michael Feolo; Lewis Y Geer; Wolfgang Helmberg; Yuri Kapustin; Oleg Khovayko; David Landsman; David J Lipman; Thomas L Madden; Donna R Maglott; Vadim Miller; James Ostell; Kim D Pruitt; Gregory D Schuler; Martin Shumway; Edwin Sequeira; Steven T Sherry; Karl Sirotkin; Alexandre Souvorov; Grigory Starchenko; Roman L Tatusov; Tatiana A Tatusova; Lukas Wagner; Eugene Yaschenko
Journal:  Nucleic Acids Res       Date:  2007-11-27       Impact factor: 16.971

