
PDBsum more: new summaries and analyses of the known 3D structures of proteins and nucleic acids.

Roman A Laskowski, Victor V Chistyakov, Janet M Thornton.

Abstract

PDBsum is a database of mainly pictorial summaries of the 3D structures of proteins and nucleic acids in the Protein Data Bank. Its pages aim to provide an at-a-glance view of the contents of every 3D structure, plus detailed structural analyses of each protein chain, DNA-RNA chain and any bound ligands and metals. In the past year, the database has been significantly improved, in terms of both appearance and new content. Moreover, it has moved to its new address at http://www.ebi.ac.uk/thornton-srv/databases/pdbsum.

Year: 2005 PMID: 15608193 PMCID: PMC539955 DOI: 10.1093/nar/gki001

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

The PDBsum database was created at University College London in 1995 (1,2). Its aim was to provide an illustrated and informative summary for each of the 3D structures released by the Protein Data Bank (PDB) (3). As of July 1, 2004, the database has been transferred to the European Bioinformatics Institute, having been given a complete facelift and many new analyses and links. Its new address is http://www.ebi.ac.uk/thornton-srv/databases/pdbsum. In this paper, we describe some of the improvements that have been made and the new features that have been added.

NEW LAYOUT

The most obvious change is to the appearance of the web pages. These have been modernized, simplified and structured in a more logical manner, and are now generated dynamically. Each structure's home page now provides one or more thumbnail images of the structure and, below them, an index listing the molecules it contains: protein chains, DNA–RNA chains, small-molecule ligands, metal ions and the number of water molecules. Clicking on an item in the index takes you to the analyses provided for that molecule type (secondary structure diagrams for protein chains, protein–ligand interactions for the ligands, and so on). The index thus provides an at-a-glance summary of the molecules contained in the PDB entry.

Much redundant information has been removed. For example, where a structure contains multiple copies of the same protein chain, only a representative chain is described in detail; previously, all chains were described, rather unnecessarily. This is reflected in the index, which groups together or separates the protein chains accordingly. You can therefore see immediately that, say, the structure consists of four chains (A, B, C and D) which are all equivalent or, conversely, that it consists of two dissimilar protein chains, A and B. Similarly, multiple copies of the same ligand making identical interactions with equivalent protein chains are now shown only once.

In addition to the thumbnail image and index of contents, the home page of each entry also provides the usual descriptive information (such as title, authors and date of deposition), links to other sequence and structure databases, summary PROCHECK (4) analyses and a button for viewing the structure in RasMol (5). A novel feature is a link to a server that allows you to generate your own image of the structure automatically via MolScript (6) and Raster3D (7).

Another new feature, for most enzyme structures, is a diagram of the reaction catalysed by the enzyme. The diagram shows chemical drawings of the reactants, products and, where relevant, cofactors. The drawings are generated from mol2 files that were downloaded from the KEGG (8) ftp site. Of particular interest are structures where the bound ligand corresponds to, or is similar to, one of the molecules involved in the reaction. These are identified on the diagram with their percentage similarity to the molecule in question. Similarities are calculated by using a simple graph-match between the atom types and connectivities of the structure's ligands and the reaction molecules. Figure 1 shows an example.
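
The details of the graph-match are not given here. As a rough illustration only, the following Python sketch scores two molecules by comparing their multisets of atom types and of bonds labelled by the atom types they join. This is a deliberate simplification of a true graph match, since it ignores overall topology, and it is not PDBsum's implementation. The molecule representation (a list of atom types plus a list of index pairs) and the names `molecule_features` and `percent_similarity` are assumptions made for the example.

```python
from collections import Counter


def molecule_features(atoms, bonds):
    """Build a multiset of atom types and of bonds labelled by their two atom types.

    atoms: list of element symbols, e.g. ["C", "O"]
    bonds: list of (i, j) index pairs into atoms
    """
    features = Counter(("atom", a) for a in atoms)
    for i, j in bonds:
        pair = tuple(sorted((atoms[i], atoms[j])))
        features[("bond", pair)] += 1
    return features


def percent_similarity(mol_a, mol_b):
    """Percentage overlap of the two feature multisets (shared / union)."""
    fa = molecule_features(*mol_a)
    fb = molecule_features(*mol_b)
    shared = sum((fa & fb).values())  # multiset intersection
    total = sum((fa | fb).values())   # multiset union
    return 100.0 * shared / total if total else 0.0


# Toy heavy-atom skeletons, used purely as illustration.
methanol = (["C", "O"], [(0, 1)])
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])

print(percent_similarity(methanol, methanol))  # 100.0
print(percent_similarity(methanol, ethanol))   # 60.0
```

A real graph match would additionally require that the shared atoms and bonds form a common connected substructure, which this feature-counting shortcut does not check.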

Figure 1

Diagram showing the reaction catalysed by enzymes of class E.C.2.6.1.16, the glucosamine 6-phosphate synthases. The diagram is taken from the PDBsum page for 1gdo, where the bound ligand, an L-glutamate, corresponds to one of the enzyme's products and is highlighted with a blue border in the diagram. Clicking on the highlighted molecule goes to the corresponding ligand page.

PROTEIN PAGES

Each representative protein chain in a given structure has its own page holding a ‘wiring diagram’ of its secondary structure, plus domain organization as given in the CATH fold classification database (9). As before, a detailed analysis of the secondary structure motifs is provided, via PROMOTIF (10), and any valid PROSITE patterns (11) contained within the sequence are mapped to the 3D structure (12) via RasMol. There are two new features on these pages. The first is the thumbnail image, which shows the chain in question in solid representation, any identical chains as semi-transparent and all other molecules in the structure as transparent. Clicking on the image brings up a picture of the chain itself. For large and complex structures, this can help locate the chain in the structure as a whole. The second novel feature is the inclusion of residue conservation data, where available. It is well known that highly conserved residues are usually crucial to the function of the protein, and their location on the surface of the protein can pinpoint the functionally active region. The conservation of each residue is computed by the ConSurf (13) program, which uses multiple sequence alignments of the protein chain against homologues in the sequence databases. The residues are coloured according to their conservation score on the wiring diagram and a RasMol view of the protein's surface shows the most and least highly conserved regions on the surface (see Figure 2). An alternative view of the 3D structure, again using RasMol, is provided by using the ConSurf colouring scheme.
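
The conservation scores themselves come from ConSurf and are not reproduced here. The sketch below illustrates only the final colouring step, in which per-residue scores are mapped onto a small number of discrete grades that could each be assigned a colour. The function name `conservation_grades`, the default of nine grades and the convention that a higher score means a more conserved residue are all assumptions made for this example.

```python
def conservation_grades(scores, n_grades=9):
    """Map raw per-residue scores to integer grades 1..n_grades.

    Assumes a higher score means a more highly conserved residue.
    Scores are rescaled linearly between the minimum and maximum observed.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All residues equally conserved: place them in the middle grade.
        return [(n_grades + 1) // 2] * len(scores)
    grades = []
    for s in scores:
        frac = (s - lo) / (hi - lo)  # 0.0 .. 1.0
        grades.append(min(n_grades, int(frac * n_grades) + 1))
    return grades


print(conservation_grades([0.0, 0.5, 1.0]))  # [1, 5, 9]
```

Each grade can then be looked up in a colour table when drawing the wiring diagram or the surface view.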

Figure 2

Surface of the glucosamine 6-phosphate synthase structure (PDB code 1gdo) coloured by residue conservation: red and pink for the most highly conserved regions, and blue for the most variable. The bound ligand, an L-glutamate, can be seen in spacefill representation within the highly conserved binding pocket. Also bound are an acetate ion and a sodium ion (green sphere).

LIGAND AND METAL ION PAGES

The ligand pages show the various ligand molecules and metal ions bound to the protein or DNA molecules in the structure. Where there are many instances of the same ligand or metal, only a representative example is given; identical molecules making identical interactions with equivalent protein chains are merely listed. Such rationalization is necessary as some structures these days have staggeringly large numbers of bound ligands—see for example PDB code 1qzv, which has no fewer than 334 alpha-chlorophyll A molecules, plus others, bound to a large complex of 32 protein chains corresponding to plant photosystem I.
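
One way to picture this rationalization is as a grouping step: ligand instances with the same name and the same set of contacts (ignoring which copy of the chain they bind) fall into one group, the first member of which serves as the representative. The sketch below is a hypothetical illustration under that assumption; the record layout, the ligand name `LIG` and the function name `group_ligands` are invented for the example and do not describe PDBsum's internal data model.

```python
def group_ligands(instances):
    """Group ligand instances that share a name and an identical set of contacts.

    Each instance is a dict with:
      "id":       unique label for this copy of the ligand
      "name":     ligand name
      "contacts": iterable of (residue_name, residue_number) pairs,
                  with the chain identifier deliberately left out so that
                  equivalent chains compare equal
    Returns a list of (representative_id, [other_ids]) in first-seen order.
    """
    groups = {}
    for inst in instances:
        key = (inst["name"], frozenset(inst["contacts"]))
        groups.setdefault(key, []).append(inst["id"])
    return [(ids[0], ids[1:]) for ids in groups.values()]


# Hypothetical ligand copies bound to equivalent chains.
instances = [
    {"id": "LIG-1", "name": "LIG", "contacts": [("HIS", 10), ("ASP", 42)]},
    {"id": "LIG-2", "name": "LIG", "contacts": [("ASP", 42), ("HIS", 10)]},
    {"id": "LIG-3", "name": "LIG", "contacts": [("LEU", 5)]},
]

for representative, others in group_ligands(instances):
    print(representative, "also listed:", others)
```

Only the representative of each group would then be analysed in detail, with the remaining copies merely listed.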

THE ENZYME STRUCTURES DATABASE (EC → PDB)

The Enzyme Structures Database, http://www.ebi.ac.uk/thornton-srv/databases/enzymes, is a subset of PDBsum that provides a separate grouping of all the enzyme structures in the PDB, classified by their enzyme classification (EC) numbers (14). The database preserves the hierarchy of the EC numbering scheme, showing the number of PDB structures belonging to the class at each level. At the lowest level, the listed PDB codes link directly to their PDBsum pages. Where any of the listed structures contain ligands that resemble, or correspond to, any of the reaction molecules, this resemblance is given as a percentage similarity. This helps identify the structures in a specific enzyme class that may be the most informative about where and how the cognate ligand(s) bind. The EC hierarchy, descriptions, reactions and reaction molecules are obtained from the ENZYME database (15). The molecule definitions are downloaded as mol2 files from the KEGG ftp site, as mentioned above.
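
The per-level counts can be pictured as a roll-up over EC number prefixes. The following sketch assumes a simple mapping from full EC numbers to lists of PDB codes and counts the distinct structures under every prefix of the hierarchy. The function name `count_by_ec_level` and the input layout are assumptions for illustration; apart from 1gdo, which the text associates with E.C.2.6.1.16, the codes in the example are placeholders rather than real entries.

```python
from collections import defaultdict


def count_by_ec_level(ec_to_pdb):
    """Count distinct PDB codes under every level of the EC hierarchy.

    ec_to_pdb: dict mapping a full EC number such as "2.6.1.16"
               to a list of PDB codes.
    Returns a dict mapping each prefix ("2", "2.6", "2.6.1", ...) to a count.
    """
    codes_under = defaultdict(set)
    for ec, pdb_codes in ec_to_pdb.items():
        parts = ec.split(".")
        for depth in range(1, len(parts) + 1):
            prefix = ".".join(parts[:depth])
            # A set ensures a structure is counted once per class.
            codes_under[prefix].update(pdb_codes)
    return {prefix: len(codes) for prefix, codes in codes_under.items()}


# Placeholder codes (xxxa, xxxb, xxxc) are hypothetical.
example = {
    "2.6.1.16": ["1gdo"],
    "2.6.1.1": ["xxxa", "xxxb"],
    "3.1.1.1": ["xxxc"],
}

counts = count_by_ec_level(example)
for prefix in sorted(counts):
    print(prefix, counts[prefix])
```

Using sets rather than running totals avoids double-counting a structure that is annotated with more than one EC number within the same class.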

PDBsum HIGHLIGHTS

A new feature, accessed from the PDBsum home page, is the Highlights page. This tabulates the most extreme entries in the database in terms of various attributes: oldest depositions, youngest, largest, smallest, longest chain, most ligands, highest resolution, lowest, and so on (see Figure 3). This helps locate some of the more unusual structures that have been solved to date! More highlights are planned as the PDB is full of the weird and the wonderful.
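
A listing of this kind amounts to sorting the entries on one attribute and keeping the first few. The sketch below shows the idea on invented records; the field names, the placeholder codes and the function name `top_entries` are all hypothetical, and the values are not taken from the PDB.

```python
def top_entries(entries, attribute, n=5, largest=True):
    """Return the n most extreme (code, value) pairs for one attribute."""
    ranked = sorted(entries, key=lambda e: e[attribute], reverse=largest)
    return [(e["code"], e[attribute]) for e in ranked[:n]]


# Invented records for illustration only.
entries = [
    {"code": "xxx1", "resolution": 1.8, "n_ligands": 2},
    {"code": "xxx2", "resolution": 0.9, "n_ligands": 40},
    {"code": "xxx3", "resolution": 2.5, "n_ligands": 0},
]

# The highest resolution corresponds to the smallest value in angstroms.
print(top_entries(entries, "resolution", n=2, largest=False))
print(top_entries(entries, "n_ligands", n=1))
```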

Figure 3

Example of one of the PDBsum highlights listings, here showing the top five structures in terms of the highest quoted resolution.

12 in total

1. KEGG: kyoto encyclopedia of genes and genomes.

Authors: M Kanehisa; S Goto
Journal: Nucleic Acids Res Date: 2000-01-01 Impact factor: 16.971

2. The Protein Data Bank.

Authors: H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal: Nucleic Acids Res Date: 2000-01-01 Impact factor: 16.971

3. Assigning genomic sequences to CATH.

Authors: F M Pearl; D Lee; J E Bray; I Sillitoe; A E Todd; A P Harrison; J M Thornton; C A Orengo
Journal: Nucleic Acids Res Date: 2000-01-01 Impact factor: 16.971

4. The ENZYME database in 2000.

Authors: A Bairoch
Journal: Nucleic Acids Res Date: 2000-01-01 Impact factor: 16.971

5. Three-dimensional structure analysis of PROSITE patterns.

Authors: A Kasuya; J M Thornton
Journal: J Mol Biol Date: 1999-03-12 Impact factor: 5.469

6. Recent improvements to the PROSITE database.

Authors: Nicolas Hulo; Christian J A Sigrist; Virginie Le Saux; Petra S Langendijk-Genevaux; Lorenza Bordoli; Alexandre Gattiker; Edouard De Castro; Philipp Bucher; Amos Bairoch
Journal: Nucleic Acids Res Date: 2004-01-01 Impact factor: 16.971

7. ConSurf: identification of functional regions in proteins by surface-mapping of phylogenetic information.

Authors: Fabian Glaser; Tal Pupko; Inbal Paz; Rachel E Bell; Dalit Bechor-Shental; Eric Martz; Nir Ben-Tal
Journal: Bioinformatics Date: 2003-01 Impact factor: 6.937

8. Raster3D: photorealistic molecular graphics.

Authors: E A Merritt; D J Bacon
Journal: Methods Enzymol Date: 1997 Impact factor: 1.600

9. PROMOTIF--a program to identify and analyze structural motifs in proteins.

Authors: E G Hutchinson; J M Thornton
Journal: Protein Sci Date: 1996-02 Impact factor: 6.725

10. PDBsum: a Web-based database of summaries and analyses of all PDB structures.

Authors: R A Laskowski; E G Hutchinson; A D Michie; A C Wallace; M L Jones; J M Thornton
Journal: Trends Biochem Sci Date: 1997-12 Impact factor: 13.807
