Progress of structural genomics initiatives: an analysis of solved target structures.

Annabel E Todd, Russell L Marsden, Janet M Thornton, Christine A Orengo.

Abstract

The explosion in gene sequence data and technological breakthroughs in protein structure determination inspired the launch of structural genomics (SG) initiatives. An often stated goal of structural genomics is the high-throughput structural characterisation of all protein sequence families, with the long-term hope of significantly impacting on the life sciences, biotechnology and drug discovery. Here, we present a comprehensive analysis of solved SG targets to assess progress of these initiatives. Eleven consortia have contributed 316 non-redundant entries and 323 protein chains to the Protein Data Bank (PDB), and 459 and 393 domains to the CATH and SCOP structure classifications, respectively. The quality and size of these proteins are comparable to those solved in traditional structural biology and, despite huge scope for duplicated efforts, only 14% of targets have a close homologue (≥30% sequence identity) solved by another consortium. Analysis of CATH and SCOP revealed the significant contribution that structural genomics is making to the coverage of superfamilies and folds. A total of 67% of SG domains in CATH are unique, lacking an already characterised close homologue in the PDB, whereas only 21% of non-SG domains are unique. For 29% of domains, structure determination revealed a remote evolutionary relationship not apparent from sequence, and 19% and 11% contributed new superfamilies and folds. The secondary structure class, fold and superfamily distributions of this dataset reflect those of the genomes. The domains fall into 172 different folds and 259 superfamilies in CATH but the distribution is highly skewed. The most populous of these are those that recur most frequently in the genomes. Whilst 11% of superfamilies are bacteria-specific, most are common to all three superkingdoms of life and together the 316 PDB entries have provided new and reliable homology models for 9287 non-redundant gene sequences in 206 completely sequenced genomes.
From the perspective of this analysis, it appears that structural genomics is on track to be a success, and it is hoped that this work will inform future directions of the field.

Year: 2005 PMID: 15854658 DOI: 10.1016/j.jmb.2005.03.037

Source DB: PubMed Journal: J Mol Biol ISSN: 0022-2836 Impact factor: 5.469

Cited

64 in total

1. Sub-AQUA: real-value quality assessment of protein structure models.

Authors: Yifeng David Yang; Preston Spratt; Hao Chen; Changsoon Park; Daisuke Kihara
Journal: Protein Eng Des Sel Date: 2010-06-04 Impact factor: 1.650

2. Structure- and sequence-based function prediction for non-homologous proteins.

Authors: Lee Sael; Meghana Chitale; Daisuke Kihara
Journal: J Struct Funct Genomics Date: 2012-01-22

Review 3. Structural genomics: the ultimate approach for rational drug design.

Authors: Kenneth Lundstrom
Journal: Mol Biotechnol Date: 2006-10 Impact factor: 2.695

4. FINDSITE-metal: integrating evolutionary information and machine learning for structure-based metal-binding site prediction at the proteome level.

Authors: Michal Brylinski; Jeffrey Skolnick
Journal: Proteins Date: 2010-12-06

5. Robust recognition of zinc binding sites in proteins.

Authors: Jessica C Ebert; Russ B Altman
Journal: Protein Sci Date: 2007-11-27 Impact factor: 6.725

6. Growth of novel protein structural data.

Authors: Michael Levitt
Journal: Proc Natl Acad Sci U S A Date: 2007-02-20 Impact factor: 11.205

7. Using surface envelopes to constrain molecular modeling.

Authors: Jonathan M Dugan; Russ B Altman
Journal: Protein Sci Date: 2007-07 Impact factor: 6.725

8. A simple genetic algorithm for the optimization of multidomain protein homology models driven by NMR residual dipolar coupling and small angle X-ray scattering data.

Authors: Fabien Mareuil; Christina Sizun; Javier Perez; Marc Schoenauer; Jean-Yves Lallemand; François Bontems
Journal: Eur Biophys J Date: 2007-05-24 Impact factor: 1.733

Review 9. Exploiting protein structure data to explore the evolution of protein function and biological complexity.

Authors: Russell L Marsden; Juan A G Ranea; Antonio Sillero; Oliver Redfern; Corin Yeats; Michael Maibaum; David Lee; Sarah Addou; Gabrielle A Reeves; Timothy J Dallman; Christine A Orengo
Journal: Philos Trans R Soc Lond B Biol Sci Date: 2006-03-29 Impact factor: 6.237

Review 10. 'Unknown' proteins and 'orphan' enzymes: the missing half of the engineering parts list--and how to find it.

Authors: Andrew D Hanson; Anne Pribat; Jeffrey C Waller; Valérie de Crécy-Lagard
Journal: Biochem J Date: 2009-12-14 Impact factor: 3.857