
Exploiting protein structure data to explore the evolution of protein function and biological complexity.

Russell L Marsden, Juan A G Ranea, Antonio Sillero, Oliver Redfern, Corin Yeats, Michael Maibaum, David Lee, Sarah Addou, Gabrielle A Reeves, Timothy J Dallman, Christine A Orengo.

Abstract

New directions in biology are being driven by the complete sequencing of genomes, which has given us the protein repertoires of diverse organisms from all kingdoms of life. In tandem with this accumulation of sequence data, worldwide structural genomics initiatives, advanced by the development of improved technologies in X-ray crystallography and NMR, are expanding our knowledge of structural families and increasing our fold libraries. Methods for detecting remote sequence similarities have also been made more sensitive and this means that we can map domains from these structural families onto genome sequences to understand how these families are distributed throughout the genomes and reveal how they might influence the functional repertoires and biological complexities of the organisms. We have used robust protocols to assign sequences from completed genomes to domain structures in the CATH database, allowing up to 60% of domain sequences in these genomes, depending on the organism, to be assigned to a domain family of known structure. Analysis of the distribution of these families throughout bacterial genomes identified more than 300 universal families, some of which had expanded significantly in proportion to genome size. These highly expanded families are primarily involved in metabolism and regulation and appear to make major contributions to the functional repertoire and complexity of bacterial organisms. When comparisons are made across all kingdoms of life, we find a smaller set of universal domain families (approx. 140), of which families involved in protein biosynthesis are the largest conserved component. Analysis of the behaviour of other families reveals that some (e.g. those involved in metabolism, regulation) have remained highly innovative during evolution, making it harder to trace their evolutionary ancestry. Structural analyses of metabolic families provide some insights into the mechanisms of functional innovation, which include changes in domain partnerships and significant structural embellishments leading to modulation of active sites and protein interactions.


Year: 2006    PMID: 16524831    PMCID: PMC1609337    DOI: 10.1098/rstb.2005.1801

Source DB: PubMed    Journal: Philos Trans R Soc Lond B Biol Sci    ISSN: 0962-8436    Impact factor: 6.237


