Genome analysis: Assigning protein coding regions to three-dimensional structures.

A A Salamov, M Suwa, C A Orengo, M B Swindells.

Abstract

We describe the results of a procedure for maximizing the number of sequences that can be reliably linked to a protein of known three-dimensional structure. Unlike other methods, which try to increase sensitivity through fold-recognition software, our procedure uses only conventional sequence alignment tools, but applies them in a manner that significantly increases the number of relationships detected. We analyzed 11 genomes and found that, depending on the genome, between 23% and 32% of the ORFs had significant matches to proteins of known structure. In all cases, the aligned region consisted of either >100 residues or >50% of the smaller sequence. Slightly higher percentages could be attained if smaller motifs were also included. This coverage is significantly higher than that reported for most previous methods, even those that have a fold-recognition component. We survey the biochemical and structural characteristics of the most frequently occurring proteins, and discuss the extent to which alignment methods can realistically assign function to gene products.
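The length criterion stated in the abstract (aligned region of >100 residues or >50% of the smaller sequence) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the hit tuple layout, and the example ORF identifiers are hypothetical, and the statistical-significance test that the abstract also requires is assumed to have been applied upstream.

```python
from typing import Dict, List, Tuple

# Hypothetical hit record: (aligned_length, orf_length, structure_length), in residues.
Hit = Tuple[int, int, int]


def passes_length_criterion(aligned_len: int, orf_len: int, structure_len: int) -> bool:
    """True if the aligned region is >100 residues or >50% of the smaller sequence."""
    smaller = min(orf_len, structure_len)
    return aligned_len > 100 or aligned_len > 0.5 * smaller


def fraction_assigned(hits_by_orf: Dict[str, List[Hit]], n_orfs: int) -> float:
    """Fraction of ORFs in a genome with at least one hit meeting the criterion.

    Hits are assumed to be statistically significant already; only the
    length criterion is checked here.
    """
    assigned = sum(
        1
        for hits in hits_by_orf.values()
        if any(passes_length_criterion(*hit) for hit in hits)
    )
    return assigned / n_orfs


if __name__ == "__main__":
    # Toy data with made-up ORF names, for illustration only.
    example = {
        "orf_A": [(120, 400, 300)],  # >100 residues aligned
        "orf_B": [(60, 100, 500)],   # >50% of the smaller (100-residue) sequence
        "orf_C": [(80, 300, 400)],   # fails both conditions
    }
    print(fraction_assigned(example, n_orfs=4))
```

An ORF with no hits at all still counts in the denominator, which is why `n_orfs` is passed separately rather than derived from the dictionary.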

Year:  1999        PMID: 10211823      PMCID: PMC2144302          DOI: 10.1110/ps.8.4.771

Source DB:  PubMed          Journal:  Protein Sci        ISSN: 0961-8368            Impact factor:   6.725


