Protein sequence comparison at genome scale.

E V Koonin, R L Tatusov, K E Rudd.

Abstract

An adequate set of computer procedures tailored to address the task of genome-scale analysis of protein sequences will greatly increase the beneficial impact of the genome sequencing projects on the progress of biological research. This is especially pertinent given the fact that, for model organisms, one-half or more of the putative gene products have not been functionally characterized. Here we describe several programs that may comprise the core of such a set and their application to the analysis of about 3000 proteins comprising 75% of the E. coli gene products. We find that the protein sequences encoded in this model genome are a rich source of information, with biologically relevant similarities detected for more than 80% of them. In the majority of cases, these similarities become evident directly from the results of BLAST searches. However, methods for motif analysis provide for a significant increase in search sensitivity and are particularly important for the detection of ancient conserved regions. As a result of sequence similarity analysis, generalized functional predictions can be made for the majority of uncharacterized ORF products, allowing efficient focusing of experimental effort. Clustering of the E. coli proteins on the basis of sequence similarity shows that almost one-half of the bacterial proteins have at least one paralog and that the likelihood that a protein belongs to a small or a large cluster depends on the function of this particular protein.

Year:  1996        PMID: 8743691     DOI: 10.1016/s0076-6879(96)66020-0

Source DB:  PubMed          Journal:  Methods Enzymol        ISSN: 0076-6879            Impact factor:   1.600


  15 in total

1.  Lineage-specific gene expansions in bacterial and archaeal genomes.

Authors:  I K Jordan; K S Makarova; J L Spouge; Y I Wolf; E V Koonin
Journal:  Genome Res       Date:  2001-04       Impact factor: 9.043

2.  Genome of lumpy skin disease virus.

Authors:  E R Tulman; C L Afonso; Z Lu; L Zsak; G F Kutish; D L Rock
Journal:  J Virol       Date:  2001-08       Impact factor: 5.103

3.  The genome of swinepox virus.

Authors:  C L Afonso; E R Tulman; Z Lu; L Zsak; F A Osorio; C Balinsky; G F Kutish; D L Rock
Journal:  J Virol       Date:  2002-01       Impact factor: 5.103

4.  Phosphoesterase domains associated with DNA polymerases of diverse origins.

Authors:  L Aravind; E V Koonin
Journal:  Nucleic Acids Res       Date:  1998-08-15       Impact factor: 16.971

5.  A database of macromolecular motions.

Authors:  M Gerstein; W Krebs
Journal:  Nucleic Acids Res       Date:  1998-09-15       Impact factor: 16.971

6.  Genomic evidence for two functionally distinct gene classes.

Authors:  M C Rivera; R Jain; J E Moore; J A Lake
Journal:  Proc Natl Acad Sci U S A       Date:  1998-05-26       Impact factor: 11.205

7.  Extracting protein alignment models from the sequence database.

Authors:  A F Neuwald; J S Liu; D J Lipman; C E Lawrence
Journal:  Nucleic Acids Res       Date:  1997-05-01       Impact factor: 16.971

8.  Multisite-specific tRNA:m5C-methyltransferase (Trm4) in yeast Saccharomyces cerevisiae: identification of the gene and substrate specificity of the enzyme.

Authors:  Y Motorin; H Grosjean
Journal:  RNA       Date:  1999-08       Impact factor: 4.942

9.  A minimal gene set for cellular life derived by comparison of complete bacterial genomes.

Authors:  A R Mushegian; E V Koonin
Journal:  Proc Natl Acad Sci U S A       Date:  1996-09-17       Impact factor: 11.205

10.  The genome of Melanoplus sanguinipes entomopoxvirus.

Authors:  C L Afonso; E R Tulman; Z Lu; E Oma; G F Kutish; D L Rock
Journal:  J Virol       Date:  1999-01       Impact factor: 5.103
