
Sequence similarity analysis of Escherichia coli proteins: functional and evolutionary implications.

E V Koonin, R L Tatusov, K E Rudd.

Abstract

A computer analysis of 2328 protein sequences comprising about 60% of the Escherichia coli gene products was performed using methods for database screening with individual sequences and alignment blocks. A high fraction of E. coli proteins (86%) shows significant sequence similarity to other proteins in current databases; about 70% show conservation at least at the level of distantly related bacteria, and about 40% contain ancient conserved regions (ACRs) shared with eukaryotic or Archaeal proteins. For >90% of the E. coli proteins, functional information, sequence similarity, or both are available. Forty-six percent of the E. coli proteins belong to 299 clusters of paralogs (intraspecies homologs) defined on the basis of pairwise similarity. Another 10% could be included in 70 superclusters using motif detection methods. The majority of the clusters contain only two to four members. In contrast, nearly 25% of all E. coli proteins belong to the four largest superclusters: permeases, ATPases and GTPases with the conserved "Walker-type" motif, helix-turn-helix regulatory proteins, and NAD(FAD)-binding proteins. We conclude that bacterial protein sequences generally are highly conserved in evolution, with about 50% of all ACR-containing protein families represented among the E. coli gene products. With the current sequence databases and methods for screening them, computer analysis yields useful information on the functions and evolutionary relationships of the vast majority of genes in a bacterial genome. Sequence similarity with E. coli proteins allows the prediction of functions for a number of important eukaryotic genes, including several whose products are implicated in human diseases.
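The abstract states that paralog clusters were defined on the basis of pairwise similarity but does not describe the clustering procedure itself. As an illustration only, the following minimal sketch assumes single-linkage grouping: any two proteins connected by a chain of significant pairwise hits end up in the same cluster. The function names, the protein identifiers, and the list of similar pairs are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def cluster_by_pairwise_similarity(proteins, similar_pairs):
    """Group proteins into single-linkage clusters using union-find.

    `similar_pairs` is assumed to hold pairs already judged significantly
    similar by some upstream database search (not modeled here).
    """
    parent = {p: p for p in proteins}

    def find(x):
        # Walk to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for p in proteins:
        groups[find(p)].add(p)

    # A cluster of paralogs needs at least two members; singletons are dropped.
    return [members for members in groups.values() if len(members) >= 2]


def fraction_in_clusters(proteins, clusters):
    """Return the share of all proteins that fall into some cluster."""
    return sum(len(c) for c in clusters) / len(proteins)


if __name__ == "__main__":
    # Hypothetical toy data, purely for illustration.
    proteins = ["p1", "p2", "p3", "p4", "p5", "p6"]
    similar_pairs = [("p1", "p2"), ("p2", "p3"), ("p4", "p5")]
    clusters = cluster_by_pairwise_similarity(proteins, similar_pairs)
    print(clusters)
    print(fraction_in_clusters(proteins, clusters))
```

Under this assumption, a figure such as "46% of proteins belong to 299 clusters" corresponds to the value of `fraction_in_clusters` and the length of the returned cluster list. Single linkage can merge unrelated families through multidomain proteins, so a real analysis would likely need stricter criteria.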

Year: 1995; PMID: 8524875; PMCID: PMC40515; DOI: 10.1073/pnas.92.25.11921

Source DB: PubMed; Journal: Proc Natl Acad Sci U S A; ISSN: 0027-8424; Impact factor: 11.205


