
Automatic generation of primary sequence patterns from sets of related protein sequences.

R F Smith, T F Smith.

Abstract

We have developed a computer algorithm that can extract the pattern of conserved primary sequence elements common to all members of a homologous protein family. The method involves clustering the pairwise similarity scores among a set of related sequences to generate a binary dendrogram (tree). The tree is then reduced in a stepwise manner by progressively replacing the node connecting the two most similar termini by one common pattern until only a single common "root" pattern remains. A pattern is generated at a node by (i) performing a local optimal alignment on the sequence/pattern pair connected by the node with the use of an extended dynamic programming algorithm and then (ii) constructing a single common pattern from this alignment with a nested hierarchy of amino acid classes to identify the minimal inclusive amino acid class covering each paired set of elements in the alignment. Gaps within an alignment are created and/or extended using a "pay once" gap penalty rule, and gapped positions are converted into gap characters that function as 0 or 1 amino acid of any type during subsequent alignment. This method has been used to generate a library of covering patterns for homologous families in the National Biomedical Research Foundation/Protein Identification Resource protein sequence data base. We show that a covering pattern can be more diagnostic for sequence family membership than any of the individual sequences used to construct the pattern.
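Step (ii) of the abstract, building one common pattern from an aligned pair by choosing the minimal inclusive amino acid class at each position, can be illustrated with a minimal sketch. The class hierarchy below is a toy assumption made up for this example, not the hierarchy used in the paper, and the names `PARENT`, `ancestors`, `minimal_cover`, and `covering_pattern` are hypothetical. The alignment is taken as given; the dynamic programming step and the "pay once" gap penalty are not shown.

```python
# Toy nested hierarchy (an assumption for illustration only, NOT the published classes).
# Every residue or class maps to its immediate enclosing class; "any" is the root.
PARENT = {
    "D": "acidic", "E": "acidic",
    "K": "basic", "R": "basic", "H": "basic",
    "acidic": "charged", "basic": "charged",
    "S": "polar", "T": "polar", "N": "polar", "Q": "polar",
    "charged": "hydrophilic", "polar": "hydrophilic",
    "I": "aliphatic", "L": "aliphatic", "V": "aliphatic", "M": "aliphatic",
    "F": "aromatic", "W": "aromatic", "Y": "aromatic",
    "aliphatic": "hydrophobic", "aromatic": "hydrophobic",
    "hydrophilic": "any", "hydrophobic": "any",
    "A": "any", "C": "any", "G": "any", "P": "any",
}

GAP = "-"  # stands for both an alignment gap and a pattern gap character


def ancestors(element):
    """Return the element followed by every class that encloses it, smallest first."""
    chain = [element]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def minimal_cover(a, b):
    """Smallest class in the toy hierarchy containing both elements.

    A position that is gapped on either side becomes a gap character.
    """
    if a == GAP or b == GAP:
        return GAP
    enclosing_b = set(ancestors(b))
    for candidate in ancestors(a):
        if candidate in enclosing_b:
            return candidate
    return "any"  # fallback; not reached while the hierarchy has a single root


def covering_pattern(row_a, row_b):
    """Combine two aligned rows (sequences or patterns) into one covering pattern."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    return [minimal_cover(x, y) for x, y in zip(row_a, row_b)]


# Two aligned sequences are merged, then the result is merged with a third,
# mimicking one reduction step higher up the tree.
first = covering_pattern("DKL-", "ERIS")
second = covering_pattern(first, "KDFT")
print(first)   # ['acidic', 'basic', 'aliphatic', '-']
print(second)  # ['charged', 'charged', 'hydrophobic', '-']
```

Because a pattern is just a list of residues, class names, and gap characters, the same function can combine sequence with sequence, sequence with pattern, or pattern with pattern, which is what lets the tree be reduced node by node until a single root pattern remains.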

Year: 1990 | PMID: 2296575 | PMCID: PMC53211 | DOI: 10.1073/pnas.87.1.118

Source DB: PubMed | Journal: Proc Natl Acad Sci U S A | ISSN: 0027-8424 | Impact factor: 11.205


