Embedding strategies for effective use of information from multiple sequence alignments.

S Henikoff, J G Henikoff.

Abstract

We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version of the Smith-Waterman searching algorithm. Moreover, embedding consensus residues instead of PSSMs improved performance with readily available single sequence query searching programs, such as BLAST and FASTA. Embedding PSSMs or consensus residues into a representative sequence improves searching performance by extracting multiple alignment information from motif regions while retaining single sequence information where alignment is uncertain.
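The consensus-embedding idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `column_consensus` and `embed_consensus`, the toy sequence, and the toy blocks are all hypothetical, and the consensus here is a simple per-column majority vote, which is an assumption, since the abstract does not say how consensus residues are derived from the alignments. Block positions are assumed to be known 0-based offsets into the representative sequence.

```python
from collections import Counter


def column_consensus(segments):
    """Return the most common residue in each column of equal-length aligned segments.

    Majority vote is an illustrative assumption, not the paper's stated method.
    """
    width = len(segments[0])
    if any(len(s) != width for s in segments):
        raise ValueError("all segments in a block must have the same length")
    consensus = []
    for column in zip(*segments):
        counts = Counter(column)
        # Break ties alphabetically so the result is deterministic.
        best = min(counts, key=lambda residue: (-counts[residue], residue))
        consensus.append(best)
    return "".join(consensus)


def embed_consensus(representative, blocks):
    """Replace conserved regions of `representative` with block consensus residues.

    `blocks` is a list of (start, segments) pairs, where `start` is the 0-based
    offset of the conserved region in the representative sequence. Residues
    outside the blocks are left unchanged.
    """
    residues = list(representative)
    for start, segments in blocks:
        consensus = column_consensus(segments)
        end = start + len(consensus)
        if start < 0 or end > len(residues):
            raise ValueError("block falls outside the representative sequence")
        residues[start:end] = consensus
    return "".join(residues)


# Toy data, invented for illustration only.
representative = "MKTAYIAKQRQISFVKSHFSRQ"
blocks = [
    (3, ["AYIAK", "AYLAK", "GYLAK"]),
    (12, ["SFVK", "TFVR", "TFVK"]),
]
embedded = embed_consensus(representative, blocks)
print(embedded)  # MKTAYLAKQRQITFVKSHFSRQ
```

The result is still an ordinary single sequence, so in principle it could be passed as a query to a single-sequence search program such as BLAST or FASTA. Embedding PSSMs instead would mean keeping a full score column at each conserved position, which, as the abstract notes, requires a special version of the Smith-Waterman algorithm.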

Year:  1997        PMID: 9070452      PMCID: PMC2143675          DOI: 10.1002/pro.5560060319

Source DB:  PubMed          Journal:  Protein Sci        ISSN: 0961-8368            Impact factor:   6.725


