Multiple DNA and protein sequence alignment based on segment-to-segment comparison.

B Morgenstern, A Dress, T Werner.

Abstract

In this paper, a new way to think about, and to construct, pairwise as well as multiple alignments of DNA and protein sequences is proposed. Rather than forcing alignments to either align single residues or to introduce gaps by defining an alignment as a path running right from the source up to the sink in the associated dot-matrix diagram, we propose to consider alignments as consistent equivalence relations defined on the set of all positions occurring in all sequences under consideration. We also propose constructing alignments from whole segments exhibiting highly significant overall similarity rather than by aligning individual residues. Consequently, we present an alignment algorithm that (i) is based on segment-to-segment comparison instead of the commonly used residue-to-residue comparison and which (ii) avoids the well-known difficulties concerning the choice of appropriate gap penalties: gaps are not treated explicitly, but remain as those parts of the sequences that do not belong to any of the aligned segments. Finally, we discuss the application of our algorithm to two test examples and compare it with commonly used alignment methods. As a first example, we aligned a set of 11 DNA sequences coding for functional helix-loop-helix proteins. Though the sequences show only low overall similarity, our program correctly aligned all of the 11 functional sites, which was a unique result among the methods tested. As a by-product, the reading frames of the sequences were identified. Next, we aligned a set of ribonuclease H proteins and compared our results with alignments produced by other programs as reported by McClure et al. [McClure, M. A., Vasi, T. K. & Fitch, W. M. (1994) Mol. Biol. Evol. 11, 571-592]. Our program was one of the best scoring programs. However, in contrast to other methods, our protein alignments are independent of user-defined parameters.
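The segment-to-segment idea described in the abstract can be illustrated with a minimal pairwise sketch in Python. It is not the authors' program: the scoring (+1 per match, -1 per mismatch), the thresholds `min_len`, `max_len` and `min_score`, and the names `Segment`, `candidate_segments`, `chain_segments` and `align` are all assumptions made for illustration, and the abstract does not say how segment similarity is actually weighted. The sketch collects gap-free segment pairs, then picks the highest-scoring set of pairs that can coexist in one alignment. No gap penalty appears anywhere; unaligned residues are simply what is left over.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    i: int       # start position in sequence a
    j: int       # start position in sequence b
    length: int  # both segments have this length (no internal gaps)
    score: int   # toy weight: matches minus mismatches (an assumption)


def candidate_segments(a: str, b: str, min_len: int = 3,
                       max_len: int = 8, min_score: int = 3) -> List[Segment]:
    """Enumerate gap-free segment pairs whose toy score passes a threshold."""
    segs = []
    for i in range(len(a)):
        for j in range(len(b)):
            score = 0
            for k in range(1, max_len + 1):
                if i + k > len(a) or j + k > len(b):
                    break
                score += 1 if a[i + k - 1] == b[j + k - 1] else -1
                if k >= min_len and score >= min_score:
                    segs.append(Segment(i, j, k, score))
    return segs


def chain_segments(segs: List[Segment]) -> List[Segment]:
    """Pick the best-scoring set of segment pairs that do not overlap or cross."""
    if not segs:
        return []
    # Any valid predecessor ends strictly earlier in a, so it sorts first.
    segs = sorted(segs, key=lambda s: (s.i + s.length, s.j + s.length))
    best = [s.score for s in segs]   # best chain score ending at each segment
    prev = [-1] * len(segs)          # back-pointers for the traceback
    for n, s in enumerate(segs):
        for m in range(n):
            t = segs[m]
            fits = t.i + t.length <= s.i and t.j + t.length <= s.j
            if fits and best[m] + s.score > best[n]:
                best[n] = best[m] + s.score
                prev[n] = m
    n = max(range(len(segs)), key=best.__getitem__)
    chain = []
    while n != -1:
        chain.append(segs[n])
        n = prev[n]
    return chain[::-1]


def align(a: str, b: str) -> List[Segment]:
    """Pairwise alignment as a chain of segment pairs; gaps are never scored."""
    return chain_segments(candidate_segments(a, b))


if __name__ == "__main__":
    a, b = "HELLOxxWORLD", "HELLOyyyyWORLD"
    for s in align(a, b):
        print(a[s.i:s.i + s.length], b[s.j:s.j + s.length], s)
```

For the toy inputs in the demo, the chain consists of the two shared blocks `HELLO` and `WORLD`; the unrelated stretches `xx` and `yyyy` stay outside every segment, which is how gaps arise without a penalty. Extending this to multiple sequences would additionally require checking that segment pairs from different sequence pairs are mutually consistent, which the sketch does not attempt.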

Year:  1996        PMID: 8901539      PMCID: PMC37949          DOI: 10.1073/pnas.93.22.12098

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


References (17 in total)

1.  CLUSTAL V: improved software for multiple sequence alignment.

Authors:  D G Higgins; A J Bleasby; R Fuchs
Journal:  Comput Appl Biosci       Date:  1992-04

2.  Sensitivity comparison of protein amino acid sequences.

Authors:  P Argos; M Vingron
Journal:  Methods Enzymol       Date:  1990       Impact factor: 1.600

3.  Fast and sensitive multiple sequence alignments on a microcomputer.

Authors:  D G Higgins; P M Sharp
Journal:  Comput Appl Biosci       Date:  1989-04

4.  A flexible multiple sequence alignment program.

Authors:  H M Martinez
Journal:  Nucleic Acids Res       Date:  1988-03-11       Impact factor: 16.971

5.  Weighting in sequence space: a comparison of methods in terms of generalized sequences.

Authors:  M Vingron; P R Sibbald
Journal:  Proc Natl Acad Sci U S A       Date:  1993-10-01       Impact factor: 11.205

6.  Sequence alignment and penalty choice. Review of concepts, case studies and implications. (Review)

Authors:  M Vingron; M S Waterman
Journal:  J Mol Biol       Date:  1994-01-07       Impact factor: 5.469

7.  Similarities between protein 3-D structures.

Authors:  U Lessel; D Schomburg
Journal:  Protein Eng       Date:  1994-10

8.  Optimal alignment between groups of sequences and its application to multiple sequence alignment.

Authors:  O Gotoh
Journal:  Comput Appl Biosci       Date:  1993-06

9.  Comparative analysis of multiple protein-sequence alignment methods.

Authors:  M A McClure; T K Vasi; W M Fitch
Journal:  Mol Biol Evol       Date:  1994-07       Impact factor: 16.240

10.  Discrimination between related DNA sites by a single amino acid residue of Myc-related basic-helix-loop-helix proteins.

Authors:  C V Dang; C Dolde; M L Gillison; G J Kato
Journal:  Proc Natl Acad Sci U S A       Date:  1992-01-15       Impact factor: 11.205

Cited by (65 in total)

1.  Structure-function analysis of yeast hexokinase: structural requirements for triggering cAMP signalling and catabolite repression.

Authors:  L S Kraakman; J Winderickx; J M Thevelein; J H De Winde
Journal:  Biochem J       Date:  1999-10-01       Impact factor: 3.857

2.  First pass annotation of promoters on human chromosome 22.

Authors:  M Scherf; A Klingenhoff; K Frech; K Quandt; R Schneider; K Grote; M Frisch; V Gailus-Durner; A Seidel; R Brack-Werner; T Werner
Journal:  Genome Res       Date:  2001-03       Impact factor: 9.043

3.  Toward a comprehensive phylogeny for mammalian and avian herpesviruses.

Authors:  D J McGeoch; A Dolan; A C Ralph
Journal:  J Virol       Date:  2000-11       Impact factor: 5.103

4.  DbClustal: rapid and reliable global multiple alignments of protein sequences detected by database searches.

Authors:  J D Thompson; F Plewniak; J Thierry; O Poch
Journal:  Nucleic Acids Res       Date:  2000-08-01       Impact factor: 16.971

5.  In silico prediction of scaffold/matrix attachment regions in large genomic sequences.

Authors:  Matthias Frisch; Kornelie Frech; Andreas Klingenhoff; Kerstin Cartharius; Ines Liebich; Thomas Werner
Journal:  Genome Res       Date:  2002-02       Impact factor: 9.043

6.  Processed pseudogenes of human endogenous retroviruses generated by LINEs: their integration, stability, and distribution.

Authors:  Adam Pavlícek; Jan Paces; Daniel Elleder; Jirí Hejnar
Journal:  Genome Res       Date:  2002-03       Impact factor: 9.043

7.  MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform.

Authors:  Kazutaka Katoh; Kazuharu Misawa; Kei-ichi Kuma; Takashi Miyata
Journal:  Nucleic Acids Res       Date:  2002-07-15       Impact factor: 16.971

8.  The CHAOS/DIALIGN WWW server for multiple alignment of genomic sequences.

Authors:  Michael Brudno; Rasmus Steinkamp; Burkhard Morgenstern
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

9.  DIALIGN: multiple DNA and protein sequence alignment at BiBiServ.

Authors:  Burkhard Morgenstern
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

10.  Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs.

Authors:  Bastien Chevreux; Thomas Pfisterer; Bernd Drescher; Albert J Driesel; Werner E G Müller; Thomas Wetter; Sándor Suhai
Journal:  Genome Res       Date:  2004-05-12       Impact factor: 9.043
