Hierarchical method to align large numbers of biological sequences.

W R Taylor.   

Abstract

The method presented here is intended as a compromise between finding a good overall alignment and the time taken to do so. Many multiple alignment algorithms spend an excessively large amount of effort trying to find the best global alignment. This time is often ill spent because the results of the standard dynamic programming alignment algorithm are dominated by the choice of gap penalty and the form of the score matrix, both of which have a poor theoretical foundation. Nonetheless, it is important that savings in time do not compromise the quality of the alignment. By using the consensus sequence approach, this danger is largely avoided as the conserved features of the sequences are quickly identified and preserved through further cycles. In the alignment of existing alignments, which is one of the more novel aspects of the method, each alignment was treated as an averaged consensus sequence with gaps making no contribution. This gives rise to the advantageous property that gaps will have a greater propensity to be inserted where there are already gaps and is equivalent to a local change in the gap penalty. This type of behavior represents a transition away from the homogeneous scoring schemes used in aligning two sequences toward a scoring scheme that depends on position in the sequence. The alignment of consensus sequences thus forms a bridge between simple pair alignment and the alignment of discrete patterns in which sequence features and allowed gap locations are exaggerated. To complete this transition the program described above has been integrated into the earlier pattern matching (template) program. Such templates can reliably locate sequence similarities that are too weak or scattered to be found by the more standard alignment methods and should therefore produce a further condensation of the sequence data bank. 
Only by continually extending our knowledge of the relationships between sequences to increasingly distant similarities can we hope to avoid being overwhelmed by the increasing amount of data.
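The alignment of existing alignments described in the abstract can be illustrated with a minimal Python sketch. It is not the author's program. It assumes a toy match/mismatch score in place of a real score matrix, a linear gap penalty, and a column score averaged over all residue pairs with gapped pairs contributing zero. The names `column_score` and `align_alignments` are hypothetical.

```python
def column_score(col_a, col_b, match=1.0, mismatch=-1.0, gap="-"):
    """Average pairwise score between two alignment columns.

    Assumption: pairs involving a gap add nothing to the sum but still
    count in the denominator, so gapped columns score closer to zero.
    """
    total = 0.0
    for a in col_a:
        for b in col_b:
            if a == gap or b == gap:
                continue  # gaps make no contribution
            total += match if a == b else mismatch
    return total / (len(col_a) * len(col_b))


def align_alignments(aln_a, aln_b, gap_penalty=1.0, gap="-"):
    """Globally align two alignments column by column (toy dynamic programming)."""
    cols_a = list(zip(*aln_a))  # each column is a tuple of residues
    cols_b = list(zip(*aln_b))
    n, m = len(cols_a), len(cols_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0], move[i][0] = -gap_penalty * i, "up"
    for j in range(1, m + 1):
        score[0][j], move[0][j] = -gap_penalty * j, "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = column_score(cols_a[i - 1], cols_b[j - 1], gap=gap)
            options = [
                (score[i - 1][j - 1] + pair, "diag"),
                (score[i - 1][j] - gap_penalty, "up"),
                (score[i][j - 1] - gap_penalty, "left"),
            ]
            score[i][j], move[i][j] = max(options, key=lambda o: o[0])

    # Trace back from the bottom-right corner, padding with gap columns.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "diag":
            out_a.append(cols_a[i - 1])
            out_b.append(cols_b[j - 1])
            i, j = i - 1, j - 1
        elif step == "up":
            out_a.append(cols_a[i - 1])
            out_b.append((gap,) * len(aln_b))
            i -= 1
        else:
            out_a.append((gap,) * len(aln_a))
            out_b.append(cols_b[j - 1])
            j -= 1
    out_a.reverse()
    out_b.reverse()
    merged = [a + b for a, b in zip(out_a, out_b)]
    rows = ["".join(col[k] for col in merged) for k in range(len(aln_a) + len(aln_b))]
    return rows, score[n][m]


if __name__ == "__main__":
    rows, total = align_alignments(["ACGT", "A-GT"], ["AGT"])
    print("\n".join(rows), total)
```

Under these assumptions a column that already contains gaps scores closer to zero against any other column, so skipping it costs relatively less. This is one way to read the abstract's remark that the effect is equivalent to a local change in the gap penalty.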

Year:  1990        PMID: 2156130     DOI: 10.1016/0076-6879(90)83031-4

Source DB:  PubMed          Journal:  Methods Enzymol        ISSN: 0076-6879            Impact factor:   1.600


  8 in total

1.  Multiple alignment using simulated annealing: branch point definition in human mRNA splicing.

Authors:  A V Lukashin; J Engelbrecht; S Brunak
Journal:  Nucleic Acids Res       Date:  1992-05-25       Impact factor: 16.971

2.  A comparison of several similarity indices used in the classification of protein sequences: a multivariate analysis.

Authors:  C Landès; A Hénaut; J L Risler
Journal:  Nucleic Acids Res       Date:  1992-07-25       Impact factor: 16.971

3.  Simple chained guide trees give high-quality protein multiple sequence alignments.

Authors:  Kieran Boyce; Fabian Sievers; Desmond G Higgins
Journal:  Proc Natl Acad Sci U S A       Date:  2014-07-07       Impact factor: 11.205

4.  Predicted structure of the extracellular region of ligand-gated ion-channel receptors shows SH2-like and SH3-like domains forming the ligand-binding site.

Authors:  J E Gready; S Ranganathan; P R Schofield; Y Matsuo; K Nishikawa
Journal:  Protein Sci       Date:  1997-05       Impact factor: 6.725

5.  Identity of the putative serine-proteinase fold in proteins of the complement system with nine relevant crystal structures.

Authors:  S J Perkins; K F Smith
Journal:  Biochem J       Date:  1993-10-01       Impact factor: 3.857

6.  Multiple protein structure alignment.

Authors:  W R Taylor; T P Flores; C A Orengo
Journal:  Protein Sci       Date:  1994-10       Impact factor: 6.725

7.  Modular arrangement of proteins as inferred from analysis of homology.

Authors:  E L Sonnhammer; D Kahn
Journal:  Protein Sci       Date:  1994-03       Impact factor: 6.725

8.  Using 3D Hidden Markov Models that explicitly represent spatial coordinates to model and compare protein structures.

Authors:  Vadim Alexandrov; Mark Gerstein
Journal:  BMC Bioinformatics       Date:  2004-01-09       Impact factor: 3.169
