A reliable sequence alignment method based on probabilities of residue correspondences.

S Miyazawa.

Abstract

Probabilities of all possible correspondences of residues in aligning two proteins are evaluated by assuming that the statistical weight of each alignment is proportional to the exponential of its total similarity score. Based on these probabilities, a probability alignment that includes the most probable correspondences is proposed. For highly similar sequence pairs, the probability alignments agree with the maximum similarity alignments, that is, the alignments with the maximum similarity score. Significant correspondences in the probability alignments are those whose probabilities are > 0.5. The probability alignment method is applied to a few protein pairs, and the results indicate that such highly probable correspondences are likely to be correct correspondences that agree with the structural alignments, and that incorrect correspondences in the maximum similarity alignments are usually insignificant correspondences in the probability alignments. Consistent with this, the root mean square deviations in superimposition of corresponding residues tend to be smaller for significant correspondences in the probability alignments than for all correspondences in the maximum similarity alignments. The same tendency is confirmed in 109 protein pairs with sequence identities between 35 and 90%. In addition, the probability alignment method may predict correct correspondences better than the maximum similarity alignment method. Probability alignments do, of course, depend on the scoring scheme, but they are less sensitive to the values of parameters such as gap penalties. The probability alignment method is useful for constructing reliable alignments based on the probabilities of correspondences and can be used with any scoring scheme.
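The weighting described in the abstract can be illustrated with a small forward/backward sum over global alignments. The sketch below is not the paper's implementation: the function names, the toy match/mismatch score, the linear gap penalty, and the `temperature` scale factor are all assumptions made for illustration. It also treats every path through the alignment grid as a separate alignment, so adjacent insertions and deletions in different orders are counted twice.

```python
import math


def simple_score(x, y):
    """Toy similarity score (an assumption, not a real substitution matrix)."""
    return 2.0 if x == y else -1.0


def match_probabilities(a, b, score=simple_score, gap=2.0, temperature=1.0):
    """Return P[i][j], the probability that a[i] corresponds to b[j].

    Each global alignment gets weight exp(total_score / temperature);
    P[i][j] is the weight of all alignments pairing a[i] with b[j],
    divided by the weight of all alignments.
    """
    n, m = len(a), len(b)
    w_gap = math.exp(-gap / temperature)  # weight of one gapped residue
    w = [[math.exp(score(x, y) / temperature) for y in b] for x in a]

    # fwd[i][j]: summed weight of all partial alignments of a[:i] with b[:j].
    fwd = [[0.0] * (m + 1) for _ in range(n + 1)]
    fwd[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                fwd[i][j] += fwd[i - 1][j - 1] * w[i - 1][j - 1]
            if i > 0:
                fwd[i][j] += fwd[i - 1][j] * w_gap
            if j > 0:
                fwd[i][j] += fwd[i][j - 1] * w_gap

    # bwd[i][j]: summed weight of all partial alignments of a[i:] with b[j:].
    bwd = [[0.0] * (m + 1) for _ in range(n + 1)]
    bwd[n][m] = 1.0
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i < n and j < m:
                bwd[i][j] += bwd[i + 1][j + 1] * w[i][j]
            if i < n:
                bwd[i][j] += bwd[i + 1][j] * w_gap
            if j < m:
                bwd[i][j] += bwd[i][j + 1] * w_gap

    z = fwd[n][m]  # total weight of all alignments
    return [
        [fwd[i][j] * w[i][j] * bwd[i + 1][j + 1] / z for j in range(m)]
        for i in range(n)
    ]


def significant_pairs(probs, threshold=0.5):
    """List (i, j, p) for correspondences whose probability exceeds threshold."""
    return [
        (i, j, p)
        for i, row in enumerate(probs)
        for j, p in enumerate(row)
        if p > threshold
    ]
```

Because each residue pairs with at most one partner in any single alignment, each row of the returned matrix sums to at most 1, so at most one correspondence per residue can exceed the 0.5 threshold. For sequences much longer than a few dozen residues, the sums would need to be carried out in log space or with scaling to avoid overflow.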

Year:  1995        PMID: 8771180     DOI: 10.1093/protein/8.10.999

Source DB:  PubMed          Journal:  Protein Eng        ISSN: 0269-2139


  31 in total

1.  A range of complex probabilistic models for RNA secondary structure prediction that includes the nearest-neighbor model and more.

Authors:  Elena Rivas; Raymond Lang; Sean R Eddy
Journal:  RNA       Date:  2011-12-22       Impact factor: 4.942

Review 2.  A classification of bioinformatics algorithms from the viewpoint of maximizing expected accuracy (MEA).

Authors:  Michiaki Hamada; Kiyoshi Asai
Journal:  J Comput Biol       Date:  2012-02-07       Impact factor: 1.479

3.  ProbCons: Probabilistic consistency-based multiple sequence alignment.

Authors:  Chuong B Do; Mahathi S P Mahabhashyam; Michael Brudno; Serafim Batzoglou
Journal:  Genome Res       Date:  2005-02       Impact factor: 9.043

4.  Centroid estimation in discrete high-dimensional spaces with applications in biology.

Authors:  Luis E Carvalho; Charles E Lawrence
Journal:  Proc Natl Acad Sci U S A       Date:  2008-02-27       Impact factor: 11.205

5.  Effect of using suboptimal alignments in template-based protein structure prediction.

Authors:  Hao Chen; Daisuke Kihara
Journal:  Proteins       Date:  2011-01

6.  Parameters for accurate genome alignment.

Authors:  Martin C Frith; Michiaki Hamada; Paul Horton
Journal:  BMC Bioinformatics       Date:  2010-02-09       Impact factor: 3.169

7.  Improving pairwise sequence alignment accuracy using near-optimal protein sequence alignments.

Authors:  Michael L Sierk; Michael E Smoot; Ellen J Bass; William R Pearson
Journal:  BMC Bioinformatics       Date:  2010-03-22       Impact factor: 3.169

8.  Unifying evolutionary and thermodynamic information for RNA folding of multiple alignments.

Authors:  Stefan E Seemann; Jan Gorodkin; Rolf Backofen
Journal:  Nucleic Acids Res       Date:  2008-10-04       Impact factor: 16.971

9.  Exact calculation of distributions on integers, with application to sequence alignment.

Authors:  Lee A Newberg; Charles E Lawrence
Journal:  J Comput Biol       Date:  2009-01       Impact factor: 1.479

10.  Genome-wide searching with base-pairing kernel functions for noncoding RNAs: computational and expression analysis of snoRNA families in Caenorhabditis elegans.

Authors:  Kensuke Morita; Yutaka Saito; Kengo Sato; Kotaro Oka; Kohji Hotta; Yasubumi Sakakibara
Journal:  Nucleic Acids Res       Date:  2009-01-07       Impact factor: 16.971
