
Sequence alignment as hypothesis testing.

Lu Meng, Fengzhu Sun, Xuegong Zhang, Michael S Waterman.

Abstract

Sequence alignment depends on the scoring function that defines similarity between pairs of letters. For local alignment, the computational algorithm searches for the most similar segments in the sequences according to the scoring function. The choice of this scoring function is important for correctly detecting segments of interest. We formulate sequence alignment as a hypothesis testing problem, and conduct extensive simulation experiments to study the relationship between the scoring function and the distribution of aligned pairs within the aligned segment under this framework. We cut through the many ways to construct scoring functions and show that any scoring function with negative expectation used in local alignment corresponds to a hypothesis test between the background distribution of sequence letters and a statistical distribution of letter pairs determined by the scoring function. The results indicate that the log-likelihood ratio scoring function is the statistically most powerful choice and has the highest accuracy for detecting the segments of interest that are defined by the statistical distribution of aligned letter pairs.
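The correspondence between a scoring function and a distribution of letter pairs can be made concrete with the standard relation from ungapped local alignment statistics: for background frequencies p and scores s(a, b) with negative expected value and at least one positive entry, there is a unique λ > 0 with Σ p_a p_b exp(λ s(a, b)) = 1, and q(a, b) = p_a p_b exp(λ s(a, b)) is then the pair distribution the scores implicitly test for. The sketch below is illustrative only and is not code from the paper: the function name `implied_pair_distribution`, the uniform DNA background and the +1/-1 match/mismatch scores are all assumptions chosen for the example, and gaps are ignored.

```python
import math


def implied_pair_distribution(scores, background, tol=1e-12):
    """Solve sum_ab p_a * p_b * exp(lam * s_ab) = 1 for lam > 0.

    Returns (lam, q) with q[a][b] = p_a * p_b * exp(lam * s_ab).
    Hypothetical helper for illustration; assumes ungapped alignment
    and independent background letters.
    """
    letters = list(background)
    pairs = [(a, b) for a in letters for b in letters]

    # The relation only has a positive root when the expected score under
    # the background is negative and some pair scores positively.
    expected = sum(background[a] * background[b] * scores[a][b] for a, b in pairs)
    if expected >= 0:
        raise ValueError("expected score per pair must be negative")
    if all(scores[a][b] <= 0 for a, b in pairs):
        raise ValueError("at least one score must be positive")

    def excess(lam):
        # Convex in lam, zero at lam = 0, negative just above 0.
        return sum(background[a] * background[b] * math.exp(lam * scores[a][b])
                   for a, b in pairs) - 1.0

    # Bracket the positive root, then bisect.
    hi = 1.0
    while excess(hi) <= 0.0:
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    lam = 0.5 * (lo + hi)

    q = {a: {b: background[a] * background[b] * math.exp(lam * scores[a][b])
             for b in letters} for a in letters}
    return lam, q


# Toy scheme (an assumption for this example): uniform DNA background,
# +1 for a match and -1 for a mismatch.
background = {x: 0.25 for x in "ACGT"}
scores = {a: {b: (1 if a == b else -1) for b in "ACGT"} for a in "ACGT"}
lam, q = implied_pair_distribution(scores, background)
match_mass = sum(q[x][x] for x in "ACGT")
```

For this toy scheme the equation reduces to 0.25·e^λ + 0.75·e^(−λ) = 1, so λ = ln 3 and the implied distribution places three quarters of its probability on identical pairs. Read in the other direction, s(a, b) = ln(q(a, b) / (p_a p_b)) / λ, which is the log-likelihood ratio form of scoring that the abstract refers to.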

Year: 2011    PMID: 21554016    PMCID: PMC3122928    DOI: 10.1089/cmb.2010.0328

Source DB: PubMed    Journal: J Comput Biol    ISSN: 1066-5277    Impact factor: 1.479


