J S Liu1, C E Lawrence. 1. Department of Statistics, Stanford University, Stanford, CA, USA. jliu@stat.stabford.edu
Abstract
MOTIVATION: Most existing bioinformatics methods are limited to making point estimates of one variable, e.g. the optimal alignment, with fixed input values for all other variables, e.g. gap penalties and scoring matrices. While the requirement to specify parameters remains one of the more vexing issues in bioinformatics, it is a reflection of a larger issue: the need to broaden the view on statistical inference in bioinformatics. RESULTS: The assignment of probabilities for all possible values of all unknown variables in a problem in the form of a posterior distribution is the goal of Bayesian inference. Here we show how this goal can be achieved for most bioinformatics methods that use dynamic programming. Specifically, a tutorial style description of a Bayesian inference procedure for segmentation of a sequence based on the heterogeneity in its composition is given. In addition, full Bayesian inference algorithms for sequence alignment are described. AVAILABILITY: Software and a set of transparencies for a tutorial describing these ideas are available at http://www.wadsworth.org/res&res/bioinfo/
MOTIVATION: Most existing bioinformatics methods are limited to making point estimates of one variable, e.g. the optimal alignment, with fixed input values for all other variables, e.g. gap penalties and scoring matrices. While the requirement to specify parameters remains one of the more vexing issues in bioinformatics, it is a reflection of a larger issue: the need to broaden the view on statistical inference in bioinformatics. RESULTS: The assignment of probabilities for all possible values of all unknown variables in a problem in the form of a posterior distribution is the goal of Bayesian inference. Here we show how this goal can be achieved for most bioinformatics methods that use dynamic programming. Specifically, a tutorial style description of a Bayesian inference procedure for segmentation of a sequence based on the heterogeneity in its composition is given. In addition, full Bayesian inference algorithms for sequence alignment are described. AVAILABILITY: Software and a set of transparencies for a tutorial describing these ideas are available at http://www.wadsworth.org/res&res/bioinfo/
Authors: Pierre Nicolas; Laurent Bize; Florence Muri; Mark Hoebeke; François Rodolphe; S Dusko Ehrlich; Bernard Prum; Philippe Bessières Journal: Nucleic Acids Res Date: 2002-03-15 Impact factor: 16.971
Authors: H Mannila; M Koivisto; M Perola; T Varilo; W Hennah; J Ekelund; M Lukk; L Peltonen; E Ukkonen Journal: Am J Hum Genet Date: 2003-05-20 Impact factor: 11.025
Authors: L McCue; W Thompson; C Carmack; M P Ryan; J S Liu; V Derbyshire; C E Lawrence Journal: Nucleic Acids Res Date: 2001-02-01 Impact factor: 16.971