A note on probabilistic models over strings: the linear algebra approach.

Alexandre Bouchard-Côté.

Abstract

Probabilistic models over strings have played a key role in developing methods that treat indels as phylogenetically informative events. There is an extensive literature on using automata and transducers on phylogenies to do inference on these probabilistic models, in which an important theoretical question is the complexity of computing the normalization of a class of string-valued graphical models. This question has been investigated using tools from combinatorics, dynamic programming, and graph theory, and has practical applications in Bayesian phylogenetics. In this work, we revisit this theoretical question from a different point of view, based on linear algebra. The main contribution is a set of results based on this linear algebra view that facilitate the analysis and design of inference algorithms on string-valued graphical models. As an illustration, we use this method to give a new elementary proof of a known result on the complexity of inference on the "TKF91" model, a well-known probabilistic model over strings. Compared to previous work, our proof technique is easier to extend to other models, since it relies on a novel weak condition, triangular transducers, which is easy to establish in practice. The linear algebra view provides a concise way of describing transducer algorithms and their compositions, and it opens the possibility of transferring fast linear algebra libraries (for example, GPU-based ones), as well as low-rank matrix approximation methods, to string-valued inference problems.
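To make the linear algebra view concrete, the following is a minimal sketch, not the paper's construction. It assumes a small weighted automaton with one transition matrix per symbol, a start vector, and a stop vector; the function name `normalization`, the toy weights, and the two-letter alphabet are all hypothetical. The total weight of all strings is the geometric series start^T (I + M + M^2 + ...) stop, where M is the sum of the per-symbol matrices. When the spectral radius of M is below 1, this equals start^T (I - M)^{-1} stop and can be computed with a single linear solve. The toy M is upper triangular, so its eigenvalues are its diagonal entries and the convergence check is immediate. This only loosely echoes the triangularity condition named in the abstract.

```python
import itertools
import numpy as np

def normalization(start, transitions, stop):
    """Total weight of all finite strings under a weighted automaton.

    Assumes (hypothetically) one square matrix per symbol in `transitions`.
    """
    M = sum(transitions.values())              # marginalize over symbols
    rho = np.max(np.abs(np.linalg.eigvals(M)))  # spectral radius of M
    if rho >= 1.0:
        raise ValueError("geometric series I + M + M^2 + ... diverges")
    # start^T (I - M)^{-1} stop, via a linear solve rather than an inverse
    return float(start @ np.linalg.solve(np.eye(M.shape[0]) - M, stop))

def brute_force(start, transitions, stop, max_len):
    """Sum the weights of every string up to length max_len (for checking)."""
    total = 0.0
    for n in range(max_len + 1):
        for word in itertools.product(transitions, repeat=n):
            v = start
            for symbol in word:
                v = v @ transitions[symbol]
            total += float(v @ stop)
    return total

# Toy two-state automaton over {a, b}; the weights are made up for illustration.
transitions = {
    "a": np.array([[0.2, 0.3], [0.0, 0.1]]),
    "b": np.array([[0.1, 0.1], [0.0, 0.3]]),
}
start = np.array([1.0, 0.0])
stop = np.array([0.3, 0.6])

Z = normalization(start, transitions, stop)
Z_truncated = brute_force(start, transitions, stop, max_len=8)
print(Z, Z_truncated)
```

In this toy example each row of M plus the matching stop weight sums to 1, so the automaton defines a probability distribution over strings and the closed form should agree with the truncated enumeration up to a small tail. The same solve could in principle be handed to an optimized or GPU-backed linear algebra library, which is the kind of transfer the abstract alludes to.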

Year:  2013        PMID: 24135792     DOI: 10.1007/s11538-013-9906-6

Source DB:  PubMed          Journal:  Bull Math Biol        ISSN: 0092-8240            Impact factor:   1.758


  5 in total

1.  Historian: accurate reconstruction of ancestral sequences and evolutionary rates.

Authors:  Ian H Holmes
Journal:  Bioinformatics       Date:  2017-04-15       Impact factor: 6.937

2.  General continuous-time Markov model of sequence evolution via insertions/deletions: are alignment probabilities factorable?

Authors:  Kiyoshi Ezawa
Journal:  BMC Bioinformatics       Date:  2016-08-11       Impact factor: 3.169

3.  Solving the master equation for Indels.

Authors:  Ian H Holmes
Journal:  BMC Bioinformatics       Date:  2017-05-12       Impact factor: 3.169

4.  General continuous-time Markov model of sequence evolution via insertions/deletions: local alignment probability computation.

Authors:  Kiyoshi Ezawa
Journal:  BMC Bioinformatics       Date:  2016-09-27       Impact factor: 3.169

5.  Machine Boss: rapid prototyping of bioinformatic automata.

Authors:  Jordi Silvestre-Ryan; Yujie Wang; Mehak Sharma; Stephen Lin; Yolanda Shen; Shihab Dider; Ian Holmes
Journal:  Bioinformatics       Date:  2021-04-09       Impact factor: 6.931
