
Pair hidden Markov models on tree structures.

Yasubumi Sakakibara

Abstract

MOTIVATION: Computationally identifying non-coding RNA regions in the genome leaves much scope for investigation and is inherently harder than gene finding for protein-coding regions. Since comparative sequence analysis is effective for non-coding RNA detection, efficient computational methods are needed for structural alignments of RNA sequences. Hidden Markov models (HMMs), meanwhile, have played important roles in modeling and analysing biological sequences. In particular, pair HMMs (PHMMs) have been examined extensively as mathematical models for alignment and gene finding.
RESULTS: We propose pair HMMs on tree structures (PHMMTSs), an extension of PHMMs defined on alignments of trees that provides a unifying framework and an automata-theoretic model for alignments of trees, structural alignments and pair stochastic context-free grammars. By structural alignment, we mean a pairwise alignment that aligns an unfolded RNA sequence to an RNA sequence of known secondary structure. First, we extend the notion of PHMMs defined on alignments of 'linear' sequences to pair stochastic tree automata, called PHMMTSs, defined on alignments of 'trees'. PHMMTSs support various types of tree alignments, such as affine-gap alignments of trees, and provide an automata-theoretic model for the alignment of trees. Second, based on the observation that a secondary structure of RNA can be represented by a tree, we apply PHMMTSs to the problem of structural alignment of RNAs. We modify PHMMTSs so that they take as input a pair consisting of a 'linear' sequence and a 'tree' representing a secondary structure of RNA, and produce a structural alignment. Further, PHMMTSs that take a pair of linear sequences as input are mathematically equivalent to pair stochastic context-free grammars. We present computational experiments that show the effectiveness of our method for structural alignments, and discuss a complexity issue of PHMMTSs.
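
The observation that an RNA secondary structure can be represented by a tree can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the structure is given in dot-bracket notation (an assumption, since the abstract does not name an input format), and the names `Node`, `structure_to_tree` and `to_sexpr` are hypothetical. Each base pair becomes an internal node and each unpaired base becomes a leaf.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One node of the structure tree (hypothetical representation)."""
    label: str                      # "root", a base pair such as "G-C", or a single base
    children: List["Node"] = field(default_factory=list)


def structure_to_tree(seq: str, dot_bracket: str) -> Node:
    """Build a tree from a sequence and its dot-bracket structure.

    Assumption: '(' and ')' mark paired bases and '.' marks unpaired bases.
    """
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure must have equal length")
    root = Node("root")
    # Stack of (open node, index of its opening base); the root has no base.
    stack: List[Tuple[Node, int]] = [(root, -1)]
    for i, sym in enumerate(dot_bracket):
        parent = stack[-1][0]
        if sym == "(":
            node = Node("")         # label is filled in when the pair closes
            parent.children.append(node)
            stack.append((node, i))
        elif sym == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced structure: extra ')'")
            node, j = stack.pop()
            node.label = f"{seq[j]}-{seq[i]}"
        elif sym == ".":
            parent.children.append(Node(seq[i]))
        else:
            raise ValueError(f"unexpected symbol {sym!r}")
    if len(stack) != 1:
        raise ValueError("unbalanced structure: missing ')'")
    return root


def to_sexpr(node: Node) -> str:
    """Render the tree as a nested expression for inspection."""
    if not node.children:
        return node.label
    return "(" + " ".join([node.label] + [to_sexpr(c) for c in node.children]) + ")"


tree = structure_to_tree("GGAAACC", "((...))")
print(to_sexpr(tree))
```

A tree of this kind, paired with a linear sequence, is the sort of input that the abstract describes for structural alignment; the alignment model itself is not sketched here.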

Year:  2003        PMID: 12855464     DOI: 10.1093/bioinformatics/btg1032

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  8 in total

1.  PSSMTS: position specific scoring matrices on tree structures.

Authors:  Kengo Sato; Kensuke Morita; Yasubumi Sakakibara
Journal:  J Math Biol       Date:  2007-07-07       Impact factor: 2.259

2.  Hidden Markov Models and their Applications in Biological Sequence Analysis.

Authors:  Byung-Jun Yoon
Journal:  Curr Genomics       Date:  2009-09       Impact factor: 2.236

3.  Informatic resources for identifying and annotating structural RNA motifs. (Review)

Authors:  Ajish D George; Scott A Tenenbaum
Journal:  Mol Biotechnol       Date:  2008-11-01       Impact factor: 2.695

4.  Structator: fast index-based search for RNA sequence-structure patterns.

Authors:  Fernando Meyer; Stefan Kurtz; Rolf Backofen; Sebastian Will; Michael Beckstette
Journal:  BMC Bioinformatics       Date:  2011-05-27       Impact factor: 3.169

5.  Evolutionary triplet models of structured RNA.

Authors:  Robert K Bradley; Ian Holmes
Journal:  PLoS Comput Biol       Date:  2009-08-28       Impact factor: 4.475

6.  Directed acyclic graph kernels for structural RNA analysis.

Authors:  Kengo Sato; Toutai Mituyama; Kiyoshi Asai; Yasubumi Sakakibara
Journal:  BMC Bioinformatics       Date:  2008-07-22       Impact factor: 3.169

7.  Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization.

Authors:  Markus Bauer; Gunnar W Klau; Knut Reinert
Journal:  BMC Bioinformatics       Date:  2007-07-27       Impact factor: 3.169

8.  Software.ncrna.org: web servers for analyses of RNA sequences.

Authors:  Kiyoshi Asai; Hisanori Kiryu; Michiaki Hamada; Yasuo Tabei; Kengo Sato; Hiroshi Matsui; Yasubumi Sakakibara; Goro Terai; Toutai Mituyama
Journal:  Nucleic Acids Res       Date:  2008-04-25       Impact factor: 16.971

