Masanori Arita1, Koji Tsuda, Kiyoshi Asai. 1. Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan. m-arita@aist.go.jp
Abstract
MOTIVATION: A new method for finding subtle patterns in sequences is introduced. It approximates the multiple correlations among residuals with pair-wise correlations, with the learning cost O(m(2)n) where n is the number of training sequences, each of length m. The method suits to model splicing sites in human DNA, which are reported to have higher-order dependencies. RESULTS: By computational experiments, the prediction accuracy of our model was shown to surpass that of previously reported Markov models for the prediction of acceptor sites in human. AVAILABILITY: The C++ source code is available on request from the authors.
MOTIVATION: A new method for finding subtle patterns in sequences is introduced. It approximates the multiple correlations among residuals with pair-wise correlations, with the learning cost O(m(2)n) where n is the number of training sequences, each of length m. The method suits to model splicing sites in human DNA, which are reported to have higher-order dependencies. RESULTS: By computational experiments, the prediction accuracy of our model was shown to surpass that of previously reported Markov models for the prediction of acceptor sites in human. AVAILABILITY: The C++ source code is available on request from the authors.