sFFT: a faster accurate computation of the p-value of the entropy score.

Uri Keich.

Abstract

We present sFFT, an algorithm for efficiently computing the p-value of the information content, or entropy score, of an alignment of DNA sequences. Applying the FFT algorithm to an exponentially shifted probability mass function allows us to perform fast convolutions that do not suffer from the otherwise overwhelming effect of accumulated numerical roundoff errors. Through a rigorous analysis of the propagation of numerical errors across the various steps of sFFT, we provide a theoretical bound on the overall error of our computed p-value. The accuracy of the computed p-value and the utility of the error bound are demonstrated empirically. Although there are faster algorithms that would compute this p-value, they can err significantly; sFFT is the fastest reliable algorithm. Finally, we note that the basic algorithm is likely to be applicable in a wider context than the one considered here.
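The core idea, an FFT convolution applied to an exponentially shifted probability mass function, can be illustrated with a minimal sketch. This is not the sFFT algorithm from the paper: it uses a generic integer-valued pmf rather than the entropy score, takes the shift parameter `theta` as given instead of choosing it, and carries no error bound. The function names (`nfold_fft`, `tail_shifted`, `exact_binomial_tail`) are hypothetical and exist only for this illustration.

```python
import math
import numpy as np

def nfold_fft(p, n):
    """n-fold convolution of a pmf on 0..len(p)-1, computed with the FFT."""
    size = n * (len(p) - 1) + 1          # support of the n-fold sum
    f = np.fft.rfft(p, size)             # zero-padded transform
    return np.fft.irfft(f ** n, size)

def tail_shifted(p, n, s0, theta):
    """P(S_n >= s0) from an exponentially shifted pmf (illustrative only)."""
    p = np.asarray(p, dtype=float)
    k = np.arange(len(p))
    w = p * np.exp(theta * k)            # shift mass toward large values
    m = w.sum()                          # moment generating function M(theta)
    q = nfold_fft(w / m, n)              # convolve the shifted pmf
    s = np.arange(len(q))
    keep = (s >= s0) & (q > 0)           # drop roundoff-level non-positive entries
    # Undo the shift in log space: p_n(s) = q(s) * exp(-theta*s) * M(theta)**n
    log_terms = np.log(q[keep]) - theta * s[keep] + n * np.log(m)
    return float(np.exp(log_terms).sum())

def exact_binomial_tail(n, prob, s0):
    """Reference value for the Bernoulli case, summed term by term."""
    return sum(math.comb(n, k) * prob**k * (1 - prob)**(n - k)
               for k in range(s0, n + 1))
```

For a sum of 50 Bernoulli(0.1) variables and threshold 40, the tail probability is far below double-precision roundoff relative to the bulk of the distribution, so an unshifted FFT convolution cannot resolve it. Choosing `theta = math.log(36.0)` moves the mean of the shifted pmf to the threshold, and `tail_shifted([0.9, 0.1], 50, 40, math.log(36.0))` can then be compared against `exact_binomial_tail(50, 0.1, 40)`.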

Year:  2005        PMID: 15882140     DOI: 10.1089/cmb.2005.12.416

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  2 in total

1.  Compound Poisson approximation of the number of occurrences of a position frequency matrix (PFM) on both strands.

Authors:  Utz J Pape; Sven Rahmann; Fengzhu Sun; Martin Vingron
Journal:  J Comput Biol       Date:  2008 Jul-Aug       Impact factor: 1.479

2.  Accurate computation of survival statistics in genome-wide studies.

Authors:  Fabio Vandin; Alexandra Papoutsaki; Benjamin J Raphael; Eli Upfal
Journal:  PLoS Comput Biol       Date:  2015-05-07       Impact factor: 4.475
