Poisson process approximation for sequence repeats, and sequencing by hybridization.

R Arratia, D Martin, G Reinert, M S Waterman.

Abstract

Sequencing by hybridization is a tool to determine a DNA sequence from the unordered list of all l-tuples contained in this sequence; typical values of l are 8, 10, and 12. For theoretical purposes we assume that the multiset of all l-tuples is known. This multiset determines the DNA sequence uniquely if none of the so-called Ukkonen transformations are possible. These transformations require repeats of (l-1)-tuples in the sequence, with these repeats occurring in certain spatial patterns. We model DNA as an i.i.d. sequence. We first prove Poisson process approximations for the process of indicators of all leftmost long repeats allowing self-overlap and for the process of indicators of all leftmost long repeats without self-overlap. Using the Chen-Stein method, we get bounds on the error of these approximations. As a corollary, we approximate the distribution of longest repeats. In the second step we analyze the spatial patterns of the repeats. Finally we combine these two steps to prove an approximation for the probability that a random sequence is uniquely recoverable from its list of l-tuples. For all our results we give some numerical examples including error bounds.
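
As a rough illustration of the objects named in the abstract, the sketch below builds the multiset of l-tuples of a sequence and tests a simple sufficient condition for unique recoverability: since Ukkonen transformations require repeated (l-1)-tuples, a sequence with no repeated (l-1)-tuple admits none. This is not the paper's method. The function names, the uniform base frequencies, and the Monte Carlo estimate are assumptions made here for illustration, and the estimated quantity is only a lower bound on the probability of unique recovery, because a sequence with repeats can still be uniquely recoverable.

```python
import random
from collections import Counter


def spectrum(seq, l):
    """Return the multiset of all l-tuples of seq as a Counter."""
    return Counter(seq[i:i + l] for i in range(len(seq) - l + 1))


def repeated_tuples(seq, k):
    """Return the set of k-tuples that occur more than once (overlaps allowed)."""
    return {t for t, c in spectrum(seq, k).items() if c > 1}


def has_no_short_repeats(seq, l):
    """Sufficient (not necessary) condition for unique recovery from l-tuples:
    no (l-1)-tuple is repeated, so no Ukkonen transformation can apply."""
    return not repeated_tuples(seq, l - 1)


def estimate_no_repeat_probability(n, l, trials=1000, seed=0):
    """Monte Carlo estimate, under an i.i.d. model with uniform base
    frequencies (an assumption of this sketch), of the probability that a
    random length-n sequence has no repeated (l-1)-tuple."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seq = "".join(rng.choice("ACGT") for _ in range(n))
        hits += has_no_short_repeats(seq, l)
    return hits / trials
```

For example, `repeated_tuples("ACGTACG", 3)` reports the repeated 3-tuple `ACG`, so `has_no_short_repeats("ACGTACG", 4)` is false, while `estimate_no_repeat_probability(200, 8)` gives a simulated lower bound for sequences of length 200 with l = 8.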

Year:  1996        PMID: 8891959     DOI: 10.1089/cmb.1996.3.425

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  4 in total

1.  Parking strategies for genome sequencing.

Authors:  J C Roach; V Thorsson; A F Siegel
Journal:  Genome Res       Date:  2000-07       Impact factor: 9.043

2.  Nonrandom clusters of palindromes in herpesvirus genomes. (Review)

Authors:  Ming-Ying Leung; Kwok Pui Choi; Aihua Xia; Louis H Y Chen
Journal:  J Comput Biol       Date:  2005-04       Impact factor: 1.479

3.  Optimal compressed representation of high throughput sequence data via light assembly.

Authors:  Antonio A Ginart; Joseph Hui; Kaiyuan Zhu; Ibrahim Numanagić; Thomas A Courtade; S Cenk Sahinalp; David N Tse
Journal:  Nat Commun       Date:  2018-02-08       Impact factor: 14.919

4.  Sequencing by hybridization of long targets.

Authors:  Yu Qin; Tobias M Schneider; Michael P Brenner
Journal:  PLoS One       Date:  2012-05-04       Impact factor: 3.240
