
A general coverage theory for shotgun DNA sequencing.

Michael C Wendl.

Abstract

The classical theory of shotgun DNA sequencing accounts for neither the placement dependencies that are a fundamental consequence of the forward-reverse sequencing strategy, nor the edge effect that arises for small to moderate-sized genomic targets. These phenomena are relevant to a number of sequencing scenarios, including large-insert BAC and fosmid clones, filtered genomic libraries, and macro-nuclear chromosomes. Here, we report a model that considers these two effects and provides both the expected value of coverage and its variance. Comparison to methyl-filtered maize data shows significant improvement over classical theory. The model is used to analyze coverage performance over a range of small to moderately sized genomic targets. We find that the read pairing effect and the edge effect interact in a non-trivial fashion. Shorter reads give superior coverage per unit sequence depth relative to longer ones. In principle, end-sequences can be optimized with respect to template insert length; however, optimal performance is unlikely to be realized in most cases because of inherent size variation in any set of targets. Conversely, single-stranded reads exhibit roughly the same coverage attributes as optimized end-reads. Although linking information is lost, single-stranded data should not pose a significant assembly liability if the target represents predominantly low-copy sequence. We also find that random sequencing should be halted at substantially lower redundancies than those now associated with larger projects. Given the enormous amount of data generated per cycle on pyrosequencing instruments, this observation suggests devising schemes to split each run cycle between two or more projects. This would prevent over-sequencing and would further leverage the pyrosequencing method.
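
The edge effect mentioned in the abstract can be illustrated with a small calculation. The sketch below is not the model reported in the paper: it ignores read pairing entirely and rests on the simplifying assumption that single reads of fixed length start uniformly at random and must lie wholly inside a linear target. The function names `classical_coverage` and `finite_target_coverage` are hypothetical, chosen only for this illustration. It compares the classical expected coverage, 1 - exp(-depth), with the exact expected coverage under the stated assumption.

```python
import math


def classical_coverage(depth):
    """Classical expected covered fraction at a given redundancy (depth).

    This is the familiar 1 - exp(-depth) result, which assumes an
    effectively unbounded target and independent read placement.
    """
    return 1.0 - math.exp(-depth)


def finite_target_coverage(target_len, read_len, n_reads):
    """Expected covered fraction of a finite linear target.

    Assumption (for illustration only): every read has length read_len,
    lies wholly inside the target, and starts at a position drawn
    uniformly from 0 .. target_len - read_len. Read pairing is ignored.
    """
    n_starts = target_len - read_len + 1
    total = 0.0
    for x in range(target_len):
        # Range of start positions whose read would cover base x.
        first = max(0, x - read_len + 1)
        last = min(x, target_len - read_len)
        p_cover = (last - first + 1) / n_starts
        # Probability that at least one of n_reads covers base x.
        total += 1.0 - (1.0 - p_cover) ** n_reads
    return total / target_len


if __name__ == "__main__":
    # Reads that are long relative to the target: edge effect is visible.
    target_len, read_len, n_reads = 1000, 400, 5
    depth = n_reads * read_len / target_len
    print("depth:", depth)
    print("classical:", classical_coverage(depth))
    print("finite target:", finite_target_coverage(target_len, read_len, n_reads))
```

Under these assumptions, bases near either end of the target can be reached by fewer start positions than interior bases, so they are covered less often. When reads are long relative to the target, the expected coverage falls below the classical value at the same depth; when reads are very short relative to the target, the two values nearly coincide.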

Year:  2006        PMID: 16901236     DOI: 10.1089/cmb.2006.13.1177

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  8 in total

1.  An Enumerative Combinatorics Model for Fragmentation Patterns in RNA Sequencing Provides Insights into Nonuniformity of the Expected Fragment Starting-Point and Coverage Profile.

Authors:  Celine Prakash; Arndt Von Haeseler
Journal:  J Comput Biol       Date:  2016-09-23       Impact factor: 1.479

2.  Occupancy modeling, maximum contig size probabilities and designing metagenomics experiments.

Authors:  Stephen A Stanhope
Journal:  PLoS One       Date:  2010-07-29       Impact factor: 3.240

3.  Coverage theories for metagenomic DNA sequencing based on a generalization of Stevens' theorem.

Authors:  Michael C Wendl; Karthik Kota; George M Weinstock; Makedonka Mitreva
Journal:  J Math Biol       Date:  2012-09-11       Impact factor: 2.259

4.  Coverage statistics for sequence census methods.

Authors:  Steven N Evans; Valerie Hower; Lior Pachter
Journal:  BMC Bioinformatics       Date:  2010-08-18       Impact factor: 3.169

5.  Nonpareil 3: Fast Estimation of Metagenomic Coverage and Sequence Diversity.

Authors:  Luis M Rodriguez-R; Santosh Gunturu; James M Tiedje; James R Cole; Konstantinos T Konstantinidis
Journal:  mSystems       Date:  2018-04-10       Impact factor: 6.496

6.  Aspects of coverage in medical DNA sequencing.

Authors:  Michael C Wendl; Richard K Wilson
Journal:  BMC Bioinformatics       Date:  2008-05-16       Impact factor: 3.169

7.  Lessons learned from the initial sequencing of the pig genome: comparative analysis of an 8 Mb region of pig chromosome 17.

Authors:  Elizabeth A Hart; Mario Caccamo; Jennifer L Harrow; Sean J Humphray; James G R Gilbert; Steve Trevanion; Tim Hubbard; Jane Rogers; Max F Rothschild
Journal:  Genome Biol       Date:  2007       Impact factor: 13.583

Review 8.  Recovering complete and draft population genomes from metagenome datasets.

Authors:  Naseer Sangwan; Fangfang Xia; Jack A Gilbert
Journal:  Microbiome       Date:  2016-03-08       Impact factor: 14.650
