Ensemble analysis of adaptive compressed genome sequencing strategies.

Zeinab Taghavi.   

Abstract

BACKGROUND: Acquiring genomes at single-cell resolution has many applications, such as the study of microbiota. However, deep sequencing and assembly of all the millions of cells in a sample is prohibitively costly. A property that can come to the rescue is that deep sequencing of every cell should not be necessary to capture all distinct genomes, as the majority of cells are biological replicates. Biologically important samples are often sparse in that sense. In this paper, we propose an adaptive compressed method, also known as distilled sensing, to capture all distinct genomes in a sparse microbial community with reduced sequencing effort. As opposed to group testing, in which the number of distinct events is often constant and sparsity is equivalent to rarity of an event, sparsity in our case means scarcity of distinct events in comparison to the data size. Previously, we introduced the problem and proposed a distilled sensing solution based on the breadth-first search strategy. We simulated the whole process, and the computational intensity of that simulation constrained our ability to study the behavior of the algorithm for the entire ensemble.
RESULTS: In this paper, we modify our previous breadth-first search strategy and introduce the depth-first search strategy. Instead of simulating the entire process, which is intractable for a large number of experiments, we provide a dynamic programming algorithm to analyze the behavior of the method for the entire ensemble. The ensemble analysis algorithm recursively calculates the probability of capturing every distinct genome and the expected total number of sequenced nucleotides for a given population profile. Our results suggest that the expected total number of sequenced nucleotides grows proportionally to the logarithm of the number of cells and linearly with the number of distinct genomes. The probability of missing a genome depends on its abundance and on the ratio of its size to the maximum genome size in the sample. The modified resource allocation method accommodates a parameter to control that probability.
AVAILABILITY: The squeezambler 2.0 C++ source code is available at http://sourceforge.net/projects/hyda/.
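The dependence of the miss probability on abundance can be illustrated with a much simpler model than the one in the paper. The sketch below is not the dynamic programming algorithm from the abstract and is not taken from squeezambler. It assumes cells are drawn independently and uniformly at random, and it ignores the genome-size ratio entirely. The function names `miss_probability` and `cells_needed` are hypothetical and introduced only for illustration.

```python
import math


def miss_probability(abundance: float, n_cells: int) -> float:
    """Chance that a genome with this relative abundance never appears
    among n_cells independent uniform draws (assumed sampling model)."""
    if not 0.0 < abundance <= 1.0:
        raise ValueError("abundance must be in (0, 1]")
    if n_cells < 0:
        raise ValueError("n_cells must be non-negative")
    return (1.0 - abundance) ** n_cells


def cells_needed(abundance: float, max_miss: float) -> int:
    """Smallest number of draws that keeps the miss probability
    at or below max_miss under the same assumed model."""
    if not 0.0 < max_miss < 1.0:
        raise ValueError("max_miss must be in (0, 1)")
    if abundance >= 1.0:
        return 1  # a genome present in every cell is seen on the first draw
    n = max(1, math.ceil(math.log(max_miss) / math.log(1.0 - abundance)))
    # Guard against floating-point rounding right at the boundary.
    while miss_probability(abundance, n) > max_miss:
        n += 1
    return n


if __name__ == "__main__":
    # Rarer genomes need many more sampled cells for the same miss tolerance.
    for abundance in (0.1, 0.01, 0.001):
        print(abundance, cells_needed(abundance, max_miss=0.05))
```

Under these assumptions, the number of cells required grows roughly in inverse proportion to the abundance of the rarest genome, which is one way to see why a tunable parameter for the miss probability is useful.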

Year:  2014        PMID: 25252999      PMCID: PMC4221792          DOI: 10.1186/1471-2105-15-S9-S13

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


  8 in total

1.  Cultivating the uncultured.

Authors:  Karsten Zengler; Gerardo Toledo; Michael Rappe; James Elkins; Eric J Mathur; Jay M Short; Martin Keller
Journal:  Proc Natl Acad Sci U S A       Date:  2002-11-18       Impact factor: 11.205

2.  Distilled single-cell genome sequencing and de novo assembly for sparse microbial communities.

Authors:  Zeinab Taghavi; Narjes S Movahedi; Sorin Draghici; Hamidreza Chitsaz
Journal:  Bioinformatics       Date:  2013-08-05       Impact factor: 6.937

3.  A human gut microbial gene catalogue established by metagenomic sequencing.

Authors:  Junjie Qin; Ruiqiang Li; Jeroen Raes; Manimozhiyan Arumugam; Kristoffer Solvsten Burgdorf; Chaysavanh Manichanh; Trine Nielsen; Nicolas Pons; Florence Levenez; Takuji Yamada; Daniel R Mende; Junhua Li; Junming Xu; Shaochuan Li; Dongfang Li; Jianjun Cao; Bo Wang; Huiqing Liang; Huisong Zheng; Yinlong Xie; Julien Tap; Patricia Lepage; Marcelo Bertalan; Jean-Michel Batto; Torben Hansen; Denis Le Paslier; Allan Linneberg; H Bjørn Nielsen; Eric Pelletier; Pierre Renault; Thomas Sicheritz-Ponten; Keith Turner; Hongmei Zhu; Chang Yu; Shengting Li; Min Jian; Yan Zhou; Yingrui Li; Xiuqing Zhang; Songgang Li; Nan Qin; Huanming Yang; Jian Wang; Søren Brunak; Joel Doré; Francisco Guarner; Karsten Kristiansen; Oluf Pedersen; Julian Parkhill; Jean Weissenbach; Peer Bork; S Dusko Ehrlich; Jun Wang
Journal:  Nature       Date:  2010-03-04       Impact factor: 49.962

4.  ART: a next-generation sequencing read simulator.

Authors:  Weichun Huang; Leping Li; Jason R Myers; Gabor T Marth
Journal:  Bioinformatics       Date:  2011-12-23       Impact factor: 6.937

5.  Compressed Genotyping.

Authors:  Yaniv Erlich; Assaf Gordon; Michael Brand; Gregory J Hannon; Partha P Mitra
Journal:  IEEE Trans Inf Theory       Date:  2010-02       Impact factor: 2.501

6.  Nearly finished genomes produced using gel microdroplet culturing reveal substantial intraspecies genomic diversity within the human microbiome.

Authors:  Michael S Fitzsimons; Mark Novotny; Chien-Chi Lo; Armand E K Dichosa; Joyclyn L Yee-Greenbaum; Jeremy P Snook; Wei Gu; Olga Chertkov; Karen W Davenport; Kim McMurry; Krista G Reitenga; Ashlynn R Daughton; Jian He; Shannon L Johnson; Cheryl D Gleasner; Patti L Wills; Beverly Parson-Quintana; Patrick S Chain; John C Detter; Roger S Lasken; Cliff S Han
Journal:  Genome Res       Date:  2013-03-14       Impact factor: 9.043

7.  Efficient de novo assembly of single-cell bacterial genomes from short-read data sets.

Authors:  Hamidreza Chitsaz; Joyclyn L Yee-Greenbaum; Glenn Tesler; Mary-Jane Lombardo; Christopher L Dupont; Jonathan H Badger; Mark Novotny; Douglas B Rusch; Louise J Fraser; Niall A Gormley; Ole Schulz-Trieglaff; Geoffrey P Smith; Dirk J Evers; Pavel A Pevzner; Roger S Lasken
Journal:  Nat Biotechnol       Date:  2011-09-18       Impact factor: 54.908

8.  How to apply de Bruijn graphs to genome assembly.

Authors:  Phillip E C Compeau; Pavel A Pevzner; Glenn Tesler
Journal:  Nat Biotechnol       Date:  2011-11-08       Impact factor: 54.908
