MOTIVATION: Identifying every genome present in a microbial sample is an important and challenging task with crucial applications. It is challenging because a microbial sample typically contains millions of cells, the vast majority of which elude cultivation. The most accurate method to date is exhaustive single-cell sequencing using multiple displacement amplification, which is intractable for a large number of cells. However, there is hope for breaking this barrier, because the number of different cell types with distinct genome sequences is usually much smaller than the number of cells.

RESULTS: Here, we present a novel divide-and-conquer method to sequence and de novo assemble all distinct genomes present in a microbial sample, with a sequencing cost and computational complexity proportional to the number of genome types rather than the number of cells. The method is implemented in a tool called Squeezambler. We evaluated Squeezambler on simulated data. The proposed divide-and-conquer method successfully reduces the cost of sequencing in comparison with the naïve exhaustive approach.

AVAILABILITY: Squeezambler and datasets are available at http://compbio.cs.wayne.edu/software/squeezambler/.