OnlineCall: fast online parameter estimation and base calling for Illumina's next-generation sequencing.

Abstract

MOTIVATION: Next-generation DNA sequencing platforms are becoming increasingly cost-effective and capable of providing an enormous number of reads in a relatively short time. However, their accuracy and read lengths still lag behind those of the conventional Sanger sequencing method. The performance of next-generation sequencing platforms is fundamentally limited by various imperfections in the sequencing-by-synthesis and signal acquisition processes. This drives the search for accurate, scalable and computationally tractable base calling algorithms capable of accounting for such imperfections.
RESULTS: Relying on a statistical model of the sequencing-by-synthesis process and signal acquisition procedure, we develop a computationally efficient base calling method for Illumina's sequencing technology (specifically, the Genome Analyzer II platform). Parameters of the model are estimated via a fast unsupervised online learning scheme, which uses the generalized expectation-maximization algorithm and requires only 3 s of running time per tile (on a single core of an Intel i7 machine at 3.07 GHz), a speed-up of three orders of magnitude over existing parametric model-based methods. To minimize the latency between the end of the sequencing run and the generation of the base calling reports, we develop a fast, scalable online decoding algorithm, which requires only 9 s/tile and achieves significantly lower error rates than Illumina's base calling software. Moreover, we demonstrate that the proposed online parameter estimation scheme efficiently computes tile-dependent parameters, which can then be provided to the base calling algorithm, resulting in significant improvements over previously developed base calling methods for the considered platform in terms of performance, time/complexity and latency.
AVAILABILITY: A C code implementation of our algorithm can be downloaded from http://www.cerc.utexas.edu/OnlineCall/.
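
The abstract names online, EM-based parameter estimation but does not spell out the model or the update rules. As a rough illustration of the general idea only, the sketch below runs a one-pass online EM (running averages of sufficient statistics with a decaying step size) on a toy one-dimensional mixture of two Gaussians. The toy model, the function names, the step-size schedule and the initial values are all assumptions made for illustration; this is not the OnlineCall model, and it uses a plain closed-form M-step rather than the generalized EM variant mentioned in the abstract.

```python
import math
import random


def sample_mixture(n, rng, mu=(-2.0, 3.0), weight0=0.3, sigma=1.0):
    """Draw n points from a toy two-component Gaussian mixture (hypothetical data)."""
    return [
        rng.gauss(mu[0], sigma) if rng.random() < weight0 else rng.gauss(mu[1], sigma)
        for _ in range(n)
    ]


def online_em_two_gaussians(stream, mu_init=(-1.0, 1.0), sigma=1.0):
    """One-pass online EM for two Gaussians with a shared, known sigma.

    Returns the estimated means and the mixing weight of component 0.
    """
    mu = list(mu_init)
    weight = 0.5
    # Running averages of the sufficient statistics:
    # s_resp[k] tracks E[r_k], s_x[k] tracks E[r_k * x].
    s_resp = [0.5, 0.5]
    s_x = [0.5 * mu[0], 0.5 * mu[1]]

    for t, x in enumerate(stream, start=1):
        # E-step on a single observation: posterior responsibilities,
        # computed in log space for numerical stability.
        log_p = [
            math.log(w) - 0.5 * ((x - m) / sigma) ** 2
            for w, m in zip((weight, 1.0 - weight), mu)
        ]
        top = max(log_p)
        unnorm = [math.exp(lp - top) for lp in log_p]
        total = sum(unnorm)
        resp = [u / total for u in unnorm]

        # Blend the new statistics into the running averages
        # with a decaying step size (assumed schedule).
        gamma = 1.0 / (t + 1) ** 0.6
        for k in range(2):
            s_resp[k] = (1.0 - gamma) * s_resp[k] + gamma * resp[k]
            s_x[k] = (1.0 - gamma) * s_x[k] + gamma * resp[k] * x

        # M-step: closed-form parameter update from the running statistics.
        weight = min(max(s_resp[0], 1e-6), 1.0 - 1e-6)
        mu = [s_x[k] / max(s_resp[k], 1e-12) for k in range(2)]

    return mu, weight
```

Calling `online_em_two_gaussians(sample_mixture(20000, random.Random(0)))` processes each observation once and keeps only a handful of running statistics in memory, which is the property that makes online schemes attractive when parameters must be re-estimated for every tile.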

Year: 2012 PMID: 22569177 PMCID: PMC3381969 DOI: 10.1093/bioinformatics/bts256

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
