
Comparative analysis of algorithms for next-generation sequencing read alignment.

Matthew Ruffalo, Thomas LaFramboise, Mehmet Koyutürk.

Abstract

MOTIVATION: The advent of next-generation sequencing (NGS) techniques presents novel opportunities for many applications in the life sciences. The vast number of short reads produced by these techniques, however, poses significant computational challenges. The first step in many types of genomic analysis is the mapping of short reads to a reference genome, and several groups have developed dedicated algorithms and software packages to perform this function. As the developers of these packages optimize their algorithms with respect to various considerations, the relative merits of different software packages remain unclear. However, for scientists who generate and use NGS data for their specific research projects, an important consideration is choosing the software that is most suitable for their application.
RESULTS: With a view to comparing existing short read alignment software, we develop a simulation and evaluation suite, Seal, which simulates NGS runs for different configurations of various factors, including sequencing error, indels and coverage. We also develop criteria to compare the performances of software with disparate output structure (e.g. some packages return a single alignment while some return multiple possible alignments). Using these criteria, we comprehensively evaluate the performances of Bowtie, BWA, mr- and mrsFAST, Novoalign, SHRiMP and SOAPv2, with regard to accuracy and runtime.
CONCLUSION: We expect that the results presented here will be useful to investigators in choosing the alignment software that is most suitable for their specific research aims. Our results also provide insights into the factors that should be considered to use alignment results effectively. Seal can also be used to evaluate the performance of algorithms that use deep sequencing data for various purposes (e.g. identification of genomic variants).
AVAILABILITY: Seal is available as open source at http://compbio.case.edu/seal/.
CONTACT: matthew.ruffalo@case.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
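
The simulate-then-evaluate idea described under RESULTS can be illustrated with a minimal sketch. This is not Seal's implementation: the function names, the uniform sampling of read positions, the substitution-only error model (indels are omitted), the brute-force aligner standing in for a real package, and the rule that a read counts as correct when its true position is among the reported positions are all assumptions made for illustration.

```python
import random


def simulate_reads(reference, read_len, coverage, error_rate, seed=0):
    """Sample reads uniformly from `reference` and inject substitution errors.

    Returns a list of (true_position, read_sequence) pairs.
    Hypothetical helper; indels are not simulated in this sketch.
    """
    rng = random.Random(seed)
    # Number of reads needed to reach the requested average coverage.
    n_reads = (coverage * len(reference)) // read_len
    reads = []
    for _ in range(n_reads):
        pos = rng.randrange(len(reference) - read_len + 1)
        bases = list(reference[pos:pos + read_len])
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                # Replace the base with a different nucleotide.
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append((pos, "".join(bases)))
    return reads


def naive_align(reference, read, max_mismatches):
    """Brute-force stand-in for an aligner: return every start position
    where `read` matches the reference within `max_mismatches`."""
    hits = []
    for start in range(len(reference) - len(read) + 1):
        window = reference[start:start + len(read)]
        mismatches = sum(1 for a, b in zip(window, read) if a != b)
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


def evaluate(reads, alignments):
    """Classify each read as correct, incorrect, or unmapped.

    `alignments[i]` is the list of positions reported for `reads[i]`, so
    single-hit and multi-hit outputs are scored with the same rule
    (assumed criterion: the true position appears among the reported ones).
    """
    counts = {"correct": 0, "incorrect": 0, "unmapped": 0}
    for (true_pos, _), hits in zip(reads, alignments):
        if not hits:
            counts["unmapped"] += 1
        elif true_pos in hits:
            counts["correct"] += 1
        else:
            counts["incorrect"] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(42)
    reference = "".join(rng.choice("ACGT") for _ in range(500))
    reads = simulate_reads(reference, read_len=25, coverage=2, error_rate=0.02)
    alignments = [naive_align(reference, seq, max_mismatches=2) for _, seq in reads]
    print(evaluate(reads, alignments))
```

Representing every aligner's output as a list of candidate positions is one way to compare packages that return a single alignment with those that return several; a stricter criterion (for example, penalizing reads with many reported positions) would change the counts.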


Year:  2011        PMID: 21856737     DOI: 10.1093/bioinformatics/btr477

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  89 in total

Review 1.  Applications of targeted gene capture and next-generation sequencing technologies in studies of human deafness and other genetic disabilities.

Authors:  Xi Lin; Wenxue Tang; Shoeb Ahmad; Jingqiao Lu; Candice C Colby; Jason Zhu; Qing Yu
Journal:  Hear Res       Date:  2012-01-14       Impact factor: 3.208

2.  Next-generation sequencing for cancer diagnostics: a practical perspective.

Authors:  Cliff Meldrum; Maria A Doyle; Richard W Tothill
Journal:  Clin Biochem Rev       Date:  2011-11

3.  Mapping reads on a genomic sequence: an algorithmic overview and a practical comparative analysis.

Authors:  Sophie Schbath; Véronique Martin; Matthias Zytnicki; Julien Fayolle; Valentin Loux; Jean-François Gibrat
Journal:  J Comput Biol       Date:  2012-04-16       Impact factor: 1.479

4.  Collaborative science in the next-generation sequencing era: a viewpoint on how to combine exome sequencing data across sites to identify novel disease susceptibility genes.

Authors:  Steven N Hart; Kara N Maxwell; Tinu Thomas; Vignesh Ravichandran; Bradley Wubberhorst; Robert J Klein; Kasmintan Schrader; Csilla Szabo; Jeffrey N Weitzel; Susan L Neuhausen; Katherine Nathanson; Kenneth Offit; Fergus J Couch; Joseph Vijai
Journal:  Brief Bioinform       Date:  2015-09-10       Impact factor: 11.622

Review 5.  Review of alignment and SNP calling algorithms for next-generation sequencing data.

Authors:  M Mielczarek; J Szyda
Journal:  J Appl Genet       Date:  2015-06-09       Impact factor: 3.240

Review 6.  From next-generation resequencing reads to a high-quality variant data set.

Authors:  S P Pfeifer
Journal:  Heredity (Edinb)       Date:  2016-10-19       Impact factor: 3.821

7.  A Fast Approximate Algorithm for Mapping Long Reads to Large Reference Databases.

Authors:  Chirag Jain; Alexander Dilthey; Sergey Koren; Srinivas Aluru; Adam M Phillippy
Journal:  J Comput Biol       Date:  2018-04-30       Impact factor: 1.479

8.  Realistic artificial DNA sequences as negative controls for computational genomics.

Authors:  Juan Caballero; Arian F A Smit; Leroy Hood; Gustavo Glusman
Journal:  Nucleic Acids Res       Date:  2014-05-06       Impact factor: 16.971

Review 9.  Clinical analysis and interpretation of cancer genome data.

Authors:  Eliezer M Van Allen; Nikhil Wagle; Mia A Levy
Journal:  J Clin Oncol       Date:  2013-04-15       Impact factor: 44.544

10.  Pluribus-Exploring the Limits of Error Correction Using a Suffix Tree.

Authors:  Daniel Savel; Thomas LaFramboise; Ananth Grama; Mehmet Koyuturk
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2016-06-29       Impact factor: 3.710

