
High Performance Multiple Sequence Alignment System for Pyrosequencing Reads from Multiple Reference Genomes.

Fahad Saeed, Alan Perez-Rathke, Jaroslaw Gwarnicki, Tanya Berger-Wolf, Ashfaq Khokhar.

Abstract

Genome resequencing with short reads generated by pyrosequencing generally relies on mapping the reads against a single reference genome. However, reads drawn from multiple reference genomes cannot be mapped with a pairwise mapping algorithm. Existing multiple sequence alignment (MSA) methods cannot be used to align the reads with respect to each other and to the reference genomes, because they do not take into account the position of the short reads with respect to the genome and are highly inefficient for large numbers of sequences. In this paper, we develop a highly scalable parallel algorithm based on domain decomposition, referred to as P-Pyro-Align, to align such large numbers of reads from single or multiple reference genomes. The proposed algorithm accurately aligns the erroneous reads and has been implemented on a cluster of workstations using the MPI library. Experimental results for different problem sizes are analyzed in terms of execution time, quality of the alignments, and the ability of the algorithm to handle reads from multiple haplotypes. We report high-quality multiple alignments of up to 0.5 million reads. The algorithm is shown to be highly scalable and exhibits super-linear speedups with an increasing number of processors.
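
The abstract does not say how the domain decomposition is carried out, so the following is only an illustrative sketch, not the P-Pyro-Align implementation. It assumes each read carries an approximate start position on the reference and splits the reads into contiguous genome intervals that separate processors could align independently. The names `decompose_by_position` and `speedup`, and the toy reads, are hypothetical. The speedup helper shows what "super-linear" means: the ratio of serial to parallel time exceeds the processor count.

```python
from typing import List, Tuple

# Assumption: a read is (sequence, approximate start position on the reference).
Read = Tuple[str, int]


def decompose_by_position(
    reads: List[Read], genome_length: int, num_parts: int
) -> List[List[Read]]:
    """Split reads into num_parts contiguous genome intervals by start position."""
    if genome_length <= 0 or num_parts <= 0:
        raise ValueError("genome_length and num_parts must be positive")
    parts: List[List[Read]] = [[] for _ in range(num_parts)]
    for seq, pos in reads:
        # Map the position to an interval index, clamped to the valid range.
        idx = min(max(pos, 0) * num_parts // genome_length, num_parts - 1)
        parts[idx].append((seq, pos))
    # Keep each partition ordered along the genome.
    for part in parts:
        part.sort(key=lambda read: read[1])
    return parts


def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S(p) = T(1) / T(p); super-linear when S(p) > p."""
    return t_serial / t_parallel


# Hypothetical toy input: four reads on a reference of length 100.
reads = [("ACGT", 0), ("GGCA", 55), ("TTAG", 99), ("CCGA", 30)]
partitions = decompose_by_position(reads, genome_length=100, num_parts=4)
for rank, part in enumerate(partitions):
    print(rank, part)
```

In an MPI setting, each partition would typically be handed to one rank. Reads that span an interval boundary would need extra handling, which this sketch leaves out.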


Year:  2011        PMID: 23125479      PMCID: PMC3486434          DOI: 10.1016/j.jpdc.2011.08.001

Source DB:  PubMed          Journal:  J Parallel Distrib Comput        ISSN: 0743-7315            Impact factor:   3.734


