Probabilistic alignments with quality scores: an application to short-read mapping toward accurate SNP/indel detection.

Michiaki Hamada, Edward Wijaya, Martin C Frith, Kiyoshi Asai.

Abstract

MOTIVATION: Recent studies have revealed the importance of considering the quality scores of reads generated by next-generation sequencing (NGS) platforms in various downstream analyses. It is also known that probabilistic alignments based on marginal probabilities (e.g. aligned-column and/or gap probabilities) provide more accurate alignments than conventional maximum score-based alignment. However, no study has examined probabilistic alignment that considers quality scores explicitly, although such a method is expected to be useful in SNP/indel callers and bisulfite mapping, because accurate estimation of aligned columns or gaps is important in those analyses.
RESULTS: In this study, we propose methods of probabilistic alignment that consider quality scores of (one of) the sequences as well as a usual score matrix. The method is based on posterior decoding techniques in which various marginal probabilities are computed from a probabilistic model of alignments with quality scores, and it can arbitrarily trade off sensitivity and positive predictive value (PPV) of prediction (aligned columns and gaps). The method is directly applicable to read mapping (alignment) toward accurate detection of SNPs and indels. Several computational experiments indicated that probabilistic alignments can estimate aligned columns and gaps accurately, compared with other mapping algorithms, e.g. SHRiMP2, Stampy, BWA and Novoalign. The study also suggested that our approach yields favorable precision for SNP/indel calling.
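
As a rough illustration of the two ideas in the abstract, the sketch below converts Phred quality scores to base-call error probabilities (the standard Phred definition) and then selects aligned columns by thresholding their posterior probabilities. The threshold rule 1/(gamma + 1), the function names and the example posteriors are assumptions made for illustration only; the abstract does not specify the decoding rule, and a real decoder would also have to keep the selected columns mutually consistent.

```python
def phred_to_error_prob(q: float) -> float:
    """Standard Phred definition: Q = -10 * log10(P_error)."""
    return 10 ** (-q / 10)


def select_columns(posteriors: dict, gamma: float) -> list:
    """Keep aligned columns whose posterior probability exceeds 1 / (gamma + 1).

    Assumption: a simple threshold rule stands in for posterior decoding.
    A larger gamma lowers the threshold, favoring sensitivity over PPV.
    """
    threshold = 1.0 / (gamma + 1.0)
    return [col for col, p in posteriors.items() if p > threshold]


# Hypothetical posteriors for (read position, reference position) pairs.
posteriors = {(0, 100): 0.99, (1, 101): 0.80, (2, 102): 0.40, (3, 104): 0.15}

strict = select_columns(posteriors, gamma=1.0)   # threshold 0.5
lenient = select_columns(posteriors, gamma=4.0)  # threshold 0.2

print(phred_to_error_prob(20))
print(strict)
print(lenient)
```

With the smaller gamma only the two high-confidence columns are reported, while the larger gamma also admits the column with posterior 0.40, which illustrates how a single parameter can shift the balance between PPV and sensitivity.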

Year:  2011        PMID: 21976422     DOI: 10.1093/bioinformatics/btr537

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  11 in total

1.  Comparative phenotypic analysis and genome sequence of Clostridium beijerinckii SA-1, an offspring of NCIMB 8052.

Authors:  Walter J Sandoval-Espinola; Satya T Makwana; Mari S Chinn; Michael R Thon; M Andrea Azcárate-Peril; José M Bruno-Bárcena
Journal:  Microbiology (Reading)       Date:  2013-09-25       Impact factor: 2.777

2.  Detection of structural DNA variation from next generation sequencing data: a review of informatic approaches. (Review)

Authors:  Haley J Abel; Eric J Duncavage
Journal:  Cancer Genet       Date:  2013-11-20

3.  Homeostatic IL-13 in healthy skin directs dendritic cell differentiation to promote TH2 and inhibit TH17 cell polarization.

Authors:  Olivier Lamiable; Franca Ronchese; Johannes U Mayer; Kerry L Hilligan; Jodie S Chandler; David A Eccles; Samuel I Old; Rita G Domingues; Jianping Yang; Greta R Webb; Luis Munoz-Erazo; Evelyn J Hyde; Kirsty A Wakelin; Shiau-Choot Tang; Sally C Chappell; Sventja von Daake; Frank Brombacher; Charles R Mackay; Alan Sher; Roxane Tussiwand; Lisa M Connor; David Gallego-Ortega; Dragana Jankovic; Graham Le Gros; Matthew R Hepworth
Journal:  Nat Immunol       Date:  2021-11-18       Impact factor: 25.606

4.  A mostly traditional approach improves alignment of bisulfite-converted DNA.

Authors:  Martin C Frith; Ryota Mori; Kiyoshi Asai
Journal:  Nucleic Acids Res       Date:  2012-03-28       Impact factor: 16.971

5.  Adaptable probabilistic mapping of short reads using position specific scoring matrices.

Authors:  Peter Kerpedjiev; Jes Frellsen; Stinus Lindgreen; Anders Krogh
Journal:  BMC Bioinformatics       Date:  2014-04-09       Impact factor: 3.169

6.  Training alignment parameters for arbitrary sequencers with LAST-TRAIN.

Authors:  Michiaki Hamada; Yukiteru Ono; Kiyoshi Asai; Martin C Frith
Journal:  Bioinformatics       Date:  2017-03-15       Impact factor: 6.937

7.  Indel-tolerant read mapping with trinucleotide frequencies using cache-oblivious kd-trees.

Authors:  Md Pavel Mahmud; John Wiedenhoeft; Alexander Schliep
Journal:  Bioinformatics       Date:  2012-09-15       Impact factor: 6.937

8.  Next generation sequence analysis and computational genomics using graphical pipeline workflows.

Authors:  Federica Torri; Ivo D Dinov; Alen Zamanyan; Sam Hobel; Alex Genco; Petros Petrosyan; Andrew P Clark; Zhizhong Liu; Paul Eggert; Jonathan Pierce; James A Knowles; Joseph Ames; Carl Kesselman; Arthur W Toga; Steven G Potkin; Marquis P Vawter; Fabio Macciardi
Journal:  Genes (Basel)       Date:  2012-08-30       Impact factor: 4.096

9.  Improved base-calling and quality scores for 454 sequencing based on a Hurdle Poisson model.

Authors:  Kristof De Beuf; Joachim De Schrijver; Olivier Thas; Wim Van Criekinge; Rafael A Irizarry; Lieven Clement
Journal:  BMC Bioinformatics       Date:  2012-11-15       Impact factor: 3.169

10.  Pan-genome Analysis of Ancient and Modern Salmonella enterica Demonstrates Genomic Stability of the Invasive Para C Lineage for Millennia.

Authors:  Zhemin Zhou; Inge Lundstrøm; Alicia Tran-Dien; Sebastián Duchêne; Nabil-Fareed Alikhan; Martin J Sergeant; Gemma Langridge; Anna K Fotakis; Satheesh Nair; Hans K Stenøien; Stian S Hamre; Sherwood Casjens; Axel Christophersen; Christopher Quince; Nicholas R Thomson; François-Xavier Weill; Simon Y W Ho; M Thomas P Gilbert; Mark Achtman
Journal:  Curr Biol       Date:  2018-07-19       Impact factor: 10.834
