
PyroHMMvar: a sensitive and accurate method to call short indels and SNPs for Ion Torrent and 454 data.

Feng Zeng, Rui Jiang, Ting Chen.

Abstract

MOTIVATION: The identification of short insertions and deletions (indels) and single nucleotide polymorphisms (SNPs) from Ion Torrent and 454 reads is a challenging problem, mainly because these technologies are prone to sequencing errors at homopolymers and can therefore introduce spurious indels into reads. Most existing mapping programs do not model homopolymer errors when aligning reads against the reference. The resulting alignments then contain various kinds of mismatches and indels that confound the accurate determination of variant loci and alleles.
RESULTS: To address these challenges, we realign reads against the reference using our previously proposed hidden Markov model, which models homopolymer errors, and then merge these pairwise alignments into a weighted alignment graph. Based on this weighted alignment graph and the hidden Markov model, we develop a method called PyroHMMvar, which can simultaneously detect short indels and SNPs, as demonstrated on human resequencing data. Specifically, by applying our method to simulated diploid datasets, we demonstrate that PyroHMMvar produces more accurate results than state-of-the-art methods, such as Samtools and GATK, and is less sensitive to mapping parameter settings than the other methods. We also apply PyroHMMvar to one human whole-genome resequencing dataset, and the results confirm that PyroHMMvar predicts SNPs and indels accurately.
AVAILABILITY AND IMPLEMENTATION: Source code is freely available at https://code.google.com/p/pyrohmmvar/; it is implemented in C++ and supported on Linux.
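
The step of merging pairwise alignments into a weighted alignment graph can be illustrated with a minimal sketch. This is not the PyroHMMvar implementation (which is in C++); the node encoding, the function name `build_alignment_graph`, and the use of read counts as edge weights are all assumptions made for illustration, and the abstract does not specify how the actual graph is weighted.

```python
from collections import defaultdict


def build_alignment_graph(alignments):
    """Merge gapped pairwise alignments into one weighted graph.

    Hypothetical encoding (an assumption, not taken from the paper):
    each alignment is (start, ref_row, read_row), where start is the
    0-based reference position of the first column and the two rows are
    equal-length gapped strings with '-' marking a gap.

    A node is (ref_pos, ins_offset, base). ins_offset > 0 marks a base
    inserted after ref_pos; base '-' marks a deleted reference base.
    Each edge weight counts the reads that traverse that edge.
    """
    edges = defaultdict(int)
    for start, ref_row, read_row in alignments:
        if len(ref_row) != len(read_row):
            raise ValueError("alignment rows must have equal length")
        pos, offset = start - 1, 0
        prev = None
        for ref_base, read_base in zip(ref_row, read_row):
            if ref_base == "-":
                # Insertion relative to the reference: stay at pos.
                offset += 1
            else:
                # Match, mismatch, or deletion: advance on the reference.
                pos += 1
                offset = 0
            node = (pos, offset, read_base)
            if prev is not None:
                edges[(prev, node)] += 1
            prev = node
    return dict(edges)


# Two reads agree with the reference; a third carries an extra C,
# the kind of homopolymer over-call the abstract describes.
example = [
    (0, "ACGT", "ACGT"),
    (0, "ACGT", "ACGT"),
    (0, "AC-GT", "ACCGT"),
]
graph = build_alignment_graph(example)
```

In this toy graph the edge that skips the insertion is supported by two reads and the edges through the inserted base by one, so a downstream caller could weigh the competing paths against each other. How PyroHMMvar actually scores such paths is not stated in the abstract.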

Year: 2013  PMID: 23995392  PMCID: PMC3888126  DOI: 10.1093/bioinformatics/btt512

Source DB: PubMed  Journal: Bioinformatics  ISSN: 1367-4803  Impact factor: 6.937


