
Analysis of context-dependent errors for Illumina sequencing.

Irina Abnizova, Steven Leonard, Tom Skelly, Andy Brown, David Jackson, Marina Gourtovaia, Guoying Qi, Rene Te Boekhorst, Nadeem Faruque, Kevin Lewis, Tony Cox.

Abstract

The new generation of short-read sequencing technologies requires reliable measures of data quality. Such measures are especially important for variant calling. In the particular case of SNP calling, however, a great number of false-positive SNPs (FP-SNPs) may be obtained, so one needs to distinguish putative SNPs from sequencing or other errors. We found that distinguishing an FP-SNP depends not only on the probability of a sequencing error (i.e. the quality value) but also on the conditional probability of "correcting" this error (the "second best call" probability, conditional on that of the first call). Surprisingly, around 80% of mismatches can be "corrected" with this second call. Another way to reduce the rate of FP-SNPs is to retrieve DNA motifs that seem to be prone to sequencing errors, and to attach a corresponding conditional quality value to these motifs. We have developed several measures to distinguish between sequencing errors and candidate SNPs, based on a base call's nucleotide context and its mismatch type. In addition, we suggest a simple method to correct the majority of mismatches, based on the conditional probability of their "second best" intensity call. To each mismatch we attach a corresponding second-call confidence (quality value) of being corrected.
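The idea of a second-call probability conditioned on nucleotide context and mismatch type can be illustrated with a minimal sketch. This is not the authors' implementation: the `Mismatch` record layout, the function names `second_call_stats` and `phred`, the two-base context, the Phred-style scaling with a cap of 40, and the toy data are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import defaultdict
from typing import NamedTuple


class Mismatch(NamedTuple):
    """One mismatching base call (hypothetical record layout)."""
    context: str  # bases preceding the call on the read, e.g. "GG"
    ref: str      # reference base at this position
    first: str    # best (reported) call, which differs from ref
    second: str   # second-best intensity call


def second_call_stats(mismatches):
    """Estimate P(second call == ref | context, mismatch type) from counts."""
    totals = defaultdict(int)
    corrected = defaultdict(int)
    for m in mismatches:
        key = (m.context, f"{m.ref}>{m.first}")
        totals[key] += 1
        if m.second == m.ref:
            corrected[key] += 1
    return {key: corrected[key] / totals[key] for key in totals}


def phred(p_correct, cap=40.0):
    """Phred-style confidence that the second call is right (assumed scaling)."""
    p_err = 1.0 - p_correct
    if p_err <= 0.0:
        return cap
    return min(cap, -10.0 * math.log10(p_err))


# Toy data, invented purely to exercise the functions.
toy = [
    Mismatch("GG", "T", "G", "T"),
    Mismatch("GG", "T", "G", "T"),
    Mismatch("GG", "T", "G", "A"),
    Mismatch("AC", "A", "C", "A"),
]
stats = second_call_stats(toy)
for (context, mismatch_type), p in sorted(stats.items()):
    print(context, mismatch_type, round(p, 3), round(phred(p), 1))
```

In this sketch each (context, mismatch type) pair receives its own empirical correction probability, which is then mapped to a quality-like score. With real data the counts would presumably come from reads aligned to a known reference, and sparse categories would need smoothing.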


Year:  2012        PMID: 22809341     DOI: 10.1142/S0219720012410053

Source DB:  PubMed          Journal:  J Bioinform Comput Biol        ISSN: 0219-7200            Impact factor:   1.122


  9 in total

1.  Detecting non-allelic homologous recombination from high-throughput sequencing data.

Authors:  Matthew M Parks; Charles E Lawrence; Benjamin J Raphael
Journal:  Genome Biol       Date:  2015-04-08       Impact factor: 13.583

2.  ViVaMBC: estimating viral sequence variation in complex populations from Illumina deep-sequencing data using model-based clustering.

Authors:  Bie Verbist; Lieven Clement; Joke Reumers; Kim Thys; Alexander Vapirev; Willem Talloen; Yves Wetzels; Joris Meys; Jeroen Aerssens; Luc Bijnens; Olivier Thas
Journal:  BMC Bioinformatics       Date:  2015-02-22       Impact factor: 3.169

3.  HIV-1 and HIV-2 exhibit similar mutation frequencies and spectra in the absence of G-to-A hypermutation.

Authors:  Jonathan M O Rawson; Sean R Landman; Cavan S Reilly; Louis M Mansky
Journal:  Retrovirology       Date:  2015-07-10       Impact factor: 4.602

Review 4.  Navigating the rapids: the development of regulated next-generation sequencing-based clinical trial assays and companion diagnostics.

Authors:  Saumya Pant; Russell Weiner; Matthew J Marton
Journal:  Front Oncol       Date:  2014-04-17       Impact factor: 6.244

Review 5.  Analysis of plant microbe interactions in the era of next generation sequencing technologies.

Authors:  Claudia Knief
Journal:  Front Plant Sci       Date:  2014-05-21       Impact factor: 5.753

6.  Tackling critical parameters in metazoan meta-barcoding experiments: a preliminary study based on coxI DNA barcode.

Authors:  Bachir Balech; Anna Sandionigi; Caterina Manzari; Emiliano Trucchi; Apollonia Tullo; Flavio Licciulli; Giorgio Grillo; Elisabetta Sbisà; Stefano De Felici; Cecilia Saccone; Anna Maria D'Erchia; Donatella Cesaroni; Maurizio Casiraghi; Saverio Vicario
Journal:  PeerJ       Date:  2018-06-13       Impact factor: 2.984

7.  VarBin, a novel method for classifying true and false positive variants in NGS data.

Authors:  Jacob Durtschi; Rebecca L Margraf; Emily M Coonrod; Kalyan C Mallempati; Karl V Voelkerding
Journal:  BMC Bioinformatics       Date:  2013-10-01       Impact factor: 3.169

8.  Valection: design optimization for validation and verification studies.

Authors:  Christopher I Cooper; Delia Yao; Dorota H Sendorek; Takafumi N Yamaguchi; Christine P'ng; Kathleen E Houlahan; Cristian Caloian; Michael Fraser; Kyle Ellrott; Adam A Margolin; Robert G Bristow; Joshua M Stuart; Paul C Boutros
Journal:  BMC Bioinformatics       Date:  2018-09-25       Impact factor: 3.169

9.  A Sequence-Based Novel Approach for Quality Evaluation of Third-Generation Sequencing Reads.

Authors:  Wenjing Zhang; Neng Huang; Jiantao Zheng; Xingyu Liao; Jianxin Wang; Hong-Dong Li
Journal:  Genes (Basel)       Date:  2019-01-14       Impact factor: 4.096

