Bie M P Verbist, Kim Thys, Joke Reumers, Yves Wetzels, Koen Van der Borght, Willem Talloen, Jeroen Aerssens, Lieven Clement, Olivier Thas.
Affiliations: Department of Mathematical Modeling, Statistics and Bioinformatics, Ghent University, Coupure Links 653, 9000 Gent; Janssen R&D, Janssen Pharmaceutical Companies of Johnson & Johnson, Turnhoutseweg 30, 2340 Beerse; Applied Mathematics, Informatics and Statistics, Ghent University, Krijgslaan 281 S9, 9000 Gent, Belgium; and University of Wollongong, National Institute for Applied Statistics Research Australia (NIASRA), School of Mathematics and Applied Statistics, NSW 2522, Australia.
Abstract
MOTIVATION: In virology, massively parallel sequencing (MPS) opens many opportunities for studying viral quasi-species, e.g. in HIV-1- and HCV-infected patients. This is essential for understanding pathways to resistance, which can substantially improve treatment. Although MPS platforms allow in-depth characterization of sequence variation, their measurements still involve substantial technical noise. For Illumina sequencing, single base substitutions are the main error source and impede powerful assessment of low-frequency mutations. Fortunately, base calls are complemented with quality scores (Qs) that are useful for differentiating errors from real low-frequency mutations.
RESULTS: A variant calling tool, Q-cpileup, is proposed, which exploits the Qs of nucleotides in a filtering strategy to increase specificity. The tool is embedded in an open-source pipeline, VirVarSeq, which allows variant calling starting from fastq files. Using both plasmid mixtures and clinical samples, we show that Q-cpileup is able to reduce the number of false-positive findings. The filtering strategy is adaptive and provides an optimized threshold for individual samples in each sequencing run. Additionally, linkage information between single-nucleotide polymorphisms is kept because variants are called at the codon level. This gives virologists an immediate biological interpretation of the reported variants with respect to their antiviral drug responses. A comparison with existing SNP caller tools reveals that calling variants at the codon level with Q-cpileup results in outstanding sensitivity while maintaining good specificity for variants with frequencies down to 0.5%.
AVAILABILITY: VirVarSeq is available, together with a user's guide and test data, on SourceForge: http://sourceforge.net/projects/virtools/?source=directory.
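The idea of quality-filtered variant calling at the codon level can be illustrated with a small sketch. This is not the Q-cpileup implementation: the read representation, the function name `codon_frequencies`, and the fixed threshold `min_q` are assumptions made for illustration, and the adaptive, per-sample threshold described in the abstract is not reproduced here. The sketch only shows why counting whole codons keeps linked substitutions together, and how discarding codons that contain a low-quality base removes a likely sequencing error from the frequency table.

```python
from collections import Counter


def codon_frequencies(reads, codon_start, min_q):
    """Return codon frequencies at one reference codon.

    reads       -- list of (start, seq, quals); start is the 0-based reference
                   position of the first base, reads are assumed gap-free
    codon_start -- 0-based reference position of the codon's first base
    min_q       -- hypothetical fixed Phred threshold (the published method
                   derives its threshold adaptively, which is not shown here)
    """
    counts = Counter()
    for start, seq, quals in reads:
        offset = codon_start - start
        # Skip reads that do not cover all three bases of the codon.
        if offset < 0 or offset + 3 > len(seq):
            continue
        # Discard the codon if any of its bases falls below the threshold.
        if min(quals[offset:offset + 3]) < min_q:
            continue
        counts[seq[offset:offset + 3]] += 1
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


# Toy data: the second codon (reference positions 3..5) is AAA in most reads.
reads = [
    (0, "ATGAAA", [30, 30, 30, 30, 30, 30]),
    (0, "ATGAAA", [30, 30, 30, 30, 30, 30]),
    (0, "ATGAGA", [30, 30, 30, 30, 5, 30]),   # mismatch with a low-quality base
    (3, "AAA", [30, 30, 30]),
    (0, "ATGCGA", [30, 30, 30, 30, 30, 30]),  # two linked high-quality changes
]

filtered = codon_frequencies(reads, codon_start=3, min_q=20)
unfiltered = codon_frequencies(reads, codon_start=3, min_q=0)
print(filtered)
print(unfiltered)
```

In this toy example the codon AGA appears only in the unfiltered table, because its mismatching base has a low quality score, while CGA survives filtering and is reported as a single codon variant instead of two independent SNPs.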