Evaluation of haplotype callers for next-generation sequencing of viruses.

Anton Eliseev, Keylie M Gibson, Pavel Avdeyev, Dmitry Novik, Matthew L Bendall, Marcos Pérez-Losada, Nikita Alexeev, Keith A Crandall.

Abstract

Currently, the standard practice for assembling next-generation sequencing (NGS) reads of viral genomes is to summarize thousands of individual short reads into a single consensus sequence, thus confounding useful intra-host diversity information for molecular phylodynamic inference. It is hypothesized that a few viral strains may dominate the intra-host genetic diversity, with a variety of lower-frequency strains comprising the rest of the population. Several software tools currently exist to convert NGS sequence variants into haplotypes. Previous benchmarks of viral haplotype reconstruction programs used simulation scenarios that are useful from a mathematical perspective but do not reflect viral evolution and epidemiology. Here, we tested twelve NGS haplotype reconstruction methods using viral populations simulated under realistic evolutionary dynamics. We simulated coalescent-based populations that spanned known levels of viral genetic diversity, including mutation rates, sample size, and effective population size, to test the limits of the haplotype reconstruction methods and to ensure coverage of predicted intra-host viral diversity levels (especially HIV-1). All twelve investigated haplotype callers showed variable performance and produced drastically different results that were mainly driven by differences in mutation rate and, to a lesser extent, in effective population size. Most methods were able to accurately reconstruct haplotypes when genetic diversity was low. However, under higher levels of diversity (e.g., those seen in intra-host HIV-1 infections), haplotype reconstruction quality was highly variable and, on average, poor. All haplotype reconstruction tools, except QuasiRecomb and ShoRAH, greatly underestimated intra-host diversity and the true number of haplotypes. PredictHaplo outperformed the other haplotype reconstruction tools, with the highest precision and recall and the lowest UniFrac distance values, followed by CliqueSNV, which, given more computational time, may have outperformed PredictHaplo. Here, we present an extensive comparison of available viral haplotype reconstruction tools and provide insights for future improvements in haplotype reconstruction tools using both short-read and long-read technologies.
Copyright © 2020 Elsevier B.V. All rights reserved.
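
The abstract ranks callers by precision and recall of the reconstructed haplotypes. As a rough illustration only, the sketch below computes both metrics for one simulated sample, assuming a predicted haplotype counts as correct only when it matches a true haplotype exactly. The function name `precision_recall`, the exact-match criterion, and the toy sequences are assumptions made for this example and are not taken from the paper, whose matching criteria may differ.

```python
def precision_recall(true_haplotypes, predicted_haplotypes):
    """Return (precision, recall) for one sample.

    Assumption: a prediction is a true positive only if its sequence is
    identical to one of the true haplotypes (exact match).
    """
    true_set = set(true_haplotypes)
    pred_set = set(predicted_haplotypes)
    true_positives = len(true_set & pred_set)
    # Precision: fraction of predicted haplotypes that are real.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    # Recall: fraction of real haplotypes that were recovered.
    recall = true_positives / len(true_set) if true_set else 0.0
    return precision, recall


# Toy data (hypothetical): four true haplotypes, three predicted ones.
truth = ["ACGTACGT", "ACGTTCGT", "ACGAACGT", "TCGTACGT"]
called = ["ACGTACGT", "ACGTTCGT", "ACGTACGA"]

p, r = precision_recall(truth, called)
print(f"precision={p:.2f} recall={r:.2f} "
      f"true={len(set(truth))} called={len(set(called))}")
```

In this toy sample the caller reports fewer haplotypes than are truly present, which is the kind of underestimation the abstract describes for most tools.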

Keywords: Fast-evolving viruses; HIV; Haplotype reconstruction; Intra-host diversity; Next-generation sequencing; Simulations

Year: 2020; PMID: 32151775; PMCID: PMC7293574; DOI: 10.1016/j.meegid.2020.104277

Source DB: PubMed; Journal: Infect Genet Evol; ISSN: 1567-1348; Impact factor: 3.342


