BACKGROUND: Highly mutable RNA viruses exist in infected hosts as heterogeneous populations of genetically close variants known as quasispecies. Next-generation sequencing (NGS) makes it possible to analyse a large number of viral sequences from infected patients, presenting a novel opportunity for studying the structure of a viral population and understanding virus evolution, drug resistance and immune escape. Accurate reconstruction of the genetic composition of intra-host viral populations involves assembling the short NGS reads into whole-genome sequences and estimating the frequencies of individual viral variants. Although a few approaches have been developed for this task, accurate reconstruction of quasispecies populations remains largely unresolved.

RESULTS: Two new methods, AmpMCF and ShotMCF, were developed for reconstructing whole-genome intra-host viral variants and estimating their frequencies; both are based on Multicommodity Flows (MCFs). AmpMCF was designed for NGS reads obtained from individual PCR amplicons and ShotMCF for NGS shotgun reads. AmpMCF, based on a covering formulation, identifies a minimal set of quasispecies explaining all observed reads, whereas ShotMCF, based on a packing formulation, engages the maximal number of reads to generate the most probable set of quasispecies. Both methods were evaluated on simulated data in comparison with Maximum Bandwidth and ViSpA, previously developed state-of-the-art algorithms for estimating quasispecies spectra from NGS amplicon and shotgun reads, respectively. Both algorithms were accurate in estimating quasispecies frequencies, especially from large datasets.

CONCLUSIONS: The problem of viral population reconstruction from amplicon or shotgun NGS reads was solved using the MCF formulation. The two methods developed here, ShotMCF and AmpMCF, afford accurate reconstruction of the structure of intra-host viral populations from NGS reads.
The implementations of the algorithms are available at http://alan.cs.gsu.edu/vira.html (AmpMCF) and http://alan.cs.gsu.edu/NGS/?q=content/shotmcf (ShotMCF).
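To make the covering idea concrete (a minimal set of variants that explains every observed read), here is a small Python sketch. It is not the multicommodity-flow formulation used by AmpMCF or ShotMCF: it swaps in a plain greedy set-cover heuristic, and it assumes that candidate variants and the reads each one explains are already known. The function names `greedy_cover` and `estimate_frequencies`, the read and variant identifiers, and the even-split frequency rule are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def greedy_cover(candidates: Dict[str, Set[str]], reads: Set[str]) -> List[str]:
    """Pick candidate variants until every read is explained (greedy set cover).

    Assumption: `candidates` maps a variant name to the set of read ids it
    explains. This is a stand-in heuristic, not the MCF covering formulation.
    """
    uncovered = set(reads)
    chosen: List[str] = []
    while uncovered:
        # Take the variant explaining the most still-unexplained reads;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(candidates), key=lambda v: len(candidates[v] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            raise ValueError("some reads are not explained by any candidate")
        chosen.append(best)
        uncovered -= gained
    return chosen


def estimate_frequencies(
    candidates: Dict[str, Set[str]], chosen: List[str], reads: Set[str]
) -> Dict[str, float]:
    """Split each read evenly among the chosen variants that explain it."""
    weight = {v: 0.0 for v in chosen}
    for read in reads:
        owners = [v for v in chosen if read in candidates[v]]
        for v in owners:
            weight[v] += 1.0 / len(owners)
    total = sum(weight.values())
    return {v: w / total for v, w in weight.items()}


# Toy input: three hypothetical variants and four hypothetical reads.
candidates = {
    "v1": {"r1", "r2", "r3"},
    "v2": {"r3", "r4"},
    "v3": {"r2", "r4"},
}
reads = {"r1", "r2", "r3", "r4"}

chosen = greedy_cover(candidates, reads)
frequencies = estimate_frequencies(candidates, chosen, reads)
```

A packing-style objective would instead ask how many reads can be assigned to a set of consistent variants, which is why the abstract describes ShotMCF as engaging the maximal number of reads rather than requiring that every read be explained.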
Authors: E A Duarte; I S Novella; S C Weaver; E Domingo; S Wain-Hobson; D K Clarke; A Moya; S F Elena; J C de la Torre; J J Holland Journal: Infect Agents Dis Date: 1994-08
Authors: Thomas von Hahn; Joo Chun Yoon; Harvey Alter; Charles M Rice; Barbara Rehermann; Peter Balfe; Jane A McKeating Journal: Gastroenterology Date: 2006-12-03 Impact factor: 22.682
Authors: Pavel Skums; Zoya Dimitrova; David S Campo; Gilberto Vaughan; Livia Rossi; Joseph C Forbi; Jonny Yokosawa; Alex Zelikovsky; Yury Khudyakov Journal: BMC Bioinformatics Date: 2012-06-25 Impact factor: 3.169
Authors: Anton Eliseev; Keylie M Gibson; Pavel Avdeyev; Dmitry Novik; Matthew L Bendall; Marcos Pérez-Losada; Nikita Alexeev; Keith A Crandall Journal: Infect Genet Evol Date: 2020-03-06 Impact factor: 3.342
Authors: Nicholas C Wu; Justin De La Cruz; Laith Q Al-Mawsawi; C Anders Olson; Hangfei Qi; Harding H Luan; Nguyen Nguyen; Yushen Du; Shuai Le; Ting-Ting Wu; Xinmin Li; Martha J Lewis; Otto O Yang; Ren Sun Journal: PLoS One Date: 2014-05-19 Impact factor: 3.240
Authors: Damien C Tully; Colin B Ogilvie; Rebecca E Batorsky; David J Bean; Karen A Power; Musie Ghebremichael; Hunter E Bedard; Adrianne D Gladden; Aaron M Seese; Molly A Amero; Kimberly Lane; Graham McGrath; Suzane B Bazner; Jake Tinsley; Niall J Lennon; Matthew R Henn; Zabrina L Brumme; Philip J Norris; Eric S Rosenberg; Kenneth H Mayer; Heiko Jessen; Sergei L Kosakovsky Pond; Bruce D Walker; Marcus Altfeld; Jonathan M Carlson; Todd M Allen Journal: PLoS Pathog Date: 2016-05-10 Impact factor: 6.823
Authors: Sergey Knyazev; Viachaslau Tsyvina; Anupama Shankar; Andrew Melnyk; Alexander Artyomenko; Tatiana Malygina; Yuri B Porozov; Ellsworth M Campbell; William M Switzer; Pavel Skums; Serghei Mangul; Alex Zelikovsky Journal: Nucleic Acids Res Date: 2021-09-27 Impact factor: 16.971
Authors: Francesca Di Giallonardo; Armin Töpfer; Melanie Rey; Sandhya Prabhakaran; Yannick Duport; Christine Leemann; Stefan Schmutz; Nottania K Campbell; Beda Joos; Maria Rita Lecca; Andrea Patrignani; Martin Däumer; Christian Beisel; Peter Rusert; Alexandra Trkola; Huldrych F Günthard; Volker Roth; Niko Beerenwinkel; Karin J Metzner Journal: Nucleic Acids Res Date: 2014-06-27 Impact factor: 16.971