Jiao Chen (1), Yingchao Zhao (2), Yanni Sun (1)
(1) Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA.
(2) School of Computing and Information Sciences, Caritas Institute of Higher Education, Hong Kong, China.
Abstract
Motivation: RNA virus populations contain different but genetically related strains, all infecting an individual host. Reconstructing the viral haplotypes is a fundamental step in characterizing the virus population, predicting viral phenotypes, and ultimately providing important information for clinical treatment and prevention. Advances in next-generation sequencing technologies open up new opportunities to assemble full-length haplotypes. However, error-prone short reads, high similarity between related strains, and an unknown number of haplotypes pose computational challenges for reference-free haplotype reconstruction. There is still much room to improve the performance of existing haplotype assembly tools.

Results: In this work, we developed a de novo haplotype reconstruction tool named PEHaplo, which employs paired-end reads to distinguish highly similar strains in viral quasispecies data. It was applied to both simulated and real quasispecies data, and the results were benchmarked against several recently published de novo haplotype reconstruction tools. The comparison shows that PEHaplo outperforms the benchmarked tools on a comprehensive set of metrics.

Availability and implementation: The source code and the documentation of PEHaplo are available at https://github.com/chjiao/PEHaplo.

Supplementary information: Supplementary data are available at Bioinformatics online.