Sivan Leviyang (1), Igor Griva (2), Sergio Ita (3), Welkin E Johnson (4)
1. Department of Mathematics and Statistics, Georgetown University, Washington DC, 20057, USA.
2. Department of Mathematics, George Mason University, Fairfax, VA 22030, USA.
3. Department of Medicine, University of California - San Diego, La Jolla, CA 92093, USA.
4. Department of Biology, Boston College, Chestnut Hill, MA 02467, USA.
Abstract
MOTIVATION: Next-generation sequencing (NGS) has been increasingly applied to characterize viral evolution during HIV and SIV infections. In particular, NGS datasets sampled during the initial months of infection are characterized by relatively low levels of diversity as well as convergent evolution at multiple loci dispersed across the viral genome. Consequently, fully characterizing viral evolution from NGS datasets requires haplotype reconstruction across large regions of the viral genome. Existing haplotype reconstruction algorithms have not been developed with the particular characteristics of early HIV/SIV infection in mind, raising the possibility that better performance could be achieved through a specifically designed algorithm.

RESULTS: Here, we introduce a haplotype reconstruction algorithm, RegressHaplo, specifically designed for low-diversity and convergent-evolution regimes. The algorithm uses a penalized regression that balances a data-fitting term with a penalty term that encourages solutions with few haplotypes. The regression covariates are a large set of potential haplotypes, and fitting the regression is made computationally feasible by the low-diversity setting. Using simulated and in vivo datasets, we compare RegressHaplo to PredictHaplo and QuRe, two existing haplotype reconstruction algorithms. RegressHaplo performs better than these algorithms on simulated datasets with relatively low diversity levels. We suggest RegressHaplo as a novel tool for the investigation of early-infection HIV/SIV datasets and, more generally, low-diversity viral NGS datasets.

CONTACT: sr286@georgetown.edu.

AVAILABILITY AND IMPLEMENTATION: https://github.com/SLeviyang/RegressHaplo.
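To make the idea of a penalized regression over candidate haplotypes concrete, the following is a minimal illustrative sketch and not the RegressHaplo implementation. The abstract does not specify the penalty or the solver, so this sketch assumes a simple count penalty (`rho` times the number of haplotypes used) and an exhaustive search over subsets, which is only workable for a handful of candidates. It also assumes the data are per-position nucleotide frequencies, whereas real reads link several positions. All function names and parameters (`design_matrix`, `fit_on_simplex`, `penalized_fit`, `rho`) are hypothetical.

```python
from itertools import combinations

import numpy as np

ALPHABET = "ACGT"


def design_matrix(haplotypes):
    """Rows index (position, nucleotide); columns index candidate haplotypes."""
    n_pos = len(haplotypes[0])
    P = np.zeros((n_pos * len(ALPHABET), len(haplotypes)))
    for j, hap in enumerate(haplotypes):
        for pos, nt in enumerate(hap):
            P[pos * len(ALPHABET) + ALPHABET.index(nt), j] = 1.0
    return P


def project_to_simplex(v):
    """Euclidean projection onto {x : x >= 0, sum(x) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    last = np.nonzero(u - css / k > 0)[0][-1]
    return np.maximum(v - css[last] / (last + 1), 0.0)


def fit_on_simplex(P, y, n_iter=2000):
    """Least squares over frequency vectors via projected gradient descent."""
    n = P.shape[1]
    pi = np.full(n, 1.0 / n)
    # Step size 1/L, where L = 2 * sigma_max(P)^2 bounds the gradient's slope.
    step = 1.0 / (2.0 * np.linalg.norm(P, 2) ** 2)
    for _ in range(n_iter):
        grad = 2.0 * P.T @ (P @ pi - y)
        pi = project_to_simplex(pi - step * grad)
    return pi, float(np.sum((y - P @ pi) ** 2))


def penalized_fit(haplotypes, y, rho=0.01):
    """Minimize (squared residual + rho * number of haplotypes) by brute force.

    Assumption: a count penalty stands in for whatever penalty the paper uses.
    """
    P = design_matrix(haplotypes)
    best = None
    for size in range(1, len(haplotypes) + 1):
        for subset in combinations(range(len(haplotypes)), size):
            pi, rss = fit_on_simplex(P[:, list(subset)], y)
            score = rss + rho * size
            if best is None or score < best[0]:
                best = (score, subset, pi)
    _, subset, pi = best
    return {haplotypes[j]: float(f) for j, f in zip(subset, pi)}


# Toy data: two true haplotypes among four candidates (made-up example).
candidates = ["AAAA", "AAGA", "ACAA", "ACGA"]
y = design_matrix(candidates) @ np.array([0.7, 0.0, 0.0, 0.3])
result = penalized_fit(candidates, y)
print(result)
```

In this toy case the per-position frequencies alone are fit equally well by several mixtures of the four candidates, and the penalty term is what selects the two-haplotype explanation. Exhaustive search grows exponentially with the number of candidates, so a practical method would need a different optimization strategy.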