Alexey Markin (1), Sanket Wagle (2), Tavis K Anderson (1), Oliver Eulenstein (2).
(1) Virus and Prion Research Unit, National Animal Disease Center, USDA-ARS, Ames, IA, 50010, USA.
(2) Department of Computer Science, Iowa State University, Ames, IA, 50011, USA.
Abstract
MOTIVATION: A phylogenetic network is a powerful model for representing entangled evolutionary histories that involve both divergent (speciation) and convergent (e.g., hybridization, reassortment, recombination) evolution. The standard approach to inferring hybridization networks is to (i) reconstruct rooted gene trees and (ii) leverage gene tree discordance for network inference. Recently, we introduced a method called RF-Net for accurate inference of virus reassortment and hybridization networks from input gene trees in the presence of errors commonly found in phylogenetic trees. While RF-Net accurately inferred networks with up to four reticulations from erroneous input gene trees, its application was limited by the number of reticulations it could handle in a reasonable amount of time. This limitation is particularly restrictive when inferring the evolutionary history of segmented RNA viruses such as influenza A virus (IAV), where reassortment is one of the major mechanisms shaping the evolution of these pathogens.
RESULTS: Here, we extend RF-Net in ways that make it significantly more applicable in practice. Crucially, we introduce a fast extension to RF-Net, called Fast-RF-Net, that can handle large numbers of reticulations without sacrificing accuracy. Additionally, we develop automatic stopping criteria that heuristically select the appropriate number of reticulations, and we implement a feature that lets RF-Net output error-corrected input gene trees. We then conduct a comprehensive study of the original method and its novel extensions and confirm their efficacy in practice using extensive simulations and empirical analyses of influenza A virus evolution.
AVAILABILITY: RF-Net 2 is available at https://github.com/flu-crew/rf-net-2.