RANGER-DTL 2.0: rigorous reconstruction of gene-family evolution by duplication, transfer and loss.

Mukul S Bansal, Manolis Kellis, Misagh Kordi, Soumya Kundu.

Abstract

Summary: RANGER-DTL 2.0 is a software program for inferring gene family evolution using Duplication-Transfer-Loss reconciliation. This new software is highly scalable and easy to use, and offers many new features not currently available in any other reconciliation program. RANGER-DTL 2.0 has a particular focus on reconciliation accuracy and can account for many sources of reconciliation uncertainty including uncertain gene tree rooting, gene tree topological uncertainty, multiple optimal reconciliations and alternative event cost assignments. RANGER-DTL 2.0 is open-source and written in C++ and Python.

Availability and implementation: Pre-compiled executables, source code (open-source under GNU GPL) and a detailed manual are freely available from http://compbio.engr.uconn.edu/software/RANGER-DTL/.

Supplementary information: Supplementary data are available at Bioinformatics online.

Year:  2018        PMID: 29688310      PMCID: PMC6137995          DOI: 10.1093/bioinformatics/bty314

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Duplication-Transfer-Loss (DTL) reconciliation is widely recognized as one of the most powerful computational techniques for understanding the evolution of microbial gene families (Kamneva and Ward, 2014). DTL reconciliation works by comparing a given gene tree (for the gene family of interest) against the corresponding species tree and postulating gene duplication, horizontal gene transfer and gene loss events to explain the evolution of that gene tree inside the species tree. The result of DTL reconciliation is a mapping of the nodes of the gene tree to nodes (or edges) of the species tree, showing the embedding of the gene tree inside the species tree, as well as a labeling of each internal node of the gene tree as either a speciation, duplication, or transfer event. Such detailed knowledge of gene family evolution has many important biological applications, and the DTL reconciliation problem has therefore been extensively studied, e.g. (Bansal et al., 2013; David and Alm, 2011; Doyon et al.; Jacox et al.; Kordi and Bansal, 2016; Sjöstrand et al.; Stolzer et al.; Szöllősi et al.; Tofigh et al.). While probabilistic models of DTL evolution also exist (Sjöstrand et al.; Szöllősi et al.), we focus here on parsimony-based models of DTL reconciliation, which are much more scalable and require fewer parameters. Parsimony-based DTL reconciliation is also known to be highly accurate in practice; see Section S3 in the Supplementary Material for a detailed discussion on accuracy.

A preliminary version of RANGER-DTL (short for Rapid ANalysis of Gene family Evolution using Reconciliation-DTL) was released in 2012 with a paper on the algorithmics of DTL reconciliation (Bansal et al.), providing only rudimentary functionality. Despite its limited functionality, the preliminary version of RANGER-DTL has been frequently used for biological data analysis (Dupont and Cox, 2017; Heitlinger et al.; Heshiki et al.; Jeong et al.; Koczyk et al.; Ricci et al.). Here, we release the first full version of RANGER-DTL with greatly extended and improved functionality, and featuring the new algorithms and techniques developed in Bansal et al.; Kordi and Bansal (2016); Kundu and Bansal (2018).
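
To make the parsimony criterion concrete, the following minimal Python sketch scores a single reconciliation as the cost-weighted sum of its events. It is an illustration only and not code from RANGER-DTL: the function name `reconciliation_cost`, the toy node names and the event cost values are all assumptions chosen for the example, and speciations are assumed to cost nothing.

```python
from collections import Counter

# Hypothetical event costs; these values are illustrative assumptions,
# not settings taken from the paper.
EVENT_COSTS = {"speciation": 0, "duplication": 2, "transfer": 3, "loss": 1}


def reconciliation_cost(node_events, num_losses, costs=EVENT_COSTS):
    """Return the total parsimony cost of one reconciliation.

    node_events maps each internal gene-tree node to its event label;
    num_losses is the number of gene losses implied by the embedding.
    """
    counts = Counter(node_events.values())
    event_cost = sum(costs[event] * n for event, n in counts.items())
    return event_cost + costs["loss"] * num_losses


# Toy reconciliation: three internal gene-tree nodes and one implied loss.
toy_events = {"g1": "speciation", "g2": "duplication", "g3": "transfer"}
total = reconciliation_cost(toy_events, num_losses=1)
print(total)
```

Under this scoring, an optimal reconciliation is one that minimizes the total over all valid embeddings of the gene tree in the species tree; the sketch only evaluates one candidate and does not perform that search.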

2 Features

RANGER-DTL 2.0 is designed to enable fast and rigorous analysis of gene families and provides several advanced features not available in any other reconciliation software. The software takes as input a gene tree (rooted or unrooted) and a rooted species tree and reconciles the two by postulating speciation, duplication, transfer and loss events. Advanced capabilities of RANGER-DTL 2.0 include (i) principled handling of unrooted gene trees by considering all possible optimal rootings, (ii) uniformly random sampling of the space of all optimal reconciliations, making it possible to compute multiple optimal reconciliations and account for the variability in optimal reconciliation scenarios, (iii) use of distance-dependent transfer costs to better model transfer dynamics, (iv) handling gene tree uncertainty by collapsing weakly supported gene tree edges and computing and considering all optimal resolutions of the gene tree and (v) computing support values for individual DTL event inferences and species mapping assignments while accounting for multiple optimal reconciliations, uncertainty in gene tree rooting, alternative event cost assignments and even gene tree topological uncertainty. Furthermore, RANGER-DTL 2.0 can efficiently analyze trees with thousands of taxa. While it can handle both undated and fully-dated species trees, the focus of RANGER-DTL 2.0 is on undated species trees, for which it offers the most options and functionality. The reason for focusing on undated species trees is explained in Section S1 in the Supplementary Material. Several features of RANGER-DTL 2.0, including consideration of all optimal gene tree roots, all possible optimal resolutions of unresolved gene trees and distance-dependent transfer costs, are not available in any comparable software package. A detailed comparison of RANGER-DTL 2.0 with existing DTL reconciliation software appears in Section S2 of the Supplementary Material.
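
As an illustration of how support values for event inferences can be derived from multiple optimal reconciliations, the sketch below tallies, for each gene-tree node, the fraction of sampled reconciliations that assign it each event type. This is a simplified stand-in rather than the package's own aggregation code: the function name `event_support`, the dictionary-based input format and the toy samples are assumptions, and every sample is assumed to label the same set of nodes.

```python
from collections import Counter, defaultdict


def event_support(samples):
    """Return, per gene-tree node, the fraction of samples assigning each event.

    samples is a list of dicts, one per sampled optimal reconciliation,
    each mapping a gene-tree node name to an event label.
    """
    tallies = defaultdict(Counter)
    for reconciliation in samples:
        for node, event in reconciliation.items():
            tallies[node][event] += 1
    n = len(samples)
    return {
        node: {event: count / n for event, count in counter.items()}
        for node, counter in tallies.items()
    }


# Four hypothetical sampled reconciliations over two internal nodes.
samples = [
    {"g1": "speciation", "g2": "transfer"},
    {"g1": "speciation", "g2": "duplication"},
    {"g1": "speciation", "g2": "transfer"},
    {"g1": "speciation", "g2": "transfer"},
]
support = event_support(samples)
print(support)
```

A node labeled identically in every sample receives full support, whereas a node whose label varies across optimal reconciliations receives fractional support for each alternative. The same tallying idea extends to species mapping assignments if each sample also records the mapped species node.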

3 Availability and requirements

The software package consists of ten related programs designed to work together to support various reconciliation analyses. These programs are organized into (i) three core programs, which define the core functionality of RANGER-DTL 2.0 and are designed to be applied sequentially, (ii) five supplementary programs that provide additional functionality and (iii) two summary scripts. Further details on the implementation of RANGER-DTL 2.0 are given in Section S4 of the Supplementary Material. RANGER-DTL 2.0 is available open-source under the GNU General Public License v3. Pre-compiled executables for Linux, Mac, and Windows, source code and a detailed manual are freely available online. The eight core and supplementary programs are written in C++ and can be compiled on any operating system with a C++ compiler supporting the ANSI C++ standard. These C++ programs use standard C++ libraries along with the freely available and widely used Boost C++ libraries (http://www.boost.org/). The two summary scripts are written in Python and can be run on any operating system with a Python interpreter. RANGER-DTL is designed to be efficient in both time complexity and memory requirements, and all programs, except for the two that consider unresolved gene trees, are scalable to hundreds or thousands of genes and taxa on commodity hardware. For instance, computing an optimal reconciliation with the core Ranger-DTL program requires approximately 5 s when the species tree and gene tree each have 200 leaves and approximately 9 min when each has 1000 leaves, on a desktop computer with a 3.1 GHz Intel i5 processor; both instances require less than 1 GB of RAM. In fact, with the supplementary program Ranger-DTL-Fast, reconciling the 1000-leaf trees takes less than a second.
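
The two timings quoted above for the core program give a rough sense of how its running time grows with tree size. The short calculation below assumes, purely for illustration, that runtime follows a power law in the number of leaves and estimates the exponent from those two reported data points; the variable names are our own, and with only two approximate measurements the result is an empirical indication rather than a complexity bound.

```python
import math

# Timings reported in the text for the core program.
small_leaves, small_seconds = 200, 5.0
large_leaves, large_seconds = 1000, 9 * 60.0

# If runtime grows like leaves**k, then k = log(time ratio) / log(size ratio).
time_ratio = large_seconds / small_seconds
size_ratio = large_leaves / small_leaves
exponent = math.log(time_ratio) / math.log(size_ratio)
print(f"empirical scaling exponent: {exponent:.2f}")
```

A fivefold increase in leaves corresponds to a 108-fold increase in time, which gives an exponent of roughly 2.9 under the power-law assumption.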

4 Conclusion

Accurate and efficient DTL reconciliation of gene trees and species trees is crucial to understanding microbial gene and species evolution and to inferring horizontal gene transfer and other evolutionary events. RANGER-DTL 2.0 makes it possible to perform fast and rigorous analysis of gene family evolution through DTL reconciliation and offers many important features, such as consideration of all optimal gene tree roots, all possible optimal resolutions of unresolved gene trees, and distance-dependent transfer costs, that are not available in any comparable reconciliation software. RANGER-DTL is also designed to be easy to use, with easily interpretable results. There are several additional features that we intend to add to RANGER-DTL to further improve its functionality and accuracy. These include fast heuristics for handling gene tree uncertainty and estimating its impact on the reconciliation, and consideration of transfers from unsampled or extinct lineages, e.g. (Jacox et al.). These and other new features will be extensively tested to assess their impact on DTL reconciliation accuracy, and those that result in an improvement will be added to RANGER-DTL.

Funding

This work was supported in part by U.S. National Science Foundation CAREER award IIS 1553421 and by U.S. National Science Foundation awards MCB 1616514 and IES 1615573 to MSB, and by a University of Connecticut Summer Undergraduate Research Fund award to SK.

Conflict of Interest: none declared.

References

1.  Reconciliation revisited: handling multiple optima when reconciling with duplication, transfer, and loss.

Authors:  Mukul S Bansal; Eric J Alm; Manolis Kellis
Journal:  J Comput Biol       Date:  2013-09-14       Impact factor: 1.479

2.  Simultaneous identification of duplications and lateral gene transfers.

Authors:  Ali Tofigh; Michael Hallett; Jens Lagergren
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2011 Mar-Apr       Impact factor: 3.710

3.  Phylogenetic modeling of lateral gene transfer reconstructs the pattern and relative timing of speciations.

Authors:  Gergely J Szöllősi; Bastien Boussau; Sophie S Abby; Eric Tannier; Vincent Daubin
Journal:  Proc Natl Acad Sci U S A       Date:  2012-10-04       Impact factor: 11.205

4.  A Bayesian method for analyzing lateral gene transfer.

Authors:  Joel Sjöstrand; Ali Tofigh; Vincent Daubin; Lars Arvestad; Bengt Sennblad; Jens Lagergren
Journal:  Syst Biol       Date:  2014-02-20       Impact factor: 15.683

5.  Inferring duplications, losses, transfers and incomplete lineage sorting with nonbinary species trees.

Authors:  Maureen Stolzer; Han Lai; Minli Xu; Deepa Sathaye; Benjamin Vernot; Dannie Durand
Journal:  Bioinformatics       Date:  2012-09-15       Impact factor: 6.937

6.  Efficient algorithms for the reconciliation problem with gene duplication, horizontal transfer and loss.

Authors:  Mukul S Bansal; Eric J Alm; Manolis Kellis
Journal:  Bioinformatics       Date:  2012-06-15       Impact factor: 6.937

7.  The genome of Eimeria falciformis--reduction and specialization in a single host apicomplexan parasite.

Authors:  Emanuel Heitlinger; Simone Spork; Richard Lucius; Christoph Dieterich
Journal:  BMC Genomics       Date:  2014-08-20       Impact factor: 3.969

8.  Genomic Data Quality Impacts Automated Detection of Lateral Gene Transfer in Fungi.

Authors:  Pierre-Yves Dupont; Murray P Cox
Journal:  G3 (Bethesda)       Date:  2017-04-03       Impact factor: 3.154

9.  Toward a Metagenomic Understanding on the Bacterial Composition and Resistome in Hong Kong Banknotes.

Authors:  Yoshitaro Heshiki; Thrimendra Dissanayake; Tingting Zheng; Kang Kang; Ni Yueqiong; Zeling Xu; Chinmoy Sarkar; Patrick C Y Woo; Billy K C Chow; David Baker; Aixin Yan; Christopher J Webster; Gianni Panagiotou; Jun Li
Journal:  Front Microbiol       Date:  2017-04-13       Impact factor: 5.640

10.  The Distant Siblings-A Phylogenomic Roadmap Illuminates the Origins of Extant Diversity in Fungal Aromatic Polyketide Biosynthesis.

Authors:  Grzegorz Koczyk; Adam Dawidziuk; Delfina Popiel
Journal:  Genome Biol Evol       Date:  2015-11-03       Impact factor: 3.416
