Comparative performance of supertree algorithms in large data sets using the soapberry family (Sapindaceae) as a case study.

Sven Buerki, Félix Forest, Nicolas Salamin, Nadir Alvarez.

Abstract

For the last two decades, supertree reconstruction has been an active field of research and has seen the development of a large number of algorithms. Because of the growing popularity of supertree methods, it has become necessary to evaluate the performance of these algorithms to determine which are the best options (especially relative to the widely used supermatrix approach). In this study, seven of the most commonly used supertree methods are investigated using a large empirical data set (in terms of number of taxa and molecular markers) from the worldwide flowering plant family Sapindaceae. Supertree methods were evaluated using several criteria: similarity of the supertrees to the input trees, similarity between the supertrees and the total evidence tree, level of resolution of the supertree, and computational time required by the algorithm. Additional analyses were conducted on a reduced data set to test whether performance levels were affected by the heuristic searches rather than by the algorithms themselves. Based on our results, two main groups of supertree methods were identified: the matrix representation with parsimony (MRP), MinFlip, and MinCut methods performed well according to our criteria, whereas the average consensus, split fit, and most similar supertree methods performed more poorly or at least did not behave the same way as the total evidence tree. Results for the super distance matrix, the most recent approach tested here, were promising, with at least one derived method performing as well as MRP, MinFlip, and MinCut. The output of each method was only slightly improved when applied to the reduced data set, suggesting correct behavior of the heuristic searches and a relatively low sensitivity of the algorithms to data set size and missing data. Results also showed that the MRP analyses could reach a high level of quality even when using a simple heuristic search strategy, with the exception of MRP with the Purvis coding scheme and reversible parsimony. The future of supertrees lies in the implementation of a standardized heuristic search for all methods and in increased computing power to handle large data sets. The latter would prove particularly useful for promising approaches such as the maximum quartet fit method, which still requires substantial computing power.
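
To make the matrix representation step of MRP concrete, the sketch below encodes a set of input trees as a binary character matrix using the standard Baum-Ragan coding: each non-root internal clade becomes one column, scored 1 for taxa inside the clade, 0 for other taxa in the same tree, and ? for taxa absent from that tree. This is an illustration only, not the pipeline used in the study. The nested-tuple tree format and the names `leaves`, `clades`, and `mrp_matrix` are assumptions made for this example, and the parsimony search that would follow is not shown.

```python
from typing import Dict, FrozenSet, List, Tuple, Union

# Hypothetical tree format for this sketch: a leaf is a taxon name,
# an internal node is a tuple of child subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Return the set of taxon names under a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree: Tree) -> List[FrozenSet[str]]:
    """Return the leaf sets of all internal nodes except the root."""
    found: List[FrozenSet[str]] = []

    def walk(node: Tree, is_root: bool) -> None:
        if isinstance(node, str):
            return
        if not is_root:
            found.append(leaves(node))
        for child in node:
            walk(child, False)

    walk(tree, True)
    return found


def mrp_matrix(trees: List[Tree]) -> Dict[str, str]:
    """Baum-Ragan coding: one column per clade of each input tree."""
    all_taxa = sorted(frozenset().union(*(leaves(t) for t in trees)))
    rows = {taxon: "" for taxon in all_taxa}
    for tree in trees:
        present = leaves(tree)
        for clade in clades(tree):
            for taxon in all_taxa:
                if taxon not in present:
                    rows[taxon] += "?"  # taxon missing from this input tree
                elif taxon in clade:
                    rows[taxon] += "1"  # taxon belongs to the clade
                else:
                    rows[taxon] += "0"  # taxon in the tree, outside the clade
    return rows


if __name__ == "__main__":
    # Two small input trees with partially overlapping taxa.
    input_trees: List[Tree] = [(("A", "B"), ("C", "D")), (("A", "C"), "E")]
    for taxon, row in mrp_matrix(input_trees).items():
        print(taxon, row)
```

The resulting matrix would then be analysed with a parsimony program to obtain the supertree. The ? entries show how taxa missing from individual input trees enter the analysis as missing data.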

Year:  2010        PMID: 21068445     DOI: 10.1093/sysbio/syq057

Source DB:  PubMed          Journal:  Syst Biol        ISSN: 1063-5157            Impact factor:   15.683


  10 in total

1.  Computing diversity from dated phylogenies and taxonomic hierarchies: does it make a difference to the conclusions?

Authors:  Carlo Ricotta; Giovanni Bacaro; Michela Marignani; Sandrine Godefroid; Stefano Mazzoleni
Journal:  Oecologia       Date:  2012-04-17       Impact factor: 3.225

2.  The abrupt climate change at the Eocene-Oligocene boundary and the emergence of South-East Asia triggered the spread of sapindaceous lineages.

Authors:  Sven Buerki; Félix Forest; Tanja Stadler; Nadir Alvarez
Journal:  Ann Bot       Date:  2013-05-30       Impact factor: 4.357

3.  Building megaphylogenies for macroecology: taking up the challenge.

Authors:  Cristina Roquet; Wilfried Thuiller; Sébastien Lavergne
Journal:  Ecography       Date:  2013-01-01       Impact factor: 5.992

4.  Comparative transcriptomics of early dipteran development.

Authors:  Eva Jiménez-Guri; Jaime Huerta-Cepas; Luca Cozzuto; Karl R Wotton; Hui Kang; Heinz Himmelbauer; Guglielmo Roma; Toni Gabaldón; Johannes Jaeger
Journal:  BMC Genomics       Date:  2013-02-24       Impact factor: 3.969

5.  Polynomial supertree methods revisited.

Authors:  Malte Brinkmeyer; Thasso Griebel; Sebastian Böcker
Journal:  Adv Bioinformatics       Date:  2011-12-21

6.  An empirical evaluation of two-stage species tree inference strategies using a multilocus dataset from North American pines.

Authors:  Michael DeGiorgio; John Syring; Andrew J Eckert; Aaron Liston; Richard Cronn; David B Neale; Noah A Rosenberg
Journal:  BMC Evol Biol       Date:  2014-03-29       Impact factor: 3.260

7.  An updated infra-familial classification of Sapindaceae based on targeted enrichment data.

Authors:  Sven Buerki; Martin W Callmander; Pedro Acevedo-Rodriguez; Porter P Lowry; Jérôme Munzinger; Paul Bailey; Olivier Maurin; Grace E Brewer; Niroshini Epitawalage; William J Baker; Félix Forest
Journal:  Am J Bot       Date:  2021-07-05       Impact factor: 3.325

8.  Ancient origins of vertebrate-specific innate antiviral immunity.

Authors:  Krishanu Mukherjee; Bryan Korithoski; Bryan Kolaczkowski
Journal:  Mol Biol Evol       Date:  2013-10-08       Impact factor: 16.240

9.  Reconciliation of gene and species trees.

Authors:  L Y Rusin; E V Lyubetskaya; K Y Gorbunov; V A Lyubetsky
Journal:  Biomed Res Int       Date:  2014-03-27       Impact factor: 3.411

10.  Colony size predicts division of labour in attine ants.

Authors:  Henry Ferguson-Gow; Seirian Sumner; Andrew F G Bourke; Kate E Jones
Journal:  Proc Biol Sci       Date:  2014-10-22       Impact factor: 5.349
