Cara Stockham (1), Li-San Wang, Tandy Warnow. (1) Texas Institute for Computational and Applied Mathematics, University of Texas, ACES 6.412, Austin TX 78712, USA.
Abstract
MOTIVATION: Phylogenetic analyses often produce thousands of candidate trees. Biologists typically resolve the conflict among them by computing a consensus of these trees. Used as a postprocessing step, single-tree consensus methods can be unsatisfactory because of their inherent limitations. RESULTS: In this paper we present an alternative approach that applies clustering algorithms to the set of candidate trees. We propose bicriterion problems, in particular ones based on the concept of information loss, and new consensus trees, called characteristic trees, that minimize the information loss. Our empirical study on four biological datasets shows that this approach significantly improves information content while adding only a small amount of complexity. Furthermore, the consensus trees we obtain for each of our large clusters are more resolved than the single-tree consensus trees. We also report initial progress on theoretical questions that arise in this context.
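To illustrate the idea summarized in the abstract, the following minimal Python sketch compares a single strict consensus with per-cluster strict consensus trees. Everything in it is an assumption made for illustration and is not taken from the paper: the four toy trees, the encoding of a rooted tree as a set of clades, the Robinson-Foulds-style distance, the naive two-seed clustering, and the clade count used as a crude stand-in for resolution. The paper's own information-loss criterion and characteristic trees are not reproduced here.

```python
from itertools import combinations


def clade(*taxa):
    """A clade is the set of taxa below one internal node."""
    return frozenset(taxa)


def strict_consensus(trees):
    """Clades that appear in every tree of the collection."""
    return frozenset.intersection(*trees)


def rf_distance(t1, t2):
    """Robinson-Foulds-style distance: clades found in exactly one tree."""
    return len(t1 ^ t2)


def two_seed_clusters(trees):
    """Naive 2-clustering (illustrative only): take the most distant
    pair of trees as seeds and assign each tree to the nearer seed."""
    i, j = max(
        combinations(range(len(trees)), 2),
        key=lambda p: rf_distance(trees[p[0]], trees[p[1]]),
    )
    first, second = [], []
    for t in trees:
        if rf_distance(t, trees[i]) <= rf_distance(t, trees[j]):
            first.append(t)
        else:
            second.append(t)
    return first, second


# Hypothetical rooted trees on taxa A..E, given by their nontrivial clades.
trees = [
    frozenset({clade("A", "B"), clade("A", "B", "C"), clade("D", "E")}),
    frozenset({clade("A", "B"), clade("A", "B", "C"), clade("A", "B", "C", "D")}),
    frozenset({clade("A", "B"), clade("C", "D"), clade("C", "D", "E")}),
    frozenset({clade("C", "D"), clade("C", "D", "E"), clade("B", "C", "D", "E")}),
]

single = strict_consensus(trees)
cluster_a, cluster_b = two_seed_clusters(trees)
consensus_a = strict_consensus(cluster_a)
consensus_b = strict_consensus(cluster_b)

print("single consensus clades:", len(single))
print("cluster consensus clades:", len(consensus_a), len(consensus_b))
```

On this toy input the single strict consensus retains no nontrivial clade, whereas each of the two clusters retains two, which mirrors the abstract's claim that per-cluster consensus trees can be more resolved than a single consensus tree.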