A simple algorithm to infer gene duplication and speciation events on a gene tree.

C M Zmasek, S R Eddy.

Abstract

MOTIVATION: When analyzing protein sequences using sequence similarity searches, orthologous sequences (that diverged by speciation) are more reliable predictors of a new protein's function than paralogous sequences (that diverged by gene duplication), because duplication enables functional diversification. The utility of phylogenetic information in high-throughput genome annotation ('phylogenomics') is widely recognized, but existing approaches are either manual or indirect (e.g. not based on phylogenetic trees). Our goal is to automate phylogenomics using explicit phylogenetic inference. A necessary component is an algorithm to infer speciation and duplication events in a given gene tree.
RESULTS: We give an algorithm to infer speciation and duplication events on a gene tree by comparison to a trusted species tree. This algorithm has a worst-case running time of O(n^2), which is inferior to two previous algorithms that are approximately O(n) for a gene tree of n sequences. However, our algorithm is extremely simple, and its asymptotic worst-case behavior is only realized on pathological data sets. We show empirically, using 1750 gene trees constructed from the Pfam protein family database, that it appears to be a practical (and often superior) algorithm for analyzing real gene trees.
AVAILABILITY: http://www.genetics.wustl.edu/eddy/forester.
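
To make the idea in RESULTS concrete, here is a minimal Python sketch of inferring duplication and speciation events by mapping each gene-tree node onto a species tree. It is an illustration under stated assumptions, not the authors' forester code: the `Node` class, `number_preorder`, `infer_events`, and the `species_of` callback are hypothetical names, both trees are assumed to be rooted and binary, and every gene is assumed to belong to a species present in the species tree. A gene-tree node is labelled a duplication when it maps to the same species-tree node as one of its children.

```python
class Node:
    """Rooted tree node; leaves carry a name, internal nodes carry children."""
    def __init__(self, name=None, children=()):
        self.name = name
        self.children = list(children)
        self.parent = None
        self.order = None    # preorder index (set on species-tree nodes)
        self.mapped = None   # species-tree node a gene-tree node maps to
        self.event = None    # "duplication" or "speciation" (internal gene nodes)
        for child in self.children:
            child.parent = self


def number_preorder(root):
    """Number species-tree nodes in preorder; return {leaf name: leaf node}."""
    leaves, stack, index = {}, [root], 0
    while stack:
        node = stack.pop()
        node.order = index
        index += 1
        if not node.children:
            leaves[node.name] = node
        stack.extend(reversed(node.children))
    return leaves


def infer_events(gene_root, species_root, species_of):
    """Label each internal gene-tree node as a duplication or a speciation.

    species_of is an assumed callback: gene leaf name -> species leaf name.
    """
    leaves = number_preorder(species_root)

    def visit(g):  # postorder traversal of the gene tree
        if not g.children:
            g.mapped = leaves[species_of(g.name)]
            return
        for child in g.children:
            visit(child)
        a, b = (child.mapped for child in g.children)
        # Climb from whichever node has the larger preorder number until the
        # two meet; an ancestor always has a smaller number than a descendant.
        while a is not b:
            if a.order > b.order:
                a = a.parent
            else:
                b = b.parent
        g.mapped = a
        is_dup = any(child.mapped is a for child in g.children)
        g.event = "duplication" if is_dup else "speciation"

    visit(gene_root)
    return gene_root


# Toy example: species tree ((human, mouse), fly) and a gene family that
# duplicated before the human/mouse split.
species = Node(children=[Node(children=[Node("human"), Node("mouse")]),
                         Node("fly")])
gene = Node(children=[
    Node(children=[Node("human_a"), Node("mouse_a")]),
    Node(children=[Node("human_b"), Node("mouse_b")]),
])
infer_events(gene, species, species_of=lambda name: name.split("_")[0])
print(gene.event, [child.event for child in gene.children])
# prints: duplication ['speciation', 'speciation']
```

In this sketch the climb toward the common ancestor is bounded by the height of the species tree, so a deep, unbalanced species tree can cost on the order of n steps per gene-tree node. That is consistent with the quadratic worst case quoted above, while balanced trees need far fewer steps.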

Year:  2001        PMID: 11590098     DOI: 10.1093/bioinformatics/17.9.821

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  86 in total

1.  Neutral substitutions occur at a faster rate in exons than in noncoding DNA in primate genomes.

Authors:  Sankar Subramanian; Sudhir Kumar
Journal:  Genome Res       Date:  2003-05       Impact factor: 9.043

2.  Dissecting plant genomes with the PLAZA comparative genomics platform.

Authors:  Michiel Van Bel; Sebastian Proost; Elisabeth Wischnitzki; Sara Movahedi; Christopher Scheerlinck; Yves Van de Peer; Klaas Vandepoele
Journal:  Plant Physiol       Date:  2011-12-23       Impact factor: 8.340

3.  Phylogenetic molecular function annotation.

Authors:  Barbara E Engelhardt; Michael I Jordan; Susanna T Repo; Steven E Brenner
Journal:  J Phys Conf Ser       Date:  2009

4.  Optimal gene trees from sequences and species trees using a soft interpretation of parsimony.

Authors:  Ann-Charlotte Berglund-Sonnhammer; Pär Steffansson; Matthew J Betts; David A Liberles
Journal:  J Mol Evol       Date:  2006-07-07       Impact factor: 2.395

5.  COCO-CL: hierarchical clustering of homology relations based on evolutionary correlations.

Authors:  Raja Jothi; Elena Zotenko; Asba Tasneem; Teresa M Przytycka
Journal:  Bioinformatics       Date:  2006-01-24       Impact factor: 6.937

6.  Accelerated rate of gene gain and loss in primates.

Authors:  Matthew W Hahn; Jeffery P Demuth; Sang-Gook Han
Journal:  Genetics       Date:  2007-10-18       Impact factor: 4.562

7.  A hierarchical model for incomplete alignments in phylogenetic inference.

Authors:  Fuxia Cheng; Stefanie Hartmann; Mayetri Gupta; Joseph G Ibrahim; Todd J Vision
Journal:  Bioinformatics       Date:  2009-01-15       Impact factor: 6.937

8.  Reconciliation with non-binary species trees.

Authors:  Benjamin Vernot; Maureen Stolzer; Aiton Goldman; Dannie Durand
Journal:  J Comput Biol       Date:  2008-10       Impact factor: 1.479

9.  Computational methods for Gene Orthology inference.

Authors:  David M Kristensen; Yuri I Wolf; Arcady R Mushegian; Eugene V Koonin
Journal:  Brief Bioinform       Date:  2011-06-19       Impact factor: 11.622

10.  Genome-scale phylogenetic function annotation of large and diverse protein families.

Authors:  Barbara E Engelhardt; Michael I Jordan; John R Srouji; Steven E Brenner
Journal:  Genome Res       Date:  2011-07-22       Impact factor: 9.043
