Prospects for building large timetrees using molecular data with incomplete gene coverage among species.

Alan Filipski, Oscar Murillo, Anna Freydenzon, Koichiro Tamura, Sudhir Kumar.

Abstract

Scientists are assembling sequence data sets from increasing numbers of species and genes to build comprehensive timetrees. However, data are often unavailable for some species and gene combinations, and the proportion of missing data is often large for data sets containing many genes and species. Surprisingly, there has not been a systematic analysis of the effect of the degree of sparseness of the species-gene matrix on the accuracy of divergence time estimates. Here, we present results from computer simulations and empirical data analyses to quantify the impact of missing gene data on divergence time estimation in large phylogenies. We found that estimates of divergence times were robust even when sequences from a majority of genes for most of the species were absent. From the analysis of such extremely sparse data sets, we found that the most egregious errors occurred for nodes in the tree that had no common genes for any pair of species in the immediate descendant clades of the node in question. These problematic nodes can be easily detected prior to computational analyses based only on the input sequence alignment and the tree topology. We conclude that it is best to use larger alignments, because adding both genes and species to the alignment augments the number of genes available for estimating divergence events deep in the tree and improves their time estimates.
© The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
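
The abstract notes that problematic nodes can be detected before any dating analysis, using only the alignment's gene coverage and the tree topology. The following is a minimal sketch of one way to read that criterion, not the authors' implementation: it assumes the tree is given as nested tuples of species names, that gene coverage is a dictionary mapping each species to the set of genes it has, and that every internal node has at least two children. The function name `find_unsupported_nodes` and the toy data are hypothetical.

```python
from itertools import combinations


def find_unsupported_nodes(tree, genes_by_species):
    """Return the leaf sets of internal nodes whose child clades share no gene.

    tree: nested tuples of species names, e.g. (("A", "B"), ("C", "D")).
    genes_by_species: dict mapping species name -> set of gene names present.
    Both input shapes are assumptions made for this sketch.
    """
    flagged = []

    def visit(node):
        # A leaf is a species name: return its leaf set and its own genes.
        if isinstance(node, str):
            return frozenset([node]), set(genes_by_species.get(node, ()))

        children = [visit(child) for child in node]
        # Two species from different child clades share a gene exactly when
        # the pooled gene sets of those two clades intersect.
        shared = any(g1 & g2 for (_, g1), (_, g2) in combinations(children, 2))
        leaves = frozenset().union(*(leaf_set for leaf_set, _ in children))
        pooled = set().union(*(genes for _, genes in children))
        if not shared:
            flagged.append(leaves)
        return leaves, pooled

    visit(tree)
    return flagged


# Toy data (hypothetical): the two halves of the tree have no gene in common.
tree = (("A", "B"), ("C", "D"))
genes_by_species = {
    "A": {"g1"},
    "B": {"g1", "g2"},
    "C": {"g3"},
    "D": {"g3"},
}
flagged = find_unsupported_nodes(tree, genes_by_species)
```

In this toy case only the root is flagged, because no species in the (A, B) clade shares a gene with any species in the (C, D) clade. Pooling genes per clade avoids enumerating every species pair while giving the same answer.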

Keywords:  divergence time; incomplete data; timetree

Year:  2014        PMID: 24974376      PMCID: PMC4137717          DOI: 10.1093/molbev/msu200

Source DB:  PubMed          Journal:  Mol Biol Evol        ISSN: 0737-4038            Impact factor:   16.240


  15 in total

Review 1.  Molecular clocks and the early evolution of metazoan nervous systems.

Authors:  Gregory A Wray
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2015-12-19       Impact factor: 6.237

2.  Advances in Time Estimation Methods for Molecular Data.

Authors:  Sudhir Kumar; S Blair Hedges
Journal:  Mol Biol Evol       Date:  2016-02-16       Impact factor: 16.240

3.  A Protocol for Diagnosing the Effect of Calibration Priors on Posterior Time Estimates: A Case Study for the Cambrian Explosion of Animal Phyla.

Authors:  Fabia U Battistuzzi; Paul Billing-Ross; Oscar Murillo; Alan Filipski; Sudhir Kumar
Journal:  Mol Biol Evol       Date:  2015-03-25       Impact factor: 16.240

4.  MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets.

Authors:  Sudhir Kumar; Glen Stecher; Koichiro Tamura
Journal:  Mol Biol Evol       Date:  2016-03-22       Impact factor: 16.240

5.  Phylogenetic background and habitat drive the genetic diversification of Escherichia coli.

Authors:  Marie Touchon; Amandine Perrin; Jorge André Moura de Sousa; Belinda Vangchhia; Samantha Burn; Claire L O'Brien; Erick Denamur; David Gordon; Eduardo Pc Rocha
Journal:  PLoS Genet       Date:  2020-06-12       Impact factor: 5.917

6.  RelTime Relaxes the Strict Molecular Clock throughout the Phylogeny.

Authors:  Fabia U Battistuzzi; Qiqing Tao; Lance Jones; Koichiro Tamura; Sudhir Kumar
Journal:  Genome Biol Evol       Date:  2018-06-01       Impact factor: 3.416

7.  A matter of background: DNA repair pathways as a possible cause for the sparse distribution of CRISPR-Cas systems in bacteria.

Authors:  Aude Bernheim; David Bikard; Marie Touchon; Eduardo P C Rocha
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2019-05-13       Impact factor: 6.237

8.  On estimating evolutionary probabilities of population variants.

Authors:  Ravi Patel; Sudhir Kumar
Journal:  BMC Evol Biol       Date:  2019-06-25       Impact factor: 3.260

9.  Theoretical Foundation of the RelTime Method for Estimating Divergence Times from Variable Evolutionary Rates.

Authors:  Koichiro Tamura; Qiqing Tao; Sudhir Kumar
Journal:  Mol Biol Evol       Date:  2018-07-01       Impact factor: 16.240

10.  Host Range and Genetic Plasticity Explain the Coexistence of Integrative and Extrachromosomal Mobile Genetic Elements.

Authors:  Jean Cury; Pedro H Oliveira; Fernando de la Cruz; Eduardo P C Rocha
Journal:  Mol Biol Evol       Date:  2018-09-01       Impact factor: 16.240
