Incorporating gene-specific variation when inferring and evaluating optimal evolutionary tree topologies from multilocus sequence data.

Tae-Kun Seo, Hirohisa Kishino, Jeffrey L Thorne.

Abstract

Because of the growth of genomic data, multiple genes are often available for inferring phylogenetic relationships. The simplest approach to combining multiple genes from the same taxon is to concatenate the sequences and then ignore the fact that different positions in the concatenated sequence came from different genes. Here, we discuss two criteria for inferring the optimal tree topology from data sets with multiple genes. These criteria are designed for multigene data sets where gene-specific evolutionary features are too important to ignore. One criterion is conventional and is obtained by taking the sum of log-likelihoods over all genes. The other criterion is obtained by dividing the log-likelihood for each gene by its sequence length and then taking the arithmetic mean of these ratios over genes. A similar strategy could be adopted with parsimony scores. The optimal tree is then declared to be the one for which the sum or the arithmetic mean is maximized. These criteria are justified within a two-stage hierarchical framework. The first level of the hierarchy represents gene-specific evolutionary features, and the second represents site-specific features for given genes. To test the significance of the optimal topology, we suggest a two-stage bootstrap procedure that involves resampling genes and then resampling alignment columns within resampled genes. An advantage of this procedure over concatenation is that it can effectively account for gene-specific evolutionary features. We discuss the applicability of the two-stage bootstrap idea to the Kishino-Hasegawa test and the Shimodaira-Hasegawa test.
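
The two criteria and the two-stage resampling described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes that per-site log-likelihoods have already been computed for every gene under each candidate topology, and it reuses those fixed values in each bootstrap replicate instead of re-estimating parameters, which is a simplification made for this example. All function names and the data layout (a dict mapping a topology label to a list of genes, each a list of per-site log-likelihoods) are hypothetical.

```python
import random

def sum_criterion(genes):
    """Sum of log-likelihoods over all genes (genes: list of per-site lists)."""
    return sum(sum(sites) for sites in genes)

def mean_criterion(genes):
    """Arithmetic mean over genes of (gene log-likelihood / sequence length)."""
    return sum(sum(sites) / len(sites) for sites in genes) / len(genes)

def two_stage_resample(loglik_by_tree, rng):
    """Resample genes, then alignment columns within each resampled gene.

    The same gene and column indices are applied to every topology so that
    the replicate compares topologies on identical resampled data.
    """
    reference = next(iter(loglik_by_tree.values()))
    n_genes = len(reference)
    # Stage 1: draw genes with replacement.
    gene_idx = [rng.randrange(n_genes) for _ in range(n_genes)]
    # Stage 2: draw columns with replacement within each drawn gene.
    col_idx = [
        [rng.randrange(len(reference[g])) for _ in range(len(reference[g]))]
        for g in gene_idx
    ]
    return {
        tree: [[genes[g][c] for c in cols] for g, cols in zip(gene_idx, col_idx)]
        for tree, genes in loglik_by_tree.items()
    }

def bootstrap_support(loglik_by_tree, criterion, n_rep=1000, seed=0):
    """Fraction of replicates in which each topology maximizes the criterion."""
    rng = random.Random(seed)
    wins = {tree: 0 for tree in loglik_by_tree}
    for _ in range(n_rep):
        replicate = two_stage_resample(loglik_by_tree, rng)
        best = max(replicate, key=lambda tree: criterion(replicate[tree]))
        wins[best] += 1
    return {tree: count / n_rep for tree, count in wins.items()}

# Toy input (made-up numbers): two topologies, two genes of unequal length.
toy = {
    "A": [[-1.0, -1.0], [-1.0]],
    "B": [[-2.0, -2.0], [-2.0]],
}
support_sum = bootstrap_support(toy, sum_criterion, n_rep=200)
support_mean = bootstrap_support(toy, mean_criterion, n_rep=200)
```

Passing `sum_criterion` or `mean_criterion` as the `criterion` argument switches between the two rules while keeping the resampling scheme the same. The mean criterion weights each gene equally regardless of its length, whereas the sum criterion lets longer genes contribute more.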

Year: 2005   PMID: 15764703   PMCID: PMC555482   DOI: 10.1073/pnas.0408313102

Source DB: PubMed   Journal: Proc Natl Acad Sci U S A   ISSN: 0027-8424   Impact factor: 11.205


