DNA assembly with gaps (Dawg): simulating sequence evolution.

Reed A Cartwright.

Abstract

MOTIVATION: Relationships amongst taxa are inferred from biological data using phylogenetic methods and procedures. Very few known phylogenies exist against which to test the accuracy of our inferences. Therefore, in the absence of biological data, simulated data must be used to test the accuracy of methods which produce these inferences. Researchers have limited or non-existent options for simulations useful for studying the impact of insertions, deletions, and alignments on phylogenetic accuracy.
RESULTS: To fill this gap I have developed a new algorithm of indel formation and incorporated it into a new, flexible, and portable application for sequence simulation. The application, called Dawg, simulates phylogenetic evolution of DNA sequences in continuous time using the robust general time reversible model with gamma and invariant rate heterogeneity and a novel length-dependent model of indel formation. On completion, Dawg produces the true alignment of the simulated sequences. Unlike other applications, Dawg allows indel lengths to be explicitly distributed via a biologically realistic power law. Many options are available to allow users to customize their simulations and results. Because simulating with indels would be problematic if biologically realistic parameters could not be estimated, a script is provided with Dawg that can estimate the parameters of indel formation from sequence data. Dawg was applied to the sequences of four chloroplast trnK introns. It was used to parametrically bootstrap an estimation of the rate of indel formation for the phylogeny. Because Dawg can assist in parametric bootstrapping of sequence data, it is useful beyond phylogenetics, for example in studying alignment algorithms or parameters of molecular evolution.
AVAILABILITY: Dawg 1.0.0 can be obtained at the following websites: http://www.genetics.uga.edu/sw/ or http://scit.us/dawg/. The package includes source code, example files, a brief manual and helper scripts. Binary distributions are available for Windows and Macintosh OS X. A development page for Dawg exists at http://scit.us/dawg/, with links to a Subversion repository, mailing lists and updated versions.
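
The power-law distribution of indel lengths mentioned in the Results can be illustrated with a small sketch. This is not Dawg's code or its parameterization: the function names, the exponent `a = 1.5`, and the truncation at `max_len = 100` are assumptions chosen only for illustration. It draws lengths k with probability proportional to k^(-a).

```python
import random

def power_law_pmf(a, max_len):
    """Return P(L = k) for k = 1..max_len, with P proportional to k**(-a).

    The truncation at max_len is an assumption made so the
    distribution can be normalized by a finite sum.
    """
    weights = [k ** (-a) for k in range(1, max_len + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def sample_indel_lengths(n, a=1.5, max_len=100, seed=None):
    """Draw n indel lengths from the truncated power law above.

    a and max_len are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    pmf = power_law_pmf(a, max_len)
    lengths = range(1, max_len + 1)
    return rng.choices(lengths, weights=pmf, k=n)

if __name__ == "__main__":
    sample = sample_indel_lengths(10, seed=1)
    print(sample)
```

Under such a distribution most indels are short while long indels remain possible, which is the qualitative behavior a power law provides.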

Year:  2005        PMID: 16306390     DOI: 10.1093/bioinformatics/bti1200

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937

