Problems and solutions for estimating indel rates and length distributions.

Reed A Cartwright

Abstract

Insertions and deletions (indels) are fundamental but understudied components of molecular evolution. Here we present an expectation-maximization algorithm built on a pair hidden Markov model that is able to properly handle indels in neutrally evolving DNA sequences. From a data set of orthologous introns, we estimate relative rates and length distributions of indels among primates and rodents. This technique has the advantage of potentially handling large genomic data sets. We find that a zeta power-law model of indel lengths provides a much better fit than the traditional geometric model and that indel processes are conserved between our taxa. The estimated relative rates are about 12-16 indels per 100 substitutions, and the estimated power-law magnitudes are about 1.6-1.7. More significantly, we find that using the traditional geometric/affine model of indel lengths introduces artifacts into evolutionary analysis, casting doubt on studies of the evolution and diversity of indel formation using traditional models and invalidating measures of species divergence that include indel lengths.
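As a rough illustration of why the choice of length model matters, the sketch below compares a zeta power-law distribution, P(k) = k^(-z) / zeta(z), with a geometric distribution, P(k) = (1 - q) q^(k - 1), over indel lengths k = 1, 2, 3, and so on. The exponent z = 1.65 is simply picked from the 1.6-1.7 range reported above. Choosing q so that both models give the same probability to length-1 indels is an assumption made here for comparison only; it is not the fitting procedure of the paper, and all function names are hypothetical.

```python
import math


def riemann_zeta(z, n_terms=1000):
    """Approximate zeta(z) for z > 1: direct sum plus an Euler-Maclaurin tail."""
    if z <= 1.0:
        raise ValueError("z must exceed 1 for the zeta distribution to normalize")
    head = sum(k ** -z for k in range(1, n_terms))
    n = float(n_terms)
    # Tail sum from n to infinity: integral + half the first term + first correction.
    tail = n ** (1.0 - z) / (z - 1.0) + 0.5 * n ** -z + z * n ** (-z - 1.0) / 12.0
    return head + tail


def zeta_pmf(k, z, norm=None):
    """P(length == k) under the zeta power-law model."""
    norm = riemann_zeta(z) if norm is None else norm
    return k ** -z / norm


def zeta_tail(k, z):
    """P(length >= k) under the zeta power-law model."""
    norm = riemann_zeta(z)
    return 1.0 - sum(zeta_pmf(j, z, norm) for j in range(1, k))


def geometric_pmf(k, q):
    """P(length == k) under the geometric model with extension probability q."""
    return (1.0 - q) * q ** (k - 1)


def geometric_tail(k, q):
    """P(length >= k) under the geometric model."""
    return q ** (k - 1)


if __name__ == "__main__":
    z = 1.65                      # assumed value inside the reported 1.6-1.7 range
    q = 1.0 - zeta_pmf(1, z)      # assumption: match the two models at length 1
    for k in (1, 2, 5, 10, 50, 100):
        print(f"k={k:>3}  zeta tail={zeta_tail(k, z):.3e}  "
              f"geometric tail={geometric_tail(k, q):.3e}")
```

Under these assumptions the geometric tail decays exponentially while the power-law tail decays polynomially, so long indels that are plausible under the zeta model become vanishingly rare under the geometric one. Note also that a zeta distribution with exponent at or below 2 has no finite mean.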

Year: 2008    PMID: 19042944    PMCID: PMC2734402    DOI: 10.1093/molbev/msn275

Source DB: PubMed    Journal: Mol Biol Evol    ISSN: 0737-4038    Impact factor: 16.240


