Computational workflow for analysis of gain and loss of genes in distantly related genomes.

Andrey Ptitsyn, Leonid L Moroz.

Abstract

BACKGROUND: Early evolution of animals led to profound changes in body plan organization, symmetry and the rise of tissue complexity including formation of muscular and nervous systems. This process was associated with massive restructuring of animal genomes as well as deletion, acquisition and rapid differentiation of genes from a common metazoan ancestor. Here, we present a simple but efficient workflow for elucidation of gene gain and gene loss within major branches of the animal kingdom.
METHODS: We have designed a pipeline of sequence comparison, clustering and functional annotation using 12 major phyla as illustrative examples. Specifically, as input we used sets of ab initio predicted gene models from the genomes of six bilaterians, three basal metazoans (Cnidaria, Placozoa, Porifera), two unicellular eukaryotes (Monosiga and Capsaspora) and the green plant Arabidopsis as an out-group. Because of the large amount of data, the software required a high-performance Linux cluster. The final results can be imported into standard spreadsheet software and queried for the numbers and specific sets of genes that are absent from specific genomes, uniquely present, or shared among different taxa.
RESULTS AND CONCLUSIONS: The developed software is open source and available free of charge. It lets the user address specific questions about gene gain and gene loss in particular genomes; queries over user-defined groups of genomes can be formulated as logical expressions. For example, our analysis of 12 sequenced genomes indicated that these genomes possess at least 90,000 unique genes and gene families, suggesting enormous diversity of the genome repertoire in the animal kingdom. Approximately 9% of these gene families are shared universally (homologous) among all genomes, 53% are unique to specific taxa, and the rest are shared between two or more distantly related genomes.
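The kind of query the abstract describes (gene families absent from some genomes, unique to one, or shared by a chosen group) can be illustrated with a small presence/absence table. The following is a minimal sketch only, not the authors' software: the family names, the four-genome toy table, and the helper functions `select`, `unique_to` and `universal` are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, FrozenSet, Iterable, Set

# Hypothetical toy table: gene family -> genomes in which a member was found.
# Neither the family names nor the assignments come from the paper.
PRESENCE: Dict[str, FrozenSet[str]] = {
    "fam_A": frozenset({"Homo", "Nematostella", "Monosiga", "Arabidopsis"}),
    "fam_B": frozenset({"Homo", "Nematostella"}),
    "fam_C": frozenset({"Homo"}),
    "fam_D": frozenset({"Monosiga", "Arabidopsis"}),
    "fam_E": frozenset({"Nematostella", "Monosiga"}),
}

ALL_GENOMES = frozenset({"Homo", "Nematostella", "Monosiga", "Arabidopsis"})


def select(presence: Dict[str, FrozenSet[str]],
           present_in: Iterable[str] = (),
           absent_in: Iterable[str] = ()) -> Set[str]:
    """Families found in every genome of `present_in` AND in none of `absent_in`."""
    required = set(present_in)
    forbidden = set(absent_in)
    return {
        family
        for family, genomes in presence.items()
        if required <= genomes and not (forbidden & genomes)
    }


def unique_to(presence: Dict[str, FrozenSet[str]], genome: str) -> Set[str]:
    """Families found in exactly one genome, the one given."""
    return {family for family, genomes in presence.items() if genomes == {genome}}


def universal(presence: Dict[str, FrozenSet[str]],
              all_genomes: Iterable[str]) -> Set[str]:
    """Families shared by every genome in the comparison."""
    return select(presence, present_in=all_genomes)


if __name__ == "__main__":
    # "Present in both animals AND NOT in the plant out-group"
    print(sorted(select(PRESENCE, ["Homo", "Nematostella"], ["Arabidopsis"])))
    print(sorted(unique_to(PRESENCE, "Homo")))
    print(sorted(universal(PRESENCE, ALL_GENOMES)))
```

Each call corresponds to one logical expression over genomes: a conjunction of "present in" terms combined with negated "absent in" terms. The same filters could equally be applied to a spreadsheet export with one column per genome.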

Year:  2012        PMID: 23046496      PMCID: PMC3439731          DOI: 10.1186/1471-2105-13-S15-S5

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


