
CoreCruncher: Fast and Robust Construction of Core Genomes in Large Prokaryotic Data Sets.

Connor D Harris1, Ellis L Torrance1, Kasie Raymann1, Louis-Marie Bobay1.   

Abstract

The core genome represents the set of genes shared by all, or nearly all, strains of a given population or species of prokaryotes. Inferring the core genome is integral to many genomic analyses; however, most methods rely on comparing all pairs of genomes, a step that is becoming increasingly difficult given the massive accumulation of genomic data. Here, we present CoreCruncher, a program that robustly and rapidly constructs core genomes across hundreds or thousands of genomes. CoreCruncher does not compute all pairwise genome comparisons and uses a heuristic based on the distributions of identity scores to classify sequences as orthologs or paralogs/xenologs. Although CoreCruncher is much faster than current methods, our results indicate that it is more conservative than other tools and less sensitive to the presence of paralogs and xenologs. CoreCruncher is freely available from: https://github.com/lbobay/CoreCruncher. CoreCruncher is written in Python 3.7 and can also run on Python 2.7 without modification. It requires the Python library NumPy and either USEARCH or BLAST. Certain options require the programs MUSCLE or MAFFT.
© The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
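
The two ideas in the abstract, a core genome defined by presence across nearly all genomes and a classification based on identity scores, can be illustrated with a minimal sketch. This is not CoreCruncher's implementation: the function names, the 90% presence cutoff, and the "10 points below the median" rule are all hypothetical choices made for illustration only.

```python
from statistics import median


def core_genes(presence, min_fraction=0.9):
    """Return gene families found in at least min_fraction of genomes.

    presence maps a genome name to the set of gene families it contains.
    The 0.9 default is an illustrative assumption, not a published value.
    """
    n_genomes = len(presence)
    counts = {}
    for families in presence.values():
        for fam in families:
            counts[fam] = counts.get(fam, 0) + 1
    return {fam for fam, c in counts.items() if c / n_genomes >= min_fraction}


def flag_low_identity(identities, tolerance=10.0):
    """Flag hits whose identity falls well below the typical identity.

    identities maps a gene name to the percent identity of its best hit in
    one other genome. A hit more than `tolerance` points below the median
    is flagged as a possible paralog/xenolog (hypothetical rule).
    """
    typical = median(identities.values())
    return {g for g, ident in identities.items() if ident < typical - tolerance}


# Toy inputs, invented for this example.
presence = {"g1": {"a", "b", "c"}, "g2": {"a", "b"}, "g3": {"a", "c"}}
strict_core = core_genes(presence, min_fraction=1.0)
suspects = flag_low_identity({"x": 95.0, "y": 96.0, "z": 70.0, "w": 94.0})
```

In this toy setting, only family "a" is present in every genome, and gene "z" stands out because its identity is far below that of the other genes compared against the same genome.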

Keywords:  core genome; orthology; prokaryotes

Year:  2021        PMID: 32886787      PMCID: PMC7826169          DOI: 10.1093/molbev/msaa224

Source DB:  PubMed          Journal:  Mol Biol Evol        ISSN: 0737-4038            Impact factor:   16.240


