On the transformation of MinHash-based uncorrected distances into proper evolutionary distances for phylogenetic inference.

Alexis Criscuolo

Abstract

Recently developed MinHash-based techniques have proven successful at quickly estimating the level of similarity between large nucleotide sequences. This article discusses their use and practical limitations for approximating uncorrected distances between genomes and for transforming these pairwise dissimilarities into proper evolutionary distances. Notably, it shows that complex distance measures can be easily approximated using simple transformation formulae based on a few parameters. MinHash-based techniques can therefore be very useful for implementing fast yet accurate alignment-free phylogenetic reconstruction procedures from large sets of genomes. This last point is assessed with a simulation study using a dedicated bioinformatics tool.

Copyright: © 2020 Criscuolo A.
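The abstract does not reproduce the article's transformation formulae, so the following is only an illustrative sketch of the general idea, not the author's method. It assumes the widely used Mash-style relation between a Jaccard index j estimated on k-mers and the proportion of shared k-mers, and it uses the Jukes-Cantor (JC69) correction as the simplest well-known example of turning an uncorrected p-distance into an evolutionary distance. The function names and the choice of JC69 are assumptions made for illustration.

```python
import math


def jaccard_to_p_distance(j: float, k: int) -> float:
    """Approximate the uncorrected p-distance from a k-mer Jaccard index.

    Assumes the Mash-style relation: 2j / (1 + j) estimates the fraction
    of shared k-mers, and a k-mer is conserved with probability (1 - p)^k.
    """
    if not 0.0 <= j <= 1.0:
        raise ValueError("Jaccard index must lie in [0, 1]")
    if k < 1:
        raise ValueError("k must be a positive integer")
    if j == 0.0:
        return 1.0  # no shared k-mers: distance is saturated
    return 1.0 - (2.0 * j / (1.0 + j)) ** (1.0 / k)


def jc69_distance(p: float) -> float:
    """Jukes-Cantor correction of a p-distance (illustrative choice only)."""
    if p < 0.0:
        raise ValueError("p-distance must be non-negative")
    if p >= 0.75:
        return math.inf  # the correction is undefined at or beyond saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # Hypothetical input values: Jaccard index 0.9 estimated with k = 21.
    p = jaccard_to_p_distance(0.9, 21)
    print(p, jc69_distance(p))
```

More parameter-rich substitution models would replace `jc69_distance` with a different correction; the two-step structure (Jaccard index to p-distance, then p-distance to evolutionary distance) stays the same.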

Keywords:  MinHash; evolutionary distance; genome; p-distance; phylogenetics; simulation; substitution model

Year: 2020; PMID: 33335719; PMCID: PMC7713896; DOI: 10.12688/f1000research.26930.1

Source DB: PubMed; Journal: F1000Res; ISSN: 2046-1402


