Libra: scalable k-mer-based tool for massive all-vs-all metagenome comparisons.

Illyoung Choi, Alise J Ponsero, Matthew Bomhoff, Ken Youens-Clark, John H Hartman, Bonnie L Hurwitz.

Abstract

Background: Shotgun metagenomics provides powerful insights into microbial community biodiversity and function. Yet, inferences from metagenomic studies are often limited by dataset size and complexity and are restricted by the availability and completeness of existing databases. De novo comparative metagenomics enables the comparison of metagenomes based on their total genetic content.
Results: We developed a tool called Libra that performs an all-vs-all comparison of metagenomes for precise clustering based on their k-mer content. Libra uses a scalable Hadoop framework for massive metagenome comparisons, Cosine Similarity for calculating the distance using sequence composition and abundance while normalizing for sequencing depth, and a web-based implementation in iMicrobe (http://imicrobe.us) that uses the CyVerse advanced cyberinfrastructure to promote broad use of the tool by the scientific community.
Conclusions: A comparison of Libra to equivalent tools using both simulated and real metagenomic datasets, ranging from 80 million to 4.2 billion reads, reveals that methods commonly implemented to reduce compute time for large datasets, such as data reduction, read count normalization, and presence/absence distance metrics, greatly diminish the resolution of large-scale comparative analyses. In contrast, Libra uses all of the reads to calculate k-mer abundance in a Hadoop architecture that can scale to any size dataset to enable global-scale analyses and link microbial signatures to biological processes.
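
To make the distance idea in the abstract concrete, the following is a minimal single-machine sketch, not Libra's Hadoop implementation. It counts k-mers per sample and compares two samples with a cosine distance on the abundance vectors, alongside a presence/absence (Jaccard) distance for contrast. The function names, the toy reads, and the choice of k = 4 are illustrative assumptions; any k-mer weighting or filtering that Libra applies is not reproduced here.

```python
from collections import Counter
from math import sqrt


def kmer_counts(reads, k):
    """Count every overlapping k-mer across a list of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two k-mer count vectors."""
    dot = sum(count * b[kmer] for kmer, count in a.items() if kmer in b)
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # convention chosen here for empty samples
    return 1.0 - dot / (norm_a * norm_b)


def jaccard_distance(a, b):
    """Presence/absence distance: ignores how often each k-mer occurs."""
    union = set(a) | set(b)
    if not union:
        return 0.0
    return 1.0 - len(set(a) & set(b)) / len(union)


# Toy data (hypothetical): the same reads sequenced three times deeper,
# and a sample with the same k-mers but a shifted abundance profile.
base = kmer_counts(["ACGTACGT", "TTGACA"], k=4)
deeper = kmer_counts(["ACGTACGT", "TTGACA"] * 3, k=4)
shifted = kmer_counts(["ACGTACGT"] * 5 + ["TTGACA"], k=4)

print(cosine_distance(base, deeper))    # depth alone: zero up to rounding
print(cosine_distance(base, shifted))   # abundance shift: greater than zero
print(jaccard_distance(base, shifted))  # presence/absence sees no difference
```

Because cosine similarity depends only on the direction of the count vector, multiplying every count by a constant (deeper sequencing of the same community) leaves the distance unchanged, while a change in relative abundance does not. The presence/absence metric reports no difference in the second case, which is the loss of resolution the conclusions refer to.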

Year:  2019       PMID:  30597002       PMCID:  PMC6354030       DOI:  10.1093/gigascience/giy165

Source DB:  PubMed       Journal:  Gigascience       ISSN:  2047-217X       Impact factor:  6.524


  11 in total

1.  Libra: scalable k-mer-based tool for massive all-vs-all metagenome comparisons.

Authors:  Illyoung Choi; Alise J Ponsero; Matthew Bomhoff; Ken Youens-Clark; John H Hartman; Bonnie L Hurwitz
Journal:  Gigascience       Date:  2019-02-01       Impact factor: 6.524

2.  A New Alignment-Free Whole Metagenome Comparison Tool and Its Application on Gut Microbiomes of Wild Giant Pandas.

Authors:  Jiuhong Dong; Shuai Liu; Yaran Zhang; Yi Dai; Qi Wu
Journal:  Front Microbiol       Date:  2020-06-16       Impact factor: 5.640

3.  iMicrobe: Tools and data-driven discovery platform for the microbiome sciences.

Authors:  Ken Youens-Clark; Matt Bomhoff; Alise J Ponsero; Elisha M Wood-Charlson; Joshua Lynch; Illyoung Choi; John H Hartman; Bonnie L Hurwitz
Journal:  Gigascience       Date:  2019-07-01       Impact factor: 6.524

4.  Extreme Viral Partitioning in a Marine-Derived High Arctic Lake.

Authors:  Myriam Labbé; Catherine Girard; Warwick F Vincent; Alexander I Culley
Journal:  mSphere       Date:  2020-05-13       Impact factor: 4.389

5.  Systematic evaluation of supervised machine learning for sample origin prediction using metagenomic sequencing data.

Authors:  Julie Chih-Yu Chen; Andrea D Tyler
Journal:  Biol Direct       Date:  2020-12-10       Impact factor: 4.540

6.  Gut Microbiota in Dholes During Estrus.

Authors:  Xiaoyang Wu; Yongquan Shang; Qinguo Wei; Jun Chen; Huanxin Zhang; Yao Chen; Xiaodong Gao; Zhiyong Wang; Honghai Zhang
Journal:  Front Microbiol       Date:  2020-11-30       Impact factor: 5.640

7.  Practical selection of representative sets of RNA-seq samples using a hierarchical approach.

Authors:  Laura H Tung; Carl Kingsford
Journal:  Bioinformatics       Date:  2021-07-12       Impact factor: 6.937

8.  Metagenomic analysis through the extended Burrows-Wheeler transform.

Authors:  Veronica Guerrini; Felipe A Louza; Giovanna Rosone
Journal:  BMC Bioinformatics       Date:  2020-09-16       Impact factor: 3.169

9.  NCBI's Virus Discovery Hackathon: Engaging Research Communities to Identify Cloud Infrastructure Requirements.

Authors:  Ryan Connor; Rodney Brister; Jan P Buchmann; Ward Deboutte; Rob Edwards; Joan Martí-Carreras; Mike Tisza; Vadim Zalunin; Juan Andrade-Martínez; Adrian Cantu; Michael D'Amour; Alexandre Efremov; Lydia Fleischmann; Laura Forero-Junco; Sanzhima Garmaeva; Melissa Giluso; Cody Glickman; Margaret Henderson; Benjamin Kellman; David Kristensen; Carl Leubsdorf; Kyle Levi; Shane Levi; Suman Pakala; Vikas Peddu; Alise Ponsero; Eldred Ribeiro; Farrah Roy; Lindsay Rutter; Surya Saha; Migun Shakya; Ryan Shean; Matthew Miller; Benjamin Tully; Christopher Turkington; Ken Youens-Clark; Bert Vanmechelen; Ben Busby
Journal:  Genes (Basel)       Date:  2019-09-16       Impact factor: 4.096

10.  Identification and quantitation of clinically relevant microbes in patient samples: Comparison of three k-mer based classifiers for speed, accuracy, and sensitivity.

Authors:  George S Watts; James E Thornton; Ken Youens-Clark; Alise J Ponsero; Marvin J Slepian; Emmanuel Menashi; Charles Hu; Wuquan Deng; David G Armstrong; Spenser Reed; Lee D Cranmer; Bonnie L Hurwitz
Journal:  PLoS Comput Biol       Date:  2019-11-22       Impact factor: 4.475
