Recovery of Metagenome-Assembled Genomes from a Human Fecal Sample with Pacific Biosciences High-Fidelity Sequencing.

Florian Plaza Oñate, Hugo Roume, Mathieu Almeida.

Abstract

Here, we report the recovery of 89 metagenome-assembled genomes (MAGs) derived from a human fecal sample subjected to Pacific Biosciences (PacBio) high-fidelity (HiFi) sequencing. A total of 9 MAGs consisted of complete circular contigs, and 45 MAGs were high-quality draft genomes according to the minimum information about a metagenome-assembled genome (MIMAG) standards.

Year:  2022        PMID: 35532226      PMCID: PMC9202402          DOI: 10.1128/mra.00250-22

Source DB:  PubMed          Journal:  Microbiol Resour Announc        ISSN: 2576-098X


ANNOUNCEMENT

Recent studies have shown that long-read sequencing technologies improve the contiguity of metagenomic assemblies and enable the recovery of repeated regions compared with short-read alternatives (1, 2). To assess long-read technologies and evaluate dedicated bioinformatics tools, we performed deep sequencing of a human fecal metagenome with the PacBio Sequel II system. A fecal sample was self-collected by a 30-year-old healthy French volunteer for whom written consent was obtained. The sample was stored immediately in a stabilizing solution (RNAlater) following the International Human Microbiome Standards (IHMS) SOP_05_V2 (3, 4). DNA was extracted from 200 mg of fecal material following the IHMS SOP_07_V2 (3, 5). A total of 500 ng of high-molecular-weight DNA was used to build an unamplified nonmultiplexed library with the SMRTbell express template prep kit 2.0 (Pacific Biosciences) following manufacturer recommendations for metagenomics (6). Then, a 30-hour sequencing run was performed on a Sequel II device using Chemistry v2.0. Finally, removal of adapter sequences, read quality control, and generation of circular consensus sequencing (CCS) reads were performed through a dedicated pipeline (7).

Below, default parameters were used for all software unless otherwise specified. Reads shorter than 1,000 bp or aligned to the human genome (GenBank accession number GCA_009914755.3) with minimap2 v2.24 (8) (parameters: -x asm20) were discarded. In total, 1,645,079 reads with a median quality value of 40 were obtained for a cumulative length of 13,012,430,198 bp. The median length of the sequencing reads was 7,620 bp (Q1 = 5,936 bp; Q3 = 9,635 bp). Metagenomic assembly was performed with Flye v2.9 (9) (parameters: --pacbio-hifi --meta), and contigs shorter than 2,500 bp were filtered out. The assembly consisted of 9,253 contigs (including 9 circular contigs of ≥1 Mb) with a cumulative length of 596,522,308 bp. N50 and L50 values were 164,736 bp and 628, respectively.

Contig binning was performed with MetaBAT v2.12.1 (10) and SemiBin v0.5.0 (11) (parameters: --environment human_gut). Results from both tools were combined with the bin_refinement module implemented in metaWRAP v1.3.2 (12). Metagenome-assembled genome (MAG) quality was assessed with CheckM v1.1.3 (13), except for one eukaryotic MAG for which BUSCO v5.2.2 (14) was used. A total of 89 MAGs with an estimated completeness of ≥70%, with a contamination of ≤5%, and passing the chimera detection implemented in GUNC v1.0.5 (15) were selected (Fig. 1). These MAGs were subsequently annotated with Prokka v1.14.5 (16), and taxonomic classification was performed with GTDB-Tk v1.5.0 (17).

In total, 9 MAGs consisted of complete circular contigs, and 45 MAGs were high-quality draft sequences according to the minimum information about a metagenome-assembled genome (MIMAG) standards (18). Notably, all 88 prokaryotic MAGs had at least 1 complete 16S rRNA gene. MAGs were mainly bacteria (87/89), 1 was an archaeon (min17_bin38), and the largest (11.6 Mb) was a unicellular eukaryote of the genus Blastocystis (min17_eukbin1). We compared our prokaryotic MAGs with the representative genomes of the Unified Human Gastrointestinal Genome (UHGG) catalogue v2 (19) using fastANI v1.33 (20). Three MAGs corresponded to species not represented in the UHGG collection (average nucleotide identity cutoff = 95%). Remarkably, 30 MAGs had better assembly statistics than the UHGG representatives according to a composite score defined as completeness – (5 × contamination) + log(N50).
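The contiguity statistics and the composite score mentioned above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, and the base of the logarithm is an assumption (base 10), since the text only writes "log". Completeness and contamination are taken as percentages, as in CheckM reports.

```python
import math


def n50_l50(lengths):
    """Return (N50, L50) for a list of contig lengths.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs covering at least half of the total length;
    L50 is the number of contigs in that set.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("no contigs given")


def composite_score(completeness, contamination, n50):
    """completeness - (5 x contamination) + log(N50).

    Assumption: log is base 10; the source does not state the base.
    """
    return completeness - 5 * contamination + math.log10(n50)


if __name__ == "__main__":
    # Toy contig lengths, not data from the study.
    print(n50_l50([8, 8, 4, 3, 3, 2, 2, 2]))
    print(composite_score(95.0, 1.0, 100_000))
```

Under this definition, one percentage point of contamination costs as much as five points of completeness, while N50 contributes only a few units, so it mainly separates MAGs of otherwise similar quality.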
FIG 1

Rank order plots comparing the quality of MAGs produced by the three binning methods. Estimated completeness (A), estimated contamination (B), and N50 statistics (C) taken from the CheckM output report are shown. Only MAGs with completeness of ≥70% and contamination of ≤5% are considered.
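The selection thresholds used for the figure and for the final MAG set (completeness of ≥70%, contamination of ≤5%, and passing GUNC) can be sketched as a simple filter followed by a rank ordering. This is an illustrative sketch only: the `Mag` record, its field names, and the example values are hypothetical and do not reproduce the CheckM or GUNC output formats.

```python
from dataclasses import dataclass


@dataclass
class Mag:
    # Hypothetical record; field names are not taken from any tool's output.
    name: str
    completeness: float   # percent
    contamination: float  # percent
    passes_gunc: bool     # True if not flagged as chimeric


def select_mags(mags, min_completeness=70.0, max_contamination=5.0):
    """Keep MAGs meeting the thresholds stated in the text."""
    return [
        m for m in mags
        if m.completeness >= min_completeness
        and m.contamination <= max_contamination
        and m.passes_gunc
    ]


def rank_order(mags, key, descending=True):
    """Return the values of one metric sorted for a rank order plot."""
    return sorted((getattr(m, key) for m in mags), reverse=descending)


if __name__ == "__main__":
    # Made-up example bins, not results from the study.
    bins = [
        Mag("bin_a", 98.2, 0.5, True),
        Mag("bin_b", 65.0, 1.0, True),    # fails completeness
        Mag("bin_c", 88.0, 7.5, True),    # fails contamination
        Mag("bin_d", 91.0, 2.0, False),   # fails chimera check
        Mag("bin_e", 75.4, 4.9, True),
    ]
    kept = select_mags(bins)
    print([m.name for m in kept])
    print(rank_order(kept, "completeness"))
```

Applying the same filter to the output of each binning method and plotting the sorted values against their rank gives curves of the kind described in the caption.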

Data availability.

Sequencing data (accession ERX7722845) and primary metagenome assembly (accession ERZ4963561) were deposited in the European Nucleotide Archive (ENA) under BioProject accession number PRJEB50473. Prokka annotation reports, MAG sequences, and related metadata were deposited in the INRAE data portal (data set S63W9S).
  16 in total

1.  Minimap2: pairwise alignment for nucleotide sequences.

Authors:  Heng Li
Journal:  Bioinformatics       Date:  2018-09-15       Impact factor: 6.937

2.  metaFlye: scalable long-read metagenome assembly using repeat graphs.

Authors:  Mikhail Kolmogorov; Derek M Bickhart; Bahar Behsaz; Alexey Gurevich; Mikhail Rayko; Sung Bong Shin; Kristen Kuhn; Jeffrey Yuan; Evgeny Polevikov; Timothy P L Smith; Pavel A Pevzner
Journal:  Nat Methods       Date:  2020-10-05       Impact factor: 28.547

3.  Towards standards for human fecal sample processing in metagenomic studies.

Authors:  Paul I Costea; Georg Zeller; Shinichi Sunagawa; Eric Pelletier; Adriana Alberti; Florence Levenez; Melanie Tramontano; Marja Driessen; Rajna Hercog; Ferris-Elias Jung; Jens Roat Kultima; Matthew R Hayward; Luis Pedro Coelho; Emma Allen-Vercoe; Laurie Bertrand; Michael Blaut; Jillian R M Brown; Thomas Carton; Stéphanie Cools-Portier; Michelle Daigneault; Muriel Derrien; Anne Druesne; Willem M de Vos; B Brett Finlay; Harry J Flint; Francisco Guarner; Masahira Hattori; Hans Heilig; Ruth Ann Luna; Johan van Hylckama Vlieg; Jana Junick; Ingeborg Klymiuk; Philippe Langella; Emmanuelle Le Chatelier; Volker Mai; Chaysavanh Manichanh; Jennifer C Martin; Clémentine Mery; Hidetoshi Morita; Paul W O'Toole; Céline Orvain; Kiran Raosaheb Patil; John Penders; Søren Persson; Nicolas Pons; Milena Popova; Anne Salonen; Delphine Saulnier; Karen P Scott; Bhagirath Singh; Kathleen Slezak; Patrick Veiga; James Versalovic; Liping Zhao; Erwin G Zoetendal; S Dusko Ehrlich; Joel Dore; Peer Bork
Journal:  Nat Biotechnol       Date:  2017-10-02       Impact factor: 54.908

4.  Generating lineage-resolved, complete metagenome-assembled genomes from complex microbial communities.

Authors:  Derek M Bickhart; Mikhail Kolmogorov; Elizabeth Tseng; Daniel M Portik; Anton Korobeynikov; Ivan Tolstoganov; Gherman Uritskiy; Ivan Liachko; Shawn T Sullivan; Sung Bong Shin; Alvah Zorea; Victòria Pascal Andreu; Kevin Panke-Buisse; Marnix H Medema; Itzhak Mizrahi; Pavel A Pevzner; Timothy P L Smith
Journal:  Nat Biotechnol       Date:  2022-01-03       Impact factor: 68.164

5.  CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes.

Authors:  Donovan H Parks; Michael Imelfort; Connor T Skennerton; Philip Hugenholtz; Gene W Tyson
Journal:  Genome Res       Date:  2015-05-14       Impact factor: 9.043

6.  High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries.

Authors:  Chirag Jain; Luis M Rodriguez-R; Adam M Phillippy; Konstantinos T Konstantinidis; Srinivas Aluru
Journal:  Nat Commun       Date:  2018-11-30       Impact factor: 14.919

7.  Finding the right fit: evaluation of short-read and long-read sequencing approaches to maximize the utility of clinical microbiome data.

Authors:  Jeanette L Gehrig; Daniel M Portik; Mark D Driscoll; Eric Jackson; Shreyasee Chakraborty; Dawn Gratalo; Meredith Ashby; Ricardo Valladares
Journal:  Microb Genom       Date:  2022-03

8.  MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies.

Authors:  Dongwan D Kang; Feng Li; Edward Kirton; Ashleigh Thomas; Rob Egan; Hong An; Zhong Wang
Journal:  PeerJ       Date:  2019-07-26       Impact factor: 2.984

9.  GUNC: detection of chimerism and contamination in prokaryotic genomes.

Authors:  Askarbek Orakov; Anthony Fullam; Luis Pedro Coelho; Supriya Khedkar; Damian Szklarczyk; Daniel R Mende; Thomas S B Schmidt; Peer Bork
Journal:  Genome Biol       Date:  2021-06-13       Impact factor: 13.583

10.  BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes.

Authors:  Mosè Manni; Matthew R Berkeley; Mathieu Seppey; Felipe A Simão; Evgeny M Zdobnov
Journal:  Mol Biol Evol       Date:  2021-09-27       Impact factor: 16.240
