Sabrina Krakau, Daniel Straub, Hadrien Gourlé, Gisela Gabernet, Sven Nahnsen.
Abstract
The analysis of shotgun metagenomic data provides valuable insights into microbial communities, while allowing resolution at the level of individual genomes. In the absence of complete reference genomes, this requires the reconstruction of metagenome-assembled genomes (MAGs) from sequencing reads. We present the nf-core/mag pipeline for metagenome assembly, binning and taxonomic classification. It can optionally combine short and long reads to increase assembly continuity, and can use sample-wise group information for co-assembly and genome binning. The pipeline is easy to install (all dependencies are provided within containers), portable and reproducible. It is written in Nextflow and developed as part of the nf-core initiative for best-practice pipeline development. All code is hosted on GitHub under the nf-core organization (https://github.com/nf-core/mag) and released under the MIT license.
Year: 2022 PMID: 35118380 PMCID: PMC8808542 DOI: 10.1093/nargab/lqac007
Source DB: PubMed Journal: NAR Genom Bioinform ISSN: 2631-9268
Figure 1. (A) Overview of the nf-core/mag pipeline (v2.1.0). (B) Clustered heatmap showing MAG abundances, i.e. centered log-ratio depths across samples. (C) Schematic representation of MAG summary output, containing abundance information, QC metrics and taxonomic classifications.
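To make the "centered log-ratio depths" in panel (B) concrete, the following is a minimal sketch of a centered log-ratio (CLR) transform applied to the depths of one sample. It is not taken from the pipeline's code: the function name `clr`, the pseudocount used to avoid taking the logarithm of zero, and the choice to transform the MAG depths within a single sample are all assumptions made for illustration.

```python
import math


def clr(depths, pseudocount=1e-6):
    """Centered log-ratio transform of a list of non-negative depths.

    Each value is replaced by its log minus the mean of all logs, which is
    the same as the log of the value divided by the geometric mean.
    The pseudocount is an assumption of this sketch; it only keeps log()
    defined when a depth is exactly zero.
    """
    if not depths:
        raise ValueError("need at least one depth value")
    logs = [math.log(d + pseudocount) for d in depths]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical depths of three MAGs in one sample.
sample_depths = [1.0, 10.0, 100.0]
transformed = clr(sample_depths)
print([round(v, 3) for v in transformed])
```

By construction the transformed values of one sample sum to zero, so they express each MAG's depth relative to the sample's geometric mean rather than as an absolute coverage.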
Comparison of nf-core/mag’s functionality with commonly used metagenome assembly and binning pipelines. A more detailed comparison is shown in Supplementary Table S1.
| Category | Functionality | Muffin v1.0.3 | ATLAS v2.6a2 | nf-core/mag v2.1.0 |
|---|---|---|---|---|
| Assembly | Hybrid assembly | Yes | Partial | Yes |
| | Reassembly after binning | Yes | No | No |
| | Group-wise co-assembly | No | No | Yes |
| Genome binning | Group-wise co-abundances used for binning | No | Yes | Yes |
| | Bin refinement | Yes | Yes | No |
| | MAG abundance estimation | No | Yes | Yes |
| Annotation | Taxonomic classification | Yes | Yes | Yes |
| | Functional annotation | Yes | Yes | No |
| Usability | Reproducibility | No | No | Yes |
| | Adherence to set of strict best-practice guidelines for pipeline development | No | No | Yes |
Figure 2. Assembly metrics obtained using different nf-core/mag assembly settings on the simulated data: sample-wise assembly, group-wise co-assembly, short-read-only assembly or hybrid assembly. Each point corresponds to one assembly, originating either from one sample or one group. Metrics displayed are (A) total length of the assembly in base pairs, (B) N50 value (i.e. the length of the shortest contig that needs to be included, when adding contigs from longest to shortest, to cover at least 50% of the total assembly length), (C) number of contigs in the final assembly, (D) size of the largest contig in base pairs and (E) number of MAGs identified in the final assembly. The metrics (A)–(D) are part of the QUAST assembly summary.
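The N50 definition in panel (B) can be illustrated with a short worked example. The sketch below is not QUAST's implementation; the function name `n50` and the contig lengths are hypothetical. It assumes the usual convention that the 50% threshold refers to the total length of the assembly itself.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches at least half of
    the summed length of all contigs.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length


# Hypothetical assembly with five contigs (total length 1500 bp).
lengths = [100, 200, 300, 400, 500]
print(n50(lengths))  # 500 + 400 = 900 >= 750, so this prints 400
```

A larger N50 at a similar total assembly length indicates that more of the assembly sits in long contigs, which is why it is reported alongside the contig count and the largest contig size.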