Alexander Eng, Adrian J Verster, Elhanan Borenstein.
Abstract
BACKGROUND: Microbial communities have become an important subject of research across multiple disciplines in recent years. These communities are often examined via shotgun metagenomic sequencing, a technology which can offer unique insights into the genomic content of a microbial community. Functional annotation of shotgun metagenomic data has become an increasingly popular method for identifying the aggregate functional capacities encoded by the community's constituent microbes. Currently available metagenomic functional annotation pipelines, however, suffer from several shortcomings, including limited pipeline customization options, lack of standard raw sequence data pre-processing, and insufficient capabilities for integration with distributed computing systems.
Keywords: Distributed computing; Functional annotation; Metagenomics; Pipeline
Year: 2020 PMID: 33087062 PMCID: PMC7579964 DOI: 10.1186/s12859-020-03815-9
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Comparison of read-based metagenomic annotation pipeline features
| Feature | MG-RAST | SUPER-FOCUS | eggNOG-mapper | HUMAnN2 | YAMP | MetaLAFFA |
|---|---|---|---|---|---|---|
| Metagenomic functional annotation | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Uses DIAMOND for read alignment | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Read pre-processing | ✓ | ✓ | ✓ | |||
| Ortholog aggregation to broader functional categorizations | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Available as a web service | ✓ | ✓ | ||||
| Native integration with distributed computing systems | ✓ | ✓ | ||||
| Automatic continuation from intermediate steps after interruption | ✓ | ✓ | ✓ | |||
| Convenient incorporation of new pipeline steps | ✓ | ✓ | ||||
| Universal single-copy gene-based abundance normalization via MUSiCC | | | | | | ✓ |
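
The "ortholog aggregation to broader functional categorizations" feature listed in the table can be illustrated with a minimal sketch. This is not the code of any pipeline in the table: the function name, the toy identifiers (`K1`, `M_A`, and so on), and the choice to split an ortholog's abundance evenly among its categories are all assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_orthologs(ortholog_abundance, ortholog_to_categories):
    """Sum ortholog abundances into broader functional categories.

    Assumption of this sketch: an ortholog that belongs to several
    categories has its abundance split evenly among them.
    """
    category_abundance = defaultdict(float)
    for ortholog, abundance in ortholog_abundance.items():
        categories = ortholog_to_categories.get(ortholog, [])
        if not categories:
            continue  # unmapped orthologs are dropped in this sketch
        share = abundance / len(categories)
        for category in categories:
            category_abundance[category] += share
    return dict(category_abundance)


# Hypothetical toy data: three orthologs, two broader categories.
abundance = {"K1": 4.0, "K2": 2.0, "K3": 1.0}
mapping = {"K1": ["M_A"], "K2": ["M_A", "M_B"]}
result = aggregate_orthologs(abundance, mapping)
```

Other conventions are possible, for example crediting the full abundance to every category an ortholog belongs to; the even split is used here only because it keeps the total abundance of mapped orthologs unchanged.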
Fig. 1 Flowchart of the default MetaLAFFA workflow. The default MetaLAFFA workflow consists of three phases: quality control (top), read mapping (middle), and functional annotation (bottom). This flowchart outlines the individual processing steps taken in each phase (colored rectangular boxes), the intermediate outputs of these steps (grey rounded boxes), supporting data files required for specific steps (yellow rounded boxes), user-provided input to MetaLAFFA (red rounded box), and the final outputs of the pipeline (purple rounded boxes). Third-party tools used in the default pipeline workflow are indicated in parentheses for their associated processing steps. File types of all inputs, outputs, and supporting data files are indicated by file suffix.
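
The structure described in the caption (ordered phases, each producing an intermediate output consumed by the next) together with the "automatic continuation from intermediate steps after interruption" feature can be sketched as follows. This is a toy illustration under stated assumptions, not MetaLAFFA's implementation: the step names, file names, and the rule "skip a step if its output file already exists" are all hypothetical.

```python
import tempfile
from pathlib import Path

# (phase, step name, output file name); all names are hypothetical.
STEPS = [
    ("quality control", "filter_reads", "filtered.txt"),
    ("read mapping", "map_reads", "hits.txt"),
    ("functional annotation", "summarize", "profile.txt"),
]


def run_workflow(workdir, step_fn):
    """Run steps in order, skipping any step whose output already exists."""
    workdir = Path(workdir)
    executed = []
    previous = None  # output of the preceding step, if any
    for _phase, name, out_name in STEPS:
        out_path = workdir / out_name
        if not out_path.exists():
            step_fn(name, previous, out_path)
            executed.append(name)
        previous = out_path
    return executed


def toy_step(name, in_path, out_path):
    """Stand-in for a real processing step: append the step name to its input."""
    upstream = in_path.read_text() if in_path else "raw reads"
    out_path.write_text(f"{upstream} -> {name}")


with tempfile.TemporaryDirectory() as tmp:
    first = run_workflow(tmp, toy_step)    # runs all three steps
    (Path(tmp) / "profile.txt").unlink()   # simulate losing the last output
    second = run_workflow(tmp, toy_step)   # reruns only the missing step
    final_text = (Path(tmp) / "profile.txt").read_text()
```

A production pipeline would also need to guard against partially written outputs, for instance by writing to a temporary file and renaming it on success; that is omitted here for brevity.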