Eric A Franzosa, Lauren J McIver, Gholamali Rahnavard, Luke R Thompson, Melanie Schirmer, George Weingart, Karen Schwarzberg Lipson, Rob Knight, J Gregory Caporaso, Nicola Segata, Curtis Huttenhower.
Abstract
Functional profiles of microbial communities are typically generated using comprehensive metagenomic or metatranscriptomic sequence read searches, which are time-consuming, prone to spurious mapping, and often limited to community-level quantification. We developed HUMAnN2, a tiered search strategy that enables fast, accurate, and species-resolved functional profiling of host-associated and environmental communities. HUMAnN2 identifies a community's known species, aligns reads to their pangenomes, performs translated search on unclassified reads, and finally quantifies gene families and pathways. Relative to pure translated search, HUMAnN2 is faster and produces more accurate gene family profiles. We applied HUMAnN2 to study clinal variation in marine metabolism, ecological contribution patterns among human microbiome pathways, variation in species' genomic versus transcriptional contributions, and strain profiling. Further, we introduce 'contributional diversity' to explain patterns of ecological assembly across different microbial community types.
Year: 2018 PMID: 30377376 PMCID: PMC6235447 DOI: 10.1038/s41592-018-0176-y
Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547
Figure 1: HUMAnN2 functionally profiles microbial communities with high accuracy using tiered search.
(a) Overview of HUMAnN2’s tiered search algorithm for meta’omic functional profiling (expanded in Supplementary Fig. 1). (b) HUMAnN2’s tiered search vs. pure translated search evaluated on a synthetic gut metagenome. Sensitivity, precision, and overall accuracy (1 - Bray-Curtis dissimilarity) were computed for (c) gene family and (d) pathway abundance profiles relative to gold standards at the whole-community level (“overall”) and for each stratification. (e) HUMAnN2 compared with other methods in the task of quantifying community-total COG abundances. Runtimes reflect multi-threaded execution on 8 CPU cores.
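The three evaluation measures named in this caption can be illustrated with a minimal Python sketch. It assumes each profile is a mapping from feature name to abundance, and that sensitivity and precision are computed on feature presence/absence (abundance > 0); the caption does not state this, so treat it as an assumption. The function names `bray_curtis` and `evaluate_profile` are hypothetical and are not part of HUMAnN2.

```python
def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance profiles (dicts)."""
    keys = set(p) | set(q)
    total = sum(p.get(k, 0.0) + q.get(k, 0.0) for k in keys)
    if total == 0:
        return 0.0  # two empty profiles are treated as identical
    diff = sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)
    return diff / total


def evaluate_profile(predicted, gold):
    """Return (sensitivity, precision, accuracy) of a predicted profile.

    Assumption: a feature counts as detected when its abundance is > 0.
    Accuracy follows the caption: 1 - Bray-Curtis dissimilarity.
    """
    pred_set = {k for k, v in predicted.items() if v > 0}
    gold_set = {k for k, v in gold.items() if v > 0}
    true_pos = len(pred_set & gold_set)
    sensitivity = true_pos / len(gold_set) if gold_set else 0.0
    precision = true_pos / len(pred_set) if pred_set else 0.0
    accuracy = 1.0 - bray_curtis(predicted, gold)
    return sensitivity, precision, accuracy


# Toy example: one shared gene family, one missed, one spurious.
gold = {"famA": 0.5, "famB": 0.5}
predicted = {"famA": 0.5, "famC": 0.5}
sens, prec, acc = evaluate_profile(predicted, gold)
```

In the toy example, half of the gold-standard families are recovered and half of the predicted families are correct, so all three measures come out at 0.5.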
Figure 2: Contributional diversity of core human microbiome pathways.
(a) Within- and between-sample contributional diversity for core metabolic pathways (individual points) from HMP metagenomes. Stars indicate background species-level whole-community diversity. (b-e) Examples of pathways with “extreme” diversity patterns. The top of each set of stacked bars indicates the total stratified abundance of the pathway within a single sample (log-scaled). Species and “unclassified” stratifications are linearly (proportionally) scaled within the total bar height.
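As a rough illustration of how within- and between-sample contributional diversity could be computed for a single pathway, the Python sketch below uses the Gini-Simpson index within samples and mean pairwise Bray-Curtis dissimilarity between samples. The caption does not name the indices, so both choices are assumptions here, as are the function names and the input layout (one dict of species-to-abundance per sample).

```python
from itertools import combinations


def proportions(contrib):
    """Rescale one sample's per-species contributions to sum to 1."""
    total = sum(contrib.values())
    if total <= 0:
        return {}
    return {species: v / total for species, v in contrib.items()}


def gini_simpson(props):
    """Within-sample diversity of contributors (assumed index)."""
    return 1.0 - sum(p * p for p in props.values())


def bray_curtis(p, q):
    """Between-sample dissimilarity of contributor profiles (assumed index)."""
    keys = set(p) | set(q)
    total = sum(p.get(k, 0.0) + q.get(k, 0.0) for k in keys)
    if total == 0:
        return 0.0
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys) / total


def contributional_diversity(samples):
    """Return (mean within-sample, mean between-sample) diversity.

    samples: list of dicts mapping species -> abundance of ONE pathway.
    Samples in which the pathway has zero total abundance are skipped.
    """
    profiles = [p for p in (proportions(s) for s in samples) if p]
    if not profiles:
        return 0.0, 0.0
    within = sum(gini_simpson(p) for p in profiles) / len(profiles)
    pairs = list(combinations(profiles, 2))
    between = sum(bray_curtis(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return within, between


# Same two contributors in equal shares everywhere: high within, zero between.
even = contributional_diversity([{"sp1": 1, "sp2": 1}, {"sp1": 1, "sp2": 1}])
# A single but different contributor per sample: zero within, maximal between.
swapped = contributional_diversity([{"sp1": 2.0}, {"sp2": 3.0}])
```

The two toy inputs correspond to two opposite corners of the within/between plane: a pathway carried by the same mix of species in every sample, and a pathway carried by one species per sample with the species differing between samples.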
Figure 3: Thermocline-associated microbial enzymes in the marine pelagic zone.
(a-e) Examples of KEGG Orthogroups (KOs) demonstrating strong temperature associations across 45 Red Sea metagenomes; all were newly quantified by HUMAnN2 relative to the samples’ initial publication. (f) Pearson correlations for 4,609 KOs that were quantified by both HUMAnN2 and HUMAnN1. “GAIW” indicates “Gulf of Aden Intermediate Water”: a cool nutrient-rich water mass within the Red Sea. The n=45 total samples in (f) are subdivided by depth layers (the sample from 258 m was grouped with the 500-m samples) and colored by latitude. From smallest to largest, box plot elements represent the lower inner fence, first quartile, median, third quartile, and upper inner fence.
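For readers unfamiliar with the statistic, a Pearson correlation between two per-sample vectors can be computed as in the minimal Python sketch below. Which two vectors are paired for each KO in panel (f) is not spelled out in the caption; the example pairs a KO's abundance with temperature purely for illustration, and the values are invented toy numbers, not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when either vector is constant
    return sxy / math.sqrt(sxx * syy)


# Toy values only: a hypothetical KO whose abundance rises with temperature.
temperature_c = [21.0, 23.5, 26.0, 28.5]
ko_abundance = [1.0, 2.0, 3.0, 4.0]
r = pearson(temperature_c, ko_abundance)
```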
Figure 4: Metatranscriptomic functional profiling and multi’omic data integration with HUMAnN2.
(a) Average within-sample metagenomic (DNA) versus metatranscriptomic (RNA) contributional diversities for n=181 core pathways profiled from 78 paired inflammatory bowel disease (IBD) meta’omes from the IBDMDB cohort. Pathways are colored by “relative expression” (RNA:DNA ratio). (b) Sucrose degradation (outlined in ‘a’) is a prevalent pathway with high within-subject contributional diversity at the DNA level but low within-subject contributional diversity at the RNA level. This pattern was conserved across three IBD phenotypes: Crohn’s disease (CD), ulcerative colitis (UC), and non-IBD controls. Species’ contributions were rescaled to sum to 1 within each sample (set of stacked bars).
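The two transformations mentioned in this caption, rescaling species' contributions to sum to 1 within a sample and taking an RNA:DNA ratio as "relative expression", can be sketched in Python as follows. The function names and the handling of zero denominators are assumptions made for illustration; the caption does not describe how such cases were treated.

```python
def rescale_contributions(contrib):
    """Rescale one sample's per-species contributions so they sum to 1.

    contrib: dict mapping species -> pathway abundance in that sample.
    Returns an empty dict if the pathway has no abundance (assumed handling).
    """
    total = sum(contrib.values())
    if total <= 0:
        return {}
    return {species: value / total for species, value in contrib.items()}


def relative_expression(rna_abundance, dna_abundance):
    """RNA:DNA ratio for one pathway; None when DNA abundance is zero."""
    if dna_abundance <= 0:
        return None  # ratio undefined; assumed handling, not from the source
    return rna_abundance / dna_abundance


# Toy values only: many DNA-level contributors, one dominant RNA-level one.
dna = rescale_contributions({"sp1": 2.0, "sp2": 6.0})
rna = rescale_contributions({"sp1": 9.0, "sp2": 1.0})
ratio = relative_expression(4.0, 2.0)
```

Rescaling puts the DNA and RNA stacked bars on the same 0 to 1 scale, so a shift in who contributes is visible regardless of how much total pathway abundance each sample has.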