Ray Sajulga, Caleb Easterly, Michael Riffle, Bart Mesuere, Thilo Muth, Subina Mehta, Praveen Kumar, James Johnson, Bjoern Andreas Gruening, Henning Schiebenhoefer, Carolin A Kolmeder, Stephan Fuchs, Brook L Nunn, Joel Rudney, Timothy J Griffin, Pratik D Jagtap.
Abstract
To gain a thorough appreciation of microbiome dynamics, researchers characterize the functional relevance of expressed microbial genes or proteins. This can be accomplished through metaproteomics, which characterizes the protein expression of microbiomes. Several software tools exist for analyzing microbiomes at the functional level by measuring their combined proteome-level response to environmental perturbations. In this survey, we explore the performance of six available tools to enable researchers to make informed decisions regarding software choice based on their research goals. Tandem mass spectrometry-based proteomic data obtained from dental caries plaque samples grown with and without sucrose in paired biofilm reactors were used as representative data for this evaluation. Microbial peptides from one sample pair were identified by the X! Tandem search algorithm via SearchGUI and subjected to functional analysis with six software tools (eggNOG-mapper, MEGAN5, MetaGOmics, MetaProteomeAnalyzer (MPA), ProPHAnE, and Unipept) to generate functional annotation through Gene Ontology (GO) terms. Comparing differentially expressed protein functional groups revealed notable differences in functional annotation among these tools. Based on the GO terms generated by these tools, we performed a peptide-level comparison to evaluate the quality of their functional annotations. A BLAST analysis against the NCBI non-redundant database revealed that the sensitivity and specificity of functional annotation varied between tools. For example, eggNOG-mapper mapped to the largest number of GO terms, while Unipept generated more accurate GO terms. Based on our evaluation, metaproteomics researchers can choose the software according to their analytical needs, and developers can use the resulting feedback to further optimize their algorithms.
To make more of these tools accessible via scalable metaproteomics workflows, eggNOG-mapper and Unipept 4.0 were incorporated into the Galaxy platform.
Year: 2020 PMID: 33170893 PMCID: PMC7654790 DOI: 10.1371/journal.pone.0241503
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. A comparative workflow of all six functional software tools producing GO term lists from the same dataset.
The inputs that are required for each software tool are connected from the top. The reference databases used for each tool are aligned in the middle. The outputs and Gene Ontology (GO) term translation processes are outlined at the bottom. Additional output types (data and visualizations) are shown in the table underneath the workflow.
Functional analysis of the oral dysbiosis dataset using molecular function GO terms.
| Tool | EggNOG-mapper | MEGAN | MetaGOmics | MPA | ProPHAnE | Unipept |
|---|---|---|---|---|---|---|
| | 18,440 | 1,665 | 2,829 | 23,169 | 3,999 | 3,471 |
| | 533,066 (6,155) | 76,529 (4,155) | 2,829 (2,829) | 77,204 (1,056) | 189,054 (2,598) | 3,471 (3,471) |
| | 88,582 (1,411) | 21,212 (1,613) | 900 (900) | 42,084 (634) | 57,208 (1,057) | 1,726 (1,726) |
| | 265 | 168 | 113 | 1 | 16 | 394 |
| | 1,466 | 1,693 | 1,002 | 974 | 1,135 | 2,249 |
| | 204 | 118 | 80 | 1 | 11 | 447 |
Fig 2. A) Qualitative and quantitative comparison of functional tools. Overlap of unique molecular function GO terms (left) and expanded GO terms (right) was compared amongst the six functional tools. Each value is the size of the term intersection (between the tools labeled on the column and row) divided by the total term set size of the tool listed on the horizontal axis (column). Each functional analysis tool was compared against every other tool. For example, for molecular function GO terms (left panel), 90% of the unique MPA term set is present in Unipept’s unique term set. For molecular function expanded GO terms, the overlap of all tools with Unipept is much larger (right panel’s top row). B) Comparison of quantitative expression for molecular function GO terms from Unipept and MetaGOmics. The log2 ratio of spectral counts for the ‘with sugar sample’ (WS) against the ‘no sugar sample’ (NS) was calculated for MetaGOmics- and Unipept-generated molecular function GO terms. Unipept identified 1,109 molecular function GO terms, while MetaGOmics identified 900 molecular function GO terms. The data points in the figure represent quantitative values for the 460 molecular function GO terms that overlapped between Unipept and MetaGOmics.
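The overlap values described for panel A can be illustrated with a minimal Python sketch. It assumes each tool's output has already been reduced to a set of GO identifiers; the function name `overlap_fractions` and the toy identifiers are hypothetical and are not taken from the study.

```python
from typing import Dict, Set


def overlap_fractions(term_sets: Dict[str, Set[str]]) -> Dict[str, Dict[str, float]]:
    """Return matrix[row][col] = |row ∩ col| / |col| for every pair of tools."""
    matrix: Dict[str, Dict[str, float]] = {}
    for row, row_terms in term_sets.items():
        matrix[row] = {}
        for col, col_terms in term_sets.items():
            shared = row_terms & col_terms
            # Normalize by the column tool's term set, as in the caption.
            matrix[row][col] = len(shared) / len(col_terms) if col_terms else 0.0
    return matrix


# Toy identifiers invented for illustration only (not real study data).
toy_sets = {
    "Unipept": {"GO:1", "GO:2", "GO:3", "GO:4"},
    "MPA": {"GO:1", "GO:2"},
}
fractions = overlap_fractions(toy_sets)
print(fractions["Unipept"]["MPA"])  # fraction of MPA's terms also found by Unipept
print(fractions["MPA"]["Unipept"])  # fraction of Unipept's terms also found by MPA
```

Because each value is normalized by the column tool's set size, the matrix is not symmetric: a small term set can be fully contained in a large one while covering only part of it.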
Fig 3. Analysis of peptides associated with acetyl-CoA C-acetyltransferase activity.
A) A combined GO hierarchy of unique terms annotated from a single peptide (sequence = FKDEIVPVVIPNK) for peptide-level tools (Unipept, MetaGOmics, and EggNOG-mapper), and a baseline tool: BLAST2GO—NCBI (nr). This peptide was selected from a group of 20 peptides randomly selected from all possible peptides that annotated ‘Acetyl-CoA C-acetyltransferase’ from the peptide-level tools. The peptide was selected because it shared results with the largest number of peptides (10). Similar analyses for these other peptides are included in Supplement S5 in S1 File. In this hierarchy, an arrow indicates an “is a” (parent/child) relationship. Colored blocks represent GO terms (color represents relationship type). Dashed, non-colored block outlines are labeled with a tool name and enclose the GO terms annotated by that tool. B) A stacked bar chart representation of the related terms (descendants, ancestors, or ancestor’s children). Colors correspond to these relation types. Green represents terms that were found through BLAST2GO’s annotation via NCBI (nr). The other colors represent the relationship of the other terms in other tools to those BLAST GO terms. These types are quantified and stacked on one another to show the contributions of each relationship type to the overall GO hierarchy.
Comparison of the top five upregulated and top five downregulated molecular function GO terms of Unipept with the molecular function GO terms from the other tools from the oral dysbiosis dataset.
| GO Term | Measure | Unipept | EggNOG | MEGAN | MetaGOmics | MPA | ProPHAnE |
|---|---|---|---|---|---|---|---|
| glucosyltransferase activity | FC (WS / NS) | 6.87 (116 / 0) | 2 (3 / 0) | 1.46 (21 / 7) | 2.71 (117 / 17) | 3.01 (7.03 / 0) | 0.52 (0.44 / 0) |
| | Percentile (%) | 0.058 | 12.267 | 18.438 | 8.024 | 1.157 | 1.9 |
| dextransucrase activity | FC (WS / NS) | 6.67 (101 / 0) | - | - | 6.71 (104 / 0) | 3.01 (7.03 / 0) | - |
| | Percentile (%) | 0.086 | - | - | 0.035 | 1.157 | - |
| pyruvate oxidase activity | FC (WS / NS) | 6.38 (82 / 0) | - | - | 6.13 (69 / 0) | - | -0.09 (0 / 0.07) |
| | Percentile (%) | 0.115 | - | - | 0.071 | - | 57.689 |
| glyceraldehyde-3-phosphate dehydrogenase (NADP+) (non-phosphorylating) activity | FC (WS / NS) | 6.04 (65 / 0) | - | - | 6.02 (64 / 0) | - | - |
| | Percentile (%) | 0.144 | - | - | 0.141 | - | - |
| fructuronate reductase activity | FC (WS / NS) | 5.7 (51 / 0) | - | 0 (1 / 1) | - | - | - |
| | Percentile (%) | 0.173 | - | 52.072 | - | - | - |
| CoA-transferase activity | FC (WS / NS) | -7.84 (2 / 685.56) | -6.28 (0 / 76.8) | -8.56 (0 / 377) | -5.15 (15 / 568) | -4.52 (0 / 22) | -1.3 (0 / 1.47) |
| | Percentile (%) | 99.885 | 99.881 | 99.94 | 97.702 | 99.236 | 99.925 |
| acetyl-CoA C-acyltransferase activity | FC (WS / NS) | -7.92 (0 / 240.7) | -6.45 (0 / 86.16) | -8.59 (0 / 384) | -8.37 (0 / 330) | -5 (0 / 31) | -1.09 (0 / 1.13) |
| | Percentile (%) | 99.914 | 99.913 | 100 | 99.894 | 99.819 | 99.825 |
| butyrate-acetoacetate CoA-transferase activity | FC (WS / NS) | -8.36 (0 / 326.86) | - | - | -6.85 (0 / 114) | -4.52 (0 / 22) | -0.42 (0 / 0.33) |
| | Percentile (%) | 99.942 | - | - | 99.434 | 99.258 | 96.674 |
| glutaconate CoA-transferase activity | FC (WS / NS) | -8.41 (0 / 339.03) | - | - | -8.13 (0 / 279) | -4 (0 / 15) | -0.55 (0 / 0.47) |
| | Percentile (%) | 99.971 | - | - | 99.823 | 98.412 | 98.25 |
| acetyl-CoA C-acetyltransferase activity | FC (WS / NS) | -8.54 (0 / 369.94) | -6.45 (0 / 86.16) | -8.59 (0 / 384) | -8.37 (0 / 330) | -5.46 (0 / 43) | -1.09 (0 / 1.13) |
| | Percentile (%) | 100 | 99.913 | 100 | 99.859 | 99.987 | 99.825 |
Fold changes are listed in descending order for Unipept. For the other tools, if multiple GO terms match the top term, the term with the highest absolute fold change is displayed. Spectral counts for the “with sucrose” and “no sucrose” (WS / NS) conditions, which are used to calculate the displayed fold change, are shown in parentheses. Percentiles indicate the position of that particular term in that tool’s GO set containing all ontologies (0 = most upregulated; 100 = most downregulated).
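The relationship between the spectral counts and the displayed fold changes can be sketched as follows. The table does not state the formula; the values shown are consistent with a log2 ratio using a pseudocount of 1, which this sketch adopts as an assumption. The percentile formula (1-based rank divided by set size) is likewise an assumption, and tie handling is left unspecified. Function names are hypothetical.

```python
import math
from typing import Dict


def log2_fold_change(ws: float, ns: float, pseudocount: float = 1.0) -> float:
    """Log2 ratio of WS to NS spectral counts.

    The pseudocount of 1 is an assumption; it avoids division by zero
    and reproduces the values displayed in the table.
    """
    return math.log2((ws + pseudocount) / (ns + pseudocount))


def percentile_ranks(fold_changes: Dict[str, float]) -> Dict[str, float]:
    """Rank terms from most upregulated (low percentile) to most downregulated (100).

    Assumes percentile = 100 * rank / n with a 1-based rank; ties are
    ordered arbitrarily by the sort in this sketch.
    """
    ordered = sorted(fold_changes, key=fold_changes.get, reverse=True)
    n = len(ordered)
    return {term: 100.0 * (i + 1) / n for i, term in enumerate(ordered)}


# Unipept counts for two rows of the table above.
fc_up = log2_fold_change(116, 0)       # glucosyltransferase activity
fc_down = log2_fold_change(2, 685.56)  # CoA-transferase activity
print(round(fc_up, 2), round(fc_down, 2))  # 6.87 -7.84

# Toy fold changes invented for illustration only.
ranks = percentile_ranks({"term_a": 2.0, "term_b": 0.0, "term_c": -3.0})
```

The pseudocount keeps terms detected in only one condition on a finite scale, which is why entries such as 116 / 0 still receive a numeric fold change.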
Fig 4. Gene ontology hierarchy analysis of a single GO term for all six tools.
A) A stacked bar chart representation of the related terms (descendants, ancestors, ancestor’s children, or extraneous) for xanthine dehydrogenase activity (XDA) for all six functional tools. Colors correspond to these relation types. Green represents terms that were found through BLAST2GO’s annotation via NCBI (nr). The other colors represent the relationship of the other terms in other tools to those BLAST GO terms. These types are quantified and stacked on one another to show the contributions of each relationship type to the overall GO hierarchy. B) A single GO hierarchy analysis of xanthine dehydrogenase activity (XDA) (GO:0004854) for all six functional tools. For each tool, the GO terms of annotation groups containing XDA are arranged in a GO hierarchy. Each GO term is contained in a rounded rectangle. All six hierarchies are layered upon one another (with the largest in the background and the smallest in the foreground). Each color represents a tool (not a GO relationship). Solid lines represent “is a” (parent/child) relationships, and the dashed line indicates a connection that is present in the overall combined hierarchy but not in any of the individual hierarchies. Circles indicate GO terms derived from BLAST2GO—NCBI (nr) results using peptides from the peptide-level tools (MetaGOmics, Unipept, EggNOG-mapper). GO terms not represented by more than one peptide are omitted, which is why eggNOG-mapper is not represented here.
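The relation types used in these hierarchy analyses (descendants, ancestors, ancestor’s children, extraneous) can be illustrated with a minimal sketch that classifies a tool's term against a set of reference (BLAST-derived) terms. The category definitions here are one plausible reading of the caption, not the authors' stated procedure; the toy hierarchy, the term names, and the function names are all hypothetical, and only “is a” links are modeled.

```python
from typing import Dict, Set


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Collect every term reachable by following 'is a' links upward."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def classify(term: str, reference: Set[str], parents: Dict[str, Set[str]]) -> str:
    """Label a term by its relationship to the reference terms."""
    if term in reference:
        return "reference"
    ref_ancestors: Set[str] = set()
    for ref in reference:
        ref_ancestors |= ancestors(ref, parents)
    if term in ref_ancestors:
        return "ancestor"
    if ancestors(term, parents) & reference:
        return "descendant"
    if set(parents.get(term, ())) & ref_ancestors:
        return "ancestor's child"
    return "extraneous"


# Toy child -> parents hierarchy, invented for illustration (not real GO terms).
toy_parents = {
    "mid": {"root"},
    "ref": {"mid"},
    "desc": {"ref"},
    "side": {"mid"},
    "other": {"root"},
    "far": {"other"},
}
labels = {t: classify(t, {"ref"}, toy_parents)
          for t in ["ref", "mid", "desc", "side", "far"]}
print(labels)
```

Counting the labels per tool would give the kind of per-category totals that a stacked bar chart summarizes.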