Renato Giliberti, Sara Cavaliere, Italia Elisa Mauriello, Danilo Ercolini, Edoardo Pasolli.
Abstract
Machine learning-based classification approaches are widely used to predict host phenotypes from microbiome data. Classifiers typically take operational taxonomic units or relative abundance profiles as input features. Such data are intrinsically sparse, which opens the opportunity to make predictions from the presence/absence rather than the relative abundance of microbial taxa. This also raises the question of whether it is the presence, rather than the abundance, of particular taxa that is relevant for discrimination, an aspect that has so far been overlooked in the literature. In this paper, we aim to fill this gap by performing a meta-analysis of 4,128 publicly available metagenomes associated with multiple case-control studies. At species-level taxonomic resolution, we show that it is the presence, rather than the relative abundance, of specific microbial taxa that matters when building classification models. These findings are robust to the choice of classifier and are confirmed by statistical tests for identifying differentially abundant/present taxa. Results are further confirmed at coarser taxonomic resolutions and validated on 4,026 additional 16S rRNA samples from 30 public case-control studies.
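The presence/absence encoding discussed in the abstract can be illustrated with a minimal sketch. The function name `to_presence_absence`, the `threshold` parameter, and the toy abundance values below are assumptions made for illustration; they are not taken from the paper, and the cutoff the authors actually used may differ.

```python
from typing import List, Sequence


def to_presence_absence(
    profiles: Sequence[Sequence[float]], threshold: float = 0.0
) -> List[List[int]]:
    """Binarize abundance profiles.

    A taxon is marked present (1) when its abundance is strictly greater
    than `threshold`, and absent (0) otherwise. The default threshold of
    0.0 is an assumption, not the paper's documented setting.
    """
    return [[1 if value > threshold else 0 for value in sample] for sample in profiles]


def zero_fraction(profiles: Sequence[Sequence[float]]) -> float:
    """Fraction of entries equal to zero, a simple measure of sparsity."""
    total = sum(len(sample) for sample in profiles)
    zeros = sum(1 for sample in profiles for value in sample if value == 0)
    return zeros / total


# Toy relative-abundance table (rows: samples, columns: taxa).
# The values are made up purely for illustration.
abundance = [
    [0.60, 0.00, 0.40, 0.00],
    [0.10, 0.00, 0.85, 0.05],
    [0.00, 0.70, 0.00, 0.30],
    [0.00, 0.95, 0.05, 0.00],
]

presence = to_presence_absence(abundance)
print(presence)
print(zero_fraction(abundance))
```

Either matrix (`abundance` or `presence`) could then be passed as the feature table to the same classifier, which is the kind of comparison the abstract describes.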
Year: 2022 PMID: 35446845 PMCID: PMC9064115 DOI: 10.1371/journal.pcbi.1010066
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Summary of the 25 classification tasks derived from metagenomic datasets for case-control prediction.
ACVD: Atherosclerotic cardiovascular disease, AD: Atopic dermatitis, BD: Behcet’s disease, CRC: Colorectal cancer, IBD: Inflammatory bowel disease, T1D: Type 1 diabetes, T2D: Type 2 diabetes. We additionally considered the HMP_2012 dataset [10] for body site discrimination between gut (N = 414) and oral (N = 147) samples.
| Dataset name | Body site | # controls | Case condition | # cases | Reference |
|---|---|---|---|---|---|
| JieZ_2017 | Gut | 171 | ACVD | 214 | [ |
| ChngKR_2016 | Skin | 40 | AD | 38 | [ |
| YeZ_2018 | Gut | 45 | BD | 20 | [ |
| RaymondF_2016 | Gut | 36 | Cephalosporins | 36 | [ |
| QinN_2014 | Gut | 114 | Cirrhosis | 123 | [ |
| FengQ_2015 | Gut | 61 | CRC | 46 | [ |
| GuptaA_2019 | Gut | 30 | CRC | 28 | [ |
| HanniganGD_2017 | Gut | 28 | CRC | 27 | [ |
| ThomasAM_2018a | Gut | 24 | CRC | 29 | [ |
| ThomasAM_2018b | Gut | 28 | CRC | 32 | [ |
| VogtmannE_2016 | Gut | 52 | CRC | 52 | [ |
| WirbelJ_2018 | Gut | 65 | CRC | 60 | [ |
| YachidaS_2019 | Gut | 251 | CRC | 258 | [ |
| YuJ_2015 | Gut | 53 | CRC | 75 | [ |
| ZellerG_2014 | Gut | 54 | CRC | 61 | [ |
| LiJ_2017 | Gut | 41 | Hypertension | 99 | [ |
| IjazUZ_2017 | Gut | 38 | IBD | 56 | [ |
| NielsenHB_2014 | Gut | 248 | IBD | 148 | [ |
| GhensiP_2019_m | Oral | 49 | Mucositis | 20 | [ |
| GhensiP_2019 | Oral | 49 | Peri-implantitis | 23 | [ |
| Castro_NallarE_2015 | Oral | 16 | Schizophrenia | 16 | [ |
| Heitz-BuschartA_2016 | Gut | 26 | T1D | 27 | [ |
| KosticAD_2015 | Gut | 89 | T1D | 31 | [ |
| KarlssonFH_2013 | Gut | 43 | T2D | 53 | [ |
| QinJ_2012 | Gut | 174 | T2D | 170 | [ |