Hanbo Jin1, Guoru Hu1, Chuqing Sun1, Yiqian Duan2, Zhenmo Zhang1, Zhi Liu3, Xing-Ming Zhao2,4,5, Wei-Hua Chen1,6.
Abstract
mBodyMap is a curated database for microbes across the human body and their associations with health and diseases. Its primary aim is to promote the reusability of human-associated metagenomic data and assist with the identification of disease-associated microbes by consistently annotating the microbial contents of collected samples using state-of-the-art toolsets and manually curating the meta-data of corresponding human hosts. mBodyMap organizes collected samples based on their association with human diseases and body sites to enable cross-dataset integration and comparison. To help users find microbes of interest and visualize and compare their distributions and abundances/prevalence within different body sites and various diseases, the mBodyMap database is equipped with an intuitive interface and extensive graphical representations of the collected data. So far, it contains a total of 63 148 runs, including 14 401 metagenomes and 48 747 amplicons related to health and 56 human diseases, from within 22 human body sites across 136 projects. Also available in the database are pre-computed abundances and prevalence of 6247 species (belonging to 1645 genera) stratified by body sites and diseases. mBodyMap can be accessed at: https://mbodymap.microbiome.cloud.
Year: 2022 PMID: 34718713 PMCID: PMC8728210 DOI: 10.1093/nar/gkab973
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Key features of mBodyMap and comparison with similar databases on microbe-human disease associations
| Database | Key features | Data source | Data size: # Disease | Data size: # Microbe | Reference |
|---|---|---|---|---|---|
| mBodyMap | • Comprehensive collection of metagenomic data and analysis using state-of-the-art tools | Metagenomics data | 56 | 6247 | This study |
| HMDAD | • Text mining in large quantity of publications followed by manual curation | Text-mining | 39 | 292 | ( |
| Disbiome | • Collection and presentation of published microbiota-disease information in a standardized way | Text-mining | 372 | 1622 | ( |
| MicroPhenoDB | • Provision of non-redundant associations between microbes and human disease phenotypes across human body and relationships between unique clade-specific core genes and microbes | Text-mining, HMDAD and Disbiome | 542 | 1781 | ( |
Figure 1. Overview of data in mBodyMap. (A) The left panel contains an interactive body map indicating clickable body sites for which metagenomic data are available; the right panel contains the number of samples for each body site, stratified by health (dark green) and diseases (yellow). (B) A barplot summarizing the meta-data we have collected for samples. The Y-axis represents meta-information, and the X-axis denotes the proportion of the samples comprising this meta-information. (C) The integrity of the metadata assessed based on age, sex and BMI.
Statistics of health and the top 10 diseases included in mBodyMap
| Health/disease | No. of associated sites | No. of processed runs | No. of valid runs | No. of associated species | No. of associated genera |
|---|---|---|---|---|---|
| Health | 21 | 42 816 | 36 852 | 6070 | 1623 |
| Respiratory tract infections | 3 | 2357 | 2274 | 3525 | 1103 |
| Cystic fibrosis | 1 | 2129 | 1656 | 4569 | 1353 |
| Pouchitis | 2 | 1858 | 889 | 3621 | 1190 |
| Bacterial vaginosis | 1 | 1541 | 1538 | 3775 | 1227 |
| Chronic obstructive pulmonary disease | 3 | 1174 | 1084 | 4122 | 1292 |
| Premature birth | 2 | 1137 | 1110 | 3040 | 952 |
| Necrotizing enterocolitis | 1 | 1094 | 659 | 1037 | 323 |
| Asthma | 1 | 870 | 850 | 3654 | 1196 |
| Crohn disease | 2 | 714 | 398 | 1189 | 423 |
| Endometrial neoplasms | 8 | 660 | 604 | 2835 | 999 |
No. of associated sites: the number of body sites from which samples with this health/disease status were collected.
No. of processed runs: the number of all runs with processed sequence data; all the runs are processed eventually.
No. of valid runs: the number of runs whose data passed our quality control procedure, with the corresponding species/genus relative abundances available in our database.
No. of associated species: the number of species associated with processed and valid runs.
No. of associated genera: the number of genera associated with processed and valid runs.
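The gap between processed and valid runs in the table above can be expressed as a quality-control pass rate (valid runs divided by processed runs). The following is a minimal sketch of that arithmetic using the counts from the table. The names `RUN_COUNTS`, `qc_pass_rate` and `rank_by_pass_rate` are illustrative and are not part of mBodyMap.

```python
# (processed runs, valid runs) copied from the statistics table above.
RUN_COUNTS = {
    "Health": (42816, 36852),
    "Respiratory tract infections": (2357, 2274),
    "Cystic fibrosis": (2129, 1656),
    "Pouchitis": (1858, 889),
    "Bacterial vaginosis": (1541, 1538),
    "Chronic obstructive pulmonary disease": (1174, 1084),
    "Premature birth": (1137, 1110),
    "Necrotizing enterocolitis": (1094, 659),
    "Asthma": (870, 850),
    "Crohn disease": (714, 398),
    "Endometrial neoplasms": (660, 604),
}


def qc_pass_rate(processed, valid):
    """Fraction of processed runs that passed quality control."""
    if processed <= 0:
        raise ValueError("processed run count must be positive")
    return valid / processed


def rank_by_pass_rate(run_counts):
    """Return (name, pass rate) pairs sorted from lowest to highest rate."""
    rates = {name: qc_pass_rate(p, v) for name, (p, v) in run_counts.items()}
    return sorted(rates.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for name, rate in rank_by_pass_rate(RUN_COUNTS):
        print(f"{name}: {rate:.1%}")
```

With these counts, pouchitis has the lowest pass rate (889 of 1858 runs, about 48%) and bacterial vaginosis the highest (1538 of 1541 runs).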
Figure 2. Graphical representation of the abundances, prevalence, and distributions of a selected taxon within health and diseases. Here, Haemophilus parainfluenzae in the upper respiratory tract is used as an example. (A) Its prevalence across health and multiple diseases. The Y-axis represents health and various diseases, and the X-axis denotes the proportion of the samples comprising this health or disease. (B) Box plot of its relative abundances. The Y-axis represents health and various diseases, and the X-axis denotes relative abundance. (C) Its distributions among health and various diseases.
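As a rough illustration of the two quantities plotted per health or disease group, the sketch below computes prevalence and median relative abundance for one taxon from a small made-up sample list. It assumes prevalence means the fraction of samples in which the taxon's relative abundance exceeds a detection threshold; the threshold, the toy values and the function name `summarize_taxon` are assumptions for illustration, not the procedure used by mBodyMap.

```python
from collections import defaultdict
from statistics import median

# Hypothetical relative abundances of one taxon in individual samples.
TOY_SAMPLES = [
    {"phenotype": "Health", "abundance": 0.02},
    {"phenotype": "Health", "abundance": 0.0},
    {"phenotype": "Health", "abundance": 0.05},
    {"phenotype": "Health", "abundance": 0.01},
    {"phenotype": "Asthma", "abundance": 0.0},
    {"phenotype": "Asthma", "abundance": 0.0},
    {"phenotype": "Asthma", "abundance": 0.1},
]


def summarize_taxon(samples, detection_threshold=0.0):
    """Group samples by phenotype and report prevalence and median abundance.

    A sample counts as positive when its relative abundance is strictly
    greater than detection_threshold (an assumed definition).
    """
    groups = defaultdict(list)
    for sample in samples:
        groups[sample["phenotype"]].append(sample["abundance"])

    summary = {}
    for phenotype, values in groups.items():
        positives = sum(1 for v in values if v > detection_threshold)
        summary[phenotype] = {
            "n_samples": len(values),
            "prevalence": positives / len(values),
            "median_abundance": median(values),
        }
    return summary


if __name__ == "__main__":
    for phenotype, stats in summarize_taxon(TOY_SAMPLES).items():
        print(phenotype, stats)
```

Reporting both numbers per group is useful because a taxon can be widespread but rare in abundance, or abundant in only a few samples.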
Figure 3. Distribution of Streptococcus mitis, a known disease-causing bacterium, across body sites in mBodyMap. The panels show the relative abundance (A) and prevalence (B) of S. mitis in various sites of healthy and diseased human bodies. S. mitis was found in significant abundances in multiple body sites of the diseased population, which is consistent with its characterization as a pathogenic bacterium.