Zhanshan Sam Ma.
Abstract
There are two major sequencing technologies for investigating the microbiome: amplicon sequencing, which generates OTU (operational taxonomic unit) tables of marker genes (e.g., bacterial 16S-rRNA), and metagenomic shotgun sequencing, which generates metagenomic gene abundance (MGA) tables. The OTU table is the counterpart of the species abundance table in the macrobial ecology of plants and animals, and has been the target of numerous ecological and network analyses in the recent gold rush of microbiome research and in the effort to establish an inclusive theoretical ecology. In contrast, MGA analyses have been largely limited to bioinformatics pipelines and ad hoc statistical methods, and systematic approaches to MGAs guided by classic ecological theories are still few. Here, we argue that the difference between "gene kinds" and "gene species" is nominal, and that the metagenome a microbiota carries is essentially a 'community' of metagenomic genes (MGs). Each row of an MGA table represents the metagenome of one microbiota, and the whole MGA table represents a 'meta-metagenome' (an assemblage of metagenomes) of N microbiotas (microbiome samples). Consequently, the same ecological/network analyses used for OTU tables should be equally applicable to MGA tables. Here we choose to analyze the heterogeneity of the metagenome by introducing the classic Taylor's power law (TPL) and its recent extensions in community ecology. Heterogeneity is a fundamental property of the metagenome, particularly in the context of human microbiomes. Recent studies have shown that the heterogeneity of human metagenomes is far greater than that of human genomes. Therefore, without a deep understanding of human metagenome heterogeneity, personalized medicine for human microbiome-associated diseases is hardly feasible.
The TPL extensions have been successfully applied to measure the heterogeneity of the human microbiome based on amplicon-sequencing reads of marker genes (e.g., 16S-rRNA). In this article, we demonstrate the analysis of the metagenomic heterogeneity of the human gut microbiome at the whole-metagenome scale (with the type-I power law extension) and at the metagenomic gene scale (type-III), as well as the heterogeneity of gene clusters. We further examine the influences of obesity, IBD and diabetes on the heterogeneity, which has important ramifications for the diagnosis and treatment of human microbiome-associated diseases.
Keywords: Taylor’s power law; medical ecology of metagenome; metagenome ecology; metagenome functional gene cluster (MFGC); metagenome spatial heterogeneity; metagenomic gene abundance (MGA) table; power law extensions
Year: 2020 PMID: 32435232 PMCID: PMC7218080 DOI: 10.3389/fmicb.2020.00648
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
FIGURE 1. Flowchart for analyzing microbiome heterogeneity from ecological, taxonomical, functional and evolutionary perspectives at various scales [OTU, MG (metagenomic gene), MFGC (metagenome functional gene cluster), MF/MP (metagenomic function/pathway)] with the power law extensions (PLEs). The parts on the right side, framed in red, are newly introduced in the present study. See the Online Supplementary Information (OSI) for the R scripts implementing the PLE analysis and randomization tests.
The parameters of PLE-I (type-I power law extension) for metagenome spatial heterogeneity, in terms of the MGA (metagenomic gene abundance).
| Power Law Extension (PLE) | Case study | Treatment | b | SE(b) | ln(a) | SE[ln(a)] | M0 | R | p-value | N |
| Type-I PLE for Metagenome Spatial Heterogeneity with MGA | Obesity | Lean | 2.012 | 0.113 | 3.740 | 0.337 | 0.025 | 0.878 | <0.001 | 95 |
| | | Overweight | 3.447 | 0.158 | -1.204 | 0.532 | 1.636 | 0.914 | <0.001 | 96 |
| | Type 2 diabetes | Healthy | 3.232 | 0.210 | -0.529 | 0.650 | 1.267 | 0.876 | <0.001 | 74 |
| | | Disease | 1.846 | 0.143 | 3.982 | 0.447 | 0.009 | 0.840 | <0.001 | 71 |
| | IBD | Healthy | 1.385 | 0.079 | 5.365 | 0.266 | 0.000 | 0.903 | <0.001 | 71 |
| | | Disease | 2.248 | 0.227 | 2.754 | 0.761 | 0.110 | 0.766 | <0.001 | 71 |
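The fit behind these parameters can be sketched as follows. This is a minimal illustration and not the R scripts from the OSI. It assumes that PLE-I uses one (mean, variance) point per sample, computed across the genes in that sample's MGA row, and that the power law V = a·M^b is fitted by ordinary least squares on the log-log scale. The function names and the toy table are hypothetical. The critical-value formula exp(ln(a)/(1 − b)) is also an assumption, although it agrees with the table: for the Lean row, exp(3.740 / (1 − 2.012)) ≈ 0.025, which matches the fifth value in that row.

```python
import math
from statistics import mean, variance


def fit_log_log(means, variances):
    """Least-squares fit of ln(V) = ln(a) + b * ln(M); returns (b, ln_a)."""
    xs = [math.log(m) for m in means]
    ys = [math.log(v) for v in variances]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return b, y_bar - b * x_bar


def ple1(mga_rows):
    """Assumed PLE-I layout: one (mean, variance) point per sample (row)."""
    means = [mean(row) for row in mga_rows]
    variances = [variance(row) for row in mga_rows]
    return fit_log_log(means, variances)


def critical_m0(b, ln_a):
    """Assumed critical value exp(ln(a) / (1 - b)); undefined when b == 1."""
    return math.exp(ln_a / (1.0 - b))


# Hypothetical toy MGA table: 4 samples (rows) x 4 genes (columns).
toy_mga = [
    [1, 2, 3, 10],
    [2, 5, 9, 40],
    [4, 8, 30, 120],
    [10, 20, 70, 300],
]
b, ln_a = ple1(toy_mga)
print(f"b = {b:.3f}, ln(a) = {ln_a:.3f}")
print(f"M0 from the Lean row: {critical_m0(2.012, 3.740):.3f}")
```

The sketch omits the standard errors and the p-value of the regression, which a full analysis would also report.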
The p-values of the randomization tests for the differences between the healthy and diseased treatments in the PLE-I parameters of metagenome spatial heterogeneity.
| Power Law Extension (PLE) | Case study | Treatments | b | ln(a) | M0 |
| Type-I PLE for Metagenome Spatial Heterogeneity with MGA | Obesity | Lean vs. Overweight | <0.001 | <0.001 | <0.001 |
| | Type 2 diabetes | Healthy vs. Disease | 0.044 | 0.038 | 0.044 |
| | IBD | Healthy vs. Disease | 0.021 | 0.043 | 0.015 |
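A randomization test of this kind can be sketched as a generic label-permutation test. The exact scheme used in the study's R scripts is not given here, so everything below is an assumption: the samples of both treatments are pooled, the treatment labels are shuffled, the statistic is recomputed for each pseudo-group, and the p-value is the fraction of shuffles whose absolute difference is at least as large as the observed one. The names `ple1_slope` and `permutation_p_value`, the default number of permutations, and the seed are hypothetical.

```python
import math
import random
from statistics import mean, variance


def ple1_slope(mga_rows):
    """Slope b of ln(V) = ln(a) + b * ln(M), one point per sample (row)."""
    xs = [math.log(mean(row)) for row in mga_rows]
    ys = [math.log(variance(row)) for row in mga_rows]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


def permutation_p_value(group_a, group_b, stat, n_perm=1000, seed=0):
    """Fraction of label shuffles with |stat(A) - stat(B)| >= observed."""
    rng = random.Random(seed)
    observed = abs(stat(group_a) - stat(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign treatment labels at random
        diff = abs(stat(pooled[:n_a]) - stat(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return hits / n_perm
```

With two lists of MGA rows, one per treatment, a call such as `permutation_p_value(lean_rows, overweight_rows, ple1_slope)` would give a p-value for the slope. The variable names are placeholders, and the same function accepts any other statistic of a group.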
FIGURE 2. The PLE-I (type-I power law extension) models fitted for the obesity case study.
The parameters of PLE-III (type-III power law extension) for measuring gene-level (inter-gene) spatial aggregation, in terms of the metagenomic gene abundance (MGA).
| Power Law Extension (PLE) | Case study | Treatment | b | SE(b) | ln(a) | SE[ln(a)] | M0 | R | p-value | N |
| Type-III PLE for Gene-Level Spatial Heterogeneity with MGA | Obesity | Lean | 2.371 | 0.000 | -0.732 | 0.001 | 1.706 | 0.961 | <0.001 | 5407291 |
| | | Overweight | 2.363 | 0.000 | -0.744 | 0.001 | 1.726 | 0.961 | <0.001 | 5134721 |
| | Type 2 diabetes | Healthy | 2.340 | 0.000 | -0.842 | 0.001 | 1.875 | 0.954 | <0.001 | 4573927 |
| | | Disease | 2.338 | 0.000 | -0.791 | 0.001 | 1.806 | 0.949 | <0.001 | 4432814 |
| | IBD | Healthy | 2.466 | 0.000 | -1.000 | 0.001 | 1.978 | 0.961 | <0.001 | 2898618 |
| | | Disease | 2.351 | 0.000 | -0.791 | 0.001 | 1.796 | 0.957 | <0.001 | 4462890 |
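The gene-level fit differs from the whole-metagenome fit only in which axis of the MGA table supplies the points. The sketch below assumes that PLE-III uses one (mean, variance) point per gene, computed across samples, which is consistent with the millions of points in the last column of the table above. How genes with zero mean or zero variance were handled in the study is not stated, so dropping them here is an assumption, and the function names and toy table are hypothetical.

```python
import math
from statistics import mean, variance


def gene_points(mga_rows):
    """One (mean, variance) pair per gene (column), taken across samples."""
    points = []
    for column in zip(*mga_rows):  # transpose: iterate over genes
        m, v = mean(column), variance(column)
        if m > 0 and v > 0:  # logs are undefined for zero mean or variance
            points.append((m, v))
    return points


def fit_log_log(points):
    """Least-squares fit of ln(V) = ln(a) + b * ln(M); returns (b, ln_a)."""
    xs = [math.log(m) for m, _ in points]
    ys = [math.log(v) for _, v in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return b, y_bar - b * x_bar


# Hypothetical toy MGA table: 3 samples (rows) x 4 genes (columns).
toy_mga = [
    [1, 10, 100, 0],
    [3, 30, 250, 0],
    [5, 20, 400, 0],
]
b, ln_a = fit_log_log(gene_points(toy_mga))
print(f"b = {b:.3f}, ln(a) = {ln_a:.3f}")
```

In the toy table the fourth gene is absent from every sample and contributes no point, so the fit uses three points.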
FIGURE 3. The PLE-III (type-III power law extension) models fitted to the obesity study datasets. More than 10 million points (5407291 from the lean group + 5134721 from the overweight group) were used to fit the PLE-III models, but only 100,000 randomly selected points (50,000 from each treatment) are drawn in the graphs, to limit the file size of the figure.
The parameters of PLE-I (type-I power law extension) for metagenome spatial heterogeneity, in terms of the MFGC (metagenome functional gene cluster) distribution.
| Type of MFGC and database used | Microbiome | Treatment | b | SE(b) | ln(a) | SE[ln(a)] | R | p-value | N | M0 |
| MFGC Type-I (eggNOG) | Obesity | Lean | 2.119 | 0.020 | 3.187 | 0.157 | 0.996 | <0.0001 | 95 | 0.058 |
| | | Overweight | 2.028 | 0.017 | 3.895 | 0.135 | 0.997 | <0.0001 | 96 | 0.023 |
| | Type 2 diabetes | Healthy | 2.058 | 0.025 | 3.501 | 0.180 | 0.995 | <0.0001 | 74 | 0.037 |
| | | Disease | 2.057 | 0.018 | 3.480 | 0.126 | 0.997 | <0.0001 | 71 | 0.037 |
| | IBD | Healthy | 2.053 | 0.017 | 3.690 | 0.136 | 0.998 | <0.0001 | 71 | 0.030 |
| | | Disease | 2.138 | 0.021 | 3.014 | 0.159 | 0.997 | <0.0001 | 71 | 0.071 |
| MFGC Type-I (KEGG) | Obesity | Lean | 2.091 | 0.015 | 3.505 | 0.123 | 0.998 | <0.0001 | 95 | 0.040 |
| | | Overweight | 2.027 | 0.014 | 4.008 | 0.109 | 0.998 | <0.0001 | 96 | 0.020 |
| | Type 2 diabetes | Healthy | 2.035 | 0.020 | 3.772 | 0.150 | 0.997 | <0.0001 | 74 | 0.026 |
| | | Disease | 2.036 | 0.014 | 3.794 | 0.106 | 0.998 | <0.0001 | 71 | 0.026 |
| | IBD | Healthy | 2.042 | 0.013 | 3.888 | 0.104 | 0.999 | <0.0001 | 71 | 0.024 |
| | | Disease | 2.111 | 0.016 | 3.292 | 0.130 | 0.998 | <0.0001 | 71 | 0.052 |
| MFGC Type-II (eggNOG) | Obesity | Lean | 1.884 | 0.021 | 4.912 | 0.223 | 0.995 | <0.0001 | 95 | 0.004 |
| | | Overweight | 1.859 | 0.019 | 5.212 | 0.208 | 0.995 | <0.0001 | 96 | 0.002 |
| | Type 2 diabetes | Healthy | 1.783 | 0.059 | 5.771 | 0.614 | 0.962 | <0.0001 | 74 | 0.001 |
| | | Disease | 1.715 | 0.091 | 6.461 | 0.937 | 0.915 | <0.0001 | 71 | 0.000 |
| | IBD | Healthy | 1.967 | 0.024 | 3.992 | 0.260 | 0.995 | <0.0001 | 71 | 0.016 |
| | | Disease | 1.937 | 0.021 | 4.295 | 0.232 | 0.996 | <0.0001 | 71 | 0.010 |
| MFGC Type-II (KEGG) | Obesity | Lean | 1.915 | 0.017 | 4.834 | 0.197 | 0.996 | <0.0001 | 95 | 0.005 |
| | | Overweight | 1.889 | 0.016 | 5.136 | 0.178 | 0.997 | <0.0001 | 96 | 0.003 |
| | Type 2 diabetes | Healthy | 1.830 | 0.054 | 5.572 | 0.577 | 0.970 | <0.0001 | 74 | 0.001 |
| | | Disease | 1.806 | 0.078 | 5.856 | 0.831 | 0.942 | <0.0001 | 71 | 0.001 |
| | IBD | Healthy | 1.988 | 0.021 | 3.997 | 0.234 | 0.996 | <0.0001 | 71 | 0.017 |
| | | Disease | 1.961 | 0.018 | 4.257 | 0.201 | 0.997 | <0.0001 | 71 | 0.012 |
The p-values of the randomization tests for the differences between the healthy and diseased treatments in their PLE-I (type-I power law extension) parameters in terms of the MFGC.
| MFGC type and database used | Microbiome | Treatments | b | ln(a) | M0 |
| MFGC Type-I (eggNOG) | Obesity | Lean vs. Overweight | 0.347 | 0.345 | 0.348 |
| | Type 2 diabetes | Healthy vs. Disease | 0.985 | 0.937 | 0.965 |
| | IBD | Healthy vs. Disease | 0.039 | 0.033 | 0.059 |
| MFGC Type-I (KEGG) | Obesity | Lean vs. Overweight | 0.442 | 0.465 | 0.444 |
| | Type 2 diabetes | Healthy vs. Disease | 0.987 | 0.913 | 0.947 |
| | IBD | Healthy vs. Disease | 0.018 | 0.012 | 0.025 |
| MFGC Type-II (eggNOG) | Obesity | Lean vs. Overweight | 0.421 | 0.388 | 0.432 |
| | Type 2 diabetes | Healthy vs. Disease | 0.551 | 0.556 | 0.597 |
| | IBD | Healthy vs. Disease | 0.330 | 0.370 | 0.375 |
| MFGC Type-II (KEGG) | Obesity | Lean vs. Overweight | 0.361 | 0.337 | 0.382 |
| | Type 2 diabetes | Healthy vs. Disease | 0.781 | 0.771 | 0.787 |
| | IBD | Healthy vs. Disease | 0.370 | 0.427 | 0.418 |
The p-value of Wilcoxon tests for the difference between the healthy and diseased treatments in their metagenome spatial heterogeneities and community dominance (also see Supplementary Figure S1A for the V/M heterogeneity index and Supplementary Figure S1B for the community dominance index).
| Taylor’s Power Law Extension (TPLE) | Case study | Treatments | Mean of Healthy | Mean of Diseased | p-value |
| Variance/mean-ratio heterogeneity index (V/M) | Obesity | Lean vs. Overweight | 886.50 | 1129.0 | <0.001 |
| | Type 2 diabetes | Healthy vs. Disease | 594.10 | 761.10 | <0.001 |
| | IBD | Healthy vs. Disease | 786.70 | 1064.4 | <0.001 |
| Community dominance index | Obesity | Lean vs. Overweight | 45.717 | 39.910 | <0.001 |
| | Type 2 diabetes | Healthy vs. Disease | 27.766 | 34.444 | <0.001 |
| | IBD | Healthy vs. Disease | 28.419 | 37.765 | <0.001 |
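The variance/mean-ratio index in the table above can be illustrated with a short sketch. It assumes that the index is computed per sample as the variance of that sample's gene abundances divided by their mean, and that the "Mean of Healthy" and "Mean of Diseased" columns average this index over the samples of each treatment. Neither assumption is confirmed by the text here. The function names and toy rows are hypothetical, and the community dominance index is not covered because its definition is not given.

```python
from statistics import mean, variance


def vm_ratio(abundances):
    """Variance-to-mean ratio of one sample's gene abundances."""
    return variance(abundances) / mean(abundances)


def group_mean_vm(mga_rows):
    """Average V/M index over all samples (rows) of one treatment."""
    return mean(vm_ratio(row) for row in mga_rows)


# Hypothetical toy treatments, two samples each.
healthy = [[2, 4, 6], [0, 4, 8]]
diseased = [[1, 1, 10], [0, 2, 16]]
print(f"healthy:  {group_mean_vm(healthy):.2f}")
print(f"diseased: {group_mean_vm(diseased):.2f}")
```

The per-sample indices of the two treatments could then be compared with a rank-based test, for example `scipy.stats.mannwhitneyu` where SciPy is available. The sketch leaves that step out to stay within the standard library.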