Yuanyuan Ma1, Guoying Liu1, Yingjun Ma2, Qianjun Chen2,3.
Abstract
Mining microbe-disease associations is drawing increasing attention because of its potential to capture disease-related microbes. Hence, it is essential to develop new tools and algorithms to study the complex pathogenic mechanisms of microbe-related diseases. However, previous studies mainly focused on the paradigm of "one disease, one microbe" and rarely investigated, at the system level, the cooperation and associations among microbes, among diseases, or within microbe-disease co-modules. In this study, we propose a novel two-level module identification algorithm (MDNMF) based on nonnegative matrix tri-factorization, which integrates two similarity matrices (the disease and microbe similarity matrices) and one microbe-disease association matrix into its objective function. MDNMF can identify modules at different levels and reveal the connections between these modules. To improve the efficiency and effectiveness of MDNMF, we also introduce a human symptoms-disease network and microbial phylogenetic distance into the model. Furthermore, we applied MDNMF to the HMDAD dataset and compared it with two NMF-based methods to demonstrate its effectiveness. The experimental results show that MDNMF achieves better performance in terms of the enrichment index (EI) and the number of significantly enriched taxon sets. This demonstrates the potential of MDNMF for capturing microbial modules with significant biological function implications.
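The abstract names nonnegative matrix tri-factorization as the basis of MDNMF but does not give the objective. As a rough illustration only, the sketch below factors a toy association matrix A into G1 S G2^T with plain multiplicative updates. It omits the similarity matrices, the symptoms-disease network, and the phylogenetic distance terms, so it is not the authors' MDNMF; the function names, the toy matrix, and the row/column orientation (rows as microbes, columns as diseases) are all assumptions made for this example.

```python
import random

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def scaled(M, numer, denom, eps=1e-9):
    """Element-wise multiplicative update: M * numer / (denom + eps)."""
    return [[m * n / (d + eps) for m, n, d in zip(rm, rn, rd)]
            for rm, rn, rd in zip(M, numer, denom)]

def squared_error(A, G1, S, G2):
    """Squared Frobenius norm of A - G1 S G2^T."""
    R = matmul(matmul(G1, S), transpose(G2))
    return sum((a - r) ** 2 for ra, rr in zip(A, R) for a, r in zip(ra, rr))

def nmtf(A, k1, k2, n_iter=200, seed=0):
    """Factor a nonnegative n x m matrix A as G1 (n x k1) S (k1 x k2) G2^T (k2 x m)."""
    rng = random.Random(seed)

    def rand(r, c):
        # Strictly positive random initialisation.
        return [[rng.random() + 0.1 for _ in range(c)] for _ in range(r)]

    G1, S, G2 = rand(len(A), k1), rand(k1, k2), rand(len(A[0]), k2)
    At = transpose(A)
    errors = [squared_error(A, G1, S, G2)]
    for _ in range(n_iter):
        B = matmul(G2, transpose(S))   # m x k1, so A ~ G1 B^T
        G1 = scaled(G1, matmul(A, B), matmul(G1, matmul(transpose(B), B)))
        C = matmul(G1, S)              # n x k2, so A ~ C G2^T
        G2 = scaled(G2, matmul(At, C), matmul(G2, matmul(transpose(C), C)))
        G1t, G2tG2 = transpose(G1), matmul(transpose(G2), G2)
        S = scaled(S, matmul(matmul(G1t, A), G2),
                   matmul(matmul(matmul(G1t, G1), S), G2tG2))
        errors.append(squared_error(A, G1, S, G2))
    return G1, S, G2, errors

def hard_modules(G):
    """Assign each row to the factor with the largest loading."""
    return [max(range(len(row)), key=row.__getitem__) for row in G]

# Hypothetical toy association matrix with two blocks (assumed: rows = microbes).
A = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
G1, S, G2, errors = nmtf(A, k1=2, k2=2)
microbe_modules = hard_modules(G1)   # one module label per row of A
disease_modules = hard_modules(G2)   # one module label per column of A
```

In this reading, the rows of G1 and G2 give module memberships on the two sides, and the entries of S indicate how strongly each row-side module is linked to each column-side module, which is one way to picture the "connections between these modules" mentioned above.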
Keywords: co-modules; human microbiome; matrix factorization; microbe-disease association; phylogenetic distance
Year: 2020 PMID: 32153643 PMCID: PMC7048008 DOI: 10.3389/fgene.2020.00083
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. Illustrative example of MDNMF. First, microbe and disease similarity matrices are obtained from the original microbe-disease association matrix using a Gaussian kernel function. Then, these three matrices serve as the input of MDNMF. In addition, to improve the accuracy of module finding and the biological interpretability of the modules identified by MDNMF, a human symptoms-disease network and microbial phylogenetic distance are also introduced into the model. Finally, microbe-disease co-modules at different levels are obtained.
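The caption does not state the exact form of the Gaussian kernel. A minimal sketch, assuming the Gaussian interaction profile kernel that is commonly used for association matrices (bandwidth normalised by the mean squared norm of the profiles), might look as follows. The function name, the bandwidth rule, and the toy matrix are assumptions, not details taken from the record.

```python
import math

def gaussian_kernel_similarity(profiles):
    """Pairwise Gaussian kernel similarity between association profiles.

    profiles: list of equal-length 0/1 vectors, one per entity.
    Assumed bandwidth: gamma = 1 / mean squared norm of the profiles.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / n
    gamma = 1.0 / mean_sq_norm if mean_sq_norm > 0 else 1.0
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist2 = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            K[i][j] = math.exp(-gamma * dist2)
    return K

# Hypothetical toy association matrix (assumed: rows = microbes, columns = diseases).
A = [[1, 0, 1],
     [1, 0, 1],
     [0, 1, 0]]
microbe_similarity = gaussian_kernel_similarity(A)
# Disease profiles are the columns of A.
disease_similarity = gaussian_kernel_similarity([list(col) for col in zip(*A)])
```

Microbes with identical association profiles get similarity 1, and the similarity decays toward 0 as the profiles diverge.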
Performance of the three co-module discovery algorithms in terms of EI and TS.
| Method | (#) identified co-modules | EI | (#) enriched TS | (#) enriched TS* |
|---|---|---|---|---|
| NMF | 12 | 0.08676 | 39 | 29 |
| NetNMF | 13 | 0.11563 | 49 | 36 |
| MDNMF | 14 | 0.30182 | 62 | 48 |
*(P-value < 0.005 and FDR < 0.05). # represents the number of identified co-modules or significantly enriched taxon sets.
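The footnote gives the thresholds (P-value < 0.005 and FDR < 0.05) but not how the FDR was computed. As a hedged sketch, assuming the Benjamini-Hochberg adjustment (a common choice, not confirmed by this record), taxon sets passing both cut-offs could be selected like this. The function names and the example P-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def significant_sets(pvalues, p_cut=0.005, fdr_cut=0.05):
    """Indices of taxon sets that pass both the raw P-value and FDR cut-offs."""
    fdr = benjamini_hochberg(pvalues)
    return [i for i, (p, q) in enumerate(zip(pvalues, fdr))
            if p < p_cut and q < fdr_cut]

# Hypothetical enrichment P-values for four taxon sets.
example_pvalues = [0.001, 0.004, 0.03, 0.2]
selected = significant_sets(example_pvalues)
```

Counting the selected indices per method would give a figure comparable in spirit to the starred count in the table, under the assumptions stated above.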
Figure 2. Comparison of all the enriched TS terms of microbe modules detected by MDNMF, NMF, and NetNMF on the HMDAD dataset.
Figure 3. Model selection for the parameters μ and k.
Microbe-disease co-modules identified by MDNMF.
| Co-module_id | Disease module | Microbe module | Taxon sets (matched disease, descending order by FDR) | Associated co-module |
|---|---|---|---|---|
| 9 | | | | 10, 4, 7 |
* Colors indicate different diseases or enriched taxon sets.
Detailed information on the identified microbe-disease co-module 4.
| Co-module_id | Disease module | Microbe module | Taxon sets (matched disease, descending order by FDR) | Associated co-module |
|---|---|---|---|---|
| 4 | | | | 9, 7 |

Diseases named in the table: COPD; new-onset untreated rheumatoid arthritis; rheumatoid arthritis.
*Colors indicate different diseases or enriched taxon sets.
Figure 4. Top enriched biological terms in microbe module 9.
Figure 5. Top enriched biological terms in microbe module 4.