Monica Steffi Matchado, Michael Lauber, Sandra Reitmeier, Tim Kacprowski, Jan Baumbach, Dirk Haller, Markus List.
Abstract
Microorganisms including bacteria, fungi, viruses, protists and archaea live as communities in complex and contiguous environments. They engage in numerous inter- and intra-kingdom interactions which can be inferred from microbiome profiling data. In particular, network-based approaches have proven helpful in deciphering complex microbial interaction patterns. Here we give an overview of state-of-the-art methods to infer intra-kingdom interactions ranging from simple correlation- to complex conditional dependence-based methods. We highlight common biases encountered in microbial profiles and discuss mitigation strategies employed by different tools and their trade-off with increased computational complexity. Finally, we discuss current limitations that motivate further method development to infer inter-kingdom interactions and to robustly and comprehensively characterize microbial environments in the future.
Keywords: Microbial co-occurrence networks; Microbial interactions; Network analysis; Trans-kingdom interactions
Year: 2021 PMID: 34093985 PMCID: PMC8131268 DOI: 10.1016/j.csbj.2021.05.001
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1 (A) Schematic overview of taxonomic profiling of bacteria, fungi and the virome. (B) Three important biases in microbial co-occurrence network analysis: compositionality, sparsity and spurious correlations.
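The compositionality bias named in the caption can be illustrated with a small simulation. The following is a minimal sketch, not taken from the paper: the number of taxa, the log-normal abundance model and the function name `closure` are assumptions chosen for illustration. It draws independent absolute abundances, converts them to relative abundances, and compares the Pearson correlations before and after normalization.

```python
import numpy as np


def closure(counts):
    """Convert absolute abundances (samples x taxa) to relative abundances."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)


# Assumption for illustration: three taxa whose absolute abundances
# are drawn independently, so their true correlation is zero.
rng = np.random.default_rng(0)
absolute = rng.lognormal(mean=5.0, sigma=0.5, size=(500, 3))

# Sequencing only yields proportions: each sample is rescaled to sum to 1.
relative = closure(absolute)

# Pearson correlation between taxa (columns) before and after closure.
corr_abs = np.corrcoef(absolute, rowvar=False)
corr_rel = np.corrcoef(relative, rowvar=False)

print("absolute abundances:\n", corr_abs.round(2))
print("relative abundances:\n", corr_rel.round(2))
```

Because the proportions in each sample must sum to one, an increase in one taxon forces the others down, so the off-diagonal correlations of the relative abundances become clearly negative even though the underlying taxa are independent. The effect is strongest when few taxa are present, as in this toy setting.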
Fig. 2 Overview of network approaches for microbial intra- and inter-kingdom interactions.
Overview of microbial co-occurrence network methods.
| Tools | Principle/Models | Advantages | Limitation | Applications |
|---|---|---|---|---|
| SparCC (2012); Python, r-sparcc | Pearson correlations from log-transformed abundance; Bayesian approach to differentiate true fractions from the observed counts and to handle sparsity; log-ratio transformed abundance/count matrix | Handles compositionality bias and sparsity | High computational complexity due to the iterative approximation approach; nonlinear relationships cannot be detected | Interactions within the gut fungal microbiome of the Human Microbiome Project healthy cohort |
| CCLasso (2015); R package | Latent variable model with ℓ1-norm shrinkage; simple pseudo-count implementation; log-ratio transformed abundance/count matrix | Faster than SparCC; handles compositionality bias | Nonlinear relationships cannot be detected; studies only pairwise correlations between microbiomes | Used to capture the interaction between marine phototrophs and archaea |
| REBACCA (2015) | Linear system using log ratios between pairs of compositions with ℓ1-norm shrinkage | Higher accuracy when a sparsity condition is satisfied; controls false positives; suitable for large sample sizes | Nonlinear relationships cannot be detected; asymptotic performance with large sample size | Positive correlation between |
| CoNet (2016); Cytoscape, command line tool | Five similarity measures: Bray–Curtis and Kullback–Leibler dissimilarity, Pearson and Spearman correlation, and mutual information; compendium of generalized boosted linear models | Able to build bipartite networks | Does not address compositionality bias; studies only pairwise correlations between microbiomes | Interaction studies of ecological systems like ranging from plant; identification of autism spectrum disorder-enriched |
| Meta-Network (2019) | Hybrid of Pearson correlation and a graph-based method; FS-Weight method to study indirect relationships; nonlinear associations using the PCA-PMI method; MCODE cluster algorithm to detect clusters and hubs | Indirect and nonlinear correlations can be identified; outperforms Spearman and Pearson correlation | Does not address compositionality bias | Identification of hidden relationship between |
| Correlation-Centric Network (CCN) (2020); command line tool | Edge-centric network; Pearson correlation coefficient for network construction; isomorphism mapping for deriving the Correlation-Centric Network from species–species co-occurrence networks (SCNs) | Correlations of the edge distribution can be studied; outperforms SCNs | Does not address compositionality; CCN a new perspective in microbiome network derived from host diet during the seasonal variations. Identified | Identification of biomarkers in gene co-expression and personalized characterization of diseases |
| MENAP (2012); online tool | Random Matrix Theory (RMT)-based molecular ecological network analysis | Threshold for network construction is determined automatically; robust to noise | Does not address network sparsity or compositional bias | Detection of highly connected cluster of; study of soil microbial structures |
| gCoda (2017); R package | Logistic normal distribution to overcome compositionality bias; Majorization-Minimization algorithm; maximum likelihood with ℓ1 penalty to deal with dimensionality | Requires less computation time than SPIEC-EASI; efficient for compositional data; more stable and accurate than SPIEC-EASI | Non-convexity of the likelihood function; cannot identify hub/key species; lack of consistency of the estimators | Not available |
| MDiNE (2019); R package | Dirichlet-multinomial logistic-normal distribution to address the compositional nature; Markov Chain Monte Carlo (MCMC) methods to define the logistic multinomial normal model | Differential networks based on precision matrix estimation for a binary sample condition; zero handling without resorting to the addition of a pseudo-count; handles compositionality | High running time; supports only a single binary covariate to construct the networks; Dirichlet-multinomial logistic-normal distribution model cannot capture positive and negative covariances | Identification of new biomarkers such as Enterobacteriaceae, more abundant in Crohn’s samples and Lachnospiraceae to be less |
| MixMPLN (2019); R package | Mixture of K multivariate Poisson log-normal distributions; minorization–maximization principle; ℓ1-penalty model to obtain sparse networks | Captures multiple networks from the same count matrix; handles compositionality | Runtime and computational complexity are not well addressed | Able to reproduce and identify the changes between the gut microbiome of infants and that of older children and adults |
| NetCoMi (2020); R package | Integrates an extensive list of methods that take into account the special characteristics of amplicon data: SparCC, SPIEC-EASI, proportionality, SPRING; unique feature: differential network analysis | Ability to study differential networks; easy to use | Models networks from a single domain of life | Not available |
| Environmentally-Driven Edge Detection (EnDED) (2020) | Sign pattern, overlap, interaction information and data processing inequality to remove environmentally-driven (indirect) associations | Able to identify environmentally-driven (indirect) associations (edges) in the network | Currently, EnDED supports only closed (i.e., fully connected) triplets | Not available |
| MInt (2015); R package | Poisson-multivariate normal hierarchical model with ℓ1 penalty to capture direct interactions | Controls for confounding predictors to remove indirect interactions | Does not account for the compositional nature of microbiome data; unable to detect latent factors | Not available |
| mLDM (2016); R package | Hierarchical Bayesian model with sparsity constraints | Handles compositional bias; able to detect direct associations and remove indirect associations; microbial absolute abundance can be estimated | Lacks scalability and efficiency and requires high computational power; the hierarchical Bayesian model consumes most of the training time; unable to detect latent factors | Not available |
| HARMONIES (2020); R package, web tool | Hybrid approach using a zero-inflated negative binomial distribution and a Dirichlet process; Gaussian graphical model to deal with sparse networks | Handles overdispersion and a high number of zero counts | Small sample size affects performance | Discovered a unique subnetwork of Fusobacterium, Peptostreptococcus and Parvimonas in healthy patients compared to colorectal cancer patients |
| SPIEC-EASI (2015); R package | CLR transformation of the input; choice of two approaches: Glasso or neighborhood selection | Handles compositionality; avoids detection of transitive correlations | Graphs with large hub nodes are more difficult to recover; cannot handle covariates | Interaction studies of various ecological systems like plants; study of the interactions of viral populations to identify age-dependent patterns in the human gut |
| Hubs weighted graphical lasso (2020) | Weighted lasso approach with special row/column sum weights to penalize hubs | Includes structural information of the network to correctly identify hub edges | Not available | |
| FlashWeave (2019) | Local-to-global learning framework | Adjusts for latent variables; short runtime; good performance on heterogeneous datasets | Quality drops when applied to homogeneous data with a small number of samples | Understanding the interactions in the core microbiome of an ascidian, a marine invertebrate chordate |
| COZINE (2020); R package | CLR transformation only on non-zero count values; multivariate Gaussian hurdle model; group-lasso penalty to obtain sparse estimates | Handles compositional bias and zero inflation; high accuracy | Not available | |
| SPIEC-EASI Extension (2018); R package | Centered log-ratio (CLR) transformation of the input; choice of two approaches: Glasso or neighborhood selection | Handles compositionality; avoids detection of transitive correlations | Graphs with large hub nodes are more difficult to recover | Identification of associations between fungi and bacteria; also elucidated the importance of including cross-biome interactions in microbiome data analysis |
| Multi-Omics Factor Analysis (2018); R package | Normalized data matrices from one or more data modalities; Bayesian group factor analysis framework; automatic relevance determination | Integrates multiple data modalities and sample groups and finds drivers of variation | Assumes linear or moderately nonlinear relationships; assumes independence between features in the prior distribution | Identification of complex interaction between |
| DIABLO (2019); R package | Singular value decomposition with ℓ1-penalized selection of correlated variables from several omics data sets | Finds correlated features that possess discriminative ability | Assumes linear relationships between features from different omics data sets | Interaction study between bacterial taxa, metabolites and physiological traits in the study of haem-induced lipoperoxidation on mucosal and luminal gut homeostasis |
Fig. 3 Workflow indicating suitable network approaches for different challenges.
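Several tools in the table (SPIEC-EASI, its extension, and COZINE) start from a centered log-ratio (CLR) transformation of the count matrix. The following is a minimal sketch of that transformation only, not of any tool's implementation: the function name `clr`, the toy count matrix and the pseudo-count of 1.0 are assumptions for illustration, and the tools differ in how they treat zeros (COZINE, per the table, transforms only non-zero values).

```python
import numpy as np


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of a samples x taxa count matrix.

    The pseudocount (an illustrative assumption) avoids log(0) for
    taxa that were not observed in a sample.
    """
    x = np.asarray(counts, dtype=float) + pseudocount
    log_x = np.log(x)
    # Subtracting the per-sample mean of the logs is equivalent to
    # dividing each count by the sample's geometric mean.
    return log_x - log_x.mean(axis=1, keepdims=True)


# Hypothetical toy data: 4 samples (rows) x 3 taxa (columns), with zeros.
counts = np.array([
    [10, 0, 90],
    [20, 5, 75],
    [0, 30, 70],
    [15, 15, 70],
])

transformed = clr(counts)

# Without a pseudocount, CLR is invariant to sequencing depth:
# rescaling a sample by a constant leaves the transformed values unchanged.
positive = counts + 1
scale_invariant = np.allclose(
    clr(positive, pseudocount=0.0),
    clr(10 * positive, pseudocount=0.0),
)

print(transformed.round(3))
print("rows sum to zero:", np.allclose(transformed.sum(axis=1), 0.0))
print("scale invariant:", scale_invariant)
```

Each transformed sample sums to zero, which is why the resulting covariance matrix is singular and why methods built on CLR data typically add a sparsity penalty when estimating the network.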