Tara Eicher, Garrett Kinnebrew, Andrew Patt, Kyle Spencer, Kevin Ying, Qin Ma, Raghu Machiraju, and Ewy A Mathé.
Abstract
As researchers are increasingly able to collect data on a large scale from multiple clinical and omics modalities, multi-omics integration is becoming a critical component of metabolomics research. This introduces a need for increased understanding by the metabolomics researcher of computational and statistical analysis methods relevant to multi-omics studies. In this review, we discuss common types of analyses performed in multi-omics studies and the computational and statistical methods that can be used for each type of analysis. We pinpoint the caveats and considerations for analysis methods, including required parameters, sample size and data distribution requirements, sources of a priori knowledge, and techniques for the evaluation of model accuracy. Finally, for the types of analyses discussed, we provide examples of the applications of corresponding methods to clinical and basic research. We intend that our review may be used as a guide for metabolomics researchers to choose effective techniques for multi-omics analyses relevant to their field of study.
Keywords: biological pathways; clustering; co-regulation; deep learning; dimensionality reduction; machine learning; multi-omics integration; network analysis; pathway enrichment; visualization
Year: 2020 PMID: 32429287 PMCID: PMC7281435 DOI: 10.3390/metabo10050202
Source DB: PubMed Journal: Metabolites ISSN: 2218-1989
Figure 1. The metabolome in the context of other omics data types and broad approaches for their integration.
Figure 2. Analysis techniques on a dataset with n samples and d analytes in two classes. Blue represents one class of samples, and red represents another class of samples. Class typically corresponds to phenotype but could also correspond to batch or another variable of interest. Analyses include unsupervised clustering approaches (Section 3), modeling co-regulation (Section 4), approaches for identifying analytes associated with class (Section 5), and pathway analysis (Section 6).
Examples of multi-omics applications using unsupervised analysis.
| Type of Method | Method | Functionality | Reference |
|---|---|---|---|
| Dimensionality Reduction | t-Distributed Stochastic Neighbor Embedding | Visualize gut microbial communities and serum metabolites by diet and supplements. | [ |
| | | Visualize prefrontal cortex metabolites and lipids by human population group. | [ |
| Clustering | Hierarchical Clustering | Identify multi-omic molecular subtypes in hepatocellular carcinoma. | [ |
| | | Identify multi-omic clusters in breast tumor tissue associated with prognosis. | [ |
| | | Identify lipid–protein–metabolite clusters associated with diabetes and periodontal disease. | [ |
| | Partitioning Around Medoids (PAM) | Identify microbial–metabolite clusters associated with diarrhea. | [ |
| | Gaussian Mixture Modeling (GMM) | Identify clinical depression score clusters associated with blood metabolomic and genomic data to predict drug response. | [ |
| | Density-Based Spatial Clustering of Applications with Noise | Evaluate the impact of bacterial metabolism on mucosal immunity. | [ |
| Other Machine Learning Methods | Random Forest | Identify clusters of histological stromal features associated with prognosis and metabolites in cancer-associated fibroblasts. | [ |
| | Autoencoder | Cluster plasma protein and metabolite levels to identify temporal trends in murine cardiac remodeling. | [ |
* Raw data are available in the supplementary materials of the referenced manuscript or in a public repository. † Preprocessed data are available in the supplementary materials of the referenced manuscript or in a public repository. ‡ Descriptive statistics are available in a table or in the supplementary materials of the referenced manuscript. Unmarked data are available upon request from the authors or from a consortium.
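As an illustration of the hierarchical clustering entry in the table above, the following is a minimal sketch of agglomerative clustering with average linkage, written with the Python standard library only. The sample values, the function names, and the choice of Euclidean distance and average linkage are assumptions made for this example; they are not taken from the review or from any of the cited studies.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, points):
    """Mean Euclidean distance over all cross-cluster sample pairs."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))


def hierarchical_clusters(points, k):
    """Repeatedly merge the two closest clusters until k clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(
                clusters[pair[0]], clusters[pair[1]], points
            ),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy data: 6 samples x 2 scaled analytes
# (for example, one metabolite and one transcript per sample).
samples = [
    (0.1, 0.2), (0.0, 0.3), (0.2, 0.1),
    (2.0, 2.1), (2.2, 1.9), (1.9, 2.0),
]
clusters = hierarchical_clusters(samples, k=2)
print(clusters)
```

In practice, analytes from different omics layers would usually be scaled before distances are computed, and the number of clusters would be chosen with a criterion suited to the data rather than fixed in advance as it is here.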
Examples of multi-omics applications using co-regulation analysis.
| Type of Method | Method | Functionality | Reference |
|---|---|---|---|
| Associative Networks | Correlation Networks | Find metabolite–metabolite associations specific to or shared across blood, urine, and saliva. | [ |
| | | Find modules of blood metabolites and genes associated with body weight change. | [ |
| | | Find associations between serum, blood, and gut antibodies, metabolites, and microbiome and patient disease activity reports in inflammatory bowel disease. | [ |
| | | Find associations between metabolites, transcripts, cytokines, and cell frequencies in plasma and whole blood associated with adaptive immune response to | [ |
| | Partial Correlation Networks | Visualize associations between sleep survey responses and levels of serum cytokines, metabolites, lipids, proteins, and genes. | [ |
| | | Visualize associations between metabolites and lipids associated with metabolic disease treatment in rat liver tissue and clinical chemistry measurements from serum. | [ |
| | Weighted Gene Co-Expression Network Analysis (WGCNA) | Characterize complex transcriptomic and metabolic traits in major depressive disorder. | [ |
| | | Identify co-regulated modules of blood metabolites and transcripts in children with asthma. | [ |
| | | Identify co-regulated modules of metabolites and transcripts in glioblastoma multiforme. | [ |
| Topological Analysis of Networks | Subnetworks | Identify subnetworks of correlated proteins and metabolites in adrenocorticotropic hormone-secreting pituitary adenomas. | [ |
| | | Identify subnetworks of correlated genetic, proteomic, metabolomic, clinical, and microbiome data from multiple biofluids in cardiometabolic disease. | [ |
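A correlation network of the kind listed in the table above can be sketched as follows: compute the Pearson correlation for every pair of analytes and keep an edge wherever the absolute correlation meets a threshold. The analyte names, the measurements, and the 0.9 threshold are hypothetical values chosen for this example, and a real analysis would typically also assess significance and correct for multiple testing.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_network(levels, threshold):
    """Return {(analyte_u, analyte_v): r} for pairs with |r| >= threshold."""
    edges = {}
    for u, v in combinations(sorted(levels), 2):
        r = pearson(levels[u], levels[v])
        if abs(r) >= threshold:
            edges[(u, v)] = r
    return edges


# Hypothetical measurements of four analytes across the same five samples.
levels = {
    "metabolite_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "transcript_B": [2.1, 3.9, 6.2, 8.0, 9.9],
    "protein_C": [5.0, 4.1, 2.9, 2.0, 1.1],
    "metabolite_D": [3.0, 1.0, 4.0, 1.0, 5.0],
}
edges = correlation_network(levels, threshold=0.9)
for (u, v), r in edges.items():
    print(f"{u} -- {v}: r = {r:.3f}")
```

Marginal correlations such as these do not separate direct from indirect associations; the partial correlation networks listed in the same table address that by conditioning on the remaining analytes.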
Examples of multi-omics applications that identify analytes associated with phenotype.
| Type of Method | Method | Functionality | Reference |
|---|---|---|---|
| Univariate Statistical Methods | Student’s t-test | Identify metabolites, miRNAs, mRNAs, and lncRNAs altered by exposure to benzo[a]pyrene to identify mechanisms of toxicity. | [ |
| Multivariate Statistical Methods | Partial Least Squares Discriminant Analysis | Identify breast tumor tissue metabolites that differentiate MRI features. | [ |
| | | Identify metabolites that differentiate normal and tumor tissue in the prostate. | [ |
| | | Identify differences between fibromyalgia and control groups in gut microbes, serum metabolites, miRNA, and cytokine levels. | [ |
| | | Discover temporal changes in plasma lipid and metabolite patterns from normal and hyperlipidemic patients. | [ |
| | Linear Models (and variants) | Identify metabolites from bronchial alveolar lavage associated with continuous CT scan features in cystic fibrosis. | [ |
| | | Identify serum metabolites associated with visceral adipose tissue features from MRI and tomography. | [ |
| | | Identify plasma metabolites and proteins associated with prognosis in septic shock patients. | [ |
| | | Find associations between blood DNA methylation and metabolite levels in smokers. | [ |
| Identifying Analyte Relationships that Differ by Phenotype | DiffCorr | Identify differences in metabolite–metabolite correlations between traumatic brain injury and control groups. | [ |
| | IntLIM | Identify synovial fluid metabolites and blood and bone marrow transcripts that differentiate between osteoarthritis and rheumatoid arthritis. | [ |
| Machine Learning Methods for Predicting Phenotype | Random Forest | Identify serum metabolites, proteins, and peptides differentiating between metabolic syndrome and control groups. | [ |
| | | Identify metabolites and other analytes predictive of weight gain and loss. | [ |
| | | Identify metabolites, transcripts, and proteins predictive of potato quality traits. | [ |
| | | Identify metabolites and transcripts predictive of heat stress in the liver. | [ |
| | Support Vector Machine (SVM) | Predict metabolite levels using genes and metabolites in breast and hepatocellular carcinoma. | [ |
| | Multilayer Perceptron (MLP) | Predict early and late stage bladder cancer using urinary metabolites and genes. | [ |
| | | Predict early renal injury using serum metabolites and lipids. | [ |
| | Convolutional Neural Network (CNN) | Predict early renal injury using serum metabolites and lipids. | [ |
| | Recurrent Neural Network (RNN) | Integrate transcript and metabolite levels to predict cellular state in | [ |
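For the univariate methods in the table above, a two-sample comparison of a single analyte between two phenotype groups can be sketched with the pooled-variance (Student's) t statistic. The group values and analyte names below are hypothetical. The sketch stops at the test statistic and degrees of freedom because the Python standard library has no t-distribution; a p-value would come from a statistics package (for example, SciPy's `scipy.stats.ttest_ind`), followed by multiple-testing correction across analytes.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical analyte levels in two phenotype groups (4 samples each).
cases = {
    "metabolite_A": [5.1, 4.9, 5.3, 5.0],
    "metabolite_B": [2.0, 2.4, 1.9, 2.2],
}
controls = {
    "metabolite_A": [3.0, 3.2, 2.9, 3.1],
    "metabolite_B": [2.1, 2.3, 2.0, 2.2],
}

# Test each analyte separately, then rank by the size of the statistic.
results = {name: student_t(cases[name], controls[name]) for name in cases}
ranked = sorted(results, key=lambda name: abs(results[name][0]), reverse=True)
for name in ranked:
    t, df = results[name]
    print(f"{name}: t = {t:.2f}, df = {df}")
```

The pooled form assumes roughly equal variances and approximately normal data in both groups; when those assumptions are doubtful, Welch's t-test or a rank-based test is a common alternative.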
Multi-omics applications using biological or visual interpretation methods.
| Type of Method | Method | Functionality | Reference |
|---|---|---|---|
| Pathway Enrichment Methods | Overrepresentation Analysis (ORA) | Identify dysregulated pathways in prostate tumor tissue using metabolite and transcript data. | [ |
| | | Identify dysregulated pathways in the murine hippocampus and left ventricle during proton irradiation using metabolite and DNA methylation data. | [ |
| | | Identify dysregulated pathways in cationic liposome treatment of human hepatocyte cells using metabolomic and proteomic data. | [ |
| | | Identify dysregulated pathways in kidney disease in the rat serum metabolome and proteome. | [ |
| | | Identify dysregulated gut microbial pathways in gastrectomy patients. | [ |
| | | Identify dysregulated gut microbial pathways in sports classification groups of Irish athletes. | [ |
| | | Identify dysregulated gut microbial pathways as a result of whey protein supplementation. | [ |
| | Topological Scoring | Identify functional connections between dysregulated pathways in Alzheimer’s using genes, metabolites, miRNA, and proteins from multiple sources. | [ |
| Visualization of Biological Pathways and Networks | | Visualize metabolic networks in drug-susceptible and drug-resistant strains of | [ |
| | | Visualize interactions between metabolites and genes in non-small cell lung cancer. | [ |
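Overrepresentation analysis, listed in the table above, is commonly framed as a one-sided hypergeometric test: given a background of measured analytes, how surprising is it that so many of the significant analytes fall in one pathway? The sketch below computes that upper-tail probability with the standard library. The function name, parameter names, and counts are hypothetical and chosen only for illustration.

```python
from math import comb


def ora_p_value(universe_size, pathway_size, hits_total, hits_in_pathway):
    """Upper-tail hypergeometric probability P(X >= hits_in_pathway).

    universe_size   -- all measured analytes (the background set)
    pathway_size    -- background analytes annotated to the pathway
    hits_total      -- analytes called significant
    hits_in_pathway -- significant analytes that are in the pathway
    """
    upper = min(pathway_size, hits_total)
    tail = sum(
        comb(pathway_size, i) * comb(universe_size - pathway_size, hits_total - i)
        for i in range(hits_in_pathway, upper + 1)
    )
    return tail / comb(universe_size, hits_total)


# Hypothetical counts: 20 measured analytes, 5 of them in the pathway,
# 4 significant overall, 3 of those 4 in the pathway.
p = ora_p_value(universe_size=20, pathway_size=5, hits_total=4, hits_in_pathway=3)
print(f"ORA p-value: {p:.4f}")
```

The result depends strongly on the background set: using only the analytes that were actually measured, rather than every analyte in a pathway database, is the usual choice for metabolomics data, and p-values across pathways still need multiple-testing correction.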