Sijia Huang, Kumardeep Chaudhary, Lana X Garmire.
Abstract
Multi-omics data integration is one of the major challenges in the era of precision medicine. Considerable work has been done with the advent of high-throughput studies, which have made data accessible for downstream analyses. To improve clinical outcome prediction, a gamut of software tools has been developed. This review outlines the progress made in the field of multi-omics integration and the comprehensive tools developed so far. Finally, we discuss integration methods for predicting patient survival.
Keywords: integration; multi-omics; precision medicine; prediction; prognosis; supervised learning; unsupervised learning
Year: 2017 PMID: 28670325 PMCID: PMC5472696 DOI: 10.3389/fgene.2017.00084
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. Unsupervised data integration methodology.
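Summary of data integration tools.

| Tool | Learning type | Data types | Output | Method | FS Method | Reference |
|---|---|---|---|---|---|---|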
| Joint NMF | Unsupervised | Multi-data | Subset of genes (modules) | Matrix factorization | NA | Zhang et al., |
| iCluster | Unsupervised | EXP, CNV | Cluster | Matrix factorization | L1 penalty | Shen et al., |
| iCluster+ | Unsupervised | Multi-data | Cluster | Matrix factorization | L1 penalty | Mo et al., |
| JIVE | Unsupervised | Multi-data | Shared factors and unique factors | Matrix factorization | L1 penalty | Lock et al., |
| Joint Bayes Factor | Unsupervised | EXP, MET, CNV | Shared factors and unique factors | Matrix factorization | Student-t sparseness promoting prior | Ray et al., |
| ssCCA | Unsupervised | Sequence data | Operational taxonomic unit and cluster | Canonical Correlation Analysis | L1 penalty | Chen et al., |
| CCA sparse group | Unsupervised | Two types of data | Group of features with weights | Canonical Correlation Analysis | L1 penalty | Lin et al., |
| sMBPLS | Unsupervised | Multi-data | Group of features as modules | Partial Least Squares | L1 penalty | Li et al., |
| SNPLS | Unsupervised | EXP, drug response, gene network info. | Gene-drug co-module | Partial Least Squares | Network-based penalty | Chen and Zhang, |
| MDI | Unsupervised | Multi-data | Cluster | Bayesian | NA | Kirk et al., |
| Prob_GBM | Unsupervised | EXP, CNV, miRNA, SNP | Cluster | Bayesian | NA | Cho and Przytycka, |
| PSDF | Unsupervised | EXP, CNV | Cluster | Bayesian | Binary indicator → likelihood of feature | Yuan et al., |
| BCC | Unsupervised | EXP, MET, miRNA, proteomics | Cluster | Bayesian | NA | Lock and Dunson, |
| CONEXIC | Unsupervised | EXP, CNV | Groups of genes associated with modulators | Bayesian | NA | Akavia et al., |
| PARADIGM | Unsupervised | Multi-data | Gene score and significance in each pathway | Pathway networks | NA | Vaske et al., |
| SNF | Unsupervised | EXP, MET, miRNA | Cluster | Similarity network fusion | NA | Wang et al., |
| Lemon-Tree | Unsupervised | EXP, CNV/miRNA/MET (only one type) | Association network graphics | Module network | NA | Bonnet et al., |
| rMKL-LPP | Unsupervised | Multi-data | Cluster | Multiple kernel learning | Dimension reduction metric Locality Preserving Projections (LPP) | Speicher and Pfeifer, |
| CNAmet | Unsupervised | EXP, MET, CNV | Scores and | Multi-step analysis | NA | Louhimo and Hautaniemi, |
| iPAC | Unsupervised | EXP, CNV | Subset of genes | Multi-step analysis | Multiple filtering steps including common aberrant genes, in-cis correlation and in-trans functionality | Aure et al., |
| ATHENA | Supervised | EXP, CNV, MET, miRNA | Final model with patient index | Grammatical Evolution Neural Networks (GENN) | Neural Networks | Kim et al., |
| jActiveModules | Supervised | EXP, PPI, protein-DNA interactions | Subnetwork (network hotspots) | Network simulated annealing | NA | Ideker et al., |
| Network propagation | Supervised | Gene expression, mutation, PPI | Propagated network relative to differential expression of gene | Network | NA | Ruffalo et al., |
| SDP/SVM | Supervised | EXP, protein sequence, protein interactions, hydropathy profile | Linear classifier based on combination of kernels | SDP/SVM | Recommends CCA (canonical correlation analysis) | Lanckriet et al., |
| FSMKL | Supervised | EXP, CNV, clinical feature (ER status) | Linear classifier based on combination of kernels | Multiple kernel learning | SimpleMKL (gradient descent method) | Seoane et al., |
| iBAG | Supervised | Multi-data | Subset of genes | Multi-step analysis | Bayesian | Jennings et al., |
| MCD | Supervised | MET, CNV, LoH | Subset of genes | Multi-step analysis | NA | Chari et al., |
| Anduril | Supervised | EXP, MET, miRNA, exon, aCGH, SNP | Comprehensive report | Multi-step analysis | NA | Ovaska et al., |
| GeneticInterPred | Semi-supervised | EXP, PPI, protein complex data | Genetic interaction labels | Graph integration | NA | You et al., |
| Graph-based learning | Semi-supervised | EXP, CNV, MET, miRNA | Patient scores for classification purpose | Graph integration | NA | Kim et al., |
| CoxPath | Survival-driven | EXP, CNV, MET, miRNA | Prognosis index for each patient | Multi-step analysis | L1 penalty | Mankoo et al., |
| MKGI | Survival-driven | EXP, CNV, MET, miRNA | Final model with patient index | Grammatical Evolution Neural Networks (GENN) | Neural Networks | Kim et al., |
FS Method, Feature Selection Method; EXP, Expression; CNV, Copy Number Variation; MET, DNA Methylation; SNP, Single Nucleotide Polymorphism; aCGH, Array Comparative Genomic Hybridization; PPI, Protein-Protein Interaction; LoH, Loss of Heterozygosity.
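Several unsupervised tools in the table rely on matrix factorization. As a rough illustration of the idea only, the sketch below factorizes two omics blocks measured on the same samples into one shared sample-factor matrix `W` and one block-specific loading matrix `H_i` per data type, using standard multiplicative updates for the squared reconstruction error. It is not the implementation of Joint NMF or any other listed tool. The shared-`W` formulation, the function names (`joint_nmf`, `reconstruction_error`), the toy `expression` and `methylation` matrices, and the choice of rank 2 are all assumptions made for this example.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mult_update(m, num, den, eps=1e-9):
    """Element-wise multiplicative update m * num / den (eps avoids 0/0)."""
    return [[v * n / (d + eps) for v, n, d in zip(rm, rn, rd)]
            for rm, rn, rd in zip(m, num, den)]


def reconstruction_error(blocks, w, hs):
    """Sum over blocks of the squared Frobenius norm of X_i - W H_i."""
    total = 0.0
    for x, h in zip(blocks, hs):
        approx = matmul(w, h)
        total += sum((xv - av) ** 2
                     for xr, ar in zip(x, approx) for xv, av in zip(xr, ar))
    return total


def joint_nmf(blocks, rank, n_iter=200, seed=0):
    """Factorize each non-negative block X_i (samples x features) as W H_i.

    W (samples x rank) is shared by all blocks; H_i (rank x features_i)
    is specific to block i. Hypothetical helper, for illustration only.
    """
    rng = random.Random(seed)
    n_samples = len(blocks[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_samples)]
    hs = [[[rng.random() + 0.1 for _ in range(len(x[0]))] for _ in range(rank)]
          for x in blocks]
    for _ in range(n_iter):
        # Update every block-specific H_i with W held fixed.
        wt = transpose(w)
        wtw = matmul(wt, w)
        hs = [mult_update(h, matmul(wt, x), matmul(wtw, h))
              for x, h in zip(blocks, hs)]
        # Update the shared W by pooling evidence from all blocks.
        num = [[0.0] * rank for _ in range(n_samples)]
        hht = [[0.0] * rank for _ in range(rank)]
        for x, h in zip(blocks, hs):
            ht = transpose(h)
            num = add(num, matmul(x, ht))
            hht = add(hht, matmul(h, ht))
        w = mult_update(w, num, matmul(w, hht))
    return w, hs


# Toy data (made up): six samples in two groups, two omics blocks.
expression = [[5.0, 4.0, 0.1, 0.2], [4.5, 5.0, 0.0, 0.1], [5.2, 4.1, 0.2, 0.0],
              [0.1, 0.0, 3.9, 4.2], [0.2, 0.1, 4.1, 3.8], [0.0, 0.2, 4.0, 4.0]]
methylation = [[0.9, 0.8, 0.1], [0.8, 0.9, 0.0], [0.9, 0.7, 0.1],
               [0.1, 0.0, 0.9], [0.0, 0.1, 0.8], [0.1, 0.1, 0.9]]
blocks = [expression, methylation]

w, hs = joint_nmf(blocks, rank=2)
print("reconstruction error:", reconstruction_error(blocks, w, hs))
```

Because `W` is shared, each row of `W` summarizes one sample across all data types, and those rows could then be clustered. The large entries in each `H_i` point to the features that drive a given factor, which loosely corresponds to the "modules" and "shared factors" outputs listed in the table.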
Figure 2. Supervised data integration methodology.