Frans M van der Kloet, Patricia Sebastián-León, Ana Conesa, Age K Smilde, Johan A Westerhuis.
Abstract
BACKGROUND: Joint and individual variation explained (JIVE), distinct and common simultaneous component analysis (DISCO), and O2-PLS, a two-block (X-Y) latent variable regression method with an integral OSC filter, can all be used for the integrated analysis of multiple data sets. Each decomposes the data into three terms: a low(er)-rank approximation capturing common variation across data sets, low(er)-rank approximations for structured variation distinctive for each data set, and residual noise. In this paper these three methods are compared with respect to their mathematical properties and their respective ways of defining common and distinctive variation.
Keywords: DISCO; Integrated analysis; JIVE; Multiple data-sets; O2-PLS
Year: 2016 PMID: 27294690 PMCID: PMC4905617 DOI: 10.1186/s12859-016-1037-2
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
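The three-term decomposition described in the abstract can be written compactly as follows. The symbols are assumptions chosen for illustration, not taken from this record: $X_k$ is the $k$-th data matrix, $C_k$ its common part, $D_k$ its distinctive part, $E_k$ its residual, and $K$ the number of data sets.

```latex
% Assumed notation: X_k data block, C_k common part, D_k distinctive part, E_k residual.
\begin{equation}
  X_k = C_k + D_k + E_k, \qquad k = 1, \dots, K
\end{equation}
% C_k and D_k are low(er)-rank approximations; E_k collects the residual noise.
```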
Fig. 1 Schematic overview of common and distinctive parts for two data-sets. a: two data-sets with equal total variance and b: two data-sets with different total variance
Summary table of all orthogonality constraints for the three algorithms
| | DISCO | JIVE | O2-PLS |
|---|---|---|---|
| Orthogonalities | | | |
| | 0 | 0 | 0 |
| | 0 | ≠0 | ≠0 |
| | 0 | 0 | 0 |
| | 0 | 0 | ≠0 |
| | 0 | ≠0 | ≠0 |
| | 0 | ≠0 | ≠0 |
| | 0 | ≠0 | ≠0 |
| Characteristics | | | |
| Fusion | [ | [ | |
| First step | Distinctive | Common | Common |
| Optimization | Distinctive | Common + Distinctive | Common/Distinctive |

(0: orthogonal, ≠0: no forced orthogonality)
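A minimal sketch of how an orthogonality entry of this kind could be checked numerically, assuming two parts are called orthogonal when every column of one has a zero inner product with every column of the other. The function names, the tolerance, and the toy score vectors are hypothetical; which pairs of matrices each method actually constrains is not recoverable from the table as extracted.

```python
def cross_product(a, b):
    """Return A^T B for matrices given as lists of rows with the same row count."""
    n_cols_a, n_cols_b = len(a[0]), len(b[0])
    return [
        [sum(row_a[i] * row_b[j] for row_a, row_b in zip(a, b))
         for j in range(n_cols_b)]
        for i in range(n_cols_a)
    ]


def is_orthogonal(a, b, tol=1e-10):
    """True when every entry of A^T B is zero within the tolerance."""
    return all(abs(v) <= tol for row in cross_product(a, b) for v in row)


# Hypothetical score vectors for four objects (one column each).
common_scores = [[1.0], [1.0], [-1.0], [-1.0]]
distinctive_scores = [[1.0], [-1.0], [1.0], [-1.0]]
other_scores = [[1.0], [0.0], [0.0], [0.0]]

orthogonal_pair = is_orthogonal(common_scores, distinctive_scores)
non_orthogonal_pair = is_orthogonal(common_scores, other_scores)
```

In this toy case the first pair corresponds to a "0" entry, while the second pair illustrates what a "≠0" entry permits.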
Fig. 2 The loadings that were used to generate the data for both scenarios
Summary table of explained variances by the different methods in the second scenario using the real model settings (1,2,2)
| Data-set | Part | Real | DISCO | JIVE | O2-PLS |
|---|---|---|---|---|---|
| 1 | Common | 0.11 | 0.11 | 0.83 | 0.11 |
| 1 | Distinctive | 0.88 | 0.88 | 0.16 | 0.88 |
| 1 | Error | 0.01 | 0.01 | 0.01 | 0.01 |
| 2 | Common | 0.62 | 0.62 | 0.00 | 0.62 |
| 2 | Distinctive | 0.36 | 0.36 | 0.91 | 0.32 |
| 2 | Error | 0.02 | 0.02 | 0.09 | 0.06 |
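Fractions like those in the table above could be computed as the sum of squares of each part divided by the sum of squares of the data block. The sketch below assumes exactly that definition and uses hypothetical toy numbers, not the simulated data from the paper. The fractions add up to 1 only when the three parts are mutually orthogonal.

```python
def sum_of_squares(matrix):
    """Sum of squared entries of a matrix given as a list of rows."""
    return sum(value * value for row in matrix for value in row)


def add_matrices(*matrices):
    """Element-wise sum of equally sized matrices."""
    return [[sum(values) for values in zip(*rows)] for rows in zip(*matrices)]


def explained_fractions(common, distinctive, residual):
    """Fraction of the data block's sum of squares carried by each part.

    Assumes X = common + distinctive + residual (an assumed convention).
    """
    total = sum_of_squares(add_matrices(common, distinctive, residual))
    return {
        "Common": sum_of_squares(common) / total,
        "Distinctive": sum_of_squares(distinctive) / total,
        "Error": sum_of_squares(residual) / total,
    }


# Toy 4 x 2 block: hypothetical values chosen so the parts are orthogonal.
common = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [-1.0, 0.0]]
distinctive = [[0.0, 2.0], [0.0, -2.0], [0.0, 2.0], [0.0, -2.0]]
residual = [[0.0, 0.0] for _ in range(4)]

fractions = explained_fractions(common, distinctive, residual)
```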
Fig. 3 Real and fitted common and distinctive parts for each method when there is low-abundance common variation (scenario 2). The real common component is shown with a blue cross, and the first and second distinctive components with an orange and a yellow cross. The fit of each method is shown as a line in its respective color
Fig. 4 Score plots of the common (top row) and distinctive (bottom row) parts of the mRNA data-set (left column) and miRNA data-set (right column) after O2-PLS decomposition
Summary table of fitted explained variation by the different methods using the real mRNA and miRNA data-sets
| Data-set | Part | DISCO | JIVE | O2-PLS |
|---|---|---|---|---|
| mRNA | Common | 0.22 | 0.15 | 0.20 |
| mRNA | Distinctive | 0.45 | 0.57 | 0.48 |
| mRNA | Total | 0.68 | 0.72 | 0.68 |
| miRNA | Common | 0.33 | 0.26 | 0.29 |
| miRNA | Distinctive | 0.40 | 0.49 | 0.40 |
| miRNA | Total | 0.73 | 0.75 | 0.70 |
Fig. 5 Scatter plots of the % variance of each original variable explained in the common parts (left) and in the distinctive parts (right), compared between the different algorithms, for the mRNA data-set (top row) and the miRNA data-set (bottom row). DISCO and O2-PLS look very similar. JIVE shows more genes for which more variance is used in the distinctive part, which coincides with the larger amount of distinctive variance explained by JIVE in comparison to DISCO and O2-PLS
Summary of the SWISS scores for the common and distinctive parts identified by the different models during the analysis of the mRNA/miRNA glioblastoma data
| Method | Common (mRNA/miRNA) | Distinctive mRNA | Distinctive miRNA |
|---|---|---|---|
| Real (5 PCs) | 0.66/0.79 | | |
| DISCO | 0.67 | 0.94 | 0.99 |
| JIVE | 0.74 | 0.92 | 0.94 |
| O2-PLS | 0.65/0.66 | 0.97 | 0.93 |
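This record does not define the SWISS score. The sketch below assumes the usual definition, standardized within-class sum of squares: the within-class sum of squares around each class mean divided by the total sum of squares around the overall mean, so that lower values indicate tighter class clusters. The function name, the class labels, and the data are hypothetical.

```python
def swiss_score(rows, labels):
    """Within-class sum of squares divided by total sum of squares.

    rows: list of objects, each a list of variable values.
    labels: one class label per object.
    """
    n_vars = len(rows[0])

    def centred_ss(subset):
        # Sum of squared deviations from the subset's own column means.
        means = [sum(r[j] for r in subset) / len(subset) for j in range(n_vars)]
        return sum((r[j] - means[j]) ** 2 for r in subset for j in range(n_vars))

    total = centred_ss(rows)
    within = sum(
        centred_ss([r for r, lab in zip(rows, labels) if lab == cls])
        for cls in set(labels)
    )
    return within / total


# Hypothetical one-variable example with two well separated classes.
rows = [[0.0], [0.2], [1.0], [1.2]]
labels = ["a", "a", "b", "b"]
score = swiss_score(rows, labels)
```

Under this assumed definition, the lower scores in the Common column would indicate that the common parts separate the sample classes better than the distinctive parts do.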
Modified RV coefficients of the common and distinctive scores for the glioblastoma data-sets
| Data-set | Part | Method | O2-PLS | DISCO | JIVE |
|---|---|---|---|---|---|
| mRNA | Common | O2-PLS | X | 0.77/0.67 | 0.42/0.41 |
| mRNA | Common | DISCO | 0.77/0.67 | X | 0.58 |
| mRNA | Common | JIVE | 0.42/0.41 | 0.58 | X |
| mRNA | Distinctive | O2-PLS | X | 0.53 | 0.58 |
| mRNA | Distinctive | DISCO | 0.53 | X | 0.68 |
| mRNA | Distinctive | JIVE | 0.58 | 0.68 | X |
| miRNA | Distinctive | O2-PLS | X | 0.56 | 0.55 |
| miRNA | Distinctive | DISCO | 0.56 | X | 0.74 |
| miRNA | Distinctive | JIVE | 0.55 | 0.74 | X |
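A minimal sketch of a modified RV coefficient between two score matrices that share the same objects (rows). It assumes the common variant in which the diagonals of the cross-product matrices XX' and YY' are set to zero before the matrix correlation is taken, and that the columns are already centred. Whether the paper used exactly this variant is not stated in this record, and the toy score vectors are hypothetical.

```python
import math


def gram_without_diagonal(x):
    """Return X X^T with its diagonal set to zero (x is a list of rows)."""
    n = len(x)
    return [
        [0.0 if i == j else sum(a * b for a, b in zip(x[i], x[j]))
         for j in range(n)]
        for i in range(n)
    ]


def frobenius_inner(a, b):
    """Sum of element-wise products of two equally sized matrices."""
    return sum(u * v for row_a, row_b in zip(a, b) for u, v in zip(row_a, row_b))


def modified_rv(x, y):
    """Modified RV coefficient of two matrices with matching rows."""
    gx, gy = gram_without_diagonal(x), gram_without_diagonal(y)
    return frobenius_inner(gx, gy) / math.sqrt(
        frobenius_inner(gx, gx) * frobenius_inner(gy, gy)
    )


# Hypothetical centred score vectors for four objects.
scores_a = [[1.0], [1.0], [-1.0], [-1.0]]
scores_b = [[2.0], [2.0], [-2.0], [-2.0]]   # scores_a rescaled
scores_c = [[1.0], [-1.0], [1.0], [-1.0]]   # a different configuration

rv_same = modified_rv(scores_a, scores_b)
rv_diff = modified_rv(scores_a, scores_c)
```

Rescaled copies of the same configuration give a coefficient of 1. Unlike the original RV coefficient, this variant can take negative values.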
| Symbol | Description |
|---|---|
| | total number of datasets |
| | number of rows (objects) |
| | number of columns (variables) for matrix |
| | total number of variables |
| | number of components for common part |
| | number of components for distinctive part of matrix |
| | total number of components |
| | data matrix |
| | concatenated data matrix |
| | common part of matrix |
| | concatenated common parts |
| | distinctive part of matrix |
| | concatenated distinctive parts |
| | the residual error of matrix |
| | concatenated residual errors |
| | scores of SCA model (corresponds to objects) |
| | loadings of SCA model (corresponds to variables) |
| | rotation target loading in DISCO model |
| | rotation matrix in DISCO |
| | weight matrix (used in DISCO) to penalize rotation matrix |
| | common scores (SCA and JIVE) |
| | common loadings (JIVE) |
| | common scores (O2-PLS) for matrix |
| | common loadings for matrix |
| | distinctive scores for matrix |
| | distinctive loadings for matrix |
| ∘ | Hadamard (element-wise) matrix product |
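The Hadamard product, the weight matrix, and the rotation target listed above can be combined into a weighted penalty. The sketch below assumes one plausible form: the squared norm of the weight matrix multiplied element-wise with the difference between rotated loadings and the target. This form, the function names, and all numbers are assumptions for illustration; the exact DISCO criterion is not given in this record.

```python
def hadamard(a, b):
    """Element-wise (Hadamard) product of two equally sized matrices."""
    return [[u * v for u, v in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def weighted_target_penalty(rotated, target, weights):
    """Sum of squares of weights ∘ (rotated - target); an assumed criterion."""
    diff = [[r - t for r, t in zip(row_r, row_t)]
            for row_r, row_t in zip(rotated, target)]
    return sum(v * v for row in hadamard(weights, diff) for v in row)


# Hypothetical loadings: 3 variables x 2 components.
rotated = [[0.5, 0.2], [0.4, -0.1], [0.3, 0.6]]
target = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
# Weights of 1 mark entries that should be driven to the target value;
# weights of 0 leave the corresponding loadings free.
weights = [[0.0, 1.0], [0.0, 1.0], [0.0, 0.0]]

penalty = weighted_target_penalty(rotated, target, weights)
```

Only the two weighted entries contribute here, so a rotation that shrinks those loadings towards zero would lower the penalty.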