Katrijn Van Deun, Elise A V Crompvoets, Eva Ceulemans.
Abstract
BACKGROUND: Data analysis methods are usually subdivided into two distinct classes: methods for prediction and methods for exploration. In practice, however, there is often a need to learn from the data in both ways. For example, when predicting antibody titers a few weeks after vaccination on the basis of genome-wide mRNA transcription rates, mechanistic insights into the effect of vaccination on the immune system are also sought. Principal covariates regression (PCovR) is a method that combines both purposes. Yet, it fails to give insightful representations of the data, because these include all the variables.
Keywords: Dimension reduction; High-dimensional data; Immunology; Prediction; Stability selection
Year: 2018 PMID: 29587627 PMCID: PMC5870402 DOI: 10.1186/s12859-018-2114-5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Tucker congruence for the simulated data
Fig. 2 Prediction error for the simulated data
Fig. 3 Variance accounted for by each PCA component in the predictor data
Fig. 4 Fit measures obtained for principal covariates regression and partial least squares
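
Tucker congruence (Fig. 1) is commonly defined as the uncentered cosine between two loading vectors. The sketch below uses that common definition; the function name `tucker_congruence` and the example vectors are hypothetical, and this record does not state how the authors computed the coefficient.

```python
import math


def tucker_congruence(a, b):
    """Tucker's congruence coefficient: sum(a*b) / sqrt(sum(a^2) * sum(b^2)).

    Assumption: `a` and `b` are equally long sequences of loadings for the
    same variables, e.g. estimated versus true component loadings.
    """
    if len(a) != len(b):
        raise ValueError("loading vectors must have the same length")
    cross = sum(x * y for x, y in zip(a, b))
    denom = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if denom == 0.0:
        raise ValueError("congruence is undefined for an all-zero vector")
    return cross / denom


# Hypothetical sparse loadings: the estimate recovers the pattern up to scale.
true_loadings = [0.8, 0.0, 0.6, 0.0]
estimated_loadings = [0.4, 0.0, 0.3, 0.0]
congruence = tucker_congruence(true_loadings, estimated_loadings)
```

The coefficient is insensitive to the scale of either vector, so a value close to 1 indicates that the estimated loadings reproduce the pattern of the true loadings.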
Fit of modeled to observed data for three methods: SPCovR, spls, and SGCCA
| Method | VAF (covariates) | Squared correlation, 2008 | Squared correlation, 2007 |
|---|---|---|---|
| SPCovR | 0.19 | 0.42 | 0.79 |
| spls | 0.99 | 0.55 | |
| SGCCA | 0.11 | 1 | 0.53 |
Displayed are the variance accounted for by the components in the block of covariates and the squared correlation between the modeled and observed outcome for the 2008 and 2007 seasons. The model was constructed using the 2008 data.
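
The two fit measures in this table can be illustrated with a minimal sketch. It assumes the usual definitions: variance accounted for as one minus the ratio of residual to total sum of squares, and the squared Pearson correlation between modeled and observed outcome. The function names and the toy numbers are hypothetical, and the record does not give the authors' exact computation.

```python
def variance_accounted_for(observed, modeled):
    """VAF = 1 - SS(observed - modeled) / SS(observed).

    Assumption: both arguments are matrices given as lists of rows, and the
    observed covariates are already column-centered.
    """
    ss_residual = sum(
        (o - m) ** 2
        for row_o, row_m in zip(observed, modeled)
        for o, m in zip(row_o, row_m)
    )
    ss_total = sum(o ** 2 for row_o in observed for o in row_o)
    return 1.0 - ss_residual / ss_total


def squared_correlation(y, y_hat):
    """Squared Pearson correlation between observed and modeled outcome."""
    n = len(y)
    mean_y = sum(y) / n
    mean_hat = sum(y_hat) / n
    cross = sum((a - mean_y) * (b - mean_hat) for a, b in zip(y, y_hat))
    ss_y = sum((a - mean_y) ** 2 for a in y)
    ss_hat = sum((b - mean_hat) ** 2 for b in y_hat)
    return cross ** 2 / (ss_y * ss_hat)


# Toy data (hypothetical): three subjects, two centered covariates.
X = [[1.0, -1.0], [0.0, 2.0], [-1.0, -1.0]]
X_modeled = [[0.5, -1.0], [0.0, 1.5], [-0.5, -0.5]]
y = [1.0, 2.0, 4.0]
y_modeled = [1.5, 2.0, 3.5]

vaf_x = variance_accounted_for(X, X_modeled)
r2_y = squared_correlation(y, y_modeled)
```

Evaluating `squared_correlation` on data from a season other than the one used to build the model, as the table does for 2007, gives an out-of-sample check of the fitted model.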
Percentage of variance accounted for in the block of covariates (VAFX) and in the outcome by each of the SPCovR and SGCCA components
| Method | Measure | Component 1 | Component 2 |
|---|---|---|---|
| SPCovR | VAFX | 0.10 | 0.08 |
| | VAF in the outcome | 0.01 | 0.40 |
| SGCCA | VAFX | 0.07 | 0.04 |
| | VAF in the outcome | 0.79 | 0.20 |
Significantly enriched gene ontology classes
| Biological process | Nr of genes found | Nr of genes expected | +/− | |
|---|---|---|---|---|
| rRNA methylation | 5 | 0.21 | + | 2.03 |
| Cellular macromolecule metabolic process | 89 | 58.65 | + | 1.68 |
| Nucleic acid metabolic process | 60 | 34.14 | + | 2.84 |
| Cellular component organization or biogenesis | 75 | 47.15 | + | 3.86 |
| Gene expression | 57 | 31.88 | + | 3.30 |
| Leukocyte activation | 18 | 4.59 | + | 6.11 |
| Cell activation | 20 | 5.36 | + | 2.88 |
| Immune system process | 31 | 13.13 | + | 2.79 |
| Immune effector process | 19 | 5.25 | + | 9.41 |
| Negative regulation of metabolic process | 32 | 14.16 | + | 4.59 |