Gift Nyamundanda, Lorraine Brennan, Isobel Claire Gormley.
Abstract
BACKGROUND: Data from metabolomic studies are typically complex and high-dimensional. Principal component analysis (PCA) is currently the most widely used statistical technique for analyzing metabolomic data. However, PCA is limited by the fact that it is not based on a statistical model.
Year: 2010 PMID: 21092268 PMCID: PMC3006395 DOI: 10.1186/1471-2105-11-571
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Results of fitting a PPCA model to the urine dataset. A. Plot of the modified BIC values and the proportion of variation explained by each model: the higher the BIC value, the better the model. B. The scores plot: the red triangles denote the subjects in the treatment group and the black dots denote those in the control group. The grey ellipses are the 95% posterior sets indicating the uncertainty associated with each estimated score. C. A barplot of spectral bins with loadings that are significantly different from zero and greater in absolute value than 0.8. The barplot shows how the selected spectral regions load on PC 1 and their corresponding 95% confidence intervals.
Figure 2. Results of fitting a PPCCA model to the urine dataset with weight as a covariate. A. Plot of the modified BIC values and the proportion of variation explained by each model: the higher the BIC value, the better the model. B. The scores plot: the red dots denote the subjects in the treatment group and the black dots denote those in the control group. Dot size reflects a subject's weight (larger dots indicate heavier subjects). The grey ellipses illustrate 95% posterior sets indicating the uncertainty associated with each score. C. A barplot of spectral bins with loadings that are significantly different from zero and greater in absolute value than 0.8. The barplot shows how the selected spectral regions load on PC 1 and their corresponding 95% confidence intervals.
Regression parameter estimates from the fitted PPCCA model.
| | Intercept | Slope |
|---|---|---|
| PC 1 ( | -0.87 (-3.34, 1.61) | 1.34 (-2.47, 5.15) |
| PC 2 ( | | |
| PC 3 ( | 0.86 (-0.07, 1.78) | -1.32 (-2.77, 0.14) |
| PC 4 ( | 0.18 (-1.03, 1.38) | -0.28 (-2.13, 1.58) |
| PC 5 ( | 0.03 (-2.20, 2.26) | -0.04 (-3.32, 3.23) |
95% CIs are given in parentheses. Those estimates significantly different from zero are highlighted in bold.
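The rule stated in the note above (an estimate counts as significantly different from zero when its 95% CI excludes zero) can be checked mechanically. The following is a minimal Python sketch, not the authors' code: it uses only the intervals that survive in this copy of the table (the PC 2 row is blank here and is left out), and the helper name `excludes_zero` is an illustrative assumption.

```python
# Estimates and 95% CIs copied from the table above as (estimate, lower, upper).
# The PC 2 row is blank in this copy of the table, so it is omitted.
estimates = {
    "PC 1": {"intercept": (-0.87, -3.34, 1.61), "slope": (1.34, -2.47, 5.15)},
    "PC 3": {"intercept": (0.86, -0.07, 1.78), "slope": (-1.32, -2.77, 0.14)},
    "PC 4": {"intercept": (0.18, -1.03, 1.38), "slope": (-0.28, -2.13, 1.58)},
    "PC 5": {"intercept": (0.03, -2.20, 2.26), "slope": (-0.04, -3.32, 3.23)},
}


def excludes_zero(lower, upper):
    """Return True when the interval (lower, upper) does not contain zero.

    Hypothetical helper name, introduced only for this illustration.
    """
    return lower > 0 or upper < 0


# Collect every (component, parameter) pair whose interval excludes zero.
significant = [
    (pc, name)
    for pc, params in estimates.items()
    for name, (_, lower, upper) in params.items()
    if excludes_zero(lower, upper)
]

print(significant)
```

Under this rule, every interval in the rows that survive here contains zero, so the list comes back empty.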
Figure 3. The scores plot for a single PPCA model fitted to the brain spectra. Each black dot denotes a score in the two-dimensional principal subspace. The grey ellipses are the 95% posterior sets illustrating the uncertainty associated with each score. An underlying group structure is clearly apparent.
Figure 4. A heatmap of the BIC values for MPPCA models fitted to the brain spectra data. Colour represents the BIC value for each model: the lighter the colour, the higher the BIC value and the better the model. The optimal model with G = 4 and q = 7 is indicated with a cross.
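The selection described in the caption amounts to taking the (G, q) pair with the highest BIC over the grid of fitted models, since under the convention used here a higher BIC indicates a better model. A minimal Python sketch of that step follows. The BIC values in it are made-up placeholders chosen purely for illustration, not values from the paper, and the names `bic_grid` and `select_best_model` are assumptions.

```python
# Hypothetical BIC values keyed by (G, q): number of groups and number of
# principal components. These numbers are placeholders, NOT results from the paper.
bic_grid = {
    (1, 1): -120.0,
    (1, 2): -110.0,
    (2, 1): -95.0,
    (2, 2): -101.5,
}


def select_best_model(grid):
    """Return the (G, q) key with the largest BIC (higher is better here)."""
    return max(grid, key=grid.get)


best_G, best_q = select_best_model(bic_grid)
print(f"Selected model: G = {best_G}, q = {best_q}")
```

Note that many software packages define BIC with the opposite sign, in which case the minimum would be taken instead.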
Cross tabulation of the group membership of subjects based on the estimated MPPCA model and the brain region of origin.
| | Cerebellum | Brain stem | Pre-frontal cortex | Hippocampus |
|---|---|---|---|---|
| Group 1 | 8 | 0 | 0 | 0 |
| Group 2 | 0 | 8 | 0 | 0 |
| Group 3 | 0 | 0 | 9 | 0 |
| Group 4 | 0 | 0 | 0 | 8 |
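The agreement between the estimated groups and the brain regions in the cross tabulation above can be summarised with a simple purity score: each group is credited with its largest region count, and the credited counts are divided by the total number of spectra. This is a minimal Python sketch using the counts from the table. Purity is a summary chosen here for illustration, not a measure reported in the source, and the variable names are assumptions.

```python
# Counts copied from the cross tabulation above.
# Rows are the estimated MPPCA groups; columns follow the order in `regions`.
regions = ["Cerebellum", "Brain stem", "Pre-frontal cortex", "Hippocampus"]
table = {
    "Group 1": [8, 0, 0, 0],
    "Group 2": [0, 8, 0, 0],
    "Group 3": [0, 0, 9, 0],
    "Group 4": [0, 0, 0, 8],
}

# Total number of spectra in the table.
total = sum(sum(row) for row in table.values())

# Purity: credit each group with its largest region count.
matched = sum(max(row) for row in table.values())
purity = matched / total

# The region that dominates each estimated group.
dominant = {group: regions[row.index(max(row))] for group, row in table.items()}

print(total, matched, purity)
print(dominant)
```

Because every off-diagonal count is zero, each estimated group corresponds to exactly one brain region and the purity is 1.0 over the 33 spectra.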