Sandra Waaijenborg, Aeilko H Zwinderman.
Abstract
Inter-individual variation in gene expression levels can arise as an effect of variation in DNA markers. When associating multiple gene expression variables with multiple DNA marker variables, multivariate techniques such as canonical correlation analysis should be used to deal with the effect of co-regulating genes. We adapted the elastic net, a penalized approach proposed for variable selection in a regression context, to canonical correlation analysis. The number of variables within each canonical component could be greatly reduced without much loss of information, which makes the canonical components easier to interpret. Another advantage is that the method groups co-regulating genes, so that they end up in the same canonical components. Furthermore, our adaptation works well in situations where the number of variables greatly exceeds the number of subjects.
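The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way a penalized canonical correlation analysis can be set up, not the authors' procedure. It assumes an alternating scheme in which each weight vector is updated by soft-thresholding the cross-correlation with the other canonical variate, a simplification sometimes used in place of a full elastic net fit. The names `soft_threshold` and `sparse_cca_pair`, and the penalty parameters `lam_x` and `lam_y`, are hypothetical.

```python
import numpy as np


def soft_threshold(a, lam):
    """Shrink each entry of `a` toward zero by `lam`; entries below `lam` become 0."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)


def sparse_cca_pair(X, Y, lam_x, lam_y, n_iter=100, tol=1e-8):
    """Estimate one sparse pair of canonical weight vectors (illustrative sketch).

    X: (n, p) array, e.g. SNP dummy variables.
    Y: (n, q) array, e.g. gene expression variables.
    Assumes no column is constant, since columns are standardized first.
    """
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    n = X.shape[0]
    C = X.T @ Y / n  # (p, q) sample cross-correlation matrix

    # Start from the leading right singular vector of the cross-correlation.
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    v = Vt[0]
    u = np.zeros(X.shape[1])

    for _ in range(n_iter):
        # Update the X-side weights given the current Y-side weights.
        u_new = soft_threshold(C @ v, lam_x)
        norm_u = np.linalg.norm(u_new)
        if norm_u == 0.0:
            raise ValueError("lam_x is too large: all X weights were set to zero")
        u_new /= norm_u

        # Update the Y-side weights given the new X-side weights.
        v_new = soft_threshold(C.T @ u_new, lam_y)
        norm_v = np.linalg.norm(v_new)
        if norm_v == 0.0:
            raise ValueError("lam_y is too large: all Y weights were set to zero")
        v_new /= norm_v

        converged = (np.linalg.norm(u_new - u) < tol
                     and np.linalg.norm(v_new - v) < tol)
        u, v = u_new, v_new
        if converged:
            break

    # Correlation between the two canonical variates on the training data.
    corr = np.corrcoef(X @ u, Y @ v)[0, 1]
    return u, v, corr
```

In this sketch, larger values of `lam_x` and `lam_y` leave fewer non-zero weights in each component. Because the variables are standardized, the update never inverts a covariance matrix, which is one reason schemes of this kind remain usable when the number of variables exceeds the number of subjects. The penalties would normally be chosen by cross-validation rather than fixed in advance.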
Year: 2007 PMID: 18466464 PMCID: PMC2367589 DOI: 10.1186/1753-6561-1-s1-s122
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Figure 1. Effect of the number of variables on the MSPE and the canonical correlation. The effect of the number of variables within each CCA component pair on (a) the mean squared prediction error, and (b) the average canonical correlation.
Figure 2. Weights of the CCA components. The weights of the CCA components containing the gene expression variables and the SNP dummy variables, ordered according to their chromosome location, obtained from a CCA component pair containing 50 variables each.
Figure 3. Intra-CCA component correlation. The distribution of the intra-CCA component correlations of (a) the gene expression variables and (b) the SNP dummy variables, obtained from a CCA component pair containing 50 variables each.