Yuanjia Wang (1), Yixin Fang, Man Jin. (1) Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY 10032, USA. yw2016@columbia.edu
Abstract
OBJECTIVE: To develop a ridge-penalized principal-components approach based on heritability that can be applied to high-dimensional family data. METHODS: The first principal component of heritability for a trait constellation is defined as the linear combination of traits that maximizes the heritability, which is equivalent to maximizing the family-specific variation relative to the subject-specific variation. To analyze high-dimensional data and prevent overfitting, we propose a penalized principal-components approach based on heritability that adds a ridge penalty to the subject-specific variation. We choose the optimal regularization parameter by cross-validation. RESULTS: The principal-components approach based on heritability, with and without the ridge penalty, was compared to the usual principal-components analysis in four settings. The penalized principal-components of heritability analysis had substantially larger coefficients for the traits with a genetic effect than for the traits with no genetic effect, while the non-regularized analysis failed to identify the genetic traits. In addition, linkage analysis on the combined traits showed that the power of the proposed method was higher than that of the usual principal-components analysis and of the non-regularized principal-components of heritability analysis. CONCLUSIONS: The penalized principal-components approach based on heritability can effectively handle a large number of traits with family structure and provide a power gain for linkage analysis. The cross-validation procedure performs well in choosing the optimal magnitude of the penalty. Copyright 2007 S. Karger AG, Basel.
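The METHODS description can be read as a penalized generalized eigenproblem: find the coefficient vector a that maximizes a'Ba / a'(W + λI)a, where B is the family-specific (between-family) covariance and W the subject-specific (within-family) covariance. The sketch below illustrates that reading only. The abstract does not give the estimators or the cross-validation criterion, so the one-way decomposition used for B and W, the penalty form λI, the held-out heritability ratio used as the score, and all function names are assumptions, not the authors' implementation.

```python
import numpy as np


def variance_components(Y, fam):
    """Between-family (B) and within-family (W) trait covariances.

    Assumption: a simple one-way (MANOVA-style) decomposition by family.
    """
    Y = np.asarray(Y, dtype=float)
    fam = np.asarray(fam)
    n, p = Y.shape
    grand_mean = Y.mean(axis=0)
    labels = np.unique(fam)
    B = np.zeros((p, p))
    W = np.zeros((p, p))
    for f in labels:
        Yf = Y[fam == f]
        fam_mean = Yf.mean(axis=0)
        d = (fam_mean - grand_mean)[:, None]
        B += len(Yf) * (d @ d.T)      # family-specific variation
        R = Yf - fam_mean
        W += R.T @ R                  # subject-specific variation
    return B / (len(labels) - 1), W / (n - len(labels))


def penalized_pch(B, W, lam):
    """Unit-norm a maximizing a'Ba / a'(W + lam*I)a, and the maximal ratio.

    Assumes W + lam*I is positive definite (true for any lam > 0).
    """
    p = B.shape[0]
    L = np.linalg.cholesky(W + lam * np.eye(p))
    Linv = np.linalg.inv(L)
    # Reduce to an ordinary symmetric eigenproblem in whitened coordinates.
    vals, vecs = np.linalg.eigh(Linv @ B @ Linv.T)
    a = Linv.T @ vecs[:, -1]          # eigh sorts ascending: last is largest
    return a / np.linalg.norm(a), vals[-1]


def choose_lambda(Y, fam, lams, n_folds=5, seed=0):
    """Pick lam by cross-validation over families.

    Assumption: the score is the unpenalized ratio a'Ba / a'Wa on held-out
    families; each fold needs at least two families with repeated members.
    """
    Y = np.asarray(Y, dtype=float)
    fam = np.asarray(fam)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(np.unique(fam)), n_folds)
    scores = []
    for lam in lams:
        fold_scores = []
        for held_out in folds:
            test = np.isin(fam, held_out)
            B_tr, W_tr = variance_components(Y[~test], fam[~test])
            a, _ = penalized_pch(B_tr, W_tr, lam)
            B_te, W_te = variance_components(Y[test], fam[test])
            fold_scores.append((a @ B_te @ a) / (a @ W_te @ a))
        scores.append(np.mean(fold_scores))
    return lams[int(np.argmax(scores))]
```

Splitting by family rather than by subject keeps relatives together in the same fold, which seems the natural choice for family data, although the abstract does not state how the folds were formed. With λ = 0 and more traits than within-family degrees of freedom, W is singular and the Cholesky step fails, which is the high-dimensional situation the ridge penalty is meant to address.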