Sungho Won, Hosik Choi, Suyeon Park, Juyoung Lee, Changyi Park, Sunghoon Kwon.
Abstract
Owing to recent improvements in genotyping technology, large-scale genetic data can be used to identify disease susceptibility loci, and these findings have substantially improved our understanding of complex diseases. Despite these successes, most genetic effects for many complex diseases have been found to be very small, which has been a major hurdle in building disease prediction models. Recently, many statistical methods based on penalized regression have been proposed to tackle the so-called "large P and small N" problem. Penalized regressions, including the least absolute shrinkage and selection operator (LASSO) and ridge regression, limit the parameter space, and this constraint enables the effects of a very large number of SNPs to be estimated. Various extensions have been suggested, and in this report we compare their accuracy by applying them to several complex diseases. Our results show that penalized regressions are usually robust and provide better accuracy than the existing methods, at least for the diseases under consideration.
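The way a penalty "limits the space of parameters" can be written compactly. The following is a generic textbook formulation, not necessarily the exact model fitted in the paper; the symbols are notation assumed here for illustration: \ell is the log-likelihood of the disease model, \beta_j the effect of SNP j, p the number of SNPs, and t a tuning parameter that sets the size of the allowed region.

```latex
\hat{\beta}^{\mathrm{LASSO}} = \arg\min_{\beta} \; -\ell(\beta)
  \quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le t,
\qquad
\hat{\beta}^{\mathrm{ridge}} = \arg\min_{\beta} \; -\ell(\beta)
  \quad \text{subject to} \quad \sum_{j=1}^{p} \beta_j^{2} \le t .
```

Each constrained problem has an equivalent penalized form, minimizing -\ell(\beta) + \lambda \sum_j |\beta_j| for LASSO or -\ell(\beta) + \lambda \sum_j \beta_j^2 for ridge, with \lambda \ge 0. The LASSO constraint can set some coefficients exactly to zero, whereas ridge shrinks all coefficients toward zero without removing any.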
Year: 2015 PMID: 26346893 PMCID: PMC4539442 DOI: 10.1155/2015/605891
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. AUCs from the test set. AUCs for T2D, obesity, hypertension, CPD10, SC, and SI in the test set were calculated for different n and p_1. TR indicates the truncated ridge.
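For readers unfamiliar with the metric, the sketch below shows one common way to compute an AUC on a held-out test set: the fraction of case-control pairs in which the case receives the higher predicted risk, with ties counted as one half. The function name `auc` and the toy labels and scores are hypothetical and are not taken from the paper, which may have used a different implementation.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 1 (case) or 0 (control).
    scores: predicted risk for each individual, same order as labels.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0   # case ranked above control
            elif c == d:
                wins += 0.5   # tie counts as half
    return wins / (len(cases) * len(controls))


# Toy data (hypothetical): three cases and three controls.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
toy_auc = auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to a model that ranks cases and controls no better than chance, and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, so a rank-based implementation would be preferable for large test sets.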
Relative proportion of variance explained by genotyped SNPs.
| | T2D | Obesity | Hypertension | CPD | SI | SC |
|---|---|---|---|---|---|---|
| | 0.147276 | 0.14922 | 0.296246 | 0.243554 | 0.052088 | 1.00 |
| | 0.097091 | 0.10029 | 0.100675 | 0.424123 | 0.080256 | 0.102595 |
Figure 2. Number of nonzero p_1 in the disease risk prediction model. The numbers of nonzero SNP coefficients in the disease risk prediction model are shown for different n and p_1. TR indicates the truncated ridge.