Dai Wang, Yu Sun, Paul Stang, Jesse A Berlin, Marsha A Wilcox, Qingqin Li.
Abstract
Population stratification (PS) represents a major challenge in genome-wide association studies. Using the Genetic Analysis Workshop 16 Problem 1 data, which include samples from rheumatoid arthritis patients and healthy controls, we compared two methods that can be used to evaluate population structure and correct for PS in genome-wide association studies: principal-component analysis (PCA) and multidimensional scaling (MDS). While both methods identified similar population structures in this dataset, PCA performed slightly better than MDS in correcting for PS in the genome-wide association analysis of this dataset.
Year: 2009 PMID: 20017973 PMCID: PMC2795880 DOI: 10.1186/1753-6561-3-s7-s109
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Correlation between first eight principal components and first eight MDS dimensions
| Principal component | dim1 | dim2 | dim3 | dim4 | dim5 | dim6 | dim7 | dim8 |
|---|---|---|---|---|---|---|---|---|
| evec1 | 0.01 | -0.02 | 0.01 | 0.005 | 0.01 | -0.002 | -0.0004 | |
| evec2 | 0.00 | 0.17 | 0.03 | -0.01 | -0.01 | 0.01 | 0.002 | |
| evec3 | -0.04 | 0.17 | -0.96 | -0.11 | -0.01 | 0.01 | 0.01 | -0.01 |
| evec4 | -0.02 | 0.00 | -0.12 | 0.06 | 0.17 | -0.07 | -0.03 | |
| evec5 | 0.02 | -0.01 | -0.05 | 0.38 | -0.20 | -0.43 | 0.20 | 0.03 |
| evec6 | 0.01 | 0.01 | 0.02 | 0.02 | 0.13 | 0.00 | 0.08 | -0.17 |
| evec7 | -0.02 | -0.001 | -0.03 | -0.01 | 0.17 | -0.07 | -0.17 | 0.31 |
| evec8 | 0.02 | -0.01 | 0.02 | 0.02 | -0.26 | -0.18 | -0.32 | 0.06 |
a Bold font indicates that the absolute value of the correlation coefficient is > 0.8.
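A table of this kind can be produced by correlating each principal-component column with each MDS-dimension column across samples. The following is a minimal sketch under stated assumptions, not the authors' code: the function names (`pearson`, `cross_correlation`, `strong_pairs`) are hypothetical, the toy coordinates are made up, and only the 0.8 threshold comes from the footnote above.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cross_correlation(pcs, dims):
    """Correlate every PC column with every MDS column.

    pcs and dims are lists of per-sample coordinate lists
    (samples x components), with samples in the same order.
    Returns a matrix with one row per PC and one column per MDS dimension.
    """
    pc_cols = list(zip(*pcs))
    dim_cols = list(zip(*dims))
    return [[pearson(p, d) for d in dim_cols] for p in pc_cols]


def strong_pairs(corr, threshold=0.8):
    """List (pc_index, dim_index, r) for every pair with |r| above the threshold."""
    return [
        (i, j, r)
        for i, row in enumerate(corr)
        for j, r in enumerate(row)
        if abs(r) > threshold
    ]


# Made-up toy coordinates for five samples and two components each.
# The first MDS column is the first PC column scaled by -2, so the two
# are perfectly anti-correlated.
pcs = [[0.1, 0.5], [0.2, -0.3], [-0.4, 0.1], [0.3, -0.2], [-0.2, -0.1]]
dims = [[-0.2, 0.4], [-0.4, 0.2], [0.8, -0.1], [-0.6, 0.3], [0.4, -0.8]]

corr = cross_correlation(pcs, dims)
strong = strong_pairs(corr)
```

The absolute value is used in the threshold because the sign of a principal component or MDS dimension is arbitrary, so a strongly negative coefficient indicates the same axis of variation as a strongly positive one.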
Figure 1. Population structures identified by PCA and MDS. A, the first six principal components are plotted against one another, with RA status distinguished by shading; B, the first six MDS dimensions are plotted against one another, with RA status distinguished by shading.
Figure 2. Q-Q plots. SNPs in the HLA region are excluded to enhance readability.
Figure 3. Results of GWA analyses. The y-axis is on a square-root scale to enhance readability.
p-Values at SNP rs2476601 from three GWA analyses and their rankings in non-HLA SNPs
| Analysis method | p-Value | Rank in non-HLA SNPs |
|---|---|---|
| Trend test without adjustment | 5.42 × 10⁻¹² | 3 |
| Logistic regression adjusted for principal components | 0.000018 | 18 |
| Logistic regression adjusted for MDS dimensions | 0.000022 | 59 |
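For readers unfamiliar with the unadjusted analysis in the first row, the sketch below shows a textbook form of the Cochran-Armitage trend test with additive weights (0, 1, 2) for the three genotypes. It is an illustration under stated assumptions only: the record does not say which implementation or weights were used, the function name `armitage_trend_test` is hypothetical, and the genotype counts are made up rather than taken from the study.

```python
import math


def armitage_trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on a 2 x 3 genotype table.

    cases, controls: genotype counts ordered by number of copies
    of the tested allele (0, 1, 2).
    Returns (chi2, p) where chi2 has 1 degree of freedom.
    """
    R, S = sum(cases), sum(controls)          # row totals
    n = [r + s for r, s in zip(cases, controls)]  # column totals
    N = R + S

    # Trend statistic: weighted difference between cases and controls.
    T = sum(w * (r * S - s * R) for w, r, s in zip(weights, cases, controls))

    # Variance of T under the null hypothesis of no association.
    diag = sum(w * w * c * (N - c) for w, c in zip(weights, n))
    cross = sum(
        weights[i] * weights[j] * n[i] * n[j]
        for i in range(len(n))
        for j in range(i + 1, len(n))
    )
    var_T = (R * S / N) * (diag - 2 * cross)

    chi2 = T * T / var_T
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Made-up genotype counts (0, 1, 2 copies of the tested allele).
chi2, p = armitage_trend_test(cases=[20, 50, 30], controls=[40, 45, 15])
```

This statistic takes no account of ancestry, which is why the table contrasts it with logistic regression models that include principal components or MDS dimensions as covariates.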