Hongyan Xu, Bayazid Sarkar, Varghese George.
Abstract
BACKGROUND: Large-scale genome-wide association studies are promising for unraveling the genetic basis of complex diseases. Population structure is a potential problem, and its effects on genetic association studies are controversial. The first step in systematically quantifying these effects is to choose an appropriate measure of population structure for human data. The commonly used measure is Wright's FST. A set of subpopulations is generally assumed to have a single value of FST. However, the estimates can differ across loci. Since population structure is a concept at the population level, a measure that uses information across loci would be desirable.
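To illustrate why per-locus estimates of FST can differ, the following minimal sketch computes Wright's FST at a single biallelic locus using the textbook heterozygosity form, FST = (H_T - H_S) / H_T. It is not the estimator used in the paper, and it does not implement the measure C. The function name `fst_biallelic`, the equal-weight default, and the allele frequencies are all hypothetical and chosen only for illustration.

```python
def fst_biallelic(freqs, sizes=None):
    """Wright's F_ST = (H_T - H_S) / H_T for one biallelic locus.

    freqs: frequency of one allele in each subpopulation.
    sizes: optional subpopulation weights (assumption: equal weights if omitted).
    """
    if sizes is None:
        sizes = [1.0] * len(freqs)
    total = float(sum(sizes))
    weights = [s / total for s in sizes]

    # Pooled allele frequency and expected heterozygosity in the total population.
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    h_t = 2.0 * p_bar * (1.0 - p_bar)

    # Weighted mean expected heterozygosity within subpopulations.
    h_s = sum(w * 2.0 * p * (1.0 - p) for w, p in zip(weights, freqs))

    if h_t == 0.0:
        return 0.0  # monomorphic locus: no variation to partition
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies at three loci in three subpopulations.
loci = {
    "locus_1": [0.50, 0.50, 0.50],  # identical frequencies
    "locus_2": [0.40, 0.50, 0.60],  # mild differentiation
    "locus_3": [0.10, 0.50, 0.90],  # strong differentiation
}

fst_by_locus = {name: fst_biallelic(p) for name, p in loci.items()}
for name, value in fst_by_locus.items():
    print(name, round(value, 4))
```

The same three subpopulations yield a different FST at each locus, which is the motivation stated above for a measure that pools information across loci.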
Year: 2009 PMID: 19284702 PMCID: PMC2652468 DOI: 10.1186/1756-0500-2-21
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
Figure 1. Linear regression of F.
Correlation coefficient of estimates of FST and C from simulations with varying sample sizes
| Sample size | Correlation coefficient |
| 30 | 0.944 |
| 60 | 0.959 |
| 90 | 0.973 |
Correlation coefficient of estimates of FST and C from simulations with αij from a mixture distribution
| Sample size | Correlation coefficient | | | |
| | a = 0.05 | a = 0.10 | a = 0.20 | a = 0.30 |
| 30 | 0.942 | 0.925 | 0.882 | 0.792 |
| 60 | 0.951 | 0.939 | 0.894 | 0.862 |
| 90 | 0.963 | 0.946 | 0.903 | 0.881 |
Correlation of the estimates of FST and C from several weighting schemes
| Weighting scheme | Correlation coefficient | |
| | Sample sizes (30, 40, 90) | Sample sizes (30, 60, 70) |
| Scheme 1 | 0.866 | 0.875 |
| Scheme 2 | 0.940 | 0.947 |
| Scheme 3 | 0.931 | 0.937 |
Figure 2. Linear regression of F.
Correlation of the estimates of FST and C with varying numbers of subpopulations
| Sample size | Correlation coefficient |
| (90, 90) | 0.874 |
| (60, 60, 60) | 0.959 |
| (45, 45, 45, 45) | 0.965 |
| (36, 36, 36, 36, 36) | 0.971 |