Alkes L Price, Agnar Helgason, Snaebjorn Palsson, Hreinn Stefansson, David St Clair, Ole A Andreassen, David Reich, Augustine Kong, Kari Stefansson.
Abstract
The Icelandic population has been sampled in many disease association studies, providing a strong motivation to understand the structure of this population and its ramifications for disease gene mapping. Previous work using 40 microsatellites showed that the Icelandic population is relatively homogeneous, but exhibits subtle population structure that can bias disease association statistics. Here, we show that regional geographic ancestries of individuals from Iceland can be distinguished using 292,289 autosomal single-nucleotide polymorphisms (SNPs). We further show that subpopulation differences are due to genetic drift since the settlement of Iceland 1100 years ago, and not to varying contributions from different ancestral populations. A consequence of the recent origin of Icelandic population structure is that allele frequency differences follow a null distribution devoid of outliers, so that the risk of false positive associations due to stratification is minimal. Our results highlight an important distinction between population differences attributable to recent drift and those arising from more ancient divergence, which has implications both for association studies and for efforts to detect natural selection using population differentiation.
Year: 2009 PMID: 19503599 PMCID: PMC2684636 DOI: 10.1371/journal.pgen.1000505
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Figure 1. Map of 11 regions of Iceland, color-coded to match Figures 2 and 3.
The interior region is not numbered, as it is uninhabited. Sample sizes for each region are listed in Table 1.
Table 1. Data for Icelandic samples with majority ancestry from each of the 11 regions.
| Region | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
| Total | 47 | 959 | 1154 | 3667 | 1343 | 1108 | 1102 | 1368 | 803 | 1447 | 1315 |
| Unrelated | 3 | 55 | 65 | 100 | 100 | 100 | 98 | 100 | 61 | 100 | 95 |
For each region, we list the total number of Icelandic samples with majority ancestry from that region, and the number of unrelated samples that were selected.
Figure 2. PCA plots of (A) samples with most of their ancestry from 11 regions of Iceland and (B) samples with most of their ancestry from 11 regions of Iceland, together with a set of 250 randomly selected Icelandic samples.
Table 2. Pairwise F_ST and heterozygosity estimates for 11 regions of Iceland.
| Region | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
| 1 | 0.3505 | 0.0019 | 0.0024 | 0.0022 | 0.0021 | 0.0031 | 0.0036 | 0.0032 | 0.0042 | 0.0018 | 0.0022 |
| 2 | | 0.3479 | 0.0013 | 0.0015 | 0.0016 | 0.0027 | 0.0030 | 0.0027 | 0.0038 | 0.0018 | 0.0019 |
| 3 | | | 0.3475 | 0.0012 | 0.0015 | 0.0027 | 0.0030 | 0.0027 | 0.0040 | 0.0021 | 0.0022 |
| 4 | | | | 0.3478 | 0.0014 | 0.0027 | 0.0028 | 0.0027 | 0.0039 | 0.0020 | 0.0023 |
| 5 | | | | | 0.3474 | 0.0014 | 0.0021 | 0.0024 | 0.0039 | 0.0020 | 0.0023 |
| 6 | | | | | | 0.3468 | 0.0018 | 0.0030 | 0.0048 | 0.0031 | 0.0034 |
| 7 | | | | | | | 0.3457 | 0.0029 | 0.0049 | 0.0033 | 0.0035 |
| 8 | | | | | | | | 0.3466 | 0.0032 | 0.0025 | 0.0030 |
| 9 | | | | | | | | | 0.3446 | 0.0027 | 0.0036 |
| 10 | | | | | | | | | | 0.3479 | 0.0012 |
| 11 | | | | | | | | | | | 0.3470 |
Heterozygosity values are listed on the diagonal. Standard errors of F_ST estimates were equal to 0.0007 for all comparisons involving Region 1 and 0.0001 for all other comparisons.
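As a rough illustration of how the off-diagonal and diagonal quantities in such a table can be obtained from per-SNP allele frequencies, here is a minimal sketch. It assumes the Hudson ratio-of-averages estimator for F_ST and the simple expected heterozygosity 2p(1-p); the record above does not say which estimators the authors used, and the function names and toy frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Mean expected heterozygosity 2p(1-p) across SNPs (assumed definition)."""
    return sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)


def hudson_fst(p1, p2, n1, n2):
    """Hudson F_ST between two populations, as a ratio of averages over SNPs.

    p1, p2: per-SNP allele frequencies in each population (same SNP order).
    n1, n2: number of sampled chromosomes in each population (2 x individuals).
    """
    num = 0.0
    den = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for finite sample size.
        num += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that two alleles drawn from different populations differ.
        den += a * (1 - b) + b * (1 - a)
    return num / den


# Toy frequencies for two hypothetical regions with 100 unrelated samples each.
region_a = [0.10, 0.50, 0.80, 0.35]
region_b = [0.12, 0.47, 0.83, 0.33]
fst = hudson_fst(region_a, region_b, n1=200, n2=200)
het = expected_heterozygosity(region_a)
```

The sample-size correction matters at this scale: with F_ST values on the order of 0.001 to 0.005, an uncorrected estimate from about 100 individuals per region would be noticeably inflated by sampling noise.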
Figure 3. PCA plot of samples from Norway and Scotland projected onto PCs computed using samples with most of their ancestry from 11 regions of Iceland.
Figure 4. P-P plots of allele frequency differentiation between region r and the union of all other regions, for each value of r.
Figure 5. P-P plot of allele frequency differentiation between Norway and Scotland.
The nine SNPs from Table 3 are displayed as squares.
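The coordinates of a P-P plot like those described in the captions above can be sketched as follows. This is only an illustration: it takes per-SNP P-values as given (how the authors computed them is not stated in this record), assumes the common convention of plotting on a -log10 scale, and uses a hypothetical function name.

```python
import math


def pp_points(pvalues):
    """Return (expected, observed) -log10 P-value pairs for a P-P plot.

    Under the null, sorted P-values should track uniform quantiles, so the
    points fall on the diagonal; outliers rise above it in the upper tail.
    All P-values must be strictly positive.
    """
    observed = sorted(pvalues)
    n = len(observed)
    points = []
    for i, p in enumerate(observed):
        expected = (i + 0.5) / n  # midpoint uniform quantile for rank i
        points.append((-math.log10(expected), -math.log10(p)))
    return points


# Toy input: four P-values that match the uniform quantiles exactly.
toy = pp_points([0.125, 0.375, 0.625, 0.875])
```

A distribution "devoid of outliers", as the abstract puts it, corresponds to points that stay close to the diagonal across the whole range, including the smallest P-values.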
Table 3. List of markers whose unusual differentiation between Iceland and Scotland is genomewide-significant.
| Marker | Chromosome | Build35 Position | Nominal P-Value | Inside Gene? |
| rs10024216 | 4 | 38,586,678 | 7×10⁻⁸ | |
| rs10008492 | 4 | 38,588,286 | 7×10⁻¹⁰ | |
| rs4331786 | 4 | 38,591,974 | 2×10⁻⁹ | |
| rs11096957 | 4 | 38,599,057 | 1×10⁻⁹ | |
| rs4543123 | 4 | 38,615,090 | 5×10⁻¹¹ | |
| rs4833095 | 4 | 38,622,276 | 6×10⁻¹⁰ | |
| rs7944926 | 11 | 70,843,273 | 2×10⁻⁹ | |
| rs3794060 | 11 | 70,865,327 | 3×10⁻⁹ | |
| rs13107325* | 4 | 103,545,887 | 2×10⁻⁷ | |
A total of 12 markers in the TLR region and 5 markers in the NADSYN1 region achieved a nominal P-value of 0.0001 or lower (data not shown). We list with an asterisk one additional marker whose differentiation is highly suggestive (see text). Gene names are listed for markers located between the transcription start and end sites of a gene.
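One way to see why the asterisked marker is described as highly suggestive rather than genomewide-significant is a Bonferroni-style check. The sketch below assumes a 0.05 family-wise error rate divided by the 292,289 SNPs mentioned in the abstract; the record does not state the authors' actual threshold or the number of SNPs tested in this comparison, so both are assumptions.

```python
# Assumed inputs: alpha = 0.05 and the SNP count quoted in the abstract.
ALPHA = 0.05
N_SNPS = 292_289
threshold = ALPHA / N_SNPS  # roughly 1.7e-7

# Nominal P-values copied from the marker table above.
nominal_p = {
    "rs10024216": 7e-8,
    "rs10008492": 7e-10,
    "rs4331786": 2e-9,
    "rs11096957": 1e-9,
    "rs4543123": 5e-11,
    "rs4833095": 6e-10,
    "rs7944926": 2e-9,
    "rs3794060": 3e-9,
    "rs13107325": 2e-7,  # listed with an asterisk in the table
}

# Markers whose nominal P-value clears the assumed threshold.
significant = sorted(m for m, p in nominal_p.items() if p < threshold)
suggestive = sorted(m for m, p in nominal_p.items() if p >= threshold)
```

Under these assumptions the eight unstarred markers pass the threshold and rs13107325 falls just short of it, which is consistent with the way the table footnote separates them.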