Chantel D Sloan, Angeline D Andrew, Eric J Duell, Scott M Williams, Margaret R Karagas, Jason H Moore.
Abstract
Genetic structure due to ancestry has been well documented among many divergent human populations. However, the ability to associate ancestry with genetic substructure without using supervised clustering has not been explored in presumably more homogeneous, admixed US populations. The goal of this study was to determine whether genetic structure could be detected in a United States population from a single state where the individuals have mixed European ancestry. Using Bayesian clustering with a set of 960 single nucleotide polymorphisms (SNPs), we found evidence of population stratification in 864 individuals from New Hampshire that can be used to differentiate the population into six distinct genetic subgroups. We then correlated the self-reported ancestry of the individuals with the Bayesian clustering results. Finnish and Russian/Polish/Lithuanian ancestries were most notably associated with genetic substructure. The ancestral results were further explained and substantiated using New Hampshire census data from 1870 to 1930, when the largest waves of European immigrants came to the area. We also discerned distinct patterns of linkage disequilibrium (LD) between the genetic groups in the growth hormone receptor gene (GHR). To our knowledge, this is the first time such an investigation has uncovered a strong link between genetic structure and ancestry in what would otherwise be considered a homogeneous US population.
Year: 2009 PMID: 19738909 PMCID: PMC2734429 DOI: 10.1371/journal.pone.0006928
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Bayesian clustering results.
a) Bar plots from CLUMPP results aligning 10 structure runs for K = 2 to K = 10. Each plot was created using 960 tagSNPs from 864 individuals and is sorted by q values. The plots are read from left to right, with each bar representing an individual and the color of the bar representing the proportion of that individual's markers that originated from a certain population. b) Probabilities from structure, shown as boxplots of the 10 runs at each K. Structure admixture (c) and FST (d) values for the 10 runs. The FST values graphed are averages across the subpopulations for each K.
Ancestry analysis results for between 2 and 7 populations assumed.
| Number of Populations | Population Group | Ancestries (p-value) |
| K = 3 | 1 | Finland (0.005), Ireland (0.05) |
| | 2 | Italy (0.022), UK (0.027) |
| | 3 | Ca_Indian (0.005), Germany (0.026), Russia (0.019) |
| K = 4 | 1 | Poland (0.015), Russia (0.001), UK (0.035) |
| | 2 | Ca_Indian (0.008), Jewish (0.049) |
| | 3 | Finland (0.008), Switzerland (0.038) |
| | 4 | Italy (0.017), Netherlands (0.024) |
| K = 5 | 1 | England (0.011), Italy (0.01), Netherlands (0.004) |
| | 2 | Lithuania (0.037), Poland (0.001), Russia (0.000) |
| | 3 | Ca_Indian (0.016), France (0.035), Jewish (0.037) |
| | 4 | Am_Indian (0.04), Ca_Indian (0.035), Canada (0.025) |
| | 5 | Finland (0.006), Switzerland (0.043) |
| K = 6 | 1 | Am_Indian (0.043) |
| | 2 | England (0.024), Italy (0.03), Netherlands (0.002) |
| | 3 | Czech (0.029) |
| | 4 | Ca_Indian (0.021), France (0.02), Jewish (0.023) |
| | 5 | Finland (0.006) |
| | 6 | Lithuania (0.017), Poland (0.001), Russia (0.001) |
| K = 7 | 1 | Ca_Indian (0.011) |
| | 2 | England (0.03), Italy (0.005), Netherlands (0.007) |
| | 3 | Am_Indian (0.027), UK (0.038) |
| | 4 | Finland (0.007) |
| | 5 | Czech (0.029) |
| | 6 | Lithuania (0.024), Poland (0.001), Russia (0.001) |
| | 7 | Ca_Indian (0.046), France (0.012), Jewish (0.023) |
A Spearman's rank correlation was computed between each ancestry reported by more than 5 individuals and the Bayesian clustering results. For each population, ancestries with a Spearman's rank correlation p-value < 0.05 are shown along with their p-values (in parentheses).
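The correlation described in this table note can be illustrated with a minimal sketch. It assumes that self-reported ancestry is coded as a 0/1 indicator per individual and that the clustering result is each individual's q value for one cluster; both the coding and the toy numbers are hypothetical and are not taken from the study. The sketch computes only the correlation coefficient, using average ranks for ties. A p-value would need an additional step, for example `scipy.stats.spearmanr`.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical toy data: 1 = individual reports the ancestry, 0 = does not.
reports_ancestry = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical q values (membership proportion in one cluster).
q_cluster = [0.81, 0.74, 0.66, 0.30, 0.22, 0.41, 0.15, 0.09]

rho = spearman_rho(reports_ancestry, q_cluster)
print(f"rho = {rho:.3f}")
```

In this toy case the individuals reporting the ancestry also have the highest q values, so rho is strongly positive.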
Figure 2. Census data for New Hampshire from 1870 to 1930 showing thousands of immigrants from European countries by census year.
Figure 3. D' values using 18 SNPs from the GHR gene for K = 4 population clusters.
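For readers unfamiliar with D', the following sketch shows the standard normalized linkage disequilibrium coefficient for one pair of biallelic SNPs. It assumes phased haplotype frequencies are already known; a real analysis would typically estimate them from unphased genotypes, which this sketch does not attempt. The function name and example frequencies are hypothetical and are not values from the study.

```python
def d_prime(p_ab, p_a, p_b):
    """Absolute D' for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        # A monomorphic locus leaves D' undefined; this sketch returns 0.
        return 0.0
    return abs(d) / d_max


# Hypothetical frequencies, for illustration only.
print(d_prime(0.50, 0.5, 0.5))  # alleles always co-occur
print(d_prime(0.25, 0.5, 0.5))  # alleles are independent
print(d_prime(0.24, 0.6, 0.3))  # partial disequilibrium
```

Computing this for every SNP pair within each cluster separately would give one D' matrix per cluster, which is the kind of comparison the figure describes.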
Haplotype association analysis results using score statistics as computed within the R package haplo.stats.
| Population | 1 | 2 | 3 | 4 |
| 1 (n = 49) | NA | 0.0001* | 0.05783 | 0.00626 |
| 2 (n = 89) | | NA | 0.00067 | 0.24951 |
| 3 (n = 80) | | | NA | 0.00804 |
| 4 (n = 60) | | | | NA |
Global values were obtained by comparing individuals from each population to all individuals from the other 3 populations. The remaining p-values were obtained by comparing haplotypes between pairs of groups. Haplotypes were significantly associated when comparing populations 1 and 2 (marked *), with a p-value below the Bonferroni-corrected alpha of 0.000347.
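The threshold comparison in this note can be checked with a short sketch. The p-values are those in the table above, assigned to population pairs on the assumption that the table is upper-triangular (row population versus column population). The number of tests behind the corrected alpha is not stated in the text; the last line infers it under the further assumption of a family-wise alpha of 0.05.

```python
ALPHA_CORRECTED = 0.000347  # Bonferroni-corrected alpha quoted in the text

# Pairwise p-values from the table; the pair assignment assumes an
# upper-triangular layout (row population vs. column population).
pairwise_p = {
    (1, 2): 0.0001,
    (1, 3): 0.05783,
    (1, 4): 0.00626,
    (2, 3): 0.00067,
    (2, 4): 0.24951,
    (3, 4): 0.00804,
}

# Keep only the pairs whose p-value falls below the corrected threshold.
significant = sorted(pair for pair, p in pairwise_p.items() if p < ALPHA_CORRECTED)
print("Significant pairs:", significant)

# Assumption: a family-wise alpha of 0.05, which is not stated in the text.
implied_tests = round(0.05 / ALPHA_CORRECTED)
print("Implied number of tests:", implied_tests)
```

Under these assumptions only the comparison of populations 1 and 2 passes the corrected threshold, in line with the asterisk in the table.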