Christopher D Steele, Denise Syndercombe Court, David J Balding.
Abstract
We estimate the population genetics parameter FST (also referred to as the fixation index) from short tandem repeat (STR) allele frequencies, comparing many worldwide human subpopulations at approximately the national level with continental-scale populations. FST is commonly used to measure population differentiation, and is important in forensic DNA analysis to account for remote shared ancestry between a suspect and an alternative source of the DNA. We estimate FST comparing subpopulations with a hypothetical ancestral population, which is the approach most widely used in population genetics, and also compare a subpopulation with a sampled reference population, which is more appropriate for forensic applications. Both estimation methods are likelihood-based, in which FST is related to the variance of the multinomial-Dirichlet distribution for allele counts. Overall, we find low FST values, with posterior 97.5 percentiles < 3% when comparing a subpopulation with the most appropriate population, and even for inter-population comparisons we find FST < 5%. These are much smaller than single nucleotide polymorphism-based inter-continental FST estimates, and are also about half the magnitude of STR-based estimates from population genetics surveys that focus on distinct ethnic groups rather than a general population. Our findings support the use of FST up to 3% in forensic calculations, which corresponds to some current practice.
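The likelihood described in the abstract can be illustrated with a minimal sketch. It assumes the common parameterization in which subpopulation allele frequencies at a locus are Dirichlet distributed around reference frequencies p, with parameters p_i(1 - FST)/FST, so that their variance is FST p_i(1 - p_i). It further assumes a single FST shared across loci, a uniform prior, and a posterior evaluated on a grid. The function names, the grid approach and the example counts are illustrative assumptions, not the authors' implementation or data.

```python
import math

def log_lik(fst, counts, ref_freqs):
    """Multinomial-Dirichlet log-likelihood of allele counts at one locus.

    The multinomial coefficient is dropped because it does not depend on fst.
    Assumes every reference frequency is strictly positive.
    """
    a = (1.0 - fst) / fst              # Dirichlet parameters sum to (1 - F) / F
    n = sum(counts)
    ll = math.lgamma(a) - math.lgamma(a + n)
    for c, p in zip(counts, ref_freqs):
        ll += math.lgamma(p * a + c) - math.lgamma(p * a)
    return ll

def posterior_percentiles(loci, probs=(0.025, 0.5, 0.975), grid_size=2000):
    """Grid approximation to posterior percentiles of FST under a uniform prior.

    `loci` is a list of (counts, ref_freqs) pairs, one per locus; loci are
    treated as independent, so their log-likelihoods are summed.
    """
    grid = [(k + 0.5) / grid_size for k in range(grid_size)]
    log_post = [sum(log_lik(f, c, p) for c, p in loci) for f in grid]
    peak = max(log_post)               # subtract the peak to avoid underflow
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    result, cum, idx = [], 0.0, 0
    for f, w in zip(grid, weights):
        cum += w / total
        while idx < len(probs) and cum >= probs[idx]:
            result.append(f)
            idx += 1
    # Guard against rounding leaving a percentile unassigned.
    result.extend([grid[-1]] * (len(probs) - len(result)))
    return result

# Hypothetical one-locus examples: counts close to, and far from, the reference.
close = [([50, 30, 20], [0.5, 0.3, 0.2])]
far = [([90, 5, 5], [0.5, 0.3, 0.2])]
print(posterior_percentiles(close))
print(posterior_percentiles(far))
```

Counts that track the reference frequencies concentrate the posterior near zero, while divergent counts shift it upward. An informative prior could be included by adding its log density to `log_post` at each grid point.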
Year: 2014 PMID: 26460400 PMCID: PMC4223938 DOI: 10.1111/ahg.12081
Source DB: PubMed Journal: Ann Hum Genet ISSN: 0003-4800 Impact factor: 1.670
Figure 1. Countries of origin of the individuals included in the study, coloured according to the population that provides the best fit according to the indirect method (see text). White indicates countries represented by fewer than five individuals.
Number of alleles typed per locus and population. IC1-6 correspond to populations: Caucasian (IC1), Black African/Caribbean (IC3), South Asian (IC4), East/South-East Asian (IC5), and Middle Eastern/North African (IC6).
| Locus | IC1 | IC2 | IC3 | IC4 | IC5 | IC6 | Total |
|---|---|---|---|---|---|---|---|
| D3S1358 | 7013 | 162 | 5200 | 704 | 625 | 226 | 13930 |
| TH01 | 6953 | 158 | 5177 | 694 | 624 | 226 | 13832 |
| D21S11 | 7006 | 162 | 5198 | 704 | 624 | 225 | 13919 |
| D18S51 | 6944 | 157 | 5180 | 704 | 626 | 226 | 13837 |
| D16S539 | 6951 | 162 | 5183 | 694 | 626 | 226 | 13842 |
| VWA | 7013 | 162 | 5194 | 704 | 626 | 226 | 13925 |
| D8S1179 | 7007 | 162 | 5200 | 704 | 626 | 226 | 13925 |
| FGA | 6988 | 162 | 5196 | 700 | 626 | 226 | 13898 |
| D19S433 | 6836 | 158 | 5122 | 687 | 621 | 226 | 13650 |
| D2S1338 | 6575 | 152 | 4995 | 667 | 620 | 220 | 13229 |
| D22S1045 | 1822 | 56 | 3478 | 523 | 506 | 162 | 6547 |
| D1S1656 | 1835 | 56 | 3509 | 528 | 511 | 162 | 6601 |
| D10S1248 | 1823 | 56 | 3497 | 516 | 506 | 118 | 6516 |
| D2S441 | 1808 | 56 | 3458 | 521 | 501 | 160 | 6504 |
| D12S391 | 1869 | 56 | 3531 | 551 | 507 | 162 | 6676 |
| SE33 | 376 | 4 | 1039 | 308 | 396 | 140 | 2263 |
Figure 2. FST posterior 95% interval using: (red) a beta prior with median 2.3% and 95% CI (0.26%, 8.0%); (blue) the uniform prior. Sample sizes are shown on the x-axis. The horizontal line marks the FST value used to simulate the data. The vertical lines indicate the 95% equal-tailed CI, and medians are indicated with horizontal segments.
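Data of the kind described for Figure 2 can be simulated from the same multinomial-Dirichlet model. The sketch below is one way such a simulation might look, not the authors' code: the FST value, reference frequencies and sample size are hypothetical placeholders, since the simulated value is not given in the caption.

```python
import random

def simulate_subpop_freqs(ref_freqs, fst, rng):
    """Draw subpopulation allele frequencies from Dirichlet(p_i * (1 - F) / F).

    Uses the standard construction: independent Gamma(alpha_i, 1) draws,
    normalised to sum to one.
    """
    a = (1.0 - fst) / fst
    draws = [rng.gammavariate(p * a, 1.0) for p in ref_freqs]
    total = sum(draws)
    return [g / total for g in draws]

def simulate_allele_counts(ref_freqs, fst, n_alleles, rng):
    """Sample n_alleles alleles from one simulated subpopulation at one locus."""
    q = simulate_subpop_freqs(ref_freqs, fst, rng)
    counts = [0] * len(ref_freqs)
    for allele in rng.choices(range(len(ref_freqs)), weights=q, k=n_alleles):
        counts[allele] += 1
    return counts

rng = random.Random(1)                 # fixed seed for reproducibility
HYPOTHETICAL_FST = 0.02                # placeholder, not the value used in the study
ref = [0.5, 0.3, 0.2]                  # placeholder reference allele frequencies
print(simulate_allele_counts(ref, HYPOTHETICAL_FST, 40, rng))
```

Repeating the draw for increasing `n_alleles` and passing the counts to a posterior calculation would show how the interval narrows with sample size, and how much an informative prior matters when the sample is small.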
Figure 3. FST posterior densities (solid lines) using the direct method, given a uniform prior (blue) and an informative beta prior (red). Dotted red lines show the beta prior density. The subpopulations analysed are (left) Iran and (right) Afghanistan, with the reference populations being EA6 (Middle East/North Africa) and EA4 (South Asia), respectively.
Posterior 95% intervals for locus effect parameters using the indirect method. The analysis used all 7121 individuals with IC1 through IC6 treated as six subpopulations
| Locus | 2.5 percentile | 97.5 percentile | Locus | 2.5 percentile | 97.5 percentile |
|---|---|---|---|---|---|
| D3 | −1.72 | −0.2 | D19 | −0.62 | 0.62 |
| TH01 | 0.11 | 1.58 | D2S1338 | −0.59 | 0.62 |
| D21 | −0.85 | 0.45 | D22 | −0.06 | 1.32 |
| D18 | −0.79 | 0.38 | D1 | −0.7 | 0.52 |
| D16 | −1.3 | 0.15 | D10 | −0.87 | 0.6 |
| vWA | −0.93 | 0.42 | D2S441 | −0.21 | 1.15 |
| D8 | −0.73 | 0.6 | D12 | −0.71 | 0.56 |
| FGA | −1.04 | 0.23 | | | |
The 2.5, 50, and 97.5 posterior percentiles of FST (expressed as %). Subpopulations were compared both individually with the reference population EA1 (direct method, 10 loci) and analysed jointly to infer ancestral allele fractions (indirect method, 15 loci). n denotes the sample size (number of individuals).
| IC1 subpopulation | n | Direct 2.5 | Direct 50 | Direct 97.5 | Indirect 2.5 | Indirect 50 | Indirect 97.5 |
|---|---|---|---|---|---|---|---|
| Eire | 1949 | 0.1 | 0.2 | 0.2 | 0.0 | 0.0 | 0.1 |
| Great Britain | 1416 | 0.1 | 0.1 | 0.1 | 0.0 | 0.0 | 0.0 |
| Eastern Europe | 61 | 0.2 | 0.5 | 1.0 | 0.1 | 0.3 | 0.7 |
| Northern Europe | 45 | 0.0 | 0.3 | 0.8 | 0.0 | 0.2 | 0.5 |
| Southern Europe | 60 | 0.0 | 0.2 | 0.5 | 0.0 | 0.1 | 0.3 |
| Western Europe | 13 | 0.1 | 0.7 | 2.1 | 0.0 | 0.5 | 1.8 |
| Anglo New World | 13 | 0.1 | 0.5 | 1.7 | 0.0 | 0.3 | 1.4 |
| Latin America | 25 | 0.5 | 1.3 | 2.4 | 0.6 | 1.3 | 2.4 |
The 2.5, 50, and 97.5 posterior percentiles of FST (expressed as %). Subpopulations were compared both individually with the reference population EA3 (direct method, 10 loci) and analysed jointly to infer ancestral allele fractions (indirect method, 15 loci). n denotes the sample size (number of individuals).
| IC3 subpopulation | n | Direct 2.5 | Direct 50 | Direct 97.5 | Indirect 2.5 | Indirect 50 | Indirect 97.5 |
|---|---|---|---|---|---|---|---|
| Ghana | 214 | 0.8 | 1.1 | 1.6 | 0.2 | 0.3 | 0.5 |
| Jamaica | 166 | 0.5 | 0.7 | 1.0 | 0.0 | 0.1 | 0.2 |
| Kenya | 51 | 0.7 | 1.2 | 1.9 | 0.8 | 1.3 | 1.9 |
| Nigeria | 444 | 0.9 | 1.2 | 1.5 | 0.2 | 0.3 | 0.3 |
| Sierra Leone | 41 | 0.7 | 1.3 | 2.2 | 0.1 | 0.3 | 0.8 |
| Uganda | 63 | 0.3 | 0.5 | 1.0 | 0.0 | 0.2 | 0.4 |
| Unknown IC3 | 864 | 0.4 | 0.5 | 0.7 | 0.0 | 0.0 | 0.0 |
| Other Caribbean | 20 | 0.5 | 1.5 | 2.9 | 0.1 | 0.4 | 1.3 |
| Other C/S Africa | 55 | 0.3 | 0.6 | 1.1 | 0.0 | 0.1 | 0.3 |
| Other E Africa | 66 | 0.3 | 0.7 | 1.1 | 0.0 | 0.1 | 0.4 |
| Other W Africa | 48 | 0.1 | 0.5 | 1.0 | 0.0 | 0.1 | 0.3 |
The 2.5, 50, and 97.5 posterior percentiles of FST (expressed as %). Subpopulations were compared both individually with the reference population EA4 (direct method, 10 loci) and analysed jointly to infer ancestral allele fractions (indirect method, 15 loci). n denotes the sample size (number of individuals).
| IC4 subpopulation | n | Direct 2.5 | Direct 50 | Direct 97.5 | Indirect 2.5 | Indirect 50 | Indirect 97.5 |
|---|---|---|---|---|---|---|---|
| Afghanistan | 47 | 0.1 | 0.3 | 0.9 | 0.1 | 0.4 | 0.9 |
| Bangladesh | 53 | 0.1 | 0.4 | 0.9 | 0.0 | 0.1 | 0.4 |
| India | 49 | 0.0 | 0.3 | 0.8 | 0.0 | 0.1 | 0.4 |
| Pakistan | 60 | 0.0 | 0.2 | 0.5 | 0.0 | 0.2 | 0.5 |
| Unknown IC4 | 76 | 0.0 | 0.2 | 0.5 | 0.0 | 0.1 | 0.2 |
The 2.5, 50, and 97.5 posterior percentiles of FST (expressed as %). Subpopulations were compared both individually with the reference population EA5 (direct method, 10 loci) and analysed jointly to infer ancestral allele fractions (indirect method, 15 loci). n denotes the sample size (number of individuals).
| IC5 subpopulation | n | Direct 2.5 | Direct 50 | Direct 97.5 | Indirect 2.5 | Indirect 50 | Indirect 97.5 |
|---|---|---|---|---|---|---|---|
| NE Asia | 260 | 0.1 | 0.2 | 0.3 | 0.1 | 0.4 | 0.8 |
| SE Asia | 44 | 0.0 | 0.2 | 0.7 | 0.0 | 0.1 | 0.4 |
The 2.5, 50, and 97.5 posterior percentiles of FST (expressed as %). Subpopulations were compared both individually with the reference population EA6 (direct method, 10 loci) and analysed jointly to infer ancestral allele fractions (indirect method, 15 loci). n denotes the sample size (number of individuals).
| IC6 subpopulation | n | Direct 2.5 | Direct 50 | Direct 97.5 | Indirect 2.5 | Indirect 50 | Indirect 97.5 |
|---|---|---|---|---|---|---|---|
| Iran | 12 | 0.1 | 0.9 | 2.4 | 0.1 | 0.9 | 2.7 |
| Iraq | 28 | 0.0 | 0.2 | 0.7 | 0.0 | 0.2 | 0.7 |
| Somalia | 494 | 1.1 | 1.3 | 1.7 | 1.2 | 1.6 | 2.1 |
| Turkey | 20 | 0.1 | 0.5 | 1.6 | 0.2 | 0.9 | 2.1 |
| Middle East | 24 | 0.1 | 0.7 | 1.8 | 0.1 | 0.5 | 1.6 |
| N Africa | 26 | 0.2 | 0.7 | 1.7 | 0.1 | 0.6 | 1.5 |
Posterior median FST (%) for fringe subpopulations. These are subpopulations for which another reference population gives a median FST estimate using the direct method within 0.001 of the lowest (best fit) value. Columns are reference populations.
| Fringe subpopulation | EA1 | EA3 | EA4 | EA5 | EA6 |
|---|---|---|---|---|---|
| Afghanistan | 1.17 | 2.90 | 0.78 | 1.87 | 0.78 |
| Kenya | 2.32 | 1.39 | 2.51 | 2.32 | 1.36 |
| Southern Europe | 0.30 | 2.99 | 1.20 | 2.03 | 0.34 |
| Unknown IC4 | 1.68 | 2.80 | 0.62 | 1.17 | 0.72 |
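The fringe criterion can be sketched as a small lookup over the medians in the table above. The function and variable names are hypothetical. The tolerance of 0.001 is written as 0.1 percentage points because the table is in %. The tabulated values are rounded to two decimals, so a case that sits exactly on the tolerance (Unknown IC4) is handled with a small numerical allowance; that allowance is an assumption of this sketch.

```python
FRINGE_TOLERANCE_PCT = 0.1   # 0.001 on the proportion scale, in percentage points

# Posterior medians (%) copied from the fringe subpopulation table.
medians_pct = {
    "Afghanistan":     {"EA1": 1.17, "EA3": 2.90, "EA4": 0.78, "EA5": 1.87, "EA6": 0.78},
    "Kenya":           {"EA1": 2.32, "EA3": 1.39, "EA4": 2.51, "EA5": 2.32, "EA6": 1.36},
    "Southern Europe": {"EA1": 0.30, "EA3": 2.99, "EA4": 1.20, "EA5": 2.03, "EA6": 0.34},
    "Unknown IC4":     {"EA1": 1.68, "EA3": 2.80, "EA4": 0.62, "EA5": 1.17, "EA6": 0.72},
}

def near_best_references(medians, tol=FRINGE_TOLERANCE_PCT, eps=1e-9):
    """Return the reference populations whose median is within tol of the lowest.

    eps absorbs floating-point error for differences that equal tol exactly.
    """
    best = min(medians.values())
    return sorted(ref for ref, m in medians.items() if m - best <= tol + eps)

for subpop, row in medians_pct.items():
    print(subpop, near_best_references(row))
```

A subpopulation counts as fringe when this list contains more than one reference population.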
Posterior median FST (%). Populations IC1-6 were compared to each reference population in turn using the direct method (columns EA1 to EA6). The indirect method was used to compare each population to a hypothetical global ancestral population. n denotes the sample size (number of individuals).
| Population | n | EA1 | EA3 | EA4 | EA5 | EA6 | Indirect |
|---|---|---|---|---|---|---|---|
| IC1 | 3582 | 0.4 | 3.1 | 1.9 | 1.9 | 0.9 | 2.7 |
| IC3 | 2032 | 1.7 | 0.7 | 1.7 | 1.4 | 1.1 | 1.0 |
| IC4 | 285 | 1.4 | 3.1 | 0.7 | 1.3 | 0.8 | 2.3 |
| IC5 | 304 | 3.1 | 4.2 | 2.4 | 0.5 | 2.0 | 3.3 |
| IC6 | 604 | 1.8 | 1.7 | 1.9 | 1.7 | 0.9 | 1.4 |
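As a reading aid for the table above, the best-fitting reference population for each population is the one with the lowest direct-method median. The short sketch below uses the tabulated medians; the names are hypothetical and the code is only a lookup, not part of the estimation procedure.

```python
REFERENCES = ["EA1", "EA3", "EA4", "EA5", "EA6"]

# Direct-method posterior medians (%) copied from the table, one row per population.
direct_medians_pct = {
    "IC1": [0.4, 3.1, 1.9, 1.9, 0.9],
    "IC3": [1.7, 0.7, 1.7, 1.4, 1.1],
    "IC4": [1.4, 3.1, 0.7, 1.3, 0.8],
    "IC5": [3.1, 4.2, 2.4, 0.5, 2.0],
    "IC6": [1.8, 1.7, 1.9, 1.7, 0.9],
}

def best_reference(population):
    """Reference population with the lowest direct-method median for `population`."""
    row = direct_medians_pct[population]
    return REFERENCES[row.index(min(row))]

best_fit = {pop: best_reference(pop) for pop in direct_medians_pct}
largest_median = max(max(row) for row in direct_medians_pct.values())
print(best_fit)
print(largest_median)
```

Each population is best matched by the reference population with the same number, and the largest median in the table (IC5 against EA3) is 4.2%.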