Hanna Larsson1, Thomas Källman, Niclas Gyllenstrand, Martin Lascoux.
Abstract
The site frequency spectrum of mutations (SFS) and linkage disequilibrium (LD) are the two major sources of information in population genetics studies. In this study we focus on the levels of LD and the SFS and on the effect of sample size on summary statistics in 10 Scandinavian populations of Norway spruce. We found that previous estimates of a low level of LD were highly influenced by both sampling strategy and the fact that data from multiple loci were analyzed jointly. Estimates of LD were in fact heterogeneous across loci and increased within individual populations compared with the estimate from the total data. The variation in levels of LD among populations most likely reflects different demographic histories, although we were unable to detect population structure by using standard approaches. As in previous studies, we also found that the SFS-based test Tajima's D was highly sensitive to sample size, revealing that care should be taken before drawing strong conclusions from this test when sample size is small. In conclusion, the results from this study are in line with recent studies in other conifers that have revealed a more complex and variable pattern of LD than earlier studies suggested and with studies in trees and humans that suggest that Tajima's D is sensitive to sample size. This has large consequences for the design of future association and population genetic studies in Norway spruce.
Keywords: Tajima’s D; conifer; linkage disequilibrium; recombination; resampling
Year: 2013 PMID: 23550126 PMCID: PMC3656727 DOI: 10.1534/g3.112.005462
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Map of Scandinavia with the locations of sampled populations.
Location of the populations used in this study and sample size
| Population | Name | Latitude | Longitude | No. Sampled Individuals |
|---|---|---|---|---|
| Saleby | SE-58 | 58° 36′N | 13° 12′E | 8 |
| SörAmsberg | SE-60 | 60° 45′N | 15° 42′E | 8 |
| Fulufjället | SE-61 | 61° 57′N | 12° 78′E | 24 |
| Strängsund | SE-62 | 62° 63′N | 15° 12′E | 8 |
| Höglunda | SE-64 | 64° 08′N | 18° 74′E | 24 |
| Jock/Erkinvinsa | SE-66 | 66° 58′N | 22° 70′E | 8 |
| Punkaharju | FI-61 | 61° 72′N | 29° 39′E | 8 |
| Vilpuula | FI-62 | 62° 02′N | 24° 63′E | 8 |
| St2 | FI-66 | 66° 24′N | 26° 53′E | 8 |
| Sodankylä | FI-67 | 67° 41′N | 26° 62′E | 24 |
Nucleotide diversity and summary statistics for the 11 loci used to estimate long-range LD and structure in populations of P. abies
| Gene | N | Length of Amplicon, bp | Bp Sequenced | S (Singletons) | H | Hd | θw | π | Tajima’s D |
|---|---|---|---|---|---|---|---|---|---|
| PaAP2L3 | 74 | 4681 | 457 | 5 (1) | 7 | 0.58 | 2.2 | 1.8 | −0.42 |
| PaCDF1 | 107 | 1585 | 1028 | 23 (4) | 22 | 0.92 | 4.3 | 2.9 | −0.94 |
| PaCOL1 | 81 | 2970 | 2449 | 64 (26) | 39 | 0.97 | 5.3 | 3.3 | −1.22 |
| PaMFT1 | 96 | 4328 | 1597 | 62 (34) | 54 | 0.95 | 7.6 | 3.3 | −1.81 |
| PaFTL1 | 109 | 2742 | 748 | 14 (4) | 14 | 0.82 | 3.6 | 3.3 | −0.18 |
| PaCCA1 | 88 | 4126 | 742 | 24 (5) | 21 | 0.90 | 6.4 | 4.1 | −1.1 |
| PaPRR7 | 93 | 7271 | 1796 | 31 (21) | 23 | 0.88 | 3.4 | 1.6 | −1.65 |
| PaPRR1 | 114 | 1859 | 986 | 25 (8) | 20 | 0.89 | 4.8 | 4.8 | 0.02 |
| PaWS02746 | 97 | 4411 | 470 | 34 (13) | 40 | 0.96 | 14.1 | 12 | −0.43 |
| PaWS02749 | 100 | 3189 | 605 | 53 (20) | 23 | 0.82 | 16.9 | 10.5 | −1.21 |
| PaZIP | 113 | 4107 | 803 | 21 (6) | 15 | 0.74 | 4.9 | 3.6 | −0.78 |
LD, linkage disequilibrium; N, sample size; S, number of segregating sites; H, number of observed haplotypes; Hd, observed haplotype diversity; θw, Watterson’s estimate of θ (×10⁻³); π, average nucleotide diversity (×10⁻³).
Significant deviation from the standard neutral model.
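The relationship between the columns S, θw, π, and Tajima’s D in the table above can be sketched as follows. This is a minimal illustration using Tajima's (1989) standard formulas, not the authors' own pipeline; the function names are hypothetical, and the example assumes that N counts sequences and that θw and π are per-site values scaled by 10⁻³ over the "Bp Sequenced" column.

```python
import math

def watterson_theta(S, n):
    """Watterson's estimator for the whole locus: S / a1, with a1 = sum_{i=1}^{n-1} 1/i."""
    a1 = sum(1.0 / i for i in range(1, n))
    return S / a1

def tajimas_d(n, S, pi):
    """Tajima's D from n sequences, S segregating sites and pi,
    the mean number of pairwise differences over the whole locus."""
    if n < 3 or S == 0:
        return float("nan")  # statistic is undefined in these cases
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))

# PaAP2L3 row of the table: N = 74, 457 bp sequenced, S = 5, pi = 1.8e-3 per site.
n, S, bp = 74, 5, 457
pi_per_site = 1.8e-3  # rounded value as printed in the table
theta_w_per_site = watterson_theta(S, n) / bp
d = tajimas_d(n, S, pi_per_site * bp)
print(f"theta_w per site = {theta_w_per_site:.4f}, Tajima's D = {d:.2f}")
```

Because π is taken from a rounded table entry, the recomputed D can only be expected to agree approximately with the tabulated value.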
Figure 2. A schematic representation of the eleven genes amplified in this study. The regions sequenced and analyzed are indicated underneath each gene (see Legend).
Summary sequence statistics for the 11 loci within the populations SE-61, SE-64, and FI-67
| Gene | Pop | N | Bp | S | Hd | θw | π | TajD | Resampling: mean π (min, max) | Resampling: mean TajD (min, max) |
|---|---|---|---|---|---|---|---|---|---|---|
| PaAP2L3 | SE-61 | 16 | 519 | 5 (2) | 0.69 | 2.9 | 2.4 | −0.53 | 1.2 (0.5, 1.6) | −0.28 (−1.0, 0.3) |
| | SE-64 | 7 | 544 | 4 (0) | 0.67 | 3 | 3.9 | −1.35 | n.a. | n.a. |
| | FI-67 | 16 | 811 | 7 (0) | 0.77 | 2.6 | 3.3 | 0.99 | 2.5 (1.4, 3.1) | 0.57 (−0.7, 1.4) |
| PaCDF1 | SE-61 | 20 | 1179 | 15 (7) | 0.94 | 3.6 | 2.6 | −1.03 | 2.0 (1.3, 2.8) | −0.51 (−1.3, 0.5) |
| | SE-64 | 18 | 1506 | 11 (2) | 0.84 | 2.1 | 2.4 | 0.47 | 2.2 (1.4, 2.7) | 0.13 (−0.5, 1.2) |
| | FI-67 | 16 | 1330 | 14 (6) | 0.94 | 3.2 | 2.6 | −0.76 | 1.5 (0.8, 2.3) | −0.93 (−1.6, −0.6) |
| PaCOL1 | SE-61 | 20 | 2495 | 40 (15) | 0.97 | 4.5 | 3.3 | −1.11 | 3.2 (2.9, 3.5) | −0.96 (−1.4, −0.3) |
| | SE-64 | 14 | 2527 | 30 (15) | 0.96 | 3.7 | 3.3 | −0.54 | 3.2 (2.5, 4.1) | −0.22 (−0.7, 0.3) |
| | FI-67 | 11 | 2492 | 23 (4) | 0.93 | 3.2 | 3.5 | 0.51 | 3.5 (3.1, 3.8) | 0.42 (−0.3, 1.5) |
| PaMFT1 | SE-61 | 21 | 1670 | 27 (15) | 0.96 | 4.5 | 3.6 | −0.82 | 3.1 (1.8, 3.9) | −0.18 (−0.8, 0.4) |
| | SE-64 | 14 | 1674 | 23 (13) | 0.99 | 4.3 | 3.9 | −0.4 | 3.2 (2.7, 3.6) | 0.09 (−0.4, 1.0) |
| | FI-67 | 14 | 1676 | 15 (7) | 0.93 | 2.8 | 2.8 | −0.1 | 2.7 (1.9, 6.2) | 0.23 (−0.4, 1.5) |
| PaFTL1 | SE-61 | 20 | 784 | 8 (1) | 0.77 | 2.9 | 2.7 | −0.18 | 1.6 (1.2, 2.0) | −0.2 (−1.0, 0.5) |
| | SE-64 | 19 | 865 | 11 (4) | 0.89 | 3.6 | 3 | −0.62 | 1.7 (1.0, 2.6) | −0.49 (−1.3, 0.7) |
| | FI-67 | 17 | 1206 | 14 (6) | 0.82 | 3.4 | 3 | −0.52 | 2.3 (1.1, 3.8) | −0.38 (−0.9, 0.5) |
| PaCCA1 | SE-61 | 20 | 992 | 16 (6) | 0.89 | 4.6 | 3 | −1.23 | 2.1 (1.3, 2.9) | −0.85 (−1.3, 0.2) |
| | SE-64 | 15 | 900 | 14 (5) | 0.93 | 4.8 | 3.7 | −0.94 | 2.1 (1.3, 2.7) | −1.01 (−1.5, 0.2) |
| | FI-67 | 15 | 896 | 17 (11) | 0.93 | 5.8 | 3.4 | −1.69 | 1.8 (1.1, 2.8) | −1.29 (−1.8, −0.4) |
| PaPRR7 | SE-61 | 18 | 2477 | 18 (9) | 0.95 | 2.1 | 1.6 | −0.95 | 1.2 (0.9, 1.7) | −0.45 (−1.4, 0.2) |
| | SE-64 | 19 | 2513 | 15 (8) | 0.9 | 1.7 | 1.2 | −1.06 | 1.5 (0.6, 1.8) | 0.00 (−0.8, 0.4) |
| | FI-67 | 18 | 3178 | 24 (13) | 0.94 | 2.2 | 1.6 | −1.07 | 1.2 (1.0, 1.4) | −0.63 (−1.0, −0.2) |
| PaPRR1 | SE-61 | 22 | 1068 | 21 (10) | 0.94 | 5.4 | 4.6 | −0.53 | 4.1 (2.4, 5.4) | 0.22 (−0.3, 1.1) |
| | SE-64 | 18 | 1054 | 18 (7) | 0.86 | 5 | 5.6 | 0.49 | 4.0 (2.1, 5.8) | −0.01 (−1.1, 0.9) |
| | FI-67 | 21 | 1035 | 16 (4) | 0.89 | 4.3 | 5 | 0.61 | 3.6 (2.7, 4.8) | 0.61 (−0.1, 1.5) |
| PaWS02746 | SE-61 | 18 | 515 | 26 (12) | 0.98 | 15 | 14 | −0.23 | 3.8 (2.8, 6.5) | −0.33 (−0.8, 0.6) |
| | SE-64 | 15 | 513 | 19 (7) | 0.97 | 11 | 12 | 0.14 | 4.3 (3.1, 5.9) | 0.47 (−1.1, 1.5) |
| | FI-67 | 15 | 959 | 29 (12) | 0.99 | 9.3 | 10 | 0.37 | 7.4 (6.3, 9.3) | 0.37 (−1.0, 1.0) |
| PaWS02749 | SE-61 | 19 | 641 | 28 (14) | 0.73 | 13 | 8.3 | −1.31 | 4.4 (1.8, 6.4) | −1.04 (−1.8, −0.5) |
| | SE-64 | 14 | 735 | 28 (13) | 0.92 | 12 | 9.9 | −0.75 | 5.1 (3.0, 8.6) | −0.58 (−1.5, 0.5) |
| | FI-67 | 16 | 847 | 28 (17) | 0.87 | 10 | 8.3 | −0.69 | 5.2 (4.5, 6.3) | −0.30 (−0.8, 0.2) |
| PaZIP | SE-61 | 22 | 1123 | 14 (3) | 0.68 | 3.4 | 3.6 | 0.17 | 2.6 (1.4, 4.5) | −0.53 (−1.8, 1.2) |
| | SE-64 | 22 | 932 | 14 (5) | 0.8 | 4.1 | 3.3 | −0.68 | 2.3 (1.5, 3.2) | −0.57 (−1.7, 1.4) |
| | FI-67 | 16 | 900 | 21 (12) | 0.78 | 4 | 5.2 | 1.12 | 3.1 (2.6, 3.9) | 1.09 (−0.3, 2.0) |
N, sample size; S, number of segregating sites; Hd, observed haplotype diversity; θw, Watterson’s estimate of θ (×10⁻³); π, average pairwise distance (×10⁻³); TajD, Tajima’s D; n.a., not calculated due to low sample size.
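The resampling columns report a mean with (min, max) across resampled subsets. The resampling scheme itself (subset size, number of replicates) is not given in this record, so the following is only a sketch under stated assumptions: sequences are drawn without replacement, the subset size `k` and replicate count `reps` are arbitrary choices, and `HAPS` is a made-up toy alignment coded 0/1, not data from the study.

```python
import math
import random

def tajimas_d(n, S, pi):
    """Tajima's D from n sequences, S segregating sites, pi = mean pairwise differences."""
    if n < 3 or S == 0:
        return float("nan")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    c1 = (n + 1) / (3.0 * (n - 1)) - 1.0 / a1
    c2 = (2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
          - (n + 2) / (a1 * n) + a2 / (a1 * a1))
    e1, e2 = c1 / a1, c2 / (a1 * a1 + a2)
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))

def pi_and_s(haps):
    """Return (pi, S) for a list of equal-length 0/1 haplotypes."""
    n = len(haps)
    pi, s = 0.0, 0
    for j in range(len(haps[0])):
        k = sum(h[j] for h in haps)           # count of allele '1' at site j
        if 0 < k < n:                         # site is segregating in this sample
            s += 1
            pi += 2.0 * k * (n - k) / (n * (n - 1))
    return pi, s

def resample_d(haps, k, reps, seed=0):
    """Mean, min and max of Tajima's D over `reps` random subsets of size k."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        subset = rng.sample(haps, k)          # draw without replacement (assumption)
        pi, s = pi_and_s(subset)
        d = tajimas_d(k, s, pi)
        if not math.isnan(d):
            values.append(d)
    return sum(values) / len(values), min(values), max(values)

# Hypothetical toy alignment: 12 haplotypes x 6 biallelic sites.
HAPS = [
    (1, 1, 0, 0, 0, 0), (1, 0, 0, 0, 0, 0), (1, 0, 0, 1, 0, 0),
    (1, 0, 0, 1, 0, 0), (1, 0, 0, 0, 0, 0), (1, 0, 0, 0, 1, 0),
    (0, 0, 0, 0, 0, 0), (0, 0, 1, 0, 0, 0), (0, 0, 0, 1, 0, 0),
    (0, 0, 0, 0, 1, 0), (0, 0, 0, 0, 0, 0), (0, 0, 0, 0, 0, 1),
]

pi_full, s_full = pi_and_s(HAPS)
mean_d, min_d, max_d = resample_d(HAPS, k=8, reps=200, seed=1)
print(f"full sample D = {tajimas_d(len(HAPS), s_full, pi_full):.2f}")
print(f"resampled D: mean {mean_d:.2f} (min {min_d:.2f}, max {max_d:.2f})")
```

The spread between the minimum and maximum across subsets gives a direct sense of how much Tajima's D can move when only a few sequences are sampled.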
Figure 3. Within-population estimates of mean Tajima’s D across eleven loci plotted against latitude of origin. Boxes denote estimates from the number of individuals sampled from the population. The mean across eleven loci within resampled populations SE-61, SE-64 and FI-67 is plotted with circles, triangles and crosses respectively (see legend).
Figure 4. Plot of the squared correlation of allele frequencies (r²) vs. distance in base pairs across 11 loci for different subsets of populations. (Top left) all ten populations, n = 97; (top right) SE-61, n = 20; (bottom left) SE-64, n = 16; and (bottom right) FI-67, n = 16.
Mean linkage disequilibrium and recombination rate parameters estimated per locus for the merged data set of 10 populations with a mean of 97 individuals
| Gene | Informative sites | Pairwise comparisons | Sign. pairwise, % | Mean r² | r² < 0.2 (bp) | ρ | ρ/site | ρ/θ |
|---|---|---|---|---|---|---|---|---|
| PaAP2L3 | 4 | 6 | 16.7 | n.a. | n.a. | 2.04 | 0.0004 | 1.98 |
| PaCDF1 | 19 | 171 | 6.4 | 0.041 | 0 | 5.10 | 0.003 | 1.16 |
| PaCOL1 | 38 | 703 | 6.3 | 0.079 | 74 | 18.4 | 0.006 | 1.42 |
| PaMFT1 | 28 | 378 | 15 | 0.056 | 12 | 25.5 | 0.006 | 2.11 |
| PaFTL1 | 10 | 45 | 28.9 | 0.13 | 5 | 19.4 | 0.007 | 7.29 |
| PaCCA-1 | 18 | 153 | 8.5 | 0.069 | 7 | 11.2 | 0.003 | 2.36 |
| PaPRR7 | 10 | 45 | 20 | 0.097 | 88 | 8.16 | 0.001 | 1.34 |
| PaPRR1 | 17 | 136 | 30.9 | 0.17 | 96 | 3.06 | 0.002 | 0.65 |
| PaWS02746 | 19 | 171 | 21.6 | 0.14 | 118 | 21.4 | 0.005 | 3.24 |
| PaWS02749 | 31 | 465 | 11.8 | 0.12 | 59 | 11.2 | 0.004 | 1.10 |
| PaZIP | 15 | 105 | 26.7 | 0.2 | 353 | 0 | 0 | 0 |
| All Loci | 209 | 2378 | 11.4 | 0.1 | 46 | 11.4 | 0.003 | 1.81 |
n.a., not applicable.
Number of pairwise comparisons and the fraction of these that are significant.
Number of base pairs where estimated r² falls below 0.2.
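Two quantities in the table above follow simple arithmetic. The number of pairwise comparisons for k informative sites is k(k − 1)/2, and r² between two biallelic sites is D²/(p_A(1 − p_A)p_B(1 − p_B)), where D = p_AB − p_A·p_B. The sketch below assumes phased haplotypes coded 0/1, which this record does not state; the function names are hypothetical and the software used by the authors is not named here.

```python
def n_pairs(k):
    """Number of pairwise comparisons among k informative sites."""
    return k * (k - 1) // 2

def r_squared(site_a, site_b):
    """Squared allele-frequency correlation between two biallelic sites,
    each given as a list of 0/1 alleles in the same haplotype order."""
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(a & b for a, b in zip(site_a, site_b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")  # at least one site is monomorphic
    d = p_ab - p_a * p_b     # linkage disequilibrium coefficient D
    return d * d / denom

# Informative-site counts per locus, in table order (PaAP2L3 ... PaZIP).
informative = [4, 19, 38, 28, 10, 18, 10, 17, 19, 31, 15]
print(sum(informative), sum(n_pairs(k) for k in informative))
```

The per-locus counts reproduce the "All Loci" totals of 209 informative sites and 2378 pairwise comparisons, which indicates that comparisons were counted within loci only.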
Mean linkage disequilibrium and recombination rate parameters estimated per locus for each of the three populations SE-61, SE-64, and FI-67
| Gene | Pop | Informative sites | Pairwise comparisons | Sign. pairwise, % | Mean r² | r² < 0.2 (bp) | ρ | ρ/site | ρ/θ |
|---|---|---|---|---|---|---|---|---|---|
| PaAP2L3 | SE-61 | 3 | 3 | 0 | −0.16 | n.a. | 2.04 | 0.4 | 1.35 |
| | SE-64 | 4 | 6 | 0 | −0.69 | n.a. | 10.2 | 2.2 | 6.25 |
| | FI-67 | 7 | 21 | 9.5 | 0.32 | n.a. | 2.04 | 0.4 | 0.97 |
| PaCDF1 | SE-61 | 8 | 28 | 3.6 | 0.14 | n.a. | 6.12 | 3.9 | 1.45 |
| | SE-64 | 9 | 36 | 8.3 | 0.25 | n.a. | 0 | 0 | 0 |
| | FI-67 | 8 | 28 | 0 | 0.15 | n.a. | 13.3 | 8.4 | 3.14 |
| PaCOL1 | SE-61 | 25 | 300 | 0 | 0.16 | n.a. | 23.5 | 7.9 | 2.08 |
| | SE-64 | 15 | 105 | 0 | 0.29 | n.a. | 12.2 | 4.1 | 1.3 |
| | FI-67 | 19 | 171 | 0 | 0.37 | n.a. | 4.08 | 1.4 | 0.52 |
| PaMFT1 | SE-61 | 11 | 55 | 10.9 | 0.25 | n.a. | 20.4 | 4.7 | 2.72 |
| | SE-64 | 10 | 45 | 13.3 | 0.37 | n.a. | 2.04 | 0.5 | 0.28 |
| | FI-67 | 8 | 28 | 21.4 | 0.44 | n.a. | 3.06 | 0.7 | 0.65 |
| PaFTL1 | SE-61 | 7 | 21 | 9.5 | 0.29 | n.a. | 7.14 | 2.6 | 3.17 |
| | SE-64 | 7 | 21 | 4.8 | 0.2 | n.a. | 2.04 | 0.7 | 0.65 |
| | FI-67 | 8 | 28 | 14.3 | 0.34 | n.a. | 8.16 | 3 | 1.97 |
| PaCCA1-l | SE-61 | 10 | 45 | 2.2 | 0.19 | n.a. | 7.14 | 1.7 | 1.58 |
| | SE-64 | 8 | 28 | 0 | 0.2 | n.a. | 0 | 0 | 0 |
| | FI-67 | 6 | 15 | 0 | 0.29 | n.a. | 0 | 0 | 0 |
| PaPRR7 | SE-61 | 9 | 36 | 5.6 | 0.22 | n.a. | 3.06 | 0.4 | 0.58 |
| | SE-64 | 7 | 21 | 9.5 | 0.2 | n.a. | 3.06 | 0.4 | 0.71 |
| | FI-67 | 11 | 55 | 1.8 | 0.16 | n.a. | 5.1 | 0.7 | 0.73 |
| PaPRR1 | SE-61 | 11 | 55 | 12.7 | 0.29 | n.a. | 2.04 | 1.1 | 0.35 |
| | SE-64 | 11 | 55 | 23.6 | 0.52 | n.a. | 0 | 0 | 0 |
| | FI-67 | 12 | 66 | 16.7 | 0.37 | n.a. | 2.04 | 1.1 | 0.46 |
| PaWS02746 | SE-61 | 14 | 91 | 17.6 | 0.38 | n.a. | 8.16 | 1.9 | 1.08 |
| | SE-64 | 12 | 66 | 6.1 | 0.32 | n.a. | 18.4 | 4.2 | 3.14 |
| | FI-67 | 16 | 120 | 23.3 | 0.5 | n.a. | 12.2 | 2.8 | 1.37 |
| PaWS02749 | SE-61 | 14 | 91 | 11 | 0.39 | n.a. | 3.06 | 1 | 0.38 |
| | SE-64 | 15 | 105 | 0 | 0.37 | n.a. | 2.04 | 0.6 | 0.23 |
| | FI-67 | 11 | 55 | 29.1 | 0.55 | n.a. | 7.14 | 2.2 | 0.85 |
| PaZIP | SE-61 | 11 | 55 | 40 | 0.54 | n.a. | 0 | 0 | 0 |
| | SE-64 | 9 | 36 | 58.3 | 0.55 | n.a. | 0 | 0 | 0 |
| | FI-67 | 9 | 36 | 44.4 | 0.66 | n.a. | 0 | 0 | 0 |
| Mean | SE-61 | 123 | 780 | 8.6 | 0.261 | 705 | 7.5 | 2.3 | 1.34 |
| | SE-64 | 107 | 524 | 9.5 | 0.348 | 2549 | 4.5 | 1.2 | 0.88 |
| | FI-67 | 115 | 623 | 13.5 | 0.398 | 4580 | 5.2 | 1.9 | 0.94 |
n.a., not applicable.
Number of pairwise comparisons and the fraction of these that are significant.
Number of base pairs where estimated r² falls below 0.2.
Figure 5. Plot of the squared correlation of allele frequencies (r²) vs. distance in base pairs in the gene PaZIP using all populations.