John A Sved, Emilie C Cameron, A Stuart Gilchrist.
Abstract
There is a substantial literature on the use of linkage disequilibrium (LD) to estimate effective population size using unlinked loci. The Ne estimates are extremely sensitive to the sampling process, and there is currently no theory to cope with the possible biases. We derive formulae for the analysis of idealised populations mating at random with multi-allelic (microsatellite) loci. The 'Burrows composite index' is introduced in a novel way with a 'composite haplotype table'. We show that in a sample of diploid size S, the mean value of χ² or r² from the composite haplotype table is biased by a factor of 1 − 1/(2S − 1)², rather than the usual factor 1 + 1/(2S − 1) for a conventional haplotype table. But analysis of population data using these formulae leads to Ne estimates that are unrealistically low. We provide theory and simulation to show that this bias towards low Ne estimates is due to null alleles, and introduce a randomised permutation correction to compensate for the bias. We also consider the effect of introducing a within-locus disequilibrium factor to r², and find that this factor leads to a bias in the Ne estimate. However, this bias can be overcome using the same randomised permutation correction, to yield an altered r² with lower variance than the original r², and one that is also insensitive to null alleles. The resulting formulae are used to provide Ne estimates on 40 samples of the Queensland fruit fly, Bactrocera tryoni, from populations with widely divergent Ne expectations. Linkage relationships are known for most of the microsatellite loci in this species. We find that there is little difference in the estimated Ne values from using known unlinked loci as compared to using all loci, which is important for conservation studies where linkage relationships are unknown.
Year: 2013 PMID: 23894410 PMCID: PMC3720881 DOI: 10.1371/journal.pone.0069078
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Symbols used in the text.
| Ne | Effective population size |
| S | Number of diploid individuals in a sample |
| n11 | Number of genotypes in a sample with aa at first locus and bb at second locus |
| n12 | Number of aa b– genotypes where – refers to non-b allele at the second locus |
| n21 | Number of a– bb genotypes |
| n22 | Number of a– b– genotypes |
| na, nb | Number of a and b alleles respectively |
| pa, pb | Allele frequencies in gametic and composite table, = na/2S and nb/2S |
| pab | Frequency of the ab haplotype |
| D | Gametic disequilibrium coefficient = pab – papb |
| r² | Gametic correlation = D²/[pa(1 – pa)pb(1 – pb)] |
| M | Number of ab haplotypes in composite haplotype table = 4n11 + 2n12 + 2n21 + n22 |
| pab(comp) | Frequency of ab in composite haplotype table = M/4S |
| D(comp) | Disequilibrium coefficient from composite haplotype table = pab(comp) – papb |
| Δ | Burrows’ disequilibrium coefficient = 2D(comp) |
| r²(comp) | r² value from composite haplotype table = D²(comp)/[pa(1 – pa)pb(1 – pb)] |
| | Composite r² parameter = 4r²(comp) |
| | Estimate of |
| χ²(comp) | χ² calculated from composite haplotype table |
| pn | Frequency of null alleles at a locus |
| α | Half the difference between coupling and repulsion heterozygote frequencies |
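The composite haplotype table quantities defined in the symbol list can be computed directly from per-individual allele counts. The following is a minimal Python sketch, assuming each individual is coded as a pair (copies of allele a at the first locus, copies of allele b at the second locus), each 0, 1 or 2. The function name `composite_stats`, the input coding and the example sample are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def composite_stats(genotypes):
    """Composite haplotype table statistics for one pair of alleles (a, b).

    genotypes: list of (a_count, b_count) tuples, one per diploid individual,
    where each count is 0, 1 or 2. This coding is an assumption of the sketch.
    """
    S = len(genotypes)                      # number of diploid individuals
    counts = Counter(genotypes)

    # Genotype classes from the symbol list
    n11 = counts[(2, 2)]                    # aa bb
    n12 = counts[(2, 1)]                    # aa b-
    n21 = counts[(1, 2)]                    # a- bb
    n22 = counts[(1, 1)]                    # a- b-

    na = sum(a for a, _ in genotypes)       # number of a alleles
    nb = sum(b for _, b in genotypes)       # number of b alleles
    pa = na / (2 * S)
    pb = nb / (2 * S)

    M = 4 * n11 + 2 * n12 + 2 * n21 + n22   # ab haplotypes in composite table
    pab_comp = M / (4 * S)
    D_comp = pab_comp - pa * pb
    delta = 2 * D_comp                      # Burrows' coefficient

    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        raise ValueError("r2 is undefined when either allele is fixed or absent")
    r2_comp = D_comp ** 2 / denom

    return {
        "S": S, "pa": pa, "pb": pb, "M": M,
        "pab_comp": pab_comp, "D_comp": D_comp, "delta": delta,
        "r2_comp": r2_comp,
        "r2_composite_param": 4 * r2_comp,  # composite r2 parameter
    }


if __name__ == "__main__":
    # Hypothetical sample of 8 individuals, for illustration only
    sample = [(2, 2), (2, 1), (1, 2), (1, 1), (0, 0), (0, 0), (1, 0), (0, 1)]
    for name, value in composite_stats(sample).items():
        print(name, value)
```

With this coding, M is simply the sum over individuals of the product of the two allele counts, since the weights 4, 2, 2 and 1 are the products 2×2, 2×1, 1×2 and 1×1, and any genotype lacking a or b contributes zero.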
Figure 1. The composite haplotype table for a 2-allele observed sample.
Figure 2. The composite haplotype table for an example of two microsatellites from the fruit fly outbreak data set.
Observed statistics from simulations with and without incorporating single-locus disequilibrium.
| Actual | 32 | 64 | 128 | 256 | 512 | 1024 |
| | 0.00993 | 0.00511 | 0.00255 | 0.00129 | 0.00065 | 0.00032 |
| (2) | 34 | 65 | 131 | 259 | 516 | 1036 |
| (3) | 26 | 41 | 59 | 76 | 89 | 97 |
| (4) | 33 | 64 | 127 | 249 | 494 | 1025 |
| (5) | 0.01067 | 0.00598 | 0.00352 | 0.00225 | 0.00163 | 0.00133 |
| (6) | 31 | 56 | 95 | 148 | 203 | 249 |
| (7) | 35 | 68 | 134 | 265 | 523 | 1040 |
| (8) | 31 | 56 | 96 | 147 | 206 | 248 |
| (9) | 35 | 68 | 136 | 274 | 559 | 1127 |
| (10) | 0.00655 | 0.00397 | 0.00285 | 0.00231 | 0.00205 | 0.00193 |
| (11) | 0.00468 | 0.00272 | 0.00186 | 0.00146 | 0.00126 | 0.00117 |
| (12) | 0.00454 | 0.00277 | 0.00195 | 0.00153 | 0.00134 | 0.00124 |
| (13) | 0.00299 | 0.00167 | 0.00108 | 0.00081 | 0.00067 | 0.00059 |
All simulations used a sample size of S = 32.
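As a small arithmetic illustration, the two sampling factors quoted in the abstract, 1 − 1/(2S − 1)² for the composite haplotype table and 1 + 1/(2S − 1) for a conventional haplotype table, can be evaluated for the sample size S = 32 used in these simulations. The function names below are hypothetical, and the sketch deliberately stops short of an Ne estimate, because the estimating formulae are not reproduced in this section.

```python
def composite_bias_factor(S):
    """Factor 1 - 1/(2S - 1)^2 quoted for the composite haplotype table."""
    return 1 - 1 / (2 * S - 1) ** 2


def conventional_bias_factor(S):
    """Factor 1 + 1/(2S - 1) quoted for a conventional haplotype table."""
    return 1 + 1 / (2 * S - 1)


if __name__ == "__main__":
    S = 32  # diploid sample size used in the simulations
    print("composite factor:   ", composite_bias_factor(S))
    print("conventional factor:", conventional_bias_factor(S))
```

For S = 32, 2S − 1 = 63, so the composite factor differs from 1 by 1/3969, whereas the conventional factor differs from 1 by 1/63.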
Summary of Ne estimated by various procedures for East coast outbreak populations of B. tryoni, with the most likely estimate shown by ⇓.
| | S | No homozygote correction: unlinked, no permute | No homozygote correction: unlinked, permute | Homozygote correction: unlinked, permute | Homozygote correction: all loci, permute | LDNe | Likelihood significance: genotype | Likelihood significance: composite |
| Albury03 | 27 | 60 | ∞ | ∞ | ∞ | ∞ | | |
| Barooga03 | 33 | 40 | 30 | 40 | 20 | 20 | | |
| Condobolin02 | 42 | 40 | ∞ | ∞ | ∞ | ∞ | | |
| Coota02 | 43 | 110 | ∞ | 450 | 340 | 510 | | |
| Corowa02 | 22 | 20 | 120 | 180 | 100 | ∞ | | |
| Cowra | 20 | 20 | 230 | 150 | 180 | ∞ | | |
| Deniliquin02 | 40 | 30 | 40 | 40 | 30 | ∞ | | |
| Deniliquin03 | 53 | 40 | 100 | 150 | 70 | 90 | | |
| Deniliquin04 | 73 | 50 | 130 | 160 | 70 | 110 | | |
| Dubbo02 | 26 | 30 | 180 | 130 | 160 | ∞ | | |
| Forbes02 | 34 | 40 | 250 | 180 | 170 | ∞ | | |
| Grenfell02 | 31 | 130 | ∞ | ∞ | ∞ | ∞ | | |
| Hay02 | 26 | 20 | 30 | 20 | 20 | 140 | | |
| Hay03 | 28 | 40 | 230 | 120 | 50 | 80 | | |
| Henty02 | 20 | 20 | 120 | 60 | 50 | 190 | | |
| LakeCarg02 | 74 | 30 | 40 | 50 | 30 | 70 | | |
| Leeton03 | 82 | 70 | 110 | 160 | 70 | 80 | | |
| Narrandera04 | 25 | 30 | ∞ | 770 | 130 | 510 | | |
| Parkes02 | 20 | 30 | 130 | 100 | 80 | 500 | | |
| Parkes03 | 41 | 30 | 140 | 140 | 190 | 310 | | |
| Temora02 | 20 | 20 | 120 | 160 | 150 | ∞ | | |
| TheRock02 | 20 | 30 | 410 | 170 | 100 | ∞ | | |
| Tumut | 20 | 20 | 670 | 470 | 270 | ∞ | | |
| Wagga02 | 57 | 70 | 790 | ∞ | ∞ | ∞ | | |
| Wagga03 | 162 | 210 | 660 | 740 | 610 | 860 | | |
| Wahgunyah | 24 | 20 | 90 | 70 | 50 | ∞ | | |
| Wilcannia02 | 43 | 20 | 50 | 60 | 30 | 50 | | |
| Wodonga | 42 | 30 | 110 | 110 | 100 | 130 | | |
| WWyalong03 | 24 | 120 | ∞ | ∞ | 110 | ∞ | | |
| Young02 | 49 | 110 | 170 | 380 | 400 | 440 | | |
| Coffs02 | 18 | 40 | 70 | 60 | 70 | ∞ | | |
| Foster02 | 34 | 40 | ∞ | ∞ | ∞ | ∞ | | |
| Grafton03 | 29 | 40 | 290 | 280 | 510 | ∞ | | |
| Maclean02 | 34 | 50 | 600 | 280 | 360 | ∞ | | |
| NSW03 | 42 | 90 | 380 | ∞ | ∞ | ∞ | | |
| QLD03 | 42 | 70 | 430 | 290 | 530 | ∞ | | |
| Sawtell02 | 34 | 120 | ∞ | ∞ | ∞ | ∞ | | |
| SWRocks02 | 33 | 40 | ∞ | ∞ | ∞ | ∞ | | |
| Syd03 | 42 | 130 | ∞ | ∞ | ∞ | 630 | | |
| Taree03 | 30 | 40 | ∞ | ∞ | ∞ | ∞ | | |
Non-outbreak population.
Significant at 5% level.
Significant at 1% level.
Significant at 0.1% level.
Excess of homozygosity for different microsatellites.
| Rank | Microsatellite | Number of populations with homozygous excess | Out of |
| 1 | Bt2.9a | 36 | 39 |
| 2 | Bt6.1a | 33 | 36 |
| 3 | Bt15 | 36 | 40 |
| 4 | Bt4.1a | 36 | 40 |
| 5 | Bt1.7a | 35 | 40 |
| 6 | Bt2.6a | 33 | 40 |
| 7 | Bt2.6b | 31 | 38 |
| 8 | Bt3.2b | 30 | 37 |
| 9 | Bt1.6a | 31 | 39 |
| 10 | Bt32 | 30 | 39 |
| 11 | Bt10 | 30 | 40 |
| 12 | Bt7.9a | 29 | 39 |
| 13 | Bt6.12a | 27 | 40 |
| 14 | Bt5.10a | 27 | 40 |
| 15 | Bt8.5a | 26 | 40 |
| 16 | Bt11 | 25 | 40 |
| 17 | Bt7.2b | 23 | 39 |
| 18 | Bt1.1a | 20 | 40 |
| 19 | Bt9.1a | 20 | 40 |
| 20 | Bt14 | 18 | 40 |
| 21 | Bt8.6a | 18 | 40 |
| 22 | Bp78 | 18 | 40 |
| 23 | Bt17 | 17 | 40 |
| 24 | Bt4.3a | 16 | 40 |
| 25 | Bt4.6a | 15 | 38 |
| 26 | Bt6.8a | 15 | 40 |
| 27 | Bt8.12a | 15 | 40 |
| 28 | Bt6.10b | 14 | 40 |
| 29 | Bt5.8a | 9 | 38 |
Estimated Ne values for North-West population samples.
| | S | No homozygote correction: unlinked, no permute | No homozygote correction: unlinked, permute | Homozygote correction: unlinked, permute | Homozygote correction: all loci, permute | LDNe | Likelihood significance: genotype | Likelihood significance: composite |
| K-Ke2002 | 22 | 30 | 160 | 270 | 90 | ∞ | | |
| K-Ke2003 | 39 | 20 | 60 | 90 | 100 | ∞ | | |
| K-Kl2000 | 77 | 70 | 240 | 290 | 160 | 190 | | |
| K-Kl2001 | 50 | 60 | 190 | 210 | 170 | ∞ | | |
| K-Kl2002 | 44 | 30 | 60 | 100 | 70 | 80 | | |
| K-Kl2003 | 50 | 50 | ∞ | ∞ | ∞ | ∞ | | |
| K-Km2002 | 27 | 20 | 420 | 280 | 90 | 50 | | |
| N-DWN02 | 40 | 20 | 50 | 80 | 90 | 780 | | |
| N-DWN03 | 20 | 60 | ∞ | ∞ | ∞ | ∞ | | |
| N-DWN99 | 20 | ∞ | ∞ | ∞ | ∞ | ∞ | | |
| N-DWNBUSH02 | 30 | 40 | ∞ | ∞ | ∞ | ∞ | | |
| N-DWN-KTH03 | 19 | 60 | ∞ | ∞ | ∞ | ∞ | | |
| N-GOVE02 | 17 | ∞ | ∞ | ∞ | ∞ | ∞ | | |
| N-KAK02 | 40 | 40 | 80 | 120 | 120 | 440 | | |
| N-KTH03 | 20 | 30 | 100 | 230 | ∞ | ∞ | | |
| N-KTHGO02 | 28 | 80 | ∞ | 440 | 470 | ∞ | | |
| N-mDK02 | 27 | 40 | 300 | 180 | 270 | ∞ | | |
| N-mDKA02 | 20 | 80 | ∞ | ∞ | 150 | ∞ | | |
| N-mKKu03 | 36 | 30 | 100 | 120 | 80 | 200 | | |
| N-nDWN02 | 50 | 70 | 140 | 210 | 320 | ∞ | | |
| N-nDWN03 | 20 | 90 | ∞ | ∞ | ∞ | 100 | | |
| N-nKTH03 | 20 | 30 | 170 | 270 | 420 | ∞ | | |
| Q-AT02 | 21 | 40 | ∞ | ∞ | ∞ | ∞ | | |
| Q-ATH99 | 21 | 110 | ∞ | ∞ | ∞ | 340 | | |
| Q-CT00 | 23 | 140 | ∞ | ∞ | ∞ | ∞ | | |
| Q-CT99 | 17 | 50 | 90 | 280 | ∞ | ∞ | | |
| Q-LR00 | 24 | 80 | ∞ | ∞ | ∞ | 110 | | |
| Q-MB02 | 21 | 40 | ∞ | ∞ | ∞ | ∞ | | |
| Q-Qld00 | 94 | 110 | 260 | 260 | 390 | ∞ | | |
| Q-QLD01 | 55 | 70 | 280 | 280 | 630 | 300 | | |
| Q-QLD02 | 40 | 40 | 220 | 250 | 160 | ∞ | | |
| Q-QLD03 | 42 | 40 | 250 | 110 | 140 | ∞ | | |
| W-Brm01 | 21 | 20 | 30 | 40 | 30 | 80 | | |
| W-Der01 | 17 | 10 | 10 | 10 | 10 | 10 | | |
Significant at 5% level.
Significant at 1% level.
Significant at 0.1% level.