Jennifer R S Meadows, Eva K F Chan, James W Kijas.
Abstract
BACKGROUND: The success of genome-wide scans depends on the strength and magnitude of linkage disequilibrium (LD) present within the populations under investigation. High-density SNP arrays are currently in development for the sheep genome; however, little is known about the behaviour of LD in this livestock species. This study examined the behaviour of LD within five sheep populations using two LD metrics, D' and χ²'. Four economically important Australian sheep flocks, comprising three pure breeds (White Faced Suffolk, Poll Dorset, Merino) and a crossbred population (Merino × Border Leicester), were analysed along with an inbred Australian Merino museum flock.
Year: 2008 PMID: 18826649 PMCID: PMC2572059 DOI: 10.1186/1471-2156-9-61
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Genetic Diversity Within Five Sheep Populations
| Population | n | HE | AN | AR | pAR | FST vs PD | FST vs MER | FST vs MxB | FST vs EMAI |
| WFS | 84 | 0.68 | 6.90 | 7.21 | 0.26 | 0.035 | 0.051 | 0.063 | 0.257 |
| PD | 122 | 0.65 | 7.03 | 6.95 | 0.18 | | 0.072 | 0.085 | 0.259 |
| MER | 126 | 0.70 | 8.13 | 8.13 | 0.58 | | | 0.043 | 0.183 |
| MxB | 128 | 0.68 | 7.80 | 7.79 | 0.36 | | | | 0.217 |
| EMAI | 95 | 0.40 | 3.03 | 3.13 | 0.09 | | | | |
Genotypes from 28 microsatellite markers were used to estimate the following measures of genetic diversity: HE is the expected heterozygosity, or gene diversity; AN is the average number of observed alleles per locus; AR is allelic richness, a measure of diversity following rarefaction for sample size; pAR is private allelic richness, a simple measure of population distinctiveness. All measures were calculated using HP-RARE ver1.0 [29]. n is the number of individuals tested. Pair-wise estimates of FST were calculated using the program FSTAT 2.9.3.2.
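As an illustration of the first measure, expected heterozygosity at a locus is one minus the sum of squared allele frequencies. The sketch below is a minimal example under stated assumptions: the function names and the toy allele sizes are hypothetical, the small-sample correction shown is Nei's n / (n - 1) rescaling, and nothing here is claimed to reproduce the exact estimator implemented in HP-RARE.

```python
from collections import Counter


def expected_heterozygosity(alleles, unbiased=True):
    """Gene diversity at one locus from a flat list of allele copies.

    `alleles` holds every allele copy observed at the locus
    (two entries per diploid individual).
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two allele copies")
    freqs = [count / n for count in Counter(alleles).values()]
    h = 1.0 - sum(p * p for p in freqs)
    # Assumption: Nei's small-sample correction, rescaling by n / (n - 1).
    return h * n / (n - 1) if unbiased else h


def mean_heterozygosity(loci):
    """Average the per-locus estimates across a list of loci."""
    return sum(expected_heterozygosity(a) for a in loci) / len(loci)


# Hypothetical microsatellite allele sizes for four diploid animals.
locus = [150, 150, 152, 154, 152, 150, 156, 152]
print(round(expected_heterozygosity(locus, unbiased=False), 4))
print(round(expected_heterozygosity(locus), 4))
```

Averaging the per-locus values over all genotyped markers gives a population-level figure comparable in kind to the HE column, although the tabulated values come from HP-RARE rather than from this sketch.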
Figure 1. Cluster analysis of five sheep populations. Analysis of White Faced Suffolk (WFS), Poll Dorset (PD), Merino (MER), Merino × Border Leicester (MxB) and the Macarthur Merino (EMAI) using STRUCTURE v2.2 [14] reveals that the total genetic variation was explained by four sub-populations.
Figure 2. Linkage disequilibrium (χ²'). Each population is plotted in a separate panel. The absolute values of χ²' (green circles) are plotted as a function of the genetic distance separating each marker pair (cM). Note that the Y axis scale (χ²') is not the same for each population. The mean value of χ²' within defined distance bins is shown as horizontal green bars and is given in Table 2. The decay of LD modelled as a function of distance according to formula 3 is shown using black diamonds. Two significance thresholds are indicated using horizontal lines. The first represents the average χ²' value obtained between non-syntenic marker pairs (orange line), while the second represents the 5% significance threshold (red line).
Mean χ²' with Increasing Genetic Distance
| Distance bin | WFS | PD | MER | MxB | EMAI |
| 0–5 cM | 0.167 (0.076) | 0.151 (0.086) | 0.084 (0.048) | 0.120 (0.064) | 0.283 (0.199) |
| 5–10 cM | 0.129 (0.063) | 0.111 (0.056) | 0.084 (0.051) | 0.075 (0.051) | 0.192 (0.131) |
| 0–10 cM | 0.156 (0.073) | 0.139 (0.079) | 0.084 (0.048) | 0.102 (0.062) | 0.250 (0.179) |
| 10–20 cM | 0.139 (0.056) | 0.100 (0.032) | 0.072 (0.035) | 0.096 (0.054) | 0.067 (0.055) |
| 20–30 cM | 0.098 (0.030) | 0.096 (0.110) | 0.062 (0.037) | 0.060 (0.034) | 0.042 (0.030) |
| 30–40 cM | 0.095 (0.033) | 0.096 (0.033) | 0.063 (0.033) | 0.072 (0.033) | 0.028 (0.017) |
| 40–115 cM | 0.105 (0.055) | 0.096 (0.065) | 0.073 (0.032) | 0.093 (0.047) | 0.042 (0.034) |
| Non-syntenic | 0.099 (0.047) | 0.088 (0.047) | 0.073 (0.033) | 0.087 (0.047) | 0.048 (0.071) |
| Syntenic marker pairs (n) | 153 | 153 | 171 | 171 | 120 |
| Non-syntenic marker pairs (n) | 198 | 198 | 207 | 207 | 180 |
| Critical Threshold 5% | 0.141 | 0.065 | 0.151 | 0.151 | 0.053 |
| b (formula 3) | 0.802 | 1.066 | 9.015 | 4.875 | 0.239 |
Mean values for χ²' (standard deviation) were calculated following classification of marker pairs into distance bins. The number of syntenic and non-syntenic marker pairs used for the calculation of mean χ²' is given for each population. The χ²' value which corresponds to the 5% level of significance is given for each population and appears as a horizontal red line in Figure 2. The decay of LD with distance is quantified using b (formula 3).
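The binning step described in this footnote can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name and the toy marker pairs are hypothetical, and the bins are treated as half-open intervals (lower bound included, upper bound excluded), which is an assumption because the handling of boundary distances is not stated.

```python
from statistics import mean, stdev

# Non-overlapping distance bins in cM, taken from the table rows.
BINS = [(0, 5), (5, 10), (10, 20), (20, 30), (30, 40), (40, 115)]


def summarise_by_bin(pairs, bins=BINS):
    """Return {bin: (mean, sd, n)} for a list of (distance_cM, ld_value) pairs."""
    summary = {}
    for low, high in bins:
        # Assumption: a pair at exactly `high` cM belongs to the next bin.
        values = [ld for dist, ld in pairs if low <= dist < high]
        if not values:
            continue
        sd = stdev(values) if len(values) > 1 else 0.0
        summary[(low, high)] = (mean(values), sd, len(values))
    return summary


# Hypothetical syntenic marker pairs: (genetic distance in cM, LD value).
pairs = [(1.0, 0.2), (4.0, 0.4), (7.5, 0.1), (50.0, 0.05)]
for (low, high), (avg, sd, n) in summarise_by_bin(pairs).items():
    print(f"{low}-{high} cM: mean={avg:.3f} sd={sd:.3f} n={n}")
```

The combined 0–10 cM row overlaps the first two bins, so it would need a separate call with `bins=[(0, 10)]` rather than an extra entry in `BINS`.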
The Proportion of Marker Pairs in Significant LD
| Distance bin | WFS | PD | MER | MxB | EMAI |
| 0–5 cM | 13/18 (0.72) | 15/18 (0.83) | 10/18 (0.56) | 11/18 (0.61) | 12/12 (1.00) |
| 5–10 cM | 6/8 (0.75) | 6/8 (0.75) | 4/12 (0.33) | 3/12 (0.25) | 6/7 (0.86) |
| 0–10 cM | 19/26 (0.73) | 21/26 (0.81) | 14/30 (0.47) | 14/30 (0.47) | 18/19 (0.95) |
| 10–20 cM | 10/16 (0.63) | 8/16 (0.50) | 2/16 (0.13) | 4/16 (0.25) | 8/14 (0.57) |
| 20–30 cM | 6/19 (0.32) | 12/19 (0.63) | 7/24 (0.29) | 1/24 (0.04) | 3/12 (0.25) |
| 30–40 cM | 2/12 (0.17) | 6/12 (0.50) | 2/14 (0.14) | 1/14 (0.07) | 1/11 (0.09) |
| 40–115 cM | 9/80 (0.11) | 33/80 (0.41) | 3/87 (0.03) | 10/87 (0.11) | 8/64 (0.13) |
| Non-syntenic | 24/198 (0.12) | 83/198 (0.42) | 28/207 (0.14) | 19/207 (0.10) | 22/180 (0.12) |
The number of marker pairs with significant χ²' (p < 0.05) is given before the total number of marker pairs tested for each bin and population. The proportion is given in brackets.
Predictions for Genome Wide Associations
| Population | T | LD | P | mR | M (cM) | Total M |
| WFS | 0.2 | 0.39 | 0.91 | 6.08 | 0.82 | 4,268 |
| PD | 0.2 | 0.28 | 0.80 | 9.20 | 0.54 | 6,481 |
| MER | 0.2 | 0.06 | 0.25 | 52.4 | 0.10 | 35,000 |
| MxB | 0.2 | 0.11 | 0.45 | 25.4 | 0.20 | 17,500 |
| EMAI | 0.2 | 0.58 | 0.99 | 3.42 | 1.46 | 2,397 |
| WFS | 0.141 | 0.61 | 0.99 | 3 | 1.58 | 2,215 |
| PD | 0.065 | 0.78 | 1.00 | 2 | 2.51 | 1,394 |
| MER | 0.151 | 0.06 | 0.25 | 52 | 0.10 | 35,000 |
| MxB | 0.151 | 0.28 | 0.80 | 9 | 0.54 | 6,481 |
| EMAI | 0.053 | 1.00 | 1.00 | na | na | na |
In calculation 1, the threshold (T) was set either to 0.2 or to the empirically derived 5% significance threshold for each population. This allowed the value for LD to be taken from the dataset and used to calculate P. The range (R) was set to 0–5 cM in each case and mR = 5. In calculation 2, P was set to 0.95 in each case and the thresholds were the same as for calculation 1, so the same values for LD were used. This allowed the number of markers (mR) for the size range (R = 5) to be calculated. mR was converted into the required marker spacing in cM (M) and the total number of markers required for a genome scan (Total M) for each population. The calculation was not applicable (na) where LD was equal to 1.
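The two calculations can be sketched in a few lines. The formulas are not given in this record, so the sketch rests on assumptions: that each of the mR markers within the range independently has probability LD of being in useful LD, giving P = 1 − (1 − LD)^mR, and that Total M is the genome length divided by the spacing M. The function names are hypothetical, and the default genome length of 3,500 cM is an assumed value inferred from the ratio of Total M to M in the table, not one stated in the text.

```python
import math


def detection_probability(ld_fraction, markers_in_range):
    """Calculation 1: P that at least one of mR markers is in useful LD.

    Assumes independent markers, i.e. P = 1 - (1 - LD) ** mR.
    """
    return 1.0 - (1.0 - ld_fraction) ** markers_in_range


def markers_needed(ld_fraction, target_probability=0.95):
    """Calculation 2: mR required to reach the target P (None when LD = 1)."""
    if not 0.0 < ld_fraction <= 1.0:
        raise ValueError("ld_fraction must lie in (0, 1]")
    if ld_fraction == 1.0:
        return None  # corresponds to the "na" entries
    return math.log(1.0 - target_probability) / math.log(1.0 - ld_fraction)


def genome_scan(ld_fraction, range_cm=5.0, genome_cm=3500.0,
                target_probability=0.95):
    """Return (mR, spacing M in cM, Total M), or None when not applicable.

    genome_cm = 3500 is an assumed genome length, not a value from the text.
    """
    m_r = markers_needed(ld_fraction, target_probability)
    if m_r is None:
        return None
    spacing = range_cm / m_r
    return m_r, spacing, genome_cm / spacing


print(round(detection_probability(0.39, 5), 3))
print(genome_scan(7 / 12))
```

Because the tabulated LD values are rounded proportions, results computed from them can differ slightly from the table in the last digit.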