Irene van den Berg, Didier Boichard, Bernt Guldbrandtsen, Mogens S Lund.
Abstract
Sequence data are expected to increase the reliability of genomic prediction by containing causative mutations directly, especially in cases where low linkage disequilibrium between markers and causative mutations limits prediction reliability, such as across-breed prediction in dairy cattle. In practice, the causative mutations are unknown, and prediction with only variants in perfect linkage disequilibrium with the causative mutations is not realistic, leading to a reduced reliability compared to knowing the causative variants. Our objective was to investigate the potential benefits of sequence data for the prediction of genomic relationships and, consequently, for the reliability of genomic breeding values. We used sequence data from five dairy cattle breeds, and a larger number of imputed sequences for two of the five breeds. We focused on the influence of linkage disequilibrium between markers and causative mutations, and assumed that a fraction of the causative mutations was shared across breeds and had the same effect across breeds. By comparing the loss in reliability across scenarios that varied the distance between markers and causative mutations, and that used either all genome-wide markers from commercial SNP chips or only the markers closest to the causative mutations, we demonstrate the importance of using only variants very close to the causative mutations, especially for across-breed prediction. Rare variants improved prediction only if they were very close to rare causative mutations, and all causative mutations were rare. Our results show that sequence data can potentially improve genomic prediction, but careful selection of markers is essential.
Keywords: GenPred; across-breed prediction; genomic relationships; genomic selection; linkage disequilibrium; sequence data; shared data resource
Year: 2016; PMID: 27317779; PMCID: PMC4978908; DOI: 10.1534/g3.116.027730
Source DB: PubMed; Journal: G3 (Bethesda); ISSN: 2160-1836; Impact factor: 3.154
Subsets of variants
| Variants | nVariants | Overall | HOL | JER | MON | NOR | RDC |
|---|---|---|---|---|---|---|---|
| SEQ | 1,475,541 | 0.12 | 0.11 | 0.10 | 0.11 | 0.11 | 0.12 |
| IMP | 247,141 | 0.10 | — | 0.09 | — | — | 0.10 |
| 50K | 2,863 | 0.26 | 0.25 | 0.20 | 0.23 | 0.24 | 0.25 |
| HD | 44,540 | 0.27 | 0.25 | 0.20 | 0.23 | 0.24 | 0.25 |
| MAF ≤ 0.1 | 907,484 | 0.03 | 0.02 | 0.04 | 0.03 | 0.03 | 0.03 |
| MAF > 0.1 | 568,057 | 0.27 | 0.26 | 0.21 | 0.23 | 0.23 | 0.26 |
Number of variants (nVariants) and corresponding average minor allele frequencies (MAF) according to variant classes. HOL, Holstein; JER, Jersey; MON, Montbéliarde; NOR, Normande; RDC, Danish Red.
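The class summaries in this table can be illustrated with a minimal sketch, assuming genotypes are coded as 0/1/2 allele counts and that the two MAF classes are split at 0.1. The function names and the toy genotypes are hypothetical and are not taken from the study.

```python
from statistics import mean


def minor_allele_frequency(genotypes):
    """Return the MAF of one variant from 0/1/2 allele counts (one per animal)."""
    p = sum(genotypes) / (2 * len(genotypes))  # frequency of the counted allele
    return min(p, 1 - p)


def summarize_by_maf_class(variants, threshold=0.1):
    """Split variants at the MAF threshold; return (count, average MAF) per class."""
    mafs = [minor_allele_frequency(g) for g in variants]
    rare = [m for m in mafs if m <= threshold]
    common = [m for m in mafs if m > threshold]

    def summary(values):
        return (len(values), mean(values) if values else None)

    return {"rare": summary(rare), "common": summary(common)}


# Hypothetical toy data: 3 variants genotyped in 10 animals.
toy_variants = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],  # allele frequency 0.05
    [0, 1, 1, 2, 0, 1, 0, 0, 1, 0],  # allele frequency 0.30
    [2, 2, 2, 2, 2, 2, 2, 1, 1, 1],  # allele frequency 0.85, so MAF 0.15
]
toy_summary = summarize_by_maf_class(toy_variants)
print(toy_summary)
```

Because every variant falls into exactly one class, the two class counts add up to the total number of variants, as the two MAF rows of the table add up to the SEQ row.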
Figure 1. Description of the different scenarios. Scenarios varied according to the number (c) and minor allele frequency (MAF) of the causative mutations, and to the nature (from sequence or from chip), MAF, and distance to causative mutations of prediction variants.
Figure 2. Genomic relationships within and across breed, using HD markers. HOL, Holstein; JER, Jersey; MON, Montbéliarde; NOR, Normande; RDC, Danish Red.
Sharing of causative mutations between breeds
| | All variants | | | | | MAF ≤ 0.1 | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| Breed | HOL | JER | MON | NOR | RDC | HOL | JER | MON | NOR | RDC |
| HOL | 78 | | | | | 64 | | | | |
| JER | 48 | 54 | | | | 23 | 32 | | | |
| MON | 52 | 41 | 60 | | | 27 | 16 | 41 | | |
| NOR | 52 | 41 | 45 | 60 | | 26 | 16 | 19 | 39 | |
| RDC | 64 | 48 | 52 | 52 | 77 | 43 | 23 | 29 | 27 | 62 |
Percentage of simulated causative mutations that was segregating in each breed (diagonal) and shared between breeds (below diagonal) according to minor allele frequency (MAF). HOL, Holstein; JER, Jersey; MON, Montbéliarde; NOR, Normande; RDC, Danish Red.
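A lower-triangular table of this kind can be built as in the sketch below. It assumes that a mutation counts as segregating in a breed when both alleles are present there (allele frequency strictly between 0 and 1), and as shared when it segregates in both breeds. That definition, the function names, and the toy frequencies are assumptions for illustration only.

```python
def is_segregating(freq):
    """Assumed definition: both alleles are present in the breed."""
    return 0.0 < freq < 1.0


def sharing_table(freqs_by_breed):
    """Map (breed_a, breed_b) to the percentage of mutations segregating in both.

    freqs_by_breed: dict of breed -> list of allele frequencies, one per mutation,
    in the same mutation order for every breed. Pairs with breed_a == breed_b
    give the diagonal (segregating within that breed).
    """
    breeds = list(freqs_by_breed)
    n_mutations = len(freqs_by_breed[breeds[0]])
    table = {}
    for i, a in enumerate(breeds):
        for b in breeds[: i + 1]:  # diagonal and below only
            both = sum(
                is_segregating(fa) and is_segregating(fb)
                for fa, fb in zip(freqs_by_breed[a], freqs_by_breed[b])
            )
            table[(a, b)] = 100.0 * both / n_mutations
    return table


# Hypothetical allele frequencies for 4 mutations in 2 breeds.
toy_freqs = {
    "HOL": [0.2, 0.0, 0.5, 0.1],
    "JER": [0.3, 0.4, 0.0, 1.0],
}
toy_table = sharing_table(toy_freqs)
for pair, pct in toy_table.items():
    print(pair, pct)
```

An off-diagonal value can never exceed either of the two diagonal values it relates to, which matches the pattern in the table.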
Reliabilities and observed and prediction reliability factors for different sets of markers
| ncaus | Train | Val | Set | Reliability | Heritability | RFO | RFP |
|---|---|---|---|---|---|---|---|
| 50 | JER | RDC | Cc | 0.85 (0.02) | 0.79 (0.01) | — | — |
| | | | PSEQ1kb | 0.38 (0.03) | 0.75 (0.01) | 0.45 (0.04) | 0.39 (0.02) |
| | | | PSEQ10kb | 0.27 (0.03) | 0.72 (0.01) | 0.31 (0.04) | 0.25 (0.02) |
| | | | PSEQ25kb | 0.13 (0.02) | 0.71 (0.01) | 0.15 (0.03) | 0.16 (0.01) |
| | | | P50K | 0.04 (0.01) | 0.78 (0.01) | 0.05 (0.02) | 0.03 (0.00) |
| | | | P50KC | 0.09 (0.02) | 0.54 (0.02) | 0.11 (0.02) | 0.20 (0.01) |
| | | | PHD | 0.08 (0.02) | 0.78 (0.01) | 0.10 (0.03) | 0.03 (0.00) |
| | | | PHDC | 0.31 (0.04) | 0.61 (0.02) | 0.36 (0.04) | 0.34 (0.02) |
| | RDC | JER | Cc | 0.95 (0.01) | 0.78 (0.01) | — | — |
| | | | PSEQ1kb | 0.47 (0.04) | 0.70 (0.01) | 0.50 (0.04) | 0.38 (0.02) |
| | | | PSEQ10kb | 0.26 (0.03) | 0.66 (0.02) | 0.28 (0.04) | 0.23 (0.02) |
| | | | PSEQ25kb | 0.13 (0.03) | 0.61 (0.02) | 0.14 (0.03) | 0.14 (0.01) |
| | | | P50K | 0.09 (0.02) | 0.75 (0.01) | 0.10 (0.02) | 0.02 (0.00) |
| | | | P50KC | 0.14 (0.03) | 0.39 (0.02) | 0.15 (0.03) | 0.18 (0.01) |
| | | | PHD | 0.14 (0.03) | 0.77 (0.01) | 0.15 (0.03) | 0.02 (0.00) |
| | | | PHDC | 0.40 (0.04) | 0.51 (0.03) | 0.42 (0.04) | 0.33 (0.02) |
| 100 | JER | RDC | Cc | 0.84 (0.01) | 0.79 (0.01) | — | — |
| | | | PSEQ1kb | 0.36 (0.03) | 0.79 (0.01) | 0.43 (0.03) | 0.43 (0.02) |
| | | | PSEQ10kb | 0.24 (0.03) | 0.79 (0.01) | 0.29 (0.03) | 0.26 (0.01) |
| | | | PSEQ25kb | 0.08 (0.01) | 0.78 (0.01) | 0.09 (0.01) | 0.17 (0.01) |
| | | | P50K | 0.06 (0.01) | 0.79 (0.01) | 0.07 (0.01) | 0.05 (0.00) |
| | | | P50KC | 0.09 (0.02) | 0.61 (0.01) | 0.11 (0.02) | 0.21 (0.01) |
| | | | PHD | 0.11 (0.02) | 0.79 (0.01) | 0.13 (0.02) | 0.06 (0.00) |
| | | | PHDC | 0.28 (0.03) | 0.68 (0.01) | 0.34 (0.03) | 0.37 (0.01) |
| | RDC | JER | Cc | 0.92 (0.01) | 0.79 (0.01) | — | — |
| | | | PSEQ1kb | 0.47 (0.04) | 0.75 (0.01) | 0.50 (0.04) | 0.41 (0.02) |
| | | | PSEQ10kb | 0.32 (0.03) | 0.72 (0.01) | 0.35 (0.03) | 0.24 (0.01) |
| | | | PSEQ25kb | 0.12 (0.02) | 0.68 (0.01) | 0.13 (0.02) | 0.14 (0.01) |
| | | | P50K | 0.07 (0.01) | 0.76 (0.01) | 0.08 (0.01) | 0.03 (0.00) |
| | | | P50KC | 0.14 (0.02) | 0.47 (0.01) | 0.15 (0.02) | 0.19 (0.01) |
| | | | PHD | 0.14 (0.02) | 0.78 (0.01) | 0.15 (0.02) | 0.03 (0.00) |
| | | | PHDC | 0.37 (0.04) | 0.59 (0.01) | 0.40 (0.04) | 0.36 (0.01) |
| 250 | JER | RDC | Cc | 0.73 (0.02) | 0.79 (0.01) | — | — |
| | | | PSEQ1kb | 0.26 (0.02) | 0.81 (0.01) | 0.35 (0.03) | 0.47 (0.01) |
| | | | PSEQ10kb | 0.12 (0.01) | 0.80 (0.01) | 0.17 (0.02) | 0.32 (0.01) |
| | | | PSEQ25kb | 0.04 (0.01) | 0.81 (0.01) | 0.06 (0.01) | 0.23 (0.01) |
| | | | P50K | 0.04 (0.01) | 0.79 (0.01) | 0.05 (0.01) | 0.13 (0.00) |
| | | | P50KC | 0.07 (0.01) | 0.74 (0.01) | 0.09 (0.02) | 0.27 (0.01) |
| | | | PHD | 0.05 (0.01) | 0.79 (0.01) | 0.07 (0.01) | 0.13 (0.00) |
| | | | PHDC | 0.20 (0.02) | 0.76 (0.01) | 0.27 (0.03) | 0.41 (0.01) |
| | RDC | JER | Cc | 0.85 (0.01) | 0.79 (0.01) | — | — |
| | | | PSEQ1kb | 0.33 (0.03) | 0.77 (0.01) | 0.39 (0.04) | 0.42 (0.01) |
| | | | PSEQ10kb | 0.19 (0.03) | 0.75 (0.01) | 0.22 (0.03) | 0.27 (0.01) |
| | | | PSEQ25kb | 0.07 (0.01) | 0.71 (0.01) | 0.08 (0.02) | 0.17 (0.01) |
| | | | P50K | 0.08 (0.02) | 0.76 (0.01) | 0.09 (0.02) | 0.07 (0.00) |
| | | | P50KC | 0.11 (0.02) | 0.60 (0.01) | 0.13 (0.02) | 0.22 (0.01) |
| | | | PHD | 0.10 (0.02) | 0.78 (0.01) | 0.11 (0.02) | 0.07 (0.00) |
| | | | PHDC | 0.25 (0.03) | 0.66 (0.01) | 0.29 (0.03) | 0.38 (0.01) |
SEs of reliability, estimated heritability, RFO, and RFP are given in parentheses. ncaus, number of causative mutations; Train, training population; Val, validation population; Set, markers used for prediction; RFO, observed reliability factor; RFP, predicted reliability factor; JER, Jersey; RDC, Danish Red; Cc, causative mutations; PSEQ1/10/25kb, sequence intervals at 1 kb/10 kb/25 kb from the causative mutations; P50K and PHD, all markers on the 50K and HD SNP chip, respectively; P50KC and PHDC, the 50K and HD markers closest to the causative mutations, respectively.
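The tabulated RFO values appear consistent with dividing the reliability obtained with a marker set by the reliability obtained with the causative mutations (Cc) for the same training and validation pair. This reading is inferred from the numbers and is not a definition given in this record. The sketch below checks it for the block with 50 causative mutations, training on JER and validating on RDC; the function name is hypothetical.

```python
# Reliability with the causative mutations themselves (Cc row of the block).
reliability_cc = 0.85

# Marker set -> (reliability, tabulated RFO), copied from the same block.
rows = {
    "PSEQ1kb": (0.38, 0.45),
    "PSEQ10kb": (0.27, 0.31),
    "PSEQ25kb": (0.13, 0.15),
    "P50K": (0.04, 0.05),
    "P50KC": (0.09, 0.11),
    "PHD": (0.08, 0.10),
    "PHDC": (0.31, 0.36),
}


def observed_reliability_factor(rel_markers, rel_causal):
    """Assumed reading of RFO: marker-set reliability relative to the Cc reliability."""
    return rel_markers / rel_causal


ratios = {
    name: observed_reliability_factor(rel, reliability_cc)
    for name, (rel, _) in rows.items()
}

# Largest absolute difference between the recomputed ratio and the tabulated RFO.
max_gap = max(abs(ratios[name] - rows[name][1]) for name in rows)

for name, (rel, tabulated) in rows.items():
    print(f"{name:9s} ratio={ratios[name]:.3f} tabulated={tabulated:.2f}")
print(f"largest gap: {max_gap:.3f}")
```

The inputs are already rounded to two decimals, so small gaps between the recomputed ratios and the tabulated values are expected. RFP is a predicted quantity and cannot be recomputed from this table alone.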
Figure 3. Reliability factor (RFP) as a function of the distance between causative mutations and intervals in prediction scenarios for within- and across-breed prediction. HOL, Holstein; JER, Jersey; MON, Montbéliarde; NOR, Normande; RDC, Danish Red.
Figure 4. The influence of the MAF of causative mutations and markers on the reliability factor (RFP). Cc_PSEQd, no restriction on MAF for either causative mutations or markers; CcLM_PSEQd, MAF causative mutations ≤ 0.10 (no restriction for markers); Cc_PSEQdHM, no restriction for causative mutations, MAF markers > 0.10; CcLM_PSEQdHM, MAF causative mutations ≤ 0.10, MAF markers > 0.10.