Sonia E Eynard, Jack J Windig, Grégoire Leroy, Rianne van Binsbergen, Mario P L Calus.
Abstract
BACKGROUND: Relationships between individuals and inbreeding coefficients are commonly used for breeding decisions, but may be affected by the type of data used for their estimation. The proportion of variants with low Minor Allele Frequency (MAF) is larger in whole genome sequence (WGS) data than on Single Nucleotide Polymorphism (SNP) chips. Therefore, WGS data provide true relationships between individuals and may influence breeding decisions and prioritisation for conservation of genetic diversity in livestock. This study identifies differences between relationships and inbreeding coefficients estimated using pedigree, SNP or WGS data for 118 Holstein bulls from the 1000 Bull genomes project. To determine the impact of rare alleles on the estimates, we compared three scenarios of MAF restrictions: variants with a MAF higher than 5%, variants with a MAF higher than 1% and variants with a MAF between 1% and 5%.
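The three MAF restrictions can be illustrated with a minimal sketch. It assumes genotypes coded as 0/1/2 allele counts, inclusive lower bounds (as in the "≥" notation of the scenario overview) and an exclusive upper bound of 5% for the rare class; the function names and the toy data are hypothetical and not taken from the study.

```python
import numpy as np

def minor_allele_frequency(genotypes):
    """Return the MAF of each variant.

    genotypes: (variants x individuals) array of 0/1/2 allele counts.
    """
    freq = np.asarray(genotypes, dtype=float).mean(axis=1) / 2.0
    return np.minimum(freq, 1.0 - freq)  # fold onto the rarer allele

def scenario_masks(maf):
    """Boolean masks for the three MAF scenarios (bounds are an assumption)."""
    return {
        "5+": maf >= 0.05,
        "1+": maf >= 0.01,
        "1_5": (maf >= 0.01) & (maf < 0.05),
    }

# Toy data: 4 variants scored on 50 individuals (100 alleles per variant).
n_individuals = 50
genotypes = np.zeros((4, n_individuals), dtype=int)
genotypes[0, :2] = 1    # 2 copies of the counted allele: rare variant
genotypes[1, :20] = 1   # 20 copies: common variant
genotypes[2, :] = 2     # fixed for the counted allele: monomorphic
genotypes[3, :] = 2
genotypes[3, 48:] = 1   # 98 copies of the counted allele: rare after folding

maf = minor_allele_frequency(genotypes)
counts = {name: int(mask.sum()) for name, mask in scenario_masks(maf).items()}
print(counts)
```

With these bounds the "1+" set is the union of the "5+" and "1_5" sets, which is why the variant counts of the first and third scenario add up to the count of the second.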
Year: 2015 PMID: 25887220 PMCID: PMC4365517 DOI: 10.1186/s12863-015-0185-0
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Overview of the different scenarios
| Type of data | MAF restriction (%) | Number of variants |
|---|---|---|
| Pedigree | None | 0 |
| BovineSNP50 BeadChip | ≥ 5 | 41 225 |
| BovineSNP50 BeadChip | ≥ 1 | 44 548 |
| BovineSNP50 BeadChip | Between 1 and 5 | 3 323 |
| Whole genome sequence | ≥ 5 | 11 953 905 |
| Whole genome sequence | ≥ 1 | 15 871 933 |
| Whole genome sequence | Between 1 and 5 | 3 918 028 |
Figure 1. Distribution plot of the number of variants per class of MAF. Histograms of the number of segregating variants in each Minor Allele Frequency (MAF) category (116 bins) from 1% to 50%, with density curve. The left histogram shows the distribution of variants from the Bovine 50 K SNP chip; the right histogram shows the distribution of variants from whole genome sequence (WGS) data.
Hardy-Weinberg proportions analysis
| | SNP, MAF ≥ 5% | SNP, MAF ≥ 1% | SNP, MAF 1 to 5% | WGS, MAF ≥ 5% | WGS, MAF ≥ 1% | WGS, MAF 1 to 5% |
|---|---|---|---|---|---|---|
| Total variants | 41 225 | 44 548 | 3 323 | 11 953 905 | 15 871 933 | 3 918 028 |
| Departing variants | 1 633 | 1 693 | 60 | 1 105 493 | 1 196 346 | 90 853 |
| % departing variants | 3.961 | 3.800 | 1.806 | 9.248 | 7.537 | 2.319 |
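The percentages in the last row follow directly from the two rows above it. The sketch below recomputes them and adds a textbook chi-square check of Hardy-Weinberg proportions for a single variant. The column labels are inferred from the variant totals, which match the scenario overview; the chi-square test is a common choice used here for illustration only, since the test and threshold applied by the authors are not stated in this excerpt.

```python
# Counts copied from the table; labels inferred from the matching totals.
total = {
    "SNP 5+": 41225, "SNP 1+": 44548, "SNP 1_5": 3323,
    "WGS 5+": 11953905, "WGS 1+": 15871933, "WGS 1_5": 3918028,
}
departing = {
    "SNP 5+": 1633, "SNP 1+": 1693, "SNP 1_5": 60,
    "WGS 5+": 1105493, "WGS 1+": 1196346, "WGS 1_5": 90853,
}
pct_departing = {k: round(100.0 * departing[k] / total[k], 3) for k in total}
print(pct_departing)

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 d.f.) for departure from Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

# Hypothetical genotype counts: the first set is in exact proportions,
# the second has a strong deficit of heterozygotes.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(40, 20, 40))
```

A statistic above 3.841 would indicate departure at the 5% level for one degree of freedom under this illustrative test.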
Descriptive statistics (Yang method)
| Scenario | Minimum | Mean | Maximum | Variance |
|---|---|---|---|---|
| First degree relationships | | | | |
| Pedigree | 0.503 | 0.548 | 0.663 | 0.0014 |
| SNP, MAF ≥ 5% | 0.368 | 0.464 | 0.603 | 0.0026 |
| SNP, MAF ≥ 1% | 0.355 | 0.453 | 0.617 | 0.0032 |
| SNP, MAF 1 to 5% | 0.069 | 0.315 | 1.055 | 0.0367 |
| WGS, MAF ≥ 5% | 0.339 | 0.427 | 0.555 | 0.0023 |
| WGS, MAF ≥ 1% | 0.293 | 0.389 | 0.543 | 0.0033 |
| WGS, MAF 1 to 5% | 0.128 | 0.275 | 0.692 | 0.0154 |
| Second degree relationships | | | | |
| Pedigree | 0.250 | 0.302 | 0.406 | 0.0013 |
| SNP, MAF ≥ 5% | 0.100 | 0.216 | 0.440 | 0.0038 |
| SNP, MAF ≥ 1% | 0.094 | 0.209 | 0.445 | 0.0038 |
| SNP, MAF 1 to 5% | −0.022 | 0.113 | 0.517 | 0.0093 |
| WGS, MAF ≥ 5% | 0.075 | 0.200 | 0.402 | 0.0032 |
| WGS, MAF ≥ 1% | 0.059 | 0.177 | 0.382 | 0.0031 |
| WGS, MAF 1 to 5% | 0.001 | 0.105 | 0.402 | 0.0048 |
| Less-related | | | | |
| Pedigree | 0.000 | 0.056 | 0.245 | 0.0019 |
| SNP, MAF ≥ 5% | −0.135 | −0.015 | 0.382 | 0.0021 |
| SNP, MAF ≥ 1% | −0.126 | −0.015 | 0.386 | 0.0019 |
| SNP, MAF 1 to 5% | −0.112 | −0.012 | 0.432 | 0.0011 |
| WGS, MAF ≥ 5% | −0.113 | −0.013 | 0.349 | 0.0018 |
| WGS, MAF ≥ 1% | −0.092 | −0.010 | 0.321 | 0.0013 |
| WGS, MAF 1 to 5% | −0.075 | −0.001 | 0.599 | 0.0008 |
| Inbreeding coefficients | | | | |
| Pedigree | 0.000 | 0.027 | 0.163 | 0.0009 |
| SNP, MAF ≥ 5% | −0.244 | −0.009 | 0.109 | 0.0023 |
| SNP, MAF ≥ 1% | −0.234 | −0.009 | 0.108 | 0.0021 |
| SNP, MAF 1 to 5% | −0.107 | −0.014 | 0.176 | 0.0011 |
| WGS, MAF ≥ 5% | −0.215 | −0.037 | 0.068 | 0.0017 |
| WGS, MAF ≥ 1% | −0.200 | −0.060 | 0.045 | 0.0012 |
| WGS, MAF 1 to 5% | −0.273 | −0.131 | −0.021 | 0.0015 |
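For orientation, the following is a minimal sketch of a genomic relationship matrix, assuming that "Yang method" refers to the estimator of Yang et al. (2010) and that allele frequencies are estimated from the same sample. The function name and toy genotypes are hypothetical. Inbreeding coefficients are taken as the diagonal minus 1, which can be negative, as in the table.

```python
import numpy as np

def yang_grm(genotypes):
    """Genomic relationship matrix in the style of Yang et al. (2010).

    genotypes: (individuals x variants) array of 0/1/2 counts of one allele.
    All variants must be segregating in the sample (0 < p < 1).
    """
    x = np.asarray(genotypes, dtype=float)
    n_variants = x.shape[1]
    p = x.mean(axis=0) / 2.0          # sample allele frequency per variant
    het = 2.0 * p * (1.0 - p)         # expected heterozygosity per variant
    z = (x - 2.0 * p) / np.sqrt(het)  # centred and scaled genotypes
    grm = z @ z.T / n_variants        # off-diagonal relationships
    # Diagonal: 1 + mean of (x^2 - (1 + 2p) x + 2 p^2) / (2 p (1 - p))
    diag = 1.0 + ((x ** 2 - (1.0 + 2.0 * p) * x + 2.0 * p ** 2) / het).mean(axis=1)
    np.fill_diagonal(grm, diag)
    return grm

# Toy data: 4 individuals, 3 variants, each variant at frequency 0.5.
toy = np.array([
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [1, 2, 1],
])
grm = yang_grm(toy)
inbreeding = np.diag(grm) - 1.0
print(np.round(grm, 3))
print(np.round(inbreeding, 3))
```

Because each variant is weighted by 1 / (2p(1 − p)), rare alleles contribute large terms when they are shared, which is consistent with the wider ranges seen for the scenarios restricted to MAF between 1% and 5%.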
Figure 2. Linear regression plots for A, SNP and WGS against each other (Yang method). Plots of linear regressions of relationships estimated from pedigree (A ped) and genomic relationships estimated from Single Nucleotide Polymorphism (G SNP) and whole genome sequence (G WGS) data using the Yang method. Each linear regression was performed for the scenarios with Minor Allele Frequency (MAF) ≥ 5% (5+), ≥ 1% (1+) and between 1% and 5% (1_5). The first row shows the plots for scenario 5+, the second for 1+ and the third for 1_5. The first column shows the regression of G SNP on A ped, the second the regression of G WGS on A ped, and the third the regression of G WGS on G SNP. The black line is the regression line of an exact linear model (intercept = 0, slope = 1) and the red line is the actual overall regression line. The overall correlation coefficient for each regression appears in the top left corner.
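A comparison of this kind can be sketched as a simple regression of the elements of one relationship matrix on those of another. The sketch below uses only the off-diagonal elements of the upper triangle, which is an assumption; the function name and the toy matrices are hypothetical.

```python
import numpy as np

def compare_relationships(reference, other):
    """Regress off-diagonal elements of `other` on those of `reference`.

    Returns (intercept, slope, correlation).
    """
    upper = np.triu_indices_from(reference, k=1)  # each pair counted once
    x, y = reference[upper], other[upper]
    slope, intercept = np.polyfit(x, y, 1)        # highest degree first
    correlation = np.corrcoef(x, y)[0, 1]
    return intercept, slope, correlation

# Toy matrices: `other` is an exact linear transformation of `reference`.
reference = np.array([
    [1.00, 0.50, 0.25],
    [0.50, 1.00, 0.10],
    [0.25, 0.10, 1.00],
])
other = 0.8 * reference - 0.05
intercept, slope, correlation = compare_relationships(reference, other)
print(intercept, slope, correlation)
```

An intercept of 0 and a slope of 1 correspond to the black reference line in the figure; in the toy case the correlation is 1 even though the slope and intercept differ from that line, which is why both are worth reporting.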
Correlation coefficients for estimated relationships and inbreeding coefficients (Yang method)
| | | | | | | |
|---|---|---|---|---|---|---|
| | 0.450 a,b | 0.372 a,b | 0.511 a,b | 0.395 a,b | 0.595 a,b | 0.721 a,b |
| | 0.487 a,b | 0.361 a,b | 0.512 a,b | 0.392 a,b | 0.579 a,b | 0.710 a,b |
| | 0.973 a,b | 0.982 a,b | 0.979 a,b | 0.979 a,b | 0.985 a,b | 0.985 a,b |
| | 0.335 a,b | 0.351 a,b | 0.516 a,b | 0.391 a,b | 0.601 a,b | 0.723 a,b |
| | 0.212 b | 0.286 a,b | 0.514 a,b | 0.360 a,b | 0.570 a,b | 0.689 a,b |
| | 0.948 a,b | 0.967 a,b | 0.966 a,b | 0.933 a,b | 0.936 a,b | 0.946 a,b |
| | −0.162 b | 0.045 b | 0.374 a,b | 0.122 b | 0.448 a,b | 0.501 a,b |
| | −0.170 b | 0.022 b | 0.351 a,b | 0.035 b | 0.142 b | 0.198 b |
| | 0.950 a,b | 0.857 a,b | 0.676 a,b | 0.515 a,b | 0.487 a,b | 0.537 a,b |
| | 0.978 a,b | 0.995 a | 0.999 a | 0.999 a | 0.999 a | 0.999 a |
| | 0.888 a,b | 0.972 a,b | 0.989 a,b | 0.965 a,b | 0.969 a,b | 0.978 a,b |
| | 0.567 a,b | 0.587 a,b | 0.555 a,b | 0.446 a,b | 0.467 a,b | 0.588 a,b |
| | 0.503 a,b | 0.647 a,b | 0.600 a,b | 0.263 a,b | 0.185 b | 0.315 a,b |
| | 0.725 a,b | 0.661 a,b | 0.593 a,b | 0.488 a,b | 0.494 a,b | 0.611 a,b |
| | 0.844 a,b | 0.808 a,b | 0.714 a,b | 0.507 a,b | 0.423 a,b | 0.505 a,b |
a: significantly different from 0; b: significantly different from 1 (P-value ≤ 0.05).
Descriptive statistics (based on similarities)
| Scenario | Minimum | Mean | Maximum | Variance |
|---|---|---|---|---|
| First degree relationships | | | | |
| Pedigree | 0.503 | 0.548 | 0.663 | 0.0014 |
| SNP, MAF ≥ 5% | 0.815 | 0.876 | 0.974 | 0.0011 |
| SNP, MAF ≥ 1% | 0.891 | 0.949 | 1.040 | 0.0010 |
| SNP, MAF 1 to 5% | 1.686 | 1.851 | 1.939 | 0.0026 |
| WGS, MAF ≥ 5% | 0.957 | 1.008 | 1.080 | 0.0006 |
| WGS, MAF ≥ 1% | 1.165 | 1.209 | 1.265 | 0.0005 |
| WGS, MAF 1 to 5% | 1.719 | 1.822 | 1.876 | 0.0013 |
| Second degree relationships | | | | |
| Pedigree | 0.250 | 0.302 | 0.407 | 0.0013 |
| SNP, MAF ≥ 5% | 0.617 | 0.693 | 0.847 | 0.0021 |
| SNP, MAF ≥ 1% | 0.705 | 0.778 | 0.921 | 0.0019 |
| SNP, MAF 1 to 5% | 1.622 | 1.830 | 1.910 | 0.0028 |
| WGS, MAF ≥ 5% | 0.786 | 0.864 | 1.009 | 0.0013 |
| WGS, MAF ≥ 1% | 1.034 | 1.096 | 1.207 | 0.0009 |
| WGS, MAF 1 to 5% | 1.661 | 1.807 | 1.859 | 0.0016 |
| Less-related | | | | |
| Pedigree | 0.000 | 0.056 | 0.245 | 0.0019 |
| SNP, MAF ≥ 5% | 0.405 | 0.502 | 0.746 | 0.0017 |
| SNP, MAF ≥ 1% | 0.501 | 0.597 | 0.829 | 0.0017 |
| SNP, MAF 1 to 5% | 1.477 | 1.773 | 1.925 | 0.0040 |
| WGS, MAF ≥ 5% | 0.634 | 0.715 | 0.911 | 0.0010 |
| WGS, MAF ≥ 1% | 0.889 | 0.976 | 1.132 | 0.0009 |
| WGS, MAF 1 to 5% | 1.576 | 1.771 | 1.868 | 0.0017 |
| Inbreeding coefficients | | | | |
| Pedigree | 0.000 | 0.027 | 0.163 | 0.0009 |
| SNP, MAF ≥ 5% | 0.003 | 0.251 | 0.347 | 0.0015 |
| SNP, MAF ≥ 1% | 0.059 | 0.298 | 0.390 | 0.0014 |
| SNP, MAF 1 to 5% | 0.706 | 0.886 | 0.974 | 0.0020 |
| WGS, MAF ≥ 5% | 0.163 | 0.342 | 0.417 | 0.0010 |
| WGS, MAF ≥ 1% | 0.321 | 0.473 | 0.537 | 0.0007 |
| WGS, MAF 1 to 5% | 0.764 | 0.873 | 0.930 | 0.0009 |
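The exact similarity definition and any scaling used by the authors are not given in this excerpt, so the sketch below should be read as a generic illustration only. It assumes an identity-by-state similarity (the probability that one allele drawn at random from each of two individuals is the same, averaged over variants) and doubles it to put it on a relationship scale; the function name and toy genotypes are hypothetical, and the output is not expected to reproduce the table.

```python
import numpy as np

def similarity_relationships(genotypes):
    """Relationship-scale matrix from identity-by-state allele sharing.

    genotypes: (individuals x variants) array of 0/1/2 counts of one allele.
    Per variant, the similarity of individuals i and j is
    [x_i * x_j + (2 - x_i) * (2 - x_j)] / 4.
    """
    x = np.asarray(genotypes, dtype=float)
    n_variants = x.shape[1]
    sharing = (x @ x.T + (2.0 - x) @ (2.0 - x).T) / (4.0 * n_variants)
    return 2.0 * sharing  # doubled so that a non-inbred self-relationship is 1

toy = np.array([
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [1, 2, 1],
])
sim = similarity_relationships(toy)
homozygosity = np.diag(sim) - 1.0  # fraction of homozygous variants
print(np.round(sim, 3))
print(np.round(homozygosity, 3))
```

Unlike the frequency-weighted estimator, a measure of this kind cannot be negative, and it grows when most individuals are homozygous for the common allele, which fits the uniformly positive and high values reported for the scenarios with MAF between 1% and 5%.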
Figure 3. Linear regression plots for A, SNP and WGS against each other (based on similarities). Plots of linear regressions of relationships estimated from pedigree (A ped) and genomic relationships estimated from Single Nucleotide Polymorphism (G SNP) and whole genome sequence (G WGS) data, based on similarities. Each linear regression was performed for the scenarios with Minor Allele Frequency (MAF) ≥ 5% (5+), ≥ 1% (1+) and between 1% and 5% (1_5). The first row shows the plots for scenario 5+, the second for 1+ and the third for 1_5. The first column shows the regression of G SNP on A ped, the second the regression of G WGS on A ped, and the third the regression of G WGS on G SNP. The black line is the regression line of an exact linear model (intercept = 0, slope = 1) and the red line is the actual overall regression line. The overall correlation coefficient for each regression appears in the top left corner.
Correlation coefficient for estimated relationships and inbreeding coefficients (based on similarities)
| | | | | | | |
|---|---|---|---|---|---|---|
| | 0.703 a,b | 0.531 a,b | 0.698 a,b | 0.474 a,b | 0.618 a,b | 0.665 a,b |
| | 0.618 a,b | 0.508 a,b | 0.633 a,b | 0.394 a,b | 0.544 a,b | 0.616 a,b |
| | 0.936 a,b | 0.935 a,b | 0.916 a,b | 0.928 a,b | 0.950 a,b | 0.962 a,b |
| | 0.700 a,b | 0.542 a,b | 0.707 a,b | 0.484 a,b | 0.622 a,b | 0.660 a,b |
| | 0.610 a,b | 0.551 a,b | 0.660 a,b | 0.425 a,b | 0.565 a,b | 0.601 a,b |
| | 0.915 a,b | 0.909 a,b | 0.905 a,b | 0.914 a,b | 0.934 a,b | 0.947 a,b |
| | 0.259 b | 0.286 a,b | 0.474 a,b | 0.269 a,b | 0.269 a,b | 0.237 b |
| | 0.222 b | 0.277 a,b | 0.423 a,b | 0.242 a,b | 0.248 b | 0.201 b |
| | 0.869 a,b | 0.791 a,b | 0.813 a,b | 0.782 a,b | 0.697 a,b | 0.666 a,b |
| | 0.994 a | 0.996 a | 0.995 a | 0.996 a | 0.998 a | 0.999 a |
| | 0.922 a,b | 0.947 a,b | 0.949 a,b | 0.960 a,b | 0.970 a,b | 0.983 a,b |
| | 0.346 a,b | 0.260 a,b | 0.521 a,b | 0.280 a,b | 0.307 a,b | 0.508 a,b |
| | 0.194 b | 0.115 b | 0.398 a,b | 0.195 a,b | 0.185 b | 0.367 a,b |
| | 0.449 a,b | 0.343 a,b | 0.603 a,b | 0.362 a,b | 0.365 a,b | 0.543 a,b |
| | 0.559 a,b | 0.427 a,b | 0.668 a,b | 0.462 a,b | 0.417 a,b | 0.533 a,b |
a: significantly different from 0; b: significantly different from 1 (P-value ≤ 0.05).