Wilson Nandolo, Yuri T Utsunomiya, Gábor Mészáros, Maria Wurzinger, Negar Khayadzadeh, Rafaela B P Torrecilha, Henry A Mulindwa, Timothy N Gondwe, Patrik Waldmann, Maja Ferenčaković, José F Garcia, Benjamin D Rosen, Derek Bickhart, Curt P van Tassell, Ino Curik, Johann Sölkner.
Abstract
BACKGROUND: Runs of homozygosity (ROH) islands are stretches of homozygous sequence in the genome of a large proportion of individuals in a population. Algorithms for the detection of ROH depend on the similarity of haplotypes. Coverage gaps and copy number variants (CNV) may result in incorrect identification of such similarity, leading to the detection of ROH islands where none exists. Misidentified hemizygous regions will also appear as homozygous based on sequence variation alone. Our aim was to identify ROH islands influenced by marker coverage gaps or CNV, using Illumina BovineHD BeadChip (777 K) single nucleotide polymorphism (SNP) data for Austrian Brown Swiss, Tyrol Grey and Pinzgauer cattle.
Year: 2018 PMID: 30134820 PMCID: PMC6106898 DOI: 10.1186/s12711-018-0414-x
Source DB: PubMed Journal: Genet Sel Evol ISSN: 0999-193X Impact factor: 4.297
Fig. 1 B allele frequency (BAF) and log R ratio (LRR) plots for scenarios with different copy numbers and homozygous alleles. The plots show how some regions that contain CNV can be erroneously identified as ROH. The normal state is to carry two copies of an allele; the BAF values are then distributed between intermediate and extreme values, while the LRR has intermediate values. As the number of copies increases, the LRR values move towards the higher extreme while the BAF values disperse into more intermediate values that depend on copy number. When the copy number is very large, the segment may also be mistyped as homozygous, and ROH algorithms may detect it as a ROH. A segment that is truly homozygous has extreme BAF values and intermediate LRR values.
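The pattern described in this caption can be illustrated with an idealized, noise-free model. The sketch below assumes that a site carrying k copies of the B allele out of n total copies has BAF = k/n and that LRR is log2(n/2); the function names are illustrative and do not come from the study.

```python
import math

def expected_baf_clusters(copy_number):
    """Idealized BAF values: k copies of the B allele out of copy_number copies."""
    if copy_number < 1:
        raise ValueError("copy_number must be at least 1")
    return [k / copy_number for k in range(copy_number + 1)]

def expected_lrr(copy_number, normal_copies=2):
    """Idealized log R ratio: log2 of observed over expected copy number."""
    return math.log2(copy_number / normal_copies)

# Print the idealized clusters for one to four copies.
for n in (1, 2, 3, 4):
    print(n, expected_baf_clusters(n), round(expected_lrr(n), 2))
```

Under these assumptions a single-copy (hemizygous) segment shows BAF values only at 0 and 1, exactly like a homozygous two-copy segment; only its lower LRR sets it apart, which is why genotype calls alone cannot separate the two.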
Size of ROH islands in Brown Swiss, Tyrol Grey and Pinzgauer cattle
| Breed | Number of autosomes with ROH islands | Number of ROH islands | Min ROH island length (bp) | Median ROH island length (bp) | Mean ROH island length (bp) | Max ROH island length (bp) | Coverage (Mb) |
|---|---|---|---|---|---|---|---|
| Brown Swiss | 8 | 13 | 34,863 | 1,662,891 | 2,049,006 | 6,624,458 | 26.637 |
| Tyrol Grey | 17 | 22 | 16,770 | 1,397,826 | 1,771,718 | 5,948,811 | 38.978 |
| Pinzgauer | 14 | 24 | 5309 | 1,194,240 | 1,493,608 | 4,651,919 | 35.847 |
CNVR numbers, lengths and coverage for each CNV detection method used
| Breed | Software | Copy state | N^a | Mean CNVR length (bp) | Median CNVR length (bp) | Min CNVR length (bp) | Max CNVR length (bp) | Total (Mb) | Coverage (%)^b |
|---|---|---|---|---|---|---|---|---|---|
| Brown Swiss | PennCNV | Loss | 210 | 51,633.0 | 24,514.5 | 1358 | 483,799 | 10.843 | 0.43 |
| | | Gain | 66 | 84,062.2 | 24,387.5 | 2809 | 1,879,682 | 5.548 | 0.22 |
| | | Both | 30 | 241,581.7 | 104,390.5 | 7538 | 1,347,298 | 7.247 | 0.29 |
| | | Overall | 306 | | | | | 23.638 | 0.94 |
| | SVS | Loss | 141 | 41,221.0 | 6858.0 | 1086 | 1,217,387 | 5.812 | 0.23 |
| | | Both | 46 | 37,702.4 | 10,232.5 | 1774 | 353,135 | 1.734 | 0.07 |
| | | Overall | 187 | | | | | 7.546 | 0.30 |
| | Consensus | Loss | 9 | 42,871.3 | 47,761.0 | 4693 | 90,545 | 0.386 | 0.02 |
| | | Both | 21 | 197,095.7 | 53,252.0 | 1404 | 945,913 | 4.139 | 0.16 |
| | | Overall | 30 | | | | | 4.525 | 0.18 |
| Tyrol Grey | PennCNV | Loss | 502 | 95,870.1 | 49,568.0 | 1358 | 2,611,715 | 48.127 | 1.65 |
| | | Gain | 90 | 47,216.2 | 24,954.0 | 2455 | 279,361 | 4.249 | 0.15 |
| | | Both | 14 | 518,196.6 | 256,469.0 | 5035 | 1,646,040 | 7.255 | 0.25 |
| | | Overall | 606 | | | | | 59.631 | 2.04 |
| | SVS | Loss | 115 | 167,013.8 | 11,531.0 | 1369 | 4,210,187 | 19.207 | 0.66 |
| | | Both | 38 | 44,509.8 | 11,567.0 | 1774 | 652,218 | 1.691 | 0.06 |
| | | Overall | 153 | | | | | 20.898 | 0.72 |
| | Consensus | Loss | 49 | 142,561.0 | 75,065.0 | 3620 | 1,432,454 | 6.985 | 0.24 |
| | | Both | 22 | 205,619.5 | 58,194.5 | 2270 | 790,623 | 4.524 | 0.16 |
| | | Overall | 71 | | | | | 11.509 | 0.39 |
| Pinzgauer | PennCNV | Loss | 390 | 63,906.2 | 26,759.5 | 1300 | 951,876 | 24.923 | 0.99 |
| | | Gain | 100 | 35,733.7 | 20,985.5 | 1950 | 279,361 | 3.573 | 0.14 |
| | | Both | 38 | 373,093.4 | 164,810.5 | 4038 | 2,050,695 | 14.17755 | 0.56 |
| | | Overall | 528 | | | | | 42.67433 | 1.70 |
| | SVS | Loss | 119 | 66,791.9 | 8018.0 | 1169 | 1,311,740 | 7.94824 | 0.32 |
| | | Gain | 1 | 307,583.0 | 307,583.0 | 307,583 | 307,583 | 0.307583 | 0.01 |
| | | Both | 58 | 31,875.8 | 10,004.5 | 1369 | 320,050 | 1.848796 | 0.07 |
| | | Overall | 178 | | | | | 10.10462 | 0.40 |
| | Consensus | Loss | 17 | 66,484.7 | 31,556.0 | 4895 | 459,485 | 1.13024 | 0.05 |
| | | Both | 33 | 179,520.3 | 34,711.0 | 1774 | 947,366 | 5.924169 | 0.24 |
| | | Overall | 50 | | | | | 7.054409 | 0.28 |
^a N = number of CNVRs
^b The coverage percentage is based on the bovine autosomal genome size of 2511 Mb covered by the BovineHD SNP chip
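The arithmetic behind the last two columns can be checked directly. The sketch below assumes that total length is simply the number of CNVRs times their mean length, and uses the 2511 Mb autosomal size from the footnote; the function names are illustrative.

```python
AUTOSOMAL_GENOME_MB = 2511.0  # autosomal size quoted in the table footnote

def total_mb_from_mean(n_cnvr, mean_length_bp):
    """Total CNVR length in Mb, assuming total = N x mean length."""
    return n_cnvr * mean_length_bp / 1e6

def coverage_percent(total_mb, genome_mb=AUTOSOMAL_GENOME_MB):
    """Share of the autosomal genome covered, in percent."""
    return 100.0 * total_mb / genome_mb

# First PennCNV "Loss" row of the table: N = 210, mean = 51,633.0 bp, total = 10.843 Mb.
print(round(total_mb_from_mean(210, 51633.0), 3))
print(round(coverage_percent(10.843), 2))
```

For that row the two calls reproduce the tabulated total (10.843 Mb) and coverage (0.43%).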
Overlaps between the consensus CNVR identified in this study and CNVR reported by other studies
| Study | Autosomal CNVR | Coverage (Mb) | Brown Swiss overlaps | Brown Swiss % | Tyrol Grey overlaps | Tyrol Grey % | Pinzgauer overlaps | Pinzgauer % |
|---|---|---|---|---|---|---|---|---|
| Bae et al. [ | 368 | 51.596 | 4 | 13 | 9 | 12 | 5 | 10 |
| Bagnato et al. [ | 150 | 48.252 | 1 | 3 | 4 | 5 | 1 | 3 |
| Bickhart et al. [ | 1726 | 51.396 | 22 | 73 | 33 | 44 | 39 | 78 |
| Hou et al. [ | 3346 | 51.798 | 10 | 33 | 23 | 31 | 19 | 38 |
| Liu et al. [ | 163 | 43.631 | 2 | 7 | 5 | 7 | 5 | 10 |
| Prinsen et al. [ | 563 | 50.444 | 30 | 100 | 65 | 87 | 43 | 86 |
| Sasaki et al. [ | 861 | 50.251 | 24 | 80 | 49 | 65 | 39 | 78 |
| Wu et al. [ | 247 | 46.839 | 10 | 33 | 16 | 21 | 13 | 26 |
| Xu et al. [ | 257 | 41.564 | 13 | 43 | 20 | 27 | 24 | 48 |
| Zhang et al. [ | 425 | 49.037 | 14 | 47 | 24 | 32 | 25 | 50 |
Fig. 2 Details of the overlaps between SNP coverage gaps, ROH islands and CNVR. The upper panel shows the ROH (black) for each animal (grey gridlines). The lower panel shows the intermarker distance (IMD, black) and the proportion of animals in ROH at each marker (green). The inset shows a short region with normal IMD that could be a true ROH island. However, the region is flanked by large gaps, the edges of which could also be true ROH. The whole region is detected as a ROH in most of the individuals.
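The green trace in this figure, the proportion of animals in ROH at each marker, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the input layout (one list of `(start, end)` ROH segments per animal, at least one animal) and the `threshold` used to call candidate islands are assumptions.

```python
def roh_proportion_per_marker(marker_positions, roh_by_animal):
    """Fraction of animals whose ROH segments cover each marker position."""
    n_animals = len(roh_by_animal)
    proportions = []
    for pos in marker_positions:
        in_roh = sum(
            any(start <= pos <= end for start, end in segments)
            for segments in roh_by_animal
        )
        proportions.append(in_roh / n_animals)
    return proportions

def candidate_islands(marker_positions, proportions, threshold):
    """Merge consecutive markers whose proportion meets the (hypothetical) threshold."""
    islands, start, prev = [], None, None
    for pos, prop in zip(marker_positions, proportions):
        if prop >= threshold:
            if start is None:
                start = pos
            prev = pos
        elif start is not None:
            islands.append((start, prev))
            start = None
    if start is not None:
        islands.append((start, prev))
    return islands

# Toy data: five markers and four animals (the last animal has no ROH).
markers = [100, 200, 300, 400, 500]
animals = [[(100, 300)], [(200, 400)], [(200, 300)], []]
props = roh_proportion_per_marker(markers, animals)
print(props, candidate_islands(markers, props, threshold=0.5))
```

Because the proportion is evaluated only at marker positions, a stretch with no markers contributes nothing to the trace, which is one way a flanking gap can be absorbed into an apparent island.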
Fig. 3 ROH islands and CNVR on each chromosome for Pinzgauer cattle. Each chromosome (dark grey bar) has four lines. Starting from top to bottom within the chromosome, the top line (black) is for ROH islands. The second line is for PennCNV CNV (light blue for copy loss, red for copy gain and light green for both copy loss and copy gain). The third line is for SVS CNV (blue for copy loss, maroon for copy gain and dark green for both copy loss and copy gain). The fourth (last) line is for intermarker distance (IMD, light grey for IMD < 9.2365 kb and orange for IMD > 9.2365 kb). The magenta rectangles show regions where consensus CNVR overlap with ROH islands.
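The IMD track can be derived from marker positions alone. The sketch below computes the distance between adjacent markers and flags intervals longer than the 9.2365 kb cut-off quoted in the caption; treating every flagged interval as a coverage gap is an assumption made for illustration, and the function names are hypothetical.

```python
GAP_THRESHOLD_BP = 9236.5  # 9.2365 kb, the cut-off quoted in the caption

def intermarker_distances(positions_bp):
    """Return (left, right, distance) for each pair of adjacent markers."""
    ordered = sorted(positions_bp)
    return [(a, b, b - a) for a, b in zip(ordered, ordered[1:])]

def coverage_gaps(positions_bp, threshold_bp=GAP_THRESHOLD_BP):
    """Intervals between adjacent markers that are longer than the threshold."""
    return [
        (a, b)
        for a, b, dist in intermarker_distances(positions_bp)
        if dist > threshold_bp
    ]

# Toy positions on one chromosome (bp).
print(coverage_gaps([1000, 4000, 20000, 22000]))
```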
Results of the permutation test that checks whether the intersections between CNVR with copy loss (copy loss or both copy loss and copy gain) and ROH islands are due to chance alone
| Breed | Software | ROHD^a | Estimate (ROHR^b) | Confidence interval (lower) | Confidence interval (upper) | P-value |
|---|---|---|---|---|---|---|
| Brown Swiss | PennCNV | 0.630 | 0.072 | 0.069 | 0.074 | 0 |
| | SVS | 0.176 | 0.023 | 0.021 | 0.025 | 0 |
| | Consensus | 0.000 | 0.008 | 0.007 | 0.009 | 7.00E−38 |
| Tyrol Grey | PennCNV | 2.931 | 0.460 | 0.451 | 0.470 | 0 |
| | SVS | 2.453 | 0.150 | 0.143 | 0.157 | 0 |
| | Consensus | 2.135 | 0.084 | 0.079 | 0.088 | 0 |
| Pinzgauer | PennCNV | 4.824 | 0.420 | 0.410 | 0.430 | 0 |
| | SVS | 3.774 | 0.059 | 0.056 | 0.063 | 0 |
| | Consensus | 2.729 | 0.033 | 0.030 | 0.036 | 0 |
The number of iterations used for randomizing the locations of the ROH islands was 10,000
^a Intersections between CNVRs and ROH islands from the data
^b Intersections between CNVRs and randomized ROH islands
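The logic of such a test can be sketched on a single chromosome. The code below is an illustration under stated assumptions, not the authors' implementation: intervals are `(start, end)` pairs in bp, CNVRs do not overlap each other, each island is relocated uniformly at random while keeping its length (relocated islands may overlap one another), and significance is summarized with an empirical p-value, which may differ from how the tabulated values were derived.

```python
import random

def overlap_bp(a, b):
    """Total overlapping base pairs between two lists of (start, end) intervals."""
    total = 0
    for s1, e1 in a:
        for s2, e2 in b:
            total += max(0, min(e1, e2) - max(s1, s2))
    return total

def shuffle_islands(islands, chrom_len, rng):
    """Move each island to a random start position, preserving its length."""
    shuffled = []
    for start, end in islands:
        length = end - start
        new_start = rng.randrange(0, chrom_len - length + 1)
        shuffled.append((new_start, new_start + length))
    return shuffled

def permutation_test(islands, cnvr, chrom_len, n_iter=10_000, seed=1):
    """Compare the observed overlap with overlaps of randomly relocated islands."""
    rng = random.Random(seed)
    observed = overlap_bp(islands, cnvr)
    null = [
        overlap_bp(shuffle_islands(islands, chrom_len, rng), cnvr)
        for _ in range(n_iter)
    ]
    mean_null = sum(null) / n_iter
    # Empirical one-sided p-value with the usual +1 correction.
    p_value = (1 + sum(x >= observed for x in null)) / (n_iter + 1)
    return observed, mean_null, p_value

# Hypothetical coordinates for one island and one CNVR.
print(permutation_test([(100, 200)], [(150, 400)], chrom_len=10_000, n_iter=1000))
```

With an empirical p-value the smallest attainable value is 1/(n_iter + 1), so 10,000 iterations cannot yield a value below roughly 1e-4.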
Numbers and lengths of ROH islands that were affected by CNV and gaps
| Breed | ROH islands affected by | Number of affected ROH islands | Coverage (Mb) | Percentage of total ROH island coverage (%) |
|---|---|---|---|---|
| Brown Swiss (ROH island coverage = 26.637 Mb) | Gain + loss | 3 | 4.073 | 15.3 |
| | Gap | 1 | 1.459 | 5.5 |
| | Gap + gain + loss | 1 | 1.109 | 4.2 |
| | Loss | 2 | 9.855 | 37.0 |
| | Overall | 7 | 16.496 | 61.9 |
| Tyrol Grey (ROH island coverage = 35.847 Mb) | Gain | 1 | 1.417 | 4.0 |
| | Gap | 3 | 4.225 | 11.8 |
| | Gap + gain + loss | 2 | 5.789 | 16.1 |
| | Gap + loss | 2 | 2.931 | 8.2 |
| | Loss | 6 | 12.939 | 36.1 |
| | Overall | 14 | 27.301 | 76.2 |
| Pinzgauer (ROH island coverage = 38.978 Mb) | Gap | 3 | 1.811 | 4.6 |
| | Gap + gain + loss | 3 | 6.834 | 17.5 |
| | Gap + loss | 2 | 4.382 | 11.2 |
| | Loss | 8 | 15.879 | 40.7 |
| | Overall | 16 | 28.905 | 74.2 |
Fig. 4 BAF and LRR plots of selected ROH islands. In each of the four sub-plots, the top panel is for individual ROH (black) and individual CNV (blue and dark red for copy loss and copy gain, respectively, for SVS; and light blue and red for copy loss and copy gain, respectively, for PennCNV). The middle panel shows the mean BAF values at each marker while the bottom panel shows the mean LRR at each marker. Panel a illustrates the possibility of a gap being detected as a ROH. Panel b shows how a CNVR can extend ROH, leading to a ROH island. Panel c is a typical example of a CNVR and gaps within a ROH island. Panel d is an extreme example of ROH islands being detected between regions with a high frequency of CNV.
Fig. 5 Haplotype diversity statistics in all ROH islands. a Number of haplotypes. b Number of SNPs per haplotype. c Effective number of haplotypes per haplotype block. d Haplotype diversity.
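The statistics named in this caption are not defined in the text here. The sketch below assumes commonly used definitions: the effective number of haplotypes as the inverse of the sum of squared haplotype frequencies, and haplotype diversity as Nei's estimator with the n/(n − 1) sample correction. Both formulas and the function name are assumptions and may differ from what the authors computed.

```python
from collections import Counter

def haplotype_stats(haplotypes):
    """Summarize a sample of haplotypes (any hashable labels) within one block."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies (expected haplotype homozygosity).
    homozygosity = sum((c / n) ** 2 for c in counts.values())
    return {
        "n_haplotypes": len(counts),
        "effective_n": 1.0 / homozygosity,
        "diversity": n / (n - 1) * (1.0 - homozygosity),
    }

# Toy sample: two haplotypes at equal frequency.
print(haplotype_stats(["AAGT", "AAGT", "CCGT", "CCGT"]))
```

Under these definitions a block fixed for one haplotype, as expected inside a true ROH island, gives an effective number of 1 and a diversity of 0.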