Adéla Nosková, Meenu Bhati, Naveen Kumar Kadri, Danang Crysnanto, Stefan Neuenschwander, Andreas Hofer, Hubert Pausch.
Abstract
BACKGROUND: The key-ancestor approach has been frequently applied to prioritize individuals for whole-genome sequencing based on their marginal genetic contribution to current populations. Using this approach, we selected 70 key ancestors from two lines of the Swiss Large White breed that have been selected divergently for fertility and fattening traits and sequenced their genomes with short paired-end reads.
Keywords: Genetic diversity; Genotyping by sequencing; Key ancestor animals; Low-pass sequencing; Swiss large white
Year: 2021 PMID: 33882824 PMCID: PMC8061004 DOI: 10.1186/s12864-021-07610-5
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Variants detected in 70 sequenced key ancestor animals
| | Raw | Filtered & imputed | Dam line | Sire line |
|---|---|---|---|---|
| Number of animals | 70 | 70 | 32 | 38 |
| Sequence coverage¹ | 16.69 (8.72–36.85) | 16.69 (8.72–36.85) | 18.02 (9.31–36.85) | 15.57 (8.72–27.73) |
| Number of variants | | | | |
| All | 28,407,060 | 26,862,369 | 24,358,047 | 24,093,052 |
| Biallelic SNP | 22,191,375 | 21,209,725 | 19,456,000 | 19,232,692 |
| Biallelic INDEL | 4,379,470 | 4,339,947 | 3,960,976 | 3,928,684 |
| Others² | 1,836,215 | 1,312,697 | 941,071 | 931,676 |
| Autosomal variants | | | | |
| All | 27,582,843 | 26,198,587 | 23,774,053 | 23,531,919 |
| Biallelic SNP | 21,553,323 | 20,715,354 | 19,015,058 | 18,808,294 |
| Biallelic INDEL | 4,248,742 | 4,211,012 | 3,846,008 | 3,817,622 |
| Others² | 1,780,778 | 1,272,221 | 912,987 | 906,003 |
¹ Estimated from the autosomes
² This category contains multi-allelic SNP, multi-allelic INDEL, as well as sites that may contain both SNP and INDEL
Fig. 1 Sequencing of key ancestor animals from two pig lines. (a) Number of polymorphic sites detected in the 70 boars as a function of depth of coverage based on imputed and filtered non-imputed data (transparency). (b) Plot of the first two principal components showing the separation of animals by breed and the relationship between both lines. Blue and orange symbols indicate 38 and 32 boars from the sire and dam line, respectively.
Comparison between sequence- and array-called genotypes at corresponding positions
| Dataset | Genotype concordance (%) | Non-reference sensitivity (%) | Non-reference discrepancy (%) |
|---|---|---|---|
| Raw | 99.18 | 99.75 | 1.11 |
| Filtered | 99.19 | 99.77 | 1.09 |
| Filtered & imputed | 99.82 | 99.95 | 0.24 |
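The three metrics in this table can be illustrated with a minimal sketch. It assumes the commonly used definitions (as in GATK's GenotypeConcordance tool), which the record itself does not spell out: concordance is the share of identical genotypes, non-reference sensitivity is the share of non-reference array genotypes that are also called non-reference from sequence, and non-reference discrepancy is the share of mismatches once positions that are homozygous reference in both datasets are excluded. The function name and the 0/1/2 genotype coding are illustrative assumptions.

```python
import math


def concordance_metrics(truth, called):
    """Return (genotype concordance, non-reference sensitivity,
    non-reference discrepancy), each in percent.

    Genotypes are alternate-allele counts (0, 1, 2); None marks a missing call.
    `truth` plays the role of the array genotypes, `called` the sequence ones.
    """
    # Keep only positions genotyped in both datasets.
    pairs = [(t, c) for t, c in zip(truth, called)
             if t is not None and c is not None]
    if not pairs:
        raise ValueError("no genotype pairs to compare")

    def pct(num, den):
        # Undefined ratios are reported as NaN instead of raising.
        return 100.0 * num / den if den else math.nan

    matches = sum(t == c for t, c in pairs)

    # Sensitivity: truth is non-reference, call is also non-reference.
    truth_nonref = [c for t, c in pairs if t > 0]
    detected = sum(c > 0 for c in truth_nonref)

    # Discrepancy: ignore positions that are hom-ref in both datasets.
    informative = [(t, c) for t, c in pairs if (t, c) != (0, 0)]
    informative_matches = sum(t == c for t, c in informative)

    concordance = pct(matches, len(pairs))
    sensitivity = pct(detected, len(truth_nonref))
    discrepancy = 100.0 - pct(informative_matches, len(informative))
    return concordance, sensitivity, discrepancy


# Toy data, not taken from the study.
array_gt = [0, 0, 1, 1, 2, 2, 1, 0]
seq_gt = [0, 0, 1, 2, 2, 2, 0, 1]
print(concordance_metrics(array_gt, seq_gt))  # (62.5, 80.0, 50.0)
```

Because matching homozygous-reference positions are excluded from the discrepancy, it can be noticeably higher than 100 minus the overall concordance.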
Fig. 2 Genomic inbreeding in the two lines. FROH in the dam and sire line, estimated for three groups of ROH classified based on their length: small (50 kb – 100 kb), medium (100 kb – 2 Mb) and long (> 2 Mb).
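As a rough illustration of the quantity plotted in Fig. 2, FROH is conventionally the summed length of an animal's runs of homozygosity divided by the length of the genome considered. The sketch below splits that sum across the three length classes from the caption. The function name, the `genome_length_bp` parameter and the handling of the class boundaries (exactly 100 kb and exactly 2 Mb counted as medium) are assumptions, not details given in the record.

```python
def froh_by_class(roh_lengths_bp, genome_length_bp):
    """Return FROH per ROH length class for one animal.

    roh_lengths_bp: lengths (in bp) of the ROH detected in the animal.
    genome_length_bp: length of the genome covered by the analysis
        (hypothetical parameter; the record does not state the value used).
    """
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")

    totals = {"small": 0, "medium": 0, "long": 0}
    for length in roh_lengths_bp:
        if 50_000 <= length < 100_000:
            totals["small"] += length
        elif 100_000 <= length <= 2_000_000:  # boundary choice is an assumption
            totals["medium"] += length
        elif length > 2_000_000:
            totals["long"] += length
        # ROH shorter than 50 kb fall outside all three classes and are ignored.

    return {name: total / genome_length_bp for name, total in totals.items()}


# Toy example: four ROH on a 10 Mb "genome".
print(froh_by_class([60_000, 150_000, 3_000_000, 40_000], 10_000_000))
```

Summing the three class values gives the overall FROH for ROH of at least 50 kb.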
Predicted consequences of variants segregating in the two lines. The table shows only the most severe consequence for each variant
| Consequence type (most severe) | Dam line | Sire line |
|---|---|---|
| Splice donor variant | 1396 | 1421 |
| Splice acceptor variant | 1126 | 1096 |
| Stop gained | 1615 | 1604 |
| Frameshift variant | 10,912 | 11,043 |
| Stop lost | 595 | 587 |
| Start lost | 423 | 421 |
| Inframe insertion | 990 | 987 |
| Inframe deletion | 1164 | 1186 |
| Protein altering variant | 62 | 62 |
| Missense variant | 70,758 | 69,983 |
| Splice region variant | 22,493 | 22,148 |
| Incomplete terminal codon variant | 12 | 11 |
| Synonymous variant | 76,977 | 75,279 |
| Stop retained variant | 149 | 135 |
| Start retained variant | 4 | 4 |
| Coding sequence variant | 98 | 96 |
| Mature miRNA variant | 12 | 16 |
| 5′ UTR variant | 168,000 | 164,866 |
| 3′ UTR variant | 348,135 | 344,514 |
| Non-coding transcript exon variant | 277,002 | 275,909 |
| Intron variant | 12,213,614 | 12,092,056 |
| Non-coding transcript variant | 11 | 10 |
| Upstream gene variant | 878,779 | 869,207 |
| Downstream gene variant | 757,364 | 750,548 |
| Intergenic variant | 8,942,362 | 8,848,730 |
Fig. 3 Signatures of selection detected in the sire and dam line of the SLW breed using CLR (a) and iHS (b). Dotted lines indicate the empirical 0.5% (CLR) and 0.1% (iHS) thresholds. Blue, orange and grey vertical bars highlight signatures of selection detected in the sire, dam and both lines, respectively.
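An empirical threshold of the kind drawn in Fig. 3 can be obtained by ranking all per-window statistics and taking the value that separates the top fraction from the rest. The following is a minimal sketch under that assumption; the function names are hypothetical, and the record does not describe how windows were defined or how ties were treated.

```python
def empirical_threshold(scores, top_fraction):
    """Return the score that marks the top `top_fraction` of all scores.

    top_fraction: e.g. 0.005 for the top 0.5 %, 0.001 for the top 0.1 %.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")

    ranked = sorted(scores, reverse=True)
    # Number of windows in the top fraction, at least one.
    n_top = max(1, round(len(ranked) * top_fraction))
    return ranked[n_top - 1]


def flag_windows(scores, top_fraction):
    """Return the indices of windows at or above the empirical threshold."""
    cutoff = empirical_threshold(scores, top_fraction)
    return [i for i, score in enumerate(scores) if score >= cutoff]


# Toy example: 1000 windows with scores 1..1000.
toy_scores = list(range(1, 1001))
print(empirical_threshold(toy_scores, 0.005))  # 996
print(len(flag_windows(toy_scores, 0.005)))    # 5
```

For a two-sided statistic such as iHS, the absolute value would typically be passed in, which is a further assumption here.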
Accuracy of sequence variant genotyping in low-coverage (1.11-fold) sequencing data
| Variant genotyping approach | Genotype concordance (%) | Non-reference sensitivity (%) | Non-reference discrepancy (%) |
|---|---|---|---|
| GLIMPSE | 97.60 | 98.73 | 3.24 |
| GATK raw | 75.90 | 52.35 | 30.20 |
| GATK filtered | 75.89 | 52.36 | 30.22 |
| GATK filtered & imputed | 85.74 | 96.56 | 19.34 |
Fig. 4 Accuracy of genotyping by low-coverage sequencing. (a) Concordance between array-called and sequence-called genotypes at 46,001 biallelic autosomal SNP in 243 pigs that had been sequenced at either low (N = 175; < 1.5-fold) or medium to high (N = 68; 8.88–37.60-fold) coverage. Correlations of (b) diagonal (r = 0.96) and (c) off-diagonal (r = 0.99) elements of genomic relationship matrices constructed from array- and GLIMPSE-called genotypes at 44,268 SNP that had minor allele frequency > 0.01.
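The comparison in panels (b) and (c) can be sketched as follows. The record does not say how the genomic relationship matrices were built, so this example assumes the widely used VanRaden formulation, G = ZZ′ / (2 Σ p(1 − p)), with allele frequencies estimated from the genotypes themselves; all function names are hypothetical. Diagonal and off-diagonal elements of two matrices are then correlated separately.

```python
import math


def grm(genotypes):
    """Genomic relationship matrix (VanRaden-style, assumed here).

    genotypes: list of animals, each a list of alternate-allele counts (0/1/2).
    """
    n_animals, n_snp = len(genotypes), len(genotypes[0])
    # Allele frequency per SNP, estimated from the sample.
    freqs = [sum(row[j] for row in genotypes) / (2 * n_animals)
             for j in range(n_snp)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all SNP are monomorphic")
    # Centre genotypes by twice the allele frequency.
    z = [[row[j] - 2 * freqs[j] for j in range(n_snp)] for row in genotypes]
    return [[sum(a * b for a, b in zip(z[i], z[k])) / scale
             for k in range(n_animals)] for i in range(n_animals)]


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def compare_grms(g1, g2):
    """Return (diagonal correlation, off-diagonal correlation)."""
    n = len(g1)
    diag1 = [g1[i][i] for i in range(n)]
    diag2 = [g2[i][i] for i in range(n)]
    # Upper triangle only, since the matrices are symmetric.
    off1 = [g1[i][k] for i in range(n) for k in range(i + 1, n)]
    off2 = [g2[i][k] for i in range(n) for k in range(i + 1, n)]
    return pearson(diag1, diag2), pearson(off1, off2)


# Toy genotypes for three animals at four SNP (not from the study).
array_calls = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 1]]
imputed_calls = [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 1, 2]]
print(compare_grms(grm(array_calls), grm(imputed_calls)))
```

Reporting the two correlations separately is useful because diagonal elements reflect individual inbreeding while off-diagonal elements reflect pairwise relationships, and genotyping errors can affect them differently.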