Guoyan Qiao1,2, Pan Xu3, Tingting Guo1,2, Yi Wu1,2, Xiaofang Lu4,5, Qingfeng Zhang4,5, Xue He1, Shaohua Zhu1,2, Hongchang Zhao1,2, Zhihui Lei1,2, Weibo Sun1,2, Bohui Yang1,2, Yaojing Yue1,2.
Abstract
Dorper sheep (Ovis aries) (DPS), developed in the 1930s by crossing Dorset Horn and Blackhead Persian sheep in South Africa, is a world-famous composite breed for mutton production. The genetic basis underlying this breed is yet to be elucidated. Here, we report the sequencing and assembly of a highly contiguous Dorper sheep genome by integrating Oxford Nanopore Technologies (ONT) sequencing with Hi-C (chromatin conformation capture). The assembled genome was around 2.64 Gb, with a contig N50 of 73.33 Mb and 140 contigs in total. More than 99.5% of the assembled sequence could be anchored to 27 chromosomes, and the assembly was annotated with 20,450 protein-coding genes. Allele-specific expression (ASE) analysis revealed ASE genes in Dorper sheep that are involved in the immune system, lipid metabolism, and environmental adaptation. A total of 5,701 and 456 allelic sites were observed among the SNP and indel loci, respectively, identified from relevant whole-genome resequencing data. These allelic SNP and indel sites were annotated in 1,002 and 294 genes, respectively. Moreover, we calculated the number of variant sites and related genes derived from the maternal and paternal ancestors, revealing the genetic basis of the outstanding phenotypic performance of Dorper sheep. In conclusion, this study reports the first reference genome of Dorper sheep and reveals its genetic basis through ASE. This study also provides a pipeline for mining genetic information from composite breeds, which has implications for future hybrid-breeding practices.
Keywords: allele-specific expression (ASE); composite breed; dorper sheep; genetic basis; reference genome
Year: 2022 PMID: 35480318 PMCID: PMC9035736 DOI: 10.3389/fgene.2022.846449
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
FIGURE 1. Image and genome quality of Dorper sheep. (A) Image of Dorper sheep. (B) Genome-wide all-by-all Hi-C interaction. (C) Syntenic analysis of assembled genome.
Genome assembly statistics of Dorper sheep.
| Statistic | Contig length (bp) | Contig number |
|---|---|---|
| N50 | 73,326,320 | 13 |
| N60 | 64,997,665 | 17 |
| N70 | 43,243,940 | 21 |
| N80 | 37,997,576 | 28 |
| N90 | 22,706,521 | 37 |
| Longest | 158,282,255 | 1 |
| Total | 2,648,309,365 | 140 |
| Length >= 1 kb | 2,648,309,365 | 140 |
| Length >= 2 kb | 2,648,309,365 | 140 |
| Length >= 5 kb | 2,648,309,365 | 140 |
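The Nx rows in the table above follow the usual definition: sort contigs from longest to shortest and report the length of the contig at which the running total first reaches x% of the assembly, together with how many contigs that took (the Lx count). The following is a minimal sketch of that calculation; the function name `nx` and the toy contig lengths are illustrative assumptions and are not taken from the Dorper assembly.

```python
def nx(lengths, x):
    """Return (Nx, Lx) for a list of contig lengths.

    Nx is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least x% of the assembly;
    Lx is the number of contigs in that set.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < x <= 100:
        raise ValueError("x must be in the interval (0, 100]")

    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length, count


# Hypothetical toy assembly (total length 300), not real data.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = nx(toy_contigs, 50)
n90, l90 = nx(toy_contigs, 90)
print(f"N50={n50} (L50={l50}), N90={n90} (L90={l90})")
```

For the toy input, half of the total (150) is reached after the two longest contigs, so N50 is 70 with L50 of 2. Applied to the real contig lengths, the same function would be expected to reproduce the contig length and contig number columns of the table.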
FIGURE 2. Characterization of Dorper sheep genome. Circos diagram showing (outer to inner): A. Gene density (genes per 1 Mb window). B. Repeat density (repetitive sequence per 1 Mb window). C. Gypsy-type LTR density. D. Copia-type LTR density. E. Numbers and distribution of ASE genes derived from DSS and PSS, shown in green and purple, respectively. F. GC content in sliding windows of 1 Mb across each chromosome.
Assembly quality statistics comparison.
| Assembly statistic | ASM1914517V1 | ARS-UI_Ramb_v2.0 | Oar_rambouillet_v1.0 | Oar_v4.0 |
|---|---|---|---|---|
| Total Length (Mb) | 2,648.31 | 2,628.15 | 2,869.91 | 2,615.52 |
| Contig No. | 140 | 226 | 7,486 | 48,482 |
| Contig N50 (bp) | 73,326,320 | 43,178,051 | 2,850,956 | 145,655 |
| Contig L50 (No. of contigs) | 13 | 24 | 263 | 5,206 |
| Complete, single-copy BUSCOs (%) | 91.58 | 93.9 | 93.0 | 91.2 |
| Complete, duplicated BUSCOs (%) | 1.6 | 2.1 | 2.6 | 1.6 |
| Fragmented BUSCOs (%) | 2.22 | 0.9 | 1.1 | 2.4 |
| Missing BUSCOs (%) | 6.2 | 3.1 | 3.3 | 4.8 |
The top ten genes from Persian sheep.
| Rank | Full name | CHR |
|---|---|---|
| 1 | Suppressor Of Cytokine Signaling 2 | 3 |
| 2 | MYC Binding Protein 2 | 11 |
| 3 | ADP Ribosylation Factor Guanine Nucleotide Exchange Factor 2 | 12 |
| 4 | SEC31 Homolog A, COPII Coat Complex Component | 6 |
| 5 | Exostosin Glycosyltransferase 2 | 13 |
| 6 | Inositol 1,4,5-Trisphosphate Receptor Type 1 | 21 |
| 7 | Ariadne RBR E3 Ubiquitin Protein Ligase 1 | 8 |
| 8 | Leucine Rich Repeats And Immunoglobulin Like Domains 1 | 21 |
| 9 | Ubiquitin C-Terminal Hydrolase L4 | 11 |
| 10 | Ubiquitin-Specific Protease 22 | 20 |
The top ten genes from Dorset sheep.
| Rank | Full name | CHR |
|---|---|---|
| 1 | Ubiquitin Conjugating Enzyme E2 E2 | 26 |
| 2 | Potassium Voltage-Gated Channel Subfamily Q Member 5 | 9 |
| 3 | Pecanex 1 | 8 |
| 4 | Rho GTPase Activating Protein 24 | 6 |
| 5 | Syntaxin 8 | 20 |
| 6 | KIAA0586 | 8 |
| 7 | Mitochondrial Ribosomal Protein L42 | 3 |
| 8 | Kinectin 1 | 8 |
| 9 | DDB1 And CUL4 Associated Factor 5 | 8 |
| 10 | Utrophin | 10 |
FIGURE 3. Bioinformatics pipeline for allelic gene expression estimation.
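The pipeline in Figure 3 is not reproduced in this record, so the following is only an illustrative sketch of one common way to test for allelic imbalance at a single heterozygous site: an exact two-sided binomial test of reference versus alternate read counts against an expected 50:50 ratio. The choice of test, the function name `allelic_imbalance_p`, and the read counts are all assumptions made for illustration and should not be read as the authors' method.

```python
from math import comb


def allelic_imbalance_p(ref_reads, alt_reads, expected_ref_fraction=0.5):
    """Exact two-sided binomial p-value for allelic imbalance at one site.

    Sums the probabilities of every outcome that is no more likely than
    the observed reference-read count under the expected allelic ratio.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("site has no covering reads")
    p = expected_ref_fraction
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance so ties are not lost to floating-point noise.
    tail = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, tail)


# Hypothetical read counts, not data from the study.
balanced = allelic_imbalance_p(10, 10)  # equal support for both alleles
skewed = allelic_imbalance_p(20, 0)     # all reads from one allele
print(f"balanced p={balanced:.3g}, skewed p={skewed:.3g}")
```

In a real analysis such per-site p-values would typically be aggregated per gene and corrected for multiple testing, and mapping bias toward the reference allele would need to be handled before counting reads.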