Jianhai Chen1, Jie Zhong1, Xuefei He1, Xiaoyu Li1, Pan Ni2, Toni Safner3,4, Nikica Šprem3, Jianlin Han5,6.
Abstract
The rapid progress of sequencing technology has greatly facilitated the de novo genome assembly of pig breeds. However, an assembly of the wild boar genome is still lacking, which hampers our understanding of chromosomal and genomic evolution during the domestication of wild boars into domestic pigs. Here, we sequenced and de novo assembled a European wild boar genome (ASM2165605v1) using the long-range information provided by 10× Linked-Reads sequencing. We achieved a high-quality assembly with a contig N50 of 26.09 Mb. Additionally, 1.64% of the contigs (222), with lengths from 107.65 kb to 75.36 Mb, covered 90.3% of the total genome size of ASM2165605v1 (~2.5 Gb). Mapping analysis revealed that the contigs can fill 24.73% (93/376) of the gaps present in the orthologous regions of the updated pig reference genome (Sscrofa11.1). We further scaffolded the contigs to chromosome level with a reference-assisted scaffolding method. Using the 'assembly-to-assembly' approach, we identified large intra-chromosomal structural variations (SVs, length >1 kb) between the ASM2165605v1 and Sscrofa11.1 assemblies. Interestingly, we found that the number of SV events on the X chromosome deviated significantly from the linear models fitted to the autosomes (R² > 0.64, p < 0.001). Specifically, deletions and insertions were deficient on the X chromosome by 66.14% and 58.41%, respectively, whereas duplications and inversions were excessive on the X chromosome by 71.96% and 107.61%, respectively. We further used large segmental duplication (SD, >1 kb) events as a proxy to understand large-scale inter-chromosomal evolution, by resolving the parental-derived relationship within each SD pair. We revealed a significant excess of SD movements from the X chromosome to autosomes (p < 0.001), consistent with the expectation of meiotic sex chromosome inactivation. Enrichment analyses indicated that the genes within derived SD copies on autosomes were significantly related to biological processes involving the nervous system, lipid biosynthesis and sperm motility (p < 0.01). Together, our analyses of the de novo assembly ASM2165605v1 provide insight into the SVs between European wild boar and domestic pig, and into the ongoing role of meiotic sex chromosome inactivation in driving inter-chromosomal interactions between the sex chromosome and autosomes.
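The contig N50 and the gap-filling fraction quoted in the abstract are simple summary statistics. The following is a minimal sketch of how such values are conventionally computed, not the authors' pipeline: the function names `n50` and `percent` are hypothetical, and the toy contig lengths are invented for illustration. Only the counts 93 and 376 come from the abstract.

```python
def n50(lengths):
    """Return the N50: the length of the contig at which the running sum of
    contigs, sorted from longest to shortest, first reaches half of the
    total assembly length."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Toy contig lengths (hypothetical values, not taken from ASM2165605v1).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))      # 70: 80 + 70 reaches half of the 300 total

# Gap counts from the abstract: 93 of 376 gaps filled.
print(percent(93, 376))      # 24.73
```

A larger N50 means that half of the assembled bases sit in contigs at least that long, which is why it is commonly reported as a contiguity measure.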
Keywords: Sus scrofa; meiotic sex chromosome inactivation; reference genome; whole genome sequencing
Year: 2022 PMID: 35238061 PMCID: PMC9314987 DOI: 10.1111/age.13181
Source DB: PubMed Journal: Anim Genet ISSN: 0268-9146 Impact factor: 2.884
FIGURE 4 The pipeline designed for identifying the segmental duplications (SDs). (a) The flowchart of the major software used and the overall process. (b) The two types of SDs: boundary‐derived SD (bSD) and internal‐derived SD (iSD)
FIGURE 1 (a) The number of gaps filled by the assemblies of ASM2165605v1 and other European pig breeds. (b) The comparison of non‐missing lengths for all chromosomes of the ASM2165605v1 and Sscrofa11.1 assemblies
The annotated protein‐coding genes and non‐coding genes in ASM2165605v1
| Type | Subtype | Count | Average length (bp) | Total length (bp) | Percentage of genome |
|---|---|---|---|---|---|
| Coding genes | | 21,400 | 34,328 | 32,493,688 | 1.3014 |
| miRNA | | 861 | 79 | 68,172 | 0.2730 |
| tRNA | | 4471 | 76 | 338,344 | 1.3551 |
| rRNA | rRNA | 135 | 242 | 32,614 | 0.1306 |
| | 18S | 9 | 1506 | 13,550 | 0.0543 |
| | 28S | 3 | 1610 | 4829 | 0.0193 |
| | 5.8S | 6 | 154 | 925 | 0.0037 |
| | 5S | 117 | 114 | 13,310 | 0.0533 |
| snRNA | snRNA | 1697 | 113 | 192,366 | 0.7705 |
| | CD‐box | 294 | 92 | 26,999 | 0.1081 |
| | HACA‐box | 277 | 135 | 37,347 | 0.1496 |
| | Splicing | 1100 | 112 | 123,581 | 0.004950 |
| | scaRNA | 26 | 171 | 4439 | 0.000178 |
The annotated genomic repeats and their summaries in ASM2165605v1
| Type | Repbase TEs: length (bp) | Repbase TEs: % of genome | Other TEs: length (bp) | Other TEs: % of genome | De novo: length (bp) | De novo: % of genome | Combined TEs: length (bp) | Combined TEs: % of genome |
|---|---|---|---|---|---|---|---|---|
| DNA | 74,618,563 | 2.99 | 3,922,533 | 0.16 | 28,749,938 | 1.15 | 76,142,807 | 3.05 |
| LINE | 486,903,714 | 19.5 | 227,213,753 | 9.1 | 551,923,837 | 22.11 | 665,121,705 | 26.64 |
| SINE | 22,342,313 | 0.89 | 0 | 0 | 25,909,379 | 1.04 | 35,962,746 | 1.44 |
| LTR | 132,835,795 | 5.32 | 6,568,546 | 0.26 | 261,679,411 | 10.48 | 318,961,338 | 12.77 |
| Satellite | 8,716,245 | 0.35 | 0 | 0 | 4,100,318 | 0.16 | 8,849,865 | 0.35 |
| Unknown | 1,147,190 | 0.05 | 9942 | 0 | 1,230,106 | 0.05 | 2,387,238 | 0.1 |
| Total | 726,563,820 | 29.1 | 237,714,774 | 9.52 | 873,592,989 | 34.99 | 1,107,425,699 | 44.35 |
DNA refers to DNA transposons, whereas LINE, SINE and LTR are retrotransposons. TEs, transposable elements.
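As a quick consistency check on the repeat table, the per-class lengths in each column can be summed and compared with the reported total, and the combined total can be divided by its reported share to recover an approximate genome size. This is only an illustrative sketch: the values are copied from the table, while the variable and function names are hypothetical.

```python
# Per-class repeat lengths (bp) from the table, in the order
# DNA, LINE, SINE, LTR, Satellite, Unknown, paired with the reported total.
columns = {
    "Repbase TEs": ([74_618_563, 486_903_714, 22_342_313,
                     132_835_795, 8_716_245, 1_147_190], 726_563_820),
    "Other TEs": ([3_922_533, 227_213_753, 0,
                   6_568_546, 0, 9_942], 237_714_774),
    "De novo": ([28_749_938, 551_923_837, 25_909_379,
                 261_679_411, 4_100_318, 1_230_106], 873_592_989),
    "Combined TEs": ([76_142_807, 665_121_705, 35_962_746,
                      318_961_338, 8_849_865, 2_387_238], 1_107_425_699),
}


def column_is_consistent(class_lengths, reported_total):
    """True if the per-class lengths add up to the reported total."""
    return sum(class_lengths) == reported_total


checks = {name: column_is_consistent(lengths, total)
          for name, (lengths, total) in columns.items()}
print(checks)

# Genome size implied by the combined total and its reported 44.35% share.
implied_genome_bp = columns["Combined TEs"][1] / (44.35 / 100)
print(f"{implied_genome_bp / 1e9:.2f} Gb")
```

All four columns sum to their reported totals, and the implied genome size is close to the ~2.5 Gb stated in the abstract.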
FIGURE 2 The structural variations between the ASM2165605v1 and Sscrofa11.1 assemblies inferred with SyRI using default parameters. The four types of variations are shown in different colors
FIGURE 3 The numbers of structural variations across chromosomes (a) and the regression of the numbers of structural variations against the lengths of chromosomes (b)
FIGURE 5 The regression of the numbers of SDs against the lengths of chromosomes. All numbers are inter‐chromosomal SD numbers. Red and blue show the directions ‘into autosomes’ and ‘into X chromosome’, respectively
FIGURE 6 The enrichment analysis of biological processes using X‐derived autosomal genes (a) and all genes (b) related to SD movements. All processes are statistically significant (p < 0.01), as visualized with colors from green to red
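The regressions in Figures 3 and 5 relate event counts to chromosome length, and the abstract reports how far the X chromosome departs from the autosomal trend. One plausible way to express such a departure, sketched below, is to fit an ordinary least-squares line to the autosomes only and compare the observed X count with the value the line predicts. This is an assumption about the calculation, not the authors' documented method; the function names and all numbers are hypothetical toy values.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def percent_deviation(observed, expected):
    """Signed deviation of observed from expected, in percent.
    Negative values indicate a deficit, positive values an excess."""
    return 100 * (observed - expected) / expected


# Toy autosome data (hypothetical): chromosome length in Mb and SV count.
autosome_lengths = [100, 150, 200, 250]
autosome_counts = [50, 75, 100, 125]

slope, intercept = fit_line(autosome_lengths, autosome_counts)

# Toy X chromosome (hypothetical): 120 Mb with 20 observed events.
x_length, x_observed = 120, 20
x_expected = slope * x_length + intercept
print(round(percent_deviation(x_observed, x_expected), 2))  # -66.67 (deficit)
```

Fitting on autosomes alone keeps the X chromosome from pulling the line toward itself, so the deviation is measured against an autosome-only expectation.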