Tina Graceline Kirubakaran, Øivind Andersen, Michel Moser, Mariann Árnyasi, Philip McGinnity, Sigbjørn Lien, Matthew Kent.
Abstract
Currently available genome assemblies for Atlantic cod (Gadus morhua) have been constructed from fish belonging to the Northeast Arctic Cod (NEAC) population, a migratory population feeding in the Barents Sea. These assemblies have been crucial for the development of genetic markers, which have been used to study population differentiation and adaptive evolution in Atlantic cod, pinpointing four discrete islands of genomic divergence located on linkage groups 1, 2, 7 and 12. In this paper, we present a high-quality reference genome from a male Atlantic cod representing a southern population inhabiting the Celtic Sea. The genome assembly (gadMor_Celtic) was produced from long-read nanopore data and has a combined contig length of 686 Mb with an N50 of 10 Mb. Integrating contigs with genetic linkage mapping information enabled us to construct 23 chromosome sequences, which mapped with high confidence to the latest NEAC population assembly (gadMor3) and allowed us to characterize, to an extent not previously reported, large chromosomal inversions on linkage groups 1, 2, 7 and 12. In most cases, inversion breakpoints could be located within single nanopore contigs. Our results suggest the presence of inversions in Celtic cod on linkage groups 6, 11 and 21, although these remain to be confirmed. Further, we identified a specific repetitive element that is relatively enriched at predicted centromeric regions. Our gadMor_Celtic assembly provides a resource representing a 'southern' cod population that is complementary to the existing 'northern' population-based genome assemblies, and it represents the first step toward developing pan-genomic resources for Atlantic cod.
Keywords: Atlantic cod; centromere repeats; chromosomal rearrangements; genome assembly; linkage map; nanopore
Year: 2020 PMID: 32641450 PMCID: PMC7466986 DOI: 10.1534/g3.120.401423
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Assembly statistics
| Assembly | Total size (bp) | Total number of contigs | Contig N50 (bp) | BUSCO annotation |
|---|---|---|---|---|
| Wtdbg2 “Trial c” | 668,357,526 | 1,600 | 6,012,173 | C:23.2% [D:0.3%], F:11.8%, M:65%, n = 4584 |
| Wtdbg2 “Trial q” | 670,278,278 | 1,666 | 6,004,590 | C:42.5% [D:0.4%], F:7.6%, M:50%, n = 4584 |
| Quickmerge contigs | 677,547,349 | 1,253 | 10,448,158 | Not done |
| Racon polishing | 683,672,734 | 1,253 | 10,518,163 | C:66.5% [D:1.2%], F:8.1%, M:25.3%, n = 4584 |
Metrics describing genome statistics of the two initial Wtdbg2 assemblies, the quickmerge assembly, the gadMor_Celtic assembly after polishing with nanopore data (Racon), and the final assembly after polishing with Illumina data (Pilon).
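The contig N50 values reported above follow a standard definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly. The sketch below illustrates that definition only. The function name `n50` and the toy contig lengths are assumptions for illustration and are not taken from the paper or from any assembler's output.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    combined length of all contigs.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Toy example (hypothetical lengths): total = 20, half = 10.
# The running sum goes 8, then 13, so the N50 is the second contig.
toy_contigs = [8, 5, 4, 3]
print(n50(toy_contigs))  # prints 5
```

Applied to the full list of contig lengths of an assembly, the same function would give the kind of figure shown in the "Contig N50 (bp)" column.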
Figure 1. Alignment of gadMor_Celtic (x-axis) and gadMor3 (y-axis) chromosome sequences for linkage groups 1, 2, 7 and 12. Vertical lines (pink) demarcate boundaries of gadMor_Celtic contigs.
Genomic regions likely containing the inversion breakpoints
| Linkage group | Breakpoint interval start (bp) | Breakpoint interval end (bp) | Interval size (bp) | Inversion size (Mb) |
|---|---|---|---|---|
| 1 | 10,782,691 | 10,787,755 | 5,064 | 17.45 |
| 1 | 18,422,802 | 18,425,099 | 2,297 | |
| 1 | 28,225,372 | 28,228,130 | 2,758 | |
| 2 | 21,733,338 | 21,733,998 | 660 | 4.51 |
| 2 | 26,233,253 | 26,238,098 | 4,840 | |
| 7 | 15,208,043 | 15,210,043 | 2,000 | 9.37 |
| 7 | 24,574,346 | 24,575,510 | 1,164 | |
| 12 | 493,527 | 635,659 | 142,132 | 13.88 |
| 12 | 14,330,965 | 14,376,973 | 46,008 | |
A pairwise comparison between gadMor_Celtic and gadMor3 reveals the interval (start and stop coordinates relative to the gadMor_Celtic assembly) for each inversion breakpoint in LGs 1, 2, 7, and 12.
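The arithmetic behind the table can be sketched as follows. The breakpoint coordinates are copied from the table. Two things are assumptions made here for illustration and are not stated in the source: that the rows belong to LGs 1, 2, 7 and 12 in the order given in the caption, and that an inversion's size is measured from the start of its first breakpoint interval to the end of its last. The names `BREAKPOINTS`, `interval_sizes` and `inversion_span_mb` are hypothetical.

```python
# Breakpoint intervals (start, end) in gadMor_Celtic coordinates, copied
# from the table. The assignment of rows to linkage groups is an
# assumption based on row order and the caption.
BREAKPOINTS = {
    "LG1": [(10_782_691, 10_787_755),
            (18_422_802, 18_425_099),
            (28_225_372, 28_228_130)],
    "LG2": [(21_733_338, 21_733_998),
            (26_233_253, 26_238_098)],
    "LG7": [(15_208_043, 15_210_043),
            (24_574_346, 24_575_510)],
    "LG12": [(493_527, 635_659),
             (14_330_965, 14_376_973)],
}


def interval_sizes(intervals):
    """Width in bp of each breakpoint interval (end minus start)."""
    return [end - start for start, end in intervals]


def inversion_span_mb(intervals):
    """Distance in Mb from the first interval start to the last interval end.

    This is one plausible way to measure inversion size; the source does
    not say exactly how the reported values were derived.
    """
    first_start = min(start for start, _ in intervals)
    last_end = max(end for _, end in intervals)
    return (last_end - first_start) / 1_000_000


for lg, intervals in BREAKPOINTS.items():
    print(lg, interval_sizes(intervals), round(inversion_span_mb(intervals), 2))
```

Comparing the printed values with the "Size (bp)" and "Inversion size (Mb)" columns is a quick consistency check on the transcribed coordinates.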
Figure 2. Putative inversions detected on LGs 6, 11 and 21.
Figure 3. Position of potential centromere-related sequence on LG04. Collinearity between LG04 genetic maps for males (red) and females (blue), and the frequency of a 258 bp tandem repeat structure (histogram) predicted to be related to centromeres.