Eunkyung Choi1, Seung Jae Lee1, Euna Jo1,2, Jinmu Kim1, Steven J Parker3, Jeong-Hoon Kim2, Hyun Park1.
Abstract
The Muraenolepididae family of fishes, known as eel cods, inhabits continental slopes and shelves in the Southern Hemisphere. This family belongs to the order Gadiformes, which constitutes one of the most important commercial fish resources worldwide, but the classification of species in this order is ambiguous because it is based only on morphological and habitat characteristics. Here, the genome of the Patagonian moray cod was sequenced on the Illumina HiSeq platform and screened for microsatellite motifs. K-mer analysis (K = 25) predicted a genome size of 748.97 Mb with a heterozygosity rate of 0.768%. The genome assembly had a total scaffold size of 711.92 Mb and an N50 scaffold length of 1522 bp. Additionally, 4,447,517 microsatellite motifs were identified from the genome survey assembly, the most abundant motif type being AC/GT. In summary, these data may facilitate the identification of molecular markers in the Patagonian moray cod and provide a good basis for further whole-genome sequencing with long-read sequencing and chromosome conformation capture technologies, as well as for population genetics studies.
Keywords: Illumina; Muraenolepis orangiensis; Patagonian moray cod; SSR; microsatellite
Year: 2022 PMID: 35804506 PMCID: PMC9265078 DOI: 10.3390/ani12131608
Source DB: PubMed Journal: Animals (Basel) ISSN: 2076-2615 Impact factor: 3.231
Statistics of the genome-sequencing data.
| Raw Data (bp) | Q20 (%) | Q30 (%) | GC Content (%) |
|---|---|---|---|
| 54,142,458,226 | 93.3 | 87.2 | 49.5 |
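The figures in this table can be put in context with two small calculations: the per-base error probability implied by a Phred score, and the average raw sequencing depth. The sketch below is illustrative only. The function names are hypothetical, and dividing the raw data by the 25-mer genome size estimate is an assumption about how depth might be gauged, not a value reported in this record.

```python
# Values copied from the tables in this record.
RAW_BASES = 54_142_458_226        # raw data (bp)
GENOME_SIZE_25MER = 748_978_687   # 25-mer genome size estimate (bp)


def phred_error_probability(q: float) -> float:
    """Per-base error probability implied by a Phred quality score Q."""
    return 10 ** (-q / 10)


def sequencing_depth(total_bases: int, genome_size: int) -> float:
    """Average raw depth: total sequenced bases divided by genome size."""
    return total_bases / genome_size


if __name__ == "__main__":
    # Q20 and Q30 are the thresholds used in the table columns.
    print(f"Q20 error probability: {phred_error_probability(20):.3f}")
    print(f"Q30 error probability: {phred_error_probability(30):.4f}")
    depth = sequencing_depth(RAW_BASES, GENOME_SIZE_25MER)
    print(f"Approximate raw depth: {depth:.1f}x")
```

Under these assumptions the raw data correspond to roughly 72-fold coverage of the estimated genome, before any quality filtering.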
Genome size estimation via K-mer analysis.
| K-mer | Genome Size (bp) | Heterozygosity (%) | Duplication Ratio (%) |
|---|---|---|---|
| 17-mer | 709,066,708 | 0.82 | 1.3 |
| 19-mer | 723,179,522 | 0.832 | 1.23 |
| 25-mer | 748,978,687 | 0.768 | 1.18 |
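This record does not state which tool or formula produced the estimates above. As a rough illustration, a common first-order approach divides the total number of k-mer occurrences by the depth of the main peak in the k-mer frequency histogram. The sketch below assumes that approach; the function name, the `min_depth` error cutoff, and the toy histogram are all hypothetical and are not data from the study.

```python
from typing import Mapping


def estimate_genome_size(histogram: Mapping[int, int], min_depth: int = 3) -> float:
    """First-order genome size estimate from a k-mer frequency histogram.

    `histogram` maps a k-mer depth to the number of distinct k-mers observed
    at that depth. Depths below `min_depth` are treated as sequencing errors
    and ignored (an assumed cutoff, not a value from the source).
    """
    solid = {depth: n for depth, n in histogram.items() if depth >= min_depth}
    if not solid:
        raise ValueError("no k-mers at or above min_depth")
    # Depth at which the largest number of distinct k-mers occurs.
    peak_depth = max(solid, key=solid.get)
    # Total k-mer occurrences contributed by the retained depths.
    total_kmers = sum(depth * n for depth, n in solid.items())
    return total_kmers / peak_depth


if __name__ == "__main__":
    # Toy histogram: an error tail at depths 1-2 and a main peak at depth 20.
    toy = {1: 5000, 2: 800, 19: 100, 20: 300, 21: 100}
    print(estimate_genome_size(toy))
```

In a heterozygous genome the histogram can show a second peak at about half the main depth, so taking the tallest peak is only a simplification; dedicated model-fitting tools handle that case more carefully.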
Figure 1. K-mer analysis (K = 25).
Statistics of the assembled genomic sequences.
| Statistic | MaSuRCA |
|---|---|
| Number of scaffolds | 661,719 |
| Total size of scaffolds (bp) | 711,920,928 |
| Longest scaffold (bp) | 67,330 |
| Number of scaffolds > 1K nt | 211,863 |
| Number of scaffolds > 10K nt | 1199 |
| N50 scaffold length (bp) | 1522 |
| L50 scaffold count | 120,833 |
| GC content (%) | 45.7 |
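For readers unfamiliar with the N50 and L50 rows, the sketch below shows the standard definition: scaffolds are sorted from longest to shortest, and N50 is the length of the scaffold at which the running total first reaches half of the assembly size, while L50 is the number of scaffolds needed to get there. The function name and the toy lengths are hypothetical; this is not the assembler's own reporting code.

```python
from typing import Iterable, Tuple


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # `length` is N50; `count` scaffolds were needed, which is L50.
            return length, count
    raise AssertionError("unreachable: the running total always reaches half")


if __name__ == "__main__":
    # Toy example: total 30, half 15; 10 + 8 = 18 reaches it at the 2nd scaffold.
    print(n50_l50([10, 8, 6, 4, 2]))  # (8, 2)
```

Read against the table, an N50 of 1522 bp with an L50 of 120,833 means that the 120,833 longest scaffolds, each at least 1522 bp long, together hold at least half of the 711,920,928 bp assembly.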
Distribution of the microsatellite motifs by number of repeats.
| Repeat Motif | 5 | 6 | 7 | 8 | 9 | 10 | 11–20 | >20 | Total |
|---|---|---|---|---|---|---|---|---|---|
| Dinucleotide | | | | | | | | | |
| AC/GT | 329,788 | 224,159 | 178,326 | 154,725 | 144,969 | 134,928 | 710,095 | 223,051 | 2,100,041 |
| AG/CT | 230,644 | 141,955 | 96,287 | 67,788 | 50,979 | 39,120 | 158,771 | 65,210 | 850,754 |
| AT/AT | 119,678 | 76,695 | 51,221 | 38,042 | 31,331 | 24,631 | 122,875 | 10,068 | 474,541 |
| CG/CG | 3286 | 1359 | 427 | 156 | 141 | 15 | 0 | 0 | 5384 |
| Trinucleotide | | | | | | | | | |
| AGG/CCT | 64,598 | 41,234 | 30,104 | 21,653 | 15,244 | 10,409 | 42,322 | 4735 | 230,299 |
| AAT/ATT | 40,273 | 26,132 | 19,051 | 14,810 | 12,857 | 9999 | 26,160 | 1321 | 150,603 |
| ACC/GGT | 37,009 | 28,553 | 19,322 | 12,185 | 6659 | 4288 | 5583 | 102 | 113,701 |
| AAC/GTT | 31,660 | 18,564 | 11,204 | 5687 | 3476 | 1592 | 3708 | 404 | 76,295 |
| AAG/CTT | 18,427 | 12,484 | 7928 | 6032 | 4720 | 3360 | 11,275 | 7388 | 71,614 |
| ATC/GAT | 14,451 | 10,771 | 6815 | 4771 | 3575 | 2427 | 7904 | 795 | 51,509 |
| AGC/GCT | 16,945 | 9439 | 5433 | 3676 | 2141 | 1205 | 2433 | 324 | 41,596 |
| ACT/AGT | 9510 | 6022 | 4239 | 2686 | 2189 | 1570 | 4066 | 564 | 30,846 |
| CCG/CGG | 10,871 | 5428 | 2073 | 1244 | 695 | 401 | 616 | 0 | 21,328 |
| ACG/CGT | 2637 | 1513 | 640 | 252 | 123 | 53 | 113 | 0 | 5331 |
| Tetranucleotide | | | | | | | | | |
| ACAG/CTGT | 12,245 | 8574 | 5599 | 3472 | 3155 | 2310 | 7692 | 510 | 43,557 |
| AGGG/CCCT | 10,965 | 6691 | 4420 | 2849 | 2176 | 1828 | 1754 | 0 | 30,683 |
| ACGC/GCGT | 1952 | 1653 | 885 | 758 | 607 | 757 | 3761 | 786 | 11,159 |
| AAAG/CTTT | 2532 | 1485 | 1131 | 527 | 407 | 380 | 2587 | 802 | 9851 |
| AAAC/GTTT | 3692 | 2538 | 1319 | 861 | 371 | 258 | 254 | 0 | 9293 |
| AAAT/ATTT | 3812 | 1907 | 822 | 476 | 285 | 98 | 520 | 89 | 8009 |
| ACTC/GAGT | 1846 | 1499 | 853 | 459 | 319 | 828 | 1834 | 215 | 7853 |
| ACAT/ATGT | 2236 | 1207 | 920 | 565 | 375 | 481 | 1481 | 171 | 7436 |
| AAGG/CCTT | 2474 | 1144 | 607 | 304 | 178 | 136 | 655 | 242 | 5740 |
| ATCC/GGAT | 2643 | 1129 | 527 | 344 | 110 | 128 | 473 | 77 | 5431 |
| Others | 14,624 | 7669 | 4304 | 2517 | 1748 | 1238 | 4132 | 547 | 36,779 |
| Pentanucleotide | | | | | | | | | |
| AGAGG/CCTCT | 2090 | 1072 | 733 | 467 | 606 | 365 | 2378 | 33 | 7744 |
| AATGT/ACATT | 976 | 554 | 333 | 225 | 82 | 12 | 9 | 0 | 2191 |
| AATCT/AGATT | 1044 | 392 | 244 | 183 | 72 | 42 | 9 | 0 | 1986 |
| AACAC/GTGTT | 897 | 358 | 270 | 149 | 69 | 15 | 45 | 0 | 1803 |
| AGGGG/CCCCT | 968 | 419 | 161 | 113 | 15 | 0 | 0 | 0 | 1676 |
| AATAT/ATATT | 431 | 319 | 201 | 116 | 129 | 60 | 180 | 0 | 1436 |
| Others | 2932 | 1735 | 739 | 694 | 268 | 212 | 509 | 0 | 7089 |
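The record does not name the software or parameters used to screen for microsatellites. As a minimal sketch of how such a screen can work, the code below uses regular-expression backreferences to find tandem repeats. The motif lengths of 2 to 5 bp and the minimum of 5 repeats simply mirror the rows and columns of the table above and are assumptions, as are the function names. Unlike the table, the sketch does not merge a motif with its reverse complement (for example AC with GT).

```python
import re
from typing import List, Tuple


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit (e.g. not 'ACAC')."""
    n = len(motif)
    return not any(
        n % p == 0 and motif == motif[:p] * (n // p) for p in range(1, n)
    )


def find_microsatellites(
    seq: str,
    min_repeats: int = 5,
    motif_lengths: Tuple[int, ...] = (2, 3, 4, 5),
) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for each tandem repeat found."""
    seq = seq.upper()
    hits = []
    for k in motif_lengths:
        # A k-base motif followed by at least (min_repeats - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence with an (AC)6 run and an (AAT)5 run.
    toy = "TTG" + "AC" * 6 + "GGC" + "AAT" * 5 + "C"
    for start, motif, count in find_microsatellites(toy):
        print(start, motif, count)
```

Skipping non-primitive motifs keeps a long AC run from also being reported as an ACAC repeat and keeps homopolymer runs out of the results. A production screen would also need to handle compound repeats and adjacent repeats that this simple scan can miss.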
Figure 2. PCR products corresponding to the microsatellites. M is the 20 bp DNA marker; 4% agarose gel electrophoresis was used. The numbers indicate the order of the primer pairs.