Lidia de Los Ríos-Pérez, Julien A Nguinkal, Marieke Verleih, Alexander Rebl, Ronald M Brunner, Jan Klosa, Nadine Schäfer, Marcus Stüeken, Tom Goldammer, Dörte Wittenburg.
Abstract
Pikeperch (Sander lucioperca) is a fish species with growing economic significance in the aquaculture industry. However, successful positioning of pikeperch in large-scale aquaculture requires advances in our understanding of its genome organization. In this study, an ultra-high density linkage map for pikeperch comprising 24 linkage groups and 1,023,625 single nucleotide polymorphism (SNP) markers was constructed after genotyping whole-genome sequencing data from 11 broodstock and 363 progeny, belonging to 6 full-sib families. The sex-specific linkage maps spanned a total of 2985.16 cM in females and 2540.47 cM in males with an average inter-marker distance of 0.0030 and 0.0026 cM, respectively. The sex-averaged map spanned a total of 2725.53 cM with an average inter-marker distance of 0.0028 cM. Furthermore, the sex-averaged map was used to improve the contiguity and accuracy of the current pikeperch genome assembly. Based on 723,360 markers, 706 contigs were anchored and oriented into 24 pseudomolecules, covering a total of 896.48 Mb and accounting for 99.47% of the assembled genome size. The overall contiguity of the assembly improved, with a scaffold N50 length of 41.06 Mb. Finally, an updated annotation of protein-coding genes and repetitive elements of the enhanced genome assembly is provided at NCBI.
Year: 2020 PMID: 33339898 PMCID: PMC7749136 DOI: 10.1038/s41598-020-79358-z
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Pipeline showing the number of variants involved in the different steps. SNP hard-filtering criteria: QualByDepth (QD) < 10.0, Quality (QUAL) < 30.0, StrandOddsRatio (SOR) > 3.0, FisherStrand (FS) > 60.0, RMSMappingQuality (MQ) < 40.0, MappingQualityRankSumTest (MQRankSum) < -12.5 and ReadPosRankSumTest (ReadPosRankSum) < -8.0. Indel hard-filtering criteria: QualByDepth (QD) < 2.0, Quality (QUAL) < 30.0, FisherStrand (FS) > 200.0 and ReadPosRankSumTest (ReadPosRankSum) < -20.0.
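The hard-filtering criteria in the Figure 1 caption can be read as a set of threshold rules, where a variant is flagged as soon as any rule fires. The following is a minimal Python sketch of that reading, not the pipeline the authors ran: the function name `failed_rules`, the dictionary layout, and the choice to skip missing annotations are all assumptions made for illustration.

```python
# Thresholds copied from the Figure 1 caption. Each rule describes a
# condition under which a variant is filtered OUT.
SNP_RULES = {
    "QD": ("<", 10.0),
    "QUAL": ("<", 30.0),
    "SOR": (">", 3.0),
    "FS": (">", 60.0),
    "MQ": ("<", 40.0),
    "MQRankSum": ("<", -12.5),
    "ReadPosRankSum": ("<", -8.0),
}
INDEL_RULES = {
    "QD": ("<", 2.0),
    "QUAL": ("<", 30.0),
    "FS": (">", 200.0),
    "ReadPosRankSum": ("<", -20.0),
}


def failed_rules(annotations, rules):
    """Return the sorted names of the rules that a variant fails.

    `annotations` is a hypothetical dict mapping annotation name to value.
    Missing annotations are skipped, which is an assumption of this sketch.
    """
    failed = []
    for name, (op, threshold) in rules.items():
        value = annotations.get(name)
        if value is None:
            continue
        if (op == "<" and value < threshold) or (op == ">" and value > threshold):
            failed.append(name)
    return sorted(failed)


# Made-up annotation values, used only to exercise the rules.
example_snp = {"QD": 12.4, "QUAL": 250.0, "SOR": 3.5, "FS": 2.1, "MQ": 60.0}
print(failed_rules(example_snp, SNP_RULES))  # ['SOR']
```

A variant would pass this sketch's filter when the returned list is empty; keeping the rule names in the output makes it easy to tally which criterion removes the most variants.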
Matings and number of individuals sampled from each family.
| Family | Sire Id | Dam Id | Number of progeny |
|---|---|---|---|
| 1 | 1 | 2 | 29 |
| 2 | 3 | 4 | 98 |
| 3 | 5 | 6 | 3 |
| 4 | 7 | 8 | 224 |
| 5 | 9 | 10 | 15 |
| 6 | 9 | 11 | 6 |
Description of the female, male and sex-averaged linkage maps. LG: linkage group, cM: centiMorgan, F:M: female:male.
| LG | Number of SNPs | Female: distinct positions | Female: LG length (cM) | Female: average inter-marker distance (cM) | Female: max gap (cM) | Male: distinct positions | Male: LG length (cM) | Male: average inter-marker distance (cM) | Male: max gap (cM) | Sex-averaged: distinct positions | Sex-averaged: LG length (cM) | Sex-averaged: average inter-marker distance (cM) | Sex-averaged: max gap (cM) | F:M length ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 59,051 | 381 | 129.69 | 0.0022 | 9.37 | 135 | 99.05 | 0.0017 | 28.58 | 508 | 111.65 | 0.0019 | 12.28 | 1.31 |
| 2 | 57,505 | 441 | 152.63 | 0.0027 | 3.72 | 197 | 87.68 | 0.0015 | 5.53 | 626 | 119.66 | 0.0021 | 4.16 | 1.74 |
| 3 | 51,714 | 325 | 120.87 | 0.0023 | 7.48 | 187 | 94.85 | 0.0018 | 25.73 | 505 | 105.69 | 0.0020 | 11.23 | 1.27 |
| 4 | 53,468 | 392 | 122.71 | 0.0023 | 2.44 | 163 | 117.02 | 0.0022 | 39.42 | 550 | 120.40 | 0.0023 | 15.92 | 1.05 |
| 5 | 49,759 | 345 | 120.06 | 0.0024 | 4.69 | 184 | 143.95 | 0.0029 | 39.82 | 518 | 127.05 | 0.0026 | 19.23 | 0.83 |
| 6 | 56,492 | 380 | 147.84 | 0.0026 | 10.12 | 192 | 145.25 | 0.0026 | 24.69 | 557 | 144.61 | 0.0026 | 13.59 | 1.02 |
| 7 | 43,973 | 336 | 120.50 | 0.0027 | 12.98 | 174 | 80.88 | 0.0018 | 5.74 | 501 | 100.51 | 0.0023 | 9.48 | 1.49 |
| 8 | 43,504 | 272 | 111.41 | 0.0026 | 4.01 | 188 | 117.91 | 0.0027 | 15.77 | 449 | 112.68 | 0.0026 | 8.00 | 0.94 |
| 9 | 36,159 | 313 | 135.60 | 0.0037 | 9.37 | 148 | 102.00 | 0.0028 | 13.88 | 444 | 117.58 | 0.0033 | 7.41 | 1.33 |
| 10 | 49,364 | 337 | 119.22 | 0.0024 | 8.08 | 175 | 87.34 | 0.0018 | 11.35 | 504 | 103.80 | 0.0021 | 9.69 | 1.37 |
| 11 | 45,511 | 340 | 131.79 | 0.0029 | 8.71 | 140 | 102.26 | 0.0022 | 17.27 | 473 | 114.95 | 0.0025 | 7.89 | 1.29 |
| 12 | 52,318 | 405 | 176.19 | 0.0034 | 10.03 | 177 | 111.91 | 0.0021 | 12.06 | 564 | 142.63 | 0.0027 | 5.77 | 1.57 |
| 13 | 42,566 | 278 | 126.05 | 0.0030 | 11.75 | 185 | 125.63 | 0.0030 | 14.99 | 443 | 124.13 | 0.0029 | 10.20 | 1.00 |
| 14 | 38,321 | 285 | 102.72 | 0.0027 | 4.55 | 120 | 83.00 | 0.0022 | 26.95 | 398 | 90.70 | 0.0024 | 11.68 | 1.24 |
| 15 | 42,062 | 370 | 144.93 | 0.0034 | 22.77 | 184 | 96.34 | 0.0023 | 13.52 | 548 | 119.93 | 0.0029 | 17.93 | 1.50 |
| 16 | 35,329 | 257 | 113.01 | 0.0032 | 9.37 | 161 | 99.85 | 0.0028 | 9.37 | 401 | 109.55 | 0.0031 | 6.93 | 1.13 |
| 17 | 35,432 | 298 | 101.64 | 0.0029 | 6.15 | 143 | 93.37 | 0.0026 | 36.11 | 436 | 93.99 | 0.0027 | 14.86 | 1.09 |
| 18 | 35,434 | 362 | 138.10 | 0.0039 | 5.84 | 146 | 89.53 | 0.0025 | 17.27 | 490 | 112.45 | 0.0032 | 7.89 | 1.54 |
| 19 | 33,516 | 316 | 131.74 | 0.0039 | 10.37 | 182 | 113.13 | 0.0034 | 30.61 | 483 | 122.43 | 0.0037 | 11.27 | 1.16 |
| 20 | 28,022 | 223 | 91.21 | 0.0033 | 8.05 | 148 | 83.28 | 0.0030 | 9.36 | 365 | 86.59 | 0.0031 | 4.77 | 1.10 |
| 21 | 30,845 | 330 | 115.03 | 0.0037 | 3.72 | 157 | 115.09 | 0.0037 | 12.98 | 476 | 113.86 | 0.0037 | 7.49 | 1.00 |
| 22 | 29,046 | 226 | 85.79 | 0.0030 | 4.31 | 168 | 142.53 | 0.0049 | 53.73 | 386 | 106.49 | 0.0037 | 19.97 | 0.60 |
| 23 | 42,635 | 328 | 126.10 | 0.0030 | 7.73 | 152 | 100.89 | 0.0024 | 9.37 | 464 | 112.37 | 0.0026 | 4.56 | 1.25 |
| 24 | 31,599 | 265 | 120.35 | 0.0038 | 9.60 | 111 | 107.74 | 0.0034 | 17.27 | 370 | 111.83 | 0.0035 | 9.17 | 1.12 |
| Total | 1,023,625 | 7,805 | 2,985.16 | – | – | 3,917 | 2,540.47 | – | – | 11,459 | 2,725.53 | – | – | – |
| Average | 42,651 | 325.21 | 124.38 | 0.0030 | 8.13 | 163.21 | 105.85 | 0.0026 | 20.47 | 477.46 | 113.56 | 0.0028 | 10.47 | 1.21 |
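The derived columns of the linkage-map table can be reproduced from the raw ones. The sketch below assumes, based on the tabulated values rather than on any statement in the source, that the average inter-marker distance is the LG length divided by the number of SNPs and that the F:M ratio is the female length divided by the male length. The helper names are hypothetical.

```python
# (LG, number of SNPs, female length in cM, male length in cM),
# copied from three rows of the table above.
ROWS = [
    (1, 59_051, 129.69, 99.05),
    (2, 57_505, 152.63, 87.68),
    (22, 29_046, 85.79, 142.53),
]


def mean_spacing(length_cm, n_snps):
    """Average inter-marker distance in cM (assumed to be length / marker count)."""
    return length_cm / n_snps


def fm_ratio(female_cm, male_cm):
    """Female-to-male map length ratio for one linkage group."""
    return female_cm / male_cm


for lg, n_snps, female_cm, male_cm in ROWS:
    print(
        lg,
        round(mean_spacing(female_cm, n_snps), 4),
        round(mean_spacing(male_cm, n_snps), 4),
        round(fm_ratio(female_cm, male_cm), 2),
    )
```

For these rows the rounded results match the table (for example 0.0022 cM, 0.0017 cM and 1.31 for LG 1). The "Average" row appears to be the mean of the per-LG values rather than a ratio of the totals, which is why 2725.53 cM over 1,023,625 markers does not give exactly 0.0028 cM.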
Figure 2. Genetic positions of markers for the 24 linkage groups in the (a) female, (b) male and (c) sex-averaged linkage maps. A black bar represents a SNP marker. The scale on the left indicates the genetic position in centiMorgan (cM).
Comparison of statistics between our chromosome-scale assembly and the first published pikeperch draft assembly (GenBank accession PRJNA561467). Genome annotation metrics were taken from Nguinkal et al. (2019)[21]. Differences between the statistics shown in this table and those at NCBI are due to the use of different genome annotation services.
| Metric | Chromosome-scale assembly | Draft assembly |
|---|---|---|
| Total assembly size (bp) | 901,221,791 | 900,477,756 |
| Number of contigs | 1048 | 1966 |
| Contig N50 length (bp) | 6,348,792 | 2,995,800 |
| Number of scaffolds | 336 | 1313 |
| Scaffold N50 length (bp) | 41,060,379 | 4,929,547 |
| Longest scaffold size (bp) | 54,393,628 | 19,065,786 |
| Scaffold L50 | 10 | 52 |
| Base-level accuracy | 99.9996 (Q50) | 99.998 (Q40) |
| Σ Scaffolds > 10 Mb (% of assembly size) | 99.47 | 26.60 |
| Σ Unplaced scaffolds (% of assembly size) | 0.53 | - |
| GC-content (%) | 41.00 | 40.91 |
| Complete BUSCO | 4434 (96.73%) | 4413 (96.27%) |
| Complete and single copy BUSCO | 4332 (94.50%) | 4301 (93.83%) |
| Complete and duplicated BUSCO | 102 (2.23%) | 112 (2.44%) |
| Fragmented BUSCO | 73 (1.59%) | 89 (1.94%) |
| Missing BUSCO | 77 (1.68%) | 82 (1.79%) |
| Number of genes | 36,010 | 24,278 |
| Number of protein-coding genes | 33,456 | 21,249 |
| Mean gene length (bp) | 10,697 | 10,961 |
| Mean CDS length (bp) | 1451 | 1313 |
| Mean exon count per CDS | 7.80 | 6.70 |
| Coding genes with homology-based functional annotation | 31,234 (93.36%) | 18,536 (87.23%) |
| Mean intron length (bp) | 2276 | 1696 |
| Mean exon length (bp) | 156 | 196 |
| % of genome covered by exons | 3.82 | 3.11 |
| Number of tRNA | 2345 | 2313 |
| Number of rRNA | 160 | 180 |
| Number of miRNA | 145 | 166 |
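The scaffold N50 and L50 in the table follow from the standard definition: sort sequences from longest to shortest and stop at the first one whose cumulative length reaches half of the assembly size. The sketch below applies that definition to the 24 pseudomolecule lengths (in Mb) listed in the chromosome table of this record. The function name and the optional `total` argument are assumptions for illustration; passing `total` is only valid when every omitted sequence is shorter than the N50, which holds here because unplaced scaffolds make up 0.53% of the assembly.

```python
def n50_l50(lengths, total=None):
    """Return (N50, L50) for a collection of sequence lengths.

    `total` lets the caller supply the full assembly size when only the
    longest sequences are listed (an assumption of this sketch).
    """
    ordered = sorted(lengths, reverse=True)
    half = (total if total is not None else sum(ordered)) / 2
    running = 0.0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("listed lengths cover less than half of the total")


# Pseudomolecule lengths in Mb, copied from the chromosome table.
CHROMOSOME_MB = [
    54.39, 49.41, 46.65, 45.68, 44.88, 43.59, 43.41, 42.48,
    42.11, 41.06, 40.55, 39.47, 36.97, 35.01, 34.13, 32.13,
    31.68, 31.57, 31.48, 29.81, 29.61, 29.18, 20.93, 20.30,
]
ASSEMBLY_MB = 901.221791  # total assembly size from the table, in Mb

print(n50_l50(CHROMOSOME_MB, total=ASSEMBLY_MB))  # (41.06, 10)
```

The result agrees with the reported scaffold N50 of 41,060,379 bp and scaffold L50 of 10.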
Figure 3. (a) The percentage coverage of the most abundant families of transposable elements in pikeperch. LINE: long interspersed nuclear elements; LTR: long terminal repeat. Correlation between (b) total intron length, (c) total exon length, and (d) gene content per chromosome and the pikeperch chromosome size (Mb).
Description of chromosomes ordered by size with corresponding LG. LG: linkage group, Mb: Megabase.
| Chromosome | LG | No. of anchored markers | Integrated contigs | No. of genes | Physical length (Mb) | Gene density (genes/Mb) |
|---|---|---|---|---|---|---|
| 1 | 15 | 33,239 | 33 | 2095 | 54.39 | 38.52 |
| 2 | 4 | 38,017 | 26 | 2071 | 49.41 | 41.92 |
| 3 | 1 | 38,493 | 34 | 1598 | 46.65 | 34.25 |
| 4 | 2 | 40,980 | 24 | 1846 | 45.68 | 40.41 |
| 5 | 6 | 33,387 | 48 | 1675 | 44.88 | 37.32 |
| 6 | 12 | 35,005 | 31 | 1722 | 43.59 | 39.50 |
| 7 | 3 | 41,147 | 24 | 1692 | 43.41 | 38.98 |
| 8 | 10 | 29,918 | 35 | 1739 | 42.48 | 40.94 |
| 9 | 5 | 40,404 | 33 | 1793 | 42.11 | 42.58 |
| 10 | 23 | 27,205 | 25 | 1777 | 41.06 | 43.28 |
| 11 | 11 | 32,750 | 29 | 1668 | 40.55 | 41.14 |
| 12 | 18 | 24,823 | 41 | 1697 | 39.47 | 43.00 |
| 13 | 19 | 28,731 | 24 | 1329 | 36.97 | 35.95 |
| 14 | 9 | 24,299 | 29 | 1260 | 35.01 | 35.99 |
| 15 | 7 | 28,542 | 24 | 1249 | 34.13 | 36.59 |
| 16 | 24 | 18,151 | 38 | 1263 | 32.13 | 39.31 |
| 17 | 13 | 28,606 | 58 | 1288 | 31.68 | 40.65 |
| 18 | 17 | 33,245 | 27 | 1383 | 31.57 | 43.81 |
| 19 | 21 | 23,452 | 26 | 1288 | 31.48 | 40.91 |
| 20 | 22 | 21,440 | 14 | 1202 | 29.81 | 40.32 |
| 21 | 8 | 31,401 | 19 | 1531 | 29.61 | 51.70 |
| 22 | 14 | 25,569 | 31 | 1254 | 29.18 | 42.98 |
| 23 | 16 | 20,259 | 19 | 708 | 20.93 | 33.83 |
| 24 | 20 | 24,297 | 14 | 864 | 20.30 | 42.57 |
| Total | - | 723,360 | 706 | 35,992 | 896.48 | - |
| Average | - | 30,140 | 29.42 | 1500 | 37.35 | 40.27 |
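The gene density column can be checked as the number of genes divided by the physical length. The sketch below is a small consistency check under that assumption; the function name and the 0.02 genes/Mb tolerance are choices made for illustration. A tolerance is needed because the tabulated lengths are rounded to 0.01 Mb, so the last digit of a recomputed density can differ from the reported one.

```python
# (chromosome, number of genes, physical length in Mb, reported density),
# copied from three rows of the table above.
ROWS = [
    (1, 2095, 54.39, 38.52),
    (21, 1531, 29.61, 51.70),
    (23, 708, 20.93, 33.83),
]


def gene_density(n_genes, length_mb):
    """Genes per Mb for one chromosome."""
    return n_genes / length_mb


for chrom, genes, length_mb, reported in ROWS:
    density = gene_density(genes, length_mb)
    # Allow a small tolerance because the input lengths are rounded.
    print(chrom, round(density, 2), abs(density - reported) < 0.02)
```

All three rows agree with the reported values within the tolerance, including chromosome 21, which has the highest density in the table.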
Figure 4. Gene density on each pikeperch chromosome ordered by length and distribution of non-coding RNA loci including miRNA (orange triangle), tRNA (purple circle) and rRNA (green square). The colour code within each chromosome represents the gene density from low (blue) to high (red) in a window of 1 Mb.