Xi-Wen Xu1,2, Chang-Wei Shao1,2, Hao Xu1,3, Qian Zhou1,2, Feng You4, Na Wang1,2, Wen-Long Li1, Ming Li1,3, Song-Lin Chen5,6.
Abstract
Turbot (Scophthalmus maximus) is a commercially important flatfish species in aquaculture. It shows pronounced sexual dimorphism, with females growing faster than males. In the present study, we sequenced and de novo assembled female and male turbot genomes. The assembled female genome was 568 Mb (scaffold N50, 6.2 Mb; BUSCO, 97.4%), and the male genome was 584 Mb (scaffold N50, 5.9 Mb; BUSCO, 96.6%). Using two genetic maps, we anchored female scaffolds representing 535 Mb onto 22 chromosomes. Annotation of the anchored female genome identified 87.8 Mb of transposable elements and 20,134 genes. We identified 17,936 gene families, of which 369 were flatfish specific. Phylogenetic analysis showed that turbot, Japanese flounder and Chinese tongue sole form a clade that diverged from other teleosts approximately 78 Mya. This report of female and male turbot draft genomes and annotated genes provides a new resource for identifying sex determination genes, elucidating the evolution of adaptive traits in flatfish and developing genetic techniques to increase the sustainability of turbot aquaculture.
Year: 2020 PMID: 32165614 PMCID: PMC7067757 DOI: 10.1038/s41597-020-0426-6
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Fig. 1 Paraffin sectioning and HE staining of gonadal tissues of the female and male turbot. (a) Section of the ovary. (b) Section of the testis.
Summary of sequencing data.
| Libraries | Female turbot: total raw data (Gb) | Female turbot: total clean data (Gb) | Male turbot: total raw data (Gb) | Male turbot: total clean data (Gb) |
|---|---|---|---|---|
| 170 bp | 20.33 | 19.75 | / | / |
| 230 bp | / | / | 65.93 | 59.85 |
| 500 bp | 11.02 | 9.95 | 49.37 | 47.58 |
| 800 bp | 8.91 | 7.42 | 19.72 | 17.9 |
| 2 kb | 31.08 | 28.81 | 16.84 | 13.38 |
| 5 kb | 8.39 | 7.44 | 22.55 | 17.81 |
| 10 kb | 13.01 | 11.25 | 21.99 | 18.08 |
| 20 kb | 1.86 | 1.79 | / | / |
| 40 kb | 4.9 | 2.88 | / | / |
| Total | 99.5 | 89.29 | 196.4 | 174.6 |
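The totals row can be cross-checked from the per-library values, and the share of raw data that survived filtering follows directly from it. The snippet below is a minimal sketch of that arithmetic; the names `LIBRARIES`, `column_total` and `retention` are illustrative and not taken from the study, and "retention" here simply means clean data divided by raw data.

```python
# Per-library yields in Gb, copied from the table above.
# Column order: female raw, female clean, male raw, male clean.
# None marks a library that was not sequenced for that sex ("/" in the table).
LIBRARIES = {
    "170 bp": (20.33, 19.75, None, None),
    "230 bp": (None, None, 65.93, 59.85),
    "500 bp": (11.02, 9.95, 49.37, 47.58),
    "800 bp": (8.91, 7.42, 19.72, 17.9),
    "2 kb": (31.08, 28.81, 16.84, 13.38),
    "5 kb": (8.39, 7.44, 22.55, 17.81),
    "10 kb": (13.01, 11.25, 21.99, 18.08),
    "20 kb": (1.86, 1.79, None, None),
    "40 kb": (4.9, 2.88, None, None),
}


def column_total(index):
    """Sum one column across all libraries, skipping missing entries."""
    values = (row[index] for row in LIBRARIES.values())
    return round(sum(v for v in values if v is not None), 2)


def retention(raw_gb, clean_gb):
    """Percentage of raw data retained after filtering (illustrative definition)."""
    return round(100 * clean_gb / raw_gb, 1)


totals = [column_total(i) for i in range(4)]
female_retention = retention(totals[0], totals[1])
male_retention = retention(totals[2], totals[3])

print(totals)
print(female_retention, male_retention)
```

Under these assumptions the recomputed sums match the reported totals, and roughly 89 to 90% of the raw data is retained for each sex.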
Turbot genome assembly statistics.
| Genome assembly | Female turbot | Male turbot |
|---|---|---|
| Contig N50 Size (kb) | 12.16 | 16.52 |
| Contig No. (>1 kb) | 73,671 | 57,539 |
| Longest Contig (kb) | 197.81 | 132.66 |
| Total Contig Length (Mb) | 541.51 | 553.24 |
| Scaffold N50 Size (Mb) | 6.17 | 5.93 |
| Scaffold No. (>1 kb) | 6,292 | 1,064 |
| Longest Scaffold (Mb) | 19.88 | 19.47 |
| Total Scaffold Length (Mb) | 568.45 | 584.74 |
| GC Content (%) | 43.42 | 43.70 |
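For readers unfamiliar with the N50 statistic reported above, the following is a minimal sketch of the standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The record does not say which tool produced the reported values, so the function `n50` and the toy lengths are purely illustrative.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    at which the running total first reaches half of the assembly size.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not turbot data).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))
```

In the toy example the total is 32, so the two sequences of length 8 already reach the halfway point of 16.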
Predicted levels of different genomic repeat elements.
| Type | Repbase TEs: length (bp) | Repbase TEs: % in genome | TE proteins: length (bp) | TE proteins: % in genome | De novo: length (bp) | De novo: % in genome | Combined TEs: length (bp) | Combined TEs: % in genome |
|---|---|---|---|---|---|---|---|---|
| DNA | 20,826,307 | 3.66 | 1,913,179 | 0.34 | 16,586,479 | 2.92 | 32,827,840 | 5.77 |
| LINE | 8,911,117 | 1.57 | 5,680,889 | 1.00 | 7,515,445 | 1.32 | 13,233,157 | 2.33 |
| LTR | 8,577,908 | 1.51 | 2,067,745 | 0.36 | 1,931,900 | 0.34 | 10,207,971 | 1.80 |
| SINE | 2,054,064 | 0.36 | 0 | 0.00 | 2,462,561 | 0.43 | 2,749,822 | 0.48 |
| Other | 7,610 | 0.00 | 0 | 0.00 | 5,880,197 | 1.03 | 5,887,807 | 1.04 |
| Unknown | 0 | 0.00 | 0 | 0.00 | 33,943,746 | 5.97 | 33,943,746 | 5.97 |
| Total | 36,237,231 | 6.37 | 9,656,599 | 1.70 | 67,224,071 | 11.82 | 87,802,760 | 15.44 |
Note: Repbase TEs, the results of RepeatMasker based on Repbase; TE proteins, the results of RepeatProteinMask based on Repbase; De novo, the results of RepeatMasker by using the library predicted through De novo; Combined, all the results combined.
Summary of predicted protein-coding genes in the female turbot genome.
| Gene set | Number | Average gene length (bp) | Average CDS length (bp) | Average exon per gene | Average exon length (bp) | Average intron length (bp) |
|---|---|---|---|---|---|---|
| Augustus | 27,283 | 11,982 | 1,373 | 7.84 | 175.09 | 1,551 |
| Genscan | 26,365 | 15,475 | 1,579 | 9.5 | 166.23 | 1,635 |
| Homolog | 19,789 | 9,844 | 1,540 | 9.51 | 161.89 | 976 |
| Homolog | 17,209 | 10,649 | 1,579 | 9.65 | 163.73 | 1,049 |
| Homolog | 20,057 | 8,619 | 1,425 | 8.7 | 163.86 | 935 |
| Homolog | 11,245 | 14,191 | 1,710 | 11.19 | 152.82 | 1,225 |
| Homolog | 18,185 | 10,426 | 1,601 | 9.84 | 162.71 | 998 |
| Homolog | 17,043 | 8,835 | 1,446 | 8.99 | 160.85 | 925 |
| Homolog | 19,749 | 9,928 | 1,533 | 9.14 | 167.61 | 1,031 |
| Homolog | 20,028 | 11,190 | 1,693 | 9.76 | 173.5 | 1,085 |
| RNA-seq | 17,668 | 8,050 | 1,826 | 8.52 | 214.3 | 868 |
| Final set | 20,134 | 10,322 | 1,605 | 9.63 | 166.63 | 1,010 |
Note: Gene length includes the lengths of the exon and intron regions but not the lengths of the UTRs. The accession numbers of the RNA-seq data in this study are SRR4853423 and SRR346085.
Fig. 2 Comparisons of gene parameters among the Scophthalmus maximus, Danio rerio, Paralichthys olivaceus, Gasterosteus aculeatus, Oryzias latipes, Takifugu rubripes, Tetraodon nigroviridis and Homo sapiens genomes. (a) Gene length distributions of the species. (b) CDS length distributions of the species. (c) Exon number distributions of the species. (d) Exon length distributions of the species. (e) Intron length distributions of the species. The y-axes of (a, b, d, e) show density, while the y-axis of (c) shows the ratio of genes.
Fig. 3 Venn diagram of the numbers of unique and shared gene families among nine sequenced teleost species.
Fig. 4 Evolution of orthologous gene families and their estimated divergence times in nine teleost species. The blue numbers on the nodes are the divergence times in million years ago (Mya). The red circles indicate the calibration times.
The comparison between the new male and female genome assemblies and the reference genome assembly of turbot.
| Metric | Reference genome | Female genome | Male genome |
|---|---|---|---|
| Total Bases | 524,979,463 | 568,483,288 | 587,187,767 |
| Aligned Bases | 520,165,145 (99.08%) | 552,306,146 (97.15%) | 562,821,085 (95.85%) |
| Unaligned Bases | 4,814,318 (0.92%) | 16,177,142 (2.85%) | 24,366,682 (4.15%) |
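The percentages in this comparison follow from the base counts alone. As a minimal sketch, assuming each percentage is simply the aligned or unaligned count divided by the total bases of the same assembly, the figures can be recomputed as below; the names `ASSEMBLIES` and `alignment_summary` are illustrative and not part of the study.

```python
# Base counts copied from the table above: (total bases, aligned bases).
ASSEMBLIES = {
    "reference": (524_979_463, 520_165_145),
    "female": (568_483_288, 552_306_146),
    "male": (587_187_767, 562_821_085),
}


def alignment_summary(total, aligned):
    """Return (unaligned bases, aligned %, unaligned %) for one assembly."""
    unaligned = total - aligned
    aligned_pct = round(100 * aligned / total, 2)
    unaligned_pct = round(100 * unaligned / total, 2)
    return unaligned, aligned_pct, unaligned_pct


summary = {name: alignment_summary(*counts) for name, counts in ASSEMBLIES.items()}

for name, (unaligned, aligned_pct, unaligned_pct) in summary.items():
    print(f"{name}: {unaligned:,} unaligned, {aligned_pct}% aligned, {unaligned_pct}% unaligned")
```

The recomputed values agree with the table, which confirms that the aligned and unaligned rows partition each assembly's total bases.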
Fig. 5 Circos graph of whole-genome synteny analysis between the female genome and the reference genome of turbot.
| Measurement(s) | DNA • genome • sequence_assembly • sequence feature annotation |
| Technology Type(s) | DNA sequencing • sequence assembly process • sequence annotation |
| Factor Type(s) | sex |
| Sample Characteristic - Organism | Scophthalmus maximus |