Qinghua Liu1,2,3, Xueying Wang1,2,3, Yongshuang Xiao1,2,3, Haixia Zhao1,2,3,4, Shihong Xu1,2,3, Yanfeng Wang1,2,3, Lele Wu1,2,3,4, Li Zhou1,2,3,4, Tengfei Du1,2,3,4, Xuejiao Lv1,2,3,4, Jun Li1,2,3.
Abstract
Black rockfish (Sebastes schlegelii) is an economically important viviparous marine teleost in Japan, Korea, and China. It is characterized by internal fertilization, long-term sperm storage in the female ovary, and a high abortion rate. To better understand the mechanisms of fertilization and gestation, it is essential to establish a reference genome for viviparous teleosts. Herein, we combined Pacific Biosciences Sequel and Illumina sequencing platforms with 10× Genomics and Hi-C technology to obtain a genome assembly of 848.31 Mb comprising 24 chromosomes, with contig and scaffold N50 lengths of 2.96 and 35.63 Mb, respectively. We predicted 39.98% repetitive elements and 26,979 protein-coding genes. S. schlegelii diverged from Gasterosteus aculeatus ∼32.1-56.8 million years ago. Furthermore, sperm remained viable within the ovary for up to 6 months. The glucose transporter SLC2 showed significant positive genomic selection, and carbohydrate metabolism-related KEGG pathways were significantly up-regulated in ovaries after copulation. In vitro suppression of glycolysis with sodium iodoacetate significantly reduced sperm longevity. These results indicate the importance of carbohydrates in maintaining sperm survivability. Decoding the S. schlegelii genome not only provides new insights into sperm storage but is also highly valuable for marine researchers and reproductive biologists.
Keywords: Sebastes schlegelii; Hi-C genome assemble; PacBio sequencing; sperm storage; viviparous
Year: 2019 PMID: 31711192 PMCID: PMC6993816 DOI: 10.1093/dnares/dsz023
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
Figure 1. Photograph of the reproductive characteristics of the viviparous marine teleost Sebastes schlegelii (black rockfish). (a) Photograph of black rockfish, (b) the sperm ultrastructure, (c) sperm in the female ovary, (d) embryo in the female ovary before hatching, and (e) larval fish in the female ovary after hatching.
Summary of sequence data from S. schlegelii
| Platform | Insert size | Raw data (Gb) | Clean data (Gb) | Read length (bp) | Sequence coverage (×) | SRA accession number |
|---|---|---|---|---|---|---|
| PacBio reads | 30k | 85.78 | — | — | 101.76 | SRP173183 |
| 10× Genomics | 500–700 bp | 129.75 | 126.37 | 150 | 153.92 | SRP173183 |
| Hi-C | 350 bp | 118.90 | 118.46 | 150 | 141.05 | SRP173183 |
| Illumina reads | 350 bp | 88.08 | 88.05 | 150 | 104.49 | SRP173183 |
| In total | — | 422.51 | — | — | 501.22 | — |
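The coverage column appears to follow from dividing the raw data volume by the estimated genome size of 842.97 Mb listed in the genome-characteristics table. The Python sketch below reproduces the column under that assumption; the formula and all names in it are illustrative and are not taken from the paper's methods.

```python
# Estimated genome size (Mb), from the genome-characteristics table.
ESTIMATED_GENOME_MB = 842.97

# Raw data per platform (Gb), from the sequencing summary table.
RAW_DATA_GB = {
    "PacBio reads": 85.78,
    "10x Genomics": 129.75,
    "Hi-C": 118.90,
    "Illumina reads": 88.08,
}


def coverage(raw_gb, genome_mb=ESTIMATED_GENOME_MB):
    """Sequence coverage (x) = sequenced bases / genome size (assumed formula)."""
    return raw_gb * 1000.0 / genome_mb


coverages = {name: round(coverage(gb), 2) for name, gb in RAW_DATA_GB.items()}
total_raw_gb = round(sum(RAW_DATA_GB.values()), 2)
total_coverage = round(coverage(total_raw_gb), 2)

for name, cov in coverages.items():
    print(f"{name}: {cov}x")
print(f"In total: {total_raw_gb} Gb, {total_coverage}x")
```

Under this assumption the computed values agree with the tabulated coverages to two decimal places.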
Genome assembly of S. schlegelii
| Description | First assembly | Second assembly | Third assembly | Fourth error correction |
|---|---|---|---|---|
| Platform | PacBio | 10× Genomics | Hi-C | Illumina reads |
| Software | Falcon | FragScaff | Lachesis | Pilon |
| No. of contigs | 2,031 | 2,031 | 2,031 | 2,019 |
| Total length of contigs (Mb) | 842.15 | 843.91 | 843.91 | 843.86 |
| Contig N50 (Mb) | 2.92 | 2.93 | 2.93 | 2.96 |
| Minimum contig length (bp) | 129 | 129 | 129 | 130 |
| Maximum contig length (Mb) | 10.97 | 10.99 | 10.99 | 10.99 |
| No. of scaffolds | 2,031 | 1,471 | 854 | 854 |
| Total length of scaffolds (Mb) | 842.15 | 847.88 | 847.94 | 848.31 |
| Scaffold N50 (Mb) | 2.92 | 4.34 | 35.60 | 35.63 |
| Minimum scaffold length (bp) | 129 | 129 | 129 | 130 |
| Maximum scaffold length (Mb) | 10.97 | 15.60 | 43.18 | 43.20 |
| | 0 | 0.47 | 0.48 | 0.52 |
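The contig and scaffold N50 values in this table follow the usual definition of N50: the length L such that sequences of length L or longer together cover at least half of the total assembly. A minimal Python sketch of that definition is given below; the function name and the toy lengths are illustrative, and this is not the code used by the assembly tools named in the table.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy example: total length 54, so the half-way point is 27.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))  # 10 + 9 + 8 = 27, so the N50 is 8
```

Joining contigs into scaffolds leaves the total length nearly unchanged while concentrating it in fewer, longer sequences, which is why the scaffold N50 rises from 2.92 to 35.63 Mb across the assembly stages.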
Figure 2. The contig contact matrix of the Sebastes schlegelii genome derived from Hi-C data. Red indicates a high contact density and white a low contact density, both on a logarithmic scale. For the Hi-C analysis, the genome was divided into 100-kb bins and the number of read interactions between each pair of bins was counted. Each point in the figure represents the number of interactions between the two bins given by its horizontal and vertical coordinates, and the colour intensity represents the strength of the interaction. Genome-wide, interactions tend to be intra-chromosomal rather than inter-chromosomal.
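The binning described in this caption can be sketched in a few lines of Python. The sketch assumes a simplified, hypothetical input (pairs of mapping coordinates on a single sequence) and an assumed log10(count + 1) transform for the colour scale; it illustrates the idea only and is not the pipeline used to produce the figure.

```python
import math

BIN_SIZE = 100_000  # bin width in bp, read from the caption as 100 kb


def contact_matrix(pairs, seq_length, bin_size=BIN_SIZE):
    """Count read-pair contacts between bins along one sequence.

    `pairs` is a hypothetical input: an iterable of (pos_a, pos_b)
    0-based mapping coordinates for the two ends of each Hi-C read pair.
    """
    n_bins = math.ceil(seq_length / bin_size)
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in pairs:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # mirror the count to keep the matrix symmetric
    return matrix


def log_scale(matrix):
    """Apply log10(count + 1), an assumed transform for the heatmap colours."""
    return [[math.log10(count + 1) for count in row] for row in matrix]


toy_pairs = [(10_000, 50_000), (20_000, 150_000), (120_000, 260_000), (30_000, 40_000)]
counts = contact_matrix(toy_pairs, seq_length=300_000)
scaled = log_scale(counts)
for row in counts:
    print(row)
```

A genome-wide matrix would concatenate the bins of all chromosomes along both axes, so intra-chromosomal contacts appear as dense blocks on the diagonal.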
Figure 3. A schematic representation of the characteristics of the genome of Sebastes schlegelii. From the outer to the inner circles: I, chromosomes; II, gene density; III, repeat density; IV, coding-sequence region.
Statistics for genome characteristics of S. schlegelii
| Genome characteristic | |
|---|---|
| Estimated genome size (Mb) | 842.97 |
| Assembled genome size (Mb) | 848.31 |
| Reads mapping rate (%) | 97.93 |
| Genome coverage (%) | 99.61 |
| GC content (%) | 40.75 |
| Homology SNP (%) | 0.00038 |
| CEGMA evaluation (%) | 92.34 |
| BUSCO genome completeness | n = 2,586 |
| Complete | 2,470 (95.5%) |
| Complete and single copy | 2,400 (92.8%) |
| Complete and duplicated | 70 (2.7%) |
| Fragmented | 54 (2.1%) |
| Missing | 62 (2.4%) |
The percentage of homology SNPs reflects the accuracy of the genome assembly; the value of 0.00038% indicates that the assembly is of high quality at the single-base level.
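The BUSCO rows in the table are internally consistent, which can be checked with a short calculation. The Python sketch below recomputes the category totals and percentages from the counts in the table; the variable names are illustrative.

```python
# BUSCO counts from the genome-characteristics table (n = 2,586).
BUSCO_TOTAL = 2586
busco = {
    "complete_single_copy": 2400,
    "complete_duplicated": 70,
    "fragmented": 54,
    "missing": 62,
}


def pct(count, total=BUSCO_TOTAL):
    """Percentage of the BUSCO set, rounded to one decimal as in the table."""
    return round(100 * count / total, 1)


complete = busco["complete_single_copy"] + busco["complete_duplicated"]
accounted = complete + busco["fragmented"] + busco["missing"]

print(f"Complete: {complete} ({pct(complete)}%)")
for name, count in busco.items():
    print(f"{name}: {count} ({pct(count)}%)")
print(f"Sum of categories: {accounted}")
```

The single-copy and duplicated counts add up to the 2,470 complete BUSCOs, and all four categories together account for the full set of 2,586.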
Comparison of genome assembly versions of Sebastes schlegelii
| Dataset | Metric | FALCON+FragScaff+Lachesis+Pilon | Wtdbg2+FragScaff+Lachesis+Pilon |
|---|---|---|---|
| | Contig N50 (Mb) | 2.92 | 15.39 |
| Illumina reads | Scaffold N50 (Mb) | 35.63 | 33.81 |
| PacBio reads | Assembled genome size (Mb) | 848.31 | 784.94 |
| 10× Genomics | Reads mapping rate (%) | 97.93 | 98.29 |
| Hi-C | Genome coverage (%) | 99.61 | 99.36 |
| | GC content (%) | 40.75 | 40.81 |
| | Homology SNP (%) | 0.00038 | 0.0009 |
| | | 0.52 | 0.18 |
| | CEGMA evaluation (%) | 92.34 | 94.76 |
| | BUSCO genome completeness | 2,586 (95.5%) | 2,586 (98.0%) |
Figure 4. Comparison of the Sebastes schlegelii genome with other publicly available teleost genomes. The x axis represents the contig N50 values and the y axis represents the scaffold N50 values. The genomes sequenced with PacBio are highlighted in orange and the genome of S. schlegelii is highlighted in red.
Figure 5. FISH with DNA probes derived from the same chromosome (Chr 3), which anchored to a single chromosome, confirming the quality of the chromosome-scale assembly obtained with Hi-C. (a) Giemsa staining, (b) DAPI, (c) fluorescein-labelled, and (d) DIG-labelled, 100×.
Summary of genome annotation for S. schlegelii
| Annotation | |
|---|---|
| Repetitive sequence content | 39.98% |
| DNA | 18.06% |
| LINE | 9.59% |
| SINE | 1.08% |
| LTR | 7.26% |
| Protein-coding genes | 26,979 |
| Mean transcript length | 14,159.49 bp |
| Mean CDS length | 1,452.03 bp |
| Mean exons per gene | 8.63 |
| Mean exon length | 168.32 bp |
| Mean intron length | 1,666.16 bp |
Statistics for genome annotation of S. schlegelii
| Database | Number of annotated transcripts | % |
|---|---|---|
| Swissprot | 23,337 | 86.50 |
| Nr | 24,963 | 92.50 |
| KEGG | 21,449 | 79.50 |
| InterPro | 26,698 | 99.00 |
| GO | 24,857 | 92.10 |
| Pfam | 20,818 | 77.20 |
| Annotated | 26,775 | 99.20 |
| Unannotated | 204 | 0.80 |
Figure 6. Gene-family cluster analysis. (a) Comparison of gene families from Sebastes schlegelii and other teleosts. The horizontal axis indicates the species and the vertical axis represents the number of genes. Pink represents single-copy genes; yellow represents multiple-copy genes; deep yellow represents unique paralogues; green represents other orthologues and unclustered genes. Here, "other" covers genes outside the first three categories: genes that were not clustered into any gene family, or that clustered into a gene family found in only some of the species. (b) Venn diagram of the gene families. Ssc, Sebastes schlegelii; Gac, Gasterosteus aculeatus; Tru, Takifugu rubripes; Tni, Tetraodon nigroviridis.
Figure 7. Estimation of the time of divergence of Sebastes schlegelii. Note: The numbers on the nodes represent the divergence times (millions of years ago, mya).
Figure 8. The interaction of the ovary microenvironment and sperm storage. (a) Heatmap of carbohydrate metabolism-related gene expression from pre-copulation (FII) to post-copulation (FIII–IV); the higher the gene expression, the lighter the colour. Expression was quantified as FPKM (expected number of fragments per kilobase of transcript sequence per million base pairs sequenced). (b) Sperm survival time in vitro for the control and experimental (sodium iodoacetate treatment) groups. Values are shown as mean ± standard deviation, with the error bars representing the standard deviation.
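For readers unfamiliar with FPKM, the following Python sketch shows the calculation in its most common form, which normalises by transcript length in kilobases and by millions of mapped fragments. That form, the function name, and the toy numbers are assumptions for illustration; the expression pipeline behind the figure is not described in this record.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    Uses the common definition (an assumption here):
    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments).
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Toy numbers: 500 fragments on a 2,000-bp transcript in a library of
# 20 million mapped fragments.
example = fpkm(500, 2_000, 20_000_000)
print(example)
```

Because the value is scaled by both transcript length and library size, FPKM allows expression to be compared across genes of different lengths and across samples sequenced to different depths, such as the pre- and post-copulation ovary samples in panel (a).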