Sequencing of the black rockfish chromosomal genome provides insight into sperm storage in the female ovary.

Qinghua Liu1,2,3, Xueying Wang1,2,3, Yongshuang Xiao1,2,3, Haixia Zhao1,2,3,4, Shihong Xu1,2,3, Yanfeng Wang1,2,3, Lele Wu1,2,3,4, Li Zhou1,2,3,4, Tengfei Du1,2,3,4, Xuejiao Lv1,2,3,4, Jun Li1,2,3.   

Abstract

Black rockfish (Sebastes schlegelii) is an economically important viviparous marine teleost in Japan, Korea, and China. It is characterized by internal fertilization, long-term sperm storage in the female ovary, and a high abortion rate. To better understand the mechanisms of fertilization and gestation, it is essential to establish a reference genome for viviparous teleosts. Herein, we used a combination of the Pacific Biosciences Sequel and Illumina sequencing platforms, 10× Genomics, and Hi-C technology to obtain a genome assembly of 848.31 Mb comprising 24 chromosomes, with contig and scaffold N50 lengths of 2.96 and 35.63 Mb, respectively. We predicted 39.98% repetitive elements and 26,979 protein-coding genes. S. schlegelii diverged from Gasterosteus aculeatus ∼32.1-56.8 million years ago. Furthermore, sperm remained viable within the ovary for up to 6 months. The glucose transporter SLC2 showed significant positive selection, and carbohydrate metabolism-related KEGG pathways were significantly up-regulated in ovaries after copulation. In vitro suppression of glycolysis with sodium iodoacetate significantly reduced sperm longevity. These results indicate the importance of carbohydrates in maintaining sperm survivability. Decoding the S. schlegelii genome not only provides new insights into sperm storage but is also highly valuable for marine researchers and reproductive biologists.
© The Author(s) 2019. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

Keywords: Sebastes schlegelii; Hi-C genome assembly; PacBio sequencing; sperm storage; viviparous

Year:  2019        PMID: 31711192      PMCID: PMC6993816          DOI: 10.1093/dnares/dsz023

Source DB:  PubMed          Journal:  DNA Res        ISSN: 1340-2838            Impact factor:   4.458


1. Introduction

Black rockfish (Sebastes schlegelii Hilgendorf) is an economically important viviparous marine teleost of the family Sebastidae that inhabits the seas of Japan, Korea, and China. In northern China, however, high rates of abortion during gestation cause substantial economic losses. Black rockfish copulate from November through December by means of a specialized urogenital papilla. After copulation, sperm is stored within the female ovary, where it remains viable for up to 6 months. Fertilization is internal and, in China, occurs from April to May of the following year. Both fertilization and embryo hatching take place within the female ovary. Thus, black rockfish is considered an attractive viviparous fish model for studies on reproductive specialization (Fig. 1), particularly studies focusing on the mechanisms underlying reproductive strategy, sperm storage, sperm competition, and sexual selection, and studies attempting to overcome the problems associated with abortion during gestation. To date, however, information regarding the genetic basis of viviparity in marine teleosts remains scarce.
Figure 1

Photograph of the reproductive characteristics of the viviparous marine teleost Sebastes schlegelii (black rockfish). (a) Photograph of black rockfish, (b) the sperm ultrastructure, (c) sperm in the female ovary, (d) embryo in the female ovary before hatching, and (e) larva fish in the female ovary after hatching.

Sperm storage is a common reproductive strategy among vertebrate species characterized by internal fertilization. However, the mechanism of long-term sperm storage tends to be species-specific owing to differences in storage organs. Numerous studies on mammals, birds, and insects have focused on issues associated with long-term sperm storage in females. Such studies have indicated that energy metabolism plays a key role in sperm survivability and in maintaining sperm viability. Accordingly, it has been speculated that carbohydrates produced in female sperm storage organs could serve as the metabolic substrates required for long-term sperm storage. In the present study, we describe the first chromosome-level S. schlegelii genome assembly, obtained by combining the Pacific Biosciences (PacBio) Sequel sequencing platform with 10× Genomics and Hi-C mapping technologies to improve the assembly. This genome will provide a valuable resource for researchers seeking to elucidate the mechanisms underlying key aspects of the reproductive biology of S. schlegelii; in addition, it will contribute to the culture of larvae of this species. To our knowledge, this study is the first to report genome information for a viviparous marine teleost. Moreover, we provide new insights into the long-term storage of sperm in the female ovary through transcriptome and sperm physiological analyses.

2. Materials and methods

2.1. Sample collection

A male black rockfish (S. schlegelii) was collected from Penglai, China, and used to generate the genome sequence data. Fresh muscle samples were obtained under sterile conditions and stored in liquid nitrogen until used for genomic DNA extraction. Genomic DNA was obtained using standard SDS phenol/chloroform extraction and purification protocols, and its quality was assessed. For chromosome preparation, a 2-year-old male black rockfish was anesthetized with MS222 (100 μg/ml), and colchicine (2.5 μg/g) was injected at the base of the pectoral fin. The head kidney was collected 4 h later to prepare the chromosomes.

2.2. DNA sequencing

For PacBio Sequel sequencing, MagBeads bound with DNA-polymerase complexes were loaded at 0.1 nM (on-plate concentration) onto 14 single-molecule real-time (SMRT) Cells. Single-molecule sequences were generated with C4 chemistry on the PacBio Sequel platform. Thereafter, a single 10× Genomics Linked-Read library was constructed and sequenced on the Illumina HiSeq X Ten platform, and a Hi-C library was prepared with formaldehyde fixation, restriction enzyme digestion, and biotin labelling. Finally, 350-bp paired-end libraries were constructed and sequenced on the Illumina HiSeq X Ten platform.

2.3. Genome size estimation

The genome size of black rockfish was estimated using the k-mer method (Supplementary Fig. S1).
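
The k-mer method is not described further in this section. As a minimal sketch, assuming the commonly used estimator (genome size ≈ total number of k-mers divided by the depth of the main peak of the k-mer frequency histogram), the calculation might look as follows. The function name and the histogram values are hypothetical illustrations, not the authors' pipeline or data.

```python
def estimate_genome_size(kmer_histogram, min_depth=3):
    """Estimate genome size from a k-mer frequency histogram.

    kmer_histogram maps depth (how often a k-mer occurs in the reads)
    to the number of distinct k-mers observed at that depth.
    Depths below min_depth are treated as sequencing errors and ignored
    (an assumption of this sketch).
    """
    usable = {d: n for d, n in kmer_histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # The main peak approximates the average k-mer coverage depth.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences: distinct k-mers times their depth.
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth, peak_depth


# Toy histogram (hypothetical numbers, NOT data from this study).
toy_histogram = {1: 5000, 2: 800, 18: 100, 19: 300, 20: 600, 21: 300, 22: 100}
size, peak = estimate_genome_size(toy_histogram)
print(f"peak depth = {peak}, estimated size = {size:.0f} bp")
```

Real estimates usually also account for heterozygous and repeat peaks, which this sketch deliberately ignores.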

2.4. Genome assembly

2.4.1. PacBio assembly

The FALCON assembler was used to assemble the third-generation long reads into contigs of the S. schlegelii genome. The FALCON assembly process was as follows. (i) DALIGNER was used to perform error correction, according to the probability of insertion, deletion, and sequencing errors, yielding pre-assembled reads. (ii) LASort and LAMerge were used for overlap detection among the pre-assembled reads; a layout of the overlapping reads was then generated to obtain de novo assembled reference contigs. (iii) In a resequencing step, the single-pass long reads were mapped to the de novo assembled reference contigs to obtain a base-quality-aware consensus from the uniquely mapped reads. In addition to FALCON, wtdbg2 was also used to assemble the third-generation long reads, through alignment (KBM), assembly (FBG), and error correction (daccord).

2.4.2. 10× genomics assembly

Quiver was used to refine the genome. Initially, PacBio contigs were scaffolded, and then fragScaff was used to obtain super-scaffolds using 10× Genomics Linked-Read data.

2.4.3. Chromosomal-level genome assembly using Hi-C

To obtain a chromosomal-level assembly, we used the Hi-C sequence library with the Lachesis software. First, BWA was used to map the Hi-C clean reads to the polished draft S. schlegelii genome. Thereafter, the contigs were clustered, ordered, and oriented. Contigs were clustered into chromosome groups according to the number of read pairs linking each pair of contigs: the more read pairs two contigs shared, the stronger their interaction and the more likely they were to be placed in the same group, with the number of groups set to the number of S. schlegelii chromosomes. Within each group, the contigs were then ordered and assigned orientations in line with the strength and location of the interactions between the reads. Juicebox was used to correct contig orientation, and finally the chromosomes were anchored. In summary, the chromosomal-level assembly of the black rockfish genome was based on the restriction sites in the sequences and the link relationships from Hi-C, from which we constructed a map, computed the weights, and connected the contigs (scaffolds) for each chromosome.
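
The clustering step can be illustrated with a small sketch. It assumes a greedy average-linkage scheme in which Hi-C link counts between contigs are normalized by a size proxy (for example, the number of restriction sites) and clusters are merged until the expected number of chromosome groups remains. This is an illustration of the general idea only, not the Lachesis implementation; the function name, contig names, and link counts are hypothetical.

```python
from itertools import combinations


def cluster_contigs(links, sizes, n_groups):
    """Greedy average-linkage clustering of contigs by Hi-C link density.

    links maps frozenset({a, b}) to the number of read pairs joining contigs a and b.
    sizes maps each contig to a size proxy (e.g. number of restriction sites).
    Clusters are merged until n_groups remain or no linked pair is left.
    """
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    clusters = [frozenset([c]) for c in sizes]

    def density(c1, c2):
        # Average normalized link count over all contig pairs between two clusters.
        total = sum(
            links.get(frozenset((a, b)), 0) / (sizes[a] * sizes[b])
            for a in c1
            for b in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > n_groups:
        best = max(combinations(clusters, 2), key=lambda pair: density(*pair))
        if density(*best) == 0:
            break  # the remaining clusters share no Hi-C links
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return sorted(sorted(c) for c in clusters)


# Toy example with four hypothetical contigs and two expected groups.
toy_sizes = {"A": 10, "B": 10, "C": 10, "D": 10}
toy_links = {
    frozenset(("A", "B")): 90,
    frozenset(("C", "D")): 80,
    frozenset(("A", "C")): 5,
    frozenset(("B", "D")): 3,
}
groups = cluster_contigs(toy_links, toy_sizes, n_groups=2)
print(groups)
```

In the toy data, the strongly linked pairs end up in the same group, mirroring the observation that Hi-C interactions are mostly intra-chromosomal.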

2.4.4. Final assembly refinement

Illumina short reads were initially mapped to the chromosomal-level genome assembly version using the BWA software. Subsequently, we applied Pilon to correct the remaining base errors with short reads according to the map results.

2.5. Genome quality evaluation

The accuracy of the assembled S. schlegelii genome was evaluated by mapping short sequence reads to the assembly using the BWA program, followed by variant calling with SAMtools. CEGMA, with the core genes from the vrt dataset, and BUSCO analyses were used to evaluate the completeness of the S. schlegelii genome assembly. The genome assemblies produced by FALCON and wtdbg2 were compared to obtain a more reliable genome assembly. Furthermore, we compared the characteristics of the S. schlegelii genome with those of other teleost species. In addition, after completing the genome assembly, we confirmed its quality by fluorescence in situ hybridization (FISH): probes designed from a single assembled chromosome should be anchored on the same physical chromosome. Two genes of interest from chr3, 3.816 and 3.70, were used. First, we created a local BLAST database of the S. schlegelii genome. Second, we extracted the sequences of the two genes. Third, we searched each of them against the local database with BLAST and selected the chr3-specific section for probe design. Fourth, PCR amplification, gel electrophoresis, and purification and sequencing of the PCR products were performed. Primers that yielded a single PCR band of the correct size and sequence were used for probe preparation. The probes were synthesized by PCR; 3.816 was labelled with digoxigenin (DIG) and 3.70 with fluorescein. The PCR system followed a modified ExTaq multiplex system (TAKARA) with 1 μg of high-purity DNA template. The probes were purified using a sequencing reaction clean-up kit (Sigma). Detection was conducted with anti-DIG and anti-fluorescein POD antibodies, and signal amplification with the TSA Plus fluorescein/TMR kit (PerkinElmer). Mounting was performed with ProLong Gold antifade (Molecular Probes, Life Technologies). Images were obtained with a Nikon Eclipse Ni microscope.

2.6. Annotation

2.6.1. Repetitive-sequences annotation

Tandem Repeats Finder was used to detect tandem repetitive elements in the S. schlegelii genome. RepeatModeler (http://www.repeatmasker.org/RepeatModeler.html) was used to identify genomic transposable elements (TEs) de novo, and Repbase was used as the library of known repeats. The de novo and known libraries were then combined, and RepeatMasker was used to identify the TEs in the S. schlegelii genome.

2.6.2. Gene structural and functional annotation

The structural and functional annotations of the assembled genome were conducted using de novo, homology-based, and RNA-seq methods. Augustus, GeneID, GENSCAN, GlimmerHMM, and SNAP were used for de novo gene prediction. Thereafter, protein sequences from Cynoglossus semilaevis, Paralichthys olivaceus, Takifugu rubripes, Oreochromis niloticus, Monopterus albus, Hippocampus comes, Oryzias latipes, Xiphophorus maculatus, Oncorhynchus mykiss, and Danio rerio were searched against the S. schlegelii genome using TBLASTN. RNA-seq data assembled using Trinity were aligned against the S. schlegelii genome. Putative exon regions and splice junctions were identified by mapping the RNA-seq data to the genome with TopHat, and the mapped reads were then assembled into gene models using Cufflinks. All the gene models were integrated using EVidenceModeler (EVM). We compared the genomic structural characteristics of the S. schlegelii genome with those of the genomes of closely related species. Gene functions were annotated using BLAST against the SwissProt, Nr, Pfam, GO, KEGG, and InterPro databases: gene structures were predicted first, and the predicted genes were then compared against the known databases to obtain their functional information. The homology-based prediction proceeded in three steps. First, the protein sequences of the homologous species were aligned to the S. schlegelii genome with blastall, with parameters set as follows: -p tblastn (program), -e 1e-05 (expectation value), -F T (low-complexity region, LCR, filter). Second, we combined the BLAST hits with the Solar software, set as follows: -a prot2genome2, -c (cluster and construct multi-blocks), -C (do not examine the overlap in the query), -n 100000 (maximum gap length), -d -1 (minimum depth for repeats; -1 stands for no masking). Finally, we predicted the full gene structure based on the BLAST hits with GeneWise, using the following options: -trev (compare on the reverse strand), -tfor (compare on the forward strand), -genesf (show gene structure with supporting evidence), -gff (Gene Feature Format output), -sum (show summary output).

2.6.3. ncRNA annotation

Non-coding RNA in the S. schlegelii genome was predicted by BLAST against the human rRNA database, tRNAscan-SE, INFERNAL, and the Rfam database.

2.7. Phylogenetic analysis and estimation of divergence time

OrthoMCL was used to cluster genes into gene families. The maximum likelihood (ML) method was used for phylogenetic analysis, and PAML was used to estimate divergence times.

2.8. Microenvironment of the female ovary

Six cDNA libraries were constructed using total RNA from pre-copulatory (FII) and post-copulatory (FIII–IV) female ovaries, with three biological replicates at each stage. Clean reads were assembled into non-redundant transcripts, which were then clustered into unigenes. Differential gene expression was analysed between the pre- and post-copulatory stages.

2.9. Sperm analysis

Fresh sperm was collected into 200-μl centrifuge tubes by gently hand-stripping testes dissected from ripe males in November. Five males were sampled, and the three individuals showing sperm motility >80% were used in subsequent experiments. The sperm of these three individuals was pooled to eliminate individual differences and then divided into a control group and a treatment group. Male serum was added to each group as a sperm activator. In the treatment group, glycolysis was suppressed with sodium iodoacetate at 0.125 mM. Both groups were kept at 4 °C. Sperm motility parameters and longevity were determined using an SCA Evolution CASA sperm class analyser (Barcelona, Spain).

3. Results

3.1. Genome sequencing and assembly

The size of the S. schlegelii genome was estimated at 842.97 Mb (Supplementary Fig. S1), and the assembled genome size was 848.31 Mb. The initial 85.78 Gb (101.76× coverage) of PacBio data (Table 1) yielded N50 lengths between 15.66 and 25.20 kb. Subsequently, 129.75 Gb (153.92× coverage) of sequencing data were obtained from the 10× Genomics Linked-Read library (Table 1). Adding these data resulted in an 847.88-Mb draft genome comprising 1,471 scaffolds, with the scaffold N50 improving from 2.92 to 4.34 Mb (Table 2). Following this step, a total of 118.90 Gb (141.05× coverage) of Hi-C data were generated to assist the assembly at the chromosomal level. We then clustered 951 contigs into 24 groups using Lachesis (Fig. 2), of which 641 contigs were reliably anchored on chromosomes by Hi-C; these accounted for 67.40% of the clustered contigs by number and 96.19% of the total genome by base count. This third refinement resulted in a draft genome of 847.94 Mb with 854 scaffolds and an improved scaffold N50 of 35.60 Mb (Table 2). Finally, we corrected the remaining errors using Pilon (Table 2). The final draft genome was 848.31 Mb in size, comprising 854 scaffolds, with a contig N50 of 2.96 Mb and a scaffold N50 of 35.63 Mb. A schematic representation of the characteristics of the S. schlegelii genome is shown in Figure 3.
Table 1

Summary of sequence data from S. schlegelii

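| Platform | Insert size | Raw data (Gb) | Clean data (Gb) | Read length (bp) | Sequence coverage (×) | SRA accession number |
|---|---|---|---|---|---|---|
| PacBio reads | 30k | 85.78 | - | - | 101.76 | SRP173183 |
| 10× Genomics | 500–700 bp | 129.75 | 126.37 | 150 | 153.92 | SRP173183 |
| Hi-C | 350 bp | 118.90 | 118.46 | 150 | 141.05 | SRP173183 |
| Illumina reads | 350 bp | 88.08 | 88.05 | 150 | 104.49 | SRP173183 |
| In total | - | 422.51 | - | - | 501.22 | - |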
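
The coverage values in Table 1 are consistent with dividing the raw data volume by the estimated genome size of 842.97 Mb. A short check, assuming that this is how the coverage column was derived (the section does not state the formula explicitly), is sketched below.

```python
ESTIMATED_GENOME_MB = 842.97  # k-mer estimate reported in Section 3.1

# Raw data volumes (Gb) taken from Table 1.
raw_data_gb = {
    "PacBio reads": 85.78,
    "10x Genomics": 129.75,
    "Hi-C": 118.90,
    "Illumina reads": 88.08,
}


def sequence_coverage(raw_gb, genome_mb=ESTIMATED_GENOME_MB):
    """Return fold coverage: sequenced bases divided by genome size."""
    return raw_gb * 1000.0 / genome_mb


coverage = {name: round(sequence_coverage(gb), 2) for name, gb in raw_data_gb.items()}
total_coverage = round(sum(coverage.values()), 2)

for name, cov in coverage.items():
    print(f"{name}: {cov}x")
print(f"In total: {total_coverage}x")
```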
Table 2

Genome assembly of S. schlegelii

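| Description | First assembly | Second assembly | Third assembly | Fourth error correction |
|---|---|---|---|---|
| Platform | PacBio | 10× Genomics | Hi-C | Illumina reads |
| Software | Falcon | FragScaff | Lachesis | Pilon |
| No. of contigs | 2,031 | 2,031 | 2,031 | 2,019 |
| Total length of contigs (Mb) | 842.15 | 843.91 | 843.91 | 843.86 |
| Contig N50 (Mb) | 2.92 | 2.93 | 2.93 | 2.96 |
| Minimum contig length (bp) | 129 | 129 | 129 | 130 |
| Maximum contig length (Mb) | 10.97 | 10.99 | 10.99 | 10.99 |
| No. of scaffolds | 2,031 | 1,471 | 854 | 854 |
| Total length of scaffolds (Mb) | 842.15 | 847.88 | 847.94 | 848.31 |
| Scaffold N50 (Mb) | 2.92 | 4.34 | 35.60 | 35.63 |
| Minimum scaffold length (bp) | 129 | 129 | 129 | 130 |
| Maximum scaffold length (Mb) | 10.97 | 15.60 | 43.18 | 43.20 |
| N (%) | 0 | 0.47 | 0.48 | 0.52 |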
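
The contig and scaffold N50 values in Table 2 follow the standard definition: the length L such that sequences of length at least L contain half of the assembled bases. A minimal sketch of that calculation is given below; the function name and the toy lengths are illustrative and not taken from the assembly.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    Sequences are sorted from longest to shortest and their lengths are
    accumulated; the N50 is the length of the sequence at which the running
    total first reaches half of the total assembled length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (hypothetical, NOT data from this study).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Joining contigs into scaffolds leaves the contig lengths unchanged but concentrates the bases in fewer, longer sequences, which is why the scaffold N50 in Table 2 rises sharply after the Hi-C step while the contig N50 barely changes.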
Figure 2

The contig contact matrix of the Sebastes schlegelii genome derived from Hi-C data. In the plot, red indicates a high contact density and white a low contact density (logarithmic scale). For the Hi-C analysis, the genome was divided into 100-kb bins, and the number of interactions between each pair of bins was calculated. Each point in the figure represents the number of interactions between the bins at the corresponding horizontal and vertical coordinates, and the colour intensity represents the strength of the interaction. Genome-wide, interactions tend to be more intra-chromosomal than inter-chromosomal.

Figure 3

A schematic representation of the characteristics of the genome of Sebastes schlegelii. From the outer to the inner circles: I, chromosomes; II, gene density; III, repeat density; IV, coding-sequence region.

3.2. Genome quality evaluation

A total of 97.93% of the short sequence reads mapped to the genome assembly, covering 99.61% of it. We used SAMtools (http://dept.qdio.cas.cn/emblc/ktzjs/hyjg/zncy/) to process the BWA alignments: sort by chromosome coordinate, remove duplicate reads, call SNPs, filter the raw calls, and finally obtain the percentage of homozygous single-nucleotide polymorphisms (SNPs). The homozygous SNP rate was 0.00038%. Because this rate reflects the accuracy of the genome assembly, a value of 0.00038% indicates that the assembly is of high quality at the single-base level. Moreover, CEGMA and BUSCO analyses were used to evaluate the assembly quality, providing scores of 92.34% and 95.5%, respectively (Table 3). In the BUSCO analysis summarized in Table 3, 2.4% of the genes were missing and 2.1% were fragmented, together adding up to 4.5%. There were 127 genes missing in the BUSCO dataset. We extracted the peptide IDs of the missing genes and aligned them against the peptide sequences of S. schlegelii using BLAST. The alignment percentages were all <50%, indicating that these genes are not present in the S. schlegelii genome. The results therefore confirmed that the genes reported as missing by BUSCO could not be aligned. Furthermore, the two genome assembly versions of S. schlegelii were compared (Table 4). The scaffold N50 and genome coverage of the FALCON version (35.63 Mb, 99.61%) were higher than those of the wtdbg2 version (33.81 Mb, 99.36%), whereas the contig N50 and homozygous SNP rate of the FALCON version (2.92 Mb, 0.00038%) were lower than those of the wtdbg2 version (15.39 Mb, 0.0009%). The assembled S. schlegelii genome was also compared with those of other teleost species (Fig. 4 and Supplementary Table S1). The N50 lengths of both contigs and scaffolds are shown in Supplementary Table S2. Two-colour DNA probes designed from a single assembled chromosome (chr3) were anchored on the same chromosome (Fig. 5).
Table 3

Statistics for genome characteristic of S. schlegelii

| Genome characteristic | Value |
|---|---|
| Estimated genome size (Mb) | 842.97 |
| Assembled genome size (Mb) | 848.31 |
| Reads mapping rate (%) | 97.93 |
| Genome coverage (%) | 99.61 |
| GC content (%) | 40.75 |
| Homology SNP (%) | 0.00038 |
| CEGMA evaluation (%) | 92.34 |
| BUSCO genome completeness | n = 2586 |
| BUSCO: complete | 2470 (95.5%) |
| BUSCO: complete and single copy | 2400 (92.8%) |
| BUSCO: complete and duplicated | 70 (2.7%) |
| BUSCO: fragmented | 54 (2.1%) |
| BUSCO: missing | 62 (2.4%) |

The percentage of homology (homozygous) SNPs reflects the accuracy of the genome assembly; the value of 0.00038% shows that the assembly is of high quality at the single-base level.
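
The BUSCO percentages in Table 3 can be reproduced from the reported counts. The sketch below simply recomputes them from the table values, assuming each percentage is the category count divided by the 2,586 BUSCO groups searched and rounded to one decimal place.

```python
BUSCO_TOTAL = 2586  # number of BUSCO groups searched (Table 3)

# Category counts taken from Table 3.
busco_counts = {
    "Complete and single copy": 2400,
    "Complete and duplicated": 70,
    "Fragmented": 54,
    "Missing": 62,
}


def as_percent(count, total=BUSCO_TOTAL):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# "Complete" is the sum of the single-copy and duplicated categories.
complete = busco_counts["Complete and single copy"] + busco_counts["Complete and duplicated"]
busco_percent = {name: as_percent(n) for name, n in busco_counts.items()}
busco_percent["Complete"] = as_percent(complete)

for name, pct in busco_percent.items():
    print(f"{name}: {pct}%")
```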

Table 4

Genome assembly versions comparison of Sebastes schlegelii

Dataset: S. schlegelii (Illumina reads, PacBio reads, 10× Genomics, Hi-C)

| Metric | FALCON+FragScaff+Lachesis+Pilon | Wtdbg2+FragScaff+Lachesis+Pilon |
|---|---|---|
| Contig N50 (Mb) | 2.92 | 15.39 |
| Scaffold N50 (Mb) | 35.63 | 33.81 |
| Assembled genome size (Mb) | 848.31 | 784.94 |
| Reads mapping rate (%) | 97.93 | 98.29 |
| Genome coverage (%) | 99.61 | 99.36 |
| GC content (%) | 40.75 | 40.81 |
| Homology SNP (%) | 0.00038 | 0.0009 |
| N (%) | 0.52 | 0.18 |
| CEGMA evaluation (%) | 92.34 | 94.76 |
| BUSCO genome completeness | 2,586 (95.5%) | 2,586 (98.0%) |
Figure 4

Comparison of the Sebastes schlegelii genome with other publicly available teleost genomes. The x axis represents the contig N50 values and the y axis represents the scaffold N50 values. The genomes sequenced with PacBio are highlighted in orange and the genome of S. schlegelii is highlighted in red.

Figure 5

FISH DNA probes obtained from an identical chromosome (Chr 3) anchored on the same chromosome to confirm the quality of chromosome-scale assembly using Hi-C. (a) Giemsa staining, (b) DAPI, (c) fluorescein-labelled, and (d) DIG-labelled, 100×.

3.3. Genome annotation of black rockfish

The RNA-seq data for S. schlegelii and the genomes of 10 other teleost species were used for the structural and functional annotations (Supplementary Table S2). Repetitive elements accounted for 39.98% of the S. schlegelii genome; the main transposable elements were DNA transposons (18.06%) and retrotransposable elements (17.93%) (Table 5). Among the 26,979 protein-coding genes, 26,775 (99.20%) were functionally annotated (Table 6). We compared the structure of the S. schlegelii genome with those of closely related species. The mean number of exons per gene was 8.63 (Supplementary Table S3).
Table 5

Summary of genome annotation for S. schlegelii

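| Annotation | Value |
|---|---|
| Repetitive sequence content | 39.98% |
| Repeats: DNA | 18.06% |
| Repeats: LINE | 9.59% |
| Repeats: SINE | 1.08% |
| Repeats: LTR | 7.26% |
| Protein-coding genes | 26,979 |
| Mean transcript length | 14,159.49 bp |
| Mean CDS length | 1,452.03 bp |
| Mean exons per gene | 8.63 |
| Mean exon length | 168.32 bp |
| Mean intron length | 1,666.16 bp |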
Table 6

Statistics for genome annotation of S. schlegelii

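| Database | Number of annotated transcripts | % |
|---|---|---|
| Swissprot | 23,337 | 86.50 |
| Nr | 24,963 | 92.50 |
| KEGG | 21,449 | 79.50 |
| InterPro | 26,698 | 99.00 |
| GO | 24,857 | 92.10 |
| Pfam | 20,818 | 77.20 |
| Annotated | 26,775 | 99.20 |
| Unannotated | 204 | 0.80 |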

3.4. Phylogenetic and divergence-time analysis

In the present study, we constructed 24,636 gene family clusters, including 648 single-copy gene families (Fig. 6). S. schlegelii diverged from its common ancestor with Gasterosteus aculeatus ∼32.1–56.8 million years ago (Fig. 7). Retrotransposable elements (17.93%) were more abundant than in zebrafish (11%) but less abundant than in humans (44%). In contrast, DNA transposable elements accounted for 18.06% of the S. schlegelii genome, more than in humans (3.2%) and medaka (<10%) but less than in zebrafish (39%). In addition, there were 1,331 specific family clusters in S. schlegelii, over four times as many as in G. aculeatus (322). We identified 422 expanded gene families in the S. schlegelii genome. Functional enrichment analysis of these expanded gene families identified 282 significantly enriched (P < 0.05) GO terms and 45 significantly enriched KEGG pathways. The expanded gene families were mainly enriched in the NOD-like receptor signalling pathway (P = 2.91E-23), circadian entrainment (P = 1.48E-17), taste transduction (P = 3.39E-15), the calcium signalling pathway (P = 6.40E-13), and the olfactory transduction pathway (P = 4.06E-09), and in the GO terms dynein complex (P = 5.12E-21), homophilic cell adhesion (P = 7.27E-17), transmembrane transport (P = 7.35E-15), and microtubule motor activity (P = 5.25E-14). Additionally, we identified 76 significantly contracted gene families. The lineage-specific gene families may contribute to reproductive traits that are specific to S. schlegelii.
Figure 6

Gene-family cluster analysis. (a) Comparison of gene families from Sebastes schlegelii and other teleosts. The horizontal axis indicates the species and the vertical axis represents the number of genes. Pink represents single-copy genes; yellow represents multiple-copy genes; deep yellow represents unique paralogues; green represents other orthologues and unclustered genes. Here, 'other' refers to genes outside the first three types, that is, genes that were not clustered into any gene family or were clustered into a gene family present in only some of the species. (b) Venn diagram of the gene families. Ssc, Sebastes schlegelii; Gac, Gasterosteus aculeatus; Tru, Takifugu rubripes; Tni, Tetraodon nigroviridis.

Figure 7

Estimation of the divergence time of Sebastes schlegelii. Note: The numbers on the nodes represent the divergence times (millions of years ago, mya).

3.5. The interaction between ovary microenvironment and sperm storage

Female black rockfish have been found to store sperm in their ovaries for up to 6 months. The maintenance of sperm viability depends on exogenous energy sources derived from the ovary microenvironment. The carrier protein SLC2 showed significant positive selection based on comparative genome analysis. Based on differential gene expression analysis of the transcriptome, carbohydrate metabolism-related KEGG pathways were significantly up-regulated in ovaries from pre-copulation to post-copulation. Based on FPKM values, carbohydrate metabolism-related genes identified through KEGG, such as HXK2, GAA, GDE, UGP2, HXK1, PFKFB3, ALDOA, ADPGK, PFKAP, and ENOA, were all significantly up-regulated from pre-copulation (FII) to post-copulation (FIII–IV) (Fig. 8a). Moreover, glycolysis is one of the ATP-producing pathways enhanced by energy-substrate availability. Sodium iodoacetate is a specific inhibitor of glycolysis that acts on glyceraldehyde-3-phosphate dehydrogenase (GAPDH). In the present study, in vitro suppression of glycolysis with sodium iodoacetate significantly reduced sperm longevity, from 504 ± 24 h in the control group to 384 ± 48 h in the experimental group (Fig. 8b). These results indicate that carbohydrate sources from the microenvironment surrounding the ovaries may play an important role in maintaining sperm survivability during long-term storage.
Figure 8

The interaction between the ovary microenvironment and sperm storage. (a) Heatmap of carbohydrate metabolism-related gene expression from pre-copulation (FII) to post-copulation (FIII–IV); the higher the gene expression, the lighter the colour. Expression levels were calculated as FPKM (expected number of Fragments Per Kilobase of transcript sequence per Million base pairs sequenced); (b) sperm survival time of the control and experimental (sodium iodoacetate treatment) groups in vitro. Error bars show the standard deviation around the mean.
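
For readers unfamiliar with the FPKM unit used in Figure 8a, the following sketch shows the normalization, assuming the standard definition (fragments mapped to a transcript, scaled by transcript length in kilobases and by millions of mapped fragments). The numbers are hypothetical and the function is an illustration, not the expression pipeline used in this study.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    fragments: fragments mapped to this transcript
    transcript_length_bp: transcript length in base pairs
    total_mapped_fragments: all mapped fragments in the library
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000-bp transcript
# in a library of 10 million mapped fragments.
example_fpkm = fpkm(500, 2000, 10_000_000)
print(example_fpkm)
```

Because the value is normalized for both transcript length and library size, FPKM values can be compared between the pre- and post-copulation libraries.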

4. Discussion

Black rockfish is a viviparous marine teleost characterized by internal fertilization associated with long-term (up to 6 months) sperm storage in the female ovary. Although the genomes of numerous oviparous fish species have previously been sequenced, few genomic resources have been reported to date for viviparous marine teleosts. Currently, data are available for the viviparous freshwater platyfish and for the chondrichthyan elephant shark. The S. schlegelii genome described herein expands the information available on the genome evolution of viviparous marine teleost species. Moreover, the chromosomal-level genome assembly of S. schlegelii provides an opportunity to examine the features of viviparity (reproductive strategy, sperm storage, sperm competition, and sexual selection) at the genome level. In recent years, long-read sequencing has grown rapidly with PacBio technologies. Many assemblers are available for long-read assembly, and it is necessary to generate multiple genome assemblies and compare the results to obtain a more reliable genome assembly for the community. In the present study, the genome was assembled using FALCON and wtdbg2. Many genome assemblies obtained for teleosts with FALCON are currently available, such as those of Antarctic blackfin icefish, snailfish, yellow catfish, Cephalopods, barkley, and mountain carps. In addition, the FALCON assembler has been widely used for long-read genome assembly in other species, such as great ape, koala, water buffalo, maize, stout camphor tree, and apple. We first selected FALCON as the assembler and then also used wtdbg2 to reassemble the genome, comparing the two in order to assess the quality of the assemblies. Although FALCON may not be the best assembler, it is sufficiently reliable for long-read assembly. The overall quality of the FALCON assembly of the S. schlegelii genome resides in its reliability.
Compared with the genome assemblies available for other teleosts, the S. schlegelii assembly showed considerable continuity in both contig and scaffold N50 lengths. In the present study, we used a combination of Pacific Biosciences Sequel and Illumina sequencing platforms, 10× Genomics, and Hi-C technology to obtain a genome assembly of 848.31 Mb comprising 24 chromosomes, with contig and scaffold N50 lengths of 2.96 and 35.63 Mb, respectively. Moreover, the sequences assembled for the S. schlegelii genome were considerably longer than those obtained for other fish species using next-generation sequencing technology, and even far surpassed those of some genomes sequenced using PacBio. We also compared basic structural features of the S. schlegelii genome, including gene lengths, coding regions, and non-coding regions, with those of closely related species, and all of them reached a reasonably high level. Genome annotation revealed that the S. schlegelii genome contains 39.98% repetitive elements (Table 5), which is considerably higher than the corresponding percentage in the three-spined stickleback but lower than that in zebrafish.

Among the 19 species used to construct the phylogenetic tree in the present study, there are two types of reproductive strategy, namely, viviparity and oviparity. Interestingly, species with viviparous and oviparous modes of reproduction did not show any particular evolutionary relationship (Fig. 6), indicating that reproductive mode is not significantly or directly related to phylogenetic position. Viviparity is thus not an attribute of phyletic evolution but a specialization from closely related oviparous species. In particular, black rockfish and platyfish are both viviparous, and we found that they diverged from the three-spined stickleback and medaka, respectively, several tens of millions of years ago.
The specialization of viviparity from closely related oviparous species may be ascribed to environmental influences. Currently, limited information is available regarding reproductive development in viviparous species; thus, the black rockfish is an attractive viviparous fish model for studies on sperm storage, reproductive mode, and fertilization biology, among other important biological issues.

Sperm storage is a common reproductive strategy among vertebrate species characterized by internal fertilization. Nevertheless, sperm storage time is a species-specific characteristic that varies from minutes to years. In black rockfish, females have been found to store sperm in their ovaries for up to 6 months. Furthermore, the state of the sperm changes concomitantly with ovary development, from swimming in the ovarian fluid to penetrating the ovigerous lamellae epithelium, subsequent reactivation, and finally fertilizing the eggs. The maintenance of sperm viability depends on exogenous energy sources derived from the ovary microenvironment. The solute carrier (SLC) superfamily is one of the most important membrane transporter families; SLCs are involved in the intercellular transport of substances and the transfer of energy, nutrients, and metabolites. In the present study, we found that the glucose transporter protein SLC2, a member of the SLC superfamily, showed significantly positive selection in the black rockfish genome. In mammals, including humans and mice, carbohydrates are positively correlated with the duration of sperm viability. Furthermore, we found that many carbohydrate metabolism-related KEGG pathways that provide energy substrates were significantly up-regulated from pre- to post-copulation. These observations are consistent with our hypothesis that, during the storage stage, sperm in the female ovary depend on energy substrates derived from the surrounding microenvironment.
We further supported this hypothesis by demonstrating that in vitro suppression of glycolysis with sodium iodoacetate significantly reduced sperm longevity, indicating the importance of carbohydrate sources in maintaining sperm survivability.

In conclusion, this is the first study to conduct chromosome-level sequencing of the genome of a viviparous marine teleost characterized by long-term sperm storage (up to 6 months) in female ovaries. We obtained a genome assembly of 848.31 Mb comprising 24 chromosomes, with contig and scaffold N50 lengths of 2.96 and 35.63 Mb, respectively. We predicted 39.98% repetitive elements and 26,979 protein-coding genes; furthermore, our analysis determined that S. schlegelii diverged from Gasterosteus aculeatus ∼32.1–56.8 million years ago. Genome, transcriptome, and in vitro sperm physiological analyses provided insight into the carbohydrate substances produced in female ovaries in support of long-term sperm storage. We therefore believe that our findings provide an important genomic resource for researchers in the fields of marine and reproductive biology.