Zhaofang Han, Wanbo Li, Wen Zhu, Sha Sun, Kun Ye, Yangjie Xie, Zhiyong Wang.
Abstract
Yellow drum (Nibea albiflora) is an important fish species in capture fisheries and aquaculture in East Asia. We herein report the first, near-complete genome assembly of an ultra-homologous gynogenic female yellow drum using Illumina short sequencing reads. In summary, a total of 154.2 Gb of raw reads were generated via whole-genome sequencing and assembled into a 565.3 Mb genome with a contig N50 of 50.3 kb and a scaffold N50 of 2.2 Mb (BUSCO completeness of 97.7%), accounting for 97.3%-98.6% of the estimated genome size of this fish. We further identified 22,448 genes by combining ab initio prediction, RNA-seq annotation, and protein homology searching, of which 21,614 (96.3%) were functionally annotated in the NCBI nr, TrEMBL, SwissProt, and KOG databases. We also investigated the nucleotide diversity (around 1/390) of aquacultured individuals and found that the genetic diversity of the aquacultured population had decreased due to inbreeding. Evolutionary analyses identified significantly expanded and contracted gene families, such as myosin and sodium:neurotransmitter symporter (SNF), that could help explain the swimming motility of yellow drum. The presented genome will be an important resource for future studies on population genetics, conservation, evolutionary history, and genetic breeding of the yellow drum and other Nibea species.
Keywords: annotation; genome assembly; nucleotide diversity; yellow drum (Nibea albiflora)
Year: 2018 PMID: 30680137 PMCID: PMC6342179 DOI: 10.1002/ece3.4778
Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 2.912
Figure 1. The yellow drum (Nibea albiflora). The picture of the yellow drum was provided by Shuqiu Xie (Mindong Fishery Research Institute of Fujian Province).
Summary statistics of the whole‐genome sequencing data
| Library | Library type | Insert size (bp) | Number of reads | Total bases (bp) | Number of reads after trimming | Total bases after trimming (bp) |
|---|---|---|---|---|---|---|
| DES00946 | Paired‐end | 300 | 214,014,344 | 32,102,151,600 | 213,149,506 | 31,435,660,732 |
| DES00947 | Paired‐end | 300 | 175,073,346 | 26,261,001,900 | 174,401,308 | 25,728,984,413 |
| DES00948 | Paired‐end | 450 | 158,969,122 | 23,845,368,300 | 157,188,702 | 22,982,429,030 |
| DES00949 | Paired‐end | 450 | 166,828,960 | 25,024,344,000 | 164,607,364 | 24,031,348,725 |
| DEL00758 | Mate‐pair | 2,000 | 104,631,768 | 15,694,765,200 | 103,664,548 | 5,231,588,400 |
| DEL00757 | Mate‐pair | 5,000 | 116,137,928 | 17,420,689,200 | 114,947,454 | 5,806,896,400 |
| DEL00775 | Mate‐pair | 10,000 | 91,955,960 | 13,793,394,000 | 90,954,304 | 4,597,798,000 |
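The raw totals in the table can be cross-checked with a few lines of arithmetic. The sketch below sums the "Total bases" column and divides by the scaffold-level assembly size to get a rough sequencing depth. Using the assembly size as the denominator is an assumption made here for illustration; the record does not state which genome size the authors used for depth estimates, and the function names are hypothetical.

```python
# Raw base counts per library (bp), copied from the table above.
RAW_BASES = {
    "DES00946": 32_102_151_600,
    "DES00947": 26_261_001_900,
    "DES00948": 23_845_368_300,
    "DES00949": 25_024_344_000,
    "DEL00758": 15_694_765_200,
    "DEL00757": 17_420_689_200,
    "DEL00775": 13_793_394_000,
}

# Assumption: the scaffold-level assembly size (bp) stands in for the genome size.
ASSEMBLY_SIZE_BP = 565_299_463


def total_raw_bases(libraries: dict[str, int]) -> int:
    """Sum the raw bases over all sequencing libraries."""
    return sum(libraries.values())


def approx_depth(total_bases: int, genome_size_bp: int) -> float:
    """Rough sequencing depth: total bases divided by genome size."""
    return total_bases / genome_size_bp


total = total_raw_bases(RAW_BASES)
depth = approx_depth(total, ASSEMBLY_SIZE_BP)
print(f"raw bases: {total / 1e9:.1f} Gb, approximate depth: {depth:.0f}x")
```

The summed total lands close to the raw-read figure quoted in the abstract; the depth is only indicative, since it depends on the genome size chosen as denominator.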
Figure 2. The flow chart depicting the whole‐genome sequencing, assembly, and annotation.
Summary statistics of the Nibea albiflora genome assembly
| Assembly | Contig | Scaffold |
|---|---|---|
| Size (bp) | 557,190,737 | 565,299,463 |
| GC content (%) | 42.4 | 41.8 |
| Number | 25,182 | 1,252 |
| N50 size (bp) | 50,300 | 2,254,189 |
| Shortest (bp) | 301 | 2,002 |
| Longest (bp) | 641,168 | 13,214,368 |
| Average length (bp) | 22,127 | 451,517 |
| N bases (bp) | 0 | 8,026,962 |
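For readers unfamiliar with the N50 statistic reported above, the following is a minimal sketch of the usual definition: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the assembly size. This is a generic illustration on toy data, not the authors' pipeline; the function names are hypothetical. The same block also recomputes the average lengths in the table from the reported sizes and counts.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a list of sequence lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that brings the running total to half is the N50.
        if running >= half:
            return length
    raise AssertionError("unreachable for non-empty input")


def mean_length(total_bp: int, count: int) -> int:
    """Average sequence length, rounded to the nearest base pair."""
    return round(total_bp / count)


# Toy example (not data from the paper): total 54, half 27, 10 + 9 + 8 = 27.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))

# Values copied from the assembly table above.
print(mean_length(557_190_737, 25_182))  # contigs
print(mean_length(565_299_463, 1_252))   # scaffolds
```

Computing the real contig and scaffold N50 would require the per-sequence lengths from the assembly FASTA, which are not part of this record.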
Summary of repeat elements identified in the Nibea albiflora genome
| Repeat element | Fragments | Total length (bp) | % of genome |
|---|---|---|---|
| SINE | 20,650 | 2,761,049 | 0.5 |
| LINE | 40,857 | 7,549,448 | 1.3 |
| LTR element | 16,278 | 3,870,965 | 0.7 |
| DNA element | 84,443 | 12,838,672 | 2.2 |
| RC element | 4,504 | 1,205,596 | 0.2 |
| Small RNA | 1,578 | 127,371 | 0.02 |
| Simple repeat | 380,033 | 15,842,155 | 2.8 |
| Low complexity | 40,161 | 2,123,227 | 0.4 |
| Unclassified | 225,094 | 31,820,793 | 5.5 |
| Total | 813,598 | 78,139,276 | 13.7 |
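As a quick consistency check on the repeat table, the per-class fragment counts and lengths can be summed and compared with the reported "Total" row. The sketch below does only that; the variable and function names are hypothetical, and the percentage column is left out because the record does not state which genome size served as its denominator.

```python
# (fragments, total length in bp) per repeat class, copied from the table above.
REPEATS = {
    "SINE": (20_650, 2_761_049),
    "LINE": (40_857, 7_549_448),
    "LTR element": (16_278, 3_870_965),
    "DNA element": (84_443, 12_838_672),
    "RC element": (4_504, 1_205_596),
    "Small RNA": (1_578, 127_371),
    "Simple repeat": (380_033, 15_842_155),
    "Low complexity": (40_161, 2_123_227),
    "Unclassified": (225_094, 31_820_793),
}

# The "Total" row as reported in the table.
REPORTED_TOTAL = (813_598, 78_139_276)


def column_totals(rows: dict[str, tuple[int, int]]) -> tuple[int, int]:
    """Sum fragment counts and lengths over all repeat classes."""
    fragments = sum(f for f, _ in rows.values())
    length_bp = sum(l for _, l in rows.values())
    return fragments, length_bp


computed = column_totals(REPEATS)
print("matches reported total:", computed == REPORTED_TOTAL)
```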
Summary statistics for gene prediction for Nibea albiflora genome
| Evidence type | Method | Gene number | Average gene length (bp) | Average CDS length (bp) | Average exons per gene |
|---|---|---|---|---|---|
| De novo | Augustus | 25,718 | 11,121 | 1,754 | 12.8 |
| De novo | GeneMark‐ET | 59,067 | 3,277 | 767 | 5.4 |
| De novo | Braker | 27,331 | 10,700 | 1,699 | 12.2 |
| Homolog | | 51,671 | 16,589 | 936 | 6.2 |
| Homolog | | 33,864 | 17,462 | 1,241 | 7.6 |
| Homolog | | 45,545 | 19,485 | 1,557 | 10.0 |
| Homolog | | 21,873 | 13,336 | 1,287 | 8.1 |
| Homolog | | 25,185 | 13,059 | 1,342 | 8.7 |
| Homolog | | 23,825 | 15,024 | 1,619 | 9.3 |
| Homolog | | 23,815 | 14,520 | 1,477 | 8.4 |
| Transcriptome | PASA | 6,746 | 9,826 | 1,589 | 15.4 |
| Merge | Evidence Modeler | 22,448 | 12,764 | 1,844 | 13.4 |
Figure 3. Phylogenetic tree and orthologous genes in Nibea albiflora and 11 other vertebrates. Blue numbers in the phylogenetic tree indicate the divergence time (MYA, million years ago), and the green and red numbers represent the expanded and contracted gene families, respectively. The histogram shows different types of orthologous relationships. “1:1:1” means universal single‐copy genes; “N:N:N” means orthologs exist in all genomes; “SS” means species‐specific genes; and “Others” means orthologs that do not fit into the other categories.