Xinghua Lin1,2,3,4, Yang Huang1,2,3,4,5, Dongneng Jiang1,2,3,4,5, Huapu Chen1,2,3,4,5, Siping Deng1,2,3,4,5, Yulei Zhang1,3,4,5, Tao Du1,2,3,4,5, Chunhua Zhu1,2,3,4,5, Guangli Li1,2,3,4,5, Changxu Tian1,2,3,4,5.
Abstract
Silver sillago (Sillago sihama) is a member of the family Sillaginidae and is found in all Chinese inshore waters. It is an emerging commercial marine aquaculture species in China. In this study, a high-quality chromosome-level reference genome of S. sihama was constructed for the first time using PacBio Sequel sequencing and the high-throughput chromosome conformation capture (Hi-C) technique. A total of 66.16 Gb of clean reads were generated on the PacBio sequencing platform. The assembled genome was 521.63 Mb in size, comprising 556 contigs with a contig N50 of 13.54 Mb. Hi-C scaffolding anchored 96.93% of the total assembled sequence to 24 chromosomes. A total of 23,959 protein-coding genes were predicted in the genome, and 96.51% of these genes were functionally annotated in public databases. A total of 71.86 Mb of repetitive elements were detected, accounting for 13.78% of the genome. Phylogenetic analysis of silver sillago and other teleosts indicated that silver sillago diverged from its common ancestor with Sillago sinica ∼7.92 Ma. Comparative genomic analysis with other teleosts identified 45 unique and 100 expanded gene families in silver sillago. The genomic resources generated in this study provide a valuable reference for functional genomics research on silver sillago.
Keywords: Hi-C; PacBio; chromosomal assembly; genome; silver sillago
Year: 2021 PMID: 33367716 PMCID: PMC7875006 DOI: 10.1093/gbe/evaa272
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Statistics of Sillago sihama Genome Assembly and Annotation Data
| Chromosome-Level Genome Assembly | |
|---|---|
| Assembly | |
| Assembly size (bp) | 521,631,495 |
| Number of scaffolds | 470 |
| Scaffold N50 (bp) | 21,469,626 |
| Longest scaffold (bp) | 28,013,376 |
| Number of contigs | 556 |
| Contig N50 (bp) | 13,543,514 |
| Longest contig (bp) | 22,111,180 |
| GC (%) | 44.66 |
| BUSCO (% of total BUSCO) | |
| Complete | 4,463 (97.36%) |
| Single-copy | 4,345 (94.79%) |
| Duplicated | 118 (2.57%) |
| Fragmented | 27 (0.6%) |
| Missing | 94 (2.05%) |
| CEGMA | |
| CEGs (% of all CEGs) | 453 (98.97%) |
| Highly conserved CEGs (% of all highly conserved CEGs) | 246 (99.16%) |
| Repetitive sequences (% of genome) | |
| SINE (bp) | 60,396 (0.01%) |
| LINE (bp) | 7,497,699 (1.44%) |
| LTR (bp) | 6,955,457 (1.33%) |
| DNA (bp) | 17,803,273 (3.41%) |
| SSR (bp) | 101,169 (0.02%) |
| Unclassified (bp) | 39,549,161 (7.58%) |
| Total (bp) | 71,864,242 (13.78%) |
| Gene annotations (% of all genes) | |
| GO annotation | 12,408 (51.79%) |
| KEGG annotation | 14,510 (60.59%) |
| KOG annotation | 15,991 (66.74%) |
| TrEMBL annotation | 22,953 (95.8%) |
| NR annotation | 23,101 (96.42%) |
| All annotated | 23,123 (96.51%) |
| Noncoding RNA genes (% of genome) | |
| Number of miRNA | 419 |
| Number of tRNA | 1,587 |
| Number of rRNA | 67 |
| Length of miRNA | 34,211 (0.00656%) |
| Length of tRNA | 160,051 (0.03068%) |
| Length of rRNA | 60,018 (0.00575%) |
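The N50 and percent-of-genome figures in the table follow from simple arithmetic, which the minimal Python sketch below illustrates. It is not the authors' pipeline: the function names `n50` and `percent_of_genome` are hypothetical, and the toy length list is invented for illustration. Only the assembly size and the total repeat length are taken from the table.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must not be empty")


def percent_of_genome(bp, assembly_size):
    """Express a length in bp as a percentage of the assembly size."""
    return round(100.0 * bp / assembly_size, 2)


# Values taken from the table above.
ASSEMBLY_SIZE = 521_631_495
TOTAL_REPEATS = 71_864_242

# Toy contig lengths (hypothetical, not from the paper).
toy_contigs = [50, 40, 30, 20, 10]

toy_n50 = n50(toy_contigs)
repeat_pct = percent_of_genome(TOTAL_REPEATS, ASSEMBLY_SIZE)

print(toy_n50)     # 40: 50 + 40 = 90 covers at least half of 150
print(repeat_pct)  # 13.78, matching the reported repeat content
```

Applied to the real contig or scaffold lengths of an assembly, the same `n50` function would give values comparable to the contig and scaffold N50 rows, assuming the conventional definition is the one used.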
Fig. 1. Genome landscape and evolutionary analysis of Sillago sihama. (A) Genome landscape of S. sihama. (a) Chromosome length, (b) GC content, (c) gene density, (d) repeat sequences, (e) long terminal repeats (LTR), (f) long interspersed nuclear elements (LINE), and (g) simple sequence repeats (SSR). (B) Phylogenetic analysis of 11 teleost fishes. The predicted species divergence time (million years ago) is marked at each branch point. The red number on each evolutionary branch represents the number of expanded gene families, and the blue number represents the number of contracted gene families. (C) Collinearity analysis of the S. sihama and Larimichthys crocea genomes. Blue and orange outer circles represent the chromosomes of S. sihama and L. crocea, respectively.