Zhenzhen Xie, Dengdong Wang, Shoujia Jiang, Cheng Peng, Qing Wang, Chunren Huang, Shuisheng Li, Haoran Lin, Yong Zhang.
Abstract
The tomato hind, Cephalopholis sonnerati, is a bottom-dwelling coral reef fish that is widely distributed in the Indo-Pacific and Red Sea. C. sonnerati also features complex social structures and behavioural mechanisms. Here, we present a high-quality, chromosome-level genome assembly for C. sonnerati, derived using PacBio sequencing and Hi-C technologies. A 1043.66 Mb genome with a contig N50 length of 2.49 Mb was assembled, comprising 795 contigs anchored onto 24 chromosomes. Overall, 97.2% of the complete BUSCOs were identified in the genome. A total of 26,130 protein-coding genes were predicted, of which 94.26% were functionally annotated. Evolutionary analysis revealed that C. sonnerati diverged from its common ancestor with E. lanceolatus and E. akaara approximately 41.7 million years ago. In addition, comparative genome analyses indicated that the expanded gene families were highly enriched in the sensory system. Finally, we identified 8108 genes with tissue-specific expression, and these genes were highly enriched in the brain. In brief, this high-quality, chromosome-level reference genome will provide a valuable genomic resource for studies of the genetic conservation, resistance breeding, and evolution of C. sonnerati.
Keywords: Cephalopholis sonnerati; chromosome-level genome assembly; comparative genome analysis; genome annotation; PacBio sequencing
Year: 2022 PMID: 36101431 PMCID: PMC9312885 DOI: 10.3390/biology11071053
Source DB: PubMed Journal: Biology (Basel) ISSN: 2079-7737
Figure 1. A picture of the C. sonnerati specimen used for genome sequencing.
Summary of the C. sonnerati Genome Assembly and Annotation.
| Chromosome-Level Genome Assembly | |
|---|---|
| Genome assembly and chromosomes construction | |
| Contig N50 size (bp) | 2,482,587 |
| Contig N90 size (bp) | 683,704 |
| Maximum contig size (bp) | 12,345,001 |
| Total contigs number | 939 |
| Total length of genome (bp) | 1,043,655,803 |
| Number of chromosomes | 24 |
| Total length of chromosomes (bp) | 1,022,871,484 |
| Scaffold N50 (bp) | 44,482,143 |
| Contig N50 (bp) | 2,517,244 |
| Final Assembly Genome Quality Evaluation | |
| Proportion of complete BUSCOs (%) | 97.2 |
| Proportion of complete and single-copy BUSCOs (%) | 94.3 |
| Proportion of complete and duplicated BUSCOs (%) | 2.9 |
| Proportion of fragmented BUSCOs (%) | 0.8 |
| Proportion of missing BUSCOs (%) | 2.0 |
| Gene Annotation | |
| Number of InterPro annotation | 21,173 |
| Number of GO annotation | 16,331 |
| Number of KEGG annotation | 24,250 |
| Number of KO annotation | 14,971 |
| Number of SwissProt annotation | 22,270 |
| Number of TrEMBL annotation | 24,372 |
| Number of NR annotation | 24,574 |
| Number of all annotation | 24,629 |
| Unannotated | 1,501 |
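
The contig N50 and N90 values in the table are length-weighted summary statistics. As a minimal sketch of how such values are conventionally defined (not the authors' pipeline), the Python function below sorts sequence lengths from longest to shortest and returns the length at which the running total first reaches the chosen fraction of the assembly. The function name `nxx` and the toy contig lengths are hypothetical and are not taken from the study.

```python
from typing import Iterable


def nxx(lengths: Iterable[int], fraction: float = 0.5) -> int:
    """Return the Nxx statistic for a set of sequence lengths.

    Nxx is the length L of the sequence at which the cumulative length,
    counted from the longest sequence downwards, first reaches
    `fraction` of the total assembly length (0.5 gives N50, 0.9 gives N90).
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")

    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]  # only reached through floating-point rounding


# Hypothetical toy assembly (total length 300), for illustration only.
toy_contigs = [100, 80, 60, 40, 20]
print(nxx(toy_contigs, 0.5))  # 80: 100 + 80 = 180 reaches half of 300
print(nxx(toy_contigs, 0.9))  # 40: 100 + 80 + 60 + 40 = 280 reaches 270
```

Under this definition N90 can never exceed N50, which is consistent with the reported contig N90 being smaller than the contig N50.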
Figure 2. The C. sonnerati genome contig contact matrix generated from Hi-C data. The color bar indicates contact density from red (high) to white (low).
Figure 3. Orthologous genes between C. sonnerati and 15 other teleost species.
Figure 4. Phylogeny and divergence times of C. sonnerati and other species. The blue numbers are the predicted divergence times. The numbers below the branches are the numbers of expanded and contracted gene families (green, expanded; red, contracted). The scale at the bottom represents divergence time, with one unit representing 100 million years. The pie charts represent gene families (black, expanded; red, contracted; blue, others).
Figure 5. Chromosome synteny analysis between C. sonnerati and three other groupers. LG 1–24 represent chromosomes 1–24 of C. sonnerati and of the three other groupers, respectively.
Figure 6. Heatmap of tissue-specific gene expression in 11 tissues of C. sonnerati. Genes whose tissue specificity score had an absolute value of 1 were considered tissue-specific.