Shiyong Zhang, Jia Li, Qin Qin, Wei Liu, Chao Bian, Yunhai Yi, Minghua Wang, Liqiang Zhong, Xinxin You, Shengkai Tang, Yanshan Liu, Yu Huang, Ruobo Gu, Junmin Xu, Wenji Bian, Qiong Shi, Xiaohui Chen.
Abstract
Naturally derived toxins from animals are good raw materials for drug development. As a representative venomous teleost, the Chinese yellow catfish (Pelteobagrus fulvidraco) can provide valuable resources for studies on toxin genes. Its venom glands are located in the pectoral and dorsal fins. Despite these interesting biological traits and its great economic value, the Chinese yellow catfish still lacks a sequenced genome. Here, we report a high-quality genome assembly of the Chinese yellow catfish using a combination of next-generation Illumina and third-generation PacBio sequencing platforms. The final assembly reached 714 Mb, with a contig N50 of 970 kb and a scaffold N50 of 3.65 Mb. We also annotated 21,562 protein-coding genes, of which 97.59% were assigned at least one functional annotation. Based on the genome sequence, we analyzed toxin genes in the Chinese yellow catfish, identifying 207 toxin genes and classifying them into three major groups. Interestingly, we also expanded a previously reported sex-related region (to ≈6 Mb) in the genome assembly and localized two important toxin genes within this region. In summary, we assembled a high-quality genome of the Chinese yellow catfish and performed high-throughput identification of toxin genes from a genomic view. Therefore, the limited number of toxin sequences in public databases will be markedly improved once multi-omics data from more sequenced species are integrated.
Keywords: Chinese yellow catfish; identification; toxin genes; whole genome sequencing
Year: 2018 PMID: 30477130 PMCID: PMC6316204 DOI: 10.3390/toxins10120488
Source DB: PubMed Journal: Toxins (Basel) ISSN: 2072-6651 Impact factor: 4.546
Figure 1. The 17-mer distribution of Chinese yellow catfish. Sequencing data from the Illumina short-insert libraries (200, 500, and 800 bp) were used for this analysis. The x-axis is the sequencing depth of each unique 17-mer, and the y-axis is the percentage of unique 17-mers. The peak depth was 57, and the percentage at the peak (0.638%) was based on the total k-mer number (410,049,532,138).
Summary of the assembled genome in each procedure.
| Step | Software | Contig N50 (bp) | Maximum Contig (bp) | Minimum Contig (bp) | Scaffold N50 (bp) | Maximum Scaffold (bp) | Minimum Scaffold (bp) | Total Size (bp) |
|---|---|---|---|---|---|---|---|---|
| Contig assembling | Platanus | 1,054 | 49,678 | 109 | - | - | - | 1,010,987,672 |
| Contig assembling | DBG2OLC | 707,335 | 6,076,047 | 268 | - | - | - | 706,928,086 |
| Polishing round 1 | Pilon | 705,180 | 6,050,085 | 270 | - | - | - | 702,622,905 |
| Scaffolding | SSPACELongRead | 982,636 | 6,050,085 | 270 | 1,109,190 | 7,365,535 | 270 | 706,306,982 |
| Scaffolding | SSPACE_Standard | 705,180 | 6,050,085 | 270 | 3,655,204 | 19,552,289 | 270 | 712,893,760 |
| Gap filling | Gapcloser | 813,785 | 11,966,130 | 270 | 3,655,204 | 19,552,617 | 270 | 712,834,712 |
| Gap filling | GapFiller | 859,168 | 11,966,116 | 270 | 3,655,204 | 19,552,752 | 270 | 712,901,309 |
| Gap filling | PBjelly | 962,661 | 14,953,314 | 270 | 3,655,300 | 19,560,773 | 270 | 714,800,876 |
| Polishing round 2 | Pilon | 970,098 | 15,455,883 | 277 | 3,653,474 | 19,544,699 | 277 | 713,824,612 |
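The N50 values in the table follow the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly size. A minimal Python sketch of that calculation is given here for orientation; the function name `n50` and the toy contig lengths are illustrative assumptions, not the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and their lengths are
    accumulated; the N50 is the length of the sequence at which the
    running total first reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (not taken from the assembly).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Applied to the real contig or scaffold lengths of an assembly, the same function would yield figures comparable to the Contig N50 and Scaffold N50 columns, assuming the assembler reports N50 under this definition.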
Figure 2. The BUSCO assessment of genomes from Chinese yellow catfish and other fish species. The genome-level benchmarking value of Chinese yellow catfish was C: 94.8% (S: 90.9%, D: 3.9%), F: 1.7%, M: 3.5%, n: 4584, and the corresponding protein-level benchmarking value was C: 84.4% (S: 79.6%, D: 4.8%), F: 8.7%, M: 6.9%, n: 4584. Abbreviations: C, complete; S, complete and single-copy; D, complete and duplicated; F, fragmented; M, missing; n, total BUSCO groups searched.
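The BUSCO categories in this caption are related by simple arithmetic: complete (C) is the sum of single-copy (S) and duplicated (D), and C, fragmented (F), and missing (M) together account for all n groups. The sketch below checks those relations for the percentages quoted in the caption; the helper name `busco_is_consistent` and the rounding tolerance are assumptions made for illustration.

```python
def busco_is_consistent(c, s, d, f, m, tol=0.15):
    """Check that BUSCO percentages add up within a rounding tolerance.

    c, s, d, f, m are percentages for complete, single-copy, duplicated,
    fragmented, and missing BUSCO groups. tol is an assumed allowance for
    values that were rounded to one decimal place.
    """
    complete_matches = abs((s + d) - c) <= tol
    total_matches = abs((c + f + m) - 100.0) <= tol
    return complete_matches and total_matches


# Percentages as quoted in the Figure 2 caption.
genome_level = dict(c=94.8, s=90.9, d=3.9, f=1.7, m=3.5)
protein_level = dict(c=84.4, s=79.6, d=4.8, f=8.7, m=6.9)

print(busco_is_consistent(**genome_level))
print(busco_is_consistent(**protein_level))
```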
Evaluation of the completeness of gene regions in our genome assembly using assembled transcripts.
| Dataset | Number of EST Clusters | Total Length (bp) | Coverage Rate by the Assembly (%) | With >90% Sequence in One Scaffold: Number | With >90% Sequence in One Scaffold: Percentage (%) | With >50% Sequence in One Scaffold: Number | With >50% Sequence in One Scaffold: Percentage (%) |
|---|---|---|---|---|---|---|---|
| >0 bp | 78,225 | 57,694,186 | 98.1907917 | 73,167 | 93.53404 | 77,222 | 98.7178 |
| >200 bp | 60,258 | 54,613,314 | 98.2312921 | 56,311 | 93.44983 | 59,575 | 98.86654 |
| >500 bp | 30,229 | 45,487,954 | 98.32383756 | 28,117 | 93.01333 | 29,963 | 99.12005 |
| >1000 bp | 17,675 | 36,547,853 | 98.41627906 | 16,434 | 92.97878 | 17,543 | 99.25318 |
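The percentage columns in this table can be recomputed from the counts: each percentage is the number of EST clusters meeting the threshold divided by the number of EST clusters in that dataset. The following sketch reproduces that arithmetic from the table's own values; the function name `percent` and the dictionary layout are hypothetical choices for illustration.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts copied from the table:
# (EST clusters, clusters with >90% in one scaffold, clusters with >50% in one scaffold)
est_counts = {
    ">0 bp": (78225, 73167, 77222),
    ">200 bp": (60258, 56311, 59575),
    ">500 bp": (30229, 28117, 29963),
    ">1000 bp": (17675, 16434, 17543),
}

for dataset, (clusters, over_90, over_50) in est_counts.items():
    print(
        f"{dataset}: {percent(over_90, clusters):.5f}% with >90%, "
        f"{percent(over_50, clusters):.5f}% with >50% in one scaffold"
    )
```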
Figure 3. The phylogenetic tree of yellow catfish and 14 other related fish species. The nodes marked with red dots have been validated against TimeTree (http://www.timetree.org/). Numbers represent the estimated divergence times.
Figure 4. A phylogenetic classification of the nine subgroups of medium-length toxin genes. Two other genes, “Q6T269-D1” and “Q8AY75-D1,” do not belong to any subgroup.
Figure 5. A phylogenetic classification of the 21 long-length toxin genes. Three other genes, “P81428-D1,” “F8S101-D1,” and “Q92035-D1,” do not belong to any of the six classified families.
Figure 6. Distribution of a female-specific marker and other sex-related/toxin genes in Contig326_pilon. Two toxin genes, “Q9PS06-D1” and “Q8AY81-D3,” are shown in dark green. The female-specific marker, located in intron 26 of the inad gene, is 42 bp in length. The Circos plot on the left represents the entire Contig326_pilon. Its rings, from outside to inside, show: (A) the nucleotide sequence of Contig326_pilon, (B) the percentage of GC content in 10-kb non-overlapping windows, and (C) the percentage of repeat elements in 10-kb non-overlapping windows. Within Contig326_pilon, faint yellow ribbons represent genes in the “+” orientation, while grey ribbons represent genes in the “−” orientation; sex-related genes and inad are drawn as red ribbons.