Kangqi Zhou, Zhong Chen, Xuesong Du, Yin Huang, Junqi Qin, Luting Wen, Xianhui Pan, Yong Lin.
Abstract
Cipangopaludina chinensis is an economically important aquatic snail with high medicinal value. However, molecular biology research on C. chinensis is limited by the lack of a reference genome, so transcript analysis is an important step toward studying the genes that regulate various substances in this species. Herein, we conducted the first full-length transcriptome analysis of C. chinensis using PacBio single-molecule real-time (SMRT) sequencing technology. We identified a total of 26,312 unigenes with an average length of 2,572 bp. Among the transcription factor families, zf-C2H2 was the most abundant (120, 18.24%). We also observed that the majority of the 8,058 SSRs contained 4-7 repeat units, which provides data for subsequent work on snail genetics. Subsequently, 91.86% (24,169) of the genes were successfully annotated in the four major databases, and the highest homology was observed with Pomacea canaliculata. Functional annotation revealed that the majority of transcripts were enriched in metabolism, signal transduction, and immune-related pathways, and several candidate genes involved in drug metabolism and immune response were identified (e.g., CYP1A1, CYP2J, CYP2U1, GST, PIK3, PDE3A, PRKAG). This study lays a foundation for future molecular biology research and provides a reference for studying genes associated with the medicinal value of C. chinensis.
Keywords: Cipangopaludina chinensis; SMRT sequencing; full-length transcriptome; functional annotation; structure prediction
Year: 2022 PMID: 35783279 PMCID: PMC9243326 DOI: 10.3389/fgene.2022.881952
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
Description of full-length sequencing in Cipangopaludina chinensis.
| Type | Number | Min length (bp) | Average length (bp) | Max length (bp) | N50 (bp) |
|---|---|---|---|---|---|
| Polymerase | 959,578 | 52 | 113,116 | 521,756 | 175,815 |
| Subreads | 26,658,437 | 51 | 2,154 | 274,227 | 2,466 |
| CCS | 818,895 | 52 | 2,428 | 818,895 | 2,614 |
| FLNC | 697,861 | 50 | 2,315 | 15,192 | 2,530 |
| Unigene | 26,312 | 64 | 2,572 | 14,031 | 2,904 |
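The N50 column can be read as follows: when sequences are sorted from longest to shortest, N50 is the length at which the running total first reaches half of all bases. The sketch below illustrates that standard definition. It assumes nothing about the pipeline used in the study, and the function name `n50` and the toy lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total number of bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy data (not from the study): total = 54, half = 27.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))  # 10 + 9 + 8 = 27 reaches half, so N50 = 8
```

Because long sequences are counted first, N50 is usually larger than the average length when the distribution is skewed toward short sequences, which is consistent with the unigene row (average 2,572 bp, N50 2,904 bp).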
FIGURE 1. Length distribution of unigenes obtained from the Cipangopaludina chinensis library.
FIGURE 2. Functional annotation of Cipangopaludina chinensis. (A) Statistics of the transcripts annotated in different databases. (B) Venn diagram of annotations in the NR, GO, KEGG, KOG, and Swiss-Prot databases. (C) Distribution of the top 10 species with matched transcripts in the NR database. (D) Distribution of GO terms for all annotated transcripts in the biological process, cellular component, and molecular function ontologies. (E) KEGG pathways enriched in the transcripts. (F) COG categories of the transcripts.
Regional distribution of some SSRs in the full-length transcripts of Cipangopaludina chinensis.
| Type | Number | Ratio (%) | 5' UTR | CDS | 3' UTR |
|---|---|---|---|---|---|
| Mono- | 688 | 11.40 | 72 | 7 | 609 |
| Di- | 2,761 | 45.73 | 304 | 204 | 2,253 |
| Tri- | 1,035 | 17.14 | 205 | 358 | 472 |
| Tetra- | 512 | 8.48 | 47 | 22 | 443 |
| Penta- | 34 | 0.56 | 7 | 1 | 26 |
| Hexa- | 40 | 0.66 | 7 | 17 | 16 |
| Complex | 967 | 16.02 | 43 | 76 | 848 |
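The Ratio column appears to be each SSR type's share of the SSRs listed in this table, whose counts sum to 6,037. The sketch below recomputes those shares from the regional counts as a consistency check. The dictionary layout and variable names are illustrative assumptions, and only the numbers are taken from the table.

```python
# Regional SSR counts copied from the table: (5' UTR, CDS, 3' UTR).
ssr_regions = {
    "Mono-": (72, 7, 609),
    "Di-": (304, 204, 2253),
    "Tri-": (205, 358, 472),
    "Tetra-": (47, 22, 443),
    "Penta-": (7, 1, 26),
    "Hexa-": (7, 17, 16),
    "Complex": (43, 76, 848),
}

# Per-type totals are the sum over the three regions.
type_totals = {name: sum(counts) for name, counts in ssr_regions.items()}
grand_total = sum(type_totals.values())

# Share of each type among all SSRs in the table, in percent.
ratios = {
    name: round(100 * total / grand_total, 2)
    for name, total in type_totals.items()
}

# Column-wise totals show how the listed SSRs split across regions.
region_totals = [sum(col) for col in zip(*ssr_regions.values())]

print(grand_total)
for name in ssr_regions:
    print(name, type_totals[name], ratios[name])
print(dict(zip(("5' UTR", "CDS", "3' UTR"), region_totals)))
```

Summing the columns this way shows that most of the listed SSRs fall in the 3' UTR, with the 5' UTR and CDS holding far fewer.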
FIGURE 3. Sequence structure analysis of Cipangopaludina chinensis. (A) Length distribution of CDSs. (B) Venn diagram of lncRNAs identified by the CNCI and CPC methods. (C) Length density distribution of the lncRNA and mRNA transcripts. (D) Families and numbers of the top 10 TFs produced by SMRT. (E) Summary of SSR types in the full-length transcripts of C. chinensis. (F) Alternative splicing events predicted by SUPPA.