Chengcai Zhang, Huadong Ren, Xiaohua Yao, Kailiang Wang, Jun Chang.
Abstract
Pecan is an important nut crop worldwide and is rich in bioactive components such as fatty acids (FAs) and flavonoids. The molecular mechanisms of phytochemical biosynthesis in pecan are therefore a focus of research. A draft genome and several transcriptomes have recently been published. However, the full-length mRNA transcripts remain uncharacterized, and the regulatory mechanisms behind the biosynthesis and accumulation of quality components have not been fully investigated. In this study, single-molecule long-read sequencing technology was used to obtain full-length transcripts of pecan kernels. In total, 37,504 isoforms of 16,702 genes were mapped to the reference genome. The numbers of known isoforms, new isoforms, and novel isoforms were 9013 (24.03%), 26,080 (69.54%), and 2411 (6.43%), respectively. Over 80% of the transcripts (30,751, 81.99%) had functional annotations. A total of 15,465 alternative splicing (AS) events and 65,761 alternative polyadenylation events were detected; retained intron was the predominant AS type (5652, 36.55%). Furthermore, 1894 long noncoding RNAs and 1643 transcription factors were predicted using bioinformatics methods. Finally, the structural genes associated with FA and flavonoid biosynthesis were characterized. A high frequency of AS occurrence (70.31%) was observed in FA synthesis-associated genes. This study provides a full-length transcriptome data set of pecan kernels, which will significantly enhance the understanding of the regulatory basis of phytochemical biosynthesis during pecan kernel maturation.
Keywords: Carya illinoinensis; PacBio; alternative splicing; fatty acid; flavonoid; lncRNA
Year: 2021 PMID: 34849807 PMCID: PMC8496322 DOI: 10.1093/g3journal/jkab182
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Summary of PacBio sequencing results
| Metric | Value |
|---|---|
| Total bases (bp) | 39,556,542,673 |
| Number of subreads | 22,601,162 |
| Subread average length (bp) | 1,750 |
| Subread N50 (bp) | 2,420 |
| Number of CCS reads | 485,150 |
| Mean CCS read length (bp) | 2,312 |
| Number of full-length reads | 445,838 |
| Number of FLNC reads | 442,244 |
| FLNC read average length (bp) | 2,154 |
| Number of unpolished consensus isoforms | 236,820 |
| Number of polished high-quality isoforms | 194,991 |
| Unpolished consensus isoform average length (bp) | 2,123 |
| Correct consensus number | 194,992 |
| Correct consensus average length (bp) | 2,184 |
| Correct consensus N50 length (bp) | 2,650 |
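The table reports N50 values and read counts at successive filtering stages. As a minimal sketch of how such summary figures are conventionally derived, the Python below computes an N50 from a list of sequence lengths and the fraction of reads retained between stages. The function name `n50` and the toy length list are hypothetical illustrations; only the three read counts are taken from the table, and the software the authors actually used for these statistics is not stated in this record.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of all bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths (hypothetical, NOT the study's data): total is 9,000 bp.
toy_lengths = [500, 1200, 1800, 2400, 3100]
toy_n50 = n50(toy_lengths)

# Read counts copied from the table above.
ccs_reads = 485_150
full_length_reads = 445_838
flnc_reads = 442_244

# Fraction of reads retained at each filtering stage.
full_length_fraction = full_length_reads / ccs_reads
flnc_fraction = flnc_reads / full_length_reads

print(f"toy N50: {toy_n50} bp")
print(f"full-length / CCS: {full_length_fraction:.1%}")
print(f"FLNC / full-length: {flnc_fraction:.1%}")
```

Under these counts, roughly 92% of CCS reads are full-length and about 99% of full-length reads are retained as FLNC reads.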
Figure 1. KEGG and GO functional classification of the pecan full-length transcriptome.
Figure 2. Summary of alternative splicing events.
Figure 3. Validation of AS events using RT-PCR. The gel bands show DNA markers and the RT-PCR results. The arrows indicate the positions of different isoforms. The green boxes show the positions of exons, and the lines show introns. The expected PCR product size for each band is listed beside the structures.
Figure 4. Venn diagram of lncRNAs predicted by three databases.
Figure 5. Numbers of the top 20 transcription factors.
Figure 6. The proposed FA biosynthesis pathway of pecan. The numbers in brackets indicate the number of putative genes (in black font) and the number of AS genes (in red font).
Figure 7. The proposed flavonoid biosynthesis pathway of pecan. The numbers in brackets indicate the number of putative genes (in black font) and the number of AS genes (in red font).