Xue Zhang, Yue Zhang, Yue-Hua Wang, Shi-Kang Shen.
Abstract
Cinnamomum chago, an endangered species endemic to Yunnan province, has high economic and phylogenetic value within Lauraceae. However, the genomic information of this species remains largely unexplored. In this study, we used RNA-seq technology to characterize and annotate the C. chago transcriptome and to identify candidate genes involved in special metabolic pathways, as well as gene-associated simple sequence repeats (SSRs) and single-nucleotide polymorphisms (SNPs). A total of 129,097 unigenes, with a mean length of 667 bp and an N50 length of 1,062 bp, were assembled. Of these, 56,887 (44.07%) unigenes were successfully annotated using at least one database. Furthermore, 47 and 46 candidate genes were identified in terpenoid biosynthesis and fatty acid biosynthesis, respectively. A total of 22 candidate genes participated in at least one abiotic stress response of C. chago. Additionally, 25,654 SSRs and 640 SNPs were identified. Based on these potential loci, 55 novel expressed sequence tag (EST)-SSR primers were successfully developed. This work provides comprehensive transcriptomic data that can be used to establish a valuable information platform for gene prediction, signaling pathway investigation, and molecular marker development for C. chago and related species. Such a platform can facilitate further studies on germplasm conservation and utilization of Lauraceae species.
Keywords: Lauraceae; abiotic stress; adaptation; molecular markers; terpenoid; transcriptome
Year: 2018 PMID: 30455715 PMCID: PMC6231050 DOI: 10.3389/fgene.2018.00505
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Summary statistics of de novo assembled transcriptome for C. chago.
| Category | Item | Value |
|---|---|---|
| Raw data (average) | Total raw reads | 56,891,547 |
| | Total raw bases (bp) | 8,590,623,647 |
| Clean data (average) | Total clean reads | 55,525,751 |
| | Total clean bases (bp) | 8,235,167,620 |
| | Error (%) | 0.01 |
| | Q20 (%) | 98.29 |
| | Q30 (%) | 94.84 |
| | GC (%) | 48.57 |
| Transcripts | Total number | 179,491 |
| | Smallest length (bp) | 201 |
| | Largest length (bp) | 61,022 |
| | Mean length (bp) | 628 |
| | N50 (bp) | 1,025 |
| Unigenes | Total number | 129,097 |
| | Smallest length (bp) | 201 |
| | Largest length (bp) | 61,022 |
| | Mean length (bp) | 667 |
| | N50 (bp) | 1,062 |
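The N50 values in the table follow the usual assembly convention: the N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. The sketch below illustrates that definition only; the function name `n50` and the toy lengths are hypothetical and are not taken from the study, whose assembler reports this statistic directly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2            # half of all assembled bases
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            # This sequence pushes the cumulative length past the halfway mark.
            return length


# Hypothetical toy lengths, not data from the C. chago assembly.
toy_lengths = [200, 300, 400, 500, 600, 1000]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

Because a few long sequences can cover half of the bases, the N50 is typically larger than the mean length, as it is for both the transcripts and the unigenes here.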
FIGURE 1. Summary statistics of functional annotations for the C. chago transcriptome in public databases.
FIGURE 2. GO classification of the C. chago transcriptome.
FIGURE 3. COG functional classification of the C. chago transcriptome.
FIGURE 4. KEGG classification of the C. chago transcriptome: (red) Metabolism, (green) Genetic Information Processing, (pink) Environmental Information Processing, (blue) Cellular Processes, (yellow) Organismal Systems.
Candidate genes of the C. chago transcriptome simultaneously involved in the response to abiotic stress.
| KO ID | Gene | KEGG annotation | Number of unigenes | Abiotic stress |
|---|---|---|---|---|
| K00799 | GST, gst | Glutathione S-transferase | 28 | Cold, water deprivation |
| K14638 | SLC15A3_4, PHT | Solute carrier family 15 (peptide/histidine transporter), member 3/4 | 34 | Water deprivation, pH |
| K03283 | HSPA1_8 | Heat shock 70 kDa protein 1/8 | 32 | Cold, heat |
| K13448 | CML | Calcium-binding protein CML | 33 | Cold, heat |
| K09286 | EREBP | EREBP-like factor | 32 | Water deprivation, cold, heat |
| K04077 | groEL, HSPD1 | Chaperonin GroEL | 16 | Cold, heat |
| K01115 | PLD1_2 | Phospholipase D1/2 | 19 | Water deprivation, cold |
| K00695 | E2.4.1.13 | Sucrose synthase (SUS) | 9 | Water deprivation, cold |
| K09487 | HSP90B, TRA1 | Heat shock protein 90kDa beta | 12 | Water deprivation, cold, heat |
| K17279 | REEP5_6 | Receptor expression-enhancing protein 5/6 | 9 | Water deprivation, cold |
| K01177 | E3.2.1.2 | Beta-amylase | 8 | Water deprivation, cold |
| K17095 | ANXA7_11 | Annexin A7/11 | 8 | Water deprivation, cold, heat |
| K14803 | PTC2_3 | Protein phosphatase PTC2/3 | 10 | Water deprivation, cold |
| K09250 | CNBP | Cellular nucleic acid-binding protein | 8 | Water deprivation, cold |
| K16911 | DDX21 | ATP-dependent RNA helicase DDX21 | 3 | Water deprivation, cold |
| K17679 | MSS116 | ATP-dependent RNA helicase MSS116, mitochondrial | 3 | Water deprivation, cold |
| K06268 | PPP3R, CNB | Serine/threonine-protein phosphatase 2B regulatory subunit | 5 | Water deprivation, cold |
| K12885 | RBMX, HNRNPG | Heterogeneous nuclear ribonucleoprotein G | 4 | Water deprivation, cold |
| K03627 | MBF1 | Putative transcription factor | 2 | Water deprivation, heat |
| K03098 | APOD | Apolipoprotein D and lipocalin family protein | 2 | Cold, heat |
| K17991 | PXG | Peroxygenase | 2 | Water deprivation, cold |
| K04688 | RPS6KB | Ribosomal protein S6 kinase beta | 1 | Cold, heat |
Summary statistics of EST-SSRs and EST-SNPs identified from the transcriptome of C. chago.
| Category | Item | Number |
|---|---|---|
| EST-SSRs | Total number of identified SSRs | 25,654 |
| | Total size of examined sequences (bp) | 84,676,396 |
| | Number of SSR-containing sequences | 20,307 |
| | Number of sequences containing more than one SSR | 3,208 |
| EST-SNPs | Number of SNPs | 640 |
| | SNP frequency per kb | 0.01 |
| | Transition | 396 |
| | A/G | 202 |
| | C/T | 194 |
| | Transversion | 244 |
| | A/C | 63 |
| | A/T | 74 |
| | C/G | 44 |
| | G/T | 63 |
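The SNP counts in the table can be cross-checked with a few lines of arithmetic. The sketch below sums the substitution classes, derives the transition/transversion ratio, and estimates the SNP frequency per kb. It assumes that the total size of examined sequences listed for the SSR search is also the denominator for the SNP frequency; the table does not state this explicitly, and the variable names are illustrative.

```python
# Counts copied from the table above.
transitions = {"A/G": 202, "C/T": 194}
transversions = {"A/C": 63, "A/T": 74, "C/G": 44, "G/T": 63}

# Assumption: the examined-sequence size reported for the SSR search
# is also the denominator used for the SNP frequency.
examined_bp = 84_676_396

n_ts = sum(transitions.values())    # total transitions
n_tv = sum(transversions.values())  # total transversions
n_snps = n_ts + n_tv                # all SNPs

ts_tv_ratio = n_ts / n_tv
snps_per_kb = n_snps / (examined_bp / 1000)

print(f"transitions={n_ts}, transversions={n_tv}, total={n_snps}")
print(f"Ts/Tv ratio: {ts_tv_ratio:.2f}")
print(f"SNPs per kb: {snps_per_kb:.4f}")
```

Under that assumption the class counts add up to the reported 396 transitions, 244 transversions, and 640 SNPs, and the frequency rounds to the reported 0.01 per kb.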
Distribution of EST-SSRs based on motif types and nucleotide repeat units in C. chago.
| SSR type | Repeat number (1–5) | Repeat number (6–10) | Repeat number (11–15) | Repeat number (>15) |
|---|---|---|---|---|
| A/T | 0 | 4826 | 7664 | 2247 |
| C/G | 0 | 35 | 92 | 50 |
| AC/GT | 0 | 656 | 27 | 0 |
| AG/CT | 0 | 4641 | 96 | 0 |
| AT/AT | 0 | 1198 | 59 | 1 |
| CG/CG | 0 | 53 | 0 | 0 |
| AAC/GTT | 127 | 86 | 0 | 0 |
| AAG/CTT | 772 | 611 | 0 | 0 |
| AAT/ATT | 211 | 201 | 0 | 0 |
| ACC/GGT | 136 | 92 | 1 | 0 |
| ACG/CGT | 37 | 22 | 0 | 0 |
| ACT/AGT | 23 | 15 | 0 | 0 |
| AGC/CTG | 202 | 139 | 0 | 0 |
| AGG/CCT | 233 | 171 | 0 | 0 |
| ATC/ATG | 385 | 194 | 0 | 0 |
| CCG/CGG | 81 | 37 | 0 | 0 |
| AAAC/GTTT | 12 | 2 | 0 | 0 |
| AAAG/CTTT | 35 | 4 | 0 | 0 |
| AAAT/ATTT | 48 | 5 | 0 | 0 |
| AACC/GGTT | 1 | 0 | 0 | 0 |
| AACG/CGTT | 1 | 0 | 0 | 0 |
| AAGC/CTTG | 1 | 0 | 0 | 0 |
| AAGG/CCTT | 8 | 1 | 0 | 0 |
| AATC/ATTG | 7 | 0 | 0 | 0 |
| AATG/ATTC | 1 | 0 | 0 | 0 |
| AATT/AATT | 2 | 0 | 0 | 0 |
| ACAG/CTGT | 1 | 0 | 0 | 0 |
| ACAT/ATGT | 10 | 1 | 0 | 0 |
| ACCC/GGGT | 1 | 0 | 0 | 0 |
| ACGC/CGTG | 2 | 0 | 0 | 0 |
| ACGG/CCGT | 0 | 1 | 0 | 0 |
| ACTC/AGTG | 1 | 0 | 0 | 0 |
| ACTG/AGTC | 1 | 0 | 0 | 0 |
| AGAT/ATCT | 18 | 2 | 1 | 0 |
| AGCC/CTGG | 1 | 0 | 0 | 0 |
| AGCG/CGCT | 2 | 1 | 0 | 0 |
| AGGC/CCTG | 2 | 0 | 0 | 0 |
| AGGG/CCCT | 13 | 1 | 0 | 0 |
| ATCC/ATGG | 4 | 1 | 0 | 0 |
| ATCG/ATCG | 1 | 0 | 0 | 0 |
| ATGC/ATGC | 1 | 0 | 0 | 0 |
| AAAAG/CTTTT | 0 | 0 | 1 | 0 |
| AAACG/CGTTT | 0 | 1 | 0 | 0 |
| AAATC/ATTTG | 1 | 0 | 0 | 0 |
| AAGGG/CCCTT | 1 | 0 | 0 | 0 |
| AAGTG/ACTTC | 1 | 0 | 0 | 0 |
| AATAT/ATATT | 2 | 0 | 0 | 0 |
| AATCG/ATTCG | 1 | 0 | 0 | 0 |
| AATCT/AGATT | 1 | 0 | 0 | 0 |
| ACACG/CGTGT | 1 | 0 | 0 | 0 |
| ACAGG/CCTGT | 1 | 0 | 0 | 0 |
| ACCCG/CGGGT | 1 | 0 | 0 | 0 |
| ACTCC/AGTGG | 1 | 0 | 0 | 0 |
| ACTCT/AGAGT | 2 | 0 | 0 | 0 |
| AGAGG/CCTCT | 1 | 0 | 0 | 0 |
| AGCCC/CTGGG | 1 | 0 | 0 | 0 |
| AGGCG/CCTCG | 1 | 0 | 0 | 0 |
| CCCCG/CGGGG | 1 | 0 | 0 | 0 |
| AAAACG/CGTTTT | 1 | 0 | 0 | 0 |
| AAACAG/CTGTTT | 0 | 1 | 0 | 0 |
| AAAGAG/CTCTTT | 0 | 1 | 0 | 0 |
| AAAGAT/ATCTTT | 1 | 0 | 0 | 0 |
| AAATGC/ATTTGC | 1 | 0 | 0 | 0 |
| AACCCC/GGGGTT | 1 | 0 | 0 | 0 |
| AACCTG/AGGTTC | 1 | 0 | 0 | 0 |
| AAGGAG/CCTTCT | 0 | 1 | 0 | 0 |
| AAGGTG/ACCTTC | 0 | 1 | 0 | 0 |
| AAGTGG/ACTTCC | 1 | 0 | 0 | 0 |
| AATAGT/ACTATT | 0 | 1 | 0 | 0 |
| AATATG/ATATTC | 0 | 1 | 0 | 0 |
| AATGAG/ATTCTC | 0 | 1 | 0 | 0 |
| ACACAT/ATGTGT | 1 | 0 | 0 | 0 |
| ACCATC/ATGGTG | 1 | 0 | 0 | 0 |
| ACGAGG/CCTCGT | 1 | 0 | 0 | 0 |
| AGATGG/ATCTCC | 1 | 0 | 0 | 0 |
| AGCAGG/CCTGCT | 0 | 1 | 0 | 0 |
| ATCGCC/ATGGCG | 1 | 0 | 0 | 0 |
| ATCGGC/ATGCCG | 0 | 1 | 0 | 0 |
| Total | 2409 | 13006 | 7941 | 2298 |
| Percentage | 9.39% | 50.70% | 30.95% | 8.96% |
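The two summary rows at the bottom of the table can be reproduced from the column totals. The following sketch recomputes the share of SSRs in each repeat-number class; the dictionary keys are illustrative labels for the four column headers.

```python
# Column totals copied from the last rows of the table above.
totals = {"1-5": 2409, "6-10": 13006, "11-15": 7941, ">15": 2298}

grand_total = sum(totals.values())

# Share of all SSRs falling into each repeat-number class, in percent.
shares = {
    repeat_class: round(100 * count / grand_total, 2)
    for repeat_class, count in totals.items()
}

print(grand_total)
for repeat_class, pct in shares.items():
    print(f"{repeat_class}: {pct:.2f}%")
```

The grand total matches the 25,654 SSRs reported in the abstract, and the rounded shares agree with the percentage row.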