Xiao-Jiao Han, Yang-Dong Wang, Yi-Cun Chen, Li-Yuan Lin, Qing-Ke Wu.
Abstract
BACKGROUND: Aromatic essential oils extracted from fresh fruits of Litsea cubeba (Lour.) Pers. have diverse medical and economic value. The dominant components of these essential oils are monoterpenes and sesquiterpenes. Understanding the molecular mechanisms of terpenoid biosynthesis is essential for improving the yield and quality of terpenes. However, the 40 L. cubeba nucleotide sequences available in public databases are insufficient for studying these mechanisms. Thus, high-throughput transcriptome sequencing of L. cubeba is necessary to generate large quantities of transcript sequences for gene discovery, especially of genes related to terpenoid biosynthesis.
Year: 2013 PMID: 24130803 PMCID: PMC3793921 DOI: 10.1371/journal.pone.0076890
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Length distribution of assembled contigs, transcripts, and unigenes.
| Nucleotide length (bp) | Contigs | Transcripts | Unigenes |
|---|---|---|---|
| 0-100 | 1,758,303 | 0 | 0 |
| 100-200 | 134,073 | 0 | 0 |
| 200-300 | 34,458 | 33,002 | 28,329 |
| 300-400 | 14,452 | 15,481 | 12,137 |
| 400-500 | 7,512 | 8,861 | 6,134 |
| 500-600 | 4,722 | 6,157 | 3,822 |
| 600-700 | 3,256 | 4,621 | 2,678 |
| 700-800 | 2,578 | 3,923 | 2,100 |
| 800-900 | 2,047 | 3,319 | 1,732 |
| 900-1000 | 1,697 | 2,921 | 1,446 |
| 1000-1100 | 1,475 | 2,648 | 1,284 |
| 1100-1200 | 1,296 | 2,305 | 1,158 |
| 1200-1300 | 1,111 | 2,057 | 1,019 |
| 1300-1400 | 844 | 1,773 | 839 |
| 1400-1500 | 824 | 1,579 | 799 |
| 1500-1600 | 731 | 1,364 | 663 |
| 1600-1700 | 650 | 1,199 | 615 |
| 1700-1800 | 577 | 1,057 | 553 |
| 1800-1900 | 465 | 978 | 455 |
| 1900-2000 | 407 | 804 | 424 |
| 2000-2100 | 390 | 719 | 379 |
| 2100-2200 | 322 | 638 | 316 |
| 2200-2300 | 267 | 524 | 261 |
| 2300-2400 | 194 | 424 | 202 |
| 2400-2500 | 202 | 420 | 193 |
| 2500-2600 | 176 | 363 | 183 |
| 2600-2700 | 121 | 262 | 126 |
| 2700-2800 | 91 | 230 | 115 |
| 2800-2900 | 87 | 191 | 86 |
| 2900-3000 | 72 | 174 | 82 |
| 3000-3100 | 71 | 146 | 71 |
| >3100 | 425 | 920 | 447 |
| Total number | 1,973,896 | 99,060 | 68,648 |
| Total length (bp) | 132,955,711 | 67,394,111 | 39,398,812 |
| N50 length (bp) | 87 | 1,053 | 834 |
| Mean length (bp) | 67.36 | 680.34 | 573.93 |
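The summary rows of this table (N50 and mean length) can be recomputed from a list of sequence lengths. The following is a minimal sketch, assuming the common definition of N50 as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. The function names and the toy lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total assembled length is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


def mean_length(lengths):
    """Return the arithmetic mean of the sequence lengths (in bp)."""
    return sum(lengths) / len(lengths)


# Hypothetical toy data, not values from the assembly.
toy_lengths = [200, 300, 400, 500, 1000, 1600]
print(n50(toy_lengths), round(mean_length(toy_lengths), 2))
```

Because the N50 is weighted by length, it is typically larger than the mean for skewed length distributions, which is consistent with the transcript and unigene columns above.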
Figure 1. The dependence of unigene lengths on the number of reads assembled into each unigene.
Figure 2. Assessment of assembly quality.
Distribution of unique-mapped reads and RPKM (reads per kb per million reads) of the assembled unigenes.
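As an illustration of the RPKM measure named in this caption, the sketch below uses the standard definition: reads mapped to a unigene, divided by the unigene length in kilobases and by the total number of mapped reads in millions. The function name, parameter names, and example numbers are hypothetical, and the exact read-counting rules used in the study are not stated here.

```python
def rpkm(mapped_reads, unigene_length_bp, total_mapped_reads):
    """Reads per kilobase of unigene per million mapped reads."""
    if unigene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total read count must be positive")
    # 1e9 = 1e3 (bp -> kb) * 1e6 (reads -> millions of reads)
    return mapped_reads * 1e9 / (unigene_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000 bp unigene in a
# library of 10 million uniquely mapped reads.
print(rpkm(500, 2000, 10_000_000))
```

Normalizing by both length and library size makes values comparable between long and short unigenes and between libraries of different depth.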
Functional annotation of the L. cubeba unigenes.
| Annotated databases | Unigenes | Percentage of unigenes |
|---|---|---|
| Nr-annotation | 36,041 | 52.5% |
| Nt-annotation | 25,806 | 37.6% |
| SwissProt-annotation | 25,606 | 37.3% |
| TrEMBL-annotation | 35,732 | 52.1% |
| GO-annotation | 25,340 | 36.9% |
| KEGG-annotation | 7,702 | 11.2% |
| COG-annotation | 9,803 | 14.3% |
| PlantCyc-annotation | 12,963 | 18.9% |
| Total | 38,439 | 56.0% |
Figure 3. Species distribution of the top BLAST hits in the Nr database.
A total of 36,041 unigenes with BLASTX hits were included. Species accounting for more than 1% of the hits are shown.
Figure 4. Functional annotation of assembled sequences based on gene ontology (GO) categorization.
The unigenes are summarized into three main categories: cellular component, molecular function and biological process.
Figure 5. Clusters of orthologous groups (COG) classification.
In total, 9,803 of the 68,648 unigenes were grouped into 25 COG classifications.
Figure 6. Monoterpene and sesquiterpene biosynthetic pathway in L. cubeba (adapted from Gahlan et al. and Ma et al. [57, 58]).
Each enzyme name is followed in parentheses by the number of unigenes homologous to gene families encoding this enzyme. AACT: acetoacetyl-CoA thiolase; HMGS: 3-hydroxy-3-methylglutaryl-CoA synthase; HMGR: 3-hydroxy-3-methylglutaryl-CoA reductase; MVK: mevalonate kinase; PMVK: phosphomevalonate kinase; MVD: mevalonate diphosphate decarboxylase; DXS: 1-deoxy-D-xylulose-5-phosphate synthase; DXR: 1-deoxy-D-xylulose-5-phosphate reductoisomerase; MCT: 4-diphosphocytidyl-2C-methyl-D-erythritol synthase; CMK: 4-diphosphocytidyl-2C-methyl-D-erythritol kinase; MECPS: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase; HDS: 1-hydroxy-2-methyl-2-(E)-butenyl-4-diphosphate synthase; HDR: 1-hydroxy-2-methyl-2-(E)-butenyl-4-diphosphate reductase; IPPI: isopentenyl-diphosphate isomerase; FPPS: farnesyl diphosphate synthase; GPPS: geranyl diphosphate synthase; Sesqui-TPS: sesquiterpene synthase; Mono-TPS: monoterpene synthase; CAS: (+)-3-carene synthase; CUS: beta-cubebene synthase; OS: trans-ocimene synthase; GES: geraniol synthase; KS: ent-kaurene synthase; LS: limonene synthase; THS: alpha-thujene synthase; TERS: alpha-terpineol synthase; LIS: linalool synthase; PINS: pinene synthase; EAS: 5-epi-aristolochene synthase; CAD: delta-cadinene synthase; GAS: germacrene A synthase; NES/LIS: nerolidol/linalool synthase.
Figure 7. Terpenoid pathways represented in the PlantCyc annotation of the unigenes.
Frequency of SSRs in L. cubeba.
| Motif length | 5 repeats | 6 repeats | 7 repeats | 8 repeats | 9 repeats | 10 repeats | >10 repeats | Total | % |
|---|---|---|---|---|---|---|---|---|---|
| Di | - | 419 | 270 | 202 | 166 | 198 | 106 | 1361 | 50.90 |
| Tri | 727 | 319 | 165 | 17 | 1 | 1 | 2 | 1232 | 46.07 |
| Tetra | 36 | 10 | 0 | 3 | 0 | 0 | 0 | 49 | 1.83 |
| Penta | 11 | 1 | 0 | 0 | 0 | 1 | 0 | 13 | 0.49 |
| Hexa | 8 | 4 | 5 | 0 | 2 | 0 | 0 | 19 | 0.71 |
| Total | 782 | 753 | 440 | 222 | 169 | 200 | 108 | 2674 | |
| % | 29.24 | 28.16 | 16.45 | 8.30 | 6.32 | 7.48 | 4.04 | | |
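To make the counting scheme in this table concrete, here is a minimal sketch of how perfect SSRs could be located in a sequence with regular expressions. It is not the tool used in the study. The thresholds (at least six repeats for dinucleotide motifs, at least five for tri- to hexanucleotide motifs) are an assumption inferred from the empty "5 repeats" cell in the Di row, and `MIN_REPEATS`, `is_primitive`, and `find_ssrs` are hypothetical names.

```python
import re

# Assumed minimum repeat counts per motif length (inferred from the table).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit.

    This rejects homopolymers such as 'AA' and avoids counting a
    dinucleotide repeat a second time as a tetranucleotide (e.g. 'AGAG').
    """
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(sequence):
    """Return (start, motif, repeat_count) for each perfect SSR found."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # A motif of the given length followed by at least (min_rep - 1)
        # further copies of itself.
        pattern = re.compile(
            r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1)
        )
        for match in pattern.finditer(sequence):
            motif = match.group(2)
            if is_primitive(motif):
                repeats = len(match.group(1)) // motif_len
                hits.append((match.start(), motif, repeats))
    return hits


# Hypothetical test sequence: (AG)7 and (ATC)5 separated by spacers.
example = "TT" + "AG" * 7 + "TT" + "ATC" * 5 + "GG"
print(find_ssrs(example))
```

The scan reports a repeat in whichever frame it meets first, so the same locus may be labelled with a rotated motif (for example CAT instead of ATC) depending on the flanking bases. Tallying the hits by motif length and repeat count would give a table of the shape shown above.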
Figure 8. Phylogenetic tree of terpene synthases.
The two clades illustrate the likely enzymatic function of 14 L. cubeba unigenes, which are distributed throughout both main clades of the tree. Reference sequences: Vitvi Germacrene D (NP_001268213.1, (-)-germacrene D synthase, Vitis vinifera), Vitvi Caryophyllene (AEP17005.1, (E)-beta-caryophyllene synthase, Vitis vinifera), Maldo Germacrene D (AGB14625.1, germacrene-D synthase, Malus domestica), Frave Germacrene D (XP_004308410.1, (-)-germacrene D synthase-like, Fragaria vesca subsp. vesca), Artan Amorphadiene (AAF98444.1, amorpha-4,11-diene synthase, Artemisia annua), Eletr Copaene (ADK94034.1, alpha-copaene synthase, Eleutherococcus trifoliatus), Solly Germacrene C (XP_004242295.1, germacrene C synthase-like, Solanum lycopersicum), Actde Germacrene D (AAX16121.1, germacrene-D synthase, Actinidia deliciosa), Cicar Germacrene D (XP_004505471.1, (-)-germacrene D synthase-like, Cicer arietinum), Ricco Cadinene (XP_002523635.1, (+)-delta-cadinene synthase isozyme A, Ricinus communis), Vitvi Curcumene (ADR74200.1, beta-curcumene synthase, Vitis vinifera), Vitvi Cadinene (ADR74199.1, gamma-cadinene synthase, Vitis vinifera), Vitvi Valencene (NP_001268028.1, valencene synthase-like, Vitis vinifera), Vitvi Germacrene A (ADR66821.1, germacrene A synthase, Vitis vinifera), Citsi Valencene (AF441124_1, valencene synthase, Citrus sinensis), Solca Cascarilladiene (AAT72931.1, cascarilladiene synthase, Solidago canadensis), Goshi Cadinene (AAX44033.1, (+)-delta-cadinene synthase, Gossypium hirsutum), Theca Cadinene (EOY12645.1, delta-cadinene synthase isozyme A, Theobroma cacao), Warug Sesquiterpene (ACJ46047.1, sesquiterpene synthase, Warburgia ugandensis), Maggr Cubebene (B3TPQ6.1, beta-cubebene synthase, Magnolia grandiflora), Nicta Epi-Aristolochene (3M02.A, 5-epi-aristolochene synthase, Nicotiana tabacum), Soltu Vetispiradiene (Q9XJ32.1, vetispiradiene synthase 1, Solanum tuberosum), Solly Vetispiradiene (AAG09949.1|AF171216_1, vetispiradiene synthase, Solanum lycopersicum), Litcu Ocimene (AEJ91554.1, trans-ocimene synthase, Litsea cubeba), Litcu Thujene (AEJ91555.1, alpha-thujene synthase, Litsea cubeba), Litcu Thujene/Sabinene (AEJ91556.1, alpha-thujene synthase/sabinene synthase, Litsea cubeba), Maggr Terpineol (ACC66282.1, alpha-terpineol synthase, Magnolia grandiflora), Maldo Ocimene (AGB14628.1, (E)-beta-ocimene synthase, Malus domestica), Phalu Ocimene (ABY65110.1, beta-ocimene synthase, Phaseolus lunatus), Citli Limonene (AAM53946.1|AF514289_1, (+)-limonene synthase 2, Citrus limon), Pontr Limonene (BAG74774.1, limonene synthase, Poncirus trifoliata), Cucsa Terpineol (XP_004161807.1, (-)-alpha-terpineol synthase-like, Cucumis sativus), Menaq Linalool (AAL99381.1, linalool synthase, Mentha aquatica), Cofar Limonene (CCM43927.1, limonene synthase, Coffea arabica), Vitvi Ocimene/Myrcene (ADR74206.1, (E)-beta-ocimene/myrcene synthase, Vitis vinifera), Vitvi Terpineol (NP_001268216.1, (-)-alpha-terpineol synthase, Vitis vinifera), Ricco Limonene (XP_002533355.1, (R)-limonene synthase, Ricinus communis), Queil Pinene (CAK55186.1, pinene synthase, Quercus ilex), Citun Ocimene (BAD91046.1, (E)-beta-ocimene synthase, Citrus unshiu), Citun Terpinene (BAD27259.1, gamma-terpinene synthase, Citrus unshiu), Artan Linalool (AAF13356.1|AF154124_1, (3R)-linalool synthase, Artemisia annua), Cinos Linalool (AFQ20811.1, S-(+)-linalool synthase, Cinnamomum osmophloeum), Vitvi Linalool/Nerolidol (ADR74212.1, (3S)-linalool/(E)-nerolidol synthase, Vitis vinifera).
Figure 9. RT-qPCR analysis of 16 terpenoid pathway-related candidate unigenes in L. cubeba.
The gene names, sequences and the primers used for RT-qPCR analysis are shown in File S3. Standard error of the mean for three biological replicates (nested with three technical replicates) is represented by the error bars.
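One common way to obtain error bars of this kind is to average the technical replicates within each biological replicate first, then take the standard error across the biological means. The sketch below assumes that procedure. Whether the authors collapsed technical replicates in exactly this way is not stated in the caption, and the function name and expression values are hypothetical.

```python
import math
from statistics import mean, stdev


def mean_and_sem(bio_replicates):
    """Return (mean, SEM) across biological replicates.

    bio_replicates is a list of lists; each inner list holds the
    technical-replicate measurements of one biological replicate.
    """
    # Collapse technical replicates so that each biological replicate
    # contributes a single value.
    bio_means = [mean(tech) for tech in bio_replicates]
    # SEM = sample standard deviation / sqrt(number of biological replicates)
    sem = stdev(bio_means) / math.sqrt(len(bio_means))
    return mean(bio_means), sem


# Hypothetical relative expression values: 3 biological x 3 technical.
values = [[1.0, 1.2, 1.1], [1.4, 1.5, 1.6], [1.9, 2.0, 2.1]]
avg, sem = mean_and_sem(values)
print(round(avg, 3), round(sem, 3))
```

Treating the nine measurements as independent would understate the error, since technical replicates of the same sample are correlated; averaging them first keeps the sample size at the number of biological replicates.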