Ramasamy S Annadurai, Ramprasad Neethiraj, Vasanthan Jayakumar, Anand C Damodaran, Sudha Narayana Rao, Mohan A V S K Katta, Sreeja Gopinathan, Santosh Prasad Sarma, Vanitha Senthilkumar, Vidya Niranjan, Ashok Gopinath, Raja C Mugasimangalam.
Abstract
Herbal remedies have been increasingly recognised in recent years as alternative medicine for a number of diseases, including cancer. Curcuma longa L., commonly known as turmeric, is used as a culinary spice in India and in many other Asian countries, and its consumption has been linked to lower incidences of gastrointestinal cancers. Curcumin, a secondary metabolite isolated from the rhizomes of this plant, has been shown to have significant anticancer properties, in addition to antimalarial and antioxidant effects. We sequenced the rhizome transcriptomes of three varieties of Curcuma longa L. using Illumina reversible dye terminator sequencing, followed by de novo transcriptome assembly. Multiple databases were used to obtain a comprehensive annotation, and the transcripts were functionally classified using GO, KOG and PlantCyc. Special emphasis was given to annotating the secondary metabolite and terpenoid biosynthesis pathways. We report, for the first time, the presence of transcripts related to the biosynthetic pathways of several anti-cancer compounds such as taxol, curcumin and vinblastine, in addition to anti-malarial compounds such as artemisinin and acridone alkaloids, emphasizing turmeric's importance as a source of highly potent phytochemicals. Our data provide not only molecular signatures for several terpenoids but also a comprehensive molecular resource for facilitating deeper insights into the transcriptome of C. longa.
Year: 2013 PMID: 23468859 PMCID: PMC3585318 DOI: 10.1371/journal.pone.0056217
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of RNA-Seq.
| | Cultivar Nattu | Cultivar Erode | Cultivar Mysore |
| Number of raw reads | 41,039,760 | 60,685,196 | 74,386,806 |
| Read length | 72 | 73 | 100 |
| Number of High Quality (HQ) bases | 2,786,398,656 | 4,124,692,295 | 6,787,408,561 |
| Percentage of HQ bases | 94.3 | 93.1 | 91.2 |
| Reads after trimming adapters and low quality bases | 34,924,986 | 48,755,296 | 63,574,950 |
| Number of bases in trimmed reads | 2,496,109,122 | 3,525,038,126 | 6,085,481,848 |
| Mean trimmed read length | 71.47 | 72.30 | 95.72 |
| Median trimmed read length | 72 | 73 | 100 |
Read counts are the sum of Read1 and Read2.
High Quality (HQ) bases are bases with a Phred score above 20.
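The derived rows of the RNA-Seq summary can be recomputed from the raw counts. The following is a minimal sketch, assuming that the percentage of HQ bases is taken relative to raw reads multiplied by read length and that the mean trimmed read length is trimmed bases divided by trimmed reads; the dictionary layout and the function name `derived_metrics` are illustrative and not from the source.

```python
# Values copied from the "Summary of RNA-Seq" table.
runs = {
    "Nattu": {"raw_reads": 41_039_760, "read_len": 72,
              "hq_bases": 2_786_398_656,
              "trimmed_reads": 34_924_986, "trimmed_bases": 2_496_109_122},
    "Erode": {"raw_reads": 60_685_196, "read_len": 73,
              "hq_bases": 4_124_692_295,
              "trimmed_reads": 48_755_296, "trimmed_bases": 3_525_038_126},
    "Mysore": {"raw_reads": 74_386_806, "read_len": 100,
               "hq_bases": 6_787_408_561,
               "trimmed_reads": 63_574_950, "trimmed_bases": 6_085_481_848},
}


def derived_metrics(run):
    """Recompute the derived rows of the table (assumed definitions)."""
    total_bases = run["raw_reads"] * run["read_len"]
    return {
        # Share of all sequenced bases that pass the Phred > 20 filter.
        "pct_hq_bases": 100 * run["hq_bases"] / total_bases,
        # Average read length after adapter and quality trimming.
        "mean_trimmed_len": run["trimmed_bases"] / run["trimmed_reads"],
    }


for name, run in runs.items():
    m = derived_metrics(run)
    print(f"{name}: {m['pct_hq_bases']:.1f}% HQ bases, "
          f"mean trimmed length {m['mean_trimmed_len']:.2f}")
```

Under these assumptions the recomputed values agree with the tabulated percentages and mean lengths for all three cultivars.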
Figure 1. Transcript assembly statistics.
(A) Length of the assembled transcripts vs. number of transcripts. (B) ATGC composition of the representative transcripts (RTs).
Assembly summary of cultivars Nattu, Erode and Mysore, ArREST ESTs and representative transcripts.
| | Cultivar Nattu | Cultivar Erode | Cultivar Mysore | ArREST | Representative Transcripts |
| No of Transcripts | 56,787 | 65,956 | 92,214 | 78,516 | 61,538 |
| Maximum transcript length | 15,271 | 11,938 | 15,293 | 6,639 | 15,293 |
| Minimum transcript length | 200 | 200 | 200 | 100 | 200 |
| Total transcript length (bases) | 53,751,599 | 62,409,692 | 120,256,594 | 21,494,172 | 56,012,833 |
| Number of Ns | 537 | 839 | 4,775 | 19,766 | 902 |
| Mean transcript length | 946.55 | 946.23 | 1304.10 | 273.755 | 910.22 |
| N50 | 1,466 | 1,448 | 1,995 | 467 | 1,515 |
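For readers unfamiliar with the N50 row: N50 is the length L such that transcripts of length L or longer together contain at least half of all assembled bases. The sketch below applies this standard definition to a toy list of lengths. It is not the authors' assembly pipeline, and the function name `n50` is illustrative.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (not data from the paper): total is 54 bases, half is 27.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))                      # 10 + 9 + 8 = 27, so N50 is 8
print(sum(toy_lengths) / len(toy_lengths))   # mean length, for comparison
```

Because N50 is weighted by bases rather than by transcript count, it is normally larger than the mean transcript length, as in every column of the table above.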
Figure 2. Coverage distribution of NCBI C. longa ESTs matched against representative transcripts using BLAST.
Figure 3. Top ten most represented GO terms in each of the three GO domains.
Figure 4. KOG classification.
Figure 5. Top ten most expressed Pfam domains.
Annotation summary.
| Database | Version | Transcripts | Percentage of transcripts |
| GenBank-NT | As of 14th March 2012 | 116 | 0.35% |
| KOG | As of 9th April 2012 | 8,322 | 24.76% |
| PlantCyc | Version 2.0 | 2,437 | 7.25% |
| Swiss-Prot | As of 21st March 2012 | 15,632 | 46.51% |
| TrEMBL | As of 21st March 2012 | 6,829 | 20.31% |
| Pfam | Version 26.0 | 277 | 0.82% |
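The percentage column sums to 100%, which suggests (the source does not say so explicitly) that each percentage is a share of the transcripts that received an annotation rather than of all representative transcripts. The short check below, with illustrative variable names, recomputes the shares under that assumption.

```python
# Transcript counts copied from the "Annotation summary" table.
annotated = {
    "GenBank-NT": 116,
    "KOG": 8_322,
    "PlantCyc": 2_437,
    "Swiss-Prot": 15_632,
    "TrEMBL": 6_829,
    "Pfam": 277,
}

# Assumption: the denominator is the sum of the Transcripts column.
total_annotated = sum(annotated.values())

shares = {db: 100 * n / total_annotated for db, n in annotated.items()}

print(f"annotated transcripts: {total_annotated:,}")
for db, pct in shares.items():
    print(f"{db:<12} {pct:6.2f}%")
```

With this denominator every recomputed share lies within 0.01 percentage points of the tabulated value, which supports the reading but does not confirm it.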
Alignment summary of cultivars Nattu, Erode and Mysore.
| Statistics | Cultivar Nattu | Cultivar Erode | Cultivar Mysore |
| Total reads | 34,924,986 | 48,755,296 | 63,574,950 |
| Reads aligned | 31,572,848 | 44,488,764 | 59,171,432 |
| %Reads aligned | 90.40 | 91.24 | 93.07 |
| Reference sequence length | 56,012,833 | 56,012,833 | 56,012,833 |
| Total reference covered | 48,952,920 | 49,806,266 | 53,364,114 |
| % Total reference covered | 87.40 | 88.89 | 95.27 |
Figure 6. Expression profile of the differentially expressed transcripts: (A) in cultivar B with respect to A; (B) in cultivar C with respect to A.
Summary of SSRs.
| Motif size | SSRs observed in Cultivar Nattu | SSRs observed in Cultivar Erode | SSRs observed in Cultivar Mysore |
| 2 | 1014 (10.0%) | 1364 (11.4%) | 3023 (12.1%) |
| 3 | 5646 (55.9%) | 6421 (53.7%) | 12927 (51.7%) |
| 4 | 2150 (21.3%) | 2707 (22.7%) | 5973 (23.9%) |
| 5 | 550 (5.4%) | 683 (5.7%) | 1463 (5.9%) |
| 6 | 748 (7.4%) | 782 (6.5%) | 1600 (6.4%) |
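To illustrate what the motif sizes in this table mean, the sketch below detects simple sequence repeats (SSRs) with motif sizes 2 to 6 using regular expressions. The source does not name the SSR-detection tool or its thresholds, so the minimum repeat counts in `MIN_REPEATS`, the function name `find_ssrs` and the test sequence are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical thresholds: motif size -> minimum number of tandem copies.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (motif_size, motif, start, copies) for each SSR in seq."""
    seq = seq.upper()
    hits = []
    for size, copies in sorted(min_repeats.items()):
        # A motif of `size` bases followed by at least copies-1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT").
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((size, motif, match.start(),
                         len(match.group(0)) // size))
    return hits


# Made-up sequence containing one dinucleotide and one trinucleotide SSR.
example = "GG" + "AT" * 7 + "CCGT" + "GAA" * 5 + "TTCA"
ssrs = find_ssrs(example)
print(ssrs)
print(Counter(size for size, *_ in ssrs))  # tally of SSRs per motif size
```

Tallying hits by motif size over all transcripts of a cultivar would give counts of the kind shown in the table, although the actual numbers depend on the thresholds and tool used.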
Comparative analysis of Plant transcriptome N50 values.
| Organism | N50 (in bases) |
| | 948 |
| | 938 |
| | 1378 |
| | 1192 |
| | ∼1500 |
| | 1510 |
| | 485 |
| | 220, 150, 180 |
| | 765 |
| | 506 |
Figure 7. Terpenoid pathways represented in the PlantCyc annotation of the representative transcripts.