Aparupa Bose Mazumdar, Sharmila Chattopadhyay.
Abstract
Phyllanthus amarus Schum. and Thonn., a widely distributed annual medicinal herb, has been used in traditional systems of medicine for over 2000 years. However, the lack of genomic data for P. amarus, a non-model organism, hinders research at the molecular level. In the present study, high-throughput sequencing was employed to improve understanding of this herb and to provide comprehensive genomic information for future work. The P. amarus leaf transcriptome was sequenced using the Illumina Miseq platform. We assembled 85,927 non-redundant (nr) "unitranscript" sequences with an average length of 1548 bp from 18,060,997 raw reads. Sequence similarity analyses and annotation of these unitranscripts were performed against the green plants nr protein database, Gene Ontology (GO), Clusters of Orthologous Groups (COG), PlnTFDB, and KEGG databases. As a result, 69,394 GO terms, 583 enzyme codes (EC), 134 KEGG maps, and 59 transcription factor (TF) families were obtained. Functional and comparative analyses of the assembled unitranscripts were also performed against closely related species such as Populus trichocarpa and Ricinus communis using TRAPID. KEGG analysis showed that a number of assembled unitranscripts were involved in secondary metabolite biosynthesis, mainly the phenylpropanoid, flavonoid, terpenoid, alkaloid, and lignan biosynthetic pathways, which have significant medicinal attributes. Further, Fragments Per Kilobase of transcript per Million mapped reads (FPKM) values of the identified secondary metabolite pathway genes were determined, and Reverse Transcription PCR (RT-PCR) of a few of these genes was performed to validate the de novo assembled leaf transcriptome dataset. In addition, 65,273 simple sequence repeats (SSRs) were identified. To the best of our knowledge, this is the first transcriptomic dataset of P. amarus to date.
Our study provides the largest genetic resource that will lead to drug development and pave the way for deciphering various secondary metabolite biosynthetic pathways in P. amarus, especially those conferring the medicinal attributes of this potent herb.
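The FPKM normalization mentioned in the abstract can be illustrated with a minimal Python sketch. The function name `fpkm` and the counts in the example are hypothetical and are not taken from the study; only the standard definition (fragments per kilobase of transcript per million mapped fragments) is assumed.

```python
def fpkm(fragments_mapped: int, transcript_length_bp: int,
         total_mapped_fragments: int) -> float:
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)
    The factor 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments_mapped * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example values (not from the study):
# 500 fragments on a 2000 bp transcript, 10 million mapped fragments in total.
example_value = fpkm(500, 2000, 10_000_000)
print(f"FPKM = {example_value:.2f}")
```

Because the value is scaled by both transcript length and library size, it allows a rough comparison of expression between transcripts of different lengths within one library.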
Keywords: Illumina Miseq; Phyllanthus amarus; de novo assembly; functional annotation; leaf transcriptome; next-generation sequencing (NGS); secondary metabolism
Year: 2016 PMID: 26858723 PMCID: PMC4729934 DOI: 10.3389/fpls.2015.01199
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1. (A) Sequence length distribution of P. amarus non-redundant unique unitranscript sequences. (B) BLASTX-hit species distribution of P. amarus unitranscripts against nr protein database. (C) E-value distribution of BLAST hits against nr protein database.
Figure 2. Graphical representations of functional annotations in P. amarus. (A) Representation of mapping database (UniprotKB, TAIR) sources. (B) Annotation score distribution of assembled unitranscripts. (C) Sequence similarity distribution graph. (D) Distribution of annotated sequences with length. (E) GO level distribution of annotated unitranscript sequences.
Figure 3. GO functional classifications using WEGO.
Figure 4. Clusters of Orthologous Groups (COG) classification of P. amarus.
Figure 5. Annotation of P. amarus unitranscripts. (A) Distribution of P. amarus unitranscripts into KEGG biological categories. (B) Classification of P. amarus leaf transcriptome into KEGG “Metabolism” category.
Figure 6. Phenylpropanoid biosynthesis pathway study by KEGG analysis showing the different identified enzymes (one color for each Enzyme Code or EC).
Figure 7. Flavonoid biosynthesis pathway study in P. amarus.
Figure 8. KEGG analysis showing genes involved in the MVA and MEP pathways forming the terpenoid backbone biosynthesis (each EC with one color).
Figure 9. Indole alkaloid biosynthetic pathway genes found in P. amarus.
Summary of a few major genes involved in phenylpropanoid and flavonoid biosynthesis pathways putatively identified from P. amarus.
| Enzyme | EC number | Unitranscript ID | Length (bp) | E-value | Protein accession | No. of unitranscripts |
| --- | --- | --- | --- | --- | --- | --- |
| Phenylalanine Ammonia Lyase (PAL) | 4.3.1.24 | Unitranscript 1386, 31401, 31403, | 2379, 2553, 1162, 1643, 1186, 2574, | 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 | XP_002519521.1, XP_002531677.1 | 10 |
| Cinnamate 4-hydroxylase / trans-cinnamate 4-monooxygenase | 1.14.13.11 | Unitranscript 31571, 79728, 79729 | 1331, 622, 610 | 0, 2.53E-095, | XP_002523952.1 | 3 |
| Flavonoid 3′-monooxygenase | 1.14.13.21 | Unitranscript 67104, 74675 | 1154, 1948 | 2.93E-110, 0 | XP_002533334.1, XP_002531093.1 | 2 |
| Flavonoid 3′,5′-hydroxylase | 1.14.13.88 | Unitranscript 2751, 29623, 48352, 48353, 48354, 48355, 48356, 48357, 48358, 48359, 48360 | 5660, 1912, 876, 1272, | 0, 0, 1.05E-053, 9.11E-117, 0, 0, 4.15E-041, | XP_002528647.1, XP_002510313.1, XP_002531094.1 | 24 |
| Chalcone synthase | 2.3.1.74 | Unitranscript 12832, 12833, 12834, | 5097, 3769, | 0, 0, 0, 0 | XP_002529257.1 | 4 |
| Chalcone isomerase | 5.5.1.6 | Unitranscript 5340, 5341, 42537, | 863, 960, 838, 1322, 1210, 637, 908 | 2.74E-104, | XP_002315258.1 | 7 |
| Flavonol synthase | 1.14.11.23 | Unitranscript 29419, 33161, 77019, 77020 | 1822, 383, | 6.76E-089, | XP_002531459.1 | 4 |
| Leucoanthocyanidin dioxygenase | 1.14.11.19 | Unitranscript 13933, 13934, 13942, 13943, 70616, 70617 | 515, 512, 398, 466, 1445, | 2.64E-033 | XP_002533635.1 | 6 |
Summary of a few major genes involved in terpenoid and alkaloid biosynthesis pathways putatively identified from P. amarus.
| Enzyme | EC number | Unitranscript ID | Length (bp) | E-value | Protein accession | No. of unitranscripts |
| --- | --- | --- | --- | --- | --- | --- |
| HMG-CoA synthase | 2.3.3.10 | Unitranscript 62630 | 1906 | 0 | XP_002509692.1 | 1 |
| HMG-CoA reductase | 1.1.1.34 | Unitranscript 1329, 30497, 30499, 30500, 30501, 30502, 30503, 30504, | 4686, 973, 747, 744, | 0, 6.75E-126, | XP_002510732.1 | 14 |
| Mevalonate diphosphate decarboxylase | 4.1.1.33 | Unitranscript 2488, 45660 | 3562, 1734 | 0, 0 | XP_002521172.1 | 2 |
| 1-deoxy-D-xylulose-5-phosphate synthase | 2.2.1.7 | Unitranscript 5233, 6343, 45230, 45231, 53517, | 2774, 1120, 2604, 2671, | 0, 0, 0, 0, 0, | XP_002514364.1 | 12 |
| 1-deoxy-D-xylulose-5-phosphate reductoisomerase | 1.1.1.267 | Unitranscript 54720, 54721 | 2158, 2108 | 0, 0 | XP_002511399.1 | 2 |
| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase | 1.17.1.2 | Unitranscript 26231 | 1352 | 2.75E-141 | XP_002519102.1 | 1 |
| Isopentenyl diphosphate delta isomerase | 5.3.3.2 | Unitranscript 3238, 53622, 53623, 53624 | 1438, 881, | 4.40E-159, 2.54E-129, | XP_002514848.1 | 4 |
| Squalene synthase | 2.5.1.21 | Unitranscript 658, 20308, 20313 | 2021, 1773, 1856 | 0, 0, 0 | NP_001236365.1 | 3 |
| Squalene monooxygenase | 1.14.13.132 | Unitranscript 81899 | 338 | 1.36E-068 | XP_002530610.1 | 1 |
| Polyneuridine-aldehyde esterase | 3.1.1.78 | Unitranscript 869, 73905, 73906 | 1595, 1624, | 6.04E-135, 0, | XP_002522352.1 | 3 |
| Strictosidine synthase | 4.3.3.2 | Unitranscript 53363, 53366, 62407, 62408 | 243, 1726, 1523, 831 | 1.01E-023, 0, 0, 3.97E-117 | XP_002513740.1 | 4 |
| Deacetoxyvindoline 4-hydroxylase | 1.14.11.20 | Unitranscript 3405, 3406, 17317, 17320, 41143, 41144, 48984, 48986, 48987, 55158, 57629, 57630, 57631, 63714, 63716, 79369, 79370, 80577, 84582 | 1758, 1086, 1475, 1627, 2334, 2415, 2336, 1508, 2179, 256, 1736, 1714, 531, 587, 717, 564, 664, 409, 337 | 0, 8.92E-128, 1.95E-177, 0, | XP_002529304.1 | 19 |
| Polyphenol oxidase | 1.10.3.1 | Unitranscript 16967, 16969 | 756, 952, | 1.43E-015, 9.29E-015, | XP_002316632.1 | 5 |
| Amine oxidase | 1.4.3.21 | Unitranscript 63789, 63790, 65487, 79801, 81672 | 692, 2340, 3132, 1096, 236 | 1.53E-075, 0, 0, 3.70E-173, 2.30E-034 | XP_002509596.1 | 5 |
| N-methylcoclaurine 3′-monooxygenase | 1.14.13.71 | Unitranscript 77543, 79547 | 944, 514 | 2.61E-102, 1.39E-067 | XP_002510830.1 | 2 |
| Reticuline oxidase | 1.21.3.3 | Unitranscript 3212, 70115, | 1094, 1725 | 2.91E-173, 0, 0, 5.41E-156 | XP_002523151.1 | 4 |
Figure 10. Transcription factors identified from leaf transcriptome of P. amarus.
Statistics of SSRs identified from P. amarus.
| Parameter | Value |
| --- | --- |
| Total number of sequences examined | 85,927 |
| Total size of examined sequences (bp) | 133,023,042 |
| Total number of identified SSRs | 65,273 |
| Number of SSR-containing sequences | 28,304 |
| Number of sequences containing more than 1 SSR | 11,906 |
| Number of SSRs present in compound formation | 29,652 |
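A few derived figures follow directly from the SSR statistics above. The sketch below, in Python with a hypothetical helper name `ssr_summary`, only recomputes ratios from the reported counts; it does not reproduce the SSR search itself.

```python
def ssr_summary(total_sequences: int, total_bp: int, total_ssrs: int,
                ssr_sequences: int, multi_ssr_sequences: int) -> dict:
    """Derive simple ratios from the reported SSR statistics."""
    return {
        # Mean length of the examined sequences, in bp
        "mean_length_bp": total_bp / total_sequences,
        # Average spacing: one SSR per this many bp of sequence
        "bp_per_ssr": total_bp / total_ssrs,
        # Share of all sequences carrying at least one SSR
        "fraction_with_ssr": ssr_sequences / total_sequences,
        # Share of SSR-containing sequences carrying more than one SSR
        "fraction_multi_ssr": multi_ssr_sequences / ssr_sequences,
    }


# Values copied from the table of SSR statistics
summary = ssr_summary(
    total_sequences=85_927,
    total_bp=133_023_042,
    total_ssrs=65_273,
    ssr_sequences=28_304,
    multi_ssr_sequences=11_906,
)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

With the reported values, the mean sequence length comes out at about 1548 bp, which agrees with the average unitranscript length stated in the abstract. The same numbers give roughly one SSR per 2.04 kb of sequence, with about 32.9% of sequences carrying at least one SSR.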
Figure 11. Identification of molecular markers (SSRs) in leaf transcriptome of P. amarus. (A) Distribution of SSRs into mono-, di-, tri-, tetra-, penta-, and hexa-nucleotide repeat types. (B) Distribution of mono- and di-nucleotide SSR motifs and percent frequency of their repeat types.
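The repeat classes in Figure 11 (mono- to hexa-nucleotide) can be illustrated with a small regex-based scan. This is a sketch only: the source does not state which tool or thresholds were used, so the minimum repeat counts in `MIN_REPEATS` and the function names `find_ssrs` and `count_by_class` are assumptions for illustration. Compound and interrupted SSRs are not handled.

```python
import re
from collections import Counter

# ASSUMED thresholds (motif length -> minimum number of repeats);
# these are illustrative and not taken from the source.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def _is_primitive(unit: str) -> bool:
    """True if the motif is not itself a repeat of a shorter motif (e.g. 'AA', 'ATAT')."""
    k = len(unit)
    return not any(k % d == 0 and unit == unit[:d] * (k // d) for d in range(1, k))


def find_ssrs(seq: str, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n_min in min_repeats.items():
        # A k-mer followed by at least (n_min - 1) further copies of itself
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n_min - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if _is_primitive(unit):
                hits.append((match.start(), unit, len(match.group(0)) // k))
    return sorted(hits)


def count_by_class(hits):
    """Count SSRs per motif length (1 = mono, 2 = di, ... 6 = hexa)."""
    return Counter(len(unit) for _, unit, _ in hits)


# Hypothetical test sequence: a poly-A run and an (AG)n repeat
demo = "CCG" + "A" * 12 + "CGT" + "AG" * 7 + "C"
demo_hits = find_ssrs(demo)
print(demo_hits)
print(count_by_class(demo_hits))
```

Filtering out non-primitive motifs keeps a poly-A run from being reported a second time as an (AA)n dinucleotide repeat.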
Figure 12. RT-PCR image of selected genes.