Ryan A Groves, Jillian M Hagel, Ye Zhang, Korey Kilpatrick, Asaf Levy, Frédéric Marsolais, Efraim Lewinsohn, Christoph W Sensen, Peter J Facchini.
Abstract
Amphetamine analogues are produced by plants in the genus Ephedra and by khat (Catha edulis), and include the widely used decongestants and appetite suppressants (1S,2S)-pseudoephedrine and (1R,2S)-ephedrine. The production of these metabolites, which derive from L-phenylalanine, involves a multi-step pathway partially mapped out at the biochemical level using knowledge of benzoic acid metabolism established in other plants, and direct evidence using khat and Ephedra species as model systems. Despite the commercial importance of amphetamine-type alkaloids, only a single step in their biosynthesis has been elucidated at the molecular level. We have employed Illumina next-generation sequencing technology, paired with Trinity and Velvet-Oases assembly platforms, to establish data-mining frameworks for Ephedra sinica and khat plants. Sequence libraries representing a combined 200,000 unigenes were subjected to an annotation pipeline involving direct searches against public databases. Annotations included the assignment of Gene Ontology (GO) terms used to allocate unigenes to functional categories. As part of our functional genomics program aimed at novel gene discovery, the databases were mined for enzyme candidates putatively involved in alkaloid biosynthesis. Queries used for mining included enzymes with established roles in benzoic acid metabolism, as well as enzymes catalyzing reactions similar to those predicted for amphetamine alkaloid metabolism. Gene candidates were evaluated based on phylogenetic relationships, FPKM-based expression data, and mechanistic considerations. Establishment of expansive sequence resources is a critical step toward pathway characterization, a goal with both academic and industrial implications.
Year: 2015 PMID: 25806807 PMCID: PMC4373857 DOI: 10.1371/journal.pone.0119701
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Proposed biosynthetic routes leading from L-phenylalanine to amphetamine-type alkaloids in khat and Ephedra sinica.
A CoA-independent, non-β-oxidative pathway of L-phenylalanine side chain-shortening is shown in blue, whereas a CoA-dependent, β-oxidative route is shown in purple. Benzaldehyde, benzoic acid and/or benzoyl-CoA undergo condensation with pyruvate, a reaction putatively catalyzed by a ThDP-dependent carboligase. 1-Phenylpropane-1,2-dione undergoes transamination to yield (S)-cathinone, which is reduced to cathine and (1R,2S)-norephedrine. N-Methylation is restricted to Ephedra spp. and does not occur in khat. Activity has been detected for enzymes highlighted in yellow, and corresponding genes are available for enzymes highlighted in green. Enzymes highlighted in red have not been isolated, although database mining revealed numerous potential candidates (Tables 2 and 3). Abbreviations: CoA, coenzyme A; NAD(H), nicotinamide adenine dinucleotide; NADP(H), nicotinamide adenine dinucleotide phosphate; PAL, phenylalanine ammonia lyase; ThDP, thiamine diphosphate.
Khat unigenes representing enzymes putatively involved in ephedrine alkaloid biosynthesis.
| Identifier | Database ID | Candidate reaction | Identity | Query | Accession number |
|---|---|---|---|---|---|
| CePAL1–1 | comp191_c0_seq1 | PAL | 592/729 (81%) | AtPAL1 | P35510.3 |
| CePAL1–2 | comp2625_c0_seq1 | PAL | 491/513 (95%) | AtPAL1 | P35510.3 |
| Ce4CL1–1 | comp2198_c0_seq1 | 4CL | 414/563 (73%) | At4CL1 | Q42524.1 |
| Ce4CL1–2 | comp1320_c0_seq1 | 4CL | 396/564 (70%) | At4CL1 | Q42524.1 |
| Ce4CL1–3 | comp24405_c0_seq1 | 4CL | 343/573 (59%) | At4CL1 | Q42524.1 |
| CeBDH1–1 | comp9812_c0_seq2 | BDH | 710/1367 (51%) | AtAAO4 | NP_563711 |
| CeBDH2–1 | comp1221_c0_seq1 | BDH | 423/537 (78%) | AmBALDH | ACM89738.1 |
| CeBDH2–2 | comp15067_c0_seq1 | BDH | 360/535 (67%) | AmBALDH | ACM89738.1 |
| CeBDH2–3 | comp6403_c0_seq2 | BDH | 276/534 (51%) | AmBALDH | ACM89738.1 |
| CeCHD1–1 | comp2279_c1_seq1 | CHD | 579/724 (79%) | PhCHD | JX142126.1 |
| CeCHD1–2 | comp1276_c0_seq1 | CHD | 423/728 (58%) | PhCHD | JX142126.1 |
| CeKAT1–1 | comp4442_c0_seq2 | KAT | 368/464 (79%) | PhKAT1 | ACV70032.1 |
| CeKAT1–2 | comp3066_c1_seq1 | KAT | 363/463 (78%) | PhKAT1 | ACV70032.1 |
| CeBL1–1 | comp2535_c0_seq1 | BL | 347/581 (59%) | AtBZO1 | NP_176763.1 |
| CeBL1–2 | comp13850_c0_seq1 | BL | 273/585 (46%) | AtBZO1 | NP_176763.1 |
| CeBL1–3 | comp18626_c0_seq1 | BL | 268/625 (42%) | AtBZO1 | NP_176763.1 |
| CeBL1–4 | comp10832_c0_seq1 | BL | 273/583 (46%) | AtBZO1 | NP_176763.1 |
| CeBL1–5 | comp3351_c0_seq1 | BL | 271/584 (46%) | AtBZO1 | NP_176763.1 |
| CeBL1–6 | comp5650_c0_seq1 | BL | 262/591 (44%) | AtBZO1 | NP_176763.1 |
| CeThDPC1–1 | comp2318_c0_seq1 | ThDPC | 513/678 (75%) | AtAHAS | ABJ80681.1 |
| CeThDPC2–1 | comp2120_c0_seq1 | ThDPC | 506/608 (83%) | AtPDC2 | NP_200307.1 |
| CeThDPC2–2 | Manual Assembly | ThDPC | 468/608 (76%) | AtPDC2 | NP_200307.1 |
| CeTA1–1 | comp1545_c0_seq1 | TA | 368/480 (76%) | PhPPA-AT | E9L7A5.1 |
| CeTA1–2 | comp1191_c0_seq1 | TA | 128/490 (26%) | PhPPA-AT | E9L7A5.1 |
| CeTA1–3 | comp29507_c0_seq1 | TA | 120/493 (24%) | PhPPA-AT | E9L7A5.1 |
| CeTA2–1 | comp7370_c0_seq1 | TA | 328/419 (78%) | CmArAT1 | ADC45389.1 |
| CeTA2–2 | comp25340_c0_seq1 | TA | 253/427 (59%) | CmArAT1 | ADC45389.1 |
| CeRED1–1 | comp5248_c0_seq1 | RED | 155/273 (56%) | DsTRI | AAA33281.1 |
| CeRED1–2 | comp6446_c0_seq1 | RED | 155/280 (55%) | DsTRI | AAA33281.1 |
| CeRED1–3 | comp24582_c0_seq1 | RED | 148/273 (54%) | DsTRI | AAA33281.1 |
| CeRED1–4 | comp24893_c0_seq1 | RED | 147/278 (52%) | DsTRI | AAA33281.1 |
| CeRED1–5 | comp601_c1_seq1 | RED | 143/284 (50%) | DsTRI | AAA33281.1 |
| CeRED1–6 | comp2523_c1_seq1 | RED | 137/273 (50%) | DsTRI | AAA33281.1 |
| CeRED1–7 | Manual Assembly | RED | 140/276 (50%) | DsTRI | AAA33281.1 |
| CeRED2–1 | comp1063_c0_seq1 | RED | 179/324 (55%) | PsCOR1 | AAF13739.1 |
| CeRED2–2 | comp7109_c0_seq1 | RED | 180/382 (47%) | PsCOR1 | AAF13739.1 |
| CeRED2–3 | comp7266_c0_seq1 | RED | 148/323 (45%) | PsCOR1 | AAF13739.1 |
| CeRED2–4 | comp33459_c0_seq1 | RED | 130/327 (39%) | PsCOR1 | AAF13739.1 |
| CeRED2–5 | comp3071_c2_seq1 | RED | 123/328 (37%) | PsCOR1 | AAF13739.1 |
| CeRED2–6 | comp1646_c0_seq1 | RED | 118/331 (35%) | PsCOR1 | AAF13739.1 |
| CeRED3–1 | comp673_c0_seq1 | RED | 176/274 (64%) | EcSanR | ADE41047.1 |
| CeRED3–2 | comp4872_c0_seq1 | RED | 86/372 (23%) | EcSanR | ADE41047.1 |
Each unigene is assigned an identifier, which corresponds to a database ID in the CED-Trinity library. Percent amino acid identity between unigenes and queries is provided. Abbreviations: AAO4, aromatic aldehyde oxidase 4; Am, Antirrhinum majus; AHAS, acetohydroxyacid synthase; ArAT, aromatic amino acid transaminase; At, Arabidopsis thaliana; BALDH, benzaldehyde dehydrogenase; BDH, benzaldehyde dehydrogenase; BL, benzoate CoA-ligase; BZO, benzoyloxyglucosinolate; CHD, cinnamoyl-CoA hydratase-dehydrogenase; 4CL, 4-coumaroyl-CoA ligase; Ce, Catha edulis; Cm, Cucumis melo; COR, codeinone reductase; Ds, Datura stramonium; Ec, Eschscholzia californica; Es, Ephedra sinica; KAT, 3-ketoacyl-CoA thiolase; Ps, Papaver somniferum; PAL, L-phenylalanine ammonia lyase; PDC, pyruvate decarboxylase; Ph, Petunia x hybrida; PPA-AT, prephenate aminotransferase; RED, reductase; SanR, sanguinarine reductase; TA, transaminase; ThDPC, thiamin diphosphate-dependent carboligase; TR, tropinone reductase.
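The Identity column pairs a count of identical residues with an alignment length, followed by a percentage. As a minimal sketch, the Python snippet below parses such a cell and recomputes the percentage. The function names `parse_identity` and `truncated_percent` are hypothetical and are not part of the authors' pipeline. The tabulated percentages appear to be truncated to whole numbers rather than rounded (for example, 491/513 is listed as 95%), so the sketch uses integer division; this convention is inferred from the table and is not stated in the source.

```python
import re

def parse_identity(cell):
    """Parse an Identity cell such as '592/729 (81%)'.

    Returns a tuple (identical_residues, aligned_length, reported_percent).
    """
    m = re.fullmatch(r"\s*(\d+)/(\d+)\s*\((\d+)%\)\s*", cell)
    if m is None:
        raise ValueError(f"unrecognized identity cell: {cell!r}")
    identical, length, reported = (int(g) for g in m.groups())
    return identical, length, reported

def truncated_percent(identical, length):
    """Percent identity truncated to a whole number (assumed convention)."""
    if length <= 0:
        raise ValueError("aligned length must be positive")
    return (100 * identical) // length

# Three cells copied from the khat table above.
rows = {
    "CePAL1–1": "592/729 (81%)",
    "CePAL1–2": "491/513 (95%)",
    "CeTA1–3": "120/493 (24%)",
}

for name, cell in rows.items():
    identical, length, reported = parse_identity(cell)
    recomputed = truncated_percent(identical, length)
    print(name, recomputed, "matches table" if recomputed == reported else "differs")
```

A check of this kind is mainly useful for catching transcription errors when the table is re-keyed or merged with other candidate lists.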
Ephedra sinica unigenes representing enzymes putatively involved in ephedrine alkaloid biosynthesis.
| Identifier | Database ID | Candidate reaction | Identity | Query | Accession number |
|---|---|---|---|---|---|
| EsPAL1–1 | Contig5 | PAL | 718/723 (99%) | EsPAL1 | AB300199.1 |
| EsPAL1–2 | Contig20298 | PAL | 445/728 (61%) | EsPAL1 | AB300199.1 |
| Es4CL1–1 | Contig18937 | 4CL | 368/545 (67%) | Pt4CL | AAB42382 |
| Es4CL1–2 | Contig547 | 4CL | 350/554 (63%) | Pt4CL | AAB42382 |
| Es4CL1–3 | Contig5701 | 4CL | 353/551 (64%) | Pt4CL | AAB42382 |
| EsBDH1–1 | Singlet106372 | BDH | 657/1444 (45%) | AtAAO4 | NP_563711 |
| EsBDH1–2 | Singlet88157 | BDH | 332/1346 (24%) | AtAAO4 | NP_563711 |
| EsBDH1–3 | Contig2002 | BDH | 427/1423 (30%) | AtAAO4 | NP_563711 |
| EsBDH2–1 | Contig9169 | BDH | 431/536 (80%) | AmBALDH | ACM89738.1 |
| EsBDH2–2 | Singlet9479 | BDH | 396/541 (73%) | AmBALDH | ACM89738.1 |
| EsBDH2–3 | Contig9201 | BDH | 396/545 (72%) | AmBALDH | ACM89738.1 |
| EsBDH2–4 | Contig12745 | BDH | 292/535 (54%) | AmBALDH | ACM89738.1 |
| EsBDH2–5 | Contig22940 | BDH | 290/537 (54%) | AmBALDH | ACM89738.1 |
| EsCHD1–1 | Contig833 | CHD | 351/724 (48%) | PhCHD | JX142126.1 |
| EsCHD1–2 | Singlet85794 | CHD | 427/731 (58%) | PhCHD | JX142126.1 |
| EsCHD1–3 | Singlet68699 | CHD | 376/725 (51%) | PhCHD | JX142126.1 |
| EsKAT1–1 | Contig6801 | KAT | 365/463 (78%) | PhKAT1 | ACV70032.1 |
| EsKAT1–2 | Contig24534 | KAT | 344/463 (74%) | PhKAT1 | ACV70032.1 |
| EsKAT1–3 | Contig31090 | KAT | 329/466 (70%) | PhKAT1 | ACV70032.1 |
| EsKAT1–4 | Contig2319 | KAT | 307/463 (66%) | PhKAT1 | ACV70032.1 |
| EsKAT1–5 | Singlet7228 | KAT | 298/462 (64%) | PhKAT1 | ACV70032.1 |
| EsBL1–1 | Contig433 | BL | 312/603 (51%) | AtBZO1 | NP_176763.1 |
| EsBL1–2 | Contig10250 | BL | 305/593 (51%) | AtBZO1 | NP_176763.1 |
| EsBL1–3 | Singlet7007 | BL | 273/582 (46%) | AtBZO1 | NP_176763.1 |
| EsBL1–4 | Contig33805 | BL | 281/586 (47%) | AtBZO1 | NP_176763.1 |
| EsBL1–5 | Contig6255 | BL | 265/584 (45%) | AtBZO1 | NP_176763.1 |
| EsBL1–6 | Singlet85705 | BL | 246/593 (41%) | AtBZO1 | NP_176763.1 |
| EsThDPC1–1 | Contig5434 | ThDPC | 457/671 (68%) | AtAHAS | ABJ80681.1 |
| EsThDPC1–2 | Contig23037 | ThDPC | 413/672 (61%) | AtAHAS | ABJ80681.1 |
| EsThDPC2–1 | Contig35903 | ThDPC | 454/610 (74%) | AtPDC2 | NP_200307.1 |
| EsThDPC2–2 | Contig5589 | ThDPC | 250/608 (41%) | AtPDC2 | NP_200307.1 |
| EsTA1–1 | Contig4104 | TA | 309/485 (63%) | PhPPA-AT | E9L7A5.1 |
| EsTA1–2 | Singlet66014 | TA | 237/486 (48%) | PhPPA-AT | E9L7A5.1 |
| EsTA2–1 | Contig13244 | TA | 187/416 (44%) | CmArAT1 | ADC45389.1 |
| EsTA2–2 | Singlet18529 | TA | 175/413 (42%) | CmArAT1 | ADC45389.1 |
| EsTA2–3 | Singlet87150 | TA | 178/416 (42%) | CmArAT1 | ADC45389.1 |
| EsRED1–1 | Contig36780 | RED | 90/274 (32%) | DsTRI | AAA33281.1 |
| EsRED1–2 | Contig143 | RED | 82/293 (27%) | DsTRI | AAA33281.1 |
| EsRED1–3 | Contig14099 | RED | 73/313 (23%) | DsTRI | AAA33281.1 |
| EsRED1–4 | Contig19440 | RED | 82/278 (29%) | DsTRI | AAA33281.1 |
| EsRED1–5 | Singlet71271 | RED | 83/288 (28%) | DsTRI | AAA33281.1 |
| EsRED1–6 | Contig16252 | RED | 78/292 (26%) | DsTRI | AAA33281.1 |
| EsRED2–1 | Contig30213 | RED | 66/261 (25%) | DsTRII | AAA33282 |
| EsRED2–2 | Singlet88092 | RED | 76/266 (28%) | DsTRII | AAA33282 |
| EsRED2–3 | Singlet14224 | RED | 49/262 (18%) | DsTRII | AAA33282 |
| EsRED3–1 | Singlet88290 | RED | 132/327 (40%) | PsCOR1 | AAF13739.1 |
| EsRED3–2 | Singlet12882 | RED | 133/325 (40%) | PsCOR1 | AAF13739.1 |
| EsRED3–3 | Singlet16920 | RED | 122/341 (35%) | PsCOR1 | AAF13739.1 |
| EsRED3–4 | Contig35277 | RED | 124/331 (37%) | PsCOR1 | AAF13739.1 |
| EsRED3–5 | Contig20961 | RED | 120/357 (33%) | PsCOR1 | AAF13739.1 |
| EsRED4–1 | Contig37733 | RED | 165/274 (60%) | EcSanR | ADE41047.1 |
| EsNMT1–1 | Singlet3659 | NMT | 152/360 (42%) | PsTNMT | AAY79177 |
| EsNMT2–1 | Singlet112119 | NMT | 354/497 (71%) | SlPEAMT | AAG59894 |
| EsNMT2–2 | Contig24415 | NMT | 312/496 (62%) | SlPEAMT | AAG59894 |
| EsNMT3–1 | Contig17099 | NMT | 150/395 (37%) | CaCS | BAC75663 |
| EsNMT3–2 | Contig29630 | NMT | 120/406 (29%) | CaCS | BAC75663 |
| EsNMT3–3 | Contig17536 | NMT | 131/397 (32%) | CaCS | BAC75663 |
| EsNMT3–4 | Contig1597 | NMT | 118/401 (29%) | CaCS | BAC75663 |
| EsNMT4–1 | Contig29277 | NMT | 186/341 (54%) | AbPMT | BAA82264 |
| EsNMT4–2 | Contig12525 | NMT | 176/354 (49%) | AbPMT | BAA82264 |
| EsNMT5–1 | Contig6426 | NMT | 242/689 (35%) | AtSUVH | NP_196113 |
| EsNMT5–2 | Contig4615 | NMT | 228/715 (31%) | AtSUVH | NP_196113 |
| EsNMT5–3 | Contig8760 | NMT | 227/680 (33%) | AtSUVH | NP_196113 |
| EsNMT6–1 | Contig20402 | NMT | 330/563 (58%) | AtPRMT | NP_199713 |
| EsNMT6–2 | Contig29347 | NMT | 127/530 (23%) | AtPRMT | NP_199713 |
| EsNMT6–3 | Contig36061 | NMT | 117/548 (21%) | AtPRMT | NP_199713 |
| EsNMT6–4 | Singlet15333 | NMT | 113/561 (20%) | AtPRMT | NP_199713 |
| EsNMT6–5 | Singlet13096 | NMT | 124/661 (18%) | AtPRMT | NP_199713 |
Each unigene is assigned an identifier, which corresponds to a database ID in the ESI-Velvet library. Percent amino acid identity between unigenes and queries is provided. Abbreviations: AAO4, aromatic aldehyde oxidase 4; Ab, Atropa belladonna; Am, Antirrhinum majus; AHAS, acetohydroxyacid synthase; ArAT, aromatic amino acid transaminase; At, Arabidopsis thaliana; BALDH, benzaldehyde dehydrogenase; BDH, benzaldehyde dehydrogenase; BL, benzoate CoA-ligase; BZO, benzoyloxyglucosinolate; Ca, Coffea arabica; Ce, Catha edulis; CHD, cinnamoyl-CoA hydratase-dehydrogenase; 4CL, 4-coumaroyl-CoA ligase; Cm, Cucumis melo; CS, caffeine synthase; COR, codeinone reductase; Ds, Datura stramonium; Ec, Eschscholzia californica; Es, Ephedra sinica; KAT, 3-ketoacyl-CoA thiolase; NMT, N-methyltransferase; PEAMT, phosphoethanolamine N-methyltransferase; PMT, putrescine N-methyltransferase; Ps, Papaver somniferum; PAL, L-phenylalanine ammonia lyase; PDC, pyruvate decarboxylase; Ph, Petunia x hybrida; PPA-AT, prephenate aminotransferase; PRMT, protein arginine N-methyltransferase; Pt, Pinus taeda; RED, reductase; SanR, sanguinarine reductase; Sl, Solanum lycopersicum; SUVH, histone lysine N-methyltransferase, H3K9-specific; TA, transaminase; ThDPC, thiamin diphosphate-dependent carboligase; TNMT, (S)-tetrahydroprotoberberine N-methyltransferase; TR, tropinone reductase.
Summary of the construction and assembly for three Illumina NGS libraries.
| Abbrev. | Plant | Tissue | SRA accession number | No. of raw reads | No. of cleaned reads | Average transcript read depth (reads/bp) | Unigenes | Predicted no. of full-length CDS |
|---|---|---|---|---|---|---|---|---|
| CED-Trinity | Catha edulis | Young leaf | SRX485764 | 272,586,558 | 215,532,092 | 119.0 | 77,290 | 38,322 |
| ESI-Trinity | Ephedra sinica | Shoot tip | SRX485643 | 191,352,154 | 164,533,710 | 121.3 | 63,344 | 18,342 |
| ESI-Velvet | Ephedra sinica | Shoot tip | SRX485643 | 191,352,154 | 164,533,710 | 62.9 | 59,448 | 41,671 |
Abbreviations: CED, Catha edulis; ESI, Ephedra sinica; SRA, short-read archive; CDS, coding sequence.
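The library summary lends itself to two quick consistency checks: the share of raw reads that survived cleaning, and the combined unigene count quoted in the abstract as roughly 200,000. The following Python sketch recomputes both from the tabulated values. The dictionary layout and the helper name `retained_fraction` are illustrative assumptions, not part of the authors' workflow.

```python
# Values copied from the library summary table; the structure is illustrative.
libraries = {
    "CED-Trinity": {"raw": 272_586_558, "cleaned": 215_532_092, "unigenes": 77_290},
    "ESI-Trinity": {"raw": 191_352_154, "cleaned": 164_533_710, "unigenes": 63_344},
    "ESI-Velvet": {"raw": 191_352_154, "cleaned": 164_533_710, "unigenes": 59_448},
}

def retained_fraction(library):
    """Fraction of raw reads remaining after read cleaning."""
    return library["cleaned"] / library["raw"]

# Sum of unigenes across the three assemblies.
total_unigenes = sum(lib["unigenes"] for lib in libraries.values())

for name, lib in libraries.items():
    print(f"{name}: {retained_fraction(lib):.1%} of raw reads retained")
print(f"combined unigenes: {total_unigenes:,}")
```

Note that ESI-Trinity and ESI-Velvet are two assemblies of the same read set (SRX485643), so their unigenes overlap in origin even though they are counted separately in the combined total.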
Fig 2. Functional category analysis based on Gene Ontology (GO) annotations of CED-Trinity (upper panel) and ESI-Trinity (lower panel) unigenes.
Results for ESI-Velvet are found in S2 Fig.
Fig 3. Phylogenetic analysis of gene candidates.
Candidates are shown for benzaldehyde dehydrogenase (BDH) (A), thiamin diphosphate-dependent carboligase (ThDPC) (B) and transaminase (TA) (C). Similar analyses for the remaining candidates are found in S3 Fig. Sequences were aligned and analyzed for phylogenetic relationships using the neighbor-joining algorithm. Numbers at each node represent bootstrap values calculated using 1000 iterations. Accession numbers are found in the Experimental section, and abbreviations are defined in Tables 2 and 3.
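Bootstrap values of the kind reported at each node come from resampling alignment columns with replacement and asking how often a grouping recurs. The sketch below illustrates only that resampling idea in plain Python. It does not build neighbor-joining trees: as a simplification, it counts how often two sequences remain mutual nearest neighbours by p-distance. The toy sequences and all function names are hypothetical and unrelated to the candidates in the figure.

```python
import random

def p_distance(a, b):
    """Fraction of compared sites that differ, skipping sites with a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

def bootstrap_columns(alignment, rng):
    """Build one replicate by sampling alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}

def pair_support(alignment, a, b, replicates=1000, seed=0):
    """Share of replicates in which a and b are closer to each other
    than either is to any other sequence (a stand-in for node support)."""
    rng = random.Random(seed)
    others = [name for name in alignment if name not in (a, b)]
    hits = 0
    for _ in range(replicates):
        rep = bootstrap_columns(alignment, rng)
        d_ab = p_distance(rep[a], rep[b])
        if all(d_ab < p_distance(rep[a], rep[o]) and
               d_ab < p_distance(rep[b], rep[o]) for o in others):
            hits += 1
    return hits / replicates

# Hypothetical, pre-aligned toy sequences of equal length.
toy = {
    "seqA": "MKTAYIAKQR",
    "seqB": "MKTAYIAKQK",
    "seqC": "MRSAYLGKHR",
}

print(f"support for (seqA, seqB): {pair_support(toy, 'seqA', 'seqB'):.2f}")
```

In a full analysis, each replicate alignment would instead be passed to a tree-building step, and support would be tallied per clade of the reference tree.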
Fig 4. Relative expression of gene candidates identified in khat (CED-Trinity).
FPKM (fragments mapped per kilobase of exon per million reads mapped) is a normalizing statistic measuring gene expression while accounting for variation in gene length [23]. Abbreviations are defined in Tables 2 and 3.
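As a worked illustration of the normalization, the conventional FPKM calculation divides the fragment count for a transcript by its length in kilobases and by the total number of mapped fragments in millions. The Python sketch below implements that conventional formula. The function name `fpkm` and the example numbers are hypothetical, and the authors' exact software settings are not given in this record.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    fragments: fragments mapped to this transcript
    transcript_length_bp: transcript (exon) length in base pairs
    total_mapped_fragments: all mapped fragments in the library
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)

# Hypothetical example: the same fragment count on a transcript twice as
# long gives half the FPKM, which is the length correction at work.
short_tx = fpkm(500, 2_000, 10_000_000)
long_tx = fpkm(500, 4_000, 10_000_000)
print(short_tx, long_tx)
```

Because the value is scaled by both transcript length and library size, candidates of different lengths can be compared within a library, although comparisons across the khat and Ephedra libraries should still be read with caution.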
Fig 5. Relative expression of gene candidates identified in Ephedra sinica (ESI-Trinity).
FPKM (fragments mapped per kilobase of exon per million reads mapped) is a normalizing statistic measuring gene expression while accounting for variation in gene length [23]. Abbreviations are defined in Tables 2 and 3.