Conor Jenkins, Benjamin Orsburn.
Abstract
Historic restrictions on the use of, and research on, Cannabis plants (generally classified as Cannabis sativa and Cannabis indica) have recently been relaxed. Research to date has centered on chemical analysis of plant flower products, namely the cannabinoids and various terpenes that directly contribute to the phenotypic characteristics of the female flowers. In addition, many groups have recently completed genetic profiles of various plants of commercial value. To date, no comprehensive attempt has been made to profile the proteomes of these plants. We report herein our progress on constructing a comprehensive draft map of the Cannabis proteome. To date we have identified over 17,000 potential protein sequences. Unfortunately, no annotated genome of Cannabis plants currently exists. We present a method by which "next generation" DNA sequencing output and shotgun proteomics data can be combined to produce annotated FASTA files, bypassing the need for annotated genetic information altogether in traditional proteomics workflows. The resulting material represents the first comprehensive annotated protein FASTA for any Cannabis plant. Using this annotated database as a reference, we can refine our protein identifications, resulting in the confident identification of 13,000 proteins with putative function. Furthermore, we demonstrate that post-translational modifications, particularly lysine acetylation and protein glycosylation, play an important role in the proteomes of Cannabis flower. To facilitate the evolution of analytical investigations into these plant materials, we have created a portal to host the resources developed from our proteomic and metabolomic analysis of Cannabis plant material, as well as our results integrating these resources.
Keywords: Cannabis; PTMs; Proteomics; proteogenomics
Year: 2020 PMID: 32024021 PMCID: PMC7037972 DOI: 10.3390/ijms21030965
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. An UpSetR graph showing the unique protein identifications associated with each genomic dataset used for the generation of the proteogenomic FASTA as well as the number of proteins shared between the datasets.
Figure 2. ShinyGO network analysis demonstrating the pathway differences between leaves (top panel) and mature female flowers (bottom panel).
Figure 3. Evidence of acetylation on TPS1. (A) A sequence map demonstrating 3 observed lysine acetylation modifications. (B) A fragment map showing 100% sequence coverage for one acetylation site. (C) MS/MS spectra matching the fragment map.
Figure 4. Example correlation analysis plot. (A) Radar diagram for 11-OH-THC, where the blue line represents all positive Pearson correlations and the orange line is the p-value for each measurement. (B) A plot overlaying the 11-OH-THC metabolite peaks and technical replicates. (C) A plot of the protein from (A) demonstrating the highest correlation with this metabolite.
Figure 5. Peptide and protein identification pipeline. A total of 2.8 million spectra obtained on a high-resolution mass spectrometer were searched against a search space consisting of a six-frame translation of three reference genomes, the cRAP FASTA, and a complete collection of all green plant proteins hosted by UniProt. The 17,000 SEQUEST identifications were then processed with the eggNOG-mapper program to annotate the identifications according to sequence orthology in the Viridiplantae database. The resulting FASTA could then be used to search the raw data for post-translational modifications using a variety of tools and to identify pathways correlating with the small molecule profile of the plant.
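The six-frame translation step named in the Figure 5 caption can be sketched as follows. This is a minimal illustration in plain Python, not the authors' pipeline: the function names (`reverse_complement`, `translate`, `six_frame_translation`) and the toy input sequence are assumptions made for the example, and the standard genetic code (NCBI translation table 1) is assumed.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
# Assumption: the standard table is appropriate for the nuclear genome being translated.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna: str) -> dict:
    """Translate three offsets on each strand; keys are '+1'..'+3' and '-1'..'-3'."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from any Cannabis genome.
    for frame, protein in six_frame_translation("ATGGCCTAA").items():
        print(frame, protein)
```

In a proteogenomic search space, each frame would typically be split at stop codons (`*`) and the resulting segments written out as FASTA entries alongside contaminant and reference protein sequences; that step is omitted here.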
An overview of the progress to date.
| Category of Data | Number in 2019 Upload |
|---|---|
| Proteins Sequenced | 17,269 |
| Proteins Annotated | 13,929 |
| Proteins with Homologous 3D Structures | 964 |
| Acetylation Sites Mapped | 584 |
| MS/MS Spectra Acquired | 1.40 × 10⁷ |
| MS/MS Spectra Searched | 2.40 × 10⁶ |
| MS/MS Spectra with Evidence of Glycosylation | 3.50 × 10⁵ |
| Skyline Spectral Library | 43,612 annotated spectra |
| Gene Coding Regions Annotated | 13,850 |
| Small Molecule Features Isolated | 1,050 |
| Small Molecules Identified | 535 |