Ambarish Nag, Tatiana V Karpinets, Christopher H Chang, Maor Bar-Peled.
Abstract
Understanding how cellular metabolism works and is regulated requires that the underlying biochemical pathways be adequately represented and integrated with large metabolomic data sets to establish a robust network model. Genetically engineering energy crops to be less recalcitrant to saccharification requires detailed knowledge of plant polysaccharide structures and a thorough understanding of the metabolic pathways involved in forming and regulating cell-wall synthesis. Nucleotide-sugars are the building blocks for the synthesis of cell-wall polysaccharides. The biosynthesis of nucleotide-sugars is catalyzed by a multitude of enzymes that reside in different subcellular organelles, and precise representation of these pathways requires accurate capture of this biological compartmentalization. The lack of simple localization cues in genomic sequence data and annotations, however, leads to missing compartmentalization information for eukaryotes in automatically generated databases, such as the Pathway-Genome Databases (PGDBs) of the SRI Pathway Tools software that drives much biochemical knowledge representation on the internet. In this report, we provide an informal mechanism using the existing Pathway Tools framework to integrate protein and metabolite sub-cellular localization data with the existing representation of the nucleotide-sugar metabolic pathways in a prototype PGDB for Populus trichocarpa. The enhanced pathway representations have been successfully used to map SNP abundance data to individual nucleotide-sugar biosynthetic genes in the PGDB. The manually curated pathway representations are more conducive to the construction of a computational platform that will allow the simulation of natural and engineered nucleotide-sugar precursor fluxes into specific recalcitrant polysaccharide(s).
Database URL: The curated Populus PGDB is available in the BESC public portal at http://cricket.ornl.gov/cgi-bin/beocyc_home.cgi and the nucleotide-sugar biosynthetic pathways can be directly accessed at http://cricket.ornl.gov:1555/PTR/new-image?object=SUGAR-NUCLEOTIDES.
Year: 2012 PMID: 22465851 PMCID: PMC3316911 DOI: 10.1093/database/bas013
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Schematic representation of the sub-cellular localization and catalytic domain orientation of cytosolic and Golgi Arabidopsis UDP-xylose synthase enzyme isoforms.
Table 1. Comparison of sub-cellular localization prediction software used to infer the sub-cellular localization of Populus enzymes involved in nucleotide-sugar biosynthesis
| Program | Prediction Method | Prediction Scope/Accuracy |
|---|---|---|
| WoLF PSORT | Weighted k-nearest neighbor classifier | Predicts localization to 10 sub-cellular sites, including dual localization such as proteins that shuttle between the cytosol and nucleus; 70% accuracy for nucleus, mitochondria, cytosol, plasma membrane, extracellular and chloroplast; less accurate for peroxisome and Golgi |
| SubLoc | Support Vector Machine (SVM) | 91.4% accuracy for three sub-cellular locations (cytoplasmic, periplasmic, extracellular) in prokaryotic organisms and 79.4% accuracy for four sub-cellular locations (cytoplasmic, extracellular, mitochondrial, nuclear) in eukaryotic organisms |
| PredoTar | Neural networks | Predicts sub-cellular localization of proteins to ER, mitochondria and plastids from their characteristic N-terminal targeting sequences |
| MultiLoc2 | Support Vector Machine + phylogenetic profiles + Gene Ontology terms | The high-resolution version of MultiLoc2 can predict localization to 11 eukaryotic sub-cellular locations: nucleus, cytoplasm, mitochondria, chloroplast, extracellular, plasma membrane, peroxisome, ER, Golgi apparatus, lysosome and vacuole. Accuracy: 89.2% for animal proteins, 89.2% for fungal proteins and 89.4% for plant proteins |
| PredictNLS | Identification of protein sequence motifs in a carefully curated NLS (nuclear localization signal) database | Predicts nuclear localization with close to 100% accuracy but low coverage (43%) |
| MITOPRED | Identification of Pfam domain occurrence patterns and the amino acid compositional differences between mitochondrial and non-mitochondrial proteins | Predicts mitochondrial versus non-mitochondrial localization of proteins. Depending on the allowed proportions of true positives and true negatives to total positives and total negatives respectively, accuracy can vary from 71% to 92% |
| CELLO | Two-level SVM + homology search | Predicts localization to 12 eukaryotic sub-cellular locations: nucleus, cytoplasm, cytoskeleton, mitochondria, chloroplast, extracellular, plasma membrane, peroxisome, ER, Golgi apparatus, lysosome and vacuole |
Table 2. List of curated nucleotide-sugar biosynthesis pathways in the prototype Populus PGDB and the primary metabolite(s) from which these pathways are initiated
| Primary source metabolite(s) | Pathway name |
|---|---|
| GDP | |
| GDP | |
| UDP | |
| UDP | |
| Sucrose, | |
| UDP | |
| UDP | |
| GDP | |
| UDP | |
| UDP | |
| UDP | |
| UDP | |
| Sucrose, | |
Figure 2. UDP pathway representation in PoplarCyc 1.0. Note that the pathway does not distinguish the cytosolic reaction with EC 4.1.1.35 from the corresponding reaction catalyzed by membrane-bound, Golgi-localized enzymes.
Table 3. Two-letter prefixes that identify the sub-cellular localization of metabolites
| Prefix | Cellular Compartment |
|---|---|
| CY | Cytosol |
| CS | Chloroplast stroma |
| GL | Golgi lumen |
| ER | ER |
| VC | Vacuole |
| NC | Nucleus |
| MT | Mitochondrion |
This lexicon of prefixes can be easily extended to include other cellular compartments. For example, we would use the three-letter prefix ERL to pinpoint localization to the ER lumen.
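The prefix lexicon in the table above maps naturally onto a lookup table. The following is a minimal sketch rather than the PGDB's actual implementation. It assumes, hypothetically, that a prefix is joined to the metabolite name with a hyphen (for example `GL-UDP-D-xylose`); the source does not state the separator, and the function names are illustrative only.

```python
# Compartment prefixes from the table above, plus the three-letter
# ERL extension mentioned in the text.
COMPARTMENT_PREFIXES = {
    "CY": "Cytosol",
    "CS": "Chloroplast stroma",
    "GL": "Golgi lumen",
    "ER": "ER",
    "VC": "Vacuole",
    "NC": "Nucleus",
    "MT": "Mitochondrion",
    "ERL": "ER lumen",
}


def split_localized_name(name, sep="-"):
    """Split a prefixed metabolite name into (compartment, metabolite).

    ASSUMPTION: prefix and metabolite name are joined by `sep`; the hyphen
    separator is hypothetical and not taken from the source.
    Returns (None, name) when the leading token is not a known prefix.
    """
    prefix, found, rest = name.partition(sep)
    if found and rest and prefix in COMPARTMENT_PREFIXES:
        return COMPARTMENT_PREFIXES[prefix], rest
    return None, name


def localize(metabolite, prefix, sep="-"):
    """Attach a known compartment prefix to a metabolite name."""
    if prefix not in COMPARTMENT_PREFIXES:
        raise KeyError(f"unknown compartment prefix: {prefix}")
    return f"{prefix}{sep}{metabolite}"
```

Extending the lexicon to another compartment then amounts to adding one dictionary entry, which mirrors how the text introduces `ERL`.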
Figure 3. The enhanced representation of the UDP pathway in the prototype P. trichocarpa PGDB. In this representation the sub-cellular localization of each metabolite in the pathway is specified by a two-letter prefix (Table 3). Note that Figure 2 shows 12 genes annotated as UDP-glucose dehydrogenases and 9 genes annotated as UDP-glucuronate decarboxylases. Manual curation yields only four UDP-glucose dehydrogenase enzymes and six UDP-glucuronate decarboxylase enzymes in Populus. Also note that genes encoding Golgi-localized, membrane-bound UDP-glucuronic acid decarboxylase (UXS) can now be distinguished from those encoding the same enzyme activity in the cytosol.
Figure 4. (a) UDP pathway representation in PoplarCyc 1.0. Note that no information is available to evaluate the source of UDP-glucuronate and how it becomes available to the UDP-glucuronate 4-epimerase. (b) UDP pathway representation in AraCyc 6.0.1. Note the missing EC number for the conversion of galacturonate-1-P to UDP-galacturonate. For both parts (a) and (b), no information is available about where these processes occur in the cell.
Figure 5. The enhanced representation of the UDP pathway starting from UDP-d-glucuronate in the prototype Populus PGDB. The two metabolites in this pathway are linked by green arrows to two other pathways, UDP and UDP.
Figure 6. The enhanced representation of UDP starting from d-galacturonate in the prototype Populus PGDB.
Figure 7. Schematic representation of the association of multiple Cellular Component Gene Ontology terms with two differentially oriented Golgi membrane enzymes.
Figure 9. Nucleotide-sugar biosynthetic genes expressed in xylem tissue from 20 different Populus trees with different wood properties. Log2-transformed SNP abundance data for individual Populus nucleotide-sugar biosynthetic genes are overlaid on the curated nucleotide-sugar biosynthetic pathways in the prototype Populus PGDB. Genes that exhibit at least one SNP are considered expressed, whereas the status of genes lacking SNPs cannot be ascertained. UDP- (from UDP-D-xylose, catalyzed by UDP-arabinose 4-epimerase) is currently represented in the PGDB as occurring in the cytosol and in two different organelles, the ER and the Golgi, based on sequence analyses of UDP-arabinose 4-epimerase isoforms. The localization of any UDP-arabinose 4-epimerase isoform to the ER has yet to be validated by experimental data.
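The expression rule stated in the Figure 9 caption can be written as a small classifier. This is a minimal sketch under stated assumptions: the function name, the locus tags and the SNP counts are hypothetical and are not taken from the study's data.

```python
def expression_status(snp_count):
    """Classify a gene by its SNP count, following the Figure 9 rule.

    At least one SNP -> "expressed". Zero SNPs -> "undetermined", because
    the absence of SNPs does not show that a gene is unexpressed.
    """
    if snp_count < 0:
        raise ValueError("SNP count cannot be negative")
    return "expressed" if snp_count >= 1 else "undetermined"


# Hypothetical locus tags and SNP counts, for illustration only.
snp_counts = {"gene_A": 12, "gene_B": 0, "gene_C": 1}
status = {gene: expression_status(n) for gene, n in snp_counts.items()}
```

Returning "undetermined" rather than "not expressed" keeps the classifier consistent with the caption, which declines to draw a conclusion for genes without SNPs.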
Figure 10. Overlay of log2-transformed SNP abundance for genes in the UDP- pathway. Individual reaction representations are highlighted using the color code at the top left corner. The numbers in the color code define the ranges of the log2(SNP abundance) values of the genes such that red, blue and green correspond to genes with high, medium and low SNP values, respectively. The reaction arrows for UDP-d-glucuronate formation, and for both the cytosolic and intra-Golgi conversions of UDP-d-glucuronate to UDP-d-xylose, are highlighted in red because all the corresponding genes have high log2(SNP abundance). The color-coded log2(SNP abundance) values for the transcriptome genes encoding the enzymes that catalyze each reaction can be displayed using pop-up bar diagrams, shown as inset figures. The absolute log2(SNP abundance) values for the genes that explicitly appear on the pathway diagram are shown next to the corresponding gene locus tags and are color coded as well.
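The color coding described in the Figure 10 caption amounts to binning log2(SNP abundance) into three ranges. The sketch below illustrates that idea only. The cutoff values are hypothetical, since the caption refers to ranges shown in the figure's color key without stating them, and the function names are illustrative rather than part of Pathway Tools.

```python
import math


def log2_abundance(snp_abundance):
    """Return log2 of a strictly positive SNP abundance value."""
    if snp_abundance <= 0:
        raise ValueError("log2 is undefined for non-positive abundance")
    return math.log2(snp_abundance)


def color_for(log2_value, low_cutoff, high_cutoff):
    """Map a log2(SNP abundance) value to an overlay color.

    Red = high, blue = medium, green = low, as in the caption.
    ASSUMPTION: the two cutoffs are supplied by the caller; the source
    does not state the numeric ranges.
    """
    if low_cutoff > high_cutoff:
        raise ValueError("low_cutoff must not exceed high_cutoff")
    if log2_value >= high_cutoff:
        return "red"
    if log2_value >= low_cutoff:
        return "blue"
    return "green"


# Hypothetical cutoffs and abundances, for illustration only.
colors = [color_for(log2_abundance(a), 1.0, 2.5) for a in (8, 4, 1)]
```

A reaction arrow would be drawn in red only when every gene assigned to that reaction falls in the high range, which is the condition the caption gives for the UDP-d-glucuronate reactions.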