Abstract
The availability of thousands of sequenced genomes has revealed the diversity of biochemical solutions to similar chemical problems. Even for molecules at the heart of metabolism, such as cofactors, the pathway enzymes first discovered in model organisms like Escherichia coli or Saccharomyces cerevisiae are often not universally conserved. Tetrahydrofolate (THF) (or its close relative tetrahydromethanopterin) is a universal and essential C1-carrier that most microbes and plants synthesize de novo. The THF biosynthesis pathway and enzymes are, however, not universal and alternate solutions are found for most steps, making this pathway a challenge to annotate automatically in many genomes. Comparing THF pathway reconstructions and functional annotations of a chosen set of folate synthesis genes in specific prokaryotes revealed the strengths and weaknesses of different microbial annotation platforms. This analysis revealed that most current platforms fail in metabolic reconstruction of variant pathways. However, all the pieces are in place to quickly correct these deficiencies if the different databases were built on each other's strengths.
Keywords: Metabolic reconstruction; Non-orthologous displacements; Paralogs
Year: 2014 PMID: 25210598 PMCID: PMC4151868 DOI: 10.1016/j.csbj.2014.05.008
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1 Known variations and paralogs in the THF pathway. Code: underlined, canonical enzymes; red, non-orthologous displacements; blue, alternate pathways; green, salvage; yellow box, unknown gene; orange box, paralogs not in THF pathway; and orange, paralogs in folate pathway. Enzyme names are given in Table 2. Abbreviations: DHN-TP, dihydroneopterin triphosphate; DHN-MP, dihydroneopterin monophosphate; cDHNP, 7,8-dihydro-d-neopterin 2′,3′-cyclic phosphate; DRP-P, 2,5-diamino-6-ribosylamino-4(3H)-pyrimidinone 5′-phosphate; HMDHP, 6-hydroxymethyldihydropterin; HMDHP-PP, 6-hydroxymethyldihydropterin diphosphate; pABA, p-aminobenzoate; ADC, aminodeoxychorismate; DHP, dihydropteroate; THP, tetrahydropteroate; DHF, dihydrofolate; THF, tetrahydrofolate; THF-(Glu)n, polyglutamylated THF; Mpt, methanopterin.
Integrative microbial databases analyzed.
| Database | Families | Reactions | Pathway reconstruction | Phenotype | Location |
|---|---|---|---|---|---|
| Uniprot/Unipathway | Yes/HAMAP | Yes | Only a subset of genomes | No | |
| IMG | No | Yes | Yes | Yes | |
| PATRIC | Yes/FigFam | No | Yes | No | |
| MicrobesOnline | No | No | Yes | No | |
| Microscope | Yes/Syntonome | Yes | Yes | No | |
| BioCyc | No | Yes | Yes | No | |
| KEGG | No | Yes | Yes | No | |
| CMR | Yes/TIGRFAM | No | Yes | Yes | |
Abbreviations: High-quality Automated and Manual Annotation of Proteins (HAMAP); Integrated Microbial Genomes (IMG); PAThogen Resource Integration Center (PATRIC); Kyoto Encyclopedia of Genes and Genomes (KEGG); Comprehensive Microbial Resource (CMR).
Families: a family definition through which annotations are transferred to the members of that family, not just a COG or Pfam membership. The name of each set of isofunctional families is listed.
Reactions: the chemical reaction is encoded in the database.
Phenotype: prediction of prototrophy or auxotrophy.
Fig. 2 Distribution of THF pathway variations. The percentage of each gene family was calculated based on a total number of 9327 bacteria that contained both a FolK and a FolB homolog and on 61 Archaea with an active THF or Mpt pathway. All the percentages in this figure, as well as those given throughout the text, were extracted from the “Folate biosynthesis” (http://pubseed.theseed.org/SubsysEditor.cgi?page=ShowSubsystem&subsystem=Folate_Biosynthesis.) and the “Early steps pterin biosynthesis Archaea” (http://pubseed.theseed.org/SubsysEditor.cgi?page=ShowSubsystem&subsystem=Early_Pterin_Biosynthesis_Steps_Archaea) SEED subsystems after downloading the corresponding Excel files, eliminating duplicate genomes, and sorting using Excel tools. Abbreviations are defined in Table 2.
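The percentage calculation described in the Fig. 2 legend (eliminate duplicate genomes, keep genomes that carry the required homologs, then count each family) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `family_percentages`, the row layout, and the toy genomes are hypothetical and do not reproduce the actual SEED export format or the Excel workflow used for the figure.

```python
from collections import Counter


def family_percentages(rows, required=frozenset()):
    """Percentage of genomes carrying each gene family.

    rows: iterable of (genome_name, iterable_of_family_names).
    required: families a genome must contain to be counted at all.
    (Hypothetical helper; the input layout is an assumption.)
    """
    unique = {}
    for genome, families in rows:
        # Duplicate elimination: keep only the first row seen per genome.
        unique.setdefault(genome, set(families))

    # Keep genomes that contain every required family.
    kept = [fams for fams in unique.values() if required <= fams]
    total = len(kept)
    if total == 0:
        return {}

    counts = Counter(fam for fams in kept for fam in fams)
    return {fam: 100.0 * n / total for fam, n in counts.items()}


# Toy data, not taken from the SEED subsystems.
rows = [
    ("genome_A", ["FolE", "FolB", "FolK"]),
    ("genome_A", ["FolE", "FolB", "FolK"]),   # duplicate genome, ignored
    ("genome_B", ["FolE2", "FolB", "FolK"]),
    ("genome_C", ["FolE", "FolK"]),           # lacks FolB, excluded
]
pct = family_percentages(rows, required=frozenset({"FolB", "FolK"}))
```

With the toy rows above, two genomes remain after filtering, so `pct["FolE"]` and `pct["FolE2"]` are each 50.0.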
Comparison of THF pathway reconstruction and THF gene annotations in different annotation databases.
| Correct Prediction | Year | Uniprot/Unipathway | IMG | PATRIC | MicrobesOnline | Microscope | BioCyc | KEGG | CMR |
|---|---|---|---|---|---|---|---|---|---|
| THF pathway correctly predicted | | | | | | | | | |
| THF pathway incorrectly predicted | | Ct | EcSaCbCtHs | CbCt | SaCbCtHs | CbCtHs | BtCtHs | CbCtHs | SaCbCtHs |
| FolE2 (SACOL0613/Q5HIA9) in Sa | 2006 | Yes | Yes | Yes | No | Yes | Yes | Yes | No |
| FolE2 (OE3673F/VNG1901c) in Hs | 2006 | Yes | Yes | NA | Yes | Yes | Yes | Yes | No |
| FolC2 (CT611/CTA0664/Q3KL84) in Ct | 2007 | No | No | Yes | No | No | No | No | No |
| QueD/PTPS-III (CBO0827/CLC_0882/A5I019) in Cb | 2009 | Only QueD | No | Yes | No | No | No | No/PTPS-II | No/PTPS-II |
| MptD (VNG0127C/Q9HSQ4) in Hs | 2012 | Yes | No | NA | No | No | No | No | |
| MptE (OE2919R/VNG1343C/B0R5E3) in Hs | 2012 | Yes | No | NA | No | No | No | No | No |
| CT610 (CTA0663/Q3KL85) in Ct | 2013 | No | No | No | No | No | No | No | No |
Abbreviations
Ct, Chlamydia trachomatis D/UW-3/CX (sv D) or C. trachomatis A/HAR-13 in BioCyc.
Cb, Clostridium botulinum A str. ATCC 3502 or Hall.
Ec, Escherichia coli K-12 MG1655.
Hs, Halobacterium salinarum R1 DSM 671 or Halobacterium species NRC-1.
Sa, Staphylococcus aureus subsp. aureus COL or Staphylococcus aureus subsp. aureus NCTC 8325.
The year when protein/gene characterization was first published.
Cb genome not in Unimap.
No Archaea in PATRIC.
The FolE2/MptA was correctly linked to the pathway in Hs.
Uniprot numbers were used because locus tags were erratic for BioCyc.
Some databases require the underscore version of the locus tag such as VNG_1901c.
But correct SEED annotation on gene page.
Cannot be evaluated as gene not called in H. salinarum R1 and Halobacterium NRC-1 not in BioCyc.
But correct GO term on gene page.
But correct reference on gene page.
No C. botulinum in Unipathway.
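The footnote on underscore locus tags (VNG_1901c rather than VNG1901c) suggests a simple normalization step before querying several databases with the same identifier. The sketch below is a hypothetical helper, assuming tags of the form "letters followed by digits and an optional suffix"; it is not part of any database's API, and not every database accepts the underscore form.

```python
import re

# Letters, an optional existing underscore, then digits plus any suffix.
_TAG = re.compile(r"^([A-Za-z]+)_?(\d+.*)$")


def underscore_locus_tag(tag):
    """Return the underscore form of a locus tag, e.g. VNG1901c -> VNG_1901c.

    Tags that do not match the assumed pattern are returned unchanged.
    """
    match = _TAG.match(tag.strip())
    if match is None:
        return tag
    prefix, rest = match.groups()
    return f"{prefix}_{rest}"


both_forms = [underscore_locus_tag(t) for t in ("VNG1901c", "VNG_1901c")]
```

Both spellings map to the same string, so a lookup can try the original tag first and fall back to the normalized one.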
Fig. 3 BioCyc pathway comparisons. (A) The “Enzymes and genes for 6-hydroxymethyl dihydropterin diphosphate biosynthesis II (Archaea)” pathway was opened in BioCyc (http://biocyc.org/), and the pathway comparison tool was used with S. aureus COL chosen in the genome list. (B) The same pathway was opened in BioCyc, and the pathway comparison tool was used with H. salinarum R1 chosen in the genome list.
Fig. 4 (A) Microscope pathway species comparisons. (B) Description of IMG pathway and linkage of reactions to genes.
IMG THF-related pathway assertions. None of the pathways were correctly predicted, including IMG-1005 and IMG-1006 in E. coli. Analysis was performed on IMG/JGI: (https://img.jgi.doe.gov/cgi-bin/w/main.cgi) by adding the five chosen genomes (abbreviations given in Table 1) and the selected IMG pathways (listed with IMG numbering) in the cart, and then conducting the IMG pathway distribution analysis in these genomes.
| IMG pathway | Ct | Cb | Ec | Hs | Sa |
|---|---|---|---|---|---|
| 1005 — 6-hydroxymethyl-dihydropterin diphosphate biosynthesis | a(0/5) | a(1/5) | a(2/5) | a(0/5) | a(1/5) |
| 1006 — Tetrahydrofolate biosynthesis | a(0/3) | a(1/3) | a(2/3) | a(1/3) | a(2/3) |
| 1032 — Folate precursors biosynthesis in Archaea | a(0/2) | a(0/2) | a(0/2) | a(1/2) | a(0/2) |
| 1034 — Tetrahydromonapterin biosynthesis | a(0/3) | a(0/3) | a(0/3) | a(0/3) | a(0/3) |
| 1037 — 6-hydroxymethyl-dihydropterin diphosphate biosynthesis | a(0/3) | a(1/3) | a(1/3) | a(0/3) | a(1/3) |
Assertion: a — absent or not asserted; p — present or asserted; u — unknown; N/A — no data available.
Evidence level (g/R): g — number of reactions with associated genes; R — total number of reactions in pathway.
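The cell notation used in the IMG table, an assertion letter followed by an evidence level (g/R), can be decoded mechanically. The following is a minimal sketch, assuming cells are written exactly as in the table above (for example `a(2/5)`) or as `N/A`; the names `ImgCell` and `parse_img_cell` are hypothetical and are not part of IMG.

```python
import re
from typing import NamedTuple, Optional


class ImgCell(NamedTuple):
    assertion: str   # meaning of the a/p/u code
    genes: int       # g: reactions with associated genes
    reactions: int   # R: total reactions in the pathway


_LABELS = {
    "a": "absent or not asserted",
    "p": "present or asserted",
    "u": "unknown",
}
_CELL = re.compile(r"^([apu])\((\d+)/(\d+)\)$")


def parse_img_cell(cell: str) -> Optional[ImgCell]:
    """Decode one table cell; returns None for 'N/A' (no data available)."""
    text = cell.strip()
    if text == "N/A":
        return None
    match = _CELL.match(text)
    if match is None:
        raise ValueError(f"unrecognized IMG cell: {cell!r}")
    code, g, r = match.groups()
    return ImgCell(_LABELS[code], int(g), int(r))


ec_1005 = parse_img_cell("a(2/5)")
```

Here `ec_1005` corresponds to the E. coli entry for pathway 1005: not asserted, with genes linked to 2 of the 5 reactions.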
Prediction of the state of the folate biosynthesis pathway for specific organisms in CMR.
| Organism | State | FolE (O) | FolB (R) | FolK (R) | FolP (R) | FolC (R) | FolA (R) |
|---|---|---|---|---|---|---|---|
| | Yes | N | Y | Y | Y | Y | Y |
| | Yes | Y | Y | Y | Y | Y | Y |
| | Some evidence | Y | N | Y | Y | Y | N |
| | Not supported | N | N | N | Y | Y | N |
| | Some evidence | N | Y | Y | Y | N | Y |
| Evidence | | TIGR00063 | TIGR00525 | TIGR01498 | TIGR01496 | TIGR01499 | PF00186 |
R — required, signature gene; O — not required, not signature gene.
Enzymes of the THF pathways.
| Abbreviation | Enzyme name | COG number |
|---|---|---|
| FolE | GTP cyclohydrolase I (EC 3.5.4.16) type 1 | COG0302 |
| FolE2 or MptA | GTP cyclohydrolase I (EC 3.5.4.16) type 2 | COG1469 |
| RibA | GTP cyclohydrolase II (EC 3.5.4.25) | COG0807 |
| TrpF | Phosphoribosylanthranilate isomerase (EC 5.3.1.24) | COG0135 |
| FolQ | Dihydroneopterin triphosphate pyrophosphatase | COG1051 |
| Nudix | Nudix hydrolase superfamily | COG1051 |
| P-ase | Dihydroneopterin monophosphate phosphatase | ? |
| FolB | Dihydroneopterin aldolase (EC 4.1.2.25) | COG1539 |
| PTPS-III | 6-Hydroxymethyldihydropterin synthase, PTPS-III type | COG0720 |
| PTPS-IV | 6-Hydroxymethyldihydropterin synthase, PTPS-VI type | COG0720 |
| QueD | 6-Carboxytetrahydropterin synthase (EC 4.1.2.50) | COG0720 |
| PTPS-II | 6-Pyruvoyl tetrahydrobiopterin synthase (EC 4.2.3.12) | |
| MptB | 7,8-Dihydro- | COG3481 |
| MptD | MptD, dihydroneopterin aldolase archaeal type | COG2098 |
| FolK | FolK, hydroxymethyldihydropterin pyrophosphokinase (EC 2.7.6.3) | COG0801 |
| MptE | Hydroxymethyldihydropterin pyrophosphokinase archaeal type | COG1634 |
| FolP | Dihydropteroate synthase (EC 2.5.1.15) | COG0294 |
| FolC | Bifunctional dihydrofolate synthase (EC 6.3.2.12)/folylpolyglutamyl synthase (EC 6.3.2.17) | COG0285 |
| FolC2 | Bifunctional dihydrofolate synthase (EC 6.3.2.12)/folylpolyglutamyl synthase (EC 6.3.2.17) type 2 | COG1478 |
| FolX | Dihydroneopterin triphosphate epimerase | COG1539 |
| FolA | Dihydrofolate reductase type I | COG0262 |
| FolM | Dihydropterin reductase | COG1028 |
| R67 | Dihydrofolate reductase type II | pfam06442 |
| Dpr | Flavin-dependent dihydropteroate reductase | No named domain |
| PabB | Para-aminobenzoate synthase, aminase component (EC 2.6.1.85) | COG0147 |
| PabA | Para-aminobenzoate synthase, amidotransferase component (EC 2.6.1.85) | COG0512 |
| PabC | Aminodeoxychorismate lyase (EC 4.1.3.38) | COG0115 |
| CT610 | Alternate p | COG5424 |
| FBT | Folate-biopterin transporter | COG2111 |
| ECF-FolT | Substrate-specific component FolT of folate Energy coupling factor (ECF) transporter | pfam12822 |
When no “Cluster of Orthologous Group” (COG) number exists, a pfam number is given when it is available.