Ron Caspi, Tomer Altman, Kate Dreher, Carol A Fulcher, Pallavi Subhraveti, Ingrid M Keseler, Anamika Kothari, Markus Krummenacker, Mario Latendresse, Lukas A Mueller, Quang Ong, Suzanne Paley, Anuradha Pujar, Alexander G Shearer, Michael Travers, Deepika Weerasinghe, Peifen Zhang, Peter D Karp.
Abstract
The MetaCyc database (http://metacyc.org/) provides a comprehensive and freely accessible resource for metabolic pathways and enzymes from all domains of life. The pathways in MetaCyc are experimentally determined, small-molecule metabolic pathways and are curated from the primary scientific literature. MetaCyc contains more than 1800 pathways derived from more than 30,000 publications, and is the largest curated collection of metabolic pathways currently available. Most reactions in MetaCyc pathways are linked to one or more well-characterized enzymes, and both pathways and enzymes are annotated with reviews, evidence codes and literature citations. BioCyc (http://biocyc.org/) is a collection of more than 1700 organism-specific Pathway/Genome Databases (PGDBs). Each BioCyc PGDB contains the full genome and predicted metabolic network of one organism. The network, which is predicted by the Pathway Tools software using MetaCyc as a reference database, consists of metabolites, enzymes, reactions and metabolic pathways. BioCyc PGDBs contain additional features, including predicted operons, transport systems and pathway-hole fillers. The BioCyc website and Pathway Tools software offer many tools for querying and analysis of PGDBs, including Omics Viewers and comparative analysis. New developments include a zoomable web interface for diagrams; flux-balance analysis model generation from PGDBs; web services; and a new tool called Web Groups.
Year: 2011 PMID: 22102576 PMCID: PMC3245006 DOI: 10.1093/nar/gkr1014
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
List of species with 18 or more experimentally elucidated pathways represented in MetaCyc (meaning that there is experimental evidence for the occurrence of these pathways in the organism)
| Bacteria | Eukarya | Archaea |
|---|---|---|
| 276 | 311 | 20 |
| 66 | 186 | 21 |
| 57 | 134 | 18 |
| 49 | 77 | 18 |
| 36 | 67 | |
| 30 | 50 | |
| 26 | 48 | |
| 25 | 47 | |
| 25 | 46 | |
| 21 | 41 | |
| 18 | 39 | |
| 18 | 35 | |
| | 31 | |
| | 25 | |
| | 23 | |
| | 22 | |
| | 18 | |
| | 18 | |
The species are grouped by taxonomic domain and are ordered within each domain based on the number of pathways (number following species name) to which the given species was assigned. Some pathways may be labeled with a higher-level taxon, such as genus, if all the species within that genus are thought to have the given pathway. However, such higher-level taxa are not included in this table.
The distribution of pathways in MetaCyc based on the taxonomic classification of associated species
| Bacteria | Pathways | Eukarya | Pathways | Archaea | Pathways |
|---|---|---|---|---|---|
| Proteobacteria | 900 | Viridiplantae | 784 | Euryarchaeota | 125 |
| Firmicutes | 258 | Fungi | 271 | Crenarchaeota | 37 |
| Actinobacteria | 214 | Metazoa | 247 | | |
| Bacteroidetes/Chlorobi | 59 | Euglenozoa | 24 | | |
| Cyanobacteria | 48 | Alveolata | 15 | | |
| Deinococcus-Thermus | 25 | Amoebozoa | 10 | | |
| Tenericutes | 19 | Stramenopiles | 5 | | |
| Thermotogae | 19 | Fornicata | 4 | | |
| Aquificae | 13 | Rhodophyta | 4 | | |
| Spirochaetes | 12 | Haptophyceae | 3 | | |
| Chlamydiae-Verrucomicrobia | 6 | Parabasalia | 3 | | |
| Planctomycetes | 6 | | | | |
| Chloroflexi | 4 | | | | |
| Fusobacteria | 4 | | | | |
| Nitrospirae | 2 | | | | |
| Thermodesulfobacteria | 2 | | | | |
| Chrysiogenetes | 1 | | | | |
For example, the statement ‘Tenericutes 19’ means that there is experimental evidence that at least 19 MetaCyc pathways occur in members of this taxonomic group. Major taxonomic groups are grouped by domain and are ordered within each domain based on the number of pathways (number following taxon name) associated with the taxon. A pathway may be associated with multiple organisms, so the same pathway can be counted under more than one taxon.
Figure 1. An object group was created from the results of a search of the EcoCyc PGDB for genes containing the text string ‘trp’. After deleting a few rows of the table, two more columns were added by several transformations performed on the gene group, including the transformation ‘Products of gene’ and the transformation ‘Reaction of gene’.
Figure 2. An object group created by several transformations performed on the group shown in Figure 1. The first column contains all substrates that are included in the ‘Reaction’ column of that table, and the second column shows the structures of these compounds. These columns were generated using the transformations ‘Substrates of reaction’ and ‘Structures of compound’.
Figure 3. Enrichment analysis of Web Groups objects. A group of Escherichia coli genes was analyzed for enrichment of the genes in pathways. The resulting table includes a list of pathways, the P-value for each pathway and the subgroup of genes from the original group that participate in each pathway. The table has been modified by removing some rows that represented pathway classes and superpathways, leaving only base pathways.
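The pathway enrichment table described in the Figure 3 caption (pathway, P-value, participating genes) can be illustrated with a minimal sketch. The caption does not say which statistical test Web Groups applies, so the one-sided hypergeometric test below is an assumption, as are the function names, the toy gene and pathway identifiers, and the absence of any multiple-testing correction.

```python
from math import comb


def hypergeom_tail(k, population, pathway_size, group_size):
    """P(X >= k): chance of seeing at least k pathway genes in a random
    group of `group_size` genes drawn from `population` genes."""
    upper = min(pathway_size, group_size)
    total = comb(population, group_size)
    hits = sum(
        comb(pathway_size, i) * comb(population - pathway_size, group_size - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def enrich(group_genes, pathways, population):
    """Return (pathway, p_value, overlapping genes) rows, smallest P first.

    `pathways` maps a pathway name to the set of genes assigned to it;
    `population` is the number of genes in the background (hypothetical).
    Pathways sharing no gene with the group are skipped.
    """
    group = set(group_genes)
    rows = []
    for name, members in pathways.items():
        overlap = sorted(group & set(members))
        if not overlap:
            continue
        p = hypergeom_tail(len(overlap), population, len(members), len(group))
        rows.append((name, p, overlap))
    return sorted(rows, key=lambda row: row[1])


# Toy, made-up identifiers purely for illustration.
toy_pathways = {"P1": {"a", "b", "c"}, "P2": {"x", "y"}}
toy_result = enrich({"a", "b", "c"}, toy_pathways, population=10)
```

Each row mirrors the three columns named in the caption. A production analysis would normally also correct the P-values for the number of pathways tested.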
Figure 4. The new Web Cellular Omics Viewer. This figure, showing a Cellular Omics Viewer for the bacterium E. coli, depicts the overlay of a gene expression data set (39). The level of transcription is indicated by the color of the reactions that are catalyzed by the enzymes encoded by the specific genes. The legend for mapping colors to data values is not shown in the figure. By hovering the mouse cursor over a compound or a reaction, the user can create popup windows that provide information and enable navigation to the relevant compound page or to a pathway display.
Figure 5. (A) Searching HumanCyc for several monoisotopic molecular weights, with a specified tolerance of 5 ppm. This type of search is useful for analysis of compounds identified by mass spectrometry, enabling researchers to find candidate compounds known to exist in the organism, and to learn about their roles in the metabolic network. (B) The result of the search is a table that includes matching compounds, their monoisotopic mass, the query mass they match and their chemical formula. The compound name is a hyperlink to the compound’s page, enabling users to quickly learn about the reactions and pathways in which the compound participates in this organism.
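To make the 5 ppm tolerance in the Figure 5 caption concrete, the sketch below matches query masses against a small compound list. It is not the BioCyc implementation: the `Compound` type, the function names and the three-entry reference list are hypothetical, and the ppm error is assumed to be computed relative to the reference compound's mass.

```python
from typing import NamedTuple


class Compound(NamedTuple):
    name: str
    formula: str
    monoisotopic_mass: float  # daltons


def ppm_error(query_mass, reference_mass):
    """Signed deviation of the query from the reference, in parts per million."""
    return (query_mass - reference_mass) / reference_mass * 1e6


def search_by_mass(query_masses, compounds, tolerance_ppm=5.0):
    """Return (name, monoisotopic mass, query mass, formula) for every match."""
    hits = []
    for query in query_masses:
        for compound in compounds:
            if abs(ppm_error(query, compound.monoisotopic_mass)) <= tolerance_ppm:
                hits.append(
                    (compound.name, compound.monoisotopic_mass, query, compound.formula)
                )
    return hits


# Small illustrative reference list; a real search would use the PGDB's compounds.
reference = [
    Compound("D-glucose", "C6H12O6", 180.06339),
    Compound("L-tryptophan", "C11H12N2O2", 204.08988),
    Compound("pyruvic acid", "C3H4O3", 88.01604),
]

# 180.0640 lies about 3.4 ppm from glucose (match);
# 204.0920 lies about 10 ppm from tryptophan (no match).
matches = search_by_mass([180.0640, 204.0920], reference)
```

The returned tuples carry the same four fields as the result table in panel B. Because the tolerance is relative, the absolute mass window widens for heavier compounds.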
Figure 6. The new database selector lets the user select a PGDB either by typing the name of an organism or by browsing the organism taxonomy. If the ‘Go to Organism Summary page for selected database’ box at the bottom of the selector window is checked, the software will display that page upon selection, providing background information and statistics about that database.