Luca Pireddu, Duane Szafron, Paul Lu, Russell Greiner.
Abstract
Pathway Analyst (Path-A) is a publicly available web server (http://path-a.cs.ualberta.ca) that predicts metabolic pathways. It takes a FASTA format file containing a set of query protein sequences from a single organism (a partial or complete proteome) and identifies those sequences that are likely to participate in any of its supported metabolic pathways (currently 10). Path-A uses a number of machine-learning and sequence analysis techniques (e.g. SVM, BLAST and HMM) to predict pathways. Each machine-learned classifier exploits similarity between sequences in the pathways of its model organisms and sequences in the query set. It predicts the pathways that are present in the query organism and annotates each predicted reaction and catalyst, using the appropriate sequences from the query set. Path-A also provides a browsable and searchable database of the pathways for the model organisms that are used to make its predictions. Path-A's predictor sets (using different classifier technologies) have been evaluated using standard cross-validation techniques on a dataset of 10 metabolic pathways across 13 model organisms, a total of 125 organism-specific pathways. The most accurate classifier technology obtained a mean precision of 78.3% and a mean recall of 92.6% in predicting all catalyst proteins, of all reactions, in all pathways present in the dataset. Although Path-A currently only supports metabolic pathways, the underlying prediction techniques are general enough for other types of pathways. Consequently, we intend to extend Path-A to predict other types of pathways, including signalling pathways.
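As an illustration of the input described above, the following is a minimal sketch of reading a FASTA format protein set into memory. It is not Path-A's own code: the function name `parse_fasta`, the record identifiers and the sequences are hypothetical, and the sketch assumes the common convention that each record starts with a `>` header line whose first token is the identifier.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each record identifier (first token of the '>' header) to its sequence."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            tokens = line[1:].split()
            # Fall back to a positional name if the header is empty (assumption).
            current_id = tokens[0] if tokens else f"record_{len(records) + 1}"
            records[current_id] = ""
        elif current_id is not None:
            # Sequence data may be wrapped over several lines; join them.
            records[current_id] += line
    return records


# Hypothetical two-protein query set, for illustration only.
example = """>query_1 hypothetical protein
MKTAYIAKQR
QISFVKSHFS
>query_2
MSDNE
""".splitlines()

proteins = parse_fasta(example)
```

Here `proteins` maps each identifier to a single unwrapped sequence string, which is the form a downstream similarity search would typically consume.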
Year: 2006 PMID: 16845105 PMCID: PMC1538809 DOI: 10.1093/nar/gkl228
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Different classifiers in Path-A: mean catalyst prediction scores for each classifier type (standard deviation given in parentheses)
| Classifier | F-measure | Precision | Recall |
|---|---|---|---|
| Opt BLAST | 0.837 (0.130) | 0.783 (0.170) | 0.926 (0.114) |
| Opt HMM | 0.795 (0.141) | 0.777 (0.184) | 0.848 (0.138) |
| BLAST–HMM | 0.673 (0.152) | 0.630 (0.197) | 0.784 (0.176) |
| BLAST | 0.667 (0.155) | 0.609 (0.205) | 0.802 (0.170) |
| Motif SVM | 0.659 (0.155) | 0.666 (0.190) | 0.692 (0.187) |
| HMM | 0.654 (0.164) | 0.704 (0.190) | 0.671 (0.221) |
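
The scores in this table are means over organism-specific pathways. As a rough sketch of how such per-pathway scores can be computed and averaged, the example below derives precision, recall and the F-measure (their harmonic mean) from sets of predicted and known catalyst proteins. The function name `score` and the protein identifiers are hypothetical, and averaging per-pathway scores is an assumption about the evaluation rather than a description of Path-A's actual procedure.

```python
from statistics import mean
from typing import Set, Tuple


def score(predicted: Set[str], actual: Set[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) for one pathway's catalyst predictions."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical (predicted, known) catalyst sets for two pathways.
pathways = [
    ({"p1", "p2", "p3", "p4"}, {"p1", "p2", "p3"}),  # one false positive
    ({"q1", "q2"}, {"q1", "q2", "q3", "q4"}),        # two false negatives
]

scores = [score(predicted, actual) for predicted, actual in pathways]
mean_precision = mean(s[0] for s in scores)
mean_recall = mean(s[1] for s in scores)
mean_f_measure = mean(s[2] for s in scores)
```

Note that a mean of per-pathway F-measures generally differs from the harmonic mean of the mean precision and mean recall, so the three averaged scores need not satisfy the harmonic-mean relation exactly.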
Figure 1. Path-A services in the Control centre.
Figure 2. New analysis: Step 1 - Start.
Figure 3. New analysis: Step 2 - Proteins.
Figure 4. New analysis: Step 2 - Proteins: upload a new protein set page.
Figure 5. New analysis: Step 2 - Proteins: select an organism page.
Figure 6. New analysis: Step 2 - Upload a new protein set revisited.
Figure 7. Protein set uploaded.
Figure 8. New analysis: Step 3 - Which pathways page.
Figure 9. Viewing an analysis.
Figure 10. A predicted pathway instance.