You Zhou, Yongjie Liang, Karlene H Lynch, Jonathan J Dennis, David S Wishart.
Abstract
PHAge Search Tool (PHAST) is a web server designed to rapidly and accurately identify, annotate and graphically display prophage sequences within bacterial genomes or plasmids. It accepts either raw DNA sequence data or partially annotated GenBank formatted data and rapidly performs a number of database comparisons as well as phage 'cornerstone' feature identification steps to locate, annotate and display prophage sequences and prophage features. Relative to other prophage identification tools, PHAST is up to 40 times faster and up to 15% more sensitive. It is also able to process and annotate both raw DNA sequence data and GenBank files, provide richly annotated tables on prophage features and prophage 'quality' and distinguish between intact and incomplete prophage. PHAST also generates downloadable, high-quality, interactive graphics that display all identified prophage components in both circular and linear genomic views. PHAST is available at http://phast.wishartlab.com.
Year: 2011 PMID: 21672955 PMCID: PMC3125810 DOI: 10.1093/nar/gkr485
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
PHAST's phage completeness score calculation
| Criterion | Scenario A | Scenario B | Scenario C |
|---|---|---|---|
| Number of nucleotides (# bases) | – | (# bases in the region/ # bases in the related phage) × 100 | +10 if # bases >30 kb |
| Number of genes (# genes) | – | (# genes in the region/ # genes in the related phage) × 100 | +10 if # genes >40 |
| Cornerstone genes | – | – | +10 for each cornerstone gene |
| Phage-like genes | – | – | +10 if occupies 70% or more of the region |
| Final scores | 150 | Sum of above | Sum of above |
The completeness score is calculated for three different scenarios: (A) the region contains all genes of a known phage; (B) >50% of the genes in the region are related to a known phage; (C) <50% of the genes in the region are related to a known phage.
Cornerstone genes are key phage structural genes (identified using keywords such as ‘capsid’, ‘head’, ‘plate’, ‘tail’, ‘coat’, ‘portal’ and ‘holin’), phage DNA regulation genes (such as ‘integrase’, ‘transposase’ and ‘terminase’) and phage function genes (such as ‘lysin’ and ‘bacteriocin’).
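The scoring rules in the table can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PHAST's actual code: the function names, the argument names, the naive substring matching of cornerstone keywords, and the choice to treat a region with exactly 50% related genes as Scenario C are all hypothetical, since the table does not specify them.

```python
from typing import Optional, Sequence

# Keywords quoted in the table footnote. The footnote says "such as",
# so the list PHAST really uses may be longer.
CORNERSTONE_KEYWORDS = (
    "capsid", "head", "plate", "tail", "coat", "portal", "holin",  # structural
    "integrase", "transposase", "terminase",                        # DNA regulation
    "lysin", "bacteriocin",                                         # phage function
)


def is_cornerstone(annotation: str) -> bool:
    """Naive case-insensitive substring match (an assumption, not PHAST's matcher).

    Substring matching also hits unrelated words such as "lysine".
    """
    text = annotation.lower()
    return any(keyword in text for keyword in CORNERSTONE_KEYWORDS)


def completeness_score(
    region_bases: int,
    region_genes: int,
    annotations: Sequence[str],
    phage_like_fraction: float,
    related_gene_fraction: float,
    has_all_phage_genes: bool = False,
    phage_bases: Optional[int] = None,
    phage_genes: Optional[int] = None,
) -> float:
    """Return a completeness score following the three scenarios in the table."""
    # Scenario A: the region contains all genes of a known phage.
    if has_all_phage_genes:
        return 150.0

    # Scenario B: more than 50% of the genes are related to a known phage.
    if related_gene_fraction > 0.5:
        if not phage_bases or not phage_genes:
            raise ValueError("Scenario B needs the related phage's size and gene count")
        base_part = region_bases / phage_bases * 100
        gene_part = region_genes / phage_genes * 100
        return base_part + gene_part

    # Scenario C: fewer than 50% related genes. Exactly 50% is not covered
    # by the table; this sketch assumes it falls here.
    score = 0.0
    if region_bases > 30_000:
        score += 10
    if region_genes > 40:
        score += 10
    score += 10 * sum(is_cornerstone(a) for a in annotations)
    if phage_like_fraction >= 0.70:
        score += 10
    return score


# Scenario B: half the bases and three quarters of the genes of the related phage.
print(completeness_score(20_000, 30, [], 0.0, 0.6, phage_bases=40_000, phage_genes=40))  # 125.0

# Scenario C: long region, many genes, two cornerstone hits, 80% phage-like genes.
genes = ["phage tail protein", "integrase", "hypothetical protein"]
print(completeness_score(35_000, 45, genes, 0.8, 0.2))  # 50.0
```

A Scenario B score can exceed 100 because it adds two percentages, which is consistent with the fixed score of 150 assigned to Scenario A.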
Figure 1. A screenshot montage of some of PHAST's graphical and tabular views, including its linear and circular genome renderings as well as the corresponding prophage annotation.
Features and performance of PHAST relative to other prophage identification tools
| Feature | PHAST | Prophinder | Prophage Finder | Phage_Finder |
|---|---|---|---|---|
| Execution time ( | 140 s | 1980 ± 90 s | N/A | 5547 s |
| Execution time ( | 240 s | N/A | 1800 ± 90 s | N/A |
| Accepts raw DNA file | Yes | No | Yes | No |
| Accepts GenBank file | Yes | Yes | No | Yes |
| Sensitivity (%) | 85.4 | 77.5 | N/A | 68.5 |
| Sensitivity (%) (sequence alone) | 79.4 | N/A | 92.1 | N/A |
| PPV (%) | 94.2 | 93.6 | N/A | 94.3 |
| PPV (sequence alone) (%) | 86.5 | N/A | 52.1 | N/A |
| Downloadable images | Yes | No | No | No |
| Phage completeness labeling | Yes | No | No | No |
| Circular genome view | Yes | No | No | No |
| Zoomable graphics | Yes | No | No | No |
| Highlights key phage proteins | Yes | No | No | No |
| Scriptable operation | Yes | No | No | Yes |
| Attachment site prediction | Yes | No | No | Yes |
| Output type | Tables + graphics | Tables + graphics | Text only | Text only |
| Output readability | Good | Good | Poor | Poor |
The performance of all web servers and programs was evaluated using a reference set of 54 manually annotated genomes (1,10). Annotations for all 54 genome inputs can also be found at Prophinder's web page. Phage_Finder was tested locally (strict mode) using HMMER 2.3 for its HMM search. Prophage Finder's (11) web server was tested using its default parameters. Detailed results can be found on the PHAST website.
Prophage Finder did not return results for every input file; the numbers reported here are for just 46 of the 54 genomes.
Phage_Finder can be run in four different modes; this table reports the results using strict mode with HMMER 2.3. Other parameters give: Sn = 88.4%, PPV = 23.7% (HMMER 2.3, non-strict mode); Sn = 90.3%, PPV = 23.9% (HMMER 3.0, non-strict mode); and Sn = 0.0%, PPV = 0.0% (HMMER 3.0, strict mode). With HMMER 3.0, Phage_Finder is significantly faster (898 s for E. coli O157:H7) but significantly less accurate.
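For readers unfamiliar with the metrics, the short sketch below shows the standard definitions of sensitivity (Sn) and positive predictive value (PPV), along with the speed ratios implied by the execution times in the table. The function names and the true/false positive counts are hypothetical. How the original evaluation counted a true positive prophage is not stated here, so only the formulas and the timing arithmetic should be taken at face value.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sn (%) = TP / (TP + FN) x 100: share of real prophages that were found."""
    return 100.0 * true_pos / (true_pos + false_neg)


def ppv(true_pos: int, false_pos: int) -> float:
    """PPV (%) = TP / (TP + FP) x 100: share of predictions that were real."""
    return 100.0 * true_pos / (true_pos + false_pos)


def speedup(other_seconds: float, phast_seconds: float) -> float:
    """How many times faster PHAST is than another tool on the same input."""
    return other_seconds / phast_seconds


# Execution times taken from the table (mean values where a +/- is given).
print(round(speedup(5547, 140), 1))  # Phage_Finder vs PHAST    -> 39.6
print(round(speedup(1980, 140), 1))  # Prophinder vs PHAST      -> 14.1
print(round(speedup(1800, 240), 1))  # Prophage Finder vs PHAST -> 7.5

# Hypothetical counts, used only to show the formulas.
print(sensitivity(true_pos=90, false_neg=10))  # 90.0
print(ppv(true_pos=90, false_pos=10))          # 90.0
```

The first ratio, about 39.6, matches the abstract's "up to 40 times faster" figure.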