Chi-Hua Tung, Jinn-Moon Yang.
Abstract
fastSCOP is a web server that rapidly identifies the structural domains of a query protein structure and determines their evolutionary superfamilies. The server uses 3D-BLAST to quickly scan a large structural classification database (SCOP1.71, with <95% identity between entries), and the top 10 hit domains with different superfamily classifications are taken from the hit lists. MAMMOTH, a detailed structural alignment tool, then aligns the query to these top 10 structures to refine domain boundaries and identify evolutionary superfamilies. Our previous work demonstrated that 3D-BLAST is as fast as BLAST and shares its characteristics (e.g. a robust statistical basis, effective search and reliable database search capabilities) in large structural database searches, using a structural alphabet database and a structural alphabet substitution matrix. The classification accuracy of the server is approximately 98% for 586 query structures, and the average execution time is approximately 5 s. The server was also evaluated on 8700 structures that have no annotations in SCOP; it automatically assigned 7311 (84%) proteins (9420 domains) to SCOP superfamilies in 9.6 h. These results suggest that fastSCOP is robust and can be a useful server for recognizing the evolutionary classifications and protein functions of novel structures. The server is accessible at http://fastSCOP.life.nctu.edu.tw.
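The hit-selection step described in the abstract can be illustrated with a minimal sketch. It assumes one reading of the text: walk a ranked hit list and keep the best hit from each distinct SCOP superfamily until 10 are collected. The `Hit` record, its field names, the domain identifiers and the use of an E-value for ranking are hypothetical and are not taken from the fastSCOP or 3D-BLAST implementation. The superfamily is read from the first three fields of a SCOP classification string (e.g. `d.26.1` from `d.26.1.1`).

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One database hit (hypothetical record layout, for illustration only)."""
    domain_id: str   # identifier of the hit domain (made-up values below)
    sccs: str        # SCOP classification string: class.fold.superfamily.family
    e_value: float   # search E-value; smaller means a better hit


def superfamily(sccs: str) -> str:
    """Return the class.fold.superfamily prefix of a SCOP classification string."""
    return ".".join(sccs.split(".")[:3])


def top_distinct_superfamilies(hits: List[Hit], limit: int = 10) -> List[Hit]:
    """Keep the best-ranked hit of each superfamily, up to `limit` hits."""
    seen = set()
    picked = []
    for hit in sorted(hits, key=lambda h: h.e_value):
        sf = superfamily(hit.sccs)
        if sf in seen:
            continue  # a better hit from this superfamily was already kept
        seen.add(sf)
        picked.append(hit)
        if len(picked) == limit:
            break
    return picked


# Made-up hit list using the two superfamilies named in Figure 2.
example_hits = [
    Hit("hit_1", "d.26.1.1", 1e-30),
    Hit("hit_2", "d.26.1.1", 1e-20),
    Hit("hit_3", "a.118.8.1", 1e-10),
]
selected = top_distinct_superfamilies(example_hits)
```

In this example `selected` holds `hit_1` and `hit_3`, one representative per superfamily; in the workflow described above, the detailed structural alignment would then be run only against such a short list rather than against the whole database.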
Year: 2007 PMID: 17485476 PMCID: PMC1933144 DOI: 10.1093/nar/gkm288
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Overview of the fastSCOP server for SCOP domain recognition and superfamily assignment. The fastSCOP has (A) four main steps, including (B) 3D-BLAST for scanning the structural database, (C) MAMMOTH for detailed structural alignment and (D) domain boundary refinement and reassignment.
Figure 2. Evolutionary superfamily assignment and structural alignment by the fastSCOP server, using the structure of the multi-domain immunophilin AtFKBP42 from Arabidopsis thaliana (PDB code 2IF4-A) as the query. (A) The assigned SCOP superfamilies are the FKBP-like domain (SCOP entry d.26.1) and the TPR domain (SCOP entry a.118.8). (B) Multiple structural alphabet and amino acid sequence alignments of the FKBP-like domain between the query protein and five homologous proteins. The aligned secondary structures are represented as a continuous color spectrum from red through orange, yellow, green and blue to violet. The color is mapped to (C) the structure of the FKBP-like domain. (D) Structural alignment between the FKBP-like domain of the query protein and that of the homologous protein (PDB code 1Q1C-A).
Accuracy of evolutionary superfamily assignment and average execution time of fastSCOP, 3D-BLAST and MAMMOTH on 586 queries in the set SCOP-586
| Query type | Number of queries (domains) | Program | Number of assigned domains | Assignment accuracy (%) | Unassigned domains (%) | Average time per query (s) | Time relative to fastSCOP |
|---|---|---|---|---|---|---|---|
| Single domain | 464 query proteins (464 domains) | 3D-BLAST | 464 | 94.4 (95.9ᵃ) | 0 | 1.166 | 0.38 |
| | | MAMMOTH | 464 | 98.7 (98.7ᵃ) | 0 | 1046.47 | 338.61 |
| | | fastSCOP | 455 | 98.5 (99.6ᵃ) | 1.94 | 3.09 | 1 |
| Multiple domain | 122 query proteins (272 domains) | 3D-BLAST | 275 | 86.9 | 1.8 | 2.238 | 0.34 |
| | | MAMMOTH | 238 | 94.1 | 12.5 | 1859.80 | 278.40 |
| | | fastSCOP without reassignmentᵇ | 214 | 98.6 | 19.48 | 5.11 | 0.76 |
| | | fastSCOP | 254 | 98 | 6.6 | 6.68 | 1 |

ᵃ Assignment accuracy at the SCOP fold level.
ᵇ fastSCOP run without the reassignment step (step 4 in Figure 1A).
SCOP-586 consists of 586 query proteins, which are in SCOP1.69 but not in SCOP1.67; the search database is SCOP1.67.
Time was measured using a personal computer with an Intel Pentium 2.8 GHz processor with 1024 MB of RAM.
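The two derived columns of the table can be checked with a few lines of arithmetic. The sketch below assumes definitions inferred from the column headers rather than stated in the source: relative time is a program's average time divided by the fastSCOP time for the same query type, and the unassigned percentage is the share of query domains that received no assignment. The function names are illustrative.

```python
def relative_time(seconds: float, fastscop_seconds: float) -> float:
    """Average time per query expressed as a multiple of the fastSCOP time."""
    return seconds / fastscop_seconds


def unassigned_percent(assigned: int, total: int) -> float:
    """Percentage of query domains left without a superfamily assignment."""
    return 100.0 * (total - assigned) / total


# Single-domain rows: 3D-BLAST time against fastSCOP, and fastSCOP coverage.
single_3dblast_ratio = round(relative_time(1.166, 3.09), 2)
single_fastscop_unassigned = round(unassigned_percent(455, 464), 2)

# Multiple-domain rows: MAMMOTH and fastSCOP coverage of the 272 domains.
multi_mammoth_unassigned = unassigned_percent(238, 272)
multi_fastscop_unassigned = round(unassigned_percent(254, 272), 1)

print(single_3dblast_ratio, single_fastscop_unassigned)
print(multi_mammoth_unassigned, multi_fastscop_unassigned)
```

Under these assumed definitions the computed values agree with the tabulated 0.38, 1.94%, 12.5% and 6.6%.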