Zong Hong Zhang, Kavitha Bharatham, Westley A Sherman, Ivana Mihalek.
Abstract
The deconSTRUCT webserver offers an interface to a protein database search engine, usable for general-purpose detection of similar protein (sub)structures. It first deconstructs the query structure into its secondary structure elements (SSEs) and then reassembles the match to the target by requiring a (tunable) degree of similarity in the direction and sequential order of SSEs. Hierarchical organization and judicious use of the information about protein structure enable deconSTRUCT to achieve the sensitivity and specificity of the established search engines at speeds that are orders of magnitude higher, without irretrievably tying up the substructure information in the form of a hash. In a post-processing step, a match on the level of the backbone atoms is constructed. The results presented to the user consist of the list of the matched SSEs, the transformation matrix for rigid superposition of the structures and several ways of visualization, both downloadable and implemented as a web-browser plug-in. The server is available at http://epsf.bmad.bii.a-star.edu.sg/struct_server.html.
Year: 2010 PMID: 20522512 PMCID: PMC2896154 DOI: 10.1093/nar/gkq489
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Result presentation in deconSTRUCT. (A) Results of a database search are presented in the form of a table, giving several scores for each hit and the link (red arrowhead) to the page describing the query-hit match in more detail. (B–D) Query-hit comparison page. (B) The page provides Jmol visualization of the structure superposition, as well as links for the download of PyMOL and Chimera sessions using the same visualization scheme: the SSEs motivating the match are represented in solid color, whereas the rest of the two structures is semi-transparent. The PyMOL visualization is shown. (C) Furthermore, the page lists the transformation used to produce the coordinate superposition in a typical format: three columns of the rotation matrix, followed by the translation vector column. The transformation applies to the hit structure. Following is the list of mapped elements of secondary structure, including their sequential number, type (strand or helix), cosine of the angle between the matched SSEs, the exponential weight for the cosine (see Supplementary Data, Equation 2) and their range on the respective structure. (D) Finally, the last piece of visualization shows the distance between structurally alignable residues as a colored bar between them, the color indicating the distance range between the corresponding Cαs.
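As an illustration of how the transformation listed in panel (C) could be applied to the hit structure, here is a minimal Python sketch. It assumes the usual reading of the "three rotation columns plus translation column" layout, namely three rows of four numbers [R | t] with new coordinates x' = R x + t. The function names `parse_transform` and `apply_transform` and the example matrix are hypothetical and are not taken from the server.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def parse_transform(text: str) -> List[List[float]]:
    """Parse three rows of four numbers: rotation columns, then translation."""
    rows = [
        [float(token) for token in line.split()]
        for line in text.strip().splitlines()
        if line.strip()
    ]
    if len(rows) != 3 or any(len(row) != 4 for row in rows):
        raise ValueError("expected 3 rows of 4 numbers ([R | t])")
    return rows


def apply_transform(rows: Sequence[Sequence[float]], point: Sequence[float]) -> Vec3:
    """Return R * point + t for one coordinate triple (assumed convention)."""
    x, y, z = (
        sum(row[j] * point[j] for j in range(3)) + row[3]
        for row in rows
    )
    return (x, y, z)


# Hypothetical matrix: 90-degree rotation about z, then a shift by (1, 2, 3).
EXAMPLE = """
0.0 -1.0 0.0 1.0
1.0  0.0 0.0 2.0
0.0  0.0 1.0 3.0
"""

transform = parse_transform(EXAMPLE)
moved = apply_transform(transform, (1.0, 0.0, 0.0))
print(moved)  # (1.0, 3.0, 3.0)
```

Applying the same function to every atom of the hit structure would place it in the frame of the query, provided the convention assumed above matches the one used by the server.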
Figure 2. Performance of the method behind deconSTRUCT, in comparison with other representative methods. The two panels give two representations of data collected in the same computational experiment. The legend applies to both panels. For a description of the test set, see the main text. The times are CPU times on a 3 GHz processor. Although the presented graphs use each pairwise comparison once (query-versus-target but not target-versus-query and not query-versus-self), all pairs (including query-versus-self) were used for the timing runs. deconSTRUCT uses pre-processed structure files. Pre-processing of the presented test set takes 4 s. If it were processing two full PDB entries in each pairwise comparison, as the other methods (in the implementation available to us) do, the total deconSTRUCT time would be 15 min. (A) ROC curves. For each method and for every possible pair in the test set, the quality of the structural match is evaluated. The pairs are sorted according to the match score native to each method. The ROC curve shows the fraction of true positives versus the fraction of false positives as the cutoff in the score value is moved down the sorted list of pairs. This graph is a standard way of representing and comparing binary classifiers as their discrimination threshold is varied. (B) ROC area versus query. For each individual query, the area under that query's ROC curve is calculated. For each method, the queries are sorted according to ROC area and the ROC area is plotted as a function of (sorted) query. This plot shows the ability of the method to bring true positives to the top of the list for a given query (irrespective of the values that the scoring function might take for other queries), which is precisely the task of a server such as deconSTRUCT.
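The two representations in panels (A) and (B) can be sketched in a few lines of Python. This is only an illustration of the general ROC construction, not the evaluation code used for the figure. It assumes that a higher score means a better match (a method whose native score runs the other way would need its sign flipped), and the query names, scores and labels below are made up.

```python
from typing import Dict, List, Sequence, Tuple


def roc_curve(scores: Sequence[float], labels: Sequence[bool]) -> Tuple[List[float], List[float]]:
    """Walk down the score-sorted list and record (false, true) positive fractions."""
    positives = sum(1 for label in labels if label)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(order):
        cutoff = scores[order[i]]
        # Items with tied scores cross the cutoff together.
        while i < len(order) and scores[order[i]] == cutoff:
            if labels[order[i]]:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / negatives)
        tpr.append(tp / positives)
    return fpr, tpr


def roc_area(fpr: Sequence[float], tpr: Sequence[float]) -> float:
    """Area under the ROC curve by the trapezoid rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
        for k in range(1, len(fpr))
    )


# Made-up data: per query, the scores of its hits and whether each is a true match.
queries: Dict[str, Tuple[List[float], List[bool]]] = {
    "q1": ([0.9, 0.8, 0.7, 0.6], [True, False, True, False]),
    "q2": ([0.9, 0.8, 0.2, 0.1], [True, True, False, False]),
}

# Panel (B) analogue: one ROC area per query, then sorted for plotting.
areas = {name: roc_area(*roc_curve(s, l)) for name, (s, l) in queries.items()}
sorted_areas = sorted(areas.values())
print(areas, sorted_areas)
```

Pooling all pairs into a single call to `roc_curve` corresponds to panel (A), while computing one area per query and sorting the results corresponds to panel (B).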