Kieran Alden, Stella Veretnik, Philip E Bourne.
Abstract
BACKGROUND: Partitioning a protein into structural components, known as domains, is an important initial step in protein classification and in functional and evolutionary studies. While systematic domain assignments by human experts exist (CATH and SCOP), the introduction of high-throughput technologies for structure determination threatens to overwhelm expert approaches. A variety of algorithmic methods have been developed to expedite this process, allowing almost instant structural decomposition into domains. The performance of algorithmic methods can approach 85% agreement with the expert consensus on the number of domains. However, each algorithm takes a somewhat different conceptual approach, with its own strengths and weaknesses. Currently there is no simple way to automatically compare assignments from different structure-based domain assignment methods; such a comparison would provide a comprehensive understanding of possible structure partitionings as well as some insight into the tendencies of particular algorithms. Most importantly, a consensus assignment drawn from multiple assignment methods can provide a single and presumably more accurate view.
Year: 2010 PMID: 20529369 PMCID: PMC2897830 DOI: 10.1186/1471-2105-11-310
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Features of the dConsensus tool. The screenshots represent a subset of the pages of dConsensus. A. Initial query form. B. Results of domain assignments by all methods for 1cs6A. C. Consensus (simple) for 1cs6A. D. Consensus (weighted) for 1cs6A. E. Boundary analysis options for 1smaA. F. Boundary analysis of a specified region of 1smaA. Alpha-helical regions are marked in blue, beta-sheets are marked in gold, and positions of domain/fragment boundaries are marked in red.
Figure 2. Analysis of individual algorithmic methods and performance of consensuses. All evaluations use the 315-chain Balanced Benchmark_2 [8]. A. Evaluation of domain prediction by individual methods using the number of predicted domains as the sole criterion. Correct assignments are in green, over-cuts (predicting too many domains) are in red, and undercuts (predicting too few domains) are in blue. B. Placement of domain/fragment boundaries by individual methods with respect to secondary structures. The fraction of cuts through alpha-helical structures is indicated in gold, and the fraction of cuts through beta-sheets is indicated in green. C. Fraction of chains that reach consensus. Simple consensus is indicated in blue, and weighted consensus is indicated in red. D. Fraction of chains whose consensus agrees with the expert consensus.
Set of rules used to determine final contributions of individual methods toward weighted consensus
| Criterion | Rule |
|---|---|
| Number of predicted domains | If the number of domains predicted by PDP and NCBI is ≥ 4, the weight assigned to DP is reduced by 10% |
| | If PUU predicts more domains than PDP and NCBI, the PUU prediction is downgraded by 10% |
| | If PDP predicts five or more domains, NCBI is downgraded by 10% |
| Number of fragments per domain/chain | If three or more methods have at least one fragmented domain (not necessarily the same domain), the weight of all methods that do not predict fragmented domains is reduced by 10% |
| | If NCBI and PDP have no fragmented domains, the weight of all methods that predict fragmented domains is reduced by 10% |
| Type of structure | If the structure is all alpha-helix (in the DSSP structure definition) and NCBI and PDP disagree on the number of domains in the chain, the weight of PDP is increased by 10% |
| | If the structure is all beta-sheet and NCBI and PDP disagree on the number of domains, the weight of PDP is increased by 10% |
| | If the structure is all beta-sheet and NCBI and PDP agree, the weight of both methods is increased by 10% |
| | If the structure is alpha-beta and NCBI and PDP agree, the weights of all methods that disagree with PDP and NCBI are reduced by 10% |
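
The rules in the table above can be read as a sequence of multiplicative weight adjustments. The following Python sketch illustrates one possible reading and is not the dConsensus implementation. Several points are assumptions not stated in the source: every method starts at a base weight of 1.0, each 10% change multiplies the current weight, the rules are applied in table order, "PDP and NCBI ≥ 4" means both methods predict at least four domains, and the structure class arrives as one of the hypothetical labels `"alpha"`, `"beta"`, or `"alpha-beta"`. The names `Prediction` and `adjust_weights` are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    n_domains: int        # number of domains the method predicts for the chain
    has_fragmented: bool  # True if at least one predicted domain is fragmented


def adjust_weights(preds, structure_class, base=1.0, step=0.10):
    """Return per-method weights after applying the table's rules in order.

    `preds` maps a method name to its Prediction and must contain "PDP" and
    "NCBI". The base weight and the multiplicative step are assumptions.
    """
    weights = {method: base for method in preds}
    down, up = 1.0 - step, 1.0 + step

    def scale(method, factor):
        # Methods absent from `preds` are skipped.
        if method in weights:
            weights[method] *= factor

    pdp, ncbi = preds["PDP"], preds["NCBI"]
    agree = pdp.n_domains == ncbi.n_domains

    # Number of predicted domains
    if pdp.n_domains >= 4 and ncbi.n_domains >= 4:
        scale("DP", down)
    puu = preds.get("PUU")
    if puu is not None and puu.n_domains > max(pdp.n_domains, ncbi.n_domains):
        scale("PUU", down)
    if pdp.n_domains >= 5:
        scale("NCBI", down)

    # Number of fragments per domain/chain
    fragmented = {m for m, p in preds.items() if p.has_fragmented}
    if len(fragmented) >= 3:
        for m in preds:
            if m not in fragmented:
                scale(m, down)
    if not pdp.has_fragmented and not ncbi.has_fragmented:
        for m in fragmented:
            scale(m, down)

    # Type of structure
    if structure_class in ("alpha", "beta") and not agree:
        scale("PDP", up)
    elif structure_class == "beta" and agree:
        scale("PDP", up)
        scale("NCBI", up)
    elif structure_class == "alpha-beta" and agree:
        for m, p in preds.items():
            if p.n_domains != pdp.n_domains:
                scale(m, down)
    return weights


# Made-up predictions for a single chain, for illustration only.
example = {
    "PDP": Prediction(4, False),
    "NCBI": Prediction(4, False),
    "PUU": Prediction(5, True),
    "DP": Prediction(3, False),
}
weights = adjust_weights(example, "alpha-beta")
```

How the adjusted weights are then combined into the weighted consensus is not described in this section, so the sketch stops at the weights themselves.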