Marcin Pawlowski, Janusz M Bujnicki.
Abstract
BACKGROUND: Computational models of protein structures have proved useful as search models in Molecular Replacement (MR), a common method for solving the phase problem in macromolecular crystallography. The success of MR depends on the accuracy of the search model. Unfortunately, this accuracy remains unknown until the final structure of the target protein is determined. During the last few years, several Model Quality Assessment Programs (MQAPs) that predict the local accuracy of theoretical models have been developed. In this article, we analyze whether the application of MQAPs improves the utility of theoretical models in MR.
Year: 2012 PMID: 23126528 PMCID: PMC3534383 DOI: 10.1186/1471-2105-13-289
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. The schema of the MR workflow utilizing local model quality. A) The B-factor value of each atom in a search model is modified according to the corresponding deviation. B) The initial placement of a search model in the asymmetric unit is done by the AMORE and MOLREP programs; the 10 best-scored solutions from each program are then selected. C) These solutions are converted to polyalanine models and given as input to PHASER. D) The top PHASER solution is given as input to Refmac5 for restrained refinement. E) Finally, the structure of the target is built automatically using the ARP/wARP program.
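Steps A and C of the workflow can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the conversion B = (8π²/3)·d² from a deviation d (in Å) to a B-factor is an assumption (the caption does not state the formula used), and the function names, the cap of 999.99, and the choice to leave glycine residues unrenamed are hypothetical. Atom records are read by the fixed columns of the PDB format.

```python
import math

# Atoms kept in a polyalanine model: backbone plus C-beta.
BACKBONE_PLUS_CB = {"N", "CA", "C", "O", "CB"}


def deviation_to_bfactor(deviation):
    """ASSUMED conversion (not stated in the source): B = (8*pi^2/3) * d^2."""
    return (8.0 * math.pi ** 2 / 3.0) * deviation ** 2


def set_bfactors(pdb_lines, deviation_by_residue, max_b=999.99):
    """Step A: overwrite the B-factor field (columns 61-66) of ATOM records.

    deviation_by_residue maps (chain_id, residue_number) to a deviation in
    angstroms; residues missing from the mapping are left unchanged.
    """
    out = []
    for line in pdb_lines:
        if line.startswith("ATOM"):
            key = (line[21], int(line[22:26]))
            if key in deviation_by_residue:
                b = min(deviation_to_bfactor(deviation_by_residue[key]), max_b)
                line = line[:60] + f"{b:6.2f}" + line[66:]
        out.append(line)
    return out


def to_polyalanine(pdb_lines):
    """Step C: drop side-chain atoms beyond C-beta and rename residues to ALA.

    Glycine is left as GLY here (a hypothetical choice; it has no C-beta).
    """
    out = []
    for line in pdb_lines:
        if line.startswith("ATOM"):
            if line[12:16].strip() not in BACKBONE_PLUS_CB:
                continue
            if line[17:20] != "GLY":
                line = line[:17] + "ALA" + line[20:]
        out.append(line)
    return out
```

A typical use would read a model file into a list of lines, call `set_bfactors` with either the real or the MQAP-predicted per-residue deviations, and pass the result through `to_polyalanine` before handing it to an MR program.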
Figure 2. The impact of local model quality on MR success ratio. The success ratio is the fraction of correct MR solutions found by models of a given type. The model types are: 1) models not utilizing local model quality: ALL_20, BACKBONE_20; 2) models modified according to ideal local quality of each atom: ALL_IDEAL, BACKBONE_IDEAL; 3) models modified according to ideal local quality of C-α atoms: ALL_CA_IDEAL, BACKBONE_CA_IDEAL; 4) models modified according to predicted local quality by MQAPs: MetaMQAP-evaluated, MetaMQAPclust-evaluated; 5) templates converted into polyalanine models.
Global model quality measures in the definition of model usefulness for MR
Part A: AUC values for each type of search model (last column: average)
| RMSD | 0.842 | 0.846 | 0.845 | 0.838 | 0.841 | 0.831 | 0.804 | 0.841 | 0.864 | 0.856 | 0.841 |
| GDT_TS | 0.851 | 0.852 | 0.849 | 0.839 | 0.847 | 0.831 | 0.821 | 0.848 | 0.857 | 0.856 | 0.845 |
| seqId_stru | 0.859 | 0.866 | 0.786 | 0.784 | 0.802 | 0.792 | 0.859 | 0.81 | 0.879 | 0.872 | 0.831 |
| seqId_seq | 0.856 | 0.861 | 0.767 | 0.764 | 0.785 | 0.772 | 0.855 | 0.8 | 0.878 | 0.872 | 0.821 |
| fraction_of_gaps | 0.67 | 0.698 | 0.697 | 0.702 | 0.688 | 0.682 | 0.694 | 0.697 | 0.708 | 0.712 | 0.695 |
| aligned_columns | 0.653 | 0.662 | 0.535 | 0.54 | 0.561 | 0.566 | 0.69 | 0.586 | 0.674 | 0.67 | 0.614 |
| global_alignment_score | 0.816 | 0.823 | 0.715 | 0.724 | 0.729 | 0.728 | 0.833 | 0.749 | 0.826 | 0.815 | 0.776 |
| target length | 0.672 | 0.674 | 0.545 | 0.556 | 0.579 | 0.577 | 0.701 | 0.61 | 0.693 | 0.687 | 0.629 |
| above_1.5 | 0.78 | 0.784 | 0.717 | 0.718 | 0.715 | 0.718 | 0.787 | 0.732 | 0.811 | 0.799 | 0.756 |
| above_0.5 | 0.739 | 0.748 | 0.687 | 0.691 | 0.69 | 0.692 | 0.781 | 0.702 | 0.784 | 0.776 | 0.729 |
| above_-0.5 | 0.686 | 0.695 | 0.621 | 0.623 | 0.633 | 0.645 | 0.756 | 0.635 | 0.694 | 0.697 | 0.669 |
| above_-1.5 | 0.638 | 0.646 | 0.565 | 0.565 | 0.581 | 0.592 | 0.709 | 0.582 | 0.645 | 0.648 | 0.617 |
| below_-1.5 | 0.591 | 0.594 | 0.593 | 0.594 | 0.58 | 0.585 | 0.591 | 0.59 | 0.595 | 0.587 | 0.590 |
| AmIgoMR | 0.868 | 0.88 | 0.799 | 0.802 | 0.809 | 0.812 | 0.869 | 0.846 | 0.875 | 0.864 | 0.842 |
Part B: optimal threshold values for each type of search model (last column: average)
| RMSD | 1.61 | 1.61 | 1.71 | 1.70 | 1.70 | 1.70 | 1.68 | 1.70 | 1.60 | 1.60 | 1.660 |
| GDT_TS | 79.68 | 79.68 | 78.11 | 77.71 | 78.25 | 78.93 | 79.68 | 78.86 | 79.68 | 79.68 | 79.02 |
| seqId_stru | 0.31 | 0.31 | 0.24 | 0.24 | 0.25 | 0.25 | 0.32 | 0.25 | 0.31 | 0.31 | 0.278 |
| seqId_seq | 31.05 | 31.05 | 24.02 | 24.02 | 25.07 | 27.06 | 33.05 | 27.06 | 35.05 | 31.05 | 28.84 |
| AmIgoMR | 0.29 | 0.29 | 0.18 | 0.17 | 0.20 | 0.20 | 0.29 | 0.19 | 0.39 | 0.29 | 0.249 |
Part A presents the accuracy of predictions of whether a model is suitable for MR. AUC values are given for all types of search models studied here, together with the average AUC for each predictor. Part B shows the optimal threshold values for the predictors, that is, the predictor value at which the success of MR drops significantly. Predictors tested: RMSD and GDT_TS, global model quality computed by the LGA program; seqId_stru and seqId_seq, sequence identity computed from the structural alignment (TM-align) and the sequence alignment (HHalign), respectively. The following parameters describe the target-template sequence alignment made by the HHalign program: fraction_of_gaps, aligned_columns, global_alignment_score, target length, above_1.5, above_0.5, above_-0.5, above_-1.5, below_-1.5. AmIgoMR is a predictor based on logistic regression on the above-mentioned HHalign parameters.
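For readers unfamiliar with the AUC values in Part A, the sketch below shows one standard way to compute an AUC from a predictor's scores and the MR outcomes: the probability that a randomly chosen successful model receives a higher score than a randomly chosen failed one, with ties counted as one half. It is an illustration only; the function names are hypothetical and the source does not say which software was used to compute the reported values. For predictors where lower is better, such as RMSD, the scores would be negated before the call.

```python
def auc(scores, labels):
    """Rank-based AUC: P(score of a success > score of a failure).

    scores: predictor values, one per search model (higher = more promising).
    labels: truthy where MR succeeded with that model, falsy otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one success and one failure")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def row_average(values):
    """Average of one predictor's values over all search model types."""
    return sum(values) / len(values)


# The RMSD row of Part A: ten per-type AUC values.
rmsd_auc = [0.842, 0.846, 0.845, 0.838, 0.841,
            0.831, 0.804, 0.841, 0.864, 0.856]
rmsd_auc_average = row_average(rmsd_auc)  # compare with the last column
```

The pairwise loop is quadratic in the number of models, which is fine for a benchmark of this size; a sort-based rank sum gives the same value faster on large sets.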
Figure 3. The impact of local model quality on MR success ratio as a function of global model quality. The success ratio is the fraction of correct MR solutions found by models of a given type. The model types are: 1) models with B-factor values set to 20: ALL_20, BACKBONE_20; 2) models modified according to the real accuracy of a model: ALL_IDEAL, BACKBONE_IDEAL, ALL_CA_IDEAL, BACKBONE_CA_IDEAL; 3) models modified according to predicted local quality by MQAPs: MetaMQAP-evaluated, MetaMQAPclust-evaluated; 4) templates converted into polyalanine models. The black, grey and blue bars correspond to search models of high (GDT_TS≥80), moderate (70≤GDT_TS<80) and low (GDT_TS<70) global quality.
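The grouping behind Figure 3 can be expressed as a short sketch: each MR trial is assigned to a global quality class using the GDT_TS cut-offs given in the caption, and the success ratio is computed per model type and class. The record layout `(model_type, gdt_ts, solved)` and the function names are assumptions made for illustration; no data from the study are reproduced here.

```python
from collections import defaultdict


def quality_bin(gdt_ts):
    """Global quality class, using the cut-offs from the Figure 3 caption."""
    if gdt_ts >= 80:
        return "high"
    if gdt_ts >= 70:
        return "moderate"
    return "low"


def success_ratio_by_bin(records):
    """Fraction of correct MR solutions per (model type, quality class).

    records: iterable of (model_type, gdt_ts, solved) tuples, where solved
    is truthy if MR with that search model gave a correct solution.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [solved, total]
    for model_type, gdt_ts, solved in records:
        entry = counts[(model_type, quality_bin(gdt_ts))]
        entry[0] += 1 if solved else 0
        entry[1] += 1
    return {key: solved / total for key, (solved, total) in counts.items()}
```

Calling `success_ratio_by_bin` on a list of trial records returns a dictionary keyed by pairs such as `("ALL_20", "high")`, which maps directly onto the bars of a grouped bar chart.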