Paul S Bond, Kevin D Cowtan.
Abstract
Interactive model building can be a difficult and time-consuming step in the structure-solution process. Automated model-building programs such as Buccaneer often make it quicker and easier by completing most of the model in advance. However, they may fail to do so with low-resolution data or a poor initial model or map. The Buccaneer pipeline is a relatively simple program that iterates Buccaneer with REFMAC to refine the model and update the map. A new pipeline called ModelCraft has been developed that expands on this to include shift-field refinement, machine-learned pruning of incorrect residues, classical density modification, addition of water and dummy atoms, building of nucleic acids and final rebuilding of side chains. Testing was performed on 1180 structures solved by experimental phasing, 1338 structures solved by molecular replacement using homologues and 2030 structures solved by molecular replacement using predicted AlphaFold models. Compared with the previous Buccaneer pipeline, ModelCraft increased the mean completeness of the protein models in the experimental phasing cases from 91% to 95%, the molecular-replacement cases from 50% to 78% and the AlphaFold cases from 82% to 91%.
Keywords: Buccaneer; ModelCraft; X-ray crystallography; automation; model building; software; structure solution
Year: 2022; PMID: 36048149; PMCID: PMC9435595; DOI: 10.1107/S2059798322007732
Source DB: PubMed; Journal: Acta Crystallogr D Struct Biol; ISSN: 2059-7983; Impact factor: 5.699
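The abstract describes the earlier Buccaneer pipeline as alternating a model-building step with a refinement step that also updates the map. The loop below is a minimal sketch of that control flow only. The function name `iterate_build_refine`, the fixed cycle count and the toy stand-ins are hypothetical, and nothing here calls Buccaneer, REFMAC or ModelCraft.

```python
from typing import Any, Callable, Tuple


def iterate_build_refine(
    model: Any,
    density_map: Any,
    build: Callable[[Any, Any], Any],
    refine: Callable[[Any], Tuple[Any, Any]],
    cycles: int = 3,  # assumed value; the abstract does not state a cycle count
) -> Tuple[Any, Any]:
    """Alternate building and refinement for a fixed number of cycles."""
    for _ in range(cycles):
        # Build into the current map, starting from the current model.
        model = build(model, density_map)
        # Refine the model; refinement also yields an updated map.
        model, density_map = refine(model)
    return model, density_map


# Toy stand-ins (hypothetical) so the sketch runs without any external program.
def toy_build(model, density_map):
    return model + ["residue"]


def toy_refine(model):
    return model, len(model)


final_model, final_map = iterate_build_refine([], 0, toy_build, toy_refine, cycles=3)
```

In a real pipeline the two callables would wrap external programs, and the loop would likely stop on a convergence test rather than a fixed count.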
Entries discarded from the experimental phasing test set
| Count | Reason |
|---|---|
| 36 | No experimental phases deposited |
| 13 | Different cell or space group in the structure and data |
| 1 | Error during |
| 45 | Data completeness less than 90% |
| 37 | |
| 10 | |
Entries discarded from the molecular-replacement test set
| Count | Reason |
|---|---|
| 1 | Entry has been obsoleted in the PDB |
| 1 | Different cell or space group in the structure and data |
| 3 | Data completeness less than 90% |
| 4 | |
| 1 | Recalculated |
Entries discarded from the AlphaFold test set
| Count | Reason |
|---|---|
| 16427 | No PDB entries |
| 1335 | No PDB entries with superposed similarity between 20% and 90% |
| 9 | Error processing structure-factor data |
| 3 | Different cell or space group in the structure and data |
| 185 | Data completeness less than 90% |
| 151 | |
| 82 | Molecular replacement could not place all copies |
| 71 | |
Figure 1. Comparison of ModelCraft and the CCP4i Buccaneer pipeline for molecular replacement (left) and experimental phasing (right). The top row shows protein completeness for each structure and the bottom rows show completeness as a function of resolution and F-map correlation. Structures were split into three resolution and F-map correlation bins. Points show the mean completeness at the centre of each bin and the shaded area shows one standard error either side.
Figure 2. Extra completeness gained by using ModelCraft instead of the CCP4i Buccaneer pipeline against the extra time that it takes for the pipeline to finish for the molecular-replacement test set.
Figure 3. The mean change in completeness, Rwork and Rfree when individual steps are removed from the ModelCraft pipeline for the molecular-replacement test set. Error bars show one standard error above and below the mean.
Figure 4. Completeness of the protein structure built by the ModelCraft and CCP4i Buccaneer pipelines for the AlphaFold test set.
Figure 5. Rfree after refining the starting model with Sheetbend and then REFMAC and Rfree of the autobuilt structure from ModelCraft for the molecular-replacement test set.
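The binned summary described in the Figure 1 caption (mean completeness per bin, with one standard error either side) could be computed along the following lines. This is a minimal sketch, not the authors' analysis code. The function name `binned_mean_and_se`, the equal-count binning rule, the choice of bin centre and the toy numbers are all assumptions, since the caption does not say how the bin edges were chosen.

```python
import math
import statistics


def binned_mean_and_se(x, y, n_bins=3):
    """Split (x, y) pairs into equal-count bins by x and summarise y per bin."""
    pairs = sorted(zip(x, y))  # order structures by the binning variable
    size, extra = divmod(len(pairs), n_bins)
    results = []
    start = 0
    for i in range(n_bins):
        # Spread any remainder over the first few bins.
        stop = start + size + (1 if i < extra else 0)
        chunk = pairs[start:stop]
        start = stop
        xs = [p[0] for p in chunk]
        ys = [p[1] for p in chunk]
        mean = statistics.fmean(ys)
        # Standard error of the mean: sample standard deviation / sqrt(n).
        if len(ys) > 1:
            se = statistics.stdev(ys) / math.sqrt(len(ys))
        else:
            se = float("nan")
        # Assumed definition of the bin centre: midpoint of the x range.
        centre = (xs[0] + xs[-1]) / 2
        results.append({"centre": centre, "mean": mean, "se": se, "n": len(ys)})
    return results


# Made-up values for illustration only (resolution in angstroms, completeness as a fraction).
resolution = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
completeness = [0.9, 1.0, 0.8, 0.6, 0.5, 0.3]
bins = binned_mean_and_se(resolution, completeness)
```

Each returned entry gives one plotted point (`centre`, `mean`) and the half-width of the shaded band (`se`). The same mean and standard-error calculation applies to the error bars described for Figure 3.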