Claudia Millán, Massimo Domenico Sammito, Airlie J McCoy, Andrey F Ziem Nascimento, Giovanna Petrillo, Robert D Oeffner, Teresa Domínguez-Gil, Juan A Hermoso, Randy J Read, Isabel Usón.
Abstract
Macromolecular structures can be solved by molecular replacement provided that suitable search models are available. Models from distant homologues may deviate too much from the target structure to succeed, despite an overall similar fold or even regions of very close geometry. Successful methods to make the most of such templates usually rely on the degree of conservation to select and improve search models. ARCIMBOLDO_SHREDDER uses fragments derived from distant homologues in a brute-force approach driven by the experimental data instead of by sequence similarity. The new algorithms implemented in ARCIMBOLDO_SHREDDER are described in detail, illustrating its characteristic aspects in the solution of new and test structures. In an advance on the previously published algorithm, which was based on omitting or extracting contiguous polypeptide spans, model generation now uses three-dimensional volumes respecting structural units. The optimal fragment size is estimated from the expected log-likelihood gain (eLLG) values computed assuming that a substructure can be found with a level of accuracy near that required for successful extension of the structure, typically below 0.6 Å root-mean-square deviation (r.m.s.d.) from the target. Better sampling is attempted through model trimming or decomposition into rigid groups and optimization through Phaser's gyre refinement. In addition, after model translation, packing filtering and refinement, models are either disassembled into predetermined rigid groups and refined (gimble refinement) or Phaser's LLG-guided pruning is used to trim the model of residues that are not contributing signal to the LLG at the target r.m.s.d. value. Phase combination among consistent partial solutions is performed in reciprocal space with ALIXE. Finally, density modification and main-chain autotracing in SHELXE serve to expand to the full structure and identify successful solutions.
The performance on test data and the solution of new structures are described.
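The abstract quotes an accuracy target of roughly 0.6 Å r.m.s.d. from the target structure. As a minimal sketch of what that number measures, the following computes the r.m.s.d. between two sets of paired atomic coordinates. It assumes the two structures are already superposed and that atoms are matched by index; the function name, the input format and the toy coordinates are illustrative assumptions, not part of ARCIMBOLDO_SHREDDER or Phaser.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between two lists of (x, y, z) tuples.

    Assumption: the structures are already superposed and atoms are
    paired by list index (hypothetical input format).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates, not taken from any structure in the study.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
target = [(0.0, 0.3, 0.0), (3.8, 0.0, 0.4), (7.6, 0.0, 0.0)]

value = rmsd(model, target)
print(f"r.m.s.d. = {value:.2f} Å; below 0.6 Å: {value < 0.6}")
```

In practice the r.m.s.d. values quoted in the figure captions are computed over a common core of Cα atoms after superposition, which this sketch leaves to an external tool.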
Keywords: ARCIMBOLDO_SHREDDER; fragment-based molecular replacement; molecular replacement; phasing; small fragments
Year: 2018 PMID: 29652256 PMCID: PMC5892878 DOI: 10.1107/S2059798318001365
Source DB: PubMed Journal: Acta Crystallogr D Struct Biol ISSN: 2059-7983 Impact factor: 7.652
X-ray data statistics for all structures used in this study
| | PPAD | LTG | Hhed2 | | |
|---|---|---|---|---|---|
| No. of copies in asymmetric unit | 1 | 1 | 4 | 1 | 1 |
| Space group | | | | | |
| Unit-cell parameters | | | | | |
| a (Å) | 58.63 | 163.98 | 78.02 | 45.92 | 47.86 |
| b (Å) | 60.36 | 163.98 | 94.86 | 45.92 | 116.29 |
| c (Å) | 113.88 | 56.71 | 140.27 | 148.03 | 150.74 |
| α (°) | 90 | 90 | 90 | 90 | 90 |
| β (°) | 90 | 90 | 90 | 90 | 90 |
| γ (°) | 90 | 120 | 90 | 120 | 90 |
| Resolution (Å) | 1.5 | 2.1 | 1.6 | 1.9 | 2.0 |
| 〈I/σ(I)〉 | 31.62 | 20.28 | 12 | 17.09 | 39.08 |
| Completeness (%) | 99.1 | 100 | 100 | 97.1 | 95 |
Figure 1. ARCIMBOLDO_SHREDDER workflow. The numbers reference the steps described in §3.1. Orange colour refers to input/output, blue to Phaser steps, red to ARCIMBOLDO steps and purple to SHELXE steps.
Summary of possible operations to modify the search models throughout the program flow
| Refinement strategy | Previous step | Next step | Description |
|---|---|---|---|
| gyre | Rotation search | Translation search | Refinement of rigid-body groups against the RF target |
| gimble | Rigid-body refinement | Density modification and initial correlation coefficient computation | Refinement of rigid-body groups against the TF target |
| LLG-guided pruning | Rigid-body refinement | Density modification and initial correlation coefficient computation | Trimming of residues from a rototranslated model that upon removal promote an increase of the LLG |
| Mend after translation | Packing check | | Superposition of the starting trimmed and annotated template over the solutions surviving the packing followed by |
| SHRED-LLG | Rotation search | | After rotation search and clustering with the template, systematic removal of residues in different ranges and scoring in a single function for every rotation in order to trim the model of its most incorrect parts |
Figure 2. Original solution of LTG. (a) Final structure (blue) versus the template used in ARCIMBOLDO_SHREDDER (orange). The r.m.s.d. between the structures is 4.6 Å over a core of 582 Cα atoms. (b) Coloured sticks show the solving fragments that clustered together and the black ribbon shows the final structure. (c) A detail of the SHELXE Fo·FOM electron-density maps with the Cα trace. Orange, initial map from phase combination; blue, final map after density modification and autotracing; both are contoured at 1σ.
Summary of the parameterization and the results of the three ARCIMBOLDO_SHREDDER runs that led to the successful solution of Hhed2
| | Run 1 | Run 2 | Run 3 |
|---|---|---|---|
| RMSD (Å) | 0.8, 0.5 | 0.8, 0.5 | 0.8, 0.5 |
| Model size (No. of residues) | 89 (template of 128) | 89 (template of 138) | 89 (template of 150) |
| Unique models | 95 | 112 | 128 |
| Correct solutions | 576 | 19 | 4 |
| Total solutions | 896 | 1396 | 1448 |
| Correct ratio | 0.64 | 0.014 | 0.0027 |
| Lowest wMPE (°) | 71.0 | 73.6 | 73.9 |
| Top CC for phase cluster (%) | 38.0 (starting phase set from combination of three monomers) | 38.6 (starting phase set from a single monomer) | 38.0 (starting phase set from a single monomer) |
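The wMPE rows in these tables report a weighted mean phase error in degrees against the final structure. A minimal sketch of such a figure of merit follows, assuming both phase sets share the same origin and list the same reflections in the same order. The weighting scheme is not stated in this section, so `weights` is a placeholder argument, and the function name is hypothetical rather than a SHELXE or ALIXE call.

```python
def weighted_mean_phase_error(phases_a, phases_b, weights):
    """Weighted mean of absolute phase differences, in degrees.

    Each difference is wrapped into [0, 180] because phases are periodic.
    Assumptions: common origin, reflections paired by index, and a
    caller-supplied weight per reflection (placeholder for the real scheme).
    """
    if not (len(phases_a) == len(phases_b) == len(weights)) or not weights:
        raise ValueError("need three non-empty lists of equal length")
    weighted_sum = 0.0
    weight_total = 0.0
    for pa, pb, w in zip(phases_a, phases_b, weights):
        diff = abs(pa - pb) % 360.0
        if diff > 180.0:
            diff = 360.0 - diff  # take the shorter way around the circle
        weighted_sum += w * diff
        weight_total += w
    if weight_total == 0:
        raise ValueError("weights must not sum to zero")
    return weighted_sum / weight_total


# Toy phases and weights, not data from the study.
trial = [10.0, 350.0, 0.0]
reference = [350.0, 10.0, 90.0]
print(weighted_mean_phase_error(trial, reference, [1.0, 1.0, 2.0]))
```

For uniformly random phases the expected unweighted error is 90°, which gives a rough scale for reading the tabulated values.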
Figure 3. Annotation levels for the 4urf model. (a) First-level annotation groups. (b) Second-level annotation separating the β-sheet and independent helices.
Summary of parameterization and results of the tests performed with the LTG structure. Runs 5, 6, 7 and 8 vary the starting RMSD parameter and the model size.
| | Run 1: No | Run 2: Default | Run 3: LLG-guided pruning | Run 4: VRMS refinement | Run 5 | Run 6 | Run 7 | Run 8 |
|---|---|---|---|---|---|---|---|---|
| RMSD (Å) | 1.0 | 1.0, 1.2 | 1.0, 1.2 | 1.0, 1.2 | 2.0 | 2.0 | 3.0 | 3.0 |
| Model size (No. of residues) | 128 | 128 | 128 | 128 | 127 | 180 | 127 | 180 |
| Cycles of | 0 | 2 | 2 | 2 | 1 | 1 | 1 | 1 |
| Unique models | 417 | 417 | 417 | 417 | 408 | 436 | 408 | 436 |
| eLLG | 28.4 | 28.4 | 28.4 | 28.4 | 1.7 | 3.4 | 0.17 | 0.34 |
| Correct solutions | 205 | 295 | 450 | 296 | 135 | 136 | 5 | 23 |
| Total solutions | 1228 | 2162 | 3201 | 2132 | 1852 | 1756 | 852 | 1012 |
| Correct ratio | 0.17 | 0.14 | 0.14 | 0.14 | 0.07 | 0.07 | 0.006 | 0.02 |
| Best wMPE (°) | 66.3 | 61.8 | 61.7 | 63.1 | 66.6 | 67.8 | 76.9 | 72.6 |
| Top CC (%) | 33.88 | 34.76 | 31.84 | 32.19 | 30.79 | 31.39 | 10.92 | 32.24 |
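The "Correct ratio" row is the number of correct solutions divided by the total number of solutions. The short sketch below recomputes it from the two rows above it in the LTG table; the run labels follow the column order of the table, and the dictionary layout and function name are illustrative choices.

```python
# (correct solutions, total solutions) per run, copied from the LTG table.
runs = {
    "run 1": (205, 1228),
    "run 2": (295, 2162),
    "run 3": (450, 3201),
    "run 4": (296, 2132),
    "run 5": (135, 1852),
    "run 6": (136, 1756),
    "run 7": (5, 852),
    "run 8": (23, 1012),
}


def correct_ratio(correct, total):
    """Fraction of solutions that are correct."""
    if total <= 0:
        raise ValueError("total number of solutions must be positive")
    return correct / total


for name, (correct, total) in runs.items():
    print(f"{name}: {correct}/{total} = {correct_ratio(correct, total):.3f}")
```

Read this way, the runs with a starting RMSD of 3.0 Å (runs 7 and 8) stand out for both fewer correct solutions and a much lower ratio than the others.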
Figure 4. Tests on the LTG structure. Each scatter plot corresponds to a correct rotation cluster. In (c), (d), (e) and (g) the horizontal axis represents the number of the central residue of the model. (a) First-level annotation groups. (b) Second-level groups of helices. (c) wMPE versus model centre for solutions in gyre and gimble refinement run 2. (d) wMPE for solutions in the run with one cycle of gyre refinement at 2.0 Å RMSD (run 5). (e) wMPE for all solutions in the run with LLG-based pruning (run 3). (f) wMPE against the number of residues trimmed from each solution after LLG-based pruning in run 3. (g) wMPE versus model centre for solutions in the VRMS refinement run (run 4). A red colour marks solutions that have been prioritized for expansion. (h) VRMS against wMPE for all solutions.
Figure 5. PPAD tests. In runs 1 and 2 coil residues were kept, and run 2 included LLG-guided pruning. In run 3 coil was removed and the models were subjected to gyre and gimble refinement. (a) Superposition between the 1zbr template (orange) and the final structure (blue). The r.m.s.d. is 1.57 Å for a core of 231 Cα atoms. (b) First level of annotation for the decomposition used in run 3. (c) wMPE of solutions versus the model centre in run 2. (d) Number of residues removed by the LLG-guided pruning against wMPE in run 2. (e) The coloured cartoon shows solving fragments from run 2 that clustered together and the grey ribbon shows the final structure. (f) R.m.s.d. to the final structure for each of the three correct fragments in run 3. Values at different refinement stages are calculated over a common core.
Summary of the parameterization and the results of the tests performed with the PPAD structure
| | Maintain coil | Maintain coil, prune | Remove coil |
|---|---|---|---|
| RMSD (Å) | 0.8 | 0.8 | 0.8 |
| Model size (No. of residues) | 101 | 101 | 101 |
| Unique models | 335 | 335 | 160 |
| eLLG | 60 | 60 | 60 |
| Correct solutions | 32 | 48 | 6 |
| Total solutions | 1652 | 2478 | 1504 |
| Correct ratio | 0.019 | 0.019 | 0.0039 |
| Best wMPE (°) | 72.7 | 72.1 | 67.7 |
| Top CC (%) | 30.69 | 31.43 | 31.05 |
Figure 6. Tests on PDB entry 1yzf. (a) Final structure (blue) versus the template used in ARCIMBOLDO_SHREDDER (orange). The r.m.s.d. between the structures computed with super in PyMOL is 2.4 Å over a core of 121 Cα atoms. (b) Community clustering groups. (c) β-Sheet and independent helices grouping.
Figure 7. Tests on PDB entry 3fp2. (a) Final structure (blue) versus the 1w3b template used in ARCIMBOLDO_SHREDDER (orange). The r.m.s.d. between the structures is 4.95 Å over a core of 208 Cα atoms. (b) First level of annotation for refinement. (c) Second level of annotation for refinement. (d) R.m.s.d. of each of the three correct fragments to the final structure and over a common core using different refinement stages.