Pavel V Afonine, Ralf W Grosse-Kunstleve, Alexandre Urzhumtsev, Paul D Adams.
Abstract
Rigid-body refinement is the constrained coordinate refinement of one or more groups of atoms that each move (rotate and translate) as a single body. The goal of this work was to establish an automatic procedure for rigid-body refinement that implements a practical compromise between runtime requirements and convergence radius. This has been achieved by analysis of a large number of trial refinements for 12 classes of random rigid-body displacements (that differ in magnitude of introduced errors), using both least-squares and maximum-likelihood target functions. The results of these tests led to a multiple-zone protocol. The final parameterization of this protocol was optimized empirically on the basis of a second large set of test refinements. This multiple-zone protocol is implemented as part of the phenix.refine program.
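To make the notion of a rigid-body displacement concrete, the following minimal sketch moves a small group of atoms as a single body and measures the resulting r.m.s.d. It is an illustration only, not the phenix.refine implementation: the function names (`move_rigid_body`, `rmsd`), the restriction to a rotation about the z axis through the group centroid, and the toy coordinates are all assumptions made for brevity.

```python
import math

def centroid(coords):
    """Geometric centre of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def move_rigid_body(coords, angle_deg, translation):
    """Rotate all atoms about the z axis through the group centroid,
    then translate them. Simplified, hypothetical parameterization:
    a general rigid body has three rotational degrees of freedom."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    cx, cy, _ = centroid(coords)
    tx, ty, tz = translation
    moved = []
    for x, y, z in coords:
        dx, dy = x - cx, y - cy
        moved.append((cx + cos_a * dx - sin_a * dy + tx,
                      cy + sin_a * dx + cos_a * dy + ty,
                      z + tz))
    return moved

def rmsd(a, b):
    """Root-mean-square deviation between two equally ordered atom lists."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

# Toy coordinates in Å (made up for illustration).
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0), (0.0, 1.5, 1.0)]
displaced = move_rigid_body(atoms, 5.0, (2.0, 0.0, 0.0))
print(round(rmsd(atoms, displaced), 3))
```

Because every atom receives the same rotation and translation, all interatomic distances within the group are unchanged; only the six rigid-body parameters (three in this reduced sketch plus the translation) would be refined.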
Year: 2009 PMID: 19649324 PMCID: PMC2712840 DOI: 10.1107/S0021889809023528
Source DB: PubMed Journal: J Appl Crystallogr ISSN: 0021-8898 Impact factor: 3.304
Table 1. Overview of structures used in tests
Resol. is the high-resolution limit (Å) of the observed data. NA is the number of atoms used in refinement (protein and nucleic acid only). The PDB ID column refers to related Protein Data Bank (Berman et al., 2000) entries with the same space group and a similar unit cell. In some cases, the data and model deposited in the PDB are slightly different from those used in the tests. NB is the number of rigid bodies; an entry in this column marks the ten structures used in the second test series (see §2.3).
| Database ID | Resol. | NA | PDB ID | NB |
|---|---|---|---|---|
| group2-intron | 3.5 | 1497 | 1kxk | 1 |
| synaptotagmin | 3.2 | 2186 | 1dqv | |
| 1029B | 3.0 | 9230 | 1n0e | |
| 1038B | 3.0 | 11038 | 1lql | 5 |
| 1071B | 3.0 | 6558 | 1nf2 | 6 |
| proteasome | 2.9 | 24927 | 1q5q | |
| sec17 | 2.9 | 2217 | 1qqe | |
| cp-synthase | 2.8 | 4331 | 1l1e | |
| penicillopepsin | 2.8 | 2366 | 3app | |
| s-hydrolase | 2.8 | 6666 | 1a7a | |
| ut-synthase | 2.8 | 7504 | 1e8c | |
| gere | 2.7 | 3060 | 1fse | |
| groel | 2.7 | 26957 | 1oel | |
| aep-transaminase | 2.6 | 16698 | 1m32 | 4 |
| rab3a | 2.6 | 2431 | 1zbd | |
| a2u-globulin | 2.5 | 5148 | 2a2u | 4 |
| flavin-reductase | 2.5 | 3385 | 1bkj | |
| p32 | 2.5 | 4265 | 1p32 | |
| psd-95 | 2.5 | 2180 | 1jxm | |
| qaprtase | 2.5 | 12570 | 1qpo | 1 |
| rnase-s | 2.5 | 1488 | 1rge | |
| 1102B | 2.5 | 2662 | 1l2f | |
| rh-dehalogenase | 2.45 | 2336 | 1bn7 | |
| armadillo | 2.4 | 3458 | 3bct | |
| cyanase | 2.4 | 11970 | 1dw9 | |
| fusion-complex | 2.4 | 7025 | 1sfc | |
| human-otc | 2.4 | 2528 | 1ep9 | |
| mev-kinase | 2.4 | 2506 | 1kkh | |
| nsf-d2 | 2.4 | 1943 | 1nsf | |
| granulocyte | 2.35 | 1908 | 2gmf | |
| oat-gabaculine | 2.3 | 9450 | 1gbn | 2 |
| vmp | 2.3 | 7992 | 1l8w | |
| gpatase | 2.25 | 7786 | 1ecf | |
| hn-rnp | 2.2 | 1338 | 1ha1 | |
| antitrypsin | 2.1 | 2985 | 1hp7 | |
| pdz | 2.1 | 1372 | 1kwa | |
| 1167B | 2.0 | 2920 | 1s12 | |
| apoferritin | 2.0 | 1354 | 1gwg | |
| cobd | 2.0 | 2738 | 1lkc | |
| synapsin | 2.0 | 4636 | 1auv | 1 |
| tryparedoxin | 2.0 | 1145 | 1qk8 | |
| myoglobin | 1.9 | 1227 | 1n9x | |
| nsf-n | 1.9 | 1518 | 1qcs | |
| rop | 1.9 | 850 | 1f4n | |
| epsin | 1.8 | 1210 | 1edu | |
| gene-5 | 1.8 | 673 | 1vqb | 2 |
| ic lyase | 1.8 | 6484 | 1f61 | |
| mbp | 1.8 | 1760 | 1ytt | |
| p9 | 1.75 | 1062 | 1bkb | |
| 1063B | 1.7 | 1926 | 1lfp | |
| nitrite-reduct | 1.7 | 2582 | 1et7 | |
| insulin | 1.7 | 400 | 2bn3 | |
| lysozyme | 1.5 | 982 | 1aki | |
| rnase-p | 1.5 | 3607 | 1nz0 | |
| calmodulin | 1.1 | 1150 | 1exr | 2 |
| hipip | 0.8 | 616 | 1iua | |
Figure 1. Example of a success rate plot. The horizontal axis designates the number of refinement macro cycles and the vertical axis designates the success rate in percent (see §2.5). The solid line is the plot using a 0.25 Å r.m.s.d. threshold as the criterion for ‘success’, the dashed line with shorter segments is the plot using a 0.5 Å threshold, and the dashed line with the longer segments is the plot using a 1.0 Å threshold. The example plot was obtained for rnase-p with a fixed 6.0 Å high-resolution cutoff for the data, a random translational displacement magnitude of 2.0 Å and a random rotational displacement magnitude of 5°.
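The success rate plotted in Figure 1 can be sketched as the percentage of trial refinements whose r.m.s.d. to the reference model falls within a threshold. The snippet below is a hedged illustration: the function name `success_rate`, the use of "at or below" rather than "strictly below" the threshold, and the r.m.s.d. values are assumptions, since the exact definition is given in §2.5 and is not reproduced here.

```python
def success_rate(rmsds, threshold):
    """Percentage of trials whose r.m.s.d. (Å) is at or below the threshold.
    The inclusive comparison is an assumption made for this sketch."""
    if not rmsds:
        raise ValueError("at least one trial is required")
    hits = sum(1 for r in rmsds if r <= threshold)
    return 100.0 * hits / len(rmsds)

# Made-up r.m.s.d. values (Å) for eight trials after some macro cycle.
trial_rmsds = [0.10, 0.20, 0.40, 0.45, 0.80, 0.90, 1.50, 3.20]

# The three thresholds used for the three curves in the plot.
rates = {t: success_rate(trial_rmsds, t) for t in (0.25, 0.5, 1.0)}
print(rates)  # {0.25: 25.0, 0.5: 50.0, 1.0: 75.0}
```

Evaluating this once per macro cycle would give one point on each of the three curves.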
Figure 2. Success rate plots using the LS target function. The high-resolution values are in ångströms. The four-by-four grid for each high-resolution cutoff is arranged by rotational displacement magnitude in the horizontal direction from left to right (0, 5, 10, 15°), and translational displacement magnitude in the vertical direction downwards (0, 2, 4, 6 Å).
Figure 3. Success rate plots using the ML target function. See the caption of Fig. 2 for a guide to the plots.
Table 2. Comparison of success rates for different values of the n_ref(1)1 parameter (§3.2.1)
The first row and the first column show the parameter values. The diagonal and the redundant lower triangle are omitted. Each cell shows three triples of success rates, for the r.m.s.d. cutoffs 1.0 Å (first row), 0.5 Å (second row) and 0.25 Å (third row), respectively. The left value in each triple is the number of times the success rate obtained with the parameter value given by the corresponding row was at least 2% better than that with the parameter value given by the corresponding column (§2.5); the right value is the number of times the success rate obtained with the parameter value given by the corresponding column was at least 2% better than that with the parameter value given by the corresponding row; the value in the middle is the number of times the difference between the success rates was smaller than 2%.
| n_ref(1)1 | 80 | 100 | 120 | 200 | 400 |
|---|---|---|---|---|---|
| 60 | (26, 66, 28) | (25, 62, 33) | (28, 54, 38) | (30, 56, 34) | (38, 49, 33) |
| | (26, 65, 29) | (25, 63, 32) | (28, 53, 39) | (30, 56, 34) | (39, 48, 33) |
| | (26, 71, 23) | (25, 68, 27) | (27, 59, 34) | (27, 59, 34) | (34, 54, 32) |
| 80 | | (24, 72, 24) | (27, 63, 30) | (35, 53, 32) | (39, 55, 26) |
| | | (23, 73, 24) | (26, 64, 30) | (35, 54, 31) | (39, 55, 26) |
| | | (21, 77, 22) | (23, 67, 30) | (29, 60, 31) | (32, 62, 26) |
| 100 | | | (27, 69, 24) | (34, 66, 20) | (42, 55, 23) |
| | | | (27, 69, 24) | (33, 67, 20) | (41, 56, 23) |
| | | | (25, 73, 22) | (27, 73, 20) | (35, 63, 22) |
| 120 | | | | (28, 58, 34) | (37, 55, 28) |
| | | | | (28, 58, 34) | (37, 55, 28) |
| | | | | (23, 64, 33) | (31, 62, 27) |
| 200 | | | | | (33, 66, 21) |
| | | | | | (34, 65, 21) |
| | | | | | (31, 69, 20) |
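The triples in this table follow the counting rule described in its caption. A minimal sketch of that rule is given below; the function name `compare_rates`, the interpretation of "2%" as two percentage points of success rate, and the example values are assumptions for illustration.

```python
def compare_rates(row_rates, col_rates, margin=2.0):
    """Return (row_better, similar, col_better) for paired success rates (%).

    row_better: row parameter value is at least `margin` points better
    similar:    the two success rates differ by less than `margin` points
    col_better: column parameter value is at least `margin` points better
    """
    row_better = similar = col_better = 0
    for r, c in zip(row_rates, col_rates):
        diff = r - c
        if diff >= margin:
            row_better += 1
        elif -diff >= margin:
            col_better += 1
        else:
            similar += 1
    return (row_better, similar, col_better)

# Made-up success rates for four test cases under two parameter values.
row = [50.0, 40.0, 90.0, 10.0]
col = [45.0, 41.0, 80.0, 30.0]
print(compare_rates(row, col))  # (2, 1, 1)
```

Under this rule the three values of a triple always add up to the number of paired test cases compared.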
Table 3. Comparison of success rates for different values of the multi_body_factor parameter (§3.2.1)
See caption of Table 2 for a guide to the data in this table. However, in this case the count in the middle of each triplet is given as a sum of two values: the first value is for zones that are different; the second value is for zones that are not affected by the parameter and therefore lead to exactly identical results [see equations (1)–(3) in §3.2; in this case the three one-body structures are insensitive to the multi_body_factor].
| multi_body_factor | 1.0 | 2.0 |
|---|---|---|
| 0.5 | (15, 44+36, 25) | (15, 36+36, 30) |
| | (15, 45+36, 24) | (15, 36+36, 30) |
| | (15, 50+36, 19) | (14, 42+36, 25) |
| 1.0 | | (17, 50+36, 14) |
| | | (18, 49+36, 14) |
| | | (15, 52+36, 14) |
Table 4. Comparison of success rates for different values of the zone_exponent parameter (§3.2.1)
See caption of Table 2 for a guide to the data in this table.
| zone_exponent | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|
| 1 | (5, 73, 42) | (6, 60, 54) | (13, 54, 53) | (19, 59, 42) | (20, 62, 38) |
| | (5, 73, 42) | (6, 60, 54) | (13, 55, 52) | (19, 59, 42) | (21, 61, 38) |
| | (5, 77, 38) | (6, 69, 45) | (13, 62, 45) | (19, 67, 34) | (21, 68, 31) |
| 2 | | (14, 63, 43) | (15, 70, 35) | (20, 74, 26) | (27, 66, 27) |
| | | (14, 63, 43) | (15, 71, 34) | (20, 74, 26) | (27, 66, 27) |
| | | (13, 71, 36) | (15, 76, 29) | (22, 77, 21) | (25, 73, 22) |
| 3 | | | (33, 62, 25) | (34, 71, 15) | (34, 75, 11) |
| | | | (33, 62, 25) | (34, 72, 14) | (34, 76, 10) |
| | | | (28, 67, 25) | (31, 75, 14) | (32, 78, 10) |
| 4 | | | | (30, 77, 13) | (34, 72, 14) |
| | | | | (31, 76, 13) | (34, 71, 15) |
| | | | | (30, 79, 11) | (32, 74, 14) |
| 5 | | | | | (23, 74, 23) |
| | | | | | (24, 73, 23) |
| | | | | | (20, 79, 21) |
Table 5. Comparison of success rates for different values of the n_zones parameter (§3.2.1)
See caption of Table 2 for a guide to the data in this table.
| n_zones | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|
| 3 | (14, 68, 38) | (2, 67, 51) | (10, 60, 50) | (9, 53, 58) | (8, 50, 62) | (7, 52, 61) |
| | (13, 68, 39) | (2, 66, 52) | (10, 60, 50) | (9, 53, 58) | (8, 50, 62) | (7, 51, 62) |
| | (13, 73, 34) | (2, 74, 44) | (9, 65, 46) | (9, 61, 50) | (8, 57, 55) | (7, 59, 54) |
| 4 | | (7, 70, 43) | (9, 70, 41) | (7, 65, 48) | (4, 59, 57) | (4, 62, 54) |
| | | (7, 70, 43) | (10, 68, 42) | (7, 65, 48) | (4, 59, 57) | (4, 62, 54) |
| | | (6, 75, 39) | (8, 73, 39) | (7, 69, 44) | (3, 65, 52) | (4, 68, 48) |
| 5 | | | (23, 73, 24) | (17, 76, 27) | (15, 70, 35) | (14, 68, 38) |
| | | | (24, 72, 24) | (17, 76, 27) | (15, 70, 35) | (14, 69, 37) |
| | | | (19, 77, 24) | (16, 77, 27) | (13, 74, 33) | (15, 73, 32) |
| 6 | | | | (18, 70, 32) | (11, 71, 38) | (13, 71, 36) |
| | | | | (18, 70, 32) | (11, 71, 38) | (13, 71, 36) |
| | | | | (17, 76, 27) | (11, 74, 35) | (13, 77, 30) |
| 7 | | | | | (18, 73, 29) | (17, 79, 24) |
| | | | | | (17, 74, 29) | (16, 80, 24) |
| | | | | | (14, 79, 27) | (16, 84, 20) |
| 8 | | | | | | (19, 87, 14) |
| | | | | | | (20, 85, 15) |
| | | | | | | (20, 90, 10) |
Table 6. Comparison of runtimes for different values of the n_zones parameter (§3.2.1)
The runtime statistics shown in each row (columns 2–4) are based on 10 × 12 values (number of test structures × number of displacement combinations). Columns 5–10 show the ratios of the mean runtimes (the mean in the given row divided by the mean for the smaller n_zones value heading the column).
| n_zones | Minimum (s) | Maximum (s) | Mean (s) | Ratio to 3 | Ratio to 4 | Ratio to 5 | Ratio to 6 | Ratio to 7 | Ratio to 8 |
|---|---|---|---|---|---|---|---|---|---|
| 3 | 15.06 | 571.2 | 222.567 | | | | | | |
| 4 | 16.69 | 717.6 | 263.241 | 1.18 | | | | | |
| 5 | 17.77 | 795.6 | 295.355 | 1.33 | 1.12 | | | | |
| 6 | 19.50 | 879.6 | 330.116 | 1.48 | 1.25 | 1.12 | | | |
| 7 | 21.61 | 1018.2 | 370.33 | 1.66 | 1.41 | 1.25 | 1.12 | | |
| 8 | 21.82 | 1063.8 | 407.154 | 1.83 | 1.55 | 1.38 | 1.23 | 1.10 | |
| 9 | 23.28 | 1191.6 | 438.968 | 1.97 | 1.67 | 1.49 | 1.33 | 1.19 | 1.08 |
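The ratio columns can be reproduced from the mean runtimes alone. The short sketch below recomputes them from the means listed in the table; the function name `ratio_table` and the rounding to two decimals are choices made for this illustration.

```python
# Mean runtimes (s) per n_zones value, copied from the table above.
mean_runtime = {
    3: 222.567, 4: 263.241, 5: 295.355, 6: 330.116,
    7: 370.33, 8: 407.154, 9: 438.968,
}

def ratio_table(means):
    """For each n_zones value, divide its mean runtime by the mean
    of every smaller n_zones value (lower-triangular table)."""
    keys = sorted(means)
    return {
        row: {col: round(means[row] / means[col], 2) for col in keys if col < row}
        for row in keys
    }

ratios = ratio_table(mean_runtime)
for n_zones, row in ratios.items():
    print(n_zones, row)
```

For example, going from three to nine zones roughly doubles the mean runtime (ratio 1.97), while each additional zone adds between about 8% and 18%.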
Table 7. Comparison of success rates using the two Euler angle conventions (§§2.1 and 3.3)
See caption of Table 2 for a guide to the data in this table.
| Convention | |
|---|---|
| (75, 40, 5) | |
| (91, 25, 4) | |
| (83, 35, 2) |
Table 8. Comparison of success rates using different values for the resolution at which the target function is switched from least squares to maximum likelihood (§3.4)
See caption of Table 3 for a guide to the data in this table.
| Switch resolution (Å) | 5 | 6 | 7 |
|---|---|---|---|
| 4 | (1, 56+60, 3) | (7, 102, 11) | (7, 102, 11) |
| | (0, 57+60, 3) | (5, 102, 13) | (5, 102, 13) |
| | (0, 57+60, 3) | (5, 102, 13) | (5, 102, 13) |
| 5 | | (8, 92+12, 8) | (8, 92+12, 8) |
| | | (7, 91+12, 10) | (7, 91+12, 10) |
| | | (6, 92+12, 10) | (6, 92+12, 10) |
| 6 | | | (0, 12+108, 0) |
| | | | (0, 12+108, 0) |
| | | | (0, 12+108, 0) |