Jordi Rius, Xavier Torrelles.
Abstract
The incorporation of the new peakness-enhancing, fast-Fourier-transform-compatible ipp procedure (ipp = inner-pixel preservation) into the recently published SM algorithm based on |ρ| [Rius (2020). Acta Cryst. A76, 489-493] improves its phasing efficiency for larger crystal structures with atomic resolution data. Its effectiveness is demonstrated on a collection of test crystal structures (taken from the Protein Data Bank), starting either from random phase values or from the randomly shifted modulus function (a Patterson-type synthesis) as the initial ρ estimate. In the presence of medium scatterers (e.g. S or Cl atoms), crystal structures with 1500 × c atoms in the unit cell (c = number of centerings) can be routinely solved. In the presence of strong scatterers like Fe, Cu or Zn atoms, this number increases to around 5000 × c atoms. The implementation of this strengthened SM algorithm is simple, since it only includes a few easy-to-adjust parameters.
Keywords: SM phasing algorithm; direct methods; ipp procedure; origin-free modulus sum function; structure solution; |ρ|-based phasing residual
Year: 2021 PMID: 34196295 PMCID: PMC8248888 DOI: 10.1107/S2053273321004915
Source DB: PubMed Journal: Acta Crystallogr A Found Adv ISSN: 2053-2733 Impact factor: 2.290
Figure 1. The recursive S-ipp phase refinement algorithm with enhanced peakness: (upper right corner) φ phase estimates (either initial or updated values) are combined with experimental |E|’s to obtain ρ, |ρ| and m ρ (the latter is stored). Next, the Fourier transform of |ρ| is calculated, leading to new |C| and α values; the former are used in the calculation of CC. The new α values are combined with the experimental (|E| − 〈|E|〉) (lower left corner), and their inverse Fourier transform, δ, is calculated. In the next step, the function δ is multiplied by the stored m ρ mask to give the η product function. Peakness in η is enhanced by applying the ipp density-modification procedure and, finally, the Fourier transform of the modified η supplies the updated φ phases. [Initial sets of φ estimates investigated in this article are either Φrnd (random phase values) or Φ (phase values corresponding to the Fourier coefficients of M′, i.e. the randomly shifted modulus function).]
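The cycle described in the Figure 1 caption can be sketched as a one-dimensional toy in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the record does not define the m ρ mask or the ipp procedure, so the mask is assumed here to be 1 where ρ exceeds t ρ times its r.m.s. value, and ipp is replaced by a simple placeholder that keeps the strongest positive η values. The names `dft`, `synthesis`, `s_ipp_cycle`, `t_rho` and `n_eta` are hypothetical.

```python
import cmath
import random


def dft(values, sign=-1):
    """Naive O(n^2) discrete Fourier transform; sign=+1 gives the unnormalised inverse."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * cmath.pi * h * k / n) for k, v in enumerate(values))
        for h in range(n)
    ]


def synthesis(amplitudes, phases):
    """Real-space function from amplitudes and phases (real part of the inverse DFT)."""
    coeffs = [a * cmath.exp(1j * p) for a, p in zip(amplitudes, phases)]
    n = len(coeffs)
    return [c.real / n for c in dft(coeffs, sign=+1)]


def s_ipp_cycle(e_abs, phases, t_rho=2.5, n_eta=4):
    """One toy refinement cycle following the order of steps in the Figure 1 caption."""
    n = len(e_abs)
    # phi + experimental |E| -> rho, |rho| and the stored mask
    rho = synthesis(e_abs, phases)
    sigma = (sum(r * r for r in rho) / n) ** 0.5  # r.m.s. value, ASSUMED as sigma
    mask = [1.0 if r > t_rho * sigma else 0.0 for r in rho]  # ASSUMED mask definition
    # Fourier transform of |rho| -> |C| and alpha
    c = dft([abs(r) for r in rho])
    c_abs = [abs(x) for x in c]
    alpha = [cmath.phase(x) for x in c]
    # (|E| - <|E|>) combined with alpha -> delta
    e_mean = sum(e_abs) / n
    delta = synthesis([a - e_mean for a in e_abs], alpha)
    # eta = delta * mask
    eta = [d * m for d, m in zip(delta, mask)]
    # PLACEHOLDER for ipp (not the real procedure): keep the strongest positive eta values
    cutoff = sorted(eta, reverse=True)[n_eta - 1]
    eta_mod = [e if (e > 0.0 and e >= cutoff) else 0.0 for e in eta]
    # Fourier transform of the modified eta -> updated phases
    new_phases = [cmath.phase(x) for x in dft(eta_mod)]
    return new_phases, c_abs


# Toy usage: three point "atoms" on a 32-pixel grid, random starting phases.
random.seed(0)
n = 32
rho_true = [0.0] * n
for pos, height in [(3, 5.0), (11, 3.0), (20, 4.0)]:
    rho_true[pos] = height
e_abs = [abs(x) for x in dft(rho_true)]
phases = [cmath.phase(x) for x in dft([random.random() for _ in range(n)])]
for _ in range(5):
    phases, c_abs = s_ipp_cycle(e_abs, phases, t_rho=1.0, n_eta=3)
```

The toy uses a lower `t_rho` than the values quoted in the tables because a 32-pixel grid has very few pixels above a 2.5 threshold; whether such a toy converges says nothing about the behaviour of the published algorithm.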
Figure 2. S-ipp phasing with initial random phases (Φrnd): variation of N η and CC with the iteration number for data set 1a7z (t η = 3.7). N is the number of non-H atoms in the unit cell.
Table 1. Data sets from the Protein Data Bank (PDB) used to compare the S-ipp and S phasing algorithms, corresponding to compounds with only weak scatterers (top five) or with weak and medium scatterers (remaining)
Residues = number of residues; c = number of centerings; N = number of non-H atoms in the unit cell (PDB); M and H2O = number of medium scatterers and refined water molecules; %Sol = solvent volume percentage; d min = minimum d spacing in Å of used reflection data; T = data collection temperature in K. (1a7y, 1ob4, 1a7z, 1alz, 2erl, 1a0m are rotating-anode or sealed-tube data sets; otherwise synchrotron data.)
| PDB code | Compound | Residues | Space group | N/c | M/c | H2O/c | %Sol | d min (Å) | T (K) |
|---|---|---|---|---|---|---|---|---|---|
| 1a7y | Actino D(1) | 33 | | 314 | – | 44 | 18 | 0.94 | 133 |
| 3sbn | Trichovirin(2) | 30 | | 444 | – | 32 | 24 | 0.95 | 100 |
| 1ob4 | Cephaibol A(3) | 17 | | 548 | – | 60 | 22 | 1.00 | 100 |
| 1a7z | Actino Z3(1) | 22 | | 1228 | 8Cl | 4 | 49 | 0.95 | 173 |
| 1alz | Gramicidin A(4) | 34 | | 1348 | – | 4 | 30 | 1.00 | 120 |
| 1byz | Alpha1-peptide(5) | 52 | | 479 | 1Cl | 30 | 27 | 0.91 | 100 |
| 2erl | Er-1 pheromone(6) | 40 | | 656 | 14S | 44 | 20 | 1.00 | 273 |
| 1p9g | Antifungal(7) | 41 | | 702 | 20S | 122 | 23 | 1.00 | 283 |
| 3nir | Crambin(8) | 48 | | 902 | 12S | 196 | 31 | 1.00 | 100 |
| 1a0m | Conotoxin(9) | 34 | | 1144 | 40S | 168 | 24 | 1.10 | 286 |
| 4lzt | Lysozyme(10) | 129 | | 1183 | 10S | 139 | 32 | 1.00 | 295 |
| 1f94 | Bucandin(11) | 63 | | 1232 | 20S | 236 | 35 | 1.02 | 100 |
| 1hhu | Balhimycin(12) | 28 | | 1310 | 16Cl | 250 | 22 | 0.89 | 100 |
| 3odv | Kaliotoxin(13) | 76 | P-1 | 1392 | 32S | 180 | 20 | 1.00 | 100 |
| 3psm | Plant defensin(14) | 94 | | 1882 | 16S | 366 | 45 | 0.98 | 100 |
| 3bcj | Aldose reductase(15) | 316 | | 7308 | 26S + 3P | 1374 | 43 | 0.78, 0.85 | 15 |

(1) Schäfer et al. (1998); (2) Gessmann et al. (2012); (3) Bunkóczi et al. (2003); (4) Burkhart et al. (1998); (5) Privé et al. (1999); (6) Anderson et al. (1996); (7) Xiang et al. (2004); (8) Schmidt et al. (2011); (9) Hu et al. (1998); (10) Walsh et al. (1998); (11) Kuhn et al. (2000); (12) Lehmann et al. (2002); (13) Pentelute et al. (2010); (14) Song et al. (2011); (15) Zhao et al. (2008).
Table 2. PDB data sets used to test the S-ipp and S phasing algorithms, corresponding to compounds with strong scatterers
Residues, space group, c, %Sol and d min as in Table 1. N = number of non-H atoms in the unit cell (PDB); M and S = number of medium and strong scatterers; H2O = number of refined water molecules. Data sets 2bf9, 8rxn and 1c7k were measured at room temperature; all others at 100 K.
| PDB code | Compound | Residues | Space group | N/c | (M+S)/c | H2O/c | %Sol | d min (Å) |
|---|---|---|---|---|---|---|---|---|
| 2bf9 | aPP(1) | 36 | | 768 | 2Zn | 164 | 31 | 1.00 |
| 8rxn | Rubredoxin(2) | 52 | | 1010 | 12S+2Fe | 204 | 35 | 1.00 |
| 1w3m | Tsushimycin(3) | 132 | | 1276 | 10Cl+24Ca | 191 | 35 | 1.00 |
| 2ov0 | Amicyanin(4) | 105 | | 2060 | 6S+2P+2Cu | 432 | 34 | 0.95 |
| 1c75 | Cytochrome 553(5) | 71 | | 2660 | 12S+4Fe | 500 | 38 | 0.97 |
| 3d1p | Transferase(6) | 120 | | 2702 | 2S+2Cl+4Se | 498 | 29 | 0.95 |
| 1pwl | Aldose reductase Br(7) | 316 | | 3030 | 14S+3P+1Br | 429 | 25 | 1.10 |
| 1a6m | Myoglobin(8) | 151 | | 3154 | 8S+2Fe | 372 | 36 | 1.00 |
| 41au | Geodin(9) | 161 | | 3278 | 2Ca+6Se† | 740 | 40 | 0.99 |
| 1eb6 | Deuterolysin(10) | 177 | | 3300 | 12S+2Zn | 518 | 39 | 1.00 |
| 1b0y | H42Q(11) | 85 | | 3348 | 36S+16Fe | 824 | 30 | 0.90 |
| 1x8q | Nitrophorin 4C(12) | 184 | | 3662 | 10S+2Fe | 720 | 24 | 0.90 |
| 2fdn | Ferredoxin(13) | 55 | | 3964 | 128S+64Fe | 768 | 35 | 1.00 |
| 3fsa | Azurin(14) | 125 | | 4488 | 36S+4Cu | 856 | 38 | 1.00 |
| 1c7k | Endoprotease Zn(15) | 132 | | 4532 | 12S+4Ca+4Zn | 464 | 37 | 1.00 |
| 3ks3 | H. C. anhydrase II(16) | 260 | | 5626 | 6S+2Zn | 962 | 41 | 0.95 |
| 1heu | L. A. dehydrogenase(17) | 748 | | 7618 | 58S+4Cd | 1297 | 50 | 1.15 |

(1) Glover et al. (1983); (2) Dauter et al. (1992); (3) Bunkóczi et al. (2005); (4) Carrell et al. (unpublished); (5) Benini et al. (2000); (6) Nocek et al. (unpublished); (7) El-Kabbani et al. (2004); (8) Vojtěchovský et al. (1999); (9) Fanfrlik et al. (2013); (10) McAuley et al. (2001); (11) Parisini et al. (1999); (12) Kondrashov et al. (2004); (13) Dauter et al. (1997); (14) Sato et al. (2009); (15) Kurisu et al. (2000); (16) Avvaru et al. (2010); (17) Meijers et al. (2001).
† Four disordered Se positions.
Table 3. Application of the S-ipp and S algorithms to crystal structures only containing weak scatterers (A1, A2 and B1 phasing strategies)
The t ρ parameter controlling the threshold of the m ρ mask is always 2.50. N/c as in Table 1; N = number of peaks showing up in the final E map above the n σρ threshold; CC = correlation coefficient between experimental and calculated modulus function; N iter = number of iterations to achieve convergence (n.c. = no convergence in 1000 iterations); t η is the parameter controlling the number N η of strongest η peaks; Q = N η(2)/N.
| PDB code | N/c | Phasing strategy | N peaks (n) | CC | N iter | t η | Q |
|---|---|---|---|---|---|---|---|
| 1a7y | 314 | A1 | 376 (1.1) | 0.86 | 〈 | 4.0 | 1.0 |
| | | A2 | 363 (1.1) | 0.86 | 〈 | – | – |
| | | B1 | 370 (1.1) | 0.85 | 〈 | 4.0 | 0.9 |
| 3sbn | 444 | A1 | 449 (1.1) | 0.82 | 〈 | 3.7 | 1.4 |
| | | A2 | 456 (1.1) | 0.85 | 〈 | – | – |
| | | B1 | 456 (1.1) | 0.82 | 〈 | 3.7 | 1.2 |
| 1ob4 | 548 | A1 | 564 (1.1) | 0.86 | 〈 | 3.7 | 1.1 |
| | | A2 | 573 (1.1) | 0.87 | 〈 | – | – |
| | | B1 | 569 (1.1) | 0.86 | 〈 | 3.7 | 1.0 |
| 1a7z | 1228 | A1 | 1279 (1.1) | 0.83 | 〈 | 3.7 | 1.4 |
| | | A2 | 1372 (1.1) | 0.85 | 〈 | – | – |
| | | B1 | 1281 (1.1) | 0.83 | 〈 | 3.7 | 1.0 |
| 1alz | 1348 | A1 | 1308 (1.5) | 0.84 | 136, 520; n.c. (48×) | 3.8 | 1.2 |
| | | A2 | – | – | n.c. (50×) | – | – |
| | | B1 | – | – | n.c. (50×) | 3.8 | 0.9 |
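The CC column is described only as a correlation coefficient between the experimental and calculated modulus function; the exact formula is not given in this record. As a minimal sketch, assuming an ordinary Pearson correlation over two equal-length lists of values (the function name `correlation_coefficient` and the sample numbers are hypothetical), it could be computed as follows:

```python
from math import sqrt


def correlation_coefficient(observed, calculated):
    """Pearson correlation between two equal-length sequences (ASSUMED form of CC)."""
    n = len(observed)
    if n != len(calculated) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_c = sum(calculated) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(observed, calculated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    if var_o == 0.0 or var_c == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_o * var_c)


# Made-up illustration values, not data from the article.
experimental = [1.2, 0.4, 2.1, 0.9, 1.6]
calculated = [1.0, 0.5, 1.9, 1.1, 1.4]
cc = correlation_coefficient(experimental, calculated)
```

A value near 1 means the two sets of values rise and fall together; the tables above report values between 0.72 and 0.88 for the solved structures.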
Table 4. Application of the S-ipp and S algorithms to crystal structures with medium scatterers
Upper and lower lines refer to phasing strategies B1 and B2, respectively (except for 3bcj). N, M, c as in Table 1; N = number of peaks showing up in the final E map above the n σρ threshold; CC = correlation coefficient between experimental and calculated modulus function; N iter = number of iterations to achieve convergence (n.c. = no convergence in 1000 iterations); t ρ, t η = parameters controlling, respectively, the threshold of the m ρ mask and the number N η of strongest η peaks; Q = N η(2)/N.
| PDB code | N/c (M/c) | N peaks (n) | CC | N iter | t ρ | t η | Q |
|---|---|---|---|---|---|---|---|
| 1byz | 479 (1Cl) | 520 (1.1) | 0.86 | 46, 56, 102, 134, 146 | 2.65 | 4.0 | 0.9 |
| | | 529 (1.1) | 0.86 | 234, 356, 535; n.c. (2×) | | – | – |
| 2erl | 656 (14S) | 610 (1.1) | 0.78 | 32, 77, 112, 251, 270 | 2.50 | 4.0 | 0.8 |
| | | 703 (1.1) | 0.83 | 330, 475, 731; n.c. (2×) | | – | – |
| 1p9g | 702 (20S) | 695 (1.4) | 0.85 | 24, 29, 30, 31, 43 | 2.60 | 3.8 | 1.1 |
| | | 769 (1.4) | 0.86 | 156, 174, 279, 296, 409 | | – | – |
| 3nir | 902 (12S) | 934 (1.1) | 0.86 | 24, 38, 46, 98, 127 | 2.70 | 4.0 | 0.7 |
| | | 921 (1.1) | 0.88 | 211, 333, 666, 690; n.c. (1×) | | – | – |
| 1a0m | 1144 (40S) | 1124 (1.3) | 0.83 | 93, 97, 110, 158, 186 | 2.60 | 3.7 | 1.0 |
| | | 1342 (1.3) | 0.86 | 217, 344, 366, 510, 844 | | – | – |
| 4lzt | 1183 (10S) | 1134 (1.5) | 0.83 | 43, 47, 48, 49, 51 | 2.65 | 4.0 | 1.0 |
| | | – | – | n.c. (5×) | | – | – |
| 1f94 | 1232 (20S) | 1160 (1.1) | 0.81 | 108, 110, 111, 171, 189, 353, 834, 897; n.c. (17×) | 2.50 | 3.8 | 1.1 |
| | | 1230 (1.1) | 0.83 | 342, 681; n.c. (23×) | | – | – |
| 1hhu | 1310 (16Cl) | 1360 (1.4) | 0.82 | 〈 | 2.60 | 3.9 | 1.1 |
| | | 1378 (1.4) | 0.85 | 〈 | | – | – |
| 3odv | 1392 (32S) | 1480 (1.0) | 0.72 | 18, 23, 23, 27, 36 | 2.50 | 3.5 | 1.2 |
| | | 1499 (1.0) | 0.77 | 176, 249, 256, 290, 583 | | – | – |
| 3psm | 1882 (16S) | 1854 (1.4) | 0.78 | 23, 24, 27, 33, 45 | 2.60 | 4.0 | 0.9 |
| | | 2132 (1.4) | 0.82 | 309; n.c. (4×) | | – | – |
| 3bcj† | 7308 (26S) | 7222 (1.3) | 0.81 | 〈 | 2.65 | 3.8 | 1.5 |
| | | 7086 (1.3) | 0.82 | 〈 | | 3.8 | 1.4 |

† Upper and lower lines correspond to B1 at d min = 0.78 and 0.85 Å, respectively. N iter(max) = 200.
Table 5. Application of S-ipp to crystal structures containing strong scatterers (S) (strategy B1)
N = number of non-H atoms in the unit cell (PDB); c = number of centerings; N, CC, N iter, n.c., t ρ, t η and Q as in Table 3. [t ρ = 2.80 except for 1w3m (2.60), 3d1p (2.70), 1a6m (2.75), 41au (2.70) and 3fsa (2.70); 〈B Wilson〉 is 6.8 (1.1) Å², with the extrema being 5.3 for 2fdn and 9.1 for 1eb6.]
| PDB code | N/c (S/c) | N peaks (n) | CC | N iter | t η | Q |
|---|---|---|---|---|---|---|
| 2bf9 | 768 (2Zn) | 709 (1.1) | 0.81 | 10, 11, 12, 13, 15 | 3.5 | 2.2 |
| 8rxn | 1010 (2Fe) | 905 (1.1) | 0.83 | 14, 15, 17, 18, 22 | 3.5 | 1.4 |
| 1w3m | 1276 (24Ca) | 1275 (1.4) | 0.81 | 30, 33, 37, 42, 80 | 4.0 | 1.1 |
| 2ov0 | 2060 (2Cu) | 1990 (1.5) | 0.84 | 14, 15, 16, 16, 29 | 4.0 | 0.9 |
| 1c75 | 2660 (4Fe) | 2541 (1.4) | 0.82 | 12, 13, 14, 16, 16 | 4.0 | 1.1 |
| 3d1p | 2702 (4Se) | 2642 (1.1) | 0.83 | 12, 12, 14, 15, 16 | 3.8 | 1.1 |
| 1pwl | 3030 (1Br) | 3123 (1.1) | 0.83 | 32, 54, 57, 62, 149 | 3.8 | 1.2 |
| 1a6m | 3154 (2Fe) | 3203 (1.1) | 0.83 | 28, 30, 31, 37, 48 | 4.0 | 0.8 |
| 41au | 3278 (6Se†) | 3440 (1.1) | 0.85 | 41, 69, 79; n.c. (2×) | 3.8 | 1.0 |
| 1eb6 | 3300 (2Zn) | 3406 (1.1) | 0.82 | 16, 18, 23, 24, 40 | 4.0 | 0.9 |
| 1b0y | 3348 (16Fe) | 3360 (1.3) | 0.76 | 38, 39, 41, 53, 79 | 3.5 | 1.5 |
| 1x8q | 3662 (2Fe) | 3510 (1.5) | 0.83 | 34, 36, 58, 64, 92 | 4.0 | 1.0 |
| 2fdn | 3944 (64Fe) | 3832 (1.3) | 0.81 | 21, 21, 22, 23, 26 | 3.8 | 1.0 |
| 3fsa | 4488 (4Cu) | 4580 (1.5) | 0.83 | 31, 39, 40, 44, 56 | 4.0 | 1.0 |
| 1c7k | 4532 (4Zn) | 4548 (1.3) | 0.84 | 80, 96, 128, 202, 399 | 4.0 | 0.9 |
| 3ks3 | 5626 (2Zn) | 5588 (1.2) | 0.83 | 9, 10, 10, 10, 10 | 3.9 | 1.1 |
| 1heu | 7618 (4Cd) | 7603 (1.1) | 0.82 | 35, 40, 42, 45, 176 | 3.9 | 1.1 |

† Four Se atoms are partially disordered.
Table 6. Comparison of strategies B1 and B2 when applied to crystal structures with strong scatterers (S)
For B2, the individual N iter values are given; for B1, 〈N iter〉 corresponds to N iter values in Table 5. It is evident that B1 (using ipp) performs better than B2 in all cases. N = number of non-H atoms in the unit cell (PDB); c = number of centerings; 〈N iter〉 = average number of iterations to achieve convergence (n.c. = no convergence in 1000 iterations).
| PDB code | N/c (S/c) | B1 strategy 〈N iter〉 | B2 strategy N iter |
|---|---|---|---|
| 2bf9 | 768 (2Zn) | 12.2 (5×) | 29, 29, 36, 39, 55 |
| 8rxn | 1010 (2Fe) | 17.2 (5×) | 44, 53, 54, 56, 61 |
| 1w3m | 1276 (24Ca) | 44.4 (5×) | 97, 109, 118, 124, 134 |
| 2ov0 | 2060 (2Cu) | 18.0 (5×) | 52, 54, 61, 62, 86 |
| 1c75 | 2660 (4Fe) | 14.2 (5×) | 41, 51, 52, 57, 73 |
| 3d1p | 2702 (4Se) | 13.8 (5×) | 35, 38, 44, 45, 47 |
| 1pwl | 3030 (1Br) | 70.8 (5×) | n.c. (5×) |
| 1a6m | 3154 (2Fe) | 34.8 (5×) | n.c. (5×) |
| 41au | 3278 (6Se†) | 63.0 (3×) | 400, n.c. (4×) |
| 1eb6 | 3300 (2Zn) | 24.2 (5×) | 69, 86, 109, 156, 289 |
| 1b0y | 3348 (16Fe) | 50.0 (5×) | 210, 225, 243, 254, 259 |
| 1x8q | 3662 (2Fe) | 56.8 (5×) | 261, 317, 404, 961, n.c. |
| 2fdn | 3944 (64Fe) | 22.6 (5×) | 62, 77, 82, 93, 97 |
| 3fsa | 4488 (4Cu) | 42.0 (5×) | 163, 288, 324, 413, n.c. |
| 1c7k | 4532 (4Zn) | 181.0 (5×) | n.c. (5×) |
| 3ks3 | 5626 (2Zn) | 9.8 (5×) | 36, 42, 45, 46, 48 |
| 1heu | 7618 (4Cd) | 67.6 (5×) | 226, 259, 273, 534, n.c. |

† Four Se atoms are partially disordered.
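The B1 averages in this comparison can be reproduced from the individual N iter values listed for strategy B1 by averaging over the converged trials only. The short sketch below illustrates this for three data sets; the function name `mean_iterations` and the use of `None` to mark a non-converged (n.c.) trial are choices made for this example only.

```python
def mean_iterations(trials):
    """Average N_iter over converged trials; None marks a non-converged (n.c.) trial.

    Returns (mean rounded to one decimal, number of converged trials),
    or (None, 0) if no trial converged.
    """
    converged = [n for n in trials if n is not None]
    if not converged:
        return None, 0
    return round(sum(converged) / len(converged), 1), len(converged)


# Individual B1 values for three data sets, copied from the strong-scatterer results.
b1_trials = {
    "2bf9": [10, 11, 12, 13, 15],
    "41au": [41, 69, 79, None, None],  # n.c. (2x)
    "3ks3": [9, 10, 10, 10, 10],
}

for code, trials in b1_trials.items():
    average, count = mean_iterations(trials)
    print(f"{code}: <N_iter> = {average} ({count}x)")
# prints:
# 2bf9: <N_iter> = 12.2 (5x)
# 41au: <N_iter> = 63.0 (3x)
# 3ks3: <N_iter> = 9.8 (5x)
```

Excluding non-converged trials from the mean is why the count in parentheses matters: 63.0 (3×) for 41au summarizes three successful runs out of five, not all five.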
Figure 3. Effect of the ipp procedure on the phasing efficiency of the S algorithm with Φrnd. The two selected data sets belong to: (top) 3sbn (trichovirin) with 444 atoms in the unit cell; (bottom) 1a7z (Actino Z3) with 1228. True solutions obtained with/without the ipp procedure are shown in black/gray (same starting random phase values for each pair of trials).
Figure 4. S-ipp phasing with Φ: variation of N η and CC with the iteration number for data set 3ks3 (t η = 3.9). N = number of non-H atoms in the unit cell.
Figure 5. Unit-cell content of aldose reductase (Zhao et al., 2008; data set 3bcj) showing the two unique protein chains related by the screw axis along b, as obtained with the S-ipp phasing algorithm directly from the experimental modulus synthesis (Φ) by assuming P1 symmetry (S and light atoms are found simultaneously). Atoms with higher refined peak strength are shown in red.