Sen Yao, Hunter N B Moseley.
Abstract
As the number of macromolecular structures in the worldwide Protein Data Bank (wwPDB) continues to grow rapidly, more attention is being paid to the quality of its data, especially for use in aggregated structural and dynamics analyses. In this study, we systematically analyzed 3.5 Å regions around all metal ions across all PDB entries with supporting electron density maps available from the PDB in Europe. All resulting metal ion-centric regions were evaluated with respect to four quality-control criteria involving electron density resolution, atom occupancy, symmetry atom exclusion, and regional electron density discrepancy. The resulting list of metal binding sites passing all four criteria possesses high regional structural quality and should be beneficial to a wide variety of downstream analyses. This study demonstrates an approach for the pan-PDB evaluation of metal binding site structural quality with respect to the underlying X-ray crystallographic experimental data represented in the available electron density maps of proteins. For non-crystallographers in particular, we hope to change the focus and discussion of structural quality from a global evaluation to a regional evaluation, since all structural entries in the wwPDB appear to have regions of both high and low structural quality.
Keywords: electron density analysis; metal binding site; metalloprotein; pdb-eda; regional protein structure analysis
Year: 2019 PMID: 31480623 PMCID: PMC6751499 DOI: 10.3390/molecules24173179
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. High- and low-quality examples for all four criteria: (A) Protein Data Bank (PDB) ID: 5FVN.F.405.CA, representative of high quality by passing all four criteria; (B) PDB ID: 1YV0.C.163.MG, resolution: 7 Å; (C) PDB ID: 3CIA.B.701.ZN, occupancy: 0.7; (D) PDB ID: 5ER8.A.706.MN, symmetry atoms nearby; (E) PDB ID: 3LZQ.A.200.CU, high discrepancy between calculated and observed electron density maps.
Tabulation of different elemental types of metal ions observed in the worldwide Protein Data Bank (wwPDB) with respect to four quality-control criteria.
| Metal | Number of PDB Entries | Number of Total Metal Ion Sites | Number with <2.5 Å Resolution | Number with High Occupancy | Number with No Nearby Symmetry Atoms | Number with No Significant Discrepancy Densities | Number That Pass All Criteria |
|---|---|---|---|---|---|---|---|
| Zn | 13,497 | 67,405 | 56,176(83%) | 62,550(93%) | 62,967(93%) | 23,883(35%) | 13,230(20%) |
| Mg | 13,225 | 85,080 | 30,537(36%) | 81,528(96%) | 81,463(96%) | 48,708(57%) | 18,305(22%) |
| Ca | 10,138 | 44,538 | 36,193(81%) | 41,929(94%) | 41,176(92%) | 23,836(54%) | 15,787(35%) |
| Fe | 8054 | 41,898 | 32,283(77%) | 39,207(94%) | 41,499(99%) | 20,881(50%) | 14,427(34%) |
| Na | 7516 | 23,697 | 16,645(70%) | 22,523(95%) | 21,072(89%) | 18,295(77%) | 10,700(45%) |
| Mn | 3177 | 14,347 | 9037(63%) | 12,089(84%) | 13,630(95%) | 8755(61%) | 4275(30%) |
| K | 2390 | 8819 | 5671(64%) | 7498(85%) | 7678(87%) | 5541(63%) | 2973(34%) |
| Ni | 1533 | 3578 | 2803(78%) | 2752(77%) | 2943(82%) | 2093(58%) | 986(28%) |
| Cu | 1469 | 6918 | 5913(85%) | 5548(80%) | 6474(94%) | 3676(53%) | 2111(31%) |
| Co | 1146 | 3601 | 2897(80%) | 2920(81%) | 3288(91%) | 1687(47%) | 976(27%) |
| Cd | 926 | 6535 | 5351(82%) | 4708(72%) | 4863(74%) | 2334(36%) | 624(10%) |
| Hg | 640 | 2302 | 1525(66%) | 808(35%) | 2223(97%) | 528(23%) | 11(0%) |
| U | 507 | 6032 | 5522(92%) | 4553(75%) | 5196(86%) | 2351(39%) | 1693(28%) |
| Pt | 242 | 869 | 564(65%) | 212(24%) | 802(92%) | 249(29%) | 4(0%) |
| Mo | 209 | 785 | 685(87%) | 505(64%) | 692(88%) | 323(41%) | 147(19%) |
| Al | 189 | 399 | 187(47%) | 390(98%) | 399(100%) | 217(54%) | 112(28%) |
| Be | 187 | 510 | 273(54%) | 461(90%) | 504(99%) | 318(62%) | 175(34%) |
| Ba | 166 | 900 | 399(44%) | 558(62%) | 733(81%) | 186(21%) | 6(1%) |
| Ru | 162 | 341 | 288(84%) | 163(48%) | 318(93%) | 113(33%) | 8(2%) |
| Sr | 151 | 3972 | 1394(35%) | 3764(95%) | 3869(97%) | 2846(72%) | 745(19%) |
| V | 143 | 488 | 285(58%) | 399(82%) | 462(95%) | 219(45%) | 130(27%) |
| Cs | 115 | 666 | 402(60%) | 251(38%) | 526(79%) | 226(34%) | 14(2%) |
| W | 96 | 1743 | 396(23%) | 1218(70%) | 1639(94%) | 280(16%) | 15(1%) |
| Yb | 91 | 247 | 189(77%) | 136(55%) | 127(51%) | 60(24%) | 6(2%) |
| Au | 90 | 437 | 275(63%) | 120(27%) | 373(85%) | 136(31%) | 2(0%) |
| Li | 73 | 124 | 110(89%) | 116(94%) | 109(88%) | 96(77%) | 72(58%) |
| Gd | 65 | 444 | 408(92%) | 268(60%) | 409(92%) | 134(30%) | 22(5%) |
| Pb | 62 | 229 | 113(49%) | 87(38%) | 187(82%) | 63(28%) | 5(2%) |
| Y | 58 | 218 | 168(77%) | 154(71%) | 108(50%) | 77(35%) | 16(7%) |
| Tl | 54 | 400 | 143(36%) | 119(30%) | 383(96%) | 82(21%) | 1(0%) |
| Ir | 51 | 333 | 132(40%) | 138(41%) | 317(95%) | 44(13%) | 0(0%) |
| Rb | 49 | 229 | 139(61%) | 73(32%) | 174(76%) | 83(36%) | 4(2%) |
| Sm | 45 | 205 | 106(52%) | 132(64%) | 142(69%) | 42(20%) | 11(5%) |
| Ag | 34 | 381 | 329(86%) | 361(95%) | 365(96%) | 67(18%) | 25(7%) |
| Pr | 31 | 77 | 56(73%) | 46(60%) | 40(52%) | 23(30%) | 4(5%) |
| Eu | 24 | 71 | 64(90%) | 14(20%) | 60(85%) | 23(32%) | 3(4%) |
| Pd | 24 | 108 | 108(100%) | 55(51%) | 79(73%) | 19(18%) | 2(2%) |
| Os | 23 | 101 | 34(34%) | 77(76%) | 97(96%) | 20(20%) | 3(3%) |
| Re | 21 | 71 | 71(100%) | 27(38%) | 68(96%) | 13(18%) | 3(4%) |
| Rh | 20 | 68 | 68(100%) | 25(37%) | 62(91%) | 18(26%) | 1(1%) |
| Tb | 18 | 168 | 139(83%) | 134(80%) | 157(93%) | 20(12%) | 3(2%) |
| Ta | 18 | 529 | 106(20%) | 42(8%) | 503(95%) | 199(38%) | 0(0%) |
| Lu | 15 | 62 | 46(74%) | 31(50%) | 54(87%) | 19(31%) | 0(0%) |
| Ho | 13 | 55 | 47(85%) | 43(78%) | 44(80%) | 7(13%) | 0(0%) |
| La | 11 | 115 | 107(93%) | 106(92%) | 112(97%) | 1(1%) | 1(1%) |
| Cr | 10 | 53 | 43(81%) | 49(92%) | 52(98%) | 8(15%) | 5(9%) |
| Ga | 10 | 80 | 80(100%) | 80(100%) | 80(100%) | 5(6%) | 5(6%) |
| Sn | 9 | 16 | 16(100%) | 6(38%) | 16(100%) | 2(13%) | 0(0%) |
| Sb | 5 | 10 | 4(40%) | 6(60%) | 10(100%) | 7(70%) | 3(30%) |
| Ce | 4 | 70 | 70(100%) | 66(94%) | 70(100%) | 0(0%) | 0(0%) |
| Er | 3 | 18 | 0(0%) | 17(94%) | 0(0%) | 1(6%) | 0(0%) |
| Zr | 3 | 31 | 28(90%) | 30(97%) | 0(0%) | 0(0%) | 0(0%) |
| In | 2 | 3 | 1(33%) | 3(100%) | 0(0%) | 1(33%) | 0(0%) |
| Bi | 2 | 2 | 2(100%) | 0(0%) | 2(100%) | 0(0%) | 0(0%) |
| Hf | 2 | 44 | 44(100%) | 43(98%) | 0(0%) | 10(23%) | 9(20%) |
| Dy | 1 | 26 | 26(100%) | 0(0%) | 0(0%) | 18(69%) | 0(0%) |
| Total | 66,819 | 330,448 | 218,698(66%) | 299,138(91%) | 308,616(93%) | 168,843(51%) | 87,660(27%) |
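The percentage in parentheses in each cell appears to be the count divided by the total number of metal ion sites in that row, rounded to a whole number. A minimal sketch of that arithmetic follows, using counts copied from the table; the function name `pass_rate` is illustrative and not part of the study's software.

```python
def pass_rate(count: int, total: int) -> int:
    """Return count/total as a whole-number percentage, as in the table cells."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total)


# (sites passing all four criteria, total metal ion sites), copied from the table
rows = {
    "Zn": (13230, 67405),
    "Mg": (18305, 85080),
    "Hg": (11, 2302),
    "Total": (87660, 330448),
}

for metal, (passed, total) in rows.items():
    print(f"{metal}: {passed}/{total} = {pass_rate(passed, total)}%")
```

The same function reproduces the other percentage columns when given the corresponding count, for example the number of Mg sites with <2.5 Å resolution.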
Figure 2. Distribution and 2.5 Å cutoff of structure resolutions: (A) Zn; (B) Ca; (C) Fe; (D) Mn; (E) Cu; (F) all metal ions.
Figure 3. Distribution of electron discrepancy within 3.5 Å of the metal ion: (A) Zn; (B) Ca; (C) Fe; (D) Mn; (E) Cu; (F) all metal ions. The red line represents a standard deviation cutoff of 1, calculated from the distribution in graph F with outliers removed.
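The four quality-control criteria can be read as a per-site filter. The sketch below illustrates that idea under stated assumptions and is not the pdb-eda implementation. The 2.5 Å resolution cutoff comes from the table header and Figure 2. The occupancy threshold of 1.0 is an assumption, since the text only says "high occupancy". The discrepancy cutoff assumes the regional discrepancy is expressed in standard-deviation units, following the Figure 3 caption. The `MetalSite` class, its fields, and the example sites are hypothetical.

```python
from dataclasses import dataclass

RESOLUTION_CUTOFF = 2.5   # Å; sites must be below this (table header, Figure 2)
OCCUPANCY_MIN = 1.0       # ASSUMPTION: "high occupancy" taken to mean full occupancy
DISCREPANCY_CUTOFF = 1.0  # ASSUMPTION: discrepancy in standard-deviation units (Figure 3)


@dataclass
class MetalSite:
    """Hypothetical record for the 3.5 Å region around one metal ion."""
    site_id: str
    resolution: float         # Å
    occupancy: float          # 0.0 to 1.0
    has_symmetry_atoms: bool  # symmetry-related atoms inside the region
    discrepancy: float        # regional electron density discrepancy


def failed_criteria(site: MetalSite) -> list[str]:
    """Return the names of the criteria this site fails (empty if it passes all)."""
    failed = []
    if not site.resolution < RESOLUTION_CUTOFF:
        failed.append("resolution")
    if site.occupancy < OCCUPANCY_MIN:
        failed.append("occupancy")
    if site.has_symmetry_atoms:
        failed.append("symmetry")
    if not site.discrepancy < DISCREPANCY_CUTOFF:
        failed.append("discrepancy")
    return failed


# Placeholder sites, not real PDB data; the 7 Å and 0.7 values echo Figure 1 (B) and (C).
sites = [
    MetalSite("EXAMPLE1.A.1.CA", 1.8, 1.0, False, 0.4),
    MetalSite("EXAMPLE2.A.1.MG", 7.0, 1.0, False, 0.4),
    MetalSite("EXAMPLE3.A.1.ZN", 1.8, 0.7, False, 0.4),
    MetalSite("EXAMPLE4.A.1.MN", 1.8, 1.0, True, 0.4),
    MetalSite("EXAMPLE5.A.1.CU", 1.8, 1.0, False, 2.3),
]

for site in sites:
    failed = failed_criteria(site)
    print(site.site_id, "passes all criteria" if not failed else f"fails: {failed}")
```

Reporting which criteria fail, rather than a single pass/fail flag, mirrors the per-criterion columns of the table and makes it easy to see which check removes the most sites for a given metal.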