
The oxygen-oxygen distance of water in crystallographic data sets.

Luigi Leonardo Palese

Abstract

Water is a key component of cellular biochemistry, and numerous water molecules are visible in crystallographic structures. Here we report a series of data sets of crystallographic water: a high resolution data set, a cytochrome c oxidase (subunit I) data set and a carbonic anhydrase data set. These data support the evidence that short distance water molecule pairs are present both at the surface and inside the cavities of proteins. These data are related to the article entitled "Oxygen-oxygen distances in protein-bound crystallographic water suggest the presence of protonated clusters" (Palese, 2020) [1].
© 2020 The Author.


Keywords:  Confined water; Protein surface; Proton; Water; Water cluster; X-ray crystallography

Year:  2020        PMID: 31970272      PMCID: PMC6965711          DOI: 10.1016/j.dib.2019.105076

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409



Data

Fig. 1 reports the atomic radial pair distribution function (RDF) in the region between 2.15 and 2.85 Å of the high resolution (HR) data set and of the HR subset not refined by SHELX (see Ref. [1]).
Fig. 1

The RDF of the protein-bound water oxygen atoms. Left panel reports the HR data set (see Table 1); right panel reports the HR data set not refined by SHELX (see Table 2). See Ref. [1] for details. Red crosses refer to the experimental RDF; colored curves indicated with dashed lines are the Gaussian curves used for the fitting, and their sum is reported as a thin red line.

Table 1

The HR data set.

1A6M 1B0Y 1BXO 1C75 1DY5 1EA7 1EB6 1EXR 1F94 1FN8 1FY4 1FY5 1G4I
1G6X 1GA6 1GCI 1GDN 1GDQ 1GQV 1GVK 1HJ8 1I1W 1IQZ 1IR0 1IUA 1J0P
1JFB 1K2A 1K6U 1KWF 1L9L 1LNI 1LUG 1M1Q 1M40 1MC2 1MJ5 1MN8 1MUW
1MXT 1N1P 1N4U 1N4V 1N4W 1N9B 1NQJ 1OAI 1OD3 1OEW 1OK0 1PQ5 1PQ7
1PQ8 1PWM 1R2M 1R6J 1RTQ 1SSX 1SY2 1SY3 1T2H 1TG0 1TQG 1TT8 1UG6
1UNQ 1US0 1V0L 1VB0 1VBW 1VL9 1VYR 1W0N 1X8P 1XMK 1XVO 1YK4 1YLJ
1YWA 1YWB 1YWC 1Z8A 1ZLB 1ZUU 1ZZK 2A6Z 2ABB 2AGT 2AT3 2AT8 2AYW
2B97 2BF6 2CE2 2CNQ 2CWS 2DDX 2E4T 2EWI 2EWK 2FDN 2FLA 2FMA 2FOU
2FVY 2GBA 2GEW 2GG2 2GGC 2H3L 2H5C 2H5D 2I4A 2IDQ 2IDS 2IDT
2IDU 2IXT 2JFR 2JJJ 2NLS 2NRL 2NX0 2O7A 2O9S 2OFR 2OV0 2P5K 2P74
2PEV 2PF8 2PFH 2PND 2PNE 2PVB 2PVE 2PYA 2PZN 2QCP 2QDV 2QDW 2QSK
2RBK 2V1M 2V8B 2VB1 2VHK 2VHR 2VI3 2VK5 2VU6 2WFI 2WFJ 2WUR 2X46
2XFR 2XJP 2XOD 2XU3 2Z6W 2ZPM 2ZQ7 2ZQA 3A02 3A4R 3AJ4 3AKQ 3AKS
3AKT 3B3R 3BOI 3C78 3CNJ 3D1P 3D43 3DHA 3E4G 3E6Z 3EA6 3FSA 3FYM
3G9X 3GHR 3GHS 3GOE 3GYI 3GYJ 3H31 3HGP 3I2Y 3I30 3I34 3I37 3IP0
3JUD 3K34 3KFF 3KLR 3KS3 3LEP 3LL1 3LL2 3LZ5 3M4H 3MFJ 3MI4 3NE0
3NED 3NIR 3O5P 3O5Q 3ODV 3P8J 3PUC 3QL9 3QM5 3QM6 3QM8 3QM9 3QMA
3QPA 3QPC 3S6E 3SOJ 3TEU 3TRV 3U2C 3U7C 3UI4 3UI6 3V1A 3VHG 3VIF
3VIG 3VII 3VJQ 3VN3 3VOR 3WCQ 3WGE 3WGX 3WL2 3WOU 3WVM 3X2G 3X2M
3X32 3X33 3X34 3X35 3ZR8 3ZSJ 3ZSK 3ZUC 3ZZP 4A02 4ACJ 4AQO 4AWS
4AWT 4BCT 4BJ0 4BM8 4BVM 4DP9 4DPB 4DRQ 4E3Y 4EIC 4EKF 4F18 4FPT
4FRC 4FU5 4G78 4G9S 4GA2 4GCA 4GNR 4HGU 4HVU 4HVW 4I8G 4I8H 4I8J
4I8K 4I8L 4IAU 4IGS 4J5E 4JP6 4K8Y 4KQP 4LAU 4LAZ 4LB3 4LB4 4LBR
4LBS 4LFS 4M7G 4MTY 4MZC 4N1I 4NPD 4O6Q 4O6U 4O8H 4OO4 4PRT 4PSS
4PSY 4PTH 4Q4G 4Q78 4Q9W 4QB3 4QBX 4QXI 4R5R 4REK 4RTZ 4TJZ 4TKB
4TKH 4TKJ 4UA6 4UA7 4UA9 4UAA 4UYR 4WJX 4WKA 4WPK 4X5P 4X6H 4XDX
4XOJ 4XZH 4Y5L 4Y9W 4YXI 4Z8J 4ZC9 4ZGF 4ZM7 5A8C 5AVD 5AVG 5B28
5B5H 5CMT 5CTM 5D66 5DGJ 5DJ7 5DK1 5DKM 5DP2 5DZE 5E1K 5E1N 5E9N
5EMB 5F82 5FLK 5HB7 5I5B 5IG6 5II8 5IQN 5IVN 5JDK 5JDT 5JIG 5JUG
5KWM 5KXV 5LJT 5LP9 5LXW 5MAE 5MAJ 5MB5 5MEH 5MK9 5MN1 5MNC 5MNG
5MNK 5MNN 5MTU 5NFK 5NFM 5NW3 5O2X 5O99 5OAV 5OBK 5OGO 5OME 5OTN
5OU0 5OUJ 5OUK 5SV5 5TDA 5TIF 5U3A 5VLE 5WS7 5X9L 5X9M 5XBU 5XP6
5XQU 5XQV 5XR0 5Y2R 5Y2S 5YCE 5ZGE 5ZGI 5ZGW 5ZGX 5ZGY 5ZGZ 5ZIO
5ZJ1 5ZJ7 5ZJ8 5ZJC 6B00 6CNW 6EQE 6ETK 6ETM 6EUW 6F1O 6F7R 6F81
6FMC 6FO5 6FVI 6G1I 6HMD 6HMQ 6I74 6J93 6JJ1 6JJQ 6MU9 6NFR 6Q2Y
6Q49 6Q4G 6Q4H 6Q6T 6RGP 6RHH 6RHU 6RHX 6RI0 6RI6 6RI8 6RII 7A3H
8A3H
Table 2

The no-SHELX HR data set.

1EB6 1OAI 1OD3 1R2M 1SY2 1SY3 1T2H 1UG6 1V0L 1W0N 1ZUU 2ABB 2AT3
2AT8 2AYW 2CE2 2CNQ 2DDX 2GG2 2GGC 2H3L 2I4A 2IXT 2NLS 2NRL 2NX0
2OFR 2P5K 2PND 2PNE 2QCP 2V1M 2VHK 2VHR 2VI3 2VU6 2XFR 2XU3 2Z6W
2ZQ7 2ZQA 3A02 3A4R 3AJ4 3AKQ 3AKS 3AKT 3BOI 3C78 3E4G 3E6Z 3FSA
3I2Y 3I30 3I34 3I37 3IP0 3JUD 3LL1 3LL2 3NED 3NIR 3O5Q 3P8J 3PUC
3QL9 3QM5 3QM6 3QM8 3QM9 3QMA 3QPA 3QPC 3TEU 3UI4 3UI6 3V1A 3VIF
3VIG 3VII 3VN3 3WCQ 3WGE 3WGX 3WL2 3WVM 3X2G 3X2M 3ZR8 3ZUC 4ACJ
4AQO 4BCT 4BJ0 4BVM 4DP9 4DPB 4DRQ 4E3Y 4F18 4G9S 4GA2 4GCA 4GNR
4HVU 4HVW 4IGS 4J5E 4K8Y 4KQP 4LAU 4LAZ 4LB3 4LB4 4LBR 4LBS 4M7G
4MTY 4MZC 4O6Q 4O6U 4O8H 4OO4 4PRT 4Q4G 4Q78 4Q9W 4QB3 4QBX 4QXI
4R5R 4REK 4RTZ 4TJZ 4TKB 4TKH 4TKJ 4WJX 4WKA 4WPK 4X5P 4X6H 4XDX
4XOJ 4XZH 4Y5L 4Y9W 4YXI 4Z8J 4ZC9 4ZM7 5A8C 5AVD 5AVG 5B28 5B5H
5CMT 5CTM 5DGJ 5DJ7 5DK1 5DKM 5DP2 5DZE 5E1N 5E9N 5EMB 5F82 5FLK
5HB7 5I5B 5IG6 5II8 5IQN 5IVN 5JDK 5JDT 5JIG 5JUG 5KWM 5KXV 5LJT
5LP9 5LXW 5MAE 5MAJ 5MB5 5MEH 5MN1 5MNC 5MNG 5MNK 5MNN 5NFK 5NFM
5NW3 5O2X 5O99 5OAV 5OBK 5OGO 5OME 5OTN 5OU0 5OUJ 5OUK 5SV5 5TDA
5TIF 5U3A 5WS7 5XBU 5XP6 5XQU 5XQV 5XR0 5Y2R 5Y2S 5YCE 5ZGE 5ZGI
5ZGW 5ZGX 5ZGY 5ZGZ 5ZIO 5ZJ1 5ZJ7 5ZJ8 5ZJC 6B00 6CNW 6EQE 6EUW
6F1O 6FMC 6FO5 6FVI 6G1I 6HMD 6HMQ 6I74 6J93 6JJ1 6JJQ 6MU9 6NFR
6Q2Y 6Q49 6Q4G 6Q4H 6Q6T 6RGP 6RHH 6RHU 6RHX 6RI0 6RI6 6RI8 6RII
7A3H 8A3H
Fig. 2 reports the RDF for the water oxygen atoms in the sodium free HR data set [1].
Fig. 2

The RDF of the water oxygen atoms in the sodium free HR data set (see Table 3) described in Ref. [1].

Table 3

The sodium free HR data set.

1B0Y 1DY5 1EA7 1F94 1G4I 1I1W 1IQZ 1IR0 1J0P 1JFB 1K6U 1LNI 1LUG
1M1Q 1MC2 1MJ5 1MUW 1N1P 1N4W 1N9B 1PWM 1R2M 1R6J 1SSX 1SY2 1SY3
1TG0 1TT8 1UG6 1V0L 1VB0 1VL9 1W0N 1X8P 1XMK 1YWA 1YWB 1YWC 1Z8A
1ZZK 2A6Z 2AGT 2AT3 2AT8 2AYW 2BF6 2CNQ 2EWI 2EWK 2FDN 2FMA 2FOU
2FVY 2GEW 2GG2 2GGC 2H3L 2H5C 2H5D 2JFR 2O9S 2OFR 2P74 2PEV 2PF8
2PFH 2PVE 2PZN 2WFI 2WFJ 2WUR 2XFR 2XU3 2Z6W 2ZPM 3A02 3A4R 3AKQ
3C78 3D1P 3D43 3DHA 3E4G 3EA6 3FSA 3FYM 3GHR 3GHS 3GOE 3I2Y 3I30
3I34 3I37 3JUD 3K34 3KFF 3LEP 3LZ5 3M4H 3NED 3NIR 3O5P 3O5Q 3PUC
3QL9 3QPA 3QPC 3SOJ 3TRV 3U2C 3U7C 3UI4 3UI6 3V1A 3VHG 3VIF 3VIG
3VII 3VJQ 3VOR 3WCQ 3WGE 3WGX 3WOU 3WVM 3X2G 3X2M 3ZR8 3ZUC 3ZZP
4ACJ 4AQO 4AWS 4AWT 4BCT 4BJ0 4DRQ 4E3Y 4EIC 4F18 4FU5 4G78 4G9S
4GA2 4GCA 4GNR 4IAU 4IGS 4J5E 4JP6 4LAU 4LAZ 4LB3 4LB4 4LBR 4LBS
4M7G 4O6U 4PRT 4PSS 4PSY 4PTH 4Q9W 4QB3 4QBX 4QXI 4RTZ 4TJZ 4TKB
4TKH 4TKJ 4UA6 4UA7 4UA9 4UAA 4XZH 4Y9W 4YXI 4Z8J 4ZC9 4ZGF 4ZM7
5A8C 5B28 5B5H 5DGJ 5DJ7 5DK1 5DP2 5E9N 5EMB 5F82 5FLK 5HB7 5IVN
5JDK 5JIG 5KWM 5MAE 5MAJ 5MEH 5MN1 5MNC 5MNG 5MNK 5MNN 5NFK 5NFM
5O2X 5O99 5OTN 5OU0 5OUJ 5OUK 5TDA 5TIF 5VLE 5XBU 5XP6 5YCE 5ZGE
5ZGY 5ZGZ 5ZIO 5ZJ1 5ZJ7 5ZJ8 5ZJC 6B00 6F7R 6F81 6FMC 6G1I 6MU9
6NFR 6Q2Y
Fig. 3 shows the RDF of the water oxygen atoms in the sodium free and not refined by SHELX subset described in Ref. [1] (see also Table 4 below).
Fig. 3

The RDF of the water oxygen atoms in the sodium free, not refined by SHELX subset (see Table 4) described in Ref. [1].

Table 4

The sodium free, no-SHELX HR data set.

1R2M 1SY2 1SY3 1UG6 1V0L 1W0N 2AT3 2AT8 2AYW 2CNQ 2GG2 2GGC 2H3L
2OFR 2XFR 2XU3 2Z6W 3A02 3A4R 3AKQ 3C78 3E4G 3FSA 3I2Y 3I30 3I34
3I37 3JUD 3NED 3NIR 3O5Q 3PUC 3QL9 3QPA 3QPC 3UI4 3UI6 3V1A 3VIF
3VIG 3VII 3WCQ 3WGE 3WGX 3WVM 3X2G 3X2M 3ZR8 3ZUC 4ACJ 4AQO 4BCT
4BJ0 4DRQ 4E3Y 4F18 4G9S 4GA2 4GCA 4GNR 4IGS 4J5E 4LAU 4LAZ 4LB3
4LB4 4LBR 4LBS 4M7G 4O6U 4PRT 4Q9W 4QB3 4QBX 4QXI 4RTZ 4TJZ 4TKB
4TKH 4TKJ 4XZH 4Y9W 4YXI 4Z8J 4ZC9 4ZM7 5A8C 5B28 5B5H 5DGJ 5DJ7
5DK1 5DP2 5E9N 5EMB 5F82 5FLK 5HB7 5IVN 5JDK 5JIG 5KWM 5MAE 5MAJ
5MEH 5MN1 5MNC 5MNG 5MNK 5MNN 5NFK 5NFM 5O2X 5O99 5OTN 5OU0 5OUJ
5OUK 5TDA 5TIF 5XBU 5XP6 5YCE 5ZGE 5ZGY 5ZGZ 5ZIO 5ZJ1 5ZJ7 5ZJ8
5ZJC 6B00 6FMC 6G1I 6MU9 6NFR 6Q2Y
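
Because the Table 4 subset is defined by applying both constraints at once (not refined by SHELX, and no sodium declared), it can be thought of as the set intersection of the Table 2 and Table 3 subsets. The sketch below illustrates this with four PDB codes read from the tables. The variable and function names are hypothetical, and the sample is deliberately tiny rather than the full lists.

```python
# Illustrative only: four PDB codes read from Tables 1-4, not the full lists.
hr_sample = {"1A6M", "1B0Y", "1EB6", "1R2M"}   # all four appear in Table 1
no_shelx_sample = {"1EB6", "1R2M"}             # of the four, those listed in Table 2
sodium_free_sample = {"1B0Y", "1R2M"}          # of the four, those listed in Table 3


def combine_subsets(no_shelx, sodium_free):
    """Return the codes that satisfy both constraints, sorted by PDB code."""
    return sorted(no_shelx & sodium_free)


both = combine_subsets(no_shelx_sample, sodium_free_sample)
print(both)  # codes of this sample expected in the sodium free, no-SHELX subset
```

Of the four sample codes, only the one present in both Table 2 and Table 3 is also listed in Table 4.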
Fig. 4 shows the scatter plot of the RDFs for the no-SHELX subset and the sodium free, no-SHELX subset [1]. The figure also reports the line of best fit, whose equation is y = 1.0191x - 0.0001 (R² = 0.9966).
Fig. 4

Scatter plot of the RDFs relative to the no SHELX subset (horizontal axis) and the sodium free, no SHELX subset (vertical axis) described in Ref. [1]. Solid black line is the best fit.
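
A line of best fit and its R² of the kind reported for Fig. 4 can be obtained by ordinary least squares. The following is a minimal sketch in plain Python; it is not the author's script. The function name `linear_fit` is hypothetical, and the demonstration points are synthetic values placed exactly on the reported line, not the RDF data of the article.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, with R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination: 1 - residual sum of squares / total sum of squares
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Synthetic points lying exactly on the reported line (NOT data from the article).
x_demo = [0.00, 0.01, 0.02, 0.03, 0.04]
y_demo = [1.0191 * xi - 0.0001 for xi in x_demo]
slope, intercept, r_squared = linear_fit(x_demo, y_demo)
print(slope, intercept, r_squared)
```

With the real data, `x` would hold the RDF values of the no-SHELX subset and `y` those of the sodium free, no-SHELX subset at the same distances.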

Fig. 5 reports the number of water molecules in cytochrome c oxidase (CcO) subunit I versus the structure resolution. Data were obtained from the PDB entries 5B1A, 5B1B, 5B3S, 5XDQ, 5ZCP and 5ZCQ (diffraction temperature 50 K; resolution 1.5 Å, 1.6 Å, 1.68 Å, 1.77 Å, 1.65 Å and 1.65 Å, respectively) and 1V54, 2DYR, 3AG2 and 3AG3 (diffraction temperature 100 K; all these structures have a resolution of 1.8 Å). Both copies of subunit I, labeled as A and N in the original pdb file, were considered.
Fig. 5

Number of crystallographic water molecules in CcO subunit I structures as a function of the resolution of the relative PDB entry. Structures diffracted at 100 K and 50 K have been considered for this analysis.

Fig. 6 and Fig. 7 report the Euclidean distance distribution of oxygen-oxygen (O–O) pairs for the data sets of CcO structures obtained at 50 K and 100 K, respectively. Two major peaks are present in both distributions, centered at 2.73 and 2.88 Å.
Fig. 6

Distribution of the Euclidean oxygen-oxygen (O–O) distances of water molecules in the CcO subunit I data set. Data are obtained from the 50 K structures. Only water pairs with O–O distances <5 Å are reported.

Fig. 7

Distribution of the Euclidean oxygen-oxygen (O–O) distances of water molecules in the CcO subunit I data set. Data are obtained from the 100 K structures. Only water pairs with O–O distances <5 Å are reported.
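
The distance lists behind Fig. 6 and Fig. 7 come from all pairwise O–O distances between water oxygens, keeping those below 5 Å and above the resolution of the entry. The article computed them with a Tcl program in VMD; the sketch below is a plain-Python stand-in under stated assumptions. It assumes standard fixed-column PDB records with waters named HOH, and the helper names and demo coordinates are hypothetical.

```python
import itertools
import math


def water_oxygens(pdb_lines):
    """Collect (chain, residue number, (x, y, z)) for water oxygens in PDB-format lines."""
    waters = []
    for line in pdb_lines:
        is_atom = line.startswith(("ATOM", "HETATM"))
        if is_atom and line[17:20] == "HOH" and line[12:16].strip() == "O":
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            waters.append((line[21], int(line[22:26]), xyz))
    return waters


def oo_distances(waters, d_min=0.0, d_max=5.0):
    """Return (residue_i, residue_j, distance) for pairs with d_min < distance < d_max."""
    pairs = []
    for (_, res1, p1), (_, res2, p2) in itertools.combinations(waters, 2):
        d = math.dist(p1, p2)
        if d_min < d < d_max:
            pairs.append((res1, res2, d))
    return pairs


def demo_line(serial, resseq, x, y, z, chain="A"):
    """Build one fixed-column HETATM record for a water oxygen (demo helper)."""
    return (f"HETATM{serial:5d}  O   HOH {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}  1.00 20.00           O")


# Hypothetical coordinates, NOT taken from any PDB entry.
demo_pdb = [
    demo_line(1, 601, 0.0, 0.0, 0.0),
    demo_line(2, 602, 2.4, 0.0, 0.0),
    demo_line(3, 603, 0.0, 2.8, 0.0),
    demo_line(4, 604, 10.0, 0.0, 0.0),
]
demo_waters = water_oxygens(demo_pdb)
# d_min plays the role of the resolution of the entry (1.5 Å assumed here).
demo_pairs = oo_distances(demo_waters, d_min=1.5, d_max=5.0)
for res1, res2, d in demo_pairs:
    print(res1, res2, round(d, 3))
```

For a real entry, the lines of the per-subunit pdb file would replace `demo_pdb`, and the resulting distances could then be binned into a histogram.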

Table 1 reports the PDB codes for the entries of the HR data set [1], containing 469 elements. Table 2 reports the PDB codes of the subset of the HR data set containing structures not refined by SHELX (the no-SHELX HR data set discussed in Ref. [1]). Table 3 reports the PDB codes of the subset of the HR data set containing structures in which no sodium is declared in the crystallization methods (the sodium free HR data set discussed in Ref. [1]). Table 4 reports the subset of the HR data set containing structures not refined by SHELX and in which the use of sodium in the crystallization conditions is not reported. Table 5 reports the statistics of water pairs analyzed in human carbonic anhydrase II (hCA II), as detailed in Ref. [1].
Table 5

The number of water molecules in each considered hCA II is reported (#H2O). The table reports also the number of calculated oxygen-oxygen Euclidean distances (# O–O) and the number of oxygen-oxygen pairs considered as putative, charged (protonated or deprotonated), water clusters as defined in Ref. [1] (# OHO).

          4MTY   4Q78   4YXI   5LJT   5OGO   5Y2R   5Y2S   6B00
# H2O      398    214    264    439    333    441    429    409
# O–O    79003  22791  34716  96141  55278  97020  91806  83436
# OHO    33119121710
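
The # O–O row is consistent with counting every unordered pair of water oxygens, that is N(N-1)/2 distances for N water molecules. The short check below assumes the flattened # H2O row of Table 5 is read as eight three-digit counts and the # O–O row as eight five-digit counts; the dictionary names are hypothetical.

```python
from math import comb

# Values read from Table 5 (# H2O and # O–O rows), keyed by PDB code.
n_water = {"4MTY": 398, "4Q78": 214, "4YXI": 264, "5LJT": 439,
           "5OGO": 333, "5Y2R": 441, "5Y2S": 429, "6B00": 409}
n_pairs = {"4MTY": 79003, "4Q78": 22791, "4YXI": 34716, "5LJT": 96141,
           "5OGO": 55278, "5Y2R": 97020, "5Y2S": 91806, "6B00": 83436}


def count_pairs(n):
    """Number of unordered pairs among n water oxygens, n * (n - 1) / 2."""
    return comb(n, 2)


consistent = {code: count_pairs(n) == n_pairs[code] for code, n in n_water.items()}
print(consistent)
```

Every entry agrees with the pair-count formula, which also supports the way the flattened rows were split.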
Table 6 reports the statistics of water pairs analyzed in subunit I of bovine CcO, as detailed in Ref. [1].
Table 6

The number of water molecules in each considered CcO subunit I is reported (#H2O). The table reports also the number of calculated oxygen-oxygen Euclidean distances (# O–O) and the number of oxygen-oxygen pairs considered as putative, charged (protonated or deprotonated), water clusters as defined in Ref. [1] (# OHO).

         5B1A_A  5B1A_N  5B1B_A  5B1B_N  5B3S_A  5B3S_N  5XDQ_A  5XDQ_N  5ZCP_A  5ZCP_N  5ZCQ_A  5ZCQ_N
# H2O       297     290     289     286     273     278     257     267     235     203     228     215
# O–O     43956   41905   41616   40755   37128   38503   32896   35511   27495   20503   25878   23005
# OHO         3       1       3       2       9       5       1       1       3       3       4       2
Table 7 contains the pseudo-code for the RDF calculations (see the Experimental Design, Materials, and Methods section).
Table 7

The RDF pseudo-code.

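# Loop over all molecules loaded in VMD and write the water RDF of each one
set num_mol [molinfo num]
for {set i 0} {$i < $num_mol} {incr i} {
    set name_prot [molinfo $i get name]
    set sel [atomselect $i "water"]
    # RDF of the water selection against itself
    set gr [measure gofr $sel $sel]
    set outfile [open gofr_$name_prot.dat w]
    # The first three lists returned are r, g(r) and the integral of g(r)
    set r [lindex $gr 0]
    set gr1 [lindex $gr 1]
    set igr [lindex $gr 2]
    foreach j $r k $gr1 l $igr {
        puts $outfile "$j $k $l"
    }
    close $outfile
}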
In the water_pairs.csv file in the Supplementary data, all the short distance water molecules (SDWMs) in the model proteins discussed in Ref. [1] are listed. These are pairs of water molecules whose O–O distance is in the range 2.29–2.50 Å, confirmed by inspection of the electron density maps at 1.0 and 4.0 sigma as described in Ref. [1]. The columns in this csv file are: the PDB id of the molecule, the residue numbers in the original pdb file of the two water molecules in the pair (two columns), and the O–O distance in this water pair obtained from the pdb coordinates. The water molecules shown in Fig. 2 and Fig. 4 of Ref. [1] are listed in this file.

The Supplementary file raw_data.zip contains the raw data used in this work. The files in this zipped archive are raw_data.txt, raw_RDF.csv, raw_RDF_CcO.csv, water_resolution.csv, 50K_distance.txt and 100K_distance.txt, which are described in detail below.

The file raw_data.txt lists the URLs for all the *.pdb, *.cif and *.ccp4 files considered in this work.

The file raw_RDF.csv contains all the raw RDF data used for calculating the plots shown in Fig. 1, Fig. 2, Fig. 3 and Fig. 4; these data were also used for the calculation of Fig. 1 in Ref. [1]. Each row corresponds to a crystallographic structure (PDB codes are in the first column).

The file raw_RDF_CcO.csv contains all the raw RDF data used for calculating the plot reported as Fig. 3 in Ref. [1]. Each row corresponds to a subunit I structure (A and N subunits in the PDB entries reported in the first column).

The file water_resolution.csv contains the number of water molecules in the CcO data set; these are the raw data for Fig. 5.

The file 50K_distance.txt reports all the Euclidean O–O distances of crystallographic water molecules in subunits I of CcO structures obtained by X-ray diffraction at 50 K (A and N subunits in the PDB entries 5B1A, 5B1B, 5B3S, 5XDQ, 5ZCP and 5ZCQ). Distances <5 Å, but above the experimental resolution reported in the corresponding PDB record, were considered. These are the raw data used to calculate the distribution shown in Fig. 6.

The file 100K_distance.txt reports all the Euclidean O–O distances of crystallographic water molecules in subunits I of CcO structures obtained by X-ray diffraction at 100 K (A and N subunits in the PDB entries 1V54, 2DYR, 3AG2 and 3AG3). Distances <5 Å, but above the experimental resolution reported in the corresponding PDB record, were considered. These are the raw data used to calculate the distribution shown in Fig. 7.

Experimental Design, Materials, and Methods

The X-ray structures were obtained from the Protein Data Bank (PDB) [2] (Table 1, Table 2, Table 3, Table 4; direct URLs to these data are in the raw_data.txt file in the Supplementary data). The HR data set was obtained by searching the PDB for structures satisfying the following constraints: resolution ≤1 Å; X-ray only; protein only; monomer only. Of these, we considered only structures whose diffraction pattern was obtained at 100 K. Some entries were discarded at the radial distribution function (RDF) calculation stage (typically small structures with few water molecules, which give an anomalous RDF). After these steps, the HR data set contained 469 entries. Two subsets were obtained from the HR data set by adding further constraints: (a) the absence of sodium in all buffers and solutions declared in the deposited methods (the sodium free HR data set), or (b) the absence of the SHELX program from the software reported in the deposited refinement methods (the no-SHELX HR data set). The entries in this last data set reported as hCA II are those considered as models for this enzyme (see Ref. [1]). From the HR data set, a further subset was obtained, containing only entries crystallized in the absence of sodium and not refined by SHELX.

For the bovine CcO data set, only structures with a resolution of 1.8 Å or better were considered for analysis. The high resolution data set of CcO contained the PDB entries (see Ref. [1]): 5B1A (fully oxidized state, pH 6.8), 5B1B (fully reduced state, pH 6.8), 5B3S (carbon monoxide-bound mixed-valence, pH 6.8), 5XDQ (fully oxidized state, pH 7.3), 5ZCP and 5ZCQ (azide-bound states obtained by long time exposure to 20 mM or 10 mM azide solutions, respectively, pH 6.8). All these structures, obtained at 50 K, are characterized by a resolution between 1.65 and 1.50 Å (5XDQ has been considered here, even if its resolution is 1.77 Å, because it is the structure with the highest resolution at alkaline pH). We have also considered a data set of structures obtained at 100 K, with a resolution of 1.8 Å: 1V54, 2DYR, 3AG2 and 3AG3, which are respectively fully oxidized, fully reduced, carbon monoxide-bound fully reduced and nitric oxide-bound fully reduced species, all at pH 6.8 [3–5].

The data sets containing a single type of protein were analyzed in detail as specified below. From the pdb files, the atomic coordinates of all atoms belonging to the subunit of interest were used to make a new pdb file (bovine CcO is in dimeric form in crystals, and the type I subunits are labeled as A and N in the files retrieved from the PDB). Sequence and structural analyses were performed as described previously [1,6–8]. The mutual Euclidean distance between all the oxygen atoms of the water molecules located inside the protein, at the protein surface or near lipids [9] was calculated by means of a Tcl program in a VMD environment [10]. The RDF for each pdb file was calculated in VMD using a code similar to that reported in Table 7, and further analyzed in a Jupyter environment (see below). Electron density maps at 1.0 and 4.0 sigma of the model proteins discussed in Ref. [1] were obtained using the *.cif and *.ccp4 files in Jmol (http://www.jmol.org/). The URLs for these files are reported in the raw_data.txt file in the Supplementary data, where * is the PDB code of the protein of interest (4mty, 4q78, 4yxi, 5ljt, 5ogo, 5y2r, 5y2s, 6b00, 5b1a, 5b1b, 5b3s, 5xdq, 5zcp, 5zcq). The mean displacement of atoms U was calculated considering that the B-factor is given by B = 8π²U².

Numerical calculations were implemented in Python (www.python.org) in an IPython/Jupyter environment, using the NumPy numerical software library and the SciPy and Matplotlib packages [11–14]. Final editing of images was performed by means of the GNU Image Manipulation Program (The GIMP team, GIMP 2.8.10, www.gimp.org) or the ImageMagick (imagemagick.org) software packages.
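
The relation between the B-factor and the mean displacement quoted in the methods, B = 8π²U², can be inverted to give U = sqrt(B / (8π²)). The snippet below is a small sketch of that conversion; the function names are hypothetical, and B is assumed to be in Å², so that U is in Å.

```python
import math


def b_to_u(b_factor):
    """Mean displacement U (Å) from a B-factor (Å²), using B = 8 * pi**2 * U**2."""
    if b_factor < 0:
        raise ValueError("B-factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


def u_to_b(u):
    """B-factor (Å²) from a mean displacement U (Å)."""
    return 8.0 * math.pi ** 2 * u ** 2


# Round trip on an arbitrary example value (not taken from the article).
u_example = b_to_u(20.0)
print(u_example, u_to_b(u_example))
```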

Specifications Table

Subject: Biochemistry
Specific subject area: Biochemistry, biophysics, structural biology, bioenergetics.
Type of data: Table, figure, graph, text files
How data were acquired: Survey of the protein crystal structures obtained by X-ray diffraction deposited in the Protein Data Bank (PDB). Input data for analysis were obtained as pdb, cif or ccp4 files from public databases.
Data format: Raw: pdb files, ccp4 files, cif files and text files. Analyzed: table, csv files, text file, graph, figure.
Parameters for data collection: Raw pdb files were checked for quality (resolution). Atoms in cif files were compared with the electron density maps in ccp4 files.
Description of data collection: Raw data were analyzed by different computational protocols.
Data source location: Department of Basic Medical Sciences, Neurosciences and Sense Organs (SMBNOS), Bari 70124, Italy.
Data accessibility: All the data produced in this work are available within the article.
Related research article: Luigi Leonardo Palese, Oxygen-oxygen distances in protein-bound crystallographic water suggest the presence of protonated clusters, Biochimica et Biophysica Acta - General Subjects, https://doi.org/10.1016/j.bbagen.2019.129480.
Value of the Data

X-ray crystallography has shown that proteins contain numerous water molecules

The role of these protein-bound water molecules is still not fully understood

Water molecules in large data sets of high resolution structures are reported

Positions of candidate protonated water clusters in some model proteins are reported

  10 in total

1.  The Protein Data Bank.

Authors:  H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  VMD: visual molecular dynamics.

Authors:  W Humphrey; A Dalke; K Schulten
Journal:  J Mol Graph       Date:  1996-02

3.  Protein States as Symmetry Transitions in the Correlation Matrices.

Authors:  Luigi L Palese
Journal:  J Phys Chem B       Date:  2016-11-01       Impact factor: 2.991

4.  Bovine cytochrome c oxidase structures enable O2 reduction with minimization of reactive oxygens and provide a proton-pumping gate.

Authors:  Kazumasa Muramoto; Kazuhiro Ohta; Kyoko Shinzawa-Itoh; Katsumasa Kanda; Maki Taniguchi; Hiroyuki Nabekura; Eiki Yamashita; Tomitake Tsukihara; Shinya Yoshikawa
Journal:  Proc Natl Acad Sci U S A       Date:  2010-04-12       Impact factor: 11.205

5.  Conformations of the HIV-1 protease: A crystal structure data set analysis.

Authors:  Luigi Leonardo Palese
Journal:  Biochim Biophys Acta Proteins Proteom       Date:  2017-08-26       Impact factor: 3.036

6.  Oxygen-oxygen distances in protein-bound crystallographic water suggest the presence of protonated clusters.

Authors:  Luigi Leonardo Palese
Journal:  Biochim Biophys Acta Gen Subj       Date:  2019-11-15       Impact factor: 3.770

7.  Structures and physiological roles of 13 integral lipids of bovine heart cytochrome c oxidase.

Authors:  Kyoko Shinzawa-Itoh; Hiroshi Aoyama; Kazumasa Muramoto; Hirohito Terada; Tsuyoshi Kurauchi; Yoshiki Tadehara; Akiko Yamasaki; Takashi Sugimura; Sadamu Kurono; Kazuo Tsujimoto; Tsunehiro Mizushima; Eiki Yamashita; Tomitake Tsukihara; Shinya Yoshikawa
Journal:  EMBO J       Date:  2007-03-01       Impact factor: 11.598

8.  Prediction of high- and low-affinity quinol-analogue-binding sites in the aa3 and bo3 terminal oxidases from Bacillus subtilis and Escherichia coli1.

Authors:  Fabrizio Bossis; Anna De Grassi; Luigi Leonardo Palese; Ciro Leonardo Pierri
Journal:  Biochem J       Date:  2014-07-15       Impact factor: 3.857

9.  The low-spin heme of cytochrome c oxidase as the driving element of the proton-pumping process.

Authors:  Tomitake Tsukihara; Kunitoshi Shimokata; Yukie Katayama; Hideo Shimada; Kazumasa Muramoto; Hiroshi Aoyama; Masao Mochizuki; Kyoko Shinzawa-Itoh; Eiki Yamashita; Min Yao; Yuzuru Ishimura; Shinya Yoshikawa
Journal:  Proc Natl Acad Sci U S A       Date:  2003-12-12       Impact factor: 11.205

10.  Relationship between cardiolipin metabolism and oxygen availability in Bacillus subtilis.

Authors:  Simona Lobasso; Luigi L Palese; Roberto Angelini; Angela Corcelli
Journal:  FEBS Open Bio       Date:  2013-02-19       Impact factor: 2.693
