
Comparing Cryo-EM Reconstructions and Validating Atomic Model Fit Using Difference Maps.

Agnel Praveen Joseph, Ingvar Lagerstedt, Arjen Jakobi, Tom Burnley, Ardan Patwardhan, Maya Topf, Martyn Winn.

Abstract

Cryogenic electron microscopy (cryo-EM) is a powerful technique for determining structures of multiple conformational or compositional states of macromolecular assemblies involved in cellular processes. Recent technological developments have led to a leap in the resolution of many cryo-EM data sets, making atomic model building more common for data interpretation. We present a method for calculating differences between two cryo-EM maps or a map and a fitted atomic model. The proposed approach works by scaling the maps using amplitude matching in resolution shells. To account for variability in local resolution of cryo-EM data, we include a procedure for local amplitude scaling that enables appropriate scaling of local map contrast. The approach is implemented as a user-friendly tool in the CCP-EM software package. To obtain clean and interpretable differences, we propose a protocol involving steps to process the input maps and output differences. We demonstrate the utility of the method for identifying conformational and compositional differences including ligands. We also highlight the use of difference maps for evaluating atomic model fit in cryo-EM maps.

Year:  2020        PMID: 32043355      PMCID: PMC7254831          DOI: 10.1021/acs.jcim.9b01103

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


Introduction

Over the past few years, cryogenic electron microscopy (cryo-EM) has had an enormous impact on the structure determination of large and dynamic molecular machines. Better detectors and algorithms for three-dimensional structure reconstruction from images have helped in achieving near atomic resolutions. There has been a large influx of structures solved using cryo-EM in the central repository—the Electron Microscopy Data Bank (EMDB, https://www.ebi.ac.uk/pdbe/emdb/statistics_main.html/)—and this is expected to rise dramatically in the coming years. The lack of validation methods and guidelines to deal with this data has been realized, and efforts are underway to address this.[1−3] Cryo-EM enables structure determination of different functional forms of biological macromolecules in the near-native state.[4] Comparison of individual forms gives insights into the biological pathway of the molecule. In some cases, new (different state or conformation) cryo-EM structures are compared to existing ones to understand structural and functional differences. Usually difference maps are calculated for such comparisons, and the maps are scaled to an equivalent density range prior to such calculations. Approaches for global density scaling exist; e.g., Relion[5] (relion_image_handler), EMAN2[6] (e2proc3d), diffmap (http://grigoriefflab.janelia.org/diffmap), and BSoft[7] (bscale) work by scaling amplitudes in each resolution shell of a map to that of a reference power spectrum (usually based on an atomic model). Sample heterogeneity arising from conformational and/or compositional differences limits the resolution of cryo-EM reconstructions, often resulting in local anisotropy of data resolution. The periphery of the macromolecular complex is usually less resolved compared to the core. Flexible domains or subunits with partial occupancy may be smoothed out as well. 
Local scaling of maps has been found useful to improve interpretation of density features with appropriate scaling estimated based on local resolution differences.[8] In this approach, a reference power spectrum (of an atomic model) from a local window is used for scaling the corresponding segment of the map. Apart from calculating map–map differences, local scaling may be appropriate for model–map comparisons as well. A segment of an atomic model with high B-factors (larger uncertainty in atomic positions) often relates to poorly resolved areas of the map and hence scales differently compared to a better resolved segment. Difference maps are very useful pointers to areas in the map where the atomic model fit is poor or incomplete. For structure determination using X-ray crystallography, difference map calculations have been used regularly for ligand identification and fixing atomic model fits in density. In this study, we implement a generic approach for calculating difference densities for cryo-EM data. The two maps to be compared are scaled based on Fourier amplitude matching before computing the difference. The proposed method has the ability to scale maps locally taking the local density variations into account. For intermediate resolutions and noisy data, it is often difficult to get clean and interpretable difference maps. We use map preprocessing steps including masking, dusting, and filtering before scaling and associate a fractional difference with each voxel to help interpret the differences. The protocol presented here is the result of trying several approaches to obtain clean and interpretable differences. We test its application for detecting compositional and conformational differences and also as a tool for validating atomic model fits in maps. We also provide a user-friendly GUI implementation of this method in the CCP-EM software package.[9]

Methods

We implemented a method for calculating difference maps based on either global or local amplitude scaling. The approach involves the following steps:

Map preprocessing

To minimize the effects of background density artifacts on the scaling procedure, contour thresholds can be selected for the experimental maps, or a mask may be applied. This step is optional, but for a few cases discussed in this paper, we noticed density artifacts in the original map which possibly resulted from use of tight masks during map postprocessing. For the test cases, we selected a contour threshold of two times sigma from the background peak. Upon visual inspection, we found that most of the densities arising from background artifacts are flattened at this threshold. However, the choice of the threshold level is often subjective and can vary depending on the density distribution, background artifacts, and map resolution. For a systematic segregation of molecular volume and background noise, the local signal with respect to noise has to be quantified. One of the approaches that deals with the separation of signal from background noise is the false discovery rate control.[10] This uses a statistical framework to calculate 3D confidence maps whose values (ranging from 0 to 1) correspond to the confidence that the voxel contains a signal separated from the background noise. The confidence map can be used as a mask for processing the map or as a guide to choose a contour threshold for the map. A graphical interface to this tool is available through the CCP-EM software suite.

Density values below the threshold were set to zero, and a dust filter was applied to remove any small disconnected densities that remained. To this end, the sizes of disconnected densities (in number of voxels) are divided into 20 bins. Those density islands that fall into bins having a frequency of more than 10% and also having mean densities within the lower 50% of the density range are removed.
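The thresholding and dust-filtering step described above can be sketched roughly as follows. This is an illustrative reading of the text, not the CCP-EM or TEMPy code: the function name `threshold_and_dust` and its parameters are hypothetical, islands are found with face connectivity, and "lower 50% of the density range" is interpreted as below the midpoint between the minimum and maximum density inside the contour.

```python
import numpy as np
from scipy import ndimage


def threshold_and_dust(density, threshold, n_bins=20, freq_cutoff=0.10):
    """Zero voxels below `threshold`, then remove small, weak islands."""
    # Flatten everything below the contour threshold.
    contoured = np.where(density >= threshold, density, 0.0)

    # Label disconnected islands of remaining density.
    labels, n_islands = ndimage.label(contoured > 0)
    if n_islands == 0:
        return contoured
    index = np.arange(1, n_islands + 1)
    sizes = np.bincount(labels.ravel())[1:]          # voxels per island
    means = ndimage.mean(contoured, labels, index)   # mean density per island

    # Bin island sizes and flag islands that sit in heavily populated bins.
    counts, edges = np.histogram(sizes, bins=n_bins)
    bin_of = np.clip(np.digitize(sizes, edges[1:-1]), 0, n_bins - 1)
    crowded = counts[bin_of] / n_islands > freq_cutoff

    # Assumption: "lower 50% of the density range" means below the midpoint
    # of the min..max density found inside the contour.
    inside = contoured[contoured > 0]
    weak = means < 0.5 * (inside.min() + inside.max())

    # Remove islands that are both in a crowded size bin and weak.
    cleaned = contoured.copy()
    cleaned[np.isin(labels, index[crowded & weak])] = 0.0
    return cleaned
```

With very few islands, a single large island can also land in a bin above the 10% frequency cutoff, which is why the mean-density condition is applied together with the size condition.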
To minimize the effect of sharp contour edges on scaling, the edge at the selected contour was smoothed by convolution with a Gaussian kernel. We used the implementation of the n-dimensional Gaussian filter in SciPy[11] with a sigma of 1 (radius of the filter kernel is four times sigma) to smooth the edges at the contour threshold. This results in a soft mask applied to the map, where the density values within the contour are not altered, and voxels at the edge are affected by this filter to obtain a smoother falloff to zero.

For calculating the difference between a map and model, a simulated map was calculated from the atomic model using Refmac5,[12] which uses electron scattering factors and considers the map resolution and atomic B-factors to generate density.[12]

Low-pass filtering

For calculating differences between experimental maps resolved at different resolutions, the maps are low-pass filtered to the lower resolution of the two maps using a hyperbolic tangent (tanh) filter (in TEMPy[13]), which is similar to the tanh filter in EMAN2.[6]

Scaling

The amplitude scaling can be performed either globally or locally over sliding windows. For global amplitude scaling, the whole map grid is used for the calculation of the power spectra, whereas for local scaling, a grid based on a local moving window is used. The local scaling procedure follows the implementation used in LocScale,[8] which performs local scaling based on a reference amplitude spectrum. As in the case of LocScale, a default window size of seven times the map resolution was used. The scaling calculation is used to update the value assigned to the central voxel of the window.
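The edge-smoothing step of the preprocessing described above can be illustrated with a minimal sketch. It assumes one plausible way of combining the filtered and unfiltered maps (keep the original values inside the contour and take the Gaussian-smoothed values outside it); the function name `soft_edge` is hypothetical, and `truncate=4.0` corresponds to the kernel radius of four times sigma mentioned in the text.

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def soft_edge(density, threshold, sigma=1.0):
    """Apply a contour threshold with a smooth falloff to zero at the edge."""
    inside = density >= threshold
    contoured = np.where(inside, density, 0.0)

    # Gaussian blur of the hard-thresholded map; truncate=4.0 gives a
    # kernel radius of four times sigma.
    blurred = gaussian_filter(contoured, sigma=sigma, truncate=4.0)

    # Densities inside the contour are left untouched; voxels just outside
    # receive the blurred values, which decay smoothly to zero.
    return np.where(inside, contoured, blurred)
```

Voxels further than the kernel radius from the contour stay at exactly zero, so the soft edge only widens the masked region by a few voxels.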
For a given map, the amplitudes in each resolution shell are scaled by the square root of the ratio of the average intensities of both maps to the intensity of that map in that shell:

$$FT_1^{sc} = FT_1 \sqrt{\frac{(I_1 + I_2)/2}{I_1}}$$

where FT1sc is the scaled Fourier term in a given shell for map 1, FT1 is the initial Fourier term in the shell, and I1 and I2 are the average intensities (square of amplitudes) in the shell for map1 and map2, respectively. Map 2 is scaled in an analogous manner. When the difference is calculated between a map and an atomic model, the amplitudes of the map simulated from the model are used as the reference for scaling, by default. This is under the assumption that the map simulated from the atomic model is noise free and gives a reasonable representation of features at this resolution (of the experimental map). In this case, the map amplitudes are scaled by the square root of the ratio of the average intensity (rotationally averaged) of the model-derived map (I2) to the average intensity of the map (I1) in that shell. The map from the atomic model is not scaled. The reference-based scaling can be overridden by changing the default option, especially for cases where the atomic model is partial or not fitted well in the map.

The differences between the scaled maps are calculated in real space, giving absolute map–map or map–model difference maps. To interpret the differences, we also calculate the fractional differences with respect to the scaled maps. For each voxel

$$D_f = \frac{D_{1-2}}{\rho_1}$$

where Df is the fractional difference, D1–2 is the density difference between map1 and map2, and ρ1 is the density of scaled map1. A similar computation can be used for calculating the extent of the difference with respect to map2, for D2–1. Because of this weighting, Df is not the negative of Df,2–1. In assessing differences, it is useful to look at the positive regions of D1–2 or D2–1 and quantify the significance using Df and Df,2–1. The fractional difference maps are useful guides to interpret differences.
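As a rough illustration of the global scaling and fractional difference described above, the following NumPy sketch scales two maps on the same grid to a common shell-wise intensity and then computes the difference. It is not the CCP-EM or TEMPy implementation. The function names, the number of shells `n_shells`, the equal-width shell binning, and the small cutoff `eps` used to avoid dividing by near-zero density are all assumptions made for the example.

```python
import numpy as np


def shell_indices(shape, n_shells):
    """Assign every Fourier voxel to a radial (resolution) shell."""
    freqs = np.meshgrid(*[np.fft.fftfreq(n) for n in shape], indexing="ij")
    radius = np.sqrt(sum(f ** 2 for f in freqs))
    edges = np.linspace(0.0, radius.max(), n_shells + 1)
    return np.clip(np.digitize(radius, edges) - 1, 0, n_shells - 1)


def scale_pair(map1, map2, n_shells=20):
    """Scale both maps to the mean of their shell intensities."""
    ft1, ft2 = np.fft.fftn(map1), np.fft.fftn(map2)
    shells = shell_indices(map1.shape, n_shells)
    for s in range(n_shells):
        sel = shells == s
        if not sel.any():
            continue
        i1 = np.mean(np.abs(ft1[sel]) ** 2)   # average intensity, map 1
        i2 = np.mean(np.abs(ft2[sel]) ** 2)   # average intensity, map 2
        target = 0.5 * (i1 + i2)
        if i1 > 0:
            ft1[sel] *= np.sqrt(target / i1)
        if i2 > 0:
            ft2[sel] *= np.sqrt(target / i2)
    return np.fft.ifftn(ft1).real, np.fft.ifftn(ft2).real


def fractional_difference(rho1, rho2, eps=1e-6):
    """Return D(1-2) and Df = D(1-2) / rho1, with Df = 0 where rho1 <= eps."""
    d12 = rho1 - rho2
    df = np.zeros_like(d12, dtype=float)
    np.divide(d12, rho1, out=df, where=rho1 > eps)
    return d12, df
```

For map–model comparison with the model as reference, the same loop would use `target = i2` and leave `ft2` unscaled. In this sketch the zero-frequency term is scaled together with the first shell, which a production implementation may treat separately.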
A suitable threshold of fractional difference can be used to mask the difference maps. A lower threshold (e.g., 0.25) removes any insignificant differences arising from noise. On the other hand, a higher threshold (e.g., 0.5) shows areas of large differences. To further clean the differences, a dust filter can be applied on the masked difference map to remove small isolated densities around the masked difference map.
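The masking of a difference map by fractional difference, followed by dust removal, might look like the sketch below. The function name `mask_difference` and the simple size cutoff `min_voxels` are hypothetical choices for illustration. The dust filter described in the preprocessing step uses a binned size and density criterion instead.

```python
import numpy as np
from scipy import ndimage


def mask_difference(d12, df, df_threshold=0.25, min_voxels=5):
    """Keep positive differences with Df above a threshold, then dust."""
    keep = (d12 > 0) & (df >= df_threshold)
    masked = np.where(keep, d12, 0.0)

    # Remove small isolated islands that survive the fractional mask.
    labels, _ = ndimage.label(masked > 0)
    sizes = np.bincount(labels.ravel())      # sizes[0] is the background
    small = np.flatnonzero(sizes < min_voxels)
    small = small[small != 0]                # never "remove" the background
    masked[np.isin(labels, small)] = 0.0
    return masked
```

Raising `df_threshold` from 0.25 to 0.5 would restrict the output to the areas of large difference mentioned above.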

Results and Discussion

Map–Map Comparison

We applied the difference map approach to the following cases to test the method and identify compositional and conformational differences.

Strychnine-Bound vs Glycine-Bound GlyR

A glycine receptor is a ligand-gated channel receptor that opens a chloride-permeable pore leading to inhibition of neuronal firing in the spinal cord and brain stem.[14,15] It controls a wide range of motor and sensory functions including vision and audition. Strychnine is a complex alkaloid which is a potent receptor antagonist that binds to the canonical intersubunit neurotransmitter site and locks the receptor in the closed state.[16] Glycine binds at the same site but induces channel opening, allowing permeation of chloride ions. Ivermectin is an unconventional agonist of the GlyR that activates GlyR, potentiates response to glycine,[17] and triggers the open conformation. The structures of strychnine- and ivermectin/glycine-bound forms of GlyR (alpha-1 isoform) were determined at 3.9 Å (EMD-6344) and 3.8 Å (EMD-6346) resolutions, respectively, using cryo-EM.[18] The structures have a five-fold symmetry around the pore axis. We calculated the difference density using global amplitude scaling between the strychnine- and ivermectin/glycine-bound forms of GlyR (Figure 1A,B). The maps were not preprocessed. To assess the differences, we used a comparison of the atomic models for the two forms built on the maps and also the crystal structures of strychnine-bound (PDB ID: 5CFB) and ivermectin-bound (PDB ID: 5VDH) GlyR (alpha-3 isoform).[19]
Figure 1

GlyR receptor. (A) Global scaling-based density difference between strychnine (EMD-6344)- and ivermectin/glycine (EMD-6346)-bound forms of GlyR (alpha-1 isoform). The difference map (D1–2) is shown in gray, and the backbone of the atomic model (ribbon) associated with the strychnine-bound map (PDB ID: 3JAD) is colored based on the fractional difference Df,1–2 (averaged over voxels covered by each amino acid). Individual atoms of the strychnine molecule (ball and stick representation) and the bound sugars (stick representation) are colored based on Df,1–2. (B) Density difference between the ivermectin/glycine (EMD-6346)- and strychnine (EMD-6344)-bound forms. The atomic model associated with the ivermectin-bound map (PDB ID: 3JAF) is colored based on the fractional difference Df,2–1 averaged over voxels covered by each amino acid. Individual atoms of the ivermectin molecule (ball and stick representation) and the bound sugars (stick representation) are colored based on Df,2–1. The difference map (D2–1) is in yellow. The insets between panels A and B show differences at the strychnine and ivermectin binding sites (zoomed in). (C) Comparison of crystal structures of strychnine (PDB ID: 5CFB)- and ivermectin-bound (PDB ID: 5VDH) GlyR (alpha-3 isoform). The structure of strychnine-bound GlyR is shown, colored based on the distance between backbone C-alpha atoms in the two forms. (D) Local scaling-based density difference between strychnine (EMD-6344)- and ivermectin/glycine (EMD-6346)-bound forms of GlyR (alpha-1 isoform). The difference map (D1–2) is shown in gray, and the backbone of the atomic model associated with the map (PDB ID: 3JAD) is colored based on the fractional difference Df,1–2. Individual atoms of the strychnine molecule (ball and stick representation) and the bound sugars (stick representation) are colored based on Df,1–2. (E) Local scaling-based density difference between the ivermectin/glycine (EMD-6346)- and strychnine (EMD-6344)-bound forms. 
The atomic model associated with the ivermectin-bound map (PDB ID: 3JAF) is colored based on the fractional difference Df,2–1. The difference map D2–1 is in yellow. Individual atoms of the ivermectin molecule (ball and stick representation) and the bound sugars (stick representation) are colored based on Df,2–1. The insets between panels D and E show differences at the strychnine and ivermectin binding sites (zoomed in). (F) Crystal structure of strychnine (PDB ID: 5CFB)-bound GlyR (alpha-3 isoform), colored based on the atomic B-factor distribution (averaged over atoms in each amino acid residue).


Difference Based on Global Scaling

The locations of strychnine and ivermectin were identified as difference densities (Figure 1A,B). The atomic models in Figure 1A and B corresponding to the two GlyR states are colored by the Df,1–2 and Df,2–1 values, respectively. A clear difference density was observed for strychnine at the intersubunit site between the extracellular domains. The fractional difference averaged over the voxels of the binding site is Df,1–2 ∼ 0.49, which is less than 1.0 due to residual density in the ivermectin-bound form arising mainly from the background and conformational changes in the surrounding protein. Ivermectin density on the other hand was found at the subunit interface between transmembrane domains. The difference density was relatively less prominent (Df,2–1 ∼ 0.30) compared to that of strychnine. The C-terminal segment of ivermectin is exposed to the membrane layer and is associated with high B-factors (>100 Å², PDB ID: 5VDH) suggesting greater flexibility. The conformational changes between the closed strychnine-bound and open/activated ivermectin-bound forms of GlyR are also captured as differences. We compared the difference density against the differences between crystal structures (alpha-3 isoform) of the two ligand-bound forms (Figure 1C). The differences generally agree and are more prominent in the transmembrane domain. The differences also reflect the differences in the mechanism of action of the ligands. In the glycine/ivermectin-bound form, the intracellular halves of the transmembrane helices move closer to each other compared to the extracellular half which is wider (Figure S1). In contrast, the pore in the strychnine-bound form is constricted and rather perpendicular to the membrane. The helices in the intracellular domain that bind the pore axis undergo a larger tilt and clockwise rotation compared to the glycine/ivermectin-bound form.[18,19]

Difference Based on Local Scaling

The local amplitude scaling approach uses only a local window segment of the map at a time to calculate amplitude spectra and the associated scaling factors (see Methods). Hence, local contrast differences can be accounted for in the scaling procedure and difference calculation. To assess this advantage, we compared the difference densities from local and global scaling approaches for the glycine receptor. The B-factor distribution suggests that the intracellular half of the transmembrane domain and the tip of the extracellular domain of GlyR receptors are more dynamic relative to the rest of the structure (Figure 1F and Figure S2). We calculated difference maps between the strychnine- and ivermectin/glycine-bound forms of GlyR based on local scaling. The differences corresponding to the flexible segments are relatively less pronounced (compared to differences from global scaling), reflecting an appropriate contrast for the flexible segments (Figure 1D). The difference map also shows more features in the regions with lower B-factors, especially for the interface between extracellular and transmembrane domains (Figure 1D,E). The difference density corresponding to the C-terminal segment of ivermectin is more evident as well in the locally scaled difference map (Figure 1E inset). The fractional difference Df,1–2 averaged over voxels covered by strychnine is about 0.31, while the voxels covered by ivermectin have an average fractional difference Df,2–1 ∼ 0.24. Hence, the local scaling procedure enables differential scaling depending on the signal in the windowed region. The distribution of the difference density is altered accordingly, enhancing differences in areas associated with smaller uncertainty.

MKLP2 ADP-AlFx vs Non-Nucleotide State

MKLP2 is a kinesin-6 family motor protein that has important roles in different stages of cell division.[20,21] Structural characterization of the microtubule-bound MKLP2 motor domain at different stages of its ATPase cycle provided insights into its function and divergence from other kinesins.[22] Among different conformational states, the structure of the ADP-AlFx (ATP analogue)-bound form of the kinesin-6 (MKLP2) motor domain was solved at 4.4 Å resolution (EMD-3622) and the non-nucleotide state (NN)[22] at a resolution of 6.1 Å (EMD-3621). Compared to the previous example, these maps are resolved at lower resolutions, and there is a mismatch in resolution between the maps we want to compare, making this a more challenging test of the method. The difference map approach was applied to compare the conformations of the ADP-AlFx (ATP analogue)-bound state to that of the non-nucleotide state (NN). Without any map preprocessing (thresholding/masking, dusting, and low-pass filtering), the difference map is much noisier with several disconnected densities (Figure S3A). Without thresholding and dusting but with low-pass filtering, the difference is less noisy but has a few small disconnected densities (probable dust) and broken features for loop11 (Figure S3B). With all preprocessing steps (see Methods), a cleaner difference is obtained (Figure S3C). The location of ADP-AlFx was observed as a density difference unoccupied by the protein model at the nucleotide-binding pocket (Figure 2A) (Df,1–2 ∼ 0.73). Significant differences were also observed in the vicinity of the nucleotide indicating structural rearrangements upon binding.
Figure 2

Microtubule-bound MKLP2. (A) Global scaling-based density difference (gray) between ADP-AlFx (ATP analogue)-bound and non-nucleotide (NN) states of kinesin-6 (MKLP2) motor domain. The backbone of the atomic model built on the ADP-AlFx-bound map is colored by Df,1–2 values (averaged over voxels covered by each amino acid). Different structural segments of the MKLP2 motor domain are labeled. Atoms of ADP-AlFx (stick representation) are colored based on Df,1–2. (B) Atomic model built on the ADP-AlFx-bound map is colored based on backbone C−α distances between the models built in the ADP-AlFx (ATP analogue)-bound and non-nucleotide (NN) states of the kinesin-6 (MKLP2) motor domain. (C) Local scaling-based density difference (gray) between ADP-AlFx (ATP analogue)-bound and non-nucleotide (NN) states of the kinesin-6 (MKLP2) motor domain. The atomic model built on the ADP-AlFx-bound map is colored by Df,1–2 values. The region of loop6 where the density difference is less prominent is indicated with an arrow. Atoms of ADP-AlFx (stick representation) are colored based on Df,1–2.

To assess the conformational difference, we checked the agreement of the difference density with the spatial differences in the coordinates of models fitted in the maps. The model segments associated with significant spatial differences agree well with the density differences between respective maps (Figure 2B). The atomic models fitted in intermediate resolution maps are likely to be more error prone than those built in a high resolution map. Hence, the map–map differences may reflect a more reliable comparison of the two states of MKLP2. Nevertheless, we used the models to identify any significant changes and used only the backbone C−α atoms (more reliable than side chains at these resolutions) to calculate distances between the models. Also, we compare the differences to the changes observed across other kinesins during the ATPase cycle (see below).
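The per-residue backbone C−α distance used for this comparison can be sketched as below. The function name `ca_distances` and the input layout (a dictionary from a residue key to an x, y, z position) are hypothetical, and the two models are assumed to be already placed in a common reference frame.

```python
import numpy as np


def ca_distances(model_a, model_b):
    """Per-residue C-alpha distance between two models, in input units.

    Each model maps a residue key, e.g. ("A", 42), to the C-alpha
    (x, y, z) position. Only residues present in both models are compared.
    """
    shared = sorted(set(model_a) & set(model_b))
    if not shared:
        return {}
    a = np.array([model_a[k] for k in shared], dtype=float)
    b = np.array([model_b[k] for k in shared], dtype=float)
    return dict(zip(shared, np.linalg.norm(a - b, axis=1)))
```

The resulting distances can then be set against the per-residue fractional difference to check whether the two measures highlight the same segments.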
It is observed that the structural segments around the nucleotide binding site (e.g., loop 9, loop 11, and N-term helix-α4) are more stable in the ADP-AlFx-bound state (ADP.Pi-like).[22] In addition, loop6 forms a separate subdomain in kinesin-6[23] and is better resolved in the ADP-AlFx-bound map. Secondary structure prediction for the sequence of this loop suggested the presence of helices[22] which is also evident in the helical densities in the difference (Figure 2A). Coordinated movements of structural segments are observed during the microtubule-bound ATPase cycle and in the transition from the NN to ADP-AlFx state; the P-loop and alpha-3/loop9/loop11 segments move toward the catalytic site.[24] These segments are also associated with difference densities. Similar subdomain rearrangements were also reported for other well-studied kinesins.[24,25] The difference map calculated after local scaling (Figure 2C) had a similar profile compared to the global scaling-based difference. A more localized density for the nucleotide analogue (ADP-AlFx) was obtained with the local scaling-based difference, and part of the differences corresponding to loop6 was less prominent (highlighted in the figure). The voxels covered by ADP-AlFx are associated with an average fractional difference Df,1–2 ∼ 0.60. The local scaling-based difference is associated with a relatively narrow range of fractional difference values compared to that of global scaling. This can be observed while comparing the Df value-based coloring of atomic models discussed in the cases above (Figures 1 and 2). As the scale of the amplitude falloff is optimized locally, the local scaling procedure minimizes oversharpening and overblurring of parts of the map that might otherwise result from global scaling (due to local resolution variation). The range of Df values over a structure narrows as the window size for local scaling decreases. Df values around ligands are also suppressed with small window sizes but remain significantly above the rest of the structure.

Model Validation Using Difference Maps

Atomic model building and refinement in maps of resolutions worse than 3.0 Å can be challenging. Moreover, local regions of cryo-EM maps often have relatively lower resolutions associated with larger uncertainty. We tested the difference map approach as a tool to identify errors in the atomic model based on differences with the density. We used the 3.2 Å hemoglobin map in the nonfunctional ferric state (close to relaxed R2 state).[26] The map was preprocessed with a contour threshold of two times sigma, followed by application of dust and soft edge filters. An atomic model was also deposited with the experimental map (PDB ID: 5NI1). The structure is a heterotetramer made of two alpha and two beta subunits. The alpha subunit is better resolved in the map than the beta subunit and is associated with relatively lower B-factors (Figure S4A). Global scaling associates more differences to the beta subunit compared to the alpha subunit, and the fractional differences agree overall with the B-factor profile (Figure S4B). Local scaling, however, results in a more uniform distribution (Figure S4C). Hence, the effect of the nonuniform local resolution is minimized with local scaling, potentially making real differences more apparent. We carried out a few tests to check whether the local scaling-based differences are useful for model validation.

Identify Errors Introduced

As described below, we introduced specific errors in side chains and the backbone of parts of the model that were otherwise well fitted in the density. The difference map approach was then applied to check whether these errors could be detected as differences. We first altered rotamers of a few side chains in the model (Figure 3A). The map–model differences were calculated after local density scaling. The errors associated with side chain fits could be identified as peaks in the fractional difference maps, suggesting that the differences can be a useful guide to track such errors, and this method can be used to assess model fits in maps. As expected, larger deviations (e.g., K11, W14, N68, and L80) from the true fit were associated with more pronounced difference densities with misfitted side chain atoms associated with Df,model-map values greater than 0.5. On the other hand, for subtle changes (e.g., H72, L83), displaced atoms were associated with Df,model-map values of about 0.3.
Figure 3

Detecting potential errors in atomic model fits. (A) Structural segment of the atomic model (PDB ID: 5ni1) built on the cryo-EM map of hemoglobin in the nonfunctional ferric state (close to relaxed R2 state) is shown (yellow). Six residues are labeled where the side chain rotamers were altered to introduce errors in the fit. The atoms in the altered model are colored based on the Dmodel-map of local scaling-based difference density between the model and map. The difference map Dmodel-map is shown as orange mesh, while Dmap-model is shown as solid yellow. (B) Backbone atoms of another segment of the model are shown where errors were introduced by peptide flips and carbonyl rotations. The initial atom positions are shown with thin sticks (green), and the atoms in the mutated model are colored based on Df of the model–map difference. (C) Plot of Df,model-map (averaged over atoms of a residue) vs TEMPy SMOC scores for fit of original atomic model (PDB ID: 5ni1) to density map. Examples of residues associated with high Df,model-map (averaged over atoms of a residue) and low SMOC scores are shown above the plot. A few potential misfits highlighted by fractional difference but not by SMOC scores are shown on the right (marked within a circle). The difference map Dmodel-map is shown as orange mesh, while Dmap-model is shown as solid yellow. The cryo-EM map associated with the model (EMD-3488) is shown in transparent gray.

We introduced another set of modeling errors in the backbone of a helix (Figure 3B) using peptide flips and changes to phi/psi dihedrals made with tools in Coot.[27] The misfit atoms were associated with a difference fraction Df,model-map greater than 0.25, suggesting that the backbone changes are less prominent, as expected at this resolution. Nevertheless, as routinely done in crystallography, the difference densities can be used as a guide to track potential misfits along the protein chain.

Compare against a Density Fit Score

The difference densities are usually more informative than, and quite complementary to, the metrics that evaluate the extent of model fit to density. The positive and negative differences (D1–2 and D2–1) can act as a guide (by providing directionality) for fixing the models. In another test, we compared the difference density against the TEMPy SMOC score,[28] which gives a cross-correlation analogue (Manders’ overlap coefficient) of the local density fit. For the original atomic model (PDB ID: 5NI1) without any errors introduced, the average Df,model-map of each residue generally agrees with the trend of SMOC scores (Figure 3C). We looked at a few examples of residues associated with high Df,model-map (averaged over atoms) and low SMOC scores, reflecting potential errors with model fit (Figure 3C). The segment involving Gly51 is likely to be mistraced, as the backbone is out of density. However, not all residues in this category are obvious misfits. We also observe cases where the differences arise from inconsistencies between experimental maps and the theoretical maps derived from the model. Residues Asp47 and Asp75 have acidic side chains and lack well-defined densities at the end of their side chains. The high Df,model-map associated with the side chain atoms can be accounted for by the fact that the map generated from the model does not accurately reflect the effects of factors like atomic charges and radiation damage that affect the experimental map. Lys56 is another example where the side chain lacks a well-defined density but has high Df,model-map associated with the side chain atoms. This can be attributed to the fact that the refined atomic B-factors used in the map calculation may not accurately account for the dynamics or disorder. Nevertheless, these differences reflected by high Df,model-map (and low SMOC scores) suggest that the atomic positions in the side chains of these residues are less reliable.
We also looked at the residues whose Df,model-map (averaged over atoms) is greater than 0.3 despite relatively high SMOC scores (Figure 3C, circled). One or more atoms in most of these residues are associated with a Df,model-map greater than 0.5. These cases point to areas where the agreement between the residue backbone and/or side chain and the map density might be poor, either because of a bad fit (e.g., Pro114) or because the map is poorly resolved in this region (e.g., Pro5, Thr12).
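The per-residue comparison described above can be sketched in a few lines of Python. This is only an illustration, not the CCP-EM or TEMPy implementation: the function name `flag_residues`, the dictionary layout, and the example numbers are hypothetical, and the SMOC cutoff has to be supplied by the caller because the text does not state one. The 0.3 and 0.5 values are the Df,model-map levels quoted above.

```python
from collections import defaultdict
from statistics import mean


def flag_residues(atom_df, smoc, smoc_cutoff, mean_cutoff=0.3, atom_cutoff=0.5):
    """Group per-atom fractional differences by residue and flag likely misfits.

    atom_df: {(chain, residue_number, atom_name): Df,model-map value}
    smoc:    {(chain, residue_number): SMOC score}
    smoc_cutoff has no default because the text does not give one.
    """
    by_residue = defaultdict(list)
    for (chain, resnum, _atom), value in atom_df.items():
        by_residue[(chain, resnum)].append(value)

    flagged = {}
    for residue, values in by_residue.items():
        mean_df = mean(values)
        if mean_df <= mean_cutoff:
            continue  # residue-averaged difference is not above the cutoff
        flagged[residue] = {
            "mean_df": mean_df,
            "has_atom_above_cutoff": max(values) > atom_cutoff,
            # True when the SMOC score alone would not have raised this residue
            "missed_by_smoc": smoc[residue] >= smoc_cutoff,
        }
    return flagged


# Illustrative numbers only, not values from the paper
atom_df = {
    ("A", 5, "CA"): 0.20, ("A", 5, "CB"): 0.60, ("A", 5, "CG"): 0.40,
    ("A", 6, "CA"): 0.05, ("A", 6, "CB"): 0.10,
    ("A", 7, "CA"): 0.35, ("A", 7, "CB"): 0.45,
}
smoc = {("A", 5): 0.85, ("A", 6): 0.90, ("A", 7): 0.40}
flagged = flag_residues(atom_df, smoc, smoc_cutoff=0.7)
for residue, info in sorted(flagged.items()):
    print(residue, info)
```

Residues flagged with `missed_by_smoc` set correspond to the circled cases, where the fractional difference highlights a problem that the density fit score does not.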

Validate Atomic Models from the Model Challenge

As a separate test of the applicability of this approach for atomic model validation, we selected models submitted to the EMDB Model Challenge 2015[29,30] and checked whether the difference maps can indicate errors in the density fits. We compared models submitted for the target gamma-secretase map (EMD-3061). The map was preprocessed with a contour threshold of two times sigma, followed by application of dust and soft edge filters. We selected a model ranked higher by the different metrics used to evaluate density fit in the model challenge (see http://model-compare.emdatabank.org/2016/cgi-bin/em_multimer_results.cgi?target_map=T0007emd_3061) and compared it against another model that was ranked lower by the same metrics. We calculated model–map differences and compared areas where errors were identified based on the differences (Figure 4A–D). The differences clearly point to locations where residues fit poorly in density in the second model compared to the best ranked model. The poorly fitted atoms are usually associated with Df,model-map > 0.5. A better fit was observed in the best model in these regions.
Figure 4

Identifying errors in atomic model fits. In each panel (A–D), local segments of two atomic models submitted to the EMDB Model Challenge 2015 for the target gamma-secretase map (EMD-3061) are compared for fit to density. For each panel, the figure on the left corresponds to the model ranked higher in the challenge, and a relatively lower scoring model is on the right. The atoms in the models are colored by Df,model-map based on model–map difference. The poorly fitted residue (in the model on the right subpanel) is labeled, and the chain ID is in parentheses. In (D), a poorly fitted backbone near S401 (chain B) is indicated with an arrow.


Discussion

The approach presented in this paper is useful in identifying ligand densities and conformational differences by comparing density maps. Identification of a ligand binding site is challenging at intermediate-to-low resolutions, and the difference density is a useful pointer to potential locations. In addition to the examples presented above, this approach was found useful for identifying the binding site of a kinesin inhibitor based on cryo-EM maps of resolutions between 5 and 6 Å. A difference density blob coincided with a potential drug binding pocket on the protein surface, with the interacting site harboring residues specific for the subfamily of proteins that the drug targets.[31] The drug molecule, when docked computationally at this pocket, correlated well with the difference density, although the resolution is not good enough to confirm details of the pose. Map density scaling is central to difference map calculations, and local scaling has been shown to be useful for model building in maps that sample a wide range of local resolutions.[8] Local scaling was found more appropriate for interpreting differences, especially when the differences come from segments involving flexible or less resolved parts of the molecule. The developed approach is also useful to compare atomic models to maps and can be a helpful guide in identifying errors in atomic model fits. In the context of model validation, difference maps complement other metrics based on model–map fit or expected geometries. Some metrics are less discriminative at lower resolutions, though CaBLAM, for example, still picks up the backbone model errors considered in Figure 3B. In general, although it is important to compare different validation metrics when finalizing a structure, the difference maps provide useful visual clues to problem areas. As mentioned earlier, inaccuracies in map calculations from the model can result in differences with the experimental map.
Accounting for factors like atomic charges, radiation damage, and accurate B-factor estimates to reflect dynamics will improve theoretical map calculation and minimize such differences. The fractional difference maps are a useful means to locate voxels associated with significant conformational and compositional changes. A threshold applied to the fractional difference maps is useful to mask out differences that are less significant or arise from noise. The choice of the threshold might depend on whether the differences arise from areas where the molecular volumes overlap, on the local dynamics of the molecule, and on the occupancy in the region of interest. In the case of the map–map comparison applied to GlyR (discussed above), the core of the ligands (which is better resolved than the periphery) could be located with a Df threshold of 0.4, while this threshold covers most or all of the ligand density (ADP-AlFx) in the case of the MKLP2 example. For validating atomic models fitted in maps, a Df threshold of 0.5 identifies most of the obvious misfits and atoms outside the molecular contour of the map. Subtle differences in backbone and side chains were visible above a threshold of about 0.25. These thresholds may be used as a guide, although different values might have to be tested in practice. The quality of the map–map (or map–model) alignment affects the differences obtained, and errors in alignment are observed as differences. For large-scale conformational changes or domain motions, the alignment of two maps may have to be anchored on the less dynamic segment of the molecular complex. Also, global scaling might be preferable in such cases, as local scaling works on the assumption that the equivalent parts of the maps are aligned.
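To make the idea of global scaling concrete, the following NumPy sketch matches the Fourier amplitudes of one map to those of a reference in concentric resolution shells and then takes the difference. It is a generic illustration under stated assumptions rather than the CCP-EM implementation: the function names, the number of shells, and the use of root-mean-square amplitudes per shell are choices made here, and both maps are assumed to be aligned and sampled on the same grid.

```python
import numpy as np


def shell_index(shape, n_shells):
    """Assign every Fourier voxel to a shell by its spatial-frequency radius."""
    axes = [np.fft.fftfreq(n) for n in shape]  # cycles per voxel
    grids = np.meshgrid(*axes, indexing="ij")
    radius = np.sqrt(sum(g ** 2 for g in grids))
    edges = np.linspace(0.0, radius.max(), n_shells + 1)
    # Interior edges only, so the returned indices run from 0 to n_shells - 1
    return np.digitize(radius, edges[1:-1])


def scale_to_reference(target, reference, n_shells=20):
    """Scale the amplitudes of `target` to those of `reference`, shell by shell."""
    ft_target = np.fft.fftn(target)
    ft_reference = np.fft.fftn(reference)
    shells = shell_index(target.shape, n_shells)
    scaled = ft_target.copy()
    for s in range(n_shells):
        sel = shells == s
        if not sel.any():
            continue
        amp_target = np.sqrt(np.mean(np.abs(ft_target[sel]) ** 2))
        amp_reference = np.sqrt(np.mean(np.abs(ft_reference[sel]) ** 2))
        if amp_target > 0:
            scaled[sel] *= amp_reference / amp_target
    # The shell factors are real and symmetric in frequency, so the result is real
    return np.real(np.fft.ifftn(scaled))


# Synthetic example: the second map is the first at a different overall scale
rng = np.random.default_rng(0)
map1 = rng.normal(size=(16, 16, 16))
map2 = 3.0 * map1
map2_scaled = scale_to_reference(map2, map1, n_shells=8)
difference = map1 - map2_scaled
print(float(np.abs(difference).max()))
```

Local scaling follows the same principle applied within a window around each voxel, which is why it depends on equivalent parts of the two maps already being superposed.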

Implementation

The difference map calculation method is implemented in the CCP-EM software package for electron cryo-microscopy.[9] The interface takes either two maps or a map and a model as input, and these should be aligned beforehand. If the map sizes and/or voxel spacings differ, they have to be resampled to a common grid. The input map(s) can be preprocessed to remove any background using the map processing tool in CCP-EM. This tool provides options to threshold or mask the map, remove dust, and add a soft edge to the masked map. To calculate differences between a map and an atomic model, a map simulated from the model can be generated externally and supplied as input. Alternatively, if the atomic model is used as the second input, a map is generated from the model using the TEMPy software package.[13] By default, the model is used as the reference for scaling, but this can be disabled. For calculating differences, both local and global scaling modes are provided as options for the user to choose from. For local scaling, a mask file should be provided that covers the area in which the scaling calculations will be done (note that this can be distinct from the mask used in map preprocessing). Ideally, this mask covers the useful molecular volumes of both inputs, and it is recommended to provide a mask. If a mask is not provided, a map contour threshold of 2.0 sigma is applied to the first map to create a mask. As expected, the local scaling calculation for the maps is much slower than the global calculation. For a map grid of size 100³, local scaling calculations take about 1 min 20 s, while global scaling for the same map takes 1.3 s on a single CPU. The interface provides links to visualize the difference densities in Chimera or Coot. The fractional difference maps Df,1–2 and Df,2–1 are also calculated by default. These maps can be used to color atomic models in Chimera, using the fractional difference values as attributes for atoms.
Optionally, a fractional difference threshold can be used to mask the output difference map calculated. All voxels with Df less than the threshold are masked out in the difference map. Similarly, a dust filter can be applied on the difference map as an option. This removes any dust after masking the differences at a given Df threshold (0.3 by default).
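The two optional post-processing steps can be sketched as follows. This is a minimal NumPy illustration, not the CCP-EM code: the function names, the face-connected definition of a dust particle, and the `min_voxels` size limit are assumptions made here, while the 0.3 default is the Df threshold quoted above.

```python
import numpy as np


def mask_by_fraction(difference, fraction, threshold=0.3):
    """Zero every voxel whose fractional difference is below the threshold."""
    return np.where(fraction >= threshold, difference, 0.0)


def remove_dust(volume, min_voxels=5):
    """Zero face-connected groups of non-zero voxels smaller than min_voxels."""
    occupied = volume != 0
    visited = np.zeros(volume.shape, dtype=bool)
    cleaned = volume.copy()
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    for seed in zip(*np.nonzero(occupied)):
        if visited[seed]:
            continue
        # Flood fill from the seed to collect one connected component
        stack, component = [seed], []
        visited[seed] = True
        while stack:
            z, y, x = stack.pop()
            component.append((z, y, x))
            for dz, dy, dx in steps:
                nb = (z + dz, y + dy, x + dx)
                inside = all(0 <= c < s for c, s in zip(nb, volume.shape))
                if inside and occupied[nb] and not visited[nb]:
                    visited[nb] = True
                    stack.append(nb)
        if len(component) < min_voxels:
            for voxel in component:
                cleaned[voxel] = 0.0
    return cleaned


# Synthetic example: one 2x2x2 blob and one isolated voxel above the threshold
difference = np.ones((6, 6, 6))
fraction = np.full((6, 6, 6), 0.1)
fraction[1:3, 1:3, 1:3] = 0.6
fraction[4, 4, 4] = 0.5
masked = mask_by_fraction(difference, fraction, threshold=0.3)
cleaned = remove_dust(masked, min_voxels=5)
print(int(np.count_nonzero(masked)), int(np.count_nonzero(cleaned)))
```

Applying the dust filter after the threshold, as in the text, matters because thresholding is what breaks low-level noise into small isolated fragments.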

Conclusions

We present an approach for calculation of difference densities for cryo-EM maps and implement this as a tool with a user-friendly interface in the CCP-EM package. The tests discussed here reflect its potential for comparing different EM reconstructions to identify compositional and conformational differences, as well as to evaluate atomic model fit in maps. The fractional difference values help to associate significance to the differences. Our multistep protocol produces relatively clean and interpretable difference maps. Nevertheless, a systematic study on the significance of difference densities will be useful to delineate differences arising from noise vs signal.
  29 in total

1.  Ivermectin, an unconventional agonist of the glycine receptor chloride channel.

Authors:  Q Shan; J L Haddrill; J W Lynch
Journal:  J Biol Chem       Date:  2001-01-18       Impact factor: 5.157

2.  Bsoft: image and molecular processing in electron microscopy.

Authors:  J B Heymann
Journal:  J Struct Biol       Date:  2001 Feb-Mar       Impact factor: 2.867

3.  Structural analysis of the ZEN-4/CeMKLP1 motor domain and its interaction with microtubules.

Authors:  Dilem Hizlan; Masanori Mishima; Peter Tittmann; Heinz Gross; Michael Glotzer; Andreas Hoenger
Journal:  J Struct Biol       Date:  2006-01       Impact factor: 2.867

4.  Features and development of Coot.

Authors:  P Emsley; B Lohkamp; W G Scott; K Cowtan
Journal:  Acta Crystallogr D Biol Crystallogr       Date:  2010-03-24

5.  Kinesin motility is driven by subdomain dynamics.

Authors:  Wonmuk Hwang; Matthew J Lang; Martin Karplus
Journal:  Elife       Date:  2017-11-07       Impact factor: 8.140

6.  Model-based local density sharpening of cryo-EM maps.

Authors:  Arjen J Jakobi; Matthias Wilmanns; Carsten Sachse
Journal:  Elife       Date:  2017-10-23       Impact factor: 8.140

7.  Recent developments in the CCP-EM software suite.

Authors:  Tom Burnley; Colin M Palmer; Martyn Winn
Journal:  Acta Crystallogr D Struct Biol       Date:  2017-05-31       Impact factor: 7.652

8.  The divergent mitotic kinesin MKLP2 exhibits atypical structure and mechanochemistry.

Authors:  Joseph Atherton; I-Mei Yu; Alexander Cook; Joseph M Muretta; Agnel Joseph; Jennifer Major; Yannick Sourigues; Jeffrey Clause; Maya Topf; Steven S Rosenfeld; Anne Houdusse; Carolyn A Moores
Journal:  Elife       Date:  2017-08-11       Impact factor: 8.140

9.  TEMPy: a Python library for assessment of three-dimensional electron microscopy density fits.

Authors:  Irene Farabella; Daven Vasishtan; Agnel Praveen Joseph; Arun Prasad Pandurangan; Harpal Sahota; Maya Topf
Journal:  J Appl Crystallogr       Date:  2015-06-27       Impact factor: 3.304

Review 10.  Structural Study of Heterogeneous Biological Samples by Cryoelectron Microscopy and Image Processing.

Authors:  H E White; A Ignatiou; D K Clare; E V Orlova
Journal:  Biomed Res Int       Date:  2017-01-15       Impact factor: 3.411
