Rangana Warshamanage, Keitaro Yamashita, Garib N Murshudov.
Abstract
EMDA, an open-source Python library for cryo-EM map and model manipulation, is presented with a specific focus on validation. Several functionalities of the library are illustrated through examples. The utility of local correlation as a metric for identifying map-model differences and unmodeled regions in maps is demonstrated, together with its use as a metric for map-model validation. The mapping of local correlation to individual atoms, and its use to draw insights on local signal variations, are discussed. EMDA's likelihood-based map overlay is demonstrated by superposing two domains in two related structures. The overlay is carried out first to bring both maps into the same coordinate frame and then to estimate the relative movement of the domains. Finally, map magnification refinement in EMDA is presented with an example that highlights the importance of adjusting the map magnification in structural comparison studies.
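The local correlation described in the abstract can be illustrated with a minimal NumPy sketch. This is not EMDA's API: the function names, the hard spherical kernel, the default radius and the periodic (FFT-based) boundary handling are all assumptions made for illustration. Local means, variances and covariances are obtained by convolving the maps with a normalised kernel, and the local correlation coefficient is the local covariance divided by the product of the local standard deviations.

```python
import numpy as np


def spherical_kernel(shape, radius):
    """Normalised spherical averaging kernel centred at the origin (wrap-around).

    Hypothetical helper: a hard sphere is only one possible kernel choice.
    """
    # fftfreq(n, d=1/n) returns the signed integer grid offsets 0, 1, ..., -1.
    grids = np.meshgrid(
        *[np.fft.fftfreq(n, d=1.0 / n) for n in shape], indexing="ij"
    )
    dist2 = sum(g ** 2 for g in grids)
    kernel = (dist2 <= radius ** 2).astype(float)
    return kernel / kernel.sum()


def local_average(volume, kernel_ft):
    """Convolve a volume with the kernel via FFT (periodic boundaries)."""
    return np.fft.ifftn(np.fft.fftn(volume) * kernel_ft).real


def local_correlation(map1, map2, radius=3):
    """Voxel-wise local correlation coefficient between two maps of equal shape."""
    kernel_ft = np.fft.fftn(spherical_kernel(map1.shape, radius))
    mean1 = local_average(map1, kernel_ft)
    mean2 = local_average(map2, kernel_ft)
    cov12 = local_average(map1 * map2, kernel_ft) - mean1 * mean2
    var1 = local_average(map1 * map1, kernel_ft) - mean1 ** 2
    var2 = local_average(map2 * map2, kernel_ft) - mean2 ** 2
    # Clip tiny negative variances caused by round-off before the square root.
    denom = np.sqrt(np.clip(var1, 0.0, None) * np.clip(var2, 0.0, None))
    out = np.zeros_like(cov12)
    np.divide(cov12, denom, out=out, where=denom > 1e-12)
    return out


# Example with synthetic data: a map correlates perfectly with itself.
rng = np.random.default_rng(0)
demo_map = rng.standard_normal((16, 16, 16))
demo_cc = local_correlation(demo_map, demo_map, radius=3)
```

Passing two halfmaps to such a function would give a halfmap local correlation, while passing a fullmap and a model-based map would give a map-model local correlation; voxels where either local variance vanishes are set to zero in this sketch.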
Keywords: Cryo-EM; EMDA; Likelihood; Local correlation; Magnification; Overlay
Year: 2021 PMID: 34915128 PMCID: PMC8935390 DOI: 10.1016/j.jsb.2021.107826
Source DB: PubMed Journal: J Struct Biol ISSN: 1047-8477 Impact factor: 2.867
Fig. 1. Architecture of the EMDA library. The three Python code layers are shown in blue and the layer of external libraries is shown in green. The black arrows show the data flow and the functional dependencies. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Table of notation.
| Notation | Description |
|---|---|
| fullmap | Map obtained by averaging half data reconstructed maps |
| | Covariance between random variables |
| | Variance of the random variable X |
| | 3D column vectors in real and Fourier space |
| | Convolution kernel |
| | Cryo-EM map number |
| | Local variance of |
| | Local covariance between |
| | Local correlation coefficient calculated between |
| | Local correlation calculated between halfmaps |
| | Local correlation in the fullmap ( |
| | Local correlation calculated between the fullmap and the atomic model-based map |
| | Two-dimensional Gaussian distribution with mean |
| 2 | |
| | A column vector formed by complex Fourier coefficients of observed maps |
| | A column vector formed by complex Fourier coefficients of unknown “true” maps |
| | A column vector formed by complex Fourier coefficients calculated from models |
| | A column vector formed by normalized complex Fourier coefficients of observed maps |
| | Normalized complex Fourier coefficients of |
| | Transformation matrix and translational vector in 3D to be applied for the map number |
| | Diagonal matrix formed by scale factors between the true and calculated Fourier coefficients |
| | Diagonal matrix formed by blurring parameters |
| | Diagonal matrix formed by variances of observational noise |
| | A column vector formed by the expectation values of the true maps |
| | FSC in the fullmap converted from halfmap FSC ( |
| | Diagonal matrix formed by square root of fullmap FSC values. It is also an estimate for FSC between fullmap and “true” signal ( |
| | Correlation coefficient between true maps |
| | Correlation coefficient between observed maps |
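The two FSC-related rows of the notation table can be made concrete with a short sketch. It assumes the commonly used relation FSC_full = 2 FSC_half / (1 + FSC_half) for converting a halfmap FSC to a fullmap FSC; the table row that would confirm the exact expression is truncated, so this is an assumption. The function names are hypothetical, and clipping negative values before the square root is a choice made here, not something taken from the source.

```python
import numpy as np


def fullmap_fsc_from_halfmap_fsc(fsc_half):
    """Convert halfmap FSC values to fullmap FSC values.

    Assumes the standard relation 2 * FSC_half / (1 + FSC_half);
    it is undefined at FSC_half = -1.
    """
    fsc_half = np.asarray(fsc_half, dtype=float)
    return 2.0 * fsc_half / (1.0 + fsc_half)


def fsc_fullmap_vs_true(fsc_half):
    """Square root of the fullmap FSC, an estimate of the FSC between
    the fullmap and the "true" signal (negative values clipped to zero,
    which is an assumption of this sketch)."""
    fsc_full = fullmap_fsc_from_halfmap_fsc(fsc_half)
    return np.sqrt(np.clip(fsc_full, 0.0, None))


# Example: per-shell halfmap FSC values (made-up numbers for illustration).
halfmap_fsc = np.array([1.0, 0.9, 0.5, 0.143, 0.0])
fullmap_fsc = fullmap_fsc_from_halfmap_fsc(halfmap_fsc)
signal_fsc = fsc_fullmap_vs_true(halfmap_fsc)
```

Under this relation a halfmap FSC of 0.143 corresponds to a fullmap FSC of about 0.25 and therefore to an estimated correlation with the true signal of about 0.5.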
It should be noted that the full covariance matrix is a Kronecker product (⊗) of a covariance matrix and the 2-dimensional identity matrix I2. The reason for this structure is that there is no correlation between the real and imaginary parts of the Fourier coefficients, and the variances of the real and imaginary parts of each Fourier coefficient are equal.
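The Kronecker structure can be checked numerically with `numpy.kron`. The sketch below uses a made-up 2 × 2 covariance between the Fourier coefficients of two maps; the numbers, the function name and the variable ordering (Re F1, Im F1, Re F2, Im F2) are assumptions chosen for illustration.

```python
import numpy as np


def real_imag_covariance(cov):
    """Expand a covariance between complex coefficients into the covariance
    of their real and imaginary parts, ordered (Re F1, Im F1, Re F2, Im F2, ...).
    """
    return np.kron(np.asarray(cov, dtype=float), np.eye(2))


# Hypothetical covariance between the Fourier coefficients of two maps.
cov_maps = np.array([[1.0, 0.6],
                     [0.6, 2.0]])
cov_full = real_imag_covariance(cov_maps)
# cov_full is 4 x 4:
# [[1.0, 0.0, 0.6, 0.0],
#  [0.0, 1.0, 0.0, 0.6],
#  [0.6, 0.0, 2.0, 0.0],
#  [0.0, 0.6, 0.0, 2.0]]
```

The zero entries pairing a real part with an imaginary part express the absence of correlation between them, and the repeated diagonal values express the equal variances of the real and imaginary parts.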
Fig. 2. Identifying model-map discrepancies by local correlation. a) EMD-5623 primary map density near residues Lys52-Val54 of chain U of the 3j9i model coloured by . b) the same density coloured by . c) corresponding atomic model coloured by showing low-correlation residues Lys52-Val54 of chain U. The figure was made with ChimeraX (Pettersen et al., 2021).
Fig. 3. Use of local correlation to identify unmodeled linoleic acid (LA) in the EMD-11203 map. a) unmodeled ligand density in the primary map coloured by the . High correlation indicates the presence of a strong signal. b) the same density coloured by calculated between the fullmap and the refined 6zge model using normalized and weighted densities. The correlation in this region is low compared to its surroundings. c) ligand density coloured by calculated between the fullmap and the refined model with LA using normalized and weighted densities. Densities in panels a, b and c were contoured at the same level. These panels were made with Chimera (Pettersen et al., 2004). d) distribution of atomic B values of refined LA, where the atoms are coloured by their B values. This panel was made with PyMOL (Schrödinger and DeLano, 2020). e) distribution of atomic correlation values at refined LA coordinates. and are shown in orange and blue, respectively. The atomic B values are shown in grey. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 4. Map superposition in EMDA illustrated using the EMD-21997 and EMD-21999 maps. (a) keeping the EMD-21997 map (i) static, the EMD-21999 map (ii) was moved to obtain the optimal overlay between them (iii). Starting from the overlaid maps (iv) and (v), RBDs were extracted using masks. The extracted RBDs (vi) and (vii) were superposed (viii) and their overlay was optimized (ix) in EMDA. The final values of relative rotation and translation are 3.38° and 1.76 Å, respectively. The same transformation was applied to the model 6x2a of the moving map. The superposition of the 6x29 (static, grey) and 6x2a (moving, cyan) RBD models is shown before (x) and after (xi) applying the domain transformation. This figure was made with Chimera (Pettersen et al., 2004). (b) FSC between static and moving RBDs before (blue) and after (orange) the overlay optimization. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 5. The magnification refinement in EMDA using Haemoglobin data (EMD-3651). (a) the superposition of the original (half1) map (in grey) on the initial map (in cyan) obtained by introducing a -5% magnification error on the original map is improved after magnification correction (adjusted map shown in cyan). This figure was made with Chimera (Pettersen et al., 2004). (b) FSC between initial and adjusted maps against the original map indicating improvement in the superposition due to correction in magnification. (c) FSC curves for initial and adjusted maps calculated against the half2 are shown in blue and orange, respectively. The increase of FSC from blue to orange is due to the improved magnification. The green curve is the FSC between the half maps and it serves as the ground truth. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 6. Magnification correction in the EMD-7770 and EMD-10574 maps relative to the crystallography model 3dyp. (a) the overlaid EMD-7770 (i, yellow) and EMD-10574 (ii, cyan) maps on the reference map (grey) before the magnification optimisation. (iii) and (iv) are the same maps after the optimisation. The magnification differences in EMD-7770 and EMD-10574 relative to the reference are +0.3% and +1.7%, respectively. The FSC curves for EMD-7770 and EMD-10574 maps against the reference before and after the magnification adjustment are shown in (v) and (vi), respectively. The blue and orange curves correspond to FSCs before and after the magnification refinement, respectively. The increase in FSC is attributed to the corrected magnification. This figure was made with Chimera (Pettersen et al., 2004). (b) comparison of EMD-7770 map (yellow) and EMD-10574 map (cyan) densities against the reference map (grey) in different regions before and after the magnification correction. See text for details. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 7. Movement of one monomer unit of EMD-7770 (yellow) relative to the corresponding monomer unit of the reference map (grey). Selected monomers are highlighted in (i) and those extracted are shown in (ii) before the fit optimisation. (iii) the monomer units after the fit optimisation. (iv) FSCs between monomer units before (blue) and after (orange) the fit optimisation. This figure was made with Chimera (Pettersen et al., 2004). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
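Several of the figure captions above compare maps through FSC curves. A minimal NumPy sketch of an FSC calculation between two maps is given below. It is not EMDA's implementation: the function name, the binning into integer-radius shells and the restriction to cubic maps sampled on the same grid are assumptions made for illustration.

```python
import numpy as np


def fsc(map1, map2, n_shells=None):
    """Fourier shell correlation between two maps of identical cubic shape.

    Shells are labelled by the rounded integer radius in Fourier index
    units, which corresponds to resolution shells only for cubic maps
    with the same voxel size (an assumption of this sketch).
    """
    f1 = np.fft.fftn(map1)
    f2 = np.fft.fftn(map2)
    # Signed integer Fourier indices along each axis.
    grids = np.meshgrid(
        *[np.fft.fftfreq(n, d=1.0 / n) for n in map1.shape], indexing="ij"
    )
    radius = np.sqrt(sum(g ** 2 for g in grids))
    shell = np.rint(radius).astype(int).ravel()
    if n_shells is None:
        n_shells = min(map1.shape) // 2
    keep = shell < n_shells  # discard the corners beyond the last full shell

    def shell_sum(values):
        return np.bincount(
            shell[keep], weights=values.ravel()[keep], minlength=n_shells
        )

    numerator = shell_sum((f1 * np.conj(f2)).real)
    power1 = shell_sum(np.abs(f1) ** 2)
    power2 = shell_sum(np.abs(f2) ** 2)
    denominator = np.sqrt(power1 * power2)
    curve = np.zeros(n_shells)
    np.divide(numerator, denominator, out=curve, where=denominator > 0)
    return curve


# Example with synthetic data: the FSC of a map with a scaled copy of itself.
rng = np.random.default_rng(1)
demo_map = rng.standard_normal((16, 16, 16))
demo_curve = fsc(demo_map, 2.5 * demo_map)
```

Because the FSC is normalised per shell, it is insensitive to an overall scale factor between the maps, whereas a misalignment or a magnification mismatch lowers it, increasingly so towards higher-resolution shells.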