
MultiFit: a web server for fitting multiple protein structures into their electron microscopy density map.

Elina Tjioe, Keren Lasker, Ben Webb, Haim J Wolfson, Andrej Sali.

Abstract

Advances in electron microscopy (EM) allow for structure determination of large biological assemblies at increasingly higher resolutions. A key step in this process is fitting multiple component structures into an EM-derived density map of their assembly. Here, we describe a web server for this task. The server takes as input a set of protein structures in the PDB format and an EM density map in the MRC format. The output is an ensemble of models ranked by their quality of fit to the density map. The models can be viewed online or downloaded from the website. The service is available at http://salilab.org/multifit/ and http://bioinfo3d.cs.tau.ac.il/.


Year:  2011        PMID: 21715383      PMCID: PMC3125811          DOI: 10.1093/nar/gkr490

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


SIGNIFICANCE

Macromolecular assemblies are involved in nearly all cellular processes. Determining the structures of these biological machines is crucial for deciphering their function. Recent advances established electron microscopy as a central technique for studying the structures of macromolecular assemblies in different functional states in vitro and in vivo. Because the resolution of an electron microscopy density map is relatively low, fitting of atomic resolution component structures into the density map of the whole assembly is essential. MultiFit is the first web server for achieving this task.

INTRODUCTION

Recent advances have established electron microscopy (EM) as a central technique for studying the structures of macromolecular assemblies in different functional states in vitro and in vivo (1). The resolution of an EM density map is typically better than 25 Å, and can be as high as ∼4 Å for highly symmetric structures (2,3). In most cases, however, the resolution is insufficient to construct a full atomic model of a protein complex. Fitting atomic-resolution structures into an EM density map of the whole assembly is therefore essential (4–8). In the past decade, different algorithms have been developed for fitting a single protein subunit into its density map (9–20). Most methods use a variant of the cross-correlation coefficient as the quality-of-fit measure (21). The position of a protein subunit inside the density map is sampled either exhaustively or by matching precalculated geometric features. Methods for fitting multiple components of large assemblies have also been recently described (22–25). In particular, we have developed the MultiFit module of the Integrative Modeling Platform (IMP, http://www.salilab.org/imp/) software package (23,26). MultiFit simultaneously positions protein subunits into a density map of a protein assembly by combining geometric criteria commonly used in molecular docking and quality-of-fit criteria commonly used in EM fitting. The method was validated in the 2010 EM modeling challenge (http://ncmi.bcm.edu/challenge/). Here, we present a web interface to MultiFit. The server takes as input a set of protein structures in the PDB format and an EM density map in the MRC format. The output is an ensemble of models ranked by their quality of fit to the density map. The models can be viewed online or downloaded from the website.
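To illustrate the quality-of-fit measure mentioned above, the following minimal sketch computes a mean-centered, normalized cross-correlation coefficient between two density grids sampled on the same voxels. It is a generic Pearson-style variant, not necessarily the exact score used by MultiFit or by any of the cited methods; the function name and the flattened-list representation of a map are assumptions made for brevity.

```python
import math

def cross_correlation(map_a, map_b):
    """Mean-centered, normalized cross-correlation of two equally sized voxel lists."""
    if len(map_a) != len(map_b) or not map_a:
        raise ValueError("maps must be non-empty and of the same size")
    mean_a = sum(map_a) / len(map_a)
    mean_b = sum(map_b) / len(map_b)
    # Covariance-like numerator and the product of the two spreads
    num = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    den = math.sqrt(sum((a - mean_a) ** 2 for a in map_a) *
                    sum((b - mean_b) ** 2 for b in map_b))
    # A constant map carries no shape information; report zero correlation
    return num / den if den > 0 else 0.0

# Toy example: the simulated map is a scaled and shifted copy of the experimental one
experimental = [0.0, 0.2, 0.9, 0.4, 0.0]
simulated = [2 * v + 1 for v in experimental]
score = cross_correlation(experimental, simulated)
```

Because the coefficient is invariant to scaling and offset of the density values, a map simulated from a fitted model can be compared with the experimental map without first matching their intensity scales.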

THE MULTIFIT METHOD

MultiFit is a method for simultaneously fitting atomic-resolution protein structures into their assembly density map at resolutions as low as 25 Å. The input is a set of atomic structures of proteins and an EM density map of their assembly. The component positions and orientations are optimized with respect to a scoring function that includes the quality-of-fit of components in the map, the protrusion of components from the map envelope and the shape complementarity between pairs of components. The scoring function is optimized by an exact inference optimizer, DOMINO (Discrete Optimization of Multiple INteracting Objects), that efficiently finds the global minimum within a discrete sampling space. Specifically, the optimization algorithm is composed of four stages, each sampling assembly models at increasingly higher resolution and accuracy. In the ‘anchor graph segmentation’ stage, an unlabeled segmentation of the density map into regions is calculated using a Gaussian mixture model; the segmented regions correspond approximately to the subunits in the complex. In the ‘fitting-based assembly configuration’ stage, a set of coarse assembly models is found by an enumeration over possible assignments of subunits to regions, followed by simultaneous local fitting of the subunits in the corresponding regions. In the ‘docking-based pose refinement’ stage, each of the models found in the ‘configuration’ stage is refined by simultaneous local optimization of the interfaces between pairs of interacting subunits as sampled by local pairwise docking. In the ‘rigid body minimization’ stage, each of the models found in the ‘refinement’ stage is further refined using a local Monte Carlo/conjugate gradients minimization procedure. The default run of the MultiFit web server omits the final refinement stage. Users can explore the ensemble of solutions generated by the first three stages and then refine a subset of the ensemble using a downloaded version of MultiFit.
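The enumeration idea behind the ‘fitting-based assembly configuration’ stage can be sketched as follows. This is a brute-force toy, not the MultiFit implementation: MultiFit uses the DOMINO optimizer and a score combining quality-of-fit, protrusion and shape complementarity, whereas here a hypothetical volume-mismatch term stands in for the score, and all names are assumptions.

```python
from itertools import permutations

def rank_assignments(subunit_volumes, region_volumes, top_k=3):
    """Enumerate one-to-one subunit-to-region assignments, lowest mismatch first.

    subunit_volumes: dict of subunit name -> volume (stand-in for a fit score)
    region_volumes:  list of volumes of the segmented map regions
    """
    names = sorted(subunit_volumes)
    if len(names) != len(region_volumes):
        raise ValueError("need exactly one region per subunit")
    scored = []
    for order in permutations(range(len(region_volumes))):
        # Hypothetical score: total absolute volume mismatch of this assignment
        mismatch = sum(abs(subunit_volumes[n] - region_volumes[r])
                       for n, r in zip(names, order))
        scored.append((mismatch, dict(zip(names, order))))
    scored.sort(key=lambda item: item[0])
    return scored[:top_k]

subunits = {"A": 30.0, "B": 55.0, "C": 80.0}
regions = [78.0, 31.0, 57.0]
best_mismatch, best_assignment = rank_assignments(subunits, regions)[0]
```

The number of assignments grows factorially with the number of subunits, which is one reason an inference-based optimizer over a discrete sampling space is preferable to exhaustive enumeration for larger assemblies.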
For cyclic symmetric complexes, the symmetry is imposed within the optimization procedure for improved efficiency, such that only symmetric models are sampled. In particular, in ‘fitting-based assembly configuration’ and ‘docking-based pose refinement’, only cyclic symmetric models consistent with the symmetry of the density map are sampled (26).
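As a minimal illustration of why imposing cyclic symmetry reduces the search, the sketch below generates all copies of a ring from a single monomer placement. It assumes that the symmetry axis is the z axis through the origin and that coordinates are plain (x, y, z) tuples; neither assumption comes from the MultiFit code.

```python
import math

def cyclic_copies(coords, order):
    """Return `order` copies of coords, copy k rotated by 2*pi*k/order about the z axis."""
    if order < 2:
        raise ValueError("cyclic symmetry order must be at least 2")
    copies = []
    for k in range(order):
        angle = 2.0 * math.pi * k / order
        c, s = math.cos(angle), math.sin(angle)
        # Rotate each point in the xy plane; z is unchanged
        copies.append([(c * x - s * y, s * x + c * y, z) for x, y, z in coords])
    return copies

monomer = [(10.0, 0.0, 0.0), (12.0, 1.0, 3.0)]
ring = cyclic_copies(monomer, 7)  # e.g. a seven-membered ring
```

Only the placement of one monomer has to be sampled; the remaining copies follow from the symmetry operation, so every sampled model is symmetric by construction.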

WEB SERVER

Input

The MultiFit web server requires as input a set of protein structures in the PDB format, an EM density map of their assembly in the MRC format, and a few parameters (Figure 1). The parameters for the density map include: (i) resolution (Å) (27); (ii) voxel spacing on the grid representing the map (Å); and (iii) the contour level that results in the volume accommodating the molecular mass of the complex. These parameters are included for maps deposited in the EM Data Bank (EMDB) (28).
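For maps where the contour level is not already known, one rough way to estimate it is sketched below: choose the threshold whose enclosed volume matches the volume expected from the molecular mass. The conversion factor of about 1.21 Å^3 per dalton is a commonly quoted approximation for proteins and is an assumption here, as are the function name and the flat-list map representation; this is not the procedure used by the server or by EMDB.

```python
def contour_level_for_mass(densities, voxel_spacing, mass_da, a3_per_da=1.21):
    """Pick the threshold whose enclosed volume best matches the expected volume.

    densities:     flat list of voxel values
    voxel_spacing: grid spacing in Å
    mass_da:       molecular mass of the complex in Da
    a3_per_da:     assumed protein volume per dalton (Å^3/Da)
    """
    # Number of voxels needed to hold the expected molecular volume
    target_voxels = round(mass_da * a3_per_da / voxel_spacing ** 3)
    if not 1 <= target_voxels <= len(densities):
        raise ValueError("expected volume does not fit the map")
    # The level is the density of the weakest voxel among the strongest `target_voxels`
    ranked = sorted(densities, reverse=True)
    return ranked[target_voxels - 1]

toy_map = [0.1, 0.9, 0.5, 0.7, 0.3, 0.0, 0.8, 0.2]
level = contour_level_for_mass(toy_map, voxel_spacing=2.0, mass_da=20.0)
```

In the toy call, the expected volume corresponds to three voxels, so the returned level is the third-highest density in the map.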
Figure 1. Snapshots of the MultiFit web server. (A) Input page. The inputs are divided into three parts: (i) general information, (ii) density map information and (iii) protein complex information. Seven copies of the GroEL chaperon monomer [PDB entry 1oel (33)] are simultaneously fitted to its ring density map at 11.5-Å resolution [EMDB entry 1080 (34)] using cyclic symmetry mode. The input subunit PDB file and the input assembly density map used for this example can be obtained from the MultiFit web server help page. The input parameters for the resolution, spacing, contour level and symmetry order obtained from the EMDB site are 11.5, 2.7, 0.852 and 7, respectively. The optional parameters for X, Y and Z origins are set to –50, –50 and –50, respectively. (B) Output page. The top 20 assembly models of the GroEL chaperon complex are ranked according to the quality-of-fit score from top left to bottom right. The user can click on the model thumbnail to open it using Chimera for further analysis. The PDB files and the transformation output file can be downloaded. Job results will be available for 6 days. (C) Top-scoring structural model of the GroEL ring fitted into the density map.

The MultiFit web server operates in two modes: cyclic-symmetric and non-symmetric. In the cyclic-symmetric mode, the symmetry order should be provided (2 for dimer, 3 for trimer, etc.). If the arrangement of the input monomers in their native complex follows a different type of symmetry, the user should use the downloaded version of MultiFit. In the non-symmetric mode, a list of subunit PDB files and the number of copies of each subunit are required. The input density map should be pre-segmented to contain only the input set of proteins. The server also has an optional input parameter specifying an e-mail address to which a link to the results page will be sent once the job is completed. Alternatively, the user can bookmark a web link to the results page at the time of data submission.
The status of the job (queued, running or finished) can be accessed on the queue page.
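The input requirements of the two modes can be summarized as a small consistency check, sketched below. The class, its field names and the rule that the cyclic mode takes a single monomer file are assumptions for illustration only; they do not describe the server's actual form fields or API.

```python
from dataclasses import dataclass, field

@dataclass
class JobSpec:
    """Hypothetical description of one submission (not the server's real schema)."""
    resolution: float          # Å
    spacing: float             # Å per voxel
    contour_level: float
    mode: str = "cyclic"       # "cyclic" or "non-symmetric"
    symmetry_order: int = 0    # used in cyclic mode only
    subunits: dict = field(default_factory=dict)  # PDB file name -> number of copies

    def problems(self):
        """Return a list of human-readable problems; empty if the spec looks usable."""
        issues = []
        if self.resolution <= 0 or self.spacing <= 0:
            issues.append("resolution and spacing must be positive")
        if self.mode == "cyclic":
            if self.symmetry_order < 2:
                issues.append("cyclic mode needs a symmetry order of 2 or more")
            if len(self.subunits) != 1:
                issues.append("cyclic mode expects a single monomer PDB file")
        elif self.mode == "non-symmetric":
            if not self.subunits:
                issues.append("non-symmetric mode needs at least one subunit PDB file")
            if any(copies < 1 for copies in self.subunits.values()):
                issues.append("each subunit needs at least one copy")
        else:
            issues.append("unknown mode: " + self.mode)
        return issues

# Values from the GroEL example in Figure 1; the file name is a placeholder
groel = JobSpec(11.5, 2.7, 0.852, mode="cyclic", symmetry_order=7,
                subunits={"monomer.pdb": 1})
```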

Output

The computation is performed in real time and the server page is updated once the calculation has finished. The typical running time is about 20 min for assemblies with tens of thousands of atoms. The web server output page displays a table of the top 20 assembly models that best fit the assembly density map, along with their quality-of-fit scores ranked from top left to bottom right (Figure 1). MultiFit lists the optimal as well as suboptimal solutions; when the latter have good scores and are different from the optimal solution, the user should be skeptical about all solutions and further analyze the ensemble. Each model can be saved as a PDB file and can also be directly opened with UCSF Chimera (19). A compressed file containing all models is available for download. Moreover, the MultiFit output text file can be downloaded. Row i lists the transformation applied to each of the subunits, the model quality-of-fit score, and the geometric complementarity score for model i. This output file can be used as input to IMP for further refinement and analysis. It can also be used as input for refining symmetric complexes using the SymmRef method (29).
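The advice to be skeptical when suboptimal solutions score well but differ from the optimal one can be turned into a simple check, sketched below. It assumes that a higher score means a better fit, that all models share the coordinate frame of the map (so no superposition is needed before computing the RMSD), and that the two thresholds are arbitrary illustrative values; none of this reflects the MultiFit output format.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

def ambiguous_alternatives(models, score_tolerance=0.05, rmsd_cutoff=5.0):
    """Return ranks (1-based) of models that score close to the best but differ from it.

    models: list of (score, coords) pairs; higher score is assumed to be better.
    """
    ranked = sorted(models, key=lambda m: m[0], reverse=True)
    best_score, best_coords = ranked[0]
    flagged = []
    for rank, (score, coords) in enumerate(ranked[1:], start=2):
        close_in_score = best_score - score <= score_tolerance
        different = rmsd(best_coords, coords) > rmsd_cutoff
        if close_in_score and different:
            flagged.append(rank)
    return flagged

models = [
    (0.90, [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]),   # best model
    (0.89, [(0.0, 0.0, 1.0), (10.0, 0.0, 1.0)]),   # nearly the same placement
    (0.88, [(10.0, 0.0, 0.0), (0.0, 0.0, 0.0)]),   # subunits swapped
    (0.60, [(50.0, 0.0, 0.0), (60.0, 0.0, 0.0)]),  # clearly worse score
]
flagged = ambiguous_alternatives(models)
```

A non-empty result indicates that the density map alone does not single out one arrangement, so the ensemble should be analyzed further before any model is interpreted.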

CONCLUSIONS

With the growing number of macromolecular assemblies characterized by EM, integrative modeling techniques are becoming increasingly useful for a mechanistic understanding of these assemblies (6,30–32). The MultiFit web server was designed to provide a user-friendly web interface to the MultiFit module in the IMP package, for fitting multiple protein structures into their assembly density map.

FUNDING

The Clore Foundation Ph.D Scholars program (to K.L.); the Israel Science Foundation (1403/09) and the Hermann Minkowski-Minerva Center for Geometry at Tel Aviv University (H.J.W.); and the Sandler Family Supporting Foundation, National Institutes of Health (R01 GM54762, U54 RR022220, PN2 EY016525 and R01 GM083960), Hewlett-Packard, NetApp, IBM and Intel (to A.S.).

Conflict of interest statement. None declared.
  33 in total

Review 1.  Fitting of high-resolution structures into electron microscopy reconstruction images.

Authors:  Felcy Fabiola; Michael S Chapman
Journal:  Structure       Date:  2005-03       Impact factor: 5.006

2.  ADP_EM: fast exhaustive multi-resolution docking for high-throughput coverage.

Authors:  José Ignacio Garzón; Julio Kovacs; Ruben Abagyan; Pablo Chacón
Journal:  Bioinformatics       Date:  2006-12-06       Impact factor: 6.937

3.  EMatch: discovery of high resolution structural homologues of protein domains in intermediate resolution cryo-EM maps.

Authors:  Keren Lasker; Oranit Dror; Maxim Shatsky; Ruth Nussinov; Haim J Wolfson
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2007 Jan-Mar       Impact factor: 3.710

4.  NORMA: a tool for flexible fitting of high-resolution protein structures into low-resolution electron-microscopy-derived density maps.

Authors:  Karsten Suhre; Jorge Navaza; Yves Henri Sanejouand
Journal:  Acta Crystallogr D Biol Crystallogr       Date:  2006-08-19

5.  Multi-resolution anchor-point registration of biomolecular assemblies and their components.

Authors:  Stefan Birmanns; Willy Wriggers
Journal:  J Struct Biol       Date:  2006-08-25       Impact factor: 2.867

Review 6.  Integrating diverse data for structure determination of macromolecular assemblies.

Authors:  Frank Alber; Friedrich Förster; Dmitry Korkin; Maya Topf; Andrej Sali
Journal:  Annu Rev Biochem       Date:  2008       Impact factor: 23.643

Review 7.  The molecular sociology of the cell.

Authors:  Carol V Robinson; Andrej Sali; Wolfgang Baumeister
Journal:  Nature       Date:  2007-12-13       Impact factor: 49.962

8.  Inferential optimization for simultaneous fitting of multiple components into a CryoEM map of their assembly.

Authors:  Keren Lasker; Maya Topf; Andrej Sali; Haim J Wolfson
Journal:  J Mol Biol       Date:  2009-02-20       Impact factor: 5.469

9.  Multiple subunit fitting into a low-resolution density map of a macromolecular complex using a gaussian mixture model.

Authors:  Takeshi Kawabata
Journal:  Biophys J       Date:  2008-08-15       Impact factor: 4.033

Review 10.  Hybrid approaches: applying computational methods in cryo-electron microscopy.

Authors:  Steffen Lindert; Phoebe L Stewart; Jens Meiler
Journal:  Curr Opin Struct Biol       Date:  2009-03-30       Impact factor: 6.809

