Chloé Quignot, Guillaume Postic, Hélène Bret, Julien Rey, Pierre Granger, Samuel Murail, Pablo Chacón, Jessica Andreani, Pierre Tufféry, Raphaël Guerois.
Abstract
The InterEvDock3 protein docking server exploits the constraints of evolution by multiple means to generate structural models of protein assemblies. The server takes as input either several sequences or 3D structures of proteins known to interact. It returns a set of 10 consensus candidate complexes, together with interface predictions to guide further experimental validation interactively. Three key novelties were implemented in InterEvDock3 to help obtain more reliable models: users can (i) generate template-based structural models of assemblies using close and remote homologs of known 3D structure, detected through an automated search protocol, (ii) select the assembly models most consistent with contact maps from external methods that implement covariation-based contact prediction with or without deep learning and (iii) exploit a novel coevolution-based scoring scheme at atomic level, which leads to significantly higher free docking success rates. The performance of the server was validated on two large free docking benchmark databases, containing respectively 230 unbound targets (Weng dataset) and 812 models of unbound targets (PPI4DOCK dataset). Its effectiveness has also been proven on a number of challenging examples. The InterEvDock3 web interface is available at http://bioserv.rpbs.univ-paris-diderot.fr/services/InterEvDock3/.
Year: 2021 PMID: 33978743 PMCID: PMC8265070 DOI: 10.1093/nar/gkab358
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. General pipeline highlighting the three novel approaches implemented in the InterEvDock3 server. Mode 1 takes sequences as input and runs a template-based modeling protocol that searches exhaustively for close and remote homologs with HHsearch to generate models of homomeric and heteromeric assemblies. Mode 2 takes 3D structures of monomers or homomeric complexes and runs a free docking approach that tries to satisfy the contacts predicted in a contact map provided as input and obtained from servers such as ComplexContact (34), EvComplex (52) or trRosetta (35). Mode 3 takes 3D structures of monomers or multimeric complexes (possibly modeled from sequences in mode 1) and implements a new strategy for scoring interfaces with coevolution information: interface models are scored at the atomic level for 10 to 40 pairs of interolog sequences, using scores such as InterEvScore (37), SOAP-PP (38) or the Rosetta scoring function (39). Mode 1 returns a single complex model, whereas modes 2 and 3 return the 10 best models for every score calculated. Mode 3 also returns a consensus selection of the 10 best models and a prediction of the residues likely to be in the interface. Template-based modeling (mode 1) is in general more reliable than free docking and should be favored whenever possible. If large multiple sequence alignments can be obtained, it is advisable to first run the coevolution-derived contact mode (mode 2), which can strongly restrict the relative orientations of the docked partners. Mode 3 should be preferred when only shallow co-MSAs can be obtained.
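The guidance at the end of the Figure 1 caption can be read as a simple decision rule. The sketch below is only an illustration of that rule: the function name `suggest_mode` and its two boolean arguments are hypothetical and are not part of the InterEvDock3 server or its interface. What counts as a usable template or a "large" alignment is left to the user, since the caption gives no numerical threshold.

```python
def suggest_mode(has_usable_template: bool, has_deep_co_msa: bool) -> int:
    """Suggest which InterEvDock3 mode to try first (1, 2 or 3).

    Hypothetical helper that mirrors the advice in the Figure 1 caption.
    Both flags are judgment calls made by the user; no threshold is
    taken from the source.
    """
    if has_usable_template:
        # Template-based modeling is described as generally more reliable.
        return 1
    if has_deep_co_msa:
        # Large alignments allow coevolution-derived contacts to guide docking.
        return 2
    # Only shallow co-MSAs: fall back on coevolution-based atomic scoring.
    return 3


if __name__ == "__main__":
    print(suggest_mode(has_usable_template=False, has_deep_co_msa=True))
```

The rule returns only the mode to try first; as the caption notes, models built from sequences in mode 1 can also serve as input to mode 3.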
Figure 2. (A) Model automatically generated using InterEvDock3 in mode 1 for six human proteins forming a subcomplex of the inner kinetochore, displayed in the InterEvDock3 web interface. Despite very low sequence identity between human and yeast subunits (11.5% on average), HHsearch detected orthologous subunits for CENPI (13%), CENPO (13%), CENPL (11%), CENPP (11%), CENPN (10%) and CENPK (9%), as indicated in the table of templates, which lists all the complexes and subcomplexes that could be used as templates for this set of sequences. (B) Structural model obtained using InterEvDock3 in mode 2 (by applying the top 100 contacts predicted by the ComplexContact server) for the complex between E. coli MutS homodimer and MutL homodimeric N-terminal domain (PDB:1B63). In the crystal structure of the complex (5AKB), a single domain of MutL was crystallized. A PyMOL script, ‘start_analysis_cmap.pml’, is distributed in the downloadable results archive to display all the predicted contacts in the map that were satisfied in the model, as shown in the figure.
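To make the notion of a "satisfied" predicted contact in Figure 2B concrete, here is a minimal sketch of how the top-ranked contacts of a map could be checked against a docking model. It is not the server's implementation and does not reproduce the distributed PyMOL script. The data layout (one representative coordinate per residue), the function name `satisfied_contacts` and the 8 Å distance cutoff are all assumptions chosen for illustration.

```python
import math


def satisfied_contacts(predicted, coords_a, coords_b, top_n=100, cutoff=8.0):
    """Return the top-ranked predicted contacts that a model satisfies.

    predicted : list of (residue_in_A, residue_in_B, probability)
    coords_a, coords_b : dict mapping residue number -> (x, y, z) of one
        representative atom per residue (assumed layout)
    top_n : number of highest-probability contacts to consider
    cutoff : distance in angstroms below which a contact counts as
        satisfied (assumed value, not taken from the source)
    """
    # Keep only the top_n contacts, ranked by predicted probability.
    ranked = sorted(predicted, key=lambda c: c[2], reverse=True)[:top_n]
    hits = []
    for res_a, res_b, prob in ranked:
        if res_a not in coords_a or res_b not in coords_b:
            continue  # residue missing from the model
        if math.dist(coords_a[res_a], coords_b[res_b]) <= cutoff:
            hits.append((res_a, res_b, prob))
    return hits


if __name__ == "__main__":
    # Toy coordinates, purely illustrative.
    chain_a = {1: (0.0, 0.0, 0.0), 2: (10.0, 0.0, 0.0)}
    chain_b = {5: (3.0, 4.0, 0.0), 6: (30.0, 0.0, 0.0)}
    contacts = [(1, 5, 0.9), (2, 6, 0.8), (2, 5, 0.5)]
    print(satisfied_contacts(contacts, chain_a, chain_b, top_n=2))
```

Counting the contacts returned for each candidate model gives one simple way to rank models by agreement with the map; the scoring actually used by mode 2 may differ.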
Figure 3. (A) Increased performance of the InterEvDock3 (IED3) free docking server (Figure 1, mode 3) benchmarked on the Weng and PPI4DOCK datasets, compared with the success rates obtained with the previous version, InterEvDock2 (IED2) (53). Success rates are shown for three options that can be activated in the web interface. (B) Example of a docking model obtained for a target of the PPI4DOCK dataset involving unbound models of E2 and E3A ligases. The model representation was obtained from the PyMOL script available in the InterEvDock3 results archive. The coloring corresponds to either conservation (yellow-red scale) or the probability of a residue being part of the interface (white-green scale).
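This record does not define the success rates compared in Figure 3A. A common convention in docking benchmarks, assumed here, is to count a target as a success when at least one of its top-N ranked models is of acceptable quality. The sketch below computes that fraction. The function name, the input layout and the default N = 10 (chosen to match the 10 models returned by the server) are illustrative assumptions, not the authors' evaluation code.

```python
def top_n_success_rate(acceptable_by_target, n=10):
    """Fraction of targets with at least one acceptable model in the top n.

    acceptable_by_target : dict mapping target id -> list of booleans in
        rank order, True where the model at that rank is acceptable
        (the acceptability criterion itself is left to the caller).
    """
    if not acceptable_by_target:
        return 0.0
    successes = sum(
        1 for flags in acceptable_by_target.values() if any(flags[:n])
    )
    return successes / len(acceptable_by_target)


if __name__ == "__main__":
    # Toy benchmark of three targets, purely illustrative.
    toy = {
        "target_1": [False, True],             # acceptable model at rank 2
        "target_2": [False] * 12,              # no acceptable model
        "target_3": [False] * 10 + [True],     # acceptable only at rank 11
    }
    print(top_n_success_rate(toy, n=10))
```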