RosettaAntibody: antibody variable region homology modeling server.

Aroop Sircar, Eric T Kim, Jeffrey J Gray.

Abstract

The RosettaAntibody server (http://antibody.graylab.jhu.edu) predicts the structure of an antibody variable region given the amino-acid sequences of the respective light and heavy chains. In an initial stage, the server identifies and displays the most sequence homologous template structures for the light and heavy framework regions and each of the complementarity determining region (CDR) loops. Subsequently, the most homologous templates are assembled into a side-chain optimized crude model, and the server returns a picture and coordinate file. For users requesting a high-resolution model, the server executes the full RosettaAntibody protocol which additionally models the hyper-variable CDR H3 loop. The high-resolution protocol also relieves steric clashes by optimizing the CDR backbone torsion angles and by simultaneously perturbing the relative orientation of the light and heavy chains. RosettaAntibody generates 2000 independent structures, and the server returns pictures, coordinate files, and detailed scoring information for the 10 top-scoring models. The 10 models enable users to use rational judgment in choosing the best model or to use the set as an ensemble for further studies such as docking. The high-resolution models generated by RosettaAntibody have been used for the successful prediction of antibody-antigen complex structures.

Year: 2009 PMID: 19458157 PMCID: PMC2703951 DOI: 10.1093/nar/gkp387

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

Therapeutic monoclonal antibodies are a genre of biopharmaceuticals which has benefitted healthcare in various fields from oncology to immune and inflammatory disorders. Development of successful novel therapeutic antibodies requires understanding of drug and disease mechanisms and the ability to stabilize, affinity mature, and humanize antibodies. Antibody structures can help overcome these challenges by providing atomic level insights into structure–function relationships and the antibody–antigen interaction [e.g. see refs. (1–4)]. However, experimental techniques for obtaining antibody structures, like X-ray crystallography and nuclear magnetic resonance, are laborious, time consuming and costly. Computational antibody structure prediction provides a fast and inexpensive route to obtain structures, including those which are not obtainable otherwise. Two antibody variable region (FV) modeling servers are available on the Internet: the Web Antibody Modeling (WAM) (5) and Prediction of Immunoglobulin Structure (PIGS) (6) servers. WAM can require several days to output one antibody model in response to a submitted query sequence. No information on templates used for modeling the antibody is provided. Furthermore, antibody structures predicted with WAM have internal clashes and their inaccuracies can confound computational docking (2,7). The PIGS server returns an antibody model in about a minute and displays the antibody crystal structures that it selects as templates. The PIGS models are generated by grafting complementarity determining region (CDR) loops onto selected framework templates, even for the hyper-variable and non-canonical CDR H3 loop. Accurate CDR H3 predictions would only be expected when a similar CDR H3 loop is present in the database, which is unlikely for novel antibody sequences. The existing servers do not provide high-resolution refinement of antibody structures and do not consider thermodynamics during modeling. 
RosettaAntibody (7) is a homology modeling program within the Rosetta suite (8) for predicting high-resolution antibody FV structures. The prediction includes modeling CDR H3 loop conformations, and it uses a simple free energy function to relieve steric clashes by simultaneously optimizing the CDR loop backbone dihedral angles, the relative orientation of the light (VL) and heavy (VH) chains, and the side chain conformations. A crude model where all the CDRs are grafted from template structures can be provided in a few minutes, and high-resolution models can be generated by the full RosettaAntibody protocol running on a cluster of computers in about a day (users may sometimes need to wait for other jobs in the queue). The 10 top-scoring RosettaAntibody models can be used in docking techniques such as EnsembleDock (9) that can select binding-competent conformers during docking. A few limitations of RosettaAntibody have been that (i) execution of multiple scripts for the identification of templates can be complex, (ii) finding the template structures that have been used in modeling can be challenging given the large number of intermediate files that are generated, (iii) the Rosetta command-line interface can be difficult to use and (iv) it requires significant computational time to generate all-atom models, requiring a cluster of computers. To overcome these limitations and to make the high-resolution modeling available to a broader community, we have developed the RosettaAntibody server (http://antibody.graylab.jhu.edu), where the interface is simple and modest computing resources are provided.

PROCESSING METHOD

RosettaAntibody predicts the structure of the FV region in two stages. The first stage identifies the CDR loops and the framework regions in the input sequences, chooses the most sequence homologous templates for each respective segment, grafts the template CDRs onto template frameworks, and finally optimizes the side chains of all the residues in the assembled model. The crude model generated in the first stage is used as an input to the second stage. The second stage of RosettaAntibody is a multi-start, multi-scale Monte Carlo-plus-minimization algorithm that generates two thousand candidate structures. The second stage of the algorithm is split into a low-resolution and a high-resolution phase. The low-resolution phase represents side chains as single pseudo-atoms (10) and generates candidate CDR H3 loop conformations via fragment assembly and cyclic coordinate descent (11) in a Monte Carlo loop. Scoring in the low-resolution phase favors nonlocal properties of native protein structures such as hydrophobic burial, compactness, pairing of β-strands and closure of the chain gap during loop building (12). The high-resolution phase iteratively performs the following: (i) optimizes side chains via rotamer packing and continuous minimization (13), (ii) perturbs CDR backbone torsion angles and the relative orientation of the light and heavy chains, and (iii) uses gradient-based minimization over the CDR torsion angles and the light chain-heavy chain displacement. The high-resolution energy function includes van der Waals energy, orientation-dependent hydrogen bonding (14), implicit Gaussian solvation (15), side-chain rotamer propensities (16), and a low-weighted distance-dependent dielectric electrostatic energy (17). Complete methodological details are provided in ref. (7).
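The multi-start Monte Carlo-plus-minimization idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the one-dimensional `toy_energy` landscape, the finite-difference minimizer, the move size, the temperature `kT` and all function names are hypothetical stand-ins and are not the Rosetta energy function, moves or parameters. The sketch only shows the control flow: perturb, minimize, accept or reject with the Metropolis criterion, repeat from many independent starts, and keep the lowest-scoring results.

```python
import math
import random


def toy_energy(x):
    # Hypothetical rugged 1-D landscape standing in for a real score function.
    return 0.1 * x * x + math.sin(3.0 * x)


def minimize(x, energy, step=0.01, iters=200):
    # Crude gradient descent with a central finite-difference gradient.
    for _ in range(iters):
        grad = (energy(x + 1e-5) - energy(x - 1e-5)) / 2e-5
        x -= step * grad
    return x


def mc_plus_min(energy, rng, n_cycles=50, kT=0.8, max_move=2.0):
    # One independent trajectory: random start, then perturb + minimize cycles.
    x = minimize(rng.uniform(-5.0, 5.0), energy)
    e = energy(x)
    best_x, best_e = x, e
    for _ in range(n_cycles):
        trial = minimize(x + rng.uniform(-max_move, max_move), energy)
        e_trial = energy(trial)
        # Metropolis criterion applied to the minimized energies.
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / kT):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


def multi_start(energy, n_decoys=20, n_top=10, seed=0):
    # Run independent trajectories and return the n_top lowest-energy results.
    rng = random.Random(seed)
    decoys = [mc_plus_min(energy, rng) for _ in range(n_decoys)]
    decoys.sort(key=lambda d: d[1])
    return decoys[:n_top]


if __name__ == "__main__":
    for x, e in multi_start(toy_energy):
        print(f"x = {x:7.3f}  energy = {e:7.3f}")
```

In this toy setting the ranked list plays the role of the 10 top-scoring models; the server's actual protocol operates on backbone torsion angles, chain orientation and side-chain rotamers as described in ref. (7).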

INPUTS AND OUTPUTS

Input

Amino-acid sequences of the light and heavy chains of the FV region are submitted to the server by either pasting the sequences in the appropriate field or by uploading them as two separate FASTA formatted files. Since the RosettaAntibody server models only the FV region of the antibody, FC and leader sequences should be truncated from the input sequence prior to submission. To uniquely identify each job, the user must specify the name of the antibody and can optionally specify the user's name and an email address for notification when the modeling task has finished.
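As a rough illustration of preparing input, the sketch below reads one FASTA record per chain and checks that the sequence contains only the 20 standard amino-acid letters. It is an assumption about sensible client-side checking, not a description of the server's own validation; the function name `read_single_fasta` and the short sequence fragments are hypothetical placeholders, not real antibody sequences.

```python
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def read_single_fasta(text):
    """Return (header, sequence) from FASTA text holding exactly one record."""
    header = None
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                raise ValueError("expected exactly one sequence per file")
            header = line[1:].strip()
        else:
            if header is None:
                raise ValueError("sequence data found before a '>' header line")
            chunks.append(line.upper())
    if header is None:
        raise ValueError("no FASTA record found")
    sequence = "".join(chunks)
    if not sequence:
        raise ValueError("empty sequence")
    bad = sorted(set(sequence) - STANDARD_AA)
    if bad:
        raise ValueError("non-standard residue codes: " + ", ".join(bad))
    return header, sequence


if __name__ == "__main__":
    # Placeholder fragments for illustration only.
    light = read_single_fasta(">light\nDIQMTQ\nSPSS\n")
    heavy = read_single_fasta(">heavy\nEVQLVESGGG\n")
    print(light, heavy)
```

A check of this kind does not detect FC or leader residues left in the sequence; trimming those remains the user's responsibility, as noted above.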

Output

Figure 1 shows a representative output page from the RosettaAntibody server. The top of the page summarizes the details of the respective job, e.g. the name of the antibody, the name of the user (if provided), and the date and time of submission, execution and completion. A chart shows the boundaries of the CDR loops and the framework regions. Next, a table displays the most sequence-homologous template [identified by BLAST (18)] for each antibody segment (VH and VL frameworks, CDRs L1, L2, L3, H1, H2 and H3). In a series of selectable panes, the table also displays the top seven templates for each antibody segment. A picture of the crude FV model, formed by joining the top templates, is shown next on the output page with a link for downloading a file containing the coordinates and various energies. Residue numbering of all models generated by RosettaAntibody follows Chothia's antibody numbering scheme (19).

Figure 1.

Sample results page provided by the server for monoclonal antibody 14B7 (49). Inset shows the overlay of the 10 top-scoring RosettaAntibody models which appears further down the page (antibody framework regions, gray; light chain CDRs, green; CDRs H1 and H2, blue; CDR H3, red).

Web page images are generated with MolScript (50). If requested, the next section of the page will show the 10 top-scoring high-resolution structures in rank order by energy. Each model output file includes the scoring data of individual energy terms (van der Waals, solvation, hydrogen bonding energies, etc.) for the whole FV model as well as residue-by-residue breakdowns. Finally, the web page shows a picture of a superposition of the 10 top-scoring structures from the perspective of an antigen, demonstrating the differences in the different models (figure inset). Any difficulties in processing the sequence would be shown in a ‘warnings’ section. Warnings sometimes arise to indicate poor matching of template structures or broken predicted CDR H3 loop conformations, resulting in a lower confidence model. Additional explanations of the warnings are provided in the website documentation. The documentation page also explains all output in detail, including a description of the scoring terms found in the coordinate files.

SYSTEM ARCHITECTURE

The RosettaAntibody server has a front-end web process which interfaces with the computation daemon and engine. The front-end, implemented in Python using Django (http://www.djangoproject.com), provides results upon request for users and enters modeling tasks into a MySQL database once an input file is submitted. A back-end daemon pulls tasks from the queue in the MySQL database, translates the modeling task into Perl wrapper scripts that detect the different segments of the antibody variable region, runs BLAST to detect templates, specifies a Rosetta++ command-line, and finally submits a job to a Condor (http://www.cs.wisc.edu/condor) queue. Condor runs the job on our 188-processor Linux cluster at Johns Hopkins University, as time is available. The back-end daemon periodically detects the status of the job to report, and eventually enters the complete set of results into the MySQL database.
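The front-end/daemon split described above follows a common database-backed task queue pattern, sketched below. This is not the server's code: the table layout, status values and function names are hypothetical, and Python's built-in `sqlite3` module with an in-memory database is used purely as a stand-in for MySQL so the example runs without a database server. The BLAST, Rosetta++ and Condor steps are omitted.

```python
import sqlite3


def open_queue():
    # In-memory SQLite database used here as a stand-in for the MySQL queue.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " antibody_name TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'queued')"
    )
    return conn


def submit_task(conn, antibody_name):
    # Front-end role: record a new modeling task and return its id.
    cur = conn.execute(
        "INSERT INTO tasks (antibody_name) VALUES (?)", (antibody_name,)
    )
    conn.commit()
    return cur.lastrowid


def claim_next_task(conn):
    # Daemon role: take the oldest queued task and mark it as running.
    row = conn.execute(
        "SELECT id, antibody_name FROM tasks"
        " WHERE status = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE tasks SET status = 'running' WHERE id = ?", (row[0],))
    conn.commit()
    return row


def finish_task(conn, task_id):
    # Daemon role: record completion once results are available.
    conn.execute("UPDATE tasks SET status = 'done' WHERE id = ?", (task_id,))
    conn.commit()


def task_status(conn, task_id):
    row = conn.execute(
        "SELECT status FROM tasks WHERE id = ?", (task_id,)
    ).fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    conn = open_queue()
    first = submit_task(conn, "example_antibody")
    print(claim_next_task(conn), task_status(conn, first))
```

Keeping the queue in the database lets the web process and the daemon run independently: the front-end only writes and reads rows, while the daemon decides when a task is handed to the cluster scheduler.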

SERVER PERFORMANCE

Since the RosettaAntibody web server opened in December of 2007, over 100 individuals have used the web server for more than 400 modeling jobs. Jobs typically require about 1700 processor-hours, and results are typically complete within a day of submission, although the time will vary with the current cluster load and server queue. The website is free and open to all users with no login requirement.
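A back-of-the-envelope calculation relates the figures quoted in this article. It assumes, for illustration only, that the roughly 1700 processor-hours cover the full run of 2000 structures, that work divides evenly across processors, and that all 188 processors are free; none of these is guaranteed on a shared cluster. The variable names are hypothetical.

```python
# Figures quoted in the article.
processor_hours_per_job = 1700.0
structures_per_job = 2000
cluster_processors = 188

# Average cost of one candidate structure, under the assumptions above.
hours_per_structure = processor_hours_per_job / structures_per_job
minutes_per_structure = hours_per_structure * 60.0

# Idealized wall-clock time if the whole cluster served a single job.
ideal_wall_clock_hours = processor_hours_per_job / cluster_processors

if __name__ == "__main__":
    print(f"{minutes_per_structure:.0f} processor-minutes per structure")
    print(f"{ideal_wall_clock_hours:.1f} hours on a fully idle cluster")
```

Under these assumptions a job would take roughly 9 hours on a fully idle cluster, which is consistent with results typically arriving within a day once queueing and shared load are taken into account.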

Validation of the RosettaAntibody server algorithm

In a large-scale test of RosettaAntibody, the program was used to recover the native crystal structures of 54 antibodies (7). To simulate blind prediction, when database information was used in modeling, only nonrelated (less than 90% sequence identity) antibody structures in the Protein Data Bank (20) were used. For the best ranked model of each target, the median root mean square deviation (rmsd) of the antigen binding pocket comprising all the CDR residues was 1.5 Å, and 80% of the targets had an rmsd lower than 2.0 Å. The loop modeling capabilities of RosettaAntibody were tested by ab initio modeling of the CDR H3 loop. The CDR H3 loop is composed of residues 95–102 of the heavy chain [Chothia numbering (19)]. The median backbone heavy atom global rmsd of the CDR H3 loop prediction for the best ranked model was 1.6, 1.9, 2.4, 3.1 and 6.0 Å, respectively, for very short (4–6 residues), short (7–9 residues), medium (10–11 residues), long (12–14 residues) and very long (17–22 residues) loops. Finally, a practical measure of the accuracy of the antibody structures is their utility for docking to antigens. While the inclusion of the RosettaAntibody refinement steps had a small effect on homology modeling rmsds (other than CDR H3), refinement was critical for achieving docking accuracy (7). When the set of 10 top-scoring RosettaAntibody FV homology models was used in local ensemble docking to antigen, a moderate-to-high accuracy docking prediction [rated by Critical Assessment of PRediction of Interactions criteria (21)] was achieved in 7 of 15 targets (7). In a comparison of WAM and RosettaAntibody (7), for some antibodies, the CDR H3 predicted by WAM was closer to the native structure than that of the top-scoring model produced by RosettaAntibody. However, there was typically a more accurate structure among the 10 top-scoring RosettaAntibody models. Furthermore, antibody–antigen docking simulations starting with RosettaAntibody FV models consistently resulted in more accurate docking predictions than those obtained by starting with WAM generated models or unrefined RosettaAntibody models (7).
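For readers unfamiliar with the rmsd values reported above, the following minimal sketch shows how a coordinate rmsd and a per-category median can be computed. It assumes that the model and native coordinates are already in a common frame (for example after superposing the frameworks) and that the atoms are listed in the same order; no superposition is performed here. The coordinates and per-target values are made-up toy numbers, not data from ref. (7), and the function name is hypothetical.

```python
import math
import statistics


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates in angstroms: every atom is shifted by 1.0 along x.
    native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    model = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (4.0, 0.0, 0.0)]
    print(rmsd(native, model))

    # Median over made-up per-target values, as used for summary statistics.
    print(statistics.median([1.2, 1.9, 2.4]))
```

Reporting the median rather than the mean keeps a few badly modeled targets from dominating the summary, which matters when loop accuracy varies strongly with loop length.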

Potential uses of the RosettaAntibody server

Antibody structures can be used to guide rational efforts to enhance stability (22,23) or to humanize sequences to minimize immunological response (24,25). Antibody structures can also be used for docking to their antigens, either for epitope mapping (26) or for high-resolution refinement (27). For example, we docked models of monoclonal antibody 14B7 to the anthrax toxin protective antigen (2). The models helped us form hypotheses about the mechanism of affinity maturation of several variants of 14B7. Several other instances of docking antibody homology models are present in the literature (28–30). Docking calculations can be done on several publicly available servers (31–38) including the RosettaDock Server (local docking only for high-resolution refinement, http://rosettadock.graylab.jhu.edu) (39). Docking of homology models is necessarily less accurate than docking of crystal structures. Experimental information can be used to mitigate errors. For example, we used computational mutagenesis and hotspot analysis to evaluate models of epidermal growth factor receptor binding to mAb 806 (1). In recognition of the errors present in homology models, RosettaAntibody provides 10 alternate low-energy structures. There are several new docking methods which can use multiple input structures for one of the docking partners (40–43). Our EnsembleDock program (9) can improve low-energy docking solutions, and sometimes the low-energy docking solution is formed by the component homology structure that is closest to the crystal structure.

FUTURE DIRECTIONS

Accurate loop modeling remains one of the central challenges of antibody modeling. Thus, future improvements might be made as better and more efficient loop modeling algorithms [e.g. kinematic loop closure (44) or hierarchical local optimization (45,46)] become available. Predictions might also be improved by inclusion of NMR constraints to bias simulations [e.g. (47)]. Some researchers are pursuing therapeutics based on ‘heavy chain only’ (VHH) antibodies discovered in the blood of camelids (48). VHHs are also easy to clone and express, and their structure might be amenable to prediction, although tests are required to assess the use of standard antibody databases. Finally, we are currently developing flexible backbone antibody docking techniques which exploit the same antibody structural modeling tools as the server. These induced-fit antibody docking techniques may additionally help overcome homology modeling errors as predicted structures are used for high-resolution applications.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

FUNDING

National Institutes of Health (R01-GM073151, R01-GM078221) and UCB S.A. Funding for open access charge: National Institutes of Health Grant Number R01-GM078221. Conflict of interest statement. None declared.

49 in total

1. The Protein Data Bank.

Authors: H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal: Nucleic Acids Res Date: 2000-01-01 Impact factor: 16.971

2. Cyclic coordinate descent: A robotics algorithm for protein loop closure.

Authors: Adrian A Canutescu; Roland L Dunbrack
Journal: Protein Sci Date: 2003-05 Impact factor: 6.725

3. ZDOCK: an initial-stage protein-docking algorithm.

Authors: Rong Chen; Li Li; Zhiping Weng
Journal: Proteins Date: 2003-07-01

4. A hierarchical approach to all-atom protein loop prediction.

Authors: Matthew P Jacobson; David L Pincus; Chaya S Rapp; Tyler J F Day; Barry Honig; David E Shaw; Richard A Friesner
Journal: Proteins Date: 2004-05-01

5. Searching for protein-protein interaction sites and docking by the methods of molecular dynamics, grid scoring, and the pairwise interaction potential of amino acid residues.

Authors: Genki Terashi; Mayuko Takeda-Shitaka; Daisuke Takaya; Katsuichiro Komatsu; Hideaki Umeyama
Journal: Proteins Date: 2005-08-01

6. PIGS: automatic prediction of antibody structures.

Authors: Paolo Marcatili; Alessandra Rosi; Anna Tramontano
Journal: Bioinformatics Date: 2008-07-19 Impact factor: 6.937

Review 7. Macromolecular modeling with rosetta.

Authors: Rhiju Das; David Baker
Journal: Annu Rev Biochem Date: 2008 Impact factor: 23.643

8. Interaction of malaria parasite-inhibitory antibodies with the merozoite surface protein MSP1(19) by computational docking.

Authors: Flavia Autore; Sara Melchiorre; Jens Kleinjung; William D Morgan; Franca Fraternali
Journal: Proteins Date: 2007-02-15

9. Antibody-protein interactions: benchmark datasets and prediction tools evaluation.

Authors: Julia V Ponomarenko; Philip E Bourne
Journal: BMC Struct Biol Date: 2007-10-02

10. Using the natural evolution of a rotavirus-specific human monoclonal antibody to predict the complex topography of a viral antigenic site.

Authors: Brett A McKinney; Nicole L Kallewaard; James E Crowe; Jens Meiler
Journal: Immunome Res Date: 2007-09-18
