Ryan L Kelly (1), Jessie Zhao (2), Doris Le (2), K Dane Wittrup (1, 2)
1. Department of Biological Engineering, Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, U.S.A.
2. Department of Chemical Engineering, Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, U.S.A.
Abstract
Efforts to develop effective antibody therapeutics are frequently hampered by issues such as aggregation and nonspecificity, often only detected in late stages of the development process. In this study, we used a high throughput cross-reactivity assay to select nonspecific clones from a naïve human repertoire scFv library displayed on the surface of yeast. Most antibody families were de-enriched; however, the rarely expressed VH6 family was highly enriched among nonspecific clones, representing almost 90% of isolated clones. Mutational analysis of this family reveals a dominant role of CDRH2 in driving nonspecific binding. Homology modeling of a panel of VH6 antibodies shows a constrained β-sheet structure in CDRH2 that is not present in other families, potentially contributing to nonspecificity of the family. These findings confirm the common decision to exclude VH6 from synthetic antibody libraries, and support VH6 polyreactivity as a possible important role for the family in early ontogeny and cause for its overabundance in cases of some forms of autoimmunity.
Although intended to bind to only a specific antigen, some monoclonal antibodies (mAbs) can bind nonspecifically to other proteins and surfaces, albeit with a much lower affinity. This nonspecificity can lead to off-target effects or accelerated clearance rates due to nonspecific tissue binding. While not desired in affinity-matured antibodies, polyspecificity may exist in the natural antibody repertoire as a means for broad antigen recognition. Studies on this pool of polyreactive, or “natural,” antibodies reveal often elongated CDR H3 regions, which are hypothesized to allow for greater flexibility in antigen recognition. Long, flexible CDRs are also found in many broadly neutralizing antibodies that target viruses such as influenza and the human immunodeficiency virus, and it is believed that many of these antibodies target specific proteins on the virus surface and engage in nonspecific binding to the virus coat. Beyond the understanding of polyreactive antibodies in natural repertoires, some studies have implicated the importance of arginine and tryptophan residues, although these were in the context of a synthetic library design.
A greater understanding of nonspecific binding will improve the isolation and development of mAbs using display technologies such as phage and yeast display, which can suffer from poor biophysical characteristics. Multiple naïve or semi-synthetic libraries based on the naïve repertoire are regularly used for the isolation of candidate antibodies, and as such a better understanding of nonspecificity in these libraries could lead to more efficient selections and downstream development.
Traditional methods of assessing specificity include measuring binding of antibody to tissues or a panel of representative non-cognate antigens. More recently it has been discovered that complex protein extracts can also serve as nonspecificity reagents.
Binding of antibodies to baculovirus particles in an ELISA format correlates remarkably well with a phenotype as complex as clearance rate in humans, and a high throughput assay using a soluble membrane preparation (SMP) from HEK cells on the surface of yeast also serves as an effective predictor of nonspecificity in displayed antibodies and clearance rates in mice.
In this study, we utilize a yeast surface displayed single-chain variable fragment (scFv) library to assess nonspecificity in the natural repertoire. This library, created from amplified VH and VL segments from pooled human spleen and lymph node RNA, samples the human antibody repertoire, with some differences in antibody frequency and the possibility for non-natural VH/VL pairing. To isolate nonspecific clones, a modified version of the SMP assay was applied to sorting, revealing a predominance of nonspecific clones from the VH6 subfamily. Further analysis of these clones revealed CDR H2 as the dominant source of the nonspecificity of these clones. All other families were de-enriched in sorted populations, suggesting their preferred applicability to further naïve or synthetic library designs.
Results
Naïve library panning reveals enrichment of the VH6 family
To identify nonspecific clones in the human repertoire, we used a nonimmune human antibody yeast surface display library previously constructed by Feldhaus et al. The library was screened against 4 nonspecificity reagents: preparations of soluble membrane proteins or soluble cytosolic proteins from human embryonic kidney (HEK293) or Spodoptera frugiperda (Sf9) cells. These reagents were chosen to maximize orthogonality between preparations and to avoid isolation of specific binders to components in one particular reagent. Populations were positively or negatively selected for binding against the 4 reagents sequentially to maximize round-to-round differences in selection reagent (Fig. 1, Fig. S1). After 4 rounds of selection, the resulting populations were validated by assessing binding to all 4 reagents. This selection process isolated a positive, nonspecific population that displayed robust binding to all reagents and a negative, clean population that bound to none of the reagents (positive, Fig. 1; negative, Fig. S1). Sorts were repeated in duplicate to ensure consistency of sorted families.
Figure 1.
Isolation of nonspecific antibodies. A nonimmune scFv library was subjected to 4 rounds of yeast display-based selections. To ensure nonspecificity, the antigen used in each round was changed, with the following order used: HEK SMP, Sf9 SCP, Sf9 SMP, HEK SCP (Outlined boxes). The first 2 rounds were completed using magnetic-activated cell sorting and the final 2 rounds were completed using fluorescence-activated cell-sorting with the indicated sort gates (Thin boxes). The populations were validated after each round by measuring binding to each of the 4 reagents. As expected, the positively selected panel displays binding to all reagents.
We sequenced the resulting populations and analyzed the distribution of the various germline families (Fig. 2A, B). Examination of negatively sorted sequences shows there was an enrichment in sequences from the VH1 family, with some depletion of sequences from the VH4 and VH6 families. More strikingly, an average of 92% of the positively sorted antibodies originated from the VH6 family, with depletions in all other families. This rare germline family, represented by a single V-gene, accounts for approximately 1% of the expressed germline antibody pool in humans. It is worth noting that this family was already overrepresented in the original library, present in 40% of sequences, which likely contributed to the overall dominance of this family in the sequencing output.
Figure 2.
Antibody germline family distribution. Variable heavy (A) and light (B) chain sequences were aligned against human germline genes using IMGT V-QUEST and grouped into major families. Abundances of each family are displayed for the natural human repertoire (Blue), naive library (Red), negatively sorted pool (Green), and positively sorted population (Purple). Bars represent the average percentage of clones among 2 sorts and error bars indicate standard error.
The repertoire of light chain segments was enriched for Vκ1 and Vκ3 segments in both positive and negative cases, while there was a considerable enrichment of sequences from the Vλ6 subfamily among positively sorted sequences. The enrichment in Vκ1 and Vκ3 could be attributed to the fact that the initial library sequencing was completed on the naïve, unselected population, whereas all sorted clones were selected for successful display, which likely subtly increases the abundance of these segments, as both tend to be well behaved. With respect to the Vλ6 subfamily, this germline family was already overrepresented in the original library (0.5% vs. 11%), but was still enriched 2.3-fold among nonspecific sequences.
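As an illustration of how family abundances such as those in Figure 2 can be tabulated, the following Python sketch groups V-gene calls into families and reports the mean percentage and standard error across two sorts. It is a minimal sketch, not the analysis pipeline used in this work: the gene lists are made-up placeholders rather than data from this study, the function names are hypothetical, and the parsing assumes IMGT-style gene names such as IGHV6-1*01.

```python
from collections import Counter
from math import sqrt
from statistics import mean, stdev


def vh_family(v_gene):
    """Map an IMGT-style V-gene name (e.g. 'IGHV6-1*01') to a family label (e.g. 'VH6')."""
    prefix = "IGHV"
    if not v_gene.startswith(prefix):
        raise ValueError(f"not a heavy chain V-gene name: {v_gene}")
    # The family number is the text between the prefix and the first hyphen.
    family_number = v_gene[len(prefix):].split("-")[0]
    return f"VH{family_number}"


def family_percentages(v_genes):
    """Percentage of sequences assigned to each VH family."""
    counts = Counter(vh_family(gene) for gene in v_genes)
    total = sum(counts.values())
    return {family: 100.0 * n / total for family, n in counts.items()}


def mean_and_standard_error(values):
    """Mean and standard error of the mean across replicate sorts."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical V-gene calls for two replicate sorts (placeholders, not study data).
sort_1 = ["IGHV6-1*01"] * 9 + ["IGHV1-69*01"]
sort_2 = ["IGHV6-1*01"] * 8 + ["IGHV3-23*01", "IGHV4-34*01"]

vh6_percentages = [family_percentages(s).get("VH6", 0.0) for s in (sort_1, sort_2)]
avg, sem = mean_and_standard_error(vh6_percentages)
print(f"VH6: {avg:.1f}% +/- {sem:.1f}%")
```

With only two sorts the standard error is a rough indicator at best, which is worth keeping in mind when reading the error bars.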
Nonspecific binding is driven by CDR H2
We next investigated which sequences drive nonspecificity within the VH6 germline family, focusing first on the conserved V gene segment, which contains the sequences of both CDR H1 and H2. Aligning all positively selected VH6 sequences, we see a high level of homology in both the H1 and H2 sequences (Fig. 3). It has been suggested that positive charge and hydrophobic regions can lead to nonspecific behavior. This led us to further investigate the H2 sequence, which contains two positively charged residues and one tryptophan residue.
Figure 3.
VH6 family sequence alignment. All VH6 antibody sequences were aligned using IMGT V-QUEST and sequence logos were plotted using Matlab for CDR H1 (A) and CDR H2 (B). Residues are numbered using Kabat standard numbering and CDRs are defined using IMGT definitions.
To isolate any effects driven by the H2, we spliced the CDR H2 sequence from two “clean” VH1 antibodies into an example antibody derived from the VH6 family and measured poly-specificity reagent (PSR) binding for all resulting clones. Interestingly, while one of the mutants (VH6.H2.1) displayed similar binding to the original sequence, the other mutant (VH6.H2.2) completely ablated binding to all four nonspecificity reagents (Fig. 4). This suggests that the H2 is indeed driving nonspecificity in the family, but not every “clean” H2 sequence is able to reverse this phenotype. Comparing the sequences of the two VH1 segments used, there are four amino acids that are different, with one of these, Arg50, matching the VH6 sequence. We next mutated these four residues individually in VH6.H2.1 and measured PSR binding. No individual mutation was able to fully ablate nonspecific binding, but most of the mutations had some marginal effect, with the R50G mutation displaying the greatest drop in PSR binding (Supp. Figure 1). Combined, this suggests a dominant role of Arg50 in driving nonspecificity, with appreciable contributions from other residues in the H2 loop.
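The design of the single-residue mutants described above can be sketched in a few lines of Python: find the positions at which two aligned H2 sequences differ, then swap each differing residue into the parent one at a time. This is only an illustrative sketch. The two sequences below are short made-up placeholders, not the actual H2 sequences of VH6.H2.1 or VH6.H2.2, the function names are hypothetical, and the simple numeric offset ignores Kabat insertion positions such as 52a.

```python
def differing_positions(seq_a, seq_b):
    """Zero-based indices at which two pre-aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def single_point_mutants(parent, donor, first_position=1):
    """Build one mutant per differing position, named like 'R50G'."""
    mutants = {}
    for i in differing_positions(parent, donor):
        name = f"{parent[i]}{i + first_position}{donor[i]}"
        mutants[name] = parent[:i] + donor[i] + parent[i + 1:]
    return mutants


# Placeholder sequences for illustration only (not real CDR H2 sequences).
parent_h2 = "RASTK"
donor_h2 = "GASQK"

# first_position=50 is an assumption so that index 0 is reported as residue 50.
mutants = single_point_mutants(parent_h2, donor_h2, first_position=50)
for name, sequence in mutants.items():
    print(name, sequence)
```

Each mutant differs from the parent at exactly one residue, so any change in binding can be attributed to that position alone.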
Figure 4.
VH6 family CDRH2 mutations. One example VH6 clone was mutated by replacing CDRH2 with 2 different full H2 sequences from the VH1 family. PSR binding of the original clone (Blue) was compared against both mutants, VH6.H2.1 (Red) and VH6.H2.2 (Orange).
Homology modeling of the VH6 CDR H2 segment
To better understand the structural role of Arg50 in the CDR loop, we used the Rosetta Antibody server to create homology models of three representative clones from the VH6 family. All three clones displayed similar structures in the region surrounding and containing CDR H2, and Arg50 lies at the base of the H2 region (Fig. 5A). Interestingly, the exposed surface area of this site is low, and its main effect appears to be in giving added structure to the loop, creating a hydrogen bond with Asp58, which stabilizes a β sheet not widely present in other families of antibodies. This rigid structure allows the tip of the loop to protrude further than other H2 loops, allowing for greater exposure of the other residues in the loop (Fig. 5B). While Asp58 is not present in VH6.H2.1, Arg50 is similarly able to form a hydrogen bond with Asn58 that also appears to stabilize the H2 loop. This may explain why VH6.H2.1 retains its nonspecificity, whereas VH6.H2.2, which is unable to make this contact, does not.
Figure 5.
Homology modeling of the VH6 CDRH2. Rosetta Antibody was used to homology model 3 representative clones from the VH6 family (A). Alignment of the structures highlights a common constrained β sheet structure stabilized by hydrogen bond interactions between Arg50 and Asp58. Comparing this structure against that of clone VH6.H2.2 (B), we observe the CDRH2 of the VH6 family clone (orange) protrudes farther than that of the mutant (blue).
Discussion
The initial intention of this study was to unveil motifs present in the natural repertoire that may lead to nonspecificity. The sort output was unexpectedly dominated by clones from the usually rare VH6 family, making it difficult to analyze the enrichment of other motifs that may also drive nonspecificity. These clones have conserved CDR H1 and H2 regions, with little to no similarity in the H3 region. Mutation of the H2 region revealed that it was the main driver of nonspecificity in the family. This loop is more structured than other naturally encoded H2 loops, driven by the interaction of Arg50 and Asp58. This added structure causes the H2 loop to protrude farther than in other families, exposing multiple residues.
Of the residues exposed in the loop, in general tyrosine and serine have not been shown to increase nonspecific binding. However, the positively charged arginine and lysine have been implicated as leading to increased nonspecificity. While in isolation positively charged residues can be important in establishing charge-charge contacts in inter- and intra-molecular interactions, the small positively charged patch formed by the arginine and lysine residue may contribute to the nonspecific binding profile we see in this work.
In establishment of the human repertoire, the VH6 family is present in early development and in immature B lymphoid cells, but rare to absent in the developed human repertoire. Interestingly, the VH6 family has been shown to be overrepresented in patients with some autoimmune diseases, and the broad binding spectrum of autoantibodies from this family has been shown in prior work. Despite the polyspecificity of the family as a whole, highly specific antibodies have been discovered from this family against influenza and bacterial pathogens. The VH6–1 gene segment is also the most JH proximal, which is preferentially used in ontogeny.
Taken as a whole, it appears that nonspecificity, or polyreactivity, of the VH6 family may be important for early immunologic protection, which is then downregulated as the full repertoire develops. This may explain why a V gene encoding nonspecific antibodies is not completely deleted as a result of evolution, and can help explain why it may be overrepresented in patients with autoimmune disorders.
Applying these lessons to creating display libraries for the isolation of new therapeutic antibodies, these results caution against the use of the VH6–1 gene segment in naïve human repertoire designs. While it can be possible to derive highly specific antibodies in certain cases, there is still considerable risk that the majority of clones will display nonspecificity and poor development characteristics, including rapid clearance, which are correlated with nonspecificity. The VH6 class was accidentally overrepresented in the library used in this work, either by PCR amplification bias during construction or an abnormal antibody distribution in the donors used, which is likely detrimental for antigen selections. Outside of the VH6 family, all other families were less prevalent in nonspecific clones, with a noticeable enrichment of the VH1 family in negatively selected clones. These families, more prevalent in the developed natural repertoire, appear to be better starting points for both natural repertoire library construction and for use as scaffolds in synthetic library designs.
Materials and methods
Antigen preparation
SMPs were prepared as described previously. Briefly, one billion HEK or Sf9 cells were pelleted, washed and homogenized. The homogenate was separated into membrane and cytosolic fractions via centrifugation at 40,000 × g for 1 h at 4 °C. The supernatant was collected as the soluble cytosolic preparation (SCP) and the pellet was further washed and solubilized to create the enriched membrane fraction (EMF). Both the SCP and EMF were diluted to a concentration of 1 mg/mL and biotinylated using NHS-LC-Biotin (Pierce, Thermo Fisher Cat# 21336). Biotin solution was prepared according to the manufacturer's protocol and 20 µL of 10 mM biotin reagent was added per 1 mg of sample. The reaction mixture was incubated for 3 h at 4 °C with gentle agitation, after which the biotinylated EMF (b-EMF) was pelleted at 40,000 × g for 1 h at 4 °C and the biotinylated SCP (b-SCP) was buffer exchanged into phosphate-buffered saline, pH 7.4 (PBS; Corning) using a Zeba Spin Desalting column (Thermo, 7K MWCO). The b-EMF was subsequently solubilized overnight using a buffer containing 1% n-dodecyl-β-D-maltopyranoside (DDM) and then centrifuged at 40,000 × g for 1 h at 4 °C. The resulting supernatant was collected as the final, biotinylated SMP (b-SMP) product.
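The biotinylation quantities above reduce to simple arithmetic, sketched here in Python for scaling to other sample sizes. The function name and the example sample mass are hypothetical. The only inputs taken from the protocol are the 1 mg/mL sample concentration and the 20 µL of 10 mM reagent per mg of sample. The final concentration assumes the reagent volume simply adds to the sample volume.

```python
def biotinylation_amounts(sample_mass_mg,
                          sample_conc_mg_per_ml=1.0,
                          reagent_ul_per_mg=20.0,
                          reagent_stock_mm=10.0):
    """Return (reagent volume in uL, biotin in nmol, final biotin concentration in uM)."""
    reagent_volume_ul = sample_mass_mg * reagent_ul_per_mg
    # uL x mM gives nmol directly (1e-6 L x 1e-3 mol/L = 1e-9 mol).
    biotin_nmol = reagent_volume_ul * reagent_stock_mm
    sample_volume_ml = sample_mass_mg / sample_conc_mg_per_ml
    total_volume_ml = sample_volume_ml + reagent_volume_ul / 1000.0
    # nmol per mL is equivalent to uM.
    final_biotin_um = biotin_nmol / total_volume_ml
    return reagent_volume_ul, biotin_nmol, final_biotin_um


# Example for a 1 mg sample; other masses scale linearly.
volume_ul, amount_nmol, final_um = biotinylation_amounts(1.0)
print(f"{volume_ul:.0f} uL reagent, {amount_nmol:.0f} nmol biotin, ~{final_um:.0f} uM final")
```

For a 1 mg sample this corresponds to 200 nmol of biotin reagent at a little under 200 µM in the reaction.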
Yeast display experiments
All experiments were conducted using the naïve human scFv library described by Feldhaus et al. using previously described yeast display techniques. Initial rounds of sorting were conducted using Dynabeads biotin binder beads (Life Technologies), ensuring at least ten-fold coverage of the library or population size in each round. After two rounds, subsequent sorting was completed on a FACSAria III cell sorter (BD Biosciences). Due to concern over selection of scFvs specific to a single protein, the order of antigens was chosen to maximize round-to-round differences. In order, the population was positively screened against: HEK b-SMP, Sf9 b-SCP, Sf9 b-SMP, and HEK b-SCP. To ensure no enrichment of binders to biotin, 1 mM biotin (Sigma) was added to sort rounds three and four. In all instances, nonspecificity was evaluated using a 1:10 dilution of each biotinylated nonspecificity reagent as primary and streptavidin AlexaFluor 488 conjugate (1:100, Life Technologies) as secondary. Simultaneously, successful display was evaluated using chicken anti-c-Myc primary (1:250, Gallus Immunotech cat# ACMYC) followed by goat anti-chicken secondary (1:100, Life Technologies cat# A-21449) antibodies.
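The reasoning behind a ten-fold coverage requirement can be illustrated with a short Python sketch. It assumes that every clone is equally represented and that cells are sampled independently, so the number of copies of a given clone in the sample is approximately Poisson distributed with mean equal to the fold coverage. The population size below is a hypothetical placeholder, and the function names are not from this work.

```python
from math import ceil, exp


def cells_required(population_size, coverage=10.0):
    """Number of cells to process to reach the requested fold coverage."""
    return ceil(population_size * coverage)


def probability_clone_missed(coverage):
    """Poisson estimate of the chance that a given clone is absent from the sample."""
    return exp(-coverage)


# Hypothetical population of 1e7 distinct clones (placeholder, not from the study).
population = 1e7
needed = cells_required(population)
missed = probability_clone_missed(10.0)
print(f"cells to process: {needed:.2e}")
print(f"chance a given clone is missed at 10x coverage: {missed:.1e}")
```

Under these assumptions, ten-fold coverage leaves well under a 0.01% chance of losing any particular clone to sampling, whereas one-fold coverage would lose roughly a third of clones.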
Clone analysis
Yeast DNA was isolated using a Zymoprep Yeast Plasmid Miniprep II kit (Zymo Research). For most analyses, sequencing of individual clones was completed by Sanger sequencing (Macrogen). For high throughput sequencing efforts, insert DNA was first amplified for Illumina sequencing using the pCTFwdSeq and pCTRevSeq primers (see Supplementary Table 1 for sequences). Illumina libraries were prepared using NexteraXT (Illumina, Inc.) and sequenced on an Illumina MiSeq using a 300-nt kit (v2). IMGT/V-QUEST was used for all sequence alignment and assignment of closest germline V-genes. Both input and output libraries were sequenced, and clones with greater than two-fold enrichment over input and p-value less than 0.01 (Student's t-test, n = 4 replicates) were deemed to be significantly enriched. Motif enrichment and sequence analysis was conducted using Matlab.
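The enrichment criteria stated above (greater than two-fold over input and p-value below 0.01) can be expressed as a simple filter. The Python sketch below is illustrative only, since the analysis in this work was conducted in Matlab. The clone names, frequencies and p-values are invented placeholders, the function name is hypothetical, and the p-values are taken as precomputed inputs rather than derived from a t-test here.

```python
def enriched_clones(input_freq, output_freq, p_values, min_fold=2.0, max_p=0.01):
    """Return {clone: fold enrichment} for clones passing both thresholds."""
    hits = {}
    for clone, out_f in output_freq.items():
        in_f = input_freq.get(clone, 0.0)
        if in_f <= 0.0:
            # Fold enrichment is undefined for clones not observed in the input.
            continue
        fold = out_f / in_f
        # Clones without a supplied p-value are treated as not significant.
        if fold > min_fold and p_values.get(clone, 1.0) < max_p:
            hits[clone] = fold
    return hits


# Invented example values (placeholders, not data from the study).
input_freq = {"clone_A": 0.125, "clone_B": 0.25, "clone_C": 0.5}
output_freq = {"clone_A": 0.5, "clone_B": 0.375, "clone_C": 0.125}
p_values = {"clone_A": 0.001, "clone_B": 0.2, "clone_C": 0.001}

hits = enriched_clones(input_freq, output_freq, p_values)
for clone, fold in hits.items():
    print(f"{clone}: {fold:.1f}-fold enriched")
```

Only clone_A passes in this toy example: clone_B fails both thresholds, and clone_C has a low p-value but is depleted rather than enriched.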
Mutant construction
All H2 mutants were constructed via homologous recombination of three DNA fragments in yeast. Designed overlap sequences were located at the two ends of the scFv construct as well as directly in the H2 region. The fragment before and containing the H2 was amplified with the pCTFwd Recomb and appropriate PreH2 primers, and the fragment after and also containing the H2 was amplified using the pCTRev Recomb and appropriate PostH2 primers (primer sequences in Supplementary Table 1). The pCTCON2 vector was prepared for homologous recombination by digestion with SalI-HF followed by digestion with NheI-HF and BamHI-HF (New England Biolabs). The two scFv fragments along with the digested vector were cotransformed into chemically competent yeast using the Frozen-EZ Yeast Transformation II Kit (Zymo Research) and resultant clones were sequence verified before experiments.
Modeling
Antibody homology modeling was conducted using Rosetta Antibody through ROSIE using default settings. For all clones, the grafted, relaxed model structure was chosen for further analysis. Structure analysis, alignment and figure generation were conducted using PyMOL.