Jessica S Ebo1,2, Janet C Saunders1,2,3, Paul W A Devine1,2,3, Alice M Gordon1,2, Amy S Warwick1,2, Bob Schiffrin1,2, Stacey E Chin3, Elizabeth England3, James D Button3, Christopher Lloyd3, Nicholas J Bond3, Alison E Ashcroft1,2, Sheena E Radford1,2, David C Lowe4, David J Brockwell5,6.
Abstract
Protein biopharmaceuticals are highly successful, but their utility is compromised by their propensity to aggregate during manufacture and storage. As aggregation can be triggered by non-native states, whose population is not necessarily related to thermodynamic stability, prediction of poorly behaving biologics is difficult, and searching for sequences with desired properties is labour-intensive and time-consuming. Here we show that an assay in the periplasm of E. coli linking aggregation directly to antibiotic resistance acts as a sensor for the innate (un-accelerated) aggregation of antibody fragments. Using this assay as a directed evolution screen, we demonstrate the generation of scFv sequences that are aggregation-resistant when reformatted as IgGs. This powerful tool can thus screen and evolve 'manufacturable' biopharmaceuticals early in industrial development. By comparing the mutational profiles of three different immunoglobulin scaffolds, we show the applicability of this method to investigate protein aggregation mechanisms important to both industrial manufacture and amyloid disease.
Year: 2020 PMID: 32286330 PMCID: PMC7156504 DOI: 10.1038/s41467-020-15667-1
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 The tripartite β-lactamase assay.
a The test protein (green) is inserted into a 28-residue glycine/serine-rich linker (grey) separating the two domains of the E. coli enzyme TEM-1 β-lactamase (purple and pink). b Correct folding of the test protein in the E. coli periplasm enables the two halves of β-lactamase to be brought into close proximity to form the functional enzyme active site that hydrolyses β-lactam antibiotics. c Antibiotic survival curve of the maximal cell dilution allowing growth (MCDGROWTH) on solid medium over a range of ampicillin concentrations for bacteria expressing the aggregation-prone scFvWFL within β-lactamase (blue) or the aggregation-resistant sequence scFvSTT (pink). d Calculating the area under the antibiotic survival curves (blue and pink shaded area, c) yields a single value to compare the behaviour of the different sequences. Data are shown for three aggregation-prone model therapeutic proteins (open bars) and their engineered aggregation-resistant counterparts (solid bars). Data represent mean values ± s.e.m. (n = 4 biologically independent experiments). Asterisks denote significance: **p < 0.002, ****p < 0.0001 (two-sided t-test). Source data are provided as a Source Data file.
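Panel d reduces each survival curve to a single number. A minimal sketch of that reduction follows, assuming trapezoidal integration over the sampled ampicillin concentrations (the caption does not state the integration rule). The function name `growth_score` and the example values are hypothetical and are not data from the study.

```python
def growth_score(ampicillin, mcd_growth):
    """Area under an antibiotic survival curve by the trapezoid rule.

    ampicillin: antibiotic concentrations in increasing order (x-axis)
    mcd_growth: maximal cell dilution allowing growth at each
                concentration (y-axis)
    """
    if len(ampicillin) != len(mcd_growth):
        raise ValueError("need one growth value per concentration")
    area = 0.0
    for i in range(1, len(ampicillin)):
        width = ampicillin[i] - ampicillin[i - 1]
        # Each interval contributes the area of one trapezoid.
        area += width * (mcd_growth[i] + mcd_growth[i - 1]) / 2.0
    return area


# Hypothetical curves, for illustration only (assumed values).
concentrations = [0, 10, 20, 40]
aggregation_prone = [6, 2, 0, 0]
aggregation_resistant = [6, 5, 3, 1]

print(growth_score(concentrations, aggregation_prone))
print(growth_score(concentrations, aggregation_resistant))
```

Under this assumption, a sequence that supports growth at higher ampicillin concentrations yields a larger area, which is what allows one value per sequence to be compared in the bar chart.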
Fig. 2 Comparison of the aggregation of WFL and its sequence variants in a scFv format in vivo and in an IgG1 format in vitro.
Average in vivo growth score (bars, with individual experimental data shown as points) for scFvWFL and scFvSTT together with their six combinatorial variants. Positions with the same amino acid as STT are highlighted in pink. Error bars represent s.e.m. (n = 4 biologically independent experiments). These data are overlaid with the HP-SEC retention times for the same variants reformatted as an IgG1 (black dots). Source data are provided as a Source Data file.
Fig. 3 In vivo growth score of evolved βLa-scFvWFL variants and the aggregation propensity and target affinity of ten selected variants in IgG1 format.
a Ranked in vivo growth score of 185 variants. The inset shows the error for the controls βLa-scFvWFL and βLa-scFvSTT; data represent mean values ± s.d. (n = 15 biological repeats). Ten variants across the rank (11, 176, 37, 59, 128, 72, 126, 130, 16 and 139) were selected and reformatted as full-length IgG1s for biophysical analysis. b HP-SEC retention time (green dots; longer times indicate greater interaction with the column matrix) and AC-SINS (purple triangles; larger plasmon shifts correlate with greater self-association; mean values, n = 3 technical repeats; error bars are smaller than the symbols) of the ten selected variants in IgG1 format. These data correlate inversely with in vivo growth score (grey bars represent mean values; error bars represent s.e.m., n = 3 technical repeats). c Data used to calculate the IC50 values of binding of the ten evolved variants in IgG1 format to NGF, determined using a homogeneous time-resolved fluorescence (HTRF) assay. Data represent mean values ± s.d. (n = 3 technical repeats). Source data are provided as a Source Data file.
Fig. 4 Mutation frequency profile of evolved antibody fragments.
CDRs are highlighted as grey rectangles and VH and VL domains are labelled. a Mutational frequency of the screened scFvWFL* library reveals 12 residues with a mutational frequency greater than two standard deviations from the average value (2σ). Nine occur in VH and three in VL. b The mutation frequency profile for screened scFvLi33 reveals only three sites with a mutational frequency >2σ. c The mutational frequency of the evolved VL domain, JTO, reveals 10 residues with a mutational frequency >2σ. All profiles use IMGT numbering. The cumulative mutational frequency is normalised to 1 for each dataset. Residues showing high mutational frequencies (>2σ) are labelled in each case. Datasets are pooled from two independent experiments. Source data are provided as a Source Data file.
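The hotspot criterion in this caption can be sketched as follows: per-position mutation counts are normalised so that they sum to 1, and positions whose frequency exceeds the mean by more than two standard deviations are flagged. This is a sketch under stated assumptions. The caption does not say whether a population or sample standard deviation was used (the population form is assumed here), the reading of "greater than two standard deviations from the average" as "above mean + 2σ" is an interpretation, and the function name and counts are hypothetical.

```python
from statistics import mean, pstdev


def mutational_hotspots(counts, n_sigma=2.0):
    """Return (normalised frequencies, hotspot positions).

    counts: dict mapping residue position -> number of observed mutations
    n_sigma: how many standard deviations above the mean counts as a hotspot
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mutations observed")
    # Normalise so the cumulative mutational frequency equals 1.
    freqs = {pos: c / total for pos, c in counts.items()}
    values = list(freqs.values())
    # Assumption: population standard deviation over all positions.
    threshold = mean(values) + n_sigma * pstdev(values)
    hotspots = [pos for pos, f in freqs.items() if f > threshold]
    return freqs, hotspots


# Hypothetical counts: 19 positions mutated once, one mutated 20 times.
example_counts = {pos: 1 for pos in range(1, 20)}
example_counts[30] = 20
frequencies, hot = mutational_hotspots(example_counts)
print(hot)
```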
Summary of the 12 most frequently substituted residues after directed evolution of the scFvWFL* library.
| Residue | RSA (a) | Most frequently observed aa substitution | Mutation frequency of most often observed mutation | Expected frequency of most often observed mutation (b) | Amino acid substitutions observed (c, d) | Available residues with single DNA base change (d) |
|---|---|---|---|---|---|---|
| F30 | 0.07 | S | 0.81 | 0.41 | SL | IVLFCSY |
| W35 | 0.63 | R | 0.93 | 0.90 | RG | LCGSR |
| F36 | 0.66 | S | 0.60 | 0.41 | SL(V | IVLFCSY |
| I56 | 0.09 | V | 0.50 | 0.43 | VT(LF) | IVLFMTSN |
| I57 | 0.01 | T | 0.66 | 0.44 | TNV | IVLFMTSN |
| I59 | 0.44 | T | 0.60 | 0.43 | TNVF | IVLFMTSN |
| F62 | 0.45 | S | 0.60 | 0.43 | SLY | IVLFCSY |
| I110 | 0.19 | T | 0.82 | 0.43 | TV(LFM) | IVLFMTSN |
| L112c | 0.65 | P | 1.00 | 0.83 | P | IVLFPHR |
| K18LC | 0.81 | E | 0.52 | 0.44 | ERNQ | ITEQNKR |
| N57LC | 0.18 | D | 0.73 | 0.43 | DS | ITSYHDNK |
| I71LC | 0.24 | T | 0.62 | 0.43 | TVN | IVLFMTSN |
(a) RSA = relative surface area (0 = completely buried residue, 1 = maximally solvent-exposed residue; see Methods).
(b) Expected frequency calculated using the mutational bias found in the naive library.
(c) Residues are listed in decreasing mutational frequency, with brackets indicating residues with equal frequency of mutation. Substitutions shown in bold are due to two base-pair changes in the DNA codon.
(d) Amino acids are listed in decreasing hydrophobicity (left to right) using the Kyte-Doolittle scale[57].
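The last column of the table can be illustrated with a short sketch that lists every amino acid reachable from a codon by a single DNA base change and orders the result by decreasing Kyte-Doolittle hydropathy. The codons used in the example (`TTT` for Phe, `ATT` for Ile, `TGG` for Trp) are assumptions, since the table does not give the parental codons; the standard genetic code and the published Kyte-Doolittle values are used, stop codons are skipped, and ties in hydropathy are broken alphabetically, which is an arbitrary choice.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def single_base_neighbours(codon):
    """Amino acids encoded by codons one base change away from `codon`,
    as a string sorted by decreasing hydropathy (stop codons excluded)."""
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue  # not a change
            mutant = codon[:pos] + base + codon[pos + 1:]
            aa = CODON_TABLE[mutant]
            if aa != "*":
                reachable.add(aa)
    return "".join(sorted(reachable,
                          key=lambda aa: (-KYTE_DOOLITTLE[aa], aa)))


# Assumed parental codons, for illustration only.
for codon in ("TTT", "ATT", "TGG"):
    print(codon, CODON_TABLE[codon], single_base_neighbours(codon))
```

With these assumed codons the output matches the table rows for F, I and W. The parental residue appears in the list only when a synonymous codon lies one base away, which is why Trp, with its single codon, is absent from its own row.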
Fig. 5 Comparison of computational predictors of aggregation with the evolved mutational hotspots for WFL.
a Comparison of evolution hotspots for scFvWFL, with predictions based on (left to right) structurally corrected CamSol[18], SAP[19] or Aggrescan3D[45]. Insoluble/aggregation-prone and soluble/non-aggregation-prone regions are shown on a surface model of the protein (created from PDB 5JZ7[35]) in red and blue, respectively. b Computational prediction of insoluble and/or aggregation-prone sequences of scFvWFL for (top to bottom) structurally corrected CamSol, where +1 indicates soluble and −1 indicates insoluble (dotted lines); SAP (using a 10 Å radius), where values >0.5 and <−0.5 are significant; and Aggrescan3D, where values >1 and <−1 are significant. In each plot the significance values are highlighted by dotted lines and colours are as in (a). Dark grey vertical bars denote evolution hotspot residues and light grey boxes highlight CDRs. Residues are numbered according to IMGT numbering. Supplementary Fig. 13 shows an expanded view of residues 111–112. Source data are provided as a Source Data file.
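The significance cut-offs quoted in panel b can be applied to per-residue scores as in the sketch below. Only the thresholds (±1 for structurally corrected CamSol, ±0.5 for SAP, ±1 for Aggrescan3D) come from the caption. The function name, the dictionary keys and the example scores are hypothetical, and the sketch reports only whether a score lies above or below the significance band without interpreting its direction.

```python
# Cut-offs taken from the figure caption; keys are illustrative names.
SIGNIFICANCE_CUTOFFS = {"camsol": 1.0, "sap": 0.5, "aggrescan3d": 1.0}


def flag_residues(predictor, scores):
    """Keep residues whose score lies outside the +/- cut-off band.

    predictor: one of the keys in SIGNIFICANCE_CUTOFFS
    scores: dict mapping residue label -> predictor score
    Returns a dict mapping residue label -> "above" or "below".
    """
    cutoff = SIGNIFICANCE_CUTOFFS[predictor]
    return {
        residue: ("above" if score > cutoff else "below")
        for residue, score in scores.items()
        if abs(score) > cutoff
    }


# Hypothetical SAP scores, for illustration only (not values from the study).
example_scores = {"F30": 0.8, "W35": 0.2, "K18": -0.7}
print(flag_residues("sap", example_scores))
```

Residues flagged this way could then be compared with the evolution hotspot positions to see where prediction and selection agree.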