Pieter Deschaght, Ana Paula Vintém, Marc Logghe, Miguel Conde, David Felix, Rob Mensink, Juliana Gonçalves, Jorn Audiens, Yanik Bruynooghe, Rita Figueiredo, Diana Ramos, Robbe Tanghe, Daniela Teixeira, Liesbeth Van de Ven, Catelijne Stortelers, Bruno Dombrecht.
Abstract
Next-generation sequencing (NGS) has been applied successfully to the field of therapeutic antibody discovery, often outperforming conventional screening campaigns which tend to identify only the more abundant selective antibody sequences. We used NGS to mine the functional nanobody repertoire from a phage-displayed camelid immune library directed to the recepteur d'origine nantais (RON) receptor kinase. Challenges to this application of NGS include accurate removal of read errors, correct identification of related sequences, and establishing meaningful inclusion criteria for sequences-of-interest. To this end, a sequence identity threshold was defined to separate unrelated full-length sequence clusters by exploring a large diverse set of publicly available nanobody sequences. When combined with majority-rule consensus building, applying this elegant clustering approach to the NGS data set revealed a wealth of >5,000 enriched candidate RON binders. The huge binding potential predicted by the NGS approach was explored through a set of randomly selected candidates: 90% were confirmed as RON binders, 50% of which functionally blocked RON in an ERK phosphorylation assay. Additional validation came from the correct prediction of all 35 RON binding nanobodies which were identified by a conventional screening campaign of the same immune library. More detailed characterization of a subset of RON binders revealed excellent functional potencies and a promising epitope diversity. In summary, our approach exposes the functional diversity and quality of the outbred camelid heavy chain-only immune response and confirms the power of NGS to identify large numbers of promising nanobodies.
Keywords: amino acid; clustering; immune repertoire diversity; nanobodies; next-generation sequencing; phage display; recepteur d’origine nantais signaling; sequence homology
Year: 2017 PMID: 28443097 PMCID: PMC5385344 DOI: 10.3389/fimmu.2017.00420
Source DB: PubMed Journal: Front Immunol ISSN: 1664-3224 Impact factor: 7.561
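The abstract mentions majority-rule consensus building but this record does not describe how it was implemented. The following is only a minimal sketch of the general idea, assuming the reads in a cluster are already aligned to equal length; the function name `majority_consensus` and the toy reads are hypothetical and not taken from the paper.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Return the per-position majority residue of equal-length, pre-aligned reads.

    Hypothetical helper for illustration; not the authors' pipeline.
    """
    if not aligned_reads:
        raise ValueError("need at least one read")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be pre-aligned to equal length")

    consensus = []
    for column in zip(*aligned_reads):
        # Pick the most frequent residue in this alignment column.
        # Ties go to the residue seen first in the column.
        residue, _ = Counter(column).most_common(1)[0]
        consensus.append(residue)
    return "".join(consensus)


# Toy reads (made up): two reads each carry one isolated substitution.
toy_reads = ["QVQLVESGGG", "QVQLVESGGG", "QVQLVASGGG", "QVKLVESGGG"]
print(majority_consensus(toy_reads))
```

Isolated substitutions that occur in a minority of reads are outvoted at their position, which is why this kind of consensus is often used to suppress sporadic read errors within a cluster.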
Table 1. Summary of next-generation sequencing raw data and initial processing output.
| | Negative control | Recepteur d’origine nantais |
|---|---|---|
| Selection output size (cfu) | 9 × 10⁵ | 8 × 10⁶ |
| Raw reads (counts) | 1.0 × 10⁷ | 7.5 × 10⁶ |
| Joined reads (counts) | 4.9 × 10⁶ | 3.6 × 10⁶ |
| Joinable fraction (%) | 94 | 96 |
| Full-length nanobody sequences (counts) | 3.4 × 10⁶ | 2.8 × 10⁶ |
| Unique sequences (counts) | 1.8 × 10⁶ | 1.1 × 10⁶ |
| Fraction unique sequences (%) | 53 | 39 |
| Unique sequences/selection output size (%) | 200 | 14 |
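The two derived percentage rows of this table can be reproduced from the count rows. The short check below assumes that "fraction unique sequences" is unique sequences divided by full-length nanobody sequences, and that the last row is unique sequences divided by selection output size; the dictionary keys are illustrative names only.

```python
# Counts copied from the table (negative control and RON columns).
samples = {
    "negative control": {"output_cfu": 9e5, "full_length": 3.4e6, "unique": 1.8e6},
    "RON": {"output_cfu": 8e6, "full_length": 2.8e6, "unique": 1.1e6},
}


def derived_percentages(sample):
    """Return (fraction unique %, unique per selection output %), rounded."""
    fraction_unique = 100 * sample["unique"] / sample["full_length"]
    unique_per_output = 100 * sample["unique"] / sample["output_cfu"]
    return round(fraction_unique), round(unique_per_output)


for name, sample in samples.items():
    print(name, derived_percentages(sample))
```

Under these assumptions the rounded results match the tabulated 53% and 200% for the negative control and 39% and 14% for the RON sample. A ratio above 100% means more unique sequences were counted than colonies in the selection output, which is consistent with the read-error challenge named in the abstract, although this record does not state that interpretation explicitly.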
Figure 1. Schematic overview of the work flows for the next-generation sequencing and conventional screening campaigns.
Figure 2. Clustering of publicly available nanobody sequences. On the x-axis, the different CD-HIT clustering exercises at various sequence identity thresholds are shown, including the number of clusters at a given threshold. The y-axis (cluster size) displays the sequence counts per cluster. The symbol size indicates the number of unrelated nanobody sequences. The identities of the sequences in each cluster are given in Table S1 in Supplementary Material. The alignments of the sequences captured in clusters identified by a capital letter are shown in Figure S1 in Supplementary Material.
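To illustrate how the number of clusters depends on the identity threshold, as explored in Figure 2, here is a toy greedy clustering sweep. It is not CD-HIT: CD-HIT uses its own alignment and word-filtering heuristics, whereas this sketch assumes equal-length, pre-aligned sequences and a simple position-wise identity. The function names and the toy sequences are hypothetical.

```python
def identity(a, b):
    """Fraction of identical positions; assumes equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs, threshold):
    """Assign each sequence to the first cluster whose representative it matches.

    The first member of each cluster acts as its representative.
    Simplified stand-in for greedy incremental clustering, not CD-HIT itself.
    """
    clusters = []
    for seq in seqs:
        for cluster in clusters:
            if identity(seq, cluster[0]) >= threshold:
                cluster.append(seq)
                break
        else:
            # No representative was similar enough: start a new cluster.
            clusters.append([seq])
    return clusters


# Toy sequences (made up) differing at 0, 1, 2, and 10 positions from the first.
toy_seqs = ["AAAAAAAAAA", "AAAAAAAAAC", "AAAAAAAACC", "CCCCCCCCCC"]
for threshold in (0.8, 0.9, 1.0):
    print(threshold, len(greedy_cluster(toy_seqs, threshold)))
```

Raising the threshold splits related sequences into more clusters, while lowering it risks merging unrelated ones; the figure addresses this trade-off using known unrelated nanobody sequences.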
Table 2. Summary of next-generation sequencing CD-HIT 0.9 clusters.
| | Negative control | Recepteur d’origine nantais |
|---|---|---|
| All clusters (count) | 8.1 × 10⁵ | 2.7 × 10⁵ |
| Mean cluster size (# sequences) | 4 | 11 |
| Orphan clusters (1 member) (count) | 6.5 × 10⁵ | 1.9 × 10⁵ |
| Fraction of total sequences (%) | 19 | 7 |
| Medium clusters (1 < | 1.3 × 10⁵ | 6.5 × 10⁴ |
| Fraction of total sequences (%) | 14 | 8 |
| Mean cluster size (# sequences) | 3.8 | 3.4 |
| Large clusters ( | 3.1 × 10⁴ | 1.2 × 10⁴ |
| Fraction of total sequences (%) | 67 | 86 |
| Mean cluster size (# sequences) | 75 | 208 |
Figure 3. Next-generation sequencing (NGS) frequency analysis identifies 5,173 candidate human RON binders. All symbols represent CD-HIT clusters (0.9 sequence identity threshold) with cluster sizes [sequence counts in the recepteur d’origine nantais (RON) sample] ≥10 and enrichment factors (ratio of sequence counts per cluster in RON sample over negative control sample) ≥10. Blue squares represent the clusters that were also identified by the conventional screening campaign. Green triangles represent the NGS clusters that were selected for further screening. Clusters for which no sequence counts were observed in the negative control sample were attributed a sequence count of one, in order to be able to calculate and plot enrichment factors for these clusters.
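The inclusion criteria in this caption can be sketched as a small filter. The code assumes the enrichment factor is the plain ratio of raw sequence counts, since the caption mentions no normalization for sequencing depth; the function name, parameter names, and toy counts are hypothetical.

```python
def enriched_clusters(counts, min_size=10, min_enrichment=10):
    """Select clusters meeting the size and enrichment cut-offs from the caption.

    counts maps a cluster id to (RON sample count, negative control count).
    Returns a dict of cluster id -> enrichment factor for selected clusters.
    """
    selected = {}
    for cluster_id, (ron_count, neg_count) in counts.items():
        # Clusters with no negative control counts are attributed a count of one,
        # as described in the caption, so the ratio stays defined.
        enrichment = ron_count / max(neg_count, 1)
        if ron_count >= min_size and enrichment >= min_enrichment:
            selected[cluster_id] = enrichment
    return selected


# Toy counts (made up): only c1 and c4 pass both cut-offs.
toy_counts = {"c1": (250, 0), "c2": (40, 8), "c3": (9, 0), "c4": (120, 12)}
print(enriched_clusters(toy_counts))
```

In this toy input, `c2` fails on enrichment (40/8 = 5) and `c3` fails on cluster size, even though it is absent from the negative control.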
Figure 4. (A) Binding to human RON (hRON) of candidate binders. Shown are the selective binding ratios of ELISA and FACS experiments. (B) Inhibition of ligand-induced ERK phosphorylation by candidate binders. Shown are the % inhibition of ERK phosphorylation and the selective binding ratios of the ELISA experiment. Green triangles represent 28 randomly selected candidate hRON-binding nanobodies, predicted by the next-generation sequencing (NGS) analysis. Blue squares represent 35 hRON-binding nanobodies, predicted by the NGS analysis and identified in the conventional screening campaign. The white triangle represents nanobody NGS00009 which was not analyzed in the FACS experiment (see Table S2 in Supplementary Material) and as such was given a selective binding ratio of 0, but scored positive in the ELISA and pERK assays.
Figure 5. Absence of correlation between next-generation sequencing cluster size or enrichment factor and binding strength to recepteur d’origine nantais (RON). Shown are selective binding ratios from the ELISA experiment of each candidate human RON-binding nanobody and the (A) size (sequence counts in the RON sample) or (B) enrichment factor (ratio of sequence counts per cluster in RON sample over negative control sample) of the corresponding clusters.
Table 3. Overview characterization of selected anti-human RON nanobodies.
| ID | | EC50 (M) binding | IC50 (M) inhibition of MSP binding | IC50 (M) inhibition of ERK phosphorylation | Epitope bin |
|---|---|---|---|---|---|
| 8A09 | 6.2 × 10⁻⁴ | 9.3 × 10⁻¹¹ | 1.6 × 10⁻⁸ (98%) | 4.9 × 10⁻⁹ (100%) | A |
| 8F09 | 6.4 × 10⁻⁴ | 2.0 × 10⁻¹⁰ | 1.2 × 10⁻⁸ (98%) | 5.2 × 10⁻⁹ (99%) | A |
| 11F05 | 4.6 × 10⁻⁴ | 5.3 × 10⁻¹¹ | 9.7 × 10⁻⁹ (98%) | 6.0 × 10⁻⁹ (100%) | A |
| 8A12 | 2.8 × 10⁻³ | 1.2 × 10⁻¹⁰ | 7.0 × 10⁻⁹ (96%) | 1.3 × 10⁻⁸ (100%) | C–D |
| 8D12 | 5.1 × 10⁻⁴ | 6.1 × 10⁻¹¹ | 1.1 × 10⁻⁸ (98%) | 1.5 × 10⁻⁸ (100%) | A |
| 8C09 | 2.3 × 10⁻³ | 1.7 × 10⁻⁷ | 4.7 × 10⁻⁸ (90%) | 3.3 × 10⁻⁸ (100%) | A |
| 8G11 | 9.7 × 10⁻⁴ | 3.4 × 10⁻⁸ | 2.4 × 10⁻⁷ (92%) | 8.2 × 10⁻⁸ (98%) | B |
| 5C06 | 3.7 × 10⁻³ | 2.0 × 10⁻⁷ | 2.2 × 10⁻⁸ (98%) | 1.2 × 10⁻⁷ (98%) | A |
| 2C06 | 6.9 × 10⁻³ | >1.0 × 10⁻⁶ | 2.5 × 10⁻⁷ (90%) | 3.0 × 10⁻⁷ (92%) | A |
| 5G04 | 2.9 × 10⁻⁴ | 3.3 × 10⁻¹⁰ | n.a. (92%) | 4.9 × 10⁻⁹ (100%) | C–D |
| 2D07 | 5.0 × 10⁻³ | 5.0 × 10⁻⁹ | n.a. (30%) | 1.8 × 10⁻⁸ (100%) | D |
| 2B09 | 1.8 × 10⁻³ | 1.3 × 10⁻⁹ | n.a. (67%) | 9.6 × 10⁻⁸ (96%) | C |
n.a.: IC.
Figure 6. Dose–response curves of selected anti-human RON (hRON) nanobodies inhibiting ligand-induced ERK phosphorylation (A) and binding of ligand to hRON (B). Symbol colors relate to the different epitope bins to which the nanobodies belong (Table 3): bin A (shades of blue), bin B (purple), bin C (green), bin D (red), and bin C–D (shades of orange).
Figure 7. Alignment of human RON (hRON) nanobodies (see also Table S2 in Supplementary Material). The 28 randomly selected candidate hRON-binding nanobodies are identified by the acronym “NGS” followed by a five-digit number. The three sequences marked by an asterisk (NGS00003, NGS00020, and NGS00027) are the non-binding sequences from the randomly selected panel of 28. The 35 nanobodies discovered in the conventional screening campaign and predicted by the next-generation sequencing (NGS) analysis are identified by a one- or two-digit number, followed by a letter, followed by a two-digit number. Numbering of alignment positions was done according to the IMGT V-DOMAIN system (26). CDR regions are highlighted in gray. Dots represent residues identical to the top sequence. Dashes represent gaps introduced by the alignment.