Sara D'Angelo, Jacob Glanville, Fortunato Ferrara, Leslie Naranjo, Cheryl D Gleasner, Xiaohong Shen, Andrew R M Bradbury, Csaba Kiss.
Abstract
In vitro selection has been an essential tool in the development of recombinant antibodies against various antigen targets. Deep sequencing has recently been gaining ground as an alternative and valuable method to analyze such antibody selections. The analysis provides a novel and extremely detailed view of selected antibody populations, and allows the identification of specific antibodies using only sequencing data, potentially eliminating the need for expensive and laborious low-throughput screening methods such as enzyme-linked immunosorbent assay. The high cost and the need for bioinformatics experts and powerful computer clusters, however, have limited the general use of deep sequencing in antibody selections. Here, we describe the AbMining ToolBox, an open source software package for the straightforward analysis of antibody libraries sequenced by the three main next generation sequencing platforms (454, Ion Torrent, MiSeq). The ToolBox is able to identify heavy chain CDR3s as effectively as more computationally intensive software, and can be easily adapted to analyze other portions of antibody variable genes, as well as the selection outputs of libraries based on different scaffolds. The software runs on all common operating systems (Microsoft Windows, Mac OS X, Linux), on standard personal computers, and sequence analysis of 1-2 million reads can be accomplished in 10-15 min, a fraction of the time of competing software. Use of the ToolBox will allow the average researcher to incorporate deep sequence analysis into routine selections from antibody display libraries.
Keywords: AbMining ToolBox; HCDR3; antibody library; deep sequencing; regular expression
Year: 2014 PMID: 24423623 PMCID: PMC3929439 DOI: 10.4161/mabs.27105
Source DB: PubMed Journal: MAbs ISSN: 1942-0862 Impact factor: 5.857
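
The abstract and keywords indicate that the ToolBox finds HCDR3s with regular expressions. As a rough illustration of that idea, and not the ToolBox's actual pattern, the sketch below translates a read in the three forward frames and searches for the residues between a conserved cysteine and a `WG.G` motif. The pattern, the 3 to 30 residue length bounds, the function names, and the example read are all assumptions made for illustration.

```python
import re

# Standard genetic code, indexed in the canonical T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Hypothetical HCDR3 pattern: a cysteine, 3-30 residues, then W-G-x-G.
# Stop codons ('*') and ambiguous codons ('X') cannot match a residue.
RESIDUE = "[ACDEFGHIKLMNPQRSTVWY]"
HCDR3_PATTERN = re.compile(rf"C({RESIDUE}{{3,30}}?)WG{RESIDUE}G")


def translate(nt, frame=0):
    """Translate a nucleotide string starting at the given frame (0, 1 or 2)."""
    nt = nt.upper()
    codons = (nt[i:i + 3] for i in range(frame, len(nt) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def find_hcdr3(read):
    """Return the first HCDR3-like stretch found in any forward frame, or None."""
    for frame in range(3):
        match = HCDR3_PATTERN.search(translate(read, frame))
        if match:
            return match.group(1)
    return None


# Made-up read, written codon by codon so the translation is easy to check.
EXAMPLE_READ = "".join(
    "GCC GTG TAT TAC TGT GCG AGA GAT TAC GGT AGC AGC "
    "TAC TTT GAC TAC TGG GGC CAG GGA ACC CTG GTC".split()
)
print(find_hcdr3(EXAMPLE_READ))  # ARDYGSSYFDY
```

Because the search is a single compiled pattern applied to each frame, this style of analysis needs no alignment to germline genes, which is one plausible reason a regular-expression approach can run quickly on a personal computer.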

Figure 1. PCR priming scheme for the different sequencing platforms
Table 1. List of all primers used for sequencing
| Primer ID | Platform | Sequence |
|---|---|---|
| 454-for | 454 | CGTATCGCCTCCCTCGCGCCATCAGATGTATACTATACGAAGTTATCCTCGAG |
| 454-MID1-rev | 454 | CTATGCGCCTTGCCAGCCCGCTCAGACGAGTGCGTGCAGTGGGTTTGGGATTGGTTTGCC |
| Ion_fw3.vh1 | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATTCTACAGACACAGCCTACATGGAGC |
| Ion_fw3.vh1b | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATACGAGCACAGCCTACATGGAGC |
| Ion_fw3.vh1c | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATTACATGGAGCTGAGCAGCCTGAG |
| Ion_fw3.vh2 | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATATGACCAACATGGACCCTGTGGAC |
| Ion_fw3.vh3 | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATCCAGAGACAATTCCAAGAACACGC |
| Ion_fw3.vh3b | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATTGCAAATGAACAGCCTGAAAACCGAGG |
| Ion_fw3.vh4 | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATAACCAGTTCTCCCTGAAGCTGAGC |
| Ion_fw3.vh5 | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATAGTGGAGCAGCCTGAAGGCC |
| Ion_fw3.vh3c | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATATCTGCAAATGAACAGYCTGAGAGC |
| Ion_fw3.vh3d | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATAGAGACAATTCCAGGAACWYCCTG |
| Ion_fw3.vh7 | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATCCWTGGACACCTCTGYCAGC |
| IGHV1–2 | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATATCAGCACAGCCTACATGGAGCTG |
| Ion_IGHV1–68 | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATTGAGGACAGCCTACATAGAGCTGAG |
| Ion_IGHV3–13 | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATTCAAATGAACAGCCTGAGAGCCGG |
| Ion_IGHV3–43 | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATAACAGTCTGAGAACTGAGGACACCG |
| Ion_IGHV3–47 | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATAGAGACAACGCCAAGAAGTCCTTG |
| Ion_IGHV3–49 | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATTCGCCTATCTGCAAATGAACAGCC |
| Ion_IGHV6–1 | Ion Torrent | CCTCTCTATGGGCAGTCGGTGATACCCAGACACATCCAAGAACCAG |
| Ion_MID_SV5_Rev | Ion Torrent | TTCCATCTCATCCCTGCGTGTCTCCGACTCAGACGTGTGCAGTGGGTTTGGGATTGGTTTGCC |
| Mi_fw3.vh1 | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTTCTACAGACACAGCCTACATGGAGC |
| Mi_fw3.vh1b | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTACGAGCACAGCCTACATGGAGC |
| Mi_fw3.vh1c | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTTACATGGAGCTGAGCAGCCTGAG |
| Mi_fw3.vh2 | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTATGACCAACATGGACCCTGTGGAC |
| Mi_fw3.vh3 | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTCCAGAGACAATTCCAAGAACACGC |
| Mi_fw3.vh3b | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTTGCAAATGAACAGCCTGAAAACCGAGG |
| Mi_fw3.vh4 | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTAACCAGTTCTCCCTGAAGCTGAGC |
| Mi_fw3.vh5 | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTAGTGGAGCAGCCTGAAGGCC |
| Mi_fw3.vh3c | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTATCTGCAAATGAACAGYCTGAGAGC |
| Mi_fw3.vh3d | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTAGAGACAATTCCAGGAACWYCCTG |
| Mi_fw3.vh7 | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTCCWTGGACACCTCTGYCAGC |
| Mi_IGHV1–2 | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTATCAGCACAGCCTACATGGAGCTG |
| Mi_IGHV1–68 | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTTGAGGACAGCCTACATAGAGCTGAG |
| Mi_IGHV3–13 | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTTCAAATGAACAGCCTGAGAGCCGG |
| Mi_IGHV3–43 | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTAACAGTCTGAGAACTGAGGACACCG |
| Mi_IGHV3–47 | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTAGAGACAACGCCAAGAAGTCCTTG |
| Mi_IGHV3–49 | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTTCGCCTATCTGCAAATGAACAGCC |
| Mi_IGHV6–1 | MiSeq | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTACCCAGACACATCCAAGAACCAG |
| Mi_MID1_SV5_Rev | MiSeq | CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCGCAGTGGGTTTGGGATTGGTTTGCC |
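
Several primers in Table 1 contain degenerate IUPAC bases (for example `Y` = C or T and `W` = A or T in Ion_fw3.vh7). When checking reads for a primer, it can help to expand such a primer into its concrete variants or into a regular expression. The following is a minimal sketch of that bookkeeping; the helper names are hypothetical and are not part of the ToolBox.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


def primer_to_regex(primer):
    """Return a regular-expression string matching any variant of the primer."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else f"[{options}]")
    return "".join(parts)


# Ion_fw3.vh7 from Table 1 carries one W and one Y, so it encodes 4 sequences.
ION_FW3_VH7 = "CCTCTCTATGGGCAGTCGGTGATCCWTGGACACCTCTGYCAGC"
variants = expand_degenerate(ION_FW3_VH7)
print(len(variants))  # 4
print(primer_to_regex(ION_FW3_VH7))
```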
Table 2. Sequence Statistics for 454, Ion Torrent and MiSeq data sets of the library
| 454 | Ion 1 | Ion 2 | Ion 2.2 | MiSeq | ||
|---|---|---|---|---|---|---|
| 1,417,344 | 2,151,956 | 3,895,583 | 3,909,701 | 5,697,883 | ||
| 1,296,818 | 817,468 | 1,644,295 | 1,570,152 | 5,612,344 | ||
| 426,894 | 1,049,297 | 797,613 | 5,046,749 | |||
| 553,376 | 613,513 | |||||
| 363,620 | 396,183 | 240,209 | 604,107 | 487,428 | 2,022,431 | |

Figure 2. RegEx validation. (A) Comparison of the frequency of HCDR3s identified by RegEx and VDJFasta on the same 454 data set. The numbers of HCDR3s identified at each frequency are color coded, with the numbers of HCDR3s recognized by either RegEx, VDJFasta, or both indicated. (B) Proportional Venn diagram of the unique HCDR3s identified by RegEx and VDJFasta on the naïve library and an independent data set. The sizes and the intersections of the circles are proportional to the number of HCDR3s. (C) The accumulation of unique HCDR3s identified by RegEx or VDJFasta in the 454 data set. (D) HCDR3 length distribution determined for all three sequencing platforms by RegEx, and for 454 sequencing using either VDJFasta or RegEx.
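
The comparisons summarized in panels B and D reduce to set operations and a length tally over the unique HCDR3s reported by each method. The sketch below shows one way to compute such numbers, assuming each method's output is available as a list of HCDR3 amino acid strings; the function names and the toy sequences are illustrative only and are not data from the paper.

```python
from collections import Counter


def venn_counts(regex_cdr3s, vdjfasta_cdr3s):
    """Count unique HCDR3s found by only one method or by both."""
    regex_set, vdj_set = set(regex_cdr3s), set(vdjfasta_cdr3s)
    return {
        "regex_only": len(regex_set - vdj_set),
        "both": len(regex_set & vdj_set),
        "vdjfasta_only": len(vdj_set - regex_set),
    }


def length_distribution(cdr3s):
    """Return the fraction of unique HCDR3s at each length, sorted by length."""
    unique = set(cdr3s)
    counts = Counter(len(seq) for seq in unique)
    return {length: counts[length] / len(unique) for length in sorted(counts)}


# Toy, made-up inputs standing in for the output of two methods.
regex_hits = ["ARDYGSSYFDY", "ARGGY", "ARDYGSSYFDY", "AKDLW"]
vdj_hits = ["ARDYGSSYFDY", "AKDLW", "ARWWW"]

print(venn_counts(regex_hits, vdj_hits))
print(length_distribution(regex_hits))
```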

Figure 3. (A) The amino acid distribution at each HCDR3 position identified exclusively by RegEx (RegEx+), exclusively by VDJFasta (VDJFasta+), or by both methods (RegEx+/VDJFasta+) using the 454 sequence data set. (B) The amino acid distribution for each sequencing platform using RegEx, for three different HCDR3 lengths (9, 14, and 18).
Table 3. Regex validation by an independent data set of human VH antibody sequences
| Filtered reads | 1,976,330 | |
|---|---|---|
| 1,101,812 | 1,213,417 | |
| 165,903 | 178,055 | |

Figure 4. HCDR3 analysis of different data sets. For each panel, HCDR3s were identified using AbMining ToolBox from each indicated data set and then plotted, as described in Figure 1A. Comparisons of (A) 454 and Ion Torrent. (B) MiSeq and Ion Torrent. (C) 454 and MiSeq. (D) Two independent Ion Torrent sequencing runs.

Figure 5. (A) Minimal amino acid Hamming distance distribution for the three sequencing platforms for all HCDR3 lengths of the naïve library. (B) Library diversity estimate by accumulation using the pooled unique sequences of all three sequencing platforms.
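
Both quantities in this figure are simple to compute on a small scale. The sketch below is a brute-force illustration, assuming the input is a list of HCDR3 amino acid strings: for each unique sequence it finds the smallest Hamming distance to any other unique sequence of the same length, and it builds an accumulation curve of unique sequences seen as reads are added. The function names are hypothetical, and the quadratic pairwise comparison is only practical for modest numbers of sequences.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def minimal_hamming_distances(cdr3s):
    """Map each unique HCDR3 to its distance from the nearest same-length HCDR3.

    Sequences with no same-length partner are omitted from the result.
    """
    by_length = defaultdict(list)
    for seq in set(cdr3s):
        by_length[len(seq)].append(seq)

    distances = {}
    for group in by_length.values():
        if len(group) < 2:
            continue
        for seq in group:
            distances[seq] = min(hamming(seq, other) for other in group if other != seq)
    return distances


def accumulation_curve(cdr3s):
    """Return the running count of unique HCDR3s after each read, in input order."""
    seen = set()
    curve = []
    for seq in cdr3s:
        seen.add(seq)
        curve.append(len(seen))
    return curve


print(minimal_hamming_distances(["ARDY", "ARDF", "GGGG", "ARDYW"]))
print(accumulation_curve(["ARDY", "ARDF", "ARDY", "GGGG"]))  # [1, 2, 2, 3]
```

A curve that keeps rising as reads are added suggests the sampled diversity has not yet saturated, which is the intuition behind using accumulation to estimate library diversity.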

Figure 6. Binding specificity assessment of the 15 most abundant HCDR3 clones by flow cytometry against Ag85 and a negative antigen.
Table 4. Quality trimming optimization including average quality value and step value on an Ion Torrent, 454, and MiSeq sequencing output.
| Step 1 | Step 3 | Step 5 | Step 10 | ||
|---|---|---|---|---|---|
| 16 min | 8 min | 7 min | 6 min | ||
| 1305694 | 1305695 | 1305696 | 1305696 | ||
| 56092 | 56096 | 56096 | 56096 | ||
| 4.2963% | 4.2963% | 4.2963% | 4.2963% | ||
| 13 min | 8 min | 6:30 min | 6 min | ||
| 1228662 | 1231206 | 1233520 | 1238795 | ||
| 32853 | 33514 | 34098 | 35390 | ||
| 2.674% | 2.722% | 2.764% | 2.857% | ||
| 11 min | 7 min | 6 min | 5:30 min | ||
| 1145112 | 1147791 | 1150310 | 1156599 | ||
| 14732 | 15010 | 15283 | 15936 | ||
| 1.2866% | 1.3077% | 1.3286% | 1.3778% | ||
| 11 min | 7 min | 6 min | 5 min | ||
| 1088986 | 1092005 | 1094978 | 1102442 | ||
| 9072 | 9182 | 9307 | 9595 | ||
| 0.833% | 0.841% | 0.850% | 0.870% | ||
| 10 min | 7 min | 6 min | 5 min | ||
| 1026139 | 1029917 | 1033471 | 1061683 | ||
| 6655 | 6718 | 6779 | 6964 | ||
| 0.649% | 0.652% | 0.656% | 0.656% | ||
| 9 min | 6 min | 5:30 min | 5 min | ||
| 921544 | 926888 | 931401 | 942917 | ||
| 5220 | 5268 | 5300 | 5422 | ||
| 0.566% | 0.568% | 0.569% | 0.575% | ||
| 8 min | N/D | N/D | N/D | ||
| 732920 | N/D | N/D | N/D | ||
| 3800 | N/D | N/D | N/D | ||
| 0.52% | N/D | N/D | N/D | ||
| 7:30 min | N/D | N/D | N/D | ||
| 377137 | N/D | N/D | N/D | ||
| 1819 | N/D | N/D | N/D | ||
| 0.48% | N/D | N/D | N/D | ||
| 6:30 min | N/D | N/D | N/D | ||
| 13330 | N/D | N/D | N/D | ||
| 56 | N/D | N/D | N/D | ||
| 0.042% | N/D | N/D | N/D |
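
The caption of Table 4 names two trimming parameters, an average quality value and a step value. One common way to combine such parameters is a sliding-window trimmer, sketched below under explicit assumptions: a fixed-size window moves along the read in increments of `step`, and the read is cut at the start of the first window whose mean Phred quality falls below the threshold. The window size, the cut position, and the function name are assumptions made for illustration; the ToolBox's own trimming procedure may differ.

```python
def trim_by_quality(seq, quals, min_avg_q=20, window=10, step=1):
    """Trim a read at the first window whose mean quality is below min_avg_q.

    seq    -- nucleotide string
    quals  -- list of per-base Phred scores, same length as seq
    window -- number of bases averaged at each position (assumed parameter)
    step   -- how many bases the window advances between checks
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    if not quals:
        return seq
    last_start = max(len(quals) - window + 1, 1)
    for start in range(0, last_start, step):
        chunk = quals[start:start + window]
        if sum(chunk) / len(chunk) < min_avg_q:
            return seq[:start]
    return seq


read = "ACGTACGTAC"
scores = [30] * 6 + [5] * 4  # made-up qualities: good start, poor tail

print(trim_by_quality(read, scores, min_avg_q=20, window=4, step=1))  # ACGT
print(trim_by_quality(read, scores, min_avg_q=20, window=4, step=3))  # ACGTAC
```

In this sketch a larger step inspects fewer windows and can cut later, so it does less work per read and may retain slightly more sequence; this is offered only as a possible intuition for the shorter run times listed under the larger step values in Table 4.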
Table 5. The optimization of average quality value and step value on 454
| Q0 | Q10 | Q15 | Q16 | Q18 | Q20 | Q22 | Q25 | |
|---|---|---|---|---|---|---|---|---|
| 611536 | 611520 | 602941 | 594041 | 561389 | 510993 | 450001 | 356249 | |
| 7605 | 7605 | 7105 | 6682 | 5367 | 3950 | 2907 | 1962 | |
| 1.24% | 1.24% | 1.18% | 1.12% | 0.96% | 0.77% | 0.65% | 0.55% |
Table 6. The optimization of average quality value and step value on MiSeq sequencing output
| Q9 | Q12 | Q15 | Q18 | Q21 | Q24 | |
|---|---|---|---|---|---|---|
| 5067895 | 5067888 | 5067821 | 5066130 | 5046749 | 4983417 | |
| 25530 | 25530 | 25522 | 25435 | 25035 | 24446 | |
| 0.503% | 0.504% | 0.504% | 0.502% | 0.496% | 0.490% |