Wonjun Yang, Aerin Yoon, Sanghoon Lee, Soohyun Kim, Jungwon Han, Junho Chung.
Abstract
Phage display technology provides a powerful tool to screen a library for a binding molecule via an enrichment process. It has been adopted as a critical technology in the development of therapeutic antibodies. However, a major drawback of phage display technology is that the degree of enrichment cannot be controlled during the bio-panning process, so screening frequently yields only a limited number of clones. In this study, we applied next-generation sequencing (NGS) to screen clones from a library and determine whether a greater number of clones can be identified using NGS than using conventional methods. Three chicken immune single-chain variable fragment (scFv) libraries were subjected to bio-panning on prostate-specific antigen (PSA). Phagemid DNA prepared from the original libraries as well as from the Escherichia coli pool after each round of bio-panning was analyzed using NGS, and the heavy chain complementarity-determining region 3 (HCDR3) sequences of the scFv clones were determined. Subsequently, through two-step linker PCR and cloning, the entire scFv gene was retrieved and analyzed for its reactivity to PSA in a phage enzyme immunoassay. After four rounds of bio-panning, the conventional colony screening method was performed for comparison. The scFv clones retrieved from NGS analysis included all clones identified by the conventional colony screening method as well as many additional clones. The enrichment of the HCDR3 sequence throughout the bio-panning process was a positive predictive factor for the selection of PSA-reactive scFv clones.
Year: 2017 PMID: 28336957 PMCID: PMC5382563 DOI: 10.1038/emm.2017.22
Source DB: PubMed Journal: Exp Mol Med ISSN: 1226-3613 Impact factor: 8.718
HCDR3 amino-acid sequences selected using the conventional colony screening method, and binding reactivity measurement of the antibody clones
| Library 1 | Cluster 1 | DFGSGVGEIDA | 3.81 | 1.04 | 1.010 |
| | | GIESDSDGYMTAEEIDA | 0.13 | 1.04 | 0.977 |
| | Cluster 2 | AAHSTYIWGGYEAGSIDA | 6.49 | 4.17 | 0.669 |
| | | SAVSSCSSGSCSASWIDA | 1.16 | 2.08 | 0.873 |
| | | TADDGFSCGGYGLCADRIDA | 0.39 | 1.04 | 0.723 |
| | | ESGNGGWITAARIDA | 0.08 | 1.04 | 0.767 |
| | | SSHSTYIWGAYEAGSIDA | 0.03 | 2.08 | 0.651 |
| | Cluster 4 | APGTGSGYCGIWTYTTAGCIDA | 0.03 | 1.04 | 0.964 |
| | | GRISYICADYDAGCIDA | 0.02 | 5.21 | 1.063 |
| | | SSHSTYIWGGYEAGSIDA | 0.01 | 2.08 | 0.916 |
| Library 2 | Cluster 2 | SSYSDGATVIYNIDA | 0.69 | 1.04 | 0.870 |
| | Cluster 3 | GRISYICADYDAGCIDA | 0.04 | 6.25 | 1.063 |
| | | AAGSWCAWGTGSCAGSIDA | 0.02 | 5.21 | 1.067 |
| | | AAGSWCAWGTGSCAGNIDA | 0.01 | 1.04 | 0.985 |
| | | TTGGDFYSGIDTAGYIDA | 0.01 | 5.21 | 0.938 |
| | | APGTGSGYCGIWTYTTAGCIDA | 0.01 | 3.13 | 0.964 |
| Library 3 | Cluster 2 | AAGSGYIYSGSAGWIDA | 1.07 | 3.13 | 0.941 |
| | Cluster 3 | AAGSWCAWGTGSCAGSIDA | 0.03 | 4.17 | 0.918 |
| | | GRISYICADYDAGCIDA | 0.02 | 8.33 | 1.063 |
| | | TTGGDFYSGIDTAGYIDA | 0.02 | 2.08 | 0.889 |
| | | AAGSWCAWGAGSCAGSIDA | 0.01 | 1.04 | 0.914 |
| | | AAGSGYVYSGSAGWIDA | 0.01 | 2.08 | 1.021 |
Abbreviations: HCDR3, heavy chain complementarity-determining region 3; NGS, next-generation sequencing; O.D., optical density.
Sequence read counts by preprocessing raw sequencing data
| Library 1 | R0 | 664 955 | 393 749 | 393 624 | 125 | 310 589 (78.9) | 205 255 |
| | R1 | 663 061 | 377 630 | 377 484 | 146 | 298 474 (79) | 198 150 |
| | R2 | 391 118 | 229 873 | 229 773 | 100 | 181 430 (78.9) | 128 513 |
| | R3 | 673 875 | 388 341 | 388 179 | 162 | 314 517 (81) | 148 787 |
| | R4 | 621 174 | 379 630 | 379 611 | 19 | 334 387 (88.1) | 27 141 |
| Library 2 | R0 | 432 274 | 256 268 | 256 199 | 69 | 193 262 (75.4) | 148 862 |
| | R1 | 661 248 | 417 426 | 417 323 | 103 | 316 150 (75.7) | 221 423 |
| | R2 | 608 850 | 363 553 | 363 460 | 93 | 274 100 (75.4) | 197 190 |
| | R3 | 547 353 | 342 189 | 342 123 | 66 | 289 287 (84.5) | 66 545 |
| | R4 | 455 119 | 290 741 | 290 722 | 19 | 274 635 (94.5) | 22 763 |
| Library 3 | R0 | 616 410 | 360 830 | 360 783 | 47 | 279 996 (77.6) | 164 869 |
| | R1 | 608 045 | 370 090 | 370 033 | 57 | 288 172 (77.9) | 167 249 |
| | R2 | 619 731 | 373 093 | 373 038 | 55 | 290 056 (77.7) | 168 084 |
| | R3 | 690 602 | 419 796 | 419 757 | 39 | 343 996 (81.9) | 74 611 |
| | R4 | 568 948 | 354 314 | 354 301 | 13 | 287 126 (81) | 21 884 |
Abbreviations: FLASH, fast length adjustment of short reads; HCDR3, heavy chain complementarity-determining region 3.
Dunn index on hierarchical clustering to estimate optimal number of clusters in scFv nucleotide sequence profile data
| 2 | 3 | 4 | 5 | 6 | |
| Library 1 | 0.0863 | 0.0723 | |||
| Library 2 | 0.0564 | 0.0564 | 0.0845 | ||
| Library 3 | 0.1508 | 0.1544 | 0.0893 | 0.0893 | |
Abbreviation: scFv, single-chain variable fragment. Bold numbers indicate the largest Dunn index in each library.
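As a rough illustration of the quantity tabulated above, the Dunn index of a clustering is the smallest distance between points in different clusters divided by the largest distance between points in the same cluster; a larger value indicates compact, well-separated clusters. The sketch below assumes Euclidean distance on numeric profile vectors and takes cluster labels as given. The distance metric, linkage method, and input profiles used in the study are not stated in this record, so the function name `dunn_index` and the toy data are purely illustrative.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Dunn index: min between-cluster distance / max within-cluster diameter.

    `points` is a sequence of equal-length numeric tuples and `labels` gives
    the cluster of each point. Euclidean distance is an assumption here.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the Dunn index needs at least two clusters")

    # Largest distance between two members of the same cluster.
    max_diameter = max(
        (dist(a, b)
         for members in clusters.values()
         for a, b in combinations(members, 2)),
        default=0.0,
    )
    if max_diameter == 0.0:
        raise ValueError("all clusters have zero diameter; index is undefined")

    # Smallest distance between two members of different clusters.
    min_separation = min(
        dist(a, b)
        for members_a, members_b in combinations(clusters.values(), 2)
        for a in members_a
        for b in members_b
    )
    return min_separation / max_diameter


# Toy data (hypothetical): two tight pairs far apart from each other.
toy_points = [(0, 0), (0, 1), (5, 0), (5, 1)]
good_split = dunn_index(toy_points, [0, 0, 1, 1])
poor_split = dunn_index(toy_points, [0, 1, 0, 1])
```

To choose a number of clusters in this way, one would cut the hierarchical clustering tree at each candidate number, compute the index for each cut, and keep the cut with the largest value.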
Figure 1. Heat map representing the population of heavy chain complementarity-determining region 3 (HCDR3) sequences in each cluster through bio-panning rounds. Red and blue denote high and low proportions of the HCDR3 sequence, respectively. (a) scFv library 1, (b) scFv library 2 and (c) scFv library 3.
Figure 2. Line graph representing population shifts in HCDR3 sequences through bio-panning rounds. (a) scFv library 1, (b) scFv library 2 and (c) scFv library 3.
HCDR3 amino-acid sequences selected in each cluster from NGS and binding reactivity measurement of antibody clones
| Library 1 | Cluster 1 | GVYSGSPDGYDIDA | 0.32% | 502 | 550 | 289 | 1133 | 1235 | 0.454 |
| | | TTCVGSSYCGGENIDA | 0.16% | 8061 | 8199 | 4786 | 6273 | 603 | 0.173 |
| | | GAYSDWGAGFIDA | 0.08% | 2016 | 2033 | 1237 | 1809 | 301 | 0.161 |
| | | DGDSGWGVYLNSAGNIDA | 0.03% | 39 | 25 | 19 | 76 | 133 | 0.153 |
| | Cluster 2 | YAGSGWTYYSSDVGSIDA | 2.16% | 0 | 1 | 2 | 1498 | 8314 | 0.620 |
| | | GVYSASGCCDSIDT | 1.93% | 0 | 0 | 2 | 1445 | 7443 | 1.032 |
| | | SAHSTYIWGGYEAGSIDA | 1.41% | 0 | 1 | 0 | 1049 | 5420 | 1.075 |
| | | GGGAGYGAPSIDT | 1.05% | 0 | 0 | 0 | 866 | 4034 | 0.871 |
| | | DVYSGLITANTIDA | 0.67% | 0 | 1 | 1 | 325 | 2607 | 0.639 |
| | Cluster 3 | SSHSTYIWGAYEAGCIDA | 0.02% | 5 | 0 | 0 | 5 | 64 | 0.757 |
| | | RAYGGGYCGCIEDIDA | 0.01% | 0 | 0 | 0 | 12 | 44 | 0.323 |
| | | AASTWSFYGSAEDIDA | 0.01% | 0 | 0 | 0 | 3 | 31 | 0.725 |
| | Cluster 4 | APGTGSGYCGIWTYTTAGSIDA | 0.04% | 0 | 0 | 0 | 1 | 39 | 0.323 |
| | | GRISYICADYEAGSIDA | 0.02% | 0 | 0 | 0 | 0 | 61 | 0.407 |
| Library 2 | Cluster 1 | GAYGHCDGWCAVDSIDT | 0.07% | 1673 | 2610 | 2430 | 823 | 196 | 0.175 |
| | | AAGSGYCGWGDCIAGSIDA | 0.07% | 108 | 159 | 139 | 184 | 193 | 0.167 |
| | | GIYGYSGGDYAAAEIDA | 0.06% | 1145 | 1815 | 1712 | 621 | 167 | 0.179 |
| | | GAGGSCDGGSWCSPGIIDA | 0.04% | 1423 | 2179 | 1964 | 595 | 121 | 0.187 |
| | | TRGGAGSGWYWYSGIAGIIDA | 0.03% | 782 | 1172 | 1118 | 399 | 96 | 0.180 |
| | Cluster 2 | TAGCGPWSYITAGCIDA | 0.21% | 0 | 0 | 6 | 969 | 604 | 1.119 |
| | | DAAYGYCGTWAGCAGRIDA | 0.21% | 12 | 22 | 37 | 5404 | 606 | 1.187 |
| | | CAYSGCTGGWSTSSIDA | 0.20% | 18 | 23 | 19 | 1046 | 592 | 1.007 |
| | | DVYGCNSYGCPYIGNTIDA | 0.09% | 0 | 2 | 3 | 190 | 259 | 1.254 |
| | | RAFSGCCDADSIDA | 0.07% | 4 | 5 | 3 | 275 | 195 | 0.845 |
| | Cluster 3 | SSSGTTYYSSGVISAGGIDA | 0.17% | 0 | 0 | 0 | 62 | 488 | 0.167 |
| | | GRISYICVDYDAGCIDA | 0.07% | 0 | 0 | 0 | 59 | 209 | 0.706 |
| | | NAYTSAYITDIDS | 0.06% | 0 | 1 | 1 | 103 | 188 | 0.944 |
| | | SAYSDSCCAEDIDA | 0.04% | 0 | 0 | 1 | 53 | 106 | 0.876 |
| | | SAFGGGACCYTAGTIDA | 0.03% | 0 | 4 | 0 | 15 | 103 | 0.165 |
| Library 3 | Cluster 1 | DGSGCGWSAAGCIDA | 0.35% | 9970 | 10385 | 10438 | 4639 | 924 | 0.160 |
| | | AATYSWLHSGIDA | 0.29% | 112 | 104 | 103 | 246 | 1045 | 0.728 |
| | | DGSDCGWSAAGCIDA | 0.06% | 2430 | 2498 | 2476 | 1164 | 222 | 0.146 |
| | | GTGSWCYSGADSIDT | 0.06% | 2206 | 2381 | 2367 | 1006 | 207 | 0.167 |
| | | SAAGYWYAGSIDA | 0.05% | 10 | 8 | 12 | 121 | 194 | 0.138 |
| | Cluster 2 | TAGGDFYSGVDTAGYIDA | 4.79% | 1 | 1 | 4 | 3070 | 17187 | 1.064 |
| | Cluster 3 | GSGYSCWSYAGCIDA | 0.66% | 1 | 1 | 1 | 1034 | 2132 | 1.083 |
| | | GRIYYICADYDAGCIDA | 0.53% | 0 | 0 | 1 | 429 | 1890 | 1.052 |
| | | TADSGFGCGGYGLCAAFIDA | 0.09% | 2 | 2 | 2 | 743 | 303 | 0.907 |
| | | TADIGYCFGGGIGCIDA | 0.08% | 0 | 0 | 0 | 86 | 289 | 0.984 |
| | | SAGGSYGYRYMDTAAAIDA | 0.07% | 2 | 1 | 1 | 195 | 269 | 0.861 |
Abbreviations: HCDR3, heavy chain complementarity-determining region 3; NGS, next-generation sequencing.
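The abstract reports that enrichment of an HCDR3 sequence across bio-panning rounds predicted PSA reactivity. One simple way to express enrichment is a log2 fold change between the first and last round, sketched below. This is not the authors' scoring method, which this record does not describe. The sketch assumes that the five integer columns of the table above are read counts for rounds R0 to R4 (the column headers did not survive extraction), adds a pseudocount of 1 to avoid division by zero, and does not normalize for sequencing depth. The name `log2_enrichment` is hypothetical.

```python
from math import log2


def log2_enrichment(counts, pseudocount=1.0):
    """log2 fold change of the last-round count over the first-round count.

    `counts` lists read counts in round order. A pseudocount (an assumption
    of this sketch) keeps the ratio finite for sequences unseen in round 0.
    """
    if len(counts) < 2:
        raise ValueError("need counts from at least two rounds")
    first, last = counts[0], counts[-1]
    return log2((last + pseudocount) / (first + pseudocount))


# Two rows copied from the table above; the R0..R4 reading is assumed.
rising = log2_enrichment([0, 1, 2, 1498, 8314])          # YAGSGWTYYSSDVGSIDA
falling = log2_enrichment([8061, 8199, 4786, 6273, 603])  # TTCVGSSYCGGENIDA
```

A positive score marks a sequence whose count grew through panning and a negative score marks one that was depleted. Dividing each count by the total reads of its round before taking the ratio would make scores comparable across rounds of different sequencing depth.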
Figure 3. Schematic representation of next-generation sequencing and two-step linker PCR. The structure of the scFv gene is shown, with the CDRs and frameworks of the variable regions indicated by colored boxes. (a) For NGS analysis, most of the VH region, including HCDR3, was amplified and sequenced using specific primers as described in Materials and Methods. The sequencing coverage is indicated with dashed lines. (b) To retrieve the scFv gene, two-step linker PCR was performed using primers annealing to HCDR3, LFR1 and HFR4. The first step of PCR was performed using LFR1_F and HCDR3_R primers and HCDR3_F and HFR4_R primers. The linker PCR was performed using LFR1_F and HFR4_R primers.
Figure 4. Binding reactivity of scFv antibodies retrieved from selected HCDR3 amino-acid sequences in each cluster using NGS. (a) scFv library 1, (b) scFv library 2 and (c) scFv library 3. ANOVA with Tukey's multiple-comparison test was used to compare cluster 1 with other clusters. In library 3, the P-value was calculated using the Mann–Whitney U-test. *P-value <0.05; **P-value <0.01; ***P-value <0.001. ANOVA, analysis of variance.
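For readers unfamiliar with the Mann–Whitney U-test named in the Figure 4 caption, the following is a minimal pure-Python sketch of the U statistic with an exact two-sided permutation P-value. It is an illustration only and does not claim to reproduce the published analysis. The input values are the last-column values of the library 3 rows in the NGS table, assumed here to be the binding reactivity (O.D.) readings, and the choice of cluster 1 versus cluster 3 as the two groups is also an assumption. The function names are hypothetical.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U statistic of x: number of (x_i, y_j) pairs with x_i > y_j; ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_two_sided_p(x, y):
    """Exact two-sided P-value by enumerating every split of the pooled values.

    Feasible only for small samples: the number of splits is C(n + m, n).
    """
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    mean_u = n * m / 2.0
    observed = abs(mann_whitney_u(x, y) - mean_u)
    hits = total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        group_x = [pooled[i] for i in idx]
        group_y = [pooled[i] for i in range(n + m) if i not in chosen]
        total += 1
        # Count splits at least as extreme as the observed one.
        if abs(mann_whitney_u(group_x, group_y) - mean_u) >= observed - 1e-12:
            hits += 1
    return hits / total


# Library 3 values from the NGS table (assumed to be O.D. readings).
cluster_1 = [0.160, 0.728, 0.146, 0.167, 0.138]
cluster_3 = [1.083, 1.052, 0.907, 0.984, 0.861]
u_stat = mann_whitney_u(cluster_1, cluster_3)
p_value = exact_two_sided_p(cluster_1, cluster_3)
```

Because every cluster 1 value lies below every cluster 3 value, the statistic takes its minimum of 0, and only 2 of the 252 possible splits are as extreme. For larger samples, a library routine with a normal approximation would be the usual choice.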