Konrad Krawczyk, Matthew I J Raybould, Aleksandr Kovaltsuk, Charlotte M Deane.
Abstract
Recently it has become possible to query the great diversity of natural antibody repertoires using next-generation sequencing (NGS). These methods are capable of producing millions of sequences in a single experiment. Here we compare clinical-stage therapeutic antibodies to the ~1 billion sequences from 60 independent sequencing studies in the Observed Antibody Space database, which includes antibody sequences from NGS analysis of immunoglobulin gene repertoires. Of 242 post-Phase 1 antibodies, we found 16 with sequence identity matches of 95% or better for both heavy and light chains. There are also 54 perfect matches to therapeutic CDR-H3 regions in the NGS outputs, suggesting a nontrivial amount of convergence between naturally observed sequences and those developed artificially. This has potential implications for both the legal protection of commercial antibodies and the discovery of antibody therapeutics.
Keywords: Antibody therapeutics; data mining; next generation sequencing; patent
Year: 2019 PMID: 31216939 PMCID: PMC6748601 DOI: 10.1080/19420862.2019.1633884
Source DB: PubMed Journal: MAbs ISSN: 1942-0862 Impact factor: 5.857
Figure 1. Best sequence identity matches to Clinical Stage Therapeutics (CSTs) in naturally sourced NGS datasets. (a) Heavy and light chain variable regions of 242 CST sequences from Raybould et al.[7] aligned to variable region sequences in OAS.[18] (b) Heavy and light chain IMGT CDR regions of 242 CSTs aligned to IMGT CDR regions in OAS. Fully human sequences are denoted by blue dots, humanized by green, chimeric by magenta and mouse by red. In the small number of cases where CSTs had the same identity values but different antibody types, we report the antibody type by majority vote of proximal CSTs. The precise alignment values can be found in Table 1 and their distributions in Figures 2 and 3. Interactive versions of these charts are available at http://naturalantibody.com/therapeutics.
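Sequence identity is commonly computed as the percentage of aligned positions at which two sequences carry the same residue. The Python sketch below shows one simple way to obtain a "best match" value of the kind plotted in Figure 1. It assumes the sequences are already aligned to equal length (for example through a shared numbering scheme), which is an assumption made here and not a description of the authors' pipeline. The function names `percent_identity` and `best_match` and the toy sequences are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def percent_identity(query: str, target: str) -> float:
    """Percentage of positions at which two pre-aligned sequences agree."""
    if len(query) != len(target):
        raise ValueError("sequences must be aligned to the same length")
    if not query:
        raise ValueError("sequences must be non-empty")
    matches = sum(q == t for q, t in zip(query, target))
    return 100.0 * matches / len(query)


def best_match(query: str, candidates: Iterable[str]) -> Tuple[float, Optional[str]]:
    """Return (identity, sequence) for the closest equal-length candidate.

    Candidates of a different length are skipped, because this sketch
    does not perform any alignment itself.
    """
    best_score, best_seq = 0.0, None
    for cand in candidates:
        if len(cand) != len(query):
            continue
        score = percent_identity(query, cand)
        if best_seq is None or score > best_score:
            best_score, best_seq = score, cand
    return best_score, best_seq


# Toy, made-up sequences used only to exercise the functions.
query = "ARDYYGSSYFDY"
repertoire = ["ARDYYGSSYFDY", "ARDYYGSGYFDV", "ARGGY"]
score, hit = best_match(query, repertoire)
print(f"best identity: {score:.0f}% ({hit})")
```

A search against a repertoire the size of OAS would need indexing or a dedicated alignment tool; the linear scan here is only meant to make the identity measure concrete.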
Table 1. Best sequence identities of Clinical Stage Therapeutic (CST) antibodies to sequences found in public NGS repositories. Sequence identities are given for the best alignment of a sequence from a public repository to a CST heavy or light chain variable region, heavy or light chain CDR regions, or CDR-H3 alone (IMGT-defined). The CSTs are identified by their names in the leftmost column. The entries are sorted from top to bottom by the highest heavy chain identity. An interactive version of this table, together with the aligned sequences, is available at http://naturalantibody.com/therapeutics.
| CST Name | Best Heavy Chain Identity (%) | Best Light Chain Identity (%) | Best Heavy Chain CDRs Identity (%) | Best Light Chain CDRs Identity (%) | Best CDR-H3 Identity (%) |
|---|---|---|---|---|---|
| | 98 | 98 | 96 | 100 | 100 |
| | 97 | 100 | 90 | 100 | 92 |
| | 97 | 99 | 96 | 100 | 100 |
| | 97 | 99 | 93 | 95 | 87 |
| | 97 | 97 | 94 | 94 | 88 |
| | 96 | 100 | 96 | 100 | 100 |
| | 96 | 100 | 89 | 100 | 92 |
| | 96 | 100 | 88 | 100 | 81 |
| | 96 | 99 | 92 | 100 | 91 |
| | 96 | 98 | 93 | 94 | 100 |
| | 96 | 98 | 90 | 94 | 92 |
| | 96 | 97 | 92 | 95 | 90 |
| | 96 | 96 | 90 | 95 | 94 |
| | 95 | 100 | 85 | 100 | 77 |
| | 95 | 98 | 89 | 100 | 91 |
| | 95 | 96 | 88 | 100 | 100 |
| | 95 | 92 | 87 | 88 | 81 |
| | 95 | 87 | 100 | 88 | 100 |
| | 94 | 99 | 100 | 100 | 100 |
| | 94 | 98 | 89 | 100 | 100 |
| | 94 | 97 | 100 | 86 | 100 |
| | 94 | 97 | 90 | 94 | 85 |
| | 94 | 97 | 82 | 100 | 83 |
| | 94 | 96 | 96 | 88 | 100 |
| | 94 | 96 | 93 | 95 | 100 |
| | 94 | 95 | 93 | 94 | 92 |
| | 94 | 94 | 89 | 85 | 82 |
| | 94 | 93 | 89 | 88 | 83 |
| | 93 | 100 | 88 | 100 | 100 |
| | 93 | 100 | 82 | 100 | 91 |
| | 93 | 99 | 88 | 94 | 94 |
| | 93 | 98 | 96 | 100 | 100 |
| | 93 | 98 | 87 | 94 | 87 |
| | 93 | 98 | 85 | 100 | 88 |
| | 93 | 98 | 82 | 94 | 92 |
| | 93 | 97 | 88 | 93 | 90 |
| | 93 | 96 | 80 | 84 | 100 |
| | 92 | 100 | 90 | 100 | 93 |
| | 92 | 100 | 89 | 100 | 91 |
| | 92 | 100 | 83 | 100 | 86 |
| | 92 | 100 | 75 | 100 | 88 |
| | 92 | 99 | 89 | 100 | 91 |
| | 92 | 99 | 85 | 100 | 100 |
| | 92 | 99 | 84 | 100 | 87 |
| | 92 | 97 | 85 | 100 | 90 |
| | 92 | 97 | 81 | 100 | 81 |
| | 92 | 96 | 82 | 100 | 84 |
| | 92 | 92 | 82 | 70 | 83 |
| | 92 | 90 | 86 | 89 | 92 |
| | 92 | 89 | 88 | 55 | 100 |
| | 92 | 87 | 92 | 65 | 100 |
| | 91 | 99 | 92 | 100 | 100 |
| | 91 | 99 | 88 | 100 | 90 |
| | 91 | 98 | 85 | 100 | 100 |
| | 91 | 97 | 82 | 94 | 92 |
| | 91 | 97 | 80 | 94 | 90 |
| | 91 | 96 | 84 | 89 | 90 |
| | 91 | 96 | 80 | 100 | 80 |
| | 91 | 95 | 78 | 95 | 83 |
| | 91 | 95 | 78 | 88 | 91 |
| | 91 | 94 | 80 | 82 | 90 |
| | 91 | 91 | 78 | 83 | 83 |
| | 91 | 90 | 89 | 94 | 100 |
| | 91 | 89 | 92 | 100 | 88 |
| | 91 | 88 | 87 | 65 | 75 |
| | 91 | 71 | 82 | 75 | 83 |
| | 90 | 100 | 85 | 100 | 91 |
| | 90 | 100 | 82 | 100 | 78 |
| | 90 | 100 | 81 | 100 | 90 |
| | 90 | 100 | 78 | 100 | 93 |
| | 90 | 100 | 78 | 100 | 88 |
| | 90 | 99 | 88 | 100 | 100 |
| | 90 | 98 | 80 | 100 | 84 |
| | 90 | 98 | 80 | 100 | 80 |
| | 90 | 98 | 77 | 100 | 90 |
| | 90 | 97 | 89 | 94 | 90 |
| | 90 | 97 | 82 | 94 | 92 |
| | 90 | 97 | 80 | 100 | 100 |
| | 90 | 97 | 80 | 94 | 71 |
| | 90 | 96 | 91 | 100 | 88 |
| | 90 | 96 | 91 | 90 | 100 |
| | 90 | 95 | 89 | 83 | 90 |
| | 90 | 95 | 80 | 85 | 80 |
| | 90 | 94 | 81 | 94 | 85 |
| | 90 | 92 | 82 | 73 | 84 |
| | 90 | 91 | 84 | 78 | 90 |
| | 90 | 90 | 82 | 88 | 83 |
| | 90 | 88 | 78 | 73 | 71 |
| | 90 | 87 | 96 | 100 | 100 |
| | 90 | 87 | 87 | 94 | 100 |
| | 89 | 100 | 91 | 100 | 94 |
| | 89 | 100 | 81 | 100 | 94 |
| | 89 | 100 | 77 | 100 | 100 |
| | 89 | 100 | 75 | 100 | 84 |
| | 89 | 98 | 87 | 100 | 80 |
| | 89 | 98 | 67 | 88 | 83 |
| | 89 | 96 | 85 | 100 | 91 |
| | 89 | 96 | 80 | 90 | 85 |
| | 89 | 95 | 87 | 94 | 100 |
| | 89 | 94 | 75 | 89 | 75 |
| | 89 | 93 | 82 | 94 | 100 |
| | 89 | 93 | 75 | 83 | 90 |
| | 89 | 92 | 88 | 86 | 100 |
| | 89 | 92 | 87 | 73 | 100 |
| | 89 | 92 | 80 | 91 | 100 |
| | 89 | 91 | 72 | 73 | 61 |
| | 89 | 90 | 92 | 88 | 100 |
| | 89 | 87 | 89 | 100 | 83 |
| | 89 | 87 | 85 | 90 | 91 |
| | 88 | 100 | 80 | 100 | 86 |
| | 88 | 100 | 80 | 100 | 80 |
| | 88 | 100 | 77 | 100 | 78 |
| | 88 | 99 | 71 | 100 | 82 |
| | 88 | 96 | 85 | 95 | 90 |
| | 88 | 94 | 68 | 89 | 63 |
| | 88 | 92 | 73 | 77 | 78 |
| | 88 | 91 | 95 | 100 | 100 |
| | 88 | 91 | 80 | 95 | 85 |
| | 88 | 91 | 75 | 100 | 83 |
| | 87 | 100 | 83 | 100 | 86 |
| | 87 | 97 | 76 | 95 | 72 |
| | 87 | 95 | 75 | 83 | 70 |
| | 87 | 94 | 82 | 94 | 84 |
| | 87 | 94 | 79 | 88 | 69 |
| | 87 | 90 | 86 | 95 | 83 |
| | 87 | 90 | 77 | 100 | 90 |
| | 87 | 88 | 75 | 100 | 92 |
| | 87 | 86 | 88 | 91 | 100 |
| | 87 | 85 | 100 | 100 | 100 |
| | 87 | 83 | 79 | 94 | 75 |
| | 86 | 100 | 82 | 100 | 92 |
| | 86 | 100 | 80 | 100 | 85 |
| | 86 | 100 | 67 | 100 | 73 |
| | 86 | 99 | 78 | 100 | 83 |
| | 86 | 98 | 85 | 100 | 90 |
| | 86 | 96 | 80 | 100 | 73 |
| | 86 | 94 | 58 | 90 | 63 |
| | 86 | 92 | 96 | 86 | 100 |
| | 86 | 92 | 87 | 95 | 80 |
| | 86 | 91 | 89 | 94 | 92 |
| | 86 | 90 | 92 | 77 | 88 |
| | 86 | 90 | 88 | 94 | 90 |
| | 86 | 89 | 81 | 89 | 100 |
| | 86 | 87 | 92 | 88 | 100 |
| | 86 | 87 | 84 | 88 | 90 |
| | 86 | 87 | 80 | 72 | 86 |
| | 86 | 87 | 77 | 100 | 91 |
| | 86 | 86 | 88 | 83 | 91 |
| | 86 | 85 | 88 | 94 | 90 |
| | 85 | 100 | 84 | 100 | 80 |
| | 85 | 96 | 70 | 95 | 71 |
| | 85 | 95 | 62 | 89 | 84 |
| | 85 | 93 | 84 | 100 | 81 |
| | 85 | 91 | 80 | 100 | 90 |
| | 85 | 90 | 68 | 76 | 92 |
| | 85 | 89 | 77 | 83 | 81 |
| | 85 | 89 | 77 | 77 | 90 |
| | 85 | 88 | 89 | 100 | 100 |
| | 85 | 88 | 70 | 70 | 90 |
| | 85 | 87 | 96 | 100 | 100 |
| | 85 | 84 | 67 | 66 | 66 |
| | 85 | 82 | 90 | 94 | 92 |
| | 85 | 82 | 73 | 82 | 80 |
| | 84 | 91 | 83 | 91 | 87 |
| | 84 | 91 | 73 | 100 | 87 |
| | 84 | 90 | 92 | 100 | 100 |
| | 84 | 88 | 67 | 78 | 75 |
| | 84 | 87 | 92 | 100 | 100 |
| | 84 | 86 | 79 | 88 | 75 |
| | 84 | 85 | 96 | 94 | 100 |
| | 83 | 93 | 67 | 80 | 55 |
| | 83 | 91 | 78 | 100 | 83 |
| | 83 | 90 | 90 | 100 | 83 |
| | 83 | 90 | 78 | 91 | 75 |
| | 83 | 89 | 85 | 100 | 90 |
| | 83 | 89 | 82 | 94 | 84 |
| | 83 | 89 | 76 | 72 | 100 |
| | 83 | 89 | 64 | 78 | 77 |
| | 83 | 85 | 83 | 88 | 92 |
| | 83 | 85 | 75 | 88 | 83 |
| | 83 | 84 | 90 | 90 | 92 |
| | 83 | 84 | 65 | 78 | 76 |
| | 82 | 91 | 80 | 83 | 86 |
| | 82 | 90 | 65 | 72 | 81 |
| | 82 | 88 | 93 | 94 | 93 |
| | 82 | 88 | 75 | 82 | 83 |
| | 82 | 85 | 87 | 77 | 100 |
| | 82 | 84 | 86 | 94 | 100 |
| | 81 | 94 | 59 | 83 | 88 |
| | 81 | 92 | 82 | 100 | 83 |
| | 81 | 90 | 75 | 83 | 83 |
| | 81 | 90 | 63 | 77 | 78 |
| | 81 | 89 | 68 | 94 | 73 |
| | 81 | 88 | 88 | 100 | 100 |
| | 81 | 88 | 86 | 95 | 85 |
| | 81 | 88 | 83 | 77 | 87 |
| | 81 | 87 | 90 | 100 | 100 |
| | 81 | 87 | 83 | 100 | 86 |
| | 81 | 86 | 89 | 86 | 100 |
| | 81 | 86 | 81 | 88 | 90 |
| | 81 | 80 | 70 | 29 | 100 |
| | 80 | 98 | 62 | 100 | 62 |
| | 80 | 91 | 90 | 86 | 93 |
| | 80 | 88 | 76 | 94 | 88 |
| | 80 | 88 | 75 | 83 | 91 |
| | 80 | 88 | 71 | 88 | 81 |
| | 80 | 88 | 66 | 88 | 81 |
| | 80 | 87 | 77 | 100 | 86 |
| | 80 | 87 | 67 | 77 | 53 |
| | 80 | 87 | 65 | 57 | 78 |
| | 80 | 86 | 86 | 90 | 84 |
| | 80 | 82 | 80 | 95 | 100 |
| | 80 | 82 | 76 | 94 | 90 |
| | 80 | 79 | 82 | 88 | 92 |
| | 79 | 89 | 83 | 83 | 71 |
| | 79 | 87 | 81 | 100 | 100 |
| | 79 | 85 | 74 | 95 | 91 |
| | 79 | 84 | 84 | 95 | 88 |
| | 79 | 84 | 71 | 72 | 83 |
| | 79 | 83 | 82 | 83 | 84 |
| | 78 | 89 | 92 | 77 | 100 |
| | 78 | 85 | 78 | 87 | 75 |
| | 78 | 82 | 96 | 90 | 100 |
| | 77 | 93 | 90 | 88 | 93 |
| | 77 | 92 | 65 | 94 | 80 |
| | 77 | 91 | 83 | 95 | 87 |
| | 77 | 90 | 80 | 95 | 80 |
| | 77 | 88 | 76 | 95 | 90 |
| | 77 | 86 | 81 | 82 | 93 |
| | 77 | 83 | 80 | 86 | 88 |
| | 77 | 83 | 76 | 91 | 90 |
| | 76 | 94 | 83 | 100 | 85 |
| | 76 | 90 | 80 | 66 | 91 |
| | 76 | 84 | 82 | 91 | 85 |
| | 76 | 84 | 72 | 100 | 93 |
| | 75 | 90 | 76 | 100 | 71 |
| | 75 | 81 | 68 | 95 | 62 |
| | 74 | 91 | 81 | 88 | 81 |
| | 74 | 82 | 82 | 100 | 83 |
| | 73 | 92 | 81 | 88 | 93 |
| | 72 | 92 | 78 | 95 | 84 |
| | 69 | 85 | 78 | 84 | 82 |
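
The two headline counts in the abstract (CSTs with at least 95% identity on both chains, and CSTs with a perfect CDR-H3 match) correspond to simple thresholds on the table rows. The following Python sketch shows how such a tally could be made. It assumes each row carries five identity values in the order heavy chain, light chain, heavy chain CDRs, light chain CDRs, CDR-H3. The names `Row` and `parse_row` are hypothetical, and the five sample rows are copied from the table purely for illustration, so the printed counts describe this sample only.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    heavy: int
    light: int
    heavy_cdrs: int
    light_cdrs: int
    cdr_h3: int


def parse_row(line: str) -> Row:
    """Parse one pipe-delimited table row, ignoring empty cells."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    values = [int(c) for c in cells if c]
    if len(values) != 5:
        raise ValueError(f"expected 5 identity values, got {len(values)}")
    return Row(*values)


# Five rows taken from the table above (a sample, not the full table).
sample_lines = [
    "| 98 | 98 | 96 | 100 | 100 |",
    "| 97 | 100 | 90 | 100 | 92 |",
    "| 95 | 92 | 87 | 88 | 81 |",
    "| 94 | 99 | 100 | 100 | 100 |",
    "| 69 | 85 | 78 | 84 | 82 |",
]
rows: List[Row] = [parse_row(line) for line in sample_lines]

# Rows with at least 95% identity on both heavy and light chains.
both_chains = sum(1 for r in rows if r.heavy >= 95 and r.light >= 95)

# Rows with a perfect (100%) CDR-H3 match.
perfect_h3 = sum(1 for r in rows if r.cdr_h3 == 100)

print(f"both chains >= 95%: {both_chains}")
print(f"perfect CDR-H3:     {perfect_h3}")
```

Because `parse_row` discards empty cells, it accepts rows whether or not a blank name cell is present.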
Figure 2. Distribution of sequence identity matches of Clinical Stage Therapeutics (CSTs) to naturally sourced NGS datasets. The violin plots show the distribution of sequence identities of the variable heavy (VH) and light (VL) chains, heavy and light CDR regions and CDR-H3 of CSTs to best matches in OAS.
Figure 3. Sequence identity matches of Clinical Stage Therapeutic (CST) variable regions to naturally sourced NGS datasets, stratified by CST antibody type. (a) Heavy chain and (b) light chain identities of CSTs to NGS sequences in OAS, stratified by fully human, chimeric and humanized antibody types. The three mouse molecules were omitted because the sample is too small.