Dmitry V Bagaev, Renske M A Vroomans, Jerome Samir, Ulrik Stervbo, Cristina Rius, Garry Dolton, Alexander Greenshields-Watson, Meriem Attaf, Evgeny S Egorov, Ivan V Zvyagin, Nina Babel, David K Cole, Andrew J Godkin, Andrew K Sewell, Can Kesmir, Dmitriy M Chudakov, Fabio Luciani, Mikhail Shugay.
Abstract
Here, we report an update of the VDJdb database with a substantial increase in the number of T-cell receptor (TCR) sequences and their cognate antigens. The update further provides a new database infrastructure featuring two additional analysis modes that facilitate database querying and real-world data analysis. The increased yield of TCR specificity identification methods and the overall increase in the number of studies in the field have allowed us to expand the database more than 5-fold. Furthermore, several new analysis methods are included. For example, batch annotation of TCR repertoire sequencing samples allows large datasets to be annotated online. Using recently developed bioinformatic methods for TCR motif mining, we have built a reduced set of high-quality TCR motifs that can be used both for training TCR specificity predictors and for matching against TCRs of interest. These additions enhance the versatility of VDJdb in the task of exploring T-cell antigen specificities. The database is available at https://vdjdb.cdr3.net.
Year: 2020 PMID: 31588507 PMCID: PMC6943061 DOI: 10.1093/nar/gkz874
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Summary statistics of VDJdb records as of the July 2019 database release. Plots show the cumulative number of unique TCR sequences (unique up to V/J gene and CDR3 amino acid sequence), antigens, MHC alleles and publications added to the database, arranged by the publication date of the corresponding papers. Colored lines represent records that have only the TCR alpha (TRA, blue) or beta (TRB, green) chain, or both chains (‘paired’ records, red). Note that because several VDJdb records can link a single TCR sequence to different metadata (e.g. another study and donor, or a distinct epitope in the case of cross-reactivity), the total number of unique TCR sequences (n = 42 211) is less than the total number of VDJdb records (n = 61 049).
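The distinction between records and unique TCR sequences described in the caption can be illustrated with a minimal sketch. It assumes that a TCR is identified by its V gene, J gene and CDR3 amino acid sequence; the record layout, gene names, sequences and epitope labels below are made-up toy values, not actual VDJdb entries or the VDJdb schema.

```python
from collections import defaultdict

# Toy records (hypothetical values): (v_gene, j_gene, cdr3_aa, epitope, reference)
records = [
    ("TRBV1", "TRBJ1", "CASSAAAF", "EPITOPE_1", "study_A"),
    ("TRBV1", "TRBJ1", "CASSAAAF", "EPITOPE_1", "study_B"),  # same TCR, another study
    ("TRBV1", "TRBJ1", "CASSAAAF", "EPITOPE_2", "study_A"),  # same TCR, cross-reactive
    ("TRBV2", "TRBJ2", "CASSGGGF", "EPITOPE_3", "study_C"),
]


def group_by_tcr(records):
    """Group records by the (V gene, J gene, CDR3) key that defines a unique TCR."""
    by_tcr = defaultdict(list)
    for v_gene, j_gene, cdr3, epitope, reference in records:
        by_tcr[(v_gene, j_gene, cdr3)].append((epitope, reference))
    return dict(by_tcr)


unique_tcrs = group_by_tcr(records)
n_records = len(records)
n_unique = len(unique_tcrs)
print(f"{n_records} records, {n_unique} unique TCR sequences")
```

Because one key can carry several metadata entries, the number of unique keys can never exceed the number of records, which is the relationship the caption reports.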
Figure 2. Batch query interface of the VDJdb web browser. (A) TCR sequence annotations provided for an example AIRR-Seq sample with default matching criteria. The sample (HIP02877) represents an individual carrying an HLA-B*35 allele and is taken from the study by Emerson et al. (17). Note the prominent EBV-specific clonal expansion restricted to this allele at the top of the annotations list. (B) Summary statistics charts comparing the HIP02877 sample (‘B35+.txt’, left) with the HIP13994 sample (‘CMV+.txt’, right), which represents a CMV+ individual from the same study.
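The idea of batch annotation can be sketched as a lookup of sample clonotypes against an indexed set of database records. This is only an illustration: it assumes exact matching on V gene, J gene and CDR3 amino acid sequence, which may differ from the default matching criteria of the VDJdb web browser, and all field names and values are hypothetical toy data.

```python
def build_index(database):
    """Index database records by (V gene, J gene, CDR3) for fast exact lookup."""
    index = {}
    for record in database:
        key = (record["v"], record["j"], record["cdr3"])
        index.setdefault(key, []).append(record)
    return index


def annotate(sample, index):
    """Return one annotation per (clonotype, matching record), largest clones first."""
    annotations = []
    for clonotype in sorted(sample, key=lambda c: c["freq"], reverse=True):
        key = (clonotype["v"], clonotype["j"], clonotype["cdr3"])
        for record in index.get(key, []):
            annotations.append({
                "cdr3": clonotype["cdr3"],
                "freq": clonotype["freq"],
                "epitope": record["epitope"],
                "mhc": record["mhc"],
            })
    return annotations


# Toy database and sample (hypothetical values, not real VDJdb or AIRR-Seq data).
database = [
    {"v": "TRBV1", "j": "TRBJ1", "cdr3": "CASSAAAF", "epitope": "EPITOPE_1", "mhc": "MHC_X"},
    {"v": "TRBV2", "j": "TRBJ2", "cdr3": "CASSGGGF", "epitope": "EPITOPE_2", "mhc": "MHC_Y"},
]
sample = [
    {"v": "TRBV2", "j": "TRBJ2", "cdr3": "CASSGGGF", "freq": 0.01},
    {"v": "TRBV1", "j": "TRBJ1", "cdr3": "CASSAAAF", "freq": 0.20},
    {"v": "TRBV3", "j": "TRBJ1", "cdr3": "CASSTTTF", "freq": 0.05},  # no database match
]

annotations = annotate(sample, build_index(database))
for row in annotations:
    print(row)
```

Sorting by clonotype frequency places expanded clones at the top of the annotation list, similar in spirit to the clonal expansion highlighted in panel A.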
Figure 3. VDJdb motif browser interface. (A) Navigation tab showing the tree of available epitope motifs that can be selected to view position weight matrices (PWMs) of CDR3 amino acid sequences. The top plot shows the most abundant TCR beta chain motif for the A*02:GIL Influenza epitope. Motifs normalized for V(D)J rearrangement background are shown. (B) An example CDR3 sequence query with matching amino acids highlighted. The CDR3 sequence (CAEDNNARLMF) of the TCR alpha chain from the 3O4L PDB structure (TCR bound to the A*02:GLC EBV epitope) was used as the query.
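A position-wise view of a CDR3 motif, and the highlighting of matching residues in a query, can be sketched as follows. This is a simplified illustration rather than the method used by the motif browser: it computes plain per-position amino acid frequencies from equal-length sequences and does not apply the V(D)J rearrangement background normalization mentioned in the caption. The sequences, the function names and the `min_freq` threshold are hypothetical.

```python
from collections import Counter


def position_frequencies(cdr3s):
    """Per-position amino acid frequencies for a set of equal-length CDR3 sequences."""
    if len({len(s) for s in cdr3s}) != 1:
        raise ValueError("all CDR3 sequences must have the same length")
    n = len(cdr3s)
    return [
        {aa: count / n for aa, count in Counter(column).items()}
        for column in zip(*cdr3s)
    ]


def matching_positions(query, freqs, min_freq=0.5):
    """Indices where the query residue reaches at least min_freq in the motif."""
    if len(query) != len(freqs):
        raise ValueError("query length must equal motif length")
    return [i for i, aa in enumerate(query) if freqs[i].get(aa, 0.0) >= min_freq]


# Toy cluster of CDR3 sequences (hypothetical values).
cluster = ["CASSA", "CASSG", "CATSA"]
freqs = position_frequencies(cluster)
matches = matching_positions("CASTA", freqs)
print(matches)
```

Positions returned by `matching_positions` correspond to the residues that an interface could highlight in the query sequence.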