Pieter Meysman (1,2,3), Nicolas De Neuter (1,2,3), Sofie Gielis (1,2,3), Danh Bui Thi (2,3), Benson Ogunjimi (1,4,5,6), Kris Laukens (1,2,3)
1. Antwerp Unit for Data Analysis and Computation in Immunology and Sequencing (AUDACIS).
2. Department of Computer Science and Mathematics, ADREM Data Lab.
3. Biomedical Informatics Research Network Antwerp (biomina), University of Antwerp, Antwerp, Belgium.
4. Antwerp Center for Translational Immunology and Virology (ACTIV), Vaccine & Infectious Disease Institute (VAXINFECTIO).
5. Centre for Health Economics Research & Modeling Infectious Diseases (CHERMID), Vaccine & Infectious Disease Institute (VAXINFECTIO), University of Antwerp, Wilrijk, Belgium.
6. Department of Pediatrics, Antwerp University Hospital, Edegem, Belgium.
Abstract
MOTIVATION: The T-cell receptor (TCR) is responsible for recognizing epitopes presented on cell surfaces. Linking TCR sequences to their ability to target specific epitopes is currently an unsolved problem, yet one of great interest. In particular, it is unknown how dissimilar two TCR sequences can be before they no longer bind the same epitope. This question is complicated by the fact that there are many ways to define the similarity between two TCR sequences. Here we investigate both issues in the context of unsupervised clustering of TCR sequences.
RESULTS: We provide an overview of the performance of various distance metrics on two large independent datasets with 412 and 2835 TCR sequences, respectively. Our results confirm the presence of structurally distinct TCR groups that target identical epitopes. In addition, we put forward several recommendations for performing unsupervised TCR sequence clustering.
AVAILABILITY AND IMPLEMENTATION: Source code, implemented in Python 3, is available at https://github.com/pmeysman/TCRclusteringPaper.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
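To make the idea of distance-based unsupervised clustering concrete, the following is a minimal Python sketch. It is an illustration only: the abstract does not say which distance metrics or clustering procedure the authors evaluated, so the choice of Levenshtein (edit) distance, the single-linkage threshold grouping, the function names `levenshtein` and `cluster_by_threshold`, the `max_distance` parameter, and the toy sequences are all assumptions and are not taken from the paper or its repository.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def cluster_by_threshold(sequences, max_distance=1):
    """Group sequences whose pairwise distance is <= max_distance.

    Clusters are connected components (single linkage), found with union-find.
    The threshold is a hypothetical parameter chosen for illustration.
    """
    parent = list(range(len(sequences)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(sequences)), 2):
        if levenshtein(sequences[i], sequences[j]) <= max_distance:
            parent[find(i)] = find(j)

    clusters = {}
    for index, sequence in enumerate(sequences):
        clusters.setdefault(find(index), []).append(sequence)
    return list(clusters.values())


# Made-up CDR3-like strings, used only to exercise the functions.
toy_sequences = ["CASSLGQAYEQYF", "CASSLGQGYEQYF", "CASSIRSSYEQYF"]
toy_clusters = cluster_by_threshold(toy_sequences, max_distance=1)
```

With these toy inputs the first two strings differ by a single substitution and fall into one cluster, while the third differs from both by more than one edit and stays on its own. Swapping `levenshtein` for another distance function is the natural way to compare metrics under the same grouping rule.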
Authors: Koshlan Mayer-Blackwell; Stefan Schattgen; Liel Cohen-Lavi; Jeremy C Crawford; Aisha Souquette; Jessica A Gaevert; Tomer Hertz; Paul G Thomas; Philip Bradley; Andrew Fiore-Gartland Journal: Elife Date: 2021-11-30 Impact factor: 8.140
Authors: Marion Pardons; Linos Vandekerckhove; Basiel Cole; Laurens Lambrechts; Pierre Gantner; Ytse Noppe; Noah Bonine; Wojciech Witkowski; Lennie Chen; Sarah Palmer; James I Mullins; Nicolas Chomont Journal: Nat Commun Date: 2021-06-17 Impact factor: 14.919
Authors: Dmitry V Bagaev; Renske M A Vroomans; Jerome Samir; Ulrik Stervbo; Cristina Rius; Garry Dolton; Alexander Greenshields-Watson; Meriem Attaf; Evgeny S Egorov; Ivan V Zvyagin; Nina Babel; David K Cole; Andrew J Godkin; Andrew K Sewell; Can Kesmir; Dmitriy M Chudakov; Fabio Luciani; Mikhail Shugay Journal: Nucleic Acids Res Date: 2020-01-08 Impact factor: 16.971