Deciphering the language of antibodies using self-supervised learning.

Jinwoo Leem, Laura S Mitchell, James H R Farmery, Justin Barton, Jacob D Galson.

Abstract

An individual's B cell receptor (BCR) repertoire encodes information about past immune responses and potential for future disease protection. Deciphering the information stored in BCR sequence datasets will transform our understanding of disease and enable discovery of novel diagnostics and antibody therapeutics. A key challenge of BCR sequence analysis is the prediction of BCR properties from their amino acid sequence alone. Here, we present an antibody-specific language model, Antibody-specific Bidirectional Encoder Representation from Transformers (AntiBERTa), which provides a contextualized representation of BCR sequences. Following pre-training, we show that AntiBERTa embeddings capture biologically relevant information, generalizable to a range of applications. As a case study, we fine-tune AntiBERTa to predict paratope positions from an antibody sequence, outperforming public tools across multiple metrics. To our knowledge, AntiBERTa is the deepest protein-family-specific language model, providing a rich representation of BCRs. AntiBERTa embeddings are primed for multiple downstream tasks and can improve our understanding of the language of antibodies.
© 2022 The Authors.
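
The self-supervised pre-training mentioned in the abstract can be pictured with a minimal sketch of BERT-style residue masking: some residues are hidden, and the model is trained to recover them from context. This illustrates the general technique only, not the authors' pipeline. The 15% mask rate, the `[MASK]` symbol, the function name `mask_residues`, and the example string are all assumptions made for illustration.

```python
import random

MASK_TOKEN = "[MASK]"  # hypothetical mask symbol, not taken from the paper


def mask_residues(sequence, mask_rate=0.15, seed=0):
    """Return (tokens, targets) for a masked-residue prediction task.

    tokens:  the sequence with a fraction of residues replaced by MASK_TOKEN
    targets: the original residue at each masked position, None elsewhere
    """
    if not sequence:
        return [], []
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    n_masked = max(1, round(len(sequence) * mask_rate))
    masked_positions = set(rng.sample(range(len(sequence)), n_masked))

    tokens, targets = [], []
    for i, residue in enumerate(sequence):
        if i in masked_positions:
            tokens.append(MASK_TOKEN)   # hide the residue from the model
            targets.append(residue)     # the model must predict this residue
        else:
            tokens.append(residue)
            targets.append(None)        # no loss is computed at this position
    return tokens, targets


# Arbitrary amino-acid string used purely for illustration.
example = "EVQLVESGGGLVQPGGSLRLSCAAS"
tokens, targets = mask_residues(example)
```

Fine-tuning for paratope prediction would reuse the same per-residue layout, with a binary binding/non-binding label at every position in place of the masked-residue targets.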

Keywords:  B cell receptors; antibodies; language models; paratope prediction; representation learning; self-supervised learning; transfer learning; transformers

Year:  2022        PMID: 35845836      PMCID: PMC9278498          DOI: 10.1016/j.patter.2022.100513

Source DB:  PubMed          Journal:  Patterns (N Y)        ISSN: 2666-3899

