Leon Kuchenbecker, Mikalai Nienen, Jochen Hecht, Avidan U Neumann, Nina Babel, Knut Reinert, Peter N Robinson.
Affiliations: Berlin-Brandenburg Center for Regenerative Therapies, Charité Universitätsmedizin, Berlin; Department of Computer Science, Freie Universität, Berlin; Max Planck Institute for Molecular Genetics, Ihnestrasse 63-73, 14195 Berlin, Germany; Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel; Marien Hospital Herne, Ruhr University Bochum, Bochum; and Institute of Medical Genetics and Human Genetics, Charité Universitätsmedizin Berlin, Berlin, Germany.
Abstract
MOTIVATION: Recombined T- and B-cell receptor repertoires are increasingly being studied using next-generation sequencing (NGS) in order to interrogate the repertoire composition as well as changes in the distribution of receptor clones under different physiological and disease states. This type of analysis requires efficient and unambiguous clonotype assignment to a large number of NGS read sequences, including the identification of the incorporated V and J gene segments and the CDR3 sequence. Current tools have deficits with respect to performance, accuracy and documentation of their underlying algorithms and usage.
RESULTS: We present IMSEQ, a method to derive clonotype repertoires from NGS data with sophisticated routines for handling errors stemming from PCR and sequencing artefacts. The application can handle different kinds of input data originating from single- or paired-end sequencing in different configurations and is generic regarding the species and gene of interest. We have carefully evaluated our method with simulated and real-world data and show that IMSEQ is superior to other tools with respect to its clonotyping as well as standalone error correction and runtime performance.
AVAILABILITY AND IMPLEMENTATION: IMSEQ was implemented in C++ using the SeqAn library for efficient sequence analysis. It is freely available under the GPLv2 open-source license and can be downloaded at www.imtools.org.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
CONTACT: lkuchenb@inf.fu-berlin.de or peter.robinson@charite.de.