Torbjørn Rognes, Lonneke Scheffer, Victor Greiff, Geir Kjetil Sandve.
Abstract
MOTIVATION: Adaptive immune receptor (AIR) repertoires (AIRRs) record past immune encounters with exquisite specificity. Therefore, identifying identical or similar AIR sequences across individuals is a key step in AIRR analysis for revealing convergent immune response patterns that may be exploited for diagnostics and therapy. Existing methods for quantifying AIRR overlap scale poorly with increasing dataset numbers and sizes. To address this limitation, we developed CompAIRR, which enables ultra-fast computation of AIRR overlap, based on either exact or approximate sequence matching.
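To make the notion of AIRR overlap concrete, the following is a minimal Python sketch of pairwise overlap counting under exact matching or a bounded number of mismatches. It is not CompAIRR's implementation or interface: the function names (`hamming_within`, `overlap`, `overlap_matrix`), the toy sequences, and the use of Hamming distance on equal-length sequences as the approximate criterion are all assumptions made for illustration. The naive all-against-all comparison in the approximate case is the kind of approach that scales poorly as repertoires grow.

```python
from itertools import combinations


def hamming_within(a: str, b: str, max_mismatches: int) -> bool:
    """Return True if a and b have equal length and differ at no more
    than max_mismatches positions (illustrative matching criterion)."""
    if len(a) != len(b):
        return False
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > max_mismatches:
                return False
    return True


def overlap(rep_a, rep_b, max_mismatches=0):
    """Count matching pairs of distinct sequences, one from each repertoire."""
    set_a, set_b = set(rep_a), set(rep_b)
    if max_mismatches == 0:
        # Exact matching reduces to a set intersection.
        return len(set_a & set_b)
    # Approximate matching: naive all-against-all comparison (quadratic).
    return sum(
        1 for a in set_a for b in set_b
        if hamming_within(a, b, max_mismatches)
    )


def overlap_matrix(repertoires, max_mismatches=0):
    """Compute the overlap for every unordered pair of repertoires."""
    names = sorted(repertoires)
    return {
        (n1, n2): overlap(repertoires[n1], repertoires[n2], max_mismatches)
        for n1, n2 in combinations(names, 2)
    }


# Hypothetical toy repertoires; real AIRRs contain far more sequences.
repertoires = {
    "donor1": ["CARDYW", "CARGGW", "CASSLW"],
    "donor2": ["CARDYW", "CARGAW"],
    "donor3": ["CTTTTW"],
}

exact = overlap_matrix(repertoires)
approx = overlap_matrix(repertoires, max_mismatches=1)
print(exact[("donor1", "donor2")], approx[("donor1", "donor2")])
```

In this toy example, allowing one mismatch adds the pair CARGGW/CARGAW to the single exact match shared by the first two repertoires.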
Year: 2022 PMID: 35852318 PMCID: PMC9438946 DOI: 10.1093/bioinformatics/btac505
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.931
Fig. 1. Overview of CompAIRR features and performance. (a) CompAIRR has configurable AIR matching criteria and output formats. (b) CompAIRR calculates pairwise AIRR overlap up to 1000-fold faster than currently available tools. (c) The maximum RAM usage of CompAIRR is below one-third of the most memory-efficient alternative. (d) The CompAIRR running time increases when allowing more AIR sequence mismatches, but multithreading helps reduce this running time. (b–d) Data shown are mean with error bars showing min/max values across three replicate runs. For the largest dataset, only CompAIRR was run three times, and VDJtools failed to run due to memory limitations. Unless otherwise specified, datasets consist of 1000 AIRRs containing 10^5 OLGA-generated sequences (Sethna et al.) (default human IgH CDR3 model).