The splenic T cell receptor repertoire during an immune response against a complex antigen: Expanding private clones accumulate in the high and low copy number region.

Martin Meinhardt¹, Cornelia Tune¹, Lisa-Kristin Schierloh¹, Andrea Schampel¹, René Pagel¹, Jürgen Westermann¹.

Abstract

Large cellular antigens comprise a variety of different epitopes leading to a T cell response of extreme diversity. Therefore, tracking such a response by next generation sequencing of the T cell receptor (TCR) in order to identify common TCR properties among the expanding T cells represents an enormous challenge. In the present study we adapted a set of established indices to elucidate alterations in the TCR repertoire regarding sequence similarities between TCRs including VJ segment usage and diversity of nucleotide coding of a single TCR. We combined the usage of these indices with a new systematic splitting strategy regarding the copy number of the extracted clones to divide the repertoire into multiple fractions for separate analysis. We implemented this new analytic approach using the splenic TCR repertoire following immunization with sheep red blood cells (SRBC) in mice. As expected, early after immunization presumably antigen-specific clones accumulated in high copy number fractions, but at later time points similar accumulation of specific clones occurred within the repertoire fractions of lowest copy number. For both repertoire regions immunized animals could reliably be distinguished from control in a classification approach, demonstrating the robustness of the two effects at the individual level. The direction in which the indices shifted after immunization revealed that for both the early and the late effect alterations in repertoire parameters were caused by antigen-specific private clones displacing non-specific public clones. Taken together, tracking antigen-specific clones by their displacement of average TCR repertoire characteristics in standardized repertoire fractions ensures that our analytical approach is fairly independent from the antigen in question and thus allows the in-depth characterization of a variety of immune responses.

Year: 2022 PMID: 36001559 PMCID: PMC9401120 DOI: 10.1371/journal.pone.0273264

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752

Introduction

T cells are key players of the adaptive immune system. They are involved in directly combating pathogens as well as in coordinating other parts of the immune system, including the B cell response [1, 2]. Despite the different functions of the T cell subpopulations, the T cell receptor (TCR) is the common tool that enables them to recognize peptides presented by other cells. In humans and mice ∼95% of circulating T cells express the TCR variant that consists of a heterodimer of an α- and a β-chain [3, 4]. The immense variety of pathogens threatening the organism creates the need for a receptor repertoire of high diversity, which is achieved by a genetic mechanism called somatic recombination. This process includes V(D)J segment recombination as well as random nucleotide insertion and deletion at the junction sites. Each chain contains three hypervariable areas termed complementarity-determining regions (CDR1–3). The CDR3 region spans the junction sites of the V(D)J segments and is the site of highest variability [3]. In addition, it is the prime site of antigen interaction and thus the main determinant of the specificity of each T cell receptor/clone. In mice, the number of different TCRs that can theoretically be generated is estimated at ∼10^15 [5, 6], of which only ∼2 · 10^6 are realized at a given time point [5]. T cells that display CDR3 sequences commonly found in many individuals are termed “public” clones [7], while so-called “private” clones hold CDR3 sequences realized only in a few individuals. In case of antigen exposure, the number of reacting T cells is extremely low compared to the total number of T cells (between 0.01% and 0.1% [8]), even for huge cellular antigens like bacteria. This renders their detection within the T cell receptor repertoire (TCR-R) determined by next generation sequencing a challenging biostatistical task.
A common strategy is to focus on the T cell clones with the highest copy number (CN), assuming that reacting and thus expanding clones exceed the CN of non-reacting ones (see e.g. [9-11]). For example, 3 days after immunization with sheep red blood cells (SRBC) a clear effect was observed among the splenic high CN (top 100) clones that vanished within one day [12, 13], although T cell proliferation in this large antigen model continues well beyond this time point [13, 14]. This demonstrates the need for additional analytical strategies that can detect the effects induced by ongoing proliferation of presumably highly diverse T cell clones within the TCR-R. In addition, even after expansion, less frequent reacting clones might not reach CN levels higher than those of more frequent non-reacting ones. In line with this hypothesis, a classification approach after immunization with ovalbumin failed to discriminate between immunized and control animals based on the clones with the highest CN but was successful using clones detected with a single copy only [10]. Therefore, the aim of the present study was to improve the in-depth analysis of an immune response against a huge cellular antigen at the clonal level, for which we used the above-mentioned splenic TCR-R after immunization with SRBC [12-15]. We applied and adapted established indices that capture repertoire characteristics between and within animals, exploiting both the amino acid sequence and the nucleotide coding level. In combination with these indices, we introduced a systematic fractioning strategy of the TCR-R that allowed the identification of those CN fractions in which antigen-specific clones accumulate. Our results showed that the analytical approach presented here significantly improved the in-depth analysis of TCR-Rs harboring highly diverse immune reactions. We identified two separate regions of the splenic TCR-R where specific private clones accumulate at different time points following immunization with SRBC.

Results

Specific adaptions of the Simpson index reveal diversification of the TCR-R accompanied by decreasing diversity of nucleotide sequences which code for a given clonotype, for up to 7 days after immunization

To analyze the TCR-R during an immune response in the spleen we split the repertoire into three fractions of clonotypes according to their CN. Throughout this study the term ‘clonotype’ refers to a set of T cell clones with identical CDR3 amino acid sequence. To each of these clone sets exactly one V- and J-segment was assigned (see Material and methods). The three fractions were defined as CNlow (CN = 2), CNmed (2 < CN ≤ 500) and CNhigh (CN > 500). This tripartition ensures that the CNhigh fraction contains on average ∼100 clonotypes, so this fraction is comparable to the top 100 clonotypes/clones of highest copy number which are analyzed separately in many studies [10, 12, 15]. Since we excluded sequences with CN = 1 from our analyses (see Material and methods), CN = 2 represents the lowest possible CN group. We compared PBS-injected control mice to SRBC-immunized mice 3 days (3d), 4d and 7d after immunization and found an increase in clonotype number and mean CDR3β sequence length, which was restricted to the 3d time point and the CNhigh fraction (Fig 1A and 1B). To assess possible expansion-induced shifts in more complex and more sensitive repertoire parameters, we converted the Simpson Index into a generalized version (see Material and methods for the mathematical procedure) that quantifies the appearance of similar clonotypes within the repertoire of an animal and was therefore termed Repertoire Homogeneity Index (RHI). The advantage of the adapted version is that similarity can be defined in terms of a variety of parameters depending on the research question. Here we used the Levenshtein distance to capture the homogeneity of CDR3 sequences (RHILD; two clonotypes are defined as similar if the Levenshtein distance of their CDR3β regions is at most 1; note that a Levenshtein distance of zero implies equality of the two clonotypes).
To capture the homogeneity of gene usage we considered the V- and J-segments assigned to the clonotypes (RHIVJ; two clonotypes are defined as similar if the same V- and J-segments were assigned). Thus, for both RHILD and RHIVJ an increase in value reflects a homogenization of the TCR-R.
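
The exact formula of the RHI is given in the Material and methods section, which is not reproduced here. Purely as an illustration, the following sketch assumes one plausible reading of a generalized Simpson Index: the probability that two clonotypes drawn with replacement from one repertoire are similar under a chosen criterion. The function names (`levenshtein`, `rhi`, `similar_ld`, `similar_vj`) and the toy clonotypes are hypothetical, and the authors' weighting or normalization may differ.

```python
from itertools import product


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute
        prev = curr
    return prev[-1]


def rhi(clonotypes, similar):
    """ASSUMED form: share of ordered clonotype pairs (with replacement)
    that are similar. With an equality criterion on a category, this
    reduces to the Simpson Index of that category."""
    n = len(clonotypes)
    hits = sum(similar(x, y) for x, y in product(clonotypes, repeat=2))
    return hits / (n * n)


def similar_ld(x, y):
    # RHILD criterion from the text: Levenshtein distance of CDR3 at most 1
    return levenshtein(x["cdr3"], y["cdr3"]) <= 1


def similar_vj(x, y):
    # RHIVJ criterion from the text: same V- and J-segment assigned
    return (x["v"], x["j"]) == (y["v"], y["j"])


# Made-up toy repertoire (sequences and segment labels are illustrative only)
toy = [
    {"cdr3": "CASSLGTEAFF", "v": "V13", "j": "J1"},
    {"cdr3": "CASSLGAEAFF", "v": "V13", "j": "J1"},
    {"cdr3": "CASGDWGYEQYF", "v": "V19", "j": "J2"},
]
rhi_ld = rhi(toy, similar_ld)
rhi_vj = rhi(toy, similar_vj)
```

Under this assumed definition a higher value means more similar pairs, which matches the statement that an increase reflects homogenization. The all-pairs loop is quadratic in the number of clonotypes, so a fraction with tens of thousands of clonotypes would need a faster neighbor search than this sketch provides.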

Fig 1

Immunization with SRBC induces consistent parameter shifts in two distinct copy number fractions of the T cell receptor repertoire.

The data are presented either as total T cell receptor repertoire (containing all clonotypes) or divided into three clonotype fractions according to their copy number: clonotypes of low (CN = 2), intermediate (2 < CN ≤ 500) and high (CN > 500) copy number. The total T cell receptor repertoire and each copy number fraction were compared 3, 4 and 7 days (d) after injection of SRBC (n = 10 per time point) with control (PBS-injected) animals (n = 20). (A) Number of clonotypes, (B) Mean sequence length of the CDR3β region, (C) Repertoire Homogeneity Index (RHI) capturing the homogeneity of CDR3β sequences within animals (two clonotypes defined as similar if the Levenshtein distance of the CDR3β regions is at most one), (D) RHI capturing homogeneity of gene usage (two clonotypes defined as similar if the same V- and J-segments were assigned), (E) Coding Diversity Index (CDI) capturing the diversity of nucleotide coding of the CDR3β amino acid sequence of each clonotype, and (F) Jaccard Index measuring the clonal overlap of CDR3β sequences from different animals. Boxplots display median, interquartile range and minima/maxima. For (A)-(E) immunized repertoires were tested for deviations from control repertoires using the Mann-Whitney-U-test. p-values are displayed as * p < 0.05, ** p < 0.01, *** p < 0.001. Correction for multiple testing was performed using Holm’s method. For (F) apparent immunization effects are highlighted by arrows (see Material and methods).

Due to the degeneracy of the genetic code each clonotype can be encoded by different nucleotide sequences (i.e. consist of several actual T cell clones). To elucidate whether immunization with SRBC also induces a shift at this level of the repertoire we used an alternative adaptation of the Simpson Index to measure how diverse the nucleotide coding of each clonotype is (see Material and methods for mathematical details).
We incorporated not only the number of nucleotide sequences coding for the respective clonotype but also their proportion of sequence reads, such that a value close to 1 indicates rather balanced coding by multiple nucleotide sequences and a value close to 0 reflects predominant coding by a single nucleotide sequence. Accordingly, we named the subsequently calculated average over all clonotypes per animal the Coding Diversity Index (CDI), for which an increase in value indicates a diversification of clonotype coding. The significant decrease of RHILD at 3d after immunization in the CNhigh fraction (Fig 1C) showed that the increase of clonotype number at this time point (Fig 1A) is associated with a diversification of immunized repertoires compared to control. Application of the RHIVJ (Fig 1D) revealed a diversification of gene usage within the CNhigh fraction that lasted until 4d, and a second diversification effect that became clearly visible in the CNlow fraction 7d after immunization. In parallel, the CDI showed that these temporally separated diversification effects within the CNhigh and CNlow fractions of the TCR-R were accompanied by a decreasing diversity of clonotype coding (Fig 1E), i.e. most of the expanding clonotypes are coded by few dominant nucleotide sequences. Taken together, basic parameters such as CN and CDR3β sequence length identified immunization effects within the TCR-R only 3d after immunization with SRBC and only within the CNhigh fraction [12, 13, 15]. However, applying the newly created RHILD, RHIVJ and CDI made it possible to detect immune response-induced alterations within the TCR-R also among clonotypes of lowest CN and to trace these effects throughout all time points. Both effects showed a diversification of repertoires (indicated by a decrease of RHI) that was accompanied by a decreasing diversity of clonotype coding (indicated by a decrease of CDI).
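
The exact definition of the CDI is part of the Material and methods section and is not shown here. The sketch below is therefore only an assumption: it uses the Gini-Simpson form 1 − Σp² over the read proportions of the nucleotide sequences that encode one clonotype, and then averages over all clonotypes of an animal. The names `coding_diversity` and `cdi` and the toy read counts are hypothetical.

```python
def coding_diversity(read_counts):
    """ASSUMED per-clonotype measure: 1 - sum(p_i^2), where p_i is the read
    proportion of the i-th nucleotide sequence encoding the clonotype.
    0 means a single coding sequence; values rise with balanced coding."""
    total = sum(read_counts)
    return 1.0 - sum((c / total) ** 2 for c in read_counts)


def cdi(reads_per_clonotype):
    """Average the per-clonotype values over all clonotypes of one animal."""
    values = [coding_diversity(c) for c in reads_per_clonotype.values()]
    return sum(values) / len(values)


# Made-up read counts: one list entry per distinct nucleotide sequence
toy = {
    "CASSLGTEAFF": [50, 50],   # two coding sequences, balanced
    "CASGDWGYEQYF": [100],     # a single coding sequence
    "CASSPGQNTLYF": [97, 3],   # two coding sequences, one dominant
}
balanced = coding_diversity(toy["CASSLGTEAFF"])   # 0.5
single = coding_diversity(toy["CASGDWGYEQYF"])    # 0.0
animal_cdi = cdi(toy)
```

Note that this unnormalized form cannot exceed 1 − 1/k for k coding sequences, so it only approaches 1 when many sequences contribute evenly. The authors' adaptation may be normalized differently.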
To elucidate whether these alterations within animals also lead to changes between animals, and thus to repertoire shifts at the population level, we applied the Jaccard Index and found it reduced both at 3d after immunization within the CNhigh fraction and at 7d within the CNlow fraction (Fig 1F). This shows that in each animal different clonotypes reacted to the SRBC antigens, which indicates the private nature of this immune response [7, 12, 15].
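
For reference, the Jaccard Index of two clonotype sets is the size of their intersection divided by the size of their union. A minimal sketch follows, assuming each animal's fraction is available as a set of CDR3β amino acid sequences; the helper names and the toy sets are hypothetical, and how the study aggregates the pairwise values is not specified here.

```python
from itertools import combinations


def jaccard(a, b):
    """Clonal overlap of two clonotype collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(repertoires):
    """Jaccard Index for every pair of animals in a group."""
    return {
        (m, n): jaccard(repertoires[m], repertoires[n])
        for m, n in combinations(sorted(repertoires), 2)
    }


# Made-up clonotype sets for three animals
group = {
    "mouse_1": {"CASSLGTEAFF", "CASGDWGYEQYF", "CASSPGQNTLYF"},
    "mouse_2": {"CASGDWGYEQYF", "CASSPGQNTLYF", "CASSDRGNTEVFF"},
    "mouse_3": {"CASSQDRANTEVFF"},
}
overlaps = pairwise_jaccard(group)
```

A response carried by private clonotypes adds different sequences to each animal's set, which enlarges the union without enlarging the intersection and so lowers the index.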

Systematic fractioning of the TCR-R by clonotype copy number reveals the full extent of two clearly separated repertoire regions affected by immunization

The significant decrease of RHIVJ within the CNmed fraction 7d after immunization (Fig 1D) raised the question of how far the effect detected by all indices in the CNlow fraction extended from the bottom into the CNmed fraction. Similarly, the early effects within the CNhigh fraction might extend into the CNmed fraction from the top, given the huge difference in the number of clonotypes in the two fractions (on average about 100 within CNhigh and about 60,000 within CNmed; Fig 1A). Therefore, we aimed to refine the previously arbitrary fractioning by further, systematic splitting of the intermediate fraction such that: i) the number of fractions is kept as low as possible to avoid the loss of true effects due to correction for multiple comparisons but high enough for a precise mapping of immunization effects to distinct fractions, and ii) the number of clonotypes in each fraction is high enough to yield statistically valid results. While these criteria ruled out a linear approach, we found that fractioning the total repertoire by CN based on the logarithm to the base 2 fulfilled these requirements. This resulted in 10 fractions, with the first fraction, log2(CN) = 1 (i.e. CN = 2), equaling the CNlow fraction and the last, log2(CN) > 9 (i.e. CN > 512), corresponding to the CNhigh fraction, while the ∼60,000 clonotypes of the intermediate fraction were now distributed over 8 fractions. The number of clonotypes per fraction in a single data set ranged from a maximum of about 21,000 to a minimum of 45 (Fig 2A).
While basic parameters such as the number of clonotypes (Fig 2A) and mean sequence length (S1A Fig) remained unaltered in all but the fraction of highest CN 3d after immunization (the mean sequence length was also altered at 4d within the second highest CN fraction), the refined fractioning already paid off for the Jaccard Index: in addition to the known decreases in CNhigh at 3d and CNlow at 7d, obvious decreases appeared in the two fractions from 2^6 to 2^9 at both 3d and 4d after immunization (S1B Fig).
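
The logarithmic fractioning described above can be sketched as follows. Fraction k (k = 1 to 9) collects clonotypes with k − 1 < log2(CN) ≤ k, so fraction 1 holds CN = 2, and fraction 10 collects everything with log2(CN) > 9. The function names and the toy copy numbers are hypothetical; integer bit arithmetic replaces a floating-point logarithm to avoid rounding problems at exact powers of two.

```python
from collections import defaultdict

N_FRACTIONS = 10  # number of fractions stated in the text


def fraction_index(cn: int) -> int:
    """Map a copy number to its fraction: ceil(log2(CN)), capped at 10."""
    if cn < 2:
        raise ValueError("sequences with CN = 1 were excluded from the analysis")
    # For cn >= 2, (cn - 1).bit_length() equals ceil(log2(cn)) exactly
    return min((cn - 1).bit_length(), N_FRACTIONS)


def split_repertoire(copy_numbers):
    """Group clonotypes (key: CDR3 sequence, value: CN) by fraction index."""
    fractions = defaultdict(list)
    for clonotype, cn in copy_numbers.items():
        fractions[fraction_index(cn)].append(clonotype)
    return dict(fractions)


# Made-up copy numbers
toy = {"CASSLGTEAFF": 2, "CASGDWGYEQYF": 4, "CASSPGQNTLYF": 65, "CASSDRGNTEVFF": 900}
by_fraction = split_repertoire(toy)
```

Each fraction spans twice the CN range of the one below it, which is how eight fractions can cover the intermediate range from CN = 3 to CN = 512 while each still contains enough clonotypes for testing.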

Fig 2

Logarithmic partitioning leads to repertoire fractions of appropriate size for separate analyses that reveal true extent and separation of two immunization-induced effects within the repertoire.

Total repertoires were split into 10 fractions according to their copy number based on the logarithm to the base 2. (A) Number of extracted clonotypes per fraction. Bars and whiskers display means and standard deviations. Repertoires of animals 3, 4 and 7 days (d) after immunization with SRBC (n = 10 each) were tested for deviations from the control (PBS-injected) repertoires (n = 20) using the Mann-Whitney-U-test. p-values are displayed as * p < 0.05. Correction for multiple testing was performed using Holm’s method. (B) For each fraction and time point immunized samples were compared to control samples for three parameters: the Repertoire Homogeneity Index (RHI) assessing repertoire homogeneity concerning either i) CDR3β sequence similarity measured by the Levenshtein distance (RHILD) or ii) VJ segment usage (RHIVJ) as well as iii) the Coding Diversity Index (CDI) assessing heterogeneity of clonotype coding. p-values refer to a one-tailed Mann-Whitney-U-test for a significant decrease of the respective index. The 10 p-values of each row were independently corrected for multiple testing using Holm’s method.

Subsequently, we compared RHILD, RHIVJ and CDI of the immunized animals for each time point and fraction to those of control animals, visualizing the resulting p-values in a heat map for easy comparison (Fig 2B).
Our results showed that fractioning the total repertoire by CN based on the logarithm to the base 2, paired with application of the adapted indices, allowed an in-depth analysis of the splenic TCR-R during an SRBC-induced immune response that revealed three main findings (Fig 2B): First, the V- and J-segment usage measured by RHIVJ harbored the greatest discriminatory power by detecting immunization effects among clonotypes of high and medium CN until 7d after immunization as well as among clonotypes of low CN already 4d after immunization, while the CDI displayed a superior discriminatory power for the immunization effect within the two fractions of lowest CN 7d after immunization. Second, 7d after immunization none of the indices reached significance within the fraction of highest CN, while the immunization effect was still prominent several fractions below (6 < log2(CN) ≤ 9), at least for RHIVJ. This remained true even if the p-values were not corrected for multiple testing, demonstrating that significant effects were not hidden by statistical correction steps (S1C Fig). Third, the early effect among clonotypes of medium to high CN and the late effect within fractions of lowest CN were separated by a clear gap of at least two fractions (4 < log2(CN) ≤ 6, about 19,000 clones in total and thus nearly a quarter of the whole repertoire) that did not display alterations in any of the repertoire parameters, even without correction for multiple testing (S1C Fig). Thus, systematic fractioning of the splenic TCR-R by CN clearly revealed two discrete sites of diversification at the animal level, one occurring early during the immune response within the high CN region and one occurring late within the low CN region, both being accompanied by a homogenization of nucleotide coding at the clonotype level.
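
The row-wise correction used for the heat map can be illustrated with a short sketch of Holm's step-down procedure, applied to the 10 p-values of one row (one index at one time point). The function name `holm_adjust` and the example p-values are hypothetical; the p-values themselves would come from the one-tailed Mann-Whitney U tests, which are not re-implemented here.

```python
def holm_adjust(p_values):
    """Holm step-down adjustment.

    The i-th smallest p-value (i = 0, 1, ...) is multiplied by (m - i),
    capped at 1, and adjusted values are forced to be non-decreasing in
    the order of the raw p-values. Results are returned in input order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Made-up raw p-values for three fractions of one row
raw = [0.01, 0.04, 0.03]
corrected = holm_adjust(raw)  # approximately [0.03, 0.06, 0.06]
```

Because each row is corrected independently over 10 fractions, the smallest raw p-value in a row is multiplied by 10. This is the cost that motivates keeping the number of fractions as low as possible.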

Classification becomes successful when performed separately on the high and low copy number sub-repertoires that display immunization-induced effects

Our results revealed a variety of significant alterations of repertoire characteristics ranging from the animal level (clonotype similarities measured by the RHI) to the clonotype level (homogenization of nucleotide coding measured by the CDI) when the group of control animals was compared to that of immunized animals. To investigate whether these effects were pronounced enough to be demonstrated within individual animals, we constructed a simple classification tool. In brief, we used a generalized version of the Morisita-Horn Index [16, 17] to define a measure of dissimilarity between two TCR-Rs. Subsequently, the data sets were arranged in clusters via K-medoid clustering [18, 19]. A slight modification leads to a supervised classification procedure (see Material and methods). We analyzed whether this algorithm can reliably distinguish the TCR-Rs of immunized animals from those of naïve animals. We hypothesized that focusing on those repertoire regions in which reacting clones accumulate would lead to a successful classification. To define the two relevant sub-repertoires, we used those fractions that reached significance either at two or more time points or for two of the three indices, leading to Xtop: log2(CN) > 6, i.e. CN > 64, and Xbottom: log2(CN) ≤ 2, i.e. CN ≤ 4. With VJ segment usage as criterion, unsupervised classification of Xtop arranged the sub-repertoire of high CN in a cluster structure where one cluster is clearly dominated by control (PBS) and the other by immunized samples (SRBC), indicated by triangles and circles, respectively (Fig 3A). While most overlap between the clusters was caused by mice of the control and 3d SRBC group, all but one animal each of the 4d and 7d group were assigned to the same cluster. This was confirmed by the supervised approach that classified 18 out of 20 control mice and 26 out of 30 immunized mice correctly (Fig 3B), which differed significantly from random labelling (p < 6 · 10^−8).
However, applying the nucleotide coding criterion to the Xtop sub-repertoire neither resulted in obvious clustering (Fig 3C) nor significantly outstripped random classification (Fig 3D, p = 0.377). In contrast, when applying cluster analysis to the bottom sub-repertoire, VJ segment usage as criterion failed to distinguish between the experimental groups, with only the 7d group slightly separating from the remaining samples (Fig 3E). For supervised classification we even had to remove the 3d and 4d samples to avoid disrupting the algorithm, since at these time points only marginal immunization effects were detectable in the respective fractions (Fig 2B). Subsequently, supervised classification distinguished animals of the 7d group from control samples with moderate accuracy (Fig 3F, p < 2 · 10^−4). Analogous to the highly significant differences in CDI (see Fig 2B), the nucleotide coding criterion for Xbottom, on the other hand, led to one cluster that exactly coincided with the 7d group, whereas the other cluster contained the remaining data sets (Fig 3G). Correspondingly, the supervised approach led to a perfect classification (Fig 3H; random effects: p < 4 · 10^−8). It should be mentioned that a distinct improvement of the classification based on nucleotide coding can be achieved if the definition of Xtop is modified such that only clonotypes with log2(CN) > 8 are included (i.e. the fractions where CDI is affected significantly). In particular, the 3d and 4d samples can then be distinguished from naïve samples with satisfying accuracy (data not shown). When total repertoires were analyzed without weighting for copy number, both cluster analysis and supervised classification approaches based on VJ segment usage failed to distinguish any of the immunized groups from naïve animals (S2A and S2B Fig). If the nucleotide criterion is applied, only the samples of the 7d group are slightly separated from the remaining samples, displaying a pattern similar to that for Xbottom.
This can easily be explained by the large quantity of clonotypes in these fractions. In particular, the 3d and 4d data were indistinguishable from the naïve data (data not shown). Thus, focusing on those repertoire regions in which the immune response was localized using the fractioning approach allowed for classification of individual TCR-Rs with satisfying accuracy.
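
The clustering step can be illustrated with a simplified sketch. It assumes the standard (not the generalized) Morisita-Horn Index computed on VJ usage counts, converted to a dissimilarity as 1 minus the index, followed by a naive alternating K-medoid procedure with a fixed start. The authors' generalized index, their K-medoid implementation [18, 19] and the supervised modification are described in Material and methods and may differ. All names and the toy counts are hypothetical.

```python
def morisita_horn_dissimilarity(x, y):
    """1 - Morisita-Horn similarity of two count profiles (dicts).
    Written with proportions p, q: similarity = 2*sum(p*q) / (sum(p^2) + sum(q^2))."""
    keys = set(x) | set(y)
    tx, ty = sum(x.values()), sum(y.values())
    p = {k: x.get(k, 0) / tx for k in keys}
    q = {k: y.get(k, 0) / ty for k in keys}
    num = 2 * sum(p[k] * q[k] for k in keys)
    den = sum(v * v for v in p.values()) + sum(v * v for v in q.values())
    return 1.0 - num / den


def assign(dist, medoids):
    """Label every sample with the index of its closest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]


def k_medoids(dist, k=2, max_iter=100):
    """Naive alternating K-medoids on a precomputed dissimilarity matrix."""
    medoids = list(range(k))  # deterministic start: the first k samples
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster runs empty
                new_medoids.append(medoids[c])
                continue
            # new medoid: member with the smallest summed distance to its cluster
            new_medoids.append(min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(dist, medoids), medoids


# Made-up VJ usage counts: two control-like and two immunized-like profiles
samples = [
    {"VJ_a": 80, "VJ_b": 20},
    {"VJ_a": 75, "VJ_b": 25},
    {"VJ_a": 30, "VJ_b": 30, "VJ_c": 40},
    {"VJ_a": 25, "VJ_b": 35, "VJ_c": 40},
]
dist = [[morisita_horn_dissimilarity(a, b) for b in samples] for a in samples]
labels, medoids = k_medoids(dist, k=2)
```

In this toy example the two control-like profiles end up in one cluster and the two immunized-like profiles in the other. With real data the outcome depends on the sub-repertoire chosen, which is the point the section makes about Xtop and Xbottom.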

Fig 3

Sub-repertoires can reliably be classified with respect to the immunization status, depending on the classification criterion.

(A)-(D) The high copy number sub-repertoire Xtop was defined as clonotypes with copy number > 64. (A) A dissimilarity matrix was calculated based on VJ segment usage (see Material and methods), by which these sub-repertoires were divided into two clusters (immunized vs. control) using the K-medoid algorithm. The result was visualized via metric multidimensional scaling, with dissimilarities approximated by the distances of points in the scatterplot and the clusters defined by the algorithm displayed as circles and triangles, respectively. (B) The same dissimilarity matrix was used for a supervised classification that in total assigned 44 of the 50 data sets correctly. The number on top of each bar denotes the percentage of correctly classified samples for each of the 4 experimental groups. (C)-(D) The Xtop sub-repertoire data were classified based on the variability of clonotype nucleotide coding, which fails to distinguish any of the experimental groups from the others. In total, 22 of the 50 data sets were classified correctly, which does not significantly outstrip random labeling (p = 0.377). (E)-(H) The low copy number sub-repertoire Xbottom was defined as clonotypes with copy number ≤ 4 and subjected to the same classification approaches as the Xtop sub-repertoires, with the exception that for supervised classification the control data sets were compared to immunized samples of the SRBC group 7 days (d) after immunization only. Although VJ usage as criterion did not result in obvious clustering (E), the supervised approach (F) significantly outstripped random classification (p < 2 · 10^−4). The nucleotide coding criterion led to a well separated cluster formed by the 7d SRBC group (G) that could be classified with perfect accuracy (H). In contrast, the Xbottom sub-repertoires of animals 3d and 4d after immunization were indistinguishable from the corresponding sub-repertoires of control animals.

Expanding public clonotypes distribute throughout the repertoire fractions, but accumulate in the fraction of highest copy number after immunization with SRBC

The SRBC-specific effects described so far were mainly caused by an immune response involving private clonotypes, as demonstrated previously [12, 15] and in the present study by a decrease of the Jaccard Index (Fig 1F and S1B Fig). However, in these previous studies we were also able to identify a small set of clonotypes that was present in the majority of SRBC-immunized animals and significantly expanded compared to control [12, 13]. The expanding public clonotypes were identified via differential gene expression analysis using the R-package edgeR [20] (see Material and methods). Although the specificity of the expanding clonotypes was not tested explicitly, the observed enrichment suggests that they react to an epitope derived from the injected SRBC, so they can be regarded as a public component of the SRBC-induced T cell response. Here, we asked how these clonotypes distribute within the different CN fractions of the TCR-R over time. Within the control group the presumably SRBC-specific clonotypes displayed a bell-shaped distribution over the ten repertoire fractions defined above, with the highest number of clonotypes found in the 5 < log2(CN) ≤ 6 fraction (Fig 4). In contrast, considering the total distribution of all clonotypes, the highest number of clonotypes was found in the 2 < log2(CN) ≤ 4 fractions (see Fig 2A). After immunization with SRBC the expansion effects led to an accumulation of presumably specific public clonotypes within the high CN fraction. This prominent effect persisted until 7d post immunization (p.i.) (Fig 4C). Thus, the immunization effects among public and private clones (the latter are summarized in Fig 2B and S1C Fig) match at 3d and 4d, with a similar accumulation of clonotype numbers in the high CN fraction. In contrast, the immunization effect of the private component failed to reach significance within the high CN fraction 7d p.i. (see S1C Fig).
Apart from the accumulation in the high CN fraction, the immunization resulted in significant differences between the control group and immunized mice in a few separate fractions at 3d and 4d. At all time points more than a quarter of the expanding public clonotypes were located in the 4 < log2(CN) ≤ 6 fractions, where no significant ‘private activity’ was detected at any of the time points investigated (see S1C Fig). In contrast to the predominant private component of the reaction, there is no accumulation of expanding public clonotypes in the low CN fraction. This effect can be explained by the fact that public clonotypes often descend from various progenitor cells whose descendants cannot be distinguished in our experimental setting (see Discussion).

Fig 4

Expanding public clonotypes distribute throughout the repertoire and accumulate in the fraction of highest copy number after immunization.

Forty clonotypes detected in the majority of immunized animals and found significantly expanded after SRBC injection were defined as (presumably) SRBC-specific public clonotypes. Their distribution throughout the 10 fractions defined by copy number based on the logarithm to the base 2 in PBS-injected control animals is indicated by grey bars and compared to that in animals 3 days (d) (A), 4d (B) and 7d (C) after immunization with SRBC. Bar plots show the number of SRBC-specific clonotypes per fraction for control (n = 20) and immunized (n = 10 for each time point) mice. Bars and whiskers display means and standard deviations. For each time point control and immunized animals were compared using the Mann-Whitney-U-test with resulting p-values displayed as * p < 0.05, *** p < 0.001. p-values were corrected for multiple testing using Holm’s method.

Discussion

Tracing highly diverse antigen-specific T cells during an immune response at the TCR-R level

The aim of this study was to analyze an immune response where the antigen-specific TCR-R is too heterogeneous to be identified by common approaches searching for e.g. clusters of CDR3 similarity. Via specific adaptations of established indices we monitored immunization-induced shifts of repertoire characteristics within different aspects of the TCR-R, first by measuring repertoire homogeneity concerning CDR3β sequence similarity (RHILD) and VJ segment usage (RHIVJ) and second by depicting nucleotide coding diversity (CDI). While RHIVJ was sensitive enough to detect the presumably minuscule shifts within the whole repertoire, an arbitrary separation into three repertoire fractions clearly revealed shifts for RHIVJ, RHILD and CDI within both the low and high CN fractions. However, only systematic fractioning of the TCR-R by CN based on the logarithm to the base 2, resulting in 10 fractions, revealed the full extent and detailed time course of the two immune response-induced parameter shifts, which allowed us to demonstrate two important aspects: First, early immunization-induced effects at 3d and 4d could be located not only among clonotypes of highest CN but also within the upper four CN fractions (on average matching the top 5,000 clonotypes). Furthermore, 7d after immunization the same pattern was present, with the important exception that the highest CN fraction was no longer affected (Fig 2B). This might explain why studies analyzing only clonotypes with highest CN at late time points after immunization failed to demonstrate immune response-induced effects on the TCR-R [10, 12, 13]. Second, our approach revealed a late immunization-induced shift within the fractions of lowest CN manifesting 7d after immunization and thus considerably later than the first effect. This effect was seen in all parameters used, was confirmed by significant classification results and has, to our knowledge, not been characterized before.
Throughout time points and parameters, the two effects observed within the fractions of high and low CN were separated by at least two fractions of medium CN that did not show any immunization effects indicating that the observed effects are caused by two different biological mechanisms.

Expansion and migration of SRBC-specific clonotypes cause the effects observed in the high and low copy number fractions

The final expansion level of a clonotype is assumed to be determined by receptor affinity, antigen dose and competitive pressure of other clonotypes [21, 22]. Thus, clonotypes which first reach the site of antigen presentation (here: the spleen) expand under conditions of overabundance of antigen and without any competitive pressure, leading to a massive expansion that manifests in an increased number of clonotypes in the top CN fractions. Subsequently, since SRBC are a non-replicating antigen, over time an increasing number of specific clonotypes compete for a decreasing amount of antigen which leads to a continuous decrease in expansion rates. This explains the accumulation of reacting clones in fractions of medium CN and the lack of immunization effects in the fraction with highest CN at later time points. Furthermore, both the egress of SRBC-specific clonotypes and simultaneous entry of non-specific clonotypes contribute to the descent of the immunization-induced parameter shifts to fractions of medium CN. The second effect of the T cell response that manifested among clonotypes of lowest CN is probably due to the (re-)immigration of antigen-specific T cells that originally expanded in other parts of the spleen. The observation that the number of SRBC-specific clonotypes in the blood increases 4d after immunization strongly supports this notion [13]. Further experiments are needed to determine whether characteristics and time course of the described alterations of the TCR-R are typical for immune responses against large, non-replicating antigens or if they also hold true for small and/or replicating antigens.

The immunization effects revealed by RHI and CDI within the TCR-R are due to the displacement of unspecific public clonotypes by specific private clonotypes

The reduction of the Jaccard Index measuring the clonal overlap between animals already confirmed the predominantly private nature of both immunization-induced effects (Fig 1F). Thus, expanding SRBC-specific private clonotypes displace naïve clonotypes within both affected regions of the repertoire. The applied indices do not detect the displacement of the many unspecific private clonotypes but instead that of the few unspecific public clonotypes which are known to appear in considerably higher CN than private clonotypes [23]. This is due to their generation by a combination of increased probability of occurrence during somatic recombination [24, 25] and selective advantages during positive and negative selection [23], which also explains their on average shorter CDR3β length. These features are reflected by both the decrease of mean sequence length and increase of the Jaccard index in the CNhigh fraction (Fig 1B and 1F) and throughout the ten repertoire fractions (S1A and S1B Fig), respectively. Furthermore, the three parameters recognized by RHILD, RHIVJ and CDI and their change of direction fit well into the interpretation that the immunization-induced effects emerge by specific private clonotypes displacing non-specific public clonotypes: First, public clonotypes are arranged in cluster structures where the public prototype is ‘surrounded’ by a set of (not necessarily public) clonotypes with very similar CDR3β regions [26]. Thus, the reduced number of similar CDR3β sequences as quantified by a decrease of RHILD can be ascribed to the disappearance of the public cluster structures due to the expansion of SRBC-specific private clonotypes (Fig 1C). Second, public clonotypes display a restricted VJ segment usage [23] which leads to relative homogeneity of VJ segment usage within the different CN fractions. 
After immunization with SRBC the affected CN fractions are less homogenous as indicated by the RHIVJ index (Fig 1D) because public clonotypes are displaced by SRBC-specific private clonotypes. The superior discriminatory power of RHIVJ compared to RHILD is due to the biological features of the immune response against SRBC and could as well be the other way round in responses against other antigens when dominated by clonotypes that rather share similar CDR3β sequences than VJ segment usage. Third, convergent recombination leads to the appearance of public clonotypes in families with identical CDR3β amino acid sequences (i.e. one clonotype) which are encoded by several nucleotide sequences [23]. This is only rarely the case for private clonotypes. Therefore, a decrease of nucleotide coding diversity as measured by the CDI indicates the displacement of public clonotypes due to an increase in the number of SRBC-specific private clonotypes within the respective CN fractions (Fig 1E). Taken together, the decrease in homogeneity regarding CDR3β amino acid sequence (RHILD) and VJ segment usage (RHIVJ) in combination with the decrease in nucleotide coding diversity (CDI) demonstrates the displacement of public clonotypes by SRBC-specific private ones. Most likely, however, RHI and CDI would be also able to monitor immune responses at the TCR-R level that predominately are of public nature. Here the expansion of specific public clonotypes would lead to repertoire shifts in RHILD, RHIVJ and CDI in the opposite direction as found in the present study for SRBC due to the replacement of unspecific private clonotypes. In addition, the extremely small public component among a SRBC-induced immune response can only be followed by directly tracing a set of previously identified SRBC-specific public clonotypes [12, 13, 15]. 
In contrast to the private component, an increase of SRBC-specific public clonotypes was observed mainly in the high CN fraction and was absent in the fraction of lowest CN (Fig 4). This observation can be explained by a limitation of our analytical approach. Activated T cells with a public CDR3β sequence that remigrate from the blood into the spleen usually meet locally proliferating ‘conspecifics’, i.e. T cells of the same clonotype. In our setting, remigrating and locally expanding T cells of the same clonotype cannot be distinguished. Thus, the extracted CDR3β sequences are merged and ascribed to a fraction of higher CN.

Classification approaches highlight the importance of systematic repertoire fractioning for in-depth analysis of the TCR-R

Detection of immunization-induced effects in this model failed when either the whole repertoire was analyzed or the CN fractions were defined too broadly (Fig 1). Classification was likewise impossible when whole repertoires were investigated (S2A and S2B Fig). Interestingly, the latter problem could partly be solved by weighting for CN (see S1 Methods and S2C and S2D Fig). However, such approaches are in principle unable to reveal the two separate immunization-induced effects. In particular, the late effect among clonotypes of lowest CN and its clear separation from the early effect within high CN fractions could only be elucidated by the refined fractioning of the whole repertoire following the systematic and objective strategy based on the logarithm to the base 2. This system leads to an optimal combination of clonotypes per fraction and number of fractions for subsequent statistical analysis and can be applied to any repertoire, allowing precise detection and discrimination of immunization-induced effects. Subsequently, separate classification approaches within those repertoire regions where antigen-specific clonotypes accumulate can reach satisfactory success rates, as shown here for immunization with SRBC (Fig 3), confirming the occurrence of distinct accumulations of antigen-specific clonotypes at the level of individual animals. Thus, identification of CN fractions containing significant numbers of antigen-specific clonotypes as outlined in the present study might reveal important insights into the dynamics of the T cell response against a variety of antigens.
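The base-2 fractioning can be illustrated with a minimal Python sketch. It assumes that fraction k collects all clonotypes with 2^k ≤ CN < 2^(k+1) and that everything at or above a cap is pooled into the top fraction; the exact boundaries used in the study are defined in the results section, so the cap `max_fraction` and the function name are hypothetical.

```python
from collections import defaultdict


def log2_fractions(copy_numbers, max_fraction=9):
    """Group clonotypes into copy number (CN) fractions by floor(log2(CN)).

    copy_numbers: dict mapping clonotype -> integer CN (CN >= 1).
    Fraction k holds clonotypes with 2**k <= CN < 2**(k + 1). Clonotypes with
    CN >= 2**max_fraction are pooled into the top fraction; this cap is an
    assumption made for illustration, not a value taken from the study.
    """
    fractions = defaultdict(list)
    for clonotype, cn in copy_numbers.items():
        # For positive integers, bit_length() - 1 equals floor(log2(cn)) exactly.
        k = min(cn.bit_length() - 1, max_fraction)
        fractions[k].append(clonotype)
    return dict(fractions)


example = {"A": 2, "B": 3, "C": 4, "D": 700, "E": 5000}
fractions = log2_fractions(example)
```

Indices such as RHI or CDI would then be computed separately on each fraction rather than on the whole repertoire.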

Conclusion

Combining RHI and CDI and/or classification approaches using analogous discriminatory parameters with systematic splitting of the TCR-R into different CN fractions reveals the dynamics of the SRBC-specific T cell response at the TCR-R level. We demonstrate that SRBC-specific clonotypes first accumulate in high CN fractions and at later time points also in low CN fractions. The early expansion-based effect within high CN fractions also extends into the medium CN fractions, which together contain approximately the top 5,000 clonotypes. Thus, selective analyses of considerably smaller top fractions such as the top 100 clonotypes, as done in our previous studies [12, 15] or in other studies investigating non-replicating antigens such as ovalbumin [10], might miss relevant information. The present study shows that although no alterations are observed in the highest CN fraction (comparable to the top 100) at 7d, clear shifts are seen within the three fractions below (Fig 2B). The late migration-based effect among clonotypes of lowest CN has not been described before. Thus, the analytic strategy outlined in the present study allows the precise localization and characterization of immunization effects at the TCR-R level.

Materials and methods

The present work is based on data derived from a previously published study [12]. Relevant experimental aspects such as rearrangement of experimental groups and data preprocessing are briefly described below. For protocols of animal handling, spleen removal and cryo-sectioning as well as total RNA extraction, CDR3β-chain transcription and amplification we refer to the initial publication [12].

Mouse model and experimental groups

Eight- to twelve-week-old C57BL/6J mice were either immunized by injecting 200 μl phosphate-buffered saline (PBS) containing 10⁹ SRBC into the tail vein or assigned to the control group receiving 200 μl PBS only. Immunized mice were sacrificed 3d, 4d and 7d after immunization (n = 10 each); PBS-injected animals were sacrificed at 3d and 4d (n = 10 each) and merged into a single control group (n = 20). Originally, half of each group was experimentally exposed to short-term sleep restriction, but the T cell response was not affected by this manipulation [12], which is why we merged both conditions, resulting in four groups: ‘PBS’ (control), ‘SRBC 3d’, ‘SRBC 4d’ and ‘SRBC 7d’ (days post-immunization, respectively).

TCR repertoire generation

CDR3β identification, clonotype clustering and correction of sequencing errors were performed using MiTCR software [27]. All parameters were set to the standard values of the ClonoCalc graphical user interface [28] for MiTCR. After removing nonfunctional sequences, we obtained an average of 1.9 million reads for each sample, which were assigned to a mean of 100,000 different CDR3β nucleotide sequences. Due to experimental variability, the read counts in the 7d group were consistently increased (on average 2.7 million in the 7d SRBC group vs. 1.6 million in the remaining samples). To obtain comparable samples, the data sets of the 7d group were downsampled to the mean read count of the remaining data sets (1.6 million reads). Note that the downsampling of the 7d group was performed after removing nonfunctional nucleotide sequences; thus, the final result of the downsampling step is not affected by these sequences. Subsequently, all nucleotide sequences coding for identical amino acid sequences were merged into one ‘clonotype’, which refers to a set of T cells with identical CDR3β region. Note that it is not ensured that all sequences of such clonotypes bear identical receptors since they can differ in subsequences outside the CDR3β region as well as in the α-chain. Each clonotype was then assigned the V- and J-segments of the underlying nucleotide sequence of highest read count as well as the summed read count of all underlying nucleotide sequences as total read count, referred to in the following as copy number (CN). To avoid artificial clonotypes arising from polymerase reading errors during the sequencing procedure, sequences of copy number 1 were excluded from further analysis. After this, the number of clonotypes per repertoire ranged from approximately 50,000 to 100,000 per sample. The exact numbers of extracted clonotypes and sequences for each sample are provided in S1 Table.
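The merging step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the record layout, the function name `build_clonotypes` and the placeholder sequence strings are hypothetical, and the CN 1 filter is applied here to merged clonotypes.

```python
from collections import defaultdict


def build_clonotypes(records, min_cn=2):
    """Merge nucleotide sequences that encode the same CDR3 amino acid sequence.

    records: iterable of (cdr3_aa, cdr3_nt, v_segment, j_segment, read_count).
    Returns a dict keyed by amino acid sequence. Each clonotype carries the
    summed read count as copy number (CN) and the V/J segments of its
    nucleotide variant with the highest read count.
    """
    grouped = defaultdict(list)
    for aa, nt, v, j, reads in records:
        grouped[aa].append((reads, nt, v, j))

    clonotypes = {}
    for aa, variants in grouped.items():
        cn = sum(reads for reads, _, _, _ in variants)
        if cn < min_cn:
            # Drop clonotypes with CN 1 as likely sequencing artefacts.
            continue
        top = max(variants, key=lambda variant: variant[0])
        clonotypes[aa] = {"cn": cn, "v": top[2], "j": top[3], "n_nt": len(variants)}
    return clonotypes


# Placeholder records; the nucleotide strings are labels, not real sequences.
records = [
    ("CASSL", "nt_variant_1", "V1", "J1", 5),
    ("CASSL", "nt_variant_2", "V2", "J1", 3),
    ("CASSQ", "nt_variant_3", "V3", "J2", 1),
]
clonotypes = build_clonotypes(records)
```

Keeping the number of underlying nucleotide variants (`n_nt`) per clonotype is convenient because the nucleotide coding analyses later in the methods depend on it.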

Standard parameters and statistics

For each animal we extracted the number of clonotypes and the mean CDR3β sequence length, either for the whole repertoire or for certain fractions defined in the results section. Furthermore, we used the Jaccard Index to quantify the clonal overlap of the extracted TCR-R of different mice. We calculated this index for each potential pairing of data sets in each experimental group, which leads to multiple dependencies between the obtained values. This violates basic assumptions of commonly used inferential statistics, e.g. calculation of p-values and confidence intervals, which is why we restricted our analysis to descriptive considerations for this parameter. For all standard parameters and other indices (see below), immunized and control repertoires were compared using the Mann-Whitney U test. Unless otherwise mentioned, all tests were two-tailed with a significance level of 0.05. Correction for multiple testing was performed using Holm’s method. p-values which refer to analyses of repertoire fractions were corrected independently from the analyses of total repertoires. Calculations and data visualization were performed using the R platform for statistical computing [29]. In the following, we refined established indices like the Simpson and Morisita-Horn Index such that the resulting new indices allow for both a more flexible and deeper characterization of the TCR-R. A detailed description of the underlying assumptions and derivations is provided in S1 Methods, while only central aspects are presented in the following. Each TCR-R data set X can be interpreted as a finite set of pairs (x, ν(x)), where ν(x) denotes the CN of a clonotype x in X. To allow a comparison of clonotypes concerning complex parameters such as V- and J-segment usage and CDR3β sequence similarity we defined generalized versions of the indices mentioned above. In the following we denote by Ω the set of all possible TCRβ sequences, R ⊂ Ω × Ω an arbitrary reflexive, symmetric relation (i.e. 
a criterion of similarity of two sequences) on Ω and by 𝟙(⋅) the indicator function (returning 1 if the given statement is true and 0 if not). Depending on the similarity criterion R, we defined the Repertoire Homogeneity Index (RHI; see S1 Methods for the formal definition). RHI reflects the probability that a randomly sampled pair of clonotypes within a data set is similar with respect to R, with R allowing a flexible definition of similarity within a variety of sequence parameters. The concept of this approach is analogous to that of the Simpson Index [30, 31]. Note that the adapted index does not account for CN since we aimed to apply it to repertoire fractions already defined by CN (for a weighted version of the RHI, see S1 Methods). Subsequently, we defined as criterion of similarity R either that the two clonotypes share equal V- and J-segments (RHIVJ), or that the Levenshtein distance (LD) [32] of their CDR3β regions is at maximum 1 (RHILD).
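A minimal Python sketch may help to make the verbal definition concrete. It assumes that pairs are sampled without replacement, so that RHI is the fraction of unordered pairs of distinct clonotypes that satisfy R; the exact normalization is given in S1 Methods and may differ. The dictionary keys and function names are hypothetical, and the Levenshtein distance is implemented directly instead of using the Java library applied in the study.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def rhi(clonotypes, similar):
    """Fraction of unordered clonotype pairs that are similar; CN is ignored."""
    pairs = list(combinations(clonotypes, 2))
    if not pairs:
        return 0.0
    return sum(bool(similar(x, y)) for x, y in pairs) / len(pairs)


def same_vj(x, y):
    """Similarity criterion corresponding to RHIVJ."""
    return x["v"] == y["v"] and x["j"] == y["j"]


def ld_at_most_1(x, y):
    """Similarity criterion corresponding to RHILD."""
    return levenshtein(x["cdr3"], y["cdr3"]) <= 1


fraction = [
    {"cdr3": "CASSL", "v": "V1", "j": "J1"},
    {"cdr3": "CASSQ", "v": "V1", "j": "J1"},
    {"cdr3": "CATWDRG", "v": "V2", "j": "J2"},
]
rhi_vj = rhi(fraction, same_vj)
rhi_ld = rhi(fraction, ld_at_most_1)
```

Because every pair is evaluated, the running time grows quadratically with the number of clonotypes, which matches the computational cost noted for non-transitive relations such as the Levenshtein criterion.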

Coding diversity index

Each clonotype x ∈ X can be encoded by several nucleotide sequences. Based on the Simpson Index, we converted this nucleotide coding parameter into an index that accounts for the whole repertoire and thus allows comparison between animals and groups. For each clonotype x ∈ X we first defined the Nucleotide Coding Simpson Index DNC(x) (see S1 Methods for the formal definition), which quantifies the coding diversity of the clonotype x. The mean value of these indices provides a measure for the heterogeneity of the nucleotide coding of all clonotypes in X. Accordingly, we defined this mean value as the Coding Diversity Index, CDI(X) = (1/|X|) Σx∈X DNC(x).
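The following Python sketch illustrates one way to compute such an index. It assumes that DNC(x) takes the Gini-Simpson form, i.e. one minus the sum of squared read-count proportions of the nucleotide sequences encoding x; the formal definition in S1 Methods may use a different variant of the Simpson Index. The function names and the input layout are hypothetical.

```python
def nucleotide_coding_simpson(read_counts):
    """Assumed Gini-Simpson form: 1 - sum(p_i ** 2) over nucleotide variants.

    read_counts: read counts of all nucleotide sequences encoding one clonotype.
    Returns 0.0 for a clonotype encoded by a single nucleotide sequence.
    """
    total = sum(read_counts)
    return 1.0 - sum((count / total) ** 2 for count in read_counts)


def coding_diversity_index(repertoire):
    """Mean nucleotide coding diversity over all clonotypes of a repertoire.

    repertoire: dict mapping clonotype -> list of read counts per variant.
    """
    values = [nucleotide_coding_simpson(counts) for counts in repertoire.values()]
    return sum(values) / len(values)


repertoire = {"CASSL": [5, 5], "CASSQ": [7]}
cdi = coding_diversity_index(repertoire)
```

Under this assumption, a repertoire fraction dominated by clonotypes with a single coding sequence yields a CDI close to 0, consistent with the interpretation that private clonotypes lower the index.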

Cluster and classification analyses

The construction of classification tools required a quantification of the dissimilarity of two different data sets. In analogy to RHI capturing clonal similarities within a repertoire, we defined the Repertoire Similarity Index (RSI; see S1 Methods for the formal definition), which allows a comparison of clonotypes deriving from X with clonotypes of another data set Y = (y, ν(y)) and thus between repertoires. In the special case of the similarity criterion R demanding clonotype identity, RSI coincides with the Sørensen Index [17], and the analogous version weighting for CN (see S1 Methods) coincides with the Morisita-Horn Index [16, 33]. Subsequently, for k different relations R1, …, Rk attributed with appropriate weights α = (α1, …, αk) satisfying αi ≥ 0, i = 1, …, k, a weighted index d (see S1 Methods) provides the required measure of dissimilarity defined by the respective k criteria. For the subsequent cluster and classification analysis of repertoire fractions defined by CN we considered two different versions of d. With the first we targeted VJ segment usage corresponding to RHIVJ, with identity of V- and J-segment, respectively, treated as two independent and equally weighted criteria of similarity for calculation of RSIV, RSIJ and subsequently dV,J. For the second we utilized the nucleotide coding (NC) of clonotypes by calculating RSINC and dNC, with clonotypes x and y considered similar if the number of different nucleotide sequences coding for the respective clonotypes either coincides or exceeds 5 for both sequences as single criterion. This corresponds to CDI but does not account for proportional read counts of the nucleotide sequences. The data sets were categorized into two groups via K-medoids clustering [18, 19] using either dV,J or dNC as discriminatory criterion, with results visualized via metric multidimensional scaling [18] where dissimilarities are approximated by distances of points in a scatterplot. 
Supervised classification was performed using the leave-one-out procedure [34] with the training data sets labeled as ‘control’ and ‘immunized’. Subsequently, the K-medoids algorithm was independently applied to each of the two groups, defining exactly one medoid in each group. The label of the nearest medoid (prototype) was assigned to the only sample of the test data set. Note that classification criteria and subrepertoires were defined after analyzing the data sets and subsequently the same data were used as test data sets for the evaluation of the algorithm in view of discriminatory power. Such approaches might result in an overestimation of discriminatory power, in particular if sample-specific properties of the data are incorporated in the algorithm, a problem discussed in detail in [35]. The results were evaluated using Fisher’s exact test, with a significant result implying that the labels are indeed affected by the true immunization status (in contrast to random labeling of the samples). K-medoids clustering was performed using the pam function of the R-package cluster [36]. Calculation of RHI and RSI was performed in Java with the required software developed using Eclipse IDE for Java Developers. For calculation of the Levenshtein distance the Apache Commons Text library [37] was applied. In general, the main advantage of RHI and RSI is a high degree of flexibility. In fact, arbitrary criteria for similarity between sequences (which are formalized as reflexive and symmetric relations) can be applied to quantify data sets in view of homogeneity and similarity. This ensures a universal applicability which is independent of the concrete biological parameters of interest. Their main disadvantage is the high computational cost in case of non-transitive relations, where all potential pairs of clonotypes have to be evaluated in view of the similarity criterion, leading to exorbitant computational time. 
In case of equivalence relations, the indices can be calculated using the original formula (Simpson or Morisita-Horn Index) and represent an application of well-established statistical methods. For example, Glanville et al. applied the Simpson Index to evaluate the homogeneity of V-segments and CDR3 sequence length in preselected TCR clusters [38]. In addition to a dramatic reduction of computational time, the application of equivalence relations ensures some analytic properties of the indices which may be desired in many applications. Detailed derivations are provided as supplemental material (see S1 Methods).
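The supervised classification procedure described above can be sketched in Python as follows. The sketch assumes a precomputed symmetric dissimilarity matrix (for example dV,J or dNC) and uses the fact that K-medoids with a single medoid per group reduces to selecting the sample with the smallest summed dissimilarity to the other group members. The study itself used the pam function of the R-package cluster; the function names and toy data here are hypothetical.

```python
def medoid(indices, d):
    """Index of the sample minimizing the summed dissimilarity within a group."""
    return min(indices, key=lambda i: sum(d[i][j] for j in indices))


def leave_one_out(d, labels):
    """Predict each sample's label from the nearest group medoid.

    d: symmetric dissimilarity matrix as a list of lists.
    labels: true group label of each sample (e.g. 'control', 'immunized').
    The tested sample is excluded before the medoids are determined.
    """
    predictions = []
    for test in range(len(labels)):
        best_label, best_dist = None, float("inf")
        for label in sorted(set(labels)):
            members = [i for i, l in enumerate(labels) if l == label and i != test]
            if not members:
                continue
            m = medoid(members, d)
            if d[test][m] < best_dist:
                best_label, best_dist = label, d[test][m]
        predictions.append(best_label)
    return predictions


# Toy data: one-dimensional positions stand in for repertoires.
positions = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]
labels = ["control"] * 3 + ["immunized"] * 3
d = [[abs(p - q) for q in positions] for p in positions]
predictions = leave_one_out(d, labels)
```

The resulting table of predicted versus true labels is what an exact test of independence would then be applied to.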

Differential gene-expression analysis

To identify those public clonotypes which expand after SRBC application we performed a differential gene expression analysis using the R-package edgeR [20]. Since the mathematical procedure is described in detail in [39] we give only a brief summary here. The algorithm is based on the assumption that for each clonotype, the distribution of the CNs follows a negative binomial distribution. For each clonotype, the null hypothesis that this distribution is not affected by the experimental conditions (SRBC application, time point of immunization) was tested using a likelihood-ratio test. To reduce the effect of unspecific proliferation phenomena or PCR artefacts, the threshold of significance for the likelihood-ratio test was set to 0.005 [13]. In this analysis only those clonotypes were included which were detected in at least 75% of immunized animals. This way, we identified 40 clonotypes with significantly increased CN after immunization compared to control animals. In the original study, these criteria were fulfilled by 44 clonotypes [12], with the deviation caused by differences in data preprocessing including downsampling of the 7d group data. However, relative distributions remained comparable to the previous study: out of the 40 clonotypes an average of ∼35 were detected in control animals and ∼38 in immunized animals, with no significant differences between the three time points (not shown). While the mean CN per clonotype was considerably elevated at 3d and 4d to approximately 4- and 6-fold compared to control animals (p < 4 ⋅ 10⁻⁶ and p < 6 ⋅ 10⁻⁸, respectively), CNs and thus clonotype expression levels at 7d after immunization decreased to about 2-fold compared to the control level (p < 4 ⋅ 10⁻⁵).

Logarithmic fractioning of the repertoire shows graduated shifting of certain parameters with increasing copy number and the intermediate part of the repertoire untouched by immunization-induced effects.

(PDF)

Classification of total repertoires is successful only when weighted for copy numbers.

(PDF)

Detailed derivations of the adapted statistic indices introduced in this study.

For statistical properties that differ from those of the original versions, detailed analytic proofs are provided. Furthermore, we provide weighted versions (accounting for copy numbers) for two of these indices. (PDF)

Number of extracted clonotypes and sequence reads of the 50 data sets after all preprocessing steps (including downsampling of the 7d group and removing clonotypes with CN 1).

(PDF)

9 Jun 2022

PONE-D-22-11828

Tracking a diverse immune response against a complex cellular antigen in the murine T cell receptor repertoire: Over time, expanding private clones accumulate in the high but also the low copy number region

PLOS ONE Dear Dr. Westermann, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Some issues and questions arose with respect to methodology and interpretation of your work that require detailed changes and re-interpretation.

Please submit your revised manuscript by Jul 24 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript: A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Kind regards, Jörg Hermann Fritz, Ph.D. 
Academic Editor PLOS ONE Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf 2. Please include your full ethics statement in the ‘Methods’ section of your manuscript file. In your statement, please include the full name of the IRB or ethics committee who approved or waived your study, as well as whether or not you obtained informed written or verbal consent. If consent was waived for your study, please include this information in your statement as well. Additional Editor Comments: Dear Dr. Westermann, two experts in the field have reviewed your manuscript and consider the work highly interesting. However, a couple of issues and questions have been raised with respect to interpretation and clarity that need to be addressed and discussed. With best regards, Jörg Fritz Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Partly Reviewer #2: Yes ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? 
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: No Reviewer #2: Yes ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: In this manuscript, Meinhardt et al., analyze the splenic TCR repertoire of mice post-immunization with sheep red blood cells. They adapt a set of statistical metrics with the aim of studying the dynamics of the SRBC-specific response over time and detect clonotypes contributing to this response. They also introduce a fractioning strategy allowing to identify the distribution of SRBC-related clonotypes within the repertoire. They posit that early after immunization, private antigen-specific clones accumulate in high copy numbers, whereas they are found in lower numbers at later timepoints. 
There is high value in this paper in terms of highlighting the importance of looking beyond the top expanded clonotypes when studying an immune response over time, and the challenges that come with the search for response-specific clonotypes in a highly diverse TCR repertoire, which is like looking for a needle in a haystack. However, I fear that the fractioning method that they put in place could only work for unsorted samples, as such repertoires represent a smooth and “conventional” distribution across all count fractions. In contrast, studying sorted populations of different sizes and phenotypes would imply differential clonotype count distributions and subsequently incomparable CN fraction sizes. Thus, using their strategy can only allow a global description of the repertoire modulation post-immunization, but do not allow a thorough investigation of a specific cell subset dominating the response. I am curious about the dominant discriminatory contribution of the VJ usage in this study, as we would expect modifications of the CDR3 characteristics in response to an antigen rather than the VJ usage. Moreover, this contribution might be only reflecting the underlying differential contribution of the CD4/CD8 populations in the immune response and might not be the case when studying sorted populations. The term “tracking” is used in the manuscript title. The way I see it, this would imply the tracking of specific clonotypes across time, which is not the case. Thus, this term, and the whole title which is bit too long, should maybe be reconsidered. Overall, I think that the strategical aspects of this paper need to be looked at more carefully in order to determine the validity of the conclusions that are made and the hypotheses that are proposed. Introduction: “The specificity of the receptor of each single T cell is restricted to a small set of amino acid patterns”: Any references in support of this statement? 
To my knowledge, a T-cell can recognize up to 10⁶ different pMHC complexes (Mason, 1998; Wooldridge et al., 2012). Moreover, the TCR can induce structural shifts allowing the recognition of different peptides as it is the case for the BM3.3 TCR (Archbold et al., 2009; Borbulevych et al., 2009, Reiser et al., 2003). Methods: 1- Additional precisions on the MiXCR parameters used to align the datasets should be provided. 2- Was the down-sampling performed using the number of sequencing reads as a threshold and thus applied on the lists of reads pre-alignment? I think this is an important aspect to be clarified as in this case, the sampling threshold would take into account sequencing errors and unproductive sequences that might lead to an overestimation of the repertoire sizes. Along with this point, the down-sampling applied on the repertoires 7d post-immunization could be causing the elimination of rare clonotypes that could otherwise be contributing to the immune response, particularly as significant perturbations were observed in the CNlow fraction at this timepoint. 3- “Each clonotype was assigned the V- and J- segments of the underlying nucleotide sequence of highest read count”. Why did the authors opt for this strategy? I think assigning the most expressed VJ combination to each “set of clonotypes” might overshadow other highly expressed combinations (was the VJ distribution looked at beforehand?) subsequently giving a higher importance to certain combinations in the RHIVJ analysis. 4- If I understood well, the repertoire fractioning was applied without taking into account the VJ genes. While I totally agree with the rationale of focusing on the CDR3 region while searching for relevant CDR3s in the context of an immunization in view of its central role in antigen recognition, taking into account the V-J usage in the identification of repertoire fractions could be important particularly as it is one of the two criteria on which this study is based. 
5- It is mentioned that the number of clonotypes ranged from 50k to 110k per sample, however these numbers do not match with the ones plotted in Fig1A. A summary table of the number of sequences/clonotypes would be much clearer. Figures: Fig. 1: 1- Title: The use of the term “SRBC-specific” is not appropriate as no functional assays were performed to validate the specificity of the identified clones. 2- Fig1B: an “e” is missing in “sequence” 3- Was the arbitrary fractioning completely random or based on a certain percentage of the top clonotypes? I am curious about the number of clones being that homogeneous between experimental groups for the CNlow and CNmed but not the CNhigh fractions. 4- What does “clonotype” refer to in this section? If I refer to the materials and methods, a clonotype is a “set of T cells with identical CDRβ region”, and thus does not include the V and J genes. With that being said, “two clonotypes are defined as similar if the LvD of their CDR regions is at maximum 1”, how could two clonotypes be completely similar and thus have a LvD=0 if the VJ genes are not accounted? Shouldn’t they be considered as one clonotype in this case? More precision on the definition of a clonotype is needed in this section to clarify these subtleties. 5- I would not consider that the RHIVJ can capture “sequence similarity”, but rather the diversity in the gene usage. Two clonotypes expressing the same VJ combination but having completely different CDR3 sequences cannot be considered as similar, particularly when the specificity is mostly encoded by the CDR3 sequence. 6- A visualization of the LvD clusters as a complement to the RHILD could bring additional information on the size/density of the formed clusters and the effect of the sequence similarity reduction observed 3d post-immunization compared to the PBS group on the cluster’s architecture. 
Considering that private clones displace public ones (as stated in the discussion), big dense clusters should be fractioned into multiple small clusters.
7- The use of the term “homogenization” in reference to a decrease in the number of nucleotide CDR3 sequences encoding the same amino acid sequence is confusing, as by homogeneous one would think of a homogeneous number of nucleotide sequences across all amino acid clonotypes in the repertoire. That is, however, not the case here.

Fig. 2:

1- It would be easier to follow the results if the main text and the figures used the same CN nomenclature (either 1-9 or 2-512).
2- It would be interesting to look at the Jaccard scores between the CNhigh fraction at day 3 and CNlow at day 7 to track whether the potentially antigen-related expansions are the ones that are detected at low frequency at a later timepoint post-immunization.
3- I’m curious about the differences observed in the 64 […] Moreover, as RHIVJ shows significant differences for fractions with CN>64, I would assume that the perturbation in the VJ usage is not directly linked to or caused by the potential SRBC-related clonotypes that are expanded 3-4 days post-immunization and which are observed within the CN>256 fraction. This might be caused by the V-J assigning strategy.
4- The superior discriminatory power of RHIVJ compared to RHILD and CDI could reflect the differential contribution of the CD8 and CD4 populations to the immune response against SRBC, as the study was done on unsorted T cells. Has the CD8/CD4 ratio been looked at?

Fig. 3:

1- Details on the methods used for both unsupervised and supervised classification are needed within the main text to further clarify what was done in this section.
2- The choice of the “relevant” fractions is arbitrary in my view and could be the reason behind the discordant results obtained on both fractions when looking at the VJ usage and nucleotide coding.
In fig2 B, significant p values are obtained with CDI on the CN<4 and CN>256 fractions, whereas significant differences are observed with RHIVJ on CN<16 and CN>64. This could explain why VJ usage but not nucleotide coding performs well on the chosen Xtop fraction (CN>64), whereas nucleotide coding classifies well on the Xbottom (CN<4) fraction. Thus, I think the choice of fractions should take into account the results obtained for each index in fig2.
3- For the Xbottom fraction, 3d and 4d post-immunization repertoires could be used as a control in the supervised classification analysis by comparing the repertoire of each timepoint to the control group. Based on the results in fig2B, values should be similar to the ones obtained when applying random labeling.
4- FigS2: Did the authors perform the classification analysis using the nucleotide coding criterion on the total repertoire?

Fig. 4:

1- A brief description of the identification strategy of the “public SRBC-specific” clonotypes is needed in the results section. Was the strategy timepoint-dependent, i.e. applied to each timepoint compared to the control group?
2- Does this strategy take into account the CN of the clonotypes across mice within the same experimental group? For example, a clonotype identified as significantly more present in the immunized group but in totally different CN fractions across the mice (CNlow and CNhigh) might not have the same weight in, or contribution to, the SRBC response as other clonotypes that are found within the CNhigh fraction in all mice.
3- In view of the low Jaccard scores shown in fig1, I find it surprising that only 40 clonotypes were found to be enriched in the immunized group. It might be interesting (if it is not already the case) to look at each time point independently, as their behavior is not similar across all previous analyses.
4- Again, the term “specific public clones” is not appropriate, as the specificity was not tested by functional assays.
5- I do not think that the observations in fig 2 and 3 reflect the behavior of “private clonotypes” exclusively, as no prior filtering was applied to select such clonotypes, as was done in fig 4 for the public ones. A more appropriate comparison could be performed between the identified 40 public clones and a list of CDR3s with no significant enrichment in the immunized group in order to confirm the accumulation of the public clones within the highest fraction.
6- The inclusion of the RHIVJ results from fig2 is misleading and seems unnecessary in this analysis, particularly as the number of clonotypes and the RHIVJ did not show concordant observations in fig 2. It would be more interesting to look at the RHILD of the 40 public clonotypes compared to a list of non-enriched CDR3s. Furthermore, the search for amino acid motifs that are shared by the private clonotypes could reveal their potential implication in the SRBC response or, on the contrary, reveal their rather bystander activation.

Reviewer #2: Summary
In this manuscript, Meinhardt et al. reported a novel approach to estimate the diversity of TCRs using CDR3β and VJ usage. The authors utilize this model to reveal the T cell repertoire dynamics post-immunization. This approach can potentially also be used in studying a variety of immune responses. The manuscript is suitable for publishing in PLOS ONE. Please see attachment for comments.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Submitted filename: 20220603 PLOS One ReviewCorrected.docx

24 Jul 2022

see cover letter

Submitted filename: Response to Reviewers.pdf

5 Aug 2022

The splenic T cell receptor repertoire during an immune response against a complex antigen: Expanding private clones accumulate in the high and low copy number region

PONE-D-22-11828R1

Dear Dr. Westermann,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance.
To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible, no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Jörg Hermann Fritz, Ph.D.
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed
Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: (No Response)
Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: (No Response)
Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.

Reviewer #1: (No Response)
Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: (No Response)
Reviewer #2: Yes

**********

6. Review Comments to the Author. Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: All comments have been addressed. Two minor comments to be taken into account:
- Title: if clones are expanded then they shouldn't be in the low copy number fraction... Maybe review this term in the title.
- Fig1: specify the logarithmic scale in fig1A

Reviewer #2: Summary
In this manuscript, Meinhardt et al. utilized a new model to estimate the dynamic diversity of TCRs post-immunization.
The authors have addressed some of the concerns and comments in my previous review. The revised version has improved clarity and more detailed methodology. The manuscript is suitable for publishing in PLOS ONE.
Comments for the authors (Revision is not needed): I am satisfied with the authors’ responses. No further edit is needed.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No

**********

Submitted filename: 20220802 PLOS One D-22-11828 2nd Review.docx

12 Aug 2022

PONE-D-22-11828R1

The splenic T cell receptor repertoire during an immune response against a complex antigen: Expanding private clones accumulate in the high and low copy number region

Dear Dr. Westermann:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Jörg Hermann Fritz
Academic Editor
PLOS ONE
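
Two of Reviewer #1's first-round points are computational enough to illustrate: the question of how two distinct clonotypes could have an LvD of 0 (Fig. 1, point 4), and the suggested Jaccard comparison between the CNhigh fraction at day 3 and the CNlow fraction at day 7 (Fig. 2, point 2). The following is only a minimal sketch, not the authors' pipeline. It assumes clonotypes are represented by their CDR3 amino acid strings alone; the function names and the example sequences are hypothetical.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                # delete ca
                curr[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),   # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def similar_pairs(cdr3s, max_dist=1):
    """Pairs of distinct CDR3 strings whose edit distance is at most max_dist."""
    unique = sorted(set(cdr3s))  # a clonotype is assumed to be one CDR3 string
    return [(a, b) for a, b in combinations(unique, 2)
            if levenshtein(a, b) <= max_dist]


def jaccard(a, b):
    """Jaccard index of two clonotype collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical CDR3 sequences, for illustration only.
day3_cn_high = ["CASSLGQYEQYF", "CASSLGGYEQYF", "CASSPDRGAEQFF"]
day7_cn_low = ["CASSLGQYEQYF", "CASSQDTQYF"]

print(similar_pairs(day3_cn_high))        # one pair, differing by a single residue
print(jaccard(day3_cn_high, day7_cn_low))  # 0.25
```

Because the strings are deduplicated before pairing, no pair at distance 0 can occur in this sketch. A distance of 0 between two different clonotypes is only possible if the clonotype definition also includes something beyond the CDR3 string, such as the V and J genes, which is the ambiguity the reviewer asks the authors to resolve.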

27 in total

Review 1. Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification.

Authors: Richard Simon; Michael D Radmacher; Kevin Dobbin; Lisa M McShane
Journal: J Natl Cancer Inst Date: 2003-01-01 Impact factor: 13.506

Review 2. Comparative analysis of murine T-cell receptor repertoires.

Authors: Mark Izraelson; Tatiana O Nakonechnaya; Bruno Moltedo; Evgeniy S Egorov; Sofya A Kasatskaya; Ekaterina V Putintseva; Ilgar Z Mamedov; Dmitriy B Staroverov; Irina I Shemiakina; Maria Y Zakharova; Alexey N Davydov; Dmitriy A Bolotin; Mikhail Shugay; Dmitriy M Chudakov; Alexander Y Rudensky; Olga V Britanova
Journal: Immunology Date: 2017-11-27 Impact factor: 7.397

Review 3. T-cell antigen receptor genes and T-cell recognition.

Authors: M M Davis; P J Bjorkman
Journal: Nature Date: 1988-08-04 Impact factor: 49.962

4. Overview of methodologies for T-cell receptor repertoire analysis.

Authors: Elisa Rosati; C Marie Dowds; Evaggelia Liaskou; Eva Kristine Klemsdal Henriksen; Tom H Karlsen; Andre Franke
Journal: BMC Biotechnol Date: 2017-07-10 Impact factor: 2.563

5. T cell receptor repertoires of mice and humans are clustered in similarity networks around conserved public CDR3 sequences.

Authors: Asaf Madi; Asaf Poran; Eric Shifrut; Shlomit Reich-Zeliger; Erez Greenstein; Irena Zaretsky; Tomer Arnon; Francois Van Laethem; Alfred Singer; Jinghua Lu; Peter D Sun; Irun R Cohen; Nir Friedman
Journal: Elife Date: 2017-07-21 Impact factor: 8.140

6. Specificity, Privacy, and Degeneracy in the CD4 T Cell Receptor Repertoire Following Immunization.

Authors: Yuxin Sun; Katharine Best; Mattia Cinelli; James M Heather; Shlomit Reich-Zeliger; Eric Shifrut; Nir Friedman; John Shawe-Taylor; Benny Chain
Journal: Front Immunol Date: 2017-04-13 Impact factor: 7.561

7. Complete but curtailed T-cell response to very low-affinity antigen.

Authors: Dietmar Zehn; Sarah Y Lee; Michael J Bevan
Journal: Nature Date: 2009-01-28 Impact factor: 49.962

8. VDJtools: Unifying Post-analysis of T Cell Receptor Repertoires.

Authors: Mikhail Shugay; Dmitriy V Bagaev; Maria A Turchaninova; Dmitriy A Bolotin; Olga V Britanova; Ekaterina V Putintseva; Mikhail V Pogorelyy; Vadim I Nazarov; Ivan V Zvyagin; Vitalina I Kirgizova; Kirill I Kirgizov; Elena V Skorobogatova; Dmitriy M Chudakov
Journal: PLoS Comput Biol Date: 2015-11-25 Impact factor: 4.475

Review 9. Regulation of B cell responses by distinct populations of CD4 T cells.

Authors: Meryem Aloulou; Nicolas Fazilleau
Journal: Biomed J Date: 2019-09-26 Impact factor: 4.910