Trisha A Rettig, Michael J Pecaut, Stephen K Chapes.
Abstract
Sequencing antibody repertoires has steadily become cheaper and easier. Sequencing methods usually rely on some form of amplification, often a massively multiplexed PCR prior to sequencing. To eliminate potential biases and create a data set that could be used for other studies, our laboratory compared unamplified sequencing results from the splenic heavy-chain repertoire in the mouse to those processed through two commercial applications. We also compared the effects of starting material (mRNA vs total RNA), reverse transcriptase, and primer usage for cDNA synthesis and submission. The use of mRNA for cDNA synthesis resulted in higher read counts, but reverse transcriptase and primer usage had no statistically significant effect on read count. Although most of the amplified data sets contained more antibody reads than the unamplified data set, we detected more unique variable (V)-gene segments in the unamplified data set. Although unique CDR3 detection was much lower in the unamplified data set, RNASeq detected 98% of the high-frequency CDR3s. We have shown that unamplified profiling of the antibody repertoire is possible, detects more V-gene segments, and detects high-frequency clones in the repertoire.
Keywords: heavy chain; high‐throughput sequencing; immunoglobulin genes; mouse; repSEQ
Year: 2018 PMID: 32123808 PMCID: PMC6996338 DOI: 10.1096/fba.1017
Source DB: PubMed Journal: FASEB Bioadv ISSN: 2573-9832
Total number of productive reads per data set
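| Sequencing technique | KSU | Com1 | Com1 | Com1 | Com1 | Com1 | Com1 | Com1 | Com1 | Com2 | Com2 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Starting material | | mRNA | mRNA | mRNA | mRNA | TRNA | TRNA | TRNA | TRNA | mRNA | TRNA |
| Reverse transcriptase | | AMV | AMV | MMLV | MMLV | AMV | AMV | MMLV | MMLV | | |
| Primer | | dT | Hex | dT | Hex | dT | Hex | dT | Hex | | |
| Total productive reads | 11 200 | 553 521 | 1 263 003 | 883 532 | 1 035 461 | 7 975 | 6 867 | 208 979 | 220 772 | 637 214 | 766 075 |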
Sequencing technique (Com1 and Com2 are amplified data sets).
Starting material (mRNA, messenger RNA; TRNA, total RNA).
Reverse transcriptase (AMV, Avian Myeloblastosis Virus; MMLV, Moloney Murine Leukemia Virus).
Primer (dT, Oligo dT; Hex, random hexamer).
An additional 27 896 reads were used for V‐gene segment usage assessment. These sequences were not long enough for CDR3 detection.
Percent of non‐C57BL/6 V‐gene segments detected per data set
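| Sequencing technique | Com1 | Com1 | Com1 | Com1 | Com1 | Com1 | Com1 | Com1 | Com2 | Com2 |
|---|---|---|---|---|---|---|---|---|---|---|
| Starting material | mRNA | mRNA | mRNA | mRNA | TRNA | TRNA | TRNA | TRNA | mRNA | TRNA |
| Reverse transcriptase | AMV | AMV | MMLV | MMLV | AMV | AMV | MMLV | MMLV | | |
| Primer | dT | Hex | dT | Hex | dT | Hex | dT | Hex | | |
| % Non‐B6 V‐gene segments | 0.92 | 0.88 | 0.84 | 1.17 | 1.30 | 1.46 | 1.02 | 1.14 | 1.74 | 1.41 |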
Sequencing technique (Com1 and Com2 are amplified data sets).
Starting material (mRNA, messenger RNA; TRNA, total RNA).
Reverse transcriptase (AMV, Avian Myeloblastosis Virus; MMLV, Moloney Murine Leukemia Virus).
Primer (dT, Oligo dT; Hex, random hexamer).
Figure 1. R² values of sequencing technical replicates. The percent of repertoire for each V‐gene segment detected was compared between technical replicates. The highest R² values are dark red, while the lowest are blue.
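The replicate comparison described in the Figure 1 caption can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes that R² is the squared Pearson correlation of per-segment percent-of-repertoire values, and that a segment missing from one replicate counts as 0 percent. The function names and the toy segment labels (`V1`, `V2`, ...) are hypothetical.

```python
from collections import Counter

def percent_of_repertoire(v_gene_calls):
    """Map each V-gene segment to its share (in percent) of all reads."""
    counts = Counter(v_gene_calls)
    total = sum(counts.values())
    return {gene: 100.0 * n / total for gene, n in counts.items()}

def r_squared(rep_a, rep_b):
    """Squared Pearson correlation over the union of detected segments.

    Assumption: a segment absent from one replicate contributes 0 percent.
    """
    genes = sorted(set(rep_a) | set(rep_b))
    xs = [rep_a.get(g, 0.0) for g in genes]
    ys = [rep_b.get(g, 0.0) for g in genes]
    n = len(genes)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R2 is undefined when a replicate has no variance")
    return (sxy * sxy) / (sxx * syy)

# Toy V-gene calls for two technical replicates (hypothetical labels).
replicate_1 = ["V1"] * 5 + ["V2"] * 3 + ["V3"] * 2
replicate_2 = ["V1"] * 4 + ["V2"] * 4 + ["V4"] * 2

usage_1 = percent_of_repertoire(replicate_1)
usage_2 = percent_of_repertoire(replicate_2)
print(f"R2 between replicates: {r_squared(usage_1, usage_2):.4f}")
```

Taking the union of segments means a segment seen in only one replicate lowers the correlation instead of being silently dropped. Whether the original analysis treated missing segments this way is not stated in the caption.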
Correlations of data sets to unamplified KSU data set and read counts
| Sequencing technique | Com1 | Com1 | Com1 | Com1 | Com1 | Com1 | Com1 | Com1 | Com2 | Com2 |
|---|---|---|---|---|---|---|---|---|---|---|
| Starting material | mRNA | mRNA | mRNA | mRNA | TRNA | TRNA | TRNA | TRNA | mRNA | TRNA |
| Reverse transcriptase | AMV | AMV | MMLV | MMLV | AMV | AMV | MMLV | MMLV | | |
| Primer | dT | Hex | dT | Hex | dT | Hex | dT | Hex | | |
| R² to KSU data set | 0.5677 | 0.5773 | 0.4496 | 0.5517 | 0.4457 | 0.5606 | 0.5554 | 0.5841 | 0.6695 | 0.6607 |
| Assessed V‐gene segments | 506 503 | 151 104 | 1 749 618 | 1 245 999 | 267 946 | 5666 | 267 946 | 302 057 | 626 093 | 755 280 |
Sequencing technique (Com1 and Com2 are amplified data sets).
Starting material (mRNA, messenger RNA; TRNA, total RNA).
Reverse transcriptase (AMV, Avian Myeloblastosis Virus; MMLV, Moloney Murine Leukemia Virus).
Primer (dT, Oligo dT; Hex, random hexamer).
Figure 2. Percent of repertoire for high‐frequency V‐gene segments among data sets. Percent of repertoire is displayed for the KSU, Com1 (mRNA‐MMLV‐Hex), Com2 (mRNA), and Prod (productive‐only sequences from the KSU data set) data sets. The highest percent‐of‐repertoire values are dark red, while the lowest are white. Black boxes represent no detected reads (true zero). Rounded zeros are represented as 0.0.
Unique CDR3 sequences in the KSU, Com1, and Com2 data sets
| | KSU | Com1 (mRNA‐MMLV‐Hex) | Com2 (mRNA) |
|---|---|---|---|
| Read count | 11 200 | 1 035 461 | 637 214 |
| Unique CDR3 sequences | 6668 | 180 266 | 146 231 |
Sequencing data set.
Total number of reads obtained per data set.
Total number of unique CDR3 AA sequences.
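As a worked piece of arithmetic on the table above, the sketch below relates the number of unique CDR3 sequences to the read count in each data set. The ratio is a derived, illustrative quantity that the source does not report, and the helper name `unique_cdr3_per_100_reads` is hypothetical.

```python
# Values copied from the table above: (read count, unique CDR3 AA sequences).
DATA_SETS = {
    "KSU": (11_200, 6_668),
    "Com1 mRNA-MMLV-Hex": (1_035_461, 180_266),
    "Com2 mRNA": (637_214, 146_231),
}

def unique_cdr3_per_100_reads(read_count, unique_cdr3):
    """Unique CDR3 sequences per 100 reads (derived for illustration only)."""
    return 100.0 * unique_cdr3 / read_count

RATIOS = {
    name: unique_cdr3_per_100_reads(reads, unique)
    for name, (reads, unique) in DATA_SETS.items()
}

for name, ratio in RATIOS.items():
    print(f"{name}: {ratio:.1f} unique CDR3s per 100 reads")
```

The read depths differ by roughly two orders of magnitude, so these ratios are not a like-for-like comparison of diversity. They only show that the absolute number of unique CDR3s is far lower in the unamplified KSU data set even though a larger share of its reads carry a distinct CDR3.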
Figure 3. Overlap of CDR3 sequence detection between technical replicates. CDR3 amino acid sequences were compared between technical replicates. Sequences unique to one data set are displayed in the outer circles. Sequences shared between data sets are in the overlap. Percent of shared CDR3 sequences is displayed in parentheses in the outer circles. (A) KSU data sets 32‐1 and 32‐2. (B) KSU data sets 39‐1 and 39‐2. (C) Com1 data sets mRNA‐MMLV‐Hex and mRNA‐MMLV‐dT. (D) Com2 data sets mRNA and TRNA.
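The overlap counts behind a two-circle comparison like Figure 3 can be sketched with plain set operations. This is an illustration under stated assumptions, not the authors' code: sequences are compared as exact amino acid strings, and the percentage for each replicate is taken as the share of that replicate's unique CDR3s that also appear in the other. The function name and the toy sequences are hypothetical.

```python
def cdr3_overlap(rep_a, rep_b):
    """Count CDR3 AA sequences unique to each replicate and shared by both."""
    set_a, set_b = set(rep_a), set(rep_b)
    shared = set_a & set_b

    def pct(part, whole):
        # Guard against an empty replicate.
        return 100.0 * part / whole if whole else 0.0

    return {
        "only_a": len(set_a - shared),
        "only_b": len(set_b - shared),
        "shared": len(shared),
        "pct_a_shared": pct(len(shared), len(set_a)),
        "pct_b_shared": pct(len(shared), len(set_b)),
    }

# Toy CDR3 amino acid sequences (hypothetical, for illustration only).
replicate_a = ["CARGYW", "CARDYW", "CTRGFAYW", "CARGYW"]
replicate_b = ["CARGYW", "CAKDYW"]

overlap = cdr3_overlap(replicate_a, replicate_b)
print(overlap)
```

Converting each replicate to a set first means a clone is counted once regardless of its read count, which matches a comparison of unique sequences and ignores abundance.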
Figure 4. CDR3 sequence capture among Com1, Com2, and KSU data sets. CDR3 amino acid sequences were compared among the Com1 mRNA‐MMLV‐Hex, Com2 mRNA, and the KSU data sets. Percent of the repertoire shared with at least one other data set is listed in parentheses.
Figure 5. High‐frequency CDR3s detected among the Com1, Com2, and KSU data sets. The top 25 CDR3s from each data set (48 total) were compiled and percent of repertoire compared. Black boxes represent no detected reads (true zero). Rounded zeros are represented as 0.0.
CDR3 AA sequence frequencies in the whole and unique repertoire
| Repertoire sampled | Whole repertoire | Whole repertoire | Whole repertoire | Unique repertoire | Unique repertoire | Unique repertoire |
|---|---|---|---|---|---|---|
| Data set | KSU | Com1 | Com2 | KSU | Com1 | Com2 |
| Minimum | 0.008758 | 0.000446 | 0.000221 | 0.008758 | 0.000446 | 0.000221 |
| Maximum | 4.165500 | 0.216969 | 2.259680 | 0.035032 | 0.005804 | 0.026044 |
| Average | 0.014997 | 0.000555 | 0.00684 | 0.008929 | 0.000478 | 0.00290 |
Repertoire sampled (whole repertoire, all sequences; unique repertoire, sequences unique to a single analyzed repertoire).
Minimum or maximum frequency in the repertoire.