Deborah Dunn-Walters, Catherine Townsend, Emma Sinclair, Alex Stewart.
Abstract
The human immunoglobulin repertoire is a hugely diverse set of sequences that are formed by processes of gene rearrangement, heavy and light chain gene assortment, class switching and somatic hypermutation. Early B cell development produces diverse IgM and IgD B cell receptors on the B cell surface, resulting in a repertoire that can bind many foreign antigens but which has had self-reactive B cells removed. Later antigen-dependent development processes adjust the antigen affinity of the receptor by somatic hypermutation. The effector mechanism of the antibody is also adjusted, by switching the class of the antibody from IgM to one of seven other classes depending on the required function. There are many instances in human biology where positive and negative selection forces can act to shape the immunoglobulin repertoire and therefore repertoire analysis can provide useful information on infection control, vaccination efficacy, autoimmune diseases, and cancer. It can also be used to identify antigen-specific sequences that may be of use in therapeutics. The juxtaposition of lymphocyte development and numerical evaluation of immune repertoires has resulted in the growth of a new sub-speciality in immunology where immunologists and computer scientists/physicists collaborate to assess immune repertoires and develop models of immune action.
Keywords: B cell; antibody; human; repertoire
Year: 2018 PMID: 29944755 PMCID: PMC6033188 DOI: 10.1111/imr.12659
Source DB: PubMed Journal: Immunol Rev ISSN: 0105-2896 Impact factor: 12.988
Figure 1. (a) Variable (V), Diversity (D) and Joining (J) gene segments are arranged in a non‐functional state in the germline. During V(D)J recombination, a V, a D and a J gene segment (just V and J in the case of light chains) are brought together at random. RSS sequences ensure gene segments are recombined in the correct order to form a functional variable region sequence. Blue, orange and purple rectangles represent V, D, and J gene segments, respectively, with gray leader regions upstream of the V genes. Turquoise and red triangles represent 12RSS and 23RSS, respectively. Constant region exons are represented by green rectangles. (b) Functional variable regions are composed of four conserved structural framework regions (FR) and three more diverse complementarity determining regions (CDR). The CDR3 regions are the most diverse as they span multiple gene segments and contain random nucleotide addition. (c) The CDR loops make the most contact with antigen (PDB ID: 1FVC).
Comparison of next generation sequencing methods
| | Illumina (300 bp paired end) | Pacific Biosciences RSII (per SMRT cell) | 454 (GS‐FLX Titanium) |
|---|---|---|---|
| Maximum read length | 2 × 300 bp | >60 000 bp (10 000 bp average) | 700‐800 bp |
| Reads per run | 44‐50 million (Minimum) | 55 000 | ~1 million |
| Output | 13.2‐15 Gb per run | 1‐2 Gb per day | 0.7 Gb |
| Bioinformatics analysis | Some assembly required | Simple | Simple |
| Ig Class | Generally limited to class only | Subclass possible | Subclass possible |
| QC issues | 2 μg of amplicons required | | |
| Time of run | ~65 h | ~6 h per SMRT cell | ~24 h |
| Cost | US$1400 | US$400 per cell | US$6000 |
| Quality | Q20‐Q30 | Q50 | Q30 |
The newer, but less available, Sequel by Pacific Biosciences is capable of producing ~330 000 reads but at nearly double the cost per cell.
Costs are taken from a single website (allseq) to avoid differences between providers and assume running at cost. NB: the PacBio RSII takes up to 16 SMRT cells per run, so the cost scales with the number of cells used.
Quality scores reflect the base‐calling accuracy of a run. Q20 corresponds to a probability of 1 incorrect base call in 100 (99% accuracy), Q30 to 1 in 1000 (99.9% accuracy), Q40 to 1 in 10 000 (99.99% accuracy), etc.
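The Q20, Q30 and Q40 values quoted above follow the standard Phred relationship Q = -10 log10(P), where P is the probability of an incorrect base call. A minimal Python sketch of that conversion is given here for illustration; the function names are hypothetical, and the expected-error estimate assumes every base in a read shares the same quality score, which real reads do not.

```python
import math


def phred_to_error_probability(q):
    """Probability of an incorrect base call for Phred score q."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Phred score for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


def expected_errors(q, read_length):
    """Expected number of miscalled bases in a read of read_length bases,
    assuming (as a simplification) a uniform quality score q."""
    return read_length * phred_to_error_probability(q)


if __name__ == "__main__":
    for q in (20, 30, 40, 50):
        p = phred_to_error_probability(q)
        print(f"Q{q}: error probability {p:.5f}, accuracy {100 * (1 - p):.3f}%")
```

Under the uniform-quality simplification, a 300 bp read at Q30 would be expected to carry about 0.3 miscalled bases, and about 3 at Q20.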
Figure 2. Ig gene repertoire variation between individuals, classes of antibody, and IGHV gene families. (a) Individual variability in a human vaccine response. Average clonality of selected IGHV genes in the repertoire of 12 individuals (each is color coded) at day 7 after challenge with influenza and pneumococcal vaccines [156]. Average clonality is the number of sequences divided by the number of clonal families for each individual gene. An average clonality of 1 indicates a lack of clonal expansion. (b) PCA of CDR3 physicochemical properties, as defined by Kidera factors, showing the difference between Ig genes of the IgG1 and IgG2 subclasses. Data from Martin et al [73]. (c) Segregation of IGHV family genes by CDR‐H3 physicochemical properties. Minkowski distance clustering by Brepertoire [146] on IgM sequences from B cells in early development in 12 different individuals [76]. Each sample is a separate individual. IGHV genes are color coded: yellow, IGHV2; red, IGHV1; green, IGHV3; blue, IGHV4; violet, IGHV5; gray, IGHV6.
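The average clonality measure defined in the Figure 2 caption (number of sequences divided by number of clonal families, per IGHV gene) can be sketched in Python as follows. This is an illustration only: it assumes each sequence has already been assigned an IGHV gene and a clone identifier by some upstream clustering step, and the function name, input layout and example records are hypothetical, not taken from the study.

```python
from collections import defaultdict


def average_clonality(records):
    """Return {ighv_gene: sequences / clonal families}.

    records: iterable of (ighv_gene, clone_id) tuples, one per sequence.
    Clone assignment is assumed to have been done beforehand.
    """
    sequence_counts = defaultdict(int)
    clone_ids = defaultdict(set)
    for gene, clone_id in records:
        sequence_counts[gene] += 1
        clone_ids[gene].add(clone_id)
    return {
        gene: sequence_counts[gene] / len(clone_ids[gene])
        for gene in sequence_counts
    }


if __name__ == "__main__":
    # Made-up records for illustration, not data from the paper.
    example = [
        ("IGHV1-69", "clone_a"),
        ("IGHV1-69", "clone_a"),
        ("IGHV1-69", "clone_b"),
        ("IGHV3-23", "clone_c"),
    ]
    print(average_clonality(example))
```

In the made-up example, IGHV1-69 has three sequences in two clonal families (average clonality 1.5, indicating some expansion), while IGHV3-23 has one sequence in one family (average clonality 1, no expansion).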
The costs of running some of the more prominent single‐cell technologies. Note that prices are estimates and may vary with supplier, exchange rate and quantity purchased. None of these costs include sequencing; see Table 1.
| | Drop‐Seq (scRNA‐Seq) | 10x Genomics (scRNA‐Seq) | Smart‐seq (scRNA‐Seq) | Overlap‐extension (paired heavy‐light chain) | 10x Genomics (paired heavy‐light chain) |
|---|---|---|---|---|---|
| Equipment cost | US$50 000‐65 000 | US$75 000 | N/A | US$55 000 | US$75 000 |
| Per run cost | US$500‐700 | US$1288 | US$1000 | US$400‐500 | US$1288 |
| Cells per run | ~10 000 | 100‐100 000 | 96‐384 | 100 000‐150 000 | 100‐100 000 |
| Estimated time to process a run (h) | 24‐48 | 24‐48 | 48‐72 | 24‐48 | 24‐48 |
| Capture efficiency | 5‐10% | 65% | 100% | >90% | 65% |
Although Smart‐seq does not require any specialized equipment it does require the ability to sort cells into 96 or 384 well plates.
This cost is based on an ‘off the shelf’ model although methods exist for self‐assembly. For Drop‐Seq and Ig pairing by overlap extension we have used Dolomite Bio as our reference. In this case as well, buying the equipment for one method will reduce the equipment purchase price for the other as parts are interchangeable.
The 10X system uses the same machine for both methods. Note that the system will also perform both scSeq and paired heavy light chain from the same sample for US$65 more and TCR on top of that at an additional US$65.