Dongni Hou, Cuicui Chen, Eric John Seely, Shujing Chen, Yuanlin Song.
Abstract
The selectivity of the adaptive immune response is based on the enormous diversity of T and B cell antigen-specific receptors. The immune repertoire, the collection of T and B cells with functional diversity in the circulatory system at any given time, is dynamic and reflects the essence of immune selectivity. In this article, we review recent advances in immune repertoire studies of infectious diseases, achieved with both traditional techniques and high-throughput sequencing (HTS) techniques. HTS techniques enable the determination of the complementarity-determining regions of lymphocyte receptors with unprecedented efficiency and scale. This progress in methodology enhances the understanding of immunologic changes during pathogen challenge and also provides a basis for further development of novel diagnostic markers, immunotherapies, and vaccines.
Keywords: bioinformatics; high-throughput sequencing; immune repertoire; infection; lymphocyte
Year: 2016 PMID: 27630639 PMCID: PMC5005336 DOI: 10.3389/fimmu.2016.00336
Source DB: PubMed Journal: Front Immunol ISSN: 1664-3224 Impact factor: 7.561
Figure 1. Process of generating a diverse B cell repertoire. The structure of each heavy chain (left) originates from rearrangement of Variable (V), Diversity (D), and Joining (J) gene segments. Recombination occurs first between the D and J segments, and then between the V segment and the joined D-J segment. Along with the selection of gene segments, insertion and deletion of nucleotides at the junctions between segments provide the initial diversity of the primary BCR repertoire. In comparison, the light chain (right) is formed from only two segments (V and J), which makes the light chain less diverse. After the B cell encounters its cognate antigen, somatic hypermutation introduces point mutations into the framework regions and complementarity-determining regions of BCRs. This process further diversifies the repertoire and generates BCRs with higher affinity.
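The rearrangement order described in the caption (D-J joining first, then V to D-J, with nucleotide deletion and insertion at each junction) can be illustrated with a toy simulation. This is a minimal sketch only: the segment names and sequences are hypothetical placeholders rather than real germline genes, the deletion and insertion limits are arbitrary assumptions, and somatic hypermutation is not modelled.

```python
import random

# Hypothetical toy segments (placeholders, not real IGH germline sequences).
V_SEGMENTS = {"V1": "CAGGTGCAGCTG", "V2": "GAGGTGCAGCTT"}
D_SEGMENTS = {"D1": "GGTATAGC", "D2": "GACTACGG"}
J_SEGMENTS = {"J1": "TACTTTGACTAC", "J2": "AACTGGTTCGAC"}


def trim(seq, max_del, rng, end):
    """Delete 0..max_del nucleotides from one end, always keeping at least one."""
    n = rng.randint(0, min(max_del, len(seq) - 1))
    if n == 0:
        return seq
    return seq[:-n] if end == "3p" else seq[n:]


def random_insert(max_ins, rng):
    """Return a random non-templated insertion of 0..max_ins nucleotides."""
    return "".join(rng.choice("ACGT") for _ in range(rng.randint(0, max_ins)))


def recombine_heavy(rng, max_del=3, max_ins=4):
    """Simulate one heavy-chain rearrangement: D-J first, then V to D-J."""
    v_name = rng.choice(sorted(V_SEGMENTS))
    d_name = rng.choice(sorted(D_SEGMENTS))
    j_name = rng.choice(sorted(J_SEGMENTS))
    # Step 1: join D and J, trimming both junction ends and adding an insertion.
    dj = (trim(D_SEGMENTS[d_name], max_del, rng, "3p")
          + random_insert(max_ins, rng)
          + trim(J_SEGMENTS[j_name], max_del, rng, "5p"))
    # Step 2: join V to the D-J product in the same way.
    vdj = (trim(V_SEGMENTS[v_name], max_del, rng, "3p")
           + random_insert(max_ins, rng)
           + trim(dj, max_del, rng, "5p"))
    return (v_name, d_name, j_name), vdj


rng = random.Random(0)
repertoire = [recombine_heavy(rng) for _ in range(200)]
segment_combinations = {segments for segments, _ in repertoire}
distinct_sequences = {seq for _, seq in repertoire}
```

With only 2 × 2 × 2 = 8 possible segment combinations, the number of distinct sequences in `distinct_sequences` is much larger than the number of entries in `segment_combinations`, which reflects the contribution of junctional insertions and deletions to primary diversity.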
High-throughput sequencing platforms used for Lymphocyte Repertoire Study.
| HTS platform | Mechanism | Read depth | Read length | Typical throughput | Covered region | Application | Accuracy |
|---|---|---|---|---|---|---|---|
| Roche 454 GS FLX Titanium XLR70 | Pyro-sequencing | ~1 million | Up to 600 bp | 450 Mb | FR1-C region | BCR (long read), TCR (long read) | Consensus accuracy 99.995% |
| Illumina HiSeq | Dye terminator sequencing | ~2 billion | 2 × 250 bp | 500–1000 Gb | FR3-C region | TCR (CDR3), BCR (long read), TCR (long read) | >80% bases above Q30 |
| Illumina MiSeq | Dye terminator sequencing | ~25 million | 2 × 150 bp; 2 × 300 bp | 13.2–15 Gb | FR1-C region | TCR (CDR3) | >80% bases higher than Q30 at 2 × 150 bp; >75% bases higher than Q30 at 2 × 250 bp |
| Ion Torrent/Life Technologies | Semiconductor sequencing | ~1 billion | Up to 200 bp | 30 Mb–2 Gb | FR3-C region | TCR | >97% bases |
Annotation and integrated bioinformatic tools for Lymphocyte Repertoire Data Analysis.
| Bioinformatic tools | Models | Key feature | Limitation | Cases of infection study | Reference |
|---|---|---|---|---|---|
| IMGT/HighV-QUEST | – | Designed for NGS data; most commonly used | Online version; results sent to user after 1–2 weeks | MERS-CoV ( | ( |
| iHMMune-Align | Hidden Markov model (HMM) | Most probable gene segment; accurate IGHD gene identification | Nucleotide insertions or deletions at the gene junctions but not within germline genes | Influenza vaccine ( | ( |
| SoDA2 | Hidden Markov model (HMM) | Probability model; ≥1 probable rearrangement candidates | – | HIV ( | ( |
| IgBlast | BLAST algorithm | Open source; gene databases search simultaneously | Relatively low throughput (<1000 per batch) | Dengue virus ( | ( |
| Decombinator | Aho–Corasick algorithm | Increased speed | Better matches between query sequence and target sequence needed; only for TCR | ( | |
| Change-O | Integrated | Functions unique to BCR analysis (somatic hypermutation analysis, quantifying selection pressure, and calculating sequence chemical properties) | – | West Nile Virus ( | ( |
| LymAnalyzer | Integrated | Accurate and complete assignment; polymorphism and SHM analysis | – | – | ( |
| tcR | Integrated | Based on R; diversity calculation; repertoire comparison; visualization | For output of MiTCR | – | ( |
| iMonitor | Integrated | Re-alignment to improve accuracy; visualization; PCR and sequencing errors correction | No polymorphism and SHM analysis | – | ( |
| MiTCR | Integrated | Accurate and higher speed; PCR and sequencing errors correction; sequence quality filter; low quality sequencing data rescue | Strongly related to sequencing quality; only for TCR | CMV ( | ( |
| ImmunExplorer (IMEX) | Integrated | Simple statistical analysis and visualization; repertoire comparison | For IMGT/HighV-QUEST outputs | – | ( |
| VDJtools | Integrated | Diversity assessment, clustering analysis, CDR3 region analysis, and data visualization; simple and user-friendly | Only for TCR; compatible annotation tools needed | – | ( |
| sciReptor | Integrated | Single-cell sequencing data and HTS data; supports flow cytometry index data; clustering and SHM analysis | – | – | ( |
| Tool for Ig genotype elucidation | | Novel V alleles identification; personalized germline database construction | Complementary tools for IMGT; only for novel alleles not distantly related; only for TCR | | ( |
Statistical algorithms and models for Lymphocyte Repertoire Data Analysis.
| Statistical methods | Functions | Reference |
|---|---|---|
| Maximum entropy models | Statistical properties of the repertoire | ( |
| Capture–recapture analysis | Diversity assessment | ( |
| Shannon index | Diversity assessment | ( |
| Gini–Simpson index | Diversity assessment | ( |
| Chao1 algorithm | Richness (maximum number of clones) | ( |
| Abundance-based coverage estimator | Richness (maximum number of clones) | ( |
| Chao2 | Lower bound of species richness | ( |
| Wu–Kabat index | | |
| Poisson abundance model | Diversity and clone distribution estimation | ( |
| Morisita–Horn similarity index | Repertoire similarity comparison | ( |
| Clonality score | Convergence-level estimation | ( |
| Hidden Markov model | V and J assignment (probability based) | ( |
| Aho–Corasick algorithm | V and J assignment | ( |
| Fast-tag-searching algorithm | V and J assignment (Hamming distance based) | ( |
| Single nucleotide polymorphism (SNP) calling algorithm | Identify novel VDJ gene alleles | ( |
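Several of the diversity and similarity measures listed in the table can be computed directly from clonotype counts. The following is a minimal sketch using the standard textbook definitions of the Shannon index, the Gini–Simpson index, Chao1, and the Morisita–Horn index; it is not taken from any of the tools above, and the CDR3 strings and counts in the example are made up for illustration.

```python
import math
from collections import Counter


def shannon_index(counts):
    """Shannon entropy H = -sum(p_i * ln p_i) over clone frequencies."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def gini_simpson_index(counts):
    """Gini-Simpson index 1 - sum(p_i^2): chance two random reads differ in clone."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Chao1 richness estimate from singletons (f1) and doubletons (f2)."""
    observed = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    if f2 == 0:
        # Bias-corrected form used when no doubletons are observed.
        return observed + f1 * (f1 - 1) / 2.0
    return observed + f1 * f1 / (2.0 * f2)


def morisita_horn(a, b):
    """Morisita-Horn similarity between two clonotype -> count mappings."""
    na, nb = sum(a.values()), sum(b.values())
    shared = sum(a[k] * b[k] for k in a.keys() & b.keys())
    da = sum(v * v for v in a.values()) / (na * na)
    db = sum(v * v for v in b.values()) / (nb * nb)
    return 2.0 * shared / ((da + db) * na * nb)


# Made-up CDR3 amino acid sequences standing in for annotated reads.
sample_a = Counter(["CASSL", "CASSL", "CASSL", "CASSP", "CASRG", "CASRG", "CATQE"])
sample_b = Counter(["CASSL", "CASSL", "CAWSV", "CAWSV", "CATQE"])

counts_a = list(sample_a.values())
diversity_a = {
    "shannon": shannon_index(counts_a),
    "gini_simpson": gini_simpson_index(counts_a),
    "chao1": chao1(counts_a),
}
overlap_ab = morisita_horn(sample_a, sample_b)
```

For `sample_a` (clone sizes 3, 2, 1, 1), Chao1 adds f1²/(2·f2) = 4/2 = 2 unseen clones to the 4 observed, giving an estimate of 6. Real repertoire data would normally be collapsed into clonotypes by an annotation tool before such indices are calculated, and sequencing depth should be normalized before samples are compared.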