Corey T Watson, Jacob Glanville, Wayne A Marasco.
Abstract
Antibodies (Abs) produced by immunoglobulin (IG) genes are the most diverse proteins expressed in humans. While part of this diversity is generated by recombination during B-cell development and mutations during affinity maturation, the germ-line IG loci are also diverse across human populations and ethnicities. Recently, proof-of-concept studies have demonstrated genotype-phenotype correlations between specific IG germ-line variants and the quality of Ab responses during vaccination and disease. However, the functional consequences of IG genetic variation in Ab function and immunological outcomes remain underexplored. In this opinion article, we outline interconnections between IG genomic diversity and Ab-expressed repertoires and structure. We further propose a strategy for integrating IG genotyping with functional Ab profiling data as a means to better predict and optimize humoral responses in genetically diverse human populations, with immediate implications for personalized medicine.
Year: 2017 PMID: 28539189 PMCID: PMC5656258 DOI: 10.1016/j.it.2017.04.003
Source DB: PubMed Journal: Trends Immunol ISSN: 1471-4906 Impact factor: 16.687
Figure 1. A Basic Overview of Key Elements That Contribute to the Diversity of Naïve and Memory Repertoires. A basic schematic of the germ-line IGH locus is shown (not to scale), consisting of clusters of tandemly arrayed IGH V, D, J, and constant (C) gene segments. For a subset of these segments, multiple alleles are shown, representing population-level ‘allelic diversity’ (see Table 1). During the initial formation of the naïve repertoire, single IGH V, D, and J gene segments on one of two chromosomes in a given B cell are somatically recombined; at each of these steps, P and N nucleotides are added at the D–J and V–D junctions (‘junctional diversity’), respectively. This process, known as V(D)J rearrangement, is the basis for ‘combinatorial diversity’. The recombined V (red), D (orange), and J (maroon) segments are then transcribed and, following splicing, paired with a C gene (gray). The somatic recombination process also occurs at one of two loci encoding the Ab light-chain gene segments (IGK and IGL), except that it involves only V (yellow), J (maroon), and C (light gray) gene segments. Two identical heavy chains and two identical light chains are ultimately paired through disulfide bonds to form a functional Ab; thus, additional diversity in the expressed Ab repertoire comes from ‘heavy- and light-chain pairing’. Together, the V, D, and J segments depicted comprise the variable domain of the heavy chain of a functional antibody, and together with the variable domain of the light chain, encoded by V and J segments, are responsible for Ag binding. The C domains of both heavy and light chains provide structural and/or effector functions of the Ab. As shown here for the heavy chain, the variable domain is partitioned into four framework regions (FRs) and three complementarity-determining regions (CDRs).
Following Ag stimulation, ‘somatic hypermutations’ introduce additional variation in the variable domain of the Ab (vertical purple bars), with the aim of improving binding affinity. Mutations that arise via SHM can occur across all FRs and CDRs, but they are most prevalent in CDRs, as illustrated by the hypothetical frequency histogram shown between the unmutated and mutated IG heavy-chain RNA. While the general molecular mechanisms outlined here have long been recognized as the primary determinants of diversity within a given expressed Ab repertoire, there is a growing appreciation for the contribution of ‘allelic diversity’ as well, particularly as this pertains to repertoire differences observed between unrelated individuals. Ab, antibody; Ag, antigen; C, constant; D, diversity; IGH, immunoglobulin heavy-chain locus; IGK, immunoglobulin kappa; IGL, immunoglobulin lambda; J, joining; SHM, somatic hypermutation; V, variable.
Table 1. Allelic, Copy Number, and Amino Acid Variation for IG Functional and Open Reading Frame Genes Cataloged in IMGT^a
| Family | Genes | Alleles | NS variants | S variants | CDR-H1 NS variants | CDR-H2 NS variants | Genes in CNV |
|---|---|---|---|---|---|---|---|
| IGHV1 | 11 | 40 | 19 | 13 | 2 | 3 | 6 |
| IGHV2 | 4 | 23 | 26 | 9 | 3 | 1 | 1 |
| IGHV3 | 27 | 109 | 82 | 57 | 9 | 17 | 12 |
| IGHV4 | 10 | 78 | 92 | 71 | 11 | 8 | 8 |
| IGHV5 | 2 | 9 | 4 | 4 | 0 | 0 | 1 |
| IGHV6 | 1 | 2 | 0 | 1 | 0 | 0 | 0 |
| IGHV7 | 2 | 6 | 4 | 0 | 0 | 0 | 1 |
| Subtotal | 58 | 267 | 227 | 155 | 25 | 29 | 29 |
| IGKV1 | 20 | 35 | 33 | 17 | 4 | 1 | 1 |
| IGKV2 | 11 | 18 | 14 | 4 | 1 | 1 | 0 |
| IGKV3 | 8 | 18 | 24 | 9 | 2 | 1 | 0 |
| IGKV4 | 1 | 1 | NA | NA | NA | NA | 0 |
| IGKV5 | 1 | 1 | NA | NA | NA | NA | 0 |
| IGKV6 | 3 | 5 | 2 | 0 | 0 | 0 | 0 |
| IGKV7 | 0 | 0 | NA | NA | NA | NA | 0 |
| Subtotal | 44 | 78 | 73 | 30 | 7 | 3 | 1 |
| IGLV1 | 7 | 12 | 4 | 2 | 0 | 2 | 1 |
| IGLV2 | 6 | 20 | 13 | 8 | 2 | 3 | 0 |
| IGLV3 | 11 | 18 | 14 | 5 | 3 | 3 | 0 |
| IGLV4 | 3 | 6 | 2 | 1 | 0 | 0 | 0 |
| IGLV5 | 5 | 10 | 3 | 2 | 0 | 0 | 1 |
| IGLV6 | 1 | 2 | 2 | 0 | 0 | 0 | 0 |
| IGLV7 | 2 | 3 | 1 | 0 | 0 | 0 | 0 |
| IGLV8 | 1 | 3 | 1 | 1 | 0 | 0 | 1 |
| IGLV9 | 1 | 3 | 0 | 2 | 0 | 0 | 0 |
| IGLV10 | 1 | 3 | 4 | 1 | 1 | 0 | 0 |
| IGLV11 | 1 | 2 | 1 | 1 | 0 | 0 | 0 |
| Subtotal | 39 | 82 | 45 | 23 | 6 | 8 | 3 |
| Total | 141 | 427 | 345 | 208 | 38 | 40 | 33 |
^a Data accessed from IMGT, February 2017. NS, nonsynonymous; S, synonymous.
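As a rough illustration of how the subtotal rows of the table can be compared across loci, the sketch below computes the average number of cataloged alleles per gene and the ratio of nonsynonymous to synonymous variants. The numbers are copied from the Subtotal rows above; the dictionary layout and the `summarize` helper are illustrative choices and do not come from the article.

```python
# Subtotal rows copied from the table above (IMGT, accessed February 2017).
# The key names and the helper below are illustrative, not from the article.
SUBTOTALS = {
    "IGHV": {"genes": 58, "alleles": 267, "ns": 227, "s": 155},
    "IGKV": {"genes": 44, "alleles": 78, "ns": 73, "s": 30},
    "IGLV": {"genes": 39, "alleles": 82, "ns": 45, "s": 23},
}


def summarize(subtotals):
    """Return alleles-per-gene and the NS:S variant ratio for each locus."""
    summary = {}
    for locus, row in subtotals.items():
        summary[locus] = {
            "alleles_per_gene": row["alleles"] / row["genes"],
            "ns_to_s": row["ns"] / row["s"],
        }
    return summary


if __name__ == "__main__":
    for locus, stats in summarize(SUBTOTALS).items():
        print(
            f"{locus}: {stats['alleles_per_gene']:.2f} alleles/gene, "
            f"NS:S = {stats['ns_to_s']:.2f}"
        )
```

On these subtotals, the heavy-chain V genes carry roughly 4.6 cataloged alleles per gene, compared with about 1.8 for kappa and 2.1 for lambda.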
Figure I. Impacts of IG Germ-Line Polymorphism on Ab Repertoire/Structural Diversity. (A) Hypothetical examples of associations between IG gene region CNV (V gene 1 insertion/deletion) and SNP (noncoding regulatory variant, A/C) genotypes and V gene usage frequencies in the expressed Ab repertoire. (B) Violin plot showing nonsynonymous polymorphism rates in CDR positions with high (>0.6; ‘high’, blue) or low (<0.25; ‘low’, red) frequency of contact with antigen, as labeled on the X axis. The Y axis records, for each CDR-H1 and CDR-H2 position, the number of IMGT IGHV genes that have alleles with nonsynonymous polymorphisms at that position. The positional probability of antigen contact was calculated for each CDR position as the percentage of 150 crystal structures of antibody–antigen complexes from the Protein Data Bank (PDB) in which any atom of that residue is within 5 Å of any antigen atom. Allelic variation is enriched in antigen-contact sites, in that the number of IGHV genes with alleles containing nonsynonymous polymorphisms is greater for high contact probability positions. (C) Genotype frequency differences between five human ethnic groups [Africans (AFR); East Asians (EAS); South Asians (SAS); Central/South Americans (AMR); and Europeans (EUR)], published by the 1000 Genomes Project [80]*, at two SNPs in IGHV1-69 that have been shown to encode functional residues critical for neutralizing Abs against the influenza HA stem (F54 and L54 amino acid-associated alleles; SNP rs55891010; left panel), and the ‘NEAT2’ domain of Staphylococcus aureus (R50 and G50 alleles; SNP rs11845244; right panel). In the left panel, the F allele encodes the functionally critical phenylalanine residue, and in the right panel, the primary glycine residue is encoded by the G allele.
Interestingly, in both cases, the frequency of individuals lacking alleles encoding the critical residues varies among populations, with the L/L and R/R genotypes showing the lowest frequencies in Africans, and the highest frequencies in South Asians. rs55891010 and rs11845244 are in linkage disequilibrium, and thus R50 and L54 amino acids (and likewise, G50 and F54) tend to co-occur in alleles of IGHV1-69. This explains similarities in genotype frequency estimates between the two SNPs in each population. *Although these genotypes may contain errors because CNV information is not represented, they can provide insight into potential population differences. Ab, antibody; CDR, complementarity-determining region; CNV, copy number variation; HA, hemagglutinin; IG, immunoglobulin; IMGT, ImMunoGeneTics information system database; SNP, single nucleotide polymorphism.
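The positional contact probability described for panel B can be sketched as below. This is a minimal illustration, assuming each structure has already been reduced to lists of atom coordinates. The dictionary keys, the position label `"H2:54"`, and the function names are hypothetical. The caption does not say whether the 5 Å cutoff is inclusive or how positions missing from a structure were handled, so both choices here are assumptions.

```python
import math

CONTACT_CUTOFF = 5.0  # angstroms, the cutoff stated in the caption


def residue_contacts_antigen(residue_atoms, antigen_atoms, cutoff=CONTACT_CUTOFF):
    """True if any residue atom lies within `cutoff` of any antigen atom.

    The comparison is inclusive (<=); that detail is an assumption.
    """
    return any(
        math.dist(r, a) <= cutoff
        for r in residue_atoms
        for a in antigen_atoms
    )


def contact_frequency(structures, position):
    """Fraction of structures in which `position` contacts antigen.

    `structures` is a list of dicts with hypothetical keys:
      "cdr":     {position: [(x, y, z), ...]}  atoms per CDR position
      "antigen": [(x, y, z), ...]              all antigen atoms
    A structure lacking the position counts as "no contact" (assumption).
    """
    if not structures:
        raise ValueError("need at least one structure")
    hits = 0
    for s in structures:
        atoms = s["cdr"].get(position, [])
        if residue_contacts_antigen(atoms, s["antigen"]):
            hits += 1
    return hits / len(structures)


def classify(freq, high=0.6, low=0.25):
    """Bin a position using the thresholds given in the caption."""
    if freq > high:
        return "high"
    if freq < low:
        return "low"
    return "intermediate"


if __name__ == "__main__":
    # Two toy "structures" with invented coordinates, for illustration only.
    toy = [
        {"cdr": {"H2:54": [(0.0, 0.0, 0.0)]}, "antigen": [(3.0, 0.0, 0.0)]},
        {"cdr": {"H2:54": [(0.0, 0.0, 0.0)]}, "antigen": [(9.0, 0.0, 0.0)]},
    ]
    freq = contact_frequency(toy, "H2:54")
    print(freq, classify(freq))
```

A real analysis would parse coordinates from PDB files and map residues onto a common CDR numbering scheme. Both steps are left out here because the caption does not describe how they were done.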
Figure 2. Key Figure: A New Paradigm for Integrating Genotypic Information into the Study of the Ab-Mediated Response in Disease and Clinical Phenotypes.
In the proposed paradigm, a population cohort is partitioned into subgroups based on functional genotypes/haplotypes that are directly associated with subgroup-specific signatures in the expressed repertoire and other relevant phenotypes (e.g., Ab titer; clinical outcome) associated with the Ab response to a given antigen/epitope. This partitioning can be used to inform tailored clinical care and treatment (e.g., vaccination regime). Ab, antibody.
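The partitioning step in this paradigm could look roughly like the sketch below, which groups individuals by genotype and summarizes one phenotype within each subgroup. All records, field names, and titer values are invented placeholders. The article does not specify a data format or a statistical method, and a real study would need proper association testing rather than a simple per-group mean.

```python
from collections import defaultdict
from statistics import mean


def partition_by_genotype(cohort, genotype_key="genotype"):
    """Group cohort records (dicts) by their genotype label."""
    groups = defaultdict(list)
    for person in cohort:
        groups[person[genotype_key]].append(person)
    return dict(groups)


def summarize_phenotype(groups, phenotype_key="titer"):
    """Mean phenotype value within each genotype subgroup."""
    return {
        genotype: mean(p[phenotype_key] for p in members)
        for genotype, members in groups.items()
    }


if __name__ == "__main__":
    # Placeholder records: genotype labels follow the F/L naming used for
    # IGHV1-69 position 54 in Figure I; the titer numbers are arbitrary
    # and imply nothing about real genotype-phenotype relationships.
    cohort = [
        {"id": "s1", "genotype": "F/F", "titer": 4.0},
        {"id": "s2", "genotype": "F/F", "titer": 2.0},
        {"id": "s3", "genotype": "F/L", "titer": 3.0},
        {"id": "s4", "genotype": "L/L", "titer": 1.0},
    ]
    groups = partition_by_genotype(cohort)
    for genotype, avg in summarize_phenotype(groups).items():
        print(f"{genotype}: n={len(groups[genotype])}, mean titer={avg}")
```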