Stéphane Buhler, José Manuel Nunes, Alicia Sanchez-Mazas.
Abstract
The main function of HLA class I molecules is to present pathogen-derived peptides to cytotoxic T lymphocytes. This function is assumed to drive the maintenance of an extraordinary amount of polymorphism at each HLA locus, providing an immune advantage to heterozygous individuals, who can present larger repertories of peptides than homozygotes. This seems to contradict, however, the reduced diversity at individual HLA loci exhibited by some isolated populations. This study shows that the level of functional diversity predicted for the two genes HLA-A and HLA-B considered simultaneously is similar (almost invariant) among 46 human populations, even when diversity is reduced at each locus. We thus propose that HLA-A and HLA-B evolved through a model of joint divergent asymmetric selection, conferring on all populations an equivalent immune potential. The distinct pattern observed for HLA-C is explained by its functional evolution towards killer cell immunoglobulin-like receptor (KIR) activity regulation rather than peptide presentation.
Keywords: Asymmetric balancing selection; Functional variation; HLA class I polymorphism; Heterozygous advantage; Immune protection; Peptide-binding properties
Year: 2016 PMID: 27233953 PMCID: PMC4911380 DOI: 10.1007/s00251-016-0918-x
Source DB: PubMed Journal: Immunogenetics ISSN: 0093-7711 Impact factor: 2.846
Summary of the population data
| Region | Npop (RGD/SGD) | N | Mean sample size^a |
|---|---|---|---|
| EUR | 5 (0/5) | 1563 | 312.6 (±775.25) |
| NAFR | 1 (0/1) | 230 | 230 (NA) |
| NAME | 1 (1/0) | 149 | 149 (NA) |
| NEASI | 2 (0/2) | 356 | 178 (±36.77) |
| OCE | 4 (4/0) | 399 | 99.75 (±123.96) |
| SAFR | 7 (0/7) | 1225 | 175 (±134.3) |
| SAME | 2 (2/0) | 212 | 106 (±90.51) |
| SEASI^b | 20 (12/8) | 1637 | 81.85 (±103.98) |
| WASI | 4 (0/4) | 323 | 80.75 (±49.33) |
| Total | 46 (19/27) | 6094 | |
Npop number of population samples, SGD slow genetic drift, RGD rapid genetic drift, N number of individuals, NA not available, EUR Europe, NAFR Northern Africa, NAME Northern America, NEASI Northeastern Asia, OCE Oceania, SAFR Sub-Saharan Africa, SAME Southern America, SEASI Southeastern Asia, WASI Western Asia
^a Mean sample size (±2 × standard deviation)
^b Including 15 populations from Taiwan
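The mean sample size column can be read as N divided by Npop for each region, with the last row giving the column totals. The following Python sketch checks that reading against the values in the table; the variable and function names are illustrative and do not come from the source.

```python
# Rows copied from the population summary table:
# region -> (Npop, N, reported mean sample size).
rows = {
    "EUR": (5, 1563, 312.6),
    "NAFR": (1, 230, 230.0),
    "NAME": (1, 149, 149.0),
    "NEASI": (2, 356, 178.0),
    "OCE": (4, 399, 99.75),
    "SAFR": (7, 1225, 175.0),
    "SAME": (2, 212, 106.0),
    "SEASI": (20, 1637, 81.85),
    "WASI": (4, 323, 80.75),
}


def mean_sample_size(npop, n):
    """Mean number of individuals per population sample in a region."""
    return n / npop


# Column totals, to compare with the last row of the table.
total_npop = sum(npop for npop, _, _ in rows.values())
total_n = sum(n for _, n, _ in rows.values())

# Assumption being checked: reported mean == N / Npop for every region.
mismatches = [
    region
    for region, (npop, n, reported) in rows.items()
    if abs(mean_sample_size(npop, n) - reported) > 1e-9
]

print("Total Npop:", total_npop)
print("Total N:", total_n)
print("Regions where N / Npop differs from the reported mean:", mismatches)
```

The ± values in the table are twice the standard deviation of the individual population sample sizes, which are not listed here, so they cannot be recomputed from this summary alone.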
Distribution of polymorphic residues within the PBR
| | | ABC | | | HLA-A | | | HLA-B | | | HLA-C | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | M | NS | S | M | NS | S | M | NS | S | M | NS | S |
| Distribution^a | P | 9 | 25 | 0 | 13 | 20 | 1 | 10 | 24 | 0 | 19 | 15 | 0 |
| | NP | 67 | 64 | 18 | 107 | 32 | 10 | 109 | 33 | 7 | 119 | 22 | 8 |
| Stdres^b | P | −1.97 | 3.22 | −2.13 | −3.72 | 4.36 | −0.83 | −4.83 | 5.5 | −1.29 | −2.93 | 3.85 | −1.38 |
| Chi-square | | 0.0045 | | | 0.0005 | | | 0.0005 | | | 0.0005 | | |
| Distribution^a | B | 19 | 43 | 3 | 29 | 33 | 3 | 28 | 35 | 2 | 45 | 19 | 1 |
| | NB | 57 | 46 | 15 | 91 | 19 | 8 | 91 | 22 | 5 | 93 | 18 | 7 |
| Stdres^b | B | −2.51 | 3.52 | −1.76 | −4.43 | 4.98 | −0.59 | −4.62 | 4.92 | −0.39 | −1.44 | 2.25 | −1.39 |
| Chi-square | | 0.0025 | | | 0.0005 | | | 0.0005 | | | 0.037 | | |
M monomorphic codons, NS codons containing at least one non-synonymous polymorphic site, S codons containing only synonymous polymorphic site(s), P pocket, NP non-pocket, B binding, NB non-binding (see Supplementary material and methods)
^a Number of codons in each category
^b Standardized residuals (stdres) are shown for the P and B categories. The stdres for NP and NB are the same values with the opposite sign to those for P and B, respectively
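The standardized residuals in this table can be illustrated with a short Python sketch. It assumes that "stdres" means the adjusted standardized residuals of a Pearson chi-square test on the 2 × 3 contingency table (rows P and NP, columns M, NS, S); that definition is an assumption, not something stated in the source. The function name `adjusted_residuals` is hypothetical.

```python
from math import sqrt


def adjusted_residuals(table):
    """Adjusted standardized residuals for a contingency table.

    Each residual is (observed - expected) divided by the estimated
    standard error under independence of rows and columns.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            variance = (
                expected
                * (1 - row_totals[i] / n)
                * (1 - col_totals[j] / n)
            )
            res_row.append((observed - expected) / sqrt(variance))
        residuals.append(res_row)
    return residuals


# ABC columns of the pocket / non-pocket block of the table.
# Rows: P, NP. Columns: M, NS, S.
abc_pocket = [[9, 25, 0], [67, 64, 18]]
stdres = adjusted_residuals(abc_pocket)

print("P :", [round(x, 2) for x in stdres[0]])
print("NP:", [round(x, 2) for x in stdres[1]])
```

Under this assumption the P row comes out close to the reported −1.97, 3.22, and −2.13 for ABC, and the NP row is its mirror image, which matches the footnote on opposite values. The chi-square row of the table is not recomputed here because the test procedure behind those values is not described in this excerpt.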
Fig. 1 Box and whisker plots of the entropy (H CODON_MAX) at each of the 183 codons coding for the peptide-binding region (PBR) of each HLA class I locus (HLA-A, HLA-B, and HLA-C): (top) when categorized as pocket codons (P, n = 34) and non-pocket codons (NP, n = 149); (middle) when subdividing the “P” codons among the six PBR pockets (A, B, C, D, E, and F); (bottom) when categorized as binding codons (B, n = 65) and non-binding codons (NB, n = 118). Entropy estimated from the combined sequence alignments of the three loci is also shown (ABC). The boxes correspond to the interquartile range, the thick line inside each box is the median, and the whiskers extend to the most extreme observations lying within 1.5 times the interquartile range of the box. Dots are outliers beyond these limits
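The box and whisker convention described in this caption can be sketched in Python as follows. The input values are hypothetical and are not entropy values from the study, the function name `box_stats` is illustrative, and the quartile method (`"inclusive"`) is an assumption, since plotting tools differ slightly in how they compute quartiles.

```python
from statistics import quantiles


def box_stats(values):
    """Summary statistics for a box plot using the 1.5 x IQR whisker rule."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    # Anything beyond the fences is drawn as a separate dot.
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical per-codon values, chosen only to show one outlier.
example = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 2.5]
stats = box_stats(example)
print(stats)
```

In this toy input the single large value falls beyond the upper fence and would be plotted as a dot, while the upper whisker stops at the largest remaining observation.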
Analysis of the allelic repertories
| Loci | Mean pairwise peptide-binding distances (PPBD) | | | | | | | | Mean pairwise molecular distances (PMD) | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | r | p | r (N) | p (N) | r (k) | p (k) | r (ar) | p (ar) | r (N) | p (N) | r (k) | p (k) | r (ar) | p (ar) |
| ABC | 0.65 | 1.1E-06 | 0.42 | 3.9E-03 | 0.43 | 2.8E-03 | – | – | 0.45 | 1.7E-03 | 0.63 | 2.8E-06 | – | – |
| AB | 0.66 | 7.3E-07 | 0.11 | 0.45 | 0.02 | 0.89 | – | – | 0.34 | 0.02 | 0.43 | 2.5E-03 | – | – |
| AC | 0.6 | 1.2E-05 | 0.49 | 4.8E-04 | 0.39 | 0.01 | – | – | 0.51 | 2.5E-04 | 0.52 | 1.9E-04 | – | – |
| BC | 0.63 | 2.3E-06 | 0.45 | 1.6E-03 | 0.42 | 3.9E-03 | – | – | 0.29 | 0.05 | 0.56 | 5.6E-05 | – | – |
| A | 0.64 | 1.8E-06 | −0.47 | 9.4E-04 | −0.59 | 1.7E-05 | −0.55 | 7.6E-05 | 0.43 | 3.1E-03 | 0.74 | 5.4E-09 | 0.72 | 1.5E-08 |
| B | 0.67 | 3.6E-07 | 0.29 | 0.05 | 0.14 | 0.34 | 0.1 | 0.51 | 0.4 | 0.01 | 0.55 | 6.7E-05 | 0.53 | 0 |
| C | 0.53 | 1.4E-04 | 0.05 | 0.72 | −0.22 | 0.14 | −0.2 | 0.19 | −0.46 | 1.3E-03 | −0.76 | 7.6E-10 | −0.71 | 2.5E-08 |
r correlation coefficient, N sample size, k number of alleles, ar allelic richness (note that it was only possible to estimate this parameter at individual loci and not when using multi-loci data)
Fig. 2 Proportion of homozygotes in 46 human populations at single and multiple loci. The geographic provenance of each population is indicated by a colored dot. Populations are subdivided into rapid genetic drift (RGD, on the left plots) and slow genetic drift (SGD, on the right plots)
Fig. 3 Mean relative gain in peptide binding coverage (RGPBC) and mean relative increase in molecular distance (RIMD) in 46 human populations. Broad geographic regions are indicated by different colors, while demography is indicated by the shape of the dots (a circle for populations characterized by rapid genetic drift (RGD) and a triangle for populations with slow genetic drift (SGD)). Different scales are used on the y-axis for the two measures. Standard deviations of RGPBC and RIMD in the populations are provided as insets and represented with boxplots
General linear regression models for mean relative gain in peptide binding coverage (RGPBC) and mean relative increase in molecular distance (RIMD)
| | Dependent variable | |
|---|---|---|
| | RGPBC | RIMD |
| locusAB | −0.001 | −0.0001 |
| | (0.014) | (0.001) |
| locusAC | −0.057** | −0.004** |
| | (0.014) | (0.001) |
| locusBC | −0.066** | −0.018** |
| | (0.014) | (0.001) |
| locusA | −0.205** | −0.044** |
| | (0.014) | (0.001) |
| locusB | −0.196** | −0.036** |
| | (0.014) | (0.001) |
| locusC | −0.341** | −0.049** |
| | (0.014) | (0.001) |
| demographySGD | 0.029* | 0.001 |
| | (0.013) | (0.001) |
| locusAB:demographySGD | 0.02 | 0.001 |
| | (0.018) | (0.001) |
| locusAC:demographySGD | 0.018 | −0.0003 |
| | (0.018) | (0.001) |
| locusBC:demographySGD | −0.004 | 0.001 |
| | (0.018) | (0.001) |
| locusA:demographySGD | 0.106** | 0.006** |
| | (0.018) | (0.001) |
| locusB:demographySGD | 0.067** | 0.006** |
| | (0.018) | (0.001) |
| locusC:demographySGD | 0.032* | 0.000 |
| | (0.018) | (0.001) |
| Constant | 0.682** | 0.069** |
| | (0.010) | (0.001) |
| Observations | 322 | 322 |
| R² | 0.88 | 0.98 |
| Adjusted R² | 0.87 | 0.98 |
| Residual std. error | 0.04 | 0.003 |
| F statistic | 167.724** | 1165.377** |
Standard errors are provided within parentheses. Baseline groups are “ABC” (for the “locus” explanatory variable) and “rapid genetic drift” (“RGD” for the “demography” explanatory variable)
SGD slow genetic drift
*p < 0.05; **p < 0.01
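Because the baseline groups are "ABC" and "RGD", each coefficient is an offset from the Constant, and the interaction terms apply only to SGD populations. The Python sketch below shows how a fitted mean RGPBC could be read off the table for any combination of locus set and demography. It assumes ordinary treatment (dummy) coding, and the names `predicted_rgpbc`, `LOCUS`, and `SGD_INTERACTION` are illustrative rather than taken from the source.

```python
# RGPBC coefficients copied from the regression table.
CONSTANT = 0.682  # fitted value for the baseline: locus "ABC", demography "RGD"

# Main effect of each locus set relative to "ABC".
LOCUS = {
    "ABC": 0.0,
    "AB": -0.001,
    "AC": -0.057,
    "BC": -0.066,
    "A": -0.205,
    "B": -0.196,
    "C": -0.341,
}

# Main effect of slow genetic drift relative to rapid genetic drift.
SGD_MAIN = 0.029

# Extra locus-specific shift that applies only in SGD populations.
SGD_INTERACTION = {
    "ABC": 0.0,
    "AB": 0.02,
    "AC": 0.018,
    "BC": -0.004,
    "A": 0.106,
    "B": 0.067,
    "C": 0.032,
}


def predicted_rgpbc(locus, demography):
    """Fitted mean RGPBC for one locus set and one demographic category."""
    if demography not in ("RGD", "SGD"):
        raise ValueError("demography must be 'RGD' or 'SGD'")
    value = CONSTANT + LOCUS[locus]
    if demography == "SGD":
        value += SGD_MAIN + SGD_INTERACTION[locus]
    return value


for locus in LOCUS:
    rgd = round(predicted_rgpbc(locus, "RGD"), 3)
    sgd = round(predicted_rgpbc(locus, "SGD"), 3)
    print(f"{locus:>3}  RGD={rgd:.3f}  SGD={sgd:.3f}")
```

For example, HLA-A alone in SGD populations gives 0.682 − 0.205 + 0.029 + 0.106 = 0.612, whereas the AB pair in RGD populations gives 0.682 − 0.001 = 0.681, almost the same as the three-locus baseline. The same arithmetic applies to the RIMD column with its own coefficients.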