Rodrigo Barquera, Evelyn Collen, Da Di, Stéphane Buhler, João Teixeira, Bastien Llamas, José M Nunes, Alicia Sanchez-Mazas.
Abstract
We report detailed peptide-binding affinities between 438 HLA Class I and Class II proteins and the complete proteomes of seven pandemic human viruses, including coronaviruses, influenza viruses and HIV-1. We contrast these affinities with HLA allele frequencies across hundreds of human populations worldwide. Statistical modelling shows that peptide-binding affinities, classified into four distinct categories, depend on the HLA locus, whereas the type of virus is only a weak predictor, except in the case of HIV-1. Among the strong HLA binders (IC50 ≤ 50 nM), we uncovered 16 alleles (the top ones being A*02:02, B*15:03 and DRB1*01:02) binding more than 1% of peptides derived from all viruses, 9 (top ones including HLA-A*68:01, B*15:25, C*03:02 and DRB1*07:01) binding all viruses except HIV-1, and 15 (top ones A*02:01 and C*14:02) binding only coronaviruses. The frequencies of the strongest and weakest HLA peptide binders differ significantly among populations from different geographic regions. In particular, Indigenous peoples of America show both higher frequencies of the strongest and lower frequencies of the weakest HLA binders. As many HLA proteins are found to be strong binders of peptides derived from distinct viral families, and are hence promiscuous (or generalist), we discuss this result in relation to possible signatures of natural selection on promiscuous HLA alleles due to past pathogenic infections. Our findings are highly relevant for both evolutionary genetics and the development of vaccine therapies. However, they should not obscure the fact that individual resistance and vulnerability to diseases go beyond HLA allelic affinity alone and depend on multiple, complex and often unknown biological, environmental and other variables.
Keywords: COVID-19; HIV; HLA population genetics; Indigenous Americans; SARS-CoV-2; coronavirus; influenza; natural selection; peptide-binding predictions
Year: 2020 PMID: 32475052 PMCID: PMC7300650 DOI: 10.1111/tan.13956
Source DB: PubMed Journal: HLA ISSN: 2059-2302 Impact factor: 8.762
Number of SARS‐CoV‐2 peptides binding at different affinity levels or not binding HLA proteins
| Affinity levels | # peptides | HLA‐A | HLA‐B | HLA‐C | HLA‐DRB1 | HLA‐DQA1/DQB1 |
|---|---|---|---|---|---|---|
| Strong binding (IC50 ≤ 50 nM) | Min (%) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| | Max (%) | 272 (3.8) | 203 (2.9) | 99 (1.4) | 719 (10.1) | 9 (0.13) |
| | Average (%) | 50.6 (0.7) | 17.8 (0.25) | 17.7 (0.25) | 35.2 (0.5) | 0.5 (0.01) |
| Regular binding (50 nM < IC50 ≤ 500 nM for Class I; 50 nM < IC50 ≤ 1000 nM for Class II) | Min (%) | 16 (0.2) | 0 (0) | 0 (0) | 2 (0.03) | 0 (0) |
| | Max (%) | 329 (4.6) | 478 (6.7) | 448 (6.3) | 3855 (54.4) | 1536 (21.7) |
| | Average (%) | 136.3 (1.9) | 79.9 (1.1) | 125.9 (1.8) | 1507.2 (21.3) | 436.3 (6.1) |
| Weak binding (500 nM < IC50 ≤ 5000 nM for Class I; 1000 nM < IC50 ≤ 5000 nM for Class II) | Min (%) | 130 (1.9) | 45 (0.6) | 18 (0.25) | 197 (2.8) | 50 (0.7) |
| | Max (%) | 1123 (15.8) | 1162 (16.4) | 1206 (17) | 3917 (55.3) | 4572 (64.5) |
| | Average (%) | 433 (6.1) | 354.1 (5) | 560 (7.9) | 2841 (40.1) | 2701.3 (38.1) |
| No binding (IC50 > 5000 nM) | Min (%) | 5605 (79.1) | 5246 (74) | 5404 (76.2) | 683 (9.6) | 976 (13.8) |
| | Max (%) | 6939 (97.9) | 7041 (99.3) | 7071 (99.7) | 6885 (97.2) | 7034 (99.3) |
| | Average (%) | 6469.1 (91.3) | 6637.2 (93.6) | 6385.4 (90.1) | 2700.6 (38.1) | 3945.9 (55.7) |
| Weak or no binding (IC50 > 500 nM for Class I; IC50 > 1000 nM for Class II) | Min (%) | 6502 (91.7) | 6408 (90.4) | 6564 (92.6) | 2510 (35.4) | 5548 (78.3) |
| | Max (%) | 7072 (99.8) | 7089 (99.99) | 7089 (99.99) | 7082 (99.97) | 7084 (100) |
| | Average (%) | 6902.2 (97.4) | 6991.3 (98.6) | 6945.5 (98) | 5541.5 (78.2) | 6647.2 (93.8) |
FIGURE 1 Percentage of the total number of peptides derived from the complete SARS‐CoV‐2 peptidome that is bound by each HLA protein (dots) according to NetMHCpan v. 4.0 and NetMHCIIpan v. 3.2 predictions (see section 2). The four binding classes (strong, regular, weak and non‐binder) follow the affinity criteria indicated in the text. DQAB refers to the protein coded jointly by DQA1 and DQB1 molecules. Locus DRA was considered as non‐polymorphic, hence DRAB actually corresponds to DRB1 molecules. The distinct patterns of Class I and Class II alleles are visible through their variabilities, which are much higher for Class II
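The four affinity categories used in the table above can be expressed as a small classification rule. The following is a minimal sketch, not the authors' code: the function name `classify_binding` and its arguments are hypothetical, and the Class I weak-binding range is assumed to be 500 nM < IC50 ≤ 5000 nM.

```python
def classify_binding(ic50_nm: float, hla_class: str) -> str:
    """Map a predicted IC50 (in nM) to one of the four binding categories.

    hla_class is "I" or "II"; the regular/weak boundary is 500 nM for
    Class I and 1000 nM for Class II, as stated in the table.
    """
    if hla_class not in ("I", "II"):
        raise ValueError("hla_class must be 'I' or 'II'")
    regular_max = 500.0 if hla_class == "I" else 1000.0
    if ic50_nm <= 50.0:
        return "strong"
    if ic50_nm <= regular_max:
        return "regular"
    if ic50_nm <= 5000.0:
        return "weak"
    return "non-binder"


# The same IC50 can fall into different categories depending on the class.
print(classify_binding(591.6, "I"))   # weak
print(classify_binding(591.6, "II"))  # regular
```

Keeping the class-specific boundary in a single variable makes it easy to see that only the regular/weak cut-off differs between Class I and Class II.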
Number of HLA proteins binding at different affinity levels or not binding 0, ≥100 or ≥99% of SARS‐CoV‐2 peptides
| Affinity levels | # peptides | HLA‐A (92) | HLA‐B (164) | HLA‐C (55) | HLA‐DRB1 (94) | HLA‐DQA1/DQB1 (34) |
|---|---|---|---|---|---|---|
| Strong binding (IC50 ≤ 50 nM) | 0 (%) | 1 (1.1) | 30 (18.3) | 11 (20) | 45 (47.9) | 32 (88.9) |
| | ≥ 100 (%) | 17 (18.5) | 5 (3) | 0 (0) | 6 (6.4) | 0 (0) |
| Regular binding (50 nM < IC50 ≤ 500 nM for Class I; 50 nM < IC50 ≤ 1000 nM for Class II) | 0 (%) | 0 (0) | 2 (1.2) | 5 (9.1) | 0 (0) | 1 (2.9) |
| | ≥ 100 (%) | 57 (62) | 41 (25) | 27 (49.1) | 92 (97.9) | 25 (73.5) |
| Weak binding (500 nM < IC50 ≤ 5000 nM for Class I; 1000 nM < IC50 ≤ 5000 nM for Class II) | 0 (%) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| | ≥ 100 (%) | 92 (100) | 154 (93.9) | 49 (89.1) | 94 (100) | 33 (97.1) |
| No binding (IC50 > 5000 nM) | 0 (%) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| | ≥ 100 (%) | 92 (100) | 164 (100) | 55 (100) | 94 (100) | 34 (100) |
| Weak or no binding (IC50 > 500 nM for Class I; IC50 > 1000 nM for Class II) | 0 (%) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| | ≥ 99% (%) | 21 (22.8) | 94 (57.3) | 21 (38.2) | 2 (2.1) | 7 (20.6) |
List of HLA strongest binders (>100 peptides bound at high affinity, that is, IC50 ≤ 50 nM) of SARS‐CoV‐2 peptides
| HLA‐A | # bound peptides | HLA‐B | # bound peptides | HLA‐C | # bound peptides | HLA‐DRB1 | # bound peptides | HLA‐DQA1/DQB1 |
|---|---|---|---|---|---|---|---|---|
| | 272 | | 203 | | (99) | | 719 | — |
| | 224 | | 154 | | | | 358 | |
| | 179 | | 147 | | | | 185 | |
| | 176 | | 104 | | | | 169 | |
| | 144 | | 103 | | | | 169 | |
| | 142 | | | | | | 169 | |
| | 120 | | | | | | | |
| | 115 | | | | | | | |
| | 115 | | | | | | | |
| | 115 | | | | | | | |
| | 115 | | | | | | | |
| | 111 | | | | | | | |
| | 111 | | | | | | | |
| | 111 | | | | | | | |
| | 104 | | | | | | | |
| | 101 | | | | | | | |
| | 101 | | | | | | | |
Note: The complete list of alleles with the number of peptides bound at different affinity levels is given in Data S1.
List of HLA weakest binders (>99% of weak or no bindings, that is, IC50 > 500 nM for Class I, IC50 > 1000 nM for Class II) of SARS‐CoV‐2 peptides
| HLA‐A | HLA‐B | HLA‐C | HLA‐DRB1 | HLA‐DQA1/DQB1 |
|---|---|---|---|---|
Note: The complete list of alleles with the number of peptides bound at different affinity levels is given in Data S1.
Never strong binders.
Never strong nor regular binders.
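The two group definitions used in the table titles above (strongest binders: more than 100 peptides bound with IC50 ≤ 50 nM; weakest binders: more than 99% of peptides bound weakly or not at all) can be sketched as a simple rule on per-allele counts. This is an illustrative sketch only: the function `binder_group`, its parameter names and the total of 7000 peptides in the usage lines are hypothetical, and the cut-offs are applied here as "≥", as in the summary table, whereas the list titles use ">".

```python
def binder_group(n_strong: int, n_weak_or_none: int, n_peptides: int,
                 strong_min: int = 100, weak_fraction: float = 0.99):
    """Return (is_strongest, is_weakest) for one HLA protein.

    n_strong        peptides bound with IC50 <= 50 nM
    n_weak_or_none  peptides bound weakly or not at all
    n_peptides      total number of peptides tested (hypothetical input)
    """
    if n_peptides <= 0:
        raise ValueError("n_peptides must be positive")
    is_strongest = n_strong >= strong_min
    is_weakest = (n_weak_or_none / n_peptides) >= weak_fraction
    return is_strongest, is_weakest


# Hypothetical counts out of a hypothetical total of 7000 peptides.
print(binder_group(272, 6500, 7000))  # (True, False)
print(binder_group(0, 6990, 7000))    # (False, True)
```

Exposing the two thresholds as parameters makes it straightforward to check how sensitive the groups are to the choice between strict and non-strict cut-offs.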
FIGURE 2 Cumulative allele frequencies for the two groups of alleles that were considered as strongest (in red) and weakest (in blue) binders, by locus (HLA‐A, ‐B, ‐C and ‐DRB1) and geographic region for each population sample. Population samples and binding criteria are described in the main text. In the bottom panel, HLA‐A and ‐B frequencies have been averaged (named as “A + B”) and the distributions of the cumulative frequencies among the population samples of each region are presented both as violin and box plots. Geographic regions are SAF, Sub‐Saharan Africa; NAF, North Africa; EUR, Europe; SWA, South‐West Asia; NEA, North‐East Asia; SEA, South‐East Asia; AUS, Australia; OCE, Oceania; NAM, North America; SAM, South America
Retained models for each kind of peptide binding
| Terms | Freq (strongest) | Freq (weakest) |
|---|---|---|
| LocusB | −0.121 | 0.44 |
| LocusDR | −0.046 (0.028) | −0.068 |
| RegionNAF | 0.028 (0.029) | 0.04 (0.035) |
| RegionEUR | 0.109 | 0.037 (0.027) |
| RegionSWA | 0.051 | −0.025 (0.033) |
| RegionNEA | 0.045 | −0.022 (0.028) |
| RegionSEA | −0.033 (0.023) | −0.023 (0.028) |
| RegionAUS | −0.063 | −0.131 |
| RegionOCE | −0.081 | −0.096 |
| RegionNAM | 0.305 | −0.116 |
| RegionSAM | 0.314 | −0.151 |
| LocusB:RegionNAF | −0.075 | 0.04 (0.049) |
| LocusDR:RegionNAF | −0.079 | −0.121 |
| LocusB:RegionEUR | −0.183 | 0.058 (0.038) |
| LocusDR:RegionEUR | −0.115 | −0.123 |
| LocusB:RegionSWA | −0.118 | 0.092 |
| LocusDR:RegionSWA | −0.044 (0.039) | −0.058 (0.047) |
| LocusB:RegionNEA | −0.119 | 0.114 |
| LocusDR:RegionNEA | −0.111 | −0.064 (0.040) |
| LocusB:RegionSEA | −0.015 (0.033) | 0.021 (0.040) |
| LocusDR:RegionSEA | −0.096 | −0.063 (0.040) |
| LocusB:RegionAUS | −0.007 (0.043) | 0.232 |
| LocusDR:RegionAUS | −0.076 | 0.045 (0.052) |
| LocusB:RegionOCE | 0.057 | 0.09 |
| LocusDR:RegionOCE | −0.066 | 0.01 (0.038) |
| LocusB:RegionNAM | −0.381 | −0.001 (0.044) |
| LocusDR:RegionNAM | −0.436 | 0.031 (0.044) |
| LocusB:RegionSAM | −0.393 | −0.017 (0.052) |
| LocusDR:RegionSAM | −0.459 | 0.07 (0.052) |
| Constant | 0.202 | 0.153 |
| Observations | 372 | 372 |
| R2 | 0.859 | 0.954 |
| Adjusted R2 | 0.847 | 0.95 |
| Residual Std. Error (df = 342) | 0.052 | 0.063 |
| F Statistic (df = 29; 342) | 71.608 | 244.587 |
Note: The dependent variable is the frequency (Freq) of the strongest (left) and weakest (right) HLA binders. The left column (terms) lists all the independent variables and their interactions. For each retained model (Strongest and Weakest) the first column displays the coefficients of the model, that is, the differences in average cumulated frequencies between the group defined by each term and the reference (Locus: A; Region: SAF, grouped on the constant term); the second column shows asterisks indicating the significance level of a test for the coefficient being zero (no effect); and the third column presents in parentheses the values of the standard errors associated with the coefficients.
P < .1;
P < .05;
P < .01.
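Because the coefficients are expressed relative to a reference group (Locus A, Region SAF, carried by the constant), the model's average cumulative frequency for any other locus and region is obtained by adding the relevant main effects and their interaction to the constant. The sketch below illustrates this arithmetic for the "Strongest" model using a subset of the coefficients listed in the table; the dictionary layout and the function `predicted_frequency` are hypothetical helpers, not part of the original analysis.

```python
# Subset of the "Strongest" model coefficients, copied from the table.
STRONGEST = {
    "Constant": 0.202,
    "LocusB": -0.121,
    "LocusDR": -0.046,
    "RegionSAM": 0.314,
    "LocusB:RegionSAM": -0.393,
    "LocusDR:RegionSAM": -0.459,
}


def predicted_frequency(coefs: dict, locus: str, region: str) -> float:
    """Sum constant, main effects and interaction for one locus/region pair.

    The reference levels (locus "A", region "SAF") contribute no extra term.
    """
    total = coefs["Constant"]
    if locus != "A":
        total += coefs[f"Locus{locus}"]
    if region != "SAF":
        total += coefs[f"Region{region}"]
    if locus != "A" and region != "SAF":
        total += coefs[f"Locus{locus}:Region{region}"]
    return total


# HLA-A in South America: 0.202 + 0.314
print(round(predicted_frequency(STRONGEST, "A", "SAM"), 3))  # 0.516
# HLA-B in South America: 0.202 - 0.121 + 0.314 - 0.393
print(round(predicted_frequency(STRONGEST, "B", "SAM"), 3))  # 0.002
```

Under this reading, the large positive South American effect applies to HLA‐A, while the negative interaction terms largely cancel it for HLA‐B and HLA‐DRB1.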
FIGURE 3 Proportion of the total number of peptides derived from the peptidomes of the 7 viruses analysed in this study (SARS‐CoV‐2, SARS‐CoV‐1, MERS‐CoV; H1N1, H3N2, H7N9; HIV‐1) that is bound by each HLA protein, per locus and binding kind. The four binding classes (strong, regular, weak and non‐binder) follow the usual affinity criteria (as indicated in the text). DQAB refers to the protein coded jointly by DQA1 and DQB1 molecules. Locus DRA was considered as non‐polymorphic, hence DRAB actually corresponds to DRB1 molecules
Retained model for peptide‐binding proportion
| Terms | Rank (value) |
|---|---|
| Kind.Strong | −1813.988 |
| Kind.Weak | 1929.162 |
| Kind.NonBinder | 6177.37 |
| LocusB | −953.978 |
| LocusC | −609.803 |
| LocusDQ | 826.365 |
| LocusDR | 3302.612 |
| Virus.cov1 | −9.609 (75.479) |
| Virus.mers | 17.622 (75.436) |
| Virus.h1n1 | −141.812 |
| Virus.h3n2 | −235.549 |
| Virus.h7n9 | −193.273 |
| Virus.hiv | −654.673 |
| Kind.Strong:LocusB | −36.94 (77.816) |
| Kind.Weak:LocusB | 519.922 |
| Kind.NonBinder:LocusB | 1270.303 |
| Kind.Strong:LocusC | −458.426 |
| Kind.Weak:LocusC | 564.693 |
| Kind.NonBinder:LocusC | 590.722 |
| Kind.Strong:LocusDQ | −3019.89 |
| Kind.Weak:LocusDQ | 1420.448 |
| Kind.NonBinder:LocusDQ | −2089.007 |
| Kind.Strong:LocusDR | −4068.383 |
| Kind.Weak:LocusDR | −826.425 |
| Kind.NonBinder:LocusDR | −5163.576 |
| Kind.Strong:Virus.cov1 | −27.935 (106.744) |
| Kind.Weak:Virus.cov1 | −13.4 (106.744) |
| Kind.NonBinder:Virus.cov1 | 29.968 (106.744) |
| Kind.Strong:Virus.mers | −46.915 (106.683) |
| Kind.Weak:Virus.mers | 50.838 (106.683) |
| Kind.NonBinder:Virus.mers | −72.158 (106.683) |
| Kind.Strong:Virus.h1n1 | 69.534 (106.683) |
| Kind.Weak:Virus.h1n1 | −39.497 (106.683) |
| Kind.NonBinder:Virus.h1n1 | 261.228 |
| Kind.Strong:Virus.h3n2 | 86.513 (106.683) |
| Kind.Weak:Virus.h3n2 | 26.338 (106.683) |
| Kind.NonBinder:Virus.h3n2 | 390.448 |
| Kind.Strong:Virus.h7n9 | 79.336 (106.683) |
| Kind.Weak:Virus.h7n9 | −15.716 (106.683) |
| Kind.NonBinder:Virus.h7n9 | 342.746 |
| Kind.Strong:Virus.hiv | 289.56 |
| Kind.Weak:Virus.hiv | 281.013 |
| Kind.NonBinder:Virus.hiv | 1027.091 |
| Constant | 4734.063 |
| Observations | 12 288 |
| R2 | 0.901 |
| Adjusted R2 | 0.901 |
| Residual Std. Error | 1117.625 (df = 12244) |
| F Statistic | 2592.272 |
Note: The dependent variable is the rank of the proportion of bound peptides. The left column (terms) lists all the independent variables and their interactions. For the retained model, the first column displays the coefficients of the model, that is, the differences in average ranks between the group defined by each term and the reference (Locus: A; Virus: cov2; Kind: regular, grouped on the constant term); the second column shows asterisks indicating the significance level of a test for the coefficient being zero (no effect); and the third column presents in parentheses the values of the standard errors associated with the coefficients.
P < .1;
P < .05;
P < .01.
Binding affinities of HLA‐B*57:01 for different lengths of Gag‐derived peptide
| Allele | # | Start | End | Length | Peptide | Core | Icore | IC50 | Percentile rank |
|---|---|---|---|---|---|---|---|---|---|
| HLA‐B*57:01 | 1 | 1 | 11 | 11 | KAFSPEVIPMF | KAFSVIPMF | KAFSPEVIPMF | 145.5 | 0.26 |
| HLA‐B*57:01 | 1 | 1 | 10 | 10 | KAFSPEVIPM | KAFSPEVIM | KAFSPEVIPM | 591.6 | 0.77 |
| HLA‐B*57:01 | 1 | 1 | 8 | 8 | KAFSPEVI | KAFSP‐EVI | KAFSPEVI | 3307.6 | 3 |
| HLA‐B*57:01 | 1 | 3 | 11 | 9 | FSPEVIPMF | FSPEVIPMF | FSPEVIPMF | 3846.1 | 3.4 |
| HLA‐B*57:01 | 1 | 1 | 9 | 9 | KAFSPEVIP | KAFSPEVIP | KAFSPEVIP | 5220.4 | 4.5 |
| HLA‐B*57:01 | 1 | 2 | 11 | 10 | AFSPEVIPMF | ASPEVIPMF | AFSPEVIPMF | 6502.4 | 5.6 |
| HLA‐B*57:01 | 1 | 3 | 10 | 8 | FSPEVIPM | FS‐PEVIPM | FSPEVIPM | 22 769.8 | 28 |
| HLA‐B*57:01 | 1 | 4 | 11 | 8 | SPEVIPMF | ‐SPEVIPMF | SPEVIPMF | 28 204.3 | 39 |
| HLA‐B*57:01 | 1 | 2 | 10 | 9 | AFSPEVIPM | AFSPEVIPM | AFSPEVIPM | 30 593.4 | 46 |
| HLA‐B*57:01 | 1 | 2 | 9 | 8 | AFSPEVIP | ‐AFSPEVIP | AFSPEVIP | 39 962.7 | 79 |
Note: NetMHCpan v. 4.0 output shows the IC50 affinity scores (in nM) for the immunodominant HIV‐1 Gag‐derived peptide KAFSPEVIPMF and all possible 8‐, 9‐ and 10‐mers derived from this peptide. B*57:01 is a regular binder (50 nM < IC50 ≤ 500 nM) of the 11‐mer epitope and a weak binder (500 nM < IC50 ≤ 5000 nM) or non‐binder (IC50 > 5000 nM) of all other epitopes.
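The ten rows of the table correspond to the 11‐mer itself plus every contiguous 8‐, 9‐ and 10‐mer it contains. A minimal sketch of that enumeration follows, using 1‐based start and end positions as in the table; the function `sub_peptides` is a hypothetical helper and does not reproduce any NetMHCpan functionality (no affinities are computed).

```python
def sub_peptides(peptide: str, min_len: int = 8, max_len=None):
    """List every contiguous sub-peptide with length min_len..max_len.

    Returns tuples of (start, end, length, sequence), where start and end
    are 1-based, inclusive positions within the parent peptide.
    """
    if max_len is None:
        max_len = len(peptide)
    rows = []
    # Longest peptides first, then slide the window along the parent.
    for length in range(max_len, min_len - 1, -1):
        for start in range(len(peptide) - length + 1):
            seq = peptide[start:start + length]
            rows.append((start + 1, start + length, length, seq))
    return rows


for row in sub_peptides("KAFSPEVIPMF"):
    print(row)
```

For an 11‐residue parent this yields 1 + 2 + 3 + 4 = 10 peptides, matching the number of rows in the table.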