Nicola Cotugno, Veronica Santilli, Giuseppe Rubens Pascucci, Emma Concetta Manno, Lesley De Armas, Suresh Pallikkuth, Annalisa Deodati, Donato Amodio, Paola Zangari, Sonia Zicari, Alessandra Ruggiero, Martina Fortin, Christina Bromley, Rajendra Pahwa, Paolo Rossi, Savita Pahwa, Paolo Palma.
Abstract
The number of patients affected by chronic diseases with special vaccination needs is growing. For these patients, predictive markers of immunogenicity and signatures of immune response are typically missing, even though they would improve the identification of personalized immunization practices. We aimed to develop a predictive score of immunogenicity to the influenza trivalent inactivated vaccine (TIV) by applying deep machine learning algorithms to transcriptional data from sort-purified lymphocyte subsets after in vitro stimulation. Peripheral blood mononuclear cells (PBMCs) collected before TIV from 23 vertically HIV-infected children who were on antiretroviral treatment (ART) and virally controlled were stimulated in vitro with p09/H1N1 peptides (stim) or left unstimulated (med). Multiplexed qPCR for 96 genes was performed on fixed numbers of cells from 3 B cell subsets, 3 T cell subsets, and total PBMCs. The ability to respond to TIV was assessed by hemagglutination inhibition assay (HAI) and ELISpot, and patients were classified as Responders (R) or Non-Responders (NR). A predictive modeling framework was applied to the data set to identify the genes and conditions with the highest predicted probability, which then informed the final score. Twelve NR and 11 R were analyzed for gene expression differences in all subsets and 3 conditions [med, stim, or Δ (stim-med)]. Differentially expressed genes between R and NR were selected and tested with the Adaptive Boosting model to build a prediction score. The best prediction score was obtained from 46 genes across 5 different subsets and conditions. A combined score based on these 5 categories achieved a model accuracy of 95.6%, with only one misclassified patient.
These data show how a predictive bioinformatic model applied to transcriptional analysis of in vitro stimulated lymphocyte subsets may predict poor or protective vaccination immune responses in vulnerable populations, such as HIV-infected individuals. Future studies on larger cohorts are needed to validate this strategy in the context of vaccination trials.
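The Adaptive Boosting step named in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the weak learners (one-feature decision stumps), the logistic mapping from the weighted vote to a probability, and all toy values are assumptions made for illustration only.

```python
import math

def stump_predict(x, feature, threshold, polarity):
    """Weak learner: vote +1 or -1 depending on one feature and one cut-off."""
    return polarity if x[feature] > threshold else -polarity

def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for polarity in (1, -1):
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if stump_predict(row, feature, threshold, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best

def adaboost(X, y, rounds=10):
    """Discrete AdaBoost; labels are +1 (responder) and -1 (non-responder)."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, feature, threshold, polarity = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, feature, threshold, polarity))
        if err <= 1e-10:                         # perfect stump: nothing left to reweight
            break
        # Increase the weight of misclassified samples, then renormalize
        w = [wi * math.exp(-alpha * yi * stump_predict(row, feature, threshold, polarity))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def responder_probability(model, x):
    """ASSUMPTION: logistic link on the weighted vote F(x), p = 1 / (1 + exp(-2F))."""
    f = sum(alpha * stump_predict(x, feat, thr, pol) for alpha, feat, thr, pol in model)
    return 1.0 / (1.0 + math.exp(-2.0 * f))

# Hypothetical expression values for two genes in six patients (not study data).
# The toy data are separable by a single stump, so training stops after one round.
X = [[2.1, 0.3], [1.8, 0.5], [1.9, -0.2], [0.2, 0.4], [0.4, -0.1], [0.1, 0.6]]
y = [1, 1, 1, -1, -1, -1]
model = adaboost(X, y)
```

With real data the per-patient probabilities would be estimated on held-out samples rather than on the training set, as the cross-validation table in this record suggests.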
Keywords: HIV; artificial intelligence; deep learning; gene expression; influenza vaccine; predictive biomarkers; vaccinomics
Year: 2020 PMID: 33123133 PMCID: PMC7569088 DOI: 10.3389/fimmu.2020.559590
Source DB: PubMed Journal: Front Immunol ISSN: 1664-3224 Impact factor: 7.561
Figure 1. Experimental design. The cartoon in the top panel depicts the experimental procedure. Briefly, total PBMCs are split into 2 aliquots, one stimulated and the other left unstimulated. After sorting of lymphocyte subsets, gene expression is analyzed on the Fluidigm Biomark platform. The bottom panel describes the gating strategies and the lymphocyte subsets selected for sorting and gene expression analysis. The mathematical analyses applied to the subsets to obtain differentially expressed genes (DEGs) and differentially induced genes (DIGs) are also described.
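The induced-expression condition Δ (stim-med) used throughout this record can be sketched as a per-gene difference between the stimulated and unstimulated aliquots of the same sample. The gene names and expression values below are hypothetical placeholders, not study data.

```python
def induced_expression(med, stim):
    """Return delta = stim - med for every gene measured in both conditions."""
    return {gene: stim[gene] - med[gene] for gene in med if gene in stim}

# Hypothetical expression values for one patient and one sorted subset
med = {"GENE_A": 2.0, "GENE_B": 5.5, "GENE_C": 1.0}
stim = {"GENE_A": 4.5, "GENE_B": 5.0, "GENE_C": 1.0}

delta = induced_expression(med, stim)
print(delta)  # {'GENE_A': 2.5, 'GENE_B': -0.5, 'GENE_C': 0.0}
```

A positive value marks a gene induced by stimulation, a negative value a gene that decreases, and zero no change.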
Figure 2. Analysis framework flowchart: pipeline workflow for predicting the vaccine response. For each cell subset and condition, differentially induced genes between responders and non-responders were selected using machine learning feature selection based on three different algorithms together with the Wilcoxon test. The list of selected genes was used by the Adaptive Boosting algorithm to build the predictive model and calculate the prediction score.
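The Wilcoxon filtering step in this workflow can be sketched as follows. This is a simplified illustration, assuming a two-sided rank-sum test with a normal approximation and no tie or continuity correction; with groups as small as 11 and 12 patients an exact test would normally be preferred. The gene names, values, and the 0.05 cut-off are hypothetical.

```python
import math

def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test, normal approximation."""
    pooled = sorted((value, group) for group, values in enumerate((a, b)) for value in values)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1          # tied values share the mean rank
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    rank_sum_a = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n_a, n_b = len(a), len(b)
    u = rank_sum_a - n_a * (n_a + 1) / 2        # U statistic for group a
    mu = n_a * n_b / 2
    sigma = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))        # two-sided p-value
    return u, p

# Hypothetical per-gene values in responders and non-responders
responders = {"GENE_A": [5.1, 4.8, 5.5, 6.0, 5.2], "GENE_B": [1.0, 2.0, 3.0]}
non_responders = {"GENE_A": [3.1, 2.9, 3.5, 4.0, 3.3], "GENE_B": [1.0, 2.0, 3.0]}

selected = [gene for gene in responders
            if rank_sum_test(responders[gene], non_responders[gene])[1] < 0.05]
print(selected)  # ['GENE_A']
```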
Figure 3. AdaBoost probability scores for T cell (A) and B cell (B) subsets. The prediction probability is shown for each patient (non-responders in red, responders in blue). A patient was classified as a responder if the probability was >0.50 and as a non-responder if it was <0.50.
Figure 4. AdaBoost probability scores (B and T cells). The combined prediction score for B and T cells is shown for each patient (non-responders in red, responders in blue). A patient was classified as a responder if the probability was >0.50 and as a non-responder if it was <0.50.
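The 0.50 decision rule described in these captions, and one possible way of combining per-category probabilities, can be sketched as below. The record does not state how the combined score was calculated, so the unweighted mean is an assumption; the probabilities are hypothetical, and a value of exactly 0.50, which the captions leave undefined, is treated here as non-responder.

```python
from statistics import mean

THRESHOLD = 0.50

def classify(probability, threshold=THRESHOLD):
    """Responder if the probability exceeds the threshold, otherwise non-responder."""
    return "Responder" if probability > threshold else "Non-responder"

def combined_score(category_probabilities):
    """ASSUMPTION: combine per-category probabilities with an unweighted mean."""
    return mean(category_probabilities)

# Hypothetical responder probabilities for one patient in five categories
patient = {
    "REM med_stim": 0.97,
    "DN stim": 0.92,
    "TFH med_stim": 0.79,
    "PBMC med": 0.74,
    "DN med_stim": 0.61,
}
score = combined_score(list(patient.values()))
print(round(score, 3), classify(score))
```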
Study subjects' characteristics.
| Characteristic | Non-responders (NR) | Responders (R) |
| Age years, mean (SEM) | 15.16 (2.1) | 13.72 (2.3) |
| n (female) | 12 ( | 11 ( |
| %CD4+ T cells, mean (SEM) | 37.97 (4.9) | 32.49 (6.0) |
| HIV RNA <50 cp/mL, n | 11 | 10 |
| IgG (mg/dL) (mean) | 1387.4 | 1356 |
| IgM (mg/dL) (mean) | 135.1 | 118.9 |
| IgA (mg/dL) (mean) | 210.7 | 225.1 |
| CDC (A/B/C) (1/2/3) | (3/4/5) (3/4/5) | (2/5/4) (4/3/4) |
| Lymphocytes/mm³, mean (SEM) | 2494 (278.9) | 3109 (363.1) |
| WBC 10³/µL, mean (SEM) | 7.6 (1.5) | 7.3 (0.7) |
| ART regimen (2 NRTI + PI-r / 2 nNRTI + NRTI / 2 NRTI + ii) | (5/5/2) | (5/4/2) |
SEM, standard error of the mean; CRP, C-reactive protein; CDC, Centers for Disease Control classification of AIDS; WBC, white blood cells; ART, antiretroviral treatment; NRTI, Nucleoside and Nucleotide Analog Reverse Transcriptase Inhibitors; PI, Protease Inhibitors; nNRTI, Non-Nucleoside Analog Reverse Transcriptase Inhibitors; ii, Integrase Inhibitors.
Selected genes and conditions.
| B | AM | 1 | 2 | 5 |
| B | DN | 5 | 9 | 5 |
| B | REM | 8 | 8 | 3 |
| T | CD4 | 7 | 9 | 12 |
| T | NT | 12 | 14 | 12 |
| T | PBMC | 18 | 9 | 12 |
| T | TFH | 4 | 11 | 13 |
| Total | | 55 | 62 | 62 |
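Subsets and conditions importance ranking.
| Subset and condition | Patients (n) | Accuracy | Accuracy rank | Lower probability | Upper probability | Probability range | Range rank |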
| B_REM_med_stim | 17 | 100% | 2 | 0.022 | 0.972 | 0.950 | 1 |
| B_DN_stim | 20 | 100% | 1 | 0.084 | 0.916 | 0.832 | 2 |
| T_TFH_med_stim | 22 | 95% | 3 | 0.210 | 0.790 | 0.580 | 3 |
| T_PBMC_med | 21 | 90% | 6 | 0.258 | 0.742 | 0.484 | 4 |
| B_AM_med_stim | 21 | 86% | 12 | 0.261 | 0.739 | 0.479 | 5 |
| B_REM_stim | 19 | 89% | 7 | 0.271 | 0.729 | 0.459 | 6 |
| B_AM_stim | 22 | 86% | 10 | 0.273 | 0.727 | 0.453 | 7 |
| T_NT_med | 22 | 91% | 5 | 0.286 | 0.714 | 0.429 | 8 |
| T_PBMC_med_stim | 21 | 86% | 13 | 0.296 | 0.673 | 0.377 | 9 |
| T_PBMC_stim | 21 | 86% | 14 | 0.326 | 0.674 | 0.347 | 10 |
| B_AM_med | 21 | 81% | 16 | 0.329 | 0.671 | 0.343 | 11 |
| T_CD4_med | 23 | 74% | 20 | 0.347 | 0.653 | 0.305 | 12 |
| B_REM_med | 18 | 83% | 15 | 0.366 | 0.593 | 0.227 | 13 |
| B_DN_med_stim | 17 | 94% | 4 | 0.392 | 0.608 | 0.217 | 14 |
| T_CD4_stim | 23 | 87% | 8 | 0.404 | 0.596 | 0.192 | 15 |
| B_DN_med | 19 | 79% | 17 | 0.414 | 0.586 | 0.172 | 16 |
| T_NT_stim | 23 | 87% | 9 | 0.418 | 0.582 | 0.164 | 17 |
| T_CD4_med_stim | 23 | 78% | 18 | 0.427 | 0.573 | 0.146 | 18 |
| T_NT_med_stim | 22 | 86% | 11 | 0.429 | 0.571 | 0.142 | 19 |
| T_TFH_stim | 23 | 78% | 19 | 0.447 | 0.553 | 0.106 | 20 |
| T_TFH_med | 22 | 68% | 21 | 0.481 | 0.519 | 0.038 | 21 |
All subsets and conditions were ranked by both classification accuracy and expected probability range. Based on both rankings, the categories with the best classification capacity (highlighted in red in the original table) were selected for the final score.
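The probability-range ranking can be checked against the printed values with a short sketch. It assumes that the seventh column of the table is the difference between the sixth and fifth columns, which the printed numbers match to within rounding; the names `lower` and `upper` are labels chosen here, not the authors' column headers. Only the first five table rows are transcribed.

```python
# (category, fifth column, sixth column, printed range), first five table rows, shuffled
rows = [
    ("T_PBMC_med", 0.258, 0.742, 0.484),
    ("B_REM_med_stim", 0.022, 0.972, 0.950),
    ("B_AM_med_stim", 0.261, 0.739, 0.479),
    ("B_DN_stim", 0.084, 0.916, 0.832),
    ("T_TFH_med_stim", 0.210, 0.790, 0.580),
]

def probability_range(lower, upper):
    """ASSUMPTION: range = sixth column minus fifth column."""
    return upper - lower

# Rank categories from the widest to the narrowest range
ranked = sorted(rows, key=lambda r: probability_range(r[1], r[2]), reverse=True)
for rank, (name, lower, upper, printed) in enumerate(ranked, start=1):
    computed = probability_range(lower, upper)
    print(f"{rank} {name:<15} computed={computed:.3f} printed={printed:.3f}")
```

Sorting by the computed range reproduces ranks 1 to 5 of the final column; small differences in the third decimal are consistent with the printed values having been rounded.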
Cross-validation of the model.
| Patient | P(Non-responder) | P(Responder) | Predicted | Observed |
| H19 | 0.639 | 0.361 | Non.Responder | Non.Responder |
| H26 | 0.288 | 0.712 | Responder | Responder |
| H3 | 0.445 | 0.555 | Responder | Responder |
| H37 | 0.669 | 0.331 | Non.Responder | Non.Responder |
| H38 | 0.613 | 0.387 | Non.Responder | Non.Responder |
| H40 | 0.683 | 0.317 | Non.Responder | Non.Responder |
| H41 | 0.613 | 0.387 | Non.Responder | Responder |
| H44 | 0.331 | 0.669 | Responder | Responder |
| H46 | 0.683 | 0.317 | Non.Responder | Non.Responder |
| H47 | 0.443 | 0.557 | Responder | Responder |
| H48 | 0.712 | 0.288 | Non.Responder | Non.Responder |
| H52 | 0.403 | 0.597 | Responder | Responder |
| H55 | 0.683 | 0.317 | Non.Responder | Non.Responder |
| H56 | 0.397 | 0.603 | Responder | Responder |
| H58 | 0.617 | 0.383 | Non.Responder | Non.Responder |
| H60 | 0.712 | 0.288 | Non.Responder | Non.Responder |
| H69 | 0.683 | 0.317 | Non.Responder | Non.Responder |
| H7 | 0.443 | 0.557 | Responder | Responder |
| H70 | 0.712 | 0.288 | Non.Responder | Non.Responder |
| H75 | 0.356 | 0.644 | Responder | Responder |
| H8 | 0.473 | 0.527 | Responder | Responder |
| H80 | 0.397 | 0.603 | Responder | Responder |
| H83 | 0.683 | 0.317 | Non.Responder | Non.Responder |
Predicted and observed outcomes are listed for all patients.
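The reported accuracy can be recomputed from the cross-validation table. The sketch below reads the third column as the responder probability and the last column as the observed outcome (an interpretation inferred from the values), and applies the 0.50 threshold from the figure captions.

```python
# (patient, P(responder), observed outcome) transcribed from the cross-validation table
table = [
    ("H19", 0.361, "NR"), ("H26", 0.712, "R"), ("H3", 0.555, "R"),
    ("H37", 0.331, "NR"), ("H38", 0.387, "NR"), ("H40", 0.317, "NR"),
    ("H41", 0.387, "R"), ("H44", 0.669, "R"), ("H46", 0.317, "NR"),
    ("H47", 0.557, "R"), ("H48", 0.288, "NR"), ("H52", 0.597, "R"),
    ("H55", 0.317, "NR"), ("H56", 0.603, "R"), ("H58", 0.383, "NR"),
    ("H60", 0.288, "NR"), ("H69", 0.317, "NR"), ("H7", 0.557, "R"),
    ("H70", 0.288, "NR"), ("H75", 0.644, "R"), ("H8", 0.527, "R"),
    ("H80", 0.603, "R"), ("H83", 0.317, "NR"),
]

def predict(p_responder, threshold=0.5):
    """Responder (R) above the threshold, non-responder (NR) otherwise."""
    return "R" if p_responder > threshold else "NR"

misclassified = [pid for pid, p, observed in table if predict(p) != observed]
correct = len(table) - len(misclassified)
accuracy = correct / len(table)

# Sensitivity and specificity with "responder" as the positive class
true_pos = sum(1 for _, p, obs in table if obs == "R" and predict(p) == "R")
true_neg = sum(1 for _, p, obs in table if obs == "NR" and predict(p) == "NR")
sensitivity = true_pos / sum(1 for _, _, obs in table if obs == "R")
specificity = true_neg / sum(1 for _, _, obs in table if obs == "NR")

print(misclassified, f"{correct}/{len(table)}", f"{accuracy:.4f}")
```

This gives 22 of 23 patients correctly classified (about 0.9565), in line with the reported accuracy of 95.6% and the single misclassified patient, H41, an observed responder predicted as a non-responder.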
List of the genes used to build the five models with the highest prediction accuracy.
| REM med_stim | |
| DN stim | |
| TFH med_stim | |
| PBMC med | |
| DN med_stim |
Red: up-regulated genes for the med and stim categories (DN stim, PBMC med) and genes with the highest Δ (stim-med) value in the Responders for the med-stim categories (REM med-stim, TFH med-stim, DN med-stim). Blue: down-regulated genes for the med and stim categories and genes with the lowest Δ (stim-med) value in the Responders for the med-stim categories (REM med-stim, TFH med-stim, DN med-stim).
Figure 5. Enrichment analysis performed for REM med-stim and TFH med-stim genes on GO terms and pathways. (A) Bar plots of the top 10 terms sorted by p-value. (B) Cytokine-cytokine receptor KEGG map showing the REM med-stim genes in red and the DN stim gene in blue.