Michael T Zimmermann1,2, Richard B Kennedy2, Diane E Grill1,2, Ann L Oberg1,2, Krista M Goergen1, Inna G Ovsyannikova2, Iana H Haralambieva2, Gregory A Poland2.
Abstract
The development of a humoral immune response to influenza vaccines occurs on a multisystems level. Due to the orchestration required for robust immune responses when multiple genes and their regulatory components across multiple cell types are involved, we examined an influenza vaccination cohort using multiple high-throughput technologies. In this study, we sought a more thorough understanding of how immune cell composition and gene expression relate to each other and contribute to interindividual variation in response to influenza vaccination. We first hypothesized that many of the differentially expressed (DE) genes observed after influenza vaccination result from changes in the composition of participants' peripheral blood mononuclear cells (PBMCs), which were assessed using flow cytometry. We demonstrated that DE genes in our study are correlated with changes in PBMC composition. We gathered DE genes from 128 other publicly available PBMC-based vaccine studies and identified that an average of 57% correlated with specific cell subset levels in our study (permutation used to control false discovery), suggesting that the associations we have identified are likely general features of PBMC-based transcriptomics. Second, we hypothesized that more robust models of vaccine response could be generated by accounting for the interplay between PBMC composition, gene expression, and gene regulation. We employed machine learning to generate predictive models of B-cell ELISPOT response outcomes and hemagglutination inhibition (HAI) antibody titers. The top HAI and B-cell ELISPOT models achieved an area under the receiver operating curve (AUC) of 0.64 and 0.79, respectively, with linear model coefficients of determination of 0.08 and 0.28. For the B-cell ELISPOT outcomes, CpG methylation had the greatest predictive ability, highlighting potentially novel regulatory features important for immune response. B-cell ELISPOT models using only PBMC composition had lower performance (AUC = 0.67), but highlighted well-known mechanisms. Our analysis demonstrated that each of the three data sets (cell composition, mRNA-Seq, and DNA methylation) may provide distinct information for the prediction of humoral immune response outcomes. We believe that these findings are important for the interpretation of current omics-based studies and set the stage for a more thorough understanding of interindividual immune responses to influenza vaccination.
Keywords: cell sorting; data mining; differential expression; immunology; influenza vaccine; machine learning; methylation
Year: 2017 PMID: 28484452 PMCID: PMC5399034 DOI: 10.3389/fimmu.2017.00445
Source DB: PubMed Journal: Front Immunol ISSN: 1664-3224 Impact factor: 8.786
Figure 1. Correlations among cell subset levels across subjects. We present a heatmap of Spearman’s correlation coefficients among cell subsets. Each cell in the matrix is the correlation between the corresponding two Flow markers, across subjects. The matrix is symmetric; column labels are omitted for brevity. Row order was determined using hierarchical clustering. Cell subsets are either directly named or labeled by the surface markers used. A forward slash indicates a fraction. For example, the first row indicates the fraction of CD20-positive cells that are IgD positive and CD27 negative.
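The Spearman correlations summarized in Figure 1 can be illustrated with a minimal sketch. This is not the authors' code: the subset names and per-subject values below are hypothetical, the helper names are illustrative, and the hierarchical clustering used to order the rows is not reproduced here.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def correlation_matrix(subsets):
    """Symmetric matrix of Spearman correlations between cell subsets."""
    names = list(subsets)
    return {a: {b: spearman(subsets[a], subsets[b]) for b in names} for a in names}


# Hypothetical subset levels (one value per subject), for illustration only.
toy_levels = {
    "B cells": [10.0, 12.0, 9.0, 15.0, 11.0],
    "Monocytes": [20.0, 18.0, 22.0, 14.0, 19.0],
    "T cells": [55.0, 60.0, 50.0, 62.0, 58.0],
}
matrix = correlation_matrix(toy_levels)
```

Each entry `matrix[a][b]` corresponds to one cell of such a heatmap; because the matrix is symmetric, only one triangle carries independent information.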
Figure 2. The distribution of correlations with gene expression differs for each flow cytometry feature. (A) After filtering relationships with low statistical significance using permutation, each cell subset shows positive and negative associations with many genes; each row corresponds to a cell subset. Correlation magnitude is shown along the abscissa and probability density along the ordinate. (B) As examples to demonstrate how genes with strong negative correlations with one subset have strong positive correlations with another, we selected genes with correlation ≤−0.4 with any cell subset and plotted their associations across all subsets. Each of the selected genes is represented by a line connecting its correlation values across subsets. The same strong trend is observed when selecting genes with a positive correlation coefficient and for smaller magnitudes (not shown).
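The caption mentions filtering weak relationships using permutation without giving the procedure. One common form of permutation testing for a single gene and cell subset pair is sketched below; the function names, the number of permutations, and the toy values are assumptions for illustration and should not be read as the study's actual pipeline.

```python
import random


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the correlation of x and y.

    The subject labels of y are shuffled to build a null distribution of
    |r|; the +1 terms keep the estimate strictly above zero.
    """
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical values for ten subjects: one cell subset level and one gene.
subset_level = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
gene_expression = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 20.2]
p_value = permutation_p_value(subset_level, gene_expression)
```

Pairs whose p-value exceeds a chosen threshold would be dropped before plotting the distributions; a study-wide analysis would additionally need multiple-testing control, which this sketch omits.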
Figure 3. Comparison between expression levels in human peripheral blood mononuclear cells (PBMCs) and sorted cell subsets. We performed fluorescence-activated cell sorting for 10 patient samples, and mRNA-Seq was assayed on three sorted cell subsets: monocytes, T-cells, and B-cells. In the first row, we show the relationship between gene expression levels in each cell subset versus PBMCs from the same patient samples, across the most variable quartile of the transcriptome. In the second row, we calculate the difference in expression (ΔExpr) between PBMCs and each sorted cell subset; the probability density of ΔExpr across genes is plotted. These data confirmed the trends observed from data generated on PBMCs: genes correlating with levels of a cell subset according to Flow are expressed to a higher degree in that cell subset than in PBMCs and often than in other cell subsets.
Figure 4. The potential impact of accounting for immune cell composition on the interpretation of peripheral blood mononuclear cell (PBMC)-derived gene expression profiles. We quantified the fraction of differentially expressed (DE) genes from publicly available studies for which the same gene was strongly correlated with an immune cell subset in our study. (A) Across 128 publicly available human vaccine-related data sets where data were produced from PBMCs, we identified DE genes and determined the percent of those DE genes that are significantly associated with Flow-derived cell subset levels in our data set (%DEFlow). On average, 57% of the DE genes from external PBMC-derived samples were associated with a change in cell subset level in our study. (B) Recurrence analysis of DE genes across these studies highlights that underlying changes in PBMC composition could be driving many of the most important transcriptomic changes across these studies.
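The %DEFlow quantity described in panel (A) is, in essence, a set overlap expressed as a percentage. A minimal sketch follows, with placeholder gene and study names that are purely hypothetical; how DE genes and Flow-associated genes were actually called is not reproduced here.

```python
def percent_de_flow(de_genes, flow_associated_genes):
    """Percent of a study's DE genes that are also Flow-associated."""
    de = set(de_genes)
    if not de:
        raise ValueError("a study needs at least one DE gene")
    return 100.0 * len(de & set(flow_associated_genes)) / len(de)


def mean_percent_de_flow(studies, flow_associated_genes):
    """Average %DEFlow across studies (mapping: study name -> DE genes)."""
    values = [percent_de_flow(genes, flow_associated_genes)
              for genes in studies.values()]
    return sum(values) / len(values)


# Placeholder identifiers, not real results.
flow_genes = {"GENE_A", "GENE_B", "GENE_C"}
external_studies = {
    "study_1": ["GENE_A", "GENE_B", "GENE_X", "GENE_Y"],  # 2 of 4 overlap
    "study_2": ["GENE_C"],                                # 1 of 1 overlaps
}
per_study = {name: percent_de_flow(genes, flow_genes)
             for name, genes in external_studies.items()}
average = mean_percent_de_flow(external_studies, flow_genes)
```

In this toy input the per-study values are 50% and 100%, giving an average of 75%; the 57% reported in the caption is the analogous average over the 128 external data sets.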
Performance of predictive models of B-cell ELISPOT using combinations of data types.
| Input data | | | Feature selection | | Continuous prediction | | Discrete prediction | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Flow | mRNA | CpG | | | LM | | LM | | sens | spec | AUC |
| | | | 27 | 3 | 0.07 | 1.05 × 10−3 | 0.67 | 1.59 × 10−4 | 0.78 | 0.53 | 0.67 |
| | | | 151 | 6 | 0.00 | 7.73 × 10−1 | 0.01 | 7.60 × 10−1 | 0.75 | 0.35 | 0.51 |
| | | | 31 | 1 | 0.00 | 4.81 × 10−1 | 0.02 | 3.68 × 10−1 | 0.30 | 0.79 | 0.52 |
| | | | 63 | 2 | 0.04 | 1.83 × 10−2 | 0.11 | 2.61 × 10−1 | 0.44 | 0.69 | 0.55 |
| | | | 10 | 3 | 0.00 | 5.38 × 10−1 | 0.00 | 8.40 × 10−2 | 0.90 | 0.22 | 0.55 |
| | | | 72 | 29 | 0.23 | 1.81 × 10−10 | 1.13 | 1.21 × 10−10 | 0.76 | 0.73 | 0.78 |
| | | | 178 | 8 | 0.03 | 2.74 × 10−2 | 0.52 | 4.91 × 10−3 | 0.58 | 0.68 | 0.63 |
| | | | 58 | 3 | 0.04 | 1.14 × 10−2 | 0.44 | 3.67 × 10−3 | 0.65 | 0.62 | 0.63 |
| | | | 90 | 4 | 0.03 | 4.22 × 10−2 | 0.49 | 9.25 × 10−3 | 0.58 | 0.67 | 0.62 |
| | | | 37 | 3 | 0.06 | 1.97 × 10−3 | 0.68 | 1.38 × 10−4 | 0.65 | 0.65 | 0.68 |
| | | | 99 | 31 | 0.22 | 5.77 × 10−10 | 1.04 | 4.29 × 10−11 | 0.81 | 0.67 | 0.79 |
| | | | 250 | 35 | 0.12 | 1.22 × 10−5 | 0.84 | 3.03 × 10−5 | 0.63 | 0.69 | 0.69 |
| | | | 130 | 27 | 0.17 | 5.42 × 10−8 | 1.06 | 4.29 × 10−9 | 0.82 | 0.63 | 0.76 |
| | | | 162 | 32 | 0.28 | 1.63 × 10−12 | 1.08 | 2.26 × 10−11 | 0.70 | 0.76 | 0.79 |
| | | | 109 | 29 | 0.19 | 8.96 × 10−9 | 1.01 | 4.04 × 10−9 | 0.71 | 0.74 | 0.76 |
| | | | 50 | 32 | 0.22 | 7.42 × 10−10 | 0.84 | 2.59 × 10−7 | 0.62 | 0.76 | 0.73 |
| | | | 77 | 23 | 0.18 | 3.05 × 10−8 | 0.88 | 4.47 × 10−7 | 0.72 | 0.68 | 0.72 |
| | | | 14 | 10 | 0.15 | 8.49 × 10−7 | 0.75 | 9.68 × 10−7 | 0.72 | 0.65 | 0.72 |
| | | | 23 | 13 | 0.13 | 3.13 × 10−6 | 0.80 | 1.27 × 10−6 | 0.68 | 0.65 | 0.71 |
| | | | 41 | 10 | 0.13 | 3.27 × 10−6 | 0.83 | 9.51 × 10−7 | 0.62 | 0.74 | 0.72 |
| | | | 50 | 7 | 0.01 | 2.36 × 10−1 | −0.19 | 3.06 × 10−1 | 0.61 | 0.56 | 0.56 |
| | | | 77 | 3 | 0.06 | 1.97 × 10−3 | 0.43 | 2.80 × 10−3 | 0.37 | 0.86 | 0.64 |
| | | | 14 | 2 | 0.03 | 2.24 × 10−2 | 0.30 | 2.77 × 10−2 | 0.59 | 0.62 | 0.60 |
| | | | 23 | 3 | 0.02 | 9.50 × 10−2 | 0.29 | 4.88 × 10−2 | 0.85 | 0.32 | 0.59 |
| | | | 41 | 3 | 0.05 | 3.28 × 10−3 | 0.60 | 2.88 × 10−4 | 0.80 | 0.50 | 0.67 |
AUC, area under the receiver operating curve.
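For readers less familiar with the discrete-prediction columns, the sketch below shows how sensitivity, specificity, and AUC can be computed from binary outcomes and model scores. The labels, scores, and the 0.5 threshold are hypothetical, and the AUC here uses the rank-based (Mann-Whitney) formulation, which may differ from the implementation used in the study.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Probability that a random positive scores above a random negative.

    Ties count as one half; this equals the area under the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = high responder) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cutoff

sens, spec = sensitivity_specificity(labels, predictions)
area = auc(labels, scores)
```

Sensitivity and specificity depend on the chosen cutoff, whereas AUC summarizes ranking quality across all cutoffs, which is why a model can show unbalanced sens and spec values alongside a moderate AUC.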
Annotation of CpGs recurrently used in classification of B-cell ELISPOT outcomes.
| Illumina ID | Context | NC | Promoter/Body | DNAse | #TF | Transcription factor binding site (TFBS) |
|---|---|---|---|---|---|---|
| cg06739303 | S_Shore | 6 | LOC441666 | x | 46 | ELF1, FOS, GABPA |
| cg17959722 | Island | 6 | PNPLA7 | x | 28 | E2F1, POLR2A, SIN3A |
| cg19566405 | | 6 | SLFN12 | x | 16 | ZNF263, FOS, JUND |
| cg00310523 | | 6 | RASSF9 | x | 11 | CEBPB, TBP, TCF7L2 |
| cg18963800 | N_Shore | 6 | HSD17B7P2 | x | 7 | CEBPB, POLR2A |
| cg21384492 | Island | 6 | SNED1 | | 3 | E2F1, POLR2A, SIN3A |
| cg15878909 | | 5 | FAM90A1 | x | 9 | MAX, POLR2A, RAD51 |
| cg00785941 | Island | 5 | OR2L13 | | 31 | CTCF, ZNF263, ELF1 |
| cg15633073 | Island | 11 | ZNF536 | x | 0 | |
| cg20550154 | | 6 | NID2 | x | 8 | EP300, NFE2, ZNF384 |
| cg00367615 | Island | 6 | MEDAG | x | 1 | EZH2 |
| cg04681845 | | 6 | FMNL2 | x | 1 | MYC |
| cg18396987 | S_Shore | 6 | SYCP1 | | 1 | EZH2 |
| cg11430096 | S_Shore | 6 | CDK19 | | 0 | |
| cg18498565 | | 3 | PFKP | x | 8 | CEBPB, FOS, EP300 |
| cg08065408 | N_Shore | 11 | | x | 5 | NFYB, RFX5, ZBTB40 |
| cg14521995 | S_Shore | 11 | | x | 0 | |
| cg03532030 | S_Shore | 10 | | x | 27 | MAX, POLR2A, SPI1 |
| cg02599498 | | 6 | | x | 69 | EP300, JUND, MYC |
| cg15203566 | Island | 6 | | x | 7 | RAD21, TBP, EZH2 |
| cg17292337 | | 6 | | x | 6 | E2F6, L3MBTL2, EZH2 |
| cg11757417 | | 6 | | x | 0 | |
| cg06470855 | Island | 6 | | x | 0 | |
| cg19510820 | | 6 | | | 1 | MAFK |
| cg16005559 | S_Shore | 5 | | x | 0 | |
| cg18307968 | | 5 | | | 5 | CTCF, MAFK, MAFF |
| cg06134410 | Island | 3 | | x | 2 | E2F6, UBTF |
| cg03121508 | | 3 | | | 1 | EZH2 |