G Provost, F B Lavoie, A Larbi, T P Ng, C Tan Tze Ying, M Chua, T Fulop, A A Cohen.
Abstract
Traditionally, the immune system is understood to be divided into discrete cell types that are identified via surface markers. While some cell type distinctions are no doubt discrete, others may in fact vary on a continuum, and even within discrete types, differences in surface marker abundance could have functional implications. Here we propose a new way of looking at immune data: looking directly at the values of the surface markers without dividing the cells into different subtypes. To assess the merit of this approach, we compared it with manual gating using cytometry data from the Singapore Longitudinal Aging Study (SLAS) database. We used two different neural networks (one for each method) to predict the presence of several health conditions. We found that the model built using raw surface marker abundance outperformed the manual gating one, and we were able to identify some markers that contributed more to the predictions. This study is intended as a brief proof-of-concept and was not designed to predict health outcomes in an applied setting; nonetheless, it demonstrates that alternative methods to understand the structure of immune variation hold substantial promise.
Keywords: Complex system; Immunology; Neural network
Year: 2022 PMID: 35927749 PMCID: PMC9351261 DOI: 10.1186/s12979-022-00291-y
Source DB: PubMed Journal: Immun Ageing ISSN: 1742-4933 Impact factor: 9.701
Sample characteristics
| Characteristic | Value |
|---|---|
| Age | |
| Mean ± SD | 67.1 ± 7.5 |
| Range (min–max) | 55–89 |
| Sex | |
| M (%) | 337 (39) |
| F (%) | 527 (61) |
| Mortality | 33 (5.8) |
| Self-assessed health | |
| 1 (%) – better health | 7 (1.2) |
| 2 (%) | 88 (15.5) |
| 3 (%) | 334 (58.9) |
| 4 (%) | 134 (23.6) |
| 5 (%) – worst health | 4 (0.7) |
| Frailty | |
| 0 (%) | 261 (46) |
| 1 (%) | 184 (32.5) |
| 2 (%) | 81 (14.3) |
| 3 (%) | 35 (6.2) |
| 4 (%) | 5 (0.9) |
| 5 (%) | 1 (0.2) |
| MMSE | 27.8 ± 2.8 |
| Comorbidity | 2.4 ± 1.6 |
| High blood pressure | 245 (43.2) |
| High cholesterol | 263 (46.4) |
| Diabetes | 77 (13.6) |
| Stroke | 23 (4) |
| Heart attack | 30 (5.3) |
| Atrial fibrillation | 19 (3.4) |
| Eye problem | 175 (30.1) |
| Asthma | 28 (4.9) |
| Arthritis | 80 (14.1) |
| Osteoporosis | 28 (4.9) |
| Gastrointestinal problem | 50 (8.8) |
| Thyroid problem | 28 (4.9) |
| Cancer | 19 (3.4) |
| Depression | 18 (3.2) |
Fig. 1 Representation of the models. A The continuous model. B The gated model
Fig. 2 Example distributions of four of the surface markers tested in this article. A Distribution of CD3, used to identify T cells. B Distribution of CD38, used to identify B cell subsets. C Distribution of CD45RO, used to identify memory T cells. D Distribution of CD161, which can help define various T cell subsets
Averages and standard deviations of the RMSE on the validation set over 100 separate runs of the non-gated (continuous) and gated models for each health measure tested, together with the mean value of each measure. Health measures for which the models made successful predictions (RMSE < mean/3) are in bold
| Health measure | Continuous RMSE (average) | Continuous RMSE (SD) | Gated RMSE (average) | Gated RMSE (SD) | Mean |
|---|---|---|---|---|---|
| Mortality | 0.255 | 0.022 | 0.264 | 0.022 | 0.058 |
| Religion | 2.208 | 0.104 | 2.335 | 0.109 | 2.260 |
| Frailty | 1.154 | 0.048 | 1.302 | 0.057 | 0.830 |
| Comorbidity | 1.982 | 0.101 | 2.291 | 0.127 | 2.322 |
| High blood pressure | 0.616 | 0.029 | 0.679 | 0.034 | 0.432 |
| High cholesterol | 0.626 | 0.024 | 0.714 | 0.037 | 0.464 |
| Diabetes | 0.413 | 0.026 | 0.481 | 0.034 | 0.136 |
| Stroke | 0.234 | 0.020 | 0.243 | 0.02 | 0.040 |
| Heart attack | 0.277 | 0.021 | 0.305 | 0.025 | 0.053 |
| Atrial fibrillation | 0.220 | 0.022 | 0.234 | 0.017 | 0.034 |
| Eye problem | 0.594 | 0.022 | 0.611 | 0.032 | 0.301 |
| Asthma | 0.263 | 0.020 | 0.298 | 0.032 | 0.049 |
| Arthritis | 0.428 | 0.022 | 0.481 | 0.023 | 0.141 |
| Osteoporosis | 0.277 | 0.035 | 0.301 | 0.027 | 0.049 |
| Gastrointestinal problem | 0.352 | 0.026 | 0.367 | 0.037 | 0.088 |
| Thyroid problem | 0.275 | 0.027 | 0.287 | 0.027 | 0.049 |
| Cancer | 0.219 | 0.025 | 0.216 | 0.017 | 0.034 |
| Depression | 0.214 | 0.020 | 0.241 | 0.047 | 0.032 |
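The success criterion stated in the table caption (RMSE < mean/3) can be checked mechanically. The following is a minimal sketch in plain Python, not the authors' code: the function names are hypothetical, the use of the sample standard deviation across runs is an assumption, and the example rows are copied from the table above.

```python
import math
from statistics import mean, stdev


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def summarize_runs(run_rmses):
    """Average and sample SD of the RMSE over repeated runs (assumed: sample SD)."""
    return mean(run_rmses), stdev(run_rmses)


def is_successful(avg_rmse, measure_mean):
    """Criterion from the table caption: RMSE < mean / 3."""
    return avg_rmse < measure_mean / 3


# (continuous RMSE, gated RMSE, mean of the measure), copied from the table.
TABLE_ROWS = {
    "Mortality": (0.255, 0.264, 0.058),
    "Frailty": (1.154, 1.302, 0.830),
    "Comorbidity": (1.982, 2.291, 2.322),
    "Cancer": (0.219, 0.216, 0.034),
}

for name, (continuous, gated, measure_mean) in TABLE_ROWS.items():
    lower_rmse = "continuous" if continuous < gated else "gated"
    print(
        name,
        is_successful(continuous, measure_mean),
        is_successful(gated, measure_mean),
        lower_rmse,
    )
```

None of the four example rows meets the criterion for either model, which is consistent with those rows not being marked as successful predictions in the table.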
Fig. 3 Comparison of the errors of the predictions between the continuous and the gated models for Age, Self-assessed health, and MMSE. A, B and C Scatter plots of the difference between the observed value and the predicted value for Age, Self-assessed health, and MMSE respectively. D Violin plot of the difference between the observed value and the predicted value, with the middle bar representing the median
Fig. 4 Values from the last layer of the non-gated model (informative layer) for successfully predicted outcomes, representing the contribution of that specific marker to the overall prediction. The middle square represents the mean value obtained for the 100 separate runs of the model and the bars are the standard error
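The mean and standard error across runs described in the Fig. 4 caption could be computed along these lines. This is a hedged sketch only: the function name and the example weights are hypothetical, and the standard error is assumed to be the sample standard deviation divided by the square root of the number of runs.

```python
import math
from statistics import mean, stdev


def mean_and_standard_error(values):
    """Return (mean, standard error) for one marker's values across runs.

    Assumes standard error = sample SD / sqrt(number of runs).
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two runs are needed for a standard error")
    return mean(values), stdev(values) / math.sqrt(n)


# Hypothetical last-layer values for one marker over a handful of runs.
example_weights = [0.12, 0.15, 0.09, 0.14, 0.11]
marker_mean, marker_se = mean_and_standard_error(example_weights)
print(f"mean={marker_mean:.3f}, standard error={marker_se:.3f}")
```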