Gerjen H Tinnevelt1,2, Selma van Staveren3,4, Kristiaan Wouters5, Erwin Wijnands6, Kenneth Verboven7,8, Rita Folcarelli9, Leo Koenderman4, Lutgarde M C Buydens9, Jeroen J Jansen9.
Abstract
Multicolour flow cytometry (MFC) is used to measure multiple cellular markers at the single-cell level. Cellular markers may be coloured with different panels of fluorescently-labelled antibodies to enable cell identification or the detection of activated cells in pre-defined, 'gated' specific cell subsets. The number of markers that can be used per measurement is technologically limited, however, requiring every panel to be analysed in a separate aliquot measurement. The combined analyses of these dedicated panels may enhance the predictive ability of these measurements and could enrich the interpretation of the immunological information. Here we introduce a fusion method for MFC data, based on DAMACY (Discriminant Analysis of Multi-Aspect Cytometry data), which can combine information from complementary panels. This approach leads to both enhanced predictions and clearer interpretations in comparison with the analysis of separate measurements. We illustrate this method using two datasets: the response of neutrophils evoked by a systemic endotoxin challenge and the activated immune status of the innate cells, T cells and B cells in obese versus lean individuals. The data fusion approach was able to detect cells that do not individually show a difference between clinical phenotypes but do play a role in combination with other cells.
Year: 2019 PMID: 31043667 PMCID: PMC6494873 DOI: 10.1038/s41598-019-43166-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Cross-validated performance of the models for each dataset and the fusion of all three datasets.
| Dataset | Accuracy | Sensitivity | Specificity |
|---|---|---|---|
| Aliquot 1 | 100% | 100% | 99% |
| Aliquot 2 | 97% | 98% | 97% |
| Aliquot 3 | 97% | 100% | 94% |
| Fusion | 100% | 100% | 99% |
Figure 1. Fusion model of all the LPS aliquots after variable selection. The left panel (a) shows the LPS response samples in blue and the control samples in red. If the prediction score was above the threshold, the sample was classified as an LPS responder. The three panels on the right (b–d) show the weights in the model of, respectively, aliquots 1, 2 and 3. Positive weights are coloured blue and belong to cells more represented in the LPS response samples, while negative weights are coloured red and belong to cells more abundant in the control samples. The arrows show the loadings and thus the marker expression. CM indicates mature neutrophils more abundant in the controls, while LM and LI respectively indicate mature and immature neutrophils more abundant in the LPS responders.
Performance of the DAMACY models on each dataset in the obese versus lean study, and the fusion of all three datasets.
| Dataset | Accuracy | Sensitivity | Specificity | p-value | %increase |
|---|---|---|---|---|---|
| B cells | 71% | 80% | 59% | <12/1000 | 1.5% |
| T cells | 79% | 86% | 70% | <1/1000 | 4.5% |
| Innate | 75% | 75% | 75% | <9/1000 | 11.8% |
| Data fusion | 81% | 82% | 80% | <2/1000 | |
Accuracy indicates the percentage of correctly classified samples in the cross-validation study, sensitivity reflects accuracy in identifying the obese samples and specificity indicates the ability to detect control (lean) samples. The p-value is the fraction of the 1000 permutations that yielded a higher prediction accuracy. The %increase values reflect how much the accuracy would increase if that dataset were included in the data fusion model, compared with a fusion model comprising only the two other datasets.
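The three performance measures defined above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `classification_metrics`, the label coding (1 = obese, 0 = lean) and the example labels are all assumptions made for the example.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for binary labels.

    Assumed coding (not from the source): 1 = obese, 0 = lean (control).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # obese predicted obese
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # lean predicted lean
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # lean predicted obese
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # obese predicted lean

    def ratio(num: int, den: int) -> float:
        # Return NaN instead of raising when a class is absent.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical cross-validated predictions for eight individuals.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy example one obese and one lean individual are misclassified, so all three measures come out at 0.75.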
Figure 2. Fusion model after variable selection. The left panel (a) shows the obese individuals (blue crosses) and the lean controls (red circles). If the predicted value was above the threshold, the individuals were classified as obese. The three panels on the right (b–d) present the weights in the model of, respectively, the B cell, T cell and innate datasets. Areas coloured blue belong to cells more abundant in obese individuals, while those in red belong to cells more common in lean (control) individuals. The blue contours show where, on average, 80% of the cells of the obese individuals lie, while the red contours indicate 80% of the cells in the lean individuals.
Cell populations found in Fig. 2.
| Aliquot | Letter | Cell type | Marker expression | More abundant in |
|---|---|---|---|---|
| B cells | A | B cells | CD38+CD45RO−CD25−CD127− | Lean |
| B cells | B | B cells | CD38+CD45RO+CD25+CD19++CD127dim | Obese |
| T cells | C | CD4+ CD8− | CD38+CD45RO−CD25−CD28+CD3++ | Lean |
| T cells | D | CD4+ CD8− | CD38−CD45RO+CD25+CD127+CD28++CD3+ | Obese |
| Innate | E | NK cells | CD56+CD16+CD14−HLADR−CD11bdimCD11cdimCX3CR1dim | Lean |
| Innate | F | NK cells | CD56++CD16++CD14−HLADR−CD11bdim+CD11cdim+CX3CR1dim+ | Obese |
| Innate | G | Classical monocytes | CD56−CD16−CD14+HLADR+CD11b+CD11cdim+CX3CR1dim | Lean |
| Innate | H | Activated classical monocytes | CD56−CD16−CD14+HLADR++CD11b++CD11c+CX3CR1+ | Obese |
| Innate | I | Activated non-classical monocytes | CD56−CD16++CD14dimHLADR++CD11bdim+CD11c++CX3CR1++ | Obese |
| Innate | J | plasmacytoid dendritic cells? | CD11cdimCD11bdim+, rest negative | Lean |
The expression of the markers is summarised in five categories, from lowest to highest: −, dim, dim+, +, ++. The letters correspond to those in Fig. 2.
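Because the five expression categories are ordered, a marker-expression string from the table can be turned into ordinal values for comparison between populations. The following is only a sketch under stated assumptions: the helper `parse_marker_expression`, the 0 to 4 rank coding and the regular expression are illustrative and not part of the original analysis, and free-text entries such as "rest negative" are simply ignored.

```python
import re
from typing import Dict

MINUS = "\u2212"  # the minus sign used in the table (U+2212, not an ASCII hyphen)

# Categories from lowest to highest, as listed in the table footnote.
LEVELS = [MINUS, "dim", "dim+", "+", "++"]

# Marker name (shortest possible match) followed by one expression category.
# Longer categories are listed first so "dim+" and "++" win over "dim" and "+".
_PATTERN = re.compile(r"([A-Za-z0-9]+?)(\+\+|dim\+|dim|\+|" + MINUS + ")")


def parse_marker_expression(text: str) -> Dict[str, int]:
    """Map each marker to an ordinal rank: 0 (negative) up to 4 (++)."""
    return {marker: LEVELS.index(level) for marker, level in _PATTERN.findall(text)}


# Population I from the table (activated non-classical monocytes).
population_i = (
    f"CD56{MINUS}CD16++CD14dimHLADR++CD11bdim+CD11c++CX3CR1++"
)
ranks = parse_marker_expression(population_i)
print(ranks)
```

With ranks in hand, two populations can be compared marker by marker, for example to list the markers whose category differs between the lean and obese monocyte populations.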
Performance of the different methods on all three datasets of the lean versus obese model.
| Method | Accuracy | Sensitivity | Specificity | p-value |
|---|---|---|---|---|
| Fusion with DAMACY | 81% | 82% | 80% | <2/1000 |
| SOM[ | 81% | 66% | 94% | <1/1000 |
| DAMACY base[ | 77% | 83% | 71% | <1/1000 |
| Admire-LVQ[ | 73% | 75% | 71% | <13/1000 |
| SOM[ | 73% | 73% | 74% | <3/1000 |
| SOM[ | 71% | 66% | 77% | <14/1000 |
Accuracy indicates the percentage of correctly classified samples in the cross-validation study, sensitivity reflects accuracy in identifying the obese samples and specificity indicates the ability to detect control (lean) samples. The p-value is the fraction of the 1000 permutations that yielded a higher prediction accuracy.
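The permutation p-value described in the footnote can be sketched as follows, assuming that the class labels are shuffled and the accuracy is recomputed for each permutation. The function `permutation_p_value` and the callable `score_fn` are hypothetical: `score_fn` stands in for whatever procedure returns the cross-validated accuracy for a given labelling, which in the study would involve refitting the model. The toy scorer below merely thresholds a made-up feature.

```python
import random
from typing import Callable, List, Sequence


def permutation_p_value(
    labels: Sequence[int],
    score_fn: Callable[[List[int]], float],
    n_permutations: int = 1000,
    seed: int = 0,
) -> float:
    """Fraction of label permutations whose accuracy exceeds the observed one."""
    rng = random.Random(seed)
    observed = score_fn(list(labels))
    shuffled = list(labels)
    higher = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break the link between samples and labels
        if score_fn(shuffled) > observed:
            higher += 1
    return higher / n_permutations


# Toy stand-in for a model: predict class 1 when a made-up feature exceeds 0.5.
features = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]


def toy_accuracy(candidate_labels: List[int]) -> float:
    predictions = [1 if x > 0.5 else 0 for x in features]
    hits = sum(1 for p, t in zip(predictions, candidate_labels) if p == t)
    return hits / len(candidate_labels)


p_value = permutation_p_value(labels, toy_accuracy)
print(p_value)
```

A small fraction indicates that the observed accuracy is unlikely to arise when the labels carry no information about the samples.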
Figure 3. Self-organising maps of, respectively, the B cell (a), T cell (b) and innate cell (c) datasets. The relative marker expression of a node is depicted as a pie chart. Blue shading behind the node indicates cell populations more abundant in obese individuals, while those with red shading contain cells less abundant in obese individuals, as predicted with SVM.