Vasileios C Pezoulas, Konstantina D Kourou, Eugenia Mylona, Costas Papaloukas, Angelos Liontos, Dimitrios Biros, Orestis I Milionis, Chris Kyriakopoulos, Kostantinos Kostikas, Haralampos Milionis, Dimitrios I Fotiadis.
Abstract
Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2), continues to place a heavy burden on the global healthcare system due to its increased transmissibility. There is an urgent unmet need to identify the underlying dynamic associations among COVID-19 patients and to distinguish patient subgroups with common clinical profiles, towards the development of robust classifiers for ICU admission and mortality. To address this need, we propose a four-step pipeline which: (i) enhances the quality of multiple time-series clinical data through an automated data curation workflow, (ii) deploys Dynamic Bayesian Networks (DBNs) to detect features with increased connectivity based on dynamic association analysis across multiple time points, (iii) utilizes Self-Organizing Maps (SOMs) and trajectory analysis for the early identification of COVID-19 patients with common clinical profiles, and (iv) trains robust multiple additive regression trees (MART) for ICU admission and mortality classification based on the extracted homogeneous clusters, to identify risk factors and biomarkers for disease progression. The extracted clusters and the dynamically associated clinical data improved the classification performance for ICU admission to a sensitivity of 0.83 and a specificity of 0.83, and for mortality to a sensitivity of 0.74 and a specificity of 0.76. Including additional information further enhanced the classifiers, increasing sensitivity and specificity for mortality by 4%. According to the risk factor analysis, the number of lymphocytes, SatO2, PO2/FiO2, and O2 supply type were highlighted as risk factors for ICU admission, and the percentage of neutrophils and lymphocytes, PO2/FiO2, LDH, and ALP as risk factors for mortality, among others.
To our knowledge, this is the first study that combines dynamic modeling with clustering analysis to identify homogeneous groups of COVID-19 patients towards the development of robust classifiers for ICU admission and mortality.
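The abstract reports classifier performance as sensitivity and specificity. As a reminder of how these two quantities follow from a confusion matrix, here is a minimal sketch in Python. The function name and the toy labels are illustrative assumptions and are not taken from the study's data or code.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Sensitivity: fraction of true positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of true negatives that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy labels (not study data): 1 = ICU admission, 0 = no ICU admission.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy case, 3 of 4 positives and 5 of 6 negatives are classified correctly.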
Keywords: COVID-19; Dynamic Bayesian Networks (DBNs); ICU admission; Mortality; Self-Organizing Maps (SOMs)
Year: 2021 PMID: 35007991 PMCID: PMC8711179 DOI: 10.1016/j.compbiomed.2021.105176
Source DB: PubMed Journal: Comput Biol Med ISSN: 0010-4825 Impact factor: 6.698
Fig. 1. Illustration of the workflow analysis.
Fig. 3. A circular visualization of the DBN obtained in the present study based on the time-series clinical data measured at 4 different time-points. This type of diagram presents the relationships (links) among the different nodes in our model. The more connections between two nodes, the stronger the relation between them.
Fig. 5. The patterns of trajectory clusters identified for each feature.
Fig. 2. Illustration of the quality status across the time-points for the continuous and the discrete features.
Fig. 4. The centrality measures extracted for each group of variables regarding the discrete time-points. The in- and out-degrees are shown for each node, along with the betweenness centrality, which quantifies the node's influence over information flow.
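The centrality measures named in the Fig. 4 caption (in-degree, out-degree, betweenness) can be illustrated on a small directed graph. The sketch below uses only the Python standard library and computes unnormalized betweenness with Brandes' algorithm. The edge list is a hypothetical toy network and not the DBN learned in the study, and the function name is an assumption made for illustration.

```python
from collections import deque


def degree_and_betweenness(edges):
    """In-degree, out-degree and unnormalized betweenness for a directed graph."""
    nodes = sorted({n for edge in edges for n in edge})
    succ = {n: [] for n in nodes}
    in_deg = {n: 0 for n in nodes}
    out_deg = {n: 0 for n in nodes}
    for u, v in edges:
        succ[u].append(v)
        out_deg[u] += 1
        in_deg[v] += 1

    betweenness = {n: 0.0 for n in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths (Brandes' algorithm).
        order = []
        preds = {n: [] for n in nodes}
        sigma = {n: 0 for n in nodes}
        dist = {n: -1 for n in nodes}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in succ[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate path dependencies from the farthest nodes to the source.
        delta = {n: 0.0 for n in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                betweenness[w] += delta[w]
    return in_deg, out_deg, betweenness


# Hypothetical edges between clinical features (illustrative only).
edges = [("LDH", "CRP"), ("CRP", "SatO2"), ("SatO2", "PO2_FiO2"), ("LDH", "SatO2")]
in_deg, out_deg, betweenness = degree_and_betweenness(edges)
for node in sorted(in_deg):
    print(node, in_deg[node], out_deg[node], betweenness[node])
```

In this toy graph, every shortest path into `PO2_FiO2` passes through `SatO2`, so `SatO2` is the only node with non-zero betweenness.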
Number of patients assigned to each SOM super-cluster for the most important features from the DBNs (p-values in bold denote statistically significant differences, across the clusters, between the distributions of ICU and non-ICU patients and between patients who survived and those who died).
| Feature | C1 | C2 | C3 | C4 | p-value (ICU) | p-value (mortality) |
|---|---|---|---|---|---|---|
| | 88 | 223 | 83 | 28 | 0.732 | |
| | 173 | 71 | 92 | 86 | 0.285 | |
| | 86 | 68 | 145 | 123 | 0.905 | 0.103 |
| | 107 | 167 | 44 | 104 | | |
| | 80 | 61 | 101 | 180 | 0.061 | |
| | 82 | 82 | 84 | 174 | | |
| | 102 | 105 | 79 | 136 | | |
| | 130 | 148 | 95 | 49 | | |
| | 148 | 95 | 74 | 105 | | |
| | 132 | 74 | 87 | 129 | | |
| | 166 | 89 | 79 | 88 | 1 | 0.319 |
| | 117 | 108 | 88 | 109 | | |
Fisher's exact test was applied at a 95% confidence level.
Performance evaluation results from the GBT for ICU and mortality classification across different cases with downsampling, using the SOMs clustering labels from all 32 continuous features (in blue: specifications with the best or equal classification performance).
Fig. 6. Performance evaluation results for the GBT with the clustering labels from the SOMs. The line in bold denotes the average ROC across 100 iterations of the downsampling process.
Fig. 7. Feature importance for ICU admission (on top) and mortality (on bottom) from case study 1 with the clustering labels from the SOMs.
Fig. 8. Feature importance for ICU admission (on top) and mortality (on bottom) from case study 2 with the clustering labels from the SOMs.
Fig. 9. Feature importance for ICU admission (on top) and mortality (on bottom) from case study 3 with the clustering labels from the SOMs.
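The repeated downsampling mentioned in the Fig. 6 caption can be sketched as follows: in each iteration the majority class is randomly reduced to the size of the minority class, a metric is computed on the balanced subset, and the results are averaged. This is a simplified illustration under stated assumptions. It averages the AUC of fixed synthetic scores rather than full ROC curves, it does not retrain a GBT in each iteration, and the data, seed, and function names are hypothetical.

```python
import random


def downsample_indices(labels, rng):
    """Indices of a class-balanced subset, obtained by randomly dropping majority samples."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(pos), len(neg))
    return sorted(rng.sample(pos, n) + rng.sample(neg, n))


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)

# Synthetic, imbalanced toy data (not study data): 20 positives, 80 negatives.
labels = [1] * 20 + [0] * 80
# Synthetic classifier scores: positives tend to score higher than negatives.
scores = [rng.gauss(1.0, 1.0) if y == 1 else rng.gauss(0.0, 1.0) for y in labels]

aucs = []
subset_sizes = []
for _ in range(100):  # 100 downsampling iterations, as in the caption
    idx = downsample_indices(labels, rng)
    subset_sizes.append(len(idx))
    aucs.append(auc([labels[i] for i in idx], [scores[i] for i in idx]))

mean_auc = sum(aucs) / len(aucs)
print(f"mean AUC over {len(aucs)} balanced subsets: {mean_auc:.3f}")
```

Averaging over many balanced subsets reduces the dependence of the reported performance on any single random draw of the majority class.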
Performance evaluation results for case study 2 before and after the inclusion of demographics, clinical data, and treatments (in blue: specifications with the best or equal classification performance).
Comparison with the state-of-the-art studies for ICU admission and mortality in COVID-19.
| Study | Method | Risk factors |
|---|---|---|
| | Ensemble-based algorithms to predict ICU admission and mortality across 3597 COVID-19 patients. | Risk factors: CRP, LDH, O2 saturation for ICU admission and neutrophil and lymphocytes for mortality. |
| | Random forests for risk stratification based on time-series data across 1987 unique patients diagnosed with COVID-19. | A risk prioritization tool that predicts the need for ICU admission within 24h to optimize the flow of operations within the hospitals. |
| | Ensemble learning to objectively identify an optimal combination of factors that predicts ICU admissions across 733 COVID-19 patients. | The number of lymphocytes was involved in all prediction tasks with the highest AUC score. |
| | Multipurpose algorithms (boosting ensembles, artificial neural networks) to estimate the risk of ICU admission or mortality among 3623 patients with COVID-19. | The final model achieved good discrimination for the external validation set (AUC 0.821). A cut-off of 0.4 yields sensitivity and specificity 0.71 and 0.78, respectively. |
| | Predict the risk for COVID-19 severity by training multipurpose algorithms across 3280 patients. | High predictive performance (average ROC 0.92) with the following risk factors: lymphocytes, C-reactive protein, and Braden Scale. |
| | GBTs were trained on 1270 COVID-19 patients from Wuhan to detect risk factors. | Age, CRP, and LDH were identified as prominent features for COVID-19 mortality. |
| | Bagging methods were applied on clinical data from 362 patients with confirmed COVID-19. | Age, hypertension, gender, diabetes, absolute neutrophil count, IL-6, and LDH were identified as risk factors for COVID-19 severity. |
| Present study | DBNs combined with SOMs to derive homogeneous clusters of patients with COVID-19 which were used to enrich the existing time-series clinical and laboratory data with meta information to increase the performance of classification models for ICU admission and mortality. | Risk factors: number of lymphocytes, SatO2, PO2/FiO2, and O2 supply type as risk factors for ICU admission and the percentage of neutrophils and lymphocytes, PO2/FiO2, LDH, and ALP for mortality. Classification performance for ICU admission with sensitivity: 0.83 and specificity: 0.83 (AUC 0.91), and mortality with sensitivity: 0.74 and specificity: 0.76 (AUC 0.83). |