M Cheng1, K Roseberry2, Y Choi2, L Quast2, M Gaines2,3, G Sandusky3, J A Kline4,5, P Bogdan1, A B Niculescu2,6,7.
Abstract
Suicides are preventable tragedies if risk factors are tracked and mitigated. We previously developed a quantitative suicidality risk assessment instrument (Convergent Functional Information for Suicidality, CFI-S), which is in essence a simple polyphenic risk score, and deployed it in a busy urban hospital Emergency Department, in a naturalistic cohort of consecutive patients. We report a four-year follow-up of that population (n = 482). Overall, a single administration of the CFI-S was significantly predictive of suicidality over the ensuing 4 years (occurrence: ROC AUC 80%; severity: Pearson correlation 0.44; imminence: Cox regression hazard ratio 1.33). The best single predictive phenes (phenotypic items) were feeling useless (not needed), a past history of suicidality, and social isolation. We next used machine learning approaches to enhance the predictive ability of CFI-S. We divided the population into a discovery cohort (n = 255) and a testing cohort (n = 227), and developed a deep neural network algorithm that showed increased accuracy for predicting risk of future suicidality (increasing the ROC AUC from 80% to 90%), as well as a similarity network classifier for visualizing patients' risk. We propose that the widespread use of CFI-S for screening purposes, with or without machine learning enhancements, can boost suicidality prevention efforts. This study also identified addressable social determinants as top risk factors for suicidality.
Supplementary Information: The online version contains supplementary material available at 10.1007/s44192-022-00016-z.
Keywords: Emergency department; Machine learning; Prediction; Risk; Social Isolation; Suicidality
Year: 2022 PMID: 35722470 PMCID: PMC9192379 DOI: 10.1007/s44192-022-00016-z
Source DB: PubMed Journal: Discov Ment Health ISSN: 2731-4383
Fig. 1 Traditional analyses. a ROC AUC. b T-test. c Summary of results. d Individual items T-test
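As a reading aid for the ROC AUC reported in Fig. 1 and in the abstract, the sketch below computes AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case (the Mann-Whitney formulation). The scores and labels are hypothetical values invented for illustration; they are not study data, and this is not the authors' analysis code.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (Mann-Whitney U formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores and follow-up outcomes (1 = suicidality), illustration only.
example_scores = [3, 5, 8, 10, 12, 15]
example_labels = [0, 0, 1, 0, 1, 1]
example_auc = roc_auc(example_scores, example_labels)
```

An AUC of 0.5 corresponds to chance ranking and 1.0 to perfect separation, which is the scale on which the reported 80% and 90% values sit.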
Aggregate demographics
| Analyses | Cohort | Number of participants | Gender | Ethnicity | Age mean (SD) |
|---|---|---|---|---|---|
| Traditional | No suicidality | 376 | Male 180, Female 195, Other 1 | EA 192, AA 158, Hispanic 15, Asian 2, Other 9 | 44.6 (14.8) |
| Traditional | Suicidality | 106 | Male 55, Female 51 | EA 56, AA 44, Hispanic 2, Other 2, American Indian 1, Asian 1 | 39.6 (13) |
| Machine learning | Discovery cohort | 255 (56 with suicidality) | Male 128, Female 126, Other 1 | EA 136, AA 106, Hispanic 10, Asian 2, Other 1 | 43.5 (14.8) |
| Machine learning | Test cohort | 227 (50 with suicidality) | Male 107, Female 120 | EA 112, AA 96, Other 9, Hispanic 7, American Indian 1, Asian 1, Other 1 | 43.5 (14.4) |
Confidence interval results of machine learning methods in suicidality prediction
| Cohort | Method | Accuracy (95% CI) | Precision (95% CI) | Recall (95% CI) | F1 score (95% CI) | AUROC (95% CI) |
|---|---|---|---|---|---|---|
| Discovery | NB | 0.692–0.798 | 0.751–0.849 | 0.692–0.798 | 0.707–0.811 | 0.760–0.856 |
| Discovery | XGB | 0.777–0.871 | 0.773–0.867 | 0.777–0.871 | 0.774–0.868 | 0.758–0.851 |
| Discovery | RF | 0.866–0.938 | 0.866–0.938 | 0.866–0.938 | 0.860–0.934 | 0.831–0.913 |
| Discovery | SVC | 0.670–0.780 | 0.742–0.842 | 0.670–0.780 | 0.688–0.796 | 0.737–0.837 |
| Discovery | DNN | 0.755–0.853 | 0.802–0.890 | 0.755–0.853 | 0.766–0.862 | 0.792–0.882 |
| Test | NB | 0.678–0.792 | 0.744–0.848 | 0.678–0.792 | 0.698–0.810 | 0.764–0.866 |
| Test | XGB | 0.745–0.849 | 0.728–0.835 | 0.745–0.849 | 0.734–0.840 | 0.704–0.814 |
| Test | RF | 0.730–0.838 | 0.705–0.816 | 0.730–0.838 | 0.712–0.822 | 0.691–0.803 |
| Test | SVC | 0.769–0.869 | 0.751–0.855 | 0.769–0.869 | 0.739–0.846 | 0.681–0.795 |
| Test | DNN | 0.779–0.877 | 0.763–0.865 | 0.779–0.877 | 0.757–0.859 | 0.856–0.936 |
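The page does not say how these 95% confidence intervals were obtained. Their widths are, however, consistent with a normal-approximation (Wald) interval for a proportion at the cohort sizes given in the abstract; that reading is an inference, not a documented method. A minimal sketch under that assumption, where `wald_interval` is a hypothetical helper name:

```python
import math


def wald_interval(p_hat, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    ASSUMPTION: the source does not state its CI method; this formula is
    used only because it matches the tabulated interval widths.
    """
    if n <= 0 or not 0.0 <= p_hat <= 1.0:
        raise ValueError("need n > 0 and 0 <= p_hat <= 1")
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Midpoint of the discovery-cohort NB accuracy interval (0.692, 0.798), with n = 255.
low, high = wald_interval(0.745, 255)
```

With these inputs the bounds land within about 0.001 of the tabulated NB accuracy interval for the discovery cohort. A bootstrap interval would be a reasonable alternative when the metric is not a simple proportion, as with AUROC.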
Fig. 2 Machine learning analyses. a Visualization of the discovery (255 patients) and the validation (227 patients) cohorts in the raw data and in the uniform manifold approximation and projection (UMAP) space, respectively. The UMAP projection transforms the high-dimensional data into a 2D visualization and shows that the two suicidality classes (0 and 1) overlap substantially. This preliminary data inspection demonstrates the difficulty of the supervised machine learning task. b Comparison of receiver operating characteristic (ROC) curves among several classifiers for suicidality classification: naive Bayes (NB), XGBoost (XGB), random forest (RF), support vector machine (SVM), and deep neural network (DNN). c Accuracy, precision, recall, F1 score, and area under the receiver operating characteristic curve (AUROC) for the considered classifiers in suicidality classification. The RF and DNN classifiers emerge as the best models in the discovery and test cohorts, respectively. d DNN accuracy increases with larger prediction intervals (PI) for imminence and severity prediction on the discovery and test cohorts. e Since imminence prediction and severity prediction are regression problems, we report several standard evaluation metrics: root mean square error (RMSE), mean absolute error (MAE), R-squared (R^2), and standard deviation. In all experiments the discovery cohort plays the role of the training set in standard machine learning problems, and the test cohort can be seen as equivalent to an external test set
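The regression metrics named in panel e have standard definitions, sketched below in plain Python. The function name and the sample values are hypothetical and serve only to show how RMSE, MAE, and R^2 are computed from true and predicted values; they are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R^2 for paired true and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # R^2 is undefined when the true values have no variance.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


# Hypothetical severity scores and predictions, illustration only.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

RMSE penalizes large errors more heavily than MAE, so reporting both gives a rough sense of whether a few large misses dominate the error.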
Fig. 3 Similarity network classifier for visualizing patients' risk. The CFI-S-based similarity networks for the discovery cohort (a) and the test cohort (b) consist of nodes (patients) and weighted edges capturing the cosine similarity between pairs of patients' CFI-S records. The edge color depends on the weight value. The node color indicates whether a patient has suicidality. This topological representation of the patients' CFI-S data shows that patients with suicidality (red nodes) cluster towards the lower left part of the networks, while patients without suicidality (yellow nodes) are located mostly in the rest of the network. The results in (c) show that the similarity network-based graph neural network (GNN) performs comparably to the DNN model
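The network construction described in this caption (patients as nodes, cosine similarity between CFI-S records as edge weights) can be sketched as follows. The binary item vectors, the function names, and the `threshold` used to prune weak edges are all assumptions made for illustration; the caption does not say whether or how edges were thresholded.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention: an all-zero record is similar to nothing
    return dot / (norm_u * norm_v)


def similarity_edges(records, threshold=0.5):
    """Build weighted edges (i, j, weight) between patient records.

    `threshold` is a hypothetical pruning parameter, not taken from the source.
    """
    edges = []
    for (i, u), (j, v) in combinations(enumerate(records), 2):
        weight = cosine_similarity(u, v)
        if weight >= threshold:
            edges.append((i, j, weight))
    return edges


# Hypothetical yes/no item responses for three patients, illustration only.
records = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1]]
edges = similarity_edges(records)
```

In this toy input only the first two patients share enough items to be linked. An edge list of this shape can be passed to a graph library for layout, or to a graph neural network as the adjacency structure.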