Bohan Xu1,2, Mahdi Moradi1,2, Rayus Kuplicki1, Jennifer L Stewart1,3, Brett McKinney2,4, Sandip Sen2, Martin P Paulus1,3,5.
Abstract
Non-intrusive, easy-to-use and pragmatic collection of biological processes is warranted to evaluate potential biomarkers of psychiatric symptoms. Prior work with relatively modest sample sizes suggests that under highly controlled sampling conditions, volatile organic compounds extracted from the human breath (exhalome), often measured by an electronic nose ("e-nose"), may be related to physical and mental health. The present study utilized a streamlined data collection approach and attempted to replicate and extend prior e-nose links to mental health in a standard research setting within a large transdiagnostic community dataset (N = 1207; 746 females; 18-61 years) of participants who completed a screening visit at the Laureate Institute for Brain Research between 07/2016 and 05/2018. Factor analysis was used to obtain latent exhalome variables, and machine learning approaches were employed using these latent variables to predict three types of symptoms independently of each other (depression, anxiety, and substance use disorder) within separate training and test sets. After adjusting for age, gender, body mass index, and smoking status, the best fitting algorithm produced by the training set accounted for nearly 0% of the test set's variance. In each case the standard error included the zero line, indicating that models were not predictive of clinical symptoms. Although some in-sample variance was predicted, findings did not generalize to out-of-sample data. Based on these findings, we conclude that the exhalome, as measured by the e-nose within a less-controlled environment than previously reported, is not able to provide clinically useful assessments of current depression, anxiety or substance use severity.
Keywords: computational psychiatry; data mining; electronic nose; exhalomes; machine learning; mental health
Year: 2020 PMID: 33192639 PMCID: PMC7524957 DOI: 10.3389/fpsyt.2020.503248
Source DB: PubMed Journal: Front Psychiatry ISSN: 1664-0640 Impact factor: 4.157
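The abstract's central observation, that a model can explain some variance in the training set yet close to 0% in the test set, can be illustrated with a small sketch. This is not the authors' pipeline: the data are synthetic noise, the model is a one-predictor least-squares line, and the names `fit_line` and `r_squared` are hypothetical helpers introduced only for this illustration. R² is computed against the mean of the evaluated set, which is one common convention.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken around the mean of y_true."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Assumption: synthetic data in which the predictor carries no signal at all.
rng = random.Random(0)
n_train, n_test = 200, 200
x_train = [rng.gauss(0, 1) for _ in range(n_train)]
y_train = [rng.gauss(0, 1) for _ in range(n_train)]
x_test = [rng.gauss(0, 1) for _ in range(n_test)]
y_test = [rng.gauss(0, 1) for _ in range(n_test)]

# Fit on the training set only, then score both sets with the same model.
slope, intercept = fit_line(x_train, y_train)
train_r2 = r_squared(y_train, [slope * x + intercept for x in x_train])
test_r2 = r_squared(y_test, [slope * x + intercept for x in x_test])

print(f"train R^2 = {train_r2:.4f}")
print(f"test  R^2 = {test_r2:.4f}")
```

For a least-squares line with an intercept, the training R² cannot fall below zero, so a small positive in-sample value is expected even from pure noise. The test-set R² has no such floor and can be negative, which is why out-of-sample evaluation is the relevant check for generalization.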
Figure 1. CONSORT-like diagram for participant inclusion, according to research inclusion criteria.
Participant characteristics.
| Characteristic | Overall (n = 1207) |
|---|---|
| Age (years) | |
| Mean (SDᵃ) | 32.1 (10.4) |
| Median [Min, Max] | 30.0 [18.0, 61.0] |
| Missing | 4 (0.3%) |
| Gender | |
| Female | 756 (62.6%) |
| Male | 451 (37.4%) |
| Smoking status | |
| Current Smoker | 435 (36.0%) |
| Non-Smoker | 772 (64.0%) |
| BMIᵇ | |
| Mean (SDᵃ) | 27.6 (5.77) |
| Median [Min, Max] | 26.6 [16.1, 52.1] |
| Missing | 21 (1.7%) |
| PHQ-9ᶜ | |
| Mean (SDᵃ) | 9.29 (7.15) |
| Median [Min, Max] | 8.00 [0.00, 27.0] |
| Missing | 19 (1.6%) |
| OASISᵈ | |
| Mean (SDᵃ) | 7.05 (4.92) |
| Median [Min, Max] | 7.00 [0.00, 20.0] |
| Missing | 20 (1.7%) |
| DAST-10ᵉ | |
| Mean (SDᵃ) | 1.74 (3.01) |
| Median [Min, Max] | 0.00 [0.00, 10.0] |
| Missing | 18 (1.5%) |
| Race/Ethnicity | |
| White | 723 (59.9%) |
| Black | 129 (10.7%) |
| Native American | 190 (15.7%) |
| Hispanic | 66 (5.5%) |
| Asian | 22 (1.8%) |
| Other | 47 (3.9%) |
| Missing | 30 (2.5%) |
| Education | |
| Less than seven years of school | 2 (0.2%) |
| Junior high school (7th, 8th, 9th) | 22 (1.8%) |
| Some high school (10th, 11th) | 55 (4.6%) |
| High school graduate (including equivalency exam) | 216 (17.9%) |
| Some college or technical school (at least one year) | 488 (40.4%) |
| College graduate | 282 (23.4%) |
| Graduate professional training (Masters or above) | 92 (7.6%) |
| Other | 11 (0.9%) |
| Missing | 39 (3.2%) |

ᵃ Standard deviation.
ᵇ Body mass index.
ᶜ Patient Health Questionnaire 9.
ᵈ Overall Anxiety Severity and Impairment Scale.
ᵉ Drug Abuse Screening Test 10.
Figure 2. Statistical data analysis and machine learning pipeline.
Figure 3. Model performance (R² values) in predicting PHQ-9, OASIS, and DAST-10 using Linear Model, Random Forest (RF), and Support Vector Machine (SVM) algorithms. Error bars represent standard deviations of R² values.
Figure 4. Model performance (R² values) in predicting age and BMI using Linear Model, Random Forest (RF), and Support Vector Machine (SVM) algorithms. Error bars represent standard deviations of R² values.
Figure 5. Model performance (AUC values) in predicting smoking status and gender using Linear Model, Random Forest (RF), and Support Vector Machine (SVM) algorithms. Error bars represent standard deviations of AUC values.
Figure 6. Principal component analysis plot of smoking status.
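The binary outcomes in the figure captions (smoking status and gender) are scored with AUC rather than R². As a reminder of what that number measures, here is a minimal sketch that computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, counting ties as one half. The function name `roc_auc` and the toy labels and scores are assumptions for illustration; the source does not describe how the authors computed AUC.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): one positive is outranked by one negative.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))
```

An AUC of 0.5 corresponds to scores that carry no information about the class, which plays the same role for classification that an R² near zero plays for the continuous symptom scores. The pairwise loop is quadratic in the number of cases; it is chosen here for readability, not efficiency.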