Xue Li1, Chiaki Ono2, Noriko Warita2, Tomoka Shoji1,3, Takashi Nakagawa1,2, Hitomi Usukura4, Zhiqian Yu4, Yuta Takahashi2, Kei Ichiji5, Norihiro Sugita6, Natsuko Kobayashi2, Saya Kikuchi2, Yasuto Kunii2,4, Keiko Murakami3, Mami Ishikuro3, Taku Obara3, Tomohiro Nakamura7, Fuji Nagami8, Takako Takai7, Soichi Ogishima7, Junichi Sugawara9, Tetsuro Hoshiai10, Masatoshi Saito10, Gen Tamiya11, Nobuo Fuse11, Shinichi Kuriyama3, Masayuki Yamamoto6,11, Nobuo Yaegashi8,10, Noriyasu Homma5, Hiroaki Tomita1,2,3,4.
Abstract
In this study, the extent to which different emotions of pregnant women can be predicted from heart rate-relevant information, used as indicators of autonomic nervous system functioning, was explored with various machine learning algorithms. Nine heart rate-relevant autonomic nervous system indicators, including the coefficient of variation of R-R intervals (CVRR), the standard deviation of all NN intervals (SDNN), and the square root of the mean squared differences of successive NN intervals (RMSSD), were measured with a heart rate monitor (MyBeat). Four emotions, "happy" as a positive emotion and "anxiety," "sad," and "frustrated" as negative emotions, were self-recorded on a smartphone application over 1 week between the 23rd and 32nd weeks of pregnancy by 85 pregnant women. Ten machine learning methods, k-nearest neighbor (k-NN), support vector machine (SVM), logistic regression (LR), random forest (RF), naïve Bayes (NB), decision tree (DT), gradient boosting trees (GBT), stochastic gradient descent (SGD), extreme gradient boosting (XGBoost), and artificial neural network (ANN), were applied to predict the four emotions from the heart rate-relevant information. RF achieved the highest, albeit modest, area under the receiver operating characteristic curve (AUC-ROC) of 0.70, and GBT displayed the second highest AUC (0.69). CVRR, RMSSD, SDNN, high frequency (HF), and low frequency (LF) contributed most to the predictions. These comprehensive analyses demonstrated the advantage in prediction accuracy of the RF and GBT methods and support establishing models that predict emotions from autonomic nervous system indicators. The results implicate SDNN, RMSSD, CVRR, LF, and HF as important parameters for such predictions.
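The time-domain indicators named above (SDNN, CVRR, RMSSD, and the NN50/pNN50 counts defined in the figure captions) can be computed directly from a sequence of R-R intervals. The sketch below follows the standard HRV definitions, not the study's own code:

```python
import math

def hrv_time_domain(rr_ms):
    """Compute time-domain HRV indicators from R-R intervals given in milliseconds."""
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    # SDNN: standard deviation of all NN intervals
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / n)
    # CVRR: coefficient of variation of R-R intervals (SDNN / mean RR, as a percentage)
    cvrr = sdnn / mean_rr * 100
    # Differences between successive intervals
    diffs = [rr_ms[i + 1] - rr_ms[i] for i in range(n - 1)]
    # RMSSD: root mean square of successive differences
    rmssd = math.sqrt(sum(d ** 2 for d in diffs) / len(diffs))
    # NN50 / pNN50: count and percentage of successive differences exceeding 50 ms
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    pnn50 = nn50 / len(diffs) * 100
    return {"SDNN": sdnn, "CVRR": cvrr, "RMSSD": rmssd,
            "NN50": nn50, "pNN50": pnn50}
```

The frequency-domain indicators (LF, HF, LF/HF) additionally require spectral analysis of the interval series and are omitted here for brevity.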
Keywords: autonomic system; emotion; ensemble learning; gradient boosting trees; heart rate variability; machine learning; pregnancy; random forest
Year: 2022 PMID: 35153864 PMCID: PMC8830335 DOI: 10.3389/fpsyt.2021.799029
Source DB: PubMed Journal: Front Psychiatry ISSN: 1664-0640 Impact factor: 4.157
Figure 1. The design of this study. CVRR, coefficient of variation of R-R intervals; SDNN, standard deviation of all NN intervals; RMSSD, square root of the mean squared differences of successive NN intervals; NN50, number of interval differences of successive RR-intervals >50 ms; pNN50, the proportion derived by dividing NN50 by the total number of RR-intervals; LF, low frequency; HF, high frequency; LF/HF, the ratio of low frequency to high frequency; SVM, support vector machine; k-NN, k-nearest neighbor; SGD, stochastic gradient descent; LR, logistic regression; DT, decision tree; NB, naïve Bayes; RF, random forest; GBT, gradient boosting trees; XGBoost, extreme gradient boosting; ANN, artificial neural network.
Model evaluation indices of the 10 machine learning predictions of the four selected emotions.
| | SVM | k-NN | SGD | LR | DT | NB | RF | GBT | XGBoost | ANN |
|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy | 0.72 | 0.73 | 0.72 | 0.73 | 0.73 | 0.61 | 0.74 | 0.73 | 0.72 | 0.74 |
| Precision | 0.66 | 0.68 | 0.66 | 0.67 | 0.67 | 0.67 | 0.69 | 0.67 | 0.66 | 0.68 |
| Sensitivity | 0.72 | 0.73 | 0.72 | 0.73 | 0.73 | 0.61 | 0.74 | 0.73 | 0.72 | 0.74 |
| F1 score | 0.66 | 0.68 | 0.66 | 0.66 | 0.68 | 0.63 | 0.69 | 0.68 | 0.68 | 0.68 |
| AUC | 0.65 | 0.61 | 0.64 | 0.65 | 0.65 | 0.52 | 0.70 | 0.69 | 0.66 | 0.68 |
Model evaluation indices of the 10 machine learning predictions of the four emotions, computed on the test dataset (5-fold cross-validation) independent of the training dataset.
SVM, support vector machine; k-NN, k-nearest neighbor; SGD, stochastic gradient descent; LR, logistic regression; DT, decision tree; NB, naïve Bayes; RF, random forest; GBT, gradient boosting trees; XGBoost, extreme gradient boosting; ANN, artificial neural network; AUC, area under the curve.
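The evaluation protocol reported above can be sketched with scikit-learn: 5-fold cross-validation of a random forest scored with a multi-class (one-vs-rest) AUC. The feature and label arrays below are synthetic stand-ins, not the study's data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data: 9 HRV indicators per sample, 4 emotion classes
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))
y = rng.integers(0, 4, size=200)

clf = RandomForestClassifier(n_estimators=200, criterion="gini", random_state=0)
# "roc_auc_ovr" averages one-vs-rest AUCs across the four classes
auc_scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc_ovr")
mean_auc = auc_scores.mean()
```

With real data, `mean_auc` corresponds to the AUC row of the table; the same call with `scoring="accuracy"` or `scoring="f1_weighted"` yields the other indices.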
Figure 2. Importance of each heart rate variability indicator. The importance scores of each feature in the prediction of emotions based on the nine heart rate variability indicators using random forest are plotted. CVRR, coefficient of variation of R-R intervals; SDNN, standard deviation of all NN intervals; RMSSD, square root of the mean squared differences of successive NN intervals; NN50, number of interval differences of successive RR-intervals >50 ms; pNN50, the proportion derived by dividing NN50 by the total number of RR-intervals; LF, low frequency; HF, high frequency; LF/HF, the ratio of low frequency to high frequency.
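Importance scores like those plotted in Figure 2 are exposed by scikit-learn's `RandomForestClassifier` as impurity-based `feature_importances_`. The data below is synthetic, and only the eight indicators named in the captions of this extract are listed (the ninth is not named here):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Eight of the nine HRV indicators named in this extract; synthetic data
features = ["CVRR", "SDNN", "RMSSD", "NN50", "pNN50", "LF", "HF", "LF/HF"]
rng = np.random.default_rng(1)
X = rng.normal(size=(150, len(features)))
y = rng.integers(0, 4, size=150)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
# Pair each indicator with its importance and sort, highest first
ranked = sorted(zip(features, clf.feature_importances_), key=lambda t: -t[1])
```

Impurity-based importances sum to 1 across features, so `ranked` gives the relative contribution of each indicator to the forest's splits.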
Figure 3. Numbers of features and cross-validation scores of random forest-based prediction of emotions. Cross-validation scores for each number of features used in the prediction of emotions are plotted. As more features are included in the prediction, cross-validation scores increase; a plateau is reached when five features are included.
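A curve like Figure 3 can be produced by ranking features by random forest importance and scoring a model on the top-k features for each k. This is a sketch under that assumption, on synthetic data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data: 9 indicators, 4 emotion classes
rng = np.random.default_rng(2)
X = rng.normal(size=(150, 9))
y = rng.integers(0, 4, size=150)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
# Rank features by importance on the full feature set, highest first
order = np.argsort(clf.fit(X, y).feature_importances_)[::-1]
# Mean 5-fold CV score using only the top-k features, for k = 1..9
scores = [cross_val_score(clf, X[:, order[:k]], y, cv=5).mean()
          for k in range(1, X.shape[1] + 1)]
```

Plotting `scores` against k reproduces the shape of Figure 3; on the study's data the curve plateaus at five features.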
Optimal parameters.
| Model | Optimal parameters |
|---|---|
| SVM | |
| k-NN | n_neighbors = 8 |
| LR | penalty = "l2", class_weight = None |
| NB | var_smoothing = 1e-09 |
| SGD | penalty = "l2", alpha = 0.001 |
| GBT | n_estimators = 100, criterion = "friedman_mse" |
| XGBoost | max_depth = 13, gamma = 2, objective = "multi:softmax" |
| DT | max_depth = 10, criterion = "gini", splitter = "random" |
| RF | n_estimators = 200, criterion = "gini", max_features = 5 |
| ANN | hidden_layer_sizes = (100,), alpha = 0.001, max_iter = 200 |
SVM, support vector machine; k-NN, k-nearest neighbor; SGD, stochastic gradient descent; LR, logistic regression; DT, decision tree; NB, naïve Bayes; RF, random forest; GBT, gradient boosting trees; XGBoost, extreme gradient boosting; ANN, artificial neural network.
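Optimal parameters such as those tabulated above are typically found by cross-validated grid search. The sketch below uses illustrative grid values on synthetic data, not the study's exact search space:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data: 9 indicators, 4 emotion classes
rng = np.random.default_rng(3)
X = rng.normal(size=(120, 9))
y = rng.integers(0, 4, size=120)

# Illustrative grid; the tabulated optimum for RF was
# n_estimators = 200, criterion = "gini", max_features = 5
grid = {"n_estimators": [100, 200], "criterion": ["gini", "entropy"]}
search = GridSearchCV(RandomForestClassifier(random_state=0), grid, cv=3)
search.fit(X, y)
best = search.best_params_
```

`search.best_estimator_` is then refit on the full training data and evaluated on the held-out test folds.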
Previous machine learning studies to predict emotions based on heart rate variabilities.
| Study | Year | Training dataset | Test dataset | Validation | Number of subjects |
|---|---|---|---|---|---|
| Rakshit et al. | 2016 | 33 | N.A. | Leave-one-out cross-validation | 33 |
| Cheng et al. | 2017 | N.A. | N.A. | N.A. | N.A. |
| Jang et al. | 2012 | N.A. | N.A. | N.A. | 200 |
| Guo et al. | 2016 | N.A. | N.A. | N.A. | 25 |
| Domínguez-Jiménez et al. | 2020 | 80% of subjects | 20% of subjects | N.A. | 37 |
| Chueh et al. | 2012 | 10 | 2 | Leave-one-out cross-validation | 12 |
| Shu et al. | 2020 | 25 | N.A. | Leave-one-out cross-validation | 25 |
| Zheng et al. | 2012 | 20 | 10-fold cross-validation | 10-fold cross-validation | 20 |
| Ferdinando et al. | 2016 | 80% of subjects | 20% of subjects | N.A. | 512 |
| Jang et al. | 2014 | 70% of subjects | 30% of subjects | N.A. | 300 |
| Subramanian et al. | 2016 | N.A. | N.A. | Leave-one-out cross-validation | 58 |
| Nikolova et al. | 2019 | N.A. | N.A. | N.A. | 25 |
| Colomer Granero et al. | 2016 | 47 | N.A. | 10-fold cross-validation | 47 |
| Ayata et al. | 2018 | 32 | N.A. | 10-fold cross-validation | 32 |
| Su et al. | 2020 | 369,289 records | 41,033 records | N.A. | 25 |
| Lee et al. | 2005 | N.A. | N.A. | N.A. | 6 |
| Study | SVM | k-NN | SGD | LR | DT | NB | RF | GBT | XGBoost | ANN |
|---|---|---|---|---|---|---|---|---|---|---|
| Rakshit et al. | * | - | - | - | - | - | - | - | - | - |
| Cheng et al. | * | o | - | - | - | o | o | o | - | - |
| Jang et al. | * | - | - | - | - | o | - | - | - | - |
| Guo et al. | * | - | - | - | - | - | - | - | - | - |
| Domínguez-Jiménez et al. | * | o | o | - | o | o | o | - | o | - |
| Chueh et al. | o | o | * | - | o | o | - | - | - | - |
| Shu et al. | - | - | * | - | - | - | - | - | - | o |
| Zheng et al. | - | * | - | - | - | - | - | - | - | - |
| Ferdinando et al. | - | * | - | - | - | - | - | - | - | - |
| Jang et al. | o | - | - | - | * | o | - | - | - | - |
| Subramanian et al. | o | - | - | - | * | - | - | - | - | - |
| Nikolova et al. | o | - | - | - | * | - | - | - | - | o |
| Colomer Granero et al. | o | - | o | - | o | - | * | - | - | o |
| Ayata et al. | o | o | - | - | - | o | * | - | - | - |
| Su et al. | - | o | - | - | - | o | * | * | - | - |
| Lee et al. | - | - | - | - | - | - | - | - | - | * |
o, machine learning algorithm tested in the study; -, machine learning algorithm not tested in the study; *, machine learning algorithm with the highest prediction accuracy in the study. SVM, support vector machine; k-NN, k-nearest neighbor; SGD, stochastic gradient descent; LR, logistic regression; DT, decision tree; NB, naïve Bayes; RF, random forest; GBT, gradient boosting trees; XGBoost, extreme gradient boosting; ANN, artificial neural network.