Xue Li1, Chiaki Ono2, Noriko Warita2, Tomoka Shoji1,3, Takashi Nakagawa1,2, Hitomi Usukura4, Zhiqian Yu4, Yuta Takahashi2, Kei Ichiji5, Norihiro Sugita6, Natsuko Kobayashi2, Saya Kikuchi2, Yasuto Kunii2,4, Keiko Murakami3, Mami Ishikuro3, Taku Obara3, Tomohiro Nakamura7, Fuji Nagami8, Takako Takai7, Soichi Ogishima7, Junichi Sugawara9, Tetsuro Hoshiai10, Masatoshi Saito10, Gen Tamiya11, Nobuo Fuse11, Shinichi Kuriyama3, Masayuki Yamamoto6,11, Nobuo Yaegashi8,10, Noriyasu Homma5, Hiroaki Tomita1,2,3,4.
Abstract
In this study, the extent to which different emotions of pregnant women can be predicted from heart rate-relevant information, used as indicators of autonomic nervous system functioning, was explored with various machine learning algorithms. Nine heart rate-relevant autonomic nervous system indicators, including the coefficient of variation of R-R intervals (CVRR), the standard deviation of all NN intervals (SDNN), and the square root of the mean squared differences of successive NN intervals (RMSSD), were measured with a heart rate monitor (MyBeat). Four emotions, "happy" as a positive emotion and "anxiety," "sad," and "frustrated" as negative emotions, were self-recorded on a smartphone application over 1 week between the 23rd and 32nd weeks of pregnancy by 85 pregnant women. Ten machine learning methods, k-nearest neighbor (k-NN), support vector machine (SVM), logistic regression (LR), random forest (RF), naïve Bayes (NB), decision tree (DT), gradient boosting trees (GBT), stochastic gradient descent (SGD), extreme gradient boosting (XGBoost), and artificial neural network (ANN), were applied to predict the four emotions from the heart rate-relevant information. RF achieved the highest, albeit modest, area under the receiver operating characteristic curve (AUC-ROC) of 0.70, and GBT displayed the second highest AUC (0.69). CVRR, RMSSD, SDNN, high frequency (HF), and low frequency (LF) contributed most to the predictions. These comprehensive analyses demonstrated the advantage in prediction accuracy of the RF and GBT methods and support establishing models that predict emotions from autonomic nervous system indicators. The results implicate SDNN, RMSSD, CVRR, LF, and HF as important parameters for such predictions.
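The time-domain indicators named above (SDNN, CVRR, RMSSD, and the NN50/pNN50 counts defined in the figure captions) can be computed directly from a sequence of R-R intervals. The sketch below follows the standard HRV definitions, not the study's own code:

```python
import math

def hrv_time_domain(rr_ms):
    """Compute time-domain HRV indicators from R-R intervals given in milliseconds."""
    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n
    # SDNN: standard deviation of all NN intervals
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / n)
    # CVRR: coefficient of variation of R-R intervals (SDNN / mean RR, as a percentage)
    cvrr = sdnn / mean_rr * 100
    # Differences between successive intervals
    diffs = [rr_ms[i + 1] - rr_ms[i] for i in range(n - 1)]
    # RMSSD: root mean square of successive differences
    rmssd = math.sqrt(sum(d ** 2 for d in diffs) / len(diffs))
    # NN50 / pNN50: count and percentage of successive differences exceeding 50 ms
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    pnn50 = nn50 / len(diffs) * 100
    return {"SDNN": sdnn, "CVRR": cvrr, "RMSSD": rmssd,
            "NN50": nn50, "pNN50": pnn50}
```

The frequency-domain indicators (LF, HF, LF/HF) additionally require spectral analysis of the interval series and are omitted here for brevity.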
Keywords: autonomic system; emotion; ensemble learning; gradient boosting trees; heart rate variability; machine learning; pregnancy; random forest
Year: 2022 PMID: 35153864 PMCID: PMC8830335 DOI: 10.3389/fpsyt.2021.799029
Source DB: PubMed Journal: Front Psychiatry ISSN: 1664-0640 Impact factor: 4.157
Figure 1. The design of this study. CVRR, coefficient of variation of R-R intervals; SDNN, standard deviation of all NN intervals; RMSSD, square root of the mean squared differences of successive NN intervals; NN50, number of interval differences of successive RR-intervals >50 ms; pNN50, the proportion derived by dividing NN50 by the total number of RR-intervals; LF, low frequency; HF, high frequency; LF/HF, the ratio of low frequency to high frequency; SVM, support vector machine; k-NN, k-nearest neighbor; SGD, stochastic gradient descent; LR, logistic regression; DT, decision tree; NB, naïve Bayes; RF, random forest; GBT, gradient boosting trees; XGBoost, extreme gradient boosting; ANN, artificial neural network.
Model evaluation indices of the 10 machine learning predictions of the four selected emotions.
| | SVM | k-NN | SGD | LR | DT | NB | RF | GBT | XGBoost | ANN |
|---|---|---|---|---|---|---|---|---|---|---|
| Accuracy | 0.72 | 0.73 | 0.72 | 0.73 | 0.73 | 0.61 | 0.74 | 0.73 | 0.72 | 0.74 |
| Precision | 0.66 | 0.68 | 0.66 | 0.67 | 0.67 | 0.67 | 0.69 | 0.67 | 0.66 | 0.68 |
| Sensitivity | 0.72 | 0.73 | 0.72 | 0.73 | 0.73 | 0.61 | 0.74 | 0.73 | 0.72 | 0.74 |
| F1 score | 0.66 | 0.68 | 0.66 | 0.66 | 0.68 | 0.63 | 0.69 | 0.68 | 0.68 | 0.68 |
| AUC | 0.65 | 0.61 | 0.64 | 0.65 | 0.65 | 0.52 | 0.70 | 0.69 | 0.66 | 0.68 |
Model evaluation indices of the 10 machine learning predictions of the four emotions, computed on the test dataset (5-fold cross-validation) independent of the training dataset.
SVM, support vector machine; k-NN, k-nearest neighbor; SGD, stochastic gradient descent; LR, logistic regression; DT, decision tree; NB, naïve Bayes; RF, random forest; GBT, gradient boosting trees; XGBoost, extreme gradient boosting; ANN, artificial neural network; AUC, area under the curve.
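The evaluation protocol reported above can be sketched with scikit-learn: 5-fold cross-validation of a random forest scored with a multi-class (one-vs-rest) AUC. The feature and label arrays below are synthetic stand-ins, not the study's data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data: 9 HRV indicators per sample, 4 emotion classes
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))
y = rng.integers(0, 4, size=200)

clf = RandomForestClassifier(n_estimators=200, criterion="gini", random_state=0)
# "roc_auc_ovr" averages one-vs-rest AUCs across the four classes
auc_scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc_ovr")
mean_auc = auc_scores.mean()
```

With real data, `mean_auc` corresponds to the AUC row of the table; the same call with `scoring="accuracy"` or `scoring="f1_weighted"` yields the other indices.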
Figure 2. Importance of each heart rate variability indicator. The importance scores of each feature in the prediction of emotions based on the nine heart rate variability indicators using random forest are plotted. CVRR, coefficient of variation of R-R intervals; SDNN, standard deviation of all NN intervals; RMSSD, square root of the mean squared differences of successive NN intervals; NN50, number of interval differences of successive RR-intervals >50 ms; pNN50, the proportion derived by dividing NN50 by the total number of RR-intervals; LF, low frequency; HF, high frequency; LF/HF, the ratio of low frequency to high frequency.
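Importance scores like those plotted in Figure 2 are exposed by scikit-learn's `RandomForestClassifier` as impurity-based `feature_importances_`. The data below is synthetic, and only the eight indicators named in the captions of this extract are listed (the ninth is not named here):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Eight of the nine HRV indicators named in this extract; synthetic data
features = ["CVRR", "SDNN", "RMSSD", "NN50", "pNN50", "LF", "HF", "LF/HF"]
rng = np.random.default_rng(1)
X = rng.normal(size=(150, len(features)))
y = rng.integers(0, 4, size=150)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
# Pair each indicator with its importance and sort, highest first
ranked = sorted(zip(features, clf.feature_importances_), key=lambda t: -t[1])
```

Impurity-based importances sum to 1 across features, so `ranked` gives the relative contribution of each indicator to the forest's splits.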
Figure 3. Numbers of features and cross-validation scores of random forest-based prediction of emotions. Cross-validation scores for each number of features used in the prediction of emotions are plotted. As more features are included in the prediction, cross-validation scores increase; a plateau is reached when five features are included.
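A curve like Figure 3 can be produced by ranking features by random forest importance and scoring a model on the top-k features for each k. This is a sketch under that assumption, on synthetic data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data: 9 indicators, 4 emotion classes
rng = np.random.default_rng(2)
X = rng.normal(size=(150, 9))
y = rng.integers(0, 4, size=150)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
# Rank features by importance on the full feature set, highest first
order = np.argsort(clf.fit(X, y).feature_importances_)[::-1]
# Mean 5-fold CV score using only the top-k features, for k = 1..9
scores = [cross_val_score(clf, X[:, order[:k]], y, cv=5).mean()
          for k in range(1, X.shape[1] + 1)]
```

Plotting `scores` against k reproduces the shape of Figure 3; on the study's data the curve plateaus at five features.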
Optimal parameters.
| Model | Optimal parameters |
|---|---|
| SVM | |
| k-NN | n_neighbors = 8 |
| LR | penalty = "l2", class_weight = None |
| NB | var_smoothing = 1e-09 |
| SGD | penalty = "l2", alpha = 0.001 |
| GBT | n_estimators = 100, criterion = "friedman_mse" |
| XGBoost | max_depth = 13, gamma = 2, objective = "multi:softmax" |
| DT | max_depth = 10, criterion = "gini", splitter = "random" |
| RF | n_estimators = 200, criterion = "gini", max_features = 5 |
| ANN | hidden_layer_sizes = (100,), alpha = 0.001, max_iter = 200 |
SVM, support vector machine; k-NN, k-nearest neighbor; SGD, stochastic gradient descent; LR, logistic regression; DT, decision tree; NB, naïve Bayes; RF, random forest; GBT, gradient boosting trees; XGBoost, extreme gradient boosting; ANN, artificial neural network.
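Optimal parameters such as those tabulated above are typically found by cross-validated grid search. The sketch below uses illustrative grid values on synthetic data, not the study's exact search space:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data: 9 indicators, 4 emotion classes
rng = np.random.default_rng(3)
X = rng.normal(size=(120, 9))
y = rng.integers(0, 4, size=120)

# Illustrative grid; the tabulated optimum for RF was
# n_estimators = 200, criterion = "gini", max_features = 5
grid = {"n_estimators": [100, 200], "criterion": ["gini", "entropy"]}
search = GridSearchCV(RandomForestClassifier(random_state=0), grid, cv=3)
search.fit(X, y)
best = search.best_params_
```

`search.best_estimator_` is then refit on the full training data and evaluated on the held-out test folds.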
Previous machine learning studies to predict emotions based on heart rate variabilities.
| Study | Year | Training dataset | Test dataset | Validation | Number of subjects |
|---|---|---|---|---|---|
| Rakshit et al. | 2016 | 33 | N.A. | Leave-one-out cross-validation | 33 |
| Cheng et al. | 2017 | N.A. | N.A. | N.A. | N.A. |
| Jang et al. | 2012 | N.A. | N.A. | N.A. | 200 |
| Guo et al. | 2016 | N.A. | N.A. | N.A. | 25 |
| Domínguez-Jiménez et al. | 2020 | 80% of subjects | 20% of subjects | N.A. | 37 |
| Chueh et al. | 2012 | 10 | 2 | Leave-one-out cross-validation | 12 |
| Shu et al. | 2020 | 25 | N.A. | Leave-one-out cross-validation | 25 |
| Zheng et al. | 2012 | 20 | 10-fold cross-validation | 10-fold cross-validation | 20 |
| Ferdinando et al. | 2016 | 80% of subjects | 20% of subjects | N.A. | 512 |
| Jang et al. | 2014 | 70% of subjects | 30% of subjects | N.A. | 300 |
| Subramanian et al. | 2016 | N.A. | N.A. | Leave-one-out cross-validation | 58 |
| Nikolova et al. | 2019 | N.A. | N.A. | N.A. | 25 |
| Colomer Granero et al. | 2016 | 47 | N.A. | 10-fold cross-validation | 47 |
| Ayata et al. | 2018 | 32 | N.A. | 10-fold cross-validation | 32 |
| Su et al. | 2020 | 369,289 records | 41,033 records | N.A. | 25 |
| Lee et al. | 2005 | N.A. | N.A. | N.A. | 6 |
| Study | SVM | k-NN | SGD | LR | DT | NB | RF | GBT | XGBoost | ANN |
|---|---|---|---|---|---|---|---|---|---|---|
| Rakshit et al. | * | - | - | - | - | - | - | - | - | - |
| Cheng et al. | * | o | - | - | - | o | o | o | - | - |
| Jang et al. | * | - | - | - | - | o | - | - | - | - |
| Guo et al. | * | - | - | - | - | - | - | - | - | - |
| Domínguez-Jiménez et al. | * | o | o | - | o | o | o | - | o | - |
| Chueh et al. | o | o | * | - | o | o | - | - | - | - |
| Shu et al. | - | - | * | - | - | - | - | - | - | o |
| Zheng et al. | - | * | - | - | - | - | - | - | - | - |
| Ferdinando et al. | - | * | - | - | - | - | - | - | - | - |
| Jang et al. | o | - | - | - | * | o | - | - | - | - |
| Subramanian et al. | o | - | - | - | * | - | - | - | - | - |
| Nikolova et al. | o | - | - | - | * | - | - | - | - | o |
| Colomer Granero et al. | o | - | o | - | o | - | * | - | - | o |
| Ayata et al. | o | o | - | - | - | o | * | - | - | - |
| Su et al. | - | o | - | - | - | o | * | * | - | - |
| Lee et al. | - | - | - | - | - | - | - | - | - | * |
o, machine learning algorithm tested in the study; -, machine learning algorithm not tested in the study; *, machine learning algorithm with the highest prediction accuracy in the study. SVM, support vector machine; k-NN, k-nearest neighbor; SGD, stochastic gradient descent; LR, logistic regression; DT, decision tree; NB, naïve Bayes; RF, random forest; GBT, gradient boosting trees; XGBoost, extreme gradient boosting; ANN, artificial neural network.