Agata Kołakowska, Wioleta Szwoch, Mariusz Szwoch.
Abstract
In recent years, emotion recognition algorithms have achieved high efficiency, allowing the development of various affective and affect-aware applications. This advancement has taken place mainly on personal computers, which offer the appropriate hardware and sufficient power to process complex data from video, audio, and other channels. However, the growing computing and communication capabilities of smartphones, the variety of their built-in sensors, and the availability of cloud computing services have made them an environment in which emotions can be recognised at least as effectively. This is particularly important because smartphones and other mobile devices have become the main computing devices used by most people. This article provides a systematic overview of publications from the last 10 years on emotion recognition methods using smartphone sensors. It presents the characteristics of the most important sensors in this respect and the methods applied to extract informative features from the data read from these input channels. It then describes the machine learning approaches implemented to recognise emotional states.
Keywords: affective computing; emotion recognition; human–computer interaction; sensors; sensory data; smartphones
Year: 2020 PMID: 33171646 PMCID: PMC7664622 DOI: 10.3390/s20216367
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Distribution of reviewed papers across the selected time span (for 2020, only the first half of the year was taken into account).
Figure 2. (a) Sample illustrations of Ekman’s six emotions; (b) sample representation of the PAD model.
Figure 3. General emotion recognition process.
Low-level features extracted on the basis of accelerometer time series data.
| Feature | References |
| mean | [ |
| median | [ |
| maximum, minimum | [ |
| range | [ |
| interquartile range | [ |
| variance | [ |
| standard deviation | [ |
| mean absolute deviation | [ |
| skewness | [ |
| kurtosis | [ |
| root mean square | [ |
| energy | [ |
| power | [ |
| magnitude | [ |
| signal magnitude area | [ |
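As a rough illustration of the features listed in the table above, the following Python sketch computes them for one window of three-axis accelerometer samples. The function names are hypothetical, and several definitions are assumptions rather than those of any reviewed study: energy is taken as the sum of squared samples, power as the mean of squared samples, magnitude as the mean Euclidean norm of the acceleration vector, and signal magnitude area as the mean of the summed absolute axis values.

```python
import math
import statistics


def axis_features(x):
    """Low-level statistics for one axis of an accelerometer window."""
    n = len(x)
    mean = statistics.fmean(x)
    q1, _, q3 = statistics.quantiles(x, n=4)  # quartile cut points
    var = statistics.pvariance(x, mu=mean)    # population variance
    std = math.sqrt(var)
    energy = sum(v * v for v in x)            # assumed: sum of squared samples
    # Standardised third and fourth moments; 0.0 for a constant signal.
    skew = sum((v - mean) ** 3 for v in x) / (n * std ** 3) if std else 0.0
    kurt = sum((v - mean) ** 4 for v in x) / (n * std ** 4) if std else 0.0
    return {
        "mean": mean,
        "median": statistics.median(x),
        "max": max(x),
        "min": min(x),
        "range": max(x) - min(x),
        "iqr": q3 - q1,
        "variance": var,
        "std": std,
        "mad": sum(abs(v - mean) for v in x) / n,
        "skewness": skew,
        "kurtosis": kurt,
        "rms": math.sqrt(energy / n),
        "energy": energy,
        "power": energy / n,                  # assumed: mean of squared samples
    }


def window_features(xs, ys, zs):
    """Per-axis statistics plus two features that combine all three axes."""
    features = {axis: axis_features(vals)
                for axis, vals in (("x", xs), ("y", ys), ("z", zs))}
    n = len(xs)
    # Mean Euclidean norm of the acceleration vector.
    features["magnitude"] = sum(
        math.sqrt(x * x + y * y + z * z) for x, y, z in zip(xs, ys, zs)) / n
    # Signal magnitude area: mean of summed absolute axis values.
    features["sma"] = sum(
        abs(x) + abs(y) + abs(z) for x, y, z in zip(xs, ys, zs)) / n
    return features
```

In practice such a function would be applied to consecutive (possibly overlapping) windows of the recorded signal, and the resulting dictionaries flattened into feature vectors for a classifier.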
Studies on smartphone emotion recognition. m/f—males/females, T—touchscreen, A—accelerometer, G—gyroscope, M—magnetometer, P—GPS, B—bluetooth, L—light, O—other information.
| [ | valence, arousal, dominance (3 levels); stress, anger, happiness, sadness, surprise (binary) | yes | 70 (35/35) | 18–31 | x | deep learning (variational auto-encoder + fully connected layers); general model | ACC/AUC 67%/0.84 (valence), 63%/0.82 (arousal), 65%/0.82 (dominance); AUC: stress 0.8, anger 0.84, happiness 0.88, sadness 0.87, surprise 0.76 | |||||||
| [ | sadness, happiness, anger, surprise, disgust, fear | no | 40 (26/14) | avg 25.2 | x | x | SVM | 95% (pair-wise), 86.45% (multiclass) | ||||||
| [ | stress, anxiety, depression (5 levels) | no | 115 (100/15) | avg 19.8 | x | random forest (general) | stress and anxiety 73.4%, depression 74.1% | |||||||
| [ | valence, arousal (5 levels) | no | 50 | x | x | x | x | x | classifier fusion (neural networks, decision trees) | 71.67%, 72.37% | ||||
| [ | valence, arousal (binary); affect (positive/negative) | yes | 41 (33/17) | avg 24.42 | x | x | x | x | SVM, personalised | browsing: 81% valence, 85% arousal, 81% affect; chatting: 69% valence, 72% arousal, 62% affect | ||||
| [ | valence, arousal (binary) | yes | 33 (17/16) | avg 24.19 | x | x | naive Bayes, SVM (general, 4 classes as 4 combinations of valence/arousal) | 86.6% (naive Bayes), 83.21% (SVM) | ||||||
| [ | valence, arousal (continuous) | no | 39 (32/7) | x | x | personalised regression (random forest for valence, AdaBoost for arousal) | pleasure: 82.2%, arousal: 65.7% | |||||||
| [ | happy, sad, stressed, relaxed | no | 24 (20/4) | avg 23.3 | x | multitask neural network (multiclass, first layers shared, then personalised layer) | AUC 84% | |||||||
| [ | happy, sad, stressed, relaxed | no | 22 (20/2) | 24–33 | x | random forest (personalised, multiclass) | AUC 78%, 73% | |||||||
| [ | happy, sad, stressed, relaxed | no | 15 (12/3) | 24–33 | x | deep neural network (personalised, multiclass) | 80% | |||||||
| [ | stress (5 levels) | yes | 13 (7/6) | 22–32 | x | decision tree, k-NN, Bayesian network, SVM, neural network | F-measure, 5-class: individual 79–87% (swipe), 77–81% (scroll); global 75–92% (swipe), 67–78% (scroll) | |||||||
| no | 25 | x | x | x | F-measure, 3-class: individual 86–88%, global 63–83% | |||||||||
| [ | excited, cheerful, relaxed, calm, bored, sad, irritated, tense, neutral | no | 8 (2/6) | x | x | x | x | random forest | pleasant (excited, cheerful, relaxed, calm) vs. unpleasant (bored, sad, irritated, tense): 71%; activated (excited, cheerful, irritated, tense) vs. deactivated (relaxed, calm, bored, sad): 78.3% | |||||
| [ | compound emotions (combination of sadness, anger, surprise, fear and disgust) | no | 30 (13/17) | 18–30 | x | x | x | x | x | x | personalised factor graph | 76% | ||
| [ | excitement, relaxation, boredom, frustration | yes | 20 (10/10) | avg 34 | x | SVM, naive Bayes, random forest, logistic regression | 4 classes: 67.5% (SVM, naive Bayes); 2 classes (excitement + relaxation vs. boredom + frustration): 78.75% (logistic regression), 77.5% (random forest) | |||||||
| [ | happy, stressed, sad, relaxed | no | 22 (20/2) | 24–33 | x | x | random forest (personalised, multiclass) | AUC 84% | ||||||
| [ | happy, stressed, sad, relaxed | no | 22 (20/2) | 24–33 | x | random forest (personalised, multiclass) | AUC 73% | |||||||
| [ | anger, disgust, happy, sad, surprised, fear, neutral | no | x | x | x | x | x | naive Bayes (multiclass) | 72% | |||||
| [ | valence, arousal (binary) | no | 18000 | x | x | deep neural network of stacked restricted Boltzmann machines | 68% (valence) | |||||||
| [ | valence, arousal (binary) | yes | 29 (29/0) | 19–24 | x | k-NN, SVM, naive Bayes, decision tree | k-NN 94.57%, SVM 96.75%, decision tree 96.4%, naive Bayes 88.4% | |||||||
| [ | happy, sad, angry, neutral | no | 3 | x | x | decision tree (multiclass, general), multi-response linear regression (binary, general) | decision tree: F-measure 0.902, AUC 0.954; regression: F-measure 0.896, AUC 0.851 | |||||||
| [ | valence, arousal (3 levels) | no | 10 (3 provided enough data) | x | multilayer perceptron, SVM | arousal 75% SVM, valence 50.9% MLP | ||||||||
| [ | stress (3 levels) | no | 30 (18/12) | avg 37.46 | x | x | decision trees + transfer learning (personalised) | 71.58% | ||||||
| [ | affect (positive, neutral) | yes | 55 | x | x | x | SVM (general) for classification and regression | regression (7 point scale): RMSE 1.33; classification (binary): 87.3% (labels on the basis of two elicited states), 89.1% (labels from self reports) | ||||||
| valence, arousal, affect | no | 120 | x | x | x | SVM (general) for classification and regression | regression (7 point scale): affect RMSE 1.32, valence RMSE 1.61, arousal RMSE 1.88; classification (binary): affect 69%, valence 81.7%, arousal: 67.5% | |||||||
| [ | happy, angry, neutral | yes | 59 (27/32) | x | SVM (general) | anger (binary): 90.03% (wrist), 90.31% (ankle); happiness (binary): 89.76% (wrist), 87.65% (ankle); happy/angry: 87.1% (ankle); happy/angry/neutral: 85/78/78% (ankle) | ||||||||
| [ | positive, negative, neutral | yes | 24 (12/12) | 21–25 | x | random forest | 85.1% (personalised), 78.8% (general) | |||||||
| [ | stress (3 levels) | no | 30 (18/12) | avg 37.46 | x | naive Bayes, decision tree (general, personalised, based on similar users' data) | general: 52% (naive Bayes), 50% (decision tree); personalised 71%; based on similar users: 60% (naive Bayes), 55% (decision tree) | |||||||
| [ | sad, happy, angry, content, energetic, tense | no | 10 (6/4) | 20–40 | x | x | x | x | transfer learning (general) + SVM (personalised) | 75% (general), accuracy rises after a few days due to validation and re-training | ||||
| [ | (1) valence, arousal (5 levels); (2) happiness, sadness, fear, anger, neutral | no | 12 (7/5) | x | x | x | x | x | x | x | random forest | general: 65.91% (discrete emotions), 72.73% (pleasure); personalised (one user only): 70.00% (discrete emotions), 79.78% (pleasure) | ||
| [ | boredom (binary) | no | 54 | 21–57 | x | x | random forest (general) | 82.9% | ||||||
| [ | mood (5 levels) | no | 9 (5/4) | 21–27 | x | x | x | naive Bayes (personalised) | 76% | |||||
| [ | stressed, excited, neutral | no | 20 (15/5) | 21–30 | x | decision trees (multiclass) | 71% (cross validation), 58% (test set) | |||||||
| [ | stress (binary) | no | 117 | x | x | random forest, generalised boosted model (GBM) (general) | 72.51%, 72.28% (random forest); 71.35% (GBM) | |||||||
| [ | happiness (3 levels) | no | 117 | x | x | random forest (general) | 80.81% | |||||||
| [ | valence, arousal (5 levels) | no | 32 (21/11) | 18–29 | x | x | multi-linear regression (personalised, general, hybrid) | 93% (personalised), 66% (general), 75% (hybrid, after 30 days) | ||||||
| [ | happiness, surprise, anger, disgust, sadness, fear, neutral | no | 1 (1/0) | 30 | x | x | x | x | x | Bayesian network (multiclass) | 67.52% | |||
| [ | displeasure, tiredness, tensity (5 levels) | no | 15 | x | x | x | factor graph (personalised) | 52.58% (displeasure), 45.36% (tiredness), 47.42% (tensity) | ||||||
| [ | excited, relaxed, frustrated, bored; arousal, valence (2 levels) | no | 15 (9/6) | 18–40 | x | SVM (general), discriminant analysis (personalised) | general: 88.7% (arousal), 86% (valence), 77% (4 emotions); personalised: 89% (arousal), 83% (valence), 76.4% (4 emotions) | |||||||
| [ | positive, negative, neutral | no | 30 | x | x | dynamic continuous factor graph | F-measure 53.31% | |||||||
| [ | stress (binary) | yes | 19 | 20–57 | x | x | x | decision tree (general) | 78% | |||||
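The table above distinguishes general models (trained on data pooled across users) from personalised ones (trained on a single user's data). The sketch below shows one possible way to build the corresponding train/test splits; it is an assumption for illustration only, since the reviewed studies use differing protocols. The function names and the `(user_id, features, label)` sample layout are hypothetical.

```python
from collections import defaultdict


def leave_one_user_out(samples):
    """Yield (held_out_user, train, test) splits for a general model.

    `samples` is a list of (user_id, features, label) tuples. Each user in
    turn is held out for testing while all other users form the training set.
    """
    users = sorted({user for user, _, _ in samples})
    for held_out in users:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


def per_user_split(samples, train_fraction=0.7):
    """Return {user_id: (train, test)} for personalised models.

    Each user's samples stay in recorded order: the first part is used for
    training and the remainder for testing, so no later data leaks backwards.
    """
    by_user = defaultdict(list)
    for sample in samples:
        by_user[sample[0]].append(sample)
    splits = {}
    for user, rows in by_user.items():
        cut = int(len(rows) * train_fraction)
        splits[user] = (rows[:cut], rows[cut:])
    return splits
```

A hybrid scheme, as reported in some rows, could start from the general split and gradually add the target user's own labelled samples to the training set.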
Features extracted in the reviewed studies.
| [ | Touchscreen | heat maps of pressure, down-down speed, up-down speed |
| [ | Accelerometer, Gyroscope | features calculated on the basis of x, y and z sequences: mean, median, standard deviation, max, min, index of max/min, skewness, kurtosis, entropy, root mean square, energy, power, mean absolute deviation, interquartile range, signal magnitude area, zero crossing rate, slope sign change, waveform length; FFT coefficients and their mean, max, magnitude, energy, band power of signal; sum of squares and sum of absolute values of wavelet transform coefficients; additionally step length and step duration calculated on the basis of accelerometer x series |
| [ | Touchscreen | features describing swipes: length, speed, relation between distance and displacement, pressure variance, touch area variance, direction, variance of the angle between points and axes; features calculated for all pairs of consecutive points or all pairs between the starting/ending point of a swipe and any other extracted for eight predefined directions: percentage of touches in each direction, variance of the direction of the vector determined by the mentioned pairs of points |
| [ | Accelerometer | shaking time, severity of shaking, times of shaking, time of portrait orientation, landscape orientation, times of exchanging orientation, step count, difference between average and largest speed |
| Gyroscope | rotation time, mean angular velocity | |
| GPS | entropy | |
| Light | state (no use/indoor/outdoor) | |
| Other | network speed, strength of signal | |
| [ | Touchscreen | touch area, maximum pressure, pressure, hold-time, distance between start and end position, speed, number of touches outside/inside keyboard layout, number of spacebar/send/change language/change number, duration since last press |
| Accelerometer | values of x, y, z | |
| Gyroscope | values of x, y, z | |
| Other | response time | |
| [ | Touchscreen | typing speed, backspace frequency, max number of characters without pressing delete for a second, touch count |
| Accelerometer | device shake frequency | |
| [ | GPS | mean and standard deviation of latitude and longitude |
| Other | average distance from work, distance from home, time of the day, day of week | |
| [ | Touchscreen | sequence of vectors containing: intertap duration, alphanumeric (1/0), special characters (1/0), backspace (1/0), touch pressure, touch speed, touch time |
| [ | Touchscreen | see [ |
| Other | last ESM response | |
| [ | Touchscreen | mean ITD (intertap duration), mean nonoutlier ITD, i-th percentiles of ITD (i = 25, 50, 75, 90), mean and standard deviation of word completion time, session duration, sum and number of ITDs greater than 30 s, session duration-pause time, session duration/number of characters, session duration/number of words, percentage of backspace, percentage of nonalphanumeric characters |
| [ | Touchscreen | tap features: mean pressure, size, movement; scroll/swipe features: mean pressure, size, delta, length; typing features: pressure, tap size, tap movement, tap duration, pressure/size, tap distance, wrong words/all words, back/all digits |
| [ | Touchscreen | number of touches; minimum, maximum, range, mean, median, variance, standard deviation of touch intervals; session duration |
| Accelerometer | scores for various activities (Google API) | |
| Other | frequency and percentage of time of different application categories, screen features (duration and number of events for different states: on, off, unlocked) | |
| [ | Accelerometer | mean and variance of x,y,z; step count |
| Gyroscope | mean and variance of x,y,z | |
| Magnetometer | mean and variance of x,y,z | |
| GPS | longitude, latitude, altitude | |
| Light | mean, variance, dark ratio, bright ratio, dark to bright ratio | |
| Other | application usage (duration for various categories), screen (on ratio, off ratio, Sleeping Duration, Usage Amount), call frequency and duration for each contact person, sms frequency of each contact person, microphone (mean, variance, noise ratio, silence ratio, noise to silence ratio), WiFi (frequency of the top N occurred IDs) | |
| [ | Touchscreen | touch pressure, touch duration, time between subsequent touches |
| [ | Touchscreen | see [ |
| Other | working hour indicator, persistent emotion | |
| [ | Touchscreen | mean session ITD (intertap duration), refined mean session ITD, percentage of special characters (nonalphanumeric), number of backspace or delete, session duration, session text length |
| [ | Touchscreen | typing time, typing speed, key press count, touch count, backspace count |
| Accelerometer | device shake count | |
| GPS | latitude, longitude | |
| Light | illuminance | |
| Other | time zone, discomfort index, weather attributes from OpenWeatherMap API | |
| [ | Accelerometer | DTW distance between the accelerometer readings during the observation interval and the average readings |
| Other | DTW distance between microphone readings during the observation interval and the average microphone readings, difference between the number of messages (calls) exchanged during the observation intervals and the average number of messages (calls) | |
| [ | Touchscreen | number of touch events (down, up, move), average pressure of events |
| [ | Touchscreen | average time delay between typed letters, number of backspaces, number of letters |
| Accelerometer | average acceleration | |
| [ | Accelerometer | mean, standard deviation, standard deviation of mean peak, mean jerk, mean step duration, skewness, kurtosis, standard deviation of power spectral density |
| [ | Accelerometer | percentage of high activity periods |
| Other | location changes (on the basis of WiFi access points, Google Maps locations, cellular towers), conversation time (microphone), parameters from call and sms logs, number of applications used and duration for selected categories of applications | |
| [ | Touchscreen | finger speed, speed normalised by task difficulty, precision, precision normalised by task difficulty, pressure, pressure decline, difference in angle between fingers and centroid at the beginning and end of interaction, angle between horizontal line and line intersecting centroid and tap, approach direction, tap movement, distance between two fingers |
| Accelerometer | horizontal and vertical acceleration, difference in aggregated acceleration | |
| Gyroscope | rotation around x, y, z axis, difference in aggregated rotation | |
| [ | Accelerometer | standard deviation, kurtosis, skewness, correlation coefficient (for every two axes), FFT coefficients, power spectral density |
| [ | Touchscreen | features describing strokes: mean, median, max, min, variance of length, time, pressure and speed of strokes |
| [ | Accelerometer | for x, y, z: mean, std, max, min, median, range, absolute value, variance; variance sum, magnitude, signal magnitude area, root mean squared, curve length, non linear energy, entropy, energy, mean energy, standard deviation of energy, DFT (Discrete Fourier Transform), peak magnitude, peak magnitude frequency, peak power, peak power frequency, magnitude entropy, power entropy (for each parameter min, max and mean calculated on the basis of 2 h period) |
| [ | Accelerometer | series of state values (run/walk/silence) |
| GPS | visited locations, time at locations | |
| Bluetooth | number of Bluetooth IDs, IDs seen for more than a predefined time, maximum time for an ID seen | |
| Other | parameters from call and sms logs, number of WiFi signals, content features extracted from text and emoticons | |
| [ | Accelerometer, Gyroscope, Magnetometer | for each axis: maximum, minimum, mean, standard deviation, wave number, crest mean, trough mean, the maximum difference between the crest and trough, the minimum difference between the crest and trough; additionally periods of steady/slow/fast on the basis of accelerometer |
| GPS | number of locations, entropy (time in different locations) | |
| Bluetooth | number of connections | |
| Light | proportion of time for not used, used indoors, used outdoors | |
| Other | call and message log parameters, number of WiFi connections, application usage time, time of light and dark screen, times of unlocking screen, number of photos, proportion of time in various modes | |
| [ | Light | ambient light |
| Other | connected to headphone or bluetooth, charging, day of week, hour, screen covered or not, ringer mode, average battery drain, battery change during the last session, bytes received/transmitted, time spent in selected applications or sessions, number of notifications, name/category of app that created last notification, number of apps used, number of phone unlocks, time since user last opened notification centre, time since last phone unlock, screen orientation changes, category/name of app in focus prior to probe and name of the previous app, name/category of app used most often | |
| [ | Accelerometer | activity from Google Activity Recognition API |
| Light | ambient light | |
| Other | noise, message history, call history, connectivity type (WiFi, mobile, none), calendar (number and type of appointments), daytime, day type (weekday/weekend), location (cell ID) | |
| [ | Accelerometer | raw values of x, y, z |
| [ | Bluetooth | general proximity information, diversity (entropy of proximity contacts, the ratio of unique contacts to interactions, the number of unique contacts), regularity (mean and variance of time elapsed between two interaction events) |
| Other | general phone usage (number of outgoing, incoming and missed calls, number of sent and received sms), diversity (entropy of contacts, unique contacts to interactions ratio, number of unique contacts), regularity (average and variance of time elapsed between two calls or two sms or call and sms); second order features (selected statistics calculated for each basic feature); weather parameters (mean temperature, pressure, total precipitation, humidity, visibility, wind speed metrics) | |
| [ | GPS | number of visits in selected locations |
| Other | emails (number of emails, number of characters), sms (number of messages, number of characters), calls (number of calls, call duration); number of visits on website domains; application usage (categories, number of launches, duration) | |
| [ | Touchscreen | typing speed, maximum text length, erased text length, touch count, long touch count, frequency of backspace, enter and special symbol |
| Accelerometer | device shake count | |
| GPS | location (home, work, commute, entertain etc.) | |
| Light | illuminance | |
| Other | time, weather, discomfort index calculated as 0.4 (Ta + Tw) + 15, where Ta is the dry-bulb temperature and Tw is the wet-bulb temperature | |
| [ | Accelerometer | activity (proportion of sitting, walking, standing, running), micromotion (picking a phone and doing nothing for longer than a few seconds) |
| GPS | location | |
| Other | communication frequency (sms, calls) | |
| [ | Touchscreen | features describing strokes: mean, median, maximum and minimum values of the length, speed, directionality index (distance between the first and the last point of a stroke), contact area |
| [ | GPS | location (region id) |
| Other | sms text, calling log | |
| [ | Touchscreen | mean and maximum intensity of touch, accuracy of touches (relation between touches on active versus passive areas) |
| Accelerometer | acceleration | |
| Other | amount of movement (taken from camera) |
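To make a few of the non-accelerometer features in the table above more concrete, the following Python sketch computes typing statistics from tap timestamps, an entropy of the time spent in different locations, and the discomfort index using the formula quoted in the table. The function names and input layouts are hypothetical, and the base-2 logarithm in the entropy is an assumption; individual studies may define these features differently.

```python
import math
import statistics


def intertap_features(tap_times, backspace_flags):
    """Session-level typing features from tap timestamps given in seconds.

    `backspace_flags` holds 1 for a backspace tap and 0 otherwise.
    """
    # Intertap durations: time elapsed between consecutive taps.
    itd = [b - a for a, b in zip(tap_times, tap_times[1:])]
    return {
        "mean_itd": statistics.fmean(itd),
        "median_itd": statistics.median(itd),
        "session_duration": tap_times[-1] - tap_times[0],
        "backspace_pct": 100.0 * sum(backspace_flags) / len(backspace_flags),
    }


def location_entropy(time_per_location):
    """Shannon entropy (bits, assumed base 2) of time shares per location."""
    total = sum(time_per_location.values())
    shares = [t / total for t in time_per_location.values() if t > 0]
    return -sum(p * math.log2(p) for p in shares)


def discomfort_index(dry_bulb, wet_bulb):
    """Discomfort index 0.4 (Ta + Tw) + 15, as quoted in the table."""
    return 0.4 * (dry_bulb + wet_bulb) + 15
```

A low location entropy indicates that the user spent most of the time in one place, while a high value indicates time spread evenly over many places.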