| Literature DB >> 28509865 |
Fernando Alonso-Martín1, Juan José Gamboa-Montero2, José Carlos Castillo3, Álvaro Castro-González4, Miguel Ángel Salichs5.
Abstract
An important aspect in Human-Robot Interaction is responding to different kinds of touch stimuli. To date, several technologies have been explored to determine how a touch is perceived by a social robot, usually placing a large number of sensors throughout the robot's shell. In this work, we introduce a novel approach, where the audio acquired from contact microphones located in the robot's shell is processed using machine learning techniques to distinguish between different types of touches. The system is able to determine when the robot is touched (touch detection), and to ascertain the kind of touch performed among a set of possibilities: stroke, tap, slap, and tickle (touch classification). This proposal is cost-effective, since a few microphones can cover the whole robot's shell: a single microphone is enough to cover each solid part of the robot. Besides, it is easy to install and configure, as it just requires attaching the microphone to a contact surface on the robot's shell and plugging it into the robot's computer. Results show high accuracy in touch gesture recognition. The testing phase revealed that Logistic Model Trees achieved the best performance, with an F-score of 0.81. The dataset was built with information from 25 participants performing a total of 1981 touch gestures.
Keywords: acoustic sensing; contact microphone; human-robot interaction; machine learning; touch interaction
Year: 2017 PMID: 28509865 PMCID: PMC5470814 DOI: 10.3390/s17051138
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1 Set of touch gestures using artificial skin, as defined by Silvera [15].
Figure 2 Data flow scheme: (a) the touch is produced by the user; (b) the vibration is collected by the microphone; (c) the beginning of the touch is detected; (d) feature extraction phase; (e) the ending of the touch is detected; (f) the classification phase outputs the recognized gesture.
Figure 3 Voice activity detection is based on the relation between the current SNR and an SNR threshold. The beginning of the gesture is detected when the current SNR rises above the SNR threshold, and the end of the gesture is detected when the SNR remains below the SNR threshold for a fixed period of time.
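The onset/offset logic described in Figure 3 can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the threshold value and the hangover length (the paper's "fixed period of time") are assumed parameters.

```python
import numpy as np

def detect_touch(snr_db, snr_threshold=10.0, hangover_frames=5):
    """Return (start, end) frame indices of the first detected touch,
    or None if no touch occurs.

    A touch begins when the per-frame SNR rises above the threshold,
    and ends once the SNR stays below the threshold for
    `hangover_frames` consecutive frames."""
    start = None
    below = 0
    for i, snr in enumerate(snr_db):
        if start is None:
            if snr > snr_threshold:
                start = i          # gesture onset
        else:
            if snr < snr_threshold:
                below += 1
                if below >= hangover_frames:
                    # gesture offset: first of the consecutive low frames
                    return start, i - hangover_frames + 1
            else:
                below = 0          # SNR recovered; reset the hangover
    return (start, len(snr_db)) if start is not None else None
```

The hangover counter prevents a brief SNR dip within a single gesture (e.g., between the contacts of a tickle) from splitting it into two detections.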
Figure 4 The acoustic signal is analysed in three domains: time, frequency, and time-frequency.
The audio features employed in the Feature Extraction component. We use the maximum, minimum, and average values of pitch, flux, roll-off, centroid, ZCR, RMS, and SNR, plus the gesture duration and the number of contacts per minute; thus, a total of 23 features are extracted for each instance/gesture.
| Feature | Description | Domain |
|---|---|---|
| Pitch | Frequency as perceived by the human ear. | Time, Frequency, Time-Frequency |
| Flux | Feature computed as the sum across one analysis window of the squared difference between the magnitude spectra corresponding to successive signal frames. In other words, it refers to the variation of the magnitude of the signal. | Frequency |
| RollOff-95 | Frequency that contains 95% of the signal energy. | Frequency |
| Centroid | Represents the median of the signal spectrum in the frequency domain, i.e., the frequency around which the signal's energy is concentrated. It is frequently used to characterize the tone or timbre of a sound. | Frequency |
| Zero-crossing rate (ZCR) | Indicates the number of times the signal crosses the abscissa. | Time |
| Root Mean Square (RMS) | Amplitude of the signal volume. | Time |
| Signal-to-noise ratio (SNR) | Relates the touch signal with the noise signal. | Time |
| Duration | Duration of the contact in time. | Time |
| Number of contacts per minute | A touch gesture may consist of several touches. | Time |
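A minimal NumPy sketch of how such per-frame features could be computed and then summarized over a gesture. The frame length, sample rate, and exact estimators are assumptions, not the paper's settings, and pitch, flux, and SNR are omitted for brevity:

```python
import numpy as np

def frame_features(frame, sr):
    """ZCR, RMS, spectral centroid, and RollOff-95 for one analysis window."""
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2        # zero-crossing rate
    rms = np.sqrt(np.mean(frame ** 2))                        # signal volume
    mag = np.abs(np.fft.rfft(frame))                          # magnitude spectrum
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    centroid = np.sum(freqs * mag) / (np.sum(mag) + 1e-12)    # spectral centroid
    energy = np.cumsum(mag ** 2)
    rolloff = freqs[np.searchsorted(energy, 0.95 * energy[-1])]  # RollOff-95
    return zcr, rms, centroid, rolloff

def gesture_vector(signal, sr=16000, frame_len=512):
    """Maximum, minimum, and average of each per-frame feature
    over the whole gesture, concatenated into one vector."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    feats = np.array([frame_features(f, sr) for f in frames])
    return np.concatenate([feats.max(0), feats.min(0), feats.mean(0)])
```

Summarizing variable-length gestures with fixed statistics (max/min/mean) is what lets gestures of different durations share one feature-vector layout for the classifier.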
Figure 5 Acoustic signatures of the touch gestures as acquired by the contact microphone in the time domain. The horizontal axis represents the duration of the sound and the vertical axis represents the amplitude, normalized by the highest amplitude detected among them.
Figure 6 Robotic platform and the integrated contact sensors.
Characterization of the touch gestures employed.
| Gesture | Contact Area | Intensity | Duration | Intention |
|---|---|---|---|---|
| Stroke | med-large | low | med-long | empathy, compassion |
| Tickle | med | med | med-long | fun, joy |
| Tap | small | low | short | advise, warn |
| Slap | small | high | short | discipline, punishment, sanction |
Figure 7 Visual interpretation of the training set.
Classifiers with the best performance using the training set and cross-validation.
| Classifier | F-Score |
|---|---|
| | 1 |
| | 0.93 |
| | 0.82 |
| | 0.81 |
| | 0.80 |
| | 0.76 |
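The paper ranks Weka classifiers by cross-validating on the training set before the held-out test evaluation. As an illustration of that protocol only, here is a self-contained NumPy sketch of k-fold cross-validation around a toy nearest-centroid classifier; the classifier, fold count, and feature dimensionality are placeholders, not the paper's models:

```python
import numpy as np

def nearest_centroid_predict(Xtr, ytr, Xte):
    """Toy classifier: assign each test vector to the class whose
    training-set centroid is closest in Euclidean distance."""
    classes = np.unique(ytr)
    centroids = np.stack([Xtr[ytr == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(Xte[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]

def cross_val_accuracy(X, y, k=10):
    """Plain k-fold cross-validation: train on k-1 folds, score on the
    held-out fold, and average the per-fold accuracies."""
    folds = np.array_split(np.arange(len(X)), k)
    accs = []
    for fold in folds:
        train_mask = np.ones(len(X), dtype=bool)
        train_mask[fold] = False
        pred = nearest_centroid_predict(X[train_mask], y[train_mask], X[fold])
        accs.append(np.mean(pred == y[fold]))
    return float(np.mean(accs))
```

Cross-validating on the training set alone, as here, keeps the test set untouched for the final comparison reported in the next table.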
Classifiers with the best performance using the test set.
| Classifier | F-Score |
|---|---|
| Logistic Model Trees | 0.81 |
| | 0.79 |
| | 0.78 |
| | 0.75 |
| | 0.74 |
| | 0.73 |
| | 0.72 |
Logistic Model Trees confusion matrix using the test set, composed of 634 new touch instances.
| Gesture | Stroke | Tickle | Tap | Slap |
|---|---|---|---|---|
| Stroke | 94 | 21 | 33 | 15 |
| Tickle | 6 | 122 | 5 | 11 |
| Tap | 8 | 0 | 146 | 7 |
| Slap | 7 | 0 | 4 | 155 |
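As a sanity check, the macro-averaged F-score recomputed from this matrix matches the 0.81 reported in the abstract. A NumPy sketch (rows are taken as actual gestures, but the per-class F-score is unchanged if the matrix is transposed, since swapping precision and recall leaves their harmonic mean intact):

```python
import numpy as np

# LMT confusion matrix on the 634-instance test set,
# in the order stroke, tickle, tap, slap.
cm = np.array([[ 94,  21,  33,  15],
               [  6, 122,   5,  11],
               [  8,   0, 146,   7],
               [  7,   0,   4, 155]])

tp = np.diag(cm).astype(float)
precision = tp / cm.sum(axis=0)   # column sums = predicted counts per class
recall    = tp / cm.sum(axis=1)   # row sums = actual counts per class
f1 = 2 * precision * recall / (precision + recall)
print(f"macro F-score: {f1.mean():.2f}")   # -> 0.81
```

The matrix also shows where the 0.19 of error lives: stroke is the hardest class (often confused with tap), while tap and slap are recognized most reliably.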
Summary of robot sensing technologies.
| Sensor Technology | Advantages | Disadvantages |
|---|---|---|
| Resistive | Wide dynamic range. | Hysteresis in some designs. |
| Piezoelectric | Wide dynamic range. | Difficulty of separating piezoelectric from pyroelectric effects. |
| Capacitive | Wide dynamic range. | Susceptible to noise. |
| Magnetic transducer | Wide dynamic range. | Poor spatial resolution. |
| Mechanical transducer | Well-known technology. | Complex for array constructions. |
| Optical transducer | Very high resolution. | Dependence on elastomer in some designs. |
Third-party Weka classifiers employed.
| Name | Developed by | Available on |
|---|---|---|
| EBMC | A. Lopez Pineda | |
| Discriminant Analysis | Eibe Frank | |
| Complement Naive Bayes | Ashraf M. Kibriya | |
| IBKLG | S. Sreenivasamurthy | |
| Alternating Decision Trees | R. Kirkby et al. | |
| HMM | Marco Gillies | |
| Multilayer Perceptrons | Eibe Frank | |
| CHIRP | Leland Wilkinson | |
| AnDE | Nayyar Zaidi | |
| Ordinal Learning Method | TriDat Tran | |
| Grid Search | B. Pfahringer et al. | |
| AutoWeka | Lars Kotthoff et al. | |
| Ridor | Xin Xu | |
| Threshold Selector | Eibe Frank | |
| ExtraTrees | Eibe Frank | |
| LibLinear | B. Waldvogel | |
| SPegasos | Mark Hall | |
| Clojure Classifier | Mark Hall | |
| SimpleCART | Haijian Shi | |
| Conjunctive Rule | Xin Xu | |
| DTNB | Mark Hall et al. | |
| J48 Consolidated | J. M. Perez | |
| Lazy Associative Classifier | Gesse Dafe et al. | |
| DeepLearning4J | C. Beckham et al. | |
| HyperPipes | Len Trigg et al. | |
| J48Graft | J. Boughton | |
| Lazy Bayesian Rules Classifier | Zhihai Wang | |
| Hidden Naive Bayes classifier | H. Zhang | |
| Dagging meta-classifier | B. Pfahringer et al. | |
| MultilayerPerceptronCS | Ben Fowler | |
| Winnow and Balanced Winnow Classifier | J. Lindgren | |
| Nearest-neighbor-like Classifier | Brent Martin | |
| Naive Bayes Tree | Mark Hall | |
| Kernel Logistic Regression | Eibe Frank | |
| LibSVM | FracPete | |
| Fuzzy Unordered Rule Induction | J. C. Hühn | |
| Best First Tree | Haijian Shi | |
| MetaCost meta-classifier | Len Trigg | |
| Voting Feature Intervals Classifier | Mark Hall | |
| Ordinal Stochastic Dominance Learner | Stijn Lievens | |
| RBFNetwork | Eibe Frank | |
| MODLEM rule algorithm | S. Wojciechowski | |
| The Fuzzy Lattice Reasoning Classifier | I. N. Athanasiadis | |
| Functional Trees | C. Ferreira |