Maria Lombardi, Elisa Maiettini, Davide De Tommaso, Agnieszka Wykowska, Lorenzo Natale.
Abstract
Social robotics is an emerging field that is expected to grow rapidly in the near future. Robots increasingly operate in close proximity to humans, or even collaborate with them on joint tasks. In this context, how to endow a humanoid robot with the social behavioral skills typical of human-human interactions is still an open problem. Among the many social cues needed to establish natural social attunement, this article reports our research toward a mechanism for estimating gaze direction, focusing in particular on mutual gaze as a fundamental social cue in face-to-face interactions. We propose a learning-based framework to automatically detect eye contact events in online interactions with human partners. The proposed solution achieved high performance both in silico and in experimental scenarios. We expect our work to be a first step toward an attentive architecture that supports scenarios in which robots are perceived as social partners.
Keywords: attentive architecture; computer vision; experimental psychology; humanoid robot; human–robot interaction; joint attention; mutual gaze
Year: 2022 PMID: 35321344 PMCID: PMC8935014 DOI: 10.3389/frobt.2022.770165
Source DB: PubMed Journal: Front Robot AI ISSN: 2296-9144
FIGURE 1. Dataset collection. (A) Overall setup. The participant was seated at a desk in front of iCub, which had a RealSense camera mounted on its head. (B) Sample frames recorded with both iCub's camera (first row) and the RealSense camera (second row). Different frames capture different human positions (rotations of the torso/head) and conditions (eye contact and no eye contact).
FIGURE 2. Learning architecture. The acquired image is first passed to OpenPose to obtain the facial keypoints and build the feature vector for the individual in the scene. This feature vector is then fed to the mutual gaze classifier, whose output is the pair (r, c), where r is the binary classification result (eye contact/no eye contact) and c is the confidence level.
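The caption above describes a two-stage pipeline: keypoint extraction, then binary classification. The following is a minimal sketch of that pipeline, assuming the face keypoints have already been extracted (e.g., by OpenPose, whose face model produces 70 keypoints); the SVM classifier, the flattened feature layout, and the helper names are illustrative assumptions, not the authors' exact implementation.

```python
# Sketch of the Figure 2 pipeline: keypoints -> feature vector -> (r, c).
# The classifier choice and training data below are hypothetical.
import numpy as np
from sklearn.svm import SVC

def build_feature_vector(face_keypoints: np.ndarray) -> np.ndarray:
    """Flatten the (x, y, confidence) face keypoints into one feature vector.

    face_keypoints: array of shape (70, 3), as produced by OpenPose's
    face model (70 keypoints, each with x, y and a detection confidence).
    """
    return face_keypoints.reshape(-1)

# Hypothetical training set: one feature vector per frame, with a binary
# label (1 = eye contact, 0 = no eye contact).
X_train = np.random.rand(200, 70 * 3)
y_train = np.random.randint(0, 2, size=200)

clf = SVC(probability=True)  # any probabilistic classifier would fit here
clf.fit(X_train, y_train)

def classify_mutual_gaze(face_keypoints: np.ndarray) -> tuple[int, float]:
    """Return (r, c): r is the binary eye-contact prediction and
    c the classifier's confidence in that prediction."""
    x = build_feature_vector(face_keypoints).reshape(1, -1)
    proba = clf.predict_proba(x)[0]
    r = int(np.argmax(proba))
    return r, float(proba[r])
```

In an online setting, classify_mutual_gaze would be called once per camera frame, and the confidence c could be thresholded to suppress uncertain predictions.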
FIGURE 3. Feature importance. (A) Bar plot reporting the SHAP feature importance on the x-axis, expressed as a percentage and measured as the mean absolute Shapley value; only the 20 most important features are listed on the y-axis. (B) Numbered face keypoints of the feature vector.
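Panel (A) ranks features by their mean absolute Shapley value. Below is a hedged sketch of how such a ranking could be computed with the shap library, reusing clf and X_train from the previous sketch; the KernelExplainer choice, the sample sizes, and the percentage normalization are assumptions, not necessarily the authors' procedure.

```python
import numpy as np
import shap

# Explaining only the probability of the positive ("eye contact") class
# keeps the output a single (n_samples, n_features) array.
background = shap.sample(X_train, 50)  # background data for the explainer
explainer = shap.KernelExplainer(lambda X: clf.predict_proba(X)[:, 1], background)
shap_values = explainer.shap_values(X_train[:100])

# Mean absolute Shapley value per feature, normalized to percentages as in
# the bar plot of panel (A); the 20 largest give the y-axis of the figure.
importance = np.abs(shap_values).mean(axis=0)
importance_pct = 100.0 * importance / importance.sum()
top20 = np.argsort(importance_pct)[::-1][:20]
print(top20, importance_pct[top20])
```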
FIGURE 4. Experimental setup. (A) The iCub is positioned between two lateral screens, face to face with the participant on the opposite side of a desk that is 125 cm wide. (B) Sample frames acquired during the experiment, in which the participant first looks at the robot to make eye contact and then simulates a distraction by looking at a lateral screen. The prediction (eye contact: yes/no) and its confidence value c are overlaid on each frame.