| Literature DB >> 32331291 |
Michal Podpora1,2, Arkadiusz Gardecki1,2, Ryszard Beniak1,2, Bartlomiej Klin2, Jose Lopez Vicario3, Aleksandra Kawala-Sterniuk1.
Abstract
This paper presents a detailed concept of a Human-Robot Interaction system architecture. One of the main differences between the proposed architecture and existing ones is the methodology of acquiring information about the robot's interlocutor. In order to obtain as much information as possible before the actual interaction takes place, custom Internet-of-Things-based sensor subsystems connected to a Smart Infrastructure were designed and implemented to support interlocutor identification and the acquisition of initial interaction parameters. The Artificial Intelligence interaction framework of the developed robotic system (including the humanoid Pepper with its sensors and actuators, as well as additional local, remote, and cloud computing services) is extended with custom external subsystems for additional knowledge acquisition: device-based human identification, visual identification, and audio-based interlocutor localization. These subsystems are introduced and evaluated in detail, demonstrating the benefits of integrating them into the robotic interaction system. A more detailed analysis of one of the external subsystems, the Bluetooth Human Identification Smart Subsystem, is also included. The concept, use case, integration with elements of Smart Infrastructure systems, and prototype implementation were realized in a small front office of the Weegree company, which served as the test-bed application area.
Keywords: COVID-19; cloud services; humanoid robots; pepper robot; sensor networks; smart infrastructure; thermal imaging
Year: 2020 PMID: 32331291 PMCID: PMC7219337 DOI: 10.3390/s20082376
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1Simplified overview diagram of the proposed system’s components.
Figure 2Simplified scheme of the processing of video input streams performed by the VISS.
Figure 3Sample image from the video datalog of the system, visualizing the data acquired by the thermal imaging sensor subsystem.
Figure 4Visualization of the zones for different identification tasks and methods.
Figure 5 Comparison of mechanisms for obtaining information regarding the interlocutor. Use case (A): no RFID or Bluetooth signals available. Use case (B): RFID UHF and Bluetooth subsystems enabled. Wide red arrow: no information about the incoming object/person; wide yellow arrow: partial information; wide green arrow: detailed information; narrow blue arrow: possibility of starting a personalized interaction between the robot (AI framework agent) and the interlocutor.
Figure 6Real-world implementation of the proposed concept. Prototype system with humanoid robot Pepper ready to work. ©2020 IEEE. Reprinted, with permission, from [20].
Figure 7 Bluetooth usability test: 7 different test scenarios of a BT device carried towards the Bluetooth-based Human Identification Smart Sensor (BT HISS) node. X axis: distance in meters; Y axis: RSSI as a relative index (0 = best quality).
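The RSSI-versus-distance behavior probed in the test above is commonly modeled with the log-distance path-loss model. The sketch below is a generic illustration of that model, not the paper's own method; the reference power `tx_power_dbm` (expected RSSI at 1 m) and the path-loss exponent are hypothetical values that would need per-device calibration.

```python
import math

def rssi_to_distance(rssi_dbm, tx_power_dbm=-59.0, path_loss_exponent=2.0):
    """Estimate distance (m) from a Bluetooth RSSI reading using the
    log-distance path-loss model: RSSI = TxPower - 10 * n * log10(d).

    tx_power_dbm      -- assumed RSSI at 1 m (device-specific, calibrated)
    path_loss_exponent -- n, ~2.0 in free space, higher indoors
    """
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10.0 * path_loss_exponent))

# At the reference power, the model yields exactly 1 m.
print(rssi_to_distance(-59.0))                                  # 1.0
# A weaker (more negative) RSSI maps to a larger estimated distance.
print(rssi_to_distance(-79.0) > rssi_to_distance(-69.0))        # True
```

In practice, indoor RSSI is noisy, so such estimates are usually smoothed (e.g. with a moving average) before being used for zone classification as in Figure 4.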
Average values and dispersion of the response times of the Speech-to-text service and the robotic system database (phrases 1–4: minimal comfortable man-machine speaking distance; phrases 5–8: average distance; phrases 9–12: maximal comfortable distance).
| | Phrases 1–4 Spoken from the Shortest Distance [s] | Phrases 5–8 Spoken from the Medium Distance [s] | Phrases 9–12 Spoken from the Zone Boundary Distance [s] | All Phrases [s] |
|---|---|---|---|---|
| AVSpeechToText | 0.1688 | 0.1911 | 0.1965 | 0.5861 |
| DispSpeechToText | 0.1476 | 0.1911 | 0.1449 | 3.4876 |
| AVChatbot | 0.3305 | 0.2663 | 0.2813 | 0.2695 |
| DispChatbot | 0.4242 | 0.2646 | 0.4124 | 0.3699 |
Figure 8 Relative response times of the Google Speech-to-text service, measured after sending the last frame of the audio speech data to Google.