Literature DB >> 33912598

Editorial: Language and Robotics.

Tadahiro Taniguchi1, Takato Horii2, Xavier Hinaut3, Michael Spranger4, Daichi Mochihashi5, Takayuki Nagai2.   

Abstract

Keywords:  concept formation; deep learning for robotics; emergence of communication; language acquisition by robots; multimodal communication; symbol emergence in robotics; symbol grounding in robotics

Year:  2021        PMID: 33912598      PMCID: PMC8072269          DOI: 10.3389/frobt.2021.674832

Source DB:  PubMed          Journal:  Front Robot AI        ISSN: 2296-9144



1. Introduction

Language in the real-world environment poses a wide range of challenges for robotics and artificial intelligence (AI). Service robots are required to communicate and collaborate with people using language in the real-world environment. When a robot receives a spoken command from a user in a domestic environment, it must understand the command's meaning in the context of that specific environment. For example, to understand the meaning of “please bring me a pen in Takato's room,” the robot needs to know where to find a pen and where Takato's room is. Furthermore, words or expressions (i.e., sounds processed as symbols) can be invented naturally in our daily environment, and their meaning can change over time (Spranger, 2016), depending, for example, on the culture or age of the speaker. Robots thus need to adapt to these versatile aspects of language as humans do and demonstrate the ability to learn any language (Hinaut and Twiefel, 2019). In robotics, language understanding inevitably involves multimodal learning, semantic mapping, and behavior learning. To enable a robot to interact verbally with people over the long term, we need to develop AI that allows a robot to learn and adapt to language in the real-world environment in an online manner. This topic thus raises several challenges in bridging the gap from low-level sensorimotor interaction (Pagliarini et al., in press) to high-level compositional symbolic communication. Taking inspiration from how children acquire language can help us design the simplest mechanisms to address these challenges. Conversely, robotics can help model and test hypotheses about language acquisition and language grounding (Cangelosi and Schlesinger, 2015; Taniguchi et al., 2016, 2018; Hinaut and Spranger, 2019), in particular through cross-situational experiments (Taniguchi et al., 2017; Juven and Hinaut, 2020). Following the successful session “Language and Robotics” held at IEEE IROS 2018, we organized this Research Topic.
We aimed to publish original papers from robotics, natural language processing, machine learning, and cognitive science to share knowledge about state-of-the-art machine learning methods and perspectives that contribute to modeling language-related capabilities in robotics.

2. About the Research Topic

We are pleased to present five research articles related to semantic mapping, language understanding, motion segmentation, symbol emergence, and language evolution. In this section, we briefly introduce each paper. First, three papers focus on language-related cognitive capabilities that integrate real-world sensor information, which is high-dimensional and full of uncertainty. Each method uses deep learning to handle high-dimensional, uncertain real-world data in robotics. Nagano et al. proposed a new machine learning method called the hierarchical Dirichlet process-variational autoencoder-Gaussian process-hidden semi-Markov model (HVGH). The method extends the hierarchical Dirichlet process-Gaussian process-hidden semi-Markov model (HDP-GP-HSMM), which can automatically segment time-series data. HVGH integrates a variational autoencoder with the HDP-GP-HSMM and achieves automatic motion segmentation together with representation learning. Katsumata et al. proposed a statistical semantic mapping method called SpCoMapping, which combines spatial concept formation and semantic mapping. The proposed model incorporates a Markov random field into a pre-existing spatial concept formation method, enabling it to learn places of arbitrary shape on a map. The method integrates multimodal information, e.g., language, vision, and position, to extract semantic information about places. Tada et al. proposed a robust language understanding method that introduces noise injection into a sequence-to-sequence network. Recently, semantic parsing that enables a robot to understand the meaning of human user commands has been developed based on deep learning methods. However, semantic parsing in natural language processing typically does not account for speech recognition errors. This paper showed that applying the conventional idea of noise injection to sequence-to-sequence semantic parsing can improve the robustness of a robot's language understanding.
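The noise-injection idea above can be sketched as follows. This is a minimal illustration of the general technique, not Tada et al.'s implementation: the error rates, the alphabet, and the toy command/logical-form pair are all hypothetical.

```python
import random

def inject_noise(text, p=0.1, rng=None):
    """Simulate ASR-style transcription errors: with total probability p per
    character, delete it, substitute a random character, or duplicate it."""
    rng = rng or random.Random(0)
    alphabet = "abcdefghijklmnopqrstuvwxyz "
    out = []
    for ch in text:
        r = rng.random()
        if r < p / 3:
            continue                          # deletion
        elif r < 2 * p / 3:
            out.append(rng.choice(alphabet))  # substitution
        elif r < p:
            out.extend([ch, ch])              # duplication
        else:
            out.append(ch)                    # keep as-is
    return "".join(out)

# Augment a (hypothetical) command corpus so the seq2seq parser is trained
# on both clean and corrupted transcripts mapping to the same parse.
clean = "bring me a pen from the room"
pairs = [(inject_noise(clean, p=0.1, rng=random.Random(s)), "BRING(pen, room)")
         for s in range(3)]
```

Training on such perturbed inputs paired with unchanged target parses is what makes the learned parser tolerant of recognition errors at test time.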
Second, two papers focus on the emergence, or evolution, of symbols. Cambier et al. described perspectives on language evolution in swarm robotics. They advocated an approach based on language games for the further development of emergent communication in robot swarms and suggested that swarm robotics can be an ideal testbed for advancing research on the emergence of language-like communication. Hagiwara et al. proposed a new computational model of symbol emergence. The model regards symbol emergence as a multiagent multimodal categorization problem: agents share signs with one another, and each agent forms internal representations based on its own sensorimotor information. The convergence of the algorithm is guaranteed by the theory of Markov chain Monte Carlo.
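The sign-sharing idea can be loosely illustrated with a Metropolis-Hastings-style naming step, in which a speaker proposes a sign drawn from its own belief and a listener accepts or rejects the proposal based on its own model. This is only an illustrative sketch of the MCMC-based exchange, not Hagiwara et al.'s full multimodal model; the agents' sign distributions here are hypothetical.

```python
import random

def mh_naming_step(speaker_probs, listener_probs, current_sign, rng):
    """One Metropolis-Hastings-style exchange: the speaker proposes a sign
    from its own categorical belief; the listener accepts the proposal with
    probability min(1, P_listener(proposal) / P_listener(current))."""
    signs = list(speaker_probs)
    proposal = rng.choices(signs, weights=[speaker_probs[s] for s in signs])[0]
    ratio = listener_probs[proposal] / max(listener_probs[current_sign], 1e-12)
    return proposal if rng.random() < min(1.0, ratio) else current_sign

rng = random.Random(0)
speaker = {"red": 0.8, "blue": 0.2}    # hypothetical beliefs over signs
listener = {"red": 0.6, "blue": 0.4}
sign = "blue"
for _ in range(20):                    # repeated exchanges drive agreement
    sign = mh_naming_step(speaker, listener, sign, rng)
```

Because each acceptance decision uses only the listener's own beliefs, repeated exchanges behave like an MCMC sampler over shared signs, which is the intuition behind the convergence guarantee mentioned above.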

3. Next Step

Building on the success of this Research Topic, we organized related workshops and a tutorial. A survey paper related to this topic has already been published (Taniguchi et al., 2019). We believe that integrating low-level and high-level cognitive capabilities (Nakamura et al., 2018; Taniguchi et al., 2020), in conjunction with language learning in the real-world environment, is crucial to creating an artificial cognitive system, i.e., a robot, that can conduct lifelong learning in the real-world environment and achieve long-term human-robot interaction to support daily human activities. The intersection of language and robotics is a crucial Research Topic for further advancement in robotics and AI. We hope that this special issue will accelerate cutting-edge studies in robotics and AI that aim to create human-level embodied AI that can communicate and collaborate with people in the real-world environment.

Author Contributions

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
  2 in total

1.  Cross-Situational Learning with Bayesian Generative Models for Multimodal Category and Word Learning in Robots.

Authors:  Akira Taniguchi; Tadahiro Taniguchi; Angelo Cangelosi
Journal:  Front Neurorobot       Date:  2017-12-19       Impact factor: 2.650

2.  SERKET: An Architecture for Connecting Stochastic Models to Realize a Large-Scale Cognitive Model.

Authors:  Tomoaki Nakamura; Takayuki Nagai; Tadahiro Taniguchi
Journal:  Front Neurorobot       Date:  2018-06-26       Impact factor: 2.650
