A mismatch in the human realism of face and voice produces an uncanny valley.

Wade J Mitchell, Kevin A Szerszen, Amy Shirong Lu, Paul W Schermerhorn, Matthias Scheutz, Karl F MacDorman.

Abstract

The uncanny valley has become synonymous with the uneasy feeling of viewing an animated character or robot that looks imperfectly human. Although previous uncanny valley experiments have focused on relations among a character's visual elements, the current experiment examines whether a mismatch in the human realism of a character's face and voice causes it to be evaluated as eerie. The results support this hypothesis.

Keywords: Masahiro Mori; anthropomorphism; facial–vocal mismatch; human realism; social perception

Year: 2011 PMID: 23145223 PMCID: PMC3485769 DOI: 10.1068/i0415

Source DB: PubMed Journal: Iperception ISSN: 2041-6695

Mori (1970) proposed a nonlinear relation between a character's degree of human realism and our subjective sense of rapport: the more human the character looks, the more comfortable we feel interacting with it, until a point is reached at which subtle nonhuman flaws cause the character to seem eerie, like an animated corpse. Mori dubbed this dip in rapport bukimi no tani (the uncanny valley). Although Mori conducted no experiments on the uncanny valley, he cited stimuli that could produce the described effect, including a prosthetic hand that looks real but feels cold and hard to the touch. In this example, there is a cross-modal mismatch: the visual appearance of the hand elicits the tactile expectation that it will feel as warm and soft as a human hand. The violation of this expectation causes more than surprise. There is a sense of the macabre, which Jentsch (1906) identified with uncertainty concerning whether the entity is animate or inanimate. This sense may be highest for an entity resembling a human being because of the viewer's self-identification (MacDorman et al 2009b; Ramey 2005).

Theories ranging from the biological to the cultural have been proposed to explain the uncanny valley (MacDorman and Ishiguro 2006; Misselhorn 2009; Moosa and Minhaz Ud-Dean 2010). In empirical studies, attributions of eeriness have been elicited by a mismatch in the human photorealism of a character's visual elements, such as eyes and face; other treatments include pairing a realistic human skin texture with atypical face height, eye separation, and eye size (MacDorman et al 2009a; Seyama and Nagayama 2007). Although Tinwell et al (2010) found that a visual–auditory mismatch correlates with uncanniness, no experiment has yet manipulated facial and vocal human realism as independent variables. This experiment is intended to fill that gap.
The theory that a cross-modal mismatch in human realism causes uncertainty about whether an entity is animate or inanimate, thereby eliciting feelings of eeriness, makes the following prediction (the hypothesis): a robot with a human voice, or a human being with a synthetic voice, will be perceived as eerier than a robot with a synthetic voice or a human being with a human voice.

Forty-eight US-born participants (28 female, 20 male) were recruited in April 2010 from a sample of undergraduate students at a nine-campus Midwestern university. Their mean age was 21.2 years (SD = 3.7). There were no significant differences in the experimental results by age or gender.

In this within-group experiment, each participant viewed, in random sequence, four 14 s videos of a character reciting neutral phrases. Each video corresponded to either a matched stimulus condition (robot figure–synthetic voice, human figure–human voice) or a mismatched one (robot figure–human voice, human figure–synthetic voice). Each video played in a loop until the participant completed validated indices on the character's humanness, eeriness, and interpersonal warmth (Ho and MacDorman 2010). Each index averaged the results of five to eight 7-point semantic differential scales, ranging from −3 to +3. The order of the videos and of the scales was randomized to prevent order effects.

Data analysis was performed in SPSS. The three indices were not significantly correlated and were normally distributed and reliable (Cronbach's αs ranged from 0.70 to 0.88). For humanness, a two-way repeated measures ANOVA found a significant main effect for face realism [F(1,47) = 110.15, p < 0.001, η² = 0.70] and voice realism [F(1,47) = 75.94, p < 0.001, η² = 0.62] and a significant interaction effect [F(1,47) = 18.65, p < 0.001, η² = 0.28]. The human figure–human voice condition was rated the highest [M = 1.40, SE = 0.20], and the robot figure–synthetic voice condition was rated the lowest [M = −2.29, SE = 0.13] (figure 1).
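The index scoring and reliability check described in the method can be illustrated with a minimal Python sketch. It is not the authors' SPSS procedure: the function names `score_index` and `cronbach_alpha` and the example ratings are hypothetical, and the sketch assumes that each index is the plain mean of a participant's ratings and that reliability was computed with the standard Cronbach's α formula.

```python
from statistics import mean, variance


def score_index(item_ratings):
    """Average one participant's ratings on an index's scales into one score.

    Each rating is assumed to lie on the 7-point scale from -3 to +3.
    """
    if any(not -3 <= r <= 3 for r in item_ratings):
        raise ValueError("ratings must lie between -3 and +3")
    return mean(item_ratings)


def cronbach_alpha(ratings):
    """Cronbach's alpha for a participants-by-items table of ratings.

    ratings: list of rows, one row per participant, one column per scale.
    """
    k = len(ratings[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each scale (column) across participants.
    item_variances = [variance(column) for column in zip(*ratings)]
    # Sample variance of each participant's summed score.
    total_variance = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: 4 participants x 5 scales (not data from the study).
example = [
    [2, 1, 2, 3, 2],
    [-1, 0, -1, -2, -1],
    [0, 1, 0, 0, 1],
    [-3, -2, -3, -2, -3],
]
index_scores = [score_index(row) for row in example]
alpha = cronbach_alpha(example)
```

With real data, `ratings` would hold one row per participant for a single index and condition, and values of α near or above 0.70 are conventionally read as acceptable reliability.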
For eeriness, there was a significant main effect for voice realism [F(1,47) = 13.28, p = 0.001, η² = 0.22] and a significant interaction effect [F(1,47) = 36.51, p < 0.001, η² = 0.44]. The two mismatched conditions, robot figure–human voice [M = −0.10, SE = 0.15] and human figure–synthetic voice [M = 0.19, SE = 0.16], were rated significantly higher on eeriness than the two matched conditions, robot figure–synthetic voice [M = −0.60, SE = 0.13] and human figure–human voice [M = −1.10, SE = 0.14], by a paired samples t-test [t(47) = 6.042, p < 0.001].

For warmth, there was a significant main effect for face realism [F(1,47) = 27.62, p < 0.001, η² = 0.37] and voice realism [F(1,47) = 11.15, p = 0.002, η² = 0.19] but no significant interaction effect. Warmth ratings were highest for robot figure–synthetic voice [M = 0.28, SE = 0.11] and lowest for human figure–synthetic voice [M = −0.96, SE = 0.13]. The higher warmth ratings for the robot conditions may be attributed to the robot's cuteness relative to the seriousness of the ex-Marine human actor.
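The reported effect sizes can be cross-checked against the F statistics with a short Python sketch. It rests on two assumptions not stated in the text: that the η² values are partial eta squared (which can be recovered from F and its degrees of freedom), and that the paired samples t-test contrasted each participant's mismatched conditions with their matched conditions, in which case t² should equal the F of the eeriness interaction. The names `partial_eta_squared`, `REPORTED`, and `check_reported` are illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, reported eta squared) pairs copied from the text; every test has df = (1, 47).
REPORTED = {
    "humanness: face realism": (110.15, 0.70),
    "humanness: voice realism": (75.94, 0.62),
    "humanness: interaction": (18.65, 0.28),
    "eeriness: voice realism": (13.28, 0.22),
    "eeriness: interaction": (36.51, 0.44),
    "warmth: face realism": (27.62, 0.37),
    "warmth: voice realism": (11.15, 0.19),
}


def check_reported(reported, df_effect=1, df_error=47):
    """Return the labels whose recomputed effect size differs from the reported one."""
    return [
        label
        for label, (f_value, eta) in reported.items()
        if round(partial_eta_squared(f_value, df_effect, df_error), 2) != eta
    ]


mismatches = check_reported(REPORTED)

# Under the contrast assumption above, the squared t statistic should match
# the eeriness interaction F(1,47) = 36.51 after rounding.
t_squared = 6.042 ** 2
```

An empty `mismatches` list means every reported η² agrees with its F value to two decimal places under the partial eta squared assumption.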

Figure 1. A human voice heightened the eeriness of the robot, while a synthetic voice heightened the eeriness of the human. The error bars indicate 95% confidence intervals.

These results indicate that incongruence in the human realism of a character's face and voice can elicit feelings of eeriness; thus, the hypothesis is supported. This suggests a design principle for synthetic agents to avoid the uncanny valley: the human realism of a character's visual elements and voice should match.

1 in total

1. Too real for comfort? Uncanny responses to computer generated faces.

Authors: Karl F MacDorman; Robert D Green; Chin-Chang Ho; Clinton T Koch
Journal: Comput Human Behav Date: 2009-05-01

21 in total

1. Designing Empathic Virtual Agents: Manipulating Animation, Voice, Rendering, and Empathy to Create Persuasive Agents.

Authors: Dhaval Parmar; Stefan Olafsson; Dina Utami; Prasanth Murali; Timothy Bickmore
Journal: Auton Agent Multi Agent Syst Date: 2022-02-22 Impact factor: 1.431

2. A Bayesian explanation of the 'Uncanny Valley' effect and related psychological phenomena.

Authors: Roger K Moore
Journal: Sci Rep Date: 2012-11-16 Impact factor: 4.379

3. A reappraisal of the uncanny valley: categorical perception or frequency-based sensitization?

Authors: Tyler J Burleigh; Jordan R Schoenherr
Journal: Front Psychol Date: 2015-01-21

4. Persistence of the uncanny valley: the influence of repeated interactions and a robot's attitude on its perception.

Authors: Jakub A Złotowski; Hidenobu Sumioka; Shuichi Nishio; Dylan F Glas; Christoph Bartneck; Hiroshi Ishiguro
Journal: Front Psychol Date: 2015-06-30

Review 5. Is it the real deal? Perception of virtual characters versus humans: an affective cognitive neuroscience perspective.

Authors: Aline W de Borst; Beatrice de Gelder
Journal: Front Psychol Date: 2015-05-12

6. Robots with display screens: a robot with a more humanlike face display is perceived to have more mind and a better personality.

Authors: Elizabeth Broadbent; Vinayak Kumar; Xingyan Li; John Sollers; Rebecca Q Stafford; Bruce A MacDonald; Daniel M Wegner
Journal: PLoS One Date: 2013-08-28 Impact factor: 3.240

7. Stimulus-category competition, inhibition, and affective devaluation: a novel account of the uncanny valley.

Authors: Anne E Ferrey; Tyler J Burleigh; Mark J Fenske
Journal: Front Psychol Date: 2015-03-13

8. A truly human interface: interacting face-to-face with someone whose words are determined by a computer program.

Authors: Kevin Corti; Alex Gillespie
Journal: Front Psychol Date: 2015-05-18

Review 9. A review of empirical evidence on different uncanny valley hypotheses: support for perceptual mismatch as one road to the valley of eeriness.

Authors: Jari Kätsyri; Klaus Förger; Meeri Mäkäräinen; Tapio Takala
Journal: Front Psychol Date: 2015-04-10

10. Walking in the uncanny valley: importance of the attractiveness on the acceptance of a robot as a working partner.

Authors: Matthieu Destephe; Martim Brandao; Tatsuhiro Kishi; Massimiliano Zecca; Kenji Hashimoto; Atsuo Takanishi
Journal: Front Psychol Date: 2015-02-25