Zion Mengesha1,2, Courtney Heldreth2, Michal Lahav2, Juliana Sublewski3, Elyse Tuennerman3.
Abstract
Automated speech recognition (ASR) converts spoken language into text and is used across a variety of applications to assist us in everyday life, from powering virtual assistants and natural language conversations to enabling dictation services. While recent work suggests that there are racial disparities in the performance of ASR systems for speakers of African American Vernacular English, little is known about the psychological and experiential effects of these failures. This paper provides a detailed examination of the behavioral and psychological consequences of ASR voice errors and the difficulty African American users have with getting their intents recognized. The results demonstrate that ASR failures have a detrimental impact on African American users. Specifically, African Americans feel othered when using technology powered by ASR: errors surface thoughts about identity, namely about race and geographic location, leaving them feeling that the technology was not made for them. As a result, African Americans accommodate their speech to have better success with the technology. We incorporate the insights and lessons learned from sociolinguistics in our suggestions for linguistically responsive ways to build more inclusive voice systems that consider African American users' needs, attitudes, and speech patterns. Our findings suggest that the use of a diary study can enable researchers to best understand the experiences and needs of communities who are often misunderstood by ASR. We argue this methodological framework could enable researchers who are concerned with fairness in AI to better capture the needs of all speakers who are traditionally misheard by voice-activated, artificially intelligent (voice-AI) digital systems.
Keywords: African American Vernacular English; artificial intelligence; fair machine learning; natural language processing; social psychology; sociolinguistics; speech to text
Year: 2021 PMID: 34901836 PMCID: PMC8664002 DOI: 10.3389/frai.2021.725911
Source DB: PubMed Journal: Front Artif Intell ISSN: 2624-8212
FIGURE 1. Lack of satisfaction with ASR technology among African-American participants (n = 30).
FIGURE 2. Why African-American participants aren't satisfied (n = 22).
FIGURE 3. Top three attributions of errors among African-American participants when using ASR technology (n = 30).
FIGURE 4. Attribution of errors among African-American participants (n = 30).
FIGURE 5. Who African-American participants believe ASR technology works better for (n = 14).
FIGURE 6. Emotions experienced from voice technology errors (n = 30).
FIGURE 7. Attributes considered when African-American participants encountered ASR technology errors (n = 30).
FIGURE 8. Frequency of speech modification by African-American participants when using ASR technology (n = 30).
FIGURE 9. Emotions experienced when participants accommodated for voice technology errors (n = 30).
FIGURE 10. African-American participants needed to modify their speech for different results (n = 28).
FIGURE 11. African-American participants' belief that ASR technology will not improve for users like themselves (n = 30).