Literature DB >> 32984890

User reactions to COVID-19 screening chatbots from reputable providers.

Alan R Dennis, Antino Kim, Mohammad Rahimi, Sezgin Ayabakan.

Abstract

OBJECTIVES: The objective was to understand how people respond to coronavirus disease 2019 (COVID-19) screening chatbots.
MATERIALS AND METHODS: We conducted an online experiment with 371 participants who viewed a COVID-19 screening session between a hotline agent (chatbot or human) and a user with mild or severe symptoms.
RESULTS: The primary factor driving user response to screening hotlines (human or chatbot) is perceptions of the agent's ability. When ability is the same, users view chatbots no differently or more positively than human agents. The primary factor driving perceptions of ability is the user's trust in the hotline provider, with a slight negative bias against chatbots' ability. Asian individuals perceived higher ability and benevolence than did White individuals.
CONCLUSIONS: Ensuring that COVID-19 screening chatbots provide high-quality service is critical but not sufficient for widespread adoption. The key is to emphasize the chatbot's ability and assure users that it delivers the same quality as human agents.
© The Author(s) 2020. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com.

Entities:  

Keywords:  chatbot; health screening; information technology; public health

Mesh:

Year:  2020        PMID: 32984890      PMCID: PMC7454579          DOI: 10.1093/jamia/ocaa167

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


INTRODUCTION

Many people are seeking information in response to the coronavirus disease 2019 (COVID-19) pandemic. Individuals with various symptoms and conditions are looking for guidance on whether to seek medical attention for COVID-19. Providing accurate, timely information is crucial to help those with—as well as those without—COVID-19 make good decisions. The sudden, unprecedented demand for information is overwhelming resources. One solution is the deployment and use of technologies such as chatbots. Chatbots have the potential to relieve the pressure on contact centers. Chatbots are software applications that conduct an online conversation in natural language via typed text or voice commands (eg, Siri). Chatbots are scalable, so they can meet an unexpected surge in demand when there is a shortage of qualified human agents. Chatbots can provide round-the-clock service at a low operational cost. They are consistent in quality, in that they always provide the same results in response to the same inputs, and are easily retrained in the face of rapidly changing information. Chatbots are also nonjudgmental; they make no moral judgments about the information provided by the user, so users may be more willing to disclose socially undesirable information. As chatbots increase in quality, their use is expanding. For example, chatbots are already widely deployed in customer service applications to guide users through knowledge bases or well-structured processes (eg, technical and customer support). Chatbots integrate directly into existing web, phone, social media, and messaging channels, and can be launched in many different languages.
Chatbots are increasingly being deployed in health care. The COVID-19 pandemic has spurred even greater deployment, much of it for screening potential patients. COVID-19 screening is an ideal application for chatbots because it is a well-structured process that involves asking patients a series of clearly defined questions and determining a risk score. Chatbots can help call centers triage patients and advise them on the most appropriate actions to take, which may be to do nothing because the patient does not present symptoms that warrant immediate medical care. Despite all the potential benefits, like any other technology-enabled service, chatbots will help only if people use them and follow their advice. In this article, we examine whether people will use high-quality chatbots provided by reputable organizations. We control for chatbot quality by examining a chatbot that provides the exact same service as a human agent. COVID-19 screening is based on a very specific set of criteria, so a well-designed chatbot can perform at close to a trained human level. Trust is an important factor that influences the use of chatbots, as well as patient compliance. Users will be reluctant to use chatbots if they do not trust them. Trust in humans is influenced by 3 primary factors that also have parallels for trust in technology. The first is ability: the agent—human or chatbot—must be competent within the range of actions required of it. The agent must have the knowledge and skills needed to make a correct diagnosis. Second is integrity: the agent must do what it says it will do. For example, if the agent says the user’s information is private and will not be disclosed, the information must truly be private. In an era in which data breaches are common, do users believe that technology has integrity? Third is benevolence: the agent must have the patient’s best interests in mind, and not be guided by ulterior motives, such as increasing profits.
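The paragraph above notes that screening reduces to a fixed sequence of clearly defined questions and a risk score. A minimal sketch of such rule-based triage logic follows; the questions, weights, and thresholds here are illustrative assumptions, not the actual CDC protocol:

```python
# Illustrative rule-based screening: each "yes" answer contributes a weight
# to a risk score, and thresholds map the score to a recommended action.
# Questions, weights, and thresholds are hypothetical, not the CDC protocol.
SCREENING_QUESTIONS = {
    "severe_symptoms": 3,   # e.g., trouble breathing, persistent chest pain
    "fever_or_cough": 1,
    "recent_exposure": 1,
    "high_risk_group": 1,   # e.g., older age, chronic conditions
}

def triage(answers: dict) -> str:
    """Map yes/no answers to a recommendation via a simple risk score."""
    score = sum(weight for question, weight in SCREENING_QUESTIONS.items()
                if answers.get(question, False))
    if score >= 3:
        return "seek immediate medical care"
    if score >= 2:
        return "contact a health care provider"
    return "self-monitor at home"

# Mild symptoms and no exposure yield the lowest-risk recommendation.
print(triage({"fever_or_cough": True}))  # self-monitor at home
```

Because every input maps deterministically to the same recommendation, a chatbot built this way is consistent in quality in exactly the sense described above, and updating the protocol only requires changing the question weights or thresholds.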
The underlying trust factors of ability, integrity, and benevolence play important roles in the use of technology, and technology providing recommendations in particular. Ability and integrity are typically more important for instrumental outcomes associated with transactions (eg, purchasing) because users are most concerned with whether the technology will work as intended to complete the transaction. Affect and other perceptual outcomes (eg, satisfaction) are often influenced more by benevolence as these are based more on relationship aspects of technology use. Accordingly, we examine ability, integrity, and benevolence as potential factors to drive trust in chatbots and, subsequently, influence patients’ intentions to use chatbots and comply with their recommendations.

MATERIALS AND METHODS

We conducted a 2 × 2 between-subjects online experiment—2 agent types (human vs chatbot) by 2 patient severity levels (mild vs severe)—in which subjects were randomly assigned to view a video vignette of a COVID-19 screening hotline session between an agent and a patient. The online setting is appropriate because screening services can be provided via various online channels. Vignettes have been commonly used to study human behavior, technology use, and trust because they provide excellent experimental control. Research shows that reading or watching a vignette triggers the same attitudes as actually engaging in the behaviors shown in the vignette; meta-analyses have shown no significant differences in conclusions between vignette studies and studies of actual behavior, although effect sizes in vignette-based studies tend to be slightly lower. In April 2020, we recruited 402 participants from Amazon Mechanical Turk following the usual protocols to ensure data quality. Participants were paid $2.00. Thirty subjects failed one or more of the six attention checks, and one did not report sex; these were removed, leaving 371 participants for analysis. About half were female (n = 188); 83% were White, 8% were Asian, 6% were Black, and 3% were “other” (individuals selecting multiple ethnicities and individuals selecting “other”). The median age was 40 years, with most participants 25-64 years of age (1%: 18-24 years of age; 26%: 25-34; 34%: 35-44; 19%: 45-54; 15%: 55-64; 5%: 65 or older). There were no significant differences in gender, age, or race across the four conditions. The Supplementary Appendix provides the detailed demographics by condition. Participants watched a 2.5-minute video vignette of a fictitious text chat between an agent at a COVID-19 screening hotline and a user with possible COVID-19 symptoms. We designed 2 vignettes in which the user reported either mild or severe symptoms.
We developed our vignettes based on our experiences using four COVID-19 chatbots and the screening questions recommended by the Centers for Disease Control and Prevention (CDC). Participants were told that the agent in the video was either a human or a chatbot (randomly assigned), but the videos were identical across the two conditions to control for quality differences between human and chatbot. Thus, the study compares a chatbot with capabilities identical in quality to those of a human agent. Participants were informed that the hotline was provided by the CDC and were informed of the deception at the end of the study. Any differences between the chatbot and human agent are therefore due to human bias, because participants saw the exact same vignette in both conditions. We used established measures of ability, integrity, benevolence, trust, and the control factors of disposition to trust and personal innovativeness with information technology. We adapted prior measures for satisfaction, persuasiveness, likelihood of use, and likelihood of following up on the agent’s diagnosis. All measures used 7-point scales, and all scales proved reliable (Cronbach’s alpha >.80). All demographic items were categorical variables. More details on the items and reliabilities are provided in the Supplementary Appendix. The experimental materials were pilot tested with 100 undergraduate students at A.R.D.’s university prior to the study. The study was reviewed by the Indiana University Institutional Review Board as protocol #2003099481 and was determined to be exempt.
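The reliability criterion mentioned above (Cronbach's alpha > .80) follows the standard formula alpha = k/(k−1) × (1 − Σ item variances / variance of summed scores). A minimal sketch with made-up 7-point responses (the data are hypothetical, not the study's):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) matrix of scale scores."""
    k = items.shape[1]                         # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)      # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Perfectly correlated items give alpha = 1.0 (hypothetical 7-point responses).
responses = np.array([[7, 7], [4, 4], [1, 1], [6, 6]], dtype=float)
print(round(cronbach_alpha(responses), 3))  # 1.0
```

In practice a scale passes the paper's threshold when this value exceeds .80; uncorrelated items drive the statistic toward zero.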

RESULTS

The first part of our analysis shows that participants perceived the chatbot to have significantly less ability, integrity, and benevolence (see Table 1). Severity of symptoms influenced the perceptions of ability and integrity but not benevolence. The effect sizes for the models as a whole (R2) were what Cohen calls medium or small to medium. The individual effect sizes of the chatbot (partial η2) for ability and integrity were between what Cohen terms small (0.01) and medium (0.06), while the effect size for benevolence was medium. The primary factor influencing perceptions of ability was trust in the provider (ie, the CDC), with the type of agent (human or chatbot) being a secondary factor. For integrity, both trust in the provider and the type of agent were primary factors. For benevolence, the primary factor was the type of agent, with trust secondary. We also controlled for sex, age, and ethnicity. Gender had no significant effect, but, compared with White individuals, individuals of Asian ethnicity perceived the agent to have significantly higher ability and benevolence. Age was significant for benevolence, but there was no pattern to its effects.
Table 1.

Results for ability, integrity, and benevolence showing beta coefficients

                            Ability      Integrity    Benevolence
Chatbot                     −0.399^c     −0.435^c     −0.616^c
Severe symptoms              0.136^a      0.297^b      0.329
Chatbot × severe symptoms    0.103        0.003       −0.260
Higher-risk participant      0.030        0.013        0.013
Disposition to trust         0.162^c      0.218^c      0.202^b
PIIT                         0.108^a      0.126^a      0.164^a
Trust in CDC                 0.331^c      0.221^c      0.217^b
Female                       0.109        0.001        0.136
Age                          Included     Included     Included^a
Ethnicity                    Included^a   Included     Included^a
Constant                     6.125^c      4.511^c      4.650^c
R2                           0.269        0.216        0.193
Adjusted R2                  0.234        0.178        0.154
F                            5.363        5.101        8.434
Effect sizes (partial η2)
  Chatbot                    0.042        0.045        0.088
  Severe symptoms            0.012        0.021        0.007
  Higher risk                0.001        0.000        0.000
  Chatbot × severe symptoms  0.001        0.000        0.003
  Disposition to trust       0.031        0.037        0.023
  PIIT                       0.016        0.015        0.017
  Trust in CDC               0.120        0.040        0.027
  Female                     0.004        0.000        0.003
  Age                        0.030        0.024        0.039
  Ethnicity                  0.023        0.005        0.026

CDC: Centers for Disease Control and Prevention; PIIT: personal innovativeness with information technology.

^a P ≤ .05; ^b P ≤ .01; ^c P ≤ .001.

In the second part of our analysis, we examined 5 outcomes: (1) persuasiveness, (2) satisfaction, (3) likelihood of following the agent’s advice, (4) trust, and (5) likelihood of use (see Table 2), after controlling for the effects of ability, integrity, and benevolence. The effect sizes for the models as a whole (R2) were large. The dominant factor across all five outcomes was perceived ability (very large effect sizes), with chatbot a secondary factor having a medium-sized positive effect on persuasiveness and small-to-medium positive effects on satisfaction, likelihood of following the agent’s advice, and likelihood of use. Last, severity of the condition neither directly affected the outcomes nor moderated the relationship between chatbot and outcomes. The control variables (sex, age, and ethnicity) had no significant effects on the outcome variables.
Table 2.

Results for outcomes showing beta coefficients

                            Persuasive   Satisfaction  Follow advice  Trust     Likely to use
Chatbot                      0.272^c      0.112^c       0.035^a       0.022     0.270^b
Severe symptoms              0.097        0.044        −0.143         0.088     0.004
Chatbot × severe symptoms    0.014        0.069         0.268         0.026     0.039
Higher-risk participant     −0.024       −0.024        −0.039         0.001     0.000
Disposition to trust         0.015        0.035         0.016        −0.006     0.051
PIIT                         0.028        0.021         0.038         0.043     0.115^a
Trust in CDC                −0.001        0.030         0.238^c       0.071^a   0.087
Female                      −0.058        0.005         0.048        −0.125    −0.031
Age                          Included     Included      Included      Included  Included
Ethnicity                    Included     Included      Included      Included  Included
Ability                      0.583^c      0.603^c       0.634^c       0.612^c   0.786^c
Integrity                    0.105^b      0.049        −0.006         0.350^c   0.070
Benevolence                  0.084^a      0.005         0.105         0.072     0.300^c
Constant                     5.605^c      5.82^c        6.883^c       6.191^c   5.949^c
R2                           0.671        0.766         0.553         0.741     0.594
Adjusted R2                  0.653        0.752         0.527         0.726     0.571
F                            35.759       57.167        21.633        50.140    25.601
Effect sizes (partial η2)
  Chatbot                    0.065        0.034         0.011         0.001     0.022
  Severe symptoms            0.010        0.010         0.000         0.007     0.000
  Chatbot × severe symptoms  0.000        0.002         0.007         0.000     0.000
  Higher-risk participant    0.002        0.004         0.002         0.000     0.000
  Disposition to trust       0.001        0.007         0.000         0.000     0.002
  PIIT                       0.003        0.003         0.002         0.005     0.014
  Trust in CDC               0.000        0.005         0.068         0.011     0.007
  Female                     0.003        0.000         0.001         0.010     0.000
  Age                        0.010        0.010         0.022         0.008     0.016
  Ethnicity                  0.007        0.004         0.001         0.009     0.004
  Ability                    0.410        0.576         0.266         0.373     0.277
  Integrity                  0.016        0.007         0.000         0.126     0.002
  Benevolence                0.011        0.000         0.008         0.006     0.042

CDC: Centers for Disease Control and Prevention; PIIT: personal innovativeness with information technology.

^a P ≤ .05; ^b P ≤ .01; ^c P ≤ .001.

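The partial η² values reported in Tables 1 and 2 follow the standard definition SS_effect / (SS_effect + SS_error), which can be computed by comparing the full regression model against a reduced model that drops the predictor of interest. A minimal sketch on synthetic data (the data and variable names are illustrative, not the study's):

```python
import numpy as np

def sse(X: np.ndarray, y: np.ndarray) -> float:
    """Residual sum of squares from an OLS fit of y on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return float(resid @ resid)

def partial_eta_sq(X: np.ndarray, y: np.ndarray, col: int) -> float:
    """Partial eta squared for one predictor, SS_effect / (SS_effect + SS_error),
    via full-vs-reduced model comparison: (SSE_reduced - SSE_full) / SSE_reduced."""
    sse_full = sse(X, y)
    sse_reduced = sse(np.delete(X, col, axis=1), y)
    return (sse_reduced - sse_full) / sse_reduced

# Hypothetical data: y depends strongly on x0 and not at all on x1, so the
# partial eta squared for x0 should dwarf that for x1.
rng = np.random.default_rng(0)
x0 = rng.normal(size=200)
x1 = rng.normal(size=200)
y = 2.0 * x0 + rng.normal(scale=0.5, size=200)
X = np.column_stack([x0, x1])
print(partial_eta_sq(X, y, 0) > partial_eta_sq(X, y, 1))  # True
```

This matches Cohen's benchmarks used in the text: a value near 0.01 is small, near 0.06 medium, and 0.14 or above large.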

DISCUSSION

Simply put, the results show that the primary factor driving patient response to COVID-19 screening hotlines (human or chatbot) is users’ perceptions of the agent’s ability. A secondary factor for persuasiveness, satisfaction, likelihood of following the agent’s advice, and likelihood of use was the type of agent, with participants reporting that they viewed chatbots more positively than human agents, which is good news for healthcare organizations struggling to meet user demand for screening services. This positive response may be because users feel more comfortable disclosing information to a chatbot, especially socially undesirable information, because a chatbot makes no judgment. The CDC, the World Health Organization, UNICEF, and other health organizations caution that the COVID-19 outbreak has provoked social stigma and discriminatory behaviors against people of certain ethnic backgrounds, as well as those perceived to have been in contact with the virus. This is truly an unfortunate situation, and perhaps chatbots can assist those who are hesitant to seek help because of the stigma. The primary factor driving perceptions of ability was the user’s trust in the provider of the screening hotline. Our results show a slight negative bias against chatbots’ ability, perhaps due to recent press reports. Therefore, proactively informing users of the chatbot’s ability is important; users need to understand that chatbots use the same up-to-date knowledge base and follow the same set of screening protocols as human agents.

CONCLUSION

Developing a high-quality COVID-19 screening chatbot—as qualified as a trained human agent—will help alleviate the increased load on COVID-19 contact centers staffed by human agents. When chatbots are perceived to provide the same service quality as human agents, users are more likely to see them as persuasive, be more satisfied, and be more likely to use them. A user’s tech-savviness (personal innovativeness with information technology) has only a small effect, so these results apply to both those with deep technology experience and those with little. Yet, therein lies the rub: there is a gap between how users perceive chatbots’ and human agents’ abilities. Therefore, to offset users’ biases, a necessary component in deploying chatbots for COVID-19 screening is a strong messaging campaign that emphasizes the chatbot’s ability. Because trust in the provider strongly influences perceptions of ability, building on the organization’s reputation may also prove useful.

AUTHOR CONTRIBUTIONS

ARD and AK conceptualized the research. SA, ARD, AK, and MR designed the methodology. AK collected and curated the data. SA, ARD, and AK analyzed the data, SA, ARD, AK, and MR wrote the article.

SUPPLEMENTARY MATERIAL

Supplementary material is available at Journal of the American Medical Informatics Association online.

CONFLICT OF INTEREST STATEMENT

None declared.
  8 in total

1.  To justify or excuse?: A meta-analytic review of the effects of explanations.

Authors:  John C Shaw; Eric Wild; Jason A Colquitt
Journal:  J Appl Psychol       Date:  2003-06

2.  Data breaches of protected health information in the United States.

Authors:  Vincent Liu; Mark A Musen; Timothy Chou
Journal:  JAMA       Date:  2015-04-14       Impact factor: 56.272

3.  Physician-patient relationship and medication compliance: a primary care investigation.

Authors:  Ngaire Kerse; Stephen Buetow; Arch G Mainous; Gregory Young; Gregor Coster; Bruce Arroll
Journal:  Ann Fam Med       Date:  2004 Sep-Oct       Impact factor: 5.166

4.  Quro: Facilitating User Symptom Check Using a Personalised Chatbot-Oriented Dialogue System.

Authors:  Shameek Ghosh; Sammi Bhatia; Abhi Bhatia
Journal:  Stud Health Technol Inform       Date:  2018

5.  Implementation of a digital chatbot to screen health system employees during the COVID-19 pandemic.

Authors:  Timothy J Judson; Anobel Y Odisho; Jerry J Young; Olivia Bigazzi; David Steuer; Ralph Gonzales; Aaron B Neinstein
Journal:  J Am Med Inform Assoc       Date:  2020-07-01       Impact factor: 4.497

6.  Relationship Between Internet Health Information and Patient Compliance Based on Trust: Empirical Study.

Authors:  Xinyi Lu; Runtong Zhang; Wen Wu; Xiaopu Shang; Manlu Liu
Journal:  J Med Internet Res       Date:  2018-08-17       Impact factor: 5.428

7.  Acceptability of artificial intelligence (AI)-led chatbot services in healthcare: A mixed-methods study.

Authors:  Tom Nadarzynski; Oliver Miles; Aimee Cowie; Damien Ridge
Journal:  Digit Health       Date:  2019-08-21

8.  Chatbots in the fight against the COVID-19 pandemic.

Authors:  Adam S Miner; Liliana Laranjo; A Baki Kocaballi
Journal:  NPJ Digit Med       Date:  2020-05-04
  10 in total

1.  Recruitment in a research study via chatbot versus telephone outreach: a randomized trial at a minority-serving institution.

Authors:  Yoo Jin Kim; Julie A DeLisa; Yu-Che Chung; Nancy L Shapiro; Subhash K Kolar Rajanna; Edward Barbour; Jeffrey A Loeb; Justin Turner; Susan Daley; John Skowlund; Jerry A Krishnan
Journal:  J Am Med Inform Assoc       Date:  2021-12-28       Impact factor: 4.497

2.  Use of Health Care Chatbots Among Young People in China During the Omicron Wave of COVID-19: Evaluation of the User Experience of and Satisfaction With the Technology.

Authors:  Yi Shan; Meng Ji; Wenxiu Xie; Xiaomin Zhang; Xiaobo Qian; Rongying Li; Tianyong Hao
Journal:  JMIR Hum Factors       Date:  2022-06-09

3.  Exploring the Influential Factors of Consumers' Willingness Toward Using COVID-19 Related Chatbots: An Empirical Study.

Authors:  Manal Almalki
Journal:  Med Arch       Date:  2021-02

4.  Perceived Utilities of COVID-19 Related Chatbots in Saudi Arabia: a Cross-sectional Study.

Authors:  Manal Almalki
Journal:  Acta Inform Med       Date:  2020-09

5.  Chatbot breakthrough in the 2020s? An ethical reflection on the trend of automated consultations in health care.

Authors:  Jaana Parviainen; Juho Rantala
Journal:  Med Health Care Philos       Date:  2021-09-04

6.  Implications of social media misinformation on COVID-19 vaccine confidence among pregnant women in Africa.

Authors:  Farah Ennab; Maryam Salma Babar; Abdul Rahman Khan; Rahul Jagdishchandra Mittal; Faisal A Nawaz; Mohammad Yasir Essar; Sajjad S Fazel
Journal:  Clin Epidemiol Glob Health       Date:  2022-02-12

7.  Conceptualizing the COVID-19 Pandemic: Perspectives of Pregnant and Lactating Women, Male Community Members, and Health Workers in Kenya.

Authors:  Alicia M Paul; Clarice Lee; Berhaun Fesshaye; Rachel Gur-Arie; Eleonor Zavala; Prachi Singh; Ruth A Karron; Rupali J Limaye
Journal:  Int J Environ Res Public Health       Date:  2022-08-30       Impact factor: 4.614

8.  Health Apps for Combating COVID-19: Descriptive Review and Taxonomy.

Authors:  Manal Almalki; Anna Giannicchi
Journal:  JMIR Mhealth Uhealth       Date:  2021-03-02       Impact factor: 4.773

9.  Health Chatbots for Fighting COVID-19: a Scoping Review.

Authors:  Manal Almalki; Fahad Azeez
Journal:  Acta Inform Med       Date:  2020-12

10.  Upon Improving the Performance of Localized Healthcare Virtual Assistants.

Authors:  Nikolaos Malamas; Konstantinos Papangelou; Andreas L Symeonidis
Journal:  Healthcare (Basel)       Date:  2022-01-04
