Yoo Jung Oh, Jingwen Zhang, Min-Lin Fang, Yoshimi Fukuoka.
Abstract
BACKGROUND: This systematic review aimed to evaluate AI chatbot characteristics, functions, and core conversational capacities and investigate whether AI chatbot interventions were effective in changing physical activity, healthy eating, weight management behaviors, and other related health outcomes.
Keywords: Artificial intelligence; Chatbot; Conversational agent; Diet; Nutrition; Physical activity; Sedentary behavior; Systematic review; Weight loss; Weight maintenance
Year: 2021 PMID: 34895247 PMCID: PMC8665320 DOI: 10.1186/s12966-021-01224-6
Source DB: PubMed Journal: Int J Behav Nutr Phys Act ISSN: 1479-5868 Impact factor: 6.457
Summary of inclusion and exclusion criteria
| | Inclusion criteria | Exclusion criteria |
|---|---|---|
| Populations/participants | Adults and/or children who use AI chatbots for PA, diet, and/or weight management | None |
| Interventions | Constrained^a and/or unconstrained^b text and/or speech-based AI chatbots operating as standalone software or via a web browser or mobile application | Chatbots that are part of virtual reality, augmented reality, embodied agents, and/or therapeutic robots |
| Comparators | With or without a usual care group^c, comparison group, or an attention control group | None |
| Outcome(s) | Main outcomes: Changes in self-reported and/or objectively measured PA, sedentary behavior, diet, and/or body weight. Secondary outcomes: Feasibility, acceptability, safety (e.g., adverse events, injury), and/or user satisfaction of chatbots if available | Studies that report only chatbot infrastructure or |
| Study designs/types | Randomized controlled trials or quasi-experimental studies | Qualitative studies, case-control studies, cross-sectional studies, or cohort studies |
AI artificial intelligence, PA physical activity
a Users can only respond by selecting predefined conversational lines
b Users can respond freely by inputting natural conversational lines
c Usual care group refers to the research group where individuals receive routine care from health care providers
Fig. 1 Flow diagram of the article screening process
Summary of study and sample characteristics
| No. | First author/published year/Country | Primary aim(s) | Study design/# of groups | Theoretical framework | Sample: total size (N)/attrition (%) | Sample: mean age (SD) years and/or range | Sample: females/women % | Sample: race/ethnicity % | Sample: education/income |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Kramer J^a/2020/Switzerland | To evaluate the effects of the Ally chatbot that combines financial incentives, weekly planning, and daily self-monitoring prompts on reaching daily step goals. | Optimization randomized trial/1 group micro-randomized to incentive (cash vs. charity vs. no incentive) × planning (action vs. coping vs. no planning) × self-monitoring prompt (prompt vs. no prompt) groups | Health Action Process Approach | | 41.7 (13.5)/NR | 57.7 | NR | 59.9% with university degree |
| 2 | Kunzler F^b/2019/Switzerland | To explore the factors affecting users’ receptivity towards Just-In-Time Adaptive Interventions (JITAIs) delivered via the Ally chatbot. | Randomized controlled trial/3 groups (cash bonus vs. charity donation vs. control) | NR | | 40.0 (13.7)/NR | 63.0 | NR | NR |
| 3 | Piao M/2020/South Korea | To assess the efficacy of the Healthy Lifestyle Coaching Chatbot intervention presented via a messenger app aimed at stair-climbing habit formation for office workers. | Randomized controlled trial/2 groups (intervention vs. control) | Habit Formation Model | | NR/20-59 | 56.7 | NR | NR |
| 4 | Carfora V/2019/Italy | To test a chatbot that delivers daily messaging intervention aimed at promoting the reduction of red and processed meat consumption (RPMC). | Randomized controlled trial/3 groups (informational vs. emotional vs. control) | NR | | 20 (2.0)/NR | 75.6 | NR | 100% undergraduate students |
| 5 | Maher CA/2020/Australia | To test the feasibility (recruitment and retention) and preliminary efficacy of physical activity and Mediterranean-style dietary intervention (MedLiPal) delivered via an artificially intelligent virtual health coach. | Quasi-experiment/1 group | NR | | 56.2 (8.0)/45-75 | 67.7 | NR | NR |
| 6 | Fadhil A/2019/NR | To present the design and validation of CoachAI, a conversational agent-assisted health coaching system on physical activity, healthy diet, and stress coping. | Quasi-experiment/1 group | Health Action Process Approach, Technology Acceptance Model, AttrakDiff Model | | 28.5 (9.4)/19-53 | 42.1 | NR | Most were university students or had a graduate degree |
| 7 | Stephens TN/2019/U.S. | To assess the feasibility of integrating the Tess chatbot in behavioral counseling of adolescent patients coping with weight management and prediabetes symptoms to promote treatment adherence, behavior change, and overall wellness. | Quasi-experiment/1 group | Cognitive Behavioral Therapy, Emotionally Focused Therapy, Behavioral Activation, Motivational Interviewing | | 15.2 (NR)/9.78-18.54 | 57.0 | Hispanic (43%), White (39%), Black (9%), Asian (9%) | NR |
| 8 | Casas J/2018/Switzerland | To evaluate the effects of a conversational assistant designed to monitor and coach participants to achieve specific goals regarding their diet. | Quasi-experiment/1 group | NR | | NR | NR | NR | NR |
| 9 | Kocielnik R/2018/U.S. | To develop and examine the feasibility of a mobile conversational system, Reflection Companion, to engage users in reflection on physical activity through dialogues. | Quasi-experiment/1 group | Structured Reflection Model | | 36.5 (11.2)/21-60 | 87.9 | NR | 55% college degree or enrolled in college, 27% graduate degree |
Studies a and b employed the same chatbot named Ally
NR not reported
Fig. 2 Summary of chatbot characteristics and intervention outcomes
Summary of chatbot and intervention characteristics
| No. | First author/published year/Country | Intervention duration and frequency^1 | Chatbot only or multiple components intervention^2 | Overall technological features of the intervention^3 | Chatbot identity (name, gender)^4 | Media^5 | User input^6 | Chatbot initiation/User initiation^7 | Relational capacity^8 | Persuasion capacity^9 | Safety^10 | Ethics discussion^11 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Kramer J^a/2020/Switzerland | 6 weeks and daily | Chatbot only | The Assistant to Lift your Level of activitY (Ally) app was developed using the MobileCoach platform, on both Android and iOS systems | Name: Ally; Gender: Woman | Texts, graphs | Constrained | Daily and weekly messages delivered at a random time between 10 AM and 6 PM/NR | Using personalized greetings, initiating daily conversations, and occasionally sending unrelated messages to keep users interested | Setting personalized goals, using action-planning, coping-planning, and self-monitoring prompts | NR | NR |
| 2 | Kunzler F^b/2019/Switzerland | 6 weeks and daily | Chatbot only | Ally app (available on both iOS and Android) is based on the MobileCoach platform. Physical activity was measured by subscribing to CoreMotion Activity Manager on iOS and Google Activity Recognition API on Android | Name: Ally; Gender: Woman | Texts, graphs | Constrained | Daily messages delivered 3 times a day (between 8 and 10 AM, 10-6 PM, or 8 PM) and weekly messages delivered at random times/NR | Initiating daily conversations with a greeting | Using persuasive prompts such as goal setting, self-monitoring, goal achievement, and weekly planning | NR | NR |
| 3 | Piao M/2020/South Korea | 12 weeks and daily | Chatbot only | The Healthy Lifestyle Coaching Chatbot was developed using the Watson Conversational tool (IBM Corp) and was linked to the KakaoTalk Smart Chat application programming interface (API) through the RESTful API. It was deployed through the KakaoTalk messenger app | Name: Chatbot; Gender: NR | Texts, images | Unconstrained | Daily message delivered at participants’ specified time/On-demand | Sending personalized goal-related messages based on participants’ daily routines and sending compliment messages that provide positive feedback | Setting personalized goals; providing extrinsic (e.g., financial incentives) and intrinsic (e.g., pleasure, satisfaction) rewards | NR | NR |
| 4 | Carfora V/2019/Italy | 2 weeks and daily | Chatbot only | Chatbot was deployed through Facebook Messenger | Name: NR; Gender: NR | Texts | Constrained | Daily messages delivered at 7:30 AM/NR | NR | Informing participants about the health and environmental impact of excessive RPMC; using persuasive message appeals (i.e., emotional appeals such as regret) | NR | NR |
| 5 | Maher CA/2020/Australia | 12 weeks and weekly | Multicomponent | The Paola chatbot was developed using the IBM Watson Virtual Assistant AI software and was hosted on the cloud-based instant messaging platform Slack. The program also used the MedLiPal website and the Garmin Vivofit 4 physical activity monitor | Name: Paola; Gender: Woman | Texts | Unconstrained | Weekly check-in messages/On-demand | Referring to users by their first name and responding to questions at any time | Assisting participants to set a personal daily step count goal based on age-based normative values + 2000 steps. This daily step goal was revisited and edited at each weekly check-in | NR | NR |
| 6 | Fadhil A/2019/NR | 3 weeks and daily | Chatbot only | Chatbot used the coaching portal combined with the dialogue engine, which consists of a Rails state machine, a user clustering model, and Fitbit wearable data. Chatbot was deployed through Telegram Messenger | Name: CoachAI; Gender: NR | Texts, images | Constrained | Messages delivered recurrently (e.g., daily, weekly, on weekends, or weekdays) or at a scheduled date and time/NR | Using a greeting and some preliminary evaluation chat at the beginning of the interaction and assessing users’ comfort in discussing personal information with the agent | Providing motivational messages consisting of positive reinforcement and feedback on the user’s overall adherence for the week to increase adherence | NR | NR |
| 7 | Stephens TN/2019/U.S. | 10–12 weeks; frequency NR | Chatbot only | The Tess chatbot was deployed through multiple channels (i.e., SMS text message, Slack, WhatsApp, or Facebook Messenger) and was also integrated with Google Home and Amazon Alexa | Name: Tess; Gender: NR | Texts, voice | Unconstrained | NR/On-demand | Mimicking empathy and compassion, adjusting the conversational style or modality to address each client’s needs, and responding based on the individual’s reported emotion or concern | Specific goals and targeted behaviors were entered into the system to offer individualized conversations. Delivered customized integrative support, psychoeducation, and interventions | Data processing and storage are on secure servers that satisfy Health Insurance Portability and Accountability Act (HIPAA) regulations and within the country of residence for all participants given access | NR |
| 8 | Casas J/2018/Switzerland | 7 days and daily | Chatbot only | The Rupert chatbot was developed using the Chatfuel service and deployed through Facebook Messenger | Name: Rupert le nutritionniste; Gender: NR | Texts, graphs, images, videos | Constrained | Daily messages/NR | Showing empathy; being friendly, positive, and nonjudgmental; speaking in users’ native language (i.e., French) | Asking participants to choose their goals, monitoring food intake, answering questions, and giving recommendations | NR | NR |
| 9 | Kocielnik R/2018/U.S. | 2 weeks and daily | Chatbot only | Twilio API was used to communicate with users via mobile phones through SMS/MMS. Fitbit API was queried for user activity data to generate activity graphs. LUIS API offered automated recognition of free-text user responses | Name: Reflection Companion; Gender: NR | Texts, graphs | Unconstrained | Daily messages/NR | Personalizing the experience by introducing questions that referenced users’ own behavior change goals | Delivering mini-dialogues with a graph showing the user’s physical activity metrics | NR | NR |
Studies a and b employed the same chatbot named Ally.
NR not reported.
1 Intervention duration is how long the intervention lasted, and frequency is how often the program intervened with the participants
2 Multicomponent means the intervention had multiple intervention components (e.g., in-person and using chatbots); chatbot only means the intervention was solely delivered by the chatbot
3 Overall technological features document the technological infrastructure, platform, and features of the intervention
4 Chatbot identity documents identity cues the chatbot is designed with. The cues can include name, gender, age, etc.
5 Media documents the types of media that the chatbot can use to deliver information
6 User input documents the capacity in which participants can interact with the chatbot. Constrained means users can only select pre-programmed responses in the chat; unconstrained means users can freely type or speak to the chatbot
7 Chatbot/User initiation indicates whether and how often the chatbot or the user initiated the conversation
8 Relational capacity documents conversation strategies the chatbot can use to establish, maintain, or enhance social relationships with the participants (e.g., greetings)
9 Persuasion capacity documents conversation strategies the chatbot can use to change participants’ behaviors and behavioral determinants (e.g., knowledge, attitudes, norm perceptions, efficacy, etc.)
10 Safety documents strategies the chatbot is designed with to ensure the safety of the participants
11 Ethics discussion documents any ethical principles or standards the chatbot is designed with. Key ethical considerations include having transparency and user trust, protecting user privacy, and minimizing biases
Summary of outcome measures and results
| No. | First author/published year/Country | Main outcome: Physical activity (PA) | Main outcome: Diet | Main outcome: Weight | Secondary outcome: Engagement | Secondary outcome: Acceptability and satisfaction | Secondary outcome: Adverse event | Secondary outcome: Other outcomes |
|---|---|---|---|---|---|---|---|---|
| 1 | Kramer J^a/2020/Switzerland | OM (Daily step count obtained from smartphone) | NR | NR | OM (Rate of individuals who stopped using the app) | NR | NR | NR |
| | Results | Daily cash incentives increased step-goal achievement by 8.1% (CI: [2.1, 14.1]) and, only in the no-incentive control group, action planning increased step-goal achievement by 5.8% (CI: [1.2, 10.4]). | NR | NR | 30% of participants stopped using the app over the course of the study. | NR | NR | NR |
| 2 | Kunzler F^b/2019/Switzerland | OM (Daily step count obtained from smartphone) | NR | NR | OM (Just-in-time-response rate, overall response rate, conversation engagement, response delay obtained from the chatbot) | NR | NR | NR |
| | Results | Physical activity goal completion rate was correlated with overall response rate ( | NR | NR | Intrinsic factors: Device type, age, and personality traits had a significant effect on the just-in-time response rate, conversation rate, and total response rate. Extrinsic factors: Time and day of the delivery, phone battery, device interaction, and location had significant effects on just-in-time response, conversation engagement, and response delay. | NR | NR | NR |
| 3 | Piao M/2020/South Korea | SR (Self-Report Habit Index) | NR | NR | NR | NR | NR | NR |
| | Results | After 4 weeks of intervention without providing the intrinsic rewards in the control group, the change in SRHI scores was 13.54 (SD ± 14.99) in the intervention group and 6.42 (SD ± 9.42) in the control group ( When all rewards were given to both groups, from the fifth to twelfth week, the change in SRHI scores of the intervention and control groups was comparable at 12.08 (SD ± 10.87) and 15.88 (SD ± 13.29), respectively ( The level of physical activity showed a significant difference between the groups after 12 weeks of intervention ( | NR | NR | NR | NR | NR | NR |
| 4 | Carfora V/2019/Italy | NR | SR (Self-reported RPMC; intention; attitude; regret on RPMC) | NR | NR | NR | NR | NR |
| | Results | NR | The emotional condition had stronger anticipated regret and higher intention to reduce RPMC, as compared to the control condition ( | NR | NR | NR | NR | NR |
| 5 | Maher CA/2020/Australia | SR (Active Australia Survey) | SR (14-item Australian Mediterranean Diet Adherence) | OM (Seca 703) | OM (Number of weekly check-ins obtained from the chatbot) | NR | NR | OM (Feasibility of subject enrollment) |
| | Results | Increased MVPA 109.8 (95% CI 1.9 to 217.7, | Increased 5.7 (95% CI 4.2 to 7.3, | Lost 1.3 (95% CI − 25 to − 0.7, | Mean weekly chatbot interaction 6.9 times out of 11 possible interactions. | NR | No adverse events reported | Enrolled 31 out of 99 screened participants in the 6-week enrollment period |
| 6 | Fadhil A/2019/NR | SR (Physical activity intention) | SR (Healthy diet intention) | NR | NR | SR (TAM questionnaire) | NR | SR (AttrakDiff questionnaire) |
| | Results | Results showed no difference between the three weeks; the scores remained unchanged for the physical activity. | Results showed no difference between the three weeks; the scores remained unchanged for the healthy diet. | NR | NR | The scales “ease of use,” “attitude,” and “intention” towards using the system were significantly higher than the middle score (respectively: | NR | Average scores were statistically higher than 4 for each dimension: pragmatic ( |
| 7 | Stephens TN/2019/U.S. | SR (Target goal progress) | NR | NR | OM (Duration of conversation, Quantity of messages exchanged, Number of hours support exchanged, Percentage of exchanges outside of typical office hours obtained from the chatbot, Ratio of chatbot-initiated vs. patient-initiated conversations obtained from the chatbot) | SR (Helpfulness) | NR | NR |
| | Results | Adolescent patients reported experiencing positive progress toward their goals 81% of the time. | NR | NR | A total of 4123 messages were exchanged between participants and Tess. The average duration of conversations between Tess and patients was approximately 12.5 min ( A majority of the conversations were Tess initiated (73.6%) compared to patient initiated. | Patients indicated that Tess was helpful 96% of the time. | NR | NR |
| 8 | Casas J/2018/Switzerland | NR | SR (Meal consumption) | NR | NR | SR (Chatbot Effectiveness) | NR | NR |
| | Results | NR | Only 11% of participants achieved their goals. Consumption improved in 65% of cases, remained stable in 12%, and worsened in the remaining 24%. | NR | NR | 82% of participants said that Rupert helped them think about and be aware of their consumption. 86% reported answering honestly to the daily requests of the chatbot. 70% thought the chatbot intervention was efficient. | NR | NR |
| 9 | Kocielnik R/2018/U.S. | SR (Habitual Action; Understanding; Reflection; Critical reflection adapted from Kember et al. 2000) OM (Step count obtained from fitness trackers) SR (Physical activity awareness) | NR | NR | OM (Participant interactions with the system: 1) number of dialogues responded to, 2) the time until a response was made, 3) the length and content of responses obtained from the chatbot) SR (Willingness to use the system for additional 2 weeks without compensation) | NR | NR | SR (Mindfulness) |
| | Results | Significant difference in Habitual Action (HA) from pre (M = 3.16, SD = 1.06) to post (M = 3.53, SD = 0.89) study measurements; a weakly significant increase in Understanding (U) from pre (M = 3.60, SD = 0.98) to post (M = 3.92, SD = 0.84). Step count difference was not significant. Physical activity awareness difference was not significant. | NR | NR | Participants responded to 96% of all initial questions and to 90% of the follow-up questions sent by the system. 16 out of the 33 participants elected to continue using the system for 2 additional weeks without reward. | NR | NR | No significant changes were observed between pre- and post-measurements |
Studies a and b employed the same chatbot named Ally
PA physical activity, SR self-report, OM objective measure, MVPA moderate to vigorous physical activity, RPMC red and processed meat consumption, NR not reported