Stephanie Tulk Jesso, Aisling Kelliher, Harsh Sanghavi, Thomas Martin, Sarah Henrickson Parker.
Abstract
The application of machine learning (ML) and artificial intelligence (AI) in healthcare domains has received much attention in recent years, yet significant questions remain about how these new tools integrate into frontline user workflow and how their design will impact implementation. Lack of acceptance among clinicians is a major barrier to the translation of healthcare innovations into clinical practice. In this systematic review, we examine when and how clinicians are consulted about their needs and desires for clinical AI tools. Forty-five articles met criteria for inclusion, of which 24 were considered design studies. The design studies used a variety of methods to solicit and gather user feedback, including interviews, surveys, and user evaluations. Our findings show that tool designers consult clinicians at various but inconsistent points during the design process, most typically at later stages in the design cycle (82%, 19/24 design studies). We also observed a smaller number of studies that adopted a human-centered approach in which clinician input was solicited throughout the design process (22%, 5/24). A third (15/45) of all studies reported on clinician trust in clinical AI algorithms and tools. The surveyed articles did not universally report validation against the "gold standard" of clinical expertise or provide detailed descriptions of the algorithms or computational methods used in their work. To realize the full potential of AI tools within healthcare settings, our review suggests there are opportunities to more thoroughly integrate frontline users' needs and feedback in the design process.
Keywords: artificial intelligence (AI); clinical AI; clinician; evaluation; healthcare; human-centered design; machine learning
Year: 2022 PMID: 35465567 PMCID: PMC9022040 DOI: 10.3389/fpsyg.2022.830345
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
FIGURE 1. Systematic review process and numbers of articles included and excluded. Articles were first identified through online databases, then pre-processed using Python scripts. Next, all articles were manually reviewed by title and abstract, and full articles were then reviewed to decide which to include.
Three categories of search criteria used to identify articles.
| Clinical domain terms | “Decision support,” “healthcare,” “health care,” “physician,” “patient,” “clinic*” (e.g., clinical, clinician), “nurs*” (e.g., nurse, nursing), “diagnosis,” “medical records” (e.g., electronic medical records), or “health records” (e.g., electronic health records) |
| AI terms | “AI,” “ML,” “machine learning,” “deep learning,” “intelligen*” (e.g., intelligent sensors), “ambient” (e.g., ambient awareness or ambient intelligence), “CNN,” “RNN,” “neural network,” “convolutional,” “recurrent,” “Markov” (e.g., Hidden Markov Model), “reinforcement learning,” “SVM,” “support vector” |
| User feedback terms | “UX,” “usability,” “user” (e.g., user test, user centered design), “adoption” (e.g., technology adoption), “human centered,” “HCI,” “human computer” (e.g., human computer interaction), “human AI” (e.g., human AI interaction) |
The asterisk (“*”) denotes a truncation that includes variant endings of related words (e.g., “nurs*” can flag results including “nurse”, “nurses”, and “nursing”).
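The truncation rule and the three term categories can be sketched in Python as follows. This is a minimal illustration, not the authors' actual search or pre-processing code: the term lists are only a subset of those in the table, the function names are hypothetical, and the requirement that a record match all three categories is an assumption made for the example.

```python
import re

# Subset of the search terms listed in the table above (the full lists are longer).
SEARCH_TERMS = {
    "clinical": ["decision support", "healthcare", "clinic*", "nurs*", "diagnosis"],
    "ai": ["machine learning", "deep learning", "intelligen*", "neural network"],
    "user_feedback": ["usability", "user", "adoption", "human centered"],
}


def term_to_regex(term):
    """Compile a search term; a trailing '*' matches any word ending."""
    truncated = term.endswith("*")
    stem = re.escape(term.rstrip("*"))
    # Truncated terms accept any continuation of the word; others must end at a word boundary.
    suffix = r"\w*" if truncated else r"\b"
    return re.compile(r"\b" + stem + suffix, re.IGNORECASE)


def matched_categories(text):
    """Return the names of the term categories with at least one hit in the text."""
    return {
        category
        for category, terms in SEARCH_TERMS.items()
        if any(term_to_regex(term).search(text) for term in terms)
    }


def passes_search(text):
    # Assumption for this sketch: a record must match every category.
    return matched_categories(text) == set(SEARCH_TERMS)
```

For example, `passes_search("Usability of a machine learning tool for nurses")` evaluates to `True` under these assumptions, because “nurs*”, “machine learning”, and “usability” each hit one category.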
Evaluation criteria used for inclusion and exclusion.
| NA = no AI | Included studies needed to include some type of AI/ML, or the authors themselves needed to explicitly relate their research to AI with or without the addition of algorithms. The assignment of the code “NA” meant that there was no machine learning or artificial intelligence involved in the study, nor did the authors claim that the study was related to AI. For instance, while a decision tree algorithm and predictive analytics are not technically AI, if the article reports any algorithm to be AI and asks clinicians about AI tools, we considered this to be AI. Additionally, hypothetical AI/ML technologies were not excluded |
| NC = not clinical | Included articles needed to focus on challenges and work within a clinical domain. The assignment of the code “NC” meant that the article was not focused on the support of clinicians in clinical contexts. While diagnostic tests and tools were relevant, research focused on the work of lab technicians, speeding up lab results, or aiding in the process of quality improvement was excluded. Community/public health research efforts were also excluded |
| NU = no user feedback | Included articles needed to include some form of explicit feedback from intended clinical end users regarding a proposed or existing tool, or about AI/ML in general. The assignment of the code “NU” meant that the article did not describe any attempt to observe what clinicians thought about AI/ML and/or a specific clinical AI tool. If the users’ opinions are considered in any stage, the article could be included (for instance, interviews or committees of users to determine what users want prior to creating the system, or even informal feedback from users at the end of an evaluation). Efforts that used “user tests” solely for the purpose of validating system performance and which did not include any report of user opinions were excluded |
| PU = a patient is the user | Included articles needed to focus on clinicians as end users. The assignment of the code “PU” meant that patients were the intended users and clinicians were not considered to be primary users of any component of the tool/system. Articles that did include clinicians and patients as users of different components of the design were not excluded |
| Ineligible | Articles that were considered ineligible included research that was outside of the realm of human health (e.g., zoology, data security) or was related to public health research (e.g., tracking the spread of HIV, measuring depression and anxiety on social media), articles that described the technical details of laboratory tests (e.g., new biochemical assays), articles that did not present primary research (e.g., published study protocols, case studies, review articles, editorials, or position pieces), and papers that were not peer-reviewed (e.g., published theses or dissertations) |
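The coding scheme above can be read as a sequence of checks, each producing one exclusion code. The sketch below is only an illustration of that reading: the `Article` fields are hypothetical reviewer judgments, and the order in which the criteria are checked is an assumption, since the table does not specify one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Article:
    # Hypothetical reviewer judgments, one flag per criterion in the table above.
    eligible: bool              # human-health, primary, peer-reviewed research
    involves_ai: bool           # uses AI/ML, or the authors relate the work to AI
    clinical_focus: bool        # supports clinicians in a clinical context
    has_user_feedback: bool     # reports explicit feedback from clinical end users
    clinicians_are_users: bool  # clinicians use at least one component of the tool


def exclusion_code(article: Article) -> Optional[str]:
    """Return the first applicable exclusion code, or None if the article is included."""
    # Assumed order of checks; the source table does not prescribe one.
    checks = [
        (article.eligible, "Ineligible"),
        (article.involves_ai, "NA"),
        (article.clinical_focus, "NC"),
        (article.has_user_feedback, "NU"),
        (article.clinicians_are_users, "PU"),
    ]
    for passed, code in checks:
        if not passed:
            return code
    return None
```

An article that satisfies every criterion returns `None` (included); one that reports only system performance with no user opinions would, under these assumptions, be coded “NU”.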
Product matrix of clinical AI tool literature reviewed.
| Column group | Columns |
| References | References |
| Type of study | Design study; 3rd party study; Preliminary design study; Empirical research |
| Real tool? | Real tool; Hypothetical tool |
| Method type | Qualitative; Quantitative |
| Method(s) | Co-creation; Context analysis; Ethnography/observations; Focus groups; Interviews; Iterative design; Survey; User evaluations; Validation/performance; Other |
| Participants | Total participants |
| Metric(s) | Preferences; Performance; Errors; Empirical observation; Trust; Opinions; Other |
| Stakeholders | Clinicians; Patients; Other |
| Target users | Nurses; Physicians; Clinicians (in general); Specific clinical specialties; Patients; Other |
| User consult time | Beginning; Before or during prototype; Middle or iterative design; At the end of research effort; Not a design study |
| Algorithm(s) | Deep learning; Regression; Described specific algorithm; Generic description; Named product; Hypothetical (research) |
| Validation | Validated here or elsewhere; Accuracy only; None, or non-design |
| Trust? | Yes; No |
| Category | Total participants, one value per study row (per-study check marks not shown) |
| Diagnosis | 6; 18; Unknown; 22; 8; 15; 95; 302; 9; 22; 24; 30; 2322; 720; 21; 617; 21; 1020 |
| Treatment planning | 6; 9; 16; 43; 151; 95; 5; 51; 22; 10; 24; 17; 720 |
| Risk assessment | 9; 20; 15; Unknown; 2; 47; 10; 2322; 15 |
| Ambient intelligence and tele monitoring | 17; 16; 8; 2322; 20; 270; 515 |
| NLP | 12; 15; 9; 20; 1731 |
| Administrative tasks | 24; 12; 19; 5; 720 |
The gray rows designate “design” studies. An asterisk (*) indicates that the paper appears in more than one category.
FIGURE 2. Methods used within included articles and counts.
FIGURE 3. Total number of participants included in studies.