Deborah Plana, Dennis L Shung, Alyssa A Grimshaw, Anurag Saraf, Joseph J Y Sung, Benjamin H Kann.
Abstract
Importance: Despite the potential of machine learning to improve multiple aspects of patient care, barriers to clinical adoption remain. Randomized clinical trials (RCTs) are often a prerequisite to large-scale clinical adoption of an intervention, and important questions remain regarding how machine learning interventions are being incorporated into clinical trials in health care.
Objective: To systematically examine the design, reporting standards, risk of bias, and inclusivity of RCTs for medical machine learning interventions.
Evidence Review: In this systematic review, the Cochrane Library, Google Scholar, Ovid Embase, Ovid MEDLINE, PubMed, Scopus, and Web of Science Core Collection online databases were searched and citation chasing was performed to find relevant articles published from the inception of each database to October 15, 2021. Search terms for machine learning, clinical decision-making, and RCTs were used. Exclusion criteria included implementation of a non-RCT design, absence of original data, and evaluation of nonclinical interventions. Data were extracted from published articles. Trial characteristics, including primary intervention, demographics, adherence to the CONSORT-AI reporting guideline, and Cochrane risk of bias, were analyzed.
Findings: Literature search yielded 19 737 articles, of which 41 RCTs involved a median of 294 participants (range, 17-2488 participants). A total of 16 RCTs (39%) were published in 2021, 21 (51%) were conducted at single sites, and 15 (37%) involved endoscopy. No trials adhered to all CONSORT-AI standards. Common reasons for nonadherence were not assessing poor-quality or unavailable input data (38 trials [93%]), not analyzing performance errors (38 [93%]), and not including a statement regarding code or algorithm availability (37 [90%]). Overall risk of bias was high in 7 trials (17%). Of 11 trials (27%) that reported race and ethnicity data, the median proportion of participants from underrepresented minority groups was 21% (range, 0%-51%).
Conclusions and Relevance: This systematic review found that despite the large number of medical machine learning-based algorithms in development, few RCTs for these technologies have been conducted. Among published RCTs, there was high variability in adherence to reporting standards and risk of bias and a lack of participants from underrepresented minority groups. These findings merit attention and should be considered in future RCT design and reporting.
Year: 2022 PMID: 36173632 PMCID: PMC9523495 DOI: 10.1001/jamanetworkopen.2022.33946
Source DB: PubMed Journal: JAMA Netw Open ISSN: 2574-3805
Figure 1. Screening and Selection of Randomized Clinical Trials
AI indicates artificial intelligence.
a: Databases and registers included Cochrane Library, Google Scholar, Ovid Embase, Ovid MEDLINE, PubMed, Scopus, and Web of Science Core Collection.
Summary of Randomized Clinical Trials Included in the Systematic Review
| Source | Study location | Study aim | Female, % | Race and ethnicity (%) | Medical specialty | Disease |
|---|---|---|---|---|---|---|
| Pavel et al | Ireland, the Netherlands, Sweden, UK | Detect neonatal seizures | 40 | NR | Neonatology | Neonatal seizures |
| Wang et al | China | Detect colorectal adenomas | 52 | NR | Gastroenterology | Colon polyp and adenoma |
| Caparros-Gonzalez et al | Spain | Assess premature infant physiological response | 41 | NR | Neonatology | None |
| Nimri et al | Europe (multiple countries), Israel, US | Optimize insulin dose | 52 | NR | Endocrinology | Type 1 diabetes |
| Vennalaganti et al | US | Detect Barrett esophagus–associated neoplasia | 24 | African American (2), White (95), other (3) | Gastroenterology | Barrett esophagus–associated neoplasia |
| Voss et al | US | Improve socialization in children with autism spectrum disorder | 11 | Black (3), East Asian (24), Hispanic (23), Middle Eastern (1), South Asian (8), White/European American (35), unknown (17) | Pediatrics | Autism spectrum disorders |
| Manz et al | US | Increase serious illness conversations among patients with cancer | 54 | Non-Hispanic Black (20), non-Hispanic White (70), other (10) | Oncology | End of life |
| Persell et al | US | Improve blood pressure control in outpatients with hypertension | 61 | Black (35), Hispanic (8), White (52), other (9) | Primary care | Hypertension |
| Repici et al | Italy | Detect colorectal adenomas | 51 | NR | Gastroenterology | Colorectal neoplasia |
| Wijnberge et al | Netherlands | Detect intraoperative hypotension | 43 | NR | Cardiac surgery | Intraoperative hypotension |
| Shimabukuro et al | US | Predict outcomes in patients with sepsis | 54 | African American (11), Asian (16), Hispanic (21), White (47), other (5) | Intensive care | Sepsis |
| Wang et al | China | Detect colorectal adenomas | 49 | NR | Gastroenterology | Colon polyp and adenoma |
| Gong et al | China | Detect colorectal adenomas | 51 | NR | Gastroenterology | Colorectal adenomas |
| Lin et al | China | Diagnose childhood cataracts | 55 | NR | Ophthalmology | Childhood cataracts |
| Rabbi et al | US | Facilitate weight loss through automated personalized feedback for physical activity and diet | 47 | NR | Primary care | None |
| Auloge et al | France | Assess feasibility of AI/AR tool for vertebroplasty | 65 | NR | Orthopedics | Vertebral fracture |
| Avari et al | Spain, UK | Decrease hypoglycemia episodes with personalized bolus advice for people with type 1 diabetes | 52 | NR | Endocrinology | Type 1 diabetes |
| Wang et al | China | Detect colorectal adenomas | 52 | NR | Gastroenterology | Colon polyp and adenoma |
| Forman et al | US | Facilitate weight loss by predicting and preventing dietary lapses | 85 | Hispanic or non-White (30), non-Hispanic White (70) | Primary care | Obesity |
| Wu et al | China | Reduce rate of blind spots during EGD | 52 | NR | Gastroenterology | No specific disease |
| El Solh et al | US | Predict optimal CPAP using neural network to reduce titration failure | 57 | NR | Pulmonology | Obstructive sleep apnea |
| Luštrek et al | Belgium, Italy | Assess self-management of congestive heart failure using app and wristband | NR | NR | Cardiology | Congestive heart failure |
| Chen and Gao | China | Assess AI-based echocardiography for diagnosis of acute heart failure | 36 | NR | Cardiology | Acute left heart failure |
| Seol et al | US | Assess management of childhood asthma | 43 | White (72) | Pediatrics | Asthma |
| Repici et al | Italy, Switzerland | Investigate colon adenoma detection of nonexpert endoscopists | 50 | NR | Gastroenterology | Colorectal cancer screening |
| Kamba et al | Japan | Decrease colon adenoma miss rate | 23 | NR | Gastroenterology | Colorectal cancer screening |
| Liu et al | China | Increase polyp and adenoma detection with CADe | 53 | NR | Gastroenterology | Colorectal cancer screening |
| Blomberg et al | Denmark | Assess emergency-dispatched recognition of cardiac arrest during call | 36 | NR | Emergency medicine | Dispatcher assessment |
| Xu et al | China | Assess polyp detection of AI-assisted colonoscopy | 49 | NR | Gastroenterology | Colorectal cancer screening |
| Jayakumar et al | US | Evaluate effect of AI-enabled patient decision aid on knee osteoarthritis management | 64 | Asian (12), Black or African American (17), Hispanic or Latino (34), White (36) | Orthopedics | Osteoarthritis |
| Wu et al | China | Identify blind spots in EGD | 52 | NR | Gastroenterology | Early gastric cancer |
| Sandal et al | Denmark, Norway | Improve quality of life in patients with lower back pain using app | 55 | NR | Primary care | Lower back pain |
| Noor et al | India | Identify follicles in patients with ovarian stimulation | 100 | NR | Gynecology | Infertility |
| Yao et al | US | Identify patients with low ejection fraction from ECG data | 54 | Asian (1), Black or African American (2), White (94), other (2), missing (0.5) | Cardiology | Heart failure |
| Wu et al | China | Identify gastric neoplasms on EGD | 54 | NR | Gastroenterology | Early gastric neoplasia |
| Strömblad et al | US | Predict surgical case durations | 83 | Asian (8), Black (8), White (77), other (4), unknown (4) | Surgery | Solid tumor surgical procedures for gynecological and colorectal cancers |
| Eng et al | US | Assess skeletal age | 46 | NR | Radiology | Skeletal development |
| Glissen Brown et al | US | Reduce adenoma miss rate using computer-aided polyp detection | 45 | African American (21), White (68) | Gastroenterology | Colorectal cancer screening |
| Meijer et al | Netherlands | Reduce pain after surgical procedures | 56 | NR | Anesthesiology | Postoperative pain |
| Liu et al | China | Improve detection rate of polyps and adenomas | 46 | NR | Gastroenterology | Colon polyp and adenoma |
| Su et al | China | Improve detection rate of polyps and adenomas | 51 | NR | Gastroenterology | Colon polyp and adenoma |
Abbreviations: AI, artificial intelligence; AR, augmented reality; CADe, computer-aided detection; CPAP, continuous positive airway pressure; ECG, electrocardiogram; EGD, esophagogastroduodenoscopy; NR, not reported.
Figure 2. Characteristics of Randomized Clinical Trials
A total of 41 randomized clinical trials were included in the analysis.[26-66] Individuals from underrepresented minority groups were participants in 11 clinical trials in which information on participant race and/or ethnicity was reported.[30,31,32,33,36,44,49,55,59,61,63] B, Data for 2021 are from January through October 15. D, The other medical specialty category includes anesthesiology, cardiac surgery, emergency medicine, general surgery, gynecology, intensive care, ophthalmology, pulmonology, and radiology.
Figure 3. Adherence to Consolidated Standards of Reporting Trials–Artificial Intelligence (CONSORT-AI) Extension Guideline
A total of 41 randomized clinical trials were included in the analysis.[26-66] The CONSORT-AI extension is an internationally developed consensus document reflecting recommended clinical trial reporting characteristics to ensure transparency and reproducibility.[24]
Figure 4. Risk of Bias in Randomized Clinical Trials
A total of 41 randomized clinical trials were included in the analysis.[26-66] Risk of bias was assessed using the revised Cochrane Risk of Bias, version 2 tool for randomized clinical trials.[23]