Ryan J Crowley (1,2), Yuan Jin Tan (1,3), John P A Ioannidis (1,3,4,5,6)
1. Meta-Research Innovation Center at Stanford, Stanford University, Stanford, California, USA
2. Department of Bioengineering, Stanford School of Engineering, Stanford University, Stanford, California, USA
3. Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, California, USA
4. Stanford Prevention Research Center, Department of Medicine, Stanford Medicine, Stanford University, Stanford, California, USA
5. Department of Biomedical Data Science, Stanford Medicine, Stanford University, Stanford, California, USA
6. Department of Statistics, School of Humanities and Sciences, Stanford University, Stanford, California, USA
Abstract
OBJECTIVE: Machine learning (ML) diagnostic tools have significant potential to improve health care. However, methodological pitfalls may affect the diagnostic test accuracy studies used to appraise such tools. We aimed to evaluate the prevalence and reporting of design characteristics within this literature and to assess empirically whether design features are associated with different estimates of diagnostic accuracy.
MATERIALS AND METHODS: We systematically retrieved 2 × 2 tables (n = 281) describing the performance of ML diagnostic tools, derived from 114 publications in 38 meta-analyses, from PubMed. Extracted data included test performance, sample sizes, and design features. A mixed-effects metaregression was run to quantify the association between design features and diagnostic accuracy.
RESULTS: Participant ethnicity and blinding in test interpretation were unreported in 90% and 60% of studies, respectively. Reporting was occasionally lacking even for rudimentary characteristics such as study design (28% unreported). Internal validation without appropriate safeguards was used in 44% of studies. Several design features were associated with larger estimates of accuracy, including an unreported study design (relative diagnostic odds ratio [RDOR], 2.11; 95% confidence interval [CI], 1.43-3.1), a case-control design (RDOR, 1.27; 95% CI, 0.97-1.66), and recruitment of participants for the index test (RDOR, 1.67; 95% CI, 1.08-2.59).
DISCUSSION: Experimental details were significantly underreported. Study design features may affect estimates of diagnostic performance in the ML diagnostic test accuracy literature.
CONCLUSIONS: This study identifies pitfalls that threaten the validity, generalizability, and clinical value of ML diagnostic tools and provides recommendations for improvement.
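For readers unfamiliar with the effect measures above, the sketch below (not the authors' code) illustrates how a diagnostic odds ratio (DOR) is computed from a 2 × 2 table and how a metaregression coefficient on log(DOR) maps to the relative DOR (RDOR). The example table values, the function name, and the 0.5 continuity correction for zero cells are illustrative assumptions; the correction is a common convention in diagnostic meta-analysis.

```python
import math

def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR = (TP * TN) / (FP * FN).

    If any cell is zero, 0.5 is added to every cell, a common
    continuity correction in diagnostic meta-analysis (assumed here).
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    return (tp * tn) / (fp * fn)

# Hypothetical 2 x 2 table: 90 true positives, 10 false positives,
# 15 false negatives, 85 true negatives.
dor = diagnostic_odds_ratio(90, 10, 15, 85)
print(f"DOR = {dor:.1f}")  # DOR = 51.0

# In a mixed-effects metaregression on log(DOR), a binary design
# covariate with coefficient beta implies RDOR = exp(beta). For
# example, beta = 0.747 gives RDOR ~ 2.11, i.e., studies with that
# feature report DORs about 2.1 times larger, all else equal.
print(f"RDOR for beta = 0.747: {math.exp(0.747):.2f}")
```

An RDOR above 1 therefore means the design feature is associated with inflated accuracy estimates; a 95% CI excluding 1 indicates the association is statistically significant at the 5% level.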