Alicia M Jones, Daniel R Jones.
Abstract
Online AI symptom checkers and diagnostic assistants (DAs) have tremendous potential to reduce misdiagnosis and cost, while increasing the quality, convenience, and availability of healthcare, but only if they can perform with high accuracy. We introduce a novel Bayesian DA designed to improve diagnostic accuracy by addressing key weaknesses of Bayesian Network implementations for clinical diagnosis. We compare the performance of our prototype DA (MidasMed) to that of physicians and six other publicly accessible DAs (Ada, Babylon, Buoy, Isabel, Symptomate, and WebMD) using a set of 30 publicly available case vignettes, and using only sparse history (no exam findings or tests). Our results demonstrate superior performance of the MidasMed DA, with the correct diagnosis being the top ranked disorder in 93% of cases, and in the top 3 in 96% of cases.
Keywords: AI medical diagnosis; Bayesian medical diagnosis; Bayesian network; comparison of physicians with AI decision support; diagnostic decision support system; diagnostic performance; general medical diagnostic assistant; symptom checkers
Year: 2022 PMID: 35937138 PMCID: PMC9355422 DOI: 10.3389/frai.2022.727486
Source DB: PubMed Journal: Front Artif Intell ISSN: 2624-8212
Figure 1. Diagnostic BN hierarchy. (A) Generic fragment where each node represents a risk factor (R), disease (D), pathophysiological state (P), or finding (F); (B) BN fragment for liver cirrhosis.
Figure 2. Typical diagnostic BN configurations. (A) A disorder causes 2 findings; (B) Independent disorders both cause a finding; (C) Causally related disorders cause the same finding; (D) Causally related disorders each explain a subset of the patient findings.
Figure 3. Random serum glucose (image captured from our CMS) modeled as a log-normal distribution for (peak distributions left-to-right): normal (healthy), chronic diabetes mellitus, and DKA. The overlay table in the top left shows multifactorial distributions of serum glucose for DKA as a function of factor findings “current pregnancy” and “recent heavy alcohol consumption”.
Figure 4. Illustration of recursive BN computations for disorder cluster and subtype fragments. (A) Cluster fragment for patient findings F1 and F3 and disorder subtype ancestors. (B) Subtype tree for disorder D3. (C,D) D3 in the original network has been replaced by its children D31 and D32 to compute the cluster probabilities with the two children.
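The log-normal modeling described in the Figure 3 caption can be illustrated with a minimal Python sketch. It applies Bayes' rule to a single serum glucose reading across the three states named in the caption. All priors, medians, and spread parameters below are hypothetical placeholders chosen for illustration; they are not values from the paper or from MidasMed, and the multifactorial adjustments (pregnancy, alcohol) are not modeled.

```python
from math import exp, log, pi, sqrt

def lognormal_pdf(x, median, sigma):
    """Density of a log-normal distribution with the given median and log-scale sigma."""
    z = log(x / median) / sigma
    return exp(-0.5 * z * z) / (x * sigma * sqrt(2 * pi))

# HYPOTHETICAL parameters (not from the paper): prior, median glucose in mg/dL, sigma.
STATES = {
    "normal": (0.90, 95.0, 0.15),
    "chronic diabetes mellitus": (0.09, 180.0, 0.35),
    "DKA": (0.01, 500.0, 0.40),
}

def posterior(glucose):
    """Posterior probability of each state given one serum glucose value."""
    joint = {
        name: prior * lognormal_pdf(glucose, median, sigma)
        for name, (prior, median, sigma) in STATES.items()
    }
    total = sum(joint.values())
    return {name: value / total for name, value in joint.items()}

if __name__ == "__main__":
    for reading in (90.0, 200.0, 600.0):
        ranked = sorted(posterior(reading).items(), key=lambda kv: -kv[1])
        print(reading, [(name, round(p, 3)) for name, p in ranked])
```

Under these placeholder parameters, a low reading favors the healthy state and a very high reading favors DKA despite its small prior, which is the qualitative behavior the overlapping distributions in the figure suggest.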
Sample vignette.
| Diagnosis | Full vignette | Sparse-history input |
|---|---|---|
| Appendicitis | A 12-year-old girl presents with sudden-onset severe generalized abdominal pain associated with nausea, vomiting, and diarrhea. On exam she appears ill and has a temperature of 104°F (40°C). Her abdomen is tense with generalized abdominal pain, nausea, tenderness and guarding. No bowel sounds are present. | 12 y/o f, sudden onset severe abdominal pain, nausea, vomiting, diarrhea, T = 104 |
Performance comparison summary results for 7 DAs and physicians.
| Source | Vignettes (N) | Top 1 (n/N) | Top 1 (%) | Top 1 95% CI | Top 3 (n/N) | Top 3 (%) | Top 3 95% CI |
|---|---|---|---|---|---|---|---|
| Physicians | 90 | 68/90 | 75.3 | 65.4–84.0 | 81/90 | 90.3 | 81.9–95.3 |
| Ada | 30 | 22/30 | 73.3 | 54.1–87.7 | 27/30 | 90.0 | 73.5–97.9 |
| Babylon | 30 | 21/30 | 70.0 | 50.6–85.3 | 29/30 | 96.7 | 82.8–99.9 |
| Buoy | 21 | 11/21 | 52.4 | 29.8–74.3 | 15/21 | 71.4 | 47.8–88.7 |
| Isabel | 30 | 15/30 | 50.0 | 31.3–68.7 | 21/30 | 70.0 | 50.6–85.3 |
| MidasMed | 30 | 28/30 | 93.3 | 77.9–99.2 | 29/30 | 96.7 | 82.8–99.9 |
| Symptomate | 30 | 21/30 | 70.0 | 50.6–85.3 | 26/30 | 86.7 | 69.3–96.2 |
| WebMD | 30 | 20/30 | 66.7 | 47.2–82.7 | 28/30 | 93.3 | 77.9–99.2 |
| All DAs | 201 | 138/201 | 68.7 | 61.8–75.0 | 175/201 | 87.1 | 81.6–91.4 |
| Top 3 DAs | 90 | 71/90 | 78.9 | 69.0–86.8 | 82/90 | 91.1 | 83.2–96.1 |
The Babylon and physician tests were not replicated in this study; the results were transcribed from Baker et al.
In the Babylon study, three physicians were tested, but only percentages were reported; the 95% CIs were therefore computed assuming a total of 90 vignettes (30 per physician).
For 9 of the 30 disorders presented, Buoy proposed no diagnoses, only triage recommendations (e.g., “Contact a medical professional” or “Call 911!”).
Isabel, Symptomate, and WebMD are the only DAs tested both in the original paper (Semigran et al.) and in this study.
Confidence intervals were computed using the Clopper-Pearson exact method for binomial proportions.
For a larger sample size to compare with physicians, we combined the top 3 DAs we tested (Ada, MidasMed, and Symptomate).
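The Clopper-Pearson computation named in the footnotes can be sketched in plain Python as follows. This is an illustrative implementation, not the authors' code: it finds each bound by bisection on the binomial CDF, and the function names `binom_cdf` and `clopper_pearson` are our own. The sample counts in the usage loop are taken from the table.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson(k, n, alpha=0.05, tol=1e-10):
    """Exact two-sided (1 - alpha) confidence interval for k successes in n trials."""

    def solve(cdf_at, target):
        # cdf_at(p) decreases as p grows, so bisect for cdf_at(p) == target.
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if cdf_at(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper

if __name__ == "__main__":
    # (label, correct, total) taken from the top-1 columns of the table.
    for label, k, n in [("MidasMed", 28, 30), ("Isabel", 15, 30), ("Top 3 DAs", 71, 90)]:
        low, high = clopper_pearson(k, n)
        print(f"{label}: {100 * k / n:.1f}% ({100 * low:.1f}-{100 * high:.1f})")
```

Printed intervals can be compared row by row against the 95% CI columns. The exact method is conservative for small samples such as n = 30, which is why the intervals for individual DAs are wide and why the pooled rows (n = 90 or more) are narrower.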