Ayush Jain1, David Way1, Vishakha Gupta1, Yi Gao1, Guilherme de Oliveira Marinho1, Jay Hartford1, Rory Sayres1, Kimberly Kanada2, Clara Eng1, Kunal Nagpal1, Karen B DeSalvo1, Greg S Corrado1, Lily Peng1, Dale R Webster1, R Carter Dunn1, David Coz1, Susan J Huang2, Yun Liu1, Peggy Bui1,3, Yuan Liu1.
Abstract
Importance: Most dermatologic cases are initially evaluated by nondermatologists such as primary care physicians (PCPs) or nurse practitioners (NPs).

Objective: To evaluate an artificial intelligence (AI)-based tool that assists with diagnoses of dermatologic conditions.

Design, Setting, and Participants: This multiple-reader, multiple-case diagnostic study developed an AI-based tool and evaluated its utility. Primary care physicians and NPs retrospectively reviewed an enriched set of cases representing 120 different skin conditions. Randomization was used to ensure each clinician reviewed each case either with or without AI assistance; each clinician alternated between batches of 50 cases in each modality. The reviews occurred from February 21 to April 28, 2020. Data were analyzed from May 26, 2020, to January 27, 2021.

Exposures: An AI-based assistive tool for interpreting clinical images and associated medical history.

Main Outcomes and Measures: The primary analysis evaluated agreement with reference diagnoses provided by a panel of 3 dermatologists for PCPs and NPs. Secondary analyses included diagnostic accuracy for biopsy-confirmed cases, biopsy and referral rates, review time, and diagnostic confidence.
Year: 2021 PMID: 33909055 PMCID: PMC8082316 DOI: 10.1001/jamanetworkopen.2021.7249
Source DB: PubMed Journal: JAMA Netw Open ISSN: 2574-3805
Figure 1. User Interface of the Artificial Intelligence (AI)–Based Assistive Tool and the Study Design
The AI assistant shows as many as 5 top predictions of skin conditions, with the confidence in each prediction shown as colored dots and additional information (eg, sample images from an atlas) available with a click. More details are available in eFigure 1 in the Supplement. The study was designed as a multiple-reader, multiple-case (MRMC) study comprising 1048 cases. Two groups of clinicians (primary care physicians [PCPs] and nurse practitioners [NPs]) reviewed each case with or without AI assistance. The modality alternated every 50 cases. For every case, each clinician was instructed to rank as many as 3 differential diagnoses using a search-as-you-type interface and selecting matching skin conditions from a list of 3961 conditions. If their desired skin condition was not present, clinicians could provide free-text entries. All skin conditions were mapped to a list of 419 conditions. SCC indicates squamous cell carcinoma; SCCIS, SCC in situ.
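The alternating design described above (modality switching every 50 cases) can be illustrated with a minimal sketch. The function name `assign_modalities`, the coin flip for the starting modality, and the seed are assumptions made for illustration; the source does not describe how the study software implemented the schedule.

```python
import random


def assign_modalities(num_cases, batch_size=50, seed=0):
    """Return a per-case list of "assisted"/"unassisted" labels.

    Hypothetical helper: the starting modality is chosen at random
    (an assumption), then the modality flips after every batch.
    """
    rng = random.Random(seed)
    first_batch_assisted = rng.random() < 0.5
    modalities = []
    for case_index in range(num_cases):
        batch_number = case_index // batch_size
        # Even-numbered batches use the starting modality, odd ones the other.
        if batch_number % 2 == 0:
            assisted = first_batch_assisted
        else:
            assisted = not first_batch_assisted
        modalities.append("assisted" if assisted else "unassisted")
    return modalities


# One clinician's schedule over the 1048 study cases.
schedule = assign_modalities(1048)
```

Under this sketch, cases 1 to 50 share one modality, cases 51 to 100 use the other, and so on; the final batch is shorter because 1048 is not a multiple of 50.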
Patient Characteristics in the Original Data Set and Final Enriched Data Set
| Characteristic | Data set: full study (n = 1048) | Data set: cases with diagnoses from histologic findings (n = 152) |
|---|---|---|
| Years | 2017-2018 | 2017-2018 |
| No. of sites | 11 | 10 |
| No. of images included in study | 3935 | 413 |
| No. of patients included in study | 1016 | 152 |
| Age, median (IQR), y | 43 (30-56) | 49 (35-59) |
| Sex, No. (%) | | |
| Female | 672 (64.2) | 99 (65.1) |
| Male | 375 (35.8) | 53 (34.9) |
| Race and ethnicity, No. (%) | | |
| American Indian or Alaska Native | 9 (0.9) | 0 |
| Asian | 102 (9.7) | 5 (3.3) |
| Black or African American | 66 (6.3) | 5 (3.3) |
| Hispanic or Latino | 447 (42.7) | 59 (38.8) |
| Native Hawaiian or Pacific Islander | 20 (1.9) | 2 (1.3) |
| White | 365 (34.9) | 80 (52.6) |
| Not specified | 38 (3.6) | 1 (0.7) |
| Fitzpatrick skin type (6 types), No. (%) | | |
| I | 2 (0.2) | 2 (1.3) |
| II | 109 (10.4) | 17 (11.2) |
| III | 668 (63.8) | 111 (73.0) |
| IV | 205 (19.6) | 14 (9.2) |
| V | 25 (2.4) | 1 (0.7) |
| VI | 0 | 0 |
| Unknown | 38 (3.6) | 7 (4.6) |
| Skin conditions based on primary diagnosis, No. (%) | | |
| Acne | 40 (3.8) | NA |
| Actinic keratosis | 39 (3.7) | 1 (0.7) |
| Allergic contact dermatitis | 25 (2.4) | NA |
| Alopecia areata | 37 (3.5) | NA |
| Androgenetic alopecia | 32 (3.1) | NA |
| Basal cell carcinoma | 36 (3.4) | 32 (21.1) |
| Cyst | 32 (3.1) | 1 (0.7) |
| Eczema | 53 (5.1) | NA |
| Folliculitis | 32 (3.1) | 3 (2.0) |
| Hidradenitis | 34 (3.2) | NA |
| Lentigo | 32 (3.1) | 3 (2.0) |
| Melanocytic nevus | 61 (5.8) | 28 (18.4) |
| Melanoma | 20 (1.9) | 6 (3.9) |
| Postinflammatory hyperpigmentation | 28 (2.7) | NA |
| Psoriasis | 40 (3.8) | NA |
| SCC/SCCIS | 34 (3.2) | 14 (9.2) |
| SK/ISK | 52 (5.0) | 13 (8.6) |
| Scar condition | 34 (3.2) | 2 (1.3) |
| Seborrheic dermatitis | 37 (3.5) | NA |
| Skin tag | 36 (3.4) | 3 (2.0) |
| Stasis dermatitis | 25 (2.4) | NA |
| Tinea | 31 (3.0) | 1 (0.7) |
| Tinea versicolor | 34 (3.2) | NA |
| Urticaria | 33 (3.2) | NA |
| Verruca vulgaris | 37 (3.5) | 8 (5.3) |
| Vitiligo | 36 (3.4) | NA |
| Other | 116 (11.1) | 65 (42.8) |
Abbreviations: IQR, interquartile range; NA, not applicable; SCC/SCCIS, squamous cell carcinoma/squamous cell carcinoma in situ; SK/ISK, seborrheic keratosis/irritated seborrheic keratosis.
One case was removed from the study for logistical reasons.
Of 165 cases, 13 had equivocal biopsy results and were excluded from the biopsy analysis. A total of 141 cases had growths and 53 were malignant.
Enrichment was performed to avoid skew toward common conditions (eg, acne and eczema) as described previously and additionally to include all available cases with biopsy confirmation.
Other includes conditions with fewer than 10 cases each.
Figure 2. Comparison of Clinicians’ Diagnostic Agreement Rate With Dermatologists When Assisted by Artificial Intelligence (AI) vs Unassisted
Every clinician (primary care physicians [PCPs] or nurse practitioners [NPs]) provided a differential diagnosis (several rank-ordered conditions), which was then mapped to a list of 419 skin conditions. Only agreement in the top differential diagnosis (how often the clinicians’ primary diagnosis agreed with the top diagnosis of a panel of dermatologists [top-1 agreement]) is considered, with additional details in eFigures 4 and 5 in the Supplement. Panels A and B cover all 1048 cases, whereas panels C and D cover 141 cases with growths and biopsy confirmation. A, Top-1 agreement increased with AI assistance (P < .001 for both PCPs and NPs). B, For top-1 agreement for unassisted vs assisted modalities for each individual clinician, a value above the diagonal indicates that the clinician had a higher agreement with dermatologists when assisted by AI. C and D, A similar analysis evaluated diagnostic accuracy for growths with biopsy confirmation on the 3-way classification of malignant, precancerous, and benign. Error bars represent 95% CIs. Additional analysis of assistance stratified by AI agreement with the reference diagnoses is presented in eFigures 6 and 7 in the Supplement.
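As a rough illustration of the top-1 agreement measure defined in this caption, the sketch below counts how often a clinician's first-ranked diagnosis matches the panel's top diagnosis. The function name, the data layout, and the toy reads are all hypothetical; they are not study data, and the study's own analysis code is not shown in the source.

```python
def top1_agreement(clinician_differentials, reference_top_diagnoses):
    """Fraction of cases where the clinician's first-ranked diagnosis
    equals the reference panel's top diagnosis.

    Hypothetical helper; an empty differential counts as a miss.
    """
    if len(clinician_differentials) != len(reference_top_diagnoses):
        raise ValueError("need one differential per reference diagnosis")
    if not reference_top_diagnoses:
        raise ValueError("at least one case is required")
    matches = 0
    for differential, reference in zip(
        clinician_differentials, reference_top_diagnoses
    ):
        # Only the first-ranked condition is compared (top-1).
        if differential and differential[0] == reference:
            matches += 1
    return matches / len(reference_top_diagnoses)


# Toy example (made-up reads, already mapped to a shared condition list).
toy_reads = [
    ["Eczema", "Psoriasis"],
    ["Melanocytic nevus"],
    [],
    ["Tinea", "Eczema"],
]
toy_reference = ["Eczema", "Melanoma", "Acne", "Eczema"]
toy_rate = top1_agreement(toy_reads, toy_reference)
```

In the toy data only the first read matches at rank 1, so the rate is 1 of 4; the fourth read lists the reference condition second and therefore does not count toward top-1 agreement.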
Figure 3. Comparing Simulated Clinical Decisions by Clinicians When Assisted by Artificial Intelligence vs Unassisted
A, Rate of biopsy for all cases. B, Rate of referrals for all cases. C, Diagnostic accuracy among nonreferred cases. D, Diagnostic accuracy among referred cases. Top-3 agreement rates for cases for which the primary care physicians (PCPs) and nurse practitioners (NPs) did and did not indicate a referral are presented in eFigure 11 in the Supplement. Error bars represent 95% CIs.
Figure 4. Comparing Clinicians’ Confidence and Case Review Time When Assisted by Artificial Intelligence vs Unassisted
A, Confidence of the primary care physicians (PCPs) and nurse practitioners (NPs) as a stacked bar plot. NA indicates cases for which the clinician could not provide a diagnosis. B, Comparison of the differences in case review time for the full set of 1048 cases as a box plot. The box edges represent quartiles, whereas the whiskers extend to the last observed points that fall within 1.5 times the interquartile range from the quartiles. Outliers beyond the whiskers are indicated with dots; a total of 182 outliers (0.4% of the reads) beyond 900 seconds are excluded from the 4 box plots for ease of visualization. The median time for diagnosis increased from 89 to 94 seconds for PCPs and from 77 to 84 seconds for NPs.
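The whisker rule stated in this caption can be sketched as follows. The quartile method (`method="inclusive"` in Python's `statistics.quantiles`) is an assumption, since the source does not say how quartiles were computed, and the review times below are invented for illustration.

```python
import statistics


def box_plot_summary(values):
    """Quartiles, whisker ends, and outliers under the 1.5 x IQR rule.

    Hypothetical helper; the quartile interpolation method is an assumption.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observed points inside the fences.
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
        "outliers": outliers,
    }


# Made-up review times in seconds, not study data.
toy_times = [60, 70, 80, 90, 100, 110, 120, 400]
toy_summary = box_plot_summary(toy_times)
```

With these toy values the 400-second read falls outside the upper fence and is reported as an outlier, while the upper whisker stops at 120 seconds, the last observed point within range.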