Arvind Kumar Morya1, Jaitra Gowdar2, Abhishek Kaushal2, Nachiket Makwana2, Saurav Biswas2, Puneeth Raj2, Shabnam Singh3, Sharat Hegde4, Raksha Vaishnav5, Sharan Shetty6, Vidyambika S P7, Vedang Shah8, Sabita Paul7, Sonali Muralidhar9, Girish Velis10, Winston Padua11, Tushar Waghule12, Nazneen Nazm13, Sangeetha Jeganathan14, Ayyappa Reddy Mallidi15, Dona Susan John16, Sagnik Sen17, Sandeep Choudhary1, Nishant Parashar1, Bhavana Sharma18, Pankaja Raghav1, Raghuveer Udawat1, Sampat Ram1, Umang P Salodia1.
Abstract
INTRODUCTION: Deep Learning (DL) and Artificial Intelligence (AI) have become widespread owing to advances in technology and the availability of digital data. Supervised learning algorithms have shown human-level performance or better, and are better feature extractors and quantifiers than unsupervised learning algorithms. Building a large dataset with good quality control requires an annotation tool with a customizable feature set. This paper evaluates the viability of an in-house annotation tool that runs on a smartphone and can be used in a healthcare setting.
Keywords: artificial intelligence; deep learning; referrable diabetic retinopathy
Year: 2021 PMID: 33727785 PMCID: PMC7953891 DOI: 10.2147/OPTH.S289425
Source DB: PubMed Journal: Clin Ophthalmol ISSN: 1177-5467
Figure 1. Flow diagram of the user interface.
Figure 2. Zooming at three levels. (A) Default view, with the image spanning 512 pixels in its largest dimension. (B) Zoomed view, with the image spanning 768 pixels in its largest dimension. (C) Further zoomed view, with the image spanning 1024 pixels in its largest dimension.
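The three zoom levels in Figure 2 can be read as resizing the image so that its largest dimension equals 512, 768 or 1024 pixels. The following is a minimal sketch of that arithmetic, assuming the aspect ratio is preserved; the names `scaled_size`, `zoom_sizes` and `ZOOM_LEVELS` are hypothetical and are not taken from the tool itself.

```python
# Hypothetical helper names; the source only states the three target sizes.
ZOOM_LEVELS = (512, 768, 1024)  # largest dimension in pixels for views A, B, C


def scaled_size(width, height, largest_dim):
    """Scale (width, height) so the larger side equals largest_dim.

    Assumes the aspect ratio is preserved, which the caption does not state.
    """
    if width <= 0 or height <= 0 or largest_dim <= 0:
        raise ValueError("all dimensions must be positive")
    scale = largest_dim / max(width, height)
    # Round to whole pixels and never let a side collapse to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


def zoom_sizes(width, height):
    """Return the display size for each of the three zoom levels."""
    return [scaled_size(width, height, dim) for dim in ZOOM_LEVELS]
```

For example, `zoom_sizes(2048, 1536)` gives one `(width, height)` pair per zoom level, each with a largest side of 512, 768 and 1024 pixels respectively.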
Figure 3. Color image (A) and its corresponding red-free image (B).
Figure 4. Effect of brightness modification and the green channel. (A) The fovea is difficult to locate because the macular region is dark. (B) The fovea and macula are easier to localize. (C) Arteries and veins are easier to distinguish in the green channel. (D and E) A contrast change in the green channel makes it easy to assess the optic cup and optic disc for glaucoma verification.
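The green-channel view and brightness change described in Figures 3 and 4 can be illustrated on a tiny image held as nested lists. This is only a sketch under stated assumptions: the record does not say how the tool computes its red-free image, so taking the green channel directly is an assumption, and the additive brightness offset and the function names `green_channel` and `adjust_brightness` are hypothetical.

```python
def green_channel(pixels):
    """Return a greyscale image made from the green values only.

    pixels is a list of rows, each row a list of (r, g, b) tuples in 0..255.
    Using the green channel as a red-free view is an assumption here.
    """
    return [[g for (_r, g, _b) in row] for row in pixels]


def adjust_brightness(channel, offset):
    """Add offset to every value and clamp the result to the 0..255 range."""
    return [[min(255, max(0, value + offset)) for value in row] for row in channel]


# A 1 x 2 example image: one reddish pixel and one greenish pixel.
image = [[(200, 50, 10), (10, 250, 0)]]
green = green_channel(image)
brighter = adjust_brightness(green, 20)
```

A real viewer would more likely operate on image arrays through an imaging library; the nested-list form is used here only to keep the example dependency-free.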
Figure 5. Each completed annotation triggers a call to load the next set of 3 images.
Figure 6. The whole dataset is broken up into small, mutually exclusive chunks of 1000 images each.
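The chunking described in Figure 6 can be sketched as follows. This is an illustrative sketch, not the tool's implementation: the function name `make_chunks`, the use of string image identifiers and the de-duplication step are assumptions; only the chunk size of 1000 and the mutual exclusivity come from the caption.

```python
def make_chunks(image_ids, chunk_size=1000):
    """Split image_ids into consecutive, mutually exclusive chunks.

    Duplicate identifiers are dropped first (keeping first occurrence) so
    that no image can land in two chunks. The last chunk may be shorter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    unique_ids = list(dict.fromkeys(image_ids))  # order-preserving de-duplication
    return [
        unique_ids[start:start + chunk_size]
        for start in range(0, len(unique_ids), chunk_size)
    ]


# Hypothetical identifiers, for illustration only.
chunks = make_chunks([f"img_{i}" for i in range(2500)])
```

With 2500 hypothetical identifiers this yields two full chunks of 1000 and a final chunk of 500, and no identifier appears in more than one chunk.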
Figure 7. Daily progress graph.
General Statistics and Data Analysis for Tool Usage
| Measure Name | Measure Quantity |
|---|---|
| Number of active doctors | 7 |
| Total number of doctors who have used the tool | 32 |
| Minimum number of images tagged by a doctor | 20 |
| Maximum number of images tagged by a doctor | 26,090 |
| Total annotations recorded on tool | 104,528 |
| Total unique images tagged | 52,152 |
| Minimum number of times an image is tagged | 23 |
| Average time spent by a user to complete a single annotation | 54 seconds |
| Average number of images tagged daily | 413 |
| Time spent by a user on average daily | ~53 minutes |
| Average number of ungradable images indicated by a user | 208 |
| Average number of new anomalies found in a single image per user | 24 |
Figure 8. Graph of the average total hourly annotations over 10 months, bucketed by hour of the day on the x-axis, with the number of annotations on the y-axis. The x-axis represents the day of the week and the yellow line represents the daily target assigned as per choice, ie, the number of images graded on that day.
Figure 9. Usage of the red-free imaging and brightness features: how often they were accessed, by how much brightness was varied, and how these correlate with the overall verdict for an image.
Figure 10. Graph showing that graders access the green channel image at least 5% of the time; grouped by overall verdict, the green channel is used most often for unhealthy cases.
Figure 11. Verifying the stickiness or addictiveness of the tool by plotting the average time taken per annotation (in seconds) for the first 100 images versus the last 100 images annotated by some of the active graders.
Figure 12. Top 15 signs and diseases by frequency.
Multi-Grader Variability Statistics as per Tasks/Disease Categories
Grader Variability Among Doctors

| Task | Agreement Percentage (0 to 100%): Minimum | Agreement Percentage: Maximum | Agreement Percentage: Mean | Kappa (−1 to 1): Minimum | Kappa: Maximum | Kappa: Mean |
|---|---|---|---|---|---|---|
| | 30.9 | 97.6 | 54.49 | −0.01 | 0.63 | 0.25 |
| | 59.4 | 98.4 | 75.9 | −0.02 | 0.68 | 0.23 |
| | 81.5 | 97.1 | 89.4 | −0.02 | 0.36 | 0.15 |
| | 72.2 | 93.6 | 84.4 | 0.33 | 0.7 | 0.51 |
Abbreviations: DR, diabetic retinopathy; DME, diabetic macular edema; ARMD, age-related macular degeneration.
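The two measures reported in the grader variability table, agreement percentage and kappa, can be computed for a pair of graders as sketched below. The record does not state which kappa variant was used, so pairwise Cohen's kappa is an assumption, and the function names and the example labels are hypothetical rather than taken from the study data.

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Percentage (0 to 100) of images on which two graders gave the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100 * matches / len(labels_a)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa (-1 to 1): observed agreement corrected for chance agreement."""
    n = len(labels_a)
    observed = percent_agreement(labels_a, labels_b) / 100
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both graders pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both graders used one identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up binary verdicts (1 = disease present) from two hypothetical graders.
grader_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
grader_2 = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
agreement = percent_agreement(grader_1, grader_2)
kappa = cohen_kappa(grader_1, grader_2)
```

In this made-up example the graders agree on 8 of 10 images, and because half of that agreement is expected by chance, kappa comes out near 0.6. This is why a task can show high raw agreement alongside a low kappa, as in the third row of the table.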
Results for AI Classifiers Trained Using Data Annotated by Multiple Experts Using the Smartphone App
| Metrics | Baseline Set | IDRiD | APTOS |
|---|---|---|---|
| | 550 | 413 | 3412 |
| | 89.68 | 88.86 | 87.46 |
| | 90.37 | 89.18 | 79.48 |
| | 89.11 | 88.31 | 92.91 |
| | 87.09 | 92.77 | 88.47 |
| | 90.37 | 89.18 | 79.84 |
| | 88.71 | 90.94 | 83.74 |
| | 96.17 | 95.45 | 95.74 |