
Health Economic and Safety Considerations for Artificial Intelligence Applications in Diabetic Retinopathy Screening.

Yuchen Xie¹, Dinesh V Gunasekeran¹,², Konstantinos Balaskas³, Pearse A Keane³, Dawn A Sim³, Lucas M Bachmann⁴, Carl Macrae⁵, Daniel S W Ting¹,⁶,⁷.

Abstract

Systematic screening for diabetic retinopathy (DR) has been widely recommended for early detection in patients with diabetes to address preventable vision loss. However, substantial manpower and financial resources are required to deploy opportunistic screening and transition to systematic DR screening programs. The advent of artificial intelligence (AI) technologies may improve access and reduce the financial burden for DR screening while maintaining comparable or enhanced clinical effectiveness. To deploy an AI-based DR screening program in a real-world setting, it is imperative that health economic assessment (HEA) and patient safety analyses are conducted to guide appropriate allocation of resources and design safe, reliable systems. Few studies published to date include these considerations when integrating AI-based solutions into DR screening programs. In this article, we provide an overview of the current state-of-the-art of AI technology (focusing on deep learning systems), followed by an appraisal of existing literature on the applications of AI in ophthalmology. We also discuss practical considerations that drive the development of a successful DR screening program, such as the implications of false-positive or false-negative results and image gradeability. Finally, we examine different plausible methods for HEA and safety analyses that can be used to assess concerns regarding AI-based screening. Copyright 2020 The Authors.


Keywords: artificial intelligence; deep learning; diabetic retinopathy; machine learning; ocular imaging


Year: 2020; PMID: 32818083; PMCID: PMC7396187; DOI: 10.1167/tvst.9.2.22

Source DB: PubMed; Journal: Transl Vis Sci Technol; ISSN: 2164-2591; Impact factor: 3.283

Introduction

Ageing populations and other demographic shifts have made diabetes mellitus (DM) a major epidemic of the 21st century. Despite improvements in health care that have led to decreasing age-specific mortality worldwide, there is an increasing net burden of DM disease, lives lost, and years lived with the disease and its complications. Diabetic retinopathy (DR), which affects over one-third of patients with DM, is one such disabling complication and now presents a mounting challenge to over-stretched eye care services worldwide. The need for regular screening has been established for early detection of DR along with diabetic macular edema (DME), which can develop at any time as DR progresses and is a frequent cause of severe visual impairment in these patients. These trends give rise to an urgent need for solutions that can sift through the growing crowds of at-risk individuals and triage patients in need of early treatment to prevent permanent vision loss. Fortunately, progress in the parallel fields of ophthalmic imaging and the deep learning (DL) branch of artificial intelligence (AI) has enabled promising solutions that automate the detection of major blinding eye diseases in ophthalmic images. Automation of screening using AI-based solutions could thereby free up limited health care resources to provide more complex eye care services, cater to subpopulations with barriers to health care access, or facilitate the transition from opportunistic to systematic screening programs. Clinically acceptable performance of these AI-based solutions has been established for DR screening based on classification of ophthalmic imaging, with area under the receiver operating characteristic curve, sensitivity, and specificity in excess of 80%.
These solutions thereby enable accurate diagnosis of disease severity for triage and right-siting of patients. However, despite the increasing interest in these AI-based solutions, few have been implemented across populations owing to uncertainty regarding their application to different health care settings, as well as the potential safety challenges. In this article, we highlight the major challenges in integrating AI-based solutions for DR screening, along with considerations for conducting health economic assessment (HEA) and safety analysis of these solutions.

Applications of AI in DR Screening

The clinical features of DR in the retina that are indicative of clinical severity and outcomes (e.g., blindness) have been described in the existing literature and consolidated in clinical guidelines, such as the International Clinical Classification of Diabetic Retinopathy Scale. This body of knowledge has fueled applications of AI in the form of classical feature-based image analysis and machine learning (ML) algorithms for DR screening that are trained on features individually labeled by experts. These methods have been successfully used to automate classification of retinal fundus photographs based on the presence or absence (binary classification) and/or clinical severity (multiclass classification) of DR. The advent of deep learning systems (DLS) heralds a new era in the processing of medical data using AI, whereby algorithms are trained on large repositories of imaging data without individual feature labeling. Instead, training is conducted using imaging data labeled by experts with the overall clinical severity. The DLS then self-learns predictive features from these labels using mathematical functions. Recent reports have described DLS outperforming classical feature-based image analysis in screening for DR and other ocular diseases. The development and validation of several novel DLS solutions for automated DR screening have been reported by groups from various countries and regions, including Singapore, the United States, the United Kingdom, China, Thailand, India, and Africa. These investigators reported clinically acceptable performance of their DLS tools for classifying DR in color fundus photography or optical coherence tomography (OCT) imaging. Some AI-based solutions for DR screening have been approved as medical devices for automated classification of ophthalmic imaging based on evidence from studies conducted in several high-income countries. These solutions also have tremendous potential to enhance health care in resource-limited settings.

Methods of HEA

Given the scarcity of resources available within a health system, HEA of novel health technologies is required for decision-makers to allocate resources efficiently. DR screening and teleophthalmology programs are cost-effective in a variety of developed and developing settings. Before the advent of DL, feature-based computing techniques had been developed for automated retinal image screening. Such automated retinal screening has been shown to be cost-effective when applied to the national screening program in Scotland and the United Kingdom. However, few studies have incorporated HEA for teleophthalmology services augmented with DL-based classifiers for DR screening. The few existing reports on HEA based on the implementation of DL-based solutions for DR screening are from the United Kingdom and Singapore. These studies show AI to be cost-effective in Singapore and the United Kingdom. However, this finding may not be generalizable given that both are high-income countries with established teleophthalmology DR screening programs. Cost-effectiveness may differ between countries owing to variations in disease prevalence, geographic barriers, availability and cost of the relevant skilled manpower, and health care resources. To date, there have not been any studies conducting HEA of AI applications for DR screening in resource-limited settings or in countries without established teleophthalmology DR screening programs. The most appropriate HEA method for a given test or intervention is determined by several factors, depending on the existing evidence for the solution and the intended clinical context for its application. Given the relatively nascent nature of AI-based solutions for health care, there is a need to identify suitable methods of HEA to evaluate them. In the following section, we outline common types of HEA and the contexts in which they are applied.
These include cost-effectiveness analysis (CEA), cost-utility analysis (CUA), cost-minimization analysis (CMA), and cost-benefit analysis (CBA) (Table 1).

Table 1.

Types of HEA

Method	Measurement of Effect	Questions Raised	Measurement of Cost
CUA	Healthy years (typically measured as quality-adjusted life years)	Given financial constraints, what is the most efficient way of allocating limited resources for improved outcomes?	Monetary units
CEA	Natural units (e.g., life years gained, cases of blindness avoided, and others)	Given financial constraints, what is the most efficient way of allocating limited resources for improved outcomes?	Monetary units
CMA	Assumption is that the clinical effectiveness of each alternative is the same	Given a certain objective, what is the most efficient way to achieve it?	Monetary units
CBA	Monetary units	Should a given goal or objective be pursued and to what extent?	Monetary units


Cost-Utility/Cost-Effectiveness Analysis

CEA and CUA are two distinct forms of HEA whose names are often used interchangeably in the literature, although CUA is technically more comprehensive. CEA generally uses a single clinical outcome (life years), whereas CUA often uses quality-adjusted life years (i.e., calculated based on preferences for a particular health state). Health economics authorities (e.g., the Washington Panel and the official requirements for economic evaluations in the United Kingdom) have recommended the use of CUA. When conducting CEA, other clinical outcomes (e.g., cases of blindness avoided or cases of DR detected) can be used instead, depending on the disease studied. Examples of other outcomes that may be relevant to DR screening are blindness cases averted in a primary health care setting, or the number of cases of proliferative DR detected in a screening network.
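As a minimal sketch of how a CEA or CUA result is commonly summarized, the incremental cost-effectiveness ratio (ICER) divides the difference in cost between two strategies by the difference in effect. The class and function names and all figures below are illustrative assumptions, not values taken from any study discussed in this article.

```python
from dataclasses import dataclass


@dataclass
class Strategy:
    name: str
    cost: float    # mean cost per patient, in monetary units
    effect: float  # mean effect per patient (QALYs for CUA, natural units for CEA)


def icer(new: Strategy, old: Strategy) -> float:
    """Incremental cost per additional unit of effect of `new` over `old`."""
    delta_cost = new.cost - old.cost
    delta_effect = new.effect - old.effect
    if delta_effect == 0:
        # With identical effects the ratio is undefined; only costs differ.
        raise ValueError("equal effects: ICER undefined, compare costs instead")
    return delta_cost / delta_effect


# Hypothetical numbers for illustration only.
manual = Strategy("manual grading", cost=1000.0, effect=10.0)
hybrid = Strategy("semi-automated", cost=1100.0, effect=10.5)
print(icer(hybrid, manual))  # 200.0 monetary units per extra unit of effect
```

A negative ratio has to be read by quadrant (cheaper and more effective versus costlier and less effective), so the raw number alone does not settle a decision.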

Cost-Minimization Analysis

CMA is often used when it has been established that two or more health technologies or interventions have comparable clinical effectiveness. In this context, researchers are primarily interested in assessing which alternative is less costly and quantifying the potential saving associated with the least expensive alternative. However, one of the major concerns with CMA is that it is often difficult to establish whether two alternatives are indeed equivalent (e.g., in a longitudinal study). Several researchers have argued that even when no statistically significant difference in clinical outcomes is found between two alternatives, CEA is still preferred for HEA. CMA is primarily used in situations with an established expert consensus (e.g., professional, based on research) that the two alternatives are equivalent in clinical effectiveness. It has been suggested that CMA is most suitable for clear-cut scenarios in which the alternatives represent similar state-of-the-art solutions (e.g., screening tools of the same class). A research practice report by the International Society for Pharmacoeconomics and Outcomes Research indicated that CMA provides useful insights on budget impact for decision-makers. In DR screening, the current literature has concluded that AI-based solutions using DL techniques have demonstrated clinically comparable performance to human assessment in established DR screening programs, both in publicly available datasets and in real-world settings. A recently published meta-analysis on DLS has further confirmed this conclusion. As such, CMA is a viable method to conduct HEA of comparable AI-based solutions in health systems with established DR screening programs.
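The comparison that CMA performs can be sketched in a few lines: once alternatives are accepted as clinically equivalent, only their costs are ranked. The function name, the option labels, and the per-patient costs below are hypothetical and serve only to illustrate the calculation.

```python
def cost_minimization(costs: dict[str, float]) -> tuple[str, float]:
    """Return the cheapest alternative and its saving versus the next cheapest.

    Only meaningful when the alternatives are accepted as clinically equivalent.
    """
    if len(costs) < 2:
        raise ValueError("need at least two alternatives to compare")
    ranked = sorted(costs.items(), key=lambda item: item[1])
    (best_name, best_cost), (_, runner_up_cost) = ranked[0], ranked[1]
    return best_name, runner_up_cost - best_cost


# Hypothetical cost per patient screened, in monetary units.
example_costs = {"option_a": 80.0, "option_b": 62.0, "option_c": 66.0}
print(cost_minimization(example_costs))  # ('option_b', 4.0)
```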

Cost-Benefit Analysis

CBA is often used to quantitatively evaluate whether a new intervention should be adopted by directly comparing costs of the intervention against existing practices. For CBA, clinical outcomes and effects (e.g., disability days avoided, life years gained, medical complications avoided, or quality-adjusted life years gained) are converted into monetary value to evaluate the foreseeable net costs of adopting a given solution within a clinical pathway. The main difficulty in applying CBA effectively in health, however, lies in assigning a monetary value to clinical outcomes (e.g., quality-adjusted life years or blindness prevented). In discrete choice experiments, patients are invited to express their strength of preference for specific clinical outcomes to help ascribe a monetary value to them. However, this is subject to variation from cultural differences, and there are challenges (e.g., uncertainty about the validity of the outcomes of interventions) that need to be addressed for CBA to be used in HEA of AI-based solutions for DR screening. Cartwright has contributed an insightful review of several reports applying CBA to drug abuse treatment services. Notably, the review highlighted challenges in the measurement of clinical outcomes, the need for representative populations of recruited patients, and the lack of standardization in the application of CBA.
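The dependence of CBA on the monetary valuation of outcomes can be made concrete with a short sketch. The function name and every number below are hypothetical; the point is only that the same clinical result can favor or disfavor adoption depending on the value ascribed to each outcome.

```python
def net_benefit(outcomes_gained: float, value_per_outcome: float,
                incremental_cost: float) -> float:
    """Monetized benefit minus incremental cost; a positive value favours adoption."""
    return outcomes_gained * value_per_outcome - incremental_cost


# Hypothetical: 12 cases of blindness avoided at an incremental cost of 400,000.
for value in (20_000.0, 50_000.0):
    result = net_benefit(12, value, 400_000.0)
    print(f"value per case {value:>9,.0f} -> net benefit {result:>11,.0f}")
```

With these assumed inputs the net benefit is negative at the lower valuation and positive at the higher one, which is why the choice of valuation method matters so much for CBA.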

HEA of AI-Based Solutions for DR Screening

The previous section indicated that the existing reports of HEA of AI-based solutions for DR screening are from countries with established teleophthalmology programs, and systems for training and regular examination of human assessors for DR screening (i.e., United Kingdom and Singapore). Having reviewed these reports, one would arrive at the conclusion that semi-automated screening models are cost-effective (Table 2). Tufail et al. reported the cost saving to be 12% to 21% for DR screening in the United Kingdom using ML (an AI-based technology) in comparison with human assessors. A Scottish study showed a 46.7% cost reduction by replacing first-level human assessment with automated grading in a national DR screening program. A study from Singapore suggested that the semi-automated/AI-assisted screening model is cost-effective compared to human assessment for DR screening over a lifetime horizon. However, there is no published HEA of a fully-automated DR screening model to date.

Table 2.

Health Economic Studies on DR Screening Using AI

Author, Year, Country	Comparators	Screening Model	Measurement of Effect	Economic Outcomes
Scotland et al,³⁴ 2007, UK	Semi-automated grading (hybrid approach) vs. manual grading alone	Digital photography and multilevel manual grading systems	The number of appropriate screening outcomes (i.e., defined as final decisions appropriate to actual grade of retinopathy present) and true referable cases detected in one year	Compared to the manual grading model, the semi-automated model led to a saving of £4088 per additional referable case detected, and of £1990 per additional appropriate screening outcome.
Tufail et al,²⁰ 2016, UK	AI-based ML tool as replacement for initial manual grading (semi-automated hybrid)	AI-based (ML) two-field fundus photos	Appropriate outcomes (defined as identification of DR present vs. absent by the AI-based software)	AI-based semi-automated hybrid approaches (Retmarker and EyeArt) had sufficient specificity to make them cost-effective compared with manual grading alone, with ICERs of $18.69 and $7.14, respectively
Xie et al,⁵⁰ 2019, Singapore	Semi-automated hybrid approach (DLS-based) vs. manual grading alone	Retinal fundus photographs	QALYs	DLS-based (semi-automated hybrid approach) resulted in a lifetime cost-saving of $135 per patient while maintaining comparable QALYs gained.

QALYs, quality-adjusted life years;

ICER, incremental cost-effectiveness ratio;

manual grading is equivalent to human assessment.


Implications of False Negatives (FNs) in Screening Programs

FN cases are patients with referable DR who are mislabeled as being normal. As a result, these patients may receive delayed care if they are only referred at a subsequent screening interval that could be months or years later. The clinical impact of delayed care is the risk of interim disease progression. For DR, this can lead to permanent vision loss in severe cases, as they tend to progress faster. Even when effective treatment is readily available, a high FN rate puts patients at increased risk of disease progression and vision loss. Beyond the financial burden on the health care system from disease progression due to late detection, studies also report a psychological impact on patients, loss of public confidence in screening programs, and legal implications as other major consequences of FNs.

Implications of False Positives (FPs) in Screening Programs

In contrast to FNs, a high FP rate in a screening program results in referrals of normal screening subjects for further assessment by an ophthalmologist when it is not required. This creates additional costs for the health care system in terms of resources and manpower being utilized to attend to unnecessary referrals. Moreover, FPs from a screening program could result in unnecessary anxiety and psychological stress for patients. However, there is no expert consensus on the acceptable FP rate for DR screening to date. Image quality is another important consideration in real-world screening implementation. Images with low quality (ungradable images) would be referred to the assessors and could incur additional costs for regrading images or repeat image acquisition if necessary. Nevertheless, the treatment of ungradable images as FPs is not yet standard practice. In reports of DL models, several groups have excluded ungradable images from their analyses. However, this may not reflect the true performance of these solutions in practical application. In a study of automated eye screening, Tufail et al. reported results after including images of poor quality or classified as ungradable by the human assessors. Similarly, Ting et al. in Singapore also considered ungradable images as referable DR to avoid missing possible DR cases. Discrepancies in reporting FP rates would impact the HEA of screening programs. The authors recommend that images classified as ungradable by AI-based solutions for DR screening should be included in the assessment of performance to reflect the practical need for these patients to be referred for definitive assessment. In developing a screening solution, there is a tradeoff between minimizing FPs and minimizing FNs. The ideal balance for each health system may vary slightly depending on system factors, such as cost structures, availability of resources, and the resolution of competing clinical and financial interests.
However, when clinical considerations are prioritized, minimizing FN in the context of these high performing AI-based solutions is generally favored because of the potential clinical safety impact of FN, whereas that for FP is mitigated when patients are reviewed by the attending ophthalmologist.
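To illustrate how FN, FP, and ungradable-image rates translate into patient counts, the sketch below computes expected outcomes for one screening round. The function name and all input values are hypothetical. Ungradable images are counted as referrals, in line with the recommendation above; the further assumption that ungradable images occur independently of disease status is a simplification made only for this example.

```python
def screening_outcomes(n, prevalence, sensitivity, specificity, ungradable_rate=0.0):
    """Expected counts for one screening round of n patients.

    Assumes (for illustration) that ungradable images are referred and that
    ungradability is independent of disease status.
    """
    gradable = n * (1 - ungradable_rate)
    diseased = gradable * prevalence
    healthy = gradable - diseased
    tp = diseased * sensitivity   # referable DR correctly referred
    fn = diseased - tp            # referable DR missed until the next round
    tn = healthy * specificity    # normal subjects correctly not referred
    fp = healthy - tn             # normal subjects referred unnecessarily
    ungradable = n - gradable
    return {
        "TP": tp, "FN": fn, "TN": tn, "FP": fp,
        "ungradable_referred": ungradable,
        "total_referrals": tp + fp + ungradable,
    }


# Hypothetical cohort: 10,000 patients, 10% referable DR, 90% sensitivity and
# specificity, 5% ungradable images.
result = screening_outcomes(10_000, 0.10, 0.90, 0.90, ungradable_rate=0.05)
print({key: round(value) for key, value in result.items()})
```

Under these assumed inputs, roughly 95 patients with referable DR are missed while about 855 normal subjects and 500 patients with ungradable images are referred, which shows how strongly the referral workload depends on specificity and gradeability at low prevalence.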

Challenges of Conducting HEA in the Real-World Setting

A number of recent studies suggest that the use of traditional techniques for HEA to quantify the impact of complex health services, such as a national screening program, can be challenging. These studies explain that the evaluation of complex interventions involving both human services and advanced assistive technologies will likely encounter a number of problems. Among them, the heterogeneity of the user groups, participant selection (bias), the degree of participation of the user groups carrying out the intervention, and the composition of these groups lead to complexities that may require modifications to traditional assessment methods. In addition, conducting a comprehensive evaluation of an AI-based solution for DR screening requires consideration of the local context, such as the availability of skilled manpower and DR screening resources. Therefore, investigating implementation in resource-limited settings is also an important area for future health services research. This is needed to evaluate the interventions in these settings based on their unique practical considerations, such as limited availability of internet access and the forms of imaging devices available (table-mounted, handheld, smartphone adapter-based, and others), which may affect image quality and the performance of AI-based solutions for DR screening, including the incidence of FPs and FNs.

Summary Recommendations for HEA of AI-Based DR Screening

In summary, the choice of the specific HEA method for a particular clinical application of AI would depend on the form of application, clinical outcomes relevant to the intervention, availability of preexisting representative datasets, and the nature of assumptions associated with the solution. CMA is useful for rapid comparison of interventions with established comparable clinical effectiveness. Where this has yet to be established, CUA is often the preferred mode of analysis, although the nature of measurable clinical outcomes may require CEA to be considered instead. In resource-limited settings with high unmet clinical needs, CBA provides a tool for quantitative assessment of interventions to identify the most financially prudent option. Based on these considerations, CMA can be considered to evaluate AI-based solutions for DR screening in developed countries with established DR screening programs. CEA/CUA may need to be conducted for other dissimilar contexts to evaluate both clinical outcomes and costs based on the health system in question. Given the pressing need for solutions to expand the capacity of DR screening capabilities, HEA using data from clinical trials would be ideal to provide reliable and timely results with high internal validity to aid administrators in decision-making regarding the adoption of these AI-based solutions. The selected HEA method needs to be applied with established HEA strategies such as use of multiple comparator groups, stratified sensitivity analysis using those groups, and appropriate modeling methods, as outlined in frameworks for the assessment of complex public health interventions.

Methods of Safety Analysis for Health Care

Implementing AI technologies in national screening programs has the potential to improve patient safety by providing rapid and reliable identification of referable eye disease. It also has the potential to introduce new risks that will need careful analysis and management. These risks can be associated with the underlying AI technologies or the organizational systems that implement them. For example, mismatches can develop between the data that a DLS was originally trained on (i.e., training dataset) and the data it is required to interpret (validation dataset), such as geographic variations in disease phenotypes, which can lead to shifts in screening performance. Therefore, organizational systems and decision-making processes need to be developed for periodic monitoring to investigate and address instances in which the automated screening system does not provide an appropriate classification, to ensure that the overall screening system can “fail safe.” In addition, analyzing the safety of a DLS can be challenging, owing to difficulties in understanding the underlying decision-making process. The safety analysis of AI-based screening programs therefore requires the use of analytic techniques that consider clinical, technical, social, and organizational sources of safety and risk. A range of safety analysis methods have been developed for the prospective analysis of potential risks in complex sociotechnical systems. However, there has been limited examination of how these can be applied to large-scale AI systems in health care to date. In the following sections, we outline several relevant methods of safety analysis, including failure mode and effects analysis (FMEA), system-theoretic process analysis (STPA), and bowtie analysis.
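One possible form of the periodic monitoring described above is sketched here: a sample of AI-graded cases is regraded by human assessors, and the program is flagged for review when the audited sensitivity can no longer be shown to exceed a chosen floor. The use of a Wilson score interval, the function names, the floor values, and the audit counts are all assumptions made for illustration; the text does not prescribe a particular monitoring statistic.

```python
import math


def wilson_lower_bound(successes: int, n: int, z: float = 1.96) -> float:
    """Lower limit of the Wilson score interval for a proportion (z=1.96 ~ 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = p + z**2 / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (center - margin) / denom


def needs_review(true_positives: int, false_negatives: int, floor: float) -> bool:
    """Flag when the audit cannot show sensitivity to be at or above `floor`."""
    n = true_positives + false_negatives
    return wilson_lower_bound(true_positives, n) < floor


# Hypothetical audit: 200 referable cases regraded, 180 detected by the AI.
print(round(wilson_lower_bound(180, 200), 3))
print(needs_review(180, 20, floor=0.80))
print(needs_review(180, 20, floor=0.90))
```

Flagging on the lower bound is deliberately conservative: small audit samples trigger review more readily, which errs on the side of the "fail safe" behavior the text calls for.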

Failure Mode and Effects Analysis

FMEA is a structured and proactive approach to identifying safety issues in complex sociotechnical systems that is increasingly applied to health care. FMEA involves creating a detailed map of the processes for a service or activity to identify all the ways in which those processes might fail, and what the causes and effects of those failures might be. Each failure is then assessed according to the severity of the outcome, the probability of occurrence, and the likelihood of detection, to prioritize mitigating action and resources. One of the key requirements of conducting an effective FMEA is to establish a team with deep and broad expertise in all aspects of the system being analyzed, encompassing clinical, technical, and organizational components. Conducting FMEAs can be time-consuming and resource intensive. Because of the focus on analyzing individual failure modes, capturing complex interactions between different parts of a system is also a challenge. However, FMEA provides a systematic approach to understand and develop solutions for a broad range of technical and organizational safety risks and could be effectively applied to the implementation of AI-based screening programs.
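The prioritization step of FMEA can be sketched as follows. A common convention, assumed here rather than taken from the text, is to score severity, occurrence, and detection on 1 to 10 scales and rank failure modes by their product, the risk priority number (RPN). The failure modes and scores below are hypothetical examples for a DR screening pathway.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    description: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (rare) to 10 (frequent)
    detection: int   # 1 (almost always detected) to 10 (almost never detected)

    @property
    def rpn(self) -> int:
        """Risk priority number: higher values call for earlier mitigation."""
        return self.severity * self.occurrence * self.detection


def prioritize(modes):
    """Return failure modes ordered from highest to lowest RPN."""
    return sorted(modes, key=lambda mode: mode.rpn, reverse=True)


# Hypothetical failure modes and scores, for illustration only.
modes = [
    FailureMode("image upload fails at screening site", 4, 5, 2),
    FailureMode("referable DR classified as normal (FN)", 9, 3, 8),
    FailureMode("ungradable image not flagged to clinician", 6, 4, 5),
]
for mode in prioritize(modes):
    print(mode.rpn, mode.description)
```

In practice the scores would be agreed by the multidisciplinary team described above, and a single product can hide a rare but severe failure, so the ranking is a starting point for discussion rather than a decision rule.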

System-Theoretic Process Analysis

STPA is a safety analysis method that analyzes the way safety is controlled within a complex system, such as through automated monitoring, management supervision, or regular audits. It identifies where potential gaps in those control systems may occur, and how serious those unsafe control actions might be. One of the core premises of this approach is that all systems have hierarchical control structures: for example, local-level control might be performed by technicians or clinicians; higher-level supervision may be conducted by program managers; and overall oversight may be performed by systems regulators. The STPA method seeks to identify hazards in terms of potential failures of control, such as scenarios in which clinicians may not become aware of ungradable images. STPA is a relatively new method that requires extensive expertise in systems analysis. It has seen limited application in health care to date, although its associated incident analysis model has been applied with useful outputs. STPA may be particularly valuable in identifying and optimizing the safety monitoring and governance systems required for AI-based screening programs. These may include routine algorithmic audits, peer review, and adjudication processes, which have already been described as solutions for grader variability when training automated solutions for DR screening.

Bowtie Analysis

Bowtie analysis is a barrier-based approach to safety analysis that is widely used in highly automated safety-critical industries, such as aviation, and is beginning to be applied in health care. It provides a visual method to identify and map factors that contribute to a particular failure, the consequences that can result from that failure, and the barriers and risk controls that can protect against those contributing factors and consequences. One of the main strengths of bowtie analysis is the ability to produce comprehensive graphic representations of complex models of risk, which can be used to explore both the sources of risk and safety in relation to specific types of failure. Directly identifying safety barriers and risk controls also provides practical insight into the actions that are needed to mitigate risks when implementing a new system.

Conducting Safety Analyses of AI-Based Solutions for DR Screening

In the earlier sections, we reviewed several important methods (FMEA, STPA, bowtie analysis) that can be used to analyze the safety concerns in implementing AI-based solutions in ophthalmology. To use these methods for safety analyses, a thorough understanding is required of the various models by which AI-based solutions for DR screening can be implemented within a health system. The use of AI with teleophthalmology has been suggested as a sustainable solution to rapidly scale up DR screening. Existing teleophthalmology screening programs utilize remote human assessment (by manual graders) to identify the presence of DR in ophthalmic imaging captured in community-based settings. To deploy AI-based DR screening programs, there are two different models that could be used: the semi-automated (using DLS as a filter prior to human assessment), and the fully-automated (using DLS as a complete replacement of human assessment). Figure 1 depicts the two DLS-based DR screening models (Figs. 1B, 1C), alongside an existing teleophthalmology human assessment model (Fig. 1A).

Figure 1.

Three potential DR screening models using manual grading (A), semi-automated (B), and fully-automated (C).

The semi-automated model (Fig. 1B) is a hybrid approach using an AI-based solution as a preliminary filter prior to human assessment. Here, referable cases from the solution undergo secondary assessment by human assessors in a centralized reading center. Cheung et al. suggested that the benefits of the semi-automated model include decreased workload on nonreferable retinal images and fewer FP cases referred to ophthalmologists. However, a fully-automated model (Fig. 1C) with complete replacement of human assessment may be more relevant for countries without existing systems and manpower for teleophthalmology. Ultimately, the manner in which various AI-based solutions are to be integrated into different health care systems needs to be considered based on the performance of the tool, the constraints of the system, and the safety considerations for participating patients. Because of the scalability of AI and its ability to meet the needs of varied populations of patients, there is growing interest in examining the potential safety issues that need to be considered. The earlier-mentioned methods for safety analyses can be used to inform the development of regulatory standards for assessment of safety and efficacy, which are still evolving with the advances of AI applications in medicine.
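The difference in human grading workload between the three screening models can be sketched with a simple expected-count calculation. The function name and all inputs are hypothetical, and the sketch assumes one image set per patient and that, in the semi-automated model, human assessors regrade only the cases the AI flags as referable, as described above.

```python
def human_grading_workload(n_patients, prevalence, ai_sensitivity, ai_specificity):
    """Expected number of cases needing human grading under each screening model."""
    diseased = n_patients * prevalence
    healthy = n_patients - diseased
    # Cases flagged as referable by the AI: true positives plus false positives.
    ai_flagged = diseased * ai_sensitivity + healthy * (1 - ai_specificity)
    return {
        "manual": n_patients,          # every case is graded by a human
        "semi_automated": ai_flagged,  # humans regrade only AI-flagged cases
        "fully_automated": 0,          # no human grading step
    }


# Hypothetical inputs: 10,000 patients, 10% referable DR, AI at 90%/90%.
workload = human_grading_workload(10_000, 0.10, 0.90, 0.90)
print({model: round(count) for model, count in workload.items()})
```

With these assumed inputs the semi-automated model leaves about 1,800 cases for human assessors instead of 10,000. The sketch does not capture the safety side of the comparison: cases missed by the AI never reach a human grader in either AI-based model.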

Discussion

In this article, we provide a brief summary of the literature regarding the implementation of AI-based DR screening programs, highlighting the need for HEA and safety analyses. A brief discussion of the various types of HEA and safety analysis methods, and when to use them in the evaluation of new technologies (e.g., AI), is also included. Practical considerations for the implementation of an AI-based DR screening program have also been outlined, such as the clinical implications of FN rates, FP rates, and image gradeability. Developing screening tools with a low FN rate has been highlighted as a clinically relevant goal due to patient safety implications. A balance between minimizing FPs and minimizing FNs needs to be determined based on the intended clinical context. The ideal balance for each context will ultimately be governed by the cost of provider manpower (for adjudication or review of FPs), availability of relevant resources (e.g., various forms of imaging), and the needs of the population it serves (e.g., disease prevalence). Besides screening thresholds, image gradeability is another consideration in evaluating the performance of a screening program. This is an important modifiable factor that could affect the HEA of a DLS screening program owing to the costs of reacquiring or regrading images, and it should be included in the evaluation of AI-based solutions. Where relevant resources and technical capabilities are available, additional sources of information, such as three-dimensional OCT scans, may be incorporated to reduce the FP rate in the application of DLS for DR screening, in the same way they have been applied to other eye diseases. This article primarily discusses the role of AI-based solutions for DR screening, which has established cost-effectiveness and has been incorporated in evidence-based practice given improved outcomes with early detection and treatment.
AI-based solutions to screen for other major eye diseases, such as glaucoma and age-related macular degeneration (AMD), have also been developed. However, population screening for these conditions is not yet widely accepted because the HEA evidence is inconclusive, and clinically acceptable screening performance may vary for these conditions. That being said, incorporation of AI-based solutions may lower manpower costs and help make population screening for these conditions more affordable. Furthermore, Ting et al. have demonstrated that a single AI-based solution for DR screening could be trained to simultaneously detect referable AMD and glaucoma for broad-based eye screening. These considerations will need to be addressed in future research studying the implementation of AI-based solutions for eye screening. Looking ahead, future research using the tools outlined for HEA and safety analyses is needed to achieve a better understanding of the implementation of AI-based solutions in different settings (e.g., resource-limited settings, remote areas) and with novel screening models (e.g., fully-automated DLS). The required transitions in service delivery, along with their associated requirements and costs, also need to be investigated. These include transitioning from opportunistic or population DR screening, with or without teleophthalmology services, to DR screening that incorporates AI-based solutions.
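The dependence of the FP/FN balance on context, discussed earlier in this section, can be made concrete with a small Python sketch of the expected error-related cost per patient screened. This is a simplified illustration under stated assumptions, not a health economic model from this article: the `OperatingPoint` class, the `expected_error_cost` function, the cost parameters, and all numeric values are hypothetical, and the sketch assumes a fixed cost per FN, per FP, and per ungradable image.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """A hypothetical screening threshold, summarized by its accuracy."""
    sensitivity: float
    specificity: float


def expected_error_cost(point, prevalence, cost_fn, cost_fp,
                        ungradable_rate=0.0, cost_regrade=0.0):
    """Expected error-related cost per patient screened (illustrative only).

    Ungradable images incur a reacquisition/regrading cost; gradable
    images incur FN and FP costs weighted by disease prevalence.
    """
    gradable = 1.0 - ungradable_rate
    fn_fraction = prevalence * (1.0 - point.sensitivity)
    fp_fraction = (1.0 - prevalence) * (1.0 - point.specificity)
    return (ungradable_rate * cost_regrade
            + gradable * (fn_fraction * cost_fn + fp_fraction * cost_fp))


if __name__ == "__main__":
    # Two assumed thresholds: one favoring sensitivity, one favoring specificity.
    high_sens = OperatingPoint(sensitivity=0.95, specificity=0.80)
    high_spec = OperatingPoint(sensitivity=0.85, specificity=0.95)

    # Assumed costs in arbitrary units; vary cost_fn to see the ranking change.
    for cost_fn in (1000.0, 200.0):
        for name, point in (("high_sens", high_sens), ("high_spec", high_spec)):
            cost = expected_error_cost(point, prevalence=0.10,
                                       cost_fn=cost_fn, cost_fp=50.0)
            print(f"cost_fn={cost_fn:.0f} {name}: {cost:.2f} per patient")
```

With these assumed inputs, the threshold that looks preferable changes as the assumed cost of a missed case changes, which mirrors the point that the ideal balance depends on manpower costs, available resources, and disease prevalence. A full HEA would also need to account for downstream outcomes that this sketch omits.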

Conclusions

To facilitate the real-world integration of AI-based solutions, future studies should also assess the technical feasibility and patient acceptability of implementing these solutions in various primary eye care settings. As these AI-based solutions will influence the practice of ophthalmology and medicine in the near future, it is important to create mechanisms for direct users (such as optometrists or clinicians) to evaluate and utilize such “black box” AI-based screening programs in clinical practice. Therefore, studies that evaluate health professionals’ acceptance of AI and its interpretability will be useful for identifying barriers to adoption and developing targeted solutions accordingly.

