Developing Clinical Decision Support System using Machine Learning Methods for Type 2 Diabetes Drug Management.

Rajiv Singla¹, Shivam Aggarwal², Jatin Bindra², Arpan Garg², Ankush Singla².

Abstract

Background and Objectives: Application of artificial intelligence/machine learning (AI/ML) for automation of diabetes management can enhance equitable access to care and ensure delivery of minimum standards of care. The objective of the current study was to create a clinical decision support system using a machine learning approach for diabetes drug management in people living with Type 2 diabetes.

Methodology: The study was conducted at an endocrinology clinic, and data were collected from the electronic clinic management system. A total of 15485 diabetes prescriptions of 4974 patients were accessed. A data subset of 1671 diabetes prescriptions of 940 patients with information on diabetes drugs, demographics (age, gender, body mass index), biochemical parameters (HbA1c, fasting blood glucose, creatinine) and patient clinical parameters (diabetes duration, compliance to diet/exercise/medications, hypoglycemia, contraindication to any drug, summary of patient self-monitoring of blood glucose data, diabetes complications) was used in the analysis. Patient variables were used as input to predict all diabetes drug classes to be prescribed. Random forest algorithms were used to create decision trees for all diabetes drugs.

Results and Conclusion: Accuracy for predicting use of each individual drug class varied from 85% to 99.4%. Multi-drug accuracy, indicating that all drug predictions in a prescription are correct, stands at 72%. Multi-drug class accuracy in clinical application may be higher than this result because, in many clinical scenarios, two or more diabetes drugs may be used interchangeably. This report presents a first positive step in developing a robust clinical decision support system to transform access and quality of diabetes care.

Keywords: Clinical decision support; drug management; machine learning; type 2 diabetes

Year: 2022; PMID: 35662766; PMCID: PMC9162252; DOI: 10.4103/ijem.ijem_435_21

Source DB: PubMed; Journal: Indian J Endocrinol Metab; ISSN: 2230-9500

INTRODUCTION

The prevalence of diabetes is increasing exponentially in India. The International Diabetes Federation puts the prevalence of diabetes in India at 8.9%, while the latest estimates from India and neighboring countries show a much higher prevalence in urban centers.[1,2,3] Moreover, there is a huge population with prediabetes.[2] Identifying populations at risk for diabetes, and then diagnosing and treating them, is going to put tremendous pressure on health resources in south-east Asian countries. The ratio of qualified health workers to population in India is already one of the lowest in the world. There are 6.5 physicians per 10000 population in India as compared to 15.6 physicians per 10000 population worldwide.[4] Moreover, only 55–60% of them hold a valid degree to practice modern medicine and only 18.2% have post-graduate degrees.[5,6] In all of India, there are only 1200–1300 qualified endocrinologists. With burgeoning diabetes prevalence and stretched health resources, the use of technology may be the way forward to scale diabetes care.

Another issue with diabetes care in India is the lack of enforcement of minimum standards of care. Health spending in India is largely out-of-pocket expenditure for consumers.[7] This creates great inequality in both access to care and quality of care. Technology can again help by ensuring a minimum quality of care for the poorest of the poor. Automation of diabetes management can enhance equitable access to care and ensure delivery of minimum standards of care.[8,9] However, literature on the use of machine learning (ML) in diabetes drug management remains scarce.[8,10,11] In the current study, we suggest a novel approach for automation of diabetes management that uses real-world data, including patient behavioral factors, and applies machine learning algorithms to create pathways for diabetes drug usage.

RESEARCH DESIGN

Any decision to escalate or de-escalate diabetes drug therapy, and to use a particular drug class, is based on baseline patient characteristics such as age, sex, renal function, duration of diabetes, and body mass index, and also on variable parameters such as current glycemic control, current treatment regimen, hypoglycemic events, and self-monitoring of blood glucose data. The requirement for medicines is also greatly influenced by patient behavioral factors such as compliance to medicines and adherence to diet and exercise, among others. Availability of these clinical data in a usable and objective form can allow an artificial intelligence (AI)/ML approach to be used to automate diabetes drug prescription.

DATASET AND METHODOLOGY

Retrospective, cross-sectional data were extracted from the electronic medical records (EMR) of an endocrine practice. The practice EMR is structured not only to capture clinical and biochemical parameters objectively during each physician-patient consultation but also to record behavioral factors displayed by the patient in the period since the last visit. The treating endocrinologists ask patients these behavioral questions directly and record the answers. These behavioral parameters are recorded objectively on scales of 3 to 5 points (adherence to diet and to exercise each on a scale of 5, compliance to medicine on a scale of 3, self-monitoring of blood glucose values on a scale of 5), and the importance of these factors in diabetes management has been published previously.[12] A data subset of diabetes prescriptions with complete information on diabetes drugs, demographics (age, gender, body mass index), biochemical parameters (HbA1c, fasting blood glucose, creatinine), and patient clinical and behavioral parameters (diabetes duration, compliance to diet/exercise/medications, hypoglycemia, contraindication to any drug, summary of patient self-monitoring of blood glucose data, diabetes complications) was identified and used for final analysis.

For analysis, a supervised machine learning process was used. In supervised machine learning, the software creates its own pathways/rules to delineate the relationship between given inputs and outcomes. Each patient-physician interaction that results in a prescription is one data point for the purpose of this study [Figure 1]. Each data point records the patient input parameters (the patient's clinical and behavioral factors) as well as an output parameter (the anti-diabetes drug to be used on the basis of the input parameters). In a nutshell, the purpose is to make a machine learn the process of prescribing anti-diabetes drugs when provided with appropriate clinical data [Figure 2].
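To make the notion of a data point concrete, the sketch below models one prescription event as a Python dataclass. It is only an illustration: the study does not publish its schema, so every field name, the coding of the behavioral scales as 1..N, and the example values are assumptions.

```python
from dataclasses import dataclass

# Scale sizes (5, 5, 3, 5) come from the text; coding them as 1..N is an assumption.
SCALE_MAX = {
    "diet_adherence": 5,
    "exercise_adherence": 5,
    "medication_compliance": 3,
    "smbg_summary": 5,
}


@dataclass
class PrescriptionRecord:
    """One physician-patient consultation that produced a prescription."""
    # Input parameters (hypothetical, illustrative subset)
    age_years: float
    bmi: float
    hba1c_percent: float
    diabetes_duration_years: float
    hypoglycemia: bool
    diet_adherence: int
    exercise_adherence: int
    medication_compliance: int
    smbg_summary: int
    previous_drug_classes: frozenset = frozenset()
    # Output parameter: drug classes actually prescribed at this visit
    prescribed_drug_classes: frozenset = frozenset()

    def __post_init__(self):
        # Reject behavioral scores that fall outside their assumed scale.
        for name, upper in SCALE_MAX.items():
            value = getattr(self, name)
            if not 1 <= value <= upper:
                raise ValueError(f"{name} must be in 1..{upper}, got {value}")


# Made-up example values, not taken from the study data.
record = PrescriptionRecord(
    age_years=58, bmi=27.4, hba1c_percent=7.8, diabetes_duration_years=9,
    hypoglycemia=False, diet_adherence=3, exercise_adherence=2,
    medication_compliance=3, smbg_summary=4,
    previous_drug_classes=frozenset({"Metformin", "Sulfonylureas"}),
    prescribed_drug_classes=frozenset({"Metformin", "Sulfonylureas", "DPP4i"}),
)
```

Keeping the inputs and the prescribed drug classes in the same record mirrors the input/output pairing that supervised learning needs.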

Figure 1

Definition of a data point for current study

Figure 2

Input variables, process and output parameters for developing machine learning algorithms for diabetes drug prescription

For the purpose of developing and validating the machine learning process, 67% of the dataset was used as a training set and 33% as a testing set [Figure 3]. The software learns from the data of the training set to create rules/pathways, and these rules are tested on the data of the testing set. Many techniques can be used to create these pathways/rules in a supervised machine learning process, e.g., support vector machines, random forest algorithms, neural networks, case-based reasoning, and k-nearest neighbors. The composition of the data and the nature of the outcome required usually guide the selection of an appropriate machine learning technique. In the current study, random forest algorithms were used to create decision trees for each of the diabetes drug classes, leading to the development of 12 different pathways, one for each drug class available in India. The random forest decision tree algorithm mirrors clinical decisions and performed best on our data.[13]
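The split and the one-pathway-per-drug-class setup can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the seed are hypothetical, a simple random shuffle is assumed, and rounding the test-set size up is an assumption chosen because it reproduces the reported 1119/552 counts. The source does not say whether prescriptions from the same patient were kept in the same set.

```python
import math
import random


def split_train_test(records, test_fraction=0.33, seed=0):
    """Shuffle records and split them; the test-set size is rounded up."""
    n_test = math.ceil(len(records) * test_fraction)
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


def binary_targets(prescriptions, drug_classes):
    """Turn each prescription (a set of drug classes) into one 0/1 label per class.

    Each drug class then becomes its own yes/no prediction problem.
    """
    return {
        drug: [1 if drug in prescribed else 0 for prescribed in prescriptions]
        for drug in drug_classes
    }


# 1671 placeholder record ids stand in for the real prescriptions.
train, test = split_train_test(range(1671))
print(len(train), len(test))  # 1119 552

# Made-up prescriptions, for illustration only.
targets = binary_targets(
    [{"Metformin", "DPP4i"}, {"Metformin"}, {"Sulfonylureas"}],
    ["Metformin", "Sulfonylureas", "DPP4i"],
)
print(targets["Metformin"])  # [1, 1, 0]
```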

Figure 3

Process of machine learning analysis and validation

These algorithms were validated on the testing dataset, and the accuracy of prediction in comparison to actual usage is presented as the result [Figure 3]. Analysis was done in Python using tools such as Jupyter Notebook and open-source libraries and toolkits. The research was carried out in adherence to the principles outlined in the World Medical Association's Declaration of Helsinki.

RESULTS

The EMR held 15485 prescriptions from 4974 patients who visited the lead author's practice from May 2015 to February 2020 with a diagnosis of Type 2 diabetes mellitus. A subset of 1671 prescriptions (training set n = 1119; testing set n = 552) from 940 patients with complete records was used as the final dataset [Figure 4].

Figure 4

Process of data extraction, data preparation and selection of final dataset for analysis

Each prescription is one data point in this study; of the 1671 data points in total, 946 had HbA1c recorded. The mean HbA1c across these data points was 7.3% ± 1.3% (56 ± 9.9 mmol/mol), with a median of 7.05% (54 mmol/mol) (n = 946). This figure may underestimate the glycemic control of the study population, because the HbA1c values come from first interactions as well as from follow-ups. The value matters because any machine learning algorithm can only be as good as the data used to create it. In other words, the target HbA1c attainable with these algorithms would be ≤7.3% ± 1.3%.

In most of the literature and guidelines, treatment escalation/de-escalation is based on the current glycemic status, the risk of hypoglycemia, and current treatment. The compliance of a person with diabetes with diet, medicines, and exercise has not been objectively incorporated into the decision process. We did not use the larger dataset of 15485 data points, which has records of age, BMI, sex, current prescription drugs, and current glycemic status, because the random forest decision trees showed that behavioral factors carry significant weight in deciding the drug prescription [Figure 4].

Random forest decision tree algorithms for each drug class were generated using data in the training set. In a nutshell, the software has access to a training dataset with all the input parameters and the desired outcomes. The software then connects the input parameters to the output parameters with its own rules, and these rules are known as decision trees. For example, the decision to use the sulfonylurea class of drugs in a particular scenario would depend first on its previous use, and the algorithm would then assess other factors such as hypoglycemia incidence, BMI, and fasting blood glucose. While a clinician might use the same factors to arrive at a decision, the algorithm assigns an order and importance to each of these factors based on the data.

The accuracy of these decision trees was tested in the testing set (the process is delineated in Figure 3). To test the accuracy of the decision trees generated by the random forest algorithms, the software is given access only to the input parameters of the testing dataset, and an output is generated. This output is compared with the actual output already available to us. Table 1 outlines the accuracy parameters for each drug class when these algorithms were applied to the testing set.
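How a tree comes to assign an order to the factors can be illustrated with the impurity criterion that classification trees commonly use. The sketch below ranks three candidate yes/no factors by Gini impurity decrease on made-up data. It is a simplified illustration under stated assumptions: the paper does not say which split criterion was used, the feature names and values are hypothetical, and a real random forest repeats this choice at every node of many trees, each grown on a bootstrap sample with a random subset of features.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def gini_gain(feature, labels):
    """Impurity decrease obtained by splitting on a binary (0/1) feature."""
    left = [y for x, y in zip(feature, labels) if x == 0]
    right = [y for x, y in zip(feature, labels) if x == 1]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Made-up toy data: 8 visits; label = sulfonylurea prescribed (1) or not (0).
prescribed = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "previous_use": [1, 1, 1, 1, 0, 0, 0, 1],
    "recent_hypoglycemia": [0, 0, 1, 0, 1, 1, 0, 1],
    "bmi_over_25": [1, 0, 1, 0, 1, 0, 1, 0],
}

# The factor with the largest impurity decrease would be split on first.
ranking = sorted(
    features,
    key=lambda name: gini_gain(features[name], prescribed),
    reverse=True,
)
print(ranking)  # ['previous_use', 'recent_hypoglycemia', 'bmi_over_25']
```

In this toy example previous use separates the two outcomes best, so it would sit at the top of the tree, with the other factors refining the decision further down.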

Table 1

Drug class	True positive predictions	False positive predictions	True negative predictions	False negative predictions	Accuracy (%)
Sulfonylureas	286	27	211	28	90.04
Metformin	20	6	520	6	97.83
DPP 4i	250	46	219	37	84.96
Pioglitazone	18	12	513	9	96.19
SGLT2i	70	28	440	14	92.40
AGI	24	2	543	2	96.92
Basal Insulin	17	8	519	8	97.10
Premix Insulin	44	7	494	7	97.46
Data for other drug classes, viz. short-acting insulin, GLP1 agonists, meglitinides, and saroglitazar, are not shown here; their use was very sparse in the current dataset.

True positive prediction means use of a particular anti-diabetes drug is suggested by machine learning algorithm and this suggestion is found to be accurate when compared to actual prescription. False positive means use suggested by machine but not used in actual prescription

Accuracy of individual drug class prescription on evaluating machine learning algorithms with testing set data (n = 552). (DPP4i: Dipeptidyl peptidase-4 (DPP-4) inhibitors; SGLT2i: Sodium-glucose cotransporter-2 (SGLT2) inhibitors; AGI: Alpha-glucosidase inhibitors)

Major factors impacting the decision to continue, escalate, or de-escalate treatment included previous use of the drug class, HbA1c, the incidence of hypoglycemia, dietary compliance, compliance to medicines, age, BMI, intolerance to a particular drug class, and other drug classes being used. The factor that overwhelmingly determined the prescription of a drug class was its presence in the previous prescription. Different factors were important for different drug classes [Table 2]. Behavioral factors were important in clinical situations where escalation or de-escalation of therapy was not clearly indicated based on glycemic control or hypoglycemia.
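The accuracy column of Table 1 follows from the four counts in each row: accuracy is the share of correct predictions, (TP + TN) / (TP + FP + TN + FN). The short sketch below recomputes it for two rows of the table. Sensitivity is included only as an additional illustration; it is not a metric reported in the study, and the function names are hypothetical.

```python
def accuracy_percent(tp, fp, tn, fn):
    """Share of all predictions that were correct, in percent."""
    return 100 * (tp + tn) / (tp + fp + tn + fn)


def sensitivity_percent(tp, fn):
    """Share of actual uses of the drug class that the model predicted."""
    return 100 * tp / (tp + fn)


# Counts copied from Table 1 (testing set, n = 552): (TP, FP, TN, FN).
rows = {
    "Sulfonylureas": (286, 27, 211, 28),
    "Basal Insulin": (17, 8, 519, 8),
}

for drug, (tp, fp, tn, fn) in rows.items():
    acc = round(accuracy_percent(tp, fp, tn, fn), 2)
    sens = round(sensitivity_percent(tp, fn), 1)
    print(drug, acc, sens)
# Sulfonylureas 90.04 91.1
# Basal Insulin 97.1 68.0
```

For a sparsely used class such as basal insulin, most of the correct predictions are true negatives, so accuracy alone can look high even when a smaller share of actual uses is predicted.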

Table 2

Factors affecting prescription of an individual diabetes drug class as created by machine learning algorithms

Drug class	Major factors impacting decision to continue/escalate/de-escalate treatment
Sulfonylureas	1. Pre-existing sulfonylurea use 2. HbA1c 3. Fasting BG on home log 4. Duration of diabetes 5. Age 6. Incidence of hypoglycemia
DPP4i	1. Pre-existing DPP4i use 2. Intolerance to DPP4i/SGLT2i 3. PPBG on home log 4. Exercise compliance 5. HbA1c 6. Diet compliance
SGLT2i	1. Pre-existing SGLT2i use 2. HbA1c 3. Compliance to medication 4. Age 5. BMI 6. Intolerance to SGLT2i
Metformin	1. Pre-existing metformin use 2. HbA1c 3. Diet compliance 4. Hypoglycemia 5. Other drugs that are being used 6. BMI
Pioglitazone	1. Pre-existing pioglitazone use 2. BMI 3. HbA1c 4. Fasting BG on home log 5. Co-morbidities 6. Use of other drugs
AGI	1. Pre-existing AGI use 2. Incidence of hypoglycemia 3. BMI 4. Age 5. Diet compliance 6. PPBG on home log
Premix insulin	1. Pre-existing premix insulin use 2. Fasting BG on home log 3. Age 4. Use of insulin secretagogues 5. PPBG on home log 6. Duration of diabetes
Basal insulin	1. Pre-existing basal insulin use 2. Age 3. Incidence of hypoglycemia 4. HbA1c 5. Any intermittent illness 6. Diet adherence
BG: blood glucose; PPBG: post-prandial blood glucose.

Complete prescription accuracy (the proportion of prescriptions in the validation dataset for which the use of every drug class was predicted correctly by the algorithm) was 72%.
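Complete prescription accuracy is an exact-match measure: a prescription counts as correct only if the predicted set of drug classes equals the prescribed set. A minimal sketch on made-up data follows; the function name and the example prescriptions are hypothetical.

```python
def complete_prescription_accuracy(actual, predicted):
    """Fraction of prescriptions whose predicted drug-class set matches exactly."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    matches = sum(1 for a, p in zip(actual, predicted) if set(a) == set(p))
    return matches / len(actual)


# Made-up example with four prescriptions; the third differs in one drug class.
actual = [
    {"Metformin", "Sulfonylureas"},
    {"Metformin"},
    {"Metformin", "DPP4i"},
    {"Basal Insulin", "Metformin"},
]
predicted = [
    {"Metformin", "Sulfonylureas"},
    {"Metformin"},
    {"Metformin", "SGLT2i"},
    {"Basal Insulin", "Metformin"},
]
print(complete_prescription_accuracy(actual, predicted))  # 0.75
```

Because a single wrong drug class makes the whole prescription count as wrong, this measure is stricter than any of the per-class accuracies.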

DISCUSSION

Artificial intelligence in diabetes care is coming of age, with major breakthroughs in diagnostics and complication prediction. The use of fundus photography with artificial intelligence techniques to detect retinopathy in people with diabetes has already been approved by the Food and Drug Administration (FDA), USA.[14] Kuo et al.[15] could predict renal function and chronic kidney disease status from ultrasound images with an accuracy of 85.6%, higher than that of experienced nephrologists (60.3%–80.1%). However, decisions related to drug management have been difficult to automate. The reasons include (a) unavailability from diabetes clinics of patient behavioral data that affect the achievement of glycemic control, (b) data on drugs and glycemic control seldom being captured in a usable form, and (c) lack of achievement of ideal glycemic control in the real world, which limits the efficacy of potential AI systems.[16,17] A short review of studies applying the AI approach to diabetes drug management has been published previously.[7,10]

The current study lays down a framework for collecting data that encompass all factors required to make a drug management decision in routine clinical practice [Figure 2]. Previous studies in this sphere have tried to predict the sequence of anti-diabetes drug classes to be used based on the drugs previously used and current glycemic control.[18,19,20] However, behavioral factors play an important role in deciding drug management in routine clinical practice.[12] For example, at the same level of HbA1c with the same baseline anti-diabetes drugs, two individuals would receive different prescriptions if one of them is not taking drugs regularly while the other shows full compliance.

Single drug prediction accuracy is very high in the current study, indicating very little noise in the data. This may be because the data come from a single practice. Drugs that are used overwhelmingly (Metformin) or sparsely (e.g., pioglitazone) have accuracy >95%, indicating a clear clinical decision for the use of these drugs. Drugs that are used as second or third line show relatively lower accuracy, ranging from 84% to 92%, suggestive of interchangeable use in clinical practice. Wright et al.[18] devised a recommender system based on sequential pattern mining that was able to predict the medication prescribed, within three attempts, for 90.0% of patients when making predictions by drug class. This may have limited clinical utility, as the choice of add-on follow-up drug classes is limited anyway in clinical practice. The accuracy of predicting the next drug improved when clinical information (lab data and physical data) was added to the current treatment regimen.[16] The k-nearest neighbor approach to suggesting a diabetes drug regimen using age, sex, race, BMI, treatment history, and diabetes progression as variables has been shown to reduce HbA1c by 0.44 ± 0.03% (4.8 ± 0.3 mmol/mol) (P < 0.001).[21] However, this conclusion was based on the projected benefit obtained by averaging the outcome of cases with similar recommendations in actual data.[21] Complete prescription accuracy in the current study stands at 72% when validated on the testing dataset. In practice, this accuracy may be higher because, at any given decision point for drug prescription, there may be two or more equally good choices. This parameter has not been reported in any of the previous studies.

Limitations of this study include the use of a dataset from a single practice, which has both advantages and disadvantages. The advantages include greater uniformity of data and thus less noise. Moreover, datasets with better glycemic control than a cumulative national record can increase the efficiency of the AI/ML system being deployed.[22] The main disadvantage is the introduction of personal bias in the use of anti-diabetes drugs into the system. Also, collecting a dataset large enough to build robust AI/ML systems may take a long time when the data come from a single centre. Drugs other than anti-diabetes drugs were excluded from the decision algorithm, as their impact on the decision to use a particular anti-diabetes drug is considered to be only minor. This approach needs validation across multiple practices for wider application. Currently, predictions are at the drug class level; predictions at the drug dosage level would make the system more useful in clinical practice.

CONCLUSION

The current dataset shows an accuracy of 84–97% for predicting the use of individual drug classes. Complete prescription accuracy is 72%. This report presents a first step in developing a robust clinical decision support system for diabetes management and lays down a framework for data collection for future machine learning algorithms in this field.

1. Artificial Intelligence: The Future for Diabetes Care.

Authors: Samer Ellahham
Journal: Am J Med Date: 2020-04-20

2. Development and Validation of a Deep Learning Algorithm for Detection of Diabetic Retinopathy in Retinal Fundus Photographs.

Authors: Varun Gulshan; Lily Peng; Marc Coram; Martin C Stumpe; Derek Wu; Arunachalam Narayanaswamy; Subhashini Venugopalan; Kasumi Widner; Tom Madams; Jorge Cuadros; Ramasamy Kim; Rajiv Raman; Philip C Nelson; Jessica L Mega; Dale R Webster
Journal: JAMA Date: 2016-12-13

3. The use of sequential pattern mining to predict next prescribed medications.

Authors: Aileen P Wright; Adam T Wright; Allison B McCoy; Dean F Sittig
Journal: J Biomed Inform Date: 2014-09-16

4. Effectiveness of an mHealth-Based Electronic Decision Support System for Integrated Management of Chronic Conditions in Primary Care: The mWellcare Cluster-Randomized Controlled Trial.

Authors: Dorairaj Prabhakaran; Dilip Jha; David Prieto-Merino; Ambuj Roy; Kavita Singh; Vamadevan S Ajay; Devraj Jindal; Priti Gupta; Dimple Kondal; Shifalika Goenka; Pramod Jacob; Rekha Singh; B G Prakash Kumar; Pablo Perel; Nikhil Tandon; Vikram Patel
Journal: Circulation Date: 2018-11-10

5. Drug Prescription Patterns and Cost Analysis of Diabetes Therapy in India: Audit of an Endocrine Practice.

Authors: Rajiv Singla; Jatin Bindra; Ankush Singla; Yashdeep Gupta; Sanjay Kalra
Journal: Indian J Endocrinol Metab Date: 2019 Jan-Feb

6. Automation of the kidney function prediction and classification through ultrasound-based kidney imaging using deep learning.

Authors: Chin-Chi Kuo; Chun-Min Chang; Kuan-Ting Liu; Wei-Kai Lin; Hsiu-Yin Chiang; Chih-Wei Chung; Meng-Ru Ho; Pei-Ran Sun; Rong-Lin Yang; Kuan-Ta Chen
Journal: NPJ Digit Med Date: 2019-04-26