Luis Fernando Granda Morales, Priscila Valdiviezo-Diaz, Ruth Reátegui, Luis Barba-Guaman.
Abstract
BACKGROUND: Diabetes is a public health problem worldwide. Although diabetes is a chronic and incurable disease, measures and treatments can be taken to control it and keep the patient stable. Diabetes has been the subject of extensive research, ranging from disease prevention to the use of technologies for its diagnosis and control. Health institutions obtain information required for the diagnosis of diabetes through various tests, and appropriate treatment is provided according to the diagnosis. These institutions have databases with large volumes of information that can be analyzed and used in different applications such as pattern discovery and outcome prediction, which can help health personnel in making decisions about treatments or determining the appropriate prescriptions for diabetes management.
Keywords: chronic disease; clustering; collaborative filtering; data mining; diabetes; drug; machine learning; patient information; recommend; recommender system
Year: 2022 PMID: 35838763 PMCID: PMC9338420 DOI: 10.2196/37233
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 7.076
Figure 1. Distribution of patients according to age and readmission.
Variables discarded from the data set.
| Variable | Discard reason |
| encounter_id^a | Irrelevant variable for clustering |
| patient_nbr^b | Irrelevant variable for clustering |
| payer_code^c | Irrelevant variable for clustering |
| Weight | Data missing for 97.00% (n=97,000) of the 100,000 samples |
| Medical specialty | Data missing for 53.00% (n=53,000) of the 100,000 samples |
| Chlorpropamide | Only 86 patients use this drug |
| Acarbose | Only 308 patients use this drug |
| Miglitol | Only 38 patients use this drug |
| Troglitazone | Only 3 patients use this drug |
| Examide | No patient uses this drug |
| Citoglipton | No patient uses this drug |
| Glipizide_metformin | Only 13 patients use this drug |
| Glimepiride_pioglitazone | Only 1 patient uses this drug |
| Metformin_rosiglitazone | Only 2 patients use this drug |
| Metformin_pioglitazone | Only 1 patient uses this drug |
| Acetohexamide | Only 1 patient uses this drug |
| Tolbutamide | Only 23 patients use this drug |
| Tolazamide | Only 39 patients use this drug |
^a encounter_id: identifier of a specific hospital visit or patient encounter.
^b patient_nbr: patient ID number.
^c payer_code: identifier corresponding to 23 distinct values of payment method (eg, Blue Cross/Blue Shield, Medicare, patient payment).
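The discard reasons in the table follow two simple rules: drop variables that are mostly missing, and drop drugs that very few patients use. The following is a minimal sketch of such a filter, not the authors' code. The function name `columns_to_drop`, both thresholds, and the convention that the string "No" marks an unused drug are assumptions made for illustration; the toy rows are invented.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def columns_to_drop(
    rows: List[Row],
    drug_columns: List[str],
    max_missing_fraction: float = 0.5,  # hypothetical threshold
    min_drug_users: int = 500,          # hypothetical threshold
) -> List[str]:
    """Return columns that are mostly missing, plus drugs with too few users."""
    n = len(rows)
    drop: List[str] = []
    # Rule 1: too many missing values (None marks a missing entry here).
    for col in rows[0]:
        missing = sum(1 for r in rows if r[col] is None)
        if missing / n > max_missing_fraction:
            drop.append(col)
    # Rule 2: drug prescribed to too few patients ("No" = not used, an assumption).
    for col in drug_columns:
        users = sum(1 for r in rows if r[col] not in (None, "No"))
        if users < min_drug_users and col not in drop:
            drop.append(col)
    return drop


# Invented toy data: weight is mostly missing, miglitol has a single user.
rows: List[Row] = [
    {"weight": None, "insulin": "Steady", "miglitol": "No"},
    {"weight": None, "insulin": "Up", "miglitol": "No"},
    {"weight": "[75-100)", "insulin": "No", "miglitol": "Steady"},
    {"weight": None, "insulin": "Down", "miglitol": "No"},
]
dropped = columns_to_drop(rows, ["insulin", "miglitol"],
                          max_missing_fraction=0.5, min_drug_users=2)
```

Identifier columns such as encounter_id would still need to be removed by name, since they are discarded for relevance rather than for data quality.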
Patient characteristics, description, and their corresponding values.
| Variable | Description | Values |
| Gender | Patient gender (self-identified) | 0, 1 |
| Age | Patient age (years) | 5, 15, 25, 35, 45, …, 95 |
| admission_type_id | Identifier corresponding to 8 different types of admissions: emergencies, accidents, newborns, and others | 1-8 |
| discharge_disposition_id | Identifier of the discharge type (eg, discharged to home, psychiatric hospital, medical facility) | 1-28 |
| admission_source_id | Identifier of the admission source (eg, transfer from hospice, transfer from an ambulatory surgery center) | 1-11, 13-14, 20, 22, 25 |
| time_in_hospital | Number of days between admission and discharge | 1-14 |
| num_lab_procedures | Number of laboratory tests performed during the encounter | 1-132 |
| num_procedures | Number of procedures performed during the encounter | 0-6 |
| num_medications | Number of different drugs (generic names) administered during the encounter | 1-81 |
| number_outpatient | Number of outpatient visits | 0-42 |
| number_emergency | Number of emergency visits | 0-76 |
| number_inpatient | Number of inpatient visits | 0-21 |
| number_diagnoses | Number of diagnoses | 1-16 |
| max_glu_serum | Result range of the serum glucose test, or an indication that the test was not performed | 0, 1 |
| a1cresult | Result range of the hemoglobin A1C test, or an indication that the test was not performed | 0, 1 |
| change | Whether there was a change in medication | 0, 1 |
| diabetesmed | Whether the patient was prescribed medication for diabetes | 0, 1 |
| readmitted | Days to inpatient readmission; these categories will be relabeled | 0, 1, 2 |
| African American, Asian, Caucasian, Hispanic, Other | Patient’s race | 0, 1 |
| circulatory | If the patient is admitted with circulatory system problems, the variable takes the value of 1 | 0, 1 |
| diabetes | If the patient is admitted with diabetes-related problems, the variable takes the value of 1 | 0, 1 |
| digestive | If the patient is admitted with digestive system problems, the variable takes the value of 1 | 0, 1 |
| genitourinary | If the patient is admitted with genitourinary problems, the variable takes the value of 1 | 0, 1 |
| injury | If the patient is admitted with injuries, the variable takes the value of 1 | 0, 1 |
| musculoskeletal | If the patient is admitted with musculoskeletal problems, the variable takes the value of 1 | 0, 1 |
| neoplasms | If the patient is admitted with neoplasms, the variable takes the value of 1 | 0, 1 |
| respiratory | If the patient is admitted with respiratory system problems, the variable takes the value of 1 | 0, 1 |
| other2 | If the patient is admitted with other complications, the variable takes the value of 1 | 0, 1 |
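Two of the encodings in the table can be illustrated with a short sketch. The age values (5, 15, …, 95) look like midpoints of ten-year brackets, and race appears as one binary column per category. The code below assumes, hypothetically, that the raw data stores age as strings such as "[40-50)"; the helper names `age_bracket_midpoint` and `one_hot_race` are invented for this example and are not taken from the paper.

```python
from typing import Dict

# Category names as listed in the table.
RACE_COLUMNS = ["African American", "Asian", "Caucasian", "Hispanic", "Other"]


def age_bracket_midpoint(bracket: str) -> int:
    """Map a ten-year bracket such as "[40-50)" to its midpoint (45)."""
    low, high = bracket.strip("[)").split("-")
    return (int(low) + int(high)) // 2


def one_hot_race(race: str) -> Dict[str, int]:
    """Return one 0/1 indicator per race category; unknown labels give all zeros."""
    return {name: int(name == race) for name in RACE_COLUMNS}


midpoint = age_bracket_midpoint("[40-50)")
indicators = one_hot_race("Asian")
```

Numeric midpoints keep the ordering of the age groups, which matters for distance-based clustering; the binary race columns avoid imposing an artificial order on categories that have none.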
Final drugs used in the data set among 24 total drugs (N=100,000 patients).^a
| Drug | Patients, n (%) |
| Insulin | 54,383 (53.44) |
| Metformin | 19,988 (19.99) |
| Glipizide | 12,686 (12.69) |
| Glyburide | 10,650 (10.65) |
| Pioglitazone | 7328 (7.33) |
| Rosiglitazone | 6365 (6.37) |
| Glimepiride | 5191 (5.19) |
| Repaglinide | 1539 (1.54) |
| Glyburide-metformin | 706 (0.71) |
| Nateglinide | 703 (0.70) |
^a Some patients were administered more than one drug.
Figure 2. Schematic of the proposed recommendation approach.
Collaborative filtering matrix.
| | Medication 1 | Medication 2 | Medication 3 | Medication 4 | Cluster |
| Patient 1 | 1 | 0 | 2 | 1 | 1 |
| Patient 2 | 0 | 1 | 1 | 1 | 2 |
| Patient 3 | 2 | 0 | 0 | 1 | 1 |
| Patient 4 | 1 | 2 | 0 | 0 | 3 |
| Patient 5 | 1 | 1 | 2 | 0 | 2 |
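To make the role of the Cluster column concrete, here is a minimal sketch of neighborhood-based collaborative filtering restricted to patients in the same cluster, using the values from the matrix above. It is an illustration under stated assumptions rather than the authors' implementation: a value of 0 is treated as "no rating", cosine similarity is used as the similarity measure, and the function names `cosine`, `predict_scores`, and `recommend` are hypothetical.

```python
from math import sqrt
from typing import Dict, List

# Matrix values copied from the table; columns are Medication 1-4.
ratings: Dict[str, List[int]] = {
    "Patient 1": [1, 0, 2, 1],
    "Patient 2": [0, 1, 1, 1],
    "Patient 3": [2, 0, 0, 1],
    "Patient 4": [1, 2, 0, 0],
    "Patient 5": [1, 1, 2, 0],
}
clusters: Dict[str, int] = {
    "Patient 1": 1, "Patient 2": 2, "Patient 3": 1, "Patient 4": 3, "Patient 5": 2,
}


def cosine(u: List[int], v: List[int]) -> float:
    """Cosine similarity between two rating vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_scores(target: str) -> Dict[int, float]:
    """Score each medication the target has not rated (value 0, an assumption),
    as a similarity-weighted average over same-cluster neighbors who rated it."""
    neighbors = [p for p in ratings
                 if p != target and clusters[p] == clusters[target]]
    scores: Dict[int, float] = {}
    for item, own in enumerate(ratings[target]):
        if own != 0:
            continue  # already rated by the target patient
        num = den = 0.0
        for p in neighbors:
            r = ratings[p][item]
            if r == 0:
                continue  # neighbor has no rating for this medication
            s = cosine(ratings[target], ratings[p])
            num += s * r
            den += abs(s)
        scores[item] = num / den if den else 0.0
    return scores


def recommend(target: str, top_n: int = 3) -> List[str]:
    """Return up to top_n medications with a positive score, best first."""
    scores = predict_scores(target)
    ranked = sorted((i for i in scores if scores[i] > 0),
                    key=lambda i: scores[i], reverse=True)
    return [f"Medication {i + 1}" for i in ranked[:top_n]]
```

With this toy matrix, Patient 3 shares cluster 1 only with Patient 1, so the only candidate is the medication that Patient 1 has rated and Patient 3 has not. Patient 4 is alone in cluster 3 and therefore receives no recommendation, which shows how a very small cluster limits what a cluster-restricted recommender can suggest.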
Comparative analysis of the performance of the algorithms.
| Algorithm | Number of clusters | Silhouette coefficient | Execution time |
| K-means | 6 | 0.654 | 15 minutes, 24 seconds |
| DBSCAN^a | 200 | 0.611 | 20 minutes, 31 seconds |
^a DBSCAN: density-based spatial clustering of applications with noise.
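The silhouette coefficient reported above measures how well each point fits its own cluster compared with the nearest other cluster: for a point with mean intra-cluster distance a and smallest mean distance b to any other cluster, the score is (b − a) / max(a, b), averaged over all points. The sketch below implements that standard definition with Euclidean distance on invented toy points; it is not the authors' evaluation code, and the function name `silhouette_coefficient` is chosen for this example. In practice, scikit-learn's `silhouette_score` computes the same quantity.

```python
from math import dist
from statistics import mean
from typing import Dict, Hashable, List, Sequence, Tuple

Point = Tuple[float, ...]


def silhouette_coefficient(points: Sequence[Point],
                           labels: Sequence[Hashable]) -> float:
    """Mean silhouette score over all points, using Euclidean distance."""
    clusters: Dict[Hashable, List[Point]] = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores: List[float] = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention: singleton clusters score 0
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's zero distance to itself adds nothing to the sum).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(mean(dist(p, q) for q in other)
                for key, other in clusters.items() if key != label)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


# Invented toy data: two tight, well-separated groups.
toy_points: List[Point] = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_coefficient(toy_points, [0, 0, 1, 1])  # matches the groups
bad = silhouette_coefficient(toy_points, [0, 1, 0, 1])   # splits each group
```

Scores close to 1 indicate compact, well-separated clusters, and negative scores indicate points that sit closer to another cluster than to their own. This is the sense in which the 0.654 for K-means compares favorably with the 0.611 for DBSCAN in the table.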
Figure 3. Details about the six clusters.
Drug recommendation for two patients with diabetes in cluster 1.
| Patient ID | Drugs recommended^a |
| 36 | Insulin, glipizide, metformin |
| 15 | Metformin, insulin, glyburide |
^a Drugs are listed in order of preference (highest to lowest prediction score).