Dina Nur Anggraini Ningrum, Woon-Man Kung, I-Shiang Tzeng, Sheng-Po Yuan, Chieh-Chen Wu, Chu-Ya Huang, Muhammad Solihuddin Muhtar, Phung-Anh Nguyen, Jack Yu-Chuan Li, Yao-Chin Wang.
Abstract
PURPOSE: To develop a deep learning model (Deep-KOA) that predicts the risk of knee osteoarthritis (KOA) within the next year from the previous three years of non-image-based electronic medical record (EMR) data. PATIENTS AND METHODS: We randomly sampled the records of two million patients from the Taiwan National Health Insurance Research Database (NHIRD), covering January 1, 1999 to December 31, 2013. During the study period, 132,594 patients were diagnosed with KOA, and 1,068,464 patients without KOA were randomly chosen as controls. We constructed a feature matrix from the three-year history of sequential diagnoses and drug prescriptions, together with age and sex. A convolutional neural network (CNN) and an artificial neural network (ANN) were combined to build the risk prediction model. We evaluated Deep-KOA using the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, and precision. We then explored the important features using stepwise feature selection.
Keywords: artificial intelligence; clinical decision support system; medical informatics application; precision medicine
Year: 2021 PMID: 34539180 PMCID: PMC8445097 DOI: 10.2147/JMDH.S325179
Source DB: PubMed Journal: J Multidiscip Healthc ISSN: 1178-2390
Figure 1. Electronic medical record (EMR) matrix and network architecture of Deep-KOA. The vertical axis of the input matrix consists of diagnostic and medication codes. The diagnostic features occupy 1029 blocks out of all 1098 International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes, and the medication features occupy 695 blocks out of all 830 World Health Organization Anatomical Therapeutic Chemical (WHO-ATC) codes. The horizontal axis spans 157 weeks (three years), and each cell holds the patient's visit history for that code in that week. Each weekly diagnosis value is divided by seven (one week consists of seven days), and each weekly medication value is divided by 28 (assuming a maximum of 4×7 prescription days per medication per week). The EMR matrix is fed to a convolutional neural network (CNN) with leaky rectified linear unit (ReLU) activations, and the static data (maximum age and sex of the patient) are fed to an artificial neural network (ANN).
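The matrix layout in the Figure 1 caption can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the function name `build_emr_matrix` and the `(row, week, value)` event tuples are hypothetical, and the raw values are assumed to be day counts within a week. Only the dimensions and the two scaling constants come from the caption.

```python
N_DIAG = 1029    # diagnosis rows (ICD-9-CM blocks), from the caption
N_MED = 695      # medication rows (WHO-ATC blocks), from the caption
N_WEEKS = 157    # weekly columns covering three years, from the caption

DIAG_SCALE = 7.0   # weekly diagnosis values are divided by seven
MED_SCALE = 28.0   # weekly medication values are divided by 28


def build_emr_matrix(diag_events, med_events):
    """Return a (N_DIAG + N_MED) x N_WEEKS matrix of scaled weekly values.

    diag_events: iterable of (diag_row, week, value), 0 <= diag_row < N_DIAG
    med_events:  iterable of (med_row, week, value), 0 <= med_row < N_MED
    The tuple layout is an assumption made for this sketch.
    """
    matrix = [[0.0] * N_WEEKS for _ in range(N_DIAG + N_MED)]
    for row, week, value in diag_events:
        matrix[row][week] += value / DIAG_SCALE
    for row, week, value in med_events:
        # Medication rows are stacked below the diagnosis rows.
        matrix[N_DIAG + row][week] += value / MED_SCALE
    return matrix


# Hypothetical patient: one diagnosis recorded on 2 days of week 10 and
# one drug with 14 prescription days in the same week.
emr = build_emr_matrix([(5, 10, 2)], [(3, 10, 14)])
print(len(emr), len(emr[0]))   # matrix shape: rows, columns
print(emr[5][10], emr[N_DIAG + 3][10])
```

Stacking the two code families into one matrix gives 1029 + 695 = 1724 rows, so each patient is represented as a 1724 × 157 input for the CNN branch.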
Demographics of Sampled Dataset
| Characteristics | KOA Group (n=132,594) | Control Group (n=1,068,464) |
|---|---|---|
| Race/Ethnicity | All Asian | All Asian |
| Age, years | | |
| Mean (±SD) | 64.20 (±12.49) | 51.00 (±15.79) |
| Minimum | 25 | 25 |
| Median | 65 | 50 |
| Maximum | 105 | 113 |
| Sex, n (%) | | |
| Females | 83,111 (62.68) | 545,902 (51.09) |
| Males | 49,483 (37.32) | 522,562 (48.91) |
| Total diagnosis counts, n | 13,743,356 | 70,289,839 |
| Average annual accumulation per patient, n/patient/year | | |
| Clinical visit counts | 38.50 | 21.90 |
| Diagnosis (ICD-9-CM) counts | 34.60 | 21.90 |
| Medication (WHO-ATC) counts | 62.11 | 30.54 |
| Medication (WHO-ATC) multiplied by prescription days counts | 694.81 | 298.61 |
Abbreviations: KOA, knee osteoarthritis; SD, standard deviation; ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; WHO-ATC, World Health Organization, Anatomical Therapeutic Chemical.
Figure 2. Learning curves of the Deep-KOA model using diagnosis and medication features. Loss and accuracy for training and validation are plotted against epoch (iteration). The training loss (green line) and the validation loss (red line) both decrease to a point of stability with only a small gap between them, indicating little overfitting. The training accuracy (blue line) and the validation accuracy (orange line) likewise increase to a point of stability with a small gap between them. These small gaps suggest that the training and validation datasets were representative.
Performance of Deep-KOA with Different Input Features
| Input Features | AUROC | Sensitivity | Specificity | Precision |
|---|---|---|---|---|
| Diagnoses only | 0.94 | 0.83 | 0.91 | 0.76 |
| Medications only | 0.79 | 0.65 | 0.83 | 0.63 |
| Diagnoses and medications | 0.97 | 0.89 | 0.93 | 0.80 |
Abbreviations: Deep-KOA, deep learning model for knee osteoarthritis prediction; AUROC, area under the receiver operating characteristic curve.
Figure 3. Area under the receiver operating characteristic curve (AUROC) of the Deep-KOA model using diagnosis and medication features, before optimization (A) and after optimization (B). The best threshold (dot) is 0.15, the best balance between the true positive rate and the false positive rate. The optimized model was obtained with the TensorFlow optimization toolkit by removing some connections between nodes inside layers. After optimization, the model shrank to about 33% of its original size (from 442 MB to 147 MB), while the AUROC of the original and optimized models remained almost the same (AUROC=0.97).
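The caption describes the best threshold as the best balance between true positive rate and false positive rate but does not name the criterion used. One common choice is Youden's J statistic, J = TPR − FPR, and the sketch below assumes that criterion purely for illustration. The function name `youden_threshold` is hypothetical, and the labels and scores are made-up toy values, not study data.

```python
def youden_threshold(labels, scores):
    """Return (threshold, J) that maximizes J = TPR - FPR.

    labels: 1 for positive, 0 for negative.
    scores: model outputs; a case is called positive when score >= threshold.
    On ties, the lowest threshold is kept.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_threshold, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j


# Toy data for illustration only (not from the study).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.90, 0.60, 0.25, 0.20, 0.30, 0.10, 0.05, 0.02]

threshold, j = youden_threshold(labels, scores)
print(f"threshold={threshold:.2f}, J={j:.2f}")
```

A threshold well below 0.5, such as the 0.15 reported in the caption, is plausible under such a criterion when positives are much rarer than negatives, as they are in this cohort.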
Important Features in Deep-KOA Model for Prediction of Knee Osteoarthritis (KOA)
| Feature | AUROC (% Decrease)ᵃ |
|---|---|
| All features included | 0.9669 |
| Age | 0.9593 (−0.76) |
| Sex | 0.9644 (−0.25) |
| Comorbidities (ICD-9-CM code, name): | |
| (360–379) Disorders of the eye and adnexa | 0.9501 (−1.68) |
| (460–466) Acute respiratory infection | 0.9569 (−1.00) |
| (270–279) Other metabolic disorders and immunity disorders | 0.9631 (−0.38) |
| (725–729) Rheumatism, excluding the back | 0.9642 (−0.27) |
| (840–848) Sprains and strains of joints and adjacent muscles | 0.9646 (−0.23) |
| (530–537) Diseases of esophagus, stomach, and duodenum | 0.9647 (−0.22) |
| (710–719) Arthropathies and related disorders | 0.9648 (−0.21) |
| (250) Diabetes mellitus | 0.9648 (−0.21) |
| (451–459) Diseases of veins and lymphatics, and other diseases of circulatory system | 0.9652 (−0.17) |
| (401–405) Hypertensive disease | 0.9652 (−0.17) |
| Medication (WHO-ATC code, name): | |
| (A02AX) Antacids, other combinations | 0.9657 (−0.12) |
| (R05FA) Opium derivatives and expectorants | 0.9658 (−0.11) |
| (A02AA) Magnesium compounds | 0.9659 (−0.10) |
| (C07AB) Beta blocking agents, selective | 0.9660 (−0.09) |
| (H02AB) Glucocorticoids | 0.9660 (−0.09) |
| (B01AC) Platelet aggregation inhibitors, excluding heparin | 0.9660 (−0.09) |
| (R05CB) Mucolytics | 0.9661 (−0.08) |
| (C10AA) Statins | 0.9661 (−0.08) |
| (A03FA) Propulsives | 0.9661 (−0.08) |
| (A06AB) Contact laxatives | 0.9661 (−0.08) |
Note: ᵃ The decrease in the AUROC of Deep-KOA when the feature was removed, given as the absolute difference from the all-features model multiplied by 100.
Abbreviations: Deep-KOA, deep learning model for knee osteoarthritis prediction; AUROC, area under the receiver operating characteristic curve; ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; WHO-ATC, World Health Organization, Anatomical Therapeutic Chemical.
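The parenthesized values in the feature table are consistent with the absolute AUROC difference from the all-features model, multiplied by 100 (percentage points), rather than a decrease relative to the baseline. The sketch below reproduces three rows under that reading. The helper name `auroc_drop_points` is hypothetical, and the AUROC values are copied from the table.

```python
BASELINE_AUROC = 0.9669  # "All features included" row


def auroc_drop_points(auroc_without_feature, baseline=BASELINE_AUROC):
    """Change in AUROC, in percentage points, after removing one feature."""
    return round((auroc_without_feature - baseline) * 100, 2)


# AUROC after removing each feature, copied from the table.
rows = {
    "Age": 0.9593,
    "Sex": 0.9644,
    "(360-379) Disorders of the eye and adnexa": 0.9501,
}

drops = {name: auroc_drop_points(auroc) for name, auroc in rows.items()}
for name, drop in drops.items():
    print(f"{name}: {drop:+.2f}")
```

For comparison, a relative decrease for age would be (0.9593 − 0.9669) / 0.9669 ≈ −0.79%, which does not match the tabulated −0.76.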
Patient Data Sample for Deep-KOA Model Evaluationᵃ
| Patient ID | Age (Years) | Sex | Total Diagnoses in 3 Years (n) | Total Clinical Visits in 3 Years (n) | Total Medications in 3 Years (n) | Total Days of Medication Prescriptions in 3 Years (Days) | Label | Score |
|---|---|---|---|---|---|---|---|---|
| A | 58 | Male | 3 | 5 | 7 | 84 | KOA | 0.172 |
| B | 63 | Female | 6 | 10 | 6 | 158 | KOA | 0.639 |
| C | 59 | Female | 12 | 30 | 24 | 1355 | KOA | 0.942 |
| D | 55 | Male | 3 | 3 | 7 | 24 | NonKOA | 0.137 |
| E | 37 | Female | 5 | 10 | 8 | 30 | NonKOA | 0.012 |
| F | 46 | Female | 11 | 33 | 20 | 307 | NonKOA | 0.124 |
Notes: ᵃ To compare model performance, three KOA and three non-KOA patients were randomly chosen based on feature similarity, especially the number of features recorded during the three-year visit history.
Abbreviations: Deep-KOA, deep learning model for knee osteoarthritis prediction; KOA, knee osteoarthritis.
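As a worked check, the six sample scores can be classified with the 0.15 threshold reported in the Figure 3 caption. The decision rule (a score at or above the threshold is called KOA) and the helper names are assumptions made for this sketch. The labels and scores are copied from the table.

```python
THRESHOLD = 0.15  # best threshold reported in the Figure 3 caption

# (patient ID, true label, model score), copied from the table.
patients = [
    ("A", "KOA", 0.172),
    ("B", "KOA", 0.639),
    ("C", "KOA", 0.942),
    ("D", "NonKOA", 0.137),
    ("E", "NonKOA", 0.012),
    ("F", "NonKOA", 0.124),
]


def predict(score, threshold=THRESHOLD):
    """Assumed decision rule: a score at or above the threshold means KOA."""
    return "KOA" if score >= threshold else "NonKOA"


def confusion_counts(records):
    """Count true/false positives and negatives, with KOA as the positive class."""
    tp = fp = tn = fn = 0
    for _pid, label, score in records:
        predicted = predict(score)
        if label == "KOA":
            if predicted == "KOA":
                tp += 1
            else:
                fn += 1
        elif predicted == "KOA":
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


tp, fp, tn, fn = confusion_counts(patients)
sensitivity = tp / (tp + fn)   # share of KOA patients flagged
specificity = tn / (tn + fp)   # share of non-KOA patients cleared
precision = tp / (tp + fp)     # share of flagged patients who have KOA
print(tp, fp, tn, fn, sensitivity, specificity, precision)
```

Under this rule all six sample patients are classified in agreement with their labels, including patient A (0.172) and patient D (0.137), whose scores fall on either side of the threshold. Six hand-picked records say nothing about population-level performance, which is what the performance table reports.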