
Supporting Real World Decision Making in Coronary Diseases Using Machine Learning.

Peter Kokol¹, Jan Jurman¹, Tajda Bogovič¹, Tadej Završnik², Jernej Završnik^3,4,5, Helena Blažun Vošner^3,4,6.

Abstract

Cardiovascular diseases are one of the leading global causes of death. Following the positive experiences with machine learning in medicine, we performed a study to assess how machine learning can support decision making regarding coronary artery diseases. While a plethora of studies have reported high accuracy rates of machine learning algorithms (MLAs) in medical applications, the majority of them used cleansed medical databases without the presence of "real world noise." In contrast, the aim of our study was to perform machine learning on the routinely collected Anonymous Cardiovascular Database (ACD), extracted directly from the hospital information system of the University Medical Centre Maribor. Many studies used tens of different machine learning approaches with substantially varying accuracy (ACU), hence they were not usable as a base to validate the results of our study. Thus, we decided to perform our study in 2 phases. During the first phase we trained the different MLAs on a comparable dataset, the University of California Irvine (UCI) Heart Disease Dataset. The aim of this phase was first to define the "standard" ACU values and second to reduce the set of all MLAs to the most appropriate candidates to be used on the ACD during the second phase. Seven MLAs were selected, and the standard ACU for the 2-class diagnosis was 0.85. Surprisingly, the same MLAs achieved ACUs of around 0.96 on the ACD. A general comparison of both databases revealed that the performance of the different machine learning algorithms differs significantly. The accuracy on the ACD reached the highest levels using decision trees and neural networks, while logistic regression and AdaBoost performed best on the UCI database. This might indicate that decision-tree-based algorithms and neural networks are better at coping with real world, not "noise free" clinical data and could successfully support decision making concerned with coronary diseases.


Keywords: artificial intelligence; cardiovascular conditions; cross-validation; decision making; diagnosing; heuristics; knowledge discovery; machine learning

Year: 2021 PMID: 33998303 PMCID: PMC8132094 DOI: 10.1177/0046958021997338

Source DB: PubMed Journal: Inquiry ISSN: 0046-9580 Impact factor: 1.730

What do we already know about this topic? Machine learning is used to support decision making in medicine; however, it has not yet been proven effective on real-world data.
How does your research contribute to the field? It provides an assessment and rating of the accuracy and suitability of different machine learning methods when trained on real-world medical data.
What are your research’s implications toward theory, practice, or policy? It can improve real-world decision making in cardiology, with the aim of improving health care delivery.

Introduction

Machine learning is a popular and effective approach to problem solving in fields such as medicine[1] or engineering[2] when a sufficiently representative set of samples (a knowledge base) is available to train its models. A knowledge base with enough objects (patient data) and a balanced distribution of patient diagnoses enables various machine learning algorithms to be trained successfully, meaning that the achieved accuracy is high enough to support real world decision making. The transition of hospitals to storing patient data digitally in electronic patient encounter records has created a rich source of data that machine learning algorithms can use. Cardiovascular diseases are the leading global cause of death. According to NHANES (National Health and Nutrition Examination Survey) data for 2013 to 2016, 48% of the US population over 20 was diagnosed with cardiovascular disease. The risk of these diseases increases with age.[3] However, most cardiovascular diseases can be prevented by reducing risk factors.[4] The most common cardiovascular disease is coronary artery disease, also known as coronary heart disease, which is caused by a decrease in blood flow to the heart muscle. This decrease results from atherosclerosis, a chronic disease in which fat, cholesterol, and other substances accumulate as plaque on the arterial walls.[5] Risk factors include family history, high blood pressure, smoking, elevated LDL cholesterol levels, diabetes, obesity, lack of physical activity, and excessive alcohol consumption.[6] Different factors must be considered when diagnosing cardiovascular diseases and classifying them into subgroups. These include, among other things, laboratory measurements, which are well suited to computer processing and machine learning because they are numerical.
Unlike expert systems, which diagnose and predict on the basis of human-made definitions of a set of attributes, machine learning algorithms can support decision making by analysing complex patterns in the data.[7] Generally, machine learning algorithms can be categorized as supervised or unsupervised. Supervised algorithms require that 1 or more attributes in the knowledge base contain information about the class of the objects (in our case the diagnosis of the disease). Unsupervised algorithms search for new knowledge without attributes determined by a human observer and can find the most relevant attributes and the relationships among them. Supervised learning mostly addresses the problems of classification and risk determination.[8] A wide range of applications of machine learning algorithms in medicine can be found in Dinu and Joseph.[9] Using models created through machine learning, an individual patient's health status can be classified as healthy or ill on the basis of measured traits. This can help the user diagnose the disease or serve as a warning sign of a potentially dangerous condition.
Machine learning algorithms that allow the internal states of their models to be visualized can also expose the conditions behind individual decisions, allowing the user to check the validity of the results and contributing to new knowledge discovery.[10] Machine learning is widely used in cardiovascular medicine[11] for data interpretation and for the analysis of ECGs and imaging systems.[12] A comparison of decision trees, k-nearest neighbors, and support vector machines (SVM) in image recognition for multiple sclerosis detection showed that k-nearest neighbors performed best;[13] however, SVM also showed promising results in the detection of dendritic spines.[14] Many studies have used different machine learning models, such as ensembles, support vector machines, random forests, and clustering, to classify cardiovascular illness.[15] According to the researchers in another paper,[16] logistic regression performed well in predicting the risk of common cardiovascular diseases, and that study recommends using traditional regression models. Mezzatesta et al[17] investigated a non-linear SVM with an RBF kernel, optimized with GridSearch in order to improve the accuracy of the algorithm. The results showed high accuracy in predicting the onset of cardiovascular diseases in patients on dialysis. The aim of the study presented in this paper was to perform machine learning on the routinely collected Anonymous Cardiovascular Database (ACP database) of the University Hospital of Maribor, covering the period 1999 to 2019, both to support physicians' decision making in their everyday work and possibly to discover new knowledge.

Methodology

We performed the study in 2 phases. During the first phase we trained different machine learning algorithms on the well-established UCI Heart Disease Dataset,[18] which contains heart disease data for testing machine learning algorithms, aiming to select the “best” candidates for inducing classifiers on the ACP database during the second phase. We used 10-fold cross validation to assess the accuracy of the classifiers. In the ACP database we restricted ourselves to the attributes age, gender, re-hospitalization, AH, SB, SP, KBL, S-CRP, S-Lp(a), S-triglycerides, S-HDL-cholesterol, and S-cholesterol. The database contains data on 4477 hospital visits of 3490 unique patients, on whom a total of 98 568 laboratory tests were performed. Unlike the Cleveland database, which is limited to heart disease data, the ACP database contains cardiovascular disease data. According to the International Statistical Classification of Diseases and Related Health Problems, the patients were diagnosed with code I25 (Chronic Ischemic Heart Disease) and its sub codes (Table 1). After cleanup and the discarding of sub types containing only a small number of patients, the database contained 1619 patients.
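The 10-fold cross validation mentioned above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names, the fixed random seed, and the majority-label stand-in classifier are all assumptions introduced here.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validated_accuracy(samples, labels, train_fn, predict_fn, k=10, seed=0):
    """Mean accuracy over k held-out folds (requires len(samples) >= k)."""
    fold_accuracies = []
    for held_out in k_fold_indices(len(samples), k, seed):
        held_out_set = set(held_out)
        train_idx = [i for i in range(len(samples)) if i not in held_out_set]
        # Train on the other k-1 folds, then score on the held-out fold only.
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, samples[i]) == labels[i] for i in held_out)
        fold_accuracies.append(correct / len(held_out))
    return sum(fold_accuracies) / k

# Hypothetical stand-in classifier: always predicts the most frequent training label.
def train_majority(train_samples, train_labels):
    return max(set(train_labels), key=train_labels.count)

def predict_majority(model, sample):
    return model

# Toy data (not from either database): 80 "ill" and 20 "healthy" labels.
toy_labels = ["ill"] * 80 + ["healthy"] * 20
toy_samples = list(range(100))
toy_accuracy = cross_validated_accuracy(
    toy_samples, toy_labels, train_majority, predict_majority)
```

Each sample is used for testing exactly once and for training in the other 9 rounds, so the reported accuracy is never measured on data the model was trained on.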

Table 1.

The Number of Patients in Each of the I25 Sub Codes.

Sub codes of I25	1	2	5	11
Number of patients	761	299	514	45
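As a reading aid for the accuracies reported later, the class sizes in Table 1 give the accuracy of a trivial classifier that always predicts the largest class. The sketch below is our own arithmetic on the table, not a computation from the paper; the `I25.x` labels are built from the sub codes listed in Table 1.

```python
# Patient counts per I25 sub code, copied from Table 1.
patients_per_subcode = {"I25.1": 761, "I25.2": 299, "I25.5": 514, "I25.11": 45}

def majority_baseline(counts):
    """Accuracy of a classifier that always predicts the largest class."""
    return max(counts.values()) / sum(counts.values())

# Merge I25.1 and I25.11, as the study does for its 3-diagnosis experiment.
merged = dict(patients_per_subcode)
merged["I25.1"] += merged.pop("I25.11")

total = sum(patients_per_subcode.values())
baseline_4 = majority_baseline(patients_per_subcode)
baseline_3 = majority_baseline(merged)
print(total, round(baseline_4, 3), round(baseline_3, 3))
```

The counts sum to the 1619 patients stated in the text. The baselines are about 0.47 for 4 diagnoses and about 0.50 for 3, which is the floor against which the multi-class accuracies can be judged.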

The performance of the machine learning algorithms used in this study was assessed with different metrics.[19] Because accuracy is the most standard metric in the health sciences machine learning community and is easy for health professionals to understand, it is the only metric we report in the rest of the paper.
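For reference, accuracy is the share of correct predictions, and for a 2-class problem it can be read off the confusion matrix. The following sketch uses hypothetical labels (not patient data) and also computes sensitivity and specificity, to show what accuracy alone does not reveal.

```python
def confusion_counts(true_labels, predicted_labels, positive="ill"):
    """Return (TP, FP, TN, FN) for a 2-class problem."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if guess == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)

def sensitivity(tp, fn):
    """Share of ill patients that were recognized as ill."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Share of healthy individuals that were recognized as healthy."""
    return tn / (tn + fp)

# Hypothetical labels, chosen so that false negatives outnumber false positives.
truth = ["ill", "ill", "ill", "ill", "healthy", "healthy", "healthy", "healthy"]
guess = ["ill", "ill", "healthy", "healthy", "healthy", "healthy", "healthy", "ill"]

tp, fp, tn, fn = confusion_counts(truth, guess)
print(accuracy(tp, fp, tn, fn), sensitivity(tp, fn), specificity(tn, fp))
```

In this toy case the accuracy is 0.625, while sensitivity (0.5) is lower than specificity (0.75). A single accuracy value does not show such an imbalance.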

Results

The best classifiers created with the individual machine learning algorithms, together with their accuracies, are presented in Figure 1. The algorithms are k-nearest neighbors (KNN), decision tree (DT), naive Bayes (NB), support vector machine (SVM), random forest (RF), neural network (NN), bagging using support vector machines (BG), logistic regression (LR), and AdaBoost using logistic regression (AB).

Figure 1.

The accuracies of the best classifiers used on the UCI database in increasing order.

Due to the simplicity of the binary classification problem, most algorithms achieved a high degree of accuracy. The best result was achieved by the classifier that used AdaBoost with logistic regression as the base classifier (0.855) and the worst by k-nearest neighbors (0.670). The difference in accuracy between using 10 and 50 base classifiers was negligible (0.002). Because the results were similar, the ensemble with 10 base classifiers was selected as the best, since its smaller size makes learning faster. The confusion matrices of all classifiers showed a higher proportion of false negatives than false positives. The algorithms were thus more accurate in classifying healthy individuals and slightly worse in classifying patients with heart disease. The same types of classifiers were induced in the second phase of our study for the 4 diagnoses from Table 1. The accuracies are shown in Figure 2 (blue columns). Because the diagnoses I25.1 and I25.11 are similar and the classifiers considered only a limited number of characteristics, those 2 were difficult to distinguish. When physicians diagnosed patients, they also used symptoms not present in the ACP database (such as the presence of angina pectoris), which might affect the accuracy of the induced classifiers. When we combined the diagnoses I25.1 and I25.11 we achieved better accuracies, shown as orange columns in Figure 2.
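Combining I25.1 and I25.11 amounts to relabeling before evaluation. The sketch below shows only that relabeling step on hypothetical labels and predictions; it does not reproduce the study's retraining, and the helper names are ours.

```python
def merge_diagnoses(labels, merge_map):
    """Replace each label that appears in merge_map by its merged label."""
    return [merge_map.get(label, label) for label in labels]

def accuracy(true_labels, predicted_labels):
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)

# Treat I25.11 as I25.1, turning the 4-diagnosis task into a 3-diagnosis task.
merge_map = {"I25.11": "I25.1"}

# Hypothetical true and predicted diagnoses (not patient data).
truth = ["I25.1", "I25.11", "I25.2", "I25.5", "I25.1", "I25.11"]
predicted = ["I25.11", "I25.1", "I25.2", "I25.1", "I25.1", "I25.11"]

before = accuracy(truth, predicted)
after = accuracy(merge_diagnoses(truth, merge_map),
                 merge_diagnoses(predicted, merge_map))
print(before, round(after, 3))
```

In this toy case two of the three errors are confusions between I25.1 and I25.11. They disappear after merging, so the accuracy rises from 0.5 to about 0.833, while the unrelated I25.5 error remains.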

Figure 2.

The accuracies of machine learning algorithms for the ACP database.

We tested different parameter settings for the individual classifiers and selected the best ones. For the k-nearest neighbors classifier we used a weighted neighborhood of 20 neighbors. We did not limit the depth of the decision tree and used entropy as the splitting criterion. A linear kernel was used for the support vector machine classifier. The neural network topology with the highest accuracy had 3 hidden layers with 200 neurons each. We used 50 base classifiers for both bagging and AdaBoost, with the support vector machine as the base classifier for bagging and logistic regression for AdaBoost. For the random forest, we used 500 trees with no depth limit. In the classification into 4 diagnoses, the highest accuracies were achieved by logistic regression (0.570) and its AdaBoost ensemble (0.561); boosting thus slightly reduced the accuracy. Combining the I25.1 and I25.11 diagnoses substantially improved the accuracy of the k-nearest neighbors method, the neural network, and the random forest; the largest improvement was achieved by the neural network, whose accuracy increased by 0.049. In our last experiment we used the ACP database to analyse the accuracy of binary classification (presence or absence of cardiovascular disease). Because the database did not contain patients without cardiovascular disease (PWCD) and we did not have access to simultaneously measured values of the required attributes for such individuals, we generated synthetic PWCD samples. We balanced the resulting patient database by generating 1619 PWCD patients, each with the same probability of being male or female. We limited the age of these patients to between 20 and 60 years. The value of the repeat hospitalization attribute was set to 0 for PWCD patients.
The numbers of PWCD patients positive for AH, SB, or KBL followed the proportions of the global population with the corresponding medical conditions: 429 PWCD patients had arterial hypertension (AH), 142 had diabetes mellitus, and 162 had chronic kidney disease. The heart failure attribute (SP) was set to 0 in the generated set, since we treated patients with heart failure as ill. Although the intervals of attribute values given by the reference values are recommended for healthy individuals, actual values may deviate due to factors such as lifestyle habits. For example, a healthy person may have a cholesterol value above 5.2 without necessarily having cardiovascular disease. We also observed discrepancies between the reference intervals in the literature. We therefore extended the reference intervals by 25% when generating patient attribute values, without extending the interval limits below zero. Classification was performed with different parameters for the individual classifiers, and the best ones were used in our experiment. We employed 30 neighbors and weighted distances for k-nearest neighbors, although the unweighted approach had similar accuracy. The decision tree achieved the best results without a depth limit. A linear kernel was used for the support vector machine. The neural network achieved the best results with 3 hidden layers of 20 neurons and no limit on learning iterations. For the ensemble methods, 50 base classifiers were used, with the support vector machine as the base classifier for bagging and logistic regression for AdaBoost. The highest accuracy among the presented algorithms was achieved by a random forest with 100 trees and no depth limit. The accuracies of the binary classification of healthy and ill patients are shown in Figure 3.
The best accuracy was achieved by random forest (0.997) followed by decision tree (0.997), and the worst by naïve Bayes (0.950).
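The generation of synthetic PWCD patients described above can be sketched as follows. This is an illustration under several assumptions of ours, not the study's procedure: the reference intervals are placeholder numbers, the "25% extension" is read as 25% of the interval width on each side, values are drawn uniformly, and the three conditions are assigned independently. The mapping of SB to diabetes mellitus and KBL to chronic kidney disease is inferred from the order in which the text lists them.

```python
import random

N_PATIENTS = 1619  # same size as the set of ill patients, to balance the classes

# Numbers of positive PWCD patients reported in the text.
POSITIVE_COUNTS = {"AH": 429, "SB": 142, "KBL": 162}

# HYPOTHETICAL reference intervals (placeholders, not taken from the paper).
REFERENCE_INTERVALS = {"S-cholesterol": (4.0, 5.2), "S-CRP": (0.0, 5.0)}

def extend_interval(low, high, fraction=0.25):
    """Widen an interval by `fraction` of its width per side, never below zero."""
    margin = (high - low) * fraction
    return max(0.0, low - margin), high + margin

def generate_pwcd(n=N_PATIENTS, seed=0):
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        patient = {
            "gender": rng.choice(["male", "female"]),  # equal probability
            "age": rng.randint(20, 60),                # inclusive bounds
            "re-hospitalization": 0,
            "SP": 0,  # heart failure counts as ill, so it is never set here
        }
        for name, (low, high) in REFERENCE_INTERVALS.items():
            patient[name] = rng.uniform(*extend_interval(low, high))
        patients.append(patient)
    # Mark exactly the reported number of patients as positive per condition.
    for attribute, count in POSITIVE_COUNTS.items():
        positives = set(rng.sample(range(n), count))
        for index, patient in enumerate(patients):
            patient[attribute] = int(index in positives)
    return patients

pwcd_patients = generate_pwcd()
```

Under these assumptions, 429, 142, and 162 out of 1619 correspond to prevalences of roughly 26%, 9%, and 10%. Because every laboratory value of a generated patient lies inside an extended reference interval, such patients may be easier to separate from real ill patients than real healthy individuals would be.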

Figure 3.

The accuracies of machine learning algorithms for the ACP database (binary classifications).

The ranking of the 3 most accurate machine learning algorithms is shown in Table 2. In the ACP database the reduction from 4 to 3 diagnoses did not significantly change the order. However, the reduction from 3 to 2 diagnoses changed not only the order but also the set of the most accurate algorithms. The most accurate algorithms for binary classification on the UCI and ACP databases are also completely different.

Table 2.

Most Accurate Machine Learning Algorithms.

UCI	ACP-4 diagnoses	ACP-3 diagnoses	ACP-2 diagnoses
AB	LR	LR	RF
LR	AB	AB	DT
BG	BG	NB	NN
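The overlap between the columns of Table 2 can be checked mechanically. The sketch below only re-reads the table; the dictionary and helper name are ours.

```python
# Top-3 algorithms per task, read column by column from Table 2 (best first).
top3 = {
    "UCI": ["AB", "LR", "BG"],
    "ACP-4 diagnoses": ["LR", "AB", "BG"],
    "ACP-3 diagnoses": ["LR", "AB", "NB"],
    "ACP-2 diagnoses": ["RF", "DT", "NN"],
}

def shared(task_a, task_b):
    """Algorithms that appear in the top 3 of both tasks, in alphabetical order."""
    return sorted(set(top3[task_a]) & set(top3[task_b]))

print(shared("ACP-4 diagnoses", "ACP-3 diagnoses"))
print(shared("ACP-3 diagnoses", "ACP-2 diagnoses"))
print(shared("UCI", "ACP-2 diagnoses"))
```

The 4- and 3-diagnosis rankings share LR and AB in the same order, whereas the 2-diagnosis ranking shares no algorithm with the 3-diagnosis ranking or with the UCI ranking.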

Feature significance analysis using decision trees[1] revealed that 3 features are the most important in distinguishing between PWCD and ill patients (Figure 4). These are, in descending order, Lp(a), CRP, and HDL-cholesterol.
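A decision tree ranks features by how much a split on them reduces class impurity. Assuming the entropy criterion that the study reports for its decision trees, the information gain of a single threshold split can be sketched as below. The feature values are hypothetical and are not Lp(a), CRP, or HDL measurements.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting the samples at `value <= threshold`."""
    left = [label for value, label in zip(values, labels) if value <= threshold]
    right = [label for value, label in zip(values, labels) if value > threshold]
    if not left or not right:
        return 0.0  # the split separates nothing
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted

labels = ["PWCD", "PWCD", "PWCD", "ill", "ill", "ill"]

# Hypothetical feature that separates the classes perfectly at 500.
feature_a = [100, 150, 200, 900, 1100, 1300]
# Hypothetical feature that barely separates the classes at 1.25.
feature_b = [1.0, 1.4, 1.2, 1.1, 1.5, 1.3]

gain_a = information_gain(feature_a, labels, 500)
gain_b = information_gain(feature_b, labels, 1.25)
print(gain_a, round(gain_b, 3))
```

A feature with a high gain, like `feature_a` here, is chosen near the root of the tree. This is the sense in which the features at the top of an induced tree can be read as the most important ones.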

Figure 4.

The decision tree induced on the ACP database for binary classification.

Discussion

The final results show that machine learning can accurately support diagnosing patients with or without cardiovascular diseases and can even help with more precise diagnosis into 4 different cardiovascular diagnoses. It can also support knowledge discovery, that is, revealing the various kinds of association of Lp(a) (lipoprotein(a)) with CVD events.[20] In that manner machine learning can help physicians, in cooperation with data scientists, to improve research on CVD and to translate knowledge into clinical practice, and thus help to improve health care.[21] We chose among the most commonly used machine learning algorithms. Because we used the same set of classifiers in both cases, our analysis indicates a significant difference between the datasets. On the ACP database most of the algorithms achieved a higher accuracy than on the UCI dataset. The UCI dataset was smaller, containing only 297 patients, and 13 input attributes were used to classify the disease. The data might also have been affected by other unknown health conditions of the patients. Unlike the UCI database, which contained only heart disease data, the ACP database also had cardiovascular disease data. As the dataset did not contain healthy individuals, we generated a synthetic set of patients, which might be reflected in the accuracy. Random forest, NN, and DT proved to be the most accurate algorithms, as in many other medical applications REF. In addition, Lp(a), CRP, and HDL-cholesterol were the most important attributes for the presence of a cardiovascular disease, as also shown by Rhee et al.[22] CRP, as one of the key indicators of inflammation and tissue damage, is also known to be highly associated with cardiovascular conditions.[23] Our study had some limitations. The datasets used in our study varied in the presentation of the data. The UCI dataset was significantly smaller and only patients with cardio diseases were included.
Therefore, there were a small number of samples and a limited set of attributes. In contrast, there were no healthy patients in the ACP database; consequently, the generation of healthy individuals might have influenced the classification results. However, our study has several advantages. We analysed data from a large number of patients with reasonably good data quality, which also included the “noise” of the real world clinical environment. The purpose of medical classifiers is to support the physician in real world decision making. We used state-of-the-art ML algorithms, which are able to develop the best predictive models.

Conclusion

In summary, cardiovascular disease still poses a high risk of death, so detecting risk factors is very important. We showed that diagnosing could be improved by using machine learning algorithms. To that end, we implemented, analysed, and compared some of the most commonly used ones, using the well-established UCI database and a real world database of patients with cardiovascular disease. The accuracy of machine learning differs significantly between the 2 databases, indicating that decision-tree-based algorithms and neural networks might be better at coping with real world, not “noise free” clinical data.

16 in total

1. Machine Learning for Echocardiographic Imaging: Embarking on Another Incredible Journey.

Authors: A Jamil Tajik
Journal: J Am Coll Cardiol Date: 2016-11-29 Impact factor: 24.094

2. Detection of Dendritic Spines Using Wavelet Packet Entropy and Fuzzy Support Vector Machine.

Authors: Shuihua Wang; Yang Li; Ying Shao; Carlo Cattani; Yudong Zhang; Sidan Du
Journal: CNS Neurol Disord Drug Targets Date: 2017 Impact factor: 4.388

3. Heart Disease and Stroke Statistics-2019 Update: A Report From the American Heart Association.

Authors: Emelia J Benjamin; Paul Muntner; Alvaro Alonso; Marcio S Bittencourt; Clifton W Callaway; April P Carson; Alanna M Chamberlain; Alexander R Chang; Susan Cheng; Sandeep R Das; Francesca N Delling; Luc Djousse; Mitchell S V Elkind; Jane F Ferguson; Myriam Fornage; Lori Chaffin Jordan; Sadiya S Khan; Brett M Kissela; Kristen L Knutson; Tak W Kwan; Daniel T Lackland; Tené T Lewis; Judith H Lichtman; Chris T Longenecker; Matthew Shane Loop; Pamela L Lutsey; Seth S Martin; Kunihiro Matsushita; Andrew E Moran; Michael E Mussolino; Martin O'Flaherty; Ambarish Pandey; Amanda M Perak; Wayne D Rosamond; Gregory A Roth; Uchechukwu K A Sampson; Gary M Satou; Emily B Schroeder; Svati H Shah; Nicole L Spartano; Andrew Stokes; David L Tirschwell; Connie W Tsao; Mintu P Turakhia; Lisa B VanWagner; John T Wilkins; Sally S Wong; Salim S Virani
Journal: Circulation Date: 2019-03-05 Impact factor: 29.690

4. The HDL cholesterol/apolipoprotein A-I ratio: an indicator of cardiovascular disease. (Review)

Authors: Eun-Jung Rhee; Christopher D Byrne; Ki-Chul Sung
Journal: Curr Opin Endocrinol Diabetes Obes Date: 2017-04 Impact factor: 3.243

5. Machine learning in cardiovascular medicine: are we there yet? (Review)

Authors: Khader Shameer; Kipp W Johnson; Benjamin S Glicksberg; Joel T Dudley; Partho P Sengupta
Journal: Heart Date: 2018-01-19 Impact factor: 5.994

6. Machine Learning in Medicine. (Review)

Authors: Rahul C Deo
Journal: Circulation Date: 2015-11-17 Impact factor: 29.690