Peter Kokol1, Jan Jurman1, Tajda Bogovič1, Tadej Završnik2, Jernej Završnik3,4,5, Helena Blažun Vošner3,4,6.
Abstract
Cardiovascular diseases are among the leading global causes of death. Following positive experiences with machine learning in medicine, we performed a study assessing how machine learning can support decision making regarding coronary artery disease. While a plethora of studies have reported high accuracy rates for machine learning algorithms (MLAs) in medical applications, the majority used cleansed medical databases free of "real world noise." In contrast, the aim of our study was to perform machine learning on the routinely collected Anonymous Cardiovascular Database (ACD), extracted directly from the hospital information system of the University Medical Centre Maribor. Many studies have used tens of different machine learning approaches with substantially varying accuracy (ACU), so they could not serve as a basis for validating the results of our study. We therefore decided to perform our study in 2 phases. During the first phase we trained different MLAs on a comparable dataset, the University of California Irvine (UCI) Heart Disease Dataset. The aim of this phase was first to define "standard" ACU values and second to reduce the set of all MLAs to the most appropriate candidates to be used on the ACD during the second phase. Seven MLAs were selected, and the standard ACU for the 2-class diagnosis was 0.85. Surprisingly, the same MLAs achieved ACUs of around 0.96 on the ACD. A general comparison of both databases revealed that the performance of different machine learning algorithms differs significantly. Accuracy on the ACD reached its highest levels with decision trees and neural networks, while linear regression and AdaBoost performed best on the UCI database.
This might indicate that decision tree based algorithms and neural networks cope better with real world, not "noise free," clinical data and could successfully support decision making concerning coronary diseases.
Keywords: artificial intelligence; cardiovascular conditions; cross-validation; decision making; diagnosing; heuristics; knowledge discovery; machine learning
Year: 2021 PMID: 33998303 PMCID: PMC8132094 DOI: 10.1177/0046958021997338
Source DB: PubMed Journal: Inquiry ISSN: 0046-9580 Impact factor: 1.730
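The abstract compares algorithms by accuracy (ACU), and the keywords mention cross-validation. As a minimal sketch of how a cross-validated accuracy can be computed, the following uses only the Python standard library. The function names, the unshuffled contiguous folds, and the majority-class stand-in model are all assumptions made for illustration; they are not the study's actual pipeline, algorithms, or data.

```python
from collections import Counter
from typing import Callable, List, Sequence

Features = Sequence[float]
Model = Callable[[Features], int]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_majority(train_X: Sequence[Features], train_y: Sequence[int]) -> Model:
    """Hypothetical stand-in model: always predicts the most common training label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda features: most_common


def cross_val_accuracy(X, y, fit, k: int = 5) -> float:
    """Mean accuracy over k folds; each fold is held out once for testing."""
    scores = []
    for test_idx in k_fold_indices(len(y), k):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(y)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        model = fit(train_X, train_y)
        preds = [model(X[i]) for i in test_idx]
        scores.append(accuracy([y[i] for i in test_idx], preds))
    return sum(scores) / len(scores)


# Synthetic toy labels, not patient data.
toy_X = [[0.0] for _ in range(10)]
toy_y = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
toy_score = cross_val_accuracy(toy_X, toy_y, fit_majority, k=5)
```

Any function with the same `fit(train_X, train_y)` shape could replace `fit_majority`, so several candidate algorithms can be scored the same way and ranked by their mean accuracy. A real evaluation would normally shuffle or stratify the folds, which this sketch omits to stay deterministic.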
The Number of Patients in Each of the I25 Sub Codes.
| Sub codes of I25 | 1 | 2 | 5 | 11 |
|---|---|---|---|---|
| Number of patients | 761 | 299 | 514 | 45 |
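The counts in the table are unevenly distributed, which matters when reading accuracy figures. As a hedged illustration, the short sketch below computes each sub code's share of patients and the accuracy a trivial "always predict the largest class" rule would reach over all four sub codes. The variable names are assumptions, and this record does not state which sub codes made up the 2-, 3-, and 4-class tasks, so the figure is descriptive context only, not a result from the study.

```python
# Patient counts per I25 sub code, copied from the table above.
counts = {"1": 761, "2": 299, "5": 514, "11": 45}

total = sum(counts.values())

# Share of all patients that falls under each sub code.
shares = {code: n / total for code, n in counts.items()}

# A rule that always predicts the largest class is right exactly this often
# when all four sub codes are used as classes.
majority_code = max(counts, key=counts.get)
majority_baseline = shares[majority_code]

for code, share in shares.items():
    print(f"I25 sub code {code}: {counts[code]} patients ({share:.1%})")
print(f"Majority-class baseline accuracy: {majority_baseline:.3f}")
```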
Figure 1. The accuracies of the best classifiers used on the UCI database in increasing order.
Figure 2. The accuracies of machine learning algorithms for the ACP database.
Figure 3. The accuracies of machine learning algorithms for the ACP database (binary classifications).
Most Accurate Machine Learning Algorithms.
| UCI | ACP-4 diagnoses | ACP-3 diagnoses | ACP-2 diagnoses |
|---|---|---|---|
| AB | LR | LR | RF |
| LR | AB | AB | DT |
| BG | BG | NB | NN |
Figure 4. The decision tree induced on the ACP database for binary classification.