A speedy cardiovascular diseases classifier using multiple criteria decision analysis.

Wah Ching Lee, Faan Hei Hung, Kim Fung Tsang, Hoi Ching Tung, Wing Hong Lau, Veselin Rakocevic, Loi Lei Lai.

Abstract

Each year, some 30 percent of global deaths are caused by cardiovascular diseases. This figure is worsening due to both the increasing elderly population and severe shortages of medical personnel. The development of a cardiovascular diseases classifier (CDC) for auto-diagnosis will help address the problem. Former CDCs did not achieve quick evaluation of cardiovascular diseases. In this letter, a new CDC that achieves speedy detection is investigated. The investigation incorporates analytic hierarchy process (AHP)-based multiple criteria decision analysis (MCDA) to develop feature vectors using a Support Vector Machine. The MCDA facilitates the efficient assignment of appropriate weightings to potential patients, thus scaling down the number of features. Since the new CDC adopts only the most meaningful features for discriminating between healthy persons and cardiovascular disease patients, speedy detection of cardiovascular diseases has been successfully implemented.

Year:  2015        PMID: 25587978      PMCID: PMC4327078          DOI: 10.3390/s150101312

Source DB:  PubMed          Journal:  Sensors (Basel)        ISSN: 1424-8220            Impact factor:   3.576


Introduction

Electrocardiogram (ECG) signals, characterized by P waves, Q waves, S waves, QRS complexes and T waves, provide important information for cardiologists diagnosing cardiovascular disease. Automating such a diagnosis requires the development of a cardiovascular diseases classifier (CDC). A CDC generally comprises two parts: the extraction of feature vectors and the construction of a classifier with a machine learning algorithm such as an Artificial Neural Network or a Support Vector Machine. Features can be divided into three categories: non-fiducial features, fiducial features, and hybrid features. Non-fiducial features do not characterize the ECG signal through P waves, Q waves, S waves, QRS complexes and T waves [1-5], whereas fiducial features do [6,7]. Hybrid features refer to feature vectors constructed from both non-fiducial and fiducial features [8-10]. In this investigation, a Support Vector Machine (SVM) is used to construct the CDC for the four most common types of cardiovascular diseases, namely bundle branch block, myocardial infarction, heart failure, and dysrhythmia. Seven criteria, which together indicate the speed and accuracy of detection, are used to compute the analytic hierarchy process (AHP) score that supports the multiple criteria decision analysis (MCDA) for selecting the optimal CDC: overall accuracy (OA), sensitivity (Se), specificity (Sp), area under the curve (AUC), training time (Tr), testing time (Te), and number of features (Nf). Traditional work usually aims at the highest overall accuracy and/or the lowest testing time. In reality, every end user has to specify the relative weights of the criteria. It is not uncommon for the ratio to be set by intuition, or for a direct 1:1 assignment to be adopted, so the practical needs of volunteers are neglected or not targeted. In the new method, the weight assignments of the criteria are derived through AHP analysis.
Incorporating AHP analysis into the classifier allows the needs of volunteers to be taken into account. This letter is organized as follows: the design of an optimal CDC is presented in Section 2. Multiple criteria decision analysis of the optimal CDC is given in Section 3. In Section 4, the AHP is formulated and a performance score is obtained, from which the performance is analyzed and compared to traditional schemes. Finally, conclusions are drawn in Section 5.

Design of the optimal CDC

Figure 1 shows the block diagram of the new method. After the ECG data are retrieved, feature vectors are extracted. One SVM classifier is then designed for each combination of features, which yields N configurations. The best model is selected among configurations f1 to fN based on seven criteria, namely overall accuracy, sensitivity, specificity, area under the curve, training time, testing time, and number of features, with the aid of MCDA via AHP. The details of the new method are described in the following subsections.
Figure 1. Block diagram of the new method.

Data Preprocessing and Features Construction

The data are obtained from an online, open-access database [11,12]. A group of healthy candidates and candidates with the four most common types of cardiovascular diseases are selected: 52 healthy control candidates, 15 bundle branch block candidates, 148 myocardial infarction candidates, 18 heart failure candidates and 14 dysrhythmia candidates. Unequal sample sizes across classes would bias the SVM classifier [13]. The Lead I ECG signal is therefore partitioned into 30 s sub-signals to obtain 500 samples from healthy candidates and 125 samples from each type of cardiovascular disease (500 unhealthy samples in total), which equalizes the number of samples in the healthy and unhealthy classes.

Before the four diseases are introduced, the notation is summarized. The RR-interval is the time between the R points of consecutive heartbeats in the ECG signal. The QRS complex is the time between the Q wave and the S wave, with point R lying between them. Similarly, the QT interval is the time between the Q wave and the T wave. The background of the four diseases is as follows:

Myocardial Infarction: Irregular heartbeat, and thus an irregular RR-interval, may occur in the ECG signal of the patients [14];
Bundle Branch Block: Patients have a QRS complex longer than 0.12 s [15];
Dysrhythmia: The heart rate can be more than 100 beats per minute or less than 60 beats per minute, so the RR-interval differs from that of a normal ECG signal. The QT interval may also increase if the dysrhythmia is a ventricular arrhythmia [16];
Heart Failure: A prolonged QT interval is found in the ECG signals of the patients [17].

As a result, the Q wave, R wave, S wave, QRS complex, and RR-interval are representative features for distinguishing healthy persons from cardiovascular patients. The feature vector consists of 10 features: the average and standard deviation of each of these five parameters.
Before the features are detected and computed, the ECG signals undergo data preprocessing [18]. The maximum frequency of an ECG signal is typically less than 60 Hz, so a band-pass filter with cutoff frequencies of 1 Hz and 60 Hz is applied. A derivative filter is then applied to sharpen the Q, R, and S waves. Finally, signal squaring and sliding-window integration are used to locate the Q, R, and S waves.
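The derivative, squaring and sliding-window integration steps described above can be sketched in a few lines. This is only an illustrative sketch, not the letter's implementation: it assumes the 1 Hz to 60 Hz band-pass filter has already been applied, uses a plain first difference as the derivative filter, and takes a 0.15 s integration window and a 250 Hz sampling rate as hypothetical values. All function names are illustrative.

```python
def derivative(x):
    """First difference; emphasizes the steep slopes of the QRS complex."""
    return [x[i] - x[i - 1] for i in range(1, len(x))]


def squared(x):
    """Point-wise squaring; makes all values positive and amplifies large slopes."""
    return [v * v for v in x]


def moving_window_integral(x, width):
    """Running sum over the last `width` samples, divided by `width`."""
    out, acc = [], 0.0
    for i, v in enumerate(x):
        acc += v
        if i >= width:
            acc -= x[i - width]  # drop the sample that left the window
        out.append(acc / width)
    return out


def qrs_envelope(ecg, fs, window_s=0.15):
    """Envelope whose peaks mark candidate QRS locations.

    `ecg` is assumed to be band-pass filtered already; `window_s` is an
    assumed window length, not a value taken from the letter.
    """
    width = max(1, int(window_s * fs))
    return moving_window_integral(squared(derivative(ecg)), width)


# Synthetic one-second trace (hypothetical): flat baseline with one sharp spike.
fs = 250
ecg = [0.0] * fs
ecg[100] = 1.0
envelope = qrs_envelope(ecg, fs)
peak = max(range(len(envelope)), key=envelope.__getitem__)
print("envelope peaks near sample", peak)
```

On real data the envelope would be thresholded to locate each R point, from which the RR-interval and the neighbouring Q and S points can be measured.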

Cardiovascular Diseases Classifier Construction

The CDC is constructed by employing an SVM with a 10-dimensional feature vector. The algorithm uses Lagrange multipliers with a set of support vectors, a set of weights and an offset bias [19,20]. This letter focuses on the design of the CDC, which directly classifies each ECG signal as coming from a healthy (negative response) or an unhealthy (positive response) candidate. The performance of the CDC is characterized by OA, Se, Sp, AUC, Tr, Te, and Nf. OA, Se, Sp, and AUC relate to the accuracy of the CDC; Tr is the time required to train the CDC and Te is the time needed to classify the ECG signal. In this investigation, the CDC is trained and validated with the ECG datasets. For the negative response (Class 0), 500 samples from healthy candidates are used. For the positive response (Class 1), 125 bundle branch block samples, 125 myocardial infarction samples, 125 heart failure samples and 125 dysrhythmia samples are retrieved from the database. Table 1 lists the datasets for the binary CDC.
Table 1. Database specification of ECG data for CDC.

| Class 0 (Healthy/Negative Response) | Number of Samples | Class 1 (Unhealthy/Positive Response) | Number of Samples |
|---|---|---|---|
| PTB diagnostic (Healthy) | 500 | Bundle Branch Block | 125 |
| | | Myocardial Infarction | 125 |
| | | Heart Failure | 125 |
| | | Dysrhythmia | 125 |
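
The accuracy-related criteria can be made concrete with the usual confusion-matrix definitions. The sketch below assumes those standard formulas, which the letter does not spell out, with unhealthy (Class 1) as the positive response and healthy (Class 0) as the negative response. The counts in the example are made-up illustration values, not results from the letter.

```python
def accuracy_criteria(tp, tn, fp, fn):
    """Return (OA, Se, Sp) from confusion-matrix counts.

    tp: unhealthy samples classified as unhealthy
    tn: healthy samples classified as healthy
    fp: healthy samples classified as unhealthy
    fn: unhealthy samples classified as healthy
    """
    oa = (tp + tn) / (tp + tn + fp + fn)  # overall accuracy
    se = tp / (tp + fn)                   # sensitivity
    sp = tn / (tn + fp)                   # specificity
    return oa, se, sp


# Hypothetical counts for 500 unhealthy and 500 healthy samples.
oa, se, sp = accuracy_criteria(tp=490, tn=480, fp=20, fn=10)
print(f"OA={oa:.3f} Se={se:.3f} Sp={sp:.3f}")
```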
The CDC utilizes 10-fold cross validation for performance evaluation [21], and a third-order polynomial kernel function is used for the SVM analysis. Selecting from 1 to 10 of the 10 features gives a total of 1023 combinations (the sum of C(10, k) for k = 1, …, 10, i.e. 2^10 − 1), so 1023 configurations can be formulated. For the jth configuration fj, where j = 1, …, 1023, the corresponding criteria OA, Se, Sp, AUC, Tr, Te, and Nf are recorded. The main SVM settings are summarized below; otherwise, the default settings of the MATLAB toolbox are used:

Number of classes: Two;
Class 0: 500 healthy candidates;
Class 1: 125 bundle branch block candidates, 125 myocardial infarction candidates, 125 heart failure candidates, and 125 dysrhythmia candidates;
Feature vector: The maximum dimensionality is 10, consisting of {Q wave average, Q wave standard deviation, R wave average, R wave standard deviation, S wave average, S wave standard deviation, QRS complex average, QRS complex standard deviation, RR-interval average, RR-interval standard deviation};
Kernel function: 3rd order polynomial;
Fold of cross validation: Ten-fold.

In total, 1023 classifiers are constructed, one for each of the 1023 configurations; the results are tabulated in Table 2.
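Table 2. CDC of each configuration.

| fj | OA | Se | Sp | AUC | Tr (s) | Te (s) | Nf |
|---|---|---|---|---|---|---|---|
| f1 | 0.324 | 0.350 | 0.298 | 0.321 | 3.5 | 2.3 | 1 |
| f2 | 0.310 | 0.324 | 0.296 | 0.303 | 3.4 | 2.5 | 1 |
| f3 | 0.298 | 0.288 | 0.308 | 0.287 | 3.6 | 2.4 | 1 |
| … | … | … | … | … | … | … | … |
| f1021 | 0.986 | 0.988 | 0.984 | 0.972 | 4.9 | 3.4 | 10 |
| f1022 | 0.964 | 0.970 | 0.958 | 0.946 | 5.1 | 3.4 | 10 |
| f1023 | 0.970 | 0.974 | 0.966 | 0.949 | 4.3 | 3.5 | 10 |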
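
The count of 1023 configurations follows from taking every non-empty subset of the 10 features. The sketch below enumerates them with the standard library. The feature labels are shorthand chosen here, and the enumeration order (by subset size, then lexicographic) is an assumption that need not match the letter's indexing of f1 to f1023.

```python
from itertools import combinations

# Shorthand labels (hypothetical) for the 10 features named in the text.
FEATURES = [
    "Q_avg", "Q_std", "R_avg", "R_std", "S_avg", "S_std",
    "QRS_avg", "QRS_std", "RR_avg", "RR_std",
]


def feature_configurations(features):
    """Return every non-empty subset of `features` as a tuple."""
    configs = []
    for k in range(1, len(features) + 1):
        configs.extend(combinations(features, k))
    return configs


configs = feature_configurations(FEATURES)
print(len(configs))  # 2**10 - 1 = 1023
```

Each tuple would then select the columns used to train and cross-validate one SVM, with Nf equal to the length of the tuple.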

Multiple Criteria Decision Analysis of the Optimal CDC

In Table 2, seven criteria, namely OA, Se, Sp, AUC, Tr, Te, and Nf, are employed for the performance evaluation of the 1023 scenarios. Multiple criteria decision making (MCDM) has been utilized in many areas since the 1990s [22]. Here it is applied using the particular characteristics of cardiovascular diseases. By allocating appropriate weightings, the analytic hierarchy process (AHP) is adopted to evaluate the 1023 scenarios and identify the best one. The weightings are based on the feedback of 200 volunteers; for each volunteer m (m = 1, …, 200), a 7 × 7 pairwise comparison matrix A_m is formulated. It is intuitively understood that Te should be as low as possible while the accuracy is kept at an acceptable level. Since the speed of detection is the prime factor of importance, the MCDA reveals that high weightings should be assigned to OA, Se, Sp, AUC, and Te. These five parameters are referred to as the primary parameters. Nf is typically preferred to be small for speedy detection, and Tr does not affect the detection time; hence Nf and Tr are classified as the secondary parameters. The volunteers are required to fill in the entries a_{m,ij} of Table 3, where i and j are between 1 and 7. The AHP-based MCDA CDC is referred to as the new classifier (NC). The traditional classifiers (TC) in [3,7,8] are also evaluated. Both the NC and the TC are applied to the three feature groups (non-fiducial, fiducial and hybrid features) in [3,7,8]. The performance comparison between the NC and the TC is tabulated in Table 4. Following the discussion above, the values of a_{m,ij} are assigned according to the guidelines listed after Table 4.
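Table 3. Pairwise comparison 7 × 7 matrix A_m.

| | OA | Se | Sp | AUC | Tr | Te | Nf |
|---|---|---|---|---|---|---|---|
| OA | 1 | a_{m,12} | a_{m,13} | a_{m,14} | a_{m,15} | a_{m,16} | a_{m,17} |
| Se | a_{m,21} | 1 | a_{m,23} | a_{m,24} | a_{m,25} | a_{m,26} | a_{m,27} |
| Sp | a_{m,31} | a_{m,32} | 1 | a_{m,34} | a_{m,35} | a_{m,36} | a_{m,37} |
| AUC | a_{m,41} | a_{m,42} | a_{m,43} | 1 | a_{m,45} | a_{m,46} | a_{m,47} |
| Tr | a_{m,51} | a_{m,52} | a_{m,53} | a_{m,54} | 1 | a_{m,56} | a_{m,57} |
| Te | a_{m,61} | a_{m,62} | a_{m,63} | a_{m,64} | a_{m,65} | 1 | a_{m,67} |
| Nf | a_{m,71} | a_{m,72} | a_{m,73} | a_{m,74} | a_{m,75} | a_{m,76} | 1 |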
Table 4. Performance of NC versus TC.

| Method | Datasets (Number of Samples) | Features | Results (Related Work TC) | Results (New Work NC) |
|---|---|---|---|---|
| Two-layered Hidden Markov Model [3] | MIT-BIH database (34,799 samples from 16 Arrhythmia candidates) | P-R interval, QRS complex interval and T sub-wave interval | OA = 0.992; Se = 0.993; Sp = 0.992; AUC = 0.971; Tr = 3.7 s; Te = 2.7 s; Nf = 3 | OA = 0.987; Se = 0.99; Sp = 0.984; AUC = 0.966; Tr = 3.4 s; Te = 1.9 s; Nf = 2 |
| Cross wavelet transform with a threshold based classifier [7] | The PTB Diagnostic ECG database (18,489 samples from 52 healthy control candidates and 148 myocardial infarction candidates) | Total sum of wavelet cross spectrum value and total sum of wavelet coherence | OA = 0.976; Se = 0.973; Sp = 0.988; AUC = 0.949; Tr = 6.2 s; Te = 4.1 s; Nf = 6 | OA = 0.966; Se = 0.978; Sp = 0.958; AUC = 0.933; Tr = 5.6 s; Te = 2.8 s; Nf = 4 |
| SVM [8] | CU database, VF database, and AHA database (40,956 samples from 67 Ventricular fibrillation and rapid ventricular tachycardia candidates) | Leakage, count 1, count 2, count 3, A1, A2, A3, time delay, FSMN, cover bin, frequency bin, kurtosis, and complexity | OA = 0.952; Se = 0.951; Sp = 0.951; AUC = 0.943; Tr = 4.8 s; Te = 2.7 s; Nf = 13 | OA = 0.947; Se = 0.952; Sp = 0.942; AUC = 0.937; Tr = 4.5 s; Te = 1.6 s; Nf = 10 |

The entries a_{m,ij} are assigned as follows:

Write 1 if i and j are of equal importance;
Write 3 if i is slightly more important than j;
Write 5 if i is more important than j;
Write 7 if i is strongly more important than j;
Write 9 if i is absolutely more important than j.

The pairwise comparison 7 × 7 matrix A_m is then normalized: the entries a_{m,ij} of A_m are converted into the entries a^norm_{m,ij} of A^norm_m (Equation (1)). By averaging each row of A^norm_m, the corresponding 7 × 1 priority matrix w_m with entries w_{m,k}, k = 1, …, 7, is obtained. Let C_{p,q} (p = 1, …, 7 and q = 1, …, 1023) denote the pth criterion of the qth scenario of the CDC. C_{p,q} is normalized to become C_{p,q,norm}, and the final score of each scenario, AHP_q, is evaluated from the priorities and the normalized criteria. To avoid inconsistency in the construction of pairwise comparison matrices, the optimal CDC is concluded from the highest value of AHP_q [23]. The optimal CDC is found to be scenario f652, whose feature vector is composed of the average of Q, standard deviation of Q, standard deviation of S, average of QRS, standard deviation of QRS, average of RR-interval, and standard deviation of RR-interval. The criteria underlying AHP_652 are as follows: OA = 0.988, Se = 0.992, Sp = 0.985, AUC = 0.982, Tr = 4.5 s, Te = 2.8 s, Nf = 7.
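
The weighting and scoring steps can be sketched as follows. This is a minimal sketch under stated assumptions, not the letter's implementation: normalization is taken to be the usual AHP column normalization (each entry divided by its column sum), the priorities of several volunteers are combined by a plain arithmetic mean, and the score is a weighted sum of criteria that are already normalized so that larger is better. The function names and the toy 3-criterion matrices are hypothetical; the letter uses 7 × 7 matrices from 200 volunteers.

```python
def priority_vector(a):
    """Column-normalize a pairwise comparison matrix, then average each row."""
    n = len(a)
    col_sums = [sum(a[i][j] for i in range(n)) for j in range(n)]
    norm = [[a[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return [sum(row) / n for row in norm]


def mean_priority(matrices):
    """Average the priority vectors of several volunteers (assumed aggregation)."""
    vectors = [priority_vector(a) for a in matrices]
    n = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(n)]


def ahp_score(weights, criteria_norm):
    """Weighted sum of normalized criteria for one scenario."""
    return sum(w * c for w, c in zip(weights, criteria_norm))


# Toy example with 3 criteria and 2 volunteers (hypothetical values).
A1 = [[1, 3, 5], [1 / 3, 1, 3], [1 / 5, 1 / 3, 1]]
A2 = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
w = mean_priority([A1, A2])

# Normalized criteria of two hypothetical scenarios (larger is better).
scores = {
    "fa": ahp_score(w, [0.9, 0.2, 0.5]),
    "fb": ahp_score(w, [0.8, 0.9, 0.7]),
}
best = max(scores, key=scores.get)
print(w, scores, best)
```

Criteria for which smaller is better (Tr, Te, Nf) would have to be inverted during normalization before they enter such a weighted sum; how the letter does this is not stated here.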

AHP Scores and Analysis

The performance scores of the NC and the TC [3,7,8] are evaluated and tabulated in Table 4. In this investigation, the algorithms in the related work have been evaluated with the addition of MCDA using AHP, which obtains a best scenario by assigning weights to the seven criteria. As the new work and the related works share the same application area, the classification of cardiovascular diseases, the weight assignment can be reused to facilitate performance comparisons. From Table 4, the percentage changes of the NC relative to the TC are evaluated as follows:

Compared with [3]: OA = −0.504%, Se = −0.302%, Sp = −0.806%, AUC = −0.515%, Tr = −8.108%, Te = −29.630%, and Nf = −33.333%. The testing time is thus reduced by about 30%, while ∼99.5% of the accuracy is retained.
Compared with [7]: OA = −1.025%, Se = 0.514%, Sp = −3.036%, AUC = −1.686%, Tr = −9.677%, Te = −31.707%, and Nf = −33.333%. The testing time is thus reduced by about 30%, while ∼99% of the accuracy is retained.
Compared with [8]: OA = −0.525%, Se = 0.105%, Sp = −0.946%, AUC = −0.636%, Tr = −6.250%, Te = −40.741%, and Nf = −23.077%. The testing time is thus reduced by about 40%, while ∼99.5% of the accuracy is retained.

The analysis reveals that in the NC the speed of detection, measured by the testing time, is improved by 30%–40% while the accuracy is retained at ∼99%–99.5% of that of the TC. The reductions in OA, Se, and Sp are below 1% relative to [3] and [8]; relative to [7], OA and Sp drop by about 1% and 3%, respectively, while Se improves slightly. Thus the AHP-based MCDA CDC is a reliable and speedy detection scheme for cardiovascular diseases.
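
The percentage changes can be recomputed directly from Table 4. The sketch below does so for the comparison with [3], taking the change as (NC − TC) / TC × 100, which is an assumption about the convention that matches the signs reported above. The values are copied from Table 4; the variable names are illustrative.

```python
def pct_change(tc, nc):
    """Percentage change of the new classifier (NC) relative to the traditional one (TC)."""
    return (nc - tc) / tc * 100.0


# (TC, NC) pairs for the comparison with [3], copied from Table 4.
ref3 = {
    "OA": (0.992, 0.987),
    "Se": (0.993, 0.99),
    "Sp": (0.992, 0.984),
    "AUC": (0.971, 0.966),
    "Tr": (3.7, 3.4),
    "Te": (2.7, 1.9),
    "Nf": (3, 2),
}

changes = {name: pct_change(tc, nc) for name, (tc, nc) in ref3.items()}
for name, value in changes.items():
    print(f"{name}: {value:+.3f}%")
```

A negative value is a loss for OA, Se, Sp and AUC but a gain for Tr, Te and Nf, where smaller is better.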

Conclusions

In this letter, an optimal cardiovascular diseases classifier (CDC) has been proposed and implemented by using the analytic hierarchy process (AHP) to facilitate multiple criteria decision analysis (MCDA). The four most common types of cardiovascular diseases, namely bundle branch block, myocardial infarction, heart failure, and dysrhythmia, are considered. Seven criteria, namely OA, Se, Sp, AUC, Tr, Te, and Nf, are used to derive the AHP score of the MCDA and thereby select the optimal CDC. The optimal CDC, the new classifier, achieves the following scores: OA = 0.988, Se = 0.992, Sp = 0.985, AUC = 0.982, Tr = 4.5 s, Te = 2.8 s, Nf = 7. Analysis and comparison with previous works show that the speed of detecting cardiovascular diseases has been increased by 30%–40% while the accuracy is retained at ∼99%–99.5% of that of traditional classifiers. In conclusion, the AHP-based MCDA CDC is a reliable and speedy detection scheme for cardiovascular diseases.
  13 in total

1.  PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals.

Authors:  A L Goldberger; L A Amaral; L Glass; J M Hausdorff; P C Ivanov; R G Mark; J E Mietus; G B Moody; C K Peng; H E Stanley
Journal:  Circulation       Date:  2000-06-13       Impact factor: 29.690

2.  A clinical and follow-up study of right and left bundle branch block.

Authors:  M Rotman; J H Triebwasser
Journal:  Circulation       Date:  1975-03       Impact factor: 29.690

3.  Predictors of congestive heart failure in the elderly: the Cardiovascular Health Study.

Authors:  J S Gottdiener; A M Arnold; G P Aurigemma; J F Polak; R P Tracy; D W Kitzman; J M Gardin; J E Rutledge; R C Boineau
Journal:  J Am Coll Cardiol       Date:  2000-05       Impact factor: 24.094

4.  Prediction of serious arrhythmic events after myocardial infarction: signal-averaged electrocardiogram, Holter monitoring and radionuclide ventriculography.

Authors:  D L Kuchar; C W Thorburn; N L Sammel
Journal:  J Am Coll Cardiol       Date:  1987-03       Impact factor: 24.094

5.  Ventricular fibrillation and tachycardia classification using a machine learning approach.

Authors:  Qiao Li; Cadathur Rajagopalan; Gari D Clifford
Journal:  IEEE Trans Biomed Eng       Date:  2013-07-26       Impact factor: 4.538

6.  Cardiac dysrhythmia following pneumonectomy. Clinical correlates and prognostic significance.

Authors:  M J Krowka; P C Pairolero; V F Trastek; W S Payne; P E Bernatz
Journal:  Chest       Date:  1987-04       Impact factor: 9.410

7.  Visual sensor based abnormal event detection with moving shadow removal in home healthcare applications.

Authors:  Young-Sook Lee; Wan-Young Chung
Journal:  Sensors (Basel)       Date:  2012-01-05       Impact factor: 3.576

8.  Wavelet-based watermarking and compression for ECG signals with verification evaluation.

Authors:  Kuo-Kun Tseng; Xialong He; Woon-Man Kung; Shuo-Tsung Chen; Minghong Liao; Huang-Nan Huang
Journal:  Sensors (Basel)       Date:  2014-02-21       Impact factor: 3.576

9.  Prediction of cardiovascular risk using Framingham, ASSIGN and QRISK2: how well do they predict individual rather than population risk?

Authors:  Tjeerd-Pieter van Staa; Martin Gulliford; Edmond S-W Ng; Ben Goldacre; Liam Smeeth
Journal:  PLoS One       Date:  2014-10-01       Impact factor: 3.240

10.  Implementation of a data packet generator using pattern matching for wearable ECG monitoring systems.

Authors:  Yun Hong Noh; Do Un Jeong
Journal:  Sensors (Basel)       Date:  2014-07-15       Impact factor: 3.576

