
Heart sound classification using Gaussian mixture model.

Madhava Vishwanath Shervegar¹, Ganesh V Bhat².

Abstract

BACKGROUND: This article presents a new method for classifying heart sounds using loudness features extracted from the heart sound.
MATERIALS AND METHODS: The method includes the following 3 main steps. First, the heart sound, which is usually noisy, is heavily filtered by a 6th-order Chebyshev type I filter. The heart sound is then segmented using the event synchronous method to separate it into the first heart sound, the systole, the second heart sound, and the diastole. In the second step, the heart sound features, namely the maximum loudness index and the minimum loudness index, are obtained from the spectrogram of the sound by taking the row means. In the third step, the heart sound is classified using the Gaussian mixture model approach.
RESULTS: This method has been tested on a very large database of heart sounds consisting of over 3000 heart sound recordings, with a success rate of 97.77%.
CONCLUSION: Only 2 features are used in this method, namely the minimum loudness index and the maximum loudness index. Classification of sounds using these 2 features yields high accuracy even under noisy conditions and is comparable to state-of-the-art techniques.


Keywords: Gaussian mixture model classifier; event synchronous segmentation; loudness; phonocardiography; spectrogram

Year: 2018 PMID: 31595231 PMCID: PMC6726299 DOI: 10.1016/j.pbj.0000000000000004

Source DB: PubMed Journal: Porto Biomed J ISSN: 2444-8664

Introduction

Heart diseases, including valve disorders, are among the leading causes of death. Heart sounds result from the electromechanical activity of the heart during each heartbeat.[1] In normal heart sounds, the heart valves produce time-varying, low-frequency transient signals. Pathological cardiac sounds, or murmurs,[2] result from turbulence in blood flow caused by stenosis or regurgitation at the cardiac valves. The automatic classification of pathology in heart sounds has been described in the literature for over 50 years, but accurate diagnosis remains a significant challenge. Gerbarg et al[3] were the first to publish on the automatic classification of pathology in heart sounds (specifically to aid the identification of children with rheumatic heart disease) and used a threshold-based method.

The methods for heart sound classification are grouped into 4 categories: (1) artificial neural network (ANN)-based classification; (2) support vector machine-based classification; (3) hidden Markov model (HMM)-based classification; and (4) clustering-based classification. The current prominent works in this field are summarized in Table 1. Important notes about the evaluation of each method, such as whether the data were split into training and test sets, are also reported. For brevity, only the notable studies with sizeable datasets are summarized in detail below.

Uguz[4] used an ANN with features from a discrete wavelet transform and a fuzzy logic approach to perform 3-class classification: normal, pulmonary stenosis, and mitral stenosis. With a 50–50 train–test split of a dataset of 120 subjects, they reported 100% sensitivity, 95.24% specificity, and 98.33% average accuracy for the 3 classes. Uguz[5] also used time–frequency features as an input to an ANN. A total of 120 heart sound recordings, split 50–50 into train–test, was used. They reported 90.48% sensitivity, 97.44% specificity, and 95% accuracy for a 3-class classification problem (normal, pulmonary, and mitral stenosis heart valve diseases).

Ari et al[6] used a least square support vector machine (LSSVM) method for classification of normal and pathological heart sounds based on wavelet features. The performance of the proposed method was evaluated on 64 recordings comprising normal and abnormal cases. The LSSVM was trained and tested on a 50–50 split (32 patients in each set), and the authors reported an 86.72% accuracy on their test dataset. Zheng et al[7] decomposed heart sounds using wavelet packets and then extracted the fraction of the energy and the sample entropy as features for the support vector machine (SVM) input. They tested on 40 normal and 67 abnormal patients and reported a 97.17% accuracy, 93.48% sensitivity, and 98.55% specificity. Patidar et al[8] used the tunable-Q wavelet transform as an input to LSSVM with varying kernel functions. They used a dataset of 4628 cycles from 163 heart sound recordings (from an unknown number of patients) and reported a 98.8% sensitivity and 99.3% specificity, but without stratifying patients (having mutually exclusive patients in the testing and training sets), and therefore overfitting to their data. Gharehbaghi et al[9] used frequency band power over varying length frames during systole as input features. They used a growing-time SVM to classify pathological and innocent murmurs. When using a 50–50 train–test split (from a total of 30 patients with aortic stenosis (AS), 26 with innocent murmurs, and 30 normals), they showed 86.4% sensitivity and 89.3% specificity.

Saracoglu[10] applied an HMM in an unconventional manner, fitting an HMM to the frequency spectrum extracted from entire heart cycles. The exact classification procedure using the HMMs is unclear, but it is thought that they trained 4 HMMs and then evaluated the posterior probability of the features given each model to classify the recordings. They optimized the HMM parameters and the principal component analysis-based feature selection on a training set and reported 95% sensitivity, 98.8% specificity, and 97.5% accuracy on a test dataset of 60 recordings.

Quiceno-Manrique et al[11] used a simple K nearest neighbour (kNN) classifier with features from various time–frequency representations. They applied it to a subset of 16 normal and 6 pathological patients and reported 98% accuracy for discriminating between normal and pathologic beats. However, the kNN classifier parameters were optimized on the test set, indicating a likelihood of over-training. Avendano-Valencia et al[12] also used time–frequency features and a kNN classification approach for identifying normal and murmur patients. To extract the relevant time–frequency features, 2 specific approaches for dimensionality reduction were presented in their method, namely feature extraction by linear decomposition and tiling partition of the time–frequency plane. The experiments were carried out using 26 normal and 19 abnormal recordings, and they reported an average accuracy of 99.0% when using 11-fold cross-validation with grid-based dimensionality reduction.

Karar et al[13] segmented the heart sound using the time domain properties of the heart sound signal. The segmented cycle was then preprocessed with the discrete wavelet transform, and the largest Lyapunov exponents were calculated to generate the dynamical features of the heart sound time series. Finally, these Lyapunov exponents were fed to a rule-based classification tree to give the final decision on the heart health status. The developed method was tested on 22 datasets of normal heart sounds and murmurs (3 sets for normal healthy hearts and 19 sets representing 3 types of heart abnormality, divided into 4, 10, and 5 datasets for AS, aortic insufficiency, and ventricular septal defect, respectively) with a success rate of 95.5%. Nabih-Ali et al[14], in their review paper, discussed the approaches taken by different authors for effective de-noising, segmentation, and classification of heart sounds, along with their advantages and disadvantages.

Table 1

Comparative results of previous classification methods

In our work, heart sounds are classified using loudness index features obtained from the spectrogram. The maximum and minimum values of the loudness index are used for classification. With only 2 features, we obtained a classification accuracy of 97.77% even under noisy conditions.

Materials and methods

A very large dataset of heart sounds available from the Physionet database is used in this study.[15] Each heart sound is subjected to 2 preprocessing steps: de-noising and normalization.

Heart sound de-noising

Heart sounds obtained using diagnostic tools are usually contaminated with noise from various sources. This noise hinders the early detection of mild heart sounds in the phonocardiogram, so filtering to remove such artifacts is essential. Filtering should remove the unwanted noise while preserving all diagnostic information required for analysis of the phonocardiogram. The heart sounds taken from the Physionet database[15] are contaminated with various types of noise. Each selected heart sound is heavily filtered to remove as much noise as possible. A 6th-order Chebyshev low-pass filter with a cut-off frequency of 140 Hz is used for this purpose. The noise lies at high frequencies, while the diagnostic information lies at low frequencies, so low-pass filtering removes the high-frequency noise. The filtered heart sound is then adjusted to the required amplitude level by normalization, which divides each heart sound sample by the maximum sample value of that heart sound. The resulting signal is passed to the second phase of the analysis, namely segmentation.
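The de-noising and normalization steps can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the passband ripple (`RIPPLE_DB`) and the use of zero-phase filtering are assumptions, since the text gives only the filter order, type, and cut-off frequency. The 2000 Hz sampling rate is the one reported for the Physionet recordings, and the test signal is synthetic.

```python
import numpy as np
from scipy.signal import cheby1, sosfiltfilt

FS = 2000          # sampling rate in Hz, as reported for the Physionet recordings
CUTOFF_HZ = 140    # low-pass cut-off frequency from the text
RIPPLE_DB = 1.0    # ASSUMPTION: passband ripple is not stated in the text


def denoise_and_normalize(pcg, fs=FS):
    """Low-pass filter a heart sound and scale it to the range [-1, 1]."""
    # 6th-order Chebyshev type I low-pass filter, as second-order sections.
    sos = cheby1(6, RIPPLE_DB, CUTOFF_HZ, btype="low", fs=fs, output="sos")
    # ASSUMPTION: forward-backward (zero-phase) filtering, which applies the
    # filter twice; the text does not say how the filter was applied.
    filtered = sosfiltfilt(sos, pcg)
    # Divide by the largest absolute sample so the result lies in [-1, 1].
    peak = np.max(np.abs(filtered))
    return filtered / peak if peak > 0 else filtered


# Synthetic example: a 50 Hz component plus 600 Hz interference.
t = np.arange(0, 2.0, 1 / FS)
noisy = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 600 * t)
clean = denoise_and_normalize(noisy)
spectrum = np.abs(np.fft.rfft(clean))  # bin spacing is 0.5 Hz for 2 s of data
```

Dividing by the largest absolute value, rather than the largest signed value, keeps negative excursions inside [−1, 1] as well.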

Segmentation of heart sounds

After de-noising and normalization of the heart sound, the phonocardiogram (PCG) is segmented by the event synchronous method. Some previous studies have used electrocardiogram recordings to segment the heart sounds.[1] However, these signals are not always available along with the PCG data. Spectrum analysis has been used to separate the original heart sound signals into individual cycles; the signals are down-sampled by a factor of 4, which gives the highest power of the first and second heart sounds (S1 and S2) in each cycle, using Daubechies 4 and 5 wavelet coefficients, respectively.[16] Segmentation based on cycle frequency and dynamic clustering in the time–frequency domain has also been reported.[17] We have described the procedure for segmentation of heart sounds using the event synchronous method.[18] This method is much simpler than the other methods mentioned in the literature and is computationally more efficient. Figure 1 shows the segmented heart sound for an abnormal heart sound taken from the Physionet database. The boundaries and the heart sounds are clearly distinguishable. A brief step-by-step procedure is given here: (1) The spectrogram is obtained from the cardiac sound signal using a window of roughly 3 ms. (2) For a better interpretation of the pitch and loudness of heart sound murmurs, the spectrogram is converted to the Bark scale and then smoothed using a Hanning window.[19] (3) The sensation of loudness is derived from the spectrogram by summing the amplitudes of all frequency bands of the sound:

Figure 1

Segmented heart sound (blue) and segmentation boundaries (S1 red, S2 green).

$$L(t) = \sum_{k=1}^{N} E_k(t) \qquad (1)$$

where E_k(t) represents the magnitude of the kth frequency band present in the spectrogram at time t, L(t) is the loudness index, and there is a total of N such bands. (4) The event detection function is found by differentiating the loudness index function to obtain peaks that correspond to the onset and offset transients. (5) Systole and diastole are recognized by their durations, with systole being shorter than diastole. (6) A single cardiac cycle is obtained from the segmentation of all sounds.
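Steps (1), (3), and (4) above can be sketched in a few lines. This is an illustrative simplification under stated assumptions, not the published procedure: the Bark-scale conversion of step (2) is omitted, the frames do not overlap, and the smoothing is applied to the loudness curve with a kernel length (`smooth`) chosen only for this example. The input is a synthetic tone burst, not a heart sound.

```python
import numpy as np


def loudness_index(x, fs, win_s=0.003):
    """Sum the spectrogram magnitudes over all frequency bands, frame by frame."""
    n = max(2, int(round(win_s * fs)))            # roughly 3 ms per frame
    n_frames = len(x) // n
    frames = x[: n_frames * n].reshape(n_frames, n) * np.hanning(n)
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))   # one row of bands per frame
    return magnitudes.sum(axis=1)                 # sum over the frequency bands


def event_detection(loudness, smooth=15):
    """Differentiate the smoothed loudness curve to expose onsets and offsets."""
    kernel = np.hanning(smooth)                   # ASSUMPTION: 15-frame smoothing
    kernel /= kernel.sum()
    smoothed = np.convolve(loudness, kernel, mode="same")
    return np.diff(smoothed, prepend=smoothed[0])


# Synthetic example: a 100 Hz burst between 0.2 s and 0.3 s in a silent signal.
fs = 2000
t = np.arange(0, 1.0, 1 / fs)
x = np.zeros_like(t)
x[400:600] = np.sin(2 * np.pi * 100 * t[400:600])

loud = loudness_index(x, fs)
edf = event_detection(loud)
frame_s = 6 / fs                                  # 6 samples per frame at 2000 Hz
onset_time = np.argmax(edf) * frame_s             # largest rise in loudness
offset_time = np.argmin(edf) * frame_s            # largest fall in loudness
```

The positive peak of the event detection function marks where the sound starts and the negative peak marks where it ends; in a real recording there would be one such pair for each of S1 and S2 in every cycle.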

Feature extraction

Extraction of loudness features

It is known from auscultation and from the literature that normal heart sounds contain silent periods, namely the systole and the diastole, where the intensity of the sound is minimal. In patients suffering from valve disorders such as stenosis and regurgitation, certain high-pitched sounds are also audible. These sounds are less loud (grade 1 to grade 4 sounds) than the first and second heart sounds. For grade 5 and grade 6 abnormal sounds, the loudness is much greater than that of the first and second heart sounds. More energy is concentrated in the systoles and diastoles of abnormal sounds than in those of normal sounds. The loudness index is therefore a measure that can be used to identify and distinguish heart sounds with different loudness indices. To determine the features for classification, we repeat steps 1 to 3 of the previous stage. The maximum and minimum values of the heart sound loudness index are found and used as the 2 features for classification. Maximum loudness indices are high for abnormal sounds, while minimum loudness indices are low for normal sounds. Figure 2 shows a single normal cardiac cycle with its loudness function. Figure 3 shows a single abnormal cardiac cycle with its loudness function. Sharp peaks corresponding to murmurs are visible in the abnormal sounds.
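As a small illustration of the feature vector, the sketch below reduces a loudness curve to its minimum and maximum. The two curves are hypothetical values invented for this example (a cycle with quiet systole and diastole, and a cycle with extra energy between the heart sounds); they are not taken from the database.

```python
import numpy as np


def loudness_features(loudness_curve):
    """Return [minimum loudness index, maximum loudness index] for one cycle."""
    curve = np.asarray(loudness_curve, dtype=float)
    return np.array([curve.min(), curve.max()])


# Hypothetical loudness curves for two cardiac cycles (illustrative values only).
quiet_cycle = [0.1, 2.0, 0.2, 0.1, 1.5, 0.1]
murmur_cycle = [0.4, 2.1, 1.2, 0.9, 1.6, 0.5]

# One row per cycle: column 0 is the minimum, column 1 is the maximum.
features = np.vstack([loudness_features(c) for c in (quiet_cycle, murmur_cycle)])
```

Each recording therefore contributes a single point in a 2-dimensional feature space, which is the input to the classifier.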

Figure 2

Normal heart sound and the loudness curve.

Figure 3

Abnormal heart sound with loudness curve.


Classification of heart sounds using Gaussian mixture model

The literature survey reveals many classification methodologies used for heart sound classification, as listed in Table 1. Euclidean distance (or, in general, Minkowski distance) is the most commonly used distance metric. One drawback of the Euclidean metric is that the feature with the largest scale dominates the distance. A solution to this problem is to normalize the continuous features to a common range or variance. Another drawback of the Euclidean metric is that linear correlation between features can distort the distance. Euclidean distance always gives hyper-spherical clusters. Mahalanobis distance, used in the Gaussian mixture model (GMM), is expected to overcome these limitations because it provides hyper-ellipsoidal clusters.

The database consisting of 2 sound types is assumed to be generated by 2 Gaussian processes (each Gaussian could itself have been generated by many biochemical processes inside the heart having arbitrary stochastic distributions). Therefore, the entropy of the resultant distribution tends toward its maximum. Hence, the resultant distribution is taken to be Gaussian, which has the maximum entropy among all distributions with a given mean and covariance. The probability density function (pdf) is given by Eq. (2):

$$p(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right) \qquad (2)$$

where d is the dimensionality. The ML estimators of μ and Σ are computed by

$$\hat{\mu} = \frac{1}{N}\sum_{n=1}^{N} x_n, \qquad \hat{\Sigma} = \frac{1}{N}\sum_{n=1}^{N}(x_n-\hat{\mu})(x_n-\hat{\mu})^{T}$$

The objective function is formed from Eq. (2) by summing the class conditional density over all the classes for a feature in the feature vector, and then taking the product over all the features, assuming the features are linearly independent. This likelihood-based objective function is maximized by the EM[20] algorithm, which is a nonlinear optimization method. It optimizes the log likelihood over the entire feature space, including both the observed data and the hidden information embedded in the data. The EM algorithm consists of an Expectation step (E-step) and a Maximization step (M-step). In the E-step, the posterior density is computed from the conditional density using Bayes rule.[21] In the M-step, the current model is replaced with a new model that represents the features better, such that the log likelihood is greater than that of the previous iteration. The iterations are repeated until the new estimate gives the same model and there is no further improvement. The algorithm is briefly described here:

Input: A set of N feature vectors E = {x_1, x_2, …, x_N} and a model structure Λ = {μ_k, Σ_k, α_k}, k = 1, …, K, where the μ's and Σ's are parameters of the Gaussian models and the α's are prior parameters subject to α_k ≥ 0 and Σ_k α_k = 1.

Output: Trained model parameters Λ that maximize the data likelihood

$$P(E \mid \Lambda) = \prod_{n=1}^{N}\sum_{k=1}^{K} \alpha_k\, p(x_n \mid \lambda_k)$$

and a partition of the data vectors given by the cluster identity vector Y = {y_1, …, y_N}, y_n ∊ {1, …, K}.

Steps:

(1) (Initialization) Initialize the model parameters Λ.

(2) (E-step) The posterior probability of model k, given a data vector x_n and the current model parameters Λ, is estimated as

$$P(k \mid x_n, \Lambda) = \frac{\alpha_k\, p(x_n \mid \lambda_k)}{\sum_{j=1}^{K} \alpha_j\, p(x_n \mid \lambda_j)}$$

where the pdf p(x|λ_k), with λ_k = (μ_k, Σ_k), is given in Eq. (2).

(3) (M-step) The ML re-estimation of the model parameters Λ is given by

$$\mu_k = \frac{\sum_{n=1}^{N} P(k \mid x_n, \Lambda)\, x_n}{\sum_{n=1}^{N} P(k \mid x_n, \Lambda)}, \qquad \Sigma_k = \frac{\sum_{n=1}^{N} P(k \mid x_n, \Lambda)\,(x_n-\mu_k)(x_n-\mu_k)^{T}}{\sum_{n=1}^{N} P(k \mid x_n, \Lambda)}, \qquad \alpha_k = \frac{1}{N}\sum_{n=1}^{N} P(k \mid x_n, \Lambda)$$

(4) (Stop) If P(E|Λ) converges, stop; otherwise go back to Step 2. For each data vector x_n, set

$$y_n = \arg\max_{k} P(k \mid x_n, \Lambda)$$
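The EM procedure described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' code: the initialization (means placed at the extremes of the first feature, shared starting covariance), the small regularization term `REG`, and the convergence tolerance are choices made for this example, and the 2-feature data are synthetic.

```python
import numpy as np

REG = 1e-6  # ASSUMPTION: small diagonal term to keep covariances invertible


def gaussian_pdf(X, mu, cov):
    """Multivariate Gaussian density evaluated at every row of X."""
    d = X.shape[1]
    diff = X - mu
    maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
    return np.exp(-0.5 * maha) / np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))


def fit_gmm(X, K=2, n_iter=200, tol=1e-6):
    """Fit a K-component Gaussian mixture with EM; return parameters and labels."""
    N, d = X.shape
    # ASSUMPTION: deterministic start, means spread along the first feature.
    order = np.argsort(X[:, 0])
    mu = X[order[np.linspace(0, N - 1, K).astype(int)]].copy()
    cov = np.array([np.cov(X.T) + REG * np.eye(d) for _ in range(K)])
    alpha = np.full(K, 1.0 / K)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each vector.
        weighted = np.stack(
            [alpha[k] * gaussian_pdf(X, mu[k], cov[k]) for k in range(K)], axis=1
        )
        total = weighted.sum(axis=1, keepdims=True)
        resp = weighted / total
        # M-step: re-estimate priors, means, and covariances.
        Nk = resp.sum(axis=0)
        alpha = Nk / N
        mu = (resp.T @ X) / Nk[:, None]
        for k in range(K):
            diff = X - mu[k]
            cov[k] = (resp[:, k, None] * diff).T @ diff / Nk[k] + REG * np.eye(d)
        # Stop when the log likelihood no longer improves.
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return alpha, mu, cov, resp.argmax(axis=1)


# Synthetic 2-feature data: columns are [min loudness, max loudness].
rng = np.random.default_rng(1)
low = rng.normal([0.2, 1.0], [0.05, 0.2], size=(100, 2))
high = rng.normal([1.0, 3.0], [0.1, 0.3], size=(100, 2))
X = np.vstack([low, high])
alpha, mu, cov, labels = fit_gmm(X)
```

Because each component keeps its own full covariance matrix, the fitted clusters can be elongated or tilted, which is the hyper-ellipsoidal behaviour attributed to the Mahalanobis distance above. For long feature vectors or extreme outliers, the densities should be computed in the log domain to avoid underflow.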

Results

The heart sounds are selected from the Physionet database and are subjected to preprocessing steps namely filtering by using a 6th-order type 1 Chebyshev filter. The heart sounds are heavily filtered to remove maximum noise from the heart sounds (as discussed in the “Heart sound de-noising” section). The filtered heart sounds are then segmented using the event synchronous method to extract a single cardiac cycle of each heart sound (as discussed in the “Segmentation of heart sounds” section) and shown in Figure 1. The loudness features are extracted from the spectrogram of a single cardiac cycle (as discussed in “Feature extraction” section) of the normal heart sound as shown in Figure 2. The same is repeated for abnormal sounds as shown in Figure 3. The heart sounds are then classified for pathology using GMM-based classifier. The classification results in 2 clusters of abnormal and normal heart sounds as shown in Figure 4.

Figure 4

EM plot of all observations (green: normal sounds, yellow: abnormal sounds).

Discussion

Database used

Heart sounds present in the Physionet heart sound database[1] were taken from several people around the world, collected in either clinical or nonclinical environments, from pathological patients. The Challenge training set consists of 5 databases (A through E) containing a total of over 3000 heart sound recordings. The duration of the heart sounds varies from 5 seconds to just over 120 seconds. The heart sound recordings were obtained from different locations on the body. The 4 locations used for acquisition are the aortic area, pulmonic area, tricuspid area, and mitral area. Abnormal heart sounds were taken from patients with a confirmed cardiac diagnosis. The patients suffered from a variety of illnesses such as heart valve defects and coronary artery disease. Heart valve defects cover mitral valve prolapse, mitral regurgitation, AS, and valvular surgery. Pathological patients included both children and adults. Each subject/patient contributed between 1 and 6 heart sound recordings. All recordings have a sampling rate of 2000 Hz and are in .wav format.

Classification of heart sounds

The heart sounds are selected from the Physionet database and preprocessed by passing them through a 6th-order type 1 Chebyshev filter to remove noise. The heart sounds are normalized to the range [−1, 1]. The normalized heart sounds are processed using the event synchronous segmentation procedure. This results in cycles of heartbeat of approximately 3 seconds duration. Two datasets are formed from the sounds in the Physionet database, one of abnormal and one of normal heart sounds. Each dataset is divided into a training set and a test set: 50% of the sounds make up the training set, while the remaining 50% make up the test set. The sounds are then clustered using the GMM classifier. Figure 5 shows the log-likelihood function plot for the sounds. The function is negative; it increases steadily and then stabilizes at a constant value over a total of 101 iterations. Figure 6 shows the scatter plot of all sounds. Two features were used: the minimum loudness along the x-axis and the maximum loudness along the y-axis. Figure 4 shows the EM plot of all sounds. The points in yellow are the abnormal sounds with higher loudness values, while the points in green are the normal sounds with lower loudness values. The centroids are also shown, and each ellipse represents the points closer to its centroid. A total of 112 sounds of each type, abnormal and normal, were used for training. Similarly, 112 sounds of each type were used for testing. Out of the 112 sounds in the normal dataset, 5 were misclassified as abnormal due to noise that remained even after filtering. This is shown in Table 2. Both normal sounds and abnormal sounds gave a similar accuracy of 97.77%, so the overall accuracy was 97.77%. The sensitivity of normal sounds was 95.33%, while the abnormal sounds gave 100%. The specificity of normal sounds was 100%, while the abnormal sounds gave 95.33%. This gave an overall sensitivity and specificity of 97.76%. The scoring is calculated for training set A. We generalize the same for the remaining sets B, C, D, E, and F.
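The scoring can be recomputed from the counts given in the paragraph above: 112 normal and 112 abnormal test sounds, with 5 normal sounds labelled abnormal and no abnormal sound labelled normal. The sketch below treats abnormal as the positive class, which is a convention chosen for this example. It uses only those counts, so its rounded per-class values may differ from the figures reported from Table 2.

```python
def binary_scores(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Counts stated in the text, with abnormal taken as the positive class.
tp, fn = 112, 0    # abnormal test sounds: all recognised as abnormal
tn, fp = 107, 5    # normal test sounds: 5 misclassified as abnormal

sensitivity, specificity, accuracy = binary_scores(tp, fn, tn, fp)
accuracy_percent = round(100 * accuracy, 2)   # 219 correct out of 224
```

Swapping which class is called positive exchanges sensitivity and specificity, which is why the text reports mirrored values for normal and abnormal sounds.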

Figure 5

Log-likelihood variations versus iterations.

Figure 6

Scatter plot of all observations (yellow circles).

Table 2

Scoring of all sounds


Conclusion

In this article, a GMM-based classifier is presented for heart sound classification after preprocessing and segmenting the heart sound. The classification scheme is based on the maximum and minimum loudness indices extracted from the spectrogram of the heart sound. The GMM overcomes the limitations of the Euclidean metric even when the features have different ranges of variation. The model scales the data to zero mean and unit variance. This study shows that classification of heart sounds using the GMM classifier, with an accuracy of 97.77%, is comparable to state-of-the-art classifiers. The accuracy compares favorably with that of other classifiers reported in the literature, considering the large dataset used in this work.

Acknowledgments

The authors thank the Mitra Hospital for their feedback on this work.

Declaration

Ethics approval and consent to participate: Not applicable. Consent for publication: The authors consent to the publication of this research article. Availability of data and material: No animals or humans were involved in the collection of data.

Author contributions

M. Vishwanath Shervegar collected the database sounds and performed analysis of the sounds. Ganesh V. Bhat was involved in the drafting of the paper and reviewed it.

Conflicts of interest

The authors declare no conflicts of interest.

7 in total

1. A biomedical system based on artificial neural network and principal component analysis for diagnosis of the heart valve diseases.

Authors: Harun Uğuz
Journal: J Med Syst Date: 2010-02-26 Impact factor: 4.460

2. Feature extraction from parametric time-frequency representations for heart murmur detection.

Authors: L D Avendaño-Valencia; J I Godino-Llorente; M Blanco-Velasco; G Castellanos-Dominguez
Journal: Ann Biomed Eng Date: 2010-06-02 Impact factor: 3.934

3. Assessment of aortic valve stenosis severity using intelligent phonocardiography.

Authors: Arash Gharehbaghi; Inger Ekman; Per Ask; Eva Nylander; Birgitta Janerot-Sjoberg
Journal: Int J Cardiol Date: 2015-07-02 Impact factor: 4.164

4. Selection of dynamic features based on time-frequency representations for heart murmur detection from phonocardiographic signals.

Authors: A F Quiceno-Manrique; J I Godino-Llorente; M Blanco-Velasco; G Castellanos-Dominguez
Journal: Ann Biomed Eng Date: 2009-11-17 Impact factor: 3.934