Mariusz Topolski1, Jędrzej Kozal1. 1. Department of Systems and Computer Networks, Faculty of Information and Communication Technology, Wrocław University of Science and Technology, Wrocław, Poland.
Abstract
Feature extraction is an important part of data processing that provides a basis for more complicated tasks such as classification or clustering. Recently many approaches for signal feature extraction have been created. However, plenty of the proposed methods are based on convolutional neural networks. This class of models requires a high amount of computational power to train and deploy, as well as large datasets. Our work introduces a novel feature extraction method that uses the wavelet transform to provide additional information in the Independent Component Analysis mixing matrix. The goal of our work is to combine good performance with a low inference cost. We used the task of Electrocardiography (ECG) heartbeat classification to evaluate the usefulness of the proposed approach. Experiments were carried out on the MIT-BIH database with four target classes (Normal, Vestibular ectopic beats, Ventricular ectopic beats, and Fusion strikes). Several base wavelet functions with different classifiers were used in the experiments. The best configuration was selected with 5-fold cross-validation and the Wilcoxon test with a significance level of 0.05. With the proposed feature extraction method and a multi-layer perceptron classifier, we obtained a 95.81% BAC-score. Compared to other methods from the literature, our approach outperformed most feature extraction methods, falling short only of convolutional neural networks. Further analysis indicates that our method's performance is close to convolutional neural networks for classes with a limited number of learning examples. We also analyze the number of required operations at test time and argue that our method enables easy deployment in environments with limited computing power.
Signal processing is a rapidly developing field. With an abundance of new data and the development of consumer electronics, the demand for methods that can quickly analyze incoming data increases. Both cloud and edge computing approaches can address these problems, and in both the speed and accuracy of algorithms are crucial. For this reason, we introduce a novel method for signal feature extraction that performs slightly below the best-performing methods while requiring less compute at test time. To show this, we compare the number of operations required by our method with the number of operations in commonly used methods for signal classification. To evaluate the performance of our method, the ECG heartbeat classification task was selected, as it is an important problem with well-established datasets and evaluation procedures [1, 2]. Also, there are already well-performing algorithms developed specifically for this field [3-5].
1.1 Signal feature extraction
Independent component analysis (ICA) [6] is a statistical method that solves the problem of blind source separation. In [7] ICA was combined with a neural network for ECG arrhythmia classification. The authors constructed feature vectors from ICA and RR intervals, where an RR interval is an ECG signal fragment cropped from one R wave peak to the next. Probabilistic neural networks and backpropagation neural networks were utilized as classifiers. In [8] the authors used wavelet transforms and ICA for feature extraction as separate components of feature vectors; similarly to [7], RR intervals were also included as a part of the feature vector. Another method for signal feature extraction is the discrete wavelet transform (DWT), which can be utilized for processing various signals. In [9] DWT was used for electromyography (EMG) signal analysis. The wavelet transform can also be used to recognize emotions from speech [10] and for person identification based on EMG images [11]. In the case of the ECG signal, feature extraction is often based on the analysis of the presence of waves and their shape in a record [12, 13].
1.2 ECG classification methods
In [14] the authors proposed a support vector machine (SVM) with a swarm optimization method for feature selection and model selection. Park et al. [15] utilized a generalization of k-nearest neighbors (k-NN) called Locally Weighted Regression for ECG classification. A Discrete Cosine Transform (DCT) with Random Forest was proposed in [16]. There have also been attempts to utilize neural networks for ECG signal classification. In [3] the authors propose end-to-end training with raw ECG waveforms, after heartbeat alignment and segmentation, as inputs to a neural network. They use a network with three hidden layers, a softmax output, and restricted Boltzmann machine pretraining of the hidden layers. More recent approaches involve deep learning models for the ECG classification task. Kachuee et al. [4] propose a residual network with 1D convolutional filters applied directly to the preprocessed ECG signal. Jun et al. [5] utilize 2D convolutional networks; to convert the ECG waveform to an image, each beat of the signal is plotted as a 128 x 128 greyscale image. Deep neural networks require large amounts of training data. This requirement can be prohibitive in medical domains, as labeling high volumes of data by experienced physicians can be costly. For this reason, Weimann et al. explore the possibility of utilizing transfer learning in the ECG classification task [17]. They used the Icentia11K dataset [18] for unsupervised pretraining of a convolutional neural network, which is later fine-tuned for the classification task on the PhysioNet/CinC challenge [19]. Several pretraining methods were analyzed, yielding up to a 6.57% improvement in F1 score. In [20] the authors claim to reach human-level performance with a deep neural network. This was possible due to the construction of a new large dataset containing 91,232 ECG records from 53,549 patients. The authors of [21] use a subset of the available ECG leads to reconstruct information from all channels.
This is achieved by unsupervised training of an encoder-decoder network with a Seq2Seq architecture. Autoencoder training enables the construction of a latent-space representation that compensates for the missing leads. This latent-space representation is then used to train a 1D ResNet for classification.
1.3 ECG evaluation schemes
There are two evaluation schemes used for ECG classification: class-oriented and subject-oriented [8]. In the class-oriented scheme, data is divided into training and testing subsets with no consideration of whether signals collected from the same individual appear in both the train and test sets. This can cause the presence of very similar patterns in both training and testing sets, and it can make evaluation results unreliable. With the subject-oriented scheme, this kind of problem is avoided by utilizing information about patient identity. As a result, a more robust estimation of the model's true generalization capabilities can be obtained. Another distinction that can be made is the utilization of general or patient-specific data. In most studies, learners are trained on a dataset composed of ECG recordings from multiple patients. However, differences in ECG signal properties across patients can be substantial. Therefore some works [22-24] utilize a small amount of labeled data for each patient. This data is used to create patient-specific learners that can later detect abnormalities in each heartbeat. Works that employ the patient-specific protocol report higher accuracy; however, they cannot be directly compared to other articles.
1.4 Aims and motivation
The aim of this work is to propose and evaluate a novel feature extraction method for signals. We examine the possibility of combining independent component analysis with the wavelet transform by modifying the ICA mixing matrix. We hope that including additional information in the ICA feature extraction process will provide a performance boost while keeping inference time low. These properties can make our solution attractive in applications where a high amount of computational power is unavailable. To formalize, we set the following research questions:
1. What is the baseline performance of DWT and ICA applied separately as feature extraction methods for ECG?
2. Is there a benefit from the utilization of DWT as an additional source of information during signal separation in ICA?
3. Are there any alternatives to DWT that can be utilized as auxiliary information sources in ICA?
4. How does the proposed method compare to other results from the literature?
We also want to emphasize that ECG classification performance is not of primary importance, as the main goal of this work is to introduce a novel feature extraction method.
2 Materials and methods
In this section, we introduce a novel method for the modification of mixing matrices with the utilization of the wavelet transform. We also provide details about signal preprocessing, the dataset used, metrics, and the experiment setup.
2.1 Method
Firstly we define the principal component analysis (PCA) model as:
x_{pi} = \sum_{j=1}^{m} b_{pj} s_{ji}

where x_{pi} is the value of the p-th variable for the i-th feature, p ∈ {1, 2, 3, …, m}, i ∈ {1, 2, 3, …, n}; s_{ji} is the value of the j-th principal component for the i-th feature, j ∈ {1, 2, 3, …, m}; and b_{pj} is a principal component coefficient. The model of principal components is based on matrix operations. It can be represented with matrices as:
Z = B \circ S

where Z = [Z_1, Z_2, …, Z_m] is the matrix of variables, with Z_p = (x_{p1}, x_{p2}, …, x_{pn}) and n being the number of features; B is the matrix of principal component coefficients; and S = [S_1, S_2, …, S_m] is the matrix of principal components, with S_j = (s_{j1}, s_{j2}, …, s_{jn}). The symbol ∘ denotes matrix multiplication. Principal components are determined with the Hotelling algorithm [25]. The coefficients of the principal components are computed in a few steps. The coefficients of the first component S_1 are determined by maximizing the variance of this component over all variables w_1 using the function:
\mathrm{VAR}(S_1) = w_1^{T} \Sigma\, w_1 \to \max

where w_1 is the first principal component vector. The maximum is determined with Lagrange multipliers under the constraint w_1^{T} w_1 = 1, where Σ is the covariance matrix. In the next step, the rest of the covariance is computed:
\Sigma_1 = \Sigma - B_1 B_1^{T}

where B_1 = [b_{p1}], p ∈ {1, 2, …, m}, are the values of the coefficients of the first principal component. Next, Σ_1 is substituted into the maximization above in place of Σ. Coefficients of the second principal component can be computed by analogy. These steps are repeated until the variance in the data explained by all components reaches 100%. In principal component analysis the correlation coefficient is a measure of independence between principal components [26], which is appropriate when dealing with a multivariate normal distribution. In ICA, however, entropy [27, 28] is used for estimating independence between variables. The entropy H(X) of a random variable X determines the average amount of information carried by the possible outcomes of X and can be written as:
H(X) = -\sum_{i} p(x_i) \log p(x_i)

where p(x_i) is the probability of outcome x_i. Entropy values are nonnegative and equal zero only when the probability of some outcome is one and all other outcome probabilities are zero. When all outcomes have the same probability, the entropy value is maximal. To estimate the dependence between variables we utilize mutual information I(X). It is based on the entropy of the individual variables and is calculated as the difference between the sum of the entropies of the marginal distributions and the joint entropy:

I(X_1, \ldots, X_n) = \sum_{i=1}^{n} H(X_i) - H(X_1, \ldots, X_n)

This metric is a modification of the Kullback-Leibler divergence between two distributions. In our model we use negentropy J(X) for estimating signal dependence. Negentropy is given by the equation:
J(X) = H(Y) - H(X)

where Y is a random variable with a normal distribution whose variance is equal to the variance of X, i.e. VAR(Y) = VAR(X). Negentropy is a measure used in signal processing for independent component analysis. We utilize it as a criterion in blind signal separation: maximizing the negentropy of the output signals is equivalent to minimizing the mutual information between these signals. In the proposed extraction method we assume that the variables are linear combinations of independent components:
X = (A + H) \circ S + E

where:
- X is the matrix containing learning examples (with |D| being the number of learning examples currently processed),
- A is the principal component coefficients (mixing) matrix,
- S = [S_1, S_2, …, S_m] is the principal components matrix,
- H is the coefficients matrix extracted from the source signal with the wavelet transform [29],
- E = [E_1, E_2, …, E_m] is the noise matrix.

Matrix A determines the principal component coefficients, and its elements are the basis for the extracted features. By maximizing negentropy we attempt to obtain principal components that are independent. The process of determining principal component vectors is repeated for each class separately. For each segmented heartbeat, features are computed as principal component coefficients for all classes. The concatenated coefficients form the final feature vector used for classification. By computing components for each class separately, we hope to obtain distinct representations that are more unique and therefore enable easier classification. The number of principal components used per class can vary; the exact number of extracted features is specified later in this work for each experiment. The wavelet transform can be written as:
x(n) = \sum_{k} c_{J,k}\, \varphi_{J,k}(n) + \sum_{j=1}^{J} \sum_{k} d_{j,k}\, \Phi_{j,k}(n)

where:
- c_{j,k} is the approximation coefficient for scale j and localization k,
- d_{j,k} is the detail coefficient,
- φ is the time window (scaling function),
- J is the decomposition level,
- n is the length of the original signal,
- Φ is the wavelet function,
- Γ is the low- and/or high-pass filter used to compute the coefficients.

To allow the utilization of this transform in ICA, it needs to be written in matrix form. Therefore we define the matrix L as:
L = \begin{bmatrix}
h_0 & h_1 & h_2 & h_3 & \cdots & & \\
g_0 & g_1 & g_2 & g_3 & \cdots & & \\
 & & h_0 & h_1 & h_2 & h_3 & \cdots \\
 & & g_0 & g_1 & g_2 & g_3 & \cdots \\
 & & & & \ddots & &
\end{bmatrix}

where g_0, g_1, … is the representation of the basic function (high-pass filter) and h_0, h_1, … is the representation of the scaling function (low-pass filter). Matrix L is the representation of a single heartbeat. To convert this sparse matrix to a more useful form, the Strang algorithm is utilized [29]. The output of the Strang algorithm for each ECG signal is concatenated into the H matrix, which can be used in the ICA mathematical model. In the proposed ECG feature extraction method, Daubechies wavelets were used, as they model a single heartbeat well. From now on we will refer to this kind of wavelet as db i, where i is the wavelet number.
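The banded structure of L can be illustrated by assembling a single-level DWT analysis matrix from the Daubechies filter taps. This is a minimal sketch assuming the PyWavelets package is available; the circular boundary handling and the row ordering are our choices for illustration, not necessarily those of [29]:

```python
import numpy as np
import pywt  # PyWavelets, assumed available

def dwt_analysis_matrix(wavelet_name, n):
    """Single-level DWT written as a banded matrix: rows hold the low-pass
    taps h0, h1, ... and high-pass taps g0, g1, ... shifted by two samples
    (the downsampling step), with circular wrap-around at the window edge."""
    w = pywt.Wavelet(wavelet_name)
    filters = (np.array(w.dec_lo), np.array(w.dec_hi))  # h and g
    rows = []
    for filt in filters:
        for k in range(0, n, 2):
            row = np.zeros(n)
            for i, c in enumerate(filt):
                row[(k + i) % n] = c
            rows.append(row)
    return np.vstack(rows)

# For orthogonal Daubechies filters the resulting matrix is orthogonal,
# so L @ L.T recovers the identity
L = dwt_analysis_matrix("db2", 8)
coeffs = L @ np.random.randn(8)  # approximation then detail coefficients
```

Applying L to a windowed heartbeat yields its approximation and detail coefficients in one matrix product, which is what makes the matrix form convenient inside the ICA model.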
2.2 Alternative to wavelet transform for ICA modification
In our work, we evaluate the possibility of replacing the wavelet function with probability density functions in ECG signal processing. This method is based on calculating values of the ECG signal in a time window with the utilization of probability density functions. The wavelet function Φ in Eq (9) is replaced by one of the functions F(t) proposed below. Six example density functions that can be used to construct the F vector are provided. Functions with properly placed high values were chosen to obtain good alignment with the R wave. The first proposed function is the Poisson distribution defined on the window of width d, where λ is chosen such that the maximum of the Poisson distribution covers the ECG R wave. The next utilized density is the exponential probability distribution:

F(t) = \lambda e^{-\lambda t}

where t is the time of the R wave appearance in the window of width d, and λ is a coefficient chosen so that t is aligned with the R wave in the ECG signal. The next considered function is the t-Student distribution:

F(t) = \frac{\Gamma\!\left(\frac{r+1}{2}\right)}{\sqrt{r\pi}\,\Gamma\!\left(\frac{r}{2}\right)\sigma} \left(1 + \frac{1}{r}\left(\frac{t-\mu}{\sigma}\right)^{2}\right)^{-\frac{r+1}{2}}

where:
- r is the number of signal samples in window d,
- z is a random variable with chi-square density (used in the derivation of the distribution),
- μ is the mean,
- σ is the variance.

The fourth density is the normal distribution:

F(t) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(t-\mu)^{2}}{2\sigma^{2}}}

where μ is the mean and σ is the variance. Next is the logarithmic distribution:

F(t) = \frac{1}{t\sigma\sqrt{2\pi}}\, e^{-\frac{(\ln t-\mu)^{2}}{2\sigma^{2}}}

In the case of this distribution it is important to remember that the first time index t in the window cannot be zero. The last considered function is the logistic density:

F(t) = \frac{\alpha\beta\, e^{-\beta t}}{\left(1 + \alpha e^{-\beta t}\right)^{2}}

where:
- α = e^{μ/σ},
- β = 1/σ,
- μ is the mean,
- σ is the variance.
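For illustration, candidate F(t) vectors like the ones above can be sampled over a window with standard scipy.stats densities. This is only a sketch: the scale choices (e.g. d/8), the degrees of freedom, and the alignment of each mode with an R-wave index t_r are our assumptions, not parameters from the paper:

```python
import numpy as np
from scipy import stats

def density_window(name, d, t_r):
    """Sample a candidate density F(t) on a window of d time indices so that
    its high-value region falls near the R-wave index t_r."""
    t = np.arange(d)
    if name == "poisson":
        return stats.poisson.pmf(t, mu=t_r)       # mode near t_r
    if name == "exponential":
        return stats.expon.pdf(t, scale=t_r)      # lambda = 1 / t_r
    if name == "student":
        return stats.t.pdf(t, df=5, loc=t_r, scale=d / 8)
    if name == "normal":
        return stats.norm.pdf(t, loc=t_r, scale=d / 8)
    if name == "lognormal":
        return stats.lognorm.pdf(t + 1, s=0.5, scale=t_r)  # t must not be 0
    if name == "logistic":
        return stats.logistic.pdf(t, loc=t_r, scale=d / 8)
    raise ValueError(f"unknown density: {name}")

F = density_window("normal", d=128, t_r=64)  # peaks at the window middle
```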
2.3 ECG signal alignment
An important element of the proposed method is heartbeat alignment to a fixed window of width d. In our experiments, windows are selected to match the middle of the window with the R wave. This aligns the phase of the R wave in the ECG signal with the middle of the wavelet function. Misalignment can introduce noise into the extracted features and impact classification performance. To establish the influence of signal alignment on feature extraction quality, the phase shift coefficient α was introduced. Let t_1 be the time index of the window beginning, t_2 the time index of the window end, and d = t_2 − t_1 the width of the window. The middle of the window is given by the equation:

t_m = \frac{t_1 + t_2}{2}

Let t_R be the time index of the R wave appearance. The phase shift between t_m and t_R is then defined as:

\alpha = \frac{|t_R - t_m|}{d/2}

In Fig 1 the phase shift is visualized with sample ECG signals. During preliminary experiments, we observed that signal misalignment and high values of α have a detrimental effect on classification quality. There are two possible explanations of this phenomenon. Firstly, without proper signal alignment, the wavelets do not cover the ECG signal correctly. Secondly, misalignment can cause some features to have a larger set of possible values, as the localization of the R wave would move from one learning example to another.
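The definition of the phase shift coefficient can be read as the following sketch (the normalization by half the window width, giving α = 0 for a centered R wave and α = 1 for an R wave on the window edge, is our reconstruction of the dropped formula):

```python
def phase_shift(t1, t2, t_r):
    """Phase shift coefficient between the window middle and the R-wave index."""
    d = t2 - t1                 # window width
    t_mid = (t1 + t2) / 2       # middle of the window
    return abs(t_r - t_mid) / (d / 2)

phase_shift(0, 200, 100)  # centered R wave -> 0.0
phase_shift(0, 200, 200)  # R wave on the window edge -> 1.0
```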
Fig 1
Visualisation of shift between R peak and middle of heartbeat for different classes of arrhythmia.
ICA is based on negentropy; therefore, it is important to evaluate how negentropy is affected by the phase shift. In Fig 2 negentropy is plotted as a function of the phase shift α. With α = 0 negentropy is equal to 0.95, which means that features are extracted well. When α is close to 1, a high level of noise is introduced into the features. Furthermore, increasing α from 0.0 to 0.1 causes a recall drop of approximately 14% when using the multi-layer perceptron (MLP) and 15% when using SVC. In all further experiments, the ECG signal was aligned to contain an R wave in the middle of the window.
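Negentropy values like those in Fig 2 can be estimated without computing densities explicitly. A common sketch uses the FastICA-style log-cosh approximation J(x) ≈ (E[G(x)] − E[G(ν)])², with ν Gaussian and VAR(ν) = VAR(x); this is a generic estimator we show for illustration, not necessarily the exact one behind the figure:

```python
import numpy as np

def negentropy_logcosh(x, n_ref=100_000, seed=0):
    """Approximate negentropy with G(u) = log cosh(u) and a Gaussian
    reference sample of the same variance as x."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # center, keep the variance
    ref = rng.normal(0.0, x.std(), size=n_ref)
    G = lambda u: np.log(np.cosh(u))
    return (G(x).mean() - G(ref).mean()) ** 2

rng = np.random.default_rng(1)
j_gauss = negentropy_logcosh(rng.normal(size=10_000))    # near 0 for Gaussian input
j_laplace = negentropy_logcosh(rng.laplace(size=10_000)) # clearly larger
```

A Gaussian signal has (approximately) zero negentropy, while a super-Gaussian signal such as Laplace noise scores visibly higher, which is exactly the property ICA exploits.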
Fig 2
Negentropy as a function of phase shift α with 95% confidence interval.
2.4 Method overview
An overview of the proposed method is provided in Fig 3. The method has three separate steps. First, the raw ECG signal is preprocessed. Then, segmentation of single heartbeats is performed. The R wave is found using the Pan-Tompkins algorithm [30], a commonly used algorithm for R wave detection in ECG signals. Each segment is aligned to a fixed window of size d so that it contains the R wave in the middle. The second stage is feature extraction. It starts with covering the segmented signal with the wavelet function. Next, the wavelet transform is computed (9). Due to the utilization of matrix notation in ICA, the calculated wavelet is represented as a matrix of coefficients. In the next step, independent components (8) are calculated. Features extracted with the modified independent components are stored for training. We want to emphasize that the wavelet transform is not the basis of our method; it is used only as an additional source of information about the signals for the ICA method. In our work, several base functions for the wavelet transform were analyzed.
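The segmentation and alignment stage can be sketched as follows; R-peak positions are assumed to come from a detector such as Pan-Tompkins, and the window width d is a free parameter:

```python
import numpy as np

def segment_heartbeats(signal, r_peaks, d=128):
    """Cut a fixed window of d samples around each R peak so that the peak
    lands exactly in the middle; beats whose window would overrun the record
    edges are skipped."""
    half = d // 2
    beats = [signal[r - half:r + half]
             for r in r_peaks
             if r - half >= 0 and r + half <= len(signal)]
    return np.array(beats)

sig = np.sin(np.linspace(0, 20 * np.pi, 2000))
beats = segment_heartbeats(sig, r_peaks=[100, 500, 1900, 1990], d=128)
# the peak at 1990 is dropped: its window would extend past the record end
```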
Fig 3
Concept diagram of proposed method.
2.5 Used classifiers
Below, all classifiers used in the experiments are listed. For each classifier, we provide the hyperparameter sets that were considered for fine-tuning.

- k-NN (k Nearest Neighbours) [31]
  - number of neighbors: 3, 5, 7
  - metrics: Minkowski, Euclidean, Manhattan
- SVC (Support Vector Classification / Support Vector Machine) [32]
  - parameter C: 0.1, 1, 10, 100
  - kernel: linear, rbf, poly, sigmoid
  - gamma: scale, auto
- CART (Classification and Regression Trees) [33]
  - criterion: gini, entropy
  - splitter: best, random
  - maximum depth: 1, 2, 3, …, 10
- GNB (Gaussian Naive Bayes), without hyperparameters [34, 35]
- MLP (Multi-layer perceptron) [36]
  - number of hidden layers: 3, 4, 5, …, 10
  - activation function: identity, logistic, entropy, SOS, Tanh, Linear, Softmax, Exponent
  - parameter alpha: 0.00001, 0.0001, 0.001, 0.01, 0.1
  - momentum: 0, 0.2, 0.4, 0.6, 0.8, 1

All hyperparameters were tuned automatically with the Statistica software.
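The paper tuned these grids automatically in Statistica; an equivalent search can be sketched with scikit-learn. The parameter names, the reduced grid below, and the random features standing in for the extracted ECG features are all illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Random features stand in for the extracted ECG feature vectors
X, y = make_classification(n_samples=300, n_features=21, random_state=0)

# Trimmed-down grid in the spirit of the MLP search above
grid = {
    "hidden_layer_sizes": [(16,) * 3, (16,) * 5],
    "alpha": [1e-4, 1e-2],
    "momentum": [0.0, 0.8],
}
search = GridSearchCV(
    MLPClassifier(solver="sgd", max_iter=100, random_state=0),
    grid, cv=5, scoring="balanced_accuracy",
)
search.fit(X, y)  # best_params_ / best_score_ hold the winning configuration
```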
2.6 Metrics
Across the experiments, a standard set of metrics for imbalanced classification was used. For clarity, we provide definitions below, in which TP, TN, FP, and FN denote True Positive, True Negative, False Positive, and False Negative counts respectively:

Recall = TP / (TP + FN)
Precision = TP / (TP + FP)
Specificity = TN / (TN + FP)

Accuracy is not appropriate in an imbalanced classification setting, as a model can easily overfit to the majority class and still obtain a high value of this metric. Other works in the ECG literature report accuracy, so we calculate it for completeness, but it will not be used in the results analysis. BAC-score, Recall, Precision, and Specificity are better suited to imbalanced classification problems. Recall is the percentage of all samples of a given class in the dataset that are correctly classified by the model. Precision is the percentage of model predictions indicating positive examples of a given class that are correct. Specificity is in some sense analogous to recall but is defined on True Negatives instead of True Positives. Depending on the application, one can optimize any of the metrics defined above. In our experiments, we mainly utilize the BAC-score for comparison between models, as it reflects what is often important in medical applications. In this field, keeping the number of false negatives low is vital, as each positive case can be associated with some disease and require the intervention of doctors, further examination, and diagnosis. Positive cases not detected by the model can have detrimental effects on a patient's health or life. All metric values reported in this work were obtained with five-fold cross-validation. For the evaluation of our method the BAC-score was used; for comparison with other results from the literature, all of the above metrics are reported.
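These definitions amount to simple confusion-matrix arithmetic. A sketch, using the standard one-vs-rest counting per class and BAC as the mean of per-class recalls (one common convention for multi-class BAC):

```python
import numpy as np

def per_class_metrics(y_true, y_pred, labels):
    """Recall, precision, and specificity per class from TP/TN/FP/FN counts,
    plus BAC as the mean of per-class recalls."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    out = {}
    for c in labels:
        tp = np.sum((y_true == c) & (y_pred == c))
        fn = np.sum((y_true == c) & (y_pred != c))
        fp = np.sum((y_true != c) & (y_pred == c))
        tn = np.sum((y_true != c) & (y_pred != c))
        out[c] = {
            "recall": tp / (tp + fn),
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "specificity": tn / (tn + fp),
        }
    bac = float(np.mean([m["recall"] for m in out.values()]))
    return out, bac

metrics, bac = per_class_metrics(["N", "N", "V", "V"],
                                 ["N", "V", "V", "V"],
                                 labels=["N", "V"])
```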
2.7 Dataset
The PhysioNet [1] MIT-BIH Arrhythmia Database [2] was utilized for the experiments. The database has been maintained since 1980, and it aims to collect and standardize ECG signals for arrhythmia classification. It includes 24 hours of signals collected from 47 patients: 25 men aged 32-89 years and 22 women aged 23-89 years. Each patient is represented by one record (except for one patient with two records), numbered from 100 to 124 and from 200 to 234. Due to the different medical conditions of the patients, the leads used to measure the heart's activity are not uniform; however, all records contain lead II (also referred to as MLII). ANSI/AAMI EC57 contains requirements for ECG classification algorithms, recommendations for individual records, and a definition of the arrhythmia classes. It is advised to discard the records of patients with pacemakers, numbers 102, 104, 107, and 217. The division of heart activation types into classes recommended by the standard is presented in Table 1. The total number of heartbeats is 101464. Due to the low number of learning examples and the ambiguity in defining its features, the Q activation class was not taken into consideration in this study.
Table 1
Labels used in the MIT-BIH database with number of learning examples assigned to each class.
| Class name | # learning examples |
| --- | --- |
| Normal (N) | 90632 |
| Vestibular ectopic beats (S) | 2779 |
| Ventricular ectopic beats (V) | 7129 |
| Fusion strikes (F) | 803 |
| Undefined (Q) | 15 |
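For reference, the grouping of MIT-BIH beat annotation symbols into the ANSI/AAMI EC57 classes of Table 1 can be written down directly. The symbol sets below follow the commonly used AAMI convention and are our addition, not a quotation from the standard:

```python
# MIT-BIH beat annotation symbols grouped into the AAMI classes of Table 1
AAMI_CLASSES = {
    "N": {"N", "L", "R", "e", "j"},  # normal and bundle branch block beats
    "S": {"A", "a", "J", "S"},       # supraventricular/vestibular ectopic beats
    "V": {"V", "E"},                 # ventricular ectopic beats
    "F": {"F"},                      # fusion beats
}

def aami_class(symbol):
    """Map a single MIT-BIH annotation symbol to its AAMI class."""
    for cls, symbols in AAMI_CLASSES.items():
        if symbol in symbols:
            return cls
    return "Q"  # undefined/paced beats, excluded from this study

aami_class("L")  # left bundle branch block counts as a normal (N) beat
```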
From the classification point of view, it is important to divide the dataset into training and testing sets. During the dataset split, we utilized information about patient identity to avoid leaking patterns that can be observed for one subject from the training to the test set. To divide the dataset correctly, we also need to take into consideration the number of learning examples assigned to each class. The proportion of data in the train and test splits is defined by stratified 5-fold cross-validation [37].
2.8 Signal preprocessing
The raw ECG signal contains a lot of noise and artifacts that must be filtered out to obtain better and more reliable classification results. We can distinguish two types of interference in an ECG signal. The first type is baseline wander, the disruption caused by the patient's breathing, movements, and electrode displacement. The second type of interference is the convolution of the ECG with a 60 Hz signal coming from the AC power supply of the electrocardiograph. A band-pass filter with a 5-15 Hz passband was utilized to eliminate noise generated by muscle tremors, the electricity power supply (50/60 Hz), and the influence of the T wave and floating isoline. Filter poles were localized on the unit circle to cancel out the zeros of the Laurent transform polynomial [30]. The result of the filtration was subtracted from the original ECG waveform to obtain a signal without this kind of noise. In Fig 4 the filtered ECG signal is presented for the different classes present in the MIT-BIH dataset.
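The 5-15 Hz filtering step can be approximated with a zero-phase Butterworth band-pass. This is a stand-in for the pole-zero cancellation filter of [30], shown only to illustrate the passband; MIT-BIH signals are sampled at 360 Hz:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass_5_15(signal, fs=360.0, order=3):
    """5-15 Hz band-pass; filtfilt makes it zero-phase, so R-peak timing
    (and hence the alignment step) is preserved."""
    nyq = fs / 2
    b, a = butter(order, [5 / nyq, 15 / nyq], btype="band")
    return filtfilt(b, a, signal)

fs = 360
t = np.arange(0, 2, 1 / fs)
sig = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 60 * t)  # 10 Hz + mains hum
clean = bandpass_5_15(sig, fs)  # 60 Hz component strongly attenuated
```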
Fig 4
Example ECG signal after applying filtration and alignment of QRS complex.
Following type of beats were presented: a) NOR—normal beat, b) PVC—arrhythmias including premature ventricular contraction, c) PAB—paced beat, d) RBB—right bundle branch block beat, e) LBB—left bundle branch block beat, f) APC—atrial premature complexes, g) VFW—ventricular flutter wave, VEB—ventricular escape beat.
2.9 Experiments setup
The experimental evaluation was divided into several parts. First, features were extracted with the wavelet transform using ten types of wavelets, db1-10. The main goal of the first experiment is to assess classification performance when ICA and wavelet feature extraction are performed separately. The wavelets used in this experiment are presented in Fig 5. Next, the proposed extraction method was evaluated with the same classifiers and types of wavelets as in the first experiment. The goal of these two experiments was to examine whether the proposed method improves classification scores compared to the utilization of the wavelet transform and ICA separately. To compare the baseline to the proposed method, the Wilcoxon test was performed for each type of wavelet and classifier separately with a significance level of 0.05. The null hypothesis is that the performance of the two methods is not significantly different. The third experiment verifies whether alternative ICA mixing matrix modification methods provide better classification performance than the results from the previous experiments.
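The comparison described above reduces to a paired Wilcoxon signed-rank test on per-configuration scores. A sketch with hypothetical BAC values (ten paired scores, e.g. from repeated cross-validation; the numbers are illustrative only, not results from the paper):

```python
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical per-fold BAC scores for one (wavelet, classifier) pair
bac_baseline = np.array([0.918, 0.921, 0.917, 0.922, 0.919,
                         0.920, 0.916, 0.923, 0.918, 0.921])
bac_proposed = np.array([0.949, 0.953, 0.950, 0.956, 0.954,
                         0.956, 0.953, 0.961, 0.957, 0.961])

stat, p = wilcoxon(bac_baseline, bac_proposed)
significant = p < 0.05  # reject H0 of no difference at the 0.05 level
```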
Fig 5
Examples of Daubechies 1-10 wavelets used in experiments 1-3.
In the first experiment, 26 features were extracted with wavelets and 23 with ICA. In the second experiment, 21 features were used, and 12 features in the third experiment. In all cases, features were extracted with a maximum of 200 iterations, phase shift coefficient α = 0, and the log-cosh negentropy approximation.
3 Results
This section contains the results of the experimental evaluation according to the experiment setup provided above.
3.1 Classification quality for wavelet transform and ICA applied separately
In this part of the experiments, we provide a baseline that will serve for later comparison. Results are presented in Fig 6. Based on these values we can conclude that the best classification scores are obtained for the MLP network, which achieved the highest metric values for every type of wavelet and for ICA. In the case of ICA, the BAC-score of the MLP network is 92.23%. Of all wavelet functions, the best BAC-score is obtained for db6: 91.95%. The SVC classifier obtained a 91.00% BAC-score with ICA and 90.94% with the db6 wavelet. The lowest classification scores were obtained for the Naive Bayes classifier, with values ranging from 79.83% to 85.66%.
Fig 6
Classification performance when the wavelet transform or ICA are applied separately.
3.2 Classification quality of proposed method
In this section the proposed feature extraction method is evaluated. The obtained metrics are provided in Fig 7. After the application of the proposed method, similarly to the first experiment, the 5-layer MLP obtained the best results: with db6 it achieved a 95.79% BAC-score. SVC scored a 94.27% BAC-score with the same wavelet.
Fig 7
Classification performance when utilizing different wavelet functions for modification of mixing matrix in ICA.
Comparing these results with the previous experiments, we note that after applying the proposed method the increase in metric values ranges from 3.29% to 3.86%, with an average increase of 3.68%. In both experiments, the best results were obtained for the db6 wavelet. In Table 2 p-values for the Wilcoxon test are presented. The presented p-values relate to the comparison of the classification quality of the proposed ICA method with the db6 wavelet against the quality results obtained after extraction with wavelets only. With a statistical significance level of 0.05, we reject the null hypothesis and conclude that the results are significantly different for all types of wavelets. Therefore, we can argue that our feature extraction method indeed provides additional information that impacts the quality of extraction in the independent component analysis model.
Table 2
P-values obtained with the Wilcoxon test comparing BAC-scores from experiments 1 and 2.
| methods | svc | knn7 | knn5 | knn3 | mlp | cart | gnb |
| --- | --- | --- | --- | --- | --- | --- | --- |
| db1 | 0.019 | 0.016 | 0.019 | 0.018 | 0.021 | 0.02 | 0.019 |
| db2 | 0.02 | 0.013 | 0.019 | 0.01 | 0.013 | 0.014 | 0.015 |
| db3 | 0.013 | 0.016 | 0.011 | 0.01 | 0.017 | 0.019 | 0.01 |
| db4 | 0.016 | 0.018 | 0.012 | 0.014 | 0.017 | 0.018 | 0.015 |
| db5 | 0.016 | 0.017 | 0.01 | 0.019 | 0.014 | 0.015 | 0.017 |
| db6 | 0.015 | 0.017 | 0.011 | 0.018 | 0.013 | 0.019 | 0.011 |
| db7 | 0.018 | 0.016 | 0.013 | 0.014 | 0.014 | 0.013 | 0.011 |
| db8 | 0.019 | 0.015 | 0.014 | 0.011 | 0.013 | 0.013 | 0.01 |
| db9 | 0.017 | 0.019 | 0.018 | 0.013 | 0.01 | 0.018 | 0.015 |
| db10 | 0.012 | 0.02 | 0.018 | 0.017 | 0.015 | 0.014 | 0.011 |
3.3 Classification quality when utilizing different function for ICA mixing matrix modification
We test how ICA mixing matrix modifications based on probability distribution functions affect the quality of classification. The obtained results were also compared with the best type of wavelet from the previous experiments. Metrics are presented in Fig 8. From these results, we can conclude that classification performance is worse compared to the utilization of db wavelets. The best BAC-score observed so far is 95.79% for MLP and 94.27% for SVM. For the other mixing matrix modification functions, the BAC-score varies from 77.37% to 89.21%. Differences between these methods and ours are in the range 6.60%-10.97%.
Fig 8
Classification metrics for various functions for the mixing matrix creation in ICA.
Based on the conducted experiments, in further comparisons we utilize the MLP classifier with the db6 wavelet, as this combination obtained the best results.
4 Discussion
The experiments described so far were conducted with various machine learning algorithms: SVC support vector machines, K nearest neighbors for K = 3, 5, 7, an MLP neural network, a CART classification tree, and a GNB naive Bayes classifier. First, it was demonstrated that wavelet transformation can score between 79.83% and 91.95% BAC-score, and ICA between 85.44% and 92.23%. These results were a baseline for further comparison, and they answer research question 1. After applying our method (i.e., updating the ICA matrix with wavelet transform matrices), an increase in BAC-score of up to 11% was obtained. We want to emphasize the importance of signal alignment and of matching the wavelet function with the R wave in the ECG signal. Even a small phase shift between the R wave of the ECG and the middle of the window can have a detrimental impact on classification performance. In our experiments, MLP was the best-performing classifier. This answers research question 2. When utilizing a probability density function as an additional source of information for ICA instead of the DWT, the obtained BAC-score was in the interval 77.37-89.21. The metrics obtained for the alternative functions were lower for all classifiers than for the db6 wavelet. This answers research question 3.
In the years 2006-2012, the first approaches that combined DWT with ICA were proposed [8, 38]. These works concentrated on EEG signals. The wavelet transform was utilized for signal preprocessing and denoising. Next, the preprocessed signals were separated into independent sources with ICA [38]. So in principle, one can describe these methods as conducting ICA in the DWT domain. Our method, in contrast, updates the matrix of independent components with information obtained with wavelet transforms. Due to this fact, we were able to obtain better results compared to other feature extraction methods [39, 40].
In [8] the authors also utilize wavelet transforms with ICA for feature extraction; however, they use ICA and wavelet coefficients as separate components of the feature vectors, while our method combines the two algorithms.
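The "ICA in the DWT domain" pipeline described above for the prior works can be sketched as follows. This is a minimal illustration on synthetic sources, using a hand-written single-level Haar (db1) transform and scikit-learn's FastICA; it is not the original authors' implementation, and the signals are toy stand-ins.

```python
import numpy as np
from sklearn.decomposition import FastICA

def haar_dwt(x):
    """Single-level Haar (db1) DWT: returns approximation and detail coefficients."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 256, endpoint=False)
# Two synthetic sources mixed into two observed channels by a random matrix.
sources = np.c_[np.sin(2 * np.pi * 5 * t), np.sign(np.sin(2 * np.pi * 3 * t))]
mixed = sources @ rng.normal(size=(2, 2))

# Prior-work pipeline sketch: compress/denoise each channel with the DWT
# (keeping the approximation coefficients), then unmix with ICA.
approx = np.stack([haar_dwt(ch)[0] for ch in mixed.T], axis=1)
recovered = FastICA(n_components=2, random_state=0).fit_transform(approx)
```

The key difference from our method is visible in the structure: here the DWT merely preprocesses the inputs to ICA, whereas our approach injects wavelet information directly into the matrix of independent components.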
4.1 Evaluation details
The proposed method was compared to ECG beat classification and feature extraction algorithms from the literature. Other works utilize seven classes instead of four; for this reason, the new class definitions were adopted for comparison. The primary experiments utilize different class definitions because the main goal of this paper is to introduce a novel signal feature extraction method; obtaining the highest metrics for the ECG beat classification task is not of primary importance. Through the classification task, we only evaluate the usefulness of the feature extraction method. Reevaluation with the best-performing model was performed. As previously, results were obtained with 10-fold stratified cross-validation.
To perform the evaluation, we selected the best-performing model from the experiments and conducted a hyperparameter search. Wavelet db6 was used for ICA modification, with 21 features extracted. The MLP network was selected with the following hyperparameters considered: 3-20 hidden layers; loss functions: MSE and cross entropy; and the following activation functions for hidden and output layers: linear, sigmoid, tanh, and exponent. Results are provided in Table 3. For further experiments, we selected the MLP neural network with the softmax output activation function, the BFGS (Broyden-Fletcher-Goldfarb-Shanno) learning algorithm, and the error function in the form of a sum of squares. This model will be used for comparison with other works.
Table 3
Best MLP networks used for heartbeat classification task.
| Id | Network name | Accuracy | Learning algorithm | Loss function | Activation (hidden) | Activation (output) |
|----|--------------|----------|--------------------|---------------|---------------------|---------------------|
| 1  | MLP 5 | 95.58 | BFGS | Entropy | Exponent | Softmax |
| 2  | MLP 6 | 93.74 | BFGS | Entropy | Tanh | Softmax |
| 3  | MLP 3 | 93.24 | BFGS | SOS | Exponent | Tanh |
| 4  | MLP 3 | 92.99 | BFGS | Entropy | Tanh | Softmax |
| 5  | MLP 5 | 92.74 | BFGS | Entropy | Linear | Softmax |
| 6  | MLP 7 | 92.71 | BFGS | Entropy | Linear | Softmax |
| 7  | MLP 3 | 92.49 | BFGS | Entropy | Exponent | Softmax |
| 8  | MLP 6 | 92.47 | BFGS | SOS | Linear | Exponent |
| 9  | MLP 7 | 92.37 | BFGS | SOS | Linear | Exponent |
| 10 | MLP 5 | 92.24 | BFGS | Entropy | Linear | Softmax |
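A hyperparameter search of this kind can be sketched with scikit-learn as below. The synthetic data, the reduced grid, and the use of the 'lbfgs' solver (a limited-memory BFGS variant, standing in for the BFGS algorithm used in the paper) are assumptions for illustration, not the paper's actual setup.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neural_network import MLPClassifier

# Stand-in data: in the paper these would be the ICA + db6 feature vectors
# (21 features per heartbeat); here a synthetic 4-class set is generated.
X, y = make_classification(n_samples=400, n_features=21, n_informative=10,
                           n_classes=4, random_state=0)

# A reduced, illustrative grid; the paper's search also covered larger
# networks and additional activation and loss functions.
param_grid = {
    "hidden_layer_sizes": [(5,), (10,), (20,)],
    "activation": ["tanh", "logistic"],
}
search = GridSearchCV(
    MLPClassifier(solver="lbfgs", max_iter=500, random_state=0),
    param_grid,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="balanced_accuracy",  # BAC-score, as used throughout the paper
)
search.fit(X, y)
best = search.best_params_
```

Each grid point is scored by stratified cross-validation, mirroring the evaluation protocol described above.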
4.2 Comparison to other works
Table 4 presents metrics obtained for the proposed method, other extraction methods, and selected works from the ECG beat classification literature.
Table 4
Comparison of our method with other results reported in the literature.
| Methods | Features | Classifier | Acc | Re | Pr | Sp |
|---------|----------|------------|-------|-------|-------|-------|
| Proposed | ICA + Wavelet | MLP | 95.58 | 95.42 | 95.92 | 95.82 |
| This Paper | ICA [41] | MLP | 92.23 | 92.12 | 92.18 | 92.15 |
| This Paper | PCA [42] | MLP | 89.32 | 88.82 | 89.11 | 87.64 |
| This Paper | CCPCA [43] | SVM | 91.45 | 90.53 | 91.08 | 91.12 |
| This Paper | GPCA [44] | SVM | 91.58 | 90.76 | 91.14 | 91.18 |
| This Paper | KPCA [45] | MLP | 90.23 | 90.66 | 89.89 | 89.79 |
| Martis et al. [40] | PCA + Bispectrum | SVM | 93.48 | - | - | - |
| Martis et al. [40] | PCA + Cumulant | NN | 94.52 | - | - | - |
| Martis et al. [40] | PCA + Cumulant + DWT | SVM | 93.76 | - | - | - |
| Chazal et al. [39] | Morphology and heartbeat interval | LD | 85.90 | - | - | - |
| Melgani et al. [46] | PSO + SVM | SVM | 91.67 | 93.83 | - | 90.49 |
| Kiranyaz et al. [47] | 1D CNN | 1D CNN | 96.40 | 68.68 | 79.20 | 99.50 |
| Jun et al. [5] | 2D CNN | 2D CNN | 99.05 | 96.85 | 98.55 | 99.50 |
| Dutta et al. [48] | SVM | SVM | 95.82 | 86.16 | 97.01 | 99.17 |
| Kumar et al. [16] | DCT | RF | 92.16 | - | - | - |
The results marked as "This Paper" in the methods column concern the authors' use of various extraction methods known from the literature for extracting features of the ECG signal. For these methods, i.e., the principal components of PCA, factor rotation according to class centroids in CCPCA, optimization of the rotation angle using gradients in GPCA, and the non-linear kernel transformation of KPCA, classification quality was worse than for the proposed method. As the results show, combining different extraction techniques increases the quality of arrhythmia classification. Therefore, it is worth paying attention to the CCPCA and GPCA extraction methods: applying factor rotation by class centroids, i.e., types of arrhythmias, increases classification quality by more than 2% compared to the classical PCA method.
Analyzing these results, we conclude that solutions based on PCA obtain slightly lower accuracy than our method. Combining PCA and the wavelet transform also does not provide a significant improvement. We argue that PCA, GPCA, CCPCA, and KPCA assign similar signal amplitudes in some time proximity to the same principal component, which can explain the lower classification performance. In signal fragments with significant variance in amplitude across multiple training samples, PCA can be beneficial. This problem does not occur when using ICA.
Results from [49] are better than our experiments with DWT and ICA applied separately, presented in Fig 6, and at the same time worse than the proposed method. In [40] the authors propose utilizing ICA and the wavelet transform separately with an SVM classifier. The metrics obtained for this solution are also worse than for our method, proving that combining ICA and the wavelet transform is beneficial for ECG beat classification performance. Both 1D and 2D convolutional neural networks obtained superior classification performance compared to our method.
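To illustrate how PCA- and ICA-based extraction can be compared as drop-in steps before a classifier, a minimal scikit-learn sketch follows. The synthetic data and component counts are illustrative assumptions; no conclusions about real ECG data should be drawn from this toy setup.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, FastICA
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Synthetic 4-class data standing in for windowed ECG beats.
X, y = make_classification(n_samples=500, n_features=40, n_informative=12,
                           n_classes=4, random_state=0)

# Swap the extraction step, keep the classifier fixed, and compare mean
# balanced accuracy over the same 5-fold cross-validation.
scores = {}
for name, extractor in [("PCA", PCA(n_components=12)),
                        ("ICA", FastICA(n_components=12, random_state=0))]:
    clf = make_pipeline(extractor,
                        MLPClassifier(solver="lbfgs", max_iter=500,
                                      random_state=0))
    scores[name] = cross_val_score(clf, X, y, cv=5,
                                   scoring="balanced_accuracy").mean()
```

Holding the classifier and folds fixed isolates the effect of the extraction step, which is the comparison made throughout this section.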
This answers research question 4. From the provided results, we can conclude that the best metrics are obtained by convolutional neural networks. This broad group of models is a class of its own when considering performance. For this reason, we provide a more detailed comparison with this type of network. We select [5], as this work provides detailed experiments. Three models were selected from its comparison: AlexNet [50], VGGNet [51], and the custom architecture proposed in [5]. Our results are compared to the selected convolutional networks in Table 5. The best neural network from the results in Table 3 was used for the comparison. Both cases with data augmentation increasing the training data set and with native content of ECG images were taken into account.
Table 5
Best MLP network for heartbeat classification compared to other results from literature.
| Method | Data | Acc (%) | Sp (%) | Re (%) | Pr (%) |
|--------|-----------|---------|--------|--------|--------|
| Proposed | - | 95.58 | 95.82 | 95.42 | 95.92 |
| Jun et al. [5] | Native | 98.90 | 99.64 | 97.20 | 98.63 |
| Jun et al. [5] | Augmented | 99.05 | 99.57 | 97.85 | 98.85 |
| AlexNet [50] | Native | 98.81 | 99.68 | 96.81 | 98.63 |
| AlexNet [50] | Augmented | 98.85 | 99.62 | 97.08 | 98.59 |
| VGGNet [51] | Native | 98.77 | 99.43 | 97.26 | 98.08 |
| VGGNet [51] | Augmented | 98.63 | 99.37 | 96.93 | 97.86 |
The classification metrics obtained for the proposed method are approximately 3-4% worse than for the CNNs. For a more detailed analysis, performance for separate classes needs to be considered. Analyzing the results in Fig 9, we conclude that the MLP network obtained worse performance for arrhythmia types PVC, PAB, RBB, and LBB, but it can be competitive with CNNs for arrhythmia types APC, VFW, and VEB. We presume that the advantage of the results from [5] is connected to the number of learning examples available for each class. When evaluated on classes with a lower amount of available data, convolutional neural networks obtain only comparable results. Therefore, we argue that our method can be on a par with convolutional neural networks in low-data regimes.
Fig 9
Comparison of Recall obtained for our method and CNNs from [5] for selected type of beats.
Number of heartbeats for each type is equal to: PVC—7130, PAB—7028, RBB—7259, LBB—8075, APC—2546, VFW—472, VEB—106. For APC, VFW, and VEB beats, with a lower amount of learning examples available, our method provided better results than convolutional neural networks.
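A per-beat-type recall comparison like the one in Fig 9 can be computed as sketched below. The label and prediction vectors are tiny hypothetical examples; only the ordering of the beat-type names is taken from the caption above.

```python
import numpy as np
from sklearn.metrics import recall_score

beat_types = ["PVC", "PAB", "RBB", "LBB", "APC", "VFW", "VEB"]

# Tiny hypothetical label/prediction vectors, for illustration only
# (class index i corresponds to beat_types[i]).
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6])
y_pred = np.array([0, 0, 1, 1, 2, 0, 3, 3, 4, 4, 5, 5, 6, 6])

# recall_score with average=None yields one recall value per class, which is
# exactly what a per-beat-type comparison is built from.
per_class_recall = dict(zip(beat_types,
                            recall_score(y_true, y_pred, average=None)))
```

Per-class recall is the right lens here because aggregate accuracy hides the behavior on rare classes such as VFW and VEB.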
4.3 Computational requirements
Another important consideration is the computational requirements of each method. At test time, our method utilizes pre-computed matrices of independent components; therefore, no computation of the DWT is needed. For this reason, the number of operations required for computing the full feature vector of an unknown ECG heartbeat can be estimated as:

N_ICA = sum over classes c = 1, ..., C of n * q_c,

where n is the number of samples in one heartbeat, C is the number of classes, and q_c is the number of independent components for class c. There is also a cost of classification, which is inherent to all methods and varies depending on the type of classification algorithm used. Next, we compare the number of operations required by our approach to a convolutional neural network, as these are currently the best-performing models. The number of operations required for 1D convolution layer inference, with padding added to keep the size of the feature map the same, can be written as:

N_conv = Ch_in * Ch_out * D * k,

where Ch_in is the number of input channels, Ch_out is the number of output channels, D is the feature map size, and k is the filter size. Please note that in the above expression the cost of applying bias was omitted for simplicity. The method proposed in this article will require fewer operations than a single convolutional layer if the following condition is met:

sum over classes c of n * q_c < Ch_in * Ch_out * D * k.

The length of signal n is usually a few hundred samples (162 in the case of our experiments) and can be adjusted by subsampling when needed. The number of independent components extracted per class is a hyperparameter that can be tuned. Earlier in this work, we reported the number of all features (i.e., the number of independent components for all classes) as close to 20; therefore, for simplicity, it can be assumed that max(q) = 20. The number of classes depends on the task and dataset used; in our experiments it was 4, and for comparison with other works seven classes were used. Now we analyze the right-hand side of the inequality. With increasing neural network depth, Ch_in and Ch_out usually grow larger and D gets smaller. The number of input and output channels can change depending on the architecture design; they usually reach high values in the last layers, such as 128, 256, or 512 [5, 51]. Filters are usually small (commonly used sizes are 3 or 5 [51]). For this reason, we argue that the condition given above can easily be met when the number of classes C is low and the processed signal n is sufficiently short. Therefore, we conclude that, with moderate assumptions satisfied, the proposed feature extraction method can require fewer operations than a single 1D convolution layer at the top of the network. Neural networks usually contain multiple layers and other operations such as pooling, non-linear activations, and batch normalization, and layers with 2D convolutions require even more operations. Lower compute requirements can allow for easier deployment of the method in practical scenarios.
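The operation-count argument can be checked numerically. The sketch below uses the quantities discussed in the text (n = 162 samples, up to seven classes, roughly 20 components); the convolution layer dimensions are hypothetical but of typical magnitude for a deeper layer.

```python
def ica_feature_ops(n, C, q_max):
    """Upper bound on projections onto pre-computed independent components:
    roughly one length-n dot product per component, per class."""
    return C * q_max * n

def conv1d_ops(ch_in, ch_out, D, k):
    """Multiplications in a single 1D convolution layer with 'same' padding;
    bias cost omitted, as in the text."""
    return ch_in * ch_out * D * k

# Values from the text: n = 162 samples, up to 7 classes, max(q) = 20.
ops_ica = ica_feature_ops(n=162, C=7, q_max=20)       # 22,680
# Hypothetical deeper conv layer: 128 in/out channels, feature map 40, filter 3.
ops_conv = conv1d_ops(ch_in=128, ch_out=128, D=40, k=3)  # 1,966,080

assert ops_ica < ops_conv  # the condition from the text holds here
```

Even with the larger seven-class setting, the feature extraction cost is roughly two orders of magnitude below a single deeper convolution layer under these assumptions.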
5 Conclusions
This paper proposes a new approach to the feature extraction task using independent component analysis, where the wavelet transform is used to construct the mixing matrix. This approach is particularly applicable to signals represented in the time domain. Metrics obtained in the experiments were confronted with other results from the literature, and our method compares favorably to several other works. Convolutional neural networks can obtain better performance; however, for classes with a low number of samples our method can obtain comparable results. Obtaining the large dataset required by deep neural networks can be costly when at least one medical doctor must label each training sample; we argue that our method can be useful in such cases. Also, as shown by our analysis, the proposed algorithm requires fewer computations at test time, which can enable broader applications in edge computing or on embedded devices.
A new method of extracting features of the ECG signal was developed, combining independent component analysis with the wavelet transform. The essence of this approach is that the wavelet transform modifies knowledge in the form of a matrix of independent components. The obtained results confirmed that this solution is competitive and gives better classification quality than known extraction methods. The new method also opens up possibilities for analyzing other biomedical signals and more. Furthermore, the data processing speed makes it applicable to quick analysis of the ECG signal taken directly from an ECG device, whereas real-time processing with convolutional networks may turn out to be too costly.
Future work can include developing feature extraction methods that are more robust to signal misalignment. As the presented algorithm can be utilized for any signal defined in the time domain, adaptation to other applications can be considered.
Features extracted with this method can also be clustered to automatically detect groups of anomalies in signals. Some machine learning algorithms require more data than others to perform well; this property is called data efficiency. Due to the high cost of obtaining large-scale labeled datasets in the medical domain, an interesting direction of research is a principled evaluation of the data efficiency of existing ECG classification approaches.
Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.We look forward to receiving your revised manuscript.Kind regards,Mahmoud Al Ahmad, PhDAcademic EditorPLOS ONEJournal Requirements:When submitting your revision, we need you to address these additional requirements.1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found athttps://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf andhttps://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf2. Thank you for stating the following in the Acknowledgments Section of your manuscript:"his work was supported by the Polish National Science Center, grant No. 2017/27/ B/ST6/01325"We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form.Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. 
Currently, your Funding Statement reads as follows:"This work was supported by the Polish National Science Center, grant No. 2017/27/\\\\B/ST6/01325"Please include your amended statements within your cover letter; we will change the online submission form on your behalf3. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability."Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.We will update your Data Availability statement to reflect the information you provide in your cover letter.4. We note that Figure 3 in your submission contain copyrighted images. 
All PLOS content is published under the Creative Commons Attribution License (CC BY 4.0), which means that the manuscript, images, and Supporting Information files will be freely available online, and any third party is permitted to access, download, copy, distribute, and use these materials in any way, even commercially, with proper attribution. For more information, see our copyright guidelines: http://journals.plos.org/plosone/s/licenses-and-copyright.We require you to either (1) present written permission from the copyright holder to publish these figures specifically under the CC BY 4.0 license, or (2) remove the figures from your submission:1. You may seek permission from the original copyright holder of Figure 3 to publish the content specifically under the CC BY 4.0 license.We recommend that you contact the original copyright holder with the Content Permission Form (http://journals.plos.org/plosone/s/file?id=7c09/content-permission-form.pdf) and the following text:“I request permission for the open-access journal PLOS ONE to publish XXX under the Creative Commons Attribution License (CCAL) CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). Please be aware that this license allows unrestricted use and distribution, even commercially, by third parties. Please reply and provide explicit written permission to publish XXX under a CC BY license and complete the attached form.”Please upload the completed Content Permission Form or other proof of granted permissions as an "Other" file with your submission.In the figure caption of the copyrighted figure, please include the following text: “Reprinted from [ref] under a CC BY license, with permission from [name of publisher], original copyright [original copyright year].”2. 
If you are unable to obtain permission from the original copyright holder to publish these figures under the CC BY 4.0 license or if the copyright holder’s requirements are incompatible with the CC BY 4.0 license, please either i) remove the figure or ii) supply a replacement figure that complies with the CC BY 4.0 license. Please check copyright information on all replacement figures and update the figure caption with source information. If applicable, please specify in the figure caption text when a figure is similar but not identical to the original image and is therefore for illustrative purposes only[Note: HTML markup is below. Please do not edit.]Reviewers' comments:Reviewer's Responses to QuestionsComments to the Author1. Is the manuscript technically sound, and do the data support the conclusions?The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.Reviewer #1: YesReviewer #2: Yes**********2. Has the statistical analysis been performed appropriately and rigorously?Reviewer #1: YesReviewer #2: Yes**********3. Have the authors made all data underlying the findings in their manuscript fully available?The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. 
participant privacy or use of data from a third party—those must be specified.Reviewer #1: YesReviewer #2: Yes**********4. Is the manuscript presented in an intelligible fashion and written in standard English?PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.Reviewer #1: NoReviewer #2: Yes**********5. Review Comments to the AuthorPlease use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)Reviewer #1: The authors propose a novel method for signal feature extraction that combines analysis of the signal waveform and separation of its sources. A wavelet transform is used to modify a mixing matric in the independent component analysis. The approach was validated using an ECG heartbeat classification, against 4 target classes; normal, vestibular ectopic beats and ventricular ectopic beats and fusion strikes. A 95.79% recall was obtained and the authors claim the method performance is close to that of convolutional neural networks for classes with limited number of learning examples and can be easily deployed in environments with limited computing power. Interesting research questions.1. 
Include references for the following;Line 11 “… well-established datasets and evaluation procedures”Line 12”… well-performing algorithms developed specifically for this field.”Line126 “In principal component analysis correlation coefficient…”Line 314 “The proportion of data in train 314 and test split is defined by stratified 5-fold cross validation.”Line 405 “In years 2006-2007 first approaches that combined DWT with ICA were proposed.”Line 410 “Due to this fact, we were able to 410 obtain better results compared to other feature extraction methods.”2. Several abbreviations should be defined before use; ECG, EMG, TP, TN, FN, FP, MLP3. Certain sentences require commas to read better. Grammatical errors should be corrected; line 27-28, 58-60, 67-68, 73-75,90-91,138-139, 311-312, 469 (This is not an exhaustive list. Authors are advised to re-read the manuscript)4. Section 1.1 and section 1.2 are both discussed as feature extraction methods. Section 1.1 should therefore be renamed appropriately perhaps as Independent Component Analysis5. Consistency should be maintained. Equation (17) D is undefined. Does it refer to the width of the window which was defined as d in the text? Ensure all variables used in equations are defined e.g Df in eqn 24 & 25.6. Justification for using negentropy should be included.7. Line 276 ci?8. Table 5 and 6 should be swapped to correspond with the order in which they are discussed in the text to reduce confusion.9. Table headings could be more descriptive e.g “Table 5 Best MLP network for heart beat classification compared to selected convolutional networks reported in the literature”10. Authors argue that in low data regimes the proposed method is comparable to convolutional neural networks, this could be included in the comparison tables for ease of comparison.11. 
Justification for some choice of algorithms should be included e.g why stratified 5-fold cross validation, Pan-Tompkins algorithm etc… were these selected because they are commonly used in the literature? Could the technique benefit from other choices?Reviewer #2: The contribution of this paper is exciting and original. The proposed method seems to be valid for signal feature extraction. The procedure using signal processing can successfully lead to separation between the signal waveform from its resources. Experiments were carried out with an MIT-BIH database to illustrate the four classifications(Normal, Vestibular ectopic beats, Ventricular ectopic beats, and Fusion strikes). The derivations, experimental setup, and discussion section are clear to the readers. I think this paper will exhibit a better understanding of ECG heartbeat classification. Therefore, I recommend accepting this manuscript as it is with some polishing comments:C1. The abstract shall be rewritten to reflect all results yielded. C2. The Introduction could be shorter with more concentration on state of the art and the advantage of the proposed methodology. C3: In any case, profound grammatical revisions are needed to render the document useful for readers. C4: The article should have clear arguments regarding the proposed method's rationale and motivations and explicit discussions of the advances versus the prior art.**********6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.If you choose “no”, your identity will remain anonymous but your review may still be made public.Do you want your identity to be public for this peer review? 
For information about this choice, including consent withdrawal, please see our Privacy Policy.Reviewer #1: NoReviewer #2: No[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.Submitted filename: Reviewers comments _Nedal Al Taradeh.pdfClick here for additional data file.23 Oct 2021Reviewer #1Thank you very much for your substantive comments allowing us to improve our article.RV: ReviewerAA: Author AnswerRV: 1. Include references for the following;Line 11 “… well-established datasets and evaluation procedures”AA: Proper references were added.RV: 1. Include references for the following;Line 12”… well-performing algorithms developed specifically for this field.”AA: Proper references were added.RV: 1. Include references for the following;Line126 “In principal component analysis correlation coefficient…”AA: Proper reference was included.RV: 1. Include references for the following;Line 314 “The proportion of data in train 314 and test split is defined by stratified 5-fold cross validation.”AA: Proper reference was included.RV: 1. 
Include references for the following;Line 405 “In years 2006-2007 first approaches that combined DWT with ICA were proposed.”AA: This line was removed due to remarks provided by second reviewer.RV: 1. Include references for the following;Line 410 “Due to this fact, we were able to 410 obtain better results compared to other feature extraction methods.”AA: Proper citations were added.RV: 2. Several abbreviations should be defined before use; ECG, EMG, TP, TN, FN, FP, MLPAA: We added definitions of abbreviations mentioned above. We made sure that definitions were provided before the first appearance of abbreviation.RV: 3. Certain sentences require commas to read better. Grammatical errors should be corrected; line 27-28, 58-60, 67-68, 73-75,90-91,138-139, 311-312, 469 (This is not an exhaustive list. Authors are advised to re-read the manuscript)AA: All lines pointed out were checked and corrected. We have re-read the manuscript, corrected errors, and also asked one of our colleague for a fresh look.RV: 4. Section 1.1 and section 1.2 are both discussed as feature extraction methods. Section 1.1 should therefore be renamed appropriately perhaps as Independent Component AnalysisAA: Introduction structure was changed. We believe that the current names of subsections in the introduction should be more informative.RV: 5. Consistency should be maintained. Equation (17) D is undefined. Does it refer to the width of the window which was defined as d in the text? Ensure all variables used in equations are defined e.g Df in eqn 24 & 25.AA: D is the width of the window. Naming was changed to improve consistency. All equations were checked for missing references.RV: 6. Justification for using negentropy should be included.AA: Justification for negentropy usage was added to text.RV: 7. Line 276 ci?AA: This is some error introduced during editing of our article. It was removed.RV: 8. 
Tables 5 and 6 should be swapped to correspond with the order in which they are discussed in the text, to reduce confusion.
AA: I am sorry, but there was no Table 6 in the submitted paper. There is one reference to Table 6, but this should in fact be a reference to Figure 6 (previously the data in these plots were presented as a table). This error was corrected; sorry for the confusion. All other references were checked and corrected where needed.

RV: 9. Table headings could be more descriptive, e.g. "Table 5. Best MLP network for heartbeat classification compared to selected convolutional networks reported in the literature".
AA: Thank you very much for your attention. We have supplemented the descriptions of the tables.

RV: 10. The authors argue that in low data regimes the proposed method is comparable to convolutional neural networks; this could be included in the comparison tables for ease of comparison.
AA: Information about performance for classes with a low number of learning examples is presented in Figure 9. For clarity, we included information about the number of beats available, together with a short conclusion. This should justify our claims about performance in low data regimes.

RV: 11. Justification for some choices of algorithms should be included, e.g. why stratified 5-fold cross validation, the Pan-Tompkins algorithm, etc. Were these selected because they are commonly used in the literature? Could the technique benefit from other choices?
AA: Pan-Tompkins is a commonly used algorithm for finding the R wave in an ECG signal; a proper mention was added. 5-fold cross-validation was used because it ensures statistical representativeness on larger data sets and keeps both bias and variance low.

Reviewer #2

Thank you very much for your substantive comments, which allowed us to improve our article.

RV: Reviewer
AA: Author Answer

RV: C1. The abstract shall be rewritten to reflect all results yielded.
AA: The abstract was rewritten to include the most important results and conclusions.

RV: C2.
The Introduction could be shorter, with more concentration on the state of the art and the advantages of the proposed methodology.
AA: The introduction was reorganized and unnecessary information was removed to improve clarity. The subsection on ECG classification methods was rewritten to cover more recent approaches.

RV: C3. In any case, profound grammatical revisions are needed to render the document useful for readers.
AA: The document was checked for grammar, readability, and clarity. We believe that major errors were removed and that the article in its current form should be easier to follow.

RV: C4. The article should have clear arguments regarding the proposed method's rationale and motivations, and explicit discussions of the advances versus the prior art.
AA: A separate subsection for aims and contributions was created in the Introduction to highlight the goals of our work and the possible benefits of our method. A detailed discussion of the benefits and obtained results is given in Section 4.

17 Nov 2021

PONE-D-21-24171R1
Novel Feature Extraction Method for Signal Analysis based on Independent Component Analysis and Wavelet Transform

Dear Dr. Topolski,

We're pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.
Kind regards,
Mahmoud Al Ahmad, PhD
Academic Editor
PLOS ONE

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here.

Reviewer #1: All comments have been addressed
Reviewer #2: All comments have been addressed

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: (No Response)
Reviewer #2: Yes

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: (No Response)
Reviewer #2: Yes

4.
Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: (No Response)
Reviewer #2: Yes

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: (No Response)
Reviewer #2: Yes

6. Review Comments to the Author

Reviewer #1: (No Response)
Reviewer #2: After considering all of the raised comments, I recommend accepting this article for publication in the PLOS ONE journal.

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose "no", your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review?
For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No

23 Nov 2021

PONE-D-21-24171R1
Novel Feature Extraction Method for Signal Analysis based on Independent Component Analysis and Wavelet Transform

Dear Dr. Topolski:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Mahmoud Al Ahmad
Academic Editor
PLOS ONE