Automated diagnosis of COVID stages from lung CT images using statistical features in 2-dimensional flexible analytic wavelet transform.

Abstract

The COVID-19 epidemic has been a global problem since December 2019. COVID-19 is highly contagious and has spread rapidly throughout the world, so early detection is essential. Chest imaging has been shown to help in tracking the progression of COVID-19 lung illness. The respiratory system is the part of the human body most vulnerable to the COVID virus. COVID can be diagnosed promptly and accurately using chest X-ray and computed tomography (CT) images. CT scans are preferred over X-rays to rule out other pulmonary illnesses, assist venous entry, and pinpoint any new heart problems. The traditional and trending tools are manual, time-inefficient, and not sufficiently accurate. Many techniques for detecting COVID from CT scan images have recently been developed, yet none of them can efficiently detect COVID at an early stage. In this work, we propose a novel technique based on the two-dimensional flexible analytic wavelet transform (FAWT). The method decomposes pre-processed images into sub-bands. Relevant statistical features are then extracted, and principal component analysis (PCA) is used to identify robust features. The robust features are then ranked using the Student's t-value algorithm. Finally, the features are fed to a least-squares SVM (LS-SVM) with an RBF kernel for classification. According to the experimental outcomes, our model outperforms state-of-the-art approaches for COVID classification. The model attained a classification accuracy of 93.47%, specificity of 93.34%, sensitivity of 93.6%, and F1-score of 0.93 using tenfold cross-validation.

Entities: Chemical

Keywords: COVID-19; FAWT based image decomposition; Feature extraction; Image classification; Machine learning; Medical imaging

Year: 2022 PMID: 35791429 PMCID: PMC9247116 DOI: 10.1016/j.bbe.2022.06.005

Source DB: PubMed Journal: Biocybern Biomed Eng ISSN: 0208-5216 Impact factor: 5.687

Introduction

Coronaviruses are a group of enveloped ribonucleic acid (RNA) viruses found in mammals and birds that cause respiratory and gastrointestinal disorders, neurological sickness, and hepatitis in rare cases [1]. The coronavirus infection was first reported in Wuhan, China. On 30 January 2020, the World Health Organization (WHO) declared the Coronavirus Disease 2019 (COVID-19) epidemic a public health emergency of international concern [2]. Although the virus has harmed people all over the world, it has had the most significant impact in nations such as the United States, Italy, India, China, Spain, Brazil, Russia, the United Kingdom, Peru, Iran, and Turkey, where over a million people have been infected. The typical clinical symptoms of COVID are fever, cough, shortness of breath, and body pain. Real-time reverse transcription-polymerase chain reaction (RT-PCR) is a frequently used laboratory test for COVID detection, but it takes approximately 10–15 h to deliver results. Another way of testing is the rapid diagnostic test (RDT). These methods have a high false-negative rate and are laborious, less efficient, and time-consuming. Because of a false-negative result, an infected person can sometimes be labeled as COVID-19 negative [4]. If infected persons are not identified promptly, they do not receive appropriate treatment in time. The RT-PCR test is not sufficient in the current pandemic due to its high false detection and alarm rate. Although RT-PCR provides good results, it needs highly skilled persons to handle the samples of COVID patients. For this reason, in March 2020, the US Centers for Disease Control and Prevention withdrew the RT-PCR testing kits [5]. Hence, to overcome these problems, the available medical imaging modalities are beneficial for fast identification because they save time and avoid unnecessary treatment delays [3]. Biomedical imaging has proven to be an excellent tool for non-invasively diagnosing a wide range of disorders.
The discovery of X-rays was the catalyst for the development of medical imaging [7]. As technology has advanced, several imaging modalities have emerged, including the electromagnetic spectrum, radio, ultrasound, microscopy, and many more [8]. Furthermore, multimodal imaging exists, in which more than one imaging modality is used for effective diagnosis. Presently, two medical modalities, CT scans and X-rays, are utilized, but the available tools for reading the images are not accurate enough. X-ray imaging has also been found to provide promising results for COVID identification [6], but CT images are a better choice because they provide more detail than X-rays. Fig. 1 represents the image collection process from the CT scan machine, and Fig. 2 shows CT scan images of COVID and normal lungs.

Fig. 1

CT image acquisition process.

Fig. 2

Lungs CT scan images (a) COVID-19 (b) Normal.

Recently, various models based on traditional machine learning and deep learning have been developed for the automated detection of COVID. For CT scans, various models based on machine learning with ensemble mechanisms have been proposed. In related work, Pramod et al. [7] proposed a wavelet and transfer learning-based model for denoising and classifying COVID using CT images with an accuracy of 85.5%. Kassania et al. [8] proposed a deep learning-based feature extraction model whose features are fed into a machine learning model for COVID classification. With this mechanism, the authors attain the highest classification performance using the DenseNet121 feature extractor with a bagging tree classifier. Amyar et al. [9] proposed a deep learning method using CT images with an accuracy of 0.86 and a sensitivity of 0.94. Yasar et al. [10] developed a texture-based model for COVID classification from CT lung images. A 23-layer CNN is used, with a sensitivity of 0.91 and an F1-score of 0.90. Wu et al. [11] suggested a model with 0.93 specificity and 0.95 sensitivity for classification and segmentation of lung lesion areas affected by COVID. Abraham et al. [12] proposed a model for COVID detection that combines features extracted from multiple CNNs with the correlation-based feature selection (CFS) technique and a BayesNet classifier, with an AUC of 0.963 and an accuracy of 91.16%. Pradeep et al. [13] suggested FBSED-based decomposition for denoising and enhancing the classification rate of the model with different channels. Gaur et al. [14] proposed a model based on a stacked ensemble CNN with a sensitivity of 97.62% for multi-class classification (COVID, Normal, Pneumonia) using X-ray images and 98.31% sensitivity for binary classification using CT images. Wang et al. [15] proposed a model based on a residual deep structural design that uses chest images for COVID-19 detection with an accuracy of 83.5%.
Sarkisov [16] suggested the COVID-CT Mask Net model. Here, R-CNN is used to detect the lesion area of the lung, which is then fed to the classifier. The accuracy of the model is 91.66%, and the F1-score is 0.91. Rashid et al. [17] developed a model for COVID recognition using an autoencoder-decoder with merged features. It supports multi-class classification, with 90.13% and 96.45% accuracy for three and four classes. Ewen et al. [18] proposed a model based on targeted self-supervision for classification with 0.86 accuracy on a small dataset. Mishra et al. [19] proposed transfer learning (TL) techniques based on VGG16 and ResNet50 for COVID detection from lung CT images. This model provides an average classification accuracy of 86.74% with VGG16 and 88.52% with ResNet50. Ahmed et al. [20] suggested a slice-based (2D) and volume-based (3D) deep learning model for COVID detection with 90.8% accuracy. Islam et al. [21] proposed a model in which deep features are extracted by a CNN and then applied to machine learning mechanisms such as SVM, RF, and decision tree. CLAHE is also used to enhance the quality of the CT images. Ebenezer et al. [22] suggested mechanisms for enhancing CT scan images, such as wavelets, the Laplace transform, and CLAHE. The enhanced images are then applied to EfficientNet for COVID classification. Bernheim et al. [3], Singh et al. [23], and Das et al. [24] all demonstrate that patients with COVID have observable abnormalities in chest imaging, such as bilateral disorders, revealing that RT-PCR is not the only approach for COVID diagnosis. The available conventional decomposition tools, such as DWT, EWT, wavelet packet decomposition (WPD), and a combination of empirical wavelet transform (EWT) and discrete wavelet transform (DWT), are non-adaptive and restricted to the dyadic range [25].
Bayram [26] recently developed a valuable wavelet transform based on Hilbert transform pairs of atoms and rational frequency responses to overcome the drawbacks of DWT, such as the lack of shift-invariance, fixed time–frequency covering, and poor frequency response. In this work, FAWT is utilized because it has an analytic frequency response and fractional scaling and shifting parameters that provide a flexible time–frequency covering. It also has good frequency resolution [26], [27]. These advantageous properties make FAWT attractive, and the literature shows that this decomposition method has not yet been used effectively for COVID diagnosis. The main contributions of our work are as follows: 1) Computed tomography images are used for COVID-19 detection to overcome the false negatives of RT-PCR. 2) The proposed two-dimensional FAWT technique is utilized for removing noisy pixels and artifacts from the images. 3) The constructed model has been tested on a publicly available dataset and compared with state-of-the-art methods in terms of performance parameters such as accuracy, sensitivity, and specificity. The remainder of this paper is arranged as follows: Section 2 describes the dataset, Section 3 discusses the pre-processing and methodology of the proposed work, and Section 4 presents the results of the proposed methodology and compares them with existing methods. Section 5 concludes the paper.

Database

We use 2482 images (1252 COVID and 1230 Normal) from the SARS-CoV-CT database for binary classification [28]. The images are resized to a resolution of 240 × 240 pixels and stored in 24-bit PNG format. The database is publicly available at https://www.kaggle.com/plameneduardo/sarscov2-ctscan-dataset. The images of the dataset were obtained from hospitals in Sao Paulo, Brazil. The dataset covers CT images of 60 COVID+ (32 M and 28 F) and 60 COVID− (30 M and 30 F) patients. Fig. 3 shows dataset images, and Fig. 4 shows the patient statistics.

Fig. 3

Dataset images.

Fig. 4

Number of subjects and patients used for composing this dataset.

Proposed methodology

In the proposed method, images are split into their red (R), green (G), and blue (B) channels. We select the green channel because it contains the most relevant information and human vision is most sensitive to green [7]. Contrast-limited adaptive histogram equalization (CLAHE) is then used to enhance the contrast of the channel and the consistency of pixel intensity [29]. Fig. 5 shows the block diagram of the proposed methodology.

Fig. 5

Proposed framework of methodology.

Two dimensional FAWT

For image decomposition, the flexible analytic wavelet transform (FAWT) is an advanced version of DWT. It has attractive properties such as flexible time–frequency covering and high-quality frequency resolution, and it is controlled by basic parameters such as the dilation factor (D), quality factor (Q), and redundancy (R) [26], [30]. For discrete-time signals, a chirplet frame is formed using the Hilbert transform, which contains a pair of atoms and helps in time–frequency analysis [31], [32], [33]. The ith decomposition level is obtained by combining one low-pass filter (LPF) and two high-pass filters (HPFs); one HPF covers positive frequencies and the other negative frequencies [34], [35]. The Q-factor (Q) controls the number of oscillations, while the redundancy (R) governs time localization in the wavelets. Both factors are defined through the sampling parameters p, q, r, and s: p and q are used for up- and down-sampling in the low-pass channel, whereas r and s are used for up- and down-sampling in the high-pass channels [27]. With $Q = \omega_0 / \Delta\omega$, $\omega_0$ is the center frequency and $\Delta\omega$ is the bandwidth. L(ω) denotes the low-pass (scaling) filter response, while H(ω) denotes the high-pass filter response. H(ω) and H*(−ω) represent positive and negative frequencies, respectively, as shown in Fig. 2. L(ω) (Eq. (2)) and H(ω) (Eq. (3)) are defined as in [26], [27]. The reconstruction of the filter bank in FAWT is represented by Eqs. (4) and (5), and the appropriate values of Q (Eq. (5)) and R (Eq. (6)) are selected accordingly. An image I(X, Y) is a 2D array of positive values with X rows and Y columns. The 2D FAWT applies the 1-D FAWT row-wise and then column-wise, so the output of the row operation is used as the input of the column operation [36], [37]. Each pass of this mechanism produces two sub-bands. The approximation coefficients (cA) result from combining the low-frequency values of all rows, and after the row pass they form a reduced version (X × Y/2) of the image.
The detail coefficients (cV, cH, cD) are obtained by combining the high-frequency bands of all rows. Figs. 6 and 7 depict the internal structure of the FAWT methodology and the first level of image decomposition, respectively. With the help of multi-resolution analysis, the approximation coefficients are further decomposed into the sub-bands cA1, cV1, cH1, and cD1, each of size (X/2, Y/2). This decomposition of the approximation coefficients continues until the appropriate level of wavelet decomposition is reached. The proper decomposition level is determined by examining the performance parameters of the model [38].
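The row-then-column procedure described above can be sketched in a few lines. This is only an illustration of the separable 2D scheme and of the resulting sub-band sizes: it swaps in simple Haar averaging and differencing filters instead of the FAWT filter bank, and the function names and the cA/cH/cV/cD labelling convention are assumptions made for the example.

```python
def haar_1d(signal):
    """One-level Haar split of an even-length sequence into (low, high) halves."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


def split_rows(image):
    """Apply the 1-D split to every row; each output has half the columns."""
    lows, highs = [], []
    for row in image:
        low, high = haar_1d(row)
        lows.append(low)
        highs.append(high)
    return lows, highs


def decompose_2d(image):
    """Row-wise pass followed by a column-wise pass (Haar stand-in, not FAWT)."""
    low, high = split_rows(image)             # each is X x Y/2
    ll, lh = split_rows(transpose(low))       # column pass on the low band
    hl, hh = split_rows(transpose(high))      # column pass on the high band
    # Sub-band naming is a convention chosen for this sketch.
    return {
        "cA": transpose(ll),
        "cH": transpose(lh),
        "cV": transpose(hl),
        "cD": transpose(hh),
    }


image = [[8, 8, 8, 8] for _ in range(4)]      # hypothetical flat 4 x 4 image
bands = decompose_2d(image)
for name, band in bands.items():
    print(name, len(band), "x", len(band[0]))
```

Feeding `bands["cA"]` back into `decompose_2d` gives the next level, which mirrors how the approximation coefficients are decomposed repeatedly.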

Fig. 6

Internal structure of the FAWT.

Fig. 7

First level decomposition structure of FAWT.

Feature extraction using LBP and VAR

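After the decomposition of the processed images, statistical features are extracted from the local binary pattern (LBP) histogram (L) and the local variance (VAR) histogram (V) [29], [39]. The features are the median, entropy, mean, standard deviation, skewness, and kurtosis. The mathematical expressions of L and V are as follows: $L_{P,R} = \sum_{p=0}^{P-1} s(n_p - c_f)\, 2^{p}$, with $s(x) = 1$ for $x \ge 0$ and $s(x) = 0$ otherwise, and $V_{P,R} = \frac{1}{P} \sum_{p=0}^{P-1} (n_p - \mu)^2$, with $\mu = \frac{1}{P} \sum_{p=0}^{P-1} n_p$, where P is the number of pixels on the circle of radius R and $n_p$ is a neighborhood pixel around the center pixel $c_f$. In total, 12 × B × C features are extracted, where B is the number of sub-bands, which is four after decomposition, and C = 1 is the number of split channels.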
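As a small worked example of the two descriptors, the sketch below computes the LBP code and the local variance for the center of a single 3 × 3 patch (P = 8, R = 1). The thresholding and variance follow the standard LBP/VAR definitions; the neighbor ordering, the function names, and the sample patch are assumptions made for illustration, not the authors' implementation.

```python
# Neighbour positions visited clockwise from the top-left corner.
# The starting point and direction are a convention chosen for this sketch.
NEIGHBOUR_OFFSETS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def lbp_code(patch):
    """LBP code of the centre pixel of a 3x3 patch (P = 8, R = 1)."""
    center = patch[1][1]
    code = 0
    for bit, (row, col) in enumerate(NEIGHBOUR_OFFSETS):
        if patch[row][col] >= center:   # s(n_p - c_f) = 1 when n_p >= c_f
            code += 2 ** bit
    return code


def local_variance(patch):
    """Variance of the 8 neighbours around the centre of a 3x3 patch."""
    values = [patch[row][col] for row, col in NEIGHBOUR_OFFSETS]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]  # hypothetical grey levels
print(lbp_code(patch))        # 241
print(local_variance(patch))  # 6.984375
```

Sliding this computation over every pixel of a sub-band and collecting the results gives the L and V histograms from which the six statistics are taken.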

Feature normalization, feature selection, ranking

Feature normalization is applied next because it helps enhance the performance of the classifier. In this work, different types of texture-based feature extraction methods are utilized, so the features have different value ranges. Traditional classifiers such as SVM, LS-SVM, RF, XGBoost, and ANN use distance measures while updating their constraints, so low-magnitude features are weighted less than high-magnitude features [40]. Min–max normalization to the range 0–1 resolves this difficulty. The mathematical expression of min–max normalization (M) is as follows [41]: $M(i) = \frac{f(i) - f_{\min}}{f_{\max} - f_{\min}}$, where M(i) is the ith normalized feature, f(i) is the ith feature value, and $f_{\min}$ and $f_{\max}$ are the minimum and maximum values in the feature vector. For images, whether medical or natural, min–max normalization is well suited because pixel values have a fixed range (0–255). After normalization of the features, PCA is used for feature selection, and the selected features are then ranked using a Student's t-test [42], [40]. The feature selection and feature ranking methods are explained below. Principal component analysis (PCA): The feature set obtained from texture-based extraction is commonly large, which makes classification time-consuming [43]. Because not all extracted features play an important role in classification, it is necessary to reduce the feature set and keep only the relevant or significant features. Various dimension reduction techniques are traditionally available, such as PCA and LDA. In this work, PCA is used for reducing the size of the feature set [44]. In this method, the data are represented by a set of new orthogonal variables called principal components (PCs), whose dimension is smaller than that of the original data. Let D be our data represented in an N-dimensional space; PCA is then utilized to represent D in an M-dimensional subspace.
It can be stated mathematically as: $Y = W^{T}(D - \mu)$, where $\mu$ is the mean of the data and the columns of the $N \times M$ matrix $W$ are the $M$ leading eigenvectors of the data covariance matrix. Feature ranking: It is necessary to arrange the selected features in ranked order because applying all features without a ranking makes the model complex and may also affect the classification performance. For good classification, the selected features are arranged such that the variance is large between classes and small within classes. We used the Student's t-value to arrange the features from high to low rank [40]. After ranking, the features are applied to the selected classifier, with high-rank features followed by low-rank features.
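The normalization and ranking steps can be illustrated with a short sketch. It rescales a feature to the range 0–1 and then orders features by the absolute t-value between the two classes. The unequal-variance (Welch) form of the t-statistic, the function names, and the sample values are assumptions for the example; the text does not state which variant of the t-test was used.

```python
from math import sqrt
from statistics import mean, variance


def min_max_normalize(values):
    """Rescale a feature vector to the range [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [0.0 for _ in values]    # constant feature carries no information
    return [(v - lowest) / (highest - lowest) for v in values]


def t_value(class_a, class_b):
    """Absolute Welch t-statistic of one feature (assumes non-zero variance)."""
    standard_error = sqrt(
        variance(class_a) / len(class_a) + variance(class_b) / len(class_b)
    )
    return abs(mean(class_a) - mean(class_b)) / standard_error


def rank_features(features_a, features_b):
    """Return feature indices ordered from highest to lowest t-value."""
    scores = [t_value(a, b) for a, b in zip(features_a, features_b)]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical normalized values: one list per feature, one entry per image.
covid_features = [[0.4, 0.6, 0.5], [0.9, 0.8, 1.0]]
normal_features = [[0.5, 0.4, 0.6], [0.1, 0.2, 0.0]]

print(min_max_normalize([10, 20, 30]))                 # [0.0, 0.5, 1.0]
print(rank_features(covid_features, normal_features))  # [1, 0]
```

In the example, the second feature separates the classes far better than the first, so it receives the top rank and would be presented to the classifier first.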

Least square support vector machine (LS-SVM) classifier

LS-SVM is an advanced version of SVM. It can work with linear, non-linear, small, and large datasets [45]. LS-SVM solves a set of linear equations, whereas SVM solves a quadratic programming problem [46]. Here, a hyperplane is used to create the decision boundary, and a radial basis function (RBF) kernel is used to separate non-linear features. RBF is a flexible kernel [47], and it is suitable for separating our non-linear data. The kernel functions of SVM are as follows: the linear kernel (LK), $K(x, x_i) = x_i^{T} x$; the polynomial kernel (PK) of degree $n$, $K(x, x_i) = (x_i^{T} x + 1)^{n}$; and the radial basis function (RBF), $K(x, x_i) = \exp\left(-\|x - x_i\|^{2} / (2\sigma^{2})\right)$. Here, $\sigma > 0$ is the scale parameter, and its value is tuned experimentally (Table 5). To separate the data, SVM uses the kernel $K$ in the decision function $f(x) = \mathrm{sign}\left[w^{T}\varphi(x) + b\right] = \mathrm{sign}\left[\sum_{i} \alpha_i y_i K(x, x_i) + b\right]$. Here, $\alpha_i$ is the Lagrange multiplier, $y_i$ is the class label of training sample $x_i$, $\varphi$ is the feature map induced by the kernel, $w$ is the weight vector, $T$ stands for transpose, and $b$ is a constant coefficient. If the sigma value is overestimated, the kernel behaves almost linearly, and if it is underestimated, the model lacks regularization. The PCA-selected, ranked features are fed to the LS-SVM classifier for COVID classification. The performance of our proposed method has been evaluated based on various parameters, namely accuracy, specificity, sensitivity, and F-score, discussed below [48]. The sensitivity indicates how well the model can classify the positive class. The specificity reveals how well it can categorize the negative class. The accuracy measures how well it can categorize positive class images into the positive class and negative class images into the negative class. The F1 score is the harmonic mean of precision and recall. The AUC measures how well a model can distinguish between different classes [49], [50]. With true positives, true negatives, false positives, and false negatives denoted by $N_{TP}$, $N_{TN}$, $N_{FP}$, and $N_{FN}$, the measures are $\text{Sensitivity} = N_{TP}/(N_{TP}+N_{FN})$, $\text{Specificity} = N_{TN}/(N_{TN}+N_{FP})$, $\text{Accuracy} = (N_{TP}+N_{TN})/(N_{TP}+N_{TN}+N_{FP}+N_{FN})$, and $F1 = 2N_{TP}/(2N_{TP}+N_{FP}+N_{FN})$.
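To make the three kernels concrete, the following sketch evaluates them on plain Python lists. The exact forms used here (an offset of 1 in the polynomial kernel and $2\sigma^2$ in the RBF denominator) are common textbook conventions and are assumptions, since other scalings are also in use; the sample vectors are hypothetical.

```python
from math import exp


def linear_kernel(x, y):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree):
    """Polynomial kernel with an offset of 1 (assumed convention)."""
    return (linear_kernel(x, y) + 1) ** degree


def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel with 2 * sigma^2 in the denominator (assumed convention)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-squared_distance / (2 * sigma ** 2))


x = [1.0, 2.0]   # hypothetical feature vectors
y = [3.0, 4.0]

print(linear_kernel(x, y))           # 11.0
print(polynomial_kernel(x, y, 2))    # 144.0
for sigma in (0.3, 2.1, 30.0):
    print(sigma, rbf_kernel(x, y, sigma))
```

The loop shows the effect of the scale parameter: a very small sigma drives the similarity of distinct points toward 0, while a very large sigma pushes every value toward 1, which is the almost-linear regime mentioned above.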

Result and discussions

Results

This work presents a COVID-19 classification model that uses a CT scan of the lungs to determine whether or not a person is infected with COVID. Feature extraction is essential for illness identification, and previous research has demonstrated the importance of frequency-oriented data. Data are extracted from CT-scan sub-band images using the FAWT approach. Recent work employed wavelet transformation for classification using CT scans and basic machine learning techniques such as principal component analysis (PCA), LDA, RF, SVM, and K-nearest neighbors (KNN). We propose a process that combines FAWT for image decomposition, statistical feature extraction, and machine learning algorithms for classification, and the results show a considerable improvement over previous work. In the current investigation, three preliminary experiments were conducted. The first experiment was conducted to determine the appropriate level of decomposition of the FAWT approach for COVID-19 diagnosis. The second experiment was conducted to determine the best classifier; the robust and important features are also extracted and examined in this experiment. In the third experiment, the best model was investigated using different numbers of cross-validation folds. In this work, images are decomposed into sub-band images with the help of 2D-FAWT. Afterwards, statistical LBP and VAR features are extracted from the sub-band images, PCA is utilized for dimensionality reduction, and the robust features are ranked with the help of the Student's t-value [51]. Finally, least-squares SVM is used for COVID stage classification [52]. Tenfold cross-validation is used to test the performance of the proposed model. The proposed method has desirable properties such as flexible time–frequency covering, better frequency resolution, and effortless control of the main parameters D, R, and Q [26].
With the help of the 2D-FAWT method, the processed images are decomposed into sub-band images at different frequency scales. The number of decomposition levels and the parameters of the FAWT channels are chosen experimentally to achieve the maximum feasible classification accuracy. The model accuracy is examined at every decomposition level from one to six, as shown in Fig. 8. We found that the performance parameters improve up to the 6th level and do not improve further at the 7th and 8th levels. In addition, the computational time increases with the number of decomposition levels. Hence, after fixing the decomposition level, different values of p, q, r, and s are examined with different D, R, and Q factors [26], [27], [33]. The selected values of these parameters are p = 1, q = 2, r = 1, and s = 2, and the beta value is 0.5. After selection of the beta value, the Q-factor is fixed at 3 (varied from 2 to 4). The values of beta and the Q-factor are decided according to Eq. (5). The D factor is varied from 0.5 to 1 (0.5, 0.6, 0.75, 0.83, 1). The selected value of D is 0.83, and R is 1. Table 3 shows all selected parameters of FAWT. Fig. 8 shows the maximum accuracy of 93.47%, sensitivity of 93.6%, and specificity of 93.34% at the 6th level of image decomposition. Beta = 0.5, Q = 3, D = 0.83, and R = 1 are the optimal parameters for designing the FAWT filter bank. On this database, conventional decomposition methods such as DWT [53], EWT [7], [54], CWT, Curvelet [55], and Contourlet [56] perform poorly compared with the proposed 2D-FAWT method, as illustrated in Table 2. The classification accuracy of our proposed decomposition approach was 93.47%. Constant-Q transforms such as the DWT and X-let allow only dyadic scale decomposition, so their frequency resolution is lower. As a result, these strategies require more decomposition levels and more computing effort to obtain results similar to those of our proposed methodology.
From each image, a total of 48 (12 × B × C) features are extracted. Here, the number of sub-bands (B) is 4, and the number of channels (C) is 1 because only the green channel is used.
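The reported parameter choices can be cross-checked with a few lines of arithmetic. The sketch below assumes the relations $Q = (2 - \beta)/\beta$ and $R = (r/s)/(1 - p/q)$ that are commonly given for FAWT in the literature cited as [26], [27]; these formulas do not appear in the text above, so they should be read as an assumption rather than as the authors' equations.

```python
from fractions import Fraction


def fawt_quality_factor(beta):
    """Q-factor, assuming the relation Q = (2 - beta) / beta."""
    return (2 - beta) / beta


def fawt_redundancy(p, q, r, s):
    """Redundancy, assuming the relation R = (r / s) / (1 - p / q)."""
    return Fraction(r, s) / (1 - Fraction(p, q))


def feature_count(per_band, bands, channels):
    """Number of features per image: 12 x B x C in the notation of the text."""
    return per_band * bands * channels


beta = Fraction(1, 2)          # beta = 0.5 as reported
p, q, r, s = 1, 2, 1, 2        # sampling parameters as reported

print(fawt_quality_factor(beta))       # 3
print(fawt_redundancy(p, q, r, s))     # 1
print(feature_count(12, 4, 1))         # 48
```

Under these assumed relations, the reported settings reproduce Q = 3 and R = 1, and the feature count matches the 48 features stated above.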

Fig. 8

Performance of model per decomposition level.

Table 3

FAWT parameters.

Parameters	Q = 2 (p = 1, q = 3, r = 2,s = 3)				Q = 3 (p = 1, q = 2, r = 1, s = 2)				Q = 4 (p = 3, q = 5, r = 2, s = 5)
Parameters	RF	SVM	KNN	LS-SVM	RF	SVM	KNN	LS-SVM	RF	SVM	KNN	LS-SVM
Accuracy	91.32	91.63	90.28	90.57	91.12	90.23	91.48	93.47	90.22	90.23	90.48	92.27
sensitivity	90.12	91.54	90.34	91.62	91.32	91.56	91.64	93.6	90.43	91.56	91.44	92.56
specificity	91.48	92.36	91.38	91.34	92.4	90.3	92.3	93.34	91.44	90.3	90.35	92.38

	DF = 0.75 (p = 1, q = 2)				DF = 0.83 (p = 4,q = 4)				DF = 1 (p = 2,q = 2)
Accuracy	92.22	91.23	90.42	93.47	89.42	90.23	91.48	92.47	91.32	90.43	91.48	92.27
sensitivity	91.62	90.56	91.58	93.6	90.32	91.56	90.64	92.68	92.42	91.26	90.64	91.68
Specificity	90.48	91.38	92.48	93.34	91.43	90.3	92.3	91.34	91.34	90.23	92.36	90.24

	R = 1 (r = 1, s = 2)				R = 2 (r = 2,s = 2)				R = 3(r = 3,s = 2)
Accuracy	90.42	90.63	90.8	93.47	90.62	90.63	90.48	93.67	90.32	91.23	90.28	93.07
Sensitivity	91.62	91.76	92.64	93.6	91.42	91.26	91.54	93.26	91.62	90.56	91.54	93.64
Specificity	90.48	90.34	91.36	93.34	90.58	91.38	92.36	93.62	92.48	91.32	90.34	93.24

Table 2

Comparison of various Image decomposition methods.

Decomposition	ACC. (%)	Spec. (%)	Sens. (%)
DWT	86.56	88.2	85.1
EWT	89.26	90.86	87.47
Curvelet	89.26	90.86	87.47
Contourlet	90.28	91.3	88.83
CWT	90.56	91.46	89.62
2D-FAWT*	93.4	93.34	93.62

*Our selected decomposition method.

PCA is utilized for feature reduction, and the number of retained components is decided using the cumulative sum of variances (CSoV) per principal component (PC). The PCs are selected based on a threshold on the CSoV. In our work, the threshold is varied from 90% to 98% [57], and we found that at the 92% threshold, 11 of the 48 features qualify for COVID classification. The PCA results are shown in Table 4. Fig. 9 shows the performance of our proposed model per number of features; the model attains its highest performance with 11 features. The selected robust and relevant features are applied to various classifiers, and their results are shown in Table 6.
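The CSoV rule amounts to keeping the smallest number of leading components whose cumulative share of the variance reaches the threshold. A minimal sketch follows; the function name and the variance values are hypothetical and chosen only to show the selection logic, not taken from the experiments.

```python
def components_for_threshold(explained_variances, threshold):
    """Smallest number of leading PCs whose cumulative variance share
    reaches the given threshold (a fraction between 0 and 1)."""
    ordered = sorted(explained_variances, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count
    return len(ordered)


# Hypothetical per-component variances (eigenvalues of the covariance matrix).
variances = [4.0, 3.0, 2.0, 1.0]

print(components_for_threshold(variances, 0.50))   # 2
print(components_for_threshold(variances, 0.92))   # 4
```

Raising the threshold can only keep the same number of components or add more, which is the trend visible across the 90% to 98% columns of Table 4.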

Table 4

Performance at different CSoV (PCA) and LDA.

Parameters	PCA (CSoV)				LDA
Parameters	90%	92%	95%	98%	LDA
Features	8	11	20	30	11
Accuracy	92.62	93.47	92.56	92.6	91.38
Sensitivity	91.23	93.6	92.87	93.53	92.57
Specificity	92.3	93.34	93.40	93.28	90.4

Fig. 9

Performance of model per feature using LS-SVM.

Table 6

Performance of the model with respect to the number of Fold.

Classifier	Fold number	ACC. (%)	Spec. (%)	Sens. (%)
Random Forest	2	84.23	86.4	87.2
	4	86.38	88.32	86.36
	6	88.46	89.27	88.4
	8	87.6	88.68	89.7
	10	91.05	89.3	89.9

Support vector machine	2	83.63	84.56	84.65
	4	87.58	88.23	87.38
	6	86.74	87.31	88.21
	8	89.3	87.65	89.58
	10	90.25	88.36	91.3

Least square support vector machine	2	85.3	87.78	88.6
	4	87.42	89.4	91.64
	6	86.12	88.62	88.4
	8	91.56	93.5	92.37
	10	93.47	93.34	93.6

Random forest (RF) [58], K-nearest neighbour (KNN) [59], support vector machine (SVM), and least-squares SVM [60] are utilized for classification with tenfold cross-validation. The number of decision trees in this work is set to 100; the best number of trees for RF is determined by trial and error, taking computing time and performance into account. The number of neighbors for the KNN classifier is set to 1, and the distance metric is the Euclidean distance. The kernel function of the SVM classifier is a Gaussian function. In the LS-SVM classifier, the RBF parameter sigma (σ) is varied from 0.3 to 3 in increments of 0.3, as represented in Table 5. Table 5 also shows the results for the linear and polynomial kernels. At a sigma value of 2.1, our model attains the highest classification accuracy with Bayesian optimization [61]. K-fold cross-validation was conducted to ensure that the classification performance of the model was trustworthy and unbiased. For K-fold cross-validation, the entire database was randomly partitioned into K equal sets. Our model has been evaluated with 2, 4, 6, 8, and 10 folds, as depicted in Table 6, and it attains the highest parameter values with 10 folds.
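The random partition used for K-fold cross-validation can be sketched as follows. The helper name, the fixed seed, and the interleaved way of forming the folds are assumptions made for illustration; they are not a description of the tooling used in the experiments.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for K-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # random partition of the data
    folds = [indices[i::k] for i in range(k)]     # K nearly equal sets
    for i, test_indices in enumerate(folds):
        train_indices = [
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        ]
        yield train_indices, test_indices


# 2482 images as in the dataset, split into 10 folds.
for fold, (train_idx, test_idx) in enumerate(k_fold_indices(2482, 10), start=1):
    print(fold, len(train_idx), len(test_idx))
```

Each image appears in exactly one test fold, so averaging the per-fold scores uses every image for testing once and for training K − 1 times.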

Table 5

Classification performance of our model on dataset with different kernels.

kernel	parameters	Acc. (%)	Spe. (%)	Sen. (%)
LK		88.6	87.3	89.3
PK	n = 2	89.23	87.6	90.6
PK	n = 3	90.32	89.32	91.4

RBF	Different Sigma values (σ)
	1.5	92.46	89.5	90.2
	1.8	91.78	92.4	91.7
	2.1	93.47	93.34	93.6
	2.4	93.1	92.6	91.4
	2.7	92.7	87.2	90.2

Table 1 shows the confusion matrix after tenfold cross-validation for binary classification. All of the performance metrics are generated using the confusion matrix. According to the confusion matrix, our binary classification model misclassified 162 of the 2,482 images. COVID-positive patients accounted for 82 of the 162 misclassified images, while non-COVID patients accounted for the remaining 80. This confirms that our method correctly identifies COVID patients with high true positive and true negative rates. In addition, we conducted an ablation test to verify the effectiveness of the 2-D FAWT, shown in Table 8. We observed that the selected 2-D FAWT decomposition module improves accuracy, specificity, and sensitivity by 4.87, 3.05, and 6.7 percentage points, respectively. This demonstrates the utility of a multiresolution-based technique for disease detection. We conducted the experiments to evaluate the model performance on the publicly available dataset. The proposed model is compared with various previously developed traditional and deep learning models. We found that our model is less complex and needs less processing time for COVID detection. Using Thonny and MATLAB, we performed our experiments on an Intel Core i3 central processing unit (CPU) at 2.27 GHz with 8 GB of RAM, running the 64-bit Microsoft Windows 10 operating system (OS).

Table 1

Confusion matrix of model.

	COVID	Non-COVID
COVID	1170	82
Non-COVID	80	1150
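As a worked check, the performance measures can be recomputed from the four counts in Table 1. The sketch below reads the rows of the table as predicted labels and the columns as actual labels; this orientation is an assumption, chosen because it reproduces the reported figures. The function name is hypothetical.

```python
def classification_metrics(n_tp, n_tn, n_fp, n_fn):
    """Standard binary classification measures from confusion-matrix counts."""
    sensitivity = n_tp / (n_tp + n_fn)
    specificity = n_tn / (n_tn + n_fp)
    accuracy = (n_tp + n_tn) / (n_tp + n_tn + n_fp + n_fn)
    precision = n_tp / (n_tp + n_fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Counts from Table 1, with rows taken as predicted labels (assumption).
metrics = classification_metrics(n_tp=1170, n_tn=1150, n_fp=82, n_fn=80)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

With this reading, the accuracy, sensitivity, and specificity come out at 93.47%, 93.60%, and 93.34%. Reading the rows as actual labels instead swaps the two off-diagonal counts and changes the sensitivity and specificity slightly.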

Table 8

Ablation experiment result.

Dataset	Module	Acc. (%)	Spe. (%)	Sen. (%)
SARS-CoV-CT	W/O*	88.6	90.29	86.9
SARS-CoV-CT	W*	93.47	93.34	93.6

W/O* without and W* means with decomposition.

In Table 7, the performance of our model is compared with that of existing approaches on CT scan images. Pramod et al. [7] used the empirical wavelet transform and transfer learning for COVID diagnosis. Pradeep et al. [13] used FBSE-based image decomposition, a CNN for deep feature extraction, and RF, Naive Bayes, AdaBoost, and softmax classifiers for COVID classification. Ewen et al. [18] proposed a self-supervision method for small labeled datasets, which is advantageous over transfer learning techniques. Huan et al. [11] proposed a joint classification and segmentation method for COVID detection with 93% average specificity and 95% average sensitivity. Ahmed et al. [20] proposed a slice- and volume-based deep learning model for COVID detection. Di et al. [63] proposed an Uncertainty Vertex-weighted Hypergraph Learning (UVHL) method to identify COVID using CT images. In this method, different features are extracted from CT images, and the relation between different cases is built using a hypergraph. Wang et al. [64] developed contrastive cross-site learning with a redesigned model for COVID classification. All existing models shown in Table 7 are limited in their performance parameters. Our proposed model outperforms or is comparable to the majority of existing research in this domain. All of the approaches had an accuracy of more than 85%. Among the known methods, [64] achieves the highest parameter values: their model has an accuracy of 90.83%, an F1 score of 0.9, and an AUC of 0.96. The lack of images of COVID patients could explain the lower accuracy of the other approaches. The bulk of the techniques used machine learning and deep learning because these are the most widely used methods for pattern recognition and image classification.
Most of the COVID detection and classification methods presented in Table 7 rely on transfer learning, smaller datasets, and manual feature extraction from images, and they use only one type of image, i.e., X-rays or CT scans, not both. Our model offers several valuable properties, such as being more efficient with fewer features and having lower computing complexity. Our suggested methodology decomposes the COVID and non-COVID images into sub-band images. The statistical features are extracted from those meaningful sub-images, which improves classification performance effectively. Furthermore, because the dimension of the features acquired by the suggested method is modest, doctors can use them for diagnosis, which is not possible with previous approaches. In our case, however, the image resolution is left unchanged in order to preserve all of the image information. In summary, the proposed method is very efficient, has low computational complexity, and remains effective with a limited number of features compared with state-of-the-art approaches for identifying COVID, with a high accuracy of 93.47%, specificity of 93.34%, and sensitivity of 93.6%. In all tables, bold items represent the best results of the proposed methodology.

Table 7

Comparison of the model with existing methods.

Previous methods	Acc(%)	Spe(%)	Sen(%)	F1(%)	AUC(%)
Yang et al. [62]	89	–	–	90	–
Ewen et al. [18]	86.6	–	–	87.4	86.09
Pramod et al. [7]	85.5	–	–	85.2	96.6
Di et al. [63]	–	94.1	93.2	–	–
Huan et al. [11]	–	93	95	78.5	–
Ahmed et al. [20]	90.80	–	–	–	0.9
Wang et al. [64]	90.83	–	–	0.90	0.96
Pradeep et al. [13]	–	96.5	–	0.97	0.98
Our method	93.4	93.34	93.6	93	93.62

Ablation experiment result: W/O* means without and W* means with decomposition.

Discussions

Due to the limits of nucleic acid-based laboratory testing, there has been an urgent demand for faster alternatives that front-line health care workers can use to diagnose the condition promptly and reliably. Gray-level texture analysis of images has been employed in the past to build automated diagnosis systems for a variety of diseases [65], [66]. Nucleic acid-based detection of specific sequences of the SARS-CoV-2 gene has been the gold standard for COVID-19 diagnosis. While we continue to value nucleic acid detection in the diagnosis of SARS-CoV-2 infection, it is essential to emphasize that the significant number of false negatives caused by methodological flaws, disease phases, and specimen collection procedures may cause delays in diagnosis and disease control. According to recent research, nucleic acid testing accuracy ranges between 30 and 50 percent [67]. With the fast rise in suspected COVID-19 cases around the world, developing effective automated techniques for COVID-19 diagnosis from CT imaging is critical in order to increase clinical diagnosis efficiency and relieve clinicians and radiologists of tedious work. However, correct COVID-19 identification from CT imaging is non-trivial, owing to the highly similar patterns of COVID-19 and other pneumonia types and the substantial variation in the appearance of COVID-19 lesions in patients with varying degrees of severity. Recently, a number of models have been proposed to overcome this challenge [7], [11], [13], [18], [20], [62], [63], [64], resulting in significant advances in the field of automated COVID-19 diagnosis. In this study, we developed an automated approach for COVID diagnosis using FAWT and statistical features. In the proposed methodology, COVID images are separated into distinct R, G, and B channels.
Because the green channel has the highest intensity and contrast of all the color channels, it contains more discriminative information than the other color channels. The other two channels carry nearly the same information at lower contrast, so using them for classification would only add to the pre-processing steps. Furthermore, generating a greyscale image requires averaging the three channels as a final step, which results in information loss compared to using the green channel alone [68]. FAWT is used to decompose images at various levels. It is a more sophisticated form of DWT, with attractive features such as time-frequency covering, as well as basic tuning parameters such as the dilation factor (D), quality factor (Q), and redundancy (R) [26]. Table 3 shows the values of the FAWT parameters. Fig. 8 and Table 2 show the results per decomposition level and the decomposition performance compared to other methods, respectively. LBP and VAR are texture operators that efficiently capture variations in greyscale pixel values. In this study, we used LBP and VAR to capture the information present in decomposed images. The underlying texture variation in decomposed images is captured using statistical features [39]. The information in FAWT images is effectively captured by the median, entropy, mean, standard deviation, skewness, and kurtosis [31], [33]. Entropy is, in general, a measure of signal complexity or uncertainty. These features have lately been examined for a variety of biomedical signal processing applications. They have previously been shown to be useful for detecting glaucoma [31] and human seizures and for classifying fatty liver disease [69]. The benefit of these features is their universality and flexibility, which is attributable to the parameters involved, which allow for many uncertainty measures.
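The six statistical descriptors named above can be illustrated with a minimal sketch. It assumes a sub-band that has been flattened into a list of integer-quantised values; the function name and the sample patch are hypothetical, and the exact definitions used in the study (for example, population versus sample variance, or how real-valued coefficients are binned for entropy) are not stated in the text, so the choices below are assumptions.

```python
import math
from collections import Counter


def statistical_features(pixels):
    """Mean, median, std, skewness, kurtosis and Shannon entropy of a flat list."""
    n = len(pixels)
    mean = sum(pixels) / n
    ordered = sorted(pixels)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    # Population variance (divide by n); an assumption, see lead-in.
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(variance)
    # Standardised third and fourth central moments; set to 0 for a constant patch.
    skewness = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3) if std else 0.0
    # Non-excess kurtosis (a normal distribution would give 3).
    kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * std ** 4) if std else 0.0
    # Shannon entropy in bits of the empirical value histogram.
    counts = Counter(pixels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {
        "mean": mean,
        "median": median,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "entropy": entropy,
    }


# Hypothetical flattened sub-band values, for illustration only.
patch = [0, 0, 1, 1, 2, 2, 3, 3]
features = statistical_features(patch)
```

In practice the same function would be applied to each decomposed sub-band, and the resulting values concatenated into one feature vector per image.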
We investigated these features for COVID diagnosis using CT lung images because of their known efficiency [70]. Table 4 indicates the superiority of PCA features over LDA features, as also shown in Fig. 9. This is because normal images have higher pixel intensity variations, whereas COVID images have lower pixel intensity variations. These findings suggest that the features extracted from the decomposed green channel are useful in distinguishing between the two classes. We found that combining Student's t-value feature ranking with LS-SVM improved the overall performance of the proposed approach. Tables 5 and 6 show the classification performance for various kernels and classifiers. CT scans are more suitable for COVID than X-rays because a computed tomography (CT) scan creates a 360° picture of the body, which provides a high level of detail. As a result, a CT scan is preferred for emergency and diagnostic purposes. In conclusion, the suggested methodology has the following advantages over existing methodologies. Using 11 features, the suggested technique achieved 93.47% accuracy, 93.6% sensitivity, and 93.34% specificity. As a result, our strategy helps ease the workload on doctors during mass patient screening. When tested on a dataset of CT lung images, the suggested method surpassed the existing methodologies for COVID diagnosis, achieving the highest classification accuracy among the methods compared in Table 7. The current limitations of the study are that the proposed approach has not been tested on large and diverse databases and that FAWT employs two separate filters for the high-frequency component. The competitiveness of our results and methods is shown in Table 9.
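The idea of ranking features by their Student's t-value can be sketched as follows. This is a minimal illustration that assumes the pooled-variance two-sample t-statistic and ranks features by its absolute value; the exact formula and ranking rule used in the study are not given in the text, and the function names and sample data are hypothetical.

```python
import math


def pooled_t(a, b):
    """Two-sample Student's t-statistic with pooled variance for one feature."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    var_a = sum((x - mean_a) ** 2 for x in a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (nb - 1)
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    return (mean_a - mean_b) / math.sqrt(pooled * (1 / na + 1 / nb))


def rank_features(covid, non_covid):
    """Return feature indices sorted by descending |t|.

    Each argument is a list of samples; each sample is a list of feature values.
    """
    n_features = len(covid[0])
    scores = []
    for j in range(n_features):
        t = pooled_t([row[j] for row in covid], [row[j] for row in non_covid])
        scores.append((abs(t), j))
    return [j for _, j in sorted(scores, reverse=True)]


# Hypothetical data: feature 0 separates the classes, feature 1 barely does.
covid = [[1.0, 5.0], [1.2, 5.1], [0.9, 4.8], [1.1, 5.2]]
non_covid = [[3.0, 5.0], [3.1, 4.9], [2.9, 5.3], [3.2, 5.1]]
ranking = rank_features(covid, non_covid)
```

Features at the front of `ranking` differ most between the two classes relative to their spread, so they would be passed to the classifier first.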

Table 9

Comparison of the proposed model with previously developed models.

Ref.	Methods	Results (%)	Limitations/challenges
Wang et al. [71]	AI + Graphical features	Acc: 79.3; Spe: 83; Sen: 67	Extracting graphical features from COVID CT images takes time, which affects the performance of the model.
Gaur et al. [7]	Empirical wavelet transform + Transfer learning	Acc: 85.5; F1-score: 85.28; Spe: –	An EWT-based method cannot discriminate signals that overlap in the time and frequency domains. This method also suffers from boundary distortion and noise sensitivity.
Wang et al. [64]	Modified COVID-Net	Acc: 90.83; F1-score: 90; Sen: –	The image resolution is reduced during feature extraction from the lesion area, which affects the model performance.
Chaudhary et al. [13]	FBSED + ML	Acc: 97.6; F1-score: 98; Sen: 97	This model has higher performance, but the image decomposition method it uses takes more time, so the model suffers from higher computational complexity.
Gour et al. [14]	Stacked CNN model	Acc: 98.3; Spe: –; Sen: 97.6	This model is applicable only to large datasets because CNNs require more training data.
Yan et al. [72]	Multi-Scale CNN	Acc: 87.5; Spe: 87.5; Sen: 89.1	This method cannot identify unique features and considers only the general pattern of CT images, which causes misclassification or a high false-negative rate. It has been tested only on a small dataset.
Hasan et al. [73]	DCNN + 2-D EMD	Acc: 91.87; Spe: 91.24; F1-score: 91.94	The EMD method suffers from boundary distortion, noise sensitivity, and a lack of rigorous mathematical proofs.

Proposed model

S.No.	Method	Advantages

1.	FAWT	Time–frequency covering is the most significant property of FAWT. FAWT also addresses the lack of shift-invariance and the poor frequency resolution of DWT. Compared with other image decomposition methods, it has fractional scaling and shifting properties, which help to enhance the performance of the model, as shown in Table 2. Because the decomposed images have high resolution, statistical features are easily extracted.
2.	LBP and VAR	This technique is based on statistical features, which cover all distinctive characteristics of the decomposed images. It also helps to find unique features, which helps identify diseases from CT images.
3.	Student t-value and LS-SVM (RBF)	A hyperplane is used to create the decision boundary, and radial basis function (RBF) kernels are used to separate non-linear features. RBF is a flexible kernel, and it is suitable for separating our non-linear data.
Results of proposed model: FAWT + Statistical features (LBP + VAR) + LS-SVM (RBF). The proposed model is more appropriate than previously developed models because it has a lower false-negative rate. The performance metrics of the model are Acc 93.4, Spe 93.34, Sen 93.6, and F1-score 93. For COVID detection, the proposed model requires fewer features, and FAWT-based decomposition preserves the information without loss.


Conclusion

In this study, we developed an approach to the problem of testing and identifying COVID-19 patients. The 2-D FAWT technique is used for image decomposition for the first time in our work. The key advantages of FAWT are that it is shift-invariant, has tunable oscillatory bases, and can cover a wide range of time and frequency variations. This method improves performance and is helpful for deep feature extraction from decomposed sub-band images using LBP and VAR histograms. PCA is then used for dimensionality reduction, and the Student's t-test algorithm ranks the selected features. The ranked features are then applied to different classifiers: RF, KNN, SVM, and LS-SVM (RBF). The effectiveness of our model has been evaluated using tenfold cross-validation. FAWT-based decomposition improves the performance metrics of the model in terms of accuracy, sensitivity, and specificity by 4.87%, 3.14%, and 6.7%, respectively. Based on the experimental results, the proposed model is more suitable than the other models considered because it classified COVID images with an accuracy of 93.47%. Compared to previous studies, our approach gives better outcomes with fewer features and lower computational complexity. Based on the results, we infer that our approach is appropriate for COVID detection. The suggested model also outperformed the existing methods compared for lung CT image classification. This methodology can also be extended to diagnose other diseases, such as tuberculosis, brain tumors, glaucoma, cancer, and influenza, via X-rays, CT scans, and other imaging modalities.
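The tenfold cross-validation protocol mentioned above can be sketched as a fold assignment over sample indices. The sketch below assumes stratified, unshuffled folds so that each fold keeps roughly the same COVID/Non-COVID ratio; whether the study stratified or shuffled its folds is not stated, and the function names and labels are hypothetical.

```python
def stratified_folds(labels, k=10):
    """Assign sample indices to k folds, keeping class ratios roughly equal."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for indices in by_class.values():
        # Deal the indices of each class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def train_test_splits(labels, k=10):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    folds = stratified_folds(labels, k)
    for held_out in range(k):
        test = sorted(folds[held_out])
        train = sorted(
            idx
            for fold_id, fold in enumerate(folds)
            if fold_id != held_out
            for idx in fold
        )
        yield train, test


# Hypothetical balanced labels: 0 = Non-COVID, 1 = COVID.
labels = [0] * 50 + [1] * 50
splits = list(train_test_splits(labels, k=10))
```

Each sample is tested exactly once; the reported metric would then be the average over the ten held-out folds.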

CRediT authorship contribution statement

Rajneesh Kumar Patel: Conceptualization, Methodology, Software, Resources, Writing – original draft, Writing – review & editing. Manish Kashyap: Supervision, Visualization, Resources, Formal analysis, Investigation, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

33 in total

1. Atherosclerotic risk stratification strategy for carotid arteries using texture-based features.

Authors: U Rajendra Acharya; S Vinitha Sree; M Muthu Rama Krishnan; Filippo Molinari; Luca Saba; Sin Yee Stella Ho; Anil T Ahuja; Suzanne C Ho; Andrew Nicolaides; Jasjit S Suri
Journal: Ultrasound Med Biol Date: 2012-04-21 Impact factor: 2.998

2. Comparison and ensemble of 2D and 3D approaches for COVID-19 detection in CT images.

Authors: Sara Atito Ali Ahmed; Mehmet Can Yavuz; Mehmet Umut Şen; Fatih Gülşen; Onur Tutar; Bora Korkmazer; Cesur Samancı; Sabri Şirolu; Rauf Hamid; Ali Ergun Eryürekli; Toghrul Mammadov; Berrin Yanikoglu
Journal: Neurocomputing Date: 2022-02-10 Impact factor: 5.779

3. A novel coronavirus outbreak of global health concern.

Authors: Chen Wang; Peter W Horby; Frederick G Hayden; George F Gao
Journal: Lancet Date: 2020-01-24 Impact factor: 79.321

4. Rapid COVID-19 diagnosis using ensemble deep transfer learning models from chest radiographic images.

Authors: Neha Gianchandani; Aayush Jaiswal; Dilbag Singh; Vijay Kumar; Manjit Kaur
Journal: J Ambient Intell Humaniz Comput Date: 2020-11-16

5. Hypergraph learning for identification of COVID-19 with CT imaging.

Authors: Donglin Di; Feng Shi; Fuhua Yan; Liming Xia; Zhanhao Mo; Zhongxiang Ding; Fei Shan; Bin Song; Shengrui Li; Ying Wei; Ying Shao; Miaofei Han; Yaozong Gao; He Sui; Yue Gao; Dinggang Shen
Journal: Med Image Anal Date: 2020-11-26 Impact factor: 8.545

10. Complex features extraction with deep learning model for the detection of COVID19 from CT scan images using ensemble based machine learning approach.

Authors: Md Robiul Islam; Md Nahiduzzaman
Journal: Expert Syst Appl Date: 2022-02-04 Impact factor: 6.954