
Particle Swarm Optimization-Based Extreme Learning Machine for COVID-19 Detection.

Musatafa Abbas Abbood Albadr1, Sabrina Tiun1, Masri Ayob1, Fahad Taha Al-Dhief2.   

Abstract

COVID-19 (coronavirus disease 2019) is an ongoing global pandemic caused by severe acute respiratory syndrome coronavirus 2. Recently, it has been demonstrated that voice data from the respiratory system (i.e., speech, sneezing, coughing, and breathing) can be processed via machine learning (ML) algorithms to detect respiratory system diseases, including COVID-19. Consequently, many researchers have applied various ML algorithms to detect COVID-19 by using voice data from the respiratory system. However, most recent COVID-19 detection systems have worked on a limited dataset: they utilize cough and breath sounds only and ignore other respiratory sounds, such as speech and vowels. Another issue that should be considered in COVID-19 detection systems is the classification accuracy of the algorithm. The particle swarm optimization-extreme learning machine (PSO-ELM) is an ML algorithm that can be considered accurate and fast in classification. Therefore, this study proposes a COVID-19 detection system that utilizes the PSO-ELM as a classifier and mel frequency cepstral coefficients (MFCCs) for feature extraction. The respiratory system voice samples were taken from the Corona Hack Respiratory Sound Dataset (CHRSD). The proposed system involves thirteen different scenarios: breath deep, breath shallow, all breath, cough heavy, cough shallow, all cough, count fast, count normal, all count, vowel a, vowel e, vowel o, and all vowels. The experimental results demonstrated that the PSO-ELM attained its highest accuracies of 95.83%, 91.67%, 89.13%, 96.43%, 92.86%, 88.89%, 96.15%, 96.43%, 88.46%, 96.15%, 96.15%, 95.83%, and 82.89% for the breath deep, breath shallow, all breath, cough heavy, cough shallow, all cough, count fast, count normal, all count, vowel a, vowel e, vowel o, and all vowel scenarios, respectively.
The PSO-ELM is an efficient technique for the detection of COVID-19 utilizing voice data from the respiratory system.
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022, Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


Keywords:  Mel frequency cepstral coefficients; Particle swarm optimization-extreme learning machine

Year:  2022        PMID: 36247809      PMCID: PMC9554849          DOI: 10.1007/s12559-022-10063-x

Source DB:  PubMed          Journal:  Cognit Comput        ISSN: 1866-9956            Impact factor:   4.890


Introduction

COVID-19 (coronavirus disease 2019), which is caused by SARS-COV-2 (severe acute respiratory syndrome coronavirus 2), was declared a worldwide pandemic on 11 March 2020 by the World Health Organization (WHO). COVID-19 is considered a new infectious disease, but it is similar to other diseases caused by coronaviruses, such as severe acute respiratory syndrome coronavirus (SARS-COV) and Middle East respiratory syndrome coronavirus (MERS-COV), which caused disease outbreaks in 2002 and 2012, respectively [1]. Fever, dry coughs, and fatigue are the most common COVID-19 symptoms [2]. Other COVID-19 symptoms include muscle pain, joint pain, gastrointestinal symptoms, shortness of breath, and loss of taste or smell [3]. At the time of writing, there were 209.9 million active COVID-19 cases worldwide and 4.4 million deaths. The USA has recorded the greatest number of new COVID-19 cases (37.1 million) and 622,263 deaths [4]. The scale of the pandemic has caused several health systems to be overwhelmed by case management and the need for testing. Several efforts have been made to detect early COVID-19 symptoms by applying artificial intelligence (AI) and machine learning (ML) algorithms to images [5]. Extreme learning machines (ELMs) have been demonstrated to outperform other ML algorithms, such as support vector machines (SVMs), convolutional neural networks (CNNs), and backpropagation-based neural networks (NNs), in such tasks [6]. For instance, the optimized genetic algorithm-extreme learning machine (OGA-ELM) has been shown to detect COVID-19 from X-ray images with 100.00% accuracy [6]. In addition, the multiple kernels-ELM based on a deep neural network has been demonstrated to detect COVID-19 from computed tomography (CT) images with 98.36% accuracy [7].
Coughing is one of the most common COVID-19 symptoms as well as a symptom of more than a hundred other illnesses, which affect the respiratory system differently [8, 9]. For instance, lung illnesses can cause either obstruction or restriction in the airway, which can affect cough acoustics [10]. Additionally, the glottis has been assumed to behave differently under various pathological conditions [11, 12], which makes it possible to distinguish among coughs due to asthma, tuberculosis (TB), pertussis (whooping cough), and bronchitis [13-15]. Data on the respiratory system, such as speech, sneezing, eating behavior, coughing, and breathing, can be processed via ML algorithms to detect illnesses of the respiratory system [16]. Therefore, many recent studies have been conducted to detect COVID-19 by using voice data from the respiratory system. For example, the authors in [17] presented a new application for the detection of COVID-19 by implementing the short-time magnitude spectrogram features (SRMSF) and the ResNet18 classifier. The application was tested on cough voice data from the Coswara dataset. The best result was an area under the ROC curve (AUC) of 0.72. In addition, the authors in [18] proposed a system for the detection of COVID-19 by implementing the mel-filter bank features (MFBF) and an SVM classifier. The proposed system was evaluated on speech voice data collected from YouTube. The highest accuracy achieved was 88.60%. Additionally, an application for the detection of COVID-19 was proposed in [19]. The proposed application was built by using the MFCC features and the ResNet50 classifier, and it was tested on cough voice data. The best result was a specificity of 94.2%.
The authors in [20] proposed a COVID-19 detection system by implementing several handcrafted features and logistic regression (LR) classifiers. The proposed system was evaluated based on cough and breath voice data taken from a crowdsourced dataset. The experimental results showed that the highest precision was 80.00% for cough voice data and 69.00% for breath voice data. Moreover, a new COVID-19 detection system was proposed in [21]. The proposed system uses mel-spectrogram features (MSF) and the CNN architecture. The system was evaluated based on the ESC-50 dataset and a collected dataset by recording COVID-19 samples using a mobile app. The evaluation was based on three different scenarios: speech, cough, and overall voice data. The highest results were achieved with an accuracy of 92.00%, 92.85%, and 88.00% for speech, cough, and overall, respectively. In addition, the authors in [22] proposed a COVID-19 detection system that consists of two phases. The first phase applies several steps to extract the needed features. These steps start with MFCC followed by variable Markov oracle (VMO) and recurrence plot (RP) and end with recurrence quantification analysis (RQA). The second phase feeds the extracted features into the eXtreme gradient boosting (XGBoost) classifier. The proposed system was evaluated based on cough and vowel “ah” voice data collected by Carnegie Mellon University (CMU). The experimental results showed that the proposed system achieved an accuracy of up to 97.00% and 99.00% for cough and vowel voice data, respectively. Furthermore, a new COVID-19 detection application was presented in [23]. The presented system was based on using MFCC features and the long short-term memory (LSTM) classifier. The application was tested by using three different scenarios: cough, breath, and speech voice data. The highest experimental results were achieved with an accuracy of 97.00% for cough, 98.20% for breath, and 88.20% for speech. 
Table 1 summarizes the previous works on COVID-19 detection by using respiratory system voice data.
Table 1

Summary of the previous works on COVID-19 detection by using respiratory system voice data

Ref | Dataset | Features | Classifier | Parameters/modality | Results | Disadvantages
[17] | Coswara dataset | SRMSF | ResNet18 | Cough | 0.72 AUC | 1. The output results are not encouraging and need more improvement. 2. Only one scenario has been conducted while other scenarios such as breath, counting fast, counting normal, and vowels (i.e., i, e, and o) have been ignored
[18] | Collected from YouTube | MFBF | SVM | Speech | 88.60% accuracy |
[19] | Own collected dataset | MFCC | ResNet50 | Cough | 94.2% specificity |
[20] | Crowdsourced dataset | Several handcrafted features | LR | Cough and breath | 80.00% precision for cough and 69.00% precision for breath | 1. The output results are not encouraging and need more improvement. 2. Only two scenarios have been conducted while other scenarios such as counting fast, counting normal, and vowels (i.e., i, e, and o) have been ignored
[21] | ESC-50 dataset and COVID-19 | MSF | CNN | Speech, cough, and overall | Accuracy of 92.00%, 92.85%, and 88.00% for speech, cough, and overall, respectively | 1. The output results are not encouraging and need more improvement. 2. More scenarios such as counting fast, counting normal, and vowels (i.e., i, e, and o) have been ignored
[22] | Collected by CMU | MFCC-VMO-RP-RQA | XGBoost | Cough and vowel “ah” | 97.00% accuracy for cough and 99.00% accuracy for vowel “ah” | More scenarios such as counting fast, counting normal, and vowels (i.e., i, e, and o) have been ignored
[23] | Own collected dataset | MFCC | LSTM | Cough, breath, and speech | Accuracy of 97.00% for cough, 98.20% for breath, and 88.20% for speech |
Among the most common feature extraction methods used in the voice data processing field are linear predictive coding (LPC), cepstrum coefficients derived from LPC (LPCC), MFCC, and perceptual linear prediction (PLP) [24-26]. Of these methods, MFCC is generally the most popular feature extraction approach in voice applications and has been reported to have the highest identification accuracy [27, 28]. Recently, the effectiveness of the ELM has been demonstrated in many domains, such as foreign accent identification [29], emotion recognition [30, 31], and language identification [32]. In addition, the ELM is preferred by researchers because it is superior to traditional SVMs [33-35], specifically in (1) thwarting overfitting, (2) its applicability to both binary and multiclass classification, and (3) offering a kernel-based capability similar to that of the SVM while working with an NN structure. These factors make the ELM more efficient in achieving better learning performance. Hence, overall, the ELM demonstrates a faster learning process, better generalization without overtraining, and better results, depending on the input weights used. Therefore, researchers have integrated PSO into the ELM to obtain the best input weights and biases for the hidden layer of the ELM.
The effectiveness of this integration (PSO-ELM) has been demonstrated in many domains, including language identification [36], short-term temperature prediction [37], and breast cancer detection [38]. However, to the best of our knowledge, no research has used the PSO-ELM in the detection of COVID-19. Therefore, the aims of this study are as follows: (1) Propose a new COVID-19 detection system based on the PSO-ELM classifier and MFCC features using different voice data of the respiratory system. In the proposed method, we used thirteen different scenarios: deep breath, shallow breath, all breaths, heavy cough, shallow cough, all coughs, count fast, count normal, all counts, vowel a, vowel e, vowel o, and all vowels. (2) Implement the NN, random forest (RF), and basic ELM classifiers based on the MFCC features for the detection of COVID-19. (3) Evaluate the performance of the proposed system using several evaluation measures, such as accuracy, recall, precision, specificity, F-measure, G-mean, and execution time.

The remainder of this paper is organized as follows. “Materials and Proposed Method” section provides the materials and proposed method. “Experimental Setup and Results” section discusses the results of the experiments. Finally, “Discussion” section presents the conclusion of this paper.

Materials and Proposed Method

The general diagram of the proposed COVID-19 detection system using the PSO-ELM method is shown in Fig. 1. The diagram consists of three stages that are used to create the COVID-19 detection system based on voice signals. The first stage refers to the voice datasets (i.e., breathing, coughing, counting, and vowels) for healthy and COVID-19-infected people. In the second stage, the MFCC method is utilized to extract the needed features from the voice signals. In the third stage, the extracted MFCC features are fed into the PSO-ELM classifier to detect COVID-19 based on the voice signal. These three stages are described in the following subsections.
Fig. 1

Block diagram of the proposed COVID-19 detection system


Dataset

In this study, the Corona Hack Respiratory Sound Dataset (CHRSD) is utilized to evaluate the proposed system. The CHRSD was downloaded from [39]. The dataset contains multiple categories of respiratory sounds, such as deep breath, shallow breath, heavy cough, shallow cough, count fast, count normal, vowel a, vowel e, and vowel o. The voice samples were recorded from healthy and COVID-19-infected people, both female and male. The duration of the voice samples ranges from 1 to 25 seconds. Table 2 describes the dataset that was used in this study. It is worth mentioning that all the experiments in this study were conducted with 70% of the samples used for training and 30% for testing. In addition, the data samples, in both training and testing, were shuffled.
Table 2

Description of the whole dataset

Scenario | Healthy samples (label 1) | COVID-19 samples (label 2)
Deep breath | 39 | 39
Shallow breath | 39 | 39
All breath | 78 | 78
Heavy cough | 45 | 45
Shallow cough | 45 | 45
All cough | 90 | 90
Count fast | 42 | 42
Count normal | 45 | 45
All count | 87 | 87
Vowel a | 43 | 43
Vowel e | 43 | 43
Vowel o | 41 | 41
All vowels | 127 | 127
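The 70%/30% shuffled split described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: the helper names `split_counts` and `shuffled_split` are hypothetical, and the rounding rule (test size equal to 0.3 × n rounded half up) is an assumption inferred from the per-class counts reported in Tables 5–8, where, for example, 45 samples are divided into 31 for training and 14 for testing.

```python
import random


def split_counts(n_samples: int) -> tuple[int, int]:
    """Return (n_train, n_test) for a 70/30 split of one class.

    Assumption: the test size is 0.3 * n rounded half up. Integer
    arithmetic avoids floating-point surprises at exact .5 boundaries.
    """
    n_test = (3 * n_samples + 5) // 10  # floor(0.3 * n + 0.5)
    return n_samples - n_test, n_test


def shuffled_split(samples, seed=0):
    """Shuffle the samples of one class, then cut them 70/30."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train, _ = split_counts(len(items))
    return items[:n_train], items[n_train:]


# Per-class sample counts taken from Table 2.
for n in (39, 45, 42, 43, 41, 78, 90, 87, 127):
    print(n, split_counts(n))
```

Applying the split per class keeps the training and testing sets balanced between healthy and COVID-19 samples, which matches the equal per-class counts in the tables.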

Feature Extraction: MFCC

In this study, the MFCC features [40, 41] were extracted from the voice samples through several processing steps: preemphasis, windowing, fast Fourier transform (FFT), mel-filter bank, log, and discrete cosine transform (DCT). Preemphasis is the first step of MFCC feature extraction, which aims to boost the energy at high frequencies. Windowing is the second step, and it segments the voice sample into frames. FFT is the third step, which converts the voice signal from the time domain into the frequency domain because the voice data features exist in the frequency domain. The mel-filter bank is the fourth step, which approximates how much energy appears in each mel-spaced frequency band. Log is the fifth step, which compresses the dynamic range of the filter-bank energies in a way that simulates the human hearing system. DCT is the last step, which converts the log mel spectrum back to the time domain. The whole process of extracting the MFCC features is shown in Fig. 2. Table 3 lists the MFCC variable values that have been utilized in this paper. The frame size and the frameshift size in samples are calculated as depicted in Eqs. (1 and 2).
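Eqs. (1 and 2) are not reproduced in this text. A form consistent with the values in Table 3 would be Nw = round(Tw × fs / 1000) and Ns = round(Ts × fs / 1000), with Tw and Ts in milliseconds and fs the sampling rate in Hz; this form is an assumption, not a quotation from the paper. The short sketch below checks it against Table 3. The function name `samples_from_ms` is hypothetical.

```python
import math


def samples_from_ms(duration_ms: int, fs_hz: int) -> int:
    """Convert a duration in milliseconds to a whole number of samples.

    Rounds half up, as MATLAB's round() does. Python's built-in round()
    rounds halves to the nearest even integer and would give 1102 here.
    """
    exact = duration_ms * fs_hz / 1000  # 25 ms at 44,100 Hz -> 1102.5
    return math.floor(exact + 0.5)


fs = 44100                      # sampling rate from Table 3
Nw = samples_from_ms(25, fs)    # frame size (Tw = 25 ms)
Ns = samples_from_ms(10, fs)    # frameshift size (Ts = 10 ms)
print(Nw, Ns)                   # 1103 441
```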
Fig. 2

The process of MFCC feature extraction

Table 3

The value of the MFCC variables that have been used in this study

Variable | Value
Sampling rate | 44,100 Hz
Tw | 25 ms
Ts | 10 ms
Nw | 1103
Ns | 441
Number of MFCC features | 13

Nw: the size of the frame in samples. Tw: the duration of the frame in time. Ns: the size of the frameshift in samples. Ts: the duration of the frameshift in time.
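The six MFCC steps can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions rather than the authors' implementation: the frame size, frameshift, sampling rate, and number of coefficients come from Table 3, whereas the preemphasis coefficient (0.97), the Hamming window, the FFT length (2048), and the number of mel filters (26) are common defaults chosen here and are not given in the paper.

```python
import numpy as np


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters spaced evenly on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising slope
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling slope
            fbank[m - 1, k] = (right - k) / max(right - center, 1)
    return fbank


def mfcc(signal, fs=44100, frame_len=1103, hop=441,
         n_filters=26, n_ceps=13, alpha=0.97, n_fft=2048):
    """Return an (n_frames, n_ceps) array; needs len(signal) >= frame_len."""
    signal = np.asarray(signal, dtype=float)
    # 1) Preemphasis: boost the energy at high frequencies.
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # 2) Windowing: cut overlapping frames and taper each one.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # 3) FFT: power spectrum of each (zero-padded) frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # 4) Mel-filter bank: energy per mel-spaced band.
    energies = power @ mel_filterbank(n_filters, n_fft, fs).T
    # 5) Log: compress the dynamic range (small offset avoids log(0)).
    log_energies = np.log(energies + 1e-10)
    # 6) DCT-II: keep the first n_ceps cepstral coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1)
                   / (2 * n_filters))
    return log_energies @ basis.T
```

For a one-second signal at 44,100 Hz, this yields one 13-dimensional vector per 10 ms frameshift. How the per-frame vectors are combined into a single input for the classifier is not specified in the text and is left out of the sketch.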

Classification: PSO-ELM

The PSO-ELM is based on the concept of the PSO algorithm. In the PSO-ELM, the values of both the input weights and the hidden layer biases are tuned by the PSO to achieve high accuracy. The PSO-ELM processing steps are described in detail below, alongside the flowchart depicted in Fig. 3.
Fig. 3

PSO-ELM algorithm flowchart

N is a group of featured samples (X_i, t_i), where X_i = [x_i1, x_i2, …, x_in]^T ∈ R^n and t_i = [t_i1, t_i2, …, t_im]^T ∈ R^m. Here, X_i is the input, which is the extracted MFCC features, and t_i is the expected output (true value).

Step 1: Randomly initialize the particle swarm (P) in the range of [0, 1] for the bias values of the hidden neurons and [−1, 1] for the values of the input weights, where P = {p_1, p_2, …, p_z} and z refers to the population size. The position of the ith particle is p_i = {w_11, w_12, …, w_1n, w_21, w_22, …, w_2n, …, w_L1, w_L2, …, w_Ln, b_1, …, b_L}. The velocity of the ith particle is V_i = {v_i1, v_i2, …, v_iD}, where D = (1 + n) × L. In addition, the velocity of each particle is limited to the range of [−V_max, V_max]. Determine the maximum iteration number k_max. Here, w_ij ∈ [−1, 1] denotes the input weight that links the jth input neuron and the ith hidden neuron; b_i ∈ [0, 1] represents the bias of the ith hidden neuron; n refers to the number of input neurons; L refers to the number of hidden neurons; and (1 + n) × L denotes the dimension of the particle, whose parameters are to be optimized.

Step 2: Split the dataset into two subsets (training set and testing set). Set the number of hidden layer neurons as L and select the appropriate ELM activation function g(x) [42]. In Eqs. (3–5), β is the output weights matrix, t is the true value, and N is the number of training samples. H in Eq. (5) is the hidden layer output matrix of the ELM network; the ith column of H is the output of the ith hidden layer neuron with respect to the input neurons. H† is the Moore–Penrose generalized inverse of H. The activation function g is infinitely differentiable when the desired number of hidden neurons is L ≤ N.

Step 3: Calculate the fitness value of each particle (p_i) in the population (P) utilizing Eq. (3).

Step 4: Find the best position of each particle, Plocal_i, as well as the best position among all particles, P_g. Subsequently, update them (Eqs. 6 and 7).

Step 5: Modify the position and velocity of each particle based on Eqs. (8 and 9) and recalculate the fitness value for each particle via Eq. (3). Here, C1 and C2 denote acceleration coefficients, which are nonnegative constants; k refers to the present iteration number; i = 1, 2, …, z; d = 1, 2, …, D; and ω denotes the inertial weight. r1 and r2 are two randomly generated numbers in the range of [0, 1].

Step 6: If the stopping criteria are reached, then save the optimal input weights and biases of the input-hidden layers; otherwise, go to step 4.

Step 7: The results of the PSO optimization are utilized as input weights and biases for the ELM algorithm, and the hidden layer output matrix (H) is calculated by using Eq. (5).

Step 8: Calculate the output weights (β) based on Eq. (4) and save the ELM prediction model for testing.

Table 4 illustrates the parameters of the ELM and PSO that have been used in this study.
Table 4

The ELM and PSO parameters

ELM parameter | Value | PSO parameter | Value
P | Input weights and biases | Population (particles) | Contain positions and velocities
β | Output weights | Position | Generated randomly at the beginning, in the range of [−1, 1] for input weights and [0, 1] for biases
Input weight | In the range of [−1, 1] | Velocity | Starts with zero values and is limited to the range of [−2, 2]
Biases values | In the range of [0, 1] | z | 50
Input neurons number (n) | Input attributes | ω, C1, C2 | 0.7289, 1.496, 1.496
Hidden neurons number (L) | 100–600, with 50 increment step | kmax | 100
Output neurons | Class values | Plocalid | Best particle position
Activation function | Sigmoid | Pg | Best position of all particles
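The eight steps above can be sketched in NumPy as follows. The equations themselves are not reproduced in this text, so the sketch fills them with standard forms that are assumptions: training RMSE as the fitness (Eq. 3), β = H†T for the output weights (Eq. 4), a sigmoid hidden layer for H (Eq. 5), and the usual PSO velocity and position updates (Eqs. 8 and 9). The default parameter values follow Table 4. All function names are hypothetical, and drawing r1 and r2 independently per dimension is a choice made here.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def hidden_output(X, particle, n_hidden):
    """Decode a particle into input weights W (L x n) and biases b (L,),
    then return the hidden layer output matrix H (N x L)."""
    n_in = X.shape[1]
    W = particle[: n_in * n_hidden].reshape(n_hidden, n_in)
    b = particle[n_in * n_hidden:]
    return sigmoid(X @ W.T + b)


def fitness(X, T, particle, n_hidden):
    """Assumed fitness: training RMSE of the ELM defined by the particle."""
    H = hidden_output(X, particle, n_hidden)
    beta = np.linalg.pinv(H) @ T  # output weights via Moore-Penrose inverse
    rmse = float(np.sqrt(np.mean((H @ beta - T) ** 2)))
    return rmse, beta


def pso_elm(X, T, n_hidden=100, z=50, k_max=100,
            omega=0.7289, c1=1.496, c2=1.496, v_max=2.0, seed=0):
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    dim = (1 + n_in) * n_hidden
    # Step 1: weights in [-1, 1], biases in [0, 1], zero initial velocity.
    pos = np.hstack([rng.uniform(-1.0, 1.0, (z, n_in * n_hidden)),
                     rng.uniform(0.0, 1.0, (z, n_hidden))])
    vel = np.zeros((z, dim))
    # Step 3: fitness of every particle.
    fit = np.array([fitness(X, T, p, n_hidden)[0] for p in pos])
    # Step 4: personal bests and global best.
    local_best, local_fit = pos.copy(), fit.copy()
    g = int(np.argmin(local_fit))
    global_best, global_fit = local_best[g].copy(), float(local_fit[g])
    history = [global_fit]
    for _ in range(k_max):
        # Step 5: velocity and position updates, then re-evaluate.
        r1, r2 = rng.random((z, dim)), rng.random((z, dim))
        vel = (omega * vel + c1 * r1 * (local_best - pos)
               + c2 * r2 * (global_best - pos))
        vel = np.clip(vel, -v_max, v_max)
        pos = pos + vel
        fit = np.array([fitness(X, T, p, n_hidden)[0] for p in pos])
        improved = fit < local_fit
        local_best[improved] = pos[improved]
        local_fit[improved] = fit[improved]
        g = int(np.argmin(local_fit))
        if local_fit[g] < global_fit:
            global_best, global_fit = local_best[g].copy(), float(local_fit[g])
        history.append(global_fit)
    # Steps 7-8: final ELM from the best input weights and biases.
    beta = fitness(X, T, global_best, n_hidden)[1]
    return global_best, beta, history


def predict(X, particle, beta, n_hidden):
    """Predicted class index per row (targets assumed one-hot encoded)."""
    return np.argmax(hidden_output(X, particle, n_hidden) @ beta, axis=1)
```

A call such as `pso_elm(X_train, T_train, n_hidden=150)` with one-hot targets returns the best particle, the output weights, and the best fitness per iteration. Each iteration solves one pseudoinverse per particle, which is consistent with the PSO-ELM execution times in Table 9 being far larger than those of the basic ELM in Table 10.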

Experimental Setup and Results

Several experiments were conducted based on four main scenarios: breath, cough, count, and vowels. In the breath scenario, three different scenarios were implemented: breath deep, breath shallow, and all breath. Additionally, three different scenarios (i.e., cough heavy, cough shallow, and all cough) were applied in the cough scenario. Moreover, in the count scenario, three different scenarios were performed: count fast, count normal, and all count. Last, four different scenarios (i.e., vowel a, vowel e, vowel o, and all vowels) were conducted in the vowel scenario. All the experiments used 70% of the dataset for training and 30% for testing. The proposed PSO-ELM was applied in several experiments in each scenario based on the scenario’s dataset, with the number of hidden neurons varied from 100 to 600 in steps of 50, and each experiment had 100 iterations. All the experiments were implemented in MATLAB R2019a on a PC with a 3.20 GHz Core i7 processor, 16 GB of RAM, and a 1 TB SSD (Windows 10). In this study, numerous evaluation measurements were utilized to evaluate the proposed PSO-ELM approach. The evaluation measurements rely on the ground truth: the model is applied to predict the answer on the evaluation dataset, and the predicted target is then compared with the actual answer. The proposed PSO-ELM approach was evaluated in terms of true positive (TP), true negative (TN), false positive (FP), false negative (FN), recall, accuracy, specificity, G-mean, precision, F-measure, and execution time. Equations (10–15) [43-45] depict these evaluation measurements. The experimental results of the four main scenarios (breath, cough, count, and vowels) are provided and discussed separately in the following subsections.
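Equations (10–15) are not reproduced in this text. The sketch below uses the standard confusion-matrix definitions of accuracy, precision, recall, specificity, and F-measure. The G-mean is taken as the square root of precision × recall; this is an inference that reproduces the values reported in Tables 9 and 10, not a formula quoted from the paper. The function name `evaluate` is hypothetical.

```python
import math


def evaluate(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Return the evaluation measures as percentages rounded to 2 decimals.

    Assumes tp + fp > 0, tp + fn > 0, and tn + fp > 0; otherwise a
    ZeroDivisionError is raised.
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    g_mean = math.sqrt(precision * recall)  # assumed form, see lead-in
    measures = {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_measure": f_measure,
        "g_mean": g_mean,
    }
    return {name: round(100 * value, 2) for name, value in measures.items()}


# Confusion-matrix counts of the "All breath" row in Table 9.
print(evaluate(tp=22, tn=19, fp=4, fn=1))
```

With the counts of the all breath scenario (TP = 22, TN = 19, FP = 4, FN = 1), the function returns values that agree with the corresponding row of Table 9 to two decimal places.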

Breath Scenario

This section provides and discusses the performance of the PSO-ELM in the breath scenario, where the breath scenario includes three different scenarios: breath deep, breath shallow, and all breath. In the breath deep scenario, we used the voice samples that contain only breathing deep sounds. The shallow breath scenario was performed by using voice samples that contain only shallow breathing sounds. In the scenario of all breaths, we used the voice samples that contain both deep breathing and shallow breathing sounds. Table 5 describes the dataset that was used in each scenario.
Table 5

The dataset that was used in deep breath, shallow breath, and all breath scenarios

Scenario | Class | Number of all samples | Number of training samples | Number of testing samples | Label
Deep breath | Healthy | 39 | 27 | 12 | 1
Deep breath | COVID-19 | 39 | 27 | 12 | 2
Shallow breath | Healthy | 39 | 27 | 12 | 1
Shallow breath | COVID-19 | 39 | 27 | 12 | 2
All breath | Healthy | 78 | 55 | 23 | 1
All breath | COVID-19 | 78 | 55 | 23 | 2

The best experimental results of the proposed PSO-ELM in the deep breath, shallow breath, and all breath scenarios were obtained with overall accuracies of 95.83%, 91.67%, and 89.13%, respectively. Figure 4 shows the ROC of the best results for the deep breath, shallow breath, and all breath scenarios.
Fig. 4

The ROC of PSO-ELM best results for the deep breath, shallow breath, and all breath scenarios


Cough Scenario

This section provides and discusses the performance of the PSO-ELM in a cough scenario where the cough scenario includes three different scenarios: heavy cough, shallow cough, and all coughs. In the heavy cough scenario, we used voice samples that contain only heavy coughing sounds. The shallow cough scenario was performed by using the voice samples that contain only shallow coughing sounds. In the scenario of all coughs, we used voice samples that contain both heavy coughing and shallow coughing sounds. Table 6 describes the dataset that was utilized in each scenario.
Table 6

The dataset that was utilized in heavy cough, shallow cough, and all cough scenarios

Scenario | Class | Number of all samples | Number of training samples | Number of testing samples | Label
Heavy cough | Healthy | 45 | 31 | 14 | 1
Heavy cough | COVID-19 | 45 | 31 | 14 | 2
Shallow cough | Healthy | 45 | 31 | 14 | 1
Shallow cough | COVID-19 | 45 | 31 | 14 | 2
All cough | Healthy | 90 | 63 | 27 | 1
All cough | COVID-19 | 90 | 63 | 27 | 2

The best experimental results of the proposed PSO-ELM in the heavy cough, shallow cough, and all cough scenarios were obtained with overall accuracies of 96.43%, 92.86%, and 88.89%, respectively. Figure 5 shows the ROC of the best results for the heavy cough, shallow cough, and all cough scenarios.
Fig. 5

The ROC of PSO-ELM best results for the heavy cough, shallow cough, and all cough scenarios


Count Scenario

This section provides and discusses the performance of the PSO-ELM in the count scenario, which contains three different scenarios: count fast, count normal, and all count. In the count fast scenario, we utilized the voice samples that contain only fast counting sounds. The count normal scenario was performed by utilizing the voice samples that contain only normal counting sounds. In the scenario of all counts, we utilized the voice samples that contain both fast counting and normal counting sounds. Table 7 describes the dataset that was used in each scenario.
Table 7

The dataset that was used in count fast, count normal, and all count scenarios

Scenario | Class | Number of all samples | Number of training samples | Number of testing samples | Label
Count fast | Healthy | 42 | 29 | 13 | 1
Count fast | COVID-19 | 42 | 29 | 13 | 2
Count normal | Healthy | 45 | 31 | 14 | 1
Count normal | COVID-19 | 45 | 31 | 14 | 2
All count | Healthy | 87 | 61 | 26 | 1
All count | COVID-19 | 87 | 61 | 26 | 2

The best experimental results of the proposed PSO-ELM in the count fast, count normal, and all count scenarios were obtained with overall accuracies of 96.15%, 96.43%, and 88.46%, respectively. Figure 6 shows the ROC of the best results for the count fast, count normal, and all count scenarios.
Fig. 6

The ROC of PSO-ELM best results for the count fast, count normal, and all count scenarios


Vowel Scenario

This section provides and discusses the performance of the PSO-ELM in a vowel scenario where the vowel scenario contains four different scenarios: vowel a, vowel e, vowel o, and all vowels. In the vowel a scenario, we used the voice samples that contain only vowel a sounds. The vowel e scenario was performed by using the voice samples that contain only vowel e sounds. In the vowel o scenario, we used the voice samples that contain only vowel o sounds. In the scenario of all vowels, we used the voice samples that contain vowel a, vowel e, and vowel o sounds. Table 8 describes the dataset that has been utilized in each scenario.
Table 8

The dataset that was utilized in vowel a, vowel e, vowel o, and all vowel scenarios

Scenario | Class | Number of all samples | Number of training samples | Number of testing samples | Label
Vowel “a” | Healthy | 43 | 30 | 13 | 1
Vowel “a” | COVID-19 | 43 | 30 | 13 | 2
Vowel “e” | Healthy | 43 | 30 | 13 | 1
Vowel “e” | COVID-19 | 43 | 30 | 13 | 2
Vowel “o” | Healthy | 41 | 29 | 12 | 1
Vowel “o” | COVID-19 | 41 | 29 | 12 | 2
All vowels | Healthy | 127 | 89 | 38 | 1
All vowels | COVID-19 | 127 | 89 | 38 | 2

The best experimental results of the proposed PSO-ELM in the vowel a, vowel e, vowel o, and all vowel scenarios were obtained with overall accuracies of 96.15%, 96.15%, 95.83%, and 82.89%, respectively. Figure 7 shows the ROC of the best results for the vowel a, vowel e, vowel o, and all vowel scenarios. In addition, the best overall results of the evaluation measurements for all scenarios are shown in Table 9.
Fig. 7

The ROC of PSO-ELM best results for vowel a, vowel e, vowel o, and all vowel scenarios

Table 9

The best results of the proposed PSO-ELM in all scenarios

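Scenario | Hidden neurons | TP | TN | FP | FN | Accuracy (%) | Precision (%) | Recall (%) | Specificity (%) | F-measure (%) | G-mean (%) | Execution time (s)
Deep breath | 150 | 11 | 12 | 0 | 1 | 95.83 | 100.00 | 91.67 | 100.00 | 95.65 | 95.74 | 119.8567
Shallow breath | 300 | 10 | 12 | 0 | 2 | 91.67 | 100.00 | 83.33 | 100.00 | 90.91 | 91.29 | 119.5190
All breath | 250 | 22 | 19 | 4 | 1 | 89.13 | 84.62 | 95.65 | 82.61 | 89.80 | 89.96 | 132.7839
Heavy cough | 400 | 14 | 13 | 1 | 0 | 96.43 | 93.33 | 100.00 | 92.86 | 96.55 | 96.61 | 128.3579
Shallow cough | 550 | 14 | 12 | 2 | 0 | 92.86 | 87.50 | 100.00 | 85.71 | 93.33 | 93.54 | 127.9549
All cough | 200 | 26 | 22 | 5 | 1 | 88.89 | 83.87 | 96.30 | 81.48 | 89.66 | 89.87 | 176.8013
Count fast | 150 | 12 | 13 | 0 | 1 | 96.15 | 100.00 | 92.31 | 100.00 | 96.00 | 96.08 | 124.2523
Count normal | 350 | 13 | 14 | 0 | 1 | 96.43 | 100.00 | 92.86 | 100.00 | 96.30 | 96.36 | 125.7110
All count | 150 | 24 | 22 | 4 | 2 | 88.46 | 85.71 | 92.31 | 84.62 | 88.89 | 88.95 | 174.0120
Vowel a | 100 | 13 | 12 | 1 | 0 | 96.15 | 92.86 | 100.00 | 92.31 | 96.30 | 96.36 | 107.1898
Vowel e | 100 | 13 | 12 | 1 | 0 | 96.15 | 92.86 | 100.00 | 92.31 | 96.30 | 96.36 | 108.1326
Vowel o | 150 | 12 | 11 | 1 | 0 | 95.83 | 92.31 | 100.00 | 91.67 | 96.00 | 96.08 | 106.6515
All vowels | 200 | 28 | 35 | 3 | 10 | 82.89 | 90.32 | 73.68 | 92.11 | 81.16 | 81.58 | 198.1815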
[Figure: The ROC of PSO-ELM best results for vowel a, vowel e, vowel o, and all vowel scenarios]

In addition, numerous experiments were performed with the basic ELM, NN, and RF in thirteen different scenarios: deep breath, shallow breath, all breaths, heavy cough, shallow cough, all coughs, count fast, count normal, all count, vowel a, vowel e, vowel o, and all vowels. The ELM and NN approaches were implemented with a single hidden layer, and the number of hidden nodes was varied from 100 to 600 in increments of 50. Because the number of pages is limited, we report only the highest performance of the ELM, NN, and RF in each scenario in terms of accuracy, recall, precision, specificity, G-mean, F-measure, TP, TN, FP, FN, and execution time. Table 10 provides these results for all thirteen scenarios.

The best accuracies of the basic ELM were 66.67% (deep breath), 70.83% (shallow breath), 65.22% (all breaths), 75.00% (heavy cough), 64.29% (shallow cough), 64.81% (all coughs), 69.23% (count fast), 67.86% (count normal), 63.46% (all count), 65.38% (vowel a), 61.54% (vowel e), 62.50% (vowel o), and 61.84% (all vowels).

The best accuracies of the NN were 62.50% (deep breath), 66.67% (shallow breath), 58.70% (all breaths), 60.71% (heavy cough), 60.71% (shallow cough), 59.26% (all coughs), 57.69% (count fast), 71.43% (count normal), 63.46% (all count), 57.69% (vowel a), 65.38% (vowel e), 54.17% (vowel o), and 56.58% (all vowels).

The best accuracies of the RF were 62.50% (deep breath), 66.67% (shallow breath), 60.87% (all breaths), 67.86% (heavy cough), 57.14% (shallow cough), 61.11% (all coughs), 65.38% (count fast), 64.29% (count normal), 59.62% (all count), 53.85% (vowel a), 57.69% (vowel e), 58.33% (vowel o), and 60.53% (all vowels).
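To make the baseline setup concrete, the following is a minimal sketch of a basic single-hidden-layer ELM evaluated over hidden-layer sizes from 100 to 600 in increments of 50. It is an illustration under assumptions, not the implementation used in the study: the sigmoid activation, the uniform initialization range, the pseudo-inverse solution for the output weights, the function names, and the synthetic 13-feature data are all chosen for the example.

```python
import numpy as np

def train_elm(X, y, n_hidden, rng):
    """Fit a basic ELM: random hidden layer, least-squares output weights."""
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # input weights (assumed range)
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # hidden biases (assumed range)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))                   # sigmoid activations (assumed)
    beta = np.linalg.pinv(H) @ y                             # output weights via pseudo-inverse
    return W, b, beta

def predict_elm(model, X):
    """Return 0/1 class predictions by thresholding the ELM output at 0.5."""
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return (H @ beta >= 0.5).astype(int)

def sweep_hidden_nodes(X_train, y_train, X_test, y_test, seed=0):
    """Train one ELM per hidden-layer size and record its test accuracy."""
    rng = np.random.default_rng(seed)
    results = {}
    for n_hidden in range(100, 601, 50):  # 100, 150, ..., 600
        model = train_elm(X_train, y_train, n_hidden, rng)
        results[n_hidden] = float(np.mean(predict_elm(model, X_test) == y_test))
    return results

# Synthetic stand-in data (hypothetical); real inputs would be MFCC feature vectors.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(120, 13))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

results = sweep_hidden_nodes(X[:96], y[:96], X[96:], y[96:])
best_nodes = max(results, key=results.get)
print(best_nodes, results[best_nodes])
```

Reporting only the best entry of `results` mirrors the choice to tabulate the highest performance per scenario.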
Table 10

The best experimental results of the basic ELM, NN, and RF techniques in all scenarios

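| Scenario | Technique | TP | TN | FP | FN | Accuracy (%) | Precision (%) | Recall (%) | Specificity (%) | F-measure (%) | G-mean (%) | Execution time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Deep breath | ELM | 6 | 10 | 2 | 6 | 66.67 | 75.00 | 50.00 | 83.33 | 60.00 | 61.24 | 0.5236 |
| Deep breath | NN | 8 | 7 | 5 | 4 | 62.50 | 61.54 | 66.67 | 58.33 | 64.00 | 64.05 | 21.5222 |
| Deep breath | RF | 6 | 9 | 3 | 6 | 62.50 | 66.67 | 50.00 | 75.00 | 57.14 | 57.74 | 3.3139 |
| Shallow breath | ELM | 7 | 10 | 2 | 5 | 70.83 | 77.78 | 58.33 | 83.33 | 66.67 | 67.36 | 0.5218 |
| Shallow breath | NN | 10 | 6 | 6 | 2 | 66.67 | 62.50 | 83.33 | 50.00 | 71.43 | 72.17 | 24.5918 |
| Shallow breath | RF | 9 | 7 | 5 | 3 | 66.67 | 64.29 | 75.00 | 58.33 | 69.23 | 69.44 | 3.3163 |
| All breaths | ELM | 18 | 12 | 11 | 5 | 65.22 | 62.07 | 78.26 | 52.17 | 69.23 | 69.70 | 0.9417 |
| All breaths | NN | 16 | 11 | 12 | 7 | 58.70 | 57.14 | 69.57 | 47.83 | 62.75 | 63.05 | 33.4789 |
| All breaths | RF | 17 | 11 | 12 | 6 | 60.87 | 58.62 | 73.91 | 47.83 | 65.38 | 65.82 | 3.8154 |
| Heavy cough | ELM | 11 | 10 | 4 | 3 | 75.00 | 73.33 | 78.57 | 71.43 | 75.86 | 75.91 | 0.6609 |
| Heavy cough | NN | 11 | 6 | 8 | 3 | 60.71 | 57.89 | 78.57 | 42.86 | 66.67 | 67.45 | 22.4395 |
| Heavy cough | RF | 10 | 9 | 5 | 4 | 67.86 | 66.67 | 71.43 | 64.29 | 68.97 | 69.01 | 3.6511 |
| Shallow cough | ELM | 8 | 10 | 4 | 6 | 64.29 | 66.67 | 57.14 | 71.43 | 61.54 | 61.72 | 0.6421 |
| Shallow cough | NN | 6 | 11 | 3 | 8 | 60.71 | 66.67 | 42.86 | 78.57 | 52.17 | 53.45 | 22.6169 |
| Shallow cough | RF | 7 | 9 | 5 | 7 | 57.14 | 58.33 | 50.00 | 64.29 | 53.85 | 54.01 | 3.7370 |
| All coughs | ELM | 19 | 16 | 11 | 8 | 64.81 | 63.33 | 70.37 | 59.26 | 66.67 | 66.76 | 1.0641 |
| All coughs | NN | 18 | 14 | 13 | 9 | 59.26 | 58.06 | 66.67 | 51.85 | 62.07 | 62.22 | 34.6208 |
| All coughs | RF | 19 | 14 | 13 | 8 | 61.11 | 59.38 | 70.37 | 51.85 | 64.41 | 64.64 | 4.1569 |
| Count fast | ELM | 8 | 10 | 3 | 5 | 69.23 | 72.73 | 61.54 | 76.92 | 66.67 | 66.90 | 0.5920 |
| Count fast | NN | 4 | 11 | 2 | 9 | 57.69 | 66.67 | 30.77 | 84.62 | 42.11 | 45.29 | 28.5440 |
| Count fast | RF | 9 | 8 | 5 | 4 | 65.38 | 64.29 | 69.23 | 61.54 | 66.67 | 66.71 | 4.3687 |
| Count normal | ELM | 11 | 8 | 6 | 3 | 67.86 | 64.71 | 78.57 | 57.14 | 70.97 | 71.30 | 0.6178 |
| Count normal | NN | 14 | 6 | 8 | 0 | 71.43 | 63.64 | 100.00 | 42.86 | 77.78 | 79.77 | 25.1147 |
| Count normal | RF | 10 | 8 | 6 | 4 | 64.29 | 62.50 | 71.43 | 57.14 | 66.67 | 66.82 | 4.4511 |
| All count | ELM | 16 | 17 | 9 | 10 | 63.46 | 64.00 | 61.54 | 65.38 | 62.75 | 62.76 | 1.0476 |
| All count | NN | 16 | 17 | 9 | 10 | 63.46 | 64.00 | 61.54 | 65.38 | 62.75 | 62.76 | 37.9394 |
| All count | RF | 12 | 19 | 7 | 14 | 59.62 | 63.16 | 46.15 | 73.08 | 53.33 | 53.99 | 4.8791 |
| Vowel a | ELM | 6 | 11 | 2 | 7 | 65.38 | 75.00 | 46.15 | 84.62 | 57.14 | 58.83 | 0.4631 |
| Vowel a | NN | 7 | 8 | 5 | 6 | 57.69 | 58.33 | 53.85 | 61.54 | 56.00 | 56.04 | 26.5335 |
| Vowel a | RF | 8 | 6 | 7 | 5 | 53.85 | 53.33 | 61.54 | 46.15 | 57.14 | 57.29 | 3.3466 |
| Vowel e | ELM | 7 | 9 | 4 | 6 | 61.54 | 63.64 | 53.85 | 69.23 | 58.33 | 58.54 | 0.4873 |
| Vowel e | NN | 9 | 8 | 5 | 4 | 65.38 | 64.29 | 69.23 | 61.54 | 66.67 | 66.71 | 30.6789 |
| Vowel e | RF | 5 | 10 | 3 | 8 | 57.69 | 62.50 | 38.46 | 76.92 | 47.62 | 49.03 | 3.1672 |
| Vowel o | ELM | 10 | 5 | 7 | 2 | 62.50 | 58.82 | 83.33 | 41.67 | 68.97 | 70.01 | 0.4789 |
| Vowel o | NN | 9 | 4 | 8 | 3 | 54.17 | 52.94 | 75.00 | 33.33 | 62.07 | 63.01 | 22.5454 |
| Vowel o | RF | 7 | 7 | 5 | 5 | 58.33 | 58.33 | 58.33 | 58.33 | 58.33 | 58.33 | 3.1421 |
| All vowels | ELM | 25 | 22 | 16 | 13 | 61.84 | 60.98 | 65.79 | 57.89 | 63.29 | 63.34 | 1.5370 |
| All vowels | NN | 22 | 21 | 17 | 16 | 56.58 | 56.41 | 57.89 | 55.26 | 57.14 | 57.15 | 53.8420 |
| All vowels | RF | 29 | 17 | 21 | 9 | 60.53 | 58.00 | 76.32 | 44.74 | 65.91 | 66.53 | 4.5728 |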
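The metric columns in Table 10 follow from the four confusion counts (TP, TN, FP, FN). The sketch below recomputes them for one row as a consistency check. The function name is hypothetical, and the G-mean formula used here (the geometric mean of precision and recall) is an assumption inferred from the tabulated values rather than a definition stated in this section.

```python
import math

def confusion_metrics(tp, tn, fp, fn):
    """Return percentage metrics (rounded to 2 decimals) from confusion counts."""
    def ratio(num, den):
        # Guard against empty denominators, e.g. no positive predictions.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    metrics = {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f_measure": ratio(2 * precision * recall, precision + recall),
        # Assumed definition: consistent with the G-mean values in Table 10.
        "g_mean": math.sqrt(precision * recall),
    }
    return {name: round(100 * value, 2) for name, value in metrics.items()}

# Basic ELM, deep breath scenario in Table 10: TP=6, TN=10, FP=2, FN=6.
row = confusion_metrics(tp=6, tn=10, fp=2, fn=6)
print(row)
```

For this row the function reproduces the tabulated 66.67, 75.00, 50.00, 83.33, 60.00, and 61.24.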
Furthermore, additional experiments were conducted with the LSTM and XGBoost approaches in thirteen different scenarios: deep breath, shallow breath, all breaths, heavy cough, shallow cough, all coughs, count fast, count normal, all count, vowel a, vowel e, vowel o, and all vowels. Table 11 presents the experimental results of the LSTM and XGBoost approaches in these scenarios.

The highest accuracies of the LSTM were 79.17% (deep breath), 70.83% (shallow breath), 60.87% (all breaths), 75.00% (heavy cough), 57.14% (shallow cough), 64.81% (all coughs), 65.38% (count fast), 64.29% (count normal), 65.38% (all count), 61.54% (vowel a), 65.38% (vowel e), 62.50% (vowel o), and 60.53% (all vowels).

The highest accuracies of XGBoost were 70.83% (deep breath), 70.83% (shallow breath), 63.04% (all breaths), 64.29% (heavy cough), 75.00% (shallow cough), 62.96% (all coughs), 65.38% (count fast), 64.29% (count normal), 59.62% (all count), 57.69% (vowel a), 61.54% (vowel e), 66.67% (vowel o), and 60.53% (all vowels).
Table 11

The best experimental results of the LSTM and XGBoost approaches in all scenarios

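| Scenario | Technique | TP | TN | FP | FN | Accuracy (%) | Precision (%) | Recall (%) | Specificity (%) | F-measure (%) | G-mean (%) | Execution time (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Deep breath | LSTM | 8 | 11 | 1 | 4 | 79.17 | 88.89 | 66.67 | 91.67 | 76.19 | 76.98 | 116.448 |
| Deep breath | XGBoost | 10 | 7 | 5 | 2 | 70.83 | 66.67 | 83.33 | 58.33 | 74.07 | 74.54 | 63.124 |
| Shallow breath | LSTM | 8 | 9 | 3 | 4 | 70.83 | 72.73 | 66.67 | 75.00 | 69.57 | 69.63 | 113.005 |
| Shallow breath | XGBoost | 8 | 9 | 3 | 4 | 70.83 | 72.73 | 66.67 | 75.00 | 69.57 | 69.63 | 65.930 |
| All breaths | LSTM | 17 | 11 | 12 | 6 | 60.87 | 58.62 | 73.91 | 47.83 | 65.38 | 65.82 | 128.850 |
| All breaths | XGBoost | 14 | 15 | 8 | 9 | 63.04 | 63.63 | 60.87 | 65.22 | 62.22 | 62.24 | 85.102 |
| Heavy cough | LSTM | 10 | 11 | 3 | 4 | 75.00 | 76.92 | 71.43 | 78.57 | 74.07 | 74.12 | 112.323 |
| Heavy cough | XGBoost | 10 | 8 | 6 | 4 | 64.29 | 62.50 | 71.43 | 57.14 | 66.67 | 66.82 | 55.048 |
| Shallow cough | LSTM | 7 | 9 | 5 | 7 | 57.14 | 58.33 | 50.00 | 64.29 | 53.85 | 54.01 | 114.679 |
| Shallow cough | XGBoost | 13 | 8 | 6 | 1 | 75.00 | 68.42 | 92.86 | 57.14 | 78.79 | 79.71 | 58.063 |
| All coughs | LSTM | 17 | 18 | 9 | 10 | 64.81 | 65.38 | 62.96 | 66.67 | 64.15 | 64.16 | 127.551 |
| All coughs | XGBoost | 16 | 18 | 9 | 11 | 62.96 | 64.00 | 59.26 | 66.67 | 61.54 | 61.58 | 82.043 |
| Count fast | LSTM | 8 | 9 | 4 | 5 | 65.38 | 66.67 | 61.54 | 69.23 | 64.00 | 64.05 | 117.751 |
| Count fast | XGBoost | 9 | 8 | 5 | 4 | 65.38 | 64.29 | 69.23 | 61.54 | 66.67 | 66.71 | 60.707 |
| Count normal | LSTM | 10 | 8 | 6 | 4 | 64.29 | 62.50 | 71.43 | 57.14 | 66.67 | 66.82 | 115.631 |
| Count normal | XGBoost | 8 | 10 | 4 | 6 | 64.29 | 66.67 | 57.14 | 71.43 | 61.54 | 61.72 | 58.609 |
| All count | LSTM | 11 | 23 | 3 | 15 | 65.38 | 78.57 | 42.31 | 88.46 | 55.00 | 57.66 | 126.962 |
| All count | XGBoost | 15 | 16 | 10 | 11 | 59.62 | 60.00 | 57.69 | 61.54 | 58.82 | 58.83 | 90.167 |
| Vowel a | LSTM | 6 | 10 | 3 | 7 | 61.54 | 66.67 | 46.15 | 76.92 | 54.55 | 55.47 | 101.736 |
| Vowel a | XGBoost | 8 | 7 | 6 | 5 | 57.69 | 57.14 | 61.54 | 53.85 | 59.26 | 59.30 | 61.645 |
| Vowel e | LSTM | 6 | 11 | 2 | 7 | 65.38 | 75.00 | 46.15 | 84.62 | 57.14 | 58.83 | 103.035 |
| Vowel e | XGBoost | 6 | 10 | 3 | 7 | 61.54 | 66.67 | 46.15 | 76.92 | 54.55 | 55.47 | 60.859 |
| Vowel o | LSTM | 7 | 8 | 4 | 5 | 62.50 | 63.64 | 58.33 | 66.67 | 60.87 | 60.93 | 98.829 |
| Vowel o | XGBoost | 7 | 9 | 3 | 5 | 66.67 | 70.00 | 58.33 | 75.00 | 63.64 | 63.90 | 59.786 |
| All vowels | LSTM | 26 | 20 | 18 | 12 | 60.53 | 59.09 | 68.42 | 52.63 | 63.41 | 63.59 | 133.452 |
| All vowels | XGBoost | 22 | 24 | 14 | 16 | 60.53 | 61.11 | 57.89 | 63.16 | 59.46 | 59.48 | 120.183 |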
Moreover, the proposed PSO-ELM is compared with some recent works in terms of accuracy in eleven scenarios: deep breath, shallow breath, all breaths, heavy cough, shallow cough, all coughs, count fast, count normal, vowel a, vowel e, and vowel o. Table 12 compares the accuracy of the proposed PSO-ELM with that of the previous works.
Table 12

The comparison of accuracy between the proposed PSO-ELM and other previous works

| Scenario | Technique | Accuracy |
| --- | --- | --- |
| All breaths | PSO-ELM | 89.13% |
| All breaths | SVM [46] | 81.50% |
| All breaths | RF [47] | 75.17% |
| All breaths | CNN [48] | 70.37% |
| All breaths | RF [50] | 86.79% |
| All coughs | PSO-ELM | 88.89% |
| All coughs | SVM [46] | 85.70% |
| All coughs | RF [47] | 70.69% |
| All coughs | CNN [49] | 88.48% |
| All coughs | Ensemble model [51] | 77.10% |
| Deep breath | PSO-ELM | 95.83% |
| Deep breath | SVM [46] | 62.30% |
| Deep breath | BI-ATGRU [52] | 94.50% |
| Shallow breath | PSO-ELM | 91.67% |
| Shallow breath | SVM [46] | 62.20% |
| Heavy cough | PSO-ELM | 96.43% |
| Heavy cough | SVM [46] | 72.30% |
| Shallow cough | PSO-ELM | 92.86% |
| Shallow cough | SVM [46] | 74.10% |
| Count fast | PSO-ELM | 96.15% |
| Count fast | SVM [46] | 73.50% |
| Count normal | PSO-ELM | 96.43% |
| Count normal | SVM [46] | 72.50% |
| Vowel a | PSO-ELM | 96.15% |
| Vowel a | SVM [46] | 59.30% |
| Vowel e | PSO-ELM | 96.15% |
| Vowel e | SVM [46] | 68.20% |
| Vowel o | PSO-ELM | 95.83% |
| Vowel o | SVM [46] | 69.20% |

Discussion

Several observations follow from the experimental results in Table 9. First, the PSO can find suitable input weights and biases for the single hidden layer of the ELM, which reduces classification errors. Avoiding unsuitable weights and biases keeps the ELM from becoming stuck in local optima. Consequently, the PSO-ELM performed well in all thirteen scenarios, with accuracies of 95.83% (deep breath), 91.67% (shallow breath), 89.13% (all breaths), 96.43% (heavy cough), 92.86% (shallow cough), 88.89% (all coughs), 96.15% (count fast), 96.43% (count normal), 88.46% (all count), 96.15% (vowel a), 96.15% (vowel e), 95.83% (vowel o), and 82.89% (all vowels).

Further observations can be made from the results in Table 9. (a) Using a single sound of the respiratory system (e.g., deep breath, shallow breath, heavy cough, shallow cough, count fast, count normal, vowel a, vowel e, or vowel o) provides better results than combining two or more sounds. (b) For both breath and cough, the deep and heavy variants give higher accuracy than the shallow variants. (c) The count scenarios at either speed (fast or normal) and the heavy cough scenario deliver almost the same accuracy (96.15%, 96.43%, and 96.43%, respectively).

Moreover, the results in Tables 9, 10, and 11 show that the PSO-ELM outperformed the basic ELM, NN, RF, LSTM, and XGBoost in all thirteen scenarios. This demonstrates that generating suitable weights and biases for the ELM reduces classification errors. However, these baselines outperform the proposed PSO-ELM in terms of execution time, because the PSO requires additional time to search for the best input weights and biases.
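The idea that the PSO searches for input weights and hidden biases that reduce the ELM's classification error can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the inertia and acceleration constants, swarm size, iteration count, search bounds, sigmoid activation, use of training error as the fitness, function names, and synthetic data are all chosen for the example.

```python
import numpy as np

def elm_error(particle, X, y, n_hidden):
    """Decode a particle into (W, b), solve the output weights, return the error rate."""
    n_features = X.shape[1]
    W = particle[: n_features * n_hidden].reshape(n_features, n_hidden)
    b = particle[n_features * n_hidden:]
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # sigmoid hidden layer (assumed)
    beta = np.linalg.pinv(H) @ y              # ELM output weights via pseudo-inverse
    pred = (H @ beta >= 0.5).astype(int)
    return float(np.mean(pred != y))

def pso_elm(X, y, n_hidden=10, n_particles=20, n_iter=30, seed=0,
            w=0.7, c1=1.5, c2=1.5):
    """Search ELM input weights and biases with a basic global-best PSO."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] * n_hidden + n_hidden    # all input weights plus hidden biases
    pos = rng.uniform(-1.0, 1.0, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_err = np.array([elm_error(p, X, y, n_hidden) for p in pos])
    g = int(np.argmin(pbest_err))
    gbest, gbest_err = pbest[g].copy(), float(pbest_err[g])
    history = [gbest_err]                     # best error after each iteration
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Standard velocity update: inertia + cognitive pull + social pull.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, -1.0, 1.0)   # keep particles inside the assumed bounds
        err = np.array([elm_error(p, X, y, n_hidden) for p in pos])
        improved = err < pbest_err
        pbest[improved] = pos[improved]
        pbest_err[improved] = err[improved]
        g = int(np.argmin(pbest_err))
        if pbest_err[g] < gbest_err:
            gbest, gbest_err = pbest[g].copy(), float(pbest_err[g])
        history.append(gbest_err)
    return gbest, history

# Synthetic stand-in data (hypothetical); real inputs would be MFCC feature vectors.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(80, 13))
y = (np.sin(X[:, 0]) + X[:, 1] * X[:, 2] > 0).astype(int)

best, history = pso_elm(X, y)
print(f"best training error: {history[0]:.3f} -> {history[-1]:.3f}")
```

Each fitness evaluation solves a complete ELM, so the cost grows with `n_particles * n_iter`. This is consistent with the longer execution times reported for the PSO-ELM relative to the basic ELM.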
Lastly, the results in Table 12 show that the PSO-ELM outperformed the other previous works in the deep breath, shallow breath, all breaths, heavy cough, shallow cough, all coughs, count fast, count normal, vowel a, vowel e, and vowel o scenarios. This finding suggests that the proposed PSO-ELM is a reliable technique for detecting COVID-19 from different voice data of the respiratory system.

Although the proposed method has shown good performance, it has some limitations: (a) the respiratory system voice datasets used in this study for training and testing are small; (b) this study classifies the voice data of the respiratory system into two classes only (healthy/COVID-19), and other lung diseases were not considered.

Conclusion

In this study, we proposed a COVID-19 detection system based on conventional MFCC features and a PSO-ELM classifier. The PSO-ELM was evaluated on the CHRSD dataset in thirteen different scenarios: deep breath, shallow breath, all breaths, heavy cough, shallow cough, all coughs, count fast, count normal, all count, vowel a, vowel e, vowel o, and all vowels. The results showed the advantage of the PSO-ELM over some previous works (see Table 12) and over the basic ELM, NN, RF, LSTM, and XGBoost (see Tables 9, 10, and 11). The PSO-ELM achieved accuracies of 95.83% (deep breath), 91.67% (shallow breath), 89.13% (all breaths), 96.43% (heavy cough), 92.86% (shallow cough), 88.89% (all coughs), 96.15% (count fast), 96.43% (count normal), 88.46% (all count), 96.15% (vowel a), 96.15% (vowel e), 95.83% (vowel o), and 82.89% (all vowels). However, the current study was evaluated on only a small dataset. Therefore, future work will apply the proposed PSO-ELM to a larger dataset. Additionally, other optimization approaches for the ELM will be explored to generate more suitable weights and biases and thereby further reduce classification errors.