
Transmission trend of the COVID-19 pandemic predicted by dendritic neural regression.

Minhui Dong¹, Cheng Tang², Junkai Ji¹, Qiuzhen Lin¹, Ka-Chun Wong³.

Abstract

In 2020, a novel coronavirus disease became a global problem. The disease was called COVID-19, as the first patient was diagnosed in December 2019. The disease spread around the world quickly due to its high transmissibility. To date, the spread of COVID-19 has been relatively mild in China due to timely control measures. However, in other countries, the pandemic remains severe, and COVID-19 protection and control policies are urgently needed, which has motivated this research. Since the outbreak of the pandemic, many researchers have hoped to identify the mechanism of COVID-19 transmission and predict its spread by using machine learning (ML) methods to supply meaningful reference information to decision-makers in various countries. Since the historical COVID-19 data are time series, most researchers have adopted recurrent neural networks (RNNs), which can capture temporal information, for this problem. However, even with a state-of-the-art RNN, it is still difficult to fully capture the temporal information and nonlinear characteristics of the historical COVID-19 data. Therefore, in this study, we develop a novel dendritic neural regression (DNR) method to improve prediction performance. In the DNR, the multiplication operator is used to capture the nonlinear relationships between input feature signals in the dendrite layer. Considering the complex and large landscape of DNR's weight space, a new scale-free state-of-matter search (SFSMS) algorithm is proposed to optimize the DNR, which combines the state-of-matter search algorithm with a scale-free local search. The SFSMS achieves a better global search ability and thus can effectively reduce the possibility of falling into local minima. In addition, according to Takens's theorem, phase space reconstruction techniques are used to discover the information hidden in the high-dimensional space of COVID-19 data, which further improves the precision of prediction. 
The experimental results suggest that the proposed method is more competitive in solving this problem than other prevailing methods.


Keywords: COVID-19; Neural network; Optimization; Prediction; Regression

Year: 2021 PMID: 34248448 PMCID: PMC8262446 DOI: 10.1016/j.asoc.2021.107683

Source DB: PubMed Journal: Appl Soft Comput ISSN: 1568-4946 Impact factor: 6.725

Introduction

The novel coronavirus disease is a global problem. Because the first patient was diagnosed in December 2019, the disease has been called COVID-19. Due to the irregularity of viral transmission during the initial stage of the COVID-19 pandemic, both information on this novel coronavirus and medical facilities were insufficient; therefore, the pandemic was initially not well controlled [1]. On January 20, 2020, it was officially announced that COVID-19 could be passed from person to person [2]. Soon after, COVID-19 spread around the world. The new disease was defined as a Public Health Emergency of International Concern by the World Health Organization (WHO) on January 30, 2020 [3], [4]. COVID-19 itself is not a large threat, but COVID-19 patients often have other complex and severe symptoms that can lead to death [5]. As of November 1, 2020, there were approximately 46 million diagnosed patients and 1.2 million deaths worldwide [6]. From the COVID-19 Data Hub [7], a COVID-19 open research dataset, we learned that the COVID-19 pandemic has been controlled in some areas, such as China, Burkina Faso, and Comoros, but is still severe in many other countries. In addition, even in areas where the outbreak is contained, there is still a possibility of renewed outbreaks due to factors such as colder temperatures. As a result, countries still have a substantial need for COVID-19 prevention and control strategies. Good decisions can help reduce the spread of a pandemic, improve survival rates and reduce mortality. It is imperative to study the transmission trend of the COVID-19 pandemic and identify the mechanism of transmission. For the above reasons, this paper aims to study the historical transmission trend of the COVID-19 pandemic in various countries and determine the transmission mechanism of COVID-19 to provide important reference information for decision-makers in every country. 
When pandemics and major public health emergencies occur, pandemic models are usually built to analyse and predict the development trend of the disease. There are two main methods for building pandemic models. The first one is the mathematical method. Mathematical models have been applied to epidemiology in practice in the past [8]. For example, Sharomi utilized a regression model to forecast the spread trend of tuberculosis [9], and in [10], [11], a transmission model was built for malaria. A model that takes susceptible, infectious, and recovered individuals as factors (SIRS) was improved and applied to study the respiratory syncytial virus in infants in [12], and in [13], mathematical models were adopted to control the dengue outbreak. To more effectively detect the transmission pattern of the new disease and obtain more information to help develop prevention strategies after the WHO declared the emergency, various mathematical methods have been used to varying degrees to model COVID-19 transmission. In [14], [15], [16], [17], researchers tried to use statistical or mathematical models to analyse and predict COVID-19 transmission trends. However, the fitting results of these methods are relatively poor, and the accuracy is low. The main reason is that statistical or mathematical modelling methods are mostly linear, whereas the transmission trend of the COVID-19 pandemic is a nonlinear regression problem with temporal factors. Some researchers have taken these features into account and proposed improved mathematical models that can be applied to nonlinear problems [17], [18], [19], [20]. However, most of these models need to impose a certain environmental assumption on the data, which may cause the loss of some important information. These methods also fail to consider temporal characteristics and are thus unable to achieve satisfactory results. 
To overcome the shortcomings of mathematical methods, researchers began to adopt machine learning (ML) technology to forecast the transmission trend of the COVID-19 pandemic. According to previous studies, the powerful ability of a recurrent neural network (RNN) [21] to describe the temporal and nonlinear characteristics of data can be effectively used to predict the spread trend of COVID-19 and achieve high-level accuracy. Therefore, state-of-the-art RNNs, including long short-term memory (LSTM) [22], bidirectional long short-term memory (BiLSTM) and gated recurrent units (GRUs), have been widely employed to predict the transmission trend. In addition, a hybrid method that combined LSTM and natural language processing (NLP) was proposed to predict the COVID-19 transmission trend in China [23]. In [24], LSTM was utilized by Chimmula to estimate the endpoint of the Canadian COVID-19 pandemic. An improved convolution LSTM model was adopted by Shastri [25] to perform spread trend prediction for India and the United States and achieved excellent results. Other researchers have used RNNs to predict the transmission trend of the COVID-19 pandemic on other data sets [26], [27], [28], [29]. Although RNNs can effectively extract temporal and nonlinear features from data, they still have some disadvantages; for instance, the temporal information cannot be fully extracted from the data by only applying a simple RNN model. RNNs also suffer from overfitting issues and unstable performance caused by different random weight initializations [30]. To further improve the precision of prediction, a modified dendritic neuron model (DNM) is developed and applied to forecast the transmission trend of the COVID-19 pandemic in this study. The computation of DNM mimics the process of biological neurons transferring information. Due to the nonlinear characteristics of synapses and the plasticity of dendrites, the DNM has a strong ability to fit complex nonlinear functions. 
Previous studies have used the DNM to solve multifarious linearly nonseparable problems and achieved satisfactory results [31], [32], [33]. However, the original DNM was specifically designed for classification problems. The architecture is simplified to seek high classification speed by neglecting the thickness of dendritic branches, which will influence the signal strengths [34]. This study proposes dendritic neural regression (DNR), which can enhance model regression ability by considering the thickness of dendritic branches. Moreover, since DNR’s weight has a complex and large search space, a novel scale-free state-of-matter search (SFSMS) algorithm that combines the state-of-matter search (SMS) algorithm [35] with the scale-free local search method is utilized to optimize the neural architecture of DNR. The SMS algorithm is a recently proposed evolutionary algorithm that has powerful search capabilities and can effectively avoid local optima. The scale-free method searches in a paradigm of complex networks, which are similar to a variety of real-world networks. In SFSMS, the scale-free local search is a new component that improves the quality of the solutions. Additionally, Takens’s phase space reconstruction (PSR) theorem [36] is employed as a module to preprocess the raw data in this study. The original one-dimensional time series hides some information in the high-dimensional space, which leads to intermittent behaviour and randomness. This information can be extracted through PSR so that we can greatly improve the prediction performance. Two components are imperative when we employ PSR techniques to describe high-dimensional information: the time delay and the embedding dimension. The details of how to obtain these two components are described in Section 3. Experiments for predicting COVID-19 transmission trends in several countries show that DNR-SFSMS is more accurate and robust than other commonly used ML algorithms. 
The remainder of this article is organized as follows. Section 2 introduces the structure of DNR and how to use the SFSMS algorithm to optimize DNR. The whole process of data preprocessing is introduced in Section 3. Section 4 presents the experimental results and discussion. Finally, a conclusion is provided in Section 5.

Model structure and optimization algorithm

In this study, we employed DNR to forecast the transmission trend of the COVID-19 pandemic in several countries. The left part of Fig. 1 shows the structure of DNR. DNR consists of four layers: the synaptic layer, the dendrite layer, the membrane layer and the soma layer. The synaptic layer is the entry point of the model and is used to receive input signals. The signal received by the synaptic layer is processed by an activation function and then flows to all branches of the dendrite layer. Each branch of the dendrite layer gathers all the signals in the branch and sends them to the membrane layer. The membrane receives signals from all branches of the dendrite layer and integrates them to transmit to the soma layer. Finally, the soma layer processes the signal through a sigmoid function and outputs it.

Fig. 1

Description of the DNR architecture.

Synaptic layer

The synaptic layer mimics the synaptic part of the nervous system; it is the portal of the neuron and receives signals from external inputs. The signals received by the synaptic layer are processed by the following equation:
$$Y_{ij} = \frac{1}{1 + e^{-k (w_{ij} x_i - q_{ij})}},$$
where $x_i$ is the $i$th input signal and $Y_{ij}$ denotes the value that the $i$th synapse transfers to the $j$th dendritic branch. $k$ is a positive constant parameter in the synaptic layer. $w_{ij}$ and $q_{ij}$ are two alterable parameters that are adjusted for different tasks. Depending on the values of $w_{ij}$ and $q_{ij}$, DNR can be further simplified. This operation mimics synapses that can become excitatory or inhibitory depending on the ions received in the nervous system [32]. The detailed state changing rule is given as follows: (1) Constant 1 connection ($q_{ij} < 0 < w_{ij}$ or $q_{ij} < w_{ij} < 0$): the value of the synaptic layer output to the dendritic layer is fixed at 1; (2) Constant 0 connection ($0 < w_{ij} < q_{ij}$ or $w_{ij} < 0 < q_{ij}$): the value of the synaptic layer output to the dendritic layer is fixed at 0; (3) Excitatory connection ($0 < q_{ij} < w_{ij}$): the output of the synaptic layer remains the original value; (4) Inhibitory connection ($w_{ij} < q_{ij} < 0$): the output of the synaptic layer is the inverse of the original value.

Dendrite layer

The dendrite layer is responsible for aggregating signals from synapses distributed on each branch. The nonlinear relationship among these signals is thought to play an important role in neural information processing for some sensory systems, such as the visual and auditory systems, in biological networks [37], [38]. The nonlinear relationship is described by multiplication operations in DNR, which can be expressed as follows:
$$Z_j = \prod_{i=1}^{I} Y_{ij},$$
where $Z_j$ denotes the output value of the $j$th dendritic branch and $I$ is the number of input signals.

Membrane layer

The duty of the membrane layer is to integrate the signals from all branches of the dendrite. The integration is implemented through a summation, which can be given as follows:
$$V = \sum_{j=1}^{M} u_j Z_j,$$
where $u_j$ denotes the strength of each dendritic branch, $M$ is the number of branches and $V$ represents the input to the soma layer. $u_j$ is the parameter that differentiates DNR from DNM. In DNM, since $u_j$ is fixed at 1, further simplification can be performed to obtain a faster calculation speed [39]. In DNR, $u_j$ is a variable parameter, and it can adapt to different tasks by changing its value to better cope with regression problems.

Soma layer

A sigmoid function is employed as the activation function in the soma layer, and the cell body fires when the potential from the membrane exceeds the threshold. The process can be defined as follows:
$$O = \frac{1}{1 + e^{-k_s (V - \theta_s)}},$$
where $O$ denotes the output of the soma layer and $k_s$ and $\theta_s$ are two positive constants.
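The four layers described above can be sketched as a single forward pass. This is a minimal illustration, not the authors' implementation: the function name `dnr_forward`, the argument names and the default constants (`k`, `k_s`, `theta_s`) are assumptions, and the layer formulas follow the usual dendritic neuron formulation (sigmoid synapses, product per branch, weighted sum, sigmoid soma).

```python
import math


def dnr_forward(x, w, q, u, k=5.0, k_s=5.0, theta_s=0.5):
    """Forward pass of one dendritic neuron with weighted branches.

    x    : list of I input signals (assumed normalized to [0, 1])
    w, q : M x I nested lists of synaptic parameters, one row per branch
    u    : list of M branch strengths (the extra weights that DNR adds)
    k, k_s, theta_s : illustrative constants, not values from the paper
    """
    v = 0.0
    for w_row, q_row, u_j in zip(w, q, u):
        z = 1.0
        for x_i, w_ij, q_ij in zip(x, w_row, q_row):
            # Synaptic layer: sigmoid of the weighted, shifted input.
            y = 1.0 / (1.0 + math.exp(-k * (w_ij * x_i - q_ij)))
            # Dendrite layer: multiply all synaptic outputs on this branch.
            z *= y
        # Membrane layer: sum of branch outputs scaled by branch strength.
        v += u_j * z
    # Soma layer: sigmoid with slope k_s and threshold theta_s.
    return 1.0 / (1.0 + math.exp(-k_s * (v - theta_s)))


# One branch whose two synapses are both "constant 1" connections (q < 0 < w).
print(dnr_forward([0.2, 0.8], [[1.0, 1.0]], [[-10.0, -10.0]], [1.0]))
```

Setting every entry of `u` to 1 recovers the unweighted membrane sum that the text attributes to the original DNM.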

Learning algorithm

Since DNR captures the nonlinear relationship among features by means of the multiplication operation in the dendrite layer, the parameter space is large and complex. In addition, in DNR, we add additional weights to describe the thickness of the dendrite branches, which further increases the difficulty of model optimization. The search abilities of traditional back propagation (BP) algorithms have some limitations in such a parameter space, including a tendency to fall into local optima and sensitivity to the initial weights. Thus, the SFSMS algorithm is utilized to optimize DNR instead. The SFSMS algorithm is composed of an SMS algorithm and a scale-free local search approach, where the scale-free local search is used as an additional strategy to improve the whole population in the process of evolution. The rest of this section introduces the whole process of the SFSMS algorithm in detail.

SMS algorithm

In nature, substances vary among gases, liquids and solids in terms of temperature. The SMS algorithm refers to this principle [35]. The process of the SMS algorithm can be regarded as a process of substantial change from gas to liquid to solid. First, in the gas state, the molecules are long distances from each other and have weak attractions, but they have a large space of motion and easily collide. When the distance between the molecules is sufficiently reduced, the substance changes from the gas state to the liquid state. In the liquid state, the attraction among the molecules increases compared with the gas state, and the space of motion of the molecules and the possibility of collision are reduced. When the distance between molecules is very small, a substance becomes solid. The attraction among the molecules in the solid state is close to the maximum, the motion space is very small, and there are almost no collisions between molecules. There are three major phases in the process of searching for the best solution, as shown in Fig. 2; the three operations include the direction vector operation, the collision operation and random behaviour. Their specific definitions are shown below.

Fig. 2

Evolution process of the SMS algorithm.

Direction vector operation: Similar to the molecules in a substance, in the SMS algorithm, the current best individual attracts other individuals in the population to raise the level of the entire population. The purpose of the direction vector operation is to move the other individuals towards the best one. Let $p_i$ be the $i$th individual vector and $p^{best}$ be the current optimal individual. According to the rules of the SMS algorithm, other individuals are biased towards $p^{best}$ when moving. Let $d_i$ be the direction vector of the $i$th individual; its formula can be expressed as follows:
$$d_i^{k+1} = d_i^{k} \cdot \left(1 - \frac{k}{gen}\right) \cdot 0.5 + \frac{p^{best} - p_i}{\lVert p^{best} - p_i \rVert},$$
where $gen$ and $k$ denote the maximum iteration number and the current number of evolutionary iterations, respectively. Then, we can calculate the velocity vector $v_i$ in terms of $d_i$. The equation is defined as follows:
$$v_i = d_i \cdot \beta \cdot \frac{\sum_{j=1}^{n} (b_j^{high} - b_j^{low})}{n},$$
where $\beta$ is a positive constant and $n$ is the dimension of the individual vectors. $b_j^{high}$ and $b_j^{low}$ represent the upper bound and the lower bound of the $j$th member of individual vectors, respectively. The last step of this operation is to update the individual vectors, which is given as follows:
$$p_{i,j}^{k+1} = p_{i,j}^{k} + v_{i,j} \cdot rand \cdot \rho \cdot (b_j^{high} - b_j^{low}),$$
where $rand$ is a random number and $\rho$ is a positive constant. Both are in the range of [0,1].

Collision operation: The collision operation aims to solve the problem of population diversity loss and premature convergence in the evolution process. In this operation, a threshold is employed to determine whether two individuals have collided. When the distance between two individuals is less than the threshold, they are considered to have collided. The direction vectors of collided individuals are exchanged. The equation of the threshold is defined as follows:
$$r = \alpha \cdot \frac{\sum_{j=1}^{n} (b_j^{high} - b_j^{low})}{n},$$
where $r$ is the threshold and $\alpha$ is a positive constant between 0 and 1. The collision operation formula is defined as follows:
$$(d_i, d_q) \leftarrow (d_q, d_i) \quad \text{if } \lVert p_i - p_q \rVert < r.$$

Random behaviour: Random behaviour is another measure to maintain the diversity of the population, and it can effectively prevent the population from falling into local optima. Compared with the other two operations, random behaviour is optional: each individual may or may not perform it. Random behaviour is expressed as follows:
$$p_{i,j}^{k+1} = \begin{cases} b_j^{low} + rand \cdot (b_j^{high} - b_j^{low}), & r_m < H, \\ p_{i,j}^{k+1}, & \text{otherwise}, \end{cases}$$
where $r_m$ is a random number similar to $rand$ and $H$ denotes the occurrence probability of random behaviour. The whole solution search process in the SMS algorithm needs to go through three major phases: the gas phase, liquid phase and solid phase. The parameters $\rho$, $\beta$, $\alpha$ and $H$ are set to different values in the different phases. In this study, these parameters are set to the defaults according to [35].
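The three SMS operations can be sketched as one iteration over a small population. This is a hedged illustration assuming the commonly cited SMS update rules from [35]; the function name `sms_step`, the argument names and the final clamping to the search bounds are assumptions for illustration, not part of the paper.

```python
import math
import random


def sms_step(pop, dirs, best, k, gen, low, high, rho, beta, alpha, H, rng):
    """One SMS iteration: direction/move, collision, random behaviour.

    pop, dirs : lists of position and direction vectors (lists of floats)
    best      : current best position; k, gen: current / maximum iteration
    low, high : per-dimension bounds; rho, beta, alpha, H: phase parameters
    """
    n = len(low)
    mean_range = sum(h - l for l, h in zip(low, high)) / n
    v_init = beta * mean_range   # velocity magnitude
    radius = alpha * mean_range  # collision threshold

    new_pop, new_dirs = [], []
    for p, d in zip(pop, dirs):
        # Direction vector operation: damp the old direction, add attraction.
        diff = [b - x for b, x in zip(best, p)]
        norm = math.sqrt(sum(c * c for c in diff))
        attract = [c / norm for c in diff] if norm > 0 else [0.0] * n
        d_new = [dj * (1 - k / gen) * 0.5 + aj for dj, aj in zip(d, attract)]
        moved = [
            p[j] + d_new[j] * v_init * rng.random() * rho * (high[j] - low[j])
            for j in range(n)
        ]
        new_dirs.append(d_new)
        new_pop.append(moved)

    # Collision operation: individuals closer than the radius swap directions.
    for i in range(len(new_pop)):
        for q in range(i + 1, len(new_pop)):
            if math.dist(new_pop[i], new_pop[q]) < radius:
                new_dirs[i], new_dirs[q] = new_dirs[q], new_dirs[i]

    # Random behaviour: each coordinate is resampled with probability H.
    for p in new_pop:
        for j in range(n):
            if rng.random() < H:
                p[j] = low[j] + rng.random() * (high[j] - low[j])
            # Clamping is an added safeguard, not stated in the text.
            p[j] = min(high[j], max(low[j], p[j]))
    return new_pop, new_dirs
```

In a full optimizer this step would be repeated with different values of `rho`, `beta`, `alpha` and `H` for the gas, liquid and solid phases, with `best` refreshed from the fitness values after every iteration.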

Scale-free local search and the BA algorithm

The scale-free local search has a complex topological structure. Fig. 3 shows a schematic diagram of a scale-free network. In such a scale-free network, there are fewer nodes with a high vertex degree than nodes with a low vertex degree, and the distribution follows a power law, which can be expressed by the following equation [40]:
$$P(k) \sim k^{-\gamma},$$
where $P(k)$ denotes the probability of a node possessing degree $k$ and $\gamma$ represents a scaling exponent that is commonly in the range of [2,3]. Notably, $P(k)$ and $k$ are negatively correlated. To intuitively observe such a law, the curve of its distribution is plotted in Fig. 4. As shown in Fig. 4, when the degree increases, the number of nodes decreases. In addition, the distribution is presented as a straight line in the coordinate axes of the logarithmic scale. Previous studies have shown that it is difficult to map complex networks in the real world with only a scale-free network [41]. To solve this problem, the Barabasi–Albert (BA) algorithm, proposed by Barabasi and Albert [40], is adopted to construct the scale-free network. The BA algorithm is inspired by the links formed between the new nodes and the original nodes in a real-world network, which can better reflect the real environment. In the BA algorithm, the newly generated nodes are connected to the old nodes with a particular preference.

Fig. 3

Schematic diagram of the scale-free network.

Fig. 4

The degree distribution curve of a scale-free network.

The detailed process of the BA algorithm is as follows: (1) Initialize the scale-free network with a fully connected network with $m_0$ nodes. Set the total number of nodes $N$. (2) Generate a novel node $j$ and calculate the probability $\Pi_i$ for all existing nodes with the equation $\Pi_i = k_i / \sum_{l} k_l$, where $k_i$ represents the degree of node $i$. (3) Generate a new link between novel node $j$ and existing node $i$ with probability $\Pi_i$. (4) Repeat steps (2)–(3) until all $N$ nodes are connected to the network. Moreover, by proving that $P(k)$ is proportional to $k^{-3}$, Barabasi and Albert identified that the scale-free network in the BA algorithm still abides by the power-law distribution. In such a distribution space, links are more likely to appear between low-degree nodes and high-degree nodes. The degree–degree correlation coefficient $r$ is also utilized to denote the extent to which low-degree nodes link to high-degree nodes. The equation for $r$ can be expressed as follows [42]:
$$r = \frac{M^{-1} \sum_{i} a_i b_i - \left[ M^{-1} \sum_{i} \tfrac{1}{2}(a_i + b_i) \right]^2}{M^{-1} \sum_{i} \tfrac{1}{2}(a_i^2 + b_i^2) - \left[ M^{-1} \sum_{i} \tfrac{1}{2}(a_i + b_i) \right]^2},$$
where $M$ denotes the number of links in the scale-free network architecture. $a_i$ and $b_i$ are the degrees of the two adjacent nodes at both ends of the $i$th link. When the value of $r$ is high, links between nodes with a high degree are more likely to be generated. In addition, we also investigate the effect of the initial node number $m_0$ on $r$. The experimental result is shown in Fig. 5. It should be emphasized that we conducted experiments for all networks. As shown in Fig. 5, the values of $r$ are lower in the scale-free networks than in the random network, and the value of $r$ increases as $m_0$ increases. When $m_0$ is 2, the value of $r$ achieves the lowest value.
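The BA construction steps and the degree–degree correlation coefficient can be sketched in plain Python. The sketch rests on stated assumptions: the names `ba_network` and `degree_correlation` are hypothetical, each new node is given `m0` links (the text does not state how many links a new node receives), and the coefficient uses the standard edge-based assortativity formula.

```python
import random


def ba_network(total_nodes, m0=2, rng=None):
    """Grow a BA network from a complete graph on m0 nodes.

    Returns a dict mapping each node to the set of its neighbours.
    Assumption: every new node attaches to m0 distinct existing nodes.
    """
    if m0 < 2 or total_nodes < m0:
        raise ValueError("need m0 >= 2 and total_nodes >= m0")
    rng = rng or random.Random()
    adj = {i: set(range(m0)) - {i} for i in range(m0)}
    # A node appears in `stubs` once per incident link, so a uniform draw
    # from `stubs` selects node i with probability k_i / sum(k).
    stubs = [i for i in adj for _ in adj[i]]
    for new in range(m0, total_nodes):
        targets = set()
        while len(targets) < m0:
            targets.add(rng.choice(stubs))
        adj[new] = set(targets)
        for t in targets:
            adj[t].add(new)
            stubs.extend((new, t))
    return adj


def degree_correlation(adj):
    """Degree-degree correlation over all links (undefined for regular graphs)."""
    edges = [(a, b) for a in adj for b in adj[a] if a < b]
    m = len(edges)
    deg = {node: len(nbrs) for node, nbrs in adj.items()}
    prod = sum(deg[a] * deg[b] for a, b in edges) / m
    half_sum = sum(0.5 * (deg[a] + deg[b]) for a, b in edges) / m
    half_sq = sum(0.5 * (deg[a] ** 2 + deg[b] ** 2) for a, b in edges) / m
    return (prod - half_sum ** 2) / (half_sq - half_sum ** 2)


net = ba_network(50, m0=2, rng=random.Random(42))
print(max(len(nbrs) for nbrs in net.values()), degree_correlation(net))
```

The stub list avoids recomputing the attachment probabilities for every new node, at the cost of storing two entries per link.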

Fig. 5

Degree–degree correlation coefficients of different networks.


Scale-free local search

In this study, we use a scale-free local search as a new component of the SMS algorithm (termed SFSMS) to further enhance the solution searching capability. First, a corresponding scale-free network is generated using the BA algorithm in terms of the size of the population. Second, we number each node in the network. According to the rules of the BA algorithm, there are more low-degree nodes and fewer high-degree nodes in the network. Third, in each iteration, we rank the individuals in terms of their fitness in the population after the SMS operations. Then, we put each individual into a network node with the same number and rank. As a result, high-degree nodes store excellent individuals, and low-degree nodes store poor individuals. Finally, we update each individual utilizing the following equation: where represents the weight vector of the th individual in the th generation after the SMS operation. denotes an arbitrary node in the scale-free network that is linked to the node of . There are two main advantages in using the scale-free network. On the one hand, high-quality individuals are always stored in the high-degree nodes, and the power-law distribution reduces the number of high-degree nodes. Thus, most individuals are close to excellent individuals after updating, which improves the level of the whole population and accelerates the convergence of the model. On the other hand, since the value of is relatively low in the BA algorithm, the probability of a link between high-degree nodes is low; namely, excellent individuals are less likely to attract each other, which ensures the diversity of the population and prevents falling into locally optimal situations. The detailed content of the SFSMS algorithm is introduced in Algorithm 1, and the flowchart for the overall methodology is demonstrated in Fig. 6.

Fig. 6

The flowchart of the DNR-SFSMS.

Study of the chaotic time series

Before applying DNR to forecast the transmission trend of the COVID-19 pandemic, the raw data should be preprocessed. According to Takens’s theorem [36], the hidden information of the time series can be revealed by the time delay and the embedding dimension. Utilizing these two components, we can reconstruct the phase space of the data, which can greatly enhance the precision of forecasting. In addition, when we use this method, it is necessary to use the maximum Lyapunov exponent ($\lambda_{\max}$) to verify that the dataset is a chaotic time series. The reconstructed dataset is usable only when $\lambda_{\max}$ exceeds zero.

Phase space reconstruction technique

In this study, the PSR technique proposed by Takens is adopted for data preprocessing. It is very difficult to predict the trend of a one-dimensional chaotic time series because of its randomness and intermittency. These characteristics arise because some high-dimensional information can hardly be described in one dimension. Even with ML methods, it is difficult to effectively describe high-dimensional information. Through PSR, we reconstruct the representations of the one-dimensional data so that their hidden laws can be revealed. Then, by fitting these representations with ML methods, we can achieve a satisfactory effect. According to Takens’s theorem, the reconstruction operation requires two key parameters, i.e., the time delay $\tau$ and embedding dimension $m$. Suppose $x = (x_1, x_2, \ldots, x_N)$ is the raw one-dimensional chaotic time series. In terms of the two parameters $\tau$ and $m$, the reconstructed data can be expressed as follows:
$$X_i = (x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau}), \qquad T_i = x_{i+(m-1)\tau+1},$$
where $X_i$ is the $i$th input vector and $T_i$ is the $i$th target for the neural model. When the two parameters $\tau$ and $m$ are set reasonably, $(X_i, T_i)$ can clearly describe the laws of the chaotic time series. The process for creating $(X_i, T_i)$ is what the PSR technique refers to. Obviously, the key to the PSR technique is to choose suitable approaches to obtain the time delay and embedding dimension. However, since the time series always carry different kinds of noise, there is no method that can effectively calculate $\tau$ and $m$ for every time series. In most cases, we have to choose the appropriate method based on experience. After some trials, the mutual information (MI) algorithm [43] and the false nearest neighbour (FNN) algorithm [44] are employed to obtain the time delay and embedding dimension, respectively.
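The delay embedding can be illustrated with a few lines of Python. This is a sketch under assumptions: the function name `phase_space_reconstruct` is hypothetical, and the target is taken as the value one step after the last embedded coordinate, which is a common one-step-ahead convention rather than something the text spells out.

```python
def phase_space_reconstruct(series, tau, m):
    """Build delay-embedded inputs and one-step-ahead targets.

    series : list of numbers (the raw one-dimensional time series)
    tau    : time delay, m : embedding dimension
    """
    span = (m - 1) * tau  # distance from first to last embedded coordinate
    inputs, targets = [], []
    for i in range(len(series) - span - 1):
        inputs.append([series[i + j * tau] for j in range(m)])
        targets.append(series[i + span + 1])  # assumed one-step-ahead target
    return inputs, targets


X, T = phase_space_reconstruct(list(range(10)), tau=2, m=3)
print(X[0], T[0])  # [0, 2, 4] 5
```

Each row of `X` would then be fed to the regression model as one feature vector, with the matching entry of `T` as its target.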

Mutual information algorithm

In this study, the time delay is obtained using the MI algorithm, which is one of the most effective algorithms for such a problem. Let $H(X)$, which is called the information entropy, be the degree of uncertainty of $X$. The function of $H(X)$ is given as follows:
$$H(X) = -\sum_{i=1}^{N_X} P(x_i) \log P(x_i),$$
where $X$ is a condition set, $x_i$ denotes the $i$th condition, $P(x_i)$ represents the probability of condition $x_i$, and $N_X$ is the length of $X$, i.e., the number of conditions. Let $H(X \mid Y)$ be the conditional information entropy, which can be expressed as follows:
$$H(X \mid Y) = -\sum_{j=1}^{N_Y} P(y_j) \sum_{i=1}^{N_X} P(x_i \mid y_j) \log P(x_i \mid y_j),$$
where $Y$ is another condition set, $N_Y$ denotes the length of $Y$, $P(y_j)$ is the probability of the $j$th condition in $Y$, and $P(x_i \mid y_j)$ is the probability of the $i$th condition in $X$ occurring under the $j$th condition in $Y$. Suppose $I(X, Y)$ is the mutual information entropy and is calculated as follows:
$$I(X, Y) = H(X) - H(X \mid Y) = H(X) + H(Y) - H(X, Y),$$
where $H(X, Y)$ is called the joint information entropy of $X$ and $Y$ and is defined as follows:
$$H(X, Y) = -\sum_{i=1}^{N_X} \sum_{j=1}^{N_Y} P(x_i, y_j) \log P(x_i, y_j).$$
According to the rule of the MI algorithm, let $(X, Y) = (x_t, x_{t+\tau})$ represent the time series. Therefore, the mutual information entropy can be given as follows:
$$I(\tau) = H(x_t) + H(x_{t+\tau}) - H(x_t, x_{t+\tau}).$$
The components of this function were introduced previously. When the value of $I(\tau)$ reaches a local minimum for the first time, the value of $\tau$ is the required time delay.
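A simple way to apply the first-minimum rule is to estimate the entropies from histograms. The sketch below is illustrative only: the function names, the equal-width binning with `bins=8`, the `max_tau` search limit and the fallback to the global minimum are all assumptions rather than the authors' settings.

```python
import math
from collections import Counter


def mutual_information(xs, ys, bins=8):
    """Histogram estimate of I(X, Y) = H(X) + H(Y) - H(X, Y)."""
    def digitize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # constant series: one bin
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = digitize(xs), digitize(ys)
    n = len(bx)

    def entropy(labels):
        return -sum(c / n * math.log(c / n) for c in Counter(labels).values())

    return entropy(bx) + entropy(by) - entropy(list(zip(bx, by)))


def first_minimum_delay(series, max_tau=20, bins=8):
    """Return the smallest tau at which I(tau) has a local minimum.

    Requires max_tau < len(series).
    """
    mi = [
        mutual_information(series[:-tau], series[tau:], bins)
        for tau in range(1, max_tau + 1)
    ]
    for idx in range(1, len(mi) - 1):
        if mi[idx] < mi[idx - 1] and mi[idx] <= mi[idx + 1]:
            return idx + 1  # mi[0] corresponds to tau = 1
    # Fallback when no interior minimum exists (an assumption).
    return mi.index(min(mi)) + 1


wave = [math.sin(2 * math.pi * t / 20) for t in range(400)]
print(first_minimum_delay(wave))
```

Histogram estimates depend on the bin count, so in practice the chosen delay should be checked for stability across a few values of `bins`.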

False nearest neighbours algorithm

The embedding dimension is an important component of PSR. If the embedding dimension is not selected properly, the hidden information in the high-dimensional space can hardly be exhibited in the one-dimensional space, and the laws in the time series cannot be extracted effectively. In this study, the FNN algorithm proposed by Kennel [44] is employed to obtain the embedding dimension for the COVID-19 pandemic data. In terms of the theory of the FNN algorithm, false nearest neighbour points are two points that are adjacent in a low-dimensional space but a large distance apart in the high-dimensional space. It is difficult to utilize these two points to demonstrate the hidden laws in the data. Thus, we gradually increase the embedding dimension so that the trajectory in the high-dimensional space is clearer, which can help in obtaining a highly accurate prediction. When the embedding dimension reaches a certain value, the trajectory in the higher dimension is fully exhibited; this is the value that we seek. Let $m$ be the number of embedding dimensions and $X_i = (x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau})$ be an $m$-dimensional vector in the phase space. Suppose $X_i^{NN}$ is the nearest neighbour point of $X_i$. Then, the distance between $X_i$ and $X_i^{NN}$ can be expressed as follows:
$$R_m(i) = \lVert X_i - X_i^{NN} \rVert.$$
It is obvious that the value of $R_m(i)$ varies with increasing $m$. When $m$ is increased by one, the updated $R_{m+1}(i)$ can be given as follows:
$$R_{m+1}^2(i) = R_m^2(i) + \lvert x_{i+m\tau} - x_{i+m\tau}^{NN} \rvert^2.$$
These two points are regarded as false nearest neighbour points when $R_m(i)$ is much less than $R_{m+1}(i)$. This function can be expressed in another form as follows:
$$S_m(i) = \frac{\lvert x_{i+m\tau} - x_{i+m\tau}^{NN} \rvert}{R_m(i)}.$$
Let $R_{tol}$ be a threshold in the range of [10,50]. The two points can be considered false nearest neighbour points when $R_{tol}$ is less than $S_m(i)$. In this study, $m$ is increased from its initial value until the proportion of false nearest neighbour points is less than 5%, which gives the value of $m$ that we require. It should be emphasized that in some cases, the proportion of false nearest neighbour points cannot fall below 5%. To handle this case, an upper bound is imposed on $m$.
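The FNN procedure can be sketched with a brute-force neighbour search. The names `fnn_fraction` and `embedding_dimension`, the starting dimension of 1, the cap `m_max=10` and the threshold `r_tol=15.0` are illustrative assumptions (the source's own initial and maximum values are not recoverable here); only the 5% stopping rule and the [10,50] threshold range come from the text.

```python
import math


def fnn_fraction(series, tau, m, r_tol=15.0):
    """Fraction of points whose nearest neighbour in dimension m is false."""
    n = len(series) - m * tau  # points that still have an (m+1)-th coordinate
    vecs = [[series[i + j * tau] for j in range(m)] for i in range(n)]
    false_count = counted = 0
    for i in range(n):
        best_j, best_d = None, math.inf
        for j in range(n):
            if j != i:
                d = math.dist(vecs[i], vecs[j])
                if d < best_d:
                    best_j, best_d = j, d
        if best_j is None or best_d == 0.0:
            continue  # skip exact duplicates (an assumption)
        counted += 1
        # Extra separation that appears when one more coordinate is added.
        extra = abs(series[i + m * tau] - series[best_j + m * tau])
        if extra / best_d > r_tol:
            false_count += 1
    return false_count / counted if counted else 0.0


def embedding_dimension(series, tau, m_max=10, r_tol=15.0, target=0.05):
    """Smallest m whose false-neighbour fraction is below target, capped at m_max."""
    for m in range(1, m_max + 1):
        if fnn_fraction(series, tau, m, r_tol) < target:
            return m
    return m_max


print(embedding_dimension([math.sin(0.3 * t) for t in range(200)], tau=5))
```

The neighbour search is quadratic in the series length, which is acceptable for series of a few hundred daily observations; a k-d tree would be the usual replacement for longer series.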

Maximum Lyapunov exponent

The maximum Lyapunov exponent ($\lambda_{\max}$) is utilized to identify the chaotic characteristics of the historical COVID-19 data. Only when the chaotic characteristics are confirmed can the reconstructed data be used for the ML methods. According to [45], when $\lambda_{\max}$ of the reconstructed data exceeds zero, the data can be seen as chaotic. In this study, Wolf’s method [46], one of the most effective methods for calculating $\lambda_{\max}$, is employed to obtain the value. Suppose the vector $X(t)$ is a reconstructed series through the PSR technique. Let $X(t_0)$ and $X'(t_0)$ be the two closest points, and their distance can be expressed as follows:
$$L(t_0) = \lVert X(t_0) - X'(t_0) \rVert.$$
Then, we can use the component $L$ to calculate $\lambda_{\max}$, and the equation is defined as follows:
$$\lambda_{\max} = \frac{1}{t_M - t_0} \sum_{i=1}^{M} \ln \frac{L'(t_i)}{L(t_{i-1})},$$
where $t_0$ is the initial time, $t_M$ is the final time, and $L'(t_i)$ is the distance to which $L(t_{i-1})$ has evolved at time $t_i$. In addition, the recommended length for the time series prediction can be expressed as follows [47], [48]:
$$\Delta t = \frac{1}{\lambda_{\max}}.$$

Experiment and discussion

This section is mainly divided into the following contents: First, the relevant data and environment of the experiment are presented. Second, we introduce in detail the steps of the normalization operation and the special treatment of this operation in this problem. Third, the evaluation indicators and charts are described. Finally, the experimental results are presented and discussed.

Benchmark datasets

In this study, DNR, a novel data-driven method trained by the SFSMS algorithm (DNR-SFSMS), is utilized to forecast the transmission trend of the COVID-19 pandemic. The whole process is shown in Fig. 7. The variation in the number of confirmed COVID-19 cases in six countries from March , , to March , , is collected as experimental data for analysing DNR performance. The six countries are India, Angola, Indonesia, Ethiopia, Azerbaijan and Israel. These data are freely available at https://datahub.io (Datahub, a website dedicated to discovering and sharing high-quality data sets), which was accessed on March 14, 2021. The data were collected by Johns Hopkins University through various public approaches, and the collated data were uploaded to GitHub. The variation in the number of confirmed COVID-19 cases in six countries is shown in Fig. 8.

Fig. 7

The flowchart of the transmission trend of COVID-19 pandemic prediction.

Fig. 8

The growth curves of the numbers of COVID-19 cases diagnosed in each country.

These datasets were first processed by the PSR technique and converted into a group of new feature vector sets and target sets. It should be emphasized that the lengths of the feature vector sets and target sets may differ for each country because the selected $\tau$ and $m$ may differ in the process of reconstruction. The reconstructed data from each country were split into two portions. The first portion contained the first 300 pieces of data (see Table 1) and was utilized as the training data, and the rest of the data were employed as the test data. Then, the model was trained on the normalized training data, and its performance was evaluated on the normalized test data. All experiments were conducted on a personal computer with a 2.90 GHz Intel(R) Core processor using MATLAB R2020a.
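The reconstruction and split described above can be sketched as follows. This is an illustration only, not the authors' MATLAB code: the function names `delay_embed` and `split_train_test` are hypothetical, the one-step-ahead target is an assumption, and the toy series merely stands in for a country's case counts.

```python
def delay_embed(series, tau, m):
    """Build (feature vector, target) pairs by time-delay embedding.

    Each feature vector is (x[t], x[t + tau], ..., x[t + (m - 1) * tau]).
    The target is the value one step after the last component, which is
    an assumption made for this sketch.
    """
    span = (m - 1) * tau
    features, targets = [], []
    for t in range(len(series) - span - 1):
        features.append([series[t + j * tau] for j in range(m)])
        targets.append(series[t + span + 1])
    return features, targets


def split_train_test(features, targets, n_train=300):
    """Use the first n_train pairs for training and the rest for testing."""
    train = (features[:n_train], targets[:n_train])
    test = (features[n_train:], targets[n_train:])
    return train, test


# Toy monotonically increasing series (not real case data).
toy_series = [i * i for i in range(400)]
X, y = delay_embed(toy_series, tau=3, m=2)
(train_X, train_y), (test_X, test_y) = split_train_test(X, y)
print(len(X), len(train_X), len(test_X))
```

Because the number of usable pairs shrinks as $(m-1)\tau$ grows, different countries end up with different test set sizes even when the training size is fixed.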

Normalization

The purpose of the normalization operation is to reduce the computational time and increase the accuracy of prediction. Normalization is performed not only on the training data but also on the test data. There are several approaches to performing normalization, such as mean and variance normalization and simple normalization [49], [50]. In this study, the normalization function can be expressed as follows:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \, (ub - lb) + lb$$

where $x$ is a member of the reconstructed feature vectors or a target value, $x'$ is the normalized value of $x$, $x_{\max}$ and $x_{\min}$ are the maximum value and the minimum value of $x$, respectively, and $ub$ and $lb$ are the upper bound and the lower bound, respectively. Usually, $ub$ defaults to 1, and $lb$ defaults to 0. Since DNR employs a sigmoid function as the activation function in the last layer, the output of DNR is in the range of [0,1]. Thus, the normalization operation is an imperative step when applying DNR. In addition, because the case counts predicted here increase monotonically, the selected values of the upper bound and lower bound strongly influence the results.
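A minimal sketch of min-max normalization into a chosen range, together with the inverse mapping needed to turn a normalized prediction back into a case count. The helper name `fit_min_max` and the toy values are hypothetical; the range [0.3, 0.65] is the one listed for India in Table 2.

```python
def fit_min_max(values, lower=0.0, upper=1.0):
    """Return (normalize, denormalize) functions for the given range."""
    if upper <= lower:
        raise ValueError("upper bound must exceed lower bound")
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        raise ValueError("values must not all be equal")
    scale = (upper - lower) / (v_max - v_min)

    def normalize(x):
        # Map [v_min, v_max] linearly onto [lower, upper].
        return (x - v_min) * scale + lower

    def denormalize(z):
        # Inverse mapping, used to recover values on the original scale.
        return (z - lower) / scale + v_min

    return normalize, denormalize


# Toy values standing in for confirmed-case counts.
cases = [1000.0, 2500.0, 4000.0, 7000.0]
normalize, denormalize = fit_min_max(cases, lower=0.3, upper=0.65)
scaled = [normalize(x) for x in cases]
restored = [denormalize(z) for z in scaled]
print(scaled)
```

Choosing an upper bound below 1 leaves headroom under the sigmoid's saturation level, which matters when later values exceed those seen so far.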

Evaluation metrics

To fairly compare the performance differences between DNR-SFSMS and other algorithms, several evaluation approaches were utilized in this study:

(1) Evaluation metrics: Four commonly used evaluation metrics were adopted for our experiments: the mean square error (MSE), the root mean square error (RMSE), the mean absolute percentage error (MAPE), and the mean absolute error (MAE). Their equations can be expressed as follows:

$$MSE = \frac{1}{n} \sum_{i=1}^{n} (T_i - L_i)^2$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (T_i - L_i)^2}$$

$$MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{T_i - L_i}{T_i} \right|$$

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| T_i - L_i \right|$$

where $T_i$ represents the desired values, $L_i$ denotes the prediction outputs, and $n$ is the number of samples. MSE, RMSE, MAPE and MAE are all error-based metrics; thus, lower values of these metrics indicate better performance of the algorithm.

(2) Relative charts: The charts plotted during the experiments make it easier to observe the performance of each algorithm and to draw conclusions from the results. Several commonly used charts were generated to compare the performance of all algorithms. First, fitting charts were drawn to show the difference between the desired curve and the prediction curve. Second, a convergence chart was plotted to demonstrate the convergence effect and speed of DNR-SFSMS on all datasets. Third, histograms were generated to reveal the metric differences. Finally, boxplots were produced to display the distribution of the results over all runs.

(3) Nonparametric statistical test: The purpose of the test is to determine whether there is a significant difference between DNR-SFSMS and other ML methods. The Wilcoxon rank-sum test [51], [52] was adopted in this study and implemented with the KEEL software [53]. The significance level was set to 5%, which indicates that there is a significant difference in performance between DNR-SFSMS and another algorithm when the p-value is less than 0.05.
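The four error metrics can be computed as in the sketch below. The function name `error_metrics` and the toy numbers are hypothetical, and MAPE is returned as a fraction rather than a percentage, which is an assumption about the convention used.

```python
import math


def error_metrics(desired, predicted):
    """Return MSE, RMSE, MAPE and MAE for paired value lists."""
    n = len(desired)
    errors = [t - o for t, o in zip(desired, predicted)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE divides by the desired value, so desired values must be nonzero.
    mape = sum(abs(e / t) for e, t in zip(errors, desired)) / n
    mae = sum(abs(e) for e in errors) / n
    return {"MSE": mse, "RMSE": rmse, "MAPE": mape, "MAE": mae}


# Toy values, not results from the paper.
metrics = error_metrics([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(metrics)
```

MSE, RMSE and MAE scale with the magnitude of the case counts, whereas MAPE is relative, which makes it the easiest of the four to compare across countries of very different size.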

Experimental results and discussion

Using the methods described above, the three components $\tau$, $m$ and $\lambda_{\max}$ of PSR for the time series of confirmed COVID-19 cases in six countries were calculated. The results of the PSR operation and the specific allocation of training data and test data are displayed in Table 1. The maximum Lyapunov exponents of all six countries exceed zero, which indicates that these time series are chaotic. Thus, the reconstructed datasets can be used in the prediction. In addition, the results of the time delay and embedding dimension calculations are plotted in Fig. 9.

Table 1

Results for the time delay, embedding dimensionality and maximum Lyapunov exponents of the time series confirmed COVID-19 case data from six countries.

Country	Time delay (τ)	Embedding dimensionality (m)	Maximum Lyapunov exponent (λmax)	Chaotic	Training data size	Test data size
India	1	2	0.0182	Yes	300	75
Angola	1	6	0.0332	Yes	300	71
Indonesia	1	4	0.0198	Yes	300	73
Ethiopia	1	3	0.0252	Yes	300	74
Azerbaijan	5	4	0.0152	Yes	300	61
Israel	3	2	0.0226	Yes	300	73

Fig. 9

The results of the MI algorithm and FNN algorithm on the time series confirmed COVID-19 case data from six countries.

For a comprehensive evaluation, several commonly used ML methods are adopted for comparison with DNR-SFSMS. These methods include the multilayer perceptron (MLP) [54]; the Elman neural network (ENN) [55]; support vector regression with a linear kernel (SVR-l), a polynomial kernel (SVR-p), an RBF kernel (SVR-r) and a sigmoid kernel (SVR-s) [56]; RNN variants, including LSTM, BiLSTM and GRU; the DNM trained by L-SHADE (EDNM) [57]; and the original DNM. In addition, the random search method [58] is utilized to adjust the hyperparameters of the algorithms in this study. The range of normalization is also regarded as one of the hyperparameters of the DNM-based methods since it strongly influences the prediction accuracy. The hyperparameter settings of the DNM-based methods and the other algorithms are shown in Table 2 and Table 3, respectively. Each algorithm was run independently 30 times on each dataset.

Table 2

Hyperparameter settings of the DNM-based methods for the COVID-19 pandemic dataset of each country.

Country	k	M	qs	epoch	Normalization range	popsize (EDNM, DNR-SFSMS)	Learning rate (DNR-BP)
India	5	6	0.5	1000	[0.3,0.65]	100	0.03
Angola	6	3	0.5	1000	[0.2,0.5]	100	0.05
Indonesia	5	5	0.5	1000	[0.4,0.469]	100	0.01
Ethiopia	6	7	0.5	1000	[0.1,0.19]	100	0.01
Azerbaijan	6	4	0.5	1000	[0.15,0.255]	100	0.05
Israel	5	4	0.5	1000	[0.3,0.313]	100	0.12

Table 3

Parameter settings of the algorithms for the transmission trend prediction of the COVID-19 pandemic.

Algorithm	Parameters
MLP	HiddenLayer=10, learningRate=0.01, epoch=1000
ENN	learningRate=0.01, epoch=1000
SVR	linear, polynomial, RBF and sigmoid kernels
LSTM, BiLSTM, GRU	HiddenUnits=200, epoch=1000

Fig. 10 shows the fitting charts for the COVID-19 trend forecasts in six countries. Among the four SVR-based algorithms, SVR-l, which has the best performance, is selected as the representative. Among the three RNN-based algorithms, GRU is selected as the representative since it has a better result than the other two. It should be emphasized that the reconstructed time series of the number of confirmed COVID-19 cases in each country was divided into a training dataset comprising the first 300 pieces of data and a test dataset comprising the remaining data. As shown in Fig. 10, on the training dataset, namely, the first 300 days, all algorithms except SVR-l fit the data very well. However, on the test dataset, the difference between the curve generated by DNR-SFSMS and the original curve is significantly smaller than those of the other algorithms. Fig. 11 shows the convergence of DNR-SFSMS when trained on the six datasets. As shown in Fig. 11, although the maximum iteration number is set to 1000, DNR-SFSMS always converges well within this limit, and the convergence is outstanding. The results indicate that the SFSMS algorithm has excellent performance in optimizing the weights of DNR and that DNR has good generalization for this type of problem.

Fig. 10

Training and prediction results of DNR-SFSMS for six countries.

Fig. 11

Convergence curves of DNR-SFSMS for transmission trend prediction of the COVID-19 pandemic.

The results of the four performance metrics on the six datasets for each algorithm are reported in Table 4 and Table 5 in the form of “average ± standard deviation”. For each metric, we present the best results in bold. As shown in Table 4 and Table 5, DNR-SFSMS achieves the best results on every dataset, which implies that DNR-SFSMS has greater effectiveness and stability in regard to COVID-19 trend prediction. Moreover, comparing DNR-SFSMS, EDNM and the original DNM, we find that adding a weight on the dendrite layer to describe the strength of the dendrite branches is indeed conducive to the regression ability. In addition, compared with the traditional BP and L-SHADE algorithms, the SFSMS algorithm adapts better to the DNM-based models, searches for the global optimal solution more effectively, and has a more stable searching ability. Finally, most of the p-values in the tables are less than 0.05. According to the rules of the nonparametric statistical test, we can conclude that the precision of DNR-SFSMS is significantly better than those of the other algorithms.
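To illustrate how such p-values arise, the sketch below computes a two-sided Wilcoxon rank-sum p-value with the normal approximation. It is not the KEEL implementation used in the study: the function name `rank_sum_p_value` is hypothetical, no continuity correction is applied, and the sample values are toy numbers rather than results from the tables.

```python
import math


def rank_sum_p_value(sample_a, sample_b):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation)."""
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    n_a, n_b = len(sample_a), len(sample_b)
    n = n_a + n_b
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i + 1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        count = j - i
        tie_term += count ** 3 - count
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    mean = n_a * (n + 1) / 2.0
    variance = n_a * n_b / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return 1.0  # every value is identical, so no evidence of a difference
    z = (rank_sum_a - mean) / math.sqrt(variance)
    return math.erfc(abs(z) / math.sqrt(2.0))


# Toy MAPE values for two hypothetical algorithms.
p_separated = rank_sum_p_value([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
p_identical = rank_sum_p_value([1, 2, 3], [1, 2, 3])
print(p_separated, p_identical)
```

With fully separated samples the p-value falls below 0.05, while identical samples give a p-value of 1, matching the interpretation applied to the tables.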

Table 4

Prediction performance for confirmed COVID-19 cases in India, Angola and Indonesia.

India
Model	MSE (Mean±Std)	p-value	RMSE (Mean±Std)	p-value	MAPE (Mean±Std)	p-value	MAE (Mean±Std)	p-value
ENN	5.33E+08 ± 3.80E+08	1.33E−01	2.12E+04 ± 9.16E+03	1.21E−01	1.61E−03 ± 7.23E−04	2.05E−01	1.76E+04 ± 7.95E+03	1.83E−01
MLP	4.78E+12 ± 4.98E+12	9.13E−07	1.89E+06 ± 1.09E+06	9.13E−07	1.71E−01 ± 1.01E−01	9.13E−07	1.85E+06 ± 1.09E+06	9.13E−07
SVR-l	1.28E+12 ± 0.00E+00	9.13E−07	1.13E+06 ± 0.00E+00	9.13E−07	1.05E−01 ± 0.00E+00	9.13E−07	1.13E+06 ± 0.00E+00	9.13E−07
SVR-p	1.34E+13 ± 0.00E+00	9.13E−07	3.65E+06 ± 0.00E+00	9.13E−07	3.29E−01 ± 0.00E+00	9.13E−07	3.57E+06 ± 0.00E+00	9.13E−07
SVR-r	1.88E+12 ± 0.00E+00	9.13E−07	1.37E+06 ± 0.00E+00	9.13E−07	1.26E−01 ± 0.00E+00	9.13E−07	1.36E+06 ± 0.00E+00	9.13E−07
SVR-s	6.90E+14 ± 0.00E+00	9.13E−07	2.63E+07 ± 0.00E+00	9.13E−07	2.41E+00 ± 0.00E+00	9.13E−07	2.60E+07 ± 0.00E+00	9.13E−07
LSTM	5.16E+10 ± 4.73E+10	9.13E−07	1.99E+05 ± 1.09E+05	9.13E−07	1.80E−02 ± 1.01E−02	9.13E−07	1.95E+05 ± 1.11E+05	9.13E−07
BiLSTM	7.73E+10 ± 3.27E+10	9.13E−07	2.71E+05 ± 6.41E+04	9.13E−07	2.48E−02 ± 5.87E−03	9.13E−07	2.68E+05 ± 6.42E+04	9.13E−07
GRU	6.23E+10 ± 3.81E+10	9.13E−07	2.35E+05 ± 8.50E+04	9.13E−07	2.13E−02 ± 7.85E−03	9.13E−07	2.31E+05 ± 8.58E+04	9.13E−07
EDNM	8.24E+10 ± 2.69E+11	1.85E−06	1.70E+05 ± 2.31E+05	1.67E−06	1.35E−02 ± 1.82E−02	3.02E−06	1.48E+05 ± 1.99E+05	3.02E−06
DNR-BP	9.64E+10 ± 4.56E+10	9.13E−07	3.01E+05 ± 7.55E+04	9.13E−07	2.68E−02 ± 6.79E−03	9.13E−07	2.91E+05 ± 7.36E+04	9.13E−07
DNR-SFSMS	4.62E+08 ± 5.20E+08	–	1.85E+04 ± 1.09E+04	–	1.48E−03 ± 9.54E−04	–	1.60E+04 ± 1.04E+04	–

Table 5

Prediction performance for confirmed COVID-19 cases in Ethiopia, Azerbaijan and Israel.

Ethiopia
Model	MSE (Mean±Std)	p-value	RMSE (Mean±Std)	p-value	MAPE (Mean±Std)	p-value	MAE (Mean±Std)	p-value
ENN	3.70E+07 ± 8.43E+07	2.77E−05	4.74E+03 ± 3.81E+03	3.03E−05	2.06E−02 ± 1.60E−02	3.71E−04	3.22E+03 ± 2.52E+03	2.75E−04
MLP	1.14E+09 ± 1.09E+09	9.13E−07	2.93E+04 ± 1.67E+04	9.13E−07	1.78E−01 ± 1.11E−01	9.13E−07	2.62E+04 ± 1.61E+04	9.13E−07
SVR-l	2.84E+08 ± 0.00E+00	9.13E−07	1.69E+04 ± 0.00E+00	9.13E−07	1.15E−01 ± 0.00E+00	9.13E−07	1.66E+04 ± 0.00E+00	9.13E−07
SVR-p	1.21E+10 ± 0.00E+00	9.13E−07	1.10E+05 ± 0.00E+00	9.13E−07	6.49E−01 ± 0.00E+00	9.13E−07	9.66E+04 ± 0.00E+00	9.13E−07
SVR-r	8.58E+08 ± 0.00E+00	9.13E−07	2.93E+04 ± 0.00E+00	9.13E−07	1.82E−01 ± 0.00E+00	9.13E−07	2.68E+04 ± 0.00E+00	9.13E−07
SVR-s	2.16E+11 ± 0.00E+00	9.13E−07	4.65E+05 ± 0.00E+00	9.13E−07	2.98E+00 ± 0.00E+00	9.13E−07	4.36E+05 ± 0.00E+00	9.13E−07
LSTM	8.21E+06 ± 5.07E+06	2.75E−03	2.68E+03 ± 1.01E+03	1.25E−03	1.73E−02 ± 6.44E−03	2.54E−04	2.49E+03 ± 9.88E+02	7.69E−04
BiLSTM	1.02E+07 ± 5.35E+06	6.00E−05	3.06E+03 ± 8.94E+02	2.54E−05	2.05E−02 ± 5.66E−03	6.49E−06	2.93E+03 ± 8.73E+02	1.03E−05
GRU	3.82E+07 ± 4.39E+07	7.16E−04	4.92E+03 ± 3.74E+03	4.31E−04	2.13E−02 ± 1.63E−02	6.41E−03	4.89E+03 ± 3.75E+03	9.08E−05
EDNM	5.33E+09 ± 2.44E+10	2.74E−06	3.04E+04 ± 6.64E+04	2.25E−06	1.22E−01 ± 2.39E−01	3.02E−06	1.91E+04 ± 3.82E+04	3.02E−06
DNR-BP	7.59E+08 ± 2.08E+09	9.13E−07	1.89E+04 ± 2.00E+04	9.13E−07	1.23E−01 ± 1.41E−01	9.13E−07	1.78E+04 ± 2.01E+04	9.13E−07
DNR-SFSMS	3.72E+06 ± 4.55E+06	–	1.62E+03 ± 1.04E+03	–	9.28E−03 ± 6.31E−03	–	1.38E+03 ± 9.39E+02	–

In Fig. 12, the performance metrics are presented in the form of bar charts. On each chart, we use a black dotted line to highlight the performance differences between DNR-SFSMS and the other algorithms. From Fig. 12, we can intuitively observe that DNR-SFSMS yields the best results on all datasets. Through further observation, we find that the predictions are more accurate for countries with more severe outbreaks, such as India, than for countries with less severe outbreaks, as reflected in their smaller MAPE values. We speculate that this is because countries with mild outbreaks have effectively suppressed the transmission of the disease through policy measures, so the spread no longer conforms to the law of natural transmission of the disease, which lowers the prediction accuracy of the model. Since MAPE is a more informative reference than the other performance metrics for such large numerical prediction problems, we employ box plots to show the results of MAPE in Fig. 13. The box plots help us visualize the detailed behaviour of each algorithm on each dataset over the 30 runs. In the box plots, the ordinate represents the MAPE value, and the rectangle represents the range of the overall MAPE values of the algorithm in the experiment. A rectangle with a large area denotes great fluctuations and poor stability of the algorithm. In addition, a red plus sign indicates a run in which the algorithm fell into a local optimum; the more signs there are, the more runs fell into a local optimum. It should be noted that the SVR-based algorithms are deterministic, so every run gives the same result and they are represented as a line in the box plots. Fig. 13 shows that DNR-SFSMS has almost the smallest rectangular area on all datasets, its rectangles are positioned lowest on the coordinate axis, and it has the fewest red plus signs. These observations suggest that the forecasting precision of DNR in COVID-19 trend prediction is superior to that of the other commonly used ML methods. In conclusion, the above experimental results suggest that DNR-SFSMS can be used as a competitive tool for COVID-19 trend prediction.

Fig. 12

Comparison of the four error metrics of prediction methods on the data from the six countries. From the top, the methods are DNR-SFSMS, DNR-BP, EDNM, GRU, BiLSTM, LSTM, SVR-s, SVR-r, SVR-p, SVR-l, MLP and ENN, respectively.

Fig. 13

Box-and-whisker plots for the MAPE of DNR-SFSMS for six countries.

Conclusion

Since the outbreak of the COVID-19 pandemic, machine learning techniques have played a vital role in pandemic prevention, contact tracing, rapid screening and the development of vaccines and drugs. In this study, a novel DNR approach is applied to forecast the transmission trend of the COVID-19 pandemic. The regression capability of DNR is enhanced by combining the SMS algorithm with a scale-free local search. The SMS algorithm is a recently proposed evolutionary algorithm with strong optimization ability, and the scale-free local search can improve the quality of the population during evolution. Since the COVID-19 pandemic dataset can be deemed a time series, Takens’s theorem is applied to the data to improve the prediction accuracy. We utilize the time delay calculated by the MI algorithm and the embedding dimension calculated by the FNN algorithm to implement the PSR operation. Then, the maximum Lyapunov exponent is calculated to determine whether the reconstructed data are chaotic, which confirms that the reconstructed data can be used for prediction. To fairly evaluate the proposed method, several commonly used ML methods are employed as competitors. The experimental results show that DNR-SFSMS is more competitive than the other methods for forecasting the COVID-19 transmission trend in terms of various evaluation metrics. DNR-SFSMS can therefore be regarded as a powerful prediction approach. Our future research will focus on applying DNR-SFSMS to forecast the transmission of COVID-19 in more countries and on validating its forecast performance on other infectious diseases.

CRediT authorship contribution statement

Minhui Dong: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing - original draft, Writing - review & editing, Visualization. Cheng Tang: Resources, Validation, Investigation, Writing - review & editing. Junkai Ji: Methodology, Validation, Formal analysis, Investigation, Writing - review & editing, Supervision. Qiuzhen Lin: Resources, Investigation, Writing - review & editing. Ka-Chun Wong: Resources, Supervision, Project administration.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
