In this work we present an approach for fuzzy aggregation in ensembles of neural networks for forecasting. The aggregator combines the outputs of the networks forming the ensemble in such a way that the total output of the ensemble is better than the outputs of the individual modules. In our approach, a fuzzy system estimates the weights that are assigned to the outputs when combining them in a weighted average. The uncertainty in the aggregation process is modeled with interval type-3 fuzzy logic, which in theory can outperform type-2 and type-1. Publicly available data sets of COVID-19 cases for several countries were used to test the proposed approach. Simulation results on the COVID-19 data show the potential of the approach to outperform other aggregators in the literature.
Fuzzy logic has become very important in different disciplines of study. One of the areas on which we focus in this work is time series prediction, as it has been shown in the literature that the use of fuzzy logic helps to improve results in many problems (Zadeh, 1989, Zadeh, 1998). Type-1 fuzzy systems evolved to type-2 fuzzy systems mainly with the works by Mendel (Mendel, 2001). Initially, interval type-2 fuzzy systems were studied and applied to several problems (Mendel, 2001). Later, these systems were applied to many problems in areas such as robotics, control, diagnosis and others (Mendel, 2017, Karnik and Mendel, 2001). Simulation and experimental results show that interval type-2 outperforms type-1 fuzzy systems in situations with higher levels of noise, dynamic environments or highly nonlinear problems (Moreno et al., 2020, Mendel et al., 2014, Olivas et al., 2016). Later, general type-2 fuzzy systems were considered to manage higher levels of uncertainty, and good results have been achieved in several areas of application (Sakalli et al., 2021, Ontiveros et al., 2018, Castillo and Amador-Angulo, 2018). More recently, it is becoming apparent that type-3 fuzzy systems could help solve even more complex problems. For this reason, in this paper we put forward the basic constructs of type-3 fuzzy systems by extending the ideas of type-2 (Cao et al., 2021, Mohammadzadeh et al., 2021, Qasem et al., 2021), and also apply them to time series prediction.

Recently, COVID-19 has propagated very rapidly, in several waves, and has spread to all continents. In particular, in Europe several countries, like Italy, Spain, and France, have been hit hard by the spread of the COVID-19 virus, with a significant number of confirmed cases and deaths (The Humanitarian Data Exchange (HDX), 2022, Shereen et al., 2020, Sohrabi et al., 2020, Apostolopoulos and Bessiana, 2020, Sarkodie and Owusu, 2020, Beck et al., 2020). In the American continent, the United States, Canada and Brazil have also suffered a significant number of cases due to the rapid spread of COVID-19 (Zhong et al., 2020, Kamel Boulos and Geraghty, 2020, Gao et al., 2020). There are also several recent works on predicting and modeling COVID-19 behavior in space and time (Rao and Vazquez, 2020, Melin et al., 2020a). However, prediction remains a challenging task, as can be seen in recent papers on this topic (Melin et al., 2020b, Jin et al., 2020, Khalilpourazari et al., 2021, Kuvvetli et al., 2021, Liu et al., 2022), where different methods, such as neural networks and fuzzy logic, have been utilized for this task. This was the main motivation for undertaking this research work.

In contrast to previous prediction approaches, one of the key contributions of this work is the proposal of mathematical definitions of interval type-3 fuzzy theory, which were obtained by applying the extension principle to the type-2 fuzzy theory definitions. In addition, the utilization of interval type-3 fuzzy logic in the aggregation of ensemble outputs for prediction has not been previously presented in the literature, and it is now shown that interval type-3 has the potential to be better than type-2 and type-1 in prediction problems. Also, the hybridization of type-3 with ensembles of neural networks has not been previously considered in prediction problems. We consider that these are important contributions to the frontier of knowledge in soft computing and its applications.

The structure of this article is as follows: Section 2 introduces basic terminology of interval type-3 fuzzy sets, Section 3 describes the proposed type-3 prediction method, Section 4 summarizes the results, and Section 5 outlines the conclusions and future work.
Interval type-3 fuzzy logic
Interval type-3 fuzzy sets can be viewed as an extension of type-2 models. We offer basic terminology of type-3 fuzzy sets to give an idea of how they differ from their type-2 and type-1 counterparts. We start by recalling the concept of a (type-1) fuzzy set proposed by Zadeh (1989), where the membership in a set is allowed to be any number in the interval [0, 1], in this way extending the concept of traditional sets. In this case, a type-1 fuzzy set $A$ is represented as:

$$A = \{(x, \mu_A(x)) \mid x \in X\}$$

where $x$ is an element of a universe $X$, and $\mu_A(x)$ is a membership function with numeric values in the interval [0, 1]. Later, as an extension of type-1, the concept of a type-2 fuzzy set was proposed, which allows the membership to be a type-1 fuzzy set instead of a precise number (Mendel, 2001, Mendel, 2017). The goal of this extension was to allow a better representation of real-world uncertainty. A type-2 fuzzy set $\tilde{A}$ is represented mathematically as:

$$\tilde{A} = \{((x, u), \mu_{\tilde{A}}(x, u)) \mid x \in X,\ u \in J_x \subseteq [0, 1]\}$$

in which $0 \le \mu_{\tilde{A}}(x, u) \le 1$. In fact, $J_x$ represents the primary membership domain of $x$, and $\mu_{\tilde{A}}(x, u)$ is a type-1 fuzzy set known as the secondary set. Later, a type-3 fuzzy set was also proposed as an extension of a type-2 fuzzy set, by using primary, secondary and tertiary membership functions, with the goal of having an even better representation of uncertainty. The mathematical definition of a type-3 fuzzy set can be established as follows:

A type-3 fuzzy set (T3 FS) (Rickard et al., 2009, Mohammadzadeh et al., 2020, Liu et al., 2021), denoted by $A^{(3)}$, is represented by the plot of a trivariate function, called the membership function (MF) of $A^{(3)}$, in the Cartesian product $X \times [0, 1] \times [0, 1]$ in $[0, 1]$, where $X$ is the universe of the primary variable of $A^{(3)}$, $x$. The MF of $A^{(3)}$ is formulated by $\mu_{A^{(3)}}(x, u, v)$ (or $\mu_{A^{(3)}}$ for short), and it is called a type-3 membership function (T3 MF) of the T3 FS, where $U$ is the universe for the secondary variable $u$ and $V$ is the universe for the tertiary variable $v$. If the tertiary MF is uniformly equal to 1, then we have an interval type-3 fuzzy set (IT3 FS) with an interval type-3 MF (IT3 MF).

Fig. 1 illustrates an IT3 FS with IT3 MF $\tilde{\mu}(x, u)$, where $\underline{\mu}(x, u)$ is the LMF and $\overline{\mu}(x, u)$ is the UMF. The figure also shows the embedded secondary T1 MFs of the UMF and the LMF at a given value of $x$.
Fig. 1
Fuzzy set with an IT3 MF.
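In this case, we utilize interval type-3 MFs that are scaled Gaussians in the primary and secondary domains, respectively. This function can be represented as $\tilde{\mu}(x, u) = \mathrm{ScaleGaussScaleGaussIT3MF}$, with a Gaussian footprint of uncertainty (FOU), characterized by the parameters $[\sigma, m]$ (UpperParameters) for the upper membership function (UMF) and, for the lower membership function (LMF), by the parameters $\lambda$ (LowerScale) and $\ell$ (LowerLag), which together form the domain of uncertainty $DOU = [\underline{u}(x), \overline{u}(x)]$. The vertical cuts characterize the FOU, and are IT2 FSs with Gaussian IT2 MFs, with parameters $[\sigma_u, u(x)]$ for the UMF and, for the LMF, $\lambda$ (LowerScale) and $\ell$ (LowerLag). The IT3 MF, $\tilde{\mu}(x, u) = \mathrm{ScaleGaussScaleGaussIT3MF}(x, \{[\sigma, m], \lambda, \ell\})$, is described with the following equations:

$$\overline{u}(x) = \exp\left[-\frac{1}{2}\left(\frac{x - m}{\sigma}\right)^2\right] \tag{4}$$

$$\underline{u}(x) = \lambda \cdot \exp\left[-\frac{1}{2}\left(\frac{x - m}{\sigma^*}\right)^2\right] \tag{5}$$

where $\sigma^* = \sigma\sqrt{\ln(\ell)/\ln(\varepsilon)}$ and $\varepsilon$ is the machine epsilon. If $\ell = 0$, then $\sigma^* = \sigma$. Then $\overline{u}(x)$ and $\underline{u}(x)$ are the upper and lower limits of the DOU. The range, $\delta(u)$, and radius, $\sigma_u$, of the FOU are:

$$\delta(u) = \overline{u}(x) - \underline{u}(x) \tag{6}$$

$$\sigma_u = \frac{\delta(u)}{2\sqrt{3}} + \varepsilon \tag{7}$$

The apex or core, $u(x)$, of the IT3 MF $\tilde{\mu}(x, u)$ is defined by the expression:

$$u(x) = \exp\left[-\frac{1}{2}\left(\frac{x - m}{\rho}\right)^2\right] \tag{8}$$

where $\rho = (\sigma + \sigma^*)/2$. Then, the vertical cuts with IT2 MF, $\mu_{A(x)}(u) = [\underline{\mu}_{A(x)}(u), \overline{\mu}_{A(x)}(u)]$, are described by the equations:

$$\overline{\mu}_{A(x)}(u) = \exp\left[-\frac{1}{2}\left(\frac{u - u(x)}{\sigma_u}\right)^2\right] \tag{9}$$

$$\underline{\mu}_{A(x)}(u) = \lambda \cdot \exp\left[-\frac{1}{2}\left(\frac{u - u(x)}{\sigma_u^*}\right)^2\right] \tag{10}$$

where $\sigma_u^* = \sigma_u\sqrt{\ln(\ell)/\ln(\varepsilon)}$. If $\ell = 0$, then $\sigma_u^* = \sigma_u$. Then, $\overline{\mu}_{A(x)}(u)$ and $\underline{\mu}_{A(x)}(u)$ are the UMF and LMF of the IT2 FSs of the vertical cuts of the secondary IT2 MF of the IT3 FS.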
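As a rough illustration of how a scaled Gaussian IT3 MF of this kind could be evaluated, the following Python sketch assumes a Gaussian upper bound with parameters (sigma, m), a lower bound scaled by LowerScale and narrowed through LowerLag via sigma* = sigma * sqrt(ln(lag) / ln(eps)), a FOU radius equal to the range divided by 2*sqrt(3) plus eps, and a core with width (sigma + sigma*) / 2. These formulas and all function names (`lagged_sigma`, `dou_bounds`, `vertical_cut`) are assumptions made for illustration, not the authors' implementation, and the sketch expects 0 <= lag < 1.

```python
import math
import sys

EPS = sys.float_info.epsilon  # machine epsilon for double precision


def lagged_sigma(sigma, lag):
    """Assumed LowerLag rule: sigma * sqrt(ln(lag) / ln(eps)), or sigma if lag == 0."""
    if lag == 0:
        return sigma
    return sigma * math.sqrt(math.log(lag) / math.log(EPS))


def gauss(x, m, sigma):
    """Plain Gaussian membership value with center m and width sigma."""
    return math.exp(-0.5 * ((x - m) / sigma) ** 2)


def dou_bounds(x, sigma, m, scale, lag):
    """Lower and upper limits of the domain of uncertainty at x."""
    upper = gauss(x, m, sigma)
    lower = scale * gauss(x, m, lagged_sigma(sigma, lag))
    return lower, upper


def vertical_cut(u, x, sigma, m, scale, lag):
    """Interval (LMF, UMF) of the secondary IT2 MF evaluated at (x, u)."""
    lower, upper = dou_bounds(x, sigma, m, scale, lag)
    sigma_u = (upper - lower) / (2.0 * math.sqrt(3.0)) + EPS  # assumed FOU radius
    rho = 0.5 * (sigma + lagged_sigma(sigma, lag))
    apex = gauss(x, m, rho)  # assumed core of the vertical cut
    umf = gauss(u, apex, sigma_u)
    lmf = scale * gauss(u, apex, lagged_sigma(sigma_u, lag))
    return lmf, umf


# "Medium" input MF of Table 1 (sigma = 0.12, m = 0.50) with the input
# LowerScale = 0.8 and LowerLag = 0.2 reported in the text.
print(dou_bounds(0.45, sigma=0.12, m=0.50, scale=0.8, lag=0.2))
print(vertical_cut(0.9, 0.45, sigma=0.12, m=0.50, scale=0.8, lag=0.2))
```

Under these assumptions the lower bound never exceeds the upper bound, because the lagged width is smaller than sigma and the scale is at most 1.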
Proposed method
The method consists of utilizing an ensemble of two neural networks and then combining their outputs with a weighted average in which the weights are computed with an interval type-3 fuzzy system. Fig. 2 illustrates the architecture of the proposed method, where the time series enters the two modules of the ensemble and the individual predictions $P_1$ and $P_2$ are obtained with corresponding training errors $e_1$ and $e_2$, respectively.
Fig. 2
Architecture of the proposed ensemble with type-3 fuzzy response aggregation.
The fuzzy rules for aggregating the results with two modules are:

1. If ($e_1$ is small) and ($e_2$ is small), then ($w_1$ is high)($w_2$ is high).
2. If ($e_1$ is small) and ($e_2$ is medium), then ($w_1$ is high)($w_2$ is medium).
3. If ($e_1$ is small) and ($e_2$ is high), then ($w_1$ is high)($w_2$ is low).
4. If ($e_1$ is medium) and ($e_2$ is small), then ($w_1$ is medium)($w_2$ is high).
5. If ($e_1$ is medium) and ($e_2$ is medium), then ($w_1$ is medium)($w_2$ is medium).
6. If ($e_1$ is medium) and ($e_2$ is high), then ($w_1$ is medium)($w_2$ is low).
7. If ($e_1$ is high) and ($e_2$ is small), then ($w_1$ is low)($w_2$ is high).
8. If ($e_1$ is high) and ($e_2$ is medium), then ($w_1$ is low)($w_2$ is medium).
9. If ($e_1$ is high) and ($e_2$ is high), then ($w_1$ is low)($w_2$ is low).

The design of the fuzzy rules was based on general knowledge of training neural networks with time series data. It is known that a high training error means that the weight of the corresponding neural network should be low. On the other hand, if the training error is low, then the weight of the network should be high. This general knowledge was used in putting forward the fuzzy rules. The interval type-3 system (seen in Fig. 3) has as inputs the error values of each neural network, $e_1$ and $e_2$, respectively. The fuzzy rules could be easily generalized for the case of three neural networks and correspondingly three weights. After the type-reduction and defuzzification, the type-3 system has as outputs the corresponding weights ($w_1$ and $w_2$) for each neural network according to its prediction errors, to obtain a final prediction $P$ (combining $P_1$ and $P_2$, the predictions of the neural networks) given by:

$$P = \frac{w_1 P_1 + w_2 P_2}{w_1 + w_2}$$
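The weighted-average combination of the module predictions can be sketched in a few lines of Python. The function name `weighted_average` is hypothetical, and the normalization by the sum of the weights is an assumption that is consistent with the values reported later in Tables 2 and 3.

```python
def weighted_average(predictions, weights):
    """Combine module predictions as sum(w_i * P_i) / sum(w_i)."""
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    return sum(w * p for w, p in zip(weights, predictions)) / total_weight


# First test day for France: P1 and P2 from Table 3, w1 and w2 from Table 2.
p_france = weighted_average([6605410.19, 7008247.76], [0.1653, 0.1866])
print(round(p_france, 2))
```

With these inputs the result agrees, up to rounding of the published weights, with the weighted-average prediction of 6819020.56 listed for the same day in Table 3. Because the weights are normalized inside the function, they do not need to sum to one.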
Fig. 3
Interval type-3 system to compute the weights.
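The nine rules follow a single pattern: each weight label depends only on the error label of its own network, with small errors mapped to high weights. Under that reading, the rule base for any number of modules can be generated rather than written by hand, which is one possible way to realize the generalization to three networks mentioned in the text. The names `WEIGHT_FOR_ERROR` and `build_rule_base` are hypothetical.

```python
from itertools import product

ERROR_LABELS = ("small", "medium", "high")
# Assumed inversion: the smaller the training error, the higher the weight.
WEIGHT_FOR_ERROR = {"small": "high", "medium": "medium", "high": "low"}


def build_rule_base(n_modules=2):
    """Map each combination of error labels to the tuple of weight labels."""
    return {
        errors: tuple(WEIGHT_FOR_ERROR[label] for label in errors)
        for errors in product(ERROR_LABELS, repeat=n_modules)
    }


rules = build_rule_base(2)
for (e1, e2), (w1, w2) in rules.items():
    print(f"If (e1 is {e1}) and (e2 is {e2}), then (w1 is {w1})(w2 is {w2}).")
```

For two modules this yields the nine rules of the text; for three modules the same call with `n_modules=3` yields 27 rules.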
In Table 1 we show the specific parameters of the MFs, which were found by trial and error, and could be optimized in the future with metaheuristics for achieving even better results. Basically, Table 1 shows the centers and standard deviations of the Gaussian MFs. The parameters of Table 1 are used in Eqs. (4)–(10) to generate the MFs needed for the fuzzy rules.
Table 1
Parameter values for the Gaussian MFs used in the linguistic values (center and standard deviation).

| Variable | Membership Function | σ (standard deviation) | m (center) |
|---|---|---|---|
| Input 1 | Small | 0.10 | 0.00 |
| Input 1 | Medium | 0.12 | 0.50 |
| Input 1 | High | 0.10 | 1.00 |
| Input 2 | Small | 0.10 | 0.00 |
| Input 2 | Medium | 0.12 | 0.50 |
| Input 2 | High | 0.10 | 1.00 |
| Output 1 | Low | 0.10 | 0.00 |
| Output 1 | Medium | 0.11 | 0.50 |
| Output 1 | High | 0.10 | 1.00 |
| Output 2 | Low | 0.10 | 0.00 |
| Output 2 | Medium | 0.11 | 0.50 |
| Output 2 | High | 0.10 | 1.00 |
Regarding the lower scale ($\lambda$) and lower lag ($\ell$) parameters, after experimentation for achieving better results, they were found to be 0.8 and 0.2 for the inputs, respectively. On the other hand, for the outputs, they were found to be 0.9 and 0.6, respectively.

Figs. 4 and 5 show the input MFs for both errors, respectively, and Figs. 6 and 7 illustrate the output MFs for both weights. The actual IT3 MFs are three-dimensional, but these figures show a view on the plane for simplicity. The MFs in these figures are generated by plotting Eqs. (4) to (10) with the parameters shown in Table 1. For example, for error $e_1$ (input 1) the parameter values of the first three rows are used in Eqs. (4) to (10) to generate Fig. 4. The same process is followed to generate Figs. 5, 6 and 7. For the inputs (Figs. 4 and 5), it is assumed that the linguistic values are granulated as small, medium or high, with Gaussian MFs.
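To give a feel for how the Table 1 parameters interact with the rule base, the sketch below uses a deliberately simplified type-1 stand-in for the interval type-3 system: only the upper Gaussian MFs of the inputs are used, rules are fired with the min t-norm, and each weight is obtained as a firing-strength-weighted average of the output centers. This is not the authors' type-3 inference (it ignores the lower scale and lower lag, type reduction and the output widths) and it is not expected to reproduce the weights of Table 2. All names are hypothetical.

```python
import math

# (sigma, m) pairs for the input labels and centers of the output labels,
# copied from Table 1.
INPUT_MFS = {"small": (0.10, 0.00), "medium": (0.12, 0.50), "high": (0.10, 1.00)}
OUTPUT_CENTERS = {"low": 0.00, "medium": 0.50, "high": 1.00}
# Rule pattern assumed from the rule base: small error -> high weight, etc.
WEIGHT_FOR_ERROR = {"small": "high", "medium": "medium", "high": "low"}


def gauss(x, m, sigma):
    """Plain Gaussian membership value with center m and width sigma."""
    return math.exp(-0.5 * ((x - m) / sigma) ** 2)


def fuzzify(error):
    """Membership degree of an error in each input label (upper MFs only)."""
    return {label: gauss(error, m, s) for label, (s, m) in INPUT_MFS.items()}


def type1_weights(e1, e2):
    """Type-1 approximation: min t-norm and center-weighted average."""
    mu1, mu2 = fuzzify(e1), fuzzify(e2)
    num1 = num2 = den = 0.0
    for label1, degree1 in mu1.items():
        for label2, degree2 in mu2.items():
            firing = min(degree1, degree2)
            num1 += firing * OUTPUT_CENTERS[WEIGHT_FOR_ERROR[label1]]
            num2 += firing * OUTPUT_CENTERS[WEIGHT_FOR_ERROR[label2]]
            den += firing
    if den == 0.0:
        raise ValueError("no rule fired; errors are expected to lie in [0, 1]")
    return num1 / den, num2 / den


w1, w2 = type1_weights(0.1, 0.9)
print(w1, w2)
```

Even in this reduced form, the module with the smaller training error receives the larger weight, which is the behavior the rule base is designed to produce.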
Fig. 4
MFs of input $e_1$.
Fig. 5
MFs of input $e_2$.
Fig. 6
MFs of output $w_1$.
Fig. 7
MFs of output $w_2$.
In the outputs (Figs. 6 and 7), it is assumed that the weights are granulated into low, medium and high. These linguistic values are modeled with Gaussian MFs.

Fig. 8 shows one view of the nonlinear surface representing the fuzzy model, in this case the relation of $w_1$ with respect to the errors $e_1$ and $e_2$. Fig. 9 shows the corresponding view for $w_2$ with respect to the errors.
Fig. 8
Surface representing the type-3 fuzzy model for $w_1$.
Fig. 9
Surface representing the type-3 fuzzy model for $w_2$.
In Fig. 10 we illustrate the inference for a particular value of one of the inputs, and then in Fig. 11 the type-reduction and defuzzification are presented.
Fig. 10
Inference process for a particular value of x.
Fig. 11
Type reduction and defuzzification for a particular value.
Simulation results
The experiments were performed with a dataset from the Humanitarian Data Exchange (HDX) (The Humanitarian Data Exchange (HDX), 2022), which includes COVID-19 data from countries where cases occurred from January 22, 2020 to January 2022; the last 15 days are used for testing.

Table 2 shows the resulting errors of training the two modules of the ensemble ($e_1$ and $e_2$) and the corresponding weights obtained by the interval type-3 system of the previous section. The results of Table 2 are for five countries: France, Germany, Japan, Poland and the USA. The predicted values of the modules are then combined with the weighted average equation, using the weights produced by the fuzzy system. The modules of the neural network were trained with the COVID-19 time series from January 2020 to January 2022, and the last 15 days are used for testing and comparing with the real values. Recurrent neural networks are used, with three delays, 300 training epochs, and backpropagation with momentum and an adaptive learning rate. There are three layers in all the networks.
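The data preparation implied by this setup can be sketched as follows, assuming that "three delays" means each target value is predicted from the three preceding values and that the final 15 samples are held out for testing. The helper names and the synthetic series are hypothetical; the network training itself is not shown.

```python
def make_lagged_samples(series, delays=3):
    """Pair each value with the `delays` values that precede it."""
    inputs, targets = [], []
    for t in range(delays, len(series)):
        inputs.append(list(series[t - delays:t]))
        targets.append(series[t])
    return inputs, targets


def split_last_days(inputs, targets, test_days=15):
    """Hold out the final `test_days` samples for testing."""
    if test_days <= 0 or test_days >= len(targets):
        raise ValueError("test_days must be between 1 and len(targets) - 1")
    return (inputs[:-test_days], targets[:-test_days],
            inputs[-test_days:], targets[-test_days:])


# Synthetic cumulative case counts, used only to show the shapes involved.
cases = [100 + 7 * day for day in range(40)]
X, y = make_lagged_samples(cases, delays=3)
X_train, y_train, X_test, y_test = split_last_days(X, y, test_days=15)
print(len(X_train), len(X_test))
```

Splitting by position rather than at random keeps the test period strictly after the training period, which matches the forecasting setting described above.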
Table 2
Results of the modules for the ensemble and the weights from the IT3 fuzzy system.

| Country | $e_1$ | $e_2$ | $w_1$ | $w_2$ |
|---|---|---|---|---|
| France | 1.0 | 0.82 | 0.1653 | 0.1866 |
| Germany | 0.956 | 0.11 | 0.0949 | 0.9137 |
| Japan | 1.0 | 0.86 | 0.1135 | 0.1183 |
| Poland | 0.09583 | 1.0 | 0.9208 | 0.0879 |
| USA | 0.70 | 1.0 | 0.4968 | 0.1393 |
In Table 3 the results for the prediction of France are shown. In Table 4 we show the results for Germany and the prediction is illustrated in Fig. 12. In Table 5 we show the results for Japan and the prediction is illustrated in Fig. 13. In addition, we show in Table 6 and Fig. 14 the results for Poland. Finally, we show in Table 7 the prediction for USA.
Table 3
Results of the prediction with weighted average and comparison with real values for France.

| $P_1$ | $P_2$ | $P_{WA}$ | $P_{Real}$ |
|---|---|---|---|
| 6605410.19 | 7008247.76 | 6819020.56 | 7075244 |
| 6614139.24 | 7017080.51 | 6827804.6 | 7079005 |
| 6622399.29 | 7019209.37 | 6832813.5 | 7093651 |
| 6627109.31 | 7030190.08 | 6840848.64 | 7106147 |
| 6638009.1 | 7038745.08 | 6850505.07 | 7109125 |
| 6646165.48 | 7040257.54 | 6855138.42 | 7128903 |
| 6651161.41 | 7055100.1 | 6865355.67 | 7149118 |
| 6666269.19 | 7069013.04 | 6879829.87 | 7168026 |
| 6681418.33 | 7081862.44 | 6893759.54 | 7188721 |
| 6696029.19 | 7096088.87 | 6908166.55 | 7211399 |
| 6711976.63 | 7111599.88 | 6923882.57 | 7231148 |
| 6728648.35 | 7124729.52 | 6938676.05 | 7235966 |
| 6741361.56 | 7126988.88 | 6945845.95 | 7266361 |
| 6748887.39 | 7149216.18 | 6961167.45 | 7296757 |
| 6771633.28 | 7169428.36 | 6982569.8 | 7330086 |
Table 4
Results of the prediction with weighted average and comparison with real values for Germany.

| $P_1$ | $P_2$ | $P_{WA}$ | $P_{Real}$ |
|---|---|---|---|
| 4874657.83 | 8186604.82 | 7874981.02 | 7866784 |
| 4877993.44 | 8284832.8 | 7964280.49 | 7943959 |
| 4880960.91 | 8372079.56 | 8043597.35 | 7988210 |
| 4882089.4 | 8414449.58 | 8082086.92 | 8021339 |
| 4882788.23 | 8444313.88 | 8109207.02 | 8104157 |
| 4886304.01 | 8546418.12 | 8202034.98 | 8222262 |
| 4891326.88 | 8696895.89 | 8338826.79 | 8361262 |
| 4896263.17 | 8864841.73 | 8491434.92 | 8502132 |
| 4900498.6 | 9027704.27 | 8639372.11 | 8635461 |
| 4903941.72 | 9175951.89 | 8773994.95 | 8716804 |
| 4905466.24 | 9255829.58 | 8846500.33 | 8773030 |
| 4906195.81 | 9304413.44 | 8890581.54 | 8909503 |
| 4909695.04 | 9465430.18 | 9036777.33 | 9088672 |
| 4914154.61 | 9684141.65 | 9235329.67 | 9317280 |
| 4918690.59 | 9948751.83 | 9475469.25 | 9477603 |
Fig. 12
Comparison of prediction with respect to real values for Germany.
Table 5
Results of the prediction with weighted average and comparison with real values for Japan.

| $P_1$ | $P_2$ | $P_{WA}$ | $P_{Real}$ |
|---|---|---|---|
| 1421031.66 | 1168372.61 | 1292086.17 | 1209082 |
| 1435781.23 | 1191360.06 | 1311039.97 | 1234423 |
| 1449848.17 | 1213401.89 | 1329176.93 | 1260415 |
| 1462806.61 | 1235123.08 | 1346607.47 | 1286077 |
| 1474669.31 | 1255662.63 | 1362898.43 | 1308422 |
| 1484070.3 | 1271603.87 | 1375637.26 | 1325340 |
| 1517540.24 | 1340270.56 | 1427069.99 | 1420783 |
| 1525625.61 | 1357837.38 | 1439994.25 | 1443642 |
| 1532582.42 | 1373612.4 | 1451451.47 | 1462979 |
| 1537962.41 | 1385600.95 | 1460204.16 | 1476653 |
| 1541193.63 | 1392303.61 | 1465207.05 | 1494372 |
| 1545591.19 | 1406220.89 | 1474463.04 | 1514400 |
| 1545591.19 | 1406220.89 | 1474463.04 | 1514400 |
| 1551355.7 | 1421248.78 | 1484955.15 | 1532616 |
| 1556035.67 | 1432831.96 | 1493158.2 | 1549337 |
Fig. 13
Comparison of prediction with respect to real values for Japan.
Table 6
Results of the prediction with weighted average and comparison with real values for Poland.

| $P_1$ | $P_2$ | $P_{WA}$ | $P_{Real}$ |
|---|---|---|---|
| 3995819.87 | 3621223.9 | 3963176.88 | 3881349 |
| 4003077.22 | 3637991.57 | 3971262.98 | 3903445 |
| 4008232.79 | 3647846.54 | 3976828.06 | 3923472 |
| 4013534.37 | 3659477.29 | 3982681.17 | 3942864 |
| 4018723.82 | 3671368.56 | 3988454.63 | 3958840 |
| 4022446.88 | 3679114.97 | 3992528.3 | 3968450 |
| 4024370.78 | 3682395.61 | 3994570.43 | 3982257 |
| 4028881.2 | 3694834.14 | 3999771.72 | 4000270 |
| 4033453.12 | 3706100.13 | 4004926.97 | 4017420 |
| 4037113.47 | 3714046.58 | 4008960.82 | 4032796 |
| 4040580.05 | 3722064.06 | 4012823.97 | 4043585 |
| 4042683.73 | 3726273.81 | 4015111.18 | 4049838 |
| 4043955.25 | 3728860.68 | 4016497.32 | 4054865 |
| 4045314.96 | 3732474.08 | 4018053.42 | 4064715 |
| 4048172.85 | 3740829.22 | 4021390.35 | 4080282 |
Fig. 14
Comparison of prediction with respect to real values for Poland.
Table 7
Results of the prediction with weighted average and comparison with real values for USA.

| $P_1$ | $P_2$ | $P_{WA}$ | $P_{Real}$ |
|---|---|---|---|
| 61326847 | 60718991.9 | 61193732.4 | 64210668 |
| 62011425.7 | 61319440.6 | 61859887.4 | 65069619 |
| 62675140.7 | 61930734.2 | 62512122.6 | 65463200 |
| 62750945.4 | 62101842.9 | 62608798 | 65930556 |
| 62982269.4 | 62412455.5 | 62857485.4 | 66590148 |
| 63622278.3 | 62921873.6 | 63468896.2 | 67693339 |
| 64773747.8 | 63794454.8 | 64559291.7 | 68684431 |
| 65636754.3 | 64470771.5 | 65381414.9 | 69388781 |
| 65982382.4 | 64869656.6 | 65738705.8 | 70206083 |
| 66548711.7 | 65413568.4 | 66300125.8 | 70490987 |
| 66527725.9 | 65492563 | 66301034.9 | 70845794 |
| 66625713.4 | 65702223.5 | 66423477.7 | 71741698 |
| 67597051.5 | 66400650.5 | 67335050.8 | 72257016 |
| 67982681.7 | 66685204.4 | 67698546.2 | 72910136 |
| 68386962.5 | 67101527.1 | 68105464.1 | 73427335 |
According to Table 3, the prediction results for France are relatively close to the real values, validating the proposed model. The prediction results for Germany are very good according to Table 4 and can also be visualized in Fig. 12, where both the forecasted and real values are plotted and can be seen to be very close. The prediction results for Japan are very good according to Table 5 and can also be visualized in Fig. 13, where both the forecasted and real values are plotted and can be seen to be very close. According to Table 6, the predicted results for Poland for the 15 days are very close to the real values, and this can also be appreciated in Fig. 14, where both the real and predicted values are plotted and can be seen to be close. Finally, in Table 7 we can notice once more that the predicted and real values are close for the case of the United States.

In addition, we have compared the prediction of type-3 with respect to type-2 fuzzy aggregation, using the same weighted average approach, to show the advantage of the proposal. Table 8 summarizes a comparison of the prediction errors over the same period of time for 12 countries, in which type-3 is better in 11 of the 12 countries.
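Table 8
Comparison of predictions for type-3 versus type-2 in 12 countries based on Mean Squared Error (MSE).

| Country | Type-2 [19] Best | Type-2 [19] Avg | Type-2 [19] Worst | Type-3 (This paper) Best | Type-3 (This paper) Avg | Type-3 (This paper) Worst |
|---|---|---|---|---|---|---|
| Brazil | $3.04 \times 10^{-6}$ | $1.97 \times 10^{-2}$ | $1.42 \times 10^{-1}$ | $1.84 \times 10^{-6}$ | $2.46 \times 10^{-2}$ | $1.06 \times 10^{-1}$ |
| China | $1.84 \times 10^{-3}$ | $6.98 \times 10^{-2}$ | $2.96 \times 10^{-1}$ | $5.22 \times 10^{-4}$ | $2.97 \times 10^{-2}$ | $1.61 \times 10^{-1}$ |
| France | $6.07 \times 10^{-6}$ | $2.06 \times 10^{-2}$ | $1.94 \times 10^{-1}$ | $4.13 \times 10^{-6}$ | $7.11 \times 10^{-3}$ | $6.06 \times 10^{-2}$ |
| Germany | $8.02 \times 10^{-4}$ | $8.55 \times 10^{-2}$ | $4.11 \times 10^{-1}$ | $4.09 \times 10^{-5}$ | $3.02 \times 10^{-2}$ | $1.01 \times 10^{-1}$ |
| India | $1.56 \times 10^{-7}$ | $8.89 \times 10^{-3}$ | $1.54 \times 10^{-1}$ | $3.05 \times 10^{-7}$ | $3.01 \times 10^{-3}$ | $2.05 \times 10^{-2}$ |
| Iran | $1.12 \times 10^{-5}$ | $1.78 \times 10^{-2}$ | $9.82 \times 10^{-2}$ | $7.56 \times 10^{-7}$ | $1.43 \times 10^{-2}$ | $1.05 \times 10^{-1}$ |
| Italy | $7.57 \times 10^{-6}$ | $4.54 \times 10^{-2}$ | $2.92 \times 10^{-1}$ | $1.24 \times 10^{-5}$ | $1.76 \times 10^{-2}$ | $8.32 \times 10^{-2}$ |
| Mexico | $2.86 \times 10^{-5}$ | $9.14 \times 10^{-3}$ | $1.81 \times 10^{-1}$ | $1.48 \times 10^{-5}$ | $1.49 \times 10^{-3}$ | $3.00 \times 10^{-2}$ |
| Poland | $8.31 \times 10^{-5}$ | $2.05 \times 10^{-2}$ | $4.28 \times 10^{-1}$ | $5.71 \times 10^{-5}$ | $6.08 \times 10^{-3}$ | $5.85 \times 10^{-2}$ |
| Spain | $5.82 \times 10^{-4}$ | $9.17 \times 10^{-4}$ | $1.54 \times 10^{-3}$ | $5.56 \times 10^{-4}$ | $8.37 \times 10^{-4}$ | $1.54 \times 10^{-3}$ |
| United Kingdom | $2.80 \times 10^{-5}$ | $7.07 \times 10^{-3}$ | $1.62 \times 10^{-1}$ | $2.69 \times 10^{-4}$ | $1.26 \times 10^{-2}$ | $9.98 \times 10^{-2}$ |
| USA | $3.15 \times 10^{-6}$ | $8.02 \times 10^{-3}$ | $6.04 \times 10^{-2}$ | $6.26 \times 10^{-7}$ | $5.32 \times 10^{-3}$ | $9.49 \times 10^{-2}$ |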
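For reference, the kind of summary reported in Table 8 (best, average and worst MSE) could be computed along the following lines. The sketch assumes that the three columns summarize the MSE over repeated runs and that errors are measured on a normalized series; neither detail is stated in the text, and the sample values below are hypothetical rather than data from the paper.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between two equal-length series."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def summarize_runs(errors):
    """Best (lowest), average and worst (highest) error over repeated runs."""
    return {"best": min(errors), "avg": sum(errors) / len(errors), "worst": max(errors)}


# Hypothetical normalized values for three runs; not data from the paper.
actual = [0.10, 0.20, 0.30]
runs = [[0.10, 0.21, 0.29], [0.12, 0.18, 0.33], [0.10, 0.20, 0.30]]
summary = summarize_runs([mean_squared_error(actual, run) for run in runs])
print(summary)
```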
We also compared the weighted average using type-2 with respect to type-1 fuzzy logic in a previous paper, showing that type-2 was better (Melin et al., 2021), so we can conclude that type-3 fuzzy aggregation outperforms both type-2 and type-1 in COVID-19 prediction.

From the previous tables we can conclude that the interval type-3 fuzzy approach for aggregation in ensembles of neural networks is a good alternative for the prediction of complex time series, like COVID-19. Of course, the proposed approach can be extended to ensembles with more modules by using fuzzy systems with more inputs and outputs, and the design can be optimized with metaheuristics for improving results. Finally, we can also mention that the ensemble approach with fuzzy aggregation could be used for multiple time series problems, because the modules can be used to model each of the time series, and the aggregator can then combine the predictions of the modules.
Conclusions
In this work a new approach for fuzzy aggregation in ensembles of neural networks has been outlined. The aggregator in an ensemble is utilized to combine the outputs of the networks forming the ensemble, in such a way that the total output is better than the outputs of the individual modules. In our approach, a fuzzy system estimates the weights that are assigned to the outputs when combining them in a weighted average. The uncertainty in the aggregation process is modeled with interval type-3 fuzzy logic. Simulation results show the potential of the approach to outperform other methods in the literature, such as type-2 and type-1 fuzzy aggregators. We have utilized COVID-19 time series of several countries to test the performance of the proposed approach. As future work, we plan to use our approach in other applications, like those in Cervantes and Castillo, 2015, Melin et al., 2020b, Castillo et al., 2014, Rubio et al., 2017. We also plan to optimize the type-3 system with metaheuristics for improving the results. In addition, we plan to combine type-3 with other intelligent techniques, such as deep learning as in Tian et al., 2022, Aly et al., 2021, to build strong hybrid models, and to consider other time series prediction problems. Finally, we could also consider in the future general type-3 fuzzy models instead of interval type-3, as outlined in Castillo et al. (2022).
CRediT authorship contribution statement
Oscar Castillo: Conceptualization, Methodology, Validation, Writing – original draft. Juan R. Castro: Conceptualization, Validation, Formal analysis, Software. Martha Pulido: Conceptualization, Methodology, Validation, Formal analysis, Software. Patricia Melin: Methodology, Software, Visualization, Project administration, Writing – review & editing, Funding acquisition.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.