
Hybrid approaches for container traffic forecasting in the context of anomalous events: The case of the Yangtze River Delta region in the COVID-19 pandemic.

Dong Huang¹,², Manel Grifoll², Jose A Sanchez-Espigares³, Pengjun Zheng¹, Hongxiang Feng¹.

Abstract

The COVID-19 pandemic had a significant impact on container transportation. Accurate forecasting of container throughput is critical for policymakers and port authorities, especially in the context of the anomalous events of the COVID-19 pandemic. In this paper, we firstly proposed hybrid models for univariate time series forecasting to enhance prediction accuracy while eliminating the nonlinearity and multivariate limitations. Next, we compared the forecasting accuracy of different models with various training dataset extensions and forecasting horizons. Finally, we analysed the impact of the COVID-19 pandemic on container throughput forecasting and container transportation. An empirical analysis of container throughputs in the Yangtze River Delta region was performed for illustration and verification purposes. Error metrics analysis suggests that SARIMA-LSTM2 and SARIMA-SVR2 (configuration 2) have the best performance compared to other models and they can better predict the container traffic in the context of anomalous events such as the COVID-19 pandemic. The results also reveal that, with an increase in the training dataset extensions, the accuracy of the models is improved, particularly in comparison with standard statistical models (i.e. SARIMA model). An accurate prediction can help strategic management and policymakers to better respond to the negative impact of the COVID-19 pandemic.


Keywords: COVID-19 pandemic; Hybrid model; Machine learning model; SARIMA model; Yangtze River Delta multi-port region

Year: 2022 PMID: 36092946 PMCID: PMC9449872 DOI: 10.1016/j.tranpol.2022.08.019

Source DB: PubMed Journal: Transp Policy (Oxf) ISSN: 0967-070X

Introduction

Container transportation has become one of the most essential activities in the world's economic and logistics chain (Onut et al., 2011; Balci et al., 2018) and container throughput has been widely recognized as the most important indicator of port activity (Jiang et al., 2022; Gao et al., 2016; Grifoll et al., 2018). For this reason, accurate forecasting of container throughput plays a crucial role, regardless of the port development strategies (Feng et al., 2019), infrastructure investments or maritime supply chain (Ha et al., 2019). Accurate forecasting can also help strategic management and policy development by allowing better real-time decision-making (Stavroulakis and Papadimitriou, 2017), especially in the context of anomalous events such as the COVID-19 pandemic. In addition, port authorities can use forecasting methods for route optimisation, resources assignment and terminal management (Grifoll, 2019; Grifoll et al., 2021; Tsai and Huang, 2017; Levine et al., 2009). Anomalous events are generally characterised by their abruptness and unpredictability, such as the recent COVID-19 pandemic. Patients with COVID-19 were first detected in Wuhan, the capital city of Hubei Province of China, in December 2019. The outbreak of COVID-19 has posed unprecedented challenges to human beings and caused far-reaching consequences for a highly globalised world economy (Narasimha et al., 2021; Zhao et al., 2022). As container transport is closely linked to the world's economic developments, consumer activity and supply chains, container shipping has been severely affected by the COVID-19 pandemic (Guerrero et al., 2022; UNCTAD, 2020). According to Koyuncu et al. (2021), there was a 15.8% drop in total container throughput in China due to the lockdown strategy and deferred deliveries. When compared to the same period in 2019, the total containers handled at Chinese ports declined by 10.1% in the first two months of 2020. 
However, inaccurate forecasting of container throughput may also lead to avoidable financial losses and management confusion (Feng et al., 2021; Xie et al., 2019). In this sense, it is necessary and beneficial for policymakers and port authorities to explore new methods to capture anomalous events and analyse the influence of the COVID-19 pandemic. Consequently, container throughput forecasting has attracted growing attention and numerous forecasting methodologies have been proposed. The Autoregressive Integrated Moving Average (ARIMA) model is the most widely used approach for container throughput forecasting; it is convenient and computationally efficient and outperforms other models in some cases, especially in short-term forecasting (Geng et al., 2015). The ARIMA model has also been successfully applied in many other fields of forecasting, such as economic, traffic and environmental problems (Grifoll et al., 2021; Nepal et al., 2020). The ARIMAX model is based on the ARIMA model, where “X” stands for “exogenous” external information, which can improve forecasting performance. The Seasonal Autoregressive Integrated Moving Average (SARIMA) model is based on ARIMA and brings the seasonal factor “S” into the ARIMA model to exploit seasonal fluctuations in the time series (Ruiz-Aguilar et al., 2014); the same applies to SARIMAX. An Artificial Neural Network (ANN) is a mathematical model that simulates neuronal activity and is an information processing system based on emulating the structure and function of the brain's neural networks. ANNs are excellent at extracting nonlinear relationships and dynamic patterns and are widely used in forecasting tasks (Ruiz-Aguilar et al., 2014). Given these characteristics, it is no surprise that ANNs have achieved numerous successes in transportation forecasting (Gosasang et al., 2011).
Hua and Faghri (1994) first applied ANN to traffic prediction and, since then, more and more ANN-based forecasting models have emerged to improve traffic forecasting performance. Typical examples include Back Propagation Neural Networks (BPNN) (Kunnapapdeelert and Thepmongkorn, 2020), Feed Forward Neural Networks (FFNN) (Do et al., 2019), Radial Basis Function (Zhu et al., 2014), and Recurrent Neural Networks (RNN) (Li et al., 2018). Meanwhile, ANN has been used to compare traditional prediction models, to demonstrate the promising performance of ANN for specific applications (Sayed and Razavi, 2000). In this regard, Karlaftis and Vlahogianni (2011) compared ANN with classical statistical methods, and the results show that ANN is more flexible and has higher accuracy than classical statistical models. Usually, traditional RNN fails to capture the input sequence's long temporal dependence (Ma et al., 2015); ANN prediction models usually need more training samples, while container throughput datasets are limited. However, Long Short-Term Memory (LSTM) can overcome those problems (Geng et al., 2015). A Support Vector Machine (SVM) was proposed by Vapnik (2013). When SVM is used to solve a regression problem, it is called Support Vector Regression (SVR) and SVR has eliminated the limitation of ANN on the size of the dataset (Cao and Cai, 2007). SVR has several distinct benefits when it comes to solving small sample, nonlinear, and high-dimensional forecasting problems (Vapnik et al., 1997; Vapnik, 2013). Therefore, SVR has been widely applied in many fields, e.g. Huang and Hong (2009) used SVR to forecast the exchange rate, and Hong et al. (2011) applied SVR to forecast tourist arrivals. According to the research findings in transportation prediction, the single model is incapable of capturing nonlinear behaviour (Karlaftis and Vlahogianni, 2011). 
Given these properties, hybrid forecasting techniques have received more attention and extensive research has shown that hybrid forecasting techniques outperform the single model in terms of forecasting accuracies (Zheng et al., 2006). Hybrid models are mainly divided into two categories. One category applies the optimisation algorithm to optimise the hyperparameters of another forecasting model, such as Ping and Fei (2013), which applies genetic algorithms (GA) to optimise the backpropagation neural network model (BPNN) for forecasting the container throughput in Guangdong Province. These results showed that GA-BPNN has better accuracy. Mak and Yang (2007) presented a modified version of the support vector machine (SVM) to forecast container throughput in Hong Kong, which shows an impressive performance in the area of time series analysis. The other category combines two forecasting models, one used to forecast the linear component and another used to forecast the nonlinear component, such as the Gray-SARIMA dynamic model (Carmona-Benítez and Nieto, 2020), the ANN-SARIMA model (Ruiz-Aguilar et al., 2014) and the GA-SVR-SARIMA model (Hong et al., 2011). Usually, the traditional statistical models (e.g. SARIMA and ARIMA) are used to predict the linear component and the Machine Learning models (e.g. ANN, SVR and LSTM) are used to predict the nonlinear component. However, the port container traffic time series are difficult to classify as purely linear parts or nonlinear parts and, generally speaking, these time series contain both a linear part and a nonlinear part due to the seasonality, randomness and complexity presented in the time series (Wang et al., 2012; Khashei and Bijari, 2011). Therefore, it is inadequate to apply SARIMA or Machine Learning models to fit the linear part and nonlinear part, respectively. 
Meanwhile, traditional hybrid models are best suited to multivariate forecasting, and the authors have not found research papers on univariate forecasting of port container traffic with hybrid models, despite the increasing interest in port container traffic. Also, anomalous events such as the COVID-19 pandemic usually occur suddenly and unpredictably with asymmetric information and can bring great harm to all walks of life (Jin et al., 2019). A time series containing anomalous events is described as an inherently nonlinear, complex and chaotic dynamic system, which affects the prediction accuracy (Faulkner and Russell, 1997). Based on the above problems, the contributions of this paper are four-fold. First, we proposed a hybrid model to enhance prediction accuracy and remove the nonlinearity and multivariate limitations. Second, we compared the prediction performance of different models for various training dataset extensions and forecasting horizons. Third, we explored the forecasting performance of different models in the context of the COVID-19 pandemic. Finally, we analysed the impact of the COVID-19 pandemic on forecasting work and maritime transportation.

The Yangtze River Delta multi-port system (YRDP) is located in the most developed area of China (see Fig. 1). This area has been investigated from different perspectives. Feng et al. (2020) proposed a novel ternary diagram method to visualise the evolution of YRDP. Huang et al. (2022) explored the temporal and spatial characteristics of YRDP by a compositional data method and the results indicated that the development of YRDP has gone through four stages: the evolution of YRDP is characterised by a tendency towards a ‘multi-core development’ and faces a differentiated pattern of ‘peripheral port challenges’. Veenstra and Notteboom (2011) analysed the level of cargo concentration and the degree of inequality in the operations of the container ports to address the dynamics in YRDP.

Fig. 1

Location of SH, NB, LYG and SZ in YRDP. This figure also gives the statistical description of the four ports. Each group of data represents the maximum value (Max), the minimum value (Min), the mean value (AV) and the standard deviation (STD) of each container traffic time series for each port (units in 10⁴ TEU).

In this paper, the time series of the container throughput of Shanghai port (SH), Ningbo port (NB), Suzhou port (SZ) and Lianyungang port (LYG) in YRDP were applied for illustration and verification purposes. We selected those four ports because SH and NB are international ports, ranked first and third in the world in terms of container traffic, while LYG and SZ are small-scale regional ports in China; thus the forecasting work covers the container traffic time series of both large and small ports, making the work more convincing.

The organisation of this paper is as follows. Section 2 describes the methodology, including the SARIMA model, LSTM model, SVR model, and two hybrid models, each with two configurations (configuration 1: S-L1, S–S1; configuration 2: S-L2, S–S2). In Section 3, the experimental procedure is introduced. The empirical results and discussion are presented in Section 4. Finally, conclusions and future research are proposed in Section 5.

Methodology

This section shows the analytical methods used in this contribution, including SARIMA, SVR, LSTM and the hybrid models.

SARIMA

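A more sophisticated and accurate algorithm for analysing and forecasting time series data is the Box-Jenkins method, including the autoregressive model AR($p$), the moving average model MA($q$), the autoregressive moving average model ARMA($p$, $q$), and the Autoregressive Integrated Moving Average model ARIMA($p$, $d$, $q$). The form of the ARIMA($p$, $d$, $q$) model is as follows:

$$\varphi_p(B)\,(1-B)^d x_t = \theta_q(B)\,a_t \tag{1}$$

Adding a seasonal factor for the SARIMA($p$, $d$, $q$)($P$, $D$, $Q$)$_s$ model:

$$\Phi_P(B^s)\,(1-B^s)^D x_t = \Theta_Q(B^s)\,a_t \tag{2}$$

The following is a compact expression of the SARIMA model:

$$\varphi_p(B)\,\Phi_P(B^s)\,(1-B)^d (1-B^s)^D x_t = \theta_q(B)\,\Theta_Q(B^s)\,a_t \tag{3}$$

where $B$ is the backshift operator ($B x_t = x_{t-1}$), $s$ is the seasonal period, $a_t$ is a white-noise error term, $\varphi_p(B)$ is the AR($p$) operator, $\theta_q(B)$ is the MA($q$) operator, $\Phi_P(B^s)$ is the seasonal AR($P$) operator, and $\Theta_Q(B^s)$ is the seasonal MA($Q$) operator. The detailed parameters are presented in Appendix A.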
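The differencing part of a SARIMA model, $(1-B)^d(1-B^s)^D$, can be illustrated with a short sketch. The helper names `difference` and `sarima_difference` and the toy series are assumptions made for illustration only; the code applies the differencing operators and does not estimate any AR or MA coefficients, which the paper obtains in R.

```python
def difference(series, lag=1):
    """Apply (1 - B^lag) once: x_t - x_{t-lag}."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def sarima_difference(series, d=0, D=0, s=12):
    """Apply (1 - B)^d (1 - B^s)^D to a series (hypothetical helper)."""
    out = list(series)
    for _ in range(D):          # seasonal differencing, D times
        out = difference(out, lag=s)
    for _ in range(d):          # regular differencing, d times
        out = difference(out, lag=1)
    return out


# Toy monthly series: linear trend plus a repeating 12-month pattern.
pattern = [0, 2, 5, 3, 1, 0, -1, -3, -2, 0, 4, 6]
toy = [10 + 0.5 * t + pattern[t % 12] for t in range(36)]

# Seasonal differencing removes the pattern; regular differencing removes
# the constant that the linear trend leaves behind.
stationary = sarima_difference(toy, d=1, D=1, s=12)
```

Each seasonal difference shortens the series by `s` points and each regular difference by one point, which is worth keeping in mind with only 60 to 84 monthly observations.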

Support vector machine (SVM)

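The SVM algorithm uses kernel functions to map data from a low-dimensional space to a high-dimensional space. This method mitigates the curse of dimensionality and reduces computational complexity while having better scalability and an improved ability to fit nonlinear data (Moscoso-López et al., 2016). Compared to traditional neural network algorithms, the SVM model uses structural risk minimisation, and its scalability is one of the advantages of the model. For a given sample $\{(x_i, y_i)\}_{i=1}^{n}$, $n$ is the sample volume, $x_i$ is the input vector, and $y_i$ is the output target. The SVM model maps the input into a high-dimensional feature space through $\varphi(x)$ and then performs a function approximation in the feature space using a linear regression function. SVM for regression is called SVR:

$$f(x) = w^{T}\varphi(x) + b \tag{4}$$

where $w$ is the weight vector, $\varphi(x)$ denotes the kernel function used for the input vector $x$, and $b$ is the bias term. According to statistical learning theory, SVM obtains $w$ and $b$ and fits the regression function by minimising the objective function:

$$\min_{w,\,b}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n} L_{\varepsilon}\bigl(y_i, f(x_i)\bigr) \tag{5}$$

where $C$ denotes the regularisation parameter, $L_{\varepsilon}$ represents the loss function, and the $\varepsilon$-insensitive loss function is defined as:

$$L_{\varepsilon}\bigl(y, f(x)\bigr) = \begin{cases} 0, & |y - f(x)| \le \varepsilon \\ |y - f(x)| - \varepsilon, & \text{otherwise} \end{cases} \tag{6}$$

where $\varepsilon$ is the tolerance error. Through Lagrange multiplier techniques, Eq. (5) leads to the following constrained optimisation problem:

$$\min_{w,\,b,\,\xi,\,\xi^{*}}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\bigl(\xi_i + \xi_i^{*}\bigr) \tag{7}$$

subject to the constraints $y_i - w^{T}\varphi(x_i) - b \le \varepsilon + \xi_i^{*}$, $\ w^{T}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i$, and $\xi_i, \xi_i^{*} \ge 0$ for $i = 1, \ldots, n$. The training error over $\varepsilon$ is denoted as $\xi_i^{*}$, while the training error less than $-\varepsilon$ is denoted as $\xi_i$. The parameter vector $w$ in Eq. (4) is derived by solving the quadratic optimisation problem with constraints:

$$w = \sum_{i=1}^{n}\bigl(\beta_i^{*} - \beta_i\bigr)\varphi(x_i) \tag{8}$$

The Lagrange multipliers $\beta_i^{*}$ and $\beta_i$ are derived by solving a quadratic program. Finally, the SVR regression is calculated as:

$$f(x) = \sum_{i=1}^{n}\bigl(\beta_i^{*} - \beta_i\bigr)K(x_i, x) + b \tag{9}$$

where $K(x_i, x_j) = \varphi(x_i)^{T}\varphi(x_j)$ are kernel functions allowing for the mapping of input data into a high-dimensional feature space where a linear regression can be performed. This contribution uses the Gaussian Radial Basis Function as follows:

$$K(x_i, x_j) = \exp\!\left(-\frac{\lVert x_i - x_j\rVert^{2}}{2\sigma^{2}}\right) \tag{10}$$

where $\sigma$ represents the width of the kernel function.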
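The building blocks described in this subsection can be sketched in a few lines of plain Python. The function names are hypothetical, the kernel uses the common $2\sigma^2$ parameterisation (an assumption about the exact form), and `svr_predict` only evaluates a regression function for dual coefficients that are already known; it does not solve the quadratic program.

```python
import math


def rbf_kernel(x, z, sigma=1.0):
    """Gaussian RBF kernel exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Zero inside the eps-tube around the target, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def svr_predict(x, support_vectors, coeffs, bias, sigma=1.0):
    """Evaluate f(x) = sum_i coeff_i * K(x_i, x) + b.

    coeffs stands for the differences of the Lagrange multipliers,
    which are assumed to be given here.
    """
    return bias + sum(c * rbf_kernel(sv, x, sigma)
                      for sv, c in zip(support_vectors, coeffs))
```

A small `sigma` makes each training point influence only its immediate neighbourhood, while a large `eps` ignores more of the small errors; both are typically tuned together with `C`.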

Long Short-Term Memory networks model (LSTM)

LSTM, as a special Recurrent Neural Network, effectively overcomes the vanishing-gradient and exploding-gradient problems found in machine learning (ML) models and has a strong processing capability for temporal data with relatively long intervals and delays (Huang et al., 2021). The LSTM structure consists of a forget gate that controls information transfer, and an input gate and an output gate that decide which signals are forwarded to another node, as shown in Fig. 2.

Fig. 2

LSTM prediction model structure. Each blue box represents a unit of LSTM; for example, the left-hand box is the unit at time $t-1$. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

The sequence $x_t$ is fed into the LSTM encoder, where the input gate is computed as:

$$i_t = \sigma\left(w_{xi}\,x_t + w_{hi}\,h_{t-1} + w_{si}\,s_{t-1} + b_i\right) \tag{12}$$

where $w_{xi}$, $w_{hi}$ and $w_{si}$ represent the weight distribution of different cellular mechanisms, respectively. In Eq. (12), $w_{si}\,s_{t-1}$ denotes the external information variables associated with the input gate, $x_t$ represents the input in the cell, and $h_{t-1}$ represents the generic state at moment $t-1$. Since the cell correlation and implicit node information of the LSTM model are shared, $s_{t-1}$ can be considered as being part of the external input; $b_i$ is the bias vector and $\sigma$ denotes the sigmoid activation function. The mechanisms of the forget and output gates (as well as the associated parameters) are similar to that of the input gate, and the final state values of the hidden cell are given by the activation function (Eq. (14)) to obtain the predictions.
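To make the gate mechanism concrete, the following is a minimal sketch of one step of a single-unit LSTM with scalar input. It follows the widely used textbook formulation (forget, input and output gates plus a tanh candidate state) rather than the paper's exact notation, and it omits any direct cell-state term inside the gates. The parameter values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One step of a single-unit LSTM (standard formulation, scalar input).

    params maps each gate name to (w_x, w_h, b); all values are hypothetical.
    """
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x_t + w_h * h_prev + b

    f_t = sigmoid(pre("forget"))            # share of the old cell state to keep
    i_t = sigmoid(pre("input"))             # share of new information to admit
    o_t = sigmoid(pre("output"))            # share of the cell state to expose
    c_tilde = math.tanh(pre("candidate"))   # candidate cell content
    c_t = f_t * c_prev + i_t * c_tilde      # updated cell state
    h_t = o_t * math.tanh(c_t)              # updated hidden state
    return h_t, c_t


params = {name: (0.5, 0.1, 0.0)
          for name in ("forget", "input", "output", "candidate")}
h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.1]:                   # feed a short toy sequence
    h, c = lstm_step(x, h, c, params)
```

Because the cell state is updated additively through the forget gate, information can persist across many steps, which is the property that makes LSTM suitable for long temporal dependence.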

The hybrid models

The hybrid model can predict more accurately than the single model (Wang et al., 2012; Ruiz-Aguilar et al., 2014). In this paper, we proposed two hybrid models, each with two configurations, to predict the container throughput. Due to the seasonality, complexity and randomness, the time series contains both linear and nonlinear patterns. Therefore, SARIMA and ML-based models are applied to fit the linear and nonlinear patterns, respectively. Then:

$$Y_t = L_t + N_t \tag{15}$$

where $L_t$ is the linear component and $N_t$ represents the nonlinear component. The SARIMA model is applied to fit the linear part and the LSTM model and the SVR model are used to forecast the nonlinear part. Hence, with $\hat{L}_t$ the forecast value of the linear part, the residual $e_t$ at time $t$ is equal to the difference between the true value $Y_t$ and the forecast value $\hat{L}_t$:

$$e_t = Y_t - \hat{L}_t \tag{16}$$

Based on their characteristics, LSTM and SVR can overcome the multivariate limitation and also resolve the nonlinearity of the container throughput time series:

$$\hat{N}_t = f\left(e_{t-1}, e_{t-2}, \ldots, e_{t-n}\right) \tag{17}$$

So, in Eq. (17), $f$ is the nonlinear function calculated by the LSTM model and SVR model. The final forecasting values are obtained:

$$\hat{Y}_t = \hat{L}_t + \hat{N}_t \tag{18}$$

where $\hat{L}_t$ is the linear function calculated by the SARIMA model and $\hat{N}_t$ is the nonlinear function calculated by Eq. (17). The hybrid models in Eq. (18) are composed of SARIMA with LSTM and SARIMA with SVR, respectively. Therefore, these two hybrid models are SARIMA-LSTM and SARIMA-SVR. In this step, we called the hybrid models configuration 1, including SARIMA-LSTM1 (S-L1) and SARIMA-SVR1 (S–S1).

The time series of the container throughput is hardly ever purely linear or nonlinear; it contains both linear and nonlinear patterns. So, to overcome this point and further improve the forecasting performance of configuration 1, we proposed configuration 2 (based on configuration 1) as follows:

$$\hat{Y}^{(2)}_t = f\left(\hat{Y}_t, \hat{L}_t, \hat{N}_t\right) \tag{19}$$

where $f$ is the nonlinear function calculated by the LSTM model and SVR model, $\hat{Y}_t$ is calculated by Eq. (18), $\hat{L}_t$ is calculated by the SARIMA model and $\hat{N}_t$ is calculated by Eq. (17). Eq. (19) is configuration 2 of the hybrid models, including SARIMA-LSTM2 (S-L2) and SARIMA-SVR2 (S–S2).
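The additive structure of configuration 1 (a linear forecast corrected by a forecast of its residuals) can be sketched as follows. This is only an illustration under stated assumptions: the fitted values in `l_fit` stand in for SARIMA output, and `fit_lag1`, a one-lag least-squares rule, is a deliberately simple placeholder for the LSTM or SVR residual learner. All names and numbers are hypothetical.

```python
def residuals(y_true, linear_fit):
    """In-sample residuals: true value minus linear forecast."""
    return [y - l for y, l in zip(y_true, linear_fit)]


def fit_lag1(res):
    """Least-squares slope for e_t ~ a * e_{t-1} (placeholder residual learner)."""
    num = sum(res[t] * res[t - 1] for t in range(1, len(res)))
    den = sum(res[t - 1] ** 2 for t in range(1, len(res)))
    return num / den if den else 0.0


def hybrid_config1(y_true, linear_fit, linear_next):
    """One-step hybrid forecast: linear forecast plus residual forecast."""
    res = residuals(y_true, linear_fit)
    a = fit_lag1(res)
    n_hat = a * res[-1]          # forecast of the next residual
    return linear_next + n_hat


y = [100.0, 104.0, 103.0, 108.0, 107.0, 112.0]        # toy observations
l_fit = [101.0, 102.0, 104.0, 106.0, 108.0, 110.0]    # pretend SARIMA fitted values
y_hat = hybrid_config1(y, l_fit, linear_next=112.0)   # pretend SARIMA forecast: 112
```

In this toy case the residuals alternate in sign, so the residual learner pulls the linear forecast of 112 downwards. Swapping `fit_lag1` for a nonlinear learner changes only how `n_hat` is produced, not the additive combination.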

Experimental procedures

This section shows the experimental procedure. First, we describe the container traffic time series used in the paper and the division of the dataset. Second, the Anomaly Detection Method (ADM) is introduced to detect anomalous points. Third, we present the modelling process, including model training, model loading and forecasting. Finally, the performance of the different models is assessed. The LSTM, SVR and hybrid models were implemented in Python, using its LSTM and SVR functions. The SARIMA model was developed in R using the forecast package, whose auto.arima function is convenient for generating the parameters. Table 1 displays the explanation of some key notation.

Table 1

Key notation.

Seasonal Autoregressive Integrated Moving Average (SARIMA)
p	The non-seasonal autoregressive order
d	The non-seasonal differencing order
q	The non-seasonal moving average order
P	The seasonal autoregressive order
D	The seasonal differencing order
Q	The seasonal moving average order
φp	The AR(p) operator
θq	The MA(q) operator
ΦP	The seasonal AR(P) operator
ΘQ	The seasonal MA(Q) operator
xt	Container traffic time series
Support Vector Machine (SVM)
φ(x)	The kernel function
b	The bias term
C	The regularisation parameter
ε	The tolerance error
σ	The width of the kernel function
βi∗, βi	The Lagrange multipliers
Long Short-Term Memory Networks model (LSTM)
ft	The forget gate
it	The input gate
C̃t	The output gate
wxi, whi, wsi	The weight distribution of different cellular mechanisms
The hybrid models
Lt	The linear component
Nt	The nonlinear component
L̂t	The forecast value of linear component
N̂t	The forecast value of nonlinear component
Yt	The true value
Anomaly Detection
IQR	The interquartile range


Dataset description and division

In this work, the container throughput time series of SH, NB, LYG and SZ were analysed. These time series datasets are shown in Fig. 3, which contains monthly records related to container traffic from 2012 to 2021. All of the data came from the Ministry of Transport of the People's Republic of China (https://www.mot.gov.cn/).

Fig. 3

Monthly container throughput in the four ports; the red vertical line marks the drop due to the anomalous events (the COVID-19 pandemic). (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

In this paper, the time series datasets were divided into two periods: the first period is prior to COVID-19 (pre-COVID-19), from 2012 to 2019, and the second period is post-COVID-19, from January 2020 to December 2021. For pre-COVID-19, we compared the forecasting accuracy of different models with various training extensions by splitting the training datasets into training expansion 84 (January 2012 to December 2018), training expansion 72 (January 2013 to December 2018), and training expansion 60 (January 2014 to December 2018). At the same time, we compared the accuracy of different forecasting horizons with different models. The different forecasting horizons were defined as follows: horizon 12 (January 2017 to December 2017), horizon 24 (January 2017 to December 2018) and horizon 36 (January 2017 to December 2019).

For the post-COVID-19 period, we predicted the period of January 2021 to December 2021 using different training dataset extensions of the period from January 2014 to December 2020. We again categorised the training dataset into training expansion 84 (January 2014 to December 2020), training expansion 72 (January 2015 to December 2020), and training expansion 60 (January 2016 to December 2020). Because the training dataset extensions of the post-COVID-19 period (from January 2014 to December 2020) have the same number of data points as the training expansions in the pre-COVID-19 period, we compared the accuracy of the training dataset extensions between the pre-COVID-19 period and the post-COVID-19 period to analyse the influence of COVID-19 on the prediction and maritime transportation.
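The training windows listed above all end in the same month and differ only in length, so their start months follow from simple month arithmetic. The sketch below reproduces them; the helper names `month_window` and `training_windows` are hypothetical.

```python
def month_window(end_year, end_month, n_months):
    """Return (start_year, start_month) of an n_months window ending at the given month."""
    end_index = end_year * 12 + (end_month - 1)   # months counted from year 0
    start_index = end_index - n_months + 1
    return start_index // 12, start_index % 12 + 1


def training_windows(end_year, end_month, sizes=(84, 72, 60)):
    """Map each training expansion to ((start_year, start_month), (end_year, end_month))."""
    return {n: (month_window(end_year, end_month, n), (end_year, end_month))
            for n in sizes}


pre_covid = training_windows(2018, 12)    # windows ending December 2018
post_covid = training_windows(2020, 12)   # windows ending December 2020
```

Here `pre_covid[84]` starts in January 2012 and `post_covid[84]` in January 2014, matching the periods stated in the text; equal window lengths are what make the pre- and post-COVID-19 accuracies comparable.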

Anomaly point inspection and detection

Anomalous points of time series are usually expressed as abnormal data points relative to some standard or conventional signals, such as an unexpected peak, unexpected trough, trend change and horizontal translation (Nguyen et al., 2021). The time series consists of a trend, a seasonal and a remainder component. First, we decomposed the series into these three components by the Seasonal-Trend decomposition procedure based on Loess (STL) (Cleveland et al., 1990; Rojo et al., 2017). Second, we removed the trend and seasonal components and then checked whether the remainder component contains anomalous points. For this test, the Anomaly Detection Method uses an inter-quartile range (IQR) of ± 25 of the median, where IQR is the difference between the 25% and 75% quantiles (Cleveland et al., 1990). When the anomaly detection was finished (see Fig. 4), we used the median of each container traffic time series to replace the anomalous points to make the forecasting work accurate.
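The detection and replacement steps can be sketched as follows, assuming the STL remainder has already been computed elsewhere. The fence multiplier `factor=1.5` is the conventional Tukey choice and is an assumption here, since the exact threshold used in the paper is not recoverable from the text; the function names and toy numbers are hypothetical.

```python
import statistics


def iqr_anomalies(remainder, factor=1.5):
    """Indices whose remainder lies outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _, q3 = statistics.quantiles(remainder, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [i for i, r in enumerate(remainder) if r < low or r > high]


def replace_with_median(series, anomaly_idx):
    """Replace flagged points with the median of the whole series."""
    med = statistics.median(series)
    flagged = set(anomaly_idx)
    return [med if i in flagged else v for i, v in enumerate(series)]


# Toy remainder (after removing trend and seasonality) and the matching raw series.
remainder = [0.2, -0.1, 0.3, -0.4, 0.1, -9.0, 0.0, 0.2, -0.3, 0.1]
series = [50.0, 51.0, 52.5, 52.0, 53.0, 44.0, 54.0, 55.0, 55.5, 56.0]

idx = iqr_anomalies(remainder)
cleaned = replace_with_median(series, idx)
```

Testing the remainder rather than the raw series prevents ordinary seasonal peaks and troughs from being flagged as anomalies.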

Fig. 4

Results of ADM for NB, SH, LYG and SZ. The red points represent the anomalies. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Modelling process and assessment criteria and robustness

In the modelling process, random initialisation is the first and most important step. In this paper, we used He's initialisation (tensorflow.keras.initializers.he_normal()) of the TensorFlow module in Python to initialise the parameters (He et al., 2015). The next step is to find the best parameter combinations by the grid search and cross-validation methods in the GridSearchCV function of the scikit-learn module in Python. For the ARIMA model, the auto.arima function in R returns the best parameters. Finally, the Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) were used to evaluate the performances of these models:

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{t=1}^{n}\left|\frac{Y_t - \hat{Y}_t}{Y_t}\right| \tag{20}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|Y_t - \hat{Y}_t\right| \tag{21}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(Y_t - \hat{Y}_t\right)^{2}} \tag{22}$$

where $Y_t$ represents the true values, $\hat{Y}_t$ denotes the forecast values, and $n$ is the number of forecast points.
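The three error criteria are straightforward to compute. The sketch below uses their standard definitions in plain Python; the function names and the toy values are assumptions for illustration, and `mape` assumes that no true value is zero.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(y - f) for y, f in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error in percent; assumes no true value is zero."""
    return 100.0 * sum(abs((y - f) / y) for y, f in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(y_true, y_pred)) / len(y_true))


y_true = [100.0, 200.0, 400.0]   # toy true values
y_pred = [110.0, 180.0, 400.0]   # toy forecasts
```

MAE and RMSE are in the units of the series (10⁴ TEU in the tables), so they are not comparable between large and small ports, whereas MAPE is scale-free. RMSE is never smaller than MAE and penalises large individual errors more heavily.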

Numerical results and discussion

This section presents the predicted results of the hybrid models and benchmark models (e.g. ML models and the SARIMA model). We compared the prediction performance of various models, considering different training dataset extensions and forecasting horizons, and then analysed the impact of the anomalous events of the COVID-19 pandemic on the predictions. Lastly, we provided some managerial insight based on the forecast results.

Forecasting performance considering different training dataset extensions and forecasting horizons

Table 2 shows the forecasting performance of the different models for various training dataset extensions. The forecasting performance was measured by three criteria (i.e. MAE, MAPE, and RMSE). Table 2 indicates that the hybrid models (both configuration 1 and configuration 2) have a better forecasting performance than the SARIMA model and the ML models (i.e. SVR and LSTM). For instance, from the MAE criterion we can see that the largest value among the hybrid models for NB comes from S–S1, ranging from 9.55 to 10.23 for training dataset extensions 84 and 60, respectively. However, the best-performing single model is LSTM, for which the MAE ranges from 10.13 to 10.90, which is larger than that of S–S1. In the same way, the best single model for SH is also LSTM, whose MAE ranges from 19.92 to 20.49, which is much larger than the 14.65 to 16.41 of S–S1, the weakest hybrid model for SH on this criterion. The MAPE and RMSE also support this point. For NB, the MAPE and RMSE of S–S1 range from 4.36 to 4.67 and from 10.23 to 10.89, respectively, whereas the best single model, LSTM, has a MAPE of 9.19 to 9.46 and an RMSE of 10.78 to 11.95. This pattern also applies to LYG and SZ. With the extension of the training dataset, the accuracy increases: most of the error criteria decrease as the training dataset extension grows for all the forecasting models, with exceptions such as the RMSE of SVR for LYG and the MAE of S-L2 for SZ.

Table 2

Forecasting performance comparison for various training dataset extensions. Bold numbers correspond to the best prediction performance for each dataset extension.

		MAE (×10⁴ TEU)			MAPE (%)			RMSE (×10⁴ TEU)
		84	72	60	84	72	60	84	72	60
NB	SVR	12.33	12.47	12.70	9.29	9.48	9.49	13.51	13.76	14.69
	LSTM	10.13	10.88	10.90	9.19	9.46	9.46	10.78	11.69	11.95
	SARIMA	14.72	17.57	18.41	9.67	9.67	10.91	17.82	19.08	23.47
	S-L1	9.23	9.66	9.74	3.37	3.44	3.41	11.07	11.38	11.43
	S-L2	8.38	8.69	9.31	3.11	3.25	3.34	9.92	9.97	9.04
	S–S1	9.55	9.95	10.23	4.36	4.43	4.67	10.23	10.36	10.89
	S–S2	8.39	8.78	8.89	4.14	4.25	4.24	8.47	8.56	8.69
SH	SVR	22.95	23.36	24.23	6.63	6.80	6.87	24.19	25.67	26.64
	LSTM	19.92	19.97	20.49	6.58	6.58	6.59	20.98	21.17	21.74
	SARIMA	23.31	31.61	38.05	6.73	8.39	8.53	26.23	35.10	42.57
	S-L1	13.88	13.96	14.37	6.26	6.39	6.68	18.15	18.18	18.26
	S-L2	13.73	14.73	14.86	6.01	6.21	6.35	15.35	16.53	16.98
	S–S1	14.65	15.80	16.41	5.42	6.07	6.23	18.11	18.82	19.51
	S–S2	13.86	14.89	14.94	4.51	4.66	4.68	15.77	16.89	17.68
LYG	SVR	1.57	1.69	1.49	4.13	4.42	3.93	2.00	2.06	1.98
	LSTM	1.45	1.50	1.56	3.83	3.95	4.08	1.96	1.97	1.99
	SARIMA	2.47	3.35	4.30	4.02	6.10	14.56	3.01	3.78	4.85
	S-L1	0.28	0.37	0.41	0.74	1.00	0.82	0.42	0.48	0.58
	S-L2	0.24	0.26	0.36	0.65	0.66	0.65	0.38	0.39	0.54
	S–S1	0.38	0.44	0.47	0.65	0.71	1.50	0.55	0.59	0.94
	S–S2	0.32	0.36	0.44	0.54	0.59	1.21	0.46	0.54	0.87
SZ	SVR	3.95	4.87	5.09	8.61	10.08	10.45	4.50	5.37	5.59
	LSTM	2.33	2.46	3.47	6.12	6.44	8.44	3.39	3.39	3.24
	SARIMA	6.07	8.14	8.41	11.80	12.17	12.44	6.68	8.83	9.08
	S-L1	0.41	0.48	0.52	0.80	0.93	1.01	1.17	1.31	1.50
	S-L2	0.38	0.34	0.46	0.65	0.76	0.99	0.52	0.60	0.66
	S–S1	0.44	0.56	0.66	0.65	0.91	1.50	1.24	1.42	1.67
	S–S2	0.41	0.48	0.57	0.54	0.65	1.22	0.68	0.84	0.96

Table 3 shows the forecasting performance of the different models for various forecasting horizons. The forecasting performance was measured by the same three criteria. Table 3 also indicates that the four hybrid models have better forecasting accuracy than the single models. For instance, the MAE of S-L2 for NB ranges from 8.71 to 9.15, the MAPE from 4.36 to 5.87, and the RMSE from 9.79 to 10.81. In contrast, for LSTM, the best single model for NB, the three criteria range from 14.33 to 15.63, 14.32 to 15.66, and 11.18 to 15.09, respectively, which indicates a lower accuracy than the hybrid models. According to Khashei and Bijari (2011), the forecasting accuracy decreases as the forecasting horizon increases. However, Table 3 does not provide sufficient evidence for this pattern, because the results across forecasting horizons are irregular; for example, the most accurate forecasting horizon of S-L1 for the MAE of NB is forecasting horizon 24, but for MAPE and RMSE it is forecasting horizon 12.

Table 3

Forecasting performance comparison for various forecasting horizons. Bold numbers correspond to the best prediction performance for each forecasting horizon.

		MAE (×10⁴ TEU)			MAPE (%)			RMSE (×10⁴ TEU)
		12	24	36	12	24	36	12	24	36
NB	SVR	17.47	20.13	22.28	9.5	10.64	11.46	18.27	20.93	23.39
	LSTM	14.33	15.63	14.98	14.32	15.33	15.66	11.18	12.08	15.09
	SARIMA	19.78	22.95	25.56	19.03	17.15	16.23	20.77	23.94	26.97
	S-L1	9.69	8.78	12.62	8.35	8.57	9.81	10.71	10.75	11.16
	S-L2	9.15	8.71	9.11	4.39	4.36	5.87	9.79	10.66	10.81
	S–S1	10.63	11.03	11.65	8.39	9.34	10	11.01	11.51	12.34
	S–S2	9.31	9.62	9.98	5.69	6.35	7.58	10.05	10.43	10.97
SH	SVR	17.96	18.32	18.69	9.85	10.51	11.4	13.36	14.61	15.32
	LSTM	15.03	14.09	14.97	12.45	13.33	14.35	12.13	12.42	13.19
	SARIMA	20.12	20.96	21.36	14.96	13.37	12.96	18.20	18.94	19.07
	S-L1	11.18	12.26	11.32	7.5	7.32	8.03	11.85	13.49	13.67
	S-L2	10.52	10.61	11.07	5.36	6.54	6.95	10.24	11.23	12.09
	S–S1	12.11	12.36	13.65	6.65	6.98	6.25	12.06	12.54	13.66
	S–S2	11.08	11.22	11.36	4.23	4.07	3.76	11.23	11.46	12.59
LYG	SVR	0.96	0.75	1.12	4.52	4.33	5.21	2.03	2.10	2.12
	LSTM	0.93	0.74	0.95	4.25	4.65	4.72	1.04	1.18	1.33
	SARIMA	1.64	2.80	1.83	5.04	6.02	6.18	7.04	6.01	7.24
	S-L1	0.71	0.79	0.90	2.48	2.16	3.02	0.73	0.76	0.79
	S-L2	0.67	0.69	0.79	1.79	1.75	1.54	0.63	0.68	0.73
	S–S1	0.88	0.94	0.96	1.86	1.8	1.59	0.89	0.94	1.16
	S–S2	0.78	0.85	0.94	1.54	1.64	1.84	0.81	0.85	0.88
SZ	SVR	2.16	2.50	2.55	10.82	10.7	9.18	1.68	2.03	2.93
	LSTM	1.50	1.64	1.69	12.14	13.51	12.61	1.30	1.77	1.68
	SARIMA	3.57	3.84	4.17	14.42	13.29	11.97	4.15	4.29	4.78
	S-L1	1.43	1.46	1.38	6.75	7.47	7.95	1.34	1.58	1.77
	S-L2	1.15	1.19	1.66	3.01	3.27	3.18	1.32	1.52	1.55
	S–S1	1.63	1.53	1.85	5.44	6.34	6.94	1.45	1.65	1.89
	S–S2	1.35	1.46	1.55	1.86	1.8	1.59	1.36	1.54	1.65

For the three single models, according to the three criteria, SARIMA generally has the largest errors, SVR has lower errors than SARIMA, and LSTM generally has the lowest errors, irrespective of the training dataset extension or forecasting horizon (see Table 2, Table 3). This indicates that LSTM shows the most accurate performance and SVR the second most accurate, while the traditional statistical model SARIMA has the worst performance. When we compared configuration 1 (S-L1 and S–S1) with configuration 2 (S-L2 and S–S2), irrespective of the training dataset extension or forecasting horizon, the three criteria show that configuration 2 has noticeably better performance than configuration 1, which means that the configuration 2 we proposed can further improve the prediction performance of configuration 1. Table 4 and Table 5 display the differences in the three criteria between configuration 1 and configuration 2 for the various training dataset extensions and forecasting horizons. Nearly all of these values are positive (the exceptions are the MAE of S-L for SH at training dataset extensions 72 and 60 in Table 4, and the MAE of S-L for SZ and the MAPE of S–S for LYG at forecasting horizon 36 in Table 5), which again indicates that configuration 2 can improve the forecasting performance of configuration 1 for different training dataset extensions and forecasting horizons.
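The differences between the two configurations are simple element-wise subtractions (configuration 1 minus configuration 2), so a positive value means that configuration 2 has the lower error. As an illustrative sketch, the snippet below takes the NB values of S-L1 and S-L2 from Table 3 and recomputes the corresponding S-L differences; the variable and function names are hypothetical.

```python
# NB values from Table 3 at forecasting horizons 12, 24 and 36.
s_l1 = {"MAE": [9.69, 8.78, 12.62], "MAPE": [8.35, 8.57, 9.81], "RMSE": [10.71, 10.75, 11.16]}
s_l2 = {"MAE": [9.15, 8.71, 9.11], "MAPE": [4.39, 4.36, 5.87], "RMSE": [9.79, 10.66, 10.81]}

def config_difference(config1, config2):
    """Configuration 1 minus configuration 2, rounded to two decimals per criterion."""
    return {
        criterion: [round(a - b, 2) for a, b in zip(config1[criterion], config2[criterion])]
        for criterion in config1
    }

diff = config_difference(s_l1, s_l2)

# True when configuration 2 has the lower error for every criterion and horizon.
all_improved = all(value > 0 for values in diff.values() for value in values)

print(diff, all_improved)
```

The recomputed values can be compared with the NB S-L row of Table 5.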

Table 4

Difference of the three criteria between configuration 1 (S-L1, S–S1) and configuration 2 (S-L2, S–S2) for various training dataset extensions during the pre-COVID-19 period. The S-L represents the difference between S-L1 and S-L2, and S–S represents the difference between S–S1 and S–S2.

		MAE (×10⁴ TEU)			MAPE (%)			RMSE (×10⁴ TEU)
		84	72	60	84	72	60	84	72	60
NB	S-L	0.85	0.97	0.43	0.26	0.19	0.07	1.15	1.41	2.39
NB	S–S	1.16	1.17	1.34	0.22	0.18	0.43	1.76	1.80	2.20
SH	S-L	0.15	−0.77	−0.49	0.25	0.18	0.33	2.80	1.65	1.28
SH	S–S	0.79	0.91	1.47	0.91	1.41	1.55	2.34	1.93	1.83
LYG	S-L	0.04	0.11	0.05	0.09	0.34	0.17	0.04	0.09	0.04
LYG	S–S	0.06	0.08	0.03	0.11	0.12	0.29	0.09	0.05	0.07
SZ	S-L	0.03	0.14	0.06	0.15	0.17	0.02	0.65	0.71	0.84
SZ	S–S	0.03	0.08	0.09	0.11	0.26	0.28	0.56	0.58	0.71

Table 5

Difference of the three criteria between configuration 1 (S-L1, S–S1) and configuration 2 (S-L2, S–S2) for various forecasting horizons during the pre-COVID-19 period. The S-L represents the difference between S-L1 and S-L2, and S–S represents the difference between S–S1 and S–S2.

		MAE (×10⁴ TEU)			MAPE (%)			RMSE (×10⁴ TEU)
		12	24	36	12	24	36	12	24	36
NB	S-L	0.54	0.07	3.51	3.96	4.21	3.94	0.92	0.09	0.35
NB	S–S	1.32	1.41	1.67	2.70	2.99	2.42	0.96	1.08	1.37
SH	S-L	0.66	1.65	0.25	2.14	0.78	1.08	1.61	2.26	1.58
SH	S–S	1.03	1.14	2.29	2.42	2.91	2.49	0.83	1.08	1.07
LYG	S-L	0.04	0.10	0.11	0.69	0.41	1.48	0.10	0.08	0.06
LYG	S–S	0.10	0.09	0.02	0.32	0.16	−0.25	0.08	0.09	0.28
SZ	S-L	0.28	0.27	−0.28	3.74	4.20	4.77	0.02	0.06	0.22
SZ	S–S	0.28	0.07	0.30	3.58	4.54	5.35	0.09	0.11	0.24

Impact of COVID-19 on the prediction

This subsection investigates the prediction performance of different forecasting models in the context of anomalous events. In this sense, the COVID-19 pandemic provides a suitable example to test the prediction ability of the different forecasting models using the container throughput time series. In Table 6, the splitting strategy of the training dataset extensions for the post-COVID-19 period differs from that for the pre-COVID-19 period. The training dataset extensions for the pre-COVID-19 period are split as follows: training dataset extension 84 is the data from January 2012 to December 2018, training dataset extension 72 is the data from January 2013 to December 2018, and training dataset extension 60 is the data from January 2014 to December 2018; the test dataset is the data from January 2019 to December 2019. For the post-COVID-19 period, each training dataset extension was shifted forward by two years, and the test dataset is the data from January 2021 to December 2021.
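The splitting strategy can be sketched as follows. This is one reading of the description above, assuming that every training window ends in the December before the test year and that extensions are whole numbers of years; the function name `split_windows` is hypothetical.

```python
def split_windows(extension_months, test_year):
    """Return (train_start, train_end, test_start, test_end) as (year, month) tuples.

    Assumes the training window ends in December of the year before the test year.
    """
    if extension_months % 12 != 0:
        raise ValueError("extension must be a whole number of years")
    train_end_year = test_year - 1
    train_start_year = train_end_year - extension_months // 12 + 1
    return ((train_start_year, 1), (train_end_year, 12), (test_year, 1), (test_year, 12))

# Pre-COVID-19 windows are tested on 2019; post-COVID-19 windows are shifted by two years.
for label, test_year in (("pre-COVID-19", 2019), ("post-COVID-19", 2021)):
    for extension in (84, 72, 60):
        print(label, extension, split_windows(extension, test_year))
```

Under these assumptions, the pre-COVID-19 windows reproduce the start dates listed above (January 2012, 2013 and 2014), and the post-COVID-19 windows start in January 2014, 2015 and 2016.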

Table 6

Three criteria of various training dataset extensions for the post-COVID-19 period. Bold numbers correspond to the best prediction performance for each dataset extension.

		MAE (×10⁴ TEU)			MAPE (%)			RMSE (×10⁴ TEU)
		84	72	60	84	72	60	84	72	60
NB	SVR	15.41	15.59	15.87	11.67	11.91	11.92	16.87	17.18	18.32
	LSTM	12.71	13.63	13.65	11.55	11.88	11.88	13.51	14.63	14.95
	SARIMA	18.36	21.87	22.90	12.14	12.14	13.67	22.17	23.73	29.13
	S-L1	11.60	12.13	12.23	4.38	4.47	4.43	13.86	14.24	14.31
	S-L2	10.55	10.93	11.70	4.06	4.23	4.34	12.45	12.51	11.36
	S–S1	11.99	12.48	12.83	5.60	5.69	5.98	12.83	12.99	13.64
	S–S2	10.56	11.04	11.18	5.33	5.46	5.45	10.66	10.77	10.93
SH	SVR	28.49	29.00	30.07	8.40	8.60	8.69	30.02	31.84	33.04
	LSTM	24.76	24.82	25.46	8.33	8.33	8.35	26.07	26.30	27.00
	SARIMA	28.94	39.16	47.09	8.52	10.56	10.74	32.53	43.45	52.65
	S-L1	17.32	17.42	17.93	7.94	8.10	8.46	22.58	22.62	22.72
	S-L2	17.14	18.37	18.53	7.63	7.88	8.05	19.13	20.59	21.14
	S–S1	18.27	19.69	20.44	6.91	7.71	7.90	22.53	23.41	24.26
	S–S2	17.30	18.57	18.63	5.79	5.97	5.99	19.65	21.03	22.00
LYG	SVR	2.16	2.31	2.07	5.32	5.67	5.07	2.69	2.77	2.67
	LSTM	2.02	2.08	2.15	4.95	5.10	5.26	2.64	2.66	2.68
	SARIMA	3.27	4.36	5.53	5.18	7.74	18.16	3.94	4.89	6.20
	S-L1	0.58	0.69	0.74	1.14	1.46	1.24	0.75	0.82	0.95
	S-L2	0.53	0.55	0.67	1.03	1.04	1.03	0.70	0.71	0.90
	S–S1	0.70	0.77	0.81	1.03	1.11	2.08	0.91	0.96	1.39
	S–S2	0.63	0.67	0.77	0.90	0.96	1.72	0.80	0.90	1.30
SZ	SVR	5.10	6.23	6.50	10.83	12.64	13.10	5.77	6.84	7.11
	LSTM	3.10	3.26	4.50	7.77	8.16	10.62	4.41	4.41	4.22
	SARIMA	7.71	10.25	10.59	14.76	15.22	15.55	8.46	11.10	11.41
	S-L1	0.74	0.82	0.87	1.22	1.38	1.48	1.67	1.84	2.08
	S-L2	0.70	0.65	0.80	1.03	1.17	1.45	0.87	0.97	1.04
	S–S1	0.77	0.92	1.04	1.03	1.35	2.08	1.76	1.98	2.29
	S–S2	0.74	0.82	0.93	0.90	1.03	1.73	1.07	1.27	1.41

Table 6 displays the three criteria of the various training dataset extensions for the post-COVID-19 period. The three criteria again show that the hybrid models have better predictive power than the single models during the post-COVID-19 period. For example, for NB, the worst hybrid model in terms of MAE is S–S1, with the MAE ranging from 11.99 to 12.83, whereas the best single model is LSTM, with the MAE ranging from 12.71 to 13.65. In the same way, the MAPE and RMSE of LSTM are higher than those of S–S1. At the same time, the differences in the three criteria between configuration 1 and configuration 2 are all positive (except for the MAE of S-L for SH at training dataset extensions 72 and 60; see Table 7). This indicates that configuration 2 can also improve on configuration 1 during the post-COVID-19 period. For example, in terms of the MAPE of SH, S-L2 improves on S-L1 by about 0.22–0.41 (see Table 7).
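Because the bold highlighting of the best values is not preserved in the plain-text tables, the best model for each training dataset extension can be recovered by taking the column-wise minimum. The sketch below does this for the MAE of NB in Table 6; the dictionary layout and names are hypothetical, and the model labels are written with ASCII hyphens.

```python
# MAE (x10^4 TEU) for NB from Table 6, at training dataset extensions 84, 72 and 60.
extensions = [84, 72, 60]
nb_mae = {
    "SVR": [15.41, 15.59, 15.87],
    "LSTM": [12.71, 13.63, 13.65],
    "SARIMA": [18.36, 21.87, 22.90],
    "S-L1": [11.60, 12.13, 12.23],
    "S-L2": [10.55, 10.93, 11.70],
    "S-S1": [11.99, 12.48, 12.83],
    "S-S2": [10.56, 11.04, 11.18],
}

# For each extension, pick the model with the smallest error (lower is better).
best = {
    ext: min(nb_mae, key=lambda model: nb_mae[model][i])
    for i, ext in enumerate(extensions)
}

print(best)
```

The same lookup applies to the other ports and criteria, since all three criteria are errors and therefore minimised.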

Table 7

Difference of three criteria between configuration 1 (S-L1, S–S1) and configuration 2 (S-L2, S–S2) for various training dataset extensions during the post-COVID-19 period. The S-L represents the difference between S-L1 and S-L2, and S–S represents the difference between S–S1 and S–S2.

		MAE (×10⁴ TEU)			MAPE (%)			RMSE (×10⁴ TEU)
		84	72	60	84	72	60	84	72	60
NB	S-L	1.05	1.19	0.53	0.32	0.23	0.09	1.42	1.74	2.94
NB	S–S	1.43	1.44	1.65	0.27	0.22	0.53	2.17	2.22	2.71
SH	S-L	0.18	−0.95	−0.60	0.31	0.22	0.41	3.45	2.03	1.58
SH	S–S	0.97	1.12	1.81	1.12	1.74	1.91	2.88	2.38	2.25
LYG	S-L	0.05	0.14	0.06	0.11	0.42	0.21	0.05	0.11	0.05
LYG	S–S	0.07	0.10	0.04	0.14	0.15	0.36	0.11	0.06	0.09
SZ	S-L	0.04	0.17	0.07	0.18	0.21	0.02	0.80	0.87	1.03
SZ	S–S	0.04	0.10	0.11	0.14	0.32	0.34	0.69	0.71	0.87

Table 8 shows the difference between the three criteria of the corresponding training dataset extensions for the pre-COVID-19 period and the post-COVID-19 period. The values in Table 8 are all positive, which means that each criterion is higher in the post-COVID-19 period than in the pre-COVID-19 period. In other words, forecasting accuracy is lower in the context of the COVID-19 pandemic.

Table 8

Difference between the three criteria of the corresponding training dataset extensions for the pre-COVID-19 period and post-COVID-19 period.

		MAE (×10⁴ TEU)			MAPE (%)			RMSE (×10⁴ TEU)
		84	72	60	84	72	60	84	72	60
NB	SVR	3.08	3.12	3.17	1.77	1.80	1.82	3.36	3.42	3.63
	LSTM	2.58	2.75	2.75	1.75	1.75	1.76	2.73	2.94	3.00
	SARIMA	3.64	4.30	4.49	1.79	2.17	2.21	4.35	4.65	5.66
	S-L1	2.37	2.47	2.49	1.49	1.64	1.67	2.79	2.86	2.88
	S-L2	2.17	2.24	2.39	1.28	1.31	1.31	2.53	2.54	2.32
	S–S1	2.44	2.53	2.60	1.68	1.71	1.78	2.60	2.63	2.75
	S–S2	2.17	2.26	2.29	1.62	1.67	1.70	2.19	2.21	2.24
SH	SVR	5.54	5.64	5.84	2.38	2.43	2.43	5.83	6.17	6.40
	LSTM	4.84	4.85	4.97	2.36	2.42	2.42	5.09	5.13	5.26
	SARIMA	5.63	7.55	9.04	2.47	2.47	2.76	6.30	8.35	10.08
	S-L1	3.44	3.46	3.56	1.01	1.03	1.02	4.43	4.44	4.46
	S-L2	3.41	3.64	3.67	0.95	0.98	1.00	3.78	4.06	4.16
	S–S1	3.62	3.89	4.03	1.24	1.26	1.31	4.42	4.59	4.75
	S–S2	3.44	3.68	3.69	1.19	1.21	1.21	3.88	4.14	4.32
LYG	SVR	0.59	0.62	0.58	1.19	1.25	1.14	0.69	0.71	0.69
	LSTM	0.57	0.58	0.59	1.12	1.15	1.18	0.68	0.69	0.69
	SARIMA	0.80	1.01	1.23	1.16	1.64	3.60	0.93	1.11	1.35
	S-L1	0.30	0.32	0.33	0.40	0.46	0.42	0.33	0.34	0.37
	S-L2	0.29	0.29	0.31	0.38	0.38	0.38	0.32	0.32	0.36
	S–S1	0.32	0.33	0.34	0.38	0.40	0.58	0.36	0.37	0.45
	S–S2	0.31	0.31	0.33	0.36	0.37	0.51	0.34	0.36	0.43
SZ	SVR	1.15	1.36	1.41	2.22	2.56	2.65	1.27	1.47	1.52
	LSTM	0.77	0.80	1.03	1.65	1.72	2.18	1.02	1.02	0.98
	SARIMA	1.64	2.11	2.18	2.96	3.05	3.11	1.78	2.27	2.33
	S-L1	0.33	0.34	0.35	0.42	0.45	0.47	0.50	0.53	0.58
	S-L2	0.32	0.31	0.34	0.38	0.41	0.46	0.35	0.37	0.38
	S–S1	0.33	0.36	0.38	0.38	0.44	0.58	0.52	0.56	0.62
	S–S2	0.33	0.34	0.36	0.36	0.38	0.51	0.39	0.43	0.45


Discussion and managerial insights

The COVID-19 pandemic has led to a slowdown in container transportation and maritime trade (Guerrero et al., 2022). As the pandemic spread around the world, many countries fell into a “lockdown and stagnant” state. Global supply chains were disrupted, and Chinese ports were also affected. Pandemic-related restrictions such as the lockdown strategy had a series of negative impacts on port activities. Container traffic declined mainly in the first half of 2020 and particularly in February, when it plummeted by 2.63%, 20.94%, 19.45%, and 39.13% in NB, LYG, SH, and SZ, respectively (see Fig. 5). The year-on-year growth rate was negative throughout the period from January 2020 to June 2020. It was inferred that the lockdown strategy had a negative influence on the economy and maritime trade, which in turn affected the container transportation sector (Zhao et al., 2022). After June 2020, the Chinese government efficiently organised the resumption of work and production, the transportation industry gradually recovered in those four ports, and the year-on-year growth rate turned positive for the first time since the start of the COVID-19 pandemic; the four ports showed resilience and vitality, and container traffic began to rebound.
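For reference, the year-on-year growth rate compares each month with the same month of the previous year. The following is a minimal sketch assuming the usual definition, 100 × (current − previous) / previous; the function name and the sample figures are hypothetical and do not come from the port data.

```python
def yoy_growth(current, previous):
    """Year-on-year growth rate in percent for paired monthly values.

    `current` and `previous` hold throughputs for the same months in
    consecutive years; `previous` values must be non-zero.
    """
    return [100.0 * (c - p) / p for c, p in zip(current, previous)]

# Hypothetical throughputs (x10^4 TEU) for two months in consecutive years.
previous_year = [200.0, 150.0]
current_year = [160.0, 165.0]

rates = yoy_growth(current_year, previous_year)
print(rates)
```

A negative rate indicates that throughput fell relative to the same month one year earlier, as reported for the first half of 2020.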

Fig. 5

Container traffic evolution from 2019 to 2021 and container traffic year-on-year growth rate of 2020 and 2021.

After October 2020, the four ports showed a downward trend, as the second wave of the COVID-19 pandemic around the world caused a shock to container transportation. In this context, container traffic at those four ports declined for three months from October 2020 (see Figs. 3 and 5). In the second half of 2020, the major economies implemented vaccination plans based on their anti-epidemic experience in 2020 to achieve economic growth. At the same time, favourable factors such as the recovery of steady economic growth and the signing of the Regional Comprehensive Economic Partnership (RCEP) also provided strong support for the development of foreign trade. NB and SH rank first and second in terms of container traffic among Chinese ports and have a close connection with world maritime trade. In 2021, the container traffic in NB and SH set new records of 3180 and 4703 ×10⁴ TEU, respectively. The year-on-year growth rates of container traffic in NB, SH, LYG and SZ were all positive, and the growth trend returned to that of the pre-COVID-19 period. As a result, container traffic likewise returned to pre-epidemic levels in 2021 (see Fig. 5).

The port industry is traditionally labour-intensive (Trujillo and Nombela, 1999). The epidemic prevention and control measures in China forced ports to apply digital technology, which accelerated the process of port digital transformation. Chinese ports reduced contact risks by improving their automation during the epidemic to ensure the efficient and orderly operation of the entire supply chain, which also improved the understanding and recognition of digitalisation and automation in the port industry.
LYG and SZ are small-scale ports in comparison with SH and NB, and their development benefits from China's new development pattern whereby “internal circulation dominated and double circulation promoted each other”. This new development pattern has become a decisive force driving China's economic growth. It has provided a new opportunity for inland ports and small-scale ports; thus, LYG and SZ maintained stable development during the COVID-19 pandemic (see Figs. 3 and 5).

According to Zhao et al. (2022), the prediction error can serve as an indicator to measure the impact of the COVID-19 pandemic on maritime transportation: the larger the error, the greater the impact. Proceeding from this point, we compared the accuracy of the different training dataset extensions between the pre-COVID-19 period and the post-COVID-19 period. We found that the prediction errors for the post-COVID-19 period were higher than those for the pre-COVID-19 period, i.e. the accuracy was lower (see Table 6, Table 8), which indicates that the COVID-19 pandemic had a negative influence on the prediction work. However, different forecasting models have different predictive power, so the accuracy alone cannot reflect the impact of COVID-19 on maritime transportation.

The experimental prediction of the container throughput at NB, SH, LYG and SZ in YRDP was performed using hybrid models, ML models (LSTM and SVR) and the SARIMA model. The MAE, MAPE and RMSE were then used as the measurement criteria to compare the predictive performance. Configuration 2 (S-L2 and S–S2) was the most accurate of the various models, while configuration 1 (S-L1 and S–S1) was more accurate than the SARIMA model and the ML models.
At the same time, the accuracy of S-L1, S–S1, S-L2 and S–S2 was also higher than that of the four EMD-BPN models (Wei and Chen, 2012), the SARIMA-ANNs models (Ruiz-Aguilar et al., 2014) and W-LSSVR, EMD-LSSVR, and EMD-ANN (Xie et al., 2019). In addition, S-L2 and S–S2 performed better in the context of the COVID-19 pandemic. In this sense, some managerial insights for the prediction of container throughput were obtained. First, the hybrid models can improve on the prediction performance of the single models. Configuration 2 can help policymakers to make well-informed decisions during the operational planning of a port, especially in the context of anomalous events such as the COVID-19 pandemic. Second, the results indicate that the prediction accuracy of container throughput increases with the training dataset extension. This suggests that transportation practitioners should maintain a sufficiently long training dataset and reduce the forecasting horizon to improve the prediction accuracy. Finally, configuration 2 is suitable for univariate time series and can therefore be easily implemented by strategic management and policymakers.

Conclusion

In this paper, to enhance prediction accuracy while eliminating the nonlinearity and multivariate limitations in container throughput forecasting, especially in the context of anomalous events (e.g. the COVID-19 pandemic), we proposed two hybrid models, each with two configurations (configuration 1: S-L1, S–S1, and configuration 2: S-L2, S–S2), and compared them with the benchmark models. We then explored how the different training dataset extensions and forecasting horizons affect the prediction work, and analysed the influence of the COVID-19 pandemic on container throughput forecasting and maritime transportation. The conclusions of this study, based on the verification of the container throughput time series of four typical ports in YRDP, are as follows. The hybrid models (configuration 2) we proposed can improve on the performance of the benchmark single models, resolve the nonlinear problem and remove the multivariate limitation, which provides an efficient decision-making tool for policymakers and port authorities. At the same time, configuration 2 can further improve the accuracy of the traditional hybrid models (configuration 1). The accuracy of the models increased with the training dataset extension. Contrary to popular belief, there is insufficient evidence that the accuracy decreases as the forecasting horizon increases. Configuration 2 performs better than the other models in the context of the COVID-19 pandemic. In future research, the models in this paper could be applied to other time series, such as stock prices, GDP and rainfall. In addition, where sufficient data are available, the hybrid models in this paper may further improve the accuracy of multivariate time series prediction.

Funding

This work has been partially funded by the MOLIÈRE project from the European Global Navigation Satellite Systems Agency (now EUSPA) under grant agreement No 101004275 and K.C.Wong Magna Fund at .

Declaration of competing interest

None.

Ports	Forecasting horizon
	12	24	36
NB	(0,0,1) (0,1,1)	(1,0,1) (0,1,1)	(0,0,1) (0,1,1)
SH	(1,0,0) (0,1,1)	(1,0,0) (0,1,1)	(1,0,1) (0,1,1)
LYG	(0,1,0) (1,0,1)	(0,1,0) (1,0,0)	(0,1,0) (1,0,0)
SZ	(1,0,0) (0,1,1)	(1,0,1) (0,1,1)	(1,0,0) (0,1,1)
Training dataset extension
	84	72	60
NB	(0,1,1) (0,1,1)	(1,0,1) (0,1,1)	(1,0,1) (0,1,1)
SH	(0,1,2) (1,1,1)	(0,1,1) (1,1,1)	(0,1,1) (0,1,1)
LYG	(2,1,2) (1,0,0)	(2,1,1) (1,0,0)	(1,1,1) (1,0,0)
SZ	(0,1,1) (0,1,1)	(1,1,1) (1,0,0)	(1,0,1) (0,1,1)
Training dataset extension of the pre-COVID-19 period
	84	72	60
NB	(0,1,1) (0,1,1)	(1,1,1) (0,1,1)	(1,0,1) (0,1,1)
SH	(0,1,2) (2,1,1)	(0,1,1) (0,1,1)	(0,1,1) (0,1,1)
LYG	(2,1,2) (1,0,1)	(2,1,1) (1,0,1)	(1,1,1) (1,0,0)
SZ	(0,1,1) (0,1,1)	(1,1,1) (1,0,0)	(0,0,1) (0,1,1)
Anomalous points
	84	72	60
NB	(1,0,1) (0,1,1)	(1,0,1) (0,1,1)	(1,0,0) (0,1,1)
SH	(1,0,1) (0,1,1)	(0,0,1) (2,1,0)	(0,0,0) (1,1,0)
LYG	(1,1,1) (1,0,1)	(1,1,1) (1,0,0)	(0,1,0) (0,1,1)
SZ	(2,1,0) (2,0,1)	(2,1,0) (2,1,0)	(2,1,0) (1,1,0)
