A robust autonomous method for blood demand forecasting.

Esa V Turkulainen, Merel L Wemelsfelder, Mart P Janssen, Mikko Arvas.

Abstract

BACKGROUND: Blood supply chain management requires estimates of the demand for blood products. The more accurate these estimates are, the less wastage and the fewer shortages occur. While the current literature demonstrates tangible benefits from statistical forecasting approaches, it highlights issues that discourage their use in blood supply chain optimization: there is no single approach that works everywhere, and there are no guarantees that any favorable method performance continues into the future.
STUDY DESIGN AND METHODS: We design a novel autonomous forecasting system to solve the aforementioned issues. We show how possible changes in blood demand could affect prediction performance using partly synthetic demand data. We then use these data to investigate the performances of different method selection heuristics. Finally, the performances of the heuristics and single method approaches are compared using historical demand data from Finland and the Netherlands. The development code is publicly accessible.
RESULTS: We find that a shift in the demand signal behavior from stochastic to seasonal would affect the relative performances of the methods. Our autonomous system outperforms all examined individual methods when forecasting the synthetic demand series, exhibiting meaningful robustness. When forecasting with real data, the most accurate methods in Finland and in the Netherlands are the autonomous system and the method average, respectively.
DISCUSSION: Optimal use of method selection heuristics, as with our autonomous system, may overcome the need to constantly supervise forecasts in anticipation of changes in demand while being sufficiently accurate in the absence of such changes.
© 2022 Finnish Red Cross Blood Service. Transfusion published by Wiley Periodicals LLC on behalf of AABB.

Keywords:  autonomous systems; blood supply chain management; demand forecasting; robust methods; time series

Year:  2022        PMID: 35383944      PMCID: PMC9325496          DOI: 10.1111/trf.16870

Source DB:  PubMed          Journal:  Transfusion        ISSN: 0041-1132            Impact factor:   3.337


Abbreviations: APE, absolute percentage error; ARIMA, autoregressive integrated moving average; AUTO‐N, autonomous (method selection) forecasting method with N‐step selection period length; AVG, method average; COVID‐19, coronavirus disease 2019 caused by SARS‐CoV‐2; DYNREG, dynamic regression; ELM, extreme learning machines; ETS, exponential smoothing method; MA, moving average; MAPE, mean absolute percentage error; MLP, multilayer perceptrons; NNAR, autoregressive neural networks; SNAIVE, seasonal naïve method; STL, season‐trend decomposition method; STLF, multiseason‐trend decomposition method; TBATS, exponential smoothing method with trigonometric seasonality, Box‐Cox transformation, ARIMA errors, trend, and seasonal components; W.AVG, weighted method average.

INTRODUCTION

A well‐functioning blood supply chain operation reduces wastage and procurement costs and avoids life‐threatening shortages. Research on blood supply chain efficiency and reliability has increased steadily over the past couple of decades. According to a recent survey, most research on supply chain planning treats demand as stochastic, deterministic, or otherwise intractably uncertain, using constant estimates or distribution sampling. However, demand forecasting has been deemed necessary and superior to expert planning in several studies, suggesting that supply chain management approaches can be improved by a significant margin by adopting methods for demand forecasting. Almost all reviewed research on demand forecasting attempts to determine the single best method for reducing shortages or costs, revealing that the best method varies between blood banks and blood products. Researchers also note on multiple occasions that no particular method is guaranteed to give accurate forecasts and that the best method may change over time. To avoid laborious periodic method reselection, others have suggested using method selection systems and specific automatic procedures (most often the Box‐Jenkins procedure) to help evaluate and select Autoregressive Integrated Moving Average (ARIMA) models without human intervention.

In this study, we use a marked but relatively brief change in the behavior of the weekly red blood cell demand in Finland to examine how the continuation of this behavior would affect method performance and selection. We then devise heuristics for method selection that could allow an autonomous system to produce robust forecasts without supervision during changes in the demand behavior and compare the performance of such a system with individual method performance using altered demand data. Finally, we compare the performances of both the heuristics and the individual methods using unaltered demand data from Finland and the Netherlands to gauge their real‐world applicability.

MATERIALS AND METHODS

Data

Both the Finnish Red Cross Blood Service and Sanquin record every unit of blood delivered. By filtering these data by specific blood products (e.g., red blood cells or platelets) and aggregating them per week, we can create a weekly demand series. The record of red blood cell product deliveries spans from 2014 to 2021 in Finland and from 2009 to 2020 in the Netherlands. The aggregated weekly (Monday to Sunday) demand from this period for both countries is shown in Figure 1. The Finnish series exhibits a seemingly significant change in behavior around 2018 and 2019, as the slight downtrend in demand seems to level out and create a more pulse‐like signal. This behavior, however, does not continue beyond 2019. To study how the performance of different methods might be affected by such changes, we artificially extend the anomalous behavior by modeling the series between July 2017 and July 2019, adding some noise to it, and overwriting the actual demand history from July 2019 onward. The process to generate the synthetic data is explained in detail in Appendix S1, Part A. The artificially extended demand history is shown in Figure 2. We use both the altered and unaltered version of the demand data to test forecasting methods.
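The filtering and weekly aggregation step described above can be sketched as follows. This is a minimal Python illustration, not the authors' R code; the record layout `(delivery_date, product_code, units)` and the product codes are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_demand(deliveries, product):
    """Sum delivered units per Monday-to-Sunday week for one product.

    `deliveries` is an iterable of (delivery_date, product_code, units)
    tuples; this layout is an assumption made for the example.
    """
    totals = defaultdict(int)
    for day, code, units in deliveries:
        if code != product:
            continue  # keep only the requested blood product
        week_start = day - timedelta(days=day.weekday())  # Monday of that week
        totals[week_start] += units
    return dict(sorted(totals.items()))


records = [
    (date(2021, 3, 1), "RBC", 2),  # Monday
    (date(2021, 3, 7), "RBC", 3),  # Sunday of the same week
    (date(2021, 3, 8), "RBC", 1),  # Monday of the next week
    (date(2021, 3, 2), "PLT", 5),  # different product, ignored
]
series = weekly_demand(records, "RBC")
```

Each key of `series` is the Monday that starts a week, so the values form the weekly demand series in chronological order.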
FIGURE 1

Weekly demand of red blood cell products in the Netherlands and Finland

FIGURE 2

Altered weekly demand of red blood cell products in Finland. The dotted line indicates the beginning of the synthetic data

Methods

As most blood supply operators do not have access to meaningful clinical variables or other external regressors, we limit ourselves in this study to autoregressive forecasting methods to enable a broad applicability of the results. Based on the current literature, we chose to examine the following methods: simple moving averages (MA), exponential smoothing (ETS), ARIMA, and autoregressive neural networks (NNAR). In addition to these, we examined the seasonal naïve method (SNAIVE), method averaging (AVG), season‐trend decomposition methods (STL and STLF), a dynamic seasonal method (TBATS), dynamic regression (DYNREG), multilayer perceptrons (MLP), and extreme learning machines (ELM). All methods are explained in detail in Appendix S1, Part B.

The methods are trained using the 3 most recent years of demand history. A forecast is generated for the following week, and its accuracy is tested using absolute percentage errors (APEs). This process is repeated until we have backtested all of the available history. These tests then allow us to summarize method performances in mean absolute percentage errors (MAPEs):

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|$$

where $\hat{y}_t$ denotes the forecast value and $y_t$ denotes the observed value for a specific week $t$, and $n$ is the number of forecasted weeks.

Absolute percentage errors are unit‐free, so they enable comparisons between different data sets and scales. However, they have some important limitations. Division by zero issues arise when the target observations contain zeroes, and the metric becomes very unstable if the data have values very close to zero. Additionally, the data should exist on a ratio scale (entirely positive, meaningful zero). Our weekly aggregates of blood demand do not contain zeroes and the data are positive. We discuss other possible method selection metrics in the Discussion section.

All development code is written in R (version 4.0.5). All methods are implemented using the forecast package (version 8.13) and the nnfor package (version 0.9.6), except for the combination methods (AVG and W.AVG). The development code is publicly available on GitHub. The repository contains all scripts used for generating the results in this article, an implementation of an autonomous forecasting system (an R Markdown script that outputs an HTML forecast report), as well as the necessary custom functions for running such a system. No data are made available for confidentiality reasons.
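The backtesting procedure described in this section (train on a fixed window, forecast one week ahead, record the APE, move the window forward) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R implementation: the function names are hypothetical, a 3‐year window is approximated as 156 weeks, and the two toy methods stand in for the forecast package modelers.

```python
def ape(forecast, observed):
    """Absolute percentage error of a single forecast, in percent."""
    return 100.0 * abs((observed - forecast) / observed)


def moving_average(history, k=5):
    """Forecast next week as the mean of the last k observations."""
    return sum(history[-k:]) / k


def seasonal_naive(history, period=52):
    """Forecast next week as the value observed one season earlier."""
    return history[-period]


def backtest(series, method, window=156):
    """One-step-ahead rolling backtest; returns the list of weekly APEs.

    `window` is the training length in weeks (156 is an assumed stand-in
    for the 3-year training window).
    """
    errors = []
    for t in range(window, len(series)):
        train = series[t - window:t]  # most recent `window` observations
        errors.append(ape(method(train), series[t]))
    return errors


def mape(errors):
    """Mean absolute percentage error over a list of APEs."""
    return sum(errors) / len(errors)


# Toy series with a 4-week window so the arithmetic is easy to follow.
toy = [100, 110, 90, 100, 100, 120]
errors = backtest(toy, lambda h: moving_average(h, k=4), window=4)
score = mape(errors)
```

In the toy run the two forecasts are both 100, against observations of 100 and 120, so the APEs are 0 and about 16.67, and `score` is their mean.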

AUTONOMOUS METHOD SELECTION SYSTEM

Most of the methods, as implemented in the forecast package, are so‐called “modelers,” meaning that a new model is fit every time a forecast is requested. This allows for a degree of adaptability regarding changes in series behavior. However, all modelers are restricted to their respective model formulations, limiting more general adaptability. This issue can be circumvented by selecting from a pool of different modelers, thereby expanding the model space available for selection. As periodic manual reselection is costly, and often infeasible, autonomous application of predetermined selection heuristics may be preferable. Figure 3 illustrates one possible method selection process for an autonomous forecasting system, an implementation of which is provided on GitHub. To test whether autonomous method selection based on historical accuracy metrics would yield advantages in accuracy over the individual methods and modelers, we selected forecasts from the methods that held the lowest MAPE over the 1, 3, 6, 12, 18, or 24 weeks before the forecast date (the selection period). We also tested forecasting by ranking methods based on the previous week's performance and computing an exponentially decaying weighted average as the forecast (W.AVG).
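One way to read the W.AVG heuristic is sketched below in Python. The text does not specify the weighting scheme, so the geometric weights `decay ** rank` and the value `decay=0.5` are assumptions made for illustration, as are the function name and the example numbers.

```python
def weighted_average_forecast(last_week_apes, forecasts, decay=0.5):
    """Combine forecasts with weights that decay exponentially by rank.

    Methods are ranked by their APE in the previous week (best first);
    the method at rank r gets weight decay**r (an assumed scheme).
    """
    ranked = sorted(last_week_apes, key=last_week_apes.get)  # best first
    weights = {name: decay ** rank for rank, name in enumerate(ranked)}
    total = sum(weights.values())
    return sum(weights[name] * forecasts[name] for name in ranked) / total


last_week_apes = {"ETS": 4.0, "STL": 2.0, "MA-5": 8.0}  # made-up errors
forecasts = {"ETS": 100.0, "STL": 110.0, "MA-5": 90.0}  # made-up forecasts
combined = weighted_average_forecast(last_week_apes, forecasts)
```

Here STL ranks first, ETS second, and MA-5 third, so the weights before normalization are 1, 0.5, and 0.25, and the combined forecast leans toward the STL value.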
FIGURE 3

A schematic of the method selection process. Methods are trained with a 3‐year training window and then tested on the subsequent observation. The training window then moves, dropping the oldest observation, and including the previous test observation. A series of tests are performed over a selection period up until the forecast date (e.g., 12 weeks). Next, a forecast is generated by selecting the method with the lowest mean absolute percentage error (MAPE) from the selection period and retraining it with 2 years of the most recent observations
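The selection step in the schematic can be sketched in Python as follows. This is a minimal illustration rather than the authors' R implementation; `select_method` and the APE values are hypothetical, and the sketch assumes that a per-method history of one-step-ahead APEs is already available from backtesting.

```python
def select_method(ape_history, n_weeks):
    """Return the method with the lowest MAPE over the last n_weeks.

    `ape_history` maps a method name to its weekly APEs, oldest first.
    """
    def recent_mape(name):
        recent = ape_history[name][-n_weeks:]
        return sum(recent) / len(recent)

    return min(ape_history, key=recent_mape)


ape_history = {
    "STL": [9.0, 3.0, 2.0, 4.0],  # made-up weekly APEs
    "ETS": [1.0, 5.0, 6.0, 3.0],
}
choice_1 = select_method(ape_history, n_weeks=1)
choice_3 = select_method(ape_history, n_weeks=3)
```

With these numbers a 1-week selection period picks ETS while a 3-week period picks STL, which shows why the selection period length is a tuning choice in its own right.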

RESULTS

Figure 4 shows the differences in prediction accuracy between the first and second half of the artificially extended weekly data (Figure 2). The STL, STLF, and DYNREG methods gain a significant advantage over other methods in the latter half of the data, after first performing largely comparably to them. The neural network methods (NNAR, MLP, ELM) stand out for their substantial increase in accuracy in the latter half of the data after being the clear underperformers in the first half. However, they still compare unfavorably with AVG, SNAIVE, STL, DYNREG, and STLF.

Table 1 shows the overall mean absolute percentage errors of different methods, and Table 2 shows the overall mean absolute percentage errors of method selection heuristics for different testing periods as well as the result for the weighted averaging method. The best accuracies in both tables are highlighted in bold. Among the individual methods, DYNREG performs best (4.46%). The majority of the selection heuristics examined outperform most individual methods, and AUTO‐12 (4.33%) outperforms them all.

For the unaltered demand data (Figure 1), the performances of single methods and selection heuristics are presented together as a heatmap in Figure 5. The first row presents the theoretical minimum selection error, found by selecting the most accurate forecast each week retrospectively. The results are separate for each length of the testing period, as extending this period limits the available data, thus reducing the number of errors that can be computed.
FIGURE 4

Method performances on the first and second half of the synthetic data, ordered by the error magnitude on the second half. Some methods benefit from the change, some suffer

TABLE 1

Overall mean absolute percentage errors (MAPEs) of methods over the entire synthetic data

SNAIVE | MA‐5 | MA‐7 | MA‐9 | MA‐12 | ETS | STL | TBATS | ANN | ARIMAX | DYNREG | STLF | MLP | ELM | AVG
5.06 | 6.81 | 6.51 | 6.54 | 6.72 | 6.94 | 4.84 | 5.94 | 6.64 | 6.21 | **4.46** | 4.61 | 7.06 | 7.04 | 5.14

Note: The best accuracies in both Tables are highlighted in bold.

TABLE 2

Overall mean absolute percentage errors (MAPEs) of selection heuristics over the entire synthetic data

AUTO‐1 | AUTO‐3 | AUTO‐6 | AUTO‐12 | AUTO‐18 | AUTO‐24 | W.AVG
5.01 | 4.64 | 4.54 | **4.33** | 4.68 | 4.71 | 4.76

Note: The best accuracies in both Tables are highlighted in bold.

FIGURE 5

A heatmap presentation of the overall mean absolute percentage errors (MAPEs) for the methods and selection heuristics. The first row presents the error for the optimal selection strategy (theoretical minimum). Results are spread over six columns (for each testing period length) to ensure comparable error scores. The splits separate countries and individual methods from heuristics
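The theoretical minimum in the first row of the heatmap corresponds to always picking, in hindsight, the method with the smallest error each week. A minimal Python sketch of that calculation is given below; the function name and the APE values are hypothetical.

```python
def oracle_mape(ape_history):
    """MAPE of a retrospective selector that picks the best method weekly.

    `ape_history` maps a method name to its weekly APEs; all lists are
    assumed to cover the same weeks in the same order.
    """
    weekly_best = [min(week) for week in zip(*ape_history.values())]
    return sum(weekly_best) / len(weekly_best)


ape_history = {
    "STL": [9.0, 3.0, 2.0, 4.0],  # made-up weekly APEs
    "ETS": [1.0, 5.0, 6.0, 3.0],
}
lower_bound = oracle_mape(ape_history)
```

No selection heuristic that chooses a single method per week can score below this value on the same weeks, which makes it a useful reference point for the heuristics.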

For Finland, the best performers are the AUTO‐12 selection method and AVG. The selection methods compared favorably with most individual methods, indicating viability for practical use without loss in accuracy. In the Netherlands, the accuracies are much better, most likely due to the larger volume of demand decreasing the noisiness of the signal. The best method was AVG, with STLF, ARIMAX, and DYNREG performing almost equally well. Out of the selection methods, the W.AVG and AUTO‐6 performed best, with better or roughly equal accuracy when compared with the individual methods. Overall, differences between accuracies were not statistically significant at the 95% confidence level, except for the MLP, whose underperformance was significant in comparison with some methods even after correcting for multiple testing.

DISCUSSION

The aim of this article was to explore blood demand forecasting systems that could offer robustness in the event of changes in the underlying demand signal. For this purpose, we identified a period of unusual behavior in the Finnish weekly blood demand data and extended it to create a synthetic demand history. We then examined how these conditions would affect the performance of various popular forecasting methods and found that method performance is greatly affected by structural changes in the demand, highlighting the need to reselect methods over time (Figure 4).

To test whether this could be done autonomously, we devised two different method selection heuristics that rely on historical performance: one that chooses the method that had the best mean absolute percentage error over a certain predetermined length of history and one that creates an exponentially weighted average of forecasts from different methods. The results for the synthetic data show that the selection heuristics outperform nearly all individual methods, with AUTO‐12 outperforming all of them (Tables 1 and 2), suggesting that the ability to switch between methods autonomously or to decide their weighting within a combination of methods successfully utilizes the differences between the methods. The method selection process behind these results is illustrated in Figure S1. While our study does not exhaust the search for the optimal selection strategy, the fact that all of the heuristics outperform the large majority of the individual methods indicates that there are multiple ways to match or exceed the accuracy of single method approaches.

Finally, we tested the heuristics and individual methods using the real, unaltered weekly demand data from Finland and the Netherlands. The results indicate that the heuristics' advantage with the altered demand series is most likely a result of the persistence of the behavioral change, as with the real data they do not differ significantly in accuracy from the individual methods in either country. The AVG performed strongly in both countries, which is likely due to a trade‐off between the robustness enabled by the constant method reselection and simply forecasting with the historically best individual method. The AVG method, which computes the mean of all individual forecasts, seems to take advantage of the random error in the predictions of the individual methods. By canceling the biases of the individual methods, it reduces the variance of the resulting forecast. However, when more structural changes in the underlying demand signal emerge, the averaging method is unable to leverage the advantages gained by some methods or minimize the penalties suffered by others, as is shown by the strong performance of the selection heuristics for the synthetic data.

While it is impossible to build an autoregressive forecasting system that can foresee changes in demand behavior (for example, a global COVID‐19 pandemic essentially halting regularly scheduled surgeries), one can leverage the advantages of various individual methods by enabling selection from a diverse library of methods. We did not aim to find the optimal set of methods nor the absolute best or most generalizable rule set for selection. For example, in an operation‐critical environment such as the blood supply chain, it might be more meaningful to select methods using a metric that harshly penalizes underforecasting or by their ability to detect peaks. We found no statistically significant differences between the peak detection abilities of different methods (Appendix S1, Part D). Nevertheless, by utilizing some simple method selection heuristics, one can ensure that any changes in demand behavior will be adjusted for, while maintaining sufficient forecasting accuracy in the absence of such changes. As such, we conclude that this approach is a viable autonomous forecasting solution for blood centers and blood products in general.
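As an illustration of the alternative selection metrics mentioned in this discussion, the Python sketch below shows one possible error measure that penalizes underforecasting more heavily than overforecasting. It was not part of the study; the function name and the penalty factor of 3 are hypothetical choices.

```python
def asymmetric_ape(forecast, observed, under_penalty=3.0):
    """APE in percent, multiplied by a penalty when demand is underforecast.

    `under_penalty` is an assumed factor; an underforecast (forecast below
    the observed demand) risks a shortage and is therefore weighted more.
    """
    error = 100.0 * abs((observed - forecast) / observed)
    return error * under_penalty if forecast < observed else error


shortage_risk = asymmetric_ape(forecast=90.0, observed=100.0)
surplus_risk = asymmetric_ape(forecast=110.0, observed=100.0)
```

Both forecasts miss by 10%, but the underforecast scores three times higher, so a selection heuristic using this metric would favor methods that err on the side of oversupply.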

CONFLICT OF INTEREST

The authors have disclosed no conflicts of interest.

Supporting Information: Appendix S1.