
COVID-19: Short term prediction model using daily incidence data.

Hongwei Zhao¹, Naveed N Merchant², Alyssa McNulty¹, Tiffany A Radcliff¹, Murray J Cote¹, Rebecca S B Fischer¹, Huiyan Sang², Marcia G Ory¹.

Abstract

BACKGROUND: Prediction of the dynamics of new SARS-CoV-2 infections during the current COVID-19 pandemic is critical for public health planning of efficient health care allocation and monitoring the effects of policy interventions. We describe a new approach that forecasts the number of incident cases in the near future given past occurrences using only a small number of assumptions.
METHODS: Our approach to forecasting future COVID-19 cases involves 1) modeling the observed incident cases using a Poisson distribution for the daily incidence number, and a gamma distribution for the serial interval; 2) estimating the effective reproduction number, assuming its value stays constant during a short time interval; and 3) drawing future incident cases from their posterior distributions, assuming that the current transmission rate will stay the same, or change by a certain degree.
RESULTS: We apply our method to predicting the number of new COVID-19 cases in a single state in the U.S. and for a subset of counties within the state to demonstrate the utility of this method at varying scales of prediction. Our method produces reasonably accurate results when the effective reproduction number is distributed similarly in the future as in the past. Large deviations from the predicted results can imply that a change in policy or some other factors have occurred that have dramatically altered the disease transmission over time.
CONCLUSION: We presented a modelling approach that we believe can be easily adopted by others, and immediately useful for local or state planning.


Year: 2021 PMID: 33852642 PMCID: PMC8046206 DOI: 10.1371/journal.pone.0250110

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240

Introduction

Since the World Health Organization declared a pandemic for the novel SARS-CoV-2 2019 virus (COVID-19) on March 11, 2020 [1], the Americas, Europe, South-East Asia and Eastern Mediterranean regions have the most documented cases [2]. Globally, nationally, and at every sub-governmental level, there is a need to monitor the current caseload and project the rate and nature of the spread to guide public health awareness, preparedness, and response. Societies have to deal with many pressing issues such as ensuring adequate supplies of personal protective equipment, considerations about the adequacy of the health care workforce and other health care resources, as well as how to balance restrictive safety guidelines with keeping businesses open and the economy sound. For a novel infectious disease, it is especially important to forecast future cases based on what has happened in the immediate past. Prediction for the number of cases in a pandemic and implications for health care needs and resources have received a lot of attention in the scientific world [3-5], government agencies [6-8], and in media lately [9-11]. With the plethora of models, there is also growing scrutiny [12] about the accuracy of different models, and an appreciation that model parameters need to be refined based on evolving knowledge about the disease trajectory and factors impacting infection and transmission rates. The different approaches to modeling and forecasting infectious disease epidemics can be characterized as: 1) mechanistic models based on SEIR (referring to Susceptible, Exposed, Infected, and Recovered states) framework [13]; or its modified version [14-16]; 2) time series prediction models such as ARIMA [17], Grey Model [18], and Markov Chain models [19]; and 3) agent type models (i.e. simulating individual activities for a population) [20]. Even within each category, there are different types of approaches attempted. 
For SEIR models, there are deterministic models involving differential equations, and stochastic models entailing probability distributions. There are models that are designed to make long-term forecasts, and models that are best used for short-term predictions. For this paper, we primarily focus on short-term predictions based on SEIR concepts intended to forecast incidence cases for the next two to three weeks. The SEIR model is an extension of the classical SIR model [21], and both SEIR and SIR models are foundations for many epidemiological modeling techniques. The model's strength lies in its simple approximation of a complex process. For example, a typical SIR model specifies that at a certain time t, the population (with size N) can be classified as people who are susceptible S(t), infected I(t), and recovered R(t) according to the following system of differential equations: $$\frac{dS(t)}{dt} = -\frac{\beta S(t) I(t)}{N}, \qquad \frac{dI(t)}{dt} = \frac{\beta S(t) I(t)}{N} - \lambda I(t), \qquad \frac{dR(t)}{dt} = \lambda I(t),$$ where β and λ represent the transmission rate and recovery rate, respectively. In theory, the population size for each state as a time series can be used to estimate the parameters in the model according to the system of equations. In practice, modelers rarely have an accurate count of people at each stage, and the parameters could change with time. The problem has been tackled using different approaches. For example, Zhu and Chen [22] considered a statistical transmission model for the early phase of the COVID-19 outbreak; Wu et al. [23] incorporated the possibility of people moving out of the compartments due to migration in a modified SEIR model. However, both approaches made the assumption that the transmission rate was constant. In many states within the US, and in many counties, we have seen a rapid change of the transmission rate caused by public behavior and public policy; therefore, it is not realistic to use a model with a constant transmission rate over a long period of time.
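As an illustration of the classical SIR dynamics described above, the following is a minimal sketch that integrates the standard SIR system (dS/dt = -βSI/N, dI/dt = βSI/N - λI, dR/dt = λI) with a forward-Euler step. The function name `simulate_sir`, the step size, and the parameter values in the usage lines are assumptions chosen for illustration; they are not taken from the paper.

```python
def simulate_sir(n, i0, beta, lam, days, dt=0.1):
    """Integrate the classical SIR equations with a forward-Euler step.

    n    : population size N
    i0   : initial number of infected people
    beta : transmission rate (hypothetical value supplied by the caller)
    lam  : recovery rate (hypothetical value supplied by the caller)
    Returns one (S, I, R) tuple per day, including day 0.
    """
    s, i, r = float(n - i0), float(i0), 0.0
    steps_per_day = round(1 / dt)
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt  # flow S -> I
            new_recoveries = lam * i * dt           # flow I -> R
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


# Illustrative values only: beta / lam = 3 gives a growing epidemic.
traj = simulate_sir(n=1_000_000, i0=10, beta=0.3, lam=0.1, days=60)
```

Because β and λ are held fixed here, the sketch also shows the limitation discussed above: a constant transmission rate cannot follow the rapid policy-driven changes seen in practice.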
Although many approaches to predicting infectious disease transmission have appeared in the literature, we have not found one method that can be used readily for day-to-day short-term forecasting. Godio et al. [24] used SEIR models for predicting epidemic evolution by means of a stochastic solver, which allows a time-dependent transmission rate. They model the transmission rate as a function of community mobility. This approach is more flexible than the constant transmission rate assumption. However, it still cannot capture other dynamic aspects of the environment that impact the transmission rate, such as mask mandates and the adoption of contact tracing, early testing, and isolation. Alternatively, Friston et al. [25] proposed a dynamic causal model framework for COVID-19, where they tried to include every variable that "matters" in the spread of the disease. This model suggested that individuals had four different characteristics: location, infection status, testing results, and clinical status (i.e., how sick they are). Each of these four characteristics contained four different states, and individuals could move from one state to another over time. The main challenge was that there were many parameters used in the model, and identifying accurate initial estimates of all the parameters is difficult for a novel infectious disease with non-specific symptoms and potentially many asymptomatic cases. The objective of this paper is to provide a method that can be reliably used to make predictions for the epidemic evolution in the next two to three weeks, based on the observed incidence cases only. Because deaths account for a relatively small percentage of the whole population, we ignore the death data in our modeling. The motivation for this work originated from pragmatic planning questions posed by local and state officials charged with allocating resources and ensuring population health. 
Members from the Texas A&M University School of Public Health started to monitor and forecast COVID-19 cases at the beginning of the pandemic, and then used the projected cases to support predictions for hospitalization and related health resource utilization.

Methods

Assuming that we have observed a time series of COVID-19 incidence cases up to a time t, our goal is to make predictions of incidence cases in the next two to three weeks. In an ideal scenario, all data sets would be calibrated to the time of infection (an admitted impossibility). However, publicly available data sets most often reflect the date of reporting, which may be the date of reporting to the local health department, but more often reflects the date of reporting up the chain, such as to the State health department. As such, day-to-day variations of reported incidence cases often reflect not the true variation of the disease infection but reporting capacity. In addition, a large data dump might occur because of attempts to process backlogged data. Therefore, we propose to smooth the data with a moving average (e.g. a 3-day weighted average) before performing any analysis. In the event of a big data dump, we also need to adjust the data and distribute the cases over time. These adjustments to public databases would not only improve model handling but also be valuable for our interpretation and application. Our approach to forecasting future COVID-19 cases involves two main steps. First, we model the observed incidence cases using ideas similar to those of Cori et al. [26]. Assuming a Poisson distribution for the daily incidence number, and a gamma distribution for the serial interval, we are able to estimate the parameter (i.e. the effective reproduction number R) in the model. In the forecasting step, we draw future incidence cases from their posterior predictive distributions, assuming that the current R will stay the same, decrease 5%, or increase 5%. The upper 95% posterior credible intervals (CI) for the increased R scenario together with the lower 95% posterior credible intervals for the decreased R scenario constitute our prediction intervals. The detailed description of our methods can be found in S1 Appendix. 
Some basic assumptions are necessary for using our methods. In order to determine the value of the effective reproduction number R, we made the assumption that R has a prior gamma distribution with a shape parameter of 1 and a scale parameter of 5, similar to Cori et al. [26]. We also assumed that the serial interval has a discretized gamma distribution [26] with a mean of 3.95 and a standard deviation of 4.24 [27]. These hyper-parameters are generally fixed in our model and in our projection. One parameter that we allow to vary is the time interval τ which we use to get reliable estimates of R. In essence, we assume that R is constant during the interval [t − τ + 1, t] so that we can get a reliable estimate of R(t) at time t. From our experience, τ = 7 days or τ = 12 days is recommended; the choice depends on the incidence numbers (smaller incidence counts require a larger τ) and the actual dynamic change of the transmission rate (a smaller τ can capture the change better). A detailed discussion of the assumptions and parameters used for our model is provided in the "Choosing Model Parameter" section in S1 Appendix.
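The estimation step can be illustrated with the conjugate gamma posterior of Cori et al. [26]: with a Gamma(shape a, scale b) prior and Poisson incidence, the posterior of R over the window [t − τ + 1, t] is gamma with shape a + Σ I(s) and rate 1/b + Σ Λ(s), where Λ(s) = Σ_k I(s − k) w(k). The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the serial interval is discretized by normalizing the gamma density at integer lags (a cruder scheme than the one in [26]), and all function names are hypothetical.

```python
import math


def discretized_gamma(mean, sd, max_days):
    """Serial-interval weights w(1..max_days).

    Assumption: the gamma density is evaluated at integer lags and
    normalized to sum to one; this is not the discretization of [26].
    """
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    dens = [s ** (shape - 1) * math.exp(-s / scale) for s in range(1, max_days + 1)]
    total = sum(dens)
    return [d / total for d in dens]  # w[0] is the weight for a lag of 1 day


def infectiousness(incidence, w, t):
    """Lambda(t) = sum over lags s >= 1 of I(t - s) * w(s)."""
    return sum(incidence[t - s] * w[s - 1] for s in range(1, min(t, len(w)) + 1))


def posterior_r(incidence, w, tau=7, prior_shape=1.0, prior_scale=5.0):
    """Gamma posterior (shape, scale) of R, held constant over the last tau days."""
    t_end = len(incidence) - 1
    window = range(t_end - tau + 1, t_end + 1)
    shape = prior_shape + sum(incidence[t] for t in window)
    rate = 1.0 / prior_scale + sum(infectiousness(incidence, w, t) for t in window)
    return shape, 1.0 / rate


# Illustrative data: a flat epidemic curve should give R close to 1.
w = discretized_gamma(mean=3.95, sd=4.24, max_days=30)
cases = [100] * 60
post_shape, post_scale = posterior_r(cases, w, tau=7)
r_mean = post_shape * post_scale
```

A larger τ adds more days to both sums, which narrows the posterior at the cost of reacting more slowly to a change in transmission; this is the trade-off between τ = 7 and τ = 12 described above.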

Application to COVID-19 data sets

We first demonstrate how to use our methods for predicting COVID-19 cases in Texas, a large and diverse state in the US with a population size of approximately 29 million. We utilize data from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. As of November 15, 2020, the total number of reported cases was 1,059,753, corresponding to an attack rate of 38.0 per 1,000 people. We emphasize the importance of understanding how the case reports can be influenced by administrative issues, and the need to adjust our model accordingly. For example, on September 21, 2020, 14,129 cases were reported for Harris County due to processing of backlogged data on that day. This artificial spike would influence the estimate of R, and consequently, the prediction going forward. Therefore, we reassigned those cases from Harris County according to the following rule: We first imputed the number of cases on that day using the average number of cases in the past seven days. Then we evenly spread the extra cases over the previous 31 days including the index day of September 21. The modified series would be treated as the observed series in our subsequent modeling analysis. Another modification we made was to smooth the data series. Due to the high variability of the daily cases, and the fact that there was often a delay in reporting especially during the weekends, we smoothed the data using the following algorithm, similar to Sun et al. [28]: where T is the last time point in the data series upon which a forecast is to be made. The smoothed data series were the data we used for generating our prediction models. As mentioned in the detailed "Methods" section in S1 Appendix, we first used the method of Cori et al. [26] to estimate the reproduction number R(t) for different time t based on the smoothed incidence data in Texas, with a cut-off date of November 15, 2020, and an interval of τ = 7 days. 
(Results using τ = 12 days are presented in S1–S3 Figs). The smoothed data series, the estimated R(t), and its 95% confidence intervals (CI) are shown in Fig 1.
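The backlog reassignment rule described above for the September 21 Harris County spike can be sketched as follows. This is a minimal illustration under two assumptions about the wording: "the past seven days" is read as the seven days before the index day, and "the previous 31 days including the index day" is read as the 31-day window ending on the index day. The function name and the toy series are hypothetical.

```python
def redistribute_backlog(cases, index, lookback=7, spread_days=31):
    """Replace a backlog spike at `index` and spread the excess backwards.

    1. Impute the index day with the mean of the preceding `lookback` days.
    2. Spread the excess (reported minus imputed) evenly over the
       `spread_days` days ending at, and including, the index day.
    """
    if index < lookback or index < spread_days - 1:
        raise ValueError("not enough history before the index day")
    adjusted = [float(c) for c in cases]
    imputed = sum(cases[index - lookback:index]) / lookback
    extra = cases[index] - imputed
    adjusted[index] = imputed
    for day in range(index - spread_days + 1, index + 1):
        adjusted[day] += extra / spread_days
    return adjusted


# Toy series: 100 cases per day with a dump of 3,100 extra cases on day 35.
reported = [100] * 40
reported[35] = 3200
adjusted = redistribute_backlog(reported, index=35)
```

The adjustment preserves the total case count, so cumulative figures are unchanged while the estimate of R is no longer distorted by a single-day spike.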

Fig 1

Texas incidence cases over time (smoothed) and the estimated effective reproduction number R(t) (95% CI in shaded area) using 7-day intervals.

It is clear from Fig 1 that there were different stages of COVID-19 spread in Texas. Due to the large number of incidence cases, the 95% CIs for the effective reproduction number R are quite narrow. During the month of April, the case counts were kept very low due to a statewide Shelter-in-Place order that was enacted by the Governor. The estimated R was close to 1.0 around mid-April. Beginning May 1, 2020, Texas started a phased reopening process, with many restrictions lifted in early June, right after the Memorial Day holiday. The daily incidence cases began to increase dramatically after Memorial Day weekend, and continued to rise throughout June, reaching a peak daily incidence of about 13,000 in early July. During this period, R gradually increased to a value of 1.325. A statewide mask mandate was implemented on July 3, 2020, and a couple of weeks after that, we started to see a downward trend in the incidence cases. The reproduction number slowly decreased to below 1.0 towards the end of July and during August. Unfortunately, the trend reversed starting in early September, with cases increasing again and a reproduction number above 1.0. The uptick was possibly due to Labor Day weekend gatherings and widespread reopening of in-person options for schools and colleges for the Fall 2020 semester. The epidemic was then kept under control for a while until mid-October, when COVID-19 cases started to increase dramatically both statewide and nationwide. For illustration purposes, we applied our prediction method at four equally spaced time points that were two months apart: April 15, June 15, August 15, and October 15. We plotted three projection lines corresponding to the predicted mean values when the transmission rate (or equivalently the reproduction number R) stayed the same, increased 5%, or decreased 5%. 
We also plotted the prediction intervals (shaded areas) based on the upper 95% CI limits for the 5% increasing R and the lower 95% CI limits for the 5% decreasing R scenario. The predicted daily cases and cumulative cases, together with their prediction intervals for the next three weeks are shown in Figs 2 and 3 separately.
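The three projection scenarios and the prediction band described above can be sketched by simulating forward from the posterior of R. This is a minimal illustration, not the authors' implementation: the ±5% change is assumed to mean multiplying each posterior draw of R by 1.05 or 0.95, the Poisson sampler is a simple stand-in (Knuth's method for small means, a rounded normal approximation otherwise), and the serial-interval weights and posterior parameters in the usage lines are hypothetical values.

```python
import math
import random


def sample_poisson(rng, lam):
    """Knuth's method for small means; rounded normal approximation otherwise."""
    if lam <= 0:
        return 0
    if lam > 30:
        return max(0, round(rng.gauss(lam, math.sqrt(lam))))
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def forecast(incidence, w, r_shape, r_scale, horizon=21, factor=1.0, draws=500, seed=0):
    """Simulate future daily cases; returns `draws` paths of length `horizon`."""
    rng = random.Random(seed)
    paths = []
    for _ in range(draws):
        # One posterior draw of R per path, scaled by the scenario factor.
        r = factor * rng.gammavariate(r_shape, r_scale)
        series = list(incidence)
        for _ in range(horizon):
            t = len(series)
            lam = sum(series[t - s] * w[s - 1] for s in range(1, min(t, len(w)) + 1))
            series.append(sample_poisson(rng, r * lam))
        paths.append(series[len(incidence):])
    return paths


def daily_quantile(paths, q):
    """Empirical quantile q of the simulated counts for each forecast day."""
    out = []
    for day in zip(*paths):
        ordered = sorted(day)
        out.append(ordered[min(len(ordered) - 1, int(q * len(ordered)))])
    return out


history = [100] * 30           # hypothetical smoothed incidence
weights = [0.5, 0.3, 0.2]      # hypothetical serial-interval weights
shape, scale = 701.0, 1 / 700.2  # hypothetical posterior of R (mean about 1)

same = forecast(history, weights, shape, scale, factor=1.00, seed=0)
up = forecast(history, weights, shape, scale, factor=1.05, seed=1)
down = forecast(history, weights, shape, scale, factor=0.95, seed=2)

upper = daily_quantile(up, 0.975)    # upper limit from the +5% scenario
lower = daily_quantile(down, 0.025)  # lower limit from the -5% scenario
```

Taking the upper limit from the increased-R scenario and the lower limit from the decreased-R scenario gives a band that is deliberately wider than the interval from any single scenario.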

Fig 2

Texas predicted incidence cases using 7-day intervals.

Fig 3

Texas predicted cumulative incidence cases using 7-day intervals.


Three solid lines represent the predicted cases corresponding to current rate of transmission sustained, 5% increase in transmission rate, and 5% decrease in transmission rate. The shaded areas indicate prediction intervals. As expected, our predictions performed differently at different times. On April 15, our forecast assuming constant transmission rate matched the observed data very well. On June 15, when R was increasing rapidly because of the business reopening process and the Memorial Day holiday weekend, the observed cases fell between our predicted curves assuming the same transmission rate and 5% increase in transmission rate. On August 15, we saw a gradual decrease in transmission rate due to a statewide mask mandate, and the forecast with 5% decrease in transmission rate matched the observed data closely. Finally, on October 15, we started to see an increasing trend again, and the forecast assuming 5% increase in transmission rate worked well. Secondarily, we chose to test the applicability of our model to a smaller geographic region within Texas. We applied our method to predicting the number of cases for the Brazos Valley (BV), a group of seven counties in Texas (i.e., Brazos, Robertson, Burleson, Madison, Grimes, Leon, and Washington counties), which collectively comprise the Bryan-College Station metropolitan area and neighboring counties. The center is Brazos County, where Texas A&M University is located. This area is approximately 100 miles from both Austin and Houston and has a younger population than Texas as a whole. Several healthcare entities and a public health authority in the BV needed timely and accurate forecasts to support planning for local COVID-19 cases. The BV incidence cases and the estimated reproduction number R(t) using 12-day intervals are presented in Fig 4. Due to small incidence cases in BV, the CIs for R were quite wide, making forecasting for BV more challenging. 
The trend for BV was influenced by the local context, so it did not always follow the trend in Texas. In addition, due to a relatively small population size (approximately 229,000), and sudden population changes caused by college students moving out (in late March, corresponding to the Stay-at-Home order) and then back to the region (in mid-August, corresponding to the start of the Fall semester), we saw more variability in the incidence cases for BV. Therefore, we chose to use 12-day intervals for our modeling approach, but we also provided results using 7-day intervals in S4–S6 Figs for additional information. All other parameters were the same as in the state model, and we made predictions on the same days as we did for the state model. The predicted daily incidence cases and cumulative incidence cases for BV are shown in Figs 5 and 6, respectively.

Fig 4

Brazos Valley incidence cases over time (smoothed) and the estimated effective reproduction number R(t) (95% CI in shaded area) using 12-day intervals.

Fig 5

Brazos Valley predicted incidence cases using 12-day intervals.

Fig 6

Brazos Valley predicted cumulative incidence cases using 12-day intervals.


Three solid lines represent the predicted cases corresponding to current rate of transmission sustained, 5% increase in transmission rate, and 5% decrease in transmission rate. The shaded areas indicate prediction intervals. On April 15, our prediction assuming the same transmission rate sustained agreed well with the observed cases. On June 15, when the transmission rate increased rapidly, the prediction upper bounds followed approximately the observed curve. Our forecast based on past history did not capture the increased case numbers at the end of August when school started, since we had an influx of cases due to thousands of students moving to Brazos county from all over Texas. Starting October 15, although past trend suggested increasing incidence cases, the observed data matched more closely with the prediction lower bounds. Our model and method produced reasonably accurate results when the R value is distributed similarly in the future as it is in the past. Large deviations from the predicted results can imply that a change in policy or some other factors have occurred that have dramatically altered the R value over time.

Conclusion

We have proposed a method that generates predictions for the number of COVID-19 infectious disease cases in the future, based on what estimates of R are like at the current time. The major strength of our approach lies in its simplicity, which makes it easy to implement with a small team of modellers. As such, we have incorporated it as part of a dashboard (https://covid19-modeltrac.shinyapps.io/TX-BV-ModelTrac/#section-tx-forecasts), where it can automatically generate forecasting values every day for a future view of three weeks using publicly-available data. This transparent and straightforward approach means that the method can be easily adopted by others who want to do similar predictions to help inform local or state-wide decision using public data sources. Our predicted case numbers can also be used as data inputs alongside other information for predicting health care utilization and health outcomes such as hospitalizations, intensive care unit (ICU) occupancy and corresponding ventilator use, and anticipated fatalities. These projections should be performed routinely to plan for surges and avoid overwhelming health resources. In Texas for example, hospitals are collectively working together using surge projections to identify and refer patients to available hospital beds [29]. A limitation for any infectious disease prediction model is the complexity inherent in how data are collected. Infectious disease reporting has long been plagued with many challenges. It is important to acknowledge that our model, as many others, relies on detection of infections through testing and reporting. In reality, the journey of a simple data element, from infection to tabulation, has many obstacles and nuances along the way. Some major complexities of the data include: policies about testing algorithms (e.g. 
which suspect cases are tested); if screenings or surveillance is conducted, which diagnostic test is acceptable or required for reporting; accessibility and availability of testing; administrative issues such as reporting requirements, procedures, and infrastructure. These elements can vary widely by locale and among populations within a locale. Thus, the available data are likely to represent some fraction of infections. Understanding the underlying caveats and how local situations contribute to limitations is essential to evaluating the model output. Even so, the opportunity for practical application of our model to provide insight for assessment, planning, and policy-making remains invaluable. Similar to the widely-adopted method for estimating R [26], we made a few assumptions, e.g. the incidence I(t) follows a Poisson distribution, with a mean parameter determined by a renewal function involving a serial function w(s). The serial function is assumed to have a discretized gamma distribution. The reproduction number R varies with time, but we assume that it is constant over a time interval (7 days, or 12 days) in order to obtain a stable estimate for its posterior distribution. Under these assumptions, we can predict the number of cases that could occur in the following two or three weeks, allowing R to stay the same, increase 5%, or decrease 5%. The assumption that R behaves similarly in the future as it does now is a major assumption, and is probably inaccurate if we project far into the future. However, we believe it to be a reasonable approximation of the true process if we want to see what happens in the next couple of weeks from the present. Because R is related to many factors, it can change dramatically. It is a function of transmission probability, which means it can be affected by a mask mandate. It is also affected by the average number of contacts one person has, hence, we expect that R might increase when in-person school resumes. 
In addition, it depends on how many days on average one person is infectious after becoming infected, which can be reduced by contact tracing and early isolation. The number of people that are susceptible or immune is also changing over time. As more people become infected and then recover, the effective R should decrease over time if other factors stay constant. If we want to make more accurate forecasts, we should allow a future R to be a function of all these different factors. Another way to think about this is that if we make projections according to current values of R, then any deviations from the current trend can be attributed to factors not explicit in our model, such as a policy implementation, or behavior changes arising from reactions to the current situation. One contributing factor to R that can be objectively measured is mobility data. If mobility data could provide insight on how R may vary, incorporating the mobility data in a prediction model can result in better predictions for R in the future, which in turn will result in better estimates for the number of incidence cases. Finding the trend of R values in the future using other data sources is a direction of our future research. In summary, we presented a modelling approach that we believe can be easily adopted by others, and immediately useful for local or state planning. Although many initially downplayed the long-term consequences of COVID-19 [30], it is now clear that new surges are appearing in the US as well as globally [31-33], and that the pandemic spread is likely to last for another year or two [3]. Thus, public health and governmental responses will need to be guided by data that pinpoint where, when, and among whom the new cases are occurring. This information can help guide public health messaging as well as the nature and degree of government responses to mandating public health practices or regulating business operations to limit spread. 
Timely projections regarding case counts are critical to planning for healthcare resources and assuring available care and best possible outcomes for populations facing the uncertainty of a rapidly emerging infectious disease during a pandemic response.

Technical details.

Methods for predicting COVID-19 cases and the selection of model parameters. (PDF)

Texas incidence cases over time (smoothed) and the estimated effective reproduction number R(t) (95% CI in shaded area) using 12-day intervals.


Texas predicted incidence cases using 12-day intervals.

Texas predicted cumulative incidence cases using 12-day intervals.

Brazos Valley incidence cases over time (smoothed) and the estimated effective reproduction number R(t) (95% CI in shaded area) using 7-day intervals.


Brazos Valley predicted incidence cases using 7-day intervals.

Brazos Valley predicted cumulative incidence cases using 7-day intervals.

Three solid lines represent the predicted cases corresponding to current rate of transmission sustained, 5% increase in transmission rate, and 5% decrease in transmission rate. The shaded areas indicate prediction intervals. (TIF)

28 Jan 2021

PONE-D-20-36327
COVID-19: Short term prediction model using daily incidence data
PLOS ONE

Dear Dr. Zhao,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE's publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Mar 14 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript: A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. 
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols We look forward to receiving your revised manuscript. Kind regards, John Schieffelin, MD Academic Editor PLOS ONE Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf 2.Thank you for stating the following in the Acknowledgments Section of your manuscript: "We thank Texas A&M University administration for internal funding to support this 264 work." We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows: "The author(s) received no specific funding for this work." Please include your amended statements within your cover letter; we will change the online submission form on your behalf. [Note: HTML markup is below. Please do not edit.] Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. 
Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Partly Reviewer #2: Yes ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: N/A Reviewer #2: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. 
(Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The need for an accessible method to estimate and predict SARS-CoV-2 incidence, both short- and long-term, is very real, and the authors propose an intriguing option to meet that need. However, they themselves point out that the proposed model is less successful when a variety of parameters shift within the projection interval. Under these circumstances the range encompassed by the +/- 5% change seems, in fact, unacceptably wide from an operational perspective, and not reasonable as suggested by the authors. And while the authors identified a number of changes of status that could be reasonably easy to identify (and avoid) for a prediction period, they never mentioned one of the truly problematic elements related to identification of SARS-CoV-2 infections (cases), which is the testing itself: not just the mentioned lag in reporting, but the actual uptake of testing, and the tremendous variability that can occur in uptake of diagnostic testing, influenced by supply shortages and by population interest in and access to testing at a local or state level. The unfortunate reality is that diagnosed and reported infections with SARS-CoV-2 are, in fact, some unknown fraction of true infections, which also changes over time. This model actually gives some evidence of that, providing much tighter ranges that align more closely with actual case counts in the early periods, with far less precision in the late intervals.

Reviewer #2: In the paper PONE-D-20-36327, "Covid-19, Short term prediction model using daily incidence data", Zhao et al. propose a new approach to forecast the number of incident cases in the near future using some assumptions. Based on the paper, they report that the method can produce reasonable results and that large deviations from the predicted results can imply a change in policy or some other factors. The results seem reasonable.
Some similar results have been studied by Jin's group at Fudan (see [CCJL2020], [SZYPCC2020], [P2020]). Jin's model is well suited to Chinese data, but the setting and data in the USA are more complicated. Zhao's work is interesting. One suggestion is that, instead of working with the original number of incident cases, the authors could filter or smooth it, for example with a 7-day average.

[CCJL2020] Chen, Y., Cheng, J., Jiang, Y. and Liu, K. A time delay dynamical model for outbreak of 2019-nCoV and the parameter identification. J. Inverse Ill-Posed Probl., 28(2020), 243–250.
[SZYPCC2020] Shao, N., Zhong, M., Yan, Y., Pan, H., Cheng, J. and Chen, W. Dynamic models for coronavirus disease 2019 and data analysis. Math. Methods Appl. Sci., 43(2020), 4943–4949.
[P2020] Pan, H., Shao, N., Yan, Y., Luo, X., Wang, S., Ye, L., Cheng, J. and Chen, W. Multi-chain Fudan-CCDC model for COVID-19-a revisit to Singapore's case. Quantitative Biology, 8(4)(2020), 325–335.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/.
PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

11 Feb 2021

We thank the reviewers and the academic editor for valuable comments on our submitted manuscript entitled "COVID-19: Short term prediction model using daily incidence data" (PONE-D-20-36327). We have made changes according to your requests and replied to the questions in the file named "Response to Reviewers.docx". We also uploaded two copies of our manuscript: a marked-up copy that highlights the changes made to the original version, and an unmarked version without tracked changes. Thank you very much for your consideration. We look forward to hearing back from you soon.

Submitted filename: Response to Reviewers 11.02.2021.docx

31 Mar 2021

COVID-19: Short term prediction model using daily incidence data
PONE-D-20-36327R1

Dear Dr. Zhao,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date.
If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible, and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
John Schieffelin, MD
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.
Reviewer #1: All comments have been addressed
Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: (No Response)
Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: (No Response)
Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #1: (No Response)
Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: (No Response)
Reviewer #2: Yes

**********

6. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: In the paper "COVID-19: Short term prediction model using daily incidence data", the authors describe a new approach that forecasts the number of incident cases. They first model the observed incident cases using a Poisson distribution for the daily incidence number and a gamma distribution for the serial interval, then estimate the effective reproduction number assuming its value stays constant during a short time interval, and finally draw future incident cases from their posterior distributions.
The method is interesting and new, and the forecast results and explanation seem reasonable.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No

5 Apr 2021

PONE-D-20-36327R1
COVID-19: Short term prediction model using daily incidence data

Dear Dr. Zhao:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. John Schieffelin
Academic Editor
PLOS ONE
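The three steps that Reviewer #2 summarizes in the correspondence above (Poisson daily incidence with a gamma serial interval, an effective reproduction number held constant over a short window, and future cases drawn from the posterior) can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes a conjugate gamma prior on the reproduction number, a simple pointwise discretization of the serial interval, and a normal approximation for large Poisson draws. All function names, the prior parameters, and the serial interval mean and standard deviation used in the usage note are hypothetical placeholders. The `moving_average` helper corresponds to the reviewer's optional 7-day smoothing suggestion.

```python
import math
import random


def serial_interval_weights(mean, sd, max_days=20):
    """Discretize a gamma serial interval by evaluating its density at
    days 1..max_days and normalizing (a simple approximation)."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    dens = [s ** (shape - 1) * math.exp(-s / scale) for s in range(1, max_days + 1)]
    total = sum(dens)
    return [d / total for d in dens]


def infectiousness(incidence, weights, t):
    """Lambda_t = sum_s w_s * I_{t-s}: weighted sum of past incidence."""
    return sum(w * incidence[t - s]
               for s, w in enumerate(weights, start=1) if t - s >= 0)


def posterior_r(incidence, weights, window=7, prior_shape=1.0, prior_rate=0.2):
    """Gamma posterior (shape, rate) for R, assumed constant over the last
    `window` days. The prior values are illustrative assumptions."""
    days = range(len(incidence) - window, len(incidence))
    shape = prior_shape + sum(incidence[t] for t in days)
    rate = prior_rate + sum(infectiousness(incidence, weights, t) for t in days)
    return shape, rate


def draw_poisson(lam, rng):
    """Poisson draw: Knuth's method for small lam, rounded normal
    approximation otherwise (an assumption made to stay in the stdlib)."""
    if lam < 30:
        limit, k, p = math.exp(-lam), 0, rng.random()
        while p > limit:
            k += 1
            p *= rng.random()
        return k
    return max(0, round(rng.gauss(lam, math.sqrt(lam))))


def forecast(incidence, weights, horizon=7, n_draws=500, change=1.0, seed=0):
    """Simulate future daily counts; `change` scales R (e.g. 1.05 for +5%)."""
    rng = random.Random(seed)
    shape, rate = posterior_r(incidence, weights)
    paths = []
    for _ in range(n_draws):
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        r = change * rng.gammavariate(shape, 1.0 / rate)
        series = list(incidence)
        for _ in range(horizon):
            lam = r * infectiousness(series, weights, len(series))
            series.append(draw_poisson(lam, rng))
        paths.append(series[len(incidence):])
    return paths


def moving_average(xs, k=7):
    """Trailing k-day mean (shorter window at the start of the series)."""
    out = []
    for i in range(len(xs)):
        chunk = xs[max(0, i - k + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out
```

A typical call would be `forecast(cases, serial_interval_weights(4.7, 2.9), change=1.05)`, where `cases` is a list of daily counts and the serial interval values are placeholders to be replaced with estimates appropriate to the setting. Quantiles taken across the simulated paths then give a prediction interval for each future day.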
