
Estimating COVID-19 $R_t$ in Real-time: An Indonesia health policy perspective.

Abstract

COVID-19 is caused by SARS-CoV-2, a novel member of the coronavirus family. It can cause serious illness, with symptoms including fever, cold, cough, and respiratory blockage. The virus is contagious and originated in Wuhan, China. In March 2020, the WHO declared the outbreak a pandemic because of its rapid spread. Indonesia is also having a hard time controlling the spread. Hence, it is essential to understand the spread rate in Indonesia and to analyze strategies for minimizing it. The proposed study can be used to assess variations in virus spread both nationally and sub-nationally. This allows public health officials and policy-makers to track the progress of the outbreak in near real-time using an epidemiologically valid measure.


Keywords: Bayesian; COVID-19; Coronavirus; Epidemic outbreak; Likelihood; Posterior; Spread rate

Year: 2021 PMID: 34939041 PMCID: PMC8378038 DOI: 10.1016/j.mlwa.2021.100136

Source DB: PubMed Journal: Mach Learn Appl ISSN: 2666-8270

Introduction to Covid-19

Recently, the big question worldwide has been how to slow the epidemic down. However, many additional factors influence the pandemic's spread rate. Countries and their populations depend on insights from studies and trials to recommend possible strategies and to gauge the efficacy of implemented policies (Ivanov, 2020, Remuzzi and Remuzzi, 2020). COVID-19 has infected over 140 million people and caused over 3 million confirmed deaths worldwide. The current COVID-19 epidemic has shown non-linear and intricate behavior (Koolhof et al., 2020). Furthermore, it differs from other recent epidemics, which makes precise outcomes hard to determine. Besides, several known and unidentified features are visible in populations dispersed across different geographical regions (Darwish et al., 2020, Rypdal and Sugihara, 2019). Thus, typical mathematical analysis of pandemics struggles to produce accurate results. To overcome these difficulties, various mathematical models based on assumptions have been developed (Dallas et al., 2019, De Groot and Ogris, 2019, Kelly et al., 2019, Koike and Morimoto, 2018, Scarpino and Petri, 2019, Zhan et al., 2019). In the past few months, most countries, including Indonesia, have repeatedly referred to the $R_t$ of COVID-19 when deciding how to reach a new normal and resume economic activity. Most countries, including Indonesia, are trying to lower $R_t$ below 1. Let us see what $R_t$ is and how it reflects Indonesia’s present health catastrophe.

What is $R_t$?

$R_t$ is the virus's effective reproduction number, where $t$ indicates a given moment in time. $R_0$ and $R_t$ determine by what factor the infection is likely to spread. $R_t$ is the estimated number of additional cases caused by each infected person, who in turn spread the infection further. $R_t$ can vary from place to place. For example, the typical $R_t$ for COVID-19 cases in Indonesia is 1.4. This means that every 1000 people with COVID-19 will infect 1400 more. Those 1400 newly infected people would then infect an additional 1960, and the spread continues. In about ten such cycles, the infection could reach over 60 thousand people. When $R_t$ exceeds 1, the virus spreads exponentially; when $R_t$ is less than 1, the epidemic falls drastically because each generation infects fewer people than the last. Following its survey in Hubei, China, the WHO announced that $R_t$ is the most important factor for analyzing the spread rate.
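The arithmetic in this example can be checked with a short sketch. It assumes, purely for illustration, that every infection cycle multiplies the number of new cases by a constant reproduction number and that nobody recovers or becomes immune. The function name `project_cases` is hypothetical and not part of the paper.

```python
def project_cases(initial_cases, r_t, cycles):
    """Return new cases per cycle, starting with the initial cohort (cycle 0)."""
    new_cases = [float(initial_cases)]
    for _ in range(cycles):
        # Each cohort infects r_t times as many people in the next cycle.
        new_cases.append(new_cases[-1] * r_t)
    return new_cases


per_cycle = project_cases(1000, 1.4, 10)
print([round(c) for c in per_cycle[:3]])      # first cohorts: 1000, 1400, 1960
print(round(sum(per_cycle[:10])))             # cumulative cases over the first ten cohorts

shrinking = project_cases(1000, 0.8, 10)
print(round(sum(shrinking[:10])))             # same horizon with a reproduction number below 1
```

With a reproduction number above 1 the cumulative total grows geometrically, while a value below 1 makes each cohort smaller than the previous one, so the total levels off.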

How is it calculated and what affects it?

A few factors influence the estimation of $R_0$ and $R_t$. According to the WHO, COVID-19 spreads through droplets, and the virus may remain in the air or on surfaces for about 2 to 3 days. Although controlling the spread is complicated, the only way to reduce the infectious period is to identify and isolate infected persons. Social isolation could drastically reduce the spread rate. It is also necessary to account for Indonesia's geographical and population diversity when calculating $R_t$. The only way to lower the spread rate is through external policy interventions. An $R_t$ of 1.1 may suddenly increase when restrictions are lifted. So, before social restrictions are relaxed, it is necessary to estimate $R_t$ accurately and to check how consistent it is from day to day. This paper describes the rate of spread across Indonesia and the impact of the virus over time. The remaining sections of the paper are structured as follows. Section 2 discusses the literature review. Section 3 describes the proposed research work and its mathematical model. Section 4 provides the experimental results. Section 5 presents the discussion and concluding remarks.

Literature review

The dynamic spread rate of COVID-19 is difficult to predict accurately using machine learning models, and over time this has become even more challenging (Agarwal et al., 2018, Burke et al., 2019, Carlson et al., 2018, Kleiven et al., 2018). Even though machine learning models were used in past epidemics, the current pandemic differs from them in many ways (Chenar and Deng, 2018, Iqbal and Islam, 2019, Liang et al., 2020, Maciel et al., 2019, Raja et al., 2019, Tapak et al., 2019). Machine learning has made precise predictions for ecological catastrophes, which could serve as a basis for the current outbreak. COVID-19 prompted a different, self-imposed form of slump compared with past ones (Sucahya, 2020).

Predicting the end of Coronavirus disease

The COVID-19 pandemic has turned into a global emergency and has affected many lives in the past few months. This work attempts to show with data how similar and how different the spread of the pandemic is across countries, and to understand how the disease has been growing and spreading across the world since 22 January 2020. The paper focuses on three data series: confirmed, recovered, and death cases.

Confirmed cases

Initially, China had the most confirmed cases; later, the USA and other countries with substantially smaller populations passed that number within a very short time of the outbreak. Broken down by continent, the trends show that Asia is essentially reaching a flat curve, while Europe and the Americas are still increasing rapidly. Very few cases are reported in Oceania (as expected, given its relatively low population). The numbers in Asia are overwhelmingly dominated by India, which, after very rigid containment measures, appears to be close to the top slot in the spread of the disease. The numbers in the Americas are dominated by the USA, where the outbreak is particularly intense now.

Proposed methodology

Forecasting a contagious virus over time with default machine learning models can be difficult. It is therefore necessary to customize the mathematical model to account for various factors.

Customized mathematical model to estimate Indonesia's $R_t$

For example, imagine an observed value of 20 new cases; the corresponding likelihood is shown in Fig. 2.

Fig. 2

Likelihood function with K = 20.

In most pandemics, $R_t$ indicates the virus spread rate, that is, how many people get infected over time. Mathematically, $R_t$ is equal to $R_0$ when $t = 0$. But $R_0$ alone may not throw light on actions and restrictions. As the epidemic grows, the restrictions may change, so it is very important to know the present value of $R_t$. If $R_t > 1$, the epidemic will infect more people. If $R_t < 1$, the epidemic will slow down. The smaller $R_t$ is, the easier the spread is to handle. Generally, $R_t < 1$ indicates that the epidemic is well under control.

Hence, $R_t$ provides a couple of benefits. First, it makes it possible to determine when the epidemic began and when it turned into a pandemic. Second, it provides very important information on which actions and restrictions to apply. Even top doctors and scientists claim that $R_t$ alone could guide us in controlling the pandemic. Today, we are not using $R_t$ in this way. Essentially, it is only understood at a national level. To manage this crisis effectively, we instead need $R_t$ at a local (state, county, and/or city) granularity.

The proposed model introduces a new process model in which Gaussian noise is used to estimate a time-varying $R_t$. The Gaussian process makes the model much more responsive. Every day, we learn how many more people have been infected by COVID-19. The new case count provides information about the present value of $R_t$. Today's $R_t$ is also connected to $R_{t-1}$ (the previous day's value), and hence every past value contributes to $R_t$. After each new observation, Bayes’ rule is used to update the belief about the actual value of $R_t$ based on the number of additional cases reported each day. Bayes’ theorem as used in the proposed model is:

$$P(R_t \mid k) = \frac{P(k \mid R_t)\, P(R_t)}{P(k)} \tag{1}$$

In Eq. (1), $k$ is the number of new cases. Having seen $k$ new cases, the distribution of $R_t$ is equal to the likelihood $P(k \mid R_t)$ of seeing $k$ new cases given $R_t$, times the prior belief $P(R_t)$ about the value of $R_t$ without the data, divided by the probability $P(k)$ of seeing this many cases in general.

To make this iterative, each day that passes uses the previous day's prior $P(R_{t-1})$ to estimate the present day's prior $P(R_t)$. The distribution of $R_t$ is assumed to be a Gaussian centered on $R_{t-1}$:

$$P(R_t \mid R_{t-1}) = \mathcal{N}(R_{t-1}, \sigma)$$

where $\sigma$ is a hyperparameter. So, on the first day:

$$P(R_1 \mid k_1) \propto P(R_1)\, \mathcal{L}(R_1 \mid k_1)$$

On the second day:

$$P(R_2 \mid k_1, k_2) \propto P(R_2)\, \mathcal{L}(R_2 \mid k_2) = \sum_{R_1} P(R_1 \mid k_1)\, P(R_2 \mid R_1)\, \mathcal{L}(R_2 \mid k_2)$$

Selecting a Probability Function

The probability function tells how likely it is to see $k$ new cases given $R_t$. Whenever arrivals over some time period need to be modeled, statisticians like to use the Poisson distribution. If new cases arrive at an average rate of $\lambda$ per day, the probability of seeing $k$ new cases is distributed according to the Poisson distribution:

$$P(k \mid \lambda) = \frac{\lambda^k e^{-\lambda}}{k!}$$

In the computation, a terse expression makes $k$ a column. Given a column for $k$ and a row for $\lambda$, the pmf is evaluated over both, producing an array with $k$ rows and $\lambda$ columns. This is an efficient way of producing many distributions at once. In Fig. 1, the Poisson distribution shows that if $\lambda$ cases are expected per day, roughly that many will be observed, plus or minus some variation due to chance. In the real case, however, $k$ cases have been observed, and the question is which value of $\lambda$ is most likely. To answer it, $k$ is held fixed while $\lambda$ is varied. This is called the likelihood function.
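The grid evaluation described here, with rows for case counts and columns for arrival rates, can be sketched with the Python standard library alone. The helper `poisson_pmf`, the range of counts, and the four arrival rates are illustrative assumptions, not values taken from the paper.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing k events when the expected count is lam (lam > 0)."""
    # Work in log space so that large k does not overflow the factorial.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


ks = list(range(0, 70))        # candidate daily case counts (rows)
lambdas = [10, 20, 30, 40]     # illustrative arrival rates (columns)

# grid[i][j] = P(k = ks[i] | lambda = lambdas[j])
grid = [[poisson_pmf(k, lam) for lam in lambdas] for k in ks]

# Each column is one complete Poisson distribution over the case counts.
column_totals = [sum(row[j] for row in grid) for j in range(len(lambdas))]
print(len(grid), len(grid[0]))
print(column_totals)
```

Reading down a column gives the distribution of case counts for one arrival rate. Reading across a row gives, for one observed count, how plausible each arrival rate is.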

Fig. 1

Poisson Distribution.

Fig. 2 shows that if 20 cases are observed, the most likely value of $\lambda$ is (not surprisingly) 20. But $\lambda$ could also be 21 or 17 and still have produced 20 new cases by chance alone. It is unlikely, however, that $\lambda$ was 40 when 20 cases were observed. So far, the likelihood is parameterized by $\lambda$, but we are looking for $P(k \mid R_t)$, which is parameterized by $R_t$. So, it is necessary to know the relationship between $\lambda$ and $R_t$.
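A minimal sketch of this likelihood function, assuming an illustrative grid of arrival rates from 1 to 45 in steps of 0.5 (the grid and the variable names are not from the paper):

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of k events for expected count lam (lam > 0)."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


observed_k = 20
lambdas = [1 + 0.5 * i for i in range(89)]       # 1.0, 1.5, ..., 45.0

# Likelihood: the observed count is fixed, the arrival rate varies.
likelihood = [poisson_pmf(observed_k, lam) for lam in lambdas]
most_likely = lambdas[likelihood.index(max(likelihood))]

# Plausibility of other arrival rates relative to the best one.
peak = poisson_pmf(observed_k, 20)
ratio_17 = poisson_pmf(observed_k, 17) / peak
ratio_21 = poisson_pmf(observed_k, 21) / peak
ratio_40 = poisson_pmf(observed_k, 40) / peak
print(most_likely, ratio_17, ratio_21, ratio_40)
```

The ratios show the pattern described in the text: arrival rates near 20 remain plausible, while 40 is far less plausible than the peak.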

Connecting $\lambda$ and $R_t$

The key insight to making this work is to realize that there is a connection between $R_t$ and $\lambda$:

$$\lambda = k_{t-1}\, e^{\gamma (R_t - 1)}$$

where $\gamma$ is the reciprocal of the serial interval (about 7 days for COVID-19). Recent COVID-19 analysis shows that, during its exponential growth phase, the outbreak used to double in about 7 days. Since the new case count on the previous day, $k_{t-1}$, is known, the likelihood function can be reformulated as a Poisson parameterized by fixing $k$ and varying $R_t$, as shown in Eq. (7):

$$P(k \mid R_t) = \frac{\lambda^k e^{-\lambda}}{k!}, \qquad \lambda = k_{t-1}\, e^{\gamma (R_t - 1)} \tag{7}$$

A serial interval of 7 days ($\gamma = 1/7$) is assumed for COVID-19 based on the following analysis. To obtain accurate values of $R_0$, we used earlier assessments of serial intervals for COVID-19. The serial interval is estimated to be 7–8 days based on data collected. More recent data collected in some provinces in Indonesia suggest that the serial interval depends on the time to hospital isolation. When infected persons are isolated after 5 days of symptoms (as happened initially, when the public was not aware of the virus and few interventions were implemented), the serial interval is estimated to be 8 days. Thus, these results suggest a serial interval of 7–8 days. With this serial interval, we sampled latent and infectious periods within wide biologically plausible ranges and estimated the median $R_0$ to be 5.8 (95% CI 4.4–7.7), as shown in Fig. 3. To include a wider range of serial intervals (i.e., 6–9 days), given the uncertainties in these estimates, we estimated the median $R_0$ to be 5.7 (95% CI 3.8–8.9) (Fig. 3B). The estimated $R_0$ can be lower if the serial interval is shorter. However, recent studies reported that persons can be infectious for a long period, such as 1–3 weeks after symptom onset. Thus, we believe that a mean serial interval shorter than 6 days is unlikely during the early outbreak in Indonesia, where infected persons were not rapidly hospitalized.
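As a sketch of how the link between the arrival rate and the reproduction number turns the Poisson likelihood into a likelihood over $R_t$, the following assumes the relation $\lambda = k_{t-1} e^{\gamma (R_t - 1)}$ with $\gamma = 1/7$. The function names, the grid of candidate values, and the two case counts are illustrative assumptions.

```python
import math

GAMMA = 1 / 7   # assumed reciprocal of a 7-day serial interval


def expected_cases(previous_cases, r_t, gamma=GAMMA):
    """Arrival rate implied by yesterday's count: k_{t-1} * exp(gamma * (R_t - 1))."""
    return previous_cases * math.exp(gamma * (r_t - 1))


def likelihood_of_r(new_cases, previous_cases, r_values, gamma=GAMMA):
    """Poisson likelihood of each candidate R_t given today's and yesterday's counts."""
    result = []
    for r_t in r_values:
        lam = expected_cases(previous_cases, r_t, gamma)
        log_p = new_cases * math.log(lam) - lam - math.lgamma(new_cases + 1)
        result.append(math.exp(log_p))
    return result


r_values = [i / 100 for i in range(0, 1201)]    # candidate R_t from 0 to 12
like = likelihood_of_r(new_cases=25, previous_cases=20, r_values=r_values)
best_r = r_values[like.index(max(like))]
print(best_r)   # candidate at which the implied arrival rate is closest to 25
```

When the reproduction number equals 1, the expected count equals the previous day's count, so any growth in cases pushes the most likely value above 1.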

Fig. 3

Estimation of the basic reproductive number (R0), derived by integrating uncertainties in parameter values, during the coronavirus disease outbreak in Indonesia. (A) Changes in R0 based on different growth rates and serial intervals. Each dot represents a calculation with mean latent period (range 2.2–6 days) and mean infectious periods (range 5–14 days). Only those estimates falling within the range of serial intervals of interest were plotted. (B) Histogram summarizing the estimated R0 of all dots in panel A (i.e., serial interval ranges of 6–9 days). The median R0 is 5.7 (95% CI 3.8–8.9).

Evaluating the likelihood function

To continue the example, imagine a sample of new case counts $k$. The likelihood of different values of $R_t$ on each of those days is shown in Fig. 4.

Fig. 4

Likelihood of R with various K values.

It is noticed that each day we have an independent guess for R. The goal is to combine the information we have about previous days with the current day. To do this, Bayes’ theorem is used.

Performing the Bayesian update

To perform the Bayesian update, the likelihood is multiplied by the prior (which, without our Gaussian process, is just the product of the previous days’ likelihoods) to get the posterior. This is done using the cumulative product over successive days, and the results are shown in Fig. 5.
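The cumulative-product update can be sketched as follows. The case counts are made up for illustration (the paper's sample is not reproduced here), the starting prior is assumed flat, and no Gaussian noise is applied, matching the simplified update in this section.

```python
import math

GAMMA = 1 / 7   # assumed reciprocal of a 7-day serial interval


def likelihood_of_r(new_cases, previous_cases, r_values):
    """Poisson likelihood of each candidate R_t for one day."""
    out = []
    for r_t in r_values:
        lam = previous_cases * math.exp(GAMMA * (r_t - 1))
        out.append(math.exp(new_cases * math.log(lam) - lam - math.lgamma(new_cases + 1)))
    return out


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


r_values = [i / 100 for i in range(0, 1001)]    # candidate R_t from 0 to 10
cases = [20, 40, 55, 90]                        # made-up counts; the first only seeds Day 1

posteriors = []
running = [1.0] * len(r_values)                 # flat starting prior
for previous, current in zip(cases[:-1], cases[1:]):
    day_likelihood = likelihood_of_r(current, previous, r_values)
    # Posterior is proportional to prior times likelihood (no process noise here).
    running = normalize([p * l for p, l in zip(running, day_likelihood)])
    posteriors.append(running)

peaks = [r_values[p.index(max(p))] for p in posteriors]
print(peaks)    # most likely R_t after Day 1, Day 2 and Day 3
```

Because each posterior is normalized on the same grid, a taller peak on a later day indicates a narrower distribution, that is, more confidence in the estimate.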

Fig. 5

Cumulative product of each successive day with R.

Fig. 5 shows that on Day 1 the posterior matches Day 1’s likelihood, because no information other than that day's was available. When the prior is updated using Day 2's information, the curve moves left, but not nearly as far left as the likelihood for Day 2. This is because Bayesian updating uses information from both days and effectively averages the two. Since Day 3's likelihood lies between the other two, there is a small shift to the right but, more significantly, a narrower distribution. This reflects greater confidence in our belief about the true value of $R_t$. From these posteriors, it is easy to answer important questions such as “What is the most likely value of $R_t$ each day?” and to obtain the highest-density intervals (HDI) for $R_t$. Based on this analysis, both the most likely values of $R_t$ and the HDIs over time are calculated and shown in Fig. 6.
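One way to obtain a highest-density interval from a discretized posterior is to search for the narrowest contiguous range of grid values that holds the requested probability mass. The sketch below is an assumption about how this can be done, not the paper's implementation. The function name is hypothetical and the bell-shaped test posterior is synthetic.

```python
import math


def highest_density_interval(r_values, posterior, mass=0.9):
    """Narrowest contiguous range of r_values holding at least `mass` probability."""
    total = sum(posterior)
    probs = [p / total for p in posterior]
    best = (r_values[0], r_values[-1])
    best_width = best[1] - best[0]
    window = 0.0
    low = 0
    for high, p in enumerate(probs):
        window += p
        # Shrink from the left while the window still holds enough mass.
        while window - probs[low] >= mass:
            window -= probs[low]
            low += 1
        width = r_values[high] - r_values[low]
        if window >= mass and width < best_width:
            best, best_width = (r_values[low], r_values[high]), width
    return best


# Synthetic bell-shaped posterior centered on 1.5 with standard deviation 0.2.
r_values = [i / 100 for i in range(0, 301)]
posterior = [math.exp(-0.5 * ((r - 1.5) / 0.2) ** 2) for r in r_values]

low_90, high_90 = highest_density_interval(r_values, posterior, mass=0.9)
low_50, high_50 = highest_density_interval(r_values, posterior, mass=0.5)
print(low_90, high_90, low_50, high_50)
```

For a single-peaked posterior the narrowest such window coincides with the highest-density region. A multi-peaked posterior would need a different approach.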

Fig. 6

Most likely values of R and the HDI.

Fig. 6 shows that the most likely value of $R_t$ changes with time and that the highest-density interval narrows as we become more sure of the true value of $R_t$ over time.

Results

Given the present situation, the analysis should start once several cases are being confirmed each day: find the last day with zero new cases and start analyzing on the day after it. Daily reported cases are also erratic because of testing backlogs and similar effects. In the proposed methodology, a Gaussian filter is applied to the time series to get the best view of the ‘true’ data. This is an arbitrary choice, but the real-world process is not nearly as stochastic as the actual reporting.
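The two preprocessing steps can be sketched as below. The paper does not state the filter settings, so the 7-day window and standard deviation of 2 are assumptions, as are the function names, the order of the two steps, and the made-up daily counts.

```python
import math


def trim_to_outbreak(values):
    """Drop everything up to and including the last day with zero new cases."""
    last_zero = max((i for i, v in enumerate(values) if v == 0), default=-1)
    return values[last_zero + 1:]


def gaussian_smooth(values, window=7, std=2.0):
    """Centered moving average with Gaussian weights; edges use the available part of the window."""
    half = window // 2
    offsets = list(range(-half, half + 1))
    weights = [math.exp(-0.5 * (offset / std) ** 2) for offset in offsets]
    smoothed = []
    for i in range(len(values)):
        num = den = 0.0
        for offset, w in zip(offsets, weights):
            j = i + offset
            if 0 <= j < len(values):
                num += w * values[j]
                den += w
        smoothed.append(num / den)
    return smoothed


daily = [0, 2, 0, 5, 9, 4, 15, 11, 30, 18, 25]   # made-up daily new cases
trimmed = trim_to_outbreak(daily)
smoothed = gaussian_smooth(trimmed)
print(trimmed)
print([round(v, 1) for v in smoothed])
```

Because every smoothed value is a weighted average of its neighbours, it stays between the smallest and largest reported counts while day-to-day reporting spikes are damped.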

COVID-19 global spread trends

Table 1 shows continent-wise detailed COVID-19 confirmed cases, deaths, recovered, active cases, incident, and mortality rates.

Table 1

Continent wise recent COVID-19 cases details.

Fig. 7 shows how fast COVID-19 has spread from China to other parts of the world. Fig. 8 shows the global prediction of new confirmed cases.

Fig. 7

Global COVID-19 Spread.

Fig. 8

COVID-19 Global confirmed cases prediction.

Fig. 8 shows that new confirmed cases are increasing exponentially, with no sign of flattening or a downward trend. Fig. 9 shows the future prediction of death cases. Fig. 10 shows the prediction for the USA, a highly affected country. Fig. 11 shows the new cases per day in Indonesia.

Fig. 9

COVID-19 Global death cases prediction.

Fig. 10

COVID-19 USA confirmed cases prediction.

Fig. 11

Indonesia new confirmed cases trend.


Choosing $\sigma$ for the Gaussian process

Choosing the right value of $\sigma$ for the Gaussian process is essential for predicting the future trend. The general approach is simply to use yesterday’s posterior as today’s prior. While intuitive, this does not allow for the possibility that the value of $R_t$ has changed since yesterday. To account for that change, Gaussian noise with some standard deviation $\sigma$ is applied to the prior distribution. The higher $\sigma$ is, the more noise there is and the more $R_t$ is expected to drift each day. Interestingly, applying noise on noise iteratively means that distant posteriors decay naturally. This has the same effect as windowing, but it is more robust and does not arbitrarily forget posteriors after a certain time, as general approaches do. However, there is still an arbitrary choice to make: which value of $\sigma$ to use. It is chosen by maximum likelihood, that is, $\sigma$ is set to the value that maximizes the likelihood of the data $P(k)$. Because $\sigma$ is fixed within any one run, it is left out of the notation, and the goal is to maximize $P(k)$ over all choices of $\sigma$. Since $P(k)$ is the product of the daily probabilities $P(k_t)$, $P(k_t)$ must be defined. It turns out to be the denominator of Bayes' rule, as shown in Eq. (8):

$$P(R_t \mid k_t) = \frac{P(k_t \mid R_t)\, P(R_t)}{P(k_t)} \tag{8}$$

In Eq. (8), the numerator is just the joint distribution of $k_t$ and $R_t$. The joint distribution can therefore be marginalized over $R_t$ to get $P(k_t)$:

$$P(k_t) = \sum_{R_t} P(k_t \mid R_t)\, P(R_t)$$

So, the numerator is summed over all values of $R_t$ to get $P(k_t)$, and the posterior is then calculated. In the proposed method, the optimum value of $\sigma$ is chosen to maximize $P(k)$, that is, to maximize the value of Eq. (10):

$$\prod_{t,i} P(k_{t,i}) \tag{10}$$

where $t$ indicates time and $i$ indicates each state of Indonesia. Since this multiplies lots of tiny probabilities together, it can be easier (and less error-prone) to take the $\log$ of the values and add them together. Maximizing the sum of the $\log$ of the probabilities is the same as maximizing the product of the non-logarithmic probabilities for any choice of $\sigma$.
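The effect of applying Gaussian noise to the prior can be sketched with a transition matrix over a grid of candidate reproduction numbers. The grid, the value of the standard deviation, and all function names are illustrative assumptions. Yesterday's belief is taken as a single certain value so that the widening is easy to see.

```python
import math


def gaussian_process_matrix(r_values, sigma):
    """matrix[i][j] = P(R_t = r_values[i] | R_{t-1} = r_values[j]); each column sums to 1."""
    n = len(r_values)
    matrix = [[math.exp(-0.5 * ((r_values[i] - r_values[j]) / sigma) ** 2)
               for j in range(n)] for i in range(n)]
    for j in range(n):
        column_total = sum(matrix[i][j] for i in range(n))
        for i in range(n):
            matrix[i][j] /= column_total
    return matrix


def diffuse(matrix, yesterday):
    """Today's prior: yesterday's distribution blurred by the Gaussian transition."""
    return [sum(w * p for w, p in zip(row, yesterday)) for row in matrix]


def spread(r_values, dist):
    """Standard deviation of a discrete distribution over r_values."""
    mean = sum(r * p for r, p in zip(r_values, dist))
    return math.sqrt(sum(p * (r - mean) ** 2 for r, p in zip(r_values, dist)))


r_values = [i / 20 for i in range(0, 121)]               # candidate values 0.00 to 6.00
matrix = gaussian_process_matrix(r_values, sigma=0.25)

certain = [0.0] * len(r_values)
certain[r_values.index(2.0)] = 1.0                       # all belief on a value of 2.0

once = diffuse(matrix, certain)
twice = diffuse(matrix, once)
spread_once, spread_twice = spread(r_values, once), spread(r_values, twice)
print(spread_once, spread_twice)
```

Each application widens the distribution further, which is why older posteriors fade gradually instead of being cut off at a fixed window.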

Function for calculating the posteriors

To calculate the posteriors, the following steps are used:
1. Calculate $\lambda$, the expected arrival rate for every day’s Poisson process.
2. Calculate each day’s likelihood distribution over all possible values of $R_t$.
3. Calculate the Gaussian process matrix based on the value of $\sigma$.
4. Calculate the initial prior, because the first day does not have a previous day from which to take the posterior. Based on the information from the CDC (Sanche et al., 2020), a Gamma distribution with a mean of 4 is chosen.
5. Loop from day 1 to the end, doing the following: (a) calculate the prior by applying the Gaussian to yesterday’s posterior; (b) apply Bayes’ rule by multiplying this prior and the likelihood calculated in step 2; (c) divide by the probability of the data (also Bayes’ rule).

Fig. 12 shows every day (row) of the posterior distribution plotted simultaneously.
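The steps above can be sketched as one function using only the standard library. This is a sketch under stated assumptions, not the authors' code: the name `get_posteriors`, the grid of candidate values, $\gamma = 1/7$, the Gamma prior with shape 4 and unit scale (which has mean 4), and the smoothed case counts are all illustrative.

```python
import math

GAMMA = 1 / 7                                   # assumed reciprocal of the serial interval
R_VALUES = [i / 20 for i in range(0, 241)]      # candidate R_t from 0 to 12


def get_posteriors(smoothed_cases, sigma, r_values=R_VALUES, prior_mean=4.0):
    n = len(r_values)
    # Steps 1-2: Poisson likelihood of each day's count for every candidate R_t.
    likelihoods = []
    for previous, current in zip(smoothed_cases[:-1], smoothed_cases[1:]):
        row = []
        for r_t in r_values:
            lam = previous * math.exp(GAMMA * (r_t - 1))
            row.append(math.exp(current * math.log(lam) - lam - math.lgamma(current + 1)))
        likelihoods.append(row)
    # Step 3: Gaussian transition matrix, each column normalized to sum to 1.
    raw = [[math.exp(-0.5 * ((ri - rj) / sigma) ** 2) for rj in r_values] for ri in r_values]
    col_totals = [sum(raw[i][j] for i in range(n)) for j in range(n)]
    matrix = [[raw[i][j] / col_totals[j] for j in range(n)] for i in range(n)]
    # Step 4: initial prior, a Gamma density (shape = prior_mean, scale = 1) on the grid.
    prior0 = [r ** (prior_mean - 1) * math.exp(-r) for r in r_values]
    total = sum(prior0)
    posteriors = [[p / total for p in prior0]]
    log_likelihood = 0.0
    # Step 5: iterate Bayes' rule day by day.
    for day_likelihood in likelihoods:
        prior = [sum(w * p for w, p in zip(row, posteriors[-1])) for row in matrix]
        numerator = [l * p for l, p in zip(day_likelihood, prior)]
        denominator = sum(numerator)             # P(k_t), the probability of the data
        posteriors.append([x / denominator for x in numerator])
        log_likelihood += math.log(denominator)
    return posteriors, log_likelihood


cases = [12.0, 15.0, 19.0, 24.0, 30.0, 36.0, 41.0, 45.0]   # made-up smoothed counts
posteriors, log_likelihood = get_posteriors(cases, sigma=0.25)
latest = posteriors[-1]
most_likely = R_VALUES[latest.index(max(latest))]
print(most_likely, log_likelihood)
```

The returned log-likelihood is the sum of the logs of the daily denominators, so the same function can be rerun with different values of the standard deviation and the results compared.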

Fig. 12

Time Domain with Credible Intervals.

The posteriors start without much confidence (wide) and become progressively more confident (narrower) about the true value of $R_t$. Since the results include uncertainty, it is better to view the most likely value of $R_t$ along with its highest-density interval, as shown in Fig. 13. Hence, the proposed algorithm produces the most likely value of $R_t$ over time for each locale.

Fig. 13

Indonesia real-time $R_t$.

Choosing the optimal value of $\sigma$

To choose an optimal $\sigma$, each state is evaluated with various values of $\sigma$. The optimum value of $\sigma$ maximizes the likelihood of the data $P(k)$. The value of $\sigma$ should be chosen carefully to avoid overfitting while maximizing $P(k)$ for each state. To do this, the log-likelihoods of all states are added up for each value of $\sigma$, and the value with the maximum total is chosen. After selecting the optimal $\sigma$, the precalculated posteriors corresponding to that value of $\sigma$ are collected for each state. The 90% and 50% highest-density intervals and the most likely value are also calculated. In the proposed algorithm, the last seven days of data are considered, rather than only the previous day's, to produce each state’s $R_t$ accurately. Fig. 14 shows Indonesia’s state-wise $R_t$, plotted along with the highest-density interval (HDI) bands. Fig. 15 shows the most recent values of $R_t$ in Indonesia.
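The selection step can be sketched as follows: for every candidate value of the standard deviation, the per-state log-likelihoods are summed, and the candidate with the largest total is kept. The region names and the numbers are made up for illustration and are not results from the paper. `choose_sigma` is a hypothetical helper.

```python
def choose_sigma(sigmas, log_likelihoods):
    """Pick the sigma whose log-likelihood, summed over all regions, is largest."""
    totals = [0.0] * len(sigmas)
    for per_sigma in log_likelihoods.values():
        for index, value in enumerate(per_sigma):
            totals[index] += value
    best_index = totals.index(max(totals))
    return sigmas[best_index], totals


sigmas = [0.05, 0.15, 0.25, 0.35]

# Made-up log-likelihoods: one list per region, one entry per candidate sigma.
log_likelihoods = {
    "Region A": [-310.2, -295.8, -297.1, -301.4],
    "Region B": [-188.9, -181.3, -179.6, -183.0],
    "Region C": [-402.5, -396.0, -398.8, -405.7],
}

best_sigma, totals = choose_sigma(sigmas, log_likelihoods)
print(best_sigma, [round(t, 1) for t in totals])
```

A single shared value is chosen even though one region on its own would prefer a different one (Region B here), which is one way to limit overfitting to any individual state.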

Fig. 14

State-wise Indonesia real-time $R_t$.

Fig. 15

Most recent $R_t$ in Indonesia.

States with high $R_t$ values in Fig. 15 should be aware that they may face an exponential increase in the number of cases. Fig. 15 clearly shows that, without social restrictions, cases may grow significantly. The Indonesian states that are almost certainly below the threshold of 1.0 are shown in Fig. 16a, and those where the epidemic is not under control are shown in Fig. 16b.

Fig. 16a

Likely under control states of Indonesia.

Fig. 16b

Not under control states of Indonesia.

Fig. 16a clearly shows that only one state in Indonesia, Jawa Tengah, is doing well at present: the high end of its HDI is less than 1. This means that even in the worst-case scenario the state likely has the epidemic under control. Fig. 16b clearly shows that Jawa Timur and Riau are not under control. This might be due to an early high rate of infection, but seeing larger states above 1.0 is worrying, especially because none of these states has hit the headlines as a trouble spot.
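The grouping into likely under control and not under control can be sketched from each region's latest highest-density interval. The rule that the upper bound must be below 1 follows the text. Treating a lower bound above 1 as not under control is an assumption, and the region names and intervals are made up.

```python
def classify_regions(hdi_by_region, threshold=1.0):
    """Group regions by where their latest HDI sits relative to the threshold."""
    under_control, not_under_control, uncertain = [], [], []
    for region, (low, high) in hdi_by_region.items():
        if high < threshold:
            under_control.append(region)        # even the worst case is below 1
        elif low > threshold:
            not_under_control.append(region)    # even the best case is above 1
        else:
            uncertain.append(region)            # interval straddles the threshold
    return under_control, not_under_control, uncertain


latest_hdi = {                  # made-up (low, high) bounds of the latest HDI
    "Region A": (0.62, 0.94),
    "Region B": (1.08, 1.41),
    "Region C": (0.85, 1.20),
}

under_control, not_under_control, uncertain = classify_regions(latest_hdi)
print(under_control, not_under_control, uncertain)
```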

Conclusion

Various likelihoods and posteriors are computed and analyzed to estimate the real-time COVID-19 spread rate and its impact. The analysis assumes that the epidemic is under control if new cases continue to decrease for at least 14 days. However, the analysis also indicates that more testing is needed so that the daily recorded cases better reflect the real-world transmission rate. The analysis clearly shows that people can drastically reduce the spread of COVID-19 by taking certain precautions. So, it is necessary to follow these precautions until the situation returns to normal. Hence, social distancing should be practiced and mass gatherings should be avoided to control the spread rate.

Discussion

The Indonesian government is having a hard time minimizing the COVID-19 spread rate. At first, it wanted to be careful in controlling the movements of the population. Over time, as cases increased, the only way to minimize the spread rate was to increase the number of COVID-19 isolation centers. This raises challenges from several factors: (1) readiness and capacity of isolation facilities; (2) availability of monitoring officers; (3) transportation facilities.

Recommendations

There are several suggestions based on this study. First, ensure that the designated reference isolation centers meet standards in terms of human resources, availability of necessary facilities, and reagents, so that they can operate immediately when appointed by the government. Second, close the international borders and closely monitor inter-state borders where positive cases are more numerous. Third, Indonesia covers a very broad geographical area, and the reference centers may not all be easily accessible from certain regions; for this reason, some special procedures need to be considered. Fourth, encourage and strengthen the ability of existing resources in the country to produce logistics and medical supplies independently. Lastly, speed up the vaccination and immunization programs.

CRediT authorship contribution statement

Sankaraiah Sreeramula: Conceptualization, Methodology, Software, Writing. Deny Rahardjo: Supervision, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

16 in total

1. The Norovirus Epidemiologic Triad: Predictors of Severe Outcomes in US Norovirus Outbreaks, 2009-2016.

Authors: Rachel M Burke; Minesh P Shah; Mary E Wikswo; Leslie Barclay; Anita Kambhampati; Zachary Marsh; Jennifer L Cannon; Umesh D Parashar; Jan Vinjé; Aron J Hall
Journal: J Infect Dis Date: 2019-04-16 Impact factor: 5.226

2. Consensus and conflict among ecological forecasts of Zika virus outbreaks in the United States.

Authors: Colin J Carlson; Eric Dougherty; Mike Boots; Wayne Getz; Sadie J Ryan
Journal: Sci Rep Date: 2018-03-21 Impact factor: 4.996

3. Real-time predictions of the 2018-2019 Ebola virus disease outbreak in the Democratic Republic of the Congo using Hawkes point process models.

Authors: J Daniel Kelly; Junhyung Park; Ryan J Harrigan; Nicole A Hoff; Sarita D Lee; Rae Wannier; Bernice Selo; Mathias Mossoko; Bathe Njoloko; Emile Okitolonda-Wemakoy; Placide Mbala-Kingebeni; George W Rutherford; Thomas B Smith; Steve Ahuka-Mundeke; Jean Jacques Muyembe-Tamfum; Anne W Rimoin; Frederic Paik Schoenberg
Journal: Epidemics Date: 2019-07-23 Impact factor: 4.396

4. High Contagiousness and Rapid Spread of Severe Acute Respiratory Syndrome Coronavirus 2.

Authors: Steven Sanche; Yen Ting Lin; Chonggang Xu; Ethan Romero-Severson; Nick Hengartner; Ruian Ke
Journal: Emerg Infect Dis Date: 2020-06-21 Impact factor: 6.883

5. Comparative evaluation of time series models for predicting influenza outbreaks: application of influenza-like illness data from sentinel sites of healthcare centers in Iran.

Authors: Leili Tapak; Omid Hamidi; Mohsen Fathian; Manoochehr Karami
Journal: BMC Res Notes Date: 2019-06-24

6. Inter-outbreak stability reflects the size of the susceptible pool and forecasts magnitudes of seasonal epidemics.

Authors: Martin Rypdal; George Sugihara
Journal: Nat Commun Date: 2019-05-30 Impact factor: 14.919

7. Real-Time Forecasting of Hand-Foot-and-Mouth Disease Outbreaks using the Integrating Compartment Model and Assimilation Filtering.

Authors: Zhicheng Zhan; Weihua Dong; Yongmei Lu; Peng Yang; Quanyi Wang; Peng Jia
Journal: Sci Rep Date: 2019-02-25 Impact factor: 4.379

8. Predicting the impacts of epidemic outbreaks on global supply chains: A simulation-based analysis on the coronavirus outbreak (COVID-19/SARS-CoV-2) case.

Authors: Dmitry Ivanov
Journal: Transp Res E Logist Transp Rev Date: 2020-03-24

9. Testing predictability of disease outbreaks with a simple model of pathogen biogeography.

Authors: Tad A Dallas; Colin J Carlson; Timothée Poisot
Journal: R Soc Open Sci Date: 2019-11-13 Impact factor: 2.963

10. COVID-19 and Italy: what next?

Authors: Andrea Remuzzi; Giuseppe Remuzzi
Journal: Lancet Date: 2020-03-13 Impact factor: 79.321