Estimation of the incubation period of COVID-19 using viral load data.

Keisuke Ejima¹, Kwang Su Kim², Christina Ludema³, Ana I Bento³, Shoya Iwanami², Yasuhisa Fujita², Hirofumi Ohashi⁴, Yoshiki Koizumi⁵, Koichi Watashi⁶, Kazuyuki Aihara⁷, Hiroshi Nishiura⁸, Shingo Iwami⁹.

Abstract

The incubation period, or the time from infection to symptom onset, of COVID-19 has usually been estimated by using data collected through interviews with cases and their contacts. However, this estimation is influenced by uncertainty in the cases' recall of exposure time. We propose a novel method that uses viral load data collected over time since hospitalization, hindcasting the timing of infection with a mathematical model for viral dynamics. As an example, we used reported data on viral load for 30 hospitalized patients from multiple countries (Singapore, China, Germany, and Korea) and estimated the incubation period. The median, 2.5, and 97.5 percentiles of the incubation period were 5.85 days (95 % CI: 5.05, 6.77), 2.65 days (2.04, 3.41), and 12.99 days (9.98, 16.79), respectively, which are comparable to the values estimated in previous studies. Using viral load to estimate the incubation period might be a useful approach, especially when it is impractical to directly observe the infection event.

Keywords: COVID-19; Incubation period; Infectious disease epidemiology; Mathematical model; SARS-CoV-2

Year: 2021 PMID: 33773195 PMCID: PMC7959696 DOI: 10.1016/j.epidem.2021.100454

Source DB: PubMed Journal: Epidemics ISSN: 1878-0067 Impact factor: 4.396

Introduction

The current COVID-19 outbreak is characterized by a longer incubation period (i.e., time from infection to symptom onset) than that of influenza and other acute respiratory viruses. This longer incubation period means that many of the strategies for disease control that rely on symptom-based surveillance (e.g., community fever monitoring or home observation of travelers for symptoms) will not effectively control the outbreak. For example, the wide geographic spread of SARS-CoV-2 could have been driven by this long incubation period, allowing cases to pass through border control measures such as temperature screening (Wells et al., 2020). Estimating the incubation period is challenging, because we rarely directly observe the time of infection or the time of symptom onset (examples to the contrary in HIV infection show the intense follow-up needed to observe these events (Robb et al., 2016; Rolland et al., 2020)). The first study estimating the incubation period of SARS-CoV-2 was that of Bi et al. (2020), who fit a log-normal model to a subset of cases for whom detailed information about exposure to another case was available. However, even with meticulous contact tracing, directly observing infector-infectee pairs is a time-consuming process, especially when the incubation period is lengthy. Measuring the incubation period through contact tracing is more difficult if the infector-infectee pair had a lot of contact with each other, leading to a number of suspected individuals needing to be interviewed. Indeed, Bi et al. demonstrated large uncertainty (the interval of exposure was more than 10 days for about 25 % of the cases) concerning the timing of infection for COVID-19 in China (Bi et al., 2020). 
Although most studies that estimated the incubation period of SARS-CoV-2 (Bi et al., 2020; Lauer et al., 2020; Linton et al., 2020) used a statistical modelling technique that accounts for uncertainty in both the reports of exposure time and the time of symptom onset (Reich et al., 2009), they inherently had to use a heuristic weight function for the censored information. Here we propose another approach to estimating the incubation period, in which we use longitudinal data on viral load and hindcast the point of initial infection. Viral load data were collected at the early stage of the epidemic for clinical purposes (e.g., understanding the etiology and the pathophysiology of COVID-19) and to ensure patients were no longer shedding virus (or more precisely, viral fragments) before hospital discharge. The data were analyzed using a mathematical model describing viral dynamics, which typically draws a bell-shaped curve (i.e., viral load first increases exponentially until the peak, after which it starts to decline). Although the data are available only after the onset of symptoms, the timing of infection can be estimated by hindcasting the model for each case.

Results

Viral load dynamics for SARS-CoV-2

We extracted viral load data for 30 hospitalized patients as reported in four papers (Table S1) and quantified the dynamics of SARS-CoV-2 infection with a previously proposed mathematical model (Ikeda et al., 2016; Perelson, 2002; Kim et al., 2020a):

$$\frac{df(t)}{dt} = -\beta f(t) V(t), \qquad \frac{dV(t)}{dt} = \gamma f(t) V(t) - \delta V(t),$$

where $f(t)$ and $V(t)$ are the relative fraction of uninfected target cells at time $t$ to those at time 0 and the amount of virus at time $t$, respectively. The parameters $\beta$, $\gamma$, and $\delta$ are the rate constant for virus infection, the maximum rate constant for viral replication, and the death rate of infected cells, respectively. The viral load data from the four different papers were fitted to the model with mixed effects, which assumed that the parameters for each individual follow normal distributions with the same population mean. Several models other than the one described above are available to explain the viral load trajectory of acute infection. However, we chose this model because it better explains the data. As an example, a model that includes an eclipse phase has been proposed for acute infection and has been applied to SARS-CoV-2 (Baccam et al., 2006; Gonçalves et al., 2020). We fitted the models with and without an eclipse phase and compared their goodness of fit (i.e., BIC). Although the BICs were comparable, we needed to fix the parameter value that determines the length of the eclipse phase. Thus, we decided to use the current model without an eclipse phase. The viral load dynamics for each case in Asia and Europe are shown in Fig. 1 and Fig. 2, respectively, and the estimated values of the parameters for each case are summarized in Table S2. The peak of viral load appeared 2–3 days after symptom onset. Note that in the data, there were no cases in which viral load was measured before symptom onset. Among the 30 cases, viral load was highest at the first measurement in 14 cases.
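The bell-shaped trajectory produced by this two-variable model can be illustrated with a short numerical sketch. The code below integrates the target-cell-limited equations in their standard form (which should be checked against the cited references) with a fixed-step fourth-order Runge-Kutta scheme. The function name and all parameter values are hypothetical and chosen only for illustration; they are not the estimates reported in Table S2.

```python
import math


def simulate_viral_load(beta, gamma, delta, v0, days, dt=0.01):
    """Integrate df/dt = -beta*f*V and dV/dt = gamma*f*V - delta*V with RK4.

    Returns a list of (time, f, V) tuples starting from f = 1 and V = v0.
    """
    def rhs(f, v):
        return -beta * f * v, gamma * f * v - delta * v

    f, v = 1.0, v0
    trajectory = [(0.0, f, v)]
    n_steps = int(round(days / dt))
    for step in range(1, n_steps + 1):
        k1f, k1v = rhs(f, v)
        k2f, k2v = rhs(f + 0.5 * dt * k1f, v + 0.5 * dt * k1v)
        k3f, k3v = rhs(f + 0.5 * dt * k2f, v + 0.5 * dt * k2v)
        k4f, k4v = rhs(f + dt * k3f, v + dt * k3v)
        f += dt * (k1f + 2 * k2f + 2 * k3f + k4f) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        trajectory.append((step * dt, f, v))
    return trajectory


# Illustrative (hypothetical) parameter values, not fitted estimates.
traj = simulate_viral_load(beta=1e-6, gamma=4.0, delta=1.0, v0=100.0, days=20.0)
peak_time, peak_f, peak_v = max(traj, key=lambda row: row[2])
print(f"peak at day {peak_time:.2f}, log10 viral load {math.log10(peak_v):.2f}")
```

In this form the viral load peaks when the target-cell fraction has fallen to delta/gamma, which is why the curve rises roughly exponentially and then declines.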

Fig. 1

Viral load dynamics for each case in Asia: Each colored symbol corresponds to the measured viral load (Singapore: pink, China: blue, Korea: yellow). The shadowed region corresponds to the estimated viral load from 100 sets of parameters resampled from conditional distributions; the solid line gives the best-fit curve. The time scale is days since the onset of symptoms (the black dotted vertical line is the day of symptom onset). The gray dashed horizontal line is the detection limit.

Fig. 2

Viral load dynamics for each case in Europe: Each colored symbol corresponds to the measured viral load (Germany: green). The shadowed region corresponds to the estimated viral load from 100 sets of parameters resampled from conditional distributions; the solid line gives the best-fit curve. The time scale is days since the onset of symptoms (the black dotted vertical line is the day of symptom onset). The gray dashed horizontal line is the detection limit.

Establishing a viral load threshold for infection

To assess the day on which SARS-CoV-2 infection was established, in other words, the start of the exponential growth phase of the viral load (Perelson, 2002), we needed to set the viral load threshold for this timing. The time of the infection event was identified by back-calculation as the time at which the modeled viral load reaches the threshold. We used the three cases reported from China (Patients D, H, and L) with known primary cases to determine the viral load threshold to establish infection (Zou et al., 2020). For these three cases, the day of exposure was assumed to be equal to the day of the infection event, as follows. A case (Patient E) from Wuhan visited Patient D and Patient L in Zhuhai on January 17. Patients D and L developed symptoms on January 23 and 20, respectively (thus their primary case is Patient E). Two cases (Patients I and P) from Wuhan visited their daughter, Patient H, in Zhuhai on January 11. Patient H developed fever on January 17 (thus her primary cases are Patients I and P). Using this contact information, we computed the viral load for the day on which infection was established by hindcasting the mathematical model with the estimated parameters, which we defined as the infection establishment threshold. The viral load threshold was , and (all of which correspond to 95 % confidence intervals) for Patients D, H, and L, respectively. The viral load threshold, or the dose of exposure, should be heterogeneous between patients; thus, we fitted a normal distribution to the log-transformed estimated viral load thresholds. Specifically, we randomly sampled 100 values from the estimated distribution of the viral load threshold for each of the three patients and then fitted a normal distribution. We used this distribution as the threshold for further analyses.
Because the viral load thresholds estimated for the three patients differed substantially, we performed the same analysis for each patient and used the thresholds to estimate the distribution of incubation periods as sensitivity analyses.
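The hindcasting step can be sketched as a backward integration of the viral dynamics model from the state at symptom onset until the viral load falls to the infection establishment threshold. The sketch below is a minimal illustration under stated assumptions: the equations are written in the standard target-cell-limited form, and the function names, step size, threshold, and parameter values are hypothetical rather than taken from the study.

```python
def rk4_step(f, v, dt, beta, gamma, delta):
    """Advance (f, V) by dt with one RK4 step; a negative dt steps backward in time."""
    def rhs(f_, v_):
        return -beta * f_ * v_, gamma * f_ * v_ - delta * v_

    k1f, k1v = rhs(f, v)
    k2f, k2v = rhs(f + 0.5 * dt * k1f, v + 0.5 * dt * k1v)
    k3f, k3v = rhs(f + 0.5 * dt * k2f, v + 0.5 * dt * k2v)
    k4f, k4v = rhs(f + dt * k3f, v + dt * k3v)
    return (f + dt * (k1f + 2 * k2f + 2 * k3f + k4f) / 6.0,
            v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0)


def hindcast_incubation_period(f_onset, v_onset, threshold,
                               beta, gamma, delta, dt=0.001, max_days=30.0):
    """Days before symptom onset at which the modeled viral load equals `threshold`."""
    if v_onset <= threshold:
        return 0.0
    f, v, elapsed = f_onset, v_onset, 0.0
    while elapsed < max_days:
        f_prev, v_prev = rk4_step(f, v, -dt, beta, gamma, delta)
        if v_prev <= threshold:
            # Linear interpolation inside the final step.
            return elapsed + dt * (v - threshold) / (v - v_prev)
        f, v = f_prev, v_prev
        elapsed += dt
    raise ValueError("threshold not reached within max_days")


# Check on synthetic data: start at the threshold, run forward 3 days, then hindcast.
beta, gamma, delta, threshold = 1e-6, 4.0, 1.0, 100.0  # hypothetical values
f, v = 1.0, threshold
for _ in range(3000):
    f, v = rk4_step(f, v, 0.001, beta, gamma, delta)
estimate = hindcast_incubation_period(f, v, threshold, beta, gamma, delta)
print(f"recovered incubation period: {estimate:.3f} days")
```

Repeating this calculation for each resampled parameter set and threshold value would give a distribution of infection times per patient rather than a single point estimate.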

Incubation period of COVID-19

With the viral load threshold, we computed the incubation period for all patients by hindcasting the mathematical model after fitting the model to the data. To address the uncertainty of the estimation, we resampled 100 parameter sets for each individual, including the viral load threshold, and obtained the corresponding 100 incubation-period estimates for each individual (i.e., 100 × 30 in total) (see “The nonlinear mixed effect model” for the details of computation). Then, three parametric distributions were fitted to the 100 × 30 estimates: Weibull, gamma, and log-normal distributions. The Akaike information criterion (AIC) was compared across the three distributions, and the best model (i.e., that with the lowest AIC) was used for further analyses. The parametric bootstrap method was used to assess parameter uncertainty. Specifically, each bootstrap sample was composed of 30 incubation-period estimates: a single estimate was resampled from the 100 estimates of each individual. The best parametric model (i.e., Weibull, gamma, or log-normal distribution) was fitted to the bootstrapped data for parameter inference. We repeated this process 1000 times and obtained 1000 parameter sets, and the median, 2.5, and 97.5 percentiles of the distribution were computed. As a sensitivity analysis, the above process was repeated with the data from Europe (Germany) and Asia (China, Singapore, and Korea) separately. Fig. 3 shows the 95 % CI of the empirical distribution of the timing of infection for each case estimated from the viral load data using the virus dynamics model. The AICs of the three models (log-normal, gamma, and Weibull distributions) were 13812.7, 13930.3, and 14390.1, respectively. Thus, the log-normal distribution was preferred and was used for further analysis. Fig. 4A and D summarize the estimated cumulative distribution function and probability density function of the incubation period for COVID-19 (log-normal distribution), respectively.
The median, 2.5, and 97.5 percentiles of the incubation period were 5.85 days (95 % CI: 5.05, 6.77), 2.65 days (2.04, 3.41), and 12.99 days (9.98, 16.79), respectively. For a sensitivity analysis, distributions of the incubation period were computed for Asia (Singapore, China, and Korea) and Europe (Germany) separately (Fig. 4B, C, E, and F). The median, 2.5, and 97.5 percentiles of the incubation period were 5.77 days (95 % CI: 4.81, 6.61), 2.69 days (1.84, 3.49), and 12.26 days (9.30, 16.67), respectively, for Asian countries. The median, 2.5, and 97.5 percentiles of the incubation period were 6.01 days (95 % CI: 4.93, 7.37), 3.01 days (1.76, 4.13), and 12.18 days (7.98, 16.91), respectively, for Europe. We did not observe a large difference in the incubation period between the data from Asia and Europe. Furthermore, we used the threshold estimated from each patient to estimate the distribution of the incubation period. The median, 2.5, and 97.5 percentiles of the incubation period were 5.81 days (95 % CI: 4.97, 6.69), 2.53 days (1.96, 3.25), and 13.23 days (10.46, 17.23) using the threshold of Patient D; 7.17 days (95 % CI: 6.37, 8.02), 3.85 days (3.21, 4.65), and 13.35 days (10.74, 16.55) using the threshold of Patient H; and 3.89 days (95 % CI: 3.45, 4.37), 2.04 days (1.72, 2.48), and 7.29 days (5.81, 9.02) using the threshold of Patient L. As expected, a lower threshold yielded a longer incubation period.
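The resampling scheme described above (one incubation-period estimate drawn per patient, a log-normal fit, repeated many times) can be sketched with the Python standard library. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the input data are synthetic, and only the log-normal AIC is computed because the Weibull and gamma fits require numerical maximization that is omitted here.

```python
import math
import random
import statistics
from statistics import NormalDist


def fit_lognormal(samples):
    """Maximum-likelihood log-normal fit: mean and population SD of the log values."""
    logs = [math.log(x) for x in samples]
    mu = statistics.fmean(logs)
    return mu, statistics.pstdev(logs, mu)


def lognormal_quantile(mu, sigma, p):
    """Quantile of the log-normal distribution at probability p."""
    return math.exp(mu + sigma * NormalDist().inv_cdf(p))


def lognormal_aic(samples, mu, sigma):
    """AIC = 2k - 2 log L with k = 2 parameters."""
    loglik = sum(-math.log(x) - math.log(sigma) - 0.5 * math.log(2 * math.pi)
                 - (math.log(x) - mu) ** 2 / (2 * sigma ** 2) for x in samples)
    return 4.0 - 2.0 * loglik


def bootstrap_summary(per_patient, probs=(0.025, 0.5, 0.975), n_boot=1000, seed=0):
    """For each probability, return (median, 2.5th, 97.5th percentile) over bootstrap fits."""
    rng = random.Random(seed)
    draws = {p: [] for p in probs}
    for _ in range(n_boot):
        resample = [rng.choice(estimates) for estimates in per_patient]
        mu, sigma = fit_lognormal(resample)
        for p in probs:
            draws[p].append(lognormal_quantile(mu, sigma, p))
    summary = {}
    for p in probs:
        cuts = statistics.quantiles(draws[p], n=40, method="inclusive")  # 2.5 % steps
        summary[p] = (statistics.median(draws[p]), cuts[0], cuts[-1])
    return summary


# Synthetic stand-in for the 30 x 100 incubation-period estimates (hypothetical).
rng = random.Random(1)
data = [[rng.lognormvariate(math.log(5.85), 0.4) for _ in range(100)] for _ in range(30)]
pooled = [x for patient in data for x in patient]
print("log-normal AIC on pooled estimates:", round(lognormal_aic(pooled, *fit_lognormal(pooled)), 1))
for p, (med, lo, hi) in bootstrap_summary(data).items():
    print(f"{100 * p:.1f}th percentile: {med:.2f} days (95 % CI: {lo:.2f}, {hi:.2f})")
```

Drawing one estimate per patient in each replicate keeps every bootstrap sample at the size of the patient cohort, so the resulting intervals reflect both between-patient variation and the uncertainty in each patient's hindcast.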

Fig. 3

The estimated day on which infection was established for each case, with days from symptom onset as the time scale: The dots and the bars are the median, 2.5, and 97.5 percentiles of the empirical distribution of the estimated day on which infection was established for each case. The case IDs on the right correspond to those in the original papers.

Fig. 4

The estimated incubation period: (A, B, C) The cumulative distribution function for total, Asian, and European cases, respectively. We used the log-normal distribution for fitting. The gray lines were drawn based on the 1000 different bootstrap samples. The horizontal bars are 95 % CIs at 2.5 %, 50 %, and 97.5 % of the distribution. The solid red curve corresponds to the median of the estimated distribution. (D, E, F) The probability density function for total, Asian, and European cases, respectively.

Discussion

Inferring the timing of infection is challenging in general. Given asymptomatic and presymptomatic transmission and the relatively long incubation period of SARS-CoV-2, not all patients are aware of how they were exposed or the specific time of exposure. The median incubation period of SARS-CoV-2 is estimated to be 5–6 days (Bi et al., 2020; Lauer et al., 2020; Linton et al., 2020; Backer et al., 2020), whereas those for other acute respiratory viral infections, such as SARS-CoV-1, non-SARS human coronaviruses, influenza A virus, and influenza B virus, are estimated to be 4.0, 3.2, 1.4, and 0.6 days, respectively (Lessler et al., 2009). The proportion of SARS-CoV-2 infection that is asymptomatic ranges from 40 % to 45 % (Prevalence of Asymptomatic SARS-CoV-2 Infection), which is close to that for influenza (50 %) (Weinstein et al., 2003). By contrast, asymptomatic cases are rarely observed for SARS-CoV-1 (Lee et al., 2003). Thus, we proposed using viral load data, which are externally measured and are independent of recall. The median of the estimated incubation period was about 6 days, and 97.5 % of cases developed symptoms within about 13 days. These estimates are consistent with previously published estimates (Bi et al., 2020; Lauer et al., 2020; Linton et al., 2020; Backer et al., 2020). Mass vaccination campaigns for SARS-CoV-2 have been proceeding with unprecedented speed; however, the risk of resurgence will not be negligible (influenza outbreaks happen even though effective vaccines are available). In addition to the vaccine, contact tracing is important to reduce the risk of resurgence, and being able to make valid estimates of the incubation period helps to reduce the burden in the contact tracing process. Indeed, contact tracing helped to further identify and treat cases earlier than a symptom-based approach (Bi et al., 2020).
Furthermore, when we know the incubation period distribution, we can better assess the role of presymptomatic infection in the outbreak. Combined with the serial-interval distribution, the incubation-period distribution has been used to quantify the proportion of presymptomatic infection (He et al., 2020). The strength of this approach is that it can complement the limitations of the classic interview-based approach regarding ascertaining the exposure event. Our proposed approach may be applicable not only to human infectious disease and zoonoses such as influenza and COVID-19, but also to animal/livestock infectious diseases such as foot-and-mouth disease when contact recall is not possible. Furthermore, replicating viral load from infection to recovery is helpful not only for estimating the incubation period but also for clinical and epidemiologic understanding of the disease. For example, we observed that the viral load of SARS-CoV-2 peaked 2–3 days after the onset of symptoms, which is consistent with the finding that the viral load in throat swabs was on the decline when first measured (2–4 days since symptom onset) (18, 19). There are several limitations to be noted in this study. One is related to the modelling approach. Our approach did not account for any uncertainty in reported day of symptom onset because the data did not include the range of exposed days. The approach accounting for uncertainty was previously proposed by Reich et al. (Reich et al., 2009). Combining our approach with that of Reich et al. might reduce uncertainty surrounding the precise reporting of exposure and illness onset once such data are available, which is doable because estimation of the timing of infection and estimation of the incubation-period distribution are independent. The model we used in this study did not include detailed immune response or antiviral effects given limited information. We can update the model once relevant data are available. 
Another limitation is relevant to the data we used. The proposed approach requires collection of viral loads over time since symptom onset, which might not be feasible for all patients or in resource-limited contexts. A few studies have investigated change in viral load over time (Benefield et al., 2020; Kucirka et al., 2020). However, those studies included viral load data with observation at a single time point because they did not consider individual variability. Further, one paper assumed that the incubation period was 5 days when symptom onset information was not available (Benefield et al., 2020). We believe such an approach is unreasonable because 1) the day of exposure is extremely hard to observe (and therefore we are proposing to use longitudinal viral load data), and 2) the incubation period varies between patients. We admit the inclusion criterion for data in our study (more than three data points from each patient) is a limitation of our study; however, we do not think that adding nonlongitudinal data or data without information on symptom onset would be an option. We used data from hospitalized and symptomatic patients. If viral dynamics and the incubation period differ in unhospitalized patients, the estimated incubation-period distribution should represent that for hospitalized patients only. Indeed, we are planning to collect saliva samples from mildly symptomatic to asymptomatic patients (https://rctportal.niph.go.jp/en/detail?trial_id=jRCT2071200023). We used the viral load data collected from upper respiratory specimens (i.e., nasopharyngeal, oropharyngeal, nasal swabs), because viral dynamics differs between organs, as evidenced in multiple studies (e.g., rectal swab vs. nasal swab) (Xu et al., 2020; Young et al., 2020). However, viral dynamics might also differ between nasal, nasopharyngeal, and oropharyngeal swabs, even though they are close. 
Similarly, sex, age, and other factors might influence viral dynamics; however, such information was not consistently available from all patients. We treated the different types of swabs as a covariate in the model, but the computation did not converge because of the small sample sizes. However, because we used a mixed-effect model, the random effect in every parameter (on each patient) should have considered the difference in viral dynamics due to the sample type and demographic differences to some extent. Being able to make valid estimates of the incubation period distribution is essential for mitigating risk of disease spread. Knowing the estimated incubation period distribution simplifies the process of contact tracing and improves our understanding of the role of presymptomatic infection. By unifying the proposed approach with existing epidemiologic methods, we can achieve precise estimation of the incubation-period distribution.

Materials and methods

Data

The viral load data were from 30 hospitalized patients presented in four previously published studies of hospitalized COVID-19 patients (Zou et al., 2020; Wölfel et al., 2020; Young et al., 2020; Kim et al., 2020b). All cases used in our analysis presented with symptoms before or after hospitalization. For consistency, the viral load data from upper respiratory specimens were used in the analysis. Patients treated with antivirals or with fewer than two data points were excluded. For all the studies from which we extracted data, ethics approval was obtained from the ethics committee at each institute. Written informed consent was obtained from the patients or their next of kin in the original studies. We summarized the data in Table S1 and described the details in the Supplementary Material.

The nonlinear mixed effect model

MONOLIX 2019R2 (www.lixoft.com) was used to fit the nonlinear mixed-effects model to the viral load data. The nonlinear mixed-effects model incorporates fixed effects and random effects accounting for interpatient variability in viral dynamics. Specifically, the parameter for individual $k$, $\theta_k$, is represented by the fixed effect, $\theta$, and the random effect, $\pi_k$, which follows the Gaussian distribution with mean 0 and standard deviation $\Omega$. The fixed effect (population parameter) and random effect were estimated by using the stochastic approximation EM (expectation-maximization) algorithm and empirical Bayes’ method, respectively. Using estimated parameters and a Markov Chain Monte Carlo algorithm, we obtained the conditional distribution of parameters for each patient. A total of 100 parameter sets for each patient were resampled from the conditional distribution and used to estimate the incubation period distribution.
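How a fixed effect and a patient-level random effect combine into individual parameters can be illustrated with a small sampling sketch. The text does not state whether the random effect enters additively or multiplicatively; the code below assumes a multiplicative (log-normal) form, which keeps rate constants positive. The function name, the parameter names, and all numerical values are hypothetical, and the sketch does not reproduce the SAEM estimation performed by MONOLIX.

```python
import math
import random


def sample_individual_parameters(fixed_effects, omegas, n_individuals, seed=0):
    """Draw per-individual parameters as fixed_effect * exp(random_effect).

    Assumption: each random effect is Gaussian with mean 0 and the standard
    deviation given in `omegas`, applied on the log scale.
    """
    rng = random.Random(seed)
    individuals = []
    for _ in range(n_individuals):
        params = {name: theta * math.exp(rng.gauss(0.0, omegas[name]))
                  for name, theta in fixed_effects.items()}
        individuals.append(params)
    return individuals


# Hypothetical population values and interpatient variability.
fixed = {"beta": 1e-6, "gamma": 4.0, "delta": 1.0}
omegas = {"beta": 0.5, "gamma": 0.3, "delta": 0.3}
for patient in sample_individual_parameters(fixed, omegas, n_individuals=3):
    print({name: f"{value:.3g}" for name, value in patient.items()})
```

Setting every standard deviation to zero returns the population values for all individuals, which is a quick way to check that the variability comes only from the random effects.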

Author contributions

Conceived and designed the study: KE HN SI. Analyzed the data: KE KSK SI. Wrote the paper: KE KSK CL AIB SI YF YI HO YL KW KA HS SI. All authors read and approved the final manuscript.

Declaration of Competing Interest

The authors declare that they have no competing interests.
