
Modelling SARS-CoV-2 unreported cases in Italy: Analysis of serological survey and vaccination scenarios.

Marco Claudio Traini¹, Carla Caponi², Riccardo Ferrari³, Giuseppe Vittorio De Socio⁴.

Abstract

OBJECTIVES: The aim of the present paper is the study of the large unreported component characterizing the SARS-CoV-2 epidemic event in Italy, taking advantage of the Istat survey. Particular attention is devoted to the sensitivity and specificity of the serological test and their effects.
METHODS: The model satisfactorily reproduces the data of the Italian survey, showing relevant predictive power and relegating to a secondary position models which do not include, in the simulation, the presence of asymptomatic groups. The corrections due to the serological test sensitivity (in particular those depending on the symptom onset) are crucial for a realistic analysis of the unreported (and asymptomatic) components.
RESULTS: The relevant presence of an unreported component during the second pandemic wave in Italy is confirmed, and the ratio of reported to unreported cases is predicted to be roughly 1:4 in the last months of 2020. A method to correct the serological data on the basis of the antibody sensitivity is suggested and systematically applied. The asymptomatic component is also studied in some detail and its size quantified. A model analysis of the vaccination scenarios is performed, confirming the relevance of a massive campaign (at least 80000 immunized per day) during the first six months of 2021, to obtain important immunization effects by August/September 2021.


Keywords: SARS-CoV-2; Asymptomatic and unreported cases; Epidemiological models; Sensitivity and specificity of the serological tests; Seroprevalence; Vaccination

Year: 2021 PMID: 34278058 PMCID: PMC8276585 DOI: 10.1016/j.idm.2021.06.002

Source DB: PubMed Journal: Infect Dis Model ISSN: 2468-0427

Introduction

A characteristic feature of the present coronavirus pandemic is that many of the persons who contract the disease are asymptomatic (never symptomatic or mildly symptomatic) and largely responsible for the transmission of the infection (Johansson et al., 2021). In fact, the spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) throughout the world has been extremely rapid, suggesting the hypothesis of a crucial rôle played by those infected persons who remain asymptomatic even if contagious (Buitrago-Garcia et al., 2020). A recent paper by Oran and Topol (2020) summarizes the available evidence on asymptomatic SARS-CoV-2 infection, concluding that “asymptomatic persons seem to account for approximately 40% to 45% of the SARS-CoV-2 infections, and that they can transmit the virus to others for an extended period of time, perhaps longer than 14 days”. Oran and Topol (2020) conclude with the need “that testing programs include those without symptoms”. Other works estimate the fraction of asymptomatic patients to be more than 50% (Mizumoto et al., 2020) or as high as 75% (Day, 2020). Asymptomatic patients therefore remain “hidden” and cannot be identified except through tests of large portions of the population. An example is given in the study by Lavezzo et al. (2020), reporting the results of a detailed survey in the small town of Vo’ (near Padua, Italy), where on February 21st a lockdown was imposed on the whole municipality as the site of a first outbreak in Italy. After the lockdown they found a prevalence of 1.2% (95% CI: 0.8–1.8%). Notably, 42.5% (95% CI: 31.5–54.6%) of the confirmed SARS-CoV-2 infections detected across two surveys were asymptomatic (i.e. they did not manifest symptoms at the time of swab testing and did not develop symptoms afterwards). Since then, a number of studies and projects have been developed to identify the rôle of asymptomatic individuals in the pandemic infection in various countries (e.g. Buitrago-Garcia et al., 2020; Garcia-Basteiro, Moncunill, Tortajada et al., 2020; Guerriero et al., 2020; Peterson & Phillips, 2020; Pollán et al., 2020; Salje, Kiem, Lefrancq et al., 2020; Snoeck et al., 2020; Well et al., 2020; Yanes-Lane et al., 2020).
In August 2020, preliminary results of a seroprevalence survey were presented for a set of 64660 persons in Italy (Istat, 2020). The survey of the Italian National Institute of Statistics (Istat) aimed to define (within the entire population) the portion of persons that developed an antibody response against SARS-CoV-2. The survey adopted a methodology allowing the evaluation of the seroprevalence in the population, also estimating the fraction of asymptomatic infections and the differences by age group, sex, location, etc. The results involve a restricted number of tests, namely those whose results were reported before July 27th. The post-stratification techniques adopted (Little, 1993) yield statistical estimates coherent with epidemic data (both at the international and local level). The main results of the Istat survey, extended to the entire Italian population, are summarized in Table 1. The analysis concludes that 1482377 individuals developed IgG antibodies against SARS-CoV-2. This corresponds to a seroprevalence of 2.5% (95% CI: 2.3–2.6%), to be compared with 244708 officially reported cases.

Table 1

Region	IgG positive, % (95% CI)		Absolute value	Reported cases
	lower limit	upper limit		
Italy	2.3	2.6	1482377	244708

Summary of the main results of the seroprevalence analysis in Italy (Istat, 2020). The number of IgG-positive tests corresponds to an infected population of (almost) 1.5 million individuals in Italy, while the officially reported cases were 244708 up to July 27th, 2020.

Asymptomatic patients (A) differ from exposed patients (E) in an important aspect. Unlike in traditional epidemiological models, contact between a person in the (A) group and another in the Susceptible (S) group does lead to the latter getting infected, with a certain probability. In addition, as in other models, contact between a person in the Infected (I) group and another in the (S) group also leads to the latter getting infected, with a similar probability. The seminal paper by Robinson and Stilianakis (2013) formulates and investigates a preliminary model that, in addition to the exposed class, captures the asymptomatic phenomenon; it could be called SEAIR, since the corresponding model including the exposed (E) group only is usually called SEIR. The two models differ in many aspects from a mathematical point of view; see Ansumali et al. (2020) for a recent analysis and detailed discussion. Moreover, it is possible to develop more refined models of the pandemic by introducing additional categories such as Quarantined, Healed, Ailing, Recognized (or Reported), Threatened, etc. (e.g. Park et al., 2020; Giordano et al., 2020). By introducing more categories, one gets a more realistic model of the disease progression. On the other hand, the number of parameters to be estimated increases drastically. The ideal trade-off between these two conflicting considerations remains open and has to be investigated on a case-by-case basis (Remuzzi and Remuzzi, 2020; Jia et al., 2020; Kinoshita et al., 2020; Tang, Wang, et al., 2020; Tang, Bragazzi, et al., 2020; Yu, 2020).
The aim of the present paper is the study of the large unreported component characterizing the SARS-CoV-2 epidemic event in Italy, taking advantage of the Istat survey (Istat, 2020). The model used assumes eight compartments (groups) including symptomatic reported and unreported, quarantined, asymptomatic, threatened (hospitalized) and recovered, a model recently proposed to discuss the role of measures against the Italian outbreak (Traini et al., 2020, 2021a); a Markov-Chain-Monte-Carlo method (Hogg and Foreman-Mackey, 2018; Brooks et al., 2011) is used to fix the parameters. The model was first formulated for the Wuhan outbreak (Tang, Wang, et al., 2020; Tang, Bragazzi, et al., 2020). The uncertainties introduced by the serological test sensitivity and its time dependence produce large effects in the identification of the antibody presence and are discussed in detail in Section 3.1 and Section 3.2. Section 4 is devoted to the study of vaccination scenarios and the rôle of unreported cases. Section 5 is devoted to some concluding remarks.
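
To make the SEAIR idea described above concrete, the following is a minimal sketch of a five-compartment model in which both asymptomatic (A) and symptomatic (I) individuals can infect susceptibles (S). It is not the eight-compartment model used in the paper: the equations, the explicit Euler integration, the function names and all parameter values (`beta`, `sigma`, `p_sym`, `gamma_a`, `gamma_i`) are illustrative assumptions.

```python
def seair_step(state, beta, sigma, p_sym, gamma_a, gamma_i, dt):
    """Advance (S, E, A, I, R) by one explicit Euler step of length dt (days)."""
    s, e, a, i, r = state
    n = s + e + a + i + r
    new_exposed = beta * s * (a + i) / n      # both A and I transmit to S
    leaving_e = sigma * e                     # end of the latent period
    ds = -new_exposed
    de = new_exposed - leaving_e
    da = (1.0 - p_sym) * leaving_e - gamma_a * a   # asymptomatic branch
    di = p_sym * leaving_e - gamma_i * i           # symptomatic branch
    dr = gamma_a * a + gamma_i * i
    return (s + dt * ds, e + dt * de, a + dt * da, i + dt * di, r + dt * dr)


def simulate(state, days, dt=0.1, **params):
    """Return the list of states visited over `days` days."""
    trajectory = [state]
    for _ in range(int(round(days / dt))):
        state = seair_step(state, dt=dt, **params)
        trajectory.append(state)
    return trajectory


# Hypothetical parameters, chosen only so that the sketch runs.
params = dict(beta=0.4, sigma=1 / 5, p_sym=0.5, gamma_a=1 / 7, gamma_i=1 / 7)
initial = (999_990.0, 0.0, 0.0, 10.0, 0.0)   # S, E, A, I, R
trajectory = simulate(initial, days=60, **params)
final = trajectory[-1]
print("S, E, A, I, R after 60 days:", [round(x) for x in final])
```

Dropping the A compartment (setting `p_sym=1.0`) reduces the sketch to a plain SEIR model, which is the comparison the text refers to.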

Methods

Recent model predictions and data comparison

The present Section is devoted to the validation of the model (and its predictions) against the most recent data on the second epidemic wave in Italy (after September 25th, 2020), in particular the predictions for the unreported components. The parametrization is based on the values indicated in Table II of Traini et al. (2021a); no ad hoc modifications are introduced. The model, formulated in April 2020 (Traini et al., 2020), is simply normalized to the number of reported cases on September 25th (47718), as indicated on the official site of the Ministry of Health (Ministero della Salute, 2020). September 25th, 2020 is assumed as the beginning of the second wave of infection. The results, extended until February 2021, are summarized in Fig. 1.

Fig. 1

The model predictions for the daily reported cases in Italy, September 25th, 2020–February 2021 (upper panel), and for the daily variations of the reported cases in the same period (lower panel), are compared with the official data from the Ministry of Health (Ministero della Salute, 2020) (last updated February 15th, 2021).

The social measures adopted in Italy during the second wave period are rather different from the strict lockdown of the first events; in particular, the measures imposed are differentiated and follow the local virulence of the infection. The new approach of the governmental institutions assumes a strategically flexible response to the virus, allowing for a possible “coexistence”. The distributions are therefore more stable and without rapid variations. The comparison with the daily reported cases of the upper panel confirms the model description also in the second wave period, even if with larger uncertainties and some oscillations. The data are presented in different colors: green (until November 13th), red until the end of 2020, and yellow in the year 2021. The simple reason is that the model was applied for the first time to the data of the second wave on November 13th. After that date, the additional official data have been added to the Figure, day by day, and no modification or renormalization has been introduced, i.e. the model predictions remained unchanged. The same color notation is used in the lower panel, where the data for the daily variations of the reported cases exhibit a broader distribution in comparison with the model predictions. Despite such differences, the model remains rather consistent also with those data. As a matter of fact, daily variations represent an even more stringent test since they are related to the derivative of the distribution shown in the upper panel. In particular, one has to note that they can become negative (in the region where the reported cases decrease, cfr. upper panel) and are subject to large fluctuations due to data collection uncertainties and local inhomogeneities.

Modelling a seroprevalence survey

Epidemiological surveillance of COVID-19 cases captures only a portion of all infections because the clinical manifestation of infections with SARS-CoV-2 ranges from severe disease, which can lead to death, to asymptomatic infection. The official sources of data in Italy (Ministero della Salute, 2020) do not provide information about asymptomatic patients, thus limiting their usefulness for calculating parameters of interest, not least the lethality rate. An attempt to exploit the existing available data in order to estimate the prevalence and the lethality of the virus in the total Italian population has been proposed by Bassi, Arbia and Falorsi (Bassi et al., 2020). They used a post-stratification of the official data in order to derive the weights necessary for re-weighting the sample results. The re-weighting procedure artificially modifies the sample composition so as to obtain a distribution which is more similar to that of the population. They obtain a prevalence of 9%. Conversely, a (population-based) sero-epidemiological survey can quantify the portion of the population which developed antibodies against SARS-CoV-2, providing information on the exposed individuals and on the remaining susceptible persons (assuming that antibodies are a marker of total or partial immunity). A certain number of surveys of SARS-CoV-2 have been realized or planned (in addition to the already quoted references, one could mention Sood et al., 2020; Valenti et al., 2020; Stringhini et al., 2020; Bryan et al., 2020; Shakiba Nazari et al., 2020; Doi et al., 2020; Erikstrup et al., 2020; Wu et al., 2020; Bobrovitz et al., 2020). The data, elaborated by Istat in their report, were collected over a period of (almost) two months (May 25th to July 15th, 2020).
They are summarized in the upper and lower limits and the absolute value of Table 1 or (approximately) in the following ratio:

$$\frac{\text{IgG positive (absolute value)}}{\text{reported cases}} = \frac{1482377}{244708} \approx 6 \qquad (1)$$

The ratio (1) is a clear sign of the large impact of the asymptomatic prevalence on the total infected population with positive IgG in Italy, even if no time dependence is assumed for those data, an assumption that needs some more attention. In fact, the model description we are proposing can be of some help in answering questions on the time dependence of the data. A roughly constant behavior, as a function of time, for the ratio (1) can be presumed under the following assumptions: i) the antibodies have a lifetime significantly longer than the time needed for collecting the data; ii) the social contact rate regime is rather stable during the collection of data, so that the fraction of infected population is (on average) connected with the transmission probability of the infection, which is assumed to be constant (in the present model β = (1.5851 ± 0.0336) ⋅ 10⁻⁸ day⁻¹, cfr. Table II of Traini et al., 2021a); iii) in a (large and representative) sample, the portion of individuals reached by the infection in the previous months is fixed (the following discussion will be largely devoted to investigating this point). Assumption i) sounds reasonable, and assumption ii) can be explicitly checked in our model by defining the following (time dependent) ratio:

$$\mathrm{ratio}(t) = \frac{H(t) + I(t) + A(t)}{H(t)} \qquad (2)$$

The numerator in Eq. (2) counts, for each day, the total number of infected individuals (reported (H), unreported with symptoms (I) and unreported asymptomatic (A)), and the denominator counts the number of reported cases (H). Both numerator and denominator are strongly time dependent functions, as already discussed. A clear example is given in Fig. 2, where the reported (H(t)) and unreported cases (A(t) + I(t)) are shown as a function of time in the period of the second wave in Italy (September 2020–December 2020).
The ratio of reported to unreported cases appears to be roughly 1:4, and one needs to disentangle the number of asymptomatic cases responsible for the ratio. To this end we will make use of a combination of model predictions and results from the serological survey obtained in Italy in the period of the first wave (Istat, 2020).

Fig. 2

The analysis of the infected individuals (reported (H(t)) and unreported (A(t) + I(t)) cases) in Italy as a function of time in the period September 25th - December 31st. The data for the reported-infected individuals are from (Ministero della Salute, 2020) and are compared with our model predictions (magenta curves), which include statistical uncertainties. The reported cases are compared with the predictions for the class of infected non-reported cases (asymptomatic and symptomatic) in the same period of time (cyan curves).

The model predictions are shown in Fig. 3, where the results for the unreported cases (cfr. Fig. 2) are separated into the two components: unreported-asymptomatic (A(t)) and unreported-symptomatic (I(t)). In the time interval of interest the model predicts (as average values) the ratio between the asymptomatic and symptomatic unreported cases (Eq. (3)), a crucial piece of information for evaluating in a quantitative way the number of asymptomatic cases. The relation between unreported and asymptomatic distributions will be further discussed in Section 3.3 after a deeper analysis of the survey results.

Fig. 3

Model analysis of the time-dependent distribution of the unreported cases of Fig. 2. The (unreported) asymptomatic component (A(t)) is largely dominant with respect to the (unreported) symptomatic contribution (I(t)) for the whole period of the second wave.

For the moment let us come back to the model ratio (2), as shown in Fig. 4. Despite the rapid decrease of the ratio before May 25th, its value remains, after that date and until the end of July, approximately constant (ratio(t) ≈ 5). In fact, the contact rate between February 24th and May 25th had large variations due to the lockdown and social restriction measures adopted in Italy, while in the period May 25th - July 31st the social situation was rather stable. The model captures such variations, validating assumption ii) for that period.

Fig. 4

The model ratio of Eq. (2) as a function of time.

However, the ratio (2) cannot be compared directly with the results of the Istat survey. As a matter of fact, the survey measured the number of individuals who developed antibodies without selecting the period of infection. The serological test cannot answer time-dependent questions: it only measures the presence of antibodies, which simply counts the total number of infections in the period preceding the test. In order to reproduce, as closely as possible, the analysis from Istat, we can calculate the cumulative values of the numerator and denominator of Eq. (2) by summing the components up to a given day (t). The numerator of Eq. (1) (i.e. the absolute value) corresponds to the sum of the Asymptomatic population (A), the (unreported) Infected (I) and the official reported cases (H); the denominator corresponds to the cumulative value of the official reported cases. One gets:

$$\mathrm{ratio}_{\mathrm{cum}}(t) = \frac{\sum_{t' \le t} \left[ A(t') + I(t') + H(t') \right]}{\sum_{t' \le t} H(t')} \qquad (4)$$

The cyan curves in Fig. 5 summarize the model predictions for the ratio (4), with the black curve representing the mean value and the dotted lines the normally distributed variations.
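
The difference between the daily ratio of Eq. (2) and its cumulative counterpart of Eq. (4) can be illustrated with a short sketch. The three-day series below are made-up numbers, not model output, and the function names are chosen here for illustration only.

```python
from itertools import accumulate


def daily_ratio(h, i, a):
    """Daily (H + I + A) / H, one value per day."""
    return [(hh + ii + aa) / hh for hh, ii, aa in zip(h, i, a)]


def cumulative_ratio(h, i, a):
    """Running sum of (A + I + H) divided by the running sum of H."""
    total = [hh + ii + aa for hh, ii, aa in zip(h, i, a)]
    return [num / den for num, den in zip(accumulate(total), accumulate(h))]


# Hypothetical daily values for reported (H), unreported symptomatic (I)
# and unreported asymptomatic (A) individuals.
h = [100.0, 200.0, 400.0]
i = [100.0, 150.0, 200.0]
a = [600.0, 650.0, 1000.0]

daily = daily_ratio(h, i, a)
cumulative = cumulative_ratio(h, i, a)
print("daily:     ", daily)
print("cumulative:", cumulative)
```

In this toy series the cumulative ratio falls more slowly than the daily one, because early days with a large unreported share keep contributing to the running sums. This is the qualitative reason why a serological survey, which integrates over past infections, is compared with the cumulative quantity.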

Fig. 5

The fixed ratio of Eq. (5), elaborated from the results of the seroprevalence survey (data-points) is compared with the model ratio of Eq. (4) (cyan curves). See text for discussion.

In the next Section the results of the Istat survey will be compared with the predictions of the model, but we want to emphasize that the comparison will favor ratios like (1) and (4) for a simple reason: possible systematic uncertainties affecting the values in the numerator and denominator largely cancel in the ratios, which therefore represent the most adequate way of comparing the survey data with model calculations.

Analysis of the Istat survey

The results of the Istat survey can be transformed into an analogous ratio in a simple way: the absolute value corresponds to an average value of the prevalence (2.3 + 2.6)/2 ≈ 2.5 (cfr. Table 1). Therefore the upper and lower limits of the infected population divided by the reported cases are summarized by the ratio (5),¹ which now includes the uncertainties estimated in the Istat survey and updates the crude approximation (1). The data points in Fig. 5 give a graphical representation of the ratio (5), assumed, in a first step, to be time-independent on the basis of the assumptions i) and ii) already discussed. From this first analysis one can conclude that the global comparison in Fig. 5 between the estimated values and the evidence of ref. (Istat, 2020) is surprisingly successful, in particular when statistical uncertainties are included in the analysis of the survey and in the model predictions. For a long period of time during the lockdown months, speculations on the Asymptomatic population gave strongly divergent estimates. Our results of Figs. 5 and 3 are obtained within a model fixed on April 6th, 2020 and published before the Istat survey (Traini et al., 2020); the data confirm that epidemiological models can offer rather stable and realistic predictions also for the Asymptomatic compartment. In particular, the good agreement of Fig. 5 is related to the dominance of the asymptomatic component within the unreported cases. Such a result is consistent with the empirical fact that the tracing procedures (swabs) in Italy during the first wave of SARS-CoV-2 involved symptomatic individuals only (Ministero della Salute, 2020, report 0011715-03/04/2020-DGPRE-DGPRE-P). It is not surprising, therefore, that a large portion of unreported cases can be associated with asymptomatic individuals (cfr. Fig. 3). The model results for the cumulative ratio of Fig. 5 start with a value ≈ 8 at the beginning of April 2020, reaching an asymptotic average value ≈ 5.5 at the end of July 2020.
For the whole period of the preliminary survey (May 25th - July 15th) the ratio of Eq. (5) (data points with error bars in Fig. 5) is reproduced by the model calculation of Eq. (4) within the statistical uncertainties. The conclusions of the Istat survey, i.e.: “the seroprevalence data at regional level, to be integrated with the epidemiological surveillance data, are particularly relevant to identify, on one side, the portion of individuals reached by the infection in the previous months, and for programming measures to prevent future possible second waves, on the other side”,² can easily be applied to the relevance of modelling the behavior of the asymptomatic population, a key ingredient in managing the future of the pandemic event (e.g. Gandhi et al., 2020).
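
As a worked check of the arithmetic behind the survey ratio, the sketch below rescales the absolute value of Table 1 to the lower and upper seroprevalence limits. It assumes that the 1482377 IgG-positive individuals correspond to the rounded central prevalence of 2.5%; that assumption is made here, not stated in the text, and is adopted because it reproduces the central value 5.94 used in the later corrections.

```python
# Values from Table 1.
igg_positive = 1_482_377      # absolute value
reported_cases = 244_708      # officially reported cases
lower_pct, upper_pct = 2.3, 2.6

# Assumption: the absolute value corresponds to a prevalence of 2.5%.
central_pct = 2.5

crude_ratio = igg_positive / reported_cases
per_percent = igg_positive / central_pct          # individuals per 1% prevalence
ratio_low = lower_pct * per_percent / reported_cases
ratio_high = upper_pct * per_percent / reported_cases
ratio_mid = 0.5 * (ratio_low + ratio_high)

print(f"crude ratio: {crude_ratio:.2f}")
print(f"range:       {ratio_low:.2f} to {ratio_high:.2f}")
print(f"midpoint:    {ratio_mid:.2f}")
```

Under this assumption the midpoint of the range rounds to 5.94, slightly below the crude ratio, because the interval 2.3–2.6 is not symmetric around 2.5.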

Results

The method introduced in the previous Section in order to learn about the asymptomatic prevalence from serological tests and surveys needs further consideration before detailed information can be extracted. Critical effects to be considered are the uncertainties due to the sensitivity and specificity of the serological tests and the rôle played by the time dependence of the antibody response.

Corrections due to sensitivity and specificity of the serological tests

A first correction to the seroprevalence analysis presented in the previous Section is due to the sensitivity and specificity of the serological tests adopted in the survey screenings (for a recent note on false positives and false negatives in the diagnosis of Covid-19, see Jia, Xiao, & Liu, 2020). The report by Istat (Istat, 2020) indicates a specificity of not less than 95% and a sensitivity of not less than 90% for the tests. The resulting false-negative and false-positive classifications can be taken into account by increasing the uncertainties of the ratio (5) by 10% (10% × 5.94 ≈ 0.59) and 5% (5% × 5.94 ≈ 0.30) in the two directions, assuming the maximum error in the data. One gets Eq. (6), which replaces the value (5) to take into account the (asymmetric) corrections due to false responses. Fig. 5 is, consequently, replaced by Fig. 6, where the model results are compared with the more elaborate analysis. The asymmetric increase of the error bars is clearly shown; at the same time, however, the conclusions of Section 2.3 remain basically valid.
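
The two margins quoted above follow directly from the minimum sensitivity and specificity. The sketch below only reproduces that arithmetic; how the margins are combined with the statistical uncertainty of the ratio (5) is not reconstructed here, and the variable names are illustrative.

```python
central_ratio = 5.94        # central value of the survey ratio used in the text
min_sensitivity = 0.90      # "not less than 90%" (Istat, 2020)
min_specificity = 0.95      # "not less than 95%" (Istat, 2020)

# Maximum fraction of misclassified tests in each direction.
max_false_negative = 1.0 - min_sensitivity
max_false_positive = 1.0 - min_specificity

# Corresponding margins on the ratio, one per direction.
margin_false_negative = max_false_negative * central_ratio
margin_false_positive = max_false_positive * central_ratio

print(f"false-negative margin: {margin_false_negative:.2f}")
print(f"false-positive margin: {margin_false_positive:.2f}")
```

Because the two margins differ by a factor of two, the resulting error bars are asymmetric, as noted for Fig. 6.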

Fig. 6

The ratios of Eqs. (5), (6), are compared with the cumulative model ratio of Eq. (4) (cyan curves). The smallest error bars take into account the statistical variation of Eq. (5) suggested by the Istat report (Istat, 2020; Istat Questionnaire, 2020). The largest error bars include the modifications induced by the sensitivity and specificity of the IgG tests as proposed and discussed in the text.

Corrections due to time dependence of antibody tests

A recent meta-analysis by Deeks et al. (2020) observed substantial heterogeneity in the sensitivities of IgA, IgM and IgG antibodies, or combinations thereof, for results aggregated across different time periods post-symptom onset. They based the main results of the review on the 38 studies that stratified results by time since symptom onset.³ In particular, IgG/IgM all showed low sensitivity during the first week since onset of symptoms (all less than 30.1%), rising in the second week and reaching their highest values in the third week. The combination of IgG/IgM had a sensitivity of 30.1% (95% CI 21.4 to 40.7) for 1–7 days, 72.2% (95% CI 63.5 to 79.5) for 8–14 days, and 91.4% (95% CI 87.0 to 94.4) for 15–21 days. Estimates of accuracy beyond three weeks are based on smaller sample sizes and fewer studies. For 21–35 days, pooled sensitivities for IgG/IgM were 96.0% (95% CI 90.6 to 98.3). There are insufficient studies to estimate the sensitivity of tests beyond 35 days post-symptom onset. Summary specificities (provided in 35 studies) exceeded 98% for all target antibodies, with confidence intervals no more than 2 percentage points wide. Assuming as a reference point the results of the meta-analysis by Deeks et al., one must further correct the uncertainties of Fig. 6. The questionnaire filled in by the people involved in the screening for the Istat survey (Istat Questionnaire, 2020) included questions on the exact period of the symptom onset, relevant information for correcting the data and analyzing the effects of the time-dependent sensitivity of the tests. However, the information is not available at the moment and one is obliged to introduce further corrections. One is left with the basic assumption that the sample does not privilege any specific interval of the time dependence of the sensitivity.
As a consequence, the single test has to be considered (on average) affected by the average value of the sensitivity among 30.1% (days 1–7), 72.2% (days 8–14), 91.4% (days 15–21), 96% (days 22–35), and 90% for days > 35 (until the end of the screening period, i.e. the remaining 17 days). 90% is indeed the minimum value of the sensitivity proposed by Istat and discussed in Section 3.1. The time dependence renormalizes the sensitivity through a weighted average where the weights (7, 7, 7, 14, 17) are the durations, in days, associated with the partial values of the sensitivity (30.1, 72.2, 91.4, 96.0, 90.0). One has:

$$\langle \mathrm{sensitivity} \rangle = \frac{7 \times 30.1 + 7 \times 72.2 + 7 \times 91.4 + 14 \times 96.0 + 17 \times 90.0}{7 + 7 + 7 + 14 + 17}\,\% \approx 81.3\% \qquad (7)$$

The sensitivity of the test decreases from not less than 90% to 81.3% (the false tests increase from 10% to 18.7%), while the specificity remains not less than 95%. Taking into account the new estimated sensitivity (7), and repeating the calculations leading to Eq. (6), the ratio (6) is replaced by Eq. (8), where 18.7% × 5.94 ≈ 1.11. Fig. 6 is replaced by Fig. 7. The asymmetric increase of the error bars is again clearly shown, and the conclusions drawn in Section 2.3 become weaker.
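
The duration-weighted average of the sensitivity can be checked with a few lines. The windows and sensitivities are those quoted in the text; the variable names are illustrative.

```python
# (duration in days, sensitivity in %) for each window since symptom onset.
windows = [
    (7, 30.1),    # days 1-7
    (7, 72.2),    # days 8-14
    (7, 91.4),    # days 15-21
    (14, 96.0),   # days 22-35
    (17, 90.0),   # days > 35, Istat minimum sensitivity
]

total_days = sum(days for days, _ in windows)
weighted_sensitivity = sum(days * sens for days, sens in windows) / total_days
false_negative_pct = 100.0 - weighted_sensitivity

# Margin on the survey ratio, using the central value 5.94 from the text.
margin = false_negative_pct / 100.0 * 5.94

print(f"screening period:      {total_days} days")
print(f"weighted sensitivity:  {weighted_sensitivity:.1f}%")
print(f"false-negative share:  {false_negative_pct:.1f}%")
print(f"margin on the ratio:   {margin:.2f}")
```

The weights add up to 52 days, the length of the survey period from May 25th to July 15th, which is why the last window is 17 days long.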

Fig. 7

The ratios of Eqs. (5), (8), are compared with the cumulative model ratio of Eq. (4) (cyan curves). The smallest error bars take into account the statistical variation suggested by the Istat report (Istat, 2020). The largest error bars include the modifications induced by the sensitivity and specificity of the IgG and their time-dependence (Deeks, 2020) (Deeks et al., 2020), as elaborated in the present analysis.

The comparison between the model estimates and the data corrected for the time dependence of the test sensitivity (Fig. 7) is less accurate, and the data cannot be considered a stringent test for models. The estimate of the asymptomatic prevalence remains valid, but with a larger confidence interval. Model and corrected data are still consistent for the whole time period. The model description remains a valid guide in the interpretation of the unknown asymptomatic distribution within a larger interval of values. A more careful analysis of the Istat data, in particular with reference to the mentioned questionnaire (Istat Questionnaire, 2020), could help in making the comparison more stringent again.

Unreported versus asymptomatic cases

We are now in a position to disentangle, in more detail, the distribution of asymptomatic cases from the unreported ones. A first example has been given in Fig. 3, where the model simulation has been used to separate the two components: unreported-asymptomatic and unreported-symptomatic distributions. A second step can be performed by means of the results of the Istat survey. An attempt is illustrated in Fig. 8, where the population sampled by the survey is further subdivided according to the symptom onset. Since the model does not distinguish between mildly symptomatic and asymptomatic individuals, the class of unreported cases has to be compared with the sum of the sectors with percentages 27.3% + 24.7% + 6.5% = 58.5%. The number of Asymptomatic individuals corresponding to the 58.5% follows from the absolute value of Table 1 and represents the average asymptomatic cases (= never symptomatic + mildly symptomatic + not specified) as detected by the serological survey. The possible variations in the final value are obtained by taking into account the statistical variations of the seroprevalence (2.3%–2.6%) shown in Table 1 and numerically elaborated in the footnote on page 9.

Fig. 8

From (Istat, 2020): Percentage of different symptoms in the population with antibodies sampled in the Istat seroprevalence survey. The Asymptomatic class of the model is made up of the three portions 27.3% + 24.7% + 6.5% = 58.5%. See text for discussion.

The same analysis can be written in our preferred form by means of the ratio, Eq. (12).⁵ Even if it can appear rather cryptic, Eq. (12) is particularly interesting since: i) it is fully equivalent to the previous expressions (10), (11); ii) it is based on empirical data like the “reported cases” and on the evidence that the asymptomatic class is just 58.5% of the total number of infected as statistically detected by the serological test; iii) in addition it contains the “ratio”, which can be expressed in different ways depending on the approximation used in the evaluation procedure, namely the expressions (5), (6) and (8), which contain different ingredients (as discussed in the previous Sections 3.1, 3.2). When the full ratio (8) is used, taking into account corrections due to test sensitivity and time dependence effects, one obtains an estimate for the minimum and maximum value of the asymptomatic population, Eq. (13). In order to compare those values with the model predictions one has to use the analogous quantities of Eq. (12) for the model, namely: i) the “reported cases”, which are approximately the same since the model reproduces the empirical data; ii) the ratio between unreported + reported cases and reported cases as given by the “ratio” of Eq. (4), which numerically remains approximately constant for the period of interest (cfr. cyan curves in Fig. 5); iii) the ratio between the asymptomatic and symptomatic unreported cases as illustrated in Fig. 3, as already anticipated in Section 2.2. The final expression is Eq. (15). The survey result (13) and the model prediction (15) largely overlap within the statistical uncertainties.
As a final result one can estimate the number of asymptomatic individuals (both never symptomatic and mildly symptomatic) in Italy in the period May 25th - July 15th as the largest interval of the previous estimates (to be conservative), a result with about 20% accuracy.
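
The step from the 58.5% share to a number of individuals, and its equivalent form written through the ratio to reported cases, can be sketched as follows. Only the central value is computed, using the absolute value of Table 1; the corrected minimum and maximum values of the paper are not reconstructed, and the variable names are illustrative.

```python
# Sectors of Fig. 8 that the model groups into the Asymptomatic class (in %).
asymptomatic_sectors = [27.3, 24.7, 6.5]
asymptomatic_share = sum(asymptomatic_sectors) / 100.0

# Values from Table 1.
igg_positive = 1_482_377
reported_cases = 244_708

# Direct form: share of the IgG-positive population.
asymptomatic_direct = asymptomatic_share * igg_positive

# Ratio form: share x (infected / reported) x reported cases.
ratio = igg_positive / reported_cases
asymptomatic_via_ratio = asymptomatic_share * ratio * reported_cases

print(f"asymptomatic share:   {asymptomatic_share:.3f}")
print(f"direct estimate:      {asymptomatic_direct:,.0f}")
print(f"ratio-form estimate:  {asymptomatic_via_ratio:,.0f}")
```

The two forms agree by construction. The ratio form is useful because the ratio can be replaced by any of the corrected survey ratios or by the model ratio, which is what allows a like-for-like comparison between survey and model.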

Vaccination scenarios and the rôle of unreported cases

The present Section is devoted to a much-debated aspect of the recent discussion on SARS-CoV-2: vaccination. Great Britain started its vaccination campaign before Christmas 2020, and all the remaining European countries followed on December 27th. The organization of such a massive, historic event is rather demanding and will need the effort of many institutions.

Modelling vaccination: scenarios

A preliminary aspect is the assessment of a possible vaccination scenario. We do not discuss specific strategies; rather, we want to establish general scenarios that illustrate the main advantages and disadvantages under simple assumptions. Basically we will assume that the year 2021 will be devoted to vaccination in Italy and that the order of magnitude of the vaccinated (immunized) persons per day is between 40000 and 80000 (to be selected with specific social and homogeneity criteria) (e.g. Makoul et al., 2020). In the following, two basic scenarios are introduced, as illustrated in Fig. 9.

Fig. 9

A constant number of persons are vaccinated per day during the year 2021: 80000, scenario (1-A); 40000, scenario (1-B). Alternatively, in scenario (2-A) an increasing number of people are vaccinated during 2021, allowing for a less intense starting effort. The total number of vaccinated people is (almost) 30 million for both scenarios, (1-A) and (2-A), at the end of the year.

The first scenarios, (1-A), (1-B), (1-C), assume a homogeneous distribution, during 2021, of a fixed number of immunized persons per day: 80000, scenario (1-A), and 40000, scenario (1-B) (full lines in Fig. 9), and 200000, scenario (1-C) (dashed line in Fig. 9). The effort needed to start abruptly with such a large portion of the population is high, and a second scenario is therefore also investigated. The second scenario assumes that the number of persons immunized during the first period is rather low and that the process accelerates during the year (dot-dashed line in Fig. 9, scenario (2-A)).
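As a rough check of the totals quoted for these scenarios, the daily schedules can be tabulated in a few lines of Python. This is only a sketch: the constant rates are those given in the text, while the linear ramp used here for scenario (2-A) is a hypothetical shape (the text only states that the rate increases during the year and that the yearly total matches scenario (1-A)). All function names are illustrative.

```python
DAYS_IN_2021 = 365
MID_YEAR = 180  # list index of June 30th (day 181 of the year)


def constant_schedule(per_day, days=DAYS_IN_2021):
    """Daily immunizations for a constant-rate scenario such as (1-A), (1-B), (1-C)."""
    return [float(per_day)] * days


def linear_ramp_schedule(total, days=DAYS_IN_2021):
    """Assumed stand-in for scenario (2-A): the daily rate grows linearly from
    zero and is scaled so that the yearly total equals `total`."""
    weights = list(range(days))  # 0, 1, ..., days - 1
    scale = total / sum(weights)
    return [scale * w for w in weights]


def cumulative(schedule):
    """Running total of immunized persons, one entry per day."""
    running, totals = 0.0, []
    for daily in schedule:
        running += daily
        totals.append(running)
    return totals


schedules = {
    "1-A": constant_schedule(80_000),
    "1-B": constant_schedule(40_000),
    "1-C": constant_schedule(200_000),
}
# Scenario (2-A) is constrained to reach the same yearly total as (1-A).
schedules["2-A"] = linear_ramp_schedule(sum(schedules["1-A"]))

for name, schedule in schedules.items():
    totals = cumulative(schedule)
    print(f"scenario ({name}): {totals[MID_YEAR] / 1e6:.1f} million by June 30th, "
          f"{totals[-1] / 1e6:.1f} million by December 31st")
```

With a constant 80000 immunizations per day the yearly total is 80000 × 365 = 29.2 million, consistent with the "(almost) 30 million" quoted for scenarios (1-A) and (2-A); under the assumed ramp, scenario (2-A) reaches that total with far fewer immunized persons in the first half of the year.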

Year 2021: a perspective

This Section is devoted to a general perspective on the year 2021, offering a macroscopic view of the pandemic event in Italy and of the advantages of the vaccination campaign. From Fig. 10 one can get an approximate idea of the time evolution of the SARS-CoV-2 infection in Italy during the year in the case of no vaccination, an assumption which is ruled out but which offers a reference point to appreciate the advantages of the vaccines. Such a scenario is described by the full black line of Fig. 10. One can immediately see that the number of reported cases (at the beginning of the year 2021 they were 574767, equivalent to 100%) decreases during the year, since more and more “susceptible” individuals become immune (or die) after the infected period. The year 2021 will not be sufficient to obtain a vanishing number of infected: in summer they will be around 150000 and during the next Christmas period roughly 50000. This is a rough estimate which takes into account (arbitrary) oscillations due to possible waves of infection, as emerges from the wave-like behavior of the curve. The situation changes considerably with the introduction of vaccination at a rate of 80000 immunized persons per day (red curve, scenario (1-A)). A reduction of roughly 100000 cases could already be evident in April, while in summer (June) one reaches 20000 cases instead of 150000, and in August the virus will give a definitive “CIAO”. As a matter of fact, vaccination gains several months (from six to ten), without counting the number of deaths (see Fig. 11).
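Since Fig. 10 is expressed as a fraction of the January 1st value, it may help to convert the case counts quoted in the text to the same scale. The snippet below is a minimal sketch of that arithmetic; the counts are those quoted above, and the helper name is illustrative.

```python
REFERENCE_CASES = 574_767  # reported cases on January 1st, 2021 (= 100% in Fig. 10)


def percent_of_reference(cases, reference=REFERENCE_CASES):
    """Express a case count as a percentage of the January 1st value."""
    return 100.0 * cases / reference


# Case counts quoted in the text for the different situations.
quoted_counts = {
    "no vaccination, summer": 150_000,
    "no vaccination, Christmas period": 50_000,
    "scenario (1-A), June": 20_000,
}

for label, cases in quoted_counts.items():
    print(f"{label}: {cases} cases = {percent_of_reference(cases):.1f}% of the initial value")
```

On this scale the no-vaccination curve sits at about 26% in summer and about 9% at the end of the year, while scenario (1-A) is already at about 3.5% in June.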

Fig. 10

The fraction of the reported cases as a function of the day of the year 2021. On January 1st they were 574767, equivalent to 100%. The scenario assuming no vaccination (continuous black line) is compared with scenario (1-A), which assumes a constant rate of 80000 immunized each day (continuous red line). The blue line, scenario (1-B), shows results for 40000 immunized per day at constant rate. Scenario (2-A), where the rate is not constant and a slow starting period is compensated by an accelerated second immunization period (see Fig. 9), is illustrated by the dot-dashed (magenta) curve.

Fig. 11

The reported + unreported cases as a function of the day of the year 2021. The scenario assuming no vaccination (full black line) is compared with scenario (1-A), which assumes a constant rate of 80000 immunizations each day (full red line). The blue line, scenario (1-B), shows results for 40000 immunized per day at constant rate.

Scenario (1-B) (blue line) assumes half as many immunized persons per day (40000), and the disadvantages with respect to the previous hypothesis of 80000 are evident. One has to wait until the end of October to eliminate the infection, and in June one could still have 80000 cases. A lower limit is reached with the dashed line representing the strongest scenario of 200000 immunized per day, scenario (1-C). Scenario (2-A) has also been implemented in the model, and one can see the effects of a slow rate at the beginning of the year in the same Fig. 10 (dot-dashed line). The results of the vaccination are almost invisible until the end of May. They then appear rapidly in the second part of the year, producing the final result at the end of October. The summer is still a hard period and scenario (2-A) is rather similar to scenario (1-B). The conclusions of the previous discussion on vaccination strategies, as they emerge from Fig. 10, can appear rather unintuitive.
It appears rather hard to accept the rapid reduction of the reported cases in one single year (initial value 575000) even in a scenario with no vaccination. To partially restore intuition, one has to realize that, following the results of the previous Sections, a large part of the effect is due to those hidden individuals we named unreported cases. The relevant number of unreported cases (a factor 4–5 larger than the actual reported cases) produces a widespread population of recovered persons (with antibodies) who contribute substantially to the exit from the Covid-19 infection. A convincing illustration of this analysis comes from Fig. 11, where the behavior of the total (unreported + reported) cases is shown as a function of time. It complements Fig. 10, showing that the decreasing behavior of the reported cases is mirrored by the unreported ones. The presence of antibodies does not originate from vaccines only: the (recovered) unreported cases contribute critically.
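The qualitative mechanism described here, in which susceptibles are depleted both by vaccination and by recovery of reported and unreported cases, can be sketched with a deliberately simplified toy model. The code below is not the compartmental model used in this paper: it keeps only four classes, assumes that reported and unreported cases are equally infectious, and uses purely illustrative values for the transmission rate `beta`, the recovery rate `gamma` and a round population of 60 million. Only the initial number of reported cases and the factor 4 between unreported and reported cases are taken from the text.

```python
def simulate(vaccinated_per_day, days=365, population=60_000_000,
             reported0=575_000, unreported_per_reported=4.0,
             beta=0.09, gamma=0.10):
    """Toy daily-step model with susceptible, reported, unreported and immune classes.

    Returns one (susceptible, reported, unreported, immune) tuple per day.
    beta, gamma and population are illustrative assumptions, not fitted values.
    """
    share_reported = 1.0 / (1.0 + unreported_per_reported)
    reported = float(reported0)
    unreported = reported0 * unreported_per_reported
    immune = 0.0
    susceptible = population - reported - unreported
    history = [(susceptible, reported, unreported, immune)]
    for _ in range(days):
        infectious = reported + unreported
        new_infections = beta * susceptible * infectious / population
        # Vaccination cannot remove more susceptibles than remain after infection.
        vaccinated = min(vaccinated_per_day, susceptible - new_infections)
        recovered_reported = gamma * reported
        recovered_unreported = gamma * unreported
        susceptible -= new_infections + vaccinated
        reported += share_reported * new_infections - recovered_reported
        unreported += (1.0 - share_reported) * new_infections - recovered_unreported
        # Recovered (reported or not) and vaccinated persons all end up immune.
        immune += recovered_reported + recovered_unreported + vaccinated
        history.append((susceptible, reported, unreported, immune))
    return history


no_vaccination = simulate(vaccinated_per_day=0)
scenario_1a = simulate(vaccinated_per_day=80_000)

for label, run in [("no vaccination", no_vaccination), ("80000 per day", scenario_1a)]:
    _, reported, unreported, immune = run[-1]
    print(f"{label}: reported {reported:.0f}, unreported {unreported:.0f}, "
          f"immune {immune / 1e6:.1f} million at year end")
```

Even in this crude setting the immune class grows through recovered unreported cases as well as through vaccination, which is the point made above; the numbers it prints should not be compared with Figs. 10 and 11.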

Concluding remarks

The traditional compartmental classes of Susceptible, Exposed, Infected and Recovered, which characterize a large fraction of epidemiological models, are not sufficient to simulate the coronavirus (SARS-CoV-2) pandemic infection. Many of the people who contract the disease are Asymptomatic: they are infected and contagious and are often invoked as one of the causes of the rapid spread of the infection. It is hard to evaluate the number of asymptomatic individuals; estimates range from 40% to 75% (Buitrago-Garcia et al., 2020; Oran and Topol, 2020; Day, 2020; Mizumoto et al., 2020) of the total infected population. On a small scale, the particular conditions of the village of Vo’ (near Padua, Italy) allowed for two detailed surveys after a localized lockdown: the analysis found a prevalence of 1.2% (95% CI: 0.8–1.8%). Notably, 42.5% (95% CI: 31.5–54.6%) of the confirmed SARS-CoV-2 infections were asymptomatic (that is, they did not have symptoms at the time of positive swab testing and did not develop symptoms afterwards; Lavezzo et al., 2020). The Italian National Institute of Statistics (Istat) presented, in August 2020, preliminary results of a seroprevalence survey on the percentage of individuals reached by the SARS-CoV-2 infection in the previous months. The present study investigates such seroprevalence through a model description of the entire infected population in Italy, considering an eightfold compartmental model which includes Infected, Asymptomatic and Quarantined populations in addition to the more classic Susceptible, Recovered, Exposed. The model is illustrated in a few examples and used to investigate in detail the amount of unreported cases, including also the sensitivity and specificity of the antibody tests used for the surveys. The comparison validates the model description and encourages other studies to detail, in a more quantitative way, the role of the time dependence of the sensitivity of the test used for the antibody screening.
The combined use of model predictions and survey data allows an estimate of the total asymptomatic population in Italy in the period May 25th - July 15th, 2020. Predictions for the period of the second wave in Italy are also presented and discussed, including the unreported contribution. Although the epidemiological surveillance of the second wave of the epidemic in Italy is characterized by a rather intensive use of swabs, the model prediction for the amount of unreported cases is of the same order of magnitude as the percentage already seen in the first outbreak (Traini et al., 2020). Future antibody screening will verify the present prediction. To complete the modelling, Section 4 has been devoted to the vaccination scenarios. The rôle played by the introduction of vaccination is clearly shown: it significantly reduces the time needed to reach the end of the infection. The administration strategies can be simulated, and they favor an intense administration from the beginning. Starting with a low rate of administration risks losing a large part of the advantage. At present (May 9th, 2021) the persons fully vaccinated (immunized) in Italy are 7,400,000, for a total number of 24,000,000 administrations (Ministero della Salute, 2021), while the rate of immunization can be estimated at the level of 88,000 per day (Traini, 2021b), an intermediate value among the scenarios investigated in the present study.

Note added in proof. Fig. 10 shows data referring to February 15th; the updated situation is shown in Fig. 12.

Fig. 12

Updating Fig. 10. Official data from the Ministry of Health (Ministero della Salute, 2020) (last updated May 9th, 2021).

The large increase of the data after February 15th is basically due to the appearance in Italy of new virus variants (e.g. Burki, 2021; Fontanet et al., 2021), an event not included in the model used in the present work. The delay induced by the variant wave seems to be almost 3 months. An acceleration in the administration of the vaccines is mandatory.

Authorship contribution statement

The paper is the product of the joint work of the authors.

Declaration of competing interest

The authors declare no competing interests.

39 in total

1. Covid-19: identifying and isolating asymptomatic people helped eliminate virus in Italian village.

Authors: Michael Day
Journal: BMJ Date: 2020-03-23

2. Seroprevalence of SARS-CoV-2-Specific Antibodies Among Adults in Los Angeles County, California, on April 10-11, 2020.

Authors: Neeraj Sood; Paul Simon; Peggy Ebner; Daniel Eichner; Jeffrey Reynolds; Eran Bendavid; Jay Bhattacharya
Journal: JAMA Date: 2020-06-16 Impact factor: 56.272

3. A study of SARS-CoV-2 epidemiology in Italy: from early days to secondary effects after social distancing.

Authors: Marco Claudio Traini; Carla Caponi; Riccardo Ferrari; Giuseppe Vittorio De Socio
Journal: Infect Dis (Lond) Date: 2020-07-30

4. Prevalence of Asymptomatic SARS-CoV-2 Infection: A Narrative Review.

Authors: Daniel P Oran; Eric J Topol
Journal: Ann Intern Med Date: 2020-06-03 Impact factor: 25.391

5. Performance Characteristics of the Abbott Architect SARS-CoV-2 IgG Assay and Seroprevalence in Boise, Idaho.

Authors: Andrew Bryan; Gregory Pepper; Mark H Wener; Susan L Fink; Chihiro Morishima; Anu Chaudhary; Keith R Jerome; Patrick C Mathias; Alexander L Greninger
Journal: J Clin Microbiol Date: 2020-07-23 Impact factor: 5.948

6. Observed and estimated prevalence of Covid-19 in Italy: How to estimate the total cases from medical swabs data.

Authors: F Bassi; G Arbia; P D Falorsi
Journal: Sci Total Environ Date: 2020-10-08 Impact factor: 7.963

7. SARS-CoV-2 Transmission From People Without COVID-19 Symptoms.

Authors: Michael A Johansson; Talia M Quandelacy; Sarah Kada; Pragati Venkata Prasad; Molly Steele; John T Brooks; Rachel B Slayton; Matthew Biggerstaff; Jay C Butler
Journal: JAMA Netw Open Date: 2021-01-04

8. Understanding variants of SARS-CoV-2.

Authors: Talha Burki
Journal: Lancet Date: 2021-02-06 Impact factor: 79.321

9. Seroprevalence of anti-SARS-CoV-2 IgG antibodies in Geneva, Switzerland (SEROCoV-POP): a population-based study.

Authors: Silvia Stringhini; Ania Wisniak; Giovanni Piumatti; Andrew S Azman; Stephen A Lauer; Hélène Baysson; David De Ridder; Dusan Petrovic; Stephanie Schrempft; Kailing Marcus; Sabine Yerly; Isabelle Arm Vernez; Olivia Keiser; Samia Hurst; Klara M Posfay-Barbe; Didier Trono; Didier Pittet; Laurent Gétaz; François Chappuis; Isabella Eckerle; Nicolas Vuilleumier; Benjamin Meyer; Antoine Flahault; Laurent Kaiser; Idris Guessous
Journal: Lancet Date: 2020-06-11 Impact factor: 79.321