Infectious diseases epidemiology, quantitative methodology, and clinical research in the midst of the COVID-19 pandemic: Perspective from a European country.

Geert Molenberghs¹, Marc Buyse², Steven Abrams³, Niel Hens⁴, Philippe Beutels⁵, Christel Faes⁶, Geert Verbeke¹, Pierre Van Damme⁵, Herman Goossens⁷, Thomas Neyens¹, Sereina Herzog⁵, Heidi Theeten⁵, Koen Pepermans⁵, Ariel Alonso Abad⁸, Ingrid Van Keilegom⁹, Niko Speybroeck¹⁰, Catherine Legrand¹¹, Stefanie De Buyser¹², Frank Hulstaert¹³.

Abstract

Starting from historic reflections, the current SARS-CoV-2 induced COVID-19 pandemic is examined from various perspectives, in terms of what it implies for the implementation of non-pharmaceutical interventions, the modeling and monitoring of the epidemic, the development of early-warning systems, the study of mortality, prevalence estimation, diagnostic and serological testing, vaccine development, and ultimately clinical trials. Emphasis is placed on how the pandemic has led to unprecedented speed in methodological and clinical development, the pitfalls thereof, but also the opportunities that it engenders for national and international collaboration, and how it has simplified and sped up procedures. We also study the impact of the pandemic on clinical trials in other indications. We note that it has placed biostatistics, epidemiology, virology, infectiology, vaccinology, and related fields in the spotlight in an unprecedented way, implying great opportunities, but also the need to communicate effectively, often amidst controversy.

Keywords: Antiviral therapy; Data sharing; Diagnostic testing; Factorial designs; Infection fatality rate; Mathematical epidemiology; Mathematical modeling; Mortality; Non-pharmaceutical intervention; Nowcasting; Platform trials; Pragmatic trials; Prevalence; SARS-CoV-2; Vaccine development

Substances：
COVID-19 Vaccines

Year: 2020 PMID： 33132155 PMCID： PMC7581408 DOI： 10.1016/j.cct.2020.106189

Source DB: PubMed Journal: Contemp Clin Trials ISSN： 1551-7144 Impact factor: 2.226

Introduction

We have to return to 1918, the time of the H1N1 influenza pandemic, the Spanish flu, to encounter a health crisis that had to be confronted without adequate medicinal products, prior to even the concept of vaccination, with poor scientific knowledge (viruses had not been discovered), and with little or no historic registration or surveillance data available [110]. Arguably, such an invasive health crisis has a profound transformational impact on virtually all aspects of society. Spinney [110], in a chronicle of the Spanish flu, asserts that this 1918 pandemic, responsible for a death toll in the order of magnitude of 100 million people (rescaled to today's world population; directly or because of induced comorbidities, in particular also bacterial infections), was at least as impactful as both world wars in shaping the world as we have known it until the end of 2019. It is interesting from a historic perspective, and crucial in understanding today's evolution, to examine how the Spanish flu impacted society (politics and geopolitics, social relationships, economic power, etc.). Unquestionably, the SARS-CoV-2 induced COVID-19 epidemic holds the same disruptive power. Our focus is on how such a global public health crisis transforms clinical research, and in particular epidemiological, (bio)statistical and clinical trials research. It is insightful to remember that, at the onset of Spanish flu, scientists thought that it was bacterial, before it catalyzed the discovery and then the study of viruses and their induced ailments. In more traditional communities around the world, religious, cultural (e.g., Confucian), or even environmental explanations were given (such as miasma or bad air). Spanish flu was often confused with bacteria-induced typhus, also known as typhus fever. In the absence of proper diagnostic testing, the occurrence of typhus was confirmed as soon as the characteristic rash occurred.
Other than that, milder cases of the Spanish flu were hard to set apart based on the symptoms they induced. For severe cases, there was less doubt (e.g., due to partial or full-body discoloring), but by that time it was usually too late. While the details are different, the broad-brush similarity between 1918 and 2020 is striking [40]. The post-Spanish flu public health world looked very different from what it was before. The importance of hygiene to fight and prevent disease had been understood since the seminal contributions of Florence Nightingale and the key discoveries of Louis Pasteur regarding bacteria. A key factor had been the discovery of penicillium and eventually antibiotics. This brings to the fore two important ways to confront infections: hygiene as an archetypical non-pharmaceutical intervention (NPI) [[4], [10], [44], [56]] and antibiotics as an essential example of a pharmaceutical intervention (PI). But, for the Spanish flu, antibiotic development was in its infancy (arsphenamine, discovered by German physician Ehrlich in 1909, was used for syphilis at the time), which remained the case until the discovery and mass production of penicillin during WWII. More importantly, antibiotics do not work for viral infections. While a century ago the world was less globalized than it is now, the mass movement of people due to WWI, but also transatlantic vessels, offered transmission opportunities to H1N1; our hyper-interconnected world did the same for SARS-CoV-2. This meant that Spanish flu had to be tackled with a variety of NPIs and some PIs, including social distancing, facial masks, quarantining after improvements in diagnosis, and more adequate treatment for H1N1-induced pneumonia.
Eventually, likely already in 1919, the virus mutated to a less lethal strain, a typical competitive advantage for a virus, even though the mutation between Spring and Fall 1918 was uncharacteristic: the virus became more lethal over the summer, causing a horrendous second wave of infections [110]. It was clear in 1918 that little or no records, apart from anecdotal evidence, were kept about past influenza epidemics. Fast forwarding to 2020, arguably influenza is properly understood, from a viral, epidemiological, epidemic modeling, and vaccination standpoint. National and international surveillance is well developed, e.g., to determine the components of the upcoming season's influenza vaccine and to monitor the emergence of strains with pandemic potential [61]. We have to admit, though, that even though SARS-CoV-1 and MERS-CoV provided a wake-up call, data on coronavirus-induced pathology are, in contrast, rare. Vijgen et al. [129] suggest that the Russian flu of 1890 marked the emergence of hCoV-OC43, rather than being caused by H2N2, the influenza subtype behind the 1957–1958 pandemic in East Asia. Like H1N1, SARS-CoV-2 induced a sense of urgency and mobilized societal, political, and research forces that are unheard of in non-pandemic periods, except in wartime and in the face of catastrophes such as a financial meltdown. On March 10, 2020, Tomas Pueyo, a product and marketing leader at Course Hero, addressing politicians, wrote on medium.com: “The coronavirus is coming to you. It's coming at an exponential speed: gradually, and then suddenly. It's a matter of days. Maybe a week or two. When it does, your healthcare system will be overwhelmed. Your fellow citizens will be treated in the hallways. Exhausted health care workers will break down. Some will die. They will have to decide which patient gets the oxygen and which one dies.” While it sounded alarmist to some, it has proven an accurate vision for countries and regions in all continents except Oceania.
The combined fields of biostatistics, epidemiology, survey science, and clinical trials research, in close collaboration with and at the service of virology, immunology, and infectiology, contribute towards the following broad areas: understanding the virus and its dynamics by extending and reformulating existing mathematical and statistical models used to estimate key epidemiological parameters (e.g., basic reproduction number, incubation period, serial interval, generation interval, etc.); studying the immunological response to SARS-CoV-2 exposure, including the determination of the (sero-)prevalence in the population, T-cell mediated and humoral immunity responses, and potential cross-immunity; monitoring the global pandemic and its epidemics (country-wide, regional, city specific) by observing a set of characteristics and using a variety of modeling tools; gathering additional information by way of (longitudinal) survey sampling to gauge the epidemiological effect as well as the societal side effects (social and economic) of NPIs; making short-, medium-, and longer-term predictions, in view of monitoring health care capacity in early phases, NPI exit strategies, and the building of lines of defense towards surveillance in the post-peak period; contributing to clinical research for the development of diagnostic tools, antiviral medicinal products, and vaccines. Every one of these areas has seen tremendous and rigorous scientific development in the pre-pandemic era, both theoretical and applied. In that sense, the body of knowledge in 2020 cannot possibly be compared with the fragmented knowledge in 1918, and still, the knowledge about key aspects (seasonality, immunity, prevalence) is partial and speculative at best. The field of mathematical and statistical modeling of infectious diseases is well established [5,54,55] as are, of course, epidemiology and clinical trial methodology.
When a pandemic suddenly breaks out, all of these areas are forced to collaborate closely, whereas scientific areas, even within medicine, tend to be compartmentalized. Researchers in the same field across the globe should work together. In addition, time is of the essence, so that certain principles need to be relaxed out of necessity, while others stand like a rock. A natural consequence of the need and willingness to collaborate is making available all potentially relevant data and an uncompromised commitment to an open access policy. We return to this key lesson in Section 11. Note that the need for open access to data should be paralleled by open access to code in order to harness the power of the Internet and make research efficient on a worldwide scale. A sobering thought is that, in spite of all of this knowledge, at the outset, all one can do is list the key questions that emerge and quickly report early but key findings [[21], [43], [49], [64], [91], [132]]. An epidemic, or even pandemic, of a different nature was the HIV-induced AIDS epidemic [15,23]. Confronted with a lethal viral infection that affected predominantly younger people and hence led to a considerable number of life years lost, a massive response ensued, with large academic and collaborative AIDS research groups formed around the globe, predominantly in the United States (e.g., the AIDS Clinical Trials Group). It led to the acceptance of placebo controlled trials with frequent interim analyses overseen by an Independent Committee [14]. Also, coerced by the ‘fourth player’ (i.e., the patients and their advocacy groups, next to the three other players: regulators, industry, academia), co-enrolment in several trials simultaneously was grudgingly accepted, but arguably led to the development of highly active anti-retroviral therapy (HAART; [23]).
Undoubtedly, it dynamized the clinical research community and arguably paved the way for dynamic treatment regimes and a new emphasis on personalized medicine. It is evident that any deviation from standard practice poses methodological challenges that may be partially addressed during a crisis. A WWII example thereof is Wald's development of the sequential trial framework: there simply was no time for the established rigorous but slow industrial quality control processes [131]. The new paradigm proposed by Wald led to further developments that continue to influence clinical research today [116]. The current pandemic, just like the earlier ones, shows the need to trust the good faith of experts, and the good intentions of health professionals, rather than build onerous and time-consuming systems that are premised on the possibility of fraud and misbehavior. It is instructive to point out that the position of science and scientists was questioned in 1918 as well as today. Some referred to a “totalitarian system of science” [110]. It is natural that the position of biomedical science and its biostatistics and epidemiological counterparts is debated because seldom are they so prominently present in the public debate. To understand this, note that an epidemic is somewhat archaically referred to as a “crowd disease” [50]. It is natural to consult a physician for an ailment and, the more severe the condition, the more a patient is willing to accept side effects, as long as there is a sufficiently strong therapeutic effect. In fact, this is not different in a crowd disease. In the absence of PIs, the NPIs are society's only therapeutic class. Prescription, dosing, and monitoring of side effects then becomes a societal responsibility, where expert advice is blended with policymaking by mandated politicians, and with input from advocacy groups.

Epidemiological background

The biomedical and public health community, as well as the world population, have quickly been learning crucial lessons about SARS-CoV-2 and COVID-19. To date, several aspects, though, remain uncertain or simply unknown. It is useful to briefly review some basic concepts of infectious disease modeling. While more complex models for COVID-19 are undoubtedly more appropriate to account for heterogeneity related to gender and age (in relation to social contact behavior, acquisition of infection, infectivity per average person, symptomatology of infected individuals and corresponding risks of hospitalization as well as subsequent mortality risks), spatial heterogeneity, and/or variation in risks due to societal position, the so-called basic Susceptible-Infected-Recovered (SIR) compartmental model provides a reasonable starting point (see, e.g., [55]). Although simplistic in the face of the current epidemic, the SIR model does contain the essential ingredients. Abrams et al. (2020) have developed a much more elaborate model, i.e., an age-structured, stochastic model, tailored to the dynamics of SARS-CoV-2 transmission and the subsequent human response upon contracting the disease, both at the level of the symptomatology as well as in terms of humoral immunity responses within hosts [1]. In the basic SIR model, at any time t ≥ 0, the population is divided into three fractions or compartments: S(t) represents the susceptible fraction, I(t) is the infected (and infectious) fraction, and R(t) is the recovered fraction (immune survivors and potentially deaths). The initial states are S(0), I(0), and R(0). Flows of individuals between these states can be described using (ordinary) differential equations. The model is further influenced by two critical numbers: the recovery rate k, and the basic reproduction number R₀.
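For concreteness, the flows in the basic SIR model can be written as the system below, in one standard parameterization. The transmission rate β is a symbol introduced here purely for illustration (it is not named in the text); under the homogeneous mixing assumption it is tied to the two critical numbers through β = R₀k.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta\, S(t)\, I(t), \\
\frac{dI}{dt} &= \beta\, S(t)\, I(t) - k\, I(t), \\
\frac{dR}{dt} &= k\, I(t), \\
S(t) + I(t) + R(t) &= 1, \qquad \beta = R_0\, k .
\end{aligned}
```

In this form βI(t) plays the role of the force of infection, and the infected fraction can only grow at the start if R₀S(0) > 1.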
While R₀ is an implicit model parameter, the force of infection, i.e., the instantaneous rate at which susceptible individuals become infected, determines the basic and time-varying effective reproduction numbers, together with the recovery rate, through the so-called next generation matrix, providing information about the next generation of infected individuals resulting from a single typical infected individual. This basic model is rigid in that it assumes homogeneous (random) mixing within the population, and requires the population to be a closed system. In reality, as in Abrams' model, a population consists of various subgroups, or silos, with different behaviors (such as different levels of social contacts), and borders in a country like Belgium are merely administrative lines between neighboring countries [1]. Also, the three-fraction system is often too simple. For SARS-CoV-2, we need to add an exposed state in which exposed individuals are not yet infectious while viral load is gradually building up, a pre-symptomatic compartment in which individuals are able to infect others even though they do not have symptoms yet, and compartments including asymptomatic individuals and individuals with mild symptoms, severely ill, hospitalised, and intensive care unit (ICU) admitted persons. Recovery and death are ideally kept separate as well. Such an elaborate model is essential if it is to be used against the background of the hospital capacity available, and to gauge the death toll. Consider now the reproduction number R₀, defined as the average number of susceptible individuals infected by a single typical infected individual during his/her entire infectious period, at least in a fully susceptible population. There is a whole world not captured by a single R₀ value. First, it may depend on the initial population characteristics (age distribution, geographical spread, etc.).
Second, as time evolves and S(t) depletes, the effective reproduction number R is more relevant. The basic reproduction number as a measure of transmissibility of a pathogen is very different for seasonal influenza, where it is usually around 1.5, as compared to COVID-19, where it is estimated around 2.5 without medication or vaccines, and without NPIs. For an overview of COVID-19 related R₀ estimates, see Abrams et al. (2020) [1]. An early estimate for COVID-19, based on the Wuhan outbreak, can be found in Zhou et al. [142]. As is now everyday knowledge, R₀ < 1 (and R < 1) implies dampening of the epidemic, whereas with R₀ > 1 (and R > 1) it picks up until immunity is sufficiently widespread or the susceptible reservoir is depleted. Depending on R₀ and the generation interval (time between infection events in an infector-infectee pair, see [47]), building up immunity can be a lengthy process, even if no interventions are implemented, which we seem to see with the current pandemic. Moreover, uncertainty surrounding the nature and extent of immunity is considerable, because humoral immunity seems to wane over time and the role of T-cell immunity is yet to be studied in more detail. At the onset of the Wuhan outbreak, there was considerable uncertainty regarding key epidemiological parameters, in particular R₀, but also the associated (case and infection) fatality rate. We now know that both R₀ and the infection fatality rate (IFR) are relatively high, the latter being highly variable with age. Although highly dependent on the population under study, some additional examples of R₀ values from other infections: measles (R₀ ≃ 15), mumps (R₀ ≃ 5), SARS (R₀ ≃ 2.5). See Riccardo et al. [101], Chowell et al. [27], and He et al. [53].
Several other quantities are of epidemiological interest: infectious period (roughly about a week, versus a few days for influenza); age-specific contact rate (the typical number of social contacts of a certain type a member of the population has, see [55]); mode of transmission (for COVID-19, the mode of transmission was established early as airborne droplets, but this mode was later supplemented with others); probability of transmission upon a contact between a susceptible and infectious individual; shedding of viral load depending on the severity of symptoms; contribution of children to the infection process; high-risk contacts and their influence on disease dynamics (e.g., superspreading events). The infectiousness is strongly person-dependent (cf. the so-called superspreaders) and here secondary transmission via the airborne route is key, i.e., via aerosols [26,62]. A key population characteristic is the contact rate, i.e., the frequency and intensity of physical social contacts between population members. The number and intensity of social contacts is not a constant. There are group- and individual-specific aspects to the contact rate and, importantly, it can be modified. During the time frame without PIs, modifying the contact rate and intensity (briefly or for a very extended period of time) is essentially the only option available. For a variety of reasons, describing and predicting a real-life epidemic curve may be very difficult. As the pandemic has been unfolding, the country and state specific epidemic curves take about any possible non-linear shape (https://coronavirus.jhu.edu/data/new-cases-50-states), underlining the importance and extent of heterogeneity in infectious disease dynamics.
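To make the role of R₀ and the recovery rate k tangible, the following is a minimal sketch of the basic SIR model integrated with forward Euler steps. It is an illustration under stated assumptions, not the model of Abrams et al.: homogeneous mixing, a closed population, a one-week infectious period (k = 1/7 per day, in line with the rough figure above), and a hypothetical initial infected fraction of 10⁻⁴. The function names are introduced here for illustration only.

```python
def simulate_sir(r0, k, i0=1e-4, days=365, dt=0.1):
    """Integrate the basic SIR model with forward Euler steps.

    r0 : basic reproduction number
    k  : recovery rate per day (1/k is the mean infectious period)
    i0 : initial infected fraction (hypothetical value, assumption)
    Returns a list of (t, S, I, R) tuples; S, I, R are population fractions.
    """
    beta = r0 * k  # transmission rate implied by R0 = beta / k
    s, i, r = 1.0 - i0, i0, 0.0
    t = 0.0
    path = [(t, s, i, r)]
    for _ in range(int(days / dt)):
        new_infections = beta * s * i * dt
        new_recoveries = k * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        t += dt
        path.append((t, s, i, r))
    return path


def summarize(path):
    """Peak infected fraction, day of the peak, and final recovered fraction."""
    peak_t, _, peak_i, _ = max(path, key=lambda row: row[2])
    return {
        "peak_infected": peak_i,
        "peak_day": peak_t,
        "final_recovered": path[-1][3],
    }


if __name__ == "__main__":
    k = 1.0 / 7.0  # infectious period of roughly one week
    for label, r0 in [("R0 below one", 0.8),
                      ("influenza-like", 1.5),
                      ("COVID-19-like", 2.5)]:
        print(label, r0, summarize(simulate_sir(r0, k)))
```

With R₀ below one the infected fraction never rises above its starting value, while moving from 1.5 to 2.5 produces an earlier, higher peak and a larger final recovered fraction. A real analysis would use a proper ODE solver and the extra compartments described above.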

The non-pharmaceutical intervention period

Let us turn to the three possible strategies to modify the aforementioned contact rate, because when the house is on fire, and neither medicinal products nor vaccines are yet available, NPIs are all that one has got. The first strategy is suppression. It essentially means that the reproduction number is forced below one by imposing severe contact restrictions at the population level, as was done in China outside of Hubei (in Hubei, where the Chinese authorities were taken by surprise, this was at first not possible). Of course, a large fraction of the population is then kept in the susceptible state and hence they do not contribute to the build-up of herd immunity. As a consequence, measures should be put in place to prevent the epidemic from flaring up after measures are relaxed, while monitoring very effectively so that, if it does, suppression measures can be enacted again. Clearly, China is in this situation, and will be until vaccines and medication are available. Cheap, widespread, sensitive and specific diagnostic tools help maintain control, potentially supported by electronic means such as smartphone apps, as well as contact tracing and isolation [57]. Needless to say, international travel in and out of susceptible regions is and remains problematic. Suppression is only possible when the viral spread is radically suppressed at an early stage. However, SARS-CoV-2 has stealth characteristics. Its incubation period is relatively long, with a very infectious period near the end of the incubation period [67]. To aggravate matters, there is a large fraction of pre-symptomatic and asymptomatic but infectious cases (possibly 40–50%, although estimates vary widely and could be even higher). These characteristics, combined with a high reproduction number, make the epidemic resemble a bush fire: one match is sufficient to ignite it, after which the fire starts to spread at ground level, invisible to the naked eye until it suddenly evolves into an all-out fire.
For this reason, in Europe, suppression was not a viable option after the initial outbreak in Northern Italy. The second strategy is mitigation [113]. This was practiced by nearly all European countries during their first wave, to greater (Italy, Spain) or lesser (Sweden) degrees. Here, measures are taken to bring the reproduction number down, such as reducing the number and nature of social contacts to a pre-specified level, so that the epidemic is slowed sufficiently and the number of critically ill cases at any time can be handled by the health care system. The difference between suppression and mitigation is that the latter aims to build up herd immunity, in such a way that the health care system is able to cope. It can be supplemented by a temporary capacity increase of the system (e.g., field hospitals, annexes to existing hospitals). The third strategy, or perhaps absence thereof, is counting solely on herd immunity [97]. It will typically produce a shorter epidemic than with mitigation, and afterwards the population will be immune at the population level. That is, the fraction of recovered people (immune for a certain time, e.g., the rest of the season) will be large enough, i.e., above the critical threshold, so that the re-emerging virus will not find enough susceptible population members to push the reproduction number above one, and the epidemic will soon decrease and become seasonal (where transmission is typically increased during winter months, as it is for influenza virus and other, more benign, betacoronaviruses, such as hCoV-OC43). It was anticipated, in early March 2020, that mitigation in a country or region would lead to a population with roughly 30% of recovered, hence immune, members, whereas herd immunity could lead to 60–70% immunity, at least in the absence of clusters [13]. The latter is sufficient to prevent further outbreaks, or to ensure that they would be short-lived.
That is, provided that immunity is sufficiently strong and sufficiently long-lasting. Unfortunately, none of this has played out as anticipated. Sero-prevalence has been building up depressingly slowly [58]. In Belgium, sero-prevalence was estimated at roughly 3% by the end of March, 6% mid-April, 7% mid-May, and back down to 6% and 5% in early June and July, respectively. This points to waning of IgG antibodies, whose detection was already hampered by a long delay in the onset of detectability [11] and by relatively poor sensitivity. At the time of writing, this suggests that other aspects of immunity, such as T-cell immunity, need to be scrutinized [3]. A key limitation to herd immunity strategies is the high fraction of critically ill patients, leading to overburdening of the health care system, and the high IFR [88]. In the Belgian non-nursing home population, the IFR is about 0.4%, but this figure masks the strong age gradient, with an IFR close to 0% in the population under 25, but rising to 2.5% in the 85+ population outside of nursing homes, and to 35% for the 85+ in nursing homes. Not surprisingly, the death toll in nursing homes is very large (two thirds of the nearly 10,000 COVID-19 related deaths in Belgium are among nursing home residents). This has been observed in a large number of countries around the globe [32]. The death toll has been quoted as an argument for why lockdowns and other NPIs are unavoidable. In Europe, an estimated 3 million deaths have been avoided by lockdown measures [46]. For Belgium, this boils down to a figure between 50,000 (with a coping health care system) and 250,000 (for a strongly overwhelmed health care system). How to proceed with the mitigation strategy when the peaks in the relevant curves lie in the past? Given the large reproduction number (super-spreading virus combined with a long infectious period), relaxation of NPIs needs to be done with utmost care.
Re-emergence of the epidemic is likely as the virus will have built up reservoirs already. Reservoirs take the form of animal species that harbor the virus during time periods when there is no human epidemic (e.g., geese and pigs in the case of influenza). Changing tactics and opting for herd immunity is extremely difficult because it would undo the effects of NPIs, including the hardships they will have induced. It is only a viable strategy if supplemented with sufficiently promising PIs (antiviral medication and vaccines). While pharmaceutical breakthroughs are happening at an unprecedented speed, it is unrealistic to expect major relief from this end in less than a year. It is more realistic to move towards suppression, or a combination of mitigation and suppression when the epidemic is sufficiently under control, i.e., when the number of new infections falls below a certain level. At that point, contact tracing and quarantine measures, needed for suppression, become a viable strategy, supported by increased reliability and capacity of diagnostic testing, and by the use of electronic tracing (e.g., based on apps) in addition to human tracing (by infectiologists and health inspectors). A final but extremely important aspect is whether or not contact between populations will be possible in periods when there are no peaks or outbreaks. The answer is that this could well be detrimental. Not only is travel itself a risk factor, as is clear from the early introductions around the globe, but contact between populations in different epidemic stages is complex. China's cautious protection of its borders after its initial peak, as well as Europe's initially prudent but now complex international travel situation, even within the Schengen zone of the European Union, are cases in point. Inevitably, new outbreaks will keep emerging until immunity is sufficiently widespread or adequate vaccines are available.
Antivirals will not stop this but may prove important in turning mitigation strategies into a success [117]. Note that this provides an interesting link between NPIs and PIs, between mathematical modeling and the outcome of successful clinical trials. The seasonality of COVID-19 (and its successors in subsequent years, i.e., COVID-20, etc.) is poorly understood at this point, although Kissler et al. [72] provide useful predictions, based on knowledge from coronaviruses OC43 and HKU1. Coronavirus-induced diseases (typically, but not exclusively, the common cold) are seasonal, but less so than, for example, influenza. Kissler et al. [72] report that outbreaks are possible at any time of year, with more acute outbreaks in autumn and winter. Depending on the extent of (non-permanent) immunity, either annual or biennial outbreaks are more likely. Other scenarios would be possible if immunity is lifelong (i.e., outbreaks in cycles of 5 years or more). Also, cross-immunity with the other betacoronaviruses hCoV-OC43 and hCoV-HKU1 will play an important role in temporal SARS-CoV-2 dynamics.
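The critical threshold referred to in this section can be illustrated with a short calculation. The sketch below uses the classical homogeneous-mixing result that the epidemic declines once the immune fraction exceeds 1 − 1/R₀, and treats NPIs as a simple proportional reduction in contacts. Both simplifications are assumptions made here for illustration; they ignore clustering, age structure, and waning immunity, and the function names are hypothetical.

```python
def herd_immunity_threshold(r0):
    """Critical immune fraction 1 - 1/R0 under homogeneous mixing."""
    if r0 <= 1.0:
        return 0.0  # the epidemic dies out without any immunity
    return 1.0 - 1.0 / r0


def effective_r(r0, immune_fraction, contact_reduction=0.0):
    """Effective reproduction number given immunity and an NPI.

    contact_reduction is the proportional cut in effective contacts
    (0 = no intervention, 1 = no contacts at all); a crude assumption.
    """
    return r0 * (1.0 - immune_fraction) * (1.0 - contact_reduction)


def contact_reduction_needed(r0, immune_fraction=0.0):
    """Smallest proportional contact reduction that brings R down to one."""
    r_without_npi = effective_r(r0, immune_fraction)
    if r_without_npi <= 1.0:
        return 0.0
    return 1.0 - 1.0 / r_without_npi


if __name__ == "__main__":
    r0 = 2.5             # COVID-19 estimate quoted earlier in the text
    seroprevalence = 0.06  # order of magnitude of the Belgian estimates above
    print("threshold:", herd_immunity_threshold(r0))
    print("R with 6% immune, no NPIs:", effective_r(r0, seroprevalence))
    print("contact cut needed:", contact_reduction_needed(r0, seroprevalence))
```

For R₀ = 2.5 the threshold equals 60%, the lower end of the 60–70% range quoted above. With sero-prevalence in the single digits, the effective reproduction number stays well above one unless contacts are cut by more than half, which is why relaxation of NPIs needs such care.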

Modeling and monitoring the epidemic

Modeling

Jewell [66] underscores the importance of high-quality mathematical and statistical models for epidemics. Using such models, key epidemiological quantities are estimated: numbers of infected cases, hospitalizations, people in ICU, and deaths. Some models also permit short-term, medium-range, and long-term predictions, and allow one to examine how such quantities change with changing human behavior and measures taken, such as social distancing, face masks, hygiene and, eventually, vaccination programs. It is useful to cast predictions according to a variety of scenarios, to inform policy makers, other scientists, and public opinion. Each model has its strengths and pitfalls, and simultaneously considering various models strengthens prediction. Some models operate at macro level (e.g., to study the number of cases in the population of an entire country), while others operate regionally or locally. Models are informed by data, mathematical infectious disease theory, and assumptions. Each model provides a piece of the jigsaw puzzle, and it requires a good amount of expertise and skills in infectious disease modeling to complete the entire puzzle. Model uncertainty and sensitivity analysis must accompany every modeling effort. Models, no matter how refined, will never be able to capture every detail of human behavior. In fact, there are striking examples of models that were poorly predictive because they ignored behavioral aspects, such as the need for college students to gather and party [24]. Also, important epidemiological quantities, such as the ones referred to earlier, are (typically) fully unknown at the onset of a pandemic. Over the first half year of the crisis, several quantities have been estimated with increasing precision, though sometimes with hiccoughs (e.g., the length of the pre-symptomatic period). Others, such as seasonality, remain hazy. The determination of immunity has been a roller coaster of progressing insight (see Section 5).
In a growth model, such as a logistic [127] or Richards model [102], hospital admissions, number of tests, etc. are used to compute how the spread of the virus evolves over time. This approach lends itself naturally to estimating how the growth factor of the epidemic changes according to measures taken, or under the influence of a changing testing strategy. Transmission trees aim at mapping the chain of infections among people [54]. One examines the genetic similarity of the virus among people or one makes use of contact tracing. For COVID-19, contact tracing was applied at the onset of the epidemic to find out in which region a person could have been infected. As the epidemic in March 2020 gained strength and the number of infected people increased, contact tracing was no longer feasible in Belgium. However, it is considered a vital component of a suppression strategy for second and later waves. Transmission trees are helpful to estimate key characteristics of SARS-CoV-2, such as the basic and effective reproduction number, the generation interval, i.e., the time lapse between infection events in a so-called infector-infectee pair, the serial interval, i.e., the time between symptom onset in the infector and in the infectee, and the incubation period. Based on COVID-19 data from China and Singapore, Ganyani et al. [47] were able to show that R₀ is larger when estimated from the generation interval as compared to the serial interval, pointing for the first time to pre-symptomatic infections, their associated risks, and implications for an exit strategy [28][94]. A meta-population model is a robust, large-scale model that allows one to incorporate people's mobility. It divides the population into groups based on age category, residence, etc. Each of these groups follows an underlying mathematical model for the spread of the epidemic. Such a model assigns people to the various compartments (susceptible, exposed, infected, recovered).
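As an aside on the growth models mentioned at the start of this paragraph, the sketch below evaluates a Richards curve for a cumulative count, such as cumulative hospital admissions. It uses one common parameterization, C(t) = K[1 + a·exp(−ra(t − tₘ))]^(−1/a), which reduces to the logistic curve for a = 1; other parameterizations exist, and the parameter names and numerical values are assumptions chosen for illustration, not estimates from any data set.

```python
import math


def richards(t, K, r, a, t_m):
    """Cumulative count at time t under a Richards growth curve.

    K   : final size (upper asymptote)
    r   : intrinsic growth rate per day
    a   : shape exponent; a = 1 gives the logistic curve
    t_m : turning point (day on which daily increments peak)
    """
    return K * (1.0 + a * math.exp(-r * a * (t - t_m))) ** (-1.0 / a)


def daily_increments(days, K, r, a, t_m):
    """New counts per day implied by the cumulative curve."""
    return [richards(d + 1, K, r, a, t_m) - richards(d, K, r, a, t_m)
            for d in range(days)]


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    K, r, t_m = 17000.0, 0.2, 30.0
    for a in (1.0, 0.5):
        inc = daily_increments(90, K, r, a, t_m)
        peak_day = max(range(len(inc)), key=lambda d: inc[d])
        print("a =", a, "peak of daily increments near day", peak_day)
```

In practice the parameters would be fitted to observed counts, for instance by least squares or maximum likelihood, and refitted as measures or testing strategies change, so that shifts in the estimated growth rate can be tracked over time.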
By mimicking interaction between such groups according to various scenarios (e.g., few or many contacts with people outside the household, little or more mobility between towns, etc.), it is possible to predict how the number of infected people changes over time, in the short run as well as over longer time intervals. Important sources of information are the number and the nature of social contacts of people in various age categories, the mobility patterns of people in different regions, etc. An individual-based model, based on the number of hospitalizations, performs well in terms of (1) describing the spread of the disease, and (2) examining the consequences of relaxing the measures taken, i.e., candidate exit strategies. In such a model, each individual is assigned to a family, a school category, type of workplace environment, and the population at large. This assignment is guided by data available from school registries as well as employment data. The model mimics behavior of individuals on a day-by-day basis. It accounts for changes in behavior on weekend days relative to weekdays, during holiday periods and, importantly, also as a result of measures taken, such as school closure and reduced social contacts. When investigating the consequences of exit strategies (e.g., reopening of schools and certain workplaces), the team also examines the added value of household bubbles (i.e., a combination of members from multiple households, matched to have a similar structure in terms of age, composition, etc.), allowing repeated contacts within bubbles and ensuring a reduction in community-level mixing, and of contact tracing to monitor and avoid new infections. 
Although simple deterministic compartmental models, such as the basic SIR model introduced previously, were used in the initial phase of the pandemic, they abstract away some important properties of both the pathogen and the infected host. An additional layer of complexity is needed to adequately describe the dynamics of COVID-19 and to make reliable predictions of the future course of the epidemic. As more and more evidence accumulated throughout the progression of the pandemic, it became clear that age-specific differences exist in susceptibility to infection, infectiousness upon infection, probability of being symptomatic and disease severity, thereby leading to large differences in hospitalization and mortality risks upon contracting SARS-CoV-2. Research groups with ample experience in infectious disease modeling were well equipped to expand and refine existing models for disease spread to account for such complexities. In a Belgian context, an individual-based model (STRIDE) previously developed for influenza [48,75,136] was adapted to describe COVID-19 dynamics in the Belgian population, a meta-population model accounting for mobility patterns was adapted to study the impact of exit strategies in the aforementioned setting [28], and a stochastic age-structured compartmental model was designed and specifically tailored to the spread of SARS-CoV-2, following earlier related work on asymptomatic infections and their role in disease spread [104]. On top of that, preparedness after previous epidemics, such as but not limited to the Ebola virus epidemic in West Africa (2013–2016), and experience in modeling infectious disease dynamics under pressure allow one to go beyond the application of simple models. Complexities imposed by intervention measures taken, such as stringent lockdown measures, and their impact on social contact behavior, pose additional challenges for modeling. 
Consequently, there is a need to directly relate the spread of the disease to social contact behavior and to inform transmission rates using social contact data. All of the approaches mentioned before (individual-based, meta-population and stochastic models) rely on such social contact data, besides other sources of information, to calibrate and relate these models to the given epidemiological situation. Needless to say, model outputs and predictions require continuous fine-tuning and validation. Long-term predictions, while very useful [72], should be seen as plausible scenarios at best, demonstrating the impact of assumptions and variations in behavior. A collection of the aforementioned statistical and mathematical models developed by the team at the Universities of Hasselt, Antwerp and Leuven in Belgium can be found at www.simid.be and https://www.uhasselt.be/dsi-covid19-en. While not always obvious, there are clear links between statistical and mathematical modeling of the epidemic, and COVID-19 clinical trials research. A convincing illustration is found in Torneri et al. [117], who establish the vital role of antiviral medication in local outbreak control. In other words, non-pharmaceutical and pharmaceutical interventions can form a virtuous couple. In retrospect, when a number of predictions have been cast, under a variety of scenarios, at most one of these will come close to what actually happened, at least for the country or region for which it was intended. But, in a pandemic, countries and regions around the globe, with varying characteristics, all exhibit their own curves. For example, the Southern and Western states in the US exhibit a very different curve than the Northeastern states (cf. https://coronavirus.jhu.edu/map.html). 
While care needs to be taken when comparing an observed curve with a prediction intended for a different geographical entity (or subpopulation), it is useful information for epidemic monitoring as well as for future model refinement and calibration, not only for future pandemics, but also for subsequent peaks of the ongoing one.
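
To make the compartmental logic described above concrete, the following is a minimal sketch of a deterministic SEIR model (susceptible, exposed, infected, recovered), integrated with a simple forward-Euler scheme. It is not one of the models used by the authors; the function name `simulate_seir` and all parameter values are illustrative assumptions, chosen only to show how a reduction in the contact rate changes the epidemic curve under two scenarios.

```python
def simulate_seir(beta, sigma, gamma, population, initial_infected, days,
                  steps_per_day=10):
    """Forward-Euler integration of a deterministic SEIR model.

    beta  : transmission rate per day (hypothetical value supplied by caller)
    sigma : rate of leaving the exposed compartment (1 / latent period)
    gamma : rate of leaving the infectious compartment (1 / infectious period)
    Returns a list of (S, E, I, R) tuples, one per day, including day 0.
    """
    dt = 1.0 / steps_per_day
    s = float(population - initial_infected)
    e = 0.0
    i = float(initial_infected)
    r = 0.0
    trajectory = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        trajectory.append((s, e, i, r))
    return trajectory


# Illustrative, made-up parameters (not estimates from the paper).
common = dict(sigma=1 / 3, gamma=1 / 5, population=1_000_000,
              initial_infected=10, days=200)
baseline = simulate_seir(beta=0.6, **common)     # scenario without measures
distancing = simulate_seir(beta=0.3, **common)   # contact rate halved

peak_baseline = max(state[2] for state in baseline)
peak_distancing = max(state[2] for state in distancing)
print(f"Peak infectious, baseline:   {peak_baseline:,.0f}")
print(f"Peak infectious, distancing: {peak_distancing:,.0f}")
```

Age structure, mobility between regions, and stochasticity, which the text identifies as essential for COVID-19, are deliberately left out of this sketch.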

Nowcasting and early warning

Modeling the event history of COVID-19 is important for public health policy, especially towards critically ill patients [130]. Event history analysis includes studying the timing (or delay) between different events: infection, symptom onset, confirmed case, hospitalization, recovery and death. First, due to the incubation period and the delay of reporting and/or hospitalization, the impact of intervention measures is only observed after several days. For example, if the sum of the incubation period and delay of reporting is 10 days, then we expect to see an impact of the interventions on the number of confirmed cases after 10 days. However, as the delay time varies from individual to individual, the effect of the intervention is spread over several days. The delay distributions of the incubation period [81,100] and of the time between symptom onset and hospital admission [41,126] are therefore crucial, as is understanding the heterogeneity in the delay times among individuals. Good knowledge of such delay distributions allows one to back-calculate the number of newly (symptomatic) infected cases, known as nowcasting, from either the number of confirmed cases or hospitalised cases, and assess the impact of intervention measures. Second, the length of stay in hospital is important, which varies among individuals and among countries due to different health systems. Information about the length of stay in hospital is important to predict the number of required hospital beds, both in general wards and in the ICU, and to track the burden on hospitals [126]. Individual-specific characteristics, such as sex, age, comorbidity, and frailty of the individual, can explain differences in length of stay in the hospital and are therefore important to correct for. The estimation of the length of stay is complicated by the truncated and interval-censored nature of the data collected during the unfolding epidemic [41]. 
Third, the time delay from infection and illness onset to death is important for the estimation of the case fatality ratio [35]. A naïve case fatality ratio based on the proportion of reported deaths to reported cases during an outbreak is generally biased: the delay between case and death incidence pushes the estimate downwards, whereas underreporting of cases pushes it upwards. An early warning system to monitor COVID-19 trends and forecast increases of the hospital burden is essential in times of a pandemic [73]. Multiple data streams are used as predictors of increases at the national and provincial level in Belgium. The mobility of individuals (tracked via mobile phone data), absenteeism at work, the number of patients with respiratory diseases visiting their general practitioner and the proportion of positive tested cases are important predictors for the immediately following two-week period [42]. This is especially relevant at crucial times during an epidemic with multiple waves. Nowcasting is essential at the onset of the epidemic and when the curve begins to flatten and a peak is reached. It is also relevant when the rate of decrease slows, an often missed signal. While a decreasing curve is qualitatively a favorable evolution, it is important to constantly monitor the rate of decrease: if the decrease slows down while the curve is still at a relatively high level, it might be an early sign that the decrease will eventually stop and that the curve will, unfortunately, start to increase again.
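
The way a delay distribution smears out the effect of an intervention can be illustrated with a short sketch. Expected reported cases are obtained here by convolving daily infections with a discrete delay distribution; nowcasting amounts to inverting this relation. The infection counts, the intervention day, and the delay distribution below are all hypothetical and chosen only for illustration.

```python
def convolve_with_delay(infections, delay_pmf):
    """Expected reported cases per day.

    infections : daily numbers of new infections
    delay_pmf  : delay_pmf[d] = probability that a case is reported
                 d days after infection
    Cases whose reporting day falls beyond the observed window are dropped,
    which mimics the right-truncation of data during an ongoing epidemic.
    """
    n_days = len(infections)
    reported = [0.0] * n_days
    for day, count in enumerate(infections):
        for delay, prob in enumerate(delay_pmf):
            if day + delay < n_days:
                reported[day + delay] += count * prob
    return reported


# Hypothetical scenario: 100 infections per day, halved by an
# intervention that takes effect on day 20.
infections = [100.0] * 20 + [50.0] * 20

# Hypothetical delay from infection to reporting: 8 to 12 days,
# centred on 10 days.
delay_pmf = [0.0] * 8 + [0.1, 0.2, 0.4, 0.2, 0.1]

reported = convolve_with_delay(infections, delay_pmf)
for day in range(26, 34):
    print(day, round(reported[day], 1))
```

In this sketch, reported cases remain at the pre-intervention level until day 27 and only reach the new level on day 32, even though the intervention took effect on day 20.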

Mortality reporting

Mortality among COVID-19 patients is relatively high when measured by IFRs [[51], [88], [144]]. The overall IFR is estimated around 0.6% in many countries, but is very strongly age dependent, and the risk is higher for males than for females. This was clear even from early reports [141]. In a pandemic like the current one, it is not uncommon to have (at least) double mortality reporting. For example, in Belgium, Statistics Belgium reports overall mortality, while the Belgian health institute Sciensano reports COVID-19 mortality. Excess mortality can be deduced from overall mortality, providing an alternative estimate of COVID-19 mortality, and perhaps a better one than reported COVID-19 mortality [6]. Hence, this is a place where official statistics, epidemiology, and demography meet. Bustos Sierra et al. [16] and Molenberghs et al. [88] from a Belgian perspective, and Aron et al. [6] from an international standpoint, reported that Belgium's excess mortality agrees very closely with COVID-19 mortality. This is because Belgium reports not only confirmed COVID-19 deaths in hospitals, but also suspected deaths regardless of the place of occurrence. In contrast, these authors found that in the Netherlands the reported COVID-19 mortality accounts for only 62% of excess mortality. Arguably, excess mortality, when carefully teased out from overall mortality, is a better estimate of COVID-19 mortality than reported COVID-19 mortality itself. For example, the number of deaths per million on July 4, 2020, was 843 in Belgium, 650 in the UK, 607 in Spain, 576 in Italy, 458 in France, and 357 in the Netherlands. But, after correction for underreporting, these figures become 1012 for Spain, 860 for Italy, 813 for the UK, 766 for Belgium, 575 for the Netherlands, and 472 for France (https://github.com/owid/covid-19-data/tree/master/public/data). Something that has become saliently clear is the very steep IFR curve as a function of age [71,88]. 
This, combined with the superspreading context that nursing homes provide has led, in many countries, to a huge death toll in such settings [32], which has in turn triggered dedicated epidemiological research.
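
The effect of the correction on country comparisons can be checked with a few lines of arithmetic. The sketch below uses only the per-million figures quoted above; interpreting the ratio of reported to corrected deaths as the share captured by official COVID-19 reporting is an assumption made here for illustration.

```python
# Deaths per million on July 4, 2020, as quoted in the text.
reported = {"Belgium": 843, "UK": 650, "Spain": 607,
            "Italy": 576, "France": 458, "Netherlands": 357}

# Figures after correction for underreporting, as quoted in the text.
corrected = {"Spain": 1012, "Italy": 860, "UK": 813,
             "Belgium": 766, "Netherlands": 575, "France": 472}

# Ratio of reported to corrected deaths (assumed interpretation:
# the share of corrected mortality captured by official reporting).
coverage = {country: reported[country] / corrected[country]
            for country in reported}

ranking_reported = sorted(reported, key=reported.get, reverse=True)
ranking_corrected = sorted(corrected, key=corrected.get, reverse=True)

for country in ranking_corrected:
    print(f"{country:12s} reported={reported[country]:5d} "
          f"corrected={corrected[country]:5d} "
          f"ratio={coverage[country]:.2f}")
```

For the Netherlands the ratio is 357/575, or about 0.62, in line with the 62% mentioned above, and the country ranking changes once the corrected figures are used.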

Prevalence determination and other surveys

Unlike testing and tracing, which is aimed at finding as many new cases as possible, prevalence determination is aimed at reliably estimating what fraction of the population is recovered and hopefully immune. Apart from the viral and immunological issues related to prevalence determination, it should be done based on representative samples. Hence, sample survey methods can be used, although often alternative methods are used. Prevalence determination is important to gauge IFRs and to assess whether or not herd immunity is building up.

Prevalence determination

An obvious way of prevalence determination is by means of the sero-prevalence, based on the detection of antibodies in blood serum samples. Herzog et al. [58] proceeded via a nationwide cross-sectional survey of residual blood samples tested for the presence of Immunoglobulin G (IgG) antibodies against SARS-CoV-2. This method, as we know now, is ridden with a number of issues, such as time to IgG seroconversion, detectability, and waning [[11], [52], [63]]. In Belgium, sero-prevalence around April 1, 2020, was around 3%, three weeks later it was 6%, rose to nearly 7% mid-May, and then started to drop to 5.5% (around June 10) and even 4.5% around July 1. In other words, as mentioned in the literature [83], waning of IgG antibody concentrations is also observed in this sero-epidemiological study, and the primary route for immunity may not be these antibodies but rather T-cell mediated immunity or other antibodies not measured so far. As a consequence, IFR determination and the status of a population's immunity are referred back to the drawing board and the interpretation of (serial) sero-prevalence studies have to be reconsidered. Evidently, the decrease in seroprevalence implies that the status but also extent of immunity may be very different when based on T-cell mediated and humoral immunity responses. Also, cross-immunity with endemic coronaviruses, especially beta-coronaviruses such as hCoV-OC43, is a relevant study subject, but one about which there is little or no knowledge available [72]. Note that different survey sampling methods and different sub-populations considered (e.g., blood donors, or people spontaneously reporting at hospitals) may well yield different estimates. Apart from immunological issues with prevalence determination, the quality of the representative sampling method used influences the reliability of the findings.

The role of public opinion surveys

It is important to keep the finger on the pulse of the public opinion, for various reasons. Well-conducted surveys are vital to get a feel for how the population perceives risk, the impact of measures taken, acceptance and compliance to NPIs, etc. At the same time, it can be a component of an early warning system (Section 3.2) if the occurrence of symptoms is queried. One such example is the “Big Corona Study” (see also [93]), an online survey that can be filled in by all members of the public on every Tuesday since March 17, 2020; from June 2, 2020 onwards, the survey shifted to a bi-weekly frequency. It collects data about public adherence to measures taken by the government, contact behavior, mental and socio-economic distress, and spatio-temporal dynamics of COVID-19 symptoms' incidences. While public participation is useful as a low-cost method to collect timely information within the context of a pandemic, caution should be exercised at the analysis stage; online surveys, based on self-reporting, often do not reach every societal group equally [2]. It typically causes response rates to vary among citizens of different ages, genders, cultures and economic statuses. This is particularly the case in 2020, where the perception of the seriousness of the COVID-19 pandemic varies considerably between individuals and has become politically coloured. This then translates to increased difficulties to correct for unrepresentative samples, even after standardization methods such as inverse probability weighting are performed. In essence, these problems all relate to non-random missingness patterns [89], where the absence of information is driven by complex processes. These processes do not lie far from opportunistic sampling phenomena that often occur in biodiversity studies that make use of citizens to collect data [92]. 
For example, using such surveys to pinpoint areas of increased disease incidence necessitates careful investigation, since response rates' spatial dynamics may be stochastically dependent on the underlying spatial process that generates heterogeneity in the symptoms' incidences. If present, this opportunistic sampling phenomenon, termed preferential sampling [34], invalidates statistical inference on the spatial dynamics of COVID-19 symptoms. This can be accommodated by using a shared latent process approach where a geostatistical binomial model for the proportion of participants of each Belgian municipality that experiences COVID-19 symptoms shares a spatial random effect with a model for the response rates. The result of this approach is shown in Fig. 1 , which depicts predicted symptoms' incidence, corrected for preferential sampling, using data of 397,529 individuals collected during the third round of the “Big Corona Study” (March 31, 2020).

Fig. 1

Predicted probabilities for a citizen to experience at least one key COVID-19 symptom per municipality, based on extensions of a shared latent process model that corrects for preferential sampling [93].

The above is an example of how survey sampling methods, citizen science, and spatial statistics come together to gauge the public opinion regarding COVID-19 and, in turn, to inform policy makers. Unsurprisingly, several surveys are undertaken simultaneously. For example, the Belgian health institute Sciensano has conducted several waves of a COVID-19 Health Interview Survey [106]. This study has a longitudinal component; participants can indicate whether or not they are willing to have their responses linked across waves. Smaller scale (longitudinal) surveys on the public's perceived vulnerability and acceptance of measures are undertaken too [31]. A general perspective on the role of social and behavioral science in the response to COVID-19 research can be found in Van Bavel et al. [124]. In many countries, all such surveys take place in an ad hoc fashion. It can be beneficial, though, to make use of a permanent (online) representative panel for public opinion research. Such a panel exists in the Netherlands [65]. Catalyzed by the current pandemic, a panel of this type is likely to be initiated in Belgium as well.

Diagnostic and serological testing

The battle against a novel emerging pathogen such as COVID-19 requires the development of a rigorous screening strategy to detect the virus, with the objective to mitigate its public health impact and to bring the pandemic under control. Aiming to achieve a rapid scale-up of diagnostic testing capacity has rarely, if ever, been attempted at the current pace [118]. Testing is not merely an instrument to diagnose a given individual and to determine individual-level risk factors, it is also a prerequisite to a proper disease surveillance system, serving in monitoring and managing the epidemic. Testing allows unraveling a number of key uncertainties concerning the epidemic, such as the number of infected people, or the proportion of the population that is effectively immune against the virus. Early literature [80], i.e., from the first quarter of 2020, is a testimony that at first, diagnostic instruments for SARS-CoV-2 were lacking and needed to be developed in a speedy fashion. The SARS-CoV-2 tests that were developed since the start of the COVID-19 outbreak can broadly be categorized in so-called real-time (diagnostic) reverse-transcriptase PCR (RT-PCR) and serological tests. Patients with symptoms are often diagnosed based on RT-PCR tests allowing the detection of viral nucleic acid in oropharyngeal or nasopharyngeal swabs. Such tests identify whether someone has the virus. Serological tests on the other hand, determine the presence of antibodies. With the advent of COVID-19, new serological tests have been emerging, creating new opportunities for an assessment of the SARS-CoV-2 epidemic. Serologic tests are most of the time ineffective at detecting early stages of the infection, since antibody titers only gradually increase days or weeks after infection, but are able to detect past infections providing, in theory, an indication of the proportion of the population that has been infected with the virus, at least when lifelong humoral immunity is conferred. 
Serological analysis may be useful to actively identify close contacts, to define clusters of cases and link them retrospectively in order to delineate transmission chains and ascertain how long transmission has been ongoing, or to estimate the proportion of asymptomatic individuals in the population [137]. Serological tests help to understand the epidemiology and to evaluate vaccine responses, but the reliability for diagnosis in the acute phase of illness and the assumption of protective immunity have been questioned [107]. Detection capabilities of tests may further depend on the delay since the onset of the infection or symptoms [11]. Furthermore, higher antibody levels do not necessarily correlate well with an increase in protection against reinfection. Despite their value, serological tests do not allow, given the many current unknowns and uncertainties, to confirm whether or not a person is contagious or protected against the virus, unless a correlate of protection is well-established. In other words, they do not allow the delivery of an “immunity passport”. In the initial phase of an epidemic, knowledge on diagnostic test performance is scarce and not fully reliable. Samples are usually collected from a limited number of patients, and negative controls are not always present. A correct assessment of the limitations and performance of each of these tests is nevertheless crucial to demonstrate their accuracy and clinical utility and to design a correct testing strategy. The performance of a diagnostic test is typically characterized by its sensitivity and specificity. RT-PCR tests are considered reliable for detecting the presence of the virus, and are considered the standard by some, despite a non-negligible rate of false negative results, i.e., a low sensitivity, in some circumstances (see, for example, [[128], [133]]). False negatives can complicate governmental decisions to lift confinement restrictions. 
False-negative results have an impact on the manner in which serological testing might be used to support non-pharmaceutical interventions, as well as implications for the development of large-scale testing pathways [96]. The current evidence about the diagnostic accuracy of COVID-19 serology tests is characterized by high risks of bias and heterogeneity, with limited generalizability to outpatient populations [8]. A full comparison of the performance of serological tests has not yet been conducted on a large set of identical samples. The duration of antibody rises is currently unknown, and the utility of these tests for public health management purposes has been reported as uncertain [33]. Variation in performance characteristics between assays indicates the urgent need for evaluation of the large number of SARS-CoV-2 serology tests that have become rapidly available [96]. Evaluating the performance of diagnostic tests is usually based on comparing test results with a gold standard, but such a “perfect test” is often unavailable. Moreover, even if the diagnostic sensitivity and specificity are considered fixed values, intrinsic to the diagnostic test (i.e., constant and universally applicable), many examples illustrate that these values can fluctuate depending on the context [108]. Estimations of test characteristics are often obtained from studies under well-controlled conditions. The sensitivity of RT-PCR tests used for the diagnosis of COVID-19 may, for example, depend on factors such as the type of specimen, the timing of sampling and the sampling technique [125]. Yet, quantifying the performance of a given test in real-world conditions is essential when interpreting test results, measuring its predictive value, or when choosing a test for a specific use case: screen asymptomatic patients, monitor contacts, identify clusters, support contact tracing, and as a preventive measure. Hitchings et al. 
[60] explain how the so-called test positive fraction correlates with the incidence in a given population, turning this into a useful surveillance tool. In hospital settings, sensitive and specific diagnostic tests for active infection with SARS-CoV-2, allow guiding the care for individual patients, but a fast and repeated testing strategy at the expense of e.g. a lower test sensitivity may be more effective as a public health strategy [77]. A public health strategy – with the goal to reduce transmission - may indeed ask for the use of rapid tests, removing the focus from the usual dogma of high sensitivity and specificity towards a test to be practically useful, also accounting for factors such as costs, speed, and logistical constraints. A proper evaluation of diagnostic performance in the absence of a gold standard can be done by using latent class models, which do not require a priori knowledge of the infection status. Umemneku Chikere et al. [119] provide an overview of these and other models that allow using the combined information of multiple different tests applied on the same samples and Kostoulas et al. [74] present standards for the reporting of such diagnostic accuracy studies. Models used to analyze the results of multiple diagnostic tests assume that there is an unknown prevalence, sometimes referred to as a latent class, and that the sensitivity and specificity of the diagnostic tests are unknown. This “latent prevalence” can then be linked to the apparent prevalence (i.e., the observed proportion of positive results of the diagnostic tests) through a set of equations allowing estimating all parameters at stake (i.e., prevalence, sensitivities and specificities of each of the tests used) [108]. Further context on issues surrounding diagnostic tests is given in Tang et al. [112]. Once diagnostic tools are available and properly evaluated, their use may be hampered by constraints such as a lack of reagents, limited laboratory capacity, and personnel. 
Pooling samples may be used to address this concern, increasing the number of individuals tested with an available number of tests and providing a cost-effective alternative to individual testing. Over the years, an entire body of research has indeed been developed around group testing in a diagnostic context, for example when resources are scarce and/or under time pressure [25]. This is precisely the situation we are confronted with in the current pandemic, creating an opportunity to roll out and test reliable and new methodologies (see, e.g., [109]). It is another example where existing and seasoned methodology can be tailored to differing circumstances, such as the need for repeated testing, as described by Augenblick et al. [7]. Test results can be compared with the results from non-pharmaceutical components of early warning systems (Section 3.2). Knowledge on the test characteristics can be used and integrated when interpreting survey results (Section 5.2).
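
The link between true and apparent prevalence mentioned above can be illustrated in its simplest form: a single test with known sensitivity and specificity. The sketch below is not a latent class model, in which prevalence, sensitivities and specificities are estimated jointly from several tests; it only shows the basic equation and its inversion, commonly known as the Rogan-Gladen estimator. The function names and numerical values are illustrative assumptions.

```python
def apparent_prevalence(true_prevalence, sensitivity, specificity):
    """Expected proportion of positive test results.

    True positives among the infected plus false positives among
    the non-infected.
    """
    return (true_prevalence * sensitivity
            + (1.0 - true_prevalence) * (1.0 - specificity))


def rogan_gladen(apparent, sensitivity, specificity):
    """Invert the relation above to estimate the true prevalence.

    The estimate is truncated to the interval [0, 1].
    """
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0.0:
        raise ValueError("the test must perform better than chance")
    estimate = (apparent + specificity - 1.0) / denominator
    return min(1.0, max(0.0, estimate))


# Hypothetical serological survey: 5% true prevalence, test with
# 90% sensitivity and 98% specificity (made-up values).
observed = apparent_prevalence(0.05, sensitivity=0.90, specificity=0.98)
recovered = rogan_gladen(observed, sensitivity=0.90, specificity=0.98)
print(f"apparent prevalence: {observed:.3f}")
print(f"corrected estimate:  {recovered:.3f}")
```

With these made-up values the apparent prevalence (6.4%) overstates the true prevalence (5%), because at low prevalence the false positives outweigh the missed infections.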

Vaccine development

While a number of effective vaccines have been developed over the last half century, such as for measles, rubella, smallpox, hepatitis B, Ebola, etc., vaccine development remains a challenging area. For example, no successful vaccine has been found so far for HIV [23]. Even the determination of the seasonal influenza vaccine, a yearly exercise, is one of hits and misses, due to the volatile nature of the influenza virus. Of particular importance to us is that traditionally coronaviruses (hCoV-229E, hCoV-NL63, and hCoV-OC43) have received little or no attention from a vaccine development standpoint. This changed for SARS-CoV-1 and MERS-CoV but in these cases there was no opportunity to put potential vaccines to the test [111]. While existing vaccine-constructs (e.g., adeno-based, adjuvants, etc.), in particular for influenza and the aforementioned coronaviruses, can provide a step-up for SARS-CoV-2, success is not automatically guaranteed. Because the general consensus is that the global population will be able to return to normalcy only after the development of effective vaccines and the implementation of large vaccination programmes, the challenge is to develop a vaccine at unprecedented speed. Evidently, global collaboration is essential. A candidate vaccine developed in one part of the world may have to be put to the test in another, depending on the succession of epidemic waves. The state of urgency poses ethical questions, such as whether one can, besides the traditional phase 3 efficacy studies, set up controlled human infection model (CHIM) studies where healthy subjects are infected to test a vaccine, while effective treatment may not yet be available. A further challenge is that vaccines need to be developed while the immunology associated with SARS-CoV-2 is still unclear, and knowledge is accumulating by trial and error. 
Several pharmaceutical companies have taken the unprecedented step to plan and build production capacity in parallel with candidate vaccine development and testing. A fascinating new chapter is currently being written to bring future vaccines to market; many lessons will be learned that fall beyond the scope of the present paper.

Clinical trials for COVID-19 patients

The amount of clinical research generated by the COVID-19 pandemic is mind-boggling: on June 15, 2020, a search of the ClinicalTrials.gov website with the keyword “COVID” returned more than 600 interventional studies currently recruiting patients [123]. For a more complete coverage of trials worldwide, the ReDO database listed 1144 interventional trials for the treatment of COVID-19 infected patients on June 26, 2020 [99]. Reassuringly, 825 (72%) of these trials were controlled and taking place in a hospital setting (because testing capacity was lacking outside of the hospitals at the start of the pandemic). It is beyond the scope of this paper to cover the various treatment approaches that are being tested against COVID-19, whether using repurposed drugs already in use for other indications, new drugs specifically developed against the virus, or non-drug treatments. The World Health Organization (WHO) published a useful classification of treatment types [138]. Here we focus on key features of some of the clinical trials that were designed and conducted in record time in the early days of the epidemic in Belgium.

Outcome measures

The natural history of most diseases is well established, and a consensus has in most cases been reached on outcomes that appropriately capture how a patient feels, functions or survives. COVID-19 infections were, at least initially, largely unknown, hence it was challenging to choose outcome measures that would be clinically relevant as well as statistically sensitive to treatment benefits. The best outcomes to use will undoubtedly emerge as the results of clinical trials begin to appear and clinicians have built experience on how to measure these outcomes. In large randomized trials for hospitalised patients such as RECOVERY (Randomized Evaluation of COVid-19 thERapY), all-cause mortality within 28 days was the primary outcome of interest [114]. While all-cause mortality is unquestionably the ultimate clinical outcome most therapies are trying to impact, cause-specific mortality could be more sensitive and also more relevant if (and only if) the treatments had no impact on deaths due to other causes. In practice both all-cause and cause-specific mortality are typically required to assess all treatment effects, and the designation of either one as the primary outcome may depend on the importance of competing risks of death. Other outcome measures of interest are time to invasive mechanical ventilation, and time to discharge. Besides time to clinically important events, the need to quantify the severity of the COVID-19 infection led to the definition of clinical progression scales. Table 1 shows one such ordinal scale with scores ranging from 0 to 10 [135]. Less granular ordinal scales have been used (e.g., with scores ranging from 1 to 5) with a similar intent. Various outcome measures can be defined using these scales, e.g. time to a score change (improvement or deterioration) of at least 2 points on the chosen scale, cumulative score or area under the score curve up to day 15, etc. 
Time will tell which scale and outcome measure are simple enough to be used effectively and sensitive enough to detect treatment benefits. Last but not least, inclusion of patient-reported outcomes (PRO) should be considered in trials of COVID-19 patients with prolonged follow-up [22].
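To make the scale-based outcome measures concrete, the following minimal sketch computes two of them from one patient's daily scores: the time to an improvement of at least 2 points and the area under the score curve up to day 15. The function names, the use of the trapezoidal rule, and the example trajectory are illustrative assumptions, not definitions taken from any trial protocol.

```python
from typing import Optional, Sequence


def time_to_improvement(scores: Sequence[int], min_change: int = 2) -> Optional[int]:
    """First day on which the score is at least `min_change` points below baseline.

    `scores[d]` is the ordinal score on day d, with day 0 as baseline.
    Lower scores are better on the 0-10 scale, so improvement is a decrease.
    Returns None when no such improvement is observed.
    """
    baseline = scores[0]
    for day, score in enumerate(scores):
        if baseline - score >= min_change:
            return day
    return None


def area_under_score_curve(scores: Sequence[int], last_day: int = 15) -> float:
    """Trapezoidal area under the daily scores from day 0 to `last_day` (assumed rule)."""
    window = scores[: last_day + 1]
    return sum((a + b) / 2 for a, b in zip(window, window[1:]))


# Hypothetical trajectory: admitted on oxygen (5), worsens to 6, then recovers.
example_scores = [5, 5, 6, 6, 5, 4, 4, 3, 3, 2, 2, 2, 1, 1, 1, 1]
example_time = time_to_improvement(example_scores)
example_area = area_under_score_curve(example_scores)
```

A deterioration-based outcome would mirror `time_to_improvement` with the sign of the difference reversed; how deaths and discharges are scored within the window is a design choice the sketch leaves open.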

Table 1

WHO clinical progression scale (ECMO = extracorporeal membrane oxygenation; FiO2 = fraction of inspired oxygen; NIV = non-invasive ventilation; pO2 = partial pressure of oxygen; SpO2 = oxygen saturation).

Patient State	Descriptor	Score
Uninfected	Uninfected; no viral RNA detected	0
Ambulatory mild disease	Asymptomatic; viral RNA detected	1
	Symptomatic; independent	2
	Symptomatic; assistance needed	3
Hospitalised: moderate disease	Hospitalised; no oxygen therapy	4
	Hospitalised; oxygen by mask or nasal prongs	5
Hospitalised: severe disease	Hospitalised; oxygen by NIV or high flow	6
	Intubation and mechanical ventilation with pO₂/FiO₂ ≥ 150 or SpO₂/FiO₂ ≥ 200	7
	Mechanical ventilation with pO₂/FiO₂ < 150 or vasopressors	8
	Mechanical ventilation with pO₂/FiO₂ < 150 and vasopressors, dialysis, or ECMO	9
Dead	Dead	10

Multi-arm designs

The main challenges when conducting clinical trials in the COVID-19 context are (a) the multitude of potential treatments, (b) the lack of patients in some regions to conduct several trials in parallel, (c) the pace at which new scientific insights become available, and (d) the push to use treatments based on incomplete preclinical development and unreliable clinical data. Hydroxychloroquine, for instance, made it into preliminary COVID-19 treatment guidelines without proper supporting evidence, thus undermining the use of untreated controls in clinical trials. This has forced statisticians and clinicians to search for flexible designs which allow including additional promising therapies or removing therapies which have been shown not to be effective, while simultaneously allowing for optimal use of the limited available patients and drugs. When two treatments A and B are to be compared to standard of care (SOC), a natural choice would be a randomized multi-arm study comparing A, B and SOC (leaving aside the potential difficulties of access to A and B at once). The advantage is that a single SOC group can be used rather than the two SOC groups which would be needed in two separate trials comparing A with SOC and B with SOC. However, classical multi-arm studies require all patients enrolled to be eligible for all treatments. In the COVID-19 context, a research treatment often has contraindications which prevent some patients from being randomized to that particular treatment, while still allowing them to be randomized to some of the other treatments under consideration. A possible solution is selective exclusion, in which each patient is randomized only among the treatments for which he or she is eligible. While such designs with selective exclusion have been described in the statistical and medical literature [68,78], the statistical analysis of such studies has not received much attention. As an example, consider a scenario in which patients are randomized to treatment A, B, or SOC in a (1:2:1) ratio. Interest is in comparing A with SOC, and B with SOC. 
Further assume that 10% of the population eligible for A and/or B is eligible for A only (subpopulation 1), while 30% is eligible for B only (subpopulation 2). The remaining 60% is eligible for both A and B (subpopulation 3). This situation is graphically shown in Fig. 2 . Out of 100 patients eligible for A or B, we expect 10, 30, and 60 subjects in subpopulations 1, 2, and 3, respectively. In each subpopulation, randomization is performed according to the appropriate ratios, i.e., (1:1), (2:1), and (1:2:1), in subpopulations 1, 2, and 3, respectively.

Fig. 2

Graphical depiction of two clinical trials with a common standard of care arm.

When analyzing the effect of treatment A versus SOC, only concurrent controls can be included. Hence the SOC patients from subpopulations 1 and 3 will be compared to all A patients from the same two subpopulations. However, in subpopulation 3, 50% of the patients received B, implying that subpopulation 3 is underrepresented in the comparison of A versus SOC. If the objective is to estimate the marginal effect of A versus SOC, i.e., the effect one would estimate in a separate controlled trial of A versus SOC, the patients from subpopulation 3 need to be reweighted by a factor 2, in order to restore the balance between subpopulations 1 and 3. The final analysis of A versus SOC is then a weighted analysis of the 2 × 20 patients from subpopulations 1 and 3 who received either A or SOC, in which each patient from subpopulation 3 gets a weight of 2. Likewise, the marginal effect of B versus SOC can be estimated using a weighted analysis of the 25 SOC patients and the 50 B patients from subpopulations 2 and 3, but the patients from subpopulation 3 need to be reweighted by a factor 4/3 in order to correct for the imbalance due to the removal of the 25% of patients on treatment A in subpopulation 3. Note that the gain of the design in Fig. 2 is that an expected 15 SOC patients, i.e., 25% of 60% of the study population, can be used twice, once in the comparison with A and once in the comparison with B. The gain obviously depends heavily on the eligibility criteria and on the randomization ratios used. Note also that the methodology can easily be extended to trials with more than two research treatments and to trials with adaptive designs allowing for adding new treatments or removing non-promising treatments.
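The expected patient numbers and the reweighting factors of this example can be reproduced with a few lines of exact arithmetic. The sketch below assumes, as in the example, that the weight of a subpopulation in a given comparison is the inverse of the probability of being allocated to one of the two arms being compared; the data structure and function names are illustrative.

```python
from fractions import Fraction

# Share of the eligible population and randomization ratio per subpopulation,
# as in the worked example (10% A only, 30% B only, 60% both).
SUBPOPULATIONS = {
    1: {"share": Fraction(10, 100), "ratio": {"A": 1, "SOC": 1}},
    2: {"share": Fraction(30, 100), "ratio": {"B": 2, "SOC": 1}},
    3: {"share": Fraction(60, 100), "ratio": {"A": 1, "B": 2, "SOC": 1}},
}


def expected_counts(n_total=100):
    """Expected number of patients per arm within each subpopulation."""
    counts = {}
    for label, sub in SUBPOPULATIONS.items():
        parts = sum(sub["ratio"].values())
        counts[label] = {
            arm: n_total * sub["share"] * Fraction(r, parts)
            for arm, r in sub["ratio"].items()
        }
    return counts


def comparison_weights(treatment):
    """Weight per subpopulation for the comparison of `treatment` versus SOC.

    The weight is 1 / P(allocated to `treatment` or SOC) in that subpopulation;
    subpopulations not eligible for `treatment` do not contribute.
    """
    weights = {}
    for label, sub in SUBPOPULATIONS.items():
        if treatment not in sub["ratio"]:
            continue
        parts = sum(sub["ratio"].values())
        used = sub["ratio"][treatment] + sub["ratio"]["SOC"]
        weights[label] = Fraction(parts, used)
    return weights


counts = expected_counts(100)
weights_a = comparison_weights("A")   # subpopulation 3 is weighted by 2
weights_b = comparison_weights("B")   # subpopulation 3 is weighted by 4/3
```

Changing the shares or ratios in `SUBPOPULATIONS` shows how strongly the number of shared SOC patients depends on eligibility and on the randomization ratios.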

Factorial designs

Factorial designs, a rare exception in trials sponsored by pharmaceutical companies who prefer to focus on a single therapeutic question, were suggested for situations in which more than one treatment could be tested simultaneously in the same patients. As an example of such a design, the COV-AID trial (Treatment of COVID-19 patients with Anti-Interleukin Drugs) simultaneously tested blockade of the Interleukin-1 pathway with Anakinra, and blockade of the Interleukin-6 pathway with either Siltuximab or Tocilizumab, in hospitalised adult patients with COVID-19 infection, acute hypoxia and signs of cytokine release syndrome. The factorial design is premised on the effectiveness of interleukin blockade to prevent hyperinflammation or auto-inflammatory syndromes in COVID-19 infected patients. Interestingly, in such a design, only 2 out of every 9 patients receive usual care while 7 receive usual care plus at least one experimental drug (see Table 2).
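The allocation fractions of such a factorial design follow from multiplying the margins of the two randomizations. The short check below assumes that the IL-1 and IL-6 allocations are independent and that the 2/3 allocated to IL-6 blockade is split equally between the two drugs, which is what the cell fractions in Table 2 imply; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Marginal allocation probabilities (assumed independent across the two factors).
il1_margin = {"none": Fraction(2, 3), "Anakinra": Fraction(1, 3)}
il6_margin = {
    "none": Fraction(1, 3),
    "Siltuximab": Fraction(1, 3),
    "Tocilizumab": Fraction(1, 3),
}

# Each cell of the factorial table is the product of its two margins.
cells = {
    (il1, il6): p1 * p6
    for (il1, p1), (il6, p6) in product(il1_margin.items(), il6_margin.items())
}

usual_care_only = cells[("none", "none")]      # 2/9
at_least_one_drug = 1 - usual_care_only        # 7/9
```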

Table 2

Factorial design to simultaneously test three drugs, two blocking IL-6 and one blocking IL-1.

IL-1 blockade	IL-6 blockade: No (1/3)	IL-6 blockade: Yes (2/3), Siltuximab	IL-6 blockade: Yes (2/3), Tocilizumab
No (2/3)	Usual Care: 2/9	Siltuximab: 2/9	Tocilizumab: 2/9
Yes (1/3)	Anakinra: 1/9	Anakinra + Siltuximab: 1/9	Anakinra + Tocilizumab: 1/9

Interim analyses and multi-stage designs

In view of the huge uncertainties associated with anticipated clinical outcomes as well as treatment effects, it was generally considered appropriate to include one or more interim analyses for safety and/or futility and/or efficacy in the trial designs. Group sequential trial methodology provides a well-known framework for incorporating as many interim analyses as deemed necessary while adequately controlling the probability of a type I error. Any substantial trial benefits from being monitored by an experienced IDMC (Independent Data Monitoring Committee), and in particular trials with interim analyses of efficacy; however IDMCs are in high demand and short supply, and the flurry of COVID-19 trials will not ease the current shortage. Adaptive design methodology was also considered, though its most common applications (choice of an optimal dose, increase in sample size, or enrichment in specific patient subsets) did not address the most acute need in COVID-19 trials, which was to allow seamless addition or dropping of treatment arms to reflect a changing therapeutic landscape. This is the objective of platform trials, such as the multi-arm multi-stage (MAMS) trials [134]. The PRINCIPLE trial (University of Oxford [120]), served as a model for the design of a similar trial in Belgium, the DAWN (Direct Antivirals Working against nCov) Ambulatory Care Platform trial. Logistical challenges in the setting of COVID-19 include the timely identification of eligible subjects, obtaining informed consent when isolation at home is needed, as well as the delivery of study medication. Initially, this trial will compare Camostat with standard of care in community dwelling adult patients who are at least 50 years old presenting with signs and symptoms compatible with COVID-19. The aim of this large pragmatic trial is to avoid hospitalization by using a well tolerated antiviral to rapidly treat patients at risk who have first symptoms of COVID-19. 
As in PRINCIPLE, the DAWN trial will use Bayesian posterior probabilities to add or drop treatment arms while the study is ongoing. Unlike PRINCIPLE, however, it will not use adaptive randomization, for there is neither a statistical advantage nor an ethical imperative to do so (see [59], with discussion). Instead, minimization can be used to allocate treatments in a constant ratio while allowing for several prognostic factors to be balanced across the treatment arms.
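As an illustration of how minimization balances several prognostic factors, the sketch below implements one common variant: each new patient is allocated, with high probability, to the arm that minimizes the sum of marginal imbalances over that patient's factor levels. This is not the allocation algorithm of the DAWN trial, whose details are not given here; the class name, the probability `p_best`, and the factor names are assumptions made for the example.

```python
import random


class Minimizer:
    """Sketch of minimization on marginal imbalances for a 1:1 (equal) allocation."""

    def __init__(self, arms, factors, p_best=0.8, seed=2020):
        self.arms = list(arms)
        self.factors = list(factors)
        self.p_best = p_best              # assumed probability of taking a best arm
        self.rng = random.Random(seed)
        self.counts = {}                  # (factor, level) -> {arm: patients allocated}

    def _imbalance_if(self, patient, arm):
        """Sum over factors of (max - min) arm counts if `patient` joined `arm`."""
        total = 0
        for factor in self.factors:
            key = (factor, patient[factor])
            counts = dict(self.counts.get(key, dict.fromkeys(self.arms, 0)))
            counts[arm] += 1
            total += max(counts.values()) - min(counts.values())
        return total

    def assign(self, patient):
        scores = {arm: self._imbalance_if(patient, arm) for arm in self.arms}
        best = min(scores.values())
        preferred = [a for a in self.arms if scores[a] == best]
        others = [a for a in self.arms if scores[a] != best]
        # A random element keeps the allocation unpredictable; ties are broken at random.
        if others and self.rng.random() >= self.p_best:
            arm = self.rng.choice(others)
        else:
            arm = self.rng.choice(preferred)
        for factor in self.factors:
            key = (factor, patient[factor])
            self.counts.setdefault(key, dict.fromkeys(self.arms, 0))[arm] += 1
        return arm


# Hypothetical use: two arms, two prognostic factors, fully deterministic variant.
minimizer = Minimizer(["Camostat", "SOC"], ["age_group", "sex"], p_best=1.0)
patients = [{"age_group": "50-64", "sex": "F"}] * 11
allocation = [minimizer.assign(p) for p in patients]
```

With `p_best=1.0` and identical patients, the arms alternate so that the counts never differ by more than one; in practice a value below 1 is usually preferred to limit predictability.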

Pragmatism in trial conduct

Perhaps the most impressive aspect of clinical trial activities during the pandemic was the collaborative pragmatism that naturally evolved in response to the crisis. Statisticians from academia, the public and the private sectors voluntarily contributed ideas and resources to come up with optimal trial designs to address the most critical clinical questions. Some of these collaborations pre-dated the pandemic, but many were improvised to respond efficiently to the most pressing needs. When it came to launching the trials, the usual delays and bureaucratic hurdles evaporated, and the trials could all be launched within a couple of weeks - instead of the several months usually required to fulfil all administrative requirements. While excessive speed may create challenges, as discussed in Section 10, on balance it may be preferable to unnecessary delays whenever the health of patients is at stake – and this is the case for many non-COVID-19 related health issues. While an overarching priority was given to rigorous trial designs, implementation details were kept as simple as possible. As was already argued prior to the pandemic, simplicity is a virtue in clinical research [76], but one that does not align with the commercial interests of the clinical research organizations that implement clinical trials for pharmaceutical companies [29]. Many have argued that the absurdly high costs of pivotal clinical trials are due to inefficiencies in the current clinical research process [90]. Examples of inefficiencies include the collection of data of marginal interest, including details of medical history and concomitant medications, complex procedures to measure outcomes, including central reviews and outcome adjudications, strict visit schedules and examinations that do not reflect clinical routine, and so on. 
Although some of these inefficiencies may be justified for pivotal trials of new drugs, they should generally be avoided in trials of approved drugs or other non-drug treatments. A clear distinction between pragmatic and explanatory approaches to clinical trials was proposed nearly fifty years ago, yet most trials conducted today adopt the explanatory approach, which is unnecessarily onerous [105]. Table 3 provides a comparison of trial characteristics under the explanatory and pragmatic approaches [19]. The COVID-19 pandemic provided empirical evidence that inefficiencies in clinical research can easily be overcome in pragmatic trials in times of emergency. Will this lesson survive the end of the pandemic?

Table 3

Contrast between the explanatory and the pragmatic approach in clinical trials.

Approach	Explanatory	Pragmatic
Type of trial	Industry-sponsored	Investigator-led
Primary purpose	Regulatory approval	Public health impact
Patient selection	Fittest patients	All comers
Effect of interest	‘Ideal’ treatment effect	Actual treatment effect
Outcome ascertainment	Centrally reviewed	Per local investigator
Preferred control group	Untreated (when feasible)	Current standard of care
Experimental conditions	Strictly controlled	Clinical routine
Volume of data collected	Large, for supportive analyses	Key data only
Data quality control	Extensive and on-site	Limited and central only

Impact of COVID-19 on ongoing clinical trials

The COVID-19 pandemic has had, and will continue to have, a major impact on the conduct of almost all ongoing clinical trials, in particular on the treatment of patients and the schedule of their planned protocol visits. Regulatory agencies worldwide have promptly issued guidance on measures to be taken to minimize the impact of COVID-19 on ongoing trials [39,122]. Given the huge uncertainty associated with the current situation, and the lack of historical precedents, the guidance documents recommend capturing as much information as possible on protocol deviations and other unexpected events, so as to be able to conduct various analyses when the trial is completed. Meyer et al. [86] give an excellent overview of statistical issues and recommendations for clinical trials during the COVID-19 pandemic. From a statistical inference perspective, despite the dramatic health care disruptions caused by the COVID-19 pandemic, intention-to-treat (ITT) analyses of randomized clinical trials remain valid, if (as will generally be the case) protocol deviations impact all randomized treatment groups equally. However, such deviations may induce a dilution of the treatment effect, and as such are likely to result in more conservative estimates of treatment effects (with the exception of non-inferiority trials, in which dilution favours a conclusion of non-inferiority). In other words, the ITT estimates of treatment effects will in general not be biased by systematic differences between the randomized treatment groups, but they may well underestimate treatment effects that would have been estimated in ‘normal’ circumstances.

Missing data

Missing visits, missing clinical assessments, missing scans or laboratory values, and the like that result from the COVID-19 pandemic will in general be missing at random (MAR), since the pandemic is an external cause of missingness that bears no relationship to the disease or treatment under investigation. Hence COVID-19 related missing data can be appropriately dealt with by using likelihood based methods or multiple imputation under the MAR assumption. To give a few typical examples: (1) hazard ratios estimated using proportional hazards regression models, e.g., for survival times, remain valid under independent censoring (and proportional hazards); (2) treatment effects estimated using mixed models for repeated data, e.g., for longitudinal measurements of visual acuity, remain valid if the outcome data are MAR; (3) generalized estimating equations for longitudinal measurements of responses remain valid if missing data are imputed under the assumption of MAR. Multiple imputation may be feasible when the amount of missing data is limited; however, its potential is limited when large volumes of data are missing, especially when few patients have observed data that can be used to impute the data for patients with missing values. In multinational or multiregional trials, the COVID-19 pandemic may take a different course in different regions; in addition, regional differences such as distance traveled to health care centers may create very different patterns of missingness across regions. This variability may not create a systematic bias if it affects all treatment arms equally. It does offer an opportunity to perform sensitivity analyses using region as a potential modulator of treatment effect. Other sensitivity analyses (such as shift imputation and tipping-point analyses) will likely play a more prominent role due to the larger than usual volume of missing data. 
Finally, it will be important to rule out situations of differential drop-out rate between the randomized treatment groups. This could happen, for instance, in open-label trials if patients in the control arm are more likely to miss their planned visits than patients who receive an experimental therapy. Conversely, some trials had to stop the experimental treatment (e.g., immunotherapy in cancer) for fear of an interaction with COVID-19.

Outcome assessments

Missing visits have a direct impact on outcome assessments. For instance, in oncology trials, tumor response and time to progressive disease are assessed through CT-scans performed according to a fixed schedule. Some conventions that are sometimes applied, e.g., to censor patients if they have missed too many visits, become wholly inappropriate when deviations from the intended schedules are systematic and unavoidable. In such cases, these conventions should be used, if at all, only in sensitivity analyses. The proper primary analysis of time to progression should remain an ITT analysis, in which all patients are followed up as thoroughly as possible, regardless of how long it takes to obtain CT-scans, until they have objective confirmation of disease progression. Because in some patients such confirmation may come with considerable delay, interval-censoring analyses may be helpful to complement or even replace the traditional analyses with right censoring only. Some patients may prefer to avoid hospital or office visits during the COVID-19 pandemic. If outcome assessments were due to take place at the hospital or doctor's office (e.g., a 6-minute walk test), it may be preferable to replace these assessments by their home-based equivalent assessments, when available. In most situations, some data are better than no data at all, under the assumption that data taken in less than ideal situations are not grossly erroneous or misleading. In fact, even if assessments taken at home in poorly controlled conditions are less reliable than those taken at the hospital in the most rigorous conditions, the loss in efficiency in detecting a treatment effect may be surprisingly small, assuming no systematic bias between the randomized treatment groups [18]. It is sometimes believed, wrongly, that patients who have symptomatic COVID-19 infections should be removed from trials of other indications. 
This is unjustified and should not be done unless it is mandated by the patient's safety or personal choice.

COVID-19 related events

It is conceivable that in some cases a randomized trial has its treatment arms differentially affected by the pandemic if the intervention under study is a risk factor for COVID-19. As an example, in oncology, chemotherapy is felt to increase the risk of infection among cancer patients, and some authors have cautioned the medical community about this risk. In a trial comparing chemotherapy with a non-cytotoxic intervention, the incidence of COVID-19 may therefore be higher in the chemotherapy arm. If the infection is a risk factor for one or more of the outcomes of interest (e.g., survival), an association may be created between the exposure (treatment) and the outcome (in this case, survival) through the infection, thus confounding the analysis of such outcome(s), unless cause-specific mortality is used. The reporting of causes of death is generally unreliable and variable from center to center, but COVID-19 related deaths are likely to be reported reliably (respiratory diseases being an exception). It may also be useful to perform competing risks analyses for the outcome of primary interest in the trial (such as disease progression) and COVID-19 infection. It is conceivable that patients with COVID-19 infection will receive treatments that interact with their treatments for other indications. Such interactions would only create a potential bias in randomized trials if they were different for the treatments being compared, an unlikely situation but one that may on occasion occur. A related issue is that most clinical trials forbid the inclusion of patients in other trials of investigational drugs. AIDS advocacy groups argued long ago that co-enrollment in multiple trials was both ethically and scientifically desirable, a view that still prevails today and should be pro-actively implemented in trials [87].

Protocol amendments

Because the results of randomized clinical trials are, by nature, protected against changes in the environment that affect all randomized groups equally, there will generally be no good reason to amend the statistical sections of the protocols of ongoing studies, except for sample size calculations and the provision of descriptive statistics on the impact of COVID-19 (number of patients with COVID-19 infections and COVID-19 deaths). Some trials will have to stop as a result of the pandemic with a lower sample size than initially planned. For trials that can continue throughout the pandemic, major protocol deviations may result in a lower treatment effect than anticipated, which might justify a sample size increase to compensate for the loss in statistical power. Such sample size increases do not affect the type I error.
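The order of magnitude of such a sample size increase can be gauged with the textbook normal-approximation formula for a two-arm comparison of means, in which the required number of patients is inversely proportional to the square of the treatment effect. The sketch below is only an illustration under that assumption (continuous outcome, known common standard deviation, two-sided test); the function name and the numerical inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(delta, sigma, alpha=0.05, power=0.90):
    """Patients per arm for a two-sided z-test comparing two means.

    delta: anticipated difference in means; sigma: common standard deviation.
    Uses n = 2 * ((z_{1-alpha/2} + z_{power}) * sigma / delta) ** 2, rounded up.
    """
    z = NormalDist().inv_cdf
    return ceil(2 * ((z(1 - alpha / 2) + z(power)) * sigma / delta) ** 2)


# Hypothetical example: the planned effect of 0.5 SD is diluted by 20% to 0.4 SD.
n_planned = patients_per_arm(delta=0.5, sigma=1.0)
n_diluted = patients_per_arm(delta=0.4, sigma=1.0)
inflation = n_diluted / n_planned   # close to 1 / 0.8**2, i.e. about 1.56
```

Under these assumptions, a 20% dilution of the effect already calls for roughly 55% more patients, which shows why major protocol deviations can be costly in terms of power.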

The price of speed: methodological sloppiness

Uncontrolled trials

In times of great pressure, such as when the COVID-19 pandemic erupted, it is very tempting to take shortcuts and experiment with potentially effective treatments in an uncontrolled way, with the hope that some of the treatments tested will be so effective as to constitute real breakthroughs in the management of the disease. Two additional factors may militate against conducting properly controlled experiments: the number of patients available, and the severity of their condition (patients admitted to ICU often having a fatal outcome). Yet, despite ethical dilemmas with control arms, randomization was widely considered during the COVID-19 outbreak as the only way to generate reliable, practice-changing evidence [29]. Claims made on the basis of supposedly impressive clinical outcomes of COVID-19 infected patients treated with Chloroquine and Hydroxychloroquine were viewed with skepticism, and the contradictory data that were later published about these drugs, including some that had to be retracted [85], confirmed that skepticism was indeed in order, and that scientific standards could not be lowered as a result of the pandemic [82]. Observational studies, even when conducted with care, can be so misleading that some authors have argued a moratorium should be placed on reporting them [20]. And indeed, to counteract exaggerated claims based on uncontrolled data, some wide-ranging national or international collaborations were quickly put in place for the conduct of large-scale trials [115]. The SOLIDARITY trial, conducted in 35 countries under the auspices of the WHO, is an example of a large simple trial in hospitalised patients with COVID-19, comparing local standard of care (SOC) against SOC plus Remdesivir, SOC plus Lopinavir and Ritonavir, or SOC plus Lopinavir, Ritonavir and Interferon β-1a [139]. Despite the best of intentions, the SOLIDARITY trial ran into contractual and legal difficulties that made its adoption in many countries slow and inefficient. 
Furthermore, the trial prioritized antiviral agents and other PIs over NPIs, which may have diverted resources away from trials of simple supportive care interventions. Finally, finding international consensus to select or change trial interventions is far more challenging than at the national level. The right balance between national and international efforts will have to be addressed going forward, with the overarching goal of maximizing the efficiency of clinical research. Trial implementation is definitely more efficient at the national level; however, the number of patients in a small country like Belgium is insufficient to size the trials properly. Most of the trials started during the early phase of the COVID-19 epidemic will be too small to provide reliable estimates of treatment effects, and it would therefore be advisable to plan prospective meta-analyses of all such trials as soon as possible. Such prospective meta-analyses should be based on patient-level data (see Section 11). In the UK, the large RECOVERY trial tested standard of care against low-dose Dexamethasone, Azithromycin, Tocilizumab or convalescent plasma [121]. This trial accrued 11,303 patients between March and June 2020 and, in this short period of time, was already able to show a highly significant benefit of dexamethasone on mortality [114], which immediately led to the use of glucocorticoids as standard of care for hospitalised COVID-19 patients.

Methodological errors

Methodological issues have arisen in a number of studies dedicated to prediction models for diagnosis and prognosis (mortality risk, progression to severe disease, length of hospital stay) of patients with COVID-19. Wynants et al. [140] conducted a systematic review and detected 51 studies with methodological issues and errors among a collection of 4909 titles screened. In clinical trials conducted in COVID-19 patients, the statistical methods commonly used are based on the standard Cox proportional hazards model [30] and the Kaplan-Meier estimator [70] (see, for example, [9], and [79]). When time to death due to COVID-19 is the outcome of interest, these methods implicitly treat discharged or recovered patients as right censored. Doing so is incorrect, however, as right censoring means that the unobserved time to death can be any time point larger than the observed one, whereas patients who recover may in fact never die from COVID-19. A correct way of analyzing this type of data is through the use of competing risk models, such as the model proposed by Fine and Gray [45], which is based on the subdistribution hazard, or through cure models. To study the impact of incorrectly classifying recovered patients as right censored, Oulhaj et al. [95] simulated data from a fictitious clinical trial on COVID-19. Six scenarios representing different situations of the effect of treatment on death and its competing event recovery were considered. The hazard ratio of death and the 28-day absolute risk reduction were estimated using the Cox model and the Fine and Gray model. The Cox model estimated the hazard ratio of death due to COVID-19 and the 28-day absolute risk reduction incorrectly in almost all cases. The magnitude of the estimation bias increased when the process of recovery was faster and/or the chance of recovery was higher. In some cases, the estimates obtained from the Cox model also incorrectly showed a harmful effect of treatment when it was in fact beneficial. 
The simulation study therefore shows that there is a substantial risk of misleading results in COVID-19 research if recovery and death due to COVID-19 are not considered as competing events, and the assumption of non-informative censoring is violated. This issue, and others related to intercurrent events, is best addressed using the estimand framework, which has now become a regulatory requirement for trials aimed at new drug registration [69]. Another well-known issue with the Cox model is the presence of strongly non proportional hazards. Much literature has recently focused on alternatives to the Cox model, especially for situations where deviations from proportionality are expected or have been observed, e.g., in trials of immunotherapy for cancer patients. Accelerated failure time models and the restricted mean survival time have been advocated in such cases, as have approaches based on generalized pairwise comparisons such as the win ratio and the net benefit [17]. Further experience is needed with these alternative approaches, which might advantageously be considered in COVID-19 trials.
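The consequence of treating recovered patients as right censored can be seen in a toy example. The sketch below contrasts the naive estimate of the 28-day risk of death, obtained as one minus the Kaplan-Meier estimator with recoveries censored, with the cumulative incidence of death obtained when recovery is handled as a competing event (the nonparametric Aalen-Johansen-type estimator). The data are invented for illustration and the function names are assumptions; this is neither the Fine and Gray regression model nor the simulation design of [95].

```python
def naive_km_death_risk(data, horizon):
    """1 - Kaplan-Meier estimate of death by `horizon`, censoring recoveries (incorrect)."""
    survival = 1.0
    death_times = sorted({t for t, event in data if event == "death" and t <= horizon})
    for t in death_times:
        at_risk = sum(1 for u, _ in data if u >= t)
        deaths = sum(1 for u, event in data if u == t and event == "death")
        survival *= 1 - deaths / at_risk
    return 1 - survival


def cumulative_incidence_death(data, horizon):
    """Cumulative incidence of death by `horizon`, with recovery as a competing event."""
    event_free = 1.0      # probability of neither death nor recovery just before t
    incidence = 0.0
    event_times = sorted({t for t, event in data if event != "censored" and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u, _ in data if u >= t)
        deaths = sum(1 for u, event in data if u == t and event == "death")
        recoveries = sum(1 for u, event in data if u == t and event == "recovery")
        incidence += event_free * deaths / at_risk
        event_free *= 1 - (deaths + recoveries) / at_risk
    return incidence


# Invented data: (day, event) for 10 patients followed for 28 days.
toy_data = (
    [(day, "recovery") for day in (3, 4, 5, 6, 7, 8)]
    + [(10, "death"), (12, "death")]
    + [(28, "censored"), (28, "censored")]
)
naive_risk = naive_km_death_risk(toy_data, horizon=28)               # about 0.5
competing_risk = cumulative_incidence_death(toy_data, horizon=28)    # about 0.2
```

In this example 2 of 10 patients die within 28 days, which the competing-risks estimate reproduces, whereas the naive estimate suggests a risk of about one half because it implicitly assumes that recovered patients remain at the same risk of death as those still hospitalised.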

The need for data sharing

During the pandemic, one of the key needs was and remains the collection of personal and medical data at an individual and group level. This need provided impetus for contact tracing and opened possible avenues of research for understanding the spread of the virus throughout the population and specific subgroups. In the discussion regarding the use of existing medical data, the collection of new data and in particular the collection of contact tracing data, some policy makers argued that there was a conflict between the rights guaranteed by the European Union's General Data Protection Regulation (GDPR), and this need for data sharing. This paradoxical dichotomy potentially inhibits the use of valuable data for research purposes within a country, and jeopardises cross-border scientific cooperation in the case of different interpretations of the same regulation within EU-member states. Several authors have argued that there is ample room within the GDPR for a framework allowing for the scientific use of existing and newly collected data to support the international effort to curb the pandemic [12,84]. These views are echoed by the European Data Protection Board [38], and confirmed by the Belgian Data Protection Authority. Specifically, one can invoke Article 9(2)(i) of the GDPR if ‘… processing is necessary for reasons of public interest in the area of public health …’ and Article 9(2)(j) if ‘… processing is necessary for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1)’. These provisions, together with the European Clinical Trials Regulation and the corresponding Belgian law, provide a solid base for the scientific use and sharing of medical and personal data [37]. As argued by other authors [143], the COVID-19 pandemic is not a free pass to use these data without any safeguards. 
The pandemic has in fact been an opportunity to show that the principles underlying GDPR can be an advantage for data-driven research. Confronting a new situation may also require time for reflection and debate, but this time was not available; hence public interest should prevail, within limits defined by the ethical committees. The availability of granular data from differing sources like individual medical files, or data held by mutual health organizations would provide unique opportunities to support health policy making and develop successful strategies for the current and future pandemics. The adoption of standards for data citation and referencing would also promote data sharing in an international, interdisciplinary, and interdependent research community. Guidelines have been developed by DataCite (https://datacite.org/cite-your-data.html) and DataVerse (http://best-practices.dataverse.org/data-citation/). As far as clinical research is concerned, there has been a remarkable push towards sharing of individual patient data for a number of years, both from publicly-funded trials and from the pharmaceutical industry [103]. The goal is to share individual patient data from all completed trials within reasonable time after their completion so as to allow for further analyses of these patient data, as well as to help the design of other trials. Such maximization of the use of patient data is certainly in line with greater patient involvement in clinical research, and would pave the way to truly patient-centric research. For the sharing to be maximally useful, the data should be made available as early as possible (without infringing intellectual property rights or pre-empting publication in full by the trial principal investigators). 
The COVID-19 pandemic has also made it clear that data should be shared even earlier, albeit confidentially, among the Independent Data Monitoring Committees of trials investigating similar treatments in order to inform decisions about amending or stopping ongoing trials after careful review of all relevant data [98].

Reflections, concluding remarks, and outlook

In this section, we suggest some specific lessons learned, both for modeling and prediction and for clinical research. First and foremost, there is a huge need for international collaboration through formal and informal scientific networks during pandemics. While there are local, country-specific aspects to the epidemic (culture, population density, demography, health care system), there is commonality from an infectious diseases perspective. The statistical and methodological teams in academia, industry, and government need to connect with each other, nationally and internationally. A steady research capacity is needed that can be scaled up quickly in pandemic times in order to respond efficiently. For example, statisticians working in other areas (exact sciences, economy, humanities) can be converted quickly to COVID-19 response work provided they are sufficiently broadly trained and there are pre-existing communication lines (e.g., university-wide statistics research centers, learned societies, etc.). More than ever, statisticians, modelers, and epidemiologists must be able to communicate and collaborate with research teams from other key fields, such as virologists and health economists, but also economists, social and behavioral scientists, etc. Effective communication lines need to be established between statisticians and other scientific experts, international, national, and local policy makers, the public, and the press. A number of statisticians must have received media training and ideally have built up experience in clearly communicating potentially complex statistical matters. An exceptional pandemic situation makes it clear that out-of-the-box thinking is needed. Inevitably, inaccurate or incorrect judgements will be made at some level during the pandemic. It must be acknowledged, and accepted, that knowledge is being built while the response to the crisis is being rolled out. 
Mutual trust between the parties involved and honest communication towards the public are essential. In this sense, it is fine and even healthy that researchers do not automatically agree with one another. Critical reflection and peer review, formally and informally, externally and internally within research groups, are of crucial importance to avoid serious mistakes. A gradually, orderly, and naturally built consensus can help avoid misguided policies. When the process works, public opinion is ready to accept NPIs, for example, before they are formally announced. A key problem with COVID-19 is the pressure that it can induce on the health care system. It is therefore important to have sufficient reserve capacity (in terms of hospitals, staff, and supplies). This is difficult because of the cost involved. Statisticians can contribute to planning, to health economic evaluation, and, during pandemic times, to monitoring and forecasting hospital load and other capacity.

Modeling, prediction, prevention

To avoid methodological errors, even when research is done at very high speed, and to ensure that models built and data analyses undertaken are as stable, broadly valid, and unbiased as possible, it is imperative to share data at the finest granular level possible, including individual patient data in clinical and epidemiological studies, and spatial data used to monitor the epidemic, to deter or alleviate post-wave outbreaks, etc. As is well-known throughout statistics, a well-fitting model (curve) does not automatically imply good prediction qualities. In meteorology, various weather models are juxtaposed to come to a calibrated weather forecast. Good models imply a subtle interplay between epidemiological theory, sophisticated modeling, and the use of real-world data: data about infections, hospitalization, and mortality on the one hand, and non-pharmaceutical interventions taken as well as their gradual relaxation on the other. In a pandemic epoch, a large number of national, regional, and city-wide epidemics can be compared. To date, excellent international resources are available, such as from the European Centre for Disease Prevention and Control ([36]; https://www.ecdc.europa.eu/en/covid-19-pandemic), Johns Hopkins University (https://coronavirus.jhu.edu/map.html), and Our World in Data (https://ourworldindata.org/coronavirus). These offer valuable resources on how the epidemic is playing out elsewhere. Especially in contiguous and highly connected areas, such as in the United States and the European Union (especially the Schengen Zone), the epidemic's evolution cannot be seen in isolation, except at the rare times where state or international borders are virtually closed.
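
The point that a well-fitting curve need not predict well can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the cumulative case counts are synthetic and generated from a logistic curve with arbitrarily chosen parameters (`K`, `r`, `t0`), and the "model" is a simple exponential fitted by least squares on the log scale to the first 21 days. It is not one of the models discussed in this paper.

```python
import math

def logistic(t, K=10000.0, r=0.25, t0=40.0):
    """Synthetic cumulative case count at day t (assumed parameters, not real data)."""
    return K / (1.0 + math.exp(-r * (t - t0)))

def fit_exponential(ts, ys):
    """Least-squares fit of log(y) = a + b * t; returns the pair (a, b)."""
    n = len(ts)
    logs = [math.log(y) for y in ys]
    t_mean = sum(ts) / n
    l_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(ts, logs))
    sxx = sum((t - t_mean) ** 2 for t in ts)
    b = sxy / sxx
    a = l_mean - b * t_mean
    return a, b

def mean_abs_pct_error(ts, ys, a, b):
    """Mean absolute percentage error of the fitted exponential on (ts, ys)."""
    errors = [abs(math.exp(a + b * t) - y) / y for t, y in zip(ts, ys)]
    return 100.0 * sum(errors) / len(errors)

# Early phase used for fitting, later phase held out for prediction.
train_t = list(range(0, 21))
test_t = list(range(21, 61))
train_y = [logistic(t) for t in train_t]
test_y = [logistic(t) for t in test_t]

a, b = fit_exponential(train_t, train_y)
in_sample = mean_abs_pct_error(train_t, train_y, a, b)
out_of_sample = mean_abs_pct_error(test_t, test_y, a, b)

print(f"in-sample error:     {in_sample:.2f}%")
print(f"out-of-sample error: {out_of_sample:.2f}%")
```

In this toy setting the exponential curve reproduces the early data almost perfectly, yet it overshoots badly once the synthetic epidemic saturates. Juxtaposing several models and checking them against held-out data, as in the weather-forecast analogy, is one way to guard against this.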

Clinical research

The COVID-19 crisis has provided an exceptional opportunity to question the way in which clinical research is conducted, not just for the treatment of COVID-19 patients but also for all other diseases. One of the priorities today should be to streamline clinical research in diseases with high morbidity and mortality (cancer, cardiovascular disease, etc.). This could entail drastic simplifications of trial set-up (protocol review committees, ethical approval, regulatory submissions, access to drugs from competing drug companies for comparative effectiveness trials, etc.) as well as trial conduct (pragmatic trials comparing standards of care using ultra-simple protocols, real-time electronic data capture, central statistical monitoring, common resources for Independent Data Monitoring Committees, etc.). These ideas are by no means new (see, e.g., https://moretrials.net/), but with the lessons learned during the COVID-19 pandemic, they may get more traction than ever before. The need for strengthened international collaboration in epidemiology should be accompanied by a corresponding international preparedness for clinical research, in order to quickly deploy large simple trials simultaneously in as many countries as possible. If the urgency to carry out clinical trials of treatments against COVID-19 could now be expanded to all other diseases, it would be a revolution in using statistical methodology to improve global health.
