
Epidemics, the Ising-model and percolation theory: A comprehensive review focused on Covid-19.

Isys F Mello¹, Lucas Squillante¹, Gabriel O Gomes², Antonio C Seridonio³, Mariano de Souza¹.

Abstract

We revisit well-established concepts of epidemiology, the Ising-model, and percolation theory. We also employ a spin S = 1/2 Ising-like model and a (logistic) Fermi-Dirac-like function to describe the spread of Covid-19. Our analysis shows that: (i) in many cases the epidemic curve can be described by a Gaussian-type function; (ii) the temporal evolution of the cumulative number of infections and fatalities follows a logistic function; (iii) quarantine plays a key role in blocking the spread of Covid-19, which we express in terms of an interaction parameter between people. In the framework of elementary percolation theory, we show that: (i) the percolation probability can be associated with the probability of a person being infected with Covid-19; (ii) the concepts of blocked and non-blocked connections can be associated, respectively, with a person respecting or not respecting social distancing. Finally, we make a connection between epidemiological concepts and well-established concepts in condensed matter Physics.


Keywords: Covid-19; Ising-model; Logistic function; Percolation theory

Year: 2021 PMID: 33814681 PMCID: PMC8006539 DOI: 10.1016/j.physa.2021.125963

Source DB: PubMed Journal: Physica A ISSN: 0378-4371 Impact factor: 3.263

Introduction

In the field of epidemics, every hour counts, and there is an urgent need to predict the temporal evolution of a disease in order to find the best way to deal with it and to control its spread. Recently, the Covid-19 (Coronavirus disease) pandemic has been spreading rapidly all over the world, and its broad impact on our lives hardly needs to be mentioned, see, e.g., Refs. [1], [2], [3], [4], [5], [6], [7]. It has been proposed that such a quick spread of Covid-19 among human beings is associated with a spike protein on the surface of the virus, which has a site that is activated by an enzyme called furin, so that the virus unfortunately infects human cells much more easily [8]. Therefore, an appropriate mathematical description of the spread of Covid-19 is urgently needed in order to contain the disease. This is particularly important for supporting governmental health agencies all over the world in maximizing the effectiveness of their medical support strategies in such a global crisis.

The SIR model

In the field of epidemiology [9], [10], [11], [12], [13], several mathematical approaches have been employed to describe the spread of infectious diseases. Among them, the SIR (Susceptible, Infectious, Removed) model [11], [14], [15], [16], [17] stands out. Such a model considers the contact transmission risk, the average number of contacts between people as a function of time $t$, and the time that a person remains infected and thus can infect others. These quantities enable us to infer the average number of people directly infected, $R_0$. The value of $R_0$ usually dictates whether the disease will eventually disappear ($R_0 < 1$), become endemic ($R_0 = 1$), or give rise to an epidemic ($R_0 > 1$) [11]. Furthermore, three variables provide key information about the disease, namely the fraction $S$ of people who are susceptible to infection, the fraction $I$ of people who are already infected and can thus infect others, and the fraction $R$ of people who become immune to the disease. The epidemic curve is obtained from three coupled first-order differential equations [11], Eqs. (1), (2), (3), whose parameters are the contact transmission risk, the average number of contacts between people as a function of time, and the time that a person remains infected and, as a consequence, can infect others. Also, $R_0$ can be written as the product of these three parameters. Although the system of differential equations defined in the frame of the SIR model describes the behavior of epidemics well, these parameters and, consequently, $R_0$ are considered constant. However, such factors can change over time under the influence of other parameters such as social distancing, for instance. Thus, a time dependence of the contact transmission risk and of the average number of contacts has to be taken into account, cf. discussion in Refs. [18], [19]. Given the non-linearity of the system of equations, its solution is non-trivial, and numerical analysis is required in many cases. Now, for the sake of completeness, we analyze each differential equation separately.
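For reference, the three coupled equations referred to as Eqs. (1), (2), (3) can be sketched in the standard SIR form below. The symbols $\beta$ (contact transmission risk), $\kappa$ (average number of contacts), and $D$ (time a person remains infected) are notational assumptions made here for illustration; only the structure follows the description in the text.

```latex
\begin{align}
\frac{dS}{dt} &= -\beta\kappa\, S I, \tag{1}\\
\frac{dI}{dt} &= \beta\kappa\, S I - \frac{I}{D}, \tag{2}\\
\frac{dR}{dt} &= \frac{I}{D}, \tag{3}
\end{align}
\[
R_0 = \beta\kappa D, \qquad \frac{d}{dt}\,(S + I + R) = 0 .
\]
```

Adding the three equations shows that $S + I + R$ is conserved, which is a convenient check for any numerical solution.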
The negative sign in Eq. (1) indicates that $S$ decreases monotonically with time. Note that the same factor is also present in Eq. (2), but with opposite sign, i.e., as the number of infected people increases, fewer people remain susceptible, since a fraction of the susceptible people has now become infected. Also, Eq. (3) takes into account the ratio between the number of infected people and the duration of the infection, also called incubation time, such a ratio being proportional to the number of people recovering from the disease. This ratio enters Eq. (2) with a negative sign, as expected, since recovered people are no longer part of the group of infected people. As previously mentioned, the set of three equations enables us to obtain the epidemic curve for an infectious disease. In order to discuss the SIR model and the epidemic curves it generates, we have applied the model to Covid-19, SARS, and Ebola and examined the implications of distinct $R_0$ factors for the disease spread (Fig. 1). Note that the highest value of $R_0$ is associated with Covid-19, most likely due to its relatively easy infection because of the furin enzyme [23]. Fig. 1 shows the epidemic curves obtained with the values of $R_0$ and of the incubation time reported in the literature for Covid-19 [20], SARS [21], and Ebola [22]. The number of susceptible people in panel (a) of Fig. 1 decreases more rapidly for Covid-19 than for SARS and Ebola, which is a direct consequence of the fact that Covid-19 presents a higher $R_0$ factor than the other diseases. As a consequence, the number of infected people over time depicted in panel (b) of Fig. 1 increases more rapidly and thus the number of immune people from Covid-19 in panel (c) of Fig. 1 increases more slowly due to the quick spread of the disease. In summary, Fig. 1 shows the typical epidemic curves for each disease spread considered here, in the frame of the SIR model.
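The qualitative behavior described above can be reproduced with a minimal numerical sketch. It assumes the standard SIR form, in which the infection term is $(R_0/D)\,S I$ and the recovery term is $I/D$, and integrates it with a classical fourth-order Runge-Kutta step. The function names and the parameter values in the usage example are hypothetical and are not the values taken from Refs. [20], [21], [22].

```python
def sir_rhs(s, i, r0, d):
    """Right-hand side of the (assumed) standard SIR equations.

    r0: average number of people directly infected (assumed constant)
    d:  time a person remains infected, in days
    """
    rate = r0 / d                   # combined transmission rate per day
    new_infections = rate * s * i   # flow from susceptible to infected
    recoveries = i / d              # flow from infected to immune
    return -new_infections, new_infections - recoveries, recoveries


def simulate_sir(r0, d, i0=1e-4, days=200.0, dt=0.1):
    """Integrate S, I, R (population fractions) with classical RK4.

    Returns a list of (t, S, I, R) tuples, starting at t = 0.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    trajectory = [(0.0, s, i, r)]
    for step in range(1, int(round(days / dt)) + 1):
        k1 = sir_rhs(s, i, r0, d)
        k2 = sir_rhs(s + 0.5 * dt * k1[0], i + 0.5 * dt * k1[1], r0, d)
        k3 = sir_rhs(s + 0.5 * dt * k2[0], i + 0.5 * dt * k2[1], r0, d)
        k4 = sir_rhs(s + dt * k3[0], i + dt * k3[1], r0, d)
        s += dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        i += dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        r += dt / 6.0 * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2])
        trajectory.append((step * dt, s, i, r))
    return trajectory


def peak_infected(trajectory):
    """Largest infected fraction reached along the trajectory."""
    return max(row[2] for row in trajectory)


if __name__ == "__main__":
    # Illustrative R0 values only; d = 5 days is likewise an assumption.
    for r0 in (0.8, 1.5, 3.0):
        peak = peak_infected(simulate_sir(r0, d=5.0))
        print(f"R0 = {r0}: peak infected fraction = {peak:.4f}")
```

In this sketch a larger $R_0$ produces an earlier and higher maximum of the infected fraction, while for $R_0 < 1$ the infected fraction only decays from its initial value.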
Note that the SIR model, through the knowledge of $R_0$, enables us to make predictions about the spread of the disease. According to recent works, environmental factors such as the weather can also be taken into account when treating the spread of a disease [24]. Such an analysis is key in predicting the epidemic curves during seasonal changes in some countries. Essentially, environmental factors, e.g., temperature changes, can affect the time that respiratory droplets remain suspended in the atmosphere, thus enhancing the infection probability. In the light of the SIR model, such an influence of the environment on the spread of the disease can be associated with the contact transmission risk. In practical terms, several measures enable us to minimize the risk of infection. For example, in the context of the SIR model, upon joining the quarantine both the average number of contacts and the contact transmission risk are reduced, since the number of contacts between people is lowered in the same way as the probability of being infected. On the other hand, upon going to the supermarket wearing a face mask, for instance, the contact transmission risk can be lowered but not necessarily the average number of contacts, since there will still be contact between people. Such actions are not only important for containing the disease spread, but are also key in preventing a resurgence or small new outbreaks. In a broader context, other mathematical models can be used to investigate the spread of diseases, such as MSEIR, MSEIRS, SEIR, SEIRS, SIRS, SEI, SEIS, SI, and SIS, where M denotes passively immune infants and E denotes the exposed people in the latent period [25], [26]. It is worth mentioning that the difficulty of acquiring data on the number of people exposed to Covid-19 is one of the key limitations of such models. Hence, as pointed out in Ref. [27], the simple application of the SIR model, for instance, is naive and does not suffice to describe the spread of Covid-19.
We emphasize that other approaches can be considered, such as the study of the spread of diseases employing the network model [28], [29], where the possible contacts that each person has with other people compose a set and the sum of all such sets in a given region gives rise to a network. Given the various involved factors, a broader analysis considering, for instance, social aspects and urbanization, is required. Such aspects are discussed in Section 6.

Fig. 1

Epidemic curves for Covid-19 (orange solid line), SARS (navy solid line), and Ebola (green solid line). (a) Number of susceptible people $S$, (b) number of infected people $I$, and (c) number of immune people $R$ versus time $t$, employing the parameters $R_0$ (number of people directly infected) and the incubation time for each disease considered here, taken from Refs. [20], [21], [22]. Details in the main text.

The Ising-model and the Covid-19 outbreak

It is evident that the aforementioned mathematical models used to describe the spread of epidemics are not built on concepts of Statistical Physics. However, the Ising-model can be adapted to describe the Covid-19 spread. Hence, in what follows we discuss the celebrated Ising-model [30]. Although initially proposed for describing magnetic systems, over the last decades the Ising-model has proven to be an appropriate tool to describe several phenomena. These include the supercooled phase of water [31], the vicinity of the Mott critical end point [32], [33], [34], magnetic field-induced quantum critical points [35], econophysics [36], democratic elections [37], [38], as well as the spread of diseases [39], [40], [41], [42], just to mention a few examples. Before discussing the adaptation of the Ising-model for the description of the Covid-19 spread, we recall here the classical one-dimensional Ising-model [43], [44]. In order to make a brief review of such a model, we focus here on the case of an applied longitudinal magnetic field. Essentially, the quintessence of the Ising-model lies in the assumption that magnetic moments are coupled only with their nearest neighbors. The Hamiltonian of a linear chain of spins, Eq. (4) [44], is then written in terms of the coupling constant $J$ between neighboring magnetic moments on sites $i$ and $j$, the spins $S_i$ and $S_j$ on these sites, and the longitudinal applied magnetic field. At this point, it is worth emphasizing that the coupling constant, also called magnetic exchange coupling constant, will be used to emulate the contact between infected and non-infected people in our approach. Next, we make use of the Ising-model [30], the logistic function [45], and percolation theory (to be introduced in Section 5) to describe the spread of Covid-19. We demonstrate the close relation between the number of people following the quarantine proposed by the WHO (World Health Organization) and the spread of Covid-19.
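As a concrete illustration of the nearest-neighbor coupling described above, the following minimal sketch evaluates the energy of a one-dimensional Ising chain in the common textbook convention, in which the energy is minus the coupling times the sum of neighboring spin products, minus the field times the sum of the spins. The sign convention, the spin values of +1 and -1, and the names `coupling` and `field` are assumptions of this sketch and are not necessarily the notation of Ref. [44].

```python
def ising_chain_energy(spins, coupling, field=0.0, periodic=False):
    """Energy of a 1D Ising chain with nearest-neighbor coupling.

    spins:    sequence of +1 / -1 values, one per site
    coupling: exchange constant (positive favors aligned neighbors here)
    field:    longitudinal applied field (assumed symbol and units)
    periodic: if True, also couple the last site to the first one
    """
    # Sum of products over neighboring pairs (i, i + 1).
    bonds = sum(a * b for a, b in zip(spins, spins[1:]))
    if periodic and len(spins) > 2:
        bonds += spins[-1] * spins[0]
    return -coupling * bonds - field * sum(spins)


if __name__ == "__main__":
    aligned = [1, 1, 1, 1]
    alternating = [1, -1, 1, -1]
    print(ising_chain_energy(aligned, coupling=1.0))      # aligned neighbors
    print(ising_chain_energy(alternating, coupling=1.0))  # opposite neighbors
```

In the epidemic analogy discussed in the text, only pairs of unlike neighbors (one infected, one non-infected) are meant to contribute, which is why the sign of the neighbor product matters.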
Although it is quite obvious that the more people respect the quarantine, the lower the number of infected people will be, a proper quantitative description is still lacking. Unfortunately, for many governmental leaders around the world, it is not obvious that the quarantine is crucial in containing the spread of Covid-19. It is worth mentioning a recent analysis of the so-called Digital Herd Immunity, reported in Ref. [46], in which people infected with Covid-19 are tracked through a smartphone application. Interestingly enough, the critical fraction of people using such an application for the containment of the Covid-19 spread is about 75% of the population. The latter means that if the entire population, infected or not, is tracked by the application and respects social distancing, the spread of the disease is contained. This paper is organized as follows: in Section 2, we discuss the adaptation of the Ising-model to the Covid-19 spread and deduce, in a pedagogical way, the logistic function, applying it to the description of both the number of infections and the number of fatalities due to Covid-19. In Section 3, we present an analogy between the electron interaction in condensed matter Physics and the interaction between infected and non-infected people in epidemics. It is worth mentioning that the analogy between well-established condensed matter Physics concepts and Epidemiology is drawn in this work with due regard to the respective conditions of each research field. A comparison between the logistic function and the SIR model is presented in Section 4. In Section 5, we present a comprehensive review of percolation theory, the Cayley tree, and the Bethe lattice, aiming to describe the behavior of the Covid-19 spread in the context of percolation theory. Conclusions and perspectives are also presented.

Covid-19 spread, infections and fatalities

The genesis of the interaction parameter

In the light of condensed matter Physics, we consider that the interaction (contact) between infected and non-infected people can be associated with the magnetic exchange interaction between nearest-neighbor magnetic moments in the well-established Ising-model [31], [34], [47], [48], cf. discussions in the previous Section. Hence, analogously to the case of the Ising-model for magnetism with spin S = 1/2, we divide the population into infected and non-infected people, whose numbers add up to the total number of inhabitants considered. In other words, infected and non-infected people can be identified by an Ising-like variable that takes one of two values, one assigned to infected and the other to non-infected people. This enables us to select infected and non-infected people mathematically. Furthermore, we consider that two people who are infected by Covid-19 have no effect on each other, so that in this case the interaction parameter vanishes, while it is finite otherwise. In the same way, a non-infected person has no effect on another non-infected person. Essentially, in our approach the interaction parameter quantifies, to some extent, the key role played by the quarantine in the spread of Covid-19. Note that the interaction parameter plays a role similar to that of the average number of contacts between people in the frame of the SIR model, discussed previously. Following a mathematical treatment similar to the one reported by us elsewhere, cf. Ref. [34], we write a mathematical function, Eq. (5), that describes the total population taking into account the contact between non-infected and infected people, which is emulated by the interaction parameter. Its first term is the total number of healthy people before the spread of the virus, and its second term runs over pairs of neighboring people. Note that the second term on the right-hand side of Eq. (5) will always be negative when a person is infected and the neighbor is not, and vice-versa. Although various applications of the Ising-model to distinct research areas are known in the literature, as discussed in the Introduction section, Eq. (5) provides a new way of mathematically selecting an infected and a non-infected person taking into account the interaction parameter.
Furthermore, Eq. (5) indicates that, when the interaction parameter is finite, i.e., infected people interact with healthy people, the number of healthy people is decreased. Evidently, the total number of the population remains constant. It is worth mentioning that even in a hypothetical situation in which all the population joins the quarantine, there would still be a small finite interaction parameter originating from household interactions, which are usually longer and more frequent and thus, under certain circumstances, can lead to new infections. In order to determine a proper mathematical expression for the interaction parameter, one must take into account the number of isolated people in quarantine. We propose that the contact between infected and non-infected people can be described by an interaction parameter that follows a decreasing power law of the number of people in quarantine. As a matter of fact, other power laws of the number of people in quarantine can also be employed, as shown in the inset of Fig. 3. Note, however, that the power law we have employed is an Ansatz to incorporate the effects of social distancing in the epidemic curve. Note also that this analysis aims to show the mitigation of the number of new Covid-19 cases over time upon increasing social distancing, and that it was not applied to the epidemiological data set of any particular country. Our Ansatz is a reasonable assumption, since as the number of people in quarantine decreases, the contact between people increases, which, in the present scenario, is represented mathematically by an increase in the interaction parameter. Note that Eq. (5) does not incorporate any time evolution.

Fig. 3

(a) Main panel: number of new cases versus time (in days) for Covid-19 in South Korea, fitted with Gaussian (orange solid line) and exponential (dashed cyan line) functions. Inset: corresponding Gaussian fit for (orange solid line) and (navy blue solid line). The values of (green solid line) and (red solid line) represent the cases for which and , respectively; (1) marks the maximum number of new cases for and (2) the reduction of the number of new cases when the number of people in quarantine is increased [66]. (b) Main panel: cumulative number of infected people versus time (in days) of Covid-19 for China and the corresponding fit (pink solid line) employing Eq. (10). Inset: versus with and = 0.22 (obtained from the fitting of the data set shown in the main panel) for 1 day (red solid line), 3 days (green solid line), and 7 days (blue solid line). (c) versus with and = 0.22 for 1 day (red solid line), 3 days (green solid line), and 7 days (blue solid line). For the sake of comparison, we have employed obtained in the fitting of the data for China in (b). Data set available in Ref. [55].

The Gaussian-type function and Gaussian processes

Upon analyzing officially reported epidemic curves [49], [50] for Covid-19, one finds that their typical time evolution in terms of new cases per day starts with a rapid increase of the number of infected people, which is usually assumed to be exponential. After such initial exponential growth, the number of infected people reaches a maximum and starts decreasing monotonically as time continues to evolve. Essentially, in qualitative terms, the typical shape of epidemic curves is roughly that of a simple Gaussian-type function, as can be seen in Fig. 2. At this point, it is worth mentioning that although the epidemic data sets discussed here follow the behavior of a Gaussian function, our analysis has nothing to do with a probability distribution. For the sake of completeness, we recall the mathematical expression for the Gaussian function:

$$y = y_0 + \frac{A}{w\sqrt{\pi/2}}\, e^{-2 (t - t_c)^2 / w^2}, \tag{6}$$

where $y_0$ is related to an initial value, $A$ is a normalization constant associated with the area under the curve, $t$ is the time, $t_c$ is the value of $t$ associated with the maximum value of $y$, and $w = 2\sigma$, where $\sigma$ is the standard deviation. Hence, following an analysis similar to the one employed by the authors of Ref. [37], we make use here of a simple Gaussian function to describe the spread of Covid-19. In our analysis, the interaction between infected and non-infected people is incorporated in the Gaussian function through its width $w$ and its peak position $t_c$. The incorporation of the interaction parameter in terms of the probability of a person being infected is discussed in detail in Section 5.2. We stress that this choice is based on the argument that when the number of people in quarantine is increased, the contact between infected and non-infected people is reduced and, as a consequence, the epidemic curve is broadened and its maximum is shifted in time.
Although we have considered one particular power law, any other power law of the number of people in quarantine that describes the lowering of the interactions between people upon increasing the number of people in quarantine could be incorporated into the Gaussian function as well. The mathematical function that incorporates the interactions between people is naturally distinct for each region, depending on the number of contacts that each person has with others. In other words, the impact of social distancing on the epidemic curves is more pronounced in regions with a higher number of contacts between people. In this context, a natural analogy can be made between the number of contacts each person has with other people and the coordination number employed in the field of condensed matter Physics. For more complex scenarios, where the number of contacts is very high, an approach employing dynamical mean-field theory (DMFT) can be implemented, as mentioned in Section 5. Next, we discuss the incorporation of the interaction parameter into the Gaussian function, which is expressed by Eq. (7). It gives rise to a broadening of the Gaussian function, so that its width is increased and its maximum is reduced. Note that by doing so, the area under the curve remains constant. It is worth emphasizing that our analysis based on the incorporation of the interaction parameter in the Gaussian function was only possible due to the Ising-like model that we have introduced previously, within which the interaction parameter has its genesis. We emphasize that the modeling of the epidemic curves is performed by assuming that there are several additional parameters ruling the global behavior of the time evolution of the key factors incorporated in the SIR model, i.e., the number of infected, recovered, and susceptible people. However, it is important to stress that a real disease spread depends on an ensemble of parameters, which are not usually taken into account to describe the global dynamics of the epidemic curves.
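The statement that broadening lowers the maximum while leaving the area unchanged can be checked numerically. The sketch below assumes the area-normalized Gaussian whose parameters match the columns of Table 1 ($y_0$, $A$, $w$, $t_c$, with $w = 2\sigma$) and uses the South Korea row as input. The way the interaction is added (the same hypothetical `extra` term added to both the width and the peak position) is only a stand-in for illustration and is not the authors' Eq. (7).

```python
import math


def gaussian_curve(t, y0, area, w, tc):
    """Area-normalized Gaussian: baseline y0, area A above it, width w = 2*sigma."""
    amplitude = area / (w * math.sqrt(math.pi / 2.0))
    return y0 + amplitude * math.exp(-2.0 * (t - tc) ** 2 / w ** 2)


def broadened_curve(t, y0, area, w, tc, extra):
    """Hypothetical broadening: add `extra` (days) to both w and tc."""
    return gaussian_curve(t, y0, area, w + extra, tc + extra)


def area_above_baseline(curve, y0, t_min, t_max, n=20000):
    """Trapezoidal estimate of the area between the curve and y0."""
    h = (t_max - t_min) / n
    total = 0.5 * ((curve(t_min) - y0) + (curve(t_max) - y0))
    for k in range(1, n):
        total += curve(t_min + k * h) - y0
    return total * h


if __name__ == "__main__":
    # South Korea row of Table 1; extra = 3 days is an illustrative choice.
    y0, area, w, tc = 85.94, 5890.41, 7.51, 15.88
    narrow = lambda t: gaussian_curve(t, y0, area, w, tc)
    wide = lambda t: broadened_curve(t, y0, area, w, tc, 3.0)
    print("peak (original): ", narrow(tc))
    print("peak (broadened):", wide(tc + 3.0))
    print("area (original): ", area_above_baseline(narrow, y0, -100.0, 150.0))
    print("area (broadened):", area_above_baseline(wide, y0, -100.0, 150.0))
```

Because the prefactor scales as $1/w$, any increase of the width lowers the peak by the same factor by which it widens the curve, which is what keeps the area fixed.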
These additional factors cannot be modeled by employing solely elementary analytic functions. Thus, in such cases, Gaussian processes may be implemented in the model to increase the accuracy of best-fit solutions describing the time evolution of the epidemic parameters [51], [52]. Essentially, Gaussian processes can be understood in the following way: suppose we have a given phenomenon (in our case, the number of infected ($I$), recovered ($R$), and susceptible ($S$) people in the light of an epidemic) to be modeled as a function of time $t$. There is a set of well-determined parameters that dictate the predominant behavior of the time evolution of these quantities, cf. Eqs. (1), (2), (3). However, other factors may play a minor, although non-negligible, role in modeling the epidemic curve. A simple example of a factor that could change the epidemic curve is the weather. It is known that weather prediction is one of the most difficult tasks in modern predictive analysis due to the intrinsically stochastic character of this phenomenon. If the virus corresponding to a given epidemic presents a “weakness” to, e.g., hot weather, an abrupt change in the temperature may cause small fluctuations in the number of infected people over a short period of time. Such a period of time is comparable to the timespan in which the temperature remains relatively high. Supposing that these temperature fluctuations take place for some days in a given timespan, one has a stochastic variable adding “noise” to the behavior, for instance, of the number of susceptible people over time $S(t)$. A Gaussian process is thus a way of introducing this kind of random fluctuation into the epidemic model. The so-called Gaussian aspect is attributed to the way in which the noise is added.
It is known that the sum of a large number of independent random contributions is approximately Gaussian distributed, a result known as the Central Limit Theorem [53]. Thus, the incorporation of randomly generated fluctuations in the deterministic portion of $S(t)$ implies a more complete modeling. Indeed, this accounts not only for the deterministic nature of $S(t)$, already described by the SIR model, but also for the noise introduced by, for instance, the weather variation and its impact on the survival of the virus causing the given disease. The goal is, eventually, to determine which value of the standard deviation best suits the Gaussian process, i.e., the one that minimizes the deviation between the actual data of the epidemic curve and the model curve, which is the sum of a deterministic part and a stochastic contribution. Of course, such modeling requires a more complex implementation. In Ref. [54] an example of publicly available code to model a Gaussian process in one dimension is reported. Note, however, that such procedures are random in nature and serve only the purpose of increasing the accuracy of the fits by taking into account a detrending process. We thus mention here the existence of such methods, but we do not implement them in the current work, since no additional insights can be gained in the discussions related to the spread of Covid-19.

Fig. 2

Number of new cases versus time in days for: (a) Ebola [63], (b) SARS [64], and (c) Influenza A/H1N1 [65]. The solid lines in all panels represent the Gaussian fitting of the data set for each case.

The logistic and the Fermi–Dirac-like function applied to epidemics

Given the relatively large amount of data available for China [55] and the proper control of the spread of Covid-19 achieved in its territory, in the following we pedagogically deduce the logistic function by making use of the celebrated standard population growth model [56], [57]. We assume that there is a maximum number of people who can possibly be infected; given the number of infected people at a given time $t$, the difference between the two is the number of people who can still be infected. Note that this maximum number corresponds to the percentage of the population that will be infected. Also, we consider that the number of people who can be infected by the already infected people in a time interval $\Delta t$ is proportional to $\Delta t$, with a factor that quantifies how many times an infected person interacts with a non-infected person. The number of infected people at time $t + \Delta t$ minus the number of infected people at $t$ is then proportional to the number of people of the population who can be infected by the already existing infected people at time $t$. Hence, it is straightforward to write Eq. (8), which contains a non-universal proportionality constant. Upon taking the limit $\Delta t \to 0$ on both sides of Eq. (8), we arrive at a differential equation, Eq. (9), in which a single constant quantifies the frequency of infection. Integrating both sides of Eq. (9), we obtain Eq. (10), which contains a non-universal integration constant. Eq. (10) is reminiscent of the well-known Fermi–Dirac (FD) distribution function [58]. Interestingly, Eq. (10) is also typically employed in chemical kinetics [59], anaerobic biodegradability tests [60], and even germination curves [61]. It provides a robust description of the evolution of the number of infected people over time, saturating at the maximum number of infected people. Also, Eq. (9) can be recognized as the well-established logistic differential equation [56], whose general solution, Eq. (11) [62], is characterized by the growth rate of the function, its maximum value, and its midpoint. Thus, Eq. (10) can be rewritten in a form similar to Eq. (11), namely Eq. (12). Note that Eq. (12) is equivalent to Eq. (10) for an appropriate identification of the constants. The logistic function can also be written in terms of the hyperbolic tangent. Although Eq. (10) is essentially the well-known logistic function, we will consider it from the viewpoint of condensed matter Physics as a Fermi–Dirac-type (FD-type) function, because of their mathematical similarities. The data sets for Covid-19 discussed here are available in Ref. [55]. It is worth mentioning that we have focused our analysis on countries that presented a more advanced picture of the Covid-19 spread during the preparation of this manuscript, such as South Korea and China [55].
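The chain of steps from the growth equation to the logistic function can be sketched as follows. The symbols $N(t)$ (number of infected people), $N_{\max}$ (its maximum), $a$ (frequency of infection), $C$ (integration constant), and $t_0$ (midpoint) are notational assumptions of this sketch; the standard logistic form is used, which may differ in detail from the notation of Eqs. (8) to (12).

```latex
\begin{align*}
\frac{dN}{dt} &= a\,N\left(1 - \frac{N}{N_{\max}}\right)
  && \text{(logistic differential equation)}\\
\int \frac{dN}{N\,(1 - N/N_{\max})} &= \int a\,dt
  \;\;\Longrightarrow\;\;
  \ln\frac{N}{N_{\max} - N} = a\,t + \text{const}
  && \text{(separation of variables)}\\
N(t) &= \frac{N_{\max}}{1 + C\,e^{-a t}}
  && \text{(logistic function)}\\
N(t) &= \frac{N_{\max}}{1 + e^{-a (t - t_0)}},
  \qquad C = e^{a t_0}
  && \text{(midpoint form)}\\
\frac{1}{1 + e^{-x}} &= \frac{1}{2} + \frac{1}{2}\tanh\frac{x}{2}
  && \text{(hyperbolic-tangent form)}
\end{align*}
```

For $t \to \infty$ the exponential vanishes and $N(t) \to N_{\max}$, which is the saturation discussed in the text.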

Data analysis and discussion for the case of infections

We start our analysis by recalling the data sets for Ebola, SARS, and Influenza A/H1N1. Their epidemic curves, depicted in Fig. 2, can be described by a fit employing a Gaussian function. Table 1 shows the parameters obtained in the fitting for the various epidemics (Fig. 2) and their respective standard deviations. At this point, it is worth emphasizing that it is not possible to forecast an epidemic curve by only fitting a Gaussian function to the data associated with the initial growth of the epidemic curve. Now, we focus on the analysis of the spread of Covid-19. We start with the available data set for Covid-19 in South Korea, see Fig. 3(a).

Table 1

Gaussian fitting parameters for several epidemics [see Fig. 2, Fig. 3(a)]. The existence of other external parameters influencing the time evolution of the number of new cases causes fluctuations in the data. Since such factors can be described, in general, by a Gaussian process, the associated errors of the fitting with respect to the real data are incorporated in the standard deviations of the fits presented in Fig. 2, Fig. 3. Details in the main text.

Disease	y0	A	w	tc	σ
Ebola	41	15418	13.98	34.08	6.99
SARS	2	2122	38.08	51.39	19.04
Influenza A (H1N1)	0.42	260	6.89	58.69	3.44
Covid-19 (South Korea)	85.94	5890.41	7.51	15.88	3.75
Covid-19 (Austria)	12.82	4317.33	8.34	24.24	4.17
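As a quick consistency check on Table 1, the tabulated widths agree with the relation $w = 2\sigma$ to within rounding. The short sketch below verifies this for every row and also converts $w$ into the full width at half maximum, $\sqrt{2\ln 2}\,w$, which gives a rough duration (in days) of the main part of each outbreak. The dictionary layout and the function names are hypothetical; the numbers are copied from the table.

```python
import math

# Rows of Table 1 as (y0, A, w, tc, sigma).
TABLE_1 = {
    "Ebola": (41.0, 15418.0, 13.98, 34.08, 6.99),
    "SARS": (2.0, 2122.0, 38.08, 51.39, 19.04),
    "Influenza A (H1N1)": (0.42, 260.0, 6.89, 58.69, 3.44),
    "Covid-19 (South Korea)": (85.94, 5890.41, 7.51, 15.88, 3.75),
    "Covid-19 (Austria)": (12.82, 4317.33, 8.34, 24.24, 4.17),
}


def width_mismatch(w, sigma):
    """Absolute difference between w / 2 and the tabulated sigma."""
    return abs(w / 2.0 - sigma)


def fwhm(w):
    """Full width at half maximum of a Gaussian with w = 2 * sigma."""
    return math.sqrt(2.0 * math.log(2.0)) * w


if __name__ == "__main__":
    for disease, (y0, area, w, tc, sigma) in TABLE_1.items():
        print(f"{disease}: |w/2 - sigma| = {width_mismatch(w, sigma):.3f}, "
              f"FWHM = {fwhm(w):.2f} days, peak at day {tc}")
```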

The number of people joining the quarantine is directly associated with the previously defined interaction (contact) between infected and non-infected people. As discussed, in a hypothetical situation where no one joins the quarantine, the interaction parameter achieves its maximum value, which means that everyone interacts freely with each other. This would be the worst-case scenario. It has been broadly discussed that upon increasing the number of infected people in quarantine, the maximum in the epidemic curve associated with the number of new cases is not only lowered, but also shifted, indicating that the spread of Covid-19 will be more contained. The latter corresponds to the desired situation in terms of controlling the spread, since the governmental health agencies will have more time to manage the situation. This is one of the reasons why a proper registration of the epidemic data is crucial. Note that our analysis based on the Ising-like model incorporates the key role played by social distancing, broadly discussed in the media, see, e.g., Ref. [66]. We demonstrate such a situation employing the data set available for South Korea, assuming a hypothetical finite interaction parameter incorporated in the Gaussian function employed in the analysis, as depicted in the inset of Fig. 3(a). More specifically, when the interaction parameter is lowered, the maximum number of infected people is decreased and its position in time is shifted, thus making the disease spread more controllable. Such a decrease in the interaction parameter represents more people joining the quarantine and thus avoiding contact with each other. Interestingly, the number of infected people over time in the frame of the SIR model, depicted in Fig. 1(b), for the various diseases resembles the behavior of the number of new cases over time shown in the inset of Fig. 3 for Covid-19. Such a resemblance between the roles played by the interaction parameter and $R_0$ is due to the fact that $R_0$ depends on the average number of contacts between people, which in turn represents an average interaction between people in a way similar to the interaction parameter.
Indeed, $R_0$ can be written in terms of the average number of contacts between people [11]. Since $R_0$ is connected with the average number of contacts between people as a function of time $t$, upon decreasing $R_0$ as shown in Fig. 1(b) the curve of infected people over time is broadened in the same way as in Fig. 3(a) when the interaction parameter is lowered. Essentially, both behaviors underline the importance of respecting social distancing and thus minimizing the average number of contacts between people. Now, we treat the available data set of the spread of Covid-19 in China in terms of the FD-like distribution function. As depicted in Fig. 3(b), the cumulative number of infected people versus elapsed time starts to saturate roughly 50 days after the outbreak of Covid-19. In other words, for large values of $t$ the ratio between the number of infected people and its maximum approaches 1. Eq. (10) fits such a data set nicely, cf. Fig. 3(b). Note that the number of infected people over time for China is marked by a jump, cf. Fig. 3(b). Such a jump is due to a recount of the number of newly infected people in China after the diagnostic criteria were changed. In the following, we discuss in more detail the similarity between Eq. (10) and the FD distribution function. Rearranging Eq. (10), we obtain an expression in which a constant is introduced to play the role of the Fermi energy for the electron gas [58]. Note that, when the remaining constant of this expression is equal to 1, we have exactly the same form as the FD distribution function [58], which is written in terms of the energy $E$ and the temperature $T$. In the present case, the time plays a role analogous to the energy for the Fermi gas, while the characteristic time scale of the spread is analogous to the thermal energy $k_B T$, where $k_B$ is the Boltzmann constant. Indeed, the time evolution of the Covid-19 spread has a significance similar to that of the energy for the FD distribution function of the Fermi gas. The behavior of the fraction of infected people as a function of time is shown in the inset of Fig. 3(b) for several parameter values (1, 3, and 7 days). Considering that in our fit using Eq. (10) for the spread of Covid-19 in China [inset of Fig. 3(b)] we have obtained a value of this constant that differs from 1, it becomes clear that we are dealing with a distorted version of the FD function, cf. Fig. 3(c). Hence, to some extent, we are faced with a behavior analogous to the distorted FD distribution for electrons in the picture of the Landau Fermi-liquid (FL) [67], [68]. The latter will be discussed in more detail in Section 3. Also, we emphasize that the ratio between the number of infected people and its maximum gives us the probability of a person being infected at a certain time $t$, in a way similar to that in which the FD distribution function dictates the probability of a state with energy $E$ being occupied at a certain temperature $T$. We anticipate that the spread of Covid-19 in other countries, not discussed in the present work given the lack of available data, should follow the same behavior as discussed here for South Korea and China. The position of the maximum in the epidemic curve, described by a Gaussian function, as well as the associated fit parameters, will directly reflect the policies adopted by the governmental health agencies of a particular country.

The logistic and the Fermi–Dirac-like function for the case of fatalities

As reported by the Centers for Disease Control and Prevention (CDC) [69], there are some risk factors that increase the chance of an infected people to pass away due to infection by Covid-19. Such factors include mainly being with age above 65 years old and having a serious underlying medical condition. In our approach, we refer to this set of factors as a global (Risk) -factor. We assume that the number of infected people presenting the -factor is and the number of infected people without presenting the -factor . It is evident that the total number of infected people is given by . Also, it is expected that people presenting the -factor are more likely to pass away and thus to reduce the total number of infected people. As discussed in the following, the number of new fatalities per day follows a Gaussian fitting in the same way as the number of new infection cases per day. Now, we focus on the number of accumulative fatalities over time. Based on the fact that the number of accumulative fatalities over time follows a behavior similar to the number of accumulative infected people over time, we carry out a similar analysis applied to the number of infected people over time, cf. previously discussed. The number of people that can pass away in a time interval is given by , where is the maximum number of fatalities, the number of people that pass away in a certain time , and is a parameter associated with the fatalities rate. Thus, the number of fatalities in a time interval minus the number of fatalities in a time is proportional to the number of infected people presenting the -factor. These assumptions enable us to write: where is a proportionality constant. Thus, following the same elementary mathematical treatment as in the last Subsection, we can write: where is a non-universal integration constant and is the fatality rate. Eq. 
(16) provides a reasonable description of the number of accumulative fatalities over time and resembles, as stated previously, a distorted Fermi–Dirac-like distribution function (logistic function), being that for the ideal case . Interestingly, the number of accumulative fatalities over time, described by Eq. (16), also follows a logistic function in the same way as the number of accumulative infections over time, cf. Eq. (10). Next, we discuss the fitting of the number of accumulative fatalities in China employing Eq. (16).
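
The connection between the two statements above (a bell-shaped curve for the new fatalities per day and a logistic curve for the accumulative fatalities) can be illustrated with a minimal sketch. The explicit form of Eq. (16) is not reproduced in this text, so the code assumes a generic three-parameter logistic function; the names `m_max`, `rate`, and `t_mid`, as well as their values, are hypothetical and are not the fitted parameters.

```python
import math


def logistic(t, m_max, rate, t_mid):
    """Generic logistic (Fermi-Dirac-like) curve; an assumed stand-in for Eq. (16)."""
    return m_max / (1.0 + math.exp(-rate * (t - t_mid)))


def daily_increments(values):
    """New cases per day, taken as differences of the accumulated curve."""
    return [b - a for a, b in zip(values, values[1:])]


# Illustrative (hypothetical) parameters, not fitted to any data set.
M_MAX, RATE, T_MID = 3000.0, 0.25, 30.5

days = list(range(0, 81))
accumulated = [logistic(t, M_MAX, RATE, T_MID) for t in days]
new_per_day = daily_increments(accumulated)

# Index of the day-to-day interval with the largest increment.
peak_day = max(range(len(new_per_day)), key=new_per_day.__getitem__)
print("accumulated curve saturates near", round(accumulated[-1], 1))
print("daily increments peak between day", peak_day, "and day", peak_day + 1)
```

Under this assumption the daily increments are symmetric and bell-shaped around the inflection point of the accumulated curve. Strictly speaking, the derivative of a logistic function is Gaussian-like rather than exactly Gaussian.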

Data analysis and discussion of the fatalities and the infection capacity

Before discussing the data set for China, and in order to demonstrate the applicability of our approach, we recall the behavior of both the number of new fatalities and the total number of fatalities over time for the Ebola outbreak in West Africa in 2014 [70]. As can be seen in the main panel of Fig. 4, the time evolution of the number of new fatalities follows a Gaussian function, whereas the total (accumulative) number of fatalities (inset of Fig. 4) is well described by Eq. (16). Such a behavior is also followed by other pandemics (not shown) [71], [72], which reinforces that we are dealing with a universal description of the epidemic curves of any disease.

Fig. 4

Main panel: number of new fatalities per month in the 2014 Ebola outbreak in West Africa fitted with a Gaussian function (blue solid line), being the obtained fitting parameters displayed in Table 2. Inset: total number of fatalities per month during the Ebola outbreak fitted (red solid line) with Eq. (16), being , and . Data set available in Ref. [70].

Considering the number of new fatalities by Covid-19 for the specific case of China [Fig. 5(a)], a behavior similar to that of the 2014 Ebola outbreak in West Africa (main panel of Fig. 4) is observed. As depicted in Fig. 5(a), the number of new fatalities over time follows a Gaussian function as well. The total number of fatalities over time [Fig. 5(b)] follows a behavior that resembles a distorted FD-type function, cf. inset of Fig. 5(b). Interestingly, the value of the parameter is much lower than the one for the accumulative number of infections over time, which suggests that the accumulative number of fatalities over time follows a less distorted FD-like distribution function.

Fig. 5

(a) Number of new fatalities per day of Covid-19 in China fitted with a Gaussian function (red solid line). (b) Main panel: accumulative number of fatalities versus time (in days) in China fitted (black solid line) with Eq. (16). Inset: as a function of showing a distorted Fermi–Dirac-like behavior employing and . (c) Main panel: infection capacity (in arbitrary units) as a function of time in days (pink solid line) computed employing the fitting parameters for China, namely , = 2660.2, and , using Eq. (18). Inset: versus employing the fitting parameters for China for 1 day (red solid line), 3 days (green solid line), and 7 days (blue solid line). The vertical dashed line indicates 50 days. Data set available in Ref. [55].

Table 2

Gaussian fitting parameters for the number of new fatalities per day for Ebola and Covid-19 [see main panel of Fig. 4, Fig. 5(a)].

Disease	y0	A	w	tc	σ
Ebola	−0.41	389.66	5.72	10.97	2.86
Covid-19	8.28	2725.28	18.35	24.95	9.17
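
As a quick consistency check of Table 2, the sketch below evaluates a Gaussian with the tabulated parameters. The functional form is an assumption, since it is not stated in this text: an area-normalized Gaussian with offset `y0`, area `A`, center `tc`, and width `w` equal to twice the standard deviation, which is at least consistent with the tabulated values of σ being half of w.

```python
import math

# Fitting parameters copied from Table 2.
TABLE_2 = {
    "Ebola": {"y0": -0.41, "A": 389.66, "w": 5.72, "tc": 10.97, "sigma": 2.86},
    "Covid-19": {"y0": 8.28, "A": 2725.28, "w": 18.35, "tc": 24.95, "sigma": 9.17},
}


def gaussian(t, y0, A, w, tc):
    """Area-normalized Gaussian with offset y0, area A, width w = 2*sigma, center tc.

    This functional form is an assumption; the text does not state it.
    """
    amplitude = A / (w * math.sqrt(math.pi / 2.0))
    return y0 + amplitude * math.exp(-2.0 * ((t - tc) / w) ** 2)


for disease, p in TABLE_2.items():
    peak = gaussian(p["tc"], p["y0"], p["A"], p["w"], p["tc"])
    print(f"{disease}: w/2 = {p['w'] / 2:.3f}, tabulated sigma = {p['sigma']}, "
          f"peak value under the assumed form = {peak:.1f}")
```

If the authors used a different parameterization, only the `gaussian` function needs to change; the tabulated numbers are used as given.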

Also, under the perspective of condensed matter Physics, given the similarity between Eq. (14) and the FD distribution function [58], a direct analogy can also be made for the density of states (DOS), i.e., , where is the number of particles (electrons) and the energy in the case of the Fermi gas. Here, the proposed equivalent quantity is . For the sake of completeness, we recall the expression used in the calculation of the specific heat for the Fermi gas [58]: where is the Fermi energy. At this point, it is worth mentioning that the DOS for the Fermi gas is temperature independent [58]. Below we write the here proposed infection capacity : Note that for 0, based on previous discussions, should have its maximum value. Also note the presence of an additional term in Eq. (18). Such a term emerges because in the present case the DOS is time dependent, being such a feature key in understanding the present results. Eq. (18) quantifies the infection capacity by Covid-19 over time. In the main panel of Fig. 5(c), we show versus time for China employing the fitting parameters discussed for new infections, cf. Fig. 5(a). A Gaussian function-like behavior is clearly observed. In the case of China, the infection capacity goes to zero at days (dashed vertical line in Fig. 5), which is in agreement with both the vanishing of new fatalities per day and also with the saturation of the accumulative number of people which passed away, cf. Figs. 5(a) and (b), respectively. Furthermore, the lowering of , preceded by its vanishing, is also associated with the saturation of the accumulative number of infected people, which also takes place at days. The inset of Fig. 5(c) depicts versus . Interestingly, for the number of people that can be infected is reduced and, consequently the contact (interaction) between infected and non-infected people is reduced as well. 
Such a reduction leads to a behavior closer to an ordinary FD distribution function where no interactions are present, i.e., we approach a clean Dirac delta function.

The Landau Fermi-liquid picture

Following the discussions presented in the previous section, a similar situation is also found in a Landau FL electronic system [68]. Upon bringing the quasi-particle (interacting) character to a non-interacting picture, the distorted FD function becomes a regular FD distribution. In the frame of the Landau FL picture the equilibrium distribution function of the quasi-particles is given by [68], [73]: where is the quasi-particle local energy and the chemical potential. The quantity depends on the interaction between the quasi-particles and can be written in the form , where is the energy associated with the case when the interactions between quasi-particles are absent and is the distribution function of the excited quasi-particles. Thus, can be rewritten as follows: where the term can be recognized as the previously discussed factor . In the case where then , which leads to the case of a non-interacting Fermi-gas. On the other hand, when we have a distorted Fermi–Dirac distribution function, which corresponds to the case of a Landau FL as previously discussed. At this point, an analogy can be made between the behavior of electrons in solids and the interaction between people during an epidemic. In fact, the interaction between quasi-particles plays a role analogous to the interaction parameter , which quantifies the contact between an infected and a non-infected person in the modeling of the Covid-19 spread.

The logistic function versus SIR model

In the following, we make a comparison between the logistic equation and the SIR model. Note that Eq. (10) represents the total number of infected people as a function of time . In the frame of the SIR model, knowing and we have access to both the number of new infections per day and the accumulative total number of infections by calculating . Hence, it is possible to obtain the accumulative total number of infected people in a way similar to that of the logistic equation. This demonstrates that, although it is appropriate for the description of epidemics, the SIR model requires a much more complex treatment than the logistic equation, so that the epidemic curve is more easily determined by the latter. It is clear that to make use of Eq. (10) one needs to know the infection rate and the constant (), while the SIR model considers purely epidemiological factors, such as and . Thus, although simple, the logistic equation depends on factors that can only be determined after the end of the epidemic in a given place or region. Besides the logistic function, the so-called Gompertz function can also be employed to describe sigmoidal growth curves [74]. Essentially, the main difference between the two lies in the fact that the Gompertz function is asymmetric about its inflection point: its initial growth is steeper and it approaches the asymptote more gradually than the logistic function. Thus, in the frame of the Covid-19 spread, the use of either the logistic or the Gompertz function will depend on the steepness of the total accumulative number of cases over time, which can be different for each country. Recently, a predictive model employing the Gompertz function, moving regression, and a so-called Hidden Markov Model was reported [75]. In Ref.
[75], the authors provide a numerical fitting of the epidemic data for all countries in order to predict the number of new cases on the following day, and also discuss the disease spread in terms of four stages: the start of the outbreak (lagging), the initial exponential growth, the deceleration, and the stationary phase. The authors of Ref. [75] also provide a real-time platform to monitor the growth acceleration of the Covid-19 spread in all countries, aiming to measure the effectiveness of mobility restrictions in containing the disease.
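
The difference between the two sigmoidal curves mentioned above can be made concrete with a minimal sketch. The standard textbook forms of the logistic and Gompertz functions are used; the parameter names `n_max`, `rate`, and `t_mid` and their values are hypothetical and are not taken from any fit in this work.

```python
import math


def logistic(t, n_max, rate, t_mid):
    """Logistic curve: symmetric about its inflection point at n_max / 2."""
    return n_max / (1.0 + math.exp(-rate * (t - t_mid)))


def gompertz(t, n_max, rate, t_mid):
    """Gompertz curve: asymmetric, with its inflection point at n_max / e."""
    return n_max * math.exp(-math.exp(-rate * (t - t_mid)))


N_MAX, RATE, T_MID = 1.0, 0.2, 30.0  # hypothetical values for illustration only

for t in (10, 20, 30, 40, 60, 90):
    print(t,
          round(logistic(t, N_MAX, RATE, T_MID), 4),
          round(gompertz(t, N_MAX, RATE, T_MID), 4))
```

Both curves saturate at the same plateau, but the inflection point sits at one half of the plateau for the logistic function and at roughly 37% of it for the Gompertz function, which is why the choice between them depends on the shape of the accumulated data.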

The Cayley tree and the Bethe lattice

As a natural continuation of the previous sections, in what follows we present an analysis of the probability of a person being infected with Covid-19 in the frame of percolation theory. Over the past decades, percolation theory [13], [76], [77] has been used to describe the effects of disorder in superconductors [78], diluted magnetic semiconductors [79], traffic networks [80], and also epidemics [81], just to mention a few examples. In condensed matter Physics, some powerful methods have been used to explore the electronic structure of systems of interest. Here, density functional theory (DFT) and dynamical mean-field theory (DMFT) deserve to be mentioned. The latter enables the investigation of the so-called strongly correlated systems, i.e., systems in which interactions between electrons are taken into account, giving rise to exotic phases. In a crude description, DMFT treats an atom as an impurity in a matrix with several electrons under the influence of an effective field [82]. Such an approximation becomes accurate as the coordination number () becomes infinite, i.e., when an infinite number of nearest neighbors is taken into account [82], [83], [84], [85]. Within this context, the application of the Bethe lattice is appropriate. Considering the application of percolation theory in the field of Solid State Physics, we highlight here the so-called Bruggeman's effective medium approximation [86]: where refers to the fraction of one of the phases of interest, and represent, respectively, the dielectric constant of the two distinct phases labeled and , is the effective dielectric constant, and is the so-called shape factor. In the frame of such an approximation, the goal is to analyze the dielectric responses of the two coexisting phases. 
Thus, when one of the phases achieves a critical concentration (the percolation threshold), it will dominate the Physics of the system, significantly altering its dielectric response [87]. Here, upon analyzing the spread of Covid-19, we associate the percolation probability with the probability of a person being infected in terms of the number of people in and out of quarantine, cf. Fig. 6. Fig. 7 shows the probability of infection as a function of the number of connections and their corresponding association with the blocked and non-blocked paths well known in the frame of percolation theory [77].

Fig. 6

Fig. 7

Number of connections (left column) with their corresponding percolation probability terms (middle column) and schematic representation of the possible path configurations (right column). The red lines represent a division of the blue paths. Each configuration has 2^n possible states, n being the number of connections, considering all the possibilities of blocked and non-blocked paths. Details in the main text.

Modeling epidemics employing elementary percolation theory

In order to describe the probability of infection by Covid-19 or any other disease, we make use of elementary concepts in the frame of percolation theory [57], which determines the probability of percolation to take place in terms of non-blocked and blocked connections. The probability of percolation is associated with the probability of a person being infected with Covid-19 through contact with an infected person (Fig. 6). A blocked connection represents an infected person respecting the quarantine and thus “blocking” the spread of the disease. In the same way, the non-blocked connection refers to an infected person not joining the social distancing and thus, unfortunately, contributing to the increase of probability of new people becoming infected. Furthermore, we make use of a simple 3 connections net to describe the Covid-19 spread, i.e., one infected person can infect two other people and so on (Fig. 8). Basically, for each number of connections , we have a total probability , which includes the percolation and non-percolation probabilities, cf. Fig. 7. In the simplest case, i.e., , there are only two possible states, namely the path is non-blocked or it is blocked . Hence, the total probability in this case is given by . For , there are four possible states: connection 1 blocked and connection 2 non-blocked, connection 1 non-blocked and connection 2 blocked, connection 1 and 2 blocked, and connection 1 and 2 non-blocked, being the corresponding probabilities for each state given, respectively, by , , and . In this case, the total probability is the sum of the probabilities of each state, given by , which can be rewritten as . 
Following such a logic, the generalized mathematical function of the total probability , i.e., including the percolative and non-percolative probabilities, in terms of the number of connections reads [57]: Note that includes both percolative and non-percolative probabilities, but we are interested only in the infection (percolative) probability. Thus, we consider only the terms of that contribute to the percolation probability. Considering , for instance, we have: However, the percolation probability, labeled here , is given only by the terms . Essentially, in order to determine the percolation probability, it is necessary to make a combinatory analysis employing the value of considering the probabilities for all possible states to exist. Then, we rule out all the probabilities contributions associated with the states presenting null chance of percolation to take place, making thus the percolation probability as the sum of all the probabilities of existence of all the remained states. For the particular case with , the percolation only occurs when both connections 1 and 2 are non-blocked (Fig. 7), which means that is only given by the term in the total probability. Fig. 7 summarizes an extension of such an analysis for other values of by considering only the terms associated with the percolative probability.
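
The counting argument above can be sketched by brute-force enumeration. The symbols are assumptions introduced for illustration: `p` is the probability that a single connection is non-blocked and `q = 1 - p` that it is blocked. The percolation rule used here (all connections must be non-blocked) corresponds to connections in series and matches the two-connection case described in the text; branching configurations such as those in Fig. 7 would need a different rule.

```python
from itertools import product


def enumerate_states(n, p):
    """Return (state, probability) pairs for n independent connections.

    A state is a tuple of booleans: True = non-blocked, False = blocked.
    p is the probability that a single connection is non-blocked (assumed name).
    """
    q = 1.0 - p  # probability that a single connection is blocked
    states = []
    for state in product((True, False), repeat=n):
        open_count = sum(state)
        states.append((state, p ** open_count * q ** (n - open_count)))
    return states


def chain_percolation_probability(n, p):
    """Percolation probability when the n connections are in series.

    Assumption: the disease passes only if every connection is non-blocked.
    """
    return sum(prob for state, prob in enumerate_states(n, p) if all(state))


p = 0.3  # hypothetical probability of a non-blocked connection
for n in (1, 2, 3):
    states = enumerate_states(n, p)
    total = sum(prob for _, prob in states)
    print(n, len(states), round(total, 12),
          round(chain_percolation_probability(n, p), 6))
```

The enumeration reproduces the two facts used in the text: there are 2^n states for n connections, and their probabilities add up to one, so the percolation probability is obtained by keeping only the states in which the disease can actually pass.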

Fig. 8

(a) Representation of the Bethe lattice for [88], [89] where the zeroth shell is occupied by an infected person (in red color). In this configuration, each infected person can infect two others. For the sake of compactness, we show only four shells in this representation of the Bethe lattice, namely the zeroth (, red color), the first (, orange color), the second (, green color), and the third (, purple color) shells. The starting point is represented by patient zero (highlighted in red color). (b) Representation of the number of people that an infected person (red color) [88] can infect in terms of the number of nearest neighbors employing the Bethe lattice [89] for . (c) Schematic representation of the Cayley tree for for various shells. The zoomed region outlined in red depicts two distinct branches representing two hypothetical cities, 1 and 2. An infected person (green color) is unable to reach city 1, and thus everyone there is healthy (blue color). On the right branch, the infected person can reach city 2, thus spreading the disease. The various colors employed for the people in the right branch represent the various subsequent shells. More details in the main text.

Schematic representation of the possibilities of infection by Covid-19 based on the number of connections . (a) For there are two possible states and , where represents a blocked path (infected person in quarantine) and a non-blocked path (infected person not in quarantine), but not in contact with other people. (b) For there are four possible states, labeled by , , , and . The interaction between an infected (hand outlined in red) and a non-infected person (hand outlined in black) is represented by a handshake. The handshake acceptance (green checkmark) or denial (red x) by each of the persons is also depicted. Only state represents an actual percolation, i.e., an infection, since the contact interaction actually happened. (c) For one infected person (red color) can infect two others by not being in quarantine, while in (d) the person respected the quarantine and avoided infecting two other people (represented by the light red color; blue color indicates non-infected people). For , there are 8 combinatory possibilities, but for the sake of compactness we indicate only two of them. Details in the main text. Figure generated using templates available in Ref. [88].

At this point, we take into account the case for and also make a broader analysis considering the so-called Bethe lattice. As can be inferred from Fig. 7, for there is a path leading to two others; the consecutive paths originating from each of these two are not shown. By taking such consecutive paths into account, we can form infinite shells leading to the formation of a Bethe lattice, as shown in Fig. 8. 
For the sake of completeness, we recall next the formation of such a lattice based on discussions reported in textbooks, such as in Ref. [77]. In the initial condition, a central site with a probability of being occupied and of being empty leads to an amount of paths (coordination number), which also has a site at the end of each path. Thus, although not employed in this work, we can make a direct analogy with the Ising-model for integer spin , where we have configuring spin up or down, and meaning that there is a vacancy in such a site. The set of sites formed by the central site and the paths constitute the zeroth shell of the lattice, cf. indicated in Fig. 8(a). Then, each new site on the boundary of shell zero gives rise to new paths and sites forming shell number one. This can be performed continuously to form other shells until, at the end of the lattice, the last sites have only one path and there is no longer the possibility of forming new shells. In this context, taking the central site as reference so that we can follow a path from the central site to a site at the boundary of the final shell, it is necessary that there are both available paths and occupied sites, configuring a percolation. In this way, it is possible to consider the percolation probability as the probability of obtaining a path allowing the central site to be connected to a site in the final shell of the Bethe lattice. In an analogous way, we represent the non-percolation probability as . If each occupied site with probability leads to paths, we can estimate an average of occupied sites. However, having in mind that a site can be occupied or not, the reduction of available paths at each shell will prevent to reach the final sites and therefore there is no percolation. Upon going from one shell to other if , then the probability of reaching the final shell (percolation) is lowered. Such analysis is important because it leads to the so-called percolation threshold . 
Employing , for instance, we focus now on the factors that may prevent percolation. In this specific case, any site leads to two paths and each of them with a probability of non-percolation. From the standard definitions of sets and probabilities, two events will be statistically independent if and only if the intersection between them can be written as a product of both probabilities [90]. Hence, the probability of non-percolation for will be . Also, the sites must be occupied for the percolation to occur, so that the total probability of non-percolation should be written as . The solutions of the latter are , i.e., there is no percolation, and , which can be equal to zero if and thus percolation takes place. Analyzing the central site and the three paths originating from it, the percolation probability must take into account and discount the probability of non-percolation, i.e., . Replacing here the previous solutions obtained for , we have two different cases. If , which represents the non-percolative condition, i.e., . If , then: which has a phase transition order parameter-like behavior for the percolation. It is worth noting that such analysis is distinct for different values of since it is dependent on the number of possible paths. Also, note that is universal for and does not depend on the length of the path: which is associated with the size of the lattice [77]. Analyzing the simplest case, i.e., a chain composed by a set of sites, it is possible to define the pair correlation function . The latter is associated with the percolation probability between two occupied reference sites separated by a distance . Such a distance incorporates other sites that may be present between the occupied ones. The function is given by [77], where is the so-called correlation length . For , diverges and the . 
In the case of the Covid-19 spread, this would mean that if the number of non-blocked paths reaches such a threshold, i.e., a significant portion of the population does not join the social distancing, then a pronounced increase in the number of new infections over time would take place. The very same mathematical treatment previously discussed can be employed upon considering the branches of the Cayley tree as cities, cf. Fig. 8(c). The effectiveness of the so-called intermittent quarantine has recently been considered [91]. The latter means that nearby cities would join the social distancing on different pre-scheduled days. As a consequence, economic activities can be resumed with a relatively low probability of spreading the disease between such cities. The main idea lies in the fact that if an infected person is unable to reach one of the cities [Fig. 8(c)], the probability of a person being infected in that specific city is lowered, as well as the probability of all subsequent infections that would take place. In order to discuss the Cayley tree, we have used the concept of a regular tree, which means that the branches of the tree are always constructed in the same way employing a fixed number. However, random trees can also be employed [13], [92], such as the Erdős–Rényi network, where the construction of the tree is probabilistic and is not fixed. In the latter case, crossings between branches can take place. Furthermore, the so-called pruning process in random trees is also reported in the literature [93], where each site, or vertex, is systematically removed over time. Initially, there are a few branches and over time the tree reaches a plateau, i.e., the number of branches is minimized and, upon continuing the pruning process, the tree itself disappears. 
Bringing this discussion to the case of the Covid-19 spread through the Cayley tree, we can associate each pruned vertex with a person either passing away or joining the social distancing [Fig. 8(c)]. In analogy with the random tree discussion, upon continuously pruning the tree, such as in the case of people joining the social distancing or passing away, the epidemic fades away, i.e., the tree vanishes. The effects of the selected quarantine can also be analyzed in the frame of the SEIR model, which also takes into account the number of people exposed to the disease. Thus, employing the usual factors for the SEIR model, the number of connected cities, particular regions, states or even countries, and the number of people circulating from one region to another, it is possible to make a forecast of the number of new infections [91], cf. Fig. 8(c).
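
The percolation argument for the Bethe lattice with three paths per site can be sketched numerically. Since the inline formulas are not reproduced in this text, the sketch follows the standard textbook treatment with assumed symbols: `p` is the site occupation probability, `Q` the probability that a branch does not percolate, obeying Q = 1 - p + p Q^2 with solutions Q = 1 and Q = (1 - p)/p, and P = p (1 - Q^3) the percolation probability of the central site. The threshold 1/(z - 1) = 1/2 for z = 3 is the textbook value.

```python
def non_percolation_probability(p):
    """Probability Q that a branch of the z = 3 Bethe lattice does not percolate.

    Q is the smaller root of Q = 1 - p + p * Q**2, i.e. min(1, (1 - p) / p).
    """
    if p <= 0.0:
        return 1.0
    return min(1.0, (1.0 - p) / p)


def percolation_probability(p):
    """Probability P = p * (1 - Q**3) that the central site reaches the last shell."""
    q_branch = non_percolation_probability(p)
    return p * (1.0 - q_branch ** 3)


P_THRESHOLD = 1.0 / (3 - 1)  # textbook threshold 1 / (z - 1) for z = 3

for p in (0.2, 0.5, 0.6, 0.8, 1.0):
    print(p, round(percolation_probability(p), 4))
```

Below the threshold the percolation probability vanishes, and above it the probability grows continuously, which is the order-parameter-like behavior mentioned in the text. In the epidemic analogy, `p` would play the role of the fraction of people not respecting the social distancing.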

Switching on the interactions in the Bethe lattice

The Ising-model can be applied to the Bethe lattice [43]. To this end, let us consider a lattice in which the central site has spin . The latter has neighbors, next-nearest neighbors, and so on until the last shell which leads to , i.e., -th neighbors. The total number of sites on the lattice can be written as [43]: Furthermore, the number of sites in a network should increase with the number of shells [77], as follows: where is the dimensionality of the lattice. For a Bethe lattice, reads [43] For an infinite number of shells, we have [43] that is, for a lattice with infinite shells we also have infinite dimensionality. For the sake of completeness, we present here a textbook-like discussion of the Ising-model in the Bethe lattice. Essentially, our goal is to make a connection between the Bethe lattice and the interaction parameter . Before starting, we recall Eq. (4) where the Hamiltonian for the Ising-model considering longitudinal applied magnetic field is defined. The first step to calculate the physical quantities of interest is the construction of the partition function [43]. The latter represents the sum over all possible accessible states (Zustandssumme) of the system and reads: In a magnetic system, the states are associated with the possible spin orientations. So, replacing Eq. (4) in (30) we have: Considering a particle in a reservoir in which the temperature is fixed and the particle is in a state with energy , the number of remaining accessible states is determined by the multiplicity calculated in terms of the difference between the total energy and . Thus, the probability that the particle occupies a state labeled by can be written as follows [94]: where constant refers to a normalization constant. The Boltzmann expression for the entropy, namely , allows us to calculate the entropy of a system as a function of . 
Considering that the reservoir is much larger than the particle itself, i.e., there are many accessible states inside the reservoir, then . Thus, since is very small compared to , the entropy difference can be expanded in a Taylor series [94]: and, since [94], where is the system’s energy and f.e.p. refers to fixed external parameters, the entropy reads: Thus, can be written as: Eq. (35) can be replaced into Eq. (32) to determine the probability of the particle to be found at state: Since the total energy is fixed constant, we now write that and thus: Furthermore, the total probability is defined as , which allows us to calculate the normalization constant and thus achieve the following expression: Note that the term represents only the particular state and, in the denominator term the sum is over all accessible states. Thus, it is needed to use the index to differ such terms. Eq. (38) can be applied to the Bethe lattice taking into account the possible states [43]. Hence, we rewrite Eq. (30) as a function of a non-normalized probability distribution as: where, by comparison, is the term of the sum in Eq. (31). This is a reasonable association, since the non-normalized probability distribution function can be written in terms of the multiplicity , which in turn is associated with the probability that a particle occupies a certain state. Thus, we have: Now, we focus on the zeroth shell of the Bethe lattice. Eq. (31) is valid for a chain of spins, i.e., a one-dimensional case. However, in order to analyze a bidimensional case, for instance, the Bethe lattice can be employed. It is necessary to take into account interaction between the spin at the central point, labeled by , and the spins at each site of the paths connected to this central point. As previously mentioned, for statistically non-correlated events, the total probability will be given by the product of the individual probabilities of each event separately. 
Hence, the total probability of the system to be at a state reads [43]: where, Eq. (41) gives the probability in terms of the interaction between the spin at the central point and each of its first neighbors at a site , as sketched in Fig. 9. Then, all subsequent interactions between the spin at a site and its nearest neighbors at a site are considered, which is represented by the product. Note that the interactions between spins are not taken into account, since such paths in the Bethe lattice are not connected. As discussed in Section 2, Eq. (5) incorporates, when taking into account infected and non-infected people, the interaction, which is equivalent to the magnetic exchange coupling constant in the Ising-model. Such an equivalence enables us to calculate the probability of a person being infected or not taking into account and the interactions between the person at the central site and its first neighbors , as well as the interaction between a person labeled by and its nearest neighbor (Fig. 9): where, and therefore, . The exponential decay of in terms of the increase of the interacting parameter means that, upon increasing , the probability of a person being infected is reduced. In other words, an increase of reflects an increase in the number of infected people, thus decreasing the number of non-infected people that can be infected. Note that, for the case of two infected people or two non-infected people interacting with each other, the sums in Eqs. (43), (44) are null. Indeed, such sums are non-zero only for the case of an interaction between a non-infected person and an infected person (or vice-versa). It is worth mentioning that Eqs. (43), (44) represent a simple case, in which we consider that each person has contact with only two other people, which might not represent the real case. If one considers more contacts between people, the Ising-model on the Bethe lattice becomes even harder to solve analytically. 
The limitations of applying such a model lie in the lack of data associated with the interaction between people for each region, as well as the total number of healthy people. However, we consider that the analogy between magnetism and epidemiology can pave the way for more complex situations. Thus, we have the probability of the disease spread in terms of the interactions between neighboring people in a similar way as in the previously discussed case for the Ising-model on the Bethe lattice.
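To make the textbook construction above more concrete, the following minimal Python sketch builds a small Cayley tree, evaluates the partition function by summing Boltzmann weights over all spin configurations, and returns the probability that the central spin points up. It is only an illustration under stated assumptions and not the analytic treatment of Ref. [43]: the function names, the sign convention E = -J * sum(s_i * s_j) - H * sum(s_i) for the Hamiltonian, and the use of beta = 1/(k_B T) as an input parameter are choices made here. Brute-force enumeration is feasible only for a handful of sites.

```python
import itertools
import math


def build_cayley_tree(q, n_shells):
    """Return (edges, shell_sizes) for a Cayley tree with coordination number q.

    Site 0 is the central site; shell r contains q * (q - 1)**(r - 1) sites.
    """
    edges = []
    shell_sizes = []
    frontier = [0]      # sites of the current outermost shell
    next_label = 1
    for _ in range(n_shells):
        new_frontier = []
        for parent in frontier:
            # the central site has q neighbours, any other site has q - 1 new ones
            n_children = q if parent == 0 else q - 1
            for _ in range(n_children):
                edges.append((parent, next_label))
                new_frontier.append(next_label)
                next_label += 1
        shell_sizes.append(len(new_frontier))
        frontier = new_frontier
    return edges, shell_sizes


def energy(spins, edges, J, H):
    """Ising energy with the (assumed) convention E = -J*sum s_i s_j - H*sum s_i."""
    interaction = sum(spins[i] * spins[j] for i, j in edges)
    return -J * interaction - H * sum(spins)


def central_spin_up_probability(q, n_shells, J, H, beta):
    """Boltzmann probability that the central spin is +1, by full enumeration."""
    edges, shell_sizes = build_cayley_tree(q, n_shells)
    n_sites = 1 + sum(shell_sizes)
    Z = 0.0       # partition function: sum of Boltzmann weights over all states
    Z_up = 0.0    # same sum restricted to states with central spin +1
    for spins in itertools.product((-1, 1), repeat=n_sites):
        weight = math.exp(-beta * energy(spins, edges, J, H))
        Z += weight
        if spins[0] == 1:
            Z_up += weight
    return Z_up / Z


if __name__ == "__main__":
    _, shells = build_cayley_tree(q=3, n_shells=2)
    print("sites per shell:", shells)
    for H in (0.0, 0.2, 0.5):
        prob = central_spin_up_probability(3, 2, J=1.0, H=H, beta=1.0)
        print(f"H = {H:.1f}: P(central spin up) = {prob:.4f}")
```

For q = 3 and two shells the tree has 10 sites, so the sum runs over 2^10 = 1024 configurations. Without a field (H = 0) the two orientations of the central spin are equally likely by symmetry, and a positive field favors the up orientation.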

Fig. 9

Representation of spins distributed on the Bethe lattice [89] with showing a spin (red arrow) at the central point and other neighboring spins at sites (orange color arrow) and (green arrow). More details in the main text.

The Bethe lattice and the spread of Covid-19

Bringing this whole discussion to the context of the Covid-19 spread, we can associate an occupied site in the Bethe lattice with an infected person and the paths with the displacement of such people and their possibility of infecting healthy people (Fig. 8). If an infected person travels to the end of a path and finds a healthy person, that person becomes infected and can go on to infect two others, for example. However, if a healthy person remains in quarantine and is not in the infection path, the site becomes empty, reducing the probability of percolation. In the same way, if an infected person respects the quarantine, the path leading to the contamination is interrupted and the percolation probability is also reduced. It is clear that upon increasing the number of possible paths the calculation becomes too complex to be carried out by hand. Hence, aiming to deal with such a complexity, advanced data analysis techniques are required. These include, for instance, Markov Chain Monte Carlo methods such as those implemented in the emcee package, and Machine Learning based algorithms such as the ones available in the scikit-learn package [95], [96], [97], [98], [99], [100]. Several tracking methods have been employed in order to collect big-data sets of millions of internet users aiming to make a proper mathematical description of a collective behavior, such as epidemic outbreaks. Among such methods, ARGO (AutoRegression with GOogle search data) stands out, having been applied, for instance, to influenza epidemics [101]. Recently, the ARGO method was employed in the case of Covid-19 to make a real-time forecast of the disease spread in small provinces in China [102]. As discussed previously, the behavior of the percolation probability as a function of is altered when the number of connections is increased (Fig. 10). 
For a system presenting a relatively high number of connections , a percolation probability distribution takes place. This is the case, for instance, of crowded places, as in the so-called favelas in Brazil, where each person represents a connection that may contribute to the increase of the percolation probability distribution, i.e., the dissemination of the disease. In order to analyze the behavior of such a distribution as a function of for various values of , it is natural to employ a distribution function. Indeed, note that the percolation probability polynomial function (Fig. 7) can be approximated by [57], [103]: where refers to the critical value of connections. Equation (45) is also called the Cauchy distribution function [103]. Both Eq. (45) and Eq. (13) represent probability distributions, but the two functions are asymptotically different, so that Eq. (45) cannot be rewritten in the form of Eq. (13) and vice-versa.
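As a hedged illustration of how the percolation probability depends on the fraction of non-blocked connections and on the number of connections per site, the sketch below evaluates the standard textbook result for bond percolation on an infinite Bethe lattice. It is swapped in here for illustration only: it is neither the polynomial of Fig. 7 nor Eq. (45). The names p (fraction of non-blocked connections), z (connections per site), and all function names are assumptions of this sketch. The quantity Q is the probability that a single branch does not lead to infinitely many sites; it is the smallest root of Q = 1 - p + p*Q^(z-1), and the percolation probability is 1 - Q^z.

```python
def branch_failure_probability(p, z, tol=1e-12, max_iter=100_000):
    """Probability Q that one branch leaving a site does NOT reach infinity.

    Q is the smallest root in [0, 1] of Q = 1 - p + p * Q**(z - 1).
    Iterating from Q = 0 converges monotonically to that root.
    """
    Q = 0.0
    for _ in range(max_iter):
        Q_next = 1.0 - p + p * Q ** (z - 1)
        if abs(Q_next - Q) < tol:
            return Q_next
        Q = Q_next
    return Q


def percolation_probability(p, z):
    """Probability that a given site is connected to infinitely many sites."""
    Q = branch_failure_probability(p, z)
    return 1.0 - Q ** z


def critical_fraction(z):
    """Bethe-lattice percolation threshold p_c = 1 / (z - 1)."""
    return 1.0 / (z - 1)


if __name__ == "__main__":
    for z in (3, 4, 5):
        row = ", ".join(
            f"P({p:.1f}) = {percolation_probability(p, z):.3f}"
            for p in (0.2, 0.4, 0.6, 0.8)
        )
        print(f"z = {z} (p_c = {critical_fraction(z):.3f}): {row}")
```

In this textbook picture the percolation probability vanishes below p_c = 1/(z - 1) and rises more steeply for larger z, so a smaller fraction of non-blocked connections suffices for the spread when each person has more contacts.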

Fig. 10

Percolation probability as a function of the non-blocked connections for various number of net connections employing (a) the percolation probabilities polynomials shown in Fig. 7 and (b) Eq. (45). Details in the main text.

Data analysis and discussion in the frame of the percolation theory

Making use of the adaptation of the percolation theory for the Covid-19 spread previously discussed, we now discuss the effects on the percolation probability upon increasing the number of connections , as shown in Fig. 10. Upon comparing, for instance, the behavior of the percolation probability for and 5, we observe that grows more rapidly for higher values of , indicating that the larger the number of connections , the faster the percolation probability increases in terms of the non-blocked paths . For finite values of , the percolation probability is enhanced upon increasing the number of non-blocked paths , i.e., the probability of people getting infected by Covid-19 increases with the number of people not in quarantine. However, when the percolation probability approaches 1 for , i.e., if people all around the world were to interrupt the social distancing, the probability of people getting infected by Covid-19 would be enhanced dramatically. In fact, the more people get infected, the more connections are established, thus decreasing the minimal number of non-blocked paths (people not respecting the quarantine) needed for the percolation, i.e., the spread of the disease, to occur. In other words, the probability of people getting infected by Covid-19 in terms of the people not respecting the quarantine grows faster when the total number of connections (infections) is increased. An interesting analogy for such a percolation probability can also be drawn. Since the fraction of people in quarantine and non-infected people are linked to each other, i.e., as one increases the other decreases so that = 1, one can make an analogy with the Physics of semiconductors and the well-established law of mass action [58]. The latter establishes that the product of the density of electrons and holes at a certain temperature is constant and it also depends on the product of their corresponding masses. 
In other words, the amount of electrons promoted from the valence to the conduction band is equal to the number of remaining holes in the valence band. As an analogy, and can be interpreted in the same way since any increase (decrease) in implies a decrease (increase) in exactly the same proportion in .
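The qualitative statement that more connections lower the fraction of non-blocked paths needed for the disease to spread can be explored with a small Monte Carlo experiment. The sketch below is a toy model under explicit assumptions and is not the analysis behind Fig. 10: the spread is treated as a branching process on a tree in which the first infected person has z contacts and every later one has z - 1 new contacts, each contact is non-blocked with an independent probability p, and an outbreak is counted as spreading if it reaches a chosen shell. The function names and the parameters n_shells, cap, trials, and seed are hypothetical. The cap is an approximation: once that many people are infected in one shell, the outbreak is assumed not to die out.

```python
import random


def outbreak_reaches_shell(z, p, n_shells, rng, cap=200):
    """Simulate one outbreak that starts from a single infected person.

    Returns True if the infection reaches shell n_shells (or the number of
    newly infected people in a shell hits the cap), False if it dies out.
    """
    infected = 1
    for shell in range(1, n_shells + 1):
        # first generation: z contacts; later generations: z - 1 new contacts
        contacts_each = z if shell == 1 else z - 1
        new_infected = sum(
            1 for _ in range(infected * contacts_each) if rng.random() < p
        )
        if new_infected == 0:
            return False    # every path is blocked: transmission is interrupted
        if new_infected >= cap:
            return True     # assumed: such a large front no longer dies out
        infected = new_infected
    return True


def estimate_spread_probability(z, p, n_shells=15, trials=2000, seed=0):
    """Fraction of simulated outbreaks that reach shell n_shells."""
    rng = random.Random(seed)
    hits = sum(outbreak_reaches_shell(z, p, n_shells, rng) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for z in (3, 5):
        for p in (0.2, 0.4, 0.6, 0.8):
            estimate = estimate_spread_probability(z, p, trials=500)
            print(f"z = {z}, p = {p:.1f}: spread probability ~ {estimate:.3f}")
```

Within this toy model, blocking more paths (smaller p) interrupts the chain of transmission, while a larger number of contacts z lets the outbreak survive at values of p for which it would otherwise die out.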

What is next?

Besides the need for a more appropriate mathematical approach to properly describe the epidemic curves of a particular disease, there are other inherent factors of our society that can be developed in order to minimize the social impact of future epidemics. These include the recording and dissemination of reliable data sets on epidemics by health agencies in order to avoid or properly curb the outbreak of a disease. The development and design of mobile apps [46] can serve as an important tool for more efficient social distancing. Such mobile apps can indeed be very useful in the management of the disease spread. However, their use collides with ethical aspects regarding the control of each individual’s information and locations by the government, which in many cases can preclude the application of such apps. As pointed out in Ref. [104], the epidemic data must be recorded and analyzed together with other relevant data sources, such as demographic, genetic, and travel patterns for the various locations and temporal scales, aiming at the prevention and the containment of epidemics. Furthermore, health agencies must plan properly for the provision of hospital beds in the case of an epidemic, since the speed at which such beds can be made available differs between countries, thus impacting the treatment of infected people. Even though the coronavirus was already known by the scientific community a few years ago [105], the outbreak of the disease could not be avoided. Hence, it becomes clear that the discovery of a new virus should be accompanied, if possible, by the discovery of antidotes and/or vaccines. This points to the urgent need for high-level scientific research focused on the prevention or management of future viral epidemics. 
Also, the incorporation of discussions on the risks and possible consequences of epidemics in the curriculum of primary and high schools may play an important role in raising the awareness of younger people regarding prophylactic measures, which could, indeed, reduce the impact of future epidemics. The impact of the Covid-19 vaccination of the population on the epidemic curves was not discussed here since a vaccine was still under development during the preparation of this manuscript. However, as discussed in the literature [106], vaccinated individuals can still spread the Covid-19 virus, the main goal of the vaccine being to significantly attenuate the severity of the symptoms which may lead to a fatality. Based on the current status, the short-term effect of the vaccination is most likely to impact the epidemic curves associated with new infections [107]. This is because, taking into account the level of efficacy of the vaccines, vaccination leads to immunity of the population and, consequently, the number of new infections should be reduced. Nevertheless, there is evidence that pandemic control based on vaccines may rely on a continuous modification of the Covid-19 vaccine due to the virus’ mutations [108], [109], [110], which means that a new variant of it should be taken periodically by the population, as is the case, for instance, for the flu vaccine [111]. The latter emphasizes the importance of a continuous investigation of the virus’ mutations aiming to improve the efficacy of the vaccines. Yet, the number of people in quarantine can still be considered a crucial factor to attenuate the number of new cases in a short period of time, as discussed in Section 2.1. It is clear that the reduction of the number of new cases shall lead to the reduction of the number of fatalities.

Conclusions

We have reviewed the key aspects related to the SIR model for various epidemics and provided a brief discussion about its mathematical description. An adaptation of an Ising-like model was presented and we deduced the logistic function, usually employed to describe epidemic curves among other phenomena. We have shown that the temporal evolution of the number of infections and fatalities, described by the logistic function, has some resemblance with a distorted Fermi–Dirac-like distribution function found in the celebrated Landau Fermi-liquid theory. Our analysis demonstrated that a Gaussian-type function suffices to describe the epidemic curves and that the quarantine plays a crucial role in the mitigation of the Covid-19 spread. The fundamental concepts of the Cayley tree and the Bethe lattice were discussed and a connection with the Ising-model was made. Yet, we have demonstrated that the percolation (infection) probability in terms of the number of people not respecting the social distancing sets in more rapidly when the total number of cases is increased, thus making evident the importance of the quarantine in the suppression of the Covid-19 spread. We hope that the governmental health agencies can benefit from it. The present work is, to some extent, an appeal to world leaders to adopt the social distancing to curb the spread of Covid-19 before a vaccine is discovered and released. Such social distancing is crucial since there is possible under-reporting of the number of new infection cases, which could mask an even worse scenario of the number of infections. Our appeal is relevant in the attempt to prevent new waves or even small outbreaks of infections. This is particularly true in countries, such as, for instance, India and some African countries, for which the epidemic curve is still rising to date. 
The mass media plays an important role in the dissemination of reliable information aiming to properly inform the population, and it can also be considered, in some cases, as an external factor that can indirectly influence the shape of the epidemic curves. Furthermore, we have reviewed the basics of percolation theory and employed it to describe the Covid-19 spread. The analytic solutions for the mathematical models, presented here, can serve as a basis for more complex cases involving numerical calculations. We have provided several distinct paths that can be employed to describe the epidemic curves for any disease. Such elementary discussions presented here can be useful in the application to other collective phenomena. As discussed by the authors of Ref. [112], it is not straightforward to make a forecast of the epidemic curves since the variables associated with the epidemic might change over time. Indeed, as pointed out in Ref. [113], the outbreak of epidemics should be treated in a much broader context. The knowledge from various other research areas besides epidemiology, such as logistics and crisis management, can play a crucial role in a deeper discussion about the prevention of epidemics [113]. The challenges of the twenty-first century, e.g., poverty, climate change, and urbanization, significantly contribute to making epidemics more likely [113]. Also, it is crucial to learn from the negative outcomes of past epidemics, such as the Ebola outbreak in Congo [114], aiming to prevent other possible future epidemics. Furthermore, several countries in the world are experiencing the epidemic scenario for the first time, which makes it difficult for such countries to properly manage the situation. Thus, a more comprehensive and modern approach to the containment of epidemics must be developed. Also, the inclusion of strategic plans in future public health governmental policies is mandatory. 
Last but not least, this work was written during our period of quarantine.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
