An Overview of Discrete Distributions in Modelling COVID-19 Data Sets.

Ehab M Almetwally^1,2, Sanku Dey³, Saralees Nadarajah⁴.

Abstract

The mathematical modeling of the coronavirus disease 2019 (COVID-19) pandemic has been attempted by a large number of researchers since the very first cases were reported worldwide. The purpose of this research work is to review and classify statistical models for COVID-19 data and to determine the best-fitting model for the daily count of new COVID-19 fatalities, which requires discrete distributions. Some discrete models are checked and reviewed, such as the binomial, Poisson, hypergeometric, discrete negative binomial, beta-binomial, Skellam, beta negative binomial, Burr, discrete Lindley, discrete alpha power inverse Lomax, discrete generalized exponential, discrete Marshall-Olkin generalized exponential, discrete Gompertz-G-exponential, discrete Weibull, discrete inverse Weibull, exponentiated discrete Weibull, discrete Rayleigh, and new discrete Lindley. The probability mass function and the hazard rate function are addressed. The discrete models are discussed based on the maximum likelihood estimates of their parameters. A numerical analysis uses the daily count of new deaths in Angola, Ethiopia, French Guiana, El Salvador, Estonia, and Greece. The empirical findings are interpreted in depth. © Indian Statistical Institute 2022, Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Keywords: COVID-19; discrete distributions; hazard rate; maximum likelihood estimation; survival discretization

Year: 2022 PMID: 36105539 PMCID: PMC9461386 DOI: 10.1007/s13171-022-00291-6

Source DB: PubMed Journal: Sankhya Ser A ISSN: 0976-836X

Introduction

The coronavirus disease COVID-19 was first reported in early December 2019 in Wuhan, China, and within three months it had spread around the whole globe. The World Health Organization (WHO) declared COVID-19 a pandemic on March 11, 2020. Figures 1 and 2 show the daily new cases and deaths. Despite the drastic, large-scale containment measures implemented in most countries, these numbers increased rapidly every day, posing an unprecedented threat to the global health and economy of interconnected human societies. Countries around the world have therefore increased their efforts to decrease the COVID-19 spread rate.

Figure 1

Daily new cases worldwide, by WHO region.

Figure 2

Daily new deaths worldwide, by WHO region.

Several mathematical and statistical models in the literature describe the dynamics of the evolution of COVID-19 and are used to model daily cases and deaths worldwide. The comparison of the COVID-19 epidemic dynamics among different countries is of great concern, and researchers are making their best efforts to provide drugs and vaccines that reduce the risk of virus spread. The study of daily counts requires discrete distributions. The first question that comes to mind is: why do we need discrete distributions? Most of the current continuous distributions do not adequately fit COVID-19 count data. It is also of great interest to study COVID-19 further and to compare as many countries as possible. Therefore, in this article, an effort has been made to compare the COVID-19 outbreak in several countries around the world.

Many authors have recently introduced discrete distributions. Al-Babtain et al. (2020a, ??) implemented the natural discrete Lindley distribution to model daily cases and deaths worldwide. Almetwally et al. (2020) introduced a discrete Marshall-Olkin generalized exponential distribution to study the daily cases in Egypt. Elbatal et al. (2022) obtained the discrete odd Perks-G class of distributions. Almetwally et al. (2022) introduced the discrete Marshall-Olkin inverse Topp-Leone distribution with application to COVID-19 data. Nagy et al. (2021) discussed the discrete extended odd Weibull exponential distribution with different applications. Gillariose et al. (2021) proposed a discrete generalization of the exponential model. El-Morshedy et al. (2020) analyzed a new discrete distribution, called discrete generalized Lindley, to examine the counts of daily coronavirus cases in Hong Kong and new daily fatalities in Iran. Maleki et al. (2020) used an autoregressive time series model based on the two-piece scale mixture normal distribution to estimate the recovered and reported cases of COVID-19. Hasab et al. (2020) used the susceptible-infected-recovered (SIR) epidemic model to describe the novel coronavirus epidemic in Egypt. Nesteruk (2020) and Batista (2020b) predicted the daily new COVID-19 cases in China using the SIR model. Batista (2020a) used a logistic growth regression model to estimate the final size of the coronavirus outbreak and its peak time.

This research work aims to model the daily new fatalities of COVID-19 by reviewing statistical models and determining the best-fitting model for different countries, namely Angola, Ethiopia, French Guiana, El Salvador, Estonia, and Greece, and to raise awareness of the risks resulting from the spread of the coronavirus in the world. To accomplish this goal, we first study models such as the Poisson, geometric, negative binomial, discrete Burr, discrete Lindley, discrete alpha power inverse Lomax, discrete generalized exponential, discrete Marshall-Olkin generalized exponential, discrete Gompertz-G-exponential, discrete Weibull, discrete inverse Weibull, exponentiated discrete Weibull, discrete Rayleigh, and new discrete Lindley. Second, for Angola, Ethiopia, French Guiana, El Salvador, Estonia, and Greece, we identify the discrete models that best fit the daily coronavirus death datasets.

The remainder of the paper is structured as follows. Discrete models are analyzed in Section 2. Section 3 reviews discrete models based on the survival discretization method. We discuss the parameter estimation of the discrete models in Section 4. Section 5 presents the daily new deaths of COVID-19 in Angola, El Salvador, Estonia, and Greece to validate the use of the models on count data. Lastly, conclusions are given in Section 6.

Review for Classical Discrete Models

In this section, the survival discretization method and some discrete distributions are reviewed, such as the binomial, Poisson, hypergeometric, discrete Burr, discrete Lindley, discrete alpha power inverse Lomax, discrete generalized exponential, discrete Marshall-Olkin generalized exponential, discrete Gompertz-G-exponential, discrete Weibull, discrete inverse Weibull, exponentiated discrete Weibull, discrete Rayleigh, new discrete Lindley, negative binomial, beta-binomial, Skellam, beta negative binomial, and Conway–Maxwell–Poisson distributions. We do not use the logarithmic, Borel, discrete compound Poisson, Boltzmann, Benford’s law, Yule–Simon, Zipf’s law, and zeta distributions because their support does not include 0.

Binomial Distribution

The binomial (Binom) distribution can be defined, using the binomial expansion, as the distribution of a random variable X for which

$$P(X = x) = \binom{n}{x} p^{x} q^{n-x}, \quad x = 0, 1, \ldots, n,$$

where p + q = 1, 0 < p < 1, and n is a positive integer. If n = 1, the distribution is called the Bernoulli distribution.

Poisson Distribution

If a random variable X has a Poisson (Pois) distribution with parameter 𝜃, then its PMF is given by

$$P(X = x) = \frac{e^{-\theta}\,\theta^{x}}{x!}, \quad x = 0, 1, 2, \ldots,$$

where 𝜃 > 0. For more information, see Johnson et al. (2005), chapter 4.

Hypergeometric Distribution

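Suppose a sample of n balls is drawn without replacement from a population of N balls, of which N𝜃 are white and N − N𝜃 are black. The PMF of the hypergeometric distribution of the number X of white balls in the sample is given by

$$P(X = x) = \frac{\binom{N\theta}{x}\binom{N - N\theta}{n - x}}{\binom{N}{n}}, \quad \max(0,\, n - N + N\theta) \le x \le \min(n,\, N\theta).$$

For more information, see Johnson et al. (2005), chapter 6.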

Waring Distribution

The distribution of Waring is a generalization of the distribution of Yule, see Johnson et al. (2005). Taking Pr[X = x] , Waring expansion proportional to the (x + 1) term in the sequence. where α,𝜃 > 0. If α = 1 then, Yule distribution is the special case of Waring distribution.

Yule–Simon Distribution

The Yule–Simon distribution is a discrete probability distribution named after Udny Yule and Herbert A. Simon in probability and statistics. Originally, Simon named it the distribution of Yule by Simon (1955). The Yule-Simon (𝜃) PMF is where 𝜃 > 0. The CDF of Yule-Simon distribution is The hr function of the Yule-Simon distribution is given by

Discrete Rectangular Distribution

In its most general form, the discrete rectangular distribution (sometimes called the discrete uniform distribution) is defined by

Distribution of Leads

The distribution of leads in coin tossing (Johnson et al. 2005) has the PMF

Beta-binomial Distribution

The beta-binomial (Bbinom) distribution in probability theory and statistics is a family of discrete probability distributions on finite support of non-negative integers occurring when the probability of success is either unknown or random in each of a fixed or known number of Bernoulli trials, is discussed by Griffiths (1973). The PMF of Bbinom is given by the CDF of Bbinom is given by where FGH (a,b,x) is the generalized hypergeometric function.

Negative Binomial Distribution

The negative binomial (Nbinom) distribution in probability theory and statistics is a discrete probability distribution that models the number of successes before a given non-random number of failures (denoted by r) occurs in a series of independent and identically distributed Bernoulli trials. The PMF of Nbinom is given by

$$P(X = x) = \binom{x + r - 1}{x} p^{x} (1 - p)^{r}, \quad x = 0, 1, 2, \ldots,$$

where r is the number of failures until the experiment is stopped (an integer, but the definition can also be extended to real numbers) and p is the success probability in each trial. For more information, see Johnson et al. (2005), chapter 5.

Geometric Distribution

The geometric distribution in probability theory and statistics is the probability distribution of the number X of failures before the first success, supported on the set {0, 1, 2, …}. The PMF is given as

$$P(X = x) = p\,(1 - p)^{x}, \quad x = 0, 1, 2, \ldots,$$

where 0 < p < 1.

Beta Negative Binomial Distribution

A beta negative binomial (BNbinom) distribution in probability theory is the probability distribution of a discrete random variable X equal to the number of failures needed to achieve r successes in a series of independent Bernoulli trials in which the probability p of success in each trial, though constant in any given experiment, is itself a random variable following a beta distribution. Wang (2011) discussed this distribution as a compound probability distribution. The PMF of BNbinom is given by where x = 0, 1, 2, … and r, α, β > 0.

Logarithmic Distribution

A random variable X is said to have a logarithmic distribution with parameter 𝜃 if its PMF is in the form For more information of logarithmic distribution, see Johnson et al. (2005) chapter 7.

Skellam Distribution

Let μ1, μ2 > 0. Skellam (1946) introduced the Skellam distribution, the distribution of the difference between two independent Poisson random variables. It is denoted by Skellam(μ1, μ2), and its PMF is given by

$$P(X = x) = e^{-(\mu_1 + \mu_2)} \left(\frac{\mu_1}{\mu_2}\right)^{x/2} I_{|x|}\!\left(2\sqrt{\mu_1 \mu_2}\right),$$

where x = …, −2, −1, 0, 1, 2, … and $I_k(\cdot)$ is the modified Bessel function of the first kind.

Conway–Maxwell–Poisson Distribution

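Shmueli et al. (2005) discussed the Conway–Maxwell–Poisson (CMP) distribution with PMF

$$P(X = x) = \frac{\lambda^{x}}{(x!)^{\theta}\, Z(\lambda, \theta)}, \quad x = 0, 1, 2, \ldots,$$

where $Z(\lambda, \theta) = \sum_{j=0}^{\infty} \lambda^{j} / (j!)^{\theta}$ is the normalization constant.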

Review for Discrete Models Based on Survival Discretization Method

In the statistics literature, various methods are available to obtain a discrete distribution from a continuous one. The most commonly used technique is the survival discretization method. It requires the existence of the cumulative distribution function (CDF); the survival function should be continuous and non-negative, and time is divided into unit intervals. The PMF of the discrete distribution is defined in Roy (2003) as

$$P(X = x) = S(x) - S(x + 1),$$

where x = 0, 1, 2, …, S(x) = P(X ≥ x) = 1 − F(x; Θ), F(x; Θ) is the CDF of the continuous distribution, and Θ is a vector of parameters. The random variable X is said to have the discrete distribution if its CDF is given by

$$P(X \le x) = 1 - S(x + 1) = F(x + 1; \Theta).$$

The hazard rate is given by

$$hr(x) = \frac{P(X = x)}{S(x)}.$$

The reversed failure rate of the discrete distribution is given as

$$rfr(x) = \frac{P(X = x)}{1 - S(x + 1)}.$$
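The discretization step can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the helper names `discretize` and `hazard` and the rate `lam = 0.5` are assumptions chosen for the example. It discretizes the continuous exponential survival function and checks that the result is the geometric-type PMF (1 − 𝜃)𝜃^x with 𝜃 = e^{−λ}, whose hazard rate is constant.

```python
import math

def discretize(survival):
    """Return the PMF x -> S(x) - S(x + 1) of the discretized variable (hypothetical helper)."""
    def pmf(x):
        return survival(x) - survival(x + 1)
    return pmf

def hazard(survival, x):
    """Discrete hazard rate P(X = x) / S(x), with S(x) = P(X >= x)."""
    return (survival(x) - survival(x + 1)) / survival(x)

# Continuous exponential survival function; the rate is an illustrative value.
lam = 0.5

def exp_survival(t):
    return math.exp(-lam * t)

pmf = discretize(exp_survival)
theta = math.exp(-lam)  # on the integers, S(x) = theta ** x

# The probabilities telescope, so they sum to S(0) - S(200), which is essentially 1.
probs = [pmf(x) for x in range(200)]

print(pmf(3), (1 - theta) * theta ** 3)     # the two values agree
print(hazard(exp_survival, 5), 1 - theta)   # constant hazard 1 - theta
```

Any other continuous survival function can be passed to `discretize` in the same way; only the function `S` changes.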

Discrete Burr Distribution

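The PMF of the discrete Burr (DB) distribution, defined by Krishna and Pundir (2009), is given by

$$P(X = x) = \theta^{\ln(1 + x^{\alpha})} - \theta^{\ln(1 + (x + 1)^{\alpha})},$$

where x = 0, 1, 2, …, α > 0, and 0 < 𝜃 < 1. The CDF of the DB distribution is

$$F(x) = 1 - \theta^{\ln(1 + (x + 1)^{\alpha})}.$$

The hazard rate (hr) of the DB distribution is

$$hr(x) = 1 - \theta^{\ln(1 + (x + 1)^{\alpha}) - \ln(1 + x^{\alpha})}.$$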
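As a quick numerical check, the sketch below evaluates the discrete Burr PMF under the assumption that it has the Krishna and Pundir (2009) form 𝜃^{ln(1+x^α)} − 𝜃^{ln(1+(x+1)^α)}. The function name is hypothetical; the parameter values are the rounded Angola estimates from the DB column of Table 2, and multiplying the PMF by the 51 observed days gives values close to the fitted frequencies listed there.

```python
import math

def discrete_burr_pmf(x, alpha, theta):
    """PMF via survival discretization, assuming S(x) = theta ** ln(1 + x ** alpha)."""
    def surv(t):
        return theta ** math.log(1.0 + t ** alpha)
    return surv(x) - surv(x + 1)

# Rounded estimates reported for Angola in Table 2 (DB column).
alpha_hat, theta_hat = 4.8597, 0.814
n_days = 51

expected = [n_days * discrete_burr_pmf(x, alpha_hat, theta_hat) for x in range(8)]
for x, e in enumerate(expected):
    print(x, round(e, 4))
```

Small differences in the last digits are expected because the tabulated estimates are rounded to four decimals.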

Discrete Lindley Distribution

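The PMF of the discrete Lindley (DLi) distribution, defined by Gómez-Déniz and Calderín-Ojeda (2011), is given as follows:

$$P(X = x) = \frac{\theta^{x}}{1 - \ln\theta}\left[\theta \ln\theta + (1 - \theta)\left(1 - (x + 1)\ln\theta\right)\right],$$

where x = 0, 1, 2, … and 0 < 𝜃 < 1. The CDF of the DLi distribution is

$$F(x) = 1 - \frac{\theta^{x + 1}\left(1 - (x + 2)\ln\theta\right)}{1 - \ln\theta}.$$

The hazard rate of the DLi distribution is

$$hr(x) = \frac{\theta \ln\theta + (1 - \theta)\left(1 - (x + 1)\ln\theta\right)}{1 - (x + 1)\ln\theta}.$$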

Discrete Alpha Power Inverse Lomax

The discrete alpha power inverse Lomax (DAPIL) distribution is introduced by Almetwally and Ibrahim (2020). The PMF and the CDF of the DAPIL distribution are respectively given by The hr function of the DAPIL distribution is given by

Discrete Generalized Exponential Distribution

The PMF of the discrete generalized exponential (DGE) distribution, defined by Nekoukhou et al. (2013), is given as follows where x = 0, 1, 2, …, α > 0, and 0 < 𝜃 < 1, with 𝜃 = e^{−λ} and λ > 0. The CDF of the DGE distribution is The hazard rate of the DGE distribution is

The DMOGEx Distribution

The discrete Marshall-Olkin Generalized exponential (DMOGEx) distribution is introduced by Almetwally et al. (2020). The PMF and the CDF of the DMOGEx distribution are respectively given by and where 0 < ρ < 1, λ,𝜃 > 0. The hr function of the DMOGEx distribution is given by

Discrete Gompertz-G Exponential

The discrete Gompertz-G- exponential (DGzEx) distribution was introduced by Eliwa et al. (2020). The PMF and the CDF of the DGzEx distribution are respectively given by and The hr function of the DGzEx distribution is given by

Discrete Weibull

A discrete Weibull (DW) distribution was introduced by Nakagawa and Osaki (1975), and is defined by the cumulative distribution function (CDF) as: The DW distribution has PMF: and the hazard rate of DW is

Discrete Inverse Weibull

A discrete inverse Weibull (DIW) distribution was introduced by Jazi et al. (2010), and is defined by the CDF: The DIW distribution has PMF: and the hazard rate of DIW is

Exponentiated Discrete Weibull

The exponentiated discrete Weibull (EDW) distribution was introduced by Nekoukhou and Bidram (2015), and is defined by the CDF: The EDW distribution has PMF: and the hazard rate of EDW is The discrete Rayleigh (DR) distribution was introduced by Roy (2004), and can be defined when α = 2 and β = 1 as follows: The DR distribution has PMF: and the hazard rate of DR is

New Discrete Lindley

The new discrete Lindley (NDL) distribution was introduced by Al-Babtain et al. (2020a, ??), and is defined by the CDF: The NDL distribution has PMF: and the hazard rate of NDL is

Parameter Estimation of Discrete Model

In this section, we estimate the parameters of the models using the maximum likelihood method. Maximum likelihood is widely used to estimate the unknown parameters of a statistical model because maximum likelihood estimates (MLEs) have several desirable properties; for example, they are asymptotically unbiased, consistent, and asymptotically normally distributed. Let $x_1, x_2, \ldots, x_n$ be a random sample of size n from the discrete distribution. The log-likelihood function is then given by

$$\ell(\Theta) = \sum_{i=1}^{n} \ln P(X = x_i; \Theta),$$

where $\Theta = (\Theta_1, \ldots, \Theta_k)$ and k is the length of Θ. The MLEs of Θ, say $\hat\Theta = (\hat\Theta_1, \ldots, \hat\Theta_k)$, are obtained by setting the first partial derivatives of the log-likelihood function equal to zero,

$$\frac{\partial \ell(\Theta)}{\partial \Theta_j} = 0, \quad j = 1, \ldots, k,$$

and solving these equations with a computational procedure such as the k-variable Newton-Raphson algorithm. For interval estimation and hypothesis tests on the model parameters, we require the information matrix. The k × k observed information matrix is

$$I(\Theta) = -\left[\frac{\partial^{2} \ell(\Theta)}{\partial \Theta_i \, \partial \Theta_j}\right]_{i,j = 1, \ldots, k}.$$

One can use the asymptotic normal distribution of $\hat\Theta$ to construct approximate confidence intervals for the parameters. Indeed, an asymptotic 100(1 − ξ)% confidence interval for each parameter $\Theta_j$, j = 1, …, k, is given by

$$\hat\Theta_j \pm z_{\xi/2} \sqrt{\hat I^{\,jj}},$$

where $\hat I^{\,jj}$ denotes the (j, j) diagonal element of $I^{-1}(\hat\Theta)$ and $z_{\xi/2}$ is the (1 − ξ/2)th quantile of the standard normal distribution.
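The estimation procedure can be illustrated for a one-parameter model. The sketch below is an assumption-laden example rather than the authors' code: it applies Newton-Raphson to the geometric log-likelihood ℓ(p) = n ln p + (Σx) ln(1 − p), using the Angola frequencies listed in Table 1, and builds a 95% interval from the observed information. All function and variable names are hypothetical.

```python
import math
from statistics import NormalDist

# Observed frequencies of daily deaths in Angola (columns x and Freq. of Table 1).
freq = {0: 5, 1: 13, 2: 8, 3: 7, 4: 8, 5: 6, 6: 3, 7: 1}
n = sum(freq.values())                       # number of days
total = sum(x * f for x, f in freq.items())  # total number of deaths

def score(p):
    """First derivative of the geometric log-likelihood."""
    return n / p - total / (1 - p)

def second_derivative(p):
    """Second derivative of the geometric log-likelihood."""
    return -n / p ** 2 - total / (1 - p) ** 2

def newton_raphson(p=0.5, tol=1e-12, max_iter=50):
    """Solve score(p) = 0 by Newton-Raphson iterations from the start value p."""
    for _ in range(max_iter):
        step = score(p) / second_derivative(p)
        p -= step
        if abs(step) < tol:
            break
    return p

p_hat = newton_raphson()

# Observed information and the asymptotic 95% confidence interval (xi = 0.05).
observed_info = -second_derivative(p_hat)
se = math.sqrt(1.0 / observed_info)
z = NormalDist().inv_cdf(0.975)
ci = (p_hat - z * se, p_hat + z * se)

print(round(p_hat, 4), tuple(round(c, 4) for c in ci))
```

For this model the score equation also has the closed-form solution n / (n + Σx), which agrees with the Geom estimate 0.2713 reported in Table 1; models with several parameters need the multivariate version of the same iteration.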

Applications of Real Data

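In this section, we illustrate the empirical importance of the discrete distributions, such as the DB, DLi, Binom, Pois, DR, DGE, Geometric, DW, DIW, DE, NDL, DGzEx, DMOGE, DAPLo, and EDW distributions, using applications to real data sets. The fitted models are compared using the Akaike information criterion (AIC), corrected AIC (CAIC), Bayesian information criterion (BIC), Hannan-Quinn information criterion (HQIC), and the chi-square (χ²) statistic with its degrees of freedom and p-value, where

$$\mathrm{AIC} = 2k - 2\hat\ell, \quad \mathrm{CAIC} = \mathrm{AIC} + \frac{2k(k + 1)}{n - k - 1}, \quad \mathrm{BIC} = k \ln n - 2\hat\ell, \quad \mathrm{HQIC} = 2k \ln(\ln n) - 2\hat\ell,$$

$\hat\ell$ is the maximized log-likelihood, k is the number of parameters, and n is the sample size.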
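The information criteria are simple functions of the maximized log-likelihood. The following sketch assumes the standard definitions (AIC = 2k − 2ℓ, CAIC = AIC + 2k(k + 1)/(n − k − 1), BIC = k ln n − 2ℓ, HQIC = 2k ln(ln n) − 2ℓ) and applies them to the Poisson model fitted to the Angola frequencies of Table 1; the function name is hypothetical. Under these definitions the output agrees with the pois column of Table 1 to about three decimals.

```python
import math

# Observed frequencies of daily deaths in Angola (columns x and Freq. of Table 1).
freq = {0: 5, 1: 13, 2: 8, 3: 7, 4: 8, 5: 6, 6: 3, 7: 1}
n = sum(freq.values())
total = sum(x * f for x, f in freq.items())

# The Poisson MLE is the sample mean.
theta_hat = total / n

# Maximized Poisson log-likelihood; lgamma(x + 1) equals ln(x!).
loglik = sum(
    f * (-theta_hat + x * math.log(theta_hat) - math.lgamma(x + 1))
    for x, f in freq.items()
)

def information_criteria(loglik, k, n):
    """Standard AIC, CAIC, BIC and HQIC for k parameters and n observations."""
    aic = 2 * k - 2 * loglik
    return {
        "AIC": aic,
        "CAIC": aic + 2 * k * (k + 1) / (n - k - 1),
        "BIC": k * math.log(n) - 2 * loglik,
        "HQIC": 2 * k * math.log(math.log(n)) - 2 * loglik,
    }

crit = information_criteria(loglik, k=1, n=n)
for name, value in crit.items():
    print(name, round(value, 4))
```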

African Continent

Angola

These data are the daily new deaths in Angola over the 51 days from 10 October to 29 November 2020 (see World Health Organization). The MLEs and the goodness of fit statistics are reported in Tables 1, 2 and 3.

Table 1

The goodness of fit and estimation models with one parameter for Dataset of Angola

x	Freq.	DL	binom	pois	DR	Geom	DE	NDL	Nbinom
0	5	10.3851	3.7196	3.4748	3.6409	13.8353	13.8377	11.1728	3.7451
1	13	10.3832	9.4929	9.3344	9.4361	10.082	10.0832	10.2192	9.5336
2	8	8.6612	12.3511	12.5374	11.7371	7.347	7.3473	8.3084	12.3726
3	7	6.6201	10.9194	11.2263	10.594	5.3539	5.3538	6.3327	10.9106
4	8	4.8052	7.3768	7.5392	7.5862	3.9015	3.9012	4.6338	7.3521
5	6	3.3707	4.0606	4.0505	4.4611	2.8431	2.8427	3.2965	4.0368
6	3	2.3078	1.8966	1.8135	2.1912	2.0718	2.0714	2.2972	1.8806
7	1	1.5516	0.7728	0.6959	0.9077	1.5098	1.5093	1.5759	0.7644
𝜃		0.5928	0.95	2.6863	0.9286	0.2713	0.7287	0.3902	0.9501
χ²		7.4731	6.3093	7.1692	5.0751	13.8986	13.9035	8.5615	6.3249
P-Value		0.3813	0.5041	0.4115	0.6508	0.053	0.0529	0.2857	0.5024
AIC		212.2015	205.5942	206.245	204.4382	221.7812	221.7812	214.0158	207.5957
CAIC		212.2832	205.6759	206.3267	204.5199	221.8628	221.8628	214.0975	207.8457
BIC		214.1334	207.526	208.1769	206.37	223.713	223.713	215.9477	211.4593
HQIC		212.9397	206.3324	206.9832	205.1764	222.5194	222.5194	214.7541	209.0721

Table 2

The goodness of fit and estimation models with two parameters for Dataset of Angola

x	Freq.	DB	DGE	DW	DIW	Bbinom	BNbinom	skellam
0	5	6.7808	4.7033	4.9671	3.3896	9.6902	5.4547	3.5321
1	13	18.9004	11.9206	10.2181	16.9335	7.1534	6.8033	9.2664
2	8	8.3399	11.6379	11.2204	10.9531	5.7499	6.6694	12.4241
3	7	4.2361	8.59	9.5265	6.0458	4.7385	5.9618	11.1673
4	8	2.5474	5.6325	6.7748	3.5861	3.9474	5.0827	7.5452
5	6	1.6992	3.484	4.172	2.2938	3.3055	4.2143	4.0829
6	3	1.2139	2.0895	2.2648	1.559	2.7745	3.4333	1.8423
7	1	0.9104	1.232	1.0959	1.1113	2.3304	2.7648	0.7128
α		4.8597	0.5766	0.9026	0.0665	0.8452	51.1307	2.7117
𝜃		0.814	2.7731	1.7864	1.5591	8.3895	0.9501	0.0251
χ²		24.8612	4.5821	3.5563	13.3268	10.0551	6.3249	6.9532
P-Value		0.0008	0.7108	0.8292	0.0645	0.1855	0.5024	0.4337
AIC		233.0403	208.2782	205.5087	222.5174	219.2958	207.5957	208.2394
CAIC		233.2903	208.5282	205.7587	222.7674	216.9786	207.8457	208.4894
BIC		236.9039	212.1418	209.3723	226.381	220.5679	211.4593	212.1031
HQIC		234.5167	209.7546	206.9851	223.9938	219.5569	209.0721	209.7159

Table 3

The goodness of fit and estimation models with three parameters for Dataset of Angola

x	Freq.	DGzEx	DMOGE	DAPL	EDW
0	5	7.0237	5.3943	4.4178	6.0749
1	13	8.3922	9.7184	11.1826	9.0453
2	8	9.2915	11.0367	11.8077	9.9774
3	7	9.2309	9.6026	9.1019	9.4497
4	8	7.8652	6.7241	5.9894	7.6045
5	6	5.3941	4.0369	3.6199	5.0134
6	3	2.7262	2.2048	2.0932	2.5727
7	1	0.899	1.1424	1.1871	0.9666
α		0.8945	1.6567	0	3.0214
β		1.1054	0.4903	9.8333	0.4415
𝜃		0.323	4.1168	0.0998	0.9919
χ²		3.9266	4.0603	4.5484	3.2125
P-Value		0.7882	0.7728	0.7149	0.8647
AIC		206.4348	208.7333	210.1451	206.0902
CAIC		206.9454	209.2439	210.6558	206.6009
BIC		212.2303	214.5287	215.9406	211.8857
HQIC		208.6494	210.9479	212.3598	208.3048

From Tables 1, 2 and 3, it is evident that all distributions fit these data quite well except for the DB distribution. However, we always search for the best model to get the best evaluation of the data, and therefore, using AIC, BIC, CAIC, HQIC, χ² and p-values, we can say that the DMKEx model provides the best fit among all the tested models because it has the largest p-value and the smallest values of the AIC, CAIC, BIC, HQIC and χ² statistics. Figure 1 supports the results of Tables 1, 2 and 3.
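The fitted frequencies in the tables are the sample size times the fitted PMF. As a minimal sketch, assuming the Poisson model with its MLE equal to the sample mean, the code below reproduces the first entries of the pois column of Table 1 and computes a Pearson χ² statistic over the eight listed cells. The grouping of cells used in the paper is not stated, so this statistic need not match the tabulated χ² exactly.

```python
import math

# Observed frequencies of daily deaths in Angola (columns x and Freq. of Table 1).
freq = {0: 5, 1: 13, 2: 8, 3: 7, 4: 8, 5: 6, 6: 3, 7: 1}
n = sum(freq.values())
theta_hat = sum(x * f for x, f in freq.items()) / n  # Poisson MLE

def poisson_pmf(x, theta):
    return math.exp(-theta) * theta ** x / math.factorial(x)

# Fitted (expected) frequencies for the listed values of x.
expected = {x: n * poisson_pmf(x, theta_hat) for x in freq}

# Pearson statistic over the listed cells only (an assumption about the binning).
chi2 = sum((freq[x] - expected[x]) ** 2 / expected[x] for x in freq)

for x in freq:
    print(x, freq[x], round(expected[x], 4))
print("chi-square:", round(chi2, 4))
```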

Ethiopia

These data are the daily new deaths in Ethiopia over the 68 days from 1 April to 7 June 2020 (see World Health Organization). The MLEs and the goodness of fit statistics are reported in Tables 4, 5, and 6.

Table 4

The goodness of fit and estimation models with one parameter for Dataset of Ethiopia

Value	Count	DL	binom	Pois	DR	Geom	DE	NDL	Nbinom
0	53	51.9015	50.6629	50.6728	42.4357	52.5454	52.5486	52.1453	48.4267
1	12	12.8907	14.8785	14.9038	24.206	11.9422	11.9404	12.5212	13.9393
2	1	2.6204	2.2169	2.1917	1.3481	2.7141	2.7132	2.6725	4.0123
3	2	0.4851	0.2234	0.2149	0.0102	0.6169	0.6165	0.5348	1.1549
𝜃		0.1425	0.9957	0.2941	0.3759	0.7727	0.2272	0.8399	0.7122
χ²		5.8081	15.4572	16.1469	397.4295	4.1765	4.1789	5.0873	3.5538
P-Value		0.1213	0.0015	0.0011	0	0.243	0.2428	0.1655	0.3138
AIC		96.8328	99.3872	99.5043	119.8357	96.3289	96.3289	96.6049	98.2034
CAIC		96.8934	99.4478	99.565	119.8963	96.3895	96.3895	96.6655	98.3881
BIC		99.0523	101.6067	101.7239	122.0552	98.5484	98.5484	98.8244	102.6425
HQIC		97.7122	100.2666	100.3838	120.7151	97.2084	97.2084	97.4843	99.9623

Table 5

The goodness of fit and estimation models with two parameters for Dataset of Ethiopia

Value	Count	DB	DGE	DW	DIW	Bbinom	BNbinom	skellam
0	53	53.0355	53.0563	53.1071	52.9906	53.2037	8.541	50.6655
1	12	11.6924	11.1753	11.148	11.9984	7.286	13.6952	14.9089
2	1	2.2184	2.8052	2.7572	1.8851	3.1767	13.7835	2.1935
3	2	0.6154	0.7162	0.7192	0.5689	1.6954	11.1608	0.2152
α		1.5906	0.2566	0.219	0.7793	0.1855	100.0741	0.2943
𝜃		0.1126	0.8367	0.9329	2.4612	25.0861	4.0965	0.0001
χ²		3.7657	3.51	3.4511	3.978	4.3161	67.7849	16.12652
P-Value		0.2879	0.3195	0.3272	0.2639	0.2293	0	0.0011
AIC		98.2653	98.2326	98.2076	98.3745	98.6101	109.5831	101.5044
CAIC		98.45	98.4172	98.3923	98.5591	98.821	108.5832	101.689
BIC		102.7044	102.6716	102.6467	102.8135	97.1454	110.8358	105.9434
HQIC		100.0242	99.9915	99.9665	100.1333	100.1186	105.8331	103.2632

Table 6

The goodness of fit and estimation models with three parameters for Data set of Ethiopia

Value	Count	DGzEx	DMOGE	DAPL	EDW
0	53	53.0519	53.0207	52.9588	53.0357
1	12	11.3019	11.7037	11.8781	11.4966
2	1	2.6663	2.2482	2.1159	2.4583
3	2	0.6917	0.6664	0.5747	0.6694
α		0.3851	3.1747	0.0001	0.6074
β		1.6448	0.3739	1.5139	4.551
𝜃		0.0433	0.0826	0.1381	0.0531
χ²		3.5429	3.3497	4.0924	3.5125
P-Value		0.3152	0.3408	0.2517	0.3191
AIC		100.2122	99.9264	100.4083	100.1279
CAIC		100.5872	100.3014	100.7833	100.5029
BIC		106.8707	106.5849	107.0668	106.7864
HQIC		102.8505	102.5647	103.0466	102.7662

From Tables 4, 5, and 6, it is evident that all distributions fit these data quite well except for the DR, binomial, Poisson, Skellam, and BNbinom distributions. However, we always search for the best model to get the best evaluation by using AIC, BIC, CAIC, HQIC, χ², and p-values. Figure 2 supports the results of Tables 4, 5 and 6.

El Salvador

These data are the daily new deaths in El Salvador over the 81 days from 1 April to 20 June 2020 (see World Health Organization). The MLEs and the goodness of fit statistics are reported in Tables 7, 8 and 9.

Table 7

The goodness of fit and estimation models with one parameter for Dataset of El Salvador

x	Freq.	DL	binom	pois	DR	Geom	DE	NDL	Nbinom
0	34	35.8944	28.208	28.0146	18.5008	39.2874	39.2913	36.898	28.2086
1	25	22.7585	29.5621	29.7438	33.7886	20.2318	20.232	21.8881	29.5622
2	11	11.9636	15.6819	15.7898	20.8584	10.4188	10.4179	11.5415	15.6815
3	6	5.751	5.6135	5.5881	6.5736	5.3654	5.3644	5.7054	5.6133
4	4	2.6229	1.5252	1.4833	1.1546	2.763	2.7622	2.7076	1.5251
5	1	1.1555	0.3355	0.315	0.1168	1.4229	1.4223	1.2492	0.3354
𝜃		0.3719	0.9871	1.0617	0.7716	0.485	0.5149	0.6045	0.9871
χ²		1.1319	8.6416	9.2714	33.6686	2.5452	2.5471	1.3475	8.642
P-Value		0.9512	0.1242	0.0987	0	0.7697	0.7694	0.93	0.1242
AIC		230.5771	235.0331	235.4473	253.2005	233.3614	233.3614	231.1279	237.0331
CAIC		230.6278	235.0837	235.4979	253.2511	233.4121	233.4121	231.1785	237.1869
BIC		232.9716	237.4275	237.8417	255.595	235.7559	235.7559	233.5223	241.822
HQIC		231.5378	235.9938	236.408	254.1612	234.3221	234.3221	232.0885	238.9545

Table 8

The goodness of fit and estimation models with two parameters for Dataset of El Salvador

x	Freq.	DB	DGE	DW	DIW	Bbinom	BNbinom	skellam
0	34	34.4036	33.8388	33.7095	33.0795	40.0502	0.9349	28.0132
1	25	28.0052	25.2297	24.6603	29.5016	16.4509	3.0269	29.7437
2	11	9.339	12.2539	12.8315	8.9315	9.1663	5.6262	15.7905
3	6	3.7907	5.4873	5.8521	3.6857	5.5386	7.8829	5.5886
4	4	1.8668	2.3897	2.4465	1.8677	3.4695	9.2524	1.4835
5	1	1.0496	1.0293	0.9571	1.0786	2.2156	9.608	0.315
α		2.4136	0.4274	0.5838	0.4084	0.5804	88.7166	1.0618
𝜃		0.4504	1.5656	1.2446	1.7955	34.4566	7.1157	0.00001
χ²		4.1337	1.2457	1.2487	4.8049	5.9714	578.7171	9.2705
P-Value		0.5303	0.9404	0.9401	0.4401	0.309	0.0002	0.0988
AIC		238.4529	232.4455	232.0191	239.7789	242.0565	246.0948	237.4473
CAIC		238.6068	232.5994	232.1729	239.9327	241.9607	246.466	237.6011
BIC		243.2418	237.2344	236.808	244.5678	248.2354	251.5662	242.2362
HQIC		240.3743	234.3669	233.9405	241.7002	249.4564	253.6535	239.3687

Table 9

The goodness of fit and estimation models with three parameters for Dataset of El Salvador

x	Freq.	DGzEx	DMOGE	DAPL	EDW
0	34	34.4545	34.0422	34.1455	34.3273
1	25	22.8177	24.1669	25.2472	22.9832
2	11	13.2714	12.9569	12.0298	13.3583
3	6	6.5965	5.8305	5.2789	6.4936
4	4	2.7112	2.4142	2.3042	2.6292
5	1	0.8856	0.9649	1.0301	0.886
α		0.661	0.9399	0	1.7099
β		1.2116	0.3903	12.3095	0.542
𝜃		0.1615	2.3293	0.2418	0.7948
χ²		1.2798	1.3569	1.41	1.3566
P-Value		0.937	0.929	0.9232	0.929
AIC		233.5703	234.2826	234.9581	233.7504
CAIC		233.882	234.5943	235.2698	234.0621
BIC		240.7537	241.4659	242.1415	240.9337
HQIC		236.4524	237.1646	237.8402	236.6324

From Tables 7, 8 and 9, it is evident that all distributions fit these data very well except for the BNbinom and DR distributions. However, we always search for the best model to get the best evaluation by using AIC, BIC, CAIC, HQIC, χ², and p-values. Figure 3 supports the results of Tables 7, 8 and 9.

Figure 3

The fitted PMFs for Dataset of Angola

French Guiana

These data are the daily new deaths in French Guiana over the 153 days from 1 June to 31 October 2020 (see World Health Organization). The MLEs and the goodness of fit statistics are reported in Tables 10, 11 and 12.

Table 10

The goodness of fit and estimation models with one parameter for Dataset of French Guiana

Value	Count	DL	Binom	Pois	DR	Geom	DE	NDL	Nbinom
0	102	102.8925	95.8637	97.4614	76.8051	105.446	105.4415	103.7724	95.8073
1	39	36.0756	44.7489	43.9533	66.7841	32.7737	32.7754	34.8533	39.9856
2	7	10.4012	10.5126	9.911	9.1226	10.1864	10.1879	10.4053	12.5161
3	4	2.7352	1.6571	1.4899	0.2861	3.166	3.1668	2.9123	3.4824
4	1	0.6816	0.1972	0.168	0.0022	0.984	0.9844	0.7825	0.9084
𝜃		0.2028	0.9969	0.451	0.498	0.6892	0.3108	0.7761	0.7913
χ²		2.0874	8.8849	9.9735	64.0151	2.5036	2.5029	2.1005	2.9355
P-Value		0.7197	0.064	0.0409	0	0.644	0.6441	0.7173	0.5687
AIC		276.2944	280.1661	280.288	319.2933	277.1681	277.1681	276.4623	278.2455
CAIC		276.3208	280.2907	280.3145	319.3198	277.1945	277.1945	276.4888	278.3255
BIC		279.3248	283.2946	283.3184	322.3237	280.1985	280.1985	279.4927	284.3064
HQIC		277.5254	281.4952	281.519	320.5243	278.3991	278.3991	277.6933	280.7075

Table 11

The goodness of fit and estimation models with two parameters for Dataset of French Guiana

Value	Count	DB	DGE	DW	DIW	Bbinom	BNbinom	skellam
0	102	102.0825	102.176	102.491	101.8402	106.2493	90.7806	97.4627
1	39	39.2054	37.3434	36.63	40.164	21.4081	37.3982	43.9525
2	7	7.8603	10.0104	10.3575	6.825	10.0652	14.9649	9.9106
3	4	2.2069	2.583	2.6754	2.0941	5.6251	5.9448	1.4897
4	1	0.8124	0.6607	0.6512	0.8706	3.3895	2.3575	0.1679
α		2.0172	0.2551	0.3301	0.6656	0.2721	240.033	0.4509
𝜃		0.2045	1.3712	1.1147	2.4483	54.6357	1.0697	0.0002
χ²		1.5823	1.9274	2.0838	1.7686	16.732	7.0243	9.9743
P-Value		0.812	0.7491	0.7204	0.7782	0.0022	0.1346	0.0409
AIC		278.7042	278.082	278.2295	279.5632	288.2087	297.1347	282.288
CAIC		278.7842	278.162	278.3095	279.6432	288.2088	297.1348	282.368
BIC		284.7651	284.1429	284.2904	285.6241	288.2147	307.1408	288.3489
HQIC		281.1662	280.544	280.6915	282.0252	285.2111	289.1372	284.75

Table 12

The goodness of fit and estimation models with three parameters for Dataset of French Guiana

Value	Count	DGzEx	DMOGE	DAPL	EDW
0	102	103.1603	102.0371	102.0898	102.089
1	39	35.3723	38.5727	38.5167	38.1064
2	7	10.7692	8.6816	8.9211	9.2162
3	4	2.8765	2.5161	2.3031	2.4825
4	1	0.6653	0.8	0.7049	0.7394
α		0.5133	3.0611	0	0.7349
β		1.601	0.333	4.2859	3.599
𝜃		0.0611	0.2035	0.2042	0.1063
χ²		2.309	1.2514	1.7868	1.5687
P-Value		0.6791	0.8696	0.7749	0.8144
AIC		280.4584	279.6547	280.2424	279.9279
CAIC		280.6195	279.8158	280.4035	280.089
BIC		289.5497	288.7461	289.3337	289.0192
HQIC		284.1514	283.3478	283.9355	283.621

From Tables 10, 11 and 12, it is evident that all distributions fit these data very well except for the DR and Skellam distributions. However, we always search for the best model to get the best evaluation of the data, and therefore, using AIC, BIC, CAIC, HQIC, χ², and p-values, we can say that the DL in Table 10, DGE in Table 11, and DMOGE in Table 12 provide the best fit among the tested models because they have the largest p-values and the smallest values of the AIC, CAIC, BIC, HQIC and χ² statistics. Figure 4 supports the results of Tables 10, 11 and 12.

Figure 4

The fitted PMFs for Dataset of Ethiopia

Europe Continent

Estonia

These data are the daily new deaths in Estonia over the 50 days from 1 April to 20 May 2020 (see World Health Organization). The MLEs and the goodness of fit statistics are reported in Tables 13, 14 and 15.

Table 13

The goodness of fit and estimation models with one parameter for Dataset of Estonia

x	Freq.	DL	Binom	pois	DR	Geom	DE	NDL	Nbinom
0	20	20.413	15.268	15.058	9.516	22.727	22.725	21.09	15.278
1	15	13.91	17.899	18.071	18.994	12.397	12.396	13.393	17.901
2	6	7.866	10.701	10.844	14.011	6.762	6.762	7.56	10.697
3	5	4.069	4.349	4.338	5.772	3.688	3.689	4.001	4.345
4	3	1.998	1.351	1.301	1.451	2.012	2.012	2.033	1.349
5	0	0.947	0.342	0.312	0.23	1.097	1.098	1.004	0.341
6	1	0.438	0.074	0.062	0.023	0.599	0.599	0.486	0.073
𝜃		0.4	0.977	1.2	0.81	0.455	0.545	0.577	0.977
χ²		2.896	18.123	21.001	59.692	3.221	3.221	2.801	18.156
P-Value		0.822	0.006	0.002	0	0.781	0.781	0.833	0.006
AIC		152.29	157.901	158.584	170.826	153.582	153.582	152.457	159.902
CAIC		152.373	157.985	158.667	170.91	153.665	153.665	152.541	160.157
BIC		154.202	159.814	160.496	172.738	155.494	155.494	154.369	163.726
HQIC		153.018	158.63	159.312	171.554	154.31	154.31	153.185	161.358

Table 14

The goodness of fit and estimation models with two parameters for Dataset of Estonia

x	Freq.	DB	DGE	DW	DIW	Bbinom	BNbinom	skellam
0	20	20.3137	19.946	19.8908	19.3778	26.443	21.1803	15.0634
1	15	16.7589	14.6836	14.3643	17.7488	10.1255	10.6461	18.0724
2	6	6.0514	7.8054	8.0425	5.8575	5.3982	6.3244	10.8412
3	5	2.6133	3.897	4.1076	2.5532	3.132	3.9673	4.3356
4	3	1.3476	1.9012	1.9781	1.3445	1.8836	2.5626	1.3004
5	0	0.7856	0.9183	0.9117	0.7995	1.153	1.6878	0.312
6	1	0.4992	0.4416	0.4057	0.5169	0.7121	1.128	0.0624
α		2.3335	0.4789	0.6022	0.3876	0.5549	23.0072	1.1998
𝜃		0.4714	1.4098	1.1879	1.6709	23.4553	0.7414	0
χ²		5.445	2.969	3.032	5.799	6.84	3.588	21.021
P-Value		0.488	0.813	0.805	0.446	0.336	0.732	0.002
AIC		158.573	154.399	154.216	159.305	160.349	153.155	160.584
CAIC		158.828	154.655	154.472	159.56	160.559	154.059	160.839
BIC		162.397	158.223	158.04	163.129	165.287	157.388	164.408
HQIC		160.029	155.855	155.673	160.761	162.71	155.095	162.04

Table 15

The goodness of fit and estimation models with three parameters for Dataset of Estonia

x	Freq.	DGzEx	DMOGE	DAPL	EDW
0	20	20.3734	20.0495	19.5879	20.1969
1	15	13.3667	14.1479	15.6532	13.6021
2	6	8.0864	8.0852	7.7502	8.1771
3	5	4.4584	4.1015	3.574	4.3901
4	3	2.2106	1.9558	1.6677	2.1237
5	0	0.9707	0.9051	0.8091	0.932
6	1	0.371	0.4131	0.4115	0.3734
α		0.6758	1.0231	0	1.4865
β		1.2462	0.4511	8.5769	0.6436
𝜃		0.1096	1.7632	0.2205	0.7555
χ²		3.1186	3.0591	3.6685	3.1415
P-Value		0.7938	0.8014	0.7214	0.7909
AIC		156.043	156.336	157.083	156.116
CAIC		156.565	156.858	157.605	156.638
BIC		161.779	162.072	162.819	161.852
HQIC		158.228	158.521	159.267	158.301

From Tables 13, 14 and 15, it is evident that all distributions fit these data quite well except for the Binom, Pois, and DR distributions. However, we always seek the model that gives the best description of the data. Using the AIC, CAIC, BIC, HQIC, χ², and p-values, we can say that the DL model provides the best fit among all the tested models because it has the smallest values of AIC, CAIC, BIC, and HQIC, while its χ² statistic (2.896) and p-value (0.822) are close to the best values in the tables, which belong to the NDL model (2.801 and 0.833). Figure 5 supports the results of Tables 13, 14 and 15.
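For the two simplest one-parameter models the MLE has a closed form, so the corresponding entries of Table 13 can be checked by hand. The following is a minimal sketch, not the authors' code: it assumes the Poisson rate is estimated by the sample mean and the geometric success probability by 1 / (1 + mean), with the geometric PMF written as p(1 − p)^x on x = 0, 1, 2, .... The helper names are illustrative.

```python
import math

# Observed frequencies of daily deaths in Estonia (x -> number of days), from Table 13.
freq = {0: 20, 1: 15, 2: 6, 3: 5, 4: 3, 5: 0, 6: 1}
n = sum(freq.values())
mean = sum(x * f for x, f in freq.items()) / n

# Closed-form maximum likelihood estimates.
lam = mean                # Poisson rate
p = 1.0 / (1.0 + mean)    # geometric success probability

def poisson_pmf(x, lam):
    return math.exp(-lam) * lam ** x / math.factorial(x)

def geometric_pmf(x, p):
    return p * (1.0 - p) ** x

def aic_one_param(pmf, theta):
    """AIC of a one-parameter model evaluated on the grouped data."""
    log_lik = sum(f * math.log(pmf(x, theta)) for x, f in freq.items())
    return -2.0 * log_lik + 2.0

expected_geom = [n * geometric_pmf(x, p) for x in sorted(freq)]
aic_pois = aic_one_param(poisson_pmf, lam)
aic_geom = aic_one_param(geometric_pmf, p)

print(f"n = {n}, mean = {mean:.3f}, lambda = {lam:.3f}, p = {p:.3f}")
print("expected (Geom):", [round(e, 3) for e in expected_geom])
print(f"AIC Pois = {aic_pois:.3f}, AIC Geom = {aic_geom:.3f}")
```

With these assumptions the estimates (1.2 and 0.455), the geometric expected frequencies, and both AIC values agree with the Pois and Geom columns of Table 13 to rounding.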

Figure 5

The fitted PMFs for Dataset of El Salvador

Greece

These data represent the daily new deaths over 111 days, from 12 March to 30 June 2020, in Greece (see World Health Organization). The MLEs and the goodness of fit statistics are reported in Tables 16, 17 and 18.

Table 16

The goodness of fit and estimation models with one parameter for Dataset of Greece

x	Freq.	DL	Binom	pois	DR	Geom	DE	NDL	Nbinom
0	39	34.3535	20.121	19.8601	12.2023	40.7981	40.7968	36.0786	20.12694
1	26	28.3298	34.099	34.1756	29.1321	25.8027	25.8024	27.4649	34.10319
2	17	19.4399	29.154	29.405	30.7488	16.3189	16.319	18.5846	29.15261
3	9	12.2127	16.7658	16.8669	21.694	10.3209	10.3211	11.7896	16.76212
4	6	7.2833	7.2952	7.2562	11.1845	6.5275	6.5277	7.1799	7.292363
5	7	4.1968	2.5617	2.4973	4.3612	4.1283	4.1285	4.2511	2.560298
6	6	2.36	0.7562	0.7162	1.3081	2.6109	2.6111	2.4657	0.755599
7	0	1.3031	0.193	0.1761	0.3047	1.6513	1.6514	1.4077	0.192785
8	0	0.7094	0.0435	0.0379	0.0555	1.0444	1.0445	0.7938	0.043407
9	1	0.3819	0.0088	0.0072	0.0079	0.6605	0.6606	0.4431	0.008761
𝜃		0.4868	0.9847	1.7208	0.8901	0.3676	0.6325	0.4925	0.984735
χ²		12.6465	184.8257	212.9279	218.3349	9.4779	9.4768	10.9923	184.9933
P-Value		0.1793	0	0	0	0.3944	0.3945	0.2762	4.59E-35
AIC		399.5498	440.2502	442.207	462.0504	399.2138	399.2138	398.6729	442.2502
CAIC		399.5865	440.2869	442.2437	462.0871	399.2505	399.2505	398.7096	442.3614
BIC		402.2594	442.9597	444.9165	464.7599	401.9233	401.9233	401.3824	447.6693
HQIC		400.649	441.3494	443.3061	463.1496	400.3129	400.3129	399.7721	444.4486

Table 17

The goodness of fit and estimation models with two parameters for Dataset of Greece

x	Freq.	DB	DGE	DW	DIW	Bbinom	BNbinom	skellam
0	39	39.9728	38.4338	37.6699	36.6671	44.7225	0	19.6302
1	26	33.0123	27.1352	27.1739	35.5067	23.4325	0.0002	34.0085
2	17	14.3432	17.2843	17.6853	14.4778	14.472	0.0014	29.4593
3	9	7.1817	10.791	11.1464	7.2501	9.3716	0.0057	17.0124
4	6	4.1365	6.6806	6.8913	4.2126	6.2001	0.018	7.3683
5	7	2.6307	4.1183	4.2028	2.7015	4.1492	0.0469	2.5531
6	6	1.7954	2.5328	2.5362	1.8561	2.7951	0.1059	0.7372
7	0	1.291	1.5556	1.5173	1.3417	1.8901	0.2123	0.1824
8	0	0.966	0.9547	0.9011	1.0081	1.2807	0.3856	0.0395
9	1	0.746	0.5856	0.5318	0.781	0.8687	0.6439	0.0076
α		2.097	0.613	0.6606	0.3303	0.7334	131.206	1.7325
𝜃		0.5251	1.1173	1.0818	1.3636	45.3618	27.2001	0.00003
χ²		21.5247	9.9019	10.0035	21.5044	10.1022	227.2067	205.3209
P-Value		0.0105	0.3585	0.3502	0.0106	0.3423	0	0
AIC		416.9434	400.8569	400.5058	418.3186	401.0155	438.4688	444.2158
CAIC		417.0545	400.968	400.6169	418.4298	401.027	438.4569	444.3269
BIC		422.3625	406.276	405.9249	423.7377	406.0137	432.5658	449.6349
HQIC		419.1418	403.0553	402.7042	420.517	402.9877	439.1251	446.4142

Table 18

The goodness of fit and estimation models with three parameters for Dataset of Greece

x	Freq.	DGzEx	DMOGE	DAPL	EDW
0	39	37.1418	39.1004	38.2561	39.6469
1	26	26.2135	24.518	28.7498	23.2576
2	17	17.918	17.6573	17.394	16.952
3	9	11.8333	11.7765	10.2344	12.0579
4	6	7.5308	7.3852	6.0569	8.1187
5	7	4.6056	4.4451	3.6484	5.1014
6	6	2.6985	2.6083	2.2469	2.963
7	0	1.51	1.5074	1.4167	1.5797
8	0	0.8041	0.8636	0.9145	0.7693
9	1	0.4059	0.4922	0.6039	0.3412
α		0.7564	0.5621	0	1.893
β		1.406	0.5661	16.7726	0.3775
𝜃		0.0521	3.0701	0.2929	0.9346
χ²		9.5683	9.7419	12.1902	9.085
P-Value		0.3866	0.3718	0.2028	0.4295
AIC		401.4063	402.2136	405.1989	400.9745
CAIC		401.6306	402.4379	405.4232	401.1988
BIC		409.5349	410.3422	413.3275	409.1031
HQIC		404.7038	405.5111	408.4964	404.272

From Tables 16, 17 and 18, it is evident that all distributions fit these data quite well except for the DB, Binom, Pois, DIW, and DR distributions. However, we always seek the model that gives the best description of the data. Concerning the AIC, CAIC, BIC, HQIC, χ², and p-values, we can say that the DE model provides one of the best fits among all the tested models: its AIC, CAIC, BIC, and HQIC values are matched by the Geom model and improved upon only by the NDL model, and it has the smallest χ² statistic and the highest p-value among the one- and two-parameter models. Figures 6, 7 and 8 support the results of Tables 16, 17 and 18.
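The p-values in the tables can be recovered from the χ² statistics with a short calculation. The sketch below assumes that the degrees of freedom equal the number of cells minus one; this is inferred from the tabulated numbers and is not stated in this section. The function `chi2_sf` is an illustrative helper that uses the closed-form series for integer degrees of freedom, so only the standard library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if df < 1 or x < 0:
        raise ValueError("df must be a positive integer and x non-negative")
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
        term, total = 1.0, 0.0
        for i in range(df // 2):
            total += term
            term *= half / (i + 1)
        return math.exp(-half) * total
    # Odd df = 2m + 1:
    # erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_{r=1}^{m} x^(r-1) / (2r-1)!!
    term, total = 1.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    tail = math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total
    return math.erfc(math.sqrt(half)) + tail

# Assumption: df = number of cells - 1 (10 cells for Greece, 7 for Estonia, 5 for French Guiana).
p_greece_de = chi2_sf(9.4768, 10 - 1)   # DE, Table 16
p_estonia_dl = chi2_sf(2.896, 7 - 1)    # DL, Table 13
p_guiana_dmoge = chi2_sf(1.2514, 5 - 1) # DMOGE, Table 12

print(f"{p_greece_de:.4f} {p_estonia_dl:.3f} {p_guiana_dmoge:.4f}")
```

Under this assumption the three computed values agree with the tabulated p-values (0.3945, 0.822, and 0.8696). The χ² statistics themselves depend on how the cells were grouped, which this section does not describe, so they are taken from the tables rather than recomputed.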

Figure 6

The fitted PMFs for Dataset of French Guiana

Figure 7

The fitted PMFs for Dataset of Estonia

Figure 8

The fitted PMFs for Dataset of Greece


Concluding Remarks

In this article, we use 8 discrete distributions with one parameter, 7 discrete distributions with two parameters, and 4 discrete distributions with three parameters to fit the daily coronavirus deaths in Angola, Ethiopia, French Guiana, El Salvador, Estonia, and Greece and to determine the best model for each country. In the case of discrete distributions with one parameter, we discussed the DL, Binomial, Poisson, DR, Geometric, DE, Nbinom, and NDL distributions. In the case of discrete distributions with two parameters, we discussed the DB, DGE, DW, DIW, Bbinom, BNbinom, and Skellam distributions. In the case of discrete distributions with three parameters, we discussed the DGzEx, DMOGE, DAPL, and EDW distributions. We have reviewed several important discrete distributions, namely the DB, DL, DMOGE, DGE, DAPL, DR, DE, Geometric, Binomial, NDL, DGzEx, and EDW distributions. The maximum likelihood method is discussed for estimating the parameters of the discrete distributions. We show empirically that the discrete models fit the datasets of daily coronavirus deaths in Angola, Ethiopia, French Guiana, El Salvador, Estonia, and Greece. The DW and DB models show their superiority over the other competing models for the analysis of daily COVID-19 deaths in Angola, Ethiopia, French Guiana, El Salvador, Estonia, and Greece.
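As an illustration of how the fitted frequencies in the tables follow from the parameter estimates, the sketch below evaluates the discrete Weibull PMF at the DW estimates for Estonia. It assumes the usual form P(X = x) = q^(x^β) − q^((x+1)^β) and that the α and 𝜃 rows of Table 14 correspond to q and β; this mapping is inferred from the tabulated values, and the name `dw_pmf` is illustrative.

```python
def dw_pmf(x, q, beta):
    """Discrete Weibull PMF: P(X = x) = q**(x**beta) - q**((x + 1)**beta), x = 0, 1, 2, ..."""
    return q ** (x ** beta) - q ** ((x + 1) ** beta)

n = 50                            # number of days in the Estonia dataset
q_hat, beta_hat = 0.6022, 1.1879  # assumed mapping: alpha -> q, theta -> beta (Table 14, DW column)

# Expected frequency for each observed value x = 0, ..., 6.
expected = [n * dw_pmf(x, q_hat, beta_hat) for x in range(7)]
print([round(e, 4) for e in expected])
```

With this mapping the expected frequencies agree with the DW column of Table 14 up to the rounding of the reported estimates.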

6 in total

1. The frequency distribution of the difference between two Poisson variates belonging to different populations.

Authors: J G SKELLAM
Journal: J R Stat Soc Ser A Date: 1946

2. Maximum likelihood estimation for the beta-binomial distribution and an application to the household distribution of the total number of cases of a disease.

Authors: D A Griffiths
Journal: Biometrics Date: 1973-12 Impact factor: 2.571