Literature DB >> 35900989

Optimum stratum boundaries and sample sizes for Covid-19 data in Egypt.

Fatma S Abo El Hassan1, Ramadan Hamed2,3, Elham A Ismail1, Safia M Ezzat1.   

Abstract

Stratified random sampling is an effective sampling technique for estimating the population characteristics. The determination of strata boundaries and the allocation of sample size to the strata are two of the most critical factors in maximizing the precision of the estimates. Most surveys are conducted in an environment of severe budget constraints and a specific time is required to finish the survey. So cost and time are two important objectives that are taken under consideration in most surveys. The study suggested Mathematical goal programming model for determining optimum stratum boundaries for an exponential study variable under multiple objectives model when cost and time are under consideration. Compared to other techniques, Goal programming has many advantages in resources planning. Determining the required resources to satisfy the desired goals and the effectiveness of the available resources as well as providing best solutions under different amounts of resources are examples of the advantages of Goal programming. In addition the paper used data on Covid-19 to evaluate the performance of the suggested model for the exponential distribution. The study divided the number of new cases diseases into small, medium and high numbers. It also compared the results with the findings in the reports of the World Health Organization. The suggested mathematical goal programming revealed that Egypt was exposed to three waves of infection during the interval (5/3/2020 to 12/8/2021). These results are identical to the actual reality of covid-19 waves in Egypt.

Entities:  

Mesh:

Year:  2022        PMID: 35900989      PMCID: PMC9333231          DOI: 10.1371/journal.pone.0271220

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.752


1 Introduction

The basic feature of stratified random sampling is that the internally strata units are homogenous, that is, stratum variances should be as small as possible. The equations for determining the optimum stratum boundaries were first provided by Dalenius [1]. Khan et. al. [2] studied the optimum strata width as a Mathematical Programming Problem that was solved using the dynamic programming technique. Khan et. al. [3] studied the problem of optimum stratification and formulated as a MPP assuming exponential frequency distribution of the main study variable. The stratum boundaries are optimum in the sense that they minimize the sampling variance of the stratified sample mean under Neyman allocation. The formulated MPP found to be separable with respect to the decision variables and is treated as multistage decision problem. A solution procedure has also been developed using dynamic programming. Khan et. al. [4] discussed the problem of determining OSB for the study variables with Triangular and Standard Normal distributions. Khan et. al. [5] proposed the method of choosing the best boundaries when strata are formed based on a single auxiliary variable with a varying measurement cost per unit strata by assuming a suitable distribution of the auxiliary variable. The study is concerned with variables which followed triangular, uniform, exponential, normal, right triangular, Cauchy and power distribution. When the study variable has a pareto frequency distribution, Rao et. al. [6] suggested a procedure for determining optimum stratum boundary and optimum strata size of each stratum. Fonolahi and Khan [7] presented a solution to evaluate the optimum strata boundaries When the measurement unit cost varies throughout the strata and when the variable is exponentially distributed. Reddy et. al. [8] solved the same problem when multiple survey variables are under consideration. Danish et. al. [9] presented optimum strata boundaries as a non-linear programming problem when the cost per unit varies throughout the strata. Reddy et. al. [10] formulated the stated problem under Neyman allocation where the auxiliary variables follow Weibull distributions. Danish and Rizvi [11] suggested a non-linear programming model to determine optimum strata boundaries for two auxiliary variables. Reddy and Khan [12] implemented the problem of optimum stratum boundary for various distributions using R package. The aim of this study is to determine optimum stratum boundary (OSB) using Goal programming approach. Compared to other techniques, Goal programming has many advantages in resources planning. Determining the required resources to satisfy the desired goals and the effectiveness of the available resources as well as providing best solutions under different amounts of resources are examples of the advantages of Goal programming. In addition the paper used data on Covid-19 to evaluate the performance of the suggested model for the exponential distribution. The study divided the new cases disease into small, medium and high numbers. It also compared the results with the findings in the reports of the World Health Organization in terms of times of peak disease. This study aimed at determining the optimum sample size, if necessary, within a certain cost and time.

2 Optimum stratum boundaries model

Let the study variable u from population stratified into J strata and is the estimate of population mean. Let u0 and u be the smallest and largest values of the stratification variable u, respectively. The variance of stratified sample mean equal under Proportional allocation, is made as small as possible, in order to determine the intermediate stratum boundaries from the smallest to the largest then is the sample mean in in stratum h, h = 1, 2, …, J, W is the proportion of population units in stratum h and is the variance of stratum for variable u in the h stratum. and nis the sample size chosen from population N and it is equal to under proportion allocation. The minimization of variance given by (1) can be expressed as the minimization of The problem of determining OSB is to find J-1 intermediate points in the interval |u0, u|, let the distance between the smallest and largest values of the stratification variable u is set to be equal If the study variable u has a defined frequency function f(u), and the boundaries of the h stratum are u, u, then Where, using Eqs (4), (5) and (6), WS in Eq (2) can be represented as a function of u and u. i.e f(u, u) = WS. Hence, the objective function is to obtain u1 ≤ u2 ≤ … ≤ u ≤ u. That is adequate to the following MPP: suubject to u0 ≤ u1 ≤ u2 ≤ … ≤ u ≤ u ≤ u. Let q = u − u ≥ 0 denotes the width of the h(h = 1, 2, …., J) stratum. Accordingly, definition (3) is expressed as follows: For k point As a result, determining OSB is the same as determining OSW (optimal stratum width) as MPP: When h = 1 the function f1(q1) transforms into a function in q1 only where u0 is known. in addition, when h = 2 the function f2(q, u1) = f2(q2, u0 + q1) transforms into a function in q2 only where u1 is known. As a result, the MPP can be written as a function in q as follows:

3 The proposed mathematical goal programming model

The proposed mathematical goal programming model for evaluating OSB and optimum sample size allocation to the strata will be presented in this section. The suggested mathematical goal programming constraints are as follows: The aggregate of the optimum stratum width be equal to the distribution’s range. The cost (not exceeding a fixed limit) was added to the model as objective constrain that will be minimized. Time constraint is taken into consideration as it is needed for the sampling process within a specific range. To optimally determine stratum boundary, we will allocate the sample to the different strata for variable u defined in [a,b]. The problem is to partition u into J strata such that a = u0 ≤ u1 ≤ u2 ≤ … ≤ u ≤ u ≤ u = b, let u − u0 = q and define q = u − u thus the required stratification points are given as follows: u = u + q. The suggested Goal Programming (GP) approach can be formulated as follows: find q, u, n, c and t which: h = 1, 2, …., J, q ≥ 0 1 ≤ n ≤ N, dp, dn ≥ 0 Where, k denoted the total number of goal functions, n: Sample size of the h stratum :Total sample size c: per unit cost of the h stratum C: total cost t: time per unit of the h stratum T: total time v: prefixed variance of the estimator of the population mean dp, dn: positive and negative deviation variables of the i goal, (i = 1, 2, 3) is goal functin index where the first goal is to minimize , the second and the third goals are to minimize cost and time of collecting data per unit in each stratum, respectively. is assumed if the finite population correction is ignored and n denotes thestratum size of the h stratum.

4 Covid-19 data application

Currently, the entire world is dealing with the world’s most serious health problem, covid-19. The study used covid-19 data which designated by the World Health Organization (WHO) for Egypt. The study used the original covid-19 data from WHO to evaluate the suggested Mathematical goal programming model. The application to covid-19 data adopted the following steps:- The study variable which used is new cases of covid-19 data from the period 5/3/2020 to 12/8/2021 in Egypt. A statistical fit test was applied for the chosen study variable and it was found that The study variable followed an exponential distribution Let the variable under study u follows an exponential distribution with parameter θ > 0, that is By using Eqs (4), (5), (6) and (17) the term w and can be expressed as follows And The study suggested mathematical programming in order to calculate the variance when v = .965 and take v = .965 as an initial value when number of strata J = 3 to determine the OSB and optimum allocation into sample strata. As stated before that new cases variable distributed as exponential distribution with θ = 0018, u0 = 0, u = 1774 and q = 1774 Where θ is the parameter for exponential distribution, (u0, u) are the observation of smallest and largest values of stratification variable u and q is the different between largest and smallest value. The study used sample size n = 50 (needing to allocate into different strata to estimate average of new cases which presents variable under consideration) from total N = 517 which means recorded days from the period 5/3/2020 to 12/8/2021. The suggested Mathematical goal programming is applied when number of strata J = 3 representing the approach of classifying the number of new cases diseases into small, medium and high numbers. Cost and time are important objectives of most surveys so the study chosen arbitrary specific range of time, which is equal to 150 hours and the fixed value of cost, which is equal to equal 12000. This is done in order to evaluate the suggested mathematical goal programming when multi-objectives is determined. Using Eqs (18) and (19), the suggested goal programming model (10–16) when the study variable u is given by Eq (17), can be formulated as follows: Minimize subject to h = 1, 2, …., J, q ≥ 0 1 ≤ n ≤ N, dp, dn ≥ 0

5 Results and discussion

The study used number of new cases disease for COVID 19 data from Egypt to evaluate the performance of the suggested mathematical programming. Hence, the study solved the suggested goal programming model (20–26) by using a GAMS program and the results are appeared in the following table. Table 1 summarizes the main findings when suggested goal programming model applied in case of an exponentially distributed random variable and taking into consideration pervious conditions. The suggested model calculated the optimum stratum boundaries for new cases disease divided into three stratum. The first strata was from 0 to 341 cases which presented small number of new cases, the second strata was from 342 to 837 which presented medium number of new cases and the third strata was from 838 to 1774 which presented high number of new cases. The new minimum value of variance.777 which is less than the initial value (0.965) which chosen before. Sample size is divided according to the number of strata to 16, 17 and 17 to first, second and third respectively. As well as dividing the time and cost on the three strata as shown in the above table. The study study elucidated the findings resulted from the suggested mathematical programming using WHO data in Table 2.
Table 1

Results for OSB, optimum sample size of the variance function for exponential distribution when J = 3.

No. of strata (J)Optimum strata width OSW (qh)Optimum strata boundary OSW (uh)Sample size (nh)(Ch)(Th)Optimum value of variance
3341.134116.55 ≈ 16380846.777
496.383716.60 ≈ 174063.149
936.616.85 ≈ 17411451
Table 2

Strata 1(small cases) strata 2 (medium cases) strata 3 (high cases) and the dates corresponding to three stratum.

Wave 1Wave 2Wave 3
Strata 1From 15/03/2020 To 03/05/2020From 01/08/2020 To 18/11/2020-
Strata 2From 04/05/2020 To 26/05/2020From 19/11/2020 To 22/12/2020From 20/1/2021 To 15/4/2021
Strata 3From 27/5/2020 To 16/7/2020From 23/12/2020 To 19/1/2021From 16/4/2021 To 04/06/2021
The results in Table 2 show that Egypt was exposed to three waves of infection with the emerging corona virus: The first wave started from 15/3/2020 to 3/5/2020, then the number of infected people started to rise from 4/5/2020 to 26/5/2020, and the peak of the first wave was in the period from 27/5/2020 to 16/7/2020, as the number of infected people increased significantly. The second wave started from 1/8/2020 to 18/11/2020, then the number of infected people increased from 19/11/2020 to 22/12/2020, and reached the peak stage in the number of infected people in the period from 23/12 / 2020 to 19/1/2021. As for the third wave concluded from the results of the proposed program, it began with a rise in the number of infected people from 20/1/2021 to 15/4/2021, and those numbers rose and reached their peak in the period from 16/4/2021 to 4/6/2021. The the number of injured decreased in the period from the cut 5 / 6 / 2021 to date of getting data. The cut points, calculated from the suggested mathematical programming, matching with the waves which appeared in Egypt during the stated period. Fig 1 presented the graph for covid-19 data in Egypt which coincide with the results for the suggested mathematical programming confirming that Egypt faced three waves of covid-19 disease.
Fig 1

Daily new cases in Egypt reported from WHO data appeared the daily new cases which explain the start date and the end date for the three waves.

New cases reported from WHO data Referring to the web on WHO site, it was found that the Fig 1 corresponds to the results of the proposed program. The suggested mathematical goal programming results are approximately the same compared with the real data which appeared in the WHO real event. The study suggested mathematical goal programming model for determining optimum stratum boundaries for an exponential study variable under multiple objectives model when cost and time are under consideration. The study collected the data from the world Health Organization reports. Covid-19 data is used to evaluate the performance for the suggested model for the exponential distribution. The study divided the number of new cases into three groups small, medium and high numbers. The study compared the results which calculate from the suggested mathematical programming with the real data from the World Health Organization reports about covid-19 for Egypt. The results show that Egypt was exposed to three waves of infection during the interval (5/3/2020 to 12/8/2021). The results are identical to the actual reality of covid-19 waves in Egypt. Optimum allocation on strata was used as an addition to the suggested program in addition to the cost and time factors. The mathematical goals programming model does not depend primarily on the data, but rather depends mainly on the parameters of the data distribution. Nevertheless, we find that the results are consistent with the reality regarding what Egypt has been exposed to from three waves of the emerging corona virus.
  1 in total

1.  Optimum strata boundaries and sample sizes in health surveys using auxiliary variables.

Authors:  Karuna Garan Reddy; Mohammad G M Khan; Sabiha Khan
Journal:  PLoS One       Date:  2018-04-05       Impact factor: 3.240

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.