Hadis Anahideh1, Lulu Kang2, Nazanin Nezami1. 1. University of Illinois at Chicago, United States. 2. Illinois Institute of Technology, United States.
Abstract
We aim to design a fairness-aware allocation approach that maximizes geographical diversity and avoids unfairness in the sense of demographic disparity. During the development of this work, the COVID-19 pandemic was still spreading on a large scale in the U.S. and other parts of the world. Many poor communities and minority groups are much more vulnerable than the rest. To provide sufficient vaccines and medical resources to all residents and effectively stop the further spread of the pandemic, the average medical resources per capita of a community should be independent of the community's demographic features and conditional only on the exposure rate to the disease. In this article, we integrate different aspects of resource allocation and create a synergistic intervention strategy that gives vulnerable populations higher priority in medical resource distribution. This prevention-centered strategy seeks a balance between geographical coverage and social group fairness. The proposed principle can be applied to the allocation of other scarce resources and social benefits.
The COVID-19 pandemic has spread around the globe, caused hundreds of thousands of deaths, overwhelmed the healthcare systems of many countries, and stalled almost all social and economic activities, leading to an enormous financial loss. To battle the pandemic, several leading countries and organizations have devoted significant resources to developing vaccines, new diagnostics, and anti-infective treatments for the novel coronavirus [17]. More than 60 candidate vaccines are now in development worldwide, and several have entered early clinical trials in human volunteers, according to the World Health Organization [39]. Once the treatments are approved for public use, it is of paramount importance that these resources are quickly dispatched to all the communities in a country, because one weak link in the defense against the virus would leave the neighboring communities, and even the whole state, vulnerable and exposed to the spread of the disease. However, the existing healthcare infrastructures in many states and cities are insufficient to provide universal supplies of vaccinations and treatments. Hence, targeting the high-risk population and setting fair principles for prioritizing the allocation accordingly will save lives, curb the further spread of the virus, and prevent subsequent global pandemic outbreaks in the following years.

In the United States, due to the disparities [43] between population subgroups in terms of health, financial stability, and access to health care services, certain low-income and minority-populated communities are particularly vulnerable to COVID-19. For example, among Chicago's minority groups, the Latinx and Black groups have been hit much harder by the COVID-19 pandemic than any other ethnic group.
According to the latest report (September 15, 2020) from the Chicago Department of Public Health (CDPH), 33.0% of the deaths caused by COVID-19 are from the Latinx group, and 42.8% are from the Black, non-Latinx group [49]. These two ethnic groups also lead the remaining ethnic groups in the number of confirmed COVID-19 cases by a large margin. These ratios are disproportionate to the shares of these ethnic groups in the total population of Chicago, which consists mainly of 32.3% White, 28.7% Hispanic, and 30.9% African American residents, according to the US Census Bureau [3].

The development of vaccines and other effective treatments against any novel infectious disease generally takes a long time due to the high uncertainties in all stages of the process, including clinical trials. Even if the vaccines and treatments are developed successfully, they will initially be available only in limited quantities due to manufacturing, logistic, and financial constraints. Because of the huge gap between the available medical resources and the entire population in need of them, especially during the early production stages, the allocation of such resources becomes a difficult yet pressing issue. Moreover, because of the demographic distribution and occupational factors of the residents in different geographical regions, resource allocation based solely on maximum geographical uniformity can lead to an unfair distribution system and propagate biases across population subgroups.

We design a fairness-aware allocation framework for vaccines and scarce treatment resources that considers both Geographical Diversity and Social Group Fairness as the guiding principles for prevention-centered strategies. The social group fairness is based on the general fairness notion of Equality of Opportunity [25,31].
An allocation strategy is fair if the average amount of resources an individual receives depends only on the individual's exposure rate to the disease and is independent of the individual's demographic or socioeconomic background. We also consider an allocation strategy diverse if the geographical location does not affect the average amount of resources an individual can receive. Based on these notions of fairness and diversity, we formally define them and formulate them as inequality constraints. Our proposed research provides a solution for distributing scarce medical resources fairly to all communities while protecting the minority and low-income groups that are more vulnerable to the pandemic's impact. Such a solution can not only help stop the spread of the pandemic more effectively but also advance justice and fairness in healthcare decision-making.
Related work
Resource allocation is a classic and important problem in many domains, such as economics (e.g., Ref. [30]), management science (e.g., Ref. [14]), and emergency response (e.g., Ref. [28]). More recent works have addressed the need for an integrated approach with different purposes, such as material allocation in a production plan [8] and supplier selection and order allocation [51]. Healthcare resource allocation is one of the most challenging allocation problems, and a large number of works have been developed to identify effective strategies for it [7,12,26,27,33,41,55,61].

Identifying and assessing the populations at risk in emergency resource allocation is yet another challenge that appears in various crises. Risk equity in emergency resource location-allocation has been studied in the existing literature. Ref. [65] developed a bi-level optimization model based on game theory for emergency logistics. The risk is assessed as the product of the incident probability and the exposed population within a certain impact radius corresponding to certain hazardous materials. The aim is to minimize the total environmental risk posed by hazardous waste at different incident sites. More recently, Ref. [62] argued that an ethically justified resource allocation during the current pandemic should meet the needs of the highest-risk categories. In addition, Ref. [48] developed a framework using local COVID-19 data from Los Angeles County to geographically identify the risk level of sub-populations by partitioning based on demographics and vulnerabilities. They assess vulnerabilities based on pre-existing health conditions, barriers to accessing health care, environmental risk, and social vulnerability, and find that a significant presence of racial minorities characterizes the most vulnerable neighborhoods. The proposed framework and medical vulnerability have been discussed and evaluated in Ref.
[54] as well.

Recently, the fairness of algorithmic decision making has been the center of attention for many researchers who design and develop algorithms for different purposes, such as machine learning, ranking, and social welfare [4,5,23,66]. Mitigating the bias of a decision model's outcome, which is mainly caused by inherent bias in the data and societal norms, ensures that the outcome is neither favorable nor adverse toward any specific subgroup [4,16,24,31,60,63,66]. One of the critical problems where the fairness of the outcome matters significantly is scarce resource allocation. The notion of fairness in resource allocation was introduced in Ref. [6] and later given a more precise definition in Ref. [10].

Fairness was first adopted in the bandwidth allocation problem for computer network systems [11,15,18,34,35,36,38,46,47]. In these settings, the amount of resources requested can be modified by different users. Besides, service allocation mainly covers a group of users and is not necessarily a one-to-one allocation. These settings differ from resource allocation for scarce treatments, which is a one-to-one allocation problem.

Fairness has since been addressed in many resource and service allocation methods [25,37,40,41,59]. The importance of ethical considerations in resource allocation and the principal guidelines have been discussed in Ref. [62]. Here we highlight a few works and emphasize how ours differs from them. In Ref. [25], the authors formalize a general notion of fairness for allocation problems and investigate its algorithmic consequences when the decision-maker does not know the distribution of different subgroups (defined by creditworthiness or criminal background) in the population. The distribution is estimated using censored feedback (the individuals who received the resource, not the true number). In our work, we estimate the distribution of different social groups from the data using Bayes' rule.
Singh [53] considers a fair allocation of multiple resources to multiple users and proposes a general optimization model to study the allocation. Our proposed method differs from Ref. [53] in two aspects. First, Ref. [53] assumes multiple resources and multiple types of users, whereas we mainly focus on the resource allocation problem across different regions and different population subgroups. Second, Ref. [53] aims to maximize the coverage, whereas we consider a fairness-diversity trade-off by simultaneously minimizing the diversity and fairness gaps across different regions. Donahue et al. [20] consider the problem of maximizing resource utilization when the demands for the resource are distributed across multiple groups and drawn from probability distributions. They require equal probabilities of receiving the resource across different groups to satisfy fairness and provide upper bounds on the price of fairness over different probability distributions. In our proposed model, we utilize a similar fairness requirement while also requiring diversity (population) considerations across different regions. Thus, our work lies at the intersection of fairness and diversity. Furthermore, we deal with one-to-one allocation instead of the coverage problem. One recent research article related to COVID-19 resource allocation designed a vulnerability indicator for racial subgroups that can be used as a guideline for medical resource allocation [48]. That model cannot identify other vulnerable subgroups that are not geographically clustered and therefore do not form spatially concentrated communities. In our paper, we directly identify subgroups' vulnerability using exposure rates estimated from COVID-19 cases and deaths. We focus on the available data to estimate the necessary parameters and perform an empirical study using the proposed Algorithm 2.
Summary of contributions
In this paper, we aim to design a fairness-aware allocation strategy that considers the trade-off between geographical diversity and social fairness (demographic disparity) in allocating resources. We bring a new aspect to the classic allocation problem by seeking a synergistic intervention strategy that prioritizes disadvantaged people in the distribution of scarce resources. The proposed approach differs from existing works that use the maximum utilization [13] or maximum coverage [25,53] objective. The nature of the treatment allocation problem is not the same as that of coverage problems, since the former is a one-to-one assignment problem, whereas in the latter a resource can cover more than one user, as in police or doctor allocation [20,25]. More specifically, in studying this trade-off we do not seek to share the vaccines equally among groups of populations. Instead, we aim to emphasize the protection of the vulnerable populations or the ones at high risk, depending on their exposure rate, to prevent deaths and the spread of the disease. We focus on the available data to estimate the necessary parameters and perform an empirical study using the proposed Algorithm 2. However, our proposed model can be generalized to resource allocation in other scenarios, such as disaster relief.

In § 2, we first model the notions of geographical diversity and social group fairness to allocate scarce medical treatments and vaccines. We then formulate the allocation problem as an Integer Program (IP), which incorporates diversity and fairness as constraints in addition to the capacity constraint, i.e., the number of available resources. The fairness and diversity constraints are bounded by user-defined hyperparameters, $\epsilon_d$ and $\epsilon_f$, which are the allowed diversity and fairness gaps, respectively. To efficiently obtain a feasible solution of the original IP problem, we relax it to a Linear Programming (LP) problem. Moreover, the fairness and diversity constraints are much more complicated than the capacity constraint. Hence, to deal with them efficiently, we use the penalty method, a common practice in constrained optimization. The two constraints are combined into a single objective function using a trade-off hyperparameter $\alpha$. We prove theoretically that the converted problem is equivalent to the original feasibility problem. Subsequently, a binary search algorithm is used to obtain a feasible range of $\alpha$. We evaluate the allocation scenario under different $(\epsilon_d, \epsilon_f)$ values and provide the corresponding feasible range of $\alpha$. Different levels of the trade-off between diversity and fairness are presented. The proposed framework can be applied at different stages of the pandemic to estimate the exposure rate of population subgroups and obtain a feasible allocation considering both population size and exposed population. In § 4, we evaluate the performance of the proposed model using COVID datasets from Chicago for vaccine and scarce treatment allocations. The results demonstrate the impact of incorporating fairness criteria in the allocation model compared to diverse and uniform allocation. The paper concludes in § 6.
Problem definition
Traditional resource allocation approaches have been widely studied in the optimization literature [14,28,30], where a limited resource is distributed to areas based on a single objective (e.g., a function of the total population). However, regions might have different needs or require a higher priority in a medical resource allocation setting (e.g., vaccines). An allocation based solely on geographical diversity suggests an equalized distribution among the regions, which may not provide equity in the sense of social fairness. To ensure “equity” as well as “equality”, we aim to model and incorporate a fairness notion on demographic subgroups as another principle in the allocation decision process. Identifying each region's priority or risk level is critical since it can save many lives and protect vulnerable subgroups. We aim to design a fair and diverse resource allocation framework by modeling Social Group Fairness and Geographical Diversity, as well as their trade-off, in an optimization setting. The optimization model minimizes the fairness and diversity gaps across different subgroups of different regions while satisfying the capacity constraint, in order to protect the vulnerable population. A trade-off analysis demonstrates the price of incorporating fairness, measured through the fairness gap (the maximum allocation gap between demographic subgroups) with and without fairness consideration in the allocation. We propose a tuning approach to identify an optimal range for the trade-off hyperparameter ($\alpha$) in order to provide the best possible allocation solution under different scenarios.
Notations
We consider a centralized decision maker who allocates $b$ available units of vaccines to a set of centers denoted by $M$. The centers include clinics, hospitals, pharmacies, etc. For convenience, we assume the entire area covered by the centers in $M$ to be a city, but it can be a county or another administrative district. Let $x_j$ be the decision variable denoting the number of vaccines allocated to center $j \in M$. Let $z_j$ be the region, such as a list of zip-code areas, assigned to be covered by center $j$. For simplicity, we assume there is no overlap between the zip-code areas covered by different centers (even if they are close in distance), i.e., $z_l \cap z_k = \emptyset$, $\forall l, k \in M$ with $l \neq k$. We also assume that residents can only receive resources from the center that covers the region where they reside. Such a policy is not uncommon in practice, especially in the distribution of scarce resources.

Next, we introduce some key concepts and their notation. If we consider the geographical location where an individual resides to be a random variable, denoted by $Z$, then $\{z_j, j \in M\}$ are the possible values of $Z$. Therefore, $P(Z = z_j)$ is the proportion of the population who reside in $z_j$. Let $U_1, \dots, U_p$ be discrete-valued sensitive variables corresponding to demographic and socioeconomic attributes, and let $S_i$, for $i = 1, \dots, p$, be the set of possible levels of each sensitive variable. For instance, if $[U_1, U_2, U_3]$ represent three attributes, income, race, and gender, respectively, then $S_1 = \{\text{low}, \text{medium}, \text{high}\}$, $S_2 = \{\text{black}, \text{latinx}, \text{white}, \text{others}\}$, and $S_3 = \{\text{female}, \text{male}\}$. The combinations of all the levels of $U_1, \dots, U_p$ form a set denoted by $\mathcal{G}$ and indexed by a set $I$, i.e., $\mathcal{G} = \{g_i, i \in I\}$. In other words, $\mathcal{G} = S_1 \times \cdots \times S_p$. Any $g_i \in \mathcal{G}$ corresponds to a possible combination of levels of $U_1, \dots, U_p$, i.e., one level chosen from each of $S_1, \dots, S_p$, and there can be a group of the population whose values of the sensitive variables $(U_1, \dots, U_p)$ are equal to $g_i$. In short, we call it social group $g_i$. Let $s_{ij}$ be the population size of the social group $g_i$ who reside in region $z_j$, where $i \in I$, $j \in M$. Therefore, $s_{ij} / \sum_{i \in I} \sum_{j \in M} s_{ij} = P([U_1, \dots, U_p, Z] = [g_i, z_j])$. Let $E$ be a binary random variable, with $E = 1$ representing that the individual is exposed to the infectious disease and $E = 0$ otherwise. So $P(E = 1 \mid g_i)$ is the exposure rate of the social group $g_i$, and $P(E = 1 \mid g_i, z_j)$ is the exposure rate of the social group $g_i$ living in the area $z_j$. It is intuitive to assume that these exposure rates depend on the social groups and regions. We discuss reasonable assumptions on the exposure rates and how to estimate them later.

A key concept in resource allocation is the amount of resource per capita. It is the ratio between the quantity of available resources and the size of the population who will receive the resource. Denote by $V$ the amount of resource one individual receives. One important assumption we make here is that $V$ follows a discrete uniform distribution. There are three parameters involved: the amount of resource $X$, the population $\mathcal{A}$ who are to receive the $X$ amount of resource, and the size of the population $|\mathcal{A}|$, the cardinality of $\mathcal{A}$. Therefore, the mean value of $V$, $\mathbb{E}(V) = X / |\mathcal{A}|$, is the resource per capita, and it varies with respect to the three parameters.

The focus of this article is on the following problem. Given a limited amount of resource $b$, such as vaccines, how should the decision maker allocate the amount of resource $x_j$ to each center $j$ so as to satisfy geographical diversity (quantified by $D_j$) and social fairness (quantified by $F_i$)?
Diversity and fairness modeling
In this section, we define geographical diversity and social group fairness, the latter of which is based on the equality of opportunity notion of fairness [24]. To quantify geographical diversity and social fairness, we introduce $D_j$ and $F_i$ in Equations (1) and (7).

Definition (Geographical Diversity). Geographical diversity of an allocation of limited resource to a set of centers $M$ is satisfied if, $\forall j \in M$, on average the resource per capita is invariant with respect to the location of the groups of the population, i.e., $\mathbb{E}(V \mid z_j)$ does not vary with respect to the location $z_j$.

The notation is simplified, and we use $z_j$ to refer to the population who reside in region $z_j$. Let $s_{ij}$ be the population size of the social group $g_i$ residing in $z_j$, and let $\mathbb{E}(V)$ denote the averaged resource per capita over the entire city under consideration in the resource allocation plan; the location is omitted since it is obvious. It is straightforward to formulate the geographical diversity for $\forall j \in M$ as follows.

$$D_j = \left| \mathbb{E}(V \mid z_j) - \mathbb{E}(V) \right| = \left| \frac{x_j}{\sum_{i \in I} s_{ij}} - \frac{b}{\sum_{i \in I} \sum_{l \in M} s_{il}} \right|. \tag{1}$$

So if geographical diversity is strictly met, $D_j = 0$ for all $j$.

Geographical diversity represents the conventional or minimal requirement for resource distribution, which requires evenly distributing the resources among all geographical regions across the entire city. Next, we introduce the concept of social fairness. In this definition, we emphasize the even distribution of resources among the endangered population, i.e., those exposed to the infectious disease, regardless of social group.

Definition (Social Fairness). Social fairness of an allocation of limited resource to a set of centers $M$ is satisfied if, $\forall i \in I$, the averaged resource per capita is invariant with respect to the values of $U_1, \dots, U_p$ of the group of the population, i.e., $\mathbb{E}(V \mid E = 1, g_i)$ does not vary with respect to the social group $g_i$.

The notation is simplified, and we use $E = 1$ and $g_i$ to denote the exposed individuals in social group $g_i$. Directly translating this definition into a formula, the fairness principle should be

$$F_i = \left| \mathbb{E}(V \mid E = 1, g_i) - \mathbb{E}(V \mid E = 1) \right|. \tag{2}$$

However, the calculation of $\mathbb{E}(V \mid E = 1, g_i)$ and $\mathbb{E}(V \mid E = 1)$ is not straightforward, and we derive them as follows.

We first calculate the resource per capita for the exposed individuals who reside in $z_j$, disregarding the social group, i.e.,

$$\mathbb{E}(V \mid E = 1, z_j) = \frac{x_j}{\sum_{i \in I} s_{ij} P(E = 1 \mid g_i, z_j)}, \tag{3}$$

in which the denominator is the number of exposed individuals in $z_j$. Then,

$$\mathbb{E}(V \mid E = 1) = \frac{b}{\sum_{i \in I} \sum_{j \in M} s_{ij} P(E = 1 \mid g_i, z_j)}. \tag{4}$$

In this article, we assume that the chance of exposure of an individual depends only on the individual's social attributes and is independent of the geographical location, i.e., $P(E = 1 \mid U_1, \dots, U_p, Z) = P(E = 1 \mid U_1, \dots, U_p)$. Equivalently, $\forall i \in I$ and $\forall j \in M$, $P(E = 1 \mid g_i, z_j) = P(E = 1 \mid g_i)$. This assumption is reasonable since, in many U.S. cities such as Chicago, the geographical locations of the residents are highly correlated with their social and economic status. For example, in Fig. 1, the percentages of the major ethnic groups, White, Hispanic, and African-American, are shown in three heat maps. It is very clear that the geographical locations of the residents and the racial groups are heavily correlated. Of course, this assumption also makes the rest of the formulation much simpler. Under this assumption, we can simply obtain

$$\mathbb{E}(V \mid E = 1, g_i, z_j) = \mathbb{E}(V \mid E = 1, z_j) = \frac{x_j}{\sum_{k \in I} s_{kj} P(E = 1 \mid g_k)}. \tag{5}$$
Fig. 1
Map of race and ethnicity by neighborhood in Chicago.
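Next, to obtain $\mathbb{E}(V \mid E = 1, g_i)$ for each social group $g_i$, we need to integrate Equation (5) over the regions $z_j$, weighting each region by the share of group $g_i$ that resides in it, i.e.,

$$\mathbb{E}(V \mid E = 1, g_i) = \sum_{j \in M} \frac{x_j}{\sum_{k \in I} s_{kj} P(E = 1 \mid g_k)} \cdot \frac{s_{ij}}{\sum_{l \in M} s_{il}}. \tag{6}$$

Based on the derivations, we can formulate the fairness for $\forall i \in I$,

$$F_i = \left| \sum_{j \in M} \frac{x_j}{\sum_{k \in I} s_{kj} P(E = 1 \mid g_k)} \cdot \frac{s_{ij}}{\sum_{l \in M} s_{il}} - \frac{b}{\sum_{k \in I} \sum_{l \in M} s_{kl} P(E = 1 \mid g_k)} \right|. \tag{7}$$

Similar to the definition of $D_j$, if the fairness is strictly met, $F_i = 0$ for all $i \in I$.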
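To make the two gaps concrete, the following is a minimal sketch that evaluates the diversity gap of each region and the fairness gap of each social group for a given allocation, under the assumption stated above that exposure rates depend only on the social group. The function names, the nested-list layout `s[i][j]`, and the toy numbers are illustrative assumptions and do not come from the paper's data.

```python
def diversity_gaps(x, s):
    """Per-region gap |x_j / pop_j - b / total_pop|.

    x[j]    : units allocated to region j (b is taken as sum(x))
    s[i][j] : size of social group i living in region j
    """
    regions = range(len(x))
    pop = [sum(row[j] for row in s) for j in regions]
    overall = sum(x) / sum(pop)  # resource per capita over the whole city
    return [abs(x[j] / pop[j] - overall) for j in regions]


def fairness_gaps(x, s, e):
    """Per-group gap between the group's resource per exposed capita and
    the city-wide resource per exposed capita.

    e[i] : assumed exposure rate of social group i (same in every region)
    """
    regions = range(len(x))
    # Number of exposed individuals in each region.
    exposed = [sum(row[j] * e[i] for i, row in enumerate(s)) for j in regions]
    overall = sum(x) / sum(exposed)
    gaps = []
    for row in s:
        group_size = sum(row)
        # Average the regional per-exposed-capita amount over the regions,
        # weighted by the share of the group living in each region.
        per_capita = sum(x[j] / exposed[j] * row[j] / group_size for j in regions)
        gaps.append(abs(per_capita - overall))
    return gaps


# Toy city (illustrative numbers): two groups, two regions, 100 units.
s = [[800, 200], [200, 800]]
e = [0.1, 0.3]
uniform = [50, 50]        # equal per-capita split across regions
by_exposure = [35, 65]    # split proportional to exposed population
```

In this toy city the uniform split has zero diversity gap but a positive fairness gap, while the exposure-proportional split has zero fairness gap but a positive diversity gap, which is the tension the optimization model has to balance.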
Fair and diverse allocation optimization
As explained above, if the diversity and fairness requirements are strictly met, all $D_j$ and $F_i$ should be 0. However, such constraints are too restrictive and difficult or impossible to satisfy for all regions and social groups. Define $D = \max_{j \in M} D_j$ and $F = \max_{i \in I} F_i$. $D$ and $F$ are auxiliary decision variables that signify the tight upper bounds on the diversity and fairness constraints, i.e., $D_j \le D$ and $F_i \le F$ for any $j \in M$ and $i \in I$. Ideally, we want to find a feasible solution $x$ such that both upper bounds are equal to zero. In addition, $x_j$ for $j \in M$ should satisfy the capacity constraint, i.e., $\sum_{j \in M} x_j = b$. Seeking to achieve geographical diversity and social fairness simultaneously, one can formulate this problem as a multi-objective (MO) minimization:

$$\text{P1:} \quad \min_{x, D, F} \; (D, F) \quad \text{s.t.} \quad D_j \le D \;\; \forall j \in M, \quad F_i \le F \;\; \forall i \in I, \quad \sum_{j \in M} x_j = b, \quad x_j \in \mathbb{Z}_{\ge 0} \;\; \forall j \in M.$$

The integer constraint arises because resources such as vaccines are usually counted in integers, and one individual only receives one vaccination. The integer constraints in P1 can be relaxed, particularly in practice when $b$ is large.

A common solution to the multi-objective optimization problem is the weighted sum method, which leads to a simpler minimization problem P2. After removing the absolute value operation in the constraints, we obtain the relaxed linear programming (LP) problem

$$\text{P2:} \quad \min_{x, D, F} \; (1 - \alpha) D + \alpha F \quad \text{s.t.} \quad D_j^+ \le D, \;\; D_j^- \le D \;\; \forall j \in M, \quad F_i^+ \le F, \;\; F_i^- \le F \;\; \forall i \in I, \quad \sum_{j \in M} x_j = b, \quad x_j \ge 0 \;\; \forall j \in M,$$

where $D_j^+$ refers to the positive side of the absolute value function, $D_j^+ = \frac{x_j}{\sum_{i \in I} s_{ij}} - \frac{b}{\sum_{i \in I} \sum_{l \in M} s_{il}}$, and $D_j^- = -D_j^+$ refers to the negative side of the absolute value function. Similarly, for $F_i$, we have $F_i^+$ and $F_i^-$. Here $\alpha \in [0, 1]$ is a hyperparameter that controls the trade-off between the importance of fairness and diversity. If $\alpha > 0.5$, the objective function focuses more on achieving fairness, and it focuses more on diversity if $\alpha < 0.5$.

If we relax the integer constraints of P1, based on multi-objective optimization theory, it is easy to conclude that for any $x^*$ in the Pareto front of P1, there exists an $\alpha^* \in [0, 1]$ such that $x^*$ is an optimal solution of P2. This is because the remaining constraints of P1, including $D_j \le D$ for all $j \in M$, $F_i \le F$ for all $i \in I$, $\sum_{j \in M} x_j = b$, and $x_j \ge 0$ for all $j \in M$, form a convex polyhedron. In particular, the constraints $D_j \le D$ and $F_i \le F$ are all linear in $x$. Thus, we have the above conclusion.

In practice, the simplex method can be used to find the optimal solution of P2 efficiently with common optimization libraries like IBM CPLEX and SciPy.

Once the optimal solution of the LP-relaxation problem, P2, is obtained for a given $\alpha$ value, we need to round the solution to integers. But the rounded solution is not necessarily feasible for the minimization problem P2. To ensure the feasibility of the rounded solution, we employ a heuristic rounding approach in Algorithm 1 (Rounding Heuristic), which is similar to the one in Ref. [56]. The algorithm starts by rounding the LP relaxation solution to the nearest integer values. Next, if the capacity constraint $\sum_{j \in M} x_j = b$ is violated because too much has been allocated, the algorithm reduces $x_j$ for the most populated regions by the total excess amount. If the allocation falls short of the capacity, the algorithm increases $x_j$ for the regions with the largest exposed populations using the remaining resources. Note that in the Diverse-only ($\alpha = 0$) and Fairness-only ($\alpha = 1$) scenarios, the algorithm only considers the population and the exposed population, respectively, to sort the $x_j$'s.
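The rounding step described above can be sketched as follows. This is only an illustration of the prose description, not a reproduction of Algorithm 1: it assumes half-up rounding and assumes that any excess or shortfall is corrected one unit at a time, cycling through the regions in sorted order. The function name `round_allocation` and its parameters are hypothetical.

```python
import math


def round_allocation(x_lp, b, population, exposed):
    """Round an LP-relaxed allocation to integers that sum exactly to b.

    x_lp       : fractional allocation per region (solution of the LP)
    b          : total number of available units (non-negative integer)
    population : population of each region (used when removing units)
    exposed    : exposed population of each region (used when adding units)
    """
    if b < 0:
        raise ValueError("b must be non-negative")
    # Step 1: round every entry to the nearest integer (ties round up).
    x = [math.floor(v + 0.5) for v in x_lp]

    # Step 2: pick the repair direction and the region ordering.
    if sum(x) > b:
        # Over capacity: take units back from the most populated regions.
        order = sorted(range(len(x)), key=lambda j: population[j], reverse=True)
        step = -1
    else:
        # Under capacity: give spare units to the most exposed regions.
        order = sorted(range(len(x)), key=lambda j: exposed[j], reverse=True)
        step = 1

    # Step 3: adjust one unit at a time until the capacity constraint holds.
    k = 0
    while sum(x) != b:
        j = order[k % len(order)]
        if x[j] + step >= 0:  # never push an allocation below zero
            x[j] += step
        k += 1
    return x


# Rounding overshoots by one unit; the most populated region gives it back.
over = round_allocation([33.5, 41.5, 25.0], 100, [1000, 3000, 2000], [300, 100, 200])
# Rounding undershoots by one unit; the most exposed region receives it.
under = round_allocation([33.4, 41.4, 25.2], 100, [1000, 3000, 2000], [300, 100, 200])
```

Adjusting one unit at a time keeps the repaired allocation close to the LP solution; concentrating the whole correction on a single top-ranked region is another reading of the description and would be an equally simple variant.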
Feasibility
As we discussed in § 3, the ideal situation would be to minimize the upper bounds $D$ and $F$ to the full extent. An important observation is that the ideal case with zero upper bounds on both the fairness and the diversity constraints is almost always impossible, due to the underlying biases in data and historical discrimination in society. Therefore, in practice, a small positive upper bound threshold is considered sufficient to satisfy fairness and diversity. For example, the fairness requirement can be thought of in terms of the US Equal Employment Opportunity Commission's “four-fifths rule,” which requires that “the selection rate for any race, sex, or ethnic group [must be at least] four-fifths (4/5) (or eighty percent) of the rate for the group with the highest rate”.¹ We consider a similar requirement for diversity as well.

The solution obtained under the MO model, P2, is not guaranteed to satisfy these requirements. For example, a solution with zero unfairness (but an unsatisfactory level of diversity) can lie on the Pareto front of the MO problem, and hence might be the optimal output, even though it is not a valid solution.

Subsequently, we introduce two control parameters $\epsilon_d$ and $\epsilon_f$, which are the acceptable thresholds for the diversity and fairness requirements, i.e., $D \le \epsilon_d$ and $F \le \epsilon_f$. The violation threshold parameters $\epsilon_d$, $\epsilon_f$ are user-defined values that determine how diverse and fair the allocation should be. A zero fairness threshold corresponds to a fully fair allocation, whereas a threshold large enough to be non-binding corresponds to a completely fairness-ignorant allocation that solely considers diversity. These violations must be relatively small, so that the resources allocated to a particular region and a particular group nearly achieve the required level of diversity and fairness, respectively. As a quantitative metric, we say that an allocation is fair if $F \le \epsilon_f$ and diverse if $D \le \epsilon_d$. No feasible solution may exist for the optimization problem, depending on the constraints, weights, and available scarce resources. In this case, the decision-maker will have no choice but to relax the constraints. In § 4 we discuss the allocation under different scenarios.

Fortunately, as we shall prove in Theorem 1, if the problem has a feasible solution given the fairness and diversity constraints, there must exist an $\alpha$ value under which the optimum solution of P2 is feasible, and vice versa.

Theorem 1. Let $X'$ be the set of feasible solutions of P2 that satisfy both thresholds, $D \le \epsilon_d$ and $F \le \epsilon_f$. If $X' \neq \emptyset$: given $x^* \in X'$, $\exists \alpha^*$ such that $x^*$ is an optimal solution of P2. If $\nexists \alpha^*$ such that $x^*$ is an optimal solution of P2 that satisfies both conditions $D \le \epsilon_d$ and $F \le \epsilon_f$, then $X' = \emptyset$.

We now need to find the $\alpha$ value for which an optimum of P2 satisfies the diversity and fairness requirements. A naive approach would be a brute-force search over $[0, 1]$ in which $\alpha$ is increased by a small step size. However, this is not practical, since the optimization problem has to be solved in each iteration: as the step size decreases, the number of times P2 must be solved increases accordingly. Therefore, we first obtain the monotone properties of the optimal upper bounds $D$ and $F$ with respect to $\alpha$. This result is used to find a proper $\alpha$ value later in § 3.2.

Proposition 1. Let $(x^{(1)}, D^{(1)}, F^{(1)})$ and $(x^{(2)}, D^{(2)}, F^{(2)})$ be optimum solutions of P2 given $\alpha_1$ and $\alpha_2$, respectively. It can be shown that if $\alpha_1 \le \alpha_2$, then $D^{(1)} \le D^{(2)}$ and $F^{(1)} \ge F^{(2)}$.
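The monotone property can be motivated by a standard exchange argument for weighted-sum objectives. The sketch below is not necessarily the authors' proof. It assumes that the objective of P2 has the form $(1-\alpha)D + \alpha F$ (consistent with larger $\alpha$ emphasizing fairness), that $\alpha_1 < \alpha_2$, and it writes $(D^{(k)}, F^{(k)})$ for the optimal upper bounds obtained at $\alpha_k$, a notation used here for illustration.

```latex
% Each solution is optimal for its own weight, and feasible for the other weight:
\begin{align*}
(1-\alpha_1) D^{(1)} + \alpha_1 F^{(1)} &\le (1-\alpha_1) D^{(2)} + \alpha_1 F^{(2)}, \\
(1-\alpha_2) D^{(2)} + \alpha_2 F^{(2)} &\le (1-\alpha_2) D^{(1)} + \alpha_2 F^{(1)}.
\end{align*}
% Adding the two inequalities and collecting terms:
\begin{equation*}
(\alpha_2 - \alpha_1)\left[ \bigl(D^{(1)} - D^{(2)}\bigr) - \bigl(F^{(1)} - F^{(2)}\bigr) \right] \le 0
\quad\Longrightarrow\quad
D^{(1)} - D^{(2)} \le F^{(1)} - F^{(2)} .
\end{equation*}
% Fairness gap is non-increasing in alpha:
% if F^{(1)} < F^{(2)}, the inequality above gives D^{(1)} < D^{(2)} as well, so
% (1-\alpha_2) D^{(1)} + \alpha_2 F^{(1)} < (1-\alpha_2) D^{(2)} + \alpha_2 F^{(2)},
% contradicting the optimality of solution 2 at \alpha_2. Hence F^{(1)} \ge F^{(2)}.
%
% Diversity gap is non-decreasing in alpha:
% the first optimality condition rearranges to
\begin{equation*}
(1-\alpha_1)\bigl(D^{(1)} - D^{(2)}\bigr) \le \alpha_1 \bigl(F^{(2)} - F^{(1)}\bigr) \le 0 ,
\end{equation*}
% and 1 - \alpha_1 > 0 because \alpha_1 < \alpha_2 \le 1, so D^{(1)} \le D^{(2)}.
```

When $\alpha_1 = \alpha_2$ the argument only holds up to ties between alternative optimal solutions, which is why the sketch treats the strict case.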
Choosing trade-off parameter, α
The parameter $\alpha$ in P2 controls the trade-off between the diversity and fairness of the allocation decision. Using the monotone property of $D$ and $F$ with respect to $\alpha$ in Proposition 1, we now propose an approach to find a proper $\alpha$ value that satisfies both the fairness and diversity constraints for given $(\epsilon_d, \epsilon_f)$.

Using an example, we show the high-level idea of the tuning algorithm under different scenarios. In Fig. 2, the monotonically decreasing red curve corresponds to the fairness constraint, and the monotonically increasing blue curve corresponds to the diversity constraint. The dashed horizontal lines correspond to the thresholds allowed for the diversity and fairness constraints, respectively. For $\alpha < \alpha_1$, we can see that $F$ exceeds the allowed threshold $\epsilon_f$. Based on Proposition 1, we should remove the range $[0, \alpha_1)$ from consideration. For $\alpha > \alpha_3$, $D$ exceeds the allowed threshold $\epsilon_d$; hence, we can remove the range $(\alpha_3, 1]$ from consideration. For $\alpha = \alpha_2$, both $D \le \epsilon_d$ and $F \le \epsilon_f$ are satisfied. The color bar below the plot in Fig. 2 shows the different ranges of $\alpha$. The green segment corresponds to the range of $\alpha$ values for which the optimum solution of P2 satisfies $D \le \epsilon_d$ and $F \le \epsilon_f$. Consequently, we can prune the infeasible intervals of $\alpha$ by evaluating the violation of the diversity and fairness thresholds.
Fig. 2
Feasible region with respect to α.
Algorithm 2 presents the proposed tuning algorithm. The algorithm starts from the midpoint of the α interval and solves P2 with that value. It then repeatedly halves the α interval to prune it further until a narrow feasible range of α is obtained. In each iteration, if the solution obtained from P2 does not satisfy the diversity threshold, the algorithm prunes the interval above the current midpoint. Similarly, if the fairness threshold is not satisfied, the algorithm prunes the interval below the current midpoint. If neither threshold is satisfied by the solution obtained from P2 at the current midpoint, no feasible solution exists for the given diversity and fairness thresholds. Finally, when both constraints are satisfied, the algorithm returns a valid solution for P2, which is rounded to obtain an integer solution for P1.
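The pruning logic described above can be sketched as a bisection over α. This is a minimal illustration and not the authors' Algorithm 2: the callable `solve_p2`, the names `eps_d` and `eps_f` for the diversity and fairness thresholds, the tolerance `tol`, and the toy gap curves are all assumptions introduced here. The sketch also returns the first feasible α it meets instead of a feasible range.

```python
from typing import Callable, Optional, Tuple


def tune_alpha(
    solve_p2: Callable[[float], Tuple[float, float]],
    eps_d: float,
    eps_f: float,
    tol: float = 1e-2,
) -> Optional[float]:
    """Bisection over alpha in [0, 1].

    solve_p2(alpha) is a hypothetical stand-in for solving P2; it must
    return (diversity_gap, fairness_gap) of the optimal allocation.
    Assumes the diversity gap is non-decreasing and the fairness gap is
    non-increasing in alpha, as stated in Proposition 1.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        div_gap, fair_gap = solve_p2(mid)
        div_ok = div_gap <= eps_d
        fair_ok = fair_gap <= eps_f
        if div_ok and fair_ok:
            return mid          # both thresholds satisfied
        if not div_ok and not fair_ok:
            return None         # no alpha can satisfy both thresholds
        if not div_ok:
            hi = mid            # diversity violated: discard larger alpha
        else:
            lo = mid            # fairness violated: discard smaller alpha
    return None                 # feasible range (if any) is narrower than tol


def toy_solve_p2(alpha: float) -> Tuple[float, float]:
    """Made-up monotone gap curves used only to exercise the search."""
    return 0.1 * alpha, 0.3 * (1.0 - alpha)


if __name__ == "__main__":
    print(tune_alpha(toy_solve_p2, eps_d=0.08, eps_f=0.09))
    print(tune_alpha(toy_solve_p2, eps_d=0.02, eps_f=0.09))
```

With `tol = 0.01`, the loop solves P2 at most 7 times, whereas a grid search over [0, 1] with step 0.01 would solve it 101 times, which is the practical motivation for pruning by halves.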
Price of fairness
We now define the Price of Fairness (PoF) as the difference between the fairness gaps of the optimal solutions of P2 obtained with and without the fairness constraints. As described in § 2, Equation (7) is the fairness constraint defined for each social group. Solving P2 with and without the fairness constraints yields different allocation solutions and, consequently, different fairness and diversity gaps. This allows us to compare the Fair-Diverse allocation with the Diverse-only allocation and thereby analyze the impact of the fairness constraints, namely the PoF.

The PoF metric assists the decision-maker in choosing the fairness scheme, mainly the diversity and fairness thresholds. Note that there is a trade-off between the diversity and fairness gaps in P2: when the allocation scheme is more focused on diversity (a smaller diversity threshold), the control parameter α becomes close to 0, and P2 then minimizes the diversity upper bound subject only to the capacity constraint. Comparing the solution of such a problem with one that focuses on fairness (α close to 1), we expect to see a larger fairness gap and a smaller diversity gap. In §4, we evaluate the PoF of the allocation solutions obtained under different scenarios.
Case study: resource allocation for COVID-19 relief in US cities
In this section, we apply the proposed fair and diverse allocation strategy to the planning of the distribution of medical resources for COVID-19 relief among three US cities (Chicago, New York, and Baltimore). Chicago is a segregated city with relatively large Hispanic and Black populations [9]. Several studies have reported a positive correlation between the proportions of Black and Hispanic residents and COVID-19 cases [2,9,42,64]. More specifically, Ref. [9] identifies positive and statistically significant associations between the proportions of Black and Hispanic population and per capita COVID-19 cases using data from six segregated US cities. In particular, New York City has the highest rate of confirmed cases, followed by Chicago and Baltimore. We therefore base our problem instances on these cities. Even though our case study is limited to three cities, it includes the first (New York City) and third (Chicago) largest cities in the USA, and it demonstrates our approach for cities with larger vulnerable populations (e.g., Baltimore).

We rely on the US census for population and demographic distribution data. For COVID-19 cases, death tolls, and hospitalizations, we separately collect the data provided by the governmental departments of each city considered. The primary sources of each dataset are described in Section 3.1. In the subsequent sections, we first perform a descriptive analysis of the data to motivate the need for a fair and diverse distribution of medical resources. Next, we evaluate the performance of our proposed Fair-Diverse allocation approach and identify a reasonable trade-off parameter α using the binary search of Algorithm 2. Lastly, we calculate the price of fairness (PoF) to highlight the role of the fairness constraints in our optimization setting. The analytical results, optimization models, and algorithms were implemented in Python 3.7 using the Docplex and Sklearn packages. The code is available on GitHub (footnote 2). All experiments were performed on a MacBook with a 1.8 GHz dual-core Intel Core i5 CPU and 8 GB of memory.
Data description
Population Dataset: The uszipcode Python package (footnote 3) provides detailed geographic, demographic, socio-economic, real estate, and education information at the state, city, and even zip-code level for different areas within the US. According to its documentation, the package keeps its database up to date by running a crawler every week to collect additional data points from multiple data sources. This dataset does not provide the intersectional population. We refer to this dataset as Pop. throughout the experiment section.

City of Chicago COVID-19 Database [44] provides daily data on COVID-19 positive-tested cases, death tolls, hospitalizations, and individuals' attributes (e.g., age, race, gender) to track the pandemic in this city. In our study, we primarily use the COVID-19 Daily Cases data to reveal the inequality among different population subgroups. We refer to this dataset as Chicago-COVID-Cases throughout the experiment section. The City of Chicago COVID-19 database also provides daily data in the COVID-19 Cases, Tests, and Deaths by ZIP Code dataset 4.2. We refer to this dataset as Chicago-COVID-Zipcode throughout the experiment section.

New York City (NYC) COVID-19 Data Repository (footnote 4) consists of different COVID-19 related datasets, including daily, weekly, and monthly data, data on SARS-CoV-2 variants, the cumulative COVID-19 cases, etc. For our analysis, we use COVID-19 case and death totals by age, race, and gender since the start of the COVID-19 outbreak in NYC, February 29, 2020. We refer to this dataset as NYC-COVID-Zipcode throughout the experiment section.

City of Baltimore COVID-19 Data Dashboard (footnote 5) provides different statistics and visualizations for COVID-19 data in Baltimore. The main source of this data is the Maryland Department of Health [45], and it is updated daily. This dataset includes the numbers of COVID-19 cases and deaths at the zip-code level, as well as total cases and death tolls by age, race, and gender. We refer to this dataset as BLT-COVID-Zipcode throughout the paper.
Descriptive analysis on COVID-19 risk factor among different population subgroups, and different regions (zip-codes)
Utilizing the Chicago-COVID-Cases dataset, we first identify the contribution of each demographic attribute to the total number of COVID-19 cases and deaths in each region of the Chicago area. This analysis reveals critical insights into the differences among demographic subgroups.

Next, we perform a principal component analysis (PCA) [1] on Pop. merged with Chicago-COVID-Zipcode and produce a biplot, shown in Fig. 3, to identify and compare the contributions of different attributes to the numbers of COVID-19 deaths and cases. Note that we used cross-validation to tune the number of PCA components. To better visualize the impact of these attributes on different regions, we apply K-means clustering [32] to partition the geographical areas (zip-codes) based on their numbers of COVID-19 deaths and cases. We cluster them into three categories and label them as high-, medium-, and low-impact areas, accordingly (see Fig. 4).
Fig. 3
Biplot of demographic features using PCA Analysis with Kmeans clustering -Chicago City regions.
Fig. 4
Exposure rates of population subgroups-Chicago city.
Our findings in this part show that areas with more COVID-19 deaths and cases tend to have a larger population, a larger elderly population (compared with the young population), larger Black or African American and Latino populations (compared with other races), and a lower median income. Furthermore, we believe that the observed negative correlation between income and COVID-19 cases should raise serious concerns in future decision-making procedures. One explanation is that lower-income individuals are mostly paid daily and cannot afford living expenses if they self-quarantine or stop working. Consequently, the pandemic inevitably hits areas with larger lower-income populations harder. However, we do not have COVID-19 data by income level; hence, we do not consider income in this study.

In brief, we observe notable differences in the contribution to COVID-19 positive cases and death tolls across specific demographic attributes (e.g., older age). This fact highlights the critical role of a Fair-Diverse model, which considers not only the overall population size but also the exposed population for scarce resource allocation problems.
Estimating the distribution of exposed population
Estimating the marginal distribution of high-risk individuals in each demographic group g, i.e., P(E = 1 | g), requires data on positive-tested (infected) cases and death tolls. In this article, we intend to propose a guideline for the allocation of vaccines and scarce treatments for the COVID-19 pandemic. However, we only have access to the counts of infected individuals in each social group, namely P(g | E = 1). The probability P(E = 1 | g) can be calculated from the Bayes formula

\[ P(E = 1 \mid g) = \frac{P(g \mid E = 1)\, P(E = 1)}{P(g)}. \]

We will use P(E = 1 | g) in §4.4 to form the fairness constraints of the optimization problem, Equation (7), and to calculate the exposed population in each zip-code. Fig. 4 presents the exposure rates for different population subgroups in Chicago (the exposure rates for the other cities are provided in Appendix Fig. 15 and Fig. 16).
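The Bayes step can be illustrated with a short sketch. The function name `exposure_rates`, the group labels, and the counts below are hypothetical and are not taken from the datasets; the sketch also assumes that P(E = 1) is estimated as total cases divided by total population and that P(g) is the population share of group g.

```python
def exposure_rates(cases_by_group, population_by_group):
    """Estimate P(E=1 | g) for each group g via Bayes' rule.

    cases_by_group: infected counts per group (hypothetical input).
    population_by_group: population counts per group (hypothetical input).
    """
    total_cases = sum(cases_by_group.values())
    total_pop = sum(population_by_group.values())
    p_e = total_cases / total_pop                       # P(E=1), assumed estimate
    rates = {}
    for group, pop in population_by_group.items():
        p_g_given_e = cases_by_group[group] / total_cases   # P(g | E=1)
        p_g = pop / total_pop                                # P(g)
        rates[group] = p_g_given_e * p_e / p_g               # Bayes' rule
    return rates


if __name__ == "__main__":
    # Made-up counts for two equally sized groups.
    cases = {"A": 300, "B": 100}
    population = {"A": 1000, "B": 1000}
    print(exposure_rates(cases, population))
```

Under these assumptions the estimate reduces to the per-group case rate (cases in g divided by the population of g), so group A in the toy data has three times the exposure rate of group B.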
Table 4
Resource Allocation (Racial groups): Top 15 populated areas- NYC.

| Zipcode | Total Population | Exposed Population | alpha = 0.5 | alpha_tuned | Diverse-only | Fair-Only |
|---|---|---|---|---|---|---|
| 11368 | 101558 | 12156 | 6614 | 8226 | 6525 | 0 |
| 11226 | 97766 | 7186 | 6367 | 7918 | 6282 | 0 |
| 11373 | 95500 | 8765 | 6219 | 7735 | 6136 | 0 |
| 11220 | 94949 | 8798 | 6183 | 7690 | 6101 | 0 |
| 11385 | 94032 | 7941 | 5960 | 7616 | 6042 | 0 |
| 11236 | 91632 | 6051 | 5967 | 4354 | 5888 | 0 |
| 10467 | 91077 | 8504 | 5931 | 7377 | 5852 | 0 |
| 11219 | 90499 | 5798 | 5736 | 4300 | 5815 | 0 |
| 10025 | 90025 | 5950 | 5706 | 4277 | 5784 | 0 |
| 11207 | 88857 | 7579 | 5787 | 7197 | 5709 | 0 |
| 11208 | 88639 | 8445 | 5772 | 7179 | 5695 | 0 |
| 11211 | 86805 | 6049 | 5502 | 7031 | 5577 | 0 |
| 11214 | 86639 | 5843 | 5491 | 4116 | 5567 | 0 |
| 11234 | 85567 | 5208 | 5423 | 4065 | 5498 | 0 |
| 11377 | 84635 | 7242 | 5512 | 6855 | 5438 | 0 |
Fig. 15
Exposure Rates-New York.
Fig. 16
Exposure Rates-Baltimore.
Fair-diverse allocation
To evaluate the proposed Fair-Diverse model, we construct Diverse-only, Fair-only, and equalized-importance (alpha = 0.5) models, and compare their allocation solutions as well as the resulting fairness and diversity gaps. In our terminology, Diverse-only corresponds to an allocation that is based solely on the diversity constraint, Equation (1), and does not consider any other (e.g., fairness) measures. Similarly, Fair-only corresponds to a model based solely on the fairness constraint. Furthermore, alpha = 0.5 refers to a model that places equal weights on the fairness and diversity constraints in the optimization setting.

As mentioned in § 3, we formulate the diversity and fairness trade-off problem as P2. As long as the resource constraint is satisfied, P2 has an optimal solution for a given α value. However, the optimal solution obtained from P2 might not be feasible with respect to the diversity and fairness requirements, as mentioned in § 3.1. Hence, we apply our proposed binary search algorithm to discover a range of α that results in a feasible solution.

To evaluate the performance of Algorithm 2, we consider three US cities and different sensitive attributes (Race, Age, and Gender) and solve each instance problem separately. We then show the allocation results for each problem using the four models mentioned above and discuss the α ranges in detail. The total number of zip-code areas varies by city, with 177 regions in New York City, 58 in Chicago, and 36 in Baltimore. The resulting allocations for each problem are sorted by population size and reported for the top 15 areas (zip-codes).

Before discussing the results, we report the computation time of our proposed algorithms for the different test cases. For the City of Chicago, the total computation time of the optimization of P2 is 2.2 s, and the overall computation time of the binary search is 10.8 s. Similarly, for NYC, the total computation time of the optimization of P2 is 6.1 s, and the total computation time of the binary search is 76.4 s. For the City of Baltimore, the total computation time of the optimization of P2 is 2.1 s, and the total computation time of the binary search is 4.3 s.

Note that the computation time of the optimization of P2 grows with the dataset size (number of zip-codes). Accordingly, NYC and the City of Baltimore have the highest and lowest times, respectively. Moreover, the computation time of the binary search grows with the computation time of the optimization of P2, since P2 is solved in each iteration. The number of binary search iterations also affects the overall computation time of the algorithm. Therefore, modifying the threshold hyperparameters (the diversity and fairness thresholds, or τ) would affect the reported times. Overall, our proposed algorithms are computationally efficient at the city and state levels. At the national or worldwide level, we expect them to remain practical, although an efficient parallel implementation would be required to reduce the computation time further.
City of Chicago
We consider b = 200000 units of vaccines as the total available resources to be allocated for the city of Chicago. Using the Pop. and Chicago-COVID-Zipcode datasets, we evaluate the allocation results of our proposed resource allocation framework at the zip-code level in the city of Chicago.

The first instance problem considers Race as the sensitive attribute, and the associated Fair-Diverse model captures inequalities across different racial subgroups. The results for the top 15 populated areas are reported in Table 1. The baseline threshold values are derived from the alpha = 0.5 model and are equal to 0.24 and 0.007. Note that we do not use the tuning algorithm for this alpha = 0.5 model. For the Fair-Diverse model, both thresholds are set to 0.025 to decrease the fairness gap compared to the baseline value. We obtained a tuned range of [0.54, 0.86] for α in this case (midpoint = 0.70). Looking at Table 1, the area associated with zip-code “60639”, for instance, receives fewer vaccines under the Diverse-only model but more under both the alpha = 0.5 and Fair-Diverse (α = 0.70) models because it has a higher total exposed population. In contrast, the Fair-only model closes the fairness gap to the full extent (fairness threshold of 0) and, as a result, obtains an extreme allocation solution in which only a few areas (zip-codes) receive vaccines. Such an allocation is clearly undesirable when both fairness and diversity are required. Since the table lists only the top 15 populated areas, it does not show all areas with a positive allocation under the Fair-only model.
Table 1
Resource Allocation (Racial groups): Top 15 populated areas- Chicago.

| Zipcode | Total population | Exposed Population | alpha = 0.5 | alpha tuned (0.70) | Diverse-only | Fair-only |
|---|---|---|---|---|---|---|
| 60629 | 113046 | 5802 | 10329 | 12125 | 9497 | 0 |
| 60618 | 91351 | 3652 | 7002 | 9798 | 7675 | 0 |
| 60623 | 91159 | 4815 | 8330 | 9777 | 7659 | 0 |
| 60639 | 89452 | 5145 | 8174 | 9594 | 7515 | 70901 |
| 60647 | 86586 | 3824 | 7912 | 9287 | 7274 | 0 |
| 60617 | 83590 | 3782 | 7638 | 8966 | 7023 | 0 |
| 60608 | 81930 | 3909 | 6280 | 8788 | 6883 | 0 |
| 60625 | 78085 | 2945 | 5986 | 8031 | 6560 | 0 |
| 60634 | 73894 | 2491 | 5664 | 4491 | 6208 | 0 |
| 60620 | 72094 | 2744 | 6587 | 7733 | 6057 | 0 |
| 60641 | 70992 | 2973 | 6455 | 7614 | 5964 | 0 |
| 60614 | 66485 | 1602 | 5096 | 4040 | 5586 | 0 |
| 60657 | 65841 | 1591 | 5047 | 4001 | 5532 | 0 |
| 60640 | 65412 | 2047 | 5014 | 3975 | 5496 | 0 |
| 60609 | 64420 | 3157 | 5886 | 6909 | 5412 | 0 |
The second instance problem considers Age as the sensitive attribute, and the associated Fair-Diverse model captures inequalities across different age groups. The results for the top 15 populated areas are reported in Table 2. The baseline threshold values are derived from the alpha = 0.5 problem and are equal to 0.012 and 0. Next, we run the binary search algorithm to tune the α value and find a feasible solution for the Fair-Diverse model. In this case, both thresholds are set to 0.003. We obtained a tuned range of [0.57, 1] for α in this case (midpoint = 0.78). As mentioned previously, the Diverse-only model assigns vaccines to areas based only on the total population, and the Fair-only model obtains an extreme allocation solution in which only a few areas (zip-codes) receive vaccines. Therefore, neither of these models is capable of delivering a fair and diverse allocation solution. Note that in this instance problem, the alpha = 0.5 model does no better than the Diverse-only model, since the weight on the fairness component is not large enough to change the results (the diversity gap is dominant). This result further demonstrates the necessity of the proposed tuning algorithm.
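For intuition about the Diverse-only baseline, the sketch below allocates a budget in proportion to total population. It is an illustration under stated assumptions and not the paper's optimization model: the function name `proportional_allocation`, the toy zip-code labels and populations, and the largest-remainder rounding are all choices made here, and the source does not specify how fractional solutions are rounded. The proportional reading is suggested by Table 1, where the Diverse-only column is roughly 8.4% of each zip-code's total population.

```python
def proportional_allocation(population_by_zip, budget):
    """Split an integer budget across areas in proportion to population.

    Fractional shares are floored, and leftover units go to the areas
    with the largest fractional remainders (an assumed rounding rule).
    """
    total = sum(population_by_zip.values())
    shares = {z: budget * p / total for z, p in population_by_zip.items()}
    alloc = {z: int(s) for z, s in shares.items()}       # floor each share
    leftover = budget - sum(alloc.values())
    # Rank areas by how much was lost when flooring their share.
    by_remainder = sorted(shares, key=lambda z: shares[z] - alloc[z], reverse=True)
    for z in by_remainder[:leftover]:
        alloc[z] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical populations for three made-up areas.
    toy_population = {"A": 500, "B": 300, "C": 200}
    print(proportional_allocation(toy_population, budget=100))
```

Because this rule looks only at population size, two areas with equal populations but very different exposed populations receive the same allocation, which is the limitation the fairness constraint is meant to address.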
Table 2
Resource Allocation (Age groups): Top 15 populated areas-Chicago.

| Zipcode | Total population | Exposed Population | alpha = 0.5 | alpha tuned (0.78) | Diverse-only | Fair-only |
|---|---|---|---|---|---|---|
| 60629 | 113046 | 6367 | 9204 | 8989 | 9204 | 0 |
| 60618 | 91351 | 5478 | 7672 | 7492 | 7672 | 0 |
| 60623 | 91159 | 5084 | 7422 | 7249 | 7422 | 70300 |
| 60639 | 89452 | 5127 | 7367 | 7195 | 7367 | 0 |
| 60647 | 86586 | 5121 | 7312 | 7141 | 7312 | 0 |
| 60617 | 83590 | 5007 | 6931 | 7093 | 6931 | 0 |
| 60608 | 81930 | 4872 | 6904 | 6743 | 6904 | 0 |
| 60625 | 78085 | 4698 | 6569 | 6415 | 6569 | 0 |
| 60634 | 73894 | 4610 | 6219 | 6365 | 6219 | 0 |
| 60620 | 72094 | 4404 | 6009 | 6150 | 6009 | 0 |
| 60641 | 70992 | 4302 | 5955 | 5816 | 5955 | 0 |
| 60614 | 66485 | 4028 | 5735 | 5604 | 5735 | 0 |
| 60657 | 65841 | 4034 | 5730 | 5864 | 5730 | 0 |
| 60640 | 65412 | 4200 | 5644 | 5776 | 5644 | 0 |
| 60609 | 64420 | 3651 | 5264 | 5141 | 5264 | 0 |
Finally, a notable instance occurs when we consider Gender as the sensitive attribute and the Fair-Diverse model attempts to eliminate the unfairness between male and female subgroups. The results for the top 15 populated areas are reported in Table 3. The baseline threshold values are derived from the alpha = 0.5 problem and are both close to zero (6.04e-05 and 0). This is because the gender population distribution is similar across different regions. Consequently, the fairness and diversity requirements can be satisfied even with the Diverse-only model. We can still run the binary search algorithm to tune the α value and find a feasible solution for the Fair-Diverse model by closing the fairness gap further. To do this, the two thresholds in the Fair-Diverse model are set to 0 and 0.1. We obtained a tuned range of [0.92, 1] for α in this case (midpoint = 0.96). Looking at Table 3 and the exposure rates (shown in Fig. 4), we notice that, in this specific instance problem, the exposed population size is aligned with the total population size in different areas. In other words, highly populated areas tend to have larger exposed populations as well. Therefore, the resulting allocations from the Diverse-only, alpha = 0.5, and Fair-Diverse models are very close and even equal in some cases.
Table 3
Resource Allocation (Gender groups): Top 15 populated areas-Chicago.

| Zipcode | Total population | Exposed Population | alpha = 0.5 | alpha tuned (0.96) | Diverse-only | Fair-only |
|---|---|---|---|---|---|---|
| 60629 | 113046 | 5802 | 9520 | 9529 | 9520 | 0 |
| 60618 | 91351 | 3651 | 7695 | 7703 | 7695 | 0 |
| 60623 | 91159 | 4815 | 7697 | 7705 | 7697 | 0 |
| 60639 | 89452 | 5145 | 7555 | 7563 | 7555 | 0 |
| 60647 | 86586 | 3824 | 7295 | 7302 | 7295 | 0 |
| 60617 | 83590 | 3782 | 7033 | 7026 | 7033 | 0 |
| 60608 | 81930 | 3909 | 6914 | 6921 | 6914 | 79366 |
| 60625 | 78085 | 2944 | 6573 | 6579 | 6573 | 0 |
| 60634 | 73894 | 2490 | 6209 | 6203 | 6209 | 0 |
| 60620 | 72094 | 2743 | 6035 | 6029 | 6035 | 0 |
| 60641 | 70992 | 2972 | 5989 | 5995 | 5989 | 0 |
| 60614 | 66485 | 1602 | 5567 | 5562 | 5567 | 0 |
| 60657 | 65841 | 1591 | 5515 | 5521 | 5515 | 0 |
| 60640 | 65412 | 2047 | 5498 | 5504 | 5498 | 0 |
| 60609 | 64420 | 3156 | 5424 | 5430 | 5424 | 0 |
Fig. 5 and Fig. 6 present the results for the top 15 populated regions in Chicago for the racial and age instance problems. Note that in both instance problems, the total population equals the summation of all associated groups (e.g., Age groups) due to having unknown labels in the data. In Fig. 5, for example, the “60614” and “60634” regions receive fewer vaccines under both the Fair-Diverse and alpha = 0.5 models because they have lower exposed populations (Table 1). The Diverse-only model does not consider the exposed population; therefore, its allocations are higher for these regions. Moreover, the allocation obtained from the tuned range of α is significantly different from the allocation obtained with the alpha = 0.5 model, since the latter does not satisfy the fairness requirement.

Moving to another instance problem, Fig. 6 shows the allocation results for the age instance problem. Based on the plot, the alpha = 0.5 and Diverse-only allocation solutions overlap. This can be explained by the fact that a 50% emphasis on fairness is not sufficient to close the disparities among age subgroups; a higher α value is required, as obtained through the tuning algorithm. Indeed, the tuned α value in this case is 0.78, which is substantially higher than 0.5. For example, the “60614” and “60609” regions receive fewer vaccines under the Fair-Diverse model with α tuning since they have relatively lower exposed populations (Table 2). For more interactive visualization tools, please see our web application on the Chicago datasets, built with R Shiny (footnote 6).
Fig. 5
Resource Allocation (Racial groups): Top 15 populated areas-Chicago.
Fig. 6
Resource Allocation (Age groups): Top 15 populated areas-Chicago.
New York City (NYC)
Utilizing the Pop. and NYC-COVID-Zipcode datasets, we assess our proposed resource allocation framework at the zip-code level for New York City. In this problem, we set the total number of available vaccines to b = 500000, since NYC has a larger population than Chicago. Similar to the Chicago case study, the first instance problem for NYC considers Race, and the second instance considers Age as the sensitive attribute. We omit the description of the Gender instance problem in this section (please see the appendix for the results).

The first instance problem considers Race to capture inequalities across different racial subgroups. The results for the top 15 populated areas in NYC are reported in Table 4. The baseline threshold values derived from the alpha = 0.5 model are 0.10 and 0.0007, respectively. For the Fair-Diverse model, both thresholds are set to 0.017 to decrease the fairness gap compared to the baseline value. The tuned α range is between 0.66 and 0.69 (midpoint = 0.67). Looking at Table 4, the area associated with zip-code “10467” receives fewer vaccines under the Diverse-only model but more under both the alpha = 0.5 and Fair-Diverse (α = 0.67) models because it has a higher total exposed population and higher risk. In contrast, the Fair-only model closes the fairness gap to the full extent (fairness threshold of 0) and, as a result, obtains an extreme allocation solution in which only a few areas (zip-codes) receive vaccines. As shown in Table 4, none of the top 15 populated areas receives vaccines under this extreme condition. Consequently, this is not a practical allocation solution.

The second instance problem considers Age to capture inequalities across different age groups. The results for the top 15 populated areas are reported in Table 5. The baseline threshold values are derived from the alpha = 0.5 problem and are equal to 0.008 and 0, while the α value range is between zero and one. Next, we run the binary search algorithm to tune the α value and find a feasible solution for the Fair-Diverse model with both thresholds set to 0.003. We obtained a tuned range of [0.68, 1] for α in this case (midpoint = 0.84). As mentioned previously, the Diverse-only model assigns vaccines to areas based solely on the total population, and the Fair-only model obtains an extreme allocation solution in which only a few areas (zip-codes) receive vaccines. Therefore, these models are not capable of delivering a fair and diverse allocation solution. Note that in this instance problem, the alpha = 0.5 model does no better than the Diverse-only model, since the weight on the fairness component is not adequate, which further demonstrates the necessity of the proposed tuning approach.
Table 5
Resource Allocation (Age groups): Top 15 populated areas- NYC.

| Zipcode | Total Population | Exposed Population | alpha = 0.5 | alpha_tuned | Diverse-only | Fair-Only |
|---|---|---|---|---|---|---|
| 11368 | 108904 | 10192 | 6786 | 6550 | 6786 | 0 |
| 11226 | 100574 | 9711 | 6267 | 6049 | 6267 | 0 |
| 11373 | 99772 | 9835 | 6217 | 6433 | 6217 | 0 |
| 11220 | 98531 | 9403 | 6140 | 5926 | 6140 | 0 |
| 11385 | 97405 | 9466 | 6069 | 5859 | 6069 | 0 |
| 10467 | 95723 | 9076 | 5965 | 5757 | 5965 | 0 |
| 11208 | 93812 | 8674 | 5846 | 5643 | 5846 | 0 |
| 11236 | 92721 | 9020 | 5778 | 5577 | 5778 | 0 |
| 11207 | 92591 | 8633 | 5769 | 5569 | 5769 | 0 |
| 10025 | 92284 | 9549 | 5750 | 5950 | 5750 | 0 |
| 11219 | 90004 | 8005 | 5608 | 5413 | 5608 | 75932 |
| 11211 | 88978 | 8286 | 5544 | 5352 | 5544 | 0 |
| 11377 | 88572 | 8808 | 5519 | 5711 | 5519 | 0 |
| 11214 | 86223 | 8725 | 5373 | 5559 | 5373 | 0 |
| 11234 | 86012 | 8536 | 5360 | 5356 | 5360 | 0 |
Fig. 7 and Fig. 8 present the results for the top 15 populated regions in New York City for the racial and age instance problems. Note that in both instance problems, the total population equals the summation of all associated groups (e.g., Age groups) due to having unknown labels in the data. In Fig. 7, the “11236” and “11219” zip-codes receive fewer vaccines under both the Fair-Diverse and alpha = 0.5 models because they have lower exposed populations (Table 4). The Diverse-only model does not consider the exposed population; therefore, its allocations are higher for these regions. Moreover, the allocation obtained from the tuned range of α is significantly different from the allocation obtained with the alpha = 0.5 model, since the latter does not satisfy the fairness requirement.
Fig. 7
Resource Allocation (Racial groups): Top 15 populated areas-NYC.
Fig. 8
Resource Allocation (Age groups): Top 15 populated areas-NYC.
Moving to another instance problem, Fig. 8 shows the allocation results for the age instance problem. Based on the plot, the alpha = 0.5 and Diverse-only allocation solutions overlap. This can be explained by the fact that a 50% emphasis on fairness is not sufficient to close the disparities among age subgroups; a higher α value is required, as obtained through the tuning algorithm. Indeed, the tuned α value in this case is 0.84, which is substantially higher than 0.5. For example, the “11226” and “11211” regions receive fewer vaccines under the Fair-Diverse model than under the Diverse-only or alpha = 0.5 models since they have relatively lower exposed populations (Table 5).
Baltimore City
Utilizing the Pop. and BLT-COVID-Zipcode datasets, we empirically test our proposed resource allocation framework at the zip-code level. In this problem, we set the total number of available vaccines to b = 100000, since Baltimore has a smaller population than Chicago and New York City. As in the other case studies, the first instance problem considers Race and the second instance considers Age as the sensitive attribute, and we omit the description of the Gender instance problem (please see the appendix for the results).

The Race instance problem captures the inequalities across different racial subgroups. The results for the top 15 populated areas in Baltimore are reported in Table 6. The baseline threshold values derived from the alpha = 0.5 model are 0.046 and 0.016, respectively. For the Fair-Diverse model, both thresholds are set to 0.025 to decrease the fairness gap compared to the baseline value. The tuned α range is between 0.57 and 1 (midpoint = 0.78). Looking at Table 6, the area associated with zip-code “21230” receives more vaccines under the Diverse-only model but fewer under both the alpha = 0.5 and Fair-Diverse (α = 0.78) models because it has a lower total exposed population and a lower risk level. On the other hand, the Fair-only model closes the fairness gap (fairness threshold of 0) and obtains an extreme allocation solution in which only a few areas (zip-codes) receive vaccines. As shown in Table 6, only four regions among the top 15 populated areas receive vaccines under this extreme condition. Consequently, this is not a practical allocation solution.
Table 6
Resource Allocation (Racial groups): Top 15 populated areas- Baltimore.

| Zipcode | Total Population | Exposed Population | alpha = 0.5 | alpha_tuned | Diverse-only | Fair-Only |
|---|---|---|---|---|---|---|
| 21215 | 60161 | 4893 | 10534 | 11000 | 9572 | 0 |
| 21206 | 50846 | 3944 | 8903 | 7118 | 8090 | 0 |
| 21218 | 49796 | 3739 | 7128 | 6742 | 7923 | 0 |
| 21224 | 49134 | 3782 | 7033 | 8163 | 7818 | 14868 |
| 21229 | 45213 | 3555 | 7097 | 8214 | 7194 | 12266 |
| 21217 | 37111 | 3050 | 6498 | 6785 | 5905 | 0 |
| 21230 | 33568 | 2188 | 4805 | 4545 | 5341 | 0 |
| 21213 | 32733 | 2722 | 5731 | 5985 | 5208 | 0 |
| 21212 | 32322 | 2176 | 4626 | 4376 | 5143 | 0 |
| 21216 | 32071 | 2725 | 5615 | 5864 | 5103 | 49078 |
| 21239 | 28793 | 2303 | 5041 | 5265 | 4581 | 0 |
| 21209 | 26465 | 1494 | 3788 | 3583 | 4211 | 11242 |
| 21223 | 26366 | 2143 | 4616 | 4821 | 4195 | 0 |
| 21202 | 22832 | 1724 | 3268 | 3091 | 3633 | 0 |
| 21214 | 20564 | 1510 | 3277 | 2784 | 3272 | 0 |
The Age instance problem captures inequalities across different age groups. The results for the top 15 populated areas are reported in Table 7. The baseline threshold values are derived from the alpha = 0.5 problem and are equal to 0.014 and 0, while the α value range is between zero and one. Next, we run the binary search algorithm to tune the α value and find a feasible solution for the Fair-Diverse model with both thresholds set to 0.007. We obtained a tuned range of [0.76, 1] for α in this case (midpoint = 0.88). As mentioned previously, the Diverse-only model assigns vaccines to areas based solely on the total population, and the Fair-only model obtains an extreme allocation solution in which only four areas (zip-codes) receive vaccines. Therefore, these models are not capable of delivering a fair and diverse allocation solution. Note that in this instance problem, the alpha = 0.5 model does no better than the Diverse-only model, since the weight on the fairness component is not adequate, which further demonstrates the necessity of the proposed tuning approach.
Table 7
Resource Allocation (Age groups): Top 15 populated areas- Baltimore.

| Zipcode | Total Population | Exposed Population | alpha = 0.5 | alpha_tuned | Diverse-only | Fair-Only |
|---|---|---|---|---|---|---|
| 21215 | 60161 | 4943 | 9572 | 9177 | 9572 | 13765 |
| 21206 | 50846 | 4211 | 8090 | 7978 | 8090 | 0 |
| 21218 | 49796 | 4213 | 7923 | 8251 | 7923 | 0 |
| 21224 | 49134 | 4285 | 7818 | 8141 | 7818 | 0 |
| 21229 | 45213 | 3762 | 7194 | 7126 | 7194 | 0 |
| 21217 | 37111 | 3040 | 5905 | 5661 | 5905 | 0 |
| 21230 | 33568 | 2961 | 5341 | 5562 | 5341 | 0 |
| 21213 | 32733 | 2661 | 5208 | 4993 | 5208 | 7954 |
| 21212 | 32322 | 2682 | 5143 | 4934 | 5143 | 0 |
| 21216 | 32071 | 2614 | 5103 | 4892 | 5103 | 0 |
| 21239 | 28793 | 2416 | 4581 | 4771 | 4581 | 0 |
| 21209 | 26465 | 2230 | 4211 | 4385 | 4211 | 2268 |
| 21223 | 26366 | 2150 | 4195 | 4022 | 4195 | 0 |
| 21202 | 22832 | 2048 | 3633 | 3783 | 3633 | 0 |
| 21214 | 20564 | 1727 | 3272 | 3386 | 3272 | 26707 |
Fig. 9 and Fig. 10 show the results for the top 15 populated regions in Baltimore for the racial and age instance problems. Note that in both instance problems, the total population equals the summation of all associated groups (e.g., Age groups) due to having unknown labels in the data. In Fig. 9, some zip-codes (for example “21230”, discussed above) receive fewer vaccines under both the Fair-Diverse and alpha = 0.5 models because they have lower exposed populations (Table 6). The Diverse-only model does not consider the exposed population; therefore, its allocations are higher for these regions. Moreover, the allocation obtained from the tuned range of α is significantly different from the allocation obtained with the alpha = 0.5 model, since the latter does not satisfy the fairness requirement.
Fig. 9
Resource Allocation (Racial groups): top 15 populated areas-Baltimore.
Fig. 10
Resource Allocation (Age groups): top 15 populated areas-Baltimore.
Moving to the next instance problem, Fig. 10 represents the allocation results for the age instance problem. The plot shows that the alpha = 0.5 and Diverse-only allocation solutions overlap. This is because a 50% emphasis on fairness is not sufficient to close the disparities across age subgroups; a higher α value is required, as obtained through the tuning algorithm. Indeed, the tuned α value in this case is 0.88, which is substantially higher than 0.5. For example, the "21217" and "21223" regions receive fewer vaccines under the Fair-Diverse model than under the Diverse-only or alpha = 0.5 models, since they have a relatively lower exposed population (Table 7).
As we discussed in § 3.1, the allocation solution obtained from P2 does not necessarily satisfy the fairness and diversity requirements for an arbitrary α value. To demonstrate the performance of the tuning algorithm, which always returns a range for α under which the optimal solution of P2 is feasible, we now study the impact of the fairness and diversity requirements (the two ε thresholds) on the optimal solution under different models using the city of Chicago case study.
The plots in Fig. 11 are based on the racial instance problem. These figures reveal that under any fairness and diversity requirements (given values of the two ε thresholds), the α tuning algorithm returns a feasible solution for the Fair-Diverse model. This is not the case for the other models (Fair-only, Diverse-only, and alpha = 0.5), as can be observed from Fig. 11(a) and (b). In other words, the optimal solutions obtained from the Diverse-only and alpha = 0.5 models do not satisfy the fairness requirement (ε ≤ 0.03 and ε ≤ 0.2), and the solution obtained from the Fair-only model fails to satisfy the diversity requirement (ε ≤ 0.3) in Fig. 11(a) and (b). However, if we relax the fairness requirement to ε ≤ 0.3, as in Fig. 11(c), the alpha = 0.5 model can indeed achieve a feasible solution. It is worth mentioning that the diversity requirement is easier to achieve than the fairness requirement because of the larger inherent disparities in the exposed population.
Fig. 11
Impact of the ε and ε on the diversity-fairness trade-off.
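To make the idea of a feasible α range concrete, the following is a minimal sketch of a bisection search. It is not the paper's Algorithm 2. It assumes that the fairness gap is non-increasing in α and the diversity gap is non-decreasing in α, and it uses hypothetical closed-form gap functions in place of solving P2. The names `eps_fair` and `eps_div` stand in for the two ε thresholds.

```python
from typing import Callable, Optional, Tuple

Gap = Callable[[float], float]


def smallest_alpha(ok: Callable[[float], bool], tol: float) -> Optional[float]:
    """Smallest alpha in [0, 1] with ok(alpha) True, assuming ok switches False -> True once."""
    if ok(0.0):
        return 0.0
    if not ok(1.0):
        return None
    lo, hi = 0.0, 1.0  # invariant: ok(lo) is False, ok(hi) is True
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if ok(mid):
            hi = mid
        else:
            lo = mid
    return hi


def largest_alpha(ok: Callable[[float], bool], tol: float) -> Optional[float]:
    """Largest alpha in [0, 1] with ok(alpha) True, assuming ok switches True -> False once."""
    if ok(1.0):
        return 1.0
    if not ok(0.0):
        return None
    lo, hi = 0.0, 1.0  # invariant: ok(lo) is True, ok(hi) is False
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if ok(mid):
            lo = mid
        else:
            hi = mid
    return lo


def tune_alpha_range(fairness_gap: Gap, diversity_gap: Gap,
                     eps_fair: float, eps_div: float,
                     tol: float = 1e-6) -> Optional[Tuple[float, float]]:
    """Return (alpha_low, alpha_high) where both requirements hold, or None if no such range exists."""
    low = smallest_alpha(lambda a: fairness_gap(a) <= eps_fair, tol)
    high = largest_alpha(lambda a: diversity_gap(a) <= eps_div, tol)
    if low is None or high is None or low > high:
        return None
    return low, high


# Hypothetical gap curves for illustration only; in practice each evaluation
# would solve P2 for the given alpha and measure the resulting gaps.
def toy_fairness_gap(alpha: float) -> float:
    return 0.1 * (1.0 - alpha)


def toy_diversity_gap(alpha: float) -> float:
    return 0.05 * alpha


alpha_range = tune_alpha_range(toy_fairness_gap, toy_diversity_gap,
                               eps_fair=0.03, eps_div=0.04)
if alpha_range is not None:
    midpoint = sum(alpha_range) / 2
    print("feasible alpha range:", alpha_range, "midpoint:", midpoint)
```

Reporting the midpoint of the returned range mirrors how the tuned α values are summarized in the text. Tightening either threshold shrinks the range, and sufficiently strict thresholds make it empty.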
In this section, we compare the fairness and diversity gaps under different models and population subgroups to discuss the price of fairness (PoF). The results are based on the aforementioned racial group instance problem for the city of Chicago, under the tuned α value (0.71).
First, Fig. 12 shows that the Fair-Diverse model reduces both the fairness and diversity gaps, whereas the Diverse-only and Fair-only models each address only one of them. Although the Diverse-only model eliminates the diversity gap, it fails to decrease the fairness gap. Similarly, the Fair-only model eliminates the fairness gap, but it fails to decrease the diversity gap. Moreover, Fig. 13 shows a considerable reduction in the gaps across different population subgroups (racial groups) under the Fair-Diverse model. The comparison with the Diverse-only and uniform allocation solutions reveals the necessity of the Fair-Diverse model in closing the gaps and thereby reducing the disparities across population subgroups.
Fig. 12
Diversity and Fairness gaps in different models.
Fig. 13
Fairness gaps across different population subgroups.
Lastly, Fig. 14 illustrates the trade-off between the fairness and diversity gaps for different α values. Note that each α value represents the midpoint of the feasible range. Increasing α, which is the weight on the fairness component in P2, decreases the fairness gap as expected. However, the diversity gap increases because of the fairness-diversity trade-off.
Fig. 14
Fairness and Diversity gaps based on different α values.
Finally, as discussed in § 3.3, the Price of Fairness (PoF) can be evaluated by comparing the fairness gap of the optimal solution of P2 obtained with and without fairness constraints. PoF can be defined as the ratio of the fairness gap under the Diverse-only allocation to the fairness gap under the Fair-Diverse allocation. If an allocation solution decreases the fairness gap more than the Fair-Diverse model, the PoF is less than one. Otherwise, the PoF is greater than 1, and we need to balance the fairness and diversity objectives to decide which allocation to choose. In the case of the racial instance problem with α = 0.71, the PoF is significantly larger than 1.
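The ratio described above can be written as a short helper. This is a minimal sketch: the function name and the gap values are hypothetical and are not results from the case studies.

```python
def price_of_fairness(gap_baseline: float, gap_fair_diverse: float) -> float:
    """Ratio of a baseline allocation's fairness gap to the Fair-Diverse fairness gap."""
    if gap_fair_diverse <= 0:
        raise ValueError("the Fair-Diverse fairness gap must be positive to form a ratio")
    return gap_baseline / gap_fair_diverse


# Hypothetical fairness gaps, for illustration only.
pof_diverse_only = price_of_fairness(gap_baseline=0.12, gap_fair_diverse=0.03)
pof_other = price_of_fairness(gap_baseline=0.02, gap_fair_diverse=0.04)

# A value above 1 means the baseline leaves a larger fairness gap than Fair-Diverse;
# a value below 1 means the baseline closes the fairness gap further.
print(pof_diverse_only, pof_other)
```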
Discussion
In this section, we discuss the scope and limitations of the proposed fair-diverse resource allocation method, which is introduced in §2. Through the three case studies in §4.1, we can see that the proposed method applies to the early stage of medical resource allocation, when the resource is still considered to be scarce. As long as the required data are available and accurate, the proposed allocation scheme can be applied. We point out some limitations on which the proposed approach can be further improved.
First, the exposure rate estimation can be improved. Recall that in §2 the COVID-19 exposure rate for each intersectional group is denoted by P(e|g). However, the primary challenge in estimating COVID-19 exposure rates for intersectional subgroups is the lack of intersectional population data in each Zipcode, denoted by S in §2. If the data were available in the required format, one could use a Poisson regression model to obtain the intersectional exposure rates.
Second, the proposed method does not consider the profession of residents in prioritizing the sub-groups. In the U.S., during the early vaccination process, priority was given to sub-groups based on age, underlying health conditions, and profession. Front-line workers such as health care workers, grocery employees, and K-12 educators, whose work is essential to normal social functions, were among the first one or two batches of vaccine receivers. In our work, we have estimated the COVID-19 exposure rate from the daily infected cases. This indirectly accounts for the residents' profession, since front-line workers are expected to have a larger chance of being exposed to the virus. But we are not able to directly include the profession of residents as a characteristic that defines the population sub-groups, mainly because of the lack of data on job descriptions of residents (or their percentages) in each Zipcode area. We plan to work with city officials or non-profit organizations that have more detailed population data to further improve our proposed method.
Furthermore, estimating the intersectional population size may not be an appropriate implementation approach. Because of the data limitation mentioned above, we solved the allocation problem for age, race, and gender as the sensitive attributes separately in §4.4. However, our model is able to return a fair-diverse allocation using the intersectional subgroups, provided that the real intersectional population is available for each area. Applying the same framework to intersectional data could be a future direction of this work.
Another future direction with respect to COVID-19 exposure rates is to incorporate a learning approach to obtain the rates. Several successful time-series techniques, such as traditional ARIMA models or more novel approaches such as RNN-LSTM, can be applied to the historical data. Time-series analysis would enable one to predict more precise exposure rates for future time windows.
Moreover, we only considered a single treatment, namely vaccination. However, the proposed approach can be generalized to incorporate multiple treatments. Our proposed fair-diverse allocation can be utilized at different stages of a pandemic using the updated exposure ratio.
In this paper, we mainly focused on resource allocation to centers and facilities. Although designing an allocation policy at the granular level of individuals within each center or facility is also critical, high-level policies have been shown to be more effective and feasible for deployment than individual-level ones. There are multiple barriers to the individual-level deployment of the guidelines. For example, vaccine hesitancy [22,57,58] is a widespread problem, and enforcing vaccination could violate social values such as freedom of choice (FoC) [21,52]. Moreover, designing strategies at the individual level requires an extensive amount of data and medical considerations, as vaccines (e.g., COVID-19 vaccines) can cause severe allergic reactions, and people with pre-existing allergies should avoid taking them [19,29,50].
Last but not least, the applications of our proposed framework go beyond allocation policy design. The solutions obtained for the allocation problem could help identify vulnerable regions where vaccination should be incentivized and advertisement programs targeted. This will help to maximize the vaccination rate within higher-risk communities in the future.
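As an illustration of the Poisson regression mentioned in this section for intersectional exposure rates, the following is a minimal sketch using only the Python standard library. It assumes that intersectional case counts and population sizes are available, which the text notes is not currently the case, and it assumes a main-effects log-linear model with a log-population offset. All data, group labels, and function names below are hypothetical.

```python
import math
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_poisson(design: Sequence[Sequence[float]], cases: Sequence[float],
                population: Sequence[float], iters: int = 50,
                tol: float = 1e-10) -> List[float]:
    """Newton-Raphson fit of log E[cases] = log(population) + design @ beta."""
    p = len(design[0])
    beta = [0.0] * p
    # Assumes column 0 of the design matrix is the intercept.
    beta[0] = math.log(sum(cases) / sum(population))
    for _ in range(iters):
        mu = [n * math.exp(sum(x * b for x, b in zip(row, beta)))
              for row, n in zip(design, population)]
        grad = [sum(row[j] * (y - m) for row, y, m in zip(design, cases, mu))
                for j in range(p)]
        hess = [[sum(row[j] * row[k] * m for row, m in zip(design, mu))
                 for k in range(p)] for j in range(p)]
        step = solve_linear(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


# Hypothetical data for four intersectional groups (two age levels x two other levels).
# Design columns: intercept, indicator for age level 2, indicator for second attribute level 2.
design = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]]
population = [1000, 800, 600, 400]
cases = [50, 60, 60, 60]

beta = fit_poisson(design, cases, population)
fitted_rates = [math.exp(sum(x * b for x, b in zip(row, beta))) for row in design]
print([round(r, 4) for r in fitted_rates])
```

The fitted rates play the role of the exposure rates for each intersectional group. With real data, interaction terms or Zipcode-level covariates could be added as extra design columns.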
Conclusion
In this paper, we propose the idea of fairness in scarce resource allocation problems (e.g., vaccine distribution) in terms of disparities across various population subgroups in different regions. To do so, we consider diversity and fairness components to design a Fair-Diverse allocation. We first formulate a general multi-objective (MO) problem P1, and propose the weighted sum method with the LP relaxation to simplify it to P2. We then solve the LP-relaxed problem, P2, based on the fairness and diversity requirements as described in § 3.1. For this purpose, we propose a binary search approach, Algorithm 2, to find an optimal range for the trade-off parameter α in P2 such that the obtained solution is feasible.
Moreover, we have empirically analyzed our proposed methodologies in §4 using COVID-19 datasets from three major and segregated US cities (New York City, Chicago, and Baltimore). We designed three instance problems for each city based on different demographic attributes: race, age, and gender. We then implemented our fair-diverse model and compared it with other models (Diverse-only, Fair-only, and α = 0.5) to investigate the fairness criteria in the solution and highlight the necessity behind our approach. We have also discussed the fairness and diversity requirements (the two ε thresholds) and compared the Fair-Diverse allocation with other allocation solutions under different thresholds. Lastly, we have examined the price of fairness based on the associated gaps across different models.
In brief, our empirical results reveal the paramount role of fairness criteria in decision-making problems involving scarce resource allocation (e.g., vaccine allocation). Because certain minorities and population groups are more vulnerable to the COVID-19 virus, a Diverse-only (population-based) vaccine allocation can lead to higher fatality rates by neglecting the vulnerability of various population subgroups. We need to ensure a fair and diverse vaccine allocation to induce lower mortality rates across different regions. That is, we aim to find a suitable balance between the diversity and fairness measures across different geographical areas and allocate the resources accordingly.
Author statement
All the authors contributed equally to the methodology development, draft preparation, implementation, and validation of our proposal.