Decomposing socioeconomic inequality for binary health outcomes: an improved estimation that does not vary by choice of reference group.

Vasoontara Yiengprugsawan, Lynette Ly Lim, Gordon A Carmichael, Keith Bg Dear, Adrian C Sleigh.

Abstract

BACKGROUND: Decomposition of concentration indices yields useful information regarding the relative importance of various determinants of inequitable health outcomes. But the two estimation approaches to decomposition in current use are not suitable for binary outcomes.
FINDINGS: The paper compares three estimation approaches for decomposition of inequality concentration indices: Ordinary Least Squares (OLS), probit, and the Generalized Linear Model (GLM) with binomial distribution and identity link. Data are from the Thai Health and Welfare Survey 2003. The OLS estimates do not take into account the binary nature of the outcome, and the probit estimates depend on the choice of reference groups, whereas the GLM binomial identity approach has neither of these problems.
CONCLUSIONS: The GLM with binomial distribution and identity link allows the inequality decomposition model to hold, and produces valid estimates of determinants that do not vary according to choice of reference groups. This GLM approach is readily available in standard statistical packages.

Year:  2010        PMID: 20199691      PMCID: PMC2845147          DOI: 10.1186/1756-0500-3-57

Source DB:  PubMed          Journal:  BMC Res Notes        ISSN: 1756-0500


Findings

Over the past decade, inequality measures have been adapted from the field of economics and applied to the study of health inequalities. The concentration index is now widely used to study inequality in the health sector [1-3]. One of its important features is a mathematical property that allows the overall concentration index to be decomposed into a linear combination of the concentration indices of its determinants [4,5]. The contributions of determinants to an overall health inequality have been quantified for many health outcomes [6-12]. Decomposition estimation was originally designed for continuous health outcomes, using Ordinary Least Squares (OLS) regression. OLS assumes normality of the outcome variable and implicitly assumes that the mean outcome is a linear combination of the determinants. Since then another approach, used where the health outcome to be decomposed is binary, has been based on a probit model with marginal effects [2]; Hosseinpoor et al. [8] modified this approach slightly, using a logit instead of a probit analysis. This extension of the decomposition method to binary outcomes is very appealing, because health sector outcomes are often binary.

Decomposition methodology has long been used to examine discrimination in the labor market [13,14]. The impact of the choice of reference groups on parameter estimates in wage discrimination studies was first noted by Jones [15] and then addressed by Oaxaca and Ransom [16] in the context of multiple sets of categorical variables. However, these wage discrimination papers deal with continuous outcomes and provide no information on how to manage reference groups for the binary outcomes often encountered in health studies.
Here we: 1) compare the existing estimation approaches for decomposition of inequality for binary health outcomes; 2) show that the decomposition of a binary outcome using probit analysis can lead to different results with different choices of reference group; and 3) introduce an alternative approach that uses the Generalized Linear Model (GLM) with binomial distribution and identity link.

Methods

Data source and variables

We used data from the Thai Health and Welfare Survey of 2003 conducted by the National Statistical Office. In this survey every available member of a sampled household aged 15 years or older was interviewed, a total of 37,202 individuals from 19,952 households.

Outcome variable

The health outcome studied was recent morbidity, a binary variable. The English translation of the relevant survey question was: "Have you been ill or not feeling well during the past one month?"

Socioeconomic rank

Monthly adult-equivalent household income was used as the measure of socioeconomic status. For Thailand, empirical studies suggest weighting each child aged under 15 as 0.5 of an adult and allowing for economies of scale applying to any household with more than one member by raising adult-equivalent household size to the power of 0.75 [17].
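The equivalence scale described above can be sketched in a few lines. This is only an illustration under the stated weights (0.5 per child under 15, exponent 0.75); the function name, argument names and the example household are hypothetical and do not come from the survey.

```python
def adult_equivalent_income(household_income, n_adults, n_children,
                            child_weight=0.5, scale_exponent=0.75):
    """Monthly household income per adult equivalent.

    Each child under 15 counts as `child_weight` of an adult, and the
    weighted household size is raised to `scale_exponent` to allow for
    economies of scale in households with more than one member.
    """
    weighted_size = n_adults + child_weight * n_children
    if weighted_size <= 0:
        raise ValueError("household must contain at least one member")
    equivalent_size = weighted_size ** scale_exponent
    return household_income / equivalent_size


# Hypothetical household: 2 adults, 2 children, income of 12,000 per month.
# Weighted size is 2 + 0.5 * 2 = 3, so the divisor is 3 ** 0.75.
income = adult_equivalent_income(12000, n_adults=2, n_children=2)
```

A single adult living alone has an equivalent size of 1, so the income is left unchanged in that case.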

Determinants

Three categorical health determinants were examined: eight age-sex groups (males aged 15-29 years, males aged 30-44 years, males aged 45-59 years, males aged 60 years or older, females aged 15-29 years, females aged 30-44 years, females aged 45-59 years, females aged 60 years or older); four levels of education (no education, primary, high school, higher education); and five areas of residence (Bangkok, Central, North, Northeast, South).

Measurement of inequalities in health as a concentration index (C) has primarily drawn on the literature on income inequality measures [3,18,19]. The concentration index can be written in various ways, but one of the most cited is that proposed by Kakwani et al. [1] (Equation 1):

$$C = \frac{2}{n\mu} \sum_{i=1}^{n} h_i R_i - 1 \qquad (1)$$

where $h_i$ is the variable of interest for the $i$th person; $\mu$ is the mean or proportion of $h$; $n$ is the number of persons; and, if the $n$ individuals are ranked according to their socioeconomic status, beginning with the most disadvantaged, $R_i = (i - 0.5)/n$ is their relative rank. When there is no inequality (or when inequality is balanced and opposite for equal fractions of the income-ranked population), the concentration index equals 0. If the variable of interest is concentrated at a lower (or higher) socioeconomic level, the concentration index becomes negative (or positive).

Three approaches to the decomposition of a binary health outcome are compared: Ordinary Least Squares (OLS), marginal effects from probit analysis, and Generalized Linear Model (GLM) specifying binomial distribution and identity link [20].
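As an illustration of the concentration index described above, the following minimal sketch computes C from an outcome and a socioeconomic measure. The function name and toy data are hypothetical, survey weights are ignored, and ties in socioeconomic status are assumed not to occur.

```python
def concentration_index(h, ses):
    """Concentration index C = 2 / (n * mu) * sum(h_i * R_i) - 1.

    h   : outcome values (e.g. 1 = ill, 0 = not ill)
    ses : socioeconomic measure used only for ranking (assumed free of ties)
    """
    n = len(h)
    mu = sum(h) / n
    # Rank people from most to least disadvantaged.
    order = sorted(range(n), key=lambda i: ses[i])
    weighted_sum = 0.0
    for rank, i in enumerate(order, start=1):
        relative_rank = (rank - 0.5) / n  # R_i
        weighted_sum += h[i] * relative_rank
    return 2.0 * weighted_sum / (n * mu) - 1.0


# Toy example: illness found only among the two poorest of four people.
c_example = concentration_index([1, 1, 0, 0], [10, 20, 30, 40])  # -0.5
```

The negative value reflects an outcome concentrated among the poor; reversing the outcome pattern gives +0.5, and a constant outcome gives 0.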

Ordinary Least Squares (OLS)

Wagstaff et al [5] demonstrate that the concentration index of a continuous health outcome can be decomposed into the contributions of individual determinants. In this case, a linear additive relationship between the outcome variable $h_i$ and the $k$ determinants is appropriate (Equation 2):

$$h_i = \alpha + \sum_{k} \beta_k x_{ki} + \varepsilon_i \qquad (2)$$

and OLS regression is applied to estimate the β's. By substituting from Equation 2 into Equation 1, the overall concentration index (C) can be rewritten as a linear combination of the concentration indices of the determinants, plus an error term (Equation 3):

$$C = \sum_{k} \frac{\beta_k \bar{x}_k}{\mu} C_k + \frac{GC_\varepsilon}{\mu} \qquad (3)$$

where $\beta_k$ are the coefficients from the regression of the health outcome on each of the $k$ determinants, $\bar{x}_k$ is the mean or proportion of each determinant, $\mu$ is the mean or proportion of the health outcome, and $C_k$ is the concentration index for the $k$th determinant calculated using Equation 1, replacing the health outcome ($h$) with the determinant ($x_k$). $GC_\varepsilon$ is the generalized concentration index for the error term.
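The decomposition identity can be checked numerically on a toy example. The sketch below assumes a single binary determinant, in which case the least-squares coefficients reduce to group means, so no regression library is needed. All data and names are hypothetical, and the generalized concentration index of the error is taken as (2/n) times the sum of residual times rank, which assumes residuals with mean zero.

```python
def relative_ranks(n):
    """Relative ranks R_i = (i - 0.5) / n for people sorted poorest first."""
    return [(i - 0.5) / n for i in range(1, n + 1)]


def concentration_index(v, ranks):
    """C = 2 / (n * mean) * sum(v_i * R_i) - 1, with v sorted poorest first."""
    n = len(v)
    mean = sum(v) / n
    return 2.0 * sum(vi * ri for vi, ri in zip(v, ranks)) / (n * mean) - 1.0


# Toy data (hypothetical), already sorted from poorest to richest.
h = [1, 1, 0, 1, 1, 0, 0, 0]   # binary outcome, e.g. recent illness
x = [1, 1, 1, 0, 1, 0, 0, 0]   # binary determinant, e.g. no education

n = len(h)
R = relative_ranks(n)
mu = sum(h) / n
x_bar = sum(x) / n

# With one dummy regressor, the least-squares fit reduces to group means.
mean_h_x1 = sum(hi for hi, xi in zip(h, x) if xi == 1) / sum(x)
mean_h_x0 = sum(hi for hi, xi in zip(h, x) if xi == 0) / (n - sum(x))
alpha, beta = mean_h_x0, mean_h_x1 - mean_h_x0
residuals = [hi - (alpha + beta * xi) for hi, xi in zip(h, x)]

# First term of the decomposition: contribution of the determinant.
contribution = beta * x_bar / mu * concentration_index(x, R)
# Second term: generalized concentration index of the error, divided by mu.
gc_error = 2.0 / n * sum(e * r for e, r in zip(residuals, R))
residual_term = gc_error / mu

overall_c = concentration_index(h, R)
# overall_c equals contribution + residual_term up to rounding error.
```

In this toy data the overall index is -0.375, of which -0.21875 is attributed to the determinant and -0.15625 is left in the residual term.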

Probit estimates

Health sector variables are seldom continuous and are often binary (e.g., ill, not ill). Van Doorslaer [2] modified Wagstaff's method for use in such non-linear settings. The essential modification was to estimate the β's that go into Equation 3 from a probit regression instead of OLS regression. More specifically, van Doorslaer recommends using the marginal effects rather than the raw probit coefficients as the β's. The World Bank technical notes on non-linear estimation suggest generating marginal effects using a Stata command [21]. Marginal effects can also be calculated with a post-estimation command after running the non-linear model. By default, the marginal effects of each explanatory variable are evaluated at sample means, and in large samples the marginal effect at the sample means approximates the overall mean of the individual marginal effects [22].

Generalized Linear Models (GLM)

The GLM is an extension of the linear modelling process that allows models to be fitted to data that follow probability distributions other than the normal distribution, such as the binomial distribution [23]. The GLM relaxes the assumption of homogeneity of variances that is usual in linear models and enlarges the class of linear OLS models in two ways: (i) the distribution of Y for fixed x is assumed to be from an exponential family of distributions [24], which includes important families such as the normal and binomial distributions; (ii) the relationship between the mean of Y and a linear combination of x's is specified by a link function. The link function connects the probability distribution of the outcome variable (the random part of the model) to the systematic (explanatory) part of the model. For traditional linear models in which the outcome variable follows the normal distribution, the link function used is the identity link; it specifies that the expected value of the outcome variable is a linear combination of the x's. When the outcome variable follows a binomial distribution, link functions commonly used are the logit and probit, giving rise to logistic and probit regressions respectively.

Binomial distribution with identity link

A GLM with a binomially distributed dependent variable and an identity link function is a suitable choice for the decomposition analysis of a binary outcome because it respects the distribution of the outcome while preserving a linear relationship between the independent and dependent variables. The decomposition requires an identity link for the mathematics in Equation 3 to hold. This model can be fitted in Stata [25].
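Why an identity link keeps the decomposition independent of the reference group can be illustrated in a deliberately simple special case. The sketch below assumes one categorical determinant only, so the identity-link fit is saturated and each coefficient is just the difference between a group's outcome proportion and that of the reference group. The data, level names and function names are hypothetical; it does not reproduce the multi-variable model fitted in the paper.

```python
def relative_ranks(n):
    """Relative ranks R_i = (i - 0.5) / n for people sorted poorest first."""
    return [(i - 0.5) / n for i in range(1, n + 1)]


def subtotal_contribution(h, group, reference):
    """Sum of beta_k * x_bar_k * C_k / mu over the non-reference levels.

    h and group must be sorted from poorest to richest. With a single
    categorical determinant and an identity link, beta_k is the difference
    in outcome proportions between level k and the reference level.
    """
    n = len(h)
    R = relative_ranks(n)
    mu = sum(h) / n
    levels = sorted(set(group))
    proportion = {
        g: sum(hi for hi, gi in zip(h, group) if gi == g) / group.count(g)
        for g in levels
    }
    total = 0.0
    for g in levels:
        if g == reference:
            continue
        beta = proportion[g] - proportion[reference]
        dummy = [1 if gi == g else 0 for gi in group]
        x_bar = sum(dummy) / n
        # x_bar * C_k simplifies to (2 / n) * sum(x_i * R_i) - x_bar.
        x_bar_times_c = 2.0 / n * sum(d * r for d, r in zip(dummy, R)) - x_bar
        total += beta * x_bar_times_c / mu
    return total


# Toy data (hypothetical), sorted from poorest to richest.
group = ["none", "none", "primary", "none", "primary", "higher",
         "primary", "none", "higher", "primary", "higher", "higher"]
h = [1, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 0]

subtotals = {ref: subtotal_contribution(h, group, ref)
             for ref in ("none", "primary", "higher")}
# All three subtotals coincide (up to rounding error).
```

Changing the reference level shifts every coefficient in the set by the same constant, and the terms x̄_k C_k sum to zero across all levels of a categorical variable, so the shift cancels in the subtotal.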

Results

The overall concentration index for reported illness in the previous month in the Thai sample of 2003 was -0.105 (95% confidence interval -0.086, -0.124). Thus recent illness was concentrated more at the poorer end of the income distribution. Proportions in age-sex, education and geographic residence groups are presented in Table 1. Negative concentration indices showed lower socioeconomic status among males aged 60 or older (C = -0.247), females aged 60 or older (C = -0.251), persons with no education (C = -0.321), and those residing in the Northeastern region (C = -0.256). We decomposed the overall inequality observed, estimating contributions due to age-sex, education and regions (Table 2). We used three estimates (OLS, marginal effects from probit analysis and GLM binomial identity) and compared results obtained with two extreme sets of reference groups (the most and least advantaged in each category). Each column presents contributions to the overall concentration index, which are obtained from the first element in Equation 3 (i.e., Contribution to Concentration index, or $CC_k = (\beta_k \bar{x}_k / \mu) C_k$), as well as percentages of the overall concentration index (-0.105). Both the OLS and GLM binomial identity approaches gave CC subtotal estimates that did not vary by choice of reference groups, in marked contrast to the probit-based estimates.
Table 1

Proportions and concentration indices for age-sex, education and geographic groups

| Groups | Proportion (x̄_k) | Concentration index (C_k)* |
|---|---|---|
| Age-sex (years) | | |
| Males aged 15-29 | 0.143 | 0.035 |
| Males aged 30-44 | 0.144 | 0.080 |
| Males aged 45-59 | 0.103 | 0.049 |
| Males aged 60+ | 0.061 | -0.247 |
| Females aged 15-29 | 0.170 | 0.024 |
| Females aged 30-44 | 0.176 | 0.059 |
| Females aged 45-59 | 0.123 | -0.009 |
| Females aged 60+ | 0.079 | -0.251 |
| Subtotal | 1.000 | |
| Education levels | | |
| No education | 0.055 | -0.321 |
| Primary level | 0.585 | -0.123 |
| High school level | 0.258 | 0.125 |
| Higher level | 0.101 | 0.570 |
| Subtotal | 1.000 | |
| Regions | | |
| Bangkok | 0.139 | 0.559 |
| Central region | 0.215 | 0.179 |
| Northern region | 0.195 | -0.162 |
| Northeastern region | 0.341 | -0.256 |
| Southern region | 0.110 | 0.024 |
| Subtotal | 1.000 | |

Source: Thai Health and Welfare Survey 2003

Table 2

Contributions to Concentration indices (CC) and their percent contributions (shown in brackets) comparing two reference sets for Ordinary Least Squares (OLS), Probit and Generalized Linear Model (GLM) with binomial distribution and identity link

| Groups | OLS (%) Reference 1* | OLS (%) Reference 2* | Probit (%) Reference 1 | Probit (%) Reference 2 | GLM (%) Reference 1 | GLM (%) Reference 2 |
|---|---|---|---|---|---|---|
| Males 15-29 | ref | -0.007 (7.0) | ref | -0.005 (4.5) | ref | -0.007 (7.0) |
| Males 30-44 | 0.002 (-1.7) | -0.015 (14.6) | 0.003 (-2.6) | -0.010 (9.3) | 0.002 (-1.6) | -0.015 (14.6) |
| Males 45-59 | 0.003 (-2.4) | -0.005 (4.8) | 0.003 (-3.2) | -0.003 (3.0) | 0.002 (-2.3) | -0.005 (4.8) |
| Males 60+ | -0.016 (15.2) | 0.006 (-6.1) | -0.020 (18.7) | 0.004 (-4.1) | -0.016 (15.1) | 0.006 (-6.1) |
| Females 15-29 | 0.001 (-0.7) | -0.005 (5.1) | 0.001 (-1.0) | -0.004 (3.4) | 0.001 (-0.6) | -0.005 (5.1) |
| Females 30-44 | 0.005 (-4.7) | -0.010 (9.8) | 0.007 (-6.3) | -0.007 (6.5) | 0.005 (-4.5) | -0.011 (10.0) |
| Females 45-59 | -0.001 (0.9) | 0.001 (-0.7) | -0.001 (1.1) | 0.000 (-0.5) | -0.001 (0.9) | 0.001 (-0.7) |
| Females 60+ | -0.029 (27.9) | ref | -0.034 (32.2) | ref | -0.029 (27.8) | ref |
| Subtotal | -0.036 (34.5) | -0.036 (34.5) | -0.041 (39.0) | -0.023 (22.1) | -0.037 (34.7) | -0.037 (34.7) |
| No education | -0.005 (4.6) | ref | -0.005 (5.0) | ref | -0.004 (3.4) | ref |
| Primary | -0.018 (17.1) | 0.002 (-1.6) | -0.019 (18.3) | 0.000 (-0.3) | -0.014 (13.2) | 0.001 (-0.7) |
| High school | 0.002 (-2.1) | -0.007 (6.3) | 0.002 (-2.0) | -0.007 (6.2) | 0.001 (-0.8) | -0.006 (5.4) |
| Higher | ref | -0.016 (15.0) | ref | -0.015 (14.0) | ref | -0.012 (11.2) |
| Subtotal | -0.021 (19.7) | -0.021 (19.7) | -0.022 (21.3) | -0.021 (19.9) | -0.017 (15.9) | -0.017 (15.9) |
| Bangkok | ref | -0.039 (37.0) | ref | -0.034 (32.4) | ref | -0.038 (36.2) |
| Central | 0.001 (-1.3) | -0.018 (17.2) | 0.002 (-1.9) | -0.016 (15.1) | 0.003 (-2.4) | -0.016 (15.6) |
| North | -0.016 (15.1) | ref | -0.017 (16.3) | ref | -0.016 (14.8) | ref |
| Northeast | -0.007 (6.9) | 0.037 (-34.7) | -0.008 (8.0) | 0.034 (-31.9) | -0.007 (6.2) | 0.036 (-34.5) |
| South | 0.001 (-0.6) | -0.001 (0.7) | 0.001 (-0.7) | -0.001 (0.6) | 0.001 (-0.6) | -0.001 (0.7) |
| Subtotal | -0.021 (20.2) | -0.021 (20.2) | -0.023 (21.8) | -0.017 (16.1) | -0.019 (18.0) | -0.019 (18.0) |
| ΣCC | -0.078 (74.3) | -0.078 (74.3) | -0.086 (82.0) | -0.061 (58.2) | -0.072 (68.6) | -0.072 (68.6) |
| Residual | -0.027 (25.7) | -0.027 (25.7) | -0.019 (18.0) | -0.044 (41.8) | -0.033 (31.4) | -0.033 (31.4) |
| Overall C | -0.105 | -0.105 | -0.105 | -0.105 | -0.105 | -0.105 |

Source: Thai Health and Welfare Survey 2003

*Reference values used in set 1 are: males aged 15-29, higher education, Bangkok.

*Reference values used in set 2 are: females aged 60+, no education, North.

When marginal effects from probit analysis are used, the sum of the contributions to the overall concentration index (ΣCC) depends on the choice of reference groups, for which no guidance is given in the literature that has used this approach. At the two extremes, in our example, choosing set 1 as reference groups results in more of the observed inequality being explained (ΣCC = -0.086, or 82.0 percent of C = -0.105), while choosing set 2 results in appreciably less of it being explained (ΣCC = -0.061, or 58.2 percent of C = -0.105). More generally, it would appear that one is likely to get a higher ΣCC percent figure when reference groups are at the opposite extreme to the overall inequality.
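The explained shares and residuals reported in Table 2 follow from simple arithmetic on ΣCC and the overall C. The sketch below recomputes them from the rounded table entries, so the percentages may differ from the published ones in the last digit; the dictionary labels and function names are hypothetical.

```python
OVERALL_C = -0.105

# Sum of contributions (ΣCC) as reported in Table 2, rounded to 3 decimals.
SUM_CC = {
    "OLS (either reference set)": -0.078,
    "Probit, reference set 1": -0.086,
    "Probit, reference set 2": -0.061,
    "GLM binomial identity (either reference set)": -0.072,
}


def explained_percent(sum_cc, overall_c=OVERALL_C):
    """Share of the overall concentration index explained by the determinants."""
    return 100.0 * sum_cc / overall_c


def residual(sum_cc, overall_c=OVERALL_C):
    """Part of the overall concentration index left unexplained."""
    return overall_c - sum_cc


for label, value in SUM_CC.items():
    print(f"{label}: explained {explained_percent(value):.1f}%, "
          f"residual {residual(value):.3f}")
```

The gap between the two probit rows is the reference-group sensitivity discussed above, whereas the OLS and GLM rows each give a single figure whichever reference set is used.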

Discussion and conclusion

There are two requirements for a satisfactory concentration index decomposition in a non-linear (binary outcome) setting: first, the binomial distribution of the outcome needs to be taken into account; and second, the outcome variable must be a linear combination of the independent determinants for the mathematics of the decomposition of the concentration index to hold. The OLS approach is based on a normal model with an identity link function, and the probit approach is in essence a binomial model with a probit link. The OLS approach should not be used for binary outcomes because it does not meet the first requirement. A probit estimate fails the second requirement, and as a result it produces estimates that depend on the choice of reference groups. In practice, it is also possible to estimate a marginal effect using the average of the individual effects rather than the effect evaluated at the sample means [2].

Decomposition of concentration indices yields useful information regarding the relative importance of various determinants of inequitable health outcomes. But the two decomposition estimation approaches in current use are not suitable for binary outcomes, and such outcomes include many useful health indicators. In contrast, our GLM approach specifying the binomial distribution of the outcome and an identity link function allows the decomposition model to hold, and produces valid coefficient estimates that do not vary according to choice of reference groups. In addition, the GLM binomial identity link is readily available in standard statistical packages like Stata, and thus should be a valid approach when decomposing concentration indices for binary outcomes.

List of abbreviations

GLM: Generalized Linear Model; OLS: Ordinary Least Squares; C: Concentration index; CC: Contribution to Concentration index.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

VY designed the study, analysed the data, and drafted the manuscript. LL was involved in the design of the study and the write-up of the manuscript. GC noticed that probit estimates were sensitive to the choice of reference groups and provided detailed comments and editorial guidance. KD provided conceptual advice. AS guided the revisions and finalization of the manuscript. All authors read and approved the final version of the manuscript.