Root Gorelick, Susan M Bertram.
Abstract
BACKGROUND: How can we compute a segregation or diversity index from a three-way or multi-way contingency table, where each variable can take on an arbitrary finite number of values and where the index takes values between zero and one? Previous methods exist only for two-way contingency tables or dichotomous variables. A prototypical three-way case is the segregation index of a set of industries or departments given multiple explanatory variables of both sex and race. This can be further extended to other variables, such as disability, number of years of education, and former military service.
Year: 2010 PMID: 20532222 PMCID: PMC2879365 DOI: 10.1371/journal.pone.0010912
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Seven examples of overall segregation index computed using the Boltzmann/Shannon/Theil association measure, with expected values based on independence.
| Example | Department | Sex | Majority | Minority | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | Sciences | Male | 100 | 100 | | | | | | |
| | Sciences | Female | 100 | 100 | 0 | 0 | 0 | 0 | 0.699 | 0.699 |
| | Humanities | Male | 100 | 100 | (0) | (0) | (0) | (0) | (0) | (0) |
| | Humanities | Female | 100 | 100 | | | | | | |
| | | | | | | | | | | |
| | Sciences | Male | 200 | 0 | | | | | | |
| | Sciences | Female | 0 | 200 | 1.000 | 0 | 0 | 0 | 0.699 | 0.699 |
| | Humanities | Male | 0 | 200 | (1.00) | (0) | (0) | (0) | (0) | (0) |
| | Humanities | Female | 200 | 0 | | | | | | |
| | | | | | | | | | | |
| | Sciences | Male | 200 | 0 | | | | | | |
| | Sciences | Female | 0 | 200 | 1.000 | 1.000 | 0 | 0 | 0.699 | 0.699 |
| | Humanities | Male | 200 | 0 | (1.00) | (1.00) | (0) | (0) | (0) | (0) |
| | Humanities | Female | 0 | 200 | | | | | | |
| | | | | | | | | | | |
| | Sciences | Male | 200 | 200 | | | | | | |
| | Sciences | Female | 0 | 0 | 1.000 | 0 | 1.000 | 0 | 0.699 | 0.699 |
| | Humanities | Male | 0 | 0 | (1.00) | (0) | (1.00) | (0) | (0) | (0) |
| | Humanities | Female | 200 | 200 | | | | | | |
| | | | | | | | | | | |
| | Sciences | Male | 200 | 0 | | | | | | |
| | Sciences | Female | 200 | 0 | 1.000 | 0 | 0 | 1.000 | 0.699 | 0.699 |
| | Humanities | Male | 0 | 200 | (1.00) | (0) | (0) | (1.00) | (0) | (0) |
| | Humanities | Female | 0 | 200 | | | | | | |
| | | | | | | | | | | |
| | Sciences | Male | 190 | 190 | | | | | | |
| | Sciences | Female | 10 | 10 | 0 | 0 | 0 | 0 | 0.699 | 0.914 |
| | Humanities | Male | 190 | 190 | (0) | (0) | (0) | (0) | (0) | (0.71) |
| | Humanities | Female | 10 | 10 | | | | | | |
| | | | | | | | | | | |
| | Sciences | Male | 190 | 10 | | | | | | |
| | Sciences | Female | 190 | 10 | 0 | 0 | 0 | 0 | 0.914 | 0.699 |
| | Humanities | Male | 190 | 10 | (0) | (0) | (0) | (0) | (0.71) | (0) |
| | Humanities | Female | 190 | 10 | | | | | | |
The subscript industry is abbreviated ind and overall is abbreviated all. For each scenario, the first row of values gives the association measures, while the second row (in parentheses) gives the association measures divided by their respective maximum values. Note that scenario A provides the calculation of these maximum values. Summations in subscripts refer to aggregations/projections along one or more dimensions of each array. Furthermore, the measure for race alone is the association measure of the aggregation/projection across the dimensions of industry and sex, i.e. all the dimensions other than race. Likewise, the measure for sex alone is the association measure after collapsing all dimensions other than sex.
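The aggregation/projection step and the comparison against expected values based on independence can be illustrated with a short sketch. The exact formula used in the paper is not reproduced in this excerpt, so the code below assumes a Kullback-Leibler (Shannon-type) divergence in base 2 between the observed proportions and the product of the one-way marginal proportions. The names `marginal`, `association`, and `build` are hypothetical helpers introduced only for this illustration. Under that assumption, the sketch gives 0 for the table with 100 in every cell and 1.0 for the table in which sex and race coincide exactly; it does not attempt to reproduce the single-variable values such as 0.699 or 0.914.

```python
from itertools import product
from math import log2

DEPARTMENT, SEX, RACE = 0, 1, 2  # axis positions in each cell key


def marginal(table, keep):
    """Aggregate a table {(dept, sex, race): count} over every axis not in `keep`."""
    out = {}
    for cell, count in table.items():
        key = tuple(cell[axis] for axis in keep)
        out[key] = out.get(key, 0) + count
    return out


def association(table, axes):
    """Assumed measure: KL divergence (bits) between the observed proportions on
    `axes` and the proportions expected if those axes were independent."""
    total = sum(table.values())
    joint = marginal(table, axes)
    singles = [marginal(table, (axis,)) for axis in axes]
    result = 0.0
    for cell, count in joint.items():
        if count == 0:
            continue  # a zero cell contributes nothing (0 * log 0 taken as 0)
        observed = count / total
        expected = 1.0
        for position, single in enumerate(singles):
            expected *= single[(cell[position],)] / total
        result += observed * log2(observed / expected)
    return result


def build(counts):
    """counts: (majority, minority) pairs in the row order Sciences/Male,
    Sciences/Female, Humanities/Male, Humanities/Female."""
    table = {}
    rows = product(("Sciences", "Humanities"), ("Male", "Female"))
    for (dept, sex), (majority, minority) in zip(rows, counts):
        table[(dept, sex, "Majority")] = majority
        table[(dept, sex, "Minority")] = minority
    return table


uniform = build([(100, 100)] * 4)
sex_by_race = build([(200, 0), (0, 200), (200, 0), (0, 200)])

print(association(uniform, (DEPARTMENT, SEX, RACE)))      # overall, uniform table
print(association(sex_by_race, (DEPARTMENT, SEX, RACE)))  # overall
print(association(sex_by_race, (SEX, RACE)))              # collapsed over department
print(association(sex_by_race, (DEPARTMENT, SEX)))        # collapsed over race
```

Passing a shorter tuple of axes to `association` is what plays the role of the summation subscripts here: the table is first collapsed over the omitted dimensions and the measure is then computed on what remains.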