Abstract
For pilot or experimental employment programme results to apply beyond their test bed, researchers must select 'clusters' (i.e. the job centres delivering the new intervention) that are reasonably representative of the whole territory. More specifically, this requirement must account for conditions that could artificially inflate the effect of a programme, such as the fluidity of the local labour market or the performance of the local job centre. Failure to achieve representativeness results in Cluster Sampling Bias (CSB). This paper makes three contributions to the literature. Theoretically, it approaches the notion of CSB as a human behaviour. It offers a comprehensive theory, whereby researchers with limited resources and conflicting priorities tend to oversample 'effect-enhancing' clusters when piloting a new intervention. Methodologically, it advocates for a 'narrow and deep' scope, as opposed to the 'wide and shallow' scope, which has prevailed so far. The PILOT-2 dataset was developed to test this idea. Empirically, it provides evidence on the prevalence of CSB. In conditions similar to the PILOT-2 case study, investigators (1) do not sample clusters with a view to maximising generalisability; (2) do not oversample 'effect-enhancing' clusters; (3) consistently oversample some clusters, including those with higher-than-average client caseloads; and (4) report their sampling decisions in an inconsistent and generally poor manner. In conclusion, although CSB is prevalent, it is still unclear whether it is intentional and meant to mislead stakeholders about the expected effect of the intervention or due to higher-level constraints or other considerations.
Year: 2016; PMID: 27504823; PMCID: PMC4978397; DOI: 10.1371/journal.pone.0160652
Source DB: PubMed; Journal: PLoS One; ISSN: 1932-6203; Impact factor: 3.240
Fig 1. Selection process.
Descriptive statistics.
| Variable | N | Min | Max | Mean | SD | Freq (1) |
|---|---|---|---|---|---|---|
| Effective pilot site | 2,600 | 0 | 1 | -- | -- | 533 |
| Region–North | 2,720 | 0 | 1 | -- | -- | 748 |
| Region–Midlands | 2,720 | 0 | 1 | -- | -- | 544 |
| Region–London | 2,720 | 0 | 1 | -- | -- | 612 |
| Region–South | 2,720 | 0 | 1 | -- | -- | 816 |
| Pathfinder | 2,720 | 0 | 1 | -- | -- | 400 |
| JCP-led programme | 2,720 | 0 | 1 | -- | -- | 1,880 |
| Programme targeting ethnic minorities | 2,720 | 0 | 1 | -- | -- | 160 |
| Programme targeting lone parents | 2,720 | 0 | 1 | -- | -- | 400 |
| Programme targeting disabled people | 2,720 | 0 | 1 | -- | -- | 440 |
| Absolute performance (rank) | 2,560 | 1 | 40 | 20.5 | 11.54 | -- |
| Relative performance (rank) | 360 | 1 | 47 | 22.26 | 14.43 | -- |
| Benefit claimants (%) | 2,720 | 1.3 | 7.3 | 3.08 | 1.42 | -- |
| Working age population (in 100,000) | 2,720 | 3.98 | 15.37 | 7.71 | 2.75 | -- |
| Population per ha (in 10) | 2,720 | 0.1 | 26.8 | 2.92 | 5.24 | -- |
| Cumulated number of pilots | 2,600 | 0 | 25 | 6.35 | 4.82 | -- |
| Ethnic minorities (%) | 2,720 | 4 | 43 | 15 | 10.44 | -- |
| Lone parents claiming Income Support (%) | 2,720 | 0.3 | 5.76 | 2.48 | 1.27 | -- |
| Incapacity Benefit claimants (%) | 2,720 | 2.33 | 23.75 | 8.55 | 5.34 | -- |
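For the binary variables, the share of observations coded 1 follows from dividing the Freq (1) column by N. The short Python sketch below does this for a few rows copied from the table and checks that the four region dummies add up to the full sample. The helper name `share_of_ones` is illustrative and does not come from the paper.

```python
# Rows copied from the descriptive statistics table: name -> (N, Freq (1)).
binary_vars = {
    "Effective pilot site": (2600, 533),
    "Pathfinder": (2720, 400),
    "JCP-led programme": (2720, 1880),
}

# Frequencies of the four mutually exclusive region dummies.
region_freqs = {
    "Region–North": 748,
    "Region–Midlands": 544,
    "Region–London": 612,
    "Region–South": 816,
}


def share_of_ones(n, freq):
    """Proportion of observations for which the binary variable equals 1."""
    return freq / n


shares = {name: share_of_ones(n, freq) for name, (n, freq) in binary_vars.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1%}")

# The region dummies should partition the 2,720 observations.
region_total = sum(region_freqs.values())
print(f"Sum of region frequencies: {region_total}")
```

On these figures, roughly one observation in five is an effective pilot site, and the region frequencies sum to the full N of 2,720.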
Fig 2. Frequency distribution of cluster sampling variables used in DWP pilots.
Fig 3. Frequency distribution of the number of pilots per JCP district.
Odds ratio of being selected as pilot district vs. not (models 1 to 3).
| Variable | (1) | (2) | (3) |
|---|---|---|---|
| Benefit claimants (%) | 1.12 | 1.30 | 1.13 |
| Working age population (in 100,000) | 1.05 | 1.1 | 1.06 |
| Population per ha (in 10) | 1.03 | 1.06 | 1.04 |
| Region–Midlands | 0.97 | 1.5 | -- |
| Region–London | 0.69 | 0.61 | 0.67 |
| Region–South | 0.57 | 0.62 | 0.60 |
| Cumulated number of pilots | 1.02 | 0.96 | -- |
| Absolute performance (rank) | 1.01 | -- | -- |
| Relative performance (rank) | -- | 1.01 | -- |
| Intercept | 0.1 | 21.8 | 0.13 |
| N | 2,400 | 330 | 2,600 |
- Binary logistic regression
- Y = PILOT
- Coefficients are odds ratios
- Significance levels: * p<0.1; ** p<0.05; *** p<0.01
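Because the coefficients are reported as odds ratios, a value above 1 means higher odds of selection per unit increase in the predictor, and a value below 1 means lower odds. The following Python sketch converts two odds ratios from model (1) into logit coefficients and percentage changes in odds, and turns the intercept into a baseline probability. It assumes that Region–North, which is absent from the table, is the reference category. The baseline probability applies only when every covariate equals zero, so it is an arithmetic illustration and not a realistic district. The function names are illustrative.

```python
import math


def log_odds_coefficient(odds_ratio):
    """Logit coefficient that corresponds to a reported odds ratio."""
    return math.log(odds_ratio)


def percent_change_in_odds(odds_ratio):
    """Percentage change in the odds for a one-unit increase in the predictor."""
    return (odds_ratio - 1.0) * 100.0


def odds_to_probability(odds):
    """Convert odds into a probability."""
    return odds / (1.0 + odds)


# Odds ratios copied from model (1).
model_1 = {
    "Benefit claimants (%)": 1.12,
    "Region–South": 0.57,  # assumed to be relative to Region–North
}
intercept_odds = 0.1

for name, odds_ratio in model_1.items():
    beta = log_odds_coefficient(odds_ratio)
    change = percent_change_in_odds(odds_ratio)
    print(f"{name}: beta = {beta:.3f}, change in odds = {change:+.0f}%")

# Odds of selection when all covariates are zero, expressed as a probability.
baseline_probability = odds_to_probability(intercept_odds)
print(f"Baseline probability: {baseline_probability:.3f}")
```

Read this way, each additional percentage point of benefit claimants is associated with about 12% higher odds of selection in model (1), while districts in the South have about 43% lower odds than the assumed reference region.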
Odds ratio of being selected as pilot district vs. not (models 4 to 7).
| Variable | (4) | (5) | (6) | (7) |
|---|---|---|---|---|
| Benefit claimants (%) | 1.12 | 1.11 | 1.23 | 1.22 |
| Working age population (in 100,000) | 1.06 | 1.06 | 1.06 | 1.06 |
| Population per ha (in 10) | 1.03 | 1.03 | 1.06 | 1.06 |
| Region–London | 0.64 | 0.65 | 0.50 | 0.51 |
| Region–South | 0.53 | 0.53 | 0.56 | 0.56 |
| A. Absolute performance (rank) | 1.01 | 1.02 | -- | -- |
| B. Relative performance (rank) | -- | -- | 1.01 | 1.00 |
| C. Pathfinder | 0.89 | -- | 0.26 | -- |
| D. JCP-led programme | -- | 1.25 | -- | 0.41 |
| Interaction A*C | 0.98 | -- | -- | -- |
| Interaction A*D | -- | 0.99 | -- | -- |
| Interaction B*C | -- | -- | 0.99 | -- |
| Interaction B*D | -- | -- | -- | 1.01 |
| Intercept | 0.11 | 0.09 | 0.08 | 0.10 |
| N | 2,520 | 2,520 | 330 | 330 |
- Binary logistic regression
- Y = PILOT
- Coefficients are odds ratios
- Significance levels: * p<0.1; ** p<0.05; *** p<0.01
Odds ratio of being selected as pilot district vs. not (model 8).
| Variable | (8) |
|---|---|
| Population per ha (in 10) | 1.04 |
| Region–London | 0.36 |
| Region–South | 0.50 |
| E. Ethnic minorities (%) | 1.01 |
| F. Programme targeting ethnic minorities | 0.15 |
| Interaction E*F | 1.13 |
| G. Lone parents claiming Income Support (%) | 0.95 |
| H. Programme targeting lone parents | 0.37 |
| Interaction G*H | 1.42 |
| I. Incapacity Benefit claimants (%) | 0.99 |
| J. Programme targeting disabled people | 2.31 |
| Interaction I*J | 0.99 |
| Intercept | 0.27 |
| N | 2,600 |
- Binary logistic regression
- Y = PILOT
- Coefficients are odds ratios
- Significance levels: ** p<0.05; *** p<0.01
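In a logit model with an interaction term, the odds ratio of a district characteristic within targeted programmes is the product of its main-effect odds ratio and the interaction odds ratio. The Python sketch below applies this to the three pairs in model (8). It works from the point estimates only and says nothing about statistical significance, since the significance markers are not shown in the table above. The function name is illustrative.

```python
# From model (8): characteristic -> (main-effect OR, interaction OR).
model_8 = {
    "Ethnic minorities (%)": (1.01, 1.13),                     # E, E*F
    "Lone parents claiming Income Support (%)": (0.95, 1.42),  # G, G*H
    "Incapacity Benefit claimants (%)": (0.99, 0.99),          # I, I*J
}


def odds_ratio_in_targeted_programmes(main_or, interaction_or):
    """Odds ratio per unit of the characteristic when the programme dummy is 1.

    On the log-odds scale the two coefficients add, so the odds ratios multiply.
    """
    return main_or * interaction_or


targeted = {
    name: odds_ratio_in_targeted_programmes(main_or, interaction_or)
    for name, (main_or, interaction_or) in model_8.items()
}

for name, (main_or, _) in model_8.items():
    print(f"{name}: untargeted OR = {main_or:.2f}, targeted OR = {targeted[name]:.4f}")
```

For example, the point estimate for the share of ethnic minorities rises from 1.01 in untargeted programmes to about 1.14 per percentage point in programmes targeting ethnic minorities.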