
Who benefits from individual placement and support? A meta-analysis.

Lars de Winter1, Chrisje Couwenbergh1, Jaap van Weeghel1, Sarita Sanches1, Harry Michon2, Gary R Bond3.   

Abstract

AIMS: Individual placement and support (IPS) is an evidence-based service model to support people with mental disorders in obtaining and sustaining competitive employment. IPS is increasingly offered to a broad variety of service users. In this meta-analysis we analysed the relative effectiveness of IPS for different subgroups of service users both based on the diagnosis and defined by a range of clinical, functional and personal characteristics.
METHODS: We included randomised controlled trials that evaluated IPS for service users diagnosed with any mental disorder. We examined effect sizes for the between-group differences at follow-up for three outcome measures (employment rate, job duration and wages), controlling for methodological confounders (type of control group, follow-up duration and geographic region). Using sensitivity analyses of subgroup differences, we analysed moderating effects of the following diagnostic, clinical, functional and personal characteristics: severe mental illness (SMI), common mental disorders (CMD), schizophrenia spectrum disorders, mood disorders, duration of illness, the severity of symptoms, level of functioning, age, comorbid alcohol and substance use, education level and employment history.
RESULTS: IPS is effective in improving employment outcomes compared to the control group in all subgroups, regardless of any methodological confounder. However, IPS was relatively more effective for service users with SMIs, schizophrenia spectrum disorders and a low symptom severity. Although IPS was still effective for people with CMD and with major depressive disorder, it was relatively less effective for these subgroups. IPS was equally effective after both a short and a long follow-up period. However, we found small, but clinically not meaningful, differences in effectiveness of IPS between active and passive control groups. Finally, IPS was relatively less effective in European studies compared to non-European studies, which could be explained by a potential benefits trap in high welfare countries.
CONCLUSIONS: IPS is effective for all different subgroups, regardless of diagnostic, clinical, functional and personal characteristics. However, there might be a risk of false-positive subgroup outcomes and results should be handled with caution. Future research should focus on whether, and if so, how the IPS model should be adapted to better meet the vocational needs of people with CMD and higher symptom severity.

Keywords:  Mental health; randomised controlled trials; rehabilitation; systematic reviews

Year:  2022        PMID: 35815640      PMCID: PMC9281491          DOI: 10.1017/S2045796022000300

Source DB:  PubMed          Journal:  Epidemiol Psychiatr Sci        ISSN: 2045-7960            Impact factor:   7.818


Introduction

Employment is key to improving community functioning and mental health in people with mental illnesses (Drake and Wallach, 2020). Work fosters a sense of pride and self-esteem, offers financial independence, provides coping strategies for psychiatric symptoms and ultimately facilitates the process of recovery (Dunn et al., 2008). However, depending on the diagnosis, only between 14 and 33% of working-age adults (18–65 years old) with mental illnesses are employed, which is substantially lower than in the general population (Marwaha et al., 2007; Kozma et al., 2010; Hakulinen et al., 2020). Therefore, ongoing support in obtaining and sustaining competitive employment is needed for people with mental illnesses to create a strong and inclusive labour market. Individual placement and support (IPS) is the most effective rehabilitation programme to help people with mental illnesses into competitive employment (Modini et al., 2016; Metcalfe et al., 2018a). IPS was originally developed to support people with severe mental illness (SMI) in achieving competitive employment. IPS is based on eight basic principles (Becker and Drake, 2003; see Box 1). The overall effectiveness of IPS is well-established for people with SMI (Modini et al., 2016; Metcalfe et al., 2018a). Because of its success, IPS is also increasingly offered to people with other diagnoses, such as common mental disorders (CMD), affective disorders, post-traumatic stress disorders (PTSD) and substance use disorders (SUD) (Bond et al., 2019). Results indicated the beneficial effects of IPS for people with PTSD and SUD (Bond et al., 2019). However, mixed indications of the effectiveness of IPS for people with CMD (including affective disorders) were found (Hellström et al., 2021; Probyn et al., 2021). The reasons for the diminished effectiveness of IPS for CMD are unclear, because a consistent definition of CMD is lacking.
Diagnostic criteria for labelling CMD differ between studies, but most studies define CMD to include affective and/or anxiety disorders, of varying duration of illness (Vollebergh et al., 2001; Steel et al., 2014; De Vries et al., 2016).
Box 1. The eight basic principles of IPS (Becker and Drake, 2003)
Employment outcomes in IPS programmes also vary between service users with different clinical, functional and personal characteristics, such as symptom severity, substance use, involuntary hospitalisation, social functioning, work experience, education level, duration of illness, age and age of onset of psychiatric disorder (Catty et al., 2008; Marwaha et al., 2009; Campbell et al., 2010; Luciano et al., 2014; Fyhn et al., 2020; Christensen et al., 2021). However, these moderating effects have been inconsistent across studies, resulting in ambiguity about the effectiveness of IPS in different subgroups. As IPS has been increasingly expanded to different populations, it is timely to investigate how well the effectiveness of IPS generalises to new target groups. Therefore, in this meta-analysis we analysed the relative effectiveness of IPS for different subgroups of service users as reported in randomised controlled trials of IPS. We assessed the relative effectiveness of IPS by examining study-level outcomes for subgroups of studies with different diagnostic, clinical, functional and personal characteristics using sensitivity analyses of subgroup differences (Borenstein and Higgins, 2013). This is the first meta-analysis to focus specifically on the relative effectiveness of IPS in different target groups, covering target groups both with and without SMI. This gives some unique insights into the relative effectiveness of IPS, and is a valuable addition to the recent contributions about this topic in comparable reviews (i.e. Bond et al., 2019; Probyn et al., 2021).
The meta-analysis addressed the following research questions: How does the effectiveness of IPS differ between subgroups of service users with distinct clinical, functional and personal characteristics? What is the relative effectiveness of IPS for specific diagnostic subgroups of service users with CMD, SMI, schizophrenia spectrum disorders and mood disorders?

Materials and methods

Our meta-analysis followed the latest PRISMA guidelines (Page et al., 2021). Our protocol was preregistered in PROSPERO (CRD42020220080).

Search strategy

We identified records through searches in PubMed, PsycInfo and Cochrane of peer-reviewed journals until July 2019. The search was based on terms related to specific primary diagnoses (e.g., schizophrenia, mood disorder, anxiety disorder, but also CMD and SMI), IPS and other vocational rehabilitation programmes and competitive employment (see online Supplementary materials 1). We found additional references through reference lists of identified studies and systematic reviews.

Study selection process

The included studies met the following criteria:

Participant population

We included studies that investigated people who were diagnosed with any mental disorder, as determined by DSM-III to DSM-5 (American Psychiatric Association, 1997, 2000, 2013) or ICD 10–11 criteria (World Health Organization, 2016, 2019). Participants without mental disorders or at risk of developing mental health problems were excluded from the meta-analysis.

Study design

We included all randomised controlled trials that evaluated the effectiveness of IPS compared to at least one control condition in the meta-analysis.

Intervention

We included studies investigating a treatment arm consisting of IPS as a stand-alone intervention (i.e. not augmented with another active intervention, such as cognitive remediation or social skills training), confirmed by an IPS fidelity assessment to have at least ‘fair’ fidelity. For studies investigating the effectiveness of both IPS and IPS augmented with another intervention, we included only the IPS-only arm in the analyses and excluded the augmented arm.

Comparison

The control group could be any other vocational service or a passive control group (i.e., service as usual or waiting list).

Outcomes

The study reported competitive employment outcomes. Two authors (LdW & CC) independently executed study selection, including both title and abstract screening and full-text screening. Disagreements in the full-text selection process were resolved by consensus.

Data extraction

We extracted study details, participant characteristics, treatment variable, outcomes and study design data from all studies in this meta-analysis. Author LdW executed data extraction and discussed and resolved uncertainties with CC. Details about the data-extraction are presented in online Supplementary materials 2.

Data synthesis

Assessment of outcomes

The included studies reported a variety of competitive employment outcomes. Therefore, we focused on three outcome measures of competitive employment that were reported by at least ten studies, the minimum number to provide outcomes with sufficient statistical power in meta-analyses (Jackson and Turner, 2017; Borenstein et al., 2021): (1) Competitive employment rate (i.e., the proportion of participants competitively employed for at least one day during the study period); (2) Job duration (i.e., days, weeks or months competitively employed during the study period); (3) Wages (i.e., total earnings from competitive employment during the study period).

Assessment of study design and region

The included studies also differed in study design, which might affect outcomes: studies compared IPS with a variety of control groups and outcomes were analysed over different follow-up periods. Furthermore, previous research also indicated regional differences (i.e., European versus non-European studies) in the effectiveness of IPS (Drake et al., 2019). Therefore, we analysed the study outcomes within specific subgroups based on these three confounding factors as follows: (1) type of control group: studies with an active control group encompassing treatment as usual combined with any other vocational services versus studies with a passive control group with treatment as usual and no primary focus on improvement of vocational functioning; (2) follow-up duration: an assessment period of 12 months or less versus more than 12 months; (3) Region: European studies versus non-European studies.

Assessment of moderators of outcomes

In order to answer our two research questions, we selected moderators of study outcomes from the included studies. The selection was based on the identification of relevant moderators analysed in previous studies and the availability of extractable raw data of these moderators in at least ten of our included studies (Borenstein et al., 2021). Assessment of diagnostic subgroups. For this study, we assessed subgroups of SMI or CMD based on diagnosis, duration of illness and inclusion criteria of studies (see Table 1). These three criteria were partly based on previous literature (i.e., Steel et al., 2014; De Vries et al., 2016). However, due to the lack of a consistent definition of CMD, we pragmatically translated these criteria based on the availability of data in the included studies. If studies met none of the three criteria we labelled them as ‘unclear’ and did not include these studies in the analysis. We were also able to include SSD and major depressive disorder as separate moderators. We divided these moderators into subgroups of studies in which the majority (i.e., >50%) of the study sample was diagnosed with the specific diagnosis and subgroups of studies in which the minority was diagnosed with the specific disorder (see Table 1).
Table 1.

Operationalisations of moderators

1. Diagnosis
Moderator | Comparison | Assessment instruments (N studies implementing instrument) [a] | Operationalisation | Studies with specific moderator
Severity of illness | Severe mental illness (SMI) vs. common mental disorders (CMD) | 1. Diagnosis (21); 2. Duration of illness (0); 3. Inclusion criteria (2) | (1) Diagnosis: SMI: at least 75% of the study sample schizophrenia spectrum or bipolar disorder; CMD: at least 75% of the study sample mood or anxiety disorder. (2) Duration of illness: SMI: at least 50% of the study sample schizophrenia spectrum or bipolar disorder and duration of illness at least 2 years; CMD: none of the study sample schizophrenia spectrum or bipolar disorder and duration of illness less than 2 years. (3) Inclusion criteria: SMI or CMD are specifically mentioned in the inclusion criteria | SMI (20 studies): Bejerholm 2014; Bond 2007; Bond 2015; Burns 2007; Christensen 2019; Drake 1996; Drake 1999; Erickson 2020; Gold 2006; Howard 2010; Killackey 2008; Killackey 2019; Latimer 2006; Lehman 2002; Mueser 2004; Reme 2019; Tsang 2009; Twamley 2012; Waghorn 2014; Zhang 2017. CMD (5 studies): Davis 2012; Davis 2018; Hellström 2017; Poremski 2015; Reme 2019
Schizophrenia spectrum disorder (SSD) | >50% SSD vs ⩽50% SSD | 1. ICD-10 (4); 2. DSM-IV or -5 (6); 3. OPCRIT (1); 4. SCID (8); 5. MINI (3); 6. CAPS-IV (1); 7. SCAN (2); 8. Clinical records (1); 9. Unclear (5) | Percentage of the study sample diagnosed with a schizophrenia spectrum disorder | >50% SSD (20 studies): Bejerholm 2014; Bond 2007; Bond 2015; Burns 2007; Christensen 2019; Drake 1999; Erickson 2020; Gold 2006; Howard 2010; Killackey 2008; Killackey 2019; Latimer 2006; Lehman 2002; Michon 2014; Mueser 2004; Tsang 2009; Twamley 2012; Waghorn 2014; Wong 2008; Zhang 2017. ⩽50% SSD (11 studies): Bejerholm 2017; Davis 2012; Davis 2018; Drake 1996; Drake 2013; Hellström 2017; Hoffmann 2012; Lones 2017; Poremski 2015; Reme 2019; Viering 2015
Major depressive disorder (MDD) | >50% MDD vs ⩽50% MDD | 1. ICD-10 (4); 2. DSM-IV or -5 (6); 3. OPCRIT (1); 4. SCID (8); 5. MINI (3); 6. CAPS-IV (1); 7. SCAN (2); 8. Clinical records (1); 9. Unclear (5) | Percentage of the study sample diagnosed with a major depressive disorder | >50% MDD (4 studies): Bejerholm 2017; Drake 2013; Hellström 2017; Poremski 2015. ⩽50% MDD (23 studies): Bond 2007; Bond 2015; Burns 2007; Christensen 2019; Davis 2012; Davis 2018; Drake 1996; Drake 1999; Erickson 2020; Gold 2006; Hellström 2017; Hoffmann 2012; Howard 2010; Killackey 2008; Killackey 2019; Lehman 2002; Lones 2017; Mueser 2004; Poremski 2015; Reme 2019; Twamley 2012; Waghorn 2014; Zhang 2017

2. Clinical, functional and personal characteristics
Moderator | Comparison | Assessment instruments (N studies implementing instrument) | Operationalisation | Studies with specific moderator
Duration of illness at baseline | Long duration of illness vs. short duration of illness | Not applicable | We extracted the duration of illness at baseline from the included studies that reported this construct and we calculated the median duration of illness from all included studies (i.e. 11.5 years of illness). All studies with a duration of illness above the median duration of illness were clustered in the ‘long duration of illness’ group; all studies with a duration of illness below the median duration of illness were clustered in the ‘short duration of illness’ group. | Long duration of illness (5 studies): Bejerholm 2017; Davis 2018; Lehman 2002; Oshima 2014; Twamley 2012. Short duration of illness (6 studies): Bejerholm 2014; Burns 2007; Hoffmann 2012; Killackey 2008; Reme 2019; Viering 2015
Baseline severity of symptoms | Low baseline severity of symptoms vs. high baseline severity of symptoms | 1. BPRS (9); 2. DTS (1); 3. HADS (1); 4. HDRS (1); 5. MADRS (1); 6. MHI-5 (1); 7. PANSS (6); 8. PCL-5 (1); 9. SANS (2); 10. SF-12 mental health (1) | From all symptom scales that were reported in the included studies, we searched for studies that assessed the psychometric quality of each scale based on a population comparable to that of the included study that assessed the specific moderator (i.e. ‘reference group studies’). We used the reference group studies as a basis to calculate percentile scores of the baseline level of symptoms from our included studies. By calculating percentile scores for each assessment instrument, we achieved homogeneous moderators assessed in the same scale range. Finally, we calculated the median percentile score of the normative percentile scores in order to cluster all studies into low (i.e. below-median symptom severity) and high (i.e. above-median symptom severity) symptom severity groups. | Low symptom severity (12 studies): Bejerholm 2014; Bond 2007; Burns 2007; Drake 1996; Drake 1999; Gold 2006; Hoffmann 2012; Howard 2010; Killackey 2008; Latimer 2006; Waghorn 2014; Zhang 2017. High symptom severity (12 studies): Bejerholm 2017; Christensen 2019; Davis 2012; Davis 2018; Drake 2013; Erickson 2020; Hellström 2017; Killackey 2019; Michon 2014; Mueser 2004; Reme 2019; Twamley 2012
Baseline level of functioning (LOF) | Low baseline LOF vs high baseline LOF | 1. GAF (5); 2. GAS (2); 3. Post-Traumatic Stress-Related Functional Inventory Score (1); 4. Personal and social performance scale (1); 5. SAS-II (1); 6. SOFAS (2); 7. UPSA (1); 8. WHO-DAS 2.0 (1) | From all functioning scales that were reported in the included studies, we searched for studies that assessed the psychometric quality of each scale based on a population comparable to that of the included study that assessed the specific outcome (i.e. ‘reference group studies’). We used the reference group studies as a basis to calculate percentile scores of the level of functioning (LOF) at baseline from our included studies. By calculating percentile scores for each assessment instrument, we achieved homogeneous moderators assessed in the same scale range. Finally, we calculated the median percentile score of the normative percentile scores in order to cluster all studies into low LOF (i.e. below-median LOF) and high LOF (i.e. above-median LOF) groups. | Low LOF (8 studies): Christensen 2019; Davis 2012; Davis 2018; Drake 1996; Drake 1999; Hellström 2017; Hoffmann 2012; Howard 2010. High LOF (7 studies): Killackey 2008; Killackey 2019; Latimer 2006; Mueser 2004; Reme 2019; Twamley 2012; Zhang 2017
Age | ⩽ 25th percentile (34.63 years) vs ⩾ 75th percentile (41.05 years) | Not applicable | Based on the median and IQR we assessed which study samples' mean age was equal to or below the 25th percentile and which study samples' mean age was equal to or higher than the 75th percentile. | Age ⩽ 25th percentile (8 studies): Christensen 2019; Erickson 2020; Hoffmann 2012; Killackey 2008; Killackey 2019; Waghorn 2014; Wong 2008; Zhang 2017. Age >75th percentile (8 studies): Bond 2015; Davis 2018; Drake 2013; Lehman 2002; Mueser 2004; Poremski 2015; Twamley 2012; Viering 2015
Comorbid alcohol use | Low comorbid alcohol use vs high comorbid alcohol use | Percentage (%) comorbid alcohol use | We extracted the percentage of participants with comorbid alcohol use at baseline from the included studies that reported this construct and we calculated the median percentage of comorbid alcohol use from all included studies (i.e. 22%). All studies with a comorbid alcohol use above the median were clustered in the ‘high comorbid alcohol use’ group; all studies with a comorbid alcohol use below the median were clustered in the ‘low comorbid alcohol use’ group. | High comorbid alcohol use (6 studies): Bond 2007; Davis 2012; Davis 2018; Lones 2017; Poremski 2015; Twamley 2012. Low comorbid alcohol use (6 studies): Bond 2015; Drake 1996; Drake 1999; Gold 2006; Latimer 2006; Mueser 2004
Comorbid substance use | Low comorbid substance use vs high comorbid substance use | Percentage (%) comorbid substance use | We extracted the percentage of participants with comorbid substance use at baseline from the included studies that reported this construct and we calculated the median percentage of comorbid substance use from all included studies (i.e. 22%). All studies with a comorbid substance use above the median were clustered in the ‘high comorbid substance use’ group; all studies with a comorbid substance use below the median were clustered in the ‘low comorbid substance use’ group. | High comorbid substance use (8 studies): Bond 2015; Davis 2012; Killackey 2008; Killackey 2019; Lehman 2002; Lones 2017; Poremski 2015; Twamley 2012. Low comorbid substance use (7 studies): Bond 2007; Drake 1996; Drake 1999; Gold 2006; Hoffmann 2012; Latimer 2006; Mueser 2004
Work experience | High work experience vs low work experience | 1. Number of months worked past 5 years (8); 2. % worked in past 5 years (10) | Work experience was assessed by extracting the number of months that participants had worked in competitive employment in the past 5 years or the percentage of participants that had worked in competitive employment in the past 5 years. For both constructs we calculated the median and labelled a study as ‘high work experience’ if the number or percentage was above the median and as ‘low work experience’ if the number or percentage was below the median. The median number of months that participants had worked in competitive employment in the past 5 years was 16.4 months, and the median percentage of participants that had worked in competitive employment in the past 5 years was 49%. If a study assessed both the number of months and the percentage of participants that worked in the past 5 years and one construct scored above and the other below the median, we used the construct that differed most from the median to label work experience for that study. | High work experience (8 studies): Bond 2015; Burns 2007; Davis 2012; Drake 1996; Gold 2006; Howard 2010; Lones 2017; Michon 2014. Low work experience (8 studies): Bejerholm 2014; Bond 2007; Christensen 2019; Drake 1999; Latimer 2006; Lehman 2002; Mueser 2004; Wong 2008
Education level | Low education level vs high education level | ISCED level of education | The percentage of participants with specific education levels was extracted from the included studies. As studies were executed in different countries and each country has a different structure of education levels, we chose to label each education level based on the ISCED levels of education and calculated the percentage of participants with an ISCED level 5 or higher (tertiary education) within each study. We calculated the median percentage (i.e. 36.7%) of participants with an ISCED level of 5 or higher and labelled the studies below the median as ‘low education level’ and the studies above the median as ‘high education level’. | Low education level (11 studies): Bejerholm 2017; Bond 2015; Christensen 2019; Gold 2006; Hellström 2017; Hoffmann 2012; Mueser 2004; Reme 2019; Tsang 2009; Wong 2008; Zhang 2017. High education level (10 studies): Bond 2007; Davis 2012; Davis 2018; Drake 1996; Drake 2013; Erickson 2020; Latimer 2006; Lones 2017; Poremski 2015; Waghorn 2014

BPRS, Brief Psychiatric Rating Scale; CAPS-IV, Clinician-Administered PTSD Scale for DSM-IV; DSM-IV or -5, Diagnostic and Statistical Manual 4 or 5; DTS, Davidson Trauma Scale; GAF, Global Assessment of Functioning; GAS, Global Assessment Scale; HADS, Hospital Anxiety and Depression Scale; HDRS, Hamilton Depression Rating Scale; ICD-10, International Classification of Diseases – 10; MADRS, Montgomery-Åsberg Depression Self Rating Scale; MHI-5, Mental Health Inventory – 5; MINI, Mini-International Neuropsychiatric Interview; PANSS, Positive and Negative Symptom Scale; PCL-5, PTSD checklist for DSM-5; SANS, Scale for the Assessment of Negative Symptoms; SAS-II, Simpson Angus Scale – II; SCAN, Structured Clinical Assessment in Neuropsychiatry; SCID, The Structured Clinical Interview for DSM-5; SOFAS, Social and Occupational Functioning Assessment Scale; SF-12, Short Form Health Survey 12; UPSA, UCSD performance-based skills assessment; WHO-DAS 2.0, World Health Organization Disability Assessment Scale 2.0.

Clinical, functional and personal characteristics. We identified eight other moderators of outcomes: duration of illness, the severity of symptoms, level of functioning, age, comorbid alcohol and substance use, education level and employment history. We assessed these moderators at baseline and operationalised those into subgroups. Subgroups were generally assessed based on the available data in the included studies in this meta-analysis, in order to achieve equally distributed subgroups. Criteria for the operationalisation into subgroups of each moderator are described in Table 1.
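The median-split operationalisation used for several moderators in Table 1 can be illustrated with a minimal Python sketch. The function name, the study labels and the values are hypothetical and are not data from the included studies. Leaving studies that fall exactly on the median unassigned is an assumption, as the text does not describe how ties were handled.

```python
from statistics import median


def median_split(values):
    """Cluster studies into 'low' and 'high' groups around the median.

    `values` maps a study label to its baseline value for one moderator
    (for example, percentage of comorbid alcohol use). Studies exactly at
    the median stay unassigned (an assumption of this sketch).
    """
    cutoff = median(values.values())
    groups = {"low": [], "high": []}
    for study, value in values.items():
        if value < cutoff:
            groups["low"].append(study)
        elif value > cutoff:
            groups["high"].append(study)
    return cutoff, groups


# Hypothetical baseline percentages, for illustration only.
example = {"Study A": 10.0, "Study B": 18.0, "Study C": 26.0, "Study D": 40.0}
cutoff, groups = median_split(example)
print(cutoff, groups)
```

Because the cut-off is derived from the included studies themselves, the resulting subgroups are roughly equal in size, which matches the stated aim of equally distributed subgroups.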

Risk of bias assessment

We assessed the risk of bias for each study using the Cochrane Collaboration risk of bias assessment tool (Higgins and Green, 2008). Potential bias (i.e., high, low or unclear) was assessed as a judgment for individual elements from five domains (selection, performance, attrition, reporting and other bias). Author LdW rated all studies and CC independently rated the risk of bias of 50% of all studies. The inter-rater reliability (Cohen's kappa; McHugh, 2012) was substantial (κ = 0.61; Landis and Koch, 1977) and disagreements were resolved through consensus.
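For readers unfamiliar with Cohen's kappa, the following minimal Python sketch shows how the statistic corrects observed agreement between two raters for agreement expected by chance. The ratings are hypothetical and do not reproduce the authors' data or their κ of 0.61.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who judged the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of items.")
    n = len(rater_a)
    # Proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical risk-of-bias judgments for six items.
rater_1 = ["low", "low", "high", "unclear", "low", "high"]
rater_2 = ["low", "high", "high", "unclear", "low", "low"]
print(round(cohens_kappa(rater_1, rater_2), 3))
```

Under the Landis and Koch (1977) benchmarks, values from 0.61 to 0.80 are labelled substantial agreement.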

Statistical analysis

Meta-analytic procedure

Meta-analyses were conducted using RevMan 5.3 (The Nordic Cochrane Centre, 2014). We assessed the effectiveness of IPS by analysing differences between IPS and the control group over the study period by calculating the standardised mean difference (d) for continuous outcomes (i.e., job duration and wages) and the odds ratio (OR) for categorical outcomes (i.e., employment rate). For studies reporting multiple outcome assessments for related outcome measures, we pooled the effect sizes into an overall effect size. We used random-effects models, weighted by the method of inverse variance (Higgins and Green, 2008). The magnitude of effect sizes was assessed based on the criteria described by Chinn (2000). Statistical heterogeneity was assessed by calculating the I² statistic (Higgins and Thompson, 2002). We performed the overall meta-analysis within separate subgroups based on the type of control group, follow-up duration and geographical region. We controlled for the potential influence of these factors using an analysis of subgroup differences (Borenstein and Higgins, 2013). One study had both an active and passive control group (Mueser et al., 2004), and multiple studies both followed service users after ⩽ 12 months and >12 months of follow-up. For the overall meta-analysis we pooled the effect sizes for all control groups or follow-up assessments within the study into one overall effect size, but we analysed both effect sizes separately during the analysis of subgroup differences, controlling for methodological confounders.
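The pooling steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the RevMan implementation. The DerSimonian-Laird estimator of between-study variance and the 0.5 continuity correction for zero cells are assumptions of the sketch. The conversion from log OR to d follows the approximation attributed to Chinn (2000), and whether the authors applied exactly this conversion is not stated. All function names and trial counts are hypothetical.

```python
import math


def log_odds_ratio(events_ips, n_ips, events_ctl, n_ctl):
    """Log odds ratio and its variance from employed / not-employed counts."""
    a, b = events_ips, n_ips - events_ips
    c, d = events_ctl, n_ctl - events_ctl
    if 0 in (a, b, c, d):
        # Continuity correction for zero cells (an assumption of this sketch).
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool_random_effects(effects, variances):
    """Inverse-variance random-effects pooling (DerSimonian-Laird assumed)."""
    weights = [1 / v for v in variances]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    scale = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - df) / scale) if scale > 0 else 0.0
    # Random-effects weights add the between-study variance to each study.
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    # I² = percentage of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"pooled": pooled, "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
            "tau2": tau2, "q": q, "i2": i2}


def log_or_to_d(log_or):
    """Approximate conversion of a log odds ratio to d (Chinn, 2000)."""
    return log_or * math.sqrt(3) / math.pi


# Hypothetical trials: (employed IPS, N IPS, employed control, N control).
trials = [(30, 60, 15, 60), (45, 100, 20, 100), (12, 40, 9, 40)]
effects, variances = zip(*(log_odds_ratio(*t) for t in trials))
result = pool_random_effects(effects, variances)
print(math.exp(result["pooled"]), result["i2"])
```

Pooling is carried out on the log scale and the pooled estimate is exponentiated to report an OR.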

Calculating moderating effects

We analysed moderating effects through a sensitivity analysis of subgroup differences (Borenstein and Higgins, 2013), in which we compared subgroup outcomes with high levels or presence of the moderator versus those with low levels or absence of the moderator (see Table 1). Furthermore, the positive or negative influence of specific subgroups on employment outcomes was assessed by investigating which subgroups' confidence intervals of treatment effect exceeded the upper (‘positive’ influence) or lower (‘negative’ influence) bound of the confidence interval of the overall effect size of treatment effect.
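A test of subgroup differences between two subgroups (for example, high versus low levels of a moderator) can be sketched in Python as follows. The sketch assumes that each subgroup has already been pooled to an estimate with a standard error on the analysis scale (log OR or d), and that the two subgroups contain independent studies. The function name and the numbers are hypothetical.

```python
import math


def subgroup_difference(est_1, se_1, est_2, se_2):
    """Compare two independent pooled subgroup estimates.

    Returns the between-subgroup Q statistic (chi-square with 1 degree of
    freedom for two subgroups) and its two-sided p-value.
    """
    se_diff = math.sqrt(se_1 ** 2 + se_2 ** 2)
    z = (est_1 - est_2) / se_diff
    q_between = z ** 2
    # For 1 degree of freedom, the chi-square tail equals the two-sided normal tail.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return q_between, p_value


# Hypothetical pooled log odds ratios for two subgroups.
q_between, p_value = subgroup_difference(1.2, 0.3, 0.0, 0.4)
print(q_between, p_value)
```

With more than two subgroups, Q between subgroups is compared with a chi-square distribution with (number of subgroups − 1) degrees of freedom, which requires a general chi-square tail function not included in this sketch.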

Outliers and publication bias

We addressed the potential influence of outliers (i.e., if the confidence interval [CI] of an individual study outcome exceeded the CI of the overall effect size) by comparing the overall effect size of the outcome, including the outliers, with the overall effect size when outliers are removed through an analysis of subgroup differences. Potential publication bias was detected by visual inspection of funnel plots.
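One possible reading of the outlier rule is that a study counts as an outlier when its CI lies entirely outside the CI of the overall effect size. The minimal Python sketch below implements that reading only. It is an interpretation, not a description of the authors' exact procedure, and the study labels and intervals are hypothetical.

```python
def flag_outliers(study_cis, overall_ci):
    """Return studies whose CI does not overlap the overall CI.

    `study_cis` maps a study label to a (lower, upper) interval and
    `overall_ci` is the (lower, upper) interval of the pooled effect.
    """
    overall_low, overall_high = overall_ci
    return [
        study
        for study, (low, high) in study_cis.items()
        if low > overall_high or high < overall_low
    ]


# Hypothetical intervals on the log odds ratio scale.
study_cis = {
    "Study A": (0.1, 0.9),
    "Study B": (1.5, 2.4),
    "Study C": (-1.2, -0.3),
}
print(flag_outliers(study_cis, overall_ci=(0.3, 0.8)))
```

The flagged studies would then be removed and the pooled effect re-estimated, so that the effect sizes with and without outliers can be compared.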

Results

Study flow

Of the 1333 records retrieved through database search and reference tracking, 1170 records were excluded after the title and abstract screening. Of the remaining 163 reports, 115 reports were excluded after full-text selection (see Fig. 1 for reasons of exclusion). The remaining 48 reports reported the results of 32 studies.
Fig. 1.

Flow chart of study selection in accordance with PRISMA guidelines.

Study characteristics

As shown in Table 2, the 32 studies included 3818 participants receiving IPS and 3847 participants receiving a control intervention. The mean age of the aggregated sample (n = 7665) was 38.9 years (study range: 20.4–51.0); 44.1% of the participants were female. A total of 3454 (45.1%) participants were diagnosed with a schizophrenia spectrum disorder (SSD), and 2587 (33.8%) had a main diagnosis of major depressive disorder (MDD). The remaining 1624 (21.2%) had other diagnoses, such as anxiety disorder, PTSD, SUD or personality disorders. Twenty studies met the criteria for SMI and five studies met the criteria for CMD.
Table 2.

Characteristics of included studies

| Study [a] | N (IPS versus control(s)) | Comparison group [b] | N sites | Country | Follow-up assessment | Study attrition | Age (M (s.d.)) | Primary diagnosis | IPS fidelity score [b] | Outcomes |
|---|---|---|---|---|---|---|---|---|---|---|
| 1. Bejerholm 2014 [A,B] | 60–60 | Traditional vocational rehabilitation (A) | 6 | Sweden | 18 months | 27.50% | 38.0 (8.0) | Schizophrenia and other psychosis: 64.7%; Bipolar disorder: 7.6%; Other diagnoses: 27.7% | IPS 25-item scale [c]: 6 months: 110 (G); 12 months: 115 (E); 18 months: 117 (E) | Competitive employment rate; Job duration |
| 2. Bejerholm 2017 [C] | 33–28 | Traditional vocational rehabilitation (A) | 4 | Sweden | 6 & 12 months | 4.90% | 41.0 (11.0) | Depression: 68.9%; Bipolar disorder: 31.2% | IPS 25-item scale [c]: 12 months: 106 (G) | Competitive employment rate; Job duration |
| 3. Bond 2007 [D,E] | 96–98 | Diversified placement approach (A) | 2 | USA | 24 months | 25.30% | 38.8 (9.6) | Schizophrenia: 39.0%; Schizoaffective disorder: 17.1%; Depression: 17.1%; Bipolar disorder: 24.1%; Other diagnoses: 2.7% | IPS 15-item scale [c]: Endpoint: 72 (E) | Competitive employment rate; Job duration |
| 4. Bond 2015 [F] | 45–45 | Work Choice (A) | 2 | USA | 12 months | 5.60% | 43.8 (11.5) | Schizophrenia: 52.9%; Depression: 18.4%; Bipolar: 25.3%; Other diagnoses: 3.5% | IPS 25-item scale [c]: Good fidelity (exact scores not reported) | Competitive employment rate; Job duration |
| 5. Burns 2007 [G,H,I] | 156–156 | Vocational service (A) | 6 | Bulgaria / Germany / Italy / Netherlands / Switzerland / UK | 18 months | 19.20% | 37.8 (9.9) | Schizophrenia/schizoaffective disorder: 80.3%; Bipolar disorder: 16.5%; Other psychotic disorders: 3.2% | IPS 15-item scale [c]: Endpoint: 65 (G) | Competitive employment rate; Job duration |
| 6. Christensen 2019 [K] | 243–239 | SAU (vocational rehabilitation) (A) | 5 | Denmark | 18 months | 27.00% | 33.1 (10.1) | Schizophrenia spectrum disorder: 76.8%; Bipolar disorder: 11.8%; Recurrent depression: 11.4% | IPS 25-item scale [c]: Fidelity score range between IPS programmes: 75–101 (fair to good fidelity) | Competitive employment rate; Job duration; Wages |
| 7. Davis 2012 [M,N] | 42–43 | Standard VA Vocational Rehabilitation Programme (A) | 1 | USA | 12 months | 16.50% | 40.2 (12.1) | Post-traumatic stress disorder: 100% | IPS 15-item scale [c]: Study period: 61 (F) | Competitive employment rate; Job duration; Wages |
| 8. Davis 2018 [O,P] | 271–270 | Transitional work programme (A) | 12 | USA | 2.3; 4.6; 6.9; 9.2; 11.5; 13.8; 16.2 & 18 months | 19.20% | 42.2 (11.0) | Post-traumatic stress disorder: 100% | IPS 15-item scale [c]: Study period: 63–69 (Fair to Good) | Competitive employment rate; Job duration; Wages |
| 9. Drake 1996 [J,L] | 74–69 | Group Skills Training (A) | 2 | USA | 18 months | 2.10% | 37.0 (9.5) | Schizophrenia or a related psychotic disorder: 46.9%; Bipolar or other severe mood disorder: 42.7%; Other diagnoses: 10.5% | Fidelity scale and assessment score unclear and not reported | Competitive employment rate; Job duration; Wages |
| 10. Drake 1999 [Q,R] | 76–76 | Enhanced vocational rehabilitation (A) | 2 | USA | 18 months | 1.30% | 39.4 (7.1) | Schizophrenia spectrum disorder: 67.1%; Bipolar disorder: 13.8%; Depressive disorder: 16.5%; Other Axis I disorder: 2.6% | IPS 15-item scale [c]: Exact scores and ratings not reported | Competitive employment rate; Job duration; Wages |
| 11. Drake 2013 [S,T] | 1121–1117 | Care as usual (P) | 23 | USA | 18 & 24 months | 8.20% | 43.5 (NR [b]) | Schizophrenia: 29.7%; Affective disorder: 70.3% | IPS 15-item scale [c]: Percentage of IPS programmes with good fidelity: Year 1: 77%; Year 2: 86%; Year 3: 98% | Competitive employment rate; Job duration; Wages |
| 12. Erickson 2021 [U] | 56–53 | Treatment as Usual (P) | 12 | Canada | 6 & 12 months | 9.20% | 23.1 (3.4) | Schizophrenia: 37.6%; Schizophreniform disorder: 4.6%; Schizoaffective disorder: 8.3%; Bipolar disorder: 18.4%; Major depression: 9.2%; Psychosis NOS: 15.6%; Substance-induced psychosis: 4.6%; Delusional disorder: 0.9%; Aspergers syndrome: 0.9% | IPS 25-item scale [c]: 1 year: 100 (G); 2 year: 110 (G) | Competitive employment rate; Job duration |
| 13. Gold 2006 [V] | 66–77 | Supported Employment Programme (A) | 1 | USA | 24 months | 24.50% | 35.5 (NR [b]) | Schizophrenia spectrum disorder: 68.5%; Mood disorder: 31.5% | IPS 15-item scale [c]: Study period: 69 (G) | Competitive employment rate |
| 14. Hellström 2017 [W] | 162–164 | Job Centre services as usual (A) | NR [b] | Denmark | 12 & 24 months | 29.80% | 35.0 (10.5) | Depression: 69.0%; Phobic anxiety: 7.7%; Other anxiety: 12.0%; Bipolar disorder: 11.4% | IPS-MA 21-item fidelity score: Score 102 out of 105 | Job duration |
| 15. Howard 2010 [X,Y] | 109–110 | Treatment as Usual (P) | 2 | United Kingdom (UK) | 12 & 24 months | 13.70% | 38.3 (9.4) | Psychotic disorder: 72.5%; Mood disorder: 27.5% | IPS 15-item scale [c]: Study period: 68 (G) | Competitive employment rate |
| 16. Hoffmann 2012 [Z,AA,AB] | 46–54 | Traditional vocational rehabilitation programmes (A) | 1 | Switzerland | 24 & 60 months | 12.00% | 33.8 (9.4) | Schizophrenia spectrum disorder: 38.0%; Affective disorder: 41.0%; Other diagnosis: 21.0% | IPS 15-item scale [c]: Study period: 66–68 (G) | Competitive employment rate; Job duration; Wages |
| 17. Killackey 2008 [AC] | 20–21 | Treatment as Usual (P) | 1 | Australia | 6 months | 0.00% | 21.4 (2.3) | Schizophrenia-spectrum disorder: 100% | IPS 15-item scale [c]: Study period: 68 (G) | Competitive employment rate; Job duration; Wages |
| 18. Killackey 2019 [AD] | 73–73 | Treatment as Usual (P) | 1 | Australia | 18 months | 13.00% | 20.4 (2.4) | Schizophreniform/schizophrenia: 43.8%; Schizoaffective disorder: 13.0%; Major depressive disorder, psychotic features: 11.6%; Bipolar disorder: 13.7%; Psychosis NOS: 11.6%; Other diagnoses: 6.2% | IPS 25-item scale [c]: Good fidelity (exact scores not reported) | Competitive employment rate |
| 19. Latimer 2006 [AE] | 75–75 | Usual services (P) | 1 | Canada | 12 months | 16.70% | 40.2 (10.0) | Schizoaffective disorder: 16.8%; Other schizophrenia spectrum disorders: 59.1%; Bipolar disorder: 20.1%; Other diagnoses: 4.0% | IPS 15-item scale [c]: Study period: 71 (E) | Competitive employment rate; Job duration; Wages |
| 20. Lehman 2002 [AF] | 113–116 | Psychosocial rehabilitation programme (A) | 1 | USA | 24 months | 31.10% | 41.5 (8.5) | Psychotic disorder: 78.3%; Mood disorders: 21.7% | IPS 15-item scale [c]: Study period: 69–71 (G) | Competitive employment rate; Wages |
| 21. Lones 2017 [AG] | 22–23 | Waitlist plus treatment as usual (P) | 1 | USA | 6 & 12 months | 22.20% | 37.1 (10.6) | Moderate-to-severe opioid use disorder: 100% | IPS 25-item scale [c]: Study period: 85 (F) | Competitive employment rate; Job duration; Wages |
| 22. Michon 2014 [AH] | 71–80 | Traditional Vocational Rehabilitation (A) | 4 | Netherlands | 6; 18 & 30 months | 43.10% | 34.9 (10.5) | Psychotic disorder: 54.3%; Other diagnoses: 45.7% | Quality of Supported Employment Implementation Scale (QSEIS): 2 IPS programmes Good fidelity; 2 IPS programmes Fair fidelity | Competitive employment rate |
| 23. Mueser 2004 [AI,AJ,AK] | 68–67–69 | 1. Psychiatric Rehabilitation Centre (A); 2. Standard Services (P) | 1 | USA | 24 months | 18.60% | 41.2 (9.2) | Schizophrenia: 53.4%; Schizoaffective disorder: 21.1%; Major depression: 17.2%; Bipolar disorder: 4.9%; Personality disorder: 1.0%; Other diagnoses: 2.5% | IPS 15-item scale [c]: Study period: 71 (G) | Competitive employment rate; Job duration; Wages |
| 24. Oshima 2014 [AL] | 18–19 | Conventional vocational rehabilitation (A) | 1 | Japan | 6 months | 0.00% | 40.6 (8.9) | NR [b] | IPS 15-item scale [c]: Study period: 68 (G) | Competitive employment rate; Job duration; Wages |
| 25. Poremski 2017 [AM] | 45–45 | Treatment as Usual (P) | 1 | Canada | 8 months | 5.60% | 46.2 (10.0) | Major depressive disorder: 64.4%; Psychotic disorder: 22.2%; Panic disorder: 5.6%; Mania-hypomania: 4.4%; Post-traumatic stress disorder: 3.3% | IPS 25-item scale [c]: Study period: 100 (G) | Competitive employment rate; Wages |
| 26. Reme 2019 [AN] | 229–181 | High quality treatment as usual (A) | 6 | Norway | 12 & 18 months | 0.50% | 35.0 (10.8) | Psychotic disorder: 27.1%; Bipolar disorder: 13.9%; Major depression: 40.0%; Anxiety disorder: 40.5%; Alcohol/drug abuse: 18.3%; Other diagnosis: 8.3% | IPS 25-item scale [c]: Good fidelity (exact scores not reported) | Competitive employment rate |
| 27. Tsang 2009 [AO,AP] | 65–66 | Traditional vocational rehabilitation (A) | 5 | Hong Kong | 7; 11 & 15 months | 31.80% | 34.9 (8.5) | Schizophrenia: 76.7%; Other diagnoses: 23.3% | IPS 15-item scale [c]: Study period: 65–68 (G) | Competitive employment rate |
| 28. Twamley 2012 [AQ] | 30–28 | Conventional vocational rehabilitation (P) | 1 | USA | 12 months | 20.70% | 51.0 (4.3) | Schizophrenia: 39.7%; Schizoaffective disorder: 60.3% | IPS 15-item scale [c]: Study period: 63 (F) | Competitive employment rate |
| 29. Viering 2015 [AR] | 127–123 | Other vocational services (P) | 1 | Switzerland | 24 months | 31.60% | 42.6 (10.6) | Mood affective disorder: 47.2%; Schizophrenia/schizoaffective disorder: 15.6%; Personality disorder: 17.2%; Other diagnoses: 18.0% | IPS 15-item scale [c]: Study period: 61 (F) | Competitive employment rate; Job duration |
| 30. Waghorn 2014 [AS] | 106–102 | Non-integrated forms of supported employment (A) | 5 | Australia | 12 months | 44.20% | 32.4 (8.9) | Psychotic disorder: 80.8%; Bipolar disorder: 8.2%; Major depression or anxiety disorder: 6.3% | IPS 15-item scale [c]: Study period: 69 (G) | Competitive employment rate; Job duration |
| 31. Wong 2008 [AT] | 46–46 | Conventional vocational rehabilitation (A) | 1 | Hong Kong | 6; 12 & 18 months | 1.10% | 33.6 (9.2) | Schizophrenia spectrum disorder: 69.6%; Affective disorder: 18.5%; Other diagnoses: 12.0% | IPS 15-item scale [c]: Study period: 69 (G) | Competitive employment rate; Job duration; Wages |
| 32. Zhang 2017 [AU] | 54–54 | Traditional Vocational Rehabilitation (P) | 1 | China | 15 months | NR [b] | 32.8 (8.3) | Schizophrenia: 100% | IPS 15-item scale [c]: Study period: 67 (G) | Competitive employment rate |

[a] References of reports of included studies: A. Areberg and Bejerholm (2013); B. Bejerholm et al. (2015); C. Bejerholm et al. (2017); D. Bond et al. (2007); E. Bond et al. (2013); F. Bond et al. (2015); G. Burns et al. (2007); H. Burns and Cathy (2008); I. Kilian et al. (2012); J. Drake et al. (1996); K. Christensen et al. (2019); L. Clark et al. (1998); M. Davis et al. (2012); N. Davis et al. (2014); O. Davis et al. (2018); P. Davis et al. (2018); Q. Drake et al. (1999); R. Dixon et al. (2002); S. Drake et al. (2013); T. Metcalfe et al. (2018); U. Erickson et al. (2021); V. Gold et al. (2006); W. Hellström et al. (2017); X. Howard et al. (2010); Y. Heslin et al. (2011); Z. Hoffmann et al. (2012); AA. Hoffmann et al. (2014); AB. Jäckel et al. (2017); AC. Killackey et al. (2008); AD. Killackey et al. (2019); AE. Latimer et al. (2006); AF. Lehman et al. (2002); AG. Lones et al. (2017); AH. Michon et al. (2014); AI. Mueser et al. (2001); AJ. Mueser et al. (2004); AK. Mueser et al. (2014); AL. Oshima et al. (2014); AM. Poremski et al. (2017); AN. Reme et al. (2019); AO. Tsang et al. (2009); AP. Tsang et al. (2011); AQ. Twamley et al. (2012); AR. Viering et al. (2015); AS. Waghorn et al. (2014); AT. Wong et al. (2008); AU. Zhang et al. (2017).

[b] A, active control group; E, excellent fidelity; F, fair fidelity; G, good fidelity; NR, not reported; P, passive control group.

[c] IPS 15-item scale (Bond et al., 1997): item scale range: 15–75; Fidelity ratings: <55 = No IPS; 56–65 = Fair fidelity (F); >65 = Good fidelity (G). IPS 25-item scale (Bond, Peterson, Becker and Drake, 2012): item scale range: 25–125; Fidelity ratings: <74 = No IPS; 74–99 = Fair fidelity (F); 100–114 = Good fidelity (G); 115–125 = Exemplary fidelity (E).

Twenty-one studies compared IPS with an active control group and 12 studies compared IPS with a passive control group (including one study with both a passive and an active control group).
The overall study attrition rate (i.e., lost to follow-up) was 16.3% and only two studies reached a ‘high’ attrition rate exceeding 40%. Instruments for fidelity assessment differed between studies (see Table 2), but the majority of the studies (75.0%) achieved at least ‘good’ IPS programme fidelity.

Quality assessment

Quality assessment is reported in Fig. 2. Overall, we found low levels of selection and attrition bias, but relatively higher levels of performance and detection bias. The majority of studies (81.3%) had a low risk of selection bias (i.e., random sequence generation and allocation concealment). In the majority of studies, participants and personnel were not blinded (53.1%) or information about blinding was unclear (43.4%). However, given the nature of the intervention and the study design, proper blinding of participants was generally not feasible, so a certain level of performance bias was inevitable. In nine studies (29.0%) the outcome assessors were not blinded, indicating a high risk of detection bias. This is a relatively large number of studies compared with the other risk of bias domains. However, the outcomes used in our meta-analysis (i.e., employment rate, job duration and wages) are objective measures that are not sensitive to the interpretation of the outcome assessor, so this might not have had a large influence on the study outcomes. In most studies (67.7%) we found a low risk of attrition bias (i.e., incomplete outcome data). Five studies reported other sources of bias: three studies reported baseline differences between the IPS and control groups that potentially influenced outcomes; one study indicated the potential influence of allegiance bias, because specialists favoured one intervention over the other; and one study had a low fidelity score during the first part of the study, which may have negatively influenced outcomes at the start of the study. There were no indications of selective outcome reporting in any of the 32 studies.
Fig. 2.

Cochrane risk of bias assessment.


Overall meta-analysis

Thirty-one studies reported employment rate outcomes (see Table 3). A higher percentage of IPS participants (48.8%) than control group participants (28.3%) were employed during follow-up, a small effect (OR = 2.62 [2.37–2.89], p < 0.01). Outcomes were moderately heterogeneous (I² = 74% [67–80%]; p < 0.01). The overall effect size for employment rate was not influenced by follow-up duration. However, employment rate outcomes were more favourable for IPS in non-European studies than in European studies (χ² = 10.54; p < 0.01), and in studies that compared IPS with an active control group than in studies that used a passive control group (χ² = 10.77; p < 0.01).
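As a rough plausibility check on these employment figures, a crude odds ratio can be computed directly from the aggregated counts in the "Overall outcomes" row of Table 3 (1745 of 3578 IPS participants and 1013 of 3578 control participants employed). The sketch below is illustrative only: the function name is hypothetical, the confidence interval uses a simple Woolf-type approximation, and a crude odds ratio ignores between-study weighting, so it is not expected to reproduce the pooled meta-analytic estimate exactly.

```python
import math


def crude_odds_ratio(events_a, n_a, events_b, n_b, z=1.96):
    """Crude (unweighted) odds ratio of group A versus group B with a Woolf-type 95% CI.

    Hypothetical helper for illustration; not the pooling method used in the meta-analysis.
    """
    a, b = events_a, n_a - events_a  # group A: employed / not employed
    c, d = events_b, n_b - events_b  # group B: employed / not employed
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Aggregated counts taken from the "Overall outcomes" row of Table 3.
or_crude, ci_low, ci_high = crude_odds_ratio(1745, 3578, 1013, 3578)
print(f"crude OR = {or_crude:.2f} [{ci_low:.2f}-{ci_high:.2f}]")
```

The crude value comes out at about 2.4, somewhat below the pooled OR of 2.62; a gap of this kind is unsurprising when study-level estimates are weighted rather than counts simply summed.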
Table 3.

Overall meta-analysis of outcomes

Employment rate (% employed at FU [b]; effect size of outcome [c]; heterogeneity)

| Comparison | Follow-up (FU) subgroup | N studies [a] | IPS n (%) | IPS N | Control n (%) | Control N | OR [d] | 95% CI | p | I² | 95% CI | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| IPS vs. active control condition | All studies | 20 | 949 (50.1%) | 1893 | 496 (26.7%) | 1859 | 3.15 [S] | 2.74–3.62 | <0.01 | 79% | 71–84% | <0.01 |
| | ⩽ 12 month FU | 10 | 353 (38.3%) | 921 | 182 (20.8%) | 876 | 2.50 [S] | 2.07–3.02 | <0.01 | 68% | 49–80% | <0.01 |
| | > 12 month FU | 15 | 895 (54.2%) | 1652 | 518 (31.9%) | 1625 | 2.88 [S] | 2.48–3.36 | <0.01 | 81% | 72–87% | <0.01 |
| | European studies (including UK) | 7 | 345 (42.2%) | 817 | 199 (25.5%) | 781 | 2.27 [S] | 1.84–2.79 | <0.01 | 62% | 32–79% | <0.01 |
| | Non-European studies | 13 | 604 (56.1%) | 1076 | 297 (27.6%) | 1077 | 4.09 [M] | 3.40–4.92 | <0.01 | 79% | 68–86% | <0.01 |
| IPS vs. passive control condition | All studies | 12 | 846 (48.3%) | 1753 | 525 (29.4%) | 1788 | 2.26 [S] | 1.97–2.68 | <0.01 | 61% | 43–74% | <0.01 |
| | ⩽ 12 month FU | 7 | 144 (42.5%) | 339 | 80 (23.6%) | 338 | 2.12 [S] | 1.52–2.96 | <0.01 | 65% | 37–81% | <0.01 |
| | > 12 month FU | 6 | 719 (50.8%) | 1414 | 454 (31.3%) | 1450 | 2.30 [S] | 1.97–2.68 | <0.01 | 55% | 17–76% | 0.05 |
| | European studies (including UK) | 3 | 112 (30.3%) | 370 | 78 (21.3%) | 367 | 1.64 [S] | 1.18–2.27 | <0.01 | 34% | 0–74% | 0.22 |
| | Non-European studies | 9 | 734 (53.1%) | 1383 | 447 (31.5%) | 1421 | 2.44 [S] | 2.09–2.85 | <0.01 | 62% | 38–76% | <0.01 |
| Overall outcomes | | 31 | 1745 (48.8%) | 3578 | 1013 (28.3%) | 3578 | 2.62 [S] | 2.37–2.89 | <0.01 | 74% | 67–80% | <0.01 |

Test for subgroup differences (employment rate):
Active vs passive control condition: χ² = 10.77; df = 1; p < 0.01
⩽ 12 month vs. > 12 month follow-up: χ² = 0.29; df = 1; p = 0.59
European vs. non-European studies: χ² = 10.54; df = 1; p < 0.01

Job duration (M (s.d.) at FU [b]; effect size of outcome [c]; heterogeneity)

| Comparison | Follow-up subgroup | N studies [a] | IPS M (s.d.) | IPS N | Control M (s.d.) | Control N | d [d] | 95% CI | p | I² | 95% CI | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| IPS vs. active control condition | All studies | 17 | 24.5 (11.4) | 1571 | 10.9 (10.0) | 1565 | 0.47 [S] | 0.33–0.61 | <0.01 | 81% | 73–87% | <0.01 |
| | ⩽ 12 month follow-up | 7 | 27.1 (13.6) | 674 | 15.2 (13.9) | 666 | 0.42 [S] | 0.16–0.68 | <0.01 | 76% | 51–88% | <0.01 |
| | > 12 month follow-up | 10 | 22.7 (9.9) | 897 | 7.9 (5.1) | 899 | 0.47 [S] | 0.30–0.63 | <0.01 | 82% | 73–89% | <0.01 |
| | European studies | 7 | 21.0 (12.2) | 688 | 10.2 (8.0) | 694 | 0.40 [S] | 0.20–0.61 | <0.01 | 83% | 68–91% | <0.01 |
| | Non-European studies | 10 | 27.8 (10.8) | 832 | 12.4 (11.5) | 828 | 0.52 [M] | 0.33–0.71 | <0.01 | 77% | 62–86% | <0.01 |
| IPS vs. passive control condition | All studies | 7 | 27.1 (12.7) | 1354 | 17.1 (13.7) | 1393 | 0.31 [S] | 0.12–0.49 | <0.01 | 69% | 42–83% | <0.01 |
| | ⩽ 12 month follow-up | 4 | 26.4 (17.9) | 155 | 18.0 (16.9) | 152 | 0.23 [S] | 0.07–0.40 | <0.01 | 0% | 0–78% | 0.63 |
| | > 12 month follow-up | 3 | 28.1 (1.9) | 1199 | 16.0 (11.3) | 1241 | 0.36 [S] | 0.02–0.71 | <0.05 | 87% | 56–96% | <0.01 |
| | European studies | 1 | 29.6 (19) | 127 | 27.7 (19.5) | 121 | −0.02 [N] | −0.27 to 0.23 | 0.88 | NA | NA | NA |
| | Non-European studies | 6 | 26.7 (13.9) | 1227 | 15.4 (14.1) | 1272 | 0.37 [S] | 0.20–0.54 | <0.01 | 54% | 15–75% | 0.06 |
| Overall outcomes | | 23 | 25.2 (11.8) | 2857 | 13.1 (11.4) | 2889 | 0.41 [S] | 0.30–0.52 | <0.01 | 77% | 69–83% | <0.01 |

Test for subgroup differences (job duration):
Active vs passive control condition: χ² = 1.98; df = 1; p = 0.16
⩽ 12 month vs. > 12 month follow-up: χ² = 0.27; df = 1; p = 0.60
European vs. non-European studies: χ² = 0.65; df = 1; p = 0.42

Wages (M (s.d.) at FU [b]; effect size of outcome [c]; heterogeneity)

| Comparison | Follow-up subgroup | N studies [a] | IPS M (s.d.) | IPS N | Control M (s.d.) | Control N | d [d] | 95% CI | p | I² | 95% CI | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| IPS vs. active control condition | All studies | 10 | 286.6 (314.7) | 994 | 148.3 (196.6) | 979 | 0.39 [S] | 0.20–0.58 | <0.01 | 76% | 61–85% | <0.01 |
| | ⩽ 12 month follow-up | 2 | 310.1 (328.5) | 60 | 77.9 (94.6) | 62 | 0.63 [M] | 0.26–0.99 | <0.01 | 0% | NA | 0.92 |
| | > 12 month follow-up | 8 | 277.8 (332.2) | 934 | 174.7 (223.0) | 917 | 0.35 [S] | 0.15–0.56 | <0.01 | 80% | 64–89% | <0.01 |
| | European studies (including UK) | 2 | 558.9 (491.9) | 289 | 353.1 (247.2) | 285 | 0.17 [N] | −0.07 to 0.41 | 0.17 | 52% | NA | 0.15 |
| | Non-European studies | 8 | 226.1 (266.3) | 705 | 102.8 (166.9) | 694 | 0.46 [S] | 0.22–0.69 | <0.01 | 77% | 59–87% | <0.01 |
| IPS vs. passive control condition | All studies | 6 | 497.7 (413.9) | 1222 | 414.8 (401.1) | 1259 | 0.28 [S] | 0.14–0.42 | <0.01 | 23% | 0–43% | 0.26 |
| | ⩽ 12 month follow-up | 4 | 640.4 (451.7) | 150 | 550.2 (427.3) | 141 | 0.15 [N] | −0.07 to 0.36 | 0.19 | 0% | 0–79% | 0.61 |
| | > 12 month follow-up | 2 | 252.1 (252.4) | 1072 | 144.1 (191.8) | 1120 | 0.39 [S] | 0.11–0.67 | <0.01 | 64% | NA | 0.09 |
| | European studies (including UK) | 0 | X | X | X | X | X | X | X | X | X | X |
| | Non-European studies | 6 | 497.7 (413.9) | 1222 | 414.8 (401.1) | 1261 | 0.28 [S] | 0.14–0.42 | <0.01 | 23% | 0–43% | 0.26 |
| Overall outcomes | | 15 | 379.1 (358.9) | 2148 | 257.0 (306.4) | 2172 | 0.31 [S] | 0.19–0.44 | <0.01 | 65% | 51–76% | 0.01 |

Test for subgroup differences (wages):
Active vs passive control condition: χ² = 0.84; df = 1; p = 0.36
⩽ 12 month vs. > 12 month follow-up: χ² = 0.14; df = 1; p = 0.70
European vs. non-European studies: χ² = 1.58; df = 1; p = 0.21

[a] Some studies used multiple follow-up assessments or had multiple treatment arms. Some studies are therefore included in the analysis of both follow-up subgroups, and one study compared IPS with both an active and a passive control group. As a result, the total number of studies and the sample sizes analysed in each comparison are sometimes lower than the sum of the studies analysed in both follow-up subgroups.

[b] Summary statistics for each of the three employment outcomes are assessed as follows: Employment rate: number and percentage of people in competitive employment at the follow-up assessment; Job duration: percentage of time within the study period that participants are employed; Wages: monthly salary in euros during the study period.

[c] d > 0 and OR > 1 indicate outcomes that are beneficial for IPS compared to the control group; d < 0 and OR < 1 indicate outcomes that are beneficial for the control group compared to IPS.

[d] Magnitude of effect (Chinn, 2000): Not clinically relevant [N]: d > −0.2 – <0.2; OR > 0.67 – <1.5; Small effect [S]: d ⩽ −0.20 and >−0.50 – ⩾0.20 and <0.50; OR ⩽ 0.67 and >0.29 – ⩾1.5 and <3.5; Medium effect [M]: d ⩽ −0.50 and >−0.80 – ⩾0.50 and <0.80; OR ⩽ 0.29 and >0.20 – ⩾3.5 and <5; Large effect [L]: d < −0.80 – >0.80; OR < 0.20 – >5.
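The magnitude thresholds in this footnote can be read as a simple lookup. The sketch below encodes them as stated; the function names are hypothetical, and the treatment of exact boundary values that the footnote leaves unassigned (|d| = 0.80, OR = 5 and OR = 0.20) is an assumption: here they are placed in the "large" category.

```python
def magnitude_from_d(d):
    """Label a standardised mean difference as N, S, M or L using the footnote thresholds."""
    size = abs(d)  # the thresholds are symmetric around zero
    if size < 0.20:
        return "N"
    if size < 0.50:
        return "S"
    if size < 0.80:
        return "M"
    return "L"  # assumption: |d| == 0.80 is treated as large


def magnitude_from_or(odds_ratio):
    """Label an odds ratio as N, S, M or L using the footnote thresholds."""
    if odds_ratio >= 1:
        if odds_ratio < 1.5:
            return "N"
        if odds_ratio < 3.5:
            return "S"
        if odds_ratio < 5:
            return "M"
        return "L"  # assumption: OR == 5 is treated as large
    if odds_ratio > 0.67:
        return "N"
    if odds_ratio > 0.29:
        return "S"
    if odds_ratio > 0.20:
        return "M"
    return "L"  # assumption: OR == 0.20 is treated as large


# Values taken from the overall outcomes reported in Table 3.
print(magnitude_from_or(2.62))  # employment rate
print(magnitude_from_d(0.41))   # job duration
print(magnitude_from_d(0.31))   # wages
```

Applied to the three overall outcomes, this lookup returns the same small-effect label [S] that Table 3 reports for each of them.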

Twenty-three studies reported job duration outcomes. IPS participants were employed for longer than those in the control group during follow-up, a small effect (d = 0.41 [0.30–0.52], p < 0.01). Outcomes were moderately heterogeneous (I² = 77% [69–83%]; p < 0.01). The overall effect size for job duration was not influenced by the type of control group, follow-up duration or region.

Fifteen studies reported outcomes of wages. IPS participants earned more during the study period than those in the control group, though the effect size was small (d = 0.31 [0.19–0.44], p < 0.01). Outcomes were moderately heterogeneous (I² = 65% [51–76%]; p < 0.01). The overall effect size for wages was not influenced by the type of control group, follow-up duration or region.
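For readers unfamiliar with the heterogeneity statistic quoted above, the sketch below shows one common way to obtain I² from Cochran's Q under inverse-variance weighting. It is a minimal illustration, not the authors' analysis: the function name is hypothetical, and the four effect sizes and variances are made-up numbers chosen only to make the arithmetic easy to follow.

```python
def i_squared(effects, variances):
    """Return Cochran's Q, its degrees of freedom and I² for a set of study effects.

    I² = max(0, (Q - df) / Q), with Q computed around the inverse-variance weighted mean.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i2


# Hypothetical study-level effect sizes (d) and variances, for illustration only.
effects = [0.10, 0.30, 0.50, 0.70]
variances = [0.01, 0.01, 0.01, 0.01]

q, df, i2 = i_squared(effects, variances)
print(f"Q = {q:.1f}, df = {df}, I² = {i2:.0%}")
```

With these illustrative inputs Q is 20 on 3 degrees of freedom, so I² is 85%; that is, most of the observed variation between the hypothetical studies exceeds what sampling error alone would produce.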

Moderating effects on overall outcomes

Sensitivity analysis outcomes are reported in Table 4 and Fig. 3. We excluded some moderators from the sensitivity analyses of job duration and wages because they were reported in fewer than ten studies.
Table 4.

Sensitivity analysis of moderating effects on the outcomes

Employment rate

Research question 1: Effectiveness IPS for different diagnoses (effect size of outcome [a]; heterogeneity)

| Moderator [b] | Subgroups | N studies [c] | N sample (IPS – control) | IPS n (%) at FU [d] | IPS N | Control n (%) at FU [d] | Control N | OR [b] | 95% CI | p | I² | 95% CI | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Severity of psychiatric disorders | CMD | 4 | 445–420 | 243 (54.6%) | 445 | 161 (38.3%) | 420 | 1.99 [S] | 1.51–2.63 | <0.01 | 72% | 24–90% | 0.01 |
| | SMI | 20 | 1751–1706 | 795 (45.4%) | 1751 | 408 (23.9%) | 1706 | 3.37 [S] | 2.90–3.90 | <0.01 | 76% | 68–83% | <0.01 |
| | Subgroup difference | χ² = 10.79; df = 1; p < 0.01 |
| Schizophrenia spectrum disorder (SSD) | >50% of study sample SSD | 20 | 1574–1578 | 763 (48.5%) | 1574 | 396 (25.1%) | 1578 | 3.46 [S] | 2.98–4.03 | <0.01 | 75% | 66–82% | <0.01 |
| | ⩽ 50% of study sample SSD | 10 | 1889–1878 | 974 (51.6%) | 1889 | 615 (32.8%) | 1878 | 2.23 [S] | 1.95–2.55 | <0.01 | 57% | 34–72% | 0.01 |
| | Subgroup difference | χ² = 18.24; df = 1; p < 0.01 |
| Major depressive disorder (MDD) | >50% of study sample MDD | 3 | 1081–1118 | 542 (50.1%) | 1081 | 357 (31.9%) | 1118 | 2.17 [S] | 1.82–2.58 | <0.01 | 0% | 0–95% | 0.54 |
| | ⩽ 50% of study sample MDD | 22 | 2224–2164 | 1052 (47.3%) | 2224 | 577 (25.7%) | 2164 | 2.80 [S] | 2.46–3.19 | <0.01 | 74% | 65–80% | <0.01 |
| | Subgroup difference | χ² = 5.36; df = 1; p < 0.05 |

Research question 2: Effectiveness IPS for different clinical, functional and personal characteristics (effect size of outcome [a]; heterogeneity)

| Moderator | Subgroups | N studies [c] | N sample (IPS – control) | IPS n (%) at FU [d] | IPS N | Control n (%) at FU [d] | Control N | OR [b] | 95% CI | p | I² | 95% CI | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Duration of illness at baseline | Short duration of illness | 6 | 617–579 | 293 (47.5%) | 617 | 156 (26.9%) | 579 | 2.45 [S] | 1.92–3.14 | <0.01 | 69% | 37–84% | <0.01 |
| | Long duration of illness | 5 | 465–450 | 221 (47.5%) | 465 | 130 (28.9%) | 450 | 2.45 [S] | 1.82–3.29 | <0.01 | 34% | 0–60% | 0.19 |
| | Subgroup difference | χ² = 0.00; df = 1; p = 0.99 |
| Baseline severity of symptoms | High severity of symptoms | 11 | 2105–2099 | 1055 (50.1%) | 2105 | 672 (32.0%) | 2099 | 2.20 [S] | 1.94–2.49 | <0.01 | 70% | 54–81% | <0.01 |
| | Low severity of symptoms | 12 | 1035–1046 | 477 (46.1%) | 1035 | 223 (21.3%) | 1046 | 3.95 [M] | 3.24–4.82 | <0.01 | 59% | 40–72% | <0.01 |
| | Subgroup difference | χ² = 23.99; df = 1; p < 0.01 |
| Baseline Level of Functioning (LOF) | High baseline LOF | 7 | 540–487 | 274 (50.7%) | 540 | 144 (29.6%) | 487 | 3.59 [M] | 1.95–6.61 | <0.01 | 77% | 57–88% | <0.01 |
| | Low baseline LOF | 7 | 846–846 | 428 (50.6%) | 846 | 247 (29.2%) | 846 | 3.54 [M] | 2.08–6.04 | <0.01 | 84% | 70–91% | <0.01 |
| | Subgroup difference | χ² = 0.00; df = 1; p = 0.97 |
| Age at baseline | ⩽ 25th percentile | 8 | 631–628 | 311 (49.3%) | 631 | 202 (32.2%) | 628 | 2.56 [S] | 1.75–3.74 | <0.01 | 56% | 28–73% | <0.01 |
| | ⩾ 75th percentile | 8 | 1699–1728 | 864 (50.9%) | 1699 | 539 (31.2%) | 1728 | 2.94 [S] | 2.00–4.31 | <0.01 | 74% | 54–85% | <0.01 |
| | Subgroup difference | χ² = 0.24; df = 1; p = 0.62 |
| Comorbid alcohol use at baseline | High alcohol use | 6 | 501–500 | 308 (61.5%) | 501 | 183 (36.6%) | 500 | 3.51 [M] | 1.93–6.39 | <0.01 | 72% | 43–86% | <0.01 |
| | Low alcohol use | 6 | 398–405 | 243 (61.1%) | 398 | 87 (21.5%) | 405 | 6.44 [L] | 4.67–8.88 | <0.01 | 14% | 0–29% | 0.32 |
| | Subgroup difference | χ² = 3.05; df = 1; p = 0.08 |
| Comorbid substance use at baseline | High substance use | 8 | 379–365 | 187 (49.3%) | 379 | 82 (22.5%) | 365 | 4.27 [M] | 2.55–7.15 | <0.01 | 47% | 19–66% | 0.06 |
| | Low substance use | 7 | 494–511 | 328 (66.4%) | 494 | 132 (25.8%) | 511 | 5.84 [L] | 4.38–7.80 | <0.01 | 26% | 1–45% | 0.23 |
| | Subgroup difference | χ² = 1.08; df = 1; p = 0.30 |
| Work experience at baseline | High work experience | 8 | 596–585 | 282 (49.6%) | 596 | 132 (22.6%) | 585 | 3.78 [M] | 2.70–5.30 | <0.01 | 34% | 8–52% | 0.16 |
| | Low work experience | 8 | 752–750 | 374 (49.7%) | 752 | 160 (21.3%) | 750 | 5.24 [L] | 2.95–9.30 | <0.01 | 83% | 69–90% | <0.01 |
| | Subgroup difference | χ² = 0.93; df = 1; p = 0.34 |
| Education level: tertiary education | High proportion tertiary education | 10 | 1795–1826 | 979 (54.5%) | 1795 | 603 (33.0%) | 1826 | 3.03 [S] | 2.23–4.12 | <0.01 | 64% | 44–77% | <0.01 |
| | Low proportion tertiary education | 10 | 890–856 | 411 (46.2%) | 890 | 243 (28.4%) | 856 | 4.01 [M] | 2.33–6.93 | <0.01 | 83% | 73–90% | <0.01 |
| | Subgroup difference | χ² = 0.77; df = 1; p = 0.38 |

Job duration

Research question 1: Effectiveness IPS for different diagnoses (effect size of outcome [a]; heterogeneity)

| Moderator [b] | Subgroups | N studies [c] | N sample (IPS – control) | IPS M (s.d.) at FU [d] | IPS N | Control M (s.d.) at FU [d] | Control N | d [b] | 95% CI | p | I² | 95% CI | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Severity of psychiatric disorders | CMD | 3 | 475–477 | 30.2 (10.0) | 475 | 17.4 (5.4) | 477 | 0.35 [S] | −0.03 to 0.74 | 0.07 | 90% | 69–97% | <0.01 |
| | SMI | 12 | 1026–1020 | 22.1 (12.1) | 1026 | 10.0 (10.9) | 1020 | 0.45 [S] | 0.29–0.61 | <0.01 | 78% | 66–86% | <0.01 |
| | Subgroup difference | χ² = 0.19; df = 1; p = 0.66 |
| Schizophrenia spectrum disorder (SSD) | >50% of study sample SSD | 13 | 1070–1078 | 21.4 (11.8) | 1070 | 10.2 (10.5) | 1078 | 0.42 [S] | 0.27–0.56 | <0.01 | 77% | 65–85% | <0.01 |
| | ⩽ 50% of study sample SSD | 9 | 1769–1792 | 30.7 (10.8) | 1769 | 18.0 (12.0) | 1792 | 0.39 [S] | 0.21–0.57 | <0.01 | 80% | 66–88% | <0.01 |
| | Subgroup difference | χ² = 0.04; df = 1; p = 0.84 |
| Major depressive disorder (MDD) | >50% of study sample MDD | 3 | 1199–1240 | 22.5 (6.6) | 1199 | 13.3 (11.3) | 1240 | 0.33 [S] | 0.11–0.56 | <0.01 | 80% | 26–95% | <0.01 |
| | ⩽ 50% of study sample MDD | 16 | 1453–1431 | 28.0 (12.6) | 1453 | 14.9 (12.4) | 1431 | 0.44 [S] | 0.28–0.60 | <0.01 | 81% | 72–87% | <0.01 |
| | Subgroup difference | χ² = 0.55; df = 1; p = 0.46 |

Research question 2: Effectiveness IPS for different clinical, functional and personal characteristics (effect size of outcome [a]; heterogeneity)

| Moderator | Subgroups | N studies [c] | N sample (IPS – control) | IPS M (s.d.) at FU [d] | IPS N | Control M (s.d.) at FU [d] | Control N | d [b] | 95% CI | p | I² | 95% CI | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline severity of symptoms | High severity of symptoms | 9 | 1943–1990 | 21.3 (10.7) | 1943 | 10.6 (6.9) | 1990 | 0.37 [S] | 0.20–0.54 | <0.01 | 84% | 73–91% | <0.01 |
| | Low severity of symptoms | 9 | 670–665 | 27.5 (12.3) | 670 | 12.2 (12.2) | 665 | 0.51 [M] | 0.37–0.65 | <0.01 | 59% | 35–74% | 0.01 |
| | Subgroup difference | χ² = 1.69; df = 1; p = 0.19 |
| Baseline Level of Functioning (LOF) | High baseline LOF | 3 | 163–162 | 25.0 (10.4) | 163 | 8.1 (5.6) | 162 | 0.59 [M] | 0.09–1.09 | 0.02 | 83% | 39–96% | <0.01 |
| | Low baseline LOF | 7 | 911–905 | 26.3 (11.9) | 911 | 11.7 (7.2) | 905 | 0.46 [S] | 0.21–0.72 | <0.01 | 88% | 78–93% | <0.01 |
| | Subgroup difference | χ² = 0.19; df = 1; p = 0.66 |
| Age at baseline | ⩽ 25th percentile | 6 | 510–506 | 27.7 (16.4) | 510 | 16.5 (12.8) | 506 | 0.28 [S] | 0.10–0.47 | <0.01 | 52% | 13–73% | 0.07 |
| | ⩾ 75th percentile | 5 | 1512–1552 | 23.5 (7.5) | 1512 | 13.4 (9.7) | 1552 | 0.35 [S] | 0.08–0.61 | <0.05 | 85% | 66–93% | <0.01 |
| | Subgroup difference | χ² = 0.14; df = 1; p = 0.71 |
| Comorbid substance use at baseline | High substance use | 4 | 115–112 | 33.6 (16.2) | 115 | 18.7 (16.6) | 112 | 0.54 [M] | 0.21–0.86 | <0.01 | 43% | 0–73% | 0.15 |
| | Low substance use | 6 | 428–425 | 26.5 (9.6) | 428 | 8.3 (5.7) | 425 | 0.62 [M] | 0.40–0.85 | <0.01 | 74% | 48–87% | <0.01 |
| | Subgroup difference | χ² = 0.19; df = 1; p = 0.66 |
| Work experience at baseline | High work experience | 6 | 382–375 | 27.3 (15.0) | 382 | 13.7 (14.5) | 375 | 0.51 [M] | 0.30–0.73 | <0.01 | 67% | 34–83% | 0.01 |
| | Low work experience | 7 | 639–643 | 19.6 (8.9) | 639 | 7.0 (5.4) | 643 | 0.49 [S] | 0.26–0.71 | <0.01 | 83% | 67–91% | <0.01 |
| | Subgroup difference | χ² = 0.03; df = 1; p = 0.87 |
| Education level: tertiary education | High proportion tertiary education | 8 | 1674–1707 | 32.1 (12.9) | 1674 | 19.8 (14.3) | 1707 | 0.39 [S] | 0.25–0.52 | <0.01 | 61% | 35–77% | 0.01 |
| | Low proportion tertiary education | 7 | 640–630 | 22.2 (6.5) | 640 | 9.6 (9.7) | 630 | 0.46 [S] | 0.20–0.72 | <0.01 | 83% | 68–91% | <0.01 |
| | Subgroup difference | χ² = 0.24; df = 1; p = 0.63 |

Wages

Research question 1: Effectiveness IPS for different diagnoses (effect size of outcome [a]; heterogeneity)

| Moderator [b] | Subgroups | N studies [c] | N sample (IPS – control) | IPS M (s.d.) at FU [d] | IPS N | Control M (s.d.) at FU [d] | Control N | d [b] | 95% CI | p | I² | 95% CI | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Severity of psychiatric disorders | CMD | 3 | 357–354 | 797.2 (214.5) | 357 | 553.9 (388.4) | 354 | 0.32 [S] | 0.07–0.56 | 0.01 | 41% | 0–80% | 0.18 |
| | SMI | 7 | 666–719 | 141.7 (146.3) | 666 | 94.4 (129.8) | 719 | 0.31 [S] | 0.04–0.58 | 0.03 | 83% | 69–91% | <0.01 |
| | Subgroup difference | χ² = 0.00; df = 1; p = 0.96 |
| Schizophrenia spectrum disorder (SSD) | >50% of study sample SSD | 7 | 639–698 | 123.5 (152.6) | 639 | 88.9 (132.8) | 698 | 0.29 [S] | 0.03–0.55 | 0.03 | 82% | 67–91% | <0.05 |
| | ⩽ 50% of study sample SSD | 7 | 1491–1523 | 686.6 (309.0) | 1491 | 479.1 (334.1) | 1523 | 0.30 [S] | 0.22–0.37 | <0.01 | 2% | 0–4% | 0.41 |
| | Subgroup difference | χ² = 0.00; df = 1; p = 0.96 |
| Major depressive disorder (MDD) | >50% of study sample MDD | 2 | 1048–1092 | 737.4 (433.8) | 1048 | 619.2 (480.1) | 1092 | 0.29 [S] | 0.21–0.37 | <0.01 | 0% | NA | 0.96 |
| | ⩽ 50% of study sample MDD | 11 | 1007–1055 | 387.5 (350.3) | 1007 | 254.1 (274.7) | 1055 | 0.33 [S] | 0.15–0.51 | <0.01 | 75% | 61–84% | <0.01 |
| | Subgroup difference | χ² = 0.15; df = 1; p = 0.70 |

Research question 2: Effectiveness IPS for different clinical, functional and personal characteristics (effect size of outcome [a]; heterogeneity)

| Moderator | Subgroups | N studies [c] | N sample (IPS – control) | IPS M (s.d.) at FU [d] | IPS N | Control M (s.d.) at FU [d] | Control N | d [b] | 95% CI | p | I² | 95% CI | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline severity of symptoms | High severity of symptoms | 5 | 1628–1670 | 412.6 (270.6) | 1628 | 233.9 (186.9) | 1670 | 0.34 [S] | 0.15–0.54 | <0.01 | 81% | 56–92% | <0.01 |
| | Low severity of symptoms | 5 | 288–284 | 335.3 (358.7) | 288 | 199.0 (238.9) | 284 | 0.36 [S] | 0.12–0.60 | <0.01 | 61% | 16–82% | 0.04 |
| | Subgroup difference | χ² = 0.01; df = 1; p = 0.93 |
| Education level: tertiary education | High proportion tertiary education | 7 | 1520–1551 | 564.7 (369.9) | 1520 | 407.8 (259.3) | 1551 | 0.29 [S] | 0.21–0.37 | <0.01 | 4% | 0–9% | 0.40 |
| | Low proportion tertiary education | 4 | 403–467 | 241.6 (549.9) | 403 | 144.8 (533.3) | 467 | 0.38 [S] | 0.07–0.70 | 0.02 | 83% | 54–94% | <0.01 |
| | Subgroup difference | χ² = 0.32; df = 1; p = 0.57 |

d > 0 and OR > 1 indicates outcomes are beneficial for IPS compared to the control group; d < 0 and OR < 1 indicates outcomes are beneficial for the control group compared to IPS.

b Underlined moderators were significant moderators of outcome.

Magnitude of effect: Not clinically relevant [N]: d > −0.2 – <0.2; OR > 0.67 – <1.5; Small effect [S]: d ⩽ −0.20 and >−0.50 – ⩾0.20 and <0.50; OR ⩽ 0.67 and >0.29 – ⩾1.5 and <3.5; Medium effect [M]: d ⩽ −0.50 and >−0.80 – ⩾0.50 and <0.80; OR ⩽ 0.29 and >0.20 – ⩾3.5 and <5; Large effect [L]: d < −0.80 – >0.80; OR < 0.20 – >5.

Summary statistics for each of the three employment outcomes are assessed as follows: Employment rate: number and percentage of people in competitive employment at the follow-up assessment; Job duration: percentage of time within the study period that participants are employed; Wages: monthly salary in euros during the study period.

Fig. 3.

Overview effect sizes of outcomes for each moderator.

Overview effect sizes of outcomes for each moderator. Sensitivity analysis of moderating effects on the outcomes d > 0 and OR > 1 indicates outcomes are beneficial for IPS compared to the control group; d < 0 and OR < 1 indicates outcomes are beneficial for the control group compared to IPS. b Underlined moderators were significant moderators of outcome. Magnitude of effect: Not clinically relevant [N]: d > −0.2 – <0.2; OR > 0.67 – <1.5; Small effect [S]: d ⩽ −0.20 and >−0.50 – ⩾0.20 and <0.50; OR ⩽ 0.67 and >0.29 – ⩾1.5 and <3.5; Medium effect [M]: d ⩽ −0.50 and >−0.80 – ⩾0.50 and <0.80; OR ⩽ 0.29 and >0.20 – ⩾3.5 and <5; Large effect [L]: d < −0.80 – >0.80; OR < 0.20 – >5. Summary statistics for each of the three employment outcomes are assessed as follows: Employment rate: number and percentage of people in competitive employment at the follow-up assessment; Job duration: percentage of time within the study period employed that participants are employed; Wages: monthly salary in euros during the study period. We found significant favourable employment rate outcomes in the IPS group compared with the control group in all subgroups. However, IPS showed more favourable outcomes in studies targeting participants with SMI than studies targeting CMD (χ2 = 10.79; df = 1; p < 0.01). These differences between both subgroups were specifically explained by differences in employment rates in the control group (i.e., 38.3% in the CMD subgroup versus 23.9% in the SMI subgroup; χ2 = 28.84; df = 1; p < 0.01). We also found more favourable outcomes for IPS in subgroups with a majority diagnosed with SSD (χ2 = 18.24; df = 1; p < 0.01), as well as in subgroups with a minority diagnosed with MDD (χ2 = 5.36; df = 1; p < 0.05), and in subgroups with a lower baseline level of symptoms (χ2 = 20.48; df = 1; p < 0.01). Figure 3 shows all subgroup outcomes. 
Subgroups with SMI, a majority diagnosed with SSD, low symptom severity, and low comorbid alcohol and substance use problems at baseline showed greater relative effectiveness of IPS. None of the potential moderators included in the sensitivity analysis had significant effects on either job duration or wages. Because the type of control group and the region in which the study was conducted significantly influenced employment rate outcomes, the above-mentioned moderating effects might be explained by an overrepresentation of a particular region or type of control group within one of the subgroups. However, chi-square analyses found no indication of such overrepresentation in any of these subgroups. We could therefore not attribute any moderating effects to regional differences or type of control group.
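The magnitude bands listed in the notes to Fig. 3 can be written as a small lookup. The following is a minimal sketch, not the authors' code: the function names `classify_d` and `classify_or` are hypothetical, and because the notes leave the exact boundary values (|d| = 0.80, OR = 0.20, OR = 5) unassigned, the sketch assumes they belong to the large band.

```python
def classify_d(d: float) -> str:
    """Map a standardised mean difference (Cohen's d) to N, S, M or L."""
    size = abs(d)  # the bands are symmetric around d = 0
    if size < 0.20:
        return "N"  # not clinically relevant
    if size < 0.50:
        return "S"  # small
    if size < 0.80:
        return "M"  # medium
    return "L"      # large (assumption: |d| == 0.80 is counted as large)


def classify_or(odds_ratio: float) -> str:
    """Map an odds ratio to N, S, M or L using the bands from the notes."""
    if odds_ratio <= 0:
        raise ValueError("an odds ratio must be positive")
    if 0.67 < odds_ratio < 1.5:
        return "N"
    if 0.29 < odds_ratio < 3.5:
        return "S"  # covers (0.29, 0.67] and [1.5, 3.5)
    if 0.20 < odds_ratio < 5:
        return "M"  # covers (0.20, 0.29] and [3.5, 5)
    return "L"      # assumption: exactly 0.20 or 5 is counted as large


if __name__ == "__main__":
    for value in (0.10, -0.20, 0.50, 0.90):
        print(f"d = {value:+.2f} -> {classify_d(value)}")
    for value in (1.0, 1.5, 0.25, 6.0):
        print(f"OR = {value:.2f} -> {classify_or(value)}")
```

In both functions the direction of the effect (favouring IPS or the control group) is read separately from the sign of d or from whether the OR is above or below 1.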

Assessment of outliers and publication bias

We found two negative outliers and six positive outliers for employment rate, three negative outliers and three positive outliers for job duration, and one negative and one positive outlier for wages. Removing these outliers did not change the study outcomes in either direction. The funnel plots are presented in online Supplementary materials 3. For all outcomes (employment rate, job duration and wages) we found no indications of publication bias.

Discussion

This meta-analysis investigated the relative effectiveness of IPS for different subgroups based on diagnostic, clinical, functional and personal characteristics. Overall, we found that IPS is effective in improving employment outcomes regardless of sample characteristics. However, we did find that IPS was relatively less effective in supporting service users into competitive employment in European studies and in studies comparing IPS to a passive control group. Furthermore, we found that IPS was relatively more effective for people with SMIs, compared with CMD. We also found more favourable outcomes of IPS in subgroups in which the majority was diagnosed with a schizophrenia spectrum disorder (SSD), in which the minority was diagnosed with major depressive disorder (MDD), subgroups with a low baseline symptom severity, and subgroups with a low baseline level of substance and alcohol use problems. These subgroup effects could not be explained by an overrepresentation of non-European studies or an active control group within any subgroup.
Despite the overall effectiveness of IPS in all subgroups, the issue remains that in many studies the majority of service users who received IPS remained unemployed. This highlights the need for continuous refinement of the IPS model.
The lower effectiveness of IPS on employment rate outcomes in European studies is most probably explained by the relatively extensive welfare systems with a disability benefit structure in most European countries. The risk of losing steady income from disability benefits after finding competitive employment (i.e., the ‘benefits trap’) might discourage service users from seeking employment (Burns and Cathy, 2008; Metcalfe et al., 2018a).
We also found that IPS was relatively more effective on employment rate outcomes when compared with an active control group than when compared with a passive control group. When IPS was compared with an active control group, we found a slightly larger employment rate in the IPS group (50.1% vs. 48.3%) but a slightly smaller employment rate in the control group (26.7% vs. 29.4%). However, the differences in both the IPS and control groups were negligible, and we could therefore not give any clinically meaningful explanation for the differences between the two types of control group.
Our findings that IPS is relatively more effective for people with SMI and SSD and relatively less effective for people with MDD and CMD are in line with previous research (Hellström et al., 2021). The main explanation for the difference in the effectiveness of IPS between the CMD and SMI subgroups is that employment rates in the control group were higher in the CMD subgroup, whereas the outcomes were equal in the IPS group. Previous research found even more favourable employment outcomes in the control group for people with mood disorders or less severe thought disorders (Campbell et al., 2010; Jonsdottir and Waghorn, 2015). This may indicate that people with CMD also benefit from other vocational rehabilitation interventions. Nevertheless, our meta-analysis indicates that IPS leads to more favourable employment outcomes for people with CMD compared to any control group, and these indications of effectiveness were also found in another recently published meta-analysis (Probyn et al., 2021).
Another possible explanation for the differences between the CMD and SMI subgroups is that IPS was originally developed for people with SMI, who are generally supported by professionals working in integrated treatment teams, whereas service users with CMD are often supported in different healthcare settings. Previous research indicated that organisational characteristics, such as the type of clinical practice, service intensity and quality of mental health treatment, could be an important prerequisite for successful implementation (Lockett et al., 2018). This explanation is supported by two studies that conducted IPS for service users with CMD (Hellström et al., 2017; Poremski et al., 2017) in another healthcare setting for this group. These differences can also partially be explained by the fidelity scores. Only 50% of all studies that evaluated the effectiveness of IPS for people with CMD reached a fair fidelity. In contrast, 89% of the studies that evaluated the effectiveness of IPS for people with SMI achieved good or excellent fidelity. Given that better fidelity scores lead to better outcomes in IPS (Bond et al., 2012; Kim et al., 2015; Lockett et al., 2016; De Winter et al., 2020), this might be an important explanation for the differences in outcomes between SMI and CMD subgroups. In addition to fidelity, other important factors, such as the quality of healthcare services and the intensity of employment support, might also be relevant topics for further investigation. Therefore, poorer outcomes for people with CMD compared to people with SMI might partially be explained by specific challenges in the implementation of IPS in a different healthcare setting, which underlines our recommendation to adapt implementation for specific subgroups.
We also found more favourable indications of the effectiveness of IPS for people with lower symptom severity and lower comorbid substance and alcohol use problems at the start of IPS. This is in line with previous studies which also indicated that lower symptom severity increased the odds of being employed for people who received any type of vocational rehabilitation (Michon et al., 2005; Campbell et al., 2010; Nygren et al., 2013). This might be explained by the fact that lower symptom severity frees up more time for, and focus on, adequate job support, because less focus on symptom stabilisation and intensive treatment programmes is needed. The positive influence of low symptom severity on the effectiveness of IPS may seem to contradict the superior outcomes of IPS for people with SMI compared with CMD, as SMIs are generally associated with higher symptom severity. However, four out of the five studies (80%) that reported outcomes of IPS for people with CMD had a high symptom severity at baseline. Therefore, symptom severity and the severity of illness are not interrelated in this meta-analysis.
This meta-analysis had several limitations. First, all our findings were analysed on a study level and subgroups were based on aggregated scores or percentages of the whole study sample. This analysis provides an overarching overview of the influence of specific service users' characteristics on the effectiveness of IPS, but does not reflect the variability of individual client-level characteristics or outcomes. Despite this limitation, this meta-analysis gives valuable insights into how to make effective adaptations in the implementation of IPS in real-world settings. Analysis of outcomes on a study level inevitably leads to heterogeneity of outcomes because the context and setting in which the studies are executed differ (Ioannidis, 2008). Furthermore, this meta-analysis only focused on the effectiveness of IPS as a stand-alone intervention within a mental health population, in order to achieve a relatively homogeneous sample of studies. As a consequence, we did not include a number of relevant studies that exclusively investigated IPS with an add-on intervention (e.g., McGurk et al., 2015; Tsang et al., 2016) or studies focused on populations with a high risk of developing mental disorders (e.g., Sveinsdottir et al., 2019). This also partially explains the lack of available study data to investigate the influence of other relevant moderators (such as cognitive functioning).
Second, some of our sensitivity analyses, in which we investigated moderating effects, were based on a relatively low number of studies. This might have limited the generalisability of outcomes for subgroups based on a small number of studies. Another limitation is that 12 (37.5%) of our included studies were conducted more than 10 years ago. At that time, IPS was delivered in the societal setting of that period, most probably with other welfare policies or treatment practices than those implemented nowadays. Another potential limitation is that the broad variety of studies might influence the interpretation and representativeness of some moderating effects. Our included studies investigated target groups with different diagnoses and clinical characteristics and therefore in some cases used different assessment instruments. This was specifically the case for the moderating effects of the severity of symptoms and level of functioning. We tried to solve this issue by using normative data based on representative target groups that match each included study. However, this inevitably leads to heterogeneity in the outcomes. The findings on the influence of symptom severity and level of functioning on employment should therefore be interpreted with caution.
Finally, we executed a relatively high number of sensitivity analyses based on a relatively low number of studies. This increases the chance of false-positive outcomes and alpha inflation (Wang et al., 2017). We should therefore consider the results of this meta-analysis as exploratory, and the findings as suggesting potentially valuable trends for improving IPS for different target groups.
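The alpha inflation mentioned in the limitations can be illustrated with the standard familywise error rate formula, 1 − (1 − α)^m for m independent tests. This is a minimal sketch under stated assumptions: the tests are treated as independent, the test counts are hypothetical (the text does not give the total number of sensitivity analyses), and the Bonferroni threshold is included only for comparison.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the familywise rate <= alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    alpha = 0.05
    # Hypothetical numbers of moderator tests, chosen only for illustration.
    for n_tests in (1, 10, 30):
        fwer = familywise_error_rate(alpha, n_tests)
        threshold = bonferroni_threshold(alpha, n_tests)
        print(f"{n_tests:>2} tests: familywise error rate = {fwer:.3f}, "
              f"Bonferroni per-test threshold = {threshold:.4f}")
```

With ten independent tests at α = 0.05, the chance of at least one false positive is already about 40%, which is why the subgroup findings are best read as exploratory.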
Overall this meta-analysis has shown that IPS is implemented for a wide variety of service users: IPS is effective for different subgroups, regardless of distinct diagnostic, clinical, functional and personal characteristics. However, future research should focus on the implementation of IPS for people with CMD and higher symptom severity. It is important to investigate whether, and if so, how to make more effective adaptations in the implementation of IPS to better meet the vocational needs of these groups.
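As a rough check on the active versus passive control-group comparison discussed above, the reported employment rates (IPS 50.1% vs. control 26.7% with active controls; IPS 48.3% vs. control 29.4% with passive controls) can be turned into crude odds ratios. This is only a sketch: the helper names are hypothetical, and crude ratios computed from aggregate percentages are not the pooled, study-weighted estimates of the meta-analysis.

```python
def odds(proportion: float) -> float:
    """Convert a proportion (0 < p < 1) into odds."""
    if not 0.0 < proportion < 1.0:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return proportion / (1.0 - proportion)


def crude_odds_ratio(p_ips: float, p_control: float) -> float:
    """Odds of employment under IPS divided by odds under the control condition."""
    return odds(p_ips) / odds(p_control)


if __name__ == "__main__":
    # Employment rates as reported in the Discussion.
    comparisons = {
        "active control": (0.501, 0.267),
        "passive control": (0.483, 0.294),
    }
    for label, (p_ips, p_control) in comparisons.items():
        ratio = crude_odds_ratio(p_ips, p_control)
        difference = (p_ips - p_control) * 100
        print(f"{label}: crude OR = {ratio:.2f}, "
              f"rate difference = {difference:.1f} percentage points")
```

The two crude odds ratios come out at roughly 2.8 and 2.2, both inside the 1.5 to 3.5 band that the Fig. 3 notes label a small effect, which fits the authors' reading that the difference between control-group types is not clinically meaningful.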
Principle: Explanation
1. Goal of competitive employment: The goal of IPS is obtaining and sustaining competitive employment. This is defined as jobs anyone can apply for, pay at least minimum wage/same pay as coworkers with similar duties, and have no artificial time limits imposed by the social service agency.
2. Zero exclusion and eligibility based on client choice: People are not excluded on the basis of readiness, diagnoses, symptoms, substance use history, psychiatric hospitalisations, homelessness, level of disability or legal system involvement.
3. Attention to client preferences: IPS programme services are based on each job seeker's preferences and choices rather than the employment specialist's and supervisor's judgments.
4. Rapid job search: IPS programmes use a rapid job search approach to help job seekers obtain jobs rather than assessments, training, & counselling. The first face to face contact with the employer occurs within 30 days.
5. Integration with mental health treatment: IPS programmes are integrated with mental health treatment teams. Employment specialists attach to 1 or 2 mental health treatment teams, which discuss their caseload.
6. Personalised benefits counselling: Employment specialists help people obtain personalised, understandable, and accurate information about their Social Security, Medicaid and other government entitlements.
7. Targeted job development: Employment specialists systematically visit employers, who are selected based on the job seeker's preferences, to learn about their business needs and hiring preferences.
8. Individualised, long-term support: Job supports are individualised and continue for as long as each worker wants and needs the support. Employment Specialists have face to face contact at least monthly.
  81 in total

1. S Chinn. A simple method for converting an odds ratio to effect size for use in meta-analysis. Stat Med. 2000-11-30.
2. H W H Tsang. Supported employment versus traditional vocational rehabilitation for individuals with severe mental illness: a three-year study. Hong Kong Med J. 2011-02.
3. H W H Tsang; M D Bell; V Cheung; K L Tam; W S Yeung. Integrated supported employment plus cognitive remediation training for people with schizophrenia. Hong Kong Med J. 2016-02.
4. Harry W C Michon; Jaap van Weeghel; Hans Kroon; Aart H Schene. Person-related predictors of employment outcomes after participation in psychiatric vocational rehabilitation programmes: a systematic review. Soc Psychiatry Psychiatr Epidemiol. 2005-05. (Review)
5. Lone Hellström; Pernille Pedersen; Thomas Nordahl Christensen; Iben Gammelgaard Wallstroem; Anders Bo Bojesen; Elsebeth Stenager; Ulrika Bejerholm; Jooske van Busschbach; Harry Michon; Kim T Mueser; Silje Endresen Reme; Sarah White; Lene Falgaard Eplov. Vocational Outcomes of the Individual Placement and Support Model in Subgroups of Diagnoses, Substance Abuse, and Forensic Conditions: A Systematic Review and Analysis of Pooled Original Data. J Occup Rehabil. 2021-03-04. (Review)
6. Erin C Dunn; Nancy J Wewiorski; E Sally Rogers. The meaning and importance of employment to people in recovery from serious mental illness: results of a qualitative study. Psychiatr Rehabil J. 2008.
7. Robert E Drake; William Frey; Gary R Bond; Howard H Goldman; David Salkever; Alexander Miller; Troy A Moore; Jarnee Riley; Mustafa Karakus; Roline Milfort. Assisting Social Security Disability Insurance beneficiaries with schizophrenia, bipolar disorder, or major depression in returning to work. Am J Psychiatry. 2013-12.
8. Kenny Kin Wong; Rose Chiu; Betty Tang; Donald Mak; Joanne Liu; Siu Ning Chiu. A randomized controlled trial of a supported employment program for persons with long-term mental illness in Hong Kong. Psychiatr Serv. 2008-01.
9. Gary R Bond; Michelle P Salyers; Jerry Dincin; Robert Drake; Deborah R Becker; Virginia V Fraser; Michael Haines. A randomized controlled trial comparing two vocational models for persons with severe mental illness. J Consult Clin Psychol. 2007-12.
10. Dan Jackson; Rebecca Turner. Power analysis for random-effects meta-analysis. Res Synth Methods. 2017-04-04.
