Construct validity and test-retest reliability of the World Mental Health Japan version of the World Health Organization Health and Work Performance Questionnaire Short Version: a preliminary study.

Norito Kawakami1, Akiomi Inoue2, Masao Tsuchiya1, Kazuhiro Watanabe1, Kotaro Imamura1, Mako Iida3, Daisuke Nishi1.   

Abstract

The aim of the study was to investigate the test-retest reliability and construct validity of the World Mental Health Japan (WMHJ) version of the World Health Organization Health and Work Performance Questionnaire (WHO-HPQ) short version according to the COSMIN standard. We conducted two consecutive surveys of 102 full-time employees recruited through an Internet survey company in Japan, with a two-week interval, in 2018. We calculated Pearson's correlations (r) of the WHO-HPQ measures with other presenteeism scales (Stanford Presenteeism Scale [SPS], Work Functioning Impairment Scale [WFun], and perceived relative presenteeism), health conditions, and psychosocial job conditions. We tested the test-retest reliability (intraclass correlation, ICC) among those who reported no change in job performance during the follow-up. Among 92 (90%) respondents, absolute presenteeism significantly correlated with WFun and perceived relative presenteeism (r=-0.341 and -0.343, respectively, p=0.001) and with psychological distress (r=-0.247, p=0.018). The absolute and relative absenteeism measures did not significantly correlate with most of the other covariates. The test-retest reliability over a two-week period was high for the WHO-HPQ absolute presenteeism measure (ICC, 0.73), while those for the absolute and relative absenteeism measures were moderate. The study found an adequate level of test-retest reliability, but limited support for the construct validity, of the absolute presenteeism measure of the WMHJ version of the WHO-HPQ. Further research is needed to investigate the construct validity of the WHO-HPQ measures in a larger sample.

Keywords:  Absenteeism; Consensus-based standards for the selection of Health Measurement Instruments (COSMIN); Presenteeism; Productivity; Test-retest reliability

Year:  2020        PMID: 32173661      PMCID: PMC7417506          DOI: 10.2486/indhealth.2019-0090

Source DB:  PubMed          Journal:  Ind Health        ISSN: 0019-8366            Impact factor:   2.179


Introduction

Measuring work productivity loss has become increasingly important in research on mental health. The societal cost of mental disorders has been shown to be large1). Work productivity loss has been shown to be one of the largest components of this societal cost2, 3). Employers are also concerned with the cost-effectiveness or cost-benefit of workplace interventions4). Recent studies of workplace interventions tend to focus on the impact of intervention programs on work performance5). Work productivity loss consists of two components: absenteeism and presenteeism. Absenteeism is the number (or the proportion) of lost workdays in a certain period; presenteeism is the reduction of on-the-job performance6). A number of instruments have been developed to measure absenteeism and presenteeism7). Some were developed to measure absenteeism and presenteeism of workers with specific health conditions, such as depression8) and musculoskeletal disorders9). Most instruments have been shown to be reliable and valid, although further effort is needed to comprehensively assess the psychometric properties of these instruments7, 9).
In Japan, absenteeism and presenteeism of workers due to health problems are a concern in terms of both occupational health and productivity loss10). Well-established instruments of absenteeism and presenteeism, such as the Stanford Presenteeism Scale (SPS)11) and the Work Limitation Questionnaire (WLQ)12), have been translated and tested for reliability and validity in Japan13,14,15). Some instruments, such as the Work Functioning Impairment Scale (WFun), were originally developed in Japan14) and have been used extensively in other countries16). Although many measures have been developed to assess absenteeism and presenteeism among employees, most of them focus solely on presenteeism11, 14). The World Health Organization Health and Work Performance Questionnaire (WHO-HPQ) is the only instrument that measures both absenteeism and presenteeism6).
The original English version of the WHO-HPQ has been validated against administrative records of a company and performance ratings by supervisors among employees with different occupations6, 17). It was also validated among patients with arthritis18, 19). The original English version has been translated into several languages (Portuguese for use in Brazil; Spanish; French; Hebrew; and Japanese) (https://www.hcp.med.harvard.edu/hpq/info.php). However, few studies have been conducted to validate the translated versions. The Persian version was tested for the validity of its absenteeism measures against administrative records of a company20). The Japanese version was found to correlate with and predict sick leave due to mental disorders21, 22). However, no translated version has been fully tested for its psychometric properties, such as test-retest reliability and construct validity, including associations with other absenteeism/presenteeism scales23). The World Mental Health Japan (WMHJ) Survey conducted in 2002–2006 included some items from the short version of the WHO-HPQ24), and these were used to estimate work productivity loss due to mental disorders25). This WMHJ version of the WHO-HPQ was translated earlier, using slightly different wording from the more recent Japanese translation by Suzuki et al.21, 22). The WMHJ version has not yet been tested for reliability or validity.
The aim of this study was to preliminarily evaluate the reliability and validity of the WHO-HPQ, based on the version used in the WMHJ, by investigating its test-retest reliability, construct validity, and responsiveness in a sample of employees, according to the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN)23). We conducted two consecutive surveys of a small sample of full-time workers in Japan, with a two-week interval, to see whether the WHO-HPQ absenteeism and presenteeism measures showed acceptable levels of test-retest reliability. We also tested whether the measures correlated with eight selected covariates that are thought to be related to absenteeism and presenteeism, including other measures of presenteeism, to assess the construct validity (hypothesis testing)23).
For testing the construct validity, we first calculated correlations between the WHO-HPQ measures and three other presenteeism scales, i.e., SPS11), WFun13), and perceived relative presenteeism, to see whether the WHO-HPQ absolute presenteeism measure correlated with these scales. However, while the WHO-HPQ absolute presenteeism measure captures presenteeism from any reason6), SPS and WFun specifically ask about presenteeism due to health-related problems11, 13). Thus, the associations were expected to be only moderate. The WHO-HPQ measures of relative presenteeism were expected to correlate with perceived relative presenteeism more than with SPS or WFun, which measure absolute presenteeism. The WHO-HPQ absolute and relative absenteeism measures were not expected to correlate with these other presenteeism scales, because a sick worker with no absenteeism may have greater presenteeism. Second, we tested the correlations between the WHO-HPQ measures and five possible predictors (two health conditions and three psychosocial job conditions). We assumed that the WHO-HPQ absenteeism measures would correlate positively with psychological distress and depression/anxiety disorder10) and negatively with job control and supervisor and coworker support26). Similarly, the WHO-HPQ presenteeism measures, for which a greater score implies better work performance6), would correlate negatively with psychological distress and depression/anxiety disorder, and positively with job control and supervisor and coworker support27).

Subjects and Methods

Sample

A sample (N=102) of employees was drawn from a large pool (n>100,000) of registered members of a large Internet survey company in Japan. The inclusion criteria were: currently being employed full-time by a company or organization; and being aged between 20 and 60 yr. They were asked to respond to two anonymous Internet surveys with a two-week interval (T1 and T2), because a 2–4 wk period is the most commonly recommended time interval for test-retest reliability28). In previous studies, the interval used to assess the test-retest reliability of absenteeism and presenteeism measures varied, e.g., one day29), 1–2 wk19, 30, 31), and one month or longer32). A short interval, such as one day, may overestimate the test-retest reliability because respondents remember their initial responses28). While a longer interval may underestimate the test-retest reliability due to a change in the target condition, a previous study reported that the test-retest reliability of a health-related quality of life scale was similar for two different intervals of two days and two weeks33). In addition, even for a longer interval, limiting respondents to those who reported no change during the follow-up would help to estimate the test-retest reliability correctly23). Thus, we decided to use a two-week interval to estimate the test-retest reliability among respondents who reported no change in their work performance. The sample size was planned so that a moderate correlation (Pearson’s r=0.3 between a scale score and other variables)34) would be statistically significant (p<0.05, two-tailed) with a power of 0.80, assuming 80% valid responses. Participants were fully informed of the study aim and procedure, and consent was obtained. The study plan was reviewed and approved by the Research Ethics Committee of the Graduate School of Medicine/Faculty of Medicine, The University of Tokyo (No. 2953-(4)).
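The sample-size reasoning above can be illustrated with a common approximation based on Fisher's z-transformation. This is only a sketch: the paper does not state which formula or software was used for planning, so the function names and the choice of approximation are assumptions, and the result need not reproduce the recruited N of 102.

```python
import math
from statistics import NormalDist


def n_for_correlation(r: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n needed to detect a correlation r (two-tailed test).

    Uses the Fisher z approximation; this is an illustrative choice,
    not necessarily the method used in the study.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for the power
    effect = math.atanh(r)                         # Fisher z of the target r
    return math.ceil(((z_alpha + z_beta) / effect) ** 2 + 3)


def n_to_recruit(n_valid: int, valid_rate: float) -> int:
    """Inflate the analysable n to allow for invalid responses."""
    return math.ceil(n_valid / valid_rate)


n_valid = n_for_correlation(0.3)       # analysable respondents needed
n_total = n_to_recruit(n_valid, 0.80)  # respondents to recruit at 80% validity
print(n_valid, n_total)
```

Under this approximation, r=0.3 requires about 85 analysable respondents; other formulas or rounding conventions give slightly different figures.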

Measures

The short-version of WHO-HPQ

In addition to items that were already translated into Japanese in the WMHJ Survey in 2008, the authors (N.K., A.I. and M.T.) translated some other items of the short version of the WHO-HPQ. The authors then reviewed all the items again and made modifications based on discussion and feedback from three employees selected at companies for which some of the authors (N.K. and D.N.) worked as occupational physicians. The final version was back-translated into English by a commercial translator and reviewed by Professor Kessler, the researcher who developed the WHO-HPQ. Briefly, absolute absenteeism is defined as the total hours lost from work in a certain time frame, and relative absenteeism is the hours lost from work relative to the total work hours. Absolute presenteeism is defined as work performance (i.e., the quality of work) rated by a respondent on a 0–10 scale; relative presenteeism is self-rated performance relative to the work performance of most coworkers, which is also rated by the same respondent. According to the scoring manual, the following measures were calculated (see the items and calculation formulas in Appendices 1 and 2, respectively):
1) Absenteeism
a) Using four-week estimates: absolute absenteeism in the past four weeks; relative absenteeism in the past four weeks
b) Using seven-day estimates: absolute absenteeism in the past seven days; relative absenteeism in the past seven days
2) Presenteeism (work performance): absolute presenteeism; relative presenteeism (ratio); relative presenteeism (subtraction)
The survey at T1 used the WMHJ version of the WHO-HPQ. For testing the construct validity, we used measures at T1 in principle. However, a preliminary analysis of all the data collected in the T1 survey suggested that respondents rated B9 (work performance of most people working in a similar job) and B11 (their own work performance) similarly, since the correlation between these two questions was strong (r=0.653). Thus, for the T2 survey, we added one sentence to B9 to clarify for respondents that the question asked about their co-workers’ job performance, not their own. The modified version was used in the second survey at T2. We used the relative presenteeism measures at T2 for testing the construct validity. This modification also made it impossible to calculate relative presenteeism measures that were comparable between the two surveys. For this reason, we could not calculate the test-retest reliability for the relative presenteeism measures.
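The scoring described above can be sketched in a few lines. The absenteeism and absolute presenteeism formulas follow Appendix 2; the item labels (B3, B4, B6, B9, B11) are taken from the text and appendix. The relative presenteeism formulas are assumptions: the ratio is taken as B11/B9 bounded to 0.25–2.0 (consistent with the minimum and maximum reported in Table 2) and the subtraction as B11 minus B9 on the raw 0–10 ratings. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HPQResponse:
    b3: float   # hours worked in the last seven days
    b4: float   # standard working hours per week
    b6: float   # hours worked in the last four weeks
    b9: float   # rated performance of most workers in a similar job (0-10)
    b11: float  # rated own performance (0-10)


def score(resp: HPQResponse) -> dict:
    """Compute WHO-HPQ short-version measures for one respondent."""
    expected_4wk = 4 * resp.b4            # assumes b4 > 0 (full-time workers)
    abs_4wk = expected_4wk - resp.b6      # negative = worked more than expected
    abs_7d = expected_4wk - 4 * resp.b3   # seven-day hours scaled to four weeks
    scores = {
        "abs_absenteeism_4wk": abs_4wk,
        "rel_absenteeism_4wk": abs_4wk / expected_4wk,
        "abs_absenteeism_7d": abs_7d,
        "rel_absenteeism_7d": abs_7d / expected_4wk,
        "abs_presenteeism": 10 * resp.b11,
        # ASSUMPTION: subtraction uses the raw 0-10 ratings.
        "rel_presenteeism_subtraction": resp.b11 - resp.b9,
    }
    # ASSUMPTION: ratio is bounded to 0.25-2.0; undefined when B9 is 0.
    if resp.b9 > 0:
        scores["rel_presenteeism_ratio"] = min(2.0, max(0.25, resp.b11 / resp.b9))
    else:
        scores["rel_presenteeism_ratio"] = None
    return scores


example = HPQResponse(b3=42, b4=40, b6=168, b9=6, b11=7)
print(score(example))
```

In this example the respondent worked 8 h more than the expected 160 h over four weeks, so both absolute absenteeism values are negative, matching the interpretation of negative averages given in the Results.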

Other presenteeism scales

The SPS11) and WFun13) were measured in the survey at T1. The SPS has been translated into Japanese and validated35). The SPS score ranged from 10 to 50, with higher scores being indicative of greater presenteeism. The score was calculated only when a respondent endorsed any of 13 chronic conditions35). The WFun is a seven-item self-rated scale that measures current on-the-job work performance, developed and validated in Japan13, 14, 36). The total score of WFun ranged from 7 to 35, with higher scores being indicative of greater presenteeism. In the survey at T2, one question, A13 from the Clinical Trials Baseline Version of the WHO-HPQ, was added to ask about respondents’ perceived relative presenteeism using a seven-point response option, with a greater score being indicative of poorer relative presenteeism (see Appendix 1). To ascertain the responsiveness, in the survey at T2, a single-item question asked whether a respondent had better or poorer work performance compared to two weeks ago, with a seven-point response option (e.g., 1=much better, 4=no change, and 7=much worse).

Other covariates

For mental health conditions, the K6 was used to assess psychological distress37, 38). In addition, the SPS lists 13 chronic conditions11); one item from this list was used to determine whether a respondent had depression/anxiety disorder. Selected psychosocial job conditions, i.e., job control and supervisor and coworker support, were measured using the Brief Job Stress Questionnaire, which has been well validated in Japan36). Information on sex, age, occupation, and educational attainment was also collected in the survey at T1.

Statistical analysis

The authors made corrections to some apparently careless input values. For instance, if a respondent was not a manager, his/her expected work hours (B2) were set as 40 h per week, which is the legal requirement for regular work hours in Japan. For respondents who reported 0 h on B6, the response was replaced with an estimation based on B5a to B5e. Minimum, maximum, and average scores of WHO-HPQ measures were calculated. For testing the construct validity, the COSMIN taxonomy integrates convergent, discriminant and known groups validity into one single concept, i.e., the “hypothesis testing”23). Also in the COSMIN taxonomy, the criterion validity is only used when it is compared with a “gold” standard, such as objectively measured absenteeism and presenteeism. In the present study, we investigated only the hypothesis testing for the validity (see the list of hypotheses in the Appendix 3). Pearson’s correlation coefficients for the measures of the WHO-HPQ (absolute and relative absenteeism measures, both for 4 wk and 7 d, at T1, absolute presenteeism at T1, and relative presenteeism measures, both ratio and subtract, at T2) were calculated with the three other presenteeism scales and the five covariates to examine the construct validity (hypothesis testing). Test-retest reliability was measured by the intraclass correlation coefficients (ICCs) of the measures from the WHO-HPQ at T1 and T2 in a one-way random model, only for those who reported no changes in their work performance between T1 and T2 (4=no change on the 7-point single question on the self-reported changes of work performance), following the definition of test-retest reliability of COSMIN23). The responsiveness was tested by the Pearson’s correlation coefficients between the T1–T2 changes of the WHO-HPQ measures and the self-reported changes in work performance assessed at T2. Because the Internet survey required the participants to answer all items, there were no missing values on any variables or items. 
The IBM SPSS Software (ver. 22) was used for the analyses. The statistical significance of the correlations was assessed with two-sided tests at an alpha level of 0.05.
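The one-way random-effects ICC mentioned above can be computed directly from the between-subject and within-subject mean squares. The sketch below implements the single-measure form, ICC(1,1); the paper does not state whether single or average measures were reported, so that choice, the function name, and the example data are assumptions for illustration only.

```python
from statistics import mean


def icc_oneway(scores: list) -> float:
    """Single-measure ICC from a one-way random-effects model.

    `scores` holds one row per respondent and one column per
    occasion (e.g., [T1, T2]); all rows must have the same length.
    """
    n = len(scores)     # number of respondents
    k = len(scores[0])  # number of measurement occasions
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    # Mean square between respondents
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    # Mean square within respondents (occasion-to-occasion variation)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical absolute presenteeism scores at T1 and T2 for four respondents
demo = [[60, 70], [80, 80], [50, 60], [90, 80]]
print(round(icc_oneway(demo), 3))
```

When every respondent gives identical answers at T1 and T2 the within-subject mean square is zero and the ICC equals 1; larger occasion-to-occasion differences pull it down.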

Results

Participants

All participants at T1 participated in the survey at T2. The following respondents were excluded from the analysis: those who reported being employed 97 h per week (n=3) and those who had a large discrepancy between the value reported on B6 and the one estimated from B5a–B5e (more than 2 SDs, i.e., 152) (n=7). The final sample for the analysis included 92 respondents (Table 1). Almost half were women, and the average age was 43.1 yr. Most were white-collar workers such as clerks, professionals and technicians, and managers. On average, they worked about eight hours longer in the past week than the labor standard work hours (i.e., 40 h per week). About half were university graduates. About 60% were currently married, and a similar proportion had a child. Less than half (45.7%) had any of the 13 chronic medical conditions. Back/neck disorders were most frequent (16.3%), followed by depression/anxiety disorder (10.9%).
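Table 1. Demographic, occupational and health-related characteristics of the respondents (n=92)

| Characteristic | n | % | Mean | SD |
|---|---|---|---|---|
| Sex (women) | 43 | 46.7 | | |
| Age (yr) | | | 43.1 | 11.2 |
| 20–34 | 24 | 26.1 | | |
| 35–49 | 38 | 41.3 | | |
| 50–60 | 30 | 32.6 | | |
| Occupation | | | | |
| Managers | 18 | 19.6 | | |
| Professionals/technicians | 19 | 20.7 | | |
| Clerks | 33 | 35.9 | | |
| Service workers | 9 | 8.7 | | |
| Production/machine operators | 14 | 15.2 | | |
| Work hours in the past week | | | 47.9 | 12.0 |
| Education (university or higher) | 48 | 52.2 | | |
| Marital status (married) | 57 | 62.0 | | |
| Having a child (any) | 54 | 58.7 | | |
| Chronic conditions | | | | |
| Allergy | 7 | 7.6 | | |
| Stomach/bowels | 6 | 6.5 | | |
| Asthma | 2 | 2.2 | | |
| Back/neck disorders | 15 | 16.3 | | |
| Heart or circulatory | 2 | 2.2 | | |
| Depression/anxiety disorder | 10 | 10.9 | | |
| Diabetes | 3 | 3.3 | | |
| Arthritis/joint pain | 8 | 8.7 | | |
| Migraine/chronic headache | 9 | 9.8 | | |
| Hearing impairment | 4 | 3.3 | | |
| Vision impairment | 3 | 3.3 | | |
| Skin diseases | 7 | 7.6 | | |
| Others | 3 | 3.3 | | |
| Any of the above | 42 | 45.7 | | |
| WFun presenteeism score T1 (7–35) | | | 15.1 | 6.6 |
| SPS presenteeism score T1 (1–50) | | | 32.9 | 6.5 |
| Perceived relative presenteeism T2 (1–7) | | | 3.0 | 1.4 |
| Psychological distress (K6) T1 (0–24) | | | 12.1 | 5.5 |
| Job control score T1 (3–12) | | | 7.9 | 2.0 |
| Supervisor support score T1 (3–12) | | | 6.7 | 2.1 |
| Coworker support score T1 (3–12) | | | 6.8 | 2.2 |

SD: standard deviation.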

Construct validity

Average values of both the absolute and relative absenteeism measures were negative, indicating that, on average, respondents worked longer than expected (Table 2). The scores of all absolute and relative absenteeism measures showed a unimodal, right-skewed distribution. No significant correlation was observed between any of the absenteeism measures and any of the other presenteeism scales. For the hypothesized correlations with health conditions and psychosocial job conditions, only supervisor support significantly and negatively correlated with four-week absolute absenteeism (p=0.044).
Table 2. Pearson’s correlation coefficients of the WHO-HPQ measures of absenteeism and presenteeism with other presenteeism measures, health conditions, and psychosocial job conditions in a sample of full-time employees in Japan (n=92)†

Correlation cells show Pearson’s correlation coefficient (p value). SPS, WFun, and perceived relative presenteeism are other presenteeism measures; K6 and depression/anxiety are health conditions; job control, supervisor support, and coworker support are psychosocial job conditions.

| Measure | Average / SD | Min, Max | SPS (T1)‡ | WFun (T1) | Perceived relative presenteeism (T2) | K6 (T1) | Depression/anxiety (T1) | Job control (T1) | Supervisor support (T1) | Coworker support (T1) |
|---|---|---|---|---|---|---|---|---|---|---|
| Absenteeism (4 wk) | | | | | | | | | | |
| Absolute absenteeism T1 (hr) | −21.2 / 40.7 | −232.0, 64.0 | −0.01 (0.950) | 0.014 (0.898) | −0.11 (0.298) | −0.094 (0.373) | −0.021 (0.843) | 0.09 (0.393) | −0.210* (0.044) | −0.145 (0.168) |
| Relative absenteeism T1 | −0.14 / 0.27 | −1.45, 0.19 | −0.013 (0.935) | 0.03 (0.778) | −0.091 (0.390) | −0.072 (0.496) | −0.016 (0.877) | 0.074 (0.486) | −0.196 (0.061) | −0.147 (0.161) |
| Absenteeism (7 d) | | | | | | | | | | |
| Absolute absenteeism T1 (hr) | −36.6 / 42.5 | −176.0, 40.0 | −0.114 (0.474) | −0.097 (0.355) | −0.011 (0.917) | −0.103 (0.330) | 0.101 (0.340) | −0.002 (0.981) | −0.037 (0.724) | 0.038 (0.716) |
| Relative absenteeism T1 | −0.23 / 0.26 | −1.10, 0.17 | −0.117 (0.460) | −0.101 (0.339) | −0.012 (0.91) | −0.105 (0.321) | 0.09 (0.391) | 0.004 (0.968) | −0.034 (0.75) | 0.042 (0.692) |
| Presenteeism (work performance) | | | | | | | | | | |
| Absolute presenteeism T1 (0–100)§ | 60.5 / 17.8 | 0.0, 100.0 | −0.304 (0.050) | −0.341** (0.001) | −0.343** (0.001) | −0.247* (0.018) | −0.07 (0.507) | 0.201 (0.055) | 0.025 (0.813) | 0.162 (0.123) |
| Relative presenteeism T2 (ratio)§ | 1.05 / 0.33 | 0.25, 2.00 | 0.041 (0.794) | −0.052 (0.622) | −0.358** (<0.001) | 0.061 (0.562) | −0.183 (0.081) | 0.146 (0.164) | −0.029 (0.787) | 0.045 (0.667) |
| Relative presenteeism T2 (subtract)§ | 1.820 | −8.00, 8.00 | −0.004 (0.981) | −0.079 (0.453) | −0.383** (<0.001) | 0.043 (0.685) | −0.138 (0.190) | 0.199 (0.057) | −0.033 (0.751) | 0.054 (0.611) |

†T1: assessed at the baseline; T2: assessed at the follow-up survey two weeks later. ‡Only asked when a respondent had any of 13 health problems (N=42). §A greater score is indicative of better work performance. *p<0.05, **p<0.01.

The scores of the absolute and relative presenteeism measures showed a unimodal, left-skewed distribution. The average score of the absolute presenteeism measure at T1 was about 60. The average of relative presenteeism (subtraction) at T2 was small but positive, indicating that participants rated their work performance slightly better than that of others on average. For the hypothesized correlations with the other presenteeism scales, the absolute presenteeism measure significantly and negatively correlated with WFun and perceived relative presenteeism (p=0.001); it also correlated marginally significantly and negatively with SPS (p=0.050). For the hypothesized correlations with health conditions and psychosocial job conditions, absolute presenteeism significantly and negatively correlated only with K6 (p=0.018), and marginally significantly and positively with job control (p=0.055). For the hypothesized correlation with perceived relative presenteeism, perceived relative presenteeism significantly and negatively correlated with the WHO-HPQ relative presenteeism measures (both ratio and subtraction) at T2 (p<0.001). None of the health conditions or psychosocial job conditions significantly correlated with the relative presenteeism measures. Sex, age, and education did not significantly correlate with any WHO-HPQ measure (p>0.05, data available upon request).
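As a rough plausibility check on the p values in Table 2, a correlation coefficient and its sample size can be converted to an approximate two-sided p value with Fisher's z-transformation. This is a sketch under a normal approximation, not the exact t-based test that statistical packages typically report, so small differences from the tabulated values are expected; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def approx_p_value(r: float, n: int) -> float:
    """Approximate two-sided p value for Pearson's r with n pairs.

    Uses Fisher's z-transformation: atanh(r) * sqrt(n - 3) is
    treated as a standard normal deviate under the null hypothesis.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Absolute presenteeism vs. SPS (N=42) and vs. WFun (n=92), from Table 2
print(round(approx_p_value(-0.304, 42), 3))
print(round(approx_p_value(-0.341, 92), 4))
```

The same r can be significant or not depending on n: the SPS correlation is based on only 42 respondents, which is why r=−0.304 sits at the margin while the similar-sized correlations based on 92 respondents are clearly significant.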

Test-retest reliability and sensitivity to change

A total of 64 (63%) participants reported no changes in work performance during the past two weeks. These participants had a significantly lower prevalence of chronic conditions than participants who reported changes (n=17) (39% and 61%, respectively). Otherwise, there was no statistically significant difference between these two groups. The ICC calculated for the participants who reported no changes was sufficiently high for absolute presenteeism (Table 3). ICCs were moderate for four-week absolute and relative absenteeism; ICCs were slightly greater for the seven-day absenteeism measures. The change in work performance over two weeks significantly and positively correlated with the change in absolute absenteeism in the total sample (n=92, r=0.252, p=0.015), but not with the change in other measures (r=0.085 to 0.174, p>0.05).
Table 3. Test-retest reliability (ICC) between two surveys with a 2-wk interval among respondents who reported no change in work performance between T1 and T2 (n=64)

| Measure | ICC | 95%CI |
|---|---|---|
| Absenteeism | | |
| Absolute absenteeism (4 wk) | 0.610 | 0.429–0.743 |
| Relative absenteeism (4 wk) | 0.527 | 0.336–0.690 |
| Absolute absenteeism (7 d) | 0.649 | 0.480–0.771 |
| Relative absenteeism (7 d) | 0.647 | 0.478–0.770 |
| Presenteeism† | | |
| Absolute presenteeism | 0.730 | 0.591–0.827 |

ICC: intraclass correlation; 95%CI: 95% confidence intervals. †Relative presenteeism measures were not tested for the test-retest reliability because the measures were modified at T2.

Discussion

The aim of the study was to preliminarily investigate the test-retest reliability and construct validity of the WMHJ version of the WHO-HPQ in Japan. The absolute and relative absenteeism measures and the absolute presenteeism measure of the WMHJ version of the WHO-HPQ were stable over a two-week period (test-retest reliability). Among the eight correlations hypothesized, the absolute presenteeism measure significantly correlated with WFun, perceived relative presenteeism, and psychological distress; this measure also marginally significantly correlated with SPS and job control. These findings provided some, but limited, support for the construct validity of this measure. As hypothesized, the relative presenteeism measures significantly correlated with perceived relative presenteeism. Otherwise, none of the WHO-HPQ measures significantly correlated with the variables used for hypothesis testing.
We found significant and moderate correlations of the absolute presenteeism measure with WFun and perceived relative presenteeism, and a marginally significant and moderate correlation with SPS. These findings are consistent with our hypothesis. The WHO-HPQ presenteeism measure asks about presenteeism arising from any reason, including health problems and work environment29), while SPS and WFun assess presenteeism due only to health problems. This could explain the moderate correlations of the WHO-HPQ absolute presenteeism score with SPS and WFun. The WHO-HPQ presenteeism measure may capture a different aspect of presenteeism than that measured by SPS and WFun. The WHO-HPQ absolute presenteeism measure significantly correlated with K6, and marginally significantly with job control. This is in line with previous findings that presenteeism was associated with poor mental health conditions26) and job control27). However, since only three of the eight correlations tested were significant, the present study provides only limited support for the construct validity of the WHO-HPQ absolute presenteeism measure. The construct validity should be investigated further, in particular with variables that could be more closely associated with absolute presenteeism, such as a scale of presenteeism from any reason, not limited to health problems, or health status (e.g., musculoskeletal symptoms). On the other hand, the absolute presenteeism measure was quite stable over a two-week period (ICC, 0.73). This test-retest reliability is better than the moderate two-week test-retest reliability (ICC, 0.59) previously reported for this measure19) and close to that of other global performance measures (ICC, 0.69–0.78) previously reported among patients with rheumatic diseases19). The higher ICC in this study may be because we limited the sample to participants who reported no change in work performance. These findings suggest that the absolute presenteeism measure of the WMHJ version of the WHO-HPQ is reliable over a short period (e.g., two weeks) and valid for measuring work performance among Japanese employees.
The WHO-HPQ relative presenteeism measures (both ratio and subtraction) significantly and negatively correlated with perceived relative presenteeism. The relative presenteeism measures did not correlate with SPS or WFun, which are supposed to assess absolute presenteeism. Previous studies reported that relative presenteeism was useful in predicting future mental health problems21, 22). However, in this study, we did not find significant correlations between these measures and any of the health conditions or psychosocial job conditions, providing little support for the construct validity. More research is needed to investigate the construct and predictive validity of the WHO-HPQ relative presenteeism measures. From our experience, it may also be better to add a short sentence to question B9 so that respondents do not misunderstand it: the question asks about the job performance of most workers in a similar job, not the respondent’s own job performance.
The average scores of the absolute and relative presenteeism measures in this study were close to those reported in a previous study from Japan22). However, the scores were lower than those reported in a previous study in the United States, in which median scores of absolute presenteeism were between 80 and 90 among four different samples of workers6). Reporting of presenteeism on the WHO-HPQ may be affected by the norms and culture of the workplace in a given country.
The WHO-HPQ absolute absenteeism measure significantly and negatively correlated with supervisor support, as hypothesized. The change in absolute absenteeism correlated with self-reported changes in one’s own work performance. The test-retest reliability (ICCs) was also moderate. However, the present study did not find much supporting evidence for the construct validity of the absolute or relative absenteeism measures. In addition, about 10% of the sample reported extremely long or conflicting work hours. Some respondents made clear mistakes in entering hours and days in the WHO-HPQ questions about absenteeism. Some full-time non-manager respondents reported that their contract hours were longer than regular work hours (i.e., 40 h per week). This may be because in Japanese culture39), employees are often expected to work outside of their regular work hours. However, such inconsistency in reporting regular work hours among participants is likely to lead to measurement error in relative absenteeism.

Limitations

The following limitations of the study should be noted. Our use of an Internet sample for the sake of convenience may limit the generalizability of the findings, since Internet users tend to have different sociodemographic and psychological characteristics than non-users40, 41). In addition, the present sample included a limited proportion of respondents with blue-collar jobs. The reliability and validity of the WHO-HPQ should be replicated and confirmed in a future study with a larger, more diverse sample of workers. The prevalence of depression/anxiety disorder in this sample was higher than the prevalence reported from a nationally representative survey42), which may further limit the generalizability of the findings. Calculating the response rate was impossible because registered members joined the survey in order of arrival. This may have caused selection bias relative to the Japanese working population. Some of the participants may not have answered the questions carefully. The time frame that we used to investigate the test-retest reliability may not be optimal, because the four-week recall periods of participants’ first and second assessments were not the same. This could still be the case even though we limited the analysis to participants who reported at T2 no change in their work performance in the past two weeks, and it could lead to underestimation of the test-retest reliability in our study. We did not use doctors’ diagnoses of health problems. Self-reported health problems may be more strongly associated with self-reported work performance. We did not use objective measures of absenteeism (e.g., a company record) or presenteeism (e.g., a manager’s evaluation of participants’ job performance) to test the criterion validity. Finally, the selection of covariates to test the construct validity was arbitrary, not systematic. Some covariates may not have been appropriate. This may lead to underestimation of the construct validity of the instrument. In future research, the covariates for testing the construct validity should be selected based on a systematic review of the relevant literature.

Practical implications

As a practical implication of the study findings, the absolute presenteeism measure of the WMHJ version of the WHO-HPQ may be used as a reliable measure of presenteeism among Japanese workers. However, this measure should be tested further for its construct validity and used with caution, because it reflects presenteeism from any reason, unlike other presenteeism scales such as SPS and WFun, which assess presenteeism solely from health problems. Further research is needed to clarify the validity of the other measures of the WMHJ version of the WHO-HPQ.

Conclusion

The study found some support for the construct validity and test-retest reliability of the absolute presenteeism measure of the WMHJ version of the WHO-HPQ among Japanese workers. Further research is needed to clarify the construct validity of other measures of this instrument.

Author Contribution

Conceptualization, N.K., A.I., and M.T.; Methodology, N.K., A.I., M.T., K.W., K.I., and D.N.; Investigation, K.W. and M.I.; Formal analysis, N.K.; Writing, N.K., A.I., M.T., K.W., K.I., M.I., and D.N.; Funding Acquisition, N.K.

Funding

This work was supported by JSPS KAKENHI Grant Number JP 18H04072.

Conflict of Interest

M.T. is an employee of the ADVANTAGE Risk Management Co., Ltd., Tokyo, Japan. Neither the funder nor the employer participated in designing the study, collecting and analyzing the data or preparing the manuscript. Otherwise, the authors declare no conflict of interest.
| Measure | Description | Calculation formula (for the items used, B4–B11, refer to Appendix 1) |
|---|---|---|
| Absenteeism (4 wk) | | |
| Absolute absenteeism (hr) | Difference (deficit) of actual work hours compared to standard working hours in the last four weeks | 4 × Standard working hours per week (B4) − Hours worked in the last four weeks (B6) |
| Relative absenteeism | Proportion of difference (deficit) of actual work hours compared to standard working hours relative to standard working hours in the last four weeks | [4 × Standard working hours per week (B4) − Hours worked in the last four weeks (B6)]/[4 × Standard working hours per week (B4)] |
| Absenteeism (7 d) | | |
| Absolute absenteeism (hr) | Difference (deficit) of actual work hours compared to standard working hours in the last seven days | 4 × Standard working hours per week (B4) − 4 × Hours worked in the last seven days (B3) |
| Relative absenteeism | | [4 × Standard working hours per week (B4) − 4 × Hours worked in the last seven days (B3)]/[4 × Standard working hours per week (B4)] |
| Presenteeism (work performance) | | |
| Absolute presenteeism | Work performance (i.e., the quality of work) rated by a respondent | 10 × Self-reported work performance (B11, ranging 0–10) |
| Relative presenteeism (ratio) | Work performance (i.e., the quality of work) rated by a respondent relative to the work performance of most other workers at the same job, also rated by the respondent, calculated as a ratio | Self-reported work performance (B11)/Work performance of most workers at the same job (B9), restricted to the range of 0.25 to 2 |
| Relative presenteeism (subtraction) | Work performance (i.e., the quality of work) rated by a respondent relative to the work performance of most other workers at the same job, also rated by the respondent, calculated as a difference | 10 × [Self-reported work performance (B11) − Work performance of most workers at the same job (B9)] |
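
The scoring rules in the table above can be illustrated with a minimal Python sketch. It is not the authors' scoring program: the names `HpqItems` and `hpq_scores` are hypothetical, the ratio version of relative presenteeism is taken as B11/B9 restricted to 0.25 to 2 and the subtraction version as 10 × (B11 − B9), following the row labels, and both relative absenteeism scores are assumed to be the corresponding absolute score divided by 4 × B4. Returning `None` for the ratio when B9 is 0 is likewise an assumption, since the table does not specify that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HpqItems:
    """Hypothetical container for the WHO-HPQ items used in the table."""
    b3: float   # hours worked in the last seven days
    b4: float   # standard working hours per week
    b6: float   # hours worked in the last four weeks
    b9: float   # performance of most workers at the same job (0-10)
    b11: float  # self-reported work performance (0-10)


def hpq_scores(x: HpqItems) -> dict:
    """Compute absenteeism and presenteeism scores from the item responses."""
    expected = 4 * x.b4  # standard working hours over four weeks

    abs_abs_4wk = expected - x.b6
    abs_abs_7d = expected - 4 * x.b3  # seven-day hours scaled to four weeks

    # Assumption: the ratio is undefined when B9 is 0, so None is returned.
    ratio: Optional[float] = None
    if x.b9 != 0:
        ratio = min(max(x.b11 / x.b9, 0.25), 2.0)  # restrict to 0.25-2

    return {
        "absolute_absenteeism_4wk": abs_abs_4wk,
        "relative_absenteeism_4wk": abs_abs_4wk / expected,
        "absolute_absenteeism_7d": abs_abs_7d,
        "relative_absenteeism_7d": abs_abs_7d / expected,
        "absolute_presenteeism": 10 * x.b11,
        "relative_presenteeism_ratio": ratio,
        "relative_presenteeism_subtraction": 10 * (x.b11 - x.b9),
    }


# Illustrative (made-up) responses, not data from the study.
example = hpq_scores(HpqItems(b3=36, b4=40, b6=150, b9=6, b11=8))
```

With these made-up responses, the worker is 10 hours short of the 160 standard hours over four weeks and rates their own performance 2 points above that of most co-workers. Higher absenteeism scores indicate more lost hours, whereas higher presenteeism scores indicate better performance.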
Other presenteeism measures | Health conditions | Psychosocial job conditions

WHO-HPQ measures | Stanford Presenteeism Scale (SPS) (T1) | WFun (T1) | Perceived relative presenteeism (T2) | K6 (T1) | Depression/anxiety (T1) | Job control (T1) | Supervisor support (T1) | Coworker support (T1)
Absenteeism (absolute)XXXXX
Absenteeism (relative)XXXXX
Presenteeism (absolute)XXXXXXXX
Presenteeism (relative)XXXXXX

† T1: assessed at the baseline; T2: assessed at the follow-up survey two weeks later.

  39 in total

1.  Stanford presenteeism scale: health status and employee productivity.

Authors:  Cheryl Koopman; Kenneth R Pelletier; James F Murray; Claire E Sharda; Marc L Berger; Robin S Turpin; Paul Hackleman; Pamela Gibson; Danielle M Holmes; Talor Bendel
Journal:  J Occup Environ Med       Date:  2002-01       Impact factor: 2.162

2.  Relationship between sickness presenteeism (WHO-HPQ) with depression and sickness absence due to mental disease in a cohort of Japanese workers.

Authors:  Tomoko Suzuki; Koichi Miyaki; Yixuan Song; Akizumi Tsutsumi; Norito Kawakami; Akihito Shimazu; Masaya Takahashi; Akiomi Inoue; Sumiko Kurioka
Journal:  J Affect Disord       Date:  2015-03-28       Impact factor: 4.839

Review 3.  Prevalence of mental disorders and mental health service use in Japan.

Authors:  Daisuke Nishi; Hanako Ishikawa; Norito Kawakami
Journal:  Psychiatry Clin Neurosci       Date:  2019-06-18       Impact factor: 5.188

4.  Construct Validity and Scoring Methods of the World Health Organization: Health and Work Performance Questionnaire Among Workers With Arthritis and Rheumatological Conditions.

Authors:  Rawan AlHeresh; Michael P LaValley; Wendy Coster; Julie J Keysor
Journal:  J Occup Environ Med       Date:  2017-06       Impact factor: 2.162

5.  Test-retest Reliability and Correlations of 5 Global Measures Addressing At-work Productivity Loss in Patients with Rheumatic Diseases.

Authors:  Sarah Leggett; Antje van der Zee-Neuen; Annelies Boonen; Dorcas E Beaton; Mihai Bojinca; Ailsa Bosworth; Sabrina Dadoun; Bruno Fautrel; Sofia Hagel; Catherine Hofstetter; Diane Lacaille; Denise Linton; Carina Mihai; Ingemar F Petersson; Pam Rogers; Jamie C Sergeant; Carlo Sciré; Suzanne M M Verstappen
Journal:  J Rheumatol       Date:  2015-12-01       Impact factor: 4.666

6.  [The effect of chronic health conditions on work performance in Japanese companies].

Authors:  Koji Wada; Mio Moriyama; Rie Narai; Hiroyuki Tahara; Ritsuko Kakuma; Toshihiko Satoh; Yoshiharu Aizawa
Journal:  Sangyo Eiseigaku Zasshi       Date:  2007-05

Review 7.  The economic burden of depression and the cost-effectiveness of treatment.

Authors:  Philip S Wang; Gregory Simon; Ronald C Kessler
Journal:  Int J Methods Psychiatr Res       Date:  2003       Impact factor: 4.035

8.  Telephone screening, outreach, and care management for depressed workers and impact on clinical and work productivity outcomes: a randomized controlled trial.

Authors:  Philip S Wang; Gregory E Simon; Jerry Avorn; Francisca Azocar; Evette J Ludman; Joyce McCulloch; Maria Z Petukhova; Ronald C Kessler
Journal:  JAMA       Date:  2007-09-26       Impact factor: 56.272

9.  Cost of lost productive work time among US workers with depression.

Authors:  Walter F Stewart; Judith A Ricci; Elsbeth Chee; Steven R Hahn; David Morganstein
Journal:  JAMA       Date:  2003-06-18       Impact factor: 56.272

10.  Methodological issues in Internet-mediated research: a randomized comparison of internet versus mailed questionnaires.

Authors:  Lisa Whitehead
Journal:  J Med Internet Res       Date:  2011-12-04       Impact factor: 5.428

