
Self-report fatigue questionnaires in multiple sclerosis, Parkinson's disease and stroke: a systematic review of measurement properties.

Roy G Elbers, Marc B Rietberg, Erwin E H van Wegen, John Verhoef, Sharon F Kramer, Caroline B Terwee, Gert Kwakkel.

Abstract

PURPOSE: To critically appraise, compare and summarize the measurement properties of self-report fatigue questionnaires validated in patients with multiple sclerosis (MS), Parkinson's disease (PD) or stroke.
METHODS: MEDLINE, EMBASE, PsycINFO, CINAHL and SPORTdiscus were searched. The COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist was used to assess the methodological quality of studies. A qualitative data synthesis was performed to rate the measurement properties for each questionnaire.
RESULTS: Thirty-eight studies out of 5,336 records met the inclusion criteria, evaluating 31 questionnaires. Moderate evidence was found for adequate internal consistency and structural validity of the Fatigue Scale for Motor and Cognitive functions (FSMC) and for adequate reliability and structural validity of the Unidimensional Fatigue Impact Scale (U-FIS) in MS.
CONCLUSIONS: We recommend the FSMC and U-FIS in MS. The Functional Assessment of Chronic Illness Therapy Fatigue subscale (FACIT-F) and Fatigue Severity Scale (FSS) show promise in PD, and the Profile of Mood States Fatigue subscale (POMS-F) for stroke. Future studies should focus on measurement error, responsiveness and interpretability. Studies should also put emphasis on providing input for the theoretical construct of fatigue, allowing the development of questionnaires that reflect generic and disease-specific symptoms of fatigue.

Year:  2011        PMID: 22012025      PMCID: PMC3389599          DOI: 10.1007/s11136-011-0009-2

Source DB:  PubMed          Journal:  Qual Life Res        ISSN: 0962-9343            Impact factor:   4.147


Introduction

Fatigue is common in chronic neurological disorders [1]. Prevalence rates in conditions often seen in neurological rehabilitation, such as multiple sclerosis (MS), Parkinson’s disease (PD) and stroke, range from 58% [2] to 90% [3]. One of the challenges in assessing fatigue is the lack of a widely accepted definition [4] and, with that, the difficulty of differentiating its many dimensions [2, 5]. Fatigue usually refers to difficulty in initiating or sustaining voluntary activity [6]. Its multidimensionality is believed to result from a complex interplay between the underlying disease process, peripheral control systems (i.e. muscle fatigability), central control systems (i.e. subjective sense of fatigue) and environmental factors [6]. This multidimensionality may explain the large number of generic and disease-specific self-report questionnaires that are available to measure fatigue, as either a multidimensional or a unidimensional construct, in patients considered for rehabilitation services. These questionnaires may measure different aspects or even different theoretical constructs of fatigue [7]. The clinician or researcher has to consider that each questionnaire is characterized by its own underlying concept, measurement properties and practical feasibility. A systematic review of the characteristics and measurement properties of self-report fatigue questionnaires can assist in selecting an appropriate questionnaire to evaluate fatigue in patients with MS, PD and stroke.
Several systematic reviews [7-13] have evaluated the measurement properties of fatigue questionnaires. Three of these reviews [7, 12, 13] focused on patients with chronic disease, including samples of patients with MS and PD. Unfortunately, no recommendations were made specifically for patients with MS or PD. One review [10] focused on patients with MS; its authors recommended the Fatigue Impact Scale (FIS) and the Modified Fatigue Impact Scale (MFIS) [10]. Another review [8] recommended the Multidimensional Fatigue Inventory (MFI) and the Fatigue Severity Scale (FSS) for patients with PD. No systematic review evaluated questionnaires validated in patients with stroke. A limitation of the aforementioned reviews is that no uniform definitions and standards were used to assess the methodological quality of the included studies. Consequently, the methodological quality of these studies was not taken into account when formulating conclusions, which makes it difficult to judge the strength of the evidence underlying the recommendations. Recently, the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist [14] was developed to systematically evaluate the methodological quality of studies on measurement properties. This makes it possible to appraise the methodological quality of the included studies and to take it into account when formulating conclusions. The aim of the present study was to critically appraise, compare and summarize the quality of the measurement properties of all published self-report fatigue questionnaires validated in patients with MS, PD or stroke, in order to assist clinicians and researchers in selecting a fatigue questionnaire.

Methods

Search

Five databases were searched up to November 2010: MEDLINE (1966–2010), EMBASE (1974–2010), PsycINFO (1806–2010), CINAHL (1981–2010) and SPORTdiscus (1985–2010). Text words and MeSH terms for fatigue, MS, PD and stroke were combined with a sensitive filter (designed for PubMed) to identify studies on measurement properties of self-report questionnaires [15] (see supplementary file 1). References of the included studies were screened for additional articles.

Selection of studies

Two reviewers (RE/EvW) independently screened all titles and abstracts. The full-text papers of relevant studies were obtained, and two reviewers (RE/MR) independently applied the a priori defined criteria for study selection. Studies were included if they met the following criteria: the study (1) focused on the development or evaluation of measurement properties of self-report questionnaires that assess subjective fatigue; (2) included patients with a clinical diagnosis of MS, PD or stroke and (3) included questionnaires that could be used for evaluative purposes. Studies were excluded if the study (1) explicitly focused on the diagnostic test accuracy of the included questionnaire(s) or (2) was published in a language other than Dutch, English, French or German. In case of disagreement, a third reviewer (EvW) was asked for advice to reach consensus.

Assessment of methodological quality

The methodological quality of a study was evaluated using the COSMIN checklist [14]. This checklist consists of 114 items, grouped in twelve boxes. Nine of these boxes contain standards for measurement properties (i.e. internal consistency, reliability, measurement error, content validity, structural validity, hypotheses testing, cross-cultural validity, criterion validity and responsiveness). One box contains standards for studies on interpretability, which is an important characteristic of a measurement scale [16]. In addition, two boxes contain requirements for studies in which Item Response Theory (IRT) methods are applied, and requirements for the generalizability of the results, respectively [14]. Each item was scored on a 4-point rating scale (i.e. ‘poor’, ‘fair’, ‘good’, or ‘excellent’) [17]. The methodological quality of a study was evaluated per measurement property and determined by the lowest rating of any of the items in a box. Pairs of reviewers (RE/EvW, RE/JV, RE/MR or RE/SK) independently scored the methodological quality of the included studies. Disagreement was resolved during consensus meetings.
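The rule that a box is rated by its lowest item rating can be sketched in a few lines of Python. This is only an illustration of the rule as described above; the function name and the example item ratings are hypothetical and are not taken from the COSMIN checklist itself.

```python
# Ordinal order of the 4-point item ratings, lowest first.
RATING_ORDER = ["poor", "fair", "good", "excellent"]


def box_quality(item_ratings):
    """Return the lowest item rating in one box (hypothetical helper)."""
    if not item_ratings:
        raise ValueError("a box needs at least one rated item")
    # min() with the ordinal position as key picks the worst rating.
    return min(item_ratings, key=RATING_ORDER.index)


# Hypothetical item ratings for one study's internal consistency box.
example_box = ["excellent", "good", "fair", "good"]
print(box_quality(example_box))
```

A single 'fair' item is enough to cap the whole box at 'fair', which is why one weak design element can determine the quality rating for a measurement property.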

Data extraction

A data extraction form was designed and tested before the pairs of reviewers independently extracted data on the: (1) characteristics of the study samples; (2) characteristics of the questionnaires (i.e. language version, theoretical construct of fatigue and dimensions, recall period, number of items, response options, range of scores, time to administer and ease of scoring); (3) evaluated measurement properties and (4) the interpretability and generalizability of the results.

Data synthesis

The theoretical construct of fatigue measured by a questionnaire was categorized as either ‘impact of fatigue on daily life’, ‘fatigue severity’ or ‘factors influencing fatigue’. Ease of scoring was categorized as ‘easy’ if items were simply summed, ‘moderate’ if a visual analogue scale (VAS) or simple formula was used, or ‘difficult’ if either a VAS in combination with a formula or a complex formula was used. Measurement properties were summarized according to the COSMIN taxonomy [16]. For each study, the estimates of the investigated measurement properties were rated as ‘adequate’ (+), ‘not adequate’ (−) or ‘unclear’ (?), based on predefined criteria [18] as described below. A qualitative data synthesis was performed to determine the overall quality of the measurement properties for each self-report questionnaire by taking into account the: (1) ratings for each measurement property; (2) consistency of results between studies; (3) methodological quality of studies and (4) the number of studies that investigated the measurement property. The possible overall quality of a measurement property was either ‘adequate’ (+), ‘not adequate’ (−), ‘conflicting’ (±) or ‘unclear’ (?). As shown in Table 1, levels of evidence were defined to express whether the strength of the evidence for the overall quality was, for example, convincing (‘strong’ level of evidence) or unconvincing (‘unknown’ level of evidence) [19].
Table 1

Levels of evidence for the overall quality of a measurement property

Level | Rating | Criteria
Strong | ‘Adequate’ or ‘Not adequate’ (+ or −) | Consistent findings in multiple studies of ‘good’ methodological quality OR in one study of ‘excellent’ methodological quality
Moderate | ‘Adequate’ or ‘Not adequate’ (+ or −) | Consistent findings in multiple studies of ‘fair’ methodological quality OR in one study of ‘good’ methodological quality
Limited | ‘Adequate’ or ‘Not adequate’ (+ or −) | One study of ‘fair’ methodological quality
Conflicting | ‘Conflicting’ (±) | Conflicting findings
Unknown | ‘Unknown’ (?) | Only studies of ‘poor’ methodological quality
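The criteria in Table 1 can be read as a small decision rule. The sketch below is one possible encoding, not the procedure the reviewers actually followed (theirs was a qualitative synthesis). It assumes that studies of 'poor' quality are set aside before consistency is judged, that any disagreement among the remaining studies counts as 'conflicting', and that each study carries a '+' or '−' rating; Table 1 does not spell out these details, and the function name is hypothetical.

```python
def level_of_evidence(studies):
    """Map (quality, rating) pairs for one measurement property to a level.

    quality is 'poor', 'fair', 'good' or 'excellent'; rating is '+' or '-'.
    Returns (level, overall_rating). Assumption: 'poor' studies are ignored
    unless they are the only evidence available.
    """
    usable = [(q, r) for q, r in studies if q != "poor"]
    if not usable:
        return ("unknown", "?")
    ratings = {r for _, r in usable}
    if len(ratings) > 1:
        return ("conflicting", "±")
    rating = ratings.pop()
    qualities = [q for q, _ in usable]
    n_good = qualities.count("good")
    n_fair = qualities.count("fair")
    if "excellent" in qualities or n_good >= 2:
        return ("strong", rating)
    if n_good == 1 or n_fair >= 2:
        return ("moderate", rating)
    return ("limited", rating)


# Hypothetical evidence base: one 'good' study with an adequate result.
print(level_of_evidence([("good", "+")]))
```

Under these assumptions, a single study of 'good' quality yields moderate evidence, which matches the way the abstract describes the evidence for the FSMC and U-FIS.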

Criteria for the quality of measurement properties

Reliability

The domain reliability contains three measurement properties: internal consistency, reliability and measurement error [16]. Internal consistency is the degree of interrelatedness among the items, assuming the questionnaire to be unidimensional [16]. Cronbach’s α was considered an acceptable measure of internal consistency and was scored adequate if it ranged between 0.70 and 0.95 [18]. If a questionnaire was multidimensional, internal consistency was considered per subscale. Reliability was defined as the proportion of the total variance in the measurements that is due to ‘true’ differences between patients [16]. The intraclass correlation coefficient (ICC) and weighted kappa were considered acceptable measures of reliability and were scored adequate if they were ≥0.70 [18]. If a Pearson or Spearman correlation coefficient (CC) was presented, neither of which accounts for systematic differences between two tests [20], an estimate of ≥0.80 was considered adequate. Measurement error, defined as the systematic and random error of a score that is not attributed to true changes in the construct to be measured [16], was scored adequate if the smallest detectable change (SDC) was smaller than the minimal important change (MIC), or if the MIC was outside the limits of agreement (LOA) [18].
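As an illustration of these thresholds, the following sketch computes Cronbach's α with the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of the total score), and applies the cut-offs quoted above. The function names, the statistic labels and the tiny data set are hypothetical. The measurement error check covers only the SDC versus MIC comparison, not the alternative criterion based on the limits of agreement.

```python
from statistics import variance


def cronbach_alpha(scores):
    """scores: one list of k item scores per respondent (k >= 2)."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def rate_internal_consistency(alpha):
    # Adequate only inside the 0.70-0.95 band; values above 0.95 fail too.
    return "+" if 0.70 <= alpha <= 0.95 else "-"


def rate_reliability(estimate, statistic):
    # ICC and weighted kappa: >= 0.70; Pearson/Spearman: >= 0.80.
    thresholds = {"icc": 0.70, "weighted_kappa": 0.70,
                  "pearson": 0.80, "spearman": 0.80}
    return "+" if estimate >= thresholds[statistic] else "-"


def rate_measurement_error(sdc, mic):
    # Adequate if the smallest detectable change is below the MIC.
    return "+" if sdc < mic else "-"


# Hypothetical two-item data with perfectly parallel items.
toy = [[1, 2], [2, 3], [3, 4]]
alpha = cronbach_alpha(toy)
print(alpha, rate_internal_consistency(alpha))
```

With the toy data α equals 1.0, which is rated not adequate: the upper bound of 0.95 treats very high values as a sign of redundant items rather than as better consistency.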

Validity

Validity contains the measurement properties content validity, construct validity and criterion validity [16]. Content validity includes face validity and extends to the degree to which the content of a questionnaire is an adequate reflection of the construct to be measured [16]. It was rated adequate if the target population and experts considered all items in the questionnaire relevant and considered the questionnaire to be complete. Construct validity was defined as the degree to which scores of a questionnaire are consistent with hypotheses, based on the assumption that the instrument validly measures the construct to be measured [16]. Construct validity is divided into structural validity, hypothesis testing and cross-cultural validity. Structural validity, defined as the degree to which scores of a questionnaire are an adequate reflection of the dimensionality of the construct to be measured [16], was scored adequate if factor analysis showed that all factors together explained ≥50% of the total variance, or when IRT methods were applied to confirm unidimensionality. Hypothesis testing was scored adequate if the correlation with a questionnaire that assessed fatigue (convergent validity) was ≥0.50, or ≥75% of the results were in accordance with a priori defined hypotheses, and the correlations with other constructs (divergent validity) were lower than the correlations with fatigue. A rating of ‘unclear’ was given if only the correlation with questionnaires measuring a construct other than fatigue (divergent validity) was investigated. Cross-cultural validity was defined as the degree to which the performance of the items on a translated or culturally adapted health-related patient-reported outcomes (HR-PRO) instrument is an adequate reflection of the performance of the items of the original version of the HR-PRO instrument [16]. As no gold standard exists for fatigue questionnaires, criterion validity was not evaluated.
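The structural validity and hypothesis testing criteria can be written out as simple checks. The sketch below is a hedged reading of the paragraph above: the function and parameter names are hypothetical, divergent correlations are compared by absolute magnitude (an assumption), and when no convergent correlation is available the divergent comparison is skipped because there is nothing to compare against.

```python
def rate_structural_validity(explained_variance_pct=None,
                             irt_unidimensional=False):
    """'+' if factors explain >= 50% of variance or IRT confirms one dimension."""
    if irt_unidimensional:
        return "+"
    if explained_variance_pct is None:
        return "?"
    return "+" if explained_variance_pct >= 50 else "-"


def rate_hypothesis_testing(convergent_r=None, divergent_rs=(),
                            pct_confirmed=None):
    """Apply the convergent/divergent criteria to one study's results."""
    # Only divergent validity (or nothing) investigated: unclear.
    if convergent_r is None and pct_confirmed is None:
        return "?"
    convergent_ok = convergent_r is not None and convergent_r >= 0.50
    hypotheses_ok = pct_confirmed is not None and pct_confirmed >= 75
    if convergent_r is None:
        divergent_ok = True  # assumption: nothing to compare against
    else:
        divergent_ok = all(abs(r) < abs(convergent_r) for r in divergent_rs)
    return "+" if (convergent_ok or hypotheses_ok) and divergent_ok else "-"


# Hypothetical study: r = 0.62 with another fatigue scale,
# r = 0.30 and r = -0.21 with measures of other constructs.
print(rate_hypothesis_testing(convergent_r=0.62, divergent_rs=[0.30, -0.21]))
```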

Responsiveness

Responsiveness was defined as the ability of a questionnaire to detect change over time in the construct to be measured [16]. Responsiveness refers to the validity of a change score [21] and was scored adequate if the change score correlated ≥0.50 with the change score of an instrument assessing fatigue, or if ≥75% of the results were in accordance with a priori defined hypotheses, or if the area under the receiver operator characteristic curve (AUC) was ≥0.70 [18].
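To make the AUC criterion concrete, the sketch below estimates the AUC as the proportion of patient pairs in which a patient classified as changed has a larger change score than a patient classified as unchanged, counting ties as one half. This pairwise estimate is a standard way of computing the AUC, but it is shown here only as an illustration; the included studies may have used other software or methods. The function names and the change scores are hypothetical, and larger values are assumed to mean more change in the expected direction.

```python
def auc_from_change_scores(changed, unchanged):
    """Pairwise (Mann-Whitney style) estimate of the area under the ROC curve."""
    n_pairs = len(changed) * len(unchanged)
    if n_pairs == 0:
        raise ValueError("both groups need at least one patient")
    wins = sum(1.0 if c > u else 0.5 if c == u else 0.0
               for c in changed for u in unchanged)
    return wins / n_pairs


def rate_responsiveness(change_r=None, pct_confirmed=None, auc=None):
    """'+' if any one of the three criteria is met, '?' if none was reported."""
    if change_r is None and pct_confirmed is None and auc is None:
        return "?"
    adequate = ((change_r is not None and change_r >= 0.50)
                or (pct_confirmed is not None and pct_confirmed >= 75)
                or (auc is not None and auc >= 0.70))
    return "+" if adequate else "-"


# Hypothetical change scores on a fatigue questionnaire.
auc = auc_from_change_scores(changed=[3, 4, 5], unchanged=[1, 2, 3])
print(round(auc, 3), rate_responsiveness(auc=auc))
```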

Interpretability

Interpretability was defined as the degree to which one can assign qualitative meaning to an instrument’s quantitative scores or change in scores. Authors should provide information about clinically relevant differences in scores between subgroups (mean or median with distribution of scores), floor and ceiling effects and the MIC [21]. A floor or ceiling effect was present if >15% of patients achieved the lowest or highest possible score on a questionnaire [18].
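The 15% rule for floor and ceiling effects is easy to check from raw total scores. The following minimal sketch uses a hypothetical function name and a made-up sample scored on a 0 to 32 scale.

```python
def floor_ceiling_effects(scores, min_score, max_score, threshold=0.15):
    """Flag a floor/ceiling effect if more than 15% of patients hit an extreme."""
    n = len(scores)
    if n == 0:
        raise ValueError("no scores supplied")
    floor_fraction = sum(s == min_score for s in scores) / n
    ceiling_fraction = sum(s == max_score for s in scores) / n
    return {"floor": floor_fraction > threshold,
            "ceiling": ceiling_fraction > threshold}


# Hypothetical total scores of ten patients on a 0-32 questionnaire.
sample = [0, 0, 5, 10, 12, 20, 25, 30, 31, 32]
print(floor_ceiling_effects(sample, min_score=0, max_score=32))
```

In this sample two of ten patients (20%) have the lowest possible score, so a floor effect is flagged, whereas only one (10%) has the highest score, so no ceiling effect is flagged.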

Results

The search yielded 5,336 records, of which 56 studies were retrieved in full text for further assessment. This resulted in the exclusion of another 18 studies [10, 22–38] (see Fig. 1). Thirty-eight studies were included in the review, investigating 31 different self-report fatigue questionnaires [3, 39–75]. The FSS was most frequently investigated (n = 20) and the only questionnaire validated in patients with MS, PD and stroke. Characteristics of the included studies are presented in Table 2.
Fig. 1

Flow diagram for study selection

Table 2

Characteristics of included studies

Reference | Population | N | Age, years, mean (SD) | Disease duration, years, mean (SD) | Disease severity (EDSS/S&E/SA-SIP-30), median (IQR) | Questionnaire(s) investigated | Language version
Armutlu [39] | MS | 72 | 38.16 (10.03) | 9.5 (6.43) | EDSS 4.0 (1.0–9.5)^a | FSS | Turkish
Armutlu [40] | MS | 71 | 38.6 (9.9) | 9.42 (6.39) | EDSS 3.94 (1.0–9.5)^a | FIS | Turkish
Benito-León [41] | MS | 68 | 37.0 (9.0) | 6.0 (4.0–10.0)^b | EDSS 2.5 (2.0–4.0) | D-FIS; MFI | Spanish
Brown [42] | PD | 39–495^c | 64.2 (9.6)–70.4 (9.5)^c | 10.0 (7.6)–7.9 (6.7)^c | S&E 66.4 (23.0)–70.3 (15.5)^c | PFS-16 (2); PFS-16 (5); RFS | English
Debouverie [3] | MS | 237 | 42.5 (10.9) | 9.8 (7.4) | EDSS 3.7 (1.7)^d | EMIF-SEP; FIS | French
Doward [43] | MS | 9–167^c | 39.0 (12.9)–54.3 (5.9)^c | 8.4 (11.6)–22.7 (13.7)^c | Not reported | NHP-E; U-FIS | Canadian-English; Canadian-French; French; German; Italian; Swedish; US-English
Fisk [44] | MS | 105 | 42.5 (11.6) | Not reported | Not reported | FIS | English
Flachenecker [45] | MS | 151 | 39.0 (9.3) | 9.9 (6.7) | EDSS 3.5 (0–8.5)^a | FSS; MFIS; MFSS | German
Flachenecker [46] | MS | 67–158^c | 39.2 (8.7)–39.2 (9.2)^c | 9.7 (6.8)–9.9 (6.7) | EDSS 3.5 (0–6.5)^a–3.5 (0–8.5)^a,c | FSS; MFIS; MFSS; WEIMUS | German
Flachenecker [47] | MS | 25–580^c | 44.1 (11.6)–47.2 (11.0)^c | 11.0 (8.1)–15 (9.5)^c | EDSS 4.5 (1–8)^a–5.5 (0–9)^a,c | FSS; MFIS; MFSS; WEIMUS | German
Flensner [48] | MS | 161 | 47.9 (10.1)^e; 48.0 (11.1)^f | Not reported | Not reported | FIS | Swedish
Grace [49] | PD | 50 | 71.66 (1.39) | Not reported | Not reported | FSS; PFS-16 (5) | English
Hagell [50] | PD | 118 | 63.9 (9.6) | 8.4 (5.7) | S&E 90 (80–90)^g | FACIT-F; FSS; NHP-E | Swedish
Johansson [51] | MS | 219 | 47.0 (12.0) | 14 (10) | EDSS 1.0–3.5: 130^h; 4.0–5.5: 37^h; 6.0–9.5: 52^h | FSS; SOFI | Swedish
Kim [52] | MS | 49 | 47 (25–67)^i | 15.7 (1.3–48.0)^i | EDSS 3.2 (0–7)^i | FSS; MFIS | English
Kos [53] | MS | 51 | 51.9 (10.5) | 16.6 (8.9) | EDSS 6.5 (3–8.5)^a | FSS; MFIS | Dutch
Kos [54] | MS | 30–51^c | 44.6 (11.7)–52.9 (10.5)^c | 11.3 (6.8)–16.6 (8.9)^c | EDSS 6 (3.5–7.5)–6.5 (3–8.5)^c | FSS; MFIS | Dutch; Italian; Slovenian; Spanish
Kos [55] | MS | 62 | 52 (10.5) | Not reported | EDSS 6.5 (3–8.5) | FSS; MFIS; VAS-1; VAS-2; VAS-3 | Dutch
Krupp [56] | MS | 25 | 44.8 (10) | Not reported | Not reported | FSS | English
Kummer [57] | PD | 87 | 56.9 (10.3) | 8.7 (4.9) | S&E 76.7 (14.5)–86.1 (8.7)^c | PFS-16 (2); PFS-16 (5) | Brazilian-Portuguese
Lerdal [58] | MS | 227–368^c | 46.6 (12.4)–49.1 (11.7)^c | 11.4 (8.3)–14.0 (10.4)^c | Not reported | FSS; FSS-7; FSS-5 | Norwegian; Swedish
Losonczi [59] | MS | 111 | 43.82 (11.62) | 11.12 (8.29) | EDSS 1.94 (1.37)^d | FIS | Hungarian
Marrie [60] | MS | 9324 | 52.3 (10.8) | Not reported | Not reported | FSS; MFIS; PS-F | English
Martínez-Martín [61] | PD | 96 | 66.7 (9.6)^j | 8 (4–13)^b,j | S&E 80 (70–90)^j | D-FIS; MFI | Spanish
Mathiowetz [62] | MS | 54 | 50 (31–74)^i | 9.5 (1–34)^i | Not reported | FIS; FSS; SF-36-V | English
Mead [63] | Stroke | 55 | 73 (66–81)^b | 23 (10–53)^b,k; 137 (93–217)^b,l | Not reported | FAS; MFSI-G; POMS-F; SF-36-V (V2.0) | English
Meads [64] | MS | 15–135^c | 24–77^m | 0.4–59^m | Not reported | NHP-E; U-FIS | English
Mills [65] | MS | 416 | 45.8 (10.5) | 17.0 (9.5) | EDSS 0.0–4.0: 143^h; 4.5–6.5: 126^h; 7.0–7.5: 81^h; 8.0–9.5: 58^h; Unknown: 8^h | FSS; FSS-5 | English
Mills [66] | MS | 317–318^c | 46.4 (10.6)–46.8 (11.3)^c | 14.2 (9.4)–16.0 (9.7)^c | EDSS 0.0–4.0: 214^h; 4.5–6.5: 196^h; 7.0–7.5: 136^h; 8.0–9.5: 80^h; Unknown: 9^h | NFI-MS | English
Mills [67] | MS | 415 | Not reported | Not reported | Not reported | MFIS; MFIS C-5/MFIS P-8 | English
Penner [68] | MS | 309 | 43.4 (9.95) | Not reported | EDSS 3.4 (1.63)^d | FSMC; FSS; MFIS | Not reported
Rendas-Baum [69] | MS | 184 | 50.9 (10.5) | Not reported | EDSS 6 (0–9)^a | FIS | Not reported
Reske [70] | MS | 20 | 39.1^n | 9.0 (9.3) | EDSS 3.2 (1.9)^d | FSS | German
Rietberg [71] | MS | 43 | 48.7 (7.0) | 14.3 (9.2) | EDSS 3.5 (1–6.5)^a | CIS-20R; FSS; MFIS | Dutch
Schwartz [72] | MS | 40 | Not reported | Not reported | Not reported | FAI; SF-36-V | English
Smith [73] | Stroke | 80 | 74.1 (6.6) | 7.6 (5.4)^o | SA-SIP-30 72.8 (31.5)^p; 77.9 (26.0)^q; 82.1 (29.0)^r; 36.3 (30.6)^s | FAS | Dutch
Twiss [74] | MS | 911 | 36.5 (8.4) | 4.8 (5.2) | EDSS 0.0–1.5: 400^h; 2.0–2.5: 262^h; 3.0–3.5: 135^h; >4: 105^h; Unknown: 9^h | U-FIS | Australian-English; Canadian-English; Canadian-French; French; German; Italian; Spanish; UK-English; US-English
Valko [75] | MS | 188 | 45.0 (13.0) | 11.07 (9.79) | EDSS 3.61 (2.26)^d | FSS | German
Valko [75] | Stroke | 235 | 63 (14) | 1.21 (0.62) | Not reported | FSS | German

^a Expressed as median (range)
^b Expressed as median (IQR)
^c Range of different (sub)samples
^d Expressed as mean (SD)
^e Female
^f Male
^g During ‘off’ phase
^h Expressed as numbers: EDSS categorized scores
^i Expressed as mean (range)
^j Based on a total sample of N = 142
^k Inpatients, expressed in days
^l Outpatients, expressed in days
^m Range
^n SD not reported
^o Expressed in months
^p Expressed as percentage of total score body care and movement subscale
^q Expressed as percentage of total score mobility subscale
^r Expressed as percentage of total score ambulation subscale
^s Expressed as percentage of total score alertness behaviour subscale


Characteristics of questionnaires

Table 3 presents the characteristics of the included self-report questionnaires. Most questionnaires aimed to assess the impact of fatigue on activities in daily life (Fatigue Impact Scale for Daily use (D-FIS), Adapted French version of Fatigue Impact Scale (EMIF-SEP), Fatigue Assessment Scale (FAS), FIS, Fatigue Severity Scale 5 item version (FSS-5), MFI, MFIS, Modified Fatigue Impact Scale Cognitive and Physical (MFIS C-5/MFIS P-8), Parkinson Fatigue Scale 2-point scale version (PFS-16 (2)), Parkinson Fatigue Scale 5-point scale version (PFS-16 (5)), Performance Scale Fatigue subscale (PS-F), Unidimensional Fatigue Impact Scale (U-FIS), Visual Analogue Scale-1, 2 or 3 (VAS-1, VAS-2, VAS-3), Würzburger Erschöpfungsinventars bei Multiple sclerosis (WEIMUS)), whereas six questionnaires focused primarily on fatigue severity (Multidimensional Fatigue Symptom Inventory general subscale (MFSI-G), Profile Of Mood States Fatigue subscale (POMS-F), Rhoten Fatigue Scale (RFS), Short-form-36 Vitality subscale (SF-36-V), Short-form-36 Vitality subscale version 2.0 (SF-36-V (V2.0)), Swedish Occupational Fatigue Inventory (SOFI)).
Table 3

Characteristics of included questionnaires

Questionnaire | Construct assessed | Recall period | Dimensions (number of items) | Response options (range) | Range of scores | Time to administer | Ease of scoring
CIS-20R | Impact of fatigue; Fatigue severity | Last 2 weeks | Subjective experience of fatigue (8); Reduction in motivation (4); Reduction in activity (3); Reduction in concentration (5); Total (20) | 7-point Likert (1–7) | 20–140 (Best–worst) | Not reported | Easy
D-FIS | Impact of fatigue | Last day | One dimension; Total (8) | 5-point Likert (0–4) | 0–32 (Best–worst) | Not reported | Easy
EMIF-SEP | Impact of fatigue | Last month | Cognitive (10); Physical (13); Psychological (4); Social (13); Total (40) | 4-point Likert (1–4) | 0–100^a (Best–worst) | Not reported | Difficult^a
FACIT-F | Impact of fatigue; Fatigue severity | Last week | One dimension; Total (13) | 5-point Likert (0–4) | 0–52 (Worst–best) | Not reported | Easy
FAI | Impact of fatigue; Fatigue severity | Last 2 weeks | Psychological consequences^b; Severity^b; Situation-specific^b; Response to rest^b; Total (29) | 7-point Likert (1–7) | 29–203 (Best–worst) | Not reported | Easy
FAS | Impact of fatigue | Usually… | One dimension; Total (10) | 5-point Likert (1–5) | 10–50 (Best–worst) | Not reported | Easy
FIS | Impact of fatigue | Last month | Cognitive (10); Physical (10); Social (20); Total (40) | 5-point Likert (0–4) | 0–160 (Best–worst) | Not reported | Easy
FSMC | Impact of fatigue; Fatigue severity; Factors influencing fatigue | In general… | Cognitive (10); Motor (10); Total (20) | 5-point Likert (1–5) | 20–100 (Best–worst) | Not reported | Easy
FSS | Impact of fatigue; Fatigue severity | Not specified | One dimension; Total (9) | 7-point Likert (1–7) | 1–7^c (Best–worst) | Not reported | Moderate^c
FSS-7 | Impact of fatigue; Fatigue severity | Not specified | One dimension; Total (7) | 7-point Likert (1–7) | 1–7^c (Best–worst) | Not reported | Moderate^c
FSS-5 | Impact of fatigue | Not specified | One dimension; Total (5) | 7-point Likert (1–7) | 0–100^d (Best–worst) | Not reported | Moderate^d; Easy^e
MFI | Impact of fatigue | Lately… | General (4); Physical (4); Reduced activity (4); Reduced motivation (4); Mental (4); Total (20) | 5-point Likert (1–5) | 20–100 (Best–worst) | Not reported | Easy
MFIS | Impact of fatigue | Last month | Cognitive (10); Physical (9); Social (2); Total (21) | 5-point Likert (0–4) | 0–84 (Best–worst) | Not reported | Easy
MFIS C-5/MFIS P-8 | Impact of fatigue | Last month | Cognitive (5); Physical (8); Total (13) | 5-point Likert (0–4) | 0–52 (Best–worst) | Not reported | Easy
MFSI-G | Fatigue severity | Last week | One dimension; Total (6) | 5-point Likert (0–4) | 0–24 (Best–worst) | Not reported | Easy
MFSS | Factors influencing fatigue | Not specified | One dimension; Total (6) | 7-point Likert (1–7) | 1–7^c (Best–worst) | Not reported | Moderate^c
NFI-MS | Fatigue severity; Factors influencing fatigue | Last 2 weeks | Abnormal nocturnal sleep (5); Cognitive (4); Physical (8); Relief by rest (6); Summary scale (10); Total (33) | 4-point Likert (0–3) | 0–99^e (Best–worst) | Not reported | Moderate^d; Easy^e
NHP-E | Impact of fatigue; Fatigue severity | Not specified | One dimension; Total (3) | Adjectival (weighted score per item) | 0–100 (Best–worst) | Not reported | Easy
PFS-16 (2) | Impact of fatigue | Last 2 weeks | One dimension; Total (16) | 2-point Likert (0–1) | 0–16 (Best–worst) | Not reported | Easy
PFS-16 (5) | Impact of fatigue | Last 2 weeks | One dimension; Total (16) | 5-point Likert (1–5) | 1–5^c (Best–worst) | Not reported | Moderate^c
POMS-F | Fatigue severity | Last week | One dimension; Total (6) | 5-point Likert (0–4) | 0–24 (Best–worst) | Not reported | Easy
PS-F | Impact of fatigue | Last month | One dimension; Total (1) | 6-point Likert (0–5) | 0–5 (Best–worst) | Not reported | Easy
RFS | Fatigue severity | Last 2 weeks | One dimension; Total (1) | 11-point Likert (0–10) | 0–10 (Best–worst) | Not reported | Easy
SF-36-V | Fatigue severity | Last month | One dimension; Total (4) | 6-point Likert (1–6) | 4–24 (Worst–best) | Not reported | Easy
SF-36-V (V2.0) | Fatigue severity | Last month | One dimension; Total (4) | 5-point Likert (1–5) | 4–20 (Worst–best) | Not reported | Easy
SOFI | Fatigue severity | Last 6 months | Lack of energy (4); Lack of motivation (4); Physical discomfort (4); Physical exertion (4); Sleepiness (4); Total (20) | 7-point Likert (0–6) | 0–30^f (Best–worst) | Not reported | Moderate^f
U-FIS | Impact of fatigue | Last week | One dimension; Total (22) | 4-point Likert (0–3) | 0–66 (Best–worst) | Not reported | Easy
VAS-1 | Impact of fatigue | Not specified | One dimension; Total (1) | 100 mm VAS | 0–100^g (Best–worst) | Not reported | Moderate^g
VAS-2 | Impact of fatigue | Not specified | One dimension; Total (1) | 100 mm VAS | 0–100^g (Best–worst) | Not reported | Moderate^g
VAS-3 | Impact of fatigue | Not specified | One dimension; Total (1) | 100 mm VAS | 0–100^g (Best–worst) | Not reported | Moderate^g
WEIMUS | Impact of fatigue | Last 2 weeks | Cognitive (9); Physical (8); Total (17) | 5-point Likert (0–4) | 0–68 (Best–worst) | Not reported | Easy

^a Adjusted total score on 0–100 scale
^b Not reported
^c Average of total summed items
^d Ordinal-interval (Rasch) transformation
^e Summed raw (ordinal) score
^f Summed total of averaged domain scores
^g Visual analogue scale

Fifteen unidimensional (D-FIS, Functional Assessment of Chronic Illness Therapy Fatigue subscale (FACIT-F), FAS, FSS, Fatigue Severity Scale 7 item version (FSS-7), FSS-5, MFSI-G, Multiple sclerosis-specific Fatigue Severity Scale (MFSS), Nottingham Health Profile Energy subscale (NHP-E), PFS-16 (2), PFS-16 (5), POMS-F, SF-36-V, SF-36-V (V2.0), U-FIS) and eleven multidimensional questionnaires (Checklist Individual Strength (CIS-20R), EMIF-SEP, Fatigue Assessment Instrument (FAI), FIS, Fatigue Scale for Motor and Cognitive functions (FSMC), MFI, MFIS, MFIS C-5/MFIS P-8, Neurological Fatigue Index MS (NFI-MS), SOFI, WEIMUS) were identified. The total number of items per questionnaire varied from 3 (NHP-E) to 40 (EMIF-SEP, FIS). Three visual analogue scales (VAS-1, VAS-2 and VAS-3) and two single-item Likert scales (PS-F, RFS) were included.
Six disease-specific questionnaires were found: the MFSS, NFI-MS, PS-F and WEIMUS for patients with MS and the PFS-16 (2) and PFS-16 (5) for patients with PD. Most questionnaires were found easy to administer. One questionnaire (EMIF-SEP) uses a complex formula to calculate an adjusted total score from 0 to 100, and for two questionnaires (FSS-5, NFI-MS), a nomogram was provided [65, 66] for ordinal-interval (Rasch) transformation. None of the included studies reported on the time needed to complete the questionnaires.

Measurement properties and methodological quality

Details about the investigated measurement properties and the methodological quality of the included studies are summarized in Table 4. Most studies investigated reliability and construct validity, whereas results on measurement error and responsiveness were often not reported.
Table 4

Methodological quality and investigated measurement properties per study

Reference | Population | Quality ratings of the investigated measurement properties
Measurement properties assessed: Internal consistency; Reliability; Measurement error; Content validity; Structural validity; Hypothesis testing; Cross-cultural validity (a); Responsiveness

Armutlu [39] | MS | Poor; Fair; Fair; Poor
Armutlu [40] | MS | Poor; Fair; Fair; Poor
Benito-León [41] | MS | Fair; Fair; Fair; Fair
Brown [42] | PD | Good; Fair (b) / Poor (c); Fair; Good; Fair
Debouverie [3] | MS | Good; Fair; Good; Fair
Doward [43] | MS | Good (d); Fair; Fair; Good (d); Fair; Poor
Fisk [44] | MS | Poor; Poor; Poor
Flachenecker [45] | MS | Poor
Flachenecker [46] | MS | Fair (e) / Poor (f); Poor; Fair (g) / Poor (h); Fair
Flachenecker [47] | MS | Poor; Poor
Flensner [48] | MS | Poor; Fair; Fair
Grace [49] | PD | Fair (b) / Poor (i); Fair
Hagell [50] | PD | Fair; Fair; Fair (j) / Good (k); Fair
Johansson [51] | MS | Fair; Fair; Fair
Kim [52] | MS | Fair
Kos [53] | MS | Poor; Poor; Poor; Poor
Kos [54] | MS | Fair; Fair; Fair; Fair; Poor
Kos [55] | MS | Fair; Poor
Krupp [56] | MS | Poor; Poor; Poor; Poor; Poor
Kummer [57] | PD | Fair (b) / Poor (c); Fair
Lerdal [58] | MS | Good
Losonci [59] | MS | Poor; Poor; Poor; Poor
Marrie [60] | MS | Fair
Martínez-Martín [61] | PD | Good; Poor; Fair; Fair; Poor
Mathiowetz [62] | MS | Fair; Fair
Mead [63] | Stroke | Fair; Fair; Fair; Fair; Fair
Meads [64] | MS | Poor; Fair; Fair; Poor; Fair
Mills [65] | MS | Good
Mills [66] | MS | Fair; Fair; Fair; Fair
Mills [67] | MS | Good
Penner [68] | MS | Good; Fair; Fair; Good; Fair
Rendas-Baum [69] | MS | Poor
Reske [70] | MS | Poor; Poor; Poor; Poor; Poor
Rietberg [71] | MS | Fair; Fair; Fair; Poor; Poor
Schwartz [72] | MS | Fair; Fair; Fair; Poor
Smith [73] | Stroke | Fair; Poor; Fair
Twiss [74] | MS | Poor; Fair; Poor
Valko [75] | MS, Stroke | Poor; Poor; Poor

(a) Only items for translation scored
(b) PFS-16 (5)
(c) PFS-16 (2)
(d) Based on Swedish subsample
(e) FSS, MFSS
(f) MFIS, WEIMUS
(g) FSS, MFIS, MFSS
(h) WEIMUS
(i) FSS
(j) CTT
(k) IRT

Eight of the 31 studies that investigated hypothesis testing [41, 43, 50, 51, 61, 62, 64, 66] formulated a priori hypotheses about the expected direction or magnitude of the correlation between the investigated questionnaires. Seven studies [39, 40, 54, 59, 61, 70, 75] that translated a questionnaire were rated as having poor methodological quality because the translated questionnaires were not pre-tested in a small sample to check the interpretation, cultural relevance and ease of comprehension of the translation. All studies [53, 56, 69, 71, 74] that reported on responsiveness were rated as having poor methodological quality.

Overall quality of measurement properties

Table 5 presents the overall quality of the measurement properties per self-report questionnaire, accompanied by the level of evidence.
Table 5

Data synthesis, levels of evidence and overall quality of measurement properties per questionnaire

Questionnaire | Population | Rating and level of evidence for the measurement properties assessed
Measurement properties: Internal consistency; Reliability; Measurement error; Content validity; Structural validity; Hypothesis testing; Cross-cultural validity; Responsiveness

CIS-20R | MS | + Limited; ? Unknown; Limited; ? Unknown
D-FIS | MS | + Limited; + Limited; + Limited; Limited; ? Unknown
D-FIS | PD | + Moderate; ? Unknown; + Limited; Limited
EMIF-SEP | MS | + Moderate; + Limited; + Moderate; ? Unknown
FACIT-F | PD | + Limited; + Limited; Moderate; + Limited
FAI | MS | + Limited; Limited; Limited; ? Unknown
FAS | Stroke | ± Conflicting; + Limited; ? Unknown; + Limited; Limited
FIS | MS | ? Unknown; ± Conflicting; ? Unknown; Moderate; ? Unknown; ? Unknown
FSMC | MS | + Moderate; + Limited; + Limited; + Moderate; + Limited
FSS | MS | + Limited; + Moderate; Strong; ± Conflicting; ? Unknown; ? Unknown
FSS | PD | + Limited; Moderate; + Moderate
FSS | Stroke | ? Unknown; ? Unknown; ? Unknown
FSS-7 | MS | + Moderate
FSS-5 | MS | ± Conflicting
MFI | MS | Limited
MFI | PD | Limited
MFIS | MS | Limited; + Moderate; ? Unknown; ± Conflicting; + Moderate; ? Unknown; ? Unknown
MFIS C-5/MFIS P-8 | MS | + Moderate
MFSI-G | Stroke | + Limited; + Limited; ? Unknown; + Limited; Limited
MFSS | MS | Limited; ? Unknown; + Limited; Limited
NFI-MS | MS | + Limited; + Limited; + Limited; Limited
NHP-E | MS | + Moderate
NHP-E | PD | + Limited
PFS-16 (2) | PD | ? Unknown; ? Unknown; + Limited; ? Unknown
PFS-16 (5) | PD | Moderate; Limited; + Limited; + Moderate; + Moderate; ? Unknown
POMS-F | Stroke | + Limited; + Limited; ? Unknown; + Limited; + Limited
PS-F | MS | Not applicable; Not applicable; + Limited
RFS | PD | Not applicable; Not applicable; + Limited
SOFI | MS | Limited; Limited; Limited
SF-36-V | MS | + Limited
SF-36-V (V2.0) | Stroke | + Limited; Limited; ? Unknown; + Limited; Limited
U-FIS | MS | Moderate; + Moderate; + Moderate; + Moderate; + Moderate; ? Unknown; ? Unknown
VAS-1 | MS | Not applicable; Limited; Not applicable; ? Unknown
VAS-2 | MS | Not applicable; Limited; Not applicable; ? Unknown
VAS-3 | MS | Not applicable; Limited; Not applicable; ? Unknown
WEIMUS | MS | ? Unknown; ? Unknown; ? Unknown; Limited

+ Adequate, − Not adequate, ± Conflicting, ? Unknown

Moderate evidence for adequate internal consistency was found for the EMIF-SEP and FSMC in patients with MS (Cronbach’s α = 0.82–0.93) [3, 68] and for the D-FIS in patients with PD (Cronbach’s α = 0.93) [61]. Limited evidence for adequate internal consistency was found for the D-FIS and FSS in patients with MS (Cronbach’s α = 0.91–0.93) [41, 46], the FACIT-F and FSS in patients with PD (Cronbach’s α = 0.90–0.94) [49, 50], and the MFSI-G, POMS-F and SF-36-V (V2.0) in patients with stroke (Cronbach’s α = 0.76–0.93) [63].
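For readers less familiar with the statistic, Cronbach’s α values such as those reported above are conventionally computed from the item variances and the variance of the summed score. The following is a minimal Python sketch of that standard formula, not code from any of the included studies; the function name `cronbach_alpha` and the response data are hypothetical and serve only as illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    Sample variances (n - 1 denominator) are used throughout.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item across respondents
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Variance of the summed score across respondents
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: five patients answering a four-item, 5-point scale
example_responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 4, 5, 4],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(example_responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

In this sketch each inner list is one patient's answers; with real data, missing responses would need to be handled before the variances are computed.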
Moderate evidence was found for adequate reliability of the FSS, MFIS and U-FIS in patients with MS (CC or ICC = 0.73–0.93) [39, 43, 52, 54, 64, 71]. Limited evidence for adequate reliability was found for the FAS, MFSI-G and POMS-F in patients with stroke (ICC = 0.74–0.77) [63] and for the FACIT-F in patients with PD (ICC = 0.84–0.85) [50]. Reliability of the PFS-16 (5) was found to be not adequate (limited evidence, CC = 0.63) [42].

Measurement error was investigated for the CIS-20R, D-FIS, FAS, FSS, MFIS, MFSI-G, POMS-F and SF-36-V (V2.0), but only one study, on the D-FIS in patients with MS [41], reported details about the MIC. There was limited evidence for adequate measurement error of the D-FIS in patients with MS (SEM = 3.18 and MIC = 3.65) [41].

Content validity was investigated for the FAS, FIS, FSMC, MFSI-G, NFI-MS, PFS-16 (2), PFS-16 (5), POMS-F, SF-36-V (V2.0) and U-FIS. Moderate evidence was found for adequate content validity of the U-FIS in patients with MS [43, 64]. Limited evidence for adequate content validity was found for the FSMC and NFI-MS in patients with MS [66, 68], for the PFS-16 (2) and PFS-16 (5) in patients with PD [42], and for the FAS, MFSI-G, POMS-F and SF-36-V (V2.0) in patients with stroke [63].

Moderate evidence for adequate structural validity was found for the EMIF-SEP, FSMC (% total explained variance = 61.4–61.5) [3, 68] and U-FIS [43] in patients with MS and for the PFS-16 (5) in patients with PD (% total explained variance = 63.2–64.0) [42]. Four studies that applied IRT methods to assess structural validity demonstrated misfits for items in the FSS and MFIS in patients with MS [58, 65, 67] and in the FACIT-F and FSS in patients with PD [50]. Based on these analyses, new versions of the FSS (FSS-7, FSS-5) [58, 65] and of the MFIS (MFIS C-5/MFIS P-8) [67] were introduced.
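The standard error of measurement (SEM) mentioned above is commonly derived from the between-subject standard deviation and a reliability coefficient as SEM = SD × √(1 − ICC), and a smallest detectable change (SDC) at the individual level is often taken as 1.96 × √2 × SEM. The sketch below illustrates these two conventional formulas only; the included studies may have used other estimators, and the function names and input values are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a standard deviation and an ICC.

    Uses the conventional formula SEM = SD * sqrt(1 - ICC).
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC is expected to lie between 0 and 1 in this sketch")
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem, z=1.96):
    """Smallest detectable change for an individual: z * sqrt(2) * SEM."""
    return z * math.sqrt(2.0) * sem


# Hypothetical inputs: SD of 8.0 scale points and a test-retest ICC of 0.84
sem = sem_from_icc(sd=8.0, icc=0.84)
sdc = smallest_detectable_change(sem)
print(f"SEM = {sem:.2f}, SDC = {sdc:.2f}")
```

A change score smaller than the SDC cannot be distinguished from measurement error in an individual patient, which is why the SEM is usually interpreted alongside the MIC.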
Moderate evidence for convergent validity was found for the MFIS (CC = 0.54–0.89 with CIS-20R, FSMC, FSS, PS-F, WEIMUS, WEIMUS Cognitive subscale, WEIMUS Physical subscale) [46, 54, 60, 68, 71], U-FIS (CC = 0.48–0.86 with NHP-E) [43, 64] and NHP-E (CC = 0.48–0.86 with U-FIS) [43, 64] in patients with MS, and for the FSS (CC = 0.62–0.84 with FACIT-F, NHP-E, PFS-16 (5)) [49, 50] and PFS-16 (5) (CC = 0.71–0.84 with FSS, RFS) [42, 49] in patients with PD.

In 13 studies [3, 39, 40, 43, 48, 53, 54, 57, 59, 61, 70, 71, 75], questionnaires were translated. None of these studies investigated cross-cultural validity by means of confirmatory factor analysis or differential item functioning (DIF).

Five studies [53, 56, 69, 71, 74] reported on responsiveness. None of these studies presented details about the correlation between change scores on the investigated questionnaires and change on an external anchor. Therefore, responsiveness was scored unknown for these questionnaires.

Clinically relevant differences in scores between subgroups were reported for the FIS [48], FSS [45], U-FIS [43, 64, 74] and WEIMUS [47] in patients with MS, and for the FACIT-F [50], FSS [50] and PFS-16 (5) [57] in patients with PD. No floor or ceiling effects were found for the D-FIS [41], FSS [53], FSS-7 and FSS-5 [58], MFIS [53, 54], MFIS C-5/MFIS P-8 [67], NFI-MS [66] and U-FIS [74] in patients with MS. The SOFI showed a floor effect in patients with MS (on 12 of the 20 items, more than 25% of patients achieved the lowest possible score) [51]. The D-FIS [61], FACIT-F [50], FSS [50], PFS-16 (5) and PFS-16 (2) [57] showed no floor or ceiling effects in patients with PD. Values for the MIC were reported for the D-FIS (MIC = 3.65) [41], FIS (MIC = 9.0–24.0) [69] and U-FIS (MIC = 2.4–7.0) [74] in patients with MS.
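The floor effect described above for the SOFI was expressed per item, as the share of patients who achieved the lowest possible score. A minimal sketch of such a check follows; the 25% threshold is taken from that description, whereas the function name `floor_effect_items` and the response data are hypothetical.

```python
def floor_effect_items(responses, lowest_score, threshold=0.25):
    """Return the indices of items with a floor effect.

    An item is flagged when the share of respondents who gave the lowest
    possible score is strictly greater than `threshold`.
    """
    n_respondents = len(responses)
    n_items = len(responses[0])
    flagged = []
    for i in range(n_items):
        at_floor = sum(1 for row in responses if row[i] == lowest_score)
        if at_floor / n_respondents > threshold:
            flagged.append(i)
    return flagged


# Hypothetical data: four patients, three items scored 0 (lowest) to 6
example_responses = [
    [0, 3, 0],
    [0, 2, 1],
    [1, 4, 0],
    [2, 5, 3],
]
flagged = floor_effect_items(example_responses, lowest_score=0)
print(f"Items with a floor effect: {flagged}")
```

A ceiling effect can be checked in the same way by passing the highest possible score instead of the lowest.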

Discussion

To our knowledge, this review is the first to systematically appraise and summarize the evidence on the measurement properties of self-report fatigue questionnaires validated in patients with MS, PD or stroke while taking the methodological quality of the included studies into account. Thirty-one questionnaires were evaluated. No multidimensional questionnaires were identified that were adequately validated in patients with PD or stroke. Moderate evidence was found for adequate internal consistency and structural validity of the FSMC and for adequate reliability and structural validity of the U-FIS in patients with MS. Therefore, we recommend the FSMC for the multidimensional, and the U-FIS for the unidimensional, assessment of fatigue in patients with MS. The FACIT-F and FSS show promise for the assessment of fatigue in patients with PD, and the POMS-F for patients with stroke. However, the reliability and validity of the FACIT-F, FSS and POMS-F should be confirmed in high-quality studies in these populations. These recommendations should be considered with caution. First, studies investigating measurement error, responsiveness and interpretability are lacking. Second, as the level of evidence supporting the overall quality of most measurement properties was limited, future high-quality studies may change our recommendations.

Two reviews [8, 10] made recommendations on the use of a questionnaire. One review [10] suggested the FIS and MFIS in patients with MS. The other review [8] recommended the FSS for the unidimensional assessment of fatigue in patients with PD. Although not specifically validated in PD, the MFI was recommended for the multidimensional assessment of fatigue in patients with PD [8]. These recommendations are partially in line with our findings. However, taking into account the methodological quality of the studies included in our systematic review, the level of evidence for most measurement properties of the FIS was unknown. In addition, four studies [50, 58, 65, 67] that applied IRT methods to investigate structural validity demonstrated misfits for some items in the FSS and MFIS.

The inconsistent scores for hypothesis testing confirm that different questionnaires measure different aspects or constructs of fatigue. Unfortunately, details on the construct of fatigue measured by a questionnaire were often not reported. Furthermore, the factors contributing to fatigue in patients with MS, PD or stroke are still not well known [2, 76, 77]. Translational research that bridges pre-clinical and clinical research [78] and focuses on the physiological and clinical aspects contributing to peripheral and central fatigue [6] may provide input for more clearly defined concepts and dimensions of fatigue. As both fatigue and most clinical aspects contributing to fatigue fluctuate over time, associations between these factors may be reflected more accurately by longitudinal study designs with repeated measures [79]. Repeated measurement designs also allow the longitudinal construct validity of fatigue measures to be investigated. For now, we suggest that clinicians assessing fatigue carefully consider whether a questionnaire reflects the aspects of fatigue that are most relevant to them. Furthermore, a comprehensive evaluation of fatigue should be accompanied by the assessment of clinically related factors such as mood and sleep. Acknowledging that each fatigue questionnaire measures different aspects of fatigue, we recommend the simultaneous use of different questionnaires in research.

Interpretability is considered an important characteristic of a measurement scale [16]. Unfortunately, only a few studies reported details on clinically relevant differences in scores between subgroups [43, 45, 47, 48, 50, 57, 64, 74], floor and ceiling effects [41, 50, 51, 53, 54, 57, 58, 61, 66, 67, 74] and the MIC [41, 69, 74]. This makes it difficult to interpret scores and change scores on a fatigue questionnaire in both clinical practice and research.

Although it is believed that measurement properties are sample dependent [80], no major differences in measurement properties were found for questionnaires that were evaluated in more than one population. For example, all estimates of measurement properties for the D-FIS were consistent in patients with MS and PD. The FSS showed consistent scores for most measurement properties that were evaluated in patients with MS, PD and stroke. In addition, another review [8] concluded that the items of the disease-specific PFS-16 (5) did not differ much from those of generic fatigue questionnaires and that it provided no clear advantages over a generic questionnaire for use in patients with PD. Furthermore, it is not clear whether manifestations of fatigue differ between neurological disorders [8]. These results suggest that the generic fatigue questionnaires presented in this review can be used interchangeably across patients with MS, PD and stroke, which favours a generic approach to the assessment of fatigue. In contrast, studies using IRT methods showed misfits on the FSS for four items in patients with MS [65], and for only one item in patients with PD [50]. This difference might have been caused by a difference in statistical power between the two studies [65], but it is also possible that it was related to DIF between patients with MS and PD [65]. This emphasizes the importance of disease-specific validation of fatigue questionnaires used in patients with MS, PD and stroke. Taken together, these findings suggest that self-report fatigue questionnaires should contain a core set of items assessing generic aspects of fatigue, supplemented by additional items that are more disease specific. We therefore recommend the adaptation of existing questionnaires, incorporating a uniform section on general aspects of fatigue and a section with disease-specific items. Items to assess general aspects of fatigue may be derived from the recently developed Patient-Reported Outcomes Measurement Information System (PROMIS) fatigue item bank [81].

This systematic review has some limitations. First, only studies published in Dutch, English, French or German were included. This language restriction resulted in the exclusion of six articles [22, 28–30, 35, 38]; however, these studies evaluated a diversity of questionnaires and language versions, so it is not likely that this resulted in selection bias. Second, the COSMIN checklist has some items that require subjective judgment, which may lead to disagreement between raters. However, we tested the COSMIN checklist with all reviewers before assessing the methodological quality of the included studies, and one reviewer (RE) was involved in the assessment of all studies to improve consistency in rating across studies. Third, the quality criteria we applied for rating measurement properties relied heavily on classical test theory (CTT). As a consequence, IRT methods were not considered for underpinning the structural validity of questionnaires. To address this gap, we decided post hoc that any item misfit demonstrated by a study using IRT methods would be rated as not adequate structural validity.

Conclusion

We recommend the FSMC and U-FIS for the assessment of fatigue in patients with MS. The FACIT-F and FSS show promise in patients with PD, and the POMS-F for patients with stroke. No multidimensional questionnaires were adequately validated in patients with PD or stroke. Future studies should focus on translational research in which the assumed underlying physiological and clinical aspects contributing to fatigue are investigated longitudinally, as perceptions of fatigue often fluctuate over time. Such studies may provide input for the development of the theoretical construct of self-report fatigue questionnaires. We suggest that existing questionnaires should be adapted to contain both a uniform section that reflects general aspects of fatigue and a disease-specific section with items related to the physiological and clinical aspects of the underlying disease. Studies on responsiveness and the MIC of fatigue questionnaires in patients with MS, PD and stroke are needed to establish whether an instrument can detect meaningful changes in clinical practice and research.