Chang-Sik Park1
1 Department of Physical Therapy, Howon University: 64 Howondae 3gil, Impimyeon, Gunsan-si 573-932, Republic of Korea
Abstract
[Purpose] To establish the test-retest reliability and minimal detectable change of the Short Form Barthel Index (SF-BI) and its associations with stroke-specific impairments. [Subjects and Methods] The SF-BI was administered to 24 chronic stroke patients twice, 7 days apart. Relative reliability indices, namely the intraclass correlation coefficient (ICC2,1) and weighted kappa coefficients, were used to examine test-retest agreement for the SF-BI, together with absolute reliability indices, including the standard error of measurement (SEM) and the minimal detectable change (MDC). Validity was examined by Spearman correlation of the SF-BI total score with the Postural Assessment Scale for Stroke (PASS) and the Fugl-Meyer Assessment (FMA). [Results] Test-retest agreement was excellent for the individual items of the SF-BI and for the total score (ICC2,1=0.91), and the SEM and MDC were acceptable at 2.83 and 7.84 points, respectively. The item-to-total correlations were all significant, ranging from r=0.83 to 0.92. The SF-BI showed good internal consistency, which remained high (0.82–0.86) when individual items were deleted. The SF-BI individual items and total score demonstrated high concurrent validity with the PASS and FMA. [Conclusion] This study demonstrated that the SF-BI is a useful instrument with high test-retest reliability, absolute reliability, internal consistency and validity.
Keywords:
Activities of daily living; Short form Barthel Index; Stroke
The low performance of stroke patients in activities of daily living is a major factor in
lowering the life satisfaction of the patients and their caregivers1). Improving independence in activities
of daily living is one of the important therapeutic goals in occupational therapy. Therefore, it
is essential for clinicians and therapists to select appropriate assessment instruments to
effectively evaluate patients’ performance of activities of daily living2,3,4). The Barthel Index (BI)5) and the Functional Independence Measure (FIM) have been most
frequently used to assess impairment levels in domestic and foreign clinics6). The FIM is known to be a more comprehensive
and responsive assessment instrument for scoring impairments than the BI7). However, the psychometric characteristics
of the two instruments are similar, suggesting that the FIM is neither superior to the BI nor has
any particular advantage over it8). In addition,
as both instruments are composed of comprehensive and quantitative assessment items, the
FIM test requires about 30 to 45 minutes and the BI test takes 20 minutes if the assessment
of the performance of subjects is carried out through observation8). In particular, as the time required for assessment
increases, the physical and psychological burden (for example, decreased muscle endurance and increased
muscle tone) on therapists and stroke patients with neurological disorders increases,
which can lead to problematic reliability of the evaluation results3). Therefore, clinicians and researchers should be aware that
if assessment takes a long time during data collection, biases and random
measurement errors may occur and the obtained scores may differ from the true
values9, 10). For this reason, short-form assessment instruments have been
developed11) by reducing unnecessary
items within a range that minimizes the loss of data information without affecting the
psychometric properties of the original assessment instruments. In addition, the assessment
of subjects in clinical practice should be easy to apply and take little time,
and the obtained data should be quantifiable and easy to interpret11). Also, psychometric characteristics such as reliability,
validity, and responsiveness should be well verified3). Hobart et al.8)
derived the short form BI (SF-BI; 5 items: transfers, bathing, toilet use, stair climbing and
ambulation) from the original 10-item BI and examined its psychometric characteristics. As
a result, it was found that the SF-BI has an item internal consistency of Cronbach’s α
coefficient=0.88 8), with 0.71 at admission
and 0.73 at discharge, respectively; an inter-rater reliability of ICC=0.90; and a
standardized response mean (SRM) of 0.71 on average, which are the same as those of the
original BI1). In particular, the responsiveness of the SF-BI
(1.2) was reported to be comparable to that of the original BI
(1.2) and the FIM motor subscales (1.3)1).
It was also reported that the SF-BI is significantly associated with the original BI
(concurrent validity, r=0.96) and with the FIM total
scores (convergent and discriminant validity, r=0.87)8). In addition, the SF-BI was found to have psychometric
characteristics similar to those of the original BI1) with concurrent validities of r=0.74 and 0.94 at admission and
discharge, respectively, which are at satisfactory levels. Nanayakkara and Lekamwasam12) showed that the 5-item SF-BI is a very
sensitive assessment instrument that can predict or determine the independency levels of the
elderly’s activities of daily living (the original BI total scores for the ten items: ≤80
points=self-reported dependency, >80 points=self-reported independency). Thirty-seven
points obtained in the SF-BI test (the maximum score: 55 points) is the cutoff value to
distinguish between dependency and independency (sensitivity level 95%, specificity level
82%). Recent studies have emphasized the need for an assessment method that can identify the
effects of treatment interventions and predict functional recovery in terms of treatment
management for patients11, 13). The standard error of measurement (SEM) and minimal
detectable change (MDC) are absolute reliability indices by which such changes can be
standardized and quantified14). They are
used to estimate the size of random measurement error caused by chance
variation when the same test is repeated on an individual, and to
determine whether a change in the measured values exceeds that error (95% confidence
level)4, 14). Particularly, the MDC can be used as an important index for
clinical decision making because it can serve as the threshold value for both therapists and
clinical researchers to judge the size and prognosis of post-treatment effects based on
the actual changes in the scores of patients14). However, the test-retest reliability and MDC of the SF-BI have not
been investigated in overseas studies, and its concurrent validity with the postural control
and upper and lower extremity motor control functions that reflect stroke-specific
impairments is not yet known. Thus, this study was aimed at investigating the
test-retest reliability and MDC of the SF-BI and its associations with stroke-specific
impairments.
SUBJECTS AND METHODS
The subjects of this study were selected among chronic stroke patients who had been
diagnosed with hemiplegia due to stroke and agreed to participate in this study. The
subjects were those who were receiving regular medical services at G Hospital and the study
period lasted from August 2016 to December 2016. The study purpose and methods were
explained to the subjects, who provided informed consent according to the principles of the
Declaration of Helsinki before participating. The criteria for selection of the subjects
were: chronic stroke patients who had the onset of stroke more than six months earlier; had
obtained 24 or more points for cognitive function on the Mini-Mental State Examination-Korean
Version (MMSE-K); and were able to understand verbal instructions. Patients unable to
participate in the functional performance assessment required by this study due to orthopaedic
diseases were excluded. The general medical characteristics of the subjects including their
ages, duration of illness, diagnoses, paralyzed areas and the MMSE-K scores were collected
through admission notes and one-to-one interviews.
The sample size of this study was calculated to be 24 using G*Power (version 3.1)
for a repeated measures ANOVA (two measurements) at a power of 95%, a significance
level of 0.05 and an effect size of f=0.4. In order to minimize the effects of potential
recovery on the BI in the test-retest agreement rates, patients with chronic stroke were
selected as subjects10). The data were
collected from the finally selected 24 patients after excluding two who did not meet the
criteria for subject selection, two who were emergently discharged in the final data
collection process and two who showed unreliable assessment results that caused data errors.
For the test-retest reliability (ICC2,1) of the BI, agreement was
assessed across two assessments performed one week apart by occupational
therapists with more than 16 years of clinical experience15). The construct validity of the SF-BI was estimated by the
correlation coefficient among the original BI, the PASS total scores and BBS total scores.
The assessment of the subjects’ functional performance was conducted by two physical
therapists with 15 years of clinical experience. The entire assessment process included
two- to five-minute breaks as needed in order to minimize decreases in performance due to
excessive tension caused by fatigue and associated reactions, as well as learning
effects.
The Barthel Index (BI), a basic daily activity assessment tool for stroke patients5), was used. The BI consists of 10 items:
personal hygiene, bathing, feeding, toilet use, stair climbing, dressing, bowel control,
bladder control, ambulation or wheelchair mobility and chair/bed transfers. Each item has a
five-stage scoring system according to the degree of external help, with the maximum score
of 100 points. The 5-item SF-BI derived from the 10-item original BI consists of transfers,
bathing, toilet use, stair climbing and ambulation, with the maximum score of 55 points8). It was reported that the SF-BI has an item
internal consistency of Cronbach’s α coefficient=0.88 and an inter-rater reliability of
ICC=0.90 8).
The Postural Assessment Scale for Stroke (PASS) was developed to evaluate stroke patients’
performance of postural control by modifying and supplementing the FM-B16). The PASS consists of 12 items (0–3 points) among which
five items are for assessing postural maintenance and seven items for assessing postural
changes in relation with three basic postures (lying, sitting and standing), and its maximum
score is 36 points. It was reported that the PASS for chronic stroke patients has an
inter-rater agreement of weighted kappa coefficient=0.88 (0.61–0.96)17).
The Fugl-Meyer Motor Assessment for the Upper and Lower Extremities (FMA-U/E, L/E) has been
used to assess impairments in upper and lower extremity motor functions including the
movements, coordination and reflexes of agonistic and synergic muscles among stroke patients. The FMA is a 3-point scale (0–2 points) and consists of 33 items to assess upper
extremity motor functions (0–66 points) and 17 items to assess lower extremity motor
functions (0–34 points), with the maximum score of 100 points18). The test-retest reliabilities of the FMA-U/E, L/E in stroke patients were reported to be ICC=0.98 and 0.95, respectively, with that of the total score
of ICC=0.98 3).
A statistical analysis was performed using the SPSS 18.0 for Windows 7 in this study. For
the general characteristics of the subjects, a frequency analysis and descriptive statistics
were carried out. The test-retest agreement of the SF-BI total scores was calculated
using the intraclass correlation coefficient (ICC2,1) and that of the SF-BI individual
items using the weighted kappa coefficient. The Spearman correlation coefficient was used to examine the relation between the total scores of the SF-BI and
of the original BI, and item internal consistency was estimated using Cronbach’s α
coefficient. Also, in this study, the absolute reliability indices were calculated. To
quantify random measurement errors, the standard error of measurement (SEM = SD × √(1 − ICC), where SD is the standard
deviation of all test-retest scores) was used, while the minimal detectable
change (MDC = 1.96 × SEM × √2) was calculated as the reference value for
determining whether the actual change in each patient’s score (treatment effect size)
exceeds measurement error at the 95% confidence level14). The SEM is acceptable when it is <15% of the average of all test-retest scores, and the smaller
its value, the more acceptable it is. MDC values are acceptable when they are <20%
of the highest measured value10). To
estimate the concurrent validity of the SF-BI total scores, the PASS total scores were used,
while the correlation between the SF-BI total scores and the FMA total scores was calculated
using the Spearman correlation coefficient. All statistical significance levels were set at
α=0.05.
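The reliability indices described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the SPSS procedure used in the study: the function names, the use of the sample standard deviation for SD, and the three pairs of test-retest scores are all hypothetical assumptions introduced for the example. The ICC follows the standard two-way random effects, absolute agreement, single-measurement form usually denoted ICC(2,1).

```python
import math
from statistics import mean, stdev


def icc_2_1(scores):
    """ICC(2,1) from a list of rows, one row per subject, one column per session."""
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of sessions (2 for test-retest)
    pooled = [x for row in scores for x in row]
    grand = mean(pooled)
    row_means = [mean(row) for row in scores]
    col_means = [mean([row[j] for row in scores]) for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, sessions, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for x in pooled)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def sem_from_scores(scores, icc):
    """SEM = SD of all test-retest scores x sqrt(1 - ICC).

    Assumption: SD is taken as the sample standard deviation of the pooled scores.
    """
    pooled = [x for row in scores for x in row]
    return stdev(pooled) * math.sqrt(1.0 - icc)


def mdc95(sem_value):
    """MDC = 1.96 x SEM x sqrt(2), the 95% confidence threshold for real change."""
    return 1.96 * sem_value * math.sqrt(2.0)


# Hypothetical test/retest SF-BI totals for three subjects (not study data).
example = [[40, 45], [30, 35], [50, 55]]
icc = icc_2_1(example)
sem_value = sem_from_scores(example, icc)
mdc_value = mdc95(sem_value)
```

In this made-up example every subject scores 5 points higher at retest, so the residual error is zero and the ICC falls below 1 only because of the systematic shift between sessions, which is exactly what the absolute-agreement form penalizes.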
RESULTS
The weighted kappa coefficients of the SF-BI individual items were as follows: transfers=0.67,
bathing=0.61, toilet use=0.61, stair climbing=0.75, and ambulation=0.62. The degree of
agreement was good and the observed agreement rates ranged from 71% to 80%, which
was satisfactory.
The SF-BI total scores showed acceptable reliability, with ICC=0.91 (0.86–0.95), SEM=2.83 (<10% of the
average score, 41.66) and MDC=7.84 (<20% of the maximum score, 54).
The SF-BI individual items were strongly and positively correlated with the SF-BI total
scores (r=0.83–0.92) and with the original BI total scores
(r=0.78–0.86). The correlation between the SF-BI total scores and the original BI
total scores was very high (r=0.95). The Cronbach’s α coefficient of the
SF-BI total scores was 0.87, and it remained at 0.82 to 0.86, an acceptable level, even
when individual items were deleted.
The SF-BI individual items were significantly correlated with the PASS total
scores (r=0.75–0.78) and the FMA total scores (r=0.72–0.77), and the SF-BI
total scores were significantly correlated with the PASS total scores (r=0.81) and the FMA
total scores (r=0.76).
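As a quick arithmetic check, the reported SEM and MDC can be related to each other and to the acceptability thresholds with a short Python sketch. The constants are the values reported above; the helper `exceeds_mdc`, its use of "greater than or equal to" as the decision rule, and the example scores passed to it are hypothetical illustrations rather than part of the study.

```python
import math

# Values reported in the Results.
SEM_REPORTED = 2.83      # points
MDC_REPORTED = 7.84      # points
MEAN_SCORE = 41.66       # average SF-BI total score
MAX_SCORE = 54           # score used as the reference for the MDC percentage

# MDC = 1.96 x SEM x sqrt(2)
mdc = 1.96 * SEM_REPORTED * math.sqrt(2)

# SEM and MDC expressed as percentages of their reference scores.
sem_pct = 100 * SEM_REPORTED / MEAN_SCORE
mdc_pct = 100 * mdc / MAX_SCORE


def exceeds_mdc(test_score, retest_score, threshold=MDC_REPORTED):
    """Return True when the absolute change is at least the MDC.

    Assumption: a change equal to the MDC is counted as a real change.
    """
    return abs(retest_score - test_score) >= threshold


# Hypothetical patients: an 8-point gain versus a 5-point gain.
real_change = exceeds_mdc(30, 38)
within_error = exceeds_mdc(30, 35)
```

Recomputing the MDC from the reported SEM reproduces 7.84 points after rounding, and both percentages fall under the thresholds stated in the Methods (SEM <15% of the average score, MDC <20% of the reference score).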
DISCUSSION
This study was aimed at identifying the SF-BI’s test-retest reliability, absolute
reliability (the SEM and MDC) and associations with stroke-specific impairments. Regarding
the agreement of the SF-BI individual items in this study, stair climbing
showed the highest agreement (0.75), followed by transfers (0.67), ambulation (0.62),
bathing (0.61) and toilet use (0.61). The agreement rates of the SF-BI individual items were
71% to 80%, which was a satisfactory level, and the agreement of the SF-BI total scores
was confirmed to be high (ICC=0.91). These results are similar to those of the previous
study by Hsueh et al.1), where the weighted
kappa coefficients of the original BI individual items were 0.53 to 0.94 (median=0.72),
indicating satisfactory to excellent agreement, and the agreement of “bathing”
was 0.53, which was satisfactory and similar to the result of this study. In the present
study, “bathing” and “toilet use” showed relatively low agreement, which is attributed to
the evaluation method: these items were assessed by the subjects’ self-report rather than by
therapists’ direct observation of their performance. Therefore, the evaluation method needs
to be standardized and adjusted for the two items1). The SF-BI total scores of this study showed an ICC of 0.91
(0.86–0.95), which was consistent with 0.90 in the study of Hobart et al.8). However, the result was in contrast to that
of the study of Hsueh et al.1) that
presented ICCs of 0.55 at admission and 0.74 at discharge. In the case of the SF-BI, the
floor effect is reduced from 46.6% at admission to 13.6% at discharge. Therefore, the SF-BI
has a limitation in assessing the ADL of acute stroke patients. Nevertheless, it has been
reported to be appropriate for evaluating the independency levels of stroke patients, as it
has a responsiveness of 1.2, which is similar to that of the original BI and the FIM-motor
subscales1). The ICC indicates the
consistency of test scores in repeated measurements, but it should be supplemented with
absolute reliability indices because it does not show the size of the discrepancies between tests or of the measurement
errors14). The SEM of the SF-BI in this
study was 2.83, which was <10% of the average score of 41.66 and consistent with the SEM
of 2.2 of the SF-BI in the study of Hobart et al.8). However, the MDC of the SF-BI has not previously been reported. According to
previous studies, the SEM and MDC of the original BI were 1.45 and 4.02, respectively2), similar to the SEM and MDC of the SF-BI in
this study. The MDC of the SF-BI in this study was at an acceptable level, 7.84 points, which
is <20% of the maximum score of 54. A change of 7.84 points or more in a subject’s score
therefore lies outside the 95% confidence interval of measurement error, which means that it is not a measurement
error due to random variation and can be interpreted as a real functional change (improvement of
independency level) in that individual. A highly reliable test should have a
high ICC value and a low MDC value15),
which was shown in this study. Thus, it was also confirmed that the SF-BI is a reliable
assessment instrument for detecting and observing functional changes in patients over time
in clinical settings. In this study, it was found that the individual items and total scores
of the SF-BI are highly positively correlated (r=0.83–0.92) and that the SF-BI individual
items and the total scores of the original BI are also highly correlated (r=0.78–0.86). The
results of the study by Hobart et al.8)
were as follows: transfers (r=0.83), bathing (r=0.57), toilet use (r=0.83), stair climbing
(r=0.68) and ambulation (r=0.77). The present study showed differences in bathing (r=0.83)
and stair climbing (r=0.86). This is because the subjects of this study were stroke patients
while the study of Hobart et al.8) was
conducted on patients with various diseases such as multiple sclerosis (45.6%), stroke
(14.2%), spinal cord injuries (16.5%) and others (23.7%). In general, the application of one
assessment instrument to patients with various diseases is inadequate for the functional
assessment for patients and there is limitation in the estimation of recovery or response
rates19). In this study, the concurrent
validity (r=0.95) between the SF-BI and original BI total scores and the Cronbach’s α
coefficient (0.87) of the SF-BI were similar to those reported in the study of Hobart et
al.8) (r=0.96, 0.88). However, the
results were quite different from those of the study of Hsueh et al.1), which reported the concurrent validities of r=0.74 at
admission and r=0.94 at discharge, and the Cronbach’s α coefficients of 0.71 at admission
and 0.73 at discharge, respectively. The average illness duration of the subjects in this
study was 17.42 months and 24 chronic stroke patients who showed no natural recovery
were selected as the subjects, while the study of Hsueh et al.1) was conducted on 118 patients with acute stroke, including
patients with a median FIM-motor score of 28 points (the maximum score: 91 points)
and thus with severe impairments. In such a case, the floor effect may cause selective biases and systematic errors that
affect the validity and reliability among variables.
In this study, the item internal consistency of the SF-BI total scores was
considerably reliable, and even when individual items were deleted, it remained within a
range of 0.82 to 0.86, which is at a high level. This means that the characteristics of the
ADL of subjects are well reflected in the individual SF-BI items in terms of assessment of
their ADL levels, which suggests that the individual items in the SF-BI are closely related
to each other. The BI is an instrument to assess comprehensive ADL levels and it is known
that the BI total scores at 14, 30, 90, and 180 days after the onset of stroke are
significantly correlated with the FMA total scores (r=0.78–0.81)1). However, the SF-BI is composed of the two items of toilet
use and bathing (40%) derived from the original BI’s seven self-management items and the
three items of transfers, stair climbing and ambulation (60%) derived from the original BI’s
mobility items, which means that it is focused on the assessment of the ability to move. The
correlations of the SF-BI with the PASS and the FMA are not yet known. In this study,
the individual items and total scores of the SF-BI were highly correlated with the PASS
total scores (r=0.75–0.81). In a study on the correlation between the PASS and the FIM,
which is the most similar to the BI, the PASS was reported to be associated with the
FIM-motor items (r=0.82), ambulation (r=0.73) and the FIM total scores (r=0.73)16). The PASS includes the items to assess
standing on paralyzed and non-paralyzed sides (weight load, movement and balance) and
picking objects from the floor in a standing position. It consists of tasks that can
directly affect the mobility of stroke patients. Therefore, the PASS is considered to be
composed of essential items to assess postural control and reactions prerequisite and
necessary for the activities of daily living, so the two variables can be seen to be
significantly correlated. The FMA reflects the selective, isolated movements and coordination
of the upper and lower extremities and the combined flexion and extension of the two extremities
needed to perform activities of daily living20). The FMA is mainly focused on the performance of upper extremity
functions; however, the FMA total scores are significantly correlated with the BI total scores
(r=0.75), personal hygiene (r=0.89), transfers (r=0.76), feeding (r=0.72) and dressing
(r=0.76)20). Consistent with these
characteristics, the individual items and total scores of the SF-BI in this study were found
to be significantly related to the FMA total scores (r=0.73–0.77), which is similar to the
results of previous studies. Therefore, the SF-BI can be said to be suitable for the
selective evaluation of ADL as it has high reliability and validity. The SF-BI has
advantages: it allows data to be collected easily in a laboratory study and requires only a
minimal amount of time for managing data and interpreting results8). However, the reduction of evaluation items can cause a floor effect
and limit discrimination when ADL is evaluated uniformly for all subjects with stroke.
The limitations of this study are as follows: first, the SF-BI, which is focused on the
items for assessing lower extremity motor functions and self-management, is inadequate for
patients with stroke who have severe disorders because of the floor effect; second,
the relationship between the SF-BI and upper extremity motor functions was not analyzed; and
third, the results of this study cannot be generalized because the sample size is
small, the average age of the subjects is 60, and the FMA-L/E score is 23.71 (23 to 28: moderate
disorder) out of the maximum score of 34 18),
which suggests that the sample consisted of relatively active subjects. In that sense,
further studies are needed to check if the SF-BI can adequately evaluate ADL according to
the degree of impairments in subjects with stroke and the duration of illness and to
identify problems and limitations in the Rasch analysis and application of the evaluation. The
SF-BI was confirmed to be a useful evaluation instrument in that its test-retest agreement
rate, absolute reliability, item internal consistency, and validity were high. Therefore,
the SF-BI can be easily used in clinical practice, and both clinicians and researchers can
use it to selectively assess the ADL functions of patients with stroke and utilize the results as useful
information.
Funding
This paper was supported by research funds provided from the Howon University.