BACKGROUND: Congenital talipes equinovarus (clubfoot) is one of the most common congenital pediatric orthopedic foot deformities and varies in severity and clinical course. Assessing the severity of the clubfoot deformity is essential to establish the initial severity, to monitor the progress of treatment, to prognosticate, and to identify early relapse. Pirani's scoring system is the most widely accepted and popular method for clubfoot assessment because it is simple, quick, cost effective, and easy. Although the scoring system is subjective in nature and therefore prone to inter- and intra-observer variability, it is widely used. Hence, we assessed the interobserver variability between orthopedic surgeons in assessing clubfoot severity by the Pirani scoring system.
MATERIALS AND METHODS: We assessed the interobserver variability between five orthopedic surgeons of comparable skill in assessing clubfoot severity by the Pirani scoring system in 80 feet of 60 children (20 bilateral and 40 unilateral) with clubfoot deformity. All five surgeons were familiar with Pirani clubfoot severity scoring and Ponseti cast manipulation, as they had already worked in CTEV clinics for at least 2 months. Each of them independently scored each foot as per the Pirani clubfoot scoring system and recorded the total score (TS), midfoot score (MFS), hindfoot score (HFS), posterior crease (PC), emptiness of heel (EH), rigidity of equinus (RE), medial crease (MC), curvature of lateral border (CLB), and lateral head of talus (LHT). Interobserver variability was calculated using the kappa statistic for each of these signs and was judged as poor (0.00-0.20), fair (0.21-0.40), moderate (0.41-0.60), substantial (0.61-0.80), or almost perfect (0.81-1.00).
RESULTS: The mean age was 137 days (range 21-335 days). The mean Pirani score was 3.86. Consistency was substantial for the total score (kappa 0.71) and also for the midfoot (0.68) and hindfoot (0.66) scores separately. Among the individual signs, consistency was lowest for emptiness of heel (kappa 0.39) and best for rigidity of equinus (kappa 0.68); the remaining parameters were moderate (kappa between 0.40 and 0.60).
CONCLUSION: The Pirani scoring system has substantial reliability in assessing the clubfoot deformity, even when the reliability test is extended to five different orthopedic surgeons simultaneously.
Keywords:
Club foot; Clubfoot; Pirani score; congenital abnormalities; foot; interobserver variability; reliability and validity
Congenital talipes equinovarus (clubfoot, CTEV) is one of the most common congenital pediatric orthopedic foot deformities that require correction.12 Assessing the severity of the clubfoot deformity is essential to establish the initial severity, to monitor the progress of treatment, to prognosticate, and to identify early relapse.13 Various clinical assessment scoring systems exist, such as those of Ponseti and Smoley,4 Catterall,5 Dimeglio,6 and Harrold and Walker.7 The Pirani score is reliable, quick, and easy to use; hence, it is used both for the initial assessment and for followup of the treatment.89 The subjective nature of the scoring system, however, makes it prone to interobserver variability. Different studies have compared the interobserver variability of the Pirani score between an orthopedic surgeon and a physiotherapist or allied health worker. To the best of our knowledge, however, no study has compared the interobserver variability of the Pirani score among orthopedic surgeons themselves, who are the most frequent users of the scoring system.1011 Hence, the purpose of this study was to assess the interobserver variability between orthopedic surgeons in assessing clubfoot severity by the Pirani scoring system.
MATERIALS AND METHODS
A foot deformity correction camp was organized at our institute in September 2015. All patients attending the camp with a foot deformity were examined by a senior orthopedic surgeon to screen for clubfoot. All patients with idiopathic clubfoot aged <1 year were included in the study. Patients with secondary clubfoot, previously operated patients, those with atypical clubfoot, and children older than 1 year were excluded. All the children included in the study were independently examined and assessed by five different orthopedic surgeons of comparable clinical experience and skill, who were familiar with Pirani clubfoot severity scoring and Ponseti cast manipulation. All five were senior residents who had at least 2 years of experience after completing their postgraduation in orthopedics and had worked for at least 2 months in the CTEV clinic that is run weekly in the department at our institute.
All these children were then started on treatment by Ponseti cast manipulation and were asked to return weekly for regular review in the CTEV clinic.
All five orthopedic surgeons independently scored each foot as per the Pirani clubfoot scoring system. The total score (TS) is the sum of the midfoot score (MFS) and the hindfoot score (HFS). The HFS is the sum of three signs: posterior crease (PC), emptiness of heel (EH), and rigidity of equinus (RE). The MFS is the sum of three signs: medial crease (MC), curvature of lateral border (CLB), and lateral head of talus (LHT). Each of these six signs was graded as 0 (no abnormality), 0.5 (moderate abnormality), or 1 (severe abnormality) according to the deformity. Thus, the TS, i.e., the sum of the MFS and HFS across all six signs, can range from 0 to 6, with 6 being the most deformed foot and 0 being normal [Table 1].8
Table 1
Pirani scoring system
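The scoring arithmetic described in the methods can be sketched in a few lines of Python. This is only an illustration of how the six sign grades combine into HFS, MFS, and TS; the class name `PiraniSigns` and its field names are hypothetical and do not come from the study.

```python
from dataclasses import dataclass, fields

# Each sign is graded 0 (no abnormality), 0.5 (moderate), or 1 (severe).
ALLOWED_GRADES = (0.0, 0.5, 1.0)


@dataclass(frozen=True)
class PiraniSigns:
    """Hypothetical container for the six Pirani signs of one foot."""
    pc: float   # posterior crease (hindfoot)
    eh: float   # emptiness of heel (hindfoot)
    re: float   # rigidity of equinus (hindfoot)
    mc: float   # medial crease (midfoot)
    clb: float  # curvature of lateral border (midfoot)
    lht: float  # lateral head of talus (midfoot)

    def __post_init__(self):
        # Reject any grade outside the three permitted levels.
        for f in fields(self):
            value = getattr(self, f.name)
            if value not in ALLOWED_GRADES:
                raise ValueError(f"{f.name} must be 0, 0.5, or 1, got {value!r}")

    @property
    def hindfoot_score(self) -> float:
        return self.pc + self.eh + self.re

    @property
    def midfoot_score(self) -> float:
        return self.mc + self.clb + self.lht

    @property
    def total_score(self) -> float:
        # TS = MFS + HFS, so it ranges from 0 (normal) to 6 (most deformed).
        return self.midfoot_score + self.hindfoot_score


# Made-up example foot, not a patient from the study.
foot = PiraniSigns(pc=1, eh=0.5, re=1, mc=0.5, clb=0.5, lht=1)
print(foot.hindfoot_score, foot.midfoot_score, foot.total_score)
```

Validating the grades at construction time keeps an out-of-range entry from silently inflating the total.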
The data were analyzed for interobserver variability using the kappa statistic for each of the six signs (PC, EH, RE, MC, CLB, LHT) and also for the MFS, HFS, and TS across all five orthopedic surgeons. Interobserver reliability (strength of agreement) by the kappa statistic was judged as poor (0.00–0.20), fair (0.21–0.40), moderate (0.41–0.60), substantial (0.61–0.80), or almost perfect (0.81–1.00).
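The text does not state which multi-rater variant of the kappa statistic was used. As a minimal sketch, assuming Fleiss' kappa (one common choice when more than two raters grade every subject), agreement for a single sign could be computed and mapped to the bands above as follows. The function names are hypothetical, and treating negative kappa values as "poor" is an assumption, since the quoted bands start at 0.00.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of subjects, each a list with one grade per rater."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if any(len(r) != n_raters for r in ratings):
        raise ValueError("every subject needs the same number of ratings")
    counts = [Counter(r) for r in ratings]
    # Observed agreement: proportion of agreeing rater pairs, per subject.
    p_i = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Chance agreement from the overall share of each category.
    p_j = [sum(c[cat] for c in counts) / (n_subjects * n_raters) for cat in categories]
    p_e = sum(p ** 2 for p in p_j)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_bar - p_e) / (1 - p_e)


def agreement_band(kappa):
    """Map kappa to the bands quoted in the text (negative values treated as poor)."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Made-up grades for one sign: three feet, five raters each (not study data).
grades = [[1, 1, 1, 1, 0.5], [0.5, 0.5, 0.5, 0.5, 0.5], [0, 0, 0.5, 0, 0]]
k = fleiss_kappa(grades, categories=(0, 0.5, 1))
print(round(k, 2), agreement_band(k))
```

Applied to the values reported in this study, `agreement_band(0.71)` gives "substantial" for TS and `agreement_band(0.39)` gives "fair" for EH.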
RESULTS
A total of 112 children were enrolled for the foot deformity camp, of whom 78 had clubfoot deformity. After the inclusion criteria were applied, 60 children were enrolled in the study. Twenty had bilateral involvement and 40 had unilateral involvement (24 right and 16 left). Thus, a total of 80 feet in 60 children were included in the study. Of these 60 children, 46 were male and 14 were female. Age ranged from 21 to 335 days (mean 137 days). None of these children was ambulatory at the time of assessment.
The mean values of the six Pirani score parameters across all five observers were 0.61, 0.55, 0.76, 0.62, 0.67, and 0.67 for PC, EH, RE, MC, CLB, and LHT, respectively. The means across all five observers for HFS, MFS, and TS were 1.92, 1.98, and 3.86, respectively [Table 2]. The mean overall Pirani score was 3.86, whereas the mean Pirani score was 3.28 in unilateral cases and 4.44 in bilateral cases.
Table 2
Mean of the Pirani score parameters by different observers
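As a quick consistency check on the reported means, the overall mean can be recovered as a feet-weighted average of the subgroup means. This sketch assumes the subgroup means are per foot, with 40 feet in the unilateral group and 40 feet from the 20 bilateral children; the helper `pooled_mean` is hypothetical.

```python
def pooled_mean(groups):
    """Weighted mean from a list of (number_of_feet, mean_score) pairs."""
    total_feet = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_feet


# 40 unilateral feet (mean 3.28) and 40 bilateral feet (mean 4.44).
overall = pooled_mean([(40, 3.28), (40, 4.44)])
print(round(overall, 2))  # 3.86, matching the reported overall mean
```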
The overall kappa value, i.e., interobserver reliability, for the total Pirani score (TS) was 0.71, indicating a substantial degree of agreement between the observers. Reliability was also substantial for the HFS and MFS, with kappa values above 0.6 in both (0.66 and 0.68, respectively).
The interobserver reliability (kappa value) for the hindfoot signs was 0.46 (moderate) for PC, 0.39 (fair) for EH, and 0.68 (substantial) for RE; for the midfoot signs it was 0.43 (moderate) for MC, 0.56 (moderate) for CLB, and 0.53 (moderate) for LHT [Table 3].
Table 3
Inter-observer reliability of Pirani score parameters
DISCUSSION
The incidence of clubfoot is 1:1000 live births, and 50% of cases are bilateral.12 The condition is variable in its clinical course, severity, and expected response to treatment, which makes the duration and type of treatment required unpredictable.3 Hence, while treating clubfoot, it is important to classify and grade the various forms and severities of CTEV. Classification systems help to assess the initial degree and severity of the composite deformity before treatment, to monitor and guide the progress of treatment, to predict and compare outcomes, and to identify early relapse and plan treatment accordingly.13
An ideal classification should describe the deformity, correlate and compare outcomes, determine the treatment, and predict prognosis without intra- and inter-observer variability.13 It should be simple, easy, user friendly, objective, uniformly accepted, cost effective, reliable, reproducible, and retrievable from retrospective analysis. It should be comprehensive, accounting for the three-dimensional characteristics of the deformity, include separate information for forefoot, midfoot, and hindfoot deformities, and be applicable to all forms of deformity, at all ages, and at all stages of treatment.13
Assessment by radiography and magnetic resonance imaging is not recommended in the child for various reasons, such as nonvisualization of unossified cartilage, projection errors, difficult positioning, radiation exposure, poor cost-effectiveness, and lack of universally uniform interpretation.1213 Authors have also emphasized clinical evaluation as the yardstick for the assessment of deformity.61415
Even with improved understanding of the pathoanatomy of clubfoot, a reliable classification system based on clinical evaluation remains elusive, and there is no agreed ideal grading system.
Many authors, such as Maceven et al.,16 Wynne-Davies,17 Chacko and Mathew,18 McKay,19 Ponseti and Smoley,4 Harrold and Walker,7 Catterall,5 Diméglio et al.,6 and Pirani et al.,89 have developed classification systems. None has proved superior to the others, and a gold standard is yet to be established. However, Pirani's classification has gained wide clinical acceptance and popularity because it is simple, reliable, quick, cost effective, and easy to learn, use, and apply.320 It can predict the number of casts required to correct the deformity and the probability of Achilles tendon tenotomy.2021 Scher et al. found that a significantly higher Pirani score requires significantly more casts and that the HFS, rather than the MFS, predicts the need for tenotomy, as it is the hindfoot equinus that the tenotomy corrects.22
Several authors, such as Catterall5 and Cummings et al.,23 have commented on the lack of intra- and inter-observer consistency in classification systems owing to their subjective nature; despite the lack of reliable data, surgeons have been using them regularly as a dependent measure.2324
Since the Pirani score is one of the most commonly used scores, we thought it worthwhile to determine its interobserver consistency among five different orthopedic surgeons using the kappa value.
Flynn concluded that there is good interobserver reliability of 89% for both the Dimeglio and Pirani classification systems between an orthopedic specialist and a fellow in pediatric orthopedics, but only after a short initial training phase.25 Porter assessed the inter- and intra-observer agreement of photographic and radiological measurements of the resting neonatal foot with clubfoot and showed a mean measurement error of more than 9°.26
Wainwright compared four clubfoot assessment systems and found that the Ponseti and Smoley classification, which is based on the worst component of the deformity, and Harrold and Walker's system, which is based on the ability to correct the deformity, both produced moderate to substantial agreement when all feet were assessed, whereas Catterall's system had only poor to slight agreement. For all three systems, agreement was lowest, only fair to moderate, when normal feet were excluded and only affected feet were assessed. The Diméglio system gave the best agreement (moderate to substantial), but it is complex and needs training to reduce the discrepancy from 40% to 6%. They concluded that all current classifications are still not entirely satisfactory, as they are subjective in nature and have inter- and intra-observer variation. Jillani et al.,10 in a two-stage study (before and after training), compared an orthopedic surgeon with a lower-level allied health worker, i.e., a plaster technician with a 2-year operation theater technician diploma, and reported overall kappa values of 0.716, 0.625, 0.696, 0.675, 0.391, 0.543, 0.457, and 0.362 for CLB, MC, LHT, PC, RE, EH, HFS, and TS, respectively, concluding that prior training and supervision in the early phase improve reliability. They found interobserver reliability to be fair to substantial (fair for TS and equinus rigidity; moderate to substantial for the other parameters), with point-to-point interobserver agreement of 82% for all components of the deformity.10 Another study showed moderate to substantial interobserver reliability between a pediatric orthopedic surgeon and a physiotherapy assistant, with point-to-point interobserver agreement of 83% for all components of the deformity;11 the κ statistic was 0.61 for PC, 0.72 for EH, 0.51 for RE, 0.54 for HFS, 0.57 for MC, 0.54 for CLB, 0.56 for LHT, 0.50 for MFS, and 0.50 for TS.
Flynn found higher agreement of 89% when the comparison was made between two physicians of comparable skill, i.e., an orthopedic specialist and a fellow in pediatric orthopedics, with correlation coefficients of 0.90 for the Pirani classification and 0.83 for the Dimeglio classification. Correlation coefficients were much lower for the first 15 feet scored and were also lower when the therapist's scores were included.25 In a similar study, Pirani et al. found the interobserver strength of agreement in clubfoot scoring to be substantial or almost perfect among three independent observers, with kappa values for TS, MFS, and HFS of 0.92, 0.91, and 0.86, respectively.9 However, in their study, the second observer was an orthopedic resident, not a paramedic.
Since most of the studies above compared only two observers, we thought it would be interesting to extend the comparison to five orthopedic surgeons of comparable skill and experience. We found the overall consistency to be substantial for the overall score (TS kappa 0.71) and also for the midfoot and hindfoot scores separately. When the components were examined separately, however, consistency was lowest for EH (kappa 0.39) and best for RE (kappa 0.68), and the remaining parameters were moderate (kappa between 0.40 and 0.60). Thus, EH was the least reliable parameter and rigidity of equinus the most reliable in our study. Since both parameters are part of the HFS, the HFS agreement remained marginally on the substantial side.
Our study is limited by several factors. Repeated examination by several observers may have led to greater flexibility of the foot, and the child and parents may have tolerated earlier examinations better than later ones.
Furthermore, collecting static measurements from infants is challenging because of the small size of the foot, the less evident anatomical landmarks, and the limited degree of cooperation.
Interobserver variation can also be attributed to differences in the training and background of observers, which we tried to minimize by including orthopedic surgeons of comparable skill and experience. Among the six individual Pirani signs, agreement was substantial only for RE, and the rest showed fair or moderate agreement. One likely reason is that the Pirani system is not very sensitive: with only three scoring levels (0, 0.5, and 1), it tends toward a grade of moderate abnormality. Nevertheless, the overall Pirani score had substantial agreement. Another limitation of the study is the small number of feet, although even with this number the power of the study is more than 0.80 at an alpha error of 0.05. Finally, the study includes only the children who attended the camp on the single day on which it was held; hence, it is limited to 80 feet.
CONCLUSION
The Pirani scoring system has substantial reliability in assessing the clubfoot deformity, even when the reliability test is extended to five different orthopedic surgeons simultaneously. This consistency was also seen in the individual parameters of the Pirani score when assessed separately, except for EH, which was the least reliable of all the parameters. We recommend further studies that include many observers simultaneously, such as surgeons, physiotherapists, or assistants, to assess the reliability of these classification systems.