Reproducibility and reliability analysis of the Luk Distal Radius and Ulna Classification for European patients with adolescent idiopathic scoliosis.

James Houston¹, Amy Chiang², Shahnawaz Haleem³, Jason Bernard¹, Timothy Bishop¹, Darren F Lui¹.

Abstract

PURPOSE: Current clinical and radiological methods of predicting a patient's growth potential are limited in terms of practicality and accuracy, and are known to differ between races. This information influences optimal timing of bracing and surgical intervention in adolescent idiopathic scoliosis (AIS). The Luk classification was developed to mitigate limitations of existing tools. Few reliability studies are available, and these are limited to certain geographical regions with varying results. This study was performed to analyze reproducibility and reliability of the Luk Distal Radius and Ulna Classification in European patients.
METHODS: This is a radiological study of 50 randomly selected left hand and wrist radiographs of patients with AIS referred to a tertiary referral centre. They were assessed for bone maturity using the Luk Distal Radius and Ulna Classification. Assessment was performed twice by four examiners at an interval of one month. Statistical analysis was performed using the intraclass correlation (ICC) method to determine the reliabilities within and between the examiners.
RESULTS: In total, 50 radiographs (M:F = 13:37) with a mean age of 13.7 years (10 to 18) were assessed for reliability. The inter-rater ICC value was 0.918 for radius assessment and 0.939 for ulna assessment. The intra-rater ICC values ranged from 0.769 to 0.897 for radius assessment and from 0.786 to 0.948 for ulna assessment. There was near perfect correlation for both assessments.
CONCLUSION: This study provides independent evidence that the Luk Distal Radius and Ulna Classification is a reliable tool for assessment of skeletal maturity for European patients. Minimal clinical experience is required to reliably utilize it. LEVEL OF EVIDENCE: IV.

Keywords: Luk classification; adolescent idiopathic scoliosis; bone age; growth velocity; reliability

Year: 2021 PMID: 34040663 PMCID: PMC8138788 DOI: 10.1302/1863-2548.15.200251

Source DB: PubMed Journal: J Child Orthop ISSN: 1863-2521 Impact factor: 1.548

Introduction

Bone maturity assessment underpins the management of all types of scoliosis but is especially useful during the peak growth spurt just prior to cessation of growth.[1] Understanding the intricacies of bone maturation in this adolescent growth spurt period is invaluable for decision-making with regard to the use of a brace, employing growing rod techniques or proceeding to fusion.[2-5] Correct use of a brace can mitigate the need for surgery, and the development of accurate, practical and reliable tools is paramount to optimizing patient experience and outcomes.[2]
Multiple methods have been used to determine skeletal maturity, including clinical history, physical examination and imaging. Clinical measurements include recording height, spinal length, arm span and foot size.[6-8] While simple to perform, these rely on differences between serial recordings to gauge the growth spurt. Multiple recordings may not be efficient, may not be sensitive, may differ between races and, by the time a change has been noted, it may already be too late to intervene.[9-11] Secondary sexual characteristics, such as the timing of menarche, can also be used, but this is only applicable to female patients, and the peak growth spurt may have already passed by its occurrence.[12]
Given that there is considerable variety in patient size, even within a particular stage of skeletal maturity, various radiographic assessment tools have been developed. The Risser grading system is a simple and commonly used system that measures the iliac crest apophysis; however, it has been noted to grade peak growth poorly.[13-16] Other areas of the body used in bone maturity assessments include the proximal humerus, elbow and the fusion of the rib heads and ring epiphysis.[17-19]
Radiographs of the hand and wrist are also commonly used to assess age or grade maturity. These include the Tanner and Whitehouse (TW), Greulich and Pyle (GP) and Sanders classifications.[12,20-23] The Tanner-Whitehouse III (TW3) classification established a relationship between bone maturity and bone age and scores each bone based on maturity, with the total weighted score converted to a bone age.[24] The GP classification is based upon ‘Atlas-Matching’ and is less objective than the scoring method of TW.[23] Moreover, the GP, while reliable in a developed Caucasian population, was found to be unreliable in other ethnic populations.[25-31] Studies comparing both methods considered TW to be more reliable, but TW takes longer to grade, taking on average almost 8 minutes.[32,33] While both scoring systems are accurate, they are complex and time consuming to use.[32] This practically limits their use in a busy clinical setting. The Sanders or simplified TW3 classification (sTW3) removed the radius and ulna assessment and evaluated only Digital Skeletal Age.[12] The authors argued that even with removal of the radius and ulna assessment, curve progression could still be predicted accurately; however, given the complexity of the assessment, pre-clinical training was still required. A shorthand bone age assessment study attempted to shorten and simplify assessment but is valid only for an age range of 12.5 to 16 years for boys and 10 to 14 years for girls, hence limiting its use in the overall scoliosis setting.[34,35]
Luk et al[36] developed the Luk Distal Radius and Ulna Classification (LDRU) because of the limitations of the other skeletal maturity classifications in adolescent idiopathic scoliosis (AIS). In contrast to the Sanders classification, the LDRU looks solely at the distal radius and ulna. The original LDRU was simplified following feedback from early adopters, removing descriptions of multiple parameters at different stages. The current classification is based on examination of a plain film radiograph of the left wrist. It has 11 radius grades (R1 to R11) and nine ulna grades (U1 to U9). Growth was found to have peak spurt at levels R7 and U5 and halt at levels R10 and U9. Table 1 characterizes the LDRU.

Table 1

Characterization of the Luk Distal Radius and Ulna Classification

Stage	Description
Radius
R1	Epiphysis appears as single or multiple spots
R2	Distinct and oval-shaped epiphysis
R3	Maximal diameter is more than half the width of the metaphysis
R4	Double line at the distal border of the epiphysis represents the palmar and dorsal surface
R5	Width of epiphysis not as wide as the metaphysis
R6	Epiphysis is as wide as metaphysis. No capping or narrowing of the physis is seen
R7	Epiphysis capping on medial side, but not on the lateral side
R8	Epiphysis capping on both medial and lateral sides. The medial and lateral ends of the physis are wider than the centre
R9	Ossification has begun with blurring of the central physis
R10	The physeal line is closed, forming a sclerotic line. A notch is still visible at the medial or the lateral end of the growth plate
R11	Complete fusion of the physis with no notch
Ulna
U1	The epiphysis appears at single/multiple spots
U2	A round shaped epiphysis
U3	The epiphysis is at least half the width of the metaphysis
U4	The styloid is visible on the medial end of the epiphysis
U5	Epiphysis width up to the metaphyseal width
U6	Medial epiphysis as wide as the metaphysis
U7	Medial physeal plate narrowing or fusion
U8	More than half the medial growth plate fused with the unfused part just proximal to the styloid process
U9	Complete fusion of the physis
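
The growth milestones stated in the text (peak growth at R7 and U5, growth halting at R10 and U9) could be encoded as a small lookup for readers who handle grades programmatically. This is an illustrative sketch only: the function name `growth_phase` and the four phase labels are hypothetical and are not part of the LDRU, and the code is not a clinical decision tool.

```python
# Illustrative sketch; names and phase labels are assumptions, not part of the LDRU.
RADIUS_MAX, ULNA_MAX = 11, 9   # R1 to R11 and U1 to U9, as stated in the text
PEAK = {"R": 7, "U": 5}        # peak growth spurt at R7 and U5
HALT = {"R": 10, "U": 9}       # growth halts at R10 and U9


def growth_phase(grade: str) -> str:
    """Place a grade such as 'R7' or 'U5' relative to the peak and halt milestones."""
    bone, number = grade[0].upper(), int(grade[1:])
    top = {"R": RADIUS_MAX, "U": ULNA_MAX}.get(bone)
    if top is None or not 1 <= number <= top:
        raise ValueError(f"unknown LDRU grade: {grade!r}")
    if number < PEAK[bone]:
        return "before peak growth"
    if number == PEAK[bone]:
        return "peak growth"
    if number < HALT[bone]:
        return "after peak, still growing"
    return "growth halted"


for g in ("R6", "R7", "R9", "R10", "U5", "U9"):
    print(g, "->", growth_phase(g))
```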

Our study aims to perform a reliability analysis on the Luk Distal Radius and Ulna Classification system at a different geographical location. By studying a set of patients based in the United Kingdom, it assesses reliability in a different geographical population, as previous studies were limited to Chinese and Japanese cohorts.[37] The study also uses assessors at different grades of seniority to assess whether there is a difference in reliability due to experience. By broadly repeating the methodology of the original LDRU study, it aims to provide independent verification or rebuttal of reliability of the original classification.

Materials and methods

A radiographic study was performed on 50 randomly selected patients from a prospectively collected database who had presented with AIS to our tertiary referral centre for children with spinal conditions between 07 February 2017 and 29 August 2018. The study was registered within our hospital, but ethics committee approval was not required as the study did not affect patient management. Left hand radiographs (as per standard agreement) were obtained in all patients.[23,38] Four examiners were used (DFL, SH, JH, AC): one consultant spine surgeon, one senior fellow, one registrar and one medical student. There was no discussion between the examiners during the study or about the classification. Radiographs were accessed using a picture archiving and communication system (PACS v4.4; Intellispace PACS Enterprise, Philips, Foster City, California). Intraobserver reliability assessments were performed one month apart, with data being stored securely on a spreadsheet by AC until the end of reliability measurements.

Statistical analysis

Data were evaluated in descriptive and frequency terms and analyzed using SPSS version 25 (SPSS Inc, Chicago, Illinois). Inter- and intraobserver reliability was evaluated using intraclass correlation coefficients (ICC) with 95% confidence intervals, as per the original reliability analysis. The ICC was further categorized into standard groups based on alpha values: poor agreement (0 to 0.29), fair agreement (0.30 to 0.49), moderate agreement (0.5 to 0.69), strong agreement (0.7 to 0.8) and near perfect agreement (> 0.8). Further analysis was performed for interobserver disagreement.
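
To make the reliability statistic concrete, the following is a minimal sketch of one common form of the ICC. The text does not state which ICC model was selected in SPSS, so the choice here of a two-way random effects, absolute agreement, single-rater coefficient (often written ICC(2,1)) is an assumption, as are the function name `icc_2_1` and the made-up example grades, which are not study data.

```python
from typing import Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` holds one row per radiograph and one column per examiner.
    The ICC form is an assumption; the paper does not specify it.
    """
    n = len(ratings)       # number of subjects (radiographs)
    k = len(ratings[0])    # number of raters (examiners)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Made-up radius grades for 5 radiographs scored by 4 examiners (not study data).
example = [
    [7, 7, 8, 7],
    [5, 6, 5, 5],
    [9, 9, 9, 10],
    [6, 6, 7, 6],
    [8, 8, 8, 8],
]
print(round(icc_2_1(example), 3))
```

The same function could be applied to one examiner's two readings (two columns) as a rough analogue of the intra-rater analysis, again under the stated assumption about the ICC form.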

Results

In total, 50 patients (M:F = 13:37) with a mean age of 13.7 years (10 to 18) were assessed in the study. The inter-rater ICC values for radius (0.918) and ulna (0.939) assessment showed near perfect agreement, as shown in Table 2. The mean intra-rater ICC values for radius assessment (0.822; range 0.769 to 0.897) and ulna assessment (0.847; range 0.786 to 0.948) also showed near perfect agreement, as shown in Table 3.

Table 2

Inter-rater intraclass correlation coefficient (ICC) values for radius and ulna assessment

	Inter-rater ICC value	95% confidence interval
Radius assessment	0.918	0.878 to 0.948
Ulna assessment	0.939	0.908 to 0.962

Table 3

Intra-rater intraclass correlation coefficient values for radius and ulna assessment according to each examiner

	AC	DL	JH	SH	Mean
Radius assessment	0.897	0.809	0.769	0.814	0.822
Ulna assessment	0.948	0.843	0.786	0.810	0.847
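
As a worked check of Table 3, the sketch below recomputes the mean intra-rater ICC for each bone and maps every value to the agreement bands listed in the Statistical analysis section. The function name `agreement_band` is hypothetical, and treating each band as a lower cutoff (so that a value falling between two listed ranges is assigned to the lower band) is an assumption, since the text leaves small gaps between ranges.

```python
def agreement_band(icc: float) -> str:
    """Map an ICC value to the agreement bands given in the text.

    Assumption: each band is treated as a lower cutoff; 0.8 itself
    counts as strong because near perfect is defined as > 0.8.
    """
    if icc > 0.8:
        return "near perfect"
    if icc >= 0.7:
        return "strong"
    if icc >= 0.5:
        return "moderate"
    if icc >= 0.3:
        return "fair"
    return "poor"


# Intra-rater ICC values copied from Table 3.
intra_rater = {
    "radius": {"AC": 0.897, "DL": 0.809, "JH": 0.769, "SH": 0.814},
    "ulna": {"AC": 0.948, "DL": 0.843, "JH": 0.786, "SH": 0.810},
}

for bone, values in intra_rater.items():
    mean = sum(values.values()) / len(values)
    bands = {name: agreement_band(v) for name, v in values.items()}
    print(bone, round(mean, 3), agreement_band(mean), bands)
```

Under these cutoffs both means fall in the near perfect band, while the two lowest individual values (0.769 and 0.786) fall in the strong band.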

Intraobserver disagreement

For the radius (n = 200) there were a total of 84 (42%) one-grade disagreements and 18 (9%) two-grade disagreements. For the ulna (n = 200) there were a total of 62 (31%) one-grade disagreements, 13 (6.5%) two-grade disagreements and two (1%) three-grade disagreements. Disagreements were not discussed between examiners, as the intention of the study was to assess reliability and reproducibility at different knowledge and experience levels.
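
The percentages above follow directly from the counts, as the short sketch below shows. Two points are assumptions rather than statements from the text: that n = 200 corresponds to 50 radiographs read by four examiners, and that all readings not listed as disagreements were exact agreements. The function name `summarize` is hypothetical.

```python
from typing import Dict

# Assumption: n = 200 paired readings per bone (50 radiographs x 4 examiners).
N_PAIRS = 50 * 4

# Disagreement counts copied from the text, keyed by grade difference.
disagreements = {
    "radius": {1: 84, 2: 18},
    "ulna": {1: 62, 2: 13, 3: 2},
}


def summarize(counts: Dict[int, int], n: int = N_PAIRS) -> Dict[str, float]:
    """Return the percentage of paired readings for each grade difference."""
    summary = {f"{diff}-grade": 100 * c / n for diff, c in counts.items()}
    # Exact agreement is not reported; it is inferred here as the remainder.
    summary["exact (inferred)"] = 100 * (n - sum(counts.values())) / n
    return summary


for bone, counts in disagreements.items():
    print(bone, summarize(counts))
```

If the remainder assumption holds, exact agreement would be 49% of readings for the radius and 61.5% for the ulna.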

Discussion

This study finds that the LDRU classification had near perfect agreement for interobserver and intraobserver analysis. Our study is broadly representative of the original reliability analysis, with similar average age and range of scores.
The authors of the LDRU suggest that the system is simple, practical and provides accurate prediction of a patient’s peak growth spurt. One study has suggested that the LDRU is more reliable than the sTW3, and may also provide more practical indications of when to start weaning off the brace.[39,40] Bracing was suggested in the initial study to begin at R6 and R7, although Li et al[41] suggested earlier adoption at R5 and more frequent observations around R7 to R9 to begin brace weaning at R10.[36] If the benefits of the LDRU are true, it could provide better guidance on whether bracing is appropriate, and when to begin and end bracing for any given patient with AIS.[2,41,42]
The evidence for reliability in the LDRU is limited, and the key to cementing evidence is reproducibility in research.[43] Few reliability analyses have been performed so far, and they are all confined to a single continent.[39,41,44] The larger reliability analysis was performed by the same unit which designed the classification, and reported both inter- and intraobserver reliability to be near perfect, leaving the study open to potential criticism for not being independent.[36,44] The second study gave data on interobserver disagreement which was lower than our findings. None of the independent analyses included boys and results varied between them. Okuda et al[39] found significantly lower reliability in the LDRU in their cohort of Japanese patients compared with ours and the initial study.[36] This may be due to the use of a different statistical tool: kappa values instead of ICC. They also found, in contrast to our study, that the radius was the more reliable tool for assessment, although they noted that the low reliability in their ulna classification may be due to different amounts of pronation on their radiographs. Li et al[41] have suggested that either the radius or the ulna could be assessed in isolation given there is good correlation between radial and ulnar progression; however, this was only applicable to older children, with the caveat that variations between radius and ulna maturity might exist. Their study of 40 physically immature girls in a Chinese population found near perfect reliability.
Our study used four observers, which is a greater number than most of the other reliability analyses. This study also had a greater number of patients evaluated than the other independent analyses. Our study finds that interobserver reliability was better than intraobserver reliability. Similar to the findings of Cheung et al,[9] this may demonstrate a possible learning curve effect for examiners, although this was short and may not be clinically significant. Our interobserver reliability was high without significant prior experience of using the LDRU. In our study three of the four examiners were novices at using the system. There was no apparent increase in reliability based upon the radiograph experience of the examiner. Of interest, the least experienced examiner, a medical student, had the highest intra-rater reliability. This shows that, contrary to the sTW3, little to no experience of the system is required and that reliability is also not dependent on the experience of the observer.
Our study has some limitations. Similar to the other studies, there were no patients with R1 to R4 grades. This could be due to our referral system, which does not involve a school screening programme, the usual mode of referral for the original authors. Our patients may, therefore, be referred at a later stage in their growing phase. However, bracing decisions are only applicable to patients around their peak growth spurt, which is typically around R7 and U5, and these scores were well represented in our study. The majority of our patients are Caucasian in origin (both in the outpatient and the theatre settings); however, we have not specifically analyzed the ethnic origins of our study patients.

Conclusions

This study confirms that the Luk Distal Radius and Ulna Classification system is a reliable system for assessing skeletal maturity. The LDRU system can be used reliably in patients from the United Kingdom as well as in Hong Kong. Little to no experience is required to use the LDRU, and its reliability is also not dependent on the clinical experience of the observer. Further research is required to evaluate its reliability in younger children, its sensitivity and specificity, as well as how different grades should influence management decisions.

42 in total

1. The Iliac apophysis; an invaluable sign in the management of scoliosis.

Authors: J C RISSER
Journal: Clin Orthop Date: 1958

2. Applicability of Greulich and Pyle skeletal age standards to Indian children.

Authors: Sumit T Patil; M P Parchand; M M Meshram; N Y Kamdi
Journal: Forensic Sci Int Date: 2011-10-19 Impact factor: 2.395

3. The association between brace compliance and outcome for patients with idiopathic scoliosis.

Authors: Tariq Rahman; J Richard Bowen; Masakazu Takemitsu; Claude Scott
Journal: J Pediatr Orthop Date: 2005 Jul-Aug Impact factor: 2.324

4. Reproducibility of bone ages when performed by radiology registrars: an audit of Tanner and Whitehouse II versus Greulich and Pyle methods.

Authors: D G King; D M Steventon; M P O'Sullivan; A M Cook; V P Hornsby; I G Jefferson; P R King
Journal: Br J Radiol Date: 1994-09 Impact factor: 3.039

5. Standards from birth to maturity for height, weight, height velocity, and weight velocity: British children, 1965. I.

Authors: J M Tanner; R H Whitehouse; M Takaishi
Journal: Arch Dis Child Date: 1966-10 Impact factor: 3.791

6. The value of shoe size for prediction of the timing of the pubertal growth spurt.

Authors: Iris Busscher; Idsart Kingma; Frits Hein Wapstra; Sjoerd K Bulstra; Gijsbertus J Verkerke; Albert G Veldhuizen
Journal: Scoliosis Date: 2011-01-20

7. An innovative technique of vertebral body stapling for the treatment of patients with adolescent idiopathic scoliosis: a feasibility, safety, and utility study.

Authors: Randal R Betz; John Kim; Linda P D'Andrea; M J Mulcahey; Rohinton K Balsara; David H Clements
Journal: Spine (Phila Pa 1976) Date: 2003-10-15 Impact factor: 3.468

8. Correlation of Risser sign, radiographs of hand and wrist with the histological grade of iliac crest apophysis in girls with adolescent idiopathic scoliosis.

Authors: William Wei Jun Wang; Cai Wei Xia; Feng Zhu; Ze Zhang Zhu; Bin Wang; Shou Feng Wang; Benson Hiu Yan Yeung; Simon Kwong Man Lee; Jack Chun Yiu Cheng; Yong Qiu
Journal: Spine (Phila Pa 1976) Date: 2009-08-01 Impact factor: 3.468

9. The rib epiphysis and other growth centers as indicators of the end of spinal growth.

Authors: Stanley Hoppenfeld; Baron Lonner; Vasantha Murthy; Yun Gu
Journal: Spine (Phila Pa 1976) Date: 2004-01-01 Impact factor: 3.468

10. Reproducibility in research.

Authors: Vivian Siegel
Journal: Dis Model Mech Date: 2011-05 Impact factor: 5.758