Reporting errors in plain radiographs for lower limb trauma: a systematic review and meta-analysis.

Thomas York, Christopher Franklin, Kate Reynolds, Greg Munro, Heloise Jenney, William Harland, Darren Leong.

Abstract

INTRODUCTION: Plain radiographs are a globally ubiquitous means of investigation for injuries to the musculoskeletal system. Despite this, initial interpretation remains a challenge and inaccuracies give rise to adverse sequelae for patients and healthcare providers alike. This study sought to address the limited, existing meta-analytic research on the initial reporting of radiographs for skeletal trauma, with specific regard to diagnostic accuracy of the most commonly injured region of the appendicular skeleton, the lower limb.
METHOD: A prospectively registered, systematic review and meta-analysis was performed using published research from the major clinical-science databases. Studies identified as appropriate for inclusion underwent methodological quality and risk of bias analysis. Meta-analysis was then performed to establish summary rates for specificity and sensitivity of diagnostic accuracy, including covariates by anatomical site, using HSROC and bivariate models.
RESULTS: A total of 3887 articles were screened, with 10 identified as suitable for analysis based on the eligibility criteria. Sensitivity and specificity across the studies were 93.5% and 89.7% respectively. Compared with other anatomical subdivisions, interpretation of ankle radiographs yielded the highest sensitivity and specificity, with values of 98.1% and 94.6% respectively, and a diagnostic odds ratio of 929.97.
CONCLUSION: Interpretation of lower limb skeletal radiographs operates at a reasonably high degree of sensitivity and specificity. However, one in twenty true positives is missed on initial radiographic interpretation and safety netting systems need to be established to address this. Virtual fracture clinic reviews and teleradiology services in conjunction with novel technology will likely be crucial in these circumstances.
© 2021. The Author(s).

Keywords:  Accuracy; Emergency; Foot and ankle; Initial reporting; Knee; Lower limb; Reporting errors; Skeletal radiographs; Trauma radiographs

Year:  2021        PMID: 34143230      PMCID: PMC8626392          DOI: 10.1007/s00256-021-03821-9

Source DB:  PubMed          Journal:  Skeletal Radiol        ISSN: 0364-2348            Impact factor:   2.199


Introduction

In February of 1896, at the physics laboratory of Dartmouth College, Edwin Brant Frost used what were then known as roentgen rays to capture an image of the healing ulna of his patient, Edward McCarthy [1]. The supreme clinical applications of this novel technology were not lost on early observers. Silvanus P. Thompson (President of the Roentgen Society) said a year later: ‘Excepting only the introduction into surgery by Lord Lister of antiseptics, and the discovery of anaesthetics, no discovery in the present century has done so much for operative surgery as this of the roentgen rays’ [1].
Over the following 130 years of clinical practice, plain radiographs have remained foundational to the investigation of musculoskeletal injuries. The WHO estimates that 3.6 billion investigations using ionising radiation are performed globally each year, the majority of which are simple X-rays [2]. In the UK, more than 60% of emergency department attendances have a primary diagnosis relating to the musculoskeletal (MSK) system [3]. In total, 38.7% of all patients will receive at least one plain radiograph and in MSK injuries this rises to over 50% [4].
Despite being ubiquitous, the interpretation of skeletal radiographs is challenging, and errors can be of significant detriment to both patients and care providers. The interpretation of radiographs in a trauma setting is especially fraught, with high patient turnover and often junior staff. Consequently, emergency departments are recognised as ‘high risk’ for diagnostic error [5]. Research reviewing UK medicolegal claims in skeletal radiology between 1995 and 2006 showed the ‘great majority followed missed diagnoses of fractures following trauma’ [6].
Existing research has shown variable levels of performance in the initial interpretation of skeletal radiographs for trauma. Across all radiographs in the emergency department setting, an error rate of approximately 3% has been shown [7]. In the upper limb, estimates suggest incorrect assessment is made in around 8.5% of cases [7, 8]. There have so far been limited attempts to produce summary rates of reporting error in plain skeletal radiographs of lower limb trauma, despite a body of individual studies assessing this both in generality and by more specific anatomical site. The aims of this study were to conduct a systematic review and meta-analysis of the existing literature to establish sensitivity, specificity, and diagnostic odds ratio for the initial interpretation of lower limb radiographs (including those of the anatomical subdivisions: foot, ankle, knee and femur).

Methods

Review protocol and search strategy

This systematic review was prospectively registered with the PROSPERO database; a copy of the review protocol can be found under registration number CRD42020197973. In April of 2020, the PubMed MEDLINE, Embase, Cochrane Database of Systematic Reviews (CDSR) and Cochrane Central Register of Controlled Trials (CENTRAL) databases were searched from 1990 to the present, using a search strategy developed with the aid of Imperial College Library Services. The full electronic search strategy is detailed in Fig. 1.
Fig. 1

Literature review process

Eligibility criteria

In accordance with the objectives of this study, eligibility criteria were developed by the authors to identify papers containing pertinent data for inclusion. These were as follows:
- Written in the English language
- Conducted during or after 1990
- Original research, published in peer reviewed, academic journals (editorial letters, opinion pieces and expert reviews were excluded)
- Reporting an initial assessment of plain radiographs of the lower limb, performed by identified members of staff or grade of staff and compared to a definitive assessment of findings
- Investigation of subjects with a confirmed or suspected trauma and orthopaedic injury, as characterised by the WHO ICD-11
- Radiographs included for review being of skeletally mature subjects
- Conducted in active healthcare settings where diagnostic services are provided to a patient population
- Outcomes reported with respect to accuracy, specificity or sensitivity of radiograph reporting
- Outcomes reported with respect to specific anatomical site or regional anatomy

Study selection

An initial sample of 200 search results was reviewed for inclusion by the six reviewing authors (TY, CF, KR, GM, HJ, WH). Using the eligibility criteria against title and abstract, each author sorted these 200 results into ‘reject’ or ‘further review’ categories. Inter-reader reliability assessment was then performed to establish the degree of agreement amongst the authors on those articles meriting further review. Fleiss’ Multirater Kappa was calculated to be 0.640 (p < .005), conventionally taken to represent substantial agreement [9]. Each author then individually assessed an equal share of the remaining results by title and abstract, again categorising as ‘reject’ or ‘further review’. These, along with the reviewed results of the initial sample were combined, and further categorised on the basis of the anatomical region to which they related: lower limb, upper limb, pelvis, spine and thorax, skull and facial. Where an article included data pertinent to more than one anatomical region, it was duplicated, and a copy assigned to both. TY and CF then reviewed the full text of all potentially eligible results categorised as lower limb against the aforementioned eligibility criteria. Where disparity arose, it was resolved by means of further review and joint assessment.
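The agreement statistic described above can be sketched as follows. This is a minimal illustration of Fleiss' kappa for several raters sorting items into ‘reject’ or ‘further review’; the ratings in the usage note are hypothetical, not the authors' screening data, and the function names are illustrative. The interpretation bands follow the commonly quoted Landis and Koch scale.

```python
from collections import Counter

def fleiss_kappa(ratings, categories):
    """Fleiss' kappa. `ratings` holds one list per item, one label per rater."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][cat]: number of raters who placed item i in category cat
    counts = [Counter(item) for item in ratings]
    # Observed agreement for each item
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items
    # Agreement expected by chance, from overall category proportions
    p_cats = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_e = sum(p ** 2 for p in p_cats)
    return (p_bar - p_e) / (1 - p_e)

def landis_koch_label(kappa):
    """Descriptive band for a kappa value (Landis and Koch convention)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.00, "almost perfect")]:
        if kappa <= upper:
            return label
    return "almost perfect"

# Hypothetical example: 4 abstracts, 6 reviewers
CATEGORIES = ["reject", "further review"]
example = [
    ["reject"] * 6,
    ["further review"] * 6,
    ["reject"] * 5 + ["further review"],
    ["reject"] * 3 + ["further review"] * 3,
]
example_kappa = fleiss_kappa(example, CATEGORIES)
```

Applied to the reported value, `landis_koch_label(0.640)` falls in the ‘substantial’ band, matching the interpretation given in the text.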

Data collection and assessment

A bespoke data extraction tool was developed by the authors; this was applied to all included studies. Variables recorded were radiograph reporting population, male/female % of radiograph subjects, recruitment methods to study, anatomical site identified, reporting accuracy/error rate %, specificity %, sensitivity % and qualitative outcome statement. An assessment was made of methodological quality using the MINORS tool [10] and of risk of bias using a modified Cochrane RoB2 tool [11]. Where the authors initially made a divergent assessment of any study, a consensus evaluation was formed.

Summary and synthesis

The radiograph reporting populations, reporting accuracy and specificity/sensitivity were identified as the principal summary measures. Meta-analysis was then performed in order to produce summary estimates of specificity and sensitivity, including covariates by anatomical site, using HSROC and bivariate model analysis.
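As a minimal sketch of the per-study quantities that feed such a synthesis, the example below derives sensitivity and specificity from a 2x2 table and transforms them to the logit scale, which is the scale on which bivariate models are usually fitted. The counts are hypothetical and not taken from any included study, and the model fitting itself is not reproduced here.

```python
import math

def sensitivity(tp, fn):
    """Proportion of truly positive radiographs reported as positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Proportion of truly negative radiographs reported as negative."""
    return tn / (tn + fp)

def logit(p):
    """Log-odds transform; undefined for p equal to 0 or 1."""
    return math.log(p / (1 - p))

# Hypothetical 2x2 table for one study (illustrative counts only)
tp, fn, fp, tn = 90, 10, 5, 95
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
logit_pair = (logit(sens), logit(spec))
```

Studies reporting 100% sensitivity or specificity have no finite logit, which is one reason such analyses typically rely on dedicated software rather than hand calculation.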

Results

Study selection and characteristics

After the removal of duplicates, a total of 3887 papers were identified for screening. Following abstract review, 89 articles were progressed to full-text review. A total of 23 articles were included for qualitative synthesis, of which 10 articles yielded data suitable for meta-analysis [12-21]. These 10 articles examined an aggregate of 3902 sets of radiographs, producing a total of 4709 radiograph interpretation episodes for meta-analysis (see Fig. 2).
Fig. 2

Literature review process

The specific anatomical areas examined by articles in the meta-analysis were foot (n = 3), ankle (n = 4), knee (n = 1) and femur (n = 2). Two studies examined multiple anatomical locations (see Table 1).
Table 1

Table of characteristics for all articles included in meta-analysis

| Study author | Anatomical location | Reporters | Sets of radiographs | Gold standard | No. interpretations | Sensitivity | Specificity | Error rate |
|---|---|---|---|---|---|---|---|---|
| Utukuri MM, 2000, UK | Foot | 2 orthopaedic consultants, 2 orthopaedic trainees, 2 basic surgical trainees and 2 senior emergency medicine staff | 50 | Known clinical course of patient +/− scans | 200 for junior and 200 for senior staff | One view: senior 92.5%, junior 97.5%. Two views: senior 100%, junior 97.5% | One view: senior 86.5%, junior 71.6%. Two views: senior 90%, junior 78.5% | One view: senior 11%, junior 18%. Two views: senior 6%, junior 14% |
| Vannier MW et al., 1991, USA | Foot | 4 MSK radiologists | 7 | Consensus of plain film and CT | 28 | 95.8% | 100.0% | 3.6% |
| York TJ et al., 2020, UK | Ankle | ED doctors ranging in experience from F2 to consultant | 2947 | Consensus of Orthopaedic Surgeon and MSK radiologist | 2947 | 94.0% | 94.8% | 5.6% |
| Singh A.K et al., 1990, UK | Ankle | N/A | 114 | Consensus of XR and US, if discrepancy then repeat XR performed at 3 weeks | 114 | 85.2% | 100.0% | 3.5% |
| Ozturk, P et al., 2018, Turkey | Ankle | 1 consultant orthopaedic surgeon | 120 | CT examination | 120 | 92.8% | 100.0% | 2.5% |
| Gray S, 1997, USA | Knee | 4 radiology residents | 92 | Consensus of 3 MSK consultant radiologists with access to follow up imaging | 368 | Two view 79%, four view 85% | Two view 87%, four view 92% | Two view 4.1%, four view 2.7% |
| Riaz O, et al. 2016 UK | Femur | 2 orthopaedic SpR | 289 | Intraop findings | 578 | One view 54.3%, two view 92.1% | One view 89.9%, two view 91.4% | One view 16.4%, two view 4.3% |
| Lampart A et al., 2019, Switzerland | Femur | Consultant radiologist | 70 | CT examination | 70 | 82.1% | 96.8% | 11.4% |
| Remplik P et al., 2004, Germany | Knee, ankle, foot | 2 experienced radiologists | 43 | Clinical follow-up + review of all imaging | 86 (knee 16, ankle 48, foot 22) | 58.5% (knee 0%, ankle 66.7%, foot 83.3%) | 68.9% (knee 91.4%, ankle 77.7%, foot 100%) | 17.4% (knee 31.3%, ankle 25.0%, foot 3.6%) |
| Ricci et al., 2019, Italy | Extremity fractures | N/A | 198 | CBCT | 198 | 85.3% | 65.5% | 20.2% |
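The error rates in Table 1 appear consistent with the share of all interpretations that were misclassified (false positives plus false negatives, divided by the number of interpretations). This reading is an assumption, since the source does not define the error rate explicitly. Under that assumption, the sketch below back-calculates the implied number of erroneous interpretations for the studies that report a single error rate, and checks that each count reproduces the published percentage.

```python
# (number of interpretations, reported error rate in %) as listed in Table 1
single_rate_studies = {
    "Vannier 1991": (28, 3.6),
    "York 2020": (2947, 5.6),
    "Singh 1990": (114, 3.5),
    "Ozturk 2018": (120, 2.5),
    "Lampart 2019": (70, 11.4),
    "Ricci 2019": (198, 20.2),
}

def implied_errors(n_interpretations, error_pct):
    """Nearest whole number of misclassified interpretations (assumed definition)."""
    return round(n_interpretations * error_pct / 100)

def error_rate_pct(n_errors, n_interpretations):
    """Error rate as a percentage, rounded to one decimal place."""
    return round(100 * n_errors / n_interpretations, 1)

implied = {
    study: implied_errors(n, pct) for study, (n, pct) in single_rate_studies.items()
}

for study, (n, pct) in single_rate_studies.items():
    print(f"{study}: about {implied[study]} of {n} interpretations in error ({pct}%)")
```

The implied counts are a derived illustration only; they are not figures reported by the original studies.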
The studies primarily involved the comparison of plain film radiology with an alternative form of imaging (n = 6). Alternatively, inter-reader plain film X-ray diagnostic performance was examined (n = 1), or the value of additional X-ray views on diagnostic performance (n = 2), or both (n = 1). The seniority of the studied initial reporters ranged from post-graduate surgical and radiology trainees to senior orthopaedic surgeons, radiologists and emergency physicians. There was some variation across the ten articles included in the meta-analysis, specifically regarding the definition of a ‘positive’ and ‘negative’ radiographic finding. One article [14] defined positive and negative findings as the presence or absence of any bony or soft tissue pathology. This included soft tissue injury, fractures, dislocations, osteomyelitis and osteoporosis. The other nine articles defined positive and negative findings as the presence or absence of a bony fracture [12, 13, 15–21]. However, two of these nine articles went further and required radiograph interpreters to correctly classify any fracture identified for their findings to be regarded as a ‘true’ positive. Utukuri [12] required interpreters to specify if a calcaneal fracture was intra- or extra-articular. For proximal femur fractures, Riaz O et al. [18] required radiograph interpreters to correctly specify the location and degree of fracture displacement.

Individual study results

Across all lower limb studies, sensitivity ranged from 0.59 to 0.97, and specificity from 0.66 to 1.00. Utukuri [12] found the highest sensitivity in initial interpretation, with 0.97 achieved for radiographs of the foot. Ricci [21] found the lowest specificity, with only 0.66 achieved for lower limb radiographs (see Table 2).
Table 2

Individual study results forest plot

Synthesis of results

A bivariate model was used to conduct meta-analysis along with a hierarchical summary receiver operating characteristic (HSROC) curve for diagnostic performance across all lower limb plain radiographs (see Fig. 3).
Fig. 3

HSROC for all studies

The summary estimate of sensitivity across the included studies was 93.5%, with specificity of 89.7% and a false positive rate of 10.3%. Covariate analysis was also performed to assess specificity and sensitivity by lower limb anatomical subdivision; this was possible for all subdivisions apart from the knee, where only a single included study was found (see Table 3).
Table 3

Summary estimates

| Anatomical region | Sensitivity | Specificity | False positive rate | Diagnostic odds ratio |
|---|---|---|---|---|
| All studies | 0.935 | 0.897 | 0.103 | 125.303 |
| Femur | 0.949 | 0.846 | 0.154 | 103 |
| Knee | not estimated | not estimated | not estimated | not estimated |
| Ankle | 0.981 | 0.946 | 0.054 | 929.974 |
| Foot | 0.949 | 0.94 | 0.06 | 296.168 |

Summary sensitivity and specificity were both found to be highest for ankle radiographs, 98.1% and 94.6% respectively. Similarly, the initial interpretation of ankle radiographs had the highest diagnostic odds ratio (929.97).
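The relationship between the columns of Table 3 can be illustrated with the standard definition of the diagnostic odds ratio: the odds of a positive report in injured patients divided by the odds of a positive report in uninjured patients. The sketch below applies that definition to the rounded summary estimates. Its results approximate, but do not exactly match, the published odds ratios, presumably because the published figures derive from unrounded model estimates; that explanation is an assumption.

```python
def false_positive_rate(spec):
    """Complement of specificity."""
    return 1 - spec

def diagnostic_odds_ratio(sens, spec):
    """DOR = [sens / (1 - sens)] / [(1 - spec) / spec]."""
    return (sens / (1 - sens)) / ((1 - spec) / spec)

# Rounded summary estimates (sensitivity, specificity) from Table 3
summary = {
    "All studies": (0.935, 0.897),
    "Femur": (0.949, 0.846),
    "Ankle": (0.981, 0.946),
    "Foot": (0.949, 0.940),
}

dor = {region: diagnostic_odds_ratio(se, sp) for region, (se, sp) in summary.items()}

for region, value in dor.items():
    print(f"{region}: DOR of roughly {value:.0f} from rounded inputs")
```

Because the odds ratio grows steeply as sensitivity or specificity approaches 1, small rounding differences in the ankle estimates shift its value by a noticeable amount.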

Risk of bias assessment

All studies included in meta-analysis were analysed using a modified Cochrane risk of bias tool. This qualitative tool assesses study risk of bias on seven separate criteria. One study was considered to be at high risk of bias due to scoring in more than four categories. Four studies were considered at moderate risk of bias due to scoring in three or more categories or scoring particularly strongly in one of two categories. Five studies scored in two or fewer categories and so were considered to have a low risk of bias (see Table 4).
Table 4

Modified Cochrane ‘Risk of Bias’ assessment tool

Methodological quality

The methodological quality of the ten articles identified for meta-analysis was assessed using the MINORS (methodological index for non-randomised studies) tool developed by Slim et al. [10]. Scores ranged from 13 to 22 out of a possible 24 points. Articles generally scored highly (average score 16.9). Nine (90%) of the studies lacked a prospective calculation of study size, and seven (70%) did not possess an unbiased assessment of their endpoint (see Table 5). Conversely, the studies tended to have minimal losses to follow-up (80%) and involved the prospective collection of data (70%).
Table 5

Table demonstrating study methodological quality as per MINORS assessment tool

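MINORS items: 1, clearly stated aim; 2, inclusion of consecutive patients; 3, prospective collection of data; 4, endpoints appropriate to the aim of the study; 5, unbiased assessment of the study endpoint; 6, follow-up period appropriate to the aim of the study; 7, loss to follow-up less than 5%; 8, prospective calculation of study size; 9, an adequate control group; 10, contemporary groups; 11, baseline equivalence of groups; 12, adequate statistical analyses.

| Study author | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gray S, 1997, USA | 2 | 2 | 2 | 2 | 0 | 2 | 2 | 0 | 1 | 0 | 1 | 2 | 16 |
| Lampart A et al., 2019, Switzerland | 2 | 2 | 0 | 2 | 1 | 2 | 2 | 0 | 2 | 2 | 2 | 2 | 19 |
| Ozturk, P et al., 2018, Turkey | 2 | 1 | 2 | 2 | 0 | 2 | 2 | 0 | 2 | 2 | 2 | 2 | 19 |
| Remplik P et al., 2004, Germany | 2 | 2 | 2 | 1 | 0 | 2 | 2 | 0 | 2 | 2 | 0 | 2 | 17 |
| Riaz O, et al. 2016 UK | 2 | 2 | 0 | 2 | 0 | 2 | 0 | 0 | 2 | 2 | 2 | 2 | 16 |
| Ricci et al., 2019, Italy | 0 | 1 | 2 | 0 | 0 | 2 | 2 | 0 | 2 | 2 | 2 | 0 | 13 |
| Singh A.K et al., 1990, UK | 0 | 1 | 2 | 1 | 1 | 2 | 2 | 0 | 2 | 2 | 2 | 0 | 15 |
| Utukuri MM, 2000, UK | 1 | 0 | 2 | 2 | 0 | 1 | 2 | 0 | 1 | 2 | 2 | 2 | 15 |
| Vannier MW et al., 1991, USA | 2 | 0 | 2 | 2 | 0 | 2 | 2 | 0 | 2 | 2 | 2 | 1 | 17 |
| York TJ et al., 2020, UK | 2 | 2 | 0 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 22 |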
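The summary figures quoted for the MINORS assessment can be re-derived from the item scores in Table 5. The sketch below is a simple tally, with study labels shortened for readability; it recomputes each total, the range and mean of totals, and the number of studies scoring zero on the two items singled out in the text.

```python
# Item scores 1 to 12 for each study, transcribed from Table 5
minors_scores = {
    "Gray 1997":    [2, 2, 2, 2, 0, 2, 2, 0, 1, 0, 1, 2],
    "Lampart 2019": [2, 2, 0, 2, 1, 2, 2, 0, 2, 2, 2, 2],
    "Ozturk 2018":  [2, 1, 2, 2, 0, 2, 2, 0, 2, 2, 2, 2],
    "Remplik 2004": [2, 2, 2, 1, 0, 2, 2, 0, 2, 2, 0, 2],
    "Riaz 2016":    [2, 2, 0, 2, 0, 2, 0, 0, 2, 2, 2, 2],
    "Ricci 2019":   [0, 1, 2, 0, 0, 2, 2, 0, 2, 2, 2, 0],
    "Singh 1990":   [0, 1, 2, 1, 1, 2, 2, 0, 2, 2, 2, 0],
    "Utukuri 2000": [1, 0, 2, 2, 0, 1, 2, 0, 1, 2, 2, 2],
    "Vannier 1991": [2, 0, 2, 2, 0, 2, 2, 0, 2, 2, 2, 1],
    "York 2020":    [2, 2, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2],
}

totals = {study: sum(items) for study, items in minors_scores.items()}
mean_total = sum(totals.values()) / len(totals)
score_range = (min(totals.values()), max(totals.values()))

def studies_scoring_zero(item_number):
    """Count studies with a score of 0 on a given MINORS item (1-based)."""
    return sum(1 for items in minors_scores.values() if items[item_number - 1] == 0)

no_size_calculation = studies_scoring_zero(8)   # prospective calculation of study size
no_unbiased_endpoint = studies_scoring_zero(5)  # unbiased assessment of the endpoint
```

Running the tally reproduces the reported range of 13 to 22, the mean of 16.9, and the counts of nine and seven studies for items 8 and 5 respectively.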

Discussion

Key findings

This study finds that initial interpreters of lower limb plain radiographs for trauma achieve a relatively high degree of sensitivity (93.5%). It is difficult to quantify the rate at which healthcare systems are justified in accepting the failure to detect findings. Certainly, false negatives are likely to represent the most deleterious of these errors, as borne out by the evidence on litigation for missed fractures both in the UK [6, 22] and abroad [23, 24]. False negatives in the initial interpretation of greater than one in twenty lower limb radiographs mean that busy accident and emergency or trauma settings are likely to miss substantial numbers of injuries. This appears to support the necessity of safety-netting measures to mitigate the risk of reporting errors. In particular, virtual fracture clinic review [25] and out-of-hours teleradiology services [26] have been widely adopted across the UK and Europe. Alongside these existing methods, the development of novel technologies (such as artificial intelligence algorithms [27]) to supplement interpretation is evidence of a broadly accepted clinical need to improve this reporting. The summary specificity of reporting (89.7%) was found to be 3.8 percentage points lower than sensitivity, suggesting that initial interpreters were less able to identify true negative skeletal radiographs. This finding was commented upon by Utukuri et al. [12] and is also supported by a wider evidence base that shows increasing the seniority of interpreters has a greater benefit to specificity than sensitivity [28, 29]. This implies that some interpretation errors, particularly false negatives, represent a limitation of plain radiographs as a modality and so are not easily preventable. These findings also explain the conclusions of the qualitative synthesis, which highlighted the importance of corroborating radiograph interpretation with examination and clinical judgement to prevent fractures being ‘missed’ [7, 30–32].
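The ‘greater than one in twenty’ figure follows directly from the summary sensitivity: the false negative rate is its complement. The sketch below makes the arithmetic explicit and scales it to a hypothetical caseload; the figure of 1000 true fractures is illustrative only and does not come from the study.

```python
SUMMARY_SENSITIVITY = 0.935  # summary estimate reported in this review

# Share of true positives not detected on initial interpretation
false_negative_rate = 1 - SUMMARY_SENSITIVITY

# Expressed as "one missed in every N true positives"
one_in_n = 1 / false_negative_rate

def expected_missed(n_true_positives, sensitivity=SUMMARY_SENSITIVITY):
    """Expected number of true positives missed at a given sensitivity."""
    return n_true_positives * (1 - sensitivity)

# Hypothetical caseload of 1000 radiographs with a genuine fracture
missed_per_thousand = expected_missed(1000)

print(f"False negative rate: {false_negative_rate:.1%}")
print(f"Roughly one missed in every {one_in_n:.1f} true positives")
```

A rate of one in twenty would correspond to a sensitivity of 95%; at 93.5% the miss rate is somewhat higher than that, which is why the text describes it as greater than one in twenty.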
Of the compared anatomical subdivisions, the diagnostic odds ratio for ankle radiographs was found to be superior, followed by the foot and then the femur. The cause of this is not explored in this study; however, the frequency with which ankle injuries present to emergency and trauma care settings may mean initial interpreters are more practised in the review of these radiographs. The ankle is both the most commonly injured joint and the most frequently operated upon [33], with the estimated incidence for fractures of the ankle being as high as 187 per 100,000 people per annum [34].

Limitations

Of the included studies, a generally favourable assessment of risk of bias and methodological quality was made. However, weaknesses were noted due to the lack of prospective size calculation and of an unbiased assessment of the endpoint. The extent to which these factors influence results is uncertain; however, a number of studies appear underpowered [12, 19, 20]. During study selection, a number of large sample-size papers were identified but lacked sufficient characterisation of data for inclusion in meta-analysis. Whilst these are targeted for use in future analysis, they emphasise the importance of reporting diagnostic accuracy in line with STARD 2015 [35] or similar, relevant guidelines.

Conclusions

This study suggests that the initial interpretation of plain skeletal radiographs is performed with a relatively high degree of specificity and sensitivity. However, this still represents greater than one in twenty true positives being missed on primary review. Systems designed to provide safety netting against this are essential, as is the development of novel means to improve the accuracy of initial interpretation. Evidence is also found to support statistically significant variation in the accuracy of interpretation across anatomical subdivisions; radiographs of the ankle were shown to have the highest diagnostic odds ratio. The cause of this is uncertain and may reflect inherent difficulties present in certain radiographic views or anatomy, or simply greater interpreter familiarity with some radiographs. Further research is warranted to explore these factors.

References (33 in total)

1.  Methodological index for non-randomized studies (minors): development and validation of a new instrument.

Authors:  Karem Slim; Emile Nini; Damien Forestier; Fabrice Kwiatkowski; Yves Panis; Jacques Chipponi
Journal:  ANZ J Surg       Date:  2003-09       Impact factor: 1.872

2.  Medico-legal claims against English radiologists: 1995-2006.

Authors:  S F S Halpin
Journal:  Br J Radiol       Date:  2009-05-26       Impact factor: 3.039

3.  Malpractice Litigation Following Traumatic Fracture.

Authors:  Shaan A Ahmed; Steven F DeFroda; Syed J Naqvi; Adam E M Eltorai; Davis Hartnett; Jack H Ruddell; Christopher T Born; Alan H Daniels
Journal:  J Bone Joint Surg Am       Date:  2019-04-03       Impact factor: 5.284

4.  Emergency physician accuracy using ultrasonography to diagnose lateral malleolar fracture.

Authors:  Pinar Ozturk; Ersin Aksay; Nese Colak Oray; Basak Bayram; Onur Basci; Duygu Tokgoz
Journal:  Am J Emerg Med       Date:  2017-08-07       Impact factor: 2.469

5.  The value of the axial view in assessing calcaneal fractures.

Authors:  M M Utukuri; D Knowles; K L Smith; J L Barrie; D Gavan
Journal:  Injury       Date:  2000-06       Impact factor: 2.586

6.  The measurement of observer agreement for categorical data.

Authors:  J R Landis; G G Koch
Journal:  Biometrics       Date:  1977-03       Impact factor: 2.571

7.  Supination-eversion fractures of the ankle joint: changes in incidence over 30 years.

Authors:  M Bauer; U Bengnér; O Johnell; I Redlund-Johnell
Journal:  Foot Ankle       Date:  1987-08

8.  Reporting of fracture radiographs by radiographers: the impact of a training programme.

Authors:  C F Loughran
Journal:  Br J Radiol       Date:  1994-10       Impact factor: 3.039

9.  STARD 2015 guidelines for reporting diagnostic accuracy studies: explanation and elaboration.

Authors:  Jérémie F Cohen; Daniël A Korevaar; Douglas G Altman; David E Bruns; Constantine A Gatsonis; Lotty Hooft; Les Irwig; Deborah Levine; Johannes B Reitsma; Henrica C W de Vet; Patrick M M Bossuyt
Journal:  BMJ Open       Date:  2016-11-14       Impact factor: 2.692

10.  Reporting Discrepancy Resolved by Findings and Time in 2947 Emergency Department Ankle X-rays.

Authors:  Thomas James York; P J Jenkins; A J Ireland
Journal:  Skeletal Radiol       Date:  2019-11-21       Impact factor: 2.199
