Blood metal ion levels are not a useful test for adverse reactions to metal debris: A systematic review and meta-analysis.

M Pahuta1, J M Smolders2, J L van Susante2, J Peck1, P R Kim1, P E Beaule3.   

Abstract

OBJECTIVES: Alarm over the reported high failure rates for metal-on-metal (MoM) hip implants as well as their potential for locally aggressive Adverse Reactions to Metal Debris (ARMDs) has prompted government agencies, internationally, to recommend the monitoring of patients with MoM hip implants. Some have advised that a blood ion level >7 µg/L indicates potential for ARMDs. We report a systematic review and meta-analysis of the performance of metal ion testing for ARMDs.
METHODS: We searched MEDLINE and EMBASE to identify articles from which it was possible to reconstruct a 2 × 2 table. Two readers independently reviewed all articles and extracted data using explicit criteria. We computed a summary receiver operating characteristic curve using a Bayesian random-effects hierarchical model.
RESULTS: Our literature search returned 575 unique articles; only six met inclusion criteria defined a priori. The discriminative capacity of ion tests was homogeneous across studies, but there was substantial cut-point heterogeneity. Our best estimate of the "true" area under the curve (AUC) for metal ion testing is 0.615, with a 95% credible interval of 0.480 to 0.735; the probability that metal ion testing is actually clinically useful, with an AUC ≥ 0.75, is therefore 1.7%.
CONCLUSION: Metal ion levels are not useful as a screening test for identifying high-risk patients because ion testing will either lead to a large burden of false positive patients, or otherwise only marginally modify the pre-test probability. With the availability of more accurate non-invasive tests, we did not find any evidence for using blood ion levels to diagnose symptomatic patients.
Cite this article: M. Pahuta, J. M. Smolders, J. L. van Susante, J. Peck, P. R. Kim, P. E. Beaule. Blood metal ion levels are not a useful test for adverse reactions to metal debris: a systematic review and meta-analysis. Bone Joint Res 2016;5:379-386. DOI: 10.1302/2046-3758.59.BJR-2016-0027.R1.
© 2016 Pahuta et al.

Keywords:  Hip implants; Metal ion; Metal-on-metal

Year:  2016        PMID: 27612918      PMCID: PMC5027892          DOI: 10.1302/2046-3758.59.BJR-2016-0027.R1

Source DB:  PubMed          Journal:  Bone Joint Res        ISSN: 2046-3758            Impact factor:   5.853


We report a systematic review and meta-analysis of the performance of blood metal ion testing for adverse reactions to metal debris (ARMDs) in patients with metal-on-metal (MoM) hip implants. Blood metal ion levels are not useful as a screening test for identifying high-risk patients because ion testing will either lead to a large burden of false positive results or otherwise marginally modify the pre-test probability. With the availability of more accurate non-invasive tests such as MRI, there is no role for using blood ion levels to diagnose symptomatic patients. Out of 575 references identified by the literature search, only six met selection criteria. These six studies were homogeneous in terms of diagnostic performance estimates.

Introduction

Despite the fact that total hip arthroplasty (THA) has been touted as the operation of the 21st century, orthopaedic researchers continue to propose new designs in an effort to improve implant longevity and patient function.[1] One such purported improvement was the use of a large femoral head with a metal-on-metal (MoM) bearing; this design failed and has become a significant public health concern leading to withdrawal and even recall of certain implant designs.[2,3] This is yet another example of the flawed cycle of innovation in arthroplasty where new implants actually underperform relative to the existing standard.[4] This alarm over MoM hip implants stems from reported high failure rates, and the potential for locally aggressive ion-induced local tissue reactions such as pseudotumours, a type of adverse reaction to metal debris (ARMD).[5] It has been estimated that over a million patients have received these implants worldwide, thus posing a significant concern in regard to monitoring and advising patients on both short and long term performance.[5] In the 1980s, MoM bearings were introduced as an improvement on standard metal-on-polyethylene (MoP) bearings. MoM bearings generate less volumetric wear which may translate into longer implant survival.[6] In addition, MoM bearings facilitate greater hip stability and range of movement from larger head sizes. 
Furthermore, MoM bearings allow for bone-conserving hip resurfacing (HR) which is important for younger patients who will eventually require revision surgery.[7] Both MoP and MoM bearings have been associated with elevated systemic levels of metal ions and ARMDs; however, ion levels have been shown to be consistently higher in patients with MoM bearings, and ARMDs appear to be more common in these patients.[8-11] ARMDs span a spectrum of aseptic necrotic effusions that include pseudotumours as well as aseptic lymphocyte-dominated vasculitis-associated lesions (ALVAL) (pericapsular hypersensitivity reactions associated with osteolysis).[12-14] When patients present with mechanical symptoms or pain following a MoM hip implant, unlike with a MoP hip implant, expectant management may not be advisable due to the potential for ARMD lesions which can lead to progressive tissue destruction that compromises reconstructive options.[15-17] As these lesions are diagnosed with ultrasound (US) or metal artefact reduction sequence magnetic resonance imaging (MARS MRI), there has been interest in developing an inexpensive and rapid laboratory test.[18] Government agencies worldwide have published recommendations on the surveillance and work-up of patients with MoM hip implants. The United Kingdom’s Medicines and Healthcare products Regulatory Agency (2012),[19] the United States’ Food and Drug Administration (2013),[20] the European Commission’s Scientific Committee on Emerging and Newly Identified Health Risks (2014),[21] Health Canada (2012)[22] and the Therapeutic Goods Administration of Australia (2012)[23] have recommended close follow-up of patients, even those with well-functioning implants.
Surveillance with metal ion testing has been recommended for patients with the ASR implant (DePuy Synthes, Warsaw, Indiana) in the United Kingdom, with large-diameter THA and small-diameter hip resurfacing (HR) in Australia, and with large-diameter THA and any HR in Europe. The recommended work-up of symptomatic patients by each of these organisations includes blood metal ion assessment. To date, the most detailed guidelines were put forth by the United Kingdom’s Medicines and Healthcare products Regulatory Agency, which advised that a blood metal ion level > 7 µg/L indicates potential for soft-tissue reaction. As noted by others, ion level cut-offs are arbitrary and not supported by scientific data.[5,24-26] Given the uncertain utility of laboratory testing in surveillance and investigation of patients with MoM hip implants, there is a need to synthesise the evidence for the measurement of blood cobalt and chromium ion concentrations. In this paper we report a systematic review and meta-analysis of the screening and diagnostic value of metal ion testing for ARMDs.

Materials and Methods

Literature search and data extraction

We conducted an electronic search to identify relevant articles that reported original research findings including blood ion concentrations in patients with total hip arthroplasty (THA) or hip resurfacing (HR). MEDLINE (between January 1946 and 15 February 2015) and EMBASE (between January 1974 and February 2015) were searched for relevant publications with the assistance of a clinical librarian. The electronic search was individually tailored to each database to maximise sensitivity (see Appendix 1). We supplemented the electronic search by using Scopus to obtain the articles referenced by, and the articles citing, each of the articles ultimately included in the meta-analysis. Two readers (a fellowship-trained joint reconstruction surgeon and an orthopaedic surgery resident) independently reviewed all articles using explicit criteria, and recorded assessments using a standard computerised form. A third reader (a fellowship-trained joint reconstruction surgeon) resolved any disagreements. Readers screened titles and abstracts to exclude animal and basic science studies, review articles, guidelines, and editorials. The readers identified articles evaluating a hip prosthesis (either HR or THA) and reporting cobalt and/or chromium blood ion concentrations from the full text. We included articles in the meta-analysis if it was possible to reconstruct a 2 × 2 table for the use of blood ion measurements as a test for ARMDs. Only studies that evaluated metal ion levels for screening or for diagnosis of symptomatic patients were eligible. Studies that only recruited patients who underwent revision surgery were excluded (three studies, 81 hips). In these papers, the decision making for revision was not clearly described.
Consequently we felt that there was a high risk of bias from the spectrum effect as ion levels were likely used in the decision-making process.[27] Eligible measures of diagnostic performance included: sensitivity; specificity; predictive values; likelihood ratios; diagnostic odds ratios; and receiver operating characteristic (ROC) curves. If an odds ratio for an ion cut-point used as a covariate in logistic regression was reported, we deemed the study eligible. Reviewers collected the following covariates: country of study; inclusion criteria; benchmark test; index test; number of patients; ARMD prevalence; number of revisions; and prevalence of symptomatic patients. Reviewers also assessed the quality of studies using the QUADAS-2 tool.[28]
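As an illustration of why a reconstructible 2 × 2 table was the inclusion requirement, the sketch below derives the eligible accuracy measures (sensitivity, specificity, likelihood ratios and the diagnostic odds ratio) from the four cell counts. The class name `TwoByTwo` and the counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Index test (ion level above a cut-point) cross-classified against the benchmark test."""
    tp: int  # ion test positive, ARMD present
    fp: int  # ion test positive, ARMD absent
    fn: int  # ion test negative, ARMD present
    tn: int  # ion test negative, ARMD absent

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def positive_likelihood_ratio(self) -> float:
        return self.sensitivity() / (1 - self.specificity())

    def negative_likelihood_ratio(self) -> float:
        return (1 - self.sensitivity()) / self.specificity()

    def diagnostic_odds_ratio(self) -> float:
        return (self.tp * self.tn) / (self.fp * self.fn)


# Hypothetical counts chosen only to make the arithmetic easy to follow.
example = TwoByTwo(tp=30, fp=20, fn=10, tn=40)
print(f"sensitivity = {example.sensitivity():.2f}")
print(f"specificity = {example.specificity():.2f}")
print(f"LR+ = {example.positive_likelihood_ratio():.2f}")
print(f"LR- = {example.negative_likelihood_ratio():.3f}")
print(f"DOR = {example.diagnostic_odds_ratio():.1f}")
```

Every measure follows from the same four counts, so a study reporting any sufficient subset of them (together with the sample size) can in principle be converted back into a 2 × 2 table.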

Statistical analysis

We selected the diagnostic performance meta-analytic technique according to the algorithm proposed by Chappell, Raab and Wardlaw.[29] Ultimately, we computed a summary receiver operating characteristic (SROC) curve using a random-effects hierarchical SROC (HSROC) model controlling for the different cut-points reported (see Appendix 2).[30,31] We quantified heterogeneity by comparing the widths of 95% confidence intervals (CIs) and 95% prediction intervals (PIs).[32]

Implementation

We performed Bayesian computation for both the diagnostic performance and normative ion level meta-analyses using R (R Foundation, Vienna, Austria) and the Bayesian modeling language, Stan.[33] We ran four chains for 5000 iterations, discarded the first 2500 and used a thinning interval of five iterations. We assessed appropriate sampling of chains graphically by ensuring mixing on trace plots, and convergence by ensuring the Gelman-Rubin statistic was < 1.2.[34,35] We used non-informative prior distributions.
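The sampling settings and the convergence check can be sketched as follows. This is a minimal illustration in Python, not the R and Stan code used for the analysis: `retained_draws` assumes that thinning is applied to post-warm-up iterations only, `gelman_rubin` implements the classic (non-split) potential scale reduction factor, and the simulated chains are hypothetical.

```python
import random
from statistics import mean, variance


def retained_draws(chains: int, iterations: int, warmup: int, thin: int) -> int:
    """Number of posterior draws kept after discarding warm-up and thinning."""
    return chains * ((iterations - warmup) // thin)


def gelman_rubin(chains: list[list[float]]) -> float:
    """Classic Gelman-Rubin statistic for equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [mean(chain) for chain in chains]
    within = mean([variance(chain) for chain in chains])  # W: mean within-chain variance
    between = n * variance(chain_means)                   # B: between-chain variance
    pooled = (n - 1) / n * within + between / n           # pooled variance estimate
    return (pooled / within) ** 0.5


# Settings reported in the text: four chains, 5000 iterations, 2500 discarded, thinning of five.
print(retained_draws(chains=4, iterations=5000, warmup=2500, thin=5))  # 2000

# Hypothetical chains: four that sample the same distribution, and a set with one shifted chain.
random.seed(0)
well_mixed = [[random.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
stuck = [chain[:] for chain in well_mixed[:3]] + [[x + 3.0 for x in well_mixed[3]]]

print(gelman_rubin(well_mixed) < 1.2)  # chains agree, so the check passes
print(gelman_rubin(stuck) < 1.2)       # one chain sits elsewhere, so the check fails
```

Under these assumptions the reported settings leave 500 draws per chain, 2000 in total. The statistic approaches 1 when the between-chain variance is negligible relative to the within-chain variance.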

Results

Our literature search identified 575 unique references (Fig. 1). Six met the selection criteria defined a priori (see Appendix 1).[9,11,36-39] We contacted the authors of two studies[9,11] reporting logistic regression with a > 5 µg/L blood ion cut-point for additional data.
Fig. 1

Flow of studies through selection process.

These six studies included a total of 898 hips, of which 376 had an ARMD. The prevalence of ARMDs ranged from 29% to 69%, and the prevalence of symptoms ranged from 23.6% to 100% (Figs 2f and 2g, Appendices 3, 4, 5, 6). Studies differed in the blood fraction tested, ion measured and cut-point used. Only 50% of studies[36,38,39] used MARS MRI as the benchmark for diagnosis. Only one study[36] used blood ion levels in a diagnostic context (symptomatic patients) whereas the remaining four studies[9,11,37,39] used blood ion levels in a screening (undifferentiated patients) context. The three studies[9,11,37] not using MARS MRI were deemed at risk of bias (Table I, Fig. 2c, Appendix 3). Two studies[36,37] used plasma, rather than serum, for ion testing and were therefore deemed to have concerns regarding applicability (Table I, see also Appendix 3). No study described the time interval between ion testing and imaging, and therefore all were deemed to have concerns regarding applicability (Table I, see also Appendix 3). All studies were of either Level I or Level II quality.[40] Prior to proceeding with meta-analysis, we investigated whether clinical, methodological and quality variability manifested in heterogeneity in the estimates of diagnostic accuracy obtained from each study.
Fig. 2

Individual reports of sensitivity and specificity plotted in receiver operating characteristic space with (a) study, (b) ion threshold, (c) benchmark test, (d) index test characteristics, (e) number of hips, (f) adverse reactions to metal debris prevalence, and (g) prevalence of symptomatic patients highlighted. Three studies reported more than one estimate (a). Circle size is proportional to sample size in (e) and prevalence in (f) and (g).

Table I.

QUADAS-2 risk of bias table for included studies

Columns: Study; Risk of bias (Patient selection, Index test, Reference standard, Flow and timing); Applicability concerns (Patient selection, Index test, Reference standard)
Bosker et al 2012[9]???
Malek et al 2012[36]??????
Bisschop et al 2013[11]???
Chang et al 2013[37]??????
MacNair et al 2013[38]???
Van der Weegen et al 2014[39]???

☺ Low risk; ☹ High risk; ??? Unclear risk

Study-specific estimates of specificity and sensitivity all appear to lie close to a common smooth ROC curve (Fig. 2a); however, there appeared to be variability in the performance of specific cut-points across different studies (Fig. 2b). This suggests that the discriminative capacity of the ion test is homogeneous across studies but that there is substantial cut-point heterogeneity. The cut-point heterogeneity may be due to heterogeneity in the benchmark modality used: the studies reporting unexpectedly high specificity and low sensitivity at 5 µg/L and 7 µg/L cut-points did not exclusively use MARS MRI as the benchmark (Fig. 2c). Sample size, ARMD prevalence, prevalence of symptomatic patients, and ion test characteristics were not associated with cut-point heterogeneity (Figs 2d, 2e, 2f and 2g). Given the homogeneity in discrimination capacity but implicit cut-point heterogeneity, we pursued SROC meta-analysis without meta-analysing cut-points (see Appendix 2). Our best estimate of the “true” ROC curve for the metal ion test is the mean SROC curve plotted in Figure 3. However, due to random variability the “truth” may not be the same in all studies. Accounting for this random variability, we have 95% confidence that the study-specific “truth” will lie within the 95% prediction region. The prediction and credible regions have similar widths, which further supported minimal heterogeneity.
The area under the curve (AUC) for the SROC curve was 0.615 (95% credible interval 0.480 to 0.735); the probability that metal ion testing is actually clinically useful, with an AUC ≥ 0.75, is therefore 1.7% (see Appendix 2).[41]
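A statement such as "the probability that the AUC is ≥ 0.75" is a tail probability of the posterior distribution, which in a Bayesian analysis is typically estimated as the fraction of posterior draws above the threshold. The sketch below shows that calculation in Python. The function name `summarise_auc` is hypothetical, and the draws are a synthetic stand-in (a normal distribution on the logit scale) chosen only so the example runs. They are not the posterior from this meta-analysis and are not expected to reproduce the reported 1.7%.

```python
import math
import random
from statistics import mean, quantiles


def summarise_auc(draws: list[float], threshold: float = 0.75) -> dict[str, float]:
    """Posterior mean, 95% credible interval and P(AUC >= threshold) from draws."""
    cuts = quantiles(draws, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
    p_useful = sum(d >= threshold for d in draws) / len(draws)
    return {"mean": mean(draws), "lower": cuts[0], "upper": cuts[-1], "p_useful": p_useful}


# Synthetic draws (an assumption for illustration only, not the authors' posterior).
random.seed(1)
synthetic = [1 / (1 + math.exp(-random.gauss(0.47, 0.28))) for _ in range(2000)]

summary = summarise_auc(synthetic)
print(f"mean AUC = {summary['mean']:.3f}")
print(f"95% credible interval = {summary['lower']:.3f} to {summary['upper']:.3f}")
print(f"P(AUC >= 0.75) = {summary['p_useful']:.3f}")
```

The same function applied to the actual posterior draws of the AUC would yield the mean, credible interval and tail probability quoted in the text.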
Fig. 3

Summary receiver operating characteristic curve for meta-analysis. Mean curve (—), 95% credible region (---), and 95% prediction region (…) are shown. Sensitivity and specificity reported by individual studies (•).

Due to implicit cut-point heterogeneity, we did not perform meta-analysis of cut-point performance (see Appendix 2). Therefore, the SROC curve in Figure 3 does not relate cut-points to a particular specificity and sensitivity. However, diagnostic performance at any given cut-point will lie somewhere on the SROC curve in Figure 3 – we just do not know where. Hence, our meta-analysis can be used to evaluate the overall performance of ion tests, without reference to a particular cut point.

Discussion

This systematic review and meta-analysis is the first synthesis of evidence for the use of blood ion measurements as a test for ARMDs in patients with MoM hip implants. We identified minimal heterogeneity in the inherent discrimination capacity of ion tests used in each study. Our meta-analysis indicates that blood ion levels are a poor test for classifying patients as having or not having an ARMD. All but one study included in our review evaluated blood ion levels in a screening context. Estimates of diagnostic accuracy obtained from high prevalence/symptomatic samples can be biased upwards due to the spectrum effect.[22] The prevalence of ARMDs and symptomatic patients in the included studies spanned a wide range (29% to 69%), therefore, we could graphically evaluate for a spectrum effect. Since Figures 2f and 2g demonstrated that symptom prevalence was not associated with the operating point, we concluded that there was an absence of spectrum effect in our meta-analysis. Based on a mean AUC of 0.615, blood ion levels are a poor, and not clinically useful, test for classifying patients as having or not having an ARMD.[41-44] It has been suggested that a clinically useful test has an AUC ≥ 0.75.[45] Considering the reconstructive consequences of delayed diagnosis, a false negative result could harm patients.[15-17] With the availability of non-invasive tests which definitively determine the presence of an ARMD, we see no role for using blood ion levels to diagnose symptomatic patients.[18] Screening is the process of identifying high-risk patients in the general population. Since screen-positive patients will undergo further testing, screening tests need not be as accurate as diagnostic tests. 
Screening can use two different approaches: exclude patients with very low probability of disease from further testing by maximising the negative predictive value (NPV), or identify high-risk patients for further testing by maximising the positive predictive value (PPV).[46] The performance of ion testing using these two approaches is shown in Table II. Calculations were made using the SROC curve plotted in Figure 3 and using the mean prevalence of ARMDs in the studies included in this review (41%). Indeed, maximising NPV is burdensome because 99% of patients will test positive. Furthermore, test-positive patients have the same probability of disease as they did prior to undergoing the test. On the other hand, maximising PPV does not reassure test negative patients because they still have a 21% probability of having an ARMD. Test-positive patients are hardly “high risk” because the risk of an ARMD is marginally different from the pre-test probability (52% versus 41%). Aside from statistical concerns, screening for ARMDs is problematic on theoretical grounds. The World Health Organization recommends that screening only be performed if patients will be offered treatment.[47] We are unaware of any evidence supporting revision on asymptomatic patients with ARMDs and thus screening would serve no clinical purpose.
Table II.

Performance of ion levels in screening

| Approach | Sensitivity (%) | Specificity (%) | NPV (%) | PPV (%) | Prevalence of positive test result (%) |
| Maximise NPV | 99 | 1 | 99 | 41 | 99 |
| Maximise PPV | 81 | 48 | 79 | 52 | 64 |

PPV, positive predictive value; NPV, negative predictive value
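The predictive values discussed here follow from sensitivity, specificity and prevalence through Bayes' theorem. As a rough check, the Python sketch below recomputes the "maximise PPV" operating point from the rounded figures in the text and Table II (sensitivity 81%, specificity 48%, prevalence 41%). The function name `predictive_values` is hypothetical, and because the inputs are rounded percentages the results agree with the published values only to within about one percentage point.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> dict[str, float]:
    """PPV, NPV and the proportion testing positive, expressed as fractions of all patients."""
    tp = prevalence * sensitivity              # diseased, test positive
    fn = prevalence * (1 - sensitivity)        # diseased, test negative
    fp = (1 - prevalence) * (1 - specificity)  # not diseased, test positive
    tn = (1 - prevalence) * specificity        # not diseased, test negative
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "positive_rate": tp + fp,
    }


maximise_ppv = predictive_values(sensitivity=0.81, specificity=0.48, prevalence=0.41)
print(f"PPV = {maximise_ppv['ppv']:.3f}")
print(f"NPV = {maximise_ppv['npv']:.3f}")
print(f"Proportion testing positive = {maximise_ppv['positive_rate']:.3f}")
```

With a pre-test probability of 41%, a positive result at this operating point raises the probability of an ARMD only to roughly one in two, while a negative result still leaves roughly a one-in-five chance of disease.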

We have synthesised the totality of evidence for the diagnostic value of metal ion levels for ARMDs in patients with MoM hip implants. We conclude that blood ion levels have no role in the diagnostic algorithm for ARMDs. The probability that we have incorrectly calculated the AUC to be less than 0.75 is 1.7%. Given the strength and consistency of the findings of our meta-analysis, and the improbability that the results of our meta-analysis are incorrect, further study of metal ion testing for the diagnosis of ARMDs would be an inefficient use of research resources. A perceived limitation of our study may be that conclusions are based on a small number of studies, half of which did not use MARS MRI as the benchmark modality. We therefore carefully assessed for, and controlled for, heterogeneity. We used a powerful meta-analytic technique that allowed us to partition results into a “cut-point effect” and “accuracy effect” (see Appendix 2). The methodological heterogeneity only manifested in heterogeneity in the cut-point effect, and not in the accuracy effect. Due to heterogeneity, our meta-analysis cannot be used to determine a useful cut-point. However, this is a moot point because the accuracy of the test is so poor. It was remarkable that these methodologically heterogeneous studies formed a smooth ROC curve (Fig. 2). Therefore, there was substantial homogeneity among these studies in the accuracy effect. This homogeneity is further reflected in the fact that the prediction intervals and confidence intervals were nearly equivalent in width (Fig. 3). In other words, our results are tantamount to those from a single study with 898 hips and 376 ARMDs. We emphasise that our systematic review evaluated the use of blood metal ion levels for the diagnosis of ARMDs.
Our findings do not apply to the investigation of the systemic consequences of metal ion exposure which are believed to occur at levels > 60 µg/L.[48] Further research should be directed to determining how blood ion measurements should be used to investigate cobaltism.[49] We conclude that the available evidence does not support existing guidelines, which recommend the use of blood ion measurements for both screening and diagnosis of ARMD.
  37 in total

1.  Quantifying heterogeneity in a meta-analysis.

Authors:  Julian P T Higgins; Simon G Thompson
Journal:  Stat Med       Date:  2002-06-15       Impact factor: 2.373

2.  A practical guide to assigning levels of evidence.

Authors:  James G Wright
Journal:  J Bone Joint Surg Am       Date:  2007-05       Impact factor: 5.284

3.  Hip resurfacings revised for inflammatory pseudotumour have a poor outcome.

Authors:  G Grammatopoulos; G Grammatopolous; H Pandit; Y-M Kwon; R Gundle; P McLardy-Smith; D J Beard; D W Murray; H S Gill
Journal:  J Bone Joint Surg Br       Date:  2009-08

4.  Out of joint: the story of the ASR.

Authors:  Deborah Cohen
Journal:  BMJ       Date:  2011-05-13

5.  Risk stratification algorithm for management of patients with metal-on-metal hip arthroplasty: consensus statement of the American Association of Hip and Knee Surgeons, the American Academy of Orthopaedic Surgeons, and the Hip Society.

Authors:  Young-Min Kwon; Adolph V Lombardi; Joshua J Jacobs; Thomas K Fehring; Courtland G Lewis; Miguel E Cabanela
Journal:  J Bone Joint Surg Am       Date:  2014-01-01       Impact factor: 5.284

6.  The 2012 Otto Aufranc Award: The interpretation of metal ion levels in unilateral and bilateral hip resurfacing.

Authors:  Catherine Van Der Straeten; George Grammatopoulos; Harinderjit S Gill; Alessandro Calistri; Patricia Campbell; Koen A De Smet
Journal:  Clin Orthop Relat Res       Date:  2013-02       Impact factor: 4.176

7.  Relationship of plasma metal ions and clinical and imaging findings in patients with ASR XL metal-on-metal total hip replacements.

Authors:  Eric Y Chang; James L McAnally; James R Van Horne; James G Van Horne; Tanya Wolfson; Anthony Gamst; Christine B Chung
Journal:  J Bone Joint Surg Am       Date:  2013-11-20       Impact factor: 5.284

8.  Analysis of 118 second-generation metal-on-metal retrieved hip implants.

Authors:  H P Sieber; C B Rieker; P Köttig
Journal:  J Bone Joint Surg Br       Date:  1999-01

9.  Poor outcome of revised resurfacing hip arthroplasty.

Authors:  Richard N de Steiger; Lisa N Miller; Gareth H Prosser; Stephen E Graves; David C Davidson; Tyman E Stanford
Journal:  Acta Orthop       Date:  2010-02       Impact factor: 3.717

10.  Pseudotumours associated with metal-on-metal hip resurfacings.

Authors:  H Pandit; S Glyn-Jones; P McLardy-Smith; R Gundle; D Whitwell; C L M Gibbons; S Ostlere; N Athanasou; H S Gill; D W Murray
Journal:  J Bone Joint Surg Br       Date:  2008-07
  4 in total

Review 1.  Revision total hip arthroplasty for metal-on-metal failure.

Authors:  Justin S Chang; Fares S Haddad
Journal:  J Clin Orthop Trauma       Date:  2019-10-04

2.  What should I expect from my recalled Adept Hip Resurfacing?

Authors:  Matteo Cadossi; Silvio Terrando; Andrea Sambri; Giuseppe Tedesco; Antonio Mazzotti; Barbara Bordini; Dalila De Pasquale; Cesare Faldini
Journal:  Musculoskelet Surg       Date:  2017-04-27

3.  High blood metal ion levels in 19 of 22 patients with metal-on-metal hinge knee replacements.

Authors:  Minna Laitinen; Jyrki Nieminen; Aleksi Reito; Toni-Karri Pakarinen; Piia Suomalainen; Konsta Pamilo; Jyrki Parkkinen; Tonis Lont; Antti Eskelinen
Journal:  Acta Orthop       Date:  2017-01-26       Impact factor: 3.717

4.  Comparing 1-year and 10-year whole blood metal ion results following Birmingham hip resurfacing for osteoarthritis.

Authors:  Scott J Watt Kearns; Jonathan Bourget-Murray; Kelly Johnston; Jason Werle
Journal:  Can J Surg       Date:  2022-08-04       Impact factor: 2.840
