
Homogeneity in prediction of survival probabilities for subcategories of hip prosthesis data: the Nordic Arthroplasty Register Association, 2000-2013.

Christoffer Bartz-Johannessen¹, Ove Furnes^1,2, Anne Marie Fenstad¹, Stein Atle Lie^1,3, Alma Becic Pedersen^4,4, Søren Overgaard^4,5, Johan Kärrholm⁶, Henrik Malchau^6,7,8, Keijo Mäkelä^9,10, Antti Eskelinen^10,11, Jeremy M Wilkinson¹².

Abstract

Introduction: The four countries in the Nordic Arthroplasty Register Association (NARA) share geographic proximity, culture, and ethnicity. Pooling data from different sources in order to obtain higher precision and accuracy of survival-probability estimates is appealing. Nevertheless, survival probabilities of hip replacements vary between the countries. As such, risk prediction for individual patients within countries may be problematic if data are merged. In this study, our primary question was to address when data merging for estimating prosthesis survival in subcategories of patients is advantageous for survival prediction of individual patients, and at what sample sizes this may be advised.
Methods: Patients undergoing total hip replacements for osteoarthritis between January 1, 2000 and December 31, 2013 in the four Nordic countries were studied. A total of 184,507 patients were stratified into 360 patient subcategories based on country, age-group, sex, fixation, head size, and articulation. For each patient category, we determined the sample size needed from a single country to obtain a more accurate and precise estimate of prosthesis-survival probability at 5 and 10 years compared to an estimate using data from all countries. The comparison was done using mean-square error.
Results: We found large variations in the sample size needed, ranging from 40 to 2,060 hips, before an estimate from a single Nordic country was more accurate and precise than estimates based on the NARA data.
Conclusion: Using pooled survival-probability estimates for individual risk prediction may be imprecise if there is heterogeneity in the pooled data sources. By applying mean-square error, we demonstrate that for small sample sizes, applying the larger NARA database may provide a more accurate and precise estimate; however, this effect is not consistent and varies with the characteristics of the subcategory.


Keywords: accuracy; arthroplasty registry; hip replacement; merging data sets; precision; variance

Year: 2019 PMID: 31402836 PMCID: PMC6637139 DOI: 10.2147/CLEP.S199227

Source DB: PubMed Journal: Clin Epidemiol ISSN: 1179-1349 Impact factor: 4.790

Introduction

The Nordic Arthroplasty Register Association (NARA), comprising the national arthroplasty registers of Denmark, Finland, Sweden, and Norway, has developed a combined data set with a set of harmonized outcome definitions.1 The NARA data have successfully been used to predict outcomes and identify risk factors of hip and knee replacements at the population level.1,2 The four Nordic countries share geographic proximity in northern Europe. The ethnic origin in the countries is also similar, and they have similar welfare and health-service models.3 Still, within orthopedics, surgical practices, hospital surgery volume, training of surgeons, prostheses in use, threshold for revision, and completeness of reporting of revisions differ.1,2,4,5 These differences may explain the heterogeneity observed in survival estimates of total hip replacements (THRs) between the countries.1,2 Pooling data from different sources in order to increase sample size and obtain higher precision and accuracy of survival-probability estimates is appealing when calculating individual risk predictions, as in risk calculators.6 However, for this approach to be sensible, the different sources should have similar survival probabilities. If the survival probabilities differ, pooled estimates will not represent any of the original sources of data, and will thus be less accurate. As such, using pooled estimates for individual risk prediction may be imprecise if there is heterogeneity in the pooled data sources. Mean-square error (MSE) is a commonly used measure that accounts for both accuracy and precision when comparing estimates.7 In this study, our primary aim was to determine when data merging for estimating prosthesis survival in subcategories of patients is advantageous for survival prediction of individual patients, and at what sample sizes this may be advised.

Study populations

Patients with THRs from the NARA held within the common database between January 1, 2000 and December 31, 2013 were included in the study.1,2 For homogeneity of indication, only patients with osteoarthritis were included. To avoid outdated prostheses, only THR operations with frequently used contemporary cemented HR stems (Lubinus, Exeter, Charnley, MS30, CPT, Müller, and C-stem) and uncemented HR (Cone, SCP, Bimetric, Bicontact, Corail, Versys, AML, CLS, ABG, Filler, and Omnifit) brands were included.8,9 For both stems and cups, all implants used in <500 operations within a country were also removed from that country’s data set. Furthermore, all stems and cups with <95% survival probability at 10-year follow-up in any of the four countries were excluded to minimize heterogeneity due to poorly performing implants. These cutoffs were based on a UK National Institute for Health and Care Excellence guideline.10 For the Finnish data, separate results for stems and cups were not available. Therefore, stems and cups with overall survival (including all revision causes) <90% were excluded from the Finnish part of the data. Metal-on-metal articulation was considered noncontemporary and thus excluded.11,12 The difference between highly cross-linked polyethylene (XLP) and polyethylene has been identified in the NARA database.13 Radiation of 5 Mrad and more was classified as XLP. For patients operated on for more than one hip, only time to revision for the first registered primary operation was included. Based on the given criteria, there remained 38,042 Norwegian, 14,385 Finnish, 21,439 Danish, and 110,641 Swedish patients. Therefore, a total of 184,507 patients from the NARA data remained for analyses (Figure 1).

Figure 1

Flowchart for patients included in the study.

Abbreviations: NARA, Nordic Arthroplasty Register Association; OA, osteoarthritis.

The covariates available for the present analyses were: age (20–59 years, 60–74 years, 75 years and older), sex, prosthesis fixation (cemented, uncemented, hybrid, reversed hybrid), head size (<32 mm, 32 mm, >32 mm), and articulation (metal + XLP, ceramic + XLP, ceramic + ceramic, metal + polyethylene, ceramic + polyethylene). Table 1 summarizes the categories for the different variables. This categorization resulted in 360 combinations of the covariates, and thus 360 patient subcategories.
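The count of 360 is the product of the numbers of levels of the five covariates (3 × 2 × 4 × 3 × 5). As a minimal sketch, the subcategories can be enumerated as follows; the variable names and label strings are illustrative and not taken from the NARA data set.

```python
from itertools import product

# Covariate levels as listed in the text and in Table 1 (labels are illustrative).
AGE = ["20-59", "60-74", "75+"]
SEX = ["male", "female"]
FIXATION = ["cemented", "uncemented", "hybrid", "reversed hybrid"]
HEAD_SIZE = ["<32 mm", "32 mm", ">32 mm"]
ARTICULATION = ["M + XLP", "C + XLP", "C + C", "M + Poly", "C + Poly"]

# Every combination of one level per covariate defines one patient subcategory.
subcategories = list(product(AGE, SEX, FIXATION, HEAD_SIZE, ARTICULATION))
print(len(subcategories))  # 360
```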

Table 1

Different variables for total hip replacements

Age	Sex	Fixation	Head size	Articulation (head/cup)
20–59 years	Male	Cemented	<32 mm	M + XLP
60–74 years	Female	Uncemented	32 mm	C + XLP
Over 74 years		Hybrid	>32 mm	C + C
		Reversed hybrid		M + Poly
				C + Poly

Notes: For articulation, the first term gives the femoral head material, and the second the acetabulum bearing material.

Abbreviations: M, metal; C, ceramic; XLP, highly cross-linked polyethylene; Poly, conventional polyethylene.


Statistical methods

We aimed to quantify at what sample size for different patient subcategories a country’s own data can be considered sufficient in survival-probability calculations for that country versus using the NARA database. For each country, we compared 5- and 10-year survival-probability estimates based on the country’s own data and estimates based on the NARA data set. The procedure was equivalent for all four countries and for the 5- and 10-year survival-probability estimates, but is explained here only for Norwegian data and 5-year survival-probability estimates. Norwegian patient subcategories with >250 patients at risk at 5 years in both Norway and the NARA were included. A cutoff point of 250 patients at risk has also been chosen in other studies, such as Deere et al (2019).14 We chose one of these patient subcategories. The Kaplan–Meier survival-probability estimate at 5 years was calculated with the available Norwegian data in this subcategory and considered the correct Norwegian survival probability (S) for this patient subcategory. A small sample (starting at n=20) of random Norwegian patients was drawn from the patient subcategory and the Kaplan–Meier survival-probability estimate at 5 years calculated. We named this the Norwegian estimate of S, denoted $\hat{S}_{\mathrm{NOR}}$. Additionally, a survival-probability estimate based on the corresponding data from the other NARA countries for the patient subcategory, including the random sample from Norway, was calculated. We named this the NARA estimate of S, denoted $\hat{S}_{\mathrm{NARA}}$. The latter two steps were repeated 500 times in a bootstrap-like simulation to obtain 500 Norwegian and 500 NARA estimates of S. The MSE for the Norwegian estimate was then calculated:
$$\mathrm{MSE}(\hat{S}_{\mathrm{NOR}}) = \frac{1}{500}\sum_{i=1}^{500}\left(\hat{S}_{\mathrm{NOR},i} - S\right)^{2}$$
The MSE for NARA was calculated applying the same formula.
MSE is defined as the variance plus the square of the bias for an estimator, and hence takes into account both the accuracy (bias) and the precision (variance) of survival-probability estimates.7 MSE calculations were repeated, with sample sizes increasing by 20 at each step. The MSE for the Norwegian sample estimates will initially be large, due to low precision (large variation due to small sample). As the sample size increases, the MSE for the Norwegian estimates will eventually be lower than the MSE for NARA, since the accuracy is lower (the bias is larger) for the NARA estimates. At this point, the Norwegian estimate is preferable. The procedure described was repeated for all Norwegian patient subcategories with >250 patients at risk at 5 years. R version 3.4.1 was used for all analyses (www.r-project.org).
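The simulation described above can be sketched in a few functions. This is an illustrative Python sketch, not the authors' R code: the synthetic cohorts, the hazard values, the helper names (km_survival, make_cohort, compare_mse), and the choice to draw samples without replacement are all assumptions made for the example.

```python
import random


def km_survival(records, t):
    """Kaplan-Meier estimate of S(t) from (time, event) pairs; event=1 is a revision."""
    data = sorted(records)
    at_risk = len(data)
    surv = 1.0
    i = 0
    while i < len(data) and data[i][0] <= t:
        current = data[i][0]
        revisions = 0
        leaving = 0
        # Group all records tied at the same time point.
        while i < len(data) and data[i][0] == current:
            revisions += data[i][1]
            leaving += 1
            i += 1
        if revisions:
            surv *= 1.0 - revisions / at_risk
        at_risk -= leaving
    return surv


def mse(estimates, truth):
    """Mean squared deviation from the reference value (variance plus squared bias)."""
    return sum((e - truth) ** 2 for e in estimates) / len(estimates)


def make_cohort(n, hazard, rng, follow_up=10.0):
    """Hypothetical synthetic cohort: exponential revision times, censored at follow_up."""
    cohort = []
    for _ in range(n):
        time = rng.expovariate(hazard)
        cohort.append((time, 1) if time < follow_up else (follow_up, 0))
    return cohort


def compare_mse(own, others, n, t=5.0, reps=500, seed=0):
    """Return (MSE of own-country estimate, MSE of pooled estimate) at sample size n."""
    rng = random.Random(seed)
    truth = km_survival(own, t)  # full own-country estimate treated as the reference S
    own_estimates, pooled_estimates = [], []
    for _ in range(reps):
        sample = rng.sample(own, n)  # assumption: drawn without replacement
        own_estimates.append(km_survival(sample, t))
        pooled_estimates.append(km_survival(sample + others, t))
    return mse(own_estimates, truth), mse(pooled_estimates, truth)


rng = random.Random(1)
own = make_cohort(2000, 0.010, rng)     # assumed hazard for the "own country" cohort
others = make_cohort(5000, 0.012, rng)  # assumed, slightly higher hazard elsewhere
for n in (20, 200, 2000):
    own_mse, pooled_mse = compare_mse(own, others, n, reps=100)
    print(n, own_mse, pooled_mse)
```

In this sketch the own-country MSE is dominated by variance at small n, whereas the pooled MSE is dominated by the bias introduced by the other cohorts, which is the trade-off the procedure is designed to expose.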

Results

Figure 2 shows the difference between the MSE for Norway and the MSE for the NARA as a function of sample size for the subcategory “female, age 60–74, cemented, head size <32 mm, and metal + conventional polyethylene”. For this subcategory, the curve crosses zero at approximately 1,460 patients. This implies that for samples >1,460, the survival-probability estimate based on the Norwegian sample has smaller MSE than the estimate based on the NARA data, and is therefore superior with regard to precision and accuracy. This figure illustrates the principle for the MSE calculations performed. The same calculations were done for all patient subcategories with >250 patients at risk at 5 and 10 years, respectively. We observed relatively large variation in the sample size needed from a single country to outperform the estimates based on the complete NARA data. The number of patients needed before the Norwegian estimates became more precise and accurate compared to estimates based on the NARA data varied from 120 to 960 at 5 years and from 140 to 2,060 at 10 years. For Denmark, the corresponding numbers were 80 and 440 at 5 years. At 10 years, there was no patient category with sufficient observations. For Finland, the numbers were 100 and 400 at 5 years and 80 and 240 at 10 years. For Sweden, the numbers were 40 and 1,880 at 5 years and 80 and 110 at 10 years.
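Reading the crossing point off a curve such as the one in Figure 2 amounts to finding the sample size at which the MSE difference (single country minus NARA) first becomes negative. A minimal sketch follows; the function name first_crossing and the example values are hypothetical, and taking the first negative value (rather than, say, the point after which the curve stays negative) is an assumption.

```python
def first_crossing(sample_sizes, mse_differences):
    """Return the smallest sample size at which own MSE minus pooled MSE is negative."""
    for n, diff in zip(sample_sizes, mse_differences):
        if diff < 0:
            return n
    return None  # the single-country estimate never outperformed the pooled one


# Hypothetical values for illustration only; not taken from the study.
sizes = list(range(20, 121, 20))  # sample sizes increasing by 20 at each step
diffs = [4e-4, 2e-4, 9e-5, 1e-5, -2e-5, -6e-5]
print(first_crossing(sizes, diffs))  # 100
```

Because simulated MSE curves are noisy, a curve may cross zero more than once near the threshold, so a stricter rule may be preferable in practice.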

Figure 2

Norwegian MSE minus NARA MSE.

Notes: The patient category in this figure is “60–74 years old, female, cemented, head size <32 mm, M + Poly”. The x-axis shows the sample size as explained in the “Statistical methods” section. The y-axis shows the difference in MSE. The red horizontal line is drawn at zero in order to visualize where the difference in MSE crosses zero, and hence shows at what sample size the Norwegian estimate becomes preferable with respect to the MSE.

Abbreviations: MSE, mean-square error; NARA, Nordic Arthroplasty Register Association; M, metal; Poly, conventional polyethylene.


Discussion

In the present study, we compared survival-probability estimates based on single Nordic countries versus estimates based on the common database from the NARA to determine whether amalgamation of data increases the accuracy and precision of risk estimates. Using the MSE approach, we demonstrated that for small samples, applying the larger NARA database may provide a more precise and accurate estimate. This effect is, however, inconsistent and varies with the characteristics of the subcategory studied. Our approach assumes a “true” survival of a certain implant in a specific setting or in a regional hospital environment. Another important aim of the NARA initiative is that local factors, at least to a certain extent, should be “leveled out”, supposing that data in the compiled NARA database represent a more weighted assessment of a specific implant and a more global view. Further aims included studies of implants used in small numbers in individual countries or comparatively rare outcomes in specific groups of patients. At an early stage in the NARA process, we were also interested in local variations perhaps caused by differences in hospital organization, local traditions, and possible differences in patient demography, which are highlighted in this study.
There are many examples of successful merging of data to generate overall survival estimates.5,11,15–17 Several studies have described validation and generalization of individual risk-prediction algorithms.18–25 However, there is a difference between merging data to obtain precise estimates with narrow confidence intervals and merging data from several databases for accurate and precise risk prediction of single individuals (subcategories of patients). The approach applied in this article is based on a standard statistical principle and is used to determine the sample sizes at which merging of data can be advantageous. MSE is a standard tool in statistics for comparison of estimators,7 taking both precision and accuracy into account. We argue that it is also a suitable tool for the present application.
This study has several strengths. We adjusted for individual confounders to the extent that it was possible within the NARA data set by stratifying patients into subcategories according to the known covariates age, sex, fixation, head size, and articulation. Only patients with primary osteoarthritis and contemporary prostheses with good results were included, in order to reduce heterogeneity across the study populations.
Our study also has limitations. Variables not captured within the NARA data set, including medical comorbidities, differences in perioperative management, revision thresholds, completeness of reporting, and differences in choice of prosthesis subtypes and head sizes within the three head-size categories chosen for this study, also affect individual prediction of prosthesis survivorship. When considering merging of data sets to enhance analytical power in individual-patient risk-prediction tools, it is thus important to consider the extent to which such confounders may be accounted for within the applied data sets. Further, the simulations needed to calculate the MSE values are computationally demanding and may take several hours, depending on the equipment at hand.
In conclusion, using the MSE approach, we demonstrated that for small samples, applying the larger NARA database may provide a more accurate and precise estimate; however, this effect is inconsistent and varies with the characteristics of the subcategory studied.

21 in total

1. Duration of the increase in early postoperative mortality after elective hip and knee replacement.

Authors: Stein Atle Lie; Nicole Pratt; Philip Ryan; Lars B Engesaeter; Leif I Havelin; Ove Furnes; Stephen Graves
Journal: J Bone Joint Surg Am Date: 2010-01 Impact factor: 5.284

Review 2. Systematic assessment of decision-analytic models for chronic myeloid leukemia.

Authors: Ursula Rochau; Ruth Schwarzer; Beate Jahn; Gaby Sroczynski; Martina Kluibenschaedl; Dominik Wolf; Jerald Radich; Diana Brixner; Guenther Gastl; Uwe Siebert
Journal: Appl Health Econ Health Policy Date: 2014-04 Impact factor: 2.561

Review 3. Risk prediction models for mortality in ambulatory patients with heart failure: a systematic review.

Authors: Ana C Alba; Thomas Agoritsas; Milosz Jankowski; Delphine Courvoisier; Stephen D Walter; Gordon H Guyatt; Heather J Ross
Journal: Circ Heart Fail Date: 2013-07-25 Impact factor: 8.790

Review 4. A systematic review of patient-reported and economic outcomes: value to stakeholders in the decision-making process in patients with type 2 diabetes mellitus.

Authors: Ana Vieta; Xavier Badia; José A Sacristán
Journal: Clin Ther Date: 2011-09 Impact factor: 3.393

5. Independent clinical validation of a Canadian FRAX tool: fracture prediction and model calibration.

Authors: William D Leslie; Lisa M Lix; Helena Johansson; Anders Oden; Eugene McCloskey; John A Kanis
Journal: J Bone Miner Res Date: 2010-11 Impact factor: 6.741

Review 6. Computerised decision-support tools in diabetes care: hurdles to implementation.

Authors: Eldon D Lehmann
Journal: Diabetes Technol Ther Date: 2004-06 Impact factor: 6.118

Review 7. Performance of risk assessment instruments for predicting osteoporotic fracture risk: a systematic review.

Authors: S Nayak; D L Edwards; A A Saleh; S L Greenspan
Journal: Osteoporos Int Date: 2013-10-09 Impact factor: 4.507

8. Calibration of FRAX ® 3.1 to the Dutch population with data on the epidemiology of hip fractures.

Authors: A Lalmohamed; P M J Welsing; W F Lems; J W G Jacobs; J A Kanis; H Johansson; A De Boer; F De Vries
Journal: Osteoporos Int Date: 2011-11-26 Impact factor: 4.507

9. Inferior outcome after hip resurfacing arthroplasty than after conventional arthroplasty. Evidence from the Nordic Arthroplasty Register Association (NARA) database, 1995 to 2007.

Authors: Per-Erik Johanson; Anne Marie Fenstad; Ove Furnes; Göran Garellick; Leif I Havelin; Sören Overgaard; Alma B Pedersen; Johan Kärrholm
Journal: Acta Orthop Date: 2010-10 Impact factor: 3.717

10. The Nordic Arthroplasty Register Association: a unique collaboration between 3 national hip arthroplasty registries with 280,201 THRs.

Authors: Leif I Havelin; Anne M Fenstad; Roger Salomonsson; Frank Mehnert; Ove Furnes; Søren Overgaard; Alma B Pedersen; Peter Herberts; Johan Kärrholm; Göran Garellick
Journal: Acta Orthop Date: 2009-08 Impact factor: 3.717

1 in total

1. Similar early mortality risk after cemented compared with cementless total hip arthroplasty for primary osteoarthritis: data from 188,606 surgeries in the Nordic Arthroplasty Register Association database.

Authors: Alma B Pedersen; Aurélie Mailhac; Anne Garland; Søren Overgaard; Ove Furnes; Stein Atle Lie; Anne Marie Fenstad; Cecilia Rogmark; Johan Kärrholm; Ola Rolfson; Jaason Haapakoski; Antti Eskelinen; Keijo T Mäkelä; Nils P Hailer
Journal: Acta Orthop Date: 2020-11-04 Impact factor: 3.717
