Leif Friberg1, Alessandro Gasparini2, Juan Jesus Carrero3. 1. Karolinska Institutet, Department of Clinical Sciences at Danderyd Hospital, Stockholm, Sweden. 2. University of Leicester, Department of Health Sciences, Leicester, UK. 3. Karolinska Institutet, Department of Medical Epidemiology and Biostatistics, Stockholm, Sweden.
Abstract
BACKGROUND: Information about renal function is important for drug safety studies using administrative health databases. However, serum creatinine values are seldom available in these registries. Our aim was to develop and test a simple scheme for stratification of renal function without access to laboratory test results. METHODS: Our scheme uses registry data about diagnoses, contacts, dialysis and drug use. We validated the scheme in the Stockholm CREAtinine Measurements (SCREAM) project using information on approximately 1.1 million individuals residing in the Stockholm County who underwent calibrated creatinine testing during 2006-11, linked with data about health care contacts and filled drug prescriptions. Estimated glomerular filtration rate (eGFR) was calculated with the CKD-EPI formula and used as the gold standard for validation of the scheme. RESULTS: When the scheme classified patients as having eGFR <30 mL/min/1.73 m2, it was correct in 93.5% of cases. The specificity of the scheme was close to 100% in all age groups. The sensitivity was poor, ranging from 68.2% in the youngest age quartile, down to 10.7% in the oldest age quartile. Age-related decline in renal function makes a large proportion of elderly patients fall into the chronic kidney disease (CKD) range without receiving CKD diagnoses, as this often is seen as part of normal ageing. CONCLUSIONS: In the absence of renal function tests, our scheme may be of value for identifying patients with moderate and severe CKD on the basis of diagnostic and prescription data for use in studies of large healthcare databases.
Population-wide administrative health databases are frequently used for observational Post Authorization Safety Studies (PASS) of new drugs. Such studies have many advantages: they are quick and easy to perform; the large number of patients often makes it possible to study outcomes in small subgroups; few patients are lost to follow-up; and selection bias is seldom a problem when whole populations are included. It is even possible to study the consequences of off-label use and of drugs taken by patients with contraindications, which may happen in the real world even if it should not.
An important limitation of many of these registers is that they lack detailed information about renal function. Many drugs are excreted via the kidneys, and chronic kidney disease (CKD) may lead to drug accumulation and drug toxicity. A diagnostic code for CKD in one of these registers is often just a binary yes or no, which may cover almost any degree of renal impairment from a slightly elevated serum creatinine to end-stage renal failure.
Attempts have been made to stage CKD from International Classification of Diseases (ICD) codes and claims codes [1-5]. Validation studies of these schemes have been disappointing and have shown that administrative databases generally have insufficient sensitivity and positive predictive value (PPV) to allow for stratified analyses according to renal function.
However, registries hold more information than just diagnoses. Other information could be used as surrogate markers of the severity of renal disease, for example, whether phosphate binders and other drugs used in CKD have been dispensed, the duration of disease, the frequency of hospitalizations or contacts for CKD, and whether there was dialysis or surgery for vascular access.
The aim of this study was to develop and test a surrogate method to grade renal function for research purposes when laboratory test values are unavailable.
Materials and methods
We constructed a scheme for classification of renal function aiming at the stratification of patients according to the assumed estimated glomerular filtration rate (eGFR). The unit for eGFR is mL/min/1.73 m2. In the following, eGFR values are presented without unit for brevity.
The scheme aims to differentiate between the following eGFR strata: <30, 30–59, 60–89 and ≥90. The scheme is presented as a flow chart in Figure 1. The diagnostic and procedure codes used, with plain text translation, are listed in Table 1.
Fig. 1.
Scheme for classifying renal function without access to laboratory test results. *Renal-specific drugs: phosphate binders (ATC codes A12AA, V03AE02, V03AE03, V03AE04), active vitamin D (A11CC03, A11CC04), sodium bicarbonate (A02AH), erythropoiesis-stimulating agents (B03XA).
Table 1. Codes used for identification of patients with chronic renal disease
| Condition | Code type | Code beginning with |
| --- | --- | --- |
| CKD | ICD-10 | N18 |
| CKD Stage 5 (eGFR <15) | ICD-10 | N185 |
| CKD Stage 4 (eGFR 15–29) | ICD-10 | N184 |
| CKD Stage 3 (eGFR 30–59) | ICD-10 | N183 |
| CKD Stage 2 (eGFR 60–89) | ICD-10 | N182 |
| CKD Stage 1 (eGFR ≥90) | ICD-10 | N181 |
| Acute renal failure | ICD-10 | N17 |
| Unspecified renal failure | ICD-10 | N19 |
| Dependence on renal dialysis | ICD-10 | Z992 |
| Adjustment and management of vascular access device | ICD-10 | Z492 |
| Creation of arterio-venous fistula from artery in the upper limb | Procedure | PBL |
| Repair surgery of arterio-venous fistula in the upper limb | Procedure | PBU |
| Haemodialysis, chronic | Procedure | DR016 |
| Peritoneal dialysis, chronic | Procedure | DR024 |
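Because the codes in Table 1 and the drug classes in the Figure 1 footnote are defined by their leading characters, a registry record can be flagged with simple prefix matching. The following is a minimal sketch of that lookup, not the authors' implementation: the dictionary and function names are hypothetical, and choosing the longest matching prefix (so that N185 maps to Stage 5 rather than to the generic N18 entry) is an assumption about how overlapping prefixes should be resolved. It does not reproduce the decision flow of Figure 1.

```python
# Prefixes copied from Table 1 and the Figure 1 footnote.
# All names below are illustrative, not taken from the study's code.
ICD10_PREFIXES = {
    "N18": "CKD",
    "N185": "CKD Stage 5 (eGFR <15)",
    "N184": "CKD Stage 4 (eGFR 15-29)",
    "N183": "CKD Stage 3 (eGFR 30-59)",
    "N182": "CKD Stage 2 (eGFR 60-89)",
    "N181": "CKD Stage 1 (eGFR >=90)",
    "N17": "Acute renal failure",
    "N19": "Unspecified renal failure",
    "Z992": "Dependence on renal dialysis",
    "Z492": "Adjustment and management of vascular access device",
}

RENAL_DRUG_ATC_PREFIXES = {
    "A12AA": "phosphate binder",
    "V03AE02": "phosphate binder",
    "V03AE03": "phosphate binder",
    "V03AE04": "phosphate binder",
    "A11CC03": "active vitamin D",
    "A11CC04": "active vitamin D",
    "A02AH": "sodium bicarbonate",
    "B03XA": "erythropoiesis-stimulating agent",
}


def normalise(code):
    """Drop dots and spaces and upper-case, so 'n18.5' becomes 'N185'."""
    return code.replace(".", "").replace(" ", "").upper()


def match_prefix(code, table):
    """Return the label of the longest prefix in `table` that `code`
    starts with, or None if no prefix matches (assumed tie-break rule)."""
    code = normalise(code)
    hits = [prefix for prefix in table if code.startswith(prefix)]
    if not hits:
        return None
    return table[max(hits, key=len)]


print(match_prefix("N18.5", ICD10_PREFIXES))
print(match_prefix("B03XA01", RENAL_DRUG_ATC_PREFIXES))
```

The same `match_prefix` helper could be applied to the procedure codes in Table 1 by passing a third dictionary.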
The scheme was tested and validated in the Stockholm CREAtinine Measurements (SCREAM) database [6], a healthcare utilization cohort from the region of Stockholm, Sweden, where serum creatinine was measured in 1.3 million adults during 2006–11 in connection with a healthcare consultation in ambulatory or hospital care. Laboratory data were thereafter linked, via each citizen's personal identification number, to administrative records containing diagnostic codes (ICD-10 classification), therapeutic procedures (codes issued by the Nordic Medico-Statistical Committee, NOMESCO), validated renal endpoints (undergoing dialysis or renal transplantation) and pharmacy-filled claims [6].
The regional healthcare utilization register contains information on all ICD-10 diagnoses and therapeutic procedures issued in ambulatory or inpatient care in the region of Stockholm since the system was adopted in Sweden in 1997. The Swedish Dispensed Drug register stores records of all pharmacy-dispensed prescriptions in Sweden since 1 July 2005. All pharmacies in the country are required by law to participate, and information is transferred electronically whenever a prescribed drug is dispensed. The register does not contain information about prescriptions that were not dispensed, drugs used during hospital stays or over-the-counter drugs.
The study population considered for this analysis consisted of adult SCREAM individuals (≥18 years). We discarded all measurements taken in connection with a hospital stay (n = 2 415 743) because we assumed that a high proportion of these samples represented acute illness rather than a potential underlying chronic renal disease. We also discarded all measurements from non-residents of Stockholm County (n = 203 326), implausible serum creatinine concentrations (i.e. below 25 or above 1500 µmol/L; n = 1808), and measurements recorded after a renal transplantation (n = 50 843). Where there were several serum creatinine measurements on the same day, we took their median value. Index date was defined as the date of the most recent measurement, and we obtained 1 126 954 individuals with 5 352 191 measurements eligible for the study.
The 2009 CKD-EPI creatinine-based equation [7] was used for calculation of eGFR. We estimated renal function at index date, averaging two eGFR values 3–12 months apart (if available). By law, race is not registered in Sweden, and therefore all patients were assumed to be Caucasian; nonetheless, the Swedish population is relatively homogenous and dominated by Caucasians (91.3% born in Europe, according to Statistics Sweden: http://www.statistikdatabasen.scb.se). CKD stages were categorized as eGFR <30, 30–59, 60–89 or ≥90. Patients undergoing dialysis (hemodialysis or peritoneal dialysis) have varying eGFR values related to the time since the previous dialysis. Such eGFR values are not representative of kidney function and were, therefore, replaced by a random value between 0 and 15 mL/min/1.73 m2.
For each patient, scheme-based classifications of renal function were made based on the presence or absence of diagnoses, contacts, drugs or procedures according to the scheme. This classification was then compared with the classification obtained through measured creatinine values, which was used as the gold standard.
We computed accuracy and Cohen's Kappa statistics comparing the predicted CKD categories with the observed ones, and also performed McNemar's test of agreement. Furthermore, we computed sensitivity, specificity, PPV and negative predictive value (NPV) in discriminating each CKD category against the remaining ones pooled together.
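The gold-standard eGFR calculation and staging described in this section can be sketched as follows. This is an illustrative Python version, not the authors' R code. It uses the published 2009 CKD-EPI creatinine equation without the race coefficient (matching the assumption above that all patients were Caucasian) and the usual conversion of 88.4 µmol/L per mg/dL. The function names are hypothetical, and the handling of dialysis patients and the averaging of two values are left out.

```python
def is_plausible(scr_umol_l):
    """Plausibility window used in the study: 25 to 1500 umol/L."""
    return 25 <= scr_umol_l <= 1500


def ckd_epi_2009(scr_umol_l, age_years, female):
    """eGFR (mL/min/1.73 m2) from the 2009 CKD-EPI creatinine equation,
    omitting the race coefficient."""
    scr_mg_dl = scr_umol_l / 88.4          # umol/L -> mg/dL
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    if female:
        egfr *= 1.018
    return egfr


def egfr_stratum(egfr):
    """Map an eGFR value onto the four strata used in this study."""
    if egfr < 30:
        return "<30"
    if egfr < 60:
        return "30-59"
    if egfr < 90:
        return "60-89"
    return ">=90"


# Example: an 80-year-old woman with serum creatinine 300 umol/L.
value = ckd_epi_2009(300, 80, female=True)
print(round(value, 1), egfr_stratum(value))
```

A complete pipeline would first drop implausible measurements with `is_plausible`, take the same-day median, and then apply `ckd_epi_2009` to the value at the index date.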
All analyses were performed using R (R Foundation for Statistical Computing, Vienna, Austria).
The regional Ethical Review Board and the Swedish National Board of Health and Welfare approved the study for use of de-identified data. The study conforms to the Declaration of Helsinki.
Results
Renal function in the validation cohort
The study population consisted of 1 126 954 individuals, who contributed a total of more than 5.2 million creatinine measurements. The median age was 52.8 years [interquartile interval (IQI) 37.7–67.5] and 54.2% were female. Median eGFR was 94.4 (IQI 80.5–107.8). More than 92% had normal or near-normal eGFR ≥60 (n = 1 038 461, Figure 2), 6.9% had eGFR 30–59 (n = 77 286) and 1.0% had eGFR <30 (n = 11 207). Mean and median eGFR values declined with age, from ∼120 at the age of 30 years to ∼60 at the age of 90 years (Figure 3). The proportion of patients with a registered CKD diagnosis increased with the severity of CKD: only 11.8% of patients with eGFR <60, but 52.5% of patients with eGFR <30, had a CKD diagnosis in the register. Among patients on dialysis, 99.1% also had a registry diagnosis of renal failure. CKD was widely underreported in the elderly population.
Fig. 2.
Distribution of eGFR values among 1.1 million patients in the SCREAM cohort, Stockholm County, Sweden.
Fig. 3.
Creatinine-based eGFR values in relation to age among 1.1 million inhabitants in Stockholm County, Sweden.
Scheme validation
The correspondence between the scheme-derived stratification and the eGFR-based stratification, used as reference, is presented in Table 2. The full four-graded scheme (Scheme A) created groups with mean eGFR values in the targeted intervals, but the distribution was wide in all groups (Figure 4, left panel). The overall accuracy was only 59.4% [95% confidence interval (CI) 59.3–59.4%]. Cohen’s Kappa statistic was low, 0.02, and McNemar’s test was significant indicating lack of agreement between predicted and observed values. Nonetheless, the scheme identified patients with eGFR <30 with a PPV of 93.5% and NPV of 99.2% (Figure 5).
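Table 2. Scheme-predicted classification compared with classification according to the gold standard
Rows: scheme-based classification of eGFR, mL/min/1.73 m2. Columns: gold standard eGFR reference, mL/min/1.73 m2.
| Scheme-based classification | Gold standard <30 | Gold standard 30–59 | Gold standard 60–89 | Gold standard ≥90 |
| --- | --- | --- | --- | --- |
| <30 | 1832 | 53 | 44 | 30 |
| 30–59 | 2552 | 2011 | 284 | 50 |
| 60–89 | 1411 | 2470 | 905 | 426 |
| ≥90 | 5412 | 72 752 | 372 586 | 664 136 |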
Fig. 4.
Correspondence between the SCREAM-scheme classification of CKD stages versus creatinine-based eGFR values.
Fig. 5.
Scheme performance in relation to age.
The four-graded scheme was clearly incapable of differentiating between eGFR 60–89 and eGFR ≥90, largely due to the infrequent use of CKD diagnoses among elderly patients, who typically have eGFR in the 60–89 range. The attempt to differentiate between these strata was abandoned, and all patients with eGFR ≥60 were combined into one group, thus making the scheme three-graded (Scheme B). This modification resulted in a marked improvement in the overall accuracy (92.5%, CI 92.5–92.6%), mostly driven by higher PPV (92.7%) and NPV (94.0%) in the normal/mildly reduced renal function group with assumed eGFR ≥60. The specificity and the NPV were close to 100% for scheme-classified eGFR <30 and eGFR 30–59, while the sensitivity was very poor. Cohen's Kappa statistic improved with the simplification of the scheme, up to 0.12, but McNemar's test still indicated lack of agreement between predicted and observed values.
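The headline figures for Scheme A and Scheme B can be recomputed directly from the counts in Table 2. The sketch below is an illustration in Python rather than the authors' R analysis, and the function names are hypothetical. It treats rows as the scheme-based class and columns as the gold standard, computes overall accuracy and one-vs-rest sensitivity, specificity, PPV and NPV, and merges the two upper strata to obtain the three-graded scheme.

```python
# Counts from Table 2: rows = scheme-based class, columns = gold standard.
# Class order: <30, 30-59, 60-89, >=90.
TABLE_2 = [
    [1832, 53, 44, 30],
    [2552, 2011, 284, 50],
    [1411, 2470, 905, 426],
    [5412, 72752, 372586, 664136],
]


def accuracy(m):
    """Share of patients on the diagonal (scheme agrees with gold standard)."""
    total = sum(sum(row) for row in m)
    return sum(m[i][i] for i in range(len(m))) / total


def one_vs_rest(m, k):
    """Metrics for class k against all other classes pooled together."""
    total = sum(sum(row) for row in m)
    tp = m[k][k]
    fp = sum(m[k]) - tp                    # scheme says k, gold says other
    fn = sum(row[k] for row in m) - tp     # gold says k, scheme says other
    tn = total - tp - fp - fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def collapse(m, groups):
    """Merge classes, e.g. groups=[[0], [1], [2, 3]] pools 60-89 with >=90."""
    return [[sum(m[i][j] for i in gi for j in gj) for gj in groups]
            for gi in groups]


scheme_b = collapse(TABLE_2, [[0], [1], [2, 3]])
print(f"Scheme A accuracy: {accuracy(TABLE_2):.3f}")
print(f"Scheme A, eGFR <30: {one_vs_rest(TABLE_2, 0)}")
print(f"Scheme B accuracy: {accuracy(scheme_b):.3f}")
print(f"Scheme B, eGFR >=60: {one_vs_rest(scheme_b, 2)}")
```

Run on Table 2, this gives an accuracy of about 59.4% for Scheme A and 92.5% for Scheme B, and a PPV of about 93.5% for scheme-classified eGFR <30, in line with the values reported in the text.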
Stratification by age
Given the age-related decline in renal function, and the infrequent use of codes for CKD among the elderly, we proceeded with stratification according to age quartiles. This showed that the sensitivity for advanced CKD was higher in younger age groups: 68.2% in the lowest quartile (below 38 years) compared with 10.7% in the highest quartile (older than 68 years) (Figure 5).
When the scheme classified patients as eGFR <30, it was correct in 87.8% of patients in the lowest age group, rising to 95.4% in the highest age group. Conversely, when the scheme classified a patient as having normal or near-normal renal function, this was true for almost all patients below 68 years, and for about three-quarters of the patients older than 68 years.
Accordingly, the accuracy was well above 90% in all but the highest age quartile (99.9, 99.5, 97.3 and 73.3%, respectively). Cohen's Kappa was 0.539, 0.422, 0.262 and 0.074 in the respective age quartiles.
Discussion
We have shown that it is possible to achieve a crude grading of renal function without access to laboratory test values. Our scheme was, however, not able to discriminate between normal renal function and age-related decline of renal function or CKD in early stages. As shown here, and in other international registers [8-11], ICD diagnostic codes for CKD are markedly underutilized in healthcare, which emphasizes the importance of estimating CKD on the basis of laboratory values. However, most general administrative registers and claims databases lack information about renal function, which is important, for instance, for studies on pharmacovigilance or drug safety in real-life settings.
Because of this underutilization of diagnostic codes for CKD among the elderly, many individuals with advanced CKD are not identified by our scheme (low sensitivity). The age-related decline of renal function may be seen as a part of the aging process, in the same way as the reduction of pulmonary function, reduction of physical strength, atherosclerotic changes, etc.
The stratification into age quartiles showed that the sensitivity and performance of the scheme were better among younger than among older patients, which is consistent with the underutilization of diagnoses described above. The study population was relatively young (median age 52.8 years), and the cut-off age for the highest quartile was only 68 years. Many patients in pharmacovigilance studies are well past that age. A scheme that identifies two-thirds of patients with eGFR <30 in a population below 38 years may not be very useful in a retired population, where it can identify only 1 out of 10.
However, the high PPV of the scheme makes it possible to identify a subgroup with a 93.5% probability of having an eGFR <30. For studies where it is more important to identify a group with eGFR <30 with a high degree of certainty than to identify all patients with poor renal function, this scheme may be useful.
Access to actual eGFR values would of course be better, but if this information is not available and the advantages of Big Data make it desirable to proceed without eGFR values, this scheme may be of value as it offers more information than a simple yes/no to a previous diagnosis of renal disease.
The under-reporting of CKD by diagnostic codes that we observed has been reported previously by a number of study groups [1-5] who have tried to identify CKD patients by means of diagnostic or claims codes. All these studies reported much lower sensitivity than specificity, just as we do in our study.
What is new with our study is that it uses information other than diagnostic codes, and that it stratifies renal function, instead of simply categorizing patients in a binary way as having CKD or not. Moreover, this study, with over 1 million patients and over 5 million creatinine measurements, is, as far as we know, by far the largest study published in this field.
Despite the scale of the study, it is not certain that the scheme would perform well in another database or in another country where healthcare registration is organized in a different way. Although different versions of the ICD coding system have been in use in different parts of the world, it is generally not difficult to translate from one version to another. Translation of codes for diagnostic and surgical procedures may be a greater challenge since there is no universally accepted list for these. The exact meanings of the NOMESCO codes used in our scheme are listed in Table 1 in order to facilitate adoption in other countries. We think it is desirable that the scheme is evaluated in other settings, with openness to modifications/adaptations to those contexts.
Conclusion
The likelihood that patients identified by our scheme as having poor renal function truly have poor renal function is high. The scheme may, therefore, be useful for pharmacovigilance studies using administrative registries lacking information about creatinine values. The sensitivity for detection of CKD, especially in the elderly, is poor.
Funding
We acknowledge grant support from Stockholm County Council and from the Swedish Heart and Lung Foundation.
Conflict of interest statement
None of the authors has any conflicts of interest related to the contents of the present study, which is of a purely methodological nature. Outside of the present work L.F. has conducted pharmacovigilance studies in registries as a consultant to Bayer, Bristol-Myers-Squibb, Pfizer and Sanofi.
Authors: Paul E Ronksley; Marcello Tonelli; Hude Quan; Braden J Manns; Matthew T James; Fiona M Clement; Susan Samuel; Robert R Quinn; Pietro Ravani; Sony S Brar; Brenda R Hemmelgarn Journal: Nephrol Dial Transplant Date: 2011-10-19 Impact factor: 5.992
Authors: Morgan E Grams; Laura C Plantinga; Elizabeth Hedgeman; Rajiv Saran; Gary L Myers; Desmond E Williams; Neil R Powe Journal: Am J Kidney Dis Date: 2010-08-06 Impact factor: 8.860
Authors: Morgan E Grams; Casey M Rebholz; Blaithin McMahon; Seamus Whelton; Shoshana H Ballew; Elizabeth Selvin; Lisa Wruck; Josef Coresh Journal: Am J Kidney Dis Date: 2014-04-13 Impact factor: 8.860
Authors: Paul Muntner; Orlando M Gutiérrez; Hong Zhao; Caroline S Fox; Nicole C Wright; Jeffrey R Curtis; William McClellan; Henry Wang; Meredith Kilgore; David G Warnock; C Barrett Bowling Journal: Am J Kidney Dis Date: 2014-09-19 Impact factor: 8.860
Authors: Chaoyang Li; Xiao-Jun Wen; Meda E Pavkov; Guixiang Zhao; Lina S Balluz; Earl S Ford; Desmond Williams; Carol A Gotway Journal: Am J Nephrol Date: 2014-04-12 Impact factor: 3.754
Authors: Andrew S Levey; Lesley A Stevens; Christopher H Schmid; Yaping Lucy Zhang; Alejandro F Castro; Harold I Feldman; John W Kusek; Paul Eggers; Frederick Van Lente; Tom Greene; Josef Coresh Journal: Ann Intern Med Date: 2009-05-05 Impact factor: 25.391
Authors: Lynn M Robertson; Lucas Denadai; Corri Black; Nicholas Fluck; Gordon Prescott; William Simpson; Katie Wilde; Angharad Marks Journal: Health Informatics J Date: 2014-12-31 Impact factor: 2.681
Authors: Jamie L Fleet; Stephanie N Dixon; Salimah Z Shariff; Robert R Quinn; Danielle M Nash; Ziv Harel; Amit X Garg Journal: BMC Nephrol Date: 2013-04-05 Impact factor: 2.388
Authors: Björn Runesson; Alessandro Gasparini; Abdul Rashid Qureshi; Olof Norin; Marie Evans; Peter Barany; Björn Wettermark; Carl Gustaf Elinder; Juan Jesús Carrero Journal: Clin Kidney J Date: 2015-11-14
Authors: Louise Roy; Michael Zappitelli; Brian White-Guay; Jean-Philippe Lafrance; Marc Dorais; Sylvie Perreault Journal: Can J Kidney Health Dis Date: 2020-10-10