
A systematic review of statistical methodology used to evaluate progression of chronic kidney disease using electronic healthcare records.

Faye Cleary1, David Prieto-Merino1, Dorothea Nitsch1.   

Abstract

BACKGROUND: Electronic healthcare records (EHRs) are a useful resource to study chronic kidney disease (CKD) progression prior to starting dialysis, but pose methodological challenges as kidney function tests are not done on everybody, nor are tests evenly spaced. We sought to review previous research of CKD progression using renal function tests in EHRs, investigating methodology used and investigators' recognition of data quality issues.
METHODS AND FINDINGS: We searched for studies investigating CKD progression using EHRs in 4 databases (Medline, Embase, Global Health and Web of Science) available as of August 2021. Of 80 articles eligible for review, 59 (74%) were published in the last 5.5 years, mostly using EHRs from the UK, USA and East Asian countries. 33 articles (41%) studied rates of change in eGFR, 23 (29%) studied changes in eGFR from baseline and 15 (19%) studied progression to binary eGFR thresholds. Sample completeness data were available in 44 studies (55%), with analysis populations including less than 75% of the target population in 26 studies (33%). Losses to follow-up went unreported in 62 studies (78%) and 11 studies (14%) defined their cohort based on complete data during follow up. Methods capable of handling data quality issues and other methodological challenges were used in a minority of studies.
CONCLUSIONS: Studies based on renal function tests in EHRs may have overstated reliability of findings in the presence of informative missingness. Future renal research requires more explicit statements of data completeness and consideration of i) selection bias and representativeness of sample to the intended target population, ii) ascertainment bias where follow-up depends on risk, and iii) the impact of competing mortality. We recommend that renal progression studies should use statistical methods that take into account variability in renal function, informative censoring and population heterogeneity as appropriate to the study question.

Year:  2022        PMID: 35905096      PMCID: PMC9337679          DOI: 10.1371/journal.pone.0264167

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.752


Introduction

Chronic kidney disease (CKD) is a growing public health problem [1, 2]. Risks associated with CKD include cardiovascular morbidity, death, and in rare cases progression to end-stage renal disease (ESRD) requiring renal replacement therapy (RRT) [3]. Severity of disease, mechanism of renal damage and rate of progression of disease vary between patients, and the disease may change course over time in response to changing risk factors [4, 5]. While a minority of patients progress to ESRD, the cost of RRT presents a substantial economic burden to public health services and is likely to increase further over the coming years as prevalence of RRT rises alongside population growth and an ageing population [6, 7].

Increasing adoption of electronic healthcare records (EHRs) offers an opportunity to study progression of kidney disease in real-world care, which may enable improved decision-making in clinical practice. Whilst EHRs promise large sample sizes for analysis, constraints on the availability of renal function test results may complicate reliable evaluation. Frequency of monitoring of renal function is likely to vary in routine care according to differing individual patient risk profiles, local healthcare policy, physician-related factors, area of management within the healthcare system, social factors, or temporary illness. This may lead to some members of the target population being less likely to be followed up for renal function, potentially leading to selection and ascertainment biases in the study of CKD progression that may result in unreliable conclusions.

There are other methodological challenges in evaluation of CKD progression that are not specific to EHRs that should be considered by researchers. Deterioration in renal function over time is most commonly detected through changes of the estimated glomerular filtration rate (eGFR), usually derived from serum creatinine, sex, age, and ethnicity. Such creatinine-based GFR-estimating equations are imprecise, particularly at high levels of eGFR [8, 9]. Major changes in renal function in the context of acute illness are a sign of acute kidney injury (AKI). Although AKI is at least partially reversible in surviving patients, a history of AKI may accelerate subsequent loss in renal function. However, when researchers study eGFR decline over time, they often use statistical models that ignore the impact of acute drops in renal function on the subsequent trajectory. Population heterogeneity (caused by variation in risk factors both at baseline and evolving over time) may complicate analyses that assume a common mean linear trajectory of renal function loss over time; if this assumption is violated, it may be necessary to use more sophisticated methods that take this variability into account. Unmeasured confounding may also present issues, particularly if important confounders are not considered in the analysis. Competing events such as initiation of RRT or death complicate evaluation of progression outcomes.

A previous systematic review by Boucquemont et al. in 2014 [10] reviewed statistical methods used to identify risk factors for progression of CKD, covering research on cohort studies published between 2002 and 2012. They summarised the most commonly used outcome measures and statistical models, critiquing handling of bias due to informative censoring, competing risks, correlation due to repeated measures, and non-normality of response, and proposed recommendations for best practice statistical methods and software packages.

We performed a systematic review of all longitudinal analyses of renal function tests investigating the nature, burden or consequences of CKD progression using EHRs. We aimed to establish how data issues inherent to EHRs and methodological challenges were handled, how CKD progression was defined, what statistical methods were used and whether data issues were acknowledged in the context of reliability of study conclusions.
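To make concrete how a creatinine-based eGFR is derived from serum creatinine, sex, age and ethnicity, the sketch below implements the 2009 CKD-EPI creatinine equation as it is commonly published. The function name and argument names are illustrative assumptions, the coefficients are quoted from that published equation rather than from this review, and they should be checked against the original publication before any real use.

```python
def ckd_epi_2009(scr_mg_dl: float, age_years: float, female: bool, black: bool = False) -> float:
    """Estimated GFR (ml/min/1.73m2) from serum creatinine in mg/dL.

    Sketch of the 2009 CKD-EPI creatinine equation; coefficients are
    assumed from the published equation, not taken from this review.
    """
    if scr_mg_dl <= 0 or age_years <= 0:
        raise ValueError("creatinine and age must be positive")
    kappa = 0.7 if female else 0.9          # sex-specific creatinine knot
    alpha = -0.329 if female else -0.411    # exponent below the knot
    ratio = scr_mg_dl / kappa
    egfr = (
        141.0
        * (min(ratio, 1.0) ** alpha)
        * (max(ratio, 1.0) ** (-1.209))
        * (0.993 ** age_years)
    )
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


# Hypothetical patient: higher creatinine gives a lower estimate.
low_scr = ckd_epi_2009(0.9, 50, female=False)
high_scr = ckd_epi_2009(1.8, 50, female=False)
print(round(low_scr, 1), round(high_scr, 1))
```

Because the estimate is a deterministic transform of a single noisy laboratory value, any measurement variability in creatinine carries straight through to eGFR.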

Materials and methods

Protocol and registration

There is no published protocol available for this systematic review. Prior to completion of data extraction, this review was registered in the PROSPERO international prospective register of systematic reviews (registration number CRD42020182587).

Eligibility criteria

This is a review of statistical methodology covering all research studying the nature, burden or consequences of CKD progression using EHRs. Our intention was to focus on how researchers used renal function tests to study CKD progression. Initiation of dialysis is already a well-established clinically important outcome and as this was not the subject of the review, we excluded dialysis endpoints (as a measure of CKD progression) from review. Populations that had already initiated RRT at baseline or that were sampled on the basis of RRT initiation were excluded from review, since such populations are not appropriate for studying progression of CKD. (This criterion does not exclude patients that initiated RRT during follow-up.) Measures of CKD progression may constitute either exposures or outcomes of analysis. PICOS criteria are listed in the table below. There are no restrictions on sample size, population location or date of publication. Only studies reported in English language are included.

Searches

We performed electronic searches of MEDLINE, EMBASE, Global Health and Web of Science databases through to 11th August 2021. A copy of the search strategy is provided in the supplementary materials S2 File.

Study selection

This study had one lead reviewer and two supporting reviewers. The lead reviewer was responsible for screening all articles for eligibility, which involved scrutiny of abstracts followed by full-text review. The two supporting reviewers independently screened a sample of 50 articles each for eligibility. Consistency of agreement and reasons for disagreement were discussed. Clarity of inclusion/exclusion criteria was updated following discussion and prior to completion of eligibility review by the lead reviewer.

Data collection process

The lead reviewer was responsible for data extraction for all eligible research articles. In addition, key items that were the subject of this review were validated by supporting reviewers who independently extracted the following items for all articles: (1) measure of change in renal function; (2) statistical methods used in analysis of changes in renal function; and (3) definitions of progression of CKD, if any. The lead reviewer developed a data extraction form in an Excel spreadsheet, which was reviewed and approved by supporting reviewers in the initial stages of data extraction.

Data items

Information extracted from eligible research articles included details of the study population, study methodology and how data quality issues and other methodological issues were handled. Extracted items are listed below.

Study population

Data collection timeframe; Country of residence; Mean age; Percent male; Primary morbidity under study / reason for inclusion; Data source / healthcare setting

Study methodology

Date of publication; Study design; Research aims; Sample size (before and after exclusions for reasons of data completeness [for details, see below explanation of data completeness inclusion criteria and calculations of percentage of target population analysed]); Measure of renal function; Measure of change in renal function over time; Definition of progression (if any); Whether change in renal function was exposure or outcome; Duration of follow up for changes in renal function; Data completeness inclusion criteria and the minimum number of renal function tests required for analysis; Statistical tools used; Statistical model used.

Some additional results were derived to quantify data completeness for analysis, including the percentage of the target population that were analysed after application of data completeness inclusion criteria and the percentage of patients that dropped out of analysis during the intended follow up period having met criteria for inclusion in analysis. Here, “data completeness inclusion criteria” refer to the study-specific inclusion criteria applied prior to main analyses being performed that aimed to retain only those patients with sufficient data completeness to be deemed suitable for analysis, with such criteria expected to vary between studies.

Percentage of target population analysed was defined as:

Percentage of target population analysed = 100 × (number of patients included in analysis) / (number of patients that met population criteria before data completeness exclusion criteria were applied)

This was computable in some but not all studies, as it requires data on the total number of patients included in analysis as well as the number of patients that met population criteria before data completeness exclusion criteria were applied. (In propensity score matched cohort studies, propensity score matching criteria are included in population criteria, and we only compute percentage of target population analysed in the propensity score matched cohort, where this is possible.)

Percentage of study population lost to follow up was defined as:

Percentage of study population lost to follow up = 100 × (number of analysed patients that dropped out during the intended follow up period) / (number of patients analysed)

Again, this was computable in some but not all studies, as it requires data on the number of patients analysed and the number of those patients that dropped out during the intended follow up period, for example due to death, initiation of RRT or other lack of follow up in routine care, which could be for many different reasons.
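As a small worked illustration of the two data completeness measures described above, the sketch below reads each as a simple ratio of counts: patients analysed over patients meeting population criteria, and patients dropping out over patients analysed. The function names and the counts are hypothetical and are not taken from any reviewed study.

```python
def percent_of_target_analysed(n_analysed: int, n_target: int) -> float:
    """Share of the target population retained after data completeness criteria."""
    if n_target <= 0:
        raise ValueError("target population must be positive")
    return 100.0 * n_analysed / n_target


def percent_lost_to_follow_up(n_dropped_out: int, n_analysed: int) -> float:
    """Share of analysed patients who dropped out during intended follow up."""
    if n_analysed <= 0:
        raise ValueError("analysed population must be positive")
    return 100.0 * n_dropped_out / n_analysed


# Hypothetical study: 2,000 patients meet population criteria, 1,200 have
# enough renal function tests to be analysed, and 300 of those drop out.
print(percent_of_target_analysed(1200, 2000))  # 60.0
print(percent_lost_to_follow_up(300, 1200))    # 25.0
```

Both measures need the denominators to be reported, which is why they could not be computed for every study.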

Handling of data quality issues and other methodological challenges

Of the items below, details extracted included whether items were mentioned, whether information was provided on data completeness [if relevant], whether implications were acknowledged, whether challenges were tackled methodologically and any statistical methods used to attempt to overcome challenges: Handling of sample completeness / representativeness of the target population; Handling of informative drop-outs/censoring; Handling of missing longitudinal data; Handling of missing covariate data; Distributional checks/issues; Handling of within-patient correlation and variability of kidney function over time; Handling of population heterogeneity; Handling of confounding.

Risk of bias in individual studies

Assessment of bias in individual studies was one of the main aims of this systematic review. Key measures of bias evaluated in individual studies were the percentage of the sample target population that were analysed and the percentage of the analysed study population that were lost to follow up. Study-specific measures were reported and bar charts were produced for these measures to demonstrate the potential for bias in individual studies due to informatively missing data.

Synthesis of results

This review was descriptive with simple aggregation of collected data items only and no statistical analysis was performed. 4 separate summaries are provided to describe study population characteristics, study methodology used, acknowledgment and handling of data quality issues and other methodological challenges, and definitions of CKD progression. For studies exploring multiple outcomes or conducting multiple analyses of changes in renal function, the outcomes and analyses considered the primary focus regarding renal progression in each paper are summarised in the review.

Risk of bias across studies

There was no single effect size of interest in this study and no meta-analysis was performed, as the review focussed on methodology used and investigators’ handling of data quality issues. Publication bias was therefore challenging to evaluate, as funnel plots and statistical tests could not be used. Efforts were made to maximise coverage of peer-reviewed literature in this field, including extraction of articles from 4 major databases. If research is missing from review due to publication in non-English languages, then data quality issues in such missing studies are likely to be similar to those in English language studies that were included. There will be clinical audit studies that are not peer-reviewed; these studies are likely to be of a similar or worse quality than reviewed studies because peer-reviewed literature is expected to go through certain research quality checks. In any case, as peer-reviewed literature is more likely to be used to inform policy than other research, this is arguably the optimal collection of research to assess the aims of this review.

Results

731 unique articles were identified from database searching, of which 80 met study eligibility criteria (Fig 1). Primary reasons for exclusion were not using EHRs, pre-planned data collection for research purposes such as a prospective cohort study, and studies with a single renal function test rather than longitudinal analysis of repeated measures of renal function. Other reasons for exclusion were ineligible populations, such as studies including children, restricted to RRT populations or studies that did not include CKD patients, such as studies of the incidence of CKD. All included studies retrospectively analysed routinely collected healthcare data. It was not always clear whether electronic or paper records were used, and while efforts were taken to differentiate this, it is possible that some included studies may have involved manual data extraction from paper records. 70 studies (88%) clearly stated the use of EHRs. In the 10 studies that did not state this, the time-frame for data collection and location of research suggested that electronic healthcare systems were likely to have been used, but we could not verify this. These studies have been summarised separately in the supplementary materials. A full list of reviewed studies is also included in the supplementary materials S3 File.
Fig 1

Flow chart of study selection.

Study population characteristics

Table 1 summarises characteristics of study populations analysed in reviewed articles. Research was most commonly conducted in the UK (25%) and USA (30%), followed by East Asian countries, including South Korea (8%), China (6%), Taiwan (9%) and Japan (8%). Research in non-English-speaking countries may be missing from review. Typically (based on median), studied populations had a mean age of 64 and were 52% male, although there was substantial variation between studies in these characteristics. Most commonly studied morbidities were CKD (26%) and diabetes (20%) although research covered a range of different populations, including (non-renal) transplant recipients and specific renal diseases. 10% studied the general population, with a further 3% studying patients with general risk factors for CKD. Clinical settings of retrieved databases varied widely, including primary care (23%), un-specified hospital settings (14%), outpatient clinics (21%), and 29% of studies used linked data across multiple care settings.
Table 1

Summary of study populations studied (N = 80).

Study population characteristics | N (%)
Primary decade of follow up
 2010–2019 | 35 (43.8%)
 2000–2009 | 36 (45.0%)
 1990–1999 | 3 (3.8%)
 Not available | 6 (7.5%)
Country
 Europe | 28 (35.0%)
  UK | 20 (25.0%)
  Germany | 2 (2.5%)
  Italy | 2 (2.5%)
  Norway | 2 (2.5%)
  Multiple European countries | 2 (2.5%)
 North America | 25 (31.3%)
  USA | 24 (30.0%)
  Canada | 1 (1.3%)
 Asia | 25 (31.3%)
  South Korea | 6 (7.5%)
  China | 5 (6.3%)
  Taiwan | 7 (8.8%)
  Japan | 6 (7.5%)
  Thailand | 1 (1.3%)
 Oceania | 1 (1.3%)
  Australia | 1 (1.3%)
 South America | 1 (1.3%)
  Colombia | 1 (1.3%)
 Africa | 0
Mean age (a)
 Median (IQR) | 64 (56, 71)
 30–49 | 7 (8.8%)
 50–59 | 20 (25.0%)
 60–69 | 29 (36.3%)
 70–80 | 22 (27.5%)
 Not stated | 2 (2.5%)
Percent male
 Median (IQR) | 52% (44%, 63%)
 ≤ 34% | 6 (7.5%)
 35–44% | 15 (18.8%)
 45–54% | 24 (30.0%)
 55–64% | 16 (20.0%)
 ≥ 65% | 19 (23.8%)
Main morbidity / reason for inclusion
 CKD | 21 (26.3%)
 Diabetes | 16 (20.0%)
 General population | 8 (10.0%)
 Diabetic nephropathy / kidney disease | 5 (6.3%)
 Atrial fibrillation | 5 (6.3%)
 Multiple CKD risk factors | 2 (2.5%)
 IgA nephropathy | 2 (2.5%)
 Infections (Hepatitis C, HIV) | 3 (3.8%)
 Transplant recipients (liver, heart) | 3 (3.8%)
 Autoimmune diseases (lupus, IgG4 related, vasculitis) | 3 (3.8%)
 Gout/hyperuricemia | 2 (2.5%)
 Other* | 10 (12.5%)
Data source / clinical setting
 Multiple care settings | 23 (28.8%)
 Primary care | 19 (23.8%)
 Outpatient | 17 (21.3%)
  Diabetes clinic | 6 (7.5%)
  Renal clinic | 3 (3.8%)
  Diabetic-renal clinic | 1 (1.3%)
  Not specified | 7 (8.8%)
 Hospital | 11 (13.8%)
 Tertiary care | 6 (7.5%)
 Not stated | 4 (5.0%)

aOther morbidities/reason for inclusion were urinary system disorders, hyperkalemia, obesity, osteoporosis, primary aldosteronism, abdominal aortic aneurysm, acute renal embolism, light chain deposition disease, lung cancer and renal cancer.


Study methodology

Study methodology is summarised in Table 2 and a listing of key items by study is also provided in the supplementary materials S4 Table. Use of EHRs for observational research increased rapidly in recent years, with 74% of reviewed studies published in the last 5.5 years. The overwhelming majority of research was focussed on risk factor identification and causal inference (82%), with only a handful of studies attempting risk prediction (9%). Other aims included estimation of incidence or prevalence (4%) and descriptive characterisations of changes in renal function (4%). Sample size ranged drastically from 24 up to 1,597,629, with a median sample size of 1,114.
Table 2

Study methodology (N = 80).

Study methodology features | N (%)
Date of publication
 2015–2021 | 59 (73.8%)
 2010–2014 | 14 (17.5%)
 2005–2009 | 6 (7.5%)
 2000–2004 | 1 (1.3%)
Study design
 Retrospective cohort study | 74 (92.5%)
 Cross-sectional study | 4 (5.0%)
 Case-control study | 2 (2.5%)
Research aims
 Risk factor identification / causal inference | 65 (81.3%)
 Risk prediction | 7 (8.8%)
 Estimation of incidence/prevalence | 3 (3.8%)
 Descriptive characterisation of changes in renal function | 3 (3.8%)
 Identification of sub-populations | 1 (1.3%)
 Audit of care provision | 1 (1.3%)
Sample size
 Median (IQR) | 1114 (209, 9876)
 ≤ 99 | 10 (12.5%)
 100–499 | 18 (22.5%)
 500–999 | 11 (13.8%)
 1,000–9,999 | 22 (27.5%)
 ≥ 10,000 | 19 (23.8%)
Measure of renal function
 eGFR | 75 (93.8%)
  MDRD | 33 (41.3%)
  CKD-EPI | 28 (35.0%)
  MDRD, CKD-EPI combination | 1 (1.3%)
  Taiwan CKD-EPI | 1 (1.3%)
  Japanese formula | 3 (3.8%)
  Not specified | 9 (11.3%)
 Estimated creatinine clearance | 2 (2.5%)
  Cockcroft and Gault | 2 (2.5%)
 Serum creatinine | 2 (2.5%)
 Inverse serum creatinine | 1 (1.3%)
Measure of change in renal function over time (a)
 eGFR | 75 (93.8%)
  Regression slope (absolute changes) | 20 (25.0%)
   Individual linear regression | 8 (10.0%)
   Linear mixed model | 10 (12.5%)
   Growth model | 1 (1.3%)
   Generalised estimating equations | 1 (1.3%)
  Regression slope (absolute and percent changes) | 1 (1.3%)
   Linear mixed model | 1 (1.3%)
  Rate of change between measures | 5 (6.3%)
  Rate of change, not clearly defined | 4 (5.0%)
  Rate of percentage change, not clearly defined | 3 (3.8%)
  Raw absolute change from baseline | 10 (12.5%)
  Raw percent change from baseline | 13 (16.3%)
  Raw percent change between measures | 1 (1.3%)
  Binary progression to threshold eGFR | 6 (7.5%)
  Binary progression (changes/threshold combination) | 3 (3.8%)
  Transition between CKD stages | 6 (7.5%)
  Trajectory shape class (mixed model) | 1 (1.3%)
  Model predicted percent change per year | 1 (1.3%)
  Model predicted eGFR at multiple time points | 1 (1.3%)
 Estimated creatinine clearance | 2 (2.5%)
  Regression slope (absolute scale) | 1 (1.3%)
  Raw percent change from baseline | 1 (1.3%)
 Serum creatinine | 2 (2.5%)
  Raw absolute change from baseline | 1 (1.3%)
  Binary progression to threshold serum creatinine | 1 (1.3%)
 Inverse serum creatinine | 1 (1.3%)
  Regression slope (absolute changes) | 1 (1.3%)
Change in renal function as outcome or exposure
 Outcome | 74 (92.5%)
 Exposure (if exposure, outcome listed below) | 6 (7.5%)
  Referral to renal care | 1 (1.3%)
  CV events | 1 (1.3%)
  Multiple outcomes (CV, hospitalisation, death) | 1 (1.3%)
  Advanced CKD (stage 4) | 1 (1.3%)
  Bleeding events | 1 (1.3%)
Duration of follow up for renal function changes
 Median (IQR), years | 3.0 (1.6, 4.4)
 < 1 year | 7 (8.8%)
 1–4.9 years | 48 (60.0%)
 5–9.9 years | 14 (17.5%)
 ≥ 10 years | 1 (1.3%)
 Not stated | 10 (12.5%)
Minimum number of renal function measures for inclusion
 0 | 1 (1.3%)
 1 | 7 (8.8%)
 2 | 24 (30.0%)
 3 | 15 (18.8%)
 4 | 5 (6.3%)
 5 | 1 (1.3%)
 6 | 4 (5.0%)
 Not stated | 23 (28.8%)
Percentage of target population used in analysis
 < 50% | 17 (21.3%)
 50% - 75% | 9 (11.3%)
 75% - 90% | 5 (6.3%)
 90% - 95% | 5 (6.3%)
 > 95% | 8 (10.0%)
 Not available | 36 (45.0%)
Percentage of study population lost to follow up
 < 25% | 2 (2.5%)
 25% - 50% | 3 (3.8%)
 > 50% | 1 (1.3%)
 Not available | 62 (77.5%)
 Complete case analysis (only including records of people with follow-up data) | 11 (13.8%)
Statistical tools used (b)
 Descriptive results only | 5 (6.3%)
 Simple statistical tests | 9 (11.3%)
 Linear regression models | 8 (10.0%)
 ANOVA/ANCOVA | 2 (2.5%)
 Kaplan-Meier estimation / life table analysis | 3 (3.8%)
 Generalised linear models (GLMs) | 11 (13.8%)
 Cox proportional hazards regression | 18 (22.5%)
 Competing risks survival models | 3 (3.8%)
 Mixed modelling methods | 12 (15.0%)
 Other latent variable methods | 2 (2.5%)
 Generalised estimating equations (GEEs) | 2 (2.5%)
 Joint longitudinal survival modelling | 2 (2.5%)
 Structural equation modelling | 1 (1.3%)
 Multiple imputation | 5 (6.3%)
 Machine learning methods | 3 (3.8%)
Statistical model used (b)
 Risk factor identification / causal inference (N = 65)
  Difference in means t-test | 2 (3.1%)
  Mean difference paired t-test | 4 (6.2%)
  Simple non-parametric tests (Mann-Whitney U) | 1 (1.5%)
  Difference in proportions chi-squared test | 2 (3.1%)
  ANOVA | 1 (1.5%)
  ANCOVA | 1 (1.5%)
  Linear regression | 7 (10.8%)
  Logistic regression | 10 (15.4%)
  Kaplan-Meier estimation / life table analysis | 3 (4.6%)
  Cox proportional hazards regression | 16 (24.6%)
  Competing risk survival models | 3 (4.6%)
  Linear mixed model | 10 (15.4%)
  Generalised estimating equations (GEEs) | 2 (3.1%)
  Joint longitudinal survival model | 2 (3.1%)
  Structural equation modelling | 1 (1.5%)
 Risk prediction (N = 7)
  Kalman filter (time series model) | 1 (14.3%)
  Naïve Bayes classifier | 1 (14.3%)
  Logistic regression | 4 (57.1%)
  Cox proportional hazards regression | 1 (14.3%)
  Random forest regression | 2 (28.6%)
  Linear mixed model | 1 (14.3%)
 Estimation of incidence/prevalence (N = 3)
  Crude estimation | 3 (100%)
 Identification of sub-populations (N = 1)
  Trajectory clustering using latent variables | 1 (100%)
 Audit of care provision (N = 1)
  Linear mixed model | 1 (100%)

aMore specific details of measures of changes in renal function in individual studies assessing CKD progression and corresponding statistical analysis methods are shown in Table 4, including where time-to-event models were used in the presence of unequal follow up or censoring.

bMultiple items possible for a single study but focus only on main analysis of CKD progression.

Table 4

Listing of CKD progression measures in reviewed articles (52 of 80 articles).

Methods | Rule (a) | Term | Author [ref] (b) | Year | Avg follow up | Sample size | Other methods (a)
Individual linear regression | eGFR slope decline: > 3 ml/min/1.73m²/year | Progressors | Chase HS et al. [11] | 2014 | 6 years | 481 | Naïve Bayes classifier; logistic regression
Individual linear regression | eGFR slope decline: > median (8.1) ml/min/1.73m²/year | Relatively rapid eGFR decline | Wang Y et al. [12] | 2019 | 2 years | 128 | Logistic regression
Individual linear regression | eGFR slope decline: > mean (1.5) ml/min/1.73m²/year | Faster decline | Abdelhafiz AH et al. [13] | 2012 | 14 years | 100 | Logistic regression
Linear mixed model | eGFR slope decline: > 5 ml/min/1.73m²/year | Rapid progression | Eriksen BO et al. [14] | 2006 | 3.7 years | 3,047 | Slope interactions
Linear mixed model | eGFR slope decline: > 4 ml/min/1.73m²/year | Rapid progression | Jalal K et al. [15] | 2019 | ≥ 3 years | 10,927 | N/A
Linear mixed model | eGFR slope decline: > 3 ml/min/1.73m²/year | eGFR slope decline | Cabrera CS et al. [16] | 2020 | 4.3 years | 30,222 | Cox PH regression
Linear mixed model | eGFR slope decline: > 0 ml/min/1.73m²/year | Progressors (vs non-progressors) | Eriksen et al. [17] | 2010 | 4 years | 1,224 | 2-level model
Linear mixed model | eGFR slope decline: > 0 ml/min/1.73m²/year | eGFR decline | Annor FB et al. [18] | 2015 | 4 years | 575 | Structural equation modelling
Linear mixed model | eGFR predicted percent rate of decline: > 5% per year | Progression | Diggle PJ et al. [19] | 2015 | 4.5 years | 22,910 | Piecewise linear mixed model
Absolute change between measures | eGFR drop at any time: > 10 ml/min/1.73m² | Progression | Butt AA et al. [20] | 2018 | 3 months | 17,624 | Difference in proportions chi-squared test
Percent change between measures | eGFR percent drop: >10%; >20% | Progression | Singh A et al. [21] | 2015 | 1 year | 6,435 | Logistic regression
Percent change between measures | eGFR percent drop: >15% | Progressive renal impairment | Evans RDR et al. [22] | 2018 | 5 years | 24 | Descriptive results only
Percent change between measures | eGFR percent drop: >20% | Transient or persistent renal function decline | Jackevicius CA et al. [23] | 2021 | Approx. 1.4 years | 49,458 | Cox PH regression
Percent change between measures | eGFR percent drop: >25% | Progression | Lai YJ et al. [24] | 2019 | 1 year | 1,620 | Cox PH regression
Percent change between measures | eGFR percent drop: >25% (AND increase in CKD stage) | Progression | Vejakama P et al. [25] | 2015 | 4.5 years | 32,106 | Competing risks survival models
Percent change between measures | eGFR percent drop: >30% | “30% decline in eGFR” | Posch F et al. [26] | 2019 | 1.4 years | 14,432 | Cox PH regression
Percent change between measures | eGFR percent drop: >30% | Renal function decline | Hsu TW et al. [27] | 2019 | 5 years | 5,046 | Cox PH regression
Percent change between measures | eGFR percent drop: >30% | Rapid eGFR decline | Inaguma D et al. [28] | 2020 | 2 years | 9,911 | Logistic regression; Random forest regression
Percent change between measures | eGFR percent drop: >30% | eGFR decline | Peng YL et al. [29] | 2020 | 1.5 years | 1,050 | Cox PH regression
Percent change between measures | eGFR percent drop: >30% | (no label) | Yao X et al. [30] | 2017 | 11 months | 9,796 | Cox PH regression
Percent change between measures | eGFR percent drop: >30% | “Loss of eGFR >30%” | Lamacchia O et al. [31] | 2018 | 4 years | 582 | Logistic regression
Percent change between measures | eGFR percent drop: >30% | eGFR loss | Viazzi F et al. [32] | 2018 | 4 years | 535 | Logistic regression
Percent change between measures | eGFR percent drop: >30% | Clinically important decline | Rej S et al. [33] | 2020 | 3.1 years | 6,226 | Cox PH regression
Percent change between measures | eGFR percent drop: >30%; 30–50%; and 50% | Progression | Yoo H et al. [34] | 2019 | 5.7 years | 478 | Kaplan-Meier with log-rank test
Percent change between measures | eGFR percent drop: >40% (or RRT initiation) | RRT40 | Tangri N et al. [35] | 2021 | 3.9 years | 32,007 | Cox PH regression
Percent change between measures | eGFR percent drop: >50% | Renal survival endpoint | Lv L et al. [36] | 2017 | 3.1 years | 208 | Cox PH regression
Percent change between measures | Serum creatinine percent increase: >50% | Worsening renal function | Li XM et al. [37] | 2016 | 1.8 years | 44 | Descriptive results only
Percent change between measures | Estimated creatinine clearance percent drop: >0% | Decline in creatinine clearance | Gallant JE et al. [38] | 2005 | 1 year | 658 | Descriptive results only
Rate of change between measures | eGFR drop per time elapsed (assumed): > 2.5 ml/min/1.73m²/year | Progressive GFR decline | Herget-Rosenthal S et al. [39] | 2013 | 3 years | 803 | Logistic regression
Rate of change between measures | eGFR drop per time elapsed: > 3 ml/min/1.73m²/year | Rapid progression | Morales-Alvarez MC et al. [40] | 2019 | Not stated | 594 | Descriptive comparisons
Rate of change between measures | eGFR drop per time elapsed: > 5 ml/min/1.73m²/year | eGFR decline | Nderitu P et al. [41] | 2014 | 9 months | 4,145 | Logistic regression
Rate of change between measures | eGFR drop per time elapsed: > 5 ml/min/1.73m²/year | Fast progression | Koraishy FM et al. [42] | 2017 | Not stated | 2,170 | Logistic regression
Rate of change between measures | eGFR drop per time elapsed (assumed): > 5 ml/min/1.73m²/year | Progressive CKD | Johnson F et al. [43] | 2015 | Not stated | 200 | Difference in proportions chi-squared test
Rate of change between measures | eGFR drop per time elapsed: > 5 ml/min/1.73m²/year | Rapid decline | Chakera A et al. [44] | 2015 | 7 years | 147 | Logistic regression
Rate of change between measures | eGFR percent drop per time elapsed (assumed): >5% per year | Rapid kidney function decline | Chen H et al. [45] | 2014 | 3 years | 365 | Logistic regression
Change in CKD stage, based on measures | Population: incident CKD stage 3 (2 x eGFR < 60 over > 3 months); Outcome: 2 x eGFR <30 over >3 months | CKD progression from stage 3 to 4 | Perotte A et al. [46] | 2015 | Not stated | 2,908 | Cox proportional hazards regression
Change in CKD stage, based on measures | Increase in CKD stage: By one or more stages | Worsening in CKD stage | Cummings DM et al. [47] | 2011 | 7.6 years | 791 | Logistic regression
Change in CKD stage, based on measures | Increase in CKD stage: By one or more stages (eGFR values or diagnostic codes) | Declining kidney function | Horne L et al. [48] | 2019 | Not stated | 195,178 | Crude estimation of incidence rate
Change in CKD stage, based on measures | Increase in CKD stage: By one or more stages (eGFR values or coded RRT) | CKD stage worsening | Robinson DE et al. [49] | 2021 | Approx. 3.7 years | 19,324 | Competing risks survival models
Change in CKD stage, based on measures | Increase in CKD stage: By one stage | Progression of kidney dysfunction to next CKD stage | Nicolos GA et al. [50] | 2020 | 5 years | Approx 37,000 | Life-table analysis
Change in CKD stage, based on measures | Increase in CKD stage / risk category: To very high risk category (eGFR <30 and proteinuria (-); eGFR <45 and proteinuria (±); eGFR < 60 and proteinuria (+)) | Diabetic kidney disease progression | Yanagawa T et al. [51] | 2021 | 6.2 years | 681 | Cox PH regression
Change in CKD stage, based on measures | Change in CKD stage: From and to any stage, summarised by initial and final stage | Transition between CKD stages | Vesga JI et al. [52] | 2021 | 6-month intervals | 1,783 | Crude estimation
Binary progression to threshold value | Threshold eGFR: median eGFR < 30, for at least 3 consecutive months | Nephrotoxicity | Oetjens M et al. [53] | 2014 | 8.8 years | 115 | Cox PH regression
Binary progression to threshold value | Threshold eGFR: 2 x eGFR<30 over ≥90 days with no intermediate eGFR>30 | Advanced CKD | Neuen BL et al. [54] | 2021 | 2.9 years | 91,319 | Cox PH regression
Binary progression to threshold value | Threshold eGFR: 2 x eGFR<30 over ≥90 days with no intermediate eGFR>30 (or a stage 4–5 code) | Incident CKD stages 4–5 | Weldegiorgis M et al. [55] | 2019 | 7.5 years | 1,397,573 | Cox PH regression
Binary progression to threshold value | Threshold eGFR: < 45 ml/min/1.73m² | Progression to CKD stage 3b | Niu SF et al. [56] | 2021 | 3.0 years | 3,114 | Cox PH regression
Binary progression to threshold value | Threshold eGFR: < 15 ml/min/1.73m² | Renal survival endpoint | O’Riordan A et al. [57] | 2009 | 3.2 years | 54 | Kaplan-Meier estimation; log-rank test
Binary progression to threshold value | Threshold eGFR: ESRD (eGFR<15 or dialysis) | Progression to ESRD | Tsai CW et al. [58] | 2017 | 4.2 years | 739 | Cox PH regression
Binary progression (changes/threshold combination) | eGFR percent drop: >50% AND Threshold eGFR: 2 x eGFR <30 | Renal event | Leither MD et al. [59] | 2019 | 5.3 years | 196,209 | Cox PH regression
Binary progression (changes/threshold combination) | eGFR percent drop: >50% OR Threshold eGFR: ESRD | “ESRD or an irreversible reduction in eGFR” | Liu D et al. [60] | 2019 | 3.7 years | 455 | Cox PH regression
Binary progression (changes/threshold combination) | eGFR percent drop: >50% OR Threshold eGFR: ESRD | CKD progression | Rincon-Choles H et al. [61] | 2017 | 2.8 years | 1,676 | Competing risks survival models
Latent class non-linear mixed models | Prediction of latent eGFR trajectory class, 6 categories | Trajectory category* | VanWagner LB et al. [62] | 2018 | 1 year | 671 | Logistic regression, conditional on class

a In time-to-event analyses (e.g. Cox PH regression, competing risks survival models), the rule for progression can be met at any time during data collection, utilising repeated test results over time. In binary analyses (e.g. logistic regression), the rule is applied once per patient, likely at a specific time which may vary between studies.

b For consistency, article reference numbers [ref] also match those provided in the supplementary S3 File listing of reviewed studies.
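
As an illustration of how a time-to-event rule of the kind listed in the table might be applied to repeated test results, the following minimal Python sketch implements one possible reading of the rule "2 x eGFR <30 over ≥90 days with no intermediate eGFR >30". The function name, the (date, eGFR) data layout and the choice to date the event at the confirming test are assumptions made for illustration; they are not taken from any of the reviewed studies.

```python
from datetime import date


def sustained_below_threshold(tests, threshold=30.0, min_days=90):
    """Return the date on which the rule is first confirmed, or None.

    tests: iterable of (date, egfr) pairs for one patient.
    Rule (one interpretation, assumed here): two eGFR values below
    `threshold` at least `min_days` apart, with no value above
    `threshold` recorded between them.
    """
    run_start = None  # date of the first below-threshold test in the current run
    for when, egfr in sorted(tests):
        if egfr > threshold:
            # An intermediate value above the threshold breaks the run.
            run_start = None
        elif egfr < threshold:
            if run_start is None:
                run_start = when
            elif (when - run_start).days >= min_days:
                return when  # confirmed at the later of the two tests
        # A value exactly equal to the threshold neither starts nor breaks a run.
    return None


# Hypothetical patient: the January low value is followed by a value above 30,
# so the run restarts in March and is confirmed in June (106 days later).
history = [
    (date(2020, 1, 1), 28.0),
    (date(2020, 2, 1), 35.0),
    (date(2020, 3, 1), 27.0),
    (date(2020, 5, 1), 29.0),
    (date(2020, 6, 15), 25.0),
]
print(sustained_below_threshold(history))  # 2020-06-15
```

Because the rule can only be confirmed at a test date, the recorded event time depends on how often a patient is tested, which is one way in which irregular testing in EHRs can influence time-to-event outcomes.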

eGFR was the most commonly used measure of renal function (94%). Measures of change in renal function and methods of derivation were highly variable. Regression of absolute changes in eGFR was most common (26% of studies), although methods varied, with many studies using mixed models and others using individual linear regression. Calculation of absolute changes and percent changes in eGFR was also common (14% and 17% respectively), but duration of follow up varied substantially between studies. Other less common measures were rates of change calculated between measures, regression slopes on the percent scale, and binary measures for progression to thresholds of eGFR or CKD stages. Seven studies (9%) analysed rates of change in eGFR that were not clearly defined as either regression slopes or rates of change between measures. Other renal function measures studied were Cockcroft and Gault estimated creatinine clearance (3%), serum creatinine (3%) and inverse serum creatinine (1%). Most studies (93%) analysed changes in renal function as an outcome, with only 6 studying changes in renal function as an exposure.

Typical (median) duration of follow up for renal function was 3 years, but ranged from 3 months to 14 years, and was not stated in 13% of studies. Duration of follow up also commonly varied substantially between patients within individual studies, mostly due to variation in data completeness with regard to availability and timing of serum creatinine test results on the health record. Inclusion criteria relating to availability of repeat eGFR measures varied and were commonly not stated (29%).

The percentage of the target population analysed could not be calculated for 36 studies (45%) due to insufficient data (Fig 2A). The study population constituted less than 50% of patients in the target population for 17 studies (21%), and less than 75% of the target population in 26 studies (33%) (Fig 2B). Statistics on data completeness were rarely stated explicitly and were often difficult to ascertain. Rates of loss to follow up were even more difficult to ascertain, and many studies sampled patients on the basis of varying levels of completeness of follow up. In 11 studies (14%), quantifying the impact of loss to follow up was not possible due to sampling based on complete follow up, and in 62 studies (78%) no data were reported on losses to follow up. The supplementary listing of individual studies provides a more detailed breakdown of analysis criteria, percentage of target population analysed and rates of loss to follow up.
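
To make the distinction between the measures of change summarised above concrete, the following Python sketch computes four of them for a single patient: absolute change, percent change, rate of change between the first and last measures, and a least-squares regression slope through all measures. The function name and the example values are hypothetical, and the sketch is not a reconstruction of any reviewed study's method.

```python
def change_measures(times, egfr):
    """Summarise change in eGFR for one patient.

    times: years since the first test, in ascending order.
    egfr:  eGFR values (ml/min/1.73m2) at those times; the first must be > 0.
    """
    if len(times) != len(egfr) or len(times) < 2:
        raise ValueError("need at least two paired measurements")
    span = times[-1] - times[0]
    if span <= 0:
        raise ValueError("tests must span a positive length of time")

    absolute_change = egfr[-1] - egfr[0]                # last minus first
    percent_change = 100.0 * absolute_change / egfr[0]  # relative to first value
    rate_between = absolute_change / span               # per year, first to last only

    # Ordinary least-squares slope through all measurements (per year).
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(egfr) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, egfr))
    regression_slope = sxy / sxx

    return {
        "absolute_change": absolute_change,
        "percent_change": percent_change,
        "rate_between": rate_between,
        "regression_slope": regression_slope,
    }


# Hypothetical patient with four tests over three years.
example = change_measures([0.0, 1.0, 2.0, 3.0], [60.0, 50.0, 56.0, 48.0])
print(example)
# absolute_change = -12.0, percent_change = -20.0,
# rate_between = -4.0 per year, regression_slope = -3.0 per year
```

In this example the rate between the first and last measures (-4.0 per year) differs from the regression slope (-3.0 per year) for the same patient, which illustrates why a "rate of change" that is not clearly defined is difficult to interpret or compare across studies.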
Fig 2

Risk of selection bias (A) and ascertainment bias (B) in individual studies.

Statistical methods for analysing CKD progression depended on whether the renal function measure was continuous (e.g. rate of change in eGFR) or binary (e.g. >30% change in eGFR from baseline at repeat measurement), which varied between studies. The most commonly used statistical methods were linear mixed models, linear regression, logistic regression, and Cox proportional hazards regression. Many studies used simple statistical tests, despite the inability of these methods to adjust for confounders commonly present in observational data. More sophisticated methods taking into account differential drop-out due to death were rare: two studies used joint longitudinal survival models and three studies used competing risks survival models.

Handling of data quality issues and methodological challenges

Table 3 summarises how data quality issues and methodological challenges were dealt with in reviewed articles. EHR databases used for analysis rarely had good quality data on renal function, i.e. collected regularly over time and completely for all patients in the target population. A few studies attempted to improve sample completeness, for example by using imputation methods to avoid exclusions. Studies selected patients for analysis on the basis of varying levels of data completeness, relating to number of measures and duration of follow up, and many studies would have excluded patients from analysis completely on the basis of insufficient data over time. 64% of studies at least partially acknowledged this as introducing bias, 18% provided some data on sample completeness without acknowledging implications and 16% did not mention sample completeness or representativeness at all. Very few studies mentioned losses to follow up during the study period or potential reasons for loss to follow up and 61% of studies did not mention the issue of informative censoring at all. Only 6 studies (8%) tackled the issue methodologically, for example by accounting for the competing risk of death through joint longitudinal survival models and competing risks survival models.
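
To illustrate why the competing risk of death matters in time-to-event analyses, the following Python sketch compares a naive estimate of the probability of progression (one minus the Kaplan-Meier estimate, with deaths treated as censored) against a nonparametric cumulative incidence estimate that accounts for death as a competing event (an Aalen-Johansen type estimator). The function name, the status coding and the data are hypothetical, and this is a simplified illustration rather than the regression-based competing risks models used in the reviewed studies.

```python
def naive_and_competing_risk_incidence(records):
    """Estimate the probability of progression by the end of follow-up.

    records: list of (time, status) pairs, one per patient, where status is
             0 = censored, 1 = progression, 2 = death (competing event).
    Returns (naive, cif):
      naive = 1 - Kaplan-Meier estimate with deaths treated as censored
      cif   = cumulative incidence of progression accounting for death
    """
    event_times = sorted({t for t, status in records if status != 0})
    event_free = 1.0   # probability of no event of either type so far
    km_survival = 1.0  # Kaplan-Meier estimate, deaths treated as censored
    cif = 0.0
    for t in event_times:
        at_risk = sum(1 for u, _ in records if u >= t)
        progressed = sum(1 for u, s in records if u == t and s == 1)
        died = sum(1 for u, s in records if u == t and s == 2)
        # Progression at t requires being free of both events until t.
        cif += event_free * progressed / at_risk
        event_free *= 1.0 - (progressed + died) / at_risk
        km_survival *= 1.0 - progressed / at_risk
    return 1.0 - km_survival, cif


# Hypothetical cohort of 8 patients: 3 progress, 2 die first, 3 are censored.
cohort = [(1, 1), (2, 2), (3, 1), (4, 2), (5, 1), (6, 0), (7, 0), (8, 0)]
naive, cif = naive_and_competing_risk_incidence(cohort)
print(naive, cif)  # 0.453125 0.375
```

In this example the naive estimate (about 45%) exceeds the cumulative incidence that accounts for death (37.5%, which here equals the observed proportion of 3 in 8 patients who progressed), because treating deaths as censored implicitly assumes that patients who died could still have gone on to progress.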
Table 3

Critique of handling of data quality and methodological challenges (N = 80).

Handling of data quality and methodological challenges | N (%)
Representativeness of sample to target population
 Not mentioned | 13 (16.3%)
 Mentioned care pathway and inclusion criteria, but not sample completeness | 2 (2.5%)
 Mentioned sample completeness, but not implications | 14 (17.5%)
 Partially acknowledged implications of sample completeness | 37 (46.3%)
 Fully acknowledged implications of sample completeness | 10 (12.5%)
 Tackled methodologically | 4 (5.0%)
Methods of handling^a
 None | 68 (85.0%)
 Detailed/comprehensive database of EHRs used | 5 (6.3%)
 Multiple imputation (to avoid exclusions) | 4 (5.0%)
 Other imputation methods (to avoid exclusions) | 3 (3.8%)
Handling of informative drop-outs/censoring
 Not mentioned | 49 (61.3%)
 Mentioned care pathway follow up, but not losses to follow up (inc. death) | 2 (2.5%)
 Mentioned losses to follow up, but not implications | 7 (8.8%)
 Partially acknowledged implications of losses to follow up | 13 (16.3%)
 Fully acknowledged implications of losses to follow up | 3 (3.8%)
 Tackled methodologically | 6 (7.5%)
Methods of handling^a
 None | 71 (88.8%)
 Complete follow up | 1 (1.3%)
 Joint modelling of longitudinal changes and time to drop out (including death) | 2 (2.5%)
 Sensitivity analysis in drop-outs | 1 (1.3%)
 Competing risks survival models | 4 (5.0%)
 Sensitivity analysis adjusting for competing risks | 1 (1.3%)
Handling of missing longitudinal data
 Not mentioned | 47 (58.8%)
 Mentioned care pathway follow up, but not data completeness | 4 (5.0%)
 Mentioned data completeness, but not implications | 7 (8.8%)
 Partially acknowledged implications of data completeness | 13 (16.3%)
 Fully acknowledged implications of data completeness | 1 (1.3%)
 Tackled methodologically | 8 (10.0%)
Methods of handling^a
 None | 62 (77.5%)
 LOCF | 1 (1.3%)
 Imputation with mean/median | 2 (2.5%)
 Mixed modelling | 13 (16.3%)
 Generalised estimating equations | 1 (1.3%)
 Multiple imputation | 1 (1.3%)
Handling of missing covariate data
 Not relevant (no covariate analysis) | 16 (20.0%)
 Not mentioned (despite covariate analysis) | 32 (40.0%)
 Mentioned data completeness, but not implications | 2 (2.5%)
 Partially acknowledged implications of data completeness | 17 (21.3%)
 Fully acknowledged implications of data completeness | 3 (3.8%)
 Tackled methodologically | 7 (8.8%)
Methods of handling^a
 None | 64 (80.0%)
 LOCF | 2 (2.5%)
 Imputation with mean | 4 (5.0%)
 Multiple imputation | 5 (6.3%)
 Complete data was available for all covariates | 2 (2.5%)
 Data linkage to improve data completeness | 1 (1.3%)
 Adjustment for missingness | 2 (2.5%)
Distributional checks/issues
 Not mentioned | 70 (87.5%)
 Mentioned or partially addressed | 5 (6.3%)
 Fully acknowledged | 0
 Tackled | 5 (6.3%)
Methods of handling^a
 None | 75 (93.8%)
 Distributional checks | 4 (5.0%)
 Consideration of alternative error distributions | 1 (1.3%)
Handling of within-patient correlation / variability in kidney function over time
 Not mentioned | 20 (25.0%)
 Mentioned or partially addressed | 24 (30.0%)
 Fully acknowledged | 4 (5.0%)
 Tackled | 32 (40.0%)
Methods of handling^a
 None | 35 (43.8%)
 Random effects / latent variables | 17 (21.3%)
 Generalised estimating equations | 2 (2.5%)
 Modelling of stochastic process | 1 (1.3%)
 Outcome likely to identify real change | 22 (27.5%)
 Measures capturing AKI explicitly excluded | 1 (1.3%)
 Paired t-test | 3 (3.8%)
Handling of population heterogeneity
 Not mentioned | 1 (1.3%)
 Mentioned or partially addressed | 36 (45.0%)
 Fully acknowledged | 3 (3.8%)
 Tackled | 40 (50.0%)
Methods of handling^a
 None | 8 (10.0%)
 Adjustment for covariates | 21 (26.3%)
 Interaction terms | 9 (11.3%)
 Stratified or separate/subgroup analysis | 34 (42.5%)
 Latent classes | 1 (1.3%)
 Random effects | 3 (3.8%)
 ANOVA/ANCOVA | 2 (2.5%)
 Propensity score methods | 1 (1.3%)
 Features in machine learning classification | 1 (1.3%)
Handling of confounding (risk factor / causal inference analyses only) | N = 65
 Not mentioned | 7 (10.8%)
 Mentioned or partially addressed | 17 (26.2%)
 Fully acknowledged | 3 (4.6%)
 Tackled | 38 (58.5%)
Methods of handling^a
 None | 12 (18.5%)
 Adjustment for baseline confounders | 46 (70.8%)
 Propensity score methods | 6 (9.2%)

a Methods/approaches for handling issues are listed, regardless of whether the corresponding issues were fully tackled in analysis.

Most studies (59%) did not mention (or tackle) the issue of missing longitudinal data on renal function tests over time. However, one in six studies (16%) used mixed modelling methods, which may partially deal with the issue, and 4 studies (5%) attempted to deal with missing longitudinal data through imputation methods. Missing covariate data went unmentioned in 40% of studies despite covariate analysis, while 20% did not perform covariate analysis. 25% at least partially acknowledged the issue and 16 studies (20%) made some attempt to handle missing covariate data through imputation methods, data linkage or other adjustment for missingness. Distributional checks for renal function measures were rare, with only 5 studies (6%) mentioning distributional checks or considering alternative error distributions.

Regarding variability in renal function over time and within-patient correlation, 25% of studies did not mention (or tackle) such issues at all, 40% tackled the issue methodologically, 30% partially tackled or acknowledged the issue and a further 5% fully acknowledged such issues. 21% of studies used patient random effects to account for within-patient correlation, and 28% used outcomes that are likely to identify an important and real change.

Most studies acknowledged some aspects of population heterogeneity in analyses. At the most basic level, covariate adjusted analyses were used to account for baseline differences between patients (26%). Other methods included stratification or subgroup analyses to study distinct populations (43%), interaction terms allowing differing trajectories of renal function according to patient characteristics (11%) and random effects (4%).

Of the studies performing causal analyses, 59% tackled the issue of confounding, mostly through baseline adjustment. A subset (11%) did not mention (or tackle) confounding at all, with some studies performing simple statistical tests such as t-tests and chi-squared tests despite the potential for confounding by indication.

Definitions of CKD progression

Table 4 provides a list of CKD progression measures used in individual studies, grouped by method of derivation. A listing is provided rather than an aggregate summary because of the substantial variation in the way researchers defined CKD progression across the literature. Terms used included progression, rapid progression, fast progression, rapid decline, progressive decline, progressive renal impairment, renal function deterioration and worsening renal function, while some studies did not provide labels and simply stated the outcome, for example as a threshold percent change in renal function. There was no consistency between studies in the way these terms applied to different outcomes.

Discussion

We performed a systematic review of peer-reviewed literature studying progression of CKD using routinely collected EHR data. Handling of data quality issues was generally poor, with unclear reporting of analysis criteria, data completeness and the implications of missing data for the reliability of conclusions. For studies with sufficient data, representativeness of samples to target populations was likely to be poor, with large numbers of patients excluded from analysis on the basis of poor data completeness at baseline and during follow-up, thereby likely introducing selection bias. Methods capable of handling missing longitudinal data and informative losses to follow up, such as joint longitudinal survival models, were used in only a minority of studies, and many studies are likely to have overstated the reliability of findings and their applicability to populations of interest. Measures of change in renal function and definitions of progression varied substantially between studies, revealing a lack of consensus on clinically important and statistically robust measures in the study of CKD progression.

Unlike prospective cohort studies and clinical trials, which prospectively identify patients for research and make efforts to follow up patients regularly and completely over time, retrospective analysis of routine healthcare data relies on data collected for the purposes of clinical care. While monitoring guidelines may be in place in healthcare systems that aim to ensure regular follow up of patients at risk of CKD progression, such guidelines may be followed at the discretion of healthcare providers, and frequency of testing and time between tests are likely to be influenced by patient risk. 
If patients are sampled for analysis on the basis of threshold levels of data completeness over time, there is a risk of disproportionately including patients who are followed up more regularly as a result of their evolving risk profile (selection bias) and who remain both alive and free of RRT long enough to meet the follow up criteria (survival bias). In addition, if data are collected in a single care setting but patients are managed in different care settings based on their risk, data may be informatively missing where patients move between care settings (ascertainment bias). It is highly likely that studies using EHRs that exclude patients from analysis due to poor data completeness, or that fail to follow up patients equally among different risk groups, will have unreliable results, and results may reflect an unknown subgroup of the target population. The use of such studies to inform clinical decision-making may therefore fail to benefit the community as hoped.

There are a number of methodological challenges in longitudinal analysis of renal function that are not necessarily specific to EHRs but that are important considerations for researchers; these are discussed in more detail in [10, 63] and were introduced earlier. In the absence of acute kidney injury, mixed effects models with patient random effects may improve estimation of changes over time compared with individual linear regressions, which may lead to more extreme slope estimates. Such models allow sharing of information between patients, assuming a common mean trajectory, and they allow patients with variable levels of data completeness to be included in analysis, avoiding unnecessary exclusions. A further benefit is the ability to perform the entire analysis (comparing exposures and outcomes) in a single model, without the loss of information and under-estimation of standard errors that may result from a 2-step approach that estimates individual changes prior to further modelling.
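
The claim that individual linear regressions can give more extreme slope estimates than methods that share information between patients can be illustrated with a small simulation. The Python sketch below is not a mixed model fit: it uses a simple method-of-moments shrinkage of per-patient least-squares slopes towards the overall mean, which mimics the way patient random effects pull noisy individual estimates towards a common mean slope. All parameter values (mean slope, between-patient spread, measurement noise, test schedule) are assumptions chosen for illustration, and the measurement noise is treated as known for simplicity.

```python
import random
import statistics


def ols_slope(times, values):
    """Least-squares slope of values on times for one patient."""
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
    return sxy / sxx


def simulate_slopes(n_patients=400, seed=1):
    """Simulate eGFR series and return true slopes, OLS slopes and the
    sampling variance of an individual OLS slope (all values assumed)."""
    rng = random.Random(seed)
    times = [0.0, 0.5, 1.0, 1.5, 2.0]               # five tests over two years
    mean_slope, slope_sd, noise_sd = -2.0, 1.0, 5.0  # ml/min/1.73m2 (per year)
    true_slopes, ols_slopes = [], []
    for _ in range(n_patients):
        slope = rng.gauss(mean_slope, slope_sd)
        baseline = rng.gauss(60.0, 10.0)
        values = [baseline + slope * t + rng.gauss(0.0, noise_sd) for t in times]
        true_slopes.append(slope)
        ols_slopes.append(ols_slope(times, values))
    t_mean = sum(times) / len(times)
    sampling_var = noise_sd ** 2 / sum((t - t_mean) ** 2 for t in times)
    return true_slopes, ols_slopes, sampling_var


def shrink(ols_slopes, sampling_var):
    """Shrink individual slopes towards the overall mean slope."""
    grand_mean = statistics.fmean(ols_slopes)
    # Between-patient variance = observed variance minus sampling variance.
    between_var = max(statistics.pvariance(ols_slopes) - sampling_var, 0.0)
    weight = between_var / (between_var + sampling_var)
    return [grand_mean + weight * (s - grand_mean) for s in ols_slopes]


def mean_squared_error(estimates, truth):
    return statistics.fmean((e - t) ** 2 for e, t in zip(estimates, truth))


true_slopes, ols_slopes, sampling_var = simulate_slopes()
shrunk_slopes = shrink(ols_slopes, sampling_var)
print("variance of true slopes:", statistics.pvariance(true_slopes))
print("variance of OLS slopes: ", statistics.pvariance(ols_slopes))
print("MSE of OLS slopes:      ", mean_squared_error(ols_slopes, true_slopes))
print("MSE of shrunken slopes: ", mean_squared_error(shrunk_slopes, true_slopes))
```

Under these assumptions the individual regression slopes are far more widely spread than the true slopes, because each is estimated from only five noisy tests, and the shrunken slopes lie closer to the truth on average. A full mixed model would additionally estimate the noise variance from the data and could include covariates in the same model.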
CKD is a heterogeneous disease, with various possible contributing causes and pathways of progression. Linear mixed models typically assume a common mean trajectory, but other methods are available if this assumption is too strong. While random slope models allow individual trajectories to vary around a common mean slope, more sophisticated models such as latent class mixed models allow modelling of trajectory groups, which may be linear or non-linear and correspond to sub-populations of patients.

Another challenge is the competing risk of mortality, and how to handle the initiation of RRT in analyses of repeated renal function tests, where such events are likely to be associated with rate of decline. An analysis that does not account for informative censoring may lead to biased results. Joint longitudinal survival models and competing risks survival models can be used to account for competing risks if data are available (this may require data linkage to external databases to obtain information on competing event dates).

A major finding of this review was the extreme variation in definitions of CKD progression used, and the clinical importance of each definition was unclear. More work has been done in the last decade to identify clinically important measures of progression of CKD. In 2012, the United States Food and Drug Administration (FDA) commissioned research to identify new endpoints of CKD progression for use in clinical trials [64, 65]. Definitions that showed strong association with the important clinical outcomes of progression to ESRD and all-cause mortality were developed using data from the Chronic Kidney Disease Prognosis Consortium (CKD-PC); these included reductions in eGFR between measures of 30% and 40% over approximately 2 years, stratified by baseline eGFR.
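
As a simple illustration of percent-decline endpoints of this kind, the Python sketch below computes the percent reduction in eGFR between a baseline and a follow-up measure and flags whether it reaches 30% or 40%. The function names, the use of inclusive thresholds and the example values are assumptions for illustration; the sketch does not enforce the approximately 2-year interval or the stratification by baseline eGFR described above.

```python
def percent_decline(baseline_egfr, followup_egfr):
    """Percent reduction in eGFR from baseline (positive = decline)."""
    if baseline_egfr <= 0:
        raise ValueError("baseline eGFR must be positive")
    return 100.0 * (baseline_egfr - followup_egfr) / baseline_egfr


def classify_decline(baseline_egfr, followup_egfr):
    """Flag declines reaching 30% and 40% (inclusive thresholds assumed)."""
    decline = percent_decline(baseline_egfr, followup_egfr)
    return {
        "decline_pct": decline,
        "decline_30": decline >= 30.0,
        "decline_40": decline >= 40.0,
    }


# Hypothetical patient: eGFR falls from 50 to 33 ml/min/1.73m2.
result = classify_decline(50.0, 33.0)
print(result)  # decline of 34%: meets the 30% threshold but not the 40% threshold
```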
Further research that aims to define new outcomes based on smaller clinically meaningful changes in renal function would be useful, as this may enable earlier identification of CKD progression in clinical practice, and future EHR studies could adopt such outcomes for research.

Strengths of this review include the large number of databases utilised and studies reviewed, and the detailed data extraction, allowing a comprehensive evaluation of how well data quality issues were handled and acknowledged. The review was however limited to peer-reviewed articles and those that clarified in their abstract that repeated renal function tests were used in analysis. Other limitations include the restriction to articles written in English, the exclusion of grey literature, and difficulty ascertaining whether EHRs were used as opposed to other methods of extraction from paper records. Despite this, the majority of data issues present will be the same regardless of whether electronic or paper records were used: retrospective studies using traditional paper records will suffer from the same problems as those using electronic health records, namely incomplete records, variation in logging practices, the need to address AKI when modelling CKD progression, loss to follow-up and competing risks.

Conclusions

Many studies using EHRs to study progression of CKD did not fully acknowledge the biases that result from the poor data quality inherent in EHRs, and reporting was poor. While some studies have defined CKD progression measures similar to those validated by the FDA in 2012 [64, 65], showing an understanding of the need to identify clinically important changes in renal function, recommendations following the systematic review by Boucquemont et al. in 2014 [10] have not been implemented on a broader scale. Observational studies using EHRs should follow the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) [66, 67] and REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) [68] guidelines, which aim to improve transparency and clarity in reporting of research. Research publications should clearly state the care pathway and intended follow up framework, the data completeness eligibility criteria, the percentage of the target population excluded based on those criteria, whether characteristics and important risk factors differed between those included and those excluded, and rates of loss to follow up. Where possible, researchers should attempt to ascertain reasons for loss to follow up, which may involve linkage to external data. Researchers should consider using existing validated outcomes of CKD progression, and we hope that heterogeneity in definitions of CKD progression will reduce over time. Focussing research questions on populations for which regular data collection is performed as part of routine care may offer a route to better quality data on changes in renal function over time. Important changes in renal function will also be easier to identify accurately in patients with reduced renal function at baseline, such as those with established CKD, where GFR-estimating equations perform better.
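
The reporting items recommended above can be derived from a handful of counts. The following Python sketch shows one way to compute the percentage of the target population analysed and excluded, the rate of loss to follow up, and a standardised difference in a baseline characteristic between included and excluded patients. The function names, the definition of the loss to follow up rate (as a share of analysed patients) and all numbers are hypothetical, and the standardised difference is only one of several possible ways to compare the two groups.

```python
import math
import statistics


def completeness_report(n_target, n_analysed, n_lost_to_follow_up):
    """Percentages describing sample completeness and loss to follow up."""
    if not 0 <= n_lost_to_follow_up <= n_analysed <= n_target or n_analysed == 0:
        raise ValueError("expected 0 <= lost <= analysed <= target and analysed > 0")
    pct_analysed = 100.0 * n_analysed / n_target
    return {
        "pct_analysed": pct_analysed,
        "pct_excluded": 100.0 - pct_analysed,
        # Assumption: loss to follow up expressed among analysed patients.
        "pct_lost_to_follow_up": 100.0 * n_lost_to_follow_up / n_analysed,
    }


def standardised_difference(included, excluded):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt(
        (statistics.variance(included) + statistics.variance(excluded)) / 2.0
    )
    return (statistics.fmean(included) - statistics.fmean(excluded)) / pooled_sd


# Hypothetical study: 10,000 eligible patients, 6,200 analysed, 1,550 lost.
report = completeness_report(10_000, 6_200, 1_550)
print(report)  # 62% analysed, 38% excluded, 25% lost to follow up

# Hypothetical ages (years) of included and excluded patients.
age_gap = standardised_difference([70, 72, 68, 74], [60, 62, 58, 64])
print(round(age_gap, 2))  # 3.87
```

A large standardised difference, as in this contrived example, would indicate that the analysed sample differs markedly from the excluded patients on that characteristic and may not represent the target population.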

PRISMA checklist. (DOC)

MEDLINE database search strategy. (DOCX)

List of reviewed studies. (DOCX)

Summary of study populations, where unclear if EHRs used. (DOCX)

Study methodology, where unclear if EHRs used. (DOCX)

Critique of handling of data quality and methodological challenges, where unclear if EHRs used. (DOCX)

Listing of key features of all included studies, sorted by year of publication. (DOCX)

Data extraction spreadsheet. (XLSX)

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

11 May 2021

PONE-D-21-03779
A systematic review of statistical methodology used to evaluate progression of chronic kidney disease using electronic healthcare records
PLOS ONE

Dear Dr. Cleary,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jun 25 2021 11:59PM.

Kind regards,
Stanislaw Stepkowski
Academic Editor
PLOS ONE

Additional Editor Comments: The authors need to address reviewers' comments.

Reviewers' comments:

1. Is the manuscript technically sound, and do the data support the conclusions? Reviewer #1: Yes. Reviewer #2: Yes.

2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes. Reviewer #2: Yes.

3. Have the authors made all data underlying the findings in their manuscript fully available? Reviewer #1: Yes. Reviewer #2: Yes.

4. Is the manuscript presented in an intelligible fashion and written in standard English? Reviewer #1: Yes. Reviewer #2: Yes.

5. Review Comments to the Author

Reviewer #1: The systematic analysis by Faye Cleary et al. is useful in terms of identifying the opportunities coming from increasing availability of electronic health records, as well as pointing to common mistakes and challenges associated with their intrinsic nature. In this regard, the authors did a good job summarizing the methodology across the 65 included studies. It would be nice to note that retrospective studies using traditional paper records will suffer from the same problems as those using electronic health records: incomplete records, variation in logging practices, addressing AKI when modeling CKD progression, loss to follow-up and competing risks. I don't have other concerns about this work.

Reviewer #2: A well written systematic review with proper design, presentation of result, and discussion. I have few comments:
-The search needs to be updated beyond 7th May 2020. Multiple studies were published after this date.
-Clarify the study methodology: Sample size (before and after exclusions for reasons of data completeness);
-The authors stated that “It is likely that if research is missing from review, then data quality issues in missing studies are likely to be of a similar quality or worse quality than studies reviewed”. Only studies reported in English language were included in this review. Therefore, it is unfair to label studies in non-English language as inferior in quality.
-What is the outcome for Cox proportional hazards regression? I assume time to event !!

6. Do you want your identity to be public for this peer review? Reviewer #1: Yes: Dulat Bekbolsynov. Reviewer #2: Yes: Sadik A. Khuder.
26 Oct 2021

I respond to specific reviewer comments individually below. I break this down point by point for reviewer 2 using the word 'RESPONSE' for each individual point separately.

Comments reviewer #1: The systematic analysis by Faye Cleary et al. is useful in terms of identifying the opportunities coming from increasing availability of electronic health records, as well as pointing to common mistakes and challenges associated with their intrinsic nature. In this regard, the authors did a good job summarizing the methodology across the 65 included studies. It would be nice to note that retrospective studies using traditional paper records will suffer from the same problems as those using electronic health records: incomplete records, variation in logging practices, addressing AKI when modeling CKD progression, loss to follow-up and competing risks. I don't have other concerns about this work.

RESPONSE: We are pleased that the reviewer sees the value and quality of our work. In response to reviewer suggestions, we have updated text in the discussion to note that traditional paper records will suffer from the same problems as those using electronic healthcare records.

Comments Reviewer #2: A well written systematic review with proper design, presentation of result, and discussion.

RESPONSE: We are pleased that the reviewer believes we have conducted a well-designed and presented review of the literature.

-The search needs to be updated beyond 7th May 2020. Multiple studies were published after this date.

RESPONSE: We have updated the search dates to include studies available in the 4 databases covered by the review as of August 2021, allowing us to capture more recently published studies.
-Clarify the study methodology: Sample size (before and after exclusions for reasons of data completeness);

RESPONSE: We have clarified in the methods text that data completeness inclusion criteria refer to the specific study inclusion criteria applied prior to main analyses being performed that aimed to restrict analyses to only those patients with sufficient data completeness to be deemed suitable for analysis, with such criteria expected to vary between studies. The explanation of the calculation for “percent of target population analysed” also shows readers how we used sample size data before and after data completeness inclusion criteria were applied to uncover the extent to which patients were excluded from analysis purely due to failure to meet a study’s data completeness requirements.

-The authors stated that “It is likely that if research is missing from review, then data quality issues in missing studies are likely to be of a similar quality or worse quality than studies reviewed”. Only studies reported in English language were included in this review. Therefore, it is unfair to label studies in non-English language as inferior in quality.

RESPONSE: Our comment that “With peer-reviewed literature expected to go through certain research quality checks, it is likely that if research is missing from review, then data quality issues in missing studies are likely to be of a similar quality or worse quality than studies reviewed” was intended to convey that studies missing from review due to not being peer-reviewed are likely to be of similar or worse quality as/than those peer-reviewed, due to the quality checks that peer-reviewed studies go through. It was not intended to say anything about studies published in non-English languages. We have clarified in the methods text that we anticipate that studies published in both English and non-English languages are likely to be of similar quality.

-What is the outcome for Cox proportional hazards regression? I assume time to event !!

RESPONSE: I’m not 100% sure where exactly in the manuscript the reviewer is referring to in this comment, but I imagine it may be results Tables 2 and 4. I would like to clarify what is reported and what is not. Due to the anticipated variation in how researchers define progression of kidney disease over time and the challenges this may pose in clinical interpretability of findings of research studies, a key aim of our review was to summarise how researchers measured changes in renal function over time. We also reported methods for analysis (which include as the reviewer states Cox proportional hazards regression models with such models using as outcome time to some event). In our reporting of study methodology (Table 2), we summarise “Measure of change in renal function over time”. As an example, an event of a 30% decline in eGFR between measures would be reported as “Raw percent change in eGFR between measures” (as this captures how changes over time were measured) and we do not specifically state whether this was analysed as time to event or as a binary outcome but we do report the method of analysis (“Statistical model used”), for example Cox proportional hazards regression. Table 4 further clarifies precise measures of changes in renal function over time for each individual study (e.g. percent loss in eGFR between measures >30%) alongside the methods used (e.g. Cox PH regression). Although we do not specifically state what the outcome is (e.g. time to percent loss in eGFR between measures >30%), this is inferred. We have added a comment to Table 4 to clarify this.

Submitted filename: Response to Reviewers.docx

7 Feb 2022

A systematic review of statistical methodology used to evaluate progression of chronic kidney disease using electronic healthcare records

PONE-D-21-03779R1

Dear Dr. Cleary,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible, no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Mabel Aoun, MD, MPH
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.
Reviewer #1: All comments have been addressed

2. Is the manuscript technically sound, and do the data support the conclusions?
Reviewer #1: Yes

3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes

4. Have the authors made all data underlying the findings in their manuscript fully available?
Reviewer #1: Yes

5. Is the manuscript presented in an intelligible fashion and written in standard English?
Reviewer #1: Yes

6. Review Comments to the Author
Reviewer #1: (No Response)

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: Yes: Dulat Bekbolsynov

9 Feb 2022

PONE-D-21-03779R1

A systematic review of statistical methodology used to evaluate progression of chronic kidney disease using electronic healthcare records

Dear Dr. Cleary:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Mabel Aoun
Academic Editor
PLOS ONE
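The “percent of target population analysed” measure mentioned in the response to Reviewer #2 can be illustrated with a minimal sketch. It assumes the measure is simply the sample size after data completeness inclusion criteria divided by the sample size before them; the function name and the example counts are hypothetical and are not taken from the review or from any reviewed study.

```python
def percent_of_target_analysed(n_before: int, n_after: int) -> float:
    """Share of the target population kept after data completeness criteria.

    Assumption (not confirmed by the source): the measure is
    100 * n_after / n_before. All names here are hypothetical.
    """
    if n_before <= 0:
        raise ValueError("n_before must be positive")
    if not 0 <= n_after <= n_before:
        raise ValueError("n_after must lie between 0 and n_before")
    return 100.0 * n_after / n_before


# Hypothetical study: 12,000 eligible patients, 8,100 with enough eGFR tests.
pct = percent_of_target_analysed(12_000, 8_100)
print(f"{pct:.1f}% of the target population analysed")

# The abstract highlights studies that analysed less than 75% of the target population.
below_75 = pct < 75.0
```

A study can only be assessed this way if it reports both counts, which is why the review asks for more explicit statements of data completeness.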
Participants
Include: Adults aged ≥18 with CKD stages 3–5; studies that involve both CKD and non-CKD patients are also included, e.g. diabetes.
Exclude: Patients who have initiated RRT (dialysis or transplant), even if data is collected for renal function prior to RRT initiation; patients with AKI (unless chronic changes are also studied); non-human subjects; children.

Intervention/Exposure
No restriction if CKD progression is measured as the outcome, rather than exposure. If CKD progression is analysed as an exposure, restrictions of this measure apply (see outcome definition).

Comparators/Control
No restriction.

Outcome
No restriction on outcomes if CKD progression is measured as an exposure, rather than outcome. If the outcome is a measure of CKD progression:
Include: Measures of chronic change in renal function based on multiple measures of eGFR or any other measure that may be used to infer eGFR (e.g. serum creatinine, cystatin-C, iohexol clearance), e.g. rate of change, change from baseline, regression slope, time to change or threshold eGFR.
Exclude: All other measures of renal function, e.g. proteinuria; studies of acute AKI or short term follow up (<6 months) of renal function following a procedure; single time-point analyses; time to RRT as single outcome.

Study design
Include: Retrospective analysis of routinely collected electronic healthcare records, which may include retrospective cohort studies, case-control studies and cross-sectional studies (if a measure of past progression is included).
Exclude: Case reports, clinical trials, prospective cohort studies or any other study design with pre-planned data collection strategy for research purposes.
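To make the eligible outcome measures concrete, the following minimal sketch computes three of them (change from baseline, regression slope, and time to a percent decline) for one patient with irregularly spaced eGFR tests. The data, function names and the 30% threshold are illustrative assumptions only; this is not the method of any reviewed study, and it deliberately ignores AKI episodes, measurement variability and informative censoring, which the review identifies as key challenges.

```python
from typing import List, Optional, Tuple

# (years since baseline, eGFR in mL/min/1.73m^2), sorted by time; hypothetical data
Measurement = Tuple[float, float]


def change_from_baseline(series: List[Measurement]) -> float:
    """Last eGFR minus first eGFR."""
    return series[-1][1] - series[0][1]


def regression_slope(series: List[Measurement]) -> float:
    """Ordinary least-squares slope of eGFR on time (eGFR units per year)."""
    n = len(series)
    mean_t = sum(t for t, _ in series) / n
    mean_y = sum(y for _, y in series) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in series)
    if sxx == 0:
        raise ValueError("need measurements at two or more distinct times")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in series)
    return sxy / sxx


def time_to_percent_decline(
    series: List[Measurement], threshold_pct: float = 30.0
) -> Optional[float]:
    """Time of the first test more than threshold_pct below baseline.

    Returns None when the threshold is never crossed, i.e. the patient
    would be censored in a time-to-event analysis such as Cox regression.
    """
    baseline = series[0][1]
    for t, y in series[1:]:
        if 100.0 * (baseline - y) / baseline > threshold_pct:
            return t
    return None


# Hypothetical patient with unevenly spaced tests.
patient = [(0.0, 60.0), (0.4, 57.0), (1.1, 50.0), (2.5, 40.0)]
delta = change_from_baseline(patient)       # -20.0
slope = regression_slope(patient)           # negative: eGFR is falling
event_time = time_to_percent_decline(patient)  # 2.5
```

The three measures answer different questions from the same records: the slope uses every test, change from baseline uses only two, and the time-to-event outcome depends on when tests happened to be taken, which is one way uneven testing can bias results.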