
Challenges and Pitfalls of Using Repeat Spirometry Recordings in Routine Primary Care Data to Measure FEV₁ Decline in a COPD Population.

Hannah R Whittaker¹, Steven J Kiddle², Jennifer K Quint¹.

Abstract

BACKGROUND: Electronic healthcare records (EHR) are increasingly used in epidemiological studies but are often viewed as lacking quality compared to randomised control trials and prospective cohorts. Studies of patients with chronic obstructive pulmonary disease (COPD) often use the rate of forced expiratory volume in 1 second (FEV1) decline as an outcome; however, its definition and robustness in EHR have not been investigated. We aimed to investigate how the rate of FEV1 decline differs by the criteria used in an EHR database.
METHODS: Clinical Practice Research Datalink and Hospital Episode Statistics were used. Patient populations were defined using 8 sets of criteria around repeated FEV1 measurements. At a minimum, patients had a diagnosis of COPD, were ≥35 years old, were current or ex-smokers, and had data recorded from 2004. FEV1 measurements recorded during follow-up were identified. Thereafter, eight populations were defined based on criteria around: i) the exclusion of patients or individual measurements with potential measurement error; ii) minimum number of FEV1 measurements; iii) minimum time interval between measurements; iv) specific timing of measurements; v) minimum follow-up time; and vi) the use of linked data. For each population, the rate of FEV1 decline was estimated using mixed linear regression.
RESULTS: For 7/8 patient populations, rates of FEV1 decline (age and sex adjusted) were similar and ranged from -18.7mL/year (95% CI -19.2 to -18.2) to -16.5mL/year (95% CI -17.3 to -15.7). Rates of FEV1 decline in populations that excluded patients with potential measurement error ranged from -79.4mL/year (95% CI -80.7 to -78.2) to -46.8mL/year (95% CI -47.6 to -46.0).
CONCLUSION: FEV1 decline remained similar in a COPD population regardless of number of FEV1 measurements, time intervals between measurements, follow-up period, exclusion of specific FEV1 measurements, and linkage to HES. However, exclusion of individuals with questionable data led to selection bias and faster rates of decline.


Keywords: COPD; electronic healthcare records; lung function; spirometry

Year: 2021 PMID: 34512071 PMCID: PMC8420778 DOI: 10.2147/POR.S319965

Source DB: PubMed Journal: Pragmat Obs Res ISSN: 1179-7266

Introduction

Electronic healthcare record (EHR) databases consist of data routinely collected as part of clinical care and are often used for healthcare research. Whilst EHR databases have many strengths, one concern is that data are not collected for the purpose of research: tests are not undertaken at routine intervals as they would be in a randomised control trial (RCT), nor is the reason for a test being undertaken at a specific point in time always known.1 EHR databases differ from RCTs and prospective cohort studies, which have structured data collection where data are collected at specific times for research purposes, by specific healthcare technicians, with specific equipment, following specific protocols. EHRs are increasingly used in epidemiological research; however, they are often viewed as lacking quality. In studies of people with chronic obstructive pulmonary disease (COPD), longitudinal spirometry measurements, such as forced expiratory volume in 1 second (FEV1), are often used to estimate lung function decline, a common marker of disease progression.2 Lung function decline is important as it is associated with quality of life, symptom burden, and mortality.3,4 Unlike in RCTs and cohort studies, the sporadic nature of lung function measurements recorded in EHR can lead to greater apparent variation in lung function in COPD patients. This could be due to measurement error, the number of measurements over follow-up, time intervals between measurements, follow-up time, and the reason or time at which lung function measures are recorded by healthcare practitioners. The Clinical Practice Research Datalink (CPRD) is a routinely collected EHR database of general practices in the UK and contains clinical patient information that is recorded at general practices during consultations. In this setting, spirometry measurements in COPD patients are performed by healthcare practitioners during visits to the general practice.
The quality and outcomes framework (QOF) for COPD incentivises healthcare practitioners in general practice to perform spirometry every 15 months in COPD patients.5 A previous validation study of spirometry recordings in CPRD found that more than 96% of recordings had adequate quality whereby a valid interpretation could be made by a respiratory physician.6 Despite this, it is possible that other factors, such as timings of measurements, time intervals, and follow-up time, could lead to differences in longitudinal changes in lung function decline. We aimed to investigate the rate of FEV1 decline and how variation in FEV1 differs by the criteria used to define the rate of FEV1 decline in CPRD, a routinely collected database. Specifically, we aimed to investigate how criteria around measurement error, the number of FEV1 measurements, timing of measurements, follow-up time, and use of additional linked databases may lead to differences in FEV1 decline estimates using CPRD. With the ever-growing use of EHR data for cohort studies and pragmatic trials, it is important to understand how robust EHR derived lung function decline is as an outcome.

Methods

Patient Eligibility Criteria

CPRD-GOLD (GOLD database of CPRD) was used and linked to a secondary care database, Hospital Episode Statistics (HES). CPRD contains clinical information on patients recorded at the general practices, such as diagnoses, prescriptions, consultation information, referrals, and tests performed (e.g., spirometry). HES contains information on secondary care processes for patients registered at general practices in England who are eligible for linkage with CPRD. In this study, COPD patients were identified using a validated definition of COPD in CPRD.7 COPD patients were identified if they had a clinical diagnosis of COPD, were over 35 years of age, and were current or ex-smokers. The start of follow-up (index date) was the first FEV1 measurement date after all of the following criteria were met: i) COPD diagnosis; ii) date of registration with current general practice; iii) date at which data recorded at a general practice were deemed to be of research quality (“up-to-standard”); iv) date at age 35; and v) the implementation of QOF on the 1st January 2004. End of follow-up was the first date of the following: i) death date; ii) date at which the patient transferred to a non-CPRD GP; or iii) the 31st December 2017. Figure 1 describes the study design used to create this base population.
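The follow-up rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function and argument names are hypothetical, a full date of birth is assumed to be available, and "after the criteria were met" is read as "on or after the latest of the five dates".

```python
from datetime import date
from typing import Optional, Sequence, Tuple

QOF_START = date(2004, 1, 1)     # QOF implementation date given in the text
STUDY_END = date(2017, 12, 31)   # administrative end of follow-up given in the text


def add_years(d: date, years: int) -> date:
    """Same calendar day `years` later; 29 Feb falls back to 28 Feb."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def follow_up_window(
    diagnosis: date,
    registration: date,
    up_to_standard: date,
    birth: date,                      # hypothetical field: full date of birth
    fev1_dates: Sequence[date],
    death: Optional[date] = None,
    transfer_out: Optional[date] = None,
) -> Optional[Tuple[date, date]]:
    """Return (index_date, end_date), or None if no FEV1 falls in the window."""
    # All five start criteria must be met, so take the latest of the dates.
    eligible_from = max(
        diagnosis, registration, up_to_standard, add_years(birth, 35), QOF_START
    )
    # Follow-up ends at the earliest of death, transfer out, or study end.
    end = min(d for d in (death, transfer_out, STUDY_END) if d is not None)
    candidates = [d for d in fev1_dates if eligible_from <= d <= end]
    if not candidates:
        return None
    return min(candidates), end
```

A patient with no FEV1 measurement inside the window returns `None` and would not enter the base population under this reading.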

Figure 1

Study design.

FEV1 Measurements

FEV1 measurements recorded between the index date and the end of follow-up were identified. FEV1 measurements were recorded in millilitres. Measurements recorded in litres were converted to mL, and measurements higher than 7 litres were excluded (94.8% of all measurements over follow-up were below 7 litres). A cut-off of 7 L was used based on the average total lung capacity for a healthy adult male. Whilst an FEV1 measurement of 7 L would be considered high in COPD patients, a distribution of FEV1 measurements was considered based on previous studies.8–10 A previous validation study of spirometry in CPRD-GOLD found that 96.5% of spirometry traces recorded at the general practice in COPD patients were of adequate quality, such that a respiratory clinician was able to make an interpretation.6 Of these, 27.9% were identified as post-bronchodilator FEV1, 7.2% were confirmed to be pre-bronchodilator FEV1, and for the remaining measurements it was unclear if the measurements were pre- or post-bronchodilator.
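As a rough sketch of the cleaning step described above, the snippet below converts readings to mL and drops those above the 7 L cut-off. The unit labels, the function name, and the handling of non-positive values and unknown units are assumptions made for illustration; the text does not say how units were identified in the raw records.

```python
from typing import Optional

MAX_FEV1_ML = 7000.0  # 7 L cut-off stated in the text


def normalise_fev1(value: float, unit: str) -> Optional[float]:
    """Return FEV1 in mL, or None if the reading should be excluded."""
    unit = unit.strip().lower()
    if unit in ("l", "litre", "litres"):
        ml = value * 1000.0
    elif unit in ("ml", "millilitre", "millilitres"):
        ml = float(value)
    else:
        return None  # assumption: unknown units are treated as unusable
    # Assumption: non-positive values are excluded alongside values above 7 L.
    if ml <= 0 or ml > MAX_FEV1_ML:
        return None
    return ml
```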

Patient Populations

In order to understand how the rate of FEV1 decline and its variation differ based on the criteria used to define longitudinal FEV1 decline, the following patient populations were created using specific criteria:
Population 1: patients with at least two FEV1 measurements at least six months apart (base population). A minimum period of six months was chosen in order to estimate medium-term lung function decline.
Population 2: patients in the base population, excluding those who had an FEV1 measurement more than 10%, 20%, or 30% greater than their previous and subsequent FEV1 measurements. These measurements were regarded as potential measurement error, and the patients were hence excluded.
Population 3: patients in the base population, excluding individual FEV1 measurements that were more than 10%, 20%, or 30% greater than the previous and subsequent FEV1 measurements. These measurements were regarded as potential measurement error and were hence excluded.
Population 4: patients in the base population, excluding measurements that were within one week of an exacerbation of COPD (AECOPD), because AECOPD events are associated with decreased FEV1 both before and after an AECOPD.11
Population 5: patients with at least three or four FEV1 measurements, rather than two, of which at least two were at least six months apart (with no other time constraint on the other measurements). Minimums of three and four measurements were chosen because these are common numbers of measurements in RCTs and cohort studies.
Population 6: patients with at least two FEV1 measurements with at least six-month or one-year time intervals between all measurements. This was chosen to mirror RCTs and cohort studies, in which spirometry measurements are recorded at regular intervals.
Population 7: patients in the base population with at least three years of follow-up, following the maximum length of previous RCTs on lung function decline in COPD patients.12
Population 8: patients in the base population and patients meeting the inclusion criteria but for whom HES linkage eligibility was not required.
Approximately 60% of CPRD-GOLD patients are eligible for HES linkage, which can restrict populations. Figure 2 illustrates how each patient population was identified using spirometry measurements.
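Two of these criteria lend themselves to a short illustration. The sketch below checks the base-population rule and flags potential measurement error under one reading of the text: a measurement is flagged when it is more than the threshold above both its previous and its subsequent measurement. The function names, the 183-day definition of six months, and the decision never to flag first or last measurements are assumptions, not details from the paper.

```python
from datetime import date
from typing import List, Sequence

SIX_MONTHS_DAYS = 183  # assumption: six months approximated as 183 days


def meets_base_criteria(dates: Sequence[date]) -> bool:
    """At least two measurements, with at least two of them >= 6 months apart."""
    return len(dates) >= 2 and (max(dates) - min(dates)).days >= SIX_MONTHS_DAYS


def flag_spikes(values_ml: Sequence[float], threshold: float = 0.10) -> List[int]:
    """Indices of measurements more than `threshold` above both neighbours.

    `values_ml` must be in chronological order. First and last measurements
    have only one neighbour and are never flagged (assumption).
    """
    flagged = []
    for i in range(1, len(values_ml) - 1):
        prev, cur, nxt = values_ml[i - 1], values_ml[i], values_ml[i + 1]
        if cur > prev * (1 + threshold) and cur > nxt * (1 + threshold):
            flagged.append(i)
    return flagged
```

Under this sketch, population two would drop any patient for whom `flag_spikes` returns a non-empty list, whereas population three would keep the patient and drop only the flagged measurements.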

Figure 2

Patient populations.

Statistical Analysis

Baseline patient characteristics were described for all patient populations using means (standard deviation [SD]), medians (interquartile range [IQR]), and proportions (%). Baseline characteristics included age, gender, the closest recorded smoking status (smokers or ex-smokers) to index date, the closest BMI (underweight, normal, overweight, obese using standard categories) and modified MRC dyspnoea (0–4) recorded three years prior or two years after index date, severity of obstruction using FEV1% predicted (FEV1>80% predicted, FEV1 50–80% predicted, FEV1 30–50% predicted, and FEV1<30% predicted calculated using patient’s first FEV1 measurement, height, and gender13), and AECOPD frequency and severity (none, 1 moderate and 0 severe, 2 moderate and 0 severe, ≥3 moderate and 0 severe, 1 severe and any moderate, and ≥2 severe and any moderate) in the year prior to index date. Moderate AECOPD were defined as GP treated AECOPD and severe AECOPD were defined as hospitalised AECOPD. In addition, the median number of FEV1 measurements (IQR and minimum/maximum number) and follow-up time were described for each population. Mixed linear regression was used to estimate the rate of FEV1 decline in mL/year. Random effects included both random intercepts and random slopes allowing the intercept and rate of decline to vary by patients. Models included a non-adjusted crude model, a minimally adjusted model adjusted for age and gender, a non-adjusted crude model for patients with complete baseline covariates, and a fully adjusted model adjusted for age, gender, smoking status, BMI, mMRC, FEV1% predicted, and AECOPD frequency. Within patient variation in FEV1 (mL) was estimated from mixed linear models. In addition, a ninth analysis population was used to describe the rate of FEV1 decline using linear regression rather than mixed linear regression to understand how clustering at the patient level influences the rate of FEV1 decline in the base population (population 1). 
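For readers less familiar with the mixed model described above, one conventional way to write a random-intercept, random-slope model is shown below. The notation is ours rather than the paper's, and the exact parameterisation the authors fitted is not stated, so this should be read as a generic sketch.

```latex
% i indexes patients, j indexes measurements within a patient,
% t_{ij} is time since index date in years, x_i are baseline covariates.
\begin{aligned}
\mathrm{FEV}_{1,ij} &= (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t_{ij}
                      + \boldsymbol{\gamma}^{\top}\mathbf{x}_i + \varepsilon_{ij}, \\
(b_{0i},\, b_{1i})^{\top} &\sim \mathcal{N}(\mathbf{0},\, \mathbf{G}), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0,\, \sigma^{2}).
\end{aligned}
```

In this form, \beta_1 is the mean rate of FEV1 decline in mL/year, b_{0i} and b_{1i} let each patient's intercept and slope deviate from the mean, and the crude model simply omits the covariate term. A plain linear regression corresponds to setting b_{0i} = b_{1i} = 0, which ignores the clustering of measurements within patients.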
Linear regression models included an unadjusted model, a minimally adjusted model, an unadjusted model for patients with complete baseline covariates, and a fully adjusted model adjusted for the same covariates as those used in the mixed linear regression model. Similarly, RCTs commonly use a baseline FEV1 measurement and a follow-up measurement to describe the change in FEV1 over time. This method was used to compare the rates of decline against those estimated using linear regression and mixed linear regression methods. This was calculated using patient’s first and last FEV1 over follow-up divided by the time between these two measurements (in years) to estimate rate of FEV1 decline in mL/year.
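The first-and-last calculation described above amounts to a simple difference quotient. A minimal sketch follows, with a hypothetical function name and the assumption that a year is 365.25 days:

```python
from datetime import date
from typing import Sequence, Tuple

DAYS_PER_YEAR = 365.25  # assumption: average year length


def first_last_slope(measurements: Sequence[Tuple[date, float]]) -> float:
    """Rate of FEV1 change in mL/year from the first and last measurement.

    `measurements` holds (date, FEV1 in mL) pairs in any order.
    Negative values indicate decline.
    """
    ordered = sorted(measurements, key=lambda m: m[0])
    (d0, v0), (d1, v1) = ordered[0], ordered[-1]
    years = (d1 - d0).days / DAYS_PER_YEAR
    if years <= 0:
        raise ValueError("need two measurements on different dates")
    return (v1 - v0) / years
```

Because only two points are used, any error in either measurement passes directly into the estimate, and intermediate measurements contribute nothing.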

Results

The numbers of patients included in each population varied because of the different criteria for repeated FEV1 measurements. Population eight included the greatest number of patients (N=125,682) as it did not require linked HES data and population two included the fewest number of patients because patients were excluded if they had an FEV1 that was greater than 10% of their previous and subsequent FEV1 (N=29,058). Table 1 summarises baseline characteristics for all populations. Populations were similar in terms of age, gender, smoking status, BMI, mMRC, and AECOPD frequency. However, population two included fewer severely obstructed patients and more mild COPD patients (i.e., FEV1>80% predicted).

Table 1

Baseline Characteristics for All Populations Defined Using Different Criteria for FEV1 Decline

Baseline Characteristics	Population 1 N=72,683	Population 2			Population 3			Population 4 N=70,887
Baseline Characteristics	Population 1 N=72,683	10% N=29,058	20% N=41,879	30% N=50,308	10% N=72,683	20% N=72,683	30% N=72,683	Population 4 N=70,887
Median number of FEV₁ measurements (IQR)	4 (3–7)	3 (2–4)	3 (2–5)	4 (2–5)	4 (3–7)	4 (3–7)	4 (3–7)	4 (3–6)
Age	66.7 (10.7)	67.3 (11.1)	67.2 (10.9)	67.1 (10.9)	66.7 (10.7)	66.7 (10.7)	66.7 (10.7)	66.7 (10.7)
Females	33,417 (46.0)	13,639 (46.9)	19,504 (46.6)	23,364 (46.4)	33,417 (46.0)	39,266 (54.0)	39,266 (54.0)	32,537 (45.9)
Current smokers	43,902 (60.4)	15,419 (53.1)	25,179 (60.1)	30,267 (60.2)	43,902 (60.4)	43,902 (60.4)	43,902 (60.4)	42,822 (60.4)
BMI
Underweight	2353 (3.2)	1004 (3.5)	1378 (3.3)	1617 (3.2)	2353 (3.2)	2353 (3.2)	2353 (3.2)	2258 (3.2)
Normal	20,445 (28.1)	8431 (29.0)	11,861 (28.3)	14,203 (28.2)	20,445 (28.1)	20,445 (28.1)	20,445 (28.1)	19,895 (28.1)
Overweight	20,017 (27.5)	7907 (27.2)	11,575 (27.6)	13,953 (27.7)	20,017 (27.5)	20,017 (27.5)	20,017 (27.5)	19,573 (27.6)
Obese	14,749 (20.3)	5683 (19.6)	8384 (20.0)	10,146 (20.2)	14,749 (20.3)	14,749 (20.3)	14,749 (20.3)	14,405 (20.3)
Missing	15,119 (20.8)	6033 (20.8)	8681 (20.7)	10,389 (20.7)	15,119 (20.8)	15,119 (20.8)	15,119 (20.8)	14,756 (20.8)
mMRC
0	8098 (11.1)	3813 (13.1)	5181 (12.4)	6040 (12.1)	8098 (11.1)	8098 (11.1)	8098 (11.1)	7990 (11.3)
1	15,887 (21.9)	6704 (23.1)	9505 (22.7)	11,281 (22.4)	15,887 (21.9)	15,887 (21.9)	15,887 (21.9)	15,579 (22.0)
2	9550 (13.1)	3756 (12.9)	5417 (12.9)	6478 (12.9)	9550 (13.1)	9550 (13.1)	9550 (13.1)	9335 (13.2)
3	4533 (6.2)	1853 (6.4)	2580 (6.2)	3301 (6.2)	4533 (6.2)	4533 (6.2)	4533 (6.2)	4350 (6.1)
4	787 (1.0)	367 (1.3)	483 (1.2)	564 (1.1)	787 (1.1)	787 (1.1)	787 (1.1)	739 (1.0)
Missing	33,828 (46.5)	12,565 (43.2)	18,713 (44.7)	22,844 (45.4)	33,828 (46.5)	33,828 (46.5)	33,828 (46.5)	32,894 (46.4)
Airflow obstruction
Mild	18,267 (25.1)	11,164 (38.4)	13,960 (33.3)	15,356 (30.5)	18,267 (25.1)	18,267 (25.1)	18,267 (25.1)	17,855 (25.2)
Moderate	33,452 (46.0)	12,468 (42.9)	19,146 (45.7)	23,465 (46.6)	33,452 (46.0)	33,452 (46.0)	33,452 (46.0)	32,794 (46.3)
Severe	16,522 (22.7)	4329 (14.9)	7157 (17.1)	9363 (18.6)	16,522 (22.7)	16,522 (22.7)	16,522 (22.7)	16,009 (22.6)
Very severe	3777 (5.2)	765 (2.6)	1214 (2.9)	1657 (3.3)	3777 (5.2)	3777 (5.2)	3777 (5.2)	3601 (5.1)
Missing	665 (0.9)	332 (1.1)	402 (1.0)	467 (0.9)	665 (0.9)	665 (0.9)	665 (0.9)	628 (0.9)
FEV₁% predicted (SD)	63.1 (21.9)	70.5 (21.9)	68.1 (21.5)	66.7 (21.5)	63.1 (21.9)	63.1 (21.9)	63.1 (21.9)	63.2 (21.8)
AECOPD
None	30,178 (41.5)	12,923 (44.5)	18,251 (43.6)	21,583 (42.9)	30,178 (41.5)	30,178 (41.5)	30,178 (41.5)	29,990 (42.3)
1 moderate, 0 severe	17,665 (24.3)	6948 (23.9)	10,140 (24.2)	12,232 (24.3)	17,665 (24.3)	17,665 (24.3)	17,665 (24.3)	17,268 (24.4)
2 moderate, 0 severe	9618 (13.2)	3655 (12.6)	5350 (12.8)	6482 (12.9)	9618 (13.2)	9618 (13.2)	9618 (13.2)	9324 (13.2)
3+ moderate, 0 severe	11,339 (15.6)	4103 (14.1)	6090 (14.5)	7509 (14.9)	11,339 (15.6)	11,339 (15.6)	11,339 (15.6)	10,705 (15.1)
1 severe, any moderate	3126 (4.3)	1122 (3.9)	1642 (3.9)	2011 (4.0)	3126 (4.3)	3126 (4.3)	3126 (4.3)	2945 (4.2)
2+ severe, any moderate	757 (1.0)	307 (1.1)	406 (1.0)	491 (1.0)	757 (1.0)	757 (1.0)	757 (1.0)	655 (0.9)
Maintenance therapy*	43,710 (60.1)	16,652 (57.3)	24,376 (58.2)	29,610 (58.9)	43,710 (60.1)	43,710 (60.1)	43,710 (60.1)	42,526 (60.0)
Baseline Characteristics	Population 5		Population 6		Population 7 N=59,185		Population 8 N=125,682
Baseline Characteristics	≥3 FEV1 N=58,121	≥4 FEV1 N=44,673	≥6 months N=72,683	≥1 year N=65,875	Population 7 N=59,185		Population 8 N=125,682
Median number of FEV₁ measurements (IQR)	5 (4–7)	6 (5–8)	4 (3–7)	4 (3–7)	5 (3–7)		4 (3–7)
Age	66.5 (10.5)	66.2 (10.2)	66.7 (10.7)	66.6 (10.7)	66.4 (10.6)		66.4 (10.7)
Females	26,540 (45.7)	20,225 (45.3)	33,417 (46.0)	30,494 (46.3)	27,541 (46.5)		58,504 (46.6)
Current smokers	34,791 (59.9)	26,383 (59.1)	43,902 (60.4)	39,813 (60.4)	35,496 (60.0)		77,716 (61.8)
BMI
Underweight	1780 (3.1)	1323 (3.0)	2353 (3.2)	2070 (3.1)	1800 (3.0)		5056 (4.0)
Normal	16,291 (28.0)	12,417 (27.8)	20,445 (28.1)	18,321 (27.8)	16,399 (27.7)		38,939 (31.0)
Overweight	16,138 (27.8)	12,602 (28.2)	20,017 (27.5)	18,225 (27.7)	16,451 (27.8)		39,091 (31.1)
Obese	11,894 (20.5)	9130 (20.4)	14,749 (20.3)	13,410 (20.4)	12,133 (20.5)		32,322 (25.7)
Missing	12,018 (20.7)	9201 (20.6)	15,119 (20.8)	13,849 (21.0)	12,402 (21.0)		10,274 (8.2)
mMRC
0	6227 (10.7)	4632 (10.4)	8098 (11.1)	7237 (11.0)	6133 (10.4)		13,958 (11.1)
1	12,582 (21.7)	9345 (20.9)	15,887 (21.9)	13,900 (21.1)	11,924 (20.2)		28,856 (23.0)
2	7442 (12.8)	5447 (12.2)	9550 (13.1)	8251 (12.5)	7026 (11.9)		17,264 (13.7)
3	3387 (5.8)	2404 (5.4)	4533 (6.2)	3811 (5.8)	3237 (5.5)		7943 (6.3)
4	508 (0.9)	310 (0.7)	787 (1.0)	654 (1.0)	517 (0.9)		1290 (1.0)
Missing	27,975 (48.1)	22,535 (50.4)	33,828 (46.5)	32,022 (48.6)	30,348 (51.3)		56,371 (44.9)
Airflow obstruction
Mild	13,822 (23.8)	10,035 (22.5)	18,267 (25.1)	16,479 (25.0)	14,661 (24.8)		30,695 (24.4)
Moderate	27,268 (46.9)	21,365 (47.8)	33,452 (46.0)	30,646 (46.5)	27,671 (46.8)		57,780 (46.0)
Severe	13,571 (23.4)	10,713 (24.0)	16,522 (22.7)	14,905 (22.6)	13,461 (22.7)		26,437 (21.0)
Very severe	3022 (5.2)	2291 (5.1)	3777 (5.2)	3284 (5.0)	2942 (5.0)		5680 (4.5)
Missing	438 (0.8)	269 (0.5)	665 (0.9)	561 (0.9)	450 (0.8)		5090 (4.1)
FEV₁% predicted (SD)	62.6 (21.6)	62.2 (21.6)	63.1 (21.9)	63.2 (21.8)	63.3 (21.7)		63.1 (21.9)
AECOPD**
None	23,594 (40.6)	17,859 (40.0)	30,178 (41.5)	27,529 (41.8)	24,284 (41.0)		53,447 (42.5)
1 moderate, 0 severe	14,375 (24.7)	11,181 (25.0)	17,665 (24.3)	16,165 (24.5)	14,631 (24.7)		31,431 (25.0)
2 moderate, 0 severe	7933 (13.7)	6227 (13.9)	9618 (13.2)	8781 (13.3)	8115 (13.7)		17,685 (14.2)
3+ moderate, 0 severe	9323 (16.0)	7296 (16.3)	11,339 (15.6)	10,246 (15.6)	9373 (15.8)		23,119 (18.4)
1 severe, any moderate	2390 (4.1)	1776 (4.0)	3126 (4.3)	2584 (3.9)	2303 (3.9)		n/a
2+ severe, any moderate	506 (0.9)	334 (0.8)	757 (1.0)	570 (0.9)	479 (0.8)		n/a
Maintenance therapy	35,195 (60.6)	27,309 (61.1)	43,710 (60.1)	39,540 (60.0)	35,712 (60.3)		76,235 (60.7)

Notes: Population 1) Patients with at least 2 FEV1 measurements at least 6 months apart with linked HES data; 2) Excluding patients with an FEV1 more than 10, 20, or 30% greater than the previous FEV1; 3) Excluding measurements that are more than 10, 20, or 30% greater than the previous FEV1 and subsequent FEV1; 4) Excluding measurements that are within 1 week of an AECOPD; 5) At least 3 (or 4) FEV1 measurements with at least 2 at least 6 months apart; 6) FEV1 measurements that are all at least 6 months or 1 year apart; 7) At least 3 years of follow-up with at least 2 FEV1 measurements at least 6 months apart; 8) At least 2 FEV1 measurements at least 6 months apart (regardless of linked HES data). *Maintenance therapy includes: LABA, LAMA, ICS, ICS/LABA, ICS/LABA/LAMA, LABA/LAMA. **AECOPD groups for cohort without HES linkage: none, 1 moderate, 2 moderate, 3+ moderate.

Most populations had a median of 4 FEV1 measurements during follow-up; however, population two (which excluded patients with an FEV1 more than 10% or 20% greater than the previous and subsequent FEV1 measurements) had fewer FEV1 measurements over follow-up, with a median of 3. On the other hand, population five, which included patients with at least 4 FEV1 measurements, had a median of 6 measurements over follow-up. In addition, for population one, the median FEV1 measurement over follow-up was 1.5 L (IQR 1.08–2.03).

Rate of FEV1 Decline

Minimally adjusted and fully adjusted mean rates of FEV1 decline in each population are shown in Figures 3 and 4. Estimates from crude analyses can be found in the supplementary material. Mean rates of FEV1 decline were similar in all patient populations except for population two (i.e., exclusion of patients with an FEV1 more than a) 10%, b) 20%, and c) 30% greater than the previous and subsequent FEV1 measurements). Minimally adjusted rates of FEV1 decline in populations one and three to eight ranged from −18.7mL/year (95% CI −19.2 to −18.2) to −16.5mL/year (95% CI −17.3 to −15.7). The mean rates of FEV1 decline for population two were −79.4mL/year (95% CI −80.7 to −78.2) excluding patients with an FEV1 more than 10% greater than their previous FEV1, −57.1mL/year (95% CI −58.0 to −56.2) excluding patients with an FEV1 more than 20% greater than their previous FEV1, and −46.8mL/year (95% CI −47.6 to −46.0) excluding patients with an FEV1 more than 30% greater than their previous FEV1.

Figure 3

Minimally adjusted rates of FEV1 decline.

Figure 4

Fully adjusted rates of FEV1 decline.

Fully adjusted rates of FEV1 decline in populations one and three to eight ranged from −14.6mL/year (95% CI −15.7 to −13.6) to −9.8mL/year (95% CI −11.5 to −8.1). Fully adjusted mean rates of FEV1 decline for population two were −94.9mL/year (95% CI −97.5 to −92.3) excluding patients with an FEV1 more than 10% greater than their previous FEV1, −64.3mL/year (95% CI −66.1 to −62.5) excluding patients with an FEV1 more than 20% greater than their previous FEV1, and −51.4mL/year (95% CI −53.0 to −49.8) excluding patients with an FEV1 more than 30% greater than their previous FEV1. It is important to note that minimally adjusted and fully adjusted models differed by patient numbers due to complete case analysis. The supplementary material provides further information on unadjusted models, minimally adjusted models, unadjusted models in patients with complete data, and fully adjusted models. Unadjusted models in patients with complete baseline covariates were similar to fully adjusted models. Fully adjusted complete case analyses included fewer patients due to missing BMI, mMRC, and height (used to calculate FEV1% predicted). Estimates using linear regression models were similar for unadjusted and minimally adjusted analyses; however, the rate of FEV1 decline was faster in fully adjusted analyses compared to rates estimated using mixed linear regression models. The rate of FEV1 decline was also assessed in the base population (population 1) using patients' first and last FEV1 measurements only. Overall, the mean unadjusted rate of FEV1 decline using this method was −16.4mL/year (SD 246.5). Within-patient variation of FEV1 in minimally adjusted mixed linear models ranged from 330.6 mL to 339.8 mL in populations one, four, five, seven, and eight. Within-patient variation in FEV1 was slightly higher in population six at 350.7 mL.
Population two had the lowest within-patient variation in FEV1 at 195.5 mL, 191.2 mL, and 202.5 mL after excluding patients with an FEV1 more than 10%, 20%, and 30% greater than the previous FEV1 measurement, respectively. See the supplementary material for further details on within-patient variation in all models and populations.

Discussion

This study set out to describe potential differences in patient characteristics, FEV1 variability, and rates of FEV1 decline between COPD populations defined using different requirements around FEV1 measurements in routinely collected data using CPRD. Specifically, we examined how different definitions around measurement error, number of measurements, time intervals between measurements, follow-up time, and secondary care data linkage can lead to the selection of potentially different study populations. Overall, we found that regardless of the number of FEV1 measurements, time intervals between measurements, follow-up time, and secondary care data linkage, patient demographics, within-patient FEV1 variability, and rates of FEV1 decline remained consistent. Similarly, results were consistent in populations that excluded individual FEV1 measurements that were likely due to measurement error. However, excluding patients (rather than individual measurements) with likely measurement error led to the exclusion of a specific group of COPD patients: severely obstructed patients (low FEV1% predicted). This could lead to selection bias in studies that aim to use a representative population of COPD patients. In addition, using mixed linear regression provided estimates that were different from those using linear regression, suggesting that clustering at the patient level is important when investigating the rate of FEV1 decline in routinely collected data. The mean annual rates of FEV1 decline described in this study for populations one and three to eight were similar to those reported in previous studies of COPD patients.4,8,9 Interestingly, population two (which excluded patients with potential measurement error) had faster rates of FEV1 decline. More patients with low FEV1% predicted were excluded from this population, which meant that it contained more patients with milder COPD (higher FEV1% predicted) than all other populations.
Previous studies have suggested that rates of FEV1 decline are faster in COPD patients with milder disease because they have more absolute lung function to lose at baseline than those with severe disease.14 This also suggests that patients with lower FEV1% predicted are more likely to have poorly recorded spirometry (potential measurement error). It is possible that patients with more severe disease, and lower FEV1% predicted, were more likely to perform poor spirometry, which might have contributed to the high variation seen in this group of patients. Whilst the best of three spirometry recordings should be recorded by healthcare practitioners, it is possible that only one spirometry is performed and recorded if patients are too severe to perform three in a row. It is also important to note that the increase in FEV1 by 10%, 20%, and 30% could be due to initiation of COPD medications, which can have an acute bronchodilation effect; however, studies have shown that this effect decreases over time in COPD patients on long-term treatment.15,16 Therefore, researchers should consider their research question prior to defining the rate of FEV1 decline. It is also important to note that fewer patients were included in models with patients who had complete baseline covariate data. In CPRD-GOLD this is due to the lack of consistent recording of BMI and mMRC. Fully adjusted analyses produced slower mean rates of FEV1 decline compared to crude or minimally adjusted rates of decline. However, in the population that excluded patients due to potential measurement error (population two), fully adjusted mean rates of FEV1 decline were faster than crude and minimally adjusted models. Patient populations that excluded patients with potential measurement error and who had complete baseline data included slightly milder patients than the same population regardless of complete baseline data (data not shown). 
On the other hand, in all other patient populations, those with complete baseline data were slightly more severe than the same populations regardless of complete baseline data. This is consistent with the theory that milder COPD patients might have faster rates of FEV1 decline due to initial lung function.14 Therefore, it is possible that missing baseline variables could influence the type of patients included in a study, and the rate of FEV1 decline. Lastly, simple linear regression provided estimates that were slightly faster than those using mixed linear regression. Previous studies have used linear regression to describe the rate of FEV1 decline.4,9,17 The limitation of this model is that within- and between-patient variation is not taken into consideration. If similar types of patients are included and FEV1 is not highly variable within or between patients, then linear regression can be sufficient. However, due to the nature of CPRD, a routinely collected EHR, a wide variety of phenotypes can exist and measurement error is possible; therefore, mixed linear regression should be used to take into account FEV1 variation. Previous studies (notably RCTs) have also estimated changes in FEV1 using two FEV1 measurements, one at baseline and one at follow-up.18,19 The nature of RCTs ensures that patients are similar in all arms of the trial and confounding is adjusted for by the study design. However, in CPRD and other EHRs, this method could be easily biased by measurement errors, changes in maintenance therapies during follow-up, AECOPD events during follow-up, and other everyday primary care factors. This method for estimating changes in FEV1 would require the two measurements to be accurate and not be recorded close to events that may influence FEV1, such as AECOPDs and changes in medication.
Overall, in order to use as much data as possible over follow-up, and to account for differences between patients in rate of decline and initial lung function, mixed linear regression should be used when studying FEV1 using EHR. One important limitation to acknowledge in this study is that whilst the majority (96.5%) of spirometry measurements recorded in CPRD-GOLD are of adequate quality, pre- and post-bronchodilator FEV1 cannot always be distinguished. However, in a previous validation study of spirometry measurements in CPRD-GOLD, of those that could be distinguished, the majority of measurements were post-bronchodilator spirometry.6 Overall, this study aimed to explore different definitions around longitudinal change in FEV1 in CPRD-GOLD and describe differences in populations created using these definitions. Previous studies have investigated how patient characteristics are associated with FEV1 decline; however, the purpose of this study was primarily to show how mean rates of FEV1 decline differed depending on the definition used.8–10,20,21 We found that in the cohorts that did not exclude measurements or individuals due to increases in FEV1, the mean rates of FEV1 decline and within-patient variation remained similar. For this reason, the definition used to create population one (i.e., using all available FEV1 measurements over follow-up) could be used to describe the rate of FEV1 decline in a COPD population using CPRD data. However, researchers should consider their research question prior to the selection of the definition for rate of FEV1 decline.

Conclusion

Overall, the quality of FEV1 in CPRD is adequate for the purpose of studying FEV1 decline. We found that regardless of potential measurement error, number of FEV1 measurements, time intervals between measurements, follow-up period, exclusion of specific FEV1 measurements, and linkage to secondary care data, FEV1 variability and rate of FEV1 decline remain similar in a COPD population. This suggests that CPRD is a good resource for investigating the rate of FEV1 decline in epidemiological studies and pragmatic trials. However, researchers using EHR should be aware of the difference in rate of FEV1 decline and patient characteristics when excluding individuals with questionable data.

21 in total

Review 1. Mortality in COPD: Role of comorbidities.

Authors: D D Sin; N R Anthonisen; J B Soriano; A G Agusti
Journal: Eur Respir J Date: 2006-12 Impact factor: 16.671

2. Relationship between exacerbation frequency and lung function decline in chronic obstructive pulmonary disease.

Authors: G C Donaldson; T A R Seemungal; A Bhowmik; J A Wedzicha
Journal: Thorax Date: 2002-10 Impact factor: 9.139

3. Withdrawal of inhaled glucocorticoids and exacerbations of COPD.

Authors: Helgo Magnussen; Bernd Disse; Roberto Rodriguez-Roisin; Anne Kirsten; Henrik Watz; Kay Tetzlaff; Lesley Towse; Helen Finnigan; Ronald Dahl; Marc Decramer; Pascal Chanez; Emiel F M Wouters; Peter M A Calverley
Journal: N Engl J Med Date: 2014-09-08 Impact factor: 91.245

4. Bronchodilator responsiveness in patients with COPD.

Authors: D P Tashkin; B Celli; M Decramer; D Liu; D Burkhart; C Cassino; S Kesten
Journal: Eur Respir J Date: 2008-02-06 Impact factor: 16.671

5. Changes in forced expiratory volume in 1 second over time in COPD.

Authors: Jørgen Vestbo; Lisa D Edwards; Paul D Scanlon; Julie C Yates; Alvar Agusti; Per Bakke; Peter M A Calverley; Bartolome Celli; Harvey O Coxson; Courtney Crim; David A Lomas; William MacNee; Bruce E Miller; Edwin K Silverman; Ruth Tal-Singer; Emiel Wouters; Stephen I Rennard
Journal: N Engl J Med Date: 2011-09-26 Impact factor: 91.245