Richard B Belzer, R Jeffrey Lewis.
Abstract
Conventional spirometry produces measurement error by using repeatability criteria (RC) to discard acceptable data and terminating tests early when RC are met. These practices also implicitly assume that there is no variation across maneuvers within each test. This has implications for air pollution regulations that rely on pulmonary function tests to determine adverse effects or set standards. We perform a Monte Carlo simulation of 20,902 tests of forced expiratory volume in 1 second (FEV1), each with eight maneuvers, for an individual with empirically obtained, plausibly normal pulmonary function. Default coefficients of variation for inter- and intratest variability (3% and 6%, respectively) are employed. Measurement error is defined as the difference between results from the conventional protocol and an unconstrained, eight-maneuver alternative. In the default model, average measurement error is shown to be ∼5%. The minimum difference necessary for statistical significance at p < 0.05 for a before/after comparison is shown to be 16%. Meanwhile, the U.S. Environmental Protection Agency has deemed single-digit percentage decrements in FEV1 sufficient to justify more stringent national ambient air quality standards. Sensitivity analysis reveals that results are insensitive to intertest variability but highly sensitive to intratest variability. Halving the latter to 3% reduces measurement error by 55%. Increasing it to 9% or 12% increases measurement error by 65% or 125%, respectively. Within-day FEV1 differences ≤5% among normal subjects are believed to be clinically insignificant. Therefore, many differences reported as statistically significant are likely to be artifactual. Reliable data are needed to estimate intratest variability for the general population, subpopulations of interest, and research samples. Sensitive subpopulations (e.g., chronic obstructive pulmonary disease or COPD patients, asthmatics, children) are likely to have higher intratest variability, making it more difficult to derive valid statistical inferences about differences observed after treatment or exposure.
Keywords: FEV1; information quality; intertest variability; intratest variability; measurement error
Year: 2019 PMID: 31158315 PMCID: PMC6851780 DOI: 10.1111/risa.13315
Source DB: PubMed Journal: Risk Anal ISSN: 0272-4332 Impact factor: 4.000
Exclusion Rates in NHANES 2009–2010 Pulmonary Function Testing
| Maneuver | Maneuvers Performed ($N_i$) | Maneuvers Accepted | Maneuvers Excluded | Implied Exclusion Rate |
|---|---|---|---|---|
| 1 | 6,845 | – | – | – |
| 2 | 7,169 | – | – | – |
| 3 | 7,198 | 2,163 | 5,035 | 70% |
| 4 | 5,035 | 1,848 | 3,187 | 63% |
| 5 | 3,187 | 1,136 | 2,051 | 64% |
| 6 | 2,051 | 687 | 1,364 | 67% |
| 7 | 1,364 | 394 | 970 | 71% |
| 8 | 970 | 968 | 2 | 0.2% |
| 9 | 2 | – | – | – |
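Maneuvers Excluded = Maneuvers Performed – Maneuvers Accepted.
Maneuvers Performed (next maneuver) = Maneuvers Excluded (current maneuver).
Implied Exclusion Rate = Maneuvers Excluded ÷ Maneuvers Performed.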
Logically, $N_1$ should always exceed $N_2$ and $N_2$ should always exceed $N_3$. NHANES (2008) provides no explanation for why $N_1 < N_2$ and $N_2 < N_3$.
Ninth maneuvers are not documented.
Source: NHANES (2008); NHANES uses the ATS (1995) guidelines (three minimum maneuvers; RC = 0.15 L/sec).
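The derived columns of this table can be checked with a few lines of arithmetic. The sketch below recomputes them from the Performed and Accepted columns for maneuvers 3 through 8, using the relationships the table implies (excluded = performed − accepted; rate = excluded ÷ performed). The function name `exclusion_table` is hypothetical and chosen only for illustration.

```python
# Maneuvers performed and accepted, copied from the table (maneuvers 3-8).
performed = {3: 7198, 4: 5035, 5: 3187, 6: 2051, 7: 1364, 8: 970}
accepted = {3: 2163, 4: 1848, 5: 1136, 6: 687, 7: 394, 8: 968}


def exclusion_table(performed, accepted):
    """Return (maneuver, excluded, rate) tuples recomputed from the two inputs."""
    rows = []
    for maneuver in sorted(performed):
        excluded = performed[maneuver] - accepted[maneuver]
        rate = excluded / performed[maneuver]
        rows.append((maneuver, excluded, rate))
    return rows


for maneuver, excluded, rate in exclusion_table(performed, accepted):
    # Excluded maneuvers should reappear as "performed" on the next row.
    carried = performed.get(maneuver + 1)
    note = "" if carried is None or carried == excluded else "  <- does not match next row"
    print(f"maneuver {maneuver}: excluded={excluded:,} rate={rate:.1%}{note}")
```

Because each excluded count is recomputed from the other two columns, any row whose printed values differ from the table points to a transcription or rounding discrepancy in that row.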
Proportion of Acceptable Results Excluded Due to Repeatability Criterion as a Function of Intratest Variability CV
| Intratest Variability CV | Proportion of Acceptable Results Excluded |
|---|---|
| 1% | 0.3% |
| 2% | 14% |
| 3% | 32% |
| 4% | 46% |
| 5% | 55% |
| 6% | 61% |
| 7% | 67% |
| 8% | 71% |
| 9% | 76% |
| 10% | 77% |
Default Simulation Parameters
| Parameters | Common to Both Models |
|---|---|
| Predicted max FEV1 (Brändli et al., | |
| Estimated | 0.51 |
| Tests | 20,902 |
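To make the simulation design concrete, here is a minimal sketch of one way such a Monte Carlo experiment could be set up. It is not the authors' code. It assumes normally distributed test-level and maneuver-level variation, a baseline FEV1 of 3.55 (the value given in the figure notes), an RC of 0.15 (the value cited for NHANES), termination as soon as the two largest values agree within RC after at least three maneuvers, and reporting of the largest value when RC is never met. The names `simulate_test` and `mean_measurement_error` are hypothetical.

```python
import random
import statistics


def simulate_test(rng, mu=3.55, cv_inter=0.03, cv_intra=0.06, rc=0.15,
                  max_maneuvers=8, min_maneuvers=3):
    """Return (conventional, unconstrained) FEV1 results for one simulated test."""
    # Assumption: the test-level mean varies around mu with the intertest CV.
    test_mean = rng.gauss(mu, cv_inter * mu)
    # Assumption: each maneuver varies around the test mean with the intratest CV.
    maneuvers = [rng.gauss(test_mean, cv_intra * test_mean)
                 for _ in range(max_maneuvers)]
    unconstrained = max(maneuvers)  # best of all eight maneuvers

    # Conventional protocol: stop once the two largest values agree within RC.
    for n in range(min_maneuvers, max_maneuvers + 1):
        best, second = sorted(maneuvers[:n], reverse=True)[:2]
        if best - second <= rc:
            return best, unconstrained
    # Assumption: if RC is never met, the largest of all maneuvers is reported.
    return unconstrained, unconstrained


def mean_measurement_error(n_tests=20902, seed=0, **params):
    """Mean of (unconstrained - conventional) over n_tests simulated tests."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n_tests):
        conventional, unconstrained = simulate_test(rng, **params)
        errors.append(unconstrained - conventional)
    return statistics.fmean(errors)


if __name__ == "__main__":
    for cv in (0.03, 0.06, 0.09, 0.12):
        print(f"intratest CV {cv:.0%}: mean error {mean_measurement_error(cv_intra=cv):.3f}")
```

Under these assumptions the error is never negative, because the unconstrained result is the maximum over a superset of the maneuvers the conventional protocol sees. The exact magnitudes depend on the distributional choices above and need not match the published figures.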
Figure 1. Percentage of simulated tests rejected for lack of ATS/ERS maneuver repeatability, by number of maneuvers performed.
Figure 2. Proportion of FEV1 tests without a pair of maneuvers satisfying ATS/ERS reproducibility criterion (RC) for three alternative intratest coefficients of variation (CV).
Note: FEV1 = 3.55 L/sec; intertest CV = 3%; intratest CV = 6%; RC = 0.1−0.2 L/sec.
Figure 3. Mean FEV1 measurement error resulting from test termination after three maneuvers compared to unconstrained maximum (L/sec and %).
Note: FEV1 = 3.55 L/sec; intertest CV = 3%; intratest CV = 6%; RC = 0.1−0.2 L/sec.
Figure 4. Minimum decline in FEV1 necessary for statistical significance at p ≤ 0.05 taking only intertest variability into account.
Figure 5. Minimum decline in FEV1 necessary for statistical significance at p ≤ 0.05 taking both intertest and intratest variability into account.
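As a rough illustration of how a threshold of this size can arise, the worked calculation below treats each test result as a single draw whose relative error combines independent intertest and intratest components, and applies a one-sided normal test to a before/after difference. The symbols $CV_{\text{inter}}$, $CV_{\text{intra}}$, and $\Delta_{\min}$, the independence assumption, and the one-sided critical value $z_{0.95} = 1.645$ are assumptions made here for illustration; the source does not state that the authors used this calculation.

```latex
% Relative standard deviation of one test result (independent components assumed)
\sigma_{\text{test}} = \sqrt{CV_{\text{inter}}^{2} + CV_{\text{intra}}^{2}}
                     = \sqrt{0.03^{2} + 0.06^{2}} \approx 0.067

% A before/after difference of two such results has standard deviation sqrt(2) * sigma_test
\Delta_{\min} = z_{0.95}\,\sqrt{2}\,\sigma_{\text{test}}
              \approx 1.645 \times 1.414 \times 0.067 \approx 0.156

% With intertest variability only (CV_intra = 0)
\Delta_{\min} \approx 1.645 \times 1.414 \times 0.03 \approx 0.070
```

Under these assumptions the default parameters give a minimum detectable decline of roughly 16% when both sources of variability are included, and about 7% when only intertest variability is counted.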
Mean FEV1 Measurement Error Under ATS/ERS Protocol After Three Maneuvers, by Repeatability Criterion (RC) and Intratest Coefficient of Variation (L/sec and %)
| Repeatability Criterion (RC) | Intratest CV 0% | Intratest CV 3% | Intratest CV 6% | Intratest CV 9% | Intratest CV 12% |
|---|---|---|---|---|---|
| 0.10 L/sec | 0.00 L/sec 0.0% | 0.08 L/sec | 0.18 L/sec 4.9% | 0.30 L/sec 8.1% | 0.41 L/sec 11% |
| 0.15 L/sec | 0.00 L/sec 0.0% | 0.08 L/sec 2.1% | 0.19 L/sec 5.1% | 0.31 L/sec 8.4% | 0.42 L/sec 11% |
| 0.20 L/sec | 0.00 L/sec 0.0% | 0.07 L/sec 2.0% | 0.17 L/sec 4.7% | 0.28 L/sec 7.6% | 0.40 L/sec 11% |
Notes: Default subject characteristics from Table III. Intertest coefficient of variation = 3%. L/sec values reported to ± 0.005 L/sec. Percentage values reported to two significant figures. Interquartile range: 25th−75th percentile of simulated distribution.
Because 0% yields an undefined result, 0.01% is used to approximate the 0% value implicitly assumed in the ATS/ERS protocol and published studies.