Julia O Udesky, Robin E Dodson, Laura J Perovich, Ruthann A Rudel.
Abstract
BACKGROUND: Environmental health and exposure researchers can improve the quality and interpretation of their chemical measurement data, avoid spurious results, and improve analytical protocols for new chemicals by closely examining lab and field quality control (QC) data. Reporting QC data along with chemical measurements in biological and environmental samples allows readers to evaluate data quality and appropriate uses of the data (e.g., for comparison to other exposure studies, association with health outcomes, use in regulatory decision-making). However, many studies do not adequately describe or interpret QC assessments in publications, leaving readers uncertain about the level of confidence in the reported data. One potential barrier to both QC implementation and reporting is that guidance on how to integrate and interpret QC assessments is often fragmented and difficult to find, with no centralized repository or summary. In addition, existing documents are typically written for regulatory scientists rather than environmental health researchers, who may have little or no experience in analytical chemistry.
Keywords: Data validation; Environmental chemicals; Environmental epidemiology; Environmental monitoring; Exposure measurement; Exposure science; Measurement error; Quality assurance/quality control (QA/QC)
Year: 2019 PMID: 31752881 PMCID: PMC6868687 DOI: 10.1186/s12940-019-0537-8
Source DB: PubMed Journal: Environ Health ISSN: 1476-069X Impact factor: 5.984
Summary of QC sample types, interpretation, and possible actions
| QA/QC Concept | Measure | Interpretation | Possible Actions |
|---|---|---|---|
| Accuracy | Lab control sample recoveries and/or matrix spike recoveries; certified reference material; isotope dilution quantification | Measure of whether the analytical method produces accurate quantification for each compound. Matrix spike recovery evaluates matrix effects on accuracy, such as interferences. Isotope dilution is the most rigorous approach to generating accurate measurements in biomonitoring. | • Drop compounds with inaccurate quantification from the data analysis; discuss with the lab whether improvements can be made for future analyses. • If problems are modest and batch-specific, include batch as a covariate in the regression model. |
| Extraction efficiency | Surrogate spike recovery in each sample | Measure – for each field sample – of whether the chemical is extracted completely from the sample matrix (e.g., blood, dust). Isotope dilution approaches capture and correct for differences in extraction efficiency. | • Consider dropping samples with poor surrogate recovery from data analysis. • Consider applying a surrogate correction factor (1/fraction recovery) if the recovery is consistent (± 15–20% standard deviation). |
| Laboratory detection limit | Common terms include: instrument detection limit (IDL), detection limit (DL), method detection limit (MDL), limit of detection (LOD) | Level above which the lab can detect with confidence that the analyte is present in the sample. | • See Method Reporting Limit. |
| Laboratory quantitation limit | Common terms include: practical quantitation limit (PQL), limit of quantitation (LOQ), laboratory quantitation level (LQL), contract-required quantitation limit (CRQL) | Level above which the lab can quantify with confidence the amount of chemical in the sample. | • See Method Reporting Limit. |
| Method reporting limit (MRL) | Levels detected in blanks (lab-blind field blanks, solvent blanks, matrix blanks, storage blanks, other types) | Level above which the researcher is confident that the reported chemical measurement reflects a signal from the media sampled, considering all sources of measurement error, especially potential contamination during sample collection and handling as well as in the laboratory. | • Determine the MRL by comparing the lab limit (quantitation limit or, if not reported, detection limit) to the levels in the blanks for each compound. • Qualify reported values below the MRL as "estimated". |
| Contamination | Levels detected in blanks (lab-blind field blanks, solvent blanks, matrix blanks, storage blanks, other types) | Measure of confidence in the accuracy of values reported above the MRL. | • If there is evidence of contamination, consider dropping a compound, or dropping results for a compound in a particular batch. • Identify the source of contamination (e.g., lab vs. field equipment) to inform future work. • For compounds with consistent contamination in blanks, researchers may correct field sample quantities by subtracting the amount attributed to contamination. This is most important when contamination is significant relative to sample values (e.g., > 10%) and for comparisons with external data. |
| Precision | Relative percent difference (RPD) for side-by-side duplicate samples (lab-blind) or split samples (lab-blind if possible) | A measure of reproducibility of field measurements, including analytical variability and sampling variability. | • Flag compounds with > 30% RPD. • Consider precision in combination with other QA/QC when deciding whether to qualify results. |
Get acquainted with your data
○ Check CAS number, chemical name, and isomer type of reported analytes against the analytical standards purchased by the lab (see Additional file)
○ "Real" samples – compare these to the chain of custody that lists the samples submitted for analysis to make sure all submitted samples were analyzed
○ Lab control and/or matrix spike recoveries
○ Reference samples (e.g., CRMs)
○ Surrogate spike recoveries
○ Blanks (solvent method, field, matrix, other)
○ Duplicates
Fig. 1 Visualizing urine sample results by analytical batch (data not yet published) revealed that sample-specific detection limits in later batches were higher and in the range of sample results in previous batches. After discussing with the laboratory, samples in later batches were re-analyzed to achieve lower detection limits
Spiked samples and certified reference material
Approach (see Additional file):
□ We typically apply an acceptable range of 50–150% recovery for most environmental samples, particularly when we are analyzing new chemicals or combinations of chemicals for which methods are not well established. For well-established methods, a more conservative range – 80–120% recovery – is appropriate.
□ If recoveries for a particular chemical or chemicals are consistently out of range (> 150% or < 50%) across multiple batches, this should be discussed with the laboratory analyst.
○ If the laboratory analyst agrees that the method was not successful, we drop the chemical(s) from our dataset. We do not report values or include such chemicals in any data analyses.
○ If the laboratory analyst can explain the reason for consistently high or low recoveries and has confidence in the ranking and relative values of the reported sample data, the reported values can be used for many data analyses, but it will be difficult to compare them with levels from another study.
□ If recoveries from one or a few batches are out of range, we are concerned that results in those batches might be over- or under-estimated compared to the rest. One way to investigate this concern is to look for corresponding systematic differences in the sample data (see Additional file).
○ If field samples have been randomized into batches, we check whether the variation in sample results correlates with spiked sample or CRM recoveries by batch. Note: we still go through this step even if we were not able to randomize field samples, but in this case it can be very challenging to distinguish systematic analytical variation from other possible sources of variation in sample results between batches (e.g., if samples in different batches were also collected during different seasons).
▪ If there are systematic differences (e.g., the sample results for a chemical are higher in the batch where the spike or CRM recovery was high, or, if there is only one batch, the sample results for a chemical with high spike or CRM recovery are much higher than previously reported levels), we consider dropping the chemical results from the affected batches. If an identical/split reference sample was analyzed in each batch, these results can also help resolve whether and how to use the data in this case.
▪ If there are no obvious systematic differences, we keep the chemical in our dataset but flag the results for that chemical in the batch with the out-of-range spiked sample or CRM recovery.
Reporting:
□ We note in summary statistics when the average spiked sample or CRM recovery for a particular chemical was out of range.
□ We note whether levels in our study might be systematically over- or under-reported (i.e., because of consistently high or low spiked sample or CRM recoveries). We especially note this when comparing to levels from another study.
□ For chemicals with low/high recoveries in certain batches, we may perform sensitivity analyses – for example, by including lab batch as a covariate in regression analyses – though this can be challenging for small datasets.
Fig. 2 (a) Results for flame retardant HBCD measured in air samples collected in 105 homes. All but three samples were non-detects (open circles). Samples were analyzed in six different analytical batches. (b) Summary of laboratory control spike recovery data for HBCD across the six analytical batches shows very poor accuracy and indicates no confidence for this analyte in the indoor air samples
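The recovery screen above (flag chemicals whose lab control or matrix spike recoveries fall outside an acceptable range, per batch) can be sketched in Python. This is a minimal illustration, not the authors' code; the function name and the `(chemical, batch)` data layout are our assumptions.

```python
def flag_recoveries(recoveries, low=50.0, high=150.0):
    """Flag percent recoveries outside an acceptable range.

    `recoveries` maps (chemical, batch) to percent recovery. The default
    50-150% range suits less-established methods; pass low=80, high=120
    for well-established ones, as described in the text.
    """
    flags = {}
    for key, pct in recoveries.items():
        if pct < low:
            flags[key] = "low"
        elif pct > high:
            flags[key] = "high"
        else:
            flags[key] = "ok"
    return flags
```

A chemical flagged "low" or "high" in every batch would be a candidate for discussion with the lab analyst and possibly for dropping; a single out-of-range batch calls for the batch-level checks described above.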
Surrogates
Approach (see Additional file):
□ We typically apply an acceptable range of 50–150% recovery for most environmental samples, particularly when we are analyzing new chemicals or combinations of chemicals for which methods are not well established. For well-established methods, a more conservative range – 80–120% recovery – would be appropriate.
□ If surrogate recoveries in individual field samples are out of range, discuss with the lab analyst and consider dropping those samples.
□ If surrogate recoveries in the QC samples are out of range, this suggests a larger problem with the analytical method rather than with particular samples. Summarize the surrogate recoveries in the QC samples, as well as the lab control or matrix spike recoveries for the associated chemicals, and discuss with the lab analyst.
□ Note any trends (upward or downward) in the distribution of surrogate recoveries across batches. Such trends should be discussed with the laboratory analyst, even if all recoveries are within the 50–150% acceptable range (see Fig. 3).
□ If the surrogate is a deuterated version of one of the target chemicals, it can be helpful to compare a plot of the surrogate recoveries by batch with the sample data for the corresponding un-deuterated target chemical by batch. We would be concerned – and would seek guidance from the lab analyst – if we saw a trend in the target chemical results that matched the trend in the surrogate recoveries.
□ Note if many surrogate recoveries (e.g., more than half) are out of range in a particular lab batch. If so, flag the results in that batch for the chemical(s) represented by that surrogate.
□ Note whether samples with high surrogate recoveries consistently have the highest results for the associated chemical(s).
○ If yes, we would be concerned that samples with high recoveries are all overestimated. Discuss with the lab analyst. Consider applying a surrogate correction factor to sample results (multiplying by 1/fraction recovery).
□ Note whether samples with low surrogate recoveries were consistently non-detects or very low-level detects for the associated chemical(s).
○ If yes, we would be concerned that samples with low recoveries are all underestimated. Discuss with the lab analyst. Note in publications that levels and detection frequencies for associated chemicals might be underestimated.
Reporting:
□ In summary statistics, we note whether any maximum value comes from a sample with a high (> 150%) surrogate recovery and, in that case, that the maximum might be overestimated. Similarly, if 100% of samples are detects, we flag the minimum value if it comes from a sample with a low (< 50%) recovery and note that the minimum might be underestimated.
□ For any statistical analyses, if the dataset is large enough, we run sensitivity analyses:
○ Excluding samples with out-of-range surrogate recoveries.
○ Controlling for lab batch, if surrogates were problematic for a particular batch.
Fig. 3 In this example from our data, recoveries of surrogate d4-di-n-butyl-phthalate from air samples showed notable upward and downward trends over time, despite largely staying within the 50–150% acceptable bounds. Here we were examining surrogate recoveries in batches of samples from different studies analyzed at the same laboratory. The last two batches (Sept 2014 and May 2015) were from the same study but collected approximately a year apart per the study design
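The surrogate correction described above (multiply by 1/fraction recovery, applied only when recoveries are consistent) can be sketched as follows. This is an illustrative fragment under our own naming; the ~15–20% standard-deviation consistency threshold comes from the summary table, and the helper names are not from the paper.

```python
from statistics import stdev

def surrogate_corrected(result, percent_recovery):
    """Apply the surrogate correction factor 1/(fraction recovered).

    For example, a result of 8.0 with 80% surrogate recovery becomes 10.0.
    """
    fraction = percent_recovery / 100.0
    if fraction <= 0:
        raise ValueError("recovery must be positive")
    return result / fraction

def recovery_is_consistent(recoveries, max_sd=20.0):
    """Check whether percent recoveries are consistent enough (standard
    deviation within roughly 15-20 percentage points) to justify correction."""
    return stdev(recoveries) <= max_sd
```

Correction is only defensible when `recovery_is_consistent` holds; erratic recoveries argue for flagging or dropping samples instead.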
Get acquainted with blanks
Approach (see Additional file):
For chemicals detected in blanks:
□ Consider whether blank detects are consistent across batches. Note whether detects occur mostly in one type of blank, which could indicate a source of contamination in the lab or field.
○ If a particular source is suspected, we investigate (talk to the lab, look at field logs, etc.).
□ Note whether blank levels are in the range of the samples.
□ If field samples have been randomized into batches, check whether variation in sample results correlates with blank results by batch. Note: we still go through this step even when we were not able to randomize field samples, but it is more challenging to distinguish whether contamination is driving differences in sample results in a particular batch or whether other explanations are more likely (e.g., all samples in one batch were collected in a different season or from a particular study site).
Fig. 4 Phthalate DEHP measured in air in college dorm rooms before and after occupancy (data not yet published). Levels in our samples (purple dots) were higher post- compared to pre-occupancy, but this plot revealed that levels in field blanks (blue dots) were also higher post- compared to pre-occupancy and within the range of field samples. We also saw a matrix blank (green dot) well within the range of the field samples in the pre-occupancy batch. These data suggest DEHP contamination in both batches; for the post-occupancy batch, we hypothesized this might have come from the plastic bags in which the samplers were shipped. We will not report results for this chemical from this study, given the evidence of contamination. LLOQ = Lower Limit of Quantitation
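The "are blanks in the range of the samples?" check above is easy to automate as a first-pass screen, as in the Fig. 4 DEHP example where blanks fell well within the field-sample range. A minimal sketch (our own helper, not from the paper):

```python
def blanks_in_sample_range(blank_values, sample_values):
    """Return the fraction of blank results that fall within the range of
    the field-sample results -- a quick screen for contamination that
    could compromise the sample data."""
    lo, hi = min(sample_values), max(sample_values)
    hits = sum(1 for b in blank_values if lo <= b <= hi)
    return hits / len(blank_values)
```

A large fraction suggests the blanks are "in the noise" of the samples and motivates the MRL and blank-correction steps in the next sections; it does not by itself identify the contamination source.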
Consider Raising Method Reporting Limits (MRLs)
Approach (see Additional file):
1A. For chemicals not detected in blanks, the MRL is equal to the laboratory reporting limit.
1B. For each chemical detected in blanks:
□ Compare the lab's reporting limit to the 90th percentile of the field blanks (computed with non-detects set to ½ the lab's reporting limit). The higher value is the new MRL.
○ However, if we observe many detects in other types of blanks (e.g., matrix, solvent), we consider determining the MRL by comparing the lab's reporting limit to the 90th percentile of ALL blanks (computed with non-detects set to ½ the lab's reporting limit). The higher value is the new MRL.
○ It can be helpful here to plot the sample data with different possible MRLs to understand precisely what is achieved by raising the MRL (i.e., are we successfully flagging data we are not confident in while leaving data we are confident in unqualified?). See Additional file.
*Note*: we use the 90th percentile of the blanks rather than the maximum or the mean because the 90th percentile is less sensitive to extreme values and can be estimated for data that are not normally distributed. However, if the overall study is small (e.g., in our practice, when we have < 5 blanks), we set the MRL equal to the maximum blank mass.
1C. For chemicals whose blank contamination is concentrated in particular batches:
□ If there is just one extremely problematic batch, consider dropping the sample data from that batch.
□ If multiple field blanks were run in each batch, consider determining the MRL as above but on a batch-specific basis.
○ In this case, how to proceed is very much a judgment call. Spend time with the data considering various approaches.
□ Data from reference material and duplicate samples can help in deciding which data points should be qualified because they are "in the noise."
2. After determining the MRL, we assign data qualifier flags:
□ 0 flag = measurement reported by the lab as "non-detect"
□ 0.5 flag = measurement falls below the MRL; these are considered "estimated detects"
□ 1 flag = measurement falls above the MRL; these are considered "true detects"
Note that our data qualifier flags may differ from those used by others. For example, NHANES flags non-detects with a "1" and detects with a "0."
3.
□ If the MRL is determined on a mass basis but sample results are normalized by some factor, such as sample volume, we compute a sample-specific concentration-based MRL by dividing the mass-based MRL by the sample volume.
Reporting:
□ We do not count estimated values (0.5 flags) as detects when reporting % > MRL. We do not use estimated detects to calculate summary statistics such as percentiles (see Table shell S2 in Additional file).
□ In summary statistics, we identify any chemicals with more than 50% estimated detects and add a footnote: "Imprecise quantification for more than 50% of detected values".
□ Graphical presentations should distinguish estimated from true detects (e.g., by plotting them as different shapes; see Fig. 5).
□ For reporting in tables, we use the median sample volume across samples to convert the mass-based MRL to a single concentration-based MRL for each chemical, if applicable.
□ There are different approaches for incorporating estimated (0.5-flagged) values in statistical analyses, including performing analyses weighted by estimates of measurement precision below the MRL, or using censored regression methods.
Fig. 5 Example of graphical presentation distinguishing true, estimated, and non-detects. MRL = Method Reporting Limit
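Steps 1A–2 above (MRL = the higher of the lab reporting limit and the 90th percentile of blanks, falling back to the maximum blank when there are fewer than 5, then 0/0.5/1 flags) can be sketched in Python. This is an illustrative reading of the procedure, not the authors' code; the linear-interpolation percentile and all function names are our assumptions.

```python
def percentile(values, q):
    """Percentile by linear interpolation between sorted order statistics."""
    s = sorted(values)
    if len(s) == 1:
        return s[0]
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def method_reporting_limit(lab_limit, blanks, detected):
    """Compute the MRL for one chemical.

    `blanks` holds blank measurements; `detected` is a parallel list of
    booleans. Non-detect blanks are substituted at half the lab reporting
    limit. With fewer than 5 blanks, the maximum blank value stands in
    for the 90th percentile, per the note in the text.
    """
    values = [b if d else lab_limit / 2.0 for b, d in zip(blanks, detected)]
    blank_stat = max(values) if len(values) < 5 else percentile(values, 90)
    return max(lab_limit, blank_stat)

def qualify(result, nondetect, mrl):
    """Data qualifier flag: 0 non-detect, 0.5 estimated detect, 1 true detect."""
    if nondetect:
        return 0.0
    return 0.5 if result < mrl else 1.0
```

For volume-normalized results (step 3), the same mass-based MRL would simply be divided by each sample's volume before calling `qualify`.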
Blank correction
Approach (see Additional file):
1.
□ If detects are spread across all types of blanks (e.g., field, solvent method, matrix), we use all blanks for blank correction. Otherwise we use field blanks. We try to keep our blank correction approach consistent with our MRL approach.
2.
□ If > 5 blanks:
○ For each chemical, we use a one-sided one-sample sign test (a special case of the binomial test with p = 0.5) on the blank results.
▪ We blank-correct chemicals with a statistically significant sign test result.
▪ However, if the number of blanks is relatively small (10 or fewer), we consider blank correction even when the sign test does not produce a significant result. The sign test does not take into account the magnitude of the levels detected in the blanks, nor does it distinguish different types of blanks (i.e., field and lab).
• For example, if we have 3 field blanks and 4 lab blanks, and we see consistent levels detected across all field blanks and all but one lab blank, we would consider blank correcting even though the sign test would not reach significance.
□ If ≤ 5 blanks (i.e., for a small dataset):
○ With five or fewer blanks, the sign test will never be significant. In this case, we blank-correct chemicals with 100% detects in blanks.
3.
□ Calculate the median value of the blanks, with non-detects set to ½ the lab's reporting limit and using all values (i.e., estimated and true detects).
○ It is useful to pause here and assess the value being used for blank correction. Is it based on an estimated value below the MRL? What will be the percent change in the median, comparing the original to the blank-corrected data?
□ Subtract the median blank value from all sample results.
□ Subtract the median blank value from the MRL (determined as above).
Reporting:
□ We are explicit about whatever procedure we use to decide whether or not to perform blank correction (sign test or other) and about the statistic (e.g., median, mean) and amount used for correction.
□ Any presentation of measurements (e.g., summary statistics) should use blank-corrected values, because they may be compared with measurements in other studies.
□ For statistical analyses such as regression and correlation performed within the dataset, non-blank-corrected data can be used.
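The two computational pieces above — a sign-test p-value and subtraction of the median blank — can be sketched as follows. The exact null hypothesis and significance threshold in the authors' workflow are not fully specified here, so the binomial tail on the count of blank detects is our reading; function names are illustrative.

```python
from math import comb

def sign_test_p(n_detects, n_blanks):
    """One-sided p-value: probability of at least `n_detects` detects in
    `n_blanks` blanks under a binomial null with p = 0.5 (sign test)."""
    tail = sum(comb(n_blanks, k) for k in range(n_detects, n_blanks + 1))
    return tail / 2 ** n_blanks

def blank_correct(samples, blank_values):
    """Subtract the median blank from every sample result.

    `blank_values` should already have non-detects substituted at half
    the lab reporting limit, as step 3 above specifies.
    """
    v = sorted(blank_values)
    n = len(v)
    median = v[n // 2] if n % 2 else (v[n // 2 - 1] + v[n // 2]) / 2.0
    return [s - median for s in samples]
```

The same median would also be subtracted from the MRL so that flags remain consistent with the corrected values; corrected results can go slightly negative for samples near the blank level, which is another reason to reserve blank-corrected data for reporting rather than within-dataset regression.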
Duplicates
Approach (see Additional file):
1.
□ Compute and summarize the average relative percent difference (RPD) for duplicate pairs or, if ≥ 3 side-by-side samples, compute the relative standard deviation (RSD):
○ If sample results have been normalized (e.g., mass converted to concentration), compute precision with the normalized values.
○ Compute precision using all duplicate pairs, including pairs with estimated detects.
○ Also consider precision restricted to pairs where both samples are flagged as "true detects" above the MRL.
2.
□ This is a good point to pause, check your data, and note/investigate anything that looks unusual (e.g., a huge difference in results for the two members of a duplicate pair, or how tight detect/non-detect pairs are). See Additional file.
3.
□ Calculate the average volume and concentration for each pair. (This step can be skipped if results are not volume-normalized.)
□ Back-calculate a new average mass using the average volume and average concentration. Or simply average the duplicate measurements if results are not normalized.
□ Compare the new average measurement to the MRL to determine the data qualifier flag.
□ Combine the duplicate averages back with the rest of the data.
Reporting:
□ In publications, we note the range of average RPDs across all chemicals in our QA/QC discussion. We consider an average RPD < 30% to be "good" precision.
□ If a chemical has sporadic blank contamination or variable spike recoveries, excellent precision can increase our confidence in the field sample results.
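The two precision measures named above have standard definitions, which a short sketch makes concrete (helper names are ours):

```python
from statistics import mean, stdev

def rpd(a, b):
    """Relative percent difference for one duplicate pair:
    |a - b| divided by the pair mean, times 100."""
    return abs(a - b) / ((a + b) / 2.0) * 100.0

def rsd(values):
    """Relative standard deviation (%) for three or more side-by-side
    samples: sample standard deviation divided by the mean, times 100."""
    return stdev(values) / mean(values) * 100.0
```

For example, a duplicate pair of 90 and 110 gives an RPD of 20%, comfortably inside the < 30% range the text treats as "good" precision.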