Stephany N Duda, Bryan E Shepherd, Cynthia S Gadd, Daniel R Masys, Catherine C McGowan.
Abstract
Observational studies of health conditions and outcomes often combine clinical care data from many sites without explicitly assessing the accuracy and completeness of these data. In order to improve the quality of data in an international multi-site observational cohort of HIV-infected patients, the authors conducted on-site, Good Clinical Practice-based audits of the clinical care datasets submitted by participating HIV clinics. Discrepancies between data submitted for research and data in the clinical records were categorized using the audit codes published by the European Organization for the Research and Treatment of Cancer. Five of seven sites had error rates >10% in key study variables, notably laboratory data, weight measurements, and antiretroviral medications. All sites had significant discrepancies in medication start and stop dates. Clinical care data, particularly antiretroviral regimens and associated dates, are prone to substantial error. Verifying data against source documents through audits will improve the quality of databases and research and can be a technique for retraining staff responsible for clinical data collection. The authors recommend that all participants in observational cohorts use data audits to assess and improve the quality of data and to guide future data collection and abstraction efforts at the point of care.
Year: 2012 PMID: 22493676 PMCID: PMC3320898 DOI: 10.1371/journal.pone.0033908
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Characteristics of data collection, abstraction, and management at audit sites A–G.
| Characteristic | Site A | Site B | Site C | Site D | Site E | Site F | Site G |
| Structured patient follow-up visit form | Yes | No | No | No | No | No | No |
| Drug dispensing form | No | No | No | Yes | No | Yes | Yes |
| Electronic laboratory system | No | Yes | Yes | No | No | Yes | No |
| On-site research database (vs. commercially hosted and managed database) | Yes | Yes | Yes | Yes | No | Yes | Yes |
| Data manager | Yes | Yes | Yes | Yes | No | No | Yes |
| Full-time data abstraction team | No | No | No | Yes | No | No | Yes |
| Internal data audits | No | No | No | No | No | Yes | No |
Availability of randomly selected clinical records requested by the audit team according to site.
| Audit Site | Total charts requested | Charts available and audited | Charts available but not audited | Charts unavailable |
| A | 40 | 29 | 0 | 11 |
| B | 28 | 23 | 3 | 2 |
| C | 17 | 17 | 0 | 0 |
| D | 28 | 27 | 0 | 1 |
| E | 33 | 27 | 5 | 1 |
| F | 35 | 35 | 0 | 0 |
| G | 27 | 26 | 0 | 1 |
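The availability counts can be checked with simple arithmetic. The sketch below copies the rows of the table into a dictionary, confirms that each row partitions the requested charts, and reports the share of requested charts that were audited. The names `charts` and `audited_share` are illustrative and do not come from the study.

```python
# Chart counts copied from the availability table:
# site -> (requested, audited, available but not audited, unavailable)
charts = {
    "A": (40, 29, 0, 11),
    "B": (28, 23, 3, 2),
    "C": (17, 17, 0, 0),
    "D": (28, 27, 0, 1),
    "E": (33, 27, 5, 1),
    "F": (35, 35, 0, 0),
    "G": (27, 26, 0, 1),
}


def audited_share(requested, audited):
    """Fraction of requested charts that were actually audited."""
    return audited / requested


for site, (requested, audited, not_audited, unavailable) in charts.items():
    # Each row should account for every requested chart exactly once.
    assert requested == audited + not_audited + unavailable, site
    share = audited_share(requested, audited)
    print(f"Site {site}: {audited}/{requested} audited ({share:.0%})")

total_requested = sum(row[0] for row in charts.values())
total_audited = sum(row[1] for row in charts.values())
print(f"All sites: {total_audited}/{total_requested} audited")
```

The audited total across sites is 184, which agrees with the number of Gender and Birth date values reported for all sites in the error-rate table.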
Total number of audited variables and percentage of erroneous data by data type during initial audits at seven sites.
| Variable | A N | A %err | B N | B %err | C N | C %err | D N | D %err | E N | E %err | F N | F %err | G N | G %err | All N | All %err |
| Gender | 29 | 0% | 23 | 0% | 17 | 0% | 27 | 0% | 27 | 0% | 35 | 0% | 26 | 0% | 184 | 0% |
| Birth date | 29 | 7% | 23 | 9% | 17 | 0% | 27 | 19% | 27 | 0% | 35 | 0% | 26 | 0% | 184 | 5% |
| Weight | 29 | 31% | 37 | 41% | 55 | 11% | 26 | 38% | 27 | 93% | 268 | 1% | 45 | 2% | 487 | 14% |
| Weight date | 29 | 21% | 37 | 30% | 55 | 15% | 26 | 38% | 27 | 100% | 268 | 0% | 45 | 9% | 487 | 14% |
| CD4 | 29 | 14% | 33 | 21% | 31 | 6% | 96 | 13% | 132 | 5% | 134 | 1% | 88 | 5% | 543 | 7% |
| CD4 date | 29 | 21% | 33 | 27% | 31 | 10% | 96 | 16% | 132 | 17% | 134 | 1% | 88 | 8% | 543 | 12% |
| Viral load | 29 | 7% | 26 | 42% | 0 | – | 57 | 25% | 120 | 7% | 112 | 1% | 84 | 4% | 428 | 9% |
| Viral load date | 29 | 17% | 26 | 42% | 0 | – | 57 | 28% | 119 | 13% | 112 | 0% | 84 | 7% | 427 | 12% |
| Regimen | 46 | 11% | 54 | 26% | 23 | 13% | 38 | 21% | 49 | 22% | 67 | 7% | 47 | 19% | 324 | 17% |
| Start date | 46 | 28% | 54 | 56% | 23 | 13% | 38 | 32% | 49 | 39% | 67 | 12% | 47 | 26% | 324 | 30% |
| Stop date | 30 | 27% | 54 | 50% | 7 | 29% | 38 | 29% | 49 | 33% | 67 | 10% | 47 | 38% | 292 | 30% |
| Total | 354 | 17% | 400 | 34% | 259 | 10% | 526 | 21% | 758 | 20% | 1299 | 2% | 627 | 10% | 4223 | 14% |
This table shows the number of variables audited in each of eleven categories of data, including gender, birth date, weight, CD4 count, viral load, antiretroviral (ARV) regimens, and all associated dates.
Columns contain the counts for each site (N), along with the percentage of data that was labeled “in error” by auditors (%err). The reported percentage of erroneous data includes incorrect, missing, and sourceless values (error categories 3, 4, and 5), but not minor errors.
Site C did not submit any viral load data.
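The %err definition in the note above can be sketched in code: a value counts as erroneous when its audit code is 3, 4, or 5, and minor errors are excluded. This is a minimal sketch rather than the authors' implementation. The function name `percent_error`, the example codes, and the use of codes 1 and 2 for correct values and minor errors are assumptions made for illustration.

```python
# Per the table note, categories 3, 4 and 5 (incorrect, missing, sourceless)
# count as errors; minor errors do not.
ERROR_CODES = {3, 4, 5}


def percent_error(codes):
    """Return (N, %err) for a list of audit codes, one code per audited value."""
    n = len(codes)
    if n == 0:
        # No audited values, e.g. a site that submitted no data for a variable.
        return 0, None
    errors = sum(1 for code in codes if code in ERROR_CODES)
    return n, 100.0 * errors / n


# Hypothetical audit codes for one variable at one site (not study data).
# Codes 1 and 2 stand in for "correct" and "minor error"; that numbering
# is an assumption, since the note only names categories 3, 4 and 5.
example_codes = [1, 1, 2, 3, 1, 4, 1, 1, 5, 1]
n, pct = percent_error(example_codes)
print(f"N = {n}, %err = {pct:.0f}%")
```

Returning `None` for an empty list mirrors the dash shown for Site C's viral load cells, where no percentage can be computed.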
Figure 1. Error rates by error type for antiretroviral regimens.
The chart shows the rates of overall, incorrect, missing, and sourceless data errors and their 95% confidence intervals for antiretroviral (ARV) regimens. The horizontal line represents error rates of 10%.
Figure 2. Error rates by error type for start and stop dates of antiretroviral treatment.
The two tiled charts show the rates of overall, incorrect, missing, and sourceless data errors and their 95% confidence intervals for the start and stop dates of patients' antiretroviral regimens. The horizontal line represents error rates of 10%.
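The captions do not say how the 95% confidence intervals were computed. As one common option for a proportion, the sketch below uses the Wilson score interval; this choice is an assumption, not the authors' stated method. The error count is back-calculated from a rounded percentage in the error-rate table (30% of 324 start dates), so it is approximate.

```python
import math


def wilson_interval(errors, n, z=1.96):
    """Wilson score interval for an error proportion (z=1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = errors / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Illustrative input: 324 audited start dates with about 30% in error.
# The count is reconstructed from a rounded percentage, so it is approximate.
n = 324
errors = round(0.30 * n)
low, high = wilson_interval(errors, n)
print(f"{errors}/{n} in error, 95% CI {low:.1%} to {high:.1%}")

# An interval lying entirely above 0.10 would sit above the 10% reference
# line drawn in the figures.
print("Entirely above 10%:", low > 0.10)
```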
Variable counts and error rates by data category during initial and follow-up audits at a single site.
| Variable | Initial site audit N | Initial site audit %err | Follow-up site audit N | Follow-up site audit %err |
| Gender | 23 | 0% | 26 | 0% |
| Birth date | 23 | 9% | 26 | 8% |
| Weight | 37 | 41% | 42 | 26% |
| Weight date | 37 | 30% | 42 | 21% |
| CD4 | 33 | 21% | 35 | 6% |
| CD4 date | 33 | 27% | 35 | 6% |
| Viral load | 26 | 42% | 32 | 16% |
| Viral load date | 26 | 42% | 32 | 13% |
| Regimen | 54 | 26% | 65 | 12% |
| Start date | 54 | 56% | 64 | 23% |
| Stop date | 54 | 50% | 64 | 33% |
| Total | 400 | 34% | 463 | 17% |