Naila A Shaheen, Bipin Manezhi, Abin Thomas, Mohammed AlKelya.
Abstract
BACKGROUND: A dataset is indispensable for answering the research questions of a clinical research study. Inaccurate data lead to ambiguous results, and removing errors increases cost. The aim of this Quality Improvement Project (QIP) was to improve Data Quality (DQ) by enhancing conformance and minimizing data entry errors.
Keywords: Clinical research data quality; Data entry errors; Data quality management; Data quality metrics; Defective dataset; Poor-quality dataset
Year: 2019 PMID: 31077148 PMCID: PMC6511206 DOI: 10.1186/s12874-019-0735-7
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Fig. 1 Pareto Chart of the Most Common Types of Data Defects at Baseline
Pre vs Post-Intervention Change in DPU, DPMO, Yield and SIGMA
| Measures | Pre-Intervention | Post-Intervention |
|---|---|---|
| Distribution of defects, n (%) | | |
| Zero defects | 6 (13.3) | 17 (81) |
| One defect | 8 (17.7) | 2 (9.5) |
| Two or more defects | 31 (69) | 2 (9.5) |
| DPU (defects per unit) | 2.33 | 0.57 |
| DPMO (defects per million opportunities) | 194,444.44 | 47,619.04 |
| Yield (%) | 80.55 | 95.23 |
| SIGMA | 2.4 | 3.2 |

| Factor | Pre-intervention: Not Defective, n (%) | Pre-intervention: Defective, n (%) | Post-intervention: Not Defective, n (%) | Post-intervention: Defective, n (%) |
|---|---|---|---|---|
| Data Capturing Points | | | | |
| Single data capturing point (Cross-Sectional studies) | 6 (28.6) | 15 (71.4) | 8 (66.7) | 4 (33.3) |
| Multiple data capturing points (Longitudinal studies) | 0 | 24 (100)* | 9 (100) | 0 |
| Research Coordinator on board | | | | |
| No | 5 (17.9) | 23 (82.1) | 7 (63.6) | 4 (36.4) |
| Yes | 1 (5.9) | 16 (94.1) | 10 (100) | 0 |
| Principal Investigators (PIs) Requesting Data Analysis | | | | |
| Returning PIs | 3 (30) | 7 (70) | 14 (100) | 0 |
| New PIs | 3 (8.6) | 32 (91.4) | 3 (42.9) | 4 (57.1)** |
Longitudinal studies = Cohort/Case-Control/Randomized controlled trials (RCT)
Reported percentages are row percentages
p-values are based on Fisher's exact test
* p = 0.007
**p = 0.006
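
The footnoted p-values can be checked with a short sketch of a two-sided Fisher's exact test. It assumes that each asterisk compares the two rows of its group within the same period (cross-sectional vs longitudinal studies pre-intervention, returning vs new PIs post-intervention); the source does not state this explicitly. The function name `fisher_exact_two_sided` is illustrative, and the two-sided rule used here (sum all tables no more likely than the observed one) is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Columns are (not defective, defective); counts are taken from the table.
# Assumed comparison for "*": cross-sectional vs longitudinal, pre-intervention.
p_design = fisher_exact_two_sided(6, 15, 0, 24)
# Assumed comparison for "**": returning vs new PIs, post-intervention.
p_pi = fisher_exact_two_sided(14, 0, 3, 4)

print(round(p_design, 3), round(p_pi, 3))
```

Under these assumptions the sketch gives 0.007 and 0.006, which agrees with the footnoted values.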
Fig. 2 Root Cause Analysis of Data Defects
Fig. 3 Pre vs Post Intervention Change in Number of Data Defects
Fig. 4 Association of Expert Consultation with Defective Datasets
| Metrics | Formula | Description |
|---|---|---|
| 1. SIPOC (supplier, input, process, output, customer) | – | Identifies all elements of a process improvement before measuring baseline |
| 2. DPU | | Provides a measurement of the average number of defects in a single unit […] |
| 3. DPO | | Measures the number of defects that occur per opportunity for success or failure […] |
| 4. DPMO | DPO × 1,000,000 | “Total number of defects observed divided by the total number of opportunities expressed in events per million, sometimes called defects per million” […] |
| 5. Yield | | “Traditionally, yield is a proportion of correct items (conforming to specifications) you get out of the process compared to the number of raw items put into it” […] |
| 6. Sigma | | Six sigma quality performance means 3.4 defects per million opportunities […] |
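
The metrics in the table above can be sketched in a few lines of Python. The formulas for DPU, DPO, yield and sigma did not survive in this record, so the definitions below are the conventional Six Sigma forms inferred from the descriptions, not necessarily the authors' exact expressions. The function names and the `opportunities_per_unit` parameter are illustrative. The 1.5 sigma shift is an assumption, chosen because it is the convention under which 3.4 DPMO corresponds to six sigma.

```python
from statistics import NormalDist


def dpu(defects, units):
    """Average number of defects per unit (assumed form)."""
    return defects / units


def dpo(defects, units, opportunities_per_unit):
    """Defects per opportunity (assumed form)."""
    return defects / (units * opportunities_per_unit)


def dpmo(dpo_value):
    """Defects per million opportunities: DPO x 1,000,000 (from the table)."""
    return dpo_value * 1_000_000


def yield_pct(dpo_value):
    """Share of defect-free opportunities, in percent (assumed form)."""
    return (1 - dpo_value) * 100


def sigma_level(dpo_value, shift=1.5):
    """Short-term sigma level with the conventional 1.5 sigma shift (assumption)."""
    return NormalDist().inv_cdf(1 - dpo_value) + shift


# Start from the DPMO values reported in the pre/post table.
results = {}
for label, reported_dpmo in [("pre", 194_444.44), ("post", 47_619.04)]:
    d = reported_dpmo / 1_000_000
    results[label] = (round(yield_pct(d), 2), round(sigma_level(d), 1))
    print(label, *results[label])
```

Starting from the reported DPMO values, this yields about 80.56% and 95.24% (the table's 80.55 and 95.23 appear to be truncated rather than rounded) and sigma levels of 2.4 and 3.2, consistent with the reported figures.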