Jessica E Lockery, Taya A Collyer, Christopher M Reid, Michael E Ernst, David Gilbertson, Nino Hay, Brenda Kirpach, John J McNeil, Mark R Nelson, Suzanne G Orchard, Kunnapoj Pruksawongsin, Raj C Shah, Rory Wolfe, Robyn L Woods.
Abstract
BACKGROUND: Large-scale studies risk generating inaccurate and missing data due to the complexity of data collection. Technology has the potential to improve data quality by providing operational support to data collectors. However, this potential is under-explored in community-based trials. The ASPirin in Reducing Events in the Elderly (ASPREE) trial developed a data suite that was specifically designed to support data collectors: the ASPREE Web Accessible Relational Database (AWARD). This paper describes AWARD and the impact of system design on data quality.
Keywords: Clinical trial; Data quality; Health data; Health technology
Year: 2019 PMID: 31815652 PMCID: PMC6902598 DOI: 10.1186/s13063-019-3789-2
Source DB: PubMed Journal: Trials ISSN: 1745-6215 Impact factor: 2.279
Operational and data management considerations and solutions
| Operational domain | Key requirements | Design solution |
|---|---|---|
| Visit booking | - Identification of participants to be booked and the visit required - Organisation of “due” participants by visit venue - Computation of visit venue booking time - Recording and tracking of venue room bookings - Recording and tracking of participant bookings linked with room bookings | “2 step” solution implemented - Venue room booking information entered via a single web page - Participants’ booking information entered on a nested web page Bookings presented in both calendar and list format via the web application |
| Conduct of calls | - Identification of participants to be contacted - Mechanism to record call attempts and messages - Mechanism to record if participants are unavailable for calls at certain timepoints | Online call tracking implemented - List of participants due and eligible to receive calls available via web application Simple online phone call data collection form |
| Study medication tracking | - Tracking of dispensing and retrieval of study medication bottle - Mechanism to ensure that the correct medication is provided to each participant | Online drug log implemented - Study medication bottle dispensation date and retrieval date recorded - Pill count logged To avoid unnecessary queries caused by transcription errors, each participant’s unique study medication code prompted and validated on data entry |
| Retention | - Conduct of scheduled contact at certain timepoints identified as increasing the risk of participant withdrawal (e.g. between dementia trigger and completion of additional cognitive assessment) - Mechanism required to shift participants at risk of withdrawal from the regular contact lists to a retention team list | Retention status implemented - Database “views” utilised to derive a status describing whether scheduled study contact was appropriate (e.g. not eligible for phone contact – dementia trigger follow up in progress) Status utilised to shift participants from regular contact lists to retention team lists |
| Communication | - Mechanism for staff to notify PCPs/GPs of abnormal results - Mechanism for requesting clinical documents from third parties (e.g. hospitals, specialists and general practitioners) | Curated third party communication pipeline created and implemented - Standard document request and abnormal result notification letters auto-populated with relevant participant details via web application - Microsoft Visual Basic for Applications utilised to send standard letters via fax or email communications |
| Staff decision support | - Mechanism to ensure that protocol-specified follow-up of endpoints was completed - Mechanism to ensure that protocol-specified follow-up of abnormal results was completed - Mechanism to ensure that only eligible participants were randomised | Key operational “status” for each study participant or key step derived and displayed - Database views utilised to derive a status describing the operational “next step” (e.g. event coded – awaiting supporting documents; annual visit – overdue, etc.) Status displayed on relevant pages on the user interface Randomisation restrictions implemented - Automated checks compared entered data against eligibility criteria - Randomisation function disabled for ineligible participants |
| Data entry | - Mechanism to alert staff to potentially incorrect data for review - Clear process for alerting staff to data queries for resolution | Checks and balances implemented to minimise transcription errors - Pre-programmed value ranges, process prompts and protocol compliance checks applied at the point of data entry - Page submission restrictions implemented to check for logical consistency between values on a page Staff action list implemented - Automated checks compared entered data against acceptable rangesᵃ and produced “action items” - Staff-specific list of action items displayed on “home” page of |
PCP primary care provider, GP general practitioner
ᵃ Acceptable ranges were determined by an expert committee based on physiological plausibility
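
The data-entry and decision-support solutions in the table above (pre-programmed value ranges that generate staff "action items", and a randomisation function that is disabled for ineligible participants) can be illustrated with a minimal Python sketch. This is not AWARD's implementation, which the table does not detail. The field names, ranges, criterion names and function names below are hypothetical placeholders, not the values set by the trial's expert committee.

```python
from dataclasses import dataclass

# Hypothetical plausibility ranges, for illustration only.
# In ASPREE the real ranges were set by an expert committee.
ACCEPTABLE_RANGES = {
    "systolic_bp_mmhg": (70, 250),
    "heart_rate_bpm": (30, 200),
}


@dataclass
class ActionItem:
    participant_id: str
    field_name: str
    value: float
    message: str


def check_ranges(participant_id, entered):
    """Return one action item for each entered value outside its range."""
    items = []
    for field_name, value in entered.items():
        if field_name not in ACCEPTABLE_RANGES:
            continue  # no range configured for this field
        low, high = ACCEPTABLE_RANGES[field_name]
        if not low <= value <= high:
            message = f"{field_name}={value} is outside {low}-{high}; please review"
            items.append(ActionItem(participant_id, field_name, value, message))
    return items


def can_randomise(eligibility_checks):
    """Hypothetical gate: allow randomisation only if every criterion passed.

    `eligibility_checks` maps a criterion name to True (met) or False (not met).
    An empty mapping is treated as "not yet assessed" and blocks randomisation.
    """
    return bool(eligibility_checks) and all(eligibility_checks.values())


if __name__ == "__main__":
    for item in check_ranges("P001", {"systolic_bp_mmhg": 300, "heart_rate_bpm": 72}):
        print(item.message)
    print(can_randomise({"consent_signed": True, "age_criterion_met": False}))
```

Running the check at the point of entry, instead of in a later batch clean, is the design choice the table emphasises: the person who recorded the value is still available to confirm or correct it.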
Fig. 1 Conceptual design and functionality of the ASPREE Web Accessible Relational Database (AWARD) suite. e-forms = electronic versions of case report or other forms. “Other data library” refers to the library storing unstructured files such as PDF supporting documents, PDF consent forms and retinal photographs
Fig. 2 Data flow between stakeholders in the ASPREE clinical trial
Data quality comparison before and after the system upgrade to the ASPREE Web Accessible Relational Database (AWARD)
| | Prior to AWARD | With AWARD |
|---|---|---|
| Number of participants | 1000 | 18,114 |
| Number of fields collected at baseline | 206 | 220 |
| Baseline data missing due to staff error | 646 (0.3%) | 351 (0.01%) |
| Baseline data requiring querying | 278 (0.14%) | 1469 (0.04%) |
| Protocol deviations | 4 (0.4%) | 15 (0.08%) |
| Proportion of participants with at least 1 missing field due to staff error | 65% | 2% |
Results are presented as number or number (percentage)
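
The percentages for missing and queried baseline data appear to be expressed per collected data value, not per participant. The sketch below recomputes them under the assumption that the denominator is participants multiplied by baseline fields; the table does not state this. Under that assumption the results agree with the reported figures to within about 0.01 percentage points. Protocol deviations are taken per participant. The dictionary keys and the `rates` function are illustrative names.

```python
# Figures copied from the comparison table.
PERIODS = {
    "prior to AWARD": {"participants": 1000, "fields": 206,
                       "missing_staff_error": 646, "queried": 278, "deviations": 4},
    "with AWARD": {"participants": 18114, "fields": 220,
                   "missing_staff_error": 351, "queried": 1469, "deviations": 15},
}


def rates(period):
    """Return percentages under the assumed denominators."""
    # Assumption: potential baseline values = participants x fields.
    potential_values = period["participants"] * period["fields"]
    return {
        "missing_pct": 100 * period["missing_staff_error"] / potential_values,
        "queried_pct": 100 * period["queried"] / potential_values,
        # Protocol deviations are expressed per participant.
        "deviation_pct": 100 * period["deviations"] / period["participants"],
    }


for label, period in PERIODS.items():
    r = rates(period)
    print(f"{label}: missing {r['missing_pct']:.3f}%, "
          f"queried {r['queried_pct']:.3f}%, deviations {r['deviation_pct']:.2f}%")
```

Expressing errors per data value matters for the comparison because the two periods differ roughly 18-fold in participant numbers, so raw counts alone would be misleading.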
ASPREE longitudinal data quality
| Data category | TOTAL potential valuesᵃ,ᵇ | Number of variables collected | Entered within range or found to be correct on querying, number (% of total) | Unresolved queries, number | Protocol deviations, number | Values where data collection not possibleᶜ, number |
|---|---|---|---|---|---|---|
| Participant demographics | 824,947 | 80 | 795,984 (96.5%) | 0 | 0 | 704,173 |
| Clinical informationᵉ | 2,413,294 | 227 | 2,358,542 (97.7%) | 0 | 14 | 1,925,584 |
| Pathology | 959,347 | 88 | 842,136 (87.8%) | 0 | 5 | 722,685 |
| Medications | 939,684 | 31ᵈ | 937,256 (99.7%) | 0 | 0 | 72,444 |
| Family history | 293,929 | 37 | 286,982 (97.6%) | 0 | 0 | 413,289 |
| Cognitive measures | 2,737,448 | 590 | 2,689,837 (98.3%) | 0 | 0 | 8,539,812 |
| Physical function | 509,858 | 73 | 488,648 (95.8%) | 0 | 0 | 885,464 |
| Mood, function and quality of life | 6,317,845 | 749 | 6,070,867 (96.1%) | 0 | 0 | 7,998,541 |
| Endpoints | 146,243 | 46 | 146,243 (100%) | 0 | 0 | 733,001 |
| Study medication | 97,004 | 8 | 97,004 (100%) | 0 | 0 | 55,908 |
| Visit conduct | 469,259 | 61 | 456,539 (97.3%) | 0 | 0 | 696,695 |
| TOTAL | 15,708,858 | 1990 | 15,170,038 (96.6%) | 0 | 19 | 23,399,596 |
ᵃ Includes all data values scheduled for collection between randomisation and death, withdrawal of consent or study closure. Excludes fields that were not active at the time of data collection, and fields that were not applicable due to a response to a hierarchical question
ᵇ All values queried for missing and out-of-range data
ᶜ Includes all data values scheduled for collection after death, withdrawal of consent or study closure. Also includes fields that were not active at the time of data collection, and fields that were not applicable due to a response to a hierarchical question
ᵈ Plus concomitant medication. The number of medications reported varied for each participant
ᵉ Includes past medical history, past cancer screening, and physical examination measures such as blood pressure, heart rate, height, weight and abdominal circumference
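
The percentage shown beside each "entered within range or found to be correct on querying" count is that count divided by the row's total potential values. As a quick check, the sketch below recomputes the percentage for every row using figures copied from the table. The variable and function names are illustrative.

```python
# category: (total potential values, entered within range or confirmed, reported %)
ROWS = {
    "Participant demographics": (824_947, 795_984, 96.5),
    "Clinical information": (2_413_294, 2_358_542, 97.7),
    "Pathology": (959_347, 842_136, 87.8),
    "Medications": (939_684, 937_256, 99.7),
    "Family history": (293_929, 286_982, 97.6),
    "Cognitive measures": (2_737_448, 2_689_837, 98.3),
    "Physical function": (509_858, 488_648, 95.8),
    "Mood, function and quality of life": (6_317_845, 6_070_867, 96.1),
    "Endpoints": (146_243, 146_243, 100.0),
    "Study medication": (97_004, 97_004, 100.0),
    "Visit conduct": (469_259, 456_539, 97.3),
    "TOTAL": (15_708_858, 15_170_038, 96.6),
}


def within_range_pct(potential, entered):
    """Percentage of potential values entered in range, to one decimal place."""
    return round(100 * entered / potential, 1)


for category, (potential, entered, reported) in ROWS.items():
    computed = within_range_pct(potential, entered)
    flag = "ok" if computed == reported else "MISMATCH"
    print(f"{category}: computed {computed}% vs reported {reported}% [{flag}]")
```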
ASPREE longitudinal data completeness
| Data category | TOTAL missing valuesᵃ | Missing data: visit not conducted | Missing data: third party | Missing data: participant declined | Missing data: staff/device error | Missing data: other reasons |
|---|---|---|---|---|---|---|
| Participant demographics | 28,963 (3.5%) | 1.5% | 0% | 2.1% | < 0.1% | 0% |
| Clinical information | 54,738 (2.3%) | 1.5% | 0% | 0.3% | < 0.1% | 0.4% |
| Pathology | 117,206 (12.2%) | 0% | 10.2% | 0% | 0% | 2.0% |
| Medications | 2428 (0.3%) | 0% | 0% | 0.1% | < 0.1% | 0.1% |
| Family history | 6947 (2.4%) | 2.0% | 0% | 0.2% | 0.1% | 0% |
| Cognitive measures | 47,611 (1.7%) | 1.5% | 0% | 0.3% | < 0.1% | < 0.1% |
| Physical function | 21,210 (4.2%) | 2.6% | 0% | 1.5% | < 0.1% | 0% |
| Mood, function and quality of life | 246,978 (3.9%) | 3.6% | 0% | 0.3% | 0.1% | < 0.1% |
| Endpoints | 0 (0%) | 0% | 0% | 0% | 0% | 0% |
| Study medication | 0 (0%) | 0% | 0% | 0% | 0% | 0% |
| Visit conduct | 12,720 (2.7%) | 2.7% | 0% | 0% | 0% | 0% |
| TOTAL | 538,801 (3.4%) | 2.2% | 0.6% | 0.4% | < 0.1% | 0.2% |
ᵃ Excludes fields that were not active at the time of data collection, and fields that were not applicable due to a response to a hierarchical question
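
Read together with the longitudinal data quality table, each missing-value total appears to equal total potential values minus values entered within range (or confirmed on querying) minus protocol deviations. This relationship is inferred from the published numbers and is not stated by the authors. The sketch below checks it for every category, along with the reported missing percentage. The names used are illustrative.

```python
# category: (potential values, entered in range, protocol deviations,
#            reported missing values, reported missing %)
ROWS = {
    "Participant demographics": (824_947, 795_984, 0, 28_963, 3.5),
    "Clinical information": (2_413_294, 2_358_542, 14, 54_738, 2.3),
    "Pathology": (959_347, 842_136, 5, 117_206, 12.2),
    "Medications": (939_684, 937_256, 0, 2_428, 0.3),
    "Family history": (293_929, 286_982, 0, 6_947, 2.4),
    "Cognitive measures": (2_737_448, 2_689_837, 0, 47_611, 1.7),
    "Physical function": (509_858, 488_648, 0, 21_210, 4.2),
    "Mood, function and quality of life": (6_317_845, 6_070_867, 0, 246_978, 3.9),
    "Endpoints": (146_243, 146_243, 0, 0, 0.0),
    "Study medication": (97_004, 97_004, 0, 0, 0.0),
    "Visit conduct": (469_259, 456_539, 0, 12_720, 2.7),
    "TOTAL": (15_708_858, 15_170_038, 19, 538_801, 3.4),
}


def derived_missing(potential, entered, deviations):
    """Missing values implied by the inferred accounting identity."""
    return potential - entered - deviations


for category, (potential, entered, deviations, missing, pct) in ROWS.items():
    implied = derived_missing(potential, entered, deviations)
    implied_pct = round(100 * missing / potential, 1)
    print(f"{category}: implied {implied}, reported {missing}; "
          f"implied {implied_pct}%, reported {pct}%")
```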