William J Cragg, Caroline Hurley, Victoria Yorke-Edwards, Sally P Stenning.
Abstract
BACKGROUND/AIMS: It is increasingly recognised that reliance on frequent site visits for monitoring clinical trials is inefficient. Regulators and trialists have recently encouraged more risk-based monitoring. Risk assessment should take place before a trial begins to define the overarching monitoring strategy. It can also be done on an ongoing basis, to target sites for monitoring activity. Various methods have been proposed for such prioritisation, often using terms like 'central statistical monitoring', 'triggered monitoring' or, as in the International Conference on Harmonization Good Clinical Practice guidance, 'targeted on-site monitoring'. We conducted a scoping review to identify such methods, to establish if any were supported by adequate evidence to allow wider implementation, and to guide future developments in this field of research.
Keywords: Good Clinical Practice; Trial monitoring; central statistical monitoring; data fabrication; research misconduct; risk-based monitoring; triggered monitoring
Year: 2021 PMID: 33611927 PMCID: PMC8010889 DOI: 10.1177/1740774520976561
Source DB: PubMed Journal: Clin Trials ISSN: 1740-7745 Impact factor: 2.486
Figure 1. PRISMA flow diagram.
a. Reasons for exclusion: no relevant methods presented (n = 28); no novel methods presented (e.g. review article; n = 28); method to measure variation between trial sites but no ‘flagging’ of sites of concern (n = 25); abstract only and not enough detail to confirm relevance (n = 10); duplicate or abstract where full paper also available (n = 8); grey literature not considered to present reproducible methods (n = 5); not about ‘monitoring’ according to ICH Good Clinical Practice definition (n = 5); trial-level assessment only, not site-level (n = 4); focus on consistency of outcome assessment only (n = 4); method from observational study only, not clinical trial (n = 1).
General characteristics of included studies.
| Characteristic | N (total = 30) | % |
|---|---|---|
| Publication year | ||
| 1996–2000 | 0 | 0 |
| 2001–2005 | 2 | 7 |
| 2006–2010 | 2 | 7 |
| 2011–2015 | 13 | 43 |
| 2016–2018 | 13 | 43 |
| Type of source | ||
| Peer-reviewed paper | 21 | 70 |
| Conference abstract or poster | 8 | 27 |
| Thesis | 1 | 3 |
| Disease setting of trial involved | ||
| Cardiovascular disease | 4 | 13 |
| Emergency medicine | 1 | 3 |
| Haematology | 1 | 3 |
| Infectious diseases | 1 | 3 |
| Mental health | 3 | 10 |
| Neurology | 1 | 3 |
| Oncology | 3 | 10 |
| Ophthalmology | 1 | 3 |
| Renal disease | 1 | 3 |
| Respiratory disease | 1 | 3 |
| Unknown or no specific trial involved | 13 | 43 |
| Geographical setting of trials involved | ||
| Brazil | 1 | 3 |
| International | 7 | 23 |
| Japan | 1 | 3 |
| North America | 4 | 13 |
| UK | 2 | 7 |
| Unknown or no specific trial involved | 15 | 50 |
| Use of Investigational Medicinal Product (IMP) in involved trials | ||
| Involves IMP | 14 | 47 |
| No IMP | 1 | 3 |
| Unknown or no specific trial involved | 15 | 50 |
| Phase of trials involved | ||
| Phase I | 0 | 0 |
| Phase II | 1 | 3 |
| Phases II and III | 1 | 3 |
| Phase III | 9 | 30 |
| Unknown or no specific trial involved | 19 | 63 |
| Status of investigational medicinal product used | ||
| Unlicensed | 0 | 0 |
| Licensed, used outside of its licensed indication | 5 | 17 |
| Licensed, used within its licensed indication | 4 | 13 |
| Unknown or no specific trial involved | 22 | 73 |
| Focus of work | ||
| Central statistical monitoring, focus on fraud or misconduct | 7 | 23 |
| Central statistical monitoring, general | 13 | 43 |
| Triggered monitoring | 9 | 30 |
| Other method(s) for highlighting sites at risk | 2 | 7 |
| Scope of work | ||
| Description or development of method | 9 | 30 |
| Some assessment of methods’ effectiveness | 21 | 70 |
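Categories not mutually exclusive (applies to ‘Status of investigational medicinal product used’ and ‘Focus of work’, where the counts sum to more than 30).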
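Each percentage in the table is a count divided by the 30 included reports, rounded to a whole number. Where the counts in a section sum to more than 30, at least one report must be counted under more than one category. The following is a minimal Python sketch of that arithmetic, using the ‘Focus of work’ counts from the table. The function names `percent` and `has_overlap` are illustrative assumptions, not part of the review.

```python
import math

TOTAL_REPORTS = 30  # number of included reports, taken from the table header

# Counts copied from the 'Focus of work' section of the table.
focus_of_work = {
    "Central statistical monitoring, focus on fraud or misconduct": 7,
    "Central statistical monitoring, general": 13,
    "Triggered monitoring": 9,
    "Other method(s) for highlighting sites at risk": 2,
}


def percent(count, total=TOTAL_REPORTS):
    """Share of the total as a whole-number percentage (rounded half up)."""
    return math.floor(count / total * 100 + 0.5)


def has_overlap(counts, total=TOTAL_REPORTS):
    """True when the counts sum to more than the total, which means at
    least one report is counted under more than one category."""
    return sum(counts.values()) > total


percentages = {name: percent(n) for name, n in focus_of_work.items()}
for name, value in percentages.items():
    print(f"{name}: {value}%")
print("Overlapping categories:", has_overlap(focus_of_work))
```

A sum equal to 30 does not by itself prove that categories are exclusive, so the check only detects overlap; it cannot rule it out.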
Full listing of all included reports.
| Author(s) | Type of source | Focus of work | Scope of work |
|---|---|---|---|
| Agrafiotis et al. | Peer-reviewed paper | Triggered monitoring | Some assessment of methods’ effectiveness |
| Almukhtar and Glassman | Conference abstract/poster | Central statistical monitoring, general | Description or development of method |
| Atanu et al. | Peer-reviewed paper | Central statistical monitoring, general | Description or development of method |
| Bailey et al. | Conference abstract/poster | Triggered monitoring | Description or development of method |
| Bengtsson | Thesis | Central statistical monitoring, general | Some assessment of methods’ effectiveness |
| Biglan et al. | Conference abstract/poster | Triggered monitoring | Some assessment of methods’ effectiveness |
| Desmet et al. | Peer-reviewed paper | Central statistical monitoring, general | Some assessment of methods’ effectiveness |
| Desmet et al. | Peer-reviewed paper | Central statistical monitoring, general | Some assessment of methods’ effectiveness |
| Diani et al. | Peer-reviewed paper | Triggered monitoring | Some assessment of methods’ effectiveness |
| Djali et al. | Peer-reviewed paper | Other method(s) for highlighting sites at risk (combines site metric scores directly to flag sites of concern) | Some assessment of methods’ effectiveness |
| Dress et al. | Conference abstract/poster | Triggered monitoring | Description or development of method |
| Edwards et al. | Peer-reviewed paper | Central statistical monitoring with triggered monitoring | Some assessment of methods’ effectiveness |
| Kirkwood et al. | Peer-reviewed paper | Central statistical monitoring, general | Some assessment of methods’ effectiveness |
| Knepper et al. | Peer-reviewed paper | Central statistical monitoring, focus on fraud or misconduct | Some assessment of methods’ effectiveness |
| Knott et al. | Conference abstract/poster | Central statistical monitoring, general | Some assessment of methods’ effectiveness |
| Kodama et al. | Conference abstract/poster | Central statistical monitoring, focus on fraud or misconduct | Some assessment of methods’ effectiveness |
| Lindblad et al. | Peer-reviewed paper | Central statistical monitoring, general | Some assessment of methods’ effectiveness |
| O’Kelly | Peer-reviewed paper | Central statistical monitoring, focus on fraud or misconduct | Some assessment of methods’ effectiveness |
| Pogue et al. | Peer-reviewed paper | Central statistical monitoring, focus on fraud or misconduct | Some assessment of methods’ effectiveness |
| Smith and Seltzer | Peer-reviewed paper | Other method(s) for highlighting sites at risk (use of “statistical process control methodology” to combine per-site risk indicator scores) | Description or development of method |
| Stenning et al. | Peer-reviewed paper | Triggered monitoring | Some assessment of methods’ effectiveness |
| Taylor et al. | Peer-reviewed paper | Central statistical monitoring, focus on fraud or misconduct | Some assessment of methods’ effectiveness |
| Timmermans et al. | Peer-reviewed paper | Central statistical monitoring, general | Some assessment of methods’ effectiveness |
| Tudur Smith et al. | Peer-reviewed paper | Triggered monitoring | Description or development of method |
| Valdes-Marquez et al. | Conference abstract/poster | Central statistical monitoring, general | Description or development of method |
| Valdes-Marquez et al. | Conference abstract/poster | Central statistical monitoring, general | Description or development of method |
| Van den Bor et al. | Peer-reviewed paper | Central statistical monitoring, focus on fraud or misconduct | Some assessment of methods’ effectiveness |
| Whitham, 2018 | Peer-reviewed paper | Triggered monitoring | Description or development of method |
| Wu and Carlsson | Peer-reviewed paper | Central statistical monitoring, focus on fraud or misconduct | Some assessment of methods’ effectiveness |
| Zink et al. | Peer-reviewed paper | Central statistical monitoring, general | Some assessment of methods’ effectiveness |
Figure 2. Publications by year and type.
Types of assessments and evidence presented by reports that included some assessments of their methods’ effectiveness.
| Author(s) | Case studies | Illustration of method(s) on data with no known issues | Assessment of methods’ ability to identify simulated problem sites | Assessment of methods’ ability to identify known problems in real trial data | Methods used in ongoing trial, results of on-site monitoring reported | Methods used in ongoing trial, effects reported on trial in general (e.g. in terms of cost or data quality) | Prospectively designed, controlled study to assess methods’ ability to target on-site monitoring visits to most problematic sites |
|---|---|---|---|---|---|---|---|
| Agrafiotis et al. | X | X | |||||
| Bengtsson | X | ||||||
| Biglan et al. | X | X | |||||
| Desmet et al. | X | X | X | ||||
| Desmet et al. | X | X | |||||
| Diani et al. | X | ||||||
| Djali et al. | X | ||||||
| Edwards et al. | X | ||||||
| Kirkwood et al. | X | X | |||||
| Knepper et al. | X | ||||||
| Knott et al. | X | ||||||
| Kodama et al. | X | ||||||
| Lindblad et al. | X | ||||||
| O’Kelly | X | ||||||
| Pogue et al. | X | X | |||||
| Stenning et al. | X | ||||||
| Taylor et al. | X | ||||||
| Timmermans et al. | X | ||||||
| Van den Bor et al. | X | ||||||
| Wu and Carlsson | X | X | |||||
| Zink et al. | X | ||||||
| Total | 3 | 9 | 6 | 4 | 3 | 3 | 1 |
Best reported information on methods’ classification ability, where available or deducible.
| Author(s) | Available information on methods’ classification abilities | Definition of ‘positive’ centres | ‘True’ test status: real or simulated? | Test for ‘true’ centre status | Sensitivity | Specificity | Positive predictive value | Negative predictive value |
|---|---|---|---|---|---|---|---|---|
| Biglan et al. | Partial (‘true’ status known for only one centre; total number of centres not known) | Not clearly defined | Real | On-site monitoring | Unavailable due to limited data; report states that one ‘low-risk’ centre was visited and considered to be misclassified (i.e. should have been ‘medium risk’ or ‘high risk’). However, the total number of sites classified and visited (overall and within each risk category) is not known | | | |
| Desmet et al. | Explored through simulation | Presence of atypical data | Simulated | Known because simulated | Dependent on simulation scenario; no specific figure given | | | |
| | Detailed information (vital signs data used as illustrative example) | Presence of atypical data | Real | Unclear (‘closer inspection’) | Reported: 83% (10/(10+2)) | Reported: 99% (204/(204+2)) | Calculated: 83% (10/(10+2)) | Calculated: 99% (204/(204+2)) |
| Desmet et al. | Explored through simulation | Presence of atypical data | Simulated | Known because simulated | Reported: dependent on simulation scenario; no specific figure given | Reported: median specificity varied from 98%–100% depending on scenario | Not reported and not possible to calculate (results of many simulations presented) | |
| Knepper et al. | Detailed information | Presence of fabricated data | Simulated with physician input | Known because simulated | Reported: best result from 4 scenarios (study 1): 86% (6/(6+1)) | Reported: best result from 4 scenarios (study 1a): 87% (148/(148+23)) | Reported: best result from 4 scenarios (study 2a): 27% (3/(3+8)) | Reported: best result from 4 scenarios (study 1): 99% (132/(132+1)) |
| Knott et al. | Partial (total number of sites not reported but likely more than number whose results reported; ‘true’ status of any unreported centres not known) | Presence of any findings | Real | On-site monitoring | Calculated: 85% (11/(11+2)) | Calculated: 88% (7/(7+1)) | Calculated: 92% (11/(11+1)) | Calculated: 78% (7/(7+2)) |
| | | Presence of findings ‘indicative of sloppy practice’ (clearer definition not reported) | Real | On-site monitoring | Calculated: 83% (10/(10+2)) | Calculated: 78% (7/(7+2)) | Calculated: 83% (10/(10+2)) | Calculated: 78% (7/(7+2)) |
| | | Presence of serious findings | Real | On-site monitoring | Calculated: 100% (1/(1+0)) | Calculated | Calculated | Calculated |
| Lindblad et al. | Partial (‘true’ status known only at 21/413 centres) | Presence of serious problems | Real | Regulatory inspection | Reported: 83% (5/(5+1)) | Cannot be calculated without making assumptions about the 392/413 sites with unknown ‘true’ status | | |
| | | Presence of minor problems | Real | Regulatory inspection | Reported: 89% (8/(8+1)) | | | |
| | | Presence of any problems | Real | Regulatory inspection | Reported: 87% (13/(13+2)) | | | |
| O’Kelly | Detailed information, but sample of data from trial | Presence of fabricated data | Simulated with physician input | Known because simulated | Calculated: 33% (1/(1+2)) | Calculated: 95% (18/(18+1)) | Calculated: 50% (1/(1+1)) | Calculated: 90% (18/(18+2)) |
| Pogue et al. | POISE trial data: detailed information from all sites with ≥ 20 randomisations | Presence of fabricated data | Real | On-site monitoring | Reported: different models and different thresholds give different pros and cons in terms of classification. Models 1, 3 and 5 all have at least some scenarios where both specificity and sensitivity >80%. (Models 1 and 5, risk score ≥ 7; Model 3, risk score ≥ 5) | | | |
| | HOPE trial data: summary information from all sites with ≥ 20 randomisations | Presence of fabricated data | Real | On-site monitoring | N/a (no true positives) | Reported: | Calculated: | Calculated: |
| Stenning et al. | Partial (only sample of negative-testing sites visited, although the study design aimed to control for this) | Presence of ≥1 serious (Major or Critical) finding | Real | On-site monitoring | Calculated: | Calculated: | Reported: | Calculated: |
| | | | | | Calculated: | Calculated: | Reported: | Calculated: |
| | | | | | Calculated: | Calculated: | Reported: | Calculated: |
| Van den Bor et al. | Partial in paper, but authors confirmed that trial implemented source data verification for all sites (personal communication) | Presence of fabricated data | Real | On-site monitoring | Various situations presented, with different implications for classification ability. | | | |
| Wu and Carlsson | Partial (15/17 sites have unknown ‘true’ status) | Presence of fabricated data | Real | Auditing | Results presented narratively via a number of scenarios. | | | |
Sensitivity: number of correctly flagged problem sites/(number of correctly flagged problem sites + sites incorrectly not flagged as concerning); thick border used to highlight results more than or equal to 90%.
Specificity: number of sites correctly flagged as not concerning/(number of sites correctly flagged as not concerning + sites incorrectly flagged as concerning); thick border used to highlight results more than or equal to 90%.
Positive predictive value: number of correctly flagged problem sites/(number of correctly flagged problem sites + sites incorrectly flagged as concerning); thick border used to highlight results more than or equal to 90%.
Negative predictive value: number of sites correctly flagged as not concerning/(number of sites correctly flagged as not concerning + sites incorrectly not flagged as concerning); thick border used to highlight results more than or equal to 90%.
Biglan et al., definition of ‘positive’ centres: one ‘positive’ centre is described as ‘reveal[ing that] RBM was not assessing risk sufficiently to drive monitoring decisions’.
Knepper et al., specificity of 87% (148/(148+23)): publication incorrectly rounds this to 86%.
O’Kelly, sample of data from trial: approximately one-third of sites included from a trial; also some uncertainty about total number of sites (sometimes reported as 21, sometimes 22; used 22 for calculations given here as this is the figure in the ‘Results’ section).
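The four measures defined in the footnotes above can all be derived from the same four site counts. The following is a minimal Python sketch of that calculation, checked against the counts shown in the table for Knott et al. (‘presence of any findings’) and against the Knepper et al. specificity. The names `SiteCounts`, `classification_metrics` and `as_percent` are illustrative assumptions, and percentages are rounded half up to whole numbers to match the table.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteCounts:
    tp: int  # problem sites correctly flagged
    fn: int  # problem sites incorrectly not flagged
    tn: int  # non-problem sites correctly not flagged
    fp: int  # non-problem sites incorrectly flagged


def ratio(numerator, denominator):
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def classification_metrics(c):
    """Sensitivity, specificity, PPV and NPV as defined in the footnotes."""
    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "ppv": ratio(c.tp, c.tp + c.fp),
        "npv": ratio(c.tn, c.tn + c.fn),
    }


def as_percent(value):
    """Whole-number percentage, rounded half up; None is passed through."""
    return None if value is None else math.floor(value * 100 + 0.5)


# Counts read from the Knott et al. row: 11/(11+2), 7/(7+1), 11/(11+1), 7/(7+2).
knott_any_findings = SiteCounts(tp=11, fn=2, tn=7, fp=1)
knott_percent = {
    name: as_percent(value)
    for name, value in classification_metrics(knott_any_findings).items()
}
print(knott_percent)

# Knepper et al. specificity, 148/(148+23), which rounds to 87% rather than 86%.
knepper_specificity = as_percent(ratio(148, 148 + 23))
print(knepper_specificity)
```

Returning `None` for a zero denominator mirrors rows such as the HOPE trial data, where sensitivity is not applicable because there were no true positives.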