Laura Trotta (1), Yuusuke Kabeya (2,3), Marc Buyse (4,5), Erik Doffagne (1), David Venet (6), Lieven Desmet (7), Tomasz Burzykowski (8,9), Akira Tsuburaya (10), Kazuhiro Yoshida (11), Yumi Miyashita (12), Satoshi Morita (13), Junichi Sakamoto (12,14), Paurush Praveen (1), Koji Oba (2,15).
1. CluePoints S.A., Louvain-la-Neuve, Belgium.
2. Department of Biostatistics, The University of Tokyo, Tokyo, Japan.
3. EPS Corporation, Tokyo, Japan.
4. International Drug Development Institute (IDDI), San Francisco, CA, USA.
5. CluePoints, Wayne, PA, USA.
6. Institut de Recherches Interdisciplinaires et de Développements en Intelligence Artificielle (IRIDIA), University of Brussels, Brussels, Belgium.
7. Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA), University of Louvain, Louvain-la-Neuve, Belgium.
8. International Drug Development Institute (IDDI), Louvain-la-Neuve, Belgium.
9. Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-BioStat), University of Hasselt, Hasselt, Belgium.
10. Department of Surgery, Jizankai Medical Foundation, Tsuboi Cancer Center Hospital, Koriyama, Japan.
11. Department of Surgical Oncology, Graduate School of Medicine, Gifu University, Gifu, Japan.
12. Epidemiological and Clinical Research Information Network (ECRIN), Okazaki, Japan.
13. Department of Biomedical Statistics and Bioinformatics, Graduate School of Medicine, Kyoto University, Kyoto, Japan.
14. Tokai Central Hospital, Kakamigahara, Japan.
15. Interfaculty Initiative in Information Studies, The University of Tokyo, Tokyo, Japan.
Abstract
BACKGROUND/AIMS: A risk-based approach to clinical research may include a central statistical assessment of data quality. We investigated the operating characteristics of unsupervised statistical monitoring aimed at detecting atypical data in multicenter experiments. The approach is premised on the assumption that, save for random fluctuations and natural variations, data coming from all centers should be comparable and statistically consistent. Unsupervised statistical monitoring consists of performing as many statistical tests as possible on all trial data, in order to detect centers whose data are inconsistent with data from other centers.
METHODS: We conducted simulations using data from a large multicenter trial conducted in Japan for patients with advanced gastric cancer. The actual trial data were contaminated in computer simulations for varying percentages of centers, percentages of patients modified within each center and numbers and types of modified variables. The unsupervised statistical monitoring software was run by a blinded team on the contaminated data sets, with the purpose of detecting the centers with contaminated data. The operating characteristics (sensitivity, specificity and Youden's J-index) were calculated for three detection methods: one using the p-values of individual statistical tests after adjustment for multiplicity, one using a summary of all p-values for a given center, called the Data Inconsistency Score, and one using both of these methods.
RESULTS: The operating characteristics of the three methods were satisfactory in situations of data contamination likely to occur in practice, specifically when a single or a few centers were contaminated. As expected, the sensitivity increased for increasing proportions of patients and increasing numbers of variables contaminated. The three methods showed a specificity better than 93% in all scenarios of contamination. The method based on the Data Inconsistency Score and individual p-values adjusted for multiplicity generally had slightly higher sensitivity at the expense of a slightly lower specificity.
CONCLUSIONS: The use of brute force (a computer-intensive approach that generates large numbers of statistical tests) is an effective way to check data quality in multicenter clinical trials. It can provide a cost-effective complement to other data-management and monitoring techniques.
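The operating characteristics named in the abstract can be illustrated with a minimal sketch. It assumes that each simulated data set yields a known set of contaminated centers and a set of centers flagged by a detection method. The function name, argument names and the toy center numbers are hypothetical and do not come from the study or its software. Youden's J-index is computed with its standard definition, sensitivity + specificity - 1.

```python
from typing import Iterable, NamedTuple


class OperatingCharacteristics(NamedTuple):
    sensitivity: float
    specificity: float
    youden_j: float


def operating_characteristics(
    all_centers: Iterable[int],
    contaminated: Iterable[int],
    flagged: Iterable[int],
) -> OperatingCharacteristics:
    """Compare flagged centers with truly contaminated centers.

    Hypothetical helper for illustration only; not the study's code.
    """
    centers = set(all_centers)
    truth = set(contaminated) & centers
    detected = set(flagged) & centers

    true_pos = len(truth & detected)             # contaminated and flagged
    false_neg = len(truth - detected)            # contaminated but missed
    false_pos = len(detected - truth)            # clean but flagged
    true_neg = len(centers - truth - detected)   # clean and not flagged

    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one contaminated and one clean center")

    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return OperatingCharacteristics(
        sensitivity, specificity, sensitivity + specificity - 1.0
    )


# Toy example with made-up numbers: 20 centers, 4 contaminated,
# 3 of them detected, and 1 clean center flagged by mistake.
result = operating_characteristics(
    all_centers=range(1, 21),
    contaminated=[1, 2, 3, 4],
    flagged=[1, 2, 3, 7],
)
print(result)
```

In this toy case the sensitivity is 3/4, the specificity is 15/16 and Youden's J-index is 0.6875. In a simulation study such values would typically be averaged over many contaminated data sets per scenario.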
Keywords:
Data quality; central statistical monitoring; fraud detection; operating characteristics; risk-based monitoring; simulations