Discrepancy review: a feasibility study of a novel peer review intervention to reduce undisclosed discrepancies between registrations and publications.

Abstract

Undisclosed discrepancies often exist between study registrations and their associated publications. Discrepancies can increase risk of bias, and when undisclosed, they disguise this increased risk of bias from readers. To remedy this issue, we developed an intervention called discrepancy review. We provided journals with peer reviewers specifically assigned to check for undisclosed discrepancies between registrations and manuscripts submitted to journals. We performed discrepancy review on 18 manuscripts submitted to Nicotine and Tobacco Research and three manuscripts submitted to the European Journal of Personality. We iteratively refined the discrepancy review process based on feedback from discrepancy reviewers, editors and authors. Authors addressed the majority of discrepancy reviewer comments, and there was no opposition to running a trial from authors, editors or discrepancy reviewers. Outcome measures for a trial of discrepancy review could include the presence of primary or secondary outcome discrepancies, whether publications that are not the primary report from a clinical trial registration are clearly described as such, whether registrations are permanent, and an overarching subjective assessment of the impact of discrepancies in published articles. We found that discrepancy review could feasibly be introduced as a regular practice at some journals interested in this process. A full trial of discrepancy review would be needed to evaluate its impact on reducing undisclosed discrepancies.

Keywords: meta-research; outcome switching; peer review; pre-registration; selective reporting; trial registration

Year: 2022 PMID: 35911195 PMCID: PMC9326291 DOI: 10.1098/rsos.220142

Source DB: PubMed Journal: R Soc Open Sci ISSN: 2054-5703 Impact factor: 3.653

Introduction

Prospective registration of scientific studies serves several purposes and has become standard practice for clinical trials [1]. Other research disciplines have followed suit and created their own implementations of prospective registration, often called pre-registration. These disciplines include psychology, economics and other social sciences; systematic reviews and meta-analyses; and to some extent, preclinical animal research and observational health research. Across these disciplines, registration aims to reduce bias, increase transparency and calibrate confidence in research findings [1,2]. For example, registration can mitigate or disclose activities that increase risk of bias, including selective reporting, outcome switching and data dredging. In practice, published studies often contain undisclosed discrepancies in relation to their registration. Meta-analyses of more than 6000 publications and their associated registrations, mostly in clinical medicine, estimated that 29–37% (95% CI) of registered studies have at least one primary outcome discrepancy and 50–75% (95% CI) have at least one secondary outcome discrepancy (the associated 95% prediction intervals are 10–68% and 13–95%, respectively) [3]. Undisclosed discrepancies are also common in sample size, exclusion criteria and statistical analysis in psychology research [4] and in hypotheses in economics research [5]. While a publicly available registration allows readers to check whether the outcomes reported in a publication match those that were initially planned, it would be unrealistic to expect every reader to examine this information. Indeed, one survey suggested that only about a third of clinical trial peer reviewers examine registrations as part of their review [6]. Moreover, if a reader finds discrepancies between a registration and publication, there is no clear avenue for how to share this information. 
One study systematically sent 58 letters to the editor that described undisclosed discrepancies in published manuscripts and found that only three of five top medical journals were willing to publish these letters [7]. A publication workflow that addresses reporting issues before publication would be more desirable than relying on individual readers' scrutiny and correction efforts. One trial attempted to reduce discrepancies at medical journals by sending peer reviewers information about the registration associated with the manuscript they are reviewing (protocol available at [8]). This yet-to-be published study leaves the decision for how to use this information to the reviewers and editors. Here, we propose a novel intervention—discrepancy review—to improve transparent reporting between registrations and their final manuscript. Our intervention provides journals with peer reviewers specifically assigned to check for both outcome and non-outcome discrepancies and asks them to prepare an itemized list of constructive recommendations to manuscript authors for how to reduce or disclose discrepancies between their registration and submitted manuscript. Importantly, the goal of discrepancy review is to encourage transparent reporting, not to inhibit publication of manuscripts with discrepancies. In many cases, discrepancies are entirely justifiable and can be necessary to improve a study design or analysis.

Objectives

We pre-registered three overarching objectives, but no hypotheses. Our objectives were to (1a) evaluate the feasibility of incorporating discrepancy review as a regular practice at scientific journals; (1b) evaluate the feasibility of conducting a trial on discrepancy review; (2) explore the benefits and effort required to incorporate discrepancy review as a regular practice at scientific journals; and (3) refine the discrepancy review process.

Terminology

We use the term registration throughout this manuscript to refer to time-stamped documents stored in a permanent and publicly accessible repository (e.g. clinicaltrials.gov, osf.io/registries) that contain information about a study (e.g. design, outcome measures and analyses). Registrations can be prospective (posted before participant enrolment) or retrospective (after participant enrolment begins). Another term for prospective registration is pre-registration. While these terms are sometimes used interchangeably, clinical trials often use the term prospective registration, and various other disciplines often use the term pre-registration. Clinical trial registrations differ from registration on platforms such as the Open Science Framework (OSF) in their history, the functions they were designed to serve, and implementation details (e.g. clinical trial registries do not include sections dedicated to hypotheses or analysis plans). Because of these differences, we try not to conflate clinical trial registration with pre-registration more broadly and mostly use the term registration, which encompasses both and makes no assumption regarding the timing of registration. We use the term discrepancy to refer to any incongruity between the content of a manuscript and its associated registration. We use the term discrepancy review to encompass the entire process we developed to check for and report on discrepancies, and discrepancy report for the written report that comes from discrepancy review and is shared with the action editor.

Methods

There are several minor discrepancies between this manuscript and the pre-registered protocol, which are outlined in the electronic supplementary material A.

Journal recruitment

Via email, we invited the editors-in-chief of 18 journals in medicine and psychology to participate. We selected 11 journals from the list that offer pre-registered badges (cos.io/initiatives/badges) as well as seven additional journals our team thought might be interested in participating. We selected journals based on previous contact with editors-in-chief, a presumption of their interest in open science, and coverage of disciplines that the project lead (RTT) is familiar with.

Discrepancy reviewers

Discrepancy reviewers included the project lead, a frequent collaborator and three team members recruited from a call on Twitter and selected for their experience with registration and range of career stages. While all the reviewers have experience with meta-research and registration, as well as general knowledge in neuroscience or psychology, they did not have domain expertise in the topics of personality psychology and nicotine/tobacco research which they reviewed. At the time of reviewing, their career stages included associate professor (GN, n discrepancy reviews performed = 3), assistant professor (CRP, n = 10), postdoctoral researcher (RTT, n = 21; TEH, n = 5) and PhD student (AO, n = 3).

Original discrepancy review

Before starting this feasibility study, we developed a systematic peer review procedure—discrepancy review—to detect discrepancies between registrations and submitted manuscripts across the 18 dimensions listed in table 1. To perform discrepancy review, a team member would complete a survey regarding how these 18 dimensions were reported in the registration,[1] repeat the process using the submitted manuscript and compare the output of the surveys. They would identify the presence of discrepancies across each dimension as well as whether the discrepancies were disclosed, and provide a subjective assessment of whether each discrepancy presents an issue that they considered negligible, minor or major. The reviewer would then prepare a discrepancy report that itemizes the discrepancies and provide recommendations for how the authors can address them. The reviewer would submit this report to the action editor, who then sends it to the manuscript authors.
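The comparison step described above can be sketched in code. This is a minimal illustration only, not the extraction form used in the study: the dictionary layout, the function name `find_discrepancies` and the example values are all assumptions introduced here. Only the idea of coding each dimension once for the registration and once for the manuscript, and then comparing the two, comes from the text.

```python
def find_discrepancies(registration, manuscript):
    """Return the dimensions whose content differs between the two documents.

    Both arguments map a dimension name to the text extracted for it, or to
    None when the dimension is not reported (hypothetical layout).
    """
    flagged = {}
    for dimension in sorted(set(registration) | set(manuscript)):
        reg = registration.get(dimension)
        man = manuscript.get(dimension)
        if reg is None and man is None:
            continue  # reported in neither document: treated as not applicable
        if reg != man:
            flagged[dimension] = {"registration": reg, "manuscript": man}
    return flagged


# Hypothetical example values, not data from the study.
registration = {
    "outcomes (primary)": "abstinence at 6 months",
    "sample size": "200",
    "blinding": None,
}
manuscript = {
    "outcomes (primary)": "abstinence at 3 months",
    "sample size": "200",
    "blinding": None,
}

flagged = find_discrepancies(registration, manuscript)
print(flagged)
```

The exact string comparison is a placeholder. In the process described here, a human reviewer decides whether two descriptions differ, whether the difference is disclosed and how serious it is.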

Table 1

dimension	registration			manuscript			discrepancy				issues
dimension	0	1	2	0	1	2	n.a.	no	yes	disclosed	n.a.	none	negligible	minor	major
hypotheses (primary)	0	4	6	2	5	3	0	5	5	1	0	5	0	1	4
hypotheses (secondary)	7	3	0	8	2	0	6	0	4	0	6	1	0	2	1
independent variables	0	2	8	0	2	8	0	9	1	0	0	7	2	0	1
covariates/moderators	7	2	1	6	2	2	6	2	2	0	4	3	1	1	1
outcomes (primary)	0	3	7	0	2	8	0	5	5	0	0	4	0	2	4
outcomes (secondary)	4	4	2	6	2	2	3	2	5	0	3	2	0	1	4
sample size	1	1	8	1	0	9	1	2	7	0	1	2	3	1	3
sample size justification	6	1	3	10	0	0	6	0	4	0	6	0	0	2	2
participant eligibility	3	0	7	2	0	8	1	3	6	0	0	4	4	2	0
data exclusion	6	3	1	7	1	2	6	1	3	0	4	0	3	3	0
missing data	8	1	1	5	1	4	5	1	4	1	1	1	7	1	0
randomization	5	3	2	5	1	4	5	5	0	0	5	3	2	0	0
blinding	6	2	2	8	1	1	6	1	3	0	5	1	1	2	1
preprocessing	10	0	0	9	1	0	9	0	1	0	7	1	2	0	0
analysis (primary)	2	5	3	0	4	6	0	1	9	0	0	2	0	5	3
statistical assumptions	8	2	0	9	1	0	7	0	3	0	5	0	3	2	0
inferential criteria	6	0	4	7	1	2	5	2	3	0	4	1	4	0	1
analysis (secondary)	4	5	1	8	1	1	4	0	6	1	4	2	1	3	0

^a After we performed original discrepancy review on the first manuscript we received, we made changes to the extraction form. Thus, we exclude this manuscript from table 1.

^b Clinical trials will often have a ‘main publication’ that reports on the main questions a clinical trial is trying to answer. There will oftentimes be additional publications that use data from a clinical trial, but report analyses unrelated to the main purpose of the trial. In this manuscript we refer to these as ‘secondary publications associated with a clinical trial’.

Reporting and discrepancies across the 18 dimensions included in the original discrepancy review process. Reporting was coded as 0 (not reported), 1 (reported, but unclear) or 2 (reported clearly). This table presents data from the first coder for the 10 manuscripts^a that underwent the original discrepancy review process and were not a secondary publication associated with a clinical trial registration.^b Electronic supplementary material C operationalizes each dimension. Electronic supplementary material, table B2 contains the same data from a second coder on the subset of manuscripts where the second coder also performed the original discrepancy review process. We did not attempt to resolve differences between coders. Inter-rater agreement is available in the electronic supplementary material, table B3. Issues were considered n.a. if the dimension was irrelevant to the particular study (e.g. if neither the registration nor manuscript had secondary hypotheses). Issues were coded as present for some dimensions due to a lack of transparency rather than a discrepancy.

Updated discrepancy review

Due to shortcomings in the original discrepancy review process (see §3.2), which we used on 16 manuscripts, we developed and switched to a simplified process (summarized in box 1). Whereas the original discrepancy review process used a detailed and structured checklist, the updated discrepancy review process had a semi-structured format with guiding questions. These changes aimed to reduce the time required to perform discrepancy review, while still identifying all important discrepancies.

Box 1

Does the manuscript present only exploratory outcomes and analyses?
Is the registration properly registered (i.e. permanent and public)?
Does the registration date suggest that the timing of registration may be retrospective?
Are there discrepancies in the number, content or prioritization of hypotheses?
Do the study arms, independent variables, exposure variables or study/experimental grouping match?*
Are there discrepancies in the number, content or prioritization of outcome measures?
Do the analyses match?*
While answering the previous questions, if you identified notable additional discrepancies or questionable research practices you may raise them. These could include sample size, control variables, covariates or moderators, eligibility criteria, analytic decisions (e.g. outlier definition), randomization, blinding and data preprocessing, among others.*

*For these items, reviewers were instructed to not spend time looking for minor discrepancies.

Electronic supplementary material E contains the exact instructions for the updated discrepancy review process.

Discrepancy reports

Two team members independently prepared a written discrepancy report for each manuscript. Only the first report was sent to the action editors. We used the second report to examine consistencies and differences among discrepancy reviewer reports. For eight manuscripts, both the first and second discrepancy reports were written after performing the original discrepancy review process; for eight manuscripts, the first report was based on the original discrepancy review process and the second report was based on the updated discrepancy review process; for five manuscripts, both reports were based on the updated discrepancy review process.

Questionnaires

We sent an optional questionnaire to action editors and manuscript authors to help us refine the discrepancy review process and understand it from their perspective. The questionnaire had 16 questions. These included whether editors and authors would use discrepancy review again, how much time it took them to implement discrepancy review, whether the added benefit of discrepancy review was worth the effort, and open-ended questions about likes, dislikes and other thoughts regarding discrepancy review. The questionnaires are provided in the electronic supplementary material H and I.

Results

Of the 18 journals we contacted, five agreed to participate, two were interested in participating but did not begin sending us manuscripts by the time we concluded data collection, six declined to participate and five did not respond. Of the five journals that agreed to participate, two did not receive any manuscripts reporting a registration during the study period, and one had difficulty adding discrepancy review to their manuscript handling procedures and therefore did not send us any manuscripts to review. Of the journals that declined to participate, three expressed interest but were currently too busy, one stated they receive too few manuscripts reporting a registration and believed the process would be complicated to set-up, one stated they were uncomfortable using their journal for research purposes and one stated their editors already check for discrepancies rendering their journal ill-suited for the present study. All invitations were sent within the first year of the COVID-19 pandemic, which may have impacted the willingness of journals to participate. We reviewed 18 manuscripts submitted to Nicotine and Tobacco Research over a period of three months and three manuscripts submitted to the European Journal of Personality over a period of five months. When assigning manuscripts to peer review, the action editors at Nicotine and Tobacco Research additionally invited our team to perform discrepancy review. These reviews were submitted through an online submission portal (ScholarOne) within three weeks of invitation, as done for regular peer reviews. One of our team members (MRM) is the editor-in-chief of Nicotine and Tobacco Research and served as action editor on two manuscripts included in this study, but did not perform any of the discrepancy reviews. The editor-in-chief of the European Journal of Personality invited our team to review manuscripts on which he himself served as the action editor. 
For manuscripts that passed an initial round of review at the European Journal of Personality, the editor-in-chief shared the manuscript with our team and informed authors that their manuscript would soon undergo discrepancy review. We conducted discrepancy review within a week and the editor-in-chief sent our review to the manuscript authors in an email separate from the standard peer reviewer comments and editorial decision. We suggested that journals inform manuscript authors about the study in one of four ways: (i) an opt-in button on the journal's manuscript submission page; (ii) an opt-out button on the journal's manuscript submission page; (iii) through an automated email following manuscript submission or (iv) in the email containing the initial editorial decision. One journal chose Option (i) and had an opt-in rate of 79%[2] over four months. The opt-in text was relatively vague and stated that the journal had partnered with researchers to improve peer review, but did not mention anything about registration in particular.[3] Due to difficulties setting up the opt-in button, editors forgetting to flag manuscripts that both opted-in and reported a registration, and a limited number of manuscripts that met both these criteria, we did not receive invitations to review manuscripts from this journal. One participating journal selected Option (iv), informing authors in the same email as the initial editorial decision and standard peer review comments. The other participating journal selected a modified version of Option (iii), where they manually sent an email to authors after an initial editorial decision to revise and resubmit, but before discrepancy review. These emails informed authors that they could withdraw the data collected based on their manuscript. No author requested their data be withdrawn. 
We planned to use discrepancy review for two purposes: to provide the basis for writing discrepancy reports (the intervention) and to evaluate the presence of discrepancies in published manuscripts (for outcome assessment). We found our original implementation of discrepancy review fit for neither purpose, mainly because registrations were less comprehensive and less precise than we anticipated. Electronic supplementary material, table B1 lists additional shortcomings in the original discrepancy review process and how we addressed them. Nonetheless, the data collected from the original discrepancy review process, which we used as the basis for writing discrepancy reports, provided several insights (table 1): many dimensions were rarely included in registrations and manuscripts, such as sample size justifications, plans for data exclusion, secondary hypotheses and secondary analyses; discrepancies were common across many dimensions; author disclosure of discrepancies was very rare; many discrepancies were judged by reviewers to be negligible (e.g. when the registration did not indicate the inferential criteria or how the researcher would test for violation of statistical assumptions but the manuscript did); and most discrepancies judged to be major issues were in hypotheses, outcomes, sample size and analyses. Based on these findings, we developed an updated discrepancy review process and revised our outcome measures.
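The statement that major issues cluster in hypotheses, outcomes, sample size and analyses can be checked against the ‘major’ column of table 1. The short sketch below copies those counts (first coder, 10 manuscripts) and ranks the dimensions. The cut-off of three or more major issues is an assumption chosen here for illustration, not a threshold used in the study.

```python
# Counts of issues judged "major", copied from the last column of table 1.
major_issues = {
    "hypotheses (primary)": 4,
    "hypotheses (secondary)": 1,
    "independent variables": 1,
    "covariates/moderators": 1,
    "outcomes (primary)": 4,
    "outcomes (secondary)": 4,
    "sample size": 3,
    "sample size justification": 2,
    "participant eligibility": 0,
    "data exclusion": 0,
    "missing data": 0,
    "randomization": 0,
    "blinding": 1,
    "preprocessing": 0,
    "analysis (primary)": 3,
    "statistical assumptions": 0,
    "inferential criteria": 1,
    "analysis (secondary)": 0,
}

total = sum(major_issues.values())

# sorted() is stable, so ties keep the row order of table 1.
ranked = sorted(major_issues.items(), key=lambda item: item[1], reverse=True)

# Illustrative cut-off (an assumption): dimensions with three or more major issues.
top = [name for name, count in ranked if count >= 3]
share = sum(major_issues[name] for name in top) / total

print(top)
print(f"{share:.0%} of major issues fall in these dimensions")
```

Under this cut-off, five dimensions (primary hypotheses, primary and secondary outcomes, sample size and primary analysis) account for 18 of the 25 major issues coded by the first reviewer.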

Inter-rater agreement: original discrepancy review

Inter-rater agreement was low across many of the 18 dimensions coded in the original discrepancy review process (inter-rater comparisons of the written discrepancy reports are provided in §3.4 of this manuscript). In particular, inter-rater agreement was low for judgements of discrepancies as negligible, minor or major (see electronic supplementary material, table B2 for second coder results and electronic supplementary material, table B3 for inter-rater agreement). The original discrepancy review process allowed for subjectivity. For example, the survey asked reviewers to state whether each of the 18 dimensions was reported ‘sufficiently’ for the reviewer to replicate the study if they had the expertise.[4] We also asked reviewers to judge whether a discrepancy or lack of transparency for each dimension presented a negligible, minor or major issue.[5] Whereas some reviewers made use of ‘negligible’ for issues of transparency (where an item was reported in neither the registration nor the manuscript), other reviewers often considered these minor issues. Some reviewers also used the option n.a. (not applicable) more frequently than others, in turn lowering inter-rater agreement. Taken together, we concluded that our original implementation of discrepancy review could not reliably assess whether discrepancies were present, or the severity of their impact. Discrepancy reviewers reported preferring the updated process but noted that the less structured methodology required greater focus. The updated discrepancy review process took less time than the original process—clinical trial registrations median and range: 28 (10, 60) min (n reviews = 8) versus 105 (16, 180) min (n = 16); OSF registrations: 50 (20, 92) min (n = 7) versus 210 (90, 360) min (n = 7); PROSPERO registrations: 43 (12, 50) min (n = 3) versus 90 min (n = 1) (table 2). We feel updated discrepancy review is more feasible for journals to implement as a standard practice. 
For two of five manuscripts that underwent updated discrepancy review, and for one of 16 manuscripts that underwent original discrepancy review, the editors invited the discrepancy reviewer for a second round of review to ensure their comments were addressed, as is common for traditional peer review. The times reported are only for the first round of discrepancy review.
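As a rough illustration of the time saving, the reported medians can be turned into a relative reduction per registry type. The sketch below uses only the medians reported in the text and table 2; the variable names are assumptions. The figures are descriptive only, because the groups are small (the original process was timed on a single PROSPERO registration) and the two processes were not always applied to the same manuscripts.

```python
# Median minutes per review, as reported in the text and table 2: (original, updated).
median_minutes = {
    "clinical trial": (105, 28),
    "OSF": (210, 50),
    "PROSPERO": (90, 43),
}

# Relative reduction in median review time when moving to the updated process.
reduction = {
    registry: 1 - updated / original
    for registry, (original, updated) in median_minutes.items()
}

for registry, value in reduction.items():
    print(f"{registry}: median time reduced by {value:.0%}")
```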

Table 2

Characteristics and outcomes of the manuscripts reviewed.

	clinical trial	OSF	PROSPERO
total reviewed	12	7	2
original discrepancy review performed (submitted to editor)	10	5	1
original discrepancy review performed (second reviewer)	6	2	0
time for original discrepancy review, median and range	105 (16, 180) min	210 (90, 360) min	90 min
updated discrepancy review performed (submitted to editor)	2	2	1
updated discrepancy review performed (second reviewer)	6	5	2
time for updated discrepancy review, median and range	28 (10, 60) min	50 (20, 92) min	43 (12, 50) min
non-permanent registrations	0	4^c	0
submitted manuscript meets criteria for pre-registered badge	0^b	0	0
manuscripts correctly labelled as a secondary publication	0/5	0/0	0/0
importance of addressing discrepancies (submitted manuscripts)^a
quite important	3/7	1/7	1/2
somewhat important	4/7	3/7	1/2
not important or no discrepancies	0/7	3/7	0/2
accepted for publication	6	6^d	0
non-permanent registrations	0	2^c	n.a.
manuscripts correctly labelled as a secondary publication	3/4	0/0	n.a.
importance of addressing discrepancies (accepted manuscripts)^a
quite important	1/2	0/5	n.a.
somewhat important	0/2	0/5	n.a.
not important or no discrepancies	1/2	5/5	n.a.
one or more undisclosed primary outcome discrepancy	0/2	not assessed	n.a.
one or more undisclosed secondary outcome discrepancy	2/2	not assessed	n.a.

^a We only assessed the importance of addressing discrepancies for manuscripts that were not secondary publications associated with a clinical trial registration.

^b All the manuscripts associated with clinical trial registrations that we reviewed were submitted to Nicotine and Tobacco Research. Although this journal does not offer pre-registered badges, we nonetheless checked whether manuscripts met the badge criteria.

^c In addition to these numbers, two submitted manuscripts and one accepted manuscript had permanent registrations on the OSF REGISTRIES webpage, but only included a link to a non-permanent version on the OSF HOME website.

^d One registration was very difficult to map onto its associated manuscript (e.g. the registration contained over 1000 words in the hypotheses section) and the two coders rated their confidence in their ratings as ‘hardly more confident than chance’ and ‘no more confident than chance’. Thus, we do not provide an assessment of the importance of addressing discrepancies for this study.

Consistency of written discrepancy reports among reviewers

We qualitatively compared each pair of written discrepancy reports and found that their content was relatively similar in regard to substantial discrepancies, but less similar for minor concerns. These were comparisons of the written text in the discrepancy reports and are distinct from the inter-rater agreement we calculated across the 18 dimensions in the original discrepancy review process and the inter-rater agreement of potential outcome measures presented in §3.6. Regardless of whether reviewers used the original or updated discrepancy review process, their reports were relatively consistent in identifying non-permanent registrations (e.g. study protocols that were publicly shared on an OSF project page, but not formally registered);[6] manuscripts that were not the main paper related to a clinical trial registration; substantial discrepancies in the outcome measures of studies that used clinical trial registrations and substantial discrepancies in the hypotheses of studies using OSF registrations. Three pairs of reports differed considerably. One reviewer reported rushing two reviews and agreed with the presence of major issues when the other reviewer presented them. The third pair of reports was in regard to the first manuscript we received. It was the first manuscript reviewed, but the last manuscript to receive a second review—by which point we had substantially refined how we performed discrepancy review. Reports often differed in the minor issues they presented, which may have arisen due to the latitude reviewers had in what to report. Reports based on the updated discrepancy review process mentioned fewer minor issues. The style and wording of reports differed somewhat between reviewers. In summary, although inter-rater agreement was low across the specific 18 dimensions coded in the original discrepancy review process, the written reports had comparable overall conclusions.

Editor and author feedback

Five of the 13 action editors involved in our study responded to an optional questionnaire in regard to one of the manuscripts they handled. All five editors responded that they would use discrepancy review again and that it would be possible to implement discrepancy review for all manuscripts submitted to their journal if they were provided with discrepancy reviewers. To limit the impact our study had on the publication process beyond transparent reporting, we specifically asked editors to not allow discrepancy review to influence their editorial decision. We found that four of the five editors reported that discrepancy review did not influence their editorial decision and the fifth reported it encouraged acceptance. Three editors rejected the manuscripts they received and two asked for revisions. Our questionnaire prompted these two editors who asked for revisions with additional questions, such as the time it took them to implement discrepancy review. They reported taking 6 min and 30 min and that it was worth their time. Authors from 4 of the 21 manuscripts responded to an optional questionnaire. They all agreed that they would like their next manuscript to undergo discrepancy review and were positive about discrepancy review in their open-ended responses. They reported taking 30, 90 and 180 min to address the discrepancy reviewer comments (one author did not answer this question). Responses from both authors and editors were mixed regarding whether discrepancy review should be implemented before (n = 2), during (n = 3) or after (n = 2) standard peer review (one author responded ‘unsure’ and another author did not respond). The questionnaire included questions about the value of discrepancy review as well as an optional open-ended question asking what they disliked about discrepancy review. Two authors but no editors answered the question about disliking discrepancy review. 
One author disliked that discrepancy review increased the overall word count and another disliked that the discrepancy review recommended including information regarding missing data when this information was already reported in the manuscript. Across the questionnaires, no response gave a reason against conducting a trial of discrepancy review (e.g. unwillingness to use discrepancy review, excess time requirements or general dissatisfaction with the process).

Updated outcomes

Our pre-registered outcome measure of discrepancies across each of the 18 dimensions was intended to answer two overarching questions: (i) whether authors address comments from discrepancy reviewers and (ii) whether a trial of discrepancy review could use this outcome measure. Based on information gained while performing this feasibility study, we revised our outcome measures to better answer these same questions.

Procedural outcome

Our first revised outcome measure is whether published manuscripts addressed the specific recommendations in the discrepancy reports. For the first few discrepancy reports we prepared, each recommendation was directly linked to one of the 18 dimensions. However, we found this way of writing discrepancy reports cumbersome and switched to writing itemized comments that were not necessarily directly linked to one of the 18 dimensions. Thus, we assessed whether authors addressed the itemized recommendations of the discrepancy reports, but did not subdivide them by the 18 dimensions. The discrepancy reports contained between 1 and 10 comments (mean = 4.8, s.d. = 2.7, median = 5, IQR = 3–7). Of the 59 comments in the 12 manuscripts that were published, 31 were fully addressed, 10 were partially addressed, 17 were not addressed and 1 presented issues that the discrepancy reviewer later noticed did not need to be addressed. We categorized these 12 manuscripts into four bins: six addressed all or nearly all comments, three largely ignored discrepancy review and addressed very few comments (accounting for 13 of the 17 comments that were not addressed), one addressed some comments and two were secondary publications associated with a clinical trial registration but did not explicitly state that some aspects of their study were not registered. One manuscript did not address any comments, probably due to an editorial misunderstanding.[7] We coded several comments as partially addressed because the text in the published manuscript was imprecise. Inviting the discrepancy reviewers to review revised manuscripts may ensure that comments are more precisely addressed. In short, these data show that revised manuscripts often take into account the comments from discrepancy reviews. We do not plan to use this outcome measure for a trial; rather, we would recommend comparing endpoints between the control and experimental groups. Nonetheless, this outcome measure provides evidence that discrepancy review has the potential to impact reporting quality.

Potential trial outcome measures

We selected several other revised outcome measures that could feasibly be used in a trial. These include whether published manuscripts (i) are properly registered, (ii) are transparently identified as secondary publications associated with a clinical trial registration, when relevant, (iii) contain at least one primary outcome discrepancy, (iv) contain at least one secondary outcome discrepancy, and (v) are assessed to have discrepancies that are important to address (table 2). Inter-rater agreement varied greatly across these potential trial outcomes. Coders had perfect agreement (Cohen's κ = 1.00) on whether there were issues in the registrations for submitted manuscripts (where 4/21 had issues) and had one disagreement (Cohen's κ = 0.75) for published manuscripts (where 2/12 had issues). Coders agreed when identifying whether manuscripts were a secondary publication associated with a clinical trial registration for 18/21 submitted manuscripts (Cohen's κ = 0.67) and 11/12 published manuscripts (Cohen's κ = 0.8). Of the six published manuscripts linked with clinical trial registrations, only two were the main report of the clinical trial. Coders agreed that neither had a primary outcome discrepancy and that both had at least one secondary outcome discrepancy. For submitted manuscripts, coders' subjective assessments of whether discrepancies were ‘quite important’, ‘somewhat important’ or ‘not important’ to address[8] matched in five cases, were off by one category in seven cases and were off by two categories in one case (Cohen's κ = 0.05). For published manuscripts, they matched in four cases, were off by one category in one case and were off by two categories in two cases (Cohen's κ = −0.21). Inter-rater agreement was low for these subjective assessments, but differences in coding were easily resolved through discussion because one coder would often share information that the other coder had missed. In general, the coder who stated discrepancies were more important to address shifted the other coder's rating. More precise instructions for these subjective assessments may improve inter-rater agreement to the point where they could be reliably used as an outcome measure in a trial.
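The agreement statistics reported here are Cohen's κ values. As a minimal sketch of how such values arise, the following Python function computes κ from two coders' categorical ratings. The function name and the rating vectors are illustrative assumptions, not the authors' code or data: the first pair encodes perfect agreement with 4 of 21 registrations flagged, and the second is one hypothetical configuration with a single disagreement across 12 manuscripts.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two coders rating the same items (illustrative helper)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(ratings_a)
    # Observed agreement: share of items given the same category by both coders.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the coders' marginal proportions, summed over categories.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1.0 - expected)


# Assumed data: both coders flag the same 4 of 21 registrations as having issues.
perfect_a = [True] * 4 + [False] * 17
perfect_b = list(perfect_a)

# Hypothetical data: 12 manuscripts, with the coders disagreeing on exactly one.
one_off_a = [True] * 2 + [False] * 10
one_off_b = [True] * 3 + [False] * 9

kappa_perfect = cohens_kappa(perfect_a, perfect_b)
kappa_one_off = cohens_kappa(one_off_a, one_off_b)
print(round(kappa_perfect, 2), round(kappa_one_off, 2))
```

With these assumed inputs the first pair gives κ = 1.00 and the second gives κ = 0.75, which is consistent with the values reported for registration issues. Because κ corrects for chance agreement, a single disagreement lowers it substantially when one category is rare.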

Discussion

We found that (i) when provided with discrepancy reviewers, some journals can incorporate discrepancy review into their manuscript handling procedures; (ii) registrations were less precise and less comprehensive than our original discrepancy review process was designed for; (iii) a semi-structured discrepancy review process was more feasible to implement than a rigidly structured and comprehensive implementation; (iv) authors addressed the majority of discrepancy reviewer comments; (v) clinical trial registrations and OSF registrations should be treated separately in a trial, as the former are generally more precise and the latter are generally more comprehensive; and (vi) a trial of discrepancy review appears feasible in terms of the procedure and outcome measures, particularly for clinical trial registrations, where primary and secondary outcomes are consistently and clearly demarcated.

The outcome measures used in a trial should depend on the types of registrations being reviewed. In contrast with OSF registrations, clinical trial registrations in our sample were always properly registered, but many manuscripts were secondary publications associated with a clinical trial registration. OSF registrations were generally less precise than clinical trial registrations, making it difficult to confidently claim an absence of primary or secondary outcome discrepancies. The lack of standardization in the content and precision of OSF registrations may also make the process of awarding pre-registered badges challenging. Box 2 outlines a potential design for a trial of discrepancy review.

Eligibility criteria. Manuscripts submitted to participating journals that include a clinical trial registration.
Design. Two-arm parallel-group randomized controlled trial, randomized at the level of submitted manuscripts within each journal.
Intervention. Updated discrepancy review process.
Primary outcome. Absence of undisclosed primary outcome discrepancies. If a manuscript is a secondary publication associated with a clinical trial registration and is clearly labelled as such, it will be considered to have no undisclosed primary outcome discrepancies. This binary outcome will be coded from manuscripts accepted for publication.
Other outcomes. The time to prepare discrepancy reports, to be used to inform future cost-effectiveness analyses.
Sample size and analysis. This trial could be approached with a power analysis or a precision analysis. For a power analysis, we would need to select a smallest effect size of interest. Determining this effect is difficult without knowing the cost that journals are willing to incur for discrepancy review and the cost of undisclosed discrepancies (e.g. in misdirected future research resources or reduced quality of patient care). A precision analysis, in contrast, remains agnostic regarding whether the effect is meaningful and could be used in follow-up cost-effectiveness analyses. Both power and precision analyses require assumptions about the proportion of manuscripts with discrepancies in both the control and experimental groups. For the control group, we assume a base rate of 33% of manuscripts with at least one primary outcome discrepancy, which is the point estimate from a relevant meta-analysis [3]. For the experimental group, we take a best guess that about 10% of publications will have at least one primary outcome discrepancy. To detect this difference between groups, a Fisher's exact test would require 49 manuscripts per group (α = 0.05, power = 0.80). Alternatively, estimating the difference between the two groups within a range of 20% (e.g. to show the experimental group has between 10% and 30% fewer manuscripts with primary outcome discrepancies) would require 120 manuscripts per group (95% confidence interval).[9]

Several questions remain unanswered.
First, we have not established the degree to which general research knowledge and domain expertise (which the discrepancy reviewers did not have in relation to the manuscripts they reviewed) facilitate or improve discrepancy review. Second, our study was not designed to identify elements of discrepancy review that could be automated (e.g. with a standard email asking authors to ensure accurate reporting). Third, our study did not evaluate the feasibility of randomizing manuscripts to the discrepancy review intervention, which could pose a challenge when conducting a trial. Fourth, it remains unknown whether editors who received discrepancy reviews will become more aware of these issues and check for discrepancies in future manuscripts. Fifth, we did not assess how easy it would be to recruit discrepancy reviewers outside the context of a trial and whether journals are willing to incur those potential costs. Additionally, the respondents to our questionnaires were probably biased towards those who were more interested in discrepancy review.

Taken together, a trial of discrepancy review appears feasible if enough journals agree to participate. Such a trial may provide evidence for one potential intervention to improve reporting quality. At the same time, this intervention cannot directly improve study design or the quality of registrations and may add to the workload of researchers and reviewers. Parallel efforts that aim to improve quality early on in the research pipeline (e.g. pre-study peer review) could prove complementary to interventions, such as discrepancy review, that occur after study completion.
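The sample size figures in the proposed trial design can be approximated with textbook normal-approximation formulas. The sketch below is a rough check under stated assumptions, not the authors' calculation: the source reports a Fisher's exact test, whereas this code uses the pooled-variance z-test formula for power and a Wald interval for precision, and both function names are hypothetical. It also assumes that "a range of 20%" means a total confidence interval width of 20 percentage points.

```python
import math
from statistics import NormalDist


def n_per_group_power(p_control, p_experimental, alpha=0.05, power=0.80):
    """Per-group n for a two-sided two-proportion z-test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_experimental) / 2
    # Standard deviation terms under the null (pooled) and under the alternative.
    null_sd = math.sqrt(2 * p_bar * (1 - p_bar))
    alt_sd = math.sqrt(
        p_control * (1 - p_control) + p_experimental * (1 - p_experimental)
    )
    delta = abs(p_control - p_experimental)
    return math.ceil(((z_alpha * null_sd + z_beta * alt_sd) / delta) ** 2)


def n_per_group_precision(p_control, p_experimental, total_width, confidence=0.95):
    """Per-group n so a Wald interval for the difference has the given total width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    variance_sum = (
        p_control * (1 - p_control) + p_experimental * (1 - p_experimental)
    )
    half_width = total_width / 2
    return math.ceil(variance_sum * (z / half_width) ** 2)


# Assumptions taken from the text: 33% control, 10% experimental.
n_power = n_per_group_power(0.33, 0.10)
n_precision = n_per_group_precision(0.33, 0.10, total_width=0.20)
print(n_power, n_precision)
```

Under these assumptions the approximations give 49 and 120 manuscripts per group, matching the reported figures. An exact test is typically somewhat more conservative than the z-test approximation, so a dedicated power routine would be the safer choice when planning an actual trial.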

