Cecilia Superchi, José Antonio González, Ivan Solà, Erik Cobo, Darko Hren, Isabelle Boutron.
Abstract
BACKGROUND: A strong need exists for a validated tool that clearly defines peer review report quality in biomedical research, as it will allow evaluating interventions aimed at improving the peer review process in well-performed trials. We aim to identify and describe existing tools for assessing the quality of peer review reports in biomedical research.
Keywords: Methods; Peer review; Quality control; Report; Systematic review
Year: 2019 PMID: 30841850 PMCID: PMC6402095 DOI: 10.1186/s12874-019-0688-x
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Definition of terms used in the present study
| Structured quality tool: scale or checklist including more than one item aimed at guiding the user to assess the overall quality of a peer review report. | |
| Unstructured quality tool: scale or checklist including only one item inquiring the overall quality of a peer review report. | |
| Items: elements of a scale or checklist representing a component of peer review report quality. Items in a scale may or may not have an attached numerical score. If there is no attached score, these items provide the evaluator with guidance to assess the overall quality of a peer review report. | |
| Overall quality score in a scale is measured as: |
Examples of definition of scoring system instructions
| Scoring system instructions | ||
|---|---|---|
| Defined | Partially defined | Not defined |
| 5 (Exceptional) = The rare outstanding critique that is comprehensive, objective, and insightful. Evaluates purpose of the study, study design, scientific validity, and conclusions by numbering questions and constructive suggestions to be addressed by the author. Includes comments to the editor about whether this is something new and important and useful to our readers. | 1 (Poor) = Does not follow reviewer guideline structure or preferred formatting in providing comments; unfavourable timeliness. | 1 = poor; |
Fig. 1 Study selection flow diagram
Main characteristics of the included tools
| Characteristics of tools | N (%) |
|---|---|
| Type of tool: | |
| Scale | 23 (96%) |
| Checklist | 1 (4%) |
| Number of items: | |
| 1 | 6 (25%) |
| > 1 | 18 (75%) |
| Weight of items a: | |
| Same weight | 10 (42%) |
| Different weight | 2 (8%) |
| User defined weight | 1 (4%) |
| Not applicable | 11 (46%)a |
| Score System Instruction: | |
| Defined | 5 (21%) |
| Partially defined | 3 (12%) |
| Not defined | 16 (67%) |
| Tool development: | |
| Reported | 1 (4%) |
| Not reported | 23 (96%) |
| Overall quality assessment b | |
| Single score | 6 (22%) |
| Summary score | 11 (41%) |
| Mean score | 6 (22%) |
| Sum score | 3 (11%) |
| Not reported | 1 (4%) |
a Item weight is not applicable for scales with a single item (n = 6), for the checklist (n = 1), and for scales that include more than one item without an attached numerical score and present only a summary score (n = 4)
b The total number is different because three tools presented more than one way to assess the overall quality and the checklist did not provide an overall score
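The percentages in the table can be re-derived from the counts. The following is a minimal sketch in Python, assuming each percentage is the count divided by the total of its own block and rounded to a whole number; the function and variable names are illustrative and do not come from the study.

```python
def percentages(counts):
    """Return each count as a whole-number percentage of the block total."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


# Counts copied from the table above.
type_of_tool = {"Scale": 23, "Checklist": 1}
overall_quality = {
    "Single score": 6,
    "Summary score": 11,
    "Mean score": 6,
    "Sum score": 3,
    "Not reported": 1,
}

# The first block totals the 24 included tools; the second totals 27,
# the larger denominator that footnote b accounts for.
print(sum(type_of_tool.values()), percentages(type_of_tool))
print(sum(overall_quality.values()), percentages(overall_quality))
```

Under this assumption the overall quality assessment block sums to 27 entries rather than 24, which is why its percentages differ from those of the other blocks.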
Descriptive characteristics of tools used to assess the quality of a peer review report
| Journal or Company Name a | First Author, Year | Format | Quality defined b | Overall quality assessment | Items (n) | Items weights c | Scoring range d | Scoring system instruction e | Scale/ Checklist Development f | Validity g | Reliability h | Internal consistency | RCTs i |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Advances in Nursing Science; Issues in Mental Health Nursing; The Journal of Holistic Nursing | Shattell 2010 [ | Scale | N | Summary Score | 6 | S | 1–10 | N | NR | NR | NR | NR | 0 |
| American Journal of Roentgenology | Friedman 1995 [ | Scale | N | Single Score | 1 | NA | 1–4 | N | NR | NR | NR | NR | 0 |
| American Journal of Roentgenology | Kliewer 2005 [ | Scale | N | Summary Score | 4 | NA | 1–4 | N | NR | NR | NR | NR | 0 |
| American Journal of Roentgenology | Rajesh 2013 [ | Scale | N | Single Score | 1 | NA | 1–4 | P | NR | NR | NR | NR | 0 |
| American Journal of Roentgenology | Berquist 2017 [ | Scale | N | Summary Score | 4 | NA | 0–4 | Y | NR | NR | NR | NR | 0 |
| Annals of Emergency Medicine | Callaham 1998 [ | Scale | N | Single Score | 1 | NA | 1–5 | N | NR | NR | Inter-Rater (ICC = 0.44, 0.24, 0.12) l | NR | 2 m |
| Annals of Emergency Medicine | Callaham 2002 [ | Scale | N | Summary Score | 6 | NA | 1–5 | N | NR | NR | Inter-Rater (ICC = 0.44, 0.24, 0.12) l | NR | 1 |
| Annals of Emergency Medicine; Annals of Internal Medicine; JAMA; Obstetrics & Gynecology and Ophthalmology | Justice 1998 [ | Scale | N | Summary Score | 4 | S | 1–5 | N | NR | NR | NR | NR | 0 |
| British Journal of General Practice | Moore 2014 [ | Scale | N | Single Score | 1 | NA | A-E | Y | NR | NR | NR | | 0 |
| British Medical Journal | Black 1998 (RQI 3.2) [ | Scale | N | Summary Score | 7 | S | 1–5 | N | Y | Face ( | Test-Retest | Internal Consistency (Cronbach’s alpha = 0.84) | 5 |
| | | | | Mean | | | | | | Content ( | Inter-Rater | | |
| British Medical Journal | Van Rooyen 1999 (RQI 4) [ | Scale | N | Mean n | 8 | S | 1–5 | N | NR | NR | Inter-Rater | | 2 |
| Chinese Journal of Tuberculosis and Respiratory Diseases | Yang 2009 [ | Checklist | N | NA | 5 | NA | NA | N | NR | NR | NR | | 0 |
| Journal of Clinical Investigation | Stossel 1985 [ | Scale | N | Single Score | 1 | NA | Good- | Y | NR | NR | NR | | 0 |
| Journal of General Internal Medicine | McNutt 1990 [ | Scale | N | Summary Score | 9 | S | 1–5 | N | NR | Construct | NR | | 1 |
| Journal of Vascular Interventional Radiology | Feurer 1994 [ | Scale | N | Sum | 7 | D | 0–14 | N | NR | Content ( | Inter-Rater | | 0 |
| NA | Review quality collector (RQC) 2012 [ | Scale | N | Mean | 4 | User-defined weights | 0–100 | N | NR | NR | NR | | 0 |
| Nursing Research | Henly 2009 [ | Scale | N | Mean (CAS, GAS scale) | 15 | S | 1–5 | P | NR | NR | Inter-Rater (ICC = 0.79) p | | 0 |
| | | | | Summary Score (OAS scale) | | | 1–5 | | | | | | |
| | | | | Summary Score (GRQ scale) | | | 0–100 | | | | | | |
| Nursing Research | Henly 2010 [ | Scale | N | Mean (CAS, GAR, SARNR scale) | 26 | S | 1–5 | P | NR | NR | Inter-Rater | | 0 |
| | | | | Summary Score (GRQ scale) | | | 0–100 | | | | | | |
| Obstetrics & Gynecology, Dutch Journal of Medicine | Landkroon 2006 [ | Scale | N | Summary Score | 5 | NA | 1–5 | Y | NR | NR | Test-Retest | | 0 |
| Pakistan Journal of Medical Sciences | Jawaid 2006 [ | Scale | N | NR q | 5 | S | 1–5 | N | NR | NR | NR | | 0 |
| Peerage of science | Peerage Essay Quality (PEQ) 2011 [ | Scale | N | Mean | 3 | S | 1–5 | N | NR | NR | NR | | 0 |
| Publons Academy | Review Rating and Feedback Form 2016 [ | Scale | N | Sum | 4 | S | 0–3 (Full score: 0–12) | N | NR | NR | NR | | 0 |
| The Journal of Bone and Joint Surgery | Thompson 2016 [ | Scale | N | Single Score | 1 | NA | 80–100 | Y | NR | NR | Inter-Rater | | 0 |
| The National Medical Journal of India | Das Sinha 1999 [ | Scale | N | Sum | 5 | D | 0–100 | N | NR | NR | NR | | 0 |
a Name of the journal or company/organization where the tool was used to assess the quality of peer review reports
b The quality of a peer review report is not clearly defined in any of the reports
c NA: Not applicable; S: Same weight for each item; D: Different weight for each item
d NA: Not applicable
e Y: Yes, defined; P: Partially defined; N: Not defined
f, g, h NR: Not reported
i Number of randomized controlled trials in which the tool was used as an outcome criterion
l The ICC was 0.44 for reviewers, 0.24 for editors, and 0.12 for manuscripts
m One article consists of two studies. The first study is not an RCT, while the second one is an RCT [55]
n The overall quality is based on the mean of the first seven items (the item about the tone of the review was not included)
o The inter-rater reliability was measured with weighted K for items 1 to 7 for two editors’ independent assessments
p The tool includes more than one scale. We reported inter-rater reliability only for the General Review Quality (GRQ) scale
q Not reported. Although the authors reported that the reviewers were rated as excellent, good and average based on the quality of the reviews, it is not reported how they assessed the overall quality of peer review reports
r ICC range for 11 manuscripts. There was one outlier manuscript that, if removed, brought the range to 0.87–0.99
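The "Mean" and "Sum" entries in the overall quality assessment column differ only in how item scores are combined. A minimal sketch in Python follows, assuming the tone item is the eighth and last item of RQI 4 (footnote n says only that the mean uses the first seven items) and using made-up ratings; the function names are hypothetical.

```python
from statistics import mean


def rqi4_overall(item_scores):
    """Mean of the first seven of eight item scores; the eighth (tone) is left out."""
    if len(item_scores) != 8:
        raise ValueError("expected eight item scores")
    return mean(item_scores[:7])


def sum_score(item_scores):
    """Overall quality as the plain sum of the item scores."""
    return sum(item_scores)


# Hypothetical ratings, not data from any included study.
rqi4_ratings = [4, 3, 5, 4, 4, 3, 5, 2]   # eight items scored 1-5, tone last
four_item_ratings = [3, 2, 3, 1]          # four items scored 0-3, full score 0-12

print(rqi4_overall(rqi4_ratings))
print(sum_score(four_item_ratings))
```

With a mean score the overall result stays on the item scale (1–5 here), whereas a sum score grows with the number of items, as in the 0–12 full score of the four-item form.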
Fig. 2 Frequency of quality domains and subdomains
Explanations and Examples of quality domains and subdomains
| N | Domains | Subdomains | Explanations and Examples |
|---|---|---|---|
| 1 | Relevance of the study | | Explanation: Items inquiring if the reviewer has discussed in the peer review report the importance of the research question and the usefulness of the study. |
| 2 | Originality of the study | | Explanation: Items inquiring if the reviewer has commented in the peer review report on the originality of the manuscript. |
| 3 | Interpretation of the study results | | Explanation: Items inquiring if the reviewer has commented in the peer review report on how the authors interpreted and discussed the results of the study. |
| 4 | Strengths and weaknesses of the study | General | Explanation: Items inquiring if the reviewer has identified and commented in the peer review report on the general strong and weak points of the study. |
| | | Methods | Explanation: Items inquiring if the reviewer has identified and commented in the peer review report on the strong and weak points specifically related to the study’s methods. |
| | | Statistical methods | Explanation: Items inquiring if the reviewer has identified and commented in the peer review report on the strong and weak points specifically related to the study’s statistical methods. |
| 5 | Presentation and organization of the manuscript | | Explanation: Items inquiring if the reviewer has made comments in the peer review report on the data presentation, such as tables and figures, and on the organization of the manuscript, such as writing communication. |
| 6 | Structure of reviewer’s comments | | Explanation: Items inquiring if the reviewer has made organized and structured comments in the peer review report. |
| 7 | Characteristics of reviewer’s comments | Clarity | Explanation: Items inquiring if the reviewer has provided clear and easy-to-read comments in the peer review report. |
| | | Constructiveness | Explanation: Items inquiring if the reviewer has provided helpful, relevant and realistic comments in the peer review report. |
| | | Detail/Thoroughness | Explanation: Items inquiring if the reviewer has provided in the peer review report detailed and thorough comments supplying appropriate evidence. |
| | | Fairness | Explanation: Items inquiring if the reviewer has provided balanced and objective comments in the peer review report. |
| | | Knowledgeability | Explanation: Items inquiring if the reviewer has shown in the peer review report that they know and correctly understand the content of the manuscript. |
| | | Tone | Explanation: Items inquiring if the reviewer has used a courteous tone in the peer review report. |
| 8 | Timeliness of the review report | | Explanation: Items inquiring if the reviewer has completed the peer review report on time. |
| 9 | Usefulness of the review report | Decision making | Explanation: Items inquiring if the reviewer has provided a peer review report useful for making a decision about the acceptance, revision or rejection of a manuscript. |
| | | Manuscript improvement | Explanation: Items inquiring if the reviewer has provided useful suggestions in the peer review report to improve the manuscript. |
Fig. 3 Hierarchical clustering of tools based on the nine quality domains. The figure shows which quality domains are present in each tool. A slice of the chart represents a tool, and each slice is divided into sectors, indicating quality domains (in different colours). The area of each sector corresponds to the proportion of each domain within the tool. For instance, the “Review Rating” tool consists of two domains: Timeliness, meaning that 25% of all its items are encompassed in this domain, and Characteristics of reviewer’s comments occupying the remaining 75%. The blue lines starting from the centre of the chart define how the tools are divided into the five clusters. Clusters #1, #2 and #3 are sub-nodes of a major node grouping all three, meaning that the tools in these clusters have a similar domain profile compared to the tools in clusters #4 and #5
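The sector areas described in the caption are item proportions per domain. A minimal sketch in Python follows, assuming each item of a tool is assigned to exactly one domain; the item-to-domain list for the "Review Rating" tool is reconstructed from the 25%/75% split and the tool's four items, not taken from the study data, and the function name is hypothetical. The clustering step itself is not sketched because this section does not state the distance measure or linkage method used.

```python
from collections import Counter


def domain_profile(item_domains):
    """Return the share of a tool's items that falls into each quality domain."""
    counts = Counter(item_domains)
    total = len(item_domains)
    return {domain: n / total for domain, n in counts.items()}


# Assumed assignment: one item per list entry (4 items in total).
review_rating_items = [
    "Timeliness of the review report",
    "Characteristics of reviewer's comments",
    "Characteristics of reviewer's comments",
    "Characteristics of reviewer's comments",
]

for domain, share in domain_profile(review_rating_items).items():
    print(f"{domain}: {share:.0%}")
```

A vector of these proportions over the nine domains, one vector per tool, is the kind of domain profile on which tools could then be compared and grouped.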