Measurement of patient safety: a systematic review of the reliability and validity of adverse event detection with record review.

Mirelle Hanskamp-Sebregts¹, Marieke Zegers², Charles Vincent³, Petra J van Gurp¹, Henrica C W de Vet⁴, Hub Wollersheim².

Abstract

OBJECTIVES: Record review is the most used method to quantify patient safety. We systematically reviewed the reliability and validity of adverse event detection with record review.
DESIGN: A systematic review of the literature.
METHODS: We searched PubMed, EMBASE, CINAHL, PsycINFO and the Cochrane Library from their inception through February 2015. We included all studies that aimed to describe the reliability and/or validity of record review. Two reviewers conducted data extraction. We pooled κ values and analysed the differences in subgroups according to number of reviewers, reviewer experience and training level, adjusted for the prevalence of adverse events.
RESULTS: In 25 studies, the psychometric data of the Global Trigger Tool (GTT) and the Harvard Medical Practice Study (HMPS) were reported, and 24 studies were included for statistical pooling. The inter-rater reliability of the GTT and HMPS showed a pooled κ of 0.65 and 0.55, respectively. The inter-rater agreement was statistically significantly higher when the group of reviewers within a study consisted of a maximum of five reviewers. We found no studies reporting on the validity of the GTT and HMPS.
CONCLUSIONS: The reliability of record review is moderate to substantial and improved when a small group of reviewers carried out record review. The validity of the record review method has never been evaluated, while clinical data registries, autopsy or direct observations of patient care are potential reference methods that can be used to test concurrent validity. Published by the BMJ Publishing Group Limited.

Year: 2016 PMID: 27550650 PMCID: PMC5013509 DOI: 10.1136/bmjopen-2016-011078

Source DB: PubMed Journal: BMJ Open ISSN: 2044-6055 Impact factor: 2.692

We reviewed ∼4000 articles across five databases for psychometric data on record review as a method to detect adverse events. We evaluated the methodological quality of the included studies on measurement properties with the validated COSMIN checklist. Two instruments for record review, the Global Trigger Tool and the Harvard Medical Practice Study, were extensively tested on their reliability, but data regarding the validity of these instruments are completely lacking. The subgroup analyses were limited to the variables that were reported by the authors in the studies that were included in our systematic review.

Introduction

Healthcare professionals are faced with the challenge of improving patient safety by detecting, preventing and mitigating the occurrence of adverse events (AEs).1 2 An AE is defined as an injury that is caused by healthcare management (rather than the underlying disease) and results in prolonged hospitalisation, disability at the time of discharge or even the patient's death.3 Besides improving patient safety, transparency with reliable and valid data is necessary for accountability purposes.4 5 Invalid or unreliable instruments for quantifying patient safety can lead to inadequate diagnosis of patient safety problems and subsequently to the implementation of inadequate patient safety improvement interventions. Patient record review is the most thoroughly studied method used to measure the prevalence of AEs.6 Incident, complaint and claim reporting systems are less suitable for counting AEs, because the number of AEs recorded strongly depends on the willingness of healthcare providers and patients to report them. Only 3–5% of the AEs detected in patient records are reported by healthcare providers in hospitals.7–11 In addition, the denominator, the related number of patients, is difficult to determine. These systems are therefore inadequate to count the actual number of incidents.12–14 Although record review is widely accepted as the method for quantifying AEs, data about the psychometric aspects of this method reported in previous literature reviews are limited12 13 15 or outdated.16 Therefore, we systematically reviewed the reliability and validity of record review and examined which factors are associated with these psychometric measures. We hypothesised that the inter-rater reliability of record review would be higher for studies with a small number of reviewers, more reviewer experience and a higher training level.

Methods

Search strategy and databases

Our literature search strategy was prespecified and aligned with the recommendations of the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement.17 We included the study protocol in online supplementary appendix 1. We searched for full-text studies published until October 2013 and updated our search in February 2015 using the following databases: PubMed (including MEDLINE), EMBASE, CINAHL, PsycINFO and the Cochrane Library. The references of the included studies were manually checked, and the authors' personal files and bibliographies of previously published related reviews were searched to identify additional relevant studies (snowballing). There were no language restrictions. Online supplementary appendix 2 provides a detailed listing of search strings.

Selection criteria and process

Two researchers (MH-S and MZ) independently screened the titles and abstracts of all studies identified by the search strategy for their eligibility. Studies were included if (1) the record review method was described in detail, (2) AEs were measured in a wide variety of patient groups and (3) data about reliability and validity were reported. Studies not available in full text were excluded. When the title and abstract did not clearly indicate whether the inclusion criteria were met, the full text (meaning the complete article) was obtained and reviewed by the same two researchers. The inclusion criteria were then applied again, and a final set of studies was identified for data extraction. Disagreement about inclusion was resolved by discussion. When no consensus could be achieved, a third researcher (HW) made the final decision.

Terminology and definitions

Different types of reliability and validity of measurement instruments can be distinguished. The focus of our systematic review was on the inter-rater reliability, content (face) validity and concurrent validity of record review. Definitions are given in table 1.

Table 1

Definitions of reliability and validity in the context of record review

Terms	Definition (expressed by)	Comments relevant to record review
Inter-rater reliability18	Measures consensus in the scores when different raters use the same measurement instrument in the same group of patients. Mostly expressed as a reliability measure (κ) or % agreement	Two independent reviewers assess patient records without discussion between the reviewers during the review process
Face validity18	The degree to which the content of an instrument is an adequate reflection of the construct to be measured (descriptive, expert opinion)
Concurrent validity19	The extent to which scores on a new measure are related to scores from a criterion measure administered at the same time (Se, Sp, PPV and NPV)	Clinical data registries, autopsy or direct observations of patient care have the potential to be a criterion measure for record review

NPV, negative predictive value; PPV, positive predictive value; Se, sensitivity; Sp, specificity.
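To make the two reliability measures in table 1 concrete, the following minimal sketch computes percentage agreement and Cohen's κ for two reviewers who each judge the same records as "AE" or "no AE". The function name and the counts in the usage note are hypothetical illustrations and are not taken from any included study.

```python
def cohen_kappa_2x2(both_yes, r1_only, r2_only, both_no):
    """Return (percentage agreement, Cohen's kappa) for two reviewers.

    both_yes: records in which both reviewers found an AE
    r1_only:  records in which only reviewer 1 found an AE
    r2_only:  records in which only reviewer 2 found an AE
    both_no:  records in which neither reviewer found an AE
    """
    n = both_yes + r1_only + r2_only + both_no
    if n == 0:
        raise ValueError("at least one record is required")

    # Observed agreement: proportion of records with identical judgements.
    p_observed = (both_yes + both_no) / n

    # Agreement expected by chance, from each reviewer's marginal AE rate.
    p1_yes = (both_yes + r1_only) / n
    p2_yes = (both_yes + r2_only) / n
    p_expected = p1_yes * p2_yes + (1 - p1_yes) * (1 - p2_yes)

    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (p_observed - p_expected) / (1 - p_expected)
    return 100 * p_observed, kappa
```

With hypothetical counts of 20, 5, 10 and 65 records (100 in total), `cohen_kappa_2x2(20, 5, 10, 65)` gives 85% agreement and a κ of 0.625, which shows how κ discounts the agreement that would be expected by chance.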

Quality assessment

Assessment of the methodological quality of the selected studies was carried out using the COSMIN checklist.20 The COSMIN checklist facilitates a separate judgement of the methodological quality of the included studies and their results.21 The COSMIN checklist consists of nine boxes with methodological standards for how each measurement property should be assessed. Three of the nine boxes were relevant for this systematic review regarding inter-rater reliability, content validity and concurrent validity. There are no standards for assessing face validity, because face validity requires a subjective judgement of experts.22 Each item in these relevant boxes was scored on a four-point rating scale (ie, ‘poor’, ‘fair’, ‘good’ or ‘excellent’).20 21 An overall score for the methodological quality of a study was determined by taking the lowest rating of any of the items in a box. The methodological quality of a study was assessed per measurement property by MH-S, and 10% of the studies were assessed independently by MZ. In cases of disagreement, a third reviewer (HW) was consulted for a final decision.
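The rule that the overall score of a COSMIN box equals the lowest rating of any item in that box can be sketched as follows. This is only an illustration of that rule; the function name and the example ratings are hypothetical and do not reproduce the COSMIN checklist items themselves.

```python
# Ratings ordered from worst to best, as on the four-point scale in the text.
RATING_ORDER = ["poor", "fair", "good", "excellent"]


def box_score(item_ratings):
    """Return the overall rating of one box: the lowest rating of any item."""
    if not item_ratings:
        raise ValueError("a box needs at least one rated item")
    unknown = [r for r in item_ratings if r not in RATING_ORDER]
    if unknown:
        raise ValueError(f"unknown rating(s): {unknown}")
    # min() with the position in RATING_ORDER as key picks the worst rating.
    return min(item_ratings, key=RATING_ORDER.index)
```

For example, a box with item ratings "excellent", "good", "fair" and "excellent" receives the overall score "fair", so a single weak item determines the quality rating of the whole measurement property.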

Data extraction

Each article that met the study eligibility criteria was abstracted by one reviewer (MH-S), and a second reviewer (MZ) crosschecked the data extraction of the first reviewer. Both reviewers used a standardised form, which comprised a description of objectives, study population, design and methods used and the results of the analysis of the reliability and validity, including statistical parameters (see online supplementary appendix 1).

Data synthesis and analysis

We tabulated study characteristics and outcomes such as setting, number of records, percentage AEs and data about reliability and validity of record review. In some studies, percentage agreement was calculated from source data by MH-S and confirmed by MZ. To be able to rate the reliability of record review, we classified the κ values as ‘slight’ (κ=0.00–0.20), ‘fair’ (κ=0.21–0.40), ‘moderate’ (κ=0.41–0.60), ‘substantial’ (κ=0.61–0.80) and ‘almost perfect’ (κ=0.81–1.00).23 We pooled the outcomes statistically by calculating the mean percentage agreement and the mean and pooled κ on the presence of AEs to draw conclusions about the reliability of record review. We used the number of records on which the κ value was calculated as the weighting factor in the statistical pooling, as a proxy for accuracy, since information about the 95% CIs of the κ values was missing in the included studies.

To examine differences in κ values depending on the number of reviewers, reviewer experience and reviewer training, we present descriptive statistics per subgroup (mean with SD or median with IQR for non-normal distributions, minimum and maximum). In order to better interpret the results, we classified the number of reviewers per study, reviewer experience and reviewer training into three proportional classes each: maximum 5 reviewers, >5–20 reviewers and >20 reviewers; <100 records per reviewer, 100–300 records per reviewer and >300 records per reviewer; and <1 day training, 1 day training and >1 day training, respectively. We used the non-parametric Kruskal-Wallis test for the group characteristics that were not normally distributed and an ANOVA for the group characteristics with a normal distribution.

We checked whether the assumptions for ANCOVA were met. It was not possible to incorporate all variables (the number of reviewers, reviewer experience and reviewer training) in one ANCOVA, because the number of studies in our analyses was limited (n=20). Therefore, we performed three separate ANCOVAs, with prevalence of AEs as covariate. We adjusted for prevalence of AEs, since a previous study by Lilford et al16 showed a correlation between prevalence and κ. Additionally, we studied the influence of the aim of the study and the type of instrument (Global Trigger Tool (GTT) vs Harvard Medical Practice Study (HMPS)) on κ with two separate ANCOVAs adjusted for prevalence. A p value of <0.05 was regarded as statistically significant. Statistical software IBM SPSS V.22 was used for all statistical analyses and data processing.
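The κ classification and the record-weighted pooling described in this section can be illustrated with a short sketch. It is not the SPSS procedure used in the review: the function names are hypothetical, the pooled value is assumed to be a simple weighted mean with the number of records as weights, and values outside 0.00–1.00 are labelled separately because the classification quoted above does not cover them.

```python
def classify_kappa(kappa):
    """Label a kappa value with the cut-offs quoted in the text."""
    k = round(kappa, 2)  # the text reports kappa to two decimals
    if k < 0 or k > 1:
        # Assumption: the quoted classification does not cover this range.
        return "outside the quoted scale"
    cutoffs = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in cutoffs:
        if k <= upper:
            return label


def pooled_kappa(kappas, n_records):
    """Weighted mean of study kappas, weighted by records per study."""
    if not kappas or len(kappas) != len(n_records):
        raise ValueError("need one record count per kappa value")
    total = sum(n_records)
    if total <= 0:
        raise ValueError("total number of records must be positive")
    return sum(k * n for k, n in zip(kappas, n_records)) / total
```

As a hypothetical example, three studies with κ values of 0.40, 0.60 and 0.80 based on 100, 200 and 100 records give a pooled κ of 0.60, which `classify_kappa` labels as moderate. Weighting by the number of records lets larger studies contribute more, which is the role the text assigns to sample size as a proxy for accuracy.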

Results

Results of the literature search

Our literature search yielded 3915 citations (see online supplementary appendix 3, flow chart), of which 1790 were in PubMed, 1153 were in EMBASE, 515 were in CINAHL, 30 were in PsycINFO and 427 were in the Cochrane Library. After removing duplicates, 3415 studies remained, of which 148 were selected for full-text assessment. A total of 137 studies were excluded after reading the full text, because these studies did not meet the inclusion criteria, including studies that did not focus on the reliability or validity of record review,24–26 did not have AEs as outcome27 or reported a different method than retrospective reviewing of medical records.28 29 We collected eight additional articles through manual searching of articles' bibliographies. In February 2015, we updated our search and found six additional studies. The final set consisted of 25 record review studies; 24 studies were used for calculating the mean κ, and 20 studies were appropriate for the subgroup analysis. Five studies were excluded from the subgroup analysis because only the intraclass correlation coefficient was calculated,30 the prevalence was an outlier,31 the prevalence was not reported32 33 or the number of reviewers was not reported.3

Description of the GTT and the HMPS

We found two record review instruments for detecting AEs, namely, the GTT and the HMPS. Both instruments use an implicit review style, meaning that the AE assessment relies on expert judgement instead of using well-defined criteria on a checklist (explicit review style).6 16 The GTT and the HMPS consist of a two-stage review process conducted by nurses and physicians (table 2). The GTT is primarily used as a quality improvement tool for clinical practice and for estimating and tracking AE rates over time in a hospital or a clinic. The HMPS is commonly used to measure the prevalence rate of AEs on a national level. The GTT is not meant to identify every single AE in a patient record, and, therefore, assessments have a time limit of 20 min per record.34 The GTT consists of 47–55 triggers to identify potential AEs. Reviewing the preventability of adverse events was not originally part of the GTT method, but has recently been included in the studies of Schildmeijer et al,35 Kennerly et al,36 Najjar et al37 and Hwang et al.38 In contrast, the HMPS consists of 16–18 screening criteria (triggers) and 27 leading questions for AE detection, of which three questions are crucial for AE determination: whether an injury is present; whether it resulted in prolongation of hospital stay, temporary or permanent disability or death; and whether it was caused by healthcare management. Determination of preventability of AEs is standard within the HMPS method. The HMPS is more time-consuming and labour-intensive in assessing AEs (stage 2) than the GTT, due to the number of questions.

Table 2

Description of the Global Trigger Tool and Harvard Medical Practice Study

Instrument	Description	Safety outcomes	Conducted by	Scale
Global Trigger Tool34	Two-stage retrospective record review
	Stage 1: Screening records for the presence of triggers and determining the adverse event that caused harm to patients	Triggers (mostly narrow)	Stage 1: Trained nurses or hospital pharmacists (primary reviewers, mostly two reviewers per record). Maximum 20 min per record	Dichotomous: yes/no trigger
	Stage 2: Confirming or dismissing the occurrence and category of the adverse event	Adverse events	Stage 2: Trained physicians (second reviewers, mostly one reviewer)	Dichotomous: yes/no AE. Definition of AE: Any unintended physical injury resulting from or contributed to by medical care that requires additional monitoring, treatment or hospitalisation or that results in death
Medical record review based on HMPS3	Two-stage or three-stage retrospective record review
	Stage 1: Screening records using criteria	(Broad) Screening criteria (triggers)	Stage 1: Trained nurses*. No time limit	Dichotomous: yes/no trigger
	Stage 2: Detailed review to confirm the presence of adverse events and their preventability	(Preventable) Adverse events	Stage 2: Trained physicians (one or two reviewers per record)	AE determination is based on three criteria: (1) Unintended injury to the patient (dichotomous: yes/no); (2) Resulted in prolongation of hospital stay, temporary or permanent disability or death (dichotomous: yes/no); (3) Caused by healthcare management (six-point scale). Preventability: six-point scale. When criteria 1 and 2 are met and the score on criterion 3 is ≥4, then there has been an AE; an AE is preventable when the score on the preventability scale is ≥4
	Stage 3: Discussion or independently supervising review (consensus stage)†		Stage 3: Supervising physician

*With the exception of the study of Brennan et al,32 in which medical records were reviewed by medical-record-room administrators.

†In some studies, a third stage was used.3 32 39–42

AEs, adverse events; HMPS, Harvard Medical Practice Study.
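The HMPS decision rule summarised in table 2 can be written out as a small sketch: an AE is counted when an unintended injury is present, it had one of the listed consequences and the causation score is ≥4, and the AE is preventable when the preventability score is ≥4. The function and parameter names are hypothetical, and treating preventability as applicable only to confirmed AEs is an assumption made here for illustration.

```python
def hmps_classify(injury, consequence, causation_score, preventability_score=None):
    """Apply the HMPS rule from table 2; returns (is_ae, is_preventable).

    injury:               criterion 1, unintended injury present (True/False)
    consequence:          criterion 2, prolonged stay, disability or death
    causation_score:      criterion 3, six-point scale (1-6)
    preventability_score: six-point scale (1-6), or None if not scored
    """
    for name, score in (("causation", causation_score),
                        ("preventability", preventability_score)):
        if score is not None and not 1 <= score <= 6:
            raise ValueError(f"{name} score must be between 1 and 6")

    # An AE requires criteria 1 and 2 plus a causation score of at least 4.
    is_ae = bool(injury and consequence and causation_score >= 4)

    # Assumption: preventability is only meaningful for a confirmed AE.
    is_preventable = bool(
        is_ae
        and preventability_score is not None
        and preventability_score >= 4
    )
    return is_ae, is_preventable
```

For instance, `hmps_classify(True, True, 5, 2)` returns `(True, False)`: an AE that is not judged preventable. Because both six-point scores depend on reviewer judgement, small differences around the cut-off of 4 change the classification, which is one way to read the lower κ values reported for preventability.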

Characteristics and methodological quality of included studies

Most of the identified studies were carried out in the USA, UK, Canada, Europe and Australia (see online supplementary appendices 4 and 5). In these studies, the GTT (n=10 studies) and HMPS (n=15 studies) were all tested in hospitals. The percentage AEs in GTT studies ranged from 7.2% to 27.0% (see online supplementary appendix 4). The total number of reviewers varied from 2 to 20 reviewers per study. Reviewers assessed 50 to 4043 records on average. The percentage AEs in HMPS studies ranged from 2.9% to 18.0%, and for preventable AEs they ranged from 1% to 8.6% (see online supplementary appendix 5). The total number of reviewers varied from 2 to 127 reviewers per study. Average records per reviewer ranged from 38 to 3872 records. The primary aim of most of the GTT studies included in this review was to examine the inter-rater reliability, whereas the primary aim of the HMPS studies reporting inter-rater reliability data was measuring AE rates. The methodological quality of the included studies3 11 30–33 35–58 was good. In all these studies, the inter-rater reliability was evaluated. In one study, the face validity was evaluated.32

Reliability of the GTT

The percentage agreement for reviewers of AE assessment was reported in four studies,31 38 43 47 ranging from 83% to 94% with a mean of 87.5% (SD 4.8%) (see online supplementary appendix 4). One study showed fair inter-rater reliability (κ=0.34),47 two studies showed moderate inter-rater reliability (κ=0.45),35 43 five studies showed substantial inter-rater reliability (κ=0.62–0.74)31 36 38 45 46 and two studies showed almost perfect inter-rater reliability (κ=0.85–0.89).37 44 The mean κ and pooled κ are 0.65 (SD 0.19), meaning that the overall inter-rater reliability of the GTT is substantial.23

Reliability of the HMPS

The percentage agreement of AE assessment was reported in 10 studies and ranged from 73% to 91% with a mean of 83% (SD 6.1%);3 11 39–42 49 50 52–54 percentage agreement for preventability of AE was assessed in six studies and ranged from 58% to 93% with a mean of 81% (SD 13%)3 11 39 40 49 54 (see online supplementary appendix 5). Ten studies showed moderate inter-rater reliability for AE detection (κ=0.40–0.57)32 39 41 42 48–52 54 and in four studies the inter-rater reliability was substantial (κ=0.61–0.80).3 11 40 49 In 10 studies, the κ for assessing preventable AEs was reported and ranged from 0.19 to 0.76.3 11 32 39 40 48 49 51 53 54 One study showed slight inter-rater reliability (κ=0.19),53 three studies showed fair inter-rater reliability (κ=0.24–0.34),3 32 54 three studies showed moderate inter-rater reliability (κ=0.44–0.49)11 39 48 and three studies showed substantial inter-rater reliability (κ=0.69–0.76)40 49 51 for assessing preventable AEs. The mean κ and pooled κ of the HMPS for AE assessment are 0.54 (SD 0.10) and 0.55 (SD 0.07), respectively, and, for assessing preventability, they are 0.47 (SD 0.20) and 0.48 (SD 0.20), respectively. The inter-rater reliability of the HMPS is classified as moderate.23

Subgroup analysis inter-rater reliability

The numbers of GTT studies (n=9) and HMPS studies (n=11) were too small to perform the subgroup analysis for the methods separately. Therefore, we used the κ statistics of all studies (n=20) to carry out the subgroup analysis. The assumptions for ANCOVA were met. Prevalence was not statistically significantly associated with the κ values in the three ANCOVAs (p=0.069, p=0.189 and p=0.726, respectively). We found a statistically significant difference in the pooled κ values among subgroups according to the number of reviewers (p=0.006; table 3). There were no statistically significant differences in κ values between subgroups according to reviewer experience (p=0.062) or reviewer training (p=0.809). The group of maximum five reviewers detected more AEs (average 17.1%) than the other two groups of reviewers (table 4). This group received the least training (median 6 hours) and assessed the largest number of records (median 213 records). However, there was no significant difference in reviewer experience (p=0.351), reviewer training (p=0.317) or the prevalence of AEs (p=0.480) between the three groups of reviewers (maximum 5 reviewers, >5–20 reviewers and >20 reviewers).

Table 3

Differences in pooled κ values (n=20) among subgroups according to number of reviewers, reviewer experience and reviewer training

	n	Pooled κ* (SD)	95% CI	p Value†
Group of reviewers
Max 5	7	0.80 (0.07)	0.66 to 0.94	0.006
>5–20	7	0.52 (0.06)	0.40 to 0.64
>20	6	0.54 (0.02)	0.50 to 0.59
Total	20
Reviewer experience (records/reviewer)
<100	7	0.71 (0.06)	0.58 to 0.84	0.062
100–300	6	0.51 (0.04)	0.43 to 0.58
>300	7	0.53 (0.04)	0.45 to 0.62
Total	20
Training
<1 day	4	0.53 (0.07)	0.37 to 0.68	0.809
1 day	4	0.56 (0.14)	0.25 to 0.87
>1 day	5	0.57 (0.05)	0.45 to 0.67
Total	13

*Pooled κ weighted for the number of records on which the κ value is calculated.

†p Values are obtained with the prevalence rate as covariate.

Table 4

The reviewer experience, reviewer training and the prevalence of AEs in the three groups of reviewers

	Max 5 reviewers		>5–20 reviewers		>20 reviewers
	Median* (IQR)	Min–Max	Median* (IQR)	Min–Max	Median* (IQR)	Min–Max	p Value†
Reviewer experience (records/reviewer)	213 (60–1138)	50–4043	95 (39–317)	38–591	129 (109–616)	78–675	0.351
Training hours	6 (0–6)	0–12	16 (5–20)	2–24	8 (3–10)	2–16	0.317
	Mean* (SD)	Min–Max	Mean* (SD)	Min–Max	Mean* (SD)	Min–Max	p Value‡
Prevalence AEs (%)	17.1 (7.8)	7.2–27	13.5 (4.0)	7.5–21	12.7 (8.5)	2.9–25.1	0.480

*Unweighted statistics for reviewer experience, training and prevalence rate.

†p Values are obtained by the non-parametric Kruskal-Wallis test.

‡p Value is obtained with an ANOVA.

AEs, adverse events; Min, minimum; Max, maximum.

The number of studies that reported the κ of preventable AEs (n=8) was too small for subgroup analysis. The aim of the study and the type of instrument (GTT vs HMPS) were not statistically significantly associated with κ (p=0.572 and p=0.086, respectively).

Validity

The face validity of the HMPS was reported in one study as being a valid method to identify AEs.32 We found no studies in which the concurrent validity of the GTT or HMPS has been studied.
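Table 1 lists sensitivity, specificity, PPV and NPV as the measures of concurrent validity. Since no included study reported them, the sketch below only illustrates how they would be derived if record review were compared with a reference method; the function name and all counts are hypothetical.

```python
def concurrent_validity(tp, fp, fn, tn):
    """Se, Sp, PPV and NPV of record review against a reference method.

    tp: AE found by record review and confirmed by the reference method
    fp: AE found by record review but not by the reference method
    fn: AE missed by record review but found by the reference method
    tn: no AE according to both methods
    """
    if min(tp + fn, fp + tn, tp + fp, fn + tn) == 0:
        raise ValueError("every row and column of the 2x2 table must be non-empty")
    return {
        "Se": tp / (tp + fn),   # proportion of reference AEs that were detected
        "Sp": tn / (tn + fp),   # proportion of non-AE records correctly cleared
        "PPV": tp / (tp + fp),  # proportion of detected AEs that were confirmed
        "NPV": tn / (tn + fn),  # proportion of cleared records that were correct
    }
```

With hypothetical counts of 30, 10, 20 and 140 records, the sketch gives a sensitivity of 0.60, a specificity of about 0.93, a PPV of 0.75 and an NPV of 0.875.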

Discussion

The inter-rater reliability of record review to detect AEs is moderate to substantial,23 with a pooled κ of 0.65 and 0.55 for the GTT method and the HMPS method, respectively. The pooled κ for preventability, measured with the HMPS method, is moderate (0.48). The absence of studies on concurrent validity is alarming, given the statements that record review is accepted worldwide as the ‘best’ means of measuring incidence rates of AEs (even called ‘the gold standard’).15 59 Even if the inter-rater reliability of record review is acceptable, there is no evidence that record review really detects AEs. Possible reference methods for testing the concurrent validity of record review are clinical data registries, autopsy or direct observations of patient care. No study, not even a small one, has used these reference methods, although they capture valuable (real-time), accurate and precise patient data.13 60–63

We found statistically significantly higher inter-rater reliability in subgroups in which the group of reviewers consisted of five reviewers or fewer. An explanation for this difference is that when the group of reviewers is small, the assessment of the presence of an AE becomes more standardised.40 64 Having a small group of reviewers stimulates working more closely together, intentionally or unintentionally, resulting in less variation in the review methodology and more consensus about the definition of what constitutes harm in order to be counted as an AE. Additional advantages of having a small group of reviewers are that intensive review training can be organised, and the review process can be better monitored.40 In our review, however, the group of maximum five reviewers received fewer training hours. Possibly, they were better supervised or communicated better with each other during the study, which could increase the inter-rater agreement.

Inter-rater reliability has been reported to be higher when reviewers assess a substantial number of records.40 We found no statistically significant differences between subgroups according to reviewer experience, although the group of maximum five reviewers assessed a notably larger number of records than the groups of 6–20 reviewers or more than 20 reviewers. From other studies, we know that training improves the performance of review teams and the application of record review.65 66 We found no evidence for this in our review. In fact, the group of maximum five reviewers had fewer than half the training hours of the group of 6–20 reviewers but achieved a higher inter-rater agreement. The systematic review of Lilford et al16 showed that there was an association between κ and the prevalence of AEs. We found no statistically significant association between κ and the prevalence of AEs. The smaller range of the prevalence rate (2.9–27.0%) in our review compared to the review of Lilford et al16 (2.8–58.9%) could explain why we did not find an association between κ and the prevalence of AEs.

Our systematic review has some strengths and limitations. First, the strength of the evidence from the statistical pooling depends on the quality of the included studies. We used the validated COSMIN tool20 to evaluate the methodological quality of the included studies. Second, it was not possible to formally estimate the pooled κ statistics for the GTT and Medical Record Review (MRR), to assess between-study heterogeneity or to carry out analyses of the likelihood of publication bias, because CIs were lacking in approximately half of the reliability studies. Third, the subgroup analyses were limited to the variables that were reported by the authors in the included studies of our systematic review. Other factors that possibly influence the inter-rater agreement between reviewers, such as the level of cooperation between the reviewers during the review process, could therefore not be studied. Fourth, our review may have been influenced by publication bias, as studies reporting low reliability or validity may be less likely to be published than those with more positive results. Fifth, we statistically pooled the κ values. However, specific agreement on the presence of AEs, expressing the agreement separately for the positive and negative ratings, is recommended.67 After all, inter-rater reliability concerns whether an AE found by one reviewer is also found by a second reviewer. Unfortunately, in most of the studies, information about the number of records for which there was agreement, presented in a 2×2 cross table, was missing. Therefore, we could not perform a statistical pooling of the proportion of specific agreement.

In conclusion, users of the record review method to assess (preventable) AEs should be aware that the inter-rater agreement between reviewers is moderate to substantial and increases when using a smaller group of reviewers. More studies are needed to explore which factors increase the inter-rater reliability of record review. Most importantly, concurrent validity should be tested; otherwise, record review remains an imperfect, never evaluated method.
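The specific agreement mentioned in the Discussion, which expresses agreement separately for positive and negative ratings, can be computed from a 2×2 cross table as sketched below. The formulas are the commonly used proportions of positive and negative agreement; the function name and the counts are hypothetical and are not data from the included studies.

```python
def specific_agreement(both_yes, r1_only, r2_only, both_no):
    """Return (positive agreement, negative agreement) for two reviewers.

    both_yes: records in which both reviewers found an AE
    r1_only:  records in which only reviewer 1 found an AE
    r2_only:  records in which only reviewer 2 found an AE
    both_no:  records in which neither reviewer found an AE
    """
    disagreements = r1_only + r2_only
    positive_total = 2 * both_yes + disagreements
    negative_total = 2 * both_no + disagreements
    if positive_total == 0 or negative_total == 0:
        raise ValueError("specific agreement is undefined for an empty category")

    # Agreement on "AE present": how often an AE seen by one reviewer
    # was also seen by the other.
    positive = 2 * both_yes / positive_total
    # Agreement on "no AE".
    negative = 2 * both_no / negative_total
    return positive, negative
```

With hypothetical counts of 20, 5, 10 and 65 records, positive agreement is about 0.73 and negative agreement about 0.90, even though overall agreement is 85%. When AEs are uncommon, overall agreement is dominated by the many records without an AE, which is why reporting the two proportions separately is informative.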
