
Inter-rater reliability of historical data collected by non-medical research assistants and physicians in patients with acute abdominal pain.

Angela M Mills, Anthony J Dean, Frances S Shofer, Judd E Hollander, Christine M McCusker, Michael K Keutmann, Esther H Chen.

Abstract

OBJECTIVES: In many academic emergency departments (ED), physicians are asked to record clinical data for research that may be time consuming and distracting from patient care. We hypothesized that non-medical research assistants (RAs) could obtain historical information from patients with acute abdominal pain as accurately as physicians.
METHODS: This prospective comparative study, conducted in an academic ED, compared 29 RAs with 32 resident physicians (RPs) to assess inter-rater reliability in obtaining historical information from patients with abdominal pain. Historical features were independently recorded on standardized data forms by an RA and an RP blinded to each other's answers. Discrepancies were resolved by a third person (an RA) who asked the patient to state the correct answer on a third questionnaire, constituting the "criterion standard." Inter-rater reliability was assessed using kappa statistics (kappa) and percent crude agreement (CrA).
RESULTS: Sixty-five patients were enrolled (mean age 43). Of 43 historical variables assessed, the median agreement was moderate (kappa 0.59 [Interquartile range 0.37-0.69]; CrA 85.9%) and varied across data categories: initial pain location (kappa 0.61 [0.59-0.73]; CrA 87.7%), current pain location (kappa 0.60 [0.47-0.67]; CrA 82.8%), past medical history (kappa 0.60 [0.48-0.74]; CrA 93.8%), associated symptoms (kappa 0.38 [0.37-0.74]; CrA 87.7%), and aggravating/alleviating factors (kappa 0.09 [-0.01-0.21]; CrA 61.5%). When there was disagreement between the RP and the RA, the RA more often agreed with the criterion standard (64% [55-71%]) than the RP (36% [29-45%]).
CONCLUSION: Non-medical research assistants who focus on clinical research are often more accurate than physicians, who may be distracted by patient care responsibilities, at obtaining historical information from ED patients with abdominal pain.

Year:  2009        PMID: 19561765      PMCID: PMC2672296     

Source DB:  PubMed          Journal:  West J Emerg Med        ISSN: 1936-900X


INTRODUCTION

A busy emergency department (ED) is a challenging site for collecting data for prospective clinical trials. Frequently, treating physicians are asked to enroll eligible patients and complete structured data forms, a time-consuming process that can interfere with clinical responsibilities. Research assistants (RAs) without formal medical training (e.g., undergraduate and post-baccalaureate students) have been used to assist in this process by identifying eligible patients, obtaining consent, documenting demographic information on standard data forms, and assisting with other data collection and management.1–3

Historical and physical examination features remain the basis for decision making about work-up and treatment of patients with acute abdominal pain; therefore, they are usually considered to be essential variables in research on this topic. Several studies have suggested that historical information obtained by medical providers may have significant inter-observer variability. In one such study, information recorded on standardized data sheets in a cohort of stroke patients revealed significant discrepancies in historical elements taken by six neurologists.4 In a study of chest pain patients, the historical features documented by nurse practitioners were less typical of angina pectoris compared to those documented by physicians after interviewing the same patients.5 These studies highlight the importance of assessing the reliability of the data-collection instrument as an integral part of the research project.

No study to date has examined the reliability of non-medical RAs in obtaining historical information for research. We designed and piloted a survey instrument containing standard, simple historical questions about abdominal pain. We hypothesized that non-medical RAs can reliably use this questionnaire and be at least as accurate as resident physicians (RPs) in obtaining historical information from patients with acute abdominal pain.

METHODS

Study Design

We conducted a prospective comparative study to evaluate the reliability of the historical features obtained from ED patients with abdominal pain using a standard questionnaire administered by RAs compared to RPs. Our Institutional Committee on Research involving Human Subjects at the University of Pennsylvania approved the study. Informed consent was obtained from all subjects.

Study Setting and Population

This study was conducted at an urban university hospital ED with an annual census of approximately 55,000 visits. Adult patients with acute abdominal pain were enrolled from April 6 to 22, 2007. A survey instrument with questions about historical features was completed independently for each patient by an RA and an RP. RAs are undergraduate and post-baccalaureate students enrolled in the Academic Associate Program,3, 6 a structured class at the University of Pennsylvania for which course credit is given. Students are responsible for attending research-related classes and working shifts in the ED during which they identify and enroll eligible patients for research projects, and in the current study, obtain historical information about patients with acute abdominal pain.

Study Protocol and Measurements

From 7 AM-midnight, seven days per week, the RAs identified and enrolled patients 18 years of age or older who presented with non-traumatic abdominal pain of less than 72 hours duration. Patients were excluded if they were pregnant, or if within the previous seven days they had sustained abdominal trauma or had an abdominal surgical procedure. A standardized questionnaire was completed independently by the RA and RP caring for the patient within 20 minutes of each other. The time of assessment was recorded on the data forms. Discrepancies between the two forms were resolved by a third person (RA) who was coached to specifically ask the patient: “we did not have a clear understanding of your answer to this question … [question repeated],” thus allowing patients to use either of their previous responses. This form was used as the “criterion standard.” Formal training sessions were provided to the RAs teaching them open-ended and neutral questioning techniques most likely to avoid influencing respondents.

Data Analysis

Descriptive data are presented as means ± standard deviation, frequencies, and percentages. Cohen’s kappa (κ) statistic and percent crude agreement (CrA), both with 95% confidence intervals (95% CIs), were used to measure inter-rater reliability. As described elsewhere, κ values typically range between 0 (chance agreement) and 1.00 (complete agreement), with negative values indicating agreement worse than chance; κ ≤0.20 represents poor agreement, 0.21–0.40 fair agreement, 0.41–0.60 moderate agreement, 0.61–0.80 good agreement, and 0.81–1.00 excellent agreement.7 To summarize specific types of questions (e.g., past medical history) we present the median kappa values with interquartile ranges (IQRs). Data were analyzed using SAS statistical software (Version 9.1, SAS Institute, Cary, NC) and StatXact (Version 6.1, Cytel Software Corporation, Cambridge, MA).
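As an illustration of the two reliability measures described above, the following minimal Python sketch computes crude agreement and Cohen's κ for one dichotomized variable. It is not the authors' SAS or StatXact code; the function names and the ten example answers are hypothetical, and placing κ values of exactly 0.20 in the "poor" band is an assumption, since the published bands leave that boundary open.

```python
from collections import Counter


def crude_agreement(ra_answers, rp_answers):
    """Fraction of patients for whom both raters recorded the same answer."""
    if len(ra_answers) != len(rp_answers) or not ra_answers:
        raise ValueError("need two non-empty answer lists of equal length")
    matches = sum(a == b for a, b in zip(ra_answers, rp_answers))
    return matches / len(ra_answers)


def cohen_kappa(ra_answers, rp_answers):
    """Cohen's kappa, (p_o - p_e) / (1 - p_e); None when p_e equals 1."""
    n = len(ra_answers)
    p_o = crude_agreement(ra_answers, rp_answers)
    ra_counts, rp_counts = Counter(ra_answers), Counter(rp_answers)
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over every answer category.
    p_e = sum(ra_counts[c] * rp_counts[c] for c in ra_counts) / (n * n)
    if p_e == 1.0:
        return None  # kappa is undefined when chance agreement is total
    return (p_o - p_e) / (1 - p_e)


def agreement_label(kappa):
    """Map kappa to the bands used in the text (0.20 treated as poor: assumption)."""
    k = round(kappa, 2)  # compare at the two-decimal precision of the bands
    if k <= 0.20:
        return "poor"
    if k <= 0.40:
        return "fair"
    if k <= 0.60:
        return "moderate"
    if k <= 0.80:
        return "good"
    return "excellent"


if __name__ == "__main__":
    # Hypothetical yes/no (1/0) answers for ten patients, not study data.
    ra = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
    rp = [1, 0, 0, 0, 1, 0, 1, 1, 0, 1]
    k = cohen_kappa(ra, rp)
    print(f"CrA = {crude_agreement(ra, rp):.1%}, kappa = {k:.2f} ({agreement_label(k)})")
```

Confidence intervals are omitted from this sketch because the source does not state which interval method was used.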

RESULTS

Sixty-five patients with acute abdominal pain were surveyed by 29 RAs and 32 RPs. The median age of the abdominal pain patients was 43 years; 77% were female and 54% black. There were 49 variables, of which 43 were dichotomized responses. The remaining six historical variables were related to times (e.g. when was the last time you vomited), which proved highly variable and not easily dichotomized. These were excluded. Therefore, there were 2754 comparisons (some variables had fewer comparisons and some were restricted by gender), of which there were 458 discrepancies between RP and RA (17%). Inter-rater reliability measures for all historical variables are listed in Table 1. Overall, the median agreement was moderate (κ 0.59 [IQR 0.37–0.69]; CrA 85.9%) but varied across data categories: initial pain location (κ 0.61 [IQR 0.59–0.73]; CrA 87.7%), current pain location (κ 0.60 [IQR 0.47–0.67]; CrA 82.8%), past medical history (κ 0.60 [IQR 0.48–0.74]; CrA 93.8%), associated symptoms (κ 0.38 [IQR 0.37–0.74]; CrA 87.7%), and aggravating/alleviating factors (κ 0.09 [IQR −0.01–0.21]; CrA 61.5%).
Table 1. Kappa statistics and crude agreement for abdominal pain characteristics

| Characteristics | Kappa | 95% CI | Crude agreement | 95% CI |
|---|---|---|---|---|
| Initial pain location | | | | |
| Pain start RUQ* | 0.73 | 0.54 to 0.92 | 89.2% | 79.1% to 95.6% |
| Pain start LUQ* | 0.61 | 0.40 to 0.82 | 83.1% | 71.7% to 91.2% |
| Pain start RLQ* | 0.60 | 0.41 to 0.79 | 80.0% | 69.1% to 89.2% |
| Pain start LLQ* | 0.57 | 0.37 to 0.77 | 78.5% | 66.5% to 87.7% |
| Pain start epigastrium | 0.77 | 0.57 to 0.96 | 92.3% | 83.0% to 97.5% |
| Pain start both lower quadrants | 0.79 | 0.64 to 0.95 | 90.8% | 81.0% to 96.5% |
| Pain start diffuse | 0.66 | 0.39 to 0.94 | 92.3% | 83.0% to 97.5% |
| Pain start right flank | 0.23 | −0.04 to 0.50 | 80.0% | 69.1% to 89.2% |
| Pain start left flank | 0.59 | 0.34 to 0.85 | 87.7% | 77.2% to 94.5% |
| Median and IQR* | 0.61 (0.59–0.73) | | 87.7% (80.0–90.8%) | |
| Current pain location | | | | |
| Pain now RUQ | 0.673 | 0.482 to 0.864 | 85.9% | 75.0% to 93.4% |
| Pain now LUQ | 0.602 | 0.401 to 0.802 | 81.3% | 69.5% to 69.5% |
| Pain now RLQ | 0.594 | 0.398 to 0.790 | 79.7% | 67.8% to 88.7% |
| Pain now LLQ | 0.466 | 0.248 to 0.683 | 73.4% | 60.9% to 83.7% |
| Pain now epigastrium | 0.656 | 0.442 to 0.870 | 87.5% | 76.9% to 94.5% |
| Pain now both lower quadrants | 0.692 | 0.508 to 0.877 | 85.9% | 75.0% to 93.4% |
| Pain now diffuse | 0.744 | 0.508 to 0.979 | 93.8% | 84.8% to 98.3% |
| Pain now right flank | 0.455 | 0.184 to 0.726 | 82.8% | 71.3% to 91.1% |
| Pain now left flank | 0.301 | 0.009 to 0.592 | 81.3% | 69.5% to 69.5% |
| Median and IQR | 0.60 (0.47–0.67) | | 61.5% (53.1–67.0%) | |
| Aggravating/alleviating symptoms | | | | |
| Pain ever gone | 0.237 | −0.010 to 0.480 | 68.3% | 55.3% to 79.4% |
| Eating aggravating | 0.130 | −0.050 to 0.310 | 50.8% | 38.1% to 63.4% |
| Urinating aggravating | 0.040 | −0.160 to 0.240 | 76.9% | 64.8% to 86.5% |
| Coughing aggravating | 0.310 | 0.090 to 0.520 | 63.1% | 50.2% to 74.7% |
| Antacid alleviating | −0.050 | −0.270 to 0.160 | 41.5% | 29.4% to 54.4% |
| Eating alleviating | −0.020 | −0.200 to 0.160 | 60.0% | 47.1% to 72.0% |
| Median and IQR | 0.09 (−0.01–0.21) | | 61.5% (53.1–67.0%) | |
| Associated symptoms | | | | |
| Vomiting | 0.740 | 0.570 to 0.900 | 87.7% | 77.2% to 94.5% |
| Diarrhea | 0.780 | 0.600 to 0.960 | 92.3% | 83.0% to 97.5% |
| Dysuria | NC* | | 96.9% | 89.3% to 99.6% |
| Pass gas | 0.160 | −0.090 to 0.400 | 60.0% | 47.1% to 72.0% |
| Fever | 0.380 | 0.150 to 0.610 | 73.8% | 61.5% to 84.0% |
| Vaginal discharge | NC* | | 96.0% | 86.3% to 99.5% |
| Vaginal bleeding | 0.370 | −0.190 to 0.930 | 94.0% | 83.5% to 98.8% |
| Median and IQR | 0.38 (0.37–0.74) | | 87.7% (73.8–93.2%) | |
| Past Medical History | | | | |
| HX* Abdominal surgery | 0.690 | 0.520 to 0.870 | 84.6% | 73.5% to 92.4% |
| HX Gallstones | 0.580 | 0.260 to 0.900 | 92.3% | 83.0% to 97.5% |
| HX Liver Disease | 0.500 | 0.130 to 0.880 | 92.3% | 83.0% to 97.5% |
| HX Pancreatitis | 1.000 | 1.000 to 1.000 | 100.0% | 94.5% to 100.0% |
| HX Inflammatory bowel disease | 1.000 | 1.000 to 1.000 | 100.0% | 94.5% to 100.0% |
| HX Irritable bowel syndrome | 0.420 | 0.020 to 0.820 | 92.3% | 83.0% to 97.5% |
| HX Diverticulitis | 0.550 | 0.090 to 1.000 | 95.4% | 87.1% to 99.0% |
| HX GERD* | 0.210 | 0.000 to 0.420 | 72.3% | 59.8% to 82.7% |
| HX Kidney stones | 0.610 | 0.390 to 0.830 | 86.2% | 75.3% to 93.5% |
| HX Cancer | 0.870 | 0.700 to 1.000 | 96.9% | 89.3% to 99.6% |
| HX Diabetes | 0.700 | 0.390 to 1.000 | 95.4% | 87.1% to 99.0% |
| HX CAD* | 0.380 | −0.180 to 0.930 | 95.4% | 87.1% to 99.0% |
| Median and IQR | 0.60 (0.48–0.74) | | 93.8% (85.9–95.8%) | |
| Overall Median | 0.59 (0.37–0.69) | | 85.9% (77.7–92.3%) | |

RUQ, right upper quadrant; LUQ, left upper quadrant; RLQ, right lower quadrant; LLQ, left lower quadrant; IQR, interquartile range; NC, not calculable; HX, history; GERD, gastroesophageal reflux disease; CAD, coronary artery disease.
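The category summaries in Table 1 (median κ with IQR) can be sketched in a few lines of Python. The source does not state which quartile definition was used, so the "inclusive" method of the standard library's `statistics.quantiles` is an assumption here; the helper name `median_and_iqr` is hypothetical, and the input values are the nine initial-pain-location κ values from Table 1.

```python
from statistics import median, quantiles

# Kappa values for the nine initial-pain-location items, copied from Table 1.
initial_location_kappas = [0.73, 0.61, 0.60, 0.57, 0.77, 0.79, 0.66, 0.23, 0.59]


def median_and_iqr(values):
    """Return (median, first quartile, third quartile) of a list of numbers.

    Quartiles use the 'inclusive' method, which treats the data as the
    whole population of interest; this choice is an assumption.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


if __name__ == "__main__":
    med, q1, q3 = median_and_iqr(initial_location_kappas)
    print(f"median kappa {med:.2f} (IQR {q1:.2f} to {q3:.2f})")
```

Under this assumption the sketch gives a median of 0.61 with quartiles 0.59 and 0.73, matching the values reported for initial pain location. Other quartile definitions can give slightly different bounds.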

Overall, crude agreement for both groups was above 80% in all but one of the five general categories (Figure 1). Of the 458 discordant results between the RP and RA, a criterion standard was available for 429 (94%). Among these disagreements, the RA more often agreed with the criterion standard (N=274, 64% [55–71%]) than the RP did (N=155, 36% [29–45%]). (See Table 2.)
Figure 1. Accuracy of historical features by research assistants and physicians

Table 2. Accuracy amongst discordant pairs compared to criterion standard

| Characteristics | Number discordant pairs | %RA correct | %RP correct |
|---|---|---|---|
| Initial pain location | | | |
| Pain start RUQ* | 7 | 71.4% | 28.6% |
| Pain start LUQ* | 11 | 63.6% | 36.4% |
| Pain start RLQ* | 12 | 50.0% | 50.0% |
| Pain start LLQ* | 14 | 57.1% | 42.9% |
| Pain start epigastrium | 5 | 100.0% | 0.0% |
| Pain start both lower quadrants | 6 | 50.0% | 50.0% |
| Pain start diffuse | 5 | 60.0% | 40.0% |
| Pain start right flank | 13 | 76.9% | 23.1% |
| Pain start left flank | 8 | 75.0% | 25.0% |
| Median and IQR* | | 63.6% (57.1–75.0%) | 36.4% (25.0–42.9%) |
| Current pain location | | | |
| Pain now RUQ | 8 | 37.5% | 62.5% |
| Pain now LUQ | 11 | 45.5% | 54.5% |
| Pain now RLQ | 13 | 53.8% | 46.2% |
| Pain now LLQ | 16 | 56.3% | 43.8% |
| Pain now epigastrium | 7 | 42.9% | 57.1% |
| Pain now both lower quadrants | 8 | 75.0% | 25.0% |
| Pain now diffuse | 3 | 66.7% | 33.3% |
| Pain now right flank | 10 | 60.0% | 40.0% |
| Pain now left flank | 12 | 66.7% | 33.3% |
| Median and IQR | | 56.3% (45.5–66.7%) | 43.8% (33.3–54.5%) |
| Aggravating/alleviating symptoms | | | |
| Pain ever gone | 19 | 63.2% | 36.8% |
| Eating aggravating | 27 | 74.1% | 25.9% |
| Urinating aggravating | 14 | 64.3% | 35.7% |
| Coughing aggravating | 21 | 71.4% | 28.6% |
| Antacid alleviating | 36 | 63.9% | 36.1% |
| Eating alleviating | 21 | 66.7% | 33.3% |
| Median and IQR | | 65.6% (64.0–70.2%) | 34.5% (29.8–36.0%) |
| Associated symptoms | | | |
| Vomiting | 7 | 71.4% | 28.6% |
| Diarrhea | 5 | 60.0% | 40.0% |
| Dysuria | 2 | 100.0% | 0.0% |
| Pass gas | 26 | 65.4% | 34.6% |
| Fever | 17 | 41.2% | 58.8% |
| Vaginal discharge | 2 | 50.0% | 50.0% |
| Vaginal bleeding | 3 | 66.7% | 33.3% |
| Median and IQR | | 65.4% (55.0–69.0%) | 34.6% (31.0–45.0%) |
| Past Medical History | | | |
| HX* Abdominal surgery | 9 | 55.6% | 44.4% |
| HX Gallstones | 5 | 60.0% | 40.0% |
| HX Liver Disease | 5 | 100.0% | 0.0% |
| HX Pancreatitis | 0 | no discordant pairs | no discordant pairs |
| HX Inflammatory bowel disease | 0 | no discordant pairs | no discordant pairs |
| HX Irritable bowel syndrome | 4 | 100.0% | 0.0% |
| HX Diverticulitis | 3 | 0.0% | 100.0% |
| HX GERD* | 17 | 64.7% | 35.3% |
| HX Kidney stones | 9 | 77.8% | 22.2% |
| HX Cancer | 2 | 50.0% | 50.0% |
| HX Diabetes | 3 | 100.0% | 0.0% |
| HX CAD* | 3 | 100.0% | 0.0% |
| Median and IQR | | 62.4% (54.2–83.3%) | 37.6% (16.7–45.8%) |
| Overall Median | | 63.9% (54.7–71.4%) | 36.1% (28.6–45.3%) |

RUQ, right upper quadrant; LUQ, left upper quadrant; RLQ, right lower quadrant; LLQ, left lower quadrant; IQR, interquartile range; NC, not calculable; HX, history; GERD, gastroesophageal reflux disease; CAD, coronary artery disease.

DISCUSSION

This study explores the inter-rater reliability of historical features obtained by RAs and RPs using a standard questionnaire in the evaluation of abdominal pain. We found an overall moderate agreement between RAs and RPs for 43 historical variables. There was good agreement for initial pain location and moderate agreement for current pain location and past medical history. For associated symptoms, there was fair agreement using the kappa statistic with a crude agreement of 88%. The poorest agreement was found for aggravating and alleviating factors, for which the information obtained by the two groups of investigators agreed only 62% of the time. The mathematical properties of the κ statistic determine that low rates of discrepancy in infrequent clinical findings will result in lower κ scores than the same rate in common ones. Because any one aggravating or alleviating factor is encountered relatively infrequently, this property may have contributed to the lower κ scores seen across this wide range of factors.

Our results are consistent with prior studies of inter-rater reliability of physicians obtaining historical features, showing fair to excellent agreement (κ range 0.27–0.89) in hospitalized chest pain patients,8 fair to good agreement (κ range 0.37–0.69) in suspected stroke patients,9 and good agreement (κ range 0.58–0.71) in patients with suspected osteoarthritis.10 The current study also supports the findings of reports in which non-physician army medical practitioners demonstrated good overall agreement compared to physicians in the assessment of upper respiratory infection.11, 12 Specific to abdominal pain, our results were also consistent with those of a recent study comparing pediatric emergency physicians with surgeons in the evaluation of appendicitis in children showing fair to excellent agreement (κ range 0.33–0.82) for historical questions.13

Accurate data collection is an essential component of high quality clinical research. Prospectively collected data are generally considered to be of higher quality than data collected retrospectively or through chart abstraction. In many prospective studies conducted in the ED, the treating physician is asked to record subjects’ clinical data. This process may be cumbersome and time consuming. It may also be distracting or interfere with the physicians’ other responsibilities or create a fundamental conflict between the physician’s role as care provider and as researcher. To date, this is the first study to compare the ability of RAs with no formal medical training with that of RPs in obtaining historical information for research purposes. If, as the current study suggests, non-medical research assistants can obtain historical information about ED patients with acute abdominal pain that is as accurate or more accurate than that obtained by the treating physician, the burden of data collection may be lifted from the treating physician, allowing it to be obtained and recorded in a less hurried and more meticulous manner. This may result in higher quality medical research on this topic in the ED setting.
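The dependence of κ on how common a finding is, noted earlier in this discussion, can be illustrated with a small Python sketch. The two 2×2 tables below are hypothetical counts chosen for 65 patients and are not study data; the function name `kappa_from_2x2` is also an assumption made for illustration. Both tables have the same number of disagreements (9 of 65) and therefore the same crude agreement, but the finding is common in one and rare in the other.

```python
def kappa_from_2x2(both_yes, ra_only, rp_only, both_no):
    """Return (crude agreement, Cohen's kappa) for a 2x2 table of two raters.

    both_yes / both_no: patients on whom the raters agree.
    ra_only: RA recorded yes, RP recorded no; rp_only: the reverse.
    """
    n = both_yes + ra_only + rp_only + both_no
    p_o = (both_yes + both_no) / n
    ra_yes, rp_yes = both_yes + ra_only, both_yes + rp_only
    # Chance agreement from each rater's marginal yes/no proportions.
    p_e = (ra_yes * rp_yes + (n - ra_yes) * (n - rp_yes)) / (n * n)
    return p_o, (p_o - p_e) / (1 - p_e)


# Hypothetical counts: same 9 disagreements, different prevalence of "yes".
common_finding = kappa_from_2x2(both_yes=28, ra_only=4, rp_only=5, both_no=28)
rare_finding = kappa_from_2x2(both_yes=2, ra_only=4, rp_only=5, both_no=54)

if __name__ == "__main__":
    for name, (p_o, kappa) in [("common", common_finding), ("rare", rare_finding)]:
        print(f"{name} finding: CrA = {p_o:.1%}, kappa = {kappa:.2f}")
```

With these counts the crude agreement is identical in both tables, while κ is markedly lower for the rare finding, because chance agreement on "no" is already high when few patients report the symptom.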

LIMITATIONS

As this study was conducted in a single institution with an established Academic Associate Program, our results may not be generalizable to other practice settings. Under-enrollment of patients evaluated in the overnight hours, the most acutely ill patients, and patients who did not consent to participate in the study may have caused some selection bias. The authors do not know of any “gold standard” available to be certain that patient responses to historical items are accurate. As such, this study design was our best attempt to study accuracy and inter-rater reliability in obtaining historical data for patients with abdominal pain. It is possible that patients may have been prompted into providing answers that were consistent with one of their prior responses when being interviewed by the third person for the “criterion standard” form. It is also possible that the third-person interviewer might have had a tendency to “coach” respondents to resolve discrepancies in a way that supported the data obtained by the first RA. Neither RAs nor RPs were blinded to the purpose of the study, which may have biased our results.

CONCLUSION

Non-medical research assistants focused on clinical research are often more accurate than physicians, who may be distracted by patient care responsibilities, at obtaining data for clinical research. They can reliably use a standardized data collection sheet to obtain historical information from patients who present to the ED with acute abdominal pain.
REFERENCES

1. Judd E Hollander; Adam J Singer. An innovative strategy for conducting clinical research: the academic associate program. Acad Emerg Med. 2002-02.

2. Thea L James; James Feldman; Supriya D Mehta. Physician variability in history taking when evaluating patients presenting with chest pain in the emergency department. Acad Emerg Med. 2006-01-25.

3. D J Cobaugh; L L Spillane; S M Schneider. Research subject enroller program: a key to successful emergency medicine research. Acad Emerg Med. 1997-03.

4. J E Hollander; S M Valentine; G X Brogan. Academic associate program: integrating clinical emergency medicine research with undergraduate education. Acad Emerg Med. 1997-03.

5. F P Wilson; L O Wilson; M F Wheeler; L Canales; R W Wood. Algorithm-directed care by nonphysician practitioners in a pediatric population: Part I. Adherence to algorithm logic and reproducibility of nonphysician practitioner data-gathering behavior. Med Care. 1983-02.

6. D H Hickam; H C Sox; C H Sox. Systematic bias in recording the history in patients with chest pain. J Chronic Dis. 1985.

7. Anupam B Kharbanda; Steven J Fishman; Richard G Bachur. Comparison of pediatric emergency physicians' and surgeons' evaluation and diagnosis of appendicitis. Acad Emerg Med. 2008-02.

8. Peter J Hand; Janneke A Haisma; Joseph Kwan; Richard I Lindley; Bart Lamont; Martin S Dennis; Joanna M Wardlaw. Interobserver agreement for the bedside clinical assessment of suspected stroke. Stroke. 2006-02-16.

9. D Shinar; C R Gross; J P Mohr; L R Caplan; T R Price; P A Wolf; D B Hier; C S Kase; I G Fishman; C L Wolf. Interobserver variability in the assessment of neurologic history and examination in the Stroke Data Bank. Arch Neurol. 1985-06.

10. K Bradley; H H Osborn; M Tang. College research associates: a program to increase emergency medicine clinical research productivity. Ann Emerg Med. 1996-09.
