
A reliability study for evaluating information extraction from radiology reports.

G Hripcsak, G J Kuperman, C Friedman, D F Heitjan.

Abstract

GOAL: To assess the reliability of a reference standard for an information extraction task.
SETTING: Twenty-four physician raters from two sites and two specialties judged whether clinical conditions were present based on reading chest radiograph reports.
METHODS: Variance components, generalizability (reliability) coefficients, and the number of expert raters needed to generate a reliable reference standard were estimated.
RESULTS: Per-rater reliability averaged across conditions was 0.80 (95% CI, 0.79-0.81). Reliability for the nine individual conditions varied from 0.67 to 0.97, with central line presence and pneumothorax the most reliable, and pleural effusion (excluding CHF) and pneumonia the least reliable. One to two raters were needed to achieve a reliability of 0.70, and six raters, on average, were required to achieve a reliability of 0.95. This was far more reliable than a previously published per-rater reliability of 0.19 for a more complex task. Differences between sites were attributable to changes to the condition definitions.
CONCLUSION: In these evaluations, physician raters were able to judge very reliably the presence of clinical conditions based on text reports. Once the reliability of a specific rater is confirmed, it would be possible for that rater to create a reference standard reliable enough to assess aggregate measures on a system. Six raters would be needed to create a reference standard sufficient to assess a system on a case-by-case basis. These results should help evaluators design future information extraction studies for natural language processors and other knowledge-based systems.
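NOTE: The link between per-rater reliability and the number of raters needed can be sketched with the Spearman-Brown formula, a standard way to project the reliability of an average of n raters. The abstract does not state which formula the authors used, so this is an illustrative assumption and not the paper's method; the function names below are hypothetical. The inputs are the figures reported in RESULTS (0.80 average, 0.67 and 0.97 as the range across conditions).

```python
import math


def pooled_reliability(per_rater: float, n_raters: int) -> float:
    """Spearman-Brown: reliability of the average of n_raters ratings.

    Assumes raters are interchangeable, each with reliability per_rater.
    """
    return n_raters * per_rater / (1 + (n_raters - 1) * per_rater)


def raters_needed(per_rater: float, target: float) -> int:
    """Smallest whole number of raters whose pooled reliability reaches target.

    Obtained by solving the Spearman-Brown formula for n and rounding up.
    """
    n = target * (1 - per_rater) / (per_rater * (1 - target))
    return max(1, math.ceil(n))


if __name__ == "__main__":
    # Per-rater reliabilities taken from the abstract's RESULTS section.
    for label, r in [("least reliable condition", 0.67),
                     ("average across conditions", 0.80),
                     ("most reliable condition", 0.97)]:
        print(label, r,
              "raters for 0.70:", raters_needed(r, 0.70),
              "raters for 0.95:", raters_needed(r, 0.95))
```

Under this assumption, a target of 0.70 needs one rater at a per-rater reliability of 0.80 and two at 0.67, in line with the reported "one to two raters". For a target of 0.95, the sketch gives anywhere from one rater (at 0.97) to ten (at 0.67). The reported figure of six is an average, so it need not equal the single value computed from the average reliability of 0.80.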

Year:  1999        PMID: 10094067      PMCID: PMC61353          DOI: 10.1136/jamia.1999.0060143

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


