
Information Extraction for Clinical Data Mining: A Mammography Case Study.

Houssam Nassif, Ryan Woods, Elizabeth Burnside, Mehmet Ayvaci, Jude Shavlik, David Page.

Abstract

Breast cancer is the leading cause of cancer mortality in women between the ages of 15 and 54. During mammography screening, radiologists use a strict lexicon, the Breast Imaging Reporting and Data System (BI-RADS), to describe and report their findings. Mammography records are then stored in a well-defined database format, the National Mammography Database (NMD) format. Recently, researchers have applied data mining and machine learning techniques to these databases and successfully built breast cancer classifiers that can help in the early detection of malignancy. However, the validity of these models depends on the quality of the underlying databases. Unfortunately, most databases suffer from inconsistencies, missing data, inter-observer variability and inappropriate term usage. In addition, many databases are not compliant with the NMD format, consist solely of text reports, or both. Extracting BI-RADS features from free text and checking consistency between recorded predictive variables and text reports are crucial to addressing this problem. We describe a general scheme for concept information retrieval from free text given a lexicon, and present a BI-RADS feature extraction algorithm for clinical data mining. The algorithm consists of a syntax analyzer, a concept finder and a negation detector. The syntax analyzer preprocesses the input into individual sentences. The concept finder uses a semantic grammar based on the BI-RADS lexicon and the experts' input, and parses sentences to detect BI-RADS concepts. Once a concept is located, a lexical scanner checks for negation. Our method can handle multiple latent concepts within the text while filtering out ultrasound concepts. On our dataset, our algorithm achieves 97.7% precision, 95.5% recall and an F1-score of 0.97. It outperforms manual feature extraction at the 5% statistical significance level.
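The three stages named in the abstract (syntax analyzer, concept finder, negation detector) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tiny `LEXICON`, the `NEGATION` trigger list, the `ULTRASOUND` cue and all function names are hypothetical stand-ins, and plain regular expressions replace the semantic grammar described in the paper.

```python
import re

# Hypothetical mini-lexicon: concept name -> regex over surface forms.
# The real BI-RADS lexicon and the paper's semantic grammar are far richer.
LEXICON = {
    "mass": re.compile(r"\bmass(?:es)?\b", re.IGNORECASE),
    "calcification": re.compile(r"\b(?:micro)?calcifications?\b", re.IGNORECASE),
    "spiculated_margin": re.compile(r"\bspiculated\b", re.IGNORECASE),
}

# Assumed negation triggers, searched for in the text preceding a concept.
NEGATION = re.compile(r"\b(?:no|not|without|absent|negative for)\b", re.IGNORECASE)

# Assumed cue for sentences that describe ultrasound rather than mammography.
ULTRASOUND = re.compile(r"\b(?:ultrasound|sonograph\w*)\b", re.IGNORECASE)


def split_sentences(report):
    """Syntax analyzer stand-in: split after sentence-ending punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", report.strip())
    return [p for p in parts if p]


def extract_concepts(report):
    """Return a sorted list of (concept, is_negated) pairs for the report."""
    found = {}
    for sentence in split_sentences(report):
        if ULTRASOUND.search(sentence):
            continue  # skip ultrasound findings entirely
        for concept, pattern in LEXICON.items():
            match = pattern.search(sentence)
            if match is None:
                continue
            # Lexical scan: is there a negation trigger before the concept?
            negated = NEGATION.search(sentence[:match.start()]) is not None
            # A concept stays negated only if every mention of it is negated.
            found[concept] = found.get(concept, True) and negated
    return sorted(found.items())


if __name__ == "__main__":
    report = (
        "There is a spiculated mass in the left breast. "
        "No suspicious calcifications are seen. "
        "Ultrasound shows a cyst and a mass."
    )
    print(extract_concepts(report))
    # [('calcification', True), ('mass', False), ('spiculated_margin', False)]
```

In this sketch the third sentence is dropped by the ultrasound filter, so its mention of a mass does not affect the result. Scanning only the text before a concept is a deliberate simplification; trailing negations such as "mass: absent" would need a wider window.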


Keywords:  BI-RADS; clinical data mining; free text; lexicon; mammography

Year:  2009        PMID: 23765123      PMCID: PMC3676897          DOI: 10.1109/icdmw.2009.63

Source DB:  PubMed          Journal:  Proc IEEE Int Conf Data Min        ISSN: 1550-4786


