Renaud Schiappa (1), Sara Contu (1), Dorian Culie (2), Brice Thamphya (1), Yann Chateau (1), Jocelyn Gal (1), Caroline Bailleux (3), Juliette Haudebourg (4), Jean-Marc Ferrero (4), Emmanuel Barranger (3), Emmanuel Chamorey (1).
1. Department of Epidemiology, Biostatistics and Health Data, Centre Antoine Lacassagne, University of Côte d'Azur, Nice, France.
2. Cervico-facial Oncology Surgical Department, University Institute of Face and Neck, University of Côte d'Azur, Nice, France.
3. Department of Medical Oncology, Centre Antoine Lacassagne, University of Côte d'Azur, Nice, France.
4. Anatomy and Pathological Cytology Laboratory, Centre Antoine Lacassagne, University of Côte d'Azur, Nice, France.
Abstract
PURPOSE: Electronic medical records are a valuable source of information about patients' clinical status, but they are often free-text documents that require laborious manual review before they can be exploited. Techniques from computer science have been investigated, but the literature has paid only marginal attention to non-English texts. We developed RUBY, a tool designed in collaboration with IBM-France to automatically structure clinical information from French medical records of patients with breast cancer.
MATERIALS AND METHODS: RUBY, which combines state-of-the-art Named Entity Recognition models with keyword extraction and postprocessing rules, was applied to clinical texts. We measured the precision of RUBY in extracting the target information.
RESULTS: RUBY achieved an average precision of 92.8% for the Surgery report, 92.7% for the Pathology report, 98.1% for the Biopsy report, and 81.8% for the Consultation report.
CONCLUSION: These results show that the automatic approach has the potential to extract clinical knowledge effectively from an extensive set of electronic medical records, reducing the manual effort required and saving a significant amount of time. Deeper semantic analysis and a better understanding of context in the text, training on a larger and more recent set of reports (including those containing highly variable entities), and the use of ontologies could further improve the results.
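As an illustration only, the kind of keyword extraction with a postprocessing rule mentioned in the methods, together with a precision calculation, might be sketched as follows. This is not RUBY's implementation: the regular expression, the function names `extract_tumor_size_mm` and `precision`, and the example sentences are hypothetical, and precision is assumed here to mean the share of extracted values that agree with manual review, which the abstract does not spell out.

```python
import re
from typing import List, Optional

# Hypothetical keyword pattern (not taken from RUBY): a number followed by mm or cm.
TUMOR_SIZE = re.compile(r"(\d+(?:[.,]\d+)?)\s*(mm|cm)\b", re.IGNORECASE)


def extract_tumor_size_mm(text: str) -> Optional[float]:
    """Return the first size found in the text, normalized to millimetres."""
    match = TUMOR_SIZE.search(text)
    if match is None:
        return None
    # Postprocessing rule: French reports often use a decimal comma.
    value = float(match.group(1).replace(",", "."))
    unit = match.group(2).lower()
    return value * 10 if unit == "cm" else value


def precision(extracted: List[Optional[float]], reference: List[float]) -> float:
    """Share of extracted (non-None) values that equal the manually reviewed value."""
    pairs = [(e, r) for e, r in zip(extracted, reference) if e is not None]
    if not pairs:
        return 0.0
    correct = sum(1 for e, r in pairs if e == r)
    return correct / len(pairs)


if __name__ == "__main__":
    reports = [
        "Tumeur de 1,5 cm du sein gauche",
        "Lésion de 12 mm",
        "Pas de lésion visible",
    ]
    extracted = [extract_tumor_size_mm(r) for r in reports]
    # Hypothetical manual-review values, in millimetres.
    reference = [15.0, 10.0, 20.0]
    print(extracted, precision(extracted, reference))
```

In this sketch a report with no extracted value is left out of the precision denominator, so missed entities would have to be tracked separately (for example with recall).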
Authors: Alexander W Forsyth; Regina Barzilay; Kevin S Hughes; Dickson Lui; Karl A Lorenz; Andrea Enzinger; James A Tulsky; Charlotta Lindvall Journal: J Pain Symptom Manage Date: 2018-02-27 Impact factor: 3.612
Authors: Adam Yala; Regina Barzilay; Laura Salama; Molly Griffin; Grace Sollender; Aditya Bardia; Constance Lehman; Julliette M Buckley; Suzanne B Coopey; Fernanda Polubriaginof; Judy E Garber; Barbara L Smith; Michele A Gadd; Michelle C Specht; Thomas M Gudewicz; Anthony J Guidi; Alphonse Taghian; Kevin S Hughes Journal: Breast Cancer Res Treat Date: 2016-11-08 Impact factor: 4.872
Authors: Anni Coden; Guergana Savova; Igor Sominsky; Michael Tanenblatt; James Masanz; Karin Schuler; James Cooper; Wei Guan; Piet C de Groen Journal: J Biomed Inform Date: 2008-12-27 Impact factor: 6.317
Authors: Arika E Wieneke; Erin J A Bowles; David Cronkite; Karen J Wernli; Hongyuan Gao; David Carrell; Diana S M Buist Journal: J Pathol Inform Date: 2015-06-23
Authors: David A Hanauer; Jill S Barnholtz-Sloan; Mark F Beno; Guilherme Del Fiol; Eric B Durbin; Oksana Gologorskaya; Daniel Harris; Brett Harnett; Kensaku Kawamoto; Benjamin May; Eric Meeks; Emily Pfaff; Janie Weiss; Kai Zheng Journal: JCO Clin Cancer Inform Date: 2020-05
Authors: Mohammed Alawad; Shang Gao; John X Qiu; Hong Jun Yoon; J Blair Christian; Lynne Penberthy; Brent Mumphrey; Xiao-Cheng Wu; Linda Coyle; Georgia Tourassi Journal: J Am Med Inform Assoc Date: 2020-01-01 Impact factor: 4.497