Assessing the Performance of Clinical Natural Language Processing Systems: Development of an Evaluation Methodology.

Lea Canales, Sebastian Menke, Stephanie Marchesseau, Ariel D'Agostino, Carlos Del Rio-Bermudez, Miren Taberna, Jorge Tello.

Abstract

BACKGROUND: Clinical natural language processing (cNLP) systems are of crucial importance because of their increasing capability to extract clinically important information from the free text contained in electronic health records (EHRs). Converting an unstructured representation of a patient's clinical history into a structured format enables medical doctors to generate clinical knowledge at a level that was not possible before. The interpretation of the insights provided by cNLP systems in turn has great potential to drive decisions about clinical practice. However, carrying out robust evaluations of these cNLP systems is a complex task that is hindered by a lack of standard guidance on how to approach them systematically.
OBJECTIVE: Our objective was to offer natural language processing (NLP) experts a methodology for the evaluation of cNLP systems to assist them in carrying out this task. By following the proposed phases, the robustness and representativeness of the performance metrics of their own cNLP systems can be assured.
METHODS: The proposed evaluation methodology comprised five phases: (1) the definition of the target population, (2) the statistical document collection, (3) the design of the annotation guidelines and annotation project, (4) the external annotations, and (5) the cNLP system performance evaluation. We presented the application of all phases to evaluate the performance of a cNLP system called "EHRead Technology" (developed by Savana, an international medical company), applied in a study on patients with asthma. As part of the evaluation methodology, we introduced the Sample Size Calculator for Evaluations (SLiCE), a software tool that calculates the number of documents needed to achieve a statistically useful and resource-efficient gold standard.
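The abstract does not state which formula SLiCE uses, so the following is only a minimal sketch of the general idea behind a sample size calculation for an evaluation. It assumes the textbook normal-approximation formula for estimating a proportion, n = z^2 * p * (1 - p) / d^2, where p is the expected value of the metric and d is the desired half-width of the confidence interval. The function name and parameters are hypothetical and are not taken from the SLiCE library.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(expected, half_width, confidence=0.95):
    """Number of evaluated items needed so that a proportion's confidence
    interval has (approximately) the requested half-width.

    Hypothetical helper using the normal-approximation formula
    n = z^2 * p * (1 - p) / d^2. This is NOT the SLiCE API.
    """
    if not 0.0 < expected < 1.0:
        raise ValueError("expected must lie strictly between 0 and 1")
    if not 0.0 < half_width < 1.0:
        raise ValueError("half_width must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")

    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    # Round up: a fractional item cannot be annotated.
    return math.ceil(z ** 2 * expected * (1 - expected) / half_width ** 2)


if __name__ == "__main__":
    # Illustrative inputs only; these values are not from the study.
    print(sample_size_for_proportion(expected=0.5, half_width=0.05))
```

Under this formula the required sample is largest when the expected value is 0.5 and shrinks as the expected metric moves toward 0 or 1, which is why a prior estimate of system performance can reduce the annotation effort.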
RESULTS: The application of the proposed evaluation methodology to a real use-case study of patients with asthma revealed the benefit of the different phases for cNLP system evaluations. By using SLiCE to adjust the number of documents needed, a meaningful and resource-efficient gold standard was created. In the presented use case, using as few as 519 EHRs, it was possible to evaluate the performance of the cNLP system and obtain performance metrics for the primary variable within the expected CIs.
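The abstract does not name the performance metrics or the CI method used. As a hedged illustration of what reporting metrics with CIs can look like, the sketch below assumes the common choice of precision, recall, and F1 computed from true-positive, false-positive, and false-negative counts against the gold standard, with Wilson score intervals for precision and recall. All function names and the example counts are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, total, confidence=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    if total <= 0:
        raise ValueError("total must be positive")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    margin = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - margin, centre + margin


def evaluate(tp, fp, fn, confidence=0.95):
    """Precision, recall, and F1 from counts against a gold standard,
    with Wilson intervals for precision and recall."""
    if tp + fp == 0 or tp + fn == 0:
        raise ValueError("need at least one predicted and one gold positive")
    return {
        "precision": tp / (tp + fp),
        "precision_ci": wilson_interval(tp, tp + fp, confidence),
        "recall": tp / (tp + fn),
        "recall_ci": wilson_interval(tp, tp + fn, confidence),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the study.
    for name, value in evaluate(tp=90, fp=10, fn=10).items():
        print(name, value)
```

The width of each interval depends on the number of gold-standard or predicted positives rather than on the number of documents alone, which is one reason the sample size has to be planned before annotation starts.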
CONCLUSIONS: We showed that our evaluation methodology can offer guidance to NLP experts on how to approach the evaluation of their cNLP systems. By following the five phases, NLP experts can assure the robustness of their evaluation and avoid unnecessary investment of human and financial resources. Besides the theoretical guidance, we offer SLiCE as an easy-to-use, open-source Python library.

©Lea Canales, Sebastian Menke, Stephanie Marchesseau, Ariel D’Agostino, Carlos del Rio-Bermudez, Miren Taberna, Jorge Tello. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 23.07.2021.

Keywords:  clinical natural language processing; electronic health records; gold standard; natural language processing; reference standard; sample size

Year:  2021        PMID: 34297002     DOI: 10.2196/20492

Source DB:  PubMed          Journal:  JMIR Med Inform


  7 in total

1.  CORR Insights®: Can We Geographically Validate a Natural Language Processing Algorithm for Automated Detection of Incidental Durotomy Across Three Independent Cohorts From Two Continents?

Authors:  Eugene K Wai
Journal:  Clin Orthop Relat Res       Date:  2022-05-25       Impact factor: 4.755

2.  MedTAG: a portable and customizable annotation tool for biomedical documents.

Authors:  Fabio Giachelle; Ornella Irrera; Gianmaria Silvello
Journal:  BMC Med Inform Decis Mak       Date:  2021-12-18       Impact factor: 2.796

3.  Assessment of medical management in Coronary Type 2 Diabetic patients with previous percutaneous coronary intervention in Spain: A retrospective analysis of electronic health records using Natural Language Processing.

Authors:  Carlos González-Juanatey; Manuel Anguita-Sánchez; Vivencio Barrios; Iván Núñez-Gil; Juan José Gómez-Doblas; Xavier García-Moll; Carlos Lafuente-Gormaz; María Jesús Rollán-Gómez; Vicente Peral-Disdie; Luis Martínez-Dolz; Miguel Rodríguez-Santamarta; Xavier Viñolas-Prat; Toni Soriano-Colomé; Roberto Muñoz-Aguilera; Ignacio Plaza; Alejandro Curcio-Ruigómez; Ernesto Orts-Soler; Javier Segovia; Claudia Maté; Ángel Cequier
Journal:  PLoS One       Date:  2022-02-10       Impact factor: 3.240

4.  Clinical characteristics and prognostic factors for Crohn's disease relapses using natural language processing and machine learning: a pilot study.

Authors:  Fernando Gomollón; Javier P Gisbert; Iván Guerra; Rocío Plaza; Ramón Pajares Villarroya; Luis Moreno Almazán; Mª Carmen López Martín; Mercedes Domínguez Antonaya; María Isabel Vera Mendoza; Jesús Aparicio; Vicente Martínez; Ignacio Tagarro; Alonso Fernández-Nistal; Sara Lumbreras; Claudia Maté; Carmen Montoto
Journal:  Eur J Gastroenterol Hepatol       Date:  2022-04-01       Impact factor: 2.566

5.  Evaluation of Natural Language Processing for the Identification of Crohn Disease-Related Variables in Spanish Electronic Health Records: A Validation Study for the PREMONITION-CD Project.

Authors:  Carmen Montoto; Javier P Gisbert; Iván Guerra; Rocío Plaza; Ramón Pajares Villarroya; Luis Moreno Almazán; María Del Carmen López Martín; Mercedes Domínguez Antonaya; Isabel Vera Mendoza; Jesús Aparicio; Vicente Martínez; Ignacio Tagarro; Alonso Fernandez-Nistal; Lea Canales; Sebastian Menke; Fernando Gomollón
Journal:  JMIR Med Inform       Date:  2022-02-18

Review 6.  A glance into the future of diagnosis and treatment of spondyloarthritis.

Authors:  Victoria Navarro-Compán; Joerg Ermann; Denis Poddubnyy
Journal:  Ther Adv Musculoskelet Dis       Date:  2022-07-22       Impact factor: 3.625

7.  Real-life burden of hospitalisations due to COPD exacerbations in Spain.

Authors:  José Luis Izquierdo; José Miguel Rodríguez; Carlos Almonacid; María Benavent; Ramón Arroyo-Espliguero; Alvar Agustí
Journal:  ERJ Open Res       Date:  2022-08-15