Amber Stubbs (1), Michele Filannino (2,3), Ergin Soysal (4), Samuel Henry (2), Özlem Uzuner (2,3,5)
(1) Department of Mathematics and Computer Science, Simmons University, Boston, Massachusetts, USA.
(2) Information Sciences and Technology, George Mason University, Fairfax, Virginia, USA.
(3) Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.
(4) School of Biomedical Informatics, University of Texas Health Science Center, Houston, Texas, USA.
(5) Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA.
Abstract
OBJECTIVE: Track 1 of the 2018 National NLP Clinical Challenges shared tasks focused on identifying which patients in a corpus of longitudinal medical records meet or do not meet identified selection criteria.
MATERIALS AND METHODS: To address this challenge, we annotated American English clinical narratives for 288 patients according to whether they met these criteria. We chose criteria from existing clinical trials that represented a variety of natural language processing tasks, including concept extraction, temporal reasoning, and inference.
RESULTS: A total of 47 teams, comprising 224 participants from 18 countries, took part in this shared task and submitted 109 system outputs. The best-performing system achieved a micro F1 score of 0.91 using a rule-based approach. The top 10 teams used rule-based and hybrid systems to approach the problems.
DISCUSSION: Clinical narratives are open to interpretation, particularly in cases where the selection criterion may be underspecified. This leaves room for annotators to use domain knowledge and intuition in selecting patients, which may lead to error in system outputs. However, teams that consulted medical professionals while building their systems were more likely to have high recall for patients, which is preferable for patient selection systems.
CONCLUSIONS: There is not yet a one-size-fits-all solution for natural language processing systems approaching this task. Future research in this area can examine criteria requiring even more complex inferences, temporal reasoning, and domain knowledge.
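For readers unfamiliar with the metric reported above, the following is a minimal sketch of one common way to compute a micro-averaged F1 score: true positives, false positives, and false negatives are pooled across all criteria before a single F1 is calculated. The abstract does not describe the official scoring procedure, so the function name `micro_f1`, the criterion names, and the toy labels below are all illustrative assumptions, not the challenge's evaluation script or data.

```python
from typing import Dict, List


def micro_f1(gold: Dict[str, List[bool]], pred: Dict[str, List[bool]]) -> float:
    """Micro-averaged F1 for the positive ("met") label, pooled over criteria.

    gold and pred map a criterion name to one boolean per patient
    (True = patient meets the criterion). Hypothetical data layout.
    """
    tp = fp = fn = 0
    for criterion, gold_labels in gold.items():
        for g, p in zip(gold_labels, pred[criterion]):
            if g and p:
                tp += 1      # correctly predicted "met"
            elif p and not g:
                fp += 1      # predicted "met", annotated "not met"
            elif g and not p:
                fn += 1      # predicted "not met", annotated "met"
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives
    return 2 * tp / denom if denom else 0.0


# Toy example with two made-up criteria and four made-up patients
gold = {
    "CRITERION_A": [True, False, True, True],
    "CRITERION_B": [False, False, True, False],
}
pred = {
    "CRITERION_A": [True, True, True, False],
    "CRITERION_B": [False, False, True, True],
}
score = micro_f1(gold, pred)
print(round(score, 3))
```

Because the counts are pooled before the ratio is taken, criteria with many positive patients weigh more heavily in a micro average than they would in a macro average, which averages per-criterion scores instead.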