BACKGROUND: The Mayo Clinic developed temporal information detection systems for the 2012 i2b2 Natural Language Processing Challenge.
OBJECTIVE: To construct automated systems for EVENT/TIMEX3 extraction and temporal link (TLINK) identification from clinical text.
MATERIALS AND METHODS: The i2b2 organizers provided 190 annotated discharge summaries as the training set and 120 discharge summaries as the test set. Our Event system used a conditional random field classifier with a variety of features, including lexical information, natural language elements, and medical ontology. The TIMEX3 system employed a rule-based method that used regular expression pattern matching and systematic reasoning to determine normalized values. The TLINK system employed both rule-based reasoning and machine learning. All three systems were built in the Apache Unstructured Information Management Architecture (UIMA) framework.
RESULTS: Our TIMEX3 system performed the best among the challenge teams (F-measure of 0.900, value accuracy of 0.731). The Event system produced an F-measure of 0.870, and the TLINK system an F-measure of 0.537.
CONCLUSIONS: Our TIMEX3 system demonstrated that regular expression rules can extract and normalize time information well. The Event and TLINK machine learning systems required well-defined feature sets to perform well. Expert knowledge could also be incorporated into the machine learning features to further improve TLINK identification performance.
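To illustrate the kind of rule-based approach described for TIMEX3, the following is a minimal sketch of regular expression extraction followed by normalization to ISO 8601 values. It is not the challenge system: the patterns, the function name extract_timex3, the anchor parameter, the output field names, and the assumption that numeric dates are written month/day/year are all hypothetical choices made for this example.

```python
import re
from datetime import date, timedelta

# Hypothetical rule set for illustration only; real clinical text needs far more patterns.
MONTHS = {name: i + 1 for i, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"])}

NUMERIC_DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")  # assumed MM/DD/YYYY
MONTH_DATE = re.compile(
    r"\b(" + "|".join(MONTHS) + r")\s+(\d{1,2}),\s*(\d{4})\b", re.IGNORECASE)
DAYS_AGO = re.compile(r"\b(\d+)\s+days?\s+ago\b", re.IGNORECASE)


def _iso(year, month, day):
    """Return an ISO 8601 date string, or None if the date is not valid."""
    try:
        return date(year, month, day).isoformat()
    except ValueError:
        return None


def extract_timex3(text, anchor=None):
    """Find date expressions and attach a normalized value to each.

    `anchor` is an assumed reference date (for example, the admission date)
    used to resolve relative expressions such as "3 days ago".
    """
    found = []

    def add(match, value):
        if value is not None:
            found.append({"text": match.group(0), "start": match.start(),
                          "end": match.end(), "type": "DATE", "val": value})

    for m in NUMERIC_DATE.finditer(text):
        add(m, _iso(int(m.group(3)), int(m.group(1)), int(m.group(2))))
    for m in MONTH_DATE.finditer(text):
        add(m, _iso(int(m.group(3)), MONTHS[m.group(1).lower()], int(m.group(2))))
    if anchor is not None:
        # Simple reasoning step: resolve a relative expression against the anchor.
        for m in DAYS_AGO.finditer(text):
            add(m, (anchor - timedelta(days=int(m.group(1)))).isoformat())

    return sorted(found, key=lambda item: item["start"])


if __name__ == "__main__":
    note = "Admitted on 03/14/2012 and discharged March 20, 2012."
    for item in extract_timex3(note):
        print(item["text"], "->", item["val"])
```

In this sketch, extraction (the pattern match) and normalization (the conversion to a canonical value) are separate steps, so a span that matches a pattern but fails validation is dropped rather than emitted with a wrong value.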