Trie-based rule processing for clinical NLP: A use-case study of n-trie, making the ConText algorithm more efficient and scalable.

Abstract

OBJECTIVE: To develop and evaluate an efficient Trie structure for large-scale, rule-based clinical natural language processing (NLP), which we call n-trie.
BACKGROUND: Despite the popularity of machine learning techniques in natural language processing, rule-based systems offer important advantages: transparency, ease of incorporating external knowledge, and less demanding annotation requirements. However, processing efficiency remains a major obstacle to adopting standard rule-based NLP solutions in big data analyses.
METHODS: We developed n-trie to specifically address the token-based nature of context detection, an important facet of clinical NLP that is known to slow down NLP pipelines. N-trie, a new rule processing engine using a revised Trie structure, allows fast execution of lexicon-based NLP rules. To determine its applicability and evaluate its performance, we applied the n-trie engine in an implementation (called FastContext) of the ConText algorithm and compared its processing speed and accuracy with JavaConText and GeneralConText, two widely used Java ConText implementations, as well as with a standalone machine learning NegEx implementation, NegScope.
RESULTS: The n-trie engine ran two orders of magnitude faster and was far less sensitive to rule set size than the comparison implementations, and it proved faster than the best machine learning negation detector. Additionally, the engine's accuracy improved consistently as the rule set grew (the desired outcome of adding new rules), while that of the other implementations did not.
CONCLUSIONS: The n-trie engine is an efficient, scalable engine to support NLP rule processing and shows the potential for application in other NLP tasks beyond context detection.
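
The abstract does not describe n-trie's internals, so the following is only a minimal sketch of the general idea it names: storing lexicon-based rules in a trie keyed on whole tokens and matching them in a single pass over a sentence. It is not the authors' implementation. The class TokenTrie, the function match_rules, the sample trigger phrases, and the labels "negation" and "historical" are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Optional, Tuple


class TokenTrie:
    """Trie keyed on whole tokens rather than characters (illustrative only)."""

    def __init__(self) -> None:
        self.children: Dict[str, "TokenTrie"] = {}
        self.label: Optional[str] = None  # set when a rule ends at this node

    def add_rule(self, phrase: str, label: str) -> None:
        """Insert one whitespace-tokenized trigger phrase with its label."""
        node = self
        for token in phrase.lower().split():
            node = node.children.setdefault(token, TokenTrie())
        node.label = label


def match_rules(trie: TokenTrie, tokens: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, label) for the longest rule matching at each position."""
    matches = []
    i = 0
    while i < len(tokens):
        node = trie
        best = None
        j = i
        # Walk down the trie one token at a time; each step is a dict lookup.
        while j < len(tokens) and tokens[j].lower() in node.children:
            node = node.children[tokens[j].lower()]
            j += 1
            if node.label is not None:
                best = (i, j, node.label)  # remember the longest match so far
        if best:
            matches.append(best)
            i = best[1]  # resume after the matched phrase
        else:
            i += 1
    return matches


# Hypothetical rule set; real ConText lexicons are far larger.
trie = TokenTrie()
trie.add_rule("no", "negation")
trie.add_rule("no evidence of", "negation")
trie.add_rule("history of", "historical")

tokens = "There is no evidence of pneumonia".split()
print(match_rules(trie, tokens))  # [(2, 5, 'negation')]
```

In this sketch, the work done at each token is bounded by the length of the longest rule rather than by the number of rules, which is one plausible reason a trie-based engine would be less sensitive to rule set size than an approach that tests every rule against every sentence.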

Keywords: Algorithms; Data accuracy; Medical informatics applications; Natural language processing

Year: 2018; PMID: 30092358; PMCID: PMC6171746; DOI: 10.1016/j.jbi.2018.08.002

Source DB: PubMed; Journal: J Biomed Inform; ISSN: 1532-0464; Impact factor: 6.317
