
CLAMP - a toolkit for efficiently building customized clinical natural language processing pipelines.

Ergin Soysal1, Jingqi Wang1, Min Jiang1, Yonghui Wu1, Serguei Pakhomov2, Hongfang Liu3, Hua Xu1.   

Abstract

Existing general clinical natural language processing (NLP) systems such as MetaMap and Clinical Text Analysis and Knowledge Extraction System have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annotation, Modeling, and Processing), a newly developed clinical NLP toolkit that provides not only state-of-the-art NLP components, but also a user-friendly graphic user interface that can help users quickly build customized NLP pipelines for their individual applications. Our evaluation shows that the CLAMP default pipeline achieved good performance on named entity recognition and concept encoding. We also demonstrate the efficiency of the CLAMP graphic user interface in building customized, high-performance NLP pipelines with 2 use cases, extracting smoking status and lab test values. CLAMP is publicly available for research use, and we believe it is a unique asset for the clinical NLP community.
© The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

Keywords:  clinical text processing; machine learning; natural language processing

Year:  2018        PMID: 29186491      PMCID: PMC7378877          DOI: 10.1093/jamia/ocx132

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


INTRODUCTION

In the medical domain, clinical documents contain rich information needed for both clinical research and daily practice. Natural language processing (NLP) technologies play an important role in unlocking patient information from clinical narratives. Several general-purpose NLP systems have been developed to process clinical text, including Clinical Text Analysis and Knowledge Extraction System (cTAKES), MetaMap/MetaMap Lite, and Medical Language Extraction and Encoding System. These systems can extract diverse types of clinical information and have been successfully applied to many information extraction tasks, such as detection of smoking status and identification of respiratory findings and suspicious breast cancer lesions. In addition, many NLP systems have been developed to extract specific types of information from clinical text, eg, medication information extraction systems, temporal information extraction tools, and deidentification systems. Despite the success of current NLP systems, studies have shown that it takes substantial effort for end users to adopt existing NLP systems. Furthermore, users often report reduced performance when an existing system is applied beyond its original purpose without customization, eg, when moving to a different institution, to different types of clinical notes, or to a different application. For machine learning–based NLP systems, statistical models may have to be retrained using target domain data to achieve the desired performance, due to differences between the target domain and the original domain. The effort of customizing existing NLP systems for individual applications is nontrivial and often requires substantial NLP knowledge and skills, which can be challenging in settings with limited NLP expertise. This hinders the widespread adoption of NLP technologies in the medical domain.
To address this problem, we have developed a new clinical NLP toolkit called CLAMP (Clinical Language Annotation, Modeling, and Processing), which provides not only state-of-the-art NLP modules, but also an integrated development environment with user-friendly graphic user interfaces (GUIs) to allow users to quickly build customized NLP pipelines for individual applications.

METHODS

Architecture, components, and resources

CLAMP is implemented in Java as a desktop application. It builds on the Apache Unstructured Information Management Architecture (UIMA) framework to maximize its interoperability with other UIMA-based systems such as cTAKES. CLAMP also supports the Apache UIMA Asynchronous Scaleout (AS) framework for asynchronous processing in a distributed environment (UIMA AS is a flexible and powerful scale-out solution for NLP pipelines maintained by the Apache Foundation: https://uima.apache.org/doc-uimaas-what.html). CLAMP follows a pipeline-based architecture that decomposes an NLP system into multiple components. Most CLAMP components are built on approaches developed in our lab and proven in multiple clinical NLP challenges, such as i2b2 (2009 and 2010, named entity recognition [NER] tasks, ranked no. 2), Shared Annotated Resources/Conference and Labs of the Evaluation Forum (2013 Task 2, abbreviation recognition, ranked no. 1), and SemEval (2014 Task 7, encoding to concept unique identifiers [CUIs] in the Unified Medical Language System [UMLS], ranked no. 1). Various technologies, including machine learning–based and rule-based methods, were used to develop these components. CLAMP's available components and their specifications are as follows:

- Sentence boundary detection: CLAMP provides both a machine learning–based sentence detector using OpenNLP and a configurable rule-based sentence boundary detection component.
- Tokenizer: CLAMP contains 3 types of tokenizers: the machine learning–based OpenNLP tokenizer, a delimiter-based (eg, white space) tokenizer, and a rule-based tokenizer with various configuration options.
- Part-of-speech tagger: CLAMP implements a machine learning–based part-of-speech tagger retrained on clinical corpora.
- Section header identification: CLAMP uses a dictionary-based approach to identify section headers. We provide a list of common section headers collected from clinical documents; users can extend or optimize the list based on the document types in their tasks.
- Abbreviation recognition and disambiguation: CLAMP partially implements the clinical abbreviation recognition and disambiguation (CARD) framework. Users can specify their own abbreviation list if needed.
- Named entity recognizer: CLAMP provides 3 types of NER approaches: (1) a machine learning–based NER component that uses the conditional random fields (CRF) algorithm in the CRFSuite library, following our proven methods in the 2010 i2b2 challenge; (2) a dictionary-based NER component with a comprehensive lexicon collected from multiple resources such as the UMLS; users can provide their own lexicons and specify options for the dictionary lookup algorithm, such as with or without stemming; and (3) a regular expression–based NER component for entities with common patterns, such as dates and phone numbers.
- Assertion and negation: CLAMP provides a machine learning–based approach for assertion detection that we developed in the 2010 i2b2 challenge, which determines 6 types of assertion: present, absent, possible, conditional, hypothetical, and not associated with the patient. The rule-based NegEx algorithm is also implemented, and users can specify additional negation lexicons and rules.
- UMLS encoder: After an entity is recognized, it can be mapped to UMLS CUIs using this component (a task also known as entity linking). The UMLS encoder is built on our top-ranked algorithm in the SemEval-2014 challenge, which calculates the similarity between an entity and candidate concepts using the vector space model.
- Rule engine: We integrated the Apache Ruta rule engine into CLAMP, allowing users to add rules before or after the machine learning algorithms to fine-tune performance. Users can develop Ruta rules either by editing the rule files directly or by using the interface for rule specification.
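To give a rough feel for the vector space model behind the UMLS encoder, the sketch below scores an entity mention against candidate concept names by cosine similarity over bag-of-words vectors. This is a simplified illustration under our own assumptions (whitespace tokenization, raw term frequencies), not CLAMP's actual encoding algorithm, and the class and method names are ours.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified sketch of vector-space similarity between an entity mention
// and a candidate concept name. The bag-of-words representation and
// whitespace tokenization here are illustrative assumptions.
public class VectorSpaceSimilarity {

    // Build a bag-of-words term-frequency vector from a phrase.
    static Map<String, Integer> toVector(String phrase) {
        Map<String, Integer> vector = new HashMap<>();
        for (String token : phrase.toLowerCase().split("\\s+")) {
            vector.merge(token, 1, Integer::sum);
        }
        return vector;
    }

    // Cosine similarity between two term-frequency vectors.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
            normA += e.getValue() * e.getValue();
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        Map<String, Integer> mention = toVector("heart attack");
        // Lexically overlapping candidate scores higher than a synonym
        // with no shared tokens; a real encoder must handle the latter.
        System.out.println(cosine(mention, toVector("attack of the heart")));
        System.out.println(cosine(mention, toVector("myocardial infarction")));
    }
}
```

The example also shows why pure lexical similarity is insufficient on its own: "myocardial infarction" shares no tokens with "heart attack," which is one reason concept encoding in practice combines such scores with richer lexicons and features.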
In addition to the above NLP components, we prepared 338 clinical notes with entity annotations derived from MTSamples, a collection of different types of transcribed clinical note examples made for clinical documentation education, as a test corpus to be co-released with CLAMP.

GUI development

CLAMP’s GUI was built on top of the Eclipse framework, which provides built-in components for developing interactive interfaces. Figure 1 shows a screenshot of the main interface of CLAMP for building an NLP pipeline. Built-in NLP components are listed in the top-left palette, and the corpus management palette is in the middle-left area. User-defined NLP pipelines are displayed in the bottom-left palette. The details of each pipeline are displayed in the center area after users click a pipeline. A pipeline can be created visually by dragging and dropping components into the middle window, following specific ordering constraints (eg, the tokenizer must precede NER). After selecting the components of a pipeline, users can click each component to customize its settings. For example, for regular expression–based or dictionary-based NER components, users can specify their own regular expression or dictionary files. For machine learning–based NER, users can swap the default machine learning model with models trained on local data.
Figure 1.

The user interface for building a pipeline in CLAMP.

To facilitate building machine learning–based NER modules on local data, CLAMP provides interfaces for corpus annotation and model training. We developed a fully functional annotation interface (by leveraging the brat annotation tool), which allows users to define the entity types of interest and annotate them following guidelines (see Figure 2 for the annotation interface). After finishing annotation, users can click the training icon to build CRF models using the annotated corpus. The system will automatically report its performance based on user-specified evaluation settings (eg, 5-fold cross-validation). Figure 3 shows the popup window where users can select different types of features to build the CRF-based NER models.
Figure 2.

The interface in CLAMP for annotating entities and relations.

Figure 3.

The interface for selecting features and evaluation options for building machine learning–based NER models using CLAMP.


Evaluation

CLAMP is currently available in 2 versions: (1) CLAMP-CMD, a command line NLP system to extract clinical concepts, built on default CLAMP components, and (2) CLAMP-GUI, which provides the GUI for building customized NLP pipelines. We evaluated CLAMP-CMD on 2 NLP tasks: (1) an NER task to recognize problems, treatments, and lab tests, similar to the 2010 i2b2 challenge; and (2) a UMLS CUI encoding task for diseases, similar to the 2014 SemEval Task 7. For the NER task, we included 3 corpora annotated following the guidelines in the i2b2 challenge: (1) the discharge summaries used in the 2010 i2b2 challenge (i2b2, 871 annotated notes); (2) a new corpus of outpatient clinic visit notes from the University of Texas Health Science Center at Houston (UTNotes, 1351 notes); and (3) a new corpus of mock clinical documents from MTSamples, as described in the Methods section (MTSamples, 338 notes). We randomly selected 50 notes from each corpus as the test sets for evaluating CLAMP-CMD, and combined the remaining notes for training the CRF-based NER model. Standard measurements of precision (P), recall (R), and F-measure (F1) were reported for each corpus using both the exact and relaxed matching evaluation scripts from the i2b2 challenge. For the UMLS CUI encoding task, we compared CLAMP-CMD with MetaMap, MetaMap Lite, and cTAKES using the SemEval-2014 Task 7 corpus, which contains 431 notes. In order to compare these systems, we limited the CUIs to those in Systematized Nomenclature of Medicine – Clinical Terms only and slightly changed the evaluation criteria: if a CUI is identified by both the gold standard and a system within the same sentence, we treat it as a true positive, without considering offsets. As MetaMap and cTAKES sometimes output multiple CUIs for one concept, we either selected the one with the highest score or randomly picked one from the CUIs with a tied score (or no score).
For CLAMP-GUI, we conducted 2 use cases to demonstrate its efficiency: (1) detect smoking status (past smoker, current smoker, or nonsmoker), similar to the task in, and (2) extract lab test names and associated values. For smoking status detection, we annotated 300 sentences with a smoking keyword; 100 were used to develop the rule-based system and 200 were used to test the system. For lab test name/value extraction, we annotated 50 clinic visit notes and divided them into a development set (25 notes) and a test set (25 notes). Performance of the customized pipelines and development time using CLAMP-GUI was then reported using the test sets.
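To illustrate the kind of rule-based logic the smoking status use case relies on, the sketch below classifies a sentence containing a smoking keyword into the three status categories with a few keyword patterns. These particular patterns and names are hypothetical examples of ours; the actual pipeline was built with CLAMP's GUI and Ruta rules and is not reproduced here.

```java
import java.util.regex.Pattern;

// Hypothetical keyword rules for sentence-level smoking status detection.
// Rules are checked in priority order: negation first, then past, then current.
public class SmokingStatusRules {
    private static final Pattern NON =
        Pattern.compile("\\b(denies|never|no)\\b.*\\bsmok", Pattern.CASE_INSENSITIVE);
    private static final Pattern PAST =
        Pattern.compile("\\b(quit|former|ex-smoker|stopped)\\b", Pattern.CASE_INSENSITIVE);
    private static final Pattern CURRENT =
        Pattern.compile("\\b(smokes|current smoker|smoking)\\b", Pattern.CASE_INSENSITIVE);

    public static String classify(String sentence) {
        if (NON.matcher(sentence).find()) return "nonsmoker";
        if (PAST.matcher(sentence).find()) return "past smoker";
        if (CURRENT.matcher(sentence).find()) return "current smoker";
        return "unknown";
    }

    public static void main(String[] args) {
        System.out.println(classify("Patient denies smoking."));
        System.out.println(classify("He quit smoking 10 years ago."));
        System.out.println(classify("She smokes one pack per day."));
    }
}
```

Ordering the rules matters: "quit smoking" contains the current-smoker keyword "smoking," so the past-smoker rule must fire first, which mirrors the kind of fine-tuning the development set is used for.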

RESULTS

Table 1 shows the results of CLAMP-CMD on the NER task across different corpora. When the relaxed matching criterion was used, CLAMP-CMD could recognize clinical entities (problems, treatments, and tests) with reasonable F-measures (>90%) across the different corpora. Table 2 shows the results of different clinical NLP systems on mapping disease entities to UMLS CUIs on the SemEval-2014 corpus. CLAMP achieved an F-measure superior to those of MetaMap and cTAKES, with faster processing speed, although MetaMap Lite achieved the best F-measure. In Supplementary Table 1, we also report evaluation results of individual CLAMP components (eg, tokenizer and sentence boundary detector) using existing corpora. The performance of each component was comparable to the state-of-the-art results reported by other systems.
Table 1.

Performance of CLAMP-CMD on the NER task (problem, treatment, and test) across different corpora

Corpus      No. of entities   State-of-the-art F1   Exact match               Relaxed match
                              (exact/relaxed)       Precision  Recall  F1     Precision  Recall  F1
i2b2        72 846            0.85/0.92a            0.89       0.86    0.88   0.96       0.93    0.94
MTSamples   25 531            N/A                   0.84       0.81    0.83   0.92       0.89    0.91
UTNotes     124 869           N/A                   0.92       0.90    0.91   0.96       0.94    0.95

aThe best performance reported in the shared task.

Table 2.

Performance of CLAMP (version 1.3), MetaMap (2016), MetaMap Lite (2016, version 3.4), and cTAKES (version 4) on mapping disease concepts to UMLS CUIs using the SemEval-2014 corpus

NLP system     No. of entities                Precision   Recall   F1      Processing time (s/doc)a
               Correct   Predicted   Gold
CLAMP          7228      9329        13 555   0.775       0.533    0.632   0.95
MetaMap        5574      10 214      13 555   0.546       0.411    0.469   7.07
MetaMap Lite   8009      11 282      13 555   0.710       0.591    0.645   1.95
cTAKES         9126      19 713      13 555   0.463       0.673    0.549   2.27

aAll evaluations were performed on a MacBook with 16 GB of RAM and a 4-core Intel i7 CPU. For MetaMap, the default settings were used. For cTAKES, the fast dictionary lookup annotator was used. The performance of both MetaMap and cTAKES could be further improved by optimizing their settings.
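As a sanity check, the precision, recall, and F1 values in Table 2 can be recomputed directly from the correct/predicted/gold entity counts; the sketch below does this for the CLAMP row. The class and helper names are ours, introduced only for illustration.

```java
// Recompute precision, recall, and F1 from entity counts, as in Table 2:
// P = correct/predicted, R = correct/gold, F1 = harmonic mean of P and R.
public class F1FromCounts {

    static double precision(int correct, int predicted) {
        return (double) correct / predicted;
    }

    static double recall(int correct, int gold) {
        return (double) correct / gold;
    }

    static double f1(double p, double r) {
        return 2 * p * r / (p + r);
    }

    public static void main(String[] args) {
        // CLAMP row of Table 2: 7228 correct, 9329 predicted, 13 555 gold.
        double p = precision(7228, 9329);
        double r = recall(7228, 13555);
        // Prints P=0.775 R=0.533 F1=0.632, matching the table.
        System.out.printf("P=%.3f R=%.3f F1=%.3f%n", p, r, f1(p, r));
    }
}
```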

For smoking status detection, we quickly built a rule-based pipeline (∼4 h) using CLAMP-GUI, which could detect patients’ smoking status (nonsmoker, current smoker, past smoker) with accuracies of 0.95, 0.89, and 0.90, respectively. For lab test names/values, we built a hybrid pipeline that combines machine learning (eg, CLAMP’s default NER model for lab names) and rules (eg, for extracting lab values and linking values to names) within approximately 12 h (8 h of annotation and 4 h of customizing components). The pipeline contains 8 different components (Sentence Boundary, Tokenizer, Section Header, POS Tagger, CRF-based NER, Regular Expression–based NER, Ruta Rule Engine, and Relationship Connector) and achieved F-measures of 0.98, 0.85, and 0.81 for recognizing lab test names, values, and their relations in the test set, respectively. The detailed processes for developing these 2 pipelines were recorded as videos, available at http://clamp.uth.edu/tutorial.php.

DISCUSSION

GUI-based NLP tools such as General Architecture for Text Engineering have been developed and are widely used in the general domain, but few exist in the medical domain. The main advantage of CLAMP is that it provides GUIs that allow non-NLP experts to quickly develop customized clinical information extraction pipelines using proven state-of-the-art methods, eg, machine learning–based NER models. Although there is an increasing trend of applying machine learning to clinical NLP systems, widely used systems such as cTAKES and MetaMap do not provide easy ways to build machine learning–based models. To the best of our knowledge, CLAMP is the first comprehensive clinical NLP system that provides interfaces for building hybrid solutions (machine learning plus rules) for information extraction. However, such a GUI-based tool also comes with limitations; eg, some tasks are complex and difficult to build through GUIs. To address such issues, we also provide application programming interfaces for individual components in CLAMP, so that professional developers can build integrated systems by calling them directly. To further facilitate building NLP solutions for end users, we are developing a library of NLP pipelines for diverse types of clinical information in CLAMP. For example, if a user wants to extract ejection fraction information from local text and there is a prebuilt pipeline for ejection fraction, he/she can simply copy the prebuilt pipeline to his/her own workspace and customize each component based on local data, rather than starting from scratch, thus saving development time. So far we have developed >30 pipelines for extracting different types of information from clinical text, ranging from general pipelines (eg, medication and signature information) to specific pipelines (eg, smoking status). CLAMP was developed with interoperability in mind. It can directly exchange objects with cTAKES via Apache UIMA interfaces.
We have also developed wrappers for displaying MetaMap outputs in CLAMP and for integrating CLAMP with other NLP frameworks such as Leo. Our future work includes developing more NLP components (eg, syntactic parsing and relation extraction), improving the GUIs by conducting formal usability testing, normalizing CLAMP’s outputs to common data models such as those of the Observational Medical Outcomes Partnership, and expanding the library of NLP pipelines for different information extraction tasks. CLAMP is currently freely available for research use at http://clamp.uth.edu. To develop a sustainable model for its continued development and maintenance, we are evaluating a paid licensing model for industrial use. Since its release in 2016, there have been >160 downloads by >120 academic institutions and industrial entities.

CONCLUSION

CLAMP integrates proven state-of-the-art NLP algorithms and user-friendly interfaces to facilitate efficient building of customized NLP pipelines for diverse clinical applications. We believe it will complement existing clinical NLP systems and help accelerate the adoption of NLP in clinical research and practice.

COMPETING INTEREST

The authors have no competing interests to declare.

FUNDING

This work was supported in part by grants from the National Institute of General Medical Sciences, GM102282 and GM103859, the National Library of Medicine, LM 010681, the National Cancer Institute, CA194215, and the Cancer Prevention and Research Institute of Texas, R1307.

CONTRIBUTORS

Study planning: SP, HL, HX. Software design and implementation: ES, JW, MJ, YW. Wrote the paper: ES, SP, HX. All authors read and approved the final manuscript.

SUPPLEMENTARY MATERIAL

Supplementary material is available at Journal of the American Medical Informatics Association online.
