Zhengru Shen, Hugo van Krimpen, Marco Spruit.
Abstract
Natural language processing (NLP) has become essential for secondary use of clinical data. Over the last two decades, many clinical NLP systems have been developed in both academia and industry. However, nearly all existing systems are restricted to specific clinical settings, mainly because they were developed for and tested with specific datasets, and they often fail to scale up. Therefore, using existing NLP systems for one's own clinical purposes requires substantial resources and long-term time commitments for customization and testing. Moreover, maintenance is also troublesome and time-consuming. This research presents a lightweight approach for building clinical NLP systems with limited resources. Following the design science research approach, we propose a lightweight architecture that is designed to be composable, extensible, and configurable. It treats NLP as an external component that can be accessed independently and orchestrated in a pipeline via web APIs. To validate its feasibility, we developed a web-based prototype for clinical concept extraction with six well-known NLP APIs and evaluated it on three clinical datasets. In comparison with available benchmarks for the datasets, three high F1 scores (0.861, 0.724, and 0.805) were obtained from the evaluation. The prototype also obtained a low F1 score (0.373) on one of the tests, which is probably due to the small size of the test dataset. The development and evaluation of the prototype demonstrate that our approach has great potential for building effective clinical NLP systems with limited resources.
Year: 2019 PMID: 31511785 PMCID: PMC6714318 DOI: 10.1155/2019/3435609
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 2.682
Figure 1. A general architecture of clinical NLP systems.
Figure 2. A lightweight NLP architecture for clinical NLP.
NLP services of common NLP API providers.
| NLP API | Available NLP services |
|---|---|
| IBM Watson NLU | Entity extraction, concept extraction, relation extraction, text classification, language detection, and sentiment analysis |
| Aylien | Article extraction, entity extraction, concept extraction, summarization, text classification, language detection, semantic labeling, sentiment analysis, hashtag suggestion, image tagging, and microformat extraction |
| Lexalytics | Sentiment analysis, concept extraction, categorization, named entity extraction, theme extraction, and summarization |
| MeaningCloud | Topic extraction, text classification, sentiment analysis, language detection, and linguistic analysis (POS tagging, parsing, and lemmatization) |
| Alchemy API | Entity extraction, concept tagging, keyword extraction, relation extraction, text classification, language detection, sentiment analysis, microformat extraction, feed detection, and linked data |
| TextRazor | Entity extraction, disambiguation, linking, keyword extraction, topic tagging, and classification |
| Developer Cloud | Concept extraction, translation, personality insights, and classification |
| Open Calais | Entity extraction, relation extraction, and sentiment analysis |
| Dandelion API | Entity extraction, text classification, language detection, sentiment analysis, and text similarity |
| Haven OnDemand | Autocomplete, concept extraction, document categorization, entity extraction, language detection, sentiment analysis, and text tokenization |
NLP APIs selected for the prototype.
| API | Fee | Company/team | References |
|---|---|---|---|
| IBM Watson NLU | Free trial | IBM | |
| MeaningCloud | Free trial | MeaningCloud LLC | |
| Open Calais | Free trial | Thomson Reuters | |
| Haven OnDemand | Free trial | Hewlett Packard | |
| TextRazor | Free trial | TextRazor Ltd. | |
| Dandelion API | Free trial | Spaziodati | |
Algorithm 1. Pseudocode of the API integration algorithm.
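The pseudocode of Algorithm 1 is not reproduced in this record. As a rough illustration only of how concept lists returned by several NLP APIs might be combined, the sketch below merges them by simple voting. The extractor functions, the `min_votes` parameter, and the voting rule itself are assumptions made for illustration; they are not the authors' algorithm, and no real API is called.

```python
from collections import Counter
from typing import Callable, Iterable

# An extractor takes raw text and returns the concepts it found.
# In a real system each one would wrap a web API call; here they are
# hypothetical local stand-ins so the sketch runs offline.
Extractor = Callable[[str], Iterable[str]]


def keyword_extractor(vocabulary: set[str]) -> Extractor:
    """Build a toy extractor that reports vocabulary terms present in the text."""
    def extract(text: str) -> list[str]:
        lowered = text.lower()
        return [term for term in vocabulary if term in lowered]
    return extract


def integrate(text: str, extractors: list[Extractor], min_votes: int = 1) -> set[str]:
    """Keep concepts reported by at least `min_votes` extractors (assumed rule)."""
    votes: Counter[str] = Counter()
    for extract in extractors:
        # Normalise and deduplicate so each extractor votes once per concept.
        votes.update({concept.strip().lower() for concept in extract(text)})
    return {concept for concept, count in votes.items() if count >= min_votes}


extractors = [
    keyword_extractor({"diabetes", "hypertension"}),
    keyword_extractor({"diabetes", "metformin"}),
    keyword_extractor({"diabetes", "hypertension", "aspirin"}),
]
note = "Patient with diabetes and hypertension, currently on metformin."
print(sorted(integrate(note, extractors, min_votes=2)))
```

Because each extractor is just a callable, adding or removing an API only changes the list passed to `integrate`, which is one way to read the composable and configurable goals stated in the abstract.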
Figure 3. Prototype architecture.
Figure 4. Prototype user interface of the multiple NLP API extraction pipeline. A demo video and source code are available online.
Impact of negation from the experiments.
| Dataset | Negation | Recall | Precision | F1 |
|---|---|---|---|---|
| Obesity challenge | True | 0.733 | 0.939 | 0.823 |
| Obesity challenge | False | 0.805 | 0.925 | 0.861 |
| Medication challenge | True | 0.62 | 0.835 | 0.712 |
| Medication challenge | False | 0.636 | 0.838 | 0.724 |
| OPERAM medical conditions | True | 0.594 | 0.271 | 0.373 |
| OPERAM medical conditions | False | 0.594 | 0.271 | 0.373 |
| OPERAM medications | True | 0.795 | 0.816 | 0.805 |
| OPERAM medications | False | 0.795 | 0.816 | 0.805 |
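The last value in each row of the negation table is the F1 score, the harmonic mean of the recall and precision values before it. As a quick consistency check, the sketch below recomputes F1 from the reported recall and precision. The function name and the tolerance are choices made here, not part of the original study; because the published inputs are rounded, small differences in the third decimal are expected.

```python
def f1_score(recall: float, precision: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if recall + precision == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (dataset, negation, recall, precision, reported F1), copied from the table.
rows = [
    ("Obesity challenge", True, 0.733, 0.939, 0.823),
    ("Obesity challenge", False, 0.805, 0.925, 0.861),
    ("Medication challenge", True, 0.62, 0.835, 0.712),
    ("Medication challenge", False, 0.636, 0.838, 0.724),
    ("OPERAM medical conditions", True, 0.594, 0.271, 0.373),
    ("OPERAM medical conditions", False, 0.594, 0.271, 0.373),
    ("OPERAM medications", True, 0.795, 0.816, 0.805),
    ("OPERAM medications", False, 0.795, 0.816, 0.805),
]

for dataset, negation, recall, precision, reported in rows:
    computed = f1_score(recall, precision)
    # Inputs are rounded to three decimals, so allow a small tolerance.
    status = "ok" if abs(computed - reported) <= 0.002 else "mismatch"
    print(f"{dataset:28s} negation={negation!s:5s} "
          f"F1={computed:.3f} reported={reported:.3f} {status}")
```

The table also shows that turning negation handling on lowers recall on the two challenge datasets while leaving both OPERAM results unchanged.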
Overall results on three datasets.
| Dataset | Negation | | | Recall | Precision | F1 |
|---|---|---|---|---|---|---|
| Obesity challenge | False | 0.1 | 0.2 | 0.805 | 0.925 | 0.861 |
| Baseline | | | | | | |
| Medication challenge | False | 0.1 | 0.35 | 0.636 | 0.838 | 0.724 |
| Baseline | | | | | | |
| OPERAM medical conditions | True | 0.1 | 0.5 | 0.594 | 0.271 | 0.373 |
| OPERAM medications | False | 0 | 0.35 | 0.795 | 0.816 | 0.805 |
Baseline: average of the top five systems from the challenge.
Figure 5. Error distribution of all the experiments: false positives vs. false negatives.
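The balance between false positives and false negatives can also be read off the reported precision and recall, even without raw counts. Since precision = TP / (TP + FP) and recall = TP / (TP + FN), it follows that FP / TP = 1 / precision - 1 and FN / TP = 1 / recall - 1. The sketch below applies this to the best configuration reported for each dataset. The function name is hypothetical, the results are only as precise as the rounded inputs, and this is not the data behind Figure 5.

```python
def errors_per_true_positive(recall: float, precision: float) -> tuple[float, float]:
    """Return (false positives per TP, false negatives per TP).

    Derived from precision = TP / (TP + FP) and recall = TP / (TP + FN).
    Both inputs must be greater than zero.
    """
    fp_per_tp = 1 / precision - 1
    fn_per_tp = 1 / recall - 1
    return fp_per_tp, fn_per_tp


# (recall, precision) of the best reported configuration per dataset.
scores = {
    "Obesity challenge": (0.805, 0.925),
    "Medication challenge": (0.636, 0.838),
    "OPERAM medical conditions": (0.594, 0.271),
    "OPERAM medications": (0.795, 0.816),
}

for dataset, (recall, precision) in scores.items():
    fp, fn = errors_per_true_positive(recall, precision)
    dominant = "false positives" if fp > fn else "false negatives"
    print(f"{dataset:28s} FP/TP={fp:.2f} FN/TP={fn:.2f} dominated by {dominant}")
```

Under this reading, the low precision on OPERAM medical conditions implies more false positives than false negatives, whereas the other three experiments have more false negatives than false positives.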