UMLS-based data augmentation for natural language processing of clinical research literature.

Tian Kang, Adler Perotte, Youlan Tang, Casey Ta, Chunhua Weng.

Abstract

OBJECTIVE: The study sought to develop and evaluate a knowledge-based data augmentation method to improve the performance of deep learning models for biomedical natural language processing by overcoming training data scarcity.
MATERIALS AND METHODS: We extended the easy data augmentation (EDA) method to biomedical named entity recognition (NER) by incorporating knowledge from the Unified Medical Language System (UMLS), and named the resulting method UMLS-EDA. We designed experiments to systematically evaluate the effect of UMLS-EDA on popular deep learning architectures for both NER and classification. We also compared UMLS-EDA to BERT.
RESULTS: UMLS-EDA enables substantial improvement on NER tasks over the original long short-term memory conditional random fields (LSTM-CRF) model (micro-F1 score: +5%, +17%, and +15%), helps the LSTM-CRF model (micro-F1 score: 0.66) outperform LSTM-CRF with transfer learning by BERT (0.63), and improves the performance of the state-of-the-art sentence classification model. The largest gain in micro-F1 score is 9%, from 0.75 to 0.84, which is better than classifiers with BERT pretraining (0.82).
CONCLUSIONS: This study presents a UMLS-based data augmentation method, UMLS-EDA. It is effective at improving deep learning models for both NER and sentence classification, and contributes original insights for designing new, superior deep learning approaches for low-resource biomedical domains.
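
The augmentation idea described under MATERIALS AND METHODS can be illustrated with a minimal Python sketch. It is not the authors' implementation: it shows only one EDA-style operation (synonym replacement) applied to entity mentions in a BIO-tagged sentence, and it assumes a small hand-written dictionary, TOY_SYNONYMS, as a stand-in for a real UMLS lookup. The names TOY_SYNONYMS, extract_spans, and augment, as well as the example sentence and the Condition label, are hypothetical and do not come from the paper.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical stand-in for a UMLS synonym lookup. A real system would
# retrieve synonyms from UMLS concepts instead of this toy dictionary.
TOY_SYNONYMS: Dict[str, List[str]] = {
    "heart attack": ["myocardial infarction"],
    "high blood pressure": ["hypertension"],
}


def extract_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, label) for each BIO mention; end is exclusive."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        if start is not None and not tag.startswith("I-"):
            spans.append((start, i, label))
            start = None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def augment(
    tokens: List[str],
    tags: List[str],
    synonyms: Dict[str, List[str]],
    rng: random.Random,
) -> Tuple[List[str], List[str]]:
    """Replace each mention that has a known synonym and rebuild BIO tags."""
    out_tokens: List[str] = []
    out_tags: List[str] = []
    cursor = 0
    for start, end, label in extract_spans(tags):
        # Copy the untouched tokens that precede this mention.
        out_tokens += tokens[cursor:start]
        out_tags += tags[cursor:start]
        mention = " ".join(tokens[start:end]).lower()
        options = synonyms.get(mention)
        new_tokens = rng.choice(options).split() if options else tokens[start:end]
        # The replacement may differ in length, so the tags are regenerated.
        out_tokens += new_tokens
        out_tags += [f"B-{label}"] + [f"I-{label}"] * (len(new_tokens) - 1)
        cursor = end
    out_tokens += tokens[cursor:]
    out_tags += tags[cursor:]
    return out_tokens, out_tags


tokens = ["Patients", "with", "a", "heart", "attack", "were", "excluded"]
tags = ["O", "O", "O", "B-Condition", "I-Condition", "O", "O"]
new_tokens, new_tags = augment(tokens, tags, TOY_SYNONYMS, random.Random(0))
print(list(zip(new_tokens, new_tags)))
```

Regenerating the tags from the replaced span, rather than copying the old ones, keeps the labels aligned when a synonym has a different number of tokens than the original mention. Each augmented sentence would be added to the training set alongside the original.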

Keywords: NLP; UMLS; Unified Medical Language System; data augmentation; evidence based medicine; machine learning; named entity recognition; natural language processing

Year: 2021 PMID: 33367705 PMCID: PMC7973470 DOI: 10.1093/jamia/ocaa309

Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
