Biomedical and clinical English model packages for the Stanza Python NLP library.

Yuhao Zhang, Yuhui Zhang, Peng Qi, Christopher D Manning, Curtis P Langlotz.

Abstract

OBJECTIVE: The study sought to develop and evaluate neural natural language processing (NLP) packages for the syntactic analysis and named entity recognition of biomedical and clinical English text.
MATERIALS AND METHODS: We implement and train biomedical and clinical English NLP pipelines by extending the widely used Stanza library originally designed for general NLP tasks. Our models are trained with a mix of public datasets such as the CRAFT treebank as well as with a private corpus of radiology reports annotated with 5 radiology-domain entities. The resulting pipelines are fully based on neural networks, and are able to perform tokenization, part-of-speech tagging, lemmatization, dependency parsing, and named entity recognition for both biomedical and clinical text. We compare our systems against popular open-source NLP libraries such as CoreNLP and scispaCy, state-of-the-art models such as the BioBERT models, and winning systems from the BioNLP CRAFT shared task.
RESULTS: For syntactic analysis, our systems perform much better than the released scispaCy models and CoreNLP models retrained on the same treebanks, and are on par with the winning system from the CRAFT shared task. For NER, our systems substantially outperform scispaCy, and are better than or on par with the state-of-the-art performance from BioBERT, while being much more computationally efficient.
CONCLUSIONS: We introduce biomedical and clinical NLP packages built for the Stanza library. These packages offer performance that is similar to the state of the art, and are also optimized for ease of use. To facilitate research, we make all our models publicly available. We also provide an online demonstration (http://stanza.run/bio).
© The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association.
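
The pipeline stages listed in the abstract (tokenization, part-of-speech tagging, lemmatization, dependency parsing, and named entity recognition) can be sketched with the Stanza Python API as follows. This is a minimal illustration, not the authors' evaluation code. The package names "craft" and "mimic" and the NER model name "i2b2" are assumptions taken from the Stanza model listing rather than from this abstract, and the helper functions `pipeline_kwargs` and `annotate` are hypothetical names introduced here.

```python
from typing import Any, Dict, List, Optional, Tuple


def pipeline_kwargs(package: str = "craft",
                    ner_model: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments shared by stanza.download and stanza.Pipeline.

    The default package name "craft" is an assumption from the Stanza
    model listing, not something stated in the abstract.
    """
    kwargs: Dict[str, Any] = {"lang": "en", "package": package}
    if ner_model is not None:
        # Override only the NER processor; the syntactic models still
        # come from the chosen package.
        kwargs["processors"] = {"ner": ner_model}
    return kwargs


def annotate(text: str,
             package: str = "craft",
             ner_model: Optional[str] = None
             ) -> Tuple[List[tuple], List[tuple]]:
    """Run a Stanza pipeline over text and collect word- and entity-level output."""
    import stanza  # imported lazily; requires `pip install stanza`

    kwargs = pipeline_kwargs(package, ner_model)
    stanza.download(**kwargs)        # fetches the models on first use
    nlp = stanza.Pipeline(**kwargs)
    doc = nlp(text)

    # One tuple per word: surface form, universal POS tag, lemma,
    # index of the syntactic head, and dependency relation.
    words = [
        (w.text, w.upos, w.lemma, w.head, w.deprel)
        for sentence in doc.sentences
        for w in sentence.words
    ]
    # One tuple per recognized entity span: text and entity type.
    entities = [(e.text, e.type) for e in doc.entities]
    return words, entities
```

A clinical note could then be processed with a call such as `annotate("The patient was started on metformin.", package="mimic", ner_model="i2b2")`, which downloads the models on first use and therefore needs network access.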

Keywords:  dependency parsing; machine learning; named entity recognition; natural language processing; syntactic analysis

Year:  2021        PMID: 34157094     DOI: 10.1093/jamia/ocab090

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


  7 in total

1.  A sequence labeling framework for extracting drug-protein relations from biomedical literature.

Authors:  Ling Luo; Po-Ting Lai; Chih-Hsuan Wei; Zhiyong Lu
Journal:  Database (Oxford)       Date:  2022-07-19       Impact factor: 4.462

2.  Chemical identification and indexing in PubMed full-text articles using deep learning and heuristics.

Authors:  Tiago Almeida; Rui Antunes; João F Silva; João R Almeida; Sérgio Matos
Journal:  Database (Oxford)       Date:  2022-07-01       Impact factor: 4.462

3.  Do syntactic trees enhance Bidirectional Encoder Representations from Transformers (BERT) models for chemical-drug relation extraction?

Authors:  Anfu Tang; Louise Deléger; Robert Bossy; Pierre Zweigenbaum; Claire Nédellec
Journal:  Database (Oxford)       Date:  2022-08-25       Impact factor: 4.462

4.  Identifying Medication-Related Intents From a Bidirectional Text Messaging Platform for Hypertension Management Using an Unsupervised Learning Approach: Retrospective Observational Pilot Study.

Authors:  Anahita Davoudi; Natalie S Lee; Krisda Chaiyachati; Danielle Mowery; ThaiBinh Luong; Timothy Delaney; Elizabeth Asch
Journal:  J Med Internet Res       Date:  2022-06-29       Impact factor: 7.076

5.  Distantly supervised biomedical relation extraction using piecewise attentive convolutional neural network and reinforcement learning.

Authors:  Tiantian Zhu; Yang Qin; Yang Xiang; Baotian Hu; Qingcai Chen; Weihua Peng
Journal:  J Am Med Inform Assoc       Date:  2021-11-25       Impact factor: 7.942

6.  NLIMED: Natural Language Interface for Model Entity Discovery in Biosimulation Model Repositories.

Authors:  Yuda Munarko; Dewan M Sarwar; Anand Rampadarath; Koray Atalag; John H Gennari; Maxwell L Neal; David P Nickerson
Journal:  Front Physiol       Date:  2022-02-24       Impact factor: 4.566

7.  A knowledge graph of clinical trials ([Formula: see text]).

Authors:  Ziqi Chen; Bo Peng; Vassilis N Ioannidis; Mufei Li; George Karypis; Xia Ning
Journal:  Sci Rep       Date:  2022-03-18       Impact factor: 4.379
