Evaluating sentence representations for biomedical text: Methods and experimental results.

Noha S Tawfik, Marco R Spruit.

Abstract

Text representations are one of the main inputs to various Natural Language Processing (NLP) methods. Given the fast developmental pace of new sentence embedding methods, we argue that there is a need for a unified methodology to assess these different techniques in the biomedical domain. This work introduces a comprehensive evaluation of novel methods across ten medical classification tasks. The tasks cover a variety of BioNLP problems such as semantic similarity, question answering, citation sentiment analysis and others with binary and multi-class datasets. Our goal is to assess the transferability of different sentence representation schemes to the medical and clinical domain. Our analysis shows that embeddings based on Language Models, which account for the context-dependent nature of words, usually outperform the others. Nonetheless, there is no single embedding model that perfectly represents biomedical and clinical texts with consistent performance across all tasks. This illustrates the need for a more suitable bio-encoder. Our MedSentEval source code, pre-trained embeddings and examples have been made available on GitHub.
Copyright © 2020 Elsevier Inc. All rights reserved.
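
The evaluation idea described in the abstract (compare several sentence representation schemes by how well a simple classifier performs on top of each one, task by task) can be illustrated with a minimal sketch. This is not the MedSentEval code or its API. The two toy encoders, the nearest-centroid classifier, the task name and the example sentences are all assumptions made up for illustration; the abstract does not specify which classifiers or datasets were used.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]              # sparse vector: feature -> weight
Encoder = Callable[[str], Vector]      # maps a sentence to a fixed vector
Dataset = List[Tuple[str, str]]        # (sentence, label) pairs


def bow_encoder(sentence: str) -> Vector:
    """Hypothetical stand-in encoder: normalized bag of words."""
    counts = Counter(sentence.lower().split())
    total = sum(counts.values()) or 1
    return {w: c / total for w, c in counts.items()}


def trigram_encoder(sentence: str) -> Vector:
    """Hypothetical stand-in encoder: normalized character trigrams."""
    text = f"  {sentence.lower()}  "
    counts = Counter(text[i:i + 3] for i in range(len(text) - 2))
    total = sum(counts.values()) or 1
    return {g: c / total for g, c in counts.items()}


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors: List[Vector]) -> Vector:
    acc: Vector = {}
    for vec in vectors:
        for k, x in vec.items():
            acc[k] = acc.get(k, 0.0) + x / len(vectors)
    return acc


def evaluate(encoder: Encoder, train: Dataset, test: Dataset) -> float:
    """Freeze the encoder, fit a nearest-centroid classifier, return accuracy."""
    by_label: Dict[str, List[Vector]] = {}
    for sentence, label in train:
        by_label.setdefault(label, []).append(encoder(sentence))
    centroids = {label: centroid(vs) for label, vs in by_label.items()}
    correct = 0
    for sentence, label in test:
        vec = encoder(sentence)
        predicted = max(centroids, key=lambda lab: cosine(vec, centroids[lab]))
        correct += predicted == label
    return correct / len(test)


def benchmark(encoders: Dict[str, Encoder],
              tasks: Dict[str, Tuple[Dataset, Dataset]]) -> Dict[Tuple[str, str], float]:
    """Score every encoder on every task with the same protocol."""
    return {
        (enc_name, task_name): evaluate(enc, train, test)
        for enc_name, enc in encoders.items()
        for task_name, (train, test) in tasks.items()
    }


# Invented toy data, for illustration only.
TASKS = {
    "toy_sentiment": (
        [("the treatment improved outcomes", "positive"),
         ("the drug reduced symptoms", "positive"),
         ("the treatment failed", "negative"),
         ("the drug caused harm", "negative")],
        [("treatment improved symptoms", "positive"),
         ("drug caused harm", "negative")],
    ),
}
ENCODERS = {"bow": bow_encoder, "char_trigram": trigram_encoder}

results = benchmark(ENCODERS, TASKS)
for (enc_name, task_name), accuracy in sorted(results.items()):
    print(f"{enc_name:>12} on {task_name}: accuracy = {accuracy:.2f}")
```

Holding the classifier and the data splits fixed while swapping only the encoder is what makes the per-task scores comparable; a real study would substitute pre-trained embedding models and actual biomedical datasets for the toy pieces above.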

Keywords:  BioNLP; Language model; Sentence embeddings; Text representation

Year:  2020        PMID: 32147441     DOI: 10.1016/j.jbi.2020.103396

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


  6 in total

1.  Identification and Impact Analysis of Family History of Psychiatric Disorder in Mood Disorder Patients With Pretrained Language Model.

Authors:  Cheng Wan; Xuewen Ge; Junjie Wang; Xin Zhang; Yun Yu; Jie Hu; Yun Liu; Hui Ma
Journal:  Front Psychiatry       Date:  2022-05-20       Impact factor: 5.435

2.  Measurement of Semantic Textual Similarity in Clinical Texts: Comparison of Transformer-Based Models.

Authors:  Xi Yang; Xing He; Hansi Zhang; Yinghan Ma; Jiang Bian; Yonghui Wu
Journal:  JMIR Med Inform       Date:  2020-11-23

3.  Use of BERT (Bidirectional Encoder Representations from Transformers)-Based Deep Learning Method for Extracting Evidences in Chinese Radiology Reports: Development of a Computer-Aided Liver Cancer Diagnosis Framework.

Authors:  Honglei Liu; Zhiqiang Zhang; Yan Xu; Ni Wang; Yanqun Huang; Zhenghan Yang; Rui Jiang; Hui Chen
Journal:  J Med Internet Res       Date:  2021-01-12       Impact factor: 5.428

4.  Protocol for a reproducible experimental survey on biomedical sentence similarity.

Authors:  Alicia Lara-Clares; Juan J Lastra-Díaz; Ana Garcia-Serrano
Journal:  PLoS One       Date:  2021-03-24       Impact factor: 3.240

5.  Benchmarking Effectiveness and Efficiency of Deep Learning Models for Semantic Textual Similarity in the Clinical Domain: Validation Study.

Authors:  Qingyu Chen; Alex Rankine; Yifan Peng; Elaheh Aghaarabi; Zhiyong Lu
Journal:  JMIR Med Inform       Date:  2021-12-30

6.  Clinical language search algorithm from free-text: facilitating appropriate imaging.

Authors:  Gunvant R Chaudhari; Yeshwant R Chillakuru; Timothy L Chen; Valentina Pedoia; Thienkhai H Vu; Christopher P Hess; Youngho Seo; Jae Ho Sohn
Journal:  BMC Med Imaging       Date:  2022-02-04       Impact factor: 1.930
