
BioBERT and Similar Approaches for Relation Extraction.

Balu Bhasuran

Abstract

In biomedicine, facts about relations between entities (diseases, genes, drugs, etc.) lie hidden in a trove of more than 30 million scientific publications. Curated relational information has proven to play an important role in applications such as drug repurposing and precision medicine. Recently, advances in deep learning produced a transformer architecture named BERT (Bidirectional Encoder Representations from Transformers). This pretrained language model, trained on the BooksCorpus (800M words) and English Wikipedia (2,500M words), reported state-of-the-art results on various NLP (natural language processing) tasks, including relation extraction. It is widely accepted that, owing to word distribution shift, general-domain models perform poorly on biomedical information extraction tasks. The architecture was therefore later adapted to the biomedical domain by training the language model on 28 million scientific articles from PubMed and PubMed Central. This chapter presents a protocol for relation extraction using BERT, discussing state-of-the-art BERT variants for the biomedical domain such as BioBERT. The protocol emphasizes the general BERT architecture, pretraining and fine-tuning, leveraging biomedical information, and finally infusing a knowledge graph into the BERT model layers.
© 2022. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.
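The fine-tuning step described in the abstract casts relation extraction as sentence classification over entity-anonymized text: in the BioBERT relation-extraction setup, the target entity mentions in each sentence are replaced with placeholder tags such as @GENE$ and @DISEASE$ before the sentence is passed to the classifier. A minimal sketch of that preprocessing step (the function name and the example sentence are illustrative, not taken from the chapter):

```python
def mark_entities(sentence, spans):
    """Replace each (start, end, tag) character span with its
    placeholder (e.g. @GENE$, @DISEASE$), as in BioBERT-style
    relation-extraction preprocessing. Spans are applied from
    right to left so earlier character offsets remain valid."""
    for start, end, tag in sorted(spans, reverse=True):
        sentence = sentence[:start] + f"@{tag}$" + sentence[end:]
    return sentence

text = "Mutations in BRCA1 increase the risk of breast cancer."
spans = [(13, 18, "GENE"), (40, 53, "DISEASE")]
print(mark_entities(text, spans))
# -> Mutations in @GENE$ increase the risk of @DISEASE$.
```

The anonymized sentence is then tokenized and fed to a BERT sequence classifier whose output label indicates whether (or which) relation holds between the two tagged entities.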

Keywords:  BERT; BioBERT; Deep Learning; Relation Extraction; Text Mining; Transformers

Year:  2022        PMID: 35713867     DOI: 10.1007/978-1-0716-2305-3_12

Source DB:  PubMed          Journal:  Methods Mol Biol        ISSN: 1064-3745


Related articles (11 in total)

1.  Biomedical text mining for research rigor and integrity: tasks, challenges, directions.

Authors:  Halil Kilicoglu
Journal:  Brief Bioinform       Date:  2018-11-27       Impact factor: 11.622

2.  Unsupervised and self-supervised deep learning approaches for biomedical text mining.

Authors:  Mohamed Nadif; François Role
Journal:  Brief Bioinform       Date:  2021-03-22       Impact factor: 11.622

3.  Molecular Mechanism of T-2 Toxin-Induced Cerebral Edema by Aquaporin-4 Blocking and Permeation.

Authors:  Nikhil Maroli; Naveen Kumar Kalagatur; Balu Bhasuran; Achuth Jayakrishnan; Renuka Ramalingam Manoharan; Ponmalai Kolandaivel; Jeyakumar Natarajan; Krishna Kadirvelu
Journal:  J Chem Inf Model       Date:  2019-11-05       Impact factor: 4.956

4.  Stacked ensemble combined with fuzzy matching for biomedical named entity recognition of diseases.

Authors:  Balu Bhasuran; Gurusamy Murugesan; Sabenabanu Abdulkadhar; Jeyakumar Natarajan
Journal:  J Biomed Inform       Date:  2016-09-12       Impact factor: 6.317

5.  Text mining and network analysis to find functional associations of genes in high altitude diseases.

Authors:  Balu Bhasuran; Devika Subramanian; Jeyakumar Natarajan
Journal:  Comput Biol Chem       Date:  2018-05-02       Impact factor: 2.877

6.  A comprehensive and quantitative comparison of text-mining in 15 million full-text articles versus their corresponding abstracts.

Authors:  David Westergaard; Hans-Henrik Stærfeldt; Christian Tønsberg; Lars Juhl Jensen; Søren Brunak
Journal:  PLoS Comput Biol       Date:  2018-02-15       Impact factor: 4.475

7.  [Review] Biomedical text mining and its applications in cancer research.

Authors:  Fei Zhu; Preecha Patumcharoenpol; Cheng Zhang; Yang Yang; Jonathan Chan; Asawin Meechai; Wanwipa Vongsangnak; Bairong Shen
Journal:  J Biomed Inform       Date:  2012-11-15       Impact factor: 6.317

8.  Exploring relation types for literature-based discovery.

Authors:  Judita Preiss; Mark Stevenson; Robert Gaizauskas
Journal:  J Am Med Inform Assoc       Date:  2015-05-13       Impact factor: 4.497

9.  Automatic extraction of gene-disease associations from literature using joint ensemble learning.

Authors:  Balu Bhasuran; Jeyakumar Natarajan
Journal:  PLoS One       Date:  2018-07-26       Impact factor: 3.240

10.  BioBERT: a pre-trained biomedical language representation model for biomedical text mining.

Authors:  Jinhyuk Lee; Wonjin Yoon; Sungdong Kim; Donghyeon Kim; Sunkyu Kim; Chan Ho So; Jaewoo Kang
Journal:  Bioinformatics       Date:  2020-02-15       Impact factor: 6.937

