Do syntactic trees enhance Bidirectional Encoder Representations from Transformers (BERT) models for chemical-drug relation extraction?


Anfu Tang¹,², Louise Deléger¹, Robert Bossy¹, Pierre Zweigenbaum², Claire Nédellec¹.

Abstract

Collecting relations between chemicals and drugs is crucial in biomedical research. Pre-trained transformer models such as Bidirectional Encoder Representations from Transformers (BERT) have been shown to have limitations on biomedical texts; in particular, the lack of annotated data makes relation extraction (RE) from biomedical texts very challenging. In this paper, we hypothesize that enriching a pre-trained transformer model with syntactic information may help improve its performance on chemical-drug RE tasks. To this end, we propose three syntax-enhanced models based on the domain-specific BioBERT model: Chunking-Enhanced-BioBERT and Constituency-Tree-BioBERT, in which constituency information is integrated, and Multi-Task-Syntactic (MTS)-BioBERT, a multi-task learning framework in which syntactic information is injected implicitly by adding syntax-related tasks as training objectives. In addition, we test an existing model, Late-Fusion, which is enhanced with syntactic dependency information, and build ensemble systems combining syntax-enhanced and non-syntax-enhanced models. Experiments are conducted on the BioCreative VII DrugProt corpus, a manually annotated corpus for the development and evaluation of RE systems. Our results reveal that syntax-enhanced models in general degrade the performance of BioBERT on biomedical RE but improve performance when the distance between the subject and object of a candidate semantic relation is long. We also explore the impact of the quality of dependency parses. Our code is available at https://github.com/Maple177/syntax-enhanced-RE/tree/drugprot (MTS-BioBERT only) and https://github.com/Maple177/drugprot-relation-extraction (all other experiments). Database URL: https://github.com/Maple177/drugprot-relation-extraction
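The distance-dependent result reported in the abstract can be illustrated with a minimal sketch of how predictions might be grouped by subject-object distance. Everything here is an assumption made for illustration and is not taken from the paper or its repositories: the function names `token_distance` and `accuracy_by_distance`, the definition of distance as the number of tokens between the two mentions, the threshold of 10 tokens, the toy examples, and the use of plain accuracy in place of the evaluation metric used by the authors.

```python
from collections import defaultdict


def token_distance(subj_span, obj_span):
    """Count the tokens strictly between two entity mentions.

    Spans are hypothetical (start, end) token offsets with an exclusive end.
    Overlapping or adjacent mentions have distance 0.
    """
    first, second = sorted([subj_span, obj_span])
    return max(0, second[0] - first[1])


def accuracy_by_distance(examples, threshold=10):
    """Split examples into 'short' and 'long' buckets and score each bucket.

    The threshold is an arbitrary illustrative value, not one from the paper.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        distance = token_distance(ex["subj"], ex["obj"])
        bucket = "long" if distance >= threshold else "short"
        total[bucket] += 1
        correct[bucket] += int(ex["gold"] == ex["pred"])
    return {bucket: correct[bucket] / total[bucket] for bucket in total}


# Toy candidate relations; labels and offsets are made up for the example.
examples = [
    {"subj": (0, 2), "obj": (3, 4), "gold": "INHIBITOR", "pred": "INHIBITOR"},
    {"subj": (20, 21), "obj": (2, 4), "gold": "SUBSTRATE", "pred": "NONE"},
    {"subj": (1, 2), "obj": (15, 17), "gold": "NONE", "pred": "NONE"},
]

scores = accuracy_by_distance(examples)
print(scores)
```

Running the same grouping on the predictions of a baseline and of a syntax-enhanced model would allow a per-bucket comparison of the kind the abstract describes.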

Entities: Chemical


Year: 2022; PMID: 36006843; PMCID: PMC9408061; DOI: 10.1093/database/baac070

Source DB: PubMed; Journal: Database (Oxford); ISSN: 1758-0463; Impact factor: 4.462
