Fenia Christopoulou, Thy Thy Tran, Sunil Kumar Sahu, Makoto Miwa, Sophia Ananiadou.
Abstract
OBJECTIVE: Identification of drugs, associated medication entities, and the interactions among them is crucial to prevent unwanted effects of drug therapy, known as adverse drug events. This article describes our participation in the n2c2 shared task on extracting relations between medication-related entities in electronic health records.
Keywords: adverse drug events; electronic health records; ensemble methods; neural networks; relation extraction
Year: 2020 PMID: 31390003 PMCID: PMC6913215 DOI: 10.1093/jamia/ocz101
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
Figure 1. Architecture of the weighted bidirectional long short-term memory (Weighted BiLSTM) and Walk-based models. ADE: adverse drug event.
Figure 2. Model architecture of inter-sentence relation extraction utilizing the Transformer network.
Statistics of the n2c2 dataset for intra- and inter-sentence relations in the training and development sets
| | Training | | Development | |
|---|---|---|---|---|
| Total sentences | 44 475 | | 11 520 | |
| Sentences with >1 entity | 7125 | | 1907 | |
| Sentences with 1 entity | 1835 | | 401 | |
| Sentences with no entities | 35 515 | | 9212 | |
| Sentences with 1 pair | 1672 | | 409 | |
| Number of positive relations | 1994 | 26 591 | 570 | 7119 |
| | 36 | 5276 | 13 | 1373 |
| | 107 | 3192 | 33 | 888 |
| | 29 | 489 | 4 | 120 |
| | 158 | 4828 | 53 | 1259 |
| | 123 | 5060 | 74 | 1358 |
| | 107 | 4220 | 35 | 1173 |
| | 1239 | 2830 | 307 | 783 |
| | 195 | 696 | 51 | 165 |
| Negative relations | 78 471 | 39 850 | 20 050 | 9183 |
| Duplicate relations | 19 | | 9 | |
ADE: adverse drug event.
Performance of the proposed RE models on the development set
| Model | Precision | Recall | F1-score |
|---|---|---|---|
| Intra-sentence models | | | |
| Weighted, 1× BiLSTM | 0.9702 | 0.8985 | 0.9330 |
| + Attention | 0.9713 | 0.8975 | 0.9330 |
| + Walks L = 2 | 0.9767 | 0.9029 | 0.9384 |
| + Walks L = 4 | 0.9803 | 0.9040 | 0.9406 |
| + Walks L = 8 | 0.9804 | 0.9066 | 0.9420 |
| + Negative Filtering | 0.9734 | 0.9115 | 0.9414 |
| Weighted, 2× BiLSTM | 0.9719 | 0.9057 | 0.9376 |
| + Attention | 0.9722 | 0.9036 | 0.9366 |
| Inter-sentence models | | | |
| sentence span 1, 2× Transformer | 0.9549 | 0.9046 | 0.9291 |
| sentence span 2, 1× Transformer | 0.9265 | 0.8963 | 0.9112 |
| sentence span 2, 2× Transformer | 0.9358 | 0.9458 | 0.9408 |
| sentence span 2, 3× Transformer | 0.9240 | 0.9465 | 0.9351 |
| sentence span 2, 4× Transformer | 0.9198 | 0.9365 | 0.9281 |
| sentence span 3, 2× Transformer | 0.8851 | 0.9654 | 0.9235 |
BiLSTM: bidirectional long short-term memory; RE: relation extraction.
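The F1-score column in the table above is consistent with the standard harmonic mean of precision and recall. The following is a minimal sketch of that relationship, not the shared task's scorer; the function name `f1_score` is chosen here for illustration. The reported scores were presumably computed from raw counts, so recomputing from the rounded precision and recall may differ in the last digit for some rows.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the "Weighted, 1× BiLSTM" row above.
print(f"{f1_score(0.9702, 0.8985):.4f}")  # prints 0.9330
```

Because the harmonic mean is pulled toward the smaller of the two values, the rows with very high precision but lower recall still land close to their recall figure.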
Performance on the test set for the relation extraction (Track 2) and end-to-end (Track 3) tasks, for the submitted and improved models.
| Model | Precision | Recall | F1-score |
|---|---|---|---|
| Relation extraction (Track 2) | | | |
| * Intra [ensemble] + Inter [ensemble] | 0.9463 | 0.9480 | 0.9472 |
| Intra [ensemble] + Inter [ensemble] | 0.9572 | 0.9456 | 0.9514 |
| End-to-end (Track 3) | | | |
| * NER [recall] + Weighted [ensemble] + Inter [ensemble] | 0.9264 | 0.8318 | 0.8765 |
| NER [recall] + Intra [ensemble] + Inter [ensemble] | 0.9286 | 0.8321 | 0.8777 |
An asterisk marks the models we submitted to the n2c2 shared task.
NER: named entity recognition.
Figure 3. False negative error rate of intra-sentence models and their ensemble on the development set. ADE: adverse drug event.
Figure 4. Performance of the best relation extraction ensemble on each relation class on the development set. Blue bars indicate the performance of the intra-sentence model ensemble (Walk and Weighted models), while orange bars show the performance improvement when merging intra- and inter-sentence models. ADE: adverse drug event.
Figure 5. Performance of intra-sentence models on the development set on sentences with different numbers of entities.
Performance of the Walk-based model with and without considering Drug-Drug pairs
| F1-score | Exclude DDIs, L = 8 | Exclude DDIs, L = 4 | Exclude DDIs, L = 2 | Include DDIs, L = 8 | Include DDIs, L = 4 | Include DDIs, L = 2 |
|---|---|---|---|---|---|---|
| Micro | 0.9366 | 0.9366 | 0.9366 | 0.9420 | 0.9406 | 0.9384 |
| Macro | 0.9345 | 0.9345 | 0.9335 | 0.9389 | 0.9367 | 0.9349 |
DDI: Drug-Drug interaction.
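The Micro and Macro rows above differ in how per-class results are aggregated. The sketch below assumes the common definitions: micro-averaging pools true positives, false positives, and false negatives across relation classes before computing F1, while macro-averaging takes the unweighted mean of per-class F1. The official n2c2 scorer may define the macro score differently, and the class names and counts here are hypothetical, not taken from the dataset.

```python
from typing import Dict, Tuple

# Per-class counts as (true positives, false positives, false negatives).
Counts = Dict[str, Tuple[int, int, int]]


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; equivalent to 2PR / (P + R)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(counts: Counts) -> float:
    """Pool counts over all classes, then compute a single F1."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f1_from_counts(tp, fp, fn)


def macro_f1(counts: Counts) -> float:
    """Unweighted mean of per-class F1 (assumed definition)."""
    if not counts:
        return 0.0
    return sum(f1_from_counts(*c) for c in counts.values()) / len(counts)


# Hypothetical counts: one frequent, well-predicted class and one rare, harder class.
example: Counts = {"frequent-class": (90, 10, 10), "rare-class": (5, 5, 5)}
print(f"micro={micro_f1(example):.4f} macro={macro_f1(example):.4f}")
```

With these illustrative counts the micro score is dominated by the frequent class, while the macro score weights the rare class equally, which is why macro scores tend to be lower when rare classes are harder to predict.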