Tongxuan Zhang, Hongfei Lin, Yuqi Ren, Liang Yang, Bo Xu, Zhihao Yang, Jian Wang, Yijia Zhang.
Abstract
BACKGROUND: Adverse reactions caused by drugs are potentially life-threatening problems. Comprehensive knowledge of adverse drug reactions (ADRs) can reduce their detrimental impacts on patients. Detecting ADRs through clinical trials requires a large number of experiments and a long period of time. With the growing amount of unstructured textual data, such as biomedical literature and electronic records, detecting ADRs in the available unstructured data has important implications for ADR research. Most neural network-based methods typically focus on the simple semantic information of sentence sequences; however, the relationship between the two entities depends on more complex semantic information.
Keywords: Adverse drug reactions; Complex semantic information; Multihop self-attention mechanism; Neural network
Year: 2019 PMID: 31533622 PMCID: PMC6751590 DOI: 10.1186/s12859-019-3053-5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1The examples of annotated sentences in the ADR corpus
Fig. 2The sequential overview of our MSAM model
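The multihop mechanism sketched in Fig. 2 repeatedly attends over the BiLSTM hidden states, refining the sentence representation at each hop. The following is a minimal NumPy sketch under loose assumptions (random weights, a single projection matrix, and the attended summary of each hop serving as the query of the next); it illustrates the k-hop idea rather than the paper's exact parameterization.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def multihop_self_attention(H, k=2, seed=0):
    """Pool BiLSTM states H (T x d) with k attention hops.

    Hypothetical sketch: at each hop, alignment scores over the T time
    steps are computed from the current query vector q; the attended
    summary then becomes the query for the next hop.
    """
    rng = np.random.default_rng(seed)
    T, d = H.shape
    W = rng.standard_normal((d, d)) * 0.1  # illustrative random projection
    q = H.mean(axis=0)                     # initial query: mean hidden state
    alpha = np.full(T, 1.0 / T)
    for _ in range(k):
        scores = H @ (W @ q)               # (T,) alignment scores
        alpha = softmax(scores)            # attention distribution over steps
        q = alpha @ H                      # attended sentence summary (d,)
    return q, alpha
```

The final `q` would feed a sigmoid classifier for the ADR / non-ADR decision; `alpha` is the distribution visualized in attention heat maps such as Fig. 3.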
Summary statistics of the corpora
| Corpus | Documents | ADR | non-ADR | Max sentence length | Experimental data length |
|---|---|---|---|---|---|
| TwiMed-PubMed | 1000 | 264 | 983 | 137 | 75 |
| TwiMed-Twitter | 625 | 311 | 301 | 64 | 50 |
| ADE | 1644 | 6821 | 16695 | 90 | 90 |
Classification results of the compared methods for the TwiMed corpus
| Method | PubMed P | PubMed R | PubMed F1 | Twitter P | Twitter R | Twitter F1 |
|---|---|---|---|---|---|---|
| Feature-rich SVM | 0.799 | 0.681 | 0.728±0.100 | 0.752 | 0.810 | 0.778±0.047 |
| IAN | 0.878 | 0.738 | 0.792±0.016 | 0.836 | 0.813 | 0.824±0.042 |
| CNN-based method | 0.849 | 0.831 | 0.835±0.060 | 0.739 | 0.788 | 0.761±0.061 |
| Multichannel CNN | 0.861 | 0.780 | 0.816±0.072 | 0.738 | 0.841 | 0.780±0.054 |
| Joint AB-LSTM | 0.817 | 0.856 | 0.831±0.040 | 0.701 | 0.828 | 0.754±0.072 |
| BiLSTM+MSAM+position | 0.858 | 0.852 | 0.853±0.057 | 0.748 | 0.856 | 0.799±0.046 |
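The P, R, and F1 columns in these tables follow the standard precision/recall/F-score definitions for the positive (ADR) class. A small helper with illustrative counts (hypothetical, not taken from the paper):

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 from true/false positive and false negative counts."""
    p = tp / (tp + fp)           # fraction of predicted ADRs that are correct
    r = tp / (tp + fn)           # fraction of true ADRs that are recovered
    f1 = 2 * p * r / (p + r)     # harmonic mean of precision and recall
    return p, r, f1

p, r, f1 = prf1(tp=80, fp=20, fn=20)  # → (0.8, 0.8, 0.8)
```

The ± values reported alongside F1 in the tables are standard deviations across cross-validation folds.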
Classification results of the compared methods for the ADE corpus
| Method | P | R | F1 |
|---|---|---|---|
| Knowledge-based system | 0.421 | 0.763 | 0.543 |
| Feature-rich classification | - | - | 0.812 |
| Bi-LSTM-RNN | 0.675 | 0.758 | 0.714 |
| CNNA | 0.815 | 0.838 | 0.826 |
| C-LSTM-CNN | 0.816 | 0.834 | 0.824±0.009 |
| BiLSTM+MSAM | 0.847 | 0.855 | 0.851±0.013 |
Performances obtained by using different attention mechanisms
| Method | PubMed P | PubMed R | PubMed F1 | Twitter P | Twitter R | Twitter F1 | ADE P | ADE R | ADE F1 |
|---|---|---|---|---|---|---|---|---|---|
| Self-Attention | 0.855 | 0.845 | 0.846 | 0.731 | 0.793 | 0.751 | 0.845 | 0.848 | 0.847 |
| Multi-head Self-Attention | 0.829 | 0.850 | 0.841 | 0.767 | 0.800 | 0.784 | 0.820 | 0.851 | 0.836 |
| Multihop Self-Attention | 0.858 | 0.852 | 0.853 | 0.748 | 0.856 | 0.799 | 0.847 | 0.855 | 0.851 |
Performance of various modules on the TwiMed corpus
| Method | PubMed P | PubMed R | PubMed F1 | Twitter P | Twitter R | Twitter F1 |
|---|---|---|---|---|---|---|
| BiLSTM | 0.853 | 0.806 | 0.829 | 0.680 | 0.754 | 0.715 |
| BiLSTM+position | 0.843 | 0.825 | 0.836 | 0.809 | 0.654 | 0.723 |
| BiLSTM+Self-Attention+position | 0.855 | 0.845 | 0.846 | 0.731 | 0.793 | 0.751 |
| BiLSTM+MSAM+position | 0.858 | 0.852 | 0.853 | 0.748 | 0.856 | 0.799 |
Performance of various modules on the ADE corpus
| Method | P | R | F1 |
|---|---|---|---|
| BiLSTM | 0.812 | 0.822 | 0.817 |
| BiLSTM+Self-Attention | 0.847 | 0.848 | 0.847 |
| BiLSTM+MSAM | 0.847 | 0.855 | 0.851 |
Effects of different numbers of multihop self-attention steps on all three corpora (F1)
| Steps | TwiMed-PubMed | TwiMed-Twitter | ADE |
|---|---|---|---|
| step1 | 0.831 | 0.786 | 0.819 |
| step2 | 0.853 | 0.799 | 0.851 |
| step3 | 0.820 | 0.789 | 0.820 |
Effects of up-sampling and down-sampling for imbalanced data
| Corpus | P | R | F1 |
|---|---|---|---|
| TwiMed-PubMed | 0.858 | 0.852 | 0.853±0.057 |
| TwiMed-PubMed (up) | 0.851 | 0.889 | 0.867±0.032 |
| TwiMed-PubMed (down) | 0.862 | 0.842 | 0.849±0.033 |
| ADE | 0.847 | 0.855 | 0.851±0.013 |
| ADE (up) | 0.846 | 0.869 | 0.857±0.007 |
| ADE (down) | 0.823 | 0.862 | 0.842±0.014 |
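Up-sampling duplicates minority-class examples until the classes are balanced, while down-sampling discards majority-class examples; the table above shows that up-sampling recovers recall on the imbalanced corpora. A minimal sketch of both strategies (generic lists standing in for the annotated sentences, which is an assumption for illustration):

```python
import random

def up_sample(minority, majority, seed=0):
    """Duplicate minority examples at random until both classes match in size."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return minority + extra

def down_sample(majority, minority, seed=0):
    """Randomly keep only len(minority) majority examples."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority))
```

Up-sampling keeps all annotated data at the cost of repeated examples, whereas down-sampling discards information; this trade-off is consistent with up-sampling giving the better F1 in the table above.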
Fig. 3Attention heat map from MSAM (k=2) for ADRs classification