Alec B Chapman, Kelly S Peterson, Patrick R Alba, Scott L DuVall, Olga V Patterson.
Abstract
INTRODUCTION: Identifying occurrences of medication side effects and adverse drug events (ADEs) is an important and challenging task because they are frequently only mentioned in clinical narrative and are not formally reported.
Year: 2019 PMID: 30649737 PMCID: PMC6373386 DOI: 10.1007/s40264-018-0763-y
Source DB: PubMed Journal: Drug Saf ISSN: 0114-5916 Impact factor: 5.606
Concept instance distribution in training and testing sets
| Concept | Instance count in training set | Instance count in testing set |
|---|---|---|
| Categories | ||
| Drug | 13,508 | 2395 |
| Indication | 3168 | 636 |
| Frequency | 4147 | 659 |
| Severity | 3374 | 534 |
| Dose | 4893 | 801 |
| Duration | 765 | 133 |
| Route | 2278 | 389 |
| ADE | 1509 | 431 |
| SSLIF | 34,056 | 5328 |
| Relationships | ||
| Severity | 3476 | 559 |
| Manner/route | 2551 | 455 |
| Reason | 4554 | 876 |
| Dosage | 5177 | 866 |
| Duration | 906 | 147 |
| Frequency | 4419 | 730 |
| Adverse | 2082 | 530 |
ADE adverse drug event, SSLIF other signs, symptoms, and diseases
Performance metrics of the CRF NER model on the 213 final evaluation documents
| Label | Standard precision | Standard recall | Standard F1 | Extended precision | Extended recall | Extended F1 |
|---|---|---|---|---|---|---|
| Route | 94.4 | 87.4 | 90.8 | 94.8 | 89.5 | 92.1 |
| Drug | 90.1 | 83.7 | 86.8 | 91.1 | 86.1 | 88.6 |
| Dose | 89.2 | 83.5 | 86.3 | 89.8 | 85.4 | 87.5 |
| Frequency | 85.2 | 79.2 | 82.1 | 88.7 | 83.2 | 85.8 |
| Severity | 86.8 | 77.3 | 81.8 | 87.3 | 75.7 | 81.0 |
| SSLIF | 79.1 | 80.2 | 79.7 | 80.1 | 80.4 | 80.2 |
| Duration | 75.4 | 69.1 | 72.2 | 74.6 | 68.4 | 71.4 |
| ADE | 75.7 | 32.5 | 45.5 | 75.8 | 38.5 | 51.1 |
| Indication | 63.2 | 33.8 | 44.1 | 67.0 | 38.7 | 49.1 |
| Overall micro | 82.8 | 76.7 | 79.6 | 83.8 | 78.1 | 80.9 |
ADE adverse drug event, CRF conditional random field, NER named entity recognition, SSLIF other signs, symptoms, and diseases
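The scores above are strict, micro-averaged metrics: an entity counts as correct only on an exact span-and-label match, and counts are pooled over all documents before computing precision, recall, and F1. A minimal sketch of that scoring scheme (function and entity representation are illustrative, not taken from the paper's code):

```python
# Hypothetical sketch of strict (exact-span) micro-averaged NER scoring.
# Entities are (start, end, label) tuples; one list per document.

def micro_prf(gold_docs, pred_docs):
    """Micro-averaged precision/recall/F1 over exact entity matches."""
    tp = fp = fn = 0
    for gold, pred in zip(gold_docs, pred_docs):
        gold_set, pred_set = set(gold), set(pred)
        tp += len(gold_set & pred_set)   # exact span + label matches
        fp += len(pred_set - gold_set)   # predicted but not in gold
        fn += len(gold_set - pred_set)   # gold but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Example: one document, two gold entities, two predictions (one correct)
gold = [[(0, 9, "Drug"), (15, 22, "ADE")]]
pred = [[(0, 9, "Drug"), (30, 35, "SSLIF")]]
print(micro_prf(gold, pred))  # (0.5, 0.5, 0.5)
```

Micro-averaging weights each entity instance equally, so frequent labels such as Drug and SSLIF dominate the overall row in the table above.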
Contribution of NER model features by strict (exact text match) micro-averaged metrics
| Features | Precision | Recall | F1 |
|---|---|---|---|
| Baseline | 82.1 | 71.4 | 76.4 |
| + Character features | 75.6 | 74.6 | 77.9 |
| + Drug features | 83.1 | 74.0 | 78.3 |
| + EHR embedding clusters (extended) | 82.6 | 75.2 | 78.7 |
| + NoEHR embedding clusters (extended) | 82.1 | 75.6 | 78.7 |
| + EHR and NoEHR embedding clusters (extended) | 82.6 | 76.4 | 79.3 |
| + All features (standard) | 82.8 | 76.7 | 79.6 |
| + All features (extended) | 83.8 | 78.1 | 80.9 |
Baseline features were comprised of commonly used NER features such as tokens, stems, parts of speech and lexical patterns of capitalization, digits, and punctuation
EHR electronic health record, NER named entity recognition
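The baseline feature set described in the note above (tokens, stems, and lexical patterns of capitalization, digits, and punctuation) can be sketched as a per-token feature function of the kind typically fed to a CRF. This is an illustrative sketch, not the authors' implementation; the stemming stand-in and feature names are assumptions, and a real system would also add part-of-speech tags:

```python
import re

# Illustrative per-token feature extractor for a CRF tagger, covering the
# baseline feature groups named in the table note: token identity, a crude
# stem stand-in, and capitalization/digit/punctuation patterns.

def token_features(tokens, i):
    tok = tokens[i]
    feats = {
        "token": tok.lower(),
        "stem": tok.lower()[:4],                  # crude prefix stem stand-in
        "is_capitalized": tok[:1].isupper(),
        "is_all_caps": tok.isupper(),
        "has_digit": any(c.isdigit() for c in tok),
        "is_punct": bool(re.fullmatch(r"\W+", tok)),
    }
    # Neighboring-token context, a common CRF feature addition
    if i > 0:
        feats["prev_token"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True                       # beginning of sentence
    return feats

sent = ["Metformin", "500", "mg", "PO", "BID"]
print(token_features(sent, 0)["is_capitalized"])  # True
print(token_features(sent, 1)["has_digit"])       # True
```

Feature dictionaries in this shape are what CRF toolkits such as CRFsuite consume, one dictionary per token per sentence.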
Comparison of training time between our system and the top-performing submission in the NER task
| Model type | Team | Time per training iteration (CPU) |
|---|---|---|
| Bi-LSTM-CRF | Worcester Polytechnic Institute | 480–3600 min |
| CRF | University of Utah | 23 min |
Bi bidirectional, CRF conditional random field, LSTM long short-term memory, NER named entity recognition
Performance metrics of the relation extraction model on the final 213 evaluation documents
| Relation category | Precision | Recall | F1 |
|---|---|---|---|
| Dosage | 95.7 | 96.2 | 96.0 |
| Frequency | 97.1 | 92.3 | 94.7 |
| Route | 96.1 | 92.1 | 94.1 |
| Severity | 91.1 | 96.2 | 93.6 |
| Duration | 93.7 | 91.2 | 92.4 |
| Reason | 78.0 | 73.9 | 75.8 |
| Adverse | 78.7 | 68.3 | 73.1 |
| Overall micro | 90.3 | 85.9 | 88.1 |
Contribution of features for the relation extraction model using a hold-out set of 176 documents
| Features | Precision | Recall | F1 |
|---|---|---|---|
| Entities between candidates | 28.4 | 35.4 | 31.5 |
| Candidate entities | 42.7 | 72.8 | 53.9 |
| Surface | 74.6 | 66.2 | 70.2 |
| Candidate entities + other entities between | 81.6 | 90.4 | 85.8 |
| All features | 91.7 | 91.2 | 91.4 |
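The feature groups in the ablation above (the two candidate entities, other entities between them, and surface features) can be sketched as a feature function over a candidate entity pair. Names and structure are hypothetical illustrations of these feature groups, not the paper's code:

```python
# Hypothetical relation-candidate featurizer covering the feature groups
# ablated in the table above: candidate entity types, other entities
# lying between the pair, and a surface distance feature.

def relation_features(e1, e2, entities):
    """e1/e2: (start, end, label) tuples; entities: all document entities."""
    lo, hi = min(e1[1], e2[1]), max(e1[0], e2[0])
    between = [e for e in entities
               if e not in (e1, e2) and lo <= e[0] and e[1] <= hi]
    return {
        "e1_label": e1[2],                         # candidate entity types
        "e2_label": e2[2],
        "labels_between": tuple(sorted(e[2] for e in between)),
        "n_between": len(between),                 # other entities between
        "char_distance": hi - lo,                  # surface feature
    }

ents = [(0, 9, "Drug"), (10, 16, "Dose"), (17, 25, "Frequency")]
feats = relation_features(ents[0], ents[2], ents)
print(feats["n_between"], feats["e1_label"])  # 1 Drug
```

The ablation pattern in the table matches this decomposition: entity-type features alone are weak, but combining them with between-entity and surface features recovers most of the full model's F1.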
Micro-averaged performance metrics of the final integrated model on the final 213 evaluation documents
| Relation category | Precision | Recall | F1 |
|---|---|---|---|
| Overall micro | 72.1 | 53.4 | 61.2 |
Error analysis from NER predictions related to ADE and indication labels
| Error category | Example | Explanation |
|---|---|---|
| Mislabeled Indication when Drug is not mentioned | “Treating currently as if she had | Without a mention of a Drug, Indication was predicted as SSLIF |
| Mislabeled SSLIF when unrelated Drug is mentioned | “History of | SSLIF was predicted as Indication due to Drug used in other treatment |
| Mislabeled SSLIF when Drug is not mentioned | “DISCHARGE DIAGNOSIS: | Unexplained error when SSLIF was labeled as Indication when there was no mention of a Drug or treatment |
| Misclassification in short sentences | “No | Sentence contains too few words and urinary symptoms was incorrectly predicted as SSLIF |
| New note formatting | “ | Allergy section format is different from training data, and ADE label was not assigned |
| Inconsistent prediction in a list | “Discussed potential side effects which include headaches, nausea, | Unexplained error when vomiting was predicted as SSLIF while the others were correctly predicted as ADE |
| Contraindication mislabeled as ADE | “Do not want to put her back on | Contraindication diagnosis was predicted as ADE when Drug is mentioned |
ADE adverse drug event, NER named entity recognition, SSLIF other signs, symptoms, and diseases
Error analysis on relation extraction errors from a hold-out set of 176 documents
| Error category | Example | Explanation |
|---|---|---|
| Implicit relation | “He did not have a fever with either cycles of | Drug was not explicitly stated to cause ADE |
| Entities more than two sentences away from each other | “50yo male with a | Drug occurred in a different note section than Indication |
| Identical entity between first and second entity | “Her hematologist looking to initiate | Another mention of identical Drug occurred closer to Route |
| Relation belongs to similar entity | “Patient received | A different Drug has Route |
| Historical treatment | “Patient presents for seventh cycle of hyper-CVAD for | Drug is not currently used as treatment for Indication |
| Annotation error | “Gabapentin 300 mg | Frequency was not annotated with Drug |
ADE adverse drug event
Final evaluation scores for each task
| Task | F1 |
|---|---|
| Task 1—NER | 79.6 |
| Task 2—RE | 88.1 |
| Task 3—Integrated system | 61.2 |
NER named entity recognition, RE relation extraction
F1 scores reported by the MADE 1.0 organizers of the original test submissions
| Task | Team name | Submission F1 |
|---|---|---|
| Task 1—NER | Worcester Polytechnic Institute | 82.9 |
| | IBM Research | 82.9 |
| | University of Florida | 82.3 |
| | | |
| Task 2—RE | | |
| | IBM Research | 84.0 |
| | University of Arizona | 83.2 |
| Task 3—Integrated system | IBM Research | 61.7 |
| | University of Arizona | 59.9 |
| | | |
The top three scores plus our score are shown for each task. Our scores are shown in italics.
NER named entity recognition, RE relation extraction
| Narrative clinical notes in electronic health records are frequently the only documentation of an adverse drug event (ADE) that occurred. |
| Natural language processing (NLP) can be employed to identify mentions of drugs and symptoms, facilitating detection of ADE mentions in clinical text. |
| While this remains an active area of research, progress is being made in improving NLP methods for ADE mention detection using advanced algorithms. |