Balaji Polepalli Ramesh, Steven M Belknap, Zuofeng Li, Nadya Frid, Dennis P West, Hong Yu.
Abstract
BACKGROUND: The Food and Drug Administration's (FDA) Adverse Event Reporting System (FAERS) is a repository of spontaneously reported adverse drug events (ADEs) for FDA-approved prescription drugs. FAERS reports include both structured reports and unstructured narratives. The narratives often include information essential for evaluating the severity, causality, and description of ADEs that is not present in the structured data. The timely identification of unknown toxicities of prescription drugs is an important, unsolved problem.
Keywords: adverse drug events; natural language processing; pharmacovigilance
Year: 2014 PMID: 25600332 PMCID: PMC4288072 DOI: 10.2196/medinform.3022
Source DB: PubMed Journal: JMIR Med Inform
Figure 1. A sample AERS report with structured data and narrative text.
Figure 2. The sample constituency parse tree. S: simple declarative clause, NP: noun phrase, VP: verb phrase, DT: determiner, JJ: adjective, NN: noun, VBD: verb, past tense, SBAR: subordinate clause, IN: preposition or subordinating conjunction, VBG: verb, gerund or present participle.
Named entity definitions, number of annotated instances, and inter-annotator agreement measured by Cohen’s kappa under both the strict and unstrict criteria.
| Named entity | Definition | Instances (AnnPhy) | Instances (AnnLing) | Instances (Comb) | Instances (Tie) | Kappa (strict) | Kappa (unstrict) |
| Medication | Name of the drug they administered to patient including drug class name or medications referred to with | 1231 | 1278 | 1152 | 1286 | .92 | .95 |
| Dosage | Amount of a single medication used in each administration | 143 | 315 | 137 | 205 | .59 | .82 |
| Route | Method for administering the medication | 115 | 244 | 107 | 132 | .59 | .64 |
| Frequency | How often each dose of the medication should be taken | 25 | 56 | 21 | 42 | .58 | .74 |
| Duration | How long the medication is to be administered | 34 | 153 | 24 | 51 | .34 | .87 |
| Indication | Medical conditions for which the medication is given | 175 | 148 | 126 | 175 | .76 | .93 |
| Adverse event (AE) | Harm directly caused by the drug at normal doses and during normal use, including the pronouns referring to it | 1689 | 2083 | 1646 | 1842 | .83 | .93 |
| Other signs, symptoms and diseases (OSSD) | Other symptoms associated with the disease | 234 | 140 | 90 | 147 | .50 | .71 |
| Treatment | Treatment the patient received for the disease | 77 | 216 | 62 | 153 | .39 | .77 |
| Total | | 3723 | 4633 | 3365 | 4033 | | |
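As a reminder of how an agreement figure like the kappa values above is obtained, here is a minimal sketch of Cohen's kappa for two annotators. It assumes each annotator assigns one label per token, which is an illustrative simplification: this section does not spell out how the strict and unstrict matching criteria were applied, and the function name `cohens_kappa` and the toy labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same items")
    n = len(labels_a)
    if n == 0:
        raise ValueError("at least one labelled item is required")

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's marginal label rates.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy example (labels are illustrative, not taken from the corpus).
ann_phy = ["AE", "AE", "Medication", "O"]
ann_ling = ["AE", "O", "Medication", "O"]
kappa = cohens_kappa(ann_phy, ann_ling)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why sparse entity types with few instances can show low strict kappa even when the annotators mostly agree.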
The precision, recall, and F1 score of Taggers on each of the four annotated data sets (t test, P<.01).
| Machine learning | AnnPhy F1 | AnnPhy Precision | AnnPhy Recall | AnnLing F1 | AnnLing Precision | AnnLing Recall | Combined F1 | Combined Precision | Combined Recall | Tie F1 | Tie Precision | Tie Recall |
| BaseDict | 0.45 (0.10) | 0.86 (0.08) | 0.31 (0.09) | 0.41 (0.09) | 0.91 (0.07) | 0.27 (0.08) | 0.46 (0.12) | 0.82 (0.06) | 0.32 (0.11) | 0.42 (0.10) | 0.86 (0.13) | 0.28 (0.08) |
| MetaMapTagger | 0.41 (0.17) | 0.41 (0.16) | 0.42 (0.18) | 0.41 (0.10) | 0.47 (0.20) | 0.37 (0.15) | 0.42 (0.18) | 0.41 (0.17) | 0.43 (0.19) | 0.40 (0.16) | 0.46 (0.19) | 0.36 (0.14) |
| NBTagger | 0.22 (0.08) | 0.39 (0.17) | 0.15 (0.05) | 0.23 (0.08) | 0.45 (0.14) | 0.16 (0.06) | 0.24 (0.08) | 0.40 (0.17) | 0.17 (0.05) | 0.20 (0.06) | 0.47 (0.19) | 0.13 (0.04) |
| SVMTagger | 0.55 (0.05) | 0.77 (0.10) | 0.44 (0.04) | 0.55 (0.05) | 0.78 (0.07) | 0.43 (0.05) | 0.58 (0.04) | 0.78 (0.10) | 0.46 (0.04) | 0.59 (0.04) | 0.80 (0.05) | 0.46 (0.05) |
| SimpleTagger | 0.67 (0.09) | 0.77 (0.09) | 0.60 (0.09) | 0.72 (0.08) | 0.81 (0.06) | 0.66 (0.10) | 0.71 (0.08) | 0.81 (0.09) | 0.63 (0.08) | 0.63 (0.09) | 0.69 (0.08) | 0.55 (0.10) |
| NBTagger+ | 0.45 (0.09) | 0.38 (0.10) | 0.56 (0.06) | 0.44 (0.06) | 0.39 (0.07) | 0.50 (0.06) | 0.46 (0.09) | 0.37 (0.11) | 0.60 (0.04) | 0.43 (0.07) | 0.38 (0.08) | 0.51 (0.07) |
| SVMTagger+ | 0.66 (0.07) | 0.78 (0.10) | 0.58 (0.06) | 0.67 (0.07) | 0.78 (0.07) | 0.59 (0.07) | 0.70 (0.06) | 0.80 (0.11) | 0.63 (0.05) | 0.66 (0.07) | 0.78 (0.06) | 0.57 (0.08) |
| CombinedTagger | 0.69 (0.09) | 0.77 (0.10) | 0.62 (0.09) | 0.74 (0.08)* | 0.81 (0.07) | 0.68 (0.09) | 0.73 (0.08) | 0.81 (0.10) | 0.66 (0.07) | 0.65 (0.08) | 0.71 (0.08) | 0.60 (0.09) |
| | | | | | | | | | | | | |
| SimpleTagger | 0.67 (0.09) | 0.77 (0.09) | 0.60 (0.09) | 0.72 (0.08) | 0.81 (0.06) | 0.66 (0.10) | 0.71 (0.08) | 0.81 (0.09) | 0.63 (0.08) | 0.63 (0.09) | 0.69 (0.08) | 0.55 (0.10) |
| AffixTagger | 0.67 (0.09) | 0.78 (0.09) | 0.60 (0.09) | 0.73 (0.09) | 0.81 (0.06) | 0.66 (0.10) | 0.70 (0.08) | 0.81 (0.09) | 0.63 (0.08) | 0.61 (0.09) | 0.70 (0.08) | 0.52 (0.10) |
| ConnectiveTagger | 0.67 (0.09) | 0.77 (0.09) | 0.60 (0.09) | 0.73 (0.08) | 0.81 (0.06) | 0.66 (0.10) | 0.71 (0.08) | 0.81 (0.09) | 0.63 (0.08) | 0.63 (0.09) | 0.70 (0.07) | 0.57 (0.10) |
| MorphologicalTagger | 0.68 (0.09) | 0.77 (0.08) | 0.60 (0.10) | 0.73 (0.08) | 0.81 (0.06) | 0.66 (0.09) | 0.71 (0.08) | 0.80 (0.09) | 0.63 (0.08) | 0.64 (0.08) | 0.71 (0.07) | 0.59 (0.09) |
| NegHedgeTagger | 0.66 (0.09) | 0.77 (0.09) | 0.59 (0.10) | 0.72 (0.08) | 0.81 (0.06) | 0.65 (0.10) | 0.71 (0.08) | 0.81 (0.09) | 0.63 (0.08) | 0.61 (0.09) | 0.69 (0.08) | 0.54 (0.10) |
| SemanticTagger | 0.68 (0.09) | 0.77 (0.10) | 0.61 (0.09) | 0.70 (0.09) | 0.78 (0.07) | 0.64 (0.10) | 0.72 (0.09) | 0.80 (0.11) | 0.65 (0.08) | 0.63 (0.09) | 0.69 (0.10) | 0.58 (0.09) |
| SyntacticTagger | 0.68 (0.09) | 0.78 (0.09) | 0.61 (0.10) | 0.72 (0.08) | 0.80 (0.06) | 0.65 (0.09) | 0.71 (0.08) | 0.80 (0.09) | 0.64 (0.08) | 0.63 (0.08) | 0.70 (0.08) | 0.58 (0.09) |
| CombinedTagger | 0.69 (0.09) | 0.77 (0.10) | 0.62 (0.09) | 0.74 (0.08) | 0.81 (0.07) | 0.68 (0.09) | 0.73 (0.08) | 0.81 (0.10) | 0.66 (0.07) | 0.65 (0.08) | 0.71 (0.08) | 0.60 (0.09) |
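For readers less familiar with the three metrics reported for each tagger, the sketch below shows one common way to compute precision, recall, and F1 for entity tagging. It assumes entities are compared as exact `(start, end, type)` spans, which is an assumption for illustration; the matching criterion actually used for these results is not described in this section, and the name `precision_recall_f1` and the example spans are hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Score predicted entity spans against gold spans using exact matching.

    Both arguments are sets of (start, end, entity_type) tuples.
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative spans only; they do not come from the annotated corpus.
gold_spans = {(0, 2, "AE"), (5, 6, "Medication"), (8, 9, "Dosage")}
predicted_spans = {(0, 2, "AE"), (5, 6, "Medication"), (10, 11, "AE")}
precision, recall, f1 = precision_recall_f1(predicted_spans, gold_spans)
```

Because F1 is a harmonic mean, a tagger with high precision but low recall (such as the BaseDict row) ends up with an F1 much closer to its recall than to its precision.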
F1 scores for each named entity type with different feature groups on the Comb dataset.
| Feature group | AE | Medication | Dosage | Frequency | Route | Duration | Indication | OSSD | Treatment | Overall |
| Default | 0.70 (0.10) | 0.82 (0.10) | 0.59 (0.35) | 0.57 (0.46) | 0.36 (0.33) | 0.20 (0.42) | 0.57 (0.12) | 0.44 (0.45) | 0.60 (0.52) | 0.71 (0.08) |
| Affix | 0.69 (0.11) | 0.81 (0.12) | 0.58 (0.37) | 0.59 (0.45) | 0.55 (0.37) | 0.40 (0.52) | 0.57 (0.09) | 0.51 (0.44) | 0.60 (0.52) | 0.70 (0.08) |
| Connective | 0.70 (0.10) | 0.81 (0.10) | 0.69 (0.31) | 0.57 (0.46) | 0.44 (0.36) | 0.20 (0.42) | 0.60 (0.15) | 0.44 (0.45) | 0.60 (0.52) | 0.71 (0.08) |
| Morphological | 0.70 (0.10) | 0.82 (0.10) | 0.57 (0.35) | 0.59 (0.45) | 0.32 (0.32) | 0.20 (0.42) | 0.62 (0.12) | 0.47 (0.43) | 0.60 (0.52) | 0.71 (0.08) |
| NegHedge | 0.69 (0.10) | 0.82 (0.10) | 0.56 (0.36) | 0.59 (0.45) | 0.36 (0.33) | 0.20 (0.42) | 0.59 (0.11) | 0.50 (0.43) | 0.60 (0.52) | 0.71 (0.08) |
| Semantic | 0.71 (0.11) | 0.82 (0.11) | 0.56 (0.35) | 0.65 (0.40) | 0.34 (0.33) | 0.30 (0.48) | 0.64 (0.13) | 0.43 (0.39) | 0.60 (0.52) | 0.72 (0.09) |
| Syntactic | 0.70 (0.10) | 0.81 (0.11) | 0.61 (0.35) | 0.59 (0.45) | 0.32 (0.31) | 0.34 (0.47) | 0.58 (0.11) | 0.44 (0.45) | 0.60 (0.52) | 0.71 (0.08) |
| All | 0.72 (0.10) | 0.83 (0.11) | 0.61 (0.37) | 0.59 (0.44) | 0.32 (0.31) | 0.34 (0.47) | 0.65 (0.11) | 0.55 (0.39) | 0.60 (0.52) | 0.73 (0.08) |
Disagreement in medication annotation (medication text is italicized).
| Annotation | Medication annotation |
| Annotated by AnnPhy but not annotated by AnnLing | 1. Given multiple |
| Annotated by AnnLing but not annotated by AnnPhy | 6. Causality assessment |
Figure 3. Error categories, their frequencies, and an illustrative example of each error category, based on 100 randomly sampled instances. The annotated entities are shown in bold, the annotated named entity type is shown within “[]”, and tagger output is {italicized}. AE: adverse events.
The precision, recall, and F1 score of Taggers with feature categories removed one at a time on each of the four annotated data sets.
| Tagger | AnnPhy F1 | AnnPhy Precision | AnnPhy Recall | AnnLing F1 | AnnLing Precision | AnnLing Recall | Combined F1 | Combined Precision | Combined Recall | Tie F1 | Tie Precision | Tie Recall |
| All features | 0.67 (0.09) | 0.77 (0.10) | 0.62 (0.09) | 0.74 (0.08) | 0.81 (0.07) | 0.68 (0.09) | 0.73 (0.08) | 0.81 (0.10) | 0.66 (0.07) | 0.65 (0.08) | 0.71 (0.08) | 0.60 (0.09) |
| No affix features | 0.68 (0.09) | 0.76 (0.10) | 0.62 (0.09) | 0.71 (0.10) | 0.78 (0.07) | 0.65 (0.11) | 0.71 (0.09) | 0.79 (0.11) | 0.64 (0.09) | 0.64 (0.08) | 0.70 (0.08) | 0.60 (0.08) |
| No connective features | 0.69 (0.09) | 0.77 (0.10) | 0.62 (0.09) | 0.74 (0.08) | 0.81 (0.06) | 0.69 (0.09) | 0.73 (0.08) | 0.81 (0.10) | 0.66 (0.07) | 0.65 (0.08) | 0.71 (0.08) | 0.60 (0.09) |
| No morphological | 0.69 (0.09) | 0.78 (0.10) | 0.62 (0.09) | 0.73 (0.08) | 0.81 (0.06) | 0.66 (0.09) | 0.73 (0.08) | 0.82 (0.10) | 0.66 (0.07) | 0.65 (0.08) | 0.72 (0.08) | 0.60 (0.08) |
| No negation and hedge features | 0.68 (0.09) | 0.77 (0.10) | 0.62 (0.09) | 0.74 (0.08) | 0.81 (0.07) | 0.68 (0.09) | 0.72 (0.08) | 0.81 (0.10) | 0.65 (0.07) | 0.64 (0.09) | 0.71 (0.09) | 0.59 (0.09) |
| No semantic features | 0.67 (0.08) | 0.77 (0.08) | 0.60 (0.09) | 0.74 (0.08) | 0.82 (0.05) | 0.68 (0.10) | 0.71 (0.08) | 0.80 (0.09) | 0.64 (0.08) | 0.64 (0.08) | 0.71 (0.07) | 0.59 (0.08) |
| No syntactical features | 0.68 (0.09) | 0.77 (0.10) | 0.61 (0.09) | 0.73 (0.08) | 0.80 (0.07) | 0.68 (0.09) | 0.71 (0.09) | 0.80 (0.11) | 0.64 (0.08) | 0.64 (0.08) | 0.70 (0.09) | 0.58 (0.09) |
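One way to read an ablation table like this is to subtract each row's F1 from the all-features F1. The sketch below does that for the Combined data set, using the F1 values copied from the table above; the helper name `f1_drops` is hypothetical, and the differences are small relative to the values shown in parentheses, so they should be read with caution.

```python
# Combined-data-set F1 values, copied from the ablation table above.
combined_f1 = {
    "All features": 0.73,
    "No affix features": 0.71,
    "No connective features": 0.73,
    "No morphological": 0.73,
    "No negation and hedge features": 0.72,
    "No semantic features": 0.71,
    "No syntactical features": 0.71,
}


def f1_drops(scores, baseline_key="All features"):
    """Return the F1 lost when each feature group is removed."""
    baseline = scores[baseline_key]
    return {
        name: round(baseline - score, 2)
        for name, score in scores.items()
        if name != baseline_key
    }


drops = f1_drops(combined_f1)
# List the ablations from largest to smallest drop in F1.
ranked = sorted(drops.items(), key=lambda item: item[1], reverse=True)
```

On these numbers, removing the affix, semantic, or syntactical features each lowers the Combined F1 by 0.02, while removing the connective or morphological features leaves it unchanged at two decimal places.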