Son Doan, Elly W Yang, Sameer S Tilak, Peter W Li, Daniel S Zisook, Manabu Torii.
Abstract
BACKGROUND: Twitter messages (tweets) cover a wide range of everyday topics, including health-related ones. Analyzing health-related tweets can help us understand the health conditions and concerns people encounter in daily life. In this paper we evaluate an approach to extracting causalities from tweets using natural language processing (NLP) techniques.
Keywords: Causal relationships; Causality; Cause-effect; Natural language processing (NLP); Twitter
Year: 2019 PMID: 30943954 PMCID: PMC6448183 DOI: 10.1186/s12911-019-0785-0
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1. A general framework for causality extraction from Twitter messages
Table 1. Rule set to extract causal relations from tweets
| # | Causal relation types | Dependency rules | Examples |
|---|---|---|---|
| 1 | A (noun) caused B | {} = subj < subj ({+ Causal verb +} = target >dobj {} = cause) | Stress causes insomnia |
| 2 | A (verb-ing) caused B | {} = subj < csubj ({+ Causal verb +} = target >dobj {} = cause) | Over thinking can increase anxiety and cause insomnia. |
| 3 | B was caused by A | {} = ncsubjpass<nsubjpass({+ Causal verb +} = target >/nmod:agent/{} = cause) | My insomnia was caused by stress. |
| 4 | A is a reason of B | Causal noun + < nsubj ({} = target > /nmod:of/ {} = cause) | Stress is a reason of my insomnia |
| 5 | B was caused by A (verb-ing) | {} = nsubj< nsubjpass ({} = target >/advcl:by/ + Causal noun) | Insomnia was caused by overthinking |
| 6 | A results “in/to/from” B | Causal verb + < [nc] subj ({} = target> /nmod:(to\|in\|from)/{} = cause) | Stress results to insomnia. |
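To make the first rule concrete, here is a minimal sketch of how "A (noun) caused B" could be matched against a dependency parse. It is not the authors' implementation: the parse is hand-built rather than produced by a parser, the `CAUSAL_VERBS` lexicon and the `match_rule_1` helper are hypothetical, and the subject relation is assumed to be `nsubj`.

```python
# Hypothetical causal-verb lemmas; the source does not list its lexicon.
CAUSAL_VERBS = {"cause", "trigger", "induce"}


def match_rule_1(tokens, edges):
    """Return (cause, effect) pairs for the pattern 'A (noun) caused B'.

    tokens: dict mapping token index -> (word, lemma)
    edges:  list of (head_index, relation, dependent_index) triples
    """
    pairs = []
    for head, (_word, lemma) in tokens.items():
        if lemma not in CAUSAL_VERBS:
            continue
        # Assumption: the subject is attached with `nsubj`, the object with `dobj`.
        subjects = [d for h, rel, d in edges if h == head and rel == "nsubj"]
        objects = [d for h, rel, d in edges if h == head and rel == "dobj"]
        for s in subjects:
            for o in objects:
                pairs.append((tokens[s][0], tokens[o][0]))
    return pairs


# Hand-built parse of the table's example sentence "Stress causes insomnia".
tokens = {1: ("Stress", "stress"), 2: ("causes", "cause"), 3: ("insomnia", "insomnia")}
edges = [(2, "nsubj", 1), (2, "dobj", 3)]
print(match_rule_1(tokens, edges))  # [('Stress', 'insomnia')]
```

In practice the tokens and edges would come from a dependency parser, and each of the six rules would need its own relation pattern.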
Results of applying the rule set in Table 1 to a corpus of 24 million tweets. The last row gives the number of tweets extracted with the given effect (insomnia, stress, or headache)
| Matched rule # | Insomnia (of 3827) | Stress (of 29,705) | Headache (of 11,252) |
|---|---|---|---|
| 1 | 58 | 381 | 78 |
| 2 | 4 | 12 | 3 |
| 3 | 0 | 4 | 1 |
| 4 | 1 | 21 | 2 |
| 5 | 0 | 32 | 0 |
| 6 | 9 | 51 | 10 |
| Total | 72 | 501 | 94 |
| # extracted causalities | 41 | 98 | 42 |
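As a quick arithmetic check, the Total row is the column-wise sum of the six per-rule counts. The short sketch below recomputes it from the values in the table and also reports the share of matches that come from rule 1; the variable names are illustrative only.

```python
# Per-rule match counts copied from the table (rules 1 to 6, in order).
rule_counts = {
    "insomnia": [58, 4, 0, 1, 0, 9],
    "stress": [381, 12, 4, 21, 32, 51],
    "headache": [78, 3, 1, 2, 0, 10],
}

# Column totals, which should reproduce the Total row.
totals = {effect: sum(counts) for effect, counts in rule_counts.items()}

# Percentage of all rule matches contributed by rule 1 ("A (noun) caused B").
rule_1_share = {
    effect: round(100 * counts[0] / totals[effect], 1)
    for effect, counts in rule_counts.items()
}

print(totals)        # {'insomnia': 72, 'stress': 501, 'headache': 94}
print(rule_1_share)  # {'insomnia': 80.6, 'stress': 76.0, 'headache': 83.0}
```

Rule 1 accounts for roughly three quarters or more of the matches for each effect.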
Precision of extracted causalities compared with human annotations
| Effect | Strict evaluation | Relaxed evaluation (excluding hypothetical assertions and negation) |
|---|---|---|
| Insomnia | 73.81% | 88.10% |
| Stress | 82.65% | 96.94% |
| Headache | 56.10% | 85.37% |
| Micro-average | 74.59% | 92.27% |