Laurissa Tokarchuk, Xinyue Wang, Stefan Poslad.
Abstract
In an age when people readily report real-world events through their social media accounts, many researchers see value in mining user-generated content from social media. Compared with traditional news media, social media services such as Twitter can provide more complete and timely information about real-world events. However, events are often like a puzzle: to understand the event, we must identify all of its sub-events, or pieces. Existing Twitter event monitoring systems for sub-event detection and summarization typically analyse events based on partial data, because conventional data collection methodologies are unable to collect comprehensive event data. As a result, existing systems are often unable to report sub-events in real time and often miss sub-events, or pieces of the broader event puzzle, completely. This paper proposes a Sub-event detection by real-TIme Microblog monitoring (STRIM) framework that leverages the temporal features of an expanded set of newsworthy event content. To identify sub-events more comprehensively and accurately, the framework first uses adaptive microblog crawling. Our adaptive microblog crawler is capable of increasing the coverage of events while minimizing the amount of non-relevant content. We then propose a stream division methodology that can be carried out in real time, so that the temporal features of the expanded event streams can be analysed by a burst detection algorithm. In the final steps of the framework, content features are extracted from each divided stream and recombined to provide a final summarization of the sub-events. The proposed framework is evaluated against traditional event detection using event recall and event precision metrics. Results show that improving the quality and coverage of event content contributes to better event detection by identifying additional valid sub-events.
The novel combination of our proposed adaptive crawler and our stream division/recombination technique provides significant gains in event recall (44.44%) and event precision (9.57%). The addition of these sub-events, or pieces, allows us to get closer to solving the event puzzle.
Year: 2017 PMID: 29107976 PMCID: PMC5673163 DOI: 10.1371/journal.pone.0187401
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Distribution of event coverage in mainstream newswire and Twitter (based on [20]).
Fig 2. STRIM framework for TEM.
Overview of the Glastonbury Festival and Sochi Olympic datasets.
| | Glastonbury Festival (Baseline) | Glastonbury Festival (Adaptive) | Sochi Olympic (Baseline) | Sochi Olympic (Adaptive) |
|---|---|---|---|---|
| Init. keywords | Glastonbury | Glastonbury | Sochi, #olympic2014, #sochi2014 | Sochi, #olympic2014, #sochi2014 |
| Period | 2013-06-29, 11:00 to 2013-06-30, 00:00 | 2013-06-29, 11:00 to 2013-06-30, 00:00 | 2014-02-22, 05:15 to 2014-02-22, 19:15 | 2014-02-22, 05:15 to 2014-02-22, 19:15 |
| No. of tweets | 171254 | 232811 | 213986 | 281692 |
| No. of keywords | 1 | 118 | 3 | 247 |
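The tweet counts in the overview above can be turned into a relative volume gain for the adaptive crawl. The short sketch below does only that arithmetic; the helper name `relative_gain` is an illustrative choice, and raw volume says nothing about how many of the extra tweets are relevant.

```python
# Tweet counts copied from the dataset overview table.
datasets = {
    "Glastonbury Festival": {"baseline": 171254, "adaptive": 232811},
    "Sochi Olympic": {"baseline": 213986, "adaptive": 281692},
}


def relative_gain(baseline: int, adaptive: int) -> float:
    """Percentage increase of the adaptive crawl over the baseline crawl."""
    return 100.0 * (adaptive - baseline) / baseline


# Round to one decimal place for reporting.
gains = {
    name: round(relative_gain(d["baseline"], d["adaptive"]), 1)
    for name, d in datasets.items()
}
print(gains)
```

On these figures the adaptive crawl collects roughly 35.9% more tweets for Glastonbury Festival and 31.6% more for Sochi Olympic.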
Sub-event ground truth for Glastonbury Festival and Sochi Olympic.
| Sub-event index | Glastonbury Festival | Sochi Olympic |
|---|---|---|
| 1 | Ben Howard | Discussion about the championship of Kim Yuna |
| 2 | Laura Mvula | Ice hockey Canada vs. USA |
| 3 | Tibetan Monk Throat Singing | Vic Wild for men’s snowboard |
| 4 | Elvis Costello | Photo of the day: three Olympic champions |
| 5 | Noah and the Whale | Mao Asada memory and feature report by Asahi Shimbun |
| 6 | Primal Scream | Speed skating champion from the Netherlands |
| 7 | Maverick Sabre | Plushenko back surgery |
| 8 | Glastonbury founder supports badger cull | Anton Shipulin led the Russian team to win the biathlon relay |
| 9 | Two Door Cinema Club | Ice hockey USA vs. Finland |
| 10 | Example | |
| 11 | The Rolling Stones | |
Parameter settings of Twitinfo burst detection algorithm.
| Parameter | Glastonbury Festival | Sochi Olympic |
|---|---|---|
| Detection latency | 6 | 3 |
| Sample interval | 5 | 10 |
| Smooth factor | 0.6 | 0.75 |
| Inequality threshold | 2.5 | 2.75 |
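The parameter names above suggest a detector built on an exponentially smoothed running mean with a deviation threshold. The following is a minimal sketch of that style of burst detection, not the authors' implementation. It assumes the smooth factor acts as the smoothing weight `alpha` and the inequality threshold as the multiplier `tau`; the function name `detect_bursts`, the seeding of the deviation, and the toy counts are all illustrative assumptions. Detection latency and sample interval are not modelled.

```python
from typing import List


def detect_bursts(counts: List[float], alpha: float = 0.6, tau: float = 2.5) -> List[int]:
    """Return indices of bins whose count rises sharply above the running mean.

    alpha: smoothing weight for the running mean and mean deviation (assumed
           to correspond to the "smooth factor").
    tau:   number of mean deviations a bin must exceed (assumed to correspond
           to the "inequality threshold").
    """
    if len(counts) < 2:
        return []
    mean = float(counts[0])
    # Seed the deviation with the first observed change, floored at 1.
    meandev = max(abs(counts[1] - counts[0]), 1.0)
    bursts = []
    for i in range(1, len(counts)):
        c = counts[i]
        rising = c > counts[i - 1]
        # Flag the bin if it is rising and far from the running mean.
        if rising and abs(c - mean) / max(meandev, 1e-9) > tau:
            bursts.append(i)
        # Exponentially weighted updates; deviation uses the old mean.
        meandev = alpha * abs(c - mean) + (1 - alpha) * meandev
        mean = alpha * c + (1 - alpha) * mean
    return bursts


# Hypothetical tweets-per-bin series with one obvious spike at index 5.
toy_counts = [20, 21, 20, 21, 20, 80, 75, 21, 20]
print(detect_bursts(toy_counts, alpha=0.6, tau=2.5))
```

With these toy counts only the bin at index 5 is flagged. A larger `alpha` makes the running mean follow recent bins more closely, and a larger `tau` demands a bigger jump before a bin counts as a burst.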
Fig 3. Labelled peak windows for Glastonbury Festival.
Fig 4. Labelled peak windows for Sochi Olympic.
Comparison of event detection between the baseline approach and the proposed framework.
| Data stream | Glastonbury Festival P_event | Glastonbury Festival R_event | Glastonbury Festival F1 | Sochi Olympic P_event | Sochi Olympic R_event | Sochi Olympic F1 |
|---|---|---|---|---|---|---|
| BL | 54.55 | 45.45 | 49.59 | 71.43 | 55.56 | 62.50 |
| AD | 45.45 | 36.36 | 40.40 | 75 | 66.67 | 70.59 |
| EX | 64.29 | 72.73 | 68.25 | 83.33 | 55.56 | 66.37 |
| ALL | 58.62 | 90.91 | 71.28 | 80 | 100 | 88.89 |
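The F1 columns above can be checked against the event precision and event recall columns, assuming F1 is the usual harmonic mean of the two. The sketch below recomputes it for every row; the function name `f1_score` and the dictionary layout are illustrative choices.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs in percent, copied from the comparison table.
glastonbury = {"BL": (54.55, 45.45), "AD": (45.45, 36.36),
               "EX": (64.29, 72.73), "ALL": (58.62, 90.91)}
sochi = {"BL": (71.43, 55.56), "AD": (75, 66.67),
         "EX": (83.33, 55.56), "ALL": (80, 100)}

for name, rows in (("Glastonbury Festival", glastonbury), ("Sochi Olympic", sochi)):
    for stream, (p, r) in rows.items():
        print(name, stream, round(f1_score(p, r), 2))
```

Recomputing this way reproduces the tabulated F1 values to two decimals, except for the Sochi Olympic EX row, where the formula yields 66.67 rather than the listed 66.37.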
List of sub-events detected by the proposed event monitoring framework (Glastonbury Festival).
| Sub-event | Time span | Summary tweet | Descriptive terms |
|---|---|---|---|
| Ben Howard | 16:20 to 16:50 | @benhowardmusic is some guy #amazing #lovehit | [#amazing] [#jealous] [#wow] [#lovehim] [howard] |
| Laura Mvula | 15:55 to 16:20 | Laura Mvula looks stunning! #glastonbury | [heatwave] [#sebheupdate] [laura] [mvula] [6pm] |
| | 18:55 to 19:35 | #Glastoshout please stop laura mvula | [#festival] [#jealous] [laura] [manch] [#glastoshout] |
| Tibetan Monk Throat Singing | 16:20 to 17:20 | Tibetan monk throat singing……I think you'd have to have been there #glastonbury | [heatwave] [alongside] [#silverhayes] [haircut] [monk] |
| Elvis Costello | 17:20 to 18:05 | Olivers Army are on their way #elviscostello #glastonbury | [deborah] [8ish] [hoop] [tenda] [vanessa] |
| Noah and the Whale | 16:20 to 17:20 | Noah and The Whale!<3 #glastonbury #wishiwasthere | [noah] [whale] [heatwave] [heading] [#jealous] |
| | 18:55 to 19:55 | Noah and the whale #glastonbury #lovethem | [noah] [whale] [belongings] [door] [johnny] |
| Primal Scream | 19:35 to 19:55 | The crowd during @screamofficial #stonesglasto #primalscream #therollingstones | [whale] [#primalscream] [noah] [#goodtimes] [#noahandthewhale] |
| Maverick Sabre | 20:25 to 20:55 | Maverick Sabre #wow | [maverick] [sabre] [#wow] [wonderwall] [#amazing] |
| Glastonbury founder supports badger cull | 19:55 to 21:15 | #glastonbury badger badger badger badger badger badger | [badger] [sabre] [maverick] [wonderwall] [1965] |
| | 21:15 to 21:40 | Wonder will Eavis get "BADGERED" tomorrow at #Glastonbury?—"BADGER BADGER BADGER!" | [badger] [switch] [rudiment] [1965] [petition] |
| Two Door Cinema Club | 21:10 to 21:35 | Two Door Cinema rock and they look like they could do your accounts…#bbcglasto | [cinema] [door] [margaret] [#leftfield] [invite] |
| Example | 21:35 to 22:05 | is there anybody, completely off their nut? #example | [#example] [#proud] [#nffc] [#jealous] [cinema] |
| The Rolling Stones | 22:15 to 23:30 | Oh dear the #Stones at #glastonbury look like a Wonga TV ad | [wonga] [#stones] [#glastonbury2013live] [careworker] [#physiotherapy] |