Dandan Tao, Dongyu Zhang, Ruofan Hu, Elke Rundensteiner, Hao Feng.
Abstract
Foodborne outbreaks are a serious but preventable threat to public health that often lead to illness, loss of life, significant economic loss, and the erosion of consumer confidence. Understanding how consumers respond when interacting with foods, and extracting information from social media posts, may provide new means of reducing risks and curtailing outbreaks. In recent years, Twitter has been employed as a new tool for identifying unreported foodborne illnesses. However, there is a large gap between identifying sporadic illnesses and detecting a potential outbreak early. In this work, a dual-task BERTweet model was developed to identify unreported foodborne illnesses and extract foodborne-illness-related entities from Twitter. Unlike previous methods, our model leverages the mutually beneficial relationship between the two tasks. The F1-score was 0.87 for relevance prediction and 0.61 for entity extraction. Key elements such as time, location, and food, detected in sentences indicating foodborne illnesses, were used to analyze potential foodborne outbreaks in massive historical tweets. A case study on tweets indicating foodborne illnesses showed that the discovered trend is consistent with the true outbreaks that occurred during the same period.
Year: 2021 PMID: 34737325 PMCID: PMC8568976 DOI: 10.1038/s41598-021-00766-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. User interface used for data collection on Amazon Mechanical Turk.
Figure 2. Pseudocode of the inter-worker agreement algorithm.
Figure 3. Diagram of the dual-task BERTweet model.
Performance evaluation of the dual-task BERTweet model.
| Task | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|
| Sentence classification | 0.8495 ± 0.0331 | 0.8867 ± 0.0343 | 0.8667 ± 0.0033 | 0.8153 ± 0.0102 |
| Entity extraction | 0.4927 ± 0.0343 | 0.8143 ± 0.0077 | 0.6134 ± 0.0271 | 0.9241 ± 0.0090 |
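As a quick sanity check on the table above, the F1-score is the harmonic mean of precision and recall. The sketch below recomputes it from the reported mean precision and recall. The function name `f1_score` is illustrative and not taken from the paper. Assuming the ± values summarize repeated runs, the F1 computed from mean precision and mean recall need not match the reported mean F1 exactly, so a small difference is expected.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (mean precision, mean recall, reported mean F1) copied from the table above.
reported = {
    "Sentence classification": (0.8495, 0.8867, 0.8667),
    "Entity extraction": (0.4927, 0.8143, 0.6134),
}

for task, (p, r, f1_reported) in reported.items():
    f1_from_means = f1_score(p, r)
    print(
        f"{task}: F1 from mean P/R = {f1_from_means:.4f}, "
        f"reported mean F1 = {f1_reported:.4f}"
    )
```

The entity-extraction row shows why F1 is reported alongside accuracy: accuracy is high (0.92), but precision below 0.5 pulls the F1-score down to about 0.61.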
Examples of tweets and predictions using the dual-task BERTweet model.
| No. | Tweet | Sentence label | Sentence classification | Entity label | Entity extraction |
|---|---|---|---|---|---|
| 1 | “It happens too fast to be food poisoning. It’s like I forced myself to eat too much, even though I didn’t eat that much and I was starving before.” | Not sick | Sick (incorrect) | [KEY]: “food poisoning” | [KEY]: “food poisoning”, [SYM]: “starving” |
| 2 | “Never had one. Never will. I remember a number of years ago reading an article by doctor, who said a high percentage of supposedly stomach flu cases were actually food poisoning.” | Not sick | Not sick (correct) | [KEY]: “food poisoning”, [KEY]: “stomach flu” | [KEY]: “food poisoning”, [KEY]: “stomach flu” |
| 3 | “I got food poisoning from a grilled cheese last night and I’ve never felt so betrayed in my life.” | Sick | Sick (correct) | [KEY]: “food poisoning”, [FOOD]: “grilled cheese” | [KEY]: “food poisoning”, [FOOD]: “grilled cheese” |
| 4 | “Text U know what’s said? I’m so OCD about washing my hands and not getting sick, everyone around me doesn’t care and guess who gets food poisoning.” | Sick | Not sick (incorrect) | [KEY]: “food poisoning” | [KEY]: “food poisoning”, [SYM]: “sick” |

“(incorrect)” marks incorrect predictions and “(correct)” marks correct predictions, judged against the sentence label.
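The abstract notes that entities detected in sentences indicating foodborne illness were used for outbreak analysis. A minimal sketch of that filtering step follows, using the four example predictions from the table above as input. The `Prediction` class and `count_entities` function are hypothetical names introduced for illustration; they are not the authors' implementation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Prediction:
    """Model output for one tweet (hypothetical container for illustration)."""
    tweet_no: int
    predicted_sick: bool
    entities: list  # list of (tag, text) pairs, e.g. ("FOOD", "grilled cheese")


# Sentence classifications and extracted entities copied from the table above.
predictions = [
    Prediction(1, True, [("KEY", "food poisoning"), ("SYM", "starving")]),
    Prediction(2, False, [("KEY", "food poisoning"), ("KEY", "stomach flu")]),
    Prediction(3, True, [("KEY", "food poisoning"), ("FOOD", "grilled cheese")]),
    Prediction(4, False, [("KEY", "food poisoning"), ("SYM", "sick")]),
]


def count_entities(preds, tag):
    """Count entity texts with the given tag, using only tweets predicted as sick."""
    counts = Counter()
    for pred in preds:
        if not pred.predicted_sick:
            continue  # entities from tweets predicted "Not sick" are ignored
        counts.update(text for entity_tag, text in pred.entities if entity_tag == tag)
    return counts


food_counts = count_entities(predictions, "FOOD")
print(food_counts)
```

Filtering on the sentence prediction first is what separates this from plain keyword search: tweet 2 mentions “food poisoning” but contributes nothing to the counts because it was classified as not sick.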
Figure 4. Number of tweets related to foodborne illness incidents associated with lettuce: (a) extracted by the dual-task BERTweet model and (b) extracted by keyword searching.
Figure 5. Number of tweets related to foodborne illness incidents in the United States.
Figure 6. Number of tweets related to foodborne illness incidents in the United States: (a) including “lettuce” in the tweets and (b) including “lettuce” or “salad” or “sandwich” in the tweets.