Siqing Shan, Qi Yan, Yigang Wei.
Abstract
Detecting the period of a disease is of great importance to building information management capacity in disease control and prevention. This paper aims to optimize the disease surveillance process by further identifying the infectious or recovered period of flu cases through social media. Specifically, at the word level, this paper explores the potential of using public sentiment to detect flu periods. At the text level, we constructed a deep learning method to classify the flu period and improved the classification result with sentiment polarity. Three important findings are revealed. First, bloggers in different periods express significantly different sentiments: blogger sentiments in the recovered period are more positive than in the infectious period when measured by the interclass distance. Second, the optimized disease detection process improves the classification accuracy of flu periods from 0.876 to 0.926. Third, our experimental results confirm that sentiment classification plays a crucial role in the accuracy improvement. Precise identification of disease periods enhances the channels for the disease surveillance process. Therefore, a disease outbreak can be predicted credibly when a larger population is monitored. The research method proposed in our work also provides a decision-making reference for proactive and effective epidemic control and prevention in real time.
Keywords: disease detection; flu; sentiment analysis; social media; text classification
Year: 2020 PMID: 32961734 PMCID: PMC7559250 DOI: 10.3390/ijerph17186853
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. Number of People in the Incidences and Deaths from Flu in China.
Figure 2. Solitary Channel and Multi-Channels Process.
Figure 4. The Logical Structure Model over Time.
Figure 5. Dual Analytical Activity Model in Channel 3.
Figure 6. Cleaning Data.
Figure 7. Word2vector Principles of Sina Weibo Text.
Figure 8. Structure of the Long Short Term Memory Networks Constructed by Layers.
Figure 9. Process of Flu Period and Sentiment Classification.
Data Description.
| Category | Field Name / Year | Amount |
|---|---|---|
| Tweet’s information | URL, released time, title, text | |
| Blogger’s information | blogger’s ID, nickname of the blogger | |
| Resource | Sina Weibo | |
| Keywords | flu (Gan Mao), influenza (Liu Gan), cough (Ke Sou), fever (Fa Shao), sneeze (Pen Ti), nasal congestion (Bi Sai) | |
| Amount of word2vector training corpora | In 2016 | 50,000 |
| | In 2017 | 50,000 |
| Amount of labeling sets | In 2016 | 10,000 |
| | In 2017 | 10,000 |
| Total valid amount | | 15,301 |
| LSTM training set | | 10,711 |
| LSTM test set | | 4,590 |
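The set sizes reported in the table are mutually consistent: the LSTM training and test sets add up to the total valid amount. A minimal check in Python, using only the counts from the table; the variable names are illustrative, and the split proportion is inferred from the counts, not stated in this record:

```python
# Counts copied from the Data Description table.
total_valid = 15_301
train_size = 10_711
test_size = 4_590

# The two LSTM sets should partition the valid labeled posts.
assert train_size + test_size == total_valid

# Shares of the valid posts used for training and testing.
train_share = train_size / total_valid
test_share = test_size / total_valid
print(f"train share: {train_share:.3f}, test share: {test_share:.3f}")
```

The shares come out close to a 70/30 split.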
Flu-related Words.
| Label | Words |
|---|---|
| infectious | uncomfortable, not good, ailment, no strength, ache, too awkward, feel bad, serious, sleepy, exhaustion, a tough time, high fever, low fever, pyrexia, diarrhea, emesis, vomit and have watery stools, sneeze, phlegm, dry cough, nasal congestion, difficulty breathing, sore throat, running nose, a bad cold, excessive internal heat, relapse, swell, sore, itch, clinic, children’s hospital, emergency call, see a doctor, transfusion, blood, outpatient service, take medicine, injection, drink more water, transfusion, headache, dizzy, backache, weakness in the limbs, stomach ache, leg pain, giddy, terrible, exacerbation, brain swelling, nausea, regurgitation, tonsil, anti-inflammatory drug, capsule, electuary, painful, fester, tinnitus, toothache, sternutation, cough, lacking in strength, intravenous drip, bacteria, influenza, infection, epidemic, the upper respiratory tract, feel chilly, swollen eyes, X-ray, dazed, lethargy, teeter, have a temperature, flu, rhinorrhea, snot, cold cure, inflammation, virus. |
| recovered | bring down a fever, much better, improve, healthy, recovery, almost gone, feel good, heal, feel all right, get better, fever subsided, antipyretic, abatement of fever, return to normal, stop taking medication, in good health, fitness, get well, improve markedly, pull through, self-cure. |
| negative | sadness, cry, go crazy, poor, disappointed, tired, heart-broken, agony, worried, unlucky, wronged, grieved, breakdown, disgusting, angry, torturous, sorrow, hard, arduous, piercing pain, sob, anxious, self-accusation, vexation, compunction, fear, gripping, exhausted, weep, worn out, fragile, suffering, helplessness, tantalization, nervous, take offence, guilty, regretful, despairing, whiny, harrowing, depressed, annoying, out of sorts, irritated, listless, bad mood. |
| positive | quiet, happy, thankful, wish, clear up, alive and kicking, hope, expect, laugh, love, smile, delighted, cheer up, ha ha ha, make an effort, lovely, grinning, felicity, warmth, cheerful, strong, glad, excited, pray, bless, impetrate, look forward, chuckle, satisfied, joyful, active, all the best, smooth going, hang on, have fun, yeah, contented, hug, gentle, safe and sound, benediction, grand time, brave, relieved. |
Figure 10. Two-dimensional Word Embedding Scatter Plot.
Figure 11. Force-Directed Graph of Words’ Similarity.
Figure 12. Words of Positive Sentiment and Recovered Period.
Figure 13. Words of Negative Sentiment and Infectious Period.
Class Center Gravity Coordinates.
| Label | X | Y |
|---|---|---|
| Infectious | 2.665 | 4.758 |
| Negative | 2.434 | −3.105 |
| Positive | −3.302 | −2.661 |
| Recovered | −3.553 | 1.629 |
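The abstract compares sentiment across flu periods "by the interclass distance". As a rough sketch, the distances between the class centers listed above can be computed as follows. This assumes plain Euclidean distance in the two-dimensional embedding space, which this record does not confirm; the helper name `interclass_distance` is hypothetical.

```python
import math

# Class centers of gravity copied from the table above (2-D embedding space).
centers = {
    "Infectious": (2.665, 4.758),
    "Negative": (2.434, -3.105),
    "Positive": (-3.302, -2.661),
    "Recovered": (-3.553, 1.629),
}


def interclass_distance(a: str, b: str) -> float:
    """Euclidean distance between the centers of classes a and b (assumed metric)."""
    return math.dist(centers[a], centers[b])


# Compare each flu period with each sentiment polarity.
for period in ("Recovered", "Infectious"):
    for sentiment in ("Positive", "Negative"):
        d = interclass_distance(period, sentiment)
        print(f"{period:10s} - {sentiment:8s}: {d:.3f}")
```

Under this assumption, the Recovered center lies closer to Positive (about 4.30) than to Negative (about 7.63), while the Infectious center lies closer to Negative (about 7.87) than to Positive (about 9.52), which is in line with the first finding in the abstract.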
Figure 14. Class Center Gravity Scatter.
Figure 15. Trend Comparison of Flu-related Weibos% and Influenza-like Illness (ILI)%.
LSTM Test Set Result.
| Period | Count | Positive | Negative | Total |
|---|---|---|---|---|
| Recovered | total | 876 | 402 | 1278 |
| | correct | 655 | 176 | 831 |
| Infectious | total | 356 | 2956 | 3312 |
| | correct | 297 | 2893 | 3190 |
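The counts in this table can be turned into accuracy figures directly. The sketch below does that in Python using only the numbers above; the dictionary layout and the `accuracy` helper are illustrative, and the record does not state exactly how "correct" was defined for each cell.

```python
# Counts copied from the LSTM Test Set Result table:
# results[period][sentiment] = (total, correct)
results = {
    "Recovered": {"Positive": (876, 655), "Negative": (402, 176)},
    "Infectious": {"Positive": (356, 297), "Negative": (2956, 2893)},
}


def accuracy(cells):
    """Share of correct predictions over an iterable of (total, correct) pairs."""
    cells = list(cells)
    total = sum(t for t, _ in cells)
    correct = sum(c for _, c in cells)
    return correct / total


# Accuracy within each flu period, then over the whole test set.
per_period = {p: accuracy(by_sent.values()) for p, by_sent in results.items()}
overall = accuracy(
    cell for by_sent in results.values() for cell in by_sent.values()
)

for period, acc in per_period.items():
    print(f"{period}: {acc:.3f}")
print(f"Overall: {overall:.3f}")
```

The overall share is 4021 correct out of 4590 test posts, about 0.876, which coincides with the lower of the two accuracies quoted in the abstract. The per-period shares differ widely (about 0.650 for Recovered and 0.963 for Infectious), so the aggregate figure is dominated by the larger Infectious class.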