Kyung Sang Lee1, Hyewon Lee2, Woojae Myung3, Gil-Young Song4, Kihwang Lee4, Ho Kim2,5, Bernard J Carroll6, Doh Kwan Kim1.
Abstract
OBJECTIVE: Suicide is a significant public health concern worldwide. Social media data have a potential role in identifying individuals at high risk of suicide and in predicting the suicide rate at the population level. In this study, we report an advanced daily suicide prediction model using social media data combined with economic/meteorological variables along with observed suicide data lagged by 1 week.
Keywords: SNS; Sentiment analysis; Social; Warning signs of suicide
Year: 2018 PMID: 29614852 PMCID: PMC5912497 DOI: 10.30773/pi.2017.10.15
Source DB: PubMed Journal: Psychiatry Investig ISSN: 1738-3684 Impact factor: 2.505
Figure 1. Serial prediction procedure. In this study, a total of 730 individual predictions were executed over 2 years.
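The serial procedure can be sketched as a rolling window over dates. This is a minimal illustration, not the authors' implementation. It assumes the 730 predictions cover consecutive days starting on 1 January 2013, and that each training window starts 5 calendar years before the prediction date and ends 7 days before it, which matches the single example the source gives (1 January 2008 to 25 December 2012 for a prediction on 1 January 2013). The names `training_window` and `prediction_dates` are hypothetical.

```python
from datetime import date, timedelta

LAG_DAYS = 7       # predictors are taken 7 days before the prediction date (t-7)
WINDOW_YEARS = 5   # assumed window length, inferred from the source's one example


def training_window(prediction_date, lag_days=LAG_DAYS, window_years=WINDOW_YEARS):
    """Return the (start, end) dates of the training data for one prediction."""
    end = prediction_date - timedelta(days=lag_days)
    try:
        start = prediction_date.replace(year=prediction_date.year - window_years)
    except ValueError:
        # 29 February with no counterpart in the start year: fall back to 28 February.
        start = prediction_date.replace(
            year=prediction_date.year - window_years, day=28
        )
    return start, end


def prediction_dates(first_date, n_predictions):
    """List n_predictions consecutive daily prediction dates."""
    return [first_date + timedelta(days=i) for i in range(n_predictions)]


# Assumption: the 2-year prediction period is 2013-2014, one prediction per day.
dates = prediction_dates(date(2013, 1, 1), 730)
windows = [training_window(d) for d in dates]

print(windows[0])   # window used for the 1 January 2013 prediction
print(dates[-1])    # last prediction date under this assumption
```

Each window would then be used to refit the model before predicting a single day, so the set of selected variables can differ from one prediction to the next.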
An example of the prediction model for national suicide number on 1 January 2013 by using data from 1 January 2008 to 25 December 2012
| Variable | Log (estimates) | Log (standard error) | t | p |
|---|---|---|---|---|
| Suicide variable | | | | |
| Suicide (t-7) | 4.96×10⁻³ | 6.05×10⁻⁴ | 8.20 | 4.61×10⁻¹⁶ |
| Economic and meteorological variables | | | | |
| Stock | -1.15×10⁻⁴ | 2.54×10⁻⁵ | -4.54 | 5.95×10⁻⁶ |
| Consumer price index | 0.014 | 2.05×10⁻³ | 6.73 | 2.25×10⁻¹¹ |
| Sunlight | -3.32×10⁻³ | 1.13×10⁻³ | -2.94 | 3.31×10⁻³ |
| Temperature | -8.33×10⁻⁴ | 4.43×10⁻⁴ | -1.88 | 0.060 |
| Celebrity | 0.13 | 0.016 | 8.07 | 1.28×10⁻¹⁵ |
| Day of week (reference = Monday) | | | | |
| Tuesday | 0.030 | 0.021 | 1.38 | 0.17 |
| Wednesday | -6.69×10⁻³ | 0.022 | -0.31 | 0.76 |
| Thursday | -0.036 | 0.022 | -1.66 | 0.098 |
| Friday | -0.034 | 0.021 | -1.62 | 0.11 |
| Saturday | -0.041 | 0.021 | -2.00 | 0.046 |
| Sunday | -0.027 | 0.017 | -1.61 | 0.11 |
| Social media variables | | | | |
| Positive words | -1.12×10⁻⁶ | 2.59×10⁻⁷ | -4.34 | 1.51×10⁻⁵ |
| Weblog count 2: chokjinhada | -2.51×10⁻⁴ | 1.48×10⁻⁴ | -1.69 | 0.091 |
| Weblog count 3: gyeongjejeok | 1.58×10⁻⁴ | 6.15×10⁻⁵ | 2.57 | 0.010 |
| Weblog count 5: mogongmakda | 1.35×10⁻³ | 5.21×10⁻⁴ | 2.60 | 9.45×10⁻³ |
| Weblog count 11: jeotanso | 1.23×10⁻³ | 3.14×10⁻⁴ | 3.91 | 9.75×10⁻⁵ |
| Weblog count 14: uuljeung | 1.31×10⁻⁴ | 9.32×10⁻⁵ | 1.41 | 0.16 |
| Weblog count 15: deodida | 4.28×10⁻⁴ | 1.98×10⁻⁴ | 2.16 | 0.031 |
| Weblog count 20: meoriapeuda | 4.82×10⁻⁴ | 1.25×10⁻⁴ | 3.85 | 1.21×10⁻⁴ |
| Weblog count 25: uijonhada | 2.45×10⁻⁴ | 1.56×10⁻⁴ | 1.57 | 0.12 |
| Weblog count 29: buran | 1.53×10⁻⁴ | 8.18×10⁻⁵ | 1.87 | 0.062 |
Estimates are the change in the natural logarithm of the observed suicide number at the prediction date (t) per one-unit increase in each predictor variable at t-7.
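Because the estimates are on the natural-log scale, exponentiating one gives the multiplicative change in the expected suicide number. The sketch below converts two estimates from the table above into approximate percentage changes. It assumes all other predictors are held fixed, and the helper name `percent_change` is hypothetical.

```python
import math


def percent_change(log_estimate, delta=1.0):
    """Percent change in the expected count for a `delta`-unit increase in a
    predictor whose coefficient is reported on the natural-log scale."""
    return (math.exp(log_estimate * delta) - 1.0) * 100.0


# Celebrity indicator (0 -> 1), estimate 0.13 in the example model.
celebrity_effect = percent_change(0.13)

# Ten additional suicides observed at t-7, estimate 4.96e-3 per suicide.
lagged_effect = percent_change(4.96e-3, delta=10)

print(f"Celebrity indicator: {celebrity_effect:.1f}% change")
print(f"+10 suicides at t-7: {lagged_effect:.1f}% change")
```

For small estimates the percent change is close to 100 times the estimate, which is why the weblog-count coefficients, of order 10⁻⁴, correspond to very small changes per additional post.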
Prediction variables evaluated in 730 prediction models
| Variable | Description | Rates of inclusion | Average estimates | Ranges of estimates (lower, upper) |
|---|---|---|---|---|
| Suicide variable | | | | |
| Suicide (t-7) | Observed number of suicides | 730/730 (100%) | 4.11×10⁻³ | 2.55×10⁻³, 6.05×10⁻³ |
| Economic and meteorological variables | | | | |
| Stock | Korean stock index, KOSPI | 698/730 (95.6%) | 4.34×10⁻⁵ | -1.43×10⁻⁴, 1.37×10⁻⁴ |
| Consumer price index | Monthly consumer price index | 639/730 (87.5%) | -3.79×10⁻³ | -1.48×10⁻², 1.72×10⁻² |
| Unemployment | Monthly unemployment rate | 332/730 (45.5%) | -0.021 | -3.33×10⁻², -1.35×10⁻² |
| Sunlight | Sunlight duration | 616/730 (84.4%) | -2.17×10⁻³ | -3.34×10⁻³, -1.41×10⁻³ |
| Temperature | Daily average temperature | 401/730 (54.9%) | -1.23×10⁻³ | -1.81×10⁻³, -5.47×10⁻⁴ |
| Ozone | Daily average ozone level | 18/730 (2.5%) | -0.89 | -1.09, -0.71 |
| PM-10 | Daily average particulate matter | 0/730 (0%) | - | - |
| Celebrity | If 7 days before prediction date is within one month after a celebrity suicidal event, 1; else, 0 | 730/730 (100%) | 0.11 | 8.86×10⁻², 0.14 |
| Day of week | Six dummy variables for day of week | 730/730 (100%) | - | - |
| Social media variables | | | | |
| Meaning classifications | | | | |
| Positive words | Sum of count of weblog posts that contain positive words at least once | 175/730 (24%) | -8.92×10⁻⁷ | -1.40×10⁻⁶, 3.29×10⁻⁷ |
| Neutral words | Sum of count of weblog posts that contain neutral words at least once | 502/730 (68.8%) | 6.24×10⁻⁷ | 2.14×10⁻⁷, 1.54×10⁻⁶ |
| Negative words | Sum of count of weblog posts that contain negative words at least once | 313/730 (42.9%) | -2.68×10⁻⁷ | -4.60×10⁻⁷, -1.12×10⁻⁷ |
| Top 30 weblog counts | | | | |
| Weblog count 1: siljejeok | Meaning ‘practical’; ‘matter-of-fact’; ‘businesslike’ | 503/730 (68.9%) | 4.65×10⁻⁴ | 2.94×10⁻⁴, 7.32×10⁻⁴ |
| Weblog count 2: chokjinhada | Meaning ‘facilitate’; ‘promote’; ‘accelerate’ | 476/730 (65.2%) | -2.67×10⁻⁴ | -4.44×10⁻⁴, -1.78×10⁻⁴ |
| Weblog count 3: gyeongjejeok | Meaning ‘economical’; ‘financial’ | 393/730 (53.8%) | 1.15×10⁻⁴ | 6.26×10⁻⁵, 2.30×10⁻⁴ |
| Weblog count 4: jijeogitda | Meaning ‘There have been comments about’ | 383/730 (52.5%) | 1.05×10⁻³ | 6.62×10⁻⁴, 1.21×10⁻³ |
| Weblog count 5: mogongmakda | Meaning ‘blocks the skin’s pores’ | 349/730 (47.8%) | 1.11×10⁻³ | 6.51×10⁻⁴, 1.67×10⁻³ |
| Weblog count 6: piryoitda | Meaning ‘it requires that’; ‘need to’ | 342/730 (46.8%) | -1.03×10⁻⁴ | -1.37×10⁻⁴, -7.84×10⁻⁵ |
| Weblog count 7: hyogwageoduda | Meaning ‘obtain the desired results’; ‘get effect’ | 342/730 (46.8%) | 6.52×10⁻⁴ | 5.45×10⁻⁴, 8.39×10⁻⁴ |
| Weblog count 8: mijigeunhada | Meaning ‘lukewarm’ | 315/730 (43.2%) | 2.11×10⁻⁴ | 1.53×10⁻⁴, 3.30×10⁻⁴ |
| Weblog count 9: gakgwangbatba | Meaning ‘taking spotlight’ | 176/730 (24.1%) | -3.64×10⁻⁴ | -4.65×10⁻⁴, -2.50×10⁻⁴ |
| Weblog count 10: sonsil | Meaning ‘(economic) loss’ | 164/730 (22.5%) | 2.02×10⁻⁴ | 1.10×10⁻⁴, 2.75×10⁻⁴ |
| Weblog count 11: jeotanso | Meaning ‘low carbon’ | 152/730 (20.8%) | 1.57×10⁻³ | 7.65×10⁻⁴, 2.07×10⁻³ |
| Weblog count 12: ganeungseongkeuda | Meaning ‘be in with a shout’; ‘possible’ | 144/730 (19.7%) | 2.54×10⁻⁴ | 1.79×10⁻⁴, 3.23×10⁻⁴ |
| Weblog count 13: chimche | Meaning ‘recession’; ‘(economic) depression’ | 140/730 (19.2%) | 1.72×10⁻⁴ | 1.15×10⁻⁴, 2.91×10⁻⁴ |
| Weblog count 14: uuljeung | Meaning ‘depressive disorder’ | 123/730 (16.8%) | 1.49×10⁻⁴ | 1.11×10⁻⁴, 1.80×10⁻⁴ |
| Weblog count 15: deodida | Meaning ‘slow’ | 119/730 (16.3%) | 3.92×10⁻⁴ | 3.24×10⁻⁴, 5.01×10⁻⁴ |
| Weblog count 16: chansa | Meaning ‘praise’; ‘compliment’ | 116/730 (15.9%) | 2.16×10⁻⁴ | 1.74×10⁻⁴, 2.72×10⁻⁴ |
| Weblog count 17: gwallyeonitda | Meaning ‘have relevant to’ | 110/730 (15.1%) | 3.20×10⁻⁴ | 2.66×10⁻⁴, 3.82×10⁻⁴ |
| Weblog count 18: yecheukhada | Meaning ‘predict’; ‘forecast’ | 105/730 (14.4%) | 2.90×10⁻⁴ | 2.55×10⁻⁴, 3.26×10⁻⁴ |
| Weblog count 19: jureodeulda | Meaning ‘decrease’; ‘diminish’ | 97/730 (13.3%) | -1.04×10⁻⁴ | -1.30×10⁻⁴, -8.51×10⁻⁵ |
| Weblog count 20: meoriapeuda | Meaning ‘have a headache’ | 93/730 (12.7%) | 4.60×10⁻⁴ | 4.27×10⁻⁴, 5.00×10⁻⁴ |
| Weblog count 21: miljeopangwangye | Meaning ‘intimate relation’ | 93/730 (12.7%) | 1.49×10⁻³ | 1.21×10⁻³, 1.71×10⁻³ |
| Weblog count 22: uijonjeok | Meaning ‘dependent’ | 90/730 (12.3%) | 1.13×10⁻³ | 7.84×10⁻⁴, 1.47×10⁻³ |
| Weblog count 23: jeungga | Meaning ‘increase’; ‘growth’ | 88/730 (12.1%) | 7.31×10⁻⁵ | 5.80×10⁻⁵, 8.72×10⁻⁵ |
| Weblog count 24: gyujewanhwa | Meaning ‘relaxation of regulation’; ‘deregulation’ | 85/730 (11.6%) | 7.28×10⁻⁴ | 3.34×10⁻⁴, 9.95×10⁻⁴ |
| Weblog count 25: uijonhada | Meaning ‘depend on’; ‘reliance on’ | 85/730 (11.6%) | 2.47×10⁻⁴ | 2.16×10⁻⁴, 2.86×10⁻⁴ |
| Weblog count 26: ironjeok | Meaning ‘theoretical’ | 79/730 (10.8%) | 4.84×10⁻⁴ | 4.08×10⁻⁴, 5.22×10⁻⁴ |
| Weblog count 27: keodarata | Meaning ‘big’; ‘huge’; ‘large’ | 76/730 (10.4%) | 1.31×10⁻⁴ | 1.22×10⁻⁴, 1.68×10⁻⁴ |
| Weblog count 28: gogeupwha | Meaning ‘gentrification’ | 74/730 (10.1%) | 9.20×10⁻⁴ | 7.69×10⁻⁴, 1.11×10⁻³ |
| Weblog count 29: buran | Meaning ‘anxiety’ | 67/730 (9.2%) | 2.05×10⁻⁴ | 1.48×10⁻⁴, 2.85×10⁻⁴ |
| Weblog count 30: gyeonggihoebok | Meaning ‘a business recovery’; ‘return to prosperity’ | 56/730 (7.7%) | 9.60×10⁻⁴ | 8.72×10⁻⁴, 1.05×10⁻³ |
All prediction variables were taken from 7 days before the prediction date (t-7) within each unique 5-year data window. Estimates are the change in the natural logarithm of the observed national suicide number at the prediction date (t) per one-unit increase in each predictor variable at t-7, averaged over the 730 prediction models.
Social media variables: counts of weblog posts that contain the word or words at least once.
Top 30 weblog counts: the 30 weblog count variables that were included in the prediction with the highest frequency across all 730 prediction models.
Figure 2. Trend of annual Korean national suicide numbers per 100,000 persons.
Figure 3. Prediction of daily Korean national suicide number in the 2-year prediction period. Observed suicides (blue solid line), predicted suicides (red solid line), and prediction intervals (red dashed lines). The prediction range was computed for 85% probability. Prediction range accuracy was 82.9% for the 2-year prediction period.
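One plausible reading of "prediction range accuracy" is the share of days on which the observed count falls inside the prediction interval. The source does not spell out the definition, so the sketch below is an assumption. The function name `interval_coverage` is hypothetical, and the numbers are toy values for illustration, not study data.

```python
def interval_coverage(observed, lower, upper):
    """Fraction of days whose observed count lies within [lower, upper]."""
    if not (len(observed) == len(lower) == len(upper)):
        raise ValueError("all three sequences must have the same length")
    if not observed:
        raise ValueError("at least one day is required")
    hits = sum(lo <= obs <= hi for obs, lo, hi in zip(observed, lower, upper))
    return hits / len(observed)


# Toy values only: four days of observed counts and interval bounds.
observed = [40, 35, 50, 42]
lower = [30, 36, 45, 40]
upper = [45, 44, 55, 41]

coverage = interval_coverage(observed, lower, upper)
print(f"Coverage: {coverage:.1%}")
```

Under this reading, the reported 82.9% would be compared against the nominal 85% level of the interval to judge how well calibrated the prediction range is.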