Ulrich S Tran, Rita Andel, Thomas Niederkrotenthaler, Benedikt Till, Vladeta Ajdacic-Gross, Martin Voracek.
Abstract
Recent research suggests that search volumes of the most popular search engine worldwide, Google, provided via Google Trends, could be associated with national suicide rates in the USA, UK, and some Asian countries. However, search volumes have mostly been studied in an ad hoc fashion, without controls for spurious associations. This study evaluated the validity and utility of Google Trends search volumes for behavioral forecasting of suicide rates in the USA, Germany, Austria, and Switzerland. Suicide-related search terms were systematically collected, and the corresponding Google Trends search volumes were evaluated for availability. Time spans covered 2004 to 2010 (USA, Switzerland) and 2004 to 2012 (Germany, Austria). Temporal associations of search volumes and suicide rates were investigated with time-series analyses that rigorously controlled for spurious associations. The number and reliability of analyzable search volume data increased with country size. Search volumes showed various temporal associations with suicide rates. However, associations differed both across and within countries and mostly followed no discernible patterns. The total number of significant associations roughly matched the number of expected Type I errors. These results suggest that the validity of Google Trends search volumes for behavioral forecasting of national suicide rates is low. The utility and validity of search volumes for the forecasting of suicide rates depend on two key assumptions ("the population that conducts searches consists mostly of individuals with suicidal ideation", "suicide-related search behavior is strongly linked with suicidal behavior"). We discuss strands of evidence that these two assumptions are likely not met. Implications for future research with Google Trends in the context of suicide research are also discussed.
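The comparison with the number of expected Type I errors can be made concrete with a little arithmetic. The sketch below is illustrative only: the number of tests (`n_tests = 200`) is a hypothetical value that is not taken from the study, and treating the tests as independent is a simplifying assumption that cross-correlations at adjacent lags will not strictly satisfy.

```python
from math import comb


def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of significant results if every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least(k: int, n_tests: int, alpha: float = 0.05) -> float:
    """P(X >= k) for X ~ Binomial(n_tests, alpha), assuming independent tests."""
    below = sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k)
    )
    return 1.0 - below


if __name__ == "__main__":
    n_tests = 200  # hypothetical number of cross-correlations tested
    print("expected false positives:", expected_false_positives(n_tests))
    print("P(at least 10 significant by chance):", prob_at_least(10, n_tests))
```

If the observed count of significant coefficients is close to `expected_false_positives(...)`, chance alone is a sufficient explanation, which is the logic behind the abstract's conclusion.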
Year: 2017 PMID: 28813490 PMCID: PMC5558943 DOI: 10.1371/journal.pone.0183149
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Results of previous studies on the associations between Google Trends search volumes and suicide statistics.
| Study | Country | Search terms | Method | Relevant results | Effect size |
|---|---|---|---|---|---|
| Ma-Kellams et al. (2016) [ | USA | | Correlation and linear regression analysis | Positive associations of national suicide rates with all terms except | Large ( |
| Gunn & Lester (2013) [ | USA | | Correlation analysis | Positive associations of national suicide rates with | Medium to large ( |
| McCarthy (2010) [ | USA | | Correlation analysis | Positive associations of national suicide rates and rates of intentional self-injury of 15–25 year-olds with | Large ( |
| Kristoufek et al. (2016) [ | UK | | Time-series analysis | Positive associations of national suicide rates with | Search volumes explained 22% of suicide rates |
| Bruckner et al. (2014) [ | UK | | Time-series analysis | Positive associations of suicide rates in England and Wales with | Search volumes explained 7% of suicide rates |
| Chang et al. (2011) [ | UK & Japan | | Graphical display | Increase of search requests after media reporting | N/A |
| Hagihara et al. (2012) [ | Japan | | Time-series analysis with transfer model and prewhitening | Positive associations of national suicide rates of 20–29 year-olds and 30–39 year-olds with | N/A (unstandardized coefficients) |
| Sueki (2011) [ | Japan | | Cross-correlation analysis | Positive associations of national suicide rates with | Medium to large ( |
| Yang et al. (2011) [ | Taiwan | 37 search terms, including | Cross-correlation and linear regression analysis | Positive associations of suicide rates in Taipei City with 18 search terms at lags -2 to 0 | Medium to large ( |
Final search terms and lengths (in months) and missing values (in weeks) of the respective Google Trends time series.
| Country and search term | Length of time series (number of months) | Missing values |
|---|---|---|
| USA: pro-suicide terms | ||
| | 01.2004–12.2010 (84) | 0% |
| | 01.2004–12.2010 (84) | 0% |
| | 01.2005–12.2010 (72) | 0% |
| | 05.2005–12.2010 (68) | 1.69% |
| | 12.2005–12.2010 (61) | 5.64% |
| | 12.2007–12.2010 (37) | 0% |
| | 02.2007–12.2010 (47) | 2.93% |
| | 02.2004–12.2010 (83) | 1.38% |
| | 05.2007–12.2010 (44) | 5.73% |
| | 12.2006–12.2010 (49) | 0% |
| | 02.2004–12.2010 (83) | 0% |
| | 09.2004–12.2010 (76) | 5.74% |
| USA: suicide prevention terms | ||
| | 02.2004–12.2010 (83) | 1.37% |
| | 11.2005–12.2010 (62) | 5.19% |
| | 02.2005–12.2010 (71) | 0% |
| | 05.2005–12.2010 (68) | 3.04% |
| | 01.2004–12.2010 (84) | 0% |
| | 01.2004–12.2010 (84) | 0% |
| | 02.2005–12.2010 (62) | 0% |
| Germany | ||
| | 01.2004–12.2012 (108) | 0% |
| | 01.2004–12.2012 (108) | 0% |
| | 01.2004–12.2012 (108) | 0% |
| | 11.2009–12.2012 (38) | 0% |
| | 01.2004–12.2012 (108) | 0% |
| Austria | ||
| | 12.2009–12.2012 (37) | 4.94% |
| | 11.2004–12.2012 (98) | 4.92% |
| | 10.2006–12.2012 (75) | 2.14% |
| Switzerland | ||
| | 01.2007–12.2010 (48) | 2.39% |
| | 09.2007–12.2010 (40) | 8.62% |
Reliability of individual and averaged time series for selected Google Trends search terms.
| Country and search term | Individual time series | Averaged time series |
|---|---|---|
| USA | ||
| | .991 [.990–.992] | .999 [.999–.999] |
| | .994 [.994–.995] | .999 [.999–1.000] |
| Germany | ||
| | .91 [.90–.92] | .990 [.989–.991] |
| | .95 [.95–.96] | .995 [.994–.996] |
| | .92 [.91–.93] | .991 [.990–.992] |
| Austria | ||
| | .77 [.75–.80] | .97 [.97–.98] |
| | .74 [.72–.77] | .97 [.96–.97] |
| | .63 [.60–.66] | .95 [.94–.95] |
| Switzerland | ||
| | .78 [.76–.80] | .97 [.97–.98] |
| | .68 [.65–.71] | .95 [.95–.96] |
Numbers are intraclass correlation coefficients (two-way random model, assessing consistency of individual values) with 95% confidence intervals in brackets. All ps < .001.
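A consistency-type intraclass correlation of this kind can be computed from two-way ANOVA mean squares. The following is a minimal sketch, not the authors' procedure: it assumes a matrix with one row per time point and one column per repeated series of the same search term, and the simulated data (84 months, 10 replicate series, noise level 0.5) are hypothetical. It returns both the single-measure and the average-measure form; whether the table's "averaged" column corresponds to the average-measure form is not stated in this section.

```python
import numpy as np


def icc_consistency(data):
    """Two-way consistency ICC: returns (single-measure, average-measure).

    Rows are targets (e.g., time points), columns are repeated measurements.
    """
    x = np.asarray(data, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)
    ss_total = np.sum((x - grand) ** 2)
    ss_error = ss_total - ss_rows - ss_cols  # residual after row and column effects
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    single = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    average = (ms_rows - ms_error) / ms_rows
    return single, average


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    signal = rng.normal(size=84)  # hypothetical underlying monthly series
    replicates = signal[:, None] + 0.5 * rng.normal(size=(84, 10))
    single, average = icc_consistency(replicates)
    print(f"single-measure ICC: {single:.3f}, average-measure ICC: {average:.3f}")
```

Averaging over replicates suppresses the noise component, which is why the average-measure coefficient is higher than the single-measure one whenever the single-measure coefficient lies between 0 and 1.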
Fitted models and outliers in the individual and averaged Google Trends search volume time series per country.
| Country | Search term | Fitted model | Type and position of outliers |
|---|---|---|---|
| USA | | SARIMA(1,0,0)×(0,1,1)₁₂ | |
| | | SARIMA(0,1,2)×(1,0,1)₁₂ | IO: 56 |
| | | AR(1) | |
| | | AR(1) | IO: 41 |
| | | SARIMA(0,1,1)×(0,1,1)₁₂ | |
| | | SARIMA(1,1,0)×(1,0,0)₁₂ | |
| | | SARIMA(0,1,1)×(0,1,1)₁₂ | IO: 9, 21 |
| | | ARIMA(1,1,0) | IO: 52; AO: 5 |
| | | AR(1) | |
| | | ARIMA(0,1,1) | |
| | | ARIMA(1,1,0) | |
| | | ARIMA(1,1,0) | |
| | | SARIMA(4,1,0)×(0,1,1)₁₂ | |
| | | SARIMA(0,1,1)×(1,0,0)₁₂ | |
| | | SARIMA(1,0,0)×(1,0,0)₁₂ | |
| | | AR(1) | |
| | | SARIMA(0,1,1)×(0,1,1)₁₂ | IO: 10 |
| | | SARIMA(0,1,1)×(0,1,1)₁₂ | |
| | | SARIMA(0,1,1)×(0,1,1)₁₂ | |
| Germany | | ARIMA(0,1,1) | IO: 70, 93 |
| | | SARIMA(0,1,1)×(0,1,1)₁₂ | IO: 58, 70; AO: 95 |
| | | SARIMA(0,1,1)×(0,1,1)₁₂ | IO: 58, 70; AO: 57 |
| | | ARIMA(0,1,1) | AO: 1 |
| | | ARIMA(0,1,2) | IO: 11; AO: 6 |
| Austria | | AR(1) | |
| | | ARIMA(0,1,1) | IO: 86; AO: 87 |
| | | ARIMA(0,1,1) | IO: 9 |
| Switzerland | | ARIMA(0,1,1) | |
| | | ARIMA(1,1,0) | |
Fitted models and outliers of averaged time series underlined. AR = autoregressive model, MA = moving average model, ARIMA = autoregressive integrated moving average model, SARIMA = seasonal ARIMA model. Parameters of the AR(p), MA(q), ARIMA(p,d,q), and SARIMA(p,d,q)×(P,D,Q)ₛ models denote: p = order of the autoregressive model (number of past time lags that affect current values autoregressively), d = degree of differencing (number of times the data had past values subtracted to reduce non-stationarity in the time series), q = order of the moving average model (number of current and past white noise error terms that affect current values), P,D,Q = autoregressive, differencing, and moving average terms of the seasonal part of the ARIMA model, s = number of periods in each season. Outliers were detected and classified as IO (innovative outlier) and AO (additive outlier) using the method of Cryer and Chan [60] and integrated into the models using ARIMAX modeling.
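The differencing terms d and D can be illustrated in a few lines. This is a minimal sketch using NumPy only; the function name `difference` and the toy series (a linear trend plus a fixed 12-month pattern) are hypothetical and serve only to show what d = 1 and D = 1 with s = 12 do to a monthly series.

```python
import numpy as np


def difference(series, d=0, D=0, s=12):
    """Apply d regular differences and D seasonal differences of period s."""
    y = np.asarray(series, dtype=float)
    for _ in range(d):
        y = y[1:] - y[:-1]   # y_t - y_(t-1): removes a linear trend
    for _ in range(D):
        y = y[s:] - y[:-s]   # y_t - y_(t-s): removes a stable seasonal pattern
    return y


if __name__ == "__main__":
    months = np.arange(48)
    seasonal = np.tile(3.0 * np.sin(2 * np.pi * np.arange(12) / 12), 4)
    toy = 0.5 * months + seasonal          # hypothetical trend + seasonality
    residual = difference(toy, d=1, D=1, s=12)
    print(len(residual), np.allclose(residual, 0.0))
```

Each regular difference shortens the series by one observation and each seasonal difference by s observations, which matters for the shorter series in the tables above.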
Fig 1. Heat map of cross-correlation coefficients for the US data (pro-suicide terms A).
Numbers in column headers refer to lags in months. Idealized patterns reflect what could be expected if search volumes predicted suicide rates. In the observed patterns, frames highlight statistically significant (p < .05) coefficients.
Fig 2. Heat map of cross-correlation coefficients for the US data (pro-suicide terms B).
Numbers in column headers refer to lags in months. Idealized patterns reflect what could be expected if search volumes predicted suicide rates. In the observed patterns, frames highlight statistically significant (p < .05) coefficients.
Fig 3. Heat map of cross-correlation coefficients for the US data (suicide prevention terms).
Numbers in column headers refer to lags in months. Idealized patterns reflect what could be expected if search volumes predicted suicide rates. In the observed patterns, frames highlight statistically significant (p < .05) coefficients.
Fig 4. Heat map of cross-correlation coefficients for the German data.
Suizid, Selbstmord, Freitod = ‘suicide’ in English; Depressionen = ‘depression’ (plural in German); Selbstmord Forum = ‘suicide chat’. Numbers in column headers refer to lags in months. Idealized patterns reflect what could be expected if search volumes predicted suicide rates. In the observed patterns, frames highlight statistically significant (p < .05) coefficients.
Fig 5. Heat map of cross-correlation coefficients for the Austrian data.
Suizid, Selbstmord = ‘suicide’ in English; Depressionen = ‘depression’ (plural in German). Numbers in column headers refer to lags in months. Idealized patterns reflect what could be expected if search volumes predicted suicide rates. In the observed patterns, frames highlight statistically significant (p < .05) coefficients.
Fig 6. Heat map of cross-correlation coefficients in the Swiss data.
Selbstmord = ‘suicide’ in English; Depressionen = ‘depression’ (plural in German). Numbers in column headers refer to lags in months. Idealized patterns reflect what could be expected if search volumes predicted suicide rates. In the observed patterns, frames highlight statistically significant (p < .05) coefficients.
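The lagged coefficients shown in such heat maps can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: both series are assumed to be already prewhitened residuals, the lag convention (lag k correlates x at time t + k with y at time t, so negative lags mean x leads y) is an assumption, the ±1.96/√n bound is the usual large-sample approximation for white noise, and all data are simulated.

```python
import numpy as np


def cross_correlation(x, y, max_lag=3):
    """Sample cross-correlations; lag k pairs x[t + k] with y[t]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.ndim != 1:
        raise ValueError("x and y must be 1-D arrays of equal length")
    n = len(x)
    xc, yc = x - x.mean(), y - y.mean()
    denom = n * x.std() * y.std()
    result = {}
    for k in range(-max_lag, max_lag + 1):
        if k >= 0:
            num = np.sum(xc[k:] * yc[: n - k])
        else:
            num = np.sum(xc[: n + k] * yc[-k:])
        result[k] = num / denom
    return result


def white_noise_bound(n, z=1.96):
    """Approximate two-sided 5% significance bound for white-noise series."""
    return z / np.sqrt(n)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    searches = rng.normal(size=84)  # hypothetical prewhitened search volumes
    # Hypothetical outcome that follows searches by one month, plus noise.
    rates = np.roll(searches, 1) + rng.normal(scale=0.5, size=84)
    bound = white_noise_bound(len(searches))
    for lag, r in cross_correlation(searches, rates).items():
        flag = "*" if abs(r) > bound else ""
        print(f"lag {lag:+d}: {r:+.2f}{flag}")
```

With seven lags per term and many terms per country, a few coefficients are expected to exceed the bound by chance alone, which is why isolated significant cells without a consistent pattern carry little weight.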
Significant cross-correlation coefficients of individual and averaged time-series data.
| Country | Search term | Lag | Individual time-series data | Averaged time-series data |
|---|---|---|---|---|
| USA | -3 | -.35 | -.34 | |
| -1 | -.23 | -.24 | ||
| +2 | -.24 | -.23 | ||
| Germany | +1 | -.20 | -.21 | |
| -.22 | ||||
| -.25 | ||||
| +2 | .21 | |||
| -2 | -.25 | |||
| 0 | .23 | |||
| .20 | ||||
| Austria | +2 | -.34 | ||
| -3 | -.34 | |||
| -.34 | ||||
| -.33 | ||||
| -2 | .32 | |||
| .38 | ||||
| .37 | ||||
| .22 | ||||
| +3 | -.26 | |||
| -3 | -.26 | |||
| +1 | .31 | |||
| .28 | ||||
| .27 | ||||
| +2 | -.23 | |||
| +3 | -.25 | |||
| -.23 | ||||
| Switzerland | -3 | -.33 | ||
| -2 | -.32 | |||
| -.35 | ||||
| -1 | .54 |
Numbers are cross-correlation coefficients with overall suicide rates (total) or with suicide rates of younger individuals (<40 years of age), older individuals (40+ years of age), of older men or of older women. Only significant (p < .05) cross-correlation coefficients are displayed.
* p < .05,
** p < .01.
a,b Results for (a) suicide and (b) Selbstmord are omitted as cross-correlations failed to reach significance in the respective individual and averaged time-series data.