Abstract
Traffic congestion is an important problem on both an individual and a societal level, and much research has been done to explain and prevent its emergence. There are currently many systems which provide a reasonably good picture of actual road traffic by employing either fixed measurement points on highways or so-called "floating car data", i.e. velocity and location data from roaming, networked, GPS-enabled members of traffic. Some of these systems also offer forecasting of road conditions based on such historical data. To my knowledge there is as yet no system which offers advance notice on road conditions based on a signal which is guaranteed to occur in advance of these conditions, and this is the novelty of this paper. Google Search intensity for the German word stau (i.e. traffic jam) peaks 2 hours ahead of the number of traffic jam reports as reported by the ADAC, a well-known German automobile club and the largest of its kind in Europe. This is true both in the morning (7 am to 9 am) and in the evening (4 pm to 6 pm). The main result of this paper is that, after controlling for time-of-day and day-of-week effects, we can still explain a significant additional portion of the variation in the number of traffic jam reports with Google Trends, and we can thus explain well over 80% of the variation in road conditions using Google search activity. A one percent increase in Google stau searches implies a 0.4 percent increase in traffic jams. Our paper is a proof of concept that aggregate, timely delivered behavioural data can help fine-tune modern societies, and it calls for more research with better, more disaggregated data in order to also achieve practical solutions.
Year: 2016 PMID: 27571518 PMCID: PMC5003347 DOI: 10.1371/journal.pone.0162080
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. ADAC traffic jams.
Natural logs of hourly ADAC traffic jam reports (top left), their autocorrelations (top right), spectral density (bottom left) and histogram of percentage changes (bottom right). Data Source: ADAC (adac.de) and own calculations.
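The four panels described in this caption correspond to standard transformations of an hourly count series. The following is a minimal sketch of how such quantities could be computed, not the paper's own code. The data are synthetic stand-ins (a 24-hour cycle plus noise), the helper name `autocorrelation` is hypothetical, and a raw periodogram is used as a simple substitute for whatever spectral density estimator the paper applied.

```python
import numpy as np


def autocorrelation(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )


# Synthetic stand-in for hourly report counts (NOT the ADAC data):
# a 24-hour cycle plus noise, clipped to stay positive so the log is defined.
rng = np.random.default_rng(0)
hours = np.arange(24 * 14)
counts = 100 + 60 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 5, hours.size)
counts = np.clip(counts, 1, None)

log_counts = np.log(counts)                       # top-left panel
acf = autocorrelation(log_counts, max_lag=48)     # top-right panel
pct_change = 100 * np.diff(log_counts)            # log differences approximate % changes

# Raw periodogram: squared magnitude of the FFT of the demeaned series.
power = np.abs(np.fft.rfft(log_counts - log_counts.mean())) ** 2
freqs = np.fft.rfftfreq(log_counts.size, d=1.0)   # cycles per hour
dominant_period = 1 / freqs[np.argmax(power[1:]) + 1]  # skip the zero frequency
```

With a daily rhythm in the data, the autocorrelation is high at lag 24 and the periodogram peaks at a period of 24 hours, which is the kind of pattern the top-right and bottom-left panels are meant to reveal.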
Fig 2. Google Searches for “stau.”
The reduced series consists of searches that contain the word stau but none of the following terms: nrw, a7, a2, a3, a1, wdr, a8, a5, aktuell, autobahn, a9, info, swr3, bayern, hamburg, a4, staumelder, adac, berlin, a6, verkehr, ffh, hessen, köln, münchen, swr, a81, a61, deutschland. These terms account for 60% of the total stau search volume. Data Source: Google Trends (www.google.com/trends) and own calculations.
Origins of four highway-specific Google “stau” searches.
| stau A1 | stau A2 | stau A3 | stau A7 |
|---|---|---|---|
| Glauchau | Glauchau | Glauchau | Glauchau |
| Bremen | Gütersloh | Oberhausen | Flensburg |
| Oldenburg | Bielefeld | Düsseldorf | Drochtersen |
| Vechta | Hanover | Cologne | Norderstedt |
| Osnabrück | Brunswick | Koblenz | Hamburg |
| Münster | Wolfenbüttel | Aschaffenburg | Hanover |
| Dortmund | Magdeburg | Würzburg | Göttingen |
| | | Erlangen | Kassel |
For the highways A1, A2, A3 and A7, searches for stau A1, stau A2, stau A3 and stau A7 originate from towns along the respective highway.
Fig 3. Hourly Google Searches for “stau.”
The natural logs of the hourly stau search intensity (top left), their autocorrelations (top right), spectral density (bottom left) and histogram of percentage changes (bottom right). Data Source: Google Trends (www.google.com/trends) and own calculations.
Fig 4. Google Searches for “stau” vs ADAC reports.
A cross-correlogram between the hourly number of ADAC traffic jam reports and the hourly Google search intensity for stau, based on the entire observation interval of 51 days, establishes that Google search activity leads road conditions by two hours. Data Source: Google Trends, ADAC and own calculations.
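A cross-correlogram of this kind can be sketched as follows. This is an illustration under stated assumptions rather than the paper's procedure. Both series are synthetic white noise with a two-hour lead built in by construction, and the function name `cross_correlation` is hypothetical. Real hourly series carry strong daily cycles, so their correlogram peak would be broader than in this toy example.

```python
import numpy as np


def cross_correlation(x, y, max_lag):
    """Correlation between x[t] and y[t + k] for k = -max_lag..max_lag.

    A peak at a positive k means that x leads y by k periods.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lags = np.arange(-max_lag, max_lag + 1)
    corr = []
    for k in lags:
        if k >= 0:
            a, b = x[: len(x) - k], y[k:]
        else:
            a, b = x[-k:], y[: len(y) + k]
        corr.append(np.corrcoef(a, b)[0, 1])
    return lags, np.array(corr)


# Synthetic data (NOT Google Trends or ADAC data): 51 days of hourly values.
rng = np.random.default_rng(1)
n = 24 * 51
searches = rng.normal(size=n)
# By construction, reports at hour t follow searches at hour t - 2, plus noise.
reports = np.roll(searches, 2) + 0.3 * rng.normal(size=n)

lags, corr = cross_correlation(searches, reports, max_lag=6)
best_lag = int(lags[np.argmax(corr)])  # lag at which the correlation peaks
```

Here `best_lag` recovers the lead that was built into the synthetic data. On real data the same computation would be applied to the two log series after aligning them on a common hourly index.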
Forecasting the number of ADAC traffic reports.
| | M1 | M2 | M3 | M4 | M5 |
|---|---|---|---|---|---|
| | coef./ | coef./ | coef./ | coef./ | coef./ |
| .659 | .326 | ||||
| (.000) | (.000) | ||||
| .244 | |||||
| (.000) | |||||
| .368 | .415 | ||||
| (.000) | (.000) | ||||
| D = 0 | .000 | .000 | .000 | ||
| (.) | (.) | (.) | |||
| D = 1 | .433 | .337 | .268 | ||
| (.000) | (.000) | (.000) | |||
| D = 2 | .465 | .240 | .337 | ||
| (.000) | (.000) | (.000) | |||
| D = 3 | .527 | .287 | .382 | ||
| (.000) | (.000) | (.000) | |||
| D = 4 | .559 | .301 | .399 | ||
| (.000) | (.000) | (.000) | |||
| D = 5 | .480 | .232 | .154 | ||
| (.000) | (.000) | (.000) | |||
| D = 6 | .076 | –.056 | –.051 | ||
| (.033) | (.110) | (.023) | |||
| H = 7 | .000 | .000 | .000 | ||
| (.) | (.) | (.) | |||
| H = 8 | .437 | .320 | .253 | ||
| (.000) | (.000) | (.000) | |||
| H = 9 | .565 | .345 | .342 | ||
| (.000) | (.000) | (.000) | |||
| H = 10 | .402 | .082 | .304 | ||
| (.000) | (.099) | (.000) | |||
| H = 11 | .294 | –.037 | .344 | ||
| (.000) | (.479) | (.000) | |||
| H = 12 | .293 | .016 | .424 | ||
| (.000) | (.739) | (.000) | |||
| H = 13 | .323 | .075 | .435 | ||
| (.000) | (.114) | (.000) | |||
| H = 14 | .383 | .121 | .423 | ||
| (.000) | (.011) | (.000) | |||
| H = 15 | .469 | .177 | .466 | ||
| (.000) | (.000) | (.000) | |||
| H = 16 | .562 | .230 | .492 | ||
| (.000) | (.000) | (.000) | |||
| H = 17 | .706 | .309 | .531 | ||
| (.000) | (.000) | (.000) | |||
| H = 18 | .778 | .332 | .552 | ||
| (.000) | (.000) | (.000) | |||
| H = 19 | .651 | .192 | .448 | ||
| (.000) | (.001) | (.000) | |||
| const. | 1.734 | 4.631 | 4.509 | 1.974 | 3.985 |
| (.000) | (.000) | (.000) | (.000) | (.000) | |
| Adj. R² | .434 | .631 | .612 | .693 | .851 |
| AIC | 531.027 | 49.119 | –57.311 | –185.875 | –639.008 |
| RMSE | .306 | .247 | .227 | .204 | .141 |
| No. of cases | 1126 | 1126 | 611 | 598 | 609 |
*** p < 0.01,
** p < 0.05,
* p < 0.1
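The regressions in this table combine day-of-week dummies (D), hour-of-day dummies (H) and search-based regressors whose row labels did not survive in this copy. The sketch below is therefore an assumed specification, not a reproduction of any of M1 to M5: the log of jam reports is regressed on the log of search intensity two hours earlier plus both sets of dummies. All data are synthetic, with a true elasticity of 0.4 chosen to echo the figure quoted in the abstract, and the helper `dummies` is hypothetical.

```python
import numpy as np


def dummies(codes, levels):
    """One 0/1 column per level except the first, which is the baseline."""
    codes = np.asarray(codes)
    return np.column_stack([(codes == lv).astype(float) for lv in levels[1:]])


# Synthetic panel (NOT the paper's data): 51 days, hours 7 to 19.
rng = np.random.default_rng(2)
n_days = 51
hours_of_day = np.arange(7, 20)
day = np.repeat(np.arange(n_days) % 7, hours_of_day.size)
hour = np.tile(hours_of_day, n_days)
n = day.size

day_effect = rng.normal(0, 0.2, 7)
hour_effect = rng.normal(0, 0.2, hours_of_day.size)
log_search_lag2 = rng.normal(3.0, 0.5, n)   # assumed regressor: log searches at t - 2
true_elasticity = 0.4
log_jams = (
    2.0
    + true_elasticity * log_search_lag2
    + day_effect[day]
    + hour_effect[hour - 7]
    + rng.normal(0, 0.1, n)
)

# Design matrix: constant, lagged log searches, day dummies, hour dummies.
X = np.column_stack([
    np.ones(n),
    log_search_lag2,
    dummies(day, list(range(7))),
    dummies(hour, list(hours_of_day)),
])
coef, *_ = np.linalg.lstsq(X, log_jams, rcond=None)
elasticity = coef[1]  # log-log slope: % change in jams per 1% change in searches

resid = log_jams - X @ coef
r2 = 1 - resid @ resid / np.sum((log_jams - log_jams.mean()) ** 2)
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - X.shape[1])
```

Because both the dependent variable and the search regressor are in logs, the slope reads directly as an elasticity, which is how a statement such as "a one percent increase in searches implies a 0.4 percent increase in traffic jams" is obtained. Dropping the first level of each dummy set matches the zero coefficients shown for D = 0 and H = 7.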
Fig 5. Comparing Eqs M3, M4 and M5.
Clearly the regression line of Eq M5 captures the data better than those of Eqs M4 and M3. A higher-degree term is apparent (bottom right). Data Source: Google Trends (www.google.com/trends), ADAC and own calculations.