Su Yang, Shixiong Shi, Xiaobing Hu, Minjie Wang.
Abstract
Spatial-temporal correlations among the data play an important role in traffic flow prediction. Correspondingly, traffic modeling and prediction based on big data analytics have emerged in response to the city-scale interactions among traffic flows. A new methodology based on sparse representation is proposed to reveal the spatial-temporal dependencies among traffic flows, so as to simplify the correlations among traffic data for the prediction task at a given sensor. Three important findings are observed in the experiments: (1) Only traffic flows immediately prior to the present time affect the formation of current traffic flows, which implies that the traditional high-order predictors can be reduced to a first-order model. (2) The spatial context relevant to a given prediction task is more complex than the local context usually assumed and can spread out to the whole city. (3) The spatial context varies with the target sensor undergoing prediction and enlarges as the time lag for prediction increases. Because the scope of human mobility is subject to travel time, identifying how the spatial context varies with time lag is crucial for prediction. Since sparse representation can capture the varying spatial context and adapt to the prediction task, it outperforms traditional methods whose inputs are confined to data from a fixed number of nearby sensors. As the spatial-temporal context for any prediction task is detected from the traffic data in an automated manner, with no additional information about network topology needed, the method scales well to large networks.
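To make the idea of selecting a spatial context by sparse representation concrete, here is a minimal sketch. It is not the authors' implementation: this record does not give their exact formulation or solver, so the sketch assumes an L1-regularized least-squares (lasso) objective solved by coordinate descent, uses synthetic data, and picks a hypothetical regularization weight `lam`. The names `lasso_cd`, `flows`, and `selected` are illustrative only. The target sensor's next reading is regressed on the current readings of all sensors (a first-order model), and the sensors that keep nonzero weights form the detected spatial context.

```python
import random

def soft_threshold(z, t):
    """Shrink z toward zero by t; the proximal step for the L1 penalty."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + lam*||w||_1.

    Assumption: a generic lasso solver standing in for the sparse
    representation step; the paper's own solver is not given here.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    resid = list(y)  # residual y - Xw, with w initially zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes w[j].
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * w[j]
            new_w = soft_threshold(rho, lam) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                w[j] = new_w
    return w

# Synthetic example (hypothetical data): 20 sensors, 300 time steps.
rng = random.Random(0)
n_sensors, n_steps = 20, 300
flows = [[rng.gauss(0.0, 1.0) for _ in range(n_sensors)] for _ in range(n_steps)]

# The target's next reading depends on only three sensors' current readings.
y = [0.8 * row[2] + 0.5 * row[11] - 0.3 * row[17] + rng.gauss(0.0, 0.05)
     for row in flows]

w = lasso_cd(flows, y, lam=0.1)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print("sensors in the detected context:", selected)
```

With these settings the nonzero weights should fall on the three sensors that actually drive the target. Raising `lam` shrinks the selected set and lowering it enlarges the set, which is the trade-off between sparse number and accuracy that the later figures plot against lg(λ).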
Year: 2015 PMID: 26496370 PMCID: PMC4619804 DOI: 10.1371/journal.pone.0141223
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Predictor to infer temporal correlation.
Table 1. The mean accuracy against the time range of the historical data for predicting the traffic flows to appear after 10 minutes at 3254 sensors.
| Time (min.) | 10 | 10∼20 | 10∼30 | 10∼40 | 10∼50 | 10∼60 |
|---|---|---|---|---|---|---|
| Accuracy (%) | 89.68 | 89.8 | 89.84 | 89.86 | 89.84 | 89.85 |
Table 2. The average number of the nonzero weights of different orders for predicting the traffic flows to appear after 10 minutes at 3254 sensors.
| Time range in advance (min.) | p = 1 | p = 2 | p = 3 | p = 4 | p = 5 | p = 6 |
|---|---|---|---|---|---|---|
| 10∼20 | 100 | 59 | 0 | 0 | 0 | 0 |
| 10∼30 | 92 | 41 | 44 | 0 | 0 | 0 |
| 10∼40 | 88 | 36 | 33 | 35 | 0 | 0 |
| 10∼50 | 83 | 32 | 29 | 27 | 36 | 0 |
| 10∼60 | 81 | 30 | 26 | 23 | 28 | 34 |
Table 3. The mean prediction accuracy of 3254 sensors based on only the nonzero variables corresponding to the p = 1 components in Table 2.
| Time (min.) | 10∼20 | 10∼30 | 10∼40 | 10∼50 | 10∼60 |
|---|---|---|---|---|---|
| Accuracy (%) | 89.44 | 89.4 | 89.36 | 89.31 | 89.28 |
Fig 2. Predictor to infer spatial correlation.
Table 4. The prediction accuracy and the number of variables selected by sparse representation for 10 sensors, and the overall performance of 3254 sensors, under different time lags.
| Metric | Sensor | 10 min. | 20 min. | 30 min. | 40 min. | 50 min. | 60 min. |
|---|---|---|---|---|---|---|---|
| Accuracy (%) | 1 | 87.8 | 85.68 | 84.57 | 84.4 | 83.73 | 84.45 |
| | 2 | 91.54 | 90.64 | 90.4 | 90.16 | 89.95 | 90.19 |
| | 3 | 87.23 | 83.9 | 81.8 | 82.2 | 82.23 | 82.11 |
| | 4 | 87.28 | 84.35 | 82.11 | 82.24 | 82.42 | 81.96 |
| | 5 | 88.18 | 85.16 | 82.98 | 83.28 | 83.12 | 82.56 |
| | 6 | 88.19 | 84.86 | 82.95 | 83.05 | 83.39 | 82.72 |
| | 7 | 86.22 | 84.71 | 84.67 | 84.77 | 84.17 | 83.47 |
| | 8 | 93.14 | 92.09 | 91.65 | 91.39 | 91.01 | 91.1 |
| | 9 | 93.41 | 91.96 | 91.34 | 90.97 | 90.58 | 90.12 |
| | 10 | 87.62 | 85.44 | 83.85 | 82.67 | 83.18 | 83.23 |
| | All | | | | | | |
| Sparse number | 1 | 117 | 156 | 171 | 186 | 225 | 214 |
| | 2 | 115 | 156 | 153 | 158 | 180 | 172 |
| | 3 | 173 | 210 | 219 | 240 | 265 | 235 |
| | 4 | 157 | 211 | 220 | 245 | 261 | 243 |
| | 5 | 156 | 215 | 226 | 242 | 241 | 243 |
| | 6 | 159 | 211 | 223 | 241 | 238 | 251 |
| | 7 | 143 | 196 | 187 | 189 | 231 | 208 |
| | 8 | 121 | 137 | 153 | 158 | 174 | 184 |
| | 9 | 108 | 125 | 157 | 172 | 182 | 191 |
| | 10 | 136 | 187 | 196 | 232 | 230 | 230 |
| | All | | | | | | |
Fig 3. Distribution of the 3254 sensors on prediction accuracy.
Fig 4. Distribution of the 3254 sensors on sparse number.
Fig 5. Spatial contexts for given prediction tasks.
Table 5. Comparison with the least squares (LS) fitting method in terms of 10-minute-ahead prediction accuracy (%).
| Sensor | LS | Sparse |
|---|---|---|
| 1 | 76.32 | 87.8 |
| 2 | 84.36 | 91.54 |
| 3 | 78.18 | 87.23 |
| 4 | 79.38 | 87.28 |
| 5 | 80.11 | 88.18 |
| 6 | 80.32 | 88.19 |
| 7 | 75.57 | 86.22 |
| 8 | 85.34 | 93.14 |
| 9 | 88.51 | 93.41 |
| 10 | 79.86 | 87.62 |
| All | | |
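The size of the gap between the two methods in the comparison above can be read off with a few lines of arithmetic. The sketch below copies the ten per-sensor accuracies from that table and computes the gain of sparse representation over least squares fitting for each sensor. The variable names are illustrative, and the mean covers only the ten listed sensors, not the "All" row for the 3254 sensors.

```python
# Per-sensor 10-minute-ahead accuracies (%) copied from the LS vs. sparse table.
ls = [76.32, 84.36, 78.18, 79.38, 80.11, 80.32, 75.57, 85.34, 88.51, 79.86]
sparse = [87.8, 91.54, 87.23, 87.28, 88.18, 88.19, 86.22, 93.14, 93.41, 87.62]

# Gain in percentage points, rounded to the table's two-decimal precision.
gains = [round(s - l, 2) for l, s in zip(ls, sparse)]

# Mean over the ten listed sensors only (not the 3254-sensor "All" row).
mean_gain = sum(gains) / len(gains)

for sensor, gain in enumerate(gains, start=1):
    print(f"sensor {sensor}: +{gain:.2f} points")
print(f"mean gain over the ten listed sensors: {mean_gain:.2f} points")
```

Sparse representation is ahead at every listed sensor, and the gain is smallest where least squares fitting was already most accurate (sensor 9).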
Table 6. Mean prediction accuracy (%) of 60 sensors under different time lags, with input from 10, 15, 20, 25, or 30 neighboring sensors, or from the sensors selected by sparse representation.
| Model | Time lag | 10 | 15 | 20 | 25 | 30 | Sparse |
|---|---|---|---|---|---|---|---|
| Linear | 10 min. | 87.9 | 88.25 | 88.52 | 88.68 | 88.83 | |
| | 20 min. | 86.46 | 85.93 | 86.15 | 86.29 | 87 | |
| | 30 min. | 84.98 | 84.18 | 84.46 | 84.62 | 85.56 | |
| | 40 min. | 83.23 | 82.14 | 82.46 | 82.69 | 83.91 | |
| | 50 min. | 81.77 | 80.24 | 80.66 | 80.95 | 82.58 | |
| | 60 min. | 80.43 | 78.36 | 78.88 | 79.25 | 81.33 | |
| BP | 10 min. | 89.19 | 89.43 | 89.56 | 89.63 | 89.76 | |
| | 20 min. | 87.7 | 87.9 | 88.01 | 88.08 | | 87.94 |
| | 30 min. | 86.66 | 87.05 | 87.14 | 87.31 | | 87.07 |
| | 40 min. | 85.4 | 85.88 | 85.99 | 86.26 | 86.47 | |
| | 50 min. | 84.5 | 84.91 | 85.18 | 85.51 | 85.71 | |
| | 60 min. | 83.53 | 84.25 | 84.59 | 84.99 | 85.34 | |
| RBF | 10 min. | 88.94 | 89.24 | 89.41 | 89.43 | 89.48 | |
| | 20 min. | 87.49 | 87.75 | 87.78 | 87.65 | 87.78 | |
| | 30 min. | 86.44 | 86.81 | 86.89 | 86.84 | 86.98 | |
| | 40 min. | 85.08 | 85.54 | 85.67 | 85.75 | 85.83 | |
| | 50 min. | 84.06 | 84.6 | 84.78 | 84.92 | 84.98 | |
| | 60 min. | 83.13 | 83.79 | 84.04 | 84.27 | 84.38 | |
| VAR | 10 min. | 89.46 | 89.58 | 89.69 | 89.76 | 89.87 | |
| | 20 min. | 87.35 | 87.47 | 87.59 | 87.64 | 87.76 | |
| | 30 min. | 85.73 | 85.83 | 85.98 | 86.06 | 86.2 | |
| | 40 min. | 83.72 | 83.85 | 84.05 | 84.14 | 84.28 | |
| | 50 min. | 81.97 | 82.09 | 82.32 | 82.39 | 82.52 | |
| | 60 min. | 80.25 | 80.4 | 80.67 | 80.75 | 80.88 | |
Fig 6. Performance degradation of each model with increment of time lag for traffic volume prediction.
Fig 7. Mean sparse number against lg(λ).
Fig 8. Mean accuracy against lg(λ).
Fig 9. Mean accuracy against mean sparse number.