Fernando Terroso-Sáenz, Andrés Muñoz.
Abstract
Nowadays, the anticipation of human mobility flows has important applications in many domains, ranging from urban planning to epidemiology. Because of the high predictability of human movements, numerous successful solutions for such forecasting have been proposed. However, most focus on predicting human displacements at an intra-urban spatial scale. This study proposes a predictor for nation-wide mobility that anticipates inter-urban displacements at a larger spatial granularity. To this end, a Graph Neural Network (GNN) is used to capture the latent relationships among large geographical regions. The solution has been evaluated with an open dataset comprising trips throughout Spain together with the prevailing weather conditions. The results indicate high accuracy in predicting the number of trips over multiple time horizons and, more importantly, show that our proposal needs only a single model to process all the mobility areas in the dataset, whereas other techniques require a different model for each area under study.
Keywords: Graph-based neural networks; Human flow prediction; Human mobility; Large-scale mobility; Mobile phone location data
Year: 2021 PMID: 34764610 PMCID: PMC8288072 DOI: 10.1007/s10489-021-02645-3
Source DB: PubMed Journal: Appl Intell (Dordr) ISSN: 0924-669X Impact factor: 5.019
Fig. 1 Scales of human mobility. The red arrows in the leftmost figure represent the intra-urban human flows among different areas within a city A. The rightmost figure depicts the mobility flows defined at a larger scale among spatial regions (R1, R2, R3), each of which includes several cities
Human mobility prediction approaches
| Ref. | Mobility data | Method | Prediction scale | Prediction type |
|---|---|---|---|---|
| [ | TFC, RSB | ConvLSTM | Intra-urban | PT traffic |
| [ | TFC | ConvLSTM | Intra-urban | Road traffic |
| [ | LBS | LSTM | Intra-urban | Individual flows |
| [ | GPS traj. | LSTM | Intra-urban | Pedestrian flows |
| [ | OSN, GPS traj. | LSTM | Intra-urban | Individual flows |
| [ | GPS traj. | LSTM | Intra-urban | Individual flows |
| [ | TFC, RSB | LSTM | Intra-urban | Road traffic |
| [ | MPL, LBS, CDR | GRU | Intra-urban | Individual flows |
| [ | GPS traj. | GRU | Intra-urban | Individual flows |
| [ | ILD, GPS traj. | ConvGRU | Intra-urban | Traffic speed |
| [ | ILD, GPS traj. | GCN | Intra-urban | Road traffic |
| [ | ILD, GPS traj. | GCN | Intra-urban | Road traffic |
| Our proposal | MPL | GCN | Inter-urban | Crowd flows |
Acronyms: MPL - Mobile Phone Location data; GCN - Graph Convolutional Network; GRU - Gated Recurrent Unit; LSTM - Long Short-Term Memory; ILD - Inductive Loop Detector; OSN - Online Social Network; TFC - Taxi-based Floating Cars; RSB - Ride-Sharing Bikes; PT - Public Transport; LBS - Location-Based Service; CDR - Call Detail Record
Fig. 2 Geographical distribution of the sheer number of outgoing trips of the Spanish MAs during the entire period of study
Fig. 3 Descriptive parameters of the target MAs
Fig. 4 Number of trips between adjacent and non-adjacent mobility areas
Fig. 5 Evolution of the number of outgoing trips based on different parameters. Each blue dot represents a particular origin-destination pair of MAs
Fig. 6 Box plots for the number of origins and destinations per MA
Fig. 7 Spatial auto-correlation of the MAs regarding their outgoing number of trips
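The spatial auto-correlation summarized in Fig. 7 is conventionally measured with Moran's I, which contrasts the value of each area against those of its neighbours. As an illustrative sketch (the paper does not show its exact statistic, and the function name and toy data below are assumptions), assuming an adjacency matrix as the spatial weight matrix:

```python
import numpy as np

def morans_i(x, W):
    """Moran's I statistic for spatial auto-correlation:
    I = (N / S0) * sum_ij W_ij (x_i - xbar)(x_j - xbar) / sum_i (x_i - xbar)^2
    where W is the spatial weight matrix (here plain adjacency) and S0 = sum(W).
    Values near +1 indicate clustering of similar values, near -1 dispersion."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()                 # deviations from the mean
    num = z @ W @ z                  # cross-products between neighbours
    den = z @ z                      # total squared deviation
    return len(x) / W.sum() * num / den

# Toy example: 4 areas on a line, with clustered trip counts
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
trips = [10, 10, 2, 2]
I = morans_i(trips, W)               # positive: similar areas are adjacent
```

The positive value reflects that high-trip areas neighbour other high-trip areas, which is the kind of pattern Fig. 7 depicts at the MA level.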
Acronyms of the model
| Acronym | Description |
|---|---|
| | Graph of MAs |
| | Set of target MAs |
| | Set of edges among MAs |
| | Number of MAs |
| | Time period under study |
| | Adjacency matrix among MAs |
| | Identity matrix |
| | Adjacency matrix with self-connections |
| | Feature matrix with the sequence of last outgoing trips per MA |
| | Degree matrix of |
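The adjacency matrix with self-connections and its degree matrix define the propagation matrix of each graph-convolutional layer. As a minimal sketch, assuming the standard symmetric GCN normalization (the function name and toy graph are illustrative, not taken from the paper):

```python
import numpy as np

def normalized_adjacency(A):
    """Propagation matrix of a standard graph-convolutional layer:
    A_hat = D~^{-1/2} (A + I) D~^{-1/2},
    where A + I is the adjacency matrix with self-connections and
    D~ is its degree matrix."""
    A_tilde = A + np.eye(A.shape[0])          # add self-connections
    d = A_tilde.sum(axis=1)                   # degree of each node (MA)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D~^{-1/2}
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

# Toy adjacency for 3 mobility areas: MA0 and MA1 adjacent, MA2 isolated
A = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 0.]])
A_hat = normalized_adjacency(A)
```

Multiplying the feature matrix by this normalized adjacency mixes each MA's recent outgoing-trip sequence with those of its neighbours while keeping the scale of the features stable.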
Fig. 8 Pipeline of the GNN used, with its general layer structure
Fig. 9 Inner structure of the LSTM cells of the model. FC: Fully Connected
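The LSTM cells of Fig. 9 follow the textbook gate equations with a hyperbolic-tangent activation, as listed in the parameter table below. A hedged sketch of a single cell step (weights are random toy values; the function name and dimensions are assumptions for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell. W, U, b stack the parameters
    of the four gates (input i, forget f, output o, candidate g)."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b        # all gate pre-activations at once
    i = sigmoid(z[0 * n:1 * n])       # input gate
    f = sigmoid(z[1 * n:2 * n])       # forget gate
    o = sigmoid(z[2 * n:3 * n])       # output gate
    g = np.tanh(z[3 * n:4 * n])       # candidate cell state
    c = f * c_prev + i * g            # new cell state
    h = o * np.tanh(c)                # new hidden state
    return h, c

# Toy dimensions: 4 input features, 8 hidden units
rng = np.random.default_rng(0)
d_in, d_h = 4, 8
W = rng.normal(size=(4 * d_h, d_in))
U = rng.normal(size=(4 * d_h, d_h))
b = np.zeros(4 * d_h)
h, c = lstm_cell_step(rng.normal(size=d_in), np.zeros(d_h), np.zeros(d_h), W, U, b)
```

In the pipeline of Fig. 8, such cells process the sequence of graph-convolved features before the final fully-connected (MLP) layer produces the trip forecast.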
Model parameters for the experiments
| Parameter | Value |
|---|---|
| Batch size | 60 |
| Learning rate | 0.001 |
| Optimizer | Adam |
| Num. of GC layers ( | 2 |
| GC activation function | ReLU |
| Num. of LSTM layers | 2 |
| Num. of LSTM neurons | 200 |
| LSTM activation function | Hyperbolic tangent |
| Num. of MLP layers ( | 1 |
| Num. of MLP neurons | 200 |
| Num. of epochs | 200 |
Parameters of the candidate models
| Model | Parameter | Value |
|---|---|---|
| LSTM | Batch size | 32 |
| | Learning rate | 0.001 |
| | Optimizer | Adam |
| | Num. of layers | 2 |
| | Num. of cells per layer | 200 |
| | Activation function | Hyperbolic tangent |
| | Num. of epochs | 20 |
| ARIMA | Order of autoregressive model (p) | 12 |
| | Degree of differencing (d) | 1 |
| | Order of moving average model (q) | 1 |
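The ARIMA baseline thus uses order (12, 1, 1): twelve autoregressive lags on a once-differenced series plus one moving-average term. As a simplified, hedged illustration of that idea (the moving-average term is omitted here for brevity, and the function names and toy series are assumptions):

```python
import numpy as np

def fit_ar_on_diff(series, p):
    """Difference the series once (d = 1), then fit the autoregressive
    part of order p by least squares on the differenced values."""
    y = np.diff(series)                                   # d = 1
    X = np.column_stack([y[i:len(y) - p + i] for i in range(p)])
    target = y[p:]                                        # next difference
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef

def forecast_next(series, coef):
    """One-step forecast: predicted next difference added to the last value."""
    p = len(coef)
    y = np.diff(series)
    return series[-1] + y[-p:] @ coef

# Toy hourly trip counts with a linear trend plus noise
rng = np.random.default_rng(1)
trips = 100 + 5 * np.arange(200) + rng.normal(0, 1, 200)
coef = fit_ar_on_diff(trips, p=12)
pred = forecast_next(trips, coef)
```

Note that, unlike the single GNN, one such model must be fitted per origin-destination series, which is why the results table reports 2844 ARIMA instances.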
Metric values of the candidate models for different time horizons (T)
| Model | T | MAE | MSE | RMSE |
|---|---|---|---|---|
| | 1h | 171.923 (± 202.484) | 120,676.856 (± 629,199.550) | 223.957 (± 265.603) |
| | | 215.477 (± 266.398) | 206,847.458 (± 912,681.937) | 287.782 (± 352.239) |
| | | 160.940 (± 171.504) | 102,270.633 (± 479,770.322) | 215.096 (± 236.694) |
| | | 164.947 (± 173.624) | 99,771.510 (± 481,495.175) | 215.585 (± 230.896) |
| ARIMA (2844) | | 131.613 (± 189.702) | 53,295.374 (± 249,934.954) | 174.313 (± 245.913) |
| LSTM (2844) | | 287.582 (± 249.103) | 240,392.048 (± 673,797.631) | 364.926 (± 327.504) |
| Naive | | 445.657 (± 573.288) | 844,369.636 (± 3,552,303.321) | 564.808 (± 724.946) |
| | 3h | 168.832 (± 188.286) | 126,225.899 (± 562,311.371) | 235.882 (± 265.726) |
| | | 164.528 (± 182.431) | 127,471.411 (± 571,985.433) | 236.079 (± 267.887) |
| | | 149.769 (± 168.372) | 93,474.430 (± 460,505.513) | 202.048 (± 229.498) |
| | | 172.606 (± 182.565) | 117,719.663 (± 490,176.192) | 233.887 (± 251.075) |
| ARIMA (2844) | | 229.139 (± 283.488) | 172,512.497 (± 839,840.629) | 264.666 (± 320.454) |
| LSTM (2844) | | 268.269 (± 215.494) | 192,273.747 (± 525,928.089) | 338.988 (± 278.197) |
| Naive | | 445.657 (± 573.288) | 844,369.636 (± 3,552,303.321) | 564.808 (± 724.946) |
| | 6h | 186.570 (± 208.805) | 157,179.349 (± 657,231.586) | 261.085 (± 298.405) |
| | | 197.032 (± 217.406) | 169,179.710 (± 713,804.705) | 270.959 (± 309.507) |
| | | 166.788 (± 183.809) | 114,271.417 (± 505,934.955) | 223.819 (± 253.375) |
| | | 177.517 (± 190.946) | 139,345.311 (± 589,300.754) | 248.903 (± 278.245) |
| ARIMA (2844) | | 237.026 (± 243.406) | 152,980.337 (± 299,022.317) | 280.883 (± 275.128) |
| LSTM (2844) | | 290.504 (± 221.231) | 215,440.658 (± 585,540.530) | 362.157 (± 290.377) |
| Naive | | 685.219 (± 915.489) | 1,880,087.023 (± 8,132,191.379) | 827.188 (± 1,093.740) |
| | 12h | 187.419 (± 229.787) | 189,112.463 (± 913,244.057) | 269.144 (± 341.636) |
| | | 206.836 (± 267.642) | 239,994.215 (± 1,206,237.416) | 295.081 (± 391.120) |
| | | 193.796 (± 234.397) | 195,603.403 (± 949,279.271) | 277.460 (± 344.472) |
| | | 179.012 (± 220.276) | 182,799.867 (± 898,308.893) | 261.680 (± 338.177) |
| ARIMA (2844) | | 216.664 (± 205.569) | 128,209.707 (± 230,840.883) | 263.832 (± 244.696) |
| LSTM (2844) | | 252.473 (± 267.717) | 238,852.906 (± 1,135,378.617) | 330.673 (± 359.992) |
| Naive | | 940.104 (± 1,265.396) | 3,209,833.152 (± 14,297,120.361) | 1,068.467 (± 1,438.381) |
The number in brackets in the Model column indicates the number of instances of each model. The lowest values of each metric among the GNN models are shown in bold
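The table ranks the models by MAE, MSE and RMSE against a naive baseline. A minimal sketch of these metrics, assuming the naive model is the usual persistence rule that repeats the last observed value (the function names and toy numbers are illustrative):

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(mse(y_true, y_pred))

def naive_forecast(history, horizon):
    """Persistence baseline: repeat the last observed trip count."""
    return [history[-1]] * horizon

# Toy trip counts for one origin-destination pair
observed = [120, 130, 125, 140]
predicted = naive_forecast([100, 110, 120], horizon=4)   # [120, 120, 120, 120]
```

MSE penalizes large misses more heavily than MAE, which explains why the Naive baseline degrades fastest on MSE as the horizon grows.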
Fig. 10 Geographical distribution of the MAE per model in the Spanish MAs for T = 1h
Fig. 11 Evolution of the MAE per model with respect to the MA's sheer number of outgoing trips for T = 1h
Fig. 12 Comparison of the total number of outgoing trips and the aggregated predictions of the GNN for T = 1h
Fig. 13 Association between MAs and airports. Each black dot indicates the location of an airport. The MAs around each dot with the same color indicate the areas covered by that airport (its area of influence)
Model parameters for the experiments with weather data
| Parameter | Value |
|---|---|
| Batch size | 32 |
| Learning rate | 0.001 |
| Optimizer | Adam |
| Num. of GC layers ( | 1 |
| GC activation function | ReLU |
| Num. of LSTM layers | 1 |
| Num. of LSTM neurons | 300 |
| LSTM activation function | Hyperbolic tangent |
| Num. of MLP layers ( | 1 |
| Num. of MLP neurons | 200 |
| Num. of epochs | 40 |
Metric values of the model with weather data for different time horizons (T)
| T | MAE | MSE | RMSE |
|---|---|---|---|
| 1h | 206.388 (± 252.180) | 196,953.628 (± 893,711.000) | 282.300 (± 342.492) |
| 3h | 273.089 (± 339.387) | 361,923.161 (± 1,586,234.000) | 376.584 (± 469.238) |
| 6h | 300.036 (± 389.861) | 381,354.557 (± 1,611,739.000) | 386.284 (± 481.892) |
| 12h | 300.965 (± 374.367) | 427,649.183 (± 1,941,091.000) | 416.326 (± 504.391) |
Fig. 14 Geographical distribution of the MAs according to the results. The red dots indicate the location of the airports
Fig. 15 Range of outgoing trips of the MAs based on whether they are improved by