Alessandro Crivellari, Bernd Resch, Yuhui Shi.
Abstract
Trajectory data represent an essential source of information on travel behaviors and human mobility patterns, assuming a central role in a wide range of services related to transportation planning, personalized recommendation strategies, and resource management plans. The main issue when dealing with trajectory recordings, however, is temporary loss of data during collection, which causes spatial-temporal gaps and missing trajectory segments. This is especially critical in use cases based on non-repetitive individual motion traces, where the user's missing information cannot be directly reconstructed because no historical repetitive routes exist for that individual. Working in the context of location-based trajectory modeling, we tackle the problem by proposing a technical parallelism with the natural language processing domain. Specifically, we introduce the use of the Bidirectional Encoder Representations from Transformers (BERT), a state-of-the-art language representation model, into the trajectory processing research field. By training deep bidirectional representations from unlabeled location sequences, jointly conditioned on both left and right context, we derive an explicit prediction of the missing locations along the trace. The proposed framework, named TraceBERT, was tested on a real-world large-scale trajectory dataset of short-term tourists, exploring the adaptation of advanced language modeling approaches to mobility-based applications and demonstrating clear potential for trajectory reconstruction over traditional statistical approaches.
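The masking idea described in the abstract can be illustrated with a minimal sketch. It only shows how a contiguous gap could be hidden in a sequence of location IDs so that a model must predict it from both the left and the right context. The function names (`mask_segment`, `random_gap`), the `cell_*` location IDs, and the gap-sampling rule are assumptions made for illustration and are not taken from the paper's preprocessing.

```python
import random

MASK = "[MASK]"  # placeholder token, following BERT's naming convention


def mask_segment(trace, start, length):
    """Hide `length` consecutive locations beginning at index `start`.

    Returns the masked trace and a dict mapping each hidden position
    to its original location ID (the prediction target).
    """
    if start < 0 or length < 1 or start + length > len(trace):
        raise ValueError("masked segment must lie inside the trace")
    masked = list(trace)
    targets = {}
    for i in range(start, start + length):
        targets[i] = trace[i]
        masked[i] = MASK
    return masked, targets


def random_gap(trace, max_length, rng):
    """Pick a random interior gap so known locations remain on both sides.

    Hypothetical sampling rule; requires a trace of at least 3 locations.
    """
    if len(trace) < 3:
        raise ValueError("trace too short to keep context on both sides")
    length = rng.randint(1, min(max_length, len(trace) - 2))
    start = rng.randint(1, len(trace) - length - 1)
    return mask_segment(trace, start, length)


# Hypothetical trace of discretized location IDs.
trace = ["cell_12", "cell_12", "cell_40", "cell_41", "cell_77", "cell_77"]
masked, targets = mask_segment(trace, start=2, length=2)
print(masked)
print(targets)
```

A model trained in this style would receive `masked` as input and be scored on how well it recovers the entries of `targets`.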
Keywords: BERT; human mobility; neural networks; spatial–temporal gaps; trajectories
Year: 2022 PMID: 35214584 PMCID: PMC8879562 DOI: 10.3390/s22041682
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Trajectory reconstruction problem: predict the missing locations given the known recorded locations along the trace.
Figure 2. Visual representation of the TraceBERT model architecture.
Overall accuracy comparison between TraceBERT and the three baseline approaches, namely the personal Markov model (PMM), global Markov model (GMM), and global location co-visits (GLC).
| Model | Top-1 Accuracy | Top-3 Accuracy | Top-5 Accuracy |
|---|---|---|---|
| PMM | 0.2050 | 0.2931 | 0.3012 |
| GMM | 0.4482 | 0.6198 | 0.6817 |
| GLC | 0.3658 | 0.5890 | 0.6613 |
| TraceBERT | 0.4745 | 0.6870 | 0.7492 |
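For readers unfamiliar with the metric in the table above, the following is a minimal sketch of how a top-k accuracy could be computed over masked locations: a prediction counts as correct when the true location is among the k highest-ranked candidates. The function name `top_k_accuracy` and the toy data are assumptions for illustration; the paper's evaluation code is not shown in this record.

```python
def top_k_accuracy(ranked_predictions, true_locations, k):
    """Fraction of masked locations whose true ID is among the k best candidates.

    ranked_predictions: one list per masked location, with candidates
    ordered from most to least likely.
    """
    if len(ranked_predictions) != len(true_locations):
        raise ValueError("need one ranked list per true location")
    if not true_locations:
        raise ValueError("no masked locations to evaluate")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_locations)
        if truth in ranked[:k]
    )
    return hits / len(true_locations)


# Toy example with three candidate locations (hypothetical data).
ranked = [["A", "B", "C"], ["B", "A", "C"], ["C", "B", "A"], ["A", "C", "B"]]
truth = ["A", "A", "A", "B"]
for k in (1, 2, 3):
    print(k, top_k_accuracy(ranked, truth, k))
```

By construction the score cannot decrease as k grows, which matches the left-to-right increase in every row of the table.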
Comparison of top-1 accuracy, top-3 accuracy (in round brackets), and top-5 accuracy (in square brackets) for different ranges of traveled distance.
| Traveled distance | ≤10 km | 10–25 km | 25–50 km | 50–100 km | ≥100 km |
|---|---|---|---|---|---|
| PMM | 0.4005 | 0.2644 | 0.1829 | 0.1321 | 0.0622 |
| GMM | 0.7039 | 0.5449 | 0.4458 | 0.3627 | 0.2237 |
| GLC | 0.5864 | 0.4642 | 0.3635 | 0.2903 | 0.1617 |
| TraceBERT | 0.7145 | 0.5604 | 0.4671 | 0.3916 | 0.2722 |
Comparison of top-1 accuracy, top-3 accuracy (in round brackets), and top-5 accuracy (in square brackets) for different ranges of the radius of gyration.
| Radius of gyration (ROG) | ≤3 km | 3–10 km | 10–32 km | ≥32 km |
|---|---|---|---|---|
| PMM | 0.3531 | 0.2190 | 0.1634 | 0.0705 |
| GMM | 0.6305 | 0.4885 | 0.4184 | 0.2401 |
| GLC | 0.5603 | 0.3981 | 0.3152 | 0.1709 |
| TraceBERT | 0.6509 | 0.4995 | 0.4430 | 0.2903 |
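The two preceding tables group traces by traveled distance and by radius of gyration. This record does not state how either quantity was computed, so the sketch below relies on common definitions as an assumption: traveled distance as the sum of great-circle distances between consecutive points, and radius of gyration as the root-mean-square distance of the points from their centroid. The centroid is approximated by averaging latitudes and longitudes, which is only reasonable for traces that cover a small area. All function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def traveled_distance_km(points):
    """Sum of distances between consecutive (lat, lon) points of a trace."""
    return sum(haversine_km(*a, *b) for a, b in zip(points, points[1:]))


def radius_of_gyration_km(points):
    """Root-mean-square distance of the points from their (approximate) centroid."""
    if not points:
        raise ValueError("trace is empty")
    c_lat = sum(lat for lat, _ in points) / len(points)
    c_lon = sum(lon for _, lon in points) / len(points)
    mean_sq = sum(
        haversine_km(lat, lon, c_lat, c_lon) ** 2 for lat, lon in points
    ) / len(points)
    return math.sqrt(mean_sq)


# Two points on the equator, one degree of longitude apart (toy data).
pts = [(0.0, 0.0), (0.0, 1.0)]
print(traveled_distance_km(pts))
print(radius_of_gyration_km(pts))
```

For this two-point trace the radius of gyration is half the traveled distance, since both points sit at the same distance from the midpoint.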
Figure 3. Top-1, top-3, and top-5 prediction accuracy scores (from left to right) over the 24 h of the day.
Comparison of top-1 accuracy, top-3 accuracy (in round brackets), and top-5 accuracy (in square brackets) for different amounts of masked locations per segment.
| Masked locations per segment | 1–2 Locations | 3–4 Locations | ≥5 Locations |
|---|---|---|---|
| PMM | 0.2732 | 0.1719 | 0.0370 |
| GMM | 0.4670 | 0.4426 | 0.3765 |
| GLC | 0.3691 | 0.3654 | 0.3495 |
| TraceBERT | 0.5017 | 0.4640 | 0.3875 |
Figure 4. Bar graphs reporting the error distance distribution of the masked locations that are wrongly predicted by both TraceBERT and GMM (from left to right: wrong predictions in top-1, top-3, and top-5, respectively).
Figure 5. Bar graphs reporting the difference of error distance between GMM and TraceBERT in the case of common misprediction (from left to right: wrong predictions in top-1, top-3, and top-5, respectively).