Wei Tian1, Songtao Wang1, Zehan Wang1, Mingzhi Wu2, Sihong Zhou1, Xin Bi1.
Abstract
Accurate trajectory prediction is essential to automated driving and relies on sensing and analyzing the behavior of surrounding vehicles. Although this field has been studied extensively, it remains challenging because of the complexity of the environment and the uncertainty of driving intentions. In this paper, we propose a joint learning architecture that incorporates lane orientation, vehicle interaction, and driving intention into vehicle trajectory forecasting. A coordinate transform encodes the vehicle trajectory with lane orientation information, and the encoded trajectory is then fed into various interaction models to explore the mutual relations between trajectories. The extracted features are used in a dual-level stochastic choice learning scheme that distinguishes trajectory modality at both the intention and motion levels. By jointly learning lane orientation, interaction, and intention, our approach can be applied to both highway and urban scenes. Experiments on the NGSIM, HighD, and Argoverse datasets demonstrate that the proposed method achieves a significant improvement in prediction accuracy compared with the baseline.
Keywords: intention learning; lane coordinate transform; multi-modal trajectory; trajectory prediction
Year: 2022 PMID: 35684916 PMCID: PMC9185224 DOI: 10.3390/s22114295
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. The multi-modal vehicle trajectory prediction framework based on interaction modeling and lane orientation information. The related lane centerline is depicted in red. Predicted trajectories are shown in yellow, and historical trajectories in green.
Figure 2. Three methods of embedding the interaction model in the LSTM encoder–decoder network. (a) Embedding the interaction model only at the current frame. (b) Embedding the interaction model at each frame. (c) Embedding the interaction model with spatial–temporal coupling.
Figure 3. Coordinate transform of the historical vehicle trajectory.
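The abstract describes a coordinate transform that encodes a trajectory with lane orientation information, but this record does not give its exact form. As a minimal sketch, assuming a Frenet-style projection onto a polyline centerline (the function names `to_lane_frame` and `trajectory_to_lane_frame` and the sign convention are illustrative assumptions, not the authors' implementation), each world-frame point can be mapped to an arc length along the lane and a signed lateral offset:

```python
import math


def to_lane_frame(point, centerline):
    """Project a world-frame (x, y) point onto a polyline centerline.

    Returns (s, d): the arc length along the centerline to the closest
    point, and the signed lateral offset (positive to the left of the
    direction of travel). This is an assumed, Frenet-style transform.
    """
    px, py = point
    best = None  # (squared distance, s, d) of the closest segment so far
    s_start = 0.0  # arc length accumulated before the current segment
    for (x0, y0), (x1, y1) in zip(centerline, centerline[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue  # skip duplicated centerline points
        # Position of the closest point on the segment, clamped to [0, 1].
        t = ((px - x0) * dx + (py - y0) * dy) / seg_len ** 2
        t = max(0.0, min(1.0, t))
        cx, cy = x0 + t * dx, y0 + t * dy
        dist2 = (px - cx) ** 2 + (py - cy) ** 2
        # The sign of the 2D cross product distinguishes left from right.
        cross = dx * (py - y0) - dy * (px - x0)
        d = math.copysign(math.sqrt(dist2), cross)
        if best is None or dist2 < best[0]:
            best = (dist2, s_start + t * seg_len, d)
        s_start += seg_len
    if best is None:
        raise ValueError("centerline needs at least one non-degenerate segment")
    return best[1], best[2]


def trajectory_to_lane_frame(trajectory, centerline):
    """Apply the projection to every point of a trajectory."""
    return [to_lane_frame(p, centerline) for p in trajectory]
```

In such a lane-aligned frame, a vehicle following a curved lane traces a nearly straight path, which is one plausible reason a lane-based encoding helps in urban scenes.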
Vehicle trajectory prediction with various interaction models on the NGSIM and HighD datasets.
| Method | NGSIM ADE (m) ↓ | NGSIM FDE (m) ↓ | HighD ADE (m) ↓ | HighD FDE (m) ↓ |
|---|---|---|---|---|
| CV | 3.42 | 6.68 | 1.49 | 2.83 |
| IDM | 3.40 | 6.60 | 1.52 | 2.86 |
| LSTM (baseline) | 3.19 | 6.27 | 1.43 | 2.76 |
| S-LSTM | 2.24 | 4.18 | 1.30 | 2.55 |
| CS-LSTM | 2.20 | 4.11 | 1.28 | 2.54 |
| P-LSTM | 2.21 | 4.17 | 1.25 | 2.52 |
Prediction with different embedding methods on the NGSIM dataset.
| Embedding Style | ADE (m) ↓ | FDE (m) ↓ | Runtime (Hz) ↑ |
|---|---|---|---|
| Current frame | 2.21 | 4.17 | 98 |
| Each frame | 2.21 | 4.15 | 12 |
| Space–time coupling | 2.20 | 4.14 | 8 |
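The tables report ADE and FDE in meters. Under the usual definitions (average displacement error is the mean Euclidean distance between predicted and ground-truth positions over the prediction horizon, and final displacement error is that distance at the last step), the metrics can be sketched as follows. The function name `ade_fde` is illustrative, and the authors' evaluation code may differ in details such as the horizon length or sampling rate.

```python
import math


def ade_fde(predicted, ground_truth):
    """Return (ADE, FDE) for two equal-length lists of (x, y) positions."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    # Per-step Euclidean displacement between prediction and ground truth.
    errors = [
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    ade = sum(errors) / len(errors)  # mean over the horizon
    fde = errors[-1]                 # error at the final step
    return ade, fde
```

For example, a prediction that drifts 0 m, 1 m, and 2 m from the ground truth over three steps has an ADE of 1 m and an FDE of 2 m.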
Figure 4. Visualization of the trajectories predicted directly in the world coordinate frame and by the proposed framework (best viewed in color).
Ablation study of prediction approaches integrating lane orientation information on the Argoverse dataset. The results for the rasterized map and VectorNet are taken from their original papers.
| Methods | ADE (m) ↓ | FDE (m) ↓ |
|---|---|---|
| CV | 3.95 | 8.56 |
| CV-map | 3.72 | 7.19 |
| LSTM (baseline) | 2.77 | 5.67 |
| LSTM-map | 1.67 | 3.58 |
| P-LSTM-map | 1.58 | 3.37 |
| VectorNet | 1.66 | 3.67 |
| Rasterized Map | 1.60 | 3.64 |
Figure 5. Visualization of multi-modal prediction results in different urban road scenarios (best viewed in color). (a) Prediction at an intersection. (b) Prediction at a T-junction. (c) Prediction at a merge.
Comparison of multi-modal vehicle trajectory prediction methods on the Argoverse dataset.
| Methods | minADE (m) ↓ | minFDE (m) ↓ |
|---|---|---|
| CV | 3.95 | 8.56 |
| CV-map | 3.72 | 7.19 |
| LSTM (baseline) | 2.53 | 5.04 |
| LSTM-map | 1.67 | 3.58 |
| P-LSTM-map | 1.58 | 3.37 |
| LSTM-M-map | 0.88 | 1.74 |
| P-LSTM-M-map | 0.85 | 1.66 |
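For multi-modal predictors, minADE and minFDE score the best of several candidate trajectories against a single ground truth. The sketch below assumes the simplest convention, taking each minimum independently over the candidates; some benchmarks instead report the ADE of the candidate with the lowest endpoint error, and this record does not state which convention the authors use. The function name `min_ade_fde` is illustrative.

```python
import math


def min_ade_fde(candidates, ground_truth):
    """Return (minADE, minFDE) over a list of candidate trajectories.

    Each candidate and the ground truth are equal-length lists of (x, y).
    The two minima are taken independently (an assumed convention).
    """
    if not candidates:
        raise ValueError("at least one candidate trajectory is required")
    ades, fdes = [], []
    for predicted in candidates:
        if not predicted or len(predicted) != len(ground_truth):
            raise ValueError("each candidate must match the ground-truth length")
        # Per-step Euclidean displacement for this candidate.
        errors = [
            math.hypot(px - gx, py - gy)
            for (px, py), (gx, gy) in zip(predicted, ground_truth)
        ]
        ades.append(sum(errors) / len(errors))
        fdes.append(errors[-1])
    return min(ades), min(fdes)
```

With a single candidate, these values reduce to plain ADE and FDE, which is consistent with the single-mode rows (CV, CV-map, LSTM-map, P-LSTM-map) carrying the same numbers in both Argoverse tables.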
Comparison of the proposed framework and non-hierarchical multi-modal prediction methods on the Argoverse dataset.
| Methods | minADE (m) ↓ | minFDE (m) ↓ |
|---|---|---|
| P-LSTM-M-map | 0.85 | 1.66 |
| MFP-k | 1.40 | - |
| P-LSTM-map-k | 1.40 | 3.15 |
| LaneRCNN | 0.90 | 1.45 |
| LaneRCNN-M | 0.86 | 1.35 |
| WIMP | 0.90 | 1.42 |
| HOME | 0.94 | 1.45 |
| SceneTransformer | 0.80 | 1.23 |