Haifu Cui, Liang Wu, Zhanjun He, Sheng Hu, Kai Ma, Li Yin, Liufeng Tao.
Abstract
Affinity propagation (AP) is a clustering algorithm for point data, used in image recognition, that handles problems such as selecting initial class representative points, computing on large-scale sparse matrices, and clustering large-scale data with few parameter settings. However, the AP clustering algorithm does not consider spatiotemporal information and multiple thematic attributes simultaneously, which leads to poor performance in discovering patterns from massive spatiotemporal points (e.g., trajectory points). To resolve this issue, a multidimensional spatiotemporal affinity propagation (MDST-AP) algorithm is proposed in this study. First, the similarity of spatial and nonspatial attributes is measured in Gaussian kernel space instead of Euclidean space, which helps address the multidimensional linear inseparability problem. Then, the Davies-Bouldin (DB) index is used to optimize the parameter values of the MDST-AP algorithm. The algorithm is applied to analyze road congestion in Beijing using taxi trajectories. Experiments comparing MDST-AP with other algorithms on several datasets indicated that it processes multidimensional spatiotemporal data points faster and more effectively.
Keywords: Davies-Bouldin index; Gaussian kernel function; affinity propagation; spatial clustering; trajectory points
Year: 2019 PMID: 31167481 PMCID: PMC6603948 DOI: 10.3390/ijerph16111988
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
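The abstract states that similarity is measured in Gaussian kernel space rather than Euclidean space. The exact formula is not given on this page, so the following is only a minimal sketch under stated assumptions: it uses the standard identity that, for a Gaussian kernel K, the squared distance between two points in the kernel feature space is 2 - 2K(x, y), and it follows the usual AP convention of taking similarity as a negative squared distance. The bandwidth `sigma` and the function names are hypothetical.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / (2 * sigma^2)); sigma is an assumed bandwidth."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_similarity(x, y, sigma=1.0):
    """Negative squared distance in kernel feature space.

    For a Gaussian kernel K(x, x) = 1, so
    ||phi(x) - phi(y)||^2 = K(x, x) - 2 K(x, y) + K(y, y) = 2 - 2 K(x, y).
    """
    return -(2.0 - 2.0 * gaussian_kernel(x, y, sigma))

def similarity_matrix(points, sigma=1.0):
    """Pairwise similarity matrix of the kind AP takes as input."""
    return [[kernel_similarity(p, q, sigma) for q in points] for p in points]

# Toy points; the attribute layout (two spatial values plus one thematic value)
# is an illustrative assumption, not the paper's feature set.
points = [(0.0, 0.0, 0.1), (0.1, 0.0, 0.2), (3.0, 3.0, 0.9)]
S = similarity_matrix(points, sigma=1.0)
```

With this definition every similarity lies between -2 and 0, so attributes with very different ranges cannot dominate through unbounded distances. In practice the attributes would still need to be scaled before choosing `sigma`.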
Figure 1. Flow chart of the multidimensional spatiotemporal affinity propagation (MDST-AP) algorithm.
University of California, Irvine (UCI) dataset information.
| Dataset | Iris | Seeds | Wine Quality, Red | Wine Quality, White |
|---|---|---|---|---|
| Objects | 150 | 150 | 150 | 500 |
| Clusters | 3 | 3 | 6 | 7 |
| Attributes | 4 | 7 | 11 | 11 |
Clustering results and computational time for different datasets. AP: affinity propagation.
| Algorithm | Measure | Iris | Seeds | Wine Quality, Red | Wine Quality, White |
|---|---|---|---|---|---|
| AP | F-measure | 0.88 | 0.81 | 0.71 | 0.78 |
| | Time (s) | 0.38 | 0.44 | 0.54 | 19.29 |
| MDST-AP | F-measure | 0.93 | 0.89 | 0.76 | 0.82 |
| | Time (s) | 0.35 | 0.45 | 0.47 | 15.98 |
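The F-measure in the table compares cluster assignments against known class labels. This page does not say which variant the authors used, so the sketch below assumes one common class-weighted definition: each true class is matched to the cluster that gives it the best F-score, and the per-class scores are averaged with weights proportional to class size. The function name is hypothetical.

```python
from collections import Counter

def clustering_f_measure(true_labels, cluster_labels):
    """Class-weighted best-match F-measure between classes and clusters."""
    n = len(true_labels)
    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_labels)
    overlap = Counter(zip(true_labels, cluster_labels))

    total = 0.0
    for cls, cls_size in class_sizes.items():
        best = 0.0
        for clu, clu_size in cluster_sizes.items():
            n_ij = overlap[(cls, clu)]
            if n_ij == 0:
                continue  # no shared points, F-score is zero
            precision = n_ij / clu_size
            recall = n_ij / cls_size
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (cls_size / n) * best
    return total

# A perfect clustering scores 1.0; merging two classes lowers the score.
perfect = clustering_f_measure([0, 0, 1, 1], ["a", "a", "b", "b"])
merged = clustering_f_measure([0, 0, 1, 1], ["a", "a", "a", "a"])
```

The score is invariant to how clusters are named, which matters for AP because exemplar indices carry no class meaning.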
Global Positioning System (GPS) dataset information.
| ID | Time | Longitude | Latitude | Speed (km/h) | Direction | Status |
|---|---|---|---|---|---|---|
| 174853 | 20121101001447 | 116.4548645 | 39.9519463 | 51 | 328 | 1 |
| 453468 | 20121102155618 | 116.2787857 | 39.9250107 | 25 | 180 | 0 |
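The records above pack the timestamp into a 14-digit string that reads as year, month, day, hour, minute, second. A minimal parsing sketch follows. It assumes that layout, and it treats the third column (values near 116) as longitude and the fourth (values near 39.9) as latitude, since a latitude cannot exceed 90 degrees. The dictionary keys are hypothetical names, and the meaning of the status flag is not stated on this page, so it is kept as a plain integer.

```python
from datetime import datetime

def parse_gps_row(row):
    """Parse one pipe-delimited GPS record into a typed dict."""
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    taxi_id, stamp, coord_a, coord_b, speed, direction, status = cells
    return {
        "id": taxi_id,
        # Assumed layout: YYYYMMDDhhmmss
        "time": datetime.strptime(stamp, "%Y%m%d%H%M%S"),
        "longitude": float(coord_a),
        "latitude": float(coord_b),
        "speed_kmh": float(speed),
        "direction": int(direction),
        "status": int(status),  # meaning not given in the source
    }

record = parse_gps_row(
    "| 174853 | 20121101001447 | 116.4548645 | 39.9519463 | 51 | 328 | 1 |"
)
```

Converting the timestamp early makes it straightforward to select the 8:00 and 17:00 windows used in the comparison that follows.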
Figure 2. The number of Global Positioning System (GPS) records in three weeks.
Comparison of algorithms.
| Time | Algorithm | Day 1 Clusters | Day 1 DB | Day 2 Clusters | Day 2 DB | Day 3 Clusters | Day 3 DB |
|---|---|---|---|---|---|---|---|
| 8:00 | AP | 82 | 116.18 | 80 | 129.81 | 84 | 128.85 |
| | MDST-AP | 18 | 14.51 | 18 | 7.84 | 15 | 13.23 |
| 17:00 | AP | 82 | 129.23 | 82 | 124.45 | 74 | 190.65 |
| | MDST-AP | 18 | 7.51 | 14 | 6.80 | 10 | 9.77 |
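The DB column reports the Davies-Bouldin index, where lower values indicate more compact and better separated clusters. The sketch below follows the standard definition: for each cluster, take the worst ratio of summed within-cluster scatter to centroid distance, then average over clusters. It assumes plain Euclidean distance; the paper may compute distances differently (for example in kernel space), so absolute values should not be expected to match the table.

```python
import math

def _centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]

def davies_bouldin(points, labels):
    """Davies-Bouldin index with Euclidean distances (lower is better)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the index needs at least two clusters")

    centroids = {lab: _centroid(ps) for lab, ps in clusters.items()}
    # Scatter: mean distance from each member to its cluster centroid.
    scatter = {
        lab: sum(math.dist(p, centroids[lab]) for p in ps) / len(ps)
        for lab, ps in clusters.items()
    }

    total = 0.0
    for i in clusters:
        # Assumes distinct centroids; coincident centroids would divide by zero.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)

pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = davies_bouldin(pts, [0, 0, 1, 1])  # two tight, distant groups
bad = davies_bouldin(pts, [0, 1, 0, 1])   # groups that straddle the gap
```

On the toy points, the labeling that respects the gap yields a much smaller index than the one that cuts across it, which mirrors how the table ranks MDST-AP ahead of AP.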
Clustering results under different values of M and λ.
| λ | Measure | M = 1 | M = 2 | M = 3 | M = 4 | M = 5 |
|---|---|---|---|---|---|---|
| | Clusters | 13 | 13 | | 18 | 998 |
| | DB | 11.46 | 9.57 | | 28.03 | 29.84 |
| 0.8 | Clusters | 16 | | 20 | 24 | 998 |
| | DB | 6.89 | | 13.44 | 7.23 | 29.84 |
| 0.9 | Clusters | | 23 | 24 | 28 | 998 |
| | DB | | 19.62 | 9.78 | 6.37 | 29.84 |
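The abstract says the DB index is used to optimize the algorithm's parameters, and the table above varies two of them, M and λ. This page does not define either symbol, so the following is only a generic sketch of choosing the combination with the lowest DB score. The `evaluate` callback is hypothetical: it stands in for running the clustering with one parameter pair and returning its DB index, or `None` when no score is available. The toy scoring function and candidate values are invented purely for illustration.

```python
def select_parameters(candidates_m, candidates_lam, evaluate):
    """Return (score, m, lam) with the lowest score, or None if nothing scored."""
    best = None
    for lam in candidates_lam:
        for m in candidates_m:
            score = evaluate(m, lam)
            if score is None:
                continue  # skip combinations without a usable score
            if best is None or score < best[0]:
                best = (score, m, lam)
    return best

def toy_evaluate(m, lam):
    """Stand-in for 'cluster, then compute DB'; not the paper's results."""
    if m == 5:
        return None  # pretend this run produced no usable clustering
    return (m - 3) ** 2 + abs(lam - 0.8)

choice = select_parameters([1, 2, 3, 4, 5], [0.5, 0.8, 0.9], toy_evaluate)
```

Ties are resolved in favor of the first combination visited, so the iteration order of the candidate lists acts as an implicit preference.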
Accuracy ratio of the points matched (ARP) comparison of the K-means, AP, and MDST-AP algorithms (%).
| Algorithm | Unimpeded | Mildly Congested | Congested | Very Congested |
|---|---|---|---|---|
| K-means | 87.54 | 28.3 | 83.11 | 91.07 |
| AP | 90.6 | 31.62 | 85.3 | 93.2 |
| MDST-AP | 97.12 | 40.18 | 90.36 | 98.2 |
Figure 3. Average ARP at different buffer radii for the K-means, AP, and MDST-AP algorithms.
Figure 4. Clustering results of the morning and evening peaks on weekdays and weekends. (a) Morning peak on weekdays. (b) Evening peak on weekdays. (c) Morning peak on weekends. (d) Evening peak on weekends.
Comparison of traffic conditions between weekdays and weekends (%).
| Period | Unimpeded | Mildly Congested | Congested | Very Congested |
|---|---|---|---|---|
| Weekdays | 25.01 | 1.47 | 37.95 | 35.57 |
| Weekends | 29.31 | 10.89 | 39.15 | 20.65 |