Xinhuan Zhang1, Les Lauber2, Hongjie Liu3, Junqing Shi1, Jinhong Wu1, Yuran Pan1.
Abstract
The travel trajectory data of mobile intelligent terminal users are characterized by clutter, incompleteness, noise, and fuzzy randomness. Accurate original data are an essential prerequisite for good trajectory data mining results. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is one of the most effective trajectory data mining methods, but its performance is often limited by the selection of its input parameters. The Sage-Husa adaptive filtering algorithm effectively controls the error range of mobile phone GPS data, so that the data meet the positioning accuracy requirements of DBSCAN spatial clustering while retaining the advantages of low cost and convenient use. A novel cluster validity index based on the internal and external duty cycle is then proposed to balance the influence of the within-cluster distance, the between-cluster distance, and the number of coordinate points during clustering. The index can automatically choose the input parameters of density clustering and forms effective clusters on different data sets. The optimized clustering method can be applied to the in-depth analysis and mining of traveler behavior trajectories. Experiments show that the proposed Sage-Husa adaptive filtering algorithm improves GPS positioning accuracy by 17.34% eastward, 15.24% northward, 14.25% in 2D, and 18.17% in 3D, and significantly reduces the number of noise points. Compared with traditional validity indexes, the proposed duty-cycle-based evaluation index optimizes the input parameters and yields better clustering results for traveler location information.
Year: 2021 PMID: 34937064 PMCID: PMC8694428 DOI: 10.1371/journal.pone.0259472
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. GPS data distribution before and after Sage-Husa adaptive filtering.
Fig 2. Duty cycle coefficient.
Compactness evaluation results of different performance indexes.
| The evaluation index | Silhouette Coefficient | DBI | IEDCI |
|---|---|---|---|
| Distinct cluster | 73287.920 | 69802.893 | |
| Fuzzy cluster | 130473.570 | 85870.576 | |
| Halos Clusters | 324076.380 | | 324113.390 |
| No-cluster | 33617.257 | 20486.406 | |
Separation evaluation results of different performance indexes.
| The evaluation index | Silhouette Coefficient | DBI | IEDCI |
|---|---|---|---|
| Distinct cluster | 602379.301 | 600588.843 | |
| Fuzzy cluster | 234766.280 | | 259148.034 |
| Halos Clusters | | 188408.806 | 25876.882 |
| No-cluster | 539210.783 | 3836.680 | |
Improved DBSCAN clustering algorithm.
| 1. Set |
Comparison of positioning errors between L1 and L2.
| Location | Longitude (°) | Latitude (°) | Error (m) |
|---|---|---|---|
| Measured coordinates of L1 (Jinmao Tower) | 120.097832 | 29.324656 | 2.1 |
| Actual coordinates of L1 | 120.097858 | 29.324661 | |
| Measured coordinates of L2 (The Double Dragon Cave Visitor Centre) | 119.622086 | 29.201382 | 2.5 |
| Actual coordinates of L2 | 119.622114 | 29.201413 | |
Fig 3. 2-D synthetic data sets.
The travel data structure of urban public transport in Jinhua city.
| | UID | LNG | LAT | UP_TIME |
|---|---|---|---|---|
| 1 | 3615691134 | 119.666871 | 29.068345 | 02/01/2021 11:23:56 |
| 2 | 3286093069 | 119.661293 | 29.070279 | 02/10/2021 08:01:12 |
| 3 | 1778287686 | 119.658382 | 29.072213 | 02/23/2021 20:17:04 |
| 4 | 4189128205 | 119.663557 | 29.076293 | 02/27/2021 18:43:33 |
| 5 | . . . | . . . | . . . | . . . |
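The record layout above can be sketched as a small typed structure. This is only an illustration: the class name `TravelRecord` and the function `parse_record` are hypothetical, and the timestamp format is assumed to be month/day/year with a 24-hour time, which is consistent with the sample rows (for example `02/23/2021 20:17:04`).

```python
from dataclasses import dataclass
from datetime import datetime

# Assumed timestamp layout, inferred from the sample rows (MM/DD/YYYY HH:MM:SS).
UP_TIME_FORMAT = "%m/%d/%Y %H:%M:%S"


@dataclass
class TravelRecord:
    """One row of the travel data table (hypothetical container)."""
    uid: int            # user identifier (UID column)
    lng: float          # longitude in degrees (LNG column)
    lat: float          # latitude in degrees (LAT column)
    up_time: datetime   # timestamp (UP_TIME column)


def parse_record(uid: str, lng: str, lat: str, up_time: str) -> TravelRecord:
    """Convert the four text fields of a row into typed values."""
    return TravelRecord(
        uid=int(uid),
        lng=float(lng),
        lat=float(lat),
        up_time=datetime.strptime(up_time, UP_TIME_FORMAT),
    )


# First sample row of the table.
record = parse_record("3615691134", "119.666871", "29.068345", "02/01/2021 11:23:56")
print(record)
```

Parsing the timestamp into a `datetime` makes it straightforward to sort one user's points chronologically before any filtering or clustering step.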
Comparison of positioning accuracy based on Sage-Husa adaptive filtering algorithm.
| Direction | Eastward | Northward | 2-D | 3-D |
|---|---|---|---|---|
| Data processed by the algorithm (m) | 4.72 | 5.45 | 6.56 | 9.73 |
| Raw GPS Data (m) | 5.71 | 6.43 | 7.65 | 11.89 |
| Precision improvement ratio (%) | 17.34% | 15.24% | 14.25% | 18.17% |
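The improvement ratios in the last row are consistent with the relative reduction in error, (raw − processed) / raw. The short check below recomputes them from the two error rows of the table; the function name `improvement_ratio` is an illustrative choice, and the formula is inferred from the tabulated numbers rather than stated in this record.

```python
# Positioning errors in metres, copied from the table above.
processed = {"Eastward": 4.72, "Northward": 5.45, "2-D": 6.56, "3-D": 9.73}
raw = {"Eastward": 5.71, "Northward": 6.43, "2-D": 7.65, "3-D": 11.89}


def improvement_ratio(raw_error: float, processed_error: float) -> float:
    """Relative error reduction in percent (assumed definition)."""
    return (raw_error - processed_error) / raw_error * 100.0


# Round to two decimals to match the precision used in the table.
ratios = {
    direction: round(improvement_ratio(raw[direction], processed[direction]), 2)
    for direction in raw
}

for direction, ratio in ratios.items():
    print(f"{direction}: {ratio:.2f}%")
```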
Fig 4. Frequency of clustering results.
Experimental results of different performance parameters.
| Input parameters | Experience value | Statistics | Improved DBSCAN |
|---|---|---|---|
| Compactness | 4.373 | | 4.047 |
| Separation | 588.469 | 568.858 | |
| Davies-Bouldin index (DBI) | 0.129 | 0.992 | |
Optimal values of MinPts and Eps.
| The evaluation index | Silhouette Coefficient | DBI | IEDCI |
|---|---|---|---|
| | 13 | 12 | 14 |
| | 10 | 50 | 30 |
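MinPts and Eps are the two inputs of DBSCAN: Eps is the neighborhood radius, and MinPts is the minimum number of points that must fall within that radius for a point to count as a core point. The following is a minimal pure-Python sketch of plain DBSCAN, meant only to show how the two parameters act. It is not the improved algorithm or the IEDCI-based parameter selection described in the abstract; the names `dbscan` and `region_query`, the toy points, and the convention that a point counts as its own neighbor are all assumptions made for illustration.

```python
from math import hypot

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    xi, yi = points[i]
    return [
        j for j, (xj, yj) in enumerate(points)
        if hypot(xi - xj, yi - yj) <= eps
    ]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN on 2-D points; returns one label per point (-1 = noise)."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels


# Toy planar data (hypothetical units): two tight groups and one outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]

labels = dbscan(points, eps=1.5, min_pts=3)
strict = dbscan(points, eps=1.5, min_pts=5)
print(labels)
print(strict)
```

On this toy data, `min_pts=3` recovers the two groups and flags the outlier as noise, whereas `min_pts=5` marks every point as noise because no point has five neighbors within the radius. That sensitivity is why the choice of the two inputs matters.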
Fig 5. 3-Dimensional surfaces with different performance indexes.
Fig 6. Clustering results of different performance indicators.
Evaluation results of separation and compactness.
| Evaluation index | Silhouette Coefficient | DBI | IEDCI |
|---|---|---|---|
| Compactness | 6.430 | 2.804 | |
| Separation | 639.290 | 567.494 | |