Pengyang Wang, Kunpeng Liu, Dongjie Wang, Yanjie Fu.
Abstract
The pervasiveness of mobile and sensing technologies has facilitated the creation of Big Crowdsourced Geotagged Data (BCGD) from individual users in real time and at different locations in the city. Such ubiquitous user-generated data allow us to infer various patterns of human behavior, which helps us understand the interactions between humans and cities. In this article, we analyze BCGD, including mobile consumption check-ins, urban geography data, and human mobility data, to learn a model that can unveil the impact of urban geography and human mobility on the vibrancy of residential communities. Vibrant communities are defined as places that show diverse and frequent consumer activities. To identify such communities effectively, we propose a supervised data mining system that learns and mimics the unique spatial configuration patterns and social interaction patterns of vibrant communities from urban geography and human mobility data. Specifically, to prepare the benchmark vibrancy scores of communities for training, we first propose a fused scoring method that combines the frequency and the diversity of consumer activities observed in mobile check-in data. We also define and extract spatial configuration and social interaction features for each community by mining urban geography and human mobility data. We then combine a pairwise ranking objective with sparsity regularization to learn a predictor of community vibrancy, and we develop an effective solution to the resulting optimization problem. Finally, we run our experiments on BCGD from Beijing, including real estate data, points of interest, taxi and bus GPS trajectories, and mobile check-ins. The experimental results demonstrate the competitive performance of both the extracted features and the proposed model.
Our results suggest that a structurally diverse community usually shows higher social interaction and better business performance, and that incompatible land uses may decrease the vibrancy of a community. Our studies demonstrate the potential of how to best make use of BCGD to create local economic matrices and sustain urban vibrancy in a fast, cheap, and meaningful way.
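The abstract describes learning a vibrancy predictor by combining a pairwise ranking objective with sparsity regularization. The article's exact objective and solver are not reproduced in this record, so the following is only a minimal sketch under stated assumptions: a linear scorer, a logistic pairwise loss, an L1 penalty, and proximal gradient descent with soft thresholding. The function names and the hyperparameters `l1_weight`, `step`, and `epochs` are hypothetical.

```python
import math


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_sparse_pairwise_ranker(features, scores, l1_weight=0.01, step=0.1, epochs=200):
    """Fit linear weights so that higher-scored communities rank higher.

    Assumed objective (not taken from the article):
        mean over pairs (i, j) with scores[i] > scores[j] of
        log(1 + exp(-(w.x_i - w.x_j)))  +  l1_weight * ||w||_1
    """
    n_features = len(features[0])
    weights = [0.0] * n_features
    pairs = [(i, j)
             for i in range(len(scores))
             for j in range(len(scores))
             if scores[i] > scores[j]]
    if not pairs:
        return weights
    for _ in range(epochs):
        grad = [0.0] * n_features
        for i, j in pairs:
            diff = [a - b for a, b in zip(features[i], features[j])]
            margin = sum(w * d for w, d in zip(weights, diff))
            # d/d(margin) of log(1 + exp(-margin)) is -sigmoid(-margin)
            coeff = -_sigmoid(-margin)
            for k in range(n_features):
                grad[k] += coeff * diff[k] / len(pairs)
        # Gradient step on the smooth loss, then the L1 proximal step.
        weights = [soft_threshold(w - step * g, step * l1_weight)
                   for w, g in zip(weights, grad)]
    return weights
```

The soft-thresholding step is what can drive the weights of uninformative features to exactly zero, which is the usual motivation for pairing a ranking loss with a sparsity penalty.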
Keywords: Big Crowdsourced Geotagged Data mining; learning-to-rank; spatiotemporal data mining; urban computing; urban vibrancy
Year: 2021 PMID: 34179770 PMCID: PMC8222666 DOI: 10.3389/fdata.2021.690970
Source DB: PubMed Journal: Front Big Data ISSN: 2624-909X
FIGURE 1. Example of a residential community.
FIGURE 2. Overview of our framework.
Feature summary.
| Feature type | Category | Subcategory | Denotation |
|---|---|---|---|
| Spatial configuration | Density | | |
| Spatial configuration | Diversity | | |
| Spatial configuration | Accessibility | Public transportation facilities | |
| Spatial configuration | Accessibility | Quality of the road network | |
| Social interaction | Flow | Inflow | |
| Social interaction | Flow | Outflow | |
| Social interaction | Flow | Intra-flow | |
| Social interaction | Range | | |
| Social interaction | Average speed | | |
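The flow features in the summary above (inflow, outflow, intra-flow) can be illustrated with a small counting sketch. It assumes each trip has already been matched to an origin community and a destination community, with `None` standing for an endpoint outside every community. That matching step and the `(origin, destination)` representation are assumptions made for illustration, not details taken from this record.

```python
from collections import Counter


def flow_features(trips):
    """Count inflow, outflow, and intra-flow per community.

    trips: iterable of (origin_community, destination_community) pairs,
    where None marks an endpoint that falls outside every community.
    """
    inflow, outflow, intra = Counter(), Counter(), Counter()
    for origin, destination in trips:
        if origin is not None and origin == destination:
            # Trip starts and ends in the same community.
            intra[origin] += 1
        else:
            if origin is not None:
                outflow[origin] += 1
            if destination is not None:
                inflow[destination] += 1
    communities = set(inflow) | set(outflow) | set(intra)
    return {
        c: {"inflow": inflow[c], "outflow": outflow[c], "intra_flow": intra[c]}
        for c in communities
    }


# Hypothetical trips between two communities "A" and "B".
example = flow_features([("A", "B"), ("B", "A"), ("A", "A"), (None, "A"), ("B", None)])
```

Here a trip counts toward intra-flow only when both endpoints fall in the same community. Other conventions, such as counting every arrival as inflow, are equally possible.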
Statistics of the experimental data.
| Data source | Properties | Statistics |
|---|---|---|
| Taxi GPS | Number of taxis | 13,597 |
| Taxi GPS | Effective days | 92 |
| Taxi GPS | Time period | April–August 2012 |
| Taxi GPS | Number of trips | 8,202,012 |
| Taxi GPS | Number of GPS points | 111,602 |
| Taxi GPS | Total distance (km) | 61,269,029 |
| Bus/subway traces | Number of bus/subway stops | 9,810 |
| Bus/subway traces | Time period | August 2012–May 2013 |
| Bus/subway traces | Number of card holders | 300,250 |
| Bus/subway traces | Number of trips | 1,730,000 |
| Mobile check-ins | Number of check-in POIs | 5,874 |
| Mobile check-ins | Number of check-in events | 2,762,128 |
| Mobile check-ins | Number of POI categories | 9 |
| Mobile check-ins | Time period | January–December 2012 |
| POIs | Number of business POIs | 328,668 |
| POIs | Positions (longitude and latitude) | 328,668 |
| Residential communities | Number of real estates | 2,990 |
| Residential communities | Size of bounding box (km) | 40 × 40 |
FIGURE 3. Analysis of urban vibrancy based on the proposed metric.
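The vibrancy metric referred to here fuses the frequency and the diversity of consumer activities from check-in data. The exact fusion formula is not given in this record. The sketch below shows one plausible form, assuming frequency is the number of check-ins, diversity is the Shannon entropy over POI categories, and the two are max-normalized and combined with an F-measure-style weighted harmonic mean. The weight `beta` and the function names are hypothetical.

```python
import math
from collections import Counter


def category_entropy(categories):
    """Shannon entropy (natural log) of a list of POI category labels."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def fused_vibrancy(checkins_by_community, beta=1.0):
    """Fuse check-in frequency and category diversity into one score.

    checkins_by_community: dict mapping a community id to the list of
    POI categories of its check-in events (an assumed input format).
    """
    freq = {c: len(v) for c, v in checkins_by_community.items()}
    div = {c: category_entropy(v) for c, v in checkins_by_community.items()}
    max_freq = max(freq.values(), default=0)
    max_div = max(div.values(), default=0.0)
    scores = {}
    for c in checkins_by_community:
        f = freq[c] / max_freq if max_freq else 0.0
        d = div[c] / max_div if max_div else 0.0
        denom = beta ** 2 * f + d
        # Weighted harmonic mean: high only when both f and d are high.
        scores[c] = (1 + beta ** 2) * f * d / denom if denom else 0.0
    return scores
```

With a harmonic-mean fusion, a community with many check-ins in a single category scores low, which matches the definition of vibrancy as activity that is both frequent and diverse.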
FIGURE 4. Feature correlation analysis of bus traces, taxi traces, and subways.
Feature correlation analysis of bus traces, taxi traces, and subways.
| Feature | R6 | R5 | R4 | R3 | R2 | R1 |
|---|---|---|---|---|---|---|
| Inflow of bus | 0.63 | 0.85 | 0.36 | 0.11 | 0.56 | 0.17 |
| Outflow of bus | 0.59 | 0.88 | 0.34 | 0.57 | 0.49 | 0.24 |
| Intra-flow of bus | 0.67 | 0.92 | 0.31 | 0.37 | 0.28 | 0.05 |
| Distance to bus stop | 0.21 | 0.57 | 0.86 | 0.74 | 0.42 | 0.45 |
| Bus stop density | 0.14 | 0.09 | 0.88 | 0.18 | 0.50 | 0.65 |
| Inflow of taxi | 0.89 | 0.73 | 0.79 | 0.59 | 0.14 | 0.51 |
| Outflow of taxi | 0.94 | 0.16 | 0.41 | 0.12 | 0.61 | 0.05 |
| Intra-flow of taxi | 0.85 | 0.62 | 0.84 | 0.49 | 0.42 | 0.27 |
| Speed of taxi | 0.92 | 0.13 | 0.26 | 0.01 | 0.65 | 0.88 |
| Traveling distance of taxi | 0.89 | 0.36 | 0.51 | 0.33 | 0.80 | 0.25 |
| Distance to subway station | 0.89 | 0.63 | 0.47 | 0.10 | 0.07 | 0.01 |
| Subway station density | 0.93 | 0.82 | 0.45 | 0.19 | 0.08 | 0.02 |
FIGURE 5. Feature importances based on information gain.
Performance comparison of our approach and baselines.
| Metric | Random Forests | ListNet | Coordinate Ascent | RankNet | Our model |
|---|---|---|---|---|---|
| NDCG@3 | 0.0867 | 0.1002 | 0.0788 | 0.2 | 0.7103 |
| NDCG@5 | 0.0879 | 0.0997 | 0.0841 | 0.2 | 0.5897 |
| NDCG@10 | 0.0919 | 0.1003 | 0.0861 | 0.2 | 0.4544 |
| NDCG@15 | 0.0907 | 0.1004 | 0.0852 | 0.2 | 0.3908 |
| Tau | −1.0 | −0.0401 | −0.0616 | −0.4699 | −0.4594 |
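The comparison above reports NDCG@k and Kendall's tau. As a reminder of how such ranking metrics are typically computed, here is a minimal sketch. It assumes the common exponential gain `2^rel - 1` with a `log2` position discount for NDCG, and the tau-a variant of Kendall's tau in which tied pairs count as neither concordant nor discordant. The article may use different conventions.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items of a ranked list."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(true_relevance, predicted_scores, k):
    """NDCG@k of the ranking induced by predicted_scores."""
    order = sorted(range(len(predicted_scores)),
                   key=lambda i: predicted_scores[i], reverse=True)
    ranked = [true_relevance[i] for i in order]
    ideal_dcg = dcg_at_k(sorted(true_relevance, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def kendall_tau(true_relevance, predicted_scores):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(true_relevance)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = ((true_relevance[i] - true_relevance[j])
                    * (predicted_scores[i] - predicted_scores[j]))
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs if total_pairs else 0.0
```

NDCG@k rewards placing the most vibrant communities near the top of the list, while Kendall's tau measures agreement over all pairs, so the two can rank models differently.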
FIGURE 6. Performance comparison between models.
FIGURE 7. Comparison of feature performance based on different states.
FIGURE 8. Comparison of feature performance based on data sources.
FIGURE 9. Comparison of feature performance based on different radius distances.