Chao Wu, Xinyue Ye, Fu Ren, You Wan, Pengfei Ning, Qingyun Du.
Abstract
Housing is among the most pressing issues in urban China and has received considerable scholarly attention. Researchers have primarily concentrated on identifying the factors that influence residential property prices and how such mechanisms function. However, few studies have examined the potential factors that influence housing prices from a big data perspective. In this article, we use a big data perspective to determine the willingness of buyers to pay for various factors. The opinions and geographical preferences of individuals for places can be represented by visit frequencies given different motivations. Check-in data from the social media platform Sina Visitor System is used in this article. Here, we use kernel density estimation (KDE) to analyse the spatial patterns of check-in spots (or places of interest, POIs) and employ the Getis-Ord Gi* method to identify the hot spots for different types of POIs in Shenzhen, China. New indexes are then proposed based on the hot-spot results as measured by check-in data to analyse the effects of these locations on housing prices. This modelling is performed using the hedonic price method (HPM) and the geographically weighted regression (GWR) method. The results show that the degree of clustering of POIs has a significant influence on housing values. Meanwhile, the GWR method has a better interpretive capacity than does the HPM because of the former method's ability to capture spatial heterogeneity. This article integrates big social media data to expand the scope (new study content) and depth (study scale) of housing price research to an unprecedented degree.
Year: 2016 PMID: 27783645 PMCID: PMC5082690 DOI: 10.1371/journal.pone.0164553
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The geographical location of the case study city: Shenzhen.
Measurement methods and housing characteristic variable signs.
| Category | Variable | Variable definition and measurement methods | Mean | Std. dev. | Expected sign |
|---|---|---|---|---|---|
| Structural variables | Area | Square footage of the living area of house (m²) | 100.487 | 35.689 | + |
| | Floor | Floor on which the unit is situated (floor) | 18.018 | 10.067 | + |
| | NumBed | Number of bedrooms in unit | 3.139 | 0.965 | + |
| | NumWash | Number of bathrooms in unit | 1.725 | 0.643 | Unknown |
| | Fee | Property management fees (RMB) | 3.868 | 1.069 | + |
| | RPlot | Ratio of floor area | 3.976 | 1.404 | - |
| | RGreen | Ratio of green space area (%) | 0.345 | 0.0729 | + |
| Locational variables | DCBD | Distance to CBD (central business district; km) | 23.339 | 10.208 | - |
| | DSub | Distance to nearest metro station (km) | 0.319 | 0.409 | - |
| | DBus | Distance to nearest bus station (km) | 0.019 | 0.023 | - |
| | DHospital | Distance to nearest hospital (km) | 0.122 | 0.786 | Unknown |
| | DSnursery | Distance to nearest nursery school (km) | 0.597 | 0.464 | - |
| | DSprimary | Distance to nearest primary school (km) | 0.756 | 0.464 | - |
| | DSmiddle | Distance to nearest middle school (km) | 0.108 | 0.640 | - |
| | DPark | Distance to nearest park (km) | 0.113 | 0.609 | - |
| Neighbourhood variables | GiZScore_G | Degree of activity in green space | 0.438 | 2.308 | - |
| | GiZScore_C | Degree of activity in commercial centre | 1.181 | 3.229 | - |
POI types and aggregated information.
| Type | Abbreviation | Counts | Percentage |
|---|---|---|---|
| Commercial and business facilities | CBF | 13,268 | 58.53% |
| Industrial | IND | 890 | 3.93% |
| Transport facilities | TRA | 1,696 | 7.48% |
| Residence communities | RES | 3,569 | 15.74% |
| Green space | GRE | 1,413 | 6.23% |
| Administration and public services | APS | 1,834 | 8.09% |
| Total | | 22,670 | 100% |
Fig 2. The power-law distribution patterns of check-ins (a) and users (b).
Fig 3. Spatial hot-spot analysis model based on check-in data.
Fig 4. Kernel density estimation analysis of POI check-in data.
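As a rough illustration of the KDE step behind Fig 4, the sketch below evaluates a check-in-weighted density surface on a regular grid. The Gaussian kernel, the fixed bandwidth, the function name `gaussian_kde_grid` and the toy coordinates are all assumptions made for illustration; the record does not state which kernel, bandwidth or software the authors used.

```python
import numpy as np

def gaussian_kde_grid(points, weights, grid_x, grid_y, bandwidth):
    """Evaluate a weighted 2-D Gaussian kernel density on a regular grid.

    points    : (n, 2) POI coordinates in a projected (planar) system
    weights   : (n,) check-in counts used to weight each POI
    bandwidth : kernel standard deviation, same units as the coordinates
    """
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    gx, gy = np.meshgrid(grid_x, grid_y)                # both (ny, nx)
    cells = np.column_stack([gx.ravel(), gy.ravel()])   # (ny * nx, 2)
    # Squared distance from every grid cell to every POI.
    d2 = ((cells[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-0.5 * d2 / bandwidth**2) / (2.0 * np.pi * bandwidth**2)
    # Normalise the weights so the surface integrates to roughly one.
    density = kernel @ (weights / weights.sum())
    return density.reshape(gy.shape)

# Hypothetical toy data: one busy POI at the origin, one quiet POI at (5, 5).
pois = [(0.0, 0.0), (5.0, 5.0)]
checkins = [10, 1]
axis = np.linspace(-1.0, 6.0, 8)   # cell centres at -1, 0, ..., 6
surface = gaussian_kde_grid(pois, checkins, axis, axis, bandwidth=1.0)
peak_row, peak_col = np.unravel_index(surface.argmax(), surface.shape)
```

The surface peaks at the grid cell containing the heavily visited POI, which is the behaviour a check-in-weighted density map is meant to show. The pairwise distance matrix makes this suitable only for small samples.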
Average nearest neighbour summary.
| Values | GRE | CBF |
|---|---|---|
| Observed mean distance | 455.3936 m | 89.5055 m |
| Expected mean distance | 806.4292 m | 278.8823 m |
| Nearest neighbour ratio | 0.564704 | 0.320943 |
| z-score | -31.303063 | -149.637195 |
| p-value | 0.000000 | 0.000000 |
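The five rows of the summary above can be read as observed mean distance, expected mean distance, nearest neighbour ratio, z-score and p-value. That reading is an interpretation, but the arithmetic supports it: under the standard Clark-Evans form of the statistic, the ratios and z-scores follow from the two distances and the POI counts reported for GRE (1,413) and CBF (13,268). The sketch below assumes that standard form; the function names are illustrative and not taken from the article.

```python
import math
import numpy as np

def average_nearest_neighbour(points, area):
    """Clark-Evans average nearest neighbour statistic for a small point set.

    Returns (observed mean distance, expected mean distance, ratio, z-score).
    `area` is the study-area size in squared coordinate units.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)            # ignore self-distances
    observed = dist.min(axis=1).mean()
    expected = 0.5 / math.sqrt(n / area)      # mean distance under randomness
    std_error = 0.26136 / math.sqrt(n * n / area)
    return observed, expected, observed / expected, (observed - expected) / std_error

def ann_from_summary(observed, expected, n):
    """Recover ratio and z-score from reported distances and point count."""
    # Since expected = 0.5 * sqrt(area / n), sqrt(area) = 2 * expected * sqrt(n).
    std_error = 0.26136 * 2.0 * expected / math.sqrt(n)
    return observed / expected, (observed - expected) / std_error

# Values copied from the tables in this record.
gre_ratio, gre_z = ann_from_summary(455.3936, 806.4292, 1413)
cbf_ratio, cbf_z = ann_from_summary(89.5055, 278.8823, 13268)
```

Both ratios are well below 1 and both z-scores are strongly negative, which indicates clustering. CBF check-in spots are more tightly clustered than GRE spots.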
Fig 5. Hot spot distribution of GRE.
Fig 6. Hot spot distribution of CBF.
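The hot-spot maps rest on the Getis-Ord Gi* statistic, whose z-scores also appear to feed the GiZScore_G and GiZScore_C variables. A minimal sketch of the usual Gi* z-score is given below. It assumes binary weights within a fixed distance band, with each location counted as its own neighbour; the band, the weighting scheme and the toy data are assumptions, since this record does not give the authors' settings.

```python
import numpy as np

def getis_ord_gi_star(coords, values, band):
    """Gi* z-scores using binary weights within a fixed distance band.

    coords : (n, 2) planar coordinates
    values : (n,) attribute at each location, e.g. check-in counts
    band   : neighbours are locations within this distance (self included)
    """
    coords = np.asarray(coords, dtype=float)
    x = np.asarray(values, dtype=float)
    n = len(x)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    w = (dist <= band).astype(float)          # w[i, i] = 1, the "star" in Gi*
    mean = x.mean()
    s = np.sqrt((x ** 2).mean() - mean ** 2)  # population standard deviation
    w_sum = w.sum(axis=1)
    w_sq_sum = (w ** 2).sum(axis=1)
    numerator = w @ x - mean * w_sum
    denominator = s * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    return numerator / denominator

# Hypothetical toy data: ten locations on a line, high values at one end.
coords = [(float(i), 0.0) for i in range(10)]
values = [10, 10, 10, 1, 1, 1, 1, 1, 1, 1]
z = getis_ord_gi_star(coords, values, band=1.0)
```

Large positive z-scores mark hot spots and large negative ones mark cold spots. In the toy data the second location, surrounded by high values on both sides, receives the highest score. The sketch does not handle a band wide enough to include every location, where the denominator becomes zero.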
Hedonic price method (HPM) parameter estimate summary.
| Explanatory variable | Estimated coefficient | t-value | p-value |
|---|---|---|---|
| Constant | 0.060 | 19.388 | .000 |
| DCBD | -0.326 | -55.863 | .000 |
| Fee | 0.293 | 43.604 | .000 |
| DPark | -0.111 | -27.376 | .000 |
| NumWash | 0.094 | 21.743 | .000 |
| GiZScore_C | 0.083 | 17.104 | .000 |
| DHospital | -0.094 | -24.942 | .000 |
| DSprimary | 0.080 | 21.966 | .000 |
| Floor | 0.063 | 20.077 | .000 |
| GiZScore_G | 0.134 | 24.466 | .000 |
| DBus | -0.049 | -14.273 | .000 |
| DSub | -0.049 | -11.417 | .000 |
| Area | 0.031 | 6.907 | .000 |
| RGreen | 0.023 | 6.854 | .000 |
| R² | 0.724 | | |
*** Significant at the 1% level.
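The HPM summary above has the shape of an ordinary least squares fit: one coefficient, t-value and p-value per housing characteristic, plus a single R². The sketch below shows how such a table can be produced. The data are synthetic, and the linear functional form and the "true" coefficients are assumptions chosen for illustration; only the variable names `dcbd` and `fee` are borrowed from the tables.

```python
import numpy as np

def ols_with_t_values(X, y):
    """Ordinary least squares with an intercept; returns (beta, t-values, R^2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])          # prepend the constant
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    sigma2 = residuals @ residuals / (n - k - 1)       # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)    # coefficient covariance
    t_values = beta / np.sqrt(np.diag(cov))
    r2 = 1.0 - residuals @ residuals / ((y - y.mean()) ** 2).sum()
    return beta, t_values, r2

# Synthetic illustration only: price falls with distance to the CBD and
# rises with the management fee. These numbers are invented.
rng = np.random.default_rng(0)
dcbd = rng.uniform(1.0, 40.0, size=500)   # km
fee = rng.uniform(1.0, 6.0, size=500)     # RMB
price = 10.0 - 0.03 * dcbd + 0.2 * fee + rng.normal(scale=0.1, size=500)
beta, t_values, r2 = ols_with_t_values(np.column_stack([dcbd, fee]), price)
```

A fit of this kind yields one global coefficient per variable. That is the limitation the abstract points to when it contrasts the HPM with GWR.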
Geographically weighted regression (GWR) parameter estimate summary.
| Explanatory Variable | Minimum | Lower quartile | Median | Upper quartile | Maximum | p-value |
|---|---|---|---|---|---|---|
| Constant | -12.8186 | -1.38137 | -0.2675 | 0.04614 | 10.28509 | .000 |
| Floor | -0.35291 | 0.031246 | 0.040169 | 0.056628 | 0.339561 | .000 |
| NumBed | -0.66773 | -0.02149 | 0.040887 | 0.128011 | 0.493299 | .000 |
| NumWash | -0.64065 | -0.0432 | 0.022523 | 0.06847 | 0.358014 | .000 |
| Area | -0.68338 | -0.01717 | 0.074358 | 0.142633 | 1.335995 | .000 |
| RPlot | -10.9418 | -0.5452 | -0.1074 | 0.145695 | 1.900368 | .000 |
| RGreen | -3.84742 | -0.0312 | 0.089782 | 0.399657 | 1.858314 | .000 |
| Fee | -2.68112 | 0.015945 | 0.25459 | 0.621551 | 4.052185 | .000 |
| DSub | -12.302 | -0.40319 | -0.02085 | 0.563174 | 15.22164 | .000 |
| DBus | -4.17797 | -0.35394 | -0.03028 | 0.509983 | 23.5197 | .004 |
| DSnursery | -18.9376 | -0.1124 | -0.03278 | 0.158163 | 4.604339 | .000 |
| DSprimary | -12.4242 | -0.13258 | 0.017652 | 0.261109 | 2.282185 | .000 |
| DSmiddle | -3.55686 | -0.41095 | 0.017384 | 0.249635 | 14.10076 | .000 |
| DHospital | -4.868 | -0.62254 | -0.12827 | 0.300655 | 5.447187 | .000 |
| DPark | -2.27624 | -0.07367 | 0.012992 | 0.328673 | 11.93626 | .855 |
| DCBD | -16.3229 | -0.74499 | -0.09705 | 0.430907 | 1.94271 | .000 |
| GiZScore_G | -11.9198 | -2.41877 | -0.1252 | 0.179357 | 4.508242 | .000 |
| GiZScore_C | -9.16523 | 0.015567 | 0.463302 | 2.229864 | 11.68491 | .000 |
| R² | 0.9399 | | | | | |
| Bandwidth | 0.337 | | | | | |
***, ** Significant at the 1% and 5% levels, respectively.
Fig 7. Spatial variation of the local parameters of the GWR.
(a) and (b) are GiZScore_C and GiZScore_G, respectively.
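The spread of each GWR coefficient from minimum to maximum, and the maps of local parameters, arise because GWR fits a separate weighted regression at every location. The sketch below shows the idea with a fixed-bandwidth Gaussian kernel. The kernel choice, the bandwidth value and the synthetic data are assumptions; the record reports a bandwidth of 0.337 but not the kernel or its units.

```python
import numpy as np

def gwr_local_coefficients(coords, X, y, bandwidth):
    """Fit one distance-weighted least squares regression per location.

    Returns an (n, k + 1) array: local intercept and slopes at each point.
    """
    coords = np.asarray(coords, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    betas = np.empty((n, design.shape[1]))
    for i in range(n):
        d2 = ((coords - coords[i]) ** 2).sum(axis=1)
        w = np.exp(-0.5 * d2 / bandwidth ** 2)    # Gaussian distance decay
        xtw = design.T * w                        # X^T W without forming W
        betas[i] = np.linalg.solve(xtw @ design, xtw @ y)
    return betas

# Synthetic illustration: the effect of x on y grows from west to east.
rng = np.random.default_rng(42)
coords = rng.uniform(0.0, 1.0, size=(400, 2))
x = rng.normal(size=400)
true_slope = 1.0 + 2.0 * coords[:, 0]
y = true_slope * x + rng.normal(scale=0.05, size=400)
local = gwr_local_coefficients(coords, x[:, None], y, bandwidth=0.1)
```

A single global regression on these data would report one averaged slope, whereas the local estimates in `local[:, 1]` track the west-to-east gradient. This is the spatial heterogeneity the abstract credits for the higher R² of GWR relative to the HPM.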