Yaowen Luo, Jianguo Yan, Stephen C McClure, Fei Li.
Abstract
Correlations between socioeconomic factors and poverty in regression models do not reflect actual relationships, especially when data exhibit patterns of spatial heterogeneity. Spatial regression models can estimate the relationships between socioeconomic factors and poverty in defined geographical areas, explaining the imbalanced distribution of poverty. However, the relationships between these factors and poverty are not always linear, and conventional linear local regression models do not accurately capture these nonlinear relationships. To fill this gap, we used a local regression method, geographically weighted random forest regression (GW-RFR), that integrates a spatial weight matrix (SWM) and random forest (RF). The GW-RFR evaluates the spatial variations in the nonlinear relationships between variables. A county-level poverty data set of China was employed to evaluate the performance of the GW-RFR against the RF. In this poverty application, the $R^2$ of the GW-RFR was 0.128 higher than that of the RF, the NRMSE was 1.6% lower, and the MAE was 0.295 lower. These results showed that the relationship between poverty factors and poverty varies across space at the county level in China, and that the GW-RFR is suitable for dealing with nonlinear relationships in local regression analysis.
Keywords: Nonlinear; Poverty; Random forest; Spatial variation; Variable importance
Year: 2022 PMID: 35022975 PMCID: PMC8754530 DOI: 10.1007/s11356-021-17513-3
Source DB: PubMed Journal: Environ Sci Pollut Res Int ISSN: 0944-1344 Impact factor: 5.190
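The abstract describes the GW-RFR as integrating a spatial weight matrix (SWM) with a random forest. As a rough illustration of the first ingredient only, the sketch below builds a distance-decay weight matrix from county centroids. The Gaussian kernel, the `bandwidth` parameter, and the function name `gaussian_weights` are assumptions made for illustration; the record does not state which kernel or bandwidth the authors used.

```python
import math

def gaussian_weights(coords, bandwidth):
    """Return an n x n matrix of Gaussian distance-decay weights.

    coords: list of (x, y) centroids in a projected coordinate system.
    bandwidth: kernel bandwidth in the same units (hypothetical parameter).
    """
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        for j, (xj, yj) in enumerate(coords):
            distance = math.hypot(xi - xj, yi - yj)
            # Nearby counties get weights close to 1, distant ones close to 0.
            weights[i][j] = math.exp(-0.5 * (distance / bandwidth) ** 2)
    return weights

# Three made-up centroids spaced 5 units apart along a line.
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
W = gaussian_weights(coords, bandwidth=5.0)
```

Row `i` of `W` could then serve as per-observation weights when fitting a local model centred on county `i`; whether the GW-RFR uses the weights in exactly this way is not stated in the abstract.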
Fig. 1 Study area
Definition of poverty indicators
| Poverty indicators | Indicator meaning | Unit |
|---|---|---|
| Elevation X1 | The average elevation of each county based on a 30-m resolution elevation image | km |
| Relief X2 | The difference between the maximum and minimum elevation of each county | km |
| Slope X3 | The average slope of each county based on a 30-m resolution slope image | degree |
| Railway density X4 | The length of railways per square kilometer of land area | km/km² |
| Highway density X5 | The length of highways per square kilometer of land area | km/km² |
| Rivers density X6 | The length of rivers per square kilometer of land area | km/km² |
| Proportion of secondary industry employees X7 | The proportion of secondary industry employees per 10,000 population | % |
| Proportion of tertiary industry employees X8 | The proportion of tertiary industry employees per 10,000 population | % |
| Per capita GDP X9 | Gross domestic product/total population | 10⁴ Yuan |
| Proportion of landline subscribers X10 | Proportion of landline subscribers per 10,000 population | % |
| Public revenue X11 | Public revenue of each county | 10⁴ Yuan |
| Public financial expenditure X12 | Public financial expenditure of each county | 10⁴ Yuan |
| Per loan amount X13 | Average value of loan amount of local residents | 10⁴ Yuan |
| Per capita total power of agricultural machinery X14 | Total power of agricultural machinery/total population | 10³ W |
| Per capita area of agricultural machine harvesting X15 | Total area of agricultural machine harvesting/total population | km²/individual |
| Per capita area of facility agriculture X16 | Total area of facility agriculture/total population | km²/individual |
| Per grain production X17 | Grain production/total population | 10³ kg/individual |
| Per oil production X18 | Oil production/total population | 10³ kg/individual |
| Per meat production X19 | Meat production/total population | 10³ kg/individual |
| Number of units of large-scale industrial enterprises X20 | Number of units of large-scale industrial enterprises of each county | individual |
| Gross industrial output of large-scale industry X21 | Gross industrial output of large-scale industry of each county | 104 Yuan |
| Fixed asset investment X22 | Fixed asset investment of each county | 104 Yuan |
| Number of social welfare receiving units X23 | Number of social welfare receiving units of each county | individual |
| Proportion of students in regular secondary schools X24 | Number of students in regular secondary schools per 10,000 population | % |
| Proportion of students in regular vocational secondary schools X25 | Number of students in secondary vocational schools per 10,000 population | % |
| Proportion of primary school students X26 | Number of primary school students per 10,000 population | % |
| Per capita hospital beds X27 | Number of hospital beds per 10,000 population | individual |
| Per capita beds of various social welfare receiving units X28 | Number of beds of various social welfare receiving units per 10,000 population | individual |
The coefficient of determination ($R^2$), the normalized root mean square error (NRMSE), and the mean absolute error (MAE) of the RF and GW-RFR in the application example
| Model | $R^2$ | NRMSE | MAE |
|---|---|---|---|
| RF | 0.758 | 4.0% | 0.612 |
| GW-RFR | 0.918 | 2.0% | 0.312 |
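For readers who want to reproduce this kind of comparison on their own predictions, the three metrics can be computed as in the minimal sketch below. The record does not say how the NRMSE was normalized; dividing the RMSE by the range of the observed values is an assumption here, and the sample values are invented purely to exercise the functions.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def nrmse(observed, predicted):
    """RMSE divided by the observed range (normalization is an assumption)."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return math.sqrt(mse) / (max(observed) - min(observed))

def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

# Invented values, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.0, 2.5, 4.0, 5.5]

print(f"R2={r_squared(observed, predicted):.3f}",
      f"NRMSE={nrmse(observed, predicted):.1%}",
      f"MAE={mae(observed, predicted):.3f}")
```

Other NRMSE conventions (for example, dividing by the mean of the observations) give different percentages, so the normalization should be checked against the full paper before comparing values with the table.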
Fig. 2 The distribution of the local $R^2$ of the GW-RFR in the application example
Statistics of the local $R^2$ of the GW-RFR. We calculated the average value of the local $R^2$ and the percentage of counties in each of five local $R^2$ ranges (≤ 0.2, (0.2, 0.4], (0.4, 0.6], (0.6, 0.8], > 0.8)
| Value of local $R^2$ | Percentage of counties |
|---|---|
| ≤ 0.2 | 0.73% |
| (0.2, 0.4] | 4.62% |
| (0.4, 0.6] | 24.66% |
| (0.6, 0.8] | 46.06% |
| > 0.8 | 23.93% |
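A summary of this form can be produced by assigning each county's local coefficient of determination to one of the five half-open ranges and counting. The sketch below shows one way to do that; the function names and the sample values are hypothetical and are not taken from the study.

```python
from collections import Counter

# Upper edges of the first four ranges; anything above 0.8 falls in the last.
UPPER_EDGES = [0.2, 0.4, 0.6, 0.8]
BIN_LABELS = ["<= 0.2", "(0.2, 0.4]", "(0.4, 0.6]", "(0.6, 0.8]", "> 0.8"]

def bin_label(value):
    """Return the label of the range containing value (right-closed bins)."""
    for edge, label in zip(UPPER_EDGES, BIN_LABELS):
        if value <= edge:
            return label
    return BIN_LABELS[-1]

def bin_percentages(values):
    """Percentage of values falling in each range, in table order."""
    counts = Counter(bin_label(v) for v in values)
    return {label: 100.0 * counts[label] / len(values) for label in BIN_LABELS}

# Invented local values for eight hypothetical counties.
local_r2 = [0.15, 0.35, 0.55, 0.65, 0.75, 0.85, 0.9, 0.6]
shares = bin_percentages(local_r2)
for label, share in shares.items():
    print(f"{label}: {share:.2f}%")
```

Because the bins are right-closed, a value of exactly 0.6 is counted in (0.4, 0.6], matching the interval notation used in the table.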
Poverty indicators ranked by variable importance (vip) of the RF in the application example
| Variable importance order | Poverty indicator | Variable importance | |
|---|---|---|---|
| vip1 | X13 | 47.972 | 0.020 |
| vip2 | X9 | 21.672 | 0.020 |
| vip3 | X1 | 19.872 | 0.020 |
| vip4 | X10 | 17.778 | 0.020 |
| vip5 | X7 | 13.096 | 0.020 |
| vip6 | X28 | 12.211 | 0.020 |
| vip7 | X11 | 11.342 | 0.216 |
| vip8 | X22 | 9.688 | 0.353 |
| vip9 | X26 | 8.350 | 0.098 |
| vip10 | X15 | 7.710 | 0.431 |
| vip11 | X5 | 7.704 | 0.157 |
| vip12 | X3 | 7.186 | 0.784 |
| vip13 | X2 | 7.017 | 0.765 |
| vip14 | X25 | 6.645 | 0.431 |
| vip15 | X27 | 6.324 | 0.235 |
| vip16 | X23 | 6.151 | 0.451 |
| vip17 | X17 | 5.975 | 0.804 |
| vip18 | X8 | 5.600 | 0.510 |
| vip19 | X19 | 5.347 | 0.294 |
| vip20 | X18 | 4.407 | 0.373 |
| vip21 | X24 | 3.465 | 0.725 |
| vip22 | X14 | 2.978 | 0.922 |
| vip23 | X4 | 2.783 | 0.922 |
| vip24 | X16 | 2.525 | 0.745 |
| vip25 | X6 | 2.238 | 0.863 |
Fig. 3 The distribution of the poverty indicators with the highest value of variable importance (vip 1) in each county
Fig. 4 The distribution of the value of variable importance (VI) for the poverty indicators per loan amount (a), per capita GDP (b), and proportion of landline subscribers (c) in the application example