Zhendong Yuan, Jules Kerckhoffs, Gerard Hoek, Roel Vermeulen.
Abstract
Mobile measurements are increasingly used to develop spatially explicit (hyperlocal) air quality maps using land-use regression (LUR) models. The prevailing design of mobile monitoring campaigns results in the collection of short-term, on-road air pollution measurements during daytime on weekdays. We hypothesize that LUR models trained with such mobile measurements are not optimized for estimating long-term average residential air pollution concentrations. To bridge the knowledge gaps in space (on-road versus near-road) and time (short- versus long-term), we propose transfer-learning techniques to adapt LUR models by transferring the mobile knowledge into long-term near-road knowledge in an end-to-end manner. We trained two transfer-learning LUR models by incorporating mobile measurements of nitrogen dioxide (NO2) and ultrafine particles (UFP) collected by Google Street View cars with long-term near-road measurements from regular monitoring networks in Amsterdam. We found that transfer-learning LUR models performed 55.2% better in predicting long-term near-road concentrations than the LUR model trained only with mobile measurements for NO2 and 26.9% for UFP, evaluated by normalized mean absolute errors. This improvement in model accuracy suggests that transfer-learning models provide a solution for narrowing the knowledge gaps and can improve the accuracy of mapping long-term near-road air pollution concentrations using short-term on-road mobile monitoring data.
Keywords: LUR modeling; air pollution mapping; mobile monitoring; transfer learning
Year: 2022 PMID: 36121846 PMCID: PMC9535937 DOI: 10.1021/acs.est.2c05036
Source DB: PubMed Journal: Environ Sci Technol ISSN: 0013-936X Impact factor: 11.357
Figure 1. Data and methods involved in developing conventional LUR and transfer-learning LUR models. Two conventional LUR models were implemented as baselines: a stepwise linear LUR model (SLR) and a standard random forest LUR model (RF_LUR). Prior_RF and TrAdaBoost are two variants of transfer-learning LUR models that incorporate external long-term information into training on mobile monitoring data. The accuracy of TrAdaBoost was evaluated using half of the external long-term air pollution measurements. SLR, RF_LUR, and Prior_RF were validated using the full set of external long-term measurements.
Summary of Models and Comparisons
| Model type | Model | Algorithm | Training data | Validation data |
|---|---|---|---|---|
| Mobile LUR model | SLR | Linear regression | Mobile data | Full external long-term data |
| | RF_LUR | Random forest (RF) | Mobile data | Full external long-term data |
| Transfer-learning LUR model | Prior_RF | Adapted RF | Mobile data and the ratio of probability distributions between mobile and external long-term measurements | Full external long-term data |
| | TrAdaBoost | TrAdaBoost.R2 | Mobile data and half of the external long-term data | Half of the external long-term data |
| Sensitivity test | | | | |
| | SLR_half | Linear regression | Mobile data | Half of the external long-term data |
| | RF_LUR_half | Random forest | Mobile data | Half of the external long-term data |
| | Prior_RF_half | Adapted RF | Mobile data and the ratio of probability distributions between mobile and half of the external long-term measurements | Half of the external long-term data |
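The table describes Prior_RF as trained on mobile data plus "the ratio of probability distributions between mobile and external long-term measurements". One possible reading is importance weighting: each mobile observation is weighted by how much more (or less) likely its concentration is under the long-term distribution than under the mobile one. The sketch below illustrates that idea only and is not the authors' implementation. The histogram density estimate, the bin count, the function names, and the toy concentrations are all assumptions made for illustration.

```python
from bisect import bisect_right


def bin_index(value, edges):
    """Index of the histogram bin containing `value`, clamped to valid bins."""
    n_bins = len(edges) - 1
    return min(max(bisect_right(edges, value) - 1, 0), n_bins - 1)


def bin_probabilities(values, edges):
    """Fraction of `values` that falls into each bin."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        counts[bin_index(v, edges)] += 1
    return [c / len(values) for c in counts]


def density_ratio_weights(mobile, long_term, n_bins=5, eps=1e-6):
    """Hypothetical helper: weight each mobile value by p_long / p_mobile.

    Both densities are estimated with a shared equal-width histogram
    (an assumption; a kernel density estimate would also work).
    Weights are rescaled so that their mean is 1.
    """
    lo = min(min(mobile), min(long_term))
    hi = max(max(mobile), max(long_term))
    if hi == lo:
        raise ValueError("all concentrations are identical; ratio is undefined")
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    p_mobile = bin_probabilities(mobile, edges)
    p_long = bin_probabilities(long_term, edges)
    raw = []
    for v in mobile:
        i = bin_index(v, edges)
        # eps keeps the ratio finite when a bin is empty in either sample
        raw.append((p_long[i] + eps) / (p_mobile[i] + eps))
    mean_raw = sum(raw) / len(raw)
    return [w / mean_raw for w in raw]


# Toy concentrations (made up): the mobile sample has a heavy upper tail
# that the long-term sample lacks.
mobile = [10.0, 12.0, 15.0, 40.0, 45.0, 50.0, 55.0, 60.0]
long_term = [18.0, 20.0, 22.0, 25.0, 28.0, 30.0]
weights = density_ratio_weights(mobile, long_term)
```

Weights of this kind could then be passed to a learner that accepts per-sample weights, for example through the `sample_weight` argument of scikit-learn's `RandomForestRegressor.fit`. Whether Prior_RF uses the ratio in this way is not stated in this summary.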
Summary of Concentrations from Mobile and Long-Term Monitoring Data
| Dataset | Source | Number of sites | Pollutant | 1st Qu. | Mean | 3rd Qu. | Unit |
|---|---|---|---|---|---|---|---|
| Mobile measurements | Mobile points aggregated to 50-m road segments | 41,919 | NO2 | 18.6 | 27.4 | 32.0 | μg/m3 |
| | | 42,813 | UFP | 11,480 | 21,901 | 26,614 | particles/cm3 |
| Long-term measurements | Palmes | 82 | NO2 | 20.9 | 26.1 | 30.5 | μg/m3 |
| | EXPOsOMIC | 17 | UFP | 15,367 | 18,584 | 21,419 | particles/cm3 |
Figure 2. Differences in density distributions between mobile and long-term measurements for NO2 and UFP at long-term validation sites. The mean values are marked.
Model Performance of Predicting Long-Term Air Pollution Validated by External Long-Term Data (Mean and 95% CI)
| | NO2 | | | UFP | | |
|---|---|---|---|---|---|---|
| Models | nMAE | nRMSE | | nMAE | nRMSE | |
| SLR | 0.19 | 0.23 | 0.49 | 0.22 | 0.27 | 0.20 |
| RF_LUR | 0.29 (0.29, 0.30) | 0.38 (0.38, 0.39) | 0.53 (0.52, 0.54) | 0.26 (0.25, 0.27) | 0.35 (0.35, 0.37) | 0.15 (0.13, 0.16) |
| Prior_RF | 0.24 (0.22, 0.25) | 0.31 (0.29, 0.32) | | | | |
| TrAdaBoost | | | 0.54 (0.47, 0.60) | 0.21 (0.18, 0.23) | | 0.25 (0.18, 0.31) |
| Sensitivity test (mean and 95% CI) | | | | | | |
| RF_LUR_half | 0.28 (0.27, 0.30) | 0.38 (0.35, 0.40) | 0.54 (0.49, 0.60) | 0.26 (0.22, 0.30) | 0.35 (0.3, 0.40) | 0.23 (0.12, 0.34) |
| Prior_RF_half | 0.24 (0.22, 0.26) | 0.31 (0.29, 0.33) | 0.64 (0.60, 0.68) | 0.18 (0.16, 0.20) | 0.22 (0.18, 0.26) | 0.29 (0.14, 0.43) |
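The table reports normalized mean absolute error (nMAE) and normalized root-mean-square error (nRMSE), but this summary does not say which constant the errors are normalized by. As a minimal sketch, the code below assumes normalization by the mean of the observed long-term concentrations, which is one common convention. The function names and the toy values are hypothetical.

```python
from math import sqrt


def nmae(observed, predicted):
    """Mean absolute error divided by the mean observation (assumed convention)."""
    mean_obs = sum(observed) / len(observed)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)
    return mae / mean_obs


def nrmse(observed, predicted):
    """Root-mean-square error divided by the mean observation (assumed convention)."""
    mean_obs = sum(observed) / len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return sqrt(mse) / mean_obs


# Toy NO2 values in μg/m3 (made up, not taken from the study).
observed = [20.0, 25.0, 30.0, 25.0]
predicted = [22.0, 24.0, 27.0, 27.0]

nmae_value = nmae(observed, predicted)    # absolute errors 2, 1, 3, 2 -> MAE 2 -> 2 / 25 = 0.08
nrmse_value = nrmse(observed, predicted)  # sqrt(18 / 4) / 25, roughly 0.085
```

Because squaring penalizes large errors more, nRMSE is never smaller than nMAE for the same predictions, which matches the ordering of the two columns in the table.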
Improvement in Percentage of Transfer-Learning LUR Models Compared to Conventional Mobile LUR Modelsᵃ
| | SLR | | | RF_LUR | | |
|---|---|---|---|---|---|---|
| NO2 models | nMAE | nRMSE | | nMAE | nRMSE | |
| TrAdaBoost | –31.6% | –21.7% | +10.2% | –55.2% | –52.6% | +1.9% |
| Prior_RF | +26.3% | +34.8% | +26.5% | –17.2% | –18.4% | +17.0% |
| UFP models | | | | | | |
| TrAdaBoost | –4.6% | –7.4% | +25.0% | –19.2% | –28.6% | +66.7% |
| Prior_RF | –13.6% | –7.4% | +40.0% | –26.9% | –28.6% | +86.7% |
ᵃ Improvement in percentage is calculated as (median_of_transfer_learning – median_of_conventional)/median_of_conventional.
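The formula in the note can be checked against values reported in the performance table. The sketch below applies it to the NO2 nMAE of Prior_RF (0.24) relative to SLR (0.19) and RF_LUR (0.29). The note refers to medians, whereas the performance table reports rounded means, so agreement is expected only approximately. The function name `relative_change` is hypothetical.

```python
def relative_change(transfer_value, conventional_value):
    """(transfer - conventional) / conventional, as given in the table note."""
    return (transfer_value - conventional_value) / conventional_value


# NO2 nMAE values as reported (rounded) in the performance table.
prior_rf_nmae = 0.24
slr_nmae = 0.19
rf_lur_nmae = 0.29

vs_slr = relative_change(prior_rf_nmae, slr_nmae)        # error increases
vs_rf_lur = relative_change(prior_rf_nmae, rf_lur_nmae)  # error decreases

print(f"{vs_slr:+.1%}")     # +26.3%
print(f"{vs_rf_lur:+.1%}")  # -17.2%
```

For error metrics such as nMAE and nRMSE, a negative value therefore means the transfer-learning model has the lower error.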
Figure 3. Density plot of predictions and measured long-term concentrations at validation sites. For each method, the model shown is the one whose performance was the median of the repeated cross-validation performance.
Figure 4. Map of predicted long-term NO2 concentration (μg/m3) based on various GIS predictors. SLR is a conventional linear LUR model. RF_LUR is a traditional ML-based LUR model. Prior_RF and TrAdaBoost are two transfer-learning-based LUR models that integrate long-term observations with mobile measurements in the training phase.
Figure 5. Spatial differences in NO2 predictions (μg/m3) between transfer-learning LUR and mobile LUR models.