Asa Gholizadeh1, João A Coblinski1, Mohammadmehdi Saberioon2, Eyal Ben-Dor3, Ondřej Drábek1, José A M Demattê4, Luboš Borůvka1, Karel Němeček1, Sabine Chabrillat2, Julie Dajčl1.
Abstract
Soil contamination by potentially toxic elements (PTEs) is intensifying under increasing industrialization, so the ability to delineate contaminated sites efficiently is crucial. Visible–near infrared (vis–NIR: 350–2500 nm) and X-ray fluorescence (XRF: 0.02–41.08 keV) spectroscopic techniques have attracted considerable attention for the assessment of PTEs. Recently, the use of fused vis–NIR and XRF spectroscopy, which relies on the complementary information carried by the two data sources, has also been increasing. In addition, data manipulation methods, including feature selection approaches, affect prediction performance. This study investigated the feasibility of using single and fused vis–NIR and XRF spectra, while exploring feature selection algorithms, for the assessment of key soil PTEs. Soil samples were collected from one of the most heavily polluted areas of the Czech Republic and scanned using laboratory vis–NIR and XRF spectrometers. A univariate filter (UF) and a genetic algorithm (GA) were used to select the bands of greatest importance for PTE prediction. A support vector machine (SVM) was then used to train models on the full-range and feature-selected spectra of the single sensors and of their fusion. XRF spectra alone (primarily GA-selected) performed better than single vis–NIR spectra and fused spectral data for the prediction of PTEs. The models derived from the fused data set (particularly the GA-selected one) were, however, more accurate than those derived from the single vis–NIR spectra. In general, the results suggest that the GA-selected spectra obtained from the single XRF spectrometer (for As and Pb) and from the fusion of vis–NIR and XRF (for Pb) are promising for accurate quantitative estimation of these PTEs.
Keywords: XRF spectroscopy; data fusion; feature selection; genetic algorithm; soil contamination; univariate filter; vis–NIR spectroscopy
Year: 2021 PMID: 33808185 PMCID: PMC8037398 DOI: 10.3390/s21072386
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
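The fusion and band-selection steps named in the abstract can be illustrated with a minimal sketch. It rests on assumptions not confirmed by this record: fusion is taken to be simple concatenation of the two spectra per sample, and the univariate filter is taken to be a threshold on the absolute Pearson correlation between each band and the measured concentration. The function names, the threshold, and the synthetic data are hypothetical, and the SVM training step that would follow is omitted.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def fuse(vis_nir, xrf):
    """Assumed fusion scheme: concatenate the two spectra of each sample."""
    return [v + x for v, x in zip(vis_nir, xrf)]


def univariate_filter(spectra, target, threshold):
    """Return indices of bands whose |r| with the target is at least `threshold`."""
    keep = []
    for j in range(len(spectra[0])):
        band = [row[j] for row in spectra]
        if abs(pearson(band, target)) >= threshold:
            keep.append(j)
    return keep


# Synthetic data (hypothetical): 40 samples, 5 vis-NIR bands, 3 XRF channels.
# Only the second XRF channel is constructed to track the target concentration.
rng = random.Random(0)
target = [rng.uniform(0, 100) for _ in range(40)]
vis_nir = [[rng.gauss(0, 1) for _ in range(5)] for _ in target]
xrf = [[rng.gauss(0, 1), 0.05 * y + rng.gauss(0, 0.1), rng.gauss(0, 1)] for y in target]

fused = fuse(vis_nir, xrf)
selected = univariate_filter(fused, target, threshold=0.8)
reduced = [[row[j] for j in selected] for row in fused]  # input for a regression model
print("fused bands:", len(fused[0]), "selected indices:", selected)
```

In this toy setting the filter keeps only the informative XRF channel, which sits at index 6 of the fused vector because the five vis–NIR bands come first. A genetic algorithm would instead search over subsets of bands and score each subset by model performance, which is why it can retain bands that a per-band filter discards.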
Figure 1. Location of Příbram within the Czech Republic, the sampling area, and the sampling points. Point colors distinguish sampling campaigns carried out at different periods: yellow points are new sampling points, and red points are previously analyzed locations that were newly collected and re-analyzed.
Figure 2. vis–NIR spectra measurement setup.
Figure 3. X-ray fluorescence (XRF) spectra measurement setup.
Statistical description of measured pseudo-total content of potentially toxic elements (PTEs), total Fe, and soil organic carbon (SOC).
| Element | No. | Unit | Mean | Median | Min | Max | SD | CV% | Skewness |
|---|---|---|---|---|---|---|---|---|---|
| As | 150 | mg/kg | 200 | 197 | 4.50 | 492 | 102 | 51 | 0.3 |
| Cd | 152 | mg/kg | 24.2 | 24.5 | 1.60 | 48.4 | 10.3 | 43 | 0.0 |
| Cu | 151 | mg/kg | 54.6 | 53.7 | 13.2 | 104 | 19.6 | 35 | 0.3 |
| Pb | 152 | mg/kg | 1803 | 1656 | 37.9 | 4170 | 920 | 51 | 0.5 |
| Zn | 154 | mg/kg | 2217 | 2124 | 49.4 | 5351 | 1128 | 51 | 0.4 |
| Mn | 142 | mg/kg | 2380 | 2324 | 499 | 5720 | 1140 | 48 | 0.8 |
| Fe | 152 | mg/kg | 20,973 | 20,503 | 9670 | 35,428 | 5522 | 26 | 0.4 |
| SOC | 147 | % | 3.1 | 2.9 | 0.9 | 6.2 | 1.1 | 35 | 0.5 |
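As a quick consistency check on the table above, the coefficient of variation can be recomputed as CV% = 100 × SD / mean. This is the standard definition and is assumed here, since the record does not state the formula. The values are copied from the table; small differences from the reported CV% are expected because the means and SDs shown are already rounded.

```python
# (mean, SD, reported CV%) copied from the descriptive-statistics table
stats = {
    "As": (200, 102, 51),
    "Cd": (24.2, 10.3, 43),
    "Cu": (54.6, 19.6, 35),
    "Pb": (1803, 920, 51),
    "Zn": (2217, 1128, 51),
    "Mn": (2380, 1140, 48),
    "Fe": (20973, 5522, 26),
    "SOC": (3.1, 1.1, 35),
}


def cv_percent(mean, sd):
    """Coefficient of variation in percent (assumed definition: 100 * SD / mean)."""
    return 100.0 * sd / mean


for name, (mean, sd, reported) in stats.items():
    computed = cv_percent(mean, sd)
    print(f"{name}: computed {computed:.1f}%, reported {reported}%")
```

Every recomputed value falls within one percentage point of the reported one, which supports the reading of the columns used in the table.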
Figure 4. Representative soil mean spectra (bold lines) and their variance (shaded areas) of vis–NIR and XRF.
Figure 5. The Pearson correlation coefficients between soil PTEs, total Fe, and SOC.
Statistics of the prediction model performance for soil PTEs concentration (mg/kg) using the single spectrometers’ full-range and feature-selected spectra (validation data set).
| Element | Sensor | Data Set | R | RMSE | Bias |
|---|---|---|---|---|---|
| As | vis–NIR | Full-range | 0.59 | 76.7 | 11.9 |
| As | vis–NIR | UF-selected | 0.55 | 78.5 | 13.8 |
| As | vis–NIR | GA-selected | 0.61 | 76.5 | 9.43 |
| As | XRF | Full-range | 0.77 | 58.2 | 14.4 |
| As | XRF | UF-selected | 0.70 | 66.4 | 15.8 |
| As | XRF | GA-selected | 0.82 | 52.5 | 14.4 |
| Cd | vis–NIR | Full-range | 0.25 | 8.42 | −1.90 |
| Cd | vis–NIR | UF-selected | 0.22 | 8.96 | 1.86 |
| Cd | vis–NIR | GA-selected | 0.25 | 8.41 | 1.73 |
| Cd | XRF | Full-range | 0.73 | 4.98 | 0.38 |
| Cd | XRF | UF-selected | 0.78 | 4.51 | 0.50 |
| Cd | XRF | GA-selected | 0.74 | 5.05 | 0.08 |
| Cu | vis–NIR | Full-range | 0.53 | 13.9 | 2.30 |
| Cu | vis–NIR | UF-selected | 0.56 | 13.25 | 1.82 |
| Cu | vis–NIR | GA-selected | 0.58 | 13.2 | 1.73 |
| Cu | XRF | Full-range | 0.71 | 10.8 | −0.19 |
| Cu | XRF | UF-selected | 0.78 | 9.91 | −0.98 |
| Cu | XRF | GA-selected | 0.76 | 10.10 | 0.06 |
| Pb | vis–NIR | Full-range | 0.61 | 665 | 64.3 |
| Pb | vis–NIR | UF-selected | 0.64 | 630 | 62.4 |
| Pb | vis–NIR | GA-selected | 0.68 | 613 | 56.2 |
| Pb | XRF | Full-range | 0.89 | 382 | −8.93 |
| Pb | XRF | UF-selected | 0.86 | 379 | 17.1 |
| Pb | XRF | GA-selected | 0.89 | 370 | 14.7 |
| Zn | vis–NIR | Full-range | 0.37 | 907 | 141 |
| Zn | vis–NIR | UF-selected | 0.37 | 939 | 205 |
| Zn | vis–NIR | GA-selected | 0.52 | 808 | 131 |
| Zn | XRF | Full-range | 0.81 | 501 | 50.0 |
| Zn | XRF | UF-selected | 0.79 | 520 | 58.8 |
| Zn | XRF | GA-selected | 0.80 | 509 | 43.8 |
| Mn | vis–NIR | Full-range | 0.45 | 844 | 68.3 |
| Mn | vis–NIR | UF-selected | 0.47 | 835 | 67.8 |
| Mn | vis–NIR | GA-selected | 0.53 | 829 | 50.0 |
| Mn | XRF | Full-range | 0.82 | 488 | −21.3 |
| Mn | XRF | UF-selected | 0.80 | 517 | 53.8 |
| Mn | XRF | GA-selected | 0.83 | 509 | −11.0 |
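For readers who want to reproduce statistics of this kind on their own validation set, the sketch below computes the three reported quantities. The definitions are assumptions, because this record does not give the formulas: R is taken as the Pearson correlation between observed and predicted values (if the column is in fact a coefficient of determination, a different formula applies), RMSE as the root mean squared prediction error, and bias as the mean of predicted minus observed. The example values are hypothetical.

```python
import math


def pearson_r(obs, pred):
    """Pearson correlation between observed and predicted values."""
    n = len(obs)
    mo, mp = sum(obs) / n, sum(pred) / n
    sop = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    soo = sum((o - mo) ** 2 for o in obs)
    spp = sum((p - mp) ** 2 for p in pred)
    return sop / math.sqrt(soo * spp)


def rmse(obs, pred):
    """Root mean squared error, in the units of the target (here mg/kg)."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def bias(obs, pred):
    """Mean signed error; assumed sign convention is predicted minus observed."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)


# Hypothetical observed and predicted concentrations (mg/kg)
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 41.0]

print(f"R = {pearson_r(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"Bias = {bias(observed, predicted):.3f}")
```

Under this sign convention a positive bias means the model overestimates on average, which would apply to most vis–NIR rows in the table if the same convention was used there.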
Number of spectral variables selected by the univariate filter (UF) and the genetic algorithm (GA) from vis–NIR (out of 2051) and XRF (out of 716) spectra.
| Element | UF: vis–NIR | UF: XRF | GA: vis–NIR | GA: XRF |
|---|---|---|---|---|
| As | 890 | 369 | 604 | 331 |
| Cd | 838 | 334 | 610 | 201 |
| Cu | 1024 | 380 | 353 | 342 |
| Pb | 1364 | 354 | 558 | 316 |
| Zn | 567 | 311 | 337 | 301 |
| Mn | 943 | 193 | 557 | 18 |
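The counts above are easier to compare across sensors when expressed as a share of the available variables (2051 for vis–NIR, 716 for XRF, as stated in the caption). The short sketch below does that arithmetic; the dictionary layout and function name are illustrative only.

```python
TOTAL = {"vis-NIR": 2051, "XRF": 716}

# Selected-variable counts copied from the table: {element: {method: {sensor: count}}}
SELECTED = {
    "As": {"UF": {"vis-NIR": 890, "XRF": 369}, "GA": {"vis-NIR": 604, "XRF": 331}},
    "Cd": {"UF": {"vis-NIR": 838, "XRF": 334}, "GA": {"vis-NIR": 610, "XRF": 201}},
    "Cu": {"UF": {"vis-NIR": 1024, "XRF": 380}, "GA": {"vis-NIR": 353, "XRF": 342}},
    "Pb": {"UF": {"vis-NIR": 1364, "XRF": 354}, "GA": {"vis-NIR": 558, "XRF": 316}},
    "Zn": {"UF": {"vis-NIR": 567, "XRF": 311}, "GA": {"vis-NIR": 337, "XRF": 301}},
    "Mn": {"UF": {"vis-NIR": 943, "XRF": 193}, "GA": {"vis-NIR": 557, "XRF": 18}},
}


def retained_percent(element, method, sensor):
    """Share of a sensor's spectral variables kept by a selection method, in percent."""
    return 100.0 * SELECTED[element][method][sensor] / TOTAL[sensor]


for element in SELECTED:
    shares = ", ".join(
        f"{method} {sensor} {retained_percent(element, method, sensor):.1f}%"
        for method in ("UF", "GA")
        for sensor in ("vis-NIR", "XRF")
    )
    print(f"{element}: {shares}")
```

For every element in the table, GA keeps fewer variables than UF on both sensors. The most extreme case is Mn on XRF, where GA keeps 18 of 716 channels.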
Statistics of the prediction model performance for soil PTEs concentration (mg/kg) using the fused spectra (validation data set).
| Element | Data Set | R | RMSE | Bias |
|---|---|---|---|---|
| As | vis–NIR + XRF (Full-range) | 0.76 | 60.9 | 17.1 |
| As | vis–NIR + XRF (UF-selected) | 0.69 | 66.9 | 17.8 |
| As | vis–NIR + XRF (GA-selected) | 0.77 | 59.7 | 15.8 |
| Cd | vis–NIR + XRF (Full-range) | 0.77 | 5.85 | 0.57 |
| Cd | vis–NIR + XRF (UF-selected) | 0.77 | 4.95 | 0.44 |
| Cd | vis–NIR + XRF (GA-selected) | 0.77 | 4.04 | 0.44 |
| Cu | vis–NIR + XRF (Full-range) | 0.75 | 10.85 | −0.72 |
| Cu | vis–NIR + XRF (UF-selected) | 0.74 | 10.84 | −2.16 |
| Cu | vis–NIR + XRF (GA-selected) | 0.75 | 10.21 | −0.40 |
| Pb | vis–NIR + XRF (Full-range) | 0.85 | 401 | 43.0 |
| Pb | vis–NIR + XRF (UF-selected) | 0.86 | 389 | 42.2 |
| Pb | vis–NIR + XRF (GA-selected) | 0.89 | 350 | 14.6 |
| Zn | vis–NIR + XRF (Full-range) | 0.68 | 666 | 35.0 |
| Zn | vis–NIR + XRF (UF-selected) | 0.75 | 592 | 10.9 |
| Zn | vis–NIR + XRF (GA-selected) | 0.71 | 659 | −21.0 |
| Mn | vis–NIR + XRF (Full-range) | 0.74 | 583 | 29.3 |
| Mn | vis–NIR + XRF (UF-selected) | 0.65 | 677 | 58.3 |
| Mn | vis–NIR + XRF (GA-selected) | 0.76 | 563 | 13.3 |
Figure 6. Pb (mg/kg) prediction model performance using different spectral data sets.
Figure 7. Feature selection by genetic algorithm (GA) from vis–NIR and XRF spectra (blue lines are the first derivative spectra, and the red dots are the selected features).