Agnieszka Sujak, Dariusz Jakubas, Ignacy Kitowski, Piotr Boniecki.
Abstract
Artificial neural networks (ANNs) were used to determine the influence of habitat types on the quality of the environment, expressed by the concentrations of toxic and harmful elements in avian tissue. The main habitat types were described according to the Corine Land Cover CLC2012 model. Eggs of a free-living colonial waterbird species, the grey heron Ardea cinerea, were used as a biological data storage medium for biomonitoring. For modeling purposes, pollution indices expressing the sum of the concentrations of harmful and toxic elements (multi-contamination rank index) and indices for single elements were created. For all the examined indices apart from Cd, the generated topology was a multi-layer perceptron (MLP) with one hidden layer. Interestingly, in the case of Cd, the generated optimal topology was a radial basis function (RBF) network. The data analysis showed that the increase in environmental pollution was mainly influenced by human industrial activity. The increase in Hg, Cd, and Pb content correlated mainly with the increase in the areas characterized by human activity (industrial, commercial, and transport units) in the vicinity of a grey heron breeding colony. The decrease in the above elements was conditioned by the relative areas of farmland and inland waters. Pollution with Fe, Mn, Zn, and As was associated mainly with areas affected by industrial activities. As the location variable did not affect the quality of the obtained networks, it was removed from the models, making them more universal.
Keywords: artificial neural networks; biomaterial; biomonitoring; elemental analysis; grey heron
Year: 2022 PMID: 35632134 PMCID: PMC9143455 DOI: 10.3390/s22103723
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. Study area showing the locations of the investigated grey heron colonies and the main habitat types according to the Corine Land Cover CLC2012 model, level 1 (https://land.copernicus.eu/pan-european/corine-land-cover/clc-2012, accessed on 1 February 2022), in 20 km buffers around the colonies representing potential foraging areas.
Input and output variables of ANN.
| No. | Variable | Code |
|---|---|---|
| Input variables * | | |
| 1 | Arable land | ARA_LAND |
| 2 | Artificial, non-agricultural vegetated areas | ARTIF_AREAS |
| 3 | Forests | FORESTS |
| 4 | Heterogeneous agricultural areas | HET_AGRO |
| 5 | Industrial, commercial, and transport units | INDUST_UNITS |
| 6 | Inland waters | INL_WAT |
| 7 | Inland wetlands | INL_WET |
| 8 | Marine waters | MAR_WAT |
| 9 | Mines, dumps, and construction sites | MINE_DUMP |
| 10 | Open spaces with little or no vegetation | OPEN_SPACE |
| 11 | Pastures | PASTURES |
| 12 | Permanent crops | PERM_CROPS |
| 13 | Scrub and/or herbaceous vegetation associations | SCRUB_VEGET |
| 14 | Urban fabric | URBAN_FABRIC |
| Output variables # | | |
| A | Contamination index ** | CONT_IN |
| B | Mean Hg (mg/kg) | Hg |
| C | Mean Pb (mg/kg) | Pb |
| D | Mean Cd (μg/kg) | Cd |
| E | Mean Fe (mg/kg) | Fe |
| F | Mean Mn (mg/kg) | Mn |
| G | Mean Zn (mg/kg) | Zn |
| H | Mean As (mg/kg) | As |
* Dimensionless data representing the relative share of the various habitat types in potential foraging areas. ** Dimensionless. # Values without dimensions were considered in the model.
Values of multi-contamination rank index (CONT_IN) calculated based on the measured concentrations of elements (number of eggshells available per location in brackets). Note: The lowest values indicate the highest sum of the mean concentrations of the harmful or toxic elements (Al, As, Cd, Cr, Cu, Fe, Hg, Mn, Se, Sr, V, and Zn). Colony codes—see Section 2.1.
| Colony Code | CONT_IN | Colony Code | CONT_IN |
|---|---|---|---|
| GK | 237 | DD | 247 |
| KR | 128 | JA | 102 |
| BR | 156 | PO | 177 |
| CH | 168 | MA | 134 |
| KS | 169 | LI | 167 |
| GA | 262 | KI | 120 |
| GO | 232 | RA | 170 |
| OS | 254 | OT | 144 |
| JS | 280 | ST | 157 |
| MO | 195 | PL | 188 |
| ZB | 197 | WR | 163 |
Figure 2. Optimal structure of the MLP-type ANN for CONT_IN: 14-16-1.
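To make the 14-16-1 notation concrete, the following minimal sketch builds a network with 14 inputs (one per habitat type), 16 hidden units, and 1 output, and runs a single forward pass. The tanh hidden activation, the linear output, the random weights, and the equal habitat shares are assumptions made for illustration only; the source does not state the activation functions or the trained weights.

```python
import math
import random

# Topology taken from the figure caption: MLP 14-16-1.
N_INPUTS, N_HIDDEN, N_OUTPUTS = 14, 16, 1


def init_mlp(seed=0):
    """Return randomly initialised weights (illustrative, not the trained values)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)]
                for _ in range(N_HIDDEN)]
    b_hidden = [0.0] * N_HIDDEN
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
    b_out = 0.0
    return w_hidden, b_hidden, w_out, b_out


def mlp_forward(x, params):
    """Forward pass: 14 habitat shares -> 16 tanh units (assumed) -> 1 linear output."""
    w_hidden, b_hidden, w_out, b_out = params
    if len(x) != N_INPUTS:
        raise ValueError(f"expected {N_INPUTS} inputs, got {len(x)}")
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def count_parameters():
    """Weights plus biases of both layers."""
    return (N_INPUTS + 1) * N_HIDDEN + (N_HIDDEN + 1) * N_OUTPUTS


params = init_mlp()
shares = [1.0 / N_INPUTS] * N_INPUTS  # hypothetical: equal share for each habitat type
prediction = mlp_forward(shares, params)
n_params = count_parameters()
```

Under these assumptions the network has (14 + 1) × 16 + (16 + 1) × 1 = 257 adjustable parameters, which is large relative to the 22 colonies listed in the CONT_IN table.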
Regression statistics of the obtained optimal neural models. S.D. ratio: quotient of the standard deviations determined for the errors and for the data. Correlation: standard Pearson correlation coefficient (r) between the results given by the generated neural model and the actual output values.
| | Learning File | Validation File | Test File | Type of Neural Model |
|---|---|---|---|---|
| CONT_IN | | | | |
| S.D. ratio | 0.0718400 | 0.0190262 | 0.6633987 | MLP: 14-16-1 |
| Correlation | 0.9975369 | 0.9998225 | 0.9869675 | |
| Toxic elements | | | | |
| Hg | | | | |
| S.D. ratio | 0.06611 | 0.0745 | 0.19449 | MLP: 14-3-1 |
| Correlation | 0.9981396 | 0.9974322 | 0.9822994 | |
| Pb | | | | |
| S.D. ratio | 0.06047 | 0.03304 | 0.06773 | MLP: 14-8-1 |
| Correlation | 0.9120701 | 0.8037518 | 0.972723 | |
| Cd | | | | |
| S.D. ratio | 0.004494 | 0.005565 | 0.006037 | RBF: 14-8-1 |
| Correlation | 0.7052663 | 0.8266247 | 0.7990223 | |
| Essential elements | | | | |
| Fe | | | | |
| S.D. ratio | 0.1003075 | 0.277002 | 0.2561513 | MLP: 14-3-1 |
| Correlation | 0.9951584 | 0.886153 | 0.8936853 | |
| Mn | | | | |
| S.D. ratio | 0.1890534 | 0.2848873 | 0.2264988 | MLP: 14-3-1 |
| Correlation | 0.9821074 | 0.8758581 | 0.8571459 | |
| Zn | | | | |
| S.D. ratio | 0.253584 | 0.2057967 | 0.1674033 | MLP: 14-47-1 |
| Correlation | 0.7570912 | 0.7174649 | 0.6284907 | |
| As | | | | |
| S.D. ratio | 0.1137984 | 0.4906985 | 0.526818 | MLP: 14-25-1 |
| Correlation | 0.9939374 | 0.9460305 | 0.8500785 | |
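The two statistics reported in the table above can be computed as in the following minimal sketch, which follows the definitions in the table caption. The observed values are the CONT_IN values of the first six colonies listed (GK, KR, BR, CH, KS, GA); the predicted values are hypothetical and serve only to exercise the functions.

```python
import math
import statistics


def sd_ratio(actual, predicted):
    """Standard deviation of the prediction errors divided by that of the data.

    Population and sample standard deviations give the same quotient,
    because the normalising factor cancels.
    """
    errors = [p - a for a, p in zip(actual, predicted)]
    return statistics.pstdev(errors) / statistics.pstdev(actual)


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between model output and observed values."""
    mean_a = statistics.fmean(actual)
    mean_p = statistics.fmean(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


# CONT_IN values for colonies GK, KR, BR, CH, KS, GA (from the CONT_IN table).
actual = [237, 128, 156, 168, 169, 262]
# Hypothetical model outputs, invented for this example only.
predicted = [240, 131, 150, 170, 165, 258]

ratio = sd_ratio(actual, predicted)
r = pearson_r(actual, predicted)
print(f"S.D. ratio = {ratio:.4f}, correlation = {r:.4f}")
```

An S.D. ratio close to 0 together with a correlation close to 1 indicates that the model reproduces the observed values well; an S.D. ratio near 1 means the model does little better than predicting the mean.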
The sensitivity analysis for the most important input variables (in italics) for the examined output variables (in bold) and Pearson correlation coefficient (r) for these variables.
| Output ANN Variable | Input Variable Rank 1 | Input Variable Rank 2 | Input Variable Rank 3 |
|---|---|---|---|
| **CONT_IN** | | | |
| Error | 61.93411 | 55.98893 | 50.2746 |
| Ratio | 16.40922 | 14.83407 | 13.32008 |
| r | −0.116 | −0.025 | 0.347 |
| Toxic elements | | | |
| **Hg** | | | |
| Error | 0.06141 | 0.0583 | 0.0544 |
| Ratio | 12.86131 | 12.20809 | 11.39253 |
| r | 0.029 | −0.079 | 0.075 |
| **Pb** | | | |
| Error | 0.1144253 | 0.1116625 | 0.1056726 |
| Ratio | 1.165484 | 1.137344 | 1.076334 |
| r | 0.153 | −0.294 | 0.025 |
| **Cd** | | | |
| Error | 0.006896 | 0.006581 | 0.006522 |
| Ratio | 1.133515 | 1.081717 | 1.071923 |
| r | 0.121 | 0.094 | −0.009 |
| Essential elements | | | |
| **Fe** | | | |
| Error | 7.788201 | 6.93745 | 3.479581 |
| Ratio | 19.19456 | 17.09783 | 8.575668 |
| r | 0.102 | 0.022 | −0.376 |
| **Mn** | | | |
| Error | 7.960049 | 5.096137 | 5.093208 |
| Ratio | 25.89732 | 16.57984 | 16.57031 |
| r | 0.027 | 0.018 | −0.3634 |
| **Zn** | | | |
| Error | 9.796726 | 9.366866 | 8.412936 |
| Ratio | 1.19883 | 1.146227 | 1.029494 |
| r | 0.337 | 0.193 | 0.161 |
| **As** | | | |
| Error | 0.3518201 | 0.2070515 | 0.2014431 |
| Ratio | 12.84457 | 7.559226 | 7.354468 |
| r | 0.261 | 0.175 | 0.211 |
Note: Rank orders the input variables by importance (by decreasing error); rank 1 denotes the dominant variable. Error indicates the quality of the network in the absence of a given variable: the lower the rank number of the input variable, the larger the error made by the network without it. Ratio: the quotient of the error of the network without the given variable and the error of the ANN obtained using all the variables; if the quotient is lower than 1.0, removing the variable improves the ANN quality. Habitat types: see Table 2.
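The error and ratio defined in the note can be illustrated with the following minimal sketch. It makes several assumptions that are not taken from the source: the "absence" of a variable is emulated by replacing it with its mean, the error measure is the root mean square error, and `toy_model`, its coefficients, and the data rows are hypothetical stand-ins for a trained network and the real habitat shares. Only the variable codes come from the input-variable table.

```python
import math


def rmse(model, rows, targets):
    """Root mean square error of the model on the given rows (assumed error measure)."""
    return math.sqrt(
        sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)
    )


def sensitivity(model, rows, targets, names):
    """Rank input variables by the error ratio obtained when each one is removed.

    Removal is emulated by substituting the variable's mean value (an assumption).
    Returns (name, error, ratio) tuples sorted by decreasing ratio, so the first
    entry is the rank 1 (dominant) variable.
    """
    baseline = rmse(model, rows, targets)
    results = []
    for j, name in enumerate(names):
        mean_j = sum(r[j] for r in rows) / len(rows)
        masked = [r[:j] + [mean_j] + r[j + 1:] for r in rows]
        error = rmse(model, masked, targets)
        results.append((name, error, error / baseline))
    results.sort(key=lambda item: item[2], reverse=True)
    return results


def toy_model(row):
    """Hypothetical stand-in for a trained ANN; the third input has no effect."""
    return 3.0 * row[0] + 0.5 * row[1] + 0.0 * row[2]


names = ["INDUST_UNITS", "ARA_LAND", "FORESTS"]
# Hypothetical habitat shares and noise, invented for this example only.
rows = [
    [0.05, 0.70, 0.25],
    [0.20, 0.50, 0.30],
    [0.10, 0.55, 0.35],
    [0.30, 0.45, 0.25],
    [0.15, 0.65, 0.20],
]
noise = [0.01, -0.02, 0.015, -0.01, 0.005]
targets = [toy_model(r) + e for r, e in zip(rows, noise)]

ranking = sensitivity(toy_model, rows, targets, names)
for rank, (name, error, ratio) in enumerate(ranking, start=1):
    print(rank, name, round(error, 4), round(ratio, 2))
```

In this toy setting, the variable with the largest coefficient receives rank 1 and a ratio well above 1.0, whereas the variable the model ignores has a ratio of 1.0, meaning that removing it neither improves nor degrades the network.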