Yan An, Zhihong Zou, Ranran Li.
Abstract
In this study, principal component analysis (PCA) and a self-organising map (SOM) were used to analyse a complex dataset obtained from the river water monitoring stations in the Tolo Harbor and Channel Water Control Zone (Hong Kong), covering 2009-2011. PCA was applied first to identify the principal components (PCs) among the nonlinear and complex surface water quality parameters. The SOM was then used to analyse the complex relationships and behaviours of the parameters. The results show that PCA reduced the multidimensional parameters to four significant PCs, each a combination of the original parameters. Pattern analysis in the component planes showed the positive and inverse relationships among the parameters explicitly. PCA and SOM were found to be efficient tools for capturing and analysing the behaviour of multivariable, complex, and nonlinearly related surface water quality data.
Keywords: principal component analysis; self-organising map; water quality
Year: 2016 PMID: 26761018 PMCID: PMC4730506 DOI: 10.3390/ijerph13010115
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. Locations of the 11 monitoring stations in the zone.
Figure 2. Topological structure of SOM.
Statistical description of water quality parameters across the sample points.
| Water Quality Parameter | Unit | Minimum | Maximum | Median | Mean | SD | CV |
|---|---|---|---|---|---|---|---|
| BOD5 | mg·L−1 | 0.01 | 17 | 0.8 | 1.6716 | 2.3262 | 1.3916 |
| NH3-N | mg·L−1 | 0.006 | 13 | 0.056 | 0.6021 | 1.7633 | 2.9285 |
| COD | mg·L−1 | 0.3 | 46 | 4 | 5.2942 | 4.4706 | 0.8444 |
| EC | μS/cm | 29 | 2018 | 157 | 206.5278 | 184.1479 | 0.8916 |
| DO | mg·L−1 | 4.3 | 10.6 | 8.1 | 8.1083 | 1.1248 | 0.1387 |
| TP | mg·L−1 | 0.005 | 1.4 | 0.09 | 0.1674 | 0.2348 | 1.4030 |
| NO3-N | mg·L−1 | 0.026 | 4.6 | 0.81 | 1.0202 | 0.7751 | 0.7598 |
| NO2-N | mg·L−1 | 0.00007 | 1.5 | 0.007 | 0.0674 | 0.1735 | 2.5731 |
| Satur O2 | % | 48 | 130 | 98 | 94.9672 | 11.5266 | 0.1214 |
| Susp | mg·L−1 | 0.1 | 650 | 3.4 | 9.9902 | 37.1404 | 3.7177 |
| Diss sol | mg·L−1 | 27.1 | 1296.3 | 99 | 112.0402 | 95.6936 | 0.8541 |
| T | °C | 11.2 | 33.4 | 23.5 | 23.4907 | 4.3935 | 0.1870 |
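The CV column in the table above is the ratio of the standard deviation to the mean. As a minimal sketch, the reported values for three of the rows can be recomputed as follows (the helper name `coefficient_of_variation` is illustrative and not from the source):

```python
# Mean and SD copied from the table above: parameter -> (mean, SD).
rows = {
    "BOD5": (1.6716, 2.3262),
    "DO": (8.1083, 1.1248),
    "Susp": (9.9902, 37.1404),
}


def coefficient_of_variation(mean: float, sd: float) -> float:
    """Return the dimensionless ratio SD / mean."""
    return sd / mean


# Round to four decimals, the precision used in the table.
cv = {name: round(coefficient_of_variation(m, s), 4) for name, (m, s) in rows.items()}

for name, value in cv.items():
    print(f"{name}: CV = {value}")
```

A CV well above 1 (suspended solids, for example) indicates a strongly dispersed parameter, whereas DO varies little relative to its mean.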
Pearson correlation matrix for the water quality parameters across the sample points.
| Water Quality Parameter | BOD5 | NH3-N | COD | EC | DO | TP | NO3-N | NO2-N | Satur O2 | Susp | Diss Sol | T |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BOD5 | 1.000 | |||||||||||
| NH3-N | 0.646 ** | 1.000 | ||||||||||
| COD | 0.667 ** | 0.741 ** | 1.000 | |||||||||
| EC | 0.232 ** | 0.214 ** | 0.236 ** | 1.000 | ||||||||
| DO | −0.340 ** | −0.182 ** | −0.350 ** | −0.315 ** | 1.000 | |||||||
| TP | 0.694 ** | 0.872 ** | 0.795 ** | 0.355 ** | −0.274 ** | 1.000 | ||||||
| NO3-N | 0.312 ** | 0.130 ** | 0.263 ** | 0.416 ** | −0.355 ** | 0.398 ** | 1.000 | |||||
| NO2-N | 0.451 ** | 0.406 ** | 0.442 ** | 0.318 ** | −0.043 | 0.646 ** | 0.471 ** | 1.000 | ||||
| Satur O2 | −0.352 ** | −0.159 ** | −0.300 ** | −0.242 ** | 0.784 ** | −0.234 ** | −0.372 ** | 0.086 | 1.000 | |||
| Susp | 0.256 ** | 0.059 | 0.198 ** | 0.015 | −0.103 * | 0.132 ** | 0.063 | 0.040 | −0.077 | 1.000 | ||
| Diss sol | 0.277 ** | 0.278 ** | 0.343 ** | 0.705 ** | −0.060 | 0.426 ** | 0.365 ** | 0.516 ** | 0.047 | 0.059 | 1.000 | |
| T | 0.027 | 0.051 | 0.111 * | 0.156 ** | −0.466 ** | 0.080 | 0.015 | 0.167 ** | 0.177 ** | 0.052 | 0.151 ** | 1.000 |
Notes: ** indicates correlation is significant at the 0.01 level (2-tailed); * indicates correlation is significant at the 0.05 level (2-tailed).
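Each entry in the matrix above is a Pearson product-moment coefficient. A minimal sketch of that calculation is given below; the two short series are hypothetical illustration values, not measurements from the study, and the function name is an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired readings (e.g. DO in mg/L and oxygen saturation in %).
do_values = [8.0, 7.5, 9.1, 6.8, 8.4]
saturation = [95.0, 90.0, 104.0, 84.0, 99.0]

r = pearson_r(do_values, saturation)
print(f"r = {r:.3f}")
```

The coefficient is symmetric in its arguments, which is why only the lower triangle of the matrix is reported.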
KMO and Bartlett’s test.
| KMO Measure of Sampling Adequacy | Bartlett’s Test of Sphericity: Approx. Chi-Square | Bartlett’s Test of Sphericity: df | Bartlett’s Test of Sphericity: Sig. |
|---|---|---|---|
| 0.626 | 4517.867 | 66 | 0.000 |
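The degrees of freedom reported for Bartlett's test follow from the number of variables: with p = 12 water quality parameters, df = p(p − 1)/2 = 66. The sketch below checks this and also states the standard form of the test statistic; the sample size and the log-determinant of the correlation matrix are not given in this section, so the arguments to `bartlett_chi_square` are hypothetical placeholders.

```python
def bartlett_df(p: int) -> int:
    """Degrees of freedom of Bartlett's test of sphericity for p variables."""
    return p * (p - 1) // 2


def bartlett_chi_square(n: int, p: int, log_det_r: float) -> float:
    """Standard Bartlett statistic: -(n - 1 - (2p + 5) / 6) * ln|R|.

    n is the number of samples, p the number of variables, and log_det_r
    the natural log of the determinant of the correlation matrix R.
    """
    return -(n - 1 - (2 * p + 5) / 6) * log_det_r


# Twelve parameters, as in the tables above.
df = bartlett_df(12)
print(df)
```

The statistic grows as the determinant of the correlation matrix approaches zero, that is, as the variables become more strongly intercorrelated.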
Loading on components for water quality parameters.
| Water Quality Parameter | PC1 | PC2 | PC3 | PC4 |
|---|---|---|---|---|
| BOD5 | 0.782 | |||
| NH3-N | 0.755 | |||
| COD | 0.819 | |||
| EC | 0.629 | |||
| DO | 0.787 | |||
| TP | 0.901 | |||
| NO3-N | 0.565 | |||
| NO2-N | 0.663 | |||
| Satur O2 | 0.825 | |||
| Susp | ||||
| Diss sol | 0.582 | 0.544 | ||
| T | 0.886 | |||
| Eigenvalue | 4.582 | 1.784 | 1.557 | 1.184 |
| Percentage of total variance | 38.187 | 14.866 | 12.978 | 9.864 |
| Cumulative percentage of variance | 38.187 | 53.053 | 66.030 | 75.894 |
Note: non-significant correlation coefficients are not shown.
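The variance rows of the loading table can be reproduced from the eigenvalues. The sketch below assumes PCA was run on standardised variables, so that the total variance equals the number of parameters (12); that assumption is consistent with the reported percentages but is not stated in this section. The eigenvalue-greater-than-one retention rule is likewise an assumption used for illustration.

```python
# Eigenvalues of PC1..PC4 as reported in the table above.
eigenvalues = [4.582, 1.784, 1.557, 1.184]
n_variables = 12  # assumed total variance for standardised variables

# Share of total variance explained by each component, in percent.
percent = [100 * ev / n_variables for ev in eigenvalues]

# Running total of explained variance.
cumulative = []
running = 0.0
for share in percent:
    running += share
    cumulative.append(running)

# Components passing the eigenvalue > 1 rule (assumed retention criterion).
retained = [ev for ev in eigenvalues if ev > 1.0]

for i, (share, total) in enumerate(zip(percent, cumulative), start=1):
    print(f"PC{i}: {share:.3f}% (cumulative {total:.3f}%)")
```

The recomputed shares agree with the table to within the rounding of the eigenvalues.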
Quantization errors (QEs) and topographic errors (TEs) of different map sizes.
| Quality of Trained SOM | (20 × 10) | (17 × 14) | (15 × 8) | (14 × 7) | (13 × 9) | (10 × 8) | (7 × 6) |
|---|---|---|---|---|---|---|---|
| QE | 7.5403 | 8.9137 | 8.8477 | 8.0568 | 7.9560 | 10.0911 | |
| TE | 1 | 0.5429 | 1 | 0.9975 | 0.9949 | 1 | |
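For orientation, the two quality measures in the table can be sketched as follows. QE is taken here as the mean distance between each sample and its best-matching unit, and TE as the fraction of samples whose best and second-best units are not adjacent on the map. The rectangular lattice, the four-neighbour adjacency rule, and the toy weights are assumptions for illustration and may differ from the configuration used in the study.

```python
from math import dist


def best_two_units(sample, weights):
    """Indices of the closest and second-closest map units to a sample."""
    order = sorted(range(len(weights)), key=lambda i: dist(sample, weights[i]))
    return order[0], order[1]


def quantization_error(samples, weights):
    """Mean Euclidean distance from each sample to its best-matching unit."""
    total = sum(dist(s, weights[best_two_units(s, weights)[0]]) for s in samples)
    return total / len(samples)


def topographic_error(samples, weights, grid):
    """Share of samples whose two best units are not lattice neighbours."""
    errors = 0
    for s in samples:
        first, second = best_two_units(s, weights)
        (r1, c1), (r2, c2) = grid[first], grid[second]
        # Assumed adjacency: rectangular lattice, four direct neighbours.
        if abs(r1 - r2) + abs(c1 - c2) != 1:
            errors += 1
    return errors / len(samples)


# Toy 1 x 3 map with two-dimensional weights (hypothetical values).
grid = [(0, 0), (0, 1), (0, 2)]
ordered_weights = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
folded_weights = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0)]  # topology not preserved
samples = [(0.1, 0.0), (1.9, 0.0)]

qe = quantization_error(samples, ordered_weights)
te_ordered = topographic_error(samples, ordered_weights, grid)
te_folded = topographic_error(samples, folded_weights, grid)
```

In this toy case both maps fit the samples equally closely, but only the ordered map keeps neighbouring units close in the input space, which is what TE is designed to detect.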
Figure 3. Patterning analysis for the water quality parameters on the SOM plane.
Figure 4. Davies-Bouldin clustering index of the K-means clustering algorithm.
Figure 5. Clusters of the SOM for the water quality dataset.
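The Davies-Bouldin index referred to in Figure 4 compares within-cluster scatter with the separation between cluster centroids; lower values indicate more compact, better separated clusters. A minimal sketch of the standard definition with Euclidean distances follows. The point sets are hypothetical, and the exact variant used in the study is not specified in this section.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of points)."""
    centres = [centroid(c) for c in clusters]
    # Scatter: mean distance of a cluster's points to its centroid.
    scatter = [sum(dist(p, m) for p in c) / len(c) for c, m in zip(clusters, centres)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / dist(centres[i], centres[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Hypothetical partitions: well separated versus overlapping.
tight = [[(0.0, 0.0), (0.0, 1.0)], [(10.0, 0.0), (10.0, 1.0)]]
loose = [[(0.0, 0.0), (0.0, 4.0)], [(3.0, 0.0), (3.0, 4.0)]]

db_tight = davies_bouldin(tight)
db_loose = davies_bouldin(loose)
```

When the index is evaluated for several candidate values of k, the k with the lowest index is usually preferred.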
Correlation matrix of the weight of the SOM.
| Water Quality Parameter | BOD5 | NH3-N | COD | EC | DO | TP | NO3-N | NO2-N | Satur O2 | Susp | Diss Sol | T |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BOD5 | 1.000 | |||||||||||
| NH3-N | 0.8520 ** | 1.000 | ||||||||||
| COD | 0.9472 ** | 0.9452 ** | 1.000 | |||||||||
| EC | 0.5582 ** | 0.5230 ** | 0.5871 ** | 1.000 | ||||||||
| DO | −0.4076 ** | −0.1729 | −0.3624 ** | −0.5098 ** | 1.000 | |||||||
| TP | 0.8946 ** | 0.9643 ** | 0.9636 ** | 0.6485 ** | −0.2656 ** | 1.000 | ||||||
| NO3-N | 0.5526 ** | 0.3704 ** | 0.5258 ** | 0.8214 ** | −0.5080 ** | 0.5488 ** | 1.000 | |||||
| NO2-N | 0.6957 ** | 0.7657 ** | 0.7645 ** | 0.6347 ** | −0.0349 | 0.8722 ** | 0.5593 ** | 1.000 | ||||
| Satur O2 | −0.4349 ** | −0.1504 | −0.3206 ** | −0.4171 ** | 0.8490 ** | −0.2348 ** | −0.5284 ** | 0.0551 | 1.000 | |||
| Susp | 0.6325 ** | 0.2710 ** | 0.4833 ** | 0.2591 ** | −0.4357 ** | 0.3537 ** | 0.4073 ** | 0.1948 | −0.4717 ** | 1.000 | ||
| Diss sol | 0.6172 ** | 0.7041 ** | 0.7131 ** | 0.7485 ** | −0.0320 | 0.7879 ** | 0.6231 ** | 0.8747 ** | 0.0843 | 0.2597 ** | 1.000 | |
| T | −0.0437 | 0.0318 | 0.0730 | 0.1949 | −0.3484 ** | 0.0452 | −0.0294 | 0.1244 | 0.1972 | −0.0541 | 0.1700 | 1.000 |
Note: ** indicates correlation is significant at the 0.05 level.
Statistical description of each group.
| Water Quality Parameter | Group 1 Min | Group 1 Max | Group 1 Mean | Group 1 SE | Group 2 Min | Group 2 Max | Group 2 Mean | Group 2 SE | Group 3 Min | Group 3 Max | Group 3 Mean | Group 3 SE | Group 4 Min | Group 4 Max | Group 4 Mean | Group 4 SE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BOD5 | 0.05 | 17 | 3.1412 | 0.3886 | 0.01 | 16 | 0.9123 | 0.1118 | 0.02 | 2 | 0.7095 | 0.1004 | 0.05 | 16 | 2.7442 | 0.2691 |
| NH3-N | 0.006 | 7.7 | 0.9099 | 0.1718 | 0.009 | 3.1 | 0.1628 | 0.0330 | 0.01 | 0.99 | 0.1478 | 0.0311 | 0.016 | 13 | 1.6104 | 0.3560 |
| COD | 0.8 | 16 | 6.6145 | 0.4327 | 0.3 | 22 | 4.2106 | 0.2242 | 0.6 | 25 | 4.3907 | 0.5711 | 2 | 46 | 7.1882 | 0.7265 |
| EC | 148 | 263 | 181.6522 | 2.5830 | 29 | 169 | 92.9548 | 2.6224 | 175 | 307 | 245.4186 | 5.6012 | 282 | 2018 | 474.5176 | 24.4396 |
| DO | 4.3 | 10.1 | 7.8319 | 0.1345 | 4.3 | 10.5 | 8.5518 | 0.0673 | 6.3 | 10.6 | 7.9581 | 0.1489 | 5 | 9.9 | 7.3894 | 0.1236 |
| TP | 0.01 | 0.98 | 0.2431 | 0.0290 | 0.005 | 0.73 | 0.0795 | 0.0072 | 0.03 | 0.72 | 0.1242 | 0.0160 | 0.03 | 1.4 | 0.3326 | 0.0393 |
| NO3-N | 0.47 | 4.6 | 1.5046 | 0.1085 | 0.026 | 2.6 | 0.6816 | 0.0364 | 0.11 | 1.8 | 0.8033 | 0.0797 | 0.14 | 3.6 | 1.5276 | 0.0882 |
| NO2-N | 0.0005 | 0.8 | 0.1354 | 0.0239 | 0.00007 | 1.5 | 0.0211 | 0.0077 | 0.0001 | 0.08 | 0.0212 | 0.0043 | 0.0002 | 1.3 | 0.1443 | 0.0277 |
| Satur O2 | 48 | 130 | 93.2174 | 1.7212 | 48 | 117 | 98.8995 | 0.5259 | 72 | 109 | 93.4884 | 1.1433 | 60 | 124 | 87.9765 | 1.5783 |
| Susp | 0.6 | 650 | 26.5493 | 10.1026 | 0.1 | 120 | 5.6302 | 0.9204 | 0.6 | 19 | 4.1698 | 0.5063 | 1.9 | 41 | 9.5694 | 1.0578 |
| Diss sol | 102.4 | 185 | 135.6246 | 2.6220 | 35.8 | 195 | 83.5106 | 1.6858 | 47.8 | 148.4 | 77.5977 | 4.0243 | 27.1 | 1296.3 | 176.9012 | 19.9310 |
| T | 16.5 | 33.2 | 24.0783 | 0.4486 | 11.2 | 29.9 | 22.9070 | 0.3121 | 11.9 | 29.8 | 23.7256 | 0.7340 | 15.1 | 33.4 | 24.1788 | 0.5001 |