Yifan Yuan, Bo Shi, Russell Yost, Xiaojun Liu, Yongchao Tian, Yan Zhu, Weixing Cao, Qiang Cao.
Abstract
Soil shows high spatiotemporal variability because of the combined influence of internal and external factors. The most efficient approach for addressing spatial variability is the use of management zones (MZs). Common approaches for delineating MZs include the K-means and fuzzy C-means cluster analysis algorithms. However, these clustering methods have been applied without accounting for the spatial dependence of soil variables, which has limited the accuracy of the clustering results. In this study, six soil variables (soil pH, total nitrogen, organic matter, available phosphorus, available potassium, and soil apparent electrical conductivity) were used to characterize the spatial variability within a representative village in Suining County, Jiangsu Province, China. Two variable reduction techniques (principal component analysis, PCA; and multivariate spatial analysis based on Moran's index, MULTISPATI-PCA) and three clustering algorithms (fuzzy C-means clustering, FCM; the iterative self-organizing data analysis technique algorithm, ISODATA; and the Gaussian mixture model, GMM) were used to optimize MZ delineation. The clustering model composites were evaluated using yield data collected after the wheat harvest in 2020. The results indicated that variable reduction techniques combined with clustering algorithms performed better in MZ delineation, with an average silhouette coefficient (ASC) of 0.48–0.57 and a variance reduction (VR) of 13.35–23.13%. Moreover, the MULTISPATI-PCA approach was more conducive to identifying variables requiring MZ delineation than traditional PCA. Combining MULTISPATI-PCA with the GMM algorithm yielded the greatest VR and ASC values in this study. These results can guide the optimization of MZ delineation in intensive agricultural systems, thus enabling more precise nutrient management.
Keywords: Gaussian mixture model; MULTISPATI-PCA; clustering model composites; management zone; soil variable
Year: 2022 PMID: 36235476 PMCID: PMC9573654 DOI: 10.3390/plants11192611
Source DB: PubMed Journal: Plants (Basel) ISSN: 2223-7747
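The workflow summarized in the abstract (standardize the soil variables, reduce them to a few components, cluster the component scores, then score the zones) can be sketched as follows. This is only an illustration under stated assumptions: it uses scikit-learn's plain `PCA` rather than MULTISPATI-PCA, the data are synthetic rather than the study's measurements, and the helper name `delineate_zones` and its defaults are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def delineate_zones(soil, n_zones=2, n_components=3, seed=0):
    """Standardize -> PCA -> GMM; returns (labels, average silhouette)."""
    z = StandardScaler().fit_transform(soil)
    scores = PCA(n_components=n_components).fit_transform(z)
    gmm = GaussianMixture(n_components=n_zones, random_state=seed)
    labels = gmm.fit_predict(scores)
    return labels, silhouette_score(scores, labels)


# Synthetic stand-in for six soil variables (NOT the study's data):
# two groups of 60 samples whose means differ in every variable.
rng = np.random.default_rng(0)
low = rng.normal(0.0, 1.0, size=(60, 6))
high = rng.normal(4.0, 1.0, size=(60, 6))
soil = np.vstack([low, high])

labels, asc = delineate_zones(soil)
print(f"zones found: {sorted(set(labels.tolist()))}, ASC = {asc:.2f}")
```

Swapping the `PCA` step for a spatially constrained reduction, or the `GaussianMixture` step for another clusterer, leaves the rest of the sketch unchanged, which mirrors how the model composites in this record are assembled.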
Table 1. Summary statistics of the studied variables.
| Variables | Min | Max | Mean | SD | CV/% | Skewness | Kurtosis | Moran’s I |
|---|---|---|---|---|---|---|---|---|
| pH | 7.51 | 8.46 | 8.10 | 0.28 | 3.50 | −0.46 | 3.73 | 0.38 * |
| TN (g kg−1) | 0.59 | 1.88 | 1.23 | 0.24 | 19.51 | −0.11 | 2.64 | 0.42 * |
| OM (g kg−1) | 6.87 | 28.01 | 17.27 | 3.62 | 20.96 | −0.27 | 2.92 | 0.53 * |
| AP (mg kg−1) | 5.85 | 261.77 | 31.16 | 28.49 | 91.43 | 1.41 | 5.93 | 0.39 * |
| AK (mg kg−1) | 89.00 | 323.15 | 207.30 | 46.38 | 22.37 | 0.12 | 2.58 | 0.24 * |
| ECa (mS m−1) | 10.15 | 51.03 | 18.32 | 6.05 | 33.33 | 1.11 | 4.80 | 0.35 * |
| Yield (t ha−1) | 5.25 | 10.95 | 8.33 | 1.27 | 16.16 | 0.07 | 2.63 | 0.11 * |
TN: total nitrogen; OM: organic matter; AP: available phosphorus; AK: available potassium; ECa: soil apparent electrical conductivity; SD: standard deviation; CV: coefficient of variation; Moran’s I: Global Moran’s Index; * indicates that the value is significant at the 0.01 level.
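The Global Moran's Index reported in the last column measures how strongly neighbouring samples resemble each other. A minimal sketch of the standard formula, I = (n / ΣΣw_ij) · ΣΣ w_ij (x_i − x̄)(x_j − x̄) / Σ (x_i − x̄)², is given below. The binary neighbour weights and the four-point transect are assumptions made for illustration; the source does not state which weight matrix was used.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / w_total) * cross / sum(d * d for d in dev)


# Toy transect of four sample points; adjacent points are neighbours
# (assumed binary weights, not the study's weight matrix).
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
trend = morans_i([1.0, 2.0, 3.0, 4.0], weights)        # smooth gradient
alternating = morans_i([1.0, 4.0, 1.0, 4.0], weights)  # high/low alternation
```

A smooth gradient gives a positive index and an alternating pattern gives a negative one, so the positive values in the table point to spatial clustering of similar values.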
Table 2. Semivariogram models and related geostatistical parameters for characterizing the spatial dependence of soil variables.
| Variables | Model | Nugget (C0) | Sill (C + C0) | SDC | Range (m) | R2 |
|---|---|---|---|---|---|---|
| pH | Gaussian | 0.01 | 0.03 | Moderate | 866.03 | 0.94 |
| TN | Gaussian | 0.00 | 0.04 | Strong | 214.77 | 0.80 |
| OM | Gaussian | 0.01 | 6.65 | Strong | 180.13 | 0.86 |
| AP | Exponential | 0.18 | 0.68 | Moderate to strong | 1227.43 | 0.85 |
| AK | Exponential | 1029 | 2417 | Moderate | 1770.00 | 0.94 |
| ECa | Spherical | 0.02 | 0.05 | Moderate | 849.64 | 0.92 |
| Yield | Exponential | 1000 | 1,684,000 | Strong | 945.00 | 0.84 |
TN: total nitrogen; OM: organic matter; AP: available phosphorus; AK: available potassium; ECa: soil apparent electrical conductivity; SDC: spatial dependency class, a ranking of spatial dependence. An SDC ratio (Nugget/Sill × 100%) of <25%, 25–75%, and >75% indicates strong, moderate, and weak spatial dependence, respectively [4].
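The classification rule in the footnote above can be written as a small function. The thresholds come from the footnote; the function name and the treatment of the exact boundary values (25% and 75% counted as moderate) are assumptions.

```python
def spatial_dependency_class(nugget, sill):
    """Classify spatial dependence from the nugget-to-sill ratio (in %)."""
    ratio = nugget / sill * 100.0
    if ratio < 25.0:
        return ratio, "strong"
    if ratio <= 75.0:
        return ratio, "moderate"
    return ratio, "weak"


# Nugget and sill values as printed in the semivariogram table above.
for name, nugget, sill in [("pH", 0.01, 0.03),
                           ("OM", 0.01, 6.65),
                           ("AK", 1029, 2417)]:
    ratio, label = spatial_dependency_class(nugget, sill)
    print(f"{name}: {ratio:.1f}% -> {label}")
```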
Figure 1. Variograms of the measured soil variables and their fitted models. (a) Soil pH, Gaussian; (b) TN, Gaussian; (c) OM, Gaussian; (d) AP, exponential; (e) AK, exponential; (f) ECa, spherical. The dotted line in each panel represents the sample variance of the entire area. Parameter estimates of the respective models are given in Table 2. Note that the x-axis and y-axis scales differ between panels.
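For readers unfamiliar with the three model shapes named in this caption, the sketch below gives one common textbook parameterization of each, valid for lag distances h > 0. It is an assumption that the exponential and Gaussian forms use the practical range (the distance at which about 95% of the sill is reached); the source does not say which convention its Range column follows.

```python
import math


def spherical(h, nugget, sill, a):
    """Spherical model: reaches the sill exactly at distance a."""
    if h >= a:
        return sill
    r = h / a
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def exponential(h, nugget, sill, a):
    """Exponential model; a is the practical range (about 95% of the sill)."""
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h / a))


def gaussian(h, nugget, sill, a):
    """Gaussian model; a is the practical range (about 95% of the sill)."""
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * (h / a) ** 2))


# Example with the ECa parameters printed in the semivariogram table:
# nugget 0.02, sill 0.05, range 849.64 m, spherical model.
half_range = spherical(424.82, 0.02, 0.05, 849.64)
at_range = spherical(849.64, 0.02, 0.05, 849.64)
```

The Gaussian form rises slowly near the origin, while the exponential and spherical forms rise almost linearly there, which is the main visual difference between the panels.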
Figure 2. Spatial pattern maps of the six soil variables based on local indicators of spatial association (LISA). (a) pH; (b) TN; (c) OM; (d) AP; (e) AK; (f) ECa. Not significant: the statistic is not significant at the 0.05 level (such locations were not referenced); High–High: high values in a high-value neighborhood; Low–Low: low values in a low-value neighborhood.
Figure 3. Eigenvalues and cumulative explained variance for principal component analysis (PCA) and multivariate spatial analysis based on Moran’s index (MULTISPATI-PCA). (a) PCA; (b) MULTISPATI-PCA. The red line marks an eigenvalue of 1.
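The quantities plotted in Figure 3 can be computed for ordinary PCA as shown below. Using the eigenvalue-of-1 line as a retention threshold (keep components whose eigenvalue exceeds 1) is a common convention and is assumed here; the data are synthetic and the function name is hypothetical.

```python
import numpy as np


def eigenvalue_screen(data):
    """Eigenvalues of the correlation matrix and how many exceed 1."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)[::-1]  # sort descending
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    return eigvals, cumulative, int((eigvals > 1.0).sum())


# Synthetic data (NOT the study's): six variables driven by two latent factors.
rng = np.random.default_rng(1)
factors = rng.normal(size=(200, 2))
mixing = np.array([[1, 1, 1, 0, 0, 0],
                   [0, 0, 0, 1, 1, 1]], dtype=float)
data = factors @ mixing + 0.3 * rng.normal(size=(200, 6))

eigvals, cumulative, n_keep = eigenvalue_screen(data)
print(np.round(eigvals, 2), np.round(cumulative, 2), n_keep)
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue above 1 means the component carries more variance than any single standardized variable.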
Table 3. Loadings for the variables in the principal components (PCs) and spatial principal components (SPCs) and statistics of the PCs and SPCs.
| Component | pH | TN | OM | AP | AK | ECa | Variance | Moran’s I |
|---|---|---|---|---|---|---|---|---|
| PCA | | | | | | | | |
| PC1 | 0.33 | 0.94 | 0.95 | −0.73 | 0.70 | 0.18 | 6.23 | 0.62 * |
| PC2 | −0.85 | 0.16 | 0.15 | 0.45 | 0.40 | 0.18 | 1.77 | 0.24 * |
| PC3 | 0.12 | 0.02 | −0.06 | 0.03 | −0.21 | 0.96 | 0.94 | 0.33 * |
| MULTISPATI-PCA | | | | | | | | |
| SPC1 | −0.34 | 0.90 | 0.88 | 0.58 | 0.02 | 0.67 | 1.98 | 0.68 * |
| SPC2 | 0.73 | −0.22 | −0.20 | 0.60 | 0.27 | 0.40 | 0.47 | 0.52 * |
| SPC3 | −0.11 | 0.10 | 0.03 | −0.21 | 0.96 | −0.06 | 0.29 | 0.46 * |
TN: total nitrogen; OM: organic matter; AP: available phosphorus; AK: available potassium; ECa: soil apparent electrical conductivity; PC: principal component; SPC: spatial principal component; * indicates that the value is significant at the 0.01 level.
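The spatial principal components above pair a variance with a Moran's I because MULTISPATI-PCA looks for axes that maximize the product of the two. The sketch below follows one common formulation, assumed here rather than taken from the source: with standardized data X, a row-standardized weight matrix W, and uniform sample weights, the axes are the eigenvectors of X'(W + W')X / (2n). The chain-shaped neighbour structure and the data are synthetic.

```python
import numpy as np


def multispati(data, weights):
    """Spatial PCA sketch: eigen-decompose X'(W + W')X / (2n)."""
    x = (data - data.mean(axis=0)) / data.std(axis=0)   # standardize columns
    w = weights / weights.sum(axis=1, keepdims=True)    # row-standardize W
    n = x.shape[0]
    h = x.T @ (w + w.T) @ x / (2.0 * n)
    eigvals, axes = np.linalg.eigh(h)
    order = np.argsort(eigvals)[::-1]                   # largest first
    return eigvals[order], x @ axes[:, order], w


def morans_i(score, w):
    """Moran's I of a centered score under row-standardized weights."""
    return float(score @ w @ score / (score @ score))


# Synthetic example (NOT the study's data): 30 points along a line,
# three variables following a spatial trend and three pure-noise variables.
rng = np.random.default_rng(2)
n = 30
weights = np.zeros((n, n))
idx = np.arange(n - 1)
weights[idx, idx + 1] = weights[idx + 1, idx] = 1.0
trend = np.linspace(-1.0, 1.0, n)
data = np.column_stack([trend + 0.2 * rng.normal(size=n) for _ in range(3)]
                       + [rng.normal(size=n) for _ in range(3)])

eigvals, scores, w = multispati(data, weights)
first = scores[:, 0]
print(eigvals[0], first.var() * morans_i(first, w))  # the two values agree
```

Under these assumptions each eigenvalue equals the variance of the corresponding score times its Moran's I, so unlike ordinary PCA the eigenvalues can be negative when a component is negatively autocorrelated.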
Figure 4. Spatial distribution maps of principal components (PCs) and spatial principal components (SPCs). (a) PC1; (b) PC2; (c) PC3; (d) SPC1; (e) SPC2; (f) SPC3.
Table 4. Results of the evaluation of the clustering models in the generation of two, three, and four classes in terms of the ANOVA (Tukey’s test), variance reduction (VR) index, and average silhouette coefficient (ASC).
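| Model | 2 classes: C1 | 2 classes: C2 | 2 classes: VR (%) | 2 classes: ASC | 3 classes: C1 | 3 classes: C2 | 3 classes: C3 | 3 classes: VR (%) | 3 classes: ASC | 4 classes: C1 | 4 classes: C2 | 4 classes: C3 | 4 classes: C4 | 4 classes: VR (%) | 4 classes: ASC |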
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All-FCM | a | b | 13.64 | 0.47 | a | b | c | 12.14 | 0.42 | a | b | c | ac | 10.96 | 0.38 |
| All-ISODATA | a | b | 11.32 | 0.39 | a | b | c | 10.67 | 0.37 | a | b | b | c | 13.47 | 0.28 |
| All-GMM | a | b | 18.60 | 0.54 | a | b | c | 17.68 | 0.48 | a | b | c | d | 16.05 | 0.36 |
| PCA-FCM | a | b | 15.36 | 0.48 | a | b | b | 11.61 | 0.37 | a | b | c | bc | 8.47 | 0.31 |
| PCA-ISODATA | a | b | 13.35 | 0.49 | a | a | b | 10.67 | 0.32 | a | b | c | bc | 9.56 | 0.26 |
| PCA-GMM | a | b | 16.24 | 0.51 | a | b | c | 12.11 | 0.45 | a | b | b | c | 10.34 | 0.35 |
| MPCA-FCM | a | b | | | a | b | c | | | a | b | a | c | | |
| MPCA-ISODATA | a | b | 19.08 | 0.52 | a | b | b | 17.07 | 0.44 | a | b | c | d | 15.94 | 0.30 |
| MPCA-GMM | a | b | | | a | b | c | | | a | b | c | b | | |
C1, C2, C3, and C4 denote the classes generated by each clustering method. Bold, underlined values mark the best model for delineating MZs; underlined values that are not bold mark the second-best model.
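The VR index in the table above expresses how much of the yield variance disappears once the field is split into zones. The source does not print the formula, so the sketch below assumes a widely used definition: VR = (1 − Σ p_i · s_i² / s²) × 100, where s² is the yield variance of the whole field, s_i² the variance within zone i, and p_i the share of samples in zone i. The yields and zone labels are hypothetical.

```python
from statistics import pvariance


def variance_reduction(yields, zones):
    """Percent reduction of yield variance after splitting into zones."""
    total = pvariance(yields)
    within = 0.0
    for zone in set(zones):
        members = [y for y, z in zip(yields, zones) if z == zone]
        within += len(members) / len(yields) * pvariance(members)
    return (1.0 - within / total) * 100.0


# Hypothetical yields (t ha-1) and zone labels, for illustration only.
yields = [7.0, 7.4, 7.2, 9.0, 9.4, 9.2]
zones = [1, 1, 1, 2, 2, 2]
vr = variance_reduction(yields, zones)
print(f"VR = {vr:.2f}%")
```

A single zone gives VR = 0, and zones that separate low-yield from high-yield areas push VR towards 100, so higher values in the table indicate zones that explain more of the yield variability.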
Table 5. Tukey’s test for the soil variables when the village was divided into two classes, considering classes defined with the nine clustering models.
| Model | pH | TN | OM | AP | AK | ECa |
|---|---|---|---|---|---|---|
| All-FCM | ** | ** | ** | ** | ** | * |
| All-ISODATA | ** | ** | ** | ** | ** | ** |
| All-GMM | ** | ** | ** | * | ** | ** |
| PCA-FCM | ** | ** | ** | ** | ** | ** |
| PCA-ISODATA | ** | ** | ** | ** | ** | ** |
| PCA-GMM | ** | ** | ** | ** | ** | ** |
| MPCA-FCM | ** | ** | ** | ** | ** | ** |
| MPCA-ISODATA | ** | ** | ** | ** | ** | ** |
| MPCA-GMM | ** | ** | ** | ** | ** | ** |
TN: total nitrogen; OM: organic matter; AP: available phosphorus; AK: available potassium; ECa: soil apparent electrical conductivity; **: significant difference between the averages of all classes at the 0.01 level; *: significant difference between the averages at the 0.05 level.
Figure 5. Maps of the MZs defined with the nine clustering models. (a) All-FCM; (b) All-ISODATA; (c) All-GMM; (d) PCA-FCM; (e) PCA-ISODATA; (f) PCA-GMM; (g) MPCA-FCM; (h) MPCA-ISODATA; (i) MPCA-GMM. Descriptions of all clustering models are given in Table 6.
Figure 6. Kappa degrees of agreement between the maps obtained with the nine clustering models.
Figure 7. Location of the study area and the sampling grids used to guide soil sample collection at the experimental site (Zhaoji village, Jiangsu, China).
Figure 8. Flow chart of the optimized MZ delineation in this work.
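The agreement measure named in the Figure 6 caption can be illustrated with Cohen's kappa, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed share of locations assigned to the same zone and p_e the share expected by chance. The sketch assumes the two maps use matching zone labels (cluster labels are arbitrary, so they may need to be aligned first); the example maps are hypothetical.

```python
from collections import Counter


def cohens_kappa(map_a, map_b):
    """Cohen's kappa for two zone maps given as equal-length label lists."""
    n = len(map_a)
    observed = sum(a == b for a, b in zip(map_a, map_b)) / n
    count_a, count_b = Counter(map_a), Counter(map_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Two hypothetical four-cell zone maps that disagree in one cell.
map_a = [0, 0, 1, 1]
map_b = [0, 0, 1, 0]
kappa = cohens_kappa(map_a, map_b)
print(f"kappa = {kappa:.2f}")
```

Identical maps give κ = 1 and agreement no better than chance gives κ = 0, which is why kappa is preferred over the raw share of matching cells when zone sizes are unequal.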
Table 6. Details on the nine clustering models that were used in the analysis.
| Model Composite | Model Description |
|---|---|
| All-FCM | Raw values for the six soil variables were used as input, and the FCM algorithm was applied to the raw values. |
| All-ISODATA | Raw values for the six soil variables were used as input, and the ISODATA algorithm was applied to the raw values. |
| All-GMM | Raw values for the six soil variables were used as input, and the GMM algorithm was applied to the raw values. |
| PCA-FCM | Raw values for the six soil variables were standardized and used as input to PCA, and the FCM algorithm was applied to the principal components. |
| PCA-ISODATA | Raw values for the six soil variables were standardized and used as input to PCA, and the ISODATA algorithm was applied to the principal components. |
| PCA-GMM | Raw values for the six soil variables were standardized and used as input to PCA, and the GMM algorithm was applied to the principal components. |
| MPCA-FCM | Raw values for the six soil variables were standardized and used as input to MULTISPATI-PCA, and the FCM algorithm was applied to the spatial principal components. |
| MPCA-ISODATA | Raw values for the six soil variables were standardized and used as input to MULTISPATI-PCA, and the ISODATA algorithm was applied to the spatial principal components. |
| MPCA-GMM | Raw values for the six soil variables were standardized and used as input to MULTISPATI-PCA, and the GMM algorithm was applied to the spatial principal components. |