
Prediction of Soil Heavy Metal Immobilization by Biochar Using Machine Learning.

Kumuduni N Palansooriya1, Jie Li2, Pavani D Dissanayake1,3, Manu Suvarna2, Lanyu Li2, Xiangzhou Yuan1, Binoy Sarkar4, Daniel C W Tsang5, Jörg Rinklebe6,7, Xiaonan Wang8, Yong Sik Ok1.   

Abstract

Biochar application is a promising strategy for the remediation of contaminated soil, while ensuring sustainable waste management. Biochar remediation of heavy metal (HM)-contaminated soil primarily depends on the properties of the soil, biochar, and HM. The optimum conditions for HM immobilization in biochar-amended soils are site-specific and vary among studies. Therefore, a generalized approach to predict HM immobilization efficiency in biochar-amended soils is required. This study employs machine learning (ML) approaches to predict the HM immobilization efficiency of biochar in biochar-amended soils. The nitrogen content in the biochar (0.3-25.9%) and biochar application rate (0.5-10%) were the two most significant features affecting HM immobilization. Causal analysis showed that the empirical categories for HM immobilization efficiency, in the order of importance, were biochar properties > experimental conditions > soil properties > HM properties. Therefore, this study presents new insights into the effects of biochar properties and soil properties on HM immobilization. This approach can help determine the optimum conditions for enhanced HM immobilization in biochar-amended soils.

Keywords:  biochar; graphical user interface; heavy metal; machine learning models; soil remediation

Year:  2022        PMID: 35289167      PMCID: PMC8988308          DOI: 10.1021/acs.est.1c08302

Source DB:  PubMed          Journal:  Environ Sci Technol        ISSN: 0013-936X            Impact factor:   9.028


Introduction

Soil pollution by heavy metals (HMs) is a significant global problem that threatens sustainable development, particularly in developing countries such as China and India, where 36% of the global population resides.[1,2] Anthropogenic activities such as mining and smelting, industrial operations, and agricultural activities accelerate HM contamination in soils.[1] Ultimately, these HMs enter the food chain and cause diseases such as cancer, renal failure, cardiovascular disorders, and neurological and cognitive impairment.[3,4] Various in situ and ex situ techniques have been used to remediate HM-contaminated soils. Among these techniques, in situ HM immobilization using biological waste has become well established owing to its efficiency, economic feasibility, and ease of adaptation.[3] In situ HM immobilization is a risk-based remediation strategy in which HM bioavailability is reduced to a level considered safe for the intended land use.[5] Appropriate HM immobilizing agents can facilitate environmentally sustainable remediation because of their reduced environmental footprint.[6] Biochar application to contaminated soil is considered a promising method for HM immobilization because biochar can adsorb and immobilize HMs as its surface area (SA), microporosity, surface functional groups, pH, and cation exchange capacity are superior to those of raw feedstock.[7,8] A variety of biochars produced from various feedstocks (e.g., sewage sludge, manure, crop residue) under different production conditions (e.g., slow pyrolysis, fast pyrolysis, gasification, and hydrothermal carbonization)[9] have been used to immobilize HMs (and metalloids) such as As, Cd, Cu, Pb, Cr, Ni, Co, and Zn in soils.[10,11] The HM immobilization efficiency in biochar-amended soils varies depending on the type of biochar (e.g., production conditions and physicochemical properties), soil properties (e.g., soil pH, organic matter content, and electrical conductivity (EC)), and HM properties (e.g., 
valency and ionic radius).[7,12] Studies have shown that immobilization efficiency is influenced by various adsorption/immobilization mechanisms and factors such as cation exchange, electrostatic interaction, precipitation, and complexation by surface functional groups.[13] However, the optimum conditions for enhanced HM immobilization in soils using biochar vary considerably among studies. Studying all of the process parameters involved in soil HM immobilization via simultaneous experimentation is challenging. Several meta-analyses, bibliometric analyses, and review studies have been carried out to evaluate the efficiency of HM immobilization.[14,15] However, these methods of determining the relative contribution of various factors to the immobilization efficiency are time-consuming and complex. Before applying a biochar-based remediation technique, identifying the optimum parameters for maximum HM immobilization in certain types of soil via an empirical approach can reduce the time and cost involved. However, to the best of our knowledge, such techniques have not yet been developed. An empirical approach capable of predicting HM immobilization efficiency in biochar-treated soils may address the issues associated with the determination of the optimum experimental conditions, biochar properties, and soil properties to maximize HM immobilization in soils. A robust model involving all of the possible factors can be used to highlight the relative importance of each factor, which may enhance the understanding of the overall process and help achieve a high HM immobilization efficiency in contaminated soils under optimized conditions. 
Machine learning (ML) can process and learn from large, complex, and multidimensional data to develop predictive models.[16] ML methods such as random forest (RF) and neural networks (NN) have been used to monitor and map contaminants in soils[17] and groundwater.[18] In addition, several studies have utilized ML models to develop risk assessment methods for groundwater pollution,[19] predict the yield and C content of biochar based on biomass properties and pyrolysis conditions,[20] as well as predict the sorption efficiencies of HMs (and metalloids)[16,21−23] and personal care products[24] by biochar in water and wastewater. The ML models used could predict the nonlinear and complex relationships between dependent and independent variables in complex systems for environmental engineering and bioremediation.[16,25] Complex biochar–soil interactions and the lack of a systematic dataset have resulted in a paucity of studies on ML-based prediction of biochar efficiency for immobilization of contaminants in soils, particularly in relation to HM contamination. To address this gap, we developed three types of ML models (RF, support vector regression (SVR), and NN) to predict HM immobilization efficiency in biochar-amended soils. Conventional statistical methods can typically capture only simple linear or quadratic correlations between a single factor and a target. Compared to general statistical analysis, ML methods can simultaneously consider the maximum possible related factors and identify complex correlations (both linear and nonlinear) with the targets. Moreover, the targets of interest (i.e., HM immobilization) can be accurately predicted by the ML model to conveniently evaluate the remediation potential of biochar for a specific HM-contaminated soil. In addition, new experimental designs derived from the developed ML model can guide experiments and biochar application in HM-contaminated soil (e.g., by providing a suggested biochar addition rate). 
Therefore, this study considered 20 input variables to assess their roles and effects on HM immobilization in biochar-treated soils. These variables were related to four key aspects: (i) biochar characteristics and pyrolysis temperatures; (ii) experimental conditions; (iii) physicochemical properties of biochar-treated soils; and (iv) HM properties. Prior to developing the ML models, a thorough statistical analysis of the data collected from the literature was used to establish correlations between the input variables (factors) and output variables (HM immobilization) (detailed information is available in the Supporting Information including Figures S1 and S2).

Materials and Methods

Data Collection and Data Imputation

Research articles addressing HM immobilization (for Cu, Zn, Pb, Cd, Fe, Ni, and Mn) in biochar-amended soils from 2015 to 2020[11,12,26−37] were selected using the Institute for Scientific Information Web of Science and Google Scholar databases. The HM immobilization efficiency for the biochar-amended soil (compared to untreated soil) was directly obtained from the experimental data or calculated as a percentage using eq 1

$$\text{Immobilization efficiency (\%)} = \frac{C_{B0} - C_{B}}{C_{B0}} \times 100 \tag{1}$$

where CB0 refers to the HM concentration in the control treatment (without biochar amendment), and CB is the HM concentration in the biochar-amended treatment. The concentrations are expressed in mg·kg–1. Immobilization in metal-contaminated soil describes a decrease in the bioavailability, bioaccessibility, phytoavailability, and leaching of the metal, as there are reduced amounts of the metal in the exchangeable, labile, and water-soluble fractions of the soil. The research articles selected for this study were expected to include all of the data for the four empirical categories listed in the Introduction, including 20 input features and the output variable (Table S1). However, only a few studies were available that included all of the data for the considered parameters. Therefore, 15 papers[11,12,26−37] were selected, and data were obtained from tables or extracted from figures using the Web Plot Digitizer Software (https://apps.automeris.io/wpd/). Thus, 162 data points were collected and used for ML exploration. The detailed procedure of ML exploration associated with HM immobilization efficiency in biochar-amended soils is illustrated in Figure 1. Twenty parameters were identified as input features, and HM immobilization was defined as the output variable. From the total dataset, 35 and 10 data points were missing for the biochar SA and soil EC data, respectively. 
To fill these data gaps and obtain a uniform dataset, an ML model was used to predict the missing SA values using other properties of biochar, including pH, composition, and atomic ratio. Missing soil EC data points were excluded due to the lack of adequate soil input variables for the prediction of the EC. In summary, two datasets were compiled, one for the prediction of the biochar SA and the other for the prediction of the immobilization efficiency. The dataset for SA prediction contained 127 data points on the biochar properties as the input variables, whereas that for the immobilization efficiency prediction contained 152 data points following the addition of the missing SA data.
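As a minimal sketch of how the target variable could be computed from raw concentrations, the function below assumes that the immobilization efficiency is the relative decrease from the control concentration (CB0) to the biochar-amended concentration (CB), expressed as a percentage. The function name and the example concentrations are hypothetical and are not taken from the study.

```python
def immobilization_efficiency(c_control: float, c_biochar: float) -> float:
    """Percent decrease in HM concentration relative to the unamended control.

    c_control: HM concentration in the control treatment (mg/kg), i.e. CB0.
    c_biochar: HM concentration in the biochar-amended treatment (mg/kg), i.e. CB.
    """
    if c_control <= 0:
        raise ValueError("control concentration must be positive")
    return (c_control - c_biochar) / c_control * 100.0


# Hypothetical concentrations in mg/kg (illustrative only, not study data).
efficiency = immobilization_efficiency(40.0, 10.0)
print(efficiency)
```

A negative result would indicate that the amended soil contained more available metal than the control, that is, mobilization rather than immobilization.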
Figure 1

Flowchart detailing the strategies of the machine learning framework to determine the heavy metal immobilization efficiency in biochar-amended soil. During the first step, data were collected from the literature based on four empirical categories and the output variable. Model development was preceded by data pretreatment. Then, the model was updated using feature engineering to simplify it and improve its performance. The final updated model was used for feature exploration to study the effects of input features on the target output.

Model Development and Evaluation

To improve the training process of ML models for rapid convergence, the input features were normalized using StandardScaler in Scikit-Learn (version 1.0.2)[38] to bring them to a similar scale (zero mean and unit variance). Following normalization, the dataset was randomly divided into two parts: 85% was used for ML model training and the remaining 15% was used for the final model evaluation.[39,40] Previous studies have reported that SVR, RF, and NN algorithms have exhibited satisfactory performance when used to train the models used for the prediction of biochar properties and HM adsorption by biochar.[16,20,41,42] The hyperparameters for each algorithm were tuned to minimize the mean-squared error of the biochar SA and immobilization efficiency predictions based on fivefold cross-validation. Different hyperparameters were tuned for the different ML algorithms. In SVR, the epsilon (ε), kernel function, and penalty (α) parameters were the tuned hyperparameters. In RF, the number of trees, depth of each tree, and max_feature of RF were the three crucial parameters that were tuned. Finally, in NN, the number of hidden layers and neurons in each layer were tuned to improve the model convergence. The tuning process used in this study for the three ML algorithms has been described in previous studies.[16,20,41] Three optimal ML models from the SVR, NN, and RF algorithms were obtained after hyperparameter tuning; these models were evaluated using the 15% test dataset. 
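The workflow described above (scaling, an 85/15 split, and fivefold cross-validated tuning) might be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the authors' code: the data are synthetic placeholders, the grid values are arbitrary, and the scaler is placed inside a pipeline so that it is fitted on training folds only, which differs from normalizing the whole dataset before splitting.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic placeholder data: 152 rows x 20 features (NOT the study's dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(152, 20))
y = 50 + 10 * X[:, 0] - 5 * X[:, 1] + rng.normal(scale=2.0, size=152)

# 85% training / 15% held-out test; a fixed random_state keeps the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, random_state=0
)

# Scaling lives inside the pipeline so each CV fold fits its own scaler.
pipe = make_pipeline(StandardScaler(), RandomForestRegressor(random_state=0))

# Illustrative grid over the three RF hyperparameters named in the text.
param_grid = {
    "randomforestregressor__n_estimators": [50, 100],
    "randomforestregressor__max_depth": [5, None],
    "randomforestregressor__max_features": [0.5, 1.0],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X_train, y_train)

# R2 on the held-out 15% that took no part in tuning.
test_r2 = search.best_estimator_.score(X_test, y_test)
print(search.best_params_, round(test_r2, 3))
```

The same pattern would apply to SVR or NN by swapping the estimator and its grid.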
The coefficient of determination (R2) and root-mean-square error (RMSE) were utilized to compare the prediction accuracy and quantify the prediction performance.[41] R2 and RMSE values were calculated using eqs 2 and 3, respectively

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left(y_p^i - y_t^i\right)^2}{\sum_{i=1}^{N} \left(y_t^i - y_m\right)^2} \tag{2}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(y_p^i - y_t^i\right)^2} \tag{3}$$

where $y_p^i$ is the predicted value of the output, $y_t^i$ is the true value of the output collected from the literature on experimental research, $y_m$ is the mean value of all output values, and N is the number of data samples in the training or testing datasets.
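The two metrics can be written out directly, assuming the standard definitions of R2 and RMSE in terms of the predicted values, the true values, and the mean of the true values. The sample numbers below are made up for illustration.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((p - t) ** 2 for p, t in zip(y_pred, y_true))
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error over N samples."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / n)


# Made-up efficiencies (%) for illustration only.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 37.0]
print(r2_score(y_true, y_pred), rmse(y_true, y_pred))
```

Because RMSE carries the units of the target, an RMSE reported for immobilization efficiency is itself in percentage points.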

ML-Based Feature Engineering and Performance Analysis with the Updated ML Model

To simplify the ML model and improve its performance, feature filtering was incorporated using ML-based feature importance and correlation. The Pearson correlation coefficient (PCC) was used to investigate the correlations between features, and hierarchical clustering was conducted based on the Pearson rank-order correlations. Features with high correlation were sorted into one cluster because they contained similar information, and one of these features was selected as a representative of the cluster for the ML model development. ML-based feature importance can reasonably determine the representative features of a cluster. The most important feature in a cluster was selected as the representative feature for model improvement by combining the feature importance and correlation. The details of PCC have been described in a previous study.[16] Moreover, the hierarchical clustering strategy can be summarized in three steps.[43] First, the lowest distance between elements was determined, which involved determining the two elements that were most similar to each other for clustering. Second, the two clustered elements were considered a new element for future clustering. Third, these two steps were repeated until a final cluster was obtained. The number of clusters was determined by selecting a distance threshold based on the application. Feature importance was investigated based on the developed RF model, which could interpret the roles of input features relative to the output variable.[20] The important and valuable features selected based on the feature engineering results were further applied to update the ML model that was optimized in the first round. Based on previous studies,[41,44] the hyperparameters remained the same, and only the input features were updated. 
Using the same hyperparameters as the previous ML model enabled comparison of the new and previous models to evaluate the feasibility of the feature selection method; this also reduced time and computational costs. The ML model was retrained using the same 85% training dataset to adapt to the new feature information by fixing the random_state of data splitting, and the updated ML model was evaluated with the same 15% test set after updating the features to avoid the participation of the test data in the model training process. If the prediction performance of the updated model improved, it implied that our feature selection process was helpful. However, if the prediction performance of the updated model decreased, the hyperparameters would have to be retuned. The final updated ML model was applied to explore the importance and impact of each feature on the target. Two types of feature analysis methods were applied to evaluate the feature importance and correlations to HM immobilization efficiency. One feature analysis result was directly determined through the final updated RF model. Both one-dimensional and two-dimensional (feature interaction) partial dependence plots were utilized to integrate the updated RF model and systematically express the correlation of each feature to the output variable. Another feature analysis was achieved using the Shapley additive explanation (SHAP) method, which is also widely used in feature analysis.[41,45] The marginal effect of each feature on the predicted output was determined using the ML model and the relevance between the input features and output variables (e.g., linear, monotonic, and even more complex relationships).[46]
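The feature-filtering idea described above (cluster correlated features, then keep the most important feature of each cluster) could be sketched as below. Several choices here are assumptions made for illustration: the distance is taken as 1 − |r|, average linkage is used, and the threshold of 0.5 is on that scale and is not comparable to the threshold of one reported in the study. The helper name, feature names, and importance values are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def select_representatives(X, importances, names, threshold=0.5):
    """Cluster correlated features; keep the most important one per cluster."""
    importances = np.asarray(importances, dtype=float)
    corr = np.corrcoef(X, rowvar=False)      # Pearson correlation matrix
    dist = 1.0 - np.abs(corr)                # assumed distance: 1 - |r|
    np.fill_diagonal(dist, 0.0)
    # Agglomerative clustering on the condensed distance matrix.
    Z = linkage(squareform(dist, checks=False), method="average")
    labels = fcluster(Z, t=threshold, criterion="distance")
    keep = []
    for lab in np.unique(labels):
        members = np.flatnonzero(labels == lab)
        best = members[np.argmax(importances[members])]
        keep.append(names[best])
    return keep


# Synthetic example: f0 and f1 are nearly collinear, f2 is independent.
rng = np.random.default_rng(0)
x0 = rng.normal(size=200)
X = np.column_stack([x0, x0 + 0.1 * rng.normal(size=200), rng.normal(size=200)])
names = ["f0", "f1", "f2"]
importances = [0.2, 0.5, 0.3]  # hypothetical RF importances
selected = select_representatives(X, importances, names)
print(sorted(selected))
```

Note that this sketch drops redundant features purely by cluster; the study additionally retained features that shared a cluster but belonged to different empirical categories.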

Results and Discussion

Dataset Compilation and Missing Data Imputation

Following a systematic literature review and data collection, 20 input variables were classified into empirical categories based on the domain knowledge. These included the pyrolysis temperature, biochar properties (pH and SA of biochar), biochar composition (C, H, N, O, and ash contents), atomic ratios (H/C, O/C, and [O + N]/C), operational conditions (biochar addition rate, experimental duration, and available HM concentration), and soil properties (soil pH and EC) (Figure 1 and Table S1). HM immobilization was considered as the output variable. For biochar SA, 35 of the 162 data points were missing; excluding these records would have reduced the dataset further. To ensure uniformity of the entire dataset and obtain the missing data points, three ML algorithms (RF, SVR, and NN) were developed (Figures 2a and S3a,b and Table S2) to derive the missing SA data using the pyrolysis temperature, biochar pH, biochar composition, and atomic ratios as inputs. Plots of the experimental versus the predicted SA data from the three ML models (Figures 2b and S3c,d) showed that all models exhibited good performance for SA prediction with a high test coefficient of determination (R2 = 0.98–0.99).
Figure 2

Results of (a) hyperparameter tuning, (b) prediction performance, and (c) feature importance from the random forest model in terms of the surface area (SA) prediction. The machine learning algorithms were established based on 131 data points including biochar properties (pH; C, H, O, N contents; ratios of H/C, O/C, [O + N]/C; and ash content) and biochar pyrolysis temperature.

Feature analysis was continued by utilizing the optimal RF model to predict the missing values and obtain the feature importance; Figure 2c shows the importance of each feature to biochar SA. The H/C atomic ratio was found to be the most important feature for SA prediction. This is a new finding, as no such direct relationship has previously been reported in the literature. The second most important feature for biochar SA was biochar pH, followed by the biochar pyrolysis temperature. Generally, the pyrolysis temperature showed a strong correlation with the SA and H/C atomic ratio of the biochar. A high pyrolysis temperature led to a higher SA and lower H/C atomic ratio in biochar.[47,48] Chen et al.[49] showed that missing data in a meta-analysis would increase the uncertainties of the results. In contrast, in the ML analysis, missing data could be imputed to avoid excluding records with missing values.[40]
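Model-based imputation of this kind can be sketched as follows: a regressor is trained on the records where the target property is known and then used to fill the gaps. The data below are synthetic stand-ins for the biochar descriptors and surface area, the helper name is hypothetical, and the RF settings are illustrative rather than the tuned values from the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def impute_with_model(X, target):
    """Fill NaNs in `target` using a model trained on the complete rows."""
    target = np.asarray(target, dtype=float)
    missing = np.isnan(target)
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X[~missing], target[~missing])
    filled = target.copy()
    filled[missing] = model.predict(X[missing])
    return filled, model


# Synthetic descriptors (e.g. temperature, pH, composition), not study data.
rng = np.random.default_rng(1)
X = rng.uniform(size=(162, 10))
sa = 100 * X[:, 0] + 20 * X[:, 1]                      # synthetic "surface area"
sa[rng.choice(162, size=35, replace=False)] = np.nan   # 35 gaps, as in the text
sa_filled, model = impute_with_model(X, sa)
print(int(np.isnan(sa).sum()), int(np.isnan(sa_filled).sum()))
```

Observed values are left untouched; only the missing entries receive predictions, so the imputed column inherits whatever error the imputation model has on unseen records.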

ML Model Development and Feature Analysis

After filling the data gaps, three ML algorithms (SVR, NN, and RF) were used to predict the HM immobilization efficiency of biochar-amended soils based on 20 input features (Table S1). The hyperparameters of each model were tuned during the training phase to minimize prediction errors based on fivefold cross-validation (Figure S4). Figure 3 presents the actual and predicted values of the HM immobilization efficiency for the three models. For the RF model, the training and testing R2 were 0.95 and 0.91, respectively, while the RMSE values were 7.35 and 10.54%, respectively. For the SVR and NN models, the testing R2 values were lower, at 0.88 and 0.80, respectively. These results indicated that the RF model with optimally tuned hyperparameters was the best-performing algorithm for predicting the HM immobilization efficiency.
Figure 3

Predictive performance of (a) support vector regression, (b) neural network models, (c) random forest, and (d) updated random forest to predict heavy metal immobilization efficiency in biochar-amended soils. In all, 152 data points were obtained for model development after imputing the missing data for the biochar surface area. RMSE = root-mean-square error.

Although the ML models trained on the preliminary dataset achieved satisfactory prediction accuracies (particularly RF), the less important features and the simultaneous use of many input features could weaken the generalization capacity of the model. Therefore, ML-based feature engineering was conducted to filter out the less important features and simplify the ML model for improved performance. Highly related input features were identified using the PCC (Figure S5) and hierarchical clustering (Figure 4a). The clustering and empirical categories were identified based on the domain knowledge, including pyrolysis temperature, biochar properties, operational conditions, soil properties, and heavy metal properties. Features within the same cluster that belonged to different empirical categories were not removed during the feature-filtering process.
Figure 4

Input feature analysis: (a) hierarchical clustering and (b) machine learning model-based feature importance from the random forest model. In all, 152 data points were used for model development. The clusters in panel (a) indicated by different colors and hierarchical levels obtained from the hierarchical clustering algorithm were based on the Pearson rank-order correlations from Figure S5. A distance threshold of one was selected to determine the similarity between features. This implied that the bottom branches (each feature had one bottom branch) from the same upper layer branch were identified as one cluster under this distance threshold. Determining similar features in the same cluster provided insights for postextraction of important features to simplify the model. Note: T °C: pyrolysis temperature; pH_BC: biochar pH; C%, H%, O%, and N%: C, H, O, and N contents of biochar, respectively; H/C, O/C, (O + N)/C: atomic ratios of biochar; ash %: ash content of biochar; SA: surface area of biochar; BC rate %: biochar application rate in soil; time: experimental duration; Avail. HM: available heavy metal content in soil; pH: soil pH; EC: soil electrical conductivity; MW: molecular weight of the heavy metal; electronegativity: electronegativity of the heavy metal; ionic radius: ionic radius of the heavy metal; and valency: valency of the heavy metal.

Figure 4a shows that (O + N)/C, O/C, and the O content were within one cluster with a threshold of one; these features represented the biochar properties, and two were eliminated to simplify the ML model. Based on the feature importance of the RF model, the O content was more important than the other two features (Figure 4b). Combining these results, (O + N)/C and O/C were removed as redundant features. Furthermore, the H/C ratio and H content, which represented biochar properties, were also identified in one cluster (Figure 4a). 
Therefore, the H/C ratio was eliminated from the features, as it was less important than the H content in terms of feature importance. The last cluster contained HM properties, including the mass, electronegativity, ionic radius, and valency, which were highly correlated to each other. This indicates that these features offered similar contributions to model training; therefore, the most important feature (i.e., electronegativity) to predict HM immobilization was retained for model development.

ML Model Update for Final Feature Exploration

The best-performing RF model was reconceptualized with a reduced set of 14 input features to obtain improved generalization ability and greater computational efficiency. The new RF model was trained using 85% of the dataset, and the remaining 15% was used to evaluate its prediction performance. Figure d presents the prediction accuracy of the updated RF model; the R2 values of the training and test dataset were 0.95 and 0.92, respectively. For the test dataset, the R2 value of the updated model was slightly higher than that of the preliminary model, and the RMSE value was smaller than that of the preliminary model. This slight increase in the testing prediction performance was reasonable as the excluded features might have been redundant in the original model, weakening its generalization ability and robustness.[44] This result implied that the feature selection method used in this study was feasible for simplifying the ML model and improving its robustness. Using the same optimal hyperparameters as the previous ML model could be an efficient method to save computational cost and time for model development. Notably, the prediction performance in this study was higher than other reported values from previous studies.[20,50,51] These results imply that the feature-filtering process, in which only 14 important features were identified, could adequately achieve a satisfactory ML model performance. To ensure that the prediction model is accessible to scientists and practitioners, a graphical user interface (GUI) web software was developed using Python (version 3.7) and the Flask (version 1.1.2) web framework (Figure S6). To further validate the model through the GUI using new data, we collected eight data points (Table S3) from published experimental studies. These experimental data points were independent of the initial dataset of 162 data points. 
Based on the validation results using the new data points, the developed model showed some prediction errors, most of which were lower than 30% (Figure S7). This level of prediction error was reasonable because our model was limited to the initial dataset, and some values or conditions (e.g., remediation time and soil pH) of the newly added experimental data points were outside the range of our original dataset (Tables S1 and S3), thus resulting in a prediction error <20%, which is an acceptable value.[20,52] Moreover, the optimal values for input variables can be obtained through the GUI for real-world biochar applications for HM immobilization in soils before implementation. In particular, the GUI uses the provided information to predict the HM immobilization performance of a specific biochar for a particular soil type. This may be achieved when the properties of biochar (elemental composition, SA, biochar pyrolysis temperature, and biochar pH), soil (pH and EC), and HM (available HM concentration and metal electronegativity) are determined using analytical instruments, and the amendment conditions (biochar rate and amendment time) are obtained based on the experimental design. Thus, this method can save time and cost in related research or engineering projects investigating the HM immobilization performance of biochar-amended soils. Based on the updated RF model, the final feature importance relative to the output HM immobilization efficiency was explored using both the RF explainer and SHAP methods (Figures 5 and S8). The ranking of important features from the two feature analysis methods showed similar results, particularly for the two most important features for predicting the immobilization efficiency, which were the N content in the biochar and biochar application rate (Figure 5a,b). The N content was positively correlated with HM immobilization within the 0.3–25.9% range (Figure S8). 
The presence of N-containing functional groups (e.g., −NH2, N–C=O, and C=N) on the biochar surface provides active sites for HM immobilization through strong covalent bonding, H bonding, chelation, and electrostatic attraction.[53,54] HMs, such as Cu, may be fixed by amino-modified biochar because of the increased −NH2 surface functional groups.[55] In addition, a biochar with a higher N content may have better adsorption properties than that with a lower N content. For example, N-doped biochar exhibited altered surface chemistry with a higher SA (418.7 m2·g–1) than that of pristine biochar (61.0 m2·g–1),[56] resulting in higher adsorption capacities for aqueous Cu2+ and Cd2+ of N-doped biochar compared to pristine biochar. In general, the O content and O-containing functional groups (e.g., carboxyl, hydroxyl, and phenolic) in biochar play a vital role in HM immobilization.[57] The presence of more O-containing functional groups in biochar increases the immobilization of HMs (i.e., Cd, Pb, and Cu) owing to the strong interactions between HMs and O-containing functional groups.[57−59] Nevertheless, the N content was more important than the O content of biochar for HM immobilization in this study (Figure 5b). A similar finding was reported by Igalavithana et al.,[60] who found that the Pb immobilization rate showed a higher correlation with the N content than with the O content of biochar. A possible reason for this difference could be the preferential adsorption of some HMs by the N-containing functional groups. Deng et al.[59] reported that N-containing functional groups (e.g., N–C=O) played a significant role in Pb removal, while C- or O-containing functional groups did not show a significant effect.
Figure 5

Updates of the random forest model based on the feature-filtering dataset with the final investigation of feature importance, where the relative importance of empirical categories was selected from the (a) random forest model and (b) Shapley additive explanation method, and (c) interactions among the top four features (N%, BC rate %, C%, and EC_soil) on the impact of HM immobilization. In all, 152 data points were used for model development. Note: T °C: pyrolysis temperature; pH_BC: biochar pH; C%, H%, O%, and N%: C, H, O, and N contents of biochar, respectively; ash %: ash content of biochar; SA: surface area of biochar; BC rate %: biochar application rate in soil; time: experimental duration; Avail. HM: available heavy metal content in soil; pH_soil: soil pH; EC_soil: soil electrical conductivity; and electronegativity: electronegativity of heavy metals.

The HM immobilization efficiency increased with the biochar application rate within a specific range (0.5–10%) based on the utilized dataset (Figure S8). Moreover, feature interaction indicated that biochar application rates higher than 4% with biochar N content higher than 5% could achieve a high HM immobilization efficiency (Figure 5c). Field experiments and incubation studies have also confirmed that the bioavailability of HMs may decrease with increasing biochar application rates.[61−63] Cui et al.[63] reported that the application of biochar at 5 and 15% (weight) to Cd-contaminated soil in an incubation experiment reduced the concentration of bioavailable Cd by 53.4 and 87.9%, respectively, compared with the concentration of bioavailable Cd in the control.
This might be due to the increase in functional groups and soil pH, which caused the formation of insoluble Cd precipitates with increasing biochar application rates.[63] Bioavailable Pb and Cd in biochar-amended soils decreased with increasing biochar application rates (0.0, 1.0, 2.5, 5.0, and 10.0%).[64] This was attributed to the increase in soil pH (from 6.17 to 7.17) and organic matter content (from 10.34 to 11.48%), which promoted the formation of less soluble Pb(OH)2 and Cd(OH)2 precipitates and the binding of Pb/Cd to Mn and Fe oxides.[64] The C contents in biochar and EC of soil were very important in predicting the HM immobilization efficiency, and they ranked third and fourth in terms of feature importance (Figure 5a,b). In general, the C content in biochar increases with the pyrolysis temperature, indicating that biochar that is produced at higher temperatures has more recalcitrant C, whereas biochar produced at lower temperatures has more labile C.[65,66] In particular, biochar with more recalcitrant C or higher aromaticity exhibits higher HM immobilization performance owing to its greater surface negativity.[12,60] However, higher HM immobilization was observed when the biochar C content was between 40 and 60%, particularly when the biochar application rate was below 7% (Figures S8 and 5c). Soil EC showed a strong positive correlation with HM immobilization in biochar-amended soils in this study, which was consistent with existing reports in the literature[27,60] (Figure S8). Furthermore, feature interaction revealed that the addition of biochar with a carbon content <60% at a biochar application rate >5% with soil EC higher than 0.5 could improve soil HM immobilization (Figure 5c).
The application of biochar to soil may also increase soil EC, particularly because of the release of exchangeable basic cations such as Ca2+, Mg2+, Na+, and K+,[11] which subsequently facilitates HM immobilization via ion exchange, complexation, and coprecipitation.[12,29,60] Although the pH values of biochar and soil were not considered very important for HM immobilization (Figure 5b), these features are known to be crucial for soil HM immobilization after biochar application.[11,29] Several meta-analyses have also emphasized that the pH of biochar and soil are crucial factors that influence bioavailable HM in soil.[15,67] Indeed, the partial dependence plot showed that soil HM immobilization was higher when the soil and biochar pH were approximately 8 and >7.5, respectively (Figure S8). At a higher soil pH, the deprotonation of acidic functional groups on the biochar surface demonstrated a preference for positively charged HMs, thereby enhancing HM immobilization.[68] Moreover, the release of K+, Na+, OH–, PO43–, and Cl– ions at higher pH facilitated the formation of stable compounds with HMs.[11] The formation of stable inner-sphere complexes[69] and precipitation of HMs as carbonates, phosphates, and hydroxides[29,70] are some of the additional mechanisms that facilitate HM immobilization in soils due to the increased pH from biochar application. Notably, the SA of biochar and the pyrolysis temperature did not play significant roles in HM immobilization. This is consistent with the preliminary analysis (statistical data analysis presented in the SI), which showed that a high immobilization efficiency was obtained for different ranges of the SA, and a high SA was not always required. Biochar applied to contaminated soil forms a complex system, and other factors may also interact with biochar SA to influence HM immobilization.
Some studies have reported that the SA of biochar increases with increasing pyrolysis temperature; in those studies, biochar with a higher SA adsorbed more HMs than that with a lower SA.[11,26] However, this phenomenon may change when the types of biochar, HM species, and experimental conditions vary. Son et al.[71] reported that the Cu2+ adsorption capacity of marine macroalgae biochar (SA = 0.39–0.49 m2·g–1) was higher than that of pinewood sawdust biochar (SA = 364.47 m2·g–1), even though the specific SA of the former was lower than that of the latter. Although increasing the pyrolysis temperature develops a microporous structure and increases the specific SA of biochar, it decreases the abundance of functional groups on the biochar surface.[72] For example, functional groups such as carboxyl, carbonyl, hydroxyl, and methoxyl on biochar tend to disappear as the pyrolysis temperature is increased above 450 °C.[73] Thus, biochar with a higher SA would have fewer functional groups, and hence, lower HM immobilization capacity. These results supplement the findings of this study, which demonstrate the reduced importance of SA and pyrolysis temperature in HM immobilization using biochar. An increase in ash content in the biochar increased HM immobilization (Figure S8). The ash in biochar consists of residual minerals (e.g., salts) that can supply cations to the soil after dissolution.[60,74] These cations increase the soil cation exchange capacity, thereby increasing the HM immobilization efficiency in soils.[60]

Based on the relative importance of each empirical category, “biochar properties” was found to be the most important category for HM immobilization, followed by the experimental conditions, soil properties, and HM properties (Figure 5a). The contribution of biochar properties to HM immobilization was 50% (Figure 5a).
Lakshmi et al.[75] reported similar findings, where “biochar properties” was the leading variable (53.8% contribution) for predicting HM sorption (in aqueous media) on biochar using an RF model. In our study, “experimental conditions” (which includes the available HM concentration in the control soil) was the second most important factor for HM immobilization, contributing 28.0% to the relative importance (Figure 5a). Similarly, in the aforementioned study,[75] the initial HM concentration was the second most important variable, contributing 30.6% to the relative importance.
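The category-level ranking above can be obtained by summing per-feature importances within each empirical category and normalizing. The sketch below illustrates that bookkeeping under stated assumptions: the feature-to-category mapping follows the Figure 5 legend and the text (available HM is counted under experimental conditions), whereas the numbers in `ILLUSTRATIVE_IMPORTANCE` are made-up placeholders, not values reported in the study, and the summing rule itself is an assumption about how such category shares are typically derived.

```python
# Feature-to-category mapping taken from the Figure 5 legend and the text.
CATEGORY_OF = {
    "T °C": "biochar properties",
    "pH_BC": "biochar properties",
    "C%": "biochar properties",
    "H%": "biochar properties",
    "O%": "biochar properties",
    "N%": "biochar properties",
    "ash %": "biochar properties",
    "SA": "biochar properties",
    "BC rate %": "experimental conditions",
    "time": "experimental conditions",
    "Avail. HM": "experimental conditions",
    "pH_soil": "soil properties",
    "EC_soil": "soil properties",
    "electronegativity": "HM properties",
}

# Placeholder per-feature importances (hypothetical, for illustration only).
ILLUSTRATIVE_IMPORTANCE = {
    "T °C": 0.03, "pH_BC": 0.04, "C%": 0.10, "H%": 0.04,
    "O%": 0.06, "N%": 0.20, "ash %": 0.05, "SA": 0.03,
    "BC rate %": 0.15, "time": 0.04, "Avail. HM": 0.06,
    "pH_soil": 0.04, "EC_soil": 0.08,
    "electronegativity": 0.08,
}


def category_importance(feature_importance):
    """Sum feature importances per category and return normalized shares, largest first."""
    totals = {}
    for feature, value in feature_importance.items():
        category = CATEGORY_OF[feature]
        totals[category] = totals.get(category, 0.0) + value
    grand_total = sum(totals.values())
    ordered = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return {category: value / grand_total for category, value in ordered}


ranked = category_importance(ILLUSTRATIVE_IMPORTANCE)
for category, share in ranked.items():
    print(f"{category}: {share:.1%}")
```

With real importances from a fitted model in place of the placeholders, the same function would yield the category shares that a plot such as Figure 5a summarizes.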

Environmental Implications

The optimum conditions for HM immobilization in soils containing biochar vary widely among studies. Examining all relevant parameters through simultaneous experimentation is challenging. To address this issue, an ML-based empirical approach was developed in this study, which could be used to predict HM immobilization efficiency in biochar-amended soils based on biochar, soil, and HM properties as well as experimental conditions. According to the findings of the newly developed RF model, the two most important features governing soil HM immobilization were the biochar application rate and N content (N%), which were positively correlated with HM immobilization capacity. Based on causal analysis, the importance of the involved features was in the following order: biochar properties > experimental conditions > soil properties > HM properties. Chen et al.[49] performed a meta-analysis and found that the HM bioavailability and plant uptake depend on factors such as HM speciation, soil properties, biochar characteristics, application rate, and plant type. Moreover, soil properties such as soil pH, organic matter content, and texture are the key variables that determine the concentration of HM uptake by plants.[49] Similarly, Rehman et al.[67] determined through a meta-analysis that edaphic factors, such as soil pH, texture, and plant species, affect HM adsorption and transformation in soils amended with biochar. Several other meta-analyses[15,76−78] have also reported the factors influencing HM immobilization in soil. However, none of these studies has identified the most important variable or the relative importance of each variable for HM immobilization. Compared with these conventional approaches, our ML model highlights the relative importance of each factor for HM immobilization. This facilitates the overall understanding of the process and realization of maximum HM immobilization efficiency in contaminated soils under various conditions. 
Furthermore, a GUI was developed to ensure that the prediction model is accessible to both scientists and practitioners. This online tool can predict the HM immobilization efficiency of a given biochar for a specific soil type using available data. Thus, the GUI can assist in obtaining optimum values of the input variables and in achieving maximum HM immobilization in soils prior to the implementation of a biochar-based remediation plan. The results of this study have some limitations owing to the quality and quantity of data collected from published papers. The data distribution of some input features and output targets was inconsistent owing to multiple variations in the research objectives, methodologies, and experimental conditions. For example, immobilization efficiency was determined based on a wide range of features such as bioavailability, bioaccessibility, exchangeable fraction, labile fraction, leaching, phytoavailability, and water-soluble fraction of HMs. Moreover, various extraction methods using diethylenetriaminepentaacetic acid, ethylenediaminetetraacetic acid, CaCl2, and NH4NO3 were applied to determine the available HM concentrations in soil fractions. These constraints may cause uncertainties in some of the prediction results and may not precisely reflect real-world scenarios. Therefore, future research should focus on improving the ML model using a database that includes studies with well-defined scientific objectives and similar methodologies under uniform experimental conditions. In particular, some biochar properties that are directly associated with HM immobilization were missing in this study due to the lack of data in the selected literature. For example, the evaluation of the surface functional groups (e.g., −OH, −COOH, C–C, C=C, C–O, C=O phenolic, alcoholic, and ether) of biochar on HM immobilization is more important than the evaluation of the elemental composition of biochar. 
Hence, future studies should utilize the surface chemistry data derived from X-ray photoelectron spectroscopy, X-ray diffraction analysis, and Fourier transform infrared spectroscopy when developing ML models for predicting the HM immobilization efficiency by biochar.
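Because the validation points that fell outside the range of the original dataset were the ones associated with larger errors, a practical safeguard before trusting a prediction is to check each input against the range covered by the training data. The sketch below is one hedged way to do this and is not part of the published GUI. The two ranges in `TRAINING_RANGE` are the ones quoted in the abstract (biochar N content 0.3–25.9% and application rate 0.5–10%); the function names, the absolute-percentage definition of prediction error, and the example numbers are assumptions.

```python
# Ranges quoted in the abstract; other features would need Tables S1 and S3.
TRAINING_RANGE = {
    "N%": (0.3, 25.9),
    "BC rate %": (0.5, 10.0),
}


def out_of_range(sample, ranges=TRAINING_RANGE):
    """Return the names of features whose values lie outside the training range."""
    flagged = []
    for name, (low, high) in ranges.items():
        if name in sample and not (low <= sample[name] <= high):
            flagged.append(name)
    return flagged


def percent_error(predicted, observed):
    """Absolute prediction error as a percentage of the observed value (assumed definition)."""
    if observed == 0:
        raise ValueError("observed value must be non-zero")
    return 100.0 * abs(predicted - observed) / abs(observed)


# Hypothetical inputs and efficiencies, for illustration only.
sample = {"N%": 30.0, "BC rate %": 5.0}
flagged = out_of_range(sample)
error = percent_error(predicted=45.0, observed=50.0)

print("extrapolated features:", flagged)
print(f"prediction error: {error:.1f}%")
```

A prediction with any flagged feature is an extrapolation, so its error is better judged against new measurements than against the <20% level observed for in-range data.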