Vimal Rathakrishnan, Salmia Bt Beddu, Ali Najah Ahmed.
Abstract
Predicting the compressive strength of concrete is a complicated process because concrete is a heterogeneous mixture with highly variable constituent materials. Researchers have predicted the compressive strength of concrete for various mixes using machine learning and deep learning models. In this research, the compressive strength of high-performance concrete with high-volume ground granulated blast-furnace slag (GGBS) replacement is predicted using boosting machine learning (BML) algorithms, namely Light Gradient Boosting Machine (LGBM), CatBoost Regressor (CATB), Gradient Boosting Regressor (GBR), AdaBoost Regressor (ADAB), and Extreme Gradient Boosting (XGB). Each BML model's performance is evaluated on prediction accuracy and prediction error rates, i.e., R2, MSE, RMSE, MAE, RMSLE, and MAPE. Additionally, the BML models were optimised with a Random Search (RS) algorithm and compared to the BML models with default hyperparameters. Comparing all five BML models, the GBR model shows the highest prediction accuracy, with an R2 of 0.96, and the lowest model error, with an MAE and RMSE of 2.73 and 3.40, respectively, on the test dataset. In conclusion, the GBR model is the best-performing BML model for predicting the compressive strength of concrete, with the highest prediction accuracy and lowest modelling error.
Year: 2022 PMID: 35680937 PMCID: PMC9184605 DOI: 10.1038/s41598-022-12890-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
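The six evaluation metrics named in the abstract (R2, MSE, RMSE, MAE, RMSLE, MAPE) can be computed in a few lines; the sketch below uses scikit-learn where possible and derives RMSLE manually. The `evaluate` helper and its name are illustrative, not from the paper.

```python
import numpy as np
from sklearn.metrics import (mean_absolute_error, mean_absolute_percentage_error,
                             mean_squared_error, r2_score)

def evaluate(y_true, y_pred):
    """Return the six metrics used in the paper for one set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = mean_squared_error(y_true, y_pred)
    # RMSLE: root mean squared error on log1p-transformed values.
    rmsle = np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2))
    return {
        "R2": r2_score(y_true, y_pred),
        "MAE": mean_absolute_error(y_true, y_pred),
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "RMSLE": rmsle,
        "MAPE": mean_absolute_percentage_error(y_true, y_pred),
    }
```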
Figure 1. Artificial Intelligence sub-classes.
Summary of previous studies on concrete strength prediction.
| No | Type of Concrete | Model | Dataset size | Year | Reference |
|---|---|---|---|---|---|
| 1 | Fly-ash based concrete | Decision tree, ensemble bagging, gene expression programming | 270 | 2021 | – |
| 2 | High-performance concrete from industrial wastes | Decision tree, random forest, support vector, artificial neural network, multiple linear regression, ensemble bagging & boosting | 1030 | 2021 | – |
| 3 | Self-compacting concrete with fly ash | Data envelopment analysis | 114 | 2021 | – |
| 4 | Steel fibre-reinforced concrete | Boosting- and tree-based models, K-nearest neighbour, linear, ridge, lasso regressor, support vector regressor, multilayer perceptron models | 220 | 2021 | – |
| 5 | Self-compacting concrete with high-volume fly ash | Support vector machine | 337 | 2020 | – |
| 6 | High-performance concrete | Multivariate adaptive regression splines, kernel ridge regression, gradient boosting machines, Gaussian process regression | 1030 | 2020 | – |
| 7 | High-strength concrete | Gene expression programming | 357 | 2020 | – |
| 8 | Ultra-high-performance concrete | Artificial neural network: Sequential Feature Selection (SFS) and Neural Interpretation Diagram (NID) | 110 | 2020 | – |
| 9 | Alkali-activated concrete | Random forest | 180 | 2020 | – |
| 10 | Ordinary concrete | Extreme gradient boosting | 1030 | 2020 | – |
| 11 | Self-compacting concrete | Artificial neural network | 205 | 2019 | – |
| 12 | Self-compacting concrete with fly ash | Enhanced multiclass support vector machine and fuzzy rule | 114 | 2019 | – |
| 13 | Lightweight self-compacting concrete | Random forest regression | 131 | 2019 | – |
| 14 | High-performance concrete | Artificial neural network: modified firefly algorithm | 1133 | 2018 | – |
| 15 | High-performance concrete | Support vector machine, enhanced cat swarm optimisation | 2200 | 2018 | – |
| 16 | Lightweight aggregate concretes | Extreme learning machine regressor, particle swarm optimization | 75 | 2018 | – |
| 17 | Self-compacting concrete containing fly ash | Decision tree algorithms: M5′ and multivariate adaptive regression splines | 114 | 2018 | – |
Figure 2. The evolution of XGBoost.
Figure 3. Step-by-step BML modelling approach.
Summary of statistical analysis of the concrete material composition.
| | Fine | Coarse | GGBS | OPC | SF | Water | Admixture | Fine MC | Coarse MC | Days | Strength |
|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 152 | 152 | 152 | 152 | 152 | 152 | 152 | 152 | 152 | 152 | 152 |
| mean | 871.7 | 874.3 | 246.6 | 246.3 | 54.4 | 138.7 | 12.2 | 4.4 | 0.5 | 45.5 | 105.2 |
| std | 11.7 | 10.2 | 1.4 | 0.9 | 0.6 | 2.9 | 0.5 | 0.7 | 0.2 | 31.6 | 14.6 |
| min | 842.0 | 857.0 | 244.0 | 244.0 | 53.0 | 135.0 | 11.5 | 3.2 | 0.2 | 7.0 | 70.3 |
| max | 900.0 | 904.0 | 250.0 | 248.0 | 56.0 | 149.0 | 12.7 | 6.0 | 1.0 | 91.0 | 131.4 |
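A statistical summary of this form can be reproduced with pandas' `describe()`. The sketch below uses a small hypothetical mix-design frame; the values are placeholders, not the paper's 152-sample dataset.

```python
import pandas as pd

# Hypothetical mix-design records; column names follow the summary table above,
# but the values are placeholders only.
df = pd.DataFrame({
    "GGBS": [244.0, 246.0, 250.0, 247.0],
    "OPC": [244.0, 246.0, 248.0, 247.0],
    "Strength": [70.3, 101.5, 131.4, 110.0],
})

# describe() already yields count/mean/std/min/max rows; select and round them
# to mirror the table's layout.
summary = df.describe().loc[["count", "mean", "std", "min", "max"]].round(1)
```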
Figure 4. Distribution correlation of input parameters and strength.
Figure 5. Pearson's correlation heatmap.
Figure 6. K-fold cross-validation method.
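A Pearson's correlation matrix of the kind visualised in the heatmap can be obtained with pandas' `DataFrame.corr`. The data below are synthetic, with `Strength` deliberately constructed to rise with curing `Days` so that the two columns correlate strongly.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data: Strength grows roughly linearly with Days.
rng = np.random.default_rng(0)
days = rng.uniform(7, 91, 100)
df = pd.DataFrame({
    "Days": days,
    "Strength": 70 + 0.5 * days + rng.normal(0, 2, 100),
})

corr = df.corr(method="pearson")  # Pearson correlation matrix
# A heatmap can then be drawn, e.g. seaborn.heatmap(corr, annot=True).
```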
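The K-fold procedure is available in scikit-learn; the sketch below runs 10-fold cross-validation of a Gradient Boosting Regressor on synthetic stand-in data (the feature matrix and target are illustrative, not the paper's dataset).

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the mix-design features and strength target.
rng = np.random.default_rng(42)
X = rng.uniform(size=(120, 4))
y = X @ np.array([10.0, 5.0, 2.0, 1.0]) + 70.0

# 10-fold cross-validation: each fold serves once as the held-out test set.
cv = KFold(n_splits=10, shuffle=True, random_state=42)
scores = cross_val_score(GradientBoostingRegressor(random_state=42),
                         X, y, cv=cv, scoring="r2")
```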
Summary of initial modelling.
| Dataset | Model | R2 | MAE | MSE | RMSE | RMSLE | MAPE |
|---|---|---|---|---|---|---|---|
| Training | LGBM | 0.86 | 3.60 | 14.92 | 3.86 | 0.03 | 0.03 |
| Training | CATB | 0.85 | 3.61 | 21.80 | 4.67 | 0.05 | 0.04 |
| Training | GBR | 0.83 | 4.02 | 22.32 | 4.72 | 0.05 | 0.04 |
| Training | ADAB | 0.81 | 4.20 | 26.59 | 5.16 | 0.05 | 0.04 |
| Training | XGB | 0.81 | 3.95 | 26.87 | 5.18 | 0.05 | 0.04 |
| Test | LGBM | 0.94 | 3.29 | 16.80 | 4.10 | 0.04 | 0.03 |
| Test | CATB | 0.89 | 3.97 | 29.10 | 5.39 | 0.06 | 0.04 |
| Test | GBR | 0.93 | 3.24 | 17.82 | 4.22 | 0.05 | 0.03 |
| Test | ADAB | 0.89 | 4.17 | 28.46 | 5.33 | 0.06 | 0.04 |
| Test | XGB | 0.92 | 3.64 | 21.86 | 4.68 | 0.05 | 0.04 |
Figure 7. Best fit line for prediction distribution (RS model).
Summary of hyperparameter tuned values.
| | Hyperparameter | LGBM | CATB | GBR | ADAB | XGB |
|---|---|---|---|---|---|---|
| Default value | n_estimators | 100 | 100 | 1000 | 50 | 100 |
| | learning_rate | 0.10 | 0.10 | 0.03 | 1.00 | 0.03 |
| | max_depth | −1 | 3 | 6 | – | 6 |
| | subsample | 1.00 | 1.00 | 0.80 | – | 1.00 |
| Optimised value | n_estimators | 270 | 90 | 210 | 290 | 100 |
| | learning_rate | 0.20 | 0.30 | 0.15 | 0.40 | 0.30 |
| | max_depth | −1 | 2 | 2 | – | 6 |
| | subsample | 1.00 | 0.80 | 0.65 | – | 1.00 |
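Random Search tuning over these hyperparameters can be sketched with scikit-learn's `RandomizedSearchCV`. The search ranges below are loosely based on the table (e.g. `n_estimators` between 50 and 300) and the data are synthetic, so the selected values will not match the paper's.

```python
import numpy as np
from scipy.stats import randint, uniform
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data for the mix-design features and strength target.
rng = np.random.default_rng(0)
X = rng.uniform(size=(150, 4))
y = X @ np.array([8.0, 4.0, 2.0, 1.0]) + 70.0

# Distributions for the four hyperparameters listed in the table.
param_dist = {
    "n_estimators": randint(50, 300),
    "learning_rate": uniform(0.01, 0.4),   # samples from [0.01, 0.41]
    "max_depth": randint(2, 7),
    "subsample": uniform(0.6, 0.4),        # samples from [0.6, 1.0]
}

search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_distributions=param_dist,
    n_iter=20, cv=5, scoring="r2", random_state=0,
)
search.fit(X, y)  # best_params_ then holds the winning combination
```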
Summary of RS optimised models.
| Dataset | Model | R2 | MAE | MSE | RMSE | RMSLE | MAPE |
|---|---|---|---|---|---|---|---|
| Training | LGBM | 0.88 | 3.27 | 16.22 | 4.03 | 0.04 | 0.03 |
| Training | CATB | 0.89 | 3.15 | 14.85 | 3.85 | 0.04 | 0.03 |
| Training | GBR | 0.88 | 3.26 | 16.75 | 4.09 | 0.04 | 0.03 |
| Training | ADAB | 0.83 | 4.00 | 24.50 | 4.95 | 0.05 | 0.04 |
| Training | XGB | 0.88 | 3.23 | 16.50 | 4.06 | 0.04 | 0.03 |
| Test | LGBM | 0.95 | 2.88 | 12.79 | 3.58 | 0.04 | 0.03 |
| Test | CATB | 0.95 | 2.98 | 13.30 | 3.65 | 0.04 | 0.03 |
| Test | GBR | 0.96 | 2.73 | 11.53 | 3.40 | 0.03 | 0.03 |
| Test | ADAB | 0.90 | 3.98 | 26.11 | 5.11 | 0.05 | 0.04 |
| Test | XGB | 0.94 | 3.14 | 15.20 | 3.90 | 0.04 | 0.03 |
Figure 8. Comparison between BML and RS optimised models.
Figure 9. Best fit line for prediction distribution (RS model).
Figure 10. Feature importance analysis of BML models.
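Boosting models expose the importances behind a plot like Figure 10 via the fitted estimator's `feature_importances_` attribute. In the synthetic sketch below, the feature names are illustrative and the target is deliberately constructed so that `Days` dominates.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic data in which "Days" drives most of the target's variation.
rng = np.random.default_rng(1)
names = ["GGBS", "OPC", "Water", "Days"]  # illustrative feature names
X = rng.uniform(size=(200, 4))
y = 70.0 + 25.0 * X[:, 3] + 5.0 * X[:, 0] + rng.normal(0, 1, 200)

model = GradientBoostingRegressor(random_state=1).fit(X, y)

# feature_importances_ sums to 1; sort to rank features by contribution.
ranking = sorted(zip(names, model.feature_importances_),
                 key=lambda t: -t[1])
```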
Summary of comparison between various ML models.
| Model | R2 | MAE | MSE | RMSE | RMSLE | MAPE |
|---|---|---|---|---|---|---|
| Extra Trees | 0.78 | 4.17 | 25.41 | 5.04 | 0.05 | 0.04 |
| Random Forest | 0.78 | 4.17 | 26.07 | 5.11 | 0.05 | 0.04 |
| K Neighbours | 0.72 | 4.90 | 35.09 | 5.92 | 0.05 | 0.05 |
| Ridge | 0.66 | 5.19 | 41.85 | 6.47 | 0.06 | 0.05 |
| Least Angle | 0.66 | 5.22 | 42.16 | 6.49 | 0.06 | 0.05 |
| Linear | 0.65 | 5.22 | 42.16 | 6.49 | 0.06 | 0.05 |
| Elastic Net | 0.65 | 5.41 | 45.79 | 6.77 | 0.06 | 0.05 |
| Huber | 0.64 | 5.30 | 44.55 | 6.67 | 0.06 | 0.05 |
| Bayesian Ridge | 0.64 | 5.54 | 47.42 | 6.89 | 0.07 | 0.05 |
| Lasso | 0.60 | 5.59 | 48.36 | 6.95 | 0.07 | 0.06 |
| Decision Tree | 0.59 | 5.59 | 45.66 | 6.76 | 0.07 | 0.05 |
| Orthogonal Matching Pursuit | 0.59 | 6.05 | 56.49 | 7.52 | 0.07 | 0.06 |
| Passive Aggressive | 0.25 | 7.80 | 96.13 | 9.80 | 0.09 | 0.08 |
| Lasso Least Angle | 0.09 | 9.84 | 153.55 | 12.39 | 0.12 | 0.10 |
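A comparison table of this kind can be produced by cross-validating a list of candidate regressors in a loop. The sketch below covers four of the listed models on synthetic data; the scores will not match the table's.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data for the mix-design features and strength target.
rng = np.random.default_rng(7)
X = rng.uniform(size=(150, 4))
y = X @ np.array([8.0, 4.0, 2.0, 1.0]) + 70.0 + rng.normal(0, 0.5, 150)

models = {
    "Linear": LinearRegression(),
    "Ridge": Ridge(),
    "Lasso": Lasso(),
    "Decision Tree": DecisionTreeRegressor(random_state=7),
}

# Mean cross-validated R2 per model, mirroring the table's first two columns.
cv = KFold(n_splits=5, shuffle=True, random_state=7)
results = {name: cross_val_score(m, X, y, cv=cv, scoring="r2").mean()
           for name, m in models.items()}
```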