Abstract
Over the past few years, solar power has grown significantly in popularity as a renewable energy source. In the context of electricity generation, solar power offers clean and accessible energy, as it is not associated with global warming or pollution. The main challenge of solar power is its uncontrollable fluctuation, since output depends heavily on weather variables. Forecasting energy generation is therefore important for smart grid operators and solar electricity providers, who must ensure power continuity in order to dispatch energy and prepare to store it. In this study, we propose an efficient comparison framework for forecasting, 36 h in advance, the solar power that will be generated at the Yeongam solar power plant in South Jeolla Province, South Korea. The results show a comparative analysis of state-of-the-art techniques for solar power generation.
Keywords: data mining; deep neural networks; forecasting solar power generation; machine learning; weather sensors
Year: 2020 PMID: 32492923 PMCID: PMC7308868 DOI: 10.3390/s20113129
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Proposed framework for the analysis and comparison of the machine learning techniques for photovoltaic prediction.
Figure 2. Solar power generation per day at Yeongam solar power plant from January 2013 to December 2015.
Solar elevation and weather forecast variables.
| Category | Variable Name | Classification | Description (Unit) |
|---|---|---|---|
| Solar elevation (1) | Elevation | Continuous | Degrees (0°–76°) |
| Weather forecast | Humidity | Continuous | (%) |
| | PrecipitationProb | Continuous | (%) |
| | PrecipitationCat | Categorical | 0: none, 1: rain, 2: sleet, 3: snow |
| | SkyType | Categorical | 0: clear sky, 1: slightly cloudy, |
| | Temperature | Continuous | Celsius (°C) |
| | WindDirection | Categorical | N: 315°–45°, E: 45°–135°, |
| | WindSpeed | Continuous | (m/s) |
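The WindDirection variable is categorical, with sectors defined by degree ranges (N: 315°–45°, E: 45°–135°, and, in the later feature tables, S: 135°–225° and W: 225°–315°). A minimal sketch of that binning is shown here; the function name `wind_sector` and the treatment of boundary angles (each sector includes its lower bound and excludes its upper bound) are assumptions, since the record does not say how boundaries were handled.

```python
def wind_sector(degrees: float) -> str:
    """Map a wind direction in degrees to one of four compass sectors.

    Sector ranges follow the tables in this record. Boundary handling
    (lower bound inclusive, upper bound exclusive) is an assumption.
    """
    d = degrees % 360.0  # normalise so that 360 and negative angles wrap around
    if d >= 315.0 or d < 45.0:
        return "N"
    if d < 135.0:
        return "E"
    if d < 225.0:
        return "S"
    return "W"


if __name__ == "__main__":
    for angle in (0, 90, 180, 270, 350):
        print(angle, wind_sector(angle))
```

The north sector wraps around 0°, which is why it needs the `or` test rather than a single range comparison.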
Weather observation variables.
| Category | Variable Name | Classification | Description (Unit) |
|---|---|---|---|
| Weather observation | AirTemperature | Continuous | (°C) |
| | AtmosPressure | Continuous | (hPa) |
| | DewPointTemperature | Continuous | (°C) |
| | GroundTemperature | Continuous | (°C) |
| | Humidity | Continuous | (%) |
| | Precipitation | Continuous | (mm) |
| | SeaLevelPressure | Continuous | (hPa) |
| | SolarRadiation | Continuous | (MJ/m²) |
| | SunlightTime | Continuous | (hr) |
| | VaporPressure | Continuous | (hPa) |
| | WindDirection | Categorical | N: 315°–45°, E: 45°–135°, |
| | WindSpeed | Continuous | (m/s) |
| | 5cmDownTemperature | Continuous | Temperature below the ground surface (°C) |
| | 10cmDownTemperature | Continuous | Temperature below the ground surface (°C) |
| | 20cmDownTemperature | Continuous | Temperature below the ground surface (°C) |
| | 30cmDownTemperature | Continuous | Temperature below the ground surface (°C) |
Machine learning methods.
| Single Regression | Ensemble (Bagging) | Ensemble (Boosting) |
|---|---|---|
| Linear regression | Bagging | AdaBoost |
| Huber | Random forest | Gradient boosting |
| Ridge | Extra trees | CatBoost |
| Lasso | | XGBoost |
| Elastic Net | | |
| Decision tree | | |
| k-NN | | |
| SVR | | |
Hyperparameter tuning using grid search for the machine learning algorithms.
| Input Data | Model Group | Prediction Model | Evaluated Hyperparameters |
|---|---|---|---|
| Observational data | Single regression models | Linear regression | N/A |
| | | Huber | α = {0.0001, 0.001, 0.01, |
| | | Ridge | α = {0.01, 0.02, 0.05, 0.1, 0.2, 0.3, 0.5, |
| | | Lasso | α = {0.0001, 0.001, 0.01, |
| | | Elastic net | α = { |
| | | Decision tree | max_depth = {1, 2, |
| | | k-NN | |
| | | SVR | |
| | Ensemble models | Bagging | num_estimators = {5, 10, 15, 20, 40, |
| | | Random forest | max_depth = {2, 5, |
| | | Extra trees | max_depth = {2, 5, |
| | | AdaBoost | N/A |
| | Ensemble models | Gradient boosting | max_depth = {2, |
| | | CatBoost | iterations = { |
| | | XGBoost | num_estimators = {5, 10, 15, 20, 40, |
| Forecast data | Single regression models | Linear regression | N/A |
| | | Huber | α = {0.0001, 0.001, |
| | | Ridge | α = {0.01, 0.02, 0.05, 0.1, 0.2, 0.3, 0.5, 0.8, 1.0, |
| | | Lasso | α = {0.0001, 0.001, 0.01, 0.1, |
| | | Elastic net | α = {0.0001, |
| | | Decision tree | max_depth = {1, 2, 3, 4, 5, |
| | | k-NN | |
| | | SVR | |
| | Ensemble models | Bagging | num_estimators = {5, 10, 15, 20, 40, |
| | | Random forest | max_depth = {2, 5, |
| | | Extra trees | max_depth = {2, 5, 7, |
| | | AdaBoost | N/A |
| | Ensemble models | Gradient boosting | max_depth = {2, |
| | | CatBoost | iterations = { |
| | | XGBoost | num_estimators = {5, 10, 15, 20, 40, |
| Forecast and observational data | Single regression models | Linear regression | N/A |
| | | Huber | α = {0.0001, 0.001, 0.01, 0.1, 1, |
| | | Ridge | α = {0.01, 0.02, 0.05, 0.1, 0.2, 0.3, 0.5, 0.8, |
| | | Lasso | α = {0.0001, 0.001, 0.01, 0.1, |
| | | Elastic net | α = { |
| | | Decision tree | max_depth = {1, 2, 3, 4, |
| | | k-NN | |
| | | SVR | |
| | Ensemble models | Bagging | num_estimators = {5, 10, 15, 20, 40, |
| | | Random forest | max_depth = {2, 5, |
| | | Extra trees | max_depth = {2, 5, |
| | | AdaBoost | N/A |
| | Ensemble models | Gradient boosting | max_depth = {2, 5, |
| | | CatBoost | iterations = { |
| | | XGBoost | num_estimators = {5, 10, 15, 20, 40, |
The best values are in underlined boldface.
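Grid search, as used for the table above, evaluates every candidate hyperparameter value and keeps the one with the best validation score. The following is a minimal sketch of that loop, not the study's implementation: it swaps in a one-feature ridge regression without intercept so that it runs on the standard library alone, the helper names (`fit_ridge_1d`, `grid_search`) and the toy data are hypothetical, and selecting by validation RMSE is an assumption. The candidate values are the visible part of the Ridge grid for observational data, which is truncated in this record.

```python
import math


def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge slope for one feature, no intercept (illustrative stand-in)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def grid_search(train, valid, alphas):
    """Try each alpha, score on the validation set, return (best_alpha, best_rmse)."""
    best = None
    for alpha in alphas:
        slope = fit_ridge_1d(train[0], train[1], alpha)
        score = rmse(valid[1], [slope * x for x in valid[0]])
        if best is None or score < best[1]:
            best = (alpha, score)
    return best


# Visible prefix of the Ridge grid in the table (the remainder is truncated there).
RIDGE_ALPHAS = [0.01, 0.02, 0.05, 0.1, 0.2, 0.3, 0.5]

if __name__ == "__main__":
    # Hypothetical toy data, not taken from the Yeongam data set.
    train = ([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
    valid = ([5.0, 6.0], [9.75, 11.7])
    print(grid_search(train, valid, RIDGE_ALPHAS))
```

With several hyperparameters per model (for example max_depth together with a number of estimators), the same loop would run over the Cartesian product of the candidate lists.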
Best values for the evaluated hyperparameter tuning from grid search.
| Model Group | Prediction Model | Observation Weather | Forecast Weather | Forecast and Observation Weather |
|---|---|---|---|---|
| Single regression models | Linear regression | N/A | N/A | N/A |
| | Huber | α = { | α = { | α = { |
| | Ridge | α = { | α = { | α = { |
| | Lasso | α = { | α = { | α = { |
| | Elastic net | α = { | α = { | α = { |
| | Decision tree | max_depth = { | max_depth = { | max_depth = { |
| | k-NN | | | |
| | SVR | | | |
| Ensemble models (bagging) | Bagging | num_estimators = { | num_estimators = { | num_estimators = { |
| | Random forest | max_depth = { | max_depth = { | max_depth = { |
| | Extra trees | max_depth = { | max_depth = { | max_depth = { |
| | AdaBoost | N/A | N/A | N/A |
| Ensemble models (boosting) | Gradient boosting | max_depth = { | max_depth = { | max_depth = { |
| | CatBoost | iterations = { | iterations = { | iterations = { |
| | XGBoost | num_estimators = { | num_estimators = { | num_estimators = { |
Figure 3. Comparative algorithm analysis using the test set over weather data. (a) RMSE, (b) MAE, (c) R².
Performances of solar power prediction models.
| Input Data | Model Group | Prediction Model | Hyperparameters | Validation RMSE | Validation MAE | Validation R² | Test RMSE | Test MAE | Test R² |
|---|---|---|---|---|---|---|---|---|---|
| Observational data | Single regression models | Linear regression | N/A | 735.04 | 547.74 | 59.8% | 707.08 | 520.44 | 59.7% |
| | | Huber | α = 0.1 | 746.26 | 554.02 | 58.6% | 717.63 | 526.67 | 58.5% |
| | | Ridge | α = 0.8 | 735.06 | 547.32 | 59.8% | 706.76 | 519.81 | 59.8% |
| | | Lasso | α = 0.1 | 734.49 | 545.77 | 59.9% | 707.61 | 520.03 | 59.7% |
| | | Elastic net | α = 0.0001, | 734.66 | 541.45 | 59.9% | 709.04 | 523.10 | 59.5% |
| | | Decision tree | max_depth = 3 | 685.75 | 478.41 | 65.1% | 679.07 | 461.94 | 62.9% |
| | | k-NN | | 683.80 | 475.89 | 65.3% | 676.44 | 459.42 | 63.1% |
| | | SVR | | 741.87 | 546.58 | 59.1% | 712.95 | 519.84 | 59.1% |
| | Ensemble models | Bagging | num_estimators = 80 | 685.02 | 466.09 | 65.1% | 680.51 | 448.63 | 62.7% |
| | | Random forest | max_depth = 7, | 669.35 | 466.51 | 66.7% | 667.26 | 451.78 | 64.1% |
| | | Extra trees | max_depth = 7, | 684.45 | 478.97 | 65.2% | 661.29 | 451.48 | 64.8% |
| | | AdaBoost | N/A | 707.59 | 507.44 | 62.8% | 689.47 | 483.75 | 61.7% |
| | Ensemble models | Gradient boosting | max_depth = 5, | 673.19 | 473.28 | 66.3% | 655.07 | 445.37 | 65.4% |
| | | CatBoost | depth = 3, | 690.14 | 497.52 | 64.6% | 670.36 | 470.36 | 63.8% |
| | | XGBoost | num_estimators = 80 | 681.32 | 474.58 | 65.5% | 650.36 | 440.67 | 65.9% |
| Forecast data | Single regression models | Linear regression | N/A | 657.30 | 509.97 | 67.9% | 634.18 | 486.35 | 67.6% |
| | | Huber | α = 0.01 | 671.02 | 505.89 | 66.5% | 638.68 | 475.59 | 67.1% |
| | | Ridge | α = 2.0 | 658.26 | 510.17 | 67.8% | 633.29 | 485.14 | 67.7% |
| | | Lasso | α = 1.0 | 657.79 | 509.97 | 67.8% | 633.44 | 485.52 | 67.7% |
| | | Elastic net | α = 0.001, | 658.03 | 510.02 | 67.8% | 633.46 | 485.34 | 67.7% |
| | | Decision tree | max_depth = 6 | 576.69 | 376.62 | 75.3% | 557.41 | 346.96 | 75.0% |
| | | k-NN | | 537.32 | 358.39 | 78.5% | 529.37 | 334.78 | 77.4% |
| | | SVR | | 678.00 | 507.18 | 65.8% | 640.00 | 471.79 | 67.0% |
| | Ensemble models | Bagging | num_estimators = 80 | 527.20 | 337.95 | 79.3% | 519.11 | 318.74 | 78.3% |
| | | Random forest | max_depth = 7, | 518.98 | 335.68 | 80.0% | 510.96 | 317.40 | 79.0% |
| | | Extra trees | max_depth = 9, | 537.02 | 351.18 | 78.6% | 517.46 | 319.89 | 78.4% |
| | | AdaBoost | N/A | 595.58 | 430.29 | 73.6% | 582.24 | 400.90 | 72.7% |
| | Ensemble models | Gradient boosting | max_depth = 5, | 514.93 | 344.69 | 80.3% | 511.35 | 320.56 | 78.9% |
| | | CatBoost | depth = 3, | 543.21 | 372.91 | 78.1% | 542.67 | 353.85 | 76.3% |
| | | XGBoost | num_estimators = 80 | 525.21 | 349.87 | 79.5% | 509.44 | 326.25 | 79.1% |
| Forecast and observational data | Single regression models | Linear regression | N/A | 637.29 | 496.72 | 69.8% | 620.31 | 480.03 | 69.0% |
| | | Huber | α = 10 | 670.20 | 500.31 | 66.6% | 634.42 | 466.65 | 67.6% |
| | | Ridge | α = 1.0 | 637.95 | 496.44 | 69.8% | 619.79 | 478.94 | 69.1% |
| | | Lasso | α = 1.0 | 638.54 | 495.73 | 69.7% | 620.35 | 478.54 | 69.0% |
| | | Elastic net | α = 0.0001, | 639.79 | 494.39 | 69.6% | 618.82 | 474.22 | 69.2% |
| | | Decision tree | max_depth = 5 | 531.96 | 355.22 | 79.0% | 552.80 | 352.34 | 75.4% |
| | | k-NN | | 548.95 | 368.00 | 77.6% | 533.59 | 340.86 | 77.1% |
| | | SVR | | 660.39 | 487.39 | 67.6% | 626.68 | 455.21 | 68.4% |
| | Ensemble models | Bagging | num_estimators = 80 | 505.33 | 323.03 | 81.0% | 506.64 | 321.09 | 79.3% |
| | | Random forest | max_depth = 7, | 503.64 | 330.35 | 81.2% | 511.29 | 325.00 | 78.9% |
| | | Extra trees | max_depth = 7, | 512.54 | 340.11 | 80.5% | 508.68 | 321.05 | 79.2% |
| | | AdaBoost | N/A | 622.54 | 487.11 | 71.2% | 608.63 | 466.95 | 70.2% |
| | Ensemble models | Gradient boosting | max_depth = 7, | 501.84 | 336.85 | 81.3% | 504.33 | 321.25 | 79.5% |
| | | CatBoost | depth = 3, | 521.38 | 365.95 | 79.8% | 529.27 | 349.92 | 77.4% |
| | | XGBoost | num_estimators = 80 | 518.16 | 340.40 | 80.0% | 493.85 | 317.70 | 80.4% |
The best values are in underlined boldface.
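The three scores reported for each model (RMSE, MAE and R²) can be computed from paired true and predicted values as in the sketch below. It uses the standard definitions; the function name `regression_metrics` is hypothetical, and R² is returned as a fraction, whereas the table above reports it as a percentage.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


if __name__ == "__main__":
    # Hypothetical values, only to show the call.
    print(regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0]))
```

RMSE penalises large errors more heavily than MAE, which is one reason the two can rank models differently, as they do for several rows in the table.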
Comparative statistics of prediction models based on observation weather information using 10-fold cross-validation.
| Model Group | Prediction Model | RMSE Mean | RMSE STD | MAE Mean | MAE STD | R² Mean | R² STD |
|---|---|---|---|---|---|---|---|
| Single regression models | Linear regression | 737.22 | 32.76 | 549.73 | 30.44 | 0.5656 | 0.0467 |
| | Huber | 744.82 | 32.95 | 554.09 | 29.9 | 0.5557 | 0.0561 |
| | Ridge | 737.15 | 32.65 | 549.69 | 30.95 | 0.5656 | 0.0466 |
| | Lasso | 737.49 | 32.75 | 549.37 | 31.48 | 0.5653 | 0.0465 |
| | Elastic net | 737.9 | 29.79 | 548.56 | 28.3 | 0.5648 | 0.0449 |
| | Decision tree | 694.24 | 35.83 | 478.77 | 34.68 | 0.616 | 0.0312 |
| | k-NN | 706.65 | 42.12 | 483.64 | 36.71 | 0.6027 | 0.0299 |
| | SVR | 740.4 | 32.1 | 547.15 | 29.98 | 0.5615 | 0.0497 |
| Ensemble models | Bagging | 705.42 | 34.71 | 470.01 | 33.18 | 0.6106 | 0.0275 |
| | Random forest | 685.97 | 33.71 | 471.31 | 33.8 | 0.6251 | 0.029 |
| | Extra trees | 686.43 | 32.87 | 472.94 | 34.51 | 0.6245 | 0.0295 |
| | AdaBoost | 710.19 | 43.01 | 506 | 38.73 | 0.6047 | 0.0266 |
| Ensemble models | Gradient boosting | 680.65 | 33 | 474.09 | 28.3 | 0.6309 | 0.028 |
| | CatBoost | 695.07 | 33.49 | 492.89 | 26.7 | 0.6151 | 0.0294 |
| | XGBoost | 693.13 | 32.75 | 478.05 | 26.16 | 0.617 | 0.0318 |
The best values are in underlined boldface.
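The Mean and STD columns summarise one score per fold across the 10 folds. A minimal sketch of that procedure follows, under stated assumptions: folds are contiguous and unshuffled, STD is the sample standard deviation, and `fit` and `score` are hypothetical callables supplied by the caller. The record does not say which conventions the study used.

```python
import statistics


def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more item
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(xs, ys, fit, score, k=10):
    """Train on k-1 folds, score on the held-out fold, return (mean, sample std)."""
    fold_scores = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        fold_scores.append(
            score(model, [xs[i] for i in test_idx], [ys[i] for i in test_idx])
        )
    return statistics.mean(fold_scores), statistics.stdev(fold_scores)


if __name__ == "__main__":
    # Hypothetical baseline: predict the training mean, score by MAE.
    fit_mean = lambda xs, ys: sum(ys) / len(ys)
    mae = lambda m, xs, ys: sum(abs(y - m) for y in ys) / len(ys)
    data_x = list(range(20))
    data_y = [float(v % 5) for v in data_x]
    print(cross_validate(data_x, data_y, fit_mean, mae))
```

A small STD relative to the differences between model means is what makes a ranking across models in these tables credible.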
Figure 4. Comparative algorithm analysis using 10-fold cross-validation over observational weather data, with (a) RMSE, (b) MAE, (c) R² as scoring metric.
Figure 5. Comparative algorithm analysis using 10-fold cross-validation over forecast weather data, with (a) RMSE, (b) MAE, (c) R² as scoring metric.
Figure 6. Comparative algorithm analysis using 10-fold cross-validation over observational and forecast weather data, with (a) RMSE, (b) MAE, (c) R² as scoring metric.
Comparative statistics of prediction models based on forecast weather information using 10-fold cross-validation.
| Model Group | Prediction Model | RMSE Mean | RMSE STD | MAE Mean | MAE STD | R² Mean | R² STD |
|---|---|---|---|---|---|---|---|
| Single regression models | Linear regression | 656.62 | 36.88 | 509.51 | 35.74 | 0.6558 | 0.0357 |
| | Huber | 665.09 | 30.53 | 501.79 | 28.61 | 0.6460 | 0.0414 |
| | Ridge | 656.65 | 36.61 | 509.17 | 35.60 | 0.6557 | 0.0362 |
| | Lasso | 656.54 | 36.73 | 509.18 | 35.68 | 0.6558 | 0.0364 |
| | Elastic net | 656.62 | 36.57 | 509.07 | 35.58 | 0.6557 | 0.0362 |
| | Decision tree | 582.60 | 45.43 | 384.15 | 32.12 | 0.7295 | 0.0299 |
| | k-NN | 542.09 | 47.76 | 350.20 | 29.78 | 0.7661 | 0.0276 |
| | SVR | 667.79 | 29.18 | 500.79 | 27.48 | 0.6429 | 0.0430 |
| Ensemble models | Bagging | 543.32 | 39.60 | 347.18 | 25.03 | 0.7671 | 0.0203 |
| | Random forest | 532.20 | 43.19 | 340.77 | 24.11 | 0.7746 | 0.0230 |
| | Extra trees | 540.77 | 46.86 | 348.29 | 28.70 | 0.7673 | 0.0262 |
| | AdaBoost | 610.32 | 32.46 | 439.47 | 26.32 | 0.7061 | 0.0232 |
| Ensemble models | Gradient boosting | 531.85 | 42.73 | 347.75 | 24.67 | 0.7749 | 0.0227 |
| | CatBoost | 556.66 | 49.60 | 375.19 | 32.40 | 0.7532 | 0.0301 |
| | XGBoost | 537.06 | 44.69 | 355.67 | 28.67 | 0.7707 | 0.0219 |
The best values are in underlined boldface.
Comparative statistics of prediction models based on observation and forecast weather information using 10-fold cross-validation.
| Model Group | Prediction Model | RMSE Mean | RMSE STD | MAE Mean | MAE STD | R² Mean | R² STD |
|---|---|---|---|---|---|---|---|
| Single regression models | Linear regression | 638.18 | 43.81 | 497.20 | 38.50 | 0.6741 | 0.0440 |
| | Huber | 668.09 | 34.30 | 497.38 | 25.83 | 0.6419 | 0.0494 |
| | Ridge | 638.06 | 43.62 | 496.58 | 38.22 | 0.6742 | 0.0440 |
| | Lasso | 638.36 | 43.61 | 495.98 | 38.02 | 0.6740 | 0.0437 |
| | Elastic net | 639.37 | 42.45 | 495.47 | 35.45 | 0.6727 | 0.0445 |
| | Decision tree | 558.35 | 49.92 | 366.54 | 36.39 | 0.7512 | 0.0354 |
| | k-NN | 547.79 | 56.24 | 358.28 | 31.88 | 0.7601 | 0.0395 |
| | SVR | 656.00 | 38.92 | 483.36 | 29.68 | 0.6551 | 0.0472 |
| Ensemble models | Bagging | 530.90 | 42.84 | 338.03 | 27.46 | 0.7770 | 0.0295 |
| | Random forest | 525.88 | 46.17 | 339.20 | 29.82 | 0.7791 | 0.0324 |
| | Extra trees | 524.92 | 48.44 | 340.73 | 31.94 | 0.7805 | 0.0287 |
| | AdaBoost | 612.09 | 38.51 | 454.29 | 40.33 | 0.7088 | 0.0416 |
| Ensemble models | Gradient boosting | 517.56 | 42.09 | 341.22 | 24.99 | 0.7864 | 0.0267 |
| | CatBoost | 536.23 | 52.59 | 364.20 | 32.35 | 0.7704 | 0.0353 |
| | XGBoost | 518.30 | 43.45 | 343.65 | 25.59 | 0.7850 | 0.0324 |
The best values are in underlined boldface.
Figure A1. Forward stepwise subset selection using Cp, AIC, BIC and adjusted R².
Figure 7. Forward stepwise subset selection using Cp, AIC, BIC and adjusted R².
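Forward stepwise subset selection starts from an empty feature set and, at each step, adds the single feature that most improves a chosen criterion; the subset size is then picked where the criterion is best. The sketch below shows that greedy loop with a pluggable scoring function. The names `forward_stepwise`, `best_subset` and `toy_score` and the toy gains are hypothetical, higher scores are assumed to be better (as with adjusted R²), and only adjusted R² is implemented here, using its standard formula. A real `score` would fit a model on the given columns.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Adjusted R^2: penalises R^2 for the number of features used."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


def forward_stepwise(candidates, score):
    """Greedy forward selection.

    At each step, add the remaining candidate whose inclusion gives the
    highest score. Returns a list of (subset, score), one entry per size.
    """
    selected, remaining, path = [], list(candidates), []
    while remaining:
        best_feature = max(remaining, key=lambda f: score(selected + [f]))
        selected = selected + [best_feature]
        remaining.remove(best_feature)
        path.append((list(selected), score(selected)))
    return path


def best_subset(path):
    """Pick the subset with the highest score along the selection path."""
    return max(path, key=lambda step: step[1])[0]


if __name__ == "__main__":
    # Hypothetical additive R^2 gains per feature; NOT results from the study.
    gains = {"SolarRadiation": 0.5, "Elevation": 0.3, "WindSpeed": 0.01}

    def toy_score(subset):
        return adjusted_r2(sum(gains[f] for f in subset), 20, len(subset))

    path = forward_stepwise(gains, toy_score)
    print(best_subset(path))
```

For criteria where lower is better, such as AIC or BIC, the same loop applies with the score negated.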
Gradient boosting weather variables.
| Category | Variable Name | Classification | Description (unit) |
|---|---|---|---|
| Solar elevation (1) | Elevation | Continuous | Degrees (0°–76°) |
| Weather forecast | Humidity | Continuous | (%) |
| | PrecipitationProb | Continuous | (%) |
| | PrecipitationCat | Categorical | 0: none |
| | Temperature | Continuous | Celsius (°C) |
| | WindSpeed | Continuous | (m/s) |
| Weather observation (Features: 4) | AtmosPressure | Continuous | (hPa) |
| | SolarRadiation | Continuous | (MJ/m²) |
| | WindDirection | Categorical | W: 225°–315° |
| | 5cmDownTemperature | Continuous | Temperature below the ground surface (°C) |
XGBoost weather variables.
| Category | Variable Name | Classification | Description (unit) |
|---|---|---|---|
| Solar elevation (1) | Elevation | Continuous | Degrees (0°–76°) |
| Weather forecast | Humidity | Continuous | (%) |
| | PrecipitationProb | Continuous | (%) |
| | PrecipitationCat | Categorical | 1: rain |
| | Temperature | Continuous | Celsius (°C) |
| | WindDirection | Categorical | E: 45°–135°, W: 225°–315°, |
| | WindSpeed | Continuous | (m/s) |
| Weather observation | Humidity | Continuous | (%) |
| | SolarRadiation | Continuous | (MJ/m²) |
| | WindDirection | Categorical | S: 135°–225° |
| | GroundTemperature | Continuous | Celsius (°C) |
| | 5cmDownTemperature | Continuous | Temperature below the ground surface (°C) |
| | 20cmDownTemperature | Continuous | Temperature below the ground surface (°C) |
Figure 8. Performance of gradient boosting and XGBoost from 27 May 2015 to 31 December 2015.