
Statistical Modeling of Spatio-Temporal Variability in Monthly Average Daily Solar Radiation over Turkey.

Fatih Evrendilek, Can Ertekin.

Abstract

Though one of the most significant driving forces behind ecological processes such as biogeochemical cycles and energy flows, solar radiation data from conventional ground-based measurements are limited or non-existent, and are thus often estimated from other meteorological data through (geo)statistical models. In this study, spatial and temporal patterns of monthly average daily solar radiation on a horizontal surface at ground level were quantified using 130 climate stations for all of Turkey and its seven conventionally accepted geographical regions through multiple linear regression (MLR) models as a function of latitude, longitude, altitude, aspect, distance to sea; minimum, maximum and mean air temperature and relative humidity, soil temperature, cloudiness, precipitation, pan evapotranspiration, day length, maximum possible sunshine duration, monthly average daily extraterrestrial solar radiation, and time (month), and through the universal kriging method. The resulting 20 regional best-fit MLR models (three MLR models for each region) based on parameterization datasets had R²adj values ranging from 91.5% for the Central Anatolia region to 98.0% for the Southeast Anatolia region. Validation of the best-fit MLR models for each region led to R² values ranging from 87.7% for the Mediterranean region to 98.5% for the Southeast Anatolia region. The best-fit anisotropic semi-variogram models for universal kriging, assessed by leave-one-out cross-validation, gave R² values ranging from 10.9% in July to 52.4% in November. Surface maps of monthly average daily solar radiation were generated over Turkey with a grid resolution of 500 m x 500 m.


Keywords:  Multiple linear regression; Solar radiation; Spatio-temporal modeling; Turkey; Universal kriging

Year:  2007        PMID: 28903260      PMCID: PMC3965217          DOI: 10.3390/s7112763

Source DB:  PubMed          Journal:  Sensors (Basel)        ISSN: 1424-8220            Impact factor:   3.576


Introduction

Solar radiation is one of the most significant driving variables that trigger changes in ecological processes such as biogeochemical cycles and energy flows [1-3]. The rate of total (both direct and diffuse) incoming solar energy on a horizontal plane at the earth's surface is referred to as global solar radiation and is expressed mathematically as follows:

$$SR_g = SR_d + SR_{db} \cos z$$

where SRg: global radiation on a horizontal surface; SRd: diffuse radiation; SRdb: direct beam radiation on a surface perpendicular to the direct beam; and z: Sun's zenith angle. Direct solar radiation is usually measured by a pyrheliometer, while global and diffuse solar radiation are measured by ground-based pyranometers [4]. However, for areas where conventional ground-based measurements are limited or non-existent, solar radiation data are often estimated from statistical models and remotely-sensed data. Satellite-derived solar radiation data provide coverage over large regions of 100 to 10,000 km², with relatively long time intervals, and are generally derived from such sensors as the geostationary Earth radiation budget satellites (GERBS), the geostationary operational environmental satellites (GOES), geostationary meteorological satellites (GMS), and NOAA-AVHRR (Advanced Very High Resolution Radiometer) [5-10]. Ground-based observations, by contrast, are point measurements covering relatively short time intervals. (Geo)statistical models can produce a reliable solar radiation database at local-to-global scales for a given spatio-temporal range from a single variable (e.g. day length) or multiple variables (e.g. elevation, temperature, and evapotranspiration) [2,4,11-14]. This study aims at national and regional quantification of spatial and temporal patterns of monthly average daily solar radiation on a horizontal surface at ground level through multiple linear regression (MLR) and semi-variogram models.
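The decomposition of global radiation into diffuse and direct-beam components can be illustrated with a minimal Python sketch. The function name, the example inputs, and the clamping of the beam term to zero when the Sun is below the horizon are assumptions made for illustration; they are not taken from the study.

```python
import math


def global_horizontal_radiation(sr_diffuse, sr_direct_beam, zenith_deg):
    """Global radiation on a horizontal surface: SRg = SRd + SRdb * cos(z).

    sr_diffuse and sr_direct_beam must share the same unit;
    zenith_deg is the Sun's zenith angle in degrees.
    """
    cos_z = math.cos(math.radians(zenith_deg))
    # Assumption: the direct beam contributes nothing once the Sun is
    # below the horizon (zenith angle above 90 degrees).
    return sr_diffuse + sr_direct_beam * max(cos_z, 0.0)


# Hypothetical instantaneous values, chosen only for illustration.
example = global_horizontal_radiation(100.0, 800.0, 60.0)
```

With a zenith angle of 60°, half of the direct-beam value is projected onto the horizontal plane and added to the diffuse component.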

Materials and Methods

Statistical Performance Indicators and Validation of National and Regional MLR Models of Daily Solar Radiation

The MLR models were based on five geographical variables of latitude (decimal degree), longitude (decimal degree), altitude (m), aspect (compass degree), and distance to sea (DtS, km); 11 monthly-observed climate variables of minimum, maximum and mean air temperature (Tmin, Tmax and T, °C) and relative humidity (RHmin, RHmax and RH, %), soil temperature (ST, °C at the depth of 0 to 5 cm), cloudiness (CLD, %), precipitation (PPT, mm), pan evapotranspiration (PET, mm), and day length (S, h); two monthly-derived climate variables of maximum possible sunshine duration (So, h), and monthly average daily extraterrestrial solar radiation (Ho, MJ m-2 day-1); and time (month) for the entire Turkey and its conventionally-accepted seven geographical regions (Mediterranean, Aegean Sea, Black Sea, Central Anatolia, East Anatolia, Southeast Anatolia, and Marmara). Monthly climate variables were acquired between 1968 and 2004 from 130 climate stations across Turkey through the Turkish State Meteorological Service. Based on the Jackknifing procedure, the dataset was randomly divided into independent parameterization and validation datasets, so as to make the ratio of number of climate stations of validation dataset to those of parameterization dataset equal to or greater than 25% for each region and the entire country (Figure 1).
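A random station split that satisfies the stated validation-to-parameterization (V/P) ratio could be sketched as follows. This is only an illustration under assumptions: the function name, the fixed seed, and the choice of the smallest validation set that meets the threshold are hypothetical, and the study does not describe its splitting procedure at this level of detail.

```python
import random


def split_stations(station_ids, min_vp_ratio=0.25, seed=0):
    """Randomly split stations into (parameterization, validation) lists
    such that len(validation) / len(parameterization) >= min_vp_ratio."""
    ids = list(station_ids)
    if len(ids) < 2:
        raise ValueError("need at least two stations to split")
    random.Random(seed).shuffle(ids)
    # Grow the validation set until the V/P ratio reaches the threshold,
    # always leaving at least one station for parameterization.
    n_val = 1
    while n_val < len(ids) - 1 and n_val / (len(ids) - n_val) < min_vp_ratio:
        n_val += 1
    return ids[n_val:], ids[:n_val]


# 130 hypothetical station identifiers.
parameterization, validation = split_stations(range(130))
```

For 130 stations and a 25% threshold this yields 104 parameterization and 26 validation stations; the V/P ratios reported in Tables 1 and 2 are somewhat higher (25 to 33%).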
Figure 1.

Geographical and altitudinal distribution of 130 climate stations used in parameterization and validation of monthly average daily solar radiation models over Turkey.

Through the parameterization datasets for each region and the entire country, best MLR models with site-specific explanatory variables and parameters were determined. Three optimum MLR models were recommended for each region and the entire country based on a forward stepwise selection. In forward stepwise selection, each variable that is not already in the model is tested for inclusion in the model one at a time. The most significant of these variables are added to the model provided that their P values are ≤ 0.001, the threshold pre-set in this study. In this approach, variables once entered in the model may be dropped if they are no longer significant as other variables are added.

The degree of model accuracy, and thus the comparative performance of the MLR models, was quantified using the following four statistical indicators: (1) coefficient of determination (R2, %); (2) the adjusted coefficient of determination (R2adj, %); (3) the root mean square error (RMSE, MJ m-2 day-1); and (4) Mallows's Cp statistic [15]. The coefficient of determination (R2) is the proportion of variation in a response variable explained by a regression model, while the adjusted coefficient of determination (R2adj) is the coefficient of determination modified to account for the number of explanatory variables added to a model and sample size. The R2 and R2adj are calculated as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (SR_{o,i} - SR_{p,i})^2}{\sum_{i=1}^{n} (SR_{o,i} - SR_m)^2}$$

$$R^2_{adj} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}$$

where SRp, SRo, and SRm are the predicted, observed and mean values of the response variable, monthly average daily solar radiation, respectively; p is the total number of explanatory variables; and n is sample size. The RMSE reveals the level of scatter that a model produces and provides a comparison of the absolute deviation between the predicted and observed values. The lower the RMSE value, the better the model performs.
The RMSE can be calculated as follows:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (SR_{p,i} - SR_{o,i})^2}$$

Mallows's Cp statistic is mathematically expressed as follows:

$$C_p = \frac{SS}{MS} - n + 2p$$

where SS is the residual sum of squares for the best model with p parameters (including the intercept), MS is the residual mean square when using all available explanatory variables, and n is sample size. If the model fits the data well, then the Cp value is expected to be approximately equal to p. Models with considerable lack-of-fit have values of Cp larger than 2p [16]. The three optimum MLR models chosen for each region and the entire country with the forward stepwise selection were tested by comparing observed versus predicted values of daily solar radiation through the validation datasets. The degree of model fit between observed and predicted values of daily solar radiation was quantified using R2 values (%).
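The four performance indicators can be sketched in plain Python as below. The function names and the sample values are hypothetical, and the formulas are the standard textbook forms (R2 as one minus the ratio of residual to total sum of squares), which is assumed here to match the definitions used in the study.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination, as a fraction (multiply by 100 for %)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Adjust R2 for p explanatory variables and sample size n."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def rmse(observed, predicted):
    """Root mean square error, in the unit of the response variable."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def mallows_cp(ss_res_subset, ms_res_full, n, p_params):
    """Mallows's Cp; p_params counts model parameters including the intercept."""
    return ss_res_subset / ms_res_full - n + 2 * p_params


# Hypothetical observed and predicted values (MJ m-2 day-1).
obs = [5.8, 10.0, 15.0, 22.6, 18.0, 9.0]
pred = [6.0, 9.5, 15.5, 22.0, 18.5, 9.2]
fit_r2 = r_squared(obs, pred)
fit_rmse = rmse(obs, pred)
```

Note that p means the number of explanatory variables in the adjusted R2 but the number of parameters including the intercept in Cp, following the definitions in the text.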

Construction and Cross-Validation of National Geo-statistical Model of Daily Solar Radiation

The surface maps of monthly average daily solar radiation were created for all of Turkey (780,580 km²) with a grid resolution of 500 m × 500 m from 130 weather stations using ArcGIS 9.1 [17]. The assumption of spatial autocorrelation for daily solar radiation data from the 130 climate stations was verified by examining Moran's Index (I) values and their statistical significance as an indicator of the strength of correlation between observations as a function of the distance separating them [18]. The values of Moran's I range from 1 to -1 (strong positive and negative spatial autocorrelation, respectively), with 0 indicating a random pattern. To satisfy the stationarity assumption prior to the spatial interpolation, trend analysis was performed to determine whether or not a global trend, an overriding process that affects all observed data in a deterministic manner, exists. Detrending was implemented by removing first order trends from all the semi-variogram models and adding them back before predictions were made, in order to more accurately model the random short-range variation in monthly average daily solar radiation over Turkey. Directional influences (anisotropy) detected in the spatial autocorrelation were accounted for in the semi-variogram models. Spatial interpolation was carried out using the universal kriging method, which relies on a semi-variogram model that defines variance as a function of distance and direction as follows [19]:

$$\gamma(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ z(x_i) - z(x_i + h) \right]^2$$

where γ(h) is the semi-variance of variable z as a function of lag (separation) distance h; N(h) is the number of observation pairs of points separated by h used in each summation; and z(x) is the random variable at location x.
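An empirical semi-variogram can be sketched in plain Python by binning station pairs by separation distance. This simplified version is isotropic and works on raw values, so it ignores the anisotropy and the first-order detrending that the study applied; the function name and the toy data are hypothetical.

```python
import math


def empirical_semivariogram(coords, values, lag_size, n_lags):
    """Return the semi-variance for each lag bin [k*lag_size, (k+1)*lag_size).

    Bins that contain no station pairs are reported as None.
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            h = math.dist(coords[i], coords[j])  # Euclidean separation
            k = int(h // lag_size)
            if k < n_lags:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    # gamma(h) = sum of squared differences / (2 * number of pairs)
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]


# Toy example: four points on a line with hypothetical values.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
gamma = empirical_semivariogram(coords, values, lag_size=1.0, n_lags=3)
```

A model such as the spherical one would then be fitted to these binned semi-variances to obtain the nugget, partial sill, and range.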
The selection of the best-fit semi-variogram model was based on the six error statistics of leave-one-out cross-validation: (1) the mean prediction error (MPE), (2) the root mean square prediction error (RMSPE), (3) the average kriging standard error (AKSE), (4) the mean standardized prediction error (MSPE), (5) the root mean square standardized prediction error (RMSSPE), and (6) R2 as follows:

$$MPE = \frac{1}{N} \sum_{k=1}^{N} \left[ \hat{z}(k) - z(k) \right]$$

$$RMSPE = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left[ \hat{z}(k) - z(k) \right]^2}$$

$$AKSE = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \sigma^2(k)}$$

$$MSPE = \frac{1}{N} \sum_{k=1}^{N} \frac{\hat{z}(k) - z(k)}{\sigma(k)}$$

$$RMSSPE = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left[ \frac{\hat{z}(k) - z(k)}{\sigma(k)} \right]^2}$$

$$R^2 = 1 - \frac{\sum_{k=1}^{N} \left[ z(k) - \hat{z}(k) \right]^2}{\sum_{k=1}^{N} \left[ z(k) - \bar{z} \right]^2}$$

where z(k) is the observed value at location k, ẑ(k) is the predicted value at k through the ordinary kriging method, z̄ is the mean of the observed values, N is the number of pairs of observed and predicted values, and σ(k) is the prediction standard error for location k. As an indicator of prediction errors, the MPE and MSPE values reveal the degree of bias in model predictions and should be close to zero. In the assessment of uncertainty (variability in predictions), the RMSPE and AKSE values show the precision of prediction and should be equal to one another. Variability in predictions is overestimated when the AKSE exceeds the RMSPE and underestimated when the AKSE is smaller than the RMSPE. The RMSSPE values provide a comparison of the error variance to the kriging variance and should be close to unity. Variability is underestimated when the RMSSPE exceeds unity and overestimated when it is below unity [20].
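The first five cross-validation statistics can be computed from three parallel lists, as in the following sketch. The function name and the sample numbers are hypothetical; the formulas follow the usual geostatistical definitions, which are assumed to match those used in the study.

```python
import math


def cross_validation_stats(observed, predicted, std_errors):
    """Leave-one-out error statistics from observed values, kriging
    predictions, and kriging prediction standard errors."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    standardized = [e / s for e, s in zip(errors, std_errors)]
    return {
        "MPE": sum(errors) / n,
        "RMSPE": math.sqrt(sum(e * e for e in errors) / n),
        "AKSE": math.sqrt(sum(s * s for s in std_errors) / n),
        "MSPE": sum(standardized) / n,
        "RMSSPE": math.sqrt(sum(e * e for e in standardized) / n),
    }


# Hypothetical values: errors of +/-1 with a prediction standard error of 2.
stats = cross_validation_stats(
    observed=[10.0, 12.0, 14.0, 16.0],
    predicted=[11.0, 11.0, 15.0, 15.0],
    std_errors=[2.0, 2.0, 2.0, 2.0],
)
```

In this toy case the AKSE (2.0) exceeds the RMSPE (1.0) and the RMSSPE (0.5) is below unity, so both criteria indicate that the variability in predictions is overestimated.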

Results and Discussion

Monthly average daily solar radiation data for each month in Turkey were found to follow a Gaussian distribution, given their histogram plots and the closeness of their mean and median values in Figure 2. On average, daily solar radiation ranged from 5.8 ± 1.1 MJ m-2 day-1 in December to 22.6 ± 2.2 MJ m-2 day-1 in June in Turkey. The best-fit MLR models for each geographical region of Turkey, and their validation against the independent datasets, are presented in Table 1. The 20 regional MLR models resulted in R2adj values that accounted for 91.5% of variation in the solar radiation data for the Central Anatolia region to 98.0% for the Southeast Anatolia region. Similarly, the RMSE values of the MLR models ranged from 0.89 MJ m-2 day-1 in the Southeast Anatolia region to 1.86 MJ m-2 day-1 in the Central Anatolia region.
Figure 2.

Statistical distribution histograms of monthly average daily solar radiation data (MJ m-2 day-1) of January (1) to December (12).

Table 1.

Best-fit stepwise multiple linear regression (MLR) models of monthly average daily solar radiation (MJ m-2 day-1) for seven geographical regions of Turkey.

Region name: East Anatolia | Mediterranean | Aegean | Southeast Anatolia | Central Anatolia | Black Sea | Marmara
Number of explanatory variables in MLR model: 5, 6, 7 | 4, 5, 6 | 5, 6, 7 | 1, 2, 3 | 3, 4, 5 | 1, 2 | 2, 3, 4
Intercept: -5.106, -1.015, 2.735, -5.415, 17.957, -10.534, -3.342, -0.413, 3.684, -3.396, -0.929, -3.833, -2.656, -2.28, 3.115, 7.217, 2.362, 4.937, 4.213
Ho(S/So) (MJ m-2 day-1): 1.204, 1.24, 1.304 | 0.608, 0.588, 0.646 | 0.736, 1.116, 1.233 | 0.608, 0.787, 0.754 | 0.775, 0.932, 0.938 | 0.695, 1.566 | 0.868, 1.272, 1.249
PET (mm month-1): -0.014, -0.015, -0.020, 0.011, 0.012, 0.012, -0.023, -0.022, -0.030, -0.023, -0.022
CLD (% month-1): 0.181, 0.197, 0.205, 0.132, 0.064a, 0.021b, 0.107, 0.127, 0.113, 0.074, 0.084
ST (°C month-1): -0.231, -0.141, -0.173, 0.168, 0.176, 0.285, -0.34, -0.412, -0.493
S (h month-1): -1.03, -1.1, -1.42, -1.68, -2.12, -2.87, -1.49, -1.45
RHmax (% month-1): 0.09, 0.044, 0.07, -0.221, -0.265
Aspect (°): 0.00328, 0.0059, 0.00665, 0.00696, 0.00375
PPT (mm month-1): -0.016, -0.015, -0.014
Elevation (m): 0.00092, 0.00113, 0.00132
Tmax (°C month-1): 0.418, 0.49, 0.514
RH (% month-1): -0.08
RHmin (% month-1): -0.126
DtS (km): -0.0058
RMSE: 1.16, 1.11, 1.07 | 1.42, 1.33, 1.27 | 1.37, 1.33, 1.3 | 1.35, 0.969, 0.896 | 1.86, 1.75, 1.71 | 1.45, 1.1 | 1.61, 1.53, 1.46
R2 (%): 95.4, 93.1
R2adj (%): 96.4, 96.7, 96.9, 93.9, 94.6, 95.1, 94.8, 95.2, 95.4, 97.6, 98.0, 91.5, 92.5, 92.9, 96.0, 93.2, 93.9, 94.4
Cp: 63.7, 39.6, 26.2 | 88, 61.2, 45.3 | 58.3, 41.1, 30.6 | 286.2, 101.8, 74 | 81.7, 41.3, 30 | 300.7, 127.5 | 96, 71.9, 54.5
V/P ratio (%): 29 | 25 | 30 | 25 | 27 | 33 | 31
R2 (%) for validation: 92.9, 95.5, 95.9 | 91.3, 90.6, 87.7 | 93.4, 93.1, 92.6 | 96.1, 98.5, 98.2 | 94.0, 91.4, 91.7 | 95.6, 95.7 | 93.9, 94.3, 94.3

All the variables are significant at P ≤ 0.001 except for those marked a (P < 0.01) and b (P > 0.05); Ho: monthly average daily extraterrestrial solar radiation on a horizontal surface; S: day length; So: maximum possible sunshine duration; ST: soil temperature for a depth of 0 to 5 cm; RHmax: maximum relative humidity; PET: potential evapotranspiration; PPT: precipitation; RH: relative humidity; CLD: cloudiness; Tmax: maximum air temperature; RHmin: minimum relative humidity; DtS: distance to sea; RMSE: root mean square error; and V/P: ratio of number of stations of validation dataset to those of parameterization dataset.

The frequency of presence of the explanatory variables in the regional MLR models was found in decreasing order as follows: Ho(S/So) (100%), PET (55%), CLD (55%), ST (45%), S (40%), RHmax (25%), aspect (25%), PPT (15%), elevation (15%), Tmax (15%), RH (5%), RHmin (5%), DtS (5%), latitude (0%), longitude (0%), mean and minimum air temperature (0%), and time (month) (0%). Monthly PPT and Tmax played a significantly important role only in the MLR models of the Mediterranean and Aegean regions, respectively (P ≤ 0.001). Monthly RH, and elevation were found as the significant explanatory variables in the MLR model of the East Anatolia. Monthly RHmin, and DtS appeared to be significant only in the MLR models of the Southeast and Central Anatolia regions, respectively. Comparisons of values observed from the climate stations versus values predicted by the best-fit MLR models for each region led to R2 values of 87.7% for the Mediterranean region to 98.5% for the Southeast Anatolia region. The validation of the regional MLR models revealed that the highest R2 values were obtained from the models as a function of seven variables—Ho(S/So), PET, ST, S, RHmax, RH, and elevation—for the East Anatolia region (95.9%); as a function of five variables—Ho(S/So), PET, CLD, ST, and Tmax—for the Aegean region (93.4%); as a function of four variables—Ho(S/So), CLD, ST, and PPT—for the Mediterranean region (91.3%); as a function of three variables—Ho(S/So), CLD, and aspect—for the Central Anatolia region (94.0%) and—Ho(S/So), PET, and S—for the Marmara region (94.3%); and as a function of two variables—Ho(S/So), and CLD—for the Southeast Anatolia region (98.5%) and—Ho(S/So), and S— for the Black Sea region (95.7%). The national MLR models elucidated about 93% of variation in monthly average daily solar radiation as a function of six to eight variables (Table 2). 
Validation of the national MLR models indicated that the MLR with the six explanatory variables of Ho(S/So), CLD, RHmax, elevation, aspect, and month performed best, with the R2 value of 93.3%.
Table 2.

Best-fit stepwise multiple linear regression (MLR) models of monthly average daily solar radiation (MJ m-2 day-1) for Turkey.

Number of explanatory variables in MLR model | 6 | 7 | 8
Intercept | -7.156 | -9.853 | -5.206
Ho(S/So) (MJ m-2 day-1) | 0.760 | 0.762 | 0.776
Cloudiness (% month-1) | 0.0834 | 0.0819 | 0.0939
Elevation (m) | 0.00092 | 0.00063 | 0.00066
Month | -0.095 | -0.095 | -0.080
Aspect (compass degree) | 0.00228 | 0.00229 | 0.00248
RHmax (% month-1) | 0.051 | 0.062 | 0.058
Longitude (decimal degree) | - | 0.054 | 0.048
Latitude (decimal degree) | - | - | -0.126
RMSE | 1.59 | 1.58 | 1.58
R2adj (%) | 93.3 | 93.4 | 93.5
Cp | 94.6 | 77.1 | 67.5
V/P ratio (%) | 29
R2 for validation (%) | 93.3 | 92.9 | 92.9

All the variables are significant at P ≤ 0.001; Ho: monthly average daily extraterrestrial solar radiation on a horizontal surface; S: day length; So: maximum possible sunshine duration; RHmax: maximum relative humidity; RMSE: root mean square error; and V/P: ratio of number of stations of validation dataset to those of parameterization dataset.
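As an illustration of how the six-variable national model in Table 2 could be applied, the sketch below evaluates it for one station-month. The coefficients are those reported in Table 2; the function name and all input values are hypothetical, and reading the term Ho(S/So) as the product Ho × (S/So) is an assumption.

```python
# Coefficients of the six-variable national MLR model (Table 2).
COEFFS = {
    "intercept": -7.156,
    "ho_s_so": 0.760,      # Ho(S/So), MJ m-2 day-1
    "cloudiness": 0.0834,  # %
    "elevation": 0.00092,  # m
    "month": -0.095,       # 1 to 12
    "aspect": 0.00228,     # compass degree
    "rh_max": 0.051,       # %
}


def predict_solar_radiation(ho, s, so, cloudiness, elevation, month, aspect, rh_max):
    """Monthly average daily solar radiation (MJ m-2 day-1) from the
    six-variable national model; Ho(S/So) is taken as ho * s / so (assumption)."""
    return (
        COEFFS["intercept"]
        + COEFFS["ho_s_so"] * ho * s / so
        + COEFFS["cloudiness"] * cloudiness
        + COEFFS["elevation"] * elevation
        + COEFFS["month"] * month
        + COEFFS["aspect"] * aspect
        + COEFFS["rh_max"] * rh_max
    )


# Hypothetical July inputs, chosen only to exercise the formula.
prediction = predict_solar_radiation(
    ho=40.0, s=10.0, so=14.0, cloudiness=30.0,
    elevation=1000.0, month=7, aspect=180.0, rh_max=70.0,
)
```

The inputs above are not station data, so the resulting value only demonstrates the arithmetic of the regression equation.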

The test of the assumption for spatial autocorrelation based on Moran's I showed that there is a significantly clustered pattern for the months of January to May and August to December (P < 0.01) and for June and July (P < 0.05) (Table 3). The degree of spatial dependence for the solar radiation data was also calculated as the ratio of nugget (c0) to sill (c0 + c), and the nugget-to-sill ratios were found to range from 54% in February to 87% in June (Table 3). As the values of the nugget-to-sill ratio increase, spatial dependence for the data is indicated to decrease.
Table 3.

Parameters of semi-variogram models and error statistics of their leave-one-out cross-validation for monthly average daily solar radiation over Turkey.

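Parameters and error statistics (by month) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12
Moran's I | 0.12 | 0.12 | 0.1 | 0.07 | 0.03 | 0.03 | 0.02 | 0.04 | 0.06 | 0.1 | 0.13 | 0.12
Major range (a) | 4.74 | 4.74 | 7.46 | 10.66 | 3.2 | 5.33 | 4.03 | 4.03 | 4.03 | 4.14 | 4.74 | 4.74
Partial sill (c) | 0.42 | 0.86 | 1.1 | 0.45 | 0.86 | 0.59 | 1.48 | 1.42 | 0.76 | 0.37 | 0.21 | 0.22
Nugget effect (c0) | 0.65 | 1.01 | 1.77 | 2.26 | 2.9 | 3.95 | 3.74 | 2.89 | 2.33 | 1.28 | 0.7 | 0.43
Ratio of nugget to sill | 0.61 | 0.54 | 0.62 | 0.83 | 0.77 | 0.87 | 0.72 | 0.67 | 0.75 | 0.78 | 0.77 | 0.66
Lag size | 0.4 | 0.4 | 0.63 | 0.9 | 0.27 | 0.45 | 0.34 | 0.34 | 0.34 | 0.35 | 0.4 | 0.4
MPE | -0.04 | -0.04 | -0.12 | -0.09 | -0.1 | -0.17 | -0.13 | -0.1 | -0.1 | -0.08 | -0.07 | -0.04
RMSPE | 0.96 | 1.25 | 1.54 | 1.64 | 1.95 | 2.12 | 2.25 | 2.0 | 1.71 | 1.21 | 0.91 | 0.81
AKSE | 0.97 | 1.22 | 1.52 | 1.65 | 2.03 | 2.19 | 2.23 | 1.99 | 1.74 | 1.27 | 0.94 | 0.78
MSPE | -0.04 | -0.03 | -0.08 | -0.06 | -0.05 | -0.08 | -0.06 | -0.06 | -0.06 | -0.07 | -0.07 | -0.06
RMSSPE | 0.98 | 1.01 | 1.0 | 0.98 | 0.96 | 0.97 | 1.01 | 1.01 | 0.98 | 0.95 | 0.96 | 1.02
R2 (%) for validation | 48.99 | 43.88 | 37.00 | 28.03 | 18.94 | 13.65 | 10.99 | 18.36 | 28.80 | 46.48 | 52.44 | 50.32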

Ratio of nugget to sill: c0 / (c0 + c); MPE: mean prediction error; RMSPE: root mean square prediction error; AKSE: average kriging standard error; MSPE: mean standardized prediction error; RMSSPE: root mean square standardized prediction error.
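The nugget-to-sill ratios in Table 3 can be recomputed from the reported nugget and partial sill values, as in this short sketch. The variable names are illustrative; the numbers are the monthly values as read from Table 3.

```python
# Monthly values (January to December) as read from Table 3.
partial_sill = [0.42, 0.86, 1.1, 0.45, 0.86, 0.59, 1.48, 1.42, 0.76, 0.37, 0.21, 0.22]
nugget = [0.65, 1.01, 1.77, 2.26, 2.9, 3.95, 3.74, 2.89, 2.33, 1.28, 0.7, 0.43]

# Ratio of nugget to sill: c0 / (c0 + c); a higher ratio means weaker
# spatial dependence.
ratios = [c0 / (c0 + c) for c0, c in zip(nugget, partial_sill)]

# Months (1 to 12) with the strongest and weakest spatial dependence.
strongest_month = ratios.index(min(ratios)) + 1
weakest_month = ratios.index(max(ratios)) + 1
```

The recomputed ratios agree with the tabulated ones to two decimals, with the lowest ratio in February and the highest in June.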

The global trend analysis indicated that there is an overriding trend in the solar radiation data in the south-to-north direction of Turkey (Figure 3). Given the plots of the global trend analysis in Figure 3, first order trend removal was performed before the implementation of universal kriging for the solar radiation data. An anisotropic spherical spatial correlation model was used owing to significant anisotropy and to the nugget effect, which is generally attributed to small-scale variability or measurement error. A large nugget effect for the solar radiation semi-variogram models means that the local-scale spatial autocorrelation (spatial dependence) among observations weakens. The nugget was high relative to the sill, thus indicating that most of the fine-scale variability was not explained by the semi-variogram models. Anisotropic spherical semi-variogram models performed best for the solar radiation data, with 9 neighbors to include (at least 5) and 12 lags.
Figure 3.

Trend analyses of monthly average daily solar radiation data of January (1) to December (12) by three-dimensional plots of the dataset from 130 climate stations over Turkey. The locations of 130 climate stations are projected on the x-y plane with the red dots. Above climate station points, daily solar radiation values (MJ m-2 day-1) are shown by the height of the red sticks in the z dimension. Daily solar radiation values are projected onto the x-z (west) and y-z (north) planes as the green and blue dots of the scatter plots, respectively. Green and blue lines refer to regression lines fitted to the scatter plots on the x-z and y-z planes, respectively.

The specific parameters of the best-fit anisotropic semi-variogram models for universal kriging are presented in Table 3. The degree of bias in the monthly average daily model predictions was highest for June and lowest for January and February according to the MPE and MSPE values of the spatial leave-one-out cross-validation. Variability in the monthly average daily predictions of solar radiation was overestimated for January, April, May, June, September, October, and November and underestimated for the rest of the months according to the AKSE, RMSPE and RMSSPE values. Leave-one-out cross-validation of the monthly average daily solar radiation models revealed that R2 values for the comparisons of observed versus predicted solar radiation values ranged from 10.9% in July to 52.4% in November. Geostatistical models performed better for the months of October to March (R2 = 37.0 to 52.4%) than for those of April to September (10.9 to 28.8%) (Table 3). Surface maps of monthly average daily solar radiation over Turkey were generated with a grid resolution of 500 m × 500 m (Figure 4).
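The over- and underestimation of variability described above follows from comparing the AKSE with the RMSPE month by month. The sketch below applies that comparison to the values as read from Table 3; the variable names are illustrative.

```python
# Monthly values (January to December) as read from Table 3.
rmspe = [0.96, 1.25, 1.54, 1.64, 1.95, 2.12, 2.25, 2.0, 1.71, 1.21, 0.91, 0.81]
akse = [0.97, 1.22, 1.52, 1.65, 2.03, 2.19, 2.23, 1.99, 1.74, 1.27, 0.94, 0.78]

# Variability is overestimated when AKSE > RMSPE and underestimated
# when AKSE < RMSPE.
overestimated = [m for m, (a, r) in enumerate(zip(akse, rmspe), start=1) if a > r]
underestimated = [m for m, (a, r) in enumerate(zip(akse, rmspe), start=1) if a < r]
```

The resulting month numbers match the months listed in the text for overestimated variability (1, 4, 5, 6, 9, 10, 11), with the remaining five months underestimated.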
Figure 4.

Surface maps of monthly average daily solar radiation over Turkey based on anisotropic spherical semi-variogram models for universal kriging with a grid resolution of 500 m × 500 m.

In this study, (1) the most robust generic MLR models of monthly average daily solar radiation, (2) their performance for predicting temporal variation, (3) spatial distribution of the solar radiation data interpolated by universal kriging, which discerns both stochastic and deterministic components of spatial variation, (4) jackknifing validation of temporal predictions, and (5) leave-one-out cross-validation of spatial predictions were quantified not only for all of Turkey but also for its seven geographical regions, differentiated by virtue of their specific geographical conditions, based on 130 climate stations.

1.  Notes on continuous stochastic phenomena.

Authors:  P A P MORAN
Journal:  Biometrika       Date:  1950-06       Impact factor: 2.445
