Integrating Map Algebra and Statistical Modeling for Spatio- Temporal Analysis of Monthly Mean Daily Incident Photosynthetically Active Radiation (PAR) over a Complex Terrain.

This study aims at quantifying spatio-temporal dynamics of monthly mean daily incident photosynthetically active radiation (PAR) over a vast and complex terrain such as Turkey. The spatial interpolation method of universal kriging, and the combination of multiple linear regression (MLR) models and map algebra techniques were implemented to generate surface maps of PAR with a grid resolution of 500 × 500 m as a function of five geographical and 14 climatic variables. Performance of the geostatistical and MLR models was compared using mean prediction error (MPE), root-mean-square prediction error (RMSPE), average standard prediction error (ASE), mean standardized prediction error (MSPE), root-mean-square standardized prediction error (RMSSPE), and adjusted coefficient of determination (R2adj.). The best-fit MLR- and universal kriging-generated models of monthly mean daily PAR were validated against an independent 37-year observed dataset of 35 climate stations derived from 160 stations across Turkey by the Jackknifing method. The spatial variability patterns of monthly mean daily incident PAR were more accurately reflected in the surface maps created by the MLR-based models than in those created by the universal kriging method, in particular, for spring (May) and autumn (November). The MLR-based spatial interpolation algorithms of PAR described in this study indicated the significance of the multifactor approach to understanding and mapping spatio-temporal dynamics of PAR for a complex terrain over meso-scales.


Introduction
The spatial and temporal patterns of photosynthetically active radiation (PAR) (400-700 nm) are a necessary input for modeling ecosystem processes such as evapotranspiration, net primary productivity (NPP), and sequestration of carbon (C) and nitrogen (N) in vegetation and soils [1,2]. The availability and intensity of PAR intercepted and absorbed by canopies is strongly linked to rates at which atmospheric carbon dioxide (CO 2 ) is converted into plant organic matter and cycled among natural sources and sinks such as the oceans, soils, vegetation, and atmosphere [3][4][5]. Possible effects of increased atmospheric CO 2 on the global climate, and the net imbalance between the sources and sinks of CO 2 , known as the "missing carbon" sink, generate interests in improving knowledge of spatiotemporal patterns of PAR [6,7].
PAR is not routinely measured, and therefore, often estimated from measurements of global solar radiation (SR) in climate stations. Worldwide investigations to predict PAR from routinely measured SR showed that the ratio of PAR to SR mainly falls between 0.45 and 0.50 in the northern and southern hemispheres [8,9]. However, availability of long-term SR measurements is generally restricted to locations of meteorological stations as well as by mountainous topography. Consideration of spatial heterogeneity at the landscape scale necessitates the uses of multiple regression analysis, and geostatistical interpolation methods such as kriging and co-kriging to estimate spatial and temporal variations in SR and/or PAR [26][27][28][29].
The objective of this study was to model spatio-temporal dynamics of monthly mean daily incident PAR over a large and complex terrain such as Turkey, using best generic and month-specific multiple linear regression (MLR) models as well as the geostatistical interpolation method of universal kriging with and without best MLR models and validate the models with the Jackknifing method, based on a geo-referenced dataset of SR measured between 1968 and 2004 from 160 climate stations across Turkey.

Primary Motivation
The primary motivation for the present work is that spatio-temporal dynamics of PAR are an essential input for the development of biogeochemical models to simulate ecosystem processes such as NPP, C and N cycles, and CO 2 sources and sinks in a given scale of space and time. The dynamics of PAR and canopy (land cover) determine the net amount of absorbed PAR in that only a fraction of PAR is absorbed by the canopy, known as absorbed PAR (APAR), and used for CO 2 assimilation. The APAR can be estimated as follows: APAR = PAR itc -PAR rtc -PAR t + PAR rs (1) where PAR itc refers to the amount of PAR incident at the top of the canopy; PAR rtc is the amount of PAR itc reflected from the top of the canopy; PAR t is the amount of PAR itc transmitted through the canopy to the soil; PAR rs is the amount of PAR t reflected from the soil and absorbed by the canopy. Eq. (1) can be simplified into the following expression: (2) where f refers to APAR/PAR itc fraction that changes non-linearly with leaf area index which can be approximated from the normalized different vegetation index (NDVI), an expression for chlorophyll related photosynthetic activity obtained by the technology of remote sensing [10][11][12]. APAR can be linked to the quantification of NPP based on the approach of light use efficiency (LUE) by Monteith [13,14], and Kumar and Monteith [15] as follows: where NPP T refers to total net primary productivity (g C m -2 yr -1 ); LUE T (g DM MJ -1 ) is light use efficiency of APAR into total dry matter (DM) of aboveground and belowground biomass; GSL is the length of growing season period (in days); and RF is reduction factors caused by growth-limiting environmental conditions such as soil productivity, climate factors, herbivory, and diseases. The value of 0.45 is a conversion coefficient for C content per unit DM biomass [30].

Study Region
Turkey (latitudes: 36-42°N, longitudes: 26-45°E) is located where Asia, Europe, and the Middle East meet and has a total area of about 780,595 km 2 with an average altitude of 1250 m. Air temperature ranges from 45°C in July in the southeastern region to -30°C in February in the eastern regions, with a mean annual temperature of around 13°C. Annual precipitation varies between 258 mm in the central and southeastern regions and 2220 mm in the northeastern Black Sea coast, with mean annual precipitation of around 634 mm. Annual pan evapotranspiration reaches 2400 mm in the southeastern region and declines to 624 mm in the eastern region, with mean annual evapotranspiration of 1280 mm.

Climate Data
Climate data used in this study were obtained from 160 meteorological stations across Turkey for the 37-year period of 1968 to 2004 [31]. The climate dataset included global solar radiation (SR, MJ m -2 day -1 ), day length (S, h), mean, minimum and maximum air temperature (T, T min and T max , o C) and relative humidity (RH, RH min and RH max , %), cloudiness (CLD, %), potential evapotranspiration (ET, mm), precipitation (PPT, mm), and soil temperature (0 to 5 cm in depth) (ST 5 , o C). The fraction of incident PAR in incoming SR was assumed to be 0.48 [16][17][18]. Monthly mean daily extraterrestrial solar radiation on a horizontal surface (H o , MJ m -2 day -1 ), and monthly maximum possible sunshine duration (S o , h) were derived based on Duffie and Beckman [19] where gs I is the solar constant (1367 W m -2 ); f is the eccentricity correction factor; λ is the latitude of the site; δ is the solar declination; and s w is the mean sunrise hour angle for a given month. The eccentricity correction factor, solar declination, and sunrise hour angle were calculated using the following Equations, respectively [19]: where n is the number of day of year starting from first of January. For a given month, the maximum possible sunshine duration ( ) 0 S was calculated using the following Equation [19]:

Mapping Geographical and Climate Variables
Digital elevation model (DEM) was constructed from a 1:250,000 scale topographic map of Turkey (Turkish General Command of Mapping 2005), projected to the Universal Transverse Mercator (UTM) coordinates of the World Geographic System (WGS 1984) and re-sampled to a grid size of 500 m, using ArcGIS 9.1 [20]. Explanatory geographical variables of latitude (Lat, decimal degree), longitude (Lon, decimal degree), distance to sea (DtS, m), and aspect (Asp, compass degree) were derived from the 500-m resolution DEM data. The implementation of kriging necessitates the calculation of a semivariogram model that defines variance as function of distance, and direction as follows [21]: where γ(h) is the semi-variance of variable z as a function of both lag distance or separation distance (h); N(h) is the number of observation pairs of points separated by h used in each summation; and z(x k ) is the random variable at location x k . Explanatory climate variables were spatially interpolated from the 160 meteorological stations as required by the best MLR models of PAR selected for each month. The generation of digital surface maps of PAR was based on the prediction method of universal kriging, with a grid resolution of 500 m x 500 m. The total area of Turkey 780,595 km 2 translates into 3 182 222 cells each 500 x 500 m in size. The semi-variogram models provide estimates for the lag size, range, nugget, and partial sill. The range (a) corresponds to the distance at which the semi-variogram reaches its asymptote and beyond which there is little or no spatial dependence. The sill defines the asymptotic height of the variogram (i.e., general or maximum variance in the data) and consists of nugget (c 0 ) and partial sill (c). The partial sill is the spatially correlated component of the variance as a measure of the strength of the spatial dependence, while the nugget is the spatially uncorrelated component of the variance plus what is spatially correlated below the level of the minimum lag size as a measure of the inherent or nonspatial variation (including measurement errors). The degree of spatial dependence for PAR was calculated as the ratio of the nugget (c 0 ) to the sill (c 0 + c) (N:S, expressed as percentage) and classified distinctly as strongly spatially dependent when N:S ratio was ≤ 25%; moderately spatially dependent when 25% < N:S ratio < 75%; and weakly spatially dependent when N:S ratio ≥ 75%.

Spatio-temporal Modeling of Monthly Mean Daily Incident PAR
Selection of best MLR models of monthly mean daily incident PAR was based on the lowest Mallows's C p value of the best subset procedure [32], the highest value of adjusted coefficient of determination (R 2 adj. ), and significant P values (<0.05) for all the explanatory variables. The entire dataset from the 160 meteorological stations was randomly partitioned into the datasets of parameterization (125 stations, 88%) and validation (35 stations, 22%) according to the Jackknifing method [33]. Based on the parameterization dataset, two types of MLR models were constructed to predict monthly mean daily PAR-(1) a generic MLR model with the explanatory variable of months being coded as 1 to 12, respectively; and (2) month-specific MLR models-using the following expression: where y refers to the response variable of PAR; β 0 is estimate of the intercept; β n is regression coefficients (slopes) of the explanatory variables; x n is the explanatory variables; x 1 x 2 is an interaction term; and ε is random error term.
Spatio-temporal modeling was conducted using (1) the interpolation method of universal kriging, and (2) the combination of the MLR models and map algebra techniques available in ArcGIS 9.1. Map algebra refers to an approach to the handling of raster datasets which treats spatial data layers as variables so as to be processed using mathematical operators [34]. All spherical semi-variogram models of universal kriging were best fit to the PAR dataset by adjusting lag sizes in order to obtain the highest possible values of coefficient of determination (R 2 ) of spatial one-leave-out cross-validations, with number of lags = 12 and neighbors to include = 5 being held constant. The final spatial variability maps of monthly mean daily incident PAR were generated for each month using map algebra tool of ArcGIS 9.1, as expressed in Eq (10). First, the combination of the MLR models and map algebra techniques was realized by generating raster maps of all the explanatory geographical and climatic variables of the best MLR models, as explained in Section 2.4. Second, coefficient values (β n ) of the explanatory variables were multiplied by corresponding raster maps of the explanatory variables. Finally, intercept values (β 0 ) of the best MLR models were added to the resulting raster maps.

Statistical Analyses
Data used in the study were statistically analyzed using Minitab 13.20 (Minitab Inc., PA, USA) and ArcGIS 9.1 [20]. Exploratory data and quality control analyses of the climate dataset were performed using (1) histogram plot; (2) QQ plot; (3) Anderson-Darling test for normality; (4) trend analyses for global trend; (5) variogram surface for anisotropic (directional influences) or isotropic (having the same variation in each direction) features; and (6) Moran's Index for spatial autocorrelation. Pearson's correlation matrix was applied to a total of 19 variables (five geographical and 14 climatic variables) in order to quantify strength and direction of the relationships among the variables.
Comparison of observed versus predicted PAR values was performed using the values of R 2 for all the generic and month-specific MLR models. The validation dataset from 35 stations was randomly so selected as to be spatially representative for the conventionally-accepted seven climate zones of Turkey. The values of R 2 adj were used to compare the performance of the generic and month-specific MLR models based on the parameterization datasets from 135 stations. Performance of the geostatistical models without the MLR models was assessed and compared using the following six statistics of spatial one-leave-out cross-validation: (1) mean prediction error (MPE); (2) root-meansquare prediction error (RMSPE); (3) average standard prediction error (ASE); (4) mean standardized prediction error (MSPE); (5) root-mean-square standardized prediction error (RMSSPE); and (6) R 2 as follows: where z ok is the observed value at location k; z pk is the predicted value at k through the universal kriging method; N is the number of pairs of observed and predicted values; and σ(k) is the prediction standard error for location k.
The MPE and MSPE values indicate the degree of bias in model prediction and should be close to zero. The RMSPE and ASE values reveal the precision of prediction and should be equal to one another, with ASE > and < RMSPE showing overestimation and underestimation, respectively. The RMSSPE compares the error variance with kriging variance and should be close to unity, with the RMSSPE values > and < unity indicating underestimation and overestimation, respectively.

Results and Discussion
Mean values of the 37-year climate dataset observed from 160 stations for 14 variables in Turkey are presented in Table 1. Elevations of the 160 stations ranged from 0 to 2296 m with a mean value of 710 m and displayed a spatial distribution of 30% ≤ 100 m, 36% > 100 and ≤ 1000 m, 26% > 1000 and ≤ 1500 m, and 8% > 1500 and ≤ 2296 m. Monthly mean daily incident PAR over Turkey varied between 2.8 and 10.8 MJ m -2 day -1 in December and July, respectively. PAR data showed the normal (Gaussian) distribution. Pearson's correlation matrix of 14 climatic and five geographical variables revealed that PAR was negatively correlated with CLD, RH, PPT, RH min , RH max , and Lat and positively correlated with H o , S o , S, ST 5 , T max , T, ET, T min , DtS, Elev, and Lon in decreasing order of correlation coefficient (R) value significantly ( Table 2).
Best generic and month-specific MLR models of incident PAR, and the composition of their explanatory variables are presented in Table 3. The best generic MLR model accounted for 94% of variation in monthly mean daily incident PAR, while the best month-specific MLR models had R 2 adj.
values ranging from 33% in July to 77% in January (P < 0.001). Most frequently found explanatory variables of the 12 best month-specific MLR models were "H o .(S/S o )"(92%), "H o " (58%), elevation (42%), and aspect (42%). According to the best generic and five month-specific MLR models, regional topographic influence can be seen as an increase in PAR at a rate of 0.3 to 0.5 MJ m -2 day -1 for every 1000-m increase in elevation. Similarly, the five best month-specific MLR models indicated that aspect (expressed in compass degrees relative to north in a clockwise direction) had a significant effect on PAR at a rate of 0.08 to 0.15 MJ m -2 day -1 per 100 o .
Comparison of observed versus predicted PAR values revealed that the generic MLR model performed better than the month-specific MLR models for the six months of March, May, June, July, October, and November. Validation R 2 values ranged from 28% for June to 74% for December and from 13% for June to 77% for December based on the generic and month-specific MLR models, respectively (Table 3). Surface maps of incident PAR for each month were generated using the combination of raster maps created for the explanatory variables used in the MLR models and map algebra techniques. Spatial variability maps of PAR are presented for February, May, August and November in Figure 1 as months representative of winter, spring, summer, and autumn, respectively.
Surface maps of PAR were also derived from best fit spherical semvariogram models of universal kriging for comparison with the MLR-based surface maps. Universal kriging has been reported to be most robust when variables with a strong geographical trend or drift (anisotropy) such as PAR used in this study are spatially interpolated [22][23][24][25]. The first order of trend removal was applied to satisfy stationarity assumption (a homogeneous behavior on the structure of spatial correlation) prior to fitting of anisotropic spherical semi-variogram models. The positive Moran's I values of 0.07 to 0.15 showed the existence of a spatial autocorrelation (dependency) in order for the robust geostatistical interpolation of PAR to be implemented (P < 0.01). Trend analysis revealed an overriding trend in PAR and elevation in the north-to-south and west-to-east directions in Turkey, respectively. The global trend in PAR may be associated with its latitudinal dependence with the northern parts receiving less radiation than the southern parts of Turkey. The spherical semi-variogram model estimates for the lag size, range, nugget, and partial sill of universal kriging, and their one-leave-out cross-validation statistics are reported in Table 4 in order to help to account for the spatially and non-spatially correlated components of the variance. According to the N:S ratios, PAR was considered moderately spatially dependent for February (53%), January (55%), March (56%), July (59%), October (63%), December (64%), August (65%), and September (67%) and weakly spatially dependent for November (77%), April (81%), May (100%), and June (100%) in decreasing order, respectively. The higher the N:S ratio is, the larger the amount of non-spatially correlated variation, or the error component is. High nugget effect may also be attributed to the variability range of PAR shorter than the chosen grid size of 500 x 500 m. There was no anisotropy evident in the semi-variograms of PAR for May and June.  PAR: photosynthetically active radiation (MJ m -2 day -1 ); Lat: latitude (decimal degree); Lon: longitude (decimal degree); Elev: elevation (m); DtS: distance to sea (m); Asp: aspect (compass degree); T: mean air temperature ( o C); T min : minimum air temperature ( o C); T max : maximum air temperature ( o C); ET: potential evapotranspiration (mm); PPT: precipitation (mm); CLD: cloudiness (%); RH: mean relative humidity (%); RH min : minimum relative humidity (%); RH max : maximum relative humidity (%); H o : monthly mean daily extraterrestrial solar radiation on a horizontal surface (MJ m -2 day -1 ); S: day length (h); S o : maximum possible sunshine duration (h); and ST 5 : soil temperature for a depth of 0 to 5 cm ( o C). The signs "*, **, and ****" denote significance levels of P < 0.05, < 0.01, and ≤ 0.001, respectively. No asterisk by the values indicates no significance (P > 0.05).
One-leave-out cross-validation statistics of the universal kriging models of PAR had R 2 values of 28% for May to 62% for November. The fact that the MPE and MSPE values were close to zero reveals the small degree of bias in the monthly mean daily model predictions ( Table 4). The small RMSPE and ASE values indicate the precision of the universal kriging predictions. The RMSSPE values were all close to unity, thus comparing the error variance to kriging variance. The fact that the ASE values > the RMSPE values and the RMSSPE values < unity shows a slight indication of overestimation in the monthly mean daily incident PAR predictions (Table 4). In particular, the surface maps of PAR for the winter and autumn seasons of January, February, October, November, and December had relatively low cross-validation errors. All the multiple linear regression (MLR) models are significant at P < 0.001. All the coefficient values of the intercept terms and explanatory variables are significant at P < 0.05. n = 1500 for the generic MLR; n = 123 for February, April, and June; and n = 125 for the rest. PAR: photosynthetically active radiation (MJ m -2 day -1 ); Lat: latitude (decimal degree); Lon: longitude (decimal degree); Elev: elevation (m); DtS: distance to sea (m); Asp: aspect (compass degree); T: mean air temperature ( o C month -1 ); PPT: precipitation (mm month -1 ); CLD: cloudiness (% month -1 ); RH: mean relative humidity (% month -1 ); H o : monthly mean daily extraterrestrial solar radiation on a horizontal surface (MJ m -2 day -1 ); S: day length (h); S o : maximum possible sunshine duration (h); and ST 5 : soil temperature for a depth of 0 to 5 cm ( o C month -1 ). For the categorical variable "month", the months of January to December were designated as 1 to 12, respectively. R 2 values based on validation of month-specific MLR models for May and June are significant at P < 0.01 and P < 0.05, respectively, with the rest being significant at P < 0.001. R 2 values based on validation of generic MLR model for months are significant at P ≤ 0.001.
Comparison of validation R 2 values among the generic and month-specific MLR-based models, and the universal kriging models revealed that the universal kriging model performed better for April than the generic MLR model, for May than the month-specific MLR model, and for June than both generic and month-specific MLR models (Tables 3 and 4). For visual comparison, the four months of February, May, August, and November were selected as representatives of the winter, spring, summer and autumn seasons, respectively (Figure 1). For each representative month, spatial interpolations with the highest values of validation R 2 were chosen among the MLR-based and universal kriging models.  Table 4 for semi-variogram models and spatial one-leave-out cross-validation statistics) and in (b) February, (d) May, (f) August and (h) November based on multiple linear regression (MLR) models (see Table 3 for MLR models and validation results).
(g) (h) MPE: mean prediction error; RMSPE: root-mean-square prediction error; ASE: average standard prediction error; MSPE: mean standardized prediction error; RMSSPE: root-mean-square standardized prediction error; R 2 values based on oneleave-out cross-validation are all significant at P < 0.001; All semi-variograms were best fit using number of lags = 12 after the first order of trend removal.
Monthly mean daily incident PAR values were 4.8 + 0.7 MJ m -2 day -1 in February, 9.6 + 0.7 MJ m -2 day -1 in May, 9.7 + 0.8 MJ m -2 day -1 in August, and 3.7 + 0.5 MJ m -2 day -1 in November for the digital maps generated by universal kriging. For the digital maps generated by the best MLR models, monthly mean daily incident PAR values were 4.9 + 0.6 MJ m -2 day -1 in February, 10.3 + 0.6 MJ m -2 day -1 in May, 10.0 + 0.7 MJ m -2 day -1 in August, and 4.3 + 0.5 MJ m -2 day -1 in November. Visual interpretation of the surface maps in Figure 1 showed that the spatial variability patterns of monthly mean daily incident PAR were more accurately reflected in the surface maps created by the MLR-based models than in those created by the universal kriging method, particularly, for spring (May) and autumn (November).

Conclusions
The research documents how multiple factors can be efficiently incorporated into the generation of surface maps over a vast and complex terrain and compares the MLR-and universal kriging-derived models, based on an independent dataset of validation. The MLR-based spatial interpolation algorithms of PAR described in this study appeared to be useful, particularly, for complex terrains influenced by multiple biogeoclimatic factors such as Turkey. Further adjustments through field measurements, estimates by modeling, and remote sensing are needed to predict the spatio-temporal dynamics of PAR intercepted and absorbed by the canopy. The combination of MLR models and spatial interpolation techniques as reported in this study may assist in developing biogeochemical models elucidating spatio-temporal dynamics over meso-scales.