Evaluation of NASA POWER Reanalysis Products to Estimate Daily Weather Variables in a Hot Summer Mediterranean Climate

Abstract: This study evaluates NASA POWER reanalysis products for daily surface maximum (Tmax) and minimum (Tmin) temperatures, solar radiation (Rs), relative humidity (RH) and wind speed (Ws) against observed data from 14 weather stations distributed across the Alentejo Region, Southern Portugal, which has a hot summer Mediterranean climate. Results showed good agreement between NASA POWER reanalysis and observed data for all parameters except wind speed, with a coefficient of determination (R²) higher than 0.82, a normalized root mean square error (NRMSE) varying from 8 to 20%, and a normalized mean bias error (NMBE) ranging from −9 to 26% for those variables. Based on these results, and in order to improve the accuracy of the NASA POWER dataset, two bias corrections were applied to all weather variables: one for the Alentejo Region as a whole, and another for each location individually. Results improved significantly, especially when a local bias correction was performed, with the mean NRMSE of Tmax and Tmin improving to 6.6% (from 8.0%) and 16.1% (from 20.5%), respectively, while the mean NMBE decreased from 10.65 to 0.2%. Rs results also show a very high goodness of fit, with a mean NRMSE of 11.2% and a mean NMBE equal to 0.1%. Additionally, bias-corrected RH data performed acceptably, with an NRMSE lower than 12.1% and an NMBE below 2.1%. However, even when a bias correction is performed, Ws lacks the performance shown by the remaining weather variables, with an NRMSE never lower than 19.6%. Results show that NASA POWER can be useful for generating weather data sets where ground weather station data are missing or unavailable.


Introduction
Weather variables are regarded as one of the most significant factors affecting decision making in agriculture. In many regions around the world, weather variables are not observed, are of poor quality due to a lack of quality control, or are not freely available. Reanalysis and gridded meteorological data from global atmospheric models are considered one of the weather data sources that can compensate for the lack of observations, quality, or availability [1].
Reanalysis of the observations, with more complete data, improved quality control, and constant state-of-the-art assimilating models and analysis systems, greatly improves the homogeneity of the record and makes it useful for examining weather variations. This whole endeavor is now referred to as "reanalysis" [2]. Reanalysis products are constructed from numerical weather data assimilation systems that use a variety of atmospheric and sea surface observations to provide long-term atmospheric and land surface variables [3]. However, some reanalysis products may require corrections using observation-based datasets in order to remove anomalies that arise from land surface modelling [4].
There are several historical reanalysis datasets available, such as the Climate Forecast System Reanalysis (CFSR) [5]; the ERA-Interim reanalysis products [6], provided by the European Centre for Medium-Range Weather Forecasts (ECMWF); the Japanese 55-year Reanalysis (JRA-55) [7], provided by the Japan Meteorological Agency; the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) reanalysis [8]; the NASA Modern Era Retrospective-Analysis for Research and Applications (MERRA) [9]; and the NASA Prediction of Worldwide Energy Resource (NASA POWER) [10]. The latter, available at a resolution of 0.5° latitude by 0.5° longitude at the NASA POWER website (https://power.larc.nasa.gov/, accessed on 1 November 2020), provides daily data of near-surface air temperature, relative humidity, rainfall, solar radiation and wind speed and direction. All these datasets derive from simulations of numerical weather prediction models based on a set of meteorological observations. NASA POWER, however, is particularly easy to use: data can be requested for a single point, a region or the whole globe, with daily, interannual and climatological temporal averages. Selecting a single point returns a time series for the chosen coordinate (a single latitude and longitude). The regional endpoint produces a time series dataset based on a bounding box of latitude and longitude coordinates defined by the user. The global endpoint returns long-term climatological averages for the entire globe. If the data prove to be accurate, the user-friendly interface gives any end-user easy access to near-real-time, sound weather data from anywhere around the globe.
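As a rough illustration of the single-point access described above, the sketch below builds a request URL for daily data at one coordinate. The endpoint path, the query parameter names and the variable codes (T2M_MAX, T2M_MIN, ALLSKY_SFC_SW_DWN, RH2M, WS2M) are assumptions based on the public NASA POWER API, not details given in this study, and should be checked against the current API documentation before use. The coordinates and dates are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint of the NASA POWER daily single-point service (not given in the text).
BASE_URL = "https://power.larc.nasa.gov/api/temporal/daily/point"

# Assumed POWER parameter codes for Tmax, Tmin, Rs, RH and Ws.
PARAMETERS = ["T2M_MAX", "T2M_MIN", "ALLSKY_SFC_SW_DWN", "RH2M", "WS2M"]


def build_point_url(lat, lon, start, end):
    """Return a request URL for one point; dates are YYYYMMDD strings."""
    query = {
        "parameters": ",".join(PARAMETERS),
        "community": "AG",  # assumed community code for agroclimatology
        "latitude": lat,
        "longitude": lon,
        "start": start,
        "end": end,
        "format": "JSON",
    }
    # Keep commas unescaped so the parameter list stays readable.
    return BASE_URL + "?" + urlencode(query, safe=",")


# Hypothetical coordinate inside the study region.
url = build_point_url(38.0, -7.9, "20200101", "20200131")
```

The function only assembles the URL; fetching and parsing the response is left out so that the sketch does not depend on network access or on the exact response layout.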
White et al. [23] compared daily temperature from NASA POWER with observations in the continental USA, concluding that the reanalysis data showed good agreement, with a root mean square error (RMSE) of 4.1 °C for maximum temperature and 3.7 °C for minimum temperature. Additionally, they recommended that data could be improved by adjusting for elevation effects, reducing seasonal bias, and refining the estimation of actual maximum and minimum temperatures in diurnal cycles. Bai et al. [24] assessed NASA POWER's daily maximum and minimum temperatures and solar radiation in China. They concluded that there is a close relation between NASA POWER and the observed data, with an RMSE of 4.0 °C, 3.2 °C and 3.4 MJ m⁻² d⁻¹ for maximum and minimum temperatures and solar radiation, respectively. Negm et al. [25] assessed the suitability of NASA POWER to estimate reference evapotranspiration through daily maximum, minimum and average air temperatures, relative humidity, global solar radiation and wind speed in Sicily, Italy. Results showed an RMSE for those variables of 3.6 °C, 5.0 °C, 3.2 °C, 12.2%, 2.7 MJ m⁻² d⁻¹ and 2.4 m s⁻¹, respectively, allowing them to conclude that NASA POWER agreed with the corresponding values measured at ground weather stations. However, inaccurate estimations of relative air humidity occurred for the coastal weather stations. Monteiro et al. [26] performed a similar study in Brazil and found similar agreement for the same variables. Aboelkhair et al. [1] evaluated NASA POWER reanalysis data for surface monthly average maximum, minimum, average and dew point temperatures, and relative humidity, in comparison with observed data in Egypt. The results showed a significant correlation between NASA POWER reanalysis and observed data for all temperature parameters (RMSE lower than 5 °C), but NASA POWER failed to accurately simulate relative humidity (with an average RMSE of 11.6%).
However, none of those previous studies evaluated the impact of bias correction on reanalysis products.
Using daily meteorological NASA POWER reanalysis data and observations from 14 weather stations in the Alentejo Region, Southern Portugal, the objectives of this study are: (1) to assess the accuracy of NASA POWER maximum and minimum temperatures, solar radiation, relative humidity and wind speed when compared to local observations; (2) to assess the performance of alternative bias correction procedures to improve reanalysis accuracy.

Study Area
This study was conducted in the Alentejo Region, Southern Portugal. The region has a Köppen-Geiger Csa climate and is characterized by a semi-arid Mediterranean climate, with a hot and dry summer and a mild winter in which the annual rainfall is concentrated. This region was selected because it is semi-arid and prone to desertification, so that water availability is crucial to achieve farming sustainability and profitability. Additionally, due to recurrent water scarcity, this agricultural area is exposed to several weather-related risks.
Daily meteorological data from 14 ground weather stations were obtained from the Irrigation Operation and Technology Center (COTR). The number and location of the selected weather stations make it possible to evaluate the performance of the reanalysis data at the regional level and to better understand the spatial-temporal trends of the series. All weather data are validated every day by a team of experienced technicians, ensuring their quality and reliability. Figure 1 and Table 1 present, respectively, the geographical position of the weather stations, and their coordinates and period of observation.

Agroclimatic Data
The annual mean and standard deviation for the selected weather variables (maximum and minimum temperature, solar radiation, relative humidity and wind speed) are shown in Table 2. These data were interpolated between stations to cover the whole Alentejo Region, as illustrated in Figure 2, using a kriging algorithm. Due to the specific nature of the atmospheric processes leading to rainfall, the evaluation of this variable was left to a dedicated study. The same weather parameters for the same period of observations were collected from NASA POWER at the grid point nearest to the target location.

Table 2. Annual mean and standard deviation of maximum (Tmax) and minimum (Tmin) temperatures, mean relative humidity (RH), solar radiation (Rs) and mean wind speed (Ws) at the selected weather stations.

Figure 2 shows that the Atlantic Ocean (and, to some extent, the proximity of the Mediterranean Sea) plays a significant role in the temperature, humidity and wind speed variations across the region. The locations closer to the sea (e.g., Odemira) present lower temperature amplitudes (with lower maximum and higher minimum average temperatures) and higher relative humidity and wind speed. On the other hand, at more inland locations (e.g., Moura), temperature tends to present higher maximums and lower minimums, with lower relative humidity and wind speed. Therefore, there is an inverse relationship between average temperature and relative humidity. Solar radiation tends to increase at lower latitudes.
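The selection of the nearest NASA POWER grid point for a station can be sketched as follows. This is a minimal illustration that assumes a regular 0.5° grid whose nodes fall on multiples of 0.5°; the actual grid origin is not stated in the text, so the `origin` argument is a placeholder, and the station coordinate used below is hypothetical.

```python
import math


def nearest_grid_point(lat, lon, step=0.5, origin=0.0):
    """Snap a coordinate to the closest node of a regular lat/lon grid."""
    def snap(value):
        return origin + round((value - origin) / step) * step
    return snap(lat), snap(lon)


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# Hypothetical station coordinate (decimal degrees).
station = (38.04, -7.88)
grid = nearest_grid_point(*station)
offset_km = haversine_km(station[0], station[1], grid[0], grid[1])
```

The distance between the station and its grid node gives a quick sense of how far the reanalysis cell centre may lie from the point of observation, which is one reason a local bias correction can help.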

Evaluation Criteria
The estimation accuracy of each variable was assessed through the metrics listed below, where O_i and P_i (i = 1, 2, ..., n) represent pairs of values of each variable from locally collected data and from NASA POWER estimates, respectively, Ō and P̄ are the respective mean values, and n is the number of samples of each variable:

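• The coefficients of regression and determination, relating the observed and simulated data, b and R², respectively, are defined as:

$$b = \frac{\sum_{i=1}^{n} O_i P_i}{\sum_{i=1}^{n} O_i^2}$$

$$R^2 = \left\{ \frac{\sum_{i=1}^{n} (O_i - \bar{O})(P_i - \bar{P})}{\left[\sum_{i=1}^{n} (O_i - \bar{O})^2\right]^{0.5} \left[\sum_{i=1}^{n} (P_i - \bar{P})^2\right]^{0.5}} \right\}^2$$

According to Henseler et al. [27], R² values of 0.25, 0.50 and 0.75 correspond to a weak, moderate and significant fit, respectively.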

• The root mean square error, RMSE, and its normalization, NRMSE, which characterize the variance of the estimation error:

$$\mathrm{RMSE} = \left[ \frac{\sum_{i=1}^{n} (P_i - O_i)^2}{n} \right]^{0.5}$$

$$\mathrm{NRMSE} = 100 \, \frac{\mathrm{RMSE}}{\bar{O}}$$

RMSE measures overall discrepancies between observed and estimated values; the smaller the value, the better the accuracy. NRMSE is dimensionless, which allows its values to be compared across variables; a normalization below 15% is taken to indicate a good fit.

• The mean bias error, MBE, and its normalization, NMBE, which measure the systematic error between the predicted and observed values:

$$\mathrm{MBE} = \frac{\sum_{i=1}^{n} (P_i - O_i)}{n}$$

$$\mathrm{NMBE} = 100 \, \frac{\mathrm{MBE}}{\bar{O}}$$

Positive values of MBE and NMBE indicate that the predicted data overestimate the observations, while negative values indicate underestimation. MBE is intended to indicate the average interpolation bias [28].

• The Nash and Sutcliffe [29] modelling efficiency, EF, which is the ratio of the mean square error to the variance in the observed data, subtracted from unity:

$$\mathrm{EF} = 1 - \frac{\sum_{i=1}^{n} (O_i - P_i)^2}{\sum_{i=1}^{n} (O_i - \bar{O})^2}$$

As suggested by Legates and McCabe [30], if the square of the differences between model simulations and observations is as large as the variability in the observed data, then EF tends toward 0.0 and the observed mean, Ō, is as good a predictor as the model, while negative values indicate that Ō is an even better predictor than the model. EF can vary between −∞ and 1.
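The metrics listed above can be computed as in the following sketch. It assumes the conventional definitions: a regression forced through the origin for b, errors taken as predicted minus observed, and normalization of RMSE and MBE by the observed mean. The function name and the sample values are illustrative only and are not taken from the study.

```python
import math


def accuracy_metrics(obs, pred):
    """Return b, R2, RMSE, NRMSE, MBE, NMBE and EF for paired series."""
    n = len(obs)
    o_mean = sum(obs) / n
    p_mean = sum(pred) / n

    # Regression coefficient, assuming a regression forced through the origin.
    b = sum(o * p for o, p in zip(obs, pred)) / sum(o * o for o in obs)

    # Coefficient of determination as the squared Pearson correlation.
    cov = sum((o - o_mean) * (p - p_mean) for o, p in zip(obs, pred))
    ss_o = sum((o - o_mean) ** 2 for o in obs)
    ss_p = sum((p - p_mean) ** 2 for p in pred)
    r2 = cov ** 2 / (ss_o * ss_p)

    # Error-based metrics (predicted minus observed).
    sq_err = sum((p - o) ** 2 for o, p in zip(obs, pred))
    rmse = math.sqrt(sq_err / n)
    mbe = sum(p - o for o, p in zip(obs, pred)) / n

    return {
        "b": b,
        "R2": r2,
        "RMSE": rmse,
        "NRMSE": 100.0 * rmse / o_mean,  # percent of the observed mean
        "MBE": mbe,
        "NMBE": 100.0 * mbe / o_mean,    # percent of the observed mean
        "EF": 1.0 - sq_err / ss_o,       # Nash-Sutcliffe efficiency
    }


# Made-up example: predictions that are uniformly 2 units too high.
metrics = accuracy_metrics([10.0, 20.0, 30.0], [12.0, 22.0, 32.0])
```

In this made-up case the constant offset leaves R² at 1 while RMSE, MBE and EF all register the bias, which illustrates why the bias-sensitive metrics are reported alongside R².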

Correction of Bias
Bias correction seeks to reduce the differences between the NASA POWER data and the observed data, since forecast products are often biased due to errors in the host weather forecast models [31]. To find an accurate yet simple bias correction procedure, the approach proposed by Leander and Buishand [32] was used, in which the correction of bias only involves shifting and scaling to adjust the mean and variance. Two bias correction schemes were carried out for each variable (Figure 3): (1) for the Alentejo Region as a whole, hereafter referred to as a regional bias correction, where all data were treated as one set; (2) for each location individually, hereafter referred to as a local bias correction.
For each scheme, the corrected daily weather variable X' was obtained from Equation (1), where X_NASA is the uncorrected NASA POWER daily weather variable and X_obs is the observed daily weather variable. In the equation, an overbar denotes the average over the considered period and σ the standard deviation. The ratio of the standard deviations performs the scaling, while the difference of the averages performs the shifting of bias.
A major uncertainty of bias correction is how well it performs under conditions different from those used at calibration. Thus, a validation procedure was applied, in which the model fit was checked against a set of data independent of the model fitting set. In the present study, the models were fitted individually for each weather station location and validated on independent data sets for the same location. The validation procedure consisted of dividing each data set into two subsets of the same size, randomly chosen from the dataset (e.g., Paredes et al. [16]). For each iteration, the first set was used for calibration and the second for validation (Figure 3). The parameters of Equation (1) were obtained from the calibration dataset and then used with the validation dataset. The performance of each bias correction procedure was assessed on both the calibration and validation sets. When comparing the accuracy metrics of the bias-corrected calibration and validation datasets with those of the bias-corrected full datasets, no evidence of overfitting was found, with the estimated parameters showing only residual differences (Supplementary Tables S1-S5). Thus, in order to test the validity of the procedure when applied to long sets, the approach of bias correcting the full dataset was adopted.
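The shift-and-scale correction and the split-sample validation can be sketched as below. This is only an illustration under stated assumptions: it uses a common mean-and-variance matching form (anomalies rescaled by the ratio of standard deviations, then shifted to the observed mean), and the exact arrangement of terms in Equation (1) may differ. The function names and the daily values are hypothetical.

```python
import random
import statistics


def fit_shift_scale(nasa, obs):
    """Estimate the means and standard deviations from a calibration set."""
    return {
        "mean_nasa": statistics.fmean(nasa),
        "mean_obs": statistics.fmean(obs),
        "sd_nasa": statistics.pstdev(nasa),
        "sd_obs": statistics.pstdev(obs),
    }


def apply_shift_scale(nasa, params):
    """Rescale anomalies by the std ratio, then shift to the observed mean."""
    ratio = params["sd_obs"] / params["sd_nasa"]
    return [params["mean_obs"] + ratio * (x - params["mean_nasa"]) for x in nasa]


def split_halves(n, seed=0):
    """Randomly split indices 0..n-1 into calibration and validation halves."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    half = n // 2
    return idx[:half], idx[half:]


# Hypothetical daily values for one station (not the study's data).
nasa = [12.0, 15.5, 19.0, 23.5, 27.0, 30.5, 33.0, 29.0, 24.5, 18.0]
obs = [11.0, 14.0, 18.5, 22.0, 26.5, 29.0, 31.5, 28.0, 23.0, 17.5]

cal, val = split_halves(len(nasa), seed=42)
params = fit_shift_scale([nasa[i] for i in cal], [obs[i] for i in cal])
corrected_cal = apply_shift_scale([nasa[i] for i in cal], params)
corrected_val = apply_shift_scale([nasa[i] for i in val], params)
```

Under this form, a local correction would call `fit_shift_scale` on each station's series separately, while a regional correction would call it once on the series of all stations pooled together. The accuracy metrics would then be computed on `corrected_val` against the matching observations to check for overfitting.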

Bias Correction Equations
The resulting calibrated and validated bias correction equations, following the procedure presented in Figure 3, are presented in Table 3. The results obtained when adopting each equation are presented in the following sections. Full results for all locations, with and without bias corrections, are presented in Supplementary Tables S1-S5.

Evaluation of Maximum Temperature Accuracy
Table 4 presents the mean and range values of the accuracy metrics relative to NASA POWER maximum temperature with and without bias correction. Results show that NASA POWER successfully simulates maximum temperature data with excellent accuracy, even if no bias correction is performed, with R² and EF higher than 0.82 and 0.68, respectively, a mean NRMSE of 7.59% and an average NMBE equal to −2.56%. If a regional bias correction is applied, results show a slightly better performance: mean b increases by 2.2% (to 1.00), while the average RMSE decreases by 9.2% (to 1.74 °C day⁻¹) and the mean MBE increases by 104.4% (to 0.03 °C day⁻¹). Similarly, if a local bias correction is adopted, the mean RMSE, MBE and EF improve: the mean RMSE decreases by 17.0% (to 1.59 °C day⁻¹), the mean MBE decreases by 97.5% in magnitude (to −0.02 °C day⁻¹) and the mean modelling efficiency increases by 2.8% (to EF = 0.95).
Results (Supplementary Table S1 and Figure S1) show that only 14% of the weather stations present an NRMSE ≤ 6.5% if no bias correction is performed, in contrast with the 57% of the weather stations that present an NRMSE ≤ 6.5% when a local bias correction is performed. When no bias correction is performed, 28% of the weather stations show an NMBE ranging from −2.5 to 2.5%; however, when a correction is applied, the frequency for the same NMBE range increases to 71 and 100% for regional and local bias correction, respectively. Additionally, if Tmax is locally bias corrected, an EF higher than 0.95 is obtained for 79% of all locations; if no bias correction is performed, only 29% of the locations perform equally.
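The frequency figures quoted above are shares of the 14 stations that fall within a given class. A minimal sketch of that tally is shown below, using made-up NRMSE values rather than the values in Supplementary Table S1; the helper name `share_within` is hypothetical.

```python
def share_within(values, low=float("-inf"), high=float("inf")):
    """Percentage of values v that satisfy low <= v <= high."""
    hits = sum(1 for v in values if low <= v <= high)
    return 100.0 * hits / len(values)


# Hypothetical NRMSE values (%) for 14 stations; not the study's data.
nrmse = [5.9, 6.1, 6.3, 6.4, 6.5, 6.2, 6.0, 6.4, 6.8, 7.1, 7.4, 6.9, 7.8, 8.2]

# Share of stations with NRMSE <= 6.5%.
share = share_within(nrmse, high=6.5)
```

With 14 stations, each station accounts for roughly 7 percentage points, so 8 stations meeting a threshold corresponds to about 57% and 2 stations to about 14%.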

Evaluation of Minimum Temperature
Mean and range values of the accuracy metrics relative to minimum temperature (Table 5) show that NASA POWER can successfully estimate daily Tmin data with an R² higher than 0.85, showing an excellent accuracy of NASA POWER when compared with observed data. These results were obtained when adopting the calibrated and validated bias correction equations presented in Table 3. If no bias correction is performed, minimum temperature can be estimated with an EF averaging 0.84, a mean RMSE of 2.01 °C day⁻¹ and a mean MBE of 1.03 °C day⁻¹. Results tend to improve when a bias correction is applied: for a regional correction, mean b decreases by 8.9% (to 0.99), while the mean RMSE and MBE show a decrease of 16.4% (to 1.68 °C day⁻¹) and 105.2% (to −0.05 °C day⁻¹), respectively; if a local bias correction is applied, the average RMSE is even lower (with a decrease of 21.3% to 1.58 °C day⁻¹), with a mean MBE equal to −0.02 °C day⁻¹, representing a decrease of 101.8% when compared with raw data. The modelling efficiency tends to increase with both correction schemes, to an average of 0.89 and 0.90 for regional and local bias correction, respectively.
If no bias correction is performed, 86% of the weather stations show an NRMSE higher than 18%. If the bias of Tmin is regionally corrected, the most frequent NRMSE ranges from 22 to 26% (36% of all weather stations), while when a local bias correction is performed 57% of the weather stations present an NRMSE ≤ 18%. When no bias correction is performed, 14% of the weather stations show an NMBE ranging from −5 to 5%, while if a local bias correction is applied, Tmin is estimated with an NMBE within ±5% for all locations. If Tmin is locally bias corrected, an EF higher than 0.84 is obtained for all locations. Full results of the accuracy metrics for minimum temperature are presented in Supplementary Table S2 and Figure S2.

Evaluation of Solar Radiation
Solar radiation (Table 6) was the most accurate NASA POWER weather variable evaluated. All stations present an R² value higher than 0.91, reaching 0.97 for Redondo and Vidigueira (Table S3). Errors are therefore relatively small for Rs, with the most frequent NRMSE (Supplementary Figure S3) lower than 13.5%, even without a bias correction. Additionally, the most frequent EF is higher than 0.93 for 93% of all locations when the bias is either regionally or locally corrected; also, the most frequent NMBE ranges from −4.0 to 4.0% with and without a bias correction, showing little under- or overestimation of Rs data.

Evaluation of Relative Humidity
Accuracy metrics for NASA POWER relative humidity as compared to local observations are shown in Table 7. Results, when adopting the calibrated and validated bias correction equations shown in Table 3, present a mean R² of 0.82 (as high as 0.88). However, for Odemira (Table S4) the coefficient of determination is equal to 0.40, showing low correlation and fit between observed and simulated RH; this may be due to the station's closeness to the sea and the influence of both the Atlantic Ocean and the Mediterranean Sea. The EF values average 0.61 and range from −0.08 to 0.79. Additionally, for raw NASA POWER RH data, the RMSE averages 9.24% day⁻¹ (with an NRMSE of 13.00%), with MBE averaging −5.17% day⁻¹ (with an NMBE of −7.27%). However, when the bias is corrected, results improve significantly, with both correction schemes showing an increase of the mean b of 7.7% (to 1.01) and a mean EF increasing by 30.7% (to 0.80). The mean RMSE decreases by 28.8 and 30.0% for a regional and local bias correction, respectively, with normalizations below 9.25%. The most frequent EF (Supplementary Figure S4) for both schemes is higher than 0.80, with more than 93% of all stations having an NMBE between −1.0 and 1.0%.

Evaluation of Wind Speed
Among all the weather variables evaluated, NASA POWER's wind speed is the least accurate. When using the calibrated and validated wind speed bias correction equations (Table 3), the maximum value of R² (0.79) was recorded for Beja, while the minimum value of R² (0.52) was obtained for Elvas (Table S5), showing a moderate to high fit between reanalysis and observed wind speed data. Results for the uncorrected NASA POWER wind speed data (Table 8) show a mean coefficient of regression of 1.40, with an average RMSE of 1.10 m s⁻¹, a mean NRMSE equal to 62.50% and an average modelling efficiency of −0.88. Wind speed is overestimated by NASA POWER for all weather stations, with an MBE that varies from 0.12 to 1.44 m s⁻¹. Bias correction increases the accuracy of NASA POWER wind speed estimation significantly. The mean EF increased to 0.40 (an improvement of 146.3%) for a regional correction, and to 0.53 (an increase of 160.7%) when locally correcting the bias. The RMSE decreased by 37.4% (to 0.69 m s⁻¹) for the former, and by 45.7% (to 0.60 m s⁻¹) for the latter. The most frequent (Supplementary Figure S5) NMBE and EF for the raw NASA POWER wind speed data are higher than 60% (42% of all locations) and lower than 0 (71% of all locations), respectively; however, if the dataset is locally bias corrected, the most frequent NMBE ranges from −10.0 to 10.0%, with more than 60% of the stations showing an EF higher than 0.50.
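Percentage changes above 100%, such as those quoted for EF, arise because the metric changes sign between the raw and the corrected data. Assuming the percentages are changes relative to the absolute raw value (an interpretation, not something stated in the text), the rounded means quoted above give figures close to those reported:

```python
def relative_change(raw, corrected):
    """Change from raw to corrected, in percent of the absolute raw value."""
    return 100.0 * (corrected - raw) / abs(raw)


# Mean EF for wind speed: raw -0.88, regional correction 0.40.
ef_regional = relative_change(-0.88, 0.40)    # roughly 145%, vs. 146.3% reported

# Mean RMSE for wind speed: raw 1.10, regional correction 0.69 (m/s).
rmse_regional = relative_change(1.10, 0.69)   # roughly -37%, vs. a 37.4% decrease reported
```

The small gaps with respect to the reported percentages are consistent with the inputs here being rounded to two decimals.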

Discussion
Based on the results presented, we can conclude that NASA POWER is capable of predicting most weather variables accurately for the Alentejo Region. Table 9 presents the mean and standard deviation values of all the accuracy metrics relative to NASA POWER weather variables, with and without bias correction. NASA POWER's raw maximum and minimum temperature and solar radiation prove to be accurate, with high agreement and goodness of fit and very little under- or overestimation when compared to the observations. When performing a bias correction, results tend to improve. On average, MBE values improve from −0.64 and 1.03 to −0.02 °C day⁻¹ for Tmax and Tmin, respectively, and from 0.65 to 0.01 MJ m⁻² day⁻¹ for Rs. Additionally, a local bias correction of NASA POWER Tmax, Tmin and Rs improved the mean RMSE values from 1.91 °C, 2.01 °C and 2.10 MJ m⁻² day⁻¹ to 1.59 °C, 1.58 °C and 1.89 MJ m⁻² day⁻¹, respectively. Similar results for those variables were found by Bai et al. [24] in China, by Monteiro et al. [26] in Brazil and by Negm et al. [25] in Italy. White et al. [23], for the continental USA, and Aboelkhair et al. [1], for Egypt, also reported similar results and concluded that NASA POWER can accurately estimate maximum and minimum temperature. Therefore, the presented results show that NASA POWER can simulate maximum and minimum temperatures and solar radiation with a high goodness of fit and agreement when compared with observed data.
NASA POWER's relative humidity also shows good accuracy for inland locations when compared with observed data, with an average RMSE and MBE, after bias correction, of 6.47% day⁻¹ and 0.78% day⁻¹, respectively. This represents an improvement over raw RH data, for which a mean RMSE of 9.24% day⁻¹ and a mean MBE of −5.17% day⁻¹ were found. However, for coastal stations the estimation still needs improvement. Similar conclusions were drawn by Negm et al. [25], Monteiro et al. [26] and Aboelkhair et al. [1]. Nonetheless, one can conclude that NASA POWER can simulate RH with good accuracy when compared with observed data.
Among all weather variables, NASA POWER's wind speed is the one that performs the worst. Raw data show RMSE and MBE values averaging 1.10 and 0.85 m s⁻¹, with a coefficient of regression equal to 1.40. Results tend to improve with a bias correction; however, the lowest NRMSE is 19.6%, so one can conclude that wind speed lacks the performance shown by the remaining weather variables. Previous studies [25,26,32] also found that NASA POWER fails to estimate wind speed with acceptable accuracy. Despite the improvements obtained when a bias correction is performed, the NASA POWER wind speed reanalysis data show unsatisfactory correlation, do not agree with most of the ground observations and still need improvement.

Conclusions
Weather data are one of the key elements for crop and water management. However, the quality or availability of data in remote regions is questionable. Therefore, evaluating the potential of using daily reanalysis data (such as the data provided by NASA POWER) as an alternative to ground observations is needed.
The results presented in this study demonstrated that NASA POWER is capable of estimating most weather variables (maximum and minimum temperatures and solar radiation) with high accuracy for the Alentejo Region, Southern Portugal. For wind speed and coastal relative humidity, however, NASA POWER estimations still need improvement. We therefore suggest a local bias correction of all variables, which improves the accuracy of the estimated data. It can be concluded that NASA POWER could be useful for generating weather data sets where ground weather station data are missing or unavailable, improving weather information for better decision making. Nonetheless, additional studies are recommended to better assess the use of such data for the estimation of derived products, such as yield and water requirement estimations, and weather impacts on crop management and practices.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/agronomy11061207/s1: Table S1: Accuracy metrics relative to NASA POWER maximum temperature with and without bias correction for all 14 locations, Table S2: Accuracy metrics relative to NASA POWER minimum temperature with and without bias correction for all 14 locations, Table S3: Accuracy metrics relative to NASA POWER solar radiation with and without bias correction for all 14 locations, Table S4: Accuracy metrics relative to NASA POWER relative humidity with and without bias correction for all 14 locations, Table S5: Accuracy metrics relative to NASA POWER wind speed with and without bias correction for all 14 locations, Figure S1: Frequency (%) distribution of the accuracy metrics measuring the performance of NASA POWER maximum temperature (Tmax) without bias correction compared with adopting a regional bias correction and a local bias correction, Figure S2: Frequency (%) distribution of the accuracy metrics measuring the performance of NASA POWER minimum temperature (Tmin) without bias correction compared with adopting a regional bias correction and a local bias correction, Figure S3: Frequency (%) distribution of the accuracy metrics measuring the performance of NASA POWER solar radiation (Rs) without bias correction compared with adopting a regional bias correction and a local bias correction, Figure S4: Frequency (%) distribution of the accuracy metrics measuring the performance of NASA POWER relative humidity (RH) without bias correction compared with adopting a regional bias correction and a local bias correction, Figure S5: Frequency (%) distribution of the accuracy metrics measuring the performance of NASA POWER wind speed (Ws) without bias correction compared with adopting a regional bias correction and a local bias correction.
Data Availability Statement: Weather data were obtained from COTR and are available at http://www.cotr.pt/servicos/sagranet.php (accessed on 1 October 2020) with the permission of COTR.
Acknowledgments: The authors thank COTR (Irrigation Operation and Technology Center) for the data provided for this study.

Conflicts of Interest: The authors declare no conflict of interest.