Comparative Analysis between Daily Extreme Temperature and Precipitation Values Derived from Observations and Gridded Datasets in North-Western Romania

Abstract: Climate gridded datasets are highly needed and useful in conducting data analysis for research and practical purposes. They provide long-term information on various climatic variables for large areas worldwide, making them suitable for use at any spatial level. It is essential to assess the accuracy of gridded data by comparing the data to measured values, especially when they are used as input parameters for hydro-climatic models. From the multitude of databases available for North-western Romania, we selected three: the European Climate Assessment and Dataset (E-OBS), the Romanian Climatic Dataset (ROCADA), and the Climate of the Carpathian Region (CARPATCLIM). In this paper, we analyse the extreme precipitation and temperature data derived from the above-mentioned datasets over a common 50-year period (1961–2010) and compare the data with raw values to estimate the degree of uncertainty for each set of data. The observation data, recorded at two meteorological stations located in a complex topography region, were compared to the output of the gridded datasets, by using descriptive statistics for the mean and extreme annual and seasonal temperature and precipitation data, and trend analyses. The main findings are: the high suitability of the ROCADA gridded database for climate trend analysis and of the E-OBS gridded database for extreme temperature and precipitation analysis.


Introduction
Extremely useful in various stages of research and data analysis, daily gridded datasets are increasingly used as climatic parameter input for modelling, statistical analysis of time series, trend analysis, etc. [1,2]. They have the advantage of being able to cover large areas and, by various homogenization and interpolation methods, can be successfully used as input parameters in many climatic/hydrological models or for different types of analysis based on time series [3][4][5].
No matter what domain they are employed in, some important questions arise. How reliable and accurate are these datasets and what types of analysis are they suitable for? Can they be used to determine climate change or as input data for hydrological modelling? How does one choose from the multitude of gridded datasets available for a focus region, i.e., which one is the most appropriate for a specific study? Usually, one of the most common approaches is to select those gridded datasets that provide the values closest to the observed values in the area under study, and therefore are the most suitable for further analysis. encountered when using climate datasets for this purpose, which included heterogeneity of available information, differences in time and spatial resolution, types of access and available platforms as well as bias of models [8].
For North-western Romania, a multitude of daily gridded datasets are available for different climatic parameters, but choosing among them remains difficult. For the complex topography area in North-western Romania, only one study [1] has compared the data series of a gridded dataset with the observed values, yet without targeting the locations under analysis. Nevertheless, many studies have used different databases without prior validation [19][20][21]. In the presentation of ROCADA, its resulting datasets were compared with those returned by E-OBS and APHRODITE, considering the observation values recorded at six meteorological stations randomly distributed across the entire territory of Romania [1].
The main purpose of this study is to determine which datasets (from the available ones) are the most suitable for use at a very small spatial scale, in a complex topography area. When considering gridded datasets for larger regions, the topography "details" seem to be smoothened, but for small areas and communities, it is essential to choose the best datasets (in terms of accuracy and spatial resolution) as the impact of extreme events could be significant. This study is part of a wider research project aiming at determining the impact of climate change on river discharge flows in North-western Romania. One of the most important issues of such a research topic in a complex topography region is the accuracy of the climate data to be used as input for hydrological modelling. In this general context, we compared the gridded datasets derived from three databases with the observation data to select the most appropriate one to be further employed in a hydrological modelling study.

Data Used
To perform the comparison of the selected datasets, daily observation data recorded at two meteorological stations (Baia Mare and Ocna Șugatag) located in a region with complex topography were considered and compared with data derived from the following three databases: ROCADA [1], E-OBS [22,23], and Climate of the Carpathian Region (CARPATCLIM) [24]. Both weather stations belong to the National Meteorological Administration (NMA) network in Romania. Their location is presented in Figure 1 and their geographical coordinates in Table 1. The following climatic parameters were used for comparison: daily data of maximum temperature (TX), minimum temperature (TN), and precipitation (RR). Data strings corresponding to a common 50-year reference period (1961-2010) were employed for both observation and gridded data.

Observation Data
Most of the daily extreme temperature and precipitation data were freely downloaded from the European Climate Assessment and Dataset (ECA&D) project database (non-blended data) for the period 1961-2009 [22,23] and reconstructed from raw synoptic messages available on www.meteomanz.com (accessed on 16 February 2016) for the year 2010 [25].
As described on www.ecad.eu (accessed on 23 March 2020), "the non-blended series are the series as provided by the participants: the climate variable series collected from participating countries generally do not contain data for the most recent years. This is partly due to the time that is needed for data quality control and archiving at the home institutions of the participants, and partly the result of the efforts required to include the data in the ECA database. To make available for each station a time series that is as complete as possible, an automated update procedure has been included that relies on the daily data from SYNOP messages that are distributed in near real time over the Global Telecommunication System (GTS). By using this procedure, the gaps in a daily series were filled in with observations from nearby stations, provided that they are within 12.5 km distance and that height differences are less than 25 m", and thus the blended data series were derived [23]. Under these circumstances, for this study and for a more accurate analysis, we chose the non-blended data series.
The observation datasets were reconstructed and tested for homogeneity under the framework of the FMETPRO project (http://fmetpro.granturi.ubbcluj.ro, accessed on 28 March 2020) on extreme weather events related to temperature and precipitation in Romania [25].

Gridded Data
Data extracted from three databases were analysed: i. The ROCADA database, developed by the NMA, provides daily data for a 53-year period (1961-2013) for the entire territory of Romania. For homogenization and interpolation, 155 meteorological stations for TX and TN, and 188 meteorological stations for precipitation were employed. It has a spatial resolution of 0.1° × 0.1° (~11 km × 11 km) (Figure 2a). Datasets are freely available on the World Data Centre PANGAEA portal [1]. The homogenization was performed by using the MASH method, whereas for interpolation, the MISH procedure was employed; both methods are described in detail in [1,17,26-28]. The same methods were previously successfully used for creating the CARPATCLIM project database [24,26-29].
None of the considered stations for the present study was used for the validation of the ROCADA database.
ii. The CARPATCLIM database, developed under the framework of the project Climate of the Carpathian Region [24,26], covers an area of 483,525 km² and a 50-year period. As presented on the project website, it was developed by interpolating observation data from 585 precipitation and meteorological stations for the entire region. For Romania's territory, 67 stations were used for gridding precipitation and 91 stations for gridding TX and TN. To obtain the grid values, spatial interpolation was performed at the national level, using MISH/MASH software [17,26-29], and the results were afterwards compiled at the CARPATCLIM region level. Near-border station data series were exchanged between neighbouring countries to harmonize the interpolation across borders [24,26]. Data are freely available on the website of the project (http://www.carpatclim-eu.org, accessed on 15 January 2020). The spatial resolution of the grid is 0.1° × 0.1° (~11 km × 11 km) (Figure 2b). None of the considered stations for the present study was used to validate the database.
iii. The E-OBS is a database developed at the European level and contains 54,095 series of observations for 12 elements recorded at 15,562 meteorological stations throughout Europe, over the period 1950-2019. From Romania, data recorded at 27 weather stations for extreme temperatures and 28 weather stations for precipitation were considered for interpolation. The E-OBS comes as an ensemble dataset and is available on a 0.1° regular grid (~11 km × 11 km) (Figure 2c). The dataset is constructed based on a conditional simulation procedure. For each of the members of the ensemble, a spatially correlated random field is produced using a predetermined spatial correlation function. The mean across members is calculated and is provided as the "best-guess" field. The spread is calculated as the difference between the 5th and the 95th percentiles over the ensemble to provide a measure indicating the 90% uncertainty range [22,23,30].
All gridded datasets used in this study are freely available on the project's website (www.ecad.eu, accessed on 23 March 2020), to be used for non-commercial research projects, upon request.
For further analysis, a common 50-year period, 1961-2010, for all datasets considered was employed.

Methods
The time series derived from each gridded dataset were extracted based on the geographical coordinates of the weather stations, by intersection. For this task, we used ArcGIS Pro software (multidimension tools) (produced by ESRI, Redlands, CA, USA), by overlapping the point-type layer (used for the location of the meteorological stations) with each gridded dataset. Because the grid-cell centres do not coincide with the station locations, we used the nearest-neighbour method to extract the data. Starting from the considered data strings and having observation data as reference values, four main categories of analysis were further performed, as described below.
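The nearest-neighbour extraction can be illustrated with a short Python sketch. This is not the ArcGIS Pro workflow used in the study; it only shows the idea of picking the grid cell whose centre lies closest to a station. The function name `nearest_cell`, the great-circle distance criterion, and the coordinates in the usage note are assumptions made for illustration and are not taken from Table 1.

```python
import math


def nearest_cell(station_lat, station_lon, grid_lats, grid_lons):
    """Return (i, j) indices of the grid-cell centre closest to a station.

    grid_lats and grid_lons hold the latitudes and longitudes of the cell
    centres of a regular grid (hypothetical inputs for this sketch).
    """

    def haversine_km(lat1, lon1, lat2, lon2):
        # Great-circle distance on a sphere of radius 6371 km.
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dphi = p2 - p1
        dlmb = math.radians(lon2 - lon1)
        a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
        return 2 * 6371.0 * math.asin(math.sqrt(a))

    # Evaluate every cell centre and keep the one with the smallest distance.
    _, i, j = min(
        (haversine_km(station_lat, station_lon, lat, lon), i, j)
        for i, lat in enumerate(grid_lats)
        for j, lon in enumerate(grid_lons)
    )
    return i, j
```

For example, with illustrative cell centres at latitudes 47.55, 47.65, 47.75 and longitudes 23.35, 23.45, 23.55, a station placed at (47.66, 23.49) is assigned to the central cell, and the daily series of that cell would then be paired with the station record.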

Descriptive Statistics
In the first category, descriptive statistics were performed for both observation and gridded data series to be able to identify the differences between them. Afterwards, Pearson's correlation coefficient and the coefficient of determination (R-squared) were calculated for the pairs consisting of observation data and each gridded data series for the three parameters considered.
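As a minimal sketch of the two indices named above, the following Python function computes Pearson's correlation coefficient for a pair of equally long series and, assuming a simple linear regression of one series on the other, the coefficient of determination as its square. The function name and the toy values in the usage note are illustrative, not data from the study.

```python
import math


def pearson_and_r2(obs, grid):
    """Return (r, R^2) for paired observation and gridded values.

    R^2 is taken as r**2, which holds for a simple linear regression
    with an intercept (an assumption of this sketch).
    """
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_grid = sum(grid) / n
    # Sums of squares and cross-products of the deviations from the means.
    sxy = sum((o - mean_obs) * (g - mean_grid) for o, g in zip(obs, grid))
    sxx = sum((o - mean_obs) ** 2 for o in obs)
    syy = sum((g - mean_grid) ** 2 for g in grid)
    r = sxy / math.sqrt(sxx * syy)
    return r, r * r
```

A gridded series that is an exact linear function of the observations, such as `pearson_and_r2([1, 2, 3, 4], [2, 4, 6, 8])`, gives r and R² equal to 1 up to rounding; departures from 1 measure how much the gridded values scatter around the regression line.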

Comparison between the Extreme Values Derived from Gridded Datasets and from Observations Using the Complete Datasets
The second category of analysis focused on the comparison between the extreme values derived from gridded datasets and the results from observations. The extreme percentile (the 1st and the 99th) values and the extreme annual values were both considered as follows: (i) from each gridded dataset, all values equal to or lower than the 1st percentile and values equal to or greater than the 99th percentile were extracted and compared with the corresponding values derived from the observation datasets and (ii) the extreme annual values from each gridded dataset were selected and compared with the observed ones.
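Step (i) can be sketched in Python as follows. The percentile definition (linear interpolation between order statistics, via `statistics.quantiles` with `method="inclusive"`) is an assumption, since the text does not state which definition was applied; the function name `percentile_tails` is likewise illustrative.

```python
from statistics import quantiles


def percentile_tails(values, lower_pct=1, upper_pct=99):
    """Return the two percentile thresholds and the values in each tail.

    values: daily series of one parameter (TX, TN, or RR).
    """
    # quantiles(..., n=100) returns the 99 cut points P1..P99.
    cuts = quantiles(values, n=100, method="inclusive")
    p_low, p_high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    low_tail = [v for v in values if v <= p_low]
    high_tail = [v for v in values if v >= p_high]
    return p_low, p_high, low_tail, high_tail
```

Applied to the integers 0 to 100, the thresholds are 1 and 99, and the tails are [0, 1] and [99, 100]. The same call would be made separately on the observation series and on each gridded series before the tails are compared.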
The datasets obtained based on the extreme percentile values were further analysed by using Taylor diagrams and the coefficient of determination. Taylor diagrams are mathematical diagrams especially developed to graphically indicate which of the models of a system, dataset, process, or phenomenon is located closest to a reference value (observation value, in this case). This type of diagram, designed by Karl E. Taylor in 1994 and published in 2001 [31], allows for a comparative evaluation of different models. For this study, it was used to assess the degree of similarity between the values of the gridded (modelled) datasets and the reference one (observation-derived dataset) using the following three statistical indices: Pearson's correlation coefficient, root mean square error (RMSE), and the standard deviation. The Taylor diagrams provide a concise statistical summary of how well patterns match each other in terms of their correlation, their root mean square difference, and the ratio of their variances [31]. They allow simultaneous visualization of the three statistical parameters for the three gridded datasets as compared with the observation ones for each location. An R package was employed to create the diagrams.
The coefficient of determination indicates how closely the data points are scattered around the regression line, both for the percentile and for the annual extreme values series.
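The three quantities summarized by a Taylor diagram can be computed directly. The sketch below is written in Python rather than with the R package used in the study, and it assumes the centred form of the root mean square difference (means removed), which is the form that satisfies the geometric relation E'² = σ_m² + σ_r² − 2 σ_m σ_r R underlying the diagram. The function name `taylor_stats` is illustrative.

```python
import math


def taylor_stats(ref, model):
    """Return the statistics plotted on a Taylor diagram for one model series."""
    n = len(ref)
    mean_ref = sum(ref) / n
    mean_mod = sum(model) / n
    # Population standard deviations of the reference and the model series.
    std_ref = math.sqrt(sum((r - mean_ref) ** 2 for r in ref) / n)
    std_mod = math.sqrt(sum((m - mean_mod) ** 2 for m in model) / n)
    cov = sum((r - mean_ref) * (m - mean_mod) for r, m in zip(ref, model)) / n
    corr = cov / (std_ref * std_mod)
    # Centred RMS difference: the means are removed before differencing.
    crmsd = math.sqrt(
        sum(((m - mean_mod) - (r - mean_ref)) ** 2 for r, m in zip(ref, model)) / n
    )
    return {"std_ref": std_ref, "std_model": std_mod, "corr": corr, "crmsd": crmsd}
```

Calling the function once per gridded dataset, with the observation series as `ref`, gives the coordinates of the three points placed on the diagram for a given station and parameter. A gridded series that differs from the observations only by a constant offset yields a correlation of 1 and a centred difference of 0, which is a reminder that the diagram does not show mean bias.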

Comparison between Seasonal Values Derived from Gridded Datasets and from Observations
The third category of analysis involved extracting and comparing the seasonal values related to the extreme seasons, winter (December, January, and February) and summer (June, July, and August), corresponding to each dataset. The analysis of the relation between the databases according to seasons was performed by applying the following approaches: i. A comparative analysis of two distributions by Kolmogorov-Smirnov test (K-S) [32]. This analysis shows the absolute maximum distance between two distributions of the time series with observed values as compared with the values of the gridded dataset; ii. An analysis of seasonal datasets using Taylor diagrams to quantify the degree of correspondence between the modelled and the observed behaviour in terms of three statistical indicators, i.e., Pearson's correlation coefficient, RMSE, and standard deviation [30].
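The K-S statistic D described in approach (i) can be sketched in a few lines of Python. This version computes only the maximum absolute distance between the two empirical cumulative distribution functions and does not compute a p-value; the function name is illustrative.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D."""
    a_sorted = sorted(sample_a)
    b_sorted = sorted(sample_b)
    n_a, n_b = len(a_sorted), len(b_sorted)
    d = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to compare them at every distinct value of the pooled sample.
    for value in set(a_sorted) | set(b_sorted):
        cdf_a = bisect_right(a_sorted, value) / n_a
        cdf_b = bisect_right(b_sorted, value) / n_b
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

For instance, `ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])` gives 0.5, identical samples give 0, and fully separated samples give 1. In the seasonal comparison, the gridded dataset with the smallest D against the observation series has the closest distribution.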

Trend Analysis
The analysis of TX, TN, and RR temporal evolution was approached by employing a combination of the Mann-Kendall test for trends [33,34] with Sen's non-parametric method for the magnitude of the trend [35]. The combined method is widely used to detect changes in different climatic parameter datasets [36][37][38][39]. The Mann-Kendall test can be applied to detect a monotonic trend of a time data series, whereas Sen's method uses a linear model to estimate the slope of the trend, while the variance of the residuals should be constant in time [40].
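A compact Python sketch of the combined procedure is given below. It assumes an evenly spaced series (one value per year or season), the normal approximation for the Mann-Kendall statistic with the usual correction for ties, and a two-sided test; the function names are illustrative and this is not the software used in the study.

```python
import math
from statistics import median


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) of the Mann-Kendall trend test."""
    n = len(series)
    # S counts concordant minus discordant pairs over all i < j.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S, reduced for groups of tied values.
    counts = {}
    for v in series:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in counts.values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18
    # Continuity-corrected standard normal score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p_value


def sen_slope(series):
    """Sen's slope: median of all pairwise slopes, per time step."""
    n = len(series)
    slopes = [
        (series[j] - series[i]) / (j - i)
        for i in range(n - 1)
        for j in range(i + 1, n)
    ]
    return median(slopes)
```

For a strictly increasing series of ten values with a step of 0.5, S equals 45, the p-value is well below 0.05, and Sen's slope is 0.5 per time step. Applying both functions to the observation series and to each gridded series allows the sign, significance, and magnitude of the trends to be compared.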

Study Region
The focus region is small, but it has a complex topography; the selected area extends over two mountain depressions developed on opposite sides of the Țibleș Mountains. Although territorially close to each other, their general climate features are quite different due to their altitude and the barrier generated by the presence of mountains between them. The Meteorological Station of Baia Mare is located in the Baia Mare Depression, on the western side of the mountain chain (Figure 1). This area is characterized by a total annual amount of precipitation of 884 mm/year and by a mean multiannual temperature of 9.6 °C. It is widely open westward and the dominant air masses originate over the North Atlantic Ocean. The Meteorological Station of Ocna Șugatag is located in the Maramureș Depression, East of the Țibleș Mountains (Figure 1). The recorded annual precipitation and mean multiannual temperature are considerably lower than those of Baia Mare (746 mm and 6.8 °C).

Analysis of the Historical Extreme Temperatures and Precipitation
The first analysis was performed considering the historical TX and TN values, as well as the highest daily amount of precipitation recorded at the Ocna Șugatag weather station. For the spatial distribution of TX, some anomalies can be observed within the ROCADA and CARPATCLIM gridded datasets (Figure 3). Large temperature differences were detected from one cell to another, and such steep differences are unusual. Because the ROCADA and CARPATCLIM datasets used the same methods for data homogenization (MASH) and interpolation (MISH) [24,26-28], and both datasets show the same distribution anomalies, we assume that the problem lies in the procedure used for spatial interpolation.
The MASH method broadly assumes that breakpoints and possible changes can be detected and adjusted by mutual comparisons in climate data series. It is based on a series of tests and statistical analyses, described in detail by Szentimrey [24,26,28]. The MISH interpolation method employs homogenized data series (checked by using the MASH method). On the basis of these homogeneous time series, MISH calculates the optimal interpolation parameters depending on the climatic parameter. Therefore, during the data homogenization and interpolation processes, the most important phase is the homogenization procedure along with the number of evaluated climatic parameters and the length of the time series [23,25].
Unlike the ROCADA and CARPATCLIM gridded datasets, to obtain the E-OBS daily gridded dataset, the ordinary Kriging interpolation method was applied [22], which better captured the influence of topography on the analysed climatic parameters.
The E-OBS datasets show a much more accurate spatial distribution for the temperature parameters. However, as compared with extreme temperature series, the precipitation in E-OBS has a different spatial distribution, most likely induced by the much lower density of rainfall stations used in the construction of the gridded dataset (Figure 3).

Descriptive Statistics
In this section, the values derived from all gridded datasets for both locations were compared with the observation values through descriptive statistics. The entire data series for each variable were employed to derive the descriptive statistics features (Tables 2 and 3). The comparison does not clearly indicate the optimal dataset to be used for further analysis: the mean values are, in general, closer to the observed values in the ROCADA gridded dataset, while the statistical indices describing the shape of the frequency distribution show the E-OBS gridded dataset as providing the closest values to the observations (Tables 2 and 3).

Correlation and Determination Analysis of Datasets
Following the methodology described, we performed a specific analysis for all datasets. A very good correlation (higher than 0.95) was obtained for all databases. Pearson's correlation coefficient indicates different situations for the two meteorological stations. The comparison between the three gridded datasets and the observation values at the Baia Mare meteorological station revealed the best results for the ROCADA dataset, whereas for the values observed at the Ocna Șugatag meteorological station, the highest correlation coefficient was obtained for the E-OBS dataset (Table 4). The highest values were obtained for TN (0.989-0.998) and the lowest for RR (0.888-0.954). Given the high spatial variability of precipitation, especially in complex topography regions, the correlation for RR can also be considered acceptable (Table 4).
In terms of the coefficient of determination (R-squared), the results are slightly different for the two locations. Thus, for the Baia Mare weather station, the most accurate results for all parameters were obtained for the ROCADA datasets, while for the Ocna Șugatag weather station, the best fit for extreme temperatures was provided by the E-OBS datasets, and for RR by the ROCADA output series.
As in the case of Pearson's correlation coefficient, it was more difficult to capture precipitation in the models due to its greater spatial variability. Nevertheless, when all datasets were considered, quite good values were returned, especially for the ROCADA gridded dataset.

Taylor Diagrams-Based Analysis
The extreme values of the considered datasets, historical values, and percentile-derived values were further analysed by using Taylor diagrams.
The results are quite different compared to those obtained when full strings were considered. For the extreme percentile values, in the case of TX and TN, the best fit was indicated by the E-OBS database for the Ocna Șugatag weather station (Figure 4), whereas for the Baia Mare weather station, the relationship between the ROCADA and the observed values pairs was found to be the closest for TX and RR (Figure 5). Using Taylor charts for historical extreme values allowed us to identify the position of gridded data as compared with the reference data. For the 99th percentile (for maximum values), the ROCADA database is generally in a better (closer) position, and for the 1st percentile (for minimum values) the data from E-OBS are more accurate. This was surprising, because to obtain the E-OBS gridded dataset, a lower density of weather stations was used than in the case of the ROCADA and CARPATCLIM databases. It was expected that the ROCADA database, which was developed based on the highest density of weather stations, would better capture the extreme values.

Linear Regression for the 1st and 99th Percentiles
By employing the linear regression method for the 1st and 99th percentiles series, a similar data layout was obtained as compared with the observation values.
Considering the extreme maximum values (the 99th percentile), the results are different for the two selected locations. For the Baia Mare weather station, the ROCADA database returned the best values for all datasets (Table 5), whereas, for the Ocna Șugatag weather station, the E-OBS datasets had a better determination coefficient both for the TX and TN series, whilst the ROCADA dataset was the best for RR. In the case of RR, the larger scattering of the points may be induced by the frequent values of 0.0 mm given by the E-OBS dataset when precipitation was actually recorded at the Ocna Șugatag meteorological station for the same days. This could be a consequence of the lower spatial coverage with direct observation stations available for E-OBS interpolation as compared with the much higher density of measurement sites used by the ROCADA database, which returned the best results. The least favourable results were generated by the CARPATCLIM database in terms of extreme temperatures and by the E-OBS database for precipitation (Table 5). Moreover, the historical minimum values of precipitation from E-OBS are closer to the observed values than those from the other analysed gridded datasets, even though the coefficient of determination value is quite low. As explained above, this low value might be due to the spatial variability of this parameter and to the lower density of stations included in the interpolation used to obtain this dataset.
For the 1st percentile, a better coefficient of determination in relation to the observed values was obtained for E-OBS, both in the case of TX and TN at the Ocna Șugatag weather station and for TN at the Baia Mare weather station, while for the TX datasets at the Baia Mare weather station, the highest coefficient was obtained between the ROCADA dataset and the observation data (Table 6). As the 1st percentile for RR is equal to 0 for all datasets considered, we did not calculate the correlation coefficient.

Analysis of the Extreme Annual Values
The highest and the lowest annual values of each parameter (TX, TN, and RR) were extracted for each year of the considered period, resulting in 50 values for each dataset. A comparative analysis for the new datasets was performed to identify the best fit between the observation and the derived gridded datasets.
In general, the E-OBS database returned the best results. However, larger differences could be identified, especially after 2000 (Figure 6). The same result was confirmed when using Taylor diagrams (Figure 7).

Analysis of the Annual Seasonal Values
The extreme seasons' series were analysed by using three methods: K-S method, Taylor diagrams, and MK trend detection test combined with Sen's slope for trend magnitude.

Kolmogorov-Smirnov Test (K-S) Distribution Analysis
When analysing the winter TX, TN, and RR datasets by employing the K-S method, the results indicated a cumulative distribution closer to the observation series for the E-OBS gridded dataset in almost all cases (except for TN at Ocna Șugatag weather station). However, the ROCADA output revealed quite small differences compared to the observation time series (Table 7). For summer, the analysis of the same parameters showed a slightly different situation compared to winter: for the TX and TN, the ROCADA gridded dataset presents the lowest values of D, while for the RR series the E-OBS gridded dataset indicates a distribution closer to the observation-derived values (Table 7).

Taylor Diagram Analysis
A Taylor diagram-based method was also employed for the seasonal datasets. The analysis revealed a slightly more complicated situation than that obtained for the complete datasets: for the Baia Mare weather station, the statistical indicators derived from the ROCADA datasets, both for winter (Figure 8) and summer (Figure 9), had the best correlation with those of the observation dataset, whereas for the Ocna Șugatag weather station the best correlation was found for those derived from the E-OBS database.
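A Taylor diagram summarizes three statistics for each candidate series against a reference: the standard deviation, the Pearson correlation coefficient, and the centred root-mean-square difference. The sketch below computes them with population (1/n) moments, under the assumption that both series are aligned and free of missing values. The function name `taylor_stats` and the returned keys are hypothetical.

```python
import math


def taylor_stats(reference, candidate):
    """Return the statistics plotted on a Taylor diagram for one candidate series."""
    n = len(reference)
    if n != len(candidate) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_r = sum(reference) / n
    mean_c = sum(candidate) / n
    dev_r = [x - mean_r for x in reference]  # anomalies of the reference
    dev_c = [x - mean_c for x in candidate]  # anomalies of the candidate
    std_r = math.sqrt(sum(d * d for d in dev_r) / n)
    std_c = math.sqrt(sum(d * d for d in dev_c) / n)
    if std_r == 0 or std_c == 0:
        raise ValueError("correlation is undefined for a constant series")
    corr = sum(r * c for r, c in zip(dev_r, dev_c)) / (n * std_r * std_c)
    # Centred RMS difference: the means are removed before differencing.
    crmsd = math.sqrt(sum((c - r) ** 2 for r, c in zip(dev_r, dev_c)) / n)
    return {"std_ref": std_r, "std": std_c, "corr": corr, "crmsd": crmsd}
```

The three quantities are linked by crmsd² = std_ref² + std² − 2 · std_ref · std · corr, which is what allows a single point on the diagram to represent all of them.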

Trend Detection
The third approach for seasonal data series consisted of trend detection. The results of the MK test for all analysed datasets for the extreme seasons are presented in Tables 8 and 9. For summer, in terms of slope, the best results for the temperature series were returned by the E-OBS datasets, while for precipitation the ROCADA series indicated the slope values closest to those obtained for the observation series. In terms of statistical significance, ROCADA and E-OBS indicated similar results for temperature for the Baia Mare weather station, and ROCADA and CarpatClim for the Ocna Șugatag weather station; for precipitation, none of the trends was found to be statistically significant (Table 8). The winter series indicated the best fit for each parameter with a different database, i.e., for TX with ROCADA and CarpatClim, for TN with E-OBS, and for RR with CarpatClim. In some cases, the trends indicated by CarpatClim or E-OBS are opposite to those detected in the observation-derived series, yet they are not statistically significant. Although a statistically significant increase was found for TX and TN at the Baia Mare weather station, none of the grid-derived series indicated a significant change. Moreover, CarpatClim indicated a decreasing trend for TX (Table 9).
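The two quantities compared in Tables 8 and 9, trend significance and trend magnitude, can be illustrated with the sketch below. It assumes the standard Mann-Kendall test with tie correction, a normal approximation and a two-sided p-value, and an evenly spaced series with one value per year; whether the authors applied further adjustments (for example, for autocorrelation) is not stated in the text. The function names are hypothetical.

```python
import math
from collections import Counter
from statistics import median


def mann_kendall(series):
    """Return (S, Z, two-sided p-value) of the Mann-Kendall trend test."""
    n = len(series)
    if n < 3:
        raise ValueError("at least 3 values are required")
    # S counts increasing pairs minus decreasing pairs over all i < j.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S, corrected for groups of tied values.
    ties = Counter(series).values()
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in ties)) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return s, z, p_value


def sens_slope(series):
    """Sen's slope: median of all pairwise slopes, in units per time step."""
    n = len(series)
    if n < 2:
        raise ValueError("at least 2 values are required")
    slopes = [(series[j] - series[i]) / (j - i)
              for i in range(n - 1) for j in range(i + 1, n)]
    return median(slopes)
```

A positive Z with a p-value below the chosen significance level indicates a significant increasing trend, while the sign and size of Sen's slope give its direction and magnitude. This is how a gridded series can show a trend opposite to the observed one and still not be statistically significant.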

Conclusions
Gridded datasets are an important source of data for spatial analysis, as they can be used as climatic input parameters for different hydrological or agrometeorological models. The main conclusion of the present study is that gridded datasets available for complex topography regions should be used with caution, especially when the aim is to illustrate extreme phenomena. Since differences among gridded datasets are sometimes quite significant, the choice of one or another for further research should be based on a prior check of their consistency with observational data, which is almost a prerequisite to ensure their suitability.
In general, the ROCADA dataset indicated the best fit with the raw observation values across all climatic parameters considered, most likely as a consequence of the highest density of meteorological stations providing the observational data employed for developing the gridded data. The programs used to homogenize the data series compared all data series according to the location of the meteorological stations and filled data gaps by advanced statistical methods, thus easing further analysis for researchers.
The CARPATCLIM datasets had the lowest correlation coefficient values among all analysed datasets and for all parameters. Although the same homogenization and spatial interpolation methods and software were used as those employed in creating the ROCADA series (MISH and MASH), the point values showed a fairly wide spread, primarily due to the much lower number of weather stations than that used for the ROCADA dataset, and, secondarily, because of the reduced density of measuring stations used for the analysed area, which could not adequately capture the influence of topography on the climatic parameters.
In the case of long-term data series, the E-OBS-derived daily gridded datasets show values rather close to, but not better than, those provided by ROCADA. When the extreme annual and historical values were considered, the series provided by E-OBS fit best to those observed. Thus, among all databases considered for this study, E-OBS best captures the real (observed) values in terms of extreme values. For trend analysis or for studies based on mean values in North-western Romania, the ROCADA dataset is the most appropriate to be employed.
In general, such assessments of datasets are performed for large geographic regions, in which case many samples are considered for testing and validating one or another climate parameter. However, such analyses may sometimes prove too cumbersome and often inaccessible due to the large volume of data required, or due to the small size of the geographical area selected for such an analysis. The present study aimed to reveal an alternative evaluation procedure for the existing gridded datasets, organized in four steps; researchers could then identify, from all available gridded datasets, the one that is closest to the observation datasets.
The results of this research show the importance of validating gridded datasets before using them in various research activities. The value and importance of gridded datasets in research cannot be questioned, but their analysis is clearly needed depending on the region they are used for. As a future perspective, we aim to develop this evaluation procedure so that it takes into account the quality assessment of datasets in other types of topography regions (e.g., plains, depressions, wetlands, high mountain areas). Moreover, we intend to increase the number of observation stations to extensively test new analysis methods, such as the network analysis method proposed by Wang and Wang [41].