A Long-Term, 1-km Resolution Daily Meteorological Dataset for Modeling and Mapping Permafrost in Canada

: Climate warming is causing permafrost thaw and there is an urgent need to understand the spatial distribution of permafrost and its potential changes with climate. This study developed a long-term (1901–2100), 1-km resolution daily meteorological dataset (Met1km) for modeling and mapping permafrost at high spatial resolutions in Canada. Met1km includes eight climate variables (daily minimum, maximum, and mean air temperatures, precipitation, vapor pressure, wind speed, solar radiation, and downward longwave radiation) and is suitable to drive process-based permafrost and other land-surface models. Met1km was developed based on four coarser gridded meteorological datasets for the historical period. Future values were developed using the output of a new Canadian regional climate model under medium-low and high emission scenarios. These datasets were downscaled to 1-km resolution using the re-baselining method based on the WorldClim2 dataset as spatial templates. We assessed Met1km by comparing it to climate station observations across Canada and a gridded monthly anomaly time-series dataset. The accuracy of Met1km is similar to or better than the four coarser gridded datasets. The errors in long-term averages and average seasonal patterns are small. The error occurs mainly in day-to-day ﬂuctuations, thus the error decreases signiﬁcantly when averaged over 5 to 10 days. Met1km, as a data generating system, is relatively small in data volume, ﬂexible to use, and easy to update when new or improved source datasets are available. The method can also be used to generate similar datasets for other regions, even for the entire global landmass.


Introduction
Permafrost is an important component of the global landmass, underlying 9-14% of world land surface [1]. In the Northern Hemisphere, permafrost underlies about 24% of exposed ground [2]. Climate warming is increasing permafrost temperatures [3], deepening summer thaw depths (e.g., [4]), and completely degrading permafrost in some areas (e.g., [5][6][7]). These changes have significant environmental and socioeconomic impacts from the local to global scales, including active-layer detachments and thaw slumps, ground subsidence, damage to infrastructure, changes in hydrology, ecosystems, animal habitats, and impacts on the global climate system (e.g., [8][9][10]). Therefore, there is an urgent need to improve our knowledge about the spatial distribution of permafrost and its potential changes with climate warming in the future.

Development of Met1km for Future Climate Change Scenarios
Future climate change scenarios were derived from the output of a newly developed Canadian regional climate model, CanRCM4, for North America [42]. GCMs are the primary tools used to project future climate changes. Although the state-of-the-art GCMs can reasonably simulate the climate at large scales [45], climate change impact assessments often need to downscale the GCM outputs to regional and local scales. A commonly used method is dynamical downscaling based on regional climate models driven by GCM outputs for a relatively small domain but with fine resolutions (e.g., [46]). This method has the advantages of keeping the fine-resolution output consistent with the output of GCMs at large scales and better accounting of regional and local forcings (e.g., topography, land use, and land cover) than GCMs. In this study, we used the output of CanRCM4 for North America under RCP 4.5 and 8.5 [42] as they have been frequently used in climate change impact assessments (e.g., [47][48][49]).
Climate model outputs often have some systematic biases when compared with observations. Such biases are related to imperfect model conceptualization, parameterization, and spatial averaging within grid cells. Therefore, bias correction has become a common practice to match modeled gridded data to station observations [50,51]. In this study, we conducted a bias correction using a quantile mapping approach that has been applied in previous studies [47][48][49]. The approach includes three steps for each grid: a) Determining the "true" cumulative probability distribution function (CDF) and the CDF from the CanRCM4 output under the current climate, b) correcting the CanRCM4 output to the "true" probability distribution under the current climate, and c) correcting the CanRCM4 output for the future period. The bias-corrected current climate data were used to calculate the CanRCM4 baseline, which was used to downscale the future scenarios to 1-km resolution.
The CanRCM4 output for the current climate was corrected using the following equation where x c-corrected is the bias-corrected data under the current climate. F true −1 is the inverse function of the "true" CDF for the current climate, and F c (x c ) is the CDF of the model output for the current climate. The probability distributions F true (x c ) and F c (x c ) can be estimated theoretically or empirically. We estimated the distributions empirically to avoid complications of calibrating an appropriate probability distribution, since even daily minimum and maximum air temperatures do not necessarily follow a normal distribution [52]. F c (x c ) was determined using the CanRCM4 output for the current climate. We estimated the "true" CDF for the current climate using the Princeton dataset because its spatial resolution (0.25 • latitude/longitude) is similar to that of the CanRCM4 output (0.22 • latitude/longitude). Empirical distributions for each variable were estimated month by month to account for seasonal variations. The CanRCM4 output for the future period was corrected using the following equation where x f-corrected is bias-corrected data for the future period. F f (x f ) is the CDF determined using the CanRCM4 output for the future (x f ). It is important to incorporate potential changes in the probability distribution of a climate variable when the bias correction is applied to a future period [53]. We adopted the equidistant CDF matching method [53], which assumes that the difference between the quantiles estimated from the model output and observed values during the current climate would also be applicable to the future period.

Spatial Downscaling
The gridded meteorological datasets used in this study are at different spatial resolutions. We downscaled them to a resolution of 30 arc seconds latitude/longitude (about 1-km) based on the WorldClim2 dataset [29] using the re-baselining method [54]. We selected WorldClim2 dataset because it has a fine spatial resolution and includes all the climate variables required except downward longwave radiation. The Princeton and CRU JRA climate datasets and the future climate scenarios all include downward longwave radiation. We therefore estimated monthly mean downward longwave radiation using the other variables in the WorldClim2 dataset (Appendix B).
The re-baselining method was developed by Way and Bonnaventure [54] to fill missing observations using gridded regional climate anomalies. The method is based on the commonly observed phenomenon that climate anomalies at the regional scale typically co-vary, whereas long-term averages at different sites can be different due to local topography and other site conditions. Therefore, they estimated air temperature at a site by combining the long-term averages at the site (observations are available but there are many gaps) and the anomalies from gridded climate reanalysis datasets [54]. They tested the method for monthly air temperatures at 53 climate stations in northeastern Canada. We used this method to downscale all the daily climate variables to 1-km resolution using WorldClim2 as the baseline. This method can easily integrate datasets of various spatial resolutions without affecting their temporal variations and trends. We assessed this downscaling method across Canada for air temperature and precipitation based on climate station observations (Section 4.2).
The re-baselining method is simple to compute. For daily minimum and maximum air temperatures, we used the differences to re-baseline the coarse gridded daily meteorological datasets to 1-km resolution where T (y,m,d) is the downscaled daily air temperature (minimum or maximum) for year y, month m, and day d for a 1-km resolution grid. T(y,m,d) is the daily air temperature (minimum or maximum) in a coarse gridded dataset (i.e., the CRU JRA, Princeton, NRCANmet, or PNWNAmet datasets) on that day. T wc (m) is the monthly average air temperature (minimum or maximum) for month m from the WorldClim2 dataset, which is the average from 1970 to 2000. T a (m) is the monthly average air temperature (minimum or maximum) from 1970 to 2000 for the month m calculated using the coarse gridded dataset. D(m) is the difference between T wc (m) and T a (m). Daily mean air temperature is calculated as the average of daily minimum and maximum air temperatures. For the other climate variables (daily precipitation, vapor pressure, wind speed, solar radiation, and downward longwave radiation), we used ratios to re-baseline the coarse gridded daily datasets to 1-km resolution where V is a daily value of a climate variable (daily precipitation, vapor pressure, wind speed, solar radiation, and downward longwave radiation) from a coarse gridded dataset. V wc (m) is the monthly average for month m from the WorldClim2 dataset, and V a (m) is the monthly average from 1970 to 2000 for the month m calculated using the coarse gridded dataset. r(m) is the ratio between V wc (m) and V a (m). We used ratios instead of difference for these variables to avoid generating negative values, and to avoid the effects of dry days for precipitation. The downscaled daily solar radiation has a smaller range in their day-to-day fluctuations comparing to the CWEEDS [44]. We adjusted this difference using the following method where S is the adjusted daily solar radiation, and S is the downscaled daily solar radiation from the Equation (5). S avg9 is nine-day moving average of the solar radiation calculated by Equation (5) around the day for the adjustment. Their units are in KJ/m 2 /day. a and b are parameters, determined as 1.0 and 0.4 by comparing the adjusted and the CWEEDS daily solar radiation for different stations across Canada. Essentially, Equation (7) doubles the fluctuation range, but the adjusted value should not be less than 40% of the nine-day average.

Statsitical Measures for Accuracy Assessment
We used Pearson's correlation coefficient (R) and mean absolute error (MAE) to assess the accuracy of Met1km by comparing its values with observations.
where R is Pearson's correlation coefficient, and MAE is mean absolute error. x i and x i are Met1km and observed values for a day i, respectively. x m and x m are the averages of Met1km and observed values, respectively. n is the total number of days in which both Met1km and observed daily data are available.
To further analyze sources of the errors, we partitioned the time series of a climate variable into three components: The long-term average, the long-term average seasonal pattern, and the deviation from the long-term average seasonal pattern.
Atmosphere 2020, 11, 1363 where x a is the long-term average of a climate variable, and x a,doy is the long-term average for a day of year doy (which does not change from year to year). dx 1,doy is the difference between x a,doy and x a , and dx 2,i is the difference between x i and x a,doy Thus, the MAE can be expressed as the following For precipitation, we used a ratio to calculate dx 2,i to avoid the effects of days without precipitation where r i is the ratio of precipitation on day i to the long-term average of that day of year (x a,doy ). Thus, the third term in Equation (13) for precipitation can be expressed as the following where r i is the ratio of precipitation on day i to the long-term average on that doy for the observed daily precipitation. Equation (13) indicates that the overall MAE can be divided into three components: The error in long-term average (the first term), the error in long-term average seasonal pattern (the second term), and the error in deviations from the average seasonal pattern (the third term), which reflects the error in day-to-day fluctuations. This partitioning is important for permafrost modeling because permafrost is sensitive to the error in long-term averages, somewhat sensitive to the error in seasonal patterns, but not sensitive to short-term fluctuations (e.g., [34]).

Result and Analysis
In this section, we first described the Met1km dataset and showed some examples of the data (Section 4.1). Then, we tested the re-baselining method using climate station observations for air temperature and precipitation (Section 4.2). After that, we assessed the accuracy of Met1km in four sections. Section 4.3 describes detailed comparisons of Met1km with observations of one climate station for all the climate variables except downward longwave radiation. Section 4.4 provides comparisons of Met1km with all the climate stations across Canada for air temperature and precipitation. Section 4.5 describes comparisons of Met1km with CANGRD, a 50-km resolution gridded monthly anomaly time series for air temperature and precipitation. Finally, we compared the accuracy of Met1km with the accuracies of the source datasets and the effects of spatial downscaling (Section 4.6).

Met1km Format and General Temporal and Spatial Patterns of the Data
Met1km includes re-organized input source datasets and two scripts to generate spatial data and time series based on the input datasets. The input source datasets, including the Princeton dataset, CRU JRA, NRCANmet, PNWNAmet, WorldClim2, and two future scenarios, were re-organized and saved in binary format for easy and efficient processing. One script generates 1-km resolution spatial data for any climate variable for any time period. Another script generates daily time series from 1901 to 2100 for all the climate variables for any user-defined location (1-km grid) in the domain of Canadian landmass. They were developed using Microsoft Visual C++ and can process the data very efficiently. With a desktop computer, generating a time-series dataset for one 1-km grid takes about one second, and generating a spatial dataset for a climate variable takes 4-5 min. The data volume of Atmosphere 2020, 11, 1363 9 of 27 the re-organized input datasets and the scripts is about 134 gigabytes, while the total data volume of all the 1-km grids in Canada is 82 terabytes if each value is saved in 4 bytes. Thus, Met1km is relatively small comparing to the total volume of the generated data, flexible to use, and can be updated easily by replacing the source datasets and modifying the scripts. Figure 1 shows an example of air temperature and precipitation time series generated by Met1km for a grid at Yellowknife (62.4540 • N, 114.3718 • W), depicting both long-term increases and inter-annual fluctuations. From the 1900s (1901)(1902)(1903)(1904)(1905)(1906)(1907)(1908)(1909)(1910) to the 2000s (2001-2010), air temperature increased 2.0 • C and precipitation increased 11.4%. From the 2000s to the 2090s (2091-2100), air temperature is predicted to increase by 4.0 and 7.8 • C under the RCP 4.5 and 8.5 scenarios, respectively. Precipitation is predicted to increase by 9.8% and 36.2%, respectively, under these two scenarios. Daily minimum and maximum air temperatures show similar patterns as that of daily mean air temperature, but the increase in daily minimum air temperature is about 1 • C more than that of the daily maximum air temperature. Vapor pressure and downward longwave radiation also show similar increasing trends as air temperature. Long-term changes are small for wind speed and solar radiation.  Figure 2a shows the spatial distribution of mean air temperature in the 2000s. Mean air temperature ranges from −25 to 11 °C from the high Arctic to southern Canada. Mean annual air temperature can differ by several degrees in short distances due to variations in topography, especially in British Columbia and high arctic regions. This indicates that it is necessary to develop meteorological datasets at high spatial resolutions. From the 1900s to the 2000s, air temperature  Figure 2a shows the spatial distribution of mean air temperature in the 2000s. Mean air temperature ranges from −25 to 11 • C from the high Arctic to southern Canada. Mean annual air temperature can differ by several degrees in short distances due to variations in topography, especially in British Columbia and high arctic regions. This indicates that it is necessary to develop meteorological datasets at high spatial resolutions. From the 1900s to the 2000s, air temperature increased 1 to 2 • C in southern Canada and 2 to 3 • C in most of northern Canada (Figure 2b). Under the RCP 4.5 climate change scenario, air temperature is predicted to increase by 3 to 4 • C in southern Canada and by 4 to 7 • C in most of northern Canada (Figure 2c). Under the RCP 8.5 scenario, the air temperature is predicted to increase by 5 to 8 • C in most of southern Canada and by 7 to 11 • C in northern Canada, especially in the northwestern Arctic (Figure 2d). These spatial patterns and the magnitudes of future changes are similar to that of the multi-model ensembles from the fifth phase of the Coupled Model Intercomparison Project (CMIP5) for Canadian domain [55].

Testing the Re-Baselining Spatial Downscaling Method
We tested the re-baselining method using the second generation homogenized daily air temperature and precipitation data in Canada [43]. The dataset includes 338 and 463 stations across Canada for air temperature and precipitation, respectively. To test whether a climate variable co-varies at two sites, we calculated R between each pair of climate stations with distance less than 1000 km. To avoid the effects of seasonality, we calculated R using the deviations (using difference for air temperature and ratio for precipitation) from the long-term seasonal patterns, which were linearly interpolated from the long-term monthly averages. We used data from 1945 so that the number of years of the data used for the calculation

Testing the Re-Baselining Spatial Downscaling Method
We tested the re-baselining method using the second generation homogenized daily air temperature and precipitation data in Canada [43]. The dataset includes 338 and 463 stations across Canada for air temperature and precipitation, respectively. To test whether a climate variable co-varies at two sites, we calculated R between each pair of climate stations with distance less than 1000 km. To avoid the effects of seasonality, we calculated R using the deviations (using difference for air temperature and ratio for precipitation) from the long-term seasonal patterns, which were linearly interpolated from the long-term monthly averages. We used data from 1945 so that the number of years of the data used for the calculation was similar across the country. The calculated R for air temperature is generally higher than that of precipitation (Figure 3a,b). R decreases with distance between the stations for both air temperature and precipitation, with a near-linear decrease with distance for air temperature but a more rapid initial decrease for precipitation.
Atmosphere 2020, 11, x FOR PEER REVIEW 11 of 28 Figure 3. Correlation coefficients (R) between two climate stations for (a) daily mean air temperature and (b) daily precipitation and mean absolute differences between two climate stations for (c) daily mean air temperature and (d) daily precipitation. They were calculated using the deviations (differences for air temperature and ratios for precipitation) from the long-term seasonal patterns, which were linearly interpolated from the long-term monthly averages.
To test whether the magnitudes of co-variation are similar at two sites, we calculated the mean absolute difference between each pair of stations using the daily deviations from the long-term average seasonal patterns (using differences for air temperature and ratios for precipitation). The mean absolute difference is the third term in Equation (13), and it is also the MAE of the re-baselining method when using the deviations of one station to estimate another station. The mean absolute differences increase with distance for both air temperature and precipitation, with a near-linear increase with distance for air temperature but a more rapid initial increase for precipitation. These patterns somewhat mirror that of the R (Figure 3c,d). These results indicate that a climate variable (daily air temperature or precipitation) co-varies, and the magnitudes of variations are similar at two stations when their distance is not too far. The co-variation is more consistent for air temperature than for precipitation.
For each pair of stations, we also calculated the R and mean absolute differences using running averages over some days ( Figure 4). Each curve in Figure 4 represents the median values of all the pair of stations within a certain range of distances. The median R increases quickly with the duration . Correlation coefficients (R) between two climate stations for (a) daily mean air temperature and (b) daily precipitation and mean absolute differences between two climate stations for (c) daily mean air temperature and (d) daily precipitation. They were calculated using the deviations (differences for air temperature and ratios for precipitation) from the long-term seasonal patterns, which were linearly interpolated from the long-term monthly averages.
To test whether the magnitudes of co-variation are similar at two sites, we calculated the mean absolute difference between each pair of stations using the daily deviations from the long-term average seasonal patterns (using differences for air temperature and ratios for precipitation). The mean absolute difference is the third term in Equation (13), and it is also the MAE of the re-baselining method when using the deviations of one station to estimate another station. The mean absolute differences increase with distance for both air temperature and precipitation, with a near-linear increase with distance for air temperature but a more rapid initial increase for precipitation. These patterns somewhat mirror that of the R (Figure 3c,d). These results indicate that a climate variable (daily air temperature or precipitation) Atmosphere 2020, 11, 1363 12 of 27 co-varies, and the magnitudes of variations are similar at two stations when their distance is not too far. The co-variation is more consistent for air temperature than for precipitation.
For each pair of stations, we also calculated the R and mean absolute differences using running averages over some days ( Figure 4). Each curve in Figure 4 represents the median values of all the pair of stations within a certain range of distances. The median R increases quickly with the duration of the running average from 1 day to about 5-10 days. The increase of R becomes very small when the duration of the running average is longer than 10 days (Figure 4b). Similarly, the median mean absolute difference decreases quickly with the duration of the running average from 1 day to about 5-10 days, then the decrease becomes smaller, especially when two stations are relatively close. This is important because the original re-baselining method was developed and tested based on monthly air temperature [54]. Our tests show that the errors for daily air temperature and precipitation are about 2.5 and 2.9-3.5 times, respectively, of the errors when the re-baselining is calculated using monthly means. The errors decreased rapidly when averaged for some days. The errors for 5-to 10-day averages are not significantly higher than the error calculated using monthly averages. For stations with distances less than 50 km, the median error for 10-day average air temperature is 0.6 • C. For precipitation, the median errors for 10-day averages are 31% and 37% of the long-term averages for distances <25 km and 25-50 km, respectively. Figures 3 and 4 also show that the spatial downscaling errors are smaller when stations are more densely distributed (or the gridded dataset has a higher spatial resolution). Therefore, we generally chose source datasets with high spatial resolutions.
Atmosphere 2020, 11, x FOR PEER REVIEW 12 of 28 monthly means. The errors decreased rapidly when averaged for some days. The errors for 5-to 10day averages are not significantly higher than the error calculated using monthly averages. For stations with distances less than 50 km, the median error for 10-day average air temperature is 0.6 °C. For precipitation, the median errors for 10-day averages are 31% and 37% of the long-term averages for distances <25 km and 25-50 km, respectively. Figures 3 and 4 also show that the spatial downscaling errors are smaller when stations are more densely distributed (or the gridded dataset has a higher spatial resolution). Therefore, we generally chose source datasets with high spatial resolutions. They were calculated using the deviations (differences for air temperature and ratios for precipitation) from the long-term daily averages, which were interpolated from the long-term monthly averages. Each curve represents the median values calculated from all the stations within a certain distance range.

Comparing Met1km with Climate Station Observations
We compared Met1km with observations at some climate stations across Canada. Although  They were calculated using the deviations (differences for air temperature and ratios for precipitation) from the long-term daily averages, which were interpolated from the long-term monthly averages. Each curve represents the median values calculated from all the stations within a certain distance range.

Comparing Met1km with Climate Station Observations
We compared Met1km with observations at some climate stations across Canada. Although most of the station observations probably were used to develop the gridded source datasets, such comparisons are useful to detect any technical errors and to assess the accuracy of the re-baselining downscaling method. Station data of daily minimum and maximum air temperatures, vapor pressure, wind speed, and solar radiation were derived from hourly data of CWEEDS [44]. Daily station precipitation data were from Environment and Climate Change Canada. Figure 5 shows some examples of comparison between Met1km and observed climate variables at the Yellowknife airport climate station for the year 1953. Met1km daily values are very close to the observations. Figure 6a shows the R between Met1km and observed daily values from 1948 to 2016 for precipitation and from 1953 to 2005 for other variables. The R is very high for daily minimum, maximum, and mean air temperatures, vapor pressure, and solar radiation. The R is lower for precipitation and wind speed. Since strong seasonal variation could contribute to the high correlations, we also calculated the R after excluding the long-term average seasonal patterns. The long-term average seasonal patterns were calculated by linearly interpolating the long-term monthly averages. The R is still very high for daily minimum, maximum, and mean air temperatures (grey bars in Figure 6a). The R becomes lower for vapor pressure and solar radiation, indicating that Met1km captured their seasonal patterns but the day-to-day fluctuation was not well captured as for air temperatures. The R calculated from the original data and from the deviations are very similar for precipitation and wind speed.
Atmosphere 2020, 11, x FOR PEER REVIEW 13 of 28 precipitation and from 1953 to 2005 for other variables. The R is very high for daily minimum, maximum, and mean air temperatures, vapor pressure, and solar radiation. The R is lower for precipitation and wind speed. Since strong seasonal variation could contribute to the high correlations, we also calculated the R after excluding the long-term average seasonal patterns. The long-term average seasonal patterns were calculated by linearly interpolating the long-term monthly averages. The R is still very high for daily minimum, maximum, and mean air temperatures (grey bars in Figure 6a). The R becomes lower for vapor pressure and solar radiation, indicating that Met1km captured their seasonal patterns but the day-to-day fluctuation was not well captured as for air temperatures. The R calculated from the original data and from the deviations are very similar for precipitation and wind speed.  Figure 6b shows the overall error and the three error components for each climate variable for the grid covering the Yellowknife airport climate station. The total error for daily mean air temperature is smaller than that of the daily minimum and maximum air temperatures. The error in daily deviations from the long-term average seasonal patterns are the major sources of the error for all the variables, especially for precipitation, vapor pressure, wind speed, and solar radiation. The errors in long-term averages are very small for daily mean air temperature, precipitation, and vapor pressure. For other variables, the errors in long-term averages are similar to that of the long-term average seasonal patterns. Although the errors in long-term averages are relatively large for daily minimum and maximum air temperatures, the error in long-term average is small for daily mean air temperature. Figure 7 shows the changes in errors with the duration of running average for the Yellowknife airport climate station. The MAE decreased quickly with the duration of running average. When averaged over 10 days, the MAE is about 50 to 60% of the daily errors except daily maximum air temperature, which is about 70% of the daily error. The reduction of MAE is small when the duration of running average is longer than 10 days (Figure 7a). For the deviations from the long-term average seasonal patterns, the reduction of MAE with duration of the running average is even more significant (Figure 7b). In addition, the probability distributions of Met1km are very similar to that of the observations for all the climate variables ( Figure 8).   Figure 6b shows the overall error and the three error components for each climate variable for the grid covering the Yellowknife airport climate station. The total error for daily mean air temperature is smaller than that of the daily minimum and maximum air temperatures. The error in daily deviations from the long-term average seasonal patterns are the major sources of the error for all the variables, especially for precipitation, vapor pressure, wind speed, and solar radiation. The errors in long-term averages are very small for daily mean air temperature, precipitation, and vapor pressure. For other variables, the errors in long-term averages are similar to that of the long-term average seasonal patterns. Although the errors in long-term averages are relatively large for daily minimum and maximum air temperatures, the error in long-term average is small for daily mean air temperature. Figure 7 shows the changes in errors with the duration of running average for the Yellowknife airport climate station. The MAE decreased quickly with the duration of running average. When averaged over 10 days, the MAE is about 50 to 60% of the daily errors except daily maximum air temperature, which is about 70% of the daily error. The reduction of MAE is small when the duration of running average is longer than 10 days (Figure 7a). For the deviations from the long-term average seasonal patterns, the reduction of MAE with duration of the running average is even more significant (Figure 7b). In addition, the probability distributions of Met1km are very similar to that of the observations for all the climate variables ( Figure 8).

Comparing with the Homogenized Daily Air Temperature and Precipitation Station Data
We compared Met1km with the second-generation homogenized climate dataset [43] for each station. Because the CRU JRA dataset from 1901 to 1957 was developed by randomly selecting a year between 1958 and 1967 in JRA-55 dataset that is in alignment with the CRU monthly data, the daily

Comparing with the Homogenized Daily Air Temperature and Precipitation Station Data
We compared Met1km with the second-generation homogenized climate dataset [43] for each station. Because the CRU JRA dataset from 1901 to 1957 was developed by randomly selecting a year between 1958 and 1967 in JRA-55 dataset that is in alignment with the CRU monthly data, the daily

Met1km
Observed at Yellowknife climate station Figure 8. Comparisons of the probability distributions of Met1km data with that of the observations at the Yellowknife airport climate station. The x-axis is expressed as (a-f) deviations from or (g) ratio to the long-term average seasonal pattern, which is interpolated from long-term monthly averages. The inset in panel (g) shows the same data but in different scales. The daily data are from 1948 to 2016 for precipitation and from 1953 to 2005 for other climate variables.

Comparing with the Homogenized Daily Air Temperature and Precipitation Station Data
We compared Met1km with the second-generation homogenized climate dataset [43] for each station. Because the CRU JRA dataset from 1901 to 1957 was developed by randomly selecting a year between 1958 and 1967 in JRA-55 dataset that is in alignment with the CRU monthly data, the daily errors decreased sharply by about a half from 1 January 1958. Therefore, we did the comparisons separately for the periods from 1901 to 1944 (from 1901 to 1947 in eastern Canada (east of 101 • W)) and from 1945 to 2014 (from 1947 to 2014 in eastern Canada). Figure 9a-c shows the median MAE of all the stations for the two periods, respectively. For daily values, the median MAE from 1901 to 1944 is 3 to 5 times of that from 1945 to 2014. The MAE decreases when the climate variables are averaged for a number of days, and the decrease for precipitation is faster than for air temperatures (daily minimum, maximum, and mean), especially for the period from 1901 to 1944. When averaged for 30 days, the MAE from 1901 to 1944 is about twice of that from 1945 to 2014 (1.6, 2.0, 2.7, and 1.8 times for minimum, maximum, and mean air temperatures, and precipitation, respectively). The MAE for daily mean air temperature is smaller than that of daily minimum and maximum air temperatures, especially from 1945 to 2014. Met1km and the climate station data are highly correlated for most of the stations with a linear slope close to 1 from 1945 to 2014. This is even the case when the seasonal patterns are eliminated. However, from 1901 to 1944, the correlations between the daily values of Met1km and the climate station observations are very low after eliminating the average seasonal patterns. The correlation increases when the daily values are averaged over more days, and the correlation reaches a similar level of the correlation during 1945-2014 when the daily values are averaged over 30 days. . Figure 9a-c shows the median MAE of all the stations for the two periods, respectively. For daily values, the median MAE from 1901 to 1944 is 3 to 5 times of that from 1945 to 2014. The MAE decreases when the climate variables are averaged for a number of days, and the decrease for precipitation is faster than for air temperatures (daily minimum, maximum, and mean), especially for the period from 1901 to 1944. When averaged for 30 days, the MAE from 1901 to 1944 is about twice of that from 1945 to 2014 (1.6, 2.0, 2.7, and 1.8 times for minimum, maximum, and mean air temperatures, and precipitation, respectively). The MAE for daily mean air temperature is smaller than that of daily minimum and maximum air temperatures, especially from 1945 to 2014. Met1km and the climate station data are highly correlated for most of the stations with a linear slope close to 1 from 1945 to 2014. This is even the case when the seasonal patterns are eliminated. However, from 1901 to 1944, the correlations between the daily values of Met1km and the climate station observations are very low after eliminating the average seasonal patterns. The correlation increases when the daily values are averaged over more days, and the correlation reaches a similar level of the correlation during 1945-2014 when the daily values are averaged over 30 days.   (Figure 9b). The major error for daily mean air temperature and precipitation is in day-to-day fluctuations. For daily minimum and maximum air temperatures, the errors in average seasonal patterns and in day-to-day fluctuations are similar. The errors in day-to-day fluctuations decrease quickly when averaged for 5 to 10 days, and further reduced when averaged in 30 days (Figure 9b). From 1901 to 1944, the errors in long-term averages are very small, as are daily mean air temperature and precipitation. The errors in long-term seasonal patterns for daily minimum and maximum air temperatures are slightly larger. The major sources of the errors are from day-to-day fluctuations. The errors in day-to-day fluctuations decrease quickly and continuously when averaged up to 30 days (Figure 9d). Figure 10 shows the correlation coefficients between CANGRD and monthly anomaly calculated from Met1km. The R is very high for air temperatures (daily minimum, maximum, and mean). The R for the monthly anomalies of daily mean air temperature is slightly higher than that of daily minimum and maximum air temperatures in all the months, seasons, and annual averages. The R for precipitation is much lower than that of air temperatures and the R has a large standard deviation. Figure 11a shows the spatial distribution of R for the anomaly of annual mean air temperature. The sudden increase in R crossing latitude 60 • N is due to differences in data length (the data are from 1948 in the north and from 1901 in the south). The R is lower in British Columbia and the eastern high Arctic, probably because of the complex topography in these regions. There is no obvious difference in R between the domain of PNWNAmet and the other areas for air temperature. The R for precipitation is lower than that of air temperatures and has a wider range of variation ( Figure 10). R tends to be higher in western Canada (Figure 11b), indicating that the PNWNAmet has a higher accuracy than that of the NRCANmet dataset for precipitation. The correlation is very poor in one area in central Quebec. A close check of the data shows that the discrepancy occurs after 1992, which has an unusually large variation. The downscaled precipitation from the CRU JRA, Princeton and NRCANmet datasets all has a similar pattern for this area, probably because they used the same dataset, which is different from that deriving the CANGRD dataset. The slopes of the linear regressions are close to 1, indicating the fluctuation ranges of CANGRD and Met1km are similar. This comparison indicates that the temporal variation patterns of Met1km are similar to that of the CANGRD dataset.    Figure 12 shows the errors of Met1km and the source datasets and the effects of downscaling by comparing with the second generation homogenized daily air temperature and precipitation dataset [43]. The errors of the source datasets were calculated by directly comparing the grid values with the station data. The downscaling effects on the errors were calculated by comparing the station data with the source data downscaled to 1-km resolution using the re-baselining method. For the CRU JRA and the Princeton datasets, spatial downscaling reduced the mean error (absolute value of the error in long-term average) by about a half for daily mean, minimum, and maximum air Monthly anomalies of daily maximum air temperature Monthly anomalies of daily mean air temperature Monthly anomalies of precipitation Figure 11. Correlation coefficients between the CANGRD and the anomalies of Met1km datasets for (a) annual mean air temperature and (b) annual total precipitation. The daily data from Met1km were converted to anomalies corresponding to the CANGRD dataset. The data were from 1901 in southern Canada (south of 60 • N) and from 1948 in northern Canada. Figure 12 shows the errors of Met1km and the source datasets and the effects of downscaling by comparing with the second generation homogenized daily air temperature and precipitation dataset [43]. The errors of the source datasets were calculated by directly comparing the grid values with the station data. The downscaling effects on the errors were calculated by comparing the station data with the source data downscaled to 1-km resolution using the re-baselining method. For the CRU JRA and the Princeton datasets, spatial downscaling reduced the mean error (absolute value of the error in long-term average) by about a half for daily mean, minimum, and maximum air temperatures, and by about 30% for precipitation. Downscaling also significantly reduced the standard deviation of the mean errors for these two datasets. For daily mean, minimum, and maximum air temperatures, the mean errors of the CRU JRA dataset in 1948-2014 are slightly smaller than that in 1901-1947, and the mean errors are even smaller for the Princeton dataset. The mean errors of precipitation are similar for the CRU JRA dataset in 1901-1947 and 1948-2014, and for the Princeton dataset. For the NRCANmet and PNWNAmet datasets, the effects of downscaling on the mean errors were small and mixed. However, downscaling reduced the standard deviation of the mean errors for air temperatures, especially for the daily means.

Comparing the Accuracy of Met1km and the Source Datasets
Downscaling slightly reduced the MAE for daily air temperature and precipitation for the CRU JRA and Princeton datasets. The daily MAE from 1901 to 1947 is larger than the later period because the data for this period was estimated using an analogy method based on the JRA-55 from 1958 to 1967 and the CRU monthly data [36]. The MAE from 1901 to 1947 decreased quickly when the data were averaged for some days, and the MAE became close to that of CRU JRA in 1948-2014 when averaged for 30 days, especially for daily maximum air temperature. The MAE also reduced for other datasets when the data were averaged for some days. The MAE of the Princeton dataset is slightly smaller than that of CRU JRA in 1948-2014 for daily mean air temperature, but their MAE is similar for daily minimum and maximum air temperatures and precipitation. The MAE of the NRCANmet and PNWNAmet datasets is smaller than that of the CRU JRA and Princeton datasets because the NRCANmet and PNWNAmet datasets were directly interpolated from the climate station data and have fine spatial resolutions. For the NRCANmet and PNWNAmet datasets, downscaling slightly increased the MAE for daily minimum and maximum air temperatures. However, downscaling slightly reduced MAE of daily mean air temperature for the PNWNAmet and reduced the standard deviation of MAE of daily mean air temperature for both the NRCANmet and PNWNAmet datasets. That means downscaling reduced the errors in daily mean air temperature and made it spatially more consistent. The effects of downscaling on precipitation were small for both the NRCANmet and PNWNAmet datasets.
Atmosphere 2020, 11, x FOR PEER REVIEW 19 of 28 temperatures, and by about 30% for precipitation. Downscaling also significantly reduced the standard deviation of the mean errors for these two datasets. For daily mean, minimum, and maximum air temperatures, the mean errors of the CRU JRA dataset in 1948-2014 are slightly smaller than that in 1901-1947, and the mean errors are even smaller for the Princeton dataset. The mean errors of precipitation are similar for the CRU JRA dataset in 1901-1947 and 1948-2014, and for the Princeton dataset. For the NRCANmet and PNWNAmet datasets, the effects of downscaling on the mean errors were small and mixed. However, downscaling reduced the standard deviation of the mean errors for air temperatures, especially for the daily means. Figure 12. Comparisons of the errors of the source datasets, the spatially downscaled source datasets and Met1km (the three bars in each group respectively) (a-p). The errors were calculated by comparing with the homogenized daily climate station data [43]. The four rows are for different climate variables, and the four columns are for the errors in long-term averages and the mean absolute  Figure 12 shows that the PNWNAmet dataset had slightly larger errors than that of the NRCANmet dataset. However, their spatial domains were different. For the spatial domain of PNWNAmet, where most of the area is mountainous, the errors of NRCANmet and PNWNAmet were similar but the standard deviation of the errors of PNWNAmet was smaller than that of the NRCANmet, probably because the spatial resolution of PNWNAmet is finer than that of NRCANmet.

Discussion
This study developed a long-term (from 1901 to 2100), high-resolution (1 km) daily near-surface meteorological dataset in Canada. Met1km has several advantageous features compared to existing meteorological datasets (Table A1).
It has a high spatial resolution and covers a long-time period at a daily time step for the entire Canadian landmass. It can therefore be used to model long-term dynamics of permafrost and ground thermal conditions at high spatial resolutions. The dataset can also be used for other national-scale modeling and climate impact analysis.
Met1km includes eight climate variables and they are generally consistent in related atmospheric physical processes as they are generated directly from observations or from climate model reanalysis. Therefore, the dataset can be used to drive process-based models and to integrate the effects of these variables through energy and water dynamics.
The dataset is relatively small and flexible to use. Instead of saving the 1-km resolution data for each grid, we developed two scripts to generate spatial and time series data as needed. This treatment significantly reduces the volume of the dataset.
Met1km integrates deemed suitable datasets presently available in term of temporal coverages, spatial resolutions, and data availability. Met1km can be updated easily when better source datasets become available.
The method to generate Met1km for Canada can be used to develop similar datasets for other regions or even to cover the global landmass as the WorldClim2, CRU JRA and Princeton datasets have global coverage.
Comparing to the 1-km daily time series dataset Daymet [19], Met1km integrates more datasets, especially those developed for Canadian domain, and covers a longer time period. The WorldClim2 dataset [29] that we used for re-baselining includes some remote sensing products, which would be very useful to improve the accuracy in high arctic, where climate stations are sparsely distributed [56]. Therefore, Met1km is more suitable for modeling permafrost in Canada. The software toolkit, GlobSim [21], is very useful to improve the applications of climate reanalysis data. However, reanalysis datasets usually have some biases [21,22]. Re-baselining method is very useful to reduce the long-term mean errors ( Figure 12). Met1km can be updated to integrate some of the reanalysis datasets, such as the version 5 of the European Centre for Medium-Range Weather Forecasts Re-Analysis (ERA5), if they are deemed better than the ones we used now.
Our tests show that the re-baselining method is useful for spatial downscaling of meteorological data, especially for downscaling coarse gridded datasets. Way and Bonnaventure [54] tested the method for monthly air temperature in northeastern Canada. Our assessment suggests that the method is useful for precipitation as well when distance between two stations is less than 50 km or the resolution of gridded datasets is finer than 0.5 • latitude/longitude. The accuracy in 5-to 10-day averages are similar to that of the monthly averages. As precipitation is usually more spatially variable than other climate variables, the re-baselining method may be useful for other climate variables as well. Karger et al. [57] used a similar approach to downscale the 0.5 • latitude/longitude CRU monthly air temperature and precipitation to about 1-km resolution. A similar approach was used in three areas in Canada, where the monthly baseline data were from spatially interpolated gridded data (long-term monthly averages) while the daily anomalies were from climate station observations [15][16][17]. Verdin et al. [56] also used a similar method to develop a 5-km resolution daily air temperature dataset by combining 5-km resolution monthly maximum air temperature and the ERA5 dataset. Verdin et al. [56] demonstrate that remotely sensed temperature estimates may more closely represent true conditions than those that rely on interpolation, especially in regions with sparse in situ data.
Our assessments show that the accuracy of Met1km is similar to or better than other gridded datasets (Section 4.6). Our analysis indicates that the major source of errors is due to day-to-day fluctuations. When averaged over 5-10 days, the error is reduced significantly. Since permafrost is not very sensitive to short-term fluctuations in meteorological conditions [43], the effects of short-term errors on modeled permafrost conditions will be limited. For the period from 1901 to 1944, the monthly MAE is about twice of that of 1945-2017. The MAE for daily values is 3-5 times of that of 1945-2017 because the daily data are estimated using an analogy approach [36]. Therefore, the data at the sub-monthly scale is not very reliable for this period. In addition, we used NRCANmet and PNWNAmet datasets to replace air temperature and precipitation because they better mimicked station observations than the CRU JRA and the Princeton datasets (Section 4.6). However, such replacement could cause some inconsistency with other climate variables.
We assessed the accuracy of Met1km mainly by comparing with station observations. Such assessments have some limitations. First, most of the climate station observations may have been used to generate the source datasets. Thus, the observations used for validating Met1km are not completely independent. Second, climate stations are sparsely distributed in northern Canada. The accuracy of the source datasets and thus the generated Met1km dataset can be low in this region due to lack of observations. The error is usually larger for precipitation than for air temperatures, especially over a short time (1 to 10 days). Although we compared with the gridded dataset CANGRD, its spatial resolution (50 km) is coarse. We will continue to update Met1km as new and better source datasets are available, especially the periods before the 1940s and for future projections. Several studies extended the dataset to the beginning of the 20th century (Table A1), and the Twentieth Century Reanalysis project even dated back to middle of the 19th century [58]. These datasets may be useful to improve the accuracy of the data before 1944 and even to extend Met1km back to the end of the Little Ice Age.

Conclusions
This study developed a long-term, 1-km resolution daily meteorological dataset in Canada. The dataset includes eight climate variables; therefore, it can be used to drive process-based models for high resolution permafrost modeling and mapping. It can also be used for other land-surface modeling and climate impact studies in Canada.
Met1km dataset is generated based on four coarser gridded meteorological datasets for the historical period: CRU JRA, PNWNAmet, NRCANmet, and the Princeton dataset. The future climate scenarios are from the output of a new Canadian regional climate model under medium-low and high emission scenarios (RCP 4.5 and 8.5). These datasets were downscaled to 1-km resolution using the re-baselining method based on the WorldClim2 dataset as spatial templates. The future scenarios were bias-corrected before re-baselining to 1-km resolution. The accuracy of Met1km is similar to or better than those of the coarse gridded source datasets. The errors in long-term averages and average seasonal patterns are small. The error mainly occurs in day-to-day fluctuations, which decreases quickly when averaged over 5 to 10 days. The error in daily values is large for the period 1901-1944 because the source dataset CRU JRA is developed using an analogy approach based on monthly values. The Met1km, as a data generating system, is relatively small comparing to the total data volume of the 1-km daily time series in Canada, flexible to use for generating spatial and time-series data for applications, and easy to update when new or improved source datasets are available. The methodology of this study can also be used to develop similar datasets for other regions, even the global landmass as most of the source data are in global coverage.
Author Contributions: Conceived the idea and designed the work together, Y.Z. and B.Q.; selected the future scenarios, conducted bias correction, and wrote the related sections, B.Q.; organized the data, assessed the accuracy, and wrote the early version of the manuscript with the input from B.Q. about future climate scenarios, Y.Z.; processed some of the spatial data for accuracy assessment, G.H. All authors discussed the results and contributed to the final manuscript. All authors have read and agreed to the published version of the manuscript. Interpolated based on climate station data [19].

GlobSim
Global Any location Same as the reanalysis datasets Same as the reanalysis datasets.
A software toolkit to generate time series data for any sites from several reanalysis datasets [21].
* T is the mean air temperature for a define period (at sub-daily scales). Ta, Tn, and Tx are daily mean, minimum and maximum air temperatures, respectively, dT is diurnal air temperature range, P is precipitation, Pa is atmospheric pressure, Vap is vapor pressure, SH is specific humidity, WetD is wet-days, Cloud is cloud cover, SR is solar radiation, LwR is downward longwave radiation, Snow is snow water equivalent.

Calculating Monthly Mean Downward Longwave Radiation
We estimated monthly mean downward longwave radiation corresponding to the time scale and spatial resolution of the WorldClim2 dataset (monthly average from 1970 to 2000 at resolution of 30 arc seconds latitude/longitude) [29]. Based on the assessment of different methods conducted by Flerchinger et al. [68], especially the accuracy in cold regions, we estimated downward longwave radiation using the method of Kimball et al. [69] with clear-sky downward longwave radiation estimated by the method of Dilley and O'Brien [70].
where L clr = 59.38 + 113.7(T 0 /273.16) 6 + 96.96(w/25) 0.5 , (A2) w = 4650e 0 /T 0 , (A3) ε 8z = 0.24 + 2.98·10 −6 e 2 0 ·exp(3000/T 0 ), where L is all sky downward longwave radiation (in W/m 2 ), L clr is downward longwave radiation for clear sky, c is the fraction of cloud cover, T 0 is air temperature observed at about 2 m above the land surface (in K), and T c is the temperature on the base of the cloud (in K), which is assumed 9 K lower than the T 0 . σ is the Stefan-Boltzmann constant (5.67 × 10 −8 W/m 2 /K 4 ), e 0 is the vapor pressure observed about 2 m above the surface (in kPa). The equations and parameters are from Flerchinger et al. [69]. We calculated monthly mean downward longwave radiation for each 1-km grid based on the monthly mean air temperature and vapor pressure of the corresponding WorldClim2 grid. As high arctic regions receive very low and even no solar radiation in winter months, we estimated the cloud cover fraction based on CRU TS data [37].