Despite over twenty years of studies developing statistical downscaling methodologies, there remains a lack of methods that can downscale AOGCM precipitation to regional-level high-resolution gridded precipitation [1
]. Compared to other climate variables, such as temperature or barometric pressure, precipitation is more fragmented in space, and interactions of different atmospheric scales (local, meso, synoptic) and terrestrial features are more apparent in observed precipitation patterns. It is very difficult for continuous functions used in traditional statistical downscaling methods to simulate these types of local patterns. Recent advances in machine learning (ML) methods, such as convolutional neural networks, have started to address these long-standing issues [3
]. However, widely used loss functions, such as mean absolute error (MAE) and Nash-Sutcliffe efficiency (NS), consider overall simulation performance but ignore the spatial structure of precipitation, a key property when replicating observed local patterns is an objective [4
]. Narrowly defined loss functions inhibit the potential of machine learning methods in downscaling, leading to poor model performance at regional scales.
Synoptic scale climate variables are commonly simulated by coupled atmosphere–ocean global climate models (AOGCMs), which provide a numerical framework of climate systems based on the physio-chemical and biological characteristics of their components and feedback interactions [5
]. Coupled AOGCMs are computational frameworks that can simulate an estimate of the spatio-temporal behavior of different climatic variables under the effects of variable concentrations of greenhouse gases (GHG) in the atmosphere [6
]. Physically based representations of the physics and chemistry of the atmosphere and oceans make these models among the most reliable tools for deriving future projections of meteorological variables (temperature, humidity, precipitation, wind speed, solar radiation, pressure, etc.). AOGCMs can simulate estimates of atmospheric variables that can be treated as possible representations of future climate [7
]. However, AOGCM skill degrades at finer spatial and temporal scales [8
]. AOGCMs typically run with a spatial resolution of 250 km to 600 km, the scale at which they can capture synoptic-scale circulation patterns and correctly simulate smoothly varying fields, such as surface pressure. As the physical assumptions underlying the various parameterizations in AOGCMs target this scale of “variable resolving resolution”, we can place a high level of confidence in the estimates over those scales. However, as we move from the synoptic scale into finer, hydrologically relevant scales and analyze highly spatially heterogeneous fields such as precipitation, AOGCM skill quickly deteriorates [9
]. The coarse resolution of AOGCMs tends to distort the representation of regional variations in precipitation, which, in turn, can alter the formation of site-specific precipitation conditions by affecting sub-grid-scale processes. Differing assumptions in the parametrization of various processes, differing resolutions and representations of land cover and topography [10
], and the different numerical solution methods used by AOGCMs (FEM, FVM, etc.) all affect their estimates of climatic variables. As such, ensemble methods collate outputs from multiple AOGCMs, aiming to reduce sensitivity to individual model biases within a quantitative framework [11
]. The failure of models to predict the highly variable processes driving, for example, daily precipitation limits their utility in several applied and management settings [12
]. To study a climatic variable at hydrological or regional scales, we need to reduce the scale of the outputs from climate models. The method used to reduce the scale of AOGCM output is broadly referred to as “downscaling”. Based on design and methodology, downscaling procedures are broadly classified into two types, namely dynamical downscaling and statistical downscaling. In dynamical downscaling, the most common approach is to use a regional climate model (RCM) or limited area model (LAM), designed to operate at a higher spatial resolution, to simulate the climatic variables of interest using AOGCM-simulated fields as initial and boundary conditions [13
]. However, the complexity of the experimental design and the computational effort make this approach infeasible when multiple ensembles of AOGCMs are required in a study [14
]. Useful features of statistical downscaling are its simple architecture and its lower computational burden compared to dynamical downscaling [15
]. Statistical downscaling can produce synthetic variables of any prescribed length, which makes it very popular in studies of climate change impacts [16
]. Statistical downscaling derives empirical relationships between regional-scale predictands (variables of interest) and predictors (AOGCM outputs) and constructs regional-scale atmospheric variable structure from large-scale simulated patterns. [17
] explained the underlying assumptions that formed the basis of subsequent statistical downscaling methods. The robustness of statistical downscaling for studying the climate change impacts of a region can partly be attributed to its incorporation of historical observations that carry the location-specific climatic signature. A comparison of statistical and dynamical downscaling methods over Northern Canada by [18
] showed that the biases in precipitation estimates were lower in statistical downscaling and the distributions of maximum and minimum temperatures were well estimated.
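The transfer-function idea behind regression-based statistical downscaling can be sketched with a minimal example: a plain least-squares fit mapping large-scale predictors to a local predictand. All variable names and values below are illustrative synthetic data, not output from any AOGCM:

```python
import numpy as np

def fit_transfer_function(predictors, predictand):
    """Fit a linear transfer function mapping large-scale predictors
    (n_samples x n_features) to a local predictand (n_samples,).
    Returns the coefficients, including an intercept term."""
    X = np.column_stack([np.ones(len(predictors)), predictors])
    coef, *_ = np.linalg.lstsq(X, predictand, rcond=None)
    return coef

def apply_transfer_function(coef, predictors):
    """Predict the local variable from new large-scale predictors."""
    X = np.column_stack([np.ones(len(predictors)), predictors])
    return X @ coef

# Synthetic illustration: a local predictand that is an exact linear
# function of two large-scale predictors.
rng = np.random.default_rng(0)
predictors = rng.normal(size=(200, 2))
predictand = 3.0 + 1.5 * predictors[:, 0] - 0.5 * predictors[:, 1]
coef = fit_transfer_function(predictors, predictand)  # recovers [3.0, 1.5, -0.5]
```

Real transfer functions differ mainly in replacing the linear map with a nonlinear one (ANNs, support vector machines, CNNs) while keeping this same predictor-to-predictand structure.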
Statistical downscaling methods are broadly classified into three categories based on how they process the predictors and predictands [19
]: (i) weather generators, (ii) weather typing, and (iii) transfer functions. Stochastic weather generators are essentially complex random number generators, which can be used to produce a synthetic series of data [21
]. This feature enables researchers to address natural variability when studying the impacts of climate change. Brissette et al., 2007 classified weather generators into three types: (i) parametric [22
], (ii) semi-parametric or empirical [27
] and (iii) non-parametric [29
]. A detailed discussion of these methods is beyond the scope of this paper, but it should be noted that one of the key advantages of weather generators is their ability to produce synthetic time series of climate data of the desired length based on the statistical characteristics of the observed weather. Weather typing approaches [32
] involve the clustering of regional-scale meteorological variables and linking them with different classes of atmospheric circulation. Within this framework, future regional climate scenarios can be generated in two ways: (i) by re-sampling from the observed variable distribution given the distribution of circulation patterns produced by a GCM, or (ii) by using a Monte Carlo (MC) technique to produce a synthetic sequence of weather patterns and re-sampling from the archived data based on that sequence. The relative frequencies of the different weather patterns are then weighted to estimate the moments or frequencies of the distribution of the regional-scale climate.
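As a concrete illustration of the parametric weather-generator idea, the sketch below combines a first-order, two-state Markov chain for wet/dry occurrence with gamma-distributed wet-day amounts. The transition probabilities and gamma parameters are hypothetical placeholders, not values calibrated to any station:

```python
import numpy as np

def generate_daily_precip(n_days, p_wet_given_dry=0.3, p_wet_given_wet=0.6,
                          gamma_shape=0.8, gamma_scale=8.0, seed=0):
    """Toy parametric weather generator: a first-order Markov chain
    decides wet/dry occurrence, and wet-day rainfall amounts (mm) are
    drawn from a gamma distribution."""
    rng = np.random.default_rng(seed)
    precip = np.zeros(n_days)
    wet = False  # start from a dry day
    for day in range(n_days):
        p_wet = p_wet_given_wet if wet else p_wet_given_dry
        wet = rng.random() < p_wet
        if wet:
            precip[day] = rng.gamma(gamma_shape, gamma_scale)
    return precip

# Generate a 30-year synthetic daily series.
series = generate_daily_precip(365 * 30)
```

Because the series can be made arbitrarily long, statistics of rare events (e.g., annual maxima) can be estimated from the synthetic record, which is exactly the appeal of weather generators noted above.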
Perhaps the most popular approach within statistical downscaling methods is transfer functions, which is a regression-based framework [33
]. The method consists of developing a direct functional relationship between global large-scale variables (predictors) and local regional-scale variables (predictands) through statistical fitting. Methods differ in their choice of mathematical functions, predictor variables, and procedures for deriving the relationship. Several studies have focused on the application of neural networks [37
], regression-based methods [38
], support vector machines [36
], and analog methods [39
] in statistical downscaling. Artificial neural networks (ANNs), owing to their robustness in capturing nonlinear relationships between predictors and predictands, have gained wide recognition in the climate modeling community [33
]. More recently, the use of machine learning and data science methods has increased across many fields owing to their superior performance and robust software implementations. In particular, convolutional neural network (CNN) models have gained wide popularity because extracting spatial information with kernel filters gives them lower computational requirements than dense networks [42
]. There is a wide range of applications of CNNs including image recognition [43
], image segmentation [44
], and satellite image change detection [45
]. Image super-resolution [46
] is another application in which CNNs have been used to increase the resolution of an image, a task directly analogous to climate model downscaling.
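The super-resolution analogy can be made concrete: a coarse field is upsampled to the target grid and then refined by convolution. The sketch below uses a fixed (untrained) smoothing kernel purely to show the upsample-then-convolve structure; a real super-resolution model would learn its kernel weights from data:

```python
import numpy as np

def conv2d_same(field, kernel):
    """Naive 2-D convolution with zero padding ('same' output size)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(field, ((ph, ph), (pw, pw)))
    out = np.zeros_like(field, dtype=float)
    for i in range(field.shape[0]):
        for j in range(field.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def naive_super_resolve(coarse, factor=4):
    """Nearest-neighbour upsampling followed by a smoothing convolution,
    mimicking (without any training) the upsample-then-refine structure
    of CNN super-resolution models."""
    fine = np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)
    kernel = np.ones((3, 3)) / 9.0  # untrained placeholder filter
    return conv2d_same(fine, kernel)

# A 4x4 array standing in for a block of coarse GCM grid cells,
# mapped onto a 16x16 "regional" grid.
coarse = np.arange(16, dtype=float).reshape(4, 4)
fine = naive_super_resolve(coarse)
```

In a trained CNN, stacks of such learned filters replace the single smoothing kernel, allowing the network to sharpen local structure instead of merely interpolating it.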
Among the climatological variables commonly downscaled in practice, precipitation is perhaps the most susceptible to uncertainty [47
]. The non-Gaussian character of extreme precipitation, owing to its localized nature, creates problems for classical statistical estimators [48
]. Some recent studies have utilized more advanced statistical techniques and Bayesian methods in particular to downscale extreme precipitation over data sparse regions [50
]. An example of direct application of machine learning techniques in statistical downscaling of precipitation is [52
] who used a generalized stacked super-resolution CNN to downscale daily precipitation over the US. Despite several limitations in their experiments, their results show the efficiency and robustness of the approach relative to other methods in predicting extremes. [53
] also recently introduced a novel residual dense block (RDB) into the Laplacian pyramid super-resolution network (LapSRN) to generate high-resolution precipitation forecasts. [54
] used super-resolution techniques to simulate high-resolution urban micrometeorology, while [55
] proposed several CNN-based architectures to forecast high-resolution precipitation. Underlying all of these models is the treatment of two-dimensional fields, such as climate model outputs and gridded observations, as analogous to non-geographic images, which makes CNNs ideal candidates for transfer functions in statistical downscaling.
Recent advances in machine learning have produced several algorithms for image super-resolution, for example, attention-based training [56
]. One such recent advance is the generative adversarial network (GAN), proposed by [58
], in which two networks compete in a zero-sum game; this training scheme has been widely used for deep neural networks in recent years. It provides superior network training, can produce outputs that look superficially close to reality to human observers, and addresses the gradient problem in an intuitive way. Ref. [59
] used adversarial learning to downscale wind and solar outputs from several AOGCM climate scenarios to high-resolution regional fields. Cheng et al., 2020 also used adversarial learning to downscale precipitation; their results show the promising performance of the generative adversarial network in downscaling climate data.
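The zero-sum objective can be sketched with plain numpy. The discriminator is trained to score observed fields as real (1) and generated fields as fake (0), while the generator is trained to push its scores toward 1. Only the loss arithmetic is shown here; the networks themselves and the gradient updates are omitted:

```python
import numpy as np

def bce(probability, target):
    """Binary cross-entropy for probability/target pairs."""
    eps = 1e-12  # numerical guard against log(0)
    return -(target * np.log(probability + eps)
             + (1 - target) * np.log(1 - probability + eps))

def discriminator_loss(d_real, d_fake):
    """Discriminator wants real samples scored 1 and fakes scored 0."""
    return np.mean(bce(d_real, 1.0) + bce(d_fake, 0.0))

def generator_loss(d_fake):
    """Non-saturating generator loss: the generator wants its samples
    to be scored as real (1) by the discriminator."""
    return np.mean(bce(d_fake, 1.0))

# Scores of 0.5 mean the discriminator cannot tell real from fake,
# the equilibrium that the zero-sum game drives toward; there the
# discriminator loss equals 2*ln(2) ~ 1.386.
d_real = np.array([0.5, 0.5])
d_fake = np.array([0.5, 0.5])
```

A confident discriminator (e.g., `d_real` near 1 and `d_fake` near 0) has a small loss, while the generator loss is then large, which is the push-and-pull that the training traces in the Results section reflect.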
In this paper, we develop a novel downscaling method using a GAN, which can downscale an ensemble of large-scale annual maximum precipitation fields given by several AOGCMs to regional-level gridded annual maximum precipitation. The objectives of our study are the following:
(i) Develop a methodology to downscale large-scale precipitation, given by several AOGCMs, to regional-scale precipitation by statistical downscaling using convolutional neural networks and generative adversarial training.
(ii) Propose a novel loss function, a combination of content loss, structural loss, and adversarial loss, which improves the prediction of both global and regional qualities of the downscaled precipitation.
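A minimal sketch of what such a combined loss could look like is given below. The structural term here is a single global SSIM statistic rather than a true windowed, multi-scale MSSIM, and the weights are placeholders, not the values used in this paper:

```python
import numpy as np

def ns_content_loss(obs, sim):
    """Content loss based on Nash-Sutcliffe efficiency: 1 - NSE,
    so a perfect simulation gives 0."""
    return np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def ssim_structural_loss(obs, sim, c1=1e-4, c2=9e-4):
    """Structural loss as 1 - SSIM, computed globally over the field.
    A full MSSIM would use local windows at multiple scales; this is
    a deliberately simplified stand-in."""
    mu_x, mu_y = obs.mean(), sim.mean()
    var_x, var_y = obs.var(), sim.var()
    cov = ((obs - mu_x) * (sim - mu_y)).mean()
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return 1.0 - ssim

def total_loss(obs, sim, adv_loss, w_content=1.0, w_struct=1.0, w_adv=1e-3):
    """Weighted combination of content, structural, and adversarial
    terms; the weights are illustrative placeholders."""
    return (w_content * ns_content_loss(obs, sim)
            + w_struct * ssim_structural_loss(obs, sim)
            + w_adv * adv_loss)
```

The content term penalizes overall magnitude errors, the structural term penalizes mismatched spatial patterns, and the adversarial term (supplied by the discriminator) penalizes fields that look statistically unlike observations.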
3. Results and Discussion
The median annual maximum rainfall generated by the various AOGCMs examined in this study varied considerably (Figure 3
), indicating significant inter-model uncertainty and the need for an ensemble approach [73
]. One of the major challenges in downscaling these precipitation variables to any region is to correct these biases (or differences) according to the observed pattern [74
]. An ideal downscaling method not only generates a higher-resolution regional realization of the climatic variable but also reproduces the observed spatial patterns and magnitudes. The models examined here show the need for downscaling methods that can produce spatially and temporally coherent realizations of the observed climatic distribution.
A key contribution of the CliGAN modeling framework is the development of a novel loss-based training method incorporating the spatial structure of rainfall. To validate our choice of total loss function, we experimented with different combinations of loss functions (Table 2
). Each model was trained with a different permutation of loss functions for 10,000 iterations, and the median training and testing losses of the last 50 iterations are reported in the table. The combination of all three (adversarial loss, NS loss, and MSSIM loss) has content errors of 0.015 and 0.043 and structural errors of 0.024 and 0.033 for training and testing, respectively. It can also be noticed that the adversarial loss plays a vital role in stabilizing the performance of the network (Figure 4
). Although the training losses are of similar magnitude across all experiments, including the adversarial loss in the combination produced testing results that were an order of magnitude better. The total loss combination produced a balanced error for both content and structure and was stable for both training and testing; we therefore proceed with this combination. To validate the model, we trained with the total loss combination for 20,000 iterations; the content losses were 0.011 and 0.020 and the structural losses were 0.021 and 0.017 for training and testing, respectively. This justifies our choice of loss function.
Figure 5 shows the traces of the training and testing losses for all loss function combinations over 10,000 iterations. For the total, content, and structural losses, the first 100 iterations are discarded as generator warm-up; for the adversarial and discriminator losses, the first 500 iterations are discarded as discriminator spin-up. Figure 5
a shows the trace of the total loss. Considering this is a combination of three types of errors—adversarial error, content error, and structural error—it does not have any meaningful unit. Figure 5
b shows the trace of content loss and Figure 5
c shows the trace of the structural loss. Notice the stable training trace and the fluctuating testing trace. Overall, however, the testing error shows a decreasing trend for both the content and structural losses. The adversarial loss continues to increase (Figure 5
d) for both training and testing. This signifies the growing ability of the discriminator to distinguish generator output from observations. The training loss of the discriminator (Figure 5
e) also shows an increasing trace, and it is marginally greater in magnitude than the adversarial loss, which means the generator is doing a good job of emulating the observed patterns. The mean absolute error trace (Figure 5
f) shows a pattern similar to the content loss.
For the final simulation, we trained the model with all the available input data. This also enabled us to compare the model against other models effectively. For the total loss of the model, error continued to decrease even after 10,000 iterations (Figure 6
a). The content loss stabilized around a value of 1 × 10^-3
after 10,000 iterations (Figure 6
b), while the structural loss stabilized around a value of 2.4 × 10^-2 (Figure 6
c). The adversarial loss in Figure 6
d continues to increase after 10,000 iterations, indicating that the discriminator keeps resolving differences between the observed and predicted precipitation patterns. The discriminator error also continues to increase (Figure 6
e), signifying the good performance of the generator. However, the loss of the discriminator is still lower than the adversarial error. As an additional diagnostic, we also tracked a widely used loss metric, the mean absolute error. Figure 6
f shows the trace of mean absolute error which stabilized around a value of 0.1 mm/day after 10,000 iterations.
Figure 7 shows the temporal mean of the observed (Figure 7
a) and downscaled (Figure 7
b) annual maximum precipitation. The regional patterns are well captured by the model: the observed pattern of high rainfall in the southwestern high-elevation region, with its orographic influences, and the low rainfall over the lake are both reproduced. While more work is required to understand the performance of structural-loss training on climate model outputs in different geographic contexts, the results here show that distinct regional processes are captured, yielding a more accurate overall output product. This is evident in the error diagnostics of the downscaling performance outlined in Figure 8
. For the temporal mean absolute percent error (Figure 8
a), the maximum error is around 5%; however, this error is mostly confined to the north-western part of the domain. We attribute this relatively high percentage error to the low observed rainfall in this region, which lacks distinctive features (such as orographic or land cover influences) for the model to exploit. Figure 8
b shows the temporal correlation between the observed and downscaled precipitation. The minimum correlation is 0.995, well above any acceptable limit. Interestingly, however, the correlation values show a rectilinear spatial orientation. We hypothesize that this is an artifact of the convolution filters used in the generator for downscaling, so this result should be interpreted with caution. Calculating p
-values for the Kolmogorov-Smirnov test for equivalency of temporal distribution of observed and downscaled annual maximum (Figure 8
c) indicates that we cannot reject the null hypothesis of equivalency between the observed and downscaled distributions at the 90% confidence level.
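The per-grid-cell KS comparison can be reproduced with a short routine; the sketch below is a hand-rolled equivalent of `scipy.stats.ks_2samp` using the standard asymptotic p-value approximation:

```python
import numpy as np

def ks_two_sample(x, y):
    """Two-sample Kolmogorov-Smirnov statistic and asymptotic p-value."""
    x, y = np.sort(np.asarray(x, float)), np.sort(np.asarray(y, float))
    grid = np.concatenate([x, y])
    # Empirical CDFs of both samples evaluated on the pooled grid.
    cdf_x = np.searchsorted(x, grid, side="right") / len(x)
    cdf_y = np.searchsorted(y, grid, side="right") / len(y)
    d = float(np.max(np.abs(cdf_x - cdf_y)))
    # Asymptotic p-value (Kolmogorov distribution approximation).
    n_eff = len(x) * len(y) / (len(x) + len(y))
    lam = (np.sqrt(n_eff) + 0.12 + 0.11 / np.sqrt(n_eff)) * d
    if lam < 1e-3:  # identical samples: cannot reject equivalence
        return d, 1.0
    k = np.arange(1, 101)
    p = 2.0 * np.sum((-1.0) ** (k - 1) * np.exp(-2.0 * (k * lam) ** 2))
    return d, float(np.clip(p, 0.0, 1.0))
```

At each grid cell, a p-value above the chosen significance threshold (0.10 for the 90% confidence level used above) means the null hypothesis of equivalent distributions cannot be rejected.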
Finally, we compared the performance of different models and loss functions. The generative adversarial network is compared against a model trained using the traditional mean absolute error and another network with principal component mapping. The performance measures are the mean absolute error (MAE), NS coefficient, correlation coefficient, and p
-value of the two-sample KS test. The GAN is superior on all performance measures, followed by the MAE-based model; the PCA-based model performed worst, although all models performed reasonably well.
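For reference, the summary measures reported here (MAE, NS coefficient, and correlation; the KS p-value is handled separately) can be computed as:

```python
import numpy as np

def performance_metrics(obs, sim):
    """Mean absolute error, Nash-Sutcliffe efficiency, and Pearson
    correlation between observed and simulated fields."""
    mae = np.mean(np.abs(obs - sim))
    ns = 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)
    r = np.corrcoef(obs.ravel(), sim.ravel())[0, 1]
    return {"MAE": mae, "NS": ns, "r": r}
```

A perfect simulation gives MAE = 0, NS = 1, and r = 1; note that a constant bias degrades MAE and NS but leaves the correlation unchanged, which is why multiple measures are reported together.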