stUPscales: An R- Package for Spatio-Temporal Uncertainty Propagation across Multiple Scales with Examples in Urban Water Modelling

: Integrated environmental modelling requires coupling sub-models at different spatial and temporal scales, thus accounting for change of support procedures (aggregation and disaggregation). We introduce the R- package s patio- t emporal U ncertainty P ropagation across multiple scales , stUPscales , which constitutes a contribution to state-of-the-art open source tools that support uncertainty propagation analysis in temporal and spatio-temporal domains. We illustrate the tool with an uncertainty propagation example in environmental modelling, speciﬁcally in the urban water domain. The functionalities of the class setup and the methods and functions MC.setup , MC.sim , MC.analysis and Agg.t are explained, which are used for setting up, running and analysing Monte Carlo uncertainty propagation simulations, and for spatio-temporal aggregation. We also show how the package can be used to model and predict variables that vary in space and time by using a spatio-temporal variogram model and space-time ordinary kriging. stUPscales takes uncertainty characterisation and propagation a step further by including temporal and spatio-temporal auto- and cross-correlation, resulting in more realistic (spatio-)temporal series of environmental variables. Due to its modularity, the package allows the implementation of additional methods and functions for spatio-temporal disaggregation of model inputs and outputs, when linking models across multiple space-time scales.


Introduction
Integrated environmental modelling (IEM) often requires coupling models at multiple spatial and temporal scales. This is challenging due to the complexity of the models and differences in input and output supports between models. Leopold et al. [1] address the need for aggregation and disaggregation when coupling models. In addition, Bastin et al. [2] point out that spatio-temporal aggregation and disaggregation are common procedures in modelling chains. This challenge becomes more difficult if uncertainty propagation (UP) is analysed because uncertainty is support-dependent. Change of support procedures (i.e., aggregation and disaggregation) are also required when dealing with integration between models. Accounting for spatio-temporal uncertainty when coupling models in an integrated layout and when addressing change of support is a key component in the data-model-simulation-uncertainty quantification-decision-making chain. There is no software that can do all that is needed for UP analysis in such a complex case. Although important contributions have been made, e.g., the UncertWeb framework [2] and the R-package spup [3], there still is a need for tools for temporal, spatial and spatio-temporal UP accounting for change of support across multiple scales.
In the environmental and geospatial fields, modelling complex processes accounting for uncertainties exhibits specific issues. Bastin et al. [2] identified six main challenges. We summarise and relate these to the availability of tools accounting for uncertainty analysis in specific applications: 1.
Lack of tools accounting for reliable uncertainty information of observational and other data.

2.
Lack of tools for reliable information on model structural uncertainty. 3.
Development of tools accounting for spatial, temporal and spatio-temporal uncertainty is still a challenge because many environmental variables are measured at diverse spatial and temporal scales and because there are strong correlations between variables. 4.
Large computational budgets are required when modelling spatio-temporal distributed systems, due to the large number of variables and the correct characterisation of uncertainties and correlations. 5.
Models commonly exhibit complex probability distributions in model output response due to nonlinear responses to model input, even for simple parametric input probability distributions. 6.
The default mechanism to propagate uncertainties is the Monte Carlo method, which is highly computationally demanding. Therefore, large computational resources and implementations of computationally efficient tools are required.
Generally, in environmental and geospatial modelling, there are three main sources of uncertainty [4,5]: (1) Model input uncertainty; (2) Model parameter uncertainty; and (3) Model structural uncertainty. Proper characterisation and quantification of all three uncertainty sources are fundamental in the analysis of the propagation of uncertainties through environmental models. It is also important that practical tools are available to address the characterisation and propagation of uncertainties through models.
Spatial uncertainty propagation has been addressed in several fields, such as in hydrological and water quality modelling [6][7][8], in scenario analysis [9], and in soil pollution and nutrient modelling [1, 10,11]. However, it is recognised that there is not a universal software tool for performing uncertainty propagation tasks [3].
Regarding software tools developed for uncertainty propagation analysis, Sawicka et al. [3] list different tools available for spatial and non-spatial uncertainty propagation and uncertainty assessment. Among the listed tools, there is free software, such as OpenTURNS [12], DACOTA [13], and DUE [14], commercial software, such as COSSAN [15], and free software written for licensed software, e.g., SAFE [16] and UQLab [17] toolboxes for MATLAB (MathWorks, Natick, MA, USA). An extensive review of available UP software tools is presented in [2]. Regarding existing R packages, there are few that deal with uncertainty propagation explicitly: propagate and errors. However, neither of these two packages provides functionality for spatial uncertainty analysis.
Widely used techniques for uncertainty assessment in stormwater quality modelling are the classical Bayesian approach based on Markov Chain Monte Carlo (MCMC) sampling and the Metropolis Sampler [18], Generalized Likelihood Uncertainty Estimation (GLUE) [19], the Multi-algorithm Genetically Adaptive Multi-objective method [20], and the Shuffled Complex Evolution Metropolis algorithm [21]. However, these approaches usually do not take all uncertainty sources into account [22,23]. Dotto et al. [5] found that the definition of different subjective criteria, such as user-defined likelihood measures and prior knowledge required in Bayesian techniques, are important issues that limit the application of these techniques. Wijesiri et al. [23] pointed out that these limitations could have a significant influence on management and planning decisions for mitigation of stormwater pollution. In stormwater quality modelling, a poor characterisation of the source uncertainty (process variability) is the main limitation in accounting for uncertainty [23]. Therefore, provided these uncertainty sources can be captured adequately by probability distributions, the use of Monte Carlo (MC) methods for uncertainty propagation is still a suitable option that can overcome most of these limitations.
The literature review above indicates that there are still gaps in research and development of tools for temporal, spatial and spatio-temporal UP accounting for change of support across multiple scales in environmental and geospatial domains. We present the spatio-temporal and Uncertainty Propagation across multiple scales R-package, "stUPscales", which provides several R methods and functions for spatio-temporal uncertainty propagation through MC simulation in environmental models, while taking space-time aggregation and disaggregation procedures into account.
We recognise that spup [3] addresses spatial UP as well, both for continuous and categorical variables. Although there are similarities between spup and stUPscales, there are meaningful differences too. The main difference is that stUPscales can also handle UP in the temporal and spatio-temporal domain. In addition, in stUPscales, besides sampling from uniform, normal, log-normal, and normal truncated probability distribution functions (pdfs), it is also possible to simulate environmental variables as uni-and multi-variate autoregressive models of order one. Moreover, stUPscales has explicit functionality for spatio-temporal change of support. Additionally, another important difference between the two packages is the handling of data structure and objects. In stUPscales, the temporal dimension is handled as xts (eXtensible Time Series) objects from the xts package [24], while space-time objects are handled by the classes ST (Spatio Temporal) from the spacetime package [25], especially the class STFDF (Spatio Temporal Full Data Frame) for regular grids. The spup package makes use of the objects RasterStack from the Raster package [26].
We illustrate the package for temporal aggregation of time series as precipitation and pollutants of Urban Drainage Models (UDMs). In addition, we illustrate the methods and functions for uncertainty propagation in an example of a lumped UDM, EmiStatR, via MC simulation to evaluate water quantity and quality in emissions of Combined Sewer Overflows (CSOs). The example serves for testing the hypothesis that when characterisation of sewage quality indicators as chemical oxygen demand (COD) and ammonium (NH 4 ) is considered with the appropriate process variability, uncertainty can be quantified as an integral part of the total uncertainty in the modelling procedure.
Regarding the spatio-temporal uncertainty propagation domain, we show how the package can be used to model and predict variables that vary in space and time by using a spatio-temporal variogram model and space-time ordinary kriging. We emphasize that the spatio-temporal functionality developed in stUPscales for performing uncertainty propagation studies is an important contribution to the scientific and practitioner communities by making this open source tool accessible and illustrating its applicability in the water domain. However, it should be noted that the tool may also be applied to other domains in environmental modelling.
The objectives of this study are to: (1) Develop a tool for characterising uncertainty in spatial, temporal and spatio-temporal environmental variables (model inputs) as pdfs and as uni-and multi-variate autoregressive models; (2) Develop a tool for sampling from these pdfs (to support MC UP analysis) and to generate realisations of autoregressive models for environmental variables;

Materials and Methods
This section describes the R-package stUPscales, its main class, methods and functions.

The R-Package stUPscales
An overview of the conceptual framework is presented in Figure 1. Three steps are distinguished:  The class setup is used to create objects of signature setup for further use in the Monte Carlo method. The created object contains eight slots (for more details, see the online user manual available at https://CRAN.R-project.org/package=stUPscales): • id: Object of class "character" to identify the current MC simulation. • nsim: Object of class "numeric" to specify the number of MC runs. • seed: Object of class "numeric" to specify the seed of the random number generator. • mcCores: Object of class "numeric" to specify the number of cores (CPUs) to be used in the MC simulation. • st.input: Object of class "character" that defines the name of the file or url of the web service to retrieve the spatio-temporal input data. • rng: Object of class "list" that contains the names and values of the variables to be used in the MC simulation. Five modes are available: (1) constant value; (2) a variable sampled from a uniform (uni) probability distribution function (pdf); (3) a variable sampled from a normal (nor) pdf; (4) a variable sampled from an autoregressive (AR) model and normal (nor) pdf; (5) a variable sampled from a vector autoregressive (VAR) model and normal (nor) pdf; (6) a variable sampled from discrete values (dis); and (7) a variable sampled from a truncated normal (tnor) pdf. • ar.model: Object of class "list" containing the coefficients of the AR model as vectors, the name of which is the variable to be modelled and the length of which refers to the order of the model as required by function arima.sim from the base package stats. • var.model: Object of class "list" containing the vector of intercept terms w, the matrix of AR coefficients A, and the noise covariance matrix C of the VAR model, the name of which is the variable to be modelled and the length of which refers to the order of the model as required by function mAr.sim from the package mAr. For mathematical details, see [27].

The MC.setup Method
Given an object of class setup, the method can be invoked for setting-up the MC simulation. The variables are sampled according to their parameters specified in the slot rng of the setup object. If ar.model is defined in slot ar.model, then the specified variables are sampled from the pdf or as an autoregressive (AR) model via the function arima.sim. If var.model is defined in slot var.model, then the specified variables are sampled from the pdf or as a vector autoregressive (VAR) model via the function mAr.sim (see [27,28] for details).
There are seven different cases considered in stUPscales (for more details see the online user manual available at https://CRAN.R-project.org/package=stUPscales). We developed specific cases accounting for autocorrelation of variables. An extract from the method MC.setup for selecting the sampling probability distribution function, case 2 of the method MC.setup, i.e., a normally distributed autocorrelated time series (AR1 model), is presented in Appendix A. Additionally, case 5 of the method, a normally distributed auto-and cross-correlated time series (var.model) in parallel code, is also presented in Appendix A.

The MC.sim Method
The method MC.sim is used to invoke a MC simulation. This method can be modified to work with another simulator. The method has two arguments: • x: Object of class "list" as defined by method MC.setup. • EmiStatR.cores: a numeric value for specifying the number of cores (CPUs) to be used in the EmiStatR method. Use zero for no use of parallel computation.
The value of the method is a list of length two. The first element of the list, mc, is a list that contains the MC.setup, timing and lap objects. The second element of the list, sim1, is a list that contains the Monte Carlo matrices of the simulator output.

The MC.analysis Function
The MC.analysis function is available for performing the statistical analysis of the MC simulation results. This function requires nine arguments: • x: A list from the output of the method MC.sim. • delta: A numeric value that specifies the level of temporal aggregation required in minutes. • qUpper: A character string that defines the upper percentile to plot the prediction band of the results. Several options are possible: "q999" the 99.9th percentile, "q995" the 99.5th percentile, "q99" the 99th percentile, "q95" the 95th percentile, and "q50" the 50th percentile. The lower boundary of the prediction band (shown in grey in the output plots) is the 5th percentile in all cases. • p1.det: A data.frame that contains the time series of the main driving force of the system to be simulated deterministically, e.g., precipitation. This data.frame should have only two

Example 1: Application of stUPscales to Temporal Uncertainty Propagation and Aggregation with the EmiStatR Model
We present a UP example using an environmental model with application in urban water modelling, the EmiStatR model, coded as R-package. Its input class and main method EmiStatR are presented as well.
The example is an application in temporal UP for simulation of the water volume and water quality dynamics in combined sewer systems (CSSs). CSSs are designed to convey wastewater during precipitation events and also directly to the wastewater treatment facility in dry weather flow conditions. In CSSs, two types of discharge are distinguished: (1) the pass forward flow, which transports the sewage discharge directly to the wastewater treatment facility [29]; (2) the combined sewer overflow (CSO), which is the sewage diverted from the treatment facility and discharged, untreated, into a local receiving water body during heavy precipitation events [30]. CSOs, when released to the environment, can have an important damaging impact on the water quality status of receiving waters (streams, rivers, ponds, lakes, wetlands, and oceans). About 50% of pollutants in stormwater are metals, nutrients, organic toxins, and bacteria, and are associated with particulate matter. The other 50% are soluble [30], and therefore these pollutants can be more persistent in the water and the environment itself. Minimisation of the CSO spill volume is therefore important for the preservation of good water quality status of receiving waters. The goal of this example is to quantify model output uncertainty when model input uncertainties are propagated through a simplified lumped urban drainage model.

The Model EmiStatR
The R-package EmiStatR version 1.2.0.4 is used to perform the simulations and to propagate model input uncertainty in the example with the R-package stUPscales. Details regarding the EmiStatR model are found in [31]. Basically, the total dry weather flow, Q t [L · s −1 ] is calculated as: where Q s [L · s −1 ] is the dry weather flow estimated as the flow of the residential wastewater in the catchment, calculated as 86,400 −1 · pe · qs (where 86,400 = 24 × 60 × 60 is a measurement unit conversion factor), with pe [PE] the population equivalents of the connected CSO structure, and qs is the infiltration flow that enters the pipes from groundwater flow through cracks and joints, calculated as is the impervious area of the catchment, and q f [L · s −1 · ha −1 ] is the infiltration water inflow flux (specific infiltration discharge from groundwater flow). Variables qs and pe are dynamic and can be defined as time series with daily, weekly and seasonal patterns. The contribution of rain water to the combined sewage volume is given by V r [m 3 ]. This is a vector whose length is equal to that of the input precipitation time series. V r is computed as: where P 1 is a time series of rainfall depth [mm]; A imp is the impervious area of the catchment [ha]; A total is the total area of the catchment [ha]; C imp is the run-off coefficient for impervious areas [-]; and C per is the run-off coefficient for pervious areas [-]. The load, B x,Sv [kg], of a specific water quality variable x (e.g., Chemical Oxygen Demand, COD, or Ammonium, NH 4 ) in the overflow is calculated as a function of the CSO spill volume, V Sv [m 3 ], a combined sewer mixing ratio, cs mr [-], the mean dry weather pollutant concentration, C x,DWF [mg · L −1 ], and the concentration due to rainwater pollution, where DWF refers to dry weather flow and Rw to rainwater. V Sv , cs mr and C x,DWF are vectors of length equal to the input precipitation time series, P 1 . C x,Rw can be a vector of length equal to P 1 or a unique value constant in time. Two indicators of water pollution are simulated: COD and NH 4 . Table 1 presents the main components and input variables of the EmiStatR model. The main model outputs are listed in Table 2. An object of class input can be defined to contain all required input data for running the model in deterministic mode. An object of class input can be created by invoking the command input() or new("input"). The content details of an object of class input can be found in the user manual of the EmiStatR package available in CRAN (https://cran.r-project.org/web/packages/EmiStatR/). This object has 20 slots. For this example, the model input variables considered in the simulations and used to compute the output uncertainty are presented in Table 1.
If calibration data are available, EmiStatR parameters may be calibrated prior to simulation. If calibration is not feasible, the model can be run using parameter values taken from the reference literature and guidelines. Table 3 provides reference values and calibration ranges for the most important EmiStatR parameters. Table 1. General input data and combined sewer overflow (CSO) data to be defined by the input object.

General Input Units CSO Input Units
a PE = population equivalents; b COD = chemical oxygen demand; c NH 4 = ammonium. The output value of the EmiStatR method is described in the manual. Here, we are interested in propagating model input uncertainties through the simplified lumped urban drainage model.

Temporal Aggregation and Model Input Uncertainty Propagation
We used the Agg.t function, described in Section 2.1.5, to perform a temporal aggregation of the model input data to the time step required. The model input uncertainty propagation is done in a two-step process. First, a model input uncertainty definition is done by setting up the MC simulation. For this, we apply the setup class and the MC.setup method. Second, the MC simulation is performed by applying the MC.sim method, after which the UP analysis is done through the application of the MC.analysis function.

Results
In this section, we first present the input time series as the main driving force of the system modelled. This input is important because the other input time series definition depends on it to achieve the same model input support. Second, the results of input time series aggregation are shown. Finally, the results of the model input uncertainty propagation with stUPscales are presented.

Model Input
EmiStatR includes an example of a precipitation dataset called P1, used as main input data. This dataset is a list that contains a data.frame with two columns: "Time [y-m-d h:m:s]" and precipitation depth in millimetres "P [mm]". The rain gauge station where the measurements were recorded (in this case Dahl) is located close to the catchment of the combined sewer overflow chamber at Goesdorf, Grand-Duchy of Luxembourg. The dataset contains records from 1 January until 2 February 2016. The recording time step is 10 min. We used this dataset as precipitation input.

Temporal Aggregation and Model Input Uncertainty Propagation
As an illustration of the Agg.t function, we aggregated the time series of precipitation P1 from 10 to 30 min. We took the sum as aggregation function. Following the two-step process, we performed the model input uncertainty propagation. First, an illustration of the model input uncertainty definition is presented by setting up the MC simulation. Second, the MC simulation is executed followed by its respective analysis.

Model Input Uncertainty and Monte Carlo Simulation Set-Up
For setting up the MC simulation, we applied the setup class and the MC.setup method. Appendix A.6.4 shows the R-code for the settings of the setup class. Upon the definition of the setup object, we proceeded to invoke the MC.setup to create the sampled variables from their corresponding probability distributions.
The water quantity input variables chosen for the uncertainty propagation analysis are water consumption (qs), infiltration inflow water (q f ), run-off coefficient for impervious area (C imp ), run-off coefficient for pervious area (C per ), population equivalents (pe), orifice coefficient of discharge (C d ) and the initial water level in the chamber (Lev ini ).
The precipitation input time series and the water quality input variables chosen for the uncertainty propagation analysis are plotted in Figure 2. A time window was chosen for illustration purposes. The sewage COD pollution per capita [PE] load per day (C COD,s ) was modelled as an autoregressive order one (AR1) model with autoregressive coefficient equal to 0.7 (Figure 2). The sewage NH 4 pollution per capita [PE] load per day (C NH4,s ) was also modelled as an AR1 model with autoregressive coefficient equal to 0.7 ( Figure 2).

Monte Carlo Simulation and Analysis
The MC simulation was performed by invoking the MC.sim method. The analysis of the MC simulations was done by invoking the MC.analysis function. In order to proceed with the MC analysis, a deterministic simulation was defined and computed. First, we defined structure 1, named "E1". Second, the input.user object of class input was defined. Next, we invoked the method EmiStatR with the deterministic input (input.user) defined before. Finally, the additional arguments of the MC.analysis function were defined. Upon definition of the additional arguments, the function MC.analysis was invoked. Appendix A.6.5 presents the R-code of the MC.sim method and the MC.analysis function.
Only part of the outputs of the analysis are presented here. Figure 3 presents the time series of the deterministic simulation, as well as the mean and 95% prediction band of the MC simulations at a 30-min time step, for simulation and analysis of the water volume. Figure 4 shows the time series window for COD load and concentration in spill water volume in the CSO and NH 4 load concentration in spill water volume in the CSO, at a 30-min time step.

Example 2: Application of stUPscales to Spatio-Temporal Uncertainty Characterisation of Precipitation for the Grand-Duchy of Luxembourg
This example illustrates the capability of stUPscales to address UP in the spatio-temporal domain. We illustrate how to characterise the spatio-temporal uncertainty of precipitation for the entire country of the Grand-Duchy of Luxembourg. We use observations of a precipitation time series measured at 25 rain gauges for the period January 2010 to December 2011, recorded at 10-min time steps. The set of time series was provided by the Luxemburgish Administration des Services Techniques de l'Agriculture (ASTA), and verified for consistency by the Observatory for Climate and Environment (OCE) of the Luxembourg Institute of Science and Technology (LIST).
We prepared the data to predict space-time precipitation fields at 500 m and 10-min time step. This space-time uncertainty information can, for instance, be used to feed a distributed rainfall-runoff model coupled with an urban drainage model. We only demonstrate the precipitation uncertainty characterisation, since an entire UP study of a rainfall-runoff model is out of the scope of this paper and will be addressed in subsequent papers.

Selection of Event
We selected a 10-h period where the cumulative precipitation of the time series is maximal, assuring to retrieve a precipitation event in all stations. The selected period of the event was 16 December 2011 from 12:00 a.m. to 10:00 a.m. A total of 32 stations were available. Seven stations were not taken into account because of no measurement in the selected period. For illustration purpose, Figure 5 presents eight out of the 25 selected time series, and Figure 6 illustrates 18 snapshots of the event every 30 min and shows how this event is distributed in space. It also shows the precipitation magnitude in the 25 rain gauges.

Ordinary Kriging in the Space-Time Domain
To model precipitation fields in the spatio-temporal domain, we used the concept of the spatio-temporal variogram for spatio-temporal ordinary Kriging [36]. Following Snepvangers et al. [37], the aim of space-time geostatistical modelling is to predict or simulate an attribute z given by: where s ∈ R 2 refers to space, t ∈ T to time. The prediction or simulation of z is made at a space-time point (s 0 , t 0 ), where z was not measured. A space-time random function model Z can be defined to predict z(s 0 , t 0 ) [37]: where m is the trend component of the model and is a zero-mean stochastic residual component. The prediction is based at measurements in n space-time points (s i , t i ). In space-time ordinary kriging (ST-OK), the trend component is assumed to be an unknown constant mean. The stochastic residual component is characterised by a space-time variogram, that, under second-order stationarity, can be computed from the measurements [37]: where h S and h T are the space and time lags, and N(h S , h T ) is the number of pairs in the space-time lag.
To fit a model to the space-time experimental variogram (Equation 6), we need to overcome some additional problems as we have space and time variation. In this example, we used a model that considers the definition of one covariance model (or variogram) for the space domain and one covariance model (or variogram) for the time domain, including an anisotropy parameter κ, as described by Bilonick [38] and revisited by Snepvangers et al. [37] and Graeler et al. [36]. This is the sum-metric covariance model: The variogram is derived similar to the covariance [36,37] as: stUPscales implements routines of the R package gstat [39] for representing the spatio-temporal covariance models. Besides the sum-metric model, four additional models are available in gstat: Separable; Product-sum; Metric; and Simplified sum-metric.

Results
The commands used to produce the results explained next are presented in Appendix A.7.

Events Selected and Space-Time Full Grid Object
stUPscales has a dataset with the selected events per station as an xts object in space-wide format for easy visualisation of the 25 stations in the object event.subset.xts. The spatial locations of the 25 rain gauge stations are provided as a dataset in SpatialPointsDataFrame format from the sp package [40]. The boundary of the country of the Grand-Duchy of Luxembourg is given as a dataset in sp format SpatialPolygonsDataFrame. The data were converted to a space-time full data.frame (STFDF) of the package spacetime [25], through the function stConstruct from the same package. The object is visualised in Figure 6

Space-Time Variograms
Once the observation period was defined, we computed the experimental spatio-temporal variogram according to Equation (6) using the function variogramST of the gstat package, with time lags of 50 min and space lags of 5000 m. The spatio-temporal anisotropy was estimated by the estiStAni function (gstat). Before proceeding to the fit of the parameters, the experimental variogram and the anisotropy were scaled to have distances in kilometers.
The sum-metric variogram model (Equations (7) and (8)) was defined in gstat by using the vgmST function. The model was fitted using the fit.StVariogram gstat function with the L-BFGS-B optimisation algorithm described in Byrd et al. [41], which allows box constraints with lower and upper bounds. All space-time variogram parameters (spatial, temporal and joint nugget, range and partial sill) were estimated by minimising the mean squared error (MSE). The MSE of the best model fitted is 0.00013. Figure 7 shows the experimental spatio-temporal variogram (a), the fitted spatio-temporal variogram (b) and the residuals of the spatio-temporal variogram model for the sum-metric model (c). The fitted parameters of the sum-metric model are given in Table 4.

Space-Time Ordinary Kriging
Upon definition of the space-time variogram, we proceeded to create prediction maps using space-time ordinary kriging. For this, a prediction grid over Luxembourg was created and predictions were made by the function krigeST from gstat. The prediction maps are shown in Figure 8. Finally, we computed the lower and upper boundary of the 90 percent prediction interval as shown in Figures 9 and 10.
With this, the stUPscales package is ready to characterise space-time uncertainties of environmental variables. The next steps for subsequent work are to simulate realisations of such variables and analyse the uncertainty propagation.

Discussion
We developed the stUPscales R package as a contribution to state-of-the-art software tools designed to perform spatio-temporal uncertainty characterisation and uncertainty propagation analysis of environmental models across different scales. We applied the tool in two examples where the class, methods and functions were tested satisfactorily as an illustration of the capabilities of the package. In this section, we highlight and discuss the main results obtained and put these in perspective with other studies.

Temporal and Spatio-Temporal Model Inputs
In Example 1, we used a simplified urban drainage model, EmiStatR, for testing purposes. We used this model because of its simplification and parallel computing coding, allowing a seamless uncertainty propagation in the R environment.
The main input variable of the modelling framework developed in the example corresponded to a time series of precipitation included as a dataset into the R-package EmiStatR. This time series includes records from 1 January to 2 February 2016 (4462 observations). The total precipitation in the time series is 89.6 mm, with a mean value for non-zero data of 0.18 mm and maximum value of 0.90 mm. While the recording time step is 10 min, we aggregated the precipitation to 30 min to illustrate the aggregation capabilities of stUPscales. We characterised the uncertainty of the temporal input by invoking the temporal autocorrelation function.
For Example 2, we used time series of precipitation for 25 rain gauges distributed over the country of Luxembourg as an input dataset. The event selected was a period of 10 h from 12:00 a.m. to 10:00 a.m. on 16 December 2011. Next, we characterised the spatio-temporal uncertainty by inferring the experimental space-time variogram from the observations for the entire spatial domain and time period.

Temporal Aggregation of Model Input
In Example 1, we illustrated the temporal aggregation function Agg.t of the stUPscales package. The level of aggregation was a 30-min time step. The resulting time series is composed of 1488 observations, summing up to a total precipitation of 89.6 mm, with a mean value for non-zero data of 0.33 mm and maximum value of 2.00 mm. This level of aggregation is useful when a large time series is used as input in the uncertainty propagation procedure, reducing the computational burden required for the MC simulation.

Model Input Uncertainty Propagation
In Example 1, we used the MC method for model input uncertainty propagation through the environmental model. In so doing, we accounted for temporal and cross-correlations when defining model input uncertainty and performing uncertainty propagation.

Model Input Uncertainty and Monte Carlo Simulation Set-Up
Moret at al. [42] present a literature review about characterisation of input uncertainties in strategic energy planning models. They found that, in the context of strategic design problems, Tock and Maréchal [43] use normal, uniform and beta distributions to define uncertainty of economic parameters with lower and upper limits defined from institutional reports. In addition, in the context of energy systems design under uncertainty, Dubuis [44] proposes a methodology based on statistical theory, i.e., on definition of probability distribution functions. However, uncertainty characterisation using pdfs is not always possible when data are unavailable. In the absence of data for defining future uncertainties in the UK energy transition pathways, Pye et al. [45] propose using triangular distributions for representing uncertainty between upper and lower limits, defined from the literature. To account for uncertainty in the assessment of biomass-to-fuel strategies in wastewater treatment applications, Sin et al. [46] assign different levels of cost uncertainty (low, medium, high), taking into account maturity and complexity. In addition, upper and lower limits to all model parameters were defined by expert knowledge and establishing uniform distributions. In the design of flexible multi-generation systems, Lythcke-Jorgensen et al. [47] assume variations of ±25% and uniform distributions for parameters of investment and operating cost.
In the urban drainage modelling domain, Wijesiri et al. [23] quantified uncertainty by uncertainty limits associated with predicted build-up and wash-off processes. Zoppou [48] states that "providing the uncertainty associated with model outcomes in terms of uncertainty limits is an effective way to enhance management and planning decisions".
Most of the reviewed literature highlighted that, regarding parameter uncertainty characterisation, there is a difficulty to define pdfs based on data at proper quantity and quality. In this sense, Moret et al. [42] present a method to characterise input uncertainty for strategic energy planning models. This method proposes the definition of a set of criteria for the establishment of variation ranges for uncertain parameters, instead of the definition of pdfs. For some techniques, such as robust optimization, full pdfs are not needed and specification of uncertainty ranges is enough [49].
The stUPscales package includes both pdf and range characterisations for input uncertainty quantification. The pdf option allows the definition of normal, log-normal and truncated normal pdfs. The range option consists of the definition of a uniform pdf where the lower bound (minimum value) and the upper bound (maximum value) are defined as function arguments. Additionally, two more options are available for the characterisation of model input uncertainty: (1) discrete, where a uniform discrete sampling is defined; and (2) constant, where a unique constant value is defined in time and space, i.e., suitable for representing model inputs that are not uncertain. With all these possibilities, we cover an important range of characterisations for model input uncertainty used in practice.

Monte Carlo Simulation and Analysis
In Example 1, the selected water quantity input variables for the uncertainty propagation analysis (water consumption, qs, infiltration inflow water, q f , impervious area, A imp , run-off coefficient for impervious area, C imp , run-off coefficient for pervious area, C per , population equivalents, pe, orifice coefficient of discharge, C d and the initial water level in the chamber, Lev ini ), had been defined to quantify input uncertainty and analyse how it propagates through the environmental model, EmiStatR, to the different model outputs related to water quantity ('CSO summary' in Table 2, and Figure 3).
Regarding the selected water quality input variables for the uncertainty propagation analysis (sewage COD pollution per capita [PE] load per day, C COD,s , sewage NH 4 pollution per capita [PE] load per day, C NH4,s , and rain COD pollution, COD r ), these reflect how the variability in the input uncertainty definition is translated to the model output ('COD in spill volume' and 'NH 4 in spill volume', as shown in Table 2 and Figure 4).
In the case of total concentration of COD (CCOD) in the spill volume, a mean value of 145 mg·L −1 and a maximum value of 190 mg·L −1 were simulated in the deterministic run. In the Monte Carlo simulation, a mean value of 255 mg·L −1 and a maximum value of 511 mg·L −1 in the 95 percentile band were obtained. These large differences indicate a large uncertainty in this variable. The relative increase of the mean and maximum are 1.8 and 2.7, respectively.
In the case of total concentration of NH 4 (CNH4) in the spill volume, a mean value of 4.1 mg·L −1 and a maximum value of 6.5 mg·L −1 were simulated in the deterministic run. In the Monte Carlo simulation, a mean value of 5.5 mg·L −1 and a maximum value of 12.6 mg·L −1 in the 95 percentile band were obtained. While the means are similar, the large increase in the maximum indicates that uncertainty in this variable is also substantial, as already indicated by the wide prediction bands (grey area) shown in Figure 4.
These results demonstrate the usefulness of performing model input uncertainty propagation through the environmental model, which is an important goal in the evaluation and communication of uncertainties in environmental modelling.
In order to further validate model accuracy, a complete uncertainty propagation analysis would have to involve (cross-)validation using independent observations. This was beyond the scope of the paper and is currently done in a subsequent paper on a full uncertainty propagation analysis of the EmiStatR model, including a stochastic sensitivity analysis to analyse the contribution of each source of uncertainty to the overall uncertainty of the model results.

Space-Time Characterisation of Model Input Uncertainty
In Example 2, we showed how space-time model input uncertainty is modelled in stUPscales. We used space-time ordinary kriging, with a space-time unknown constant mean of the process and a space-time stochastic residual for which the literature proposes different covariance models [36]. However, in the space-time domain, the fitting of the covariance model is not trivial compared to that in spatial kriging. We used the sum-metric model [36,37], which gave satisfactory results. We showed space-time prediction maps of the precipitation for a 10 h time period on 16 December 2011 for the Grand-Duchy of Luxembourg.

Data Format for Uncertainties
We have coded stUPscales to be compatible with standard spatio-temporal data concepts and formats in R, such as the spatio-temporal classes ST of the spacetime package [50]. This allows for seamless linkage of geospatial and spatio-temporal models and sub-models across multiple scales.
In addition, despite an attempt to create the Uncertainty Markup Language (UncertML) [51] as an Open Geospatial Consortium (OGC) standard to store metadata for uncertainty propagation applications [6], there is still an open question about the suitability of this data format to hold, visualise and communicate uncertainties together with the space-time ST classes in R. Furthermore, there is a potential to develop a new class or extend the ST classes to embed the uncertainty information regarding data uncertainty, input uncertainty characterisation, output uncertainty and summary statistics.

Conclusions and Future Research
We presented the R package spatio-temporal and Uncertainty Propagation across multiple scales, stUPscales. This package constitutes a contribution to the state-of-the-art of open source tools that aim to support uncertainty propagation in the spatio-temporal domain. The main class, methods and functions of the package were presented and illustrated with an example of temporal model input uncertainty propagation through a simplified lumped urban drainage model accounting for water quality modelling. We also showed how the package can be used to model and predict environmental variables that vary in time and space. Using simulations of these types of variables in MC uncertainty propagation will be done in subsequent work.
The main contribution of stUPscales compared to existing uncertainty propagation tools is that it is not restricted to uncertainty analysis of purely spatial applications but can also handle uncertain temporal and spatio-temporal variables. In addition, it is designed to handle aggregation and disaggregation of uncertain temporal, spatial and spatio-temporal variables.
Through two examples, we demonstrated that stUPscales is suitable for characterising uncertainty in spatial, temporal and spatio-temporal environmental variables (model inputs) as probability distribution functions (pdfs) and as uni-and multi-variate autoregressive models. Moreover, it is possible to sample from these pdfs (to support MC UP analysis) and to generate realisations of autoregressive models. We illustrated the aggregation functionality of stUPscales by averaging realisations of precipitation in time. We propagated model input uncertainty through a simplified urban water model and analysed the results of an MC UP. Finally, we showed how space-time geostatistical interpolation is done in stUPscales.
It is worthwhile to explore the development of a new class or to extend the ST classes so that these incorporate uncertainty information regarding data uncertainty, input uncertainty characterisation, output uncertainty and summary statistics. We recommend that future implementations take this matter into account.