A Cautionary Note on the Reproduction of Dependencies through Linear Stochastic Models with Non-Gaussian White Noise

Since the prime days of stochastic hydrology back in 1960s, autoregressive (AR) and moving average (MA) models (as well as their extensions) have been widely used to simulate hydrometeorological processes. Initially, AR(1) or Markovian models with Gaussian noise prevailed due to their conceptual and mathematical simplicity. However, the ubiquitous skewed behavior of most hydrometeorological processes, particularly at fine time scales, necessitated the generation of synthetic time series to also reproduce higher-order moments. In this respect, the former schemes were enhanced to preserve skewness through the use of non-Gaussian white noise— a modification attributed to Thomas and Fiering (TF). Although preserving higher-order moments to approximate a distribution is a limited and potentially risky solution, the TF approach has become a common choice in operational practice. In this study, almost half a century after its introduction, we reveal an important flaw that spans over all popular linear stochastic models that employ non-Gaussian white noise. Focusing on the Markovian case, we prove mathematically that this generating scheme provides bounded dependence patterns, which are both unrealistic and inconsistent with the observed data. This so-called “envelope behavior” is amplified as the skewness and correlation increases, as demonstrated on the basis of real-world and hypothetical simulation examples.


Introduction: A Glimpse of History
"Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful." -George Box and Norman Draper [1] The celebrated Harvard water program and the development of the so-called Thomas-Fiering (TF) model in the early 60s [2][3][4][5] played a historically crucial role in definition and advancement of the scientific discipline of stochastic hydrology-more specifically, of synthetic hydrology.The emergence of this field was mainly motivated by the need to generate synthetic streamflow data, to be used in water resources planning and management models [6][7][8].The use of synthetic streamflow generators (or more generally weather generators) allowed for representing the operation of complex hydrosystems and deriving risk-related quantities that could not be obtained through classical statistics.Among the many different alternative models (see references below), the TF model prevailed for many years and still remains a popular choice.To date, the original Thomas-Fiering paper [4] and the related works of the Harvard water program [2][3][4][5] have been cited in the literature almost 2000 times, a fact highlighting its vast popularity and reasonably justifying its denomination as the "Ford's Model T" of stochastic hydrology [9].Additionally, more than half a century since its conception, the TF model, its variants, and the associated approach to handle skewness (see below) are standard educational material in most stochastic-hydrology courses and are disclosed in prominent positions in many classic and contemporary textbooks [10][11][12][13][14][15].The wide acceptance of the model is also acknowledged by Salas and Pielke [16], who asserted that "the PAR(1) model (also known as the Thomas-Fiering model) is likely one of the most widely used models in hydrology".
The original TF model is essentially a cyclostationary version of the classic stationary linear autoregressive model of order 1 (AR(1)), also formulated as a periodic autoregressive of order 1 (PAR (1)), in order to account for systematic changes and non-stationarities of statistical characteristics across seasons.The fact that the marginal distributions of many hydrometeorological processes are not Gaussian, motivated Thomas and Fiering [17] to propose the replacement of the Gaussian white noise with Gamma (G) or Pearson type-III (PIII) distributed white noise [3] (pp. [53][54][55][56][57] in order to account for the skewness coefficient (to our knowledge, this modification first appears in the book of Thomas and Burden [18]).Note that the PIII distribution is a simple generalization of the G distribution, which introduces an additional location parameter.
Nevertheless, herein, we mainly focus on AR models with non-Gaussian white noise, which have been widely adopted in hydrology, and briefly discuss three alternative schemes, two of which are based on moving average (MA) models and one based on an autoregressive moving average model (ARMA).Specifically, we investigate the effect on the established dependence patterns that arise from the use of PIII white noise within stationary univariate and multivariate linear stochastic models for generating synthetic hydrological data via stochastic simulation.Based on theoretical reasoning and empirical evidence, it is shown that the use of the implicit TF approach results in bounded and thus unrealistic dependence patterns, highlighting this approach's limitations in simulating skewed hydrometeorological processes.
Our motivation stems from an observation of Tsoukalas et al. [33], who noticed this dependence pattern flaw while simulating 2000 years of monthly streamflow data at Aswan dam through the TF approach (i.e., PAR (1) with skewed white noise), hereafter called "envelope behavior".A characteristic sample of this work is shown Figure 1, where we depict the scatter plots of historical and synthetic data for three pairs of consecutive months (January-February, March-April, and September-October).It is observed that the synthetically-derived scatter is bounded by a linear threshold, while the historical data clearly extend below this limiting line.It is remarkable that the model reproduces almost perfectly the (often regarded as essential) statistical characteristics of historical data, i.e., the mean, variance, and skewness, as well as the month-to-month linear correlations (Pearson's), which is the typical measure of statistical dependence that is encountered in all linear stochastic schemes.However, it seems that the preservation of the latter does not ensure the generation of fully consistent dependence patterns.

The Thomas-Fiering Approach
The basic idea of the TF approach lies in using non-Gaussian, skewed, white noise within linear stochastic models in order to resemble the target marginal statistics, i.e., sample mean, variance, and skewness.Note that the derivation of a theoretical formula for the white noise skewness in AR( ) models of a higher order ( ≥ 2) aiming to reproduce skewness is theoretically possible but practically of no use, as it involves high-order joint statistics (that are difficult to estimate and are also subject to significant sample uncertainties [34]).Thus, application is possible only based on sample estimates of these joint statistics [35].This is the major reason why the TF modification was originally restricted in AR(1) models, and thus similarly we also concentrate our main analysis in stationary univariate and multivariate AR(1) models with skewed white noise, while we briefly explore the cases of some other linear stochastic models (i.e., an ARMA and two variants of MA models).
Apparently, the selection of the underlying model determines the stochastic characteristics of the resulting simulation scheme.For example, when an AR(1) model is employed, the overall scheme will reproduce only Markovian autocorrelation structures, while if a more flexible MA-based scheme is used, the simulation scheme will be able to resemble a wider range of correlation structures.
However, regardless of the choice of the underlying model, such schemes exhibit a number of shortcomings and limitations, which are briefly summarized here [33]: (1) They provide just an approximation of the marginal distribution, as reproducing statistics generally is not equivalent to reproducing the distribution.Furthermore, the resulting distribution is not known a priori (e.g., in general the sum of Gamma distributed variables is not Gamma; see also, Moschopoulos [36]).We remark that this was acknowledged by the authors [3] (pp. [53][54][55][56][57] as well as later remarked by other researchers ( [26,37,38], (p.66)); (2) In order to reproduce the skewness of the underlying process it is required (due to central limit theorem) to use white noise with higher skewness [11,23,38,39], which can cause, in some cases, failure of the random number generator itself; (3) This simulation scheme generates time series that can have negative values, which is not consistent with many physical processes (e.g., rainfall, wind, streamflow, etc.).This is attributed to the fact that the lower bound of the white noise distribution may be negative in order to match the target statistics (as estimated from observed time series); (4) Finally, we prove and demonstrate in the next sections that this scheme leads to bounded and thus unrealistic dependence patterns that are not observed in natural processes (such as those depicted in Figure 1).[33]; the simulated negative values were not truncated to zero in order to avoid distortion of the dependence pattern).The red line (-) depicts the envelope equation of the TF model (when combined with PIII white noise.See also Appendix A).

The Thomas-Fiering Approach
The basic idea of the TF approach lies in using non-Gaussian, skewed, white noise within linear stochastic models in order to resemble the target marginal statistics, i.e., sample mean, variance, and skewness.Note that the derivation of a theoretical formula for the white noise skewness in AR(p) models of a higher order (p ≥ 2) aiming to reproduce skewness is theoretically possible but practically of no use, as it involves high-order joint statistics (that are difficult to estimate and are also subject to significant sample uncertainties [34]).Thus, application is possible only based on sample estimates of these joint statistics [35].This is the major reason why the TF modification was originally restricted in AR(1) models, and thus similarly we also concentrate our main analysis in stationary univariate and multivariate AR(1) models with skewed white noise, while we briefly explore the cases of some other linear stochastic models (i.e., an ARMA and two variants of MA models).
Apparently, the selection of the underlying model determines the stochastic characteristics of the resulting simulation scheme.For example, when an AR(1) model is employed, the overall scheme will reproduce only Markovian autocorrelation structures, while if a more flexible MA-based scheme is used, the simulation scheme will be able to resemble a wider range of correlation structures.
However, regardless of the choice of the underlying model, such schemes exhibit a number of shortcomings and limitations, which are briefly summarized here [33]: (1) They provide just an approximation of the marginal distribution, as reproducing statistics generally is not equivalent to reproducing the distribution.Furthermore, the resulting distribution is not known a priori (e.g., in general the sum of Gamma distributed variables is not Gamma; see also, Moschopoulos [36]).We remark that this was acknowledged by the authors [3] (pp. [53][54][55][56][57] as well as later remarked by other researchers ([26,37,38], (p.66)); (2) In order to reproduce the skewness of the underlying process it is required (due to central limit theorem) to use white noise with higher skewness [11,23,38,39], which can cause, in some cases, failure of the random number generator itself; (3) This simulation scheme generates time series that can have negative values, which is not consistent with many physical processes (e.g., rainfall, wind, streamflow, etc.).This is attributed to the fact that the lower bound of the white noise distribution may be negative in order to match the target statistics (as estimated from observed time series); (4) Finally, we prove and demonstrate in the next sections that this scheme leads to bounded and thus unrealistic dependence patterns that are not observed in natural processes (such as those depicted in Figure 1).

The Envelope Behavior in the Classical Univariate AR(1) Model
Let us assume we wish to simulate a continuous-state (not necessarily Gaussian), discrete-time, stationary AR(1) process (also referred to as the Markov process) x t , t ∈ Z, where t is the time index.The main equation of the model reads: where a 1 = ρ 1 := Corr[x t , x t−1 ] is a model parameter and ε t denotes an i.i.d.random variable (RV) known as white noise or the innovation term.The theoretical autocorrelation function (ACF) of the AR(1) model is , where τ stands for the time lag.The mean µ x t := E[x t ] and variance σ 2 x t := Var[x t ] = E (x − µ x ) 2 of x t are related with those of ε t via the following equations (hereafter, due to stationarity, the index t will be omitted when possible): Apparently, if the process of interest is Gaussian (or well-approximated by it), Equations ( 2) and (3) in combination with Gaussian white noise would be sufficient and "exact", since a linear combination of Gaussian RVs is also Gaussian.However, this is not the case for most hydrometeorological processes.In this context, the TF approach attempts to approximate the non-Gaussian behavior of x t by employing non-Gaussian white noise for ε t , where the skewness coefficient with that of ε t by [3,[10][11][12][13]: Hence, in order to capture the first three marginal moments of x t , one has to generate non-Gaussian white noise with certain statistical characteristics, which are all functions of the lag-1 autocorrelation coefficient of the process x t , given that a 1 = ρ 1 .In Figure 2 we provide two alternative views of Equation (4), both depicting the variability of the required skewness C S ε of the white noise against the skewness C S x and the lag-1 autocorrelation ρ 1 of the target process x t .Particularly, in Figure 2A we fix ρ 1 to specific values, ranging from 0 to 0.9, and with C S x varying from 0 to 5, while in Figure 2B we set C S x ∈ {1, 2, 3, 4, 5} and vary ρ 1 from 0 to 0.9.Considering a high ρ 1 = 0.9 and aiming to reproduce a moderate skewness, e.g., ≈ 1, results in a white noise skewness ≈ 3.5, while for a highly skewed variable the deviation becomes much larger (related of course to ρ 1 ).For example, for a process with ρ 1 = 0.9 and C S x = 4, the required white noise skewness is C S ε ≈ 12.5, i.e., more than three times higher than the target value.Within non-Gaussian simulations, the selection of the underlying statistical model of the white noise and the associated random number generation procedure is a pivotal step.Thomas and Fiering proposed the use of Pearson type-III ( III) distribution, which is also one of the most commonly used distributions in hydrology.The probability density function (PDF) of III is given by: where > 0, ≠ 0, and ∈ ℝ are shape, scale, and location parameters, respectively (if = 0, then III reduces to the Gamma distribution).The mean , variance and skewness of the random variable are given by: Of course, as Matalas and Wallis [37] noted, choosing the white noise distribution is a matter of convenience (see also discussion in Tsoukalas et al. [33]) and simplicity in generating random numbers, given obviously that the selected distribution can reproduce the desired statistics of white noise, i.e., , , and .
The non-Gaussian formulation of the AR(1) model through the TF approach results in the so-called envelope behavior issue, which is associated with the distribution of the white noise.Let us write Equation (1) in the equivalent form: where denotes the inverse cumulative density function (ICDF) of the white noise and expresses a uniform ( ) RV in [0, 1] (probability), i.e., ~ (0,1).In the above formulation, we see that in the Gaussian case, where ϵ (−∞, ∞), the random variable takes any value in (−∞, ∞).However, when the distribution of has a finite left support, as in III or Gamma ( ) cases, then lim → ( ) = ℓ , where ℓ stands for the lower bound of .Hence, for given (e.g., specified from the data) and , we can estimate at any step of the generation procedure the lower bound of by: Within non-Gaussian simulations, the selection of the underlying statistical model of the white noise and the associated random number generation procedure is a pivotal step.Thomas and Fiering proposed the use of Pearson type-III (PIII) distribution, which is also one of the most commonly used distributions in hydrology.The probability density function (PDF) of PIII is given by: where γ > 0, b = 0, and c ∈ R are shape, scale, and location parameters, respectively (if c = 0, then PIII reduces to the Gamma distribution).The mean µ ξ , variance σ 2 ξ and skewness C s ξ of the random variable ξ are given by: Of course, as Matalas and Wallis [37] noted, choosing the white noise distribution is a matter of convenience (see also discussion in Tsoukalas et al. [33]) and simplicity in generating random numbers, given obviously that the selected distribution can reproduce the desired statistics of white noise, i.e., µ ε , σ ε , and C S ε .
The non-Gaussian formulation of the AR(1) model through the TF approach results in the so-called envelope behavior issue, which is associated with the distribution of the white noise.Let us write Equation (1) in the equivalent form: where F −1 ε denotes the inverse cumulative density function (ICDF) of the white noise ε t and u expresses a uniform (U ) RV in [0, 1] (probability), i.e., u ∼ U (0, 1).In the above formulation, we see that in the Gaussian case, where ε t (−∞, ∞), the random variable x t takes any value in (−∞, ∞).However, when the distribution of ε t has a finite left support, as in PIII or Gamma (G) cases, then lim u→0 F −1 ε (u) = ε , where ε stands for the lower bound of ε t .Hence, for given a 1 (e.g., specified from the data) and x t−1 , we can estimate at any step of the generation procedure the lower bound of x t by: and thus calculate the theoretical lower bound of the synthetic data.Similarly, when the distribution of ε t is bounded from above (as in the PIII case adjusted for negative skewness), then lim , where v ε is the upper bound of the distribution of ε t .In this case the generation mechanism is bounded from above, i.e.: x t ≤ a This limitation is especially important since hydrometeorological variables, such as river discharge, cannot be considered unbounded from above, even when the sample statistics erroneously indicate negative skewness.To the best of our knowledge, despite the popularity of the TF model and the associated approach of coupling it with skewed white noise, this shortcoming has never been reported in literature, regardless of its straightforward and intuitive theoretical derivation.This limitation also holds for the univariate cyclostationary TF model (i.e., PAR (1) with PIII white noise), for which we provide the corresponding relationships in Appendix A.
Apart from the latter relationships, based on the previous formulation it can be shown that a simple recursive procedure facilitates the estimation of the theoretical minimum (or maximum) value of the TF approach.Without the loss of generality, assuming x 0 := µ x , and by sequentially applying Equation ( 7) for q steps with ε t = F −1 ε (0) = ε , we can obtain the model's theoretical minimum, which can differ from ε (they are identical when ε = 0).The recursive procedure can be written as follows: x Alternatively, and more vigorously (depending on the support of ε t ), the theoretical minimum and maximum are given, respectively, by min(x t ) = ε /(1 − a 1 ) and max(x t ) = v ε /(1 − a 1 ).
In order to better demonstrate the envelope behavior, we apply the AR(1) model coupled with PIII white noise (termed AR(1)-PIII) to 12 hypothetical scenarios by simulating 5000 time steps for each.For all scenarios we fix µ x = 0.5 and σ 2 x = 1 combined with C s x ∈ {1, 2, 4} and ρ 1 ∈ {0.2, 0.4, 0.6, 0.8} (see Table 1 for a summary).Since the PIII is used for generating white noise and C s x > 0, in all cases a lower bound is anticipated.As theoretically expected, the model reproduces the target ACF and target statistics for all scenarios with high accuracy (see Figure A1 and Table A1 of Appendix B).However, the envelope behavior of the dependence pattern is apparent and indicates its limitation, a fact clearly demonstrated by the scatter plots (Figure 3) corresponding to the 12 simulation scenarios.The theoretically-derived Equation ( 8), defining the lower bound of the feasible space of the (x t−1 , x t ) points, is depicted by a red line (Figure 3).Note that labels in each subplot follow the scenarios' naming convention in Table 1 (e.g., panel C corresponds to scenario C of Figure 3).Apparently, in every case, regardless of the C s x and ρ 1 values, the model generates bounded dependence patterns enveloped by Equation ( 8).This behavior appears even for low combinations of C s x and ρ 1 (e.g., scenario A).

From the Univariate to the Multivariate AR(1) Model
It is reasonable to expect that the envelope behavior will also be observed in the multivariate case, i.e., when the multivariate autoregressive process of order 1 is used (MAR(1)) in combination with non-Gaussian white noise.Let us assume that we wish to generate an m-dimensional vector = , … , , … of m cross-correlated AR(1) processes, indexed by i.The generation mechanism of the model is: where is an × matrix and is an m-dimensional column vector of cross-correlated yet serially independent RVs with covariance ∈ ℝ , .The model is often called the "multivariate lag-1 model" when a full matrix is employed, while there exists a variation that assumes a

From the Univariate to the Multivariate AR(1) Model
It is reasonable to expect that the envelope behavior will also be observed in the multivariate case, i.e., when the multivariate autoregressive process of order 1 is used (MAR(1)) in combination with non-Gaussian white noise.Let us assume that we wish to generate an m-dimensional vector x t = x 1 t , . . ., x i t , . . .x m t T of m cross-correlated AR(1) processes, indexed by i.The generation mechanism of the model is: where A 1 is an m × m matrix and ε t is an m-dimensional column vector of cross-correlated yet serially independent RVs with covariance Σ ε ∈ R m,m .The model is often called the "multivariate lag-1 model" when a full A 1 matrix is employed, while there exists a variation that assumes a diagonal A 1 matrix, often called "multivariate Markov model" or "contemporaneous multivariate autoregressive model of order 1" (i.e., CMAR(1)).Both formulations explicitly account for the lag-0 cross-correlations of the variables while their major difference is that the former is able to explicitly account for the lag-1 cross-correlations [11,37,40].On the other hand, the use of diagonal A 1 ensures that each individual process is a Markov process and significantly simplifies the parameter estimation procedure, since the lag-1 cross-correlations are not explicitly modeled.Its use is often advocated by the literature, since several authors suggest that lag-1 cross-correlations can be neglected [14,22,26,33,40,41].Yet it is noted that while this simplification could be valid for processes considered at a coarse time scale (e.g., monthly or annual), it should be used with caution in cases of fine time scale processes (e.g., hourly) or for modeling phenomena characterized by cause-effect relationships (e.g., rainfall-runoff).Nevertheless, here we focus on the so-called multivariate Markov model (i.e., CMAR(1)).Regarding its parameter estimation and assuming that A 1 is a diagonal matrix of the form: where the following holds true: where is the lag-0 cross-covariance matrix.For instance, its ith, jth element is The theoretical cross-covariance matrices M τ = Cov[x t , x t−τ ] can be obtained for any time lag τ by recursively applying the equation: Meanwhile, the theoretical and cross-correlation matrices R τ = Corr[x t , x t−τ ] are obtained by R τ = (diag(M τ )) −1/2 M τ (diag(M τ )) −1/2 .Furthermore, the covariance matrix Σ ε can be expressed as: where B is an m × m, typically lower triangular, matrix (also known as the square root of Σ ε ) obtained by standard decomposition techniques (e.g., the Cholesky technique) or optimization approaches [23,42].The latter methods are usually employed when B is non-positive definite.Typically, such problems arise when the sample estimates of the required statistics are extracted from historical time series of different and/or limited lengths [11].Nonetheless, given that A 1 is diagonal and assuming that ε t = Bξ t , where ξ t is an m-dimensional column-vector of i.i.d.RVs, the decomposition of Equation ( 11) can be given as follows: Additionally, the moments of ξ t and x t are interrelated through (index t is omitted due to stationarity): where µ 3 ξ and µ 3 [x] denote column-vectors that contain the third central moments of ξ and x, respectively; we remark that ξ coincides with the skewness coefficient, since the model assumes unit variance for ξ.Similar to the univariate case, the white noise term is typically generated using the PIII distribution (Equation ( 5)).To illustrate the envelope behavior of the latter model, we rewrite the Equation (11) similarly to Equation (7), i.e.: where F −1 ξ i u i denotes the quantile function of ξ j for a given probability u i .If the distribution of ξ j is bounded below or above by ξ i or v ξ i , respectively, then lim

and lim
Therefore, we obtain: for lower (left)-and above (right)-bounded cases, respectively.However, in the multivariate case, and since x i t depends on multiple values of ξ i , the limiting behavior (assuming that all RVs are left-bounded) is identified by setting u = u 1 , . . ., u i , . . ., u m → 0 .Of course, the envelope behavior diminishes if the white noise term ξ t is normally distributed (or more generally if ξ t (−∞, ∞)), yet in this case skewness cannot be preserved.
Without the loss of generality, we examine the bivariate case of T where both processes exhibit zero autocorrelation but their lag-0 cross-correlation is equal to 0.8.For E[x] = [0.5, 0.5] T , Var[x] = [1, 1] T , and µ 3 [x] = C s x = [2, 2.5] T we find: (where B is obtained by the Cholesky decomposition), while the generating equation (Equation ( 11)) becomes: Given the target moments of x, the statistics of the white noise term are calculated as E ξ = [0.50, 0.16] T , Var ξ = [1, 1] T , and µ 3 ξ = C s ξ = [2.00, 6.83] T .Using the PIII for white noise generation we obtain the lower bound vector ξ = [−0.500,−0.126] T .Thus, from Equation ( 22) the limiting envelope equations are x 1 t = 0 x 1 t−1 − 0.500 and x 2 t = 0 x 2 t−1 − 0.475.In this case, it is also possible to estimate the envelope relationship a priori between x 1 t and x 2 t as A 1 is a zero matrix.Particularly, since x 1 t = ξ 1 t and x 2 t = 0.8ξ 1 t + 0.6ξ 2 t , and substituting the former into the latter, we get x 2 t = 0.8x 1 t − 0.6ξ 2 t , and by setting ξ 2 t = ξ 2 the envelope line x 2 t = 0.8x 1 t − 0.076 is obtained.In order to demonstrate the envelope behavior in the multivariate case, we employ the above model and synthesized a time series of 5000 time steps.Figure 4A-C depicts the established dependence patterns of each individual process for lag-1 (panels A and B), while panel C shows the pattern among the two processes for lag-0.Also, at each panel we display the corresponding envelope equation.We remark that the model was able to accurately reproduce the theoretical stochastic structure, expressed by the autocorrelation (ACF) and cross-correlation functions (CCF) shown in Figure 4D-F, as well as, to approximate very well the target moments (Table A2).In order to extend our working examples, we simulate another vector of bivariate time series (5000 time steps) using the same marginal moments as before, but this time with a different autocorrelation structure.Specifically, we assumed Corr , = = 0.7 and Corr , = = 0.5.Thus, we get: = 0.7 0 0 0.5 , = 0.714 0 0.728 0.468 (25) and the generating formula: = 0.7 0 0 0.5 + 0.714 0 0.728 0.468 (26) Similar to the previous analysis, Figure 5A-C depicts the established lag-1 and lag-0 dependence patterns, while the envelope equation of each process is displayed in the title of each panel.It is apparent that at each simulated step, the model poses significant constraints in the range of subsequent plausible values, which is far from reality.We remark that in this case the contemporaneous lag-0 relationship cannot be displayed in a two-dimensional (2D) plot since it involves the lag-1 values of and .Nevertheless, the model successfully reproduced the target stochastic structure (Figure 5D-F) and the marginal moments (see Table A3), at the cost, however, of unrealistically bounded dependence patterns.In order to extend our working examples, we simulate another vector of bivariate time series (5000 time steps) using the same marginal moments as before, but this time with a different autocorrelation structure.Specifically, we assumed Corr x 1 t , x 1 t−1 = ρ 1 1 = 0.7 and Corr x 2 t , x 2 t−1 = ρ 2 1 = 0.5.Thus, we get: 0.7 0 0 0.5 , B = 0.714 0 0.728 0.468 (25) and the generating formula: Similar to the previous analysis, Figure 5A-C depicts the established lag-1 and lag-0 dependence patterns, while the envelope equation of each process is displayed in the title of each panel.It is apparent that at each simulated step, the model poses significant constraints in the range of subsequent plausible values, which is far from reality.We remark that in this case the contemporaneous lag-0 relationship cannot be displayed in a two-dimensional (2D) plot since it involves the lag-1 values of x 1 t and x 2 t .Nevertheless, the model successfully reproduced the target stochastic structure (Figure 5D-F) and the marginal moments (see Table A3), at the cost, however, of unrealistically bounded dependence patterns.

The Envelope Behavior beyond AR Models
To demonstrate the impact of employing non-Gaussian white noise in combination with other linear stochastic models, we employed (1) a low-order autoregressive moving average model ARMA(p,q); (2) a simple moving average model MA(q); and (3) its symmetric variant, termed SMA(q).The parameters p and q determine the order of the models.As shown by O'Connell [31] and later discussed by Lettenmaier [38], it is possible for the case of ARMA (1,1) to derive an analytical relationship that links the skewness of the white noise with the skewness of the target process.Furthermore, it has been shown [30] that similar relationships can be established for the two moving average schemes regardless of the order q (i.e., MA(q) and SMA(q)).
In this demonstration we utilized the aforementioned relationships for the simulation of a univariate stationary process with the characteristics of the hypothetical Scenario I of Table 1, which refers to the Markovian process with = 0.6 | | and = 4. Regarding the ARMA(1,1) process, it is noted that its autocorrelation structure is somewhat different from the Markovian one, hence we carefully choose its parameters in order to have = 0.6.On the other hand, both MA(q) and SMA(q) are able to resemble the Markovian autocorrelation structure with satisfactory accuracy by setting q = 32.Nonetheless, in all cases we used III distribution for the white noise, hence the models are termed ARMA(1,1)-III, MA(32)-III, and SMA(32)-III.However, due to a lack of analytical solution for the envelope function, and in order to derive a clear picture of the established dependence patterns, we generated very long realizations, each one consisting of 500,000 time steps.Figure 6 shows the lag-1 dependence pattern obtained by the three schemes as well as a comparison of the simulated and theoretical ACF, which is very well reproduced by all models.Despite the accurate reproduction of the target marginal statistics (mean, variance, and skewness) by all models,

The Envelope Behavior beyond AR Models
To demonstrate the impact of employing non-Gaussian white noise in combination with other linear stochastic models, we employed (1) a low-order autoregressive moving average model ARMA(p,q); (2) a simple moving average model MA(q); and (3) its symmetric variant, termed SMA(q).The parameters p and q determine the order of the models.As shown by O'Connell [31] and later discussed by Lettenmaier [38], it is possible for the case of ARMA(1,1) to derive an analytical relationship that links the skewness of the white noise with the skewness of the target process.Furthermore, it has been shown [30] that similar relationships can be established for the two moving average schemes regardless of the order q (i.e., MA(q) and SMA(q)).
In this demonstration we utilized the aforementioned relationships for the simulation of a univariate stationary process with the characteristics of the hypothetical Scenario I of Table 1, which refers to the Markovian process with ρ τ = 0.6 |τ| and C S x = 4. Regarding the ARMA(1,1) process, it is noted that its autocorrelation structure is somewhat different from the Markovian one, hence we carefully choose its parameters in order to have ρ 1 = 0.6.On the other hand, both MA(q) and SMA(q) are able to resemble the Markovian autocorrelation structure with satisfactory accuracy by setting q = 32.Nonetheless, in all cases we used PIII distribution for the white noise, hence the models are termed ARMA(1,1)-PIII, MA(32)-PIII, and SMA(32)-PIII.However, due to a lack of analytical solution for the envelope function, and in order to derive a clear picture of the established dependence patterns, we generated very long realizations, each one consisting of 500,000 time steps.Figure 6 shows the lag-1 dependence pattern obtained by the three schemes as well as a comparison of the simulated and theoretical ACF, which is very well reproduced by all models.Despite the accurate reproduction of the target marginal statistics (mean, variance, and skewness) by all models, they establish peculiarly-shaped and always bounded dependence forms.However, it should be noted that the MA(q) and SMA(q) schemes are typically employed for the simulation of annual processes, which are typically well approximated by the Gaussian distribution, and thus it is reasonable to expect a minimization of this issue.
Water 2018, 10, x FOR PEER REVIEW 12 of 23 they establish peculiarly-shaped and always bounded dependence forms.However, it should be noted that the MA(q) and SMA(q) schemes are typically employed for the simulation of annual processes, which are typically well approximated by the Gaussian distribution, and thus it is reasonable to expect a minimization of this issue.

Real-World Case Study
In this section we demonstrate the envelope behavior of the TF approach using a real-world and long dataset (1 January 1970 to 31 December 2008) of daily streamflow data (m 3 /s) of river Achelous at Kremasta dam in Western Greece.It is assumed that the autocorrelation structure of the daily streamflow of each month can be described by a stationary AR(1) model.The historical monthly and daily time series are clearly characterized by non-Gaussian distributions and skewness coefficients ranging from 1.6 (June) up to 6.7 (October).Specifically, we generate daily synthetic time series with a length of 1000 years, using for each month a different AR(1) model with III white noise (i.e., AR(1)-III).The model very satisfactorily reproduced the target historical marginal statistics of each month (Table A4), as well as the theoretical Markovian autocorrelation structure (see Figure A2 for a comparison among the empirical, synthetic, and theoretical ACFs), which however deviates from the empirical ACF for some months, showing a more persistent behavior.Yet a comparison of the lag-1 dependence patterns between the synthetic and the historical data, using scatter plots for each month (Figure 7), reveals the omnipresence of the envelope behavior.Evidently the model generates unrealistic dependence patterns that are far from the historical ones.The synthetic pairs of values are bounded by the theoretical envelope function (red line; embedded in each plot), while the historical pairs clearly extend beyond that line.In an effort to provide a quantitative metric, we calculate the empirical probability of a historical pair to lie below the envelope function.The overall mean value of this metric estimated from all months is 27%, while it ranges from 14% (in November) to 42% (in April).

Real-World Case Study
In this section we demonstrate the envelope behavior of the TF approach using a real-world and long dataset (1 January 1970 to 31 December 2008) of daily streamflow data (m 3 /s) of river Achelous at Kremasta dam in Western Greece.It is assumed that the autocorrelation structure of the daily streamflow of each month can be described by a stationary AR(1) model.The historical monthly and daily time series are clearly characterized by non-Gaussian distributions and skewness coefficients ranging from 1.6 (June) up to 6.7 (October).Specifically, we generate daily synthetic time series with a length of 1000 years, using for each month a different AR(1) model with PIII white noise (i.e., AR(1)-PIII).The model very satisfactorily reproduced the target historical marginal statistics of each month (Table A4), as well as the theoretical Markovian autocorrelation structure (see Figure A2 for a comparison among the empirical, synthetic, and theoretical ACFs), which however deviates from the empirical ACF for some months, showing a more persistent behavior.Yet a comparison of the lag-1 dependence patterns between the synthetic and the historical data, using scatter plots for each month (Figure 7), reveals the omnipresence of the envelope behavior.Evidently the model generates unrealistic dependence patterns that are far from the historical ones.The synthetic pairs of values are bounded by the theoretical envelope function (red line; embedded in each plot), while the historical pairs clearly extend beyond that line.an effort to provide a quantitative metric, we calculate the empirical probability of a historical pair to lie below the envelope function.The overall mean value of this metric estimated from all months is 27%, while it ranges from 14% (in November) to 42% (in April).

Discussion
Historically, most of the questions raised regarding the TF approach have concerned the case of the AR(1) model and the range of attainable skewness coefficients [20,38,43].This was mainly due to the use of Wilson-Hilferty transformation which was used for generating Gamma or Pearson type-III RVs [44].Nowadays, this technical issue is out of interest, since such RVs can be easily generated with high accuracy by modern random number generators which are available in almost every programming language (e.g., R, MATLAB, etc.).Additionally, we note that McMahon and Miller [20] reported that Thomas and Burden [18] and Fiering [5] tested their approach for skewness values ranging in (−0.5, 1.0).This work focused on the effect of using Pearson type-III white noise in AR(1) models and we show that this approach leads to unrealistic dependence patterns.Furthermore, preliminary investigations have also shown that this issue extends over other popular linear stochastic models when coupled with non-Gaussian white noise.Particularly, we demonstrated three such cases using III white noise in combination with (1) a classical ARMA(1,1); (2) a simple MA(q); and (3) its

Discussion
Historically, most of the questions raised regarding the TF approach have concerned the case of the AR(1) model and the range of attainable skewness coefficients [20,38,43].This was mainly due to the use of Wilson-Hilferty transformation which was used for generating Gamma or Pearson type-III RVs [44].Nowadays, this technical issue is out of interest, since such RVs can be easily generated with high accuracy by modern random number generators which are available in almost every programming language (e.g., R, MATLAB, etc.).Additionally, we note that McMahon and Miller [20] reported that Thomas and Burden [18] and Fiering [5] tested their approach for skewness values ranging in (−0.5, 1.0).This work focused on the effect of using Pearson type-III white noise in AR(1) models and we show that this approach leads to unrealistic dependence patterns.Furthermore, preliminary investigations have also shown that this issue extends over other popular linear stochastic models when coupled with non-Gaussian white noise.Particularly, we demonstrated three such cases using PIII white noise in combination with (1) a classical ARMA(1,1); (2) a simple MA(q); and (3) its symmetrical variant SMA(q) In all cases the resulting dependence patterns exhibited a peculiar and unrealistic bounded shape which can be bounded from both directions.
Nevertheless, it is noteworthy that Song et al. [45] and Jeong and Lee [46] also observed this issue independently while studying AR(1) with exponential white noise [47][48][49] and periodic Gamma autoregressive (PGAR) processes [50], respectively.However, to the best of our knowledge, these works, or any other, have not revealed the envelope limitation, neither provided a theoretical proof and a justification for this behavior, which probably arises from the lack of explicit assumption regarding the joint dependence structure of the process.Particularly, the joint moment of order k + n of two continuous RVs, x and y, is given by: where f xy denotes the joint probability distribution function (PDF) of x and y.The first cross product joint moment is embedded in the definition of covariance, as well as in the Pearson's correlation, i.e.: Hence, this points to the requirement for an assumption regarding f xy .When both x and y are Gaussian, and simulated through a typical decomposition scheme (e.g., the Cholesky technique) which applies linear operations on them, the joint PDF of x and y is also Gaussian (due to the affine transformation property of Gaussian RVs).When x and y are not Gaussian, the latter convenient property does not hold.By analogy, the joint moment of order k + n of a continuous-state, discrete-time stochastic process x t can be expressed as: If x t is Gaussian and modeled using a linear stochastic process (e.g., AR or MA-type) with Gaussian white noise, then it is well known that the joint PDF f x t , x t−τ is also Gaussian.This implies linear operations on Gaussian RVs.On the other hand, this does not hold for the TF approach, which uses non-Gaussian white noise and thus the form of f x t , x t−τ is unclear.
We remind that summary statistics such as mean, variance, skewness, and correlation are nothing more than some useful measures of location, dispersion, asymmetry, and dependence, and do not involve in their estimation the actual joint distribution.A classic example is provided by the so-called Anscombe's Quartet [51] and recently by Matejka and Fitzmaurice [52].Both works stress the importance of data science's first principle: "Visualize your data".They demonstrate this issue by devising several examples of datasets that have the same summary statistics but completely different dependence patterns.Apparently, as shown in this work, the aforementioned simple principle also applies in synthetic hydrology.
A conceptually related, yet until recently to the hydrological community, approach relies on the so-called Nataf joint distribution model [71], which is associated to the well-known Gaussian copula [54,72].Since their inception, Nataf-based models have been developed and applied for the simulation of univariate or multivariate autoregressive processes with arbitrary marginal distribution mainly within operations research, e.g., [73,74], while recently they have been aligned with hydrological stochastics [33,[75][76][77][78] in order to account for non-Gaussian processes, both univariate and multivariate, exhibiting intermittency, periodicity, and any-range dependence.
Apparently, both Nataf-and copula-based approaches can provide a remedy to the limitations of the TF approach, as well as explicitly account for non-Gaussianity, which is omnipresent within hydrometeorological processes (e.g., [79][80][81][82][83][84][85]).We deem that Nataf-based models provide a convenient and more precise alternative given that they utilize (in an auxiliary or parent role) existing and well-known stochastic models which provide the basis for a straightforward and operational efficient generation scheme.It is also noted that the celebrated Log-Normal model of Matalas [7], which incidentally can be classified as a Nataf-based approach [33,76], does not exhibit the TF approach limitation and thus can provide a rather easy and consistent option for practitioners.

Conclusions
To conclude, we bring back the aphorism and the question set by Box and Draper.Paraphrasing, we could say that indeed since all models are wrong and TF is not an exception, the question is how wrong the TF approach has to be to not be useful.A way to answer this question is through impact assessments of the envelope behavior in real-world applications, e.g., in important engineering studies (reservoir design and sizing, hydropower assessment, reliability-based studies, etc.), and of its effect on decision-making related to water resources management.Another question arising here is why should one use a model with known limitations and flaws (irrespective of whether these flaws have minor or major impacts on real-world applications) which reproduces unrealistic rainfall or streamflow patterns?We recognize that the TF model and the associated approach was a major contribution of paramount importance that shaped stochastic hydrology, yet in practice linear stochastic models should be used cautiously when combined with non-Gaussian white noise, given the limitations shown herein.This approach preserves important summary statistics (i.e., mean, variance, and skewness) and correlations, yet for processes showing medium to high skewness values and/or correlations it will inevitably reproduce bounded and unrealistic dependence patterns that are next used in simulations.In this context, after half a century of blind use of this model and approach, we deem that it is time to move to alternative methods which are consistent in generating realistic dependence structures as well as the marginal distribution itself.

Figure 1 .
Figure 1.Comparison of the (A) January-February, (B) March-April, and (C) September-October dependence patterns between historical and synthetic monthly runoff data (10 9 m 3 ) of the Nile, at Aswan dam.Synthetic time series were generated by the cyclostationary Thomas-Fiering (TF) approach (adapted by Tsoukalas et al. [33]; the simulated negative values were not truncated to zero in order to avoid distortion of the dependence pattern).The red line (-) depicts the envelope equation of the TF model (when combined with III white noise.See also Appendix A).

Figure 1 .
Figure 1.Comparison of the (A) January-February, (B) March-April, and (C) September-October dependence patterns between historical and synthetic monthly runoff data (10 9 m 3 ) of the Nile, at Aswan dam.Synthetic time series were generated by the cyclostationary Thomas-Fiering (TF) approach (adapted by Tsoukalas et al. [33]; the simulated negative values were not truncated to zero in order to avoid distortion of the dependence pattern).The red line (-) depicts the envelope equation of the TF model (when combined with PIII white noise.See also Appendix A).

Figure 2 .
Figure 2. Relationship between (A) the target skewness coefficient of process and the required skewness for white noise term for a given lag-1 autocorrelation coefficient ; and (B) the lag-1 autocorrelation coefficient and the required skewness coefficient of white noise term to attain the target skewness coefficient of process .

Figure 2 .
Figure 2.Relationship between (A) the target skewness coefficient of process x t and the required skewness for white noise term ε t for a given lag-1 autocorrelation coefficient ρ 1 ; and (B) the lag-1 autocorrelation coefficient ρ 1 and the required skewness coefficient of white noise term ε t to attain the target skewness coefficient of process x t .
Water 2018, 10, x FOR PEER REVIEW 7 of 23 every case, regardless of the and values, the model generates bounded dependence patterns enveloped by Equation (8).This behavior appears even for low combinations of and (e.g., scenario A).

Figure 3 .
Figure 3. Scatter plots depicting the simulated (using the TF model, i.e., the autoregressive model of order 1 (AR(1))-III) lag-1 dependence pattern among consecutive time steps (i.e., pair values (•) of the previous and current time steps).The labels of each plot resemble the corresponding scenarios of Table 1.The red line (-) depicts the envelope equation shown in the title of each plot.

Figure 3 .
Figure 3. Scatter plots depicting the simulated (using the TF model, i.e., the autoregressive model of order 1 (AR(1))-PIII) lag-1 dependence pattern among consecutive time steps (i.e., pair values (•) of the previous and current time steps).The labels of each plot resemble the corresponding scenarios of Table 1.The red line (-) depicts the envelope equation shown in the title of each plot.

Water 2018 ,
10, x FOR PEER REVIEW 10 of 23

Figure 4 .
Figure 4. Scatter plots depicting the simulated (using the contemporaneous multivariate autoregressive model of order 1 (CMAR(1) model) with III white noise) for (A) and (B) lag-1 dependence patterns of the zero-autocorrelated processes and , respectively, for consecutive time steps (i.e., pair values (•) of the previous and current time steps).Panel (C) depicts the contemporaneous dependence (lag-0) of and .The red line (-) depicts the envelope equation shown in the title of each plot.Panel (D) compares the simulated and theoretical autocorrelation function (ACF) of while panel (E) compares that of .Finally, panel (F) compares the simulated and theoretical cross-correlation function (CCF) of and .

Figure 4 .
Figure 4. Scatter plots depicting the simulated (using the contemporaneous multivariate autoregressive model of order 1 (CMAR(1) model) with PIII white noise) for (A) and (B) lag-1 dependence patterns of the zero-autocorrelated processes x 1 t and x 2 t , respectively, for consecutive time steps (i.e., pair values (•) of the previous and current time steps).Panel (C) depicts the contemporaneous dependence (lag-0) of x 1 t and x 2 t .The red line (-) depicts the envelope equation shown in the title of each plot.Panel (D) compares the simulated and theoretical autocorrelation function (ACF) of x 1 t while panel (E) compares that of x 2 t .Finally, panel (F) compares the simulated and theoretical cross-correlation function (CCF) of x 1 t and x 2 t .

Water 2018 , 23 Figure 5 .
Figure 5. Scatter plots depicting the simulated (using the CMAR(1) model with III white noise) for (A) and (B) lag-1 dependence pattern of the autocorrelated processes and , respectively, for consecutive time steps (i.e., pair values (•) of the previous and current time steps), while panel (C) depicts the contemporaneous dependence (lag-0) of and .The red line (-) depicts the envelope equation shown in the title of each plot.Panel (D) compares the simulated and theoretical ACF of while panel (E) compares that of .Lastly, panel ( ) compares the simulated and theoretical CCF of and .

Figure 5 . 1 t
Figure 5. Scatter plots depicting the simulated (using the CMAR(1) model with PIII white noise) for (A) and (B) lag-1 dependence pattern of the autocorrelated processes x 1 t and x 2 t , respectively, for consecutive time steps (i.e., pair values (•) of the previous and current time steps), while panel (C) depicts the contemporaneous dependence (lag-0) of x 1 t and x 2 t .The red line (-) depicts the envelope equation shown in the title of each plot.Panel (D) compares the simulated and theoretical ACF of x 1 t while panel (E) compares that of x 2 t .Lastly, panel (F) compares the simulated and theoretical CCF of x 1 t

Water 2018 , 23 Figure 7 .
Figure 7. Scatter plots showing the lag-1 dependence pattern of the daily streamflow of the Achelous river at the Kremasta dam, Greece (orange dots; •) and of a synthetic time series generated using an AR(1)-III model (black dots; •).The red line (-) depicts the envelope equation embedded each plot.

Figure 7 .
Figure 7. Scatter plots showing the lag-1 dependence pattern of the daily streamflow of the Achelous river at the Kremasta dam, Greece (orange dots; •) and of a synthetic time series generated using an AR(1)-PIII model (black dots; •).The red line (-) depicts the envelope equation embedded each plot.

Figure
Figure Monthly-based comparison of empirical (historical), synthetic (using AR(1) with III white noise), and theoretical autocorrelation functions (ACFs) of the real-world case study employed in Section 3-"Real-world case study" of the main manuscript.

Figure A2 .
Figure A2.Monthly-based comparison of empirical (historical), synthetic (using AR(1) with PIII white noise), and theoretical autocorrelation functions (ACFs) of the real-world case study employed in Section 3-"Real-world case study" of the main manuscript.

Table 1 .
Summary of target statistics for all scenarios (in all cases, µ x = 0.5 and σ 2 x = 1).

Table A3 .
Summary of theoretical and simulated statistics for the second, autocorrelated, bivariate AR(1) with PIII white noise, employed in Section 2.3-"From the univariate to the multivariate AR(1) model" of the main manuscript.

Table A4 .
Monthly-based summary of historical and simulated (synthetically generated using an AR(1)with PIII white noise) statistics of the real-world case study employed in Section 3-"Real world case study" of the main manuscript.