Simulation of Wave Time Series with a Vector Autoregressive Method

Joint time series of wave height, period and direction are essential input data to computational models which are used to simulate diachronic beach evolution in coastal engineering. However, it is often impractical to collect a large amount of the required input data due to the expense. Based on the nearshore wave records offshore of Littlehampton in Southeast England over the period from 1 September 2003 to 30 June 2016, this paper presents a statistical method to obtain simulated joint time series of wave height, period and direction covering an extended time span of a decade or more. The method is based on a vector auto-regressive moving average algorithm. The simulated times series shows a satisfactory degree of stochastic agreement between original and simulated time series, including average value, marginal distribution, autocorrelation and cross-correlation structure, which are important for Monte Carlo modelling of shoreline evolution, thereby allowing ensemble prediction of shoreline response to a variable wave climate.


Introduction
Simulation of time series plays an important role in many areas due to the high cost in obtaining in situ measurements. In the field of coastal engineering simulation of wave time series has been used for estimating the duration of storm events and their spacing in time, see e.g., [1,2], with the aim of assessing the risk of serious beach erosion [3,4]. Typically, a relatively short record of measured wave conditions is available, hence to perform Monte Carlo simulation of coastal flooding or erosion, wave sequences with similar statistics are required. This problem is particularly challenging and may be stated as simulating a non-stationary non-normal, correlated, trivariate stochastic process, given a sample or realisation of the process. Several methods have been proposed for tackling this type of problem. For example, Li and Winker [5] proposed a Monte Carlo method, or a quasi Monte Carlo method, to obtain simulated time series from a vector autoregressive moving average (VARMA) model fitted to sample data. Barone [6] described a simulation method that can be used to generate realisations of a VARMA process, while Shea [7] discussed a direct method of computing the initial state covariance matrix required by the simulation method. However, these methods cannot ensure the marginal distributions, and/or the autocorrelation patterns of the simulated data are the same as those of the sample. On the other hand, representative wave events obtained via a K-Means algorithm clustering technique has been used for studying port operability and long-term longshore sediment transport, but this suffers from not taking extreme wave events into account [8].
Methods to simplify the problem include a technique for simulating time series of wave height, period and direction, accounting for seasonality on a monthly mean basis [2]. This can suffer from jumps in the mean of variables at the changeover between consecutive months. Focusing on wave height and period only, bivariate autoregressive models were proposed by Soares and Cunha [9]. Recently, Cai [10] and Cai et al. [11] generalised the work of [12,13] in order to obtain simulated multivariate time series. However, these methods are suitable for stationary time series only, and hence, they cannot be used to generate the non-stationary wave conditions observed widely in practice. Although Cai et al. [14] developed a simulation method for non-stationary time series, this method is suitable only for short non-stationary time series due to relatively higher computational cost, and therefore is not suitable for wave time series corresponding to long-term measurements.
The purpose of this paper is to extend the method of [10] to recreate both the statistics and lag correlation properties of a non-stationary, multivariate wave record spanning many years. The data used here are multivariate, correlated and non-stationary. The reader is referred to [15][16][17]. Furthermore, this study relaxes the constraints of describing seasonality of all wave variables with a single functional form. More specifically, the seasonality of wave height and period time series were not assumed to have the same form as the seasonality of wave direction.
This paper is organised as follows. In Section 2, the theoretical background of the model is presented; in Section 3 the measurements from the study site and the approaches we used for removing the trends and seasonality are discussed. The simulation results are demonstrated in Section 4. Finally, discussion and conclusions are presented in Section 5.

Outline
Let E t = (E 1t ,E 2t ,E 3t ) denote observed sea condition data, where E 1t is the wave height, E 2t is wave period and E 3t is the wave direction, and the subscript t indicates waves at time t. As discussed in the previous section, the main objective of this research is to obtain simulated sea condition data, denoted by E t = E 1t , E 2t , E 3t , such that the simulated data will have similar trend, seasonality, autocorrelation structures and marginal distributions with those of the observed E t .
It is worth noting that the method of [10] requires that the observed data are stationary. Hence, it is important to convert the observed wave data into stationary ones. More specifically, our method includes the following steps: (i) Remove the trend and seasonality from the observed data to produce stationary series; (ii) apply the method of [10] to the series obtained from step (i); (iii) include the trend and seasonality into the simulated data obtained from step (ii) to generate the final simulated data that we require. Figure 1 illustrates our proposed methodology: Water 2022, 14, x FOR PEER REVIEW 2 of 16 secutive months. Focusing on wave height and period only, bivariate autoregressive models were proposed by Soares and Cunha [9]. Recently, Cai [10] and Cai et al. [11] generalised the work of [12,13] in order to obtain simulated multivariate time series. However, these methods are suitable for stationary time series only, and hence, they cannot be used to generate the non-stationary wave conditions observed widely in practice. Although Cai et al. [14] developed a simulation method for non-stationary time series, this method is suitable only for short non-stationary time series due to relatively higher computational cost, and therefore is not suitable for wave time series corresponding to long-term measurements. The purpose of this paper is to extend the method of [10] to recreate both the statistics and lag correlation properties of a non-stationary, multivariate wave record spanning many years. The data used here are multivariate, correlated and non-stationary. The reader is referred to [15][16][17]. Furthermore, this study relaxes the constraints of describing seasonality of all wave variables with a single functional form. More specifically, the seasonality of wave height and period time series were not assumed to have the same form as the seasonality of wave direction. This paper is organised as follows. In Section 2, the theoretical background of the model is presented; in Section 3 the measurements from the study site and the approaches we used for removing the trends and seasonality are discussed. The simulation results are demonstrated in Section 4. Finally, discussion and conclusions are presented in Section 5.

Outline
Let Et = (E1t,E2t,E3t) denote observed sea condition data, where E1t is the wave height, E2t is wave period and E3t is the wave direction, and the subscript t indicates waves at time t. As discussed in the previous section, the main objective of this research is to obtain simulated sea condition data, denoted by ̃= (̃1 ,̃2 ,̃3 ), such that the simulated data will have similar trend, seasonality, autocorrelation structures and marginal distributions with those of the observed Et.
It is worth noting that the method of [10] requires that the observed data are stationary. Hence, it is important to convert the observed wave data into stationary ones. More specifically, our method includes the following steps: (i) Remove the trend and seasonality from the observed data to produce stationary series; (ii) apply the method of [10] to the series obtained from step (i); (iii) include the trend and seasonality into the simulated data obtained from step (ii) to generate the final simulated data that we require. Figure 1 illustrates our proposed methodology:  First note that in this application, we consider the trend and seasonality to be additive. That is, each E kt for k = 1, 2, 3 can be expressed by E kt = T kt + S kt + y kt , where T kt and S kt represent the trend and seasonality components, respectively, and y kt is the detrended and deseasonalised data that will be used in step (ii) of our method.
For wave directions, the sample was first transformed to create a variable with similar magnitude and range as that of wave height and period, (see Section 3), before the trend was estimated and removed as per the treatment of wave height and period. Then, a linear trend for each variable and an autoregressive process model of order 5 for wave height and period were estimated, which were subtracted from the original data to give the detrended data E kt − T kt .
For seasonality S kt , as waves are driven by atmospheric winds, their characteristics will reflect the seasonal changes in the weather. However, at any particular location, the wave conditions may be a combination of waves from several sources so that 'seasonality' in the literal sense may not be reflected in wave height, period and angle in an identical manner. The seasonality parameters were determined using least squares estimation. Specifically, to estimate the seasonal component of each one of the three wave time series, sums comprising of gradually increasing numbers of sine or Fourier terms were successively tested via the Augmented Dickey-Fuller test [18], until the suitable combination of seasonal components were obtained. Then, the selected seasonal components were removed from the wave time series. To this end, an existing routine in MATLAB [19] was applied for testing the stationarity of the wave time series via the Augmented Dickey-Fuller test. This routine gives as output the value 0 for non-stationary data and 1 for stationary data. Following this method, the seasonality of wave height and period time series were described via a sum of sine terms while the seasonality of wave directions was described via a sum of Fourier terms. After removal of the trend and seasonal component the detrended and deseasonalised series are given by y kt = E kt − T kt − S kt , which is a vector sequence, (trivariate-with wave height, period, and direction), that is stationary, correlated and non-normal.
Once the detrended and deseasonalised series y kt for k = 1,2,3 have been computed, and successfully tested for stationarity, a vector VAR (vector auto-regressive) method, (outlined in the next section), may be applied to obtain simulated data for y kt . This may then be transformed back to obtain simulated data for E kt with the correct seasonality and trend.

Detailed Methodology
Following [10], let the base process of the method be a vector AR (auto-regressive) process of order p, denoted by VAR(p), defined by where z t = (z 1t ,z 2t ,z 3t ) , z kt ∼ N(0,1) for k = 1, 2, 3, φ i are fixed 3 × 3 coefficient matrices, I = 1, ..., p, and u t = (u 1t ,u 2t ,u 3t ) is a 3-dimensional normal random variable with mean 0 and covariance matrix Σ u such that and It follows, see e.g., [20], that the correlation matrix function of z t is given by Γ(h) = E z t z t−h , where h = 0,1,2,..., H and where H is a fixed number that defines the maximum lag value we would like to consider for matching the autocorrelation structures between the simulated and the observed time series, r ijh is the correlation between z it and z jt+h and i,j = 1, 2, 3, and Γ(−h) = Γ (h). The simulation method requires the estimation of φ i , r ijh and Σ u . The correlations r ijh can be obtained by solving the following non-linear equations where ρ ijh is the correlation between two stationary time series corresponding to the original input data, F j (·) is the marginal distribution of y j and hence F j −1 (·) represents its inverse function; Φ(·) is the distribution function of the standard normal distribution; ξ rijh (·,·) is the joint density function of two normal variables with mean zero and correlation r ijh ; E(y it ) is the mean of y it , and var(y it ) is the variance ofy it . An explanation about how Equation (4) is derived is presented in the Appendix A.
Thus, given the sample data E t , we determined detrended and deseasonalised data y it . Setting F j (·) equal to the empirical distribution of y jt and setting E(y it ) and var(y it ) equal to the sample mean and variance of y it respectively, and replacing ρ ijh with the sample autocorrelation of y it and y jt+h , the r ijh can be obtained by solving the resulting non-linear equations using the methods detailed in the Appendix B. Once the r ijh 's are available, φ i and Σ u for i = 1,...,p can be obtained by solving Equation (3).
Hence, with r ijh , φ i and Σ u determined the simulated data can be produced as follows: (i) Obtain simulated data from the base process (Equation (1)); (ii) use the simulated z t = (z 1t ,z 2t ,z 3t ) to obtain simulated y it for i = 1,2,3. To this end, the corresponding values of the normal cumulative distribution function, Φ(z t ), are calculated given as input data the simulated values z t . Then, the Φ(z t ) values are interpolated in the empirical marginal distribution set of values with respect to the wave parameters (height (for i = 1), period (for i = 2) and direction (for i = 3) to yield the simulated data y it . (iii) Finally, the trend and seasonality components that had been removed before (see Figure 1) are re-added to y it to yield the simulated time series E it for i = 1,2,3 as required. The whole process is schematised in Figure 2.
of the normal cumulative distribution function, Φ(zt), are calculated given as input data the simulated values zt. Then, the Φ(zt) values are interpolated in the empirical marginal distribution set of values with respect to the wave parameters (height (for i = 1), period (for i = 2) and direction (for i = 3) to yield the simulated data yit. (iii) Finally, the trend and seasonality components that had been removed before (see Figure 1) are re-added to yit to yield the simulated time series Eit for i = 1,2,3 as required. The whole process is schematised in Figure 2.

Case-Study and Data Processing
For our test case we have chosen Littlehampton which is located in Southeast England ( Figure 3a). Wave measurements were accessed from the Channel Coastal Observatory, a UK organisation which collects and archives coastal field-data. Specifically, time series of significant wave height (Hs), peak wave period (Tp) and wave direction (α) relative to North, between the 1st of July 2003 and the 30th of June 2016 were gathered. Integrated wave parameters were available at an interval of 30 min at a location 4 miles SSE of Lit- The trend T kt and seasonality S kt are removed from E kt : The marginal distribution, mean, standard deviation and autocorrelation of y kt An AR stochastic base process is defined: Given ത , ො , ρ ijh, F k , the parameters of the base process: can be obtained consecutively.
Considering a matrix Q 1 such as S u =Q 1 Q 1 ', then: Now the AR model z kt can be evaluated.
Next, detrended and deseasonalised simulated data are produced: Finally, trend and seasonality are added back to : = +T kt +S kt

Case-Study and Data Processing
For our test case we have chosen Littlehampton which is located in Southeast England (Figure 3a).
the simulated values zt. Then, the Φ(zt) values are interpolated in the empirical marginal distribution set of values with respect to the wave parameters (height (for i = 1), period (for i = 2) and direction (for i = 3) to yield the simulated data yit. (iii) Finally, the trend and seasonality components that had been removed before (see Figure 1) are re-added to yit to yield the simulated time series Eit for i = 1,2,3 as required. The whole process is schematised in Figure 2.

Case-Study and Data Processing
For our test case we have chosen Littlehampton which is located in Southeast England ( Figure 3a). Wave measurements were accessed from the Channel Coastal Observatory, a UK organisation which collects and archives coastal field-data. Specifically, time series of significant wave height (Hs), peak wave period (Tp) and wave direction (α) relative to North, between the 1st of July 2003 and the 30th of June 2016 were gathered. Integrated wave parameters were available at an interval of 30 min at a location 4 miles SSE of Lit- The trend T kt and seasonality S kt are removed from E kt : The marginal distribution, mean, standard deviation and autocorrelation of y kt An AR stochastic base process is defined: Given ത , ො , ρ ijh, F k , the parameters of the base process: can be obtained consecutively.
Considering a matrix Q 1 such as S u =Q 1 Q 1 ', then: Now the AR model z kt can be evaluated.
Next, detrended and deseasonalised simulated data are produced: Finally, trend and seasonality are added back to : = +T kt +S kt Wave measurements were accessed from the Channel Coastal Observatory, a UK organisation which collects and archives coastal field-data. Specifically, time series of significant wave height (Hs), peak wave period (T p ) and wave direction (α) relative to North, between the 1st of July 2003 and the 30th of June 2016 were gathered. Integrated wave parameters were available at an interval of 30 min at a location 4 miles SSE of Littlehampton harbour entrance; observations were made with a Datawell Directional WaveRider Mk III buoy moored in approximately 10 m water. The wave rose of the 13 year record is illustrated in Figure 3b. This shows a wave climate with a predominant southwesterly approach and a secondary peak in waves from the southeast. Waves from the southwest are typically a mixture of locally generated wind waves and swell waves from Atlantic storms. Southeasterly waves are fetch-limited storm waves generated by the northern part of low-pressure systems that track to the south of the UK [21].
The wave records were averaged to create a time series of daily wave conditions, as our focus is on storm events rather than wave by wave fluctuations. The daily data are shown in Figure 4, in which seasonality is visually evident.
southwest are typically a mixture of locally generated wind waves and swell waves from Atlantic storms. Southeasterly waves are fetch-limited storm waves generated by the northern part of low-pressure systems that track to the south of the UK [21].
The wave records were averaged to create a time series of daily wave conditions, as our focus is on storm events rather than wave by wave fluctuations. The daily data are shown in Figure 4, in which seasonality is visually evident. If the magnitudes of the variables of interest are very different, it is normal practice to standardise or transform them before beginning the modelling process in order to improve the fitting. Here the range of wave direction is much larger than the other two variables. Hence, the following transformation of wave direction in the data preparation stage was performed: Before undergoing any further processing of the wave time series, it was important to split the available dataset into a training and a validation subset. The training subset constituted the first 75% of the whole dataset, thus, covering the time period from 1 July 2003 to 31 March 2013. In this time span, the aim was the VAR model to be properly parametrised, specifically, to estimate the trend and the seasonal elements of the wave time series, plus the φi and ut parameters of Equation (1). The validation is performed on the remaining 25% of the whole dataset, namely, from 1 April 2013 to 30 June 2016.
Next, the trend and seasonal elements of the wave timeseries are assessed, and subsequently, removed to achieve stationarity. If the magnitudes of the variables of interest are very different, it is normal practice to standardise or transform them before beginning the modelling process in order to improve the fitting. Here the range of wave direction is much larger than the other two variables. Hence, the following transformation of wave direction in the data preparation stage was performed: Before undergoing any further processing of the wave time series, it was important to split the available dataset into a training and a validation subset. The training subset constituted the first 75% of the whole dataset, thus, covering the time period from 1 July 2003 to 31 March 2013. In this time span, the aim was the VAR model to be properly parametrised, specifically, to estimate the trend and the seasonal elements of the wave time series, plus the ϕ i and u t parameters of Equation (1). The validation is performed on the remaining 25% of the whole dataset, namely, from 1 April 2013 to 30 June 2016.
Next, the trend and seasonal elements of the wave timeseries are assessed, and subsequently, removed to achieve stationarity.

Detrending
To estimate T kt , the following model was used to represent the trend; coefficients being obtained by fitting this to the observations using least-squares estimation: where ε t are independent, identically distributed random variables, p = 0 for wave direction and p = 5 for wave height and wave period. The value of the order p was chosen to ensure the stationarity of the detrended and deseasonalised series. The estimated linear trend for each variable is shown in Figure 5, rection and = 5 for wave height and wave period. The value of the order was chosen to ensure the stationarity of the detrended and deseasonalised series. The estimated linear trend for each variable is shown in Figure 5, and the estimated parameter αk values for wave height are α1 = 0.5996; α2 = −0.08316; α3 = 0.11206; α4 = −0.02666; and α5 = 0.0373, while the corresponding values for wave period are: a1 = 0.5007; a2 = −0.04946; a3 = 0.03786; a4 = 0.0182; and a5 = −0.019. The trend, Tkt, was taken as the residual series εt of the models respectively.

Seasonality
For seasonality, a best fit curve was chosen to describe the seasonal element for each variable. This curve is given by the following equation corresponding to a sum of sine terms: St = a1 × sin(b1 × n + c1) +…+a8 × sin(b8 × n + c8), for significant wave height Hs

Seasonality
For seasonality, a best fit curve was chosen to describe the seasonal element for each variable. This curve is given by the following equation corresponding to a sum of sine terms: S t = a 1 × sin(b 1 × n + c 1 ) + . . . +a 8 × sin(b 8 × n + c 8 ), for significant wave height H s and peak wave period T p , while for the wave direction α, the seasonal element was described via an equation comprising of a sum of Fourier terms: S t = a 0 + a 1 × cos(nw) + b 1 × sin(nw) + a 2 × cos(8nw) + b 2 × sin(8nw) + . . . a 8 × cos(8nw) + b 8 × sin(8nw), where n is the number of the consecutive temporal step. The fitting parameters a i , b i and c i for wave height and period time series, and the corresponding values a i , b i and w for wave direction were estimated via the least-squares method for each of wave height, period and direction. A different formulation for the seasonal components for wave direction was required in order to remove all the non-stationarity in the series. Figure 3 suggests that seasonal components in the wave direction are different from those in the other two wave variables and this is borne out by the nature of the non-stationarity of the respective variables.
The extracted seasonal components are presented in Figure 6. Note that monthly averaged values of wave parameters have been used for visual clarity.
Finally, the augmented Dickey-Fuller test [18] was applied to the detrended and deseasonalised data y it to determine whether the series was stationary. This process was performed iteratively, adding additional terms to the seasonality model, until the augmented Dickey-Fuller test indicated that y it were stationary. different formulation for the seasonal components for wave direction was required in order to remove all the non-stationarity in the series. Figure 3 suggests that seasonal components in the wave direction are different from those in the other two wave variables and this is borne out by the nature of the non-stationarity of the respective variables.
The extracted seasonal components are presented in Figure 6. Note that monthly averaged values of wave parameters have been used for visual clarity. Finally, the augmented Dickey-Fuller test [18] was applied to the detrended and deseasonalised data yit to determine whether the series was stationary. This process was performed iteratively, adding additional terms to the seasonality model, until the augmented Dickey-Fuller test indicated that yit were stationary.

Simulation Results
As mentioned in the previous section, the trend and seasonal component of the first 75% of the available measurements at Littlehampton were taken into account. The marginal distributions of wave height, period and direction and their autocorrelations and cross-correlations were estimated from this sequence. The correlations were used to es-

Simulation Results
As mentioned in the previous section, the trend and seasonal component of the first 75% of the available measurements at Littlehampton were taken into account. The marginal distributions of wave height, period and direction and their autocorrelations and crosscorrelations were estimated from this sequence. The correlations were used to estimate the values of r ijh , φ i and Σ u by solving Equations (2) and (3) (see also Appendix B), where h = 0, 1, . . . , 3. Hence, in this study we let H = 3, corresponding to matching the correlation structure of the data up to three days, or approximately the storm duration at the site. Note that these parameter values define the correlation structure of the model (Equation (1)). Then, the estimated model (Equation (1)) was used to obtain simulated data z t, i.e., the detrended and deseasonalised synthetic data (see Appendix B). The length of the simulated data was taken to be the same as the original series for illustration purpose. Finally, the trend and seasonality removed in the initial steps were added back to create the output synthetic time-series, E t (Figure 7).
If the method is working well, the simulated data E t should have similar statistical properties to those of the original data E t . As a check on this, marginal distributions and correlations of the original and simulated series were compared. The marginal distributions of the observed and simulated data were estimated with the non-parametric kernel estimation method [22]. The estimated marginal density functions are given in Figure 8, where the blue curves correspond to the observed data and the red curves correspond to the simulated data. It can be seen that the two sets of density functions are very close for all three sea condition variables. Moreover, the estimated means of the observed and simulated data are shown by the blue and red vertical lines respectively in Figure 8, and show extremely close agreement. estimation method [22]. The estimated marginal density functions are given in Figure 8, where the blue curves correspond to the observed data and the red curves correspond to the simulated data. It can be seen that the two sets of density functions are very close for all three sea condition variables. Moreover, the estimated means of the observed and simulated data are shown by the blue and red vertical lines respectively in Figure 8, and show extremely close agreement.  respectively. Blue curves correspond to observed data and red curves correspond to simulated data. Similarly, blue and red vertical lines correspond to the means of the observed and simulated data, respectively. Figure 9 shows the estimated autocorrelation and cross-correlation between variables for the original and simulated data series. respectively. Blue curves correspond to observed data and red curves correspond to simulated data. Similarly, blue and red vertical lines correspond to the means of the observed and simulated data, respectively. Figure 9 shows the estimated autocorrelation and cross-correlation between variables for the original and simulated data series. and simulated wave height (m), wave period (s) and wave direction (dimensionless transformed) respectively. Blue curves correspond to observed data and red curves correspond to simulated data. Similarly, blue and red vertical lines correspond to the means of the observed and simulated data, respectively. Figure 9 shows the estimated autocorrelation and cross-correlation between variables for the original and simulated data series. Overall, Figure 9 shows a very good agreement between the original and synthetic data. The final step is the validation of the VAR model on an independent section of measurements; that is, the time period from 1 April 2013 to 30 June 2016. The simulated wave time series, along with their density functions is presented in Figures 10 and 11 respectively. Overall, Figure 9 shows a very good agreement between the original and synthetic data. The final step is the validation of the VAR model on an independent section of measurements; that is, the time period from 1 April 2013 to 30 June 2016. The simulated wave time series, along with their density functions is presented in Figures 10 and 11 respectively.     Similarly, blue and red vertical lines correspond to the means of the observed and simulated data, respectively. Figure 11 illustrates good agreement between the means of wave height, period and direction. Some divergence between the original and simulated wave time series, particularly at the peaks of the density functions, is evident.
A comparison between detrended and deseasonalised original and synthetic time series was conducted. Results are shown in Figure 12 which demonstrate a good level of agreement in the correlation structure of the original and synthetic series. The temporal correlation scales in auto-and cross-correlations is captured well although the slight negative cross-correlation between wave height and period at lags up to one week is absent in the synthetic series. Example input data and processed data can be found in Supplementary Materials.  Figure 11 illustrates good agreement between the means of wave height, period and direction. Some divergence between the original and simulated wave time series, particularly at the peaks of the density functions, is evident.
A comparison between detrended and deseasonalised original and synthetic time series was conducted. Results are shown in Figure 12 which demonstrate a good level of agreement in the correlation structure of the original and synthetic series. The temporal correlation scales in auto-and cross-correlations is captured well although the slight negative cross-correlation between wave height and period at lags up to one week is absent in the synthetic series. Example input data and processed data can be found in Supplementary Materials.

Discussion and Conclusions
A methodology for simulating multi-variate wave sequences via a vector autoregressive (VAR) stochastic model was presented in this study and its application illustrated with measurements over a 13-year period taken at Littlehampton, UK. The measurement record was split into two non-overlapping subsets. The first one extending from 1 July 2003 to 31 March 2013 and the second one from 1 April 2013 to 30 June 2016,

Discussion and Conclusions
A methodology for simulating multi-variate wave sequences via a vector autoregressive (VAR) stochastic model was presented in this study and its application illustrated with measurements over a 13-year period taken at Littlehampton, UK. The measurement record was split into two non-overlapping subsets. The first one extending from 1 July 2003 to 31 March 2013 and the second one from 1 April 2013 to 30 June 2016, to provide independent training and validation data sets. The model was successfully calibrated and validated.
The utility of the correlation functions on non-stationary data is arguable and does not appear to rhyme with "A key element of the procedure is a detailed treatment of non-stationarity in the wave time sequences." A key element of the procedure is a detailed treatment of non-stationarity in the wave time sequences. The non-stationarity may have different manifestations within each element of the wave conditions; with wave height, period and direction each exhibiting different non-stationarity. Our methodology allows for such variation and is able to create synthetic sequences of wave conditions that have very similar statistical properties to the original dataset.
We applied the method to a site in the UK that experiences mid-latitude storms that have a typical duration of several days. A crucial quantity in the method is the parameter H, which controls the number of lag correlations that are modelled. For our dataset, we found that H = 3, corresponding to a lag of three days, provided a good representation of the storm-scale correlation between wave parameters. Should greater fidelity in the correlation structure be required, for instance to resolve infra-storm conditions, the method allows this. It would require analysing the original data at a finer temporal resolution, say hourly or three hourly, and correlations at a larger number of lags to be found, requiring additional calculation.
We note that the purpose of removing trend and seasonality in our study is to ensure that the detended and deseasonalised series y kt are stationary so that we can use our simulation method to obtain simulated data for the original wave height, direction and period data. We have used the simplest method of [20] to remove trend and seasonality, where the trend and seasonality are estimated using our methods. Furthermore, we use the ADF test to check the stationarity of detrended and deseasonalised data. On the other hand, after we re-add trend and seasonality to the simulated y kt , any potential biases in the residuals will disappear.
More sophisticated approaches are available. For example, where seasonal components vary significantly in amplitude and frequency over the dataset a least-squares wavelet analysis applied in a window-wise manner may be more suitable [15]. Other methods, such as the anti-leakage least-squares spectral analysis, allow simultaneous estimation of the trend and seasonal components.
The method described in this paper does not yield unrealistic jumps in the time series, as some earlier techniques did. The vector autoregressive (VAR) stochastic model presented in this study can be developed further to optimise the modelling of the seasonal component. In addition, the choice of parameter H, corresponding to the number of lag correlations that are modelled, could be automated rather than specified via a sequence of trial simulations.

Informed Consent Statement: Not applicable.
Data Availability Statement: The wave measurements used in this study are available, without charge, from the Channel Coastal Observatory, a UK organisation which collects and archives coastal field-data (https://coastalmonitoring.org/cco/, accessed on 21 January 2022).