High-Precision Combined Tidal Forecasting Model

To improve the overall accuracy of tidal forecasting and ameliorate the low accuracy of single harmonic analysis, this paper proposes a combined tidal forecasting model based on harmonic analysis and autoregressive integrated moving average–support vector regression (ARIMA-SVR). In tidal analysis, the resultant tide can be considered as a superposition of the astronomical tide level and the non-astronomical tidal level, which are affected by the tide-generating force and environmental factors, respectively. The tidal data are de-noised via wavelet analysis, and the astronomical tide level is subsequently calculated via harmonic analysis. The residual sequence generated via harmonic analysis is used as the sample dataset of the non-astronomical tidal level, and the tidal height of the system is calculated by the ARIMA-SVR model. Finally, the tidal values are predicted by linearly summing the calculated results of both systems. The simulation results were validated against the measured tidal data at the tidal station of Bay Waveland Yacht Club, USA. By considering the residual non-astronomical tide level effects (which are ignored in traditional harmonic analysis), the combined model improves the accuracy of tidal prediction. Moreover, the combined model is feasible and efficient.


Introduction
The tide is the periodic rise and fall of the sea level, and its fluctuations strongly influence human activity. Accurate real-time recording of tide level information is essential for ship navigation safety, the development and utilization of marine resources, and marine disaster mitigation and prevention [1]. Therefore, a simple and efficient tidal prediction method is urgently required. Based on their underlying prediction principles, tidal prediction methods are classified into traditional and intelligent prediction models.
Traditional tidal prediction models are mainly based on harmonic analysis [2,3]. Harmonic analysis for tide prediction was pioneered by Thomson in 1866 and subsequently improved by Darwin, who formulated the equilibrium tide theory. Doodson determined the harmonic analysis constants by least-squares fitting of the observed tidal data [4]. Yen smoothed the harmonic analysis constants by passing them through a Kalman filter [5]. After more than 150 years of development, harmonic analysis continues to be widely used in tidal prediction; however, this model only considers the astronomical tidal level affected by the tide-generating forces. Other environmental factors such as wind, pressure, and seabed topography, which exert nonlinear effects on the tidal level, are ignored. If the tide level is predicted by harmonic analysis alone, a large prediction error is incurred, which is verified in Section 3.2.2 of this paper. Harmonic analysis also requires long-term historical data of tidal levels, which are often unavailable because of the high cost of on-site monitoring equipment [6].
In today's artificial intelligence era, data prediction is performed by increasingly intelligent models. Owing to their strong adaptive learning ability and nonlinear-mapping ability, neural networks are now widely used in tidal prediction. Many researchers have combined neural networks with related intelligent algorithms in their tidal prediction models, with much success.
The first prediction of diurnal and semidiurnal tides by an artificial neural network was attempted by Tsai et al. [7]; Lin et al. [8] proposed an adaptive neuro-fuzzy inference system for sea level prediction, which accounts for the tidal forces and thermal expansion of the ocean; and Jain et al. [9] developed a 24 h tidal prediction model based on a neural network, and applied it to the New Mangalore tidal station on the west coast of the Indian Ocean. However, neural networks generally require a large amount of training data, are easily trapped in local optima, and lack universal applicability. Support vector machines (SVM) have been widely used in prediction because they provide good nonlinear fitting with small input data volumes and generalize well [10]. Bhasin et al. [11] successfully forecasted the families and subfamilies of G-protein coupled receptors by an SVM-based approach; Xiong et al. [12] combined an SVM with a Hidden Markov model, and hence proposed a new framework of vehicle collision prediction; and Deris et al. [13] hybridized the SVM model with graded resolution, and predicted the surface roughness during abrasive water-jet machining. Oliveira et al. [14] proposed an evolutionary hybrid system composed of an exponential smoothing filter, the Autoregressive Integrated Moving Average Model (ARIMA), autoregressive (AR) linear models, and an SVR model, which has been shown to have good prospects in the forecasting field. Given this diversity of applications, the prospects of SVM in tidal prediction are high. Nevertheless, tidal prediction by SVM has rarely been reported.
To exploit these prospects, the current study proposes a tidal prediction model based on harmonic analysis and autoregressive integrated moving average–support vector regression (ARIMA-SVR). The model uses the typical time-series-processing model ARIMA together with SVR, which has an excellent nonlinear-data regression performance, to predict the residual sequence left by the harmonic analysis prediction. The ARIMA-SVR prediction model is a data-driven model. It has particular advantages in numerical prediction, the reconstruction of highly nonlinear functions, time series analysis, and so on. It does not need to consider the physical mechanism of the tidal formation process, but instead builds a mathematical model of the time series. By learning from the given samples, it can find the statistical or causal relationships among the water level variables, which gives it broad prospects in tidal prediction.
By using the complete data to extract information, including the astronomical tide level and the non-astronomical tide level, the proposed model greatly improves the accuracy of tidal prediction. The model was validated against the measured tide data at the port of Bay Waveland Yacht Club in Mississippi, USA. The verification shows that the combined model effectively ameliorates the low accuracy of a single model and provides effective tidal prediction.

SVM and SVR
Pioneered by Vapnik in 1995, SVM is a nonlinear learning method with a solid theoretical foundation. Unlike other machine learning methods, such as neural networks, SVM implements the principle of Structural Risk Minimization (SRM) [15]. By solving a convex quadratic programming problem, SVM seeks the best generalization performance by balancing the learning ability on a finite sample [16] against the complexity of the model. It has two variants: support vector classification (SVC) and support vector regression (SVR), which solve data classification and regression prediction problems, respectively. The SVM architecture is shown in Figure 1, where $x(n)$ is the input independent variable and $K(x, x_i)$ $(i = 1, 2, \ldots, n)$ is the kernel function. The independent variable $x(n)$ is mapped to a high-dimensional feature space by the kernel function, multiple linear regression is then performed in that feature space [17], and the output $Y$ is obtained.
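Algorithms 2019, 12, x FOR PEER REVIEW 3 of 18
In the present paper, the kernel is the widely used radial basis function (RBF):
$$K(x, x_i) = \exp\left(-\frac{\|x - x_i\|^2}{2\delta^2}\right) = \exp\left(-g\,\|x - x_i\|^2\right),$$
where $x$ and $x_i$ are vectors in the initial low-dimensional feature space, $\delta$ is the inner bandwidth of the kernel function, $g = 1/(2\delta^2)$ is the parameter of the kernel function, and $\|\cdot\|$ is the 2-norm operator. By mapping the RBF kernel function, each sample $(x_i, y_i)$ is maximally fitted to the following linear SVR model:
$$f(x) = \langle \omega, \varphi(x) \rangle + b,$$
where $\langle \cdot, \cdot \rangle$ is the inner product notation and incorporates the kernel function through the feature map $\varphi$; the weight vector $\omega$ and bias $b$ are obtained by solving the following optimization problem:
$$\min_{\omega,\, b,\, \xi,\, \xi^*} \; \frac{1}{2}\|\omega\|^2 + C \sum_{i=1}^{n} \left(\xi_i + \xi_i^*\right)$$
$$\text{s.t.} \quad y_i - \langle \omega, \varphi(x_i) \rangle - b \le \varepsilon + \xi_i, \quad \langle \omega, \varphi(x_i) \rangle + b - y_i \le \varepsilon + \xi_i^*, \quad \xi_i,\ \xi_i^* \ge 0.$$
The factor $C$ is the penalty factor, which trades off the generalization performance against the training error, and $\varepsilon$ is the tolerance within which deviations between the fitted and observed values are not penalized. $\xi_i$ and $\xi_i^*$ are slack variables that allow individual samples to fall outside the $\varepsilon$ band, giving the model a certain fault tolerance and helping to avoid over-fitting during training.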
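As a rough illustration of the quantities described above, the following sketch evaluates an RBF kernel and the ε-insensitive loss that the slack variables measure. It is a minimal sketch, not the authors' implementation: the function names are hypothetical, and the kernel is written in the form exp(-g‖x − x_i‖²), which is assumed here to be the parameterization behind the kernel parameter g.

```python
import math

def rbf_kernel(x, xi, g):
    """RBF kernel exp(-g * ||x - xi||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-g * sq_dist)

def eps_insensitive_loss(y_true, y_pred, eps):
    """Deviation beyond the tolerance eps; zero inside the eps band."""
    return max(abs(y_true - y_pred) - eps, 0.0)

def svr_objective(w_norm_sq, targets, predictions, C, eps):
    """0.5 * ||w||^2 + C * (total slack), with slack taken as the eps-insensitive loss."""
    slack = sum(eps_insensitive_loss(y, f, eps) for y, f in zip(targets, predictions))
    return 0.5 * w_norm_sq + C * slack
```

A larger C makes the objective more sensitive to samples outside the ε band, whereas a larger ε leaves more samples unpenalized.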

PSO Algorithm
The particle swarm optimization (PSO) algorithm [18] is a swarm intelligence optimization algorithm proposed by Kennedy and Eberhart in 1995. Like the fish swarm algorithm and the ant colony algorithm, PSO optimizes the solution via group intelligence generated by the mutual cooperation of particles and information sharing. PSO also has a memory function for dynamically tracking the current search situation and adjusting the search strategy in real time [19]. Because of its simplicity, high efficiency, and small number of tunable parameters, PSO has been widely applied in fuzzy system control, parameter optimization of machine learning algorithms, and function optimization.
The PSO algorithm is fully described below.
Step 1: Initialize a group of particles in D-dimensional space. Each particle represents a potential optimal solution to the optimization problem and is characterized by its position, velocity, and fitness. The position and velocity vectors of particle $i$ are represented as $X_i = (X_{i1}, X_{i2}, X_{i3}, \ldots, X_{iD})$ and $V_i = (V_{i1}, V_{i2}, V_{i3}, \ldots, V_{iD})$, respectively.
Step 2: As the particles move in space, update their individual positions by tracking the individual extremum $P_{best}$ and the group extremum $G_{best}$. $P_{best}$ and $G_{best}$ are the positions with the best fitness found so far by the individual particle and by the whole group, respectively.
Step 3: Calculate the fitness after updating the particle position. The $P_{best}$ and $G_{best}$ positions are updated by comparing the fitness of the new particle with those of the individual extremum and group extremum computed in Step 2. In each iteration, the particle adjusts its velocity vector based on its inertia, its individual best position $P_i = (P_{i1}, P_{i2}, P_{i3}, \ldots, P_{iD})$, and the group best position $P_g = (P_{g1}, P_{g2}, P_{g3}, \ldots, P_{gD})$. It then adjusts its position vector. The specific update formulas are
$$V_{id}^{k+1} = V_{id}^{k} + c_1 r_1 \left(P_{id}^{k} - X_{id}^{k}\right) + c_2 r_2 \left(P_{gd}^{k} - X_{id}^{k}\right),$$
$$X_{id}^{k+1} = X_{id}^{k} + V_{id}^{k+1},$$
where $k$ is the iteration index, $c_1$ and $c_2$ are the learning factors, and $r_1$ and $r_2$ are random numbers between 0 and 1 (with $i = 1, 2, \ldots, n$; $d = 1, 2, \ldots, D$).
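The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the configuration used in the paper: the inertia weight `w`, the position clamping to `bounds`, the default parameter values, and the function name `pso_minimize` are all assumptions introduced here, and setting `w=1.0` gives a velocity update without a separate inertia weight.

```python
import random

def pso_minimize(fitness, bounds, n_particles=20, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `fitness` over a box given as a list of (low, high) pairs."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Step 1: random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in pos]
    p_best_fit = [fitness(p) for p in pos]
    g_idx = min(range(n_particles), key=lambda i: p_best_fit[i])
    g_best, g_best_fit = p_best[g_idx][:], p_best_fit[g_idx]
    for _ in range(n_iter):
        for i in range(n_particles):
            # Step 2: move the particle toward its own best and the group best.
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (p_best[i][d] - pos[i][d])
                             + c2 * r2 * (g_best[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            # Step 3: re-evaluate fitness and update the two extrema.
            fit = fitness(pos[i])
            if fit < p_best_fit[i]:
                p_best[i], p_best_fit[i] = pos[i][:], fit
                if fit < g_best_fit:
                    g_best, g_best_fit = pos[i][:], fit
    return g_best, g_best_fit

# Usage: minimize a simple two-dimensional sphere function.
best_pos, best_val = pso_minimize(lambda p: sum(x * x for x in p), [(-5, 5), (-5, 5)])
```

When PSO is used to tune SVR hyperparameters, the fitness would typically be a validation error evaluated at a candidate pair (C, g); that coupling is not shown here.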

The Harmonic Analysis Method
Tidal forces can be regarded as the sum of forces with different periods; in other words, as the sum of many simple harmonic oscillations. The harmonic analysis method separates the harmonic constants (including amplitudes and phase lags) of each tidal component from the continuous observation data of tide heights [6]. Tidal height is calculated by summing the $m$ tidal components as follows:
$$h(t) = A_0 + \sum_{i=1}^{m} R_i \cos\left(\omega_i t + \theta_i\right), \qquad (7)$$
where $A_0$ is the mean sea level; $R_i$ is the component amplitude; and $\omega_i$ and $\theta_i$ are the angular velocity and initial phase of the tidal components, respectively.
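The superposition of tidal components can be sketched as follows. This is only an illustration: the function name is hypothetical, the sign convention cos(ωt + θ) is assumed, and the amplitudes and phases in `example_constituents` are made-up values, not the harmonic constants of any station. The angular speeds are the standard values for the K1, O1, and M2 constituents in degrees per hour.

```python
import math

def tidal_height(t_hours, mean_level, constituents):
    """Sum of harmonic components.

    constituents: list of (amplitude, angular speed in deg/hour, initial phase in deg).
    """
    height = mean_level
    for amplitude, speed_deg_per_hour, phase_deg in constituents:
        angle = math.radians(speed_deg_per_hour * t_hours + phase_deg)
        height += amplitude * math.cos(angle)
    return height

# Hypothetical harmonic constants, for illustration only.
example_constituents = [
    (0.15, 15.0410686, 30.0),  # K1
    (0.14, 13.9430356, 20.0),  # O1
    (0.03, 28.9841042, 10.0),  # M2
]
hourly_heights = [tidal_height(t, 0.2, example_constituents) for t in range(24)]
```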

ARIMA
The ARIMA model is a typical time-series analysis and prediction model. From historical time-series data, we can build a dynamic model and predict the future trend of the data [20]. The ARIMA model is based on the autoregressive and moving average model, which is represented by the following formula:
$$Y_t = \mu + \sum_{i=1}^{p} r_i Y_{t-i} + \varepsilon_t + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}, \qquad (8)$$
and it adds the following difference operator:
$$\Delta Y_t = Y_t - Y_{t-1}, \qquad \Delta^d Y_t = \Delta\left(\Delta^{d-1} Y_t\right). \qquad (9)$$
In these expressions, $Y_t$ is the predicted value at time $t$; $\varepsilon_{t-i}$ and $\varepsilon_t$ denote the errors at times $t - i$ and $t$, respectively; and $Y_{t-i}$ is the measured value at time $t - i$. $\mu$ is a constant; $r_i$ is the autoregressive coefficient; $\theta_i$ is the moving average coefficient, which is different from the initial phase $\theta_i$ of the tidal components in Formula (7); $d$ is the order of differencing; and $\Delta^d Y_{t-i}$ is the time series after applying the $d$-order difference. $p$ is the order of the autoregressive model, which represents the lagged rank of the time series. $q$ is the order of the moving average model, which represents the lagged rank of the prediction errors. The model must determine three parameters $(p, d, q)$. If the sequence is non-stationary, it should be transformed into a stationary sequence by the $d$-order difference (Equation (9)), and $Y_{t-i}$ in Equation (8) is replaced with $\Delta^d Y_{t-i}$. If the original sequence is stationary, $d$ is set to 0.
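The differencing operation and a one-step prediction from the autoregressive and moving average terms can be sketched as follows. The function names are hypothetical, the coefficients are assumed to be already estimated, and the unknown future error is replaced by its expected value of zero; coefficient estimation itself is not shown.

```python
def difference(series, d=1):
    """Apply the first-difference operator d times."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out

def arma_one_step(history, errors, mu, ar_coeffs, ma_coeffs):
    """Predict the next value from the last p values and the last q errors.

    history[-1] is the most recent value; errors[-1] is the most recent error.
    """
    ar_part = sum(r * history[-i] for i, r in enumerate(ar_coeffs, start=1))
    ma_part = sum(th * errors[-i] for i, th in enumerate(ma_coeffs, start=1))
    return mu + ar_part + ma_part
```

For example, `difference([1, 4, 9, 16], 2)` removes a quadratic trend and leaves a constant sequence.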

Prediction Steps
As mentioned above, tides can be considered as the superposition of astronomical tidal levels and non-astronomical tidal levels. The astronomical tidal level is strongly periodic, being governed by the tide-generating force, whereas the non-astronomical variations in water level largely depend on climate, hydrology, wind, and other environmental factors, and exhibit strong randomness [21]. Based on the above analysis, this paper establishes a combined tidal prediction model based on harmonic analysis and ARIMA-SVR. The prediction steps are described below.
The first step is to preprocess the sample data. While acquiring tidal data, many factors may cause the data to become inaccurate and incomplete, resulting in noise interference. The preprocessing step removes the noise from the tidal sequence, thus restoring the tidal motions and improving the prediction accuracy. The preprocessing is performed by wavelet analysis.
Second, the height of the astronomical tide level is calculated via harmonic analysis. The harmonic constants are calculated by analyzing the historical observations of tidal heights, and the tidal heights are subsequently calculated. Meanwhile, the residual series generated by this method is collected as the sample data of the non-astronomical tidal level component.
Third, the nonlinear variations in water level are predicted by the ARIMA-SVR model. As the astronomical characteristics of the data have been captured by the astronomical tidal level component, the changes in the non-astronomical tidal level are reflected in the prediction residual. This prediction step is divided into three sub-steps. In the first sub-step, exploiting the strong processing ability of ARIMA for time series, a single-step ARIMA model of the non-astronomical tidal level sequence is established, which determines the input of the SVR forecasting model according to the lagged rank $p$ of the time series in ARIMA. The analysis of the time sequence in the ARIMA model confirms that the residual values at times $t - 1$ to $t - p$ are noticeably correlated with the value at time $t$, and can therefore be chosen as the input of the SVR forecasting model. That is, the residual values at times $t - 1$ to $t - p$ are used to predict the value at time $t$. The second sub-step establishes a non-astronomical tidal level prediction model based on SVR, which has a strong nonlinear processing ability. As mentioned above, the input variable of the SVR model is
$$X = \left(Y_{t-1}, Y_{t-2}, \ldots, Y_{t-p'}\right),$$
where $Y_{t-p'}$ represents the value at time $t - p'$. The value of $p'$ is determined by the lagged rank $p$ of the time series in the ARIMA model. With all other factors held the same (only $p'$ differs), the prediction model error is smallest when $p' = p$, which is verified in Section 3.2.4. Furthermore, the output variable is
$$Y = Y_t.$$
The residual sequence generated by the harmonic analysis is normalized to avoid computational saturation. The data are normalized to the interval [0, 1] as follows:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}},$$
where $x_{\min} = \min(x)$ and $x_{\max} = \max(x)$. The optimal kernel function type, penalty factor $C$, and kernel function parameter $\delta$ are found by the PSO algorithm and then input to the SVR model for training. In the third sub-step, the test set samples are predicted by the trained model, and the tidal calculation values of the non-astronomical part are obtained.
In the fourth main step, the final tidal height is predicted via the equal-weight summation of the astronomical tidal level and the non-astronomical tidal level. The whole prediction process is shown schematically in Figure 2.
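The data handling in the third and fourth steps can be sketched as below: building lagged input vectors from the residual sequence, min-max normalization, and the final equal-weight summation. This is a minimal sketch with hypothetical function names; the SVR training itself is omitted, and the ordering of the lags within each input vector (most recent first) is an assumption.

```python
def make_lagged_samples(residuals, p):
    """Build (input, target) pairs: the p previous residuals predict the current one."""
    inputs, targets = [], []
    for t in range(p, len(residuals)):
        inputs.append([residuals[t - i] for i in range(1, p + 1)])
        targets.append(residuals[t])
    return inputs, targets

def min_max_normalize(values):
    """Scale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant sequence")
    return [(v - lo) / (hi - lo) for v in values]

def combine(astronomical, non_astronomical):
    """Equal-weight sum of the two predicted components, element by element."""
    return [a + r for a, r in zip(astronomical, non_astronomical)]
```

In practice the normalization bounds would be taken from the training set and reused for the test set, and predictions would be mapped back to the original scale before the summation.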
The error in the prediction model was measured by the mean absolute error ($E_{MA}$), the mean squared error ($E_{MS}$), the root mean squared error ($E_{RMS}$), and the correlation coefficient $r$. These four performance measures are respectively calculated as follows:
$$E_{MA} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - Y_i \right|,$$
$$E_{MS} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - Y_i \right)^2,$$
$$E_{RMS} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - Y_i \right)^2},$$
$$r = \frac{\mathrm{Cov}(y, Y)}{\sqrt{\mathrm{Var}[y]\,\mathrm{Var}[Y]}}.$$
Here, $y$ and $Y$ are the measured and predicted tidal heights, respectively; $n$ is the number of tidal samples; $\mathrm{Cov}(y, Y)$ is the covariance between $y$ and $Y$; and $\mathrm{Var}[y]$ and $\mathrm{Var}[Y]$ denote the variances in $y$ and $Y$, respectively.
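The four performance measures can be computed directly from paired lists of measured and predicted heights. The sketch below uses a hypothetical function name and population (divide-by-n) covariance and variance; the normalization cancels in the correlation coefficient.

```python
import math

def error_measures(measured, predicted):
    """Return (E_MA, E_MS, E_RMS, r) for two equal-length sequences."""
    n = len(measured)
    errors = [y - Y for y, Y in zip(measured, predicted)]
    e_ma = sum(abs(e) for e in errors) / n
    e_ms = sum(e * e for e in errors) / n
    e_rms = math.sqrt(e_ms)
    mean_y = sum(measured) / n
    mean_Y = sum(predicted) / n
    cov = sum((y - mean_y) * (Y - mean_Y) for y, Y in zip(measured, predicted)) / n
    var_y = sum((y - mean_y) ** 2 for y in measured) / n
    var_Y = sum((Y - mean_Y) ** 2 for Y in predicted) / n
    r = cov / math.sqrt(var_y * var_Y)
    return e_ma, e_ms, e_rms, r
```

Note that a constant offset between the two series leaves r at 1 while the other three measures are non-zero, which is why the error measures and the correlation coefficient are reported together.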

Model Checking
The combined tide-forecasting model presented in this paper was checked against the observed tide level data at the Bay Waveland Yacht Club port in the USA, obtained from the website of the National Oceanic and Atmospheric Administration. The harmonic constants of four tidal constituents of the Bay Waveland Yacht Club tidal station are shown in Table 1. The tidal coefficient of the port is 10.767, indicating a diurnal tide, with only one high tide and one low tide each day. The tidal level of this port was predicted by the proposed model. The astronomical part of the tidal height was calculated via harmonic analysis, obtaining 720 tidal calculations at 1 h intervals over 30 consecutive days in November 2018. Meanwhile, the non-astronomical part of the water level variation was determined by the ARIMA-SVR model. The training set was compiled from the 744 prediction residuals of the astronomical tidal levels from GMT0000 on 1 October 2018 to GMT2300 on 31 October 2018, and the 720 prediction residuals from GMT0000 on 1 November 2018 to GMT2300 on 30 November 2018 were used as the test set for verifying the prediction results. Finally, both parts of the tidal calculation results were superimposed to obtain the predicted tidal levels throughout November 2018.

Sample Data Preprocessing
The original waveform was constructed from the real-time signal sets (the tidal series measured from GMT0000 on 1 December 2017 to GMT2300 on 31 December 2017). The original waveform is shown in Figure 3. The waveform construction was based on the Sym8 wavelet basis function and two-layer wavelet decomposition. The threshold was selected using the heursure function, and de-noising was performed with a soft threshold function. The original and de-noised waveforms are compared in Figure 4, and the noise signal is shown in Figure 5. De-noising comparatively flattened the processed waveforms without changing their overall trend (Figure 4). The effectiveness of the de-noising pretreatment was verified in a comparative test (Section 3.2.4). The results show that de-noising improves the accuracy of the sample data.
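The soft-threshold step can be illustrated with a deliberately simplified sketch. Note the substitutions: a single-level Haar transform stands in for the two-layer Sym8 decomposition, and the threshold is passed in by hand rather than selected by the heursure rule. The function names are hypothetical and the signal length is assumed to be even.

```python
import math

def soft_threshold(coeffs, thr):
    """Shrink each coefficient toward zero by thr; small coefficients become zero."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]

def haar_denoise(signal, thr):
    """One-level Haar decomposition, soft-threshold the detail part, reconstruct."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = soft_threshold(detail, thr)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out
```

With a threshold of zero the signal is reconstructed unchanged; with a very large threshold each pair of neighbouring samples is replaced by its mean, which is the flattening effect described above taken to its extreme.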

Analysis of Prediction Results of Astronomical Tide Level
The astronomical tidal level calculation used 11 main tidal constituents, namely, $M_2$, $S_2$, $N_2$, $K_2$, $K_1$, $O_1$, $P_1$, $Q_1$, $M_4$, $MS_4$, and $M_6$. Figure 6 compares the astronomical tidal level predicted by the harmonic analysis method with the original (not de-noised) measured tidal level. Figure 7 is an error distribution chart of the tidal levels predicted by the harmonic analysis method alone, and Figure 8 plots the linear regression between the observations and predicted results of the harmonic analysis method. As indicated by the deviation of the best-fit line (red line in Figure 8) from the Y = X line, the observed and predicted values were highly discrepant. The simple harmonic analysis introduced obvious errors at some points. The $E_{RMS}$ of the harmonic analysis method was determined as 0.180571 m. This large error is attributed to the oversimplified analysis method: the simple harmonic method only considers the influence of the tide-generating force and ignores the non-astronomical components of the changing water levels.

Analysis of Prediction Results of Non-Astronomical Tide Level
The model parameters of the non-astronomical part were determined by the single-step ARIMA model. To determine the difference term $d$, the stationarity of the sequence must first be assessed. To this end, the 744 nonlinear data throughout October 2018 were processed by the augmented Dickey-Fuller (ADF) stationarity test. The parameters are shown in Table 2. Here, the p-value is the significance probability of the statistical test, which is different from the order $p$ of the autoregressive model in Formula (8). The t-statistics were below the critical values at the 1%, 5%, and 10% significance levels, and the p-values were close to 0, confirming that the tidal level sequence was a stationary sequence with $d = 0$ [22]. To determine the order $p$ of the autoregressive term and the order $q$ of the moving average term, the sequence was evaluated using its autocorrelation function and partial autocorrelation function [23]. The evaluation results are shown in Figures 9 and 10. In the partial autocorrelation function plot (Figure 9), the sequence lies mainly within the confidence interval after the third order; i.e., it cuts off after the third order, and the autoregressive term $p$ is thus 3. After establishing $p$, the independent variables of the sample set in the SVR model are set to the tide levels at times $t - i$ $(i = 1, 2, 3)$, and the dependent variable is the tide level at time $t$.
As is evident in the autocorrelation plot (Figure 10), the autocorrelation of the tidal level time series tails off, and the moving average term $q$ is 0. In summary, the sequence is described by the ARIMA(3, 0, 0) model. The prediction process for the non-astronomical tidal part is shown in Figure 11.
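The order-selection reasoning above (a partial autocorrelation that cuts off after lag p, with an autocorrelation that tails off) can be sketched with the sample autocorrelation and the Durbin-Levinson recursion. This is an illustrative sketch with hypothetical function names; the approximate 95% bound 1.96/sqrt(n) is an assumption about how the confidence interval is drawn, and the ADF test is not reproduced here.

```python
def acf(series, n_lags):
    """Sample autocorrelation at lags 0..n_lags."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(n_lags + 1)]

def pacf(series, n_lags):
    """Sample partial autocorrelation at lags 0..n_lags (Durbin-Levinson)."""
    r = acf(series, n_lags)
    phi_prev = []          # phi_prev[j] holds phi_{k-1, j+1}
    out = [1.0]
    for k in range(1, n_lags + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out

def cutoff_order(values, n_obs):
    """Number of leading lags whose value lies outside the +/-1.96/sqrt(n) band."""
    bound = 1.96 / n_obs ** 0.5
    order = 0
    for k in range(1, len(values)):
        if abs(values[k]) <= bound:
            break
        order = k
    return order
```

Applied to the partial autocorrelation of a residual series, `cutoff_order` returns a candidate autoregressive order; visual inspection of the plots remains advisable because individual lags can exceed the band by chance.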
To reduce the influence of the order of magnitude of the sample on the prediction accuracy, the data in the training and test sets were normalized to the interval [0, 1], and the RBF was selected as the kernel function. The parameters (penalty factor $C$ and kernel function parameter $g$) were optimized by the PSO algorithm. The maximum number of generations was set to 200; the population size was set to 20; and the learning factors $c_1$ and $c_2$ were set to 1.5 and 17, respectively. The optimization results are shown in Table 3. The optimal parameters found by the search were then used to train the SVM model.
After training, the test data were input to the model, and the predicted results were compared with the real results. The comparisons and their relative errors are displayed in Figures 12 and 13, respectively.

Analysis of Prediction Results of the Combined Model
Next, the astronomical tidal level and the non-astronomical tidal level were linearly added to obtain the overall tidal prediction throughout November 2018. Figure 14 compares the predicted tidal levels with the observed (not de-noised) data, and Figure 15 plots the prediction errors. The combined model yielded much more accurate results than the pure harmonic analysis method. The $E_{RMS}$ of the combined model was 0.022293 m, which is markedly smaller than that of the harmonic analysis method (0.180571 m). Figure 16 linearly regresses the predictions of the combined model against the observed (not de-noised) data. The combined model clearly predicted the observed tidal level with high accuracy.
To verify the effect of de-noising, the tide level was predicted using the original data (without de-noising) and the prediction error was calculated by the above steps. The errors in the predictions are compared with those of the de-noised data in Table 4. Clearly, the wavelet transform smoothed the data and improved the prediction accuracy. This result confirms the feasibility and effectiveness of de-noising the sample data prior to analysis.
To further verify the prediction accuracy of the combined model, the total tidal levels at the Bay Waveland Yacht Club station were predicted by single harmonic analysis, the combined model, the SVR model, and another common method, the back propagation neural network (the BP model) [24]. The parameters of the SVR model were optimized by the PSO algorithm. For a fair comparison, the sample data and parameters were identical in all methods. The prediction performances of the four methods are compared in Table 5. The proposed combined model required a longer training time, but yielded more accurate tidal predictions with lower errors than the other models.
The prediction accuracy of a model depends on the size of the training set. Accordingly, the prediction accuracies of the SVR and combined models were compared on training sets with different sample sizes (samples collected over 1, 3, 6, or 12 months). In this comparison, the test set remained fixed. ARIMA modeling was performed on the training sets of different sample sizes to determine the lagged order $p$ of the time series, which in turn determined the input of the model. Again, the data were the tidal levels at the Bay Waveland Yacht Club tidal station. The prediction results of the SVR model and the combined model are shown in Tables 6 and 7, respectively. As the sample size increased, the $E_{RMS}$ of the tidal levels predicted by the SVR model (Table 6) fluctuated substantially around 0.049 m, whereas that of the combined model (Table 7) stayed close to 0.022 m. Moreover, the error indicators of tidal prediction were lower in the combined model than in the SVR model. The combined model thus exhibited a more accurate and stable prediction performance than the SVR model alone, within a significantly lower runtime than the SVR model. This result confirms the efficiency of the combined model.
As mentioned above, the Bay Waveland Yacht Club tidal station experiences a diurnal tide. To test the combined model on different tidal types and stations, the harmonic analysis, SVR, and combined models were trained on the tidal level data from four stations with different tidal types, and their predictive performances were evaluated in each case. The Nawiliwili, The Battery, and Texas Point tidal stations, which have different tidal types, were selected for the tidal prediction comparison experiments.
The ARIMA model was established for the tidal data of each tidal station to determine the lagged rank $p$ of the time series, which fixed the input of the non-astronomical tidal part. The error results are shown in Table 8. At all four stations, the combined model outperformed the pure harmonic analysis and pure SVR models. To further assess the prediction accuracy, the relative magnitudes of the astronomical and non-astronomical tide parts of the sample data of the four tidal stations were calculated, giving Table 9. As shown in Tables 8 and 9, the larger the relative magnitude of the astronomical tide, the higher the accuracy of the harmonic analysis. The harmonic analysis method is therefore suitable for tidal stations with a high relative magnitude of astronomical tides.
Table 8. Tidal-type comparisons of prediction errors in the harmonic analysis, SVR, and combined models. The third column lists "$p'$", which means that, for a prediction time step of 1, the values at times $t - 1$ to $t - p'$ are used to predict the value at time $t$.
The input of the SVR also significantly affects the prediction accuracy of the combined model. To verify the robustness of choosing the SVR input in the combined model according to the lagged rank $p$ of the residual sequence in the ARIMA model, the combined model was trained on the tide level data from the Bay Waveland Yacht Club tidal station, and its predictive performance was compared for different inputs. The prediction results are shown in Table 10. Here, the first column lists "$p'$", which means that, for a prediction time step of 1, the residual values at times $t - 1$ to $t - p'$ are used to predict the value at time $t$. Increasing $p'$ from 1 to 3 reduced the error in the combined model, but increasing $p'$ further increased the error. The minimized error at $p' = 3$ is consistent with the lagged rank $p$ of the residual sequence found in the ARIMA model. However, even at the largest value ($p' = 12$), the error was lower in the combined model than in the harmonic analysis method. Therefore, the combined model is more accurate and more suitable for short-term tidal level prediction than the simple harmonic model.

Figure 1. Architecture of a support vector machine (SVM).

Figure 2. Schematic of the combined tidal prediction model.

Figure 4. Comparison of the original and de-noised tidal level sequences.

Figure 5. Sequence of the noise signal.

Figure 6. Comparison of tidal levels predicted by the harmonic analysis method alone and the observed data.

Figure 7. Error distribution map of the harmonic analysis prediction.

Figure 8. Linear regression of the harmonic analysis predictions versus the observed results. The red line is the best-fit line of the data. Along the black line, the predicted and observed values are equal, and the error is zero.

Figure 9. Autocorrelation analysis of the residual tidal level data. The area between the two blue lines is the confidence interval.

Figure 10. Partial autocorrelation analysis of the residual tidal level data. The area between the two blue lines and the coordinate axis is the confidence interval.

Figure 11. Schematic of the non-astronomical tidal level prediction part.

Figure 12. Comparison of the observed non-astronomical tidal level and the level predicted by the SVR model.

Figure 13. Error distribution map of the non-astronomical tidal level.

Figure 14. Comparison of the (not de-noised) observations and the water levels predicted by the combined model.

Figure 15. Error distribution map of the combined model.

Figure 16. Linear regression plot of the predicted versus observed tidal heights. The predicted results were calculated by the combined model.

Table 1. The harmonic constants of four tidal constituents of the Bay Waveland Yacht Club tidal station.

Table 2. Parameters determined in the ADF unit root test.

Table 3. Optimal parameters detected by PSO.

Table 4. Comparison of the predicted tide level errors with and without de-noising.

Table 5. Performance comparison of different models simulating the tidal behavior at Bay Waveland Yacht Club. The second column lists p: when the prediction time step is 1, the residual values from time t − 1 to t − p are used to predict the value at time t. The last column lists the processing time required for the various methods.

Table 6. Sample-size comparison of tidal prediction errors in the SVR model. The second column lists p: when the prediction time step is 1, the total tidal levels from time t − 1 to t − p are used to predict the total tidal level at time t.

Table 7. Sample-size comparison of tidal prediction errors in the combined model. The second column lists p: when the prediction time step is 1, the residual water levels from time t − 1 to t − p are used to predict the non-astronomical tidal level at time t.

Table 9. Relative magnitudes of the astronomical and non-astronomical tide parts at the four tidal stations.

Table 10. Comparison of the prediction accuracy of the combined method for different SVR inputs at the Bay Waveland Yacht Club tidal station. The first column lists p: when the prediction time step is 1, the residual values from time t − 1 to t − p are used to predict the value at time t.