Prediction of Sea Level with Vertical Land Movement Correction Using Deep Learning

: Sea level rise (SLR) in small island countries such as Kiribati and Tuvalu have been a signiﬁcant issue for decades. There is an urgent need for more accurate and reliable scientiﬁc information regarding SLR and its trend and for more informed decision making. This study uses the tide gauge (TG) dataset obtained from locations in Betio, Kiribati and Funafuti, Tuvalu with sea level corrections for vertical land movement (VLM) at these locations from the data obtained by the Global Navigation Satellite System (GNSS) before the sea level trend and rise predictions. The oceanic feature inputs of water temperature, barometric pressure, wind speed, wind gust, wind direction, air temperature, and three signiﬁcant lags of sea level are considered in this study for data modeling. A new data decomposition method, namely, successive variational mode decomposition (SVMD), is employed to extract intrinsic modes of each feature that are processed for selection by the Boruta random optimizer (BRO). The study develops a deep learning model, namely, stacked bidirectional long short-term memory (BiLSTM), to make sea level (target variable) predictions that are benchmarked by three other AI models adaptive boosting regressor (AdaBoost), support vector regression (SVR), and multilinear regression (MLR). With a comprehensive evaluation of performance metrics, stacked BiLSTM attains superior results of 0.994207, 0.994079, 0.988219, and 0.899868 for correlation coefﬁcient, Wilmott’s Index, the Nash–Sutcliffe Index, and the Legates–McCabe Index, respectively, for Kiribati, and with values of 0.996806, 0.996272, 0.992316, and 0.919732 for correlation coefﬁcient, Wilmott’s Index, the Nash–Sutcliffe Index, and the Legates–McCabe Index, respectively, for the case of Tuvalu. It also shows the lowest error metrics in prediction for both study locations. Finally, trend analysis and linear projection are provided with the GNSS-VLM-corrected sea level average for the period 2001 to 2040. The analysis shows an average sea level rate rise of 2.1 mm/yr for Kiribati and 3.9 mm/yr for Tuvalu. It is estimated that Kiribati and Tuvalu will have a rise of 80 mm and 150 mm, respectively, by the year 2040 if estimated from year 2001 with the current trend.


Introduction
Sea level rise (SLR) is an important global issue and ranks among the top climate change issues, with international organizations concerned with the devastating impacts of climate change. The sixth assessment report of the Intergovernmental Panel on Climate Change (IPCC) [1,2] highlights that risks from SLR for coastal ecosystems and people are very likely to increase by tenfold if no adaptation and mitigation strategies are implemented as agreed by parties to the Paris agreement by the year 2100. The report further states that the global mean sea level (GMSL) has risen by 0.2 m since the year 1901 and is projected to rise further in the coming years. This also impacts the frequency of extreme sea levels, which are projected to increase by a median of 20-30 times across tide gauges by the year 2 of 23 2050 [2]. The global sea level rise (GSLR) rate has significantly increased from 1.7 mm/year to 3.4 mm/year since 1993 [3]. Sea level is a sensitive climate variable affected by various oceanic parameters. The rise in sea level has been attributed to the melting of glaciers and ice sheets, thereby adding water volume to the ocean [4]. The rate of melting has significantly increased over the past decade, and global warming has been stated as the major reason for this [3,5]. The impact of SLR could have devastating effects with a loss of 30-80% of coastal wetlands, which include salt marshes and mangroves [6]. A study on coastal hazard exposure of cultural, natural, and African heritage sites [7] found that by 2050, exposure to these areas will more than triple, reaching almost 200 sites as a result of climate change.
GSLR provides an estimate of the impact; however, the actual impact and rate of rise varies according to geographic location and other related climate variables. Miller and Douglas [8] stated the GSLR rate for the 20th Century to be between 1.5-2 mm/year, whereas Wadhams and Munk [9] found a lower rate of 1.1 mm/year in the same period of study. According to the IPCC report [10][11][12], small island nations are most vulnerable to SLR. Among these are South Pacific countries such as Kiribati and Tuvalu. Given the nature of these island states surrounded by oceans, SLR poses a major threat to their infrastructure and socioeconomic activities [12]. While the main aspect of SLR is highlighted as the inundation of coastal areas, the most critical and long term threat of impact will be on freshwater quality and availability [13]. The impacts of climate change on these island nations have been continuously highlighted in climate reports and media. However, as stated in [14], scientific knowledge and study in these islands have been relatively slow.
Sea level change is assessed by using three types of measurements, namely, tide gauge (TG) data, Global Navigation Satellite System (GNSS) time series, and observations of levelling [15,16]. TG provides a measure of sea level variation relative to a TG attached to a wharf. A problem associated with TG measurement is its inability to differentiate between sea level change and movement of the TG. In the case of land subsidence on which the tide gauge is situated, it records the relative sea level change [17]. In this study, tide gauge observations are corrected for land subsidence using the GNSS recorded measurements. GNSS measures the vertical crustal motion of the Earth with respect to the center of the Earth. Geoscience Australia (GA) operates the GNSS network, which includes the South Pacific countries, and provides this information under the Australian Aid-funded Pacific Sea Level and Geodetic Monitoring (PSLGM) Project. This technique has been successfully used in past studies for accurate sea level estimation. A study [18] using existing GNSS stations in Taiwan estimated the sea level change by removing vertical land motion (VLM) from the TG sea level measurement. Another study [19] claims that GNSS is able to provide VLM monitoring with an accuracy higher than 1 mm/year and hence is an effective tool to improve the estimation of sea level rise rates. VLM consideration in estimating accurate sea level rise is extremely important, as it removes a wide range of natural and anthropogenic influences from the tide gauge-measured sea level [16,20].
Furthermore, this study also addresses another important gap in study by utilizing new artificial intelligence (AI) methodologies, such as deep learning, to provide SLR trends and future localized projections for small islands such as Kiribati and Tuvalu in the South Pacific. The use of AI model predictions has been successfully carried out for various climate variables in many studies [21][22][23][24][25][26]. However, its implementation for the prediction of sea level with the VLM-corrected tide gauge dataset with data decomposition and feature selection techniques for South Pacific countries has not been considered so far. Therefore, this study proposes a deep learning architecture-based stacked bidirectional long short-term memory (BiLSTM) model as the objective model integrated with successive variational mode decomposition (SVMD) data decomposition and Boruta random forest optimizer feature selection for prediction of sea level. This is benchmarked by three AI models (adaptive boosting (AdaBoost) regressor, multilinear regression (MLR), and support vector regression (SVR). In addition, the corrected sea level average values are computed for a linear projection with the sea level rate rise for Kiribati and Tuvalu.

Study Area and Dataset
As mentioned previously, the two countries selected for this study are Kiribati and Tuvalu, located in the South Pacific Ocean. Figure 1 and Table 1 show the study area map and geographical locations of the two small island nations. and support vector regression (SVR). In addition, the corrected sea level average value are computed for a linear projection with the sea level rate rise for Kiribati and Tuvalu.

Study Area and Dataset
As mentioned previously, the two countries selected for this study are Kiribati an Tuvalu, located in the South Pacific Ocean. Figure 1 and Table 1 show the study area ma and geographical locations of the two small island nations.  Table 1. The TG is one com ponent of a water monitoring station and uses fitted sensors to collect measurement Apart from sea level heights, the new technology is capable of measuring other oceani parameters such as wind gusts and speed, air and water temperatures, and barometri pressure. The study utilized a 60 min interval-recorded sea level (meters), water temperature ( • C), air temperature ( • C), barometric pressure (hPa), wind direction (degrees True), wind gust (m/s), and wind speed (m/s) dataset for the period of 2010-2021 acquired from the Bureau of Meteorology (BOM), Australia (Pacific Sea Level and Geodetic Monitoring Project File information and Instructions (bom.gov.au accessed on 10 February 2022)). The Pacific climate change data portal was developed with BOM under the Pacific Climate Change Science Program (PCCSP; 2009-2011). The sea level observations were recorded by the tide gauges deployed at the study sites mentioned in Table 1. The TG is one component of a water monitoring station and uses fitted sensors to collect measurements. Apart from sea level heights, the new technology is capable of measuring other oceanic parameters such as wind gusts and speed, air and water temperatures, and barometric pressure.

Data Preprocessing and Input Selection
An important aspect of any study based on data analysis is the quality of the data and scientific correctness of addressing the missing values [27]. This study used the interpolation method in Python using the Pandas library. The default status of linear was implemented, given the close relationship with the data points in the dataset. The effectiveness of machine learning modeling depends on this process, and deletion of rows is not an ideal option [28]. The next step was to confirm the stationarity of the dataset, which was done using the Augmented Dickey Fuller (ADF) test. ADF is used for large datasets and tests the null hypothesis on the presence of the unit root in the data. A larger negative value than the given critical value indicates that null hypothesis can be rejected. Hence, if no unit root is present, the dataset satisfies the stationarity criteria [29].
A time series dataset as used in this study is a collection of time ordered observations. Many studies [30][31][32][33] use lagged values as model inputs, as they help to reduce redundant features and improve the model prediction accuracy [34,35]. Figure 2 below shows the Auto Correlation Function (ACF) and the Partial Autocorrelation Function (PACF) plots of the sea level time series, which show the antecedent behavior of the lags for Kiribati. ACF shows the correlation of how any two values of the sea level series change as their separation changes. The ACF sinusoidal curve shows the seasonality and measures the association between current and past values, which will be useful in predicting future values in AI modelling [36]. The PACF analysis helps to show partial correlation of the series with its own lagged values. As shown in Figure 2, the line graph shows the level of correlation with each lag. Following the graph analysis for both time series (Kiribati and Tuvalu), three lags of sea level time series were chosen as inputs for the AI models. An important aspect of any study based on data analysis is the quality of the data and scientific correctness of addressing the missing values [27]. This study used the interpolation method in Python using the Pandas library. The default status of linear was implemented, given the close relationship with the data points in the dataset. The effectiveness of machine learning modeling depends on this process, and deletion of rows is not an ideal option [28]. The next step was to confirm the stationarity of the dataset, which was done using the Augmented Dickey Fuller (ADF) test. ADF is used for large datasets and tests the null hypothesis on the presence of the unit root in the data. A larger negative value than the given critical value indicates that null hypothesis can be rejected. Hence, if no unit root is present, the dataset satisfies the stationarity criteria [29].
A time series dataset as used in this study is a collection of time ordered observations. Many studies [30][31][32][33] use lagged values as model inputs, as they help to reduce redundant features and improve the model prediction accuracy [34,35]. Figure 2 below shows the Auto Correlation Function (ACF) and the Partial Autocorrelation Function (PACF) plots of the sea level time series, which show the antecedent behavior of the lags for Kiribati. ACF shows the correlation of how any two values of the sea level series change as their separation changes. The ACF sinusoidal curve shows the seasonality and measures the association between current and past values, which will be useful in predicting future values in AI modelling [36]. The PACF analysis helps to show partial correlation of the series with its own lagged values. As shown in Figure 2, the line graph shows the level of correlation with each lag. Following the graph analysis for both time series (Kiribati and Tuvalu), three lags of sea level time series were chosen as inputs for the AI models. The lags and all other oceanic parameters were tested for correlation, as shown in Figure  3. This correlation matrix calculates the correlation between each oceanic parameter with the sea level. The darker rectangle blocks show higher correlations between the parameters. Taking sea level as the target variable for prediction, Table 2 shows the input oceanic parameters and lagged values used for the AI models. The lags and all other oceanic parameters were tested for correlation, as shown in Figure 3. This correlation matrix calculates the correlation between each oceanic parameter with the sea level. The darker rectangle blocks show higher correlations between the parameters. Taking sea level as the target variable for prediction, Table 2 shows the input oceanic parameters and lagged values used for the AI models.

GNSS VLM Correction
The GNSS dataset is collected by Geoscience Australia (GA) using its infrastructure in Kiribati and Tuvalu. The three-dimensional GNSS position is computed every week with respect to the center of the Earth [37].

GNSS VLM Correction
The GNSS dataset is collected by Geoscience Australia (GA) using its infrastructure in Kiribati and Tuvalu. The three-dimensional GNSS position is computed every week with respect to the center of the Earth [37].

Data Normalization
All the model input and target data were normalized [38][39][40] for the values to be within [0,1] for modelling by Equation (1)

Data Normalization
All the model input and target data were normalized [38][39][40] for the values to be within [0,1] for modelling by Equation (1) below: After the modeling process, these values are computed to convert back to their original values. The x actual in Equation (1) is made the subject, as shown in Equation (4).
where x n is the input dataset, x min is the minimum value in the dataset, and x max is the maximum value in the dataset.

Data Decomposition by Successive Variational Mode Decomposition (SVMD)
This study used successive variational mode decomposition (SVMD) proposed in [41], which successively extracts the modes without the need to know the number of modes. This new proposed method based on the same concept used in variational mode decomposition (VMD) decomposes the signal with lower computational complexity [41,42]. Variational mode extraction (VME) was applied successively on the input signal by adding constraints to negate convergence on the extracted modes in the previous step. This continued until all modes were extracted. To find a signal with the maximum compact spectrum L th , an optimization problem is solved when this L th mode reaches the extracted mode sum that reduces the reconstruction error. Figure 6 below shows the intrinsic mode decomposition (IMF) of Tuvalu sea level (lag) as an example of 500 data points. All input signals were decomposed into their IMFs for model inputs after Boruta feature selection, as used in many studies [43][44][45] to improve prediction accuracy. modes. This new proposed method based on the same concept used in variational mode decomposition (VMD) decomposes the signal with lower computational complexity [41,42]. Variational mode extraction (VME) was applied successively on the input signal by adding constraints to negate convergence on the extracted modes in the previous step. This continued until all modes were extracted. To find a signal with the maximum compact spectrum L th , an optimization problem is solved when this L th mode reaches the extracted mode sum that reduces the reconstruction error. Figure 6 below shows the intrinsic mode decomposition (IMF) of Tuvalu sea level (lag) as an example of 500 data points. All input signals were decomposed into their IMFs for model inputs after Boruta feature selection, as used in many studies [43][44][45] to improve prediction accuracy.

Input Feature Selection Using Boruta Random Forest Optimizer (BRFO)
The algorithm of Boruta feature selection is based on the random forest classifier. The selection of the classifier is based on decision trees. The algorithm randomizes the input dataset, and duplicates are created as shadow attributes. The random forest classifier is then trained on the larger set using a feature importance measure to collect the z-scores. The maximum z-score is calculated within the shadow attributes, which form the critical values whereby attributes with higher than these values are included in the selected set. A statistical two-sided test of equality with attributes that had an undetermined importance is then performed. All the attributes with lower z-scores were removed from the set. Hence, all selected features were then ready for AI training and modeling for prediction [33,46,47].

Input Feature Selection Using Boruta Random Forest Optimizer (BRFO)
The algorithm of Boruta feature selection is based on the random forest classifier. The selection of the classifier is based on decision trees. The algorithm randomizes the input dataset, and duplicates are created as shadow attributes. The random forest classifier is then trained on the larger set using a feature importance measure to collect the z-scores. The maximum z-score is calculated within the shadow attributes, which form the critical values whereby attributes with higher than these values are included in the selected set. A statistical two-sided test of equality with attributes that had an undetermined importance is then performed. All the attributes with lower z-scores were removed from the set. Hence, all selected features were then ready for AI training and modeling for prediction [33,46,47].

Data Partition
There is no set rule for data partition; however, following past successful studies [23,48,49], the oceanic dataset was divided, as shown in Table 3. The LSTM-based models overcome a significant problem of RNNs of vanishing gradient and capture long term dependencies in the data modeling process. This deep learning architecture learns what information to preserve and what to remove. There are three gates (forget, input, and output), which handle the process, as shown in Figure 7. The LSTM-based models overcome a significant problem of RNNs of vanishing gradient and capture long term dependencies in the data modeling process. This deep learning architecture learns what information to preserve and what to remove. There are three gates (forget, input, and output), which handle the process, as shown in Figure 7. The BiLSTM model extends the LSTM network in which each training sequence facilitates both forward and backward recurrent nets [33,50,51]. The input data passes through two LSTMs in this network model. The architecture uses full sequential information of all data points before with the movement through the layers [52,53]. There are two movements of input data, where data are first fed into the forward layer, and then the reverse form of input data are fed into the backward layer, as shown in Figure 8 [54]. The mathematical functions are given below for the units in the architecture: Forget Gates: Input Gates: Output Gates: Sigmoid Function: The BiLSTM model extends the LSTM network in which each training sequence facilitates both forward and backward recurrent nets [33,50,51]. The input data passes through two LSTMs in this network model. The architecture uses full sequential information of all data points before with the movement through the layers [52,53]. There are two movements of input data, where data are first fed into the forward layer, and then the reverse form of input data are fed into the backward layer, as shown in Figure 8 [54]. The mathematical functions are given below for the units in the architecture: Forget Gates: Input Gates: Mathematics 2022, 10, 4533 9 of 23 Output Gates: Sigmoid Function: Cell Input State:č Hypertangent Function: The b f , b i , b o , and b c play the role of bias vectors. The U f , U i , U o , and U c are weight matrices that form the connection between the input cell state and the previous cell output state. W f , W i , W o , and W c are weight matrices of the forget gate. The network uses sigmoid (δ g ), which acts as the gate activation function [55]. The cell output c t and output h t are computed at each iteration t as follows: Mathematics 2022, 10, x FOR PEER REVIEW 10 of 2 The , , , and play the role of bias vectors. The , , , and are weigh matrices that form the connection between the input cell state and the previous cell outpu state. , , , and are weight matrices of the forget gate. The network uses sigmoi ( ), which acts as the gate activation function [55]. The cell output and output ℎ ar computed at each iteration as follows:   Figure 8 shows how the data are processed in both directions with separate hidden layers. All the data input features go through BiLSTM layers in the network architecture. At the final stage, prediction output is given as Y t = [. . . y t−1 ,y t , y t+1 . . .]. The y t value is found using the merge mode as:

Objective Model Development: Stacked Bidirectional Long Short-Term Memory (BiLSTM)
Model parameters are selected using a grid-search technique, which provides optimum values for the BiLSTM model. Figure 9 and Table 4 show the tuning of parameters using grid-search and stacked BiLSTM deep learning architecture in Python during the modeling process.

Benchmark Models
The benchmark models have been widely used in predictions for many real-life ap plications. A robust ensemble method which combined random forest and gradient boost ing to predict design routability and design rule checking (DRC) was used in [56]. In an other study, ML algorithms are used to combine ensemble and heuristic greedy searc methods for identifying design constraints and DRC [57]. SVR is regarded as an efficien ML model and shows high determination coefficients in [58] to predict evaporatio amounts. One study [59] found high estimation results using MLR for the prediction o organic potato yield using tillage systems and soil properties. These models provide sound comparative platform for the stacked BiLSTM model for the evaluation of perfor mance in this study.

Support Vector Regression (SVR)
The support vector (SV) algorithm [60] based on non-linear generalization of the gen eralized portrait algorithm [60,61] was developed by Vapnik and co-workers. While it i generally used for classification, it can also be effectively applied for regression (SVR problems. The regression version was also developed by Vapnik, Steven Golowich, and Alex Smola in 1997 [62]. The SVR algorithm uses a symmetrical loss function to train th dataset. The SVR computation complexity has the advantage of non-dependence on th dimensionality of the input space [63,64]

Benchmark Models
The benchmark models have been widely used in predictions for many real-life applications. A robust ensemble method which combined random forest and gradient boosting to predict design routability and design rule checking (DRC) was used in [56]. In another study, ML algorithms are used to combine ensemble and heuristic greedy search methods for identifying design constraints and DRC [57]. SVR is regarded as an efficient ML model and shows high determination coefficients in [58] to predict evaporation amounts. One study [59] found high estimation results using MLR for the prediction of organic potato yield using tillage systems and soil properties. These models provide a sound comparative platform for the stacked BiLSTM model for the evaluation of performance in this study.

Support Vector Regression (SVR)
The support vector (SV) algorithm [60] based on non-linear generalization of the generalized portrait algorithm [60,61] was developed by Vapnik and co-workers. While it is generally used for classification, it can also be effectively applied for regression (SVR) problems. The regression version was also developed by Vapnik, Steven Golowich, and Alex Smola in 1997 [62]. The SVR algorithm uses a symmetrical loss function to train the dataset. The SVR computation complexity has the advantage of non-dependence on the dimensionality of the input space [63,64].

Adaptive Boosting Regressor (AdaBoost)
The adaptive boosting algorithm was first introduced by Freund and Schapire [65] in 1995. It is based on meta-estimation, where the dataset is fitted with a regressor. Following this, additional copies of the regressor is also fitted on the original dataset with an adjustment of the weights of instances with respect to predictor error [66].

Multilinear Regression (MLR)
Multilinear regression is a statistical technique, where many explanatory variables are used to predict a response variable [67]. It examines how multiple independent variables are related to a single dependent variable. The MLR assumptions are based on normal distribution, linearity, and freedom from extreme values, and observations are independent [68].

The Performance Evaluation Metrics for AI Models
All the models were evaluated using eight statistical metrics as given by Equations (12)-(19) on the testing dataset.
Relative Root Mean Square Error (RRMSE) where DS i -simulated data, DO i -observed data. Figure 10 shows the steps involved in the preprocessing and VLM correction for absolute sea level. The initial block is critical for accurate and reliable results as key aspects of dataset preparation, such as the stationarity test, creation of lags, data normalization, and correlation analysis. Data decomposition into its intrinsic modes is an essential statistical step of the time series dataset, as it helps to extract features such as seasonality and trends. The Boruta random forest optimizer selects the significant features for the AI models. All preprocessed datasets are fed into the AI models, and the prediction output data are denormalized for conversion to actual sea level values.

Results
The preprocessed time series dataset with GNSS correction of VLM were divided into three sets, namely, training, validation, and testing. These were then put into the AI models as feature inputs and target variables. The prediction model was developed in each case for sea level prediction. The evaluation was divided in two parts, firstly, the efficiency metrics measuring the accuracy of the models, and secondly, error metrics for predicted

Results
The preprocessed time series dataset with GNSS correction of VLM were divided into three sets, namely, training, validation, and testing. These were then put into the AI models as feature inputs and target variables. The prediction model was developed in each case for sea level prediction. The evaluation was divided in two parts, firstly, the efficiency metrics measuring the accuracy of the models, and secondly, error metrics for predicted values. The correlation coefficient (r) is a widely used statistical measure that determines the relationship between two variables [69]. It indicates the strength of association, and in this case, the degree of association between the observed and predicted sea level values for all models. Wilmott's Index [70] is a standardized dimensionless measure that indicates the ratio of the mean of square error and the potential error. It detects the additive and proportional differences in the means and variances between observed and predicted values [70,71]. Nash-Sutcliffe (NS) [72] is a normalized metric that provides model accuracy with goodness of fit [73,74]. The index uses three quantities, namely, measured values, means of measured values, and the predicted values [73]. The Legates and McCabe Index [75] is considered to be a more advanced and efficient statistical index, where the adjustment of comparisons in the evaluation of WI is utilized [76]. There are four error metrics utilized for analyses of all models at both study sites, namely, root mean square error (RMSE), mean absolute error (MABE), relative root mean square error (RRMSE), and mean absolute percentage error (MAPE).

Objective and Benchmark Model Results for Kiribati and Tuvalu
The model performance and error metrics are shown in Tables 5-8.

Scatterplot with Correlation and Histogram Error Results for Kiribati and Tuvalu Models
A scatterplot is considered as one of the most powerful data analysis tools [77]. It is a plot of two variables (observed and predicted) to show the association between them. This technique emerged with the need to examine bivariate relationships between separate variables directly [78]. The linear regression fit with a coefficient of determination (r 2 ) further adds to more information on the scatterplot about the behavior between the variables in the study [79]. The stacked BiLSTM as the objective model for this study, as shown in Figure 11, shows more compactness in the scatter of points between observed and predicted values, indicating that higher accuracy and higher r 2 further supports this graphical representation. The MLR plot shows wider scattering with the lowest r 2 value, indicating lower model accuracy.

Scatterplot with Correlation and Histogram Error Results for Kiribati and Tuvalu Models
A scatterplot is considered as one of the most powerful data analysis tools [77]. It is a plot of two variables (observed and predicted) to show the association between them. This technique emerged with the need to examine bivariate relationships between separate variables directly [78]. The linear regression fit with a coefficient of determination (r 2 ) further adds to more information on the scatterplot about the behavior between the variables in the study [79]. The stacked BiLSTM as the objective model for this study, as shown in Figure 11, shows more compactness in the scatter of points between observed and predicted values, indicating that higher accuracy and higher r 2 further supports this graphical representation. The MLR plot shows wider scattering with the lowest r 2 value, indicating lower model accuracy. A histogram is another widely used and common display chart in scientific study and coined by famous statistician Karl Pearson [80]. It is constructed using attributes (absolute prediction error) by partitioning the data distribution into "buckets" or "bins" with their frequency in the allocated ranges [80,81]. It is an effective graphical method to evaluate the performance of model prediction, as it shows the frequency of prediction error and its distribution within the "bins". The stacked BiLSTM model plot in Figure 12 shows the bars in the lower bracket indicating lesser amounts of absolute prediction error. On the contrary, other benchmark models in Figure 11 have more bars extending to the right, indicating higher absolute prediction error. A histogram is another widely used and common display chart in scientific study and coined by famous statistician Karl Pearson [80]. It is constructed using attributes (absolute prediction error) by partitioning the data distribution into "buckets" or "bins" with their frequency in the allocated ranges [80,81]. It is an effective graphical method to evaluate the performance of model prediction, as it shows the frequency of prediction error and its distribution within the "bins". The stacked BiLSTM model plot in Figure 12 shows the bars in the lower bracket indicating lesser amounts of absolute prediction error. On the contrary, other benchmark models in Figure 11 have more bars extending to the right, indicating higher absolute prediction error.

Objective and Benchmark Model Performance Evaluation
As shown in Figure 13, all efficiency metrics discussed previously show superior performance by the stacked BiLSTM model for both study sites. It attains high performance values of 0.994207, 0.994079, 0.988219, and 0.899868 for the correlation coefficient, Wilmott's Index, the Nash-Sutcliffe Index, and the Legates-McCabe Index, respectively, for Kiribati. The SVR model has also performed well as a benchmark model attaining values of 0.988909, 0.987155, 0.974866, and 0.852321 for the correlation coefficient, Wilmott's Index, the Nash-Sutcliffe Index, and the Legates-McCabe Index, respectively. MLR performed very poorly at both study locations with values of 0.758946, 0.672739, 0.368981, and 0.217703 for Kiribati. Compared with other models, this is also similar for Tuvalu, as shown in Figure 14

Objective and Benchmark Model Performance Evaluation
As shown in Figure 13, all efficiency metrics discussed previously show superior performance by the stacked BiLSTM model for both study sites. It attains high performance values of 0.994207, 0.994079, 0.988219, and 0.899868 for the correlation coefficient, Wilmott's Index, the Nash-Sutcliffe Index, and the Legates-McCabe Index, respectively, for Kiribati. The SVR model has also performed well as a benchmark model attaining values of 0.988909, 0.987155, 0.974866, and 0.852321 for the correlation coefficient, Wilmott's Index, the Nash-Sutcliffe Index, and the Legates-McCabe Index, respectively. MLR performed very poorly at both study locations with values of 0.758946, 0.672739, 0.368981, and 0.217703 for Kiribati. Compared with other models, this is also similar for Tuvalu, as shown in Figure 14

Objective and Benchmark Model Error Evaluation
To support the performance metrics above, error calculations were done for each model with their prediction results. Error evaluation is a significant aspect of model evaluation, as shown in many past studies using AI models for prediction [33,45]. Four error metrics are computed in this study for error analysis. Figures 15 and 16

Objective and Benchmark Model Error Evaluation
To support the performance metrics above, error calculations were done for each model with their prediction results. Error evaluation is a significant aspect of model evaluation, as shown in many past studies using AI models for prediction [33,45]. Four error metrics are computed in this study for error analysis. Figures 15 and 16

Objective and Benchmark Model Error Evaluation
To support the performance metrics above, error calculations were done for each model with their prediction results. Error evaluation is a significant aspect of model evaluation, as shown in many past studies using AI models for prediction [33,45]. Four error metrics are computed in this study for error analysis. Figures 15 and 16 Figure 17 shows a time series comparison of all model results between 23 December 2021 and 31 December 2021. The data points show the tracking of observed and predicted data points within the testing phase of modeling. All AI models are able to track the observed data; however, stacked BiLSTM shows the best fit, and all predicted points overlap, indicating greater accuracy than all benchmark models. The MLR model shows most deviation between the observed and predicted points. AdaBoost and SVR also show better tracking of the GNSS-VLM sea level than the MLR model with the predicted dataset.   Figure 17 shows a time series comparison of all model results between 23 December 2021 and 31 December 2021. The data points show the tracking of observed and predicted data points within the testing phase of modeling. All AI models are able to track the observed data; however, stacked BiLSTM shows the best fit, and all predicted points overlap, indicating greater accuracy than all benchmark models. The MLR model shows most deviation between the observed and predicted points. AdaBoost and SVR also show better tracking of the GNSS-VLM sea level than the MLR model with the predicted dataset.  Figure 17 shows a time series comparison of all model results between 23 December 2021 and 31 December 2021. The data points show the tracking of observed and predicted data points within the testing phase of modeling. All AI models are able to track the observed data; however, stacked BiLSTM shows the best fit, and all predicted points overlap, indicating greater accuracy than all benchmark models. The MLR model shows most deviation between the observed and predicted points. AdaBoost and SVR also show better tracking of the GNSS-VLM sea level than the MLR model with the predicted dataset.  Figure 18 shows the annual sea level average with VLM correction with 2 per movin average, which shows an increasing trend based on the analysis of the tide gauge datase located in Tarawa, Kiribati. The GNSS-VLM measurement was adjusted from each obser vation for the analysis. The annual rate of increase is estimated to be 2.1 mm/year, an with a linear projection, the sea level is expected to reach close to 1.75 m in 2040. This an estimated increase from 1.67 m at an average of 80 mm. According to [82], there is significant increase in the extreme water level according to Betio tide gauge data sinc 1984 (>2.8 m above datum) due to decadal variability in the frequency of "Central Pacific El Niño events. Furthermore, it states that sea level variability could increase by 20-4 mm by the end of century due to the evolution of ENSO dynamics and including othe factors. In another study of Tarawa in Kiribati [83], a sea level rise rate trend of 3.9 mm per year was found from 1992 to 2008.  Figure 18 shows the annual sea level average with VLM correction with 2 per moving average, which shows an increasing trend based on the analysis of the tide gauge dataset located in Tarawa, Kiribati. The GNSS-VLM measurement was adjusted from each observation for the analysis. The annual rate of increase is estimated to be 2.1 mm/year, and with a linear projection, the sea level is expected to reach close to 1.75 m in 2040. This is an estimated increase from 1.67 m at an average of 80 mm. According to [82], there is a significant increase in the extreme water level according to Betio tide gauge data since 1984 (>2.8 m above datum) due to decadal variability in the frequency of "Central Pacific" El Niño events. Furthermore, it states that sea level variability could increase by 20-40 mm by the end of century due to the evolution of ENSO dynamics and including other factors. In another study of Tarawa in Kiribati [83], a sea level rise rate trend of 3.9 mm per year was found from 1992 to 2008.  Figure 19 shows the annual average GNSS-VLM-corrected sea level trend for Tuvalu. The rate of rise is higher in the case of Tuvalu, with an annual average rate rise of 3.9 m/year. The linear projection shows an estimate rise to 2.16 m for 2040 with an increase of 150 mm from 2.01 m. Other studies, such as Donner and Webber [84], of sea level rise in Tuvalu, with a 16-year TG data analysis from 2008, found a higher average rate rise of 5.9 mm/year. The higher values can be attributed to a different timeline considered as this study considers the dataset from 2001 to 2021 and the consideration of GNSS-VLM TG adjustment in sea level observations.

Conclusions
Based on the implementation of artificial intelligence models, including new deep learning analysis with data decomposition, it can be clearly seen with the comprehensive evaluation of all AI models shown that the GNSS-VLM-corrected sea level trend can be  Figure 19 shows the annual average GNSS-VLM-corrected sea level trend for Tuvalu. The rate of rise is higher in the case of Tuvalu, with an annual average rate rise of 3.9 m/year. The linear projection shows an estimate rise to 2.16 m for 2040 with an increase of 150 mm from 2.01 m. Other studies, such as Donner and Webber [84], of sea level rise in Tuvalu, with a 16-year TG data analysis from 2008, found a higher average rate rise of 5.9 mm/year. The higher values can be attributed to a different timeline considered as this study considers the dataset from 2001 to 2021 and the consideration of GNSS-VLM TG adjustment in sea level observations.  Figure 19 shows the annual average GNSS-VLM-corrected sea level trend for Tuvalu. The rate of rise is higher in the case of Tuvalu, with an annual average rate rise of 3.9 m/year. The linear projection shows an estimate rise to 2.16 m for 2040 with an increase of 150 mm from 2.01 m. Other studies, such as Donner and Webber [84], of sea level rise in Tuvalu, with a 16-year TG data analysis from 2008, found a higher average rate rise of 5.9 mm/year. The higher values can be attributed to a different timeline considered as this study considers the dataset from 2001 to 2021 and the consideration of GNSS-VLM TG adjustment in sea level observations.

Conclusions
Based on the implementation of artificial intelligence models, including new deep learning analysis with data decomposition, it can be clearly seen with the comprehensive evaluation of all AI models shown that the GNSS-VLM-corrected sea level trend can be

Conclusions
Based on the implementation of artificial intelligence models, including new deep learning analysis with data decomposition, it can be clearly seen with the comprehensive evaluation of all AI models shown that the GNSS-VLM-corrected sea level trend can be predicted with high accuracy. The objective stacked BiLSTM model outperformed all benchmark models in the study with greater accuracy. The trend assessment and linear forecast with the new GNSS-VLM-corrected average values provides insight into the highly debated issue of sea level rise with climate change in small island countries. With use of a recent dataset and improved accuracy of predicted rise, stakeholders can make better decisions on adaptation and mitigation strategies for Kiribati and Tuvalu. There is no doubt that the sea level is rising and that the extent and nature of its impact is different across the world. As seen in the analysis of this study, the expected rate of rise in Kiribati (2.1 mm/year) is not the same as in Tuvalu (3.9 mm/year) despite being in the South Pacific region. This information needs to be considered with other factors such as the topological changes in the land area to assess how the impacts may occur on the atoll islands with their vulnerability to flooding and inundation. Each island state has its own context of geographic, topographic, and cultural variety which must be assessed for real impacts of sea level rise [85]. This can be a future scope for the extension of this study.