Investigation of Physics-Informed Methods for Improving Sea Surface Height Prediction Based on Neural Networks in the South China Sea

Huang, Linxiao; Shu, Yeqiang; Yao, Jinglong; Liu, Danian

doi:10.3390/rs17233838


¹ State Key Laboratory of Tropical Oceanography, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, China

² University of Chinese Academy of Sciences, Beijing 100049, China

³ High Impact Weather Key Laboratory, China Meteorological Administration (CMA), Changsha 410005, China

⁴ China-Sri Lanka Belt and Road Joint Laboratory on Tropical Oceanography, Chinese Academy of Sciences, Guangzhou 510301, China

* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(23), 3838; https://doi.org/10.3390/rs17233838 (registering DOI)

Submission received: 17 September 2025 / Revised: 31 October 2025 / Accepted: 26 November 2025 / Published: 27 November 2025


Highlights

What are the main findings?

The proposed Physics-Informed methods improve the prediction accuracy of the SimVPv2 model by 13% (RMSE decrease from 0.0198 m to 0.0173 m).
Incorporating land mask information into the inputs is proposed and shown to be a simple yet effective way to enhance AI model performance (especially for PredRNNv2) on oceanic data.
Seasonal and spatial analyses, together with a case study, examine how the geostrophic constraint improves model performance and relate the improvement to the underlying physical oceanography.

What are the implications of the main findings?

This study demonstrates that integrating a latitude-weighted geostrophic constraint enhances both physical consistency and prediction accuracy in data-driven ocean models.
To the best of our knowledge, the issue of artifacts in AI models caused by land points in oceanic data is identified here for the first time, and we provide a simple yet effective solution.
As far as we know, the limitation of directly concatenating heterogeneous oceanic data as inputs for AI models is also revealed here for the first time.

Abstract

Sea surface height (SSH) derived from satellite altimetry is essential for oceanographic research and marine monitoring. Although artificial intelligence (AI) models show considerable potential in forecasting, their application in oceanography remains constrained by several limitations. To address these challenges, we propose a set of physics-informed methods to improve SSH prediction based on neural networks in the South China Sea (SCS). The key strategies include: (1) incorporating land mask information to mitigate artifacts induced by the presence of land in marine data; (2) introducing a geostrophic constraint into the loss function; and (3) applying latitude-dependent weighting to this constraint to account for the breakdown of geostrophic balance near the equator. On the test dataset, the physics-informed SimVPv2 (Phys-SV) model achieves an RMSE of 0.0173 m, a 13% improvement over the baseline SimVPv2 (Base-SV). The PredRNNv2 (PR) model also benefits significantly from the inclusion of land mask input, with RMSE reduced by 12% (from 0.0280 m to 0.0246 m). To the best of our knowledge, this study is the first to identify the artifact issue in AI models caused by land points in ocean data and to reveal the limitations of directly concatenating heterogeneous oceanic variables as model inputs.

Keywords:

sea surface height forecasting; physics-informed neural network; geostrophic constraint; deep learning; remote sensing; South China Sea

1. Introduction

As a critical indicator of mesoscale oceanic dynamical processes, sea surface height (SSH) is routinely used to map mesoscale circulation features [1,2] and quantify the dynamics of mesoscale eddies [3,4], which contribute to oceanic mass transport comparable in magnitude to that of the large-scale wind- and thermohaline-driven circulation [5]. Given this scientific importance, the ability to accurately forecast SSH is of critical operational and engineering significance. Timely and reliable SSH forecasts provide the basis for deriving surface currents, which are essential for a wide range of maritime activities. In the offshore energy sector, precise knowledge of future current and eddy locations is indispensable for the safe execution of sensitive operations, such as the installation and maintenance of oil rigs and offshore wind turbines [6]. Therefore, developing models that can deliver high-accuracy SSH forecasts is a key objective in modern operational oceanography.

Approaches to forecasting SSH can be broadly categorized into three groups: numerical models, statistical methods, and data-driven deep learning models. Numerical models, such as the Hybrid Coordinate Ocean Model (HYCOM) [7] and the Regional Ocean Modeling System (ROMS) [8,9], have long been applied to forecast SSH. These models solve the fundamental hydrodynamic equations to simulate oceanic states and are physically comprehensive. However, they require complex parameterization of sub-grid-scale processes, and their accuracy is highly sensitive to initial and boundary conditions. These limitations have motivated the search for alternative approaches. Statistical methods, such as autoregressive models, offer a computationally cheaper alternative but are often based on linear assumptions, limiting their ability to capture the complex, non-linear dynamics inherent in ocean systems [10].

The rapid development of artificial intelligence (AI) has revolutionized spatiotemporal forecasting in oceanography, with deep learning models emerging as particularly powerful tools. These data-driven approaches have demonstrated remarkable success across various oceanographic applications. Convolutional neural networks (CNNs) have proven effective for El Niño-Southern Oscillation (ENSO) prediction [11], while the integration of multivariate empirical orthogonal function (MEOF) analysis with one-dimensional convolutional long short-term memory (Conv1D-LSTM) networks has shown promising results for multi-variable sea surface forecasting [12]. Innovative adaptations of vision Transformers (ViT) with self-attention mechanisms have enabled three-dimensional multivariate modeling for enhanced ENSO prediction [13].

Despite these advances, fundamental limitations persist in purely data-driven approaches. The inherent “black box” nature of these models raises concerns about physical consistency in their predictions [14], particularly when extrapolating beyond the temporal scope of training data or processing noisy observational inputs. This limitation has motivated the development of Physics-Informed Neural Networks (PINNs), which embed physical laws into the learning process through penalty terms that quantify violations of governing equations [15]. The PINN framework has shown considerable promise across various oceanographic applications, including tropical cyclone field reconstruction [16], three-dimensional thermohaline modeling in the tropical Pacific [17], and improved air–sea flux parameterizations [18].

Among the fundamental principles of ocean dynamics, the geostrophic balance provides a robust first-order approximation for large-scale, low-frequency ocean circulation. This balance, which describes an equilibrium between the Coriolis force and the pressure gradient force, is known to govern the circulation in many regions of the world’s oceans [3]. The SCS is one such region, where the large-scale circulation is predominantly in geostrophic balance [19,20]. A key practical advantage of using a geostrophic constraint is its elegance and efficiency: it relates the sea level gradient to the velocity field, meaning the constraint can be formulated using only SSH data, without requiring external variables like wind forcing or in situ velocity measurements.

Although numerous studies have explored the use of AI models for forecasting SSH [21,22,23,24], few have integrated these models with physical laws. To our knowledge, only one study has incorporated the geostrophic constraint into an AI framework to predict sea surface currents [25]. As such, the potential of PINNs for SSH forecasting remains largely untapped.

In this study, we propose physics-informed methods to enhance the AI model’s accuracy for ten-day SSH forecasting in the SCS. A latitude-weighted geostrophic constraint is embedded into the loss function, along with the incorporation of mask information, to further enhance model performance. Our primary objective is to demonstrate that these physics-informed approaches improve both forecast accuracy and physical consistency compared to a purely data-driven baseline. In addition to extensive experiments validating the model’s improvement, we conduct a comprehensive analysis of its performance across different seasons, forecast lead times, and bathymetric regimes. A case study also provides a concrete example of how the physics-informed methods affect the forecast SSH field. The analyses aim to quantify the benefits and limitations of applying the geostrophic constraint in this dynamically complex region.

2. Data and Methods

2.1. Data

This study employs daily mean absolute dynamic topography data (defined as the sea surface height above the geoid and hereafter referred to as SSH) obtained from the Copernicus Marine Environment Monitoring Service (CMEMS). The dataset has a spatial resolution of 1/8° × 1/8° and incorporates multi-satellite altimeter observations. It has undergone tidal correction and mean dynamic topography processing to ensure data quality. In addition, the 1/4° × 1/4° daily (averaged per 6 h) ERA5 wind data obtained from the Copernicus Climate Change Service (C3S) are interpolated to 1/8° × 1/8° and used in the analyses.

As shown in Figure 1, the study domain (2°N–22°N, 104°E–124°E) encompasses the SCS basin and adjacent Luzon Strait, corresponding to a 160 × 160 grid. This region captures critical dynamical features, including the Kuroshio intrusion through the Luzon Strait, which significantly modulates SCS circulation patterns [26] and consequently influences SSH variability across the basin. The dataset covers the period from 1 January 1993 to 13 June 2024 and is divided into three subsets: a training set (1993–2021), a validation set (2022), and a test set (2023–13 June 2024). The validation set is used for hyperparameter tuning and monitoring the training process, while the test set serves as an independent dataset to evaluate the model’s predictive performance. Notably, 2022 was a La Niña year, whereas 2023 transitioned to an El Niño year. Previous studies have demonstrated that ENSO signals can influence SCS circulation and SSH variability through processes such as the Luzon Strait water exchange [27]. This interannual variability introduces additional challenges for model predictions but provides a more rigorous assessment of the model’s generalization capability.

2.2. Model Structure

The SimVPv2 model employed in this study is a purely convolutional architecture that efficiently captures spatiotemporal coupling relationships through a gated spatiotemporal attention (gSTA) mechanism. Compared to conventional spatiotemporal prediction models (e.g., ConvLSTM [28], PredRNN [29]), SimVPv2 offers a simpler structure, higher computational efficiency, and better prediction accuracy on multiple benchmark datasets [30]. These characteristics make it particularly suitable for modeling complex oceanographic data.

Designed for sequential prediction of two-dimensional spatial variables, the SimVPv2 architecture consists of three primary components: (1) Spatial Encoder, (2) Spatiotemporal Translator, and (3) Spatial Decoder. Similar to U-Net, the model incorporates skip connections between the initial encoder layers and final decoder layers to preserve original features. Figure 2 illustrates the overall structure of SimVPv2, and a more detailed description is given in Appendix A.

For an input tensor of dimensions (B, T, C, H, W), where B denotes the batch size (the number of samples processed together), T denotes the number of input days, C denotes the number of channels, H denotes the image height, and W denotes the image width, the model first flattens the temporal dimension into the batch dimension, giving (B \times T, C, H, W). After spatial encoding, the tensor (B \times T, C, H^{f}, W^{f}) undergoes channel-temporal folding to (B, T \times C, H^{f}, W^{f}), where H^{f} = H / 2^{N_{s}} and W^{f} = W / 2^{N_{s}}. The translator’s operations on this restructured tensor enable simultaneous learning of spatial and temporal relationships through depthwise spatial attention and channel-wise convolutions.

As shown in Figure 2, we adopt a 10-day sequence of SSH fields as inputs to predict the subsequent 10-day SSH fields, so that both the inputs and the outputs have dimensions (10, 1, 160, 160). Given the relatively small grid (160 × 160), we minimize information loss during downsampling and upsampling by limiting the encoder layers (N_{s}) to 2, performing only one downsampling and one upsampling operation. Within the attention module, we employ dilated convolutions with a dilation rate of 2 (skipping every other grid point) and an effective kernel size of 21 to capture broader spatial dependencies.
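The reshaping steps described above can be traced with a small shape-bookkeeping sketch. It is only an illustration: the function name and the `n_down` parameter (the number of stride-2 downsampling steps) are introduced here, and `n_down=1` follows the statement that a single downsampling operation is performed. The channel count is kept as C, as in the text, although a real encoder may also change it.

```python
def simvp_shapes(b, t, c, h, w, n_down=1):
    """Trace tensor shapes through the encoder/translator reshapes.

    n_down is a hypothetical parameter: the number of stride-2
    downsampling steps applied by the spatial encoder.
    """
    factor = 2 ** n_down
    if h % factor or w % factor:
        raise ValueError("H and W must be divisible by 2**n_down")
    flattened = (b * t, c, h, w)                    # time folded into batch
    encoded = (b * t, c, h // factor, w // factor)  # after spatial encoding
    folded = (b, t * c, h // factor, w // factor)   # time folded into channels
    return flattened, encoded, folded


# Batch of 4 samples, 10 input days, 1 channel, 160 x 160 grid.
flattened, encoded, folded = simvp_shapes(b=4, t=10, c=1, h=160, w=160)
print(flattened, encoded, folded)
```

Folding time into the channel dimension is what lets purely convolutional layers in the translator mix information across days and across space at once.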

2.3. Strategies

2.3.1. Geostrophic Constraint in SSH Prediction

In physical oceanography, the geostrophic balance is a fundamental dynamical approximation in which the Coriolis force is balanced by the horizontal pressure gradient force. Owing to its clear physical meaning and relatively simple mathematical form, it is widely used to estimate large-scale oceanic currents from sea surface height (SSH) observations.

Under the f-plane approximation, where the Coriolis parameter is assumed constant, the geostrophic balance can be expressed as:

f u_{g} = - g \frac{\partial ζ}{\partial y},

(1)

f v_{g} = g \frac{\partial ζ}{\partial x},

(2)

where (u_{g}, v_{g}) are the zonal and meridional components of the geostrophic velocity, g is the gravitational acceleration (taken as 9.81 m s⁻²), ζ represents the sea surface height (SSH), and f is the Coriolis parameter, defined as:

f = 2 Ω \sin(ϕ),

(3)

Here, Ω is the Earth’s angular velocity (taken as 7.2921 × 10⁻⁵ rad/s), and ϕ is the latitude.

Solving Equations (1) and (2) for the velocity components yields:

u_{g} = - \frac{g}{f} \frac{\partial ζ}{\partial y},

(4)

v_{g} = \frac{g}{f} \frac{\partial ζ}{\partial x},

(5)
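As a worked illustration of Equations (3) to (5), the sketch below converts an SSH slope into geostrophic velocity using the constants quoted above. The function names and the example slope (0.1 m over 100 km) are assumptions chosen for illustration, not values from the study.

```python
import math

G = 9.81            # gravitational acceleration, m s^-2
OMEGA = 7.2921e-5   # Earth's angular velocity, rad s^-1


def coriolis(lat_deg):
    """Coriolis parameter f = 2 * Omega * sin(latitude), Equation (3)."""
    return 2.0 * OMEGA * math.sin(math.radians(lat_deg))


def geostrophic_velocity(dzeta_dx, dzeta_dy, lat_deg):
    """Return (u_g, v_g) in m/s from SSH slopes given in metres per metre."""
    f = coriolis(lat_deg)
    u_g = -G / f * dzeta_dy   # Equation (4)
    v_g = G / f * dzeta_dx    # Equation (5)
    return u_g, v_g


# Hypothetical slope: SSH rises 0.1 m over 100 km towards the east.
slope = 0.1 / 100e3
u_g, v_g = geostrophic_velocity(dzeta_dx=slope, dzeta_dy=0.0, lat_deg=15.0)
_, v_low = geostrophic_velocity(dzeta_dx=slope, dzeta_dy=0.0, lat_deg=2.0)
print(f"v_g at 15N: {v_g:.3f} m/s, v_g at 2N: {v_low:.3f} m/s")
```

The same slope gives a much larger velocity at 2°N than at 15°N because f is small near the equator, which is the amplification that motivates the latitude weighting in Section 2.3.2.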

In the training of our SSH prediction model, both the inputs and outputs are exclusively SSH fields. The primary loss function is the Mean Squared Error (MSE) between the predicted and target SSH:

\mathrm{loss}_{SSH} = \mathrm{MSE}(ζ_{pred}, ζ_{target}),

(6)

Here, ζ_{pred} and ζ_{target} denote the predicted SSH and the target SSH from the training dataset for the corresponding date, respectively. The MSE is calculated as:

\mathrm{MSE}(x, y) = \frac{1}{N} \sum_{i = 1}^{N} (x_{i} - y_{i})^{2},

(7)

where N is the total number of valid data points (excluding the land grids). By computing the loss only over valid data points, we aim to eliminate the interference in model training caused by artifacts from land points.
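A minimal sketch of the masked MSE in Equation (7) follows, assuming the land/ocean information is available as a binary mask (1 for ocean, 0 for land). The function name and the toy values are hypothetical.

```python
def masked_mse(pred, target, mask):
    """MSE over valid (ocean) points only; mask is 1 for ocean, 0 for land."""
    total, n_valid = 0.0, 0
    for p_row, t_row, m_row in zip(pred, target, mask):
        for p, t, m in zip(p_row, t_row, m_row):
            if m:
                total += (p - t) ** 2
                n_valid += 1
    if n_valid == 0:
        raise ValueError("mask contains no valid points")
    return total / n_valid


pred = [[0.10, 0.00], [0.30, 0.40]]
target = [[0.20, 9.99], [0.30, 0.20]]   # 9.99 is a placeholder on a land cell
mask = [[1, 0], [1, 1]]
print(masked_mse(pred, target, mask))
```

Because the land cell is skipped, whatever fill value it holds has no effect on the loss.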

To incorporate the geostrophic balance into the model’s loss function, a geostrophic constraint loss is introduced. First, the SSH spatial gradient fields of predictions and targets are computed using a Sobel operator. Subsequently, the geostrophic velocity components (u_{pred}, v_{pred}, u_{target}, v_{target}) are derived from the SSH spatial gradient fields as described in Equations (4) and (5). The calculated geostrophic velocities are divided by the standard deviation of the CMEMS geostrophic velocity data (σ_{u}, σ_{v}; 1993 to 2021) and multiplied by that of the CMEMS SSH data (σ_{SSH}; 1993 to 2021). This step rescales the computed geostrophic velocities to the magnitude of SSH, so that the loss terms are balanced and none of them dominates solely because of its larger magnitude. Finally, the MSE between the predicted and target geostrophic velocities forms the geostrophic loss term:

\mathrm{loss}_{u} = \mathrm{MSE}(u_{pred} / σ_{u} \cdot σ_{SSH}, u_{target} / σ_{u} \cdot σ_{SSH}),

(8)

\mathrm{loss}_{v} = \mathrm{MSE}(v_{pred} / σ_{v} \cdot σ_{SSH}, v_{target} / σ_{v} \cdot σ_{SSH}),

(9)

\mathrm{loss}_{geo} = \mathrm{loss}_{u} + \mathrm{loss}_{v},

(10)

The total loss function for training the model is a linear combination of the SSH prediction loss and the geostrophic velocity loss:

\mathrm{loss}_{total} = \mathrm{loss}_{SSH} + λ \mathrm{loss}_{geo},

(11)

where λ is the geostrophic constraint coefficient. A larger value of λ imposes a stronger geostrophic constraint, whereas a smaller value signifies a weaker constraint.
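The loss construction in Equations (6) and (8) to (11) can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the training code: rows are taken to run south to north, the grid spacings `dx` and `dy` are constant and given in metres, gradients are evaluated at interior points only, no land points are present, and the standard deviations `sigma_u`, `sigma_v`, and `sigma_ssh` are placeholder numbers.

```python
import math

G, OMEGA = 9.81, 7.2921e-5
SOBEL_WEIGHTS = ((-1, 1), (0, 2), (1, 1))   # (offset, weight) pairs


def sobel_gradients(zeta, dx, dy):
    """Sobel estimates of d(zeta)/dx and d(zeta)/dy at interior points.

    Dividing by 8 normalises the Sobel weights so that a linear field
    returns its exact slope.
    """
    ny, nx = len(zeta), len(zeta[0])
    gx = [[0.0] * (nx - 2) for _ in range(ny - 2)]
    gy = [[0.0] * (nx - 2) for _ in range(ny - 2)]
    for i in range(1, ny - 1):
        for j in range(1, nx - 1):
            sx = sum(w * (zeta[i + k][j + 1] - zeta[i + k][j - 1])
                     for k, w in SOBEL_WEIGHTS)
            sy = sum(w * (zeta[i + 1][j + k] - zeta[i - 1][j + k])
                     for k, w in SOBEL_WEIGHTS)
            gx[i - 1][j - 1] = sx / (8.0 * dx)
            gy[i - 1][j - 1] = sy / (8.0 * dy)
    return gx, gy


def geostrophic_uv(zeta, lats, dx, dy):
    """Geostrophic velocities (Equations (4) and (5)) at interior points."""
    gx, gy = sobel_gradients(zeta, dx, dy)
    u, v = [], []
    for i, (row_x, row_y) in enumerate(zip(gx, gy)):
        f = 2.0 * OMEGA * math.sin(math.radians(lats[i + 1]))
        u.append([-G / f * d for d in row_y])
        v.append([G / f * d for d in row_x])
    return u, v


def mse(a, b):
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def rescale(field, sigma, sigma_ssh):
    """Divide by the velocity std and multiply by the SSH std."""
    return [[x / sigma * sigma_ssh for x in row] for row in field]


def total_loss(pred, target, lats, dx, dy, sigma_u, sigma_v, sigma_ssh, lam):
    """loss_SSH + lam * (loss_u + loss_v)."""
    u_p, v_p = geostrophic_uv(pred, lats, dx, dy)
    u_t, v_t = geostrophic_uv(target, lats, dx, dy)
    loss_u = mse(rescale(u_p, sigma_u, sigma_ssh), rescale(u_t, sigma_u, sigma_ssh))
    loss_v = mse(rescale(v_p, sigma_v, sigma_ssh), rescale(v_t, sigma_v, sigma_ssh))
    return mse(pred, target) + lam * (loss_u + loss_v)


# Toy 4 x 4 field with a pure eastward SSH slope (all numbers illustrative).
lats = [10.0, 10.125, 10.25, 10.375]
dx = dy = 13.9e3
target = [[1e-6 * dx * j for j in range(4)] for _ in range(4)]
pred = [[0.9 * z for z in row] for row in target]
loss = total_loss(pred, target, lats, dx, dy,
                  sigma_u=0.2, sigma_v=0.2, sigma_ssh=0.1, lam=0.7)
print(loss)
```

In this toy case the prediction underestimates the slope, so the geostrophic term adds a penalty on top of the plain SSH error.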

2.3.2. Latitude-Weighted Loss

The geostrophic constraint in our model is based on the geostrophic velocity equations (Equations (4) and (5)). A direct application of these equations is problematic at low latitudes, as the Coriolis parameter f in the denominator approaches zero, leading to an over-amplification of the geostrophic loss term where the geostrophic balance is inherently weak. This can introduce significant errors into the model training process.

To mitigate this issue, we introduce a latitude-dependent weighting factor, w(ϕ), designed to smoothly suppress the geostrophic constraint in equatorial regions. The weight is calculated using the following square-rooted sigmoid function:

w (ϕ) = \frac{1}{\sqrt{(1 + e^{- k (ϕ - ϕ_{0})})}},

(12)

The factor w(ϕ) is applied in the loss function as follows:

\mathrm{loss}_{geo(weighted)} = \mathrm{loss}_{geo} \cdot w(ϕ),

(13)

According to Lagerloef et al. [31], the geostrophic approximation under the f-plane assumption is generally valid at latitudes higher than approximately 5°N. Based on this guidance, the weight is designed so that the geostrophic constraint is applied gradually with increasing latitude. Specifically, we set ϕ_{0} = 7° and k = 2, such that the weight transitions smoothly from nearly 0 south of 5°N to nearly 1 north of 10°N.
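The weight in Equation (12) is easy to evaluate directly. The short sketch below uses the stated parameters (ϕ₀ = 7°, k = 2); the function name and the sample latitudes are chosen here for illustration.

```python
import math


def latitude_weight(lat_deg, phi0=7.0, k=2.0):
    """Square-rooted sigmoid weight w(phi) from Equation (12)."""
    return 1.0 / math.sqrt(1.0 + math.exp(-k * (lat_deg - phi0)))


for lat in (2.0, 5.0, 7.0, 10.0, 22.0):
    print(f"{lat:5.1f} N  w = {latitude_weight(lat):.4f}")
```

How w(ϕ) enters Equation (13) for each grid row is not spelled out in the text; applying it row by row to the squared velocity errors before averaging is one plausible reading.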

2.3.3. Mask-Informed Inputs

In the application of deep learning models, particularly convolutional neural networks, to oceanographic data, a significant challenge is the prevalence of NaN (Not a Number) values, which often correspond to land grids in marine datasets. A common practice is to replace these NaN values with zeros. However, this approach is suboptimal, as simply zero-filling may mislead the convolutional model during feature extraction, given that such models rely on sliding kernels across the grid to capture meaningful spatial patterns. Another method involves interpolation, which fills the land grids using values derived from surrounding ocean data. While interpolation may offer better performance than simply assigning zeros to land grids, it still introduces misleading information to the model: land areas that originally carry no information are now filled with artificially imputed values, which do not correspond to any real physical processes and may still distort feature learning.

For instance, in SSH prediction tasks, shallow network layers could misinterpret zero-filled or interpolated land grids as authentic SSH values, thereby propagating erroneous information to deeper layers. To address this issue, we propose a simple yet effective method: concatenating a binary mask that identifies valid grids to the inputs along the channel dimension. The mask is a tensor of ones and zeros with the same spatial dimensions as the inputs, but with a single channel. This operation transforms the input shape from (B, T, C, H, W) to (B, T, C+1, H, W) for the SimVPv2 model. For the PredRNNv2 model, the mask is concatenated not only with the initial inputs but also, at each time step, with the outputs of the previous step (which are also the inputs of the current step), thereby explicitly informing the model about the presence of invalid grid cells in a straightforward manner.
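A minimal sketch of the mask concatenation for a single-channel sequence is given below. It assumes that land cells arrive as NaN and are zero-filled in the data channel once the mask is attached; the fill value and the function names are assumptions made for this example.

```python
import math


def add_mask_channel(frame):
    """Turn one 2D field with NaN land cells into [filled_field, mask].

    Land cells are zero-filled in the data channel and flagged with 0 in
    the mask channel; ocean cells keep their value and get mask 1.
    """
    mask = [[0.0 if math.isnan(x) else 1.0 for x in row] for row in frame]
    filled = [[0.0 if math.isnan(x) else x for x in row] for row in frame]
    return [filled, mask]   # channel dimension grows from C=1 to C+1=2


def add_mask_to_sequence(frames):
    """Apply add_mask_channel to each frame: (T, H, W) -> (T, 2, H, W)."""
    return [add_mask_channel(frame) for frame in frames]


nan = float("nan")
sequence = [[[0.12, nan], [0.08, 0.10]],
            [[0.15, nan], [0.07, 0.11]]]   # T=2, H=2, W=2, one land cell
with_mask = add_mask_to_sequence(sequence)
print(len(with_mask), len(with_mask[0]), len(with_mask[0][0]), len(with_mask[0][0][0]))
```

For an autoregressive model, the same mask channel would be attached again to each predicted frame before it is fed back in as the next input.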

3. Results

3.1. Impact of the Geostrophic Constraint Coefficient

To determine the optimal weighting for the geostrophic constraint, we conducted a series of experiments in which the model was trained under varying values of the geostrophic constraint coefficient, denoted as λ. To mitigate the effects of randomness and enhance the robustness of the results, each experiment was repeated five times under identical hyperparameter settings except for the random seed. The performance of each configuration was evaluated on the test dataset, and the results were averaged across the five runs to ensure a more reliable and statistically meaningful comparison.

Figure 3 illustrates the relationship between model performance and λ, using the RMSE and the averaged Point-to-point Correlation Coefficient (PCC) as evaluation metrics. The PCC is computed as follows:

\mathrm{PCC} = \frac{\sum_{i = 1}^{N} (x_{i} - \bar{x}) (y_{i} - \bar{y})}{\sqrt{\sum_{i = 1}^{N} (x_{i} - \bar{x})^{2} \sum_{i = 1}^{N} (y_{i} - \bar{y})^{2}}}

(14)

where \bar{x} and \bar{y} denote the means of x and y over the N points, and N denotes the total number of valid data points (excluding the land points).
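The two evaluation metrics can be written compactly as below. The sketch assumes the valid (ocean) points have already been extracted into flat lists; the function names and sample values are illustrative only.

```python
import math


def rmse(x, y):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def pcc(x, y):
    """Point-to-point (Pearson) correlation coefficient, Equation (14)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


target = [0.10, 0.20, 0.30, 0.40]
pred = [0.10, 0.22, 0.28, 0.41]
print(f"RMSE = {rmse(pred, target):.4f} m, PCC = {pcc(pred, target):.4f}")
```

RMSE measures absolute accuracy, while PCC is insensitive to a constant offset or rescaling of the prediction, so the two metrics can move in different directions.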

For each value of λ, the mean values are derived from five independent models, which were trained under identical configurations differing only in their random seeds. The 95% confidence intervals (CIs) are calculated using the Bootstrap method (as detailed in Appendix B). The Base-SV model denotes the SimVPv2 baseline trained without any of the proposed strategies.
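One common way to obtain such an interval from five runs is a percentile bootstrap of the mean, sketched below. The exact procedure of Appendix B is not reproduced here, so the resampling scheme, the number of resamples, and the metric values (placeholders, not results from the study) are all assumptions.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=10000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    low_idx = int((1 - level) / 2 * n_resamples)
    high_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[low_idx], means[high_idx]


# Placeholder metric values from five hypothetical runs.
runs = [1.0, 1.2, 0.9, 1.1, 1.3]
low, high = bootstrap_ci(runs)
print(f"mean = {statistics.fmean(runs):.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

With only five runs per setting the resampled means take few distinct values, so an interval of this kind is best read as a rough indication of seed-to-seed spread.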

As λ increases from zero, the RMSE initially decreases and the PCC increases, indicating improved performance. Optimal performance is achieved at λ = 0.7, after which the model’s accuracy degrades. Therefore, the term Phys-SV hereafter refers specifically to the model trained at this optimal value (λ = 0.7).

Given that the geostrophic loss was normalized, the coefficient λ can be interpreted as the relative importance of the geostrophic loss. The degradation in performance for λ > 0.7 suggests that the geostrophic constraint should serve as a supplementary component of the loss function. This phenomenon can be mainly attributed to the presence of ageostrophic dynamics. Forcing the model to adhere too strictly to geostrophy by increasing λ penalizes the data-driven model for learning the ageostrophic dynamics.

3.2. Ablation Study

In Section 2, we introduced three strategies aimed at enhancing model performance. As demonstrated in Figure 3, integrating all three leads to notable improvement. However, the individual contribution of each strategy has not yet been evaluated. To assess their respective effectiveness, we conducted a series of ablation studies in which one strategy was omitted at a time.

As shown in Figure 4, the removal of any of the three strategies results in an increase in RMSE and a decrease in PCC. Among them, removing the geostrophic constraint has the most pronounced effect. These results confirm that all three strategies contribute to improving the performance of the Phys-SV, and that the geostrophic constraint plays an especially critical role. This is probably because the dominance of geostrophic dynamics in the SCS makes the geostrophic constraint an effective regularizer for the model.

4. Analysis of Results

From the experiments above, it can be observed that the fluctuations caused by randomness during the training process are considerable. To minimize the impact of such randomness, all random seeds were fixed to 42 and the deterministic algorithms of cuDNN (NVIDIA’s CUDA Deep Neural Network library) were enabled throughout the training of the subsequent models. This ensures that models trained under the same hyperparameters are strictly identical, except for those trained with the geostrophic constraint loss. Due to its additional computational steps, this loss introduces new uncertainties. Nevertheless, as Table 1 demonstrates, although the outputs are not entirely identical, the use of fixed random seeds and cuDNN’s deterministic algorithms still results in highly consistent outputs across repeated trials under identical hyperparameters. Across three independent trials, the RMSE values exhibit minimal deviation, remaining within 1.6% of the mean RMSE. Based on this high consistency, we selected one of the three runs as the representative instance of the Phys-SV for all subsequent analyses.
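A seeding routine of this kind might look as follows. The sketch assumes PyTorch as the training framework, which the text does not state explicitly, and it falls back to seeding only Python's random module when PyTorch is not installed. The function name is hypothetical.

```python
import random


def seed_everything(seed=42):
    """Fix random seeds and, if PyTorch is available, request deterministic cuDNN.

    Returns True when the PyTorch settings were applied, False otherwise.
    """
    random.seed(seed)
    try:
        import torch
    except ImportError:
        return False   # PyTorch not installed: only Python's RNG was seeded
    torch.manual_seed(seed)                     # seed PyTorch's random generators
    torch.backends.cudnn.deterministic = True   # prefer deterministic cuDNN kernels
    torch.backends.cudnn.benchmark = False      # disable autotuned kernel selection
    return True


seed_everything(42)
first = random.random()
seed_everything(42)
second = random.random()
print(first == second)
```

Even with such settings, some GPU operations may remain nondeterministic, which is consistent with the small run-to-run differences reported for models trained with the geostrophic loss.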

As shown in Table 1, we also trained the PR model using the proposed methods to explore their generalizability. To ensure a fair comparison, we adjusted the hyperparameter configurations so that the two models have a similar number of parameters (SV: 3,273,809 parameters; PR: 3,340,288 parameters). It is important to note that the PR model exhibits significantly different characteristics compared to the SV model. Firstly, the PR model is much more sensitive to data normalization than the SV model, even when the inputs are solely SSH. Consequently, we trained the PR model using normalized SSH data. Secondly, land masks are integrated into the PR model at every time step, as it is an autoregressive model. Thirdly, while the geostrophic constraint (GC) method improves the performance of the SV model, it diminishes that of the PR model. Conversely, the mask-informed (MI) method significantly enhances the performance of the PR model, but only marginally improves that of the SV model.

We attribute the stronger benefit of the MI method for the PR to the difference in model structure, which suggests an appealing prospect for applying the mask-informed method to autoregressive models. For the SV, it may be better to concatenate the mask not only with the original inputs but also elsewhere in the network; this refinement remains unexplored. Regarding the degradation of the PR’s performance caused by the GC, we suppose that the PR may require a larger number of parameters to reach the overfitting regime, in which the GC, acting as a penalty term, could enhance performance.

4.1. Comparative Performance Analysis

To further evaluate the effectiveness of the physics-informed approaches, the performance at different lead times of the Phys-SV and Mask-PR (Base-PR + MI) was compared against other models: Base-SV, Base-PR and Persistence. Persistence, which assumes the future state is identical to the current state (ζ(t+1) = ζ(t)), is a benchmark comparison and forecast reference widely accepted in oceanic science [32], and serves as a simple baseline for forecast skill here.
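The persistence baseline is simple enough to state in a few lines. In the sketch below, the last observed field is repeated for every lead time; the function name and the toy input are illustrative.

```python
def persistence_forecast(inputs, n_lead=10):
    """Repeat the last observed 2D field for each of the n_lead forecast days."""
    last = inputs[-1]
    return [[row[:] for row in last] for _ in range(n_lead)]


# Three observed days on a toy 1 x 2 grid.
observed = [[[0.10, 0.20]], [[0.12, 0.21]], [[0.15, 0.19]]]
forecast = persistence_forecast(observed, n_lead=10)
print(len(forecast), forecast[0], forecast[-1])
```

Because it needs no training, persistence gives the level of skill that any learned model has to exceed to be useful at a given lead time.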

As shown in Figure 5, both the Phys-SV and Mask Base-SV outperform Base models. For the PR, the MI method significantly enhances its performance, especially in PCC. The introduction of MI effectively slows down the rate of PCC degradation, indicating that mask information helps the PR cope with the accumulated errors due to the artifacts introduced by the land points during its autoregressive process.

Moreover, both SV models outperform the PR models, and the physics-informed methods improve the SV fairly uniformly, yielding lower RMSE and higher PCC across all lead times. However, the magnitude of this improvement is modest, which is likely attributable to the challenges of applying the geostrophic constraint over a domain that includes extensive low-latitude and inshore areas where the geostrophic balance is weak. Although the latitude-weighting scheme was implemented to mitigate this, it may introduce discontinuities in the loss function that can complicate the training process, thereby limiting the full potential benefit of the physical constraint.

4.2. Seasonal Variation in Prediction Accuracy

The predictive performance of both the Base-SV and the Phys-SV exhibits a distinct seasonal cycle, as illustrated in the time series of forecast errors in Figure 6. For this analysis, the error metric for any given start date represents the average performance over the subsequent ten-day forecast period. Figure 6 clearly indicates that for both models, the RMSE and the PCC are both systematically lower during the summer months (April–September, red shading) compared to the winter months (October–March, blue shading).

We hypothesize that this seasonal difference in forecast skill is primarily driven by the inherent seasonal variability of the SSH field itself. To investigate this, we quantified the temporal and spatial variability of SSH for each season in the test dataset. Mean temporal variability, denoted as \bar{σ}_{T}, is defined as the spatial average of the standard deviation calculated over time at each grid point. It measures the typical magnitude of temporal fluctuations within the SSH field. Mean spatial variability, denoted as \bar{σ}_{S}, is defined as the standard deviation of the time-averaged SSH field. It represents the magnitude of spatial fluctuations within the time-averaged SSH field.
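The two variability measures can be computed as sketched below. The example assumes an array indexed as ssh[t][i][j] containing ocean points only, and it uses the population standard deviation; both choices, like the function names, are assumptions made for illustration.

```python
import statistics


def mean_temporal_variability(ssh):
    """Spatial mean of the standard deviation over time at each grid point."""
    nt, ny, nx = len(ssh), len(ssh[0]), len(ssh[0][0])
    stds = [
        statistics.pstdev([ssh[t][i][j] for t in range(nt)])
        for i in range(ny)
        for j in range(nx)
    ]
    return statistics.fmean(stds)


def mean_spatial_variability(ssh):
    """Standard deviation over space of the time-averaged field."""
    nt, ny, nx = len(ssh), len(ssh[0]), len(ssh[0][0])
    time_mean = [
        statistics.fmean([ssh[t][i][j] for t in range(nt)])
        for i in range(ny)
        for j in range(nx)
    ]
    return statistics.pstdev(time_mean)


# Two time steps on a 1 x 2 grid: one point fluctuates, the other is constant.
ssh = [[[0.0, 1.0]], [[2.0, 1.0]]]
sigma_t = mean_temporal_variability(ssh)
sigma_s = mean_spatial_variability(ssh)
print(sigma_t, sigma_s)
```

In this toy field the time mean is identical at both points, so the spatial measure vanishes while the temporal measure does not, which shows that the two quantities capture different aspects of the variability.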

As summarized in Table 2, both the mean temporal and spatial variability are significantly lower in summer than in winter. This indicates that the SSH field is generally more quiescent and spatially smoother during the summer. To directly link this variability to prediction error, we computed the PCC between the temporal variability (σ_{T}) and the mean absolute error (MAE) of the Base-SV prediction. The analysis revealed statistically significant positive correlations in both summer (PCC = 0.52) and winter (PCC = 0.58), with confidence levels exceeding 99.9%. This confirms that locations with greater temporal variability are inherently more difficult to predict, and that the higher overall variability in winter is a key driver of the observed seasonal degradation in RMSE.

Considering that RMSE and PCC describe different aspects of the prediction performance (the absolute accuracy and spatial correlation, respectively), and that \bar{σ}_{T} and \bar{σ}_{S} are both higher in winter, the seemingly counterintuitive result that both RMSE and PCC are higher in winter can be explained as follows: while the larger-magnitude variations of the SSH field in winter make precise prediction harder, the more pronounced variations make it easier for the model to capture the dominant patterns of variation than in summer, which is why the PCC is higher in winter.

4.3. Spatial Distribution of Forecast Error

The previous section established that the overall forecast error is lower in summer and that the performance improvement of the Phys-SV is also season-dependent. To further investigate these patterns, we analyze the spatial distribution of the mean absolute error (MAE) for both the Base-SV model and the Phys-SV, as shown in Figure 7 and Figure 8.

Figure 7 illustrates that the MAE of the Base-SV model is not uniformly distributed, with elevated errors concentrated in dynamically active regions, including the coastal waters off Vietnam, the Beibu Gulf (also known as the Gulf of Tonkin), the Guangdong coast, the area east of the Luzon Strait, and the Sunda Shelf.

Further analysis of Figure 8 reveals that the MAE of the Base-SV model is initially relatively uniform but becomes increasingly heterogeneous as lead time increases. This inhomogeneity also exhibits seasonal variations. For example, at a lead time of 10 days, the MAE is notably higher during winter along the Vietnamese coast, the Guangdong coast, the Beibu Gulf, and the Sunda Shelf. This pattern is consistent with the winter intensification of monsoon-driven circulation features, such as the Vietnam Coastal Current and the Natuna Eddy, which are associated with stronger nonlinear dynamics [33].

In addition to seasonal variations, another prominent characteristic of the forecast error is the strong influence of bathymetry. High-error regions are predominantly located in shallow coastal and shelf waters. To quantify this, we divided the domain into shelf areas (<200 m) and deep-basin regions (DB; defined as areas with water depth exceeding 200 m). As summarized in Table 3, the RMSE for the Base-SV model in the DB region is 0.0172 m, which is 13% lower than the full-domain RMSE of 0.0198 m. This discrepancy can be attributed to two main factors: (1) the reduced accuracy of satellite altimetry data in coastal zones [34], and (2) the presence of complex nearshore dynamical processes—such as coastal currents, shelf waves, tides, and upwelling—which are often nonlinear, high-frequency, and not fully resolved by the model [35].

Moreover, the seasonal differences in the models’ prediction RMSE are mainly concentrated in shallow-water areas. For the Base-SV, the winter-minus-summer RMSE difference is 0.0025 m in WD but only 0.0003 m in DB; for the Phys-SV, it is 0.0029 m in WD and 0.0007 m in DB (Table 3). This phenomenon may be closely linked to the dynamic effects of the winter monsoon in the South China Sea [36]. Two mechanisms are likely involved. First, the winter monsoon is stronger and more variable, which tends to induce more frequent and intense coastal jets, enhancing nonlinear effects in the flow and thereby making the fields harder for the models to predict. Second, the prevailing northeasterly winds in winter drive the transport and accumulation of surface water toward the western coastal areas of the South China Sea via Ekman transport, leading to a significant rise in sea surface height. This process not only alters the regional dynamic structure but may also amplify pressure gradients and flow variability, further increasing forecasting uncertainties.

4.4. Impact of the Geostrophic Constraint

The application of the geostrophic constraint also leads to significant spatial variation in forecast performance. As indicated in Table 3, the improvement achieved by the Phys-SV over the Base-SV model is more pronounced in the DB region, where the RMSE is reduced by 16%, compared to a 12% reduction in the whole domain (WD).
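The relative reductions quoted above can be checked directly from the tabulated RMSEs. The short sketch below is only an arithmetic aid; the helper name `relative_reduction` is hypothetical. With the four-decimal values in Table 3, the WD reduction comes out slightly above 12.5%, whereas the unrounded trial means in Table 1 give a value slightly below it, which appears consistent with both the 12% stated here and the 13% stated in the Highlights.

```python
def relative_reduction(base, new):
    """Fractional RMSE reduction of `new` relative to `base`."""
    return (base - new) / base

# Full-year RMSE (m) from Table 3: (Base-SV, Phys-SV)
table3 = {"WD": (0.0198, 0.0173), "DB": (0.0172, 0.0145)}
for scope, (base, phys) in table3.items():
    print(f"{scope}: {100 * relative_reduction(base, phys):.1f}% lower RMSE")

# Whole-domain check with the unrounded means reported in Table 1
wd_unrounded = relative_reduction(0.019768, 0.017304)
print(f"WD (Table 1 means): {100 * wd_unrounded:.1f}% lower RMSE")
```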

Figure 7 shows that the Phys-SV improves forecast accuracy across most of the study areas. One of the most substantial improvements occurs east of the Luzon Strait. This result aligns with previous studies suggesting that the Kuroshio transport through the strait is primarily governed by geostrophic dynamics [37], confirming that the integration of this physical constraint enhances model performance in regions where the underlying assumption is most valid.

The effect of the geostrophic constraint also varies with forecast lead time (Figure 8). At a one-day lead time, the Phys-SV provides relatively uniform improvement across the domain. However, as the lead time extends to 7 and 10 days, the spatial distribution of improvements becomes more heterogeneous. While performance gains intensify in regions where geostrophic balance dominates—such as east of the Luzon Strait and the central deep basin—some areas near the land boundary show limited improvement or even increased error. This may be related to complex nonlinear effects induced by boundary dynamics, which warrant further investigation.

4.5. Case Study

To examine the specific impact of the geostrophic constraint on the prediction, we selected the day with the most notable RMSE (10 February 2023, as indicated in Figure 6a) for a case study. As Figure 9 illustrates, this is a typical western boundary current (WBC) strengthening process. From lead 1 to lead 10, the SSH rose markedly west of the 200 m isobath, a phenomenon primarily driven by the intraseasonal strengthening of northeasterly winds, as demonstrated in previous research [36]. Additionally, the anticyclonic eddy in the northeastern SCS exhibited variability during the forecast period, likely influenced by a combination of wind effects and Kuroshio intrusion [38].

A comparison of model predictions reveals that although both Base-SV and Phys-SV underestimate the SSH rise in the western boundary, the output from Phys-SV aligns more closely with the target data. Furthermore, Phys-SV more accurately represents the evolution of the anticyclonic eddy in the northeastern SCS than Base-SV, both in terms of spatial structure and intensity.

In this case, the RMSE of the Phys-SV model is 0.0303 m, compared to 0.0412 m for the Base-SV model, representing a 26% reduction in RMSE. Moreover, the RMSE values of both models in this case are higher than those on the entire test dataset, which we attribute primarily to the high magnitude and rapid variation in wind during this period.

To verify this hypothesis, we computed the Pearson correlation coefficient (PCC) between the absolute error of Base-SV and wind speed magnitude (denoted as PCC-E), as well as the PCC between the improvement by Phys-SV and wind speed magnitude (denoted as PCC-I), at each time step. As shown in Figure 10, the PCC-E curve closely resembles the curve of the magnitude of wind speed variation (MVW), and similarly, the PCC-I curve aligns with the magnitude of wind speed (MW). We further calculated the correlation coefficients (CCs) between PCC-E and MVW, and between PCC-I and MW. Both pairs exhibit statistically significant correlations: CC = 0.85 (p = 0.002) for PCC-E and MVW, and CC = 0.92 (p = 0.0002) for PCC-I and MW.
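The two-stage correlation analysis can be sketched as follows. The sketch is illustrative only: it uses synthetic fields, assumes that each PCC is computed spatially over the grid points at a given time step, and assumes that MVW is the domain-mean magnitude of the change in the wind vector between consecutive steps. The helper `pearson` and all array names are hypothetical.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient between two equally sized arrays."""
    a = np.ravel(a).astype(float)
    b = np.ravel(b).astype(float)
    return float(np.corrcoef(a, b)[0, 1])

rng = np.random.default_rng(0)
n_steps, ny, nx = 10, 6, 6

# Synthetic wind components (m/s) and absolute SSH errors (m)
u = rng.normal(5.0, 2.0, size=(n_steps, ny, nx))
v = rng.normal(-3.0, 2.0, size=(n_steps, ny, nx))
abs_err = np.abs(rng.normal(0.0, 0.02, size=(n_steps, ny, nx)))

wind_speed = np.hypot(u, v)

# Stage 1: spatial PCC between |error| and wind speed at every time step (PCC-E)
pcc_e = np.array([pearson(abs_err[t], wind_speed[t]) for t in range(n_steps)])

# Assumed MVW: domain-mean magnitude of the wind-vector change between steps
# (the first step is compared with itself, so its variation is zero)
du = np.diff(u, axis=0, prepend=u[:1])
dv = np.diff(v, axis=0, prepend=v[:1])
mvw = np.hypot(du, dv).mean(axis=(1, 2))

# Stage 2: correlation between the PCC-E series and the MVW series
cc = pearson(pcc_e, mvw)
print(f"CC(PCC-E, MVW) = {cc:.2f}")
```

With random synthetic inputs the resulting CC carries no physical meaning; the sketch only shows how the quantities relate to one another. PCC-I would follow the same pattern, with the Phys-SV improvement in place of the absolute error.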

The strong correlation between PCC-E and MVW suggests that when wind variability is high, the prediction errors of the Base-SV model are more strongly tied to wind speed. Similarly, the high correlation between PCC-I and MW indicates that the performance improvement of Phys-SV over Base-SV becomes more closely associated with wind speed as the wind intensifies.

The former result underscores the substantial influence of wind on SSH prediction and highlights a key limitation in existing approaches: the absence of wind data as input. To address this, we incorporated normalized interpolated ERA5 wind data together with normalized SSH as inputs to the Base-SV model and retrained it under the same configuration. However, the resulting test RMSE was 0.0207 m, higher than the 0.0198 m achieved without wind data. This suggests that the model struggled to effectively integrate the two distinct types of data, pointing to the need for further work on multimodal data alignment to enable more efficient information fusion.

The latter result indicates that, in this case, a higher wind magnitude is associated with a greater improvement. This is likely because the GC—a form of gradient loss function—guides the Phys-SV model to become more sensitive to gradient variations, thereby enhancing its capacity to capture the underlying patterns of SSH gradients, which are strongly influenced by wind forcing.
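To make the notion of a gradient-based loss concrete, the sketch below derives geostrophic velocities from SSH gradients and compares them between a prediction and a target. It is a minimal illustration under stated assumptions rather than the loss used in this study: the finite-difference scheme, the physical constants, the |sin φ| form of the latitude weight, and all function names are assumptions, and the domain is assumed to exclude the equator so that the Coriolis parameter is non-zero.

```python
import numpy as np

G = 9.81            # gravitational acceleration (m s^-2)
OMEGA = 7.2921e-5   # Earth's rotation rate (rad s^-1)
R_EARTH = 6.371e6   # Earth's radius (m)

def geostrophic_velocity(ssh, lat_deg, lon_deg):
    """Geostrophic (u, v) from an SSH field of shape (lat, lon)."""
    lat = np.deg2rad(lat_deg)
    lon = np.deg2rad(lon_deg)
    f = 2.0 * OMEGA * np.sin(lat)[:, None]        # Coriolis parameter per row
    deta_dy = np.gradient(ssh, lat, axis=0) / R_EARTH
    deta_dx = np.gradient(ssh, lon, axis=1) / (R_EARTH * np.cos(lat)[:, None])
    u = -G / f * deta_dy                          # zonal component
    v = G / f * deta_dx                           # meridional component
    return u, v

def latitude_weighted_gc_loss(pred, target, lat_deg, lon_deg):
    """Mean squared geostrophic-velocity mismatch, down-weighted at low latitude."""
    up, vp = geostrophic_velocity(pred, lat_deg, lon_deg)
    ut, vt = geostrophic_velocity(target, lat_deg, lon_deg)
    w = np.abs(np.sin(np.deg2rad(lat_deg)))[:, None]   # assumed weight form
    w = w / w.mean()
    return float(np.mean(w * ((up - ut) ** 2 + (vp - vt) ** 2)))

# Hypothetical regular grid and synthetic SSH fields (m)
lat = np.linspace(5.0, 25.0, 21)
lon = np.linspace(105.0, 121.0, 17)
lon2d, lat2d = np.meshgrid(np.deg2rad(lon), np.deg2rad(lat))
target = 0.1 * np.sin(6.0 * lon2d) * np.cos(6.0 * lat2d)
pred = target + 0.01 * np.sin(12.0 * lon2d)

print(f"GC term: {latitude_weighted_gc_loss(pred, target, lat, lon):.3e}")
```

Because the mismatch is evaluated on derivatives of SSH, two predictions with the same pointwise error can receive different penalties depending on how well they reproduce the gradients. In training, such a term would be added to the pointwise loss with a coefficient λ (Figure 4 reports λ = 0.7 for Phys-SV).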

5. Conclusions

In this study, we developed physics-informed methods, including latitude-weighted geostrophic constraints (GCs) and mask-informed inputs (MIs), to enhance SSH forecasting in the South China Sea. For MI, we utilize mask information as input to reduce artifacts caused by the processing of extensive land points in oceanographic datasets using AI models. As for GC, we integrated a latitude-weighted geostrophic constraint into the loss function by minimizing the difference between predicted and target geostrophic currents, which are derived from SSH gradients. The latitude weights address the diminished validity of geostrophic balance near the equator by applying smaller weights to the geostrophic loss in that region. We tested the effect of the physics-informed methods on two mainstream spatiotemporal prediction models: SimVPv2 (SV) and PredRNNv2 (PR). The results indicate that GC primarily enhances SV’s performance, while it worsens PR’s performance; MI improves the performance of both models, with significant benefits for PR and marginal improvements for SV.

We investigated the influence of seasonality on model performance. Both the Base-SV and Phys-SV models demonstrated higher forecast accuracy during summer compared to winter. Correlation analysis between the MAE of the Base-SV prediction and the temporal variability of the SSH field confirmed that regions with higher temporal variability are inherently more challenging to predict. Consequently, the increased temporal variability of SSH during winter is identified as a significant factor contributing to seasonal degradation in forecast accuracy.

The bathymetric effects were also examined. Both models demonstrated significantly lower RMSE in deep basin areas (DB, with depths greater than 200 m) compared to the entire domain. Furthermore, the seasonal performance discrepancies of the models are found to exist primarily in shallow water areas, and the performance advantage of Phys-SV over Base-SV is more pronounced in DB. These findings emphasize the role of topographic features in influencing prediction errors and in modulating the effectiveness of physical constraints, highlighting the necessity of incorporating the bathymetric context into the design of future models.

Overall, this study confirms the feasibility and value of embedding physics-informed methods into deep learning frameworks for SSH forecasting. Phys-SV improves prediction skill while enhancing interpretability by aligning with physical ocean dynamics. Mask-PR demonstrates the importance of informing the AI model of the land mask to mitigate the artifacts that land points introduce into the model inputs. This work demonstrates a promising direction for integrating physical knowledge with data-driven modeling in ocean prediction.

As the case study illustrates, wind variation significantly contributes to prediction errors. Therefore, integrating wind data into the model is expected to enhance SSH predictions. Nevertheless, the poorer outcomes from attempts to combine wind data with SSH as inputs for the SV model underscore the need for exploring improved methods to incorporate heterogeneous oceanic data into AI models, which is our future research direction.

Moreover, considering the inherent limitations and finite scope of applicability of the geostrophic approximation itself, the incorporation of broader physical constraints is recommended for future oceanic AI models to further enhance their predictive performance and physical consistency.

Author Contributions

Conceptualization, L.H. and Y.S.; methodology, L.H.; software, L.H.; validation, L.H. and D.L.; formal analysis, L.H.; investigation, L.H.; resources, Y.S. and J.Y.; data curation, L.H. and J.Y.; writing—original draft preparation, L.H.; writing—review and editing, L.H., Y.S., J.Y. and D.L.; visualization, L.H.; supervision, Y.S.; project administration, L.H. and J.Y.; funding acquisition, J.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the China Meteorological Administration, grant number 2024-K-03, the Strategic Priority Research Program of the Chinese Academy of Sciences, grant number XDA050202, the National Natural Science Foundation of China, grant number 42076201, and the Program of the Chinese Academy of Sciences, grant number 133244KYSB20210026.

Data Availability Statement

The datasets analyzed in this study are publicly available at https://data.marine.copernicus.eu/product/SEALEVEL_GLO_PHY_L4_MY_008_047/description, accessed on 18 May 2025. ERA5 wind data used for analyses is publicly available at https://cds.climate.copernicus.eu/datasets/derived-era5-single-levels-daily-statistics?tab=overview, accessed on 13 October 2025. The codes and trained models will be released at https://github.com/happy364/PINNs-for-SSH-Prediction-in-the-SCS, accessed on 13 October 2025.

Acknowledgments

The numerical simulations in this study were supported by the High-Performance Computing Division of the South China Sea Institute of Oceanology, for which we express our sincere gratitude. We also acknowledge the use of data provided by the Copernicus Marine Service, which greatly contributed to this research. During the preparation of this manuscript, DeepSeek V3.1 was used for the purpose of polishing the original draft to enhance the readability, while the authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A

Appendix A.1. Concrete Structure of SimVPv2

The Spatial Encoder comprises Nₛ layers of 2D convolutional blocks, each defined as:

$$z_{i} = \sigma(\mathrm{Norm2d}(\mathrm{Conv2d}(z_{i-1}))), \quad 1 \leq i \leq N_{s}, \tag{A1}$$

where zᵢ represents the feature map after the i-th convolutional block (with z₀ as input), Conv2d denotes a standard 3 × 3 2D convolution (PyTorch’s Conv2d; PyTorch version 2.5.1), Norm2d indicates a GroupNorm layer (2 groups), and σ represents the Sigmoid Linear Unit (SiLU) activation function. The architecture alternates between stride = 1 and stride = 2 convolutions for progressive downsampling.

The Spatial Decoder essentially mirrors the structure of the encoder, replacing downsampling operations with upsampling through PyTorch’s PixelShuffle using a scale factor of 2. The final layer serves as the output layer, which converts the feature channels into the desired number of output channels.

The core part lies in the Spatiotemporal Translator, composed of Nₜ stacked gSTA Blocks. As illustrated in Figure A1, each gSTA Block sequentially combines a Spatial Attention layer and a MixMLP layer (constructed from 1 × 1 convolutions and depth-wise convolutions). The Spatial Attention layer features channel-preserving 1 × 1 convolutions bracketing an attention module, and the attention module can be mathematically expressed as:

$$\hat{z}_{j} = \mathrm{Conv2d}_{1 \times 1}(\mathrm{Conv2d}_{Dw\text{-}d}(\mathrm{Conv2d}_{Dw}(z_{j}))), \quad N_{s} < j \leq N_{s} + N_{t}, \tag{A2}$$

$$g, \bar{z}_{j} = \mathrm{split}(\hat{z}_{j}), \tag{A3}$$

$$z_{j+1} = \sigma(g) \odot \bar{z}_{j}, \tag{A4}$$

where $\mathrm{Conv2d}_{Dw}$ and $\mathrm{Conv2d}_{Dw\text{-}d}$ denote depthwise and dilated depthwise convolutions, respectively, g represents attention coefficients, σ indicates softmax normalization, and ⊙ is element-wise multiplication. This gating mechanism dynamically weights features based on their spatiotemporal importance.

Figure A1. Hierarchical illustration of the gSTA architecture: (a) overall framework, (b) expansion of the Spatial Attention in (a), and (c) details of the Attention Module in (b). The abbreviations used in the figure are defined as follows: Conv, Convolutional layer; GN, Group Normalization; Scale, multiplication with a learnable scale factor; DP, Drop Path; GELU, GELU activation function.

Appendix A.2. Hyperparameter Configuration

To enhance computational efficiency and mitigate overfitting, given the small training dataset (several thousand samples) and the low data resolution, we select conservative hidden layer dimensions: a spatial hidden size of 16 and a temporal hidden size of 128. Further regularization is achieved via dropout and drop path rates of 0.3. Other hyperparameters, including the kernel size of 3 for the encoder/decoder convolutions and the MLP ratio of 8 in the Translator’s MixMLP, are kept at their default values.

The training was conducted on an NVIDIA GeForce RTX 4090 GPU with 24 GB of memory. To mitigate overfitting and improve computational efficiency, the following training strategy was adopted: a batch size of 4 was used to introduce more stochasticity and accelerate training. A ReduceLROnPlateau learning rate scheduler was applied with an initial rate of 1 × 10⁻⁴ to alleviate gradient instability caused by the small batch size. The learning rate was reduced by a factor of 0.1 if the validation loss did not decrease for 5 consecutive epochs, thereby promoting steady convergence. Training proceeded for up to 150 epochs with early stopping configured with a patience of 10 epochs, monitoring the validation loss to terminate training if no improvement was observed.
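The interplay between the learning-rate schedule and early stopping can be illustrated with a small pure-Python simulation. This is a simplified imitation of the logic described above, not PyTorch’s ReduceLROnPlateau implementation, which has further options (such as a threshold and a cooldown) and its own counting conventions. The function name and the validation-loss sequence are hypothetical.

```python
def simulate_schedule(val_losses, lr=1e-4, factor=0.1,
                      lr_patience=5, stop_patience=10):
    """Return (learning rate used in each epoch, index of the last epoch run)."""
    best = float("inf")
    since_best = 0       # epochs since the last improvement (early stopping)
    since_lr_event = 0   # non-improving epochs since the last LR reduction
    lrs = []
    for epoch, loss in enumerate(val_losses):
        lrs.append(lr)
        if loss < best:
            best = loss
            since_best = 0
            since_lr_event = 0
        else:
            since_best += 1
            since_lr_event += 1
        if since_lr_event >= lr_patience:
            lr *= factor             # reduce the learning rate by a factor of 0.1
            since_lr_event = 0
        if since_best >= stop_patience:
            return lrs, epoch        # early stopping
    return lrs, len(val_losses) - 1

# Hypothetical validation losses: three improving epochs, then a plateau
val_losses = ([1.0, 0.9, 0.8] + [0.8] * 20)[:150]   # at most 150 epochs
lrs, last_epoch = simulate_schedule(val_losses)
print(f"stopped after epoch {last_epoch}; final learning rate {lrs[-1]:.1e}")
```

In this toy sequence the learning rate is reduced once before the early-stopping patience is exhausted, which mirrors the intended behaviour: the schedule first tries a smaller step size and training terminates only if the plateau persists.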

Under the specified configuration, the Phys-SV model is relatively compact, with only 3,273,809 parameters. Training was completed in approximately three hours, and at test time the model’s average inference time is 3.7 ms per sample, a value obtained by averaging over 10,000 inference runs at a batch size of 1. The entire process required only a single RTX 4090 GPU, highlighting its computational efficiency and low resource demands.

Appendix B

As referred to in the caption of Figure 4, the confidence intervals for the results presented in Figure 4 and Figure 5 are computed using the Bootstrap method. The specific implementation is as follows:

First, the performance metrics (RMSE or PCC) of the five independent trials, which share identical configurations except for the random seed, are treated as a set of five data points. A bootstrap sample is generated by randomly drawing one data point with replacement from this set five times, so that some data points may be selected multiple times while others may not be selected at all in a given sample.

Second, the mean of this bootstrap sample is computed. The sampling and mean-calculation process is repeated 10,000 times to establish a robust empirical sampling distribution of the metric mean.

Finally, the 95% confidence interval is derived directly from this distribution using the percentile method. The lower and upper bounds are defined as the 2.5th and 97.5th percentiles of the 10,000 bootstrap means, respectively.
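The three steps above can be sketched as follows. This is a minimal illustration using only the Python standard library; the five RMSE values are hypothetical placeholders rather than results from this study, and the percentiles are taken by a simple nearest-rank rule, which may differ marginally from interpolating implementations.

```python
import random
import statistics

def bootstrap_ci(values, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Steps 1 and 2: resample n points with replacement and store the mean
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    # Step 3: nearest-rank 2.5th and 97.5th percentiles of the bootstrap means
    lower = means[round(n_boot * alpha / 2)]
    upper = means[round(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper

# Hypothetical RMSE values (m) from five runs with different random seeds
rmse_runs = [0.0171, 0.0173, 0.0174, 0.0172, 0.0175]
low, high = bootstrap_ci(rmse_runs)
print(f"mean = {statistics.fmean(rmse_runs):.4f} m, "
      f"95% CI = [{low:.4f}, {high:.4f}] m")
```

With only five data points, each bootstrap mean is an average of five draws from those same values, so the interval reflects seed-to-seed spread and should be read as a rough indication of run-to-run variability.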

References

1. Fu, L.-L.; Smith, R.D. Global Ocean Circulation from Satellite Altimetry and High-Resolution Computer Simulation. Bull. Am. Meteorol. Soc. 1996, 77, 2625–2636.
2. Ducet, N.; Le Traon, P.Y.; Reverdin, G. Global High-Resolution Mapping of Ocean Circulation from TOPEX/Poseidon and ERS-1 and -2. J. Geophys. Res. Oceans 2000, 105, 19477–19498.
3. Stammer, D. Global Characteristics of Ocean Variability Estimated from Regional TOPEX/POSEIDON Altimeter Measurements. J. Phys. Oceanogr. 1997, 27, 1743–1769.
4. Chelton, D.B.; Schlax, M.G.; Samelson, R.M. Global Observations of Nonlinear Mesoscale Eddies. Prog. Oceanogr. 2011, 91, 167–216.
5. Zhang, Z.; Wang, W.; Qiu, B. Oceanic Mass Transport by Mesoscale Eddies. Science 2014, 345, 322–324.
6. Schiller, A.; Brassington, G.B.; Oke, P.; Cahill, M.; Divakaran, P.; Entel, M.; Freeman, J.; Griffin, D.; Herzfeld, M.; Hoeke, R.; et al. Bluelink Ocean Forecasting Australia: 15 Years of Operational Ocean Service Delivery with Societal, Economic and Environmental Benefits. J. Oper. Oceanogr. 2020, 13, 1–18.
7. Chassignet, E.P.; Hurlburt, H.E.; Smedstad, O.M.; Halliwell, G.R.; Hogan, P.J.; Wallcraft, A.J.; Bleck, R. Ocean Prediction with the Hybrid Coordinate Ocean Model (HYCOM). In Ocean Weather Forecasting: An Integrated View of Oceanography; Chassignet, E.P., Verron, J., Eds.; Springer: Dordrecht, The Netherlands, 2006; pp. 413–426. ISBN 978-1-4020-4028-3.
8. Zavala-Garay, J.; Wilkin, J.L.; Arango, H.G. Predictability of Mesoscale Variability in the East Australian Current given Strong-Constraint Data Assimilation. J. Phys. Oceanogr. 2012, 42, 1402–1420.
9. Nagy, H.; Di Lorenzo, E.; El-Gindy, A. The Impact of Climate Change on Circulation Patterns in the Eastern Mediterranean Sea Upper Layer Using Med-ROMS Model. Prog. Oceanogr. 2019, 175, 226–244.
10. Fraser, R.; Palmer, M.; Roberts, C.; Wilson, C.; Copsey, D.; Zanna, L. Investigating the Predictability of North Atlantic Sea Surface Height. Clim. Dyn. 2019, 53, 2175–2195.
11. Ham, Y.-G.; Kim, J.-H.; Luo, J.-J. Deep Learning for Multi-Year ENSO Forecasts. Nature 2019, 573, 568–572.
12. Shao, Q.; Li, W.; Han, G.; Hou, G.; Liu, S.; Gong, Y.; Qu, P. A Deep Learning Model for Forecasting Sea Surface Height Anomalies and Temperatures in the South China Sea. J. Geophys. Res. Oceans 2021, 126, e2021JC017515.
13. Zhou, L.; Zhang, R.-H. A Self-Attention–Based Neural Network for Three-Dimensional Multivariate Modeling and Its Skillful ENSO Predictions. Sci. Adv. 2023, 9, eadf2827.
14. Reichstein, M.; Camps-Valls, G.; Stevens, B.; Jung, M.; Denzler, J.; Carvalhais, N.; Prabhat, F. Deep Learning and Process Understanding for Data-Driven Earth System Science. Nature 2019, 566, 195–204.
15. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-Informed Neural Networks: A Deep Learning Framework for Solving Forward and Inverse Problems Involving Nonlinear Partial Differential Equations. J. Comput. Phys. 2019, 378, 686–707.
16. Eusebi, R.; Vecchi, G.A.; Lai, C.-Y.; Tong, M. Realistic Tropical Cyclone Wind and Pressure Fields Can Be Reconstructed from Sparse Data Using Deep Learning. Commun. Earth Environ. 2024, 5, 8.
17. Wu, S.; Bao, S.; Dong, W.; Wang, S.; Zhang, X.; Shao, C.; Zhu, J.; Li, X. PGTransNet: A Physics-Guided Transformer Network for 3D Ocean Temperature and Salinity Predicting in Tropical Pacific. Front. Mar. Sci. 2024, 11, 1477710.
18. Zhou, S.; Shi, R.; Yu, H.; Zhang, X.; Dai, J.; Huang, X.; Xu, F. A Physical-Informed Neural Network for Improving Air-Sea Turbulent Heat Flux Parameterization. J. Geophys. Res. Atmos. 2024, 129, e2023JD040603.
19. Gan, J.; Li, H.; Curchitser, E.N.; Haidvogel, D.B. Modeling South China Sea Circulation: Response to Seasonal Forcing Regimes. J. Geophys. Res. Oceans 2006, 111, C06034.
20. Shu, Y.; Chen, J.; Yao, J.; Pan, J.; Wang, W.; Mao, H.; Wang, D. Effects of the Pearl River Plume on the Vertical Structure of Coastal Currents in the Northern South China Sea during Summer 2008. Ocean Dyn. 2014, 64, 1743–1752.
21. Rus, M.; Fettich, A.; Kristan, M.; Licer, M. HIDRA2: Deep-Learning Ensemble Sea Level and Storm Tide Forecasting in the Presence of Seiches—The Case of the Northern Adriatic. Geosci. Model Dev. 2023, 16, 271–288.
22. Zhu, R.; Song, B.; Qiu, Z.; Tian, Y. A Metadata-Enhanced Deep Learning Method for Sea Surface Height and Mesoscale Eddy Prediction. Remote Sens. 2024, 16, 1466.
23. Brettin, A.E.; Zanna, L.; Barnes, E.A. Learning Propagators for Sea Surface Height Forecasts Using Koopman Autoencoders. Geophys. Res. Lett. 2025, 52, e2024GL112835.
24. Sha, W.; Jin, D.; Liu, L.; Zhang, X.; Ling, F.; Zhang, F.; Liao, Y.; Luo, J.-J. A South China Sea Surface Absolute Dynamic Topography Prediction Model Based on Convolutional Long Short-Term Memory Network with Self-Attention Mechanism. Geophys. Res. Lett. 2025, 52, e2025GL117019.
25. Zhang, L.; Duan, W.; Cui, X.; Liu, Y.; Huang, L. Surface Current Prediction Based on a Physics-Informed Deep Learning Model. Appl. Ocean Res. 2024, 148, 104005.
26. Nan, F.; Xue, H.; Yu, F. Kuroshio Intrusion into the South China Sea: A Review. Prog. Oceanogr. 2015, 137, 314–333.
27. Qu, T.; Du, Y.; Meyers, G.; Ishida, A.; Wang, D. Connecting the Tropical Pacific with Indian Ocean through South China Sea. Geophys. Res. Lett. 2005, 32, L24609.
28. Shi, X.; Chen, Z.; Wang, H.; Yeung, D.-Y.; Wong, W.; Woo, W. Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting. arXiv 2015, arXiv:1506.04214.
29. Wang, Y.; Long, M.; Wang, J.; Gao, Z.; Yu, P.S. PredRNN: Recurrent Neural Networks for Predictive Learning Using Spatiotemporal LSTMs. In Proceedings of the Advances in Neural Information Processing Systems: Annual Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30.
30. Tan, C.; Gao, Z.; Li, S.; Li, S.Z. SimVPv2: Towards Simple yet Powerful Spatiotemporal Predictive Learning. IEEE Trans. Multimed. 2024, 27, 5170–5184.
31. Lagerloef, G.S.E.; Mitchum, G.T.; Lukas, R.B.; Niiler, P.P. Tropical Pacific Near-Surface Currents Estimated from Altimeter, Wind, and Drifter Data. J. Geophys. Res. Oceans 1999, 104, 23313–23326.
32. Cui, Y.; Wu, R.; Zhang, X.; Zhu, Z.; Liu, B.; Shi, J.; Chen, J.; Liu, H.; Zhou, S.; Su, L.; et al. Forecasting the Eddying Ocean with a Deep Neural Network. Nat. Commun. 2025, 16, 2268.
33. Chu, P.C.; Edmons, N.L.; Fan, C. Dynamical Mechanisms for the South China Sea Seasonal Circulation and Thermohaline Variabilities. J. Phys. Oceanogr. 1999, 29, 2971–2989.
34. Vignudelli, S.; Birol, F.; Benveniste, J.; Fu, L.-L.; Picot, N.; Raynal, M.; Roinard, H. Satellite Altimetry Measurements of Sea Level in the Coastal Zone. Surv. Geophys. 2019, 40, 1319–1349.
35. Yang, H.; Liu, Q.; Liu, Z.; Wang, D.; Liu, X. A General Circulation Model Study of the Dynamics of the Upper Ocean Circulation of the South China Sea. J. Geophys. Res. Oceans 2002, 107, C73085.
36. Centurioni, L.R.; Niiler, P.N.; Lee, D.-K. Near-Surface Circulation in the South China Sea during the Winter Monsoon. Geophys. Res. Lett. 2009, 36, L06605.
37. Song, Y.T. Estimation of Interbasin Transport Using Ocean Bottom Pressure: Theory and Model for Asian Marginal Seas. J. Geophys. Res. Oceans 2006, 111, C11S19.
38. Zu, T.; Wang, D.; Yan, C.; Belkin, I.; Zhuang, W.; Chen, J. Evolution of an Anticyclonic Eddy Southwest of Taiwan. Ocean Dyn. 2013, 63, 519–531.

Figure 1. The study area, as framed by the red box.

Figure 2. Flowchart of the SimVPv2 model for SSH prediction. The batch dimension is omitted for clarity. The batch size is 40 (4 × 10) in both the Spatial Encoder and Spatial Decoder, and 4 in the Spatiotemporal Translator.

Figure 3. Evaluation of Phys-SV predictions across geostrophic constraint coefficients (λ): (a) RMSE and (b) PCC. Dark blue points and error bars denote the mean and 95% Bootstrap CI from five runs; light blue points are individual results. The red dashed line and shaded area represent the SimVPv2 baseline (Base-SV) and its 95% CI.

Figure 4. Similar to Figure 3, except that the Phys-SV here specifically denotes the model incorporating all three strategies—geostrophic constraint (λ = 0.7), latitude-weighted loss, and mask-informed inputs—while “No GC”, “No LW”, and “No MI” correspond to models trained without the respective strategy.

Figure 5. Comparison of the forecast skill of different models at different lead times in the SCS from 1 January 2023 to 13 June 2024, using (a) RMSE and (b) PCC. Mask-PR refers to the Base-PR that employs only the MI method.

Figure 6. (a) Lead-mean RMSE for the Base-SV and the Phys-SV predictions versus date on the test dataset. Seasonal-mean (b) RMSE and (c) PCC at different forecast lead times, comparing the performance of the Base-SV and the Phys-SV.

Figure 7. The spatial distribution of the time-averaged prediction (a) mean absolute error (MAE) for the Base-SV model and (b) the improvement of the Phys-SV model compared to the Base-SV model ($\mathrm{MAE}_{Base} - \mathrm{MAE}_{Phys}$) from 1 January 2023 to 13 June 2024. The black line indicates the 200 m isobath.

Figure 8. The spatial distribution of the prediction MAE for the Base-SV model and the improvement of the Phys-SV model compared to the Base-SV model in different lead times and seasons: (a) summer and (b) winter, from 1 January 2023 to 13 June 2024. The black line indicates the 200 m isobath.

Figure 9. Comparison of predictions and the target (raw CMEMS SSH data); units: m. The first day of prediction is 10 February 2023.

Figure 10. Time series of (a) PCC between absolute error of the Base-SV and the wind speed magnitude, PCC between improvement of Phys-SV and the wind speed magnitude; (b) magnitude of wind speed and variation in wind speed (vector, including the variation in direction).

Table 1. RMSE (m) of independent replicate experiments using the same random seeds.

Model	Trial 1	Trial 2	Trial 3	Mean	Standard Deviation
Base-SV	0.019768	0.019768	0.019768	0.019768	0
Base-SV + MI	0.019329	0.019329	0.019329	0.019329	0
Base-SV + GC	0.017899	0.017921	0.018242	0.018021	0.000157
Phys-SV	0.017309	0.017316	0.017286	0.017304	0.000013
Base-PR	0.027973	0.027973	0.027973	0.027973	0
Base-PR + MI	0.024610	0.024610	0.024610	0.024610	0
Base-PR + GC	0.033604	0.033604	0.033604	0.033604	0
Phys-PR	0.028282	0.028282	0.028282	0.028282	0

Table 2. Seasonal statistics of SSH variability and the PCC between $\sigma_{T}$ and mean absolute error (MAE) of the Base-SV prediction.

Season	${\bar{σ}}_{S}$ (m)	${\bar{σ}}_{T}$ (m)	PCC (100%)
Summer	0.0741	0.0545	0.52
Winter	0.0954	0.0697	0.58

Table 3. RMSE (m) for the whole domain (WD) and deep-basin areas (DB, depth > 200 m).

Model	Scope	Full Year	Summer	Winter
Base-SV	WD	0.0198	0.0185	0.0210
Base-SV	DB	0.0172	0.0169	0.0172
Phys-SV	WD	0.0173	0.0159	0.0188
Phys-SV	DB	0.0145	0.0142	0.0149

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
