1. Introduction
Sea Surface Height (SSH) and Sea Level Anomaly (SLA) are critical variables in oceanography and climate research, providing valuable insights into ocean surface conditions [1,2,3]. These measurements are directly linked to global climate change, ocean circulation patterns, climate system interactions, and marine ecosystem health. Long-term SSH and SLA data are essential for monitoring phenomena such as global sea level rise, glacial melting, and changes in polar regions [4,5,6,7]. Satellite altimeters, such as those carried by the TOPEX/Poseidon and Jason missions, are an effective means of obtaining SSH observations [8,9]. These satellites transmit signals, typically microwave pulses, to the ocean surface and measure the time it takes for the signals to reflect back. This two-way travel time, combined with the satellite’s position, allows for the calculation of SSH relative to a reference surface, typically a reference ellipsoid [10]. SLA is defined as the deviation of SSH from a reference datum, usually the long-term mean SSH over a specified period. Numerous satellite missions have measured SSH, including TOPEX/Poseidon and the Jason and Sentinel series [11,12]. Advances in satellite technology have enabled the use of gravitational field measurements to estimate ocean mass distribution and thus derive SSH [13]. Satellites like GRACE (Gravity Recovery and Climate Experiment) and GOCE (Gravity field and steady-state Ocean Circulation Explorer) monitor sea level variations through this method [14,15]. Additionally, the intensity of microwave radiation reflected from the ocean surface can be measured, which is especially useful in polar regions or areas with heavy cloud cover [16].
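The altimetric relations described above can be illustrated with a minimal sketch. It assumes an idealized measurement: the atmospheric, sea-state, and geophysical corrections applied in real processing are omitted, and all function names and numbers are hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def altimeter_range(two_way_travel_time_s):
    """Satellite-to-surface distance from the two-way travel time of the pulse."""
    return C * two_way_travel_time_s / 2.0

def sea_surface_height(orbit_altitude_m, range_m):
    """SSH relative to the same reference surface as the orbit altitude."""
    return orbit_altitude_m - range_m

def sea_level_anomaly(ssh_m, mean_ssh_m):
    """SLA as the deviation of SSH from a long-term mean SSH."""
    return ssh_m - mean_ssh_m

# Hypothetical values, chosen only to exercise the three relations.
travel_time = 2 * 1_336_000.0 / C            # pulse travels down and back
rng_m = altimeter_range(travel_time)         # about 1,336,000 m
ssh = sea_surface_height(1_336_012.5, rng_m) # about 12.5 m
sla = sea_level_anomaly(ssh, 12.3)           # about 0.2 m
```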
The Atlantic Ocean, the second-largest ocean in the world, covers approximately 20% of the Earth’s surface. It holds significant geographical, economic, climatic, and ecological importance, stretching from the Arctic to the Antarctic over a distance of about 16,000 km and connecting multiple continents [17]. As a central component of the global climate system, the Atlantic Ocean plays a crucial role in ocean circulation, heat distribution, climate regulation, and biodiversity [18]. The circulation within the Atlantic Ocean can be divided into two main components: surface circulation and deep circulation [19]. Surface circulation is primarily driven by wind patterns and includes major currents such as the Gulf Stream, the North Atlantic Drift, the Canary Current, and the South Atlantic Gyre. In contrast, deep circulation is driven by variations in temperature and salinity, often referred to as thermohaline circulation. Among these currents, the Gulf Stream is one of the most powerful ocean currents globally, significantly influencing climate patterns and weather systems across the globe [20].
Mesoscale eddies are oceanic structures with spatial scales between large-scale and small-scale features, typically ranging from tens to hundreds of kilometers [3,21]. These eddies play a crucial role in understanding complex dynamical processes in the ocean, particularly in energy transfer, material exchange, and ocean circulation [22]. Mesoscale eddies are driven by a range of nonlinear processes, including wind stress, geostrophic effects, eddy instability, and other dynamic factors [23]. In the Gulf Stream, mesoscale eddies are primarily generated by instabilities and boundary effects of the current, such as current splitting, overturning, or bending [20,24]. Recent observations show that the number of anticyclonic warm eddies has nearly doubled since 2000 [25,26]. This increase is attributed to changes in the speed and position of the Gulf Stream, which has become more energetic and sinuous, facilitating the detachment of more mesoscale eddies. Modeling studies suggest that under global warming conditions, there is a reduction in the amplitude of these eddies, while the number and radius of the mesoscale eddies have increased [27].
Geostrophic flow is a fundamental type of motion in oceanic hydrodynamics, commonly observed in large-scale ocean circulation [4]. This flow is primarily influenced by the Earth’s rotation. The Coriolis force, a result of the Earth’s rotation, significantly affects fluid motion, and in large-scale flows, it typically balances with the horizontal pressure gradient force within the water column, leading to the formation of geostrophic flow [8,28,29]. The study of oceanic flow patterns began in the early 20th century, with geostrophic flow gradually being recognized as a typical and important flow pattern [20,30]. As global climate change and shifts in the marine environment increasingly impact human society, understanding the characteristics and behavior of ocean currents becomes crucial [6]. The research on geostrophic flow holds significant academic and practical value for climate prediction, marine resource development, and ecological protection. In particular, the study of geostrophic flow within large-scale ocean current systems, such as the equatorial countercurrent and polar circulation, is essential. By exploring these flows, scientists can uncover the fundamental principles governing ocean circulation and improve predictions of long-term oceanic changes [7].
Currently, research on mesoscale eddies and geostrophic flows primarily relies on gridded reanalysis data, which demands significant computational resources. In contrast, directly fitting satellite along-track data can provide an approximate real-time sea surface height field. This direct fitting approach is structurally simple and does not require a supercomputing platform, making it an efficient alternative. One effective method for direct fitting is the bicubic quasi-uniform B-spline fitting technique. Previous studies, such as those by Xu et al., have demonstrated the method’s success in idealized experiments, where it effectively fits mesoscale eddies using satellite along-track data over a 15° × 15° oceanic region [31]. However, the applicability of bicubic quasi-uniform B-spline fitting has not been fully explored in larger, more complex regions. In oceanography, basin-scale studies encompass critical physical processes, such as large-scale ocean circulation, thermohaline circulation, and climate change, all of which have widespread effects on the ocean and are not confined to specific areas.
This study examines the performance of bicubic quasi-uniform B-spline fitting at the basin scale, using the North Atlantic as a case study. By comparing the fitting error to the error between Level-4 gridded SLA data and observed data, we find that the two errors are closely aligned. The study also examines the effect of the coastline on sea surface height fitting results. Additionally, the fitting results are used to calculate the geostrophic flow and the vertical component of relative vorticity. These results are compared with the Level-4 gridded geostrophic flow data and with a reliable vertical component of relative vorticity calculated from those data. The remainder of this paper is organized as follows: Section 2 describes the data and methodology, Section 3 presents the experimental steps and results, and Section 4 summarizes and concludes the main findings of the study.
3. Results
3.1. Selection of Order Combinations for Fitting Bicubic Quasi-Uniform B-Spline Surfaces
To minimize the effect of the coastline on sea surface height fitting, a 36° × 30° area with minimal land coverage was selected. This region, spanning 16°W to 52°W and 20°N to 50°N, was used for 10-fold cross-validation to determine the combination of data duration and spline orders that gives the best fit. Combining the control-variable method with cross-validation, we first fixed the data duration at 1 day and varied the order from 39 to 46 in the east–west direction and from 36 to 41 in the north–south direction. For each order combination, the bicubic quasi-uniform B-spline method was applied for surface fitting, and 10-fold cross-validation was performed to calculate the mean absolute error (MAE) averaged across the ten validation sets. Subsequently, the data duration was incrementally increased from 1 day to 13 days, and the MAE values for the different time ranges were compared. For 1–4 days of data (with fewer observation points), the MAE values were larger, and these results are not presented here.
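The order-selection procedure can be sketched as follows. This is a minimal illustration and not the authors’ implementation: it assumes that the “order” in each direction equals the number of control points, it substitutes SciPy’s `LSQBivariateSpline` (a least-squares bicubic spline with clamped boundary knots and uniform interior knots) for the paper’s fitting routine, and it uses synthetic scattered data in place of along-track SLA. The helper names `interior_knots`, `cv_mae`, and `best_orders` are hypothetical.

```python
import numpy as np
from scipy.interpolate import LSQBivariateSpline

def interior_knots(lo, hi, n_ctrl, k=3):
    """Uniform interior knots for a clamped spline with n_ctrl control points."""
    return np.linspace(lo, hi, n_ctrl - k + 1)[1:-1]

def cv_mae(lon, lat, sla, n_ew, n_ns, bbox, n_folds=10, seed=0):
    """Validation MAE averaged over n_folds folds for one order combination."""
    tx = interior_knots(bbox[0], bbox[1], n_ew)
    ty = interior_knots(bbox[2], bbox[3], n_ns)
    idx = np.random.default_rng(seed).permutation(lon.size)
    maes = []
    for fold in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, fold)  # all points not in the validation fold
        spl = LSQBivariateSpline(lon[train], lat[train], sla[train],
                                 tx, ty, bbox=bbox, kx=3, ky=3)
        pred = spl.ev(lon[fold], lat[fold])
        maes.append(np.mean(np.abs(pred - sla[fold])))
    return float(np.mean(maes))

def best_orders(lon, lat, sla, ew_orders, ns_orders, bbox):
    """Order combination with the smallest cross-validated MAE."""
    scores = {(m, n): cv_mae(lon, lat, sla, m, n, bbox)
              for m in ew_orders for n in ns_orders}
    return min(scores, key=scores.get), scores

# Synthetic stand-in for along-track SLA (arbitrary units).
rng = np.random.default_rng(1)
lon = rng.uniform(0.0, 10.0, 3000)
lat = rng.uniform(0.0, 8.0, 3000)
sla = np.sin(lon) * np.cos(lat) + rng.normal(0.0, 0.01, lon.size)
bbox = [0.0, 10.0, 0.0, 8.0]

mae_low = cv_mae(lon, lat, sla, 5, 5, bbox)     # too few control points
mae_high = cv_mae(lon, lat, sla, 12, 12, bbox)  # enough to resolve the field
best, scores = best_orders(lon, lat, sla, [5, 12], [5, 12], bbox)
```

In the study’s setting, the same loop would run over east–west orders 39–46 and north–south orders 36–41 and be repeated for each data duration.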
Table 2 shows the number of data points and the corresponding optimal MAE values for the B-spline fitting method applied to datasets ranging from 5 to 13 days. For each time range, the optimal MAE is the smallest MAE obtained across the tested order combinations for the along-track data.
As shown in Table 2, the optimal MAE values for different time ranges of data initially decrease and then gradually increase as the time range extends. The optimal MAE for the 9-day data is the smallest among all the time ranges considered. Shorter time ranges lead to significant fitting errors because of the limited amount of information, while longer time ranges can introduce larger errors. These larger errors are primarily due to the increased time gap between the earliest and latest observations, as well as the influence of eddy motion. Furthermore, the difference in optimal MAEs for the 8–13 days data is relatively small, suggesting that bicubic quasi-uniform B-spline fits using data from 8 to 11 days are all feasible, particularly given that errors from eddy motion increase with longer time ranges. Therefore, to minimize the impact of these factors, the 9-day data, which yields the smallest optimal MAE, is selected for further analysis.
A 10-fold cross-validation was conducted using 9 days of data, with the east–west direction order and the north–south direction order varied. The results of this cross-validation are presented in Figure 3, which illustrates that the MAE is minimized when the east–west direction order is 41 and the north–south direction order is 38.
To further investigate the relationship between spatial range and optimal fitting order, data spanning a 9-day period were used, with different spatial regions selected for 10-fold cross-validation, as illustrated in Figure 4.
The optimal order combinations were first determined through 10-fold cross-validation for three progressively smaller regions: the ABCD, EFGH, and EIJK regions, as shown in Figure 4a. Next, regions with unequal latitude and longitude extents, depicted in Figure 4b, were examined. Based on the optimal order combinations derived from the LMNO region, the latitude and longitude ranges were changed, and the optimal order combinations for the PQRS region were subsequently obtained via 10-fold cross-validation. To explore the effect of spatial extent in the east–west versus north–south directions, 10-fold cross-validation was also performed for the TQRU and LMVW regions. The optimal order combinations for different spatial extents are summarized in Table 3. Specifically, the optimal orders are 27, 28, and 31 for a 20° spatial range, 38, 40, and 41 for a 30° spatial range, and 39, 41, 43, 45, and 46 for a 36° spatial range. It can be observed that the ratio of the optimal order to the spatial span decreases from about 1.4 to about 1.2 as the spatial span increases from 20° to 36°. This indicates that although larger spatial spans require more control points and higher orders for accurate fitting, the increase in order tends to slow down relative to the increase in spatial extent. This happens because as the spatial span expands, the influence of local features on the fit decreases. In small spatial ranges, close to the scale of mesoscale eddies, a relatively high order is necessary to accurately fit local features due to the density of data points and the potential for strong local variations. In contrast, for large-scale regions much larger than the scale of mesoscale eddies, the data’s variation tends to be smoothed out, and higher orders are primarily used to fit global trends, rather than local features. Thus, as the spatial span widens, the focus shifts toward global fitting, resulting in a slower rate of increase in order demand.
From this, the following conclusions can be drawn:
When fitting satellite along-track data for spatial ranges not larger than 10 times the scale of mesoscale eddies, the optimal order combinations can be selected by centering on the order corresponding to 1.4 times the latitudinal and longitudinal spans, and expanding the range of orders forward and backward for cross-validation;
When the spatial range required for fitting exceeds 10 times the scale of mesoscale eddies, the optimal order combinations can be selected by centering on the orders corresponding to 1.2 times the latitudinal and longitudinal spans, with the range of orders expanded forward and backward for cross-validation.
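The two rules above can be expressed as a small helper that proposes candidate orders for cross-validation. This is only a sketch: the function name, the `half_width` of the search window, and the eddy scale passed in are assumptions introduced here, since the text does not fix a numerical eddy scale or a search width.

```python
def candidate_orders(span_deg, eddy_scale_deg, half_width=3):
    """Candidate spline orders centered on 1.4x or 1.2x the span in degrees.

    span_deg: latitudinal or longitudinal extent of the fitting region.
    eddy_scale_deg: assumed mesoscale eddy scale in degrees (hypothetical input).
    half_width: how far to search on either side of the center (assumed).
    """
    factor = 1.4 if span_deg <= 10 * eddy_scale_deg else 1.2
    center = round(factor * span_deg)
    return list(range(center - half_width, center + half_width + 1))

# With an assumed 2-degree eddy scale:
small_region = candidate_orders(20, 2.0)  # centered on 28
large_region = candidate_orders(36, 2.0)  # centered on 43
```

Each candidate list would then be passed to the 10-fold cross-validation to pick the final order in that direction.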
3.2. Hypothesis Test
The B-spline fitting result obtained with the optimal order combination from 10-fold cross-validation, based on 9-day data, is shown in Figure 5a. Figure 5b displays the corresponding fitting errors at each satellite–subsatellite point. Figure 5c shows the along-track data at each satellite–subsatellite point. Figure 5d illustrates the CMEMS Level-4 gridded SLA data. The MAE of the fit is 1.89 cm, with a root mean square error (RMSE) of 3.02 cm. In comparison, the MAE and RMSE between CMEMS Level-4 gridded SLA data and satellite along-track data are 1.95 cm and 3.06 cm, respectively, demonstrating a similar level of accuracy. Thus, although B-spline fitting is a simpler method that provides quick results, its accuracy is comparable to that of the gridded data. Mesoscale eddy boundaries are characterized by complex dynamics, such as the roll-up effect, turbulence, and water exchange, which lead to highly nonlinear variations in sea surface height. While B-spline fitting effectively interpolates data in most regions, it may produce larger fitting errors in these nonlinear, dynamic areas, such as the edges of eddies. In regions with active mesoscale signals, energy is transferred to smaller scales through strong shear and other instabilities, making these signals difficult to resolve directly and leading to errors. The fitting errors observed in Figure 5b are primarily associated with these complex regions. Hypothesis testing is applied to further analyze the fitting errors.
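For reference, the two error measures used throughout this section can be computed as in the following minimal sketch, where the fitted and observed values are hypothetical SLA samples at the same points.

```python
import numpy as np

def mae(fitted, observed):
    """Mean absolute error between fitted and observed values."""
    return float(np.mean(np.abs(np.asarray(fitted) - np.asarray(observed))))

def rmse(fitted, observed):
    """Root mean square error between fitted and observed values."""
    return float(np.sqrt(np.mean((np.asarray(fitted) - np.asarray(observed)) ** 2)))

# Hypothetical SLA values in cm at two along-track points.
fitted_cm = [13.0, 6.0]
observed_cm = [10.0, 10.0]
example_mae = mae(fitted_cm, observed_cm)    # errors are +3 and -4
example_rmse = rmse(fitted_cm, observed_cm)
```

Because the RMSE weights large errors more heavily than the MAE, a few large misfits near eddy edges raise the RMSE more than the MAE.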
Figure 5e displays the bicubic quasi-uniform B-spline fitting results, with latitude and longitude coordinates consistent with the CMEMS Level-4 gridded SLA data. The red line represents the contour of the anticyclonic vortex, and the blue line represents the contour of the cyclonic vortex. The same applies to the lines in Figure 5f. Figure 5f displays the result of subtracting the SLA in Figure 5d from the SLA in Figure 5e. Observing Figure 5f, it can be seen that within the anticyclonic vortex contour, the B-spline fitting results are consistently greater than the Level-4 gridded SLA data. Specifically, after conducting a statistical analysis of the points in Figure 5f, 33.3% of the B-spline results are more than 1 cm larger, and 3.7% are more than 5 cm larger than the Level-4 gridded SLA data. In contrast, within the cyclonic vortex contour, the B-spline fitting results are consistently smaller than the Level-4 gridded SLA data, with 40.3% of the results being more than 1 cm smaller, and 7.3% being more than 5 cm smaller. The average difference between the B-spline fitting results and the Level-4 gridded SLA data within the anticyclonic vortex contour is 0.81 cm, while the average difference within the cyclonic vortex contour is −1.22 cm. This suggests that the B-spline fitting results more accurately reflect the true intensity of mesoscale eddies in the ocean.
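The statistics quoted for the eddy contours can be reproduced from a difference field and a contour mask along the following lines. This is a sketch with hypothetical inputs: how the eddy contours are identified is not covered here, so the mask is simply assumed to be given.

```python
import numpy as np

def contour_difference_stats(diff_cm, inside_mask, sign=1):
    """Summarize (B-spline minus gridded) SLA differences inside an eddy contour.

    sign=+1 counts points where the B-spline result is larger (anticyclonic case);
    sign=-1 counts points where it is smaller (cyclonic case).
    """
    d = np.asarray(diff_cm, dtype=float)[np.asarray(inside_mask, dtype=bool)]
    signed = sign * d
    return {
        "frac_beyond_1cm": float(np.mean(signed > 1.0)),
        "frac_beyond_5cm": float(np.mean(signed > 5.0)),
        "mean_diff_cm": float(d.mean()),
    }

# Hypothetical difference values (cm) and contour membership.
diff = np.array([0.5, 2.0, 6.0, -3.0])
inside = np.array([True, True, True, False])
anticyclonic = contour_difference_stats(diff, inside, sign=1)
cyclonic = contour_difference_stats(-diff, inside, sign=-1)
```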
In practice, the distribution of data is often influenced by various factors, making it difficult to strictly follow a normal distribution. However, according to the central limit theorem, the sampling distribution of the mean is approximately normal for large sample sizes, so hypothesis tests on the mean remain valid even if the data deviate somewhat from normality.
To assess the normality of the data, tools such as normal distribution histograms and P–P (probability–probability) plots are commonly used. A normal distribution histogram displays the shape of the data distribution by grouping the data and plotting it as a bar chart. If the histogram resembles a curve with a higher center and lower tails, and its shape closely aligns with the ideal normal distribution, the data can be considered approximately normal. This is indicated by symmetry and concentration around the center, characteristic of a normal distribution. A P–P plot compares the cumulative distribution of the sample data to the cumulative distribution of a theoretical normal distribution. If the data points in the P–P plot closely follow the straight line representing the theoretical normal distribution, the data are considered to closely follow a normal distribution. Significant deviations, particularly large curves or areas of separation from the line, suggest that the data are not normally distributed. By analyzing both the normal distribution histogram and the P–P plot, it becomes more intuitive to assess whether the data are approximately normally distributed. The frequency distribution histograms and P–P plots of the fitting errors are shown in
Figure 6.
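The coordinates of a P–P plot can be computed as in the sketch below. The plotting positions (i − 0.5)/n and the use of the sample mean and standard deviation to define the theoretical normal distribution are assumptions made here; the synthetic errors stand in for the actual fitting errors.

```python
import math
import numpy as np

def pp_points(errors):
    """Theoretical vs. empirical cumulative probabilities for a normal P-P plot."""
    x = np.sort(np.asarray(errors, dtype=float))
    n = x.size
    empirical = (np.arange(1, n + 1) - 0.5) / n
    z = (x - x.mean()) / x.std(ddof=1)
    # Standard normal CDF evaluated at each standardized value.
    theoretical = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z])
    return theoretical, empirical

# Synthetic, normally distributed errors (cm) as a stand-in.
sample = np.random.default_rng(0).normal(0.0, 3.0, 5000)
theo, emp = pp_points(sample)
max_departure = float(np.max(np.abs(theo - emp)))  # small if data are near normal
```

Plotting `emp` against `theo` gives the P–P plot; points close to the diagonal indicate approximate normality.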
The histogram of the frequency distribution shows a peak in the middle with lower frequencies on both sides, and the P–P plot indicates that the cumulative probability of the sample data matches the cumulative probability of the theoretical normal distribution. Although the error data may not strictly follow a normal distribution, they can be considered approximately normal for practical purposes. A one-sample Z-test was conducted to assess the accuracy of the B-spline fit, with the test value set to 0. Because the population standard deviation is unknown, it is replaced by the sample standard deviation, which is 3.42 cm. The null hypothesis is that the mean of the fitting error does not differ from the test value. The one-sample Z-test yielded a p-value of 0.9935. Since the p-value exceeds the common significance level of 0.05, the null hypothesis cannot be rejected, indicating no significant difference between the mean of the fitting error and 0.
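A two-sided one-sample Z-test of this kind can be sketched as follows, with the sample standard deviation standing in for the population value as described. The input arrays are hypothetical.

```python
import math
import numpy as np

def one_sample_z_test(errors, mu0=0.0):
    """Two-sided one-sample Z-test of the mean against mu0.

    The sample standard deviation replaces the unknown population value.
    """
    e = np.asarray(errors, dtype=float)
    standard_error = e.std(ddof=1) / math.sqrt(e.size)
    z = (e.mean() - mu0) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return float(z), float(p_value)

# Hypothetical errors with zero mean: the null hypothesis is not rejected.
z_centered, p_centered = one_sample_z_test([-2.0, -1.0, 0.0, 1.0, 2.0])

# Hypothetical biased errors: the null hypothesis is rejected.
biased = np.tile([-2.0, -1.0, 0.0, 1.0, 2.0], 20) + 5.0
z_biased, p_biased = one_sample_z_test(biased)
```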
3.3. The Influence of the Coastline on the Fitting Results
SLA data obtained from satellite remote sensing are typically affected by various factors, such as noise, satellite orbit errors, and meteorological influences. Near the coastline, SLA values are generally less accurate than in open sea areas due to several challenges. These include the relative height difference between the water surface and land, the complex dynamics of nearshore waters, and the lower spatial resolution of the satellite data. Coastal topography is often more complex, encompassing various geomorphic types like sandy beaches, rocky areas, and wetlands, as well as shallow regions and estuaries. These features are difficult to accurately represent in SLA data. Additionally, it is difficult to ensure that the grid points generated for fitting, when checked against ETOPO1 topographic data, align precisely with the boundaries of the satellite’s along-track data.
Given these challenges, it is important to investigate how the coastline affects B-spline fitting. Specifically, we explore whether varying the proportion of coastline included in the fitting range impacts the fitting error.
Based on the previous analysis, the region 16–52°W, 20–50°N, which is initially unaffected by the coastline, was selected as the starting fitting region. The spatial extent of 36° × 30° was maintained, and the order combination was fixed at 41 in the east–west direction and 38 in the north–south direction. The fitting region was then shifted westward in 5° increments until the final spatial extent of 46–82°W, 20–50°N was reached. Subfigures 1–7 of Figure 7 display the results for the different fitting regions.
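The sequence of fitting regions can be generated as in this short sketch. Longitudes are expressed in degrees East (negative values are degrees West), and the function name and signature are hypothetical.

```python
def shifted_windows(east0=-16.0, width=36.0, step=5.0, n_shifts=6):
    """(west, east) longitude bounds of each fitting region, in degrees East."""
    windows = []
    for k in range(n_shifts + 1):
        east = east0 - k * step      # shift the eastern edge westward
        windows.append((east - width, east))
    return windows

windows = shifted_windows()
# First window: 52W to 16W; last window: 82W to 46W.
```

Six shifts of 5° from the starting region give the seven regions, with the latitude range held at 20–50°N.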
To assess the impact of the coastline on fitting errors, the region was divided into sections using meridians at 52°W, 57°W, 62°W, 67°W, 72°W, and 77°W. The MAE for the fitting results is computed separately for each section, with the MAE values presented in the corresponding figures. The analysis shows that the MAE does not continue to increase as the fitting region approaches the coastline. Instead, the MAE is primarily influenced by the number of mesoscale eddies within each spatial extent. Regions with a high density of mesoscale eddies exhibit larger MAEs, as the B-spline fitting errors tend to concentrate along the edges of these eddies.
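Computing the MAE separately for each meridional section amounts to binning the fitting errors by longitude, as in the sketch below. The section indexing (0 for the westernmost section) and the sample values are assumptions for illustration.

```python
import numpy as np

def mae_by_section(lon_deg_east, err, meridians_deg_east):
    """MAE of fitting errors within each section bounded by the given meridians.

    Section 0 lies west of the westernmost meridian; the last section lies
    east of the easternmost one.
    """
    edges = np.sort(np.asarray(meridians_deg_east, dtype=float))
    lon = np.asarray(lon_deg_east, dtype=float)
    err = np.asarray(err, dtype=float)
    section = np.digitize(lon, edges)
    return {int(s): float(np.mean(np.abs(err[section == s])))
            for s in np.unique(section)}

# Meridians at 77W, 72W, 67W, 62W, 57W, and 52W, in degrees East.
meridians = [-77, -72, -67, -62, -57, -52]
# Hypothetical sub-satellite longitudes and fitting errors (cm).
section_mae = mae_by_section([-80, -75, -75, -50], [1.0, -2.0, 4.0, 3.0], meridians)
```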
To further investigate the influence of the coastline on mesoscale eddy fitting, two areas with high mesoscale eddy density are selected for closer examination: 40–50°W, 30–48°N and 51–67°W, 32–42°N. These regions are marked by red and black boxes, respectively, in Figure 7. The MAEs of the fitting results within these regions are compared across subfigures. For the red boxes, the MAE from Subfigure 1 serves as the baseline. For the black boxes, the baseline is taken from Subfigure 8 in Figure 7, which covers a region (40–70°W, 20–42°N) that contains no coastline but is still close to the continent. The fit in Subfigure 8 uses the optimal order combination of 36 for the east–west direction and 28 for the north–south direction, determined through 10-fold cross-validation.
Figure 8 presents the MAE and RMSE of the fitting results for the satellite–subsatellite points in each region of the subfigures in Figure 7. Additionally, it shows the MAE and RMSE for the satellite–subsatellite points within the red box and the black box in each subfigure of Figure 7.
Comparing the MAE and RMSE of the fitting results for the entire region across the subfigures in Figure 7 (shown in Figure 8), we observe that as the fitting region shifts westward, both the MAE and RMSE gradually increase. This is primarily because the amount of satellite along-track data decreases as the region moves west while the fitting order stays fixed; the fit then has more free parameters than the data can constrain, which causes overfitting and higher errors. However, the maximum increase in MAE was only 1.20 cm, and the maximum increase in RMSE was only 2.49 cm. In contrast, the MAE and RMSE for the region in Subfigure 8 of Figure 7, where the optimal fitting orders were selected using cross-validation, are comparable to those in Subfigure 1. The MAE and RMSE for the regions within the red boxes in Figure 7 show that the shore boundary has minimal effect on B-spline fitting in areas slightly farther from the shore. Similarly, for the regions within the black boxes, the influence of the shore boundary on B-spline fitting in areas closer to the shore is also negligible.
Traditional global fitting methods, such as polynomial fitting, are often influenced by shoreline variations, which can distort the fitting results for the entire ocean area. However, B-spline fitting, due to its local support structure, is less affected by shoreline variations, impacting only a small region near the shore. This property allows researchers to focus on specific regions of the sea surface height without worrying about the complex shoreline or distant areas interfering with the fit.
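The local support property can be illustrated in one dimension, which carries over to each direction of the tensor-product surface. In the sketch below, a single control point of a clamped cubic B-spline is changed, and the curve changes only on the four knot intervals that the corresponding basis function spans. The knot layout and control-point count are arbitrary choices for illustration.

```python
import numpy as np
from scipy.interpolate import BSpline

k = 3                                   # cubic
n_ctrl = 13                             # number of control points (arbitrary)
interior = np.linspace(0.0, 10.0, n_ctrl - k + 1)[1:-1]          # 1, 2, ..., 9
t = np.concatenate(([0.0] * (k + 1), interior, [10.0] * (k + 1)))  # clamped knots

c_base = np.zeros(n_ctrl)
c_pert = c_base.copy()
c_pert[6] = 1.0                         # perturb a single control point

x = np.linspace(0.0, 10.0, 101)
diff = BSpline(t, c_pert, k)(x) - BSpline(t, c_base, k)(x)

# Basis function 6 is nonzero only between knots t[6] and t[6 + k + 1].
support = (t[6], t[6 + k + 1])
outside = (x < support[0]) | (x > support[1])
max_change_outside = float(np.max(np.abs(diff[outside])))
max_change_inside = float(np.max(np.abs(diff[~outside])))
```

A global polynomial has no such property: each of its coefficients affects the fitted values across the whole domain.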
3.4. Calculate and Compare Geostrophic Flows
Due to the absence of velocity data in satellite along-track measurements, the geostrophic flow calculations based on SLA data differ from the actual flow field. To address this, reliable flow field data are selected from gridded data. Specifically, Level-4 gridded daily sea surface geostrophic flow data are used to calculate a 9-day average, centered on the same time range as the satellite data.
Figure 9a shows the sea surface geostrophic flow calculated from the B-spline-fitted SLA using central finite differences. Figure 9b,c display the flow field from the Level-4 gridded data and the same flow field after B-spline fitting to increase the number of grid points, respectively. The steps for fitting the flow velocity data with B-splines are the same as those used for SLA.
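A minimal sketch of the geostrophic calculation from a gridded SLA field is given below. It assumes the standard geostrophic relations u = −(g/f) ∂η/∂y and v = (g/f) ∂η/∂x with f = 2Ω sin φ, a spherical Earth, and `numpy.gradient`, which uses central differences at interior points and one-sided differences at the edges. The constants, function name, and test field are assumptions made here and are not taken from the paper.

```python
import numpy as np

G = 9.81            # gravitational acceleration, m/s^2
OMEGA = 7.292e-5    # Earth's rotation rate, rad/s
R_EARTH = 6.371e6   # mean Earth radius, m

def geostrophic_velocity(sla_m, lat_deg, lon_deg):
    """Geostrophic (u, v) in m/s from SLA in meters on a (lat, lon) grid.

    Not valid near the equator, where the Coriolis parameter approaches zero.
    """
    lat = np.deg2rad(np.asarray(lat_deg, dtype=float))
    lon = np.deg2rad(np.asarray(lon_deg, dtype=float))
    f = 2.0 * OMEGA * np.sin(lat)[:, None]                 # Coriolis parameter
    deta_dy = np.gradient(sla_m, lat, axis=0) / R_EARTH
    deta_dx = np.gradient(sla_m, lon, axis=1) / (R_EARTH * np.cos(lat)[:, None])
    u = -G / f * deta_dy
    v = G / f * deta_dx
    return u, v

# Hypothetical field: SLA rising northward with a constant slope of 1e-6.
lat_deg = np.arange(30.0, 40.25, 0.25)
lon_deg = np.arange(-50.0, -39.75, 0.25)
slope = 1e-6
sla = slope * R_EARTH * (np.deg2rad(lat_deg) - np.deg2rad(lat_deg[0]))[:, None]
sla = sla * np.ones((lat_deg.size, lon_deg.size))
u, v = geostrophic_velocity(sla, lat_deg, lon_deg)
```

For this field the flow is purely westward, with u = −g·slope/f at each latitude and v = 0.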
From Table 4, it can be seen that at nearly half of the grid points the difference between the calculated geostrophic flow and the gridded data is within 0.03 m/s. The mean velocity magnitude of the sea surface geostrophic flow in the gridded reanalysis data is 0.15 m/s. To better understand the differences between the geostrophic currents, we computed several metrics for the sea surface geostrophic current derived from B-spline SLA fitting and compared them with the corresponding currents from the gridded reanalysis data. The key metrics are as follows: MAE of 0.06 m/s, RMSE of 0.56 m/s, Mean Error (ME) of 0.003 m/s, and Structural Similarity Index (SSIM) of 0.62. This suggests that the two sea surface geostrophic current fields are spatially similar, but their quantitative agreement is limited. This difference arises because the gridded reanalysis data integrate multi-source observations (such as buoys, satellites, and models) and therefore have a more robust data foundation than the geostrophic flow in this study, which is derived solely from the along-track data of six satellites. On the other hand, SLA-based geostrophic flow calculations offer a finer resolution, because the grid step can be made smaller than that of the gridded data. Currently, full-field geostrophic flow calculations primarily rely on gridded data and interpolation of flow at satellite–subsatellite points. Although traditional gridded reanalysis products can provide accurate SLA, they tend to smooth out small-scale features during the gridding and data merging process. Additionally, converting irregularly distributed data to standard grids is time-consuming, which can compromise the timeliness of the results [20]. In contrast, real-time processing using satellite along-track data improves the timeliness of geostrophic flow estimates.
3.5. Calculate and Compare Vertical Component of Relative Vorticity
The sea surface geostrophic flow from the Level-4 gridded data is used to define the reliable flow field. The vertical component of the relative vorticity field is then computed from these flow fields using Formula (8), as shown in Figure 10b,c. In conjunction with Figure 9, it is evident that regions with elevated sea surface heights exhibit a negative vertical component of relative vorticity relative to their surroundings, while regions with lower sea surface heights show a positive vertical component of relative vorticity. This pattern underscores the fundamental relationship between sea surface height variations and the vertical component of relative vorticity, consistent with the established principles of ocean dynamics regarding vortex structures and sea surface height distribution.
Figure 10a presents the vertical component of relative vorticity computed from the B-spline-fitted SLA. The locations of the positive and negative centers of the vertical component of relative vorticity align closely with those in the reliable field. This suggests that the B-spline fitting method effectively captures the spatial distribution of the vertical component of relative vorticity and, despite boundary smoothing differences, still accurately reflects the key characteristics of the field. To better understand the differences between the two fields, we computed several metrics for the vertical component of relative vorticity derived from B-spline SLA fitting and compared them with the corresponding values from the gridded reanalysis data. The key metrics are as follows: MAE of
, RMSE of
, ME of
, and SSIM of 0.96. The regions with a high vertical component of relative vorticity typically have values on the order of 10⁻⁵ s⁻¹. This suggests that the two fields of the vertical component of relative vorticity are spatially similar, but their quantitative agreement is limited.
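The vertical component of relative vorticity can be computed from a gridded flow field as in the sketch below. Formula (8) is not reproduced in this section, so the sketch assumes the standard definition ζ = ∂v/∂x − ∂u/∂y on a local Cartesian grid with coordinates in meters; the function name and test flow are hypothetical.

```python
import numpy as np

def relative_vorticity(u, v, x_m, y_m):
    """Vertical component of relative vorticity, dv/dx - du/dy, in 1/s.

    u, v: velocity arrays of shape (ny, nx) in m/s.
    x_m, y_m: 1-D grid coordinates in meters.
    """
    dv_dx = np.gradient(v, x_m, axis=1)
    du_dy = np.gradient(u, y_m, axis=0)
    return dv_dx - du_dy

# Hypothetical solid-body rotation: clockwise, i.e. anticyclonic in the
# Northern Hemisphere, with angular velocity -1e-5 rad/s.
x = np.linspace(-5e4, 5e4, 21)
y = np.linspace(-5e4, 5e4, 21)
X, Y = np.meshgrid(x, y)
angular_velocity = -1e-5
u = -angular_velocity * Y
v = angular_velocity * X
zeta = relative_vorticity(u, v, x, y)   # uniform, equal to 2 * angular_velocity
```

The uniform negative value for the clockwise rotation matches the sign pattern described above for regions of elevated sea surface height in the Northern Hemisphere.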
4. Conclusions
This study builds upon the B-spline surface fitting framework for extracting SLA data, previously validated for regional scales [31]. We extend the application of this method to a basin-scale domain, conducting experiments and analysis over the North Atlantic (36° × 30°). Our detailed analysis demonstrates that the bicubic quasi-uniform B-spline surface fitting method achieves results comparable to Level-4 gridded SLA data when applied to satellite along-track SLA data across the Atlantic Ocean basin. As the fitting area expands, the rate of increase in fitting order slows relative to the spatial extent, though higher-order fits and more control points are necessary for accuracy. The optimal fitting order is approximately 1.4 times the latitudinal and longitudinal spans for areas not exceeding 10 times the scale of mesoscale eddies, and about 1.2 times the spans for areas greater than 10 times the eddy scale. Within the anticyclonic vortex contour, 33.3% of the B-spline fitting results are more than 1 cm larger than the Level-4 gridded SLA data. In contrast, within the cyclonic vortex contour, 40.3% of the B-spline fitting results are more than 1 cm smaller than the Level-4 gridded SLA data. This suggests that the B-spline fitting results more accurately reflect the true intensity of mesoscale eddies in the ocean.
The study also examines the effect of the coastline on sea surface height fitting results. It finds that the local support property of B-spline fitting minimizes the impact of the coastline on the fitting accuracy, even in coastal areas. This suggests that the B-spline method is robust under complex topographic conditions and is well suited for studying near-shore sea areas. The geostrophic flow derived from the B-spline fitting is in spatial agreement with the Level-4 gridded sea surface geostrophic flow data, as is the calculated vertical component of relative vorticity. This demonstrates that B-spline fitting enhances the density of data while preserving the overall structure of the vertical component of relative vorticity field.
Our work systematically quantifies the impacts of coastlines by incorporating land into the fitting region. The local support property of the B-spline effectively confines errors to nearshore zones (ΔMAE < 1.20 cm), while eddy-dense areas remain unaffected (Figure 7). This confirms the method’s suitability for application in marginal seas and boundary current systems, such as the Gulf Stream. Compared to traditional gridded reanalysis products, the B-spline method improves the timeliness of geostrophic flow data while retaining key details of the sea surface height field, which is critical for real-time ocean flow monitoring and prediction.