Article

Error Quantification of Gaussian Process Regression for Extracting Eulerian Velocity Fields from Ocean Drifters

1
Rosenstiel School of Marine, Atmospheric and Earth Science, University of Miami, Miami, FL 33149, USA
2
Special Oceanography Coordination, Federal University of Santa Catarina, Florianópolis 88040, SC, Brazil
*
Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2025, 13(3), 431; https://doi.org/10.3390/jmse13030431
Submission received: 15 January 2025 / Revised: 12 February 2025 / Accepted: 17 February 2025 / Published: 25 February 2025
(This article belongs to the Section Physical Oceanography)

Abstract

Drifter observations can provide high-resolution surface velocity data (Lagrangian data), commonly used to reconstruct Eulerian velocity fields. Gaussian Process Regression (GPR), a machine learning method based on Gaussian probability distributions, has been widely applied for velocity field interpolation due to its ability to provide interpolation error estimates and handle separations between particles. However, its evaluation has primarily relied on cross-validation, which approximates temporal and spatial correlations but does not fully capture their dependencies, limiting the comprehensiveness of performance assessment. Moreover, GPR has not been rigorously tested on model datasets with reference velocity fields to evaluate its overall accuracy and the reliability of the error estimate. This study addresses these gaps by (1) assessing the accuracy of GPR-reconstructed fields and their error estimates, (2) evaluating GPR performance across temporal and spatial dimensions, and (3) analyzing the relationship between training data density and prediction accuracy. Using six metrics, GPR predictions are evaluated on a double-gyre model and a Navy Coastal Ocean Model (NCOM). Results show that GPR achieves high accuracy, contingent on sampling density and velocity magnitude, while validating the posterior covariance matrix as a reliable error predictor. These findings provide critical insights into the strengths and limitations of GPR in oceanographic applications.

1. Introduction

With the explosive growth of available data and computing resources, recent advances in machine learning and artificial neural networks have yielded transformative results across diverse scientific disciplines, including image recognition, cognitive science, and genomics [1]. In recent years, machine learning and artificial neural networks have been widely applied in ocean science research [2,3,4,5].
The deployment of drifting buoys has increased sharply over the last decade, and with it the density of available Lagrangian data [6,7,8,9,10]. Lagrangian data, which can offer high-resolution surface velocity information, are crucial for fundamental ocean science and for applied problems, including but not limited to search-and-rescue efforts, drifting sensor arrays, and pollution mitigation measures such as oil spill response. The challenge in utilizing Lagrangian data lies in reconstructing velocity fields from ungridded observations, which often must be processed and interpolated onto a structured grid before analysis and scientific interpretation. Because particle trajectories are obtained by integrating the velocity, even minor errors in the forecast of Eulerian velocity tend to accumulate and grow [11]. Additionally, particle motion is often inherently chaotic, even in simple flows [12]. Thus, even a slight difference in initial conditions in space and time can result in significantly different trajectories, which complicates the prediction and reconstruction of the underlying velocity fields.
In oceanography, reconstructing Eulerian velocity fields from Lagrangian data is a complex challenge that has been addressed through various approaches. Among these, the classical methodologies are primarily divided into Lagrangian and pseudo-Lagrangian techniques. The Lagrangian approach directly utilizes the trajectories of drifting objects to infer fluid velocities [13]. The pseudo-Lagrangian method incorporates additional information, such as model predictions or historical data, to refine these inferences [14]. Several methodologies, including optimal interpolation [14,15], mode decomposition techniques [16], Kalman filtering [17,18,19], variational methods [20,21,22], and particle filter methods [23,24,25], have been implemented to bridge observed Lagrangian data with Eulerian velocity fields. These conventional approaches frequently require data preprocessing steps, such as filtering the observations, to mitigate scale and noise issues inherent in the data, in keeping with the models’ foundational mechanics and fluid dynamics equations. Gaussian Process Regression (GPR) falls within the family of optimal interpolation and Kalman filtering techniques, both of which provide posterior variances as estimates of reconstruction uncertainty. However, traditional implementations of these methods often require predefined assumptions about system dynamics, whereas GPR offers a flexible non-parametric approach that naturally incorporates multi-scale variability in oceanic flows.
Building on this advantage, GPR extends beyond deterministic interpolation by adopting a probabilistic framework that models velocity fields as distributions rather than point estimates. As a non-parametric probabilistic machine learning approach, GPR predicts the distribution of unknown functions based on observed sets of Lagrangian data points. This formulation enables not only interpolation but also the quantification of uncertainty, making it a powerful tool for geostatistics and forward propagation of uncertainty in numerical models [26,27,28,29]. In recent years, it has also been applied to reconstruct velocity fields from observed drifter data, providing statistics and visualization of the temporal evolution of submesoscale flow fields [2,30,31]. GPR offers significant advantages in the reconstruction of velocity fields, primarily because it accommodates the multi-scale nature of ocean dynamics through a multi-scale covariance function that characterizes velocity correlations across horizontal space and time [32,33]. This capability enables a more detailed reconstruction of velocity fields without the need for preliminary filtering of observations to adjust to the model’s scale. It is particularly suitable for areas where the underlying mechanics are not well understood and are difficult to capture with parametric models. GPR’s adaptability allows for its application in various contexts, from sequential implementation, as in Li et al. (2015) [34], and simultaneous implementation, as in Le Traon (1990) [35], to direct multi-scale analysis. Our specific deployment of GPR with dual length scales per dimension is intentionally designed to address the multi-scale challenges inherent in oceanic data. This approach, detailed further in our methodology, leverages the strengths of GPR in a novel manner, distinct from the conventional use of the technique.
Thus, GPR offers a significant advantage over classic approaches by providing a probabilistic framework that accommodates multi-scale ocean dynamics without the need for preliminary data filtering and offers confidence estimates for the reconstructed velocity fields, thereby enhancing both the accuracy and reliability of oceanographic analyses.
In previous studies, validation of GPR-produced results has primarily relied on cross-validation and comparisons with remote sensing data [36,37]. For instance, Gonçalves et al. [30] applied GPR to drifter data from the Lagrangian Submesoscale Experiment (LASER) and used cross-validation to evaluate its performance. Similarly, Lodise et al. [2] applied GPR to drifter data from LASER in the Gulf of Mexico and the Coherent Lagrangian Pathways from the Surface Ocean to the Interior (CALYPSO) experiment in the Mediterranean Sea, validating the results against Marine X-band Radar velocity measurements. While these studies demonstrated that GPR is effective for reconstructing surface velocity fields, they were limited in their evaluation methodology.
Cross-validation, a commonly used validation technique, assumes that the training and validation data are independent and identically distributed [38]. However, this foundational assumption is violated when using drifter data, as drifter trajectories are inherently correlated due to their interaction with the same underlying flow field. This interdependence undermines the reliability of cross-validation in providing a comprehensive performance assessment for GPR. Additionally, cross-validation neglects the spatial and temporal dependencies characteristic of oceanographic data, further limiting its capacity to rigorously evaluate GPR’s performance. Moreover, GPR has not been systematically tested against model datasets with reference velocity fields to assess its overall accuracy and the reliability of its error estimates.
To overcome these limitations, this study advances the validation framework by focusing on error quantification and grid-level reliability of GPR reconstructions. A key feature of this approach is the availability of reference solutions against which reconstructed velocity fields can be systematically evaluated. Recognizing the scarcity of high-quality, long-duration observational drifter datasets in the global ocean, we utilize two well-controlled model datasets as ground truth. These include an analytical non-divergent double-gyre model and the fully data-assimilating Navy Coastal Ocean Model (NCOM) [39]. These models provide high-resolution velocity fields while allowing control over spatial and temporal characteristics, enabling a rigorous investigation of the impact of spatial and temporal approximations on reconstruction accuracy and the data density required to achieve a given error level.
This study examines the errors of reconstructed velocity fields using GPR when a full reference velocity field is available and explores the relationship between training data density and prediction accuracy. A key novelty of our approach lies in GPR’s ability to accommodate the multi-scale nature of oceanic flow by learning these scales directly from data rather than imposing predefined assumptions. Two distinct test cases are employed to assess GPR’s performance: a non-divergent double-gyre model with time-periodic perturbations, which evaluates temporal error dynamics, and a simulated convergence region in the Gulf of Mexico using NCOM, which focuses on spatial error quantification. By integrating these test cases, this research provides critical insights into the strengths and limitations of GPR in reconstructing complex oceanographic velocity fields across multiple scales.
The structure of this paper is as follows: Section 2.1 provides a concise overview of the theory and methodology employed in this study. Section 2.2 introduces the non-divergent double-gyre model and the NCOM, which are used to conduct tests focused on the temporal and spatial dimensions, respectively. This study’s results and main findings are presented and discussed in Section 3 and Section 4, respectively.

2. Materials and Methods

2.1. Theory and Calculation

2.1.1. Gaussian Process Regression

Gaussian Process Regression is a probabilistic supervised machine learning framework widely used for regression and classification [40]. It leverages prior knowledge through kernels and provides uncertainty estimates alongside predictions. A Gaussian process is defined by its mean and covariance functions [27]. The mean function is also referred to as the prior mean, as it represents the best estimate before accounting for additional observations. To perform the reconstruction, we treat the zonal $u$ and meridional $v$ components of the particle velocities as separate, scalar quantities, with each velocity datum being associated with a unique coordinate in horizontal space and time, such that $(u_i^d, v_i^d) = [u(p_i^d), v(p_i^d)]$, with $p = p(x, y, t)$ and $i = 1, \ldots, n$, where the superscript $d$ refers to the observational data, $i$ is the index of the observation, and $n$ is the total number of observations.
In practical application with discrete data, velocity reconstruction requires expressing the Gaussian process formulation as a joint or multivariate Gaussian distribution. This involves defining the prior mean as a vector of velocities and the covariance function as a matrix of covariances, ensuring that the model captures spatial and temporal dependencies in the velocity field. The goal of GPR is to estimate $u^t$ at unobserved space-time target points using the observational data $u^d$:
$$
\begin{bmatrix} u^d \\ u^t \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} \bar{u}^d \\ \bar{u}^t \end{bmatrix}, \begin{bmatrix} K^{dd} & K^{dt} \\ (K^{dt})^T & K^{tt} \end{bmatrix} \right) \tag{1}
$$
where $\sim$ indicates that the velocity components at observed ($u^d$) and target ($u^t$) locations follow a joint normal distribution $\mathcal{N}$, fully described by a mean function and a covariance structure. Specifically, $\bar{u}^d$ and $\bar{u}^t$ represent the prior mean velocity estimates at observed and target points, respectively. $K^{dd}$ is the covariance matrix between observed velocity points, capturing spatial and temporal relationships in the training data. $K^{dt}$ is the covariance matrix between observed and target points, which defines how known observations influence predictions. $(K^{dt})^T$ is the transpose of $K^{dt}$, ensuring symmetry in the covariance structure, and $K^{tt}$ is the covariance matrix between pairs of target points, determining the correlation structure among unobserved locations.
$$
\begin{aligned}
\bar{u}^d_i &= \bar{u}^d(p_i), & K^{dd}_{i,j} &= K(p_i, p_j), & i, j &= 1, 2, \ldots, n, \\
\bar{u}^t_i &= \bar{u}^t(p_i), & K^{tt}_{i,j} &= K(p_i, p_j), & i, j &= 1, 2, \ldots, m, \\
& & K^{dt}_{i,j} &= K(p_i, p_j), & i &= 1, 2, \ldots, n, \; j = 1, 2, \ldots, m.
\end{aligned} \tag{2}
$$
With the prior mean vectors and covariance matrix entries written out as in Equation (2), conditioning the Gaussian prior distribution (Equation (1)) on the observations yields the Gaussian posterior distribution,
$$
\begin{aligned}
u^t \mid u^d &\sim \mathcal{N}(\tilde{u}^t, Q), \\
\tilde{u}^t &= \bar{u}^t + (K^{dt})^T (K^{dd})^{-1} (u^d - \bar{u}^d), \\
Q &= K^{tt} - (K^{dt})^T (K^{dd})^{-1} K^{dt},
\end{aligned} \tag{3}
$$
where $\tilde{u}^t$ is the posterior mean and $Q$ is the posterior covariance matrix. Equation (3) gives the predictive equations for GPR. We take the prior means $\bar{u}^d$ and $\bar{u}^t$ to be zero so that all the information about the velocity field is inferred from the regression. The covariance matrices must be positive semidefinite [41] and should ideally reflect the real space-time correlations of the predicted fields.
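As a minimal sketch of the predictive equations (Equation (3)), assuming a zero prior mean, a single squared-exponential kernel in one dimension, and a small diagonal term added to the data covariance for numerical stability, the posterior mean and the predicted error could be computed as follows. The names `sq_exp_kernel` and `gpr_posterior`, the kernel parameters, and the toy data are illustrative assumptions, not the implementation used in this study.

```python
import numpy as np

def sq_exp_kernel(a, b, sigma=1.0, r=1.0):
    """Squared-exponential covariance between 1-D point sets a (n,) and b (m,)."""
    d = a[:, None] - b[None, :]
    return sigma**2 * np.exp(-d**2 / (2.0 * r**2))

def gpr_posterior(p_d, u_d, p_t, noise_var=1e-6, **kern):
    """Posterior mean and covariance at targets p_t, assuming a zero prior mean."""
    # noise_var is an assumed stabilizing term, not a value from the study.
    K_dd = sq_exp_kernel(p_d, p_d, **kern) + noise_var * np.eye(p_d.size)
    K_dt = sq_exp_kernel(p_d, p_t, **kern)          # shape (n, m)
    K_tt = sq_exp_kernel(p_t, p_t, **kern)
    # Use a Cholesky factor rather than forming the inverse explicitly.
    L = np.linalg.cholesky(K_dd)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, u_d))
    V = np.linalg.solve(L, K_dt)
    mean = K_dt.T @ alpha                            # posterior mean
    Q = K_tt - V.T @ V                               # posterior covariance
    return mean, Q

# Toy data: a sine wave observed at 12 points, predicted at 50 targets.
p_d = np.linspace(0.0, 2.0 * np.pi, 12)
u_d = np.sin(p_d)
p_t = np.linspace(0.0, 2.0 * np.pi, 50)
u_t, Q = gpr_posterior(p_d, u_d, p_t, r=1.0)

# Predicted error: square root of the diagonal of Q (clipped against round-off).
err_q = np.sqrt(np.clip(np.diag(Q), 0.0, None))
```

The Cholesky solve is a common design choice for this kind of problem because it avoids an explicit matrix inverse; any other stable linear solver would serve the same purpose in this sketch.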
Our covariance function, $k$, is the sum of two squared exponential functions,
$$
k(p, p') = \sum_{i=1}^{M} \sigma_i^2 \exp\left[ -\frac{(t - t')^2}{2 r_{t_i}^2} - \frac{(y - y')^2}{2 r_{y_i}^2} - \frac{(x - x')^2}{2 r_{x_i}^2} \right] \tag{4}
$$
where $r_{t_i}$, $r_{y_i}$, and $r_{x_i}$ are the correlation time scale, meridional length scale, and zonal length scale of the $i$-th scale; $\sigma_i^2$ is the signal variance; and $M$ is the number of scales per dimension. In the double-gyre case, the two gyres represent the larger scale, while the smaller perturbations correspond to the smaller scale. Similarly, in the NCOM case, we assume the presence of a larger mesoscale and a smaller submesoscale. Therefore, in both the double-gyre and NCOM test cases, the number of scales per dimension, $M$, is set to 2. These correlation scales, together with the noise variance $\sigma_N^2$, can be assembled into a hyperparameter vector $\theta = (\sigma_1^2, r_{t_1}, r_{x_1}, r_{y_1}, \sigma_2^2, r_{t_2}, r_{x_2}, r_{y_2}, \sigma_N^2)$.
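The multi-scale covariance function described above can be sketched in a few lines. This is an illustration under stated assumptions: the function name `multiscale_kernel`, the (x, y, t) column ordering, and all numerical hyperparameter values are hypothetical and chosen only to show the structure of a sum of $M = 2$ squared-exponential terms.

```python
import numpy as np

def multiscale_kernel(p, q, sigma2, r_t, r_y, r_x):
    """Sum of M squared-exponential kernels over (x, y, t).

    p: (n, 3) and q: (m, 3) arrays with columns (x, y, t).
    sigma2, r_t, r_y, r_x: length-M sequences, one entry per scale.
    """
    dx2 = (p[:, None, 0] - q[None, :, 0]) ** 2
    dy2 = (p[:, None, 1] - q[None, :, 1]) ** 2
    dt2 = (p[:, None, 2] - q[None, :, 2]) ** 2
    K = np.zeros((p.shape[0], q.shape[0]))
    for s2, rt, ry, rx in zip(sigma2, r_t, r_y, r_x):
        K += s2 * np.exp(-dt2 / (2 * rt**2) - dy2 / (2 * ry**2) - dx2 / (2 * rx**2))
    return K

# Illustrative values only: one large scale and one small scale.
rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 1.0, size=(6, 3))
K = multiscale_kernel(pts, pts,
                      sigma2=[1.0, 0.1],
                      r_t=[2.0, 0.2], r_y=[1.0, 0.1], r_x=[1.0, 0.1])
```

Because each term is a valid covariance function, their sum is also symmetric and positive semidefinite, which is the property the covariance matrices are required to have.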

2.1.2. Optimization of Hyperparameters

To find the hyperparameters $(\sigma_1^2, r_{t_1}, r_{x_1}, r_{y_1}, \sigma_2^2, r_{t_2}, r_{x_2}, r_{y_2}, \sigma_N^2)$ for each velocity component, we optimize the hyperparameters of the covariance function $k(\theta)$ with the marginal likelihood approach [27],
$$
\log p(u^d \mid \theta) = -\frac{1}{2} (u^d)^T B^{-1} u^d - \frac{1}{2} \log |B| - \frac{n}{2} \log 2\pi \tag{5}
$$
where $B = K^{dd} + \sigma_N^2 I$ and $n$ is the number of observations. To maximize the log marginal likelihood, we take its partial derivatives with respect to each hyperparameter $\theta_j \in \theta$,
$$
\frac{\partial}{\partial \theta_j} \log p(u^d \mid \theta) = \frac{1}{2} (u^d)^T B^{-1} \frac{\partial B}{\partial \theta_j} B^{-1} u^d - \frac{1}{2} \mathrm{Tr}\left( B^{-1} \frac{\partial B}{\partial \theta_j} \right) = \frac{1}{2} \mathrm{Tr}\left( \left( \beta \beta^T - B^{-1} \right) \frac{\partial B}{\partial \theta_j} \right) \tag{6}
$$
where $\beta = B^{-1} u^d$ and $\mathrm{Tr}$ is the trace of a square matrix. We then use the limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS) gradient-based optimization algorithm [42] to find the values of $\theta$ that maximize the log marginal likelihood.
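The log marginal likelihood and its analytic gradient can be illustrated with a small sketch. It assumes a simplified one-dimensional, single-scale kernel with hyperparameters `(sigma2, r, noise_var)` rather than the full multi-scale vector; the function name and toy data are hypothetical. A central finite-difference check is included as an independent test of the trace formula.

```python
import numpy as np

def log_marginal_likelihood(theta, x, u):
    """Return log p(u | theta) and its gradient for theta = (sigma2, r, noise_var)."""
    sigma2, r, noise_var = theta
    n = x.size
    d2 = (x[:, None] - x[None, :]) ** 2
    E = np.exp(-d2 / (2.0 * r**2))
    B = sigma2 * E + noise_var * np.eye(n)
    B_inv = np.linalg.inv(B)               # acceptable for a small illustrative n
    beta = B_inv @ u
    _, logdet = np.linalg.slogdet(B)
    logp = -0.5 * u @ beta - 0.5 * logdet - 0.5 * n * np.log(2.0 * np.pi)
    # Partial derivatives of B with respect to sigma2, r, and noise_var.
    dB = [E, sigma2 * E * d2 / r**3, np.eye(n)]
    A = np.outer(beta, beta) - B_inv
    grad = np.array([0.5 * np.trace(A @ dBj) for dBj in dB])
    return logp, grad

rng = np.random.default_rng(1)
x = np.linspace(0.0, 5.0, 20)
u = np.sin(x) + 0.05 * rng.standard_normal(x.size)
theta = np.array([1.0, 1.0, 0.01])
logp, grad = log_marginal_likelihood(theta, x, u)

# Central finite differences as an independent check of the analytic gradient.
eps = 1e-6
fd = np.array([
    (log_marginal_likelihood(theta + eps * e, x, u)[0]
     - log_marginal_likelihood(theta - eps * e, x, u)[0]) / (2 * eps)
    for e in np.eye(3)
])
```

In practice, the negated value and gradient would be handed to an L-BFGS routine, for example `scipy.optimize.minimize` with `method="L-BFGS-B"` and `jac=True`; that choice of library is an assumption here, since the optimizer implementation is not specified beyond the algorithm name.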

2.1.3. Error Estimation and Density Calculation

We assess GPR performance using multiple error metrics, including deviation and relative error, which quantify prediction accuracy at each grid point. The predicted error (Equation (7)) is calculated as the square root of the diagonal of the posterior covariance matrix $Q$, providing a probabilistic estimate of the uncertainty in the GPR-inferred velocity components [30]. This serves as a key diagnostic for identifying regions with high uncertainty in predictions. Furthermore, the GPR model’s overall performance is evaluated with several key metrics: the Coefficient of Determination ($R^2$), Mean Bias Error (MBE, Equation (8)), Root Mean Square Error (RMSE, Equation (9)), Mean Absolute Error (MAE, Equation (10)), Model Efficiency (EF, Equation (11)), and Willmott’s D (Equation (12)).
$$
Err_Q = \sqrt{\mathrm{diag}(Q)} \tag{7}
$$
$$
MBE = N^{-1} \sum_{i=1}^{N} (P_i - O_i) \tag{8}
$$
$$
RMSE = \sqrt{N^{-1} \sum_{i=1}^{N} (P_i - O_i)^2} \tag{9}
$$
$$
MAE = N^{-1} \sum_{i=1}^{N} |P_i - O_i| \tag{10}
$$
$$
EF = 1 - \frac{\sum_{i=1}^{N} (P_i - O_i)^2}{\sum_{i=1}^{N} (\bar{O} - O_i)^2} \tag{11}
$$
$$
D = 1 - \frac{\sum_{i=1}^{N} (P_i - O_i)^2}{\sum_{i=1}^{N} \left( |P_i - \bar{O}| + |O_i - \bar{O}| \right)^2} \tag{12}
$$
where $N$ is the total number of observations, $O$ denotes the observed values (in this study, the reference values obtained from the established models) with mean $\bar{O}$, and $P$ denotes the values predicted by GPR.
The Coefficient of Determination ($R^2$) serves as an initial measure of model reliability, quantifying the relationship between actual and predicted data. Meanwhile, Mean Bias Error (MBE) captures systematic biases by averaging the differences between predictions and observations. Although MBE is not a norm, it provides valuable insights into consistent over- or underestimation by the model. Together, these metrics offer a preliminary understanding of the model’s validity.
The relationship between $R^2$ and overall model performance is not well-defined, as highlighted by Willmott [43]. Therefore, metrics like Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are employed for a more comprehensive assessment of performance. RMSE places greater emphasis on larger errors, making it sensitive to significant deviations and useful for identifying problematic regions. In contrast, MAE treats all errors equally and is less influenced by outliers, offering a more balanced perspective on model accuracy.
In addition to these measures, other accuracy assessments are applied to evaluate the GPR model. Model efficiency (EF), proposed by Greenwood et al. [44], compares the squared prediction errors with the squared deviations of the observations from their mean. An EF of 1 indicates a perfect match, an EF close to zero implies that the model predicts no better than the observed mean, and a negative EF implies that the observed mean is more reliable than the model’s predictions, revealing significant model limitations. Finally, a relative and bounded measure of model validity known as the Index of Agreement (D) [43] is employed. This metric scales with the magnitude of the variables, retains mean information, and does not excessively amplify the impact of outliers. It comprehensively assesses how well the model aligns with the observed data, considering the studied variables’ characteristics.
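The evaluation metrics above translate directly into code. The following is a minimal sketch, with the function name `evaluation_metrics` and the toy arrays as illustrative assumptions. Since no formula for $R^2$ is given, it is assumed here to be the squared Pearson correlation between predictions and reference values.

```python
import numpy as np

def evaluation_metrics(P, O):
    """MBE, RMSE, MAE, EF, Willmott's D, and R^2 for predictions P and references O."""
    P = np.asarray(P, dtype=float).ravel()
    O = np.asarray(O, dtype=float).ravel()
    diff = P - O
    O_bar = O.mean()
    sse = np.sum(diff**2)
    return {
        "MBE": diff.mean(),
        "RMSE": np.sqrt(np.mean(diff**2)),
        "MAE": np.mean(np.abs(diff)),
        "EF": 1.0 - sse / np.sum((O_bar - O) ** 2),
        "D": 1.0 - sse / np.sum((np.abs(P - O_bar) + np.abs(O - O_bar)) ** 2),
        # Assumption: R^2 taken as the squared Pearson correlation coefficient.
        "R2": np.corrcoef(P, O)[0, 1] ** 2,
    }

# Toy example: errors of alternating sign cancel in MBE but not in MAE or RMSE.
O = np.array([1.0, 2.0, 3.0, 4.0])
P = np.array([1.1, 1.9, 3.2, 3.8])
metrics = evaluation_metrics(P, O)
perfect = evaluation_metrics(O, O)
```

The toy example shows why MBE alone is not sufficient: positive and negative deviations cancel, so a near-zero MBE can coexist with nonzero MAE and RMSE.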

2.2. Datasets and Test Configuration

We used two cases in this study. The first case is a simple double-gyre model where the flow field is defined by a time-varying stream function; the focus here is on the influence of temporal variations on the GPR inference. The second case uses an NCOM-simulated flow field in the Gulf of Mexico that features a convergence region; this test focuses on the influence of complex spatial variations on the GPR inference. This test case has no time variation to ensure a ground truth velocity field with which to compare. The error quantification process was performed by comparing the difference between the Eulerian and GPR-reconstructed fields. Both deviations and relative errors were calculated to compare with the posterior distribution from GPR. Lagrangian particles required for training the GPR were sampled from simulated trajectories (Table 1). The overview of the error quantification process is shown in Figure 1.

2.2.1. Time-Periodic Double-Gyre Simulation

First, we consider a periodically driven double-gyre flow [45], which has frequently been used as a test bed for different tools for the numerical analysis of transport [46]. The flow is non-divergent, and consists of a pair of counter-rotating gyres with a time-periodic perturbation (Figure 2). This design aims to estimate the performance of GPR in three dimensions (x, y, t), especially in the time dimension. The stream function ψ is given by
$$
\psi(x, y, t) = \sin(x) \sin(y) + \epsilon \sin(x - \omega t) \sin(2y) \tag{13}
$$
where $\epsilon$ is the amplitude of the time-periodic asymmetry in the gyre. The flow domain is the rectangular region of non-dimensional size 6.4 × 3.2; we use the same parameter values as in Shadden et al. (2005) [45], i.e., $\epsilon = 0.1$ and $\omega = 2\pi/10$. Fifty Lagrangian particles were randomly seeded and advected in the domain for two full periods while sampling their trajectories every 1/50 of the gyre period ($\Delta t = 0.2$) (Figure 2). The random release minimizes the influence of the release locations on the GPR inference and avoids biasing the assessment of its performance. GPR is then employed to estimate the Eulerian velocity field on a uniformly spaced grid of 128 × 64. The inferred velocity field is then compared to the reference velocity field implied by the stream function, and the aforementioned error metrics are computed.
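The double-gyre sampling described above can be sketched as follows. Several choices are assumptions made for illustration: the sign convention $u = -\partial\psi/\partial y$, $v = \partial\psi/\partial x$, a domain starting at the origin, a fourth-order Runge-Kutta integrator (the advection scheme is only given in Table 1), and the random seed. The parameter values $\epsilon = 0.1$, $\omega = 2\pi/10$, 50 particles, and $\Delta t = 0.2$ over two periods are taken from the text.

```python
import numpy as np

EPS, OMEGA = 0.1, 2.0 * np.pi / 10.0    # values quoted in the text

def velocity(x, y, t):
    """Velocity from psi, assuming the convention u = -dpsi/dy, v = dpsi/dx."""
    u = -(np.sin(x) * np.cos(y) + 2.0 * EPS * np.sin(x - OMEGA * t) * np.cos(2.0 * y))
    v = np.cos(x) * np.sin(y) + EPS * np.cos(x - OMEGA * t) * np.sin(2.0 * y)
    return u, v

def advect_rk4(x, y, t, dt):
    """One fourth-order Runge-Kutta step (an assumed integrator)."""
    k1 = velocity(x, y, t)
    k2 = velocity(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], t + 0.5 * dt)
    k3 = velocity(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], t + 0.5 * dt)
    k4 = velocity(x + dt * k3[0], y + dt * k3[1], t + dt)
    x_new = x + dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
    y_new = y + dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
    return x_new, y_new

rng = np.random.default_rng(0)
n_particles, n_steps, dt = 50, 100, 0.2   # two periods of length 10, sampled every 0.2
x = rng.uniform(0.0, 6.4, n_particles)
y = rng.uniform(0.0, 3.2, n_particles)

samples = []
for k in range(n_steps):
    t = k * dt
    u, v = velocity(x, y, t)
    samples.append(np.column_stack([x, y, np.full(n_particles, t), u, v]))
    x, y = advect_rk4(x, y, t, dt)
samples = np.vstack(samples)              # columns: x, y, t, u, v
```

With these settings the training set contains 50 × 100 = 5000 velocity samples. Because the velocity is derived from a stream function, the field is non-divergent by construction regardless of the sign convention chosen.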

2.2.2. Navy Coastal Ocean Model (NCOM)—Convergence Region

The Navy Coastal Ocean Model (NCOM) is a realistic, complex, and data-assimilative oceanographic model, known for its ability to simulate a broad spectrum of oceanic dynamics [47], including the mesoscale convergence zones investigated in this study. These regions, characterized by dynamic mesoscale mixing and critical exchanges between oceanic layers, are captured by NCOM. We selected a notable convergence zone in the Gulf of Mexico that extends over a region of 135 km × 146 km and is resolved on a high-resolution 1 km grid (Figure 3). To evaluate the capability of GPR in reconstructing velocity fields, we employed a Lagrangian–Eulerian advection framework to analyze particle trajectories and assess the model’s performance (Table 1).
The NCOM fields are provided on a uniformly spaced grid with a resolution of 1 km and sampled every 15 min. For this analysis, the NCOM reference fields were treated as time-invariant by freezing their temporal variations to focus solely on their spatial characteristics. Using the Lagrangian–Eulerian advection algorithm outlined in Table 1, 200 particles were released randomly across the research region. Particle trajectories were sampled every 15 min over 24 time steps, corresponding to a 6 h observational period. GPR was then applied to the particle trajectories to infer the velocity field at the end of this 6 h period. The inferred velocity fields were subsequently compared with the reference velocity fields from NCOM to assess the accuracy of GPR. The particle velocity components, $u_i^d$ and $v_i^d$, were derived from the updated particle positions using the same interpolation method described in Table 1.
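The sampling procedure for a frozen gridded field can be sketched as below. This is only a stand-in under loud assumptions: the actual advection and interpolation scheme is the one in Table 1, whereas this sketch uses bilinear interpolation with forward-Euler stepping, and it replaces the NCOM field with a synthetic linear convergent flow. The function name `bilinear`, the field amplitude, and the random seed are hypothetical; the particle count, 15 min interval, 24 steps, and 1 km grid follow the text.

```python
import numpy as np

def bilinear(field, xg, yg, x, y):
    """Interpolate field[j, i], defined on uniform axes (yg, xg), to points (x, y)."""
    fx = np.clip((x - xg[0]) / (xg[1] - xg[0]), 0.0, xg.size - 1 - 1e-9)
    fy = np.clip((y - yg[0]) / (yg[1] - yg[0]), 0.0, yg.size - 1 - 1e-9)
    i, j = fx.astype(int), fy.astype(int)
    a, b = fx - i, fy - j
    return ((1 - a) * (1 - b) * field[j, i] + a * (1 - b) * field[j, i + 1]
            + (1 - a) * b * field[j + 1, i] + a * b * field[j + 1, i + 1])

# Synthetic frozen field on a 1 km grid (NOT NCOM data): flow converging on the center.
xg, yg = np.arange(0.0, 136.0), np.arange(0.0, 147.0)    # km
X, Y = np.meshgrid(xg, yg)
U = -0.01 * (X - xg.mean())                               # m/s
V = -0.01 * (Y - yg.mean())                               # m/s

rng = np.random.default_rng(0)
n_particles, n_steps, dt = 200, 24, 15 * 60.0             # dt in seconds
x = rng.uniform(xg[0], xg[-1], n_particles)
y = rng.uniform(yg[0], yg[-1], n_particles)

records = []
for k in range(n_steps):
    u = bilinear(U, xg, yg, x, y)
    v = bilinear(V, xg, yg, x, y)
    records.append(np.column_stack([x, y, np.full(n_particles, k * dt), u, v]))
    x = x + u * dt / 1000.0                               # forward Euler, m -> km
    y = y + v * dt / 1000.0
records = np.vstack(records)                              # columns: x, y, t, u, v
hours = n_steps * dt / 3600.0
```

With 24 samples at 15 min intervals, the record spans 6 h and contains 200 × 24 = 4800 velocity samples under this sketch's bookkeeping.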

3. Results

3.1. Double-Gyre Model

The double-gyre model provides an idealized framework for evaluating GPR’s ability to reconstruct velocity fields. The optimized hyperparameters for the model are presented in Table 2, reflecting the use of 5000 data points during the optimization process.
The noise levels for the zonal (u) and meridional (v) velocities were optimized to $\sigma_N \approx 1.18 \times 10^{-20}$ and $\sigma_N \approx 9.75 \times 10^{-5}$, respectively. The u component exhibits a larger spatial scale ($r_{x_1} = 0.647$; $r_{y_1} = 0.779$) with a signal standard deviation $\sigma_1 = 0.161$, alongside a smaller spatial scale ($r_{x_2} = 9.33 \times 10^{-3}$; $r_{y_2} = 5.46 \times 10^{-3}$) with a signal standard deviation $\sigma_2 = 6.34 \times 10^{-4}$. Similarly, the v component demonstrates a larger spatial scale ($r_{x_1} = 0.928$; $r_{y_1} = 1.01$) with $\sigma_1 = 0.152$, and a smaller spatial scale ($r_{x_2} = 0.0127$; $r_{y_2} = 0.93$) with $\sigma_2 = 1.96 \times 10^{-3}$. The larger spatial scales effectively capture the periodic Eulerian perturbations of the double-gyre system. In contrast, the smaller spatial scales exhibit sensitivity to regions near hyperbolic points, where velocities approach zero. This multiscale sensitivity underscores the strength of GPR in modeling complex dynamics, enabling the differentiation of mesoscale and localized features.

3.1.1. Performance Metrics and Temporal Trends

The performance of the GPR model was assessed using both accuracy metrics and error indicators over the 21 time steps for the zonal (u) and meridional (v) velocity components. Figure 4 summarizes these evaluation metrics, including the Model Efficiency ($EF$), Willmott’s Index of Agreement ($D$), and Coefficient of Determination ($R^2$), which assess the agreement between predicted and reference velocity fields. Additionally, error metrics such as Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Bias Error (MBE) quantify deviations between the predictions and the ground truth.
Key observations:
  • High predictive accuracy:
    Metrics such as $EF$, $D$, and $R^2$ consistently approach 1 across all time steps, indicating excellent agreement between predicted and reference velocities.
    Error metrics (RMSE, MAE, MBE) remain close to zero throughout the time series, underscoring the robustness of the GPR model in reconstructing velocity fields.
  • Temporal variations:
    Model performance peaks during the middle of the observational period (e.g., steps 9–14), when the GPR reconstruction benefits from both past and future observations. This results in lower errors and higher accuracy during this period.
    The GPR reconstruction exhibits slightly higher errors at the beginning and end of the time series, when only one-sided (either future or past) observations are available. Predictions at these steps primarily rely on data from one temporal direction, reducing the overall information available for inference.
  • Component-wise performance:
    The meridional velocity (v) reconstruction has consistently smaller errors than the zonal velocity (u), as reflected in both accuracy and error metrics.
    This disparity arises from the double-gyre dynamics, where perturbations are concentrated along the zonal direction (x-axis), leading to greater variability and complexity for u.
  • Model robustness:
    Despite minor temporal variations, the GPR model demonstrates consistent performance across all time steps, showcasing its ability to accurately reconstruct temporal velocity field dynamics.
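The effect of one-sided temporal information on the predicted error can be illustrated with a one-dimensional toy problem. The setup is an assumption made purely for illustration: observations are staggered halfway between 21 target time steps, with a unit-variance squared-exponential kernel and a correlation time scale of one step. Under these assumptions, the first and last targets have neighbors on one side only, and the posterior standard deviation is larger there than in the middle of the series.

```python
import numpy as np

def sq_exp(a, b, r=1.0):
    """Unit-variance squared-exponential kernel in time."""
    d = a[:, None] - b[None, :]
    return np.exp(-d**2 / (2.0 * r**2))

targets = np.arange(1.0, 22.0)     # 21 target time steps: 1, 2, ..., 21
obs = np.arange(1.5, 21.0)         # 20 observation times: 1.5, 2.5, ..., 20.5

K_dd = sq_exp(obs, obs) + 1e-6 * np.eye(obs.size)   # small assumed noise term
K_dt = sq_exp(obs, targets)
Q = sq_exp(targets, targets) - K_dt.T @ np.linalg.solve(K_dd, K_dt)

# Predicted error at each target step (clipped against round-off).
err_q = np.sqrt(np.clip(np.diag(Q), 0.0, None))
edge_to_middle = err_q[0] / err_q[10]
```

The posterior covariance depends only on where the observations are, not on their values, so this comparison isolates the sampling geometry from the flow itself.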

3.1.2. Velocity Field Comparisons

Figure 5 compares the velocity fields generated by the double-gyre model with those reconstructed by the GPR model at two representative time steps: the 9th and 14th, chosen from a total of 21 time steps. These specific time steps were selected as they exhibit the most pronounced tilts in opposing directions, posing a greater challenge for accurate reconstruction.
At time-step 9 (Figure 5, left), the velocity field shows a slight tilt to the right, a feature that is accurately captured by the GPR model. Conversely, at time-step 14 (Figure 5, right), the velocity field tilts slightly to the left, reflecting the periodic nature of the double-gyre dynamics. The GPR predictions successfully replicate these subtle variations, highlighting the model’s effectiveness in reconstructing dynamic features.
The close alignment between the GPR-reconstructed fields and the reference model velocity fields underscores the robustness of GPR in handling spatially and temporally varying velocity fields. Minor discrepancies, when present, are mainly localized in low-velocity regions, such as areas near hyperbolic points. These discrepancies are explored further in the error analysis subsection.

3.1.3. Error Analysis

Figure 6 presents the deviations, relative errors, and predicted errors ($Err_Q$) for the zonal (u) and meridional (v) velocity components at time-step 9. In most regions, the reconstructed velocities exhibit minimal deviations and relative errors, indicating that GPR effectively captures the velocity fields’ dynamics. However, exceptions arise in low-velocity regions, where errors are notably higher.
For both u and v, relative errors often exceed ±100% in areas where the velocity magnitude approaches zero. These errors are primarily caused by incorrect directional predictions by GPR, a common challenge in low-velocity conditions. Such errors are amplified by minor perturbations and observational uncertainties, including potential inaccuracies in GPS measurements and time lag during data acquisition.
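A short numerical example shows why relative errors grow so large where the velocity approaches zero. The definition used here, deviation divided by the reference value and expressed in percent, is an assumption, since the exact formula is not stated; the function name and the sample values are hypothetical.

```python
import numpy as np

def deviation_and_relative_error(pred, ref):
    """Deviation (pred - ref) and relative error in percent (assumed definition)."""
    dev = pred - ref
    rel = 100.0 * dev / ref        # undefined where ref == 0
    return dev, rel

ref = np.array([0.50, 0.001])      # strong flow vs. near-stagnant flow
pred = np.array([0.49, -0.001])    # second prediction has the wrong direction
dev, rel = deviation_and_relative_error(pred, ref)
```

In this example the near-stagnant point has the smaller absolute deviation, yet its relative error exceeds 100% in magnitude because the predicted direction is reversed and the reference value in the denominator is tiny.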
The predicted error ($Err_Q$) effectively highlights areas with high uncertainty. Regions with elevated $Err_Q$ values correlate well with areas of significant deviations and relative errors, particularly around hyperbolic points where velocity magnitudes are very small. Additionally, $Err_Q$ reflects areas of data voids, which are inherent in Lagrangian observations because particle trajectories are determined by the underlying flow field and cannot be controlled. This dual role of $Err_Q$, indicating both uncertainty and data sparsity, demonstrates its utility as a diagnostic tool for identifying regions where GPR predictions may require further refinement or where additional observational data could improve reconstruction accuracy.
Figure 7 provides scatter plots comparing GPR-predicted velocity values to the reference velocities from the double-gyre model at time-steps 9 and 14. These scatter plots highlight the correlation between predictions and actual values for both zonal (u) and meridional (v) velocities. Marker size and color represent the predicted error magnitude ($Err_Q$), with larger, brighter markers indicating higher errors.
Key observations:
  • Zonal velocity (u):
    Predictions for u exhibit slightly larger deviations from the y = x line compared to v, especially at time-step 14. This disparity can be attributed to disturbances along the x-axis, where the larger spatial range introduces additional complexity for the GPR model.
    Errors are more pronounced in regions with lower velocities, consistent with the trends identified in the relative error analysis.
  • Meridional velocity (v):
    Predictions for v display tighter clustering around the y = x line, reflecting higher accuracy compared to u.
    This improved performance is likely due to the smaller spatial range in the v direction, which allows the model to better capture the underlying dynamics.
  • Time-step variability:
    Predictions at time-step 9 are more accurate than those at time-step 14. This trend aligns with earlier findings (Figure 4), where middle time steps benefit from both future and past observations.
These scatter plots reinforce the findings from the relative error analysis, demonstrating the strengths of GPR in accurately capturing velocity components while highlighting challenges in regions with larger spatial ranges or low velocities.

3.1.4. Summary of Double-Gyre Model Results

Before turning to a more complex, realistic oceanic setting (the NCOM convergence region), we summarize GPR’s performance in reconstructing velocity fields in the idealized double-gyre model. A comprehensive evaluation spanning multiple performance metrics, error analysis, and component-wise assessments yields several key findings:
  • Overall performance: GPR demonstrated strong predictive accuracy, as evidenced by high values of metrics such as $EF$, $D$, and $R^2$, which consistently approached 1 across all time steps. Error metrics (RMSE, MAE, and MBE) remained close to zero, underscoring the robustness of the model in capturing the dynamics of the velocity fields.
  • Temporal trends: The GPR model performed best during the middle time steps (e.g., steps 9–14), leveraging information from both future and past observations. Slightly higher errors at the beginning and end of the time series (e.g., steps 1 and 21) reflect the limitations of one-sided information.
  • Component-wise insights: Predictions of the meridional velocity (v) were consistently more accurate than those of the zonal velocity (u), likely due to the perturbations in the zonal direction (x-axis) inherent to the double-gyre dynamics. This result aligns with the physical nature of the model, where the zonal direction exhibits higher variability and complexity.
  • Error distribution: The relative error analysis revealed that regions with low velocity magnitudes posed the greatest challenge for GPR, with relative errors often exceeding ±100%. This limitation was most prominent near hyperbolic points and in low-velocity regions. The scatter plots further emphasized these discrepancies, highlighting larger deviations in the u component compared to v.
  • Utility of predicted error (ErrQ): ErrQ proved to be a reliable diagnostic tool, correlating with regions of high uncertainty, significant deviations, and large relative errors. Beyond assessing prediction reliability, ErrQ also highlights areas of data sparsity, which is particularly useful for refining model design and optimizing observational strategies. This dual functionality makes ErrQ an essential component for improving GPR performance and identifying regions where additional data or adjustments are needed to enhance reconstruction accuracy.
  • Reconstruction of velocity fields: Visual comparisons of the GPR-reconstructed and model-generated velocity fields demonstrated strong alignment, with the GPR model accurately capturing subtle dynamic features such as periodic tilts and gyre structures. Minor discrepancies were primarily localized in regions with sparse data or low velocities.
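For readers who want to reproduce this kind of summary, the six metrics can be computed as in the sketch below. The definitions are assumptions, since they are not restated in this section: $EF$ is taken to be the Nash-Sutcliffe model efficiency, $D$ the Willmott index of agreement, and $R^2$ the squared Pearson correlation. The function name `skill_metrics` and the demo values are hypothetical.

```python
import numpy as np

def skill_metrics(ref, pred):
    """Return RMSE, MAE, MBE, EF, D and R2 for predictions against a reference.

    Assumed definitions: EF = Nash-Sutcliffe efficiency,
    D = Willmott index of agreement, R2 = squared Pearson correlation.
    """
    ref = np.asarray(ref, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = pred - ref
    ref_anom = ref - ref.mean()
    sse = np.sum(err**2)
    return {
        "RMSE": np.sqrt(np.mean(err**2)),
        "MAE": np.mean(np.abs(err)),
        "MBE": np.mean(err),
        "EF": 1.0 - sse / np.sum(ref_anom**2),
        "D": 1.0 - sse / np.sum((np.abs(pred - ref.mean()) + np.abs(ref_anom)) ** 2),
        "R2": np.corrcoef(ref, pred)[0, 1] ** 2,
    }

# Hypothetical velocities (m/s): a constant +0.1 m/s bias on every point.
ref_demo = np.array([0.1, 0.2, 0.3, 0.4])
pred_demo = ref_demo + 0.1
m = skill_metrics(ref_demo, pred_demo)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

The demo shows why several metrics are reported together: a constant bias leaves $R^2$ at 1 while $EF$ and MBE expose the offset.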

3.2. NCOM Convergence Region

The optimized hyperparameters for NCOM are shown in Table 3. A total of 4800 data points were used in this optimization. The optimized noise levels for the zonal velocity (u) and meridional velocity (v) were $\sigma_N = 1.33 \times 10^{-4}$ m s$^{-1}$ and $\sigma_N = 3.56 \times 10^{-4}$ m s$^{-1}$, respectively. Zonal velocity u presented a larger spatial scale ($r_{x1} = 23.7$ km; $r_{y1} = 12.1$ km) with a signal standard deviation $\sigma_1 = 6.34 \times 10^{-2}$ m s$^{-1}$, alongside a smaller spatial scale ($r_{x2} = 3.65$ km; $r_{y2} = 3.16$ km) with a signal standard deviation $\sigma_2 = 1.6 \times 10^{-3}$ m s$^{-1}$. Similarly, meridional velocity v presented a larger spatial scale ($r_{x1} = 16.2$ km; $r_{y1} = 43.9$ km) with $\sigma_1 = 0.312$ m s$^{-1}$ and a smaller spatial scale ($r_{x2} = 3.11$ km; $r_{y2} = 3.72$ km) with $\sigma_2 = 5.54 \times 10^{-3}$ m s$^{-1}$.
The larger spatial scales reflect mesoscale dynamics, while the smaller spatial scales capture submesoscale variability. The longer correlation times associated with smaller spatial scales highlight the model’s sensitivity to persistent localized features. Conversely, shorter correlation times for larger spatial scales indicate their relevance to broader, less temporally persistent patterns.
This hyperparameter configuration demonstrates the multi-scale nature of the NCOM convergence region, making it an excellent case for evaluating Gaussian Process Regression in reconstructing complex, ocean-like velocity fields. The next subsections will delve into the performance metrics, velocity field comparisons, and error analysis to assess the GPR model’s effectiveness in this challenging test case.
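To make the two-scale structure concrete, the sketch below builds a covariance as the sum of two anisotropic squared-exponential terms using the zonal-velocity values quoted above. Several things here are assumptions rather than the authors' implementation: the exact functional form (including the factor 1/2 in the exponent), the omission of the temporal dimension, and the reading of the extracted exponents as negative powers of ten. The function name `two_scale_kernel` is hypothetical.

```python
import numpy as np

def two_scale_kernel(xy_a, xy_b, sigma1, rx1, ry1, sigma2, rx2, ry2):
    """Sum of two anisotropic squared-exponential covariances.

    xy_a, xy_b: arrays of shape (n, 2) and (m, 2) with positions in km.
    sigma*: signal standard deviations (m/s); r*: length scales (km).
    The time dimension is deliberately left out of this sketch.
    """
    dx = xy_a[:, None, 0] - xy_b[None, :, 0]
    dy = xy_a[:, None, 1] - xy_b[None, :, 1]
    large = sigma1**2 * np.exp(-0.5 * ((dx / rx1) ** 2 + (dy / ry1) ** 2))
    small = sigma2**2 * np.exp(-0.5 * ((dx / rx2) ** 2 + (dy / ry2) ** 2))
    return large + small

# Zonal-velocity values from the text; negative exponents are an assumption.
u_params = dict(sigma1=6.34e-2, rx1=23.7, ry1=12.1,
                sigma2=1.6e-3, rx2=3.65, ry2=3.16)

# Reference point, then points 5 km east, 5 km north, and 30 km east of it.
pts = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0], [30.0, 0.0]])
K = two_scale_kernel(pts, pts, **u_params)
corr = K / K[0, 0]          # correlation relative to the reference point
print(np.round(corr[0], 3))
```

Because $r_{x1}$ exceeds $r_{y1}$ for u, a 5 km separation in x retains more correlation than the same separation in y under these assumptions.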

3.2.1. Performance Metrics and Sampling Density Insights

The analysis of the GPR model’s performance in the NCOM convergence region focuses first on the impact of sampling density on prediction accuracy. These investigations provide insights into the optimal conditions for GPR-based velocity field reconstruction and offer practical considerations for drifter deployment.
Sampling density effects.
Sampling density plays a pivotal role in determining the accuracy of GPR predictions. To evaluate its impact, two spatial domains were analyzed: a broader region (135 km × 146 km) and a more confined region (9 km × 6 km). The results, shown in Figure 8, reveal how increasing sampling density reduces relative error, but with diminishing returns as density surpasses certain thresholds.
  • Broader region (135 km × 146 km): In the broader domain, increasing the sampling density from 0.024 points/km$^2$ (20 drifters) to 0.24 points/km$^2$ (200 drifters) reduces the overall relative error from approximately 12.5% to 4.5%. However, the improvement becomes marginal beyond a density of 0.12 points/km$^2$. This suggests that while increasing sampling density improves prediction accuracy, practical considerations such as drifter deployment feasibility must guide the selection of density levels.
  • Confined region (9 km × 6 km): In the confined domain, higher densities achieve lower relative errors, approaching an optimal density of 7 points/km$^2$. Beyond this threshold, the performance improvement plateaus. This saturation may be attributed to inherent limitations in the GPR model’s complexity and the fidelity of the simulated drifter data. Identifying such thresholds is critical for balancing prediction accuracy with resource allocation.
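The densities quoted above appear to count observations over all time steps rather than drifters alone. The short check below makes that arithmetic explicit. It assumes 24 time steps per drifter, a value inferred from the 4800 training points reported for 200 drifters and not stated directly in this section; the function name `sampling_density` is hypothetical.

```python
def sampling_density(n_drifters, n_steps, width_km, height_km):
    """Observations per square kilometre, counting every time step."""
    return n_drifters * n_steps / (width_km * height_km)

N_STEPS = 24  # assumption: 4800 points / 200 drifters

d20 = sampling_density(20, N_STEPS, 135.0, 146.0)
d200 = sampling_density(200, N_STEPS, 135.0, 146.0)
print(f"20 drifters : {d20:.3f} points/km^2")
print(f"200 drifters: {d200:.3f} points/km^2")

# Number of observations implied by 7 points/km^2 in the 9 km x 6 km region.
points_confined = 7.0 * 9.0 * 6.0
print(f"confined region: {points_confined:.0f} points")
```

Under this assumption the computed densities match the quoted 0.024 and 0.24 points/km$^2$ to within rounding.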
Quantitative analysis of sampling density.
Table 4 and Table 5 summarize performance metrics for varying sampling densities. For both u and v, lower sampling densities (< 0.024 points/km$^2$) result in suboptimal performance, with $EF$ values below 0.85 and relatively high RMSE values. As density increases to 0.12 points/km$^2$ or higher, performance metrics improve significantly, with $EF$ values approaching 0.98 and RMSE values dropping below 0.04 m/s. However, further increases in density beyond 7 points/km$^2$ yield negligible improvements, reflecting the diminishing returns observed in Figure 8.
Implications for drifter deployment.
The analysis underscores the importance of optimizing drifter deployment to achieve accurate GPR predictions:
  • Trade-offs in broader regions: In larger domains, balancing prediction accuracy and practical constraints, such as cost and deployment logistics, is crucial. A sampling density of 0.12 points/km$^2$ appears to offer an effective balance.
  • Precision in confined regions: For smaller, high-priority areas, achieving densities close to 7 points/km$^2$ maximizes accuracy while acknowledging diminishing returns beyond this threshold.
The combined analysis of relative error and sampling density reveals the critical interplay between data availability and prediction accuracy. While GPR excels in capturing mesoscale and submesoscale dynamics in well-sampled regions, its performance declines in sparsely sampled or low-velocity areas. Future work should explore strategies to address these challenges, such as integrating physical constraints into the model or leveraging advanced optimization techniques for drifter deployment.

3.2.2. Velocity Field Comparisons

Figure 9 compares the NCOM-generated and GPR-reconstructed fields in both magnitude (upper panels) and direction (lower panels). The GPR predictions closely replicate the spatial structure and flow dynamics of the NCOM fields, with minor discrepancies observed in regions with sparse particle data, particularly in the lower-left corner.
The velocity directions show strong agreement across most of the domain, demonstrating GPR’s ability to capture complex mesoscale and submesoscale flow patterns. Slight directional deviations align with areas of low velocity and reduced data coverage, consistent with relative error analyses.
Overall, the results validate the robustness of GPR in reconstructing realistic ocean-like velocity fields, while highlighting the importance of sufficient data density for optimal performance.

3.2.3. Error Analysis

Figure 10 presents a detailed evaluation of the GPR-predicted velocities compared to the reference NCOM values, focusing on deviations and relative errors for the zonal (u) and meridional (v) components across the grid.
For the zonal velocity (u), Panels (a) and (c) reveal that deviations are generally low across most of the grid, with significant errors concentrated in the south-western region where data sparsity is most pronounced. The peak deviation in this region reaches approximately 0.2 m/s, but the relative error remains moderate at around 20%. This pattern suggests that the GPR model effectively reconstructs the larger-scale zonal flow dynamics, particularly in regions with sufficient observational density. The smoother gradients in the u component, associated with the predominant south-to-north flow direction, contribute to lower deviations and reduced relative errors compared to the meridional component.
In comparison, meridional velocity (v) exhibits larger deviations and relative errors, as shown in Panels (b) and (d). Errors are particularly evident in the same lower-left corner, where data sparsity leads to uncertainties in reconstruction. Relative errors for v exceed ±100% in low-velocity regions, where directional predictions become unreliable. The higher variability in the meridional component can be attributed to the presence of localized features and steeper velocity gradients, which are more challenging for the GPR model to capture accurately.
Panels (e) and (f) display the predicted error (ErrQ) for both components, providing insight into the model’s confidence across the grid. High predicted errors align closely with regions of elevated deviations and relative errors, particularly in low-velocity areas and regions with sparse particle observations. This correlation confirms the utility of ErrQ as an indicator of prediction uncertainty, allowing for targeted identification of problematic areas in the velocity field reconstruction.
Summary: The analysis highlights several key observations:
  • The zonal velocity (u) generally exhibits lower deviations and relative errors due to smoother flow patterns and fewer localized variations.
  • The meridional velocity (v) shows higher errors, reflecting the challenges of capturing complex gradients and low-velocity regions.
  • Data sparsity significantly influences error distribution, with both components experiencing larger deviations where particle coverage is limited.
  • The predicted error (ErrQ) effectively identifies areas of high uncertainty, providing a valuable diagnostic tool for assessing the reliability of the GPR predictions.
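One simple way to quantify how well a predicted error tracks the actual deviations is sketched below. It is not the procedure used in this study: it assumes ErrQ can be treated as a posterior standard deviation, and it runs on synthetic numbers generated so that the deviations are consistent with the predicted spread. The function name `uncertainty_check` is hypothetical.

```python
import numpy as np

def uncertainty_check(deviation, predicted_std):
    """Compare actual deviations with a predicted standard deviation.

    Returns the fraction of points with |deviation| <= 2 * predicted_std
    and the Pearson correlation between |deviation| and predicted_std.
    """
    abs_dev = np.abs(np.asarray(deviation, dtype=float))
    predicted_std = np.asarray(predicted_std, dtype=float)
    coverage = np.mean(abs_dev <= 2.0 * predicted_std)
    corr = np.corrcoef(abs_dev, predicted_std)[0, 1]
    return coverage, corr

# Synthetic data only: predicted spreads between 0.01 and 0.2 m/s, and
# deviations drawn from zero-mean Gaussians with exactly those spreads.
rng = np.random.default_rng(0)
predicted_std = rng.uniform(0.01, 0.2, size=5000)
deviation = rng.normal(0.0, predicted_std)

coverage, corr = uncertainty_check(deviation, predicted_std)
print(f"fraction within 2 sigma: {coverage:.3f}")
print(f"corr(|deviation|, predicted std): {corr:.3f}")
```

For a well-calibrated Gaussian error estimate, the two-sigma coverage should sit near 95%; values far below that would indicate an overconfident predicted error.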
The scatter plots in Figure 11 further illustrate the relationship between GPR-predicted and reference NCOM velocities for both the zonal (u) and meridional (v) components. Most data points closely cluster around the y = x line, indicating strong agreement between the GPR predictions and NCOM values. However, deviations become more pronounced for low-velocity regions, as reflected by the larger and brighter markers corresponding to higher predicted error (ErrQ).
For the zonal velocity (u), predictions remain relatively accurate, with deviations primarily confined to regions of sparse data coverage or minimal velocity magnitudes. In contrast, the meridional velocity (v) exhibits slightly larger deviations, particularly at lower velocities. This increased variability aligns with the observed higher errors in v, reinforcing the challenges in reconstructing complex gradients and localized features.
The predicted error (ErrQ) effectively highlights problematic regions in both u and v, where deviations from the y = x line are most prominent. This reinforces ErrQ as a valuable tool for quantifying uncertainty and identifying areas where prediction accuracy may be compromised.
Relative error and velocity magnitude. Figure 12 illustrates the relationship between relative error and velocity magnitude for both the actual and predicted velocities. A clear trend emerges: as velocity magnitude decreases, the relative error increases significantly. This effect is particularly pronounced below 0.1 m/s, where relative errors frequently exceed 100%. Such high errors indicate the model’s reduced capability to accurately predict both velocity magnitude and direction in low-velocity regions.
The yellow and red lines in Figure 12 represent 10% and 100% error thresholds, respectively. Below the 0.1 m/s velocity threshold, GPR predictions often deviate significantly, resulting in unreliable estimates. This is likely due to the inherent challenges in distinguishing noise from signal in regions with minimal velocity variation. Such errors are further amplified by observational uncertainties, including inaccuracies in simulated drifter trajectories and low sampling densities in these regions.
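Part of this trend is purely arithmetic: a fixed absolute deviation becomes a large relative error once the reference velocity is small. The sketch below illustrates this with hypothetical numbers. The relative error definition used here (signed difference divided by the reference magnitude) is an assumption, and the 0.03 m/s deviation is chosen only for illustration.

```python
import numpy as np

def relative_error_percent(ref, pred):
    """Signed relative error in percent, normalized by |ref| (assumed definition)."""
    ref = np.asarray(ref, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return 100.0 * (pred - ref) / np.abs(ref)

# Hypothetical reference speeds (m/s) with the same absolute deviation on each.
ref = np.array([0.02, 0.05, 0.1, 0.3, 0.6])
pred = ref + 0.03
rel = relative_error_percent(ref, pred)

for r, e in zip(ref, rel):
    print(f"reference {r:.2f} m/s -> relative error {e:.0f}%")
```

With these illustrative values, the same 0.03 m/s deviation crosses the 100% threshold only at the slowest reference speed, while staying at or below 10% for the two fastest.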
The error analysis reveals several key findings:
  • Zonal velocity (u): deviations and relative errors are lower, benefiting from smoother gradients and large-scale flow dynamics.
  • Meridional velocity (v): errors are larger, particularly in regions with steep gradients and low velocities, due to the complexity of localized features.
  • Data sparsity: errors are concentrated in regions with limited particle observations, underscoring the importance of adequate sampling density.
  • Predicted error (ErrQ): ErrQ effectively identifies regions of high uncertainty, aligning well with observed deviations and relative errors.
  • Low-velocity regions: relative errors significantly increase below 0.1 m/s, where the model struggles to predict both magnitude and direction accurately.
These insights emphasize the importance of sufficient data coverage and highlight areas for future improvements, such as integrating physical constraints or enhancing prediction methods for low-velocity regions.

3.2.4. Summary of NCOM Model Results

The evaluation of GPR performance in the NCOM convergence region underscores its capability to reconstruct ocean-like velocity fields while highlighting areas for improvement. Key findings include:
  • Hyperparameter configuration: The optimized hyperparameters reflect the multi-scale dynamics of the region. Larger spatial scales capture mesoscale patterns, while smaller scales highlight submesoscale variability. Correlation times reinforce the persistence of localized features and the transient nature of broader flow patterns, showcasing the adaptability of GPR in resolving multi-scale phenomena.
  • Performance metrics and sampling density:
    Sampling density plays a pivotal role in determining prediction accuracy.
    In the broader region (135 km × 146 km), increasing sampling density from 0.024 points/km$^2$ to 0.12 points/km$^2$ significantly reduces relative errors. However, improvements plateau beyond this threshold, emphasizing the need for practical deployment strategies.
    In the confined region (9 km × 6 km), optimal performance is achieved at approximately 7 points/km$^2$, where diminishing returns become evident.
  • Velocity field comparisons: GPR successfully reconstructs both the magnitude and direction of the NCOM velocity fields (Figure 9). Minor discrepancies are observed in sparsely sampled areas, particularly near boundaries. The model’s robustness in replicating mesoscale and submesoscale flow structures is evident across the domain.
  • Error analysis:
    Grid-based analysis reveals elevated deviations and relative errors in regions with sparse observations (Figure 10).
    The zonal velocity (u) exhibits lower errors due to smoother gradients, while the meridional velocity (v) shows higher deviations in areas of steep velocity gradients and low data density.
    Predicted error (ErrQ) effectively correlates with regions of high uncertainty, offering a reliable tool for assessing prediction reliability.
  • Scatter plot insights: Scatter plots (Figure 11) highlight strong agreement between GPR predictions and reference values at higher velocities. However, deviations increase in low-velocity regions, where larger predicted error values (ErrQ) are observed. This trend underscores the challenges of maintaining accuracy in low-velocity scenarios.

4. Discussion

4.1. Implications for Velocity Field Reconstruction and Ocean Circulation Studies

The findings of this study demonstrate that GPR provides a robust framework for reconstructing velocity fields. Compared to traditional methods, such as optimal interpolation and Kalman filtering, its main advantage is built-in uncertainty quantification. However, its performance depends on data density, and unlike physics-based models, it does not incorporate explicit governing equations of ocean dynamics. The rigorous evaluation of GPR using a double-gyre model and the Navy Coastal Ocean Model (NCOM) highlights its ability to reconstruct both temporally and spatially varying velocity fields with high accuracy.
The comparison between the two test cases reveals important insights into the conditions under which GPR performs optimally. In the double-gyre model, which emphasizes temporal dynamics, GPR leverages information from both past and future observations, resulting in more accurate predictions in intermediate observational periods and slightly higher errors when relying on one-sided observations. In the NCOM convergence region, which focuses on spatial reconstruction, the accuracy of GPR is highly dependent on the sampling density and spatial variability of the flow, as also noted by Kamath et al. [48]. These findings suggest that GPR is particularly effective in capturing velocity structures when sufficient training data are available, and that its performance degrades in sparsely sampled or low-velocity regions.
Beyond accuracy assessments, the generalizability of our findings to real-world oceanographic applications is a key aspect of this study. One major advantage of GPR is its ability to incorporate data-driven uncertainty estimates, making it highly relevant for applications where direct measurements are sparse or intermittent, such as remote sensing- [49,50,51] and drifter-based [2,31,52,53,54] velocity reconstructions. Unlike traditional interpolation methods, which do not provide confidence estimates, GPR allows researchers to identify regions where predictions are less reliable, guiding adaptive sampling strategies for future observational campaigns.
The results also have direct implications for ocean circulation studies, particularly in improving Lagrangian trajectory modeling and transport analysis. By providing statistically optimal velocity estimates with quantified uncertainties, GPR can improve estimates of particle transport pathways, which are critical for understanding submesoscale mixing, material transport, and pollutant dispersion [55,56,57]. This is particularly important for studies of ocean tracers, biological connectivity, and climate-related transport processes, where errors in velocity field reconstructions can significantly impact trajectory simulations and predictive modeling [58,59,60,61,62].
Furthermore, the study highlights the limitations of cross-validation as a validation metric for Lagrangian velocity reconstructions, given that drifter trajectories are spatially and temporally correlated. The use of model-provided reference velocity fields in this study overcomes these limitations and sets a precedent for future studies to adopt more rigorous validation methodologies. The findings reinforce that GPR-based reconstructions are most reliable when applied in regions with adequate observational density and suggest that hybrid approaches—integrating GPR with physics-informed constraints—could further enhance velocity field estimates in real-world settings.
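The cross-validation concern can be made tangible with a small bookkeeping exercise. In the sketch below, which uses hypothetical sizes (20 drifters, 24 time steps) and is not the validation procedure of this study, a random hold-out split is compared with a split that withholds whole drifters. It measures the fraction of test observations whose drifter also contributes to the training set.

```python
import numpy as np

rng = np.random.default_rng(1)
n_drifters, n_steps = 20, 24                 # hypothetical sizes
# One drifter label per observation: 0,0,...,0,1,1,...,19.
drifter_id = np.repeat(np.arange(n_drifters), n_steps)

# Random split: each observation is held out with probability 0.2.
test_mask = rng.random(drifter_id.size) < 0.2
shared_random = np.isin(drifter_id[test_mask], drifter_id[~test_mask]).mean()

# Grouped split: four whole drifters are held out together.
held_out = rng.choice(n_drifters, size=4, replace=False)
test_mask_g = np.isin(drifter_id, held_out)
shared_grouped = np.isin(drifter_id[test_mask_g], drifter_id[~test_mask_g]).mean()

print("random split, test points sharing a trajectory with training:", shared_random)
print("grouped split, test points sharing a trajectory with training:", shared_grouped)
```

Withholding whole trajectories removes same-drifter leakage, but nearby drifters sampling the same flow feature remain correlated, which is one reason a reference velocity field gives a cleaner assessment.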
Despite its strong performance in well-sampled regions, GPR has notable limitations when applied to highly turbulent flows where velocity gradients change abruptly [63,64]. Its reliance on smooth covariance functions, while effective for capturing mesoscale and submesoscale dynamics [54], often fails to adequately represent sharp discontinuities or abrupt changes in velocity fields, which are common in real-world ocean currents. Furthermore, the computational cost of GPR is relatively high, particularly for large datasets, due to the need to compute and invert large covariance matrices. This limitation poses challenges for scaling GPR to global or high-resolution regional oceanographic studies. Addressing these challenges will require innovations such as sparse approximation methods [65,66], adaptive sampling strategies [67], or hybrid approaches that integrate GPR with physics-based models to improve its applicability to highly dynamic and complex oceanic systems.
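A rough sense of this scaling can be obtained from standard dense linear algebra estimates. The sketch below assumes a dense covariance matrix stored in double precision and a Cholesky factorization costing about $n^3/3$ floating-point operations; it is a back-of-the-envelope estimate, not a measurement of the authors' code, and the function name `dense_gpr_cost` is hypothetical.

```python
def dense_gpr_cost(n_points, bytes_per_value=8):
    """Approximate memory (MB) and Cholesky flops for an n x n covariance matrix."""
    memory_mb = n_points**2 * bytes_per_value / 1e6
    cholesky_flops = n_points**3 / 3.0
    return memory_mb, cholesky_flops

# 4800 is the number of training points reported for the NCOM case.
for n in (4800, 9600, 48000):
    mem, flops = dense_gpr_cost(n)
    print(f"n = {n:6d}: {mem:10.1f} MB, {flops:.2e} flops")
```

Doubling the number of observations quadruples the memory and multiplies the factorization cost by eight, which is why sparse approximations become attractive for larger domains.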

4.2. Future Directions and Enhancements

While this study provides a rigorous assessment of GPR’s strengths and limitations, future work should explore strategies to improve GPR performance in low-velocity and sparsely sampled regions. Potential enhancements include incorporating adaptive covariance structures, hybrid machine learning approaches that integrate physical constraints, and multi-scale GPR frameworks that account for both global and local velocity variations. Additionally, applying GPR to real-world oceanographic datasets (e.g., satellite-derived velocities, long-term drifter deployments) will further validate its effectiveness and generalizability to operational ocean monitoring and forecasting.

5. Conclusions

This study evaluates Gaussian Process Regression (GPR) for reconstructing Eulerian velocity fields using Lagrangian drifter observations, focusing on its accuracy, error estimates, and performance across temporal and spatial dimensions. By evaluating GPR on both the double-gyre model and the NCOM convergence region, we demonstrate its ability to reconstruct complex velocity fields with high accuracy. Our findings highlight GPR’s effectiveness in well-sampled regions while revealing its limitations in low-velocity or sparsely observed areas.
  • Performance and accuracy: GPR demonstrates high accuracy in reconstructing ocean-like velocity fields, achieving overall accuracy levels exceeding 90%. This makes it a reliable tool for capturing both mesoscale and submesoscale dynamics in surface ocean studies. Furthermore, GPR’s posterior covariance matrix serves as a robust predictor of interpolation uncertainty, offering valuable insights into the reliability of reconstructed fields.
  • Sampling density: Sampling density significantly influences prediction accuracy, with lower densities leading to greater errors, particularly in regions with sparse observational coverage. Our results identify an optimal sampling density of approximately seven data points per km$^2$ per time-step, beyond which improvements plateau. This finding underscores the importance of strategic drifter deployment to balance accuracy and resource efficiency in data collection efforts.
  • Velocity magnitude: GPR performs more reliably in faster-flowing regions, while accuracy diminishes in low-velocity areas (below 0.1 m/s). These errors, often exceeding 100% relative error, highlight challenges in accurately reconstructing directional flows in regions with minimal velocity variation. This limitation stems from GPR’s sensitivity to hyperparameter optimization, which tends to prioritize dynamic segments of the flow.
  • Insights into temporal and spatial dependencies: The double-gyre model reveals GPR’s strength in leveraging temporal dependencies, achieving peak performance in intermediate time steps where both future and past observations are available. In contrast, the NCOM case highlights spatial dependencies, with errors concentrated in sparsely sampled or low-velocity regions. These results validate the necessity of evaluating GPR across both temporal and spatial dimensions to comprehensively understand its behavior.
Building on these key findings, this study addresses critical gaps in prior evaluations of GPR by moving beyond conventional validation methods such as cross-validation and remote sensing comparisons. While previous studies have demonstrated GPR’s effectiveness in reconstructing surface velocity fields, their reliance on cross-validation introduces significant limitations. Cross-validation assumes independent and identically distributed (i.i.d.) data, an assumption that is often violated in drifter datasets where trajectories are inherently correlated in space and time. This dependence among data points undermines the validity of cross-validation results, preventing a comprehensive assessment of GPR’s true accuracy and the reliability of its posterior covariance matrix as an error estimate.
To overcome these challenges, we conduct a rigorous evaluation using model datasets with known velocity fields. By leveraging ground truth velocity fields from the double-gyre model and NCOM, we provide a more robust framework for assessing GPR’s strengths and limitations. The results confirm GPR as a reliable tool for oceanographic applications, demonstrating its ability to reconstruct complex velocity fields with high accuracy while also offering valuable uncertainty estimates. These capabilities are particularly useful for studying ocean circulation patterns, including the transport of heat, salt, nutrients, and pollutants.
Additionally, the relationship between sampling density, velocity magnitude, and accuracy offers practical insights into optimizing observation strategies. Our findings highlight that while increasing sampling density improves prediction accuracy, the benefits diminish beyond an optimal threshold of approximately seven data points per km$^2$ per time-step. Moreover, GPR performs best in faster-flowing regions, whereas its accuracy declines in low-velocity areas (below 0.1 m/s), where directional uncertainties are amplified.
These insights suggest opportunities for further refinement of GPR-based velocity reconstruction methods. Future work should focus on integrating physical constraints, hybrid modeling approaches that combine machine learning with physics, and advanced regularization techniques. Such advances could improve both accuracy and interpretability in real-world ocean circulation studies, particularly in low-velocity and sparsely sampled regions.

6. Declaration of Generative AI and AI-Assisted Technologies in the Writing Process

During the preparation of this work, the authors used ChatGPT to improve readability and language. After using this tool, the authors reviewed and edited the content as needed and took full responsibility for the content of the publication.

Author Contributions

Conceptualization, J.X., M.I., R.C.G. and T.Ö.; Methodology, J.X., M.I., R.C.G. and T.Ö.; Software, J.X.; Validation, J.X., M.I., R.C.G. and T.Ö.; Formal analysis, J.X.; Resources, J.X., M.I., R.C.G. and T.Ö.; Data curation, J.X.; Writing—original draft, J.X.; Writing—review & editing, J.X., M.I., R.C.G. and T.Ö.; Visualization, J.X.; Supervision, T.Ö.; Funding acquisition, T.Ö. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Office of Naval Research under grant N00014-20-1-2023 (MURI ML-SCOPE) to RSMAES, the University of Miami, and the Massachusetts Institute of Technology.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The scripts used in this research are available on GitHub (accessed on 14 January 2025): https://github.com/JunfeiXia/GPR_NCOM.

Acknowledgments

We are grateful to the Office of Naval Research for support under grant N00014-20-1-2023 (MURI ML-SCOPE) to RSMAES, the University of Miami, and the Massachusetts Institute of Technology.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707.
  2. Lodise, J.; Özgökmen, T.; Gonçalves, R.C.; Iskandarani, M.; Lund, B.; Horstmann, J.; Poulain, P.M.; Klymak, J.; Ryan, E.H.; Guigand, C. Investigating the formation of submesoscale structures along mesoscale fronts and estimating kinematic quantities using Lagrangian drifters. Fluids 2020, 5, 159.
  3. Grossi, M.D.; Kubat, M.; Özgökmen, T.M. Predicting particle trajectories in oceanic flows using artificial neural networks. Ocean Model. 2020, 156, 101707.
  4. Gupta, A.; Lermusiaux, P.F.J. Neural closure models for dynamical systems. Proc. Math. Phys. Eng. Sci. 2021, 477.
  5. Hoteit, I.; Abualnaja, Y.; Afzal, S.; Ait-El-Fquih, B.; Akylas, T.; Antony, C.; Dawson, C.; Asfahani, K.; Brewin, R.J.; Cavaleri, L.; et al. Towards an end-to-end analysis and prediction system for weather, climate, and marine applications in the Red Sea. Bull. Am. Meteorol. Soc. 2021, 102, E99–E122.
  6. Mariano, A.J.; Griffa, A.; Özgökmen, T.M.; Zambianchi, E. Lagrangian analysis and predictability of coastal and ocean dynamics 2000. J. Atmos. Ocean. Technol. 2002, 19, 1114–1126.
  7. Poje, A.C.; Özgökmen, T.M.; Lipphardt, B.L., Jr.; Haus, B.K.; Ryan, E.H.; Haza, A.C.; Jacobs, G.A.; Reniers, A.J.H.M.; Olascoaga, M.J.; Novelli, G.; et al. Submesoscale dispersion in the vicinity of the Deepwater Horizon spill. Proc. Natl. Acad. Sci. USA 2014, 111, 12693–12698.
  8. Lumpkin, R.; Özgökmen, T.; Centurioni, L. Advances in the application of surface drifters. Ann. Rev. Mar. Sci. 2017, 9, 59–81.
  9. Novelli, G.; Guigand, C.M.; Cousin, C.; Ryan, E.H.; Laxague, N.J.M.; Dai, H.; Haus, B.K.; Özgökmen, T.M. A biodegradable surface drifter for ocean sampling on a massive scale. J. Atmos. Ocean. Technol. 2017, 34, 2509–2532.
  10. D’Asaro, E.A.; Shcherbina, A.Y.; Klymak, J.M.; Molemaker, J.; Novelli, G.; Guigand, C.M.; Özgökmen, T.M. Ocean convergence and the dispersion of flotsam. Proc. Natl. Acad. Sci. USA 2018, 115, 1162–1167.
  11. Griffa, A.; Piterbarg, L.I.; Özgökmen, T. Predictability of Lagrangian particle trajectories: Effects of smoothing of the underlying Eulerian flow. J. Mar. Res. 2004, 62, 1–35.
  12. Aref, H. Stirring by chaotic advection. J. Fluid Mech. 1984, 143, 1–21.
  13. Hernandez, F.; Le Traon, P.Y.; Morrow, R. Mapping mesoscale variability of the Azores Current using TOPEX/POSEIDON and ERS 1 altimetry, together with hydrographic and Lagrangian measurements. J. Geophys. Res. 1995, 100, 24995.
  14. Molcard, A. Assimilation of drifter observations for the reconstruction of the Eulerian circulation field. J. Geophys. Res. 2003, 108.
  15. Molcard, A.; Griffa, A.; Özgökmen, T.M. Lagrangian data assimilation in multilayer primitive equation ocean models. J. Atmos. Ocean. Technol. 2005, 22, 70–83.
  16. Toner, M.; Kirwan, A., Jr.; Kantha, L.; Choi, J. Can general circulation models be assessed and their output enhanced with drifter data? J. Geophys. Res. Ocean. 2001, 106, 19563–19579.
  17. Ide, K.; Kuznetsov, L.; Jones, C.K. Lagrangian data assimilation for point vortex systems. J. Turbul. 2002, 3.
  18. Kuznetsov, L.; Ide, K.; Jones, C.K.R.T. A method for assimilation of Lagrangian data. Mon. Weather Rev. 2003, 131, 2247–2260.
  19. Salman, H.; Kuznetsov, L.; Jones, C.K.R.T.; Ide, K. A method for assimilating Lagrangian data into a Shallow-Water-equation ocean model. Mon. Weather Rev. 2006, 134, 1081–1101.
  20. Kamachi, M.; O’Brien, J.J. Continuous data assimilation of drifting buoy trajectory into an equatorial Pacific Ocean model. J. Mar. Syst. 1995, 6, 159–178.
  21. Taillandier, V.; Griffa, A.; Molcard, A. A variational approach for the reconstruction of regional scale Eulerian velocity fields from Lagrangian data. Ocean Model. 2006, 13, 1–24.
  22. Taillandier, V.; Griffa, A. Implementation of position assimilation for ARGO floats in a realistic Mediterranean Sea OPA model and twin experiment testing. Ocean Sci. 2006, 2, 223–236.
  23. Salman, H. A hybrid grid/particle filter for Lagrangian data assimilation. II: Application to a model vortex flow. Q. J. R. Meteorol. Soc. 2008, 134, 1551–1565.
  24. Salman, H. A hybrid grid/particle filter for Lagrangian data assimilation. I: Formulating the passive scalar approximation. Q. J. R. Meteorol. Soc. 2008, 134, 1539–1550.
  25. Krause, P.; Restrepo, J.M. The diffusion kernel filter applied to Lagrangian data assimilation. Mon. Weather Rev. 2009, 137, 4386–4400.
  26. Kennedy, M.C.; O’Hagan, A. Predicting the output from a complex computer code when fast approximations are available. Biometrika 2000, 87, 1–13.
  27. Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; MIT Press: Cambridge, MA, USA, 2006.
  28. Thacker, W.C.; Iskandarani, M.; Gonçalves, R.C.; Srinivasan, A.; Knio, O.M. Pragmatic aspects of uncertainty propagation: A conceptual review. Ocean Model. 2015, 95, 25–36. [Google Scholar] [CrossRef]
  29. Iskandarani, M.; Wang, S.; Srinivasan, A.; Thacker, W.C.; Winokur, J.; Knio, O.M. An overview of uncertainty quantification techniques with application to oceanic and oilspill simulations. J. Geophys. Res. Ocean. 2016, 121, 2789–2808. [Google Scholar] [CrossRef]
  30. Gonçalves, R.C.; Iskandarani, M.; Özgökmen, T.; Thacker, W.C. Reconstruction of submesoscale velocity field from surface drifters. J. Phys. Oceanogr. 2019, 49, 941–958. [Google Scholar] [CrossRef]
  31. Berlinghieri, R.; Trippe, B.L.; Burt, D.R.; Giordano, R.; Srinivasan, K.; Özgökmen, T.; Xia, J.; Broderick, T. Gaussian processes at the Helm(holtz): A more fluid model for ocean currents. arXiv 2023, arXiv:2302.10364. [Google Scholar]
  32. Walder, C.; Kim, K.I.; Schölkopf, B. Sparse multiscale Gaussian process regression. In Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland, 5–9 July 2008; pp. 1112–1119. [Google Scholar]
  33. Stephenson, D.; Kermode, J.R.; Lockerby, D.A. Accelerating multiscale modelling of fluids with on-the-fly Gaussian process regression. Microfluid. Nanofluidics 2018, 22, 139. [Google Scholar] [CrossRef] [PubMed]
  34. Li, Z.; McWilliams, J.C.; Ide, K.; Farrara, J.D. A multiscale variational data assimilation scheme: Formulation and illustration. Mon. Weather Rev. 2015, 143, 3804–3822. [Google Scholar] [CrossRef]
  35. Le Traon, P.Y. A method for optimal analysis of fields with spatially variable mean. J. Geophys. Res. Ocean. 1990, 95, 13543–13547. [Google Scholar] [CrossRef]
  36. Rohani, A.; Taki, M.; Abdollahpour, M. A novel soft computing model (Gaussian process regression with K-fold cross validation) for daily and monthly solar radiation forecasting (Part: I). Renew. Energy 2018, 115, 411–422. [Google Scholar] [CrossRef]
  37. Martino, L.; Laparra, V.; Camps-Valls, G. Probabilistic cross-validation estimators for Gaussian process regression. In Proceedings of the 2017 25th European Signal Processing Conference (EUSIPCO), Kos, Greece, 28 August–2 September 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 823–827. [Google Scholar]
  38. Stone, M. Cross-validation: A review. Stat. A J. Theor. Appl. Stat. 1978, 9, 127–139. [Google Scholar]
  39. Jacobs, G.A.; Bartels, B.P.; Bogucki, D.J.; Beron-Vera, F.J.; Chen, S.S.; Coelho, E.F.; Curcic, M.; Griffa, A.; Gough, M.; Haus, B.K.; et al. Data assimilation considerations for improved ocean predictability during the Gulf of Mexico Grand Lagrangian Deployment (GLAD). Ocean Model. 2014, 83, 98–117. [Google Scholar] [CrossRef]
  40. Albert, C.G.; Rath, K. Gaussian process regression for data fulfilling linear differential equations with localized sources. Entropy 2020, 22, 152. [Google Scholar] [CrossRef]
  41. Bretherton, F.P.; Davis, R.E.; Fandry, C. A technique for objective analysis and design of oceanographic experiments applied to MODE-73. Deep. Sea Res. Oceanogr. Abstr. 1976, 23, 559–582. [Google Scholar] [CrossRef]
  42. Byrd, R.H.; Lu, P.; Nocedal, J.; Zhu, C. A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 1995, 16, 1190–1208. [Google Scholar] [CrossRef]
  43. Willmott, C.J. On the validation of models. Phys. Geogr. 1981, 2, 184–194. [Google Scholar] [CrossRef]
  44. Greenwood, D.; Neeteson, J.; Draycott, A. Response of potatoes to N fertilizer: Dynamic model. Plant Soil 1985, 85, 185–203. [Google Scholar] [CrossRef]
  45. Shadden, S.C.; Lekien, F.; Marsden, J.E. Definition and properties of Lagrangian coherent structures from finite-time Lyapunov exponents in two-dimensional aperiodic flows. Physica D 2005, 212, 271–304. [Google Scholar] [CrossRef]
  46. Froyland, G.; Padberg-Gehle, K. Almost-invariant and finite-time coherent sets: Directionality, duration, and diffusion. In Ergodic Theory, Open Dynamics, and Coherent Structures; Springer Proceedings in Mathematics & Statistics; Springer: New York, NY, USA, 2014; pp. 171–216. [Google Scholar]
  47. Martin, P.J. Description of the navy coastal ocean model version 1.0. NRL Rep. NRL/FR/7322-00 2000, 9962, 42. [Google Scholar]
  48. Kamath, A.; Vargas-Hernández, R.A.; Krems, R.V.; Carrington, T.; Manzhos, S. Neural networks vs Gaussian process regression for representing potential energy surfaces: A comparative study of fit quality and vibrational spectrum accuracy. J. Chem. Phys. 2018, 148, 241702. [Google Scholar] [CrossRef]
  49. Ma, Y.; He, Y.; Wang, L.; Zhang, J. Probabilistic reconstruction for spatiotemporal sensor data integrated with Gaussian process regression. Probabilistic Eng. Mech. 2022, 69, 103264. [Google Scholar] [CrossRef]
  50. Camps-Valls, G.; Martino, L.; Svendsen, D.H.; Campos-Taberner, M.; Muñoz-Marí, J.; Laparra, V.; Luengo, D.; García-Haro, F.J. Physics-aware Gaussian processes in remote sensing. Appl. Soft Comput. 2018, 68, 69–82. [Google Scholar] [CrossRef]
  51. Zhou, J.; Jia, L.; Menenti, M.; Gorte, B. On the performance of remote sensing time series reconstruction methods–A spatial comparison. Remote Sens. Environ. 2016, 187, 367–384. [Google Scholar] [CrossRef]
  52. Li, T.; Biferale, L.; Bonaccorso, F.; Buzzicotti, M.; Centurioni, L. Stochastic Reconstruction of Gappy Lagrangian Turbulent Signals by Conditional Diffusion Models. arXiv 2024, arXiv:2410.23971. [Google Scholar]
  53. Bang, C.; Altaher, A.S.; Zhuang, H.; Altaher, A.; Srinivasan, A.; Cherubin, L. Physics-Informed Neural Networks to Reconstruct Surface Velocity Field from Drifter Data. Authorea Preprints 2024.
  54. Aravind, H.; Özgökmen, T.M.; Allshouse, M.R. Lagrangian analysis of submesoscale flows from sparse data using Gaussian Process Regression for field reconstruction. Ocean Model. 2025, 193, 102458. [Google Scholar] [CrossRef]
  55. Molinari, R.; Kirwan, A.D. Calculations of differential kinematic properties from Lagrangian observations in the western Caribbean Sea. J. Phys. Oceanogr. 1975, 5, 483–491. [Google Scholar] [CrossRef]
  56. Berta, M.; Griffa, A.; Özgökmen, T.M.; Poje, A.C. Submesoscale evolution of surface drifter triads in the Gulf of Mexico. Geophys. Res. Lett. 2016, 43, 11751–11759. [Google Scholar] [CrossRef]
  57. Mao, M.; Xia, M. Modeling blue crab (Callinectes sapidus) larval transport and recruitment dynamics in a shallow lagoon-inlet-coastal ocean system. J. Geophys. Res. Ocean. 2024, 129, e2023JC020785. [Google Scholar] [CrossRef]
  58. Stohl, A.; Wotawa, G.; Seibert, P.; Kromp-Kolb, H. Interpolation errors in wind fields as a function of spatial and temporal resolution and their impact on different types of kinematic trajectories. J. Appl. Meteorol. Climatol. 1995, 34, 2149–2165. [Google Scholar] [CrossRef]
  59. Walmsley, J.L.; Mailhot, J. On the numerical accuracy of trajectory models for long-range transport of atmospheric pollutants. Atmosphere-Ocean 1983, 21, 14–39. [Google Scholar] [CrossRef]
  60. Srinivasan, A.; Chin, T.; Chassignet, E.; Iskandarani, M.; Groves, N. A statistical interpolation code for ocean analysis and forecasting. J. Atmos. Ocean. Technol. 2022, 39, 367–386. [Google Scholar] [CrossRef]
  61. Mao, M.; Xia, M. Particle dynamics in the nearshore of Lake Michigan revealed by an observation-modeling system. J. Geophys. Res. Ocean. 2020, 125, e2019JC015765. [Google Scholar] [CrossRef]
  62. Fitzenreiter, K.; Mao, M.; Xia, M. Characteristics of surface currents in a shallow lagoon–inlet–coastal ocean system revealed by surface drifter observations. Estuaries Coasts 2022, 45, 2327–2344. [Google Scholar] [CrossRef]
  63. Ho, A.; Citrin, J.; Auriemma, F.; Bourdelle, C.; Casson, F.J.; Kim, H.T.; Manas, P.; Szepesi, G.; Weisen, H.; JET Contributors. Application of Gaussian process regression to plasma turbulent transport model validation via integrated modelling. Nucl. Fusion 2019, 59, 056007. [Google Scholar] [CrossRef]
  64. Zhang, Z.J.; Duraisamy, K. Machine learning methods for data-driven turbulence modeling. In Proceedings of the 22nd AIAA Computational Fluid Dynamics Conference, Dallas, TX, USA, 22–26 June 2015; p. 2460. [Google Scholar]
  65. Quinonero-Candela, J.; Rasmussen, C.E. A unifying view of sparse approximate Gaussian process regression. J. Mach. Learn. Res. 2005, 6, 1939–1959. [Google Scholar]
  66. Smola, A.; Bartlett, P. Sparse greedy Gaussian process regression. Adv. Neural Inf. Process. Syst. 2000, 13. [Google Scholar]
  67. Mohammadi, H.; Challenor, P.; Williamson, D.; Goodfellow, M. Cross-Validation–based Adaptive Sampling for Gaussian Process Models. SIAM/ASA J. Uncertain. Quantif. 2022, 10, 294–316. [Google Scholar] [CrossRef]
Figure 1. Overview of the error quantification process for the double-gyre and NCOM models.
Figure 2. Double-gyre model velocity field and advected Lagrangian particle trajectories. The left panel shows the velocity field at the first time step, and the right panel shows the advected particle trajectories at the final time step.
Figure 3. NCOM velocity field (15 January 2016) in the Gulf of Mexico (top), with the convergence region marked by the red square. The bottom panel shows the velocity field and the advected Lagrangian particle trajectories (lines), with dots marking the final positions. Color in both panels indicates velocity in m/s.
Figure 4. Model evaluation metrics over each time step.
Figure 5. Comparison of model-simulated velocity fields (blue arrows) and GPR-reconstructed fields (red arrows) at time-step 9 (left) and time-step 14 (right).
Figure 6. Error estimation for the double-gyre model at time-step 9. Panels (a,b) show absolute deviations across the grid. Panels (c,d) show the relative error, ranging from −100% to 100%, with values outside this range indicating incorrect directional predictions. Panels (e,f) illustrate the predicted error ($ErrQ$), highlighting regions of high uncertainty due to data sparsity or complex velocity gradients.
Figure 7. Scatter plots comparing GPR-predicted velocities with actual velocities from the double-gyre model at time-step 9 and time-step 14. Marker color and size indicate the predicted error magnitude ($ErrQ$), with larger and brighter markers representing higher errors.
Figure 8. Relationship between overall relative error and sampling density. (Left): broader region (135 km × 146 km). (Right): confined region (9 km × 6 km).
Figure 9. Comparison of velocity fields from the NCOM and GPR-reconstructed fields. The upper panels show the velocity fields from both sources, while the lower panels illustrate the differences between them.
Figure 10. Error estimation for the NCOM velocity field. Panels (a,b) present deviations in zonal and meridional velocity components. Panels (c,d) display relative errors, with values outside of the −100% to 100% range indicating incorrect directional predictions. Panels (e,f) show the predicted error ($ErrQ$), identifying regions of high uncertainty due to data sparsity or complex velocity gradients.
Figure 11. Scatter plot comparing GPR-predicted and NCOM velocities. Marker color and size indicate the predicted error, with brighter colors and larger sizes denoting higher $ErrQ$ values. Only a randomly selected subset (approximately 10%) of the data points is displayed.
Figure 12. Relationship between relative error and velocity magnitude. The x-axis is inverted to emphasize the trend of increasing error at lower velocities. The upper panels compare relative error with exact velocity, while the lower panels compare relative error with predicted velocity. The yellow line marks the 10% error threshold, while the red line indicates 100% error, highlighting the model’s reduced accuracy in low-velocity regions.
Table 1. The Lagrangian–Eulerian advection algorithm.
1. $n$ particles are randomly distributed over the study region.
2. The initial conditions of the $n$ particles (zonal velocity $u(i, t_0)$ and meridional velocity $v(i, t_0)$) are obtained from the NCOM velocity field by 16-point nearest-neighbor interpolation.
3. Each particle travels with its current velocity for one time step $\Delta t$, and its location is then updated as $x(i, t) = x(i, t-1) + u(i, t-1)\,\Delta t$; $y(i, t) = y(i, t-1) + v(i, t-1)\,\Delta t$.
4. Since the velocity field does not change with time, only spatial interpolation is needed to update $u_i$ and $v_i$ from the NCOM velocity field at the new locations.
5. Steps 3 and 4 are repeated until the end of the simulation period.
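The steps in Table 1 amount to a forward-Euler integration of particle positions in a steady velocity field. The following is a minimal sketch of that loop, not the authors' code: the function `steady_velocity` is a hypothetical analytical stand-in (solid-body rotation) for the 16-point interpolation of the NCOM field, and the names `advect`, `dt`, and `n_steps` are illustrative assumptions.

```python
import numpy as np

def steady_velocity(x, y):
    """Hypothetical steady field (solid-body rotation), standing in for
    spatial interpolation of the NCOM velocities at the particle positions."""
    return -y, x

def advect(x0, y0, velocity, dt, n_steps):
    """Forward-Euler advection of particles in a time-independent field."""
    x = np.asarray(x0, dtype=float).copy()
    y = np.asarray(y0, dtype=float).copy()
    xs, ys = [x.copy()], [y.copy()]
    u, v = velocity(x, y)            # step 2: initial velocities
    for _ in range(n_steps):         # step 5: repeat steps 3 and 4
        x = x + u * dt               # step 3: move with the current velocity
        y = y + v * dt
        u, v = velocity(x, y)        # step 4: re-sample velocity at new positions
        xs.append(x.copy())
        ys.append(y.copy())
    return np.array(xs), np.array(ys)

# Step 1: randomly seed n particles in a square region (illustrative values).
rng = np.random.default_rng(0)
n = 5
x0 = rng.uniform(-1.0, 1.0, n)
y0 = rng.uniform(-1.0, 1.0, n)
traj_x, traj_y = advect(x0, y0, steady_velocity, dt=0.01, n_steps=100)
print(traj_x.shape)  # (n_steps + 1, n)
```

Forward Euler is first-order accurate, so in a rotating field the particles drift slowly outward; a smaller time step or a higher-order scheme would reduce that drift.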
Table 2. Optimized hyperparameters of the covariance function for the double-gyre model.
| Hyperparameter | u | v |
|---|---|---|
| $\sigma_N$ | 1.18 × 10⁻²⁰ | 9.75 × 10⁻⁵ |
| $\sigma_1$ | 0.161 | 0.152 |
| $r_{t1}$ (steps) | 13.32 | 15.48 |
| $r_{x1}$ | 0.674 | 0.928 |
| $r_{y1}$ | 0.779 | 1.01 |
| $\sigma_2$ | 0.000634 | 0.00196 |
| $r_{t2}$ (steps) | 114.52 | 247.52 |
| $r_{x2}$ | 0.00933 | 0.0127 |
| $r_{y2}$ | 0.00546 | 0.931 |
Table 3. Optimized hyperparameters of the covariance function for NCOM.
| Hyperparameter | u | v |
|---|---|---|
| $\sigma_N$ (m s⁻¹) | 0.000133 | 0.000356 |
| $\sigma_1$ (m s⁻¹) | 0.0634 | 0.312 |
| $r_{t1}$ (h) | 559 | 1147 |
| $r_{x1}$ (km) | 23.7 | 16.2 |
| $r_{y1}$ (km) | 12.1 | 43.9 |
| $\sigma_2$ (m s⁻¹) | 0.0016 | 0.00554 |
| $r_{t2}$ (h) | 336 | 470 |
| $r_{x2}$ (km) | 3.65 | 3.11 |
| $r_{y2}$ (km) | 3.16 | 4.72 |
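To illustrate how hyperparameters such as those in Table 3 enter a GPR prediction, the sketch below builds a two-scale covariance and evaluates the posterior mean and standard deviation. It rests on assumptions not stated in this section: each scale is taken to be an anisotropic squared-exponential kernel of the form $\sigma^2 \exp[-\tfrac{1}{2}(\Delta t^2/r_t^2 + \Delta x^2/r_x^2 + \Delta y^2/r_y^2)]$, the noise term is $\sigma_N^2$ on the diagonal, and the training observations are invented for illustration. The function names `kernel` and `gpr_predict` are hypothetical.

```python
import numpy as np

# Table 3, u component: (sigma [m/s], r_t [h], r_x [km], r_y [km]) per scale.
SCALES_U = [(0.0634, 559.0, 23.7, 12.1), (0.0016, 336.0, 3.65, 3.16)]
SIGMA_N_U = 0.000133  # noise standard deviation (m/s)

def kernel(A, B, scales):
    """Sum of anisotropic squared-exponential kernels (assumed form).
    A and B are (n, 3) and (m, 3) arrays of (t, x, y) coordinates."""
    diff = A[:, None, :] - B[None, :, :]
    K = np.zeros(diff.shape[:2])
    for sigma, rt, rx, ry in scales:
        r = np.array([rt, rx, ry])
        K += sigma ** 2 * np.exp(-0.5 * np.sum((diff / r) ** 2, axis=-1))
    return K

def gpr_predict(X_train, y_train, X_test, scales, sigma_n):
    """Zero-mean GP posterior mean and standard deviation at X_test."""
    K = kernel(X_train, X_train, scales) + sigma_n ** 2 * np.eye(len(X_train))
    Ks = kernel(X_test, X_train, scales)
    Kss = kernel(X_test, X_test, scales)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks @ alpha
    V = np.linalg.solve(L, Ks.T)
    cov = Kss - V.T @ V
    return mean, np.sqrt(np.clip(np.diag(cov), 0.0, None))

# Invented observations: (t [h], x [km], y [km]) and zonal velocity (m/s).
X_train = np.array([[0.0, 0.0, 0.0], [0.0, 5.0, 0.0],
                    [0.0, 0.0, 5.0], [1.0, 5.0, 5.0]])
y_train = np.array([0.05, 0.06, 0.04, 0.05])
# One test point on top of an observation, one far from all observations.
X_test = np.array([[0.0, 0.0, 0.0], [0.0, 200.0, 200.0]])

mean, std = gpr_predict(X_train, y_train, X_test, SCALES_U, SIGMA_N_U)
print(mean, std)
```

Under these assumptions the posterior standard deviation is small next to an observation and returns to the prior value far from the data, which is the behavior a posterior-covariance error estimate is expected to show in sparsely sampled regions.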
Table 4. NCOM U evaluation table.
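| Number of drifters | Density (n/km²) | EF | D | R² | MBE | RMSE | MAE |
|---|---|---|---|---|---|---|---|
| 1 | 0.0012 | −0.55 | 0.49 | 0.11 | 0.25 | 0.38 | 2.0 × 10⁻⁵ |
| 2 | 0.0024 | −0.56 | 0.49 | 0.087 | 0.25 | 0.38 | 2.0 × 10⁻⁵ |
| 5 | 0.0061 | −0.43 | 0.52 | 0.13 | 0.23 | 0.37 | 1.9 × 10⁻⁵ |
| 10 | 0.012 | −0.30 | 0.56 | 0.18 | 0.21 | 0.35 | 1.7 × 10⁻⁵ |
| 20 | 0.024 | 0.82 | 0.95 | 0.84 | 0.039 | 0.13 | 4.9 × 10⁻⁶ |
| 30 | 0.037 | 0.85 | 0.96 | 0.85 | 0.017 | 0.12 | 4.5 × 10⁻⁶ |
| 40 | 0.049 | 0.94 | 0.98 | 0.94 | −0.012 | 0.078 | 2.85 × 10⁻⁶ |
| 50 | 0.061 | 0.94 | 0.99 | 0.95 | −0.0037 | 0.072 | 2.66 × 10⁻⁶ |
| 100 | 0.12 | 0.98 | 0.99 | 0.98 | 1.2 × 10⁻⁴ | 0.04 | 1.3 × 10⁻⁶ |
| 200 | 0.24 | 0.98 | 0.99 | 0.98 | −0.0013 | 0.038 | 1.1 × 10⁻⁶ |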
Table 5. NCOM V evaluation table.
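| Number of drifters | Density (n/km²) | EF | D | R² | MBE | RMSE | MAE |
|---|---|---|---|---|---|---|---|
| 1 | 0.0012 | −4.6 | 0.46 | 0.078 | 0.63 | 0.83 | 4.5 × 10⁻⁵ |
| 2 | 0.0024 | −2.7 | 0.59 | 0.26 | 0.34 | 0.67 | 3.6 × 10⁻⁵ |
| 5 | 0.0061 | −1.8 | 0.62 | 0.25 | 0.30 | 0.58 | 2.8 × 10⁻⁵ |
| 10 | 0.012 | 0.39 | 0.85 | 0.64 | 0.16 | 0.27 | 1.3 × 10⁻⁵ |
| 20 | 0.024 | 0.68 | 0.92 | 0.71 | 0.019 | 0.20 | 7.8 × 10⁻⁶ |
| 30 | 0.037 | 0.83 | 0.95 | 0.85 | 0.037 | 0.14 | 5.0 × 10⁻⁶ |
| 40 | 0.049 | 0.85 | 0.96 | 0.86 | 0.028 | 0.13 | 4.8 × 10⁻⁶ |
| 50 | 0.061 | 0.86 | 0.96 | 0.86 | 0.023 | 0.13 | 4.5 × 10⁻⁶ |
| 100 | 0.12 | 0.96 | 0.99 | 0.96 | 0.024 | 0.075 | 2.3 × 10⁻⁶ |
| 200 | 0.24 | 0.98 | 0.99 | 0.98 | 0.0059 | 0.052 | 1.5 × 10⁻⁶ |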
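The six column headings in Tables 4 and 5 can be computed from paired reference and predicted velocities. The sketch below uses the textbook definitions, which is an assumption: EF as the modelling efficiency $1 - \sum (p - o)^2 / \sum (o - \bar{o})^2$, D as Willmott's index of agreement, and R² as the squared Pearson correlation. The paper's exact formulas and normalizations are not given in this section and may differ (the tabulated MAE values in particular suggest a different scaling), so this function is not expected to reproduce the tables. The name `evaluation_metrics` and the sample arrays are illustrative.

```python
import numpy as np

def evaluation_metrics(obs, pred):
    """Standard-definition metrics for reference (obs) vs. predicted values."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = pred - obs
    obs_anom = obs - obs.mean()
    sse = np.sum(err ** 2)
    return {
        # Modelling efficiency: 1 is perfect, 0 matches predicting the mean.
        "EF": 1.0 - sse / np.sum(obs_anom ** 2),
        # Willmott index of agreement, bounded between 0 and 1.
        "D": 1.0 - sse / np.sum((np.abs(pred - obs.mean()) + np.abs(obs_anom)) ** 2),
        # Squared Pearson correlation coefficient.
        "R2": np.corrcoef(obs, pred)[0, 1] ** 2,
        # Mean bias error: positive when predictions are too high on average.
        "MBE": err.mean(),
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "MAE": np.mean(np.abs(err)),
    }

# Illustrative values only: predictions with a constant +0.1 m/s bias.
obs = np.array([0.1, 0.2, 0.3, 0.4])
pred = obs + 0.1
metrics = evaluation_metrics(obs, pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A constant bias leaves R² at 1 while lowering EF, which is one way EF can be low or negative in sparse-sampling rows of a table even when R² stays positive.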

Xia, J.; Iskandarani, M.; Gonçalves, R.C.; Özgökmen, T. Error Quantification of Gaussian Process Regression for Extracting Eulerian Velocity Fields from Ocean Drifters. J. Mar. Sci. Eng. 2025, 13, 431. https://doi.org/10.3390/jmse13030431
