Uncertainty Quantification of Trajectory Clustering Applied to Ocean Ensemble Forecasts

Partitioning ocean flows into regions dynamically distinct from their surroundings based on material transport can assist search-and-rescue planning by reducing the search domain. The spectral clustering method partitions the domain by identifying fluid particle trajectories that are similar. The partitioning validity depends on the accuracy of the ocean forecasting, which is subject to several sources of uncertainty: model initialization, limited knowledge of the physical processes, boundary conditions, and forcing terms. Instead of a single model output, multiple realizations are produced spanning a range of potential outcomes, and trajectory clustering is used to identify robust features and quantify the uncertainty of the ensemble-averaged results. First, ensemble statistics are used to investigate the cluster sensitivity to the spectral clustering method free-parameters and the forecast parameters for the analytic Bickley jet, a geostrophic flow model. Then, we analyze an operational coastal ocean ensemble forecast and compare the clustering results to drifter trajectories south of Martha's Vineyard. This approach accurately identifies regions of low uncertainty where drifters released within a cluster remain there throughout the window of analysis. Drifters released in regions of high uncertainty tend to either enter neighboring clusters or deviate from all predicted outcomes.


Introduction
Fluid flows, even if unsteady and aperiodic, may admit persistent patterns generally referred to as coherent structures that reveal flow characteristics related to the transport of fluid particles [McWilliams, 1984;Provenzale, 1999;Haller, 2015]. Coherent structures of the elliptic type [Haller and Beron-Vera, 2012;Froyland and Padberg-Gehle, 2015;Allshouse and Peacock, 2015] are portions of fluid that do not significantly mix with the rest of the domain. From a Lagrangian perspective, the perimeter delimiting material within these structures remains nearly constant as they move [Allshouse and Thiffeault, 2012;Froyland, 2015], and fluid can be transported over long distances while surrounded by more vigorous mixing [Provenzale, 1999;Beron-Vera et al., 2013;Haller and Beron-Vera, 2013]. Eulerian methods compute coherent structures directly from instantaneous velocity fields and may not highlight features that have a persistent impact on transport. Additionally, these methods require full velocity fields, which may be unavailable or hard to reconstruct from sparse sets of observed trajectories. Lagrangian methods, in turn, can identify coherent structures based on the trajectories themselves and identify the dominant features over a given time interval [Provenzale, 1999;Boffetta et al., 2001;Peacock and Dabiri, 2010;Allshouse and Thiffeault, 2012;Haller and Beron-Vera, 2013].
One approach for identifying these coherent structures uses cluster analysis. Clustering algorithms reveal underlying structures in data sets by partitioning the data so that similar elements are assigned to the same cluster, while dissimilar elements are assigned to different ones. These algorithms have been extensively studied and widely applied in image segmentation and anomaly detection, as well as biological and physical processes [Jain et al., 1999;Everitt et al., 2011]. For fluid transport analysis, clustering techniques can G. S. Vieira, I. I. Rypina, and M. R. Allshouse, 2020 efficiently identify elliptic structures when applied to fluid particle trajectories [Froyland and Padberg-Gehle, 2015;Hadjighasem et al., 2016], which we refer to as trajectory clustering.
Here, we analyze particle trajectories using the spectral clustering algorithm [Shi and Malik, 2000;Ng et al., 2002;von Luxburg, 2007;Filippone et al., 2008]. This method has been used to identify coherent structures in analytic and simulated flows [Hadjighasem et al., 2016[Hadjighasem et al., , 2017Vieira and Allshouse, 2020]. The clustering performs a systematic partitioning of the trajectories into coherent and incoherent sets, providing a conceptual simplification of the underlying dynamical system for a general flow. This method is also frameinvariant, and hence the identified structures are the same in all frames that translate or rotate relative to each other. One drawback of the spectral clustering method, however, is that there are a number of free-parameters that the user has to select, which could impact the results [Hadjighasem et al., 2017].
While ocean drifter data can be clustered a posteriori to identify coherent structures, an a priori analysis to predict coherent structures relies on trajectories obtained from a forecast model. The clustering accuracy is limited by that of the trajectories, which, in turn, is limited by the accuracy of the velocity model. In ocean modeling, however, several sources of uncertainties pose a challenge to the Lagrangian approach [Lermusiaux, 2006;Lermusiaux et al., 2006]. To simplify ocean models and reduce computational expenses, the governing equations are only resolved on a restricted range of spatial and temporal scales, and the influence of scales outside this window is either parameterized or neglected. Uncertainties also arise from the limited knowledge of processes within the scale window, which require approximate representations or parameterizations. Moreover, measurements used for model initialization and parameter estimation are limited in coverage and accuracy, leading to imprecise initial conditions and model parameters. Finally, models of interactions between the ocean and the atmosphere are approximate, and ocean boundary conditions are inexact. All of the above lead to differences between the actual values and the modeled values of physical fields and properties.
To account for model uncertainties intrinsic to the modeling process, an effective option is to perform ensemble statistics. Different sets of model parameter values generate an ensemble of possible outcomes, which are then processed to provide probabilistic information about the variability on the end results. Searchand-rescue planning, for instance, already considers ensemble statistics to produce probability-distribution maps for the target's location [Kratzke et al., 2010]. The vast uncertain parameter space together with the continuous motion of floating objects driven by unsteady flows, however, can lead to error accumulation in the predicted trajectories [Serra et al., 2020]. Coherent structures have been shown to depend less on individual trajectories and are more robust to model parameter variations and noise, highlighting the main structures in the flow even in the event of imperfect or scarce trajectories [Haller, 2002;Lermusiaux et al., 2006]. The robustness of the clustering algorithm to perturbations in the individual trajectories via an ensemble set of realizations has yet to be studied and could aid in quantifying uncertainty for the trajectory clusters.
The detection of robust clusters could then be used in the deployment of drifters to observe such features in the ocean. Other Lagrangian methods have been used in the past in experiments to interpret the observed behavior of drifters deployed in the ocean Jacobs et al., 2014;Beron-Vera et al., 2015;Williams et al., 2015;Rypina et al., 2014Rypina et al., , 2016Rypina and Pratt, 2017]. In these studies, drifters were released, tracked, and their trajectories were compared to results from Lagrangian analyses performed a posteriori using velocities from models, satellite altimetry or radar measurements. Few studies, however, have used Lagrangian methods to plan and execute field experiments [Haza et al., 2007[Haza et al., , 2010Serra et al., 2020]. The drifter-release experiment presented in [Filippi et al., 2020], whose drifter data is used in this paper, is one of the first experiments targeting coherent structures based on an a priori trajectory clustering obtained from forecast simulations.
Our approach accounts for and quantifies uncertainty when clustering trajectories. We apply the spectral clustering algorithm varying the method free-parameters to understand how the clustering results are sensitive to the implementation, analyze ensemble simulations to understand how the model parameters impact the resulting clusters, and finally apply the method to a forecast data set to compare the clustering prediction to experimental drifter data. The method by Hadjighasem et al. [2016] is modified with a more broadly used similarity function and soft clustering membership probabilities, allowing for a probabilistic view of the clusters for each individual model realization. Two different systems are analyzed with our approach: an analytic flow model and forecast simulations provided by a coastal ocean model. First, the Bickley jet analytic flow [Rypina et al., 2007;Beron-Vera et al., 2010] with model parameter variations is used to mimic model uncertainty. Then, we analyze ensemble-forecasts generated by the MIT-MSEAS ocean model [Haley and Lermusiaux, 2010;Haley Jr et al., 2015] of the coastal region of the Martha's Vineyard island. Ensemble statistics of the resulting clusters provide a probabilistic view of the coherent structures identified by the method. The forecast clustering results are used to identify coherent structures in a drifterrelease experiment, and the drifter trajectories are compared to the forecast cluster behavior and associated uncertainty.
The paper is organized as follows. Section 2 presents the spectral clustering method used in this work and how clustering results are processed to provide ensemble statistics. Section 3 introduces the quasi-periodic Bickley jet system, performs ensemble statistics for a single set of particle trajectories while varying the method free-parameters to quantify the sensitivity of the clustering results to the implementation. Section 4 varies Bickley jet system model parameters to asses the robustness of the identified coherent structures in a scenario of uncertainty in the velocity field. Section 5 presents the clustering analysis used to target coherent structures in a drifter-release experiment south of Martha's Vineyard, and compares the forecast results to the observed drifter trajectories. Finally, conclusions are presented in Section 6.

Method
This section presents the spectral clustering method and how uncertainty is measured. We adapt the method of Hadjighasem et al. [2016] to use soft clustering (fuzzy c-means) to assign a cluster membership probability to each particle trajectory. Section 2.1 introduces the method along with specifics of the similarity measure selected and the soft clustering approach. Section 2.2 presents the general procedure we use to calculate statistics and quantify uncertainty when considering an ensemble of possible clustering results.

Spectral Clustering Method with Soft Memberships for Trajectory Clustering
To analyze flows from a Lagrangian perspective, we consider massless fluid particles that move with the velocity field u(t, x). The trajectory x i (t) of fluid particle i departing from (1) A total of N particles are initialized in a grid that uniformly covers the targeted domain at t 0 , and their individual trajectories are numerically integrated and output at discrete time instances within the time interval [t 0 , t f ]. We use the set of trajectories {x i (t)} 1≤i≤N to partition the spatial domain into clusters. The spectral clustering algorithm performs an eigenanalysis to project the trajectory set onto a subspace that may yield clusters maximizing the within-cluster similarity and minimizing the between-clusters similarity [Nock et al., 2009]. Particles clustered together should move as a compact group, with limited mixing with particles outside of the cluster, while particles in different clusters should experience dissimilar trajectories [Froyland and Padberg-Gehle, 2015;Hadjighasem et al., 2016].
The spectral analysis requires the construction of a positive, weighted, undirected graph known as the similarity graph. This graph quantifies the pairwise similarities between trajectories. Each graph node represents a particle trajectory, and the graph edges are weighted by the similarity between the nodes they connect [von Luxburg, 2007;Hadjighasem et al., 2016;Shi and Malik, 2000]. To compute these weights, one must first define a measure for similarity between trajectories. One possible metric for the dissimilarity between trajectories x i and x j is their time-averaged distance where dist(·, ·) is the Euclidean distance. Then, a similarity measure for the edge weights w ij = w(r ij ) must be chosen as a function of the time-averaged distance. The only restriction on this functional dependence is that the weight should be a monotonically decreasing function of the distance. This choice of similarity function controls the graph edge weight distribution, and therefore can impact the clustering results [von Luxburg, 2007]. Hadjighasem et al. [2016] use a similarity function that is the inverse of the time-averaged distance, w ij = l x /r ij (a constant l x is included here to make the weight dimensionless and does not impact the results). A more widely used function for graph partitioning is the Gaussian function [Shi and Malik, 2000;Ng et al., 2002;von Luxburg, 2007] w ij = exp −r 2 ij /2σ 2 , which we will use. The similarity radius σ > 0 is a method free-parameter that controls the spatial width of the connected neighborhoods [Shi and Malik, 2000]. Note that w ii = 1 and w ij → 0 for r ij σ. The choice of the Gaussian measure has a number of advantages over the l x /r ij choice: first, it is bounded, with 0 ≤ w ij ≤ 1; second, no offset value is set in the diagonal entries [Hadjighasem et al., 2016]; and third, the sparsification step can be skipped (or at least has a negligible impact on the results, see Supplementary Material A), as (3) goes to zero faster than the measure in [Hadjighasem et al., 2016] for large r ij . The similarity radius σ in (3) determines the rate of decay of w ij to zero as r ij increases, which is analogous to the graph sparsification in [Hadjighasem et al., 2016] where w ij = 0 for r ij above a defined threshold. A direct comparison between these two similarity functions and the impact of the σ-choice are presented in Sections 3.1 and 3.2, respectively.
The similarity matrix W ∈ R N ×N stores the w ij values, and the diagonal degree matrix D is computed such that d ii = N j=1 w ij . The generalized eigenvectors q of the unnormalized graph Laplacian L = D − W are computed from the generalized eigenproblem The normalized vectors q 1 , q 2 , . . . , q N , corresponding to the generalized eigenvalues 0 ≤ λ 1 ≤ . . . ≤ λ N , differentiate properties in the graph and facilitate the clustering process [von Luxburg, 2007]. While all of the eigenvectors provide information, the dominant eigenvectors, associated with the smallest eigenvalues, reveal the most dynamically relevant characteristics. Each trajectory is characterized by a value within each eigenvector, and these are the values ultimately used to cluster trajectories. The eigenvector matrix Q ∈ R N ×M whose columns are the dominant eigenvectors q 1 , . . . , q M , with M N , stores the characteristics used to cluster the trajectories. The number of retained eigenvectors M is specified depending on the system. Let y i ∈ R M be the characterization vector corresponding to the i-th row of Q, which contains the condensed differentiating information of trajectory i. The eigenanalysis provides a suitable low dimensional representation of the data set. The characterization vectors {y i } 1≤i≤N are then partitioned into K clusters. The relationship between the number of clusters K and the number of dominant eigenvectors M used for clustering depends on characteristics of the system, and is specified for each of the case studies in the following sections.
We cluster the characterization vectors using a fuzzy c-means algorithm [Froyland and Padberg-Gehle, 2015;Filippone et al., 2008], instead of the conventional k-means often performed at this step [von Luxburg, 2007;Hadjighasem et al., 2016]. The c-means algorithm assigns to each trajectory i probabilities p i,k ∈ [0, 1] of being a member of cluster k, with 1 ≤ k ≤ K, such that K k=1 p i,k = 1. This step requires the prescription of a fuzziness parameter m > 1 that controls how "tight" the clustering membership probabilities are. As m → 1, the clustering result approaches the k -means result where p i,k ∈ {0, 1}. For greater m, the "looser" distribution in membership probabilities allows the identification of particles that present intermediate behaviors between clusters, as opposed to always assigning a trajectory as member of a single cluster. The cluster centers are initialized based on the dominant eigenvectors before starting the c-means iterative process detailed by Froyland and Padberg-Gehle [2015]. Finally, Hadjighasem et al. [2016] use an eigengap heuristic to determine M and K that relies on a large cluster differentiation to work properly, and in many practical cases an eigengap may be less pronounced or nonexistent [von Luxburg, 2007]. We leave the number of dominant eigenvectors and clusters as method free-parameters and study the related uncertainty associated to the clustering results for each K-choice.

Uncertainty Quantification for Multiple Realizations
In Sections 4 and 5, we will be interested not in applying the spectral clustering algorithm to a single realization of the flow, but in collecting clustering results from different realizations and combining them to get statistical information about the variability of the clusters with the method free-parameters σ, m and K, or with velocity model parameters. We therefore need a method to combine clustering results from different realizations. Regardless of the differences between realizations, we initialize the N particles on the same grid, so that trajectory i is uniquely labelled across realizations, based on x i (t 0 ). Each realization I ∈ {1, . . . , R} generates a full set of membership probabilities p (I) i,k , with i ∈ {1, . . . , N } and k ∈ {1, . . . , K}. The cluster labels for different realizations are matched based on the similarity between cluster centers.
To quantify the uncertainty of a trajectory i being a member of cluster k, the probabilities p (I) i,k are used to compute the mean membership probabilities over all R realizations and the corresponding sample standard deviations Both the mean and the standard deviation of the realizations are bounded values, with 0 ≤ p i,k ≤ 1 and 0 ≤ S i,k ≤ 0.5. We perform this calculation for each trajectory, for each cluster.

The Bickley Jet System and Sensitivity to Method Free-Parameters
We analyze the analytic, quasi-periodic Bickley jet system to evaluate the spectral clustering method freeparameter and velocity field model sensitivity. This system is an idealized model of a meandering zonal jet under geostrophic balance and has been extensively used to illustrate coherent structures in fluid flows [Rypina et al., 2007;Beron-Vera et al., 2010;Schlueter-Kuck and Dabiri, 2017;Hadjighasem et al., 2017]. The model features a sheared zonal flow on which a superposition of Rossby-like waves propagate. The streamfunction ψ prescribing the two-dimensional velocity field u = −∂ y ψ e x + ∂ x ψ e y with a superposition of three waves is where U and L are a characteristic speed and length, respectively. The domain is periodic in the x-direction, with periodicity l x = 2πR e cos(60 • ), where R e = 6371 km is Earth's radius. The rectangular domain corresponds to x/l x ∈ [0, 1], and we limit our analysis to y/l x ∈ [−0.15, 0.15]. The Rossby-like waves correspond to the three longest wave modes in the periodic domain, with amplitudes A n , wave numbers k n = 2πn/l x , phase speeds c n , and phases φ n , for n = 1, 2, 3. To model the self-consistent state obtained by Rypina et al. [2007] for modes 2 and 3 on the periodic domain, we fix U = 62.74 ms −1 , L = 1767 km, c 2 /U = 0.2051 and c 3 /U = 0.4615. The mode-1 wave has speed c 1 /U = 0.1446 chosen based on the golden ratio to break periodicity [Rypina et al., 2007;Hadjighasem et al., 2016]. Provided that A 1 min(A 2 , A 3 ), the mode-1 wave acts as a small perturbation to the system's periodicity. In this section, we fix the mode amplitudes to A 1 = 0.0075, A 2 = 0.15, and A 3 = 0.30, values used by Hadjighasem et al. [2016], and all three waves are in phase: φ 1 = φ 2 = φ 3 = 0. Note, however, that while the system dynamics depend on the values of A n and φ n [Rypina et al., 2007], there is no physical basis for the stated choice of amplitudes and phases, and the impact of varying these parameters will be explored in Section 4.
The present section discusses the application of the spectral clustering algorithm to the Bickley jet system, and studies the sensitivity of the results to user-defined method free-parameters. We present the impact of the choice of similarity measure in Section 3.1. Then, study the sensitivity to the method free-parameters: similarity radius σ, in Section 3.2, tightness of the cluster memberships m, in Section 3.3, and number of clusters K, in Section 3.4.

Gaussian Similarity Measure
To cluster the Bickley jet system according to the method described in Section 2.1, particles are initialized in a uniform grid of 400 by 120 positions uniformly covering the domain, and advected from t 0 = 0 to t f = 40 days, matching [Hadjighasem et al., 2016]. The distance function in (2) takes into consideration the x-periodicity of the domain, and the M = 6 dominant eigenvectors of (4) are used to partition the domain into K = 7 clusters, to account for 6 materially coherent vortices, which are the coherent clusters, and an incoherent cluster, the chaotic sea [Hadjighasem et al., 2016]. The method uses a fuzziness parameter m = 2. The membership probabilities for the clusters identifying the 6 materially coherent vortices at time t 0 are presented in Figure 1. Figure 1(a) presents our results, obtained using a Gaussian similarity function (3), with similarity radius σ/l x = 0.020, and no sparsification of the similarity matrix. The membership probabilities, plotted in different colors, highlight the 6 coherent vortices. We assign labels to the vortices, from left to right, and vortex 6 appears split at t 0 due to the x-periodicity. The membership probabilities of being part of the chaotic sea, the seventh cluster, are complementary to the ones plotted and are omitted throughout. Note that the use of a soft membership assigns to particles located at the periphery of the vortices lower probabilities of being a member, which relates to lower similarity in the dynamics (some of them may, for instance, be trapped inside the vortices for just a fraction of the time window of analysis, then merge with the chaotic sea, or vice-versa).
The results for the Gaussian similarity measure in Figure 1(a) are similar to those obtained with the l x /r ij similarity measure used by Hadjighasem et al. [2016], presented in Figure 1(b). For the latter, matrix entries w ij = 0 for r ij /l x > 0.075, and the offset diagonal value is chosen as 100 times the largest matrix entry. The clustering results are particularly sensitive to the sparsification threshold, which relates to the choice of σ for the similarity measure (3). Both similarity functions yield similar results for the selected parameter values, with smoother transitions in membership probabilities (from 1 to 0) for the Gaussian measure.
When using the l x /r ij similarity measure, the degree of sparsification is an additional parameter for the clustering method that can impact the results. This parameter can be eliminated for the Gaussian similarity measure as no sparsification was necessary for the result in Figure 1(a), but it is worthwhile to sparsify the matrix to reduce computational costs and storage. We demonstrate in the Supplementary Material A that sparsifying entries of W that satisfy w ij < exp(−4 2 /2) ≈ 3 · 10 −4 , corresponding to r ij > 4σ, has negligible impact on the clustering membership probability results, and hence we sparsify according to this rule hereafter. Note that this result is valid for the Bickley jet, but the impact of sparsification may vary for an arbitrary flow.

Similarity Radius
We highlighted in the previous section how the similarity radius σ is closely related to the sparsification threshold used by Hadjighasem et al. [2016]. While the sparsification threshold selection was mentioned in [Hadjighasem et al., 2016], the clustering results are highly sensitive to this parameter [Filippi et al., 2020], and we now address this sensitivity.
While there may be some intuition on the size of the structures of interest, this is not always helpful in prescribing σ or in understanding how the σ-choice impacts the resulting clusters. Hadjighasem et al. [2016] choose their threshold by defining which values of W to keep based on a fixed percent sparsification of the matrix. However, for a fixed percent sparsification, the graph connections that are retained are ultimately a function of the number of particles and their distribution in the initial grid.
Based of this relationship between the sparsification level in [Hadjighasem et al., 2016] and the parameter σ in the Gaussian similarity function (3), we vary σ to demonstrating the clustering sensitivity to changes in sparsification. First, we define an interval bounding all relevant σ-values to be tested. For σ/l x distributed on the interval [0.005, 0.040] with steps of 0.001 (36 cases), the method is applied with m = 2 to identify K = 7 clusters. Figure 2 presents the membership probabilities for each of the six coherent clusters for σ/l x = 0.005, 0.015, 0.030 and 0.040. Compared to the results for σ/l x = 0.020 presented in Figure 1(a), smaller values of σ (Figures 2(a,b)) tend to shrink the coherent clusters to the vortex cores only, while larger choices of σ (Figures 2(c,d)) assign higher membership probabilities to filaments that correspond to particles that do not belong to the coherent vortex from the start, but have a long residence time on the perimeter of the vortex.
To determine how the membership probabilities for each trajectory depend on σ, we use the information from the different realizations to compute the membership probability means p i,k and sample standard deviations S i,k for each trajectory and cluster, as described in Section 2.2. These statistical measures are presented in Figure 3. In Figure 3(a), we present p i,1 for vortex k = 1, and in Figure 3 such that, for each trajectory i,k is the cluster that maximizes the mean membership probability, hencê k = argmax k p i,k . Figure 3(a,c) demonstrates that the vortex cores have the greatest mean membership, which relates to the fact that those particles are identified with high membership probabilities in all realizations. Particles further away from the cores have a lower mean, as a result of lower probabilities of being part of the respective vortices, in particular for low σ values. We also notice sharp drops in the probability after a certain vortex size. The uncertainty of particles being part of a vortex is highlighted by the standard deviations in Figure  3(b,d), where we again see negligible standard deviation (low uncertainty) on the membership probabilities for the cores. Because there are no realizations that identify the central jet and some portions of the chaotic sea as part of the clusters, there is also a low standard deviation for those trajectories. The highest uncertainty is obtained for particles between the core and the chaotic sea, and some filaments are also highlighted with higher standard deviation. Those filaments correspond to particles consistently identified as part of a vortex, with high membership probabilities, for σ greater than a threshold, but not for lower values of σ.

Fuzziness Parameter
Next, we consider the sensitivity of the choice of the fuzziness parameter m, introduced by the c-means step that assigns cluster membership probabilities. In this analysis, σ/l x = 0.020 while m is varied on the interval [1.05, 3.00] with steps of 0.05. The results are presented in Figure 4. The membership probabilities for the extreme values (m = 1.05 and 3.00) are illustrated in Figure 4(a,b), and the superimposed mean and standard deviation in Figure 4(c,d).
While m = 1.05 (in Figure 4(a)) yields a bimodal probability distribution, with 99% of the p i,k > 0.5 cases also being greater than 0.95 (so tending to the corresponding k-means result), the use of m = 3.00 ( Figure  4(b)) produces smoother probability transitions from the vortex core to the perimeter, as well as overall lower membership probabilities, with only 10% of the p i,k > 0.5 cases greater than 0.95. The superimposed mean and standard deviations reveal a similar trend to the one observed for the dependence on σ, with two major differences: (i) while varying m introduces uncertainty in the membership probability of trajectories starting at the vortex perimeters, the m-choice does not lead to the same significant capture of filaments as member of the vortex for the current σ/l x = 0.020 value, and (ii) the magnitude of the standard deviations related to m are about half of the ones associated to σ (see Figure 4(d) in comparison to Figure 3(d)).

Number of Clusters
Finally, the number of clusters in the system is not necessarily known beforehand, and here we address how the clustering results statistically vary for different choices of K. As presented in [Hadjighasem et al., 2016], for a suitable sparsification level of the similarity matrix, the eigengap heuristic can be used to infer the use of M = 6 dominant eigenvectors to choose K = 7 for the Bickley jet, and thus identify the 6 materially coherent  Figures 1 and 3. vortices, plus the chaotic sea. It is known, however, that the existence of an eigengap is not guaranteed for any system, and depends on, among other properties, the connectivity of the similarity graph [von Luxburg, 2007]. The number of clusters for systems not as distinctly separated as the Bickley jet will, therefore, be more challenging to determine. Without any prior knowledge about the number of clusters in the system, one could consider clustering based on different numbers of dominant eigenvectors of (4) into different K, which might result in merging clusters, splitting clusters, missing clusters, or even identifying new ones. As we vary K, we fix the relationship between M and K to K = M + 1 to always include the chaotic sea as an independent cluster [Hadjighasem et al., 2016].
Because the free-parameter K cannot be varied continuously, and because varying K while fixing σ and m would only mean changing the number of eigenvectors to use in the c-means step, we adopt a different strategy to quantify the K-uncertainty. For each choice of K, we perform statistics using realizations in which σ and m are varied. The (σ/l x , m) pairs are drawn from uniform distributions over [0.005, 0.040]×[1.05, 3.00], corresponding to the same intervals in Sections 3.2 and 3.3. We consider 100 samples drawn from uniform distributions, and compute mean and sample standard deviations for K = 6, 7 and 8. One should now expect missing or merging vortices for K < 7, while splitting vortices or identifying new structures for K > 7. For the ensemble statistics, if vortex k is not identified in realization I, we set p (I) i,k = 0 for all i. For K = 8, a new coherent cluster corresponding to the jet is consistently identified, in all realizations. The jet is therefore considered a seventh coherent cluster.
The superimposed means and standard deviations for K = 6, 7 and 8 are presented in Figure 5. For K = 6, Figure 5(a) reveals that different (σ/l x , m) free-parameter choices can result in different vortices not being identified. Notably, vortices 3, 5 and 6 are missed in some realizations, which reduces their mean probability and increases uncertainty for those clusters. Figure 5(b), for K = 7, presents six vortices identified with similar probability distributions to the ones obtained by varying σ alone in Section 3.1 (in Figure 3(c,d)). Such similarity between these cases highlights the dominance of the σ-sensitivity over the choice of m, and the fact that these two parameters do not compound each other. Finally, Figure 5(c) presents the results for K = 8, highlighting the consistent identification of the jet as a coherent cluster in the system. The detection of the jet has little impact on the probability distributions for the six vortices. Both the means and standard deviations for the vortices remain similar to the ones obtained for K = 7, with a slight standard deviation increase for the cores. The jet cluster has an overall higher sensitivity to σ and m, with higher standard deviations than the vortex cores in Figure 5 While these ensemble statistics could be used as a basis for setting the method free-parameters, a thorough investigation on this is beyond the focus of the parameter sensitivity study presented in this paper. Based on the results for K = 6, 7, and 8, it would be reasonable to state that K = 7 could be chosen over the other options, as it is the case leading to the smallest space-averaged uncertainty. At the same time, however, the jet is a structure that differentiates itself from the rest of the flow, by behaving in a distinct and coherent way compared to the remaining particles in the chaotic sea. It could be argued that the choice of K = 8 for this system is an equally possible way of clustering the system, and provides extra information about the jet cluster, at the price of increasing the result sensitivity to other method free-parameters. Further work is necessary to automate this selection for a general system.

Ensemble Realizations and Uncertainty to Model Parameters and System Dynamics
In the previous section, a single set of model parameters of the Bickley jet system, with fixed physical parameters, was used to demonstrate how the choice of the method free-parameters impact the clustering results. Here, we focus on model parameters that influence the dynamics of the system by changing the velocity field and the resulting trajectories. Multiple realizations of the system are used to determine the clustering sensitivity to model parameters, allowing for a characterization of the robustness of the identified clusters. Section 4.1 explains how system parameters are randomized to generate multiple dynamically different realizations. Section 4.2 presents the statistics over the described realizations, leading to an uncertainty quantification of the clustering results to model parameters.

Perturbing the Bickley Jet Dynamics
While system parameters such as U , L, c 2 , and c 3 are set by physical arguments (as discussed in Section 3), the wave amplitudes A n and phases φ n are not, despite exerting a major influence on the system dynamics [Rypina et al., 2007]. We use these values as unknown model parameters to introduce variability in the system dynamics. The amplitudes and phases are drawn from normal distributions centered around the values used in Section 3 (and [Hadjighasem et al., 2016]). The realization presented in Section 3 is hereafter referred to as the central realization, and corresponds to amplitudes A n = A n , with A 1 = 0.0075, A 2 = 0.15, and A 3 = 0.30, and phases φ n = 0. Therefore, each realization I in this section is generated by where N µ, Σ 2 denotes a normal distribution of mean µ and standard deviation Σ. The standard deviations for A (I) n are scaled by the corresponding mean values, while the standard deviation for φ (I) n is chosen small enough so that the vortex centers for each realization are likely to be inside of the area covered by the vortices in the central realization.
A total of R = 1000 realizations are generated from these distributions, and the spectral clustering algorithm described in Section 2 is applied with method parameters fixed to σ/l x = 0.020, m = 2 and K = 7. While it is possible that individual realizations may require a different method free-parameter selection, our goal in presenting this section is to separate the effects of method free-parameters from those of model parameters. By sampling model parameters together with method free-parameters, the cluster uncertainty resulting from the model would be obfuscated. We thus fix the method free-parameters based on the central realization only, while considering multiple realizations with varying model parameters. Figure 6 presents the membership probabilities for four realizations. While the realizations do modify the position, shape, and dynamics of the vortices (see videos in Supplementary Material), their presence and number, as well as the presence of the shear jet, mostly remain unchanged. Figure 6(a,b) presents how variable the initial cluster sizes and shapes can be, as well as the effect of the wave phases on the nonuniform spacing between the identified vortices at t 0 . While for most realizations all six vortices are identified, there are cases where some of the expected vortices are not identified. Figure 6(c) presents a case in which the jet is identified and one of the vortices is missed. For that case, vortex 5 gets identified as part of the chaotic sea (not plotted), while a highly asymmetric jet is identified as another coherent cluster in the system, and trajectories are assigned membership probabilities p i,jet . Figure 6(d) presents a case for which a more symmetric jet is identified as a cluster and two of the vortices (2 and 4) are merged into a single cluster. Other realizations, not illustrated here, result in cases where only a few (or even none) of the vortices are identified. For those realizations, there is no clear Eulerian signature of the six vortices. In what follows, only clusters that identify one and only one vortex are considered for statistical purposes.

Uncertainty Quantification of Ensemble Simulations
The clustering results for all R = 1000 realizations are analyzed to measure their uncertainty statistics. For each one of the vortices, the mean and standard deviation over the ensemble of parameter value sets of the Bickley jet are computed according to (5) and (6). Figure 7 presents the superimposed membership probability mean, p i,k , and corresponding superimposed sample standard deviation, S i,k , for the ensemble. Even with a variety of behaviors observed in the ensemble, as illustrated in Figure 6, averaging the ensemble smooths the vortices, resulting in cluster shapes and positions that are similar to the ones obtained for the central realization (Figure 1(a)). However, the mean probabilities in Figure 7(a) highlight how introducing uncertainty in the wave amplitudes and phases leads to smaller vortex cores with high membership probabilities. The mean probabilities now peak at 0.91 rather than 1.00, due to the phase shift between realizations.  The membership probability decay from the vortex cores to the perimeters is also more gradual than for the central realization, and the averaging clears out previously identified filaments that are realization-dependent. Figure 7(b) highlights a more spread out distribution and overall higher magnitude for the superimposed standard deviation. Moreover, higher standard deviations are now observed at the vortex cores, which arises from the phase parameter uncertainty. Also, notice that most of the low uncertainty regions associated to the jet in Figure 3(b) have higher uncertainty in Figure 7(b). With the varying model parameters and dynamics, the vortex positions, shapes and trajectories are more variable. All of these introduce uncertainties that are not observed for the central set of parameter values. The ensemble analysis, therefore, highlights structure sensitivities (and robustness) that are not apparent from the central realization alone.
To illustrate the ramifications of regions of high uncertainty, we demonstrate how different ensemble trajectories are when initialized at high and low uncertainty locations. At t 0 = 0, the orange particles in Figure 8(a) are initially located at the core of cluster 1 and correspond to p i,1 = 0.883 and S i,1 = 0.287, while the red particles start at the perimeter of cluster 4 and correspond to p j,4 = 0.469 and S j,4 = 0.475. Figure 8(b,c,d) presents snapshots of the particle positions for all 1000 realizations at three different times. For reference, the gray boundaries enclose vortices k = 1 and 4 for the central realization (for particles such that p i,k ≥ 0.5). While particles released in the position of lowest uncertainty (orange) remain concentrated, particles released in the position of highest uncertainty (red) quickly spread throughout the domain. Particle concentration is more pronounced for the case of lower uncertainty, with 76.4% of the particles remaining inside the corresponding boundary after t = 40 days, as opposed to only 45.0% for the higher uncertainty case. The presence of the jet separating the top and bottom of the domain keeps most orange and red particles from moving to the opposite half.
Regions of higher uncertainty, not characterized when considering the central realization only, are revealed by the ensemble analysis, and correspond to flow regions for which particle cluster membership is most unknown. Our analysis shows that the cores of the vortices are robust even if the model parameters are varied, but the narrow filaments identified in the central realizations should be viewed as less robust as

Martha's Vineyard Ensemble Forecast and Surface Drifter Trajectories
Having demonstrated the clustering uncertainty quantification method on an analytic system, we apply the technique to a real ensemble forecast of a nested primitive equation ocean model, with intrinsic model uncertainty. The model is used to forecast the three-dimensional velocity field for the coastal region near Martha's Vineyard, an island located south of Cape Cod, Massachusetts, USA. Trajectories from drifters deployed [Filippi et al., 2020] for the corresponding day are then compared to the cluster behaviors. Section 5.1 presents characteristics of the Martha's Vineyard region, the model used to forecast the velocity field, and the results of the clustering uncertainty quantification analysis applied to the ensemble forecast. Section 5.2 details the drifter experiment and compares the drifter trajectories to the forecast trajectory clustering results and the associated uncertainty.

Velocity Model Ensemble Forecast and Uncertainty Quantification
The island of Martha's Vineyard, with an area of almost 250 km 2 , is the largest island in New England, and lies 11 km off the coast of Cape Cod. The prevailing currents in the coastal region south of the island are associated with wind-driven coastal divergence, tidal forcing and a mean southward drift [Rypina et al., 2014]. During the summer months, the region experiences a mean westward surface current that reaches velocities of 15 cm s −1 [Rypina et al., 2016]. The drifter deployment experiments targeting predicted coherent structures [Filippi et al., 2020], presented in Section 5.2, took place around the 2.5 km 2 uninhabited island of Nomans Land, south of Martha's Vineyard. The channel between the two islands has a width of approximately 5 km and an average depth of 10 m.
We used the MIT Multidisciplinary Simulation, Estimation, and Assimilation Systems (MIT-MSEAS) primitive equation ocean modeling system [Haley and Lermusiaux, 2010;Haley Jr et al., 2015] to compute ocean surface velocity forecasts in the Martha's Vineyard coastal region during August 2018. The modeling system provided forecasts of the ocean state variable fields (three-dimensional velocity, temperature, salinity, and sea-surface height) every hour, with a spatial resolution of 200 m. More details about the model forecast initialization, tidal forcing, atmospheric flux forcing, and CTD data assimilation can be found in [Serra et al., 2020;Filippi et al., 2020]. The deterministic two-way nesting ocean forecast initialized from the estimated ocean state conditions at a particular time is referred to as the central forecast, and ensemble forecasts were initialized using Error Subspace Statistical Estimation procedures [Lermusiaux, 2002]. The forecasts within the ensemble were initialized from perturbed initial conditions of all state variables and forced by perturbed tidal forcing, atmospheric forcing fluxes and lateral boundary conditions. These perturbations were created to represent the expected uncertainties in each of these quantities. Finally, parameter uncertainties (bottom drag, mixing coefficients, etc.) were also modeled by perturbing the values of parameters for each forecast.
A total of 71 forecasts were considered for the present study. Fluid particle trajectories confined to the surface were analyzed over a 6 h-time-window, between t 0 = 16:00 and t f = 22:00 UTC on August 7, 2018. Such a short timing is critical in search-and-rescue operations, as after six hours the likelihood of rescuing people alive drops significantly [Serra et al., 2020]. The forecast velocity fields used for trajectory integration were generated the night before the experiment. Synthetic trajectories were computed using the web-based gateway Trajectory Reconstruction and Analysis for Coherent Structure Evaluation Shadden, 2019, 2020] Figure 9(a) presents the initial particle distribution, superposed with the velocity field for the central forecast. Trajectories are integrated using an adaptive 7 th -order Runge-Kutta-Fehlberg method and bicubic spline spatial interpolation, with free-slip boundary conditions applied near land. Particle positions are output every 5 minutes. Figure 9(b) presents the final positions of the particles for the central forecast. Darker regions correspond to particles collecting, which is mostly observed along the coast. The model velocity field captures the reversal of the tide, as can be observed from the flipping in the average flow direction between t 0 in Figure 9(a) and t f in Figure 9(b). Trajectories obtained in each of the 71 forecasts are presented in Figure 9(c) in contrast with the central forecast, for particles initialized at distinct positions, to demonstrate the degree of trajectory variability between forecasts.
A similar study to the one in Section 3 was performed for the central forecast, to determine the spectral clustering sensitivity to method free-parameters. We select the following method free-parameters: similarity radius σ = 1 km, fuzziness m = 2 and number of clusters K = 4, which are used for all R = 71 forecasts. While it is possible to break the domain into fewer clusters, four clusters minimize the space-averaged standard deviation of the results. Because the graph is fully connected with this choice of σ, λ 1 = 0 and the components of q 1 are constant [von Luxburg, 2007]. Therefore, the clustering (in the absence of a chaotic sea) into K = 4 clusters is performed using the information from eigenvectors q 2 , q 3 and q 4 only. The membership probabilities for the central forecast are presented in Figure 10(a) for trajectories with p i,k ≥ 0.5. The domain is partitioned into 4 quadrants of similar size, and gaps between clusters correspond to particles with membership probabilities lower than 0.5. Different forecasts are used to compute the means p i,k and sample standard deviations S i,k for k ∈ {1, . . . , 4}. The superimposed mean p i,k and standard deviation S i,k are presented in Figure 10(b,c). The parameterization used for the model produces only a modest variation in the trajectory outcomes over six hours as demonstrated in Figure 9(c). Regions of highest uncertainty for this system correspond to identifying the edges of the clusters accurately, but this level of uncertainty is significantly lower than those observed in the Bickley jet example. The most pronounced uncertainty regions, in Figure 10(c), correspond to the boundary between clusters 1 and 4, followed by that between clusters 1 and 2.

Drifter Data and Forecast Cluster Dynamics
The experiment targeting predicted coherent structures consisted of surface drifter releases that took place on August 7, 2018, in the vicinity of Nomans Land [Filippi et al., 2020]. The CODE drifters used in the experiment have technical specifications listed in [Serra et al., 2020;Filippi et al., 2020]. Drifters of the same design are routinely used by the U.S. Coast Guard in search-and-rescue operations, as well as in previous field experiments in the coastal region near Martha's Vineyard [Rypina et al., 2014[Rypina et al., , 2016. The drifters were equipped with GPS transmitters that provided positioning fixes every 5 min, with an accuracy of a few meters. An elliptical route around Nomans Land was used for the drifter deployment, employing two WHOI vessels to minimize ship time, so that all drifters were in water by the start of the interval of analysis. Eighteen drifters were deployed in the water around the predicted locations of the targeted coherent structures. The position of the drifters at the starting time of our analysis, t 0 = 16:00 UTC, is presented in Figure 11(a). Figure 11(a-d) presents the mean cluster trajectories for each of the 4 clusters, on which we superpose the drifter positions from [Filippi et al., 2020]. To plot the evolution of p i,k , the entire domain is first split into 175 × 175 bins. Then, at each time instance, an average membership probability for each bin, over the forecasts, is calculated. This average is based on particles inside each bin at time t, and weighted by their membership probabilities p i,k . The overall system dynamics are similar to what was observed for the central forecast in Figure 9(a,b), with the clusters highlighting groups of different behavior. Using the labels from Figure 10(a), while cluster 1 (purple) travels a longer distance to the northeast, ending north of Martha's Vineyard, cluster 2 (blue) moves a shorter distance northward, while clusters 3 and 4 (red and yellow) are less dynamic and remain mostly to the east of Nomans Land.
Drifters are marker-coded according to the mean membership probabilities p i,k of their spatial position at t 0 . Triangles correspond to cluster 1, squares to cluster 2, circles to cluster 3, and no drifters were initialized in cluster 4. Drifters initialized in higher uncertainty regions (see Figure 10(c)) are plotted as crosses, and correspond to regions where S i,k > 0.1, or p i,k < 0.7. The results demonstrate that the drifters predominately remain inside of the forecast model clusters during the first four hours of the time interval, and exceptions to this behavior are associated with higher uncertainty. Consider, for example, the three drifters represented as crosses, initially west of Nomans Land. Two of them move along the northern coast of the island, with one of them beaching and the other ending on cluster 2. The third drifter headed south first, then east, also toward cluster 2. This fact highlights the value of the uncertainty quantification analysis to understand different dynamical behaviors in the flow which are not captured by the central forecast alone. All drifters were eventually advected eastward after 20:00. Some discrepancies between the clustering predicted behavior and the drifter trajectories were observed. Wind gusts that occurred between 16:00 and 20:00 significantly affected the drifter trajectories. These gusts had not been predicted by the forecasts from the National Centers for Environmental Prediction used by the MIT-MSEAS model for the atmospheric forcing [Filippi et al., 2020]. The study was therefore limited by differences between observed flows and model flows, and the coherent structure analysis accuracy is limited to the accuracy of the velocity field used for processing. Nonetheless, drifters released in the clusters with high membership probabilities tend to behave similarly to the clusters, even if they do not precisely match the predicted trajectory behavior. This fact highlights the coherent structure robustness, even in uncertain conditions. The clustering analysis partitions the domain into robust regions where a drifter released in the region will remain there or nearby over the period of analysis. The uncertainty quantification analysis helped identifying the key structures delimiting regions with different transport behaviors, further showing and expanding the applicability of trajectory clustering for studying oceanic flows, despite model imperfections. Figure 11: Time evolution of the average clusters in a binned domain. The 18 drifters are marker-coded depending on the cluster in which they started. Crosses represent the 4 drifters initialized at locations of higher uncertainty, with S i,k > 0.1. The times correspond to (a) t 0 = 16:00, (b) t = 18:00, (c) t = 20:00, and (d) t f = 22:00 UTC. Triangles are initialized on the purple cluster (3 drifters), squares on the blue cluster (5 drifters) and circles on the orange cluster (6 drifters). Corresponding colorbars presented in Figure 10

Conclusions
Ensemble statistics of the trajectory clustering results provide a partitioning of the fluid domain that may provide critical information in emergency response situations, such as search-and-rescue operations, when operational decisions about optimal resource allocation need to be made quickly, accurately, and account for model uncertainties. We presented a modified version of the spectral clustering method with soft membership probabilities, and applied it to fluid particle trajectories to identify coherent structures first in an analytic flow model, then in forecast simulations provided by a coastal ocean model. Uncertainty quantification was applied to assess both the result sensitivity to the clustering method free-parameters and the cluster variability with unknown parameters of the model data. The method sensitivity study, performed on the analytic quasi-periodic Bickley jet system, identified the similarity radius as the free-parameter to which the clusters are most sensitive. To mimic model uncertainty, the Bickley jet parameters were varied to perform a model sensitivity study that highlights the robustness of vortex cores compared to the more uncertain vortex perimeters.
Finally, the method was applied to an ocean ensemble forecast of the coastal region of Martha's Vineyard, and the clustering results were compared to drifter trajectories from a drifter release experiment targeting predicted coherent structures. The forecast clusters from the ensemble analysis provided a good baseline for the drifter behavior, as drifters deployed within each cluster performed similar motions to their corresponding clusters. Ocean transport predictions are challenging due to the complexity of the underlying dynamics governing the flow, and while the Lagrangian approach ultimately depends on the accuracy of the available velocity fields and the quality of the model data, the results presented here demonstrate that the identified clusters are robust to uncertainties in the model and able to predict the main elements of the organization of flow transport. The coupling of the clustering approach to uncertainty quantification can provide a more complete and informative description of flow transport and areas of higher and lower uncertainty within different clusters. Despite not having been the case for the drifter release presented here, a clustering forecast analysis could be used in planning a drifter deployment for future experiments.
Further refinement of the trajectory clustering method is highly desirable, in particular aiming to reduce parameter sensitivity. The uncertainty quantification with respect to method free-parameters could be used to select parameters that minimize cluster variability, in order to identify clusters that are physical structures, and not a byproduct of the system and parameters chosen. On a more fundamental level, while the method presented here provides robust clusters, it may be possible to improve the method by incorporating fundamental changes to the similarity measure, rather than addressing the sensitivity to freeparameters only. To quantify trajectory similarity, one could not only consider the similarity between the particle spatial coordinates in time, but also that of their velocity vectors, and propose a hybrid notion of similarity. Finally, applying the method to other oceanic forecasts and using the forecast clustering results to plan and execute drifter release experiments can be a promising path to more effective experiments, by increasing the likelihood of released drifters capturing targeted coherent structures and ocean transport barriers.

Funding
This research was funded by National Science Foundation Division of Atmospheric and Geospace Sciences, award number 1520825.

A Similarity Matrix Sparsification
As mentioned in Section 2.1, the use of the Gaussian similarity measure w ij = exp −r 2 ij /2σ 2 (9) in the similarity matrix W greatly reduces the impact of the sparsification step compared to the l x /r ij measure used in Hadjighasem et al. [2016]. This result is due to w ij in (9) approaching zero faster for large r ij . It is mentioned in Section 3.1, however, that sparsifying the matrix is worthwhile to reduce computational costs and storage, and we therefore sparsify entries of W that satisfy w ij < exp(−4 2 /2), corresponding to r ij > 4σ. This sparsification rule is used in Sections 3.2, 3.3, 3.4 and 4.1, under the assumption that such sparsification has negligible impact on the clustering membership probability results.
Here, we present a study for the Bickley jet system with model parameters as in Section 3 and method free-parameters σ/l x = 0.020, m = 2, and K = 7, where the clustering is performed using different levels of sparsification. We vary the sparsification levels by setting w ij = 0 for r ij /σ > α, with α ∈ {0.5, 1, 2, 4, 8} and α = ∞ representing the non-sparsified result matching Figure 1(a). Figure 12(a) presents a histogram with the distribution of r ij values. The vast majority of these values represent the time-averaged distance between particles that describe dissimilar trajectories, and therefore correspond to graph edges of negligible weights. Figure 12(b) presents the weight dependence on r ij /σ, and therefore the maximum magnitude of matrix entries that are dropped when W is sparsified for a choice of α. Figure 12: (a) Histogram of the time-averaged distance values, made dimensionless by σ = 0.020/l x , with a bin width of 0.25. (b) Weight dependence on r ij /σ for the Gaussian measure (9). The w ij values for r ij /σ = 0.5, 1, 2, 4 and 8 are shown in blue.
The impact of the sparsification parameter α on the number of retained (nonzero) entries in the similarity matrix is shown in Table 1. For all finite choices of α presented here, the percent sparsification of W is greater than 95%. The cluster membership probabilities obtained with the different choices of α are presented in Figure 13. We notice that sparsifying with α = 0.5, 1 or 2 clearly has an effect on the resulting membership probabilities, as clusters shrink as α is reduced. For α ≥ 4, however, the probabilities no longer depend on α, and no difference is observed between the method output for α = 4 or the non-sparsified case (α = ∞). Sparsifying with α = 4 also reduces the number of nonzero entries in W by about 60 times, and therefore facilitates storage and reduces the computational time when solving the generalized eigenproblem. Note that this result is valid for the Bickley jet parameterized as in Section 3, but the impact of sparsification may vary for an arbitrary flow.