Lagrangian Data Assimilation for Improving Model Estimates of Velocity Fields and Residual Currents in a Tidal Estuary

Abstract: Numerical models are associated with uncertainties that can be reduced through data assimilation (DA). Lower costs have driven a recent tendency to use Lagrangian instruments such as drifters and floats to obtain information about water bodies. However, difficulties emerge in their assimilation, since Lagrangian data are set out in a moving frame of reference and are not compatible with the fixed grid locations used in models to predict flow variables. We applied a pseudo-Lagrangian approach using OpenDA, an open-source DA tool, to assimilate Lagrangian drifter data into an estuarine hydrodynamic model. Despite inherent challenges with using drifter datasets, the work showed that low-cost, low-resolution drifters can provide a relatively higher improvement over the Eulerian dataset due to the larger area coverage of the drifters. We showed that the assimilation of Lagrangian data obtained from GPS-tracked drifters in a tidal channel for a few hours can significantly improve modelled velocity fields (up to 30% herein). A 40% improvement in residual current direction was obtained when assimilating both Lagrangian and Eulerian data. We conclude that the best results are achieved when both Lagrangian and Eulerian datasets are assimilated into the hydrodynamic model.


Introduction
A considerable proportion of estuaries have intermittently open connections to the ocean, presenting the typical characteristics of Intermittently Closed and Open Lakes and Lagoons (ICOLLs) [1]. Natural and anthropogenic hazards such as climate change, growing human populations and increasing urbanisation produce many pressures on estuaries and have a negative impact on human assets and livelihoods [2,3].
Consistent with effectively managing estuarine systems, monitoring methods have received attention due to advances in the realism of numerical models, which facilitates high-resolution simulations of such systems. This monitoring requires two types of information: (1) a numerical model that describes the hydrodynamics of the system and (2) measurements used for model simulation that incorporates a more comprehensive mechanism (i.e., a data assimilation (DA) scheme) to improve model estimates. In response to decision makers' requirements to identify the extent of estuarine problems, reliably measured data are essential to understand system characteristics via numerical modelling. addition to applicability for non-linear systems [16,19,20]. Using the EnKF algorithm, we can assess the effects of several elements, including the number of ensembles, frequency of assimilation, and observation density (number of drifters), to obtain an effective DA system [21].
The two main factors that constrain the application of Lagrangian DA in estuarine dynamics studies are the programming effort required and the underestimation of the value of such assimilation. Herein, we present a flexible and easy-to-implement framework in which we use OpenDA, an open-source DA software package, for Lagrangian data assimilation. OpenDA has been widely used for DA and calibration purposes in lakes, rivers, and estuaries; however, cost-effective GPS-tracked Lagrangian drifters have not previously been used in an estuary.
Sediment transport processes in tidal channels usually result in erosion and deposition [22]. Determination of water circulation patterns in estuaries is a key element to support the sustainable management of these systems [23]. Tidal systems with asymmetric flow properties induce residual currents that cause net flow migration in either the ebb or flood direction [24]. It is also evident that, in tidal systems, residual currents are dominant in one direction; opposite residual currents can occur locally and change the direction of flow [25]. Residual currents control the exchange of sediments, including adsorbed nutrients and contaminants with the adjacent coastal zones and consequently impact geomorphic processes (e.g., stream meandering, bank erosion) and the overall health of estuarine ecosystems. The direction of residual currents (non-tidal currents) is important in sediment transport and bank erosion studies in estuaries because residual circulation and flood/ebb tidal asymmetry are the major mechanisms controlling the transport of fine suspended sediment [26]. In Currimundi Lake, with an asymmetric behaviour, in addition to the longshore transport of sand resulting in the migration of the inlet channel, the net flow direction has a pivotal impact on sand redirection causing bank erosion and entrance channel migration. Thus, tidal hydrodynamics and entrance behaviour studies of the lake play a key role in the effective management of the problems that this estuary encounters.
In this paper, we focused on the application of an EnKF to examine the reliability of Lagrangian DA performance using cost-effective drifters to improve velocity and direction of residual current estimates in Currimundi Lake. Our aim was to: (i) investigate the performance of Lagrangian DA, for improving the accuracy of model predictions; (ii) reduce the programming effort required for assimilation of Lagrangian data into hydrodynamic models; (iii) examine the effect of Eulerian and Eulerian-Lagrangian assimilation on DA performance; and (iv) investigate the spatial variation of residual currents in the domain through identifying five model scenarios increasing the modelling time window from 21 h to 8 days.
Delft3D flexible mesh in 2D mode was used for hydrodynamic simulations for five different time windows (i.e., 21 h, one day, two days, four days, and eight days). We then calculated Eulerian velocities from Lagrangian positions using a pseudo-Lagrangian approach and assimilated Lagrangian velocities using the EnKF method, embodied in OpenDA. In the next step, we conducted an extensive study to quantify the efficiency of EnKF-Lagrangian data assimilation as a function of the number of ensembles, the frequency of assimilation, and the number of drifters, taking into account the uncertainties associated with the boundary forcing of the model and observations. The performance of the Lagrangian assimilation was also compared with Eulerian data assimilation (i.e., fixed-point velocities), along with assimilating both Lagrangian and Eulerian velocities. To examine the effect of DA on the residual velocity direction in the lake, we quantified the variation of residual velocity direction in a situation when the model time window increased while a constant assimilation window was adopted.
The paper is organised as follows: in Section 2, we describe the materials and methods; in Section 3, we provide a brief introduction to the DA environment (OpenDA) and its EnKF module and DA set up, along with describing a post-processing method to remove fluctuations from drifter data. The DA experiments and monitoring network used to investigate the DA performance are presented in Section 4, followed by results and discussion in Section 5. Finally, we present a detailed summary of the study and the conclusions from the simulation results in Section 6.

Study Area
The field study location was the main channel of Currimundi Lake. Currimundi Lake (longitude 153°8′10″ E, latitude 26°45′40″ S) is a micro-tidal estuary located in south-east Queensland, with a mixed, predominantly semi-diurnal tidal pattern and a spring tidal range limited to 0.8 m [27]. The depth of the mid-channel varies between 3 and 5 m and the width varies from 70 to 300 m between upstream and downstream (Figure 1). Lake Kawana discharges freshwater into the Currimundi Lake system through a weir (0.65 m above the AHD) located 3.6 km from the channel mouth. When the inlet is open, the average discharge rate is 80 ML/day to maintain the water level upstream. The main channel of Currimundi Lake is connected and generally open to the ocean whilst the main tributaries and man-made water bodies are located upstream (Figure 1).
(a) Aerial view of Currimundi Lake catchment including Lake Kawana and the weir separating Currimundi Lake and Lake Kawana (Maps created using ArcGIS software (Esri, 2020)).

Field Experiment Descriptions and Instrumentation
The field experiment covered 2 km of a relatively straight channel reach downstream from the pontoon (Figure 1) for a 21 h period during both ebb and flood conditions (27-28 April 2015). Key environmental forcings include wind, tide, and discharge (Table 1). To obtain the flow velocity near the surface, both GPS-tracked floating drifters and a fixed acoustic Doppler velocimeter (ADV) were used. Drifters utilised in this study were low-cost, low-resolution, cylindrical PVC pipes, 4 cm in diameter and 50 cm long, with a similar physical configuration to the high-resolution drifters described in Suara, Wang [9]. The drifters were ballasted to achieve a ~47 cm submerged height. These drifters have an absolute horizontal position accuracy between 2 and 3 m and were sampled at 1 Hz. A fleet of 18 drifters was launched in Currimundi Lake on 28 April 2015 in clusters of four; they floated in the main channel for three hours from 13:00 to 16:00. Four drifters were either lost or experienced logging errors, thus data from 14 drifters were used in this study. The distance travelled by drifters between the bridge (point A) and pontoon (point B) is shown in Figure 1. The ADV was mounted 0.5 m below the water surface at the pontoon (B) and sampled continuously at 50 Hz during the entire experiment period from 19:00 on 27 April to 16:00 on 28 April (Figure 1). Further details about environmental conditions, instrumentation and quality control for this experiment are explained in [28].

Hydrodynamic Model
Hydrodynamic simulations were conducted using Delft3D FM, an open-source hydrodynamic model developed by Deltares, Netherlands. The model was run in depth-averaged 2D mode. It solves the Navier-Stokes equations for an incompressible fluid under the shallow water and Boussinesq assumptions. To ensure that the model output is independent of the grid resolution, a grid independence test was carried out. Five different grids were constructed, with grid sizes ranging from 25 m down to 2 m. The cross-sectional average velocity at the middle of the domain (Point C in Figure 1a) was used to test the mesh convergence. The results show that the average velocity was not sensitive to further refinement beyond a minimum grid size of 5 m, whereas an increase in the minimum grid size caused an increase in the cross-sectional average velocity. The Courant-Friedrichs-Lewy (CFL) condition for this model was CFL ≤ 1. A detailed model description can be accessed through the manual [29].
Modelling at Currimundi Lake used an unstructured grid with a spatial resolution of 5 m following the channel morphology; the time series of discharge obtained from an ADV and the water level obtained from a gauging station were used for upstream and downstream boundaries, respectively. The downstream boundary was forced with water level data obtained from the tidal gauge within the main channel and sampled at an interval of five minutes. Therefore, the diurnal, semi-diurnal and higher frequency tidal constituents are included in the boundary. The simulation time step was 1 min over the simulation period of 21 h plus a 21 h spin-up time for initial condition propagation. A spatially constant bottom friction was defined by applying Manning's coefficient n = 0.025 obtained through model calibration.
The model calibration process is a pivotal prerequisite to the DA process because DA algorithms are mainly bias-blind [30] and erroneous parameters can lead to significant uncertainties in the DA process. The calibration of the hydrodynamic model was performed with two sets of data: Eulerian calibration using 70% of the velocity measurements collected from a fixed instrument (ADV) and Lagrangian calibration using 70% of the velocities from GPS-tracked drifters. Both calibrations used Root Mean Square Error (RMSE) and correlation (R²) between observed and simulated velocities and aimed to adjust the bed roughness through Manning's coefficient (n). The optimum Manning coefficient (n = 0.025) attained through Lagrangian calibration was equivalent to that obtained via Eulerian calibration. Thus, Lagrangian data have the potential to be used in hydrodynamic model calibration in such environments. After the calibration, modelled and observed velocities correlated very well (R² = 0.94) and the root mean square error was reasonably low (RMSE = 0.019 m/s), suggesting that the calibration successfully reduced systematic model errors. The calibrated model was then validated by comparing simulated velocities with the remaining 30% of the measurements for both Eulerian and Lagrangian velocities. The RMSE and R² values obtained through this validation reflected a successful calibration of the model. The calibration and validation of the hydrodynamic model are presented in full detail in [28].
To investigate the effect of DA on the direction of residual currents in the main channel, we used our calibrated model to simulate the dynamics of Currimundi Lake for longer durations of one day, two days, four days and eight days. Upstream and downstream boundary forcings (i.e., discharge and water level) used in the hydrodynamic model for longer temporal simulations were obtained by developing a stage-discharge relationship and gauging station data measurements. Details of the tidal discharge rating curve used to predict flow discharge as well as calibration and validation of the model can be found in [31]. We performed the Lagrangian assimilation over a period of three hours for our four model scenarios with different boundary forcing temporal variations (i.e., one day, two days, four days and eight days). We then calculated the residual currents by averaging over a complete tidal cycle (12 h 25 min), considering the predominantly semi-diurnal tidal pattern in Currimundi Lake. The impact of DA on the direction of residual currents for each of the model scenarios was based on Root Mean Square Deviation (RMSD). This was then compared to identify the best DA performance when the model period increased from one to eight days whilst the assimilation period remained fixed at three hours.
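The tidal-cycle averaging described above can be illustrated with a minimal Python sketch. The function name `residual_current`, the 1 min sample spacing and the synthetic velocity series are assumptions made for illustration only; this is not the processing code used in the study, and the direction returned is a mathematical angle from the +x axis rather than any particular compass convention.

```python
import math

TIDAL_CYCLE_MIN = 12 * 60 + 25  # 745 min, the semi-diurnal cycle quoted in the text


def residual_current(vx, vy, dt_min=1.0):
    """Average velocity components over one complete tidal cycle.

    vx, vy : sequences of velocity components (m/s) sampled every dt_min minutes.
    Returns (residual_vx, residual_vy, direction_deg); the direction is the
    angle of the residual vector measured counter-clockwise from the +x axis.
    """
    n = int(round(TIDAL_CYCLE_MIN / dt_min))
    if len(vx) < n or len(vy) < n:
        raise ValueError("series shorter than one tidal cycle")
    rx = sum(vx[:n]) / n
    ry = sum(vy[:n]) / n
    return rx, ry, math.degrees(math.atan2(ry, rx))


# Synthetic example (hypothetical values): a pure tidal oscillation plus a
# steady 0.02 m/s drift in the x direction.
n = TIDAL_CYCLE_MIN
vx = [0.02 + 0.3 * math.sin(2 * math.pi * k / n) for k in range(n)]
vy = [0.1 * math.sin(2 * math.pi * k / n) for k in range(n)]
rx, ry, theta = residual_current(vx, vy)
```

Because the oscillating part averages to zero over a full cycle, only the steady drift survives, which is the sense in which the residual current is the "non-tidal" part of the flow.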

Data Assimilation
To enable flexibility and reasonable computational cost, we utilised a variant of the Kalman filter, namely an Ensemble Kalman Filter (EnKF). The main procedure was the recursive computation of the means and covariance matrices of the ensemble system state. Since its first introduction by Evensen [32], the EnKF has been widely used in different geoscience disciplines such as oceanic [33], riverine [34], and estuarine [35] studies. The full theoretical formulation for the EnKF framework is presented in Evensen [36].

Data Assimilation Platform
OpenDA is an open-source tool, enabling the coupling of arbitrary numerical models and observations through assimilation algorithms. It can be utilised to minimise the programming effort by enabling flexible software implementation among users (http://www.openda.org (accessed on 1 December 2020)). The OpenDA framework has been validated by performing both real and synthetic DA, which supports the capability of this relatively new framework launched in 2010 (OpenDA). The three main data assimilation components in OpenDA are: (i) a stochastic observer for specifying observation data used in the application and information about data uncertainty; (ii) an algorithm that itemizes the input parameters required by DA; and (iii) a stochastic model in which users identify model-related information. The complete source code is accessible via GitHub (https://github.com/OpenDA-Association/OpenDA (accessed on 1 December 2020)).
This environment supports assimilation of the available observations through the use of filtering techniques such as EnKF and particle filters. These algorithms have been successfully applied in different areas like DA of currents and salinity profiles [37], flood forecasting [38] and more recently in DA for accurate estimation of sea level anomalies (SLA) and residual currents [39,40]. These diverse applications demonstrate the efficacy of OpenDA as a generic toolbox for DA.

Data Assimilation Algorithm
At each time step, the flow model (Delft3D FM) receives real velocity data from the drifters and the shallow water model equations produce a collection of states (ensembles) representing the evolution of the processed inputs. The EnKF compares these real data to the estimated outputs to provide the best estimation of the flow state in the field.
Below, we explain the fundamentals behind the EnKF algorithm in Equations (1)-(7). The true model state based on the physical state of the lake at time t is defined as $S_t$ (in our case, velocity for the entire model grid), $M$ is the nonlinear lake system operator, $\omega$ is the noise, and $U$ is the hydrodynamic forcing vector (here discharge and water level) for time t:

$$S_t = M(S_{t-1}, U_t, \omega_t) \quad (1)$$

The state-space vector, denoted $\hat{S}$, is an approximation by the hydrodynamic model Delft3D-FM of the true state $S$. The noise $\omega$ is added to the forcing term. The forecast state is specified by $\hat{S}^f$ and the analysis state acquired after DA is $\hat{S}^a$:

$$\hat{S}^f_t = M(\hat{S}^a_{t-1}, U_t, \omega_t) \quad (2)$$

Appl. Sci. 2021, 11, 11006

The observation (y) equation is described as:

$$y_t = H S_t + \xi_t \quad (3)$$

where $H$ is an operator connecting the state of the system to the observation and $\xi$ is measurement noise. The observation prediction equation is:

$$\hat{y}_t = H \hat{S}^f_t \quad (4)$$

and the state estimate of the system obtained from DA, which will be used in the next sequence as an initial condition, is:

$$\hat{S}^a_t = \hat{S}^f_t + K_t (y_t - \hat{y}_t) \quad (5)$$

$K$ is a weighting factor called the Kalman gain and is considered as a balance between model and observation uncertainties [41]. $K$ tends to decrease the error covariance of the state estimate during the analysis time:

$$K_t = P^f_t H^T (H P^f_t H^T + R_t)^{-1} \quad (6)$$

where $R_t$ and $P^f_t$ are the measurement error covariance matrix and a priori state error covariance matrix, respectively. The EnKF is able to compute a time-varying error covariance based on the dynamics of the system. For an ensemble of forecasts (j = 1, . . . , N), each prone to some level of uncertainty in model processes, forcing, or initial conditions, $P$ is:

$$P^f_t = \frac{1}{N-1} \sum_{j=1}^{N} (\hat{S}^f_{t,j} - \bar{S}^f_t)(\hat{S}^f_{t,j} - \bar{S}^f_t)^T \quad (7)$$

where $P^f_t$ is the forecast error covariance and $\bar{S}^f_t$ is the ensemble mean of the forecast states.
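To make the analysis step concrete, the following is a minimal NumPy sketch of a stochastic (perturbed-observation) EnKF update with a linear observation operator. It is an illustrative assumption about one common EnKF variant, not a description of how OpenDA implements the filter internally; the function name `enkf_analysis`, the three-cell state and all numerical values are hypothetical.

```python
import numpy as np


def enkf_analysis(ensemble_f, y, H, R, rng):
    """One stochastic EnKF analysis step (perturbed observations).

    ensemble_f : (n_state, N) forecast ensemble, one column per member
    y          : (n_obs,) observation vector
    H          : (n_obs, n_state) linear observation operator (assumed linear here)
    R          : (n_obs, n_obs) observation error covariance
    rng        : numpy.random.Generator used to perturb the observations
    """
    n_members = ensemble_f.shape[1]
    mean_f = ensemble_f.mean(axis=1, keepdims=True)
    anomalies = ensemble_f - mean_f
    P_f = anomalies @ anomalies.T / (n_members - 1)   # sample forecast covariance
    S = H @ P_f @ H.T + R                             # innovation covariance
    K = P_f @ H.T @ np.linalg.inv(S)                  # Kalman gain
    # Each member is updated against its own noisy copy of the observation.
    noise = rng.multivariate_normal(np.zeros(len(y)), R, size=n_members).T
    y_pert = y[:, None] + noise
    return ensemble_f + K @ (y_pert - H @ ensemble_f)


# Hypothetical example: velocity (m/s) in three neighbouring grid cells,
# 25 members, one observation of 0.30 m/s in the middle cell.
rng = np.random.default_rng(42)
N = 25
base = rng.normal(0.20, 0.05, size=(1, N))            # shared error -> correlated cells
ensemble_f = base + rng.normal(0.0, 0.01, size=(3, N))
H = np.array([[0.0, 1.0, 0.0]])
R = np.array([[0.01 ** 2]])
y = np.array([0.30])
ensemble_a = enkf_analysis(ensemble_f, y, H, R, rng)
```

In this toy case the ensemble mean in the observed cell moves from about 0.20 m/s towards the observed 0.30 m/s, and the correlated neighbouring cells are corrected as well, which is how a few point observations can influence the wider velocity field.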

Data Assimilation Setup
The purpose of DA is essentially to characterize uncertainties [42]. The hydrodynamics of a system is modelled based on deterministic equations and the model initial conditions. However, boundary conditions are associated with uncertainties that adversely affect the performance of any calibrated model [19]. To address this issue, we included stochasticity and added noise to both velocity components as well as the water level in boundary conditions. Observations (i.e., velocity) to be assimilated were perturbed at observation locations to represent the measurement errors. Ensemble forecasting produces a number of model realizations from stochastic perturbations applied to model boundary forcings and observations. These model realizations are then propagated and updated using available observational information, which is required to reduce model uncertainty and improve its accuracy.
We introduced stochasticity to the deterministic model by using the OpenDA noise model. To fully capture the model uncertainty, an ensemble was created with a reasonably wide spread [43]; to this end, error statistics (Table 2) were designated to sufficiently spread the ensemble members.
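The idea of spreading an ensemble by perturbing a boundary forcing can be sketched as follows. The sketch assumes time-correlated AR(1) Gaussian noise; this choice, the function name `perturbed_boundary_ensemble`, and the standard deviation and correlation values are assumptions for illustration and are not the error statistics of Table 2 or the actual OpenDA noise-model configuration.

```python
import random


def perturbed_boundary_ensemble(series, n_members, std, corr, seed=0):
    """Return n_members noisy copies of a boundary time series.

    Each copy receives AR(1) noise e[k+1] = corr * e[k] + w[k], where the
    innovation std is scaled so that the noise keeps a stationary std of `std`.
    """
    rng = random.Random(seed)
    innov_std = std * (1.0 - corr ** 2) ** 0.5
    members = []
    for _ in range(n_members):
        e = rng.gauss(0.0, std)          # initial noise value for this member
        member = []
        for value in series:
            member.append(value + e)
            e = corr * e + rng.gauss(0.0, innov_std)
        members.append(member)
    return members


# Hypothetical water-level boundary (m) at 5 min intervals over one hour.
levels = [0.10, 0.14, 0.18, 0.21, 0.24, 0.26, 0.27, 0.27, 0.26, 0.24, 0.21, 0.18]
ens = perturbed_boundary_ensemble(levels, n_members=25, std=0.02, corr=0.9)
```

A larger `std` widens the ensemble spread, which in turn increases the forecast covariance and the weight given to observations in the analysis.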

Pseudo-Lagrangian Assimilation
A common problem when assimilating Lagrangian data is that there is a nonlinear relationship between the observed variables (i.e., the positions r) and the model variables to be modified (i.e., the Eulerian velocities u). This problem can be circumvented by approximating u as the finite difference of successive positions, u ≈ ∆r/∆t (∆r = position displacement and ∆t = sampling period). This method, known as pseudo-Lagrangian assimilation, works well when the sampling period is much smaller than the Lagrangian correlation time scale [16]. For this study, the drifters were sampled at a frequency of 1 Hz resulting in a sampling period that is at least one order of magnitude less than the Lagrangian correlation scale (50 s) in a channel with similar physical characteristics as Currimundi Lake [9].
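The finite-difference approximation u ≈ ∆r/∆t can be sketched in a few lines of Python. The function name `pseudo_lagrangian_velocity` and the sample track are hypothetical, and the sketch assumes that GPS fixes have already been projected to a metric (x, y) coordinate system.

```python
def pseudo_lagrangian_velocity(track):
    """Approximate velocities from successive drifter fixes, u ≈ Δr/Δt.

    track : list of (t_seconds, x_m, y_m) tuples in a projected metric frame.
    Returns a list of (t_mid, x_mid, y_mid, u, v), one per pair of fixes,
    with the velocity assigned to the midpoint of each displacement.
    """
    out = []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order fixes
        out.append((0.5 * (t0 + t1), 0.5 * (x0 + x1), 0.5 * (y0 + y1),
                    (x1 - x0) / dt, (y1 - y0) / dt))
    return out


# Hypothetical 1 Hz track: three fixes, one second apart.
track = [(0.0, 0.0, 0.0), (1.0, 0.25, 0.0), (2.0, 0.5, 0.1)]
vel = pseudo_lagrangian_velocity(track)
```

With position errors of a few metres and displacements of well under a metre per second, raw 1 Hz differences of real GPS data are dominated by noise, which is why the velocities are filtered before assimilation.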

Trajectory Filtering
In the case of experimental flows (real data), the estimated velocity (u ≈ ∆r/∆t) cannot be used directly in the data assimilation system. To resolve this issue, a statistical description of individual trajectories in the form of a space-time averaging filter is applied. $V_{averaged}(x, t)$ is defined as the mean velocity observed at time t and location x in a space-time window W = Wt × Ws, where Wt is a temporal window around t and Ws is the spatial neighbourhood of x(t).
In Figure 2, the spatial window (red box) is indicated, representing the location of Lagrangian drifters from time t1 to time t2. As drifters move with the water, some of them enter and depart from this spatial window during the corresponding time of the window (Wt). We "average out" the velocities of all the drifters travelling through the specific spatial window during a given time interval ($V_{averaged}(x, t)$, green arrow in Figure 2). This filtered velocity represents the local flow velocity at a specified time and is subsequently used in the flow computations.
Depth-averaged velocity is obtained from the drifter surface velocities using a correction factor of 0.85 following the logarithmic law of the wall for velocity profile measurements in open channels [44].
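A minimal sketch of the space-time averaging filter and the surface-to-depth-averaged correction is given below. The square spatial box, the window half-widths, the function name `window_average` and the sample values are assumptions for illustration; only the 0.85 factor comes from the text.

```python
SURFACE_TO_DEPTH_AVG = 0.85  # correction factor quoted in the text


def window_average(samples, t_c, x_c, y_c, half_wt, half_ws):
    """Mean velocity of all samples inside a space-time window W = Wt × Ws.

    samples : iterable of (t, x, y, u, v) pooled from any number of drifters
    The window keeps samples with |t - t_c| <= half_wt that also fall inside
    a square box of half-width half_ws centred on (x_c, y_c).
    Returns (u_mean, v_mean, count), or None if the window is empty.
    """
    selected = [(u, v) for t, x, y, u, v in samples
                if abs(t - t_c) <= half_wt
                and abs(x - x_c) <= half_ws
                and abs(y - y_c) <= half_ws]
    if not selected:
        return None
    n = len(selected)
    u_mean = sum(u for u, _ in selected) / n
    v_mean = sum(v for _, v in selected) / n
    return u_mean, v_mean, n


# Hypothetical samples: two fall inside the window, one is too late in time,
# one is too far away in space.
samples = [
    (10.0, 5.0, 5.0, 0.20, 0.00),
    (20.0, 6.0, 4.0, 0.30, 0.10),
    (200.0, 5.0, 5.0, 0.90, 0.90),
    (15.0, 50.0, 5.0, 0.90, 0.90),
]
u_s, v_s, count = window_average(samples, t_c=15.0, x_c=5.0, y_c=5.0,
                                 half_wt=30.0, half_ws=10.0)
# Convert the filtered surface velocity to a depth-averaged estimate.
u_da = SURFACE_TO_DEPTH_AVG * u_s
v_da = SURFACE_TO_DEPTH_AVG * v_s
```

Larger windows suppress more GPS noise but smooth out genuine spatial and temporal variability, so the window size is a trade-off that depends on the flow.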

DA Experiments
In this section, a base experiment (i.e., Base-Test) used as a benchmark is presented first, and the robustness of the Base-Test results is then tested using a sensitivity analysis that varies the DA parameters: the ensemble size, the number of drifters and the sampling period. These parameters are adjusted within an appropriate range to provide guidance for applications in real cases.
A comparison between the performance of the Lagrangian assimilation with respect to the Eulerian assimilation and Lagrangian plus Eulerian assimilation using the "pseudo-Lagrangian" scheme was conducted. Finally, the difference in the residual current direction realised by DA in terms of RMSD is presented.

Experiment Characteristics
Six different experiments were conducted to assess DA performance:
• Free-Run: Model simulation without assimilation;
• Base-Test: The assimilation of the Eulerian velocities obtained from eight Lagrangian drifter data released within the straight section of the channel between points A and B (Figure 1). The assimilation process initiates with a sampling period ∆t = 1 min, which corresponds to the time step of the model, so the assimilation is undertaken at each model time step. This experiment is designed to study the DA performance on the velocity estimates subject to the availability of velocity information during the drifter travel time;
• Ensemble-Test: To assess the impact of the number of ensemble members, five values for ensemble size were analysed: 5, 10, 25, 50 and 100. Various ensemble sizes were selected to gain the optimum size considering a trade-off between computational cost and accuracy;
• Frequency-Test: A series of experiments was performed maintaining the same configuration as in Base-Test, using sampling period ∆t varied as 1, 2, 5 and 10 min;
• Density-Test: To examine the effect of the number of drifters deployed, two experiments, in addition to Base-Test, were performed with four and two drifters released in the straight section of the lake;
• Validation-Test: To validate the DA performance with a different set of data (i.e., six non-assimilated drifters).

Monitoring Configurations
Depending on the availability of velocity data at various locations in the domain of interest, estimates of how the velocity improved using DA varied. By identifying two velocity monitoring configurations, we argue that increasing the density (i.e., number) and the distribution (i.e., spatial coverage) of observations in the channel will lead to improvement of velocity and residual currents results in our domain. To assess DA performance, two monitoring Set-Ups were used in addition to the Base-Test:
Set-Up I: Using one fixed ADV deployed 0.5 m below the surface at the pontoon (Figure 1). This set-up investigated the impact of DA on velocity estimates, assimilating 21 h of velocity data at ten-second intervals.
Set-Up II: A combination of Base-Test and Set-Up I, using both drifters and ADV velocity data simultaneously.
Table 3 summarises the characteristics of the DA experiments and monitoring network used for evaluating the DA performance.

DA Performance Evaluation
The success of the assimilation in the experiments and Set-Ups is evaluated using several measures, which serve as benchmark indicators for comparing the results of the various scenarios employed in this study. The velocity estimates obtained by Free-Run and Base-Test were compared with the drifter measurements in terms of RMSE at every station or location where observations were available.
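Root mean square error (RMSE) is estimated as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(V_{sim,i} - V_{obs,i}\right)^2}$$

where $V_{sim,i}$ and $V_{obs,i}$ are the simulated and observed velocities and n is the number of observations. To quantify the improvement gained by the DA, we defined percentage improvement as:

$$\mathrm{Improvement}\ (\%) = \frac{\mathrm{RMSE}_{Free\text{-}Run} - \mathrm{RMSE}_{DA}}{\mathrm{RMSE}_{Free\text{-}Run}} \times 100$$

The Absolute Accuracy Error was also calculated. Mean Absolute Error (MAE) is calculated as:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|V_{sim,i} - V_{obs,i}\right|$$

Deviations between simulated velocities before and after assimilation (i.e., Free-Run and DA experiments), expressed as the Root Mean Square Deviation (RMSD), were used to compare the performance of DA for both Eulerian and Lagrangian assimilations.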
The velocities in the model state are described in terms of north/east ($V_x$ and $V_y$) components. The results here are presented in terms of the magnitude and direction of the currents because, operationally, these are the parameters of interest. The velocity magnitude and direction (θ) are obtained as:

$$|V| = \sqrt{V_x^2 + V_y^2}, \qquad \theta = \tan^{-1}\left(\frac{V_y}{V_x}\right)$$
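The evaluation measures can be sketched in Python as below. The RMSE and MAE follow their standard definitions; the percentage improvement is assumed here to be the relative RMSE reduction with respect to the Free-Run, and the direction is computed with `atan2` as a mathematical angle from the +x axis, both of which are assumptions rather than confirmed conventions of the study. All function names and sample values are hypothetical.

```python
import math


def rmse(sim, obs):
    """Root mean square error between simulated and observed values."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def mae(sim, obs):
    """Mean absolute error between simulated and observed values."""
    return sum(abs(s - o) for s, o in zip(sim, obs)) / len(obs)


def percent_improvement(rmse_free, rmse_da):
    """Relative RMSE reduction of a DA run with respect to the Free-Run (%)."""
    return 100.0 * (rmse_free - rmse_da) / rmse_free


def magnitude_direction(vx, vy):
    """Speed and direction (degrees from the +x axis) of a velocity vector."""
    return math.hypot(vx, vy), math.degrees(math.atan2(vy, vx))


# Hypothetical velocities (m/s): observations, Free-Run and a DA run.
obs = [0.10, 0.20, 0.30]
free_run = [0.14, 0.24, 0.34]
da_run = [0.13, 0.23, 0.33]
gain = percent_improvement(rmse(free_run, obs), rmse(da_run, obs))
speed, theta = magnitude_direction(0.3, 0.4)
```

Using `atan2` rather than a plain arctangent of the ratio keeps the direction in the correct quadrant when either component changes sign, as happens between ebb and flood.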

Ensemble Size
To employ a statistical sample of the state of the system, the EnKF functions by integrating ensembles of states that are independent of each other; therefore, the model is run N (number of ensemble members) times. The ensemble size is generally identified empirically and is based upon a compromise between a reasonable representation of the state and computational cost [19]. Increasing the size of the ensemble (N) decreases the errors in the Monte Carlo sampling at the rate of $1/\sqrt{N}$ [36]. We determined ensemble size by performing a sensitivity analysis using ensemble sizes of 5, 10, 25, 50 and 100.
With respect to our Base-Test (Section 4.1), the results indicate that the analysis error reduces as the number of ensemble members increases. Figure 3 provides RMSE for both velocity components when the size of the ensemble varies. The most striking improvement in RMSE occurs when we increase N up to 25. Further increases in the ensemble size yielded no significant improvement; hence, we conclude that 25 ensemble members offer the best balance between accuracy and computational cost.
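The $1/\sqrt{N}$ scaling helps explain why returns diminish beyond 25 members. The short sketch below evaluates the relative Monte Carlo sampling error for the tested ensemble sizes; it is a worked illustration of the scaling law only, not a reproduction of the RMSE values in Figure 3, and the names used are hypothetical.

```python
import math

ENSEMBLE_SIZES = [5, 10, 25, 50, 100]  # sizes tested in the sensitivity analysis


def relative_sampling_error(n_members):
    """Monte Carlo sampling error relative to a single member, 1/sqrt(N)."""
    return 1.0 / math.sqrt(n_members)


errors = {n: relative_sampling_error(n) for n in ENSEMBLE_SIZES}
for n, err in errors.items():
    print(f"N = {n:3d}: relative sampling error = {err:.3f}")
```

Going from 25 to 100 members only halves the sampling error (from 0.2 to 0.1) while requiring four times as many model runs, so under this scaling the cost of each further gain grows quickly.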

Assimilation Frequency
An important parameter to consider in DA is the frequency at which data are assimilated into the model. To test the sensitivity of our DA approach to the assimilation frequency, we performed a series of experiments maintaining the same configuration as in the Base-Test with eight drifters but altering the interval (∆t) at which drifter velocities were assimilated into the model. Error estimates in terms of RMSE for both velocity components are presented in Figure 4. For assimilation intervals <2 min, our model remains less sensitive to the specific interval used, likely due to data correlation effects. However, increasing the assimilation interval from 2 to 10 min causes a marked degradation in DA performance. This implies that DA performance is highly responsive to the frequency of data assimilation. Assimilating drifter data every 1 min resulted in ~27% improvement, while assimilating every 10 min produced essentially no improvement in model output. Performing assimilations every 5 min provided an 11% and 13% improvement in $V_x$ and $V_y$, respectively, notably less than the improvement gained in the Base-Test, which used an assimilation interval of 1 min. An assimilation interval of 2 min led to only a slight degradation in DA performance compared to the Base-Test. In fact, increasing the time interval beyond the Lagrangian integral time scale of the lake (here, ~50 s) causes some important spatiotemporal velocity information to be lost, leading to deterioration in the model performance [16]. In general, the results indicate that using a DA time interval close to the Lagrangian integral time scale of the domain leads to better outcomes.

Number of Drifters
Determining the optimal number of drifters to be deployed is an important aspect of our Lagrangian DA assessment. To this end, four experiments in addition to the Base-Test were conducted with 2, 4, 12 and 14 drifters while keeping the other parameters the same as those in the Base-Test (Table 3). Error assessments in terms of RMSE for both velocity components are shown in Figure 5. The largest improvement in RMSE occurs when 14 drifters are used. However, eight drifters is the optimum because it yields most of the improvement with the fewest drifters; when the number of drifters exceeds eight, the error reduces only marginally, by ~2%. The optimum value of eight drifters determined here supports the selection in the Base-Test (Table 3). In all cases, DA is effective even when a limited number of drifters is used (i.e., two drifters), which affirms the value of assimilating minimal Lagrangian drifter data to improve model estimates, possibly for locations poorly covered by observation systems. This indicates that a small number of drifters can result in a significant improvement to model predictions. With more observations, the analysis improves further and the hydrodynamic model outputs are less likely to contain the unrealistic features that can appear when a sparse network of observations is analysed. Although it would be more beneficial to record more data, the optimum number of drifters depends on many factors, including the flow domain being modelled (e.g., eddy-dominant or quiescent (slack) sections of the domain) [21].

Figure 4. Sensitivity of DA performance to frequency of assimilation. Drifter data were assimilated at 1, 2, 5 and 10 min intervals. The Base-Test configuration is applied (ensemble size = 25; number of drifters = 8).

Model Validation
We validated the model using independent data, that is, data that were not assimilated by the system. Specifically, we validated the DA process by comparing the modelled velocity estimates obtained when data from eight drifters were assimilated against non-assimilated observations (i.e., six drifters). Note that the model velocities compared with the measured velocities were extracted along the actual drifter trajectories at 1 min intervals (Δt = 1 min). A scatter plot of observed versus simulated horizontal velocities shows good agreement (R² = 0.90) between velocities derived from the model at drifter locations and velocities of the non-assimilated drifters (Figure 6). This agreement is significantly higher than that obtained between model velocities without assimilation and drifter velocities (R² = 0.56) [28]. The result further confirms reliable DA performance and shows that the integration of Lagrangian data significantly improves the hydrodynamic model prediction.
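The agreement statistic used in this validation can be sketched as follows. The snippet assumes that R² denotes the squared Pearson correlation between observed and simulated velocities at the held-out drifter locations; that interpretation, and the function name `r_squared`, are assumptions for illustration rather than the paper's stated procedure.

```python
def r_squared(observed, simulated):
    """Squared Pearson correlation between observed and simulated values."""
    n = len(observed)
    if n != len(simulated) or n < 2:
        raise ValueError("need two equal-length series with at least 2 samples")
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    # Sums of centred products (covariance and variances up to a common factor).
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    if var_o == 0.0 or var_s == 0.0:
        raise ValueError("a constant series has no defined correlation")
    return (cov * cov) / (var_o * var_s)
```

In such a check, `observed` would hold velocities from drifters that were withheld from assimilation, and `simulated` the model velocities sampled along those drifters' trajectories.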

Improvement of Velocity Estimates at Observed Locations
We tested the improvement that DA brings to the model estimates of velocity at all observed locations. To evaluate the impact of velocity data on the efficiency of assimilation, an absolute error indicator was used. Figure 7a shows the difference between observed and modelled velocity components in the part of the domain covered by drifters, for the x (left panels) and y (right panels) components. The absolute error between observations (i.e., drifter velocities) and modelled velocities derived from the Free-Run (i.e., model without assimilation) is plotted in the top panel, and the absolute error between observations and modelled velocities after assimilation (i.e., Base-Test) is shown in the bottom panel. Clearly, DA enhanced model performance, with most of the error values near zero after assimilation. The Free-Run has a higher error at observed locations, while the assimilation reduces this error significantly, bringing the modelled velocity close to the observed velocity. To better exhibit the general performance of the DA, Figure 7b shows the time series of absolute error for the modelled velocity with and without assimilation (i.e., Base-Test and Free-Run, respectively) with respect to drifter velocity. It is evident that assimilation of drifter velocities successfully reduces the errors in the modelled velocity during the assimilation period.

Appl. Sci. 2021, 11

Impact of DA Observation Types on the Model Performance
Assimilation runs were conducted for Set-Up I and Set-Up II in addition to the Base-Test, which assimilated pseudo-Lagrangian drifter data. Set-Up I assimilated the Eulerian data, while Set-Up II assimilated both the pseudo-Lagrangian dataset used in the Base-Test and the Eulerian dataset used in Set-Up I. DA performance was examined in terms of the differences between model estimates with and without assimilation. The velocity magnitude for the Free-Run simulation is shown in Figure 8a. The RMSD between the Free-Run modelled velocities and the DA modelled velocities in the Base-Test, Set-Up I and Set-Up II was calculated for all grid points. The spatial variation of RMSD for the Base-Test shows that assimilating Lagrangian drifter data remarkably improved the model results, specifically in the sections of the lake that the drifters traversed (Figure 8b). These findings suggest that DA performance is highly dependent on the spatial coverage of the drifters. Large RMSD values were observed in a small area in the vicinity of the downstream boundary, where no observations were available; these are likely due to sparse data.
The spatial variation of RMSD (i.e., the deviation between the Free-Run and Set-Up I) for the assimilated velocity was then calculated. This statistical measure shows that a single sensor at a fixed location can contribute to improvement in the model results through DA (Figure 8c). We found distinct differences between the model before and after assimilation. The difference between the assimilated Lagrangian and Eulerian velocity (Set-Up II) and the Base-Test velocity suggests that including both sets of data (i.e., Lagrangian and Eulerian) concurrently results in a larger RMSD in the domain than when the Base-Test or Set-Up I is examined independently (Figure 8d). The model is strongly affected by DA in all three cases examined, although the impacts are not identical in magnitude or location. RMSE and R² values before and after DA for the Lagrangian, Eulerian, and combined Lagrangian and Eulerian cases indicate significant improvements over the Free-Run simulation (Table 4 and Figure 9).
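The RMSD maps discussed above can be illustrated with a minimal sketch. It assumes that velocity magnitudes from the Free-Run and from a DA run are stored as nested lists indexed by time step and then by grid point, and that the RMSD at each grid point is taken over all time steps; the data layout and the name `rmsd_map` are hypothetical.

```python
import math


def rmsd_map(free_run, assimilated):
    """Per-grid-point RMSD between two runs.

    free_run[t][k] and assimilated[t][k] hold the velocity magnitude at
    time step t and grid point k (assumed layout).
    """
    n_times = len(free_run)
    if n_times == 0 or n_times != len(assimilated):
        raise ValueError("runs must cover the same, non-empty set of time steps")
    n_points = len(free_run[0])
    result = []
    for k in range(n_points):
        # Accumulate squared differences over time at grid point k.
        squared = sum(
            (assimilated[t][k] - free_run[t][k]) ** 2 for t in range(n_times)
        )
        result.append(math.sqrt(squared / n_times))
    return result
```

Grid points where the returned value is close to zero are those that assimilation barely affected, which in the discussion above corresponds to areas away from the drifter tracks and the fixed sensor.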

Spatial Variation of Residual Currents (Direction of Residual Currents)
Tidal hydraulics and entrance dynamics along with bank erosion and water quality are dominant issues with high priorities that need to be addressed for Currimundi Lake. The hydrodynamic and residual transport patterns emerging from tidal motion have important consequences for the transport of sediments, water quality and bank erosion; thus, an accurate estimation of residual currents is very important in this setting.
A previously developed tidal rating curve was used to predict a long time series of discharges from water levels measured at a gauging station. This relationship was validated by investigating the correlation between the predicted and observed discharge, along with comparing model outputs obtained using observed and predicted discharge boundary conditions [31].
The residual currents were derived from tidal velocities during 8-day, 4-day, 2-day and 1-day periods in which full tidal cycles prevailed. These non-tidal currents, driven by wind, are important in estuaries because they are the principal means of transport for dissolved and suspended matter [45,46]. In addition, the lack of consistent current measurements in these areas underscores the need for accurate current estimations. The effect of DA on the direction of residual currents in terms of RMSD is shown in Figure 10. While the assimilation window is kept constant at 3 h, the time window of residual velocity changes. A significant difference between the model without assimilation and Set-Up II was observed for time windows of 1, 2, 4 and 8 days (Figure 10a-d, respectively). The deviation is particularly large for the 1-day window of residual velocity, especially in the shallow, mostly wind-driven sections of the lake (close to the downstream boundary) and in the section that the drifters traversed, as expected from the DA performance analysis. Given that the assimilation period is constant, the deviation between model results without assimilation and Set-Up II decreases as the time window of residual velocity increases, which shows the effect of the ratio of residual velocity period to assimilation period on DA performance (Figure 11).
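To make the residual-current calculation concrete, the sketch below averages the two velocity components over a window that is assumed to span whole tidal cycles and converts the mean vector to a speed and direction. Time-averaging as the definition of the residual, the direction convention (degrees counter-clockwise from the positive x axis) and the function names are assumptions for illustration; the paper's exact procedure is not given in this section.

```python
import math


def residual_current(u_series, v_series):
    """Mean velocity vector over a window of whole tidal cycles.

    Returns (u_res, v_res, speed, direction_deg), with the direction measured
    counter-clockwise from the +x axis in the range [0, 360).
    """
    n = len(u_series)
    if n == 0 or n != len(v_series):
        raise ValueError("u and v series must be non-empty and of equal length")
    u_res = sum(u_series) / n
    v_res = sum(v_series) / n
    speed = math.hypot(u_res, v_res)
    direction = math.degrees(math.atan2(v_res, u_res)) % 360.0
    return u_res, v_res, speed, direction


def direction_difference(a_deg, b_deg):
    """Smallest absolute difference between two directions, in [0, 180]."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)
```

Applying `residual_current` to the Free-Run and Set-Up II velocities at each grid point and then `direction_difference` to the two directions would give the per-point deviations that an RMSD of residual direction summarises.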

Figure 11. The effect of residual velocity time window on averaged root mean square error.

Conclusions
Drifters have a long history of use in oceanography, climate research and weather forecasting; however, they have only been introduced into estuarine environments in the last few years. Accordingly, many drifter types exist. Recently, drifters with Global Positioning System (GPS) receivers have drawn considerable attention due to their ability to obtain data with large spatiotemporal coverage at a lower cost than traditional Eulerian instruments. This work provides the first examination of the use of low-cost, low-resolution Lagrangian drifters for improving the accuracy and understanding of the flow field dynamics of a micro-tidal estuary (Currimundi Lake) through data assimilation into a hydrodynamic model.
The real velocities derived from the Lagrangian drifter data, in addition to Eulerian velocities, were assimilated into a two-dimensional version of Delft3D-FM, which solves the shallow water equations through a finite element method. Assimilation experiments were conducted for 3 h, using the flow data collected during the 28 April 2015 experiment. The Ensemble Kalman Filter (EnKF) was applied through OpenDA, an open interface standard and assimilation toolbox, and the system was tested for different data configurations and model setups. The problem of assimilating Lagrangian drifter data into Eulerian models was circumvented by applying a pseudo-Lagrangian approach through which the corresponding Eulerian velocity is constructed. This framework was used to evaluate the efficiency of the EnKF scheme in improving the velocity field and residual currents of the study domain.
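One simple way to picture the pseudo-Lagrangian construction is sketched below: velocities are estimated by finite differences between successive GPS fixes and attached to the nearest model grid point. This is only an assumed reading of the approach; the function names, the midpoint placement, the nearest-point mapping and the use of projected coordinates in metres are hypothetical and may differ from the implementation used in the study.

```python
def pseudo_lagrangian_velocities(times, xs, ys):
    """Finite-difference velocities from successive drifter fixes.

    times are in seconds and xs, ys in metres (assumed projected coordinates).
    Returns a list of (t_mid, x_mid, y_mid, vx, vy) tuples, one per pair of
    consecutive fixes, located at the midpoint of each displacement.
    """
    if not (len(times) == len(xs) == len(ys)):
        raise ValueError("times, xs and ys must have equal length")
    out = []
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        vx = (xs[i + 1] - xs[i]) / dt
        vy = (ys[i + 1] - ys[i]) / dt
        t_mid = 0.5 * (times[i] + times[i + 1])
        x_mid = 0.5 * (xs[i] + xs[i + 1])
        y_mid = 0.5 * (ys[i] + ys[i + 1])
        out.append((t_mid, x_mid, y_mid, vx, vy))
    return out


def nearest_grid_point(x, y, grid_points):
    """Index of the grid point (x_k, y_k) closest to the position (x, y)."""
    return min(
        range(len(grid_points)),
        key=lambda k: (grid_points[k][0] - x) ** 2 + (grid_points[k][1] - y) ** 2,
    )
```

With fixes every 60 s, each tuple gives a velocity observation that can be treated like an Eulerian measurement at the selected grid point for that assimilation step.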
The pivotal elements influencing DA performance were examined. The robustness of the EnKF method was assessed by varying several parameters, such as the ensemble size, frequency of assimilation and number of drifters. We found that the ensemble size plays a key role in reducing model error. The results also show that DA performance is highly sensitive to the assimilation time interval: the most frequent interval tested, 1 min, led to the largest improvement (i.e., 28%), whereas increasing the time interval to 10 min showed nearly no improvement because of velocity information loss. Assimilation of Lagrangian drifter data, even with a limited number of drifters, improves velocity estimates. Therefore, even a small number of Lagrangian sensors can be effective for collecting water flow information to be used in an EnKF-driven assimilation process.
The Lagrangian assimilation algorithm was examined by conducting a series of DA experiments. A base experiment (Base-Test) was performed using data from eight drifters in the model. The results obtained by Lagrangian assimilation were compared with those from Eulerian assimilation and from combined Lagrangian and Eulerian assimilation. The results demonstrated the efficacy of DA, as the velocity maps differed significantly before and after assimilation. We found that both Lagrangian and Eulerian data improve the assimilated estimates of our system dynamics, with the Lagrangian data being the more important. The results also indicate that concurrent assimilation of Lagrangian drifter and Eulerian ADV velocities leads to a higher deviation between assimilated and non-assimilated models.
Our results suggest that drifters are a desirable alternative to fixed Eulerian devices such as ADVs and ADCPs. Furthermore, drifters can be promptly deployed in any specific domain under study, particularly when unforeseen events occur. Lagrangian measurements are especially beneficial in areas where Eulerian instruments are either sparse or unreliable. It is evident, however, that directly assimilating Lagrangian positions into Eulerian-based hydrodynamic models is relatively difficult because of the differences between the data and model frameworks. In addition, raw Lagrangian data are rather challenging to use because particle motion is generally affected by local flow perturbations induced by different physical processes, such as turbulence and surface wind. Despite these inherent challenges, the work showed that low-cost, low-resolution drifters provided a relatively higher improvement in the model prediction than the Eulerian dataset due to their larger area coverage.
Using a discharge rating curve, we predicted discharge for periods of 1, 2, 4 and 8 days, which was then applied to model the domain for these time periods. The direction of residual currents was calculated over the full tidal cycles. The deviation between the model before and after assimilation grew as the time window of residual velocity decreased from 8 days to 1 day, with large differences in the shallow areas as well as in the section covered by drifters. Because the lake has an open connection with the ocean, these non-tidal currents are not necessarily initiated internally but are often propagated from the ocean. Depending on the availability of estuary stage data, this type of study can be conducted during major tidal periods and can therefore act as a guidance platform for assessing the role of DA in improving the modelled currents over the entire spatial extent of the domain.