#### 3.1. Velocity Threshold Optimization

Data QC on radial velocity is based primarily on radial velocity thresholds. The conventional approach sets threshold values a priori, using either long-term statistics of HFR data at each radar grid cell or quantitative comparisons with near-surface current data. These data can come from moorings, drifters, or even gliders, provided that they are available within the radar footprint and that the differences and limitations of each sampling strategy are properly accounted for. In the deployments across Australia, both approaches have proven reliable and complementary: under the assumption of normally distributed radial velocities, 99% confidence levels (3 times the standard deviation, σ, from the mean) can be set from the velocity distributions, and anomalous radial velocities can be identified and either flagged or removed. Examples of radial current distributions from the radar stations and the current meter data for the FRE HFR system are reported in Figure 2. Despite the vertical separation between the near-surface radar data and the mooring observations, which are 13–56 m below the surface, there is surprisingly good agreement in mean radial velocity and 99% confidence levels across all moorings. Before any velocity threshold is applied, R values range from 0.15 to 0.86, with RMSD between 10.3 cm/s and 40.8 cm/s (Table 2). After applying a 1.5 m/s (absolute value) threshold on radial velocity, comparison metrics generally improved for both FRE and GUI HFR stations, with R increasing by up to 25% or more and RMSD decreasing by up to 50% (Table 2).

Comparison metrics are in good agreement with previously reported findings in different ocean regions [24,25,26]. Significant differences and poor statistical comparison results are observed for the radar-mooring pairs FRE–WATR04 (distance from surface 13 m), FRE–WATR10 (distance from surface 43 m), GUI–NRSROT (distance from surface 15 m), and GUI–WACA20 (distance from surface 32 m). Differences arise primarily from radar radial velocities that often exceed 1.5 m/s and are inconsistent with both HFR data at nearby locations and the mooring velocity data. At these locations, comparison metrics improve significantly when the 99% thresholds are applied to the HFR observations, with R increasing and RMSD decreasing (Table 2).
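The two velocity tests described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `flag_velocity_outliers`, its parameters, and the synthetic sample are hypothetical and are not taken from the operational QC code; the statistics are computed per input series, as would be done for a single grid cell.

```python
import numpy as np

def flag_velocity_outliers(radial_velocity, n_sigma=3.0, abs_limit=1.5):
    """Return a boolean mask that is True where a radial velocity is suspect.

    Hypothetical helper: combines the n-sigma test (assumes roughly normal
    data) with a fixed absolute limit in m/s (1.5 m/s in the text).
    """
    v = np.asarray(radial_velocity, dtype=float)
    mean = np.nanmean(v)
    sigma = np.nanstd(v)
    # Statistical test: deviation from the mean larger than n_sigma * sigma.
    statistical = np.abs(v - mean) > n_sigma * sigma
    # Fixed test: magnitude above the absolute limit.
    absolute = np.abs(v) > abs_limit
    return statistical | absolute

# Synthetic example (not real HFR data): normal velocities plus three spikes.
rng = np.random.default_rng(0)
v = rng.normal(0.0, 0.2, size=1000)
v[:3] = [2.4, -1.9, 3.1]
mask = flag_velocity_outliers(v)
print(f"{mask.sum()} of {v.size} samples flagged")
```

Note that the spikes themselves inflate σ, so in practice the statistics may be better estimated from a long record or with a robust estimator; that choice is outside what the text specifies.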

#### 3.2. SNR Threshold Optimization

The SNR of the Doppler lines in a Doppler spectrum is generally considered a good proxy for HFR data quality and, in combination with a velocity threshold, is more than adequate to identify and flag the majority of suspect data. For direction-finding systems such as the SeaSonde, it was shown in particular that low SNR values are useful for identifying anomalous data and that data quality generally improves as SNR increases [27]. Analyses of the five-month data set used here suggest that similar assumptions hold for phased-array radars such as the WERA systems. Distributions of radial velocity for different SNR classes (not shown here) suggest that suspect radial velocities exceeding 1.5 m/s in magnitude tend to cluster between 5 and 10 dB, while radial velocities within [−1.5, −1] m/s and [1, 1.5] m/s, or within ±1 m/s, span similar SNR ranges, including values below the 10 dB threshold. As for radial velocity, SNR thresholds are usually defined a priori under the rule of thumb that data quality improves with increasing SNR. Thresholds are also generally assumed to be constant in time and space, so that regional, spatial, or temporal variations due to factors such as noise levels or transmit patterns are not taken into account. Overly restrictive thresholds can have the undesirable effect of removing valid observations; conversely, overly permissive thresholds let anomalous data through.

The SNR threshold can, and preferably should, be optimized using independent datasets: a range of threshold values is tested, radial velocities with SNR below each threshold are removed, and R and RMSD are computed, with the aim of finding the value that yields a significant change in (R, RMSD) at an acceptable data loss. Results for the Rottnest HFR deployment are summarized in Figure 3. Figure 3 shows (R, RMSD) for SNR thresholds in the 0–40 dB range, while Table 3 reports results only for the default threshold used in the proprietary software (6 dB) and for the 10 dB threshold, beyond which no further statistically significant changes are observed.

In general, (R, RMSD) values are consistent with other radar validation analyses performed in different environments [24,25,26]. Thresholds below 4 dB have little or no effect, in agreement with the minimum threshold applied during the inversion of the Doppler spectra to radial currents. For both HFR stations used here (FRE and GUI), the most statistically significant changes occur between 5 and 10 dB. In this range, RMSD drops from 40 cm/s to 10–12 cm/s and R increases from below 0.2 to above 0.7, while data loss stays within 10–15%. The analysis also shows that some locations are less sensitive than others to variations in the SNR threshold: at these locations, no statistically significant changes are detected regardless of the threshold applied.
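The threshold sweep described in this section can be sketched as follows. The function `sweep_snr_threshold` and the synthetic radar and mooring series are assumptions made purely for illustration (in particular, the way noise grows below 8 dB is invented); the sketch only shows how R, RMSD, and data loss would be tabulated per threshold.

```python
import numpy as np

def sweep_snr_threshold(radar_v, mooring_v, snr, thresholds):
    """Tabulate (threshold, R, RMSD, data loss %) for each SNR threshold.

    Hypothetical helper: samples with SNR below the threshold are dropped,
    then radar and mooring velocities are compared on what remains.
    """
    radar_v, mooring_v, snr = (
        np.asarray(a, dtype=float) for a in (radar_v, mooring_v, snr)
    )
    n_total = radar_v.size
    results = []
    for thr in thresholds:
        keep = snr >= thr
        n_keep = int(keep.sum())
        if n_keep < 3:
            break  # too few samples left for meaningful statistics
        r = np.corrcoef(radar_v[keep], mooring_v[keep])[0, 1]
        rmsd = np.sqrt(np.mean((radar_v[keep] - mooring_v[keep]) ** 2))
        loss = 100.0 * (1.0 - n_keep / n_total)
        results.append((thr, r, rmsd, loss))
    return results

# Synthetic data (assumption): radar error is large only when SNR < 8 dB.
rng = np.random.default_rng(1)
n = 2000
mooring = 0.3 * np.sin(np.linspace(0.0, 40.0 * np.pi, n))
snr = rng.uniform(0.0, 40.0, n)
noise_sd = np.where(snr < 8.0, 0.8, 0.05)
radar = mooring + rng.normal(0.0, 1.0, n) * noise_sd

table = sweep_snr_threshold(radar, mooring, snr, thresholds=range(0, 21, 2))
for thr, r, rmsd, loss in table:
    print(f"SNR >= {thr:2d} dB  R = {r:.2f}  RMSD = {rmsd:.3f} m/s  loss = {loss:.1f}%")
```

Plotting R and RMSD against data loss from such a table is one way to locate the threshold beyond which further tightening no longer changes the metrics.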

#### 3.3. Advanced Artifact Removal

When correctly chosen or fine-tuned, thresholds on radial velocity or SNR (Section 3.1 and Section 3.2) can generally handle the majority of anomalous velocities in a radial map. There may be cases, however, when artifacts appear in the radial velocity field, such as unrealistic velocities at fixed and known range cells, i.e., at fixed distances from the HFR receiver. Most of the time, such artifacts are introduced either by external radio frequency interference (RFI) or by modulations of the 50–60 Hz 220 V power line. As a consequence, strong signals appear in the range-Doppler spectra at multiples of the 50 Hz frequency, for instance at approximately 20, 40, 60, 80, 100, and 120 km offshore (Figure 4a); at higher harmonics, this signal also tends to spread over frequency and can interfere with the detection of the Doppler peaks, thus introducing spurious radial currents.
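The roughly 20 km spacing of these rings is what one would expect if the radar uses a linear FMCW waveform, where a baseband tone at frequency f maps to an apparent range R = c·T·f / (2B). The short sketch below evaluates this relation; the chirp length T = 0.26 s and bandwidth B = 100 kHz are illustrative assumptions, not values stated in this section.

```python
C = 299_792_458.0  # speed of light, m/s

def interference_range_km(freq_hz, chirp_s, bandwidth_hz):
    """Apparent range (km) of a tone at freq_hz for a linear FMCW chirp.

    From the FMCW beat-frequency relation f_b = 2 R B / (c T),
    so R = c T f_b / (2 B).
    """
    return C * chirp_s * freq_hz / (2.0 * bandwidth_hz) / 1000.0

# Assumed chirp parameters (hypothetical, for illustration only).
harmonics = [50 * k for k in range(1, 7)]
ranges = [interference_range_km(f, chirp_s=0.26, bandwidth_hz=100e3)
          for f in harmonics]
for f, r in zip(harmonics, ranges):
    print(f"{f:3d} Hz -> {r:6.1f} km")
```

With these assumed values the fundamental falls near 19.5 km and each harmonic adds the same increment, consistent with the ring positions quoted above.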

There are several possible ways of dealing with this contamination. A first approach, currently operational across the Australian Ocean Radar HFR network, uses an iterative method that fits a 1-D or 2-D reference signal to the radial SNR map, then identifies and flags anomalous data that clearly stand out from the background SNR values. Other approaches, currently under investigation, act on the I/Q time series of the range-gated signals at each antenna and then identify and filter any anomalous signal in the frequency domain, in a manner similar to [28,29,30,31]. The conventional beam-forming and Doppler spectra inversion steps are then applied to the filtered data. This section focuses on the first approach, which is used for both NRT and DM operations. The other methods still need to be optimized to account for the spreading over Doppler of the higher-order harmonics (Figure 4a).

An example of a heavily contaminated range-Doppler spectrum and the corresponding radial map is given in Figure 4a,b, respectively. Data shown here refer to the HFR station at Red Rock (RRK), NSW, where the 50 Hz contamination is particularly significant, although it is commonly observed at all the installations with the exception of Cape Wiles (CWI), SA. The range-Doppler spectrum, derived following [32,33] (Figure 4a), shows the expected Bragg peaks from the backscattering ocean waves, but also strong signals at 20, 40, 60, 80, 100, 120, and 140 km offshore, with SNR values comparable in magnitude to the dominant Bragg peaks. While at closer ranges this feature is well separated from the true Doppler peaks, its spreading over frequency with range biases the Doppler peak detection at 60, 80, 100, 120, and 140 km offshore. Effects are evident in both the radial and SNR maps, where well-defined rings are clearly detected and inconsistent radial current patterns are found. Since the SNR values associated with radial currents at these range cells typically exceed 30–40 dB, any spike identification based on the conventional SNR threshold would fail to detect them, could remove potentially valid observations, and would propagate anomalous observations through to the vector mapping step. Similarly, a conventional threshold on radial velocity would possibly remove the obviously erroneous data points; however, it would fail to identify the problematic range cells. For this specific example, radial currents with poor SNR are still within realistic ranges ([−1.5, 1.5] m/s). Radial velocities exceeding a maximum value of 1.5 m/s are flagged; some of them are associated with poor SNR values, whilst others exceed the 10 dB threshold. Peaks at 80 and 120 km cannot be identified using standard constraints on SNR or radial velocity. In contrast, most of the observations associated with the peak at the 140 km range may be flagged and removed using the SNR and velocity thresholds combined.

The artifact removal procedure developed at the Ocean Radar facility fits either a 1-D polynomial to the SNR distribution along each radial direction or a 2-D surface to the SNR map and, on the basis of a “distance” between the model and the data, identifies and flags suspect data. The procedure is iterative: the distance between the fitted model and the observations is updated at each pass until no suspect data points are found or a maximum number of iterations is reached. Detailed steps are provided below:

1. fit a polynomial model to the SNR distribution along each radar beam (1-D case) or fit a 2-D surface to the SNR map;
2. estimate a “distance” between the data and the polynomial fit and set confidence levels as n × σ(data − fit), where n = 2, 3;
3. remove suspect SNR data outside of the confidence levels;
4. move to the next radial beam (1-D case);
5. repeat Steps 1–4 until (i) no more anomalous data are detected or (ii) a maximum number of iterations is reached.
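The steps above can be sketched for a single beam as follows. This is a minimal illustration, not the operational code: the function `flag_snr_artifacts_1d`, its defaults, the normalization of the range axis, and the synthetic beam are all assumptions. Looping over beams (Step 4) would simply call the function once per beam.

```python
import numpy as np

def flag_snr_artifacts_1d(ranges, snr, order=3, n_sigma=3.0, max_iter=20):
    """Iteratively flag SNR values that stand out from a polynomial fit.

    Hypothetical sketch for one radar beam. Returns a boolean mask that is
    True for flagged cells (non-finite SNR values are also returned as True).
    """
    ranges = np.asarray(ranges, dtype=float)
    snr = np.asarray(snr, dtype=float)
    # Normalize the range axis to keep the polynomial fit well conditioned.
    x = (ranges - ranges.mean()) / ranges.std()
    good = np.isfinite(snr)
    for _ in range(max_iter):
        if good.sum() <= order + 1:
            break  # not enough points left to constrain the fit
        coeffs = np.polyfit(x[good], snr[good], order)   # Step 1: fit model
        resid = snr - np.polyval(coeffs, x)              # Step 2: "distance"
        sigma = np.std(resid[good])
        new_bad = good & (np.abs(resid) > n_sigma * sigma)
        if not new_bad.any():
            break                                        # Step 5(i): converged
        good &= ~new_bad                                 # Step 3: remove suspects
    return ~good

# Synthetic beam (assumption): linear SNR decay, 1 dB noise, one 30 dB artifact.
rng = np.random.default_rng(0)
ranges = 1.5 * np.arange(1, 121)  # 120 range cells, 1.5 km apart
snr = 60.0 - 56.0 * ranges / 180.0 + rng.normal(0.0, 1.0, ranges.size)
snr[50:56] += 30.0
flags = flag_snr_artifacts_1d(ranges, snr)
print("flagged cells:", np.flatnonzero(flags))
```

Because σ is recomputed after each removal, strong artifacts are caught in the first pass and weaker ones only once the stronger ones no longer inflate the residual spread, which is the motivation for iterating.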

For the 1-D case, the actual implementation of the algorithm allows for the choice of a second-order, third-order, or exponential function, while for the 2-D case two options are available that fit either a third- or fifth-order surface to the SNR maps. An example of the results of the 1-D iterative identification and removal along one specific radial beam (92° NCW) is provided in Figure 5 for the cubic fit (Figure 5a), the quadratic fit (Figure 5b), and the exponential decay (Figure 5c). No other thresholds are applied in this example.

In order to test the effectiveness of the 1-D and 2-D formulations and optimize the choice of the fit function, a series of sensitivity tests has been performed on a synthetic SNR map with realistic patterns, for instance matching the one that generated the radial currents shown in Figure 4b. The SNR model used here is a simple linear decay over range for all bearings, which starts from 60 dB at the grid cell closest to the receiver site and decreases to 4 dB at 180 km offshore. A Gaussian-shaped noise function is added to range cells 90–95 and 120–128 with a peak amplitude of 50 dB, and to range cells 150–159 with a peak amplitude of 15 dB, so as to reproduce a typical pattern in the data. An additional noise term is included in the sensitivity tests, with variance levels of 0 (no noise), 1, 2, 5, and 10 dB. No assumptions are made in regard to directional patterns. The sensitivity analysis aimed to determine the processing speed, the number of iterations required to flag and remove the known artifacts, and the number of “false flags”, i.e., incorrect detections, as a function of the noise level.
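A synthetic profile of this kind can be generated along the following lines. Several details are not given in the text and are assumed here: 180 range cells with zero-based indices, artifacts added on top of the linear decay, a Gaussian width chosen so that each listed cell span covers about ±2 standard deviations, and Gaussian random noise. The function name `synthetic_snr_profile` is hypothetical.

```python
import numpy as np

def synthetic_snr_profile(n_cells=180, max_range_km=180.0, noise_var=0.0, seed=0):
    """Build one synthetic SNR profile with three known artifacts.

    Returns (ranges_km, snr_db, truth), where truth marks the artifact cells.
    Cell count, bump width, and noise model are assumptions for illustration.
    """
    ranges = np.linspace(max_range_km / n_cells, max_range_km, n_cells)
    snr = np.linspace(60.0, 4.0, n_cells)  # linear decay, 60 dB to 4 dB
    truth = np.zeros(n_cells, dtype=bool)
    for first, last, peak in [(90, 95, 50.0), (120, 128, 50.0), (150, 159, 15.0)]:
        cells = np.arange(first, last + 1)
        centre = 0.5 * (first + last)
        width = (last - first) / 4.0  # assumed: span is about +/- 2 std
        snr[cells] += peak * np.exp(-0.5 * ((cells - centre) / width) ** 2)
        truth[cells] = True
    if noise_var > 0:
        rng = np.random.default_rng(seed)
        snr = snr + rng.normal(0.0, np.sqrt(noise_var), n_cells)
    return ranges, snr, truth

ranges_km, snr_db, truth = synthetic_snr_profile()
print("artifact cells:", int(truth.sum()))
```

Repeating the call with `noise_var` set to 1, 2, 5, and 10 and comparing detected flags against `truth` gives the detection and false-flag rates discussed in the sensitivity tests.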

Results for the 1-D case, for both the noise-free and the randomly distributed noise cases, can be summarized as follows. In the absence of noise, the quadratic, cubic, and exponential fits are equally successful, detecting and flagging more than 70% of the anomalies while keeping false-flag detections to a minimum (0% in this specific realization). The exponential fit, however, requires a significantly larger number of iterations to achieve results comparable to the quadratic and cubic functions, and its detection rate does not improve appreciably as the number of iterations increases. Detection rates decrease when noise is added to the SNR maps, but the lower success rate is related to missed identification of the weaker SNR peaks at the outer ranges rather than to an increasing rate of false detections.

When applied to field data, results suggest that a third-order polynomial (Figure 6c) is more effective in identifying and removing anomalous observations from the radial maps than the quadratic (Figure 6b) or exponential fit (Figure 6d). For this specific example, the artifact removal is applied to radial currents from one station (RRK), and additional thresholds have been applied to radial SNR and radial velocity before the polynomial fit.

Results for the 2-D fit improve with respect to the 1-D case when noise is added to the synthetic SNR test case, but no improvement is obtained in the noise-free case. In general, the detection rate decreases with increasing noise level, as in the 1-D case, but the false detection rate is significantly lower in the 2-D case for the same noise level. Processing speed also increases slightly relative to the 1-D case when noise is added to the SNR map. When applied to a real radial velocity map, no major differences are found between the third- and fifth-order surfaces, yet processing speed greatly improves, by up to a factor of 10 compared to the 1-D case.
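For comparison with the 1-D procedure, a 2-D variant can be sketched by fitting a single polynomial surface to the whole SNR map with ordinary least squares. The function `flag_snr_artifacts_2d`, the choice of monomial basis, and the synthetic map are assumptions for illustration; the operational surface model may differ.

```python
import numpy as np

def flag_snr_artifacts_2d(snr_map, order=3, n_sigma=3.0, max_iter=20):
    """Iteratively flag cells that stand out from a 2-D polynomial surface.

    Hypothetical sketch: snr_map has shape (n_bearing, n_range); the surface
    contains all monomials r**i * b**j with i + j <= order.
    """
    snr_map = np.asarray(snr_map, dtype=float)
    n_bearing, n_range = snr_map.shape
    b, r = np.meshgrid(np.linspace(-1.0, 1.0, n_bearing),
                       np.linspace(-1.0, 1.0, n_range), indexing="ij")
    terms = [(r ** i * b ** j).ravel()
             for i in range(order + 1) for j in range(order + 1 - i)]
    design = np.column_stack(terms)
    z = snr_map.ravel()
    good = np.isfinite(z)
    for _ in range(max_iter):
        if good.sum() <= design.shape[1]:
            break  # not enough points left to constrain the surface
        coeffs, *_ = np.linalg.lstsq(design[good], z[good], rcond=None)
        resid = z - design @ coeffs
        sigma = np.std(resid[good])
        new_bad = good & (np.abs(resid) > n_sigma * sigma)
        if not new_bad.any():
            break
        good &= ~new_bad
    return (~good).reshape(snr_map.shape)

# Synthetic map (assumption): same decay on every bearing, one 30 dB ring.
rng = np.random.default_rng(0)
n_bearing, n_range = 40, 120
snr_map = np.tile(np.linspace(60.0, 4.0, n_range), (n_bearing, 1))
snr_map += rng.normal(0.0, 1.0, snr_map.shape)
snr_map[:, 50:56] += 30.0
flags2d = flag_snr_artifacts_2d(snr_map)
print("flagged cells:", int(flags2d.sum()))
```

A single surface fit per iteration replaces one polynomial fit per beam, which is one plausible reason for the speed-up reported for the 2-D case.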