The Quadrature Method: A Novel Dipole Localisation Algorithm for Artificial Lateral Lines Compared to State of the Art

The lateral line organ of fish has inspired engineers to develop flow sensor arrays—dubbed artificial lateral lines (ALLs)—capable of detecting near-field hydrodynamic events for obstacle avoidance and object detection. In this paper, we present a comprehensive review and comparison of ten localisation algorithms for ALLs. Differences in the studied domain, sensor sensitivity axes, and available data prevent a fair comparison between these algorithms from their original works. We compare them with our novel quadrature method (QM), which is based on a geometric property specific to 2D-sensitive ALLs. We show how the area in which each algorithm can accurately determine the position and orientation of a simulated dipole source is affected by (1) the amount of training and optimisation data, and (2) the sensitivity axes of the sensors. Overall, we find that each algorithm benefits from 2D-sensitive sensors, with alternating sensitivity axes as the second-best configuration. Of the machine learning approaches, the MLP required an impractically large training set to approach the optimisation-based algorithms' performance. Regardless of the data set size, QM performs best, with both a large area of accurate predictions and a small tail of large errors.


Introduction
Artificial lateral lines (ALLs) are sensor arrays inspired by the biological lateral line organ found in fish and amphibians. This organ enables fish to detect and locate moving objects such as prey, predators, or social partners [1]. The ability to sense one's environment using a lateral line is sometimes called hydrodynamic imaging [2,3]. Two types of hydrodynamic imaging are distinguished: active hydrodynamic imaging, where fish use their movement's flow field to detect stationary obstacles; and passive hydrodynamic imaging, where fish detect fluid flows generated externally. Both types of hydrodynamic imaging have applications for ALLs. Active hydrodynamic imaging is useful for obstacle avoidance of autonomous underwater vehicles (AUVs) [4]. Here, ALLs provide a complementary sense for AUVs because, unlike cameras, they do not rely on visibility and, unlike sonar, they do not actively emit a signal. Passive hydrodynamic imaging can be used for tracking the location of objects (for instance, ships in a harbour) or detecting disturbances near underwater installations.
Several dipole localisation algorithms have been developed for ALLs in the last 15 years [5][6][7][8][9][10][11][12][13][14]. These algorithms attempt to locate objects that move underwater using the water flow pattern their movement generates. This process is analogous to solving the inverse problem of hydrodynamic source localisation [15]. It is challenging to compare these algorithms' performance from their original works due to differences in experimental designs and conditions. For instance, some algorithms were evaluated for 2D localisation [5][6][7][8][9][10], whereas others localised sources in 3D [11][12][13]. Additionally, some algorithms were evaluated in simulations [5,9], while other studies used physical sensors [6][7][8][10][11][12][13][14]. In particular, differences in the (simulated) sensor sensitivity and the areas in which sources are located prevent the comparison of average or median prediction errors between studies because the number of inaccurate predictions depends heavily on sensor sensitivity and increases with the area's size.
The present research compares ten dipole localisation algorithms via their ability to determine an object's state from velocity measurements simulated using potential flow [16]. In our study, this state refers to the 2D position and movement direction relative to an array of flow sensors. The quality of these state estimates is compared using two analyses. First, we determine the size of the area in front of the array in which these algorithms accurately estimate an object's state. This analysis allows an intuitive comparison of the effective range of each algorithm. Secondly, we determine each algorithm's distribution of prediction errors, i.e., the differences between the actual and estimated state. This analysis better indicates the reliability of each algorithm and provides a median error metric.
As our baseline, we use two localisation methods: a random predictor (RND) and an off-the-shelf least square curve fit (LSQ) algorithm. The remaining algorithms are divided into three categories. The first category contains three template-based algorithms. Template matching [6] and linear constraint minimum variance (LCMV) beamforming [11,12] use a set of velocity measurements of a single source at different known positions and movement directions to locate a source. The continuous wavelet transform (CWT), introduced for dipole localisation by Ćurčić-Blake and van Netten [5], is also a template-based algorithm. This algorithm is based on the observation that a set of wavelets fully describes the potential flow [16] velocity generated by a dipole source. The second category contains artificial neural networks. A multi-layer perceptron (MLP) was used by Abdulsadda and Tan [7] and Boulogne et al. [9]. The latter showed that an extreme learning machine (ELM) performed better than their MLP implementation in high signal-to-noise conditions. The third category contains model-based algorithms. The first two, Gauss-Newton (GN) and Newton-Raphson (NR), fit a potential flow model to measured velocity values to predict a dipole's position and movement direction [8]. Abdulsadda and Tan [8] showed that GN consistently performed slightly better than NR, and both algorithms outperformed LCMV beamforming.
There are several novel aspects of the present study. Firstly, all localisation algorithms are extended to use various combinations of the velocity field's parallel and perpendicular components with respect to the sensor array. Several 2D-sensitive fluid flow sensors exist in the literature [17][18][19][20], which provide the orthogonal fluid flow component that is not yet used by most dipole localisation algorithms. In addition, hair cells with varying orientations in close proximity have been observed in the ear of fish [21] and on the body of the Xenopus laevis frog (as cited in [1]). These varying orientations are thought to contribute to the localisation of stimuli. Secondly, not only are both the position and direction of movement of the source varied, their mutual effects on the prediction error are analysed as well. Thirdly, we use a novel approach to compare the performance of the dipole localisation algorithms, quantified by the area in which they correctly predict the position and direction of movement of an object with a predefined accuracy. Fourthly, we extend the template matching algorithm to a k-nearest neighbours (KNN) generalisation, where k = 1 is equivalent to the template matching algorithm as referenced earlier [6]. Finally, we introduce a novel model-based dipole localisation algorithm coined the quadrature method (QM), which exploits geometric properties of a 2D-projection of a velocity field. We show that QM has state-of-the-art localisation performance and how the movement direction of a dipole can be estimated directly from velocity measurements and its estimated location.
The remainder of the present paper is structured as follows: Section 2 explains the fluid flow simulation, our methods of analysis, and the dipole localisation algorithms. Section 3 presents the results of both analysis methods, providing error distributions of the algorithms as well as the total area with median errors below predefined levels of accuracy. Section 4 places our findings in the context of previous work. Section 5 summarises our contributions and conclusions.

Materials and Methods
In the following subsections, we describe the dipole flow field, the simulation environment, the conditions used for our analyses, each dipole localisation algorithm, and their hyperparameter optimisation strategy.

The Dipole Flow Field
Fluid flows were computed for a small sphere (radius a = 1 cm) vibrating with an amplitude equal to a fraction of its size (A = 2 mm), which generates a dipole field [16]. The dipole is the most informative component of a hydrodynamic stimulus for source localisation with ALLs because the higher terms of a multipole expansion decay more quickly with distance [22]. The lower term, the monopole, is measurable at larger distances. However, it is driven by changes in an object's size, so it is less relevant for localising moving objects. The dipole stimulus has become a popular source for studies with ALLs [5][6][7][8][9]13,14,[23][24][25][26][27][28].
A potential flow model was used to simulate fluid flows produced by a sphere, usually referred to as a dipole field. This model was utilised in several other studies [9,13,25,28] and its usefulness is supported by experimental findings in fish lateral line research [5,25]. The potential flow velocity v is computed by [16]:

v = (a³ / (2‖r‖³)) (3 (w · r̂) r̂ − w), with r̂ = r/‖r‖,

where a is the radius of the sphere, w = (w_x, w_y) are the instantaneous velocity components of the moving sphere in 2D, and r = s − p is the location of the sensor s = (x, y) relative to the source p = (b, d). The dipole's position p(t) and velocity w(t) over time were computed as:

p(t) = p₀ + A (cos ϕ, sin ϕ) sin(2π f t)

and

w(t) = dp(t)/dt = 2π f A (cos ϕ, sin ϕ) cos(2π f t),

where A is the amplitude and f the frequency of the oscillation, p₀ is the average position of the source, ϕ is the azimuth angle of the motion, and t indicates time. We treat localisation as recovering the source's average position p₀ from ALL velocity measurements over a period of time. The source's movement during this period did not influence our results because there were an integer number of oscillations in each period. In other words, p₀ corresponded precisely with the average position in the measurement segments.
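The dipole field and source kinematics described above can be sketched as follows; this is a minimal illustration of the potential flow model, not the study's simulation code, and the numeric values are the parameters quoted in the text:

```python
import numpy as np

def dipole_velocity(s, p, w, a=0.01):
    """Potential-flow velocity at sensor position s for a sphere of radius a
    at position p moving with velocity w:
    v = a^3 / (2 |r|^3) * (3 (w . r_hat) r_hat - w), with r = s - p."""
    s, p, w = (np.asarray(v, float) for v in (s, p, w))
    r = s - p
    r_norm = np.linalg.norm(r)
    r_hat = r / r_norm
    return a**3 / (2 * r_norm**3) * (3 * np.dot(w, r_hat) * r_hat - w)

def source_velocity(t, f=45.0, A=0.002, phi=0.0):
    """Instantaneous sphere velocity w(t) = 2*pi*f*A (cos phi, sin phi) cos(2*pi*f*t)."""
    return 2 * np.pi * f * A * np.array([np.cos(phi), np.sin(phi)]) * np.cos(2 * np.pi * f * t)
```

Note the characteristic 1/r³ decay of the dipole field: doubling the source distance reduces the measured velocity by a factor of eight, which is what limits the effective range of a fixed-sensitivity array.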
Even though a unique mapping exists between source states and their velocity profiles, i.e., the patterns an infinite continuous linear array of flow sensors would measure, the inverse problem is challenging because sensor arrays only capture a discrete segment of the velocity profile, which may not contain the informative zero-crossings or peaks. Figure 1 shows the parallel (v_x) and perpendicular (v_y) velocity profiles relative to the sensor array. These velocity profiles broaden and their amplitude decays with the distance of the source, reducing the information captured by a fixed-size ALL. 2D-sensitive sensors increase the chance of capturing one of the velocity profile's more informative points because the zero-crossings of one velocity component are located near the peaks of the other velocity component. Another challenge arises in the movement direction estimation. The velocity profiles of objects in the same place but moving in opposite directions differ only in their sign. Consequently, some of the dipole localisation algorithms struggle to differentiate between these source states. We employ a post-processing step to improve these algorithms' movement direction estimation (see Section 2.5).

Simulation Environment
Eight sensors were simulated, based on the configuration of Wolf et al. [10]. The sensors were placed on the x-axis, centred around x = 0. The length of the sensor array was L = 40 cm, with 5.71 cm between sensors. The source sphere had a radius of a = 10 mm and moved with an amplitude of A = 2 mm at a fixed frequency of f = 45 Hz. Similar frequency values have been used in the literature: 40 Hz in [7,8,23], 45 Hz in [13,27], and 50 Hz in [26,29]. The source's radius and vibration amplitude are comparable to the work of Abdulsadda and Tan [7,8,24] (a = 9.5 mm and A = 1.91 mm). Figure 2 shows a schematic view of the present configuration. The source sphere was located between x = ±0.5 m and from y = 0.025 m to y = 0.525 m, ensuring its edge was always at least 15 mm away from the closest sensor's centre. The orientation of the source oscillation ranged from ϕ = 0 rad to ϕ = 2π rad. For each measurement, the fluid velocity at the sensors was simulated for a duration of T = 1 s and sampled at 2048 Hz, comparable to the values used by Pandya et al. [6] (T = 0.5 s at 2048 Hz). The simulated sensors had a Gaussian sampled velocity-equivalent noise of σ = 1.0 × 10⁻⁵ m s⁻¹. This value was chosen to be within the top five most sensitive sensors reported by Jiang et al. [30]: 2.5 × 10⁻⁶ m s⁻¹ [31], 5 × 10⁻⁶ m s⁻¹ at resonance [20], 8 × 10⁻⁶ m s⁻¹ [32], 8.2 × 10⁻⁶ m s⁻¹ [33], and 2 × 10⁻⁴ m s⁻¹ [34]. The resulting signal-to-noise ratios (SNRs) are shown for the fifth sensor from the left in Figure 3.
Figure 2. A schematic view of the present configuration. Parallel v_x and perpendicular v_y velocity components are indicated at the right-most sensor (not to scale). The area in which the source sphere is positioned is offset by 25 mm from the array location, ensuring a minimal distance of 15 mm between the source's edge and the closest sensor's centre.
Figure 3. The signal-to-noise ratio (SNR) of both velocity components measured by the fifth sensor (x = 2.86 cm). The top row shows contours of the median SNR in cells of 2 × 2 cm². The bottom row shows the median SNR's polar contours in cells of 0.02π rad × 2 cm for source states with an x-coordinate between x = −7.14 cm and x = 12.86 cm, indicating how the movement direction ϕ of a dipole influences the SNR. Simulated potential flow measurements (Equation (1)) and the same measurements with additive Gaussian distributed noise values (σ = 1.5 × 10⁻⁵ m s⁻¹, µ = 0 m s⁻¹) were used to compute the SNR. Specifically, the SNR was computed as the frequency power ratio between the noisy measurements and the noise floor at the source frequency (f = 45 Hz). The frequency power was computed by a discrete Fourier transform (DFT) using a Hamming window.
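The SNR computation described in the caption can be sketched as follows; the signal amplitude here is an illustrative value, not one taken from the study:

```python
import numpy as np

fs, f0, T = 2048, 45, 1.0
t = np.arange(int(fs * T)) / fs
rng = np.random.default_rng(0)

signal = 1e-4 * np.sin(2 * np.pi * f0 * t)   # illustrative noise-free sensor signal (m/s)
noise = rng.normal(0.0, 1.5e-5, t.size)      # velocity-equivalent noise floor

def power_at(x, f):
    """Frequency power at f via a Hamming-windowed DFT."""
    w = np.hamming(x.size)
    X = np.fft.rfft(x * w)
    return np.abs(X[int(round(f * x.size / fs))]) ** 2

# SNR: power ratio of the noisy measurement to the noise floor at f = 45 Hz.
snr_db = 10 * np.log10(power_at(signal + noise, f0) / power_at(noise, f0))
```

Because T = 1 s contains an integer number of 45 Hz oscillations, the source frequency falls exactly on a DFT bin and the Hamming window mainly suppresses broadband leakage.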

Performance Analyses
Two methods of analysis were employed in this research to compare the performance of the dipole localisation algorithms. Analysis Method 1 varied the number of measurements the algorithms could use to train and optimise their hyperparameters. Not every algorithm requires training or has hyperparameters to optimise. For the algorithms that do not have a training phase, we expect a consistent performance regardless of the amount of data. The other algorithms are expected to improve as the amount of training data increases. In this analysis method, all sensors were sensitive to both the parallel and perpendicular velocity component. Analysis Method 2 varied which velocity components were measured by the sensors: (x + y) both components on all sensors, (x|y) alternating v x and v y for subsequent sensors, (x) only v x by all sensors, (y) only v y by all sensors. In this analysis method, the largest training and optimisation set was used.
The best performing algorithms based on the two analysis methods were also evaluated using higher velocity-equivalent noise levels (Analysis Method 3) to show how they perform on sensors with a lower SNR. The increased noise levels (σ) were 1.0 × 10⁻³ m s⁻¹ and 1.8 × 10⁻² m s⁻¹, based on [35] and [36], respectively, as cited in [30]. A level of 1.0 × 10⁻⁴ m s⁻¹ was added to bridge the gap with the noise level of the other analysis methods (1.0 × 10⁻⁵ m s⁻¹). All sensors were sensitive to both the parallel and perpendicular velocity component, and the largest training set was used in this comparison.
In all analysis methods, the algorithms' predictions were recorded for each measurement in the test set. A source state with a unique combination of position p = (b, d) and movement direction ϕ was used to generate each measurement. The error of the predicted position E_p (m) and movement direction E_ϕ (rad) were computed as:

E_p = ‖(b̂, d̂) − (b, d)‖

and

E_ϕ = min(|ϕ̂ − ϕ|, 2π − |ϕ̂ − ϕ|),

where P = (b, d, ϕ) are the actual properties of a test state and P̂ = (b̂, d̂, ϕ̂) is the prediction based on the test state's velocity measurements. The areas in which the localisation algorithms' median position errors were lower than 1 cm, 3 cm, 5 cm, and 9 cm were computed by discretising the simulated domain into 2 × 2 cm cells. The areas for the movement direction error were computed similarly with thresholds of 0.01π rad, 0.03π rad, 0.05π rad, and 0.09π rad. The training and test sets contained randomly sampled source states. Poisson Disc sampling [37] was used to ensure an even spread of states over the simulated domain. The minimum distance between states D_s controlled the number of source states in each set. This distance was computed as the Euclidean distance in the x-y-ϕ/2π space containing all possible source states. It can be interpreted as follows: when two states have the same position, their movement directions differ by at least 2πD_s rad; when they have the same movement direction, their positions differ by at least D_s m. The orientation dimension was divided by 2π to balance the number of positions and orientations considered. Table 1 shows the minimum sampling distance, the resulting number of states, and the average minimum distances in terms of position and orientation for each data set. The values of D_s were chosen in terms of the source radius and correspond to the thresholds applied to the position and movement direction errors.
Table 1. An overview of the data sets used in this study. The minimum distance between samples D_s controls the number of source states. A source state is specified by a position p and orientation ϕ. The distance between two states was computed as the Euclidean distance in the combined x-y-ϕ/2π space containing all possible combinations of source positions and movement directions. The orientation dimension was divided by 2π to balance the number of positions and orientations. The testing data set has a different number of source states than the training set with D_s = 0.01, due to the randomness of Poisson Disc sampling [37]. The average distance to the closest neighbour within each data set is indicated for both the position and orientation to support the interpretation of D_s.
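The two error measures can be implemented directly; this is a minimal sketch, assuming E_p is the Euclidean position distance and E_ϕ the circular (wrapped) direction difference:

```python
import numpy as np

def position_error(p_true, p_pred):
    """E_p: Euclidean distance between the actual and predicted position (m)."""
    return float(np.linalg.norm(np.asarray(p_pred, float) - np.asarray(p_true, float)))

def direction_error(phi_true, phi_pred):
    """E_phi: circular absolute difference between movement directions (rad),
    so that angles just above 0 and just below 2*pi count as close."""
    d = abs(phi_pred - phi_true) % (2 * np.pi)
    return min(d, 2 * np.pi - d)
```

The wrap in `direction_error` matters for the per-cell medians: without it, a prediction of 2π − 0.1 rad for a true direction of 0.1 rad would be scored as an error of almost a full turn.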

Parameter Optimisation Approach
Several of the dipole localisation algorithms have hyperparameters, which can be varied to fine-tune their performance. A single error metric that combines and balances both the position and movement direction is required to optimise these parameters. Given that we compare the dipole localisation algorithms based on the area in which they can accurately predict an object's state, the hyperparameter optimisation process should prioritise perfecting accurate predictions over reducing the error of inaccurate predictions. Therefore, we used a normalised absolute error metric E_norm:

E_norm = |b̂ − b| / r_b + |d̂ − d| / r_d + E_ϕ / π,

where P = (b, d, ϕ) are the properties of the test state, P̂ = (b̂, d̂, ϕ̂) is the prediction, and r_b and r_d are the extents of the simulated domain in each position dimension. Compared to an error metric based on the Euclidean distance, the minimisation of E_norm is less sensitive to predictions with a large error. Algorithms that required training were optimised using 5-fold cross-validation (80% training/20% validation split). The other algorithms used the entire training set for a single validation pass. The hyperparameter values that minimised the mean validation error E_norm were used in the evaluation with the withheld test set.
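A normalised absolute error metric of this kind can be sketched as follows; the normalisation constants are our assumption, taken from the domain extents in Section 2.2 (x spans 1.0 m, y spans 0.5 m), and the direction term uses the circular difference:

```python
import numpy as np

# Assumed normalisation constants (domain extents from Section 2.2).
RANGE_B, RANGE_D = 1.0, 0.5

def e_norm(P, P_hat):
    """Normalised absolute error combining position and movement direction.
    Absolute (L1-style) terms make the metric less sensitive to outliers
    than a Euclidean distance."""
    b, d, phi = P
    bh, dh, phih = P_hat
    dphi = abs(phih - phi) % (2 * np.pi)
    dphi = min(dphi, 2 * np.pi - dphi)        # circular direction error
    return abs(bh - b) / RANGE_B + abs(dh - d) / RANGE_D + dphi / np.pi
```

Each term is dimensionless and bounded by 1, so no single dimension dominates the validation score during hyperparameter selection.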
Appendix A provides the optimal values of all hyperparameters for each condition of the first two analysis methods. In Analysis Method 3, the algorithms used the optimal values of Analysis Method 1's D s = 0.01 condition.

Dipole Localisation Algorithms
Each dipole localisation algorithm had access to a training set T = {(V_i, P_i)}, where V_i (sensors × time) are the velocity measurements over time and P_i = (b_i, d_i, ϕ_i) is the state of the i-th training dipole. The velocity measurements contain values for v_x and/or v_y, depending on the analysis variation. The to-be-predicted velocity profile is denoted as Ṽ (sensors × time).
Not every algorithm explicitly used the time component of the velocity measurements. When required, the time dimension was averaged out by computing a DFT of the signal at each sensor. The magnitude of the 45 Hz components multiplied by the sign of their phase reconstructs the velocity signal over the sensors v (sensors × 1). Abdulsadda and Tan [24] used this method (without multiplying by the sign of the phase) to reduce the influence of noise. A Hamming window was used for computing the DFT.
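The time-dimension reduction described above can be sketched as follows; this is a minimal illustration of the magnitude-times-sign-of-phase step, not the study's code:

```python
import numpy as np

fs, f0 = 2048, 45

def reduce_time(V):
    """Collapse (sensors x time) measurements to one value per sensor:
    the 45 Hz DFT magnitude multiplied by the sign of its phase, so that
    sensors in antiphase keep opposite signs."""
    n = V.shape[1]
    w = np.hamming(n)
    X = np.fft.rfft(V * w, axis=1)
    Xk = X[:, int(round(f0 * n / fs))]        # bin at the source frequency
    return np.abs(Xk) * np.sign(np.angle(Xk))
```

Keeping the sign of the phase preserves the spatial zero-crossings of the velocity profile, which several of the algorithms below rely on.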
In the following subsections, each dipole localisation algorithm is described. Table 2 provides a summary of their properties.
Table 2. Properties of the dipole localisation algorithms. The 'Limited to domain' column indicates whether the algorithm's predictions are confined to the simulated domain (see Figure 2). The 'Limited to sample' column indicates whether the algorithm's predictions are confined to source states present in the training set.

Algorithm      Type             Training   Hyperparameters   Limited to Domain   Limited to Sample
RND            -                no         no                yes                 no
LCMV [12,13]   template-based   yes        no                yes                 yes
KNN            template-based   yes        yes               yes                 no
CWT [5]        template-based   yes        yes               yes                 no
ELM [9,10]     neural network   yes        no                no                  no
MLP [7,9]      neural network   yes        no                no                  no
GN [8]         model-based      no         yes               yes                 no
NR [8]         model-based      no         yes               yes                 no

The Random Predictor (RND)
The random predictor is used as a baseline for comparing performance. The algorithm does not use the training set T nor the to-be-predicted velocity measurement Ṽ. Instead, it generates a uniform random position and movement direction within the simulated domain (see Figure 2) for every test state.

Linear Constraint Minimum Variance (LCMV) Beamforming
LCMV beamforming was introduced for dipole localisation by Yang et al. [12] and Nguyen et al. [13]. The algorithm computes a prediction by evaluating an energy value E_i for each source state in the training set T [12,13]:

E_i = 1 / (v_iᵀ R⁻¹ v_i),

where R = Ṽ Ṽᵀ is the covariance of the to-be-predicted source state's measurements, and v_i is the i-th training source's velocity measurement with the time dimension averaged out. The position and movement direction of the training source with the highest energy value are used as the prediction P̂ = (b̂, d̂, ϕ̂). LCMV is not able to differentiate between sources at the same position but with opposite orientations. To solve this issue, we computed the predicted state's expected potential flow values for both the predicted and opposite movement direction. The movement direction with the smallest difference to the measured velocity was used as the final estimate.

K-Nearest Neighbours (KNN)
The KNN algorithm generalises the template matching approach used by Pandya et al. [6]. KNN computes a prediction by finding the k training states with the most similar velocity measurements V_i compared to the to-be-predicted source's measurements Ṽ. Before computing the Euclidean distance between the velocity measurements, the time dimension was averaged out from both the training measurements and the measurement of the to-be-predicted source (see Section 2.5). Then, all velocity measurements were normalised by their maximum absolute value. Finally, the average position and movement direction of the k closest training states were computed and used as the prediction P̂ = (b̂, d̂, ϕ̂). The value of k was optimised, ranging from k = 1 to k = 20.
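The prediction step can be sketched as follows; a plain mean over the k neighbours' states is used, matching the description above (with k = 1 this reduces to template matching):

```python
import numpy as np

def knn_predict(v_tilde, templates, states, k=3):
    """Average the states of the k training templates closest (in Euclidean
    distance) to the measurement. All vectors are first normalised by their
    maximum absolute value, as described in the text."""
    q = v_tilde / np.max(np.abs(v_tilde))
    T = templates / np.max(np.abs(templates), axis=1, keepdims=True)
    idx = np.argsort(np.linalg.norm(T - q, axis=1))[:k]   # k nearest rows
    return np.mean(np.asarray(states, float)[idx], axis=0)
```

Averaging over neighbours means KNN is not limited to states present in the training set, unlike LCMV's pick-the-best-template rule.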

The Continuous Wavelet Transform (CWT)
The CWT was introduced for dipole localisation by Ćurčić-Blake and van Netten [5]. The algorithm is based on the observation that potential flow and the pressure gradient along a lateral line can be expressed as wavelets. As in Wolf and van Netten [14], we extend the family of wavelets to include the perpendicular velocity component. Note that the coordinate system used here is slightly different. Deriving the wavelets from the potential flow formula (Equation (1)), using the approach of Ćurčić-Blake and van Netten [5], gives (see Appendix B for the derivations):

v_x(ρ) = (a³ ‖w‖ / (2d³)) (ψ_e(ρ) cos ϕ + ψ_o(ρ) sin ϕ)
v_y(ρ) = (a³ ‖w‖ / (2d³)) (ψ_o(ρ) cos ϕ + ψ_n(ρ) sin ϕ)

with

ψ_e(ρ) = (2ρ² − 1) / (ρ² + 1)^{5/2}, ψ_o(ρ) = −3ρ / (ρ² + 1)^{5/2}, ψ_n(ρ) = (2 − ρ²) / (ρ² + 1)^{5/2}, and ρ = (x − b)/d,

where a = 1 cm is the radius of the source, w = (w_x, w_y) is the movement velocity of the source, r = s − p is the relative location of the sensor s = (x, y) from the perspective of the source p = (b, d), and ϕ is the azimuth angle of the motion relative to the sensor array.
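The wavelet decomposition can be checked numerically; the following sketch evaluates a wavelet family of this form as derived from the potential flow model (the exact sign conventions are our assumption, following the derivation above):

```python
import numpy as np

# Wavelet family for the parallel (v_x) and perpendicular (v_y) velocity
# profiles; rho is the position along the array normalised by source distance.
def psi_e(rho): return (2 * rho**2 - 1) / (rho**2 + 1) ** 2.5
def psi_o(rho): return -3 * rho / (rho**2 + 1) ** 2.5
def psi_n(rho): return (2 - rho**2) / (rho**2 + 1) ** 2.5

def profiles(x, b, d, phi, a=0.01, w=1.0):
    """v_x and v_y along the array for a source at (b, d) moving at angle phi,
    expressed as linear combinations of the three wavelets."""
    rho = (x - b) / d
    c = a**3 * w / (2 * d**3)
    vx = c * (psi_e(rho) * np.cos(phi) + psi_o(rho) * np.sin(phi))
    vy = c * (psi_o(rho) * np.cos(phi) + psi_n(rho) * np.sin(phi))
    return vx, vy
```

Directly above the source (ρ = 0) the odd wavelet vanishes, so v_x ∝ −cos ϕ and v_y ∝ 2 sin ϕ; this is the geometric property that the quadrature method in Section 2.5 exploits.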
To compute a prediction, the CWT uses the to-be-localised velocity measurement with the time dimension averaged out, ṽ (see Section 2.5), and the source positions p_i = (b, d) in the training set T. Note that the CWT does not use the training dipoles' velocity measurements or movement directions. Instead, it computes the values of the wavelets for each position in the training set. These values were evaluated for an extended sensor array matching the simulation domain's width and normalised by their maximum absolute value. Only the values at the eight sensors were kept after the normalisation step. From these values, a vector was constructed for each position p_i = (b, d) in the training set T, containing four CWT coefficients: one for each combination of velocity component and wavelet. The peak of a Gaussian surface fitted to this vector's magnitude Wv(p_i) was used to estimate the source's position p̂ = (b̂, d̂). This fit was constrained to have a peak within the simulated domain (see Figure 2), and only Wv's values between a factor t_min and t_max of its maximum were used for the fit. The values of t_min and t_max were optimised, ranging from 0 to 1 under the constraint that t_min < t_max. The movement direction ϕ̂ was estimated as the circular mean of two estimates obtained from the ratios of the CWT coefficients of ṽ_x and of ṽ_y, scaled by the correction factors c_x and c_y, respectively, where Wv_n^m(p) is the CWT coefficient of ṽ_{n = (x or y)} with ψ_{m = (e, o, or n)}. Appendix C shows how these equations were derived and provides the analytical values of c_x and c_y. Unfortunately, the values of c_x and c_y can be shown to depend on the source position for our simulated finite and discontinuous sensor array. Therefore, we optimised the values of c_x (between 0.5 and 1) and c_y (between 0.3 and 0.9). Consequently, the estimation of the movement direction was optimised for the positions where the CWT is accurate.
In total, four hyperparameters were optimised for the CWT (t min , t max , c x , c y ). The best combination of values was determined in 30 iterations of the Bayesian optimisation algorithm provided by MATLAB [38].

The Extreme Learning Machine (ELM)
The ELM is a neural network designed to provide "the best generalisation performance at extremely fast learning speed" [39]. An ELM is a single layer feed-forward network with randomly initialised weights on the hidden layer. These weights are not changed during training. To find the optimal weights for the output layer, the ELM can be treated as a linear system because the input weights and activation function are fixed. The online sequential ELM variant (OS-ELM) [40] was used to support iterative training. Random initial weights were generated, and the network was trained on the training set T. The velocity measurements were pre-processed by averaging out the time dimension and normalising with their maximum absolute value. Rectified linear units were used as hidden nodes, as recommended in Goodfellow et al. [41] (p. 168). The layer size n̄ of the ELM was optimised using 101 values spaced logarithmically, ranging from n̄ = 10 to n̄ = N, where N is the number of measurements in the training set T (see Table 1). The parameter sweep was terminated at the first value of n̄ for which the ELM could not be trained due to a singular matrix inversion.
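The core ELM idea can be sketched as a batch computation (the study used the online sequential variant; this simplified version solves the output weights in one least-squares step):

```python
import numpy as np

def train_elm(X, Y, n_hidden=64, seed=0):
    """Fit an ELM: random, fixed input weights; a ReLU hidden layer; and
    output weights solved as a linear least-squares problem."""
    rng = np.random.default_rng(seed)
    W_in = rng.normal(size=(X.shape[1], n_hidden))   # never updated
    H = np.maximum(X @ W_in, 0.0)                    # ReLU hidden activations
    W_out, *_ = np.linalg.lstsq(H, Y, rcond=None)    # linear read-out
    return W_in, W_out

def elm_predict(X, W_in, W_out):
    return np.maximum(X @ W_in, 0.0) @ W_out
```

Because only the read-out layer is fitted, training cost is a single least-squares solve, which is what makes the layer-size sweep described above feasible.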

The Multi-Layer Perceptron (MLP)
An MLP was implemented to determine how much better a high capacity network performs compared to the ELM. Rectified linear units were used as hidden nodes, as recommended in Goodfellow et al. [41] (p. 168). Linear activation functions were used on the input and output nodes. The weights of the network were initialised using normalised initialisation, introduced by Glorot and Bengio [42]. Bias-weights were initialised to zero and kept constant during training, essentially disabling them. The network was trained to minimise the mean absolute error (MAE) between the predicted source states P̂ and the actual states P. This error metric differs from E_norm (Equation (6)) because the MAE does not normalise the individual dimensions and does not consider the circular nature of ϕ.
As indicated earlier, 80% of the training set T was used for training, the remainder for validation. Each velocity measurement was pre-processed by averaging out the time dimension and normalising by its maximum absolute value. The Adam [43] optimisation algorithm was used with the recommended values for the gradient decay factor ρ 1 = 0.9, the squared gradient decay factor ρ 2 = 0.999, and denominator offset δ = 10 −8 [41] (p. 301). Weight decay was applied with a factor of 10 −4 . A decay factor of 0.1 was applied to the learning rate every 10 epochs. The remaining 20% of the training set was used to compute a validation error every 50 iterations. The validation set and training set were each shuffled every epoch. The training was stopped when the validation error did not reach a new minimum in the last 5 evaluations, or the number of epochs exceeded 500. The minibatch size was 2048. The network was pre-trained in three stages to improve the performance on source states close to the sensors, first using states within 20 cm, then 40 cm, and finally 60 cm of the origin.
Three hyperparameters were optimised: the learning rate, ranging from 10⁻⁴ to 10⁻¹; the number of layers n, ranging from n = 1 to n = 4; and the layer size n̄, ranging from n̄ = 16 to n̄ = 1024. Each layer had the same number of nodes to simplify the optimisation. The best combination of parameters was determined in 30 iterations of the Bayesian optimisation algorithm provided by MATLAB [38].
The Gauss-Newton (GN) Algorithm
GN was implemented for dipole localisation by Abdulsadda and Tan [8]. The algorithm does not use the training set T. Instead, it iteratively fits a potential flow model to the absolute value of the to-be-predicted source state's velocity measurements with the time dimension averaged out, |ṽ|. Let θ_0 = (b_0, d_0, ϕ_0) be an initial estimate. Then, the next iteration is given by [8]:

θ_{k+1} = θ_k + λ (J_kᵀ J_k)⁻¹ J_kᵀ (|ṽ| − |v(θ_k)|), with J_k = ∂|v(θ_k)| / ∂θ,

where λ is a step size parameter, and v(θ_k) is the potential flow of a source state θ_k computed using Equation (1). The algorithm terminates when the change in θ is smaller than ε = 1 × 10⁻³, the number of iterations exceeds 100, or the matrix inversion could not be computed due to a singular matrix. The gradient of |v(θ_k)| was estimated numerically using a step size of δ = 1 × 10⁻³. The step size was λ = 1, because every iteration solves a linearised version of the fitting problem (see [8]). The simulated domain's centre was used as the initial estimate (b_0 = 0 m and ϕ_0 = π rad). However, the centre of the d-domain may not be the optimal value for d_0 because states close to the sensors are typically localised more accurately, and the distance between the actual source position and the initial estimate influences the convergence of GN [8]. Therefore, the value of d_0 was optimised and chosen from the range d_0 = 0.025 m to d_0 = 0.525 m, with a step size of 0.025 m. Differently from Abdulsadda and Tan [8], we applied an upper and lower bound on θ_k. After every iteration, the values of b_k and d_k were clipped to the simulated domain (see Figure 2), and a modulo 2π operation was applied to ϕ_k.
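The iteration can be sketched generically as follows; this is a minimal Gauss-Newton loop with a numerical Jacobian, not the authors' implementation, and it omits the domain clipping described above:

```python
import numpy as np

def gauss_newton(residual, theta0, lam=1.0, tol=1e-3, max_iter=100, delta=1e-3):
    """Generic Gauss-Newton loop. `residual(theta)` returns the fitting
    residual, e.g. |v_measured| - |v(theta)| for the dipole model."""
    theta = np.asarray(theta0, float)
    for _ in range(max_iter):
        r = np.asarray(residual(theta), float)
        J = np.empty((r.size, theta.size))        # numerical Jacobian of r
        for j in range(theta.size):
            step = np.zeros_like(theta)
            step[j] = delta
            J[:, j] = (residual(theta + step) - r) / delta
        try:
            update = lam * np.linalg.solve(J.T @ J, J.T @ r)
        except np.linalg.LinAlgError:
            break                                  # singular normal equations
        theta = theta - update                     # r = y - f, J = -df/dtheta
        if np.linalg.norm(update) < tol:
            break
    return theta
```

Each iteration solves the linearised least-squares problem exactly, which is why λ = 1 is a sensible default for this class of fit.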
A single post-processing step was performed to improve the movement direction estimation. GN is not able to differentiate between sources at the same position with opposite orientations because it uses the absolute velocity measurements to fit a potential flow. To solve this issue-as with the LCMV beamforming algorithm-we computed the predicted state's expected potential flow values for both the predicted and opposite movement direction. The movement direction with the smallest difference to the measured velocity was used as the final estimate.

The Newton-Raphson (NR) Algorithm
NR was also introduced for dipole localisation by Abdulsadda and Tan [8]. The algorithm is very similar to GN; only the hyperparameter optimisation and update step are different. Let θ_0 = (b_0, d_0, ϕ_0) be an initial estimate. Then, the next iteration is given by [8]:

θ_{k+1} = θ_k − λ G(θ_k)⁻¹ g(θ_k),

with

g(θ) = ∂E(θ)/∂θ and G(θ) = ∂²E(θ)/∂θ², where E(θ) = ‖|ṽ| − |v(θ)|‖²,

and where λ is a step size parameter, v(θ_k) is the potential flow of a source θ_k computed using Equation (1), and |ṽ| is the absolute value of the to-be-predicted source state's velocity measurements with the time dimension averaged out. The gradient g(θ) and Hessian G(θ) were estimated numerically with a step size of δ = 1 × 10⁻³. The step size λ, stopping conditions, bounds check, and initial estimate were the same as for GN. Different from GN, a norm limit l was applied to the change in θ in each iteration. The value of l was optimised, ranging from l = 0.1 to l = 1. The best combination of d_0 and l was determined in 30 iterations of the Bayesian optimisation algorithm provided by MATLAB [38]. The same post-processing step was applied as in GN, to improve the movement direction estimation.

The Least Square Curve Fit (LSQ) Algorithm
The LSQ predictor was implemented as another baseline for comparing performance. The algorithm requires no training and has no hyperparameters to optimise, so the training set T and its size have no influence on LSQ's performance. Consequently, the algorithm is only used in the second analysis method.
LSQ computes a prediction using the lsqcurvefit function from MATLAB [38]. The algorithm fits a potential flow model v(θ) (Equation (1)) to v: the to-be-predicted source state's velocity measurement with the time dimension averaged out. The lsqcurvefit function implements the trust-region-reflective algorithm (see [44]). Lower and upper bounds were provided for all elements of θ = (b, d, ϕ) to limit their values to the simulated domain (see Section 2.2). Gradients were estimated numerically by lsqcurvefit, using a step size of δ = 1 × 10⁻³. The algorithm terminated when the number of iterations exceeded 100 or the change in θ was smaller than ε = 1 × 10⁻³. The function and optimality tolerance checks were set to machine precision. The initial estimate θ_0 is identical to that of GN.
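In Python, the closest analogue to MATLAB's lsqcurvefit with the trust-region-reflective algorithm is scipy.optimize.least_squares with method="trf". The sketch below mirrors the settings described above (bounds, numeric gradients with step 1 × 10⁻³, an evaluation cap) on a hypothetical two-parameter model, not the potential flow of Equation (1).

```python
import numpy as np
from scipy.optimize import least_squares

def v_model(theta, x):
    """Hypothetical two-parameter stand-in for the potential flow model."""
    return theta[0] * np.exp(-theta[1] * x)

x = np.linspace(0.0, 1.0, 8)
v_bar = v_model([1.5, 2.0], x)            # "measured" time-averaged velocities

res = least_squares(
    lambda t: v_model(t, x) - v_bar,      # residuals to minimise
    x0=[1.0, 1.0],                        # initial estimate
    bounds=([0.0, 0.0], [10.0, 10.0]),    # limit theta to the domain
    method="trf",                         # trust-region-reflective, as lsqcurvefit
    diff_step=1e-3,                       # numeric gradient step
    max_nfev=100,                         # evaluation cap
)
```

As with lsqcurvefit, bound constraints are handled inside the trust-region iteration rather than by clipping after each step.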

The Quadrature Method (QM) Algorithm
QM is a newly proposed localisation algorithm that takes advantage of 2D-sensitive sensors. The algorithm combines the measurements of v_x and v_y into a curve, ψ_quad, whose characteristics are nearly independent of the orientation of a source. Combining v_x and v_y in this way is somewhat analogous to computing the overall amplitude of two quadrature time signals. Appendix D explains the properties of the quadrature curve in more detail. In summary, the factor 1/2 in this combination can be shown to negate the influence of the orientation ϕ at the maximum of the curve, allowing for an estimate of the lateral position b that is virtually independent of the orientation ϕ. The distance of the source d is linearly related to a measure of the width of ψ_quad. Two so-called anchor points ρ±_anch are used to measure this width. These anchor points are located where ψ_quad takes a value of about 0.458 times the curve's maximum, and their location is almost independent of the dipole's orientation. Note that ρ (Equation (12)) describes a source state's location along the sensor array normalised by its distance. Practically, though, the locations of these anchor points are estimated in terms of x. Given the anchor points' locations x±_anch, the source distance d is computed as:

d = (x⁺_anch − x⁻_anch) / (ρ⁺_anch − ρ⁻_anch).

In practice, the locations of the anchor points were estimated by linearly interpolating ψ_quad between the two sensors where ψ_quad intersected 0.458 times its maximum value.
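The mechanics of the anchor-point estimate can be illustrated independently of ψ_quad's exact definition (which is given in Appendix D). The sketch below therefore uses a hypothetical stand-in curve, 1/(1 + ρ²), for which the 0.458-crossing constant ρ_anch is known in closed form; only the maximum-to-b and width-to-d steps are the point here.

```python
import numpy as np

def anchor_point_estimate(x, curve, rho_anch):
    """Estimate (b, d) from a curve sampled at sensor positions x."""
    b_hat = x[np.argmax(curve)]              # curve peaks near the source
    level = 0.458 * curve.max()              # anchor-point level
    above = np.flatnonzero(curve >= level)
    i, j = above[0], above[-1]               # first/last sensor above level

    def interp(k0, k1):                      # linearly interpolate the crossing
        f = (level - curve[k0]) / (curve[k1] - curve[k0])
        return x[k0] + f * (x[k1] - x[k0])

    x_minus = interp(i - 1, i)               # left anchor point
    x_plus = interp(j + 1, j)                # right anchor point
    d_hat = (x_plus - x_minus) / (2 * rho_anch)   # width is linear in d
    return b_hat, d_hat

# Stand-in curve for a hypothetical source at b = 0.1 m, d = 0.2 m:
b, d = 0.1, 0.2
x = np.linspace(-1.0, 1.0, 64)               # sensor positions (dense demo)
curve = 1.0 / (1.0 + ((x - b) / d) ** 2)
rho_anch = np.sqrt(1 / 0.458 - 1)            # 0.458-crossing of the stand-in
b_hat, d_hat = anchor_point_estimate(x, curve, rho_anch)
```

The estimate degrades exactly as described below: a coarser sensor grid limits the interpolation accuracy, and the method fails outright when a crossing falls outside the array.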
An accurate estimate of the movement direction ϕ can be computed directly from the measured velocity values once the source state's position p = (b, d) is known. For this purpose, the measured velocity is analysed using the wavelets from the CWT (see Section 2.5.4). Figure 4 visualises the wavelets and their form in a 3D ψ_e-ψ_o-ρ space. The movement direction can be recovered from ψ_e-ψ_o slices of this 3D space at the sensor locations (Figure 4c). In these slices, the wavelets and the measured velocity are vectors (ψ_e, ψ_o, and v_x) with lengths corresponding to their values at the sensor. Note that all these values are known because the wavelets can be computed from the estimated position. The angle between v_x and ψ_e corresponds to the to-be-predicted movement direction ϕ. To compute ϕ, we use the vector combination of the wavelets ψ_env = ψ_e + ψ_o, which has length ψ_env = √(ψ_e² + ψ_o²). The angle between ψ_env and ψ_e is ϕ′ = atan(ψ_o/ψ_e). The difference between ϕ and ϕ′ is α = acos(v_x/ψ_env) because v_x is a linear combination of the two wavelets. Consequently, the movement direction ϕ can be estimated using:

ϕ̂ = ϕ′ ± α.

Depending on the source's position and movement direction and the sensors' location, α should be added to or subtracted from ϕ′. Therefore, two estimates were computed for each sensor's v_x measurement. The circular median of all sixteen estimates was used as the final prediction. This estimation approach also works for v_y when using ψ_o and ψ_n in place of ψ_e and ψ_o.
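The per-sensor candidates ϕ′ ± α and the circular median can be sketched as follows. The wavelet expressions used in the demo are the standard even and odd dipole wavelets and are an assumption here (the paper defines them via Equation (1)); the circular median is implemented as a simple medoid over all candidate angles.

```python
import numpy as np

def wrap(a):
    """Wrap angle(s) to [-pi, pi)."""
    return (a + np.pi) % (2 * np.pi) - np.pi

def estimate_phi(vx, psi_e, psi_o):
    """Orientation estimate from measured v_x at each sensor and the wavelet
    values psi_e, psi_o computed from an estimated source position. Each
    sensor yields two candidates phi' +/- alpha; a medoid-style circular
    median over all candidates is returned."""
    psi_env = np.hypot(psi_e, psi_o)               # length of psi_env vector
    phi_prime = np.arctan2(psi_o, psi_e)           # angle of psi_env vector
    alpha = np.arccos(np.clip(vx / psi_env, -1.0, 1.0))
    cand = np.concatenate([phi_prime + alpha, phi_prime - alpha])
    # circular median: the candidate minimising the summed circular distance
    dists = np.abs(wrap(cand[:, None] - cand[None, :])).sum(axis=1)
    return wrap(cand[np.argmin(dists)])

# Demo with assumed standard dipole wavelets for a known source state.
b, d, phi = 0.05, 0.3, 0.7
x = np.linspace(-0.2, 0.2, 8)                      # eight sensor positions
rho = (x - b) / d                                  # normalised positions
psi_e = (2 * rho ** 2 - 1) / (1 + rho ** 2) ** 2.5
psi_o = -3 * rho / (1 + rho ** 2) ** 2.5
vx = psi_e * np.cos(phi) + psi_o * np.sin(phi)     # noise-free v_x
phi_hat = estimate_phi(vx, psi_e, psi_o)
```

In the noise-free case each sensor contributes one candidate equal to ϕ and one spurious candidate, so the circular median recovers the true orientation.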
Predictions based on the anchor points' estimated locations are inherently limited: the spatial resolution of the sensor array bounds the accuracy with which the anchor points can be located, and one or both anchor points may fall outside the measurable range of the ALL.
A refinement step was introduced to improve the predictions. First, the position estimate was improved by fitting a potential flow model to ψ_quad. The fit was computed using the lsqcurvefit function from MATLAB [38]. The position and orientation estimates computed using the anchor points were used as the starting estimate θ_0 = (b_0, d_0, ϕ_0). Lower and upper bounds were provided for all elements of θ. Gradients were estimated numerically by lsqcurvefit, with a step size of δ = 1 × 10⁻³. The fit procedure terminated when the number of iterations exceeded 10 or the change in θ was smaller than ε = 1 × 10⁻³. The function and optimality tolerance checks were set to machine precision. Then, the orientation was re-estimated using the improved position estimate. The complete refinement step was repeated four times, each iteration using the previously estimated position and orientation as the starting point.

Figure 4. […] projection of this 3D wavelet. For instance, a projection on the ρ-ψ_e plane (bottom plane) yields ψ_e for ϕ = 0 rad. For a general angle ϕ, the measured velocity profile is a projection on a plane through the ρ axis subtending an angle ϕ with the ρ-ψ_e plane. (c) Diagram illustrating the geometric relation between a measured v_x, the angles α and ϕ′ which are constrained via ψ_env, and the movement orientation ϕ. We show a slice of ψ_env (green) in the ψ_e-ψ_o plane for a fixed value of ρ. The velocity value at this fixed ρ is a vector v_x (black) in this space. It has a length v_x ∝ ψ_e cos ϕ + ψ_o sin ϕ and angle ϕ. The contributions of ψ_e (blue) and ψ_o (orange) to v_x are shown in yellow and purple. The angle ϕ′ of ψ_env can be computed directly from an estimated source position. Given that the difference between ϕ and ϕ′ is α = acos(v_x/ψ_env), we can compute an estimate of ϕ at every sensor using only measured velocity values and a position estimate.

Results
In this research, ten dipole localisation algorithms were compared by the area in which they accurately estimate the position and movement direction of an object. Specifically, we report the areas with a median position error E_p below 1 cm, 3 cm, 5 cm, and 9 cm, and the areas with a median orientation error E_ϕ below 0.01π rad, 0.03π rad, 0.05π rad, and 0.09π rad. Section 3.1 presents the results of the first analysis, where we varied the available data set size, and Section 3.2 presents the results of the second analysis, where the sensor sensitivity direction was varied. Section 3.3 provides additional results to compare the localisation algorithms, including the performance of the best three algorithms on simulated sensors with lower SNRs.

Analysis Method 1: Amount of Training and Optimisation Data
This analysis method varied the number of measurements the algorithms could use to train and optimise their hyperparameters, to show how this influences the algorithms' performance. LSQ was not included in this analysis because it neither requires training nor has hyperparameters to optimise. However, LSQ's results from the comparable (x + y) condition of Analysis Method 2 are shown to allow for a comparison of the performance of all algorithms.
The results of this analysis are summarised in Figures 5 and 6. These figures visualise the total area in which the median position error and median orientation error were below their respective thresholds. There are several observations of note. Firstly, as expected, the model-based algorithms' areas did not increase with the amount of training and optimisation data. With D_s = 0.09, we found 0.21 m² for QM, 0.22 m² for GN, and 0.11 m² for NR at E_p ≤ 1 cm. In contrast, the template-based algorithms' and the neural networks' areas did increase with the training and optimisation sets. Using the largest data set (D_s = 0.01), these algorithms achieved a position error lower than 1 cm in areas of 0.18 m² for the MLP, 0.14 m² for KNN, 0.12 m² for LCMV, 0.10 m² for the ELM, and 0.00 m² for the CWT. Only the random predictor had a median position error larger than 9 cm everywhere.
Secondly, only QM, GN, and NR had a median orientation error below 0.01π rad in some part of the simulated domain with D_s = 0.09, with areas of 0.06 m², 0.06 m², and 0.02 m², respectively. With the largest training and optimisation set (D_s = 0.01), the MLP and KNN reached the 0.03π rad threshold in 0.09 m² and 0.02 m², respectively, whereas LCMV only reached 0.05π rad, in an area of 0.04 m². The other predictors had a median orientation error larger than 0.09π rad everywhere.
The predictors' performance is not only characterised by the area in which they work well; it is also important to show how accurate the predictions were in areas where they did not work well. To provide a complete picture of the predictor performance, Figures 7 and 8 visualise the distributions of the position and orientation errors, respectively.

Figure 5. Total areas with a median position error E_p (Equation (4)) below 1 cm, 3 cm, 5 cm, and 9 cm for the training and optimisation sets with a varying minimum distance between source states D_s (see Section 2.3 and Table 1) and the (x + y) sensor configuration at the σ = 1.0 × 10⁻⁵ m s⁻¹ noise level. The median position error was computed for 2 × 2 cm² cells. Note, the bar for LSQ* is based on the (x + y) condition in Analysis Method 2.

Figure 6. Total areas with a median movement direction error E_ϕ (Equation (5)) below 0.01π rad, 0.03π rad, and 0.05π rad for the training and optimisation sets with a varying minimum distance between source states D_s (see Section 2.3 and Table 1) and the (x + y) sensor configuration at the σ = 1.0 × 10⁻⁵ m s⁻¹ noise level. The median movement direction error was computed for 2 × 2 cm² cells. Note, the bar for LSQ* is based on the (x + y) condition in Analysis Method 2.

Firstly, note that all dipole localisation algorithms had lower median position errors than the random predictor with all training and optimisation sets. This difference was smaller for the median orientation error. In particular, ELM's orientation error distribution with the smallest training and optimisation set was similar to that of the random predictor. Secondly, the position error distribution of NR had a large tail; with D_s = 0.09, the 75th and 95th percentiles were 0.81 m and 0.98 m, respectively. The GN predictor also had a higher 95th percentile (0.32 m) than the otherwise similarly performing QM predictor (0.14 m) with D_s = 0.09. The same relation was also present in the orientation error distributions.
The 95th percentile of GN (2.28 rad) was larger than that of QM (0.92 rad) with D_s = 0.09.
Another interesting observation is the difference between the development of the orientation error distributions of the MLP and KNN. The errors of KNN decreased gradually with each increase in training and optimisation data, whereas the errors of the MLP decreased drastically between D_s = 0.03 and D_s = 0.01. Note that the MLP used four layers with D_s = 0.01 and one layer with all other training and optimisation sets (see Table A1). Finally, unlike the position error, the CWT's orientation errors were larger with the largest optimisation set than with the smaller sets.

Analysis Method 2: Sensor Sensitivity Axes
This analysis method varied which velocity components were measured by the sensors to determine how this influences the algorithms' performance. The QM predictor was not included in this analysis because it requires both velocity components to be measured by all sensors. However, QM's results from the comparable D_s = 0.01 condition of Analysis Method 1 are shown to allow for a comparison of the performance of all algorithms.
The results of this analysis are summarised in Figures 9 and 10. These figures visualise the total area in which the median position error and median movement direction error were below their respective thresholds. In configuration (x + y), which measured both velocity components at all sensors, the GN predictor had the largest area with a median position error lower than or equal to 1 cm (0.22 m²). The MLP and LSQ predictors followed with 0.19 m² and 0.18 m², respectively. The NR predictor (0.11 m²) performed worse than KNN (0.14 m²) and LCMV (0.12 m²). The ELM did not perform well at the 1 cm threshold (0.01 m²). However, its areas at the higher thresholds were similar to those of the MLP.
The CWT and RND predictors had a median position error larger than 1 cm in the entire simulated domain.
In the other configurations, LSQ performed best; its median position error was lower than or equal to 1 cm in 0.12 m² in configuration (x), and in 0.15 m² in configurations (y) and (x|y). In general, the areas with median position errors lower than or equal to 3 cm were larger with the alternating configuration (x|y) than with configurations (x) or (y). The CWT predictor is an exception; it performed best in configuration (x) at all position error thresholds. For GN and NR, the larger area at the 3 cm threshold in configuration (x|y) went along with a smaller area at the 1 cm threshold compared to configuration (x).
The movement direction errors, shown in Figure 10, indicate a slightly different pattern. The LSQ predictor had the largest areas with a median movement direction error below 0.01π rad in all configurations (0.05 m² in (x) and (x|y), 0.07 m² in (y), and 0.08 m² in (x + y)). The GN predictor performed better than LSQ at the higher thresholds in configuration (x + y) (0.19 m² vs. 0.15 m² at E_ϕ ≤ 0.05π rad and 0.26 m² vs. 0.18 m² at E_ϕ ≤ 0.09π rad). Interestingly, the MLP performed better than KNN, especially in configurations (x|y) and (x + y) (see Figure 10).
Similar to the first analysis (Section 3.1), we also show the error distributions, in Figures 11 and 12.

Additional Results
This section presents additional results of the two analysis methods and the performance of the three best algorithms on simulated sensors with lower SNRs. The algorithms' training and prediction times are reported, and spatial and polar maps of the median errors are shown, visualising the areas in which the predictors performed well and the effect of the source's orientation on the prediction accuracy.

Table 3 shows each localisation algorithm's average prediction time using both velocity components, and the total training time for the largest training set (D_s = 0.01). All algorithms were evaluated on a high-performance computing cluster, using 12 cores of an Intel Xeon 2.6 GHz processor and 64 GB RAM. The run-time performance of the implementations was optimised for computing many predictions simultaneously; as a result, the average prediction times may not be representative of the single-source prediction time. The random predictor had the shortest prediction time, followed by the MLP and ELM. KNN was also one of the quicker algorithms, about three times slower than the MLP. The model-based predictors were slower: the MLP could compute roughly four, eight, and nine predictions in the time it took GN, LSQ, and QM, respectively, to compute one. The remaining algorithms were considerably slower than the MLP (36 times for LCMV, 175 times for NR, and 300 times for CWT). Most of the CWT's prediction time came from fitting the Gaussian to the coefficients. It should be noted that the computational aspects of our implementations have not been extensively optimised, and the degree of optimisation may have varied between algorithms.

Figure 13 shows the spatial contours of the median position error E_p and median movement direction error E_ϕ. These figures show in which areas the algorithms were able to compute an accurate prediction. Both QM and GN had a median position error within 1 cm up to roughly 35 cm from the sensor array.
The GN predictor was better at locating source states in the lower corners of the simulated domain than LSQ and QM. However, its predictions directly in front of the sensors were worse. The MLP and KNN also performed well in the lower corners of the simulated domain. However, their median position error was lower than 1 cm only up to roughly 25 cm. For the LSQ and LCMV predictors, the median position errors were smaller than 1 cm up to 30 cm and 20 cm, respectively. These algorithms showed a quick drop in performance as the distance of a source state increased; their median position errors were larger than 9 cm beyond 30 cm and 45 cm from the sensor array, respectively. QM, GN, and the MLP had a median position error lower than 9 cm at least up to 50 cm. The contours of the orientation errors were similar. Comparing the two neural networks, the MLP had more accurate movement direction predictions than the ELM and also covered a larger area of the simulated domain.
Polar contours of the median position and median movement direction errors are shown in Figure 14, indicating the effect of source movement direction ϕ and distance d on the predictions. QM and GN performed similarly in the (x + y) configuration; both algorithms had accurate position and movement direction predictions for all source orientations ϕ. In configuration (x + y), QM's movement direction prediction was slightly more accurate at longer distances for parallel and perpendicular movement directions. Closer to the sensors, QM's movement direction prediction was more accurate for source states with a parallel orientation. GN had accurate movement direction predictions for parallel source states at all distances in configuration (x + y). The difference in movement direction estimation between QM and GN shown in Figure 14 can be explained by GN's wider range of accurate predictions visible in Figure 13. LSQ had trouble predicting the position and movement direction of source states with orientations between ϕ = ±0.25π rad, except for parallel sources (ϕ = 0 rad), in configurations (x + y) and (y). GN's localisation performance in configurations (x) and (y) is interesting as well, as it favoured perpendicular source states. On the other hand, GN's movement direction predictions in configuration (y) were inaccurate regardless of the source state's orientation ϕ and distance d.

Figure 15 shows how QM, GN, and the MLP perform using sensors with lower SNRs. Only the position error is shown. Appendix E contains a similar figure for the movement direction errors, as well as the spatial and polar projections. QM has the largest area with median position errors below 1 cm at the higher noise levels. However, from σ = 1 × 10⁻⁴ m s⁻¹ upwards, the MLP has lower median and 75th percentile values.

Discussion
In Analysis Method 1, we confirmed that (1) the total area in which the model-based algorithms produce accurate predictions does not depend on the amount of training and optimisation data, and (2) the template-based algorithms' and neural networks' areas with accurate predictions increase with the amount of training and optimisation data (Figures 5 and 6). The MLP, in particular, benefits from large amounts of training data. However, only with our largest training and optimisation set (90,435 states) are the MLP and KNN able to approach QM's and GN's performance. Even then, their movement direction predictions are no match for those of QM and GN. Whether the MLP or KNN provides the better performance depends on the amount of training and optimisation data: the MLP performed better with our smallest and largest sets (Section 3.1); in the other cases, KNN performed better.
In Analysis Method 2, we demonstrated the benefit of 2D-sensitive sensors compared to 1D-sensitive sensors. Improved position and movement direction estimation of ELMs using 2D sensors compared to using only v_x was previously shown by Wolf and van Netten [45]. In the present study, we showed that other localisation algorithms also benefit from using both velocity components. In addition, we demonstrated that the alternating sensor configuration (x|y) used by Yang et al. [12] and Nguyen et al. [13] is not an adequate substitute for 2D-sensitive sensors at the spatial resolution used here. However, it does improve performance compared to measuring a single velocity component along the entire sensor array. We speculate that the performance difference between the alternating sensor configuration and 2D-sensitive sensors diminishes as the sensor array's spatial resolution increases.
Finally, we showed that the newly introduced QM provided the best overall performance with 2D sensors, regardless of the data set size. Its areas with a median position error below 1 cm and a median movement direction error below 0.01π rad were as large as those of GN (Figures 5 and 6). However, the tails of QM's error distributions were shorter than those of GN (Figures 7 and 8). In other words, QM's less-accurate predictions were better than those of GN. With 1D sensors, LSQ performed best, except for source states with an orientation close to ϕ = 0 rad.
It should be noted that the model-based algorithms depend on an accurate forward model. Fitting a potential flow model only works when it accurately describes the actual velocity measurements. When more complicated hydrodynamic phenomena are present in the measurements, a forward model may become quite complex. In that case, the MLP and KNN may be good alternatives. However, these algorithms require a lot of training and optimisation data to reach a similar performance as the model-based algorithms.
In the following subsections, we continue the discussion with remarks specific to the individual algorithms.

The Gauss-Newton (GN) Algorithm
GN's performance in this work was mostly in line with the results presented by Abdulsadda and Tan [8]. They showed that GN has a superior localisation performance compared to LCMV beamforming and template matching. In addition, they highlighted that the convergence behaviour of GN heavily depends on the initial estimate. In their simulation, the largest region of convergence had a radius of 1.5 cm, depending on the source's position [8]. In the current study, we did not estimate the region of convergence. Instead, we showed that-based on the median position and movement direction errors in the simulated domain-GN can be used in a larger area, especially when 2D sensitive sensors are used. However, there is no guarantee of convergence in that larger area, as is evident from the large tail in the error distributions (Figures 11 and 12). To improve the convergence rate, one could potentially use the estimated position and movement direction of another prediction method as the initial estimate of GN.
A difference between our results and those of Abdulsadda and Tan [8] is the effect of the source state's orientation on the prediction accuracy when only v_x or only v_y is used (Figure 14). We found that parallel source states are localised less accurately than other orientations when only v_x is used. Abdulsadda and Tan [8] did not report such an effect. In their results, four of the 19 reported states had a roughly parallel orientation. Due to those states' positions, it is difficult to determine how much their movement direction influenced the prediction accuracy. This effect may therefore not have been observable with their experimental setup.
Another difference from Abdulsadda and Tan [8] lies in our step tolerance and maximum number of iterations. Both hyperparameters determine the accuracy and computational cost of the algorithm. For our experiments, a prediction within 1 mm and 0.001π rad was sufficient. The maximum number of iterations was reduced to 100, compared to the 2500 used by Abdulsadda and Tan [8]. We found that the limit of 100 iterations did not prevent a prediction from reaching the outer edges of the simulated domain. Depending on the use case, these parameters can be tuned to provide the best performance.
Finally, unlike Abdulsadda and Tan [8], we applied a bounds check on each iteration's estimated position and movement direction to keep the estimates within the simulated domain (Figure 2). Since we did not compare GN's performance with and without the bounds check, we cannot draw conclusions about its effect. However, we expect the bounds check to have two advantageous effects. Firstly, it may cause non-converging predictions to terminate earlier because, depending on their trajectory, the difference between estimates in subsequent iterations may become smaller than ε = 1.0 × 10⁻³. Secondly, it may cause some otherwise non-converging predictions to converge: instead of continuing on their trajectory, these estimates move along the simulated domain's boundaries and find a path towards the solution.

The Newton-Raphson (NR) Algorithm
Unlike in the results of Abdulsadda and Tan [8], NR did not perform similarly to GN in our analyses. Computing NR's predictions took more than 40 times as long as for GN (Table 3). The increased run-time cost of NR in our implementation is probably due to the numerical estimation of the Hessian. The difference in localisation performance may also be due to this numerical estimation. The Newton method (Equation (15)) requires the Hessian to be positive definite; otherwise, it may move the prediction towards a saddle point instead of a minimum [41] (pp. 279, 304). Manual inspection of the Hessian found negative eigenvalues in all iterations for the inspected source states, regardless of the prediction accuracy. A preliminary hyperparameter validation showed that the regularisation strategy suggested by Goodfellow et al. [41] (p. 304) to solve this issue did not improve the performance of NR. From their publication, it is not clear whether Abdulsadda and Tan [8] computed the Hessian analytically. Alternatively, the difference in performance may be explained if their initial estimates were within NR's region of convergence.
In our implementation of NR, we applied the same bounds check as for GN. Since we did not compare the performance of NR with and without the bounds check, we cannot draw conclusions about its effect. However, we expect the same improvements as with GN.

The Multi-Layer Perceptron (MLP) and Extreme Learning Machine (ELM)
We showed that an MLP performs considerably better than an ELM when a large amount of training data is available (90,435 source states) (Figures 9 and 10). The main differences between these neural networks are the training procedure and the higher capacity of the MLP. The MLP was trained to minimise the mean absolute error (MAE), whereas the ELM minimises the mean squared error (MSE). The difference in capacity was especially large when the largest training and optimisation set was used with 2D-sensitive sensors. In that case, the MLP had roughly ten times more weights than the ELM and roughly sixty times more trainable weights. When the smaller training and optimisation sets were used, the MLP performed more similarly to the ELM and used only a single layer.
The comparison of QM, GN and the MLP on simulated sensors with lower SNRs showed the robustness of the MLP against noise (Figure 15). This finding is in line with Boulogne et al. [9], who showed that the MLP was more robust to noise than ELMs and echo state networks (ESNs).
The performance of the MLP may be improved further by also training bias-weights. Bias-weights serve as an activation-offset for each node in a layer, providing additional flexibility in the activation function. Bias-weights increase the capacity of the network and enable nodes to output non-zero values when their input is zero. We expect that the network's prediction of, in particular, the source distance benefits from bias-weights because the distances are not centred around the same value as the input.

The Quadrature Method (QM) Algorithm
In this study, we introduced the QM dipole localisation algorithm. The algorithm is designed for 2D sensitive sensors and provides state-of-the-art performance. Compared to GN, the algorithm produces more accurate predictions close to the sensor array ( Figure 13) and has less skewed error distributions (Figures 7 and 8). These results indicate that the iterative refinement procedure of QM converges better than the non-linear optimisation of GN. In addition, the effective area of QM was larger than that of GN when using simulated sensors with lower SNRs (Figure 15).
Aside from its performance, QM has several attractive attributes. For instance, the orientation estimation procedure is very quick, with a computational complexity linear in the number of sensors, and only depends on the measured velocity and an estimate of the source's position. As a result, any algorithm that estimates a source's position can use QM's orientation estimation. In addition, QM is able to compute an initial position estimate directly from the measured velocity, also with a computational complexity linear in the number of sensors. Consequently, QM does not require a hyperparameter specifying an arbitrary initial estimate. It should be noted that this initial estimate is accurate only in a limited area, as it depends on anchor points on ψ_quad that have to fall within the sensor array (Appendix D). Finally, the run-time of QM is easily tuned using two simple hyperparameters: the number of refinement iterations and the number of iterations used to fit the potential flow model. The hyperparameter values used in this study resulted in an average prediction time roughly two and a half times that of GN.
Several interesting research questions about QM remain unanswered. For instance, we did not analyse QM's performance using only its initial estimate. Especially for higher-resolution sensor arrays, these estimates may perform quite well compared to the other algorithms. In addition, other algorithms may benefit from using QM's initial estimate. For example, GN's performance close to the sensor array may improve when QM's initial estimate is used as the starting point for fitting the potential flow model. Other two-stage combinations of algorithms may also be interesting to develop.

Future Research Directions and Possible Applications
In the present research, ten dipole localisation algorithms were compared using a stationary ALL in a 2D environment. Applications of ALLs typically operate in more complex environments that are embedded in 3D and may include self-motion. The shape and movement of a sensing platform itself would have a significant impact on the flow fields, an effect that was omitted from this analysis (see, for instance, Windsor et al. [46] for an analysis of flow fields around gliding blind cave fish). Localising sources in 3D requires different sensor configurations. Yang et al. [12] used a sensor array consisting of two orthogonal lines on a cylinder to demonstrate LCMV beamforming's 3D localisation performance. Wolf et al. [15] used two parallel ALLs to localise multiple simultaneous sources in 3D. Analysing QM's performance on 3D localisation tasks would be an interesting future research project.
Other interesting future research directions include using more realistic fluid flow simulations. For instance, Lin et al. [47] used a Navier-Stokes equation solver to predict the ratio of a source's distance and size based on the measured velocity amplitude range. Additionally, it remains unclear how well the algorithms' performances transfer to localising differently shaped or self-propelled objects. Finally, the algorithms could be compared on their performance locating moving objects instead of stationary dipoles.

Conclusions
In the present study, we compared a wide range of algorithms for determining a dipole's position and orientation in the vicinity of a flow sensor array. To demonstrate the effect of the amount of available data to optimise or tune each algorithm, we sampled a bounded domain with four levels of granularity. To demonstrate the effects of the sensitivity directions of the flow sensors, we extended the implementation of existing algorithms to support data from four different sensor configurations: sensitivity parallel to the array, sensitivity at a right angle to the array, sensors with alternating sensitivity directions, and finally an array of 2D-sensitive sensors. These effects on the algorithms were quantified by the area in which an algorithm can correctly determine a dipole's position and orientation relative to the array with predefined degrees of accuracy. For further comparison, we also provided box plots to indicate the distribution of errors, as well as visual representations of these errors in the 2D spatial domain and the source orientation domain.
We demonstrated the benefit of 2D-sensitive sensors compared to 1D-sensitive sensors for a dipole localisation task. All considered algorithms benefited from using information from 2D-sensitive sensors, although the amount of improvement varied. The extension to 2D-sensitive sensors allowed the introduction of a novel dipole localisation algorithm coined the quadrature method (QM). This algorithm is designed to take advantage of geometric properties that result from 2D-sensitive flow measurements. We showed that QM provides state-of-the-art performance and produces more accurate predictions than the Gauss-Newton (GN) algorithm [8], especially for source positions close to the sensor array.
Finally, we analysed how dipole localisation algorithms' performance depends on the amount of training and optimisation data. We found that template-based algorithms and neural networks require large amounts of training data to approach the performance of model-based algorithms, which require only a small training and optimisation set in a simulation setting.
Since the simulated flow fields are based on the potential flow of a vibrating sphere, the results apply to dipole fields in general. Such fields are not restricted to those generated by submerged moving objects: exactly the same dipole field equations are associated with acoustic, electric, and magnetic phenomena. Therefore, the comparisons made in the present work may also be of interest in applications other than hydrodynamic imaging.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The data generated in this study, including the algorithms' predictions and the training and test source states, are available from Zenodo [48]. Our implementation of the algorithms in MATLAB R2018a [38] as used in this publication is also available from Zenodo [49].

Acknowledgments:
We would like to thank the Center for Information Technology of the University of Groningen for providing access to the Peregrine high performance computing cluster.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Abbreviations
The following abbreviations are used in this manuscript:

ALL	Artificial lateral line
AUV	Autonomous underwater vehicle
CWT	Continuous wavelet transform
GN	Gauss-Newton
LCMV	Linearly constrained minimum variance
MLP	Multi-layer perceptron
QM	Quadrature method
SNR	Signal-to-noise ratio

Appendix A. Final Hyperparameter Values
This appendix lists the final hyperparameter values as used in the analyses. Tables A1 and A2 provide the optimal hyperparameter values for Analysis Methods 1 and 2, respectively. Analysis Method 3 used the optimal values of Analysis Method 1's D_s = 0.01 condition. Section 2.4 explains the general approach used to find the hyperparameters' optimal values and Section 2.5 explains how each hyperparameter influences the dipole localisation algorithms.

Table A1. All hyperparameter values for the first analysis, optimised using the training set. This analysis varied the minimum distance D_s between source states in the training and optimisation sets to show how the amount of training and optimisation data influences the dipole localisation algorithms' performance. The (x + y) sensor configuration at the σ = 1.0 × 10⁻⁵ m s⁻¹ noise level was used in this analysis. Table 1 describes the respective data sets in more detail.

Appendix B. Potential Flow Wavelets
In this appendix, we derive the three wavelets used in Wolf and van Netten [14] from the potential flow formula [16] (Equation (1)):

v(s) = (a³ / (2||r||³)) ((3(w · r)/||r||²) r − w),

where a is the radius of the sphere, w = (w_x, w_y) are the velocity components of the moving sphere in 2D, and r = s − p is the position of the sensor s = (x, y) as seen from the source p = (b, d).

Appendix B.1. The Even Wavelet
The even wavelet ψ_e occurs in v_x when a source moves parallel to the sensor array [5,14]. Let w = (w_x, 0); then the parallel velocity component is given by:

v_x = (a³ w_x / (2||r||³)) (3(x − b)²/||r||² − 1).

To find the wavelet, ||r|| is rewritten as:

||r|| = |y − d| √(1 + ρ²), with ρ = (x − b)/(y − d).

Substituting this in the formula for v_x results in:

v_x = (a³ w_x / (2|y − d|³)) (3ρ² − (1 + ρ²)) / (1 + ρ²)^(5/2).

So, finally:

v_x = (a³ w_x / (2|y − d|³)) ψ_e(ρ), with ψ_e(ρ) = (2ρ² − 1) / (1 + ρ²)^(5/2).

Appendix B.2. The Odd Wavelet
The odd wavelet ψ_o occurs in v_x when a source moves perpendicular to the sensor array [5,14]. Let w = (0, w_y); then the parallel velocity component is given by:

v_x = (a³ w_y / (2||r||³)) (3(x − b)(y − d)/||r||²).

Rewriting and substituting ||r|| in the same way as in Appendix B.1 yields:

v_x = (a³ w_y / (2|y − d|³)) (3ρ / (1 + ρ²)^(5/2)).

So, finally:

v_x = (a³ w_y / (2|y − d|³)) ψ_o(ρ), with ψ_o(ρ) = 3ρ / (1 + ρ²)^(5/2).

Appendix B.3. The Navelet
The navelet (Not-A-waVELET) ψ_n occurs in v_y when a source moves perpendicular to the sensor array [14]. Let w = (0, w_y); then the perpendicular velocity component is given by:

v_y = (a³ w_y / (2||r||³)) (3(y − d)²/||r||² − 1).

Rewriting and substituting ||r|| in the same way as in Appendix B.1 results in:

v_y = (a³ w_y / (2|y − d|³)) (3 − (1 + ρ²)) / (1 + ρ²)^(5/2).

So, finally:

v_y = (a³ w_y / (2|y − d|³)) ψ_n(ρ), with ψ_n(ρ) = (2 − ρ²) / (1 + ρ²)^(5/2).
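As a numerical cross-check of the derivations in this appendix, the wavelet decomposition can be compared against the dipole field of Equation (1) directly. The sketch below is a minimal illustration in Python (the paper's implementation is in MATLAB); all function names are ours, and the wavelet signs follow the convention ρ = (x − b)/(y − d) used above:

```python
import numpy as np

def dipole_velocity(x, y, b, d, a, w):
    """Potential-flow (dipole) velocity of a sphere of radius a at (b, d),
    moving with velocity w = (wx, wy), sampled at sensor position (x, y)."""
    r = np.array([x - b, y - d])
    rn = np.linalg.norm(r)
    w = np.asarray(w, dtype=float)
    return a**3 / (2 * rn**3) * (3 * np.dot(w, r) / rn**2 * r - w)

# The three wavelets, with rho = (x - b)/(y - d):
def psi_e(rho): return (2 * rho**2 - 1) / (1 + rho**2) ** 2.5  # even
def psi_o(rho): return 3 * rho / (1 + rho**2) ** 2.5           # odd
def psi_n(rho): return (2 - rho**2) / (1 + rho**2) ** 2.5      # navelet

def wavelet_velocity(x, y, b, d, a, speed, phi):
    """The same velocity expressed via the wavelet decomposition,
    with w = speed * (cos(phi), sin(phi))."""
    rho = (x - b) / (y - d)
    amp = a**3 * speed / (2 * abs(y - d) ** 3)
    vx = amp * (psi_e(rho) * np.cos(phi) + psi_o(rho) * np.sin(phi))
    vy = amp * (psi_o(rho) * np.cos(phi) + psi_n(rho) * np.sin(phi))
    return np.array([vx, vy])
```

Both functions agree to floating-point precision for any sensor and source placement with y ≠ d; with the opposite sign convention for ρ, ψ_o flips sign.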

Appendix C. Movement Direction Estimation with the Continuous Wavelet Transform (CWT)
In this appendix, we show how the CWT can be used to estimate the direction of movement of a dipole source. In Section 2.5.4, we showed that potential flow (Equation (1)) can be expressed in terms of wavelets:

v_x = (a³ ||w|| / (2|y − d|³)) (ψ_e(ρ) cos ϕ + ψ_o(ρ) sin ϕ),
v_y = (a³ ||w|| / (2|y − d|³)) (ψ_o(ρ) cos ϕ + ψ_n(ρ) sin ϕ),

with ρ = (x − b)/(y − d), where a = 1 cm is the radius of the source, w = (w_x, w_y) is the movement velocity of the source, r = s − p is the relative position of the sensor s = (x, y) as seen from the source p = (b, d), and ϕ is the azimuth angle of the motion. In general, the CWT coefficients W f(u, s) are computed using [50]:

W f(u, s) = ∫ f(t) ψ*_{u,s}(t) dt,

where f(t) is the signal that is analysed and ψ_{u,s}(t) is a scaled (s) and translated (u) version of a mother wavelet ψ [50]:

ψ_{u,s}(t) = (1/√s) ψ((t − u)/s).

In the case of dipole localisation, four sets of CWT coefficients are computed:

W^e v_x(p̂), W^o v_x(p̂), W^o v_y(p̂), and W^n v_y(p̂),

i.e., the CWTs of v_x(x) with mother wavelets ψ_e and ψ_o, and of v_y(x) with mother wavelets ψ_o and ψ_n, evaluated at translation b̂ and scale d̂ of a candidate source position p̂ = (b̂, d̂). Here, v_x(x) and v_y(x) are the velocity components measured at a sensor s = (x, y) for a source at p = (b, d). Since the sensor array has a finite length, the information used by the CWT is in practice limited to sources that have most of their significant nonzero excitation profile projected onto the array. An important property of the wavelets is that they do not form a global orthonormal or orthogonal set of basis functions. This renders the method less vulnerable to possible noise contributions on the measurements of the sensor array. On the other hand, different scales and shifts of the mother wavelets are mixed, so that, especially in the case of closely spaced multiple sources, crosstalk over the reconstructed wavelets will occur. Consequently, the estimated spatial map formed by the maxima of the reconstruction functions may be distorted in such cases. In addition, the contributions of the individual wavelets to the measured velocity are only separated at the source's position, because the CWT of ψ_e with ψ_o and of ψ_n with ψ_o produces zero coefficients only at that position.
Consequently, estimating the movement direction of a dipole source requires an accurate position estimate. In the following subsections, the movement direction estimation is discussed for v_x and v_y, respectively.

Appendix C.1. Movement Direction Estimation with the Parallel Velocity Component
The formulas for the CWT coefficients using the measurements of v_x (Equations (A31) and (A32)) can be rewritten to show the influence of both wavelets on the coefficients, using Equation (A24):

W^e v_x(p̂) = (a³ ||w|| / (2|y − d|^(7/2))) (W_xee(p̂) cos ϕ + W_xeo(p̂) sin ϕ),
W^o v_x(p̂) = (a³ ||w|| / (2|y − d|^(7/2))) (W_xoe(p̂) cos ϕ + W_xoo(p̂) sin ϕ),

with W_xee(p̂), W_xeo(p̂), W_xoe(p̂), and W_xoo(p̂) the CWTs of ψ_e with ψ_e, of ψ_e with ψ_o, of ψ_o with ψ_e, and of ψ_o with ψ_o, respectively, where a = 1 cm is the radius of the source sphere, w is the velocity vector of the source, ϕ is the azimuth angle of w with respect to the sensor array, and s = (x, y) is the location of the sensor. In these equations, we differentiate between the position of the source which generated the measured velocity (p) and the position of a source for which the CWT is evaluated (p̂). Suppose that the position of the dipole is known: p̂ = p. Then, W_xeo(p) and W_xoe(p) are both zero, and Equations (A35) and (A36) reduce to:

W^e v_x(p) = (a³ ||w|| / (2|y − d|^(7/2))) W_xee(p) cos ϕ,
W^o v_x(p) = (a³ ||w|| / (2|y − d|^(7/2))) W_xoo(p) sin ϕ.

So, W^e v_x(p) and W^o v_x(p) have to be normalised to estimate the direction of movement:

ϕ̂ = atan2(W^o v_x(p) / W_xoo(p), W^e v_x(p) / W_xee(p)).

To conclude, in order to estimate the movement direction, a scaling has to be applied to the CWT coefficients at the estimated source position (p̂).
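Under the assumption that the source position is known, the normalisation above can be sketched numerically. The following Python snippet (illustrative names, not the paper's MATLAB implementation) projects a simulated v_x profile onto ψ_e and ψ_o at the true source position, normalises each projection by the wavelet's squared norm, and recovers ϕ with a four-quadrant arctangent:

```python
import numpy as np

def psi_e(rho): return (2 * rho**2 - 1) / (1 + rho**2) ** 2.5
def psi_o(rho): return 3 * rho / (1 + rho**2) ** 2.5

def estimate_phi(x, vx, b, d):
    """Estimate the movement azimuth phi from the parallel velocity
    profile vx sampled at sensor positions x (array on the line y = 0),
    assuming the source position (b, d) is already known."""
    rho = (x - b) / (0.0 - d)               # rho = (x - b)/(y - d), y = 0
    dx = x[1] - x[0]                        # uniform sensor spacing
    proj_e = np.sum(vx * psi_e(rho)) * dx   # ~ amp * ||psi_e||^2 * cos(phi)
    proj_o = np.sum(vx * psi_o(rho)) * dx   # ~ amp * ||psi_o||^2 * sin(phi)
    norm_e = np.sum(psi_e(rho) ** 2) * dx
    norm_o = np.sum(psi_o(rho) ** 2) * dx
    # A positive overall amplitude scales both arguments equally,
    # so it does not affect the angle returned by arctan2.
    return np.arctan2(proj_o / norm_o, proj_e / norm_e)

# Simulated measurement for a known angle (amplitude factor omitted):
x = np.linspace(-2.0, 2.0, 4001)
b, d, phi_true = 0.05, 0.3, 1.0
rho = (x - b) / (0.0 - d)
vx = psi_e(rho) * np.cos(phi_true) + psi_o(rho) * np.sin(phi_true)
phi_hat = estimate_phi(x, vx, b, d)
```

Because ψ_e is even and ψ_o is odd, their cross-projection vanishes over a sufficiently long array, which is what isolates the cos ϕ and sin ϕ terms.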

Appendix C.2. Movement Direction Estimation with the Perpendicular Velocity Component
Using the same steps as for v_x, we find:

W^o v_y(p) = (a³ ||w|| / (2|y − d|^(7/2))) W_yoo(p) cos ϕ,
W^n v_y(p) = (a³ ||w|| / (2|y − d|^(7/2))) W_ynn(p) sin ϕ.

So, W^o v_y(p) and W^n v_y(p) have to be normalised to estimate the direction of movement:

ϕ̂ = atan2(W^n v_y(p) / W_ynn(p), W^o v_y(p) / W_yoo(p)).

To conclude, in order to estimate the movement direction, a scaling has to be applied to the CWT coefficients at the estimated source position (p̂).

Appendix D. The Quadrature Method (QM)
Here, we provide some more details about the quadrature method (QM). Some previous attempts have been made to analytically determine a 2D position p = (b, d) and movement direction ϕ of a dipole source from its generated fluid flows [5,25]. This task is called the inverse problem of hydrodynamic source localisation [15]. An intrinsic difficulty of the inverse problem using the dipole field is that both v_x and v_y consist of a direction-dependent combination of two of the three basis wavelets (see Section 2.5.4):

v_x = (a³ ||w|| / (2|y − d|³)) (ψ_e(ρ) cos ϕ + ψ_o(ρ) sin ϕ),
v_y = (a³ ||w|| / (2|y − d|³)) (ψ_o(ρ) cos ϕ + ψ_n(ρ) sin ϕ),

with ρ = (x − b)/(y − d), where a = 1 cm is the radius of the source, w = (w_x, w_y) is the movement velocity of the source, r = s − p is the relative position of the sensor s = (x, y) from the perspective of the source p = (b, d), and ϕ is the azimuth angle of the motion. The contribution of the basis functions depends on this ϕ. A series of velocity profiles (simultaneous measurements of flow velocity by a series of sensors arranged in a linear array) from a source at the same instantaneous position (b = 0, d = 1, so ρ = −x) is shown in Figure A1 for five different source directions (ϕ = 0°, 80°, 160°, 240°, and 320°). A continuous ϕ-sequence of such profiles, presented in a time-lapse series, can be interpreted as a travelling wave moving within the envelopes indicated in green in Figure A1. This travelling wave may be thought of as a 2D projection of a single 3D line structure, which rotates about the x-axis in synchrony with the angle ϕ (see Figure 4b). Different snapshots of such a projected travelling wave correspond to a velocity profile produced by a source at the same position but moving in different directions. Obviously, the green envelope is independent of the angle ϕ and is solely and completely determined by the source's position p = (b, d).
If it could be distilled from the measurements, this envelope's shape would indicate the source's x-position b with its top and yield the source distance d through its width, according to the definitions given in Equations (A52)-(A55). A necessary final step would then be to recover the movement angle ϕ from a measured velocity profile.
Reconstructing such an angle-independent envelope directly from a (measured) velocity profile has so far proven impractical, because it is difficult to separate the contributions of the wavelets in a single velocity profile without prior knowledge about the angle ϕ. However, a practical approach is provided by measurement of the two orthogonal velocity components v_x and v_y directed in a single plane through the array. This allows for the reconstruction of a virtually orientation-independent curve somewhat equivalent to the envelopes discussed above. This newly proposed method uses both v_x and v_y. As can be seen by comparing Figure A1a,b, v_x and v_y share a useful property: where one velocity profile has a local maximum, the other is close to a zero-crossing, and vice versa, as has been previously noted in Wolf and van Netten [45]. We can effectively utilise this property to construct a curve, similar to the envelope discussed above, and show that it is only slightly dependent on the direction angle ϕ. The method is somewhat analogous to constructing an overall amplitude from two time-signals, e.g., a sine and a cosine, which are 90° out of phase, by taking a 'quadrature' combination of the two.
Here, we will construct the quadrature curve by using a specific weighting of the two measured quadratic velocity components, which will later be argued to minimise its dependency on the angle ϕ:

ψ_quad(ρ, ϕ)² = (ψ_e cos ϕ + ψ_o sin ϕ)² + ½ (ψ_o cos ϕ + ψ_n sin ϕ)²,   (A57)

i.e., the wavelet part of v_x squared plus half the wavelet part of v_y squared. Expanding (using 2 sin ϕ cos ϕ = sin 2ϕ):

ψ_quad(ρ, ϕ)² = (ψ_e² + ½ψ_o²) cos²ϕ + (ψ_o² + ½ψ_n²) sin²ϕ + (ψ_e ψ_o + ½ψ_o ψ_n) sin 2ϕ.

Arranging the terms per angle dependence and then normalising with 1 + sin²ϕ, the value of ψ_quad² at ρ = 0, yields the 'normalised quadrature' ψ_quad,norm(ρ, ϕ):

ψ_quad,norm(ρ, ϕ) = Φ_sym(ρ, ϕ) + Φ_skew(ρ, ϕ),

with an even (symmetric) and an odd (skewed) component:

Φ_sym(ρ, ϕ) = ((4ρ⁴ + ½ρ² + 1) cos²ϕ + (½ρ⁴ + 7ρ² + 2) sin²ϕ) / ((1 + ρ²)⁵ (1 + sin²ϕ)),   (A61)
Φ_skew(ρ, ϕ) = ((9/2) ρ³ sin 2ϕ) / ((1 + ρ²)⁵ (1 + sin²ϕ)).   (A62)

Figure A2 gives examples of ψ_quad,norm for the same values of ϕ as shown in Figure A1. Note that Φ_sym is even with respect to ρ irrespective of the angle ϕ, while Φ_skew is odd with only a third-order term (ρ³) rather than also a first-order term. This is a result of the factor 1/2 applied to v_y² in Equation (A57), which effectively cancels the first-order terms. This cancellation has the desirable property that Φ_skew is flat around ρ = 0, which leads to ψ_quad,norm having a single maximum at ρ = 0, rather than two possible local maxima. Consequently, the maximum of ψ_quad,norm acts as a prime 'anchor point', pointing exactly to the source's x-position (b).

Figure A2. Examples of ψ_quad,norm(ρ, ϕ) for the same values of ϕ as in Figure A1, together with the two constituting functions Φ_sym(ρ, ϕ) (middle panel) and Φ_skew(ρ, ϕ) (right panel). These functions are, respectively, even and odd in ρ. At the secondary anchor points ±ρ_anch, the function Φ_sym(ρ, ϕ) provides angle-independent values which may be employed to determine the source distance d before the source direction of motion ϕ is known.
It should be noted from Figure A2 and Equations (A61) and (A62) that-apart from ρ = 0-both Φ sym and Φ skew still have a dependency on the source direction ϕ. However, as suggested by Figure A2, Φ sym has two additional 'secondary anchor points' at ±ρ anch , which have no dependencies on angle ϕ and therefore provide convenient locations for a determination of an exact and invariable width measure of Φ sym . This width measurement provides a good estimate of a 'representative width' of ψ quad,norm . The addition of Φ skew shifts the overall curve almost equally at these secondary anchor points because of its odd property. As a result, the relative distance between the secondary anchor points is hardly dependent on the angle ϕ and therefore almost linearly scales with the distance d of the source.
The exact locations of the secondary anchor points ±ρ_anch of Φ_sym can be calculated analytically. Because the anchor points should be independent of the angle ϕ, solving for the condition ∂Φ_sym/∂ϕ = 0 defines their location. For this, we may treat the factor (1 + ρ²)⁵ in the denominator as a constant C:

∂Φ_sym/∂ϕ = (sin 2ϕ / (C (1 + sin²ϕ)²)) ((½ρ⁴ + 7ρ² + 2) − 2(4ρ⁴ + ½ρ² + 1)) = (sin 2ϕ / (C (1 + sin²ϕ)²)) ρ² (6 − (15/2)ρ²) = 0.   (A63)

The right-hand side of Equation (A63) has three solutions in ρ, corresponding to all three anchor points:

ρ = 0 and ρ_anch = ±2/√5 ≈ ±0.894.

The width between ±ρ_anch equals 4/√5 ≈ 1.789. Determination of the resulting function value at the secondary anchor points Φ_sym(±ρ_anch, ϕ) leads via substitution to the angle-independent value:

Φ_sym(±ρ_anch, ϕ) = (99/25) / (9/5)⁵ ≈ 0.210.

The distance between the two values of ρ where ψ_quad,norm reaches this value (0.210) has been numerically determined for a range of angles ϕ, and indeed shows only a slight variation around 1.78, as shown in Figure A3. This confirms the almost ϕ-independent way of determining the width, and therefore the distance d of the source to the array.
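The anchor-point properties derived in this appendix can be verified numerically. The sketch below (Python; illustrative names, not the paper's MATLAB implementation) builds ψ_quad,norm from the wavelets, extracts its even part Φ_sym, and evaluates it at the secondary anchor points ρ = ±2/√5, where its value is independent of ϕ and approximately 0.210:

```python
import numpy as np

# Wavelets with the convention rho = (x - b)/(y - d):
def psi_e(rho): return (2 * rho**2 - 1) / (1 + rho**2) ** 2.5
def psi_o(rho): return 3 * rho / (1 + rho**2) ** 2.5
def psi_n(rho): return (2 - rho**2) / (1 + rho**2) ** 2.5

def psi_quad_norm(rho, phi):
    """Normalised quadrature curve: the vx wavelet part squared plus half
    the vy wavelet part squared, normalised by its value (1 + sin(phi)^2)
    at rho = 0."""
    px = psi_e(rho) * np.cos(phi) + psi_o(rho) * np.sin(phi)
    py = psi_o(rho) * np.cos(phi) + psi_n(rho) * np.sin(phi)
    return (px**2 + 0.5 * py**2) / (1.0 + np.sin(phi) ** 2)

def phi_sym(rho, phi):
    """Even (symmetric) part of the normalised quadrature curve."""
    return 0.5 * (psi_quad_norm(rho, phi) + psi_quad_norm(-rho, phi))

rho_anch = 2.0 / np.sqrt(5.0)   # secondary anchor points at +/- rho_anch
```

For any ϕ, phi_sym(rho_anch, phi) returns the same value ≈ 0.2096, while psi_quad_norm itself equals 1 at ρ = 0 by construction; the width 2·rho_anch = 4/√5 ≈ 1.789 matches the value derived above.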

Appendix E. Additional Figures
This appendix provides additional figures for both analyses. For the first analysis method (Section 3.1), Figures A4 and A5 show the spatial contours of the median position error (Equation (4)) and median movement direction error (Equation (5)), respectively. Each column corresponds to the minimum sampling distance D s of the training and optimisation set (Table 1). Figures A6 and A7 show the polar contours, indicating how a source's movement direction influences the errors.
For the second analysis method (Section 3.2), Figures A8 and A9 provide the spatial contours and Figures A10 and A11 the polar contours for this analysis. Each column corresponds to a configuration of sensor sensitivity axes: (x + y) both velocity components are measured by all sensors, (x|y) subsequent sensors measure v x and v y alternatingly, (x) all sensors measure only v x , (y) all sensors measure only v y .
Finally, Figures A12-A16 show the spatial and polar contours of the position and movement direction error for the comparison of the quadrature method (QM), the Gauss-Newton (GN) algorithm, and the multi-layer perceptron (MLP) using simulated sensors with lower signal-to-noise ratios (SNRs), as well as the movement direction error distributions and the total area with accurate movement direction predictions.

Figure caption (fragment): median movement direction error per training and optimisation set (Table 1), using the (x + y) sensor configuration at the σ = 1 × 10⁻⁵ m s⁻¹ noise level. The median movement direction error was computed in 2 × 2 cm² cells. The sensors were equidistantly placed between x = ±0.2 m.