Direction-of-Arrival Estimation Based on Frequency Difference–Wavenumber Analysis for Sparse Vertical Array Configuration

Frequency–wavenumber (f–k) analysis can estimate the direction of arrival (DOA) of broadband signals received on a vertical array. When the vertical array configuration is sparse, it results in an aliasing error due to spatial sampling; thus, several striation patterns can emerge in the f–k domain. This paper extends the f–k analysis to a sparse receiver-array, wherein a multitude of sidelobes prevent resolving the DOA estimates due to spatial aliasing. The frequency difference-wavenumber (Δf–k) analysis is developed by adopting the concept of frequency difference, and demonstrated its performance of DOA estimation to a sparse receiver array. Experimental results verify the robustness of the proposed Δf–k analysis in the estimation of the DOA of cracking sounds generated by the snapping shrimps, which were recorded by a sparse vertical array configuration during the shallow water experiment.


Introduction
The direction of arrival (DOA) of a signal propagated over an ocean waveguide is a primary factor in various applications and procedures, such as estimating of the location of the source and restoring the transmitted signal in a passive environment [1][2][3]. DOA estimation has recently been applied in Green's function estimation of unknown sources (e.g., ships) [4][5][6][7]. The frequency-wavenumber ( f -k) analysis, as well as reliable and robust delay-and-sum (DAS) beamforming, can estimate the DOA of a uniform linear array with d-m spacing [8][9][10]. In the f -k analysis, the DOA is calculated by using the ratio of the wavenumber, determined by the frequency of the signal, with the wavenumber derived by spatial sampling. If the frequency of the signal exceeds the design frequency (=c/2d, where c is the nominal sound speed in water of 1500 m/s), the angle would be inaccurately estimated due to spatial aliasing. A statistical approach can be used to rectify the DOA of a broadband signal in a single-path environment [9]. However, angle correction is difficult when multipaths exist, and the frequency band of the signal is significantly higher than the design frequency (i.e., sparse).
For a sparse vertical array configuration, DOA estimation by using DAS beamforming is highly likely to fail, and recent efforts have been made to overcome this problem. Abadi et al. proposed a beamforming approach based on the concept of frequency difference, known as frequency-difference (FD) beamforming [11]. FD beamforming is a method of beamforming a difference-frequency component that is equal to the difference between two frequencies extracted by the product of two relatively high-frequency components [11][12][13]. Xenaki et al. applied the concept of compression sensing (CS) to

Review of the Frequency-Wavenumber Analysis
The f -k analysis is defined as a time-space Fourier transform of the signal received from an array. Here, we review the relationship between the f -k analysis and DAS (or conventional) beamforming [9]. A simple case with a linear array of M sensors and a single-ray path is considered. In a single-path scenario, the signal received by one array element, s(ω o , r m ), is given as follows: Here, r m represents the mth array element location and θ 0 represents the DOA, with the positive angle representing an upgoing path. The ω o and A denote the frequency and amplitude of the signal, respectively.
Because the exponent of the exponential term in Equation (1) The beam output in a specific beam angle θ is given as where τ m (θ) is the time delay for the ray path that arrives at the mth receiver at nominal elevation angles θ from the horizontal plane. For simple plane-wave beamforming, τ m (θ) can be computed from τ m (θ) = r m sin θ/c, where c is the nominal sound speed in water at 1500 m/s. Substituting Equation ( Equation (5) is a spatially discrete Fourier transform (DFT) form of a frequencydomain signal received via an array, known as the f -k analysis. That is, the f -k analysis and DAS beamforming are closely related [9], and the peak (i.e., DOA) corresponds to the wavenumber k o = (ω o /c) sin θ o .

Physical Region of the f -k Analysis
In the case of a uniform linear array with d-m element spacing, which corresponds to D = 1 in Reference [9], the spatial DFT period is 2π/d and the wavenumber range is When the number of wavenumber bins in the spatial DFT is N, which is an even number, the grid of wavenumbers is given as Recalling the wavenumber component [k = (ω o /c) sin θ] for the steering angle, the look angle grid is given as Here, λ o = 2πc/ω o and the period of the look angle is λ o /d. Because θ l is a function of frequency (or wavelength), the position of the peak on the wavenumber grid varies with frequency, even at the same angle [10]. For example, if the frequency of the signal is the same as the design frequency (i.e., λ o = 2d) and θ o = 90 • , a peak appears in the wavenumber grid corresponding to l = N/2. Furthermore, when the frequency of the signal is lower than the design frequency for the same angle θ o , λ o becomes greater than 2d, resulting in a peak in the grid lower than N/2, which is still within the wavenumber grid. In contrast to the previous two cases, λ o < 2d at a frequency higher than the design frequency. Therefore, if θ o = 90 • , the theoretical peak position determined by Equation 8 is greater than N/2 and exceeds the wavenumber grid [8].
As such, if the theoretical peak position exceeds the wavenumber grid, a peak appears on the grid that is subtracted from the theoretical wavenumber value by an integer multiple of 2π/d due to the spatial DFT period (2π/d). This wavenumber shift, known as spatial aliasing [8], results in an inaccurate DOA estimation, which is a type of angle filtering by an array. This indicates that the angle that the array can physically detect at a frequency higher than the design frequency is restricted [10]. All angles can be detected at frequencies lower than the design frequency. However, as the frequency increases above the design frequency, the range of angles that can be detected gradually decreases. In the case of DAS beamforming, forcing the output of all angles is possible, even at frequencies higher than the design frequency. However, the grating lobe causes ambiguity. Consequently, there is a tradeoff between angle restriction and the appearance of grating lobes.
In the case of a broadband signal, the peaks corresponding to a frequency band form a striation with a slope providing DOA information [10]. In a single-path environment, even if spatial aliasing occurs, the DOA may be corrected by using the statistical approach [8]. However, if a broadband signal is received through a multipath and has sufficiently high frequencies to indicate that the array configuration is sparse, the physically detectable angle range will be extremely narrow. Consequently, the angle cannot be estimated by using the f -k analysis because of striation interference. Furthermore, reaching the near field of the array is likely if the array configuration is sparse. Therefore, we proposed a time-space Fourier transform based on the FD concept to address this problem, as described in the next section.

Frequency Difference-Wavenumber Analysis
When the frequency band of the signal is considerably higher than the design frequency, the FD concept is employed to obtain the low-frequency component [11][12][13][19][20][21]. This is performed by taking the quadratic product of the two in-band frequency components of the received signal as follows: Here, the difference frequency, ∆ω, is the difference between the two frequencies (i.e., ∆ω = ω 2 − ω 1 (ω 2 > ω 1 )) that must come from within the signal bandwidth. When Ω L and Ω H define the lower-and upper-frequency bounds of the signal bandwidth, respectively, the average frequency of the two frequencies, ω = (ω 1 + ω 2 )/2, is within the following range: Ω L + ∆ω/2 ≤ ω ≤ Ω H − ∆ω/2. The asterisk denotes a complex conjugate. s (ω, ∆ω, r m ) is a signal with the ∆ω component, but it may not be a version in which the frequency of the original signal is completely downconverted to ∆ω.
To overcome the problem that occurs in the f -k analysis when the array configuration is sparse, this paper proposes a method (called the ∆ f -k analysis) that utilizes the FD concept as a preprocessing step before performing spatial DFT.
Recalling that the f -k analysis is a two-dimensional Fourier transform of the received signal as a function of time and space, the ∆ f -k analysis is formulated by estimating the difference-frequency component from Equation (8) and performing spatial DFT as follows: For Q total wavefront arrivals, rather than a single arrival, a quadratic product, such as Equation (9), comprises Q desired terms and Q 2 − Q unintended terms [19]. Although the desired terms mimic the field at the difference frequency, unintended terms with a cosine factor that varies with ω may result in the formation of sidelobes. However, for a sufficiently high ω, the cosine factor sign changes abruptly as ω varies, allowing the unintended terms to be suppressed by considering an incoherent average throughout the signal bandwidth [11][12][13][19][20][21]. Thus, the following signal bandwidth-averaging is necessary to mimic the f -k analysis at a low frequency from the ∆ f -k analysis: If the difference frequency is within the design frequency and spatial aliasing does not occur, the peak in the wavenumber axis of the ∆ f -k analysis calculated by using Equation (10) For a broadband signal, a striation can form in a frequency band that is significantly lower than the design frequency because various difference-frequency components can be obtained from the signal bandwidth. Thus, in the aforementioned multipath environment, it is feasible to separate the striations by minimizing striation interference. Additionally, the conversion to a low frequency makes the source in the near field appearto be in the far field.
We compare the results of the proposed algorithm and FD beamforming by using simulated and experimental data, respectively, as described in Sections 4 and 5.

SAVEX15
In May 2015, SAVEX15 was conducted in the northeastern ECS by using the research vessel Onnuri [22]. Figure 1 shows a schematic of the experiment with the sound-speed profile (SSP) measured from a conductivity, temperature, and depth (CTD) profile collected on JD 141 [22]. The acoustic transmissions were in various frequency bands covering 0.5-32 kHz and included both channel-probing waveforms and communication sequences. Throughout the experiment, highly impulsive noises produced by the snapping shrimps, which usually thrive at depths of less than 60 m [23], were unexpectedly received on large-aperture vertical arrays and dominated the soundscape [24,25]. Cracking sounds, known as snaps, have the most spectral energy at higher frequencies (>10 kHz) and are composed of two arrivals (i.e., direct and surface-reflected arrivals). When ambient noise data with no acoustic transmissions and only cracking sounds were analyzed, the dominant frequency band of the cracking sounds was found to be 11-24 kHz. At the lower-frequency bound (11 kHz) of the dominant frequency band, the element spacing corresponded to 27.5 wavelengths, rendering the array configuration extremely sparse. To verify the proposed algorithm, we used the snaps recorded during the experiment as well as the simulation data by using a 60-ms cosine-tapered linear frequency modulation chirp with the same frequency band (i.e., 11-24 kHz) as snaps. The source and VLA configurations, as well as the SSP shown in Figure 1, were utilized in the simulation, and the source was assumed to be on the seabed, comparable to the snapping shrimps [24]. The acoustic transmissions were in various frequency bands covering 0.5--32 kHz and 158 included both channel-probing waveforms and communication sequences. Throughout 159 the experiment, highly impulsive noises produced by the snapping shrimps, which usually 160 thrive at depths of less than 60 m [23], were unexpectedly received on large-aperture vertical 161 arrays and dominated the soundscape [24,25]. Cracking sounds, known as snaps, have the 162 most spectral energy at higher frequencies (>10 kHz) and are composed of two arrivals 163 (i.e., direct and surface-reflected arrivals). When ambient noise data with no acoustic 164 transmissions and only cracking sounds were analyzed, the dominant frequency band 165 of the cracking sounds was found to be 11--24 kHz (see Fig. 6). At the lower-frequency 166 bound (11 kHz) of the dominant frequency band, the element spacing corresponded to 27.5 167 wavelengths, rendering the array configuration extremely sparse. To verify the proposed 168 algorithm, we used the snaps recorded during the experiment as well as the simulation data 169 using a 60-ms cosine-tapered linear frequency modulation chirp with the same frequency 170 band (i.e., 11--24 kHz) as snaps. The source and VLA configurations, as well as the SSP 171 shown in Fig. 1, were utilized in the simulation, and the source was assumed to be on the 172 seabed, comparable to the snapping shrimps [24].

Numerical simulation 174
The ray-tracing code BELLHOP was used to generate the received signals for the 175 simulation [26,27]. The Green function between the source and receiver can be calculated 176 using the following equation: where a qm is the arrival amplitude, including any phase shift from boundary bounces, and 178 t qm is the arrival time. These two variables are the outputs of BELLHOP. The received 179 signals were generated by multiplying the Green function with the frequency-domain 180 transmitted signal and performing an inverse Fourier transform.

181
As previously mentioned, we used two ray-path arrivals (i.e., Q = 2) to mimic the 182 scenario of snapping shrimp, discussed in Sec. 5. The source depth and the range between 183 the source and the VLA were set to 100 m and 210 m, respectively [24].
184 Figure 2(a) shows the f − k analysis of the simulation data. The slopes appear to be 185 visible; however, overall, it is featureless to the extent that the angle cannot be estimated, 186 although there were only two ray paths. This is because of the near field of the array as well 187 as striation interference. Considering that the far field of the array is reached when L 2 /4λr is 188

Numerical Simulation
The ray-tracing code BELLHOP was used to generate the received signals for the simulation [26,27]. The Green function between the source and receiver can be calculated by using the following equation: where a qm is the arrival amplitude, including any phase shift from boundary bounces, and t qm is the arrival time. These two variables are the outputs of BELLHOP. The received signals were generated by multiplying the Green function with the frequency-domain transmitted signal and performing an inverse Fourier transform.
As previously mentioned, we used two ray-path arrivals (i.e., Q = 2) to mimic the scenario of snapping shrimp, discussed in Section 5. The source depth and the range between the source and the VLA were set to 100 m and 210 m, respectively [24]. Figure 2a shows the f -k analysis of the simulation data. The slopes appear to be visible; however, overall, it is featureless to the extent that the angle cannot be estimated, although there were only two ray paths. This is because of the near field of the array as well as Sensors 2023, 23, 337 6 of 14 striation interference. Considering that the far field of the array is reached when L 2 /4λr is less than unity [28], the source will be in the near field because this parameter at 11 kHz (the lower frequency bound) is greater than 27 (i.e., L 2 /4λr = (56.25 m) 2 /[4(0.1364 m)210 m] = 27.6). If the source is in the near field, the slope related to the DOA in the f -k analysis inevitably spreads, and the influence of the spread is increased in the case of a sparse vertical array configuration. With k o = (ω o /c) sin θ o , the x-axis was converted from a wavenumber to a physically detectable angle, as shown in Figure 2b. Recalling that the angle range that the array can detect by using Equation (8) decreases when the frequency of the signal is higher than the design frequency, the angle range becomes extremely narrow for an extremely sparse vertical array configuration, as discussed in Section 2. The grayshaded region represents the regions that are not physically detectable, and, when the design frequency is 200 Hz, the detectable angle range of the array at 24 kHz is within ±0.8 • . Figure 2b, similar to Figure 2a, is featureless, as if the pattern of striations is random. Hence, the DOA cannot be corrected by using periodicity.
Version December 15, 2022 submitted to Journal Not Specified 6 of 14 less than unity [28], the source will be in the near field because this parameter at 11 kHz (the 189 lower frequency bound) is greater than 27 (i.e., L 2 /4λr = (56.25 m) 2 /[4(0.1364 m)210 m] = 190 27.6). If the source is in the near field, the slope related to the DOA in the f − k analysis 191 inevitably spreads, and the influence of the spread is increased in the case of a sparse 192 vertical array configuration. With k o = (ω o /c) sin θ o , the x-axis was converted from a 193 wavenumber to a physically detectable angle, as shown in Fig. 2(b). Recalling that the 194 angle range that the array can detect using Eq. 8 decreases when the frequency of the 195 signal is higher than the design frequency, the angle range becomes extremely narrow for 196 an extremely sparse vertical array configuration, as discussed in Sec. 2. The grey-shaded 197 region represents the regions that are not physically detectable, and, when the design 198 frequency is 200 Hz, the detectable angle range of the array at 24 kHz is within ±0.8 • . 199 Figure 2(b), similar to Fig. 2(a), is featureless, as if the pattern of striations is random. Hence, 200 the DOA cannot be corrected using periodicity.
201 Figure 2. f − k analysis of the simulation data, with the geometry shown in Fig. 1; (a) frequencywavenumber domain, and (b) frequency-angle domain. A 60-ms cosine-tapered linear frequency modulation chirp in the same frequency band as that generated by the snapping shrimp is utilized for the simulation, resulting in a sparse vertical array configuration. Despite the two-path environment, the f − k analysis is featureless. Because the array configuration is sparse, the gray-shaded region, where the array cannot physically detect the DOA, is extremely wide. Thus, the DOA cannot be estimated due to striation interference by the multipath.
First, FD beamforming was applied to the same simulation (See Fig. 3). To minimize 202 the number of cross terms generated due to the multipath, the output of FD beamforming 203 was incoherently averaged over 11 kHz ≤ ω 1 ≤ 22.6 kHz with 10-Hz intervals. The y-axis 204 in Fig. 3 indicates the difference frequency. To confirm the trend of an increasing difference 205 frequency, the difference frequencies, which are user-chosen parameters, were set from 206 0 Hz to 1400 Hz with 1-Hz intervals. This is comparable to simulating a signal with a 207 frequency band of 0--1400 Hz. The arrows in Fig. 3 represent the DOAs calculated using 208 the image method based on the center of the VLA as the reference angles. The red and blue 209 arrows correspond to the direct (12.6 • ) and surface-reflected (−36.1 • ) paths, respectively. 210 Figure 2. f -k analysis of the simulation data, with the geometry shown in Figure 1; (a) frequencywavenumber domain, and (b) frequency-angle domain. A 60-ms cosine-tapered linear frequency modulation chirp in the same frequency band as that generated by the snapping shrimp is utilized for the simulation, resulting in a sparse vertical array configuration. Despite the two-path environment, the f -k analysis is featureless. Because the array configuration is sparse, the gray-shaded region, where the array cannot physically detect the DOA, is extremely wide. Thus, the DOA cannot be estimated due to striation interference by the multipath.
First, FD beamforming was applied to the same simulation (see Figure 3). To minimize the number of cross terms generated due to the multipath, the output of FD beamforming was incoherently averaged over 11 kHz ≤ ω 1 ≤ 22.6 kHz with 10-Hz intervals. The yaxis in Figure 3 indicates the difference frequency. To confirm the trend of an increasing difference frequency, the difference frequencies, which are user-chosen parameters, were set from 0 Hz to 1400 Hz with 1-Hz intervals. This is comparable to simulating a signal with a frequency band of 0-1400 Hz. The arrows in Figure 3    Two vertical lines (mainlobes) are observed in the FD beamforming output, and the 211 angles corresponding to two vertical lines are in good agreement with the DOAs. Several 212 curves in Fig. 3 are grating lobes caused by spatial aliasing and are a mix of grating lobes 213 produced by each path. In Fig. 3, the white dotted line represents the maximum limit 214 frequency (i.e., 400 Hz) that satisfies the far field of the array for the geometry considered 215 here. A frequency higher than 400 Hz (lower part of the white dotted line in Fig. 3) satisfies 216 the near field of the array, resulting in sidelobes emerging around the main or grating lobes 217 due to the angle spread. Two vertical lines (mainlobes) are observed in the FD beamforming output, and the angles corresponding to two vertical lines are in good agreement with the DOAs. Several curves in Figure 3 are grating lobes caused by spatial aliasing and are a mix of grating lobes produced by each path. In Figure 3, the white dotted line represents the maximum limit frequency (i.e., 400 Hz) that satisfies the far field of the array for the geometry considered here. A frequency higher than 400 Hz (lower part of the white dotted line in Figure 3) satisfies the near field of the array, resulting in sidelobes emerging around the main or grating lobes due to the angle spread.
The results of the proposed algorithm for simulation data are displayed in Figure 4a. The result of ∆ f -k analysis is incoherently averaged over 11 kHz ≤ ω 1 ≤ 22.6 kHz with 10-Hz intervals, similar to that of FD beamforming. The difference between Figures 2a and 4a is remarkable. In contrast to the featureless f -k analysis (see Figure 2a), the output of the ∆ f -k analysis shows that the two main gradients are clearly separated because multipath interference is mitigated.
Although spatial aliasing exists in the difference-frequency band of 0-1400 Hz, clear separation allows for the DOA estimation through periodicity. However, beyond the maximum limit frequency (white dotted line), other minor slopes are formed in addition to the major slopes, which shows the angle spread due to the near-field effect explained in FD beamforming. This angle spread can be relaxed if the range is increased, whereas other conditions remain fixed. Figure 4b shows the result of wavenumber-to-angle conversion, using ∆k o = (∆ω o /c) sin θ o , as shown in Figure 2. The gray-shaded region represents the angle range that the array cannot detect physically. When the difference frequency is lower than the design frequency, an output at all angles was achieved, similar to that in FD beamforming. However, as the difference frequency is increased, the angle range decreases. Nevertheless, the ∆ f -k analysis can detect a wider angle compared with the f -k analysis. To minimize the number of unintended terms generated due to the multipath, the FD beamforming outputs are averaged over the signal frequency band (i.e., 11--22.6 kHz), resulting in two distinct vertical lines between −60 • and 30 • . The red and blue arrows indicate the DOAs corresponding to the direct and surface-reflected paths, respectively. The two arrows are aligned with the angle represented by the two vertical lines, showing that FD beamforming can estimate the DOAs of a sparse vertical array configuration.  Fig. 3 are identical, except for the gray-shaded region. Compared with the featureless f − k analysis, the ∆ f − k analysis clearly exhibits two lines. Considering the range between the source and VLA and aperture of the VLA, the maximum limit frequency of the far field of the array (white dotted line) is 400 Hz. Angle spread can occur because of the near-field effect on difference frequencies higher than the maximum limit frequency, as shown in (a) and (b).
Two vertical lines (mainlobes) are observed in the FD beamforming output, and the 211 angles corresponding to two vertical lines are in good agreement with the DOAs. Several 212 curves in Fig. 3 are grating lobes caused by spatial aliasing and are a mix of grating lobes 213 produced by each path. In Fig. 3, the white dotted line represents the maximum limit 214 frequency (i.e., 400 Hz) that satisfies the far field of the array for the geometry considered 215 here. A frequency higher than 400 Hz (lower part of the white dotted line in Fig. 3) satisfies 216 the near field of the array, resulting in sidelobes emerging around the main or grating lobes 217 due to the angle spread. Compared with the featureless f -k analysis, the ∆ f -k analysis clearly exhibits two lines. Considering the range between the source and VLA and aperture of the VLA, the maximum limit frequency of the far field of the array (white dotted line) is 400 Hz. Angle spread can occur because of the near-field effect on difference frequencies higher than the maximum limit frequency, as shown in (a,b).
Except for the gray-shaded region, Figures 3 and 4b are identical. This relationship is consistent with that described in Sec. 2 between the f -k analysis and DAS beamforming and is corroborated by Figure 5. Additional averaging (i.e., double averaging) in the range of difference frequencies can improve the robustness of DOA estimation [11,12,19,20]. Figure 5 shows a comparison between DAS beamforming, the results of double averaging from the ∆ f -k analysis and FD beamforming. Averaging for ∆ f was performed with 1-Hz intervals, excluding the gray-shaded regions. The green dashed line represents DAS beamforming, whereas the black solid line and the red dotted line represent the double averaging outputs of the ∆ f -k analysis and FD beamforming, respectively. All the results are normalized to the peak value. First, Figure 5a shows the results of double averaging within the difference frequency band lower than the design frequency as well as the result of DAS beamforming.. Two large pulses are detected in DAS beamforming. However, the exact angle could not be calculated because of oscillations within each pulse due to the sparseness of the array. By contrast, double averaging of the ∆ f -k analysis and FD beamforming provided consistent results at all angles, with two distinct angles (direct: 13.8 • , surface-reflected: −36 • ), demonstrating consistency between Figures 3 and 4b. Compared with the reference angles, the maximum error of the angle estimated in Figure 5a is approximately 1 • , indicating that the angle is successfully estimated. Figure 5b shows the results of double averaging over all difference frequencies (i.e., 0-1400 Hz) used in this paper. The result of DAS beamforming is identical to that shown in Figure 5a. The two mainlobes of double averaging remained similar and distinguishable. For simulation data, the angles estimated from the ∆ f -k analysis and FD beamforming are as follows: (1) ∆ f -k analysis, 16.4 • (direct) and −36.5 • (surface-reflected); and (2) FD beamforming, 16.4 • (direct) and −35.0 • (surface-reflected). Compared with the reference angles, the maximum error in the estimation of all angles using the two approaches is 4 • (i.e., the surface-reflected path in FD beamforming). This error is believed to be caused by the angle spread due to the near-field effect on difference frequencies above 400 Hz, and the error decreased as the range increased. In addition, there is a noticeable difference in the sidelobes between Figure 5a,b. The difference between the ∆ f -k analysis and FD beamforming is based on whether the gray-shaded regions are included when doubleaveraged over all difference frequencies. This is the region where the grating lobes of FD beamforming are formed, and all grating lobes are included in the case of double averaging in FD beamforming. This yields the same effect as adding background noise. As a result, the sidelobe of FD beamforming is larger than that of the ∆ f -k analysis. The simulation confirmed that, although both approaches can estimate DOAs, there may be a difference in the sidelobe depending on the difference-frequency band for double averaging. separation allows for the DOA estimation through periodicity. However, beyond the 226 maximum limit frequency (white dotted line), other minor slopes are formed in addition to 227 the major slopes, which shows the angle spread due to the near-field effect explained in 228 FD beamforming. This angle spread can be relaxed if the range is increased, whereas other 229 conditions remain fixed. Figure 4(b) shows the result of wavenumber-to-angle conversion, 230 using ∆k o = (∆ω o /c) sin θ o , as shown in Fig. 2. The gray-shaded region represents the 231 angle range that the array cannot detect physically. When the difference frequency is 232 lower than the design frequency, an output at all angles was achieved, similar to that 233 in FD beamforming. However, as the difference frequency is increased, the angle range 234 decreases. Nevertheless, the ∆ f − k analysis can detect a wider angle compared with the 235 f − k analysis. Hz with DAS beamforming (green dashed line). For double averaging, the ∆ f interval is 1 Hz, and the gray-shaded region in Fig. 4 is excluded. Along with FD beamforming, the ∆ f − k analysis has the ability to estimate any angle within the design frequency. By contrast, the sidelobes of the ∆ f − k analysis are lower than those of FD beamforming when averaging over all difference-frequency bands, because the grating lobes of FD beamforming have the same function as adding background noise.
Except for the grey-shaded region, Figs. 4(b) and 3 are identical. This relationship is 237 consistent with that described in Sec. 2 between the f − k analysis and DAS beamforming 238 and is corroborated by Fig. 5. Additional averaging (i.e., double averaging) in the range of 239 difference frequencies can improve the robustness of DOA estimation [11,12,19,20]. Figure 5 240 shows a comparison between DAS beamforming, the results of double averaging from the 241 ∆ f − k analysis and FD beamforming. Averaging for ∆ f was performed with 1-Hz intervals, 242 excluding the gray-shaded regions. The green dashed line represents DAS beamforming, 243 whereas the black solid line and the red dotted line represent the double averaging outputs 244 of the ∆ f − k analysis and FD beamforming, respectively. All the results are normalized to 245 the peak value. First, Fig. 5(a) shows the results of double averaging within the difference 246 frequency band lower than the design frequency as well as the result of DAS beamforming.. 247  Figure 4 is excluded. Along with FD beamforming, the ∆ f -k analysis has the ability to estimate any angle within the design frequency. By contrast, the sidelobes of the ∆ f -k analysis are lower than those of FD beamforming when averaging over all difference-frequency bands, because the grating lobes of FD beamforming have the same function as adding background noise.

Experimental Results
To verify the proposed algorithm by using experimental data, we analyzed a set of direct and surface-reflected noises collected on JD 141 (JD141 06:55:30). Cracking sounds along the VLA are shown in Figure 6a. The direct and surface-reflected paths, which were separated at around 30.7 s, are denoted by D and S, respectively. The snapping shrimp was in the near field of the array, as evidenced by the direct path with the shape of a spherical wavefront. The surface-reflected noise was dispersive due to the rough sea surface. As an example of a cracking sound, which has a higher spectral energy at frequencies above 10 kHz [24], the spectrogram of the received signal at the middle hydrophone (i.e., 51.25 m; eighth channel) are displayed in Figure 6b. The dominant frequency band of cracking sounds was found to be 11-24 kHz (frequency band between the white dotted lines in Figure 6b) and was used to estimate the DOAs by using FD beamforming and the proposed algorithm. Version December 15, 2022 submitted to Journal Not Specified 10 of 14 Through simulations, we found that the f − k analysis for the sparse vertical array 286 configuration is featureless. As this featurelessness appears similarly in the data of FD 287 beamforming, the f − k analysis for the snaps measured from the experiment is not dis-288 played. Figs. 7-9 illustrate the results along with the experimental data in the same manner 289 as the simulation results (see Figs. [3][4][5]. Recall that averaging for ω 1 was performed with 290 10-Hz intervals between 11 kHz and 22.6 kHz. Figure 7 shows the FD beamforming output 291 as the experimental result counterpart of Fig. 3. Two vertical lines appear between −60 • 292 and 30 • in the frequency band below 400 Hz, as in the simulation; however, afterwards, the 293 two lines spread and disappeared in the frequency band. This phenomenon is believed 294 to be caused by the array shape. In contrast to the simulation, which assumes that the 295 array is a straight line, the array during the experiment was not straight and might have 296 had a curvature because of various factors such as current. This curvature can potentially 297 cause an angle spread similar to that caused due to the near-field effect, and a direct arrival 298 through a relatively closer path will be more sensitive. Nevertheless, except for the differ-299 ence in the background noise, the patterns of the main and grating lobes in Figs. 7 and 3 300 are similar, suggesting that the ∆ f − k analysis of the experimental data is similar to that of 301 the simulation data. Through simulations, we found that the f -k analysis for the sparse vertical array configuration is featureless. As this featurelessness appears similarly in the data of FD beamforming, the f -k analysis for the snaps measured from the experiment is not displayed. Figures 7-9 illustrate the results along with the experimental data in the same manner as the simulation results (see . Recall that averaging for ω 1 was performed with 10-Hz intervals between 11 kHz and 22.6 kHz. Figure 7 shows the FD beamforming output as the experimental result counterpart of Figure 3. Two vertical lines appear between −60 • and 30 • in the frequency band below 400 Hz, as in the simulation; however, afterward, the two lines spread and disappeared in the frequency band. This phenomenon is believed to be caused by the array shape. In contrast to the simulation, which assumes that the array is a straight line, the array during the experiment was not straight and might have had a curvature because of various factors such as current. This curvature can potentially cause an angle spread similar to that caused due to the near-field effect, and a direct arrival through a relatively closer path will be more sensitive. Nevertheless, except for the difference in the background noise, the patterns of the main and grating lobes in Figures 3 and 7 are similar, suggesting that the ∆ f -k analysis of the experimental data is similar to that of the simulation data. Figure 8 shows the output of the ∆ f -k analysis as the counterpart of the experimental results in Figure 4. The experimental data were comparable to the simulation results, as expected from the FD beamforming output. Furthermore, the angle spread and decay caused by the abovementioned array shape were more clearly highlighted by the anglerelated slope in the ∆ f -k domain (see Figure 8a). Despite the addition of background noise and the influence of the array shape compared with the simulation, the two main lines can still be clearly identified in Figure 8. Overall, the outputs of FD beamforming of the simulation (see Fig. 3) and experimental data show good agreement. In the difference frequency range below 400 Hz, two vertical lines are formed, as in the simulation. However, for a difference-frequency band higher than 400 Hz, the angle spread is more severe than that in the simulation. This more severe angle spread may have been caused by (1) the curvature of the array shape because of various factors, such as current, as well as (2) the near-field effect.  Figure 8 shows the output of the ∆ f − k analysis as the counterpart of the experimental 303 results in Fig. 4. The experimental data were comparable to the simulation results, as 304 expected from the FD beamforming output. Furthermore, the angle spread and decay 305 caused by the above-mentioned array shape were more clearly highlighted by the angle-306 related slope in the ∆ f − k domain [see Fig. 8(a)]. Despite the addition of background noise 307 and the influence of the array shape compared with the simulation, the two main lines can 308 still be clearly identified in Fig. 8. Figure 7. FD beamforming of experimental data. Overall, the outputs of FD beamforming of the simulation (see Figure 3) and experimental data show good agreement. In the difference frequency range below 400 Hz, two vertical lines are formed, as in the simulation. However, for a differencefrequency band higher than 400 Hz, the angle spread is more severe than that in the simulation. This more severe angle spread may have been caused by (1) the curvature of the array shape because of various factors, such as current, as well as (2)  Overall, the outputs of FD beamforming of the simulation (see Fig. 3) and experimental data show good agreement. In the difference frequency range below 400 Hz, two vertical lines are formed, as in the simulation. However, for a difference-frequency band higher than 400 Hz, the angle spread is more severe than that in the simulation. This more severe angle spread may have been caused by (1) the curvature of the array shape because of various factors, such as current, as well as (2) the near-field effect.  Figure 8 shows the output of the ∆ f − k analysis as the counterpart of the experimental 303 results in Fig. 4. The experimental data were comparable to the simulation results, as 304 expected from the FD beamforming output. Furthermore, the angle spread and decay 305 caused by the above-mentioned array shape were more clearly highlighted by the angle-306 related slope in the ∆ f − k domain [see Fig. 8(a)]. Despite the addition of background noise 307 and the influence of the array shape compared with the simulation, the two main lines can 308 still be clearly identified in Fig. 8.  Figures 3 and 7, the ∆ f -k analysis of cracking sounds is similar to that of simulation data. Figure 9 depicts the results of the experimental data obtained by employing double averaging, which can improve the robustness and the output of DAS beamforming. Double averaging is performed with an interval of 1 Hz starting at 0 Hz, and the upper bounds of double averaging in Figure 9a,b are 200 Hz (design frequency) and 1400 Hz, respectively. As all angles can be detected by using the ∆ f -k analysis in the difference-frequency band within the design frequency, the results of FD beamforming and ∆ f -k analysis doubleaveraged from 0-200 Hz coincide, as illustrated in Figure 9a. This is the expected result, which is the same as the simulation result. By contrast, data in Figure 9b, in which the upper bound of double averaging is 1400 Hz, differ from the simulation data (see Figure 5b). The intensity of the direct path is greater than that of the surface-reflected path in the simulation, as shown in Figure 5b, whereas the intensity of the surface-reflected path is greater in the ∆ f -k analysis in Figure 9b. Although not shown here, the intensity of the direct path steadily decreased when the upper bound of double averaging is gradually increased from 200-1400 Hz. This drop in the direct path intensity is expected owing to the angle spread. In contrast to the surface-reflected path, wherein the angle spread is not apparent due to detectable angle restriction, the intensity of the direct path decreased because of destructive interference caused by the angle spread. Nonetheless, we confirmed that the two mainlobes can be clearly identified and that the sidelobes, after double averaging the f -k analysis, are lower than those after FD beamforming, as in the simulation. The two peaks in Figure 9a, which are not affected by angle spread, are 10.5 • (direct) and −38 • (surface-reflected) shifted by −2 • from the angles (see Figure 5a) estimated in the simulation. Because there was an array tilt during SAVEX15 [7,24], which caused a shift in the angular axis, the angle shift between the experimental and simulation data is reasonable. , not only the sidelobes are reduced but also the intensity of the direct path is lower than that of the surface-reflected path. Angle spread, which deteriorates at difference frequencies above 400 Hz, causes destructive interference. Consequently, as double-averaging the direct path includes a wider angle-spread region, the intensity is relatively lower than that of the surface-reflected path. Figure 9 depicts the results of the experimental data obtained by employing double 31 averaging, which can improve the robustness and the output of DAS beamforming. Double 31 averaging is performed with an interval of 1 Hz starting at 0 Hz, and the upper bounds 31 of double averaging in Figs. 9(a) and 9(b) are 200 Hz (design frequency) and 1400 Hz, 31 respectively. As all angles can be detected using the ∆ f − k analysis in the difference-31 frequency band within the design frequency, the results of FD beamforming and ∆ f − k 31 analysis double-averaged from 0--200 Hz coincide, as illustrated in Fig. 9(a). This is the 31 expected result, which is the same as the simulation result. By contrast, data in Fig. 9(b), 31 in which the upper bound of double averaging is 1400 Hz, differ from the simulation 31 data [see Fig. 5(b)]. The intensity of the direct path is greater than that of the surface-31 reflected path in the simulation, as shown in Fig. 5(b), whereas the intensity of the surface-32 reflected path is greater in the ∆ f − k analysis in Fig. 9(b). Although not shown here, the 32 intensity of the direct path steadily decreased when the upper bound of double averaging 32 is gradually increased from 200--1400 Hz. This drop in the direct path intensity is expected 32 owing to the angle spread. In contrast to the surface-reflected path, wherein the angle 32 spread is not apparent due to detectable angle restriction, the intensity of the direct path 32 decreased because of destructive interference caused by the angle spread. Nonetheless, 32 we confirmed that the two mainlobes can be clearly identified and that the sidelobes, after 32 double averaging the f − k analysis, are lower than those after FD beamforming, as in the 32 simulation. The two peaks in Fig.9(a), which are not affected by angle spread, are 10.5 • 32 (direct) and −38 • (surface-reflected) shifted by −2 • from the angles [see Fig.5(a)] estimated 33 in the simulation. Because there was an array tilt during SAVEX15 [7,24], which caused a 33 Figure 9. For experimental data, comparison of the ∆ f -k analysis (black solid line) with FD beamforming (red dotted line) double-averaged over 0 ≤ ∆ f ≤ (a) 200 Hz (design frequency) or (b) 1400 Hz with DAS beamforming (green dashed line). In (b), contrary to the expected result (a), not only are the sidelobes reduced but also the intensity of the direct path is lower than that of the surface-reflected path. Angle spread, which deteriorates at difference frequencies above 400 Hz, causes destructive interference. Consequently, as double-averaging the direct path includes a wider angle-spread region, the intensity is relatively lower than that of the surface-reflected path.

Conclusions
For a sparse vertical array configuration, the f -k analysis, which can be used to estimate the DOA of a wideband signal, has a significantly limited detection angle. Additionally, the DOA cannot be estimated due to interference if there is a multipath. To solve this problem, we proposed the ∆ f -k analysis, in which the FD concept of, utilized for sparse vertical array configuration, was adapted to the f -k analysis. The performance of the ∆ f -k analysis was verified via simulation in the SAVEX15 environment and was compared with DAS and FD beamforming. Subsequently, the cracking sounds recorded by a sparse vertical array configuration during the SAVEX15 experiment were analyzed. The ∆ f -k analysis effectively estimated the DOA of a sparse vertical array configuration, which DAS beamforming could not estimate. Additionally, analogous to the relationship between the f -k analysis and DAS beamforming, we verified that the ∆ f -k analysis is closely related to FD beamforming. The outputs of the two algorithms became identical when the difference frequency was lower than the design frequency. However, when the difference frequency was higher than the design frequency, the detectable angle of the ∆ f -k analysis was limited, resulting in reduced sidelobes in the double-averaged ∆ f -k analysis due to the filtering effect of the grating lobes induced by FD beamforming. The simulation results are consistent with the experimental data, indicating that angle estimation using the ∆ f -k analysis is feasible for sparse vertical array configuration and that the ∆ f -k analysis and FD beamforming are closely related.