A New Low-Cost Acoustic Beamforming Architecture for Real-time Marine Sensing: Evaluation and Design

Abstract: This paper studies the performance of an acoustic beamforming technique applied to low-cost hydrophones in a linear array of two to four elements for the detection and localization of underwater acoustic sound waves. It also evaluates the integration of the array in an energy-efficient real-time monitoring system architecture, allowing marine sensing to be conducted without human intervention. Such an architecture would consist of vertical linear arrays of two or four RHSA-10 model hydrophones attached to a buoy or a vessel for sound detection; a frequency domain beamformer (FDB) technique implemented in a Xilinx Spartan-7 field programmable gate array (FPGA) for sound source localization; and a LoRa wireless sensor network mote to provide convenient access from a base center. The architecture aims to ease sea traffic control for countries that lack the financial resources to properly address illegal fishing or piracy, mostly committed in small, fast motorized boats. In our experiment, the sound waves emitted by a small motorized boat were successfully detected and tracked by three data acquisition stations at a 1 km range. It is demonstrated that a system using a small number of hydrophones is capable of producing robust accuracy over a large frequency band in the presence of noise interference.


Introduction
Piracy and illegal fishing are considered the most serious security issues occurring on the high seas. A recent study [1] indicates that illegal, unreported and unregulated (IUU) fishing causes environmental and social harm amounting to a total loss of 10-23.5 billion US dollars worldwide. With around 11-26 million tons, more than 30% of the entire global fisheries catch is taken illegally. A recent report from the Commercial Crime Services [2], a specialized division of the International Chamber of Commerce, shows that at least one vessel is plundered every week at sea.
Hydrophone array monitoring systems are already used in various applications in the marine sector and, therefore, can be a viable alternative for countries that lack the funds to equip themselves with appropriate technology or that face an area too large to monitor. In this approach, instead of having patrol vessels watching an area, sound waves gathered by a network of buoys are processed by the monitoring system; the resulting data are then transmitted by radio to the base command, and only then are the authorities alerted to apprehend illegal fishing or pirate vessels.
One significant limitation of this architecture, despite it being a cheaper alternative to other vessel monitoring systems, is the onboard energy source. The battery capacity highly impacts features from the area to the duration of monitoring. The deployment and maintenance also proved to be time-consuming and are often limited by weather conditions. It further drives the need to design a system with low-power elements to limit the maintenance cycles. Unreported or unregulated fishing occurring in authorized fishing areas with licensed fishing vessels looking to maximize profit is a scenario in which this architecture would be ineffective, since the system would not be monitoring such an area.
A new technology [3] to track ships from space may lead to significant changes in border control; the new system will use the radio beacons that are available on every vessel on the ocean to consistently follow each boat from space. With the implementation of this new monitoring system, it is believed that most illegal activities in the ocean will be greatly reduced in the future. As the system is expected to come at a great price and still require an adequate response force, a low-power hydrophone array monitoring system remains a good alternative.
Hydrophones are a special type of microphone, designed for use underwater to record or listen in the marine acoustic environment. In the past, they have been used to locate submerged vehicles, marine vocal animals and even to study global warming through ocean temperature evolution. Large-aperture hydrophone array systems in [4,5] are used in accordance with the passive ocean acoustic waveguide remote sensing (POAWRS) to detect and characterize the underwater sound radiated from ocean vessels including both surface ships and submarines. Hydrophone arrays are used in passive acoustic monitoring on vocal animals with distinctive sounds, such as whales in articles [6][7][8], to successfully detect the presence, distribution and number of animals in a specific area. Whales' vital functions, from foraging to courtship detection, are enabled by calibrated recordings of whale species, from which sound source parameters can be estimated. Articles [9,10] also follow the same path to track and monitor dolphins in three dimensions from a single small vessel or seafloor. The impact of underwater noise [11] generated by human activities, such as oil or gas exploration and extraction, dredging and construction of offshore renewable energy devices, shipping and maritime industries in ocean habitat or into previously undeveloped areas of the oceans, on mammals and other protected species, is assessed. The same approach is used in [12] to detect body malformations and delay development in larvae. The use of hydrophones goes beyond the study of mammal sound and the effect of noise on them; it is well known that the military sector often resorts to this for surveillance purposes. The Surveillance Towed Array Sensor System (SURTASS) and Kariwara, respectively, from the USA and Australia are both towed arrays using hydrophone technology for use in passive maritime surveillance systems with a long-range monitoring, made possible by exploiting the deep sound channel.
FPGA-based sound localization architectures have been studied and developed in recent years to satisfy the increasing quality demands of acoustic applications. Such architectures, which mainly use batteries as a power source, are flexible enough to adapt their characteristics to realistic sound-driven environments while remaining power-efficient. For instance, ref. [13] proposed an FPGA design of an embedded processor for underwater applications such as AUV navigation and tracking, using a structural design that is more robust than a behavioral one; ref. [14] designed a custom real-time, high-efficiency demodulation filter for an FPGA-based acoustic camera architecture. In [15], the authors proposed FPGA-based sound localization to track sounds coming from animals or wounded people in environments with reduced visibility, such as fog, forest, or muddy water. The article [16] explores a Filter-and-Sum-based architecture and several acceleration techniques to provide accurate sound-source localization in real time. The researchers in [17] applied FPGA sound localization in areas where logging is completely prohibited; they evaluated array configurations of 4, 8, and 16 microphones and a multitude of delay-and-sum algorithms to detect and locate chainsaw noise.
This section briefly discussed recent articles on hydrophone array configurations and FPGA-based sound localization. Section 2 describes the main components of the vessel monitoring architecture and the theoretical background for the array design and the frequency domain beamforming algorithm. Section 3 evaluates simulation and experimental results on the accuracy of the acoustic vessel monitoring array for underwater sound waves.
Figure 1 provides insight into the practical setup of our architecture, which includes a set of free-floating buoys scattered across the target area, land control bases, and fishing and pirate vessels. Each floating buoy is equipped with a global positioning system (GPS) receiver; a beamformer fully embedded in a field programmable gate array (FPGA), which performs the operations needed to locate the source of the incoming signal; a battery pack; a wireless sensor mote to relay the estimated source location to a control center; and an array of RHSA-10 model hydrophones. As discussed before, the system aims to track vessels emitting sound waves with frequencies up to 20 kHz in a specific area; the hardware configuration block is illustrated in Figure 2. The signals from the hydrophone array pass through several stages, ranging from pre-amplification to sampling. The main processing is then handled by the FPGA, followed by the LoRa node for remote data transmission. This last component is not discussed in detail in this article because it will be the subject of a separate study.
The FPGA is linked to the system via the Serial Peripheral Interface (SPI), a synchronous serial communication protocol that provides full-duplex communication at very high speeds. It provides low-cost support for as many devices as there are available chip-select lines, with speeds up to about 100 MHz. The LoRa node, on the other hand, is connected via I2C, a two-wire interface used to connect relatively low-speed devices, operating in the range from 400 kHz to 1 MHz.

Materials and Methods
Each floating buoy in the system is equipped with a linear array of two RHSA-10 model hydrophones, fixed on an alloy rod at a distance of 3 m from each other. The 4 m long alloy rod is itself fixed under the buoy, with each hydrophone 0.5 m from the nearest end of the rod. A deflector and a weight are fixed at the lower end of the rod to reduce drift and provide more stability; the buoy deployment configuration is illustrated in Figure 3. In the initial design, each buoy works autonomously; later, advantage is taken of LoRa end nodes' peer-to-peer communication to enable buoys to work as groups in a bigger array. This not only enhances the tracking accuracy but also increases the area coverage. The whole architecture is expected to reach an overall cost of $1000 per set. For the main components, this includes $99 for the buoy, $645 for the FPGA, $100 for each hydrophone, and $15 for a LoRa end device. As power consumption is a main concern in the design, the battery is sized to allow for at least one week of continuous monitoring.
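The cost figures above can be tallied with a few lines of Python. This is only an arithmetic sketch: the dictionary keys and the function name are illustrative, and items not priced in the text (battery, rod, preamplifiers, cabling) are deliberately left out.

```python
# Main-component unit prices quoted in the text (US dollars).
UNIT_PRICE_USD = {"buoy": 99, "fpga": 645, "hydrophone": 100, "lora_end_device": 15}
BUDGET_USD = 1000  # overall cost target per set stated in the text


def main_component_cost(num_hydrophones: int) -> int:
    """Sum the quoted prices for one buoy set with the given hydrophone count."""
    fixed = (UNIT_PRICE_USD["buoy"] + UNIT_PRICE_USD["fpga"]
             + UNIT_PRICE_USD["lora_end_device"])
    return fixed + num_hydrophones * UNIT_PRICE_USD["hydrophone"]


for n in (2, 4):
    cost = main_component_cost(n)
    print(f"{n} hydrophones: ${cost} (budget margin ${BUDGET_USD - cost})")
```

With the quoted prices, the two-hydrophone set comes to $959 for the main components, whereas a four-hydrophone set would exceed the $1000 target on these items alone.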

Hydrophones and Arrays
Hydrophone array deployment is the most effective way of collecting data when the direction and location of the sound source are essential to the research. The optimization of design parameters is the very first enhancing factor for signal tracking, which may lead to accurate beamforming results or even the analysis of configurations that need fewer hydrophones for the desired application. The number of elements being replaced is heavily influenced by the energy source; therefore, the array is designed according to the battery storage capacity to ensure success when tracking relatively wideband signals for long periods.

RHSA-10 Model
Hydrophones are waterproof sound-listening elements; the quality of a hydrophone depends on the sensitivity of its piezoelectric material. The highly sensitive RHSA-10 hydrophone was selected for the array design throughout the study. The RHSA-10 model is a uniform omnidirectional hydrophone offering a flat receiving response over a wide frequency range and operating at depths of up to 500 m with excellent durability and stability. It provides a wide-band frequency response, ranging from 1 Hz to 150 kHz, at temperatures from 0 to 40 degrees Celsius. The hydrophone, together with its 15 m long shielded BNC cable, weighs around 1 kg and can be stored at temperatures ranging from −40 to 80 degrees Celsius.
The RHSA-10 in Figure 4, like most hydrophones, is based on a piezoelectric material. Whether composite or ceramic, piezoelectric materials have a very special property: whenever they are struck by a sound wave, a change in their electrical properties is observed. Accordingly, when the piezoelectric material is under pressure, an immediate increase or decrease in voltage is observed across its entire plate. The variation is proportional to the applied pressure, so the intensity and frequency of the source can be identified from that variation.

Array Design
Although several array configurations are covered in this section, only the trial of the hydrophone array in a linear configuration is pursued, in both the simulation and the experiment station. Planar and circular array techniques were disregarded in this study because, with only two elements, a linear configuration is the only option. From a technical perspective, a linear arrangement of hydrophones is the best configuration to fix on a floating buoy; from a cost and energy perspective, this arrangement requires the lowest number of elements per array, which is of great importance when designing a low-cost, long-lasting real-time architecture.

Linear Array
A linear array (Figure 5) is a row of hydrophones equally spaced along a line. This can be described as an x-axis array. A good example is the towed array, which is usually trailed behind a submarine or, in this case, suspended from a floating buoy. The most prominent advantage of a ship's towed line array is that it can be kept away from the vessel it is attached to, which is the only noise source in the area. Floating buoys, on the other hand, have the advantage of being able to acoustically survey underwater currents on a much finer scale than a ship's towed array system. Moreover, with new acoustically quiet designs, the flow noise of drifting buoys is considerably reduced, which makes them a particularly suitable configuration for underwater data collection.
However, its disadvantage comes from its linear formation: whenever it makes a contact, it can only narrow the incoming sound wave down to one of two directions. If a sound wave arrives from a position to the right of the array, each hydrophone detects the wave but not the direction it is coming from. The system then compares the hydrophones' Time Difference of Arrival (TDoA) to find the sound wave direction. The linear array also cannot tell whether the signal is coming from above or below, so the system reports two candidate sources for each noise, giving it an ambiguity problem. Each detection therefore leads to two ambiguous bearing contacts; fortunately, there is a series of processes that can be used to determine which one is the real contact and which one is the shadow.
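The TDoA comparison and the resulting ambiguity can be illustrated for a two-element array. The sketch below assumes a far-field plane wave, the 3 m spacing and 1500 m/s sound speed quoted elsewhere in the text, and an angle measured from broadside; the function name and the angle convention are illustrative choices, not the authors' implementation.

```python
import math

SOUND_SPEED_M_S = 1500.0  # underwater sound speed used in the Simulation section
SPACING_M = 3.0           # hydrophone separation on the rod (Materials and Methods)


def bearings_from_tdoa(tdoa_s, spacing_m=SPACING_M, c=SOUND_SPEED_M_S):
    """Return both candidate arrival angles (degrees from broadside) for one TDoA."""
    # Path-length difference between the two hydrophones, normalised by the spacing.
    ratio = c * tdoa_s / spacing_m
    if abs(ratio) > 1.0:
        raise ValueError("TDoA exceeds the largest physical delay, spacing / c")
    theta = math.degrees(math.asin(ratio))
    # sin(theta) == sin(180 - theta): the mirror image about the array axis
    # produces exactly the same delay, hence the real contact and its shadow.
    return theta, 180.0 - theta


real_or_shadow = bearings_from_tdoa(0.001)  # a 1 ms delay between the two elements
print(real_or_shadow)
```

Both returned angles produce the same delay, which is why an additional step is needed to separate the real contact from the shadow.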

Planar Array
A planar array (Figure 6) can simply be represented as a rectangle of hydrophones. It is very similar to a linear array, the only difference being that its elements are laid out in a plane: in other words, on two axes. The planar array's elements are equally spaced along both the x- and y-axes. This design presents some advantages: one is its high accuracy, and another is the disappearance of the ambiguity problem of the linear array. With the planar design, when a sound signal is detected, the hydrophones not only detect the sound but, since they are arranged in two dimensions, the system can also locate the source from the Time of Arrival comparison, without having to deal with a shadow contact. The downside of the planar array comes from its size: it is much larger than the linear array, so it cannot be towed, as there is not yet a tow cable strong enough to tow it. Even if such a cable is developed in the future, such a large block would flap around like a flag, making it practically impossible to obtain accurate readings. This type of array is typically mounted on the hull or along the side of a vessel; different vessels have different specifications. However, because the array is mounted on the vessel itself, much of the noise produced by the vessel interferes with the detection. Because of this ship noise, such arrays also have a very short detection range.

Cylinder/Circular Array
The cylinder array (Figure 7) is an array in three dimensions. The high number of hydrophones makes sound wave localization far more accurate. It can steer up and down in depression/elevation, so if the noise source is a little above the array, it is able to detect how far above the source is, whereas the two-dimensional planar array cannot accurately provide that information. One drawback of this method is that whenever the cylinder array steers up or down, it loses a little sensitivity. A circular array is a bit more evolved: in the same array, some elements look down while others look up at the same time, so there is no loss of sensitivity when tracking a signal from above or below the array.

Xilinx Spartan-7 FPGA
A field-programmable gate array (FPGA) is an integrated circuit that contains an array of configurable logic blocks (CLBs), wired together via programmable interconnects, and is designed to be configured by a customer after manufacturing. The Spartan-7 is built on the 28 nm (28 HPL) process from TSMC, featuring a MicroBlaze soft processor delivering over 200 DMIPS with 800 Mb/s DDR3 support. This process enables an excellent balance between performance and power consumption, providing the best performance per watt. The architecture is designed to handle high-performance applications at a low total cost, including the small-form-factor packaging cost, the cost of tools and the cost of the resources needed to create the user design.
The Xilinx Spartan-7 is suited for applications with a limited available PCB footprint, such as acoustic beamforming, sensor interfacing, communication bridging or motor and motion control. The Spartan-7 family ranges from 6000 to 102,000 logic cells. All devices contain logic and DSP slices, with up to 160 DSP slices, from 5 to 120 block RAMs, and up to 400 I/O pins in the largest device-package combination.

Frequency Domain Beamformer
Rather than using time delays for beamforming, the frequency domain beamformer uses an alternative approach that measures the phase difference between the sinusoidal components of the signals to recover the original time signal. The Fast Fourier Transform is used to convert the hydrophone data into the frequency domain. A Fourier Transform is a technique that takes an input signal in the time domain and decomposes it into a set of frequencies. The Fourier Transform formula [18] is expressed in Equation (1). It is the starting point of any frequency domain analysis, since it represents any non-periodic discrete time-domain signal of finite duration as a combination of sinusoidal basis functions. Discrete-time methods are used in this equation because the continuous Fourier Transform assumes infinite time and an infinite number of samples, which is a problem: it is impossible to see into the future and impossible to store an infinite amount of data.
Equation (1) represents the sampled signal from the hydrophones at a specific frequency for a block of N points, where $Y_k$ is the Fast Fourier Transform coefficient in the kth frequency bin, $y[n]$ is the discrete representation of the sampled input signal, and k = 0, 1, 2, ..., N − 1. Equation (2) is the representation of each hydrophone composing the array; each one is computed as a column of the vector containing the Fourier Transform coefficients for the signal.
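Under the definitions just given, the N-point transform and the stacked array vector take the following standard textbook form. The hydrophone count $M$ and the per-channel symbol $Y_{m,k}$ are notation assumed here for illustration; the layout of the array vector in particular is an assumption.

```latex
% Standard N-point discrete Fourier transform of one hydrophone channel
Y_k = \sum_{n=0}^{N-1} y[n]\, e^{-j 2\pi k n / N}, \qquad k = 0, 1, \ldots, N-1

% Coefficients of the M hydrophones in bin k, stacked into one column vector
\mathbf{Y}_k = \begin{bmatrix} Y_{1,k} & Y_{2,k} & \cdots & Y_{M,k} \end{bmatrix}^{T}
```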
Equation (3) illustrates the result of the division between the frequency-domain form of the signal, $Z(\omega)$, from Equation (1) and the frequency bin, with $Z_k$ representing the result for the kth interval between samples.
The term $\omega_k$ in Equation (4) represents the frequency of the kth bin, with $f_s$ as the number of samples per second of the discrete signal.
The sampling frequency $f_s$ in Equation (5) is an important parameter, as it determines how quickly samples are taken. When processing signals, a system is limited by how quickly it can take samples; if the system samples too slowly, unwanted components called aliases appear in the results. The Nyquist theorem states that the system needs to sample at no less than twice the highest frequency of the incoming signal.
The frequency resolution $f_r$, illustrated in Equation (6), is the difference in frequency between adjacent bins. This sets a limit on how precise the results can be, since it determines the narrowest possible main-lobe width. $f_r$ is equal to the sampling frequency $f_s$ divided by the number of datapoints N used in the Fast Fourier Transform.
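As a quick numerical check of the sampling-rate and resolution relations, the sketch below uses the values quoted in the Simulation section (20 kHz source, 40 kHz sampling, 1000 datapoints). The helper names are illustrative, and NumPy's `rfftfreq` is used only to list the positive-frequency bin centres.

```python
import numpy as np

F_MAX_HZ = 20_000.0  # highest frequency of interest stated in the text
FS_HZ = 40_000.0     # sampling frequency from the Simulation section
N_POINTS = 1000      # FFT block length from the Simulation section


def satisfies_nyquist(fs_hz, f_max_hz):
    """True when the sampling rate is at least twice the highest signal frequency."""
    return fs_hz >= 2.0 * f_max_hz


def frequency_resolution(fs_hz, n_points):
    """Spacing between adjacent FFT bins, f_r = f_s / N."""
    return fs_hz / n_points


# Centre frequencies of the non-negative bins, from 0 Hz up to f_s / 2.
bin_freqs = np.fft.rfftfreq(N_POINTS, d=1.0 / FS_HZ)

print(satisfies_nyquist(FS_HZ, F_MAX_HZ))
print(frequency_resolution(FS_HZ, N_POINTS), "Hz per bin")
print(len(bin_freqs), "bins up to", bin_freqs[-1], "Hz")
```

With these settings the 20 kHz source sits exactly at the Nyquist limit, so any component above 20 kHz would alias into the analysed band.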
The most direct way to improve the frequency resolution [19] of a Fast Fourier Transform is to increase the number of datapoints while keeping the sampling frequency constant. This increases the number of frequency bins and ultimately decreases the frequency difference between adjacent bins. However, a finer frequency resolution comes at the cost of a coarser time resolution, because each block then spans a longer time. The ideal solution would be to improve the frequency resolution without altering the Fast Fourier Transform size or sampling frequency.
$T$ in Equation (7) represents the total sampling time, which describes the theoretical minimum response time to an input change, with $L$ as the number of averages. Equation (8) is the result of the remaining weights and delays after substituting $Y_k$ into Equation (3), where $e_k^{H}$ is the complex conjugate transpose of $e_k$.
The steering vector $e_k$ in Equation (9) holds the weights and delays that are appropriate for every hydrophone channel, as well as the other information needed to focus the array at a given location. The data defined by $e_k$ are worth collecting when looking for the power response for a given frequency bin within a specific region of space.
$P_k$ in Equation (10) is the average power response for a given location and frequency bin.
The block $E[Y_k Y_k^{H}]$ is substituted as $R_k$ in Equation (11). Therefore, $P_k$ can be rewritten as Equation (12), $P_k = e_k^{H} R_k e_k$. The cross-spectral matrix $R_k$ in Equation (13) contains information comparing the hydrophone data with each other, as well as the kth frequency bin power for every hydrophone.
Equation (13) can be computed using the expressions for $G_{ij}^{k}$ and $G_{ij}$. Beamforming in the frequency domain offers a number of advantages. In the time domain, high sampling rates are necessary to minimize the harmful effects of discrete steering-delay quantization on sidelobe height. In the frequency domain, time shifts become phase rotations, which reduces the task of relating individual hydrophone signals to a focus point or direction to complex multiplications and greatly simplifies the calculations.
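The chain from steering vector to cross-spectral matrix to power response can be sketched for a single frequency bin with NumPy. Everything below is a minimal sketch under stated assumptions, not the FPGA implementation: a far-field plane wave, a four-element array at half-wavelength spacing, synthetic snapshots with additive noise, and illustrative function names.

```python
import numpy as np

C = 1500.0  # assumed underwater sound speed, m/s


def steering_vector(freq_hz, angle_deg, positions_m, c=C):
    """Phase weights e_k for a plane wave arriving at angle_deg from broadside."""
    delays = positions_m * np.sin(np.deg2rad(angle_deg)) / c
    return np.exp(-2j * np.pi * freq_hz * delays)


def cross_spectral_matrix(snapshots):
    """Estimate R_k = E[Y_k Y_k^H] by averaging L snapshots (array of shape L x M)."""
    return snapshots.T @ snapshots.conj() / snapshots.shape[0]


def power_response(R, e):
    """P_k = e^H R e, normalised by the squared number of elements."""
    return float(np.real(e.conj() @ R @ e)) / len(e) ** 2


# Synthetic test case (all values are assumptions for illustration).
rng = np.random.default_rng(0)
freq = 20_000.0
positions = np.arange(4) * (C / freq) / 2.0  # four elements, half-wavelength apart
true_angle, num_snapshots = 45.0, 64

amplitudes = rng.standard_normal(num_snapshots) + 1j * rng.standard_normal(num_snapshots)
noise = 0.05 * (rng.standard_normal((num_snapshots, 4))
                + 1j * rng.standard_normal((num_snapshots, 4)))
snapshots = amplitudes[:, None] * steering_vector(freq, true_angle, positions) + noise

R = cross_spectral_matrix(snapshots)
scan_deg = np.arange(-90.0, 91.0, 1.0)
power = np.array([power_response(R, steering_vector(freq, a, positions))
                  for a in scan_deg])
estimated_angle = float(scan_deg[np.argmax(power)])
print("estimated bearing (deg):", estimated_angle)
```

Scanning the steering angle and taking the maximum of the power response gives the bearing estimate for that bin; a wideband detector would repeat this over the bins of interest and combine the results.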

LoRa WSN Mote
The acronym LoRa stands for long range. It is a wireless technology that allows a battery-powered sender to transmit small data packets to a receiver over a long distance. A gateway can handle several devices at the same time; in our case, this means that we can easily set up a network with hundreds of buoys. The end node is composed of a radio module with an antenna and a microprocessor to process the data. It is worth noting that LoRa does not require a license, since its equipment transmits on a license-free band. Unfortunately, it does not support direct communication between end nodes.
The LoRaWAN network architecture is deployed in a star topology. Communication between the end nodes and the gateway is bidirectional, which means that both end nodes and gateways can receive and send data. A typical communication starts with an end node broadcasting data to every gateway in its vicinity. Each gateway then transfers the data to a network server. After collecting the messages from all the gateways, the network server filters out the duplicate data and finally forwards the packet to the correct application server, where the end user can process the sensor data.
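The duplicate filtering performed by the network server can be pictured with a short sketch. The `Uplink` record and its field names are hypothetical; the sketch assumes that a frame is identified by its device and a per-device frame counter, and it keeps the first copy received.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Uplink:
    device_id: str      # hypothetical buoy identifier
    frame_counter: int  # per-device uplink counter (assumed available)
    gateway_id: str     # gateway that relayed this copy
    payload: bytes


def deduplicate(uplinks):
    """Keep the first copy of each (device, frame counter) pair, in arrival order."""
    seen = set()
    unique = []
    for uplink in uplinks:
        key = (uplink.device_id, uplink.frame_counter)
        if key not in seen:
            seen.add(key)
            unique.append(uplink)
    return unique


received = [
    Uplink("buoy-01", 7, "gw-A", b"\x01\x02"),
    Uplink("buoy-01", 7, "gw-B", b"\x01\x02"),  # same frame heard by a second gateway
    Uplink("buoy-02", 3, "gw-A", b"\x0a\x0b"),
]
forwarded = deduplicate(received)
print([(u.device_id, u.frame_counter, u.gateway_id) for u in forwarded])
```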

Simulation
In the following simulation examples, we evaluated the accuracy of the marine-sensing linear array on underwater acoustic noise examples using the MATLAB Phased Array System Toolbox. The toolbox [20] provides algorithms and tools for the design and simulation of sensor arrays and beamforming systems in various applications. The main idea of the simulation is to study the effect of element spacing on the beampattern of linear array configurations of two and four hydrophones, using the Fast Fourier Transform.
For the purpose of the simulation, we consider a scenario with a single traveling source centered at a dominant frequency of about 20 kHz, arriving at an angle of 45 degrees. The speed of sound in an underwater channel is 1500 m/s, so the full wavelength λ for 20 kHz is about 7.5 cm. The sampling frequency $f_s$ is set at 40 kHz; the number of datapoints N is equal to 1000 with zero padding, and all positive frequency bins are considered.
The element spacing in hydrophone configurations is an essential factor in the design of the array sensor. The simulation scenario illustrates the result of two and four sensors with an inter-element spacing of λ/4, λ/2, λ and 2λ for a signal source at 45°. Figures 8 and 9 demonstrate the resolution of hydrophone configuration when applied to the Fast Fourier Transform algorithm. The resolution is an illustration of both the main and side lobe levels; thus, the aim is to separate the main signal from side lobe ones. The main lobe width, as well as the maximum side lobe level, are the elements that need to be taken into consideration when attempting to accurately determine the source of a signal. Distinguishing the main signal from the side lobe is a key factor; the difference between the two levels is what enables the algorithm to successfully track the wave's direction of propagation. The complexity comes from the fact that, when using only arrays of two elements, the power difference between lobes is only 5 dB, compared to a notable 10 dB when using the four-element array, as illustrated, respectively, in Figures 8 and 9. For both configurations, we notice the appearance of multiple grating lobes for element spacing values of λ and 2λ.
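The appearance of grating lobes at spacings of λ and 2λ can be reproduced with the textbook array factor of a uniform linear array. The sketch below is written in Python with NumPy rather than with the MATLAB toolbox used in the study, and it assumes uniform weighting and a narrowband source; the function names are illustrative.

```python
import numpy as np


def array_factor_db(num_elements, spacing_wavelengths, steer_deg, scan_deg):
    """Normalised beampattern (dB) of a uniform linear array steered to steer_deg."""
    m = np.arange(num_elements)[:, None]
    psi = 2.0 * np.pi * spacing_wavelengths * (
        np.sin(np.deg2rad(scan_deg))[None, :] - np.sin(np.deg2rad(steer_deg)))
    af = np.abs(np.exp(1j * m * psi).sum(axis=0)) / num_elements
    return 20.0 * np.log10(np.maximum(af, 1e-12))


def count_full_height_lobes(pattern_db, tol_db=0.01):
    """Count separate lobes that reach the main-lobe level (main lobe plus grating lobes)."""
    at_peak = pattern_db > -tol_db
    rising_edges = np.count_nonzero(at_peak[1:] & ~at_peak[:-1])
    return int(rising_edges + int(at_peak[0]))


scan = np.linspace(-90.0, 90.0, 3601)  # 0.05 degree steps
lobes = {}
for n_elem in (2, 4):
    for spacing in (0.25, 0.5, 1.0, 2.0):  # element spacing in wavelengths
        pattern = array_factor_db(n_elem, spacing, 45.0, scan)
        lobes[(n_elem, spacing)] = count_full_height_lobes(pattern)
        print(n_elem, "elements, spacing", spacing, "wavelengths:",
              lobes[(n_elem, spacing)], "full-height lobe(s)")
```

A count above one means that at least one grating lobe is as strong as the main lobe, so the direction of arrival can no longer be read unambiguously from the beampattern.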

Experimental Results
The Sanya peninsula, located in Hainan, the southernmost province of China, was home to the experiment station. Data from several acoustic tests were recorded, including the location and tracking of water targets such as motorized ships. The experiment represented in Figure 10 consisted of three floating platforms and one motorized ship, which was used as the sound source. Each floating platform was self-powered and equipped with a distributed acquisition station that communicated with the main computer using 4G data transmission. The longitude and latitude coordinates of the three floating platforms, measured by GPS positioning, were (109.486748, 18.2187446), (109.486977, 18.2186395) and (109.487549, 18.2193598); they are represented in Figure 11.
Each acquisition station (Figure 12) was equipped with an NI cRIO-9040 compact embedded monitoring controller with GPS synchronization, an NI-9467 module providing accurate time synchronization, an NI-9223 module for signal sampling and a linear array of two RHSA-10 uniform omnidirectional hydrophones (Figure 13) fixed 3 m from each other on a 4 m alloy rod. The alloy rod itself was fixed vertically on the floating platform.
In order to track the target, the three floating platforms were deployed in a triangle configuration with an interval varying from 1 to 2 km. The motorized ship navigated within the triangle of floating platforms, sailing back and forth along a scheduled route, while the three sub-stations began the GPS-synchronized acquisition and transmitted the original hydrophone signal to the upper computer through the 4G network. At the end of the recording, the main computer analyzed the data to confirm the consistency between the underwater acoustic positioning trajectory and the navigation trajectory of the measured ship. The purpose of the experiment was to test and verify the positioning effect and accuracy of the distributed underwater acoustic array.
Due to the high interference from cruise ships and various operating vessels around the site, a data group with a relatively good signal-to-noise ratio was selected for analysis. The original data collection was imported into SignalPad, a data logging and analysis application developed with National Instruments. The initial time-frequency analysis of the three monitoring points, produced by SignalPad, is represented in Figures 14-16. It shows that the signal energy is mainly concentrated in the frequency range below 7 kHz, and that there are unknown, constant-frequency interference signals above 7 kHz at each monitoring point.
The original data's total length is 50 s, with 35 s of actual effective sailing time of the motorized boat. The recorded signal was filtered with an FIR band-pass filter (1000 Hz-5000 Hz) to remove unnecessary high- and low-frequency interference signals. The time-domain signal of each of the three monitoring substations is shown in Figure 17. After slicing the time-domain signal in LabVIEW, the three groups of signals were cross-correlated with one another to find the maximum of the cross-correlation. The time point corresponding to the maximum value provides the difference in transmission time from the sound source to the hydrophone monitoring points. As the actual signal contains considerable interference, the resulting time difference curve was first smoothed and filtered. The sound source position was then obtained from the transmission time difference of each slice. The least squares method was used to scan a circular grid area with a radius of 4 km to obtain the source location, and the measured signal trajectory is shown in Figures 18 and 19.
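The two processing steps described above, delay estimation from the cross-correlation peak and a least-squares scan of a circular grid, can be sketched with NumPy. This is a simplified illustration rather than the LabVIEW processing chain: station positions are assumed to be in local Cartesian metres, the sound speed is taken as 1500 m/s, the grid step and all names are arbitrary choices, and the smoothing of the time-difference curve is omitted.

```python
import numpy as np

C = 1500.0  # assumed sound speed, m/s


def tdoa_by_cross_correlation(sig_a, sig_b, fs_hz):
    """Delay of sig_a relative to sig_b in seconds, from the cross-correlation peak."""
    corr = np.correlate(sig_a, sig_b, mode="full")
    lag = int(np.argmax(corr)) - (len(sig_b) - 1)  # index 0 is lag -(len(sig_b) - 1)
    return lag / fs_hz


def locate_by_grid_search(stations_xy, tdoas_s, radius_m=4000.0, step_m=20.0):
    """Least-squares scan of a circular grid.

    tdoas_s[i] is the arrival time at station i + 1 minus that at station 0.
    """
    axis = np.arange(-radius_m, radius_m + step_m, step_m)
    gx, gy = np.meshgrid(axis, axis)
    # Distance from every grid node to every station: shape (stations, ny, nx).
    dist = np.hypot(gx[None] - stations_xy[:, 0, None, None],
                    gy[None] - stations_xy[:, 1, None, None])
    predicted = (dist[1:] - dist[0]) / C
    cost = ((predicted - np.asarray(tdoas_s)[:, None, None]) ** 2).sum(axis=0)
    cost[np.hypot(gx, gy) > radius_m] = np.inf  # keep only the circular area
    iy, ix = np.unravel_index(np.argmin(cost), cost.shape)
    return float(gx[iy, ix]), float(gy[iy, ix])


# Synthetic check with three stations in a triangle and a known source position.
stations = np.array([[0.0, 0.0], [1500.0, 0.0], [750.0, 1300.0]])
source = np.array([600.0, 400.0])
ranges = np.hypot(*(stations - source).T)
true_tdoas = (ranges[1:] - ranges[0]) / C
estimate = locate_by_grid_search(stations, true_tdoas)
print("estimated source position (m):", estimate)
```

In practice the delays fed to the grid scan would come from `tdoa_by_cross_correlation` applied to each time slice, which yields one position per slice and hence a trajectory.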

Discussion
A low-cost system of hydrophones arranged in a linear array and combined with an acoustic beamforming technique was introduced as the first step in designing a real-time marine sensing architecture. In an experiment performed off the Sanya peninsula with three data acquisition substations in a triangular formation, the system was able to detect underwater sound waves generated by a motorized boat and provide accurate tracking of the boat's trajectory. Such a system, implemented in an architecture and coupled with a wireless sensor network, has the potential to greatly affect high-seas monitoring. We presented the monitoring architecture, which is essentially based on a linear array of hydrophones driven by a Field Programmable Gate Array (FPGA) and a Wireless Sensor Network (WSN). The array formation, representing the first block in our architecture, was tested in a real-life experiment, and the results agreed with those predicted by our theoretical models. The obvious next step is the design and implementation of the Wireless Sensor Network mote. Furthermore, a range of factors, including multiple-boat tracking, sea breeze, weather, and interference, may cause errors and measurement inaccuracy in underwater sound source localization. These have yet to be addressed; future work centered on the uncertainty of the marine survey environment is already being planned.