A 30 µW Embedded Real-Time Cetacean Smart Detector

Abstract: Cetacean monitoring is key to their protection. Understanding their behavior relies on multi-channel and high-sampling-rate underwater acoustic recordings for identifying and tracking them in a passive way. However, this requires a lot of energy and data storage, leading to frequent human maintenance operations. To cope with these constraints, an ultra-low power mixed-signal always-on wake-up is proposed. Based on pulse-pattern analysis, it can be used for triggering a multi-channel high-performance recorder only when cetacean clicks are detected, thus increasing autonomy and saving storage space. This detector is implemented as a mixed architecture making the most of analog and digital primitives: this combination drastically reduces power consumption by processing high-frequency data using analog features and lower-frequency ones in a digital way. Furthermore, a bioacoustic expert system is proposed for improving detection accuracy (in ultra-low-power) via state machines. Power consumption of the system is lower than 30 µW in always-on mode, allowing an autonomy of 2 years on a single CR2032 battery cell with a high detection accuracy. The receiver operating characteristic (ROC) curve obtained has an area under curve of 85% using expert rules and 75% without them. This implementation provides an excellent trade-off between detection accuracy and power consumption. Focused on sperm whales, it can be tuned to detect other species emitting pulse trains. This approach facilitates biodiversity studies, reducing maintenance operations and allowing the use of lighter, more compact and portable recording equipment, as large batteries are no longer required. Additionally, recording only useful data helps to reduce the dataset labeling time.


Introduction
Every day, more than one hundred animal species disappear in the world. Since the 16th century, at least 680 vertebrate species have been driven to extinction by human actions (https://www.un.org/sustainabledevelopment/blog/2019/05/nature-decline-unprecedented-report/, accessed on 25 March 2021). Improving our understanding of animals can help protect them. Biodiversity monitoring is fundamental to this goal and relies on complementary techniques ranging from recorded video analysis to manual census. This article focuses on one of the most relevant techniques: acoustic monitoring of animals emitting sound pulses, bursts or frequency chirps [1]. This category involves several species, from the smallest, such as insects or birds emitting trains of voiced pulses [2,3], to the largest mammals. Indeed, most of the biomass emits trains of pulses, mainly for echolocation, sometimes with high-frequency pulses, sometimes at lower frequencies. Biosonars are perfect examples of click or pulse train emissions in different propagation environments: air for bats, water for marine mammals. While biosonars usually operate at high frequency, as for bats, they can also rely on low-frequency voiced pulse trains: an example is the blue whale, emitting strong pulses at a frequency of 25 Hz.
To illustrate our work, sperm whales (Physeter macrocephalus, Pm) have been chosen as a case study. They use a biosonar to sense the deep sea by echolocation, emitting click trains, and their echolocation technique is very close to that of other species. This means that the results obtained in this paper can be extended to other animals with minor adjustments, such as modifying filter frequencies.
Some cetacean species have been categorized as vulnerable among the endangered species. This is a result of anthropogenic activities creating chemical and acoustic pollution, leading to a slow decay of the populations. However, it is hard to cope with these global issues at a local scale. The situation is different for fishing or transportation activities [4], which also cause many cetacean deaths due to collisions with boats or fishing by-catch [5,6]. In contrast with global issues, these deaths can be avoided locally by detecting cetaceans in real time: some collisions between boats and whales could be prevented if relevant alerts were sent in time. This is the main purpose of this work, together with monitoring cetaceans accurately to gain better knowledge about them.
Sperm whale monitoring is subject to several constraints. When installed in strict nature reserves, recording systems need to be autonomous for a very long time, leading to demanding battery life requirements, especially if used in always-on mode. In addition, echolocation pulses contain high-frequency information, requiring high-sampling-rate recordings, which consume a lot of embedded energy. As these sea giants are rarely present, most of this energy is wasted in standard always-on recorders by recording empty sequences at a high sampling rate. In this sense, power consumption can be drastically reduced by activating high-performance recording and analysis only when animals are suspected to be present. For our system, this relies on using an always-on ultra-low power wake-up detector as a first step, which is the topic of this paper. A case study on a real sperm whale dataset is presented with an embedded pulse detection technique. Detection accuracy is further improved by using an expert rules system to reject false-positives.

Sperm Whale Biosonar
Sperm whales use echolocation for orientation and prey localization as shown in Figure 1. The signals emitted are composed of sequences of clicks named click trains [7][8][9][10][11]. A sperm whale click is a transient wave as shown in Figure 2 and is composed of a direct pulse and reflected pulses emitted by pneumatic compression [12,13] (Figure 1). Depending on the relative orientation of the animal and the receiver, as well as the size of its head, the Inter-Pulse Interval (IPI) of this click varies from 1 to 10 ms. Detection algorithms must be compliant with these variations. IPI is an interesting feature to discriminate between species and will be discussed later in the paper; however, in this study, as a proof of concept, we first integrate the inter-click interval (ICI), which is also a discriminant feature of this species.
Multipulsed clicks are transients (Figure 2) with a frequency spectrum going from 8 kHz to 20 kHz (Figure 3) [14][15][16][17]. Each click is separated from the next one by a time interval $T_{InterClick} \in [100\ \mathrm{ms}; 1\ \mathrm{s}]$.

Cetacean monitoring is complex because these mammals spend most of their life in deep water, going down to 2000 m below sea level and moving over long distances. Moreover, due to the sparsity of these animals, they are rarely present in a given area. Consequently, when an always-on recorder is used for monitoring, memory storage and energy are spent inefficiently. This is even more critical when high-frequency recordings with multiple hydrophones are performed (necessary to locate the acoustic emitters, such as in [17][18][19]). This implies using large batteries and storage capacities to be able to record sperm whales in always-on mode from multiple hydrophones over a long period of time. This makes monitoring of these species a technical challenge, as bio-environmental equipment is heavy, expensive, and costly to install and maintain. Thus, reducing the size and increasing the autonomy of these recorders is key to facilitating biodiversity monitoring.

Keys to Ultra-Low Power Monitoring System
To achieve ultra-low power consumption in monitoring systems, continuous recording is not an efficient option, as animals are absent most of the time. Instead, selectively recording and analyzing the periods during which an animal has been detected is preferable. We thus choose to keep always-on only simple detectors consuming a few µW. In this way, power and data storage consumption can be drastically reduced by recording data only when the probability that a cetacean signal is present is high.
However, ultra-low power detectors alone can be too limited to achieve state-of-the-art detection accuracy. Instead, it is advisable to implement the detector in three stages, shown in Figure 4 in green, orange and red. These three stages correspond to the three different types of embedded artificial intelligence implementations, each one with a different magnitude of power consumption as shown in Figure 5.

Figure 5. Different types of embedded AI according to their power consumption.

These three stages are described here:

• First stage: an ultra-low power mixed-signal analog-digital acoustic wake-up system (green block in Figure 4). It processes the high-frequency input signal using analog primitives, triggering an event when high energy pulses at specific frequencies occur. It is also robust to ambient sea noise, as the latter is measured using analog filters and a digital ultra-low-power processor working at low frequency. With a power consumption of 30 µW, it can be classified as an ultra-low power AI circuit as described in Figure 5. It also belongs to the category of ultra-low power wake-up circuits, alongside the existing ones presented in Table 1 and described here with their advantages and drawbacks. It is important to note that, in this comparison, most of the proposed wake-up circuits are standard envelope detectors without smart features for improving their versatility and their detection capabilities, such as:
- Adapting to ambient noise to adjust the detection level automatically.
- Classifying signals according to their spectrum.
- Implementing temporal pattern detection to improve classification based on expert rules.

This comparison is ordered by the frequency capability of the detector, considering that this has a strong impact on power consumption. One of the most interesting implementations is a low-frequency (500 Hz) acoustic signal processor able to distinguish the noises of cars, trucks and generators with a power consumption of 12 nW [20]. It is too limited in frequency for cetaceans and cannot detect mixed temporal and frequency patterns such as sperm whale clicks, but it achieves a remarkably low power consumption that can be considered a milestone for a future silicon implementation of our solution. Another wake-up circuit detecting low seismic frequencies is proposed in [21], featuring a very low 40 nW power consumption but without classification capabilities and with a limited frequency range (40 to 100 Hz). Ref. [22] is also an energy-efficient (1.8 µW) envelope detector working at 1 kHz, but it does not implement advanced AI features either. These first three implementations have a very low power consumption corresponding to limited operating frequencies, and for that reason cannot be used for cetacean detection.
Refs. [23,24] are implementations able to deal with sperm whales signal frequencies. They are energy efficient but focused on detection of given and fixed specific ultrasonic frequencies (41 kHz and 57 kHz respectively). Used for remote activation of devices, they cannot be used for pattern detection with temporal and frequency analysis. Refs. [25,26] have a wide input frequency range, but are based on digital or analog discrete circuits making their power consumption higher than previous implementations (34 µW). However, they can be used for detecting an adjustable specific frequency in a narrow band, making them interesting for several applications, but cannot implement cetacean detection expert rules.
Higher-frequency wake-up circuits are included for comparison purposes, such as [27], working at 2 GHz: it is a radio frequency (RF) detector with a power consumption of 52 µW. Ref. [28] presents an optimized envelope detector for amplitude-modulated (AM) signals at 315 MHz. It uses an interesting frequency-shifting technique based on passive components to reduce power consumption, but cannot implement AI detection rules. Finally, none of the solutions presented in Table 1 can be used for detecting cetaceans, and more precisely sperm whale clicks: this is the topic of this paper, and the proposed solution is described extensively in Section 2, combining ultra-low-power analog features and digital processing computed using the sensor controller of a low-power system on chip (SoC). This makes it possible to achieve low power detection of complex events based on their frequency and temporal attributes at a rather high frequency (20 kHz).
• Second stage: a low-power microcontroller (ARM M4) is implemented as a second stage detector (orange block in Figure 4), adding the ability to automatically tune the sensitivity of the first stage. Described in Section 3.3, one of its features is a state machine implementing expert rules: events generated by the ultra-low-power detector (first stage) are analyzed to reject some false-positives, thus improving click detection reliability. Another option for this second stage could be to use an embedded shallow neural network such as in [29][30][31]. State machine analysis is performed only when events have been generated by the first stage, leading to a very low microcontroller activity (less than 0.01% of the time). Consequently, the average power consumption of this stage is lower than that of the first one, even though its instantaneous consumption during the analysis is that of an M4 microcontroller. A second state machine is implemented to dynamically tune the click detection sensitivity of the first stage detector, according to the false-positive and true-positive rates currently observed. It is described extensively in Section 3.
• Third stage: a higher power 4-channel recorder (red block in Figure 4), able to compute deep-learning signal analysis, is triggered by the second stage. It allows recording several channels at a high sampling rate, but consumes a lot of energy (more than 1.2 W). It implements a 24-bit analog-to-digital converter (ADC), 512 ksps temporal resolution, and channel synchronization, all of which are key to allow sound source classification and localization. This recorder is started only when clicks have been validated by the expert rule state machine, drastically reducing its average power consumption compared to an always-on recorder. Active less than 0.05% of the time in real conditions (due to the sparsity of sperm whales), it extends the battery life of the recorder by a factor of 2000, reducing its average power consumption to less than 1 mW. This allows a significant reduction in battery and overall size, easing its installation and maintenance while reducing its cost. This high-resolution recorder, named Qualilife HighBlue [18], has been designed by SMIoT [32] but is not within the scope of this paper. It is used in the Caribbean Marine Mammals Preservation Network (CARIMAM) project to monitor Caribbean underwater biodiversity at a large scale. This data acquisition system is embedded in a waterproof sonobuoy designed by the OSEAN company in France (Figure 6). This sonobuoy, named Bombyx2, is the second version of Bombyx1 [33]. It is designed to be used over the next five years to monitor cetacean presence in real time in order to prevent collisions with ships in the Pelagos Cetacean Sanctuary and elsewhere [5].
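The duty-cycle figures quoted for the third stage can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the recorder draws exactly 1.2 W when active and nothing when asleep (an idealization, since the text only gives bounds), and the function name is hypothetical.

```python
def average_power_w(active_power_w: float, duty_cycle: float,
                    sleep_power_w: float = 0.0) -> float:
    """Average power of a load that is active for a fraction `duty_cycle` of the time."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in [0, 1]")
    return duty_cycle * active_power_w + (1.0 - duty_cycle) * sleep_power_w


RECORDER_POWER_W = 1.2   # from the text: the recorder consumes more than 1.2 W
DUTY_CYCLE = 0.0005      # from the text: active less than 0.05% of the time

# Assumption: zero sleep power, so the result is a best-case figure.
avg_w = average_power_w(RECORDER_POWER_W, DUTY_CYCLE)
gain = RECORDER_POWER_W / avg_w

print(f"average power: {avg_w * 1e3:.2f} mW, battery-life factor: {gain:.0f}")
```

Under these assumptions the average is 0.6 mW, which is consistent with the "less than 1 mW" and "factor of 2000" statements above.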
It is important to note that this paper only focuses on the ultra-low power part of the wake-up system as well as the dynamic expert rules-based tuning algorithm: Section 2 presents the ultra-low power cetacean click detector, and Section 3 describes the optimization of the algorithm using expert rules and automatic gain control. Results of both detector stages are discussed in their respective sections and a final conclusion is presented in Section 4.

System Architecture
The implementation of the first and second stages of the ultra-low power sperm whale pulse detector is presented in Figure 7 as a functional block diagram. It aims at detecting sudden increases in acoustic level at well-defined frequencies (8 kHz to 20 kHz) characteristic of cetacean clicks. It relies on the following circuits with ultra-low power consumption, shown in Figure 8.

• Passive piezoelectric hydrophone with a measurement bandwidth of 50 kHz. This passive sensor has been chosen to minimize power consumption, compared to amplified ones.
• Charge amplifier to amplify the piezoelectric charge signal. This integrator with a high input impedance converts the acoustic piezo sensor charge into a voltage, multiplying it by the inverse of the capacitance $C_1$ in the feedback path as shown in Figure 9. Without considering the additional resistor, the output voltage amplitude is thus the sensor charge divided by $C_1$. The additional resistor $R_1$ forms a low-pass filter with $C_1$ to limit the frequency bandwidth of the hydrophone to 20 kHz, the maximum frequency present in sperm whale clicks. Power consumption of this block is 0.9 µA.
• High-pass amplifier: as sperm whale pulse frequencies spread from 8 to 20 kHz, a band-pass filter is used to focus only on the relevant signal. The gain Bode diagram of this filter is presented in Figure 10. This band-pass filter is formed by the preceding low-pass filter and a high-pass amplifying filter cutting near 8 kHz. The amplification factor is 10. An ultra-low power op-amp (MAXIM MAX409A [35]) is used in this block, with a power consumption of 1 µA for a gain bandwidth product equal to 150 kHz, allowing amplification by a factor of 10 at 15 kHz.
• Low-pass peak detector: the signal envelope is extracted using passive components: a diode [36] and an RC low-pass filter with a time constant equal to 10 ms. This peak detector gathers the multiple click reverberations P0, P1 and P2 (Figure 2) of a cetacean click into one single detection as shown in Figure 11. Power consumption of this passive block is null. It is important to note that the signal envelope is properly detected when the input signal is increasing, but not when it is decreasing: the output cannot decrease faster than the RC discharge. Thus, measuring the output pulse time returns a result more related to pulse amplitude than to its duration.
This can be improved by adding a reset circuit activated when the input signal stays under a given threshold for a very short time. This has been considered in this work, but such a circuit adds power consumption that is not worthwhile. As explained in Section 3, it is useful to distinguish between cetacean clicks and periodic anthropogenic noise: sperm whale clicks are very short pulses (1 ms) separated by a long pulse interval (100 ms to 1 s), whereas common anthropogenic noises are mainly continuously repetitive ones (motorboat vibrations for example). In these conditions, adding an RC discharge with a duration of 3RC = 30 ms to the cetacean pulses does not affect the algorithm for detecting isolated pulses repeated with a long time interval. Thus, the reset circuit is not necessary and has been removed.
• Low-pass filter for sea noise estimation: to avoid false-positives, the average sea noise level must be estimated as a reference value for peak extraction. This is done using a low-pass filter with a cutoff frequency of 0.1 Hz at the output of the peak detector. When a stationary noise is present, the output $V_{Noise}$ increases as it depends on the average of successive input signal amplitudes. In this way, a heavy swell or a motorboat cruising around the hydrophone will increase $V_{Noise}$, whereas an isolated sperm whale click will not change the average input amplitude and therefore $V_{Noise}$. Thus, this long period low-pass filter provides an estimate of the sea noise level in the frequency band of interest. To implement this block, a voltage follower (implemented using an LPV811 operational amplifier [37]) acting as a voltage buffer is added between the peak detector and the low-pass filter, consuming 600 nA.
• Comparator: this component is responsible for comparing the signal envelope (output of the peak detector) to a reference value $V_{Ref}$ related to the estimated average sea noise level. When the output of the peak detector block is higher than the reference value, a detection event is generated, triggering an interrupt in the processing microcontroller. The power consumption of this ultra-low power analog comparator is 320 nA.
• Reference value $V_{Ref}$ generation: this feature is fundamental for the reliability of the algorithm. As a first approach, it seems natural to set $V_{Ref} = K V_{Noise}$. However, for a calm sea, this noise level will be very low, and possibly under the intrinsic noise level of the hydrophone combined with its amplifying chain, leading to a high false-positive detection ratio. Thus, a constant must be added to ensure a reliable detection. This leads to Equation (1):

$$V_{Ref} = K_1 V_{Noise} + K_2 \qquad (1)$$

where $K_1$ and $K_2$ are constants to be optimized according to AUC or specific functional points.
Too small $K_1$ and $K_2$ generate false alarms as the reference level of the comparator is low, while high $K_1$ and $K_2$ constants result in the loss of the weakest clicks of a distant sperm whale, as well as of clicks emitted when the average sea noise level is high. Doing these adjustments using analog circuits (Figure 12) requires an adjustable constant generator, an adjustable amplifier and an adder. This solution has a power consumption of 58 µA, mainly due to the use of digital potentiometers allowing up to 1 MΩ resistances, such as the AD5222 [38], consuming 40 µA. Compared with other parts of the analog front-end, the power consumption of the comparator reference adjustment is too high and needs to be reduced. There is another way of proceeding, less power consuming and more versatile: using an ultra-low power digital processor to acquire the average sea noise level $V_{Noise}$ with an analog-to-digital converter (ADC) at a low sampling rate. $V_{Ref}$ is then computed digitally and converted into an analog value entering the comparator, using an ultra-low power digital-to-analog converter (DAC). These operations are implemented using an ultra-low power system on chip (CC2652 [39]) with a circuit dedicated to low-frequency operations, named the sensor controller. This circuit can be activated while the main processor is in sleep mode, leading to an ultra-low power consumption of 4 µA for ADC conversion of the average sea noise level at 0.1 Hz. An LTC1662 [40] DAC is used to convert the reference value transmitted by the sensor controller over SPI to an analog value. The corresponding power consumption is 1.5 µA, leading to an overall consumption of 5.5 µA. This shows that mixing analog and digital ultra-low-power techniques can be a power-efficient way of processing an analog signal. It makes the most of analog computations for high-frequency signal processing and of digital computations for low-frequency ones.
It is important to note that in Equation (1), the constants $K_1$ and $K_2$ are fixed hyperparameters of the model that can be adjusted by grid search algorithms to maximize click detection accuracy as described in Section 2.2. However, depending on experimental conditions such as anthropogenic noise, the sea noise evaluation can be biased, leading to false alarms. In this case, the reliability of the algorithm can be improved by dynamically adjusting $K_1$ and $K_2$ to avoid false alarms. These adjustments are done using a state machine-based automatic gain control introduced in Section 3.
• Voltage regulator: a 3.3 V supply voltage has been chosen. The system is powered by a single Li-Ion 3.7 V cell, and a Microchip MCP1810 [41] linear voltage regulator is used, with a power consumption of 20 nA. Reducing the supply voltage to 1.8 V would be a good option and will be done in future work.
• Voltage reference: a single 3.3 V supply is used, requiring generation of a virtual 1.65 V ground voltage. This is done using a 1.65 V voltage divider followed by an analog buffer (using an LPV811 operational amplifier [37]) with a current consumption of 0.58 µA.

Table 2 shows the power consumption of each circuit used in the ultra-low power part of the click detector, leading to an aggregated power consumption of 12.5 µA, without using expert system state machines. Measurements have been done using a custom mixed analog-digital breadboard shown in Figure 13 and designed for this project. This board is fully functional and allows precise measurement of the power consumption of each analog and digital module by physically connecting or disconnecting the power supply of each stage. Since space is not a strong constraint in the sonobuoy, this breadboard has been used in this form during experiments. Its size will be reduced in the future, but without changing the implemented analog features.
However, it should be noted that many of the features available on the development board shown in Figure 13 have not been used. Four Sallen-Key structures, 4 multiple feedback (MFB) structures, 4 peak detectors, 2 DACs, 4 passive filters, 4 comparators and 4 delay lines are available on the board, with a CC2652 for the digital part. Only 2 Sallen-Key structures, 1 peak detector, 1 passive low-pass filter, 1 comparator and the CC2652 SoC have been used for implementing the system (plus one more analog structure for the charge amplifier, which can be implemented using an additional MFB structure but is not within the scope of this paper).
Each block of the analog implementation is detailed in Table 2.

Table 2 (total row). Complete ultra-low power hardware: 12.5 µA.

Figure 13. Analog test and power measurement board.
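The behaviour of the analog chain described in this section (peak detector, sea-noise low-pass filter, comparator) can be explored with a small discrete-time model. The sketch below is a simplified illustration and not the authors' circuit: it assumes an ideal diode, a 100 kHz simulation rate, a reference of the gain-plus-constant form described for Equation (1) that is recomputed at every sample (the real system updates it at a much lower rate through the ADC and DAC), and a synthetic input made of a weak 9 kHz tone plus one 1 ms burst at 12 kHz. All function and variable names are hypothetical.

```python
import math

FS_HZ = 100_000  # assumed simulation sample rate


def simulate_front_end(samples, fs_hz, k1=4.3, k2=0.25,
                       tau_peak_s=10e-3, noise_cutoff_hz=0.1, v_noise_init=0.0):
    """Return (envelope, v_ref, comparator) traces for a band-passed input."""
    dt = 1.0 / fs_hz
    decay = math.exp(-dt / tau_peak_s)            # RC discharge per sample (RC = 10 ms)
    tau_noise = 1.0 / (2.0 * math.pi * noise_cutoff_hz)
    alpha = dt / (tau_noise + dt)                 # first-order low-pass coefficient (0.1 Hz)
    env, v_noise = 0.0, v_noise_init
    envelope, v_ref_trace, comparator = [], [], []
    for x in samples:
        env = max(x, env * decay)                 # ideal diode: instant charge, RC discharge
        v_noise += alpha * (env - v_noise)        # slow average of the envelope
        v_ref = k1 * v_noise + k2                 # reference in the form of Equation (1)
        envelope.append(env)
        v_ref_trace.append(v_ref)
        comparator.append(env > v_ref)
    return envelope, v_ref_trace, comparator


def synthetic_input(n_samples):
    """Weak stationary tone plus a single 1 ms burst at t = 0.1 s (toy data)."""
    out = []
    for n in range(n_samples):
        t = n / FS_HZ
        x = 0.02 * math.sin(2 * math.pi * 9_000 * t)
        if 10_000 <= n < 10_100:
            x += 1.0 * math.sin(2 * math.pi * 12_000 * t)
        out.append(x)
    return out


signal = synthetic_input(30_000)                  # 0.3 s of signal
envelope, v_ref, comp = simulate_front_end(signal, FS_HZ, v_noise_init=0.02)

rising_edges = [n for n in range(1, len(comp)) if comp[n] and not comp[n - 1]]
falling_edges = [n for n in range(1, len(comp)) if comp[n - 1] and not comp[n]]
pulse_width_s = (falling_edges[0] - rising_edges[0]) / FS_HZ

print(f"events: {len(rising_edges)}, comparator pulse width: {pulse_width_s * 1e3:.1f} ms")
```

In this toy run the 1 ms burst produces a single comparator pulse lasting on the order of 10 ms, which illustrates the remark above that the output pulse time depends more on the pulse amplitude and the RC discharge than on the true pulse duration.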

Results
Signal waveforms obtained using the analog detection algorithm are shown in Figure 11. As explained before, an important feature of the proposed ultra-low power cetacean detector is the ability to automatically tune the comparator reference value $V_{Ref}$ described in Equation (1) depending on the ambient sea noise. This equation uses 2 fixed hyper-parameters $K_1$ and $K_2$ that must be optimized to maximize classification accuracy.
Performance is evaluated using a test set of 100 underwater recordings, with and without sperm whale clicks. They are chosen at random from a labeled database composed of samples from the BOMBYX project sonobuoy [33]. The dataset is composed of 8% clean samples and 92% noisy ones (either boat noise or white noise was added at −6, 0, 6, 12, 18, 24 dB signal-to-noise ratio (SNR)). There is an equal number of positive and negative samples. For Findability, Accessibility, Interoperability, and Reuse (FAIR) comparisons to other detectors, the full labeled dataset (including pulses of another species, Mediterranean fin whales) is available online at SABIOD.org (http://sabiod.org/pub/dataset/PhyseterMand-BalaenopteraP-dataset.zip, accessed on 25 March 2021).
The Receiver Operating Characteristic (ROC) curve of the ultra-low power click detector is presented in Section 3.3: the Area Under the Curve (AUC) is 75%. An optimal configuration corresponds to $K_1 = 4.3$ and $K_2 = 0.25$ V. This is a relevant achievement considering that the system power consumption is only 12.5 µA. Moreover, it is important to note that the reference database is mainly focused on sperm whales and motorboats but has very few samples of silent sea. Consequently, detecting clicks on this database is much more difficult than it would be in real conditions with a frequently quiet sea.
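For readers who want to reproduce an AUC figure on the public dataset, one common way to compute it from per-recording detection scores is the rank-based (Mann-Whitney) formulation sketched below. This is a generic illustration: the paper does not describe how its ROC curve was computed, the `roc_auc` helper is hypothetical, and the labels and scores are toy values, not results from the detector.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    pos = [s for label, s in zip(labels, scores) if label]
    neg = [s for label, s in zip(labels, scores) if not label]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Each (positive, negative) pair counts 1 if ranked correctly, 0.5 on a tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (synthetic values): 1 = recording with clicks, 0 = without.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC: {toy_auc:.3f}")
```

The pairwise form is quadratic in the number of samples, which is acceptable for a test set of about 100 recordings.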

Improving Click Detection Using Expert Bioacoustic Rules
The proposed ultra-low power click detector relies on the real time analysis of the input acoustic signal to detect a sudden increase in its energy. The output of the comparator corresponds to a logical 1 during a click. However, other types of acoustic signal with a high level in high frequencies, such as a motorboat running close to the hydrophone, can also trigger the comparator, leading to false alarms.

Implementing Expert Rules in Ultra-Low Power Using State Machines
To improve the accuracy of the detection algorithm, an ultra-low power improvement based on two bioacoustical expert rules is proposed. It relies on:
• Click duration: a cetacean click has a well-known shape [13]; its main peak lasts between 50 µs and 200 µs and can be repeated a few times. This first expert rule is used to decide whether the received pulse fits this interval or not, corresponding to a high probability for the received signal to be a click.
• Interclick interval: the time between two successive clicks is also well-defined (Interval ∈ [100 ms; 1 s]) [13] and can be used as a second expert rule to confirm whether two successive clicks can be part of a click train or not. The process is iterated on each new click.
These expert rules can be implemented efficiently using a microcontroller. As described in the left diagram of Figure 14, it relies on a simple state machine, leading to a very low additional power consumption. When an event (rising edge) is generated in output of the comparator, meaning that a high energy signal has been received in the given frequency band, the state machine starts waiting for a falling edge. The latter corresponds to the end of the high energy pulse whose duration is compared to a reference interval (the first expert rule) to determine whether the pulse is consistent with the possible acoustic emission of sperm whales or not. If it is, a click counter is incremented, and the next click detection is started. If the click is not the first one, interclick duration is calculated and compared to a second reference interval using the second expert rule. If the interclick duration is in the interval, click detection is validated and a true-positive (TP) counter is incremented, otherwise a false-positive (FP) counter is incremented.
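The state machine described above can be sketched in an event-driven form, where each comparator pulse is given by its rising and falling edge times. This is a minimal illustration under stated assumptions: the inter-click bounds (100 ms to 1 s) come from the text, whereas the pulse-width bounds at the comparator output and the choice to restart the train search after a rejected pulse are hypothetical, as is the `ClickValidator` name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClickValidator:
    min_pulse_s: float = 0.5e-3   # hypothetical lower bound on comparator pulse width
    max_pulse_s: float = 40e-3    # hypothetical upper bound (pulse stretched by the RC discharge)
    min_ici_s: float = 0.1        # inter-click interval bounds given in the text
    max_ici_s: float = 1.0
    count_tp: int = 0
    count_fp: int = 0
    last_click_s: Optional[float] = None

    def on_pulse(self, rise_s: float, fall_s: float) -> bool:
        """Process one comparator pulse; return True if it extends a click train."""
        width = fall_s - rise_s
        if not (self.min_pulse_s <= width <= self.max_pulse_s):
            self.count_fp += 1            # rule 1 failed: pulse is not click-like
            self.last_click_s = None      # assumption: restart the train search
            return False
        previous, self.last_click_s = self.last_click_s, rise_s
        if previous is None:
            return False                  # first candidate: no interval to check yet
        ici = rise_s - previous
        if self.min_ici_s <= ici <= self.max_ici_s:
            self.count_tp += 1            # rule 2 passed: click belongs to a train
            return True
        self.count_fp += 1                # rule 2 failed
        return False


# Toy sequence: three regular pulses, one long burst (e.g. boat noise), one isolated pulse.
validator = ClickValidator()
pulses = [(0.000, 0.011), (0.500, 0.511), (1.000, 1.012), (1.020, 1.300), (3.000, 3.010)]
results = [validator.on_pulse(rise, fall) for rise, fall in pulses]
print(results, validator.count_tp, validator.count_fp)
```

On a microcontroller the same logic would typically be driven by edge interrupts and a timer rather than by a list of timestamps.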

Automatic Gain Control
As described before, using expert rules also provides indications on the number of true and false-positives. This is useful to fine-tune the main detection algorithm, and especially the $K_1$ and $K_2$ constants used to generate the comparator reference as described in Equation (1). If the constants are too high, clicks will be missed; if they are too low, the comparator output will be triggered by unwanted signals. To dynamically adjust these constants, a second state machine implementing an automatic gain control is proposed and is shown on the right side of Figure 14. It uses the false-positive and true-positive counters incremented in the click detector state machine. Each time a click is identified as part of a click train, the true-positive counter $Count_{TP}$ is incremented. Each time a pulse or an inter-pulse length does not comply with the expert rules, the false-positive counter $Count_{FP}$ is incremented. Periodically, these counter values are reset to 0. The following situations can occur, as shown in the state machine of Figure 14:
• If $Count_{FP} = 0$ and $Count_{TP} > 0$, clicks are detected. In this situation, the gain is well-tuned and is not adjusted.
• If $Count_{FP} = Count_{TP} = 0$, nothing is detected, neither clicks nor other signals. The detection sensitivity can be increased by simultaneously reducing the gains $K_1$ and $K_2$, with a minimum value of $K_1 = 2$. This reduces the reference voltage corresponding to the sea noise level (yellow line in Figure 11).
• If $Count_{FP} > 0$, false alarms are present, and the gains $K_1$ and $K_2$ must be increased simultaneously to avoid detection of sea noise. This can happen when weather conditions are changing very quickly or boats are approaching.
Regarding power consumption, this automatic gain control can be implemented in ultra-low power, because the adjustment is performed only once a click has been validated or rejected, and takes only a few µs.
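The three situations handled by the automatic gain control can be written as a small update function called each time the counters are evaluated. The sketch below is an illustration only: the minimum value of 2 for the first gain comes from the text, while the additive step sizes, the zero floor on the second constant and the `agc_update` name are assumptions.

```python
def agc_update(k1, k2, count_tp, count_fp,
               k1_step=0.1, k2_step=0.01, k1_min=2.0, k2_min=0.0):
    """Return the updated (k1, k2) pair from the true/false-positive counters."""
    if count_fp > 0:
        # False alarms present: raise both constants to desensitize the detector.
        return k1 + k1_step, k2 + k2_step
    if count_tp == 0:
        # Nothing detected at all: lower both constants, respecting the floors.
        return max(k1_min, k1 - k1_step), max(k2_min, k2 - k2_step)
    # Clicks detected without false alarms: gain is well-tuned, leave it unchanged.
    return k1, k2


# Example starting from the optimum reported for the fixed-parameter detector.
k1, k2 = 4.3, 0.25
unchanged = agc_update(k1, k2, count_tp=3, count_fp=0)
more_sensitive = agc_update(k1, k2, count_tp=0, count_fp=0)
less_sensitive = agc_update(k1, k2, count_tp=1, count_fp=2)
print(unchanged, more_sensitive, less_sensitive)
```

Since the counters are reset periodically, calling this function once per period keeps the computation negligible compared with the always-on analog chain.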

Results
Results (ROC curves) of the improved detection algorithm using expert rules are shown in Figure 15. Performance is evaluated using the same test set of 100 recordings as for the click detection algorithm. The Area Under the Curve (AUC) is equal to 85% using the state machine and expert rules, compared to 75% without them.
The power consumption impact of these expert rules and automatic gain control is low: current is increased by 4.5 µA (measured value), mainly because a timer used for evaluating pulse and inter-pulse durations has been activated. This limited additional power consumption makes it possible to avoid false alarms, reducing the number of high-frequency recorder activations and thus saving battery power by avoiding unnecessary high-power recordings.
The proposed embedded ultra-low power cetacean detector can also be compared with a standard solution using a convolutional neural network (CNN) [43], trained and tested using the same dataset. This network has approximately 10,000 parameters. Its input signal is a windowed 512-point short-time Fourier transform (STFT), split using a Mel-Spectrum front-end. It is processed by 3 depth-wise convolution layers of 64 features. The complexity of this CNN illustrates how difficult it can be to detect cetacean clicks: its corresponding area under the curve (AUC) is 92% for a power consumption of 543.51 mW. As shown in Table 3, the proposed ultra-low power detector has an AUC of 85% (7 percentage points below the CNN solution), but its power consumption is approximately 18,000 times lower. This makes it particularly relevant for embedding in a buoy for real-time long-term cetacean monitoring.

Table 3. Area Under the Curve (AUC) and power consumption comparison with the state-of-the-art detectors using the labeled database composed of samples from the BOMBYX project sonobuoy [33].
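The power ratio quoted in the comparison with the CNN can be checked with a short calculation. The sketch below assumes that the detector draws the 30 µW upper bound stated in the abstract; the variable names are illustrative.

```python
CNN_POWER_W = 543.51e-3      # CNN power consumption from the text
DETECTOR_POWER_W = 30e-6     # assumption: upper bound quoted for the always-on detector
CNN_AUC = 0.92
DETECTOR_AUC = 0.85

power_ratio = CNN_POWER_W / DETECTOR_POWER_W
auc_gap_points = (CNN_AUC - DETECTOR_AUC) * 100

print(f"power ratio: about {power_ratio:,.0f}x, AUC gap: {auc_gap_points:.0f} points")
```

With these inputs the ratio is slightly above 18,000, in line with the order of magnitude given in the text.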

Conclusions
In this paper, a mixed analog-digital always-on ultra-low power smart wake-up based on pulse-pattern analysis is presented. It is used for triggering a high-performance multi-channel recorder only when necessary. Its architecture makes the most of ultra-low power analog primitives coupled with an embedded digital low-power system for fine tuning the pulse detector to reduce energy consumption. An ultra-low power application to sperm whale click detection is proposed. The overall measured power consumption of the basic click detector is 12.5 µA. Tested on a labeled dataset that is more difficult than real conditions, its area under the receiver operating characteristic (ROC) curve is equal to 75%. It can be improved using expert rules implemented with state machines, leading to an area under the curve of 85%, while consuming only 17 µA in operational conditions. This allows an autonomy of 2 years on a single CR2032 battery cell.
Further work will focus on more complex pattern analyses based on the same principle: merging a frequency filter bank with heterogeneous temporal feature analysis. More precisely, the IPI feature for this species is also being integrated in an enhanced multiscale version of the state machine to increase its AUC. The existing CNN already integrates this IPI pattern. The CNN AUC might be reached with this completed state machine, but at ultra-low power. IPI is also an efficient feature to discriminate between individuals of different sizes, and thus to better estimate the number of individuals in an area. This would be another output of the state machine.