An Acoustic Vehicle Detector and Classifier Using a Reconfigurable Analog/Mixed-Signal Platform

Abstract: The wireless sensor nodes used in a growing number of remote sensing applications are deployed in inaccessible locations or are subjected to severe energy constraints. Audio-based sensing offers flexibility in node placement and is popular in low-power schemes. Thus, in this paper, a node architecture with low power consumption and in-the-field reconfigurability is evaluated in the context of an acoustic vehicle detection and classification (hereafter "AVDC") scenario. The proposed architecture utilizes an always-on field-programmable analog array (FPAA) as a low-power event detector to selectively wake a microcontroller unit (MCU) when a significant event is detected. When awoken, the MCU verifies the vehicle class asserted by the FPAA and transmits the relevant information. The AVDC system is trained by solving a classification problem using a lexicographic, nonlinear programming algorithm. On a testing dataset comprising data from ten cars, ten trucks, and 40 s of wind noise, the AVDC system has a detection accuracy of 100%, a classification accuracy of 95%, and no false alarms. The mean power draw of the FPAA is 43 µW, and the mean power consumption of the MCU and radio during its validation and wireless transmission process is 40.9 mW. Overall, this paper demonstrates that the utilization of an FPAA-based signal preprocessor can greatly improve the flexibility and power consumption of wireless sensor nodes.


Wireless Sensor Networks and Acoustic Techniques
Over the past decade, the rapid rise of the Internet of Things (IoT) has facilitated the advancement of remote sensing by simplifying the design and expanding the scale of wireless sensor networks (WSNs) [1][2][3][4][5]. Yet, WSN expansion has faced bottlenecks due to the energy constraints and accessibility restrictions imposed by the remote placement of sensor nodes [1]. Thus, there is a growing demand for reliable devices with ultra-low power consumption and high wireless reprogrammability.
To achieve low power consumption, minimally preprocessed data are often transmitted to a base station for further analysis. However, frequent wireless communication can be a leading contributor to energy cost in WSNs. Another school of thought is to locally process data and only transmit relevant information. In general, there exists a tradeoff between computation and communication overhead in WSNs [6]. Nevertheless, local processing may be a more viable choice in scenarios where sensor data are sparse in relevant information content or where the necessary computational tasks can be performed at low energy cost.
Audio data are often sparse in relevant information, which makes audio sensor nodes good candidates for the local-processing approach. Audio capture and analysis inherently consumes less energy than its video counterpart, making acoustic techniques good for low-power sensing schemes [7]. Acoustic vehicle detection and classification (hereafter "AVDC") has previously been used as a benchmark for the validation of low-power node architectures [6]. AVDC has applications in traffic monitoring and military surveillance, especially in scenarios where sensors cannot be placed in the line of sight of the classification target [2].
Vehicles, like many other objects with rotary or oscillatory components, can be identified from the frequency content of their acoustic signature. Hence, wavelet analysis, which classifies objects based on temporal frequency content, has been used in AVDC-like applications [8]. However, wavelet analysis is energy-intensive due to the involved digital computations.
Amplitude analysis is another oft-used audio processing method. It is typically executed through a thresholding scheme like the structural integrity monitor presented in [9]. Adaptive thresholding was proposed for vehicle detection in [10]. The approach taken in [11] uses an envelope detector and comparator to construct an event detector for a low-power cargo monitoring tag. Amplitude analysis affords simple, energy-efficient solutions to a multitude of problems but causes a high false alarm rate due to its low event-oriented specificity. For the AVDC task, a combination of spectral analysis and amplitude analysis could achieve good accuracy while maintaining low power consumption and simplicity.
The ability to implement training with the node out-of-the-loop would help hasten node deployment and facilitate remote reconfigurability. With the node out-of-the-loop, training can be done offline with the same reconfigurable architectures being used in the field. With this approach, system parameters can be re-adjusted in the field if more data become available, and specific adjustments can be made for individual node performance. Deployed nodes can also be repurposed for different remote-sensing tasks without the need for retrieval. Minimizing the number of programming cycles undergone by the sensor node potentially saves time and energy while extending the lifespan of the node.

Proposal, Novelty, and Overview
Digital signal processors (DSPs) are oft-used in wireless sensor nodes due to several merits. For one, DSPs can be readily repurposed for a wide assortment of tasks, providing great benefit to node flexibility and programmability [6]. Additionally, digital systems have good immunity to electrical noise and environmental fluctuations, which make them a reliable choice for long-term applications. Recent applications of digital signal processing include a low-power feature extractor for speech recognition [12] and a front-end for an electrocardiogram acquisition system [13]. DSP energy-efficiency has historically followed Gene's Law, doubling every 18 months due to the power savings afforded by rapid technology scaling [14].
Yet, analog-to-digital converters, which are at the heart of DSP-based sensor nodes that interface with analog sensors, have not kept up with Gene's Law [14] and are still a dominant contributor to overall power draw in digital sensor nodes [15]. Improvements in DSP energy-efficiency are becoming infrequent as technology scaling becomes physically difficult and prohibitively expensive. Resource-constrained digital systems may suffer from high latency as well, which can be a concern in certain scenarios [16].
In contrast, analog signal processors (ASPs) can directly interface with analog sensors, forgoing the need for analog-to-digital converters. ASP-based solutions offer real-time, low-power signal processing and often require less infrastructure than their digital counterparts. Analog electronics endow designers with a rich assortment of powerful computational components at ultra-low energy cost [17,18]. In fact, [14] noted that ASP energy-efficiency has historically led DSP technology scaling by 20 years.
These are several reasons why analog circuits are used in sensor interfaces like the capacitive sensing front end demonstrated in [19] and the delay line for ultrasonic imaging demonstrated in [20]. ASPs are also widely used in biopotential acquisition and analysis systems, finding recent applications in the current-mode front-end proposed in [21] and the electrocardiogram feature detector proposed in [22]. There have also been recent demonstrations of ASPs in machine learning, such as the convolutional neural network processor shown in [23] and the framework for computing Gaussian kernels shown in [24].
Yet, analog circuits can be susceptible to electrical noise and environmental changes. Analog electronics are also ill-suited for certain tasks (such as radio transmission or sequential logic). It may be more synergistic to use a combination of analog and digital electronics for a general-purpose node architecture. Hence, this paper proposes a hybrid node architecture comprising an always-on, mixed-signal processor (MSP) and a microcontroller unit (MCU) that is only awoken when a significant event is detected by the MSP. In this way, the intrinsic device physics of analog components are leveraged to perform computations that would inherently require more power on fully digital platforms while still retaining the abilities to wirelessly network and perform sophisticated digital computation. Previous works [1,6,25] have shown that this type of configuration can facilitate significant power savings.
Although this paper expands on the work described in [6], it differs in a few aspects from previous works that offered solutions to the AVDC problem. Related works utilized application-specific integrated circuits (ASICs) [1,6,9,10,25]. ASICs offer superb performance, ultra-low power consumption (especially with an integrated MSP), and good robustness. However, ASIC development is time-consuming and ASIC-based solutions can be limited in their ability to handle future changes to procedures or overall tasks. These issues are mitigated in this paper by using a low-power field-programmable analog array (FPAA). Compared to an ASIC-based solution, an FPAA-based solution can leverage mixed-signal computation and a reconfigurable architecture for a more expansive scope.
The reconfigurable analog/mixed-signal platform (RAMP) introduced in [26] is the base for the preprocessing circuitry in this paper. By utilizing the RAMP, which contains a custom-designed FPAA at its heart, a sensor node with immense post-deployment reconfigurability, flexibility, and versatility can be achieved. As a consequence of using the RAMP, the event-detection stage of the AVDC system proposed in this work is different from that used in [6], which was based on an analog ASIC. In this work, floating-gate transistors (FGs) were used for comparator threshold generation instead of digital potentiometers, a lookup table (LUT) was used for template matching instead of a complex programmable logic device, and a panStamp MCU was used for transmission of the final decision instead of a TelosB digital platform. The subcircuits used in the spectral-analysis stage of the MSP also have different topologies than their counterparts in [6], and a digital debouncing circuit is introduced to the event-detection stage. A comparison between the results of this work and [6] highlights the improvements in power consumption and accuracy that are possible by using a reconfigurable system rather than an ASIC.
To take full advantage of the increased precision afforded by generating comparator thresholds with FG-transistor-based circuits, a new training algorithm is proposed in this paper. The training algorithm in [6] relied on brute-force enumeration of comparator thresholds, keeping the MSP in-the-loop throughout the entire training process. This methodology is only practical if threshold increments are coarse; hence, the methodology proposed in this paper uses a two-step, continuous-optimization procedure to determine comparator thresholds and requires significantly less communication with the MSP.
In summary, this paper evaluates the performance of a reconfigurable analog system within the context of a vehicle detection and classification application. The remainder of the paper is organized as follows: In Section 2, the infrastructure of the RAMP is discussed. The topology of the AVDC circuit is explained in Section 3. Section 4 discusses the process of training the AVDC system. Section 5 demonstrates the performance of the AVDC circuit in comparison to historical work. Finally, Section 6 offers concluding remarks.

Reconfigurable Platform
A field-programmable analog array (FPAA) is a reconfigurable integrated circuit (IC) that is the mixed-signal analogue to a field-programmable gate array (FPGA); while FPGAs allow for post-fabrication synthesis of digital circuits, FPAAs allow for the post-fabrication synthesis of analog circuits [27][28][29]. FPAAs can be used to synthesize common circuits such as amplifiers or filters, making them useful for sensor interfacing applications. The reconfigurable, analog architecture of an FPAA allows for both general-purpose analog signal processing and complex application-driven signal-processing tasks [30].
The RAMP (described at length in [26]) leverages a custom-designed FPAA structure to provide an abundant number of signal processing and computational elements. The RAMP consists of ten stages of computational analog blocks (CABs), including specific stages for spectral-analysis, transconductors, sensor interfacing, mixed-signal-processing, digital computation, and general transistors. Each of the ten stages is replicated to create eight independently reconfigurable channels.
Of primary importance for this AVDC application are the spectral-analysis, mixed-signal, and digital stages. The spectral-analysis CABs contain bandpass filters (BPFs), peak detectors, adaptive-time-constant filters, and operational transconductance amplifiers (OTAs). The mixed-signal CABs contain comparators, sample and holds, pulse generators, timers, and starved inverters. The digital stages contain LUTs and JK flip-flops.
Signal routing in the RAMP is achieved via two programmable switch domains: "switch boxes" and "connection boxes" (as shown in Figure 1). These programmable switches are implemented using SRAM-controlled T-gates (nMOS and pMOS transistors connected in parallel to create a rail-to-rail switch). The "connection boxes" provide a crossbar switch matrix comprised of T-gates for intra-CAB routing. The "switch boxes" utilize the T-gates in a 4-way "diamond-switch," allowing for inter-CAB connections. A desired switch configuration can be loaded into the RAMP using an on-chip serial-peripheral interface (SPI). In total, 20,380 switches are included within the RAMP [26]. CABs are connected to tunable FG current biases, allowing for the performance of analog circuits with programmable parameters (e.g., corner frequencies and gain) to be modified to fit a user-specified design constraint. The printed circuit board (PCB) that houses the RAMP can be seen in Figure 2a. The PCB includes the RAMP (die micrograph shown in Figure 2b), the MCU, and several peripherals (such as sensors, communication interfaces, and buttons/switches).

Programming Infrastructure
FGs are the building blocks of nonvolatile memory. FGs are metal-oxide-semiconductor field-effect transistors with capacitors connected to their gate such that the gate of the transistor is floating (isolated from any DC path to ground). The charge stored on the gate regulates the current allowed through the transistor [31]. In this way, charge can be stored on the gate to create a long-term controlled current source.
Two different processes are used to apply or remove charge from an FG. Hot-carrier injection (hereafter "injection") is used to add charge to the FG, and Fowler-Nordheim tunneling (hereafter "tunneling") is used to remove charge from the FG [31]. Due to the high voltages required for tunneling, the RAMP only utilizes tunneling for global erasure of FGs. To perform fast, linear injection on the RAMP, FGs are individually placed into a programming structure. This structure is a continuous-time, OTA-based, negative feedback controller which compares the voltage on the control gate of the FG being injected to a target voltage. The injection target voltage corresponds to a characterized bias current that is desired from the FG [31]. After programming, FGs are connected to CABs to bias analog circuits.
Configuration of the RAMP IC is expedited by a few layers of abstraction. To synthesize a circuit on the RAMP, a user must first provide the RAMP software with the corresponding netlist. The RAMP software then estimates injection target voltages for the necessary FGs and uses simulated annealing [32] to estimate the switch activation pattern resulting in the shortest mean path among the components specified in the netlist. FG and switch domain configurations are then compressed and sent to the MCU on the RAMP development board. The MCU then communicates with the RAMP IC through SPI to place each required FG into the programming structure and inject it to its respective target. Switch domains are configured after all FG programming is complete [26]. The RAMP IC can be reconfigured in-the-field, but any user-specified configurations will not be operational for the duration of programming. Once the relevant FGs and switches have been programmed, the RAMP is ready to perform user-specified operations.

Vehicle Detector Configuration
As mentioned in Section 1.2, the pivotal use of an FPAA-based MSP and the implications of FPAA usage distinguish the work proposed in this paper from that on which it is based [1,6,25]. The "always-on" audio MSP constructed on the RAMP performs a combination of spectral and amplitude analyses; vehicles are identified by their instantaneous frequency content by first decomposing audio data into several frequency channels and then comparing the signal power in the frequency channels to predetermined templates. A match to the spectral template for a vehicle will wake the MCU from its low-power sleep mode. The AVDC system leverages the following signal-processing steps:
Spectral-Analysis Stage:
1. Spectral decomposition using a filterbank of bandpass filters (BPFs)
2. RMS envelope estimation using a bank of root-mean-square (RMS) detectors cascaded with a bank of ripple-smoothing, adaptive-time-constant (adaptive-τ) lowpass filters
Event-Detection Stage:
1. Digitization using a bank of comparators
2. Digital "debouncing" using a starved inverter
3. Template matching using a LUT
4. Final decision transmission using a panStamp MCU
Figure 3 shows a block diagram of the AVDC system, and the various signal-conditioning steps of the MSP are discussed in more detail in the subsequent subsections.

Spectral Decomposition
OTA-based BPFs, shown in Figure 4a, are utilized in a filterbank to decompose the input audio. The BPFs used in the RAMP are based on a design demonstrated in [33], and they have the benefit of independently adjustable corner frequencies. The corner frequencies of a BPF are set by tuning the transconductances of its constituent OTAs via FG current biases. The filterbank is initially constructed with eight BPFs configured in a half-octave spacing from 77 Hz to 1113 Hz. Although the spacing between filters in adjacent channels and the exact corner frequencies can be tuned precisely [33], this level of precision is found to be of low importance for the AVDC proposed in this paper; the training algorithm proposed in Section 4b finds that a few channels contain redundant information. These redundant channels are removed from the spectral-analysis stage of the final AVDC system. Figure 4b exhibits the frequency response of the initial filterbank. To minimize the output distortion of the BPFs, the input applied to $V_{in}$ should be properly scaled with its mean at midrail, and $V_{ref}$ should be referenced to midrail.

RMS Envelope Estimation
After obtaining the frequency decomposition, it is desirable to estimate the RMS envelopes of the BPF outputs to assess the instantaneous signal power in each frequency band. The peak detector described in [34] is biased as an RMS detector to perform this task. The RMS detector, depicted in Figure 5a, has two independently adjustable parameters, attack rate ($G_{m,A}$) and decay rate ($G_{m,D}$), which correspond to the transconductances of the constituent OTAs. Adjustment of OTA transconductances also changes the output ripple of the RMS detector. The RMS detector operates like an asymmetric integrator whose output is given by Equation (1), where $V_{in}$ and $V_{RMS}$ denote the input and output of the RMS detector, respectively. Despite careful tuning, the RMS detector output will invariably contain some ripple, especially if an accurate envelope estimate is desired. Thus, a bank of adaptive-τ lowpass filters based on the topology presented in [34] is implemented for ripple rejection. At its core, the adaptive-τ filter, shown in Figure 5b, is an OTA-based lowpass filter. However, the OTA used in an adaptive-τ filter is designed to have a transconductance that increases with increasing input amplitude [34]. Hence, the time constant of the adaptive-τ filter decreases with increasing input amplitude.
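The asymmetric-integrator behavior described above can be illustrated with a short discrete-time sketch. This is a behavioral model under stated assumptions, not the circuit equation from the paper: the function name `track_envelope` and the per-sample coefficients `attack` and `decay` are hypothetical stand-ins for the rates that the OTA transconductances would set in hardware.

```python
def track_envelope(samples, attack=0.5, decay=0.01, start=0.0):
    """Behavioral model of an asymmetric first-order integrator.

    `attack` and `decay` are hypothetical per-sample coefficients in (0, 1].
    They only mimic the roles of the attack and decay transconductances;
    they are not derived from the paper's Equation (1).
    """
    envelope = []
    level = start
    for x in samples:
        magnitude = abs(x)  # rectify the band-limited input
        # Rise quickly toward a larger input, fall slowly toward a smaller one.
        rate = attack if magnitude > level else decay
        level += rate * (magnitude - level)
        envelope.append(level)
    return envelope


# A burst followed by silence: the estimate rises fast and falls slowly.
burst = [1.0] * 50 + [0.0] * 50
env = track_envelope(burst)
```

Choosing a small `decay` relative to `attack` lowers ripple at the cost of a slower response when the signal disappears, which mirrors the tuning tradeoff noted in the text.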

Digitization
The smoothed RMS envelopes are digitized using a bank of comparators. RAMP comparators have built-in hysteresis, which aids in producing a steady LUT output despite residual noise from the spectral analysis stage. Comparator reference voltages are generated by using FGs to source or sink current through resistors. The comparator architecture in the MSP (depicted in Figure 6) consists of a differential amplifier cascaded with an inverter and an "edgifier" circuit. The edgifier circuit was proposed in [35] and uses starved inverters to accelerate the rising and falling edges of its input signal, making it useful for the digitization of slow analog signals.
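To show why hysteresis steadies the digitized output, the sketch below models a comparator with two switching levels placed symmetrically around its reference. The function name, the symmetric placement, and the numeric values are illustrative assumptions; the hysteresis width of the actual RAMP comparators is not given here.

```python
def comparator_with_hysteresis(samples, threshold, hysteresis, state=0):
    """Digitize `samples` with a hypothetical symmetric hysteresis window."""
    upper = threshold + hysteresis / 2.0  # level needed to switch 0 -> 1
    lower = threshold - hysteresis / 2.0  # level needed to switch 1 -> 0
    bits = []
    for v in samples:
        if state == 0 and v > upper:
            state = 1
        elif state == 1 and v < lower:
            state = 0
        bits.append(state)
    return bits


# Small wiggles around the 0.5 V reference do not toggle the output.
bits = comparator_with_hysteresis(
    [0.0, 0.4, 0.6, 0.5, 0.47, 0.3, 0.6], threshold=0.5, hysteresis=0.1
)
```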

Digital Debouncing
For certain channels (notably the highest frequency channel), the hysteresis inherent to a RAMP comparator is insufficient to generate a steady digital input to the LUT. Continual oscillations in the LUT input may cause excessive querying and lead to unwanted power consumption. A digital debouncing circuit (shown in Figure 3) is used to mitigate this issue. To incite a state change in the output, this debouncing circuit requires that the input signal maintain its digital state for a minimum amount of time. This minimum time requirement for the debouncer is analogous to the "setup time" requirement of a D-latch. The digital debouncing circuit is constructed by cascading a time-voltage conversion circuit with a comparator referenced to midrail. More particularly, the time-voltage conversion circuit is an asymmetrically starved inverter with the output connected to a capacitor. This implementation allows for the "setup times" necessary for the "high" or "low" state to be independently adjusted. If the voltage on the capacitor is near either supply rail prior to an input state change, the "setup time" of the debouncer can be approximated by Equation (2). The constants used in Equation (2) are labeled in Figure 3.
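The debouncing behavior described above can be sketched in discrete time. The sample-count parameters `hold_high` and `hold_low` are hypothetical stand-ins for the two independently adjustable "setup times"; the real circuit realizes them in continuous time with a starved inverter and a capacitor, so this is only a behavioral illustration.

```python
def debounce(bits, hold_high=3, hold_low=3, state=0):
    """Change the output only after the input holds a new state long enough.

    `hold_high` / `hold_low` are hypothetical minimum run lengths (in samples)
    required before the output switches to 1 or to 0, respectively.
    """
    out = []
    run = 0  # consecutive samples that disagree with the current output
    for b in bits:
        if b != state:
            run += 1
            needed = hold_high if b == 1 else hold_low
            if run >= needed:
                state = b
                run = 0
        else:
            run = 0  # input returned to the current state: restart the timer
        out.append(state)
    return out


# Short glitches are ignored; only sustained changes propagate.
debounced = debounce([0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0])
```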

Template Matching
Once stable comparator outputs are generated, a LUT is used to match comparator activation patterns to one of the three vehicle classes considered in this paper: "noise/no car," "car," and "truck." Due to chip real-estate considerations, the LUT fabricated on the RAMP is a digital framework with six inputs and two outputs that checks compliance with Boolean expressions; hence, the LUT is configured to use one output to assert vehicle presence and another output to assert vehicle type (i.e., if a truck is present). To configure a LUT, the RAMP software decompiles the necessary Boolean expressions into logical tables that are then stored in the two SRAM arrays comprising the LUT. When the LUT receives an input, it asynchronously queries the SRAM cells with the corresponding addresses to determine the outputs. Figure 7 presents a block diagram of the LUT architecture.
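As an illustration of the table-lookup idea, the sketch below expands two Boolean templates into two 64-entry tables (one per output) and then answers queries by address, loosely mirroring a six-input, two-output LUT backed by two memory arrays. The helper names and the example templates are hypothetical and are not the templates used in the paper.

```python
def build_lut(presence_rule, truck_rule, n_inputs=6):
    """Expand Boolean rules into one truth table per LUT output."""
    presence_table, truck_table = [], []
    for address in range(2 ** n_inputs):
        # c[n] is the Boolean state of input n for this address.
        c = [(address >> n) & 1 for n in range(n_inputs)]
        presence_table.append(int(bool(presence_rule(c))))
        truck_table.append(int(bool(truck_rule(c))))
    return presence_table, truck_table


def query_lut(tables, comparator_bits):
    """Use the comparator outputs as an address into both tables."""
    address = sum(bit << n for n, bit in enumerate(comparator_bits))
    presence_table, truck_table = tables
    return presence_table[address], truck_table[address]


# Hypothetical templates, for illustration only.
tables = build_lut(
    presence_rule=lambda c: c[0] and c[1],
    truck_rule=lambda c: c[0] and c[1] and c[4],
)
```

For example, `query_lut(tables, [1, 1, 0, 0, 1, 0])` asserts both outputs under these made-up templates.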

Final Decision Transmission
The preliminary decision made by the MSP is susceptible to errors due to wind noise and differences between approaching, present, and receding vehicles. Differences in vehicle size and proximity can lead to scenarios where the LUT incorrectly identifies the vehicle class. Thus, an MCU is used to verify the decision of vehicle presence by implementing the decision-accumulation scheme demonstrated in [6]. In the decision-accumulation scheme, the LUT output that asserts the presence of a vehicle is used as an interrupt or "wake-up" pin for the MCU. Once the LUT has generated an interrupt on the "vehicle presence" pin, the MCU wakes up and records both outputs from the LUT. The MCU continues sampling the LUT output until the LUT's "vehicle presence" pin remains low for 100 ms. This additional pause helps confirm that the vehicle that triggered the interrupt has moved out of the range of the audio sensor.
The MCU then decides the vehicle class. If the LUT's "vehicle presence" pin has been asserted for a minimum of 50 ms, the MCU decides that a vehicle is present. Otherwise, the MCU decides that the LUT has false triggered, and it returns to its low-power sleep mode. If a vehicle is present, the MCU must activate its radio transmission module to send the final decision results to a base station. However, before radio transmission, the MCU determines the vehicle type by comparing the number of samples for which the LUT's "vehicle type" pin has been asserted to a threshold. If this threshold (which is estimated to be 5000 samples using basic training data statistics) is exceeded, the vehicle is classified as a truck; otherwise, the vehicle is classified as a car. The MCU decision-making process is summarized in Figure 8.
Figure 8. Process used by the microcontroller unit (MCU) to make the final detection and classification decision.
Section 4 discusses methods for normalizing the input audio and configuring the amplitude analysis circuits (i.e., the AVDC event-detection stage) in detail.
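The decision-accumulation steps described above can be sketched as follows. The 100 ms release window, the 50 ms minimum presence time, and the 5000-sample truck threshold come from the text; the MCU sampling rate (`sample_rate_hz`) is not stated and is a hypothetical parameter, and treating the 50 ms requirement as cumulative asserted time is an assumption.

```python
def classify_event(samples, sample_rate_hz=1000, release_ms=100,
                   min_presence_ms=50, truck_threshold=5000):
    """Return None (false trigger), "car", or "truck".

    `samples` is a sequence of (presence_pin, type_pin) readings taken
    after wake-up. `sample_rate_hz` is an assumed value.
    """
    release_samples = sample_rate_hz * release_ms // 1000
    min_presence_samples = sample_rate_hz * min_presence_ms // 1000
    presence_count = truck_count = low_run = 0
    for presence, truck in samples:
        if presence:
            presence_count += 1
            low_run = 0
        else:
            low_run += 1
        if truck:
            truck_count += 1
        # Stop once the presence pin has stayed low for the release window.
        if low_run >= release_samples:
            break
    if presence_count < min_presence_samples:
        return None  # treated as a false trigger; MCU would go back to sleep
    return "truck" if truck_count > truck_threshold else "car"


glitch = [(1, 0)] * 10 + [(0, 0)] * 200
car_like = [(1, 0)] * 7000 + [(1, 1)] * 1000 + [(0, 0)] * 100
truck_like = [(1, 1)] * 6000 + [(1, 0)] * 2000 + [(0, 0)] * 100
```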
The panStamp NRG 2 [36] MCU was used in this work for its IoT capabilities, small form factor, and low power consumption. Despite measuring 1.6 cm by 2.2 cm, the panStamp NRG 2 has 32 kB of flash memory, a low sleep current of 1.5 µA, a maximum radio transmission power of 12 dBm, and AES encryption capabilities. The panStamp not only enables transmission of the final MCU decision, but it also facilitates wireless reconfiguration of the RAMP IC. A photograph of a panStamp NRG 2 soldered onto the backside of the RAMP PCB is displayed in Figure 9.

Data Preparation
Audio data preparation is the first step in the AVDC training process. The audio data used in this paper were collected previously in [16]; the data are ten-second audio clips (sampled at 4 kHz) from 20 cars and 20 trucks as well as 80 s of wind noise. Truth vectors indicating the presence and class of each vehicle were constructed via human inspection. The normalized audio and their respective truth vectors were randomly concatenated into a training and testing dataset with the same size and class composition. The training dataset was passed through the spectral-analysis stage of the AVDC system. The output stream from the adaptive-τ filter in each channel was recorded via a National Instruments 6259 data acquisition card and then imported into MATLAB to be used in the threshold estimation procedure, as described in the next two subsections. A spectrogram demonstrating the output of the spectral-analysis stage in response to the testing dataset is shown in Figure 10 for reference.

Comparator Threshold Optimization
Once the data have been prepared, training the AVDC system entails configuring the event-detection stage of the MSP. The first step in the configuration of the event-detection stage, and the focus of this subsection, is the selection of comparator reference voltages (hereafter, "comparator thresholds"). On the RAMP, comparator threshold voltages are generated by using FGs to source or sink current through resistors. These threshold voltages should be selected such that comparator activation patterns (hereafter, "codewords") are indicative of a specific vehicle class: (1) noise, (2) car, or (3) truck. Hamming distance, which is the minimum number of bit replacements required to match two binary strings, is a good measure of the similarity between two codewords. Hence, comparator thresholds (denoted by $\tau$) are selected to maximize the mean Hamming distance between codewords triggered by different vehicle classes (denoted by $M_1$) and to minimize the mean Hamming distance between codewords triggered by the same vehicle class (denoted by $M_2$). These criteria can be phrased as a multiobjective nonlinear programming problem. In this paper, maximization of $M_1$ is prioritized to reduce false alarms and misclassification, which cause unnecessary energy expenditure.
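As a small illustration of the two quantities introduced above, the sketch below forms a codeword from per-channel envelope voltages and a threshold vector, then measures the Hamming distance between two codewords. The bit ordering (channel n maps to bit n) and the example voltages are assumptions made for illustration.

```python
def codeword(channel_voltages, thresholds):
    """Pack comparator decisions into an integer (channel n -> bit n)."""
    word = 0
    for n, (v, tau) in enumerate(zip(channel_voltages, thresholds)):
        if v > tau:
            word |= 1 << n
    return word


def hamming_distance(a, b):
    """Number of bit positions in which two codewords differ."""
    return bin(a ^ b).count("1")


thresholds = [0.5, 0.5, 0.4]                       # made-up values
word_a = codeword([0.3, 0.7, 0.5], thresholds)     # channels 1 and 2 active
word_b = codeword([0.9, 0.8, 0.1], thresholds)     # channels 0 and 1 active
distance = hamming_distance(word_a, word_b)
```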
As noted in Section 1.2, it is desirable to keep the MSP out-of-the-loop for most of the training process to improve the training rate, reduce the degradation of FGs, and facilitate post-deployment adjustments. The event-detection circuits in the MSP are well characterized, so empirical measurements of spectral-analysis stage outputs can be utilized in software simulations to estimate configuration parameters for the event-detection stage of the MSP. These parameter estimates can be easily "ported" back to the RAMP for implementation in hardware. Hence, this optimization problem is solved with a two-step, lexicographic approach [37] using the MATLAB fmincon solver and the interior-point algorithm.
Lexicographic approaches are multiobjective optimization strategies that sequentially optimize a series of objectives in order of decreasing priority. Lexicographic approaches can be used to obtain Pareto-optimal solutions [38]. The optimization problem proposed in this paper has two objectives, $M_1$ and $M_2$. In the first solution step (Equation (3)), $M_1$ is maximized while $M_2$ is constrained by the relaxed boundary condition $M_2^{Tar}$. In Equation (3), $\tau_{lb}$ and $\tau_{ub}$ denote the bounds on the comparator thresholds and are set to be the minimum and maximum output voltage from each channel of the spectral-analysis stage, respectively. In the second solution step (Equation (4)), the argument $\tau$ of the minimum of $M_2$ is sought, where the optimum from the first problem ($M_1^{Max}$) is relaxed by a factor (typically around 10%) to act as a constraint on the value of $M_1$ for acceptable solutions.
Objective functions $M_1$ and $M_2$ are computed by weighting the mean Hamming distance between each pair of codewords triggered by the relevant comparison classes by a factor that is indicative of the specificity of the codeword. Defining the objectives in this manner rewards codewords for their specificity to a particular vehicle class and for their distinctness from codewords triggered by other vehicle classes. The underlying formula for computing $M_k$ for $k \in \{1, 2\}$ is given in Equation (5), together with the slices of $P$ (such as $P_{:,:,1}$). In Equation (5), $\circ$ denotes the element-wise (Hadamard) matrix product, $e$ denotes the 256×1 all-ones vector, $H$ denotes the 256×256 matrix indicating the Hamming distance between pairs of binary codewords, and $P$ denotes the three-dimensional matrix representing the required class comparisons. Figure 11 visually illustrates the class comparisons represented in $P$ and the indexing scheme used in Equation (5). $G$ represents the sparse matrix containing the occurrence frequency of each codeword comparison, which, in turn, is indicative of the class-specificity of the codewords.
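The two-step lexicographic procedure can be illustrated with a minimal sketch. Note the substitution: an exhaustive grid search stands in for the fmincon interior-point solver, and the relaxation is assumed to be multiplicative, i.e., step two accepts any candidate whose first objective is at least `(1 - relax)` times the step-one optimum (which presumes a positive optimum). The function name and the toy objectives are hypothetical.

```python
from itertools import product


def lexicographic_search(m1, m2, grids, m2_target, relax=0.10):
    """Two-step lexicographic search over a grid of threshold vectors.

    Step 1: maximize m1 subject to m2 <= m2_target.
    Step 2: minimize m2 subject to m1 >= (1 - relax) * m1_max (assumed form).
    """
    candidates = list(product(*grids))
    feasible = [tau for tau in candidates if m2(tau) <= m2_target]
    if not feasible:
        raise ValueError("no candidate satisfies the M2 target")
    m1_max = max(m1(tau) for tau in feasible)
    acceptable = [tau for tau in candidates
                  if m1(tau) >= (1.0 - relax) * m1_max]
    best = min(acceptable, key=m2)
    return best, m1_max


# Toy smooth objectives, used only to exercise the procedure.
def toy_m1(tau):
    return 1.0 - (tau[0] - 0.6) ** 2 - (tau[1] - 0.4) ** 2


def toy_m2(tau):
    return tau[0] + tau[1]


grid = [i / 10 for i in range(11)]  # stands in for [tau_lb, tau_ub] per channel
best_tau, m1_max = lexicographic_search(toy_m1, toy_m2, [grid, grid],
                                        m2_target=2.0)
```

The second step trades a small, bounded loss in the higher-priority objective for a lower value of the secondary one, which is the essence of the relaxed lexicographic ordering.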
$G$ is computed using the following expression: $G(P_{\psi,1,k}, P_{\psi,2,k}) = F^T(P_{\psi,1,k}, *)\, F(P_{\psi,2,k}, *)$, where $F(m, *)$ denotes the $m$th row of $F$, a 3×256 matrix whose first, second, and third rows indicate the number of times each codeword (ordered by decimal value) is expected to occur when noise, cars, or trucks, respectively, are present. Codewords are, in turn, estimated from $\tau$ and the output from the spectral-analysis stage of the MSP. The thresholds found through the optimization procedure proposed in this section are mapped to current biases using the characterization models of the relevant FG current sources and sinks.
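The occurrence weighting can be sketched for a single class comparison. Each weight is the product of the two codewords' occurrence counts (the outer product of two rows of F), and the weighted Hamming distances are averaged. Normalizing by the total weight and storing the rows of F as sparse dictionaries are assumptions made for illustration; the paper's Equation (5) is not reproduced here.

```python
def hamming_distance(a, b):
    """Number of differing bits between two integer codewords."""
    return bin(a ^ b).count("1")


def weighted_mean_hamming(counts_a, counts_b):
    """Occurrence-weighted mean Hamming distance between two classes.

    `counts_a` / `counts_b` map codeword -> occurrence count, i.e. sparse
    versions of two rows of F. The normalization is an assumed choice.
    """
    weighted_sum = 0
    total_weight = 0
    for word_a, n_a in counts_a.items():
        for word_b, n_b in counts_b.items():
            weight = n_a * n_b  # one entry of the outer product of counts
            weighted_sum += weight * hamming_distance(word_a, word_b)
            total_weight += weight
    return weighted_sum / total_weight


# Made-up occurrence counts for two classes.
noise_counts = {0b000: 10}
car_counts = {0b011: 8, 0b001: 2}
separation = weighted_mean_hamming(noise_counts, car_counts)
```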

Lookup Table Configuration
After threshold estimation, a LUT is configured such that the first output indicates vehicle presence and the second output indicates vehicle type. The RAMP contains 16 LUTs, each with six inputs and two outputs, that are available for usage in the AVDC. The flexible signal routing granted by the RAMP enables LUTs to be arrayed to obtain a variety of input-output combinations. However, we found that the LUTs did not need to be arrayed in this work; a few frequency channels contained redundant information, and the threshold selection algorithm placed the comparator thresholds for these channels near the lower or upper bounds ($\tau_{lb}$ or $\tau_{ub}$). As a result, redundant channels had a mostly stationary comparator activation pattern and were pruned to preserve resources. The final AVDC system in this paper uses five frequency channels.
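One way to flag such redundant channels is to check how close each optimized threshold sits to its bounds. The sketch below does this with a hypothetical tolerance (`tol`, expressed as a fraction of each channel's range); the paper does not state the criterion that was actually applied, so this is only one plausible reading.

```python
def nonredundant_channels(tau, tau_lb, tau_ub, tol=0.05):
    """Return indices of channels whose threshold is not pinned to a bound.

    `tol` is an assumed margin, expressed as a fraction of each channel's
    output range.
    """
    keep = []
    for n, (t, lo, hi) in enumerate(zip(tau, tau_lb, tau_ub)):
        margin = tol * (hi - lo)
        if lo + margin < t < hi - margin:
            keep.append(n)
    return keep


# Made-up thresholds: channels 1 and 2 sit at their bounds and are dropped.
kept = nonredundant_channels(
    tau=[0.50, 0.01, 0.99, 0.30],
    tau_lb=[0.0, 0.0, 0.0, 0.0],
    tau_ub=[1.0, 1.0, 1.0, 1.0],
)
```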
In this work, the number of nonredundant frequency channels is tractable, and the comparator activation patterns are well defined (certain comparators generally activate when a vehicle is present, and other comparators generally activate when a truck is present). Hence, after the comparator threshold values have been determined by the optimization procedure, LUT Boolean expressions can be readily found via observation of the resulting codewords. Table 1 shows the percentage of codewords that belong to each vehicle class that also satisfy the Boolean templates used to configure the LUT. Figure 12a shows the codewords in response to the testing dataset, and Figure 12b shows the output of the final LUT configuration in response to the testing dataset. Table 1 and Figure 12 both indicate that the performance of the LUT alone is satisfactory for vehicle detection yet unsatisfactory for vehicle classification, thus motivating the use of an MCU to interpret the output of the LUT and bridge lapses in LUT performance.

Table 1. Percentage of Codewords Satisfying LUT Templates in Response to the Testing Dataset ($C_n$ denotes the Boolean state of the channel with index $n$). Columns: LUT Output, Codeword Template, Vehicle Class.

If there are a greater number of nonredundant channels or if the data contain a greater number of vehicle classes, more elaborate techniques may be necessary for LUT template selection. In such a scenario, it may be possible to leverage multiclass classification algorithms [39] to aggregate codewords into their respective vehicle classes and then use logic minimization methods to resolve the final LUT templates.
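To make the LUT configuration step concrete, the following Python sketch expands two Boolean templates into a six-input, two-output truth table and computes the share of observed codewords that satisfy a template, in the spirit of Table 1. The templates shown are hypothetical placeholders chosen only for illustration (the actual templates are those listed in Table 1), and the function names and bit ordering are assumptions.

```python
def build_lut(detect_template, type_template, n_inputs=6):
    """Return a truth table: entry `code` is the (presence, type) output pair.

    Each template receives a list c where c[n] is the Boolean state C_n of
    the channel with index n (bit n of the codeword, an assumed ordering).
    """
    table = []
    for code in range(1 << n_inputs):
        c = [bool((code >> n) & 1) for n in range(n_inputs)]
        table.append((bool(detect_template(c)), bool(type_template(c))))
    return table


def template_coverage(codewords, template, n_inputs=6):
    """Percentage of observed codewords that satisfy a Boolean template."""
    if not codewords:
        return 0.0
    hits = 0
    for code in codewords:
        c = [bool((code >> n) & 1) for n in range(n_inputs)]
        hits += bool(template(c))
    return 100.0 * hits / len(codewords)


# Hypothetical templates, NOT the ones used in the paper.
def example_detect(c):
    return c[0] or c[1] or c[2]


def example_truck(c):
    return c[3] and c[4]


lut = build_lut(example_detect, example_truck)
```

Looking up `lut[codeword]` then mimics the two LUT outputs for any comparator activation pattern, and `template_coverage` can be applied per vehicle class to check how well a candidate template separates the classes.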

Results
After the training procedure, no changes are made to the AVDC system. Figure 13 shows the process of making a vehicle detection, starting from the raw audio signal from the sensor. The label for vehicle presence is shown below the input audio signal to indicate when the vehicle is near the microphone. "Detection Interrupt" indicates that a vehicle has been detected by the MSP. The "Classification" output from the LUT is used to implement the decision-accumulation scheme described in [6]. Once the MCU makes the final decision on the vehicle type, a radio transmission is sent (which is registered as a spike in power draw).
The MCU, which consists of the microcontroller and its peripheral devices (e.g., the radio transmitter), can operate in three distinct states. Each state's power consumption is listed in Table 2. The first state is "sleep mode." Sleep mode is the low-power state in which the MCU monitors a digital interrupt pin for a vehicle detection from the LUT. When an interrupt has been sent from the LUT, the MCU enters the second state: "wake mode." In wake mode, the MCU powers up to make the final decision pertaining to vehicle presence and type. If the MCU decides that a vehicle is present, it enters the third state: "radio mode." Otherwise, it returns to sleep mode. Radio mode is the state in which the MCU's radio transmitter is enabled to broadcast vehicle detections and classifications. Of the three modes, radio mode has the highest power consumption, so the MCU immediately returns to sleep mode after the radio transmission is complete.
The testing data described in Section 4 were used to evaluate the performance of the trained AVDC system. Like the training dataset, the testing dataset consists of ten car samples, ten truck samples, and 40 s of wind noise. The detection and classification results of the trained AVDC for the testing dataset are shown in Table 3.
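The three MCU operating states described above can be sketched as a small state machine. The Python simulation below is illustrative only: the decision-accumulation scheme of [6] is not reproduced in this paper, so a fixed-length majority vote is used here as a stand-in, and the names `run_node`, `window`, and `min_detections` are hypothetical.

```python
from enum import Enum


class State(Enum):
    SLEEP = "sleep"
    WAKE = "wake"
    RADIO = "radio"


def run_node(lut_outputs, window=4, min_detections=3):
    """Simulate the MCU over a sequence of (presence, is_truck) LUT outputs.

    Returns the state after each sample and the list of transmitted labels.
    The majority vote is an assumed placeholder for the scheme in [6].
    """
    state = State.SLEEP
    trace, sent, votes = [], [], []
    for detect, is_truck in lut_outputs:
        if state is State.SLEEP:
            if detect:  # the LUT raises the detection interrupt
                state = State.WAKE
                votes = [(detect, is_truck)]
        elif state is State.WAKE:
            votes.append((detect, is_truck))
            if len(votes) == window:
                n_detect = sum(1 for d, _ in votes if d)
                if n_detect >= min_detections:
                    n_truck = sum(1 for d, t in votes if d and t)
                    sent.append("truck" if 2 * n_truck > n_detect else "car")
                    state = State.RADIO  # vehicle confirmed: transmit
                else:
                    state = State.SLEEP  # false alarm: back to sleep
        else:  # State.RADIO: transmission done, return to sleep immediately
            state = State.SLEEP
        trace.append(state)
    return trace, sent
```

A short burst of detections that does not persist through the voting window returns the node to sleep without a transmission, which is the behavior that limits the energy cost of a false alarm to the wake-mode interval.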
The system proposed in this paper achieved a detection accuracy of 100% and a classification accuracy of 95% (one car was misidentified as a truck). The system had no false alarms during the 40 s of wind noise. False alarms cause the MCU to enter wake mode and possibly radio mode. Hence, each false positive contributes to energy consumption, which decreases the overall lifetime of a resource-constrained sensor node.
Table 3 also includes comparisons to [6], which presented results from a system with a similar analog/mixed-signal processing chain but built as an application-specific device. Since the audio data and class labels used in both papers are identical, and the signal processing circuits share some similarities (summarized in Section 1.2), comparisons to the work presented in [6] can provide a fair assessment of the merits of employing a reconfigurable system over an ASIC. In comparison to [6], which reported a classification accuracy of 80% for cars and four false positives, the AVDC presented in this paper has a higher classification accuracy for cars and no false positives.
To demonstrate the power savings of the AVDC system, Figure 14 compares battery lifetime in three scenarios: (1) this work, (2) previous work utilizing an ASIC-based MSP [6], and (3) a system with a purely digital implementation. Figure 14 shows the advancements that were made by [6] compared to an all-digital implementation, increasing the estimated lifetime from 4 months to 2.4 years. Figure 14 also shows the improvements of this AVDC system over [6], increasing the estimated lifetime from 2.4 years to 7.5 years. Additionally, Figure 14 gives estimates for each case both with radio transmission (w/ TX, for real-time systems that need to transmit on each event occurrence) and without radio transmission (w/o TX, for data-logging systems that do not need to transmit on each event occurrence).
Other recent advances in acoustic classification systems have shown high accuracy with low power consumption. While different in application, the voice activity detectors described in [40][41][42] provide useful comparisons in terms of acoustic processing. The voice activity detector presented in [40] reports an accuracy of 89% while consuming 6 µW. Another voice activity detector constructed from an analog filterbank feeding into a neural network reports approximately 85% accuracy while consuming only 1 µW [41]. A voice activity detector utilizing a deep neural network is presented in [42]; the neural network has a power consumption of 22 µW and the total system can consume up to 7.78 mW.
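Battery-lifetime estimates such as those in Figure 14 typically follow from a duty-cycled average-power calculation. The Python sketch below shows one plausible form of that arithmetic; it is not the authors' model. Only the 43 µW FPAA draw and the 40.9 mW MCU-and-radio draw come from this paper; the sleep power, active time per event, event rate, and battery capacity in the usage example are hypothetical values, so the result is not expected to reproduce the figures quoted above.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def average_power_w(p_always_on_w, p_sleep_w, p_active_w,
                    active_s_per_event, events_per_hour):
    """Average node power: always-on detector plus a duty-cycled MCU."""
    duty = min(1.0, active_s_per_event * events_per_hour / 3600.0)
    return p_always_on_w + (1.0 - duty) * p_sleep_w + duty * p_active_w


def lifetime_years(battery_wh, avg_power_w):
    """Battery lifetime in years for a given capacity (Wh) and average power (W)."""
    return battery_wh * 3600.0 / avg_power_w / SECONDS_PER_YEAR


# Values from the paper: 43 uW (FPAA) and 40.9 mW (MCU and radio when active).
# Everything else below is a hypothetical placeholder.
p_avg = average_power_w(
    p_always_on_w=43e-6,
    p_sleep_w=5e-6,          # hypothetical MCU sleep power
    p_active_w=40.9e-3,
    active_s_per_event=2.0,  # hypothetical wake-plus-radio time per event
    events_per_hour=6.0,     # hypothetical traffic (and false-alarm) rate
)
years = lifetime_years(battery_wh=8.0, avg_power_w=p_avg)  # hypothetical battery
```

Because false alarms add to `events_per_hour`, the same expression also shows how each false positive shortens the estimated lifetime.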
Acoustic sensing systems have also been applied to tasks that are more similar to the AVDC. The system described in [43] reports a very low power consumption of 12.2 nW and 95% classification accuracy when applied to vehicle detection. That vehicle detection system uses a custom micro-electromechanical system sensor and a custom digital-processing scheme for classification but is limited to a bandwidth of 500 Hz. The stereo-audio sensing application presented in [44] utilizes on-chip feature extraction for high accuracy and reports a power consumption of 55 µW, which is comparable to the 43 µW power consumption of the RAMP in this paper. The underwater acoustic monitoring system presented in [45] focuses on a different application space but performs a spectral analysis like many other sensing systems highlighted in this section. The underwater system has a reported power consumption of 62 µW for its MSP and was constructed primarily from commercially available components.
The systems described in [40][41][42][43][44][45] were all fabricated in 65 nm, 90 nm, or 180 nm processes. Smaller process sizes naturally bring lower supply voltages and power consumption, particularly for digital systems. In contrast, the custom-designed RAMP IC was fabricated in a standard 0.35 µm CMOS process and leveraged a non-optimized, commercially available MCU for the digital processing. Integrating custom digital-processing circuits on the RAMP IC would also significantly reduce the power consumed by the digital processing, particularly in wake mode and radio mode.
While the acoustic monitoring systems in [6][40][41][42][43][44][45] perform their respective tasks at low power levels, their application scope is constrained. In contrast, the AVDC implementation proposed in this paper was built on a highly reconfigurable platform, which opens up many more signal-processing possibilities; AVDC represents only a single application of the RAMP system, and many more low-power applications can be developed on the RAMP. In summary, the RAMP system provides classification accuracy and low power consumption that are comparable to other ultra-low-power systems while also providing the flexibility to be reprogrammed for a wide variety of applications beyond AVDC.

Conclusions
A low-power reconfigurable WSN node architecture centered on an FPAA-based mixed-signal preprocessor and a panStamp MCU was presented in this paper. The proposed node architecture was evaluated in the context of a resource-constrained AVDC scenario to assess the feasibility of its use in remote sensing applications. Using a mixed-signal audio processor and selectively waking the MCU for significant events leads to considerable power savings compared to an exclusively digital approach. Additionally, test results demonstrate that the AVDC system had a 100% detection accuracy, correctly classified 95% of the vehicles detected, and had no false alarms during 40 s of wind noise. The RAMP architecture has the potential to let sensor nodes adapt to changes in detection conditions and changes in mission directives without the need for physical recovery, making it indispensable for long-term remote-sensing applications.
Author Contributions: S.B., S.A., and D.W.G. were responsible for writing this paper and designing circuits. S.B. was responsible for the event detector training algorithm. S.A. was responsible for the MCU algorithm and experimental measurements. All authors have read and agreed to the published version of the manuscript.
Acknowledgments: The authors would like to thank Brandon Rumberg for helpful discussions during the design and system tuning processes.

Conflicts of Interest: The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this paper: