1. Introduction
Noise pollution is a common problem in urban environments. Humans are continuously exposed to noise as they go about their daily lives. However, exposure to noise in the urban environment or in the workplace can be a source of discomfort leading to health-related problems such as hearing loss if the correct protective actions are not taken. In order to assess the risk of noise for humans, one course of action is to measure the noise level present in the humans’ living and working environments. Typical environmental or background noise levels in residential areas range from 30 to 80 dB, and long-term exposure to sound levels over 85 dB causes hearing damage [1]. Studies [2,3] investigated the effects of exposure to office noise and showed that everyday exposure to noise disturbance affected comfort, health, and work performance.
In order to measure human exposure to noise, the measurement equipment must correlate the measured sound pressure level (SPL) to the perceived loudness level of noise using weighting filters, such as an A-weighting filter. An A-weighting filter for sound level meters is defined in the international standard IEC 61672-1 [4] and is required to assess noise levels for legislative regulations. The standard describes the A-weighting filter by tabulating frequency weighting values and giving an analytical expression for the transfer function of the filter, but it does not define its implementation details. A-weighting is applied to the sound samples to estimate the loudness perceived by the human ear [1].
An A-weighting filter can be implemented as an analog or digital filter. Digital filters can achieve far superior results to those of analog filters, as they do not suffer from parasitic or temperature variations that affect analog filters. Besides, the implementation of digital filters in digital systems (e.g., SoCs and MCUs), which are omnipresent in today’s measurement equipment, sensor nodes, and edge devices, is straightforward.
Coefficients of digital filters obtained from transfer functions of analog filters are real numbers that require an infinite number of bits for their representations, or at least a floating-point representation in digital systems. In practical situations, it is impossible to represent a digital filter’s coefficients with an infinite number of bits; hence, designers generally use fixed-point approaches to represent the filter’s coefficients. Unfortunately, fixed-point designs degrade the filter frequency response and introduce a theoretical limit on the filter’s performance [5]. For example, in the realization of IIR filters in digital hardware, the filter’s accuracy is limited by the length of the word used to represent the coefficients and perform arithmetic operations. Additionally, due to quantization, these coefficients are not exact. Consequently, the frequency response of the finite-word filter with quantized coefficients differs from the frequency response of the filter with exact coefficients. On the other hand, a fixed-point digital filter design can maximize filter performance in terms of area, delay, and power consumption.
The interest in reducing the power consumption of digital filters used in edge computing and sensor networks is growing rapidly. Several techniques are used to achieve significant reductions in power consumption. One of the first papers proposing approximate processing to achieve these goals is [6]; there the authors proposed an algorithm to reduce the total switched capacitance by dynamically varying the filter order based on signal statistics. A recent trend in low-power design is approximate computing for reducing arithmetic activity and average chip power dissipation [7,8,9,10]. Multiplication represents a widespread arithmetic operation in DSP; therefore, many DSP applications can benefit from an efficient multiplier design. By relaxing the requirement for exact computation, we can design a power-efficient approximate multiplier [11,12,13,14,15]. The error that emerges from product approximation should be constrained to deliver acceptable results in the application. Therefore, it is essential to find a good compromise between the accuracy of a multiplier and its efficient design.
The effectiveness of approximate multipliers for achieving low-power processing motivated us to apply approximate multiplication inside the A-weighting filter. The main idea was to introduce approximate multiplication in an A-weighting IIR filter and save area and energy while introducing a negligible computational error. This work is motivated by earlier works in which approximate multiplication was used in finite impulse response (FIR) filters [16,17,18,19]. With the optimal placement of approximate multipliers inside the A-weighting filter, we anticipate that the frequency response would be almost identical to the frequency response of the A-weighting filter with exact arithmetic. In this work, we show that we can ensure the minimal influence of approximate multiplication on the performance of the A-weighting filter and achieve power-efficient processing.
The contributions of this paper can be summarized as follows:
This paper presents a new design for an approximate low-power digital A-weighting filter implemented as a sixth-order IIR filter with approximate multipliers.
This work provides a thorough analysis of the effects of approximate multiplication on the frequency response of an A-weighting IIR filter. We show how the optimal placement of approximate multipliers across the filter and the appropriate zero-pole pairings ensure minimal degradation of the filter’s frequency response in the presence of approximate multiplication.
Synthesis results indicate that the proposed approximate IIR filter design achieves a nearly 70% reduction in energy (power-delay product) while preserving the required accuracy.
The rest of the manuscript is organized as follows. Section 2 gives some background and discusses related work and the state of the art. The architecture of a digital IIR A-weighting filter and the effects of coefficient quantization are discussed in Section 3. The proposed approximate multiplication suitable for use in an IIR filter’s cascade is presented in Section 4. In Section 5, the impacts of zero-pole pairings and the placement of approximate multipliers on the filter’s characteristics are analyzed, followed by a description of the design of a low-power digital A-weighting IIR filter using approximate multiplication. Experimental results are summarized in Section 6. Finally, the paper is concluded in Section 7.
3. Digital IIR A-Weighting Filter Architecture and Coefficient Quantization
A digital A-weighting filter is implemented as an infinite impulse response (IIR) filter, whose output depends on a finite number of input samples and a finite number of previous filter outputs. Due to the feedback paths, IIR filters are less numerically stable than their FIR counterparts [37] but offer better performance at a lower computational cost. In this section, we explore a suitable implementation of a digital A-weighting filter and its coefficient quantization.
We follow the approach by Risojević et al. [29]. Using matched-z transformation [37] for the transfer function given in (3) of the analog A-weighting filter, and sampling frequency
kHz, the transfer function of the A-weighting digital filter is obtained as:
The magnitude response of the filter with transfer function (4) slightly violates the tolerance limits imposed by [4] at high frequencies. Therefore, we added a first-order low-pass section to correct the magnitude response. The gain and cutoff frequency of the added first-order section were chosen by trial and error. The resulting transfer function is:
The digital filter defined by (5) will be referred to as the reference filter in the rest of the paper.
As can be seen from (5), the filter has poles close to the unit circle, which can make the filter unstable in the presence of coefficient quantization. Risojević et al. [29] employed a cascade-form realization of the transfer function given in (5) using second-order sections (SOS) to avoid system instability due to round-off errors in fixed-point arithmetic. The main disadvantage of the SOS implementation is the nonlinear relationship between the filter’s coefficients and its poles and zeros [37]. Due to this nonlinear relationship, it is hard to determine how quantization of the filter coefficients affects the positions of the poles and zeros, and to control the sensitivity of these positions to quantization errors. This motivated us to redesign the A-weighting filter as a cascade of first-order sections (FOS). In an FOS implementation, the filter coefficients are linearly related to the zeros and poles. Hence, we retain control over the poles and zeros of the filter with quantized coefficients. Moreover, the FOS and SOS implementations of the A-weighting filter use the same number of delay elements and arithmetic units (adders and multipliers). Factorization of the numerator and denominator polynomials in the transfer function of the A-weighting digital filter (5) yields the cascade-form implementation with FOS:
where the transfer functions of the first-order sections are:
The proposed filter can also be represented by matrices of its coefficients as:
where the position of the coefficients (i.e., zeros and poles) within the matrices represents the placement of FOS. Cascade filter realizations can be obtained by different pole-zero pairings and by different orderings of sections. In floating-point arithmetic, pole-zero pairings and the order of sections in the cascade do not affect the filter’s frequency response. However, when the filter is implemented in digital hardware using a finite number of bits to represent the filter’s coefficients, and in the presence of approximate multiplication, we cannot presume that the filter’s frequency response is unaffected by pole-zero pairings, the ordering of FOS in the cascade, and approximate arithmetic.
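The insensitivity to pairing in floating-point arithmetic can be illustrated with a minimal Python sketch. It assumes each first-order section has the form H(z) = (1 - z_k z^-1) / (1 - p_k z^-1); the zeros and poles below are hypothetical placeholders rather than the coefficients of the A-weighting filter, and the function names are illustrative only.

```python
def first_order_section(x, zero, pole):
    """Filter x with H(z) = (1 - zero*z^-1) / (1 - pole*z^-1) (assumed FOS form)."""
    y = []
    x_prev = 0.0
    y_prev = 0.0
    for sample in x:
        out = sample - zero * x_prev + pole * y_prev
        y.append(out)
        x_prev, y_prev = sample, out
    return y


def cascade(x, sections):
    """Pass x through a list of (zero, pole) first-order sections in order."""
    for zero, pole in sections:
        x = first_order_section(x, zero, pole)
    return x


# Hypothetical real zeros and poles, NOT the A-weighting coefficients.
zeros = [1.0, 1.0, 0.5]
poles = [0.9, 0.7, 0.2]

impulse = [1.0] + [0.0] * 63
pairing_a = list(zip(zeros, poles))            # zero k paired with pole k
pairing_b = list(zip(zeros, reversed(poles)))  # same zeros, poles in reverse order

h_a = cascade(impulse, pairing_a)
h_b = cascade(impulse, pairing_b)

# In double precision the two impulse responses differ only by rounding noise.
max_diff = max(abs(a - b) for a, b in zip(h_a, h_b))
print(max_diff)
```

Once the multiplications inside `first_order_section` are replaced by finite-word or approximate ones, the two pairings are no longer guaranteed to agree, which is why the pairing and ordering have to be chosen deliberately.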
We tackle this problem in Section 5. Here, we present the quantization used to determine the minimum number of bits required to represent the filter coefficients without violating the tolerance limits imposed by the IEC 61672-1 standard. We perform quantization as follows:
where
represents rounding to the nearest integer,
and
represent quantized coefficients obtained from
and
, and
Q denotes the number of bits used to represent the decimal part of the filter coefficients.
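As a minimal sketch of this step, the following Python code assumes that a coefficient is scaled by 2^Q, rounded to the nearest integer, and scaled back. The pole values are hypothetical placeholders and the function names are illustrative; they are not taken from the filter in (5).

```python
import math


def to_integer(coeff, q_bits):
    """Scale by 2**q_bits and round to the nearest integer (ties rounded up)."""
    return math.floor(coeff * (1 << q_bits) + 0.5)


def quantize(coeff, q_bits):
    """Return coeff rounded to the nearest multiple of 2**-q_bits."""
    return to_integer(coeff, q_bits) / (1 << q_bits)


# Hypothetical real poles, one of them close to the unit circle.
hypothetical_poles = [0.9973, 0.9807, 0.2291]

for q_bits in (6, 8, 10):
    quantized = [quantize(p, q_bits) for p in hypothetical_poles]
    worst = max(abs(p - q) for p, q in zip(hypothetical_poles, quantized))
    # Rounding to nearest keeps the error within half a quantization step.
    assert worst <= 2.0 ** -(q_bits + 1)
    # In a first-order section the pole equals the coefficient, so stability
    # can be checked directly on the quantized value.
    stable = all(abs(q) < 1.0 for q in quantized)
    print(q_bits, quantized, stable)
```

With these placeholder values, a small Q moves the pole nearest to the unit circle onto it, which illustrates why Q cannot be chosen arbitrarily small.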
The magnitude responses of the A-weighting filter for different values of
Q are depicted in
Figure 2. When
, the filter’s frequency response violates the IEC 61672-1 standard’s tolerance limits, but only in a narrow frequency range from 10 to 100 Hz. For
, the filter’s magnitude response exceeds the upper tolerance limit by 0.3 dB for frequencies below 20 Hz. Finally, an A-weighting filter with
satisfies the tolerance limits imposed by the IEC 61672-1 standard. Therefore, we represent the coefficients with 11 bits in the two’s complement fixed-point format. The quantized filter coefficients for all six FOS of the A-weighting filter multiplied by
are:
4. The Proposed Approximate Multiplication
In digital filters, the multipliers are indispensable components that strongly influence area, delay, and energy. If we employ approximate multipliers in digital filters, we can significantly improve energy consumption and area usage. Low energy consumption is a desired property, as A-weighting filters are often employed as a part of battery-powered devices. However, approximate multiplication can significantly influence the A-weighting filter’s stability and magnitude response. Hence, careful design and placement of approximate multipliers are required. This section first presents an exact multiplier whose design leverages the coefficient quantization and then proposes an approximate multiplier, which we obtain by simplifying the exact multiplier.
4.1. Exact Radix-4 Multiplier
A radix-4 Booth multiplier [38] consists of two stages: partial product generation and partial product addition. Let us illustrate radix-4 Booth encoding for the multiplication of two n-bit integers, i.e., a multiplicand X and a multiplier Y in two’s complement:
and
where
and
represent the bits from
X and
Y, respectively. In the radix-4 Booth encoding, the multiplier
Y is divided into overlapping groups of three bits:
where
Taking into account the radix-4 encoding of
Y, we can write the product
as:
where
represents
j-th partial product generated from
group encoding:
The previous discussion deals with the general case of an n-bit multiplier. In our case, the filter coefficients of the A-weighting filter are represented as 11-bit integers. If we take the filter coefficients as the Y input, the resulting radix-4 Booth multiplier generates six partial products, as shown in Figure 3a. As we can see, the partial product generation stage consists of six Booth encoders, which generate partial products from each radix-4 group
. In the partial product addition stage, we employ a Wallace tree [39] to reduce the number of partial products to two. The final partial product addition is implemented using a prefix (fast) adder [38].
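The exact radix-4 Booth recoding for an 11-bit multiplier can be sketched in Python as follows. This is a behavioral model under the standard textbook formulation (overlapping groups y_{2j+1}, y_{2j}, y_{2j-1} with y_{-1} = 0 and sign extension of the top group); it does not model the Wallace tree or the prefix adder, and the function names are illustrative.

```python
def booth_radix4_digits(y, n_bits=11):
    """Radix-4 Booth digits of a two's complement integer, least significant first.

    Each digit lies in {-2, -1, 0, 1, 2}; an 11-bit operand yields six digits.
    """
    assert -(1 << (n_bits - 1)) <= y < (1 << (n_bits - 1))
    n_groups = (n_bits + 1) // 2

    def bit(i):
        if i < 0:
            return 0          # y_{-1} = 0
        return (y >> i) & 1   # Python's >> sign-extends negative integers

    return [-2 * bit(2 * j + 1) + bit(2 * j) + bit(2 * j - 1)
            for j in range(n_groups)]


def booth_multiply(x, y, n_bits=11):
    """Sum the partial products digit_j * X * 4**j."""
    digits = booth_radix4_digits(y, n_bits)
    return sum(d * x * 4 ** j for j, d in enumerate(digits))


print(booth_radix4_digits(683))   # digits of an arbitrary 11-bit value
print(booth_multiply(1234, -517))
```

Summing the digits with weights 4^j reproduces Y exactly for every 11-bit two's complement value, so the six partial products always add up to the exact product.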
4.2. Approximate Odd Radix-4 Multiplier
In order to reduce the number of partial products, we propose a slightly modified radix-4 encoding. The main idea behind the proposed encoding is to shift the position of the group encodings one place to the left, as illustrated in Figure 3b. Let:
Now, the encoded value,
, of an
n-bit binary number is equal to:
In the case of an 11-bit number, the encoded value
is:
By setting
, we can rewrite the above equation as:
Hence, the idea is to use the encoding from (17) and (19) to encode the multiplier Y:
In this way, we can decrease the number of partial products by one for binary numbers with an odd number of bits.
To avoid costly subtraction, which leads to a more complex circuitry, we propose to neglect the term
and to approximate
Y as follows
Section 5 shows that neglecting the term
leads to an acceptable error. From (
22), we can see that an error arises only when
Y is an odd number.
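The shifted encoding and the effect of the neglected term can be checked with a short Python sketch. It assumes that the shifted groups are (y_{2j+2}, y_{2j+1}, y_{2j}) with weight 2^{2j+1}, which is one reading of Figure 3b that reproduces the properties stated above (five groups for 11 bits, error only for odd Y); the function names are illustrative and not part of the original design.

```python
def odd_radix4_digits(y, n_bits=11):
    """Shifted radix-4 digits for an odd word length, least significant first.

    Assumed grouping: (y_{2j+2}, y_{2j+1}, y_{2j}), each digit weighted by 2**(2j+1).
    An 11-bit operand yields five digits, one fewer than standard Booth recoding.
    """
    assert n_bits % 2 == 1
    assert -(1 << (n_bits - 1)) <= y < (1 << (n_bits - 1))
    n_groups = (n_bits - 1) // 2

    def bit(i):
        return (y >> i) & 1

    return [-2 * bit(2 * j + 2) + bit(2 * j + 1) + bit(2 * j)
            for j in range(n_groups)]


def approx_operand(y, n_bits=11):
    """Value represented by the shifted digits when the y_0 correction is dropped."""
    digits = odd_radix4_digits(y, n_bits)
    return sum(d * 2 ** (2 * j + 1) for j, d in enumerate(digits))


for y in (6, 7, -8, -9):
    print(y, approx_operand(y), approx_operand(y) - y)
```

Under this assumption the encoded value equals Y + y_0, so the exact operand would require subtracting y_0; dropping that subtraction leaves even operands exact and shifts odd operands by one.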
With the proposed approximate odd radix-4 encoding, we can calculate the product
as:
where
represents
j-th partial product generated from
. Note that we employ the same circuitry to obtain
and
as in the design of the exact radix-4 multiplier.
To further improve the proposed multiplier design in terms of area, delay, and energy consumption, we propose the omission of the last
M bits of multiplier
Y. The proposed omission also decreases the number of partial products, leading to an even more hardware- and energy-efficient design. For example, if
, we omit the last two partial products in
Figure 3b. Section 5 shows that this error does not affect the filter’s response if we select M carefully in each first-order section.
4.3. Error Analysis of the Approximate Odd Radix-4 Multiplier
In this subsection, we present the error analysis of the approximate odd radix-4 (AO-RAD4) multiplier presented in the previous subsection. We analyze the mean relative error (MRE) and the relative error distribution for error assessment. MRE is obtained as an average relative error for all sets of inputs and all possible combinations for a bit multiplier.
The calculation of relative error for AO-RAD4 is as follows. Considering (22) and (23), the relative error of the AO-RAD4 multiplier for a number pair
is obtained as:
where
is an approximately encoded operand as in (
22). Hence, the relative error depends only on
Y. The mean relative error (MRE) is calculated as follows:
Figure 4 illustrates MRE (left) and error distribution (right) for different design instances of the AO-RAD4 multiplier. Error distribution is the probability that the relative error is smaller than a specific value. We can notice that MRE (Figure 4, left) increases exponentially with M. The error distribution (Figure 4, right) shows that the parameter M has a significant impact on error distribution. For example, the number of outputs whose relative error is below 0.1 decreases significantly (from 93% to 86%) when the parameter M increases from 3 to 4.
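Because the relative error depends only on Y, the MRE can be estimated by enumerating all nonzero 11-bit operands. The Python sketch below does this under two assumptions that are not confirmed by the text: for M = 0 the approximate operand is Y + y_0, and for M > 0 omitting the last M bits is modeled as clearing the M least significant bits of Y, which may differ from the hardware in Figure 3b. The values it prints are therefore not expected to reproduce Figure 4.

```python
def approx_operand(y, m=0):
    """Behavioral model of the approximately encoded operand (assumptions above)."""
    if m == 0:
        return y + (y & 1)          # neglected y_0 correction term
    return y & ~((1 << m) - 1)      # ASSUMPTION: truncation clears the M lowest bits


def relative_error(y, m=0):
    """|X*Y_approx - X*Y| / |X*Y|; the multiplicand X cancels out."""
    return abs(approx_operand(y, m) - y) / abs(y)


def mean_relative_error(m=0, n_bits=11):
    """Average relative error over all nonzero n-bit two's complement operands."""
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1))
    errors = [relative_error(y, m) for y in range(lo, hi) if y != 0]
    return sum(errors) / len(errors)


def fraction_below(threshold, m=0, n_bits=11):
    """Share of operands whose relative error is smaller than threshold."""
    lo, hi = -(1 << (n_bits - 1)), (1 << (n_bits - 1))
    errors = [relative_error(y, m) for y in range(lo, hi) if y != 0]
    return sum(e < threshold for e in errors) / len(errors)


for m in range(0, 5):
    print(m, mean_relative_error(m), fraction_below(0.1, m))
```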
6. Simulation and Synthesis Results
We performed the experiments in three steps to verify the proposed approach for implementing an IIR A-weighting filter. Firstly, MATLAB simulations are described and presented to assess the fixed-point A-weighting IIR filter’s behavior with and without approximate multiplication. MATLAB simulation consists of comparing the frequency responses of the filters, filtering a set of environmental noise recordings, and comparing the filters’ outputs in terms of normalized root mean square error (NRMSE) and mean absolute error of sound pressure level (SPL). Secondly, we have used Verilog to implement the filters and synthesize them to 45 nm Nangate Open Cell Library. The resulting values of the area, delay, and power performance are reported. Finally, we have implemented the filter in Zynq-7000 SoC on the ZYBO Z7 FPGA development board to verify the filter’s operation in a real environment.
6.1. Magnitude Response of the Proposed Digital A-Weighting Filter
In this section, we present the MATLAB simulations of the proposed and reference A-weighting filters to observe the influence of approximate multiplications on the frequency response of the filter. We observe how much the frequency response of the proposed filter deviates from the exact frequency response given in the standard.
Figure 7 shows the magnitude responses of the proposed digital A-weighting filter from Figure 5 and the reference digital A-weighting filter whose transfer function is given by (5). Note that in the MATLAB simulation, we use the IEEE 754 double-precision format to represent the reference filter’s coefficients. It can be observed from Figure 7 that the magnitude response of the proposed A-weighting filter satisfies the tolerance limits imposed by the IEC 61672-1 standard. Moreover, the magnitude responses of the proposed and reference digital A-weighting filters are almost identical.
To quantitatively assess the two magnitude responses, we used the CSF measure.
Figure 8 shows CSF for the frequency range
. The high values of CSF for the examined frequency range suggest that the implemented and reference A-weighting filters have nearly identical frequency responses. For the examined frequency range, the average CSF equals
%, which indicates a high similarity between the frequency responses of the reference and the proposed A-weighting filters. Therefore, we can conclude that the employed approximate multipliers have a negligible influence on the filter’s frequency response.
6.2. Acoustic Noise Level Measurement
To assess the performance of the proposed A-weighting digital filter with approximate multipliers, we used the DEMAND collection of acoustic noise in diverse environments [43,44]. For acoustic noise level measurement, we calculated each recording’s sound pressure level according to Equation (1) using fast averaging. Each recording is A-weighted before we calculate the SPL value to take into account the impact of frequency on human perception of loudness. The DEMAND collection of recordings comprises four indoor environment categories, with three recordings within each category. The indoor categories are Domestic, Office, Public, and Transportation. The Domestic category consists of DKITCHEN (inside a kitchen during the preparation of food), DLIVINGR (inside a living room), and DWASHING (domestic washroom with washing machine running) recordings. The Office category consists of OHALLWAY (a hallway inside an office building with occasional traffic), OMEETING (a meeting room), and OOFFICE (a small office with three people using computers) recordings. The Public category consists of PCAFETER (a busy office cafeteria), PRESTO (a university restaurant at lunchtime), and PSTATION (the main transfer area of a busy subway station) recordings. Finally, the Transportation category consists of the following recordings: TBUS (a public transit bus), TCAR (a private passenger vehicle), and TMETRO (a subway).
Figure 9 shows the normalized root mean square error (NRMSE) between the signal from the reference filter and the signal from the proposed filter for each of the recordings in the DEMAND collection. Normalized root mean square error is defined as:
where
is the signal obtained from the reference digital filter,
is the signal obtained from the proposed filter,
and
are the maximum and minimum values of the signal
, respectively, and
N is the number of samples in each signal. It can be observed from
Figure 9 that the NRMSE values between the signal from the reference filter and the signal from the proposed filter are very small. To statistically assess the range of estimates for the mean NRMSE, we have calculated a 95% confidence interval (95% CI) from the NRMSE values obtained on the DEMAND dataset. The CI determines the range of plausible values for the mean NRMSE. The CI is calculated as follows:
where
represents the mean value of observed samples,
represents the critical value from the Student’s t-distribution,
s represents the standard deviation of observed samples, and
n represents the number of samples. We have obtained 95% CI of
for the estimate of mean NRMSE. Hence, our method would exhibit NRMSE between
and
, which implies that the proposed filter can be deployed in sound pressure level measurement without noticeable performance degradation.
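The two statistics used above can be sketched in Python as follows. The sketch assumes that the NRMSE is normalized by the range of the reference signal and that the interval is the usual two-sided Student's t interval for the mean. The NRMSE values and the critical value 2.201 (11 degrees of freedom, corresponding to twelve recordings) are placeholders for illustration, not results from the experiments.

```python
import math
from statistics import mean, stdev


def nrmse(reference, proposed):
    """RMSE between two equal-length signals, divided by the reference range.

    ASSUMPTION: the normalization uses max/min of the reference signal.
    """
    n = len(reference)
    rmse = math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, proposed)) / n)
    return rmse / (max(reference) - min(reference))


def confidence_interval(samples, t_crit):
    """Two-sided CI for the mean: mean +/- t_crit * s / sqrt(n)."""
    centre = mean(samples)
    half_width = t_crit * stdev(samples) / math.sqrt(len(samples))
    return centre - half_width, centre + half_width


# Hypothetical per-recording NRMSE values (NOT the values behind Figure 9).
hypothetical_nrmse = [0.0021, 0.0025, 0.0019, 0.0030, 0.0027, 0.0022,
                      0.0024, 0.0028, 0.0020, 0.0026, 0.0031, 0.0023]

T_CRIT_95_DF11 = 2.201  # t-table value for 95% two-sided, 11 degrees of freedom
print(confidence_interval(hypothetical_nrmse, T_CRIT_95_DF11))
```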
We have calculated two sound pressure levels for each recording: one with the proposed and one with the reference A-weighting filter. The loudness was calculated using the “fast” response (window size of 250 ms). The mean error (
) is also reported for each recording.
Figure 10 shows the loudness profiles for each of the recordings in dB SPL (A-weighted).
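A simplified version of this SPL comparison can be sketched in Python. The sketch assumes that the input samples are already A-weighted and calibrated in pascals, that the level is 20·log10 of the RMS pressure relative to 20 µPa, and that the "fast" response is modeled as non-overlapping 250 ms blocks; a sound level meter would normally use exponential time weighting instead, so this is an approximation of the procedure rather than the exact one used here.

```python
import math

P_REF = 20e-6  # reference sound pressure in pascals


def spl_profile(samples, fs, window_s=0.25):
    """Block-wise SPL in dB for non-overlapping windows of window_s seconds."""
    win = int(round(fs * window_s))
    levels = []
    for start in range(0, len(samples) - win + 1, win):
        block = samples[start:start + win]
        mean_square = sum(s * s for s in block) / win
        if mean_square > 0.0:
            levels.append(10.0 * math.log10(mean_square / P_REF ** 2))
        else:
            levels.append(float("-inf"))  # digital silence
    return levels


def mean_absolute_error(levels_a, levels_b):
    """Mean absolute difference between two SPL profiles of equal length."""
    return sum(abs(a - b) for a, b in zip(levels_a, levels_b)) / len(levels_a)


# Synthetic example: a 1 kHz tone sampled at 48 kHz, and a slightly attenuated copy
# standing in for the output of an approximate filter.
fs = 48000
tone = [0.1 * math.sin(2.0 * math.pi * 1000.0 * n / fs) for n in range(fs)]
attenuated = [0.99 * s for s in tone]

ref_levels = spl_profile(tone, fs)
approx_levels = spl_profile(attenuated, fs)
print(ref_levels[0], mean_absolute_error(ref_levels, approx_levels))
```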
As can be observed from Figure 9 and Figure 10, the NRMSE between the signal from the reference filter and the signal from the proposed filter correlates strongly with the mean absolute error (
) between the SPL values obtained with the proposed and the reference A-weighting filters. For example, the DKITCHEN recording has the smallest NRMSE and
, and the TCAR recording has the highest NRMSE and
.
To understand the underlying distribution of
, we have calculated the histogram for the DEMAND dataset and presented it in
Figure 11. From
Figure 11, we can conclude that a significant amount of the
is concentrated in the interval [0.6, 0.8] dB. Through the histogram analysis, we concluded that 91% of the obtained
is smaller than 1 dB. Keeping in mind that professional SPL meters tend to have
dB error tolerance, these results indicate that the proposed filter offers satisfactory performance for SPL measurement. Finally, we can see that the maximal
is equal to 1.4 dB. This suggests that the proposed filter can comply with the requirements for a Type 2 sound level meter [4].
Finally, we have assessed the proposed filter’s decibel range using pink noise sequences. We generated several pink noise sequences with different noise levels and calculated two sound pressure levels for each sequence: one with the proposed approximate filter and one with the reference A-weighting filter.
Figure 12 shows the correlation between the noise level of pink noise and
. As we can see, the proposed filter gives satisfactory results for the examined pink noise sequences.
6.3. CMOS Synthesis
In this subsection, we analyze and compare the proposed digital A-weighting filter’s hardware performance in terms of power, area, delay, and power-delay product (PDP). We compare the synthesis results of two digital A-weighting filters: the proposed digital filter with AO-RAD4 multipliers as in Figure 5, and the reference filter (5) with exact radix-4 multipliers. The filters were implemented in Verilog and synthesized to the 45 nm Nangate Open Cell Library. For the Verilog-to-GDS synthesis flow, we employed OpenROAD Flow [45], a full RTL-to-GDS flow built entirely on open-source tools. We used timing with 10 MHz virtual clocks to evaluate the power with a 5% signal toggle rate and an output load capacitance of 10 fF. The synthesis conditions aim to compare different filters while keeping equal conditions for all experiments. The synthesis results are listed in Table 2 and consist of cell area in
, delay or critical path in nanoseconds, total power (leakage plus dynamic) in μW, and energy or power-delay-product (PDP) in fWs.
As can be observed from Table 2, the proposed digital filter with the approximate AO-RAD4 multipliers has substantially smaller area utilization and energy consumption than the reference digital filter with exact radix-4 multipliers. The proposed filter occupies only 41% of the area of the reference filter. The power consumption for the digital filter with exact multipliers is 63% higher than the power consumption for the proposed digital filter with AO-RAD4 approximate multipliers. The proposed digital filter consumes 70% less energy (PDP) than the digital filter with exact multiplication. Besides, the proposed digital filter can process the samples 1.2 times faster.
The superior hardware performance of the proposed approximate filter originates from the usage of the approximate multipliers. The proposed approximate filter and the reference filter have the same FOS structure and the same number of arithmetic operations. Still, the former employs the approximate AO-RAD4 multipliers, and the latter employs the exact radix-4 multipliers. In this way, we achieved fair comparison and eliminated the influence of the filter structure on the synthesis results. For the exact radix-4 multiplier, the complexity of the product generation stage equals (n bits of multiplicand X, and partial products). In the case of the AO-RAD4 approximate multiplier, the complexity is equal to , where n denotes the bit width of the multiplicand, q quantization factor, and M represents the truncation parameter of approximate odd radix-4 Booth multiplier. In the proposed filter, we chose bits for representing the multiplicand X, for the quantization factor, and employed AO-RAD4 multipliers with and . Therefore, the partial product stage complexity in AO-RAD4 is theoretically reduced by 80% compared to the exact radix-4 multiplier. The exact multiplier and the proposed AO-RAD4 multiplier also differ in the number of partial products. The exact radix-4 Booth multiplier has partial products, and the proposed approximate multiplier has partial products. The employed approximate multipliers with and have only three partial products, and the exact multiplier has 16 partial products. With fewer partial products, the approximate AO-RAD4 multiplier exhibits significantly smaller energy consumption and area utilization, which leads to an overall reduction in area and energy in the proposed filter.
To compare the outputs of the Verilog and MATLAB models, we verified the design through FPGA prototyping. We deployed the proposed filter to the Zynq-7000 SoC on the ZYBO Z7 FPGA development board. As test inputs, we used an impulse sequence and additive white Gaussian noise (AWGN). Figure 13 shows the outputs of the filter implemented in the Zynq-7000 SoC and of the MATLAB simulation model in the presence of environmental noise. The outputs match, so the filter implemented in the Zynq-7000 SoC has the same functionality as the MATLAB model.
6.4. Discussion
Employing approximate multipliers in the A-weighting IIR filter offers remarkable savings in energy consumption and area utilization while having a negligible impact on accuracy. As the approximate and the reference filters have the same structure and the same number of first-order sections, the low area utilization and low energy consumption of the approximate filter come solely from the use of the approximate multipliers. The smaller number of partial products in the proposed approximate multiplier leads to a smaller circuit. Hence, the overall area and power consumption of the proposed filter are reduced. However, careful placement of approximate multipliers in the A-weighting filter is required to meet the filter’s accuracy, stability, and frequency response requirements. As the criterion for the placement of the approximate multipliers, we selected the similarity between the magnitude responses of the proposed filter and the reference filter. In other words, the optimal choice and placement of the approximate multipliers in the A-weighting filter give a magnitude response that is almost identical to the magnitude response of the reference A-weighting filter and satisfies the IEC 61672-1 requirements. Hence, there is an insignificant difference between signals filtered with the proposed and reference filters.
As with every approximation scheme, the one proposed here has shortcomings and limitations. It applies only to IIR filters that can be decomposed into first-order sections (FOS). We selected first-order sections as the filter building block because they have a linear relationship between the coefficients and the poles of the transfer function. However, only IIR filters with real or near-real poles can be decomposed into FOS. Besides, this study concentrated solely on deploying approximate multipliers, and the design of the adders was left unaltered. To further reduce the proposed filter’s power consumption, the adders’ design also needs to be considered. To summarize, Figure 14 shows the design flow presented in this paper.