A Ratio Model of L1/L2 Norm for Sound Source Identification

Huang, Linsen; Xu, Zhongming; Zhang, Zhifei; He, Yansong

doi:10.3390/s20185290

Open AccessLetter

A Ratio Model of L₁/L₂ Norm for Sound Source Identification

School of Automotive Engineering, Chongqing University, 174 Shezhengjie, Chongqing 400044, China

^*

Author to whom correspondence should be addressed.

Sensors 2020, 20(18), 5290; https://doi.org/10.3390/s20185290

Submission received: 13 August 2020 / Revised: 8 September 2020 / Accepted: 14 September 2020 / Published: 16 September 2020

(This article belongs to the Section Physical Sensors)

Download

Browse Figures

Versions Notes

Abstract

In the field of sound source identification, robust and accurate identification of the targeted source could be a challenging task. Most of the existing methods select the regularization parameters whose value could directly affect the accuracy of sound source identification during the solving processing. In this paper, we introduced the ratio model

ℓ_{1} / ℓ_{2}

norm to identify the sound source(s) in the engineering field. Using the alternating direction method of multipliers solver, the proposed approach could avoid the selection of the regularization parameter and localize sound source(s) with robustness at low and medium frequencies. Compared with other three methods employing classical penalty functions, including the Tikhonov regularization method, the iterative zoom-out-thresholding algorithm and the fast iterative shrinkage-thresholding algorithm, the Monte Carlo Analysis shows that the proposed approach with

ℓ_{1} / ℓ_{2}

model leads to stable sound pressure reconstruction results at low and medium frequencies. The proposed method demonstrates beneficial distance-adaptability and signal-to-noise ratio (SNR)-adaptability for sound source identification inverse problems.

Keywords:

microphone array; near-field acoustical holography; sound source identification; sparse regularization

1. Introduction

The acoustic imaging techniques have played a central role in the Noise, Vibration, and Harshness (NVH) engineering field of sound source identification, such as automotive, aircraft, and industrial applications [1,2,3]. As one of the excellent acoustic imaging tools, near-field acoustical holography (NAH) was first formalized four decades ago by the groundbreaking work of Williams [4,5]. It is an experimental method that estimates a complex sound field, based on the sound pressure measured by a microphone array close to the measured object. NAH gradually becomes an ideal technique which provides high spatial resolution for sound source localization and quantification. NAH is often cast as an acoustic reconstruction problem. However, both the resolution quality and the economic cost in NAH are limited by the size of sampling dataset.

The findings of compressive sensing (CS) techniques shed new light on the microphone array signal processing [6]. Much literature has examined the combined sparse regularization strategy based on the CS to deal with the time- and labor-intensive tasks in the community of acoustic imaging. Chardon [7] firstly introduced the CS technique into NAH to reduce the number of samplings and tuned the data fidelity constraint by using a function of the hologram norm. In the research conducted by Attendu [8], the author predicted the optimal regularization parameter with the Pareto frontier curve which is another kind of the L-curve approach. Fernandez-Grande [9] estimated a noise floor, assuming that the signal-to-noise ratio (SNR) is known. Gerstoft [10] set out to examine the regularization parameter selection via the least absolute shrinkage and selection operator (LASSO) path. Hald [11] tested a simple method about the determination of the regularization parameter which is based on a specified dynamic range in decibels. Giri [12] applied the regularized least squares (RLS) method as a baseline method to heuristically choose the regularization parameter. Hu [13] adopted the general cross validation to determine the regularization parameter in the proposed method. Shi [14] found a reasonable weighting parameter of the objective function through many experiments. Bi [15] determined the best value parameter in the data fidelity constraint by the 10-fold cross validation method. Tan [16] proposed an approach that allows the penalty to be automatically adjusted by putting the measured distance into the regularization parameter. However, the methods selecting the regularization parameter are involved in these above-mentioned reports and most of them are heuristic. To that end, an objective approach for the performance evaluation should be designed, so that it could be free from the experimental conditions. The work by Hald [11] mentioned that the selection of the regularization parameter could be avoided by solving the constrained

ℓ_{1}

minimization problem. Correspondingly, Rahimi [17] proposed a ratio of

ℓ_{1}

and

ℓ_{2}

norms model denoted as

ℓ_{1} / ℓ_{2}

which is comparable to other approaches in sparse recovery. Pham [18] mapped a sparse moving-source with a smoothed

ℓ_{1} / ℓ_{2}

regularization method. Wang [19] connected the relationship between

ℓ_{1} / ℓ_{2}

model and

ℓ_{1} - α ℓ_{2}

model in the domain of sparse recovery. Enlighten by the scale-invariant

ℓ_{1} / ℓ_{2}

norm ratio model, we intend to introduce this ratio model into the application of sound source identification to avoid the selection of the regularization parameter during the solving process.

In this presented paper, an approach using the ratio model of the

ℓ_{1}

and

ℓ_{2}

norm (denoted as

ℓ_{1} / ℓ_{2}

) for sound source identification is proposed. We intend to solve this problem with the alternating direction method of multipliers (ADMM). To benchmark the performance of the proposed method, we compare with other three reconstruction methods including the Tikhonov regularization method (denoted as Tikhonov), the iterative zoom-out-thresholding algorithm (IZOTA) [11] and the fast iterative shrinkage-thresholding algorithm (FISTA) [20] to demonstrate the advantage of the proposed method in the application of sound source identification.

The remainder of this paper is the following: Section 2 models the sound source reconstruction problem. Section 3 introduces the ratio model of

ℓ_{1} / ℓ_{2}

norm. Section 4 evaluates the proposed method on performance assessments and error evaluations compared with Tikhonov, IZOTA, and FISTA in numerical simulations. Section 5 verifies the availability of the proposed method by experimental results, and Section 6 concludes and summarizes this paper.

2. Sound Source Reconstruction Model

The equivalent source method (ESM) enables the sound field generated by random sources to be interchanged by a number of equivalent sources in an assumed equivalent source plane [21,22]. With N equivalent sources in the equivalent source plane and M microphones in the microphone array (i.e., the holography plane), Equation (1) obtains the acoustic pressure at mth microphone:

p (m) = \sum_{n = 1}^{N} g (r_{m} | r_{n}) q_{n}

(1)

where

g (r_{m} | r_{n})

is the free-field Green’s function that establishes the relationship between nth equivalent source and mth microphone.

g (r_{m} | r_{n}) = e^{- j k ∥ {\vec{r}}_{m n} ∥_{2}} / 4 π {∥ {\vec{r}}_{m n} ∥}_{2}

,

k = 2 π f / c

is the wavenumber (in rad/m), f represents the frequency and c means the acoustic speed (in m/s),

j = \sqrt{- 1}

.

∥ {\vec{r}}_{m n} ∥_{2}

represents the distances between the equivalent sources and the sensors. The Equation (2) relates the microphone array and the equivalent sources

q

.

p = A q

(2)

where

p

is the vector form of elements

p (m)

; The vector

A

is an

M \times N

transfer matrix. After calculating the equivalent sources strength

q

, the Equation (3) reconstructs this inverse problem to obtain the reconstructed acoustic pressure

p_{R}

p_{R} = G q,

(3)

where

G

is the other transfer matrix which relates the equivalent source plane and the reconstruction plane. Often, Equation (2) could be solved by the Tikhonov regularization method which is a kind of

ℓ_{2}

norm type regularization method.

min_{q} {∥ p - A q ∥}_{2}^{2} + λ {∥ q ∥}_{2}^{2}

(4)

where

{∥ p - A q ∥}_{2}^{2}

is known as measurement error, it could be interpreted the correctness of fitness for the liner system;

{∥ q ∥}_{2}^{2}

is the regularizer with

ℓ_{2}

norm.

λ

is a positive regularization parameter to balance the relative importance between the measurement error and sparsity. However, the solution of the Tikhonov regularization method,

q_{tik} = {[A^{H} A + λ^{2} I]}^{- 1} A^{H} p

, with

I

is the

M \times M

identity matrix, is smooth rather not sparse. One strategy would be to recast the

ℓ_{2}

norm problem solved by the conventional regularization method into the

ℓ_{1}

norm minimization model by sparsity-promoting. As previously described [23],

ℓ_{1}

norm regularization frame could promote the sparsity, but it fails to generate a solution with enough sparsity.

min_{q} {∥ p - A q ∥}_{2}^{2} + λ {∥ q ∥}_{1}

(5)

Another strategy would be to replace the

ℓ_{1}

norm minimization by introducing a sparsity-promoting

ℓ_{0}

norm penalty [24]. The sparse representation in Equation (2) could be transformed to Equation (6). Yet this would lead to a combinatory optimization problem; this acquires a sparse but underestimated solution via the corresponding sparse regularization model at a high computational cost.

min_{q} {∥ q ∥}_{0} s . t . {∥ p - A q ∥}_{2}^{2} \leq ε

(6)

where

{∥ q ∥}_{0}

is the number of nonzero components (usually named the

ℓ_{0}

norm though it is a pseudo norm),

ε

is the noise level. The sparse frame of solution could be measured by

{∥ q ∥}_{0}

. Because

ℓ_{0}

norm is a non-convex and non-smooth function, Equation (6) is a non-deterministic polynomial-time (NP) hard problem. To avoid the computational difficulty of solving the model (6), Candès [25] summarised the methods that could relax it to the

ℓ_{1}

norm frame.

3. The Introduction of Ratio Model of $L_{1} / L_{2}$ Norm

We applied the ratio model of the

ℓ_{1}

and

ℓ_{2}

norm for sound source identification:

min_{q \in C^{N}} \{\frac{{∥ q ∥}_{1}}{{∥ q ∥}_{2}} + {∥ A q - p ∥}_{2}^{2}\}

(7)

Using the frame of decomposition-coordination, the alternating direction method of multipliers (denoted as ADMM) forms simply and functions powerfully; it solves small local subproblems to coordinate to solve a large global problem [26]. The combination of both the dual decomposition and augmented Lagrangian methods paves the way for ADMM to solve the constraint optimization problem. Therefore, we intended to solve the ratio model of the

ℓ_{1}

and

ℓ_{2}

norm via the ADMM solver and we introduced two auxiliary variables to apply the ADMM solver.

The Equation (7) could be rewritten into Equation (8) [17,27],

min_{q, y, z} \{\frac{{∥ z ∥}_{1}}{{∥ y ∥}_{2}} + {∥ A q - p ∥}_{2}^{2}\} s . t . y = q, z = q

(8)

As discussed in [26], ADMM runs well without specific tuning. The augmented Lagrangian

L_{ρ} (q, y, z; v, w)

is expressed in Equation (9) by reformulating Equation (8):

\begin{matrix} L_{ρ} (q, y, z; v, w) = & \frac{{∥ z ∥}_{1}}{{∥ y ∥}_{2}} + {∥ A q - p ∥}_{2}^{2} + \frac{ρ}{2} {∥q - y + \frac{1}{ρ} v∥}_{2}^{2} + \frac{ρ}{2} {∥q - z + \frac{1}{ρ} w∥}_{2}^{2} \end{matrix}

(9)

where

y

and

z

are two auxiliary variables,

v

and

w

are two Lagrange multiplier vectors. The Equation (10) processes in following five steps by updating

q, y, z, v, w

individually:

\{\begin{matrix} q^{(k + 1)} : = arg min_{q} L_{ρ} (q, y^{(k)}, z^{(k)}; v^{(k)}, w^{(k)}) \\ y^{(k + 1)} : = arg min_{y} L_{ρ} (q^{(k + 1)}, y, z^{(k)}; v^{(k)}, w^{(k)}) \\ z^{(k + 1)} : = arg min_{z} L_{ρ} (q^{(k + 1)}, y^{(k + 1)}, z; v^{(k)}, w^{(k)}) \\ v^{(k + 1)} : = v^{(k)} + ρ (q^{(k + 1)} - y^{(k + 1)}) \\ w^{(k + 1)} : = w^{(k)} + ρ (q^{(k + 1)} - z^{(k + 1)}) \end{matrix}

(10)

where

ρ

is a positive constant and it is suggested to be 200 in this study, and

{(\cdot)}^{(k + 1)}

implies the

{(k + 1)}^{t h}

iteration. The update for

q

could be expressed in Equation (11),

\begin{matrix} q^{(k + 1)} = (I - A^{T} {(A A^{T})}^{- 1} A) f^{(k)} + A^{T} {(A A^{T})}^{- 1} p \end{matrix}

(11)

where

I

refers to a diagonal matrix and

f^{(k)} = \frac{1}{2} (y^{(k)} - \frac{1}{ρ} v^{(k)}) + \frac{1}{2} (z^{(k)} - \frac{1}{ρ} w^{(k)})

.

For the solution of the

y

-subproblem, let

c^{(k)} = {∥z^{(k)}∥}_{1}

and

d^{(k)} = q^{(k + 1)} + \frac{v^{(k)}}{ρ}

, this subproblem could be reduced to Equation (12):

y^{(k + 1)} = arg min_{y} \frac{c^{(k)}}{{∥ y ∥}_{2}} + \frac{ρ}{2} {∥y - d^{(k)}∥}_{2}^{2}

(12)

For the auxiliary variable

y

, we could obtain Equation (13) by taking derivative of the objective function:

(- \frac{c^{(k)}}{{∥ y ∥}_{2}^{3}} + ρ) y = ρ d^{(k)}

(13)

As a result, there exists a positive number

τ^{(k)} \geq 0

such that

y = τ^{(k)} d^{(k)}

; Given

d^{(k)}

, we denote

η^{(k)} = {∥d^{(k)}∥}_{2}

and

D^{(k)} = \frac{c^{(k)}}{ρ {(η^{(k)})}^{3}}

, then

τ^{3} - τ^{2} - D^{(k)} = 0

(14)

The Equation (14) acquires only one real root according to the cubic-root formula. The Equation (15) provides the closed-form solution.

τ^{(k)} = \frac{1}{3} + \frac{1}{3} (C^{(k)} + \frac{1}{C^{(k)}})

(15)

with

C^{(k)} = \sqrt[3]{\frac{27 D^{(k)} + 2 + \sqrt{{(27 D^{(k)} + 2)}^{2} - 4}}{2}}

.

In summary, the update for the auxiliary variable

y

in Equation (9) could be expressed in Equation (16)

y^{(k + 1)} = \{\begin{matrix} e^{(k)} & d^{(k)} = 0 \\ τ^{(k)} d^{(k)} & d^{(k)} \neq 0 \end{matrix}

(16)

where

e^{(k)}

is a random vector with the

ℓ_{2}

norm to be

\sqrt[3]{\frac{c^{(k)}}{ρ}}

.

Finally, the update for the auxiliary variable

z

in Equation (9) could be expressed in Equation (17):

z^{(k + 1)} = S (q^{(k + 1)} + \frac{w^{(k)}}{ρ}, \frac{1}{ρ {∥y^{(k + 1)}∥}_{2}})

(17)

where

S {(b, μ)}_{i}

is the soft-thresholding function as Equation (18) defined:

S {(b, μ)}_{i} = sign (b_{i}) max (0, |b_{i}| - μ), i = 1, 2, \dots n

(18)

We determined the regularization parameter in the Tikhonov regularization method via the L-curve method; followed the regularization parameter in FISTA based on the literature [11]; set the parameters in IZOTA as per the reference [28]. For FISTA and the proposed approach, the iteration is started with the initial guess

q^{(0)}

with a vector of zeros; the maximum iteration is 600.

4. Simulations Analysis

We describe the technique in two main phases: performance assessments and error evaluations respectively.

In all simulations, we assumed that the sound velocity is 340 m/s; the sound source was a vibrating sphere of radius 0.01 m, which vibrated at

2.5 \times 10^{- 2}

m/s. The equivalent source plane, the reconstruction plane, and the microphone array (i.e., the holography plane) were

10^{- 3}

m,

10^{- 2}

m, and 0.1 m from the sound source plane respectively (see Figure 1a); Figure 1b illustrates the geometry of 18-channel irregular microphone array. For the single sound source case, the point source was placed at (0, 0, 0) m; for the coherent sound sources case, the point sources were placed at (0.2, 0, 0) m and (−0.2, 0, 0) m respectively. The meshing grids of the equivalent source plane and the reconstruction plane were

41 \times 41

grids whose spacing is

10^{- 2}

m. White Gaussian noise with SNR 30 dB was added to the simulations.

We acquired all the numerical results of simulation by running a MATLAB 9.2 (R2017a) implementation of four methods including Tikhonov, IZOTA, FISTA, and

ℓ_{1} / ℓ_{2}

on a standard PC (8 GB of RAM on Windows 10) with CPU (Intel i5-8400, 2.8 GHz).

4.1. Performance Assessments

Figure 2 and Figure 3 simulated numerically the sound pressure reconstruction maps at the frequencies of 800 Hz, 1600 Hz, and 2400 Hz in turn for the single sound source case and the coherent sound sources case respectively. We found that the Tikhonov regularization method failed to focus on the sound source(s) since the identified acoustic pressure of the main lobe was lower than 10 dB compared with other methods and mixed the coherent sound sources into a hot-spot (Figure 3a). As a mainstream method in the field of NAH, IZOTA coincided with well the theoretical value in both of cases; it underperformed at the frequency of 800 Hz for the coherent sound sources case (Figure 3g), since this method did not provide a minimize function of the

ℓ_{1}

norm regularization problem [11]. FISTA focused on the centers but it failed to depress the side lobes effectively (Figure 3j). The proposed method detected the position of the targeted sound source(s) and reconstructed the acoustic pressure accurately compared with the theoretical value (Figure 3m).

Figure 4 reveals the cross-section of the reconstruction acoustic pressure plane for both cases. The Tikhonov regularization method peaked at half of the theoretical value; IZOTA was superior to FISTA and it underperformed at low frequencies in the coherent sound sources case; the proposed method fitted the theoretical value expected in the coherent sound sources case at the frequency of 2400 Hz.

4.2. Error Evaluations

To quantitatively rate these four methods in percentage terms, the reconstruction error (RE) and the quantitative error (QE) were defined in Equation (19):

RE = \sqrt{\sum_{i = 1}^{N} {(|p_{i}| - |{\bar{p}}_{i}|)}^{2}} / \sqrt{\sum_{i = 1}^{N} {|{\bar{p}}_{i}|}^{2}}; QE = |max (S P L_{theory}) - max (S P L_{reconstruction})|

(19)

where

p_{i}

is the reconstructed value of acoustic pressure and

{\bar{p}}_{i}

is the theoretical value of acoustic pressure;

S P L_{reconstruction}

is the reconstructed value of acoustic pressure sound pressure level (SPL) and

S P L_{theory}

is the theoretical value of sound pressure level.

Figure 5 depicts the error of reconstructed acoustic pressure in the frequency band from 500 Hz to 5000 Hz. The reconstruction error was performed using the average value of 30 sets of simulations. Tikhonov failed to work with the increase of frequency. The increase of RE for FISTA was the most obvious error and it was also the largest error increase among the four algorithms. It increased faster in the coherent sound sources case compared with the single sound source case. IZOTA was relatively stable (below 15%), though it fluctuated along with the number supplement of the sound source (over 30%). The proposed method performed a stable function (below 30%) at low and medium frequencies. It is clear for the reconstruction error that the proposed method could decrease below 10% along with the increase of the frequency in the monopole sound source case.

Next, we adopted the quantitative error maps for the coherent sound sources case to evaluate the comprehensive performance which was influenced by different frequencies, varied measured distances, and dissimilar SNRs. Figure 6a,c,e,g presents the quantitative error (QE) of four above-mentioned algorithms in the frequency band of 500∼5000 Hz in with holographic distances of 0.05∼0.5 m and SNR 30 dB; in Figure 6b,d,f,h with SNR 10∼60 dB and holographic distance 0.1 m. The contour maps shared the same color bars whose maximum values were set to 50%.

On the distance-adaptability hand, Figure 6a shows that Tikhonov ran well in the near-field but stumbled in the high-frequency measurements. IZOTA extended the distance and the frequency for the measurement (Figure 6c). Figure 6e shows that FISTA could fail to maximize the performance when the frequency above 3500 Hz and the measured distance over 0.3 m since the quantitative error exceeded 40%.

ℓ_{1} / ℓ_{2}

provided the acceptable reconstruction accuracy at low frequencies and in the near-field measurements; it is worth noting that

ℓ_{1} / ℓ_{2}

could be adaptive well to the far-field measurements with the frequency increased (Figure 6g). On the SNR-adaptability hand, the QE of Tikhonov failed to be adaptive in the high-frequency band (Figure 6b). IZOTA and FISTA progressed obviously and they fitted at low and medium frequencies with SNR 10∼60 dB.

ℓ_{1} / ℓ_{2}

performed well over SNR 20 dB at low and medium frequencies (Figure 6h).

On top of that, we conducted an T-trail Monte Carlo simulation at 350 Hz interval frequency from 450 Hz to 5000 Hz, to evaluate the reconstruction accuracy. In this Monte Carlo simulation, the position of the sound source was random while the SNR was 30 dB. According to the references [29,30,31], we defined the average reconstruction error (ARE) and the average quantitative error (AQE) in Equations (20) and (21) apiece,

ARE = \frac{1}{T} \sum_{i = 1}^{T} (20 lg (\sum_{i = 1}^{N} (|p_{i}| / |{\bar{p}}_{i}|)) / (20 lg (\sum_{i = 1}^{N} |{\bar{p}}_{i}|)))

(20)

AQE = \frac{1}{T} \sum_{i = 1}^{T} (\frac{1}{N} \sum_{i = 1}^{N} 20 lg (|p_{i}| / |{\bar{p}}_{i}|))

(21)

where

T = 1000

, N is the number of the equivalent sources.

Figure 7 shows the curves of the average reconstruction error (ARE) and the average quantitative error (AQE) calculated by the four above-mentioned methods under 1000-trial Monte Carlo simulation in both cases. The smaller value of ARE and AQE could present a better and more stable reconstruction performance. Over the 1200 Hz frequency range, IZOTA and

ℓ_{1} / ℓ_{2}

stepped in steady-state while Tikhonov and FISTA were in a rising state. Below the 1200 Hz frequency range, Tikhonov was the best among the four methods.

Taken together, on the reconstruction accuracy and stability hand, IZOTA was superior to

ℓ_{1} / ℓ_{2}

;

ℓ_{1} / ℓ_{2}

is superior to FISTA; FISTA was superior to Tikhonov. It is worth noting that, compared to IZOTA [28],

ℓ_{1} / ℓ_{2}

required fewer parameters to be selected.

5. Experimental Applications

We conducted experiments to check the ability of these methods. The same setting as simulation, we placed the speaker(s) excited by a steady-state sinusoidal acoustic wave at the targeted position (as visible in Figure 8). The speaker was at the origin of the coordinates in the single sound source case while the speakers were at (0.2, 0, 0) m and (−0.2, 0, 0) m in the coherent sound sources case. The sound source plane was the plane perpendicular to the ground on which the speaker was located. The equivalent source plane was at

10^{- 3}

m (z =

10^{- 3}

m), the reconstruction plane was at

10^{- 2}

m (z =

10^{- 2}

m), and the microphone array was at 0.1 m (z =

10^{- 1}

m) distance from the sound source plane. We acquired the data using an 18-channel Combo microphone array made by HBK company, which was a random distribution of HBK 4951 microphones in a circle of 0.38 m diameter. The sampling frequency was 32,768 Hz and the sampling time was 5 s. Collected by the HBK LAN-XI data collector, the sound pressure data were processed to the acoustic imaging via the MATLAB software. The reconstructed sound pressure map exported the sound pressure level (SPL), with 2 dB intervals and −10 dB from the peak value as the display range.

Figure 9 and Figure 10 shows the experimental sound pressure maps at the frequencies of 800 Hz, 1600 Hz, and 2400 Hz in turn for the monopole sound source case and the coherent sound sources case respectively. We found that the IZOTA performed extremely well in the monopole sound source case while it failed to depress the sidelobe at low frequencies in the coherent sound sources case. FISTA corresponded to the simulation analysis results. The proposed method could detect the position of the sound source(s), separate the coherent sound sources, and reconstruct the acoustic pressure accurately.

The experimental results of this study should be considered in light of the following limitation: the experiments were conducted in a room with non-anechoic chamber conditions, so noise interference may have been present. We speculate that this was the reason for the difference between the 2400 Hz simulated and experimental results.

6. Conclusions

In this paper, an approach is proposed to obtain a robust and accurate estimation for sound source identification via the

ℓ_{1} / ℓ_{2}

norm ratio model. Using the ADMM solver, the proposed approach could avoid the selection of the regularization parameter.

Simulation analysis results illustrate that the proposed method could recognized the location of sound source(s) and reconstruct the acoustic pressure close to the theoretical value; it localizes sound source with robustness, and expresses the distance-adaptability and SNR-adaptability at low and medium frequencies. The Monte Carlo analysis shows that the proposed approach leads to stable sound pressure reconstruction results at low and medium frequencies. Experimental applications chart the sound pressure map at the frequencies of 800 Hz, 1600 Hz, and 2400 Hz; the results indicate that

ℓ_{1} / ℓ_{2}

could separate the coherent sound sources at 800 Hz, while IZOTA underperformed.

The proposed method could offer potential gains in industrial applications for sound source identification.

Author Contributions

Conceptualization, L.H., Z.X., and Y.H.; methodology, L.H. and Z.Z.; software, L.H.; validation, Z.Z. and Z.X.; formal analysis, L.H.; investigation, L.H.; resources, Z.Z., Y.H., and Z.X.; data curation, L.H., Z.X.; writing—original draft preparation, L.H.; writing—review and editing, Z.Z., Y.H., and Z.X.; visualization, L.H.; supervision, Z.X.; project administration, Z.X.; funding acquisition, Z.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China grant number 11874096.

Conflicts of Interest

The authors declare no conflict of interest.

References

Ballesteros, J.A.; Sarradj, E.; Fernández, M.D.; Geyer, T.; Ballesteros, M.J. Noise source identification with Beamforming in the pass-by of a car. Appl. Acoust. 2015, 93, 106–119. [Google Scholar] [CrossRef]
Wang, J.; Ma, W. Deconvolution algorithms of phased microphone arrays for the mapping of acoustic sources in an airframe test. Appl. Acoust. 2020, 164, 107283. [Google Scholar] [CrossRef]
Chiariotti, P.; Martarelli, M.; Castellini, P. Acoustic beamforming for noise source localization – Reviews, methodology and applications. Mech. Syst. Signal. Process. 2019, 120, 422–448. [Google Scholar] [CrossRef]
Williams, E.G.; Maynard, J.D.; Skudrzyk, E. Sound source reconstructions using a microphone array. J. Acoust. Soc. Am. 1963, 68, 340–344. [Google Scholar] [CrossRef]
Maynard, J.D.; Williams, E.G.; Lee, Y. Nearfield acoustic holography: I. Theory of generalized holography and the development of NAH. J. Acoust. Soc. Am. 1985, 78, 1395–1413. [Google Scholar] [CrossRef]
Zan, M.; Xu, Z.; Huang, L.; Zhang, Z. A Sound Source Identification Algorithm Based on Bayesian Compressive Sensing and Equivalent Source Method. Sensors 2020, 20, 865. [Google Scholar] [CrossRef] [PubMed]
Chardon, G.; Daudet, L.; Peillot, A.; Ollivier, F.; Bertin, N.; Gribonval, R. Near-field acoustic holography using sparse regularization and compressive sampling principles. J. Acoust. Soc. Am. 2012, 132, 1521–1534. [Google Scholar] [CrossRef]
Attendu, J.M.; Ross, A. Sparse regularization for reconstructing transient sources with time domain nearfield acoustical holography. J. Acoust. Soc. Am. 2018, 143, 3796–3806. [Google Scholar] [CrossRef] [PubMed]
Fernandez-Grande, E.; Xenaki, A.; Gerstoft, P. A sparse equivalent source method for near-field acoustic holography. J. Acoust. Soc. Am. 2017, 141, 532–542. [Google Scholar] [CrossRef]
Gerstoft, P.; Xenaki, A.; Mecklenbräuker, C.F. Multiple and single snapshot compressive beamforming. J. Acoust. Soc. Am. 2015, 138, 2003–2014. [Google Scholar] [CrossRef]
Hald, J. A comparison of iterative sparse equivalent source methods for near-field acoustical holography. J. Acoust. Soc. Am. 2018, 143, 3758–3769. [Google Scholar] [CrossRef] [PubMed]
Giri, R.; Srikrishnan, T.A.; Rao, B.D.; Zhang, T. Empirical Bayes based relative impulse response estimation. J. Acoust. Soc. Am. 2018, 143, 3922–3933. [Google Scholar] [CrossRef] [PubMed]
Hu, D.; Li, H.; Hu, Y.; Fang, Y. Sound field reconstruction with sparse sampling and the equivalent source method. Mech. Syst. Signal. Process. 2018, 108, 317–325. [Google Scholar] [CrossRef]
Shi, T.; Liu, Y.; Bolton, J.S. Spatially sparse sound source localization in an under-determined system by using a hybrid compressive sensing method. J. Acoust. Soc. Am. 2019, 146, 1219–1229. [Google Scholar] [CrossRef] [PubMed]
Bi, C.; Liu, Y.; Zhang, Y.; Xu, L. Sound field reconstruction using inverse boundary element method and sparse regularization. J. Acoust. Soc. Am. 2019, 145, 3154–3162. [Google Scholar] [CrossRef]
Tan, D.; Chu, Z.; Wu, G. Robust reconstruction of equivalent source method based near-field acoustic holography using an alternative regularization parameter determination approach. J. Acoust. Soc. Am. 2019, 146, EL34–EL38. [Google Scholar] [CrossRef]
Rahimi, Y.; Wang, C.; Dong, H.; Lou, Y. A Scale-Invariant Approach for Sparse Signal Recovery. SIAM J. Sci. Comput. 2019, 41, A3649–A3672. [Google Scholar] [CrossRef]
Pham, M.Q.; Oudompheng, B.; Mars, J.I.; Nicolas, B. A Noise-Robust Method with Smoothed ℓ₁/ℓ₂ Regularization for Sparse Moving-Source Mapping. Signal. Process. 2017, 135, 96–106. [Google Scholar] [CrossRef]
Wang, C.; Yan, M.; Rahimi, Y.; Lou, Y. Accelerated Schemes for the L₁/L₂ Minimization. IEEE Trans. Signal. Process. 2020, 68, 2660–2669. [Google Scholar] [CrossRef]
Beck, A.; Teboulle, M. A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM J. Imaging Sci. 2009, 2, 183–202. [Google Scholar] [CrossRef]
Koopmann, G.H.; Song, L.; Fahnline, J.B. A method for computing acoustic fields based on the principle of wave superposition. J. Acoust. Soc. Am. 1989, 86, 2433–2438. [Google Scholar] [CrossRef]
Jeans, R.; Mathews, I.C. The wave superposition method as a robust technique for computing acoustic fields. J. Acoust. Soc. Am. 1992, 92, 1156–1166. [Google Scholar] [CrossRef]
Sun, Y.; Tan, X.; Li, X.; Lei, L.; Kuang, G. Sparse optimization problem with s-difference regularization. Signal. Process. 2020, 168, 107369. [Google Scholar] [CrossRef]
Qiao, B.; Ao, C.; Mao, Z.; Chen, X. Non-convex sparse regularization for impact force identification. J. Sound. Vib. 2020, 477, 115311. [Google Scholar] [CrossRef]
Candès, E.; Wakin, M.; Boyd, S. Enhancing Sparsity by Reweighted ℓ₁ Minimization. J. Fourier. Anal. App. 2008, 14, 877–905. [Google Scholar] [CrossRef]
Boyd, S.; Parikh, N.; Chu, E.; Peleato, B.; Eckstein, J. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Found. Trends. Mach. Learn. 2011, 3, 1–122. [Google Scholar] [CrossRef]
Yu, L.; Antoni, J.; Zhao, H.; Guo, Q.; Wang, R.; Jiang, W. The acoustic inverse problem in the framework of alternating direction method of multipliers. Mech. Syst. Signal. Process. 2021, 149, 107220. [Google Scholar] [CrossRef]
Hald, J. Fast wideband acoustical holography. J. Acoust. Soc. Am. 2016, 139, 1508–1517. [Google Scholar] [CrossRef]
Chu, Z.; Yang, Y.; He, Y. Deconvolution for three-dimensional acoustic source identification based on spherical harmonics beamforming. J. Sound. Vib. 2015, 344, 484–502. [Google Scholar] [CrossRef]
Yin, S.; Chu, Z.; Zhang, Y.; Liu, Y. Adaptive reweighting homotopy algorithm based compressive spherical beamforming with spherical microphone arrays. J. Acoust. Soc. Am. 2020, 147, 480–489. [Google Scholar] [CrossRef]
Yang, Y.; Chu, Z.; Ping, G.; Xu, Z. Resolution enhancement of two-dimensional grid-free compressive beamforming. J. Acoust. Soc. Am. 2018, 143, 3860–3872. [Google Scholar] [CrossRef] [PubMed]

Figure 1. (a) The diagrammatic sketch of simulation set-up; (b) The geometry of microphone array.

Figure 2. Calculated sound source maps, with signal-to-noise ratio (SNR) 30 dB, at the frequencies of 800 Hz, 1600 Hz, and 2400 Hz for single sound source respectively: the equivalent source plane at

10^{- 3}

m, the reconstruction plane at

10^{- 2}

m, and the microphone array at 0.1 m distance from the sound source plane.

Figure 2. Calculated sound source maps, with signal-to-noise ratio (SNR) 30 dB, at the frequencies of 800 Hz, 1600 Hz, and 2400 Hz for single sound source respectively: the equivalent source plane at

10^{- 3}

m, the reconstruction plane at

10^{- 2}

m, and the microphone array at 0.1 m distance from the sound source plane.

Figure 3. Calculated sound source maps, with SNR 30 dB, at the frequencies of 800 Hz, 1600 Hz, and 2400 Hz for coherent sound sources respectively: The equivalent source plane at

10^{- 3}

m, the reconstruction plane at

10^{- 2}

m, and the microphone array at 0.1 m distance from the sound source plane.

Figure 3. Calculated sound source maps, with SNR 30 dB, at the frequencies of 800 Hz, 1600 Hz, and 2400 Hz for coherent sound sources respectively: The equivalent source plane at

10^{- 3}

m, the reconstruction plane at

10^{- 2}

m, and the microphone array at 0.1 m distance from the sound source plane.

Figure 4. The cross-section of the reconstruction acoustic pressure plane: (a–c) are in the single sound source case, sound source at (0, 0, 0) m; (d–f) are in the coherent sound sources case, sound sources at (0.2, 0, 0) m and (−0.2, 0, 0) m: The equivalent source plane at

10^{- 3}

m, the reconstruction plane at

10^{- 2}

m, and the microphone array at 0.1 m distance from the sound source plane.

Figure 4. The cross-section of the reconstruction acoustic pressure plane: (a–c) are in the single sound source case, sound source at (0, 0, 0) m; (d–f) are in the coherent sound sources case, sound sources at (0.2, 0, 0) m and (−0.2, 0, 0) m: The equivalent source plane at

10^{- 3}

m, the reconstruction plane at

10^{- 2}

m, and the microphone array at 0.1 m distance from the sound source plane.

Figure 5. The error calculated by the Tikhonov regularization method, iterative zoom-out-thresholding algorithm (IZOTA), fast iterative shrinkage-thresholding algorithm (FISTA), and

ℓ_{1} / ℓ_{2}

, the equivalent source plane at

10^{- 3}

m, the reconstruction plane at

10^{- 2}

m, and the microphone array at 0.1 m distance from the sound source plane, with SNR 30 dB: (a) the single sound source case, sound source at (0, 0, 0) m; (b) the coherent sound sources case, sound sources at (0.2, 0, 0) m and (−0.2, 0, 0) m respectively.

Figure 5. The error calculated by the Tikhonov regularization method, iterative zoom-out-thresholding algorithm (IZOTA), fast iterative shrinkage-thresholding algorithm (FISTA), and

ℓ_{1} / ℓ_{2}

, the equivalent source plane at

10^{- 3}

m, the reconstruction plane at

10^{- 2}

m, and the microphone array at 0.1 m distance from the sound source plane, with SNR 30 dB: (a) the single sound source case, sound source at (0, 0, 0) m; (b) the coherent sound sources case, sound sources at (0.2, 0, 0) m and (−0.2, 0, 0) m respectively.

Figure 6. The quantitative error of methods with holographic distance 0.05∼0.5 m (left column), with SNR 30 dB; The quantitative error of methods with SNR 10∼60 dB (right column), with holographic distance 0.1 m.

Figure 7. The average quantitative errors (AQE) and the average reconstruction errors (ARE) for the 1000-trial Monte Carlo analysis by the Tikhonov regularization method, IZOTA, FISTA, and

ℓ_{1} / ℓ_{2}

, with the position of the sound source random and SNR 30 dB.

Figure 7. The average quantitative errors (AQE) and the average reconstruction errors (ARE) for the 1000-trial Monte Carlo analysis by the Tikhonov regularization method, IZOTA, FISTA, and

ℓ_{1} / ℓ_{2}

, with the position of the sound source random and SNR 30 dB.

Figure 8. The testing set-up: (a) The single sound source case, sound source at (0, 0, 0) m; (b) The coherent sound sources case, sound sources at (0.2, 0, 0) m and (−0.2, 0, 0) m. The equivalent source plane at

10^{- 3}

m (z =

10^{- 3}

m), the reconstruction plane at

10^{- 2}

m (z =

10^{- 2}

m), and the microphone array at 0.1 m (z =

10^{- 1}

m) distance from the sound source plane.

Figure 8. The testing set-up: (a) The single sound source case, sound source at (0, 0, 0) m; (b) The coherent sound sources case, sound sources at (0.2, 0, 0) m and (−0.2, 0, 0) m. The equivalent source plane at

10^{- 3}

m (z =

10^{- 3}

m), the reconstruction plane at

10^{- 2}

m (z =

10^{- 2}

m), and the microphone array at 0.1 m (z =

10^{- 1}

m) distance from the sound source plane.

Figure 9. Experimental sound source maps at the frequencies of 800 Hz, 1600 Hz, and 2400 Hz respectively for single sound source case, the microphone array at 0.1 m distance from the sound source plane.

Figure 10. Experimental sound source maps at the frequencies of 800 Hz, 1600 Hz, and 2400 Hz respectively for coherent sound sources case, the microphone array at 0.1 m distance from the sound source plane.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Huang, L.; Xu, Z.; Zhang, Z.; He, Y. A Ratio Model of L₁/L₂ Norm for Sound Source Identification. Sensors 2020, 20, 5290. https://doi.org/10.3390/s20185290

AMA Style

Huang L, Xu Z, Zhang Z, He Y. A Ratio Model of L₁/L₂ Norm for Sound Source Identification. Sensors. 2020; 20(18):5290. https://doi.org/10.3390/s20185290

Chicago/Turabian Style

Huang, Linsen, Zhongming Xu, Zhifei Zhang, and Yansong He. 2020. "A Ratio Model of L₁/L₂ Norm for Sound Source Identification" Sensors 20, no. 18: 5290. https://doi.org/10.3390/s20185290

APA Style

Huang, L., Xu, Z., Zhang, Z., & He, Y. (2020). A Ratio Model of L₁/L₂ Norm for Sound Source Identification. Sensors, 20(18), 5290. https://doi.org/10.3390/s20185290

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

A Ratio Model of L₁/L₂ Norm for Sound Source Identification

Abstract

1. Introduction

2. Sound Source Reconstruction Model

3. The Introduction of Ratio Model of $L_{1} / L_{2}$ Norm

4. Simulations Analysis

4.1. Performance Assessments

4.2. Error Evaluations

5. Experimental Applications

6. Conclusions

Author Contributions

Funding

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Article Menu

A Ratio Model of L1/L2 Norm for Sound Source Identification

Abstract

1. Introduction

2. Sound Source Reconstruction Model

3. The Introduction of Ratio Model of L 1 / L 2 Norm

4. Simulations Analysis

4.1. Performance Assessments

4.2. Error Evaluations

5. Experimental Applications

6. Conclusions

Author Contributions

Funding

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

A Ratio Model of L₁/L₂ Norm for Sound Source Identification

3. The Introduction of Ratio Model of $L_{1} / L_{2}$ Norm