Achieving 3D Beamforming by Non-Synchronous Microphone Array Measurements

Beamforming is an essential method in acoustic imaging and reconstruction, and it has been widely used in sound source localization and noise reduction. The beamforming algorithm can be described as all microphones in a plane simultaneously recording the source signal; the source position is then localized by maximizing the output of the beamformer. Evidence has shown that the accuracy of sound source localization in a 2D plane can be improved by non-synchronous measurements, in which the microphone array is moved between acquisitions. In this paper, non-synchronous measurements are applied to 3D beamforming: the measurement array envelops the 3D sound source space to improve the resolution in 3D. The entire radiating object is covered better by a virtualized large or high-density microphone array, and the range of beamforming frequencies is also expanded. The 3D imaging results are obtained in different ways: conventional beamforming with a planar array, non-synchronous measurements with orthogonal moving arrays, and non-synchronous measurements with non-orthogonal moving arrays. The imaging results of the non-synchronous measurements are compared with those of the synchronous measurements and analyzed in detail. The number of microphones required for measurement is reduced compared with the synchronous measurement, and the non-synchronous measurements with non-orthogonal moving arrays also achieve a good resolution in 3D source localization. The proposed approach is validated with a simulation and an experiment.


Introduction
Sound source localization is in high demand and of exceptional value in applications such as automobiles [1], submarines [2], aircraft [3,4], etc. Several methods have been proposed, such as beamforming and inverse methods [5,6], among which beamforming has evolved into one of the most important methods of sound source localization. The basic idea of beamforming is that, by weighting each array element's output, the expected maximum output power of the signal is steered to the position of the sound source. Different representations of the basis function also lead to other classical methods, such as (1) near-field acoustical holography (NAH) [7][8][9]; (2) the inverse boundary element method (IBEM) [10][11][12]; (3) the equivalent source methods (ESM) [13][14][15]; (4) statistically optimal near-field acoustical holography (SONAH) [16,17]; and (5) the Helmholtz equation-least-squares method (HELS) [18]. The beamformer has a good resolution when the plane of the beamformer is parallel to the microphone plane, but the spatial resolution of the planar beamformer deteriorates sharply when the plane is perpendicular to the microphone array. Most beamforming research is conducted with planar arrays. Therefore, an interesting question arises: Is it possible to effectively improve the array's spatial resolution by moving the planar microphone array in space?
In this paper, the performance of 3D beamforming with non-synchronous measurements is studied systematically. The prototype array configuration is optimized by moving a planar array in space when the sound sources are distributed in 3D space. Compared to conventional 3D beamforming, the number of microphones required for the measurement is reduced. The spherical harmonics are selected as the spatial basis instead of the Fourier basis when the beamforming is extended to 3D. The algorithm flowchart of this paper is shown in Figure 1. It is noted that conventional beamforming is just one convenient choice here, and all beamformers based on the cross-spectral matrix (e.g., MVDR and MUSIC [35,36]) can be applied in the proposed methods. The current paper is organized as follows. In Section 2, the non-synchronous measurements are developed in the context of 3D beamforming. The simulation results of a single synchronous measurement (i.e., one planar array) and of non-synchronous measurements (i.e., a non-synchronously moving array) are compared in Section 3. In Section 4, the performance of the non-synchronous measurements and the synchronous measurement is discussed. In Section 5, an experiment is conducted to validate the performance of the non-synchronous measurements in 3D imaging. The main conclusions of this study are summarized in Section 6.
Sensors 2020, 20, 7308
Figure 1. Algorithm flowchart of this paper. The data-missing spectral matrix is first obtained by non-synchronous measurements. Then the data-missing spectral matrix is filled with data to obtain the complete spectral matrix. Finally, acoustic imaging is achieved by conventional beamforming.


Conventional Beamforming
Denoting the sound pressure measured by the microphone at position r as p(r), p(r) is the sum of the particular pressure p_P(r) and the random pressure p_N(r). p_P(r) is the sound pressure transmitted from the sound sources s at r' to the microphone at r, which can be obtained from the Green's function G(r, r'). p_N(r) is the measurement noise, which is usually assumed to follow a Gaussian distribution [37,38]. For a given frequency ω, p(r) can be expressed as

$$p(r, \omega) = p_P(r, \omega) + p_N(\omega), \qquad (1)$$

where

$$p_P(r, \omega) = \int_{r' \in \Gamma} G(r, \omega | r')\, s(r', \omega)\, dr', \qquad G(r, \omega | r') = \frac{e^{jk\|r - r'\|_2}}{4\pi \|r - r'\|_2}. \qquad (2)$$

G(r, ω|r') is the free-field Green's function that describes the acoustic propagation between the sources at r' and the microphones at r. k = ω/c is the wavenumber, which represents the number of radians per unit distance, and c is the wave velocity. The ℓ₂-norm is denoted by ‖·‖₂. Substituting Equation (2) into Equation (1) and re-writing the measured pressure signal in matrix form, we get

$$p(r, \omega) = G(r, \omega | r')\, s(r', \omega) + p_N(\omega). \qquad (3)$$

Denoting the number of microphones as M and the number of sources as S, the sizes are p(r, ω) ∈ C^{M×1}, G(r, ω|r') ∈ C^{M×S} with G = (g_1, ..., g_s, ..., g_S), s(r', ω) ∈ C^{S×1}, and p_N(ω) ∈ C^{M×1}. The cross-spectral matrix C(ω) can be obtained as

$$C(\omega) = E\{p\, p^H\} \approx \frac{1}{I} \sum_{i=1}^{I} p_i\, p_i^H, \qquad (4)$$

where E{·} represents the mathematical expectation, (·)^H represents the Hermitian transpose, I is the number of pressure snapshots, and the size is C(ω) ∈ C^{M×M}. Conventional beamforming is designed to locate sound sources by compensating for the time delay and amplitude attenuation between the microphones and the virtual source. The outcome of the beamformer Q_BF is calculated in terms of the cross-spectral matrix as

$$Q_{BF}(r'_n) = h_n^H\, C(\omega)\, h_n. \qquad (5)$$

The weights h_n are designed to be independent of the array data or data statistics, and they compensate for the time delays and amplitude attenuation of the forward propagation:

$$h_n = \frac{g_n}{\|g_n\|_2^2}, \qquad n = 1, \ldots, S. \qquad (6)$$

The beamformer searches for the source location by steering the microphone array. When the virtual source location coincides with the real source position, the outcome of the conventional beamforming reaches its maximum. A more detailed description of conventional beamforming can be found in Chu's work [35].
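As a concrete illustration, the delay-and-sum steps above can be sketched in a few lines of NumPy. The microphone layout, snapshot count, and scan points below are arbitrary placeholders (not the paper's configuration); only Equations (2)-(6) are taken from the text:

```python
import numpy as np

def steering_vectors(mics, grid, k):
    """Columns g_n of the free-field Green's function between scan points and mics.
    mics: (M, 3) microphone positions, grid: (N, 3) scan points, k: wavenumber."""
    d = np.linalg.norm(mics[:, None, :] - grid[None, :, :], axis=2)  # (M, N) distances
    return np.exp(1j * k * d) / (4.0 * np.pi * d)

def conventional_beamformer(C, G):
    """Q_BF(n) = h_n^H C h_n with weights h_n = g_n / ||g_n||_2^2."""
    H = G / np.linalg.norm(G, axis=0) ** 2       # one weight vector per scan point
    return np.real(np.einsum('mn,ml,ln->n', H.conj(), C, H))

# toy scene: 32 mics in the z = -0.5 m plane, one white-noise source at grid[0]
rng = np.random.default_rng(0)
mics = rng.uniform(-0.5, 0.5, (32, 3)); mics[:, 2] = -0.5
grid = np.array([[0.0, 0.0, 0.0], [0.2, 0.0, 0.0], [0.0, 0.2, 0.0]])
k = 2.0 * np.pi * 4000.0 / 343.0                 # wavenumber at 4 kHz, c = 343 m/s
s = rng.standard_normal(64) + 1j * rng.standard_normal(64)   # source snapshots
P = steering_vectors(mics, grid[:1], k) @ s[None, :]         # (M, I) pressures
C = P @ P.conj().T / s.size                      # cross-spectral matrix, cf. Eq. (4)
Q = conventional_beamformer(C, steering_vectors(mics, grid, k))
```

The beamformer output `Q` peaks at the scan point that coincides with the true source, exactly as described above.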

Non-Synchronous Measurements Theory
Unlike synchronous measurement, in which all microphones acquire data simultaneously, non-synchronous measurements acquire data by moving the microphone array under the assumption of a stable sound field. There is evidence that non-synchronous measurements are effective in 2D beamforming. In non-synchronous measurements, a denser or larger virtual array can be obtained by moving the microphone array in a 2D space, as shown in Figure 2. With denser virtual arrays, the working frequency range of the beamforming algorithms is expanded; with larger virtual arrays, the entire radiating object is better covered [39].

Figure 2. Schematic diagram of the non-synchronous measurements in a 2D space: (a) measurement with a planar microphone array; (b) non-synchronous measurements' microphone array for a denser array; (c) non-synchronous measurements' microphone array for a bigger array. The circle "○" represents a microphone, and the same color represents the microphones in the same test.
The sound source localization is based on the beamforming method for both the synchronous and non-synchronous measurements. The cross-spectral matrix C(ω) is the main difference between the synchronous and the non-synchronous measurements. For synchronous measurements, C(ω) is an M × M full matrix, and the rank of C(ω) is equal to the number of uncorrelated sources S. For the non-synchronous measurements, a complete cross-spectral matrix can no longer be obtained directly, because the microphone array is moved between measurements. The matrix Ĉ^P(ω), obtained by placing the cross-spectral matrix of each measurement C^(i)(ω) on the diagonal blocks of a larger matrix, loses the information of the non-diagonal block elements. The superscript P refers to the total number of the non-synchronous measurements, and the superscript i refers to the i-th measurement. Ĉ^P(ω) is an MP × MP matrix. When the elements at the non-diagonal positions of Ĉ^P(ω) are set to zero in the absence of references, the rank of Ĉ^P(ω), r(Ĉ^P(ω)), is equal to the product of the number of sources and the number of moves, SP. The rank of the cross-spectral matrix is considered to be the number of sources, so the non-synchronous measurements method locates the source positions by supplementing the missing data in Ĉ^P(ω). The completed cross-spectral matrix C̃^P(ω) has the same rank as C^(i)(ω) in one measurement. An example clarifies the spectral matrix issues in the non-synchronous measurements: assume that the microphone array includes 25 microphones and the scanning plane contains three sound sources. Figure 3a shows the cross-spectral matrix C^(1)(ω) of one measurement. It is a 25 × 25 full matrix, and the rank of C^(i)(ω), r(C^(i)), is equal to 3.
If the prototype array moves four times, Ĉ⁴(ω) (as shown in Figure 3b) is obtained by arranging the C^(i)(ω), i = 1, 2, 3, 4 of the single measurements in the diagonal block positions. Ĉ⁴(ω) is a (25 × 4) × (25 × 4) matrix, with data missing at the non-diagonal positions. The rank of Ĉ⁴(ω), r(Ĉ⁴), is equal to 12 (i.e., 3 × 4). Figure 3c shows the target spectral matrix, C̃⁴(ω). The missing data are supplemented by the non-synchronous measurements, and the rank of C̃⁴(ω), r(C̃⁴), is restored to 3.

Figure 3. (a) C^(1)(ω) ∈ C^{25×25} in one single measurement; r(C^(1)) is the rank of C^(1)(ω). (b) Ĉ⁴(ω) ∈ C^{100×100} by arranging C^(i)(ω), i = 1, 2, 3, 4 in the diagonal block positions; r(Ĉ⁴) is the rank of Ĉ⁴(ω). (c) C̃⁴(ω) ∈ C^{100×100} with the non-synchronous measurements by implementing the missing data at the non-diagonal blocks; r(C̃⁴) is the rank of C̃⁴(ω).
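The rank behaviour in Figure 3 can be checked numerically. The sketch below uses random complex steering vectors as stand-ins for the true Green's-function columns (the geometry is immaterial to the rank argument); the stable-field assumption is modelled by reusing the same source snapshots for every array position:

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(1)
M, P_moves, S, I = 25, 4, 3, 400     # mics per move, moves, sources, snapshots

# one (random, hypothetical) steering matrix per array position
G = [rng.standard_normal((M, S)) + 1j * rng.standard_normal((M, S))
     for _ in range(P_moves)]
s = rng.standard_normal((S, I)) + 1j * rng.standard_normal((S, I))  # stationary sources

C_i = [(Gi @ s) @ (Gi @ s).conj().T / I for Gi in G]   # per-move 25x25 CSMs
C_hat = block_diag(*C_i)             # 100x100, off-diagonal blocks unknown (zeroed)

G_full = np.vstack(G)                # the virtual 100-mic synchronous array
C_full = (G_full @ s) @ (G_full @ s).conj().T / I      # target matrix of Fig. 3c

rank = lambda A: np.linalg.matrix_rank(A, tol=1e-8)
print(rank(C_i[0]), rank(C_hat), rank(C_full))   # 3, 12, 3
```

The printed ranks reproduce the figure: each single-measurement CSM has rank S = 3, the block-diagonal matrix with zeroed off-diagonal blocks has rank SP = 12, and the completed target matrix returns to rank 3.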
The core problem in the non-synchronous measurements is the completion of the data-missing spectral matrix. The process of finding a full cross-spectral matrix can be described as

$$\text{find}\ \tilde{C} \in \mathbb{C}^{MP \times MP}\ \ \text{s.t.}\ \ r(\tilde{C}) = S,\ \ \|\mathcal{A}(\tilde{C}) - \hat{C}\|_F \leq \varepsilon_1,\ \ \|\Psi \tilde{C} \Psi^H - \tilde{C}\|_F \leq \varepsilon_2,\ \ \tilde{C} = \tilde{C}^H \succeq 0, \qquad (7)$$

where the last condition, that C̃ is Hermitian and positive semidefinite, holds for any cross-spectral matrix. The constraints are explained as follows:
1. The 1st constraint: r(C̃) = S ensures that the rank of the reconstructed cross-spectral matrix is still equal to the number of the sources S.
2. The 2nd constraint: A(·) denotes the sampling operator that keeps the elements in the diagonal blocks of a matrix, with A(C̃) ∈ C^{MP×MP} of the same size as C̃ ∈ C^{MP×MP}. ‖A(C̃) − Ĉ‖_F ≤ ε₁ ensures that the difference between A(C̃) and Ĉ in the Frobenius norm is less than a given tolerance ε₁.
3. The 3rd constraint: Ψ is a projection matrix. ΨC̃Ψ^H = E{ΨPP^HΨ^H} is the cross-spectral matrix of ΨP, where the projected matrix ΨP can be the smoothed pressure [40] of the non-synchronously measured pressure P. This constraint is added to ensure the acoustic field's spatial continuity: the difference in the cross-spectral matrix between ΨP and P is required to be smaller than a given tolerance ε₂. A detailed discussion of this constraint is given in Section 2.2.

As seen from the 1st constraint, the eigenvalue vector of the cross-spectral matrix should be S-sparse: the number of non-zero eigenvalues is S, equal to the number of sound sources. The rank estimation of a matrix is still a difficult problem. An alternative model has been proposed to solve it: the cross-spectral matrix is considered to be a full-rank matrix with a few dominant eigenvalues. This model treats the eigenvalue spectrum as "weakly sparse", in which the sorted eigenvalues of the spectral matrix decay rapidly according to a power law, rather than "completely S-sparse". Equation (7) can then be realized in another form of constrained optimization:

$$\min_{\tilde{C}} \|\tilde{C}\|_*\ \ \text{s.t.}\ \ \|\mathcal{A}(\tilde{C}) - \hat{C}\|_F \leq \varepsilon_1,\ \ \|\Psi \tilde{C} \Psi^H - \tilde{C}\|_F \leq \varepsilon_2, \qquad (8)$$

where ‖·‖_* denotes the nuclear norm of a matrix, defined for a Hermitian positive-semidefinite matrix as the sum of its eigenvalues, ‖C̃‖_* = Σᵢ λᵢ. The objective min‖C̃‖_* seeks the C̃ with the minimum nuclear norm.
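One simple (if not the paper's exact) way to attack problem (7) is alternating projections: cycle between the spatial-continuity projection, a rank-S eigenvalue truncation, and resetting the measured diagonal-block entries. The toy below is a sketch under strong assumptions — a line of 12 virtual positions split into two non-synchronous blocks of 6, a single smooth source vector, and a small Fourier basis for Ψ; the paper itself solves the nuclear-norm formulation (8). Without Ψ the relative phase between the blocks would be unrecoverable, which is exactly why the continuity constraint is needed:

```python
import numpy as np

def complete_csm(C_hat, mask, Psi, S, n_iter=200):
    """Alternating-projection sketch for problem (7): cycle through the
    spatial-continuity projection, a rank-S eigenvalue truncation (which
    also keeps the iterate Hermitian PSD), and data consistency A(C) = C_hat."""
    C = C_hat.copy()
    for _ in range(n_iter):
        C = Psi @ C @ Psi.conj().T                # spatial continuity
        w, V = np.linalg.eigh((C + C.conj().T) / 2)
        w[:-S] = 0.0                              # keep the S largest eigenvalues
        C = (V * w) @ V.conj().T                  # nearest rank-S PSD matrix
        C[mask] = C_hat[mask]                     # reset the measured entries
    return C

# toy scene: 12 virtual positions measured as two non-synchronous blocks of 6
x = np.arange(12)
g = np.exp(2j * np.pi * 2 * x / 12)[:, None]      # one smooth source vector, S = 1
C_true = g @ g.conj().T
mask = np.zeros((12, 12), dtype=bool)
mask[:6, :6] = mask[6:, 6:] = True                # only the diagonal blocks are known
Phi = np.exp(2j * np.pi * np.outer(x, [1, 2, 3]) / 12)  # small smooth (Fourier) basis
Psi = Phi @ np.linalg.pinv(Phi)                   # orthogonal projector Psi = Phi Phi^+
C_rec = complete_csm(np.where(mask, C_true, 0), mask, Psi, S=1)
err = np.linalg.norm(C_rec - C_true) / np.linalg.norm(C_true)
```

In this toy the missing off-diagonal blocks are recovered almost exactly, because the true source vector lies in the span of the smooth basis.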
2D beamforming generally assumes that the distance between the source plane and the measurement plane is known, and the non-synchronous measurements move the microphone array within one plane, making the virtual microphone array denser or larger (see Figure 2). 3D beamforming assumes that the sources are distributed in a 3D space, and the distribution of the entire 3D sound source space is obtained by iteratively scanning the measured distance. In most studies, the implementation of 3D beamforming still uses planar array measurements, which leads to low spatial resolution. In this paper, the non-synchronous measurements are used to measure the 3D space: the measurement array envelops the 3D sound source space to improve the resolution in 3D. As shown in Figure 4, the original array is located in the x-y plane, and the array can then be moved sequentially to the y-z plane and the x-z plane, such that the scattering object is surrounded by the virtual microphone array in three planes.
This study aims to extend the application of non-synchronous measurements to 3D beamforming.

Figure 4. Non-synchronous moving of the microphone array in space: (b) from the x-y plane to the y-z plane (red circles); (c) from the x-y plane to the x-z plane (blue circles). The circle "○" represents a microphone, and the same color represents the microphones in the same test.

Spatial Basis and Spatial Continuity of the Acoustic Field
The sound pressure signal captured by the microphones, P, can be expressed using a set of spatial bases as

$$P = \Phi \vartheta = \sum_i \varphi_i(r)\, \vartheta_i, \qquad (9)$$

where Φ is the spatial basis vector and ϑ is the coefficient vector; φ_i(r) and ϑ_i are the i-th basis and the corresponding coefficient, respectively. The Fourier basis Φ(x, y) = e^{i(k_x x + k_y y)} is chosen for 2D beamforming, where x and y are the microphones' coordinates, and k_x and k_y are the wavenumbers along the x and y directions. For 3D beamforming, a 3D spatial basis should be considered. In this paper, spherical harmonics are chosen (as shown in Figure 5); their complete orthonormal form in polar coordinates (r, θ, ϕ), Y_l^m(θ, ϕ) of order l and degree m, can be written as

$$Y_l^m(\theta, \phi) = \sqrt{\frac{2l+1}{4\pi} \frac{(l-m)!}{(l+m)!}}\, P_l^m(\cos\theta)\, e^{im\phi}, \qquad (10)$$

where l is a non-negative integer and |m| ≤ l. P_l^m(·) is the associated Legendre function:

$$P_l^m(x) = (-1)^m (1 - x^2)^{m/2} \frac{d^m}{dx^m} P_l(x), \qquad (11)$$

and P_l(·) is the Legendre polynomial of degree l:

$$P_l(x) = \frac{1}{2^l\, l!} \frac{d^l}{dx^l} (x^2 - 1)^l. \qquad (12)$$

Figure 5. Spherical harmonics. The real spherical harmonics Y_l^m(θ, ϕ) for l = 0, 1, 2, 3 (top to bottom) and m = 0, 1, ..., l (left to right). The value of the spherical harmonic function is represented by color, in which the maximum value is characterized by red and the minimum value by blue.
The negative-order spherical harmonics Y_l^{−m}(θ, ϕ) can be obtained by rotating the positive harmonics by 90°/m around the z-axis. The pressure signal at one microphone, P(r, θ, ϕ), can be re-written in terms of spherical harmonics as

$$P(r, \theta, \phi) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} \vartheta_l^m(r)\, Y_l^m(\theta, \phi), \qquad (13)$$

where ϑ_l^m(r) is the coefficient. Assuming the acoustic field's spatial continuity, the reconstructed acoustic field should be smooth, and the sound pressures measured by two adjacent microphones should have comparable levels. Such constraints should be included in the optimization model. Since the best approximation of the vector P in the subspace R(Φ) is its projection vector Proj_{R(Φ)} P, we need the orthogonal projection matrix onto the subspace R(Φ). According to Equation (9), the coefficients ϑ can be obtained from the measured pressure P as

$$\vartheta = \Phi^{\dagger} P, \qquad (14)$$

where † denotes the pseudo-inverse of a matrix, since Φ is not generally invertible. The smoothed pressure P̃ can be obtained by

$$\tilde{P} = \Psi P = \Phi \Phi^{\dagger} P, \qquad (15)$$

where Ψ = ΦΦ† is the orthogonal projection matrix and P̃ = Proj_{R(Φ)} P = ΦΦ†P is the projection of P onto the space R(Φ). The cross-spectral matrix of the smoothed pressure, C̃, is given by

$$\tilde{C} = E\{\tilde{P} \tilde{P}^H\} = \Psi C \Psi^H. \qquad (16)$$

If the smoothed pressure P̃ is already in the space R(Φ), the projection of P̃ onto R(Φ) should be itself. The constraint ‖ΨC̃Ψ^H − C̃‖_F ≤ ε₂ is therefore included in the optimization model, where ΨC̃Ψ^H denotes the re-projection result and C̃ denotes the smoothed result. Ψ = ΦΦ† encodes the microphone position information into the matrix C̃.
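For instance, the projector Ψ = ΦΦ† can be built directly from SciPy's spherical harmonics; note that `scipy.special.sph_harm` takes the azimuth first and the polar angle second, the reverse of the (θ, ϕ) convention above. The 40 microphone directions below are random placeholders:

```python
import numpy as np
from scipy.special import sph_harm

def sh_projector(theta, phi, l_max):
    """Psi = Phi Phi^+ for a spherical-harmonic basis up to order l_max.
    theta: polar angles, phi: azimuths of the microphone directions."""
    cols = [sph_harm(m, l, phi, theta)     # SciPy argument order: (m, l, azimuth, polar)
            for l in range(l_max + 1) for m in range(-l, l + 1)]
    Phi = np.stack(cols, axis=1)           # (n_mics, (l_max + 1)**2)
    return Phi @ np.linalg.pinv(Phi)

# hypothetical example: 40 random microphone directions, basis up to l_max = 3
rng = np.random.default_rng(3)
theta = np.arccos(rng.uniform(-1.0, 1.0, 40))   # polar angles, uniform on the sphere
phi = rng.uniform(0.0, 2.0 * np.pi, 40)         # azimuths
Psi = sh_projector(theta, phi, l_max=3)
p_smooth = sph_harm(1, 2, phi, theta)           # a field already in the basis
```

Since `p_smooth` lies in R(Φ), applying Ψ leaves it unchanged, which is the defining property of the orthogonal projector used in the continuity constraint.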

Simulation Results
For conventional 2D beamforming, the plane where the sound sources are located is discretized into N grids with known positions. For 3D acoustic imaging, the observation zone is extended from one plane to a rectangular box. In the current numerical experiments, the rectangular box to be scanned is centered at the origin, and its size is 0.4 m × 0.4 m × 0.4 m. The scanned box is discretized uniformly by a 41 × 41 × 41 grid with a spacing of 0.01 m. Six point sources with equal unit magnitudes are located at the vertices of an imaginary cube centered at the origin with a side length of 0.2 m. The signal of each source is generated by random white noise. The prototype planar array, with 56 microphones distributed along an Archimedean spiral, is initially located in the z = −0.5 m plane, with the center of the planar array at (0, 0, −0.5) (m). The beamforming frequency is set to 4000 Hz, and the signal-to-noise ratio (SNR) is set to 20 dB. To illustrate the advantages of 3D beamforming with non-synchronous measurements, the results of 3D imaging by conventional beamforming and by the non-synchronous measurements are compared in the following sections. The results of 3D imaging by conventional beamforming with the original planar array are given in Section 3.1. As a comparison, the results of 3D imaging by the non-synchronous measurements with orthogonal and non-orthogonal moving arrays are shown in Sections 3.2 and 3.3, respectively. Orthogonal moving arrays mean that the initial microphone array and the moved microphone array are perpendicular to each other; non-orthogonal moving arrays mean that there is a certain angle between the microphone array before and after moving. Figure 6a shows the microphone array's geometry and the location of the sound sources in the numerical simulation.
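The scan grid and an Archimedean-spiral prototype of this kind can be generated as below; the spiral's turn count and aperture are placeholder values, since the text does not specify them:

```python
import numpy as np

def spiral_array(n_mics=56, turns=3.0, r_max=0.35, z=-0.5):
    """Archimedean-spiral planar array in the given z-plane. The 56-mic count
    matches the paper; turns and r_max are hypothetical spiral constants."""
    t = np.linspace(0.0, 1.0, n_mics)
    r, a = r_max * t, 2.0 * np.pi * turns * t    # radius grows linearly with angle
    return np.column_stack([r * np.cos(a), r * np.sin(a), np.full(n_mics, z)])

def scan_grid(half=0.2, step=0.01):
    """41 x 41 x 41 grid over the 0.4 m cube centered at the origin."""
    ax = np.arange(-half, half + step / 2, step)
    X, Y, Z = np.meshgrid(ax, ax, ax, indexing='ij')
    return np.column_stack([X.ravel(), Y.ravel(), Z.ravel()])

mics = spiral_array()     # (56, 3)
grid = scan_grid()        # (68921, 3), i.e. 41**3 scan points
```

Each scan point then gets one steering vector, so the 3D observation zone costs 41³ = 68,921 beamformer evaluations per frequency.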
The microphone array is set in the z = −0.5 m plane, and the center of the array is set to (x = 0 m, y = 0 m, z = −0.5 m). The scanned box is shown as partially transparent to allow visualization of the sound sources. Figure 6b shows the source power maps in the observation zone, which have been normalized by the maximum value. Six slices are presented in this figure to give a better view of the sound sources' locations. The simulated sound sources are located at the intersections of these slices.

3D Imaging by the Conventional Beamforming with a Planar Array
Figure 6c-e respectively show the source power maps at three typical slices (x = 0.1 m, y = 0.1 m, z = −0.1 m). The positions of the simulated sound sources are marked with "+" on the figures. Figure 6e shows, on the z = −0.1 m slice parallel to the microphone array, that conventional beamforming locates the sound sources well and has a good spatial resolution in both the x and y directions. As shown in Figure 6c,d, on the x = 0.1 m and y = 0.1 m slices perpendicular to the microphone array, it is difficult to distinguish the sound sources' locations by conventional beamforming. The main lobes of these sound sources merge along the perpendicular direction, which leads to poor spatial resolution, so the sound sources' positions cannot be accurately located.


3D Imaging by the Non-Synchronous Measurements with Orthogonal Moving Arrays
The orthogonal moving arrays mean that the moved microphone array is perpendicular to the plane of the previously tested array. Figure 7a shows the microphone array configuration when the prototype array is moved once. The initial microphone array is located in the z = −0.5 m plane, consistent with the array position in Section 3.1, and the moved microphone array is located in the y = −0.5 m plane, with its geometric center at (x = 0 m, y = −0.5 m, z = 0 m). The results in the 3D zone and on typical slices are shown in Figure 7b-e. Comparing the results in Figures 6d and 7d, orthogonally moving the microphone array once improves the resolution in the z direction of the x-z plane of the initial prototype array. The positions of the sound sources in the z direction are not well distinguished in Figure 6d due to the merging of the main lobes, while the two sound source positions in the z direction are located accurately in Figure 7d. Comparing the results in Figures 6c and 7c, the y-z plane's resolution is slightly improved when the microphone array is moved orthogonally once, but the resolution in the z direction is still not sufficient to locate the positions of the sound sources.
In this simulation case, the microphone array moves orthogonally twice; it moves vertically again based on the simulation case in Figure 7. In Figure 8a, the "third" microphone array is located in the x = 0.5 m plane, with its center at (x = 0.5 m, y = 0 m, z = 0 m). The non-synchronous measurements using three orthogonal microphone array positions achieve good spatial resolution in the x, y, and z directions. Figures 6-8 show that the non-synchronous measurement by orthogonally moving the microphone array can overcome the lack of spatial resolution in the normal direction of a planar array. A planar microphone array can only guarantee the spatial resolution in planes parallel to the array; when the microphone array is moved orthogonally and continuously twice, a high spatial resolution can be obtained in all directions during 3D acoustic imaging.
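The three orthogonal placements used here can be written as simple axis permutations of the prototype array. This is a sketch: it assumes the prototype lies in the z = −0.5 m plane with its centroid at (0, 0, −0.5), matching the centers quoted in the text:

```python
import numpy as np

def three_orthogonal_placements(proto):
    """proto: (M, 3) planar array in the z = -0.5 m plane, centroid (0, 0, -0.5).
    Returns the 3M-position virtual array: the prototype plus its copies in
    the y = -0.5 m and x = +0.5 m planes (axis permutations / reflections)."""
    a2 = proto[:, [0, 2, 1]]                                # swap y/z: -> y = -0.5 plane
    a3 = proto[:, [2, 1, 0]] * np.array([-1.0, 1.0, 1.0])   # -> x = +0.5 plane
    return np.vstack([proto, a2, a3])

# toy 2-mic prototype; a real run would use the 56-mic spiral array
proto = np.array([[0.1, 0.0, -0.5], [-0.1, 0.0, -0.5]])
virt = three_orthogonal_placements(proto)
```

The centroids of the three placements land at (0, 0, −0.5), (0, −0.5, 0), and (0.5, 0, 0), i.e. the three array positions described in this section.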


3D Imaging by the Non-Synchronous Measurements with Non-Orthogonal Moving Arrays
In the previous simulation configuration, the microphone array always moves orthogonally. Due to limitations in the measurement space or to movement errors, the microphone array cannot always be kept orthogonal. Figure 9a shows the microphone array configuration when the prototype array is rotated 45° about the x-axis. The miniatures in the upper-right and lower-right corners are the x-z and y-z views of the configuration, respectively. The initial microphone array is located at the z = −0.5 m plane. The "second" microphone array is at an angle of 45° to the z = −0.5 m plane, and its center is 0.5 m away from the origin (0, 0, 0). Figure 9 shows that the "second" microphone array still improves the resolution in the Z direction in the y = 0.1 m plane. Compared with the results of the orthogonally moving microphone array in Figure 7, the main lobes of the sources at y = 0.1 m are relatively larger. As shown in Figure 9c, the extension direction of the main lobes in the x = 0.1 m plane changes slightly. The resolution in the x = 0.1 m plane needs to be improved by moving the microphone array further.

Figure 9. 3D beamforming results of the non-synchronous measurements with the array moved non-orthogonally once. (a) The simulation setup. The red points "·" represent the sound sources; points of the same color represent the microphones of the same test: the black points "·" are the microphones of the single planar array test, and the blue points "·" are the microphones after the first non-orthogonal move. The first miniature shows the x-z view and the second the y-z view.

Based on this analysis, the microphone array is moved again. The configuration after the move is shown in Figure 10a; the miniatures in the upper-right and lower-right corners are again the x-z and y-z views. The "third" microphone array is obtained by rotating the initial array by −45° about the y-axis, with its center still 0.5 m from the origin (0, 0, 0). The results show that the resolution in the x = 0.1 m plane is improved: the main lobes of the sound sources can be distinguished in the Z direction, and the source localization is better than in Figures 6 and 9. We can conclude that the non-synchronous measurements in 3D space improve the spatial resolution. Compared with Figure 8e, Figure 10e shows a more accurate sound source localization result because the density of the microphone array in the x-y slice increases.
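The array positions used above can be generated programmatically as rigid rotations of the prototype array. The sketch below is a minimal illustration, assuming an 8 × 7 grid (the source only fixes the 56-element count) and an illustrative grid pitch; it rotates the prototype 45° about the x-axis ("second" array) and −45° about the y-axis ("third" array) while keeping the array center 0.5 m from the origin.

```python
import numpy as np

def rotation(axis, angle_deg):
    """Rotation matrix about 'x', 'y' or 'z' by angle_deg degrees."""
    a = np.radians(angle_deg)
    c, s = np.cos(a), np.sin(a)
    if axis == "x":
        return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])
    if axis == "y":
        return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

# Prototype planar array: an 8 x 7 grid (56 microphones, layout and pitch
# assumed for illustration) centered on the z-axis in the z = -0.5 m plane.
x, y = np.meshgrid(np.linspace(-0.35, 0.35, 8), np.linspace(-0.3, 0.3, 7))
mics0 = np.column_stack([x.ravel(), y.ravel(), np.full(x.size, -0.5)])
centre0 = mics0.mean(axis=0)            # (0, 0, -0.5)

# "Second" array: rotate the prototype 45 deg about the x-axis; rotating the
# center as well keeps it 0.5 m from the origin.
R2 = rotation("x", 45.0)
mics2 = (mics0 - centre0) @ R2.T + R2 @ centre0

# "Third" array: rotate the prototype -45 deg about the y-axis.
R3 = rotation("y", -45.0)
mics3 = (mics0 - centre0) @ R3.T + R3 @ centre0

print(np.linalg.norm(mics2.mean(axis=0)))  # center stays 0.5 m from origin
```

Because a rotation preserves lengths, both moved arrays keep their centers exactly 0.5 m from the origin, matching the configurations described above.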


Comparison between the Synchronous Measurements and Non-Synchronous Measurements
The results of 3D imaging by the synchronous and the non-synchronous measurements are compared in this section. The microphone array is moved orthogonally twice for the non-synchronous measurements; the corresponding synchronous measurement has the three microphone arrays capture data simultaneously. Figure 11a shows the cross-spectral matrix from three independent measurements, arranged on the matrix's diagonal in the form of blocks. The elements in the non-diagonal block positions of the measured spectral matrix are missing; they are filled with zeros in this figure to visualize the unknown parts. Figure 11b shows the spectral matrix completion obtained by the proposed non-synchronous measurement method: all elements of the spectral matrix are known, in both the diagonal and the non-diagonal blocks. Figure 11c shows the full cross-spectral matrix in the case where all microphones of the three arrays capture data simultaneously (i.e., the synchronous measurements). Let D_non denote the sound source position obtained by the non-synchronous measurements. The relative error quantifies the difference between the real source position D_real and D_non by the 2-norm:

Error_position = (‖D_non‖₂ − ‖D_real‖₂) / ‖D_real‖₂.

The frequency of beamforming is set to 4000 Hz, and the signal-to-noise ratio is set to 20 dB. In this case, the maximum relative error Error_position over the six source positions is 1.926%.
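As a minimal sketch of this error metric (the coordinates below are hypothetical examples, not the paper's data):

```python
import numpy as np

def position_error(d_non, d_real):
    """Relative source-position error as defined in the text:
    (||D_non||_2 - ||D_real||_2) / ||D_real||_2."""
    return (np.linalg.norm(d_non) - np.linalg.norm(d_real)) / np.linalg.norm(d_real)

# Hypothetical example: a source at (0.1, 0.1, 0.1) m localized with a small bias.
d_real = np.array([0.1, 0.1, 0.1])
d_non = np.array([0.1, 0.1, 0.102])
print(f"{abs(position_error(d_non, d_real)):.3%}")
```

Note that this definition compares the norms of the two position vectors, so it measures the error in distance from the origin rather than the full Euclidean distance between the two points.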
Since the beamforming algorithm is suitable for higher working frequencies, we selected frequencies of 4000 Hz (Figure 12), 6000 Hz (Figure 13), and 8000 Hz (Figure 14). The beamforming map slices of the synchronous measurements are shown in Figures 12a-c, 13a-c, and 14a-c, and those of the non-synchronous measurements in Figures 12d-f, 13d-f, and 14d-f. The beamforming results of Figures 12-14 show that (1) both the synchronous and the non-synchronous measurements can accurately locate the position of the sound source with good spatial resolution; and (2) compared with the synchronous measurements, the non-synchronous measurements increase the sidelobes slightly, but the number of microphones required for the measurement is reduced significantly in the localization of the spatial sound sources.
Figure 11. (a) The measured spectral matrix; (b) the spectral matrix completion; (c) the full spectral matrix.
The relative error quantifies the difference between the spectral matrix completion (S_com) and the full spectral matrix (S_full) by the Frobenius norm:

Error_CSM = (‖S_com‖_F − ‖S_full‖_F) / ‖S_full‖_F.

Figure 15 shows the errors between the spectral matrix completion (S_com) and the full spectral matrix (S_full) for different frequencies and different SNRs. Although the performance of the non-synchronous measurements is influenced by noise, the error between S_com and S_full decreases with the SNR. When the SNR is relatively low (for example, SNR = 20 dB or 10 dB), the noise plays a key role in the results of the acoustic reconstruction. As the SNR and the frequency increase, the difference between the spectral matrix completion and the full spectral matrix gradually decreases.
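The block structure of the measured cross-spectral matrix and the Frobenius-norm error can be sketched as follows. The toy dimensions (three 2-microphone sub-arrays) and the random synthetic CSM are assumptions for illustration only; the completion step itself is not reproduced here.

```python
import numpy as np

def csm_error(s_com, s_full):
    """Relative CSM error: (||S_com||_F - ||S_full||_F) / ||S_full||_F."""
    return (np.linalg.norm(s_com, "fro") - np.linalg.norm(s_full, "fro")) \
        / np.linalg.norm(s_full, "fro")

# Toy example with three 2-microphone blocks (6 channels in total): the
# measured CSM keeps only the diagonal blocks (one per sequential array
# position); the off-diagonal blocks are zero-filled until completion.
rng = np.random.default_rng(0)
a = rng.standard_normal((6, 4)) + 1j * rng.standard_normal((6, 4))
s_full = a @ a.conj().T                      # synthetic Hermitian full CSM
mask = np.kron(np.eye(3, dtype=bool), np.ones((2, 2), dtype=bool))
s_meas = np.where(mask, s_full, 0)           # diagonal blocks only
print(abs(csm_error(s_meas, s_full)))        # error before completion
```

Zeroing the off-diagonal blocks always lowers the Frobenius norm, so the signed error of the uncompleted matrix is negative; a successful completion drives it toward zero.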

Figure 15. The relative error between the spectral matrix completion (S_com) and the full spectral matrix (S_full) for different frequencies and different signal-to-noise ratios (SNRs).

Experimental Setup
An experiment applying the non-synchronous measurements to 3D acoustic imaging was conducted in a full-anechoic chamber with a cut-off frequency of 40 Hz and a background noise of −1 dB(A). The acoustic vibration measurement platform was provided by the Zhejiang Institute of Metrology (ZJIM). The microphones were of model MPA416 (Beijing Prestige Technology Co., Beijing, China), and the acquisition instrument was a DH5922D (Jiangsu Donghua Testing Technology Co., Ltd., Jingjiang, China). Figure 16a shows a photograph of the on-site experimental setup. To display the spatial locations of the sound sources, a schematic diagram of the source locations and the geometry of the microphone array is drawn in Figure 16b. The four speakers in the experiment are not coplanar. The sound sources were Bluetooth-driven speakers. As in the previous simulations, the four speakers were located at four vertices (A, C, D, and B1) of an imaginary cube ABCD-A1B1C1D1 with a side length of about 0.25 m. To describe the spatial positions of the sources and the array more conveniently in the following, the faces and vertices of this imaginary cube are used as the positioning reference. The four speakers were supported by four tripods and placed at the marked locations on the ground. The microphone array has 56 elements and the same geometry as in the previous simulation. Taking the center of the imaginary cube (O) as the origin, the Cartesian coordinate system is established in Figure 16b. The array was initially placed parallel to the plane ABB1A1 at a distance of 0.6 m.
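With the cube center O at the origin, the speaker coordinates follow directly from the cube geometry. In the sketch below, the assignment of the labels A-D and A1-D1 to specific corners is an assumed convention for illustration; the paper only fixes the side length and the center.

```python
import numpy as np

SIDE = 0.25  # side length of the imaginary cube, m
s = SIDE / 2

# With the cube center O at the origin, every vertex sits at (+-s/2, +-s/2, +-s/2).
# The label-to-corner assignment below (A..D on the bottom face, A1..D1 above
# them) is a hypothetical convention, not taken from the paper.
verts = {
    "A": (-s, -s, -s), "B": (s, -s, -s), "C": (s, s, -s), "D": (-s, s, -s),
    "A1": (-s, -s, s), "B1": (s, -s, s), "C1": (s, s, s), "D1": (-s, s, s),
}
speakers = np.array([verts[v] for v in ("A", "C", "D", "B1")])  # the 4 sources

# Sanity check: adjacent vertices are exactly one side length apart.
print(np.linalg.norm(np.subtract(verts["A"], verts["B"])))  # 0.25
```

Under any such labeling the four chosen vertices are non-coplanar, which matches the statement above that the four speakers do not lie in one plane.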
The microphone array was thus located on the x = −0.725 m plane, and its position was adjusted so that the array center (O1) was aligned with the center of the imaginary cube (O). The strategy for moving the microphone array is shown in Figure 17 (top view). The 1st position of the array was on the x = −0.725 m plane. For the second measurement, the array was rotated 90 degrees counterclockwise, so its 2nd position was on the y = −0.725 m plane. During the measurements, the four speakers simultaneously emitted a 4000 Hz pure tone. When more than 32 channels of the DH5922D are used, its maximum sampling rate is 128 kHz per channel; a sampling frequency of 50 kHz was therefore selected, with a sampling time of 30 s.

Experimental Results
The 3D beamforming results of a single planar microphone array and of the non-synchronous measurements are compared in Figure 18. A significant difference is visible among the three cases. With only the planar microphone array in the 1st position (Figure 18a) or the 2nd position (Figure 18b), there are one or two distinct maximum peaks in the field of view, together with a large sidelobe in the normal direction of each position, so the spatial sound sources cannot be located accurately. With the non-synchronous measurements (Figure 18c), three distinct maxima appear in the field of view, and the sidelobes in the normal direction are reduced significantly. The 4th sound source is located at the corner on the other side, which is blocked by the slices of the 3D beamforming map. The spatial localization accuracy of the sound sources is significantly improved when the non-synchronous measurements are performed by moving the arrays in 3D space, especially when the microphone array is perpendicular to the sound source plane.
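The conventional beamformer evaluated on such 3D maps can be sketched as follows. This is a generic free-field, frequency-domain delay-and-sum formulation, b(r) = gᴴSg / ‖g‖⁴, with monopole steering vectors; the array layout, grid, and normalization convention here are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

C0 = 343.0  # speed of sound, m/s

def steering(grid, mics, freq):
    """Free-field monopole steering vectors g, shape (n_grid, n_mics)."""
    r = np.linalg.norm(grid[:, None, :] - mics[None, :, :], axis=2)
    k = 2 * np.pi * freq / C0
    return np.exp(-1j * k * r) / r

def das_map(csm, grid, mics, freq):
    """Conventional beamformer b(r) = g^H S g / ||g||^4 at each grid point."""
    g = steering(grid, mics, freq)
    num = np.einsum("ni,ij,nj->n", g.conj(), csm, g).real
    return num / np.sum(np.abs(g) ** 2, axis=1) ** 2

# Illustrative check: one monopole at the origin, a small line array at z = -0.5 m.
mics = np.column_stack([np.linspace(-0.3, 0.3, 8),
                        np.zeros(8), np.full(8, -0.5)])
p = steering(np.zeros((1, 3)), mics, 4000.0)[0]  # pressures at the microphones
csm = np.outer(p, p.conj())                      # rank-1 CSM of a single source
grid = np.array([[0.0, 0.0, 0.0], [0.2, 0.0, 0.0]])
b = das_map(csm, grid, mics, 4000.0)
print(b[0] > b[1])                               # peak at the true position
```

With this normalization the output equals 1 exactly at the true source position of a noise-free rank-1 CSM and falls off elsewhere, which is why the map peaks are read off as source locations.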

The beamforming map slices are compared in Figure 19. With the single microphone array in the 1st position (i.e., parallel to the y-z plane), two sound sources can be located in the upper-left and lower-right quadrants of the x = −0.13 m plane. In the y = −0.13 m and z = −0.13 m planes, the beamforming maps cannot be used to determine the source positions because of the large extension of the lobes. With the single microphone array in the 2nd position (i.e., parallel to the x-z plane), the beamforming map in the y = −0.13 m plane exhibits two main lobes that correspond to the positions of the two sound sources, while in the x = −0.13 m and z = −0.13 m planes the main lobes extend so badly that the exact source positions cannot be found. When the non-synchronous measurements are adopted, the spatial resolution in both the x = −0.13 m and y = −0.13 m planes is improved relative to the single microphone array, and the beamforming map in the z = −0.13 m plane is improved slightly. Comparing the beamforming results of the non-synchronous measurements with those of the single microphone array suggests that the non-synchronous measurements improve the spatial positioning resolution of the planar microphone array.

Conclusions
In the 2D beamforming method, the scanning area is a plane, so the distance between the microphone array and the sound source must be known in advance. In industrial practice, however, sound sources are distributed randomly in space rather than in a plane. This paper extends conventional beamforming from 2D to 3D. By moving the microphone array in space, the non-synchronous measurements can significantly improve the planar array's positioning resolution in the normal direction. Most current studies achieve 3D sound source localization by increasing the number of microphone arrays, which has the disadvantage of requiring more microphones and acquisition channels. The non-synchronous measurements overcome this shortcoming while still achieving high resolution in 3D positioning. The method proposed in this paper therefore has extensive application potential in the field of 3D sound localization.
