A 2D-DOA Sparse Estimation Method with Total Variation Regularization for Spatially Extended Sources

Abstract: In this paper, a novel two-dimensional direction of arrival (2D-DOA) estimation method with total variation regularization is proposed to deal with the problem of sparse DOA estimation for spatially extended sources. In a general sparse framework, the sparse 2D-DOA estimation problem is formulated with regularization based on extended source characteristics, including spatial position grouping, acoustic signal block sparsity, and correlation features. An acoustic model of extended sources, a two-dimensional array manifold and its overcomplete representation, a total variation regularization penalty term, and the regularization equation are built, and are used to seek solutions in which the non-zero coefficients are grouped together with optimum sparseness. A total variation sparse 2D-DOA estimation model is constructed by combining total variation regularization with LASSO. The model can be solved by a convex optimization algorithm, and the solving process promotes sparsity both on the spatial derivatives of the solution and on the solution itself. The theoretical analysis shows that the decorrelation processing and angle matching steps of traditional 2D-DOA estimation methods can be avoided when adopting the proposed method. The proposed method has better robustness to noise, better sparsity, and faster estimation speed with higher resolution than traditional methods. It is promising for providing a sparse representation of coherent sources in a non-strictly sparse field.


Introduction
Estimation of the direction of arrival (DOA) is an important research issue in array signal processing and has been widely used in radar, sonar, speech processing, and wireless communication [1][2][3][4][5][6]. At present, the main azimuth estimation methods can be divided into subspace algorithms [7], maximum likelihood algorithms [8], sparse reconstruction algorithms [9], and beamforming algorithms [10,11]. These algorithms usually take as their research object the far-field point source model, in which the sound source energy is concentrated at discrete angles; they also lay a theoretical foundation for DOA estimation under the spatially extended source model, which is characterized by a small number of snapshots, a dense distribution of multiple targets, and strong correlation among the targets. For example, the algorithms can be used in the analysis of flat circular pistons [12,13], wind turbine blades [14], and underwater objects in sonar imaging [15].
Malioutov D. et al. [16] presented a source localization method based on a sparse representation of sensor measurements with an overcomplete basis composed of samples from the array manifold. By modeling the spatial distribution of sound source directions as sparse, sparse reconstruction algorithms transform the parameter estimation problem into a traditional sparse spectrum estimation problem for the array. This method does not need complex statistical values to set the initial variable, which can overcome the low piecewise constant or linear solutions. By combining total variation regularization with LASSO, a 2D-DOA total variation sparse estimation model is constructed, and accurate direction estimation is realized by a convex optimization algorithm. This method does not require the eigen-decomposition of the high-dimensional sample covariance matrix and avoids the extra angle matching and decoherence steps of conventional methods. Meanwhile, it realizes the joint estimation of the pitch angle and azimuth angle of extended sources with high detection probability. The effectiveness of the method is verified by numerical simulation and experiments, and its performance is evaluated and discussed.

Problem Formulation
Consider a planar array with M × N elements (M = N) placed on the XY plane, where M and N are the numbers of elements along the X axis and Y axis, respectively. The elements are evenly spaced at an interval d = 0.5λ, where λ is the incident wavelength. Let t = 1, 2, ..., L denote the sampling times, where L is the number of snapshots. The far-field narrowband sound sources $s(t) = [s(t)_1, s(t)_2, \cdots, s(t)_{N_s}]^T$ impinge on this array, where $N_s$ is the number of sound sources and $(\theta_k, \phi_k)$, $k = 1, 2, \cdots, N_s$, is the 2D arrival direction of the kth sound source, with $\theta_k$ and $\phi_k$ the incident azimuth and pitch angles, respectively. The DOA estimation problem can be expressed as a linear equation. A typical array observation model for the signals received at time t is

$$y(t) = A s(t) + n(t) \qquad (1)$$

where y(t) denotes the array output signal vector, A denotes the 2D array manifold matrix, and n(t) denotes the additive noise vector, which has zero mean and variance $\sigma^2$ and is independent of the sound sources.

Spatial Extended Source Model
Given the above, and as shown in Figure 1, a spatially extended source is modeled by multiple monopole sources arranged in a certain shape, with prescribed amplitudes and phase relationships between adjacent monopoles. Thus, multiple monopoles can be arranged as continuous linear sources with constant or linear amplitudes and random uniform phases. According to the significant grouping characteristics of spatially extended sources, the model can be established as

$$s(t) = [\underbrace{s(t)_1 \cdots s(t)_g}_{s[1]},\ \underbrace{s(t)_{g+1} \cdots s(t)_{2g}}_{s[2]},\ \cdots,\ \underbrace{s(t)_{N_s-g+1} \cdots s(t)_{N_s}}_{s[G]}]^T$$

where g is the grouping length and G is the number of groups, $N_s = G \times g$. Further considering the correlation of the monopoles within a group, taking the first group s[1] as an example and its first monopole source $s(t)_1$ as the reference, the kth monopole source $s(t)_k$ in the group can be expressed as

$$s(t)_k = \beta_k\, s(t)_1\, e^{j\Delta\phi_k}$$

where $\beta_k$ is the amplitude fading factor, $k = 1, \cdots, g$, and $\Delta\phi_k$ is the phase difference of $s(t)_k$ relative to $s(t)_1$. Without loss of generality, $\beta_k = 1$ and $\Delta\phi_k = 0$ are taken.

Figure 1. Acoustic incidence model of extended sources in planar array.
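The grouped, correlated source model can be illustrated with a short NumPy sketch. It assumes that each monopole in a group is a scaled and phase-shifted copy of the group's reference monopole, and that different groups carry independent reference waveforms; the function name `extended_source_signals` and the random reference signals are illustrative choices, not part of the original method.

```python
import numpy as np

def extended_source_signals(G, g, L, beta=None, dphi=None, seed=0):
    """Return a (G*g, L) array of monopole signals arranged in G groups.

    Assumed within-group relation (illustrative):
        s_k(t) = beta_k * exp(1j * dphi_k) * s_ref(t)
    """
    rng = np.random.default_rng(seed)
    beta = np.ones(g) if beta is None else np.asarray(beta, dtype=float)
    dphi = np.zeros(g) if dphi is None else np.asarray(dphi, dtype=float)
    # One complex reference waveform per group (assumption: groups are independent).
    ref = (rng.standard_normal((G, L)) + 1j * rng.standard_normal((G, L))) / np.sqrt(2)
    weights = beta * np.exp(1j * dphi)              # shape (g,)
    # Broadcast to (G, g, L), then stack the groups row by row.
    s = ref[:, None, :] * weights[None, :, None]
    return s.reshape(G * g, L)

# Two groups of three fully coherent monopoles, 50 snapshots.
S = extended_source_signals(G=2, g=3, L=50)
```

With the default values (unit amplitude factors and zero phase differences), all rows inside one group are identical, which is the fully coherent case considered in the text.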

2D Array Manifold Matrix
As the key parameter of Equation (1), the array manifold matrix A directly affects the accuracy of DOA estimation. Considering the spatial geometric relationship between the extended sources and the array shown in Figure 1, the time difference τ of the kth incident monopole source between the first sensor (0, 0) and the Xth sensor (m, n) can be defined as

$$\tau_X = \frac{m d \cos\theta_k \sin\phi_k + n d \sin\theta_k \sin\phi_k}{c}$$

where c is the propagation velocity of sound waves in air at standard atmospheric pressure. The acoustic signal received by the Xth sensor can be expressed in terms of the phase factor $a_X = e^{j\omega\tau_X}$, where $\omega = 2\pi f$ denotes the angular frequency of the incident wave and f denotes its frequency. The direction vectors of the kth monopole source along the X axis and Y axis directions can be represented in Vandermonde form:

$$a_x(\theta_k, \phi_k) = [1,\ e^{j2\pi d\cos\theta_k\sin\phi_k/\lambda},\ \cdots,\ e^{j2\pi(M-1)d\cos\theta_k\sin\phi_k/\lambda}]^T$$

$$a_y(\theta_k, \phi_k) = [1,\ e^{j2\pi d\sin\theta_k\sin\phi_k/\lambda},\ \cdots,\ e^{j2\pi(N-1)d\sin\theta_k\sin\phi_k/\lambda}]^T$$

where $a_x$ and $a_y$ are the array steering vectors in the X axis and Y axis directions, respectively. The array manifold matrices of the $N_s$ incident sound sources along the X and Y axis directions are expressed as

$$A_x(\theta, \phi) = [a_x(\theta_1, \phi_1),\ a_x(\theta_2, \phi_2),\ \cdots,\ a_x(\theta_{N_s}, \phi_{N_s})]$$

$$A_y(\theta, \phi) = [a_y(\theta_1, \phi_1),\ a_y(\theta_2, \phi_2),\ \cdots,\ a_y(\theta_{N_s}, \phi_{N_s})]$$

Since the 2D generalized direction vector can be approximately expressed as the Kronecker product of the two 1D direction vectors along the X axis and Y axis directions [39], and the physical structure of the array determines the array manifold matrix, the 2D direction vector of a planar array can be defined as

$$a(\theta_k, \phi_k) = a_x(\theta_k, \phi_k) \otimes a_y(\theta_k, \phi_k)$$

where a denotes the 2D direction vector and ⊗ denotes the Kronecker product. The corresponding 2D array manifold matrix can be constructed as

$$A = [a(\theta_1, \phi_1),\ a(\theta_2, \phi_2),\ \cdots,\ a(\theta_{N_s}, \phi_{N_s})]$$

where A is the array manifold matrix combining the X and Y axis directions.
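The construction of the 2D manifold from the two 1D steering vectors can be sketched in NumPy as follows. This is a minimal illustration under the half-wavelength spacing stated earlier; the function names `steering_vector_2d` and `manifold_matrix` and the example directions are hypothetical.

```python
import numpy as np

def steering_vector_2d(theta_deg, phi_deg, M, N, d_over_lambda=0.5):
    """2D steering vector as the Kronecker product of the X and Y vectors."""
    theta = np.deg2rad(theta_deg)   # azimuth
    phi = np.deg2rad(phi_deg)       # pitch
    m = np.arange(M)
    n = np.arange(N)
    a_x = np.exp(1j * 2 * np.pi * d_over_lambda * m * np.cos(theta) * np.sin(phi))
    a_y = np.exp(1j * 2 * np.pi * d_over_lambda * n * np.sin(theta) * np.sin(phi))
    return np.kron(a_x, a_y)        # length M*N

def manifold_matrix(directions_deg, M, N, d_over_lambda=0.5):
    """Stack one steering vector per (theta, phi) pair as a column."""
    cols = [steering_vector_2d(t, p, M, N, d_over_lambda) for t, p in directions_deg]
    return np.stack(cols, axis=1)   # shape (M*N, number of directions)

# Example: a 4 x 4 array and two closely spaced directions (illustrative values).
A = manifold_matrix([(10.0, 20.0), (12.0, 20.0)], M=4, N=4)
```

Calling `manifold_matrix` on a dense grid of candidate directions instead of the true source directions gives a dictionary of the kind used in the sparse formulation.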

Sparse DOA Estimation Model
In the sparse representation of DOA estimation, the overcomplete representation of the two-dimensional array manifold matrix A can be written as

$$\tilde{A} = [a(\theta_1, \phi_1),\ a(\theta_2, \phi_2),\ \cdots,\ a(\theta_P, \phi_P)]$$

where $\tilde{A}$ is the overcomplete representation of A, $a(\theta_k, \phi_k)$ is the steering vector, $k = 1, 2, \cdots, P$, $P \geq N_s$, and $\{(\theta_1, \phi_1), (\theta_2, \phi_2), \cdots, (\theta_P, \phi_P)\}$ is the set of all potential source positions. In sparse DOA estimation, $\tilde{A}$ is known and independent of the actual source positions.
The source positions are represented as a P × 1 vector in which only $N_s$ elements are nonzero, so the source is sparse in space. The indices of the nonzero elements in the P × 1 vector correspond to the angular positions of the actual sound sources. The array model in sparse representation can be expressed as

$$y(t) = \tilde{A} s(t) + n(t)$$

In summary, the sparse DOA estimation problem is to find the position indices of all nonzero elements in the P × 1 vector given the observation vector y(t) and the overcomplete array manifold matrix $\tilde{A}$, which is also the goal of this paper.
From Equations (12) and (13), the count of nonzero entries is an ideal measure of sparsity, denoted by $\|s\|_0$ and called the $\ell_0$-norm. However, minimizing it is a difficult combinatorial optimization problem; therefore, the $\ell_1$-norm is often adopted in place of the $\ell_0$-norm, which transforms the problem into a convex optimization problem. In this case, the global optimal solution can be obtained through linear programming, namely, sparse regularization. The resulting model, Equation (14), is the traditional sparse DOA estimation model named LASSO. When there are only a small number of sound sources, the model can significantly improve the resolution of sparse DOA estimation. However, when the sound source is a spatially extended source with obvious grouping and block-sparse characteristics, sparsity in the solution domain alone cannot capture the block-sparse characteristics well and cannot effectively represent the sparsity of the sources. Therefore, total variation is introduced and a 2D total variation sparse DOA estimation method is proposed.
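As an illustration of the LASSO step, the sketch below solves a common penalized form, min over s of 0.5‖y − Ãs‖₂² + μ‖s‖₁, by proximal gradient descent (ISTA). The exact scaling of the data term, the symbol μ, and the choice of ISTA as the solver are assumptions made here for illustration; the paper only states that a convex optimization algorithm is used.

```python
import numpy as np

def soft_threshold(x, tau):
    """Complex soft-thresholding: shrink magnitudes by tau, keep phases."""
    mag = np.abs(x)
    scale = np.maximum(1.0 - tau / np.maximum(mag, 1e-12), 0.0)
    return scale * x

def lasso_ista(A, y, mu, n_iter=500):
    """Minimise 0.5*||y - A s||_2^2 + mu*||s||_1 by proximal gradient (ISTA)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1 / Lipschitz constant of the gradient
    s = np.zeros(A.shape[1], dtype=complex)
    for _ in range(n_iter):
        grad = A.conj().T @ (A @ s - y)        # gradient of the data-fit term
        s = soft_threshold(s - step * grad, step * mu)
    return s

# Illustrative use with a random complex dictionary (not the array manifold).
rng = np.random.default_rng(0)
A_demo = (rng.standard_normal((16, 40)) + 1j * rng.standard_normal((16, 40))) / np.sqrt(2)
s_true = np.zeros(40, dtype=complex)
s_true[[5, 6, 7]] = 1.0
s_hat = lasso_ista(A_demo, A_demo @ s_true, mu=0.1)
```

The penalty acts on each coefficient independently, so nothing in this model ties neighbouring grid directions together, which is the limitation for block-sparse extended sources noted above.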

2D-DOA Total Variation Regularization
In this section, a novel 2D-DOA sparse estimation method with total variation regularization is presented in detail. The definition of the total variation matrix is given and the corresponding total variation regularization term is constructed, which is combined with the traditional LASSO to build the 2D-DOA sparse estimation model. The proposed method (abbreviated 2DTV-CES) uses the group characteristics (block sparsity) and spatial derivatives to promote sparsity both on the spatial derivatives of the solution and on the solution itself.

Total Variation Regularization
Total variation regularization relies on the local smoothness, or discrete-gradient sparsity, of the spatially extended source. It strongly promotes block sparsity in the sound source solution and yields a piecewise constant or linear fit, which suits real models in which each coordinate is closely related to its adjacent coordinates (local correlation), and it suppresses isolated noise peaks. First, to clarify the meaning of total variation, the variation (discrete gradient) is defined for every pair of adjacent coefficients in the coefficient vector of a monopole source. Second, according to the variation relationship between adjacent coefficients, the total variation matrix D tv is constructed using 2D spatial derivatives. Finally, the total variation matrix D tv is used to regularize the source matrix S, and the total variation sparse solution is obtained by convex optimization.

Total Variation Definition
Suppose that the source matrix S has dimension M × N and is sorted by columns. S i,j denotes an element of the matrix, where i is the row index and j is the column index, i, j = 1, . . . , M. The variation between adjacent elements of the matrix is described by ∆S h and ∆S v , which represent the first-order variation of adjacent elements of S along the X axis direction and the Y axis direction, respectively. The 2D total variation is then defined from these two variations.
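A small numerical sketch may help to see why total variation favours grouped coefficients. It assumes first-order differences ∆S h = S(i+1, j) − S(i, j) and ∆S v = S(i, j+1) − S(i, j) and an anisotropic total variation equal to the sum of their absolute values; both the index convention and the anisotropic form are assumptions, since the displayed definitions are not reproduced in this text.

```python
import numpy as np

def total_variation_2d(S):
    """Anisotropic 2D total variation: sum of absolute first-order differences."""
    S = np.asarray(S)
    dS_h = np.diff(S, axis=0)    # S[i+1, j] - S[i, j]
    dS_v = np.diff(S, axis=1)    # S[i, j+1] - S[i, j]
    return np.abs(dS_h).sum() + np.abs(dS_v).sum()

# A 3 x 3 block of active coefficients inside a 5 x 5 grid.
block = np.zeros((5, 5))
block[1:4, 1:4] = 1.0

# A single isolated active coefficient.
spike = np.zeros((5, 5))
spike[2, 2] = 1.0

tv_block = total_variation_2d(block)   # only the block boundary contributes
tv_spike = total_variation_2d(spike)   # each of the 4 neighbours contributes
```

Under these assumptions the nine grouped coefficients cost 12, whereas an isolated interior coefficient costs 4 on its own, so nine non-adjacent interior spikes would cost 36. A penalty on total variation therefore prefers the grouped configuration.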

Total Variation Matrix
The total variation matrix D tv is built from the one-dimensional first-order total variation operator D 1d , which is defined as an (M − 1) × M banded matrix. Each row of the matrix sums to zero: in row i, the element d i,i is assigned the value −1, d i,i+1 is assigned the value 1, and the remaining elements are zero.
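The operator D 1d can be written down directly, as in the sketch below. The sketch also forms the product of D 1d transposed with D 1d and trims its first and last rows, which is one possible reading of the "autocorrelation matrix" construction used for D tv in this section; that reading, and the function names, are assumptions.

```python
import numpy as np

def first_difference_operator(M):
    """(M-1) x M banded matrix with -1 at (i, i) and +1 at (i, i+1)."""
    return np.diff(np.eye(M), axis=0)

def trimmed_gram_operator(M):
    """Assumed reading of D_tv: D_1d^T D_1d with first and last rows removed."""
    D1 = first_difference_operator(M)
    gram = D1.T @ D1             # M x M tridiagonal matrix
    return gram[1:-1, :]         # (M-2) x M, rows of the form [-1, 2, -1]

D_1d = first_difference_operator(5)
D_tv = trimmed_gram_operator(5)
```

Under this reading every row of the trimmed matrix still sums to zero, so constant regions of the solution are not penalized.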
The matrix D tv is generated by removing the first and last rows of the autocorrelation matrix of D 1d . With S Z = S S T , the corresponding expression for the source matrix S follows (Equation (19)).
Appl. Sci. 2023, 13, 9565

2D-DOA Total Variation Regularization Model
When Equation (19) is substituted into Equation (14), the sparse DOA estimation model is transformed into the 2D-DOA total variation regularization model of Equation (20). To ensure the sparsity of the target solution s, LASSO and the two-dimensional total variation are combined on the basis of Equation (20), which improves both the sparsity of the coefficients (i.e., there are few active coefficients in the solution) and the sparsity of the differences between consecutive coefficients (i.e., the flatness of the sparse contour). As a result, the proposed method is suitable for applications involving block sparsity or mixed peaks and flat regions. The final 2D-DOA total variation sparse estimation model, Equation (21), contains two regularization parameters, µ 1 and µ 2 . From the above derivation, it can be seen that the method (2DTV-CES) does not need the traditional decoherence processing, which comes at the cost of losing array aperture, and does not estimate the azimuth angle and the pitch angle separately; therefore, angle pairing is avoided entirely and the algorithm is greatly simplified.
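A model that combines an ℓ1 penalty with a total variation penalty can be solved with standard convex solvers. The sketch below uses ADMM on the assumed objective 0.5‖y − As‖₂² + µ1‖s‖₁ + µ2‖Ds‖₁, where D is a difference operator. The objective scaling, the use of ADMM, the 1D first-difference D, and the identity dictionary in the demo are all illustrative assumptions and not the authors' implementation.

```python
import numpy as np

def soft_threshold(x, tau):
    """Complex soft-thresholding: shrink magnitudes by tau, keep phases."""
    mag = np.abs(x)
    return np.maximum(1.0 - tau / np.maximum(mag, 1e-12), 0.0) * x

def tv_lasso_admm(A, y, D, mu1, mu2, rho=1.0, n_iter=300):
    """ADMM sketch for min_s 0.5||y - A s||^2 + mu1||s||_1 + mu2||D s||_1.

    Splitting: z1 = s carries the l1 term, z2 = D s carries the TV term.
    """
    P = A.shape[1]
    AH = A.conj().T
    lhs = AH @ A + rho * (np.eye(P) + D.T @ D)   # fixed matrix of the s-update
    rhs0 = AH @ y
    z1 = np.zeros(P, dtype=complex)
    u1 = np.zeros_like(z1)
    z2 = np.zeros(D.shape[0], dtype=complex)
    u2 = np.zeros_like(z2)
    for _ in range(n_iter):
        s = np.linalg.solve(lhs, rhs0 + rho * (z1 - u1) + rho * D.T @ (z2 - u2))
        z1 = soft_threshold(s + u1, mu1 / rho)       # sparsity of the solution
        z2 = soft_threshold(D @ s + u2, mu2 / rho)   # sparsity of the differences
        u1 = u1 + s - z1
        u2 = u2 + D @ s - z2
    return z1

# Demo: recover one block of active coefficients from noisy direct observations.
rng = np.random.default_rng(1)
P = 20
s_true = np.zeros(P)
s_true[5:10] = 1.0
y = s_true + 0.05 * rng.standard_normal(P)
D = np.diff(np.eye(P), axis=0)                       # 1D first differences (assumption)
s_hat = tv_lasso_admm(np.eye(P), y, D, mu1=0.05, mu2=0.2)
```

In this toy setting the estimate is close to zero outside the block and roughly constant inside it, which illustrates the grouped, piecewise constant behaviour that the combined penalty is meant to promote.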

Numerical Simulations
The 2DTV-CES performance in 2D-DOA estimation is evaluated by simulations starting with spatially extended sources. We consider a planar array with M = 16 sensors and spacing d = λ/2. The complete array manifold matrix A in Equation (21) [23], and the Improved ESPRIT [30] in the presence of the same spatially extended sources under different SNR and different sparsity K.

Performance Metrics
The detection probability, as the accuracy evaluation factor of focusing directions, is used to evaluate the performance of the three algorithms. When the absolute value of the difference between the estimated focusing direction ŝ(θ i , ϕ j ) and the actual focusing direction s(θ i , ϕ j ) is less than or equal to 2°, the detection is counted as correct. The greater the detection probability, the higher the accuracy.
The detection probability is defined through the relative root-mean-square error at the focusing direction g, where q denotes the number of Monte Carlo experiments, s g (θ i , ϕ j ) is the actual focusing direction, and ŝ g (θ i , ϕ j ) is the corresponding focusing direction estimate.
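The 2° criterion can be turned into a simple empirical detection rate, as sketched below. The sketch assumes that a direction counts as detected when both the azimuth and the pitch errors are within the tolerance and that the estimates are already ordered to match the true directions; it does not reproduce the root-mean-square-error-based formula referred to above. The function name and the example values are hypothetical.

```python
import numpy as np

def detection_probability(true_dirs, est_dirs, tol_deg=2.0):
    """Fraction of directions estimated within tol_deg over all trials.

    true_dirs: (K, 2) array of (theta, phi) in degrees.
    est_dirs:  (q, K, 2) array of estimates from q Monte Carlo trials,
               ordered to correspond to true_dirs (assumption).
    """
    true_dirs = np.asarray(true_dirs, dtype=float)
    est_dirs = np.asarray(est_dirs, dtype=float)
    err = np.abs(est_dirs - true_dirs[None, :, :])   # (q, K, 2) absolute errors
    hit = np.all(err <= tol_deg, axis=-1)            # both angles within tolerance
    return hit.mean()

true_dirs = [[10.0, 20.0], [30.0, 40.0]]
est_dirs = [
    [[10.0, 21.0], [30.0, 40.0]],   # trial 1: both directions within 2 degrees
    [[13.0, 20.0], [31.0, 42.0]],   # trial 2: first direction is 3 degrees off
]
p_detect = detection_probability(true_dirs, est_dirs)
```

In this example three of the four direction estimates fall within the tolerance, so the empirical detection rate is 0.75.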

Performance of TV-Norm Regularization
The results in Figures 2 and 3 indicate qualitatively the performance of total variation regularization (TV-norm) in 2D-DOA estimation in the presence of two kinds of spatially extended sources, Case 1 and Case 2, under high SNR (SNR = 20) and sparsity K = 10.

Figure 2a shows the spatial spectrum of the spatially extended sources with equal amplitudes in Case 1. The amplitude of the solution coefficients holds strictly at the constant value 1 dB, and each focusing direction can be clearly distinguished. The solution coefficients are distinctly divided into two groups. It is clear that the TV-norm promotes a piecewise constant solution in which each solution coefficient is closely related to its neighbors, which reflects the block-sparse characteristics of the spatially extended sources. Figure 2b,c shows the 2D-DOA estimation maps of 2DTV-CES and LASSO, respectively. It can be seen that 2DTV-CES accurately estimates all focusing directions with TV-norm regularization: 19 of the 21 focusing directions are estimated exactly, and only 2 focusing directions deviate by 1°, which is within the allowable error range of ±2° and can be regarded as an effective estimate. Moreover, the two groups of focusing directions are effectively separated. By comparison, the estimation accuracy of LASSO is poor: only 11 of the 21 focusing directions are identified, and the different groups cannot be distinguished. This is because LASSO promotes solution sparsity only in the data domain and not in the gradient domain.
In contrast, 2DTV-CES promotes sparsity not only in the data domain but also in the gradient domain through the TV-norm. Therefore, 2DTV-CES has a better capacity for promoting block sparsity. Figure 3 depicts the estimated spatial spectrum distribution and the DOA maps of 2DTV-CES and LASSO in Case 2. The results show the 2D-DOA estimates of complex extended sources mixed with point sources, demonstrating the promotion of sparsity and clustering by the 2DTV-CES method. Figure 3a shows the spatial spectrum of the 2D-DOA estimation. It can be seen that when point sources and extended sources are present at the same time, the constant solution is clearly separated into three groups by the ℓ1-norm and TV-norm regularization of 2DTV-CES. The amplitude of the solution coefficients holds strictly at the constant value 1 dB, and each focusing direction can be clearly distinguished. The method effectively identifies the focusing directions of the extended sources and the two single-point sources. Figure 3b,c shows the 2D-DOA maps of 2DTV-CES and LASSO, respectively. The results show that 2DTV-CES estimates all focusing directions: 19 directions have zero deviation, and 2 directions have 1° deviation. Only 11 of the 21 directions can be estimated by LASSO, with more than 1° deviation. Both methods estimate the two peaks within the allowable ±2° deviation. In terms of DOA estimation, 2DTV-CES is better than LASSO, since 2DTV-CES introduces TV-norm regularization on the basis of LASSO, which promotes piecewise constant and peak solutions.

Performance of SNR and Sparsity K
The results in Figures 4 and 5 indicate the performance of 2DTV-CES, LASSO, and Improved ESPRIT as a function of sparsity K and SNR, respectively. Figure 4 shows the detection probability of 2DTV-CES, LASSO, and the Improved ESPRIT for different sparsity K. The performance of the three methods is evaluated for 2D-DOAs in the range [(8 + ∆k)°, (15 + ∆k)°], ∆k ∈ [1:1:9], and sparsity K in the range [2:1:10]. Each 2D-DOA is obtained from 1000 Monte Carlo experiments. It can be seen that the detection probability of 2DTV-CES gradually increases up to 98% and tends to stabilize as the sparsity increases; 2DTV-CES performs better at high sparsity than at low sparsity (K < 3). When K is low, the detection probability of LASSO fluctuates slightly between 97% and 100%, and its estimation performance is significantly better than that of 2DTV-CES and the Improved ESPRIT. This is because an extended source with low K is only weakly grouped, so the sparsity-promoting property of LASSO alone achieves high resolution. However, as K increases, the extended sources exhibit strongly grouped structural characteristics. Building on sparsity promotion, 2DTV-CES uses TV-norm regularization to promote a piecewise constant distribution, resulting in significantly better performance than the other two methods. To summarize, 2DTV-CES is insensitive to changes in sparsity when K is greater than 3 and has good robustness.

With the same 2D-DOA estimation conditions as in Figure 4, the performance of 2DTV-CES, LASSO, and the Improved ESPRIT in the presence of additive noise over an SNR range of [−5:5:20] dB is evaluated. The results are shown in Figure 5, which describes the performance of the three methods under different SNRs. As can be seen, the average detection probability of 2DTV-CES is up to 75% and is 40% higher than that of LASSO as the SNR increases. The SNR robustness of 2DTV-CES is higher than that of the other two methods over the whole SNR range [−5:1:20]. As described, the SNR robustness of 2DTV-CES is the best of the three methods. Even when the limitation of the array elements is exceeded, a high detection probability is still obtained at low SNR.
The performance of the proposed method 2DTV-CES, LASSO, and the Improved ESPRIT is compared. The spatial average detection probability is taken as the metric, and statistical analysis is carried out from three aspects: sparsity K, signal-to-noise ratio (SNR), and computing speed. The results are shown in Table 1. As shown in Table 1, the spatial average detection probability of 2DTV-CES is more than 85% across the different sparsity and SNR values. The average computing time is only 29 s. Compared with the other two methods, 2DTV-CES has the highest spatial average detection probability under the same sparsity and SNR conditions and the shortest computing time with the narrowest range of variation. Improved ESPRIT has the lowest detection probability and the longest detection time, and it is not suitable for sparse DOA estimation of extended sound sources. The results further illustrate the advantages of the proposed method 2DTV-CES.

Experimental Results and Discussion
An experimental study was conducted to examine 2D-DOA sparse estimation by 2DTV-CES. The measurements were conducted in the semi-anechoic chamber (500 m³, cut-off frequency 20 Hz) at the Qingdao University of Technology, China, with a 16-channel 2D-random array of 1 m diameter. Its dimensions are 4 × 4 and the microphone spacing d was 8.5 cm. In this experiment, there were two different types of spatially extended sources. One type was a continuous source consisting of four monopole sources (Figure 6a); the other was a dipole source with two loudspeaker drivers placed close together and driven in antiphase (Figure 6b). The two types of extended sources were driven by 2000 Hz acoustic signals and were oriented so that the source plane was parallel to the array plane. The source plane was located at z = 0 cm and the microphone array at z = 100 cm. In accordance with the Nyquist sampling theorem, a sampling frequency of 4096 Hz was used to obtain 10 s data records. Figure 6c depicts the 2D-DOA measurement metric used in the experimental procedure.
In order to verify the focusing and block sparsity features of the TV-norm regularization of 2DTV-CES, the coefficient distributions of the sound source sparse solutions of the three methods were calculated using the continuous sources, and the results are shown in Figure 7. The array test data were analyzed by the 2DTV-CES, LASSO, and Improved ESPRIT methods on the MATLAB numerical platform, and the calculated results are shown in Figure 7a, 7b, and 7c, respectively. As shown in Figure 7a, 2DTV-CES clearly shows the division into three complete groups, the amplitude profile of each group strictly follows the set value, and each 2D-DOA is effectively estimated. In Figure 7b, the results of LASSO can also be divided into three groups, named a, b, and c. Each group is shorter than the corresponding group in Figure 7a, and multiple values within a group overlap. The amplitude profile of each group is the constant value 1 and does not show any linear change. Figure 7c shows the results of the Improved ESPRIT method. There is no group a, and only a few single values appear in group b and group c. Comparing the three methods, it can be concluded that 2DTV-CES has good block focusing and sparsity and can estimate every DOA value completely and clearly. LASSO shows good sparsity, but it cannot clearly estimate all DOAs with constant amplitude and cannot identify any DOAs with linear amplitude. Improved ESPRIT cannot achieve DOA estimation of extended sound sources. This result further verifies the role of TV-norm regularization and is consistent with the simulation results in Figures 2 and 3.
In summary, through the above two kinds of experiments, the advantages of 2DTV-CES in data sparsity promotion and SNR robustness are effectively demonstrated, and an experimental basis for the sparse representation of non-sparse fields is provided.

Conclusions
In this paper, a novel 2D-DOA sparse estimation method with total variation regularization is proposed. It is intended to solve the sparse estimation of a non-strictly sparse sound field using the regularization of sound sources' characteristics. The main conclusions are as follows.
The spatially extended source model is constructed using spatial positions' grouping features of sound sources and sound signals' correlation characteristics. The two-dimensional sparse representation of sound sources is realized by constructing a two-dimensional array manifold matrix and its overcomplete representation.
The characteristics of spatial grouping, block sparsity, and correlation are converted into regularization penalty terms by constructing the total variation regularization term, which promotes the sparsity of the block-sparse solution in the spatial derivative domain. Furthermore, combined with the ℓ1-norm regularization term, the 2D-DOA sparse estimation model is built to jointly estimate the pitch angle and azimuth angle of an extended source by a convex optimization algorithm. This process promotes sparsity both on the spatial derivatives of the solution and on the solution itself, thus rapidly seeking solutions in which the nonzero coefficients are grouped together.
The numerical and experimental results demonstrate that the method with the regularization penalty term significantly improves the sparsity, resolution, and detection probability of 2D-DOA sparse estimation. It also has a certain robustness to noise. The solving process avoids the eigen-decomposition of the high-dimensional sample covariance matrix, the decoherence step, and the angle pairing required by traditional methods. The method is promising for the sparse representation of sound fields that are not necessarily sparse (e.g., in acoustic near-fields and reflective environments).
