CAE-CNN-Based DOA Estimation Method for Low-Elevation-Angle Target

For the DOA (direction of arrival) estimation of a low-elevation-angle target under the influence of the multipath effect, this paper proposes a DOA estimation method based on a CAE (convolutional autoencoder) and a CNN (convolutional neural network). The algorithm first inputs the array signal covariance matrix of the low-elevation target, containing direct and reflected waves, into the convolutional autoencoder to realize the de-multipath, and uses the spatial features extracted by the convolutional autoencoder as the input of an extreme learning machine to realize DOA preclassification of the direct wave; based on the preclassification result, one branch of three parallel convolutional neural networks is selected, and the output of the convolutional autoencoder is used as the input of this branch to realize the DOA estimation. The simulation results show that the algorithm has better estimation accuracy and efficiency than conventional algorithms, especially when the DOA of the target is in the lower range. The analysis of the simulation results shows that the algorithm is effective: the convolutional autoencoder can effectively realize the de-multipath, and the use of parallel convolutional neural networks avoids overfitting and underfitting and realizes DOA estimation more accurately.


Introduction
With the continuous development of electromagnetic wave theory and technology, radar has been widely used in meteorology, exploration, remote sensing, and especially the military [1]. Driven by deepening operational requirements, many countries have developed better air defense systems, so the possibility of targets breaking through defenses from high altitude has decreased, while low-altitude/ultralow-altitude penetration has become one of the main threats to modern radar air defense systems. In the low-/ultralow-altitude environment, the difficulties of the target direction of arrival (DOA) estimation problem include clutter, strong interference, terrain occlusion, and especially the serious multipath effect. When the radar beam hits the target, in addition to the echoes returned directly from the target, echoes scattered indirectly through the ground/sea surface or obstacles also reach the radar receiver at the same time, superimposing on each other to produce an interference effect, called the multipath interference effect [2]. Indirect target echoes scattered or diffracted by obstacles are weaker than the direct target echoes in most cases, and appropriate measures can be taken against them; it is relatively easy to attenuate or eliminate the interference caused by obstacles. In contrast, the indirect target echoes reflected from the ground/sea surface are very strong, almost comparable to the direct target echoes. Reflection from the ground/sea surface is therefore the main cause of the multipath interference effect and is one of the important issues of concern to radar signal-processing researchers.
Currently, the majority of radars operate in the 200 MHz to 10 GHz frequency band, covering meter-wave, decimeter-wave, and centimeter-wave radar. According to the antenna 3 dB beamwidth calculation, the beamwidth of meter-wave radar is usually on the order of 10 degrees, the beamwidth of centimeter-wave radar is around a few degrees, and that of decimeter-wave radar lies in between [1]. In the field of signal processing, the multipath effect of concern mainly occurs when the target elevation angle is less than 0.8 beamwidths; when the target elevation angle is greater than 0.8 beamwidths, the multipath can be more easily filtered out in the time, space, or frequency domain by hardware means or other data-processing methods [3]. The range of low elevation angles is therefore from 0° to about 10°. For the multipath effect generated at low elevation angles, current solutions include multipath mitigation and coherent-signal separation. Multipath mitigation is mainly achieved by changing the antenna pattern; however, its effect on the phase leads to some degradation of the angle-measurement performance [4]. Coherent-signal separation is mainly achieved by making the signal covariance matrix full-rank [5]; common methods include the forward or backward spatial smoothing algorithm and Toeplitz matrix reconstruction [6,7]. The forward or backward spatial smoothing algorithm divides the array into subarrays and solves the rank deficiency of the original signal covariance matrix by calculating and then averaging the subarray covariance matrices [8,9], but the algorithm loses array aperture and reduces degrees of freedom, which affects the angle-estimation performance.
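As a hedged numerical sketch of the forward spatial smoothing idea described above (function name, array size, and angle values are illustrative assumptions, not taken from this paper): the M × M array covariance is averaged over K = M − m + 1 overlapping subarrays of size m, which restores the rank lost to coherent multipath sources.

```python
import numpy as np

def forward_spatial_smoothing(R, m):
    """Average the covariance over sliding subarrays of size m (rank restoration)."""
    M = R.shape[0]
    K = M - m + 1                        # number of overlapping subarrays
    Rs = np.zeros((m, m), dtype=complex)
    for k in range(K):
        Rs += R[k:k + m, k:k + m]        # k-th subarray covariance block
    return Rs / K

# Two fully coherent plane waves produce a rank-1 covariance matrix;
# smoothing over subarrays restores rank 2 (illustrative 8-element array).
idx = np.arange(8)
a1 = np.exp(-1j * np.pi * idx * np.sin(np.deg2rad(5.0)))
a2 = np.exp(-1j * np.pi * idx * np.sin(np.deg2rad(-5.0)))
x = a1 + 0.9 * a2                        # coherent superposition
R = np.outer(x, x.conj())                # rank-1 "covariance" (noise-free)
Rs = forward_spatial_smoothing(R, 5)     # smoothed 5 x 5 covariance
```

The trade-off mentioned in the text is visible here: the smoothed matrix is only m × m, i.e., the effective aperture shrinks from M to m elements.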
The Toeplitz matrix reconstruction method changes the data structure of the covariance matrix by constructing a Toeplitz matrix to achieve rank recovery; however, unlike ordinary coherent signals, multipath coherent signals yield a received-signal covariance matrix that does not have a Toeplitz structure. Ebrahim M. et al. propose a two-stage DOA estimation method, with the first stage distinguishing uncorrelated signals and the second stage solving the directions of arrival of correlated signals using covariance and iterative spatial smoothing [10,11]; but in actual signal processing, the peaks of the direct and reflected waves are difficult to distinguish in a multipath environment, and it is difficult to reconstruct the covariance matrix from the peaks already obtained. ZHAO et al. use the ICA algorithm to obtain the steering vector containing multipath-component information, and use CS theory to estimate the direct component of the array signal and the DOA of each multipath component [12], but this method struggles to achieve effective discrimination when the direct and reflected angles are small. Beyond these, DOA estimation methods based on artificial intelligence algorithms also provide ideas for solving this type of problem.
For example, a hierarchical convolutional neural network-based smart antenna has been proposed [13], which realizes DOA estimation by progressively refining subsectors and transforms the estimation problem into a classification problem; but the direct and reflected angles of a low-elevation-angle target are both small, which leads to a high computational cost. Xiang improves the estimation performance of existing super-resolution algorithms by enhancing the phase of the direct angle in the received signal through deep convolutional neural networks [14], or uses deep neural networks to build a feature-to-feature phase-enhancement framework [15]; but simply enhancing the phase of the direct-angle signal in the received signal is difficult and requires more prerequisites. Ge uses deep convolutional neural networks to achieve DOA estimation of coherent sources based on a sparse representation of the array received signal [16], which requires discretization of the received signal; since the actual received-signal angle is unknown, this leads to discrepancies in the data input when applying the model and affects the estimation accuracy. Liu uses deep neural networks to achieve DOA estimation robust to array defects [17], but this method is not applicable to low-elevation-angle targets.
For low-elevation-angle targets, whose small direct and reflected angles are difficult to distinguish and estimate with high accuracy, this paper proposes a low-elevation-angle target DOA estimation algorithm based on a convolutional autoencoder and a convolutional neural network. The convolutional autoencoder (CAE) in this algorithm can effectively extract the direct-angle part of the received signal; angle estimation using a convolutional neural network (CNN) after preclassification by an extreme learning machine further improves the estimation accuracy and reduces the time and space complexity; and the overall model uses the received-signal covariance matrix as input without complex preprocessing.

Multipath Signal Spatial Model
The spatial model in which the multipath effect occurs is shown in Figure 1 below, demonstrating the correspondence between angle, wave path, and spatial distance. Usually, the plane parallel to the ground or sea surface is used as the base plane, with the direct angle θ_d positive and the reflected angle θ_i negative. For convenience of presentation, the direct angle and reflected angle are both taken as positive values in the spatial model.
According to the geometric relationship in Figure 1, Equations (1) and (2) can be obtained; dividing Equation (1) by Equation (2) yields Equation (3). From Figure 1, the wave paths of the direct and reflected waves can also be obtained, where R_d denotes the wave path of the direct wave and R_1 + R_2 denotes the wave path of the reflected wave; the wave path difference ∆R between the two is given in Equation (8). According to the small-angle approximation [18], when θ < 12°, the numerical error of the sine and tangent functions does not exceed 1%, and the sine and tangent values are approximately equal. The elevation angles of the far-field low-altitude targets studied in this paper all lie in the range (0°, 10°), consistent with the small-angle approximation. Therefore, combined with Equation (4), Equation (8) can be rewritten accordingly. When the target is far from the radar and the elevation angle is small, by L'Hôpital's rule (cos θ_d − 1)/sin θ_d ≈ 0, so the wave path difference is approximately 0.

Multipath Signal Model
In practice, the signal propagates in both directions between the radar and the target; the echo signal received by the radar therefore divides into four parts, corresponding to the four propagation paths of ground reflection in the low-altitude environment [19], shown as the four paths in Figure 1 above: direct AB-direct BA, direct AB-reflected BOA, reflected AOB-direct BA, and reflected AOB-reflected BOA. For convenience of calculation, only the multipath effect in the receiving process is considered, corresponding to the first two of the four paths above. Suppose there is an ideal uniform line array with M elements, element spacing d no greater than half a wavelength, and L snapshots. A far-field narrowband signal is incident on the array at a small elevation angle θ_d (θ_d > 0), and the multipath signal reflected by a smooth surface is incident at an angle θ_i (θ_i < 0); the array received signal can then be expressed as in Equation (10), where s(t) denotes the signal vector; n(t) denotes the array additive noise vector; ε denotes the total reflection coefficient; ρ denotes the complex reflection coefficient, which is determined by the reflecting-surface characteristics [20]; λ_0 is the signal wavelength; 2π∆R/λ_0 is the phase difference due to the path difference between the direct and reflected waves; and A = a(θ_d) + εa(θ_i), where a(θ_d) and a(θ_i) denote the direct- and reflected-wave steering vectors, respectively. According to the derivation of the multipath-effect spatial model, ∆R is approximately 0, so the phase difference between the direct and reflected waves is approximately 0, the direct and reflected waves are coherent, and the total reflection coefficient is approximately equal to the complex reflection coefficient.
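As a hedged numerical sketch of this signal model (the angles, reflection coefficient ρ, noise level, and function name below are illustrative assumptions, not values from the paper): a direct wave at θ_d and a coherent surface reflection at θ_i share one waveform, giving the composite steering vector A = a(θ_d) + εa(θ_i), from which a sample covariance matrix can be formed.

```python
import numpy as np

def steering_vector(theta_deg, M=20, d_over_lambda=0.5):
    """Steering vector of an ideal uniform line array (half-wavelength spacing)."""
    m = np.arange(M)
    return np.exp(-2j * np.pi * d_over_lambda * m * np.sin(np.deg2rad(theta_deg)))

rng = np.random.default_rng(0)
M, L = 20, 100                                   # elements, snapshots
rho = -0.9 + 0.1j                                # illustrative complex reflection coefficient
A = steering_vector(5.0, M) + rho * steering_vector(-5.0, M)   # composite steering vector

# One shared narrowband waveform (coherence) plus additive white noise
s = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
noise = 0.1 * (rng.standard_normal((M, L)) + 1j * rng.standard_normal((M, L)))
X = np.outer(A, s) + noise                       # M x L received data matrix
R = X @ X.conj().T / L                           # sample covariance matrix (M x M)
```

Because direct and reflected waves share the same waveform s(t), the signal part of R is rank-1, which is exactly the rank deficiency the decoherence methods discussed earlier must overcome.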
When the relative position relationship between the target and the radar is fixed, a mapping exists between the direct angle and the reflected angle, derived from Equation (5), from which Equation (13) can be obtained. Substituting Equation (13) into Equation (10) gives a new signal expression, Equation (14), where ⊙ denotes the Hadamard product and Γ_d = [τ_d1, τ_d2, ..., τ_dL]^T. When the array signal is rewritten in the form of Equation (14), the array received-signal covariance matrix takes the form of Equation (15) [21], where R_ss = E{s(t)s^H(t)} is the signal covariance matrix, σ² denotes the unknown noise power, and I denotes the identity matrix of dimension M × M.
In general, when only the direct-wave signal θ_d is incident on the array and there is no multipath signal, the array received-signal covariance matrix is R_0, as given in Equation (16). Comparing Equations (15) and (16), once the mapping f : R → R_0 and its neural network structure are obtained using a deep learning method, the reflected-angle part of the multipath signal can be filtered out and the direct-angle signal obtained, thus realizing the de-multipath.

Deep Neural Network Model
The proposed deep learning network structure is divided into two parts: in the first part, the CAE realizes the de-multipath and the extreme learning machine (ELM) realizes the angle preclassification; in the second part, the parallel convolutional neural networks realize the angle estimation. The deep learning network structure is shown in Figure 2 below.
In the CAE and preclassification model, the CAE is trained with the array signal covariance matrix R under the multipath effect as input and its corresponding signal covariance matrix R_0, containing only the direct-wave signal without the reflected wave, as output, so as to extract features and remove the multipath. Then, according to the three angle intervals, angle preclassification is performed by the ELM on the latent features extracted by the CAE; after classification is completed, the deep convolutional neural networks are trained separately for the different categories, yielding the final angle estimate.

Convolutional Autoencoder and Preclassification Model
The CAE and preclassification model consists of two neural networks in parallel, as shown in Figure 3 below. The part traced by the blue arrows is the CAE, which extracts features and implements the mapping from the multipath signal covariance matrix to the direct-wave signal covariance matrix. The part traced by the green arrows is the extreme learning machine, which realizes the angle preclassification.


Convolutional Autoencoder
CAE is a variant of the autoencoder (AE) that is based on the same principles and is often applied in data compression and feature extraction [22]. Although AE is often considered an unsupervised learning model, strictly speaking it should be classified as self-supervised learning: it aims to copy the input to the output and reconstruct the feature representation between them. Structurally, an AE is divided into two major parts: the encoder transforms the input data into a latent-space (latent feature) representation, and the decoder then reconstructs this mapping. The difference between CAE and AE is that the encoding and decoding are carried out by convolutional operations rather than other neural network layers, with better training results.
In the coding process, the input data are the covariance matrix R; there are K convolution kernels, each denoted w_k, of size 3 × 3, and a bias b_k is added to the result of each convolution operation. The operation of the first convolution layer in the encoding process can then be expressed as

h_k = σ(R ∗ w_k + b_k),

where ∗ denotes the convolution operation and σ(x) denotes the activation function, usually chosen as the ReLU function. The pooling layer follows the convolution operation; this paper uses the max-pooling criterion [23], i.e., the maximum value in each pooling region is retained, with pooling regions of size 2 × 2. Two convolution-pooling operations are performed during encoding, and the latent feature representation of the data is obtained at the end of the coding process.
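One conv-pool stage of the encoder can be sketched as follows; this is a hedged toy implementation assuming a single kernel and "valid" convolution for simplicity (the paper's CAE uses K kernels, and its padding convention is not stated), with a random matrix standing in for the 20 × 20 covariance matrix.

```python
import numpy as np

def conv2d_valid(x, w, b):
    """3x3 'valid' convolution with bias, followed by the ReLU activation."""
    H, W = x.shape
    out = np.empty((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            out[i, j] = np.sum(x[i:i + 3, j:j + 3] * w) + b
    return np.maximum(out, 0.0)              # ReLU

def maxpool2x2(x):
    """Non-overlapping 2x2 max pooling."""
    H2, W2 = x.shape[0] // 2, x.shape[1] // 2
    return x[:H2 * 2, :W2 * 2].reshape(H2, 2, W2, 2).max(axis=(1, 3))

rng = np.random.default_rng(0)
R = rng.standard_normal((20, 20))            # stand-in for the covariance matrix
w, b = rng.standard_normal((3, 3)), 0.1      # one 3x3 kernel and its bias
h = maxpool2x2(conv2d_valid(R, w, b))        # one conv-pool stage: 20 -> 18 -> 9
```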
The upsampling operation is performed first in the decoding process; its purpose is to expand the data by a certain proportion. Since a convolution operation usually reduces the size of the data by a certain percentage, the feature map after upsampling is larger than the input feature map, so the original data size can be recovered. Upsampling methods include the transposed convolution method, the bilinear interpolation method, the inverse pooling method, and the direct padding method. In this paper, the direct padding method is chosen, i.e., each value in the matrix is expanded into an equal-valued block of the required size. Taking the expansion of a 2 × 2 matrix into a 4 × 4 matrix as an example, the expansion process is shown in Figure 4. Compared with the transposed convolution and bilinear interpolation methods, direct padding requires no complex operations; compared with the inverse pooling method, it has a significant advantage in time and space complexity, since it does not require storing the locations and values from the pooling operations in the encoder. After the upsampling operation, the output is convolved with the same rules and activation function as in the encoding process.
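The direct padding expansion of Figure 4 amounts to repeating each entry along both axes; a minimal sketch (the 2 × 2 example values are illustrative):

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]])
# Direct padding: repeat each entry twice along both axes, so every value
# becomes a 2x2 block and the 2x2 matrix expands to 4x4 (the Figure 4 example).
up = np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)
# up == [[1, 1, 2, 2],
#        [1, 1, 2, 2],
#        [3, 3, 4, 4],
#        [3, 3, 4, 4]]
```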
The loss function of the CAE in this paper is the binary cross-entropy function, defined as

L = −∑_{i,j} [ r_ij log r′_ij + (1 − r_ij) log(1 − r′_ij) ],

where r_ij denotes the actual value of an element of the covariance matrix and r′_ij is the predicted value.
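A hedged helper computing this loss (averaged over elements here, and clipped for numerical stability; both choices are implementation assumptions, and values are assumed scaled to [0, 1], e.g. by the sigmoid output layer):

```python
import numpy as np

def bce(r, r_pred, eps=1e-12):
    """Binary cross-entropy between actual values r and predictions r_pred."""
    r_pred = np.clip(r_pred, eps, 1 - eps)   # avoid log(0)
    return -np.mean(r * np.log(r_pred) + (1 - r) * np.log(1 - r_pred))

loss = bce(np.array([1.0, 0.0]), np.array([0.9, 0.1]))
```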


Extreme Learning Machine for Preclassification
Since the angle range is not strictly divided, the selection for CNN branches is not strict, especially when the direct angle is at the boundary of the range. Based on the above requirements, this paper selects an extreme learning machine with simple structure and low time and space complexity for angle preclassification. After the encoding process of CAE, the latent features are generated, and the latent features are used as the input of ELM guided by the green arrow in Figure 3 for training, so as to complete the preclassification of angles.
The characteristics of the ELM are that it has only one hidden layer and does not require gradient-based backpropagation to adjust the weights; i.e., the connection weights between the input and hidden layers and between the hidden and output layers do not need to be adjusted iteratively, which effectively reduces the number of operations, and no loss function needs to be iterated against. Studies have confirmed that the ELM has clear advantages in generalization [24]. For samples X_i = [x_i1, x_i2, ..., x_im], i = 1, 2, ..., N, the training process of the extreme learning machine can be expressed in the following form:

t_i = ∑_{j=1}^{L} β_j σ(W_j · X_i + b_j),

where m denotes the data dimension of sample X_i; L denotes the number of neurons in the hidden layer; W_j and b_j denote the input weights and bias, respectively; β_j denotes the output weight; σ(x) denotes the activation function; and t_i denotes the output of the extreme learning machine. For all samples, this can be abbreviated as

Hβ = t,

where H = [σ(W_j · X_i + b_j)]_{N×L} denotes the outputs of all hidden-layer neurons, β = [β_1, β_2, ..., β_L]^T denotes the output weights, and t denotes the actual output. The purpose of neural network learning is to minimize the error between the actual output and the target output T, i.e.,

min_β ‖Hβ − T‖.

Traditional gradient-descent-based algorithms for this problem must adjust all parameters during iteration, but in the ELM, once the input weights W_j and biases b_j are determined, the outputs of the hidden-layer neurons are uniquely determined. The ELM training problem can therefore be transformed into solving the linear system Hβ = T, and the weights β are uniquely determined as

β = H⁺T,

where H⁺ is the Moore-Penrose generalized inverse of the matrix H.
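The closed-form training above can be sketched numerically as follows; this is a hedged toy example (the sigmoid activation, the random classification data, and all sizes are illustrative assumptions, not the paper's configuration):

```python
import numpy as np

rng = np.random.default_rng(1)
N, m, L_hidden, n_classes = 200, 10, 50, 3     # toy sizes
X = rng.standard_normal((N, m))                # N samples of dimension m
y = rng.integers(0, n_classes, N)              # toy class labels
T = np.eye(n_classes)[y]                       # one-hot target matrix (N x 3)

W = rng.standard_normal((m, L_hidden))         # random input weights, never retrained
b = rng.standard_normal(L_hidden)              # random hidden biases, never retrained
H = 1.0 / (1.0 + np.exp(-(X @ W + b)))         # hidden-layer outputs (N x L_hidden)

beta = np.linalg.pinv(H) @ T                   # beta = H^+ T (Moore-Penrose inverse)
pred = (H @ beta).argmax(axis=1)               # predicted class interval per sample
```

The only "training" step is the single pseudoinverse solve for β, which is why the ELM is attractive for a lightweight preclassification stage.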
The performance of the ELM for preclassification is evaluated by precision, calculated as

precision_i = N_t / N_i,

where N_i denotes the number of samples classified into interval i and N_t denotes the number of samples in N_i that are correctly classified.

Convolutional Neural Network Model
Considering the estimation accuracy and time-space complexity, a parallel convolutional neural network [25] is designed, and its model is shown in Figure 5  The difference is that the number of convolution kernels in each convolutional layer and the number of neurons in each fully connected layer are different in the three convolutional neural network models. In the training process, the forward propagation is followed by the backward error propagation to correct the parameters, and so on iteratively, and the network training is completed when the training error is less than the set threshold or the training count is reached. During training and testing, the three convolutional neural network models are independent of each other and do not interfere with each other.

Simulation Experiments and Data Analysis
In the simulation experiments, the designed array is an ideal uniform line array with 20 elements, the array element interval is half-wavelength, and the target is a farfield narrowband signal. According to Equation (4), 20 groups of spatially located relatively fixed direct and reflected angles are randomly set, the low-altitude range is (0°, 10°], and the accuracy of the direct-angle change is ∆ = 0.001°, so the capacity of all samples is 200000, containing a one-to-one corresponding covariance matrix with multipath, covariance matrix 0 without multipath, and corresponding direct and reflected The "de-multipath" covariance matrix obtained from CAE and the angle classification obtained from ELM are input to the parallel convolutional neural network model in Figure 5, and the covariance matrix is input to the corresponding convolutional neural network by judging the angle category to output the estimated angle. The three parallel convolutional neural networks CNN1, CNN2, and CNN3 have the same convolutional structure, which are three convolutional layers connected with a pooling layer and finally connected with three fully connected layers; the size of the convolutional kernel is 3 × 3, the activation function is the ReLU function, the pooling layer adopts the maximum pooling criterion, the size is 2 × 2, and the depth of fully connected layer is 3. To make the output more streamlined and intuitive, CNN1, CNN2, and CNN3 are designed according to the regression problem model, and the output of the convolutional neural network is the DOA estimate, and the loss function of each convolutional neural network model is the mean-square error function, i.e., where θ d is the actual value and θ d denotes the predicted value. The difference is that the number of convolution kernels in each convolutional layer and the number of neurons in each fully connected layer are different in the three convolutional neural network models. 
In the training process, the forward propagation is followed by the backward error propagation to correct the parameters, and so on iteratively, and the network training is completed when the training error is less than the set threshold or the training count is reached. During training and testing, the three convolutional neural network models are independent of each other and do not interfere with each other.

Simulation Experiments and Data Analysis
In the simulation experiments, the designed array is an ideal uniform linear array with 20 elements, the element spacing is half a wavelength, and the target is a far-field narrowband signal. According to Equation (4), 20 groups of spatially relatively fixed direct and reflected angles are randomly set, the low-altitude range is (0°, 10°], and the step of the direct-angle change is Δθ = 0.001°, so the total sample capacity is 200,000, each sample containing a covariance matrix R with multipath, the corresponding covariance matrix R0 without multipath, and the corresponding direct and reflected angles. The test set contains 2000 randomly selected samples, and the remaining samples are used as the training set. The CAE and preclassification model and the parallel CNN model are shown in Figures 3 and 5 above, where each layer of the CAE has 20 convolution kernels of size 3 × 3 with the ReLU activation function. In the decoding stage, a convolutional layer with a single 3 × 3 convolution kernel and the sigmoid activation function is added at the end. The hidden layer of the ELM model has 50 neurons, and the three angle-classification intervals are C1: (0°, 3.…°], C2: (…], and C3: (…, 10°]; the numbers of convolution kernels in the three convolutional layers are CNN1: (…, 16, 12), CNN2: (20, 12, 12), and CNN3: (12, 12, 6), and the numbers of neurons in the fully connected layers are CNN1: (6000, 3000, 3000), CNN2: (3000, 2000, 2000), and CNN3: (1500, 1500, 1500), respectively; DOA estimation is realized after the convolutional, pooling, and fully connected layers. In this paper, the root-mean-square error is chosen as the measure of DOA estimation performance, defined as

$\mathrm{RMSE} = \sqrt{\frac{1}{Q}\sum_{q=1}^{Q}\left(\hat{\theta}_q - \theta_q\right)^2},$

where Q denotes the test set capacity, and $\hat{\theta}_q$ and $\theta_q$ denote the estimated and actual angle values, respectively.
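The sample-generation step above can be sketched as follows, assuming a simple coherent two-ray model (direct wave plus a ground reflection with an assumed complex reflection coefficient `rho`); the exact signal model of Equation (4) is not reproduced here, so treat this as an illustrative approximation, along with the RMSE metric just defined.

```python
import numpy as np

def steering(theta_deg, m=20):
    """Steering vector of an M-element half-wavelength-spaced ULA."""
    n = np.arange(m)
    return np.exp(1j * np.pi * n * np.sin(np.deg2rad(theta_deg)))

def multipath_covariance(theta_d, theta_r, rho=-0.9, snr_db=10, m=20):
    """Analytic covariance of a direct wave plus a coherent ground reflection.

    rho is an assumed reflection coefficient; unit-power white noise is added
    on the diagonal, so snr_db sets the signal power directly.
    """
    a = steering(theta_d, m) + rho * steering(theta_r, m)  # coherent superposition
    ps = 10.0 ** (snr_db / 10.0)
    return ps * np.outer(a, a.conj()) + np.eye(m)

def rmse(theta_hat, theta_true):
    """Root-mean-square error over Q angle estimates, as defined above."""
    d = np.asarray(theta_hat, float) - np.asarray(theta_true, float)
    return np.sqrt(np.mean(d ** 2))
```

A dataset of (R, R0, angle) triples would then be built by sweeping the direct angle over (0°, 10°] in Δθ steps, with R0 obtained by setting `rho = 0`.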

Verification of Algorithm Validity
The number of array elements in the experiment is 20, the signal-to-noise ratio (SNR) is 10 dB, and the number of snapshots is 100; three sets of direct and reflected angles are taken as examples. The spatial smoothing preprocessing (SSP) algorithm [8], the improved spatial smoothing preprocessing (MSSP) algorithm [9], and the convolutional autoencoder algorithm proposed in this paper are used for decoherence, and the MUSIC algorithm is used for DOA estimation after decoherence; these are denoted SSP MUSIC, MSSP MUSIC, and CAE MUSIC in order. The spatial spectrum is used to compare the angular resolution and estimation performance of the three algorithms, as shown in Figures 6-8.
Comparing SSP MUSIC and MSSP MUSIC in Figure 6 shows that when the direct angle is small, the spacing between the direct and reflected angles is also small; in this case, relying on spatial smoothing for decoherence cannot fully distinguish the direct and reflected angles, and the spectral peaks of the SSP MUSIC and MSSP MUSIC algorithms appear only in the negative-angle region. As shown in Figures 7 and 8, when the direct angle increases, SSP MUSIC and MSSP MUSIC can distinguish the direct and reflected angles, but interference spectral peaks appear in the target region; for the SSP MUSIC algorithm in particular, the interference peaks can even seriously affect the final angle estimate. As can be seen in Figures 6-8, compared with SSP MUSIC and MSSP MUSIC, the CAE MUSIC algorithm obtains spectral peaks within the target region, and its most significant advantage is that there are no interference peaks within the target region, while the amplitude of interference peaks outside the target region is much smaller than that of the peaks within it. When the incidence angle is small, the estimation errors of all three algorithms are large; as the incidence angle increases, the estimation accuracy of all three improves, with the CAE MUSIC algorithm outperforming the other two. This demonstrates that the CAE algorithm is effective in extracting the direct-wave signal from low-elevation-angle signals, and that its performance is better than that of SSP and MSSP.
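For reference, forward spatial smoothing and the MUSIC pseudo-spectrum used as the decoherence baseline can be sketched as follows; the subarray size and the angles in the usage example are illustrative choices, not the paper's exact settings.

```python
import numpy as np

def steering(theta_deg, m):
    """Steering vector of an m-element half-wavelength-spaced ULA."""
    n = np.arange(m)
    return np.exp(1j * np.pi * n * np.sin(np.deg2rad(theta_deg)))

def forward_smooth(r, sub_m):
    """Forward spatial smoothing: average covariances of overlapping subarrays."""
    m = r.shape[0]
    n_sub = m - sub_m + 1
    rs = np.zeros((sub_m, sub_m), dtype=complex)
    for l in range(n_sub):
        rs += r[l:l + sub_m, l:l + sub_m]
    return rs / n_sub

def music_spectrum(r, n_src, grid):
    """MUSIC pseudo-spectrum over an angle grid (degrees)."""
    m = r.shape[0]
    _, v = np.linalg.eigh(r)       # eigenvalues ascending
    en = v[:, : m - n_src]         # noise subspace
    p = np.empty(len(grid))
    for i, th in enumerate(grid):
        a = steering(th, m)
        p[i] = 1.0 / np.real(a.conj() @ en @ en.conj().T @ a)
    return p
```

With two coherent waves the full-array (noise-free) covariance is rank one, so MUSIC alone fails; after forward smoothing the rank is restored to two and the spectrum peaks near both angles.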
The RMSE of the combined algorithms and CAE CNN for DOA estimation in each angle range (0° to 1°, 1° to 2°, 2° to 3°, etc.) is calculated under the above conditions, and the results are shown in Figure 9. From Figure 9a, it can be seen that the RMSEs of CAE MUSIC, CAE ESPRIT, and CAE ML are lower when the direct angle is greater than 5°; when the direct angle is less than 5°, the RMSEs are all higher, with the CAE ESPRIT estimation accuracy slightly higher than that of the other two, although at direct angles greater than 7° its RMSE exceeds that of the other two algorithms.
In Figure 9b, the three curves CAE CNN1, CAE CNN2, and CAE CNN3 plot the results of testing each of the three neural networks after training on all samples, i.e., without preclassification by the ELM, so that DOA estimation is realized by a single CNN. The CAE CNNs curve shows the results obtained with ELM preclassification followed by DOA estimation with the corresponding CNN, which is the model proposed in this paper. From Figure 9b, it can be seen that CAE CNN1, CAE CNN2, and CAE CNN3 have unequal RMSEs in each angle range, and the RMSE decreases as the angle increases. When the direct angle is small, CNN1 has the lowest RMSE and the highest estimation accuracy; in the range from 4° to 6°, CNN2 has a lower RMSE than the other two; and when the direct angle is larger than 6°, CNN3 performs slightly better than the other two. Comparing Figure 9a,b shows that the use of a CNN significantly reduces the RMSE of DOA estimation, especially when the angle is less than 5°. Comparing the four curves in Figure 9b shows that when the three angle ranges are trained separately, the RMSE of CAE CNNs in each angle range is better than that of the other three. From the perspective of model training efficiency, increasing the number of convolution kernels, the number of fully connected neurons, and the number of iterations in a convolutional neural network increases the time and space complexity of the model and even risks overfitting; considering both aspects, dividing the angle intervals and performing model training and angle estimation separately has significant advantages.
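Per-angle-range RMSE curves like those in Figure 9 can be computed by binning test samples by their true direct angle; a minimal sketch, with the 1° bins used above:

```python
import numpy as np

def rmse_per_interval(theta_true, theta_hat, edges):
    """RMSE of DOA estimates grouped into angle bins [edges[i], edges[i+1]).

    Returns NaN for bins that contain no test samples.
    """
    theta_true = np.asarray(theta_true, float)
    theta_hat = np.asarray(theta_hat, float)
    idx = np.digitize(theta_true, edges) - 1   # bin index of each sample
    out = np.full(len(edges) - 1, np.nan)
    for b in range(len(edges) - 1):
        mask = idx == b
        if mask.any():
            out[b] = np.sqrt(np.mean((theta_hat[mask] - theta_true[mask]) ** 2))
    return out
```

Each curve in Figure 9 would correspond to one call of this function with a different estimator's outputs.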
The precision of ELM preclassification over the three angle intervals is shown in Table 1. As can be seen from Table 1, the classification precision for each angle interval is high after adopting the ELM, which meets the preclassification requirements.
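A minimal ELM of the kind used here for preclassification (random hidden layer, sigmoid activations, output weights solved in closed form by a pseudoinverse) can be sketched as follows; the input features and interval bounds used when exercising it are illustrative stand-ins for the CAE latent features and the paper's C1-C3 intervals.

```python
import numpy as np

def elm_fit(x, y_onehot, n_hidden=50, seed=None):
    """Extreme learning machine: random fixed hidden layer, least-squares output layer."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((x.shape[1], n_hidden))   # random input weights (never trained)
    b = rng.standard_normal(n_hidden)                 # random biases
    h = 1.0 / (1.0 + np.exp(-(x @ w + b)))            # sigmoid hidden activations
    beta = np.linalg.pinv(h) @ y_onehot               # analytic output weights
    return w, b, beta

def elm_predict(x, w, b, beta):
    """Class index = argmax of the linear readout of the hidden activations."""
    h = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return np.argmax(h @ beta, axis=1)
```

Because only the output layer is solved (no iterative training), the ELM adds very little complexity on top of the CAE, which is the stated reason for using it as the preclassifier.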

Effect of the Number of Snapshots on DOA Estimation Performance
In this group of simulation experiments, SNR = 10 dB, the other simulation parameters are kept constant, and array received data with snapshot numbers of 25, 50, 100, 150, 200, 300, and 400 are designed for simulation. Since there are cases in which the conventional algorithms cannot effectively estimate the direct or reflected angle (e.g., the MSSP MUSIC algorithm fails to separate the spatial spectra of the direct and reflected angles, or an algorithm's angle estimate deviates seriously from the target region), such cases are regarded as invalid estimates; therefore, only the valid direct- and reflected-angle estimates are used in the mean-square error calculation. The mean-square error calculations and the comparison experiments on the effect of signal-to-noise ratio on DOA estimation performance in the next section follow the same principle. The statistical effective estimation rates of the conventional algorithms are shown in Table 2. From Table 2, it can be seen that the effective estimation rates of the above three algorithms for the direct angle are lower than those for the reflected angle; among them, MSSP ESPRIT has the lowest and MSSP MUSIC the highest effective estimation rate for the direct angle, while the ML algorithm has the highest effective estimation rate for the reflected angle, and the overall effective estimation rates for both angles increase with the number of snapshots. When the CAE is combined with the conventional algorithms, the estimated angles all fall within the target range, so all of them achieve effective estimation.
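The effective-estimate screening described above can be sketched as a simple filter: an estimate counts as valid only if it falls within the target region, and the RMSE is computed over valid estimates only. The ±1° tolerance below is an assumed placeholder for the paper's target-region criterion.

```python
import numpy as np

def effective_rate_and_rmse(theta_hat, theta_true, tol=1.0):
    """Fraction of estimates within +/- tol degrees of truth, and RMSE over those only.

    tol is an illustrative stand-in for the target-region criterion.
    """
    theta_hat = np.asarray(theta_hat, float)
    theta_true = np.asarray(theta_true, float)
    valid = np.abs(theta_hat - theta_true) <= tol
    rate = valid.mean()
    if valid.any():
        r = np.sqrt(np.mean((theta_hat[valid] - theta_true[valid]) ** 2))
    else:
        r = np.nan
    return rate, r
```

Tables 2 and 3 would then report `rate` per algorithm, while the RMSE curves use only the surviving estimates.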
The RMSE of each algorithm is calculated for different numbers of snapshots, including the total (direct plus reflected angles) RMSE and the RMSEs of the direct and reflected angles separately, and the RMSEs of DOA estimation by the CAE combined with the conventional and CNN algorithms are compared, as shown in Figure 10 below. Figure 10a-c show that the RMSE of DOA estimation of the direct angle by the CAE combined with a conventional algorithm is smaller than that of the conventional algorithm after MSSP decoherence; the RMSE shows a slight decreasing trend as the number of snapshots increases and is essentially stable once the number of snapshots exceeds 200. At larger numbers of snapshots, MSSP ESPRIT has the highest estimation accuracy for the direct angle among the conventional algorithms, but, as Table 2 shows, its effective estimation rate is also the lowest. Combining the RMSE with the effective estimation rate, the ML algorithm performs best among the conventional algorithms for estimating the direct angle. From Figure 10d, it can be seen that the RMSEs of the CAE MUSIC, CAE ESPRIT, and CAE ML algorithms are not greatly affected by the number of snapshots, while the CAE CNN algorithm has a significantly better RMSE than the other algorithms at all snapshot numbers; its RMSE decreases as the number of snapshots increases and stabilizes when the number of snapshots exceeds 200.

Effect of SNR on DOA Estimation Performance
In this set of experiments, the number of snapshots is 200, the other simulation parameters are kept constant, and array received data with SNRs of −5 dB, 0 dB, 3 dB, 6 dB, 9 dB, 12 dB, and 15 dB are designed for simulation. The effective estimation rates of the direct and reflected angles obtained by each algorithm at different SNRs are shown in Table 3 below.
From Table 3, it can be seen that as the SNR increases, the effective estimation rate of each algorithm for the direct and reflected angles increases; the effective estimation rate of the MSSP MUSIC algorithm for the direct angle is higher than that of the other two algorithms, and the ML algorithm has the highest effective estimation rate for the reflected angle but also the lowest effective estimation rate at SNRs of −5 dB and 0 dB, so it is the most affected by low SNR. The RMSE of each algorithm under different SNRs was calculated and compared with the RMSEs of the CAE combined with the conventional and CNN algorithms for DOA estimation, as shown in Figure 11 below. As shown in Figure 11a-c, the RMSE of DOA estimation of each algorithm decreases as the SNR increases, with the ML algorithm changing most significantly, while the RMSEs of the CAE MUSIC, CAE ESPRIT, and CAE ML algorithms do not change significantly with SNR. Among the traditional algorithms, the MSSP ESPRIT algorithm has the smallest RMSE for the total, direct, and reflected angles, but, combined with Table 2, its effective estimation rate is the lowest of the three. As can be seen in Figure 11d, the RMSE of CAE ML for DOA estimation of the direct angle is the lowest among the combined algorithms, but the RMSE of CAE CNN decreases with increasing SNR and is significantly lower than that of the other three algorithms under all SNR conditions, stabilizing when the SNR is greater than or equal to 9 dB. It can be concluded that, under different SNRs, the CAE CNN algorithm outperforms both the conventional algorithms and the combined algorithms for DOA estimation of the direct angle in low-elevation signals.

Conclusions
In the field of radar signal processing, DOA estimation of low-elevation-angle targets has always been a key and difficult problem. The root cause is that, compared with high-altitude targets, low-elevation-angle targets suffer a more serious multipath effect: the strong ground-reflected signal is highly correlated with the direct wave in the time domain, and it is almost impossible to distinguish the direct-wave signal from the ground-reflected multipath signal in the spatial, temporal, or frequency domain, resulting in large DOA estimation errors. With the application of deep learning in radar signal processing, deep learning also provides new ideas for DOA estimation of low-elevation-angle targets. On this basis, this paper proposes a new CAE-CNN-based DOA estimation method for low-elevation-angle targets, which consists of two main parts. The first part takes the covariance matrix of the received signals of the low-elevation-angle target array, containing both direct and reflected waves, as the input of the CAE and obtains the "de-multipath" covariance matrix containing only the direct-wave signal. To limit the additional complexity introduced by preclassification, the preclassification model is implemented with an ELM, and the latent spatial features obtained in the encoding stage of the CAE are used as the ELM input to obtain the DOA preclassification of the direct wave. The "de-multipath" covariance matrix and the angular preclassification obtained in the first part are used as the inputs to the parallel CNN in the second part. According to the preclassification result, the corresponding convolutional neural network is selected, and the "de-multipath" covariance matrix is used as its input to obtain the DOA estimate of the direct wave. Since deep neural network models can underfit when the dataset is small, sufficient training data are necessary for the model to be adequately trained.
The model is trained offline, and once the neural network has been trained, prediction requires minimal time. Taking estimation accuracy and model complexity into account, a parallel convolutional neural network model with three branches is designed. Simulation experiments show that the CAE algorithm can effectively de-multipath a low-elevation-angle target signal and achieves higher estimation accuracy and a higher effective estimation rate when combined with traditional algorithms for DOA estimation. The combination of the CAE and the parallel CNN better extracts the feature relationship between the covariance matrix and the angle in different angle ranges, yielding a lower RMSE and better DOA estimation performance while avoiding overfitting and underfitting as much as possible and reducing time and space complexity.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available because they are not drawn from public datasets but were obtained by simulating the signal models described in the article.