A 1D Cascaded Denoising and Classification Framework for Micro-Doppler-Based Radar Target Recognition

Beili Ma; Baixiao Chen

doi:10.3390/rs17091515

National Key Laboratory of Radar Signal Processing, Xidian University, Xi’an 710071, China

* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(9), 1515; https://doi.org/10.3390/rs17091515

Abstract

Micro-Doppler signatures play a crucial role in capturing target features for radar classification tasks, and time–frequency distributions are widely used to represent them in applications such as human activity recognition, ground moving target identification, and drone type discrimination. However, most existing studies based on radar micro-Doppler spectrograms require extended observation times to represent the cyclostationarity and periodic modulation of radar signals well enough to achieve promising classification results. In addition, noise in real-world environments weakens micro-Doppler features and lowers the signal-to-noise ratio (SNR), leading to a significant decline in classification accuracy. In this paper, we present a novel one-dimensional (1D) cascaded denoising and classification framework designed for low-resolution radar targets using the micro-Doppler spectrum. The framework provides an effective signal-based solution for feature extraction and recognition from the single-frame micro-Doppler spectrum in a conventional pulsed radar system, offering high real-time efficiency and low computational requirements under conditions of low resolution and short dwell time. Specifically, the framework is implemented as two cascaded subnetworks. First, for radar micro-Doppler spectrum denoising, we propose an improved 1D DnCNN subnetwork to enhance noisy or weak micro-Doppler signatures. Second, an AlexNet subnetwork is cascaded for the classification task, and a joint loss is calculated to update the denoising subnetwork and support optimal classification performance. We have conducted a comprehensive set of experiments using six types of targets with a ground surveillance radar system to demonstrate the denoising and classification performance of the proposed cascaded framework, which shows significant improvement over separate training of the denoising and classification models.

Keywords:

micro-Doppler signatures; time–frequency analysis; radar target recognition; pulsed-Doppler radar

1. Introduction

Compared with ultra-wideband and high-resolution radar [1,2], the low-resolution radar system has the advantages of a simple structure, low cost, and easy engineering implementation, and it has been widely applied to perimeter security and surveillance. In this paper, we focus on low-resolution radar target classification and aim to implement a single-frame decision recognition scheme based on a pulsed surveillance radar, which is extremely challenging due to the low range resolution and short dwell time.

Micro-Doppler signatures, which refer to the Doppler shifts in radar returns caused by the motion of components on a target [3,4,5], reflect differences in the micro-motion characteristics of moving radar targets. For example, rotating parts on a target can induce frequency modulations in the radar returns [6]. These signatures contain the details of the target micro-motion components reflected in the frequency domain, from which useful information can be extracted for target recognition [7,8]. For many years, hand-crafted feature extraction methods for radar target classification have been used to capture micro-Doppler signatures in various fields such as ground moving target classification [9,10], human activity recognition [11], and hand gesture identification [12].

Recently, deep learning methods have been widely used for tasks such as image classification, semantic segmentation, speech recognition, and radar target recognition, demonstrating remarkable capabilities and performance [13,14,15,16,17,18,19]. The micro-Doppler spectrogram, which represents micro-Doppler information in the time–frequency domain, plays a crucial role in bridging radar signals and deep learning methods, especially in the context of target recognition and classification [20]. As is known, convolutional neural network (CNN) and other deep learning architectures are commonly employed for image-based data. Converting radar signals into the micro-Doppler spectrogram allows for the representation of complex temporal and frequency patterns as images. Each target’s unique micro-Doppler signature becomes a distinguishable feature in these spectrograms. By using micro-Doppler spectrograms as input data, deep learning models can be trained to automatically extract and learn discriminative features for target identification.

A group of pulses that are to be combined coherently, for example via Doppler processing or synthetic aperture radar (SAR) imaging, is said to form a coherent processing interval (CPI). Radar signal processing often operates on data from one CPI, which typically lasts from tens of milliseconds to tens of seconds. The primary objective of this paper is to integrate the recognition task into one frame in a pulsed-Doppler radar system, wherein the CPI serves as the processing unit for target detection and tracking. In our previous research [21], we presented the micro-Doppler spectrum and spectrogram for radar signals within a single CPI to analyze the micro-Doppler characteristics of six different target types. To achieve the classification task, we employed a two-channel Vision Transformer (ViT) based on the micro-Doppler spectrogram. Although the micro-Doppler spectrogram offers richer information than the micro-Doppler spectrum, our earlier work demonstrated that the spectrum is also capable of distinguishing between the six target classes. Motivated by this insight, our current effort aims to develop a one-dimensional (1D) signal-based network using the micro-Doppler spectrum, as opposed to a two-dimensional (2D) image-based network relying on the micro-Doppler spectrogram. This shift allows us to explore the potential of the micro-Doppler spectrum for robust target classification with a 1D framework, which not only offers high real-time efficiency but also significantly reduces computational demands.

Meanwhile, our work considers the challenge posed by noise, a prevalent factor that significantly influences classification performance. In real-world environments, noise often reduces the signal-to-noise ratio (SNR) and weakens micro-Doppler features, which leads to a notable decline in classification accuracy. Recognizing and mitigating the impact of noise within the micro-Doppler spectrum is crucial for ensuring the robustness and reliability of our classification methods in real-world scenarios.

In this paper, a 1D denoising and classification cascaded framework based on the micro-Doppler spectrum is proposed to effectively classify low-resolution radar targets. The proposed framework is efficiently designed with two cascaded sub-networks to achieve classification decisions within one CPI period, which has the advantages of high real-time efficiency and low computational load, especially under the challenging conditions of a low resolution and short dwell time in traditional pulse radar systems. The first denoising subnetwork addresses radar micro-Doppler spectrum denoising, in which an improved 1D-DnCNN model is proposed to enhance the quality of micro-Doppler signatures that may be affected by noise or exhibit weak signals. The second classification subnetwork utilizes an AlexNet-based architecture for the classification task, where a joint loss is calculated to facilitate the simultaneous optimization of the denoising subnetwork and contribute to achieving excellent classification performance.

The main contributions of this paper are summarized as follows.

This work develops a 1D signal-based network using the micro-Doppler spectrum to achieve the classification task based on a conventional pulsed radar system, as opposed to a 2D image-based network relying on the micro-Doppler spectrogram commonly used in existing studies.
We propose an improved 1D-DnCNN model that utilizes the 1D-BM3D algorithm to obtain ground truth labels for measured data, effectively addressing the challenge of training pair construction and enhancing denoising performance for 1D micro-Doppler spectrum, thereby ensuring high-quality input for the subsequent classification task.
We propose a cascaded framework with joint loss optimization, where the denoising subnetwork enhances micro-Doppler signatures degraded by noise and is integrated with the 1D-AlexNet classification subnetwork. By employing a joint loss function, our approach enables the synchronized optimization of both denoising and classification tasks. This design ensures outstanding classification performance, even under the challenges of a low resolution and short dwell time in the surveillance radar system.
Validated on both simulated and measured radar data, the proposed 1D cascaded framework demonstrates outstanding multi-target classification performance compared to separately trained denoising and classification models. Furthermore, compared to 2D-based networks, it offers superior real-time efficiency and lower computational complexity.

The remainder of this paper is organized as follows: Section 2 reviews related work in the field. The proposed framework is presented in detail in Section 3. Section 4 presents the experimental setup and implementation details. In Section 5, we provide a comprehensive evaluation of the framework, including experimental results and performance analysis. Section 6 discusses the limitations of our work and suggests future research directions. Finally, Section 7 concludes this paper.

2. Related Works

2.1. Radar Micro-Doppler Spectrograms Denoising

Denoising is a crucial process for reconstructing high-quality signals from noisy measurements. Classical and highly successful image denoising methods, such as nonlocally centralized sparse representation (NCSR) [22], group-based sparse representation (GSR) [23], and block matching and 3D filtering (BM3D) [24], group similar patches within the image globally via block matching or clustering and impose non-local structural priors on these groups, which usually leads to state-of-the-art image denoising performance.

In recent years, deep learning-based methods, including the denoising convolutional neural network (DnCNN) [25], the fast and flexible denoising network (FFDNet) [26], and the convolutional blind denoising network (CBDNet) [27], have shown more promising results than traditional techniques. DnCNN focuses on removing Gaussian noise by employing residual learning and batch normalization (BN). FFDNet extends this capability to handle more complex, real-world noise by incorporating noise level maps into the network input. CBDNet further enhances FFDNet by introducing a five-layer fully convolutional network (FCN) to adaptively generate noise level maps, achieving robust blind denoising.

Some studies have applied image-based denoising models to radar micro-Doppler spectrograms. Tang et al. [28,29] propose a feature mapping network (FMNet) to clean noisy micro-Doppler spectrograms and utilize it for human skeletal motion reconstruction using WiFi-based micro-Doppler spectrograms. In [30], they improve denoising performance by replacing additive white Gaussian noise (AWGN) with realistic noise generated by a generative adversarial network (GAN). The study in [31] develops a GAN-based model to address the universal blind denoising problem for radar micro-Doppler spectrograms.

2.2. 1D Convolutional Neural Networks

Deep 2D CNNs, with their numerous hidden layers and millions of parameters, have the capacity to learn complex objects and patterns, provided they are trained on large visual databases with ground-truth labels. With proper training, this unique ability makes them the primary tool for various engineering applications involving 2D images and video frames. Many existing studies focus on 2D radar image detection and identification using 2D CNNs, based on SAR [1,32,33], ISAR [34], and time–frequency domain spectrograms [35,36].

Nevertheless, this may not be a viable option in numerous applications using 1D radar signals. To address this issue, 1D CNNs have recently been proposed and immediately achieved state-of-the-art performance levels in several applications. Furthermore, another major advantage of 1D CNNs is that a real-time and low-cost hardware implementation is feasible due to the simple and compact configuration of 1D CNNs that perform only 1D convolutions (scalar multiplications and additions) [37,38]. For example, an end-to-end 1D CNN model is employed in [39] for human activity classification, achieving not only high classification accuracy but also low computational complexity. The study in [40] introduces a stacked CNN Bidirectional Recurrent Neural Network (CNN-Bi-RNN) model, where radar High Range Resolution Profile (HRRP) targets are recognized as time sequences within the proposed signal-based model. Additionally, the work in [41] utilizes the real and imaginary components of raw radar signals as inputs to a deep neural network for signal denoising and recognition.

2.3. Cascaded Network

Ding et al. [42] propose a novel approach that integrates image denoising with high-level vision tasks using deep learning methods. This technique combines two image processing subtasks by utilizing a joint loss function to update the denoising network through backpropagation, ultimately enhancing the performance of high-level vision tasks. Similarly, the study in [43] applies denoising techniques to improve classification accuracy in radar-based human activity recognition.

Inspired by this approach, we adopt a similar strategy in our case, employing a 1D cascaded network that jointly addresses denoising and classification tasks for low-resolution radar targets. This approach aims to utilize the denoising technique to improve classification accuracy, thereby enhancing the overall performance of the system.

3. Methods

In this section, the proposed 1D denoising and classification cascaded framework is presented with a comprehensive description below. The overall architecture is visually depicted in Figure 1.

Figure 1. Overall architecture of the proposed framework.

As illustrated in Figure 1, the entire framework consists of two cascaded subnetworks for denoising and classification. The raw radar signal first undergoes Fourier transform and clutter removal as preprocessing steps. The resulting 1D micro-Doppler spectra serve as the input for subsequent processing. With labels generated by applying the 1D-BM3D algorithm to the measured micro-Doppler spectra, the 1D-DnCNN denoising model is trained using measured training pairs, effectively enhancing the micro-Doppler signatures within the Doppler spectra. The whole process is implemented using a cascaded approach, where the denoising and classification models are connected with a joint loss function. This collaborative training strategy facilitates synchronized optimization of the denoising and classification models, creating a unified framework that can classify radar targets accurately.

3.1. 1D Micro-Doppler Spectrum Input

In a pulse-Doppler radar system, a sequence of N coherent pulses is transmitted with a specified Pulse Repetition Interval (PRI). The received echo is processed through quadrature demodulation, resulting in two-channel signals: in-phase (I) and quadrature-phase (Q). One CPI is commonly employed as the processing unit for target detection and tracking in a pulsed-Doppler radar system, serving as a standard design choice. For a detected moving target, the received one-CPI baseband signal can be expressed as

$$x(n) = I_{n} + j Q_{n} \tag{1}$$

where N is the number of accumulated pulses; $n = 0, 1, \dots, N - 1$ is the pulse index; and $I_{n}$ and $Q_{n}$ are the I and Q samples of the n-th received pulse, respectively.

The spectrum $X_{f}$ can be obtained from the Discrete Fourier Transform of $x(n)$, primarily reflecting the Doppler speed of the moving target:

$$X_{f} = \sum_{n = 0}^{N - 1} x(n) e^{-j 2 \pi f n / N} \tag{2}$$

where $f = 0, 1, \dots, N - 1$ is the frequency index.
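As a minimal sketch of Eqs. (1) and (2), the following NumPy snippet forms the complex baseband signal from I and Q samples and computes its DFT. The function name `doppler_spectrum`, the choice to return the magnitude, and the synthetic single-tone test signal are assumptions made for illustration; the source does not state how the spectrum is scaled before being fed to the network.

```python
import numpy as np

def doppler_spectrum(i_samples, q_samples):
    """Return the magnitude Doppler spectrum |X_f| of one CPI."""
    # Eq. (1): x(n) = I_n + j Q_n
    x = np.asarray(i_samples, dtype=float) + 1j * np.asarray(q_samples, dtype=float)
    # Eq. (2): X_f = sum_n x(n) exp(-j 2 pi f n / N); np.fft.fft uses this sign convention
    X = np.fft.fft(x)
    return np.abs(X)

# Hypothetical test signal: one CPI containing a single Doppler tone at bin 40.
N = 256
n = np.arange(N)
tone = np.exp(2j * np.pi * 40 * n / N)
spectrum = doppler_spectrum(tone.real, tone.imag)
peak_bin = int(np.argmax(spectrum))
```

For display purposes, `np.fft.fftshift` could be applied so that zero Doppler sits in the middle of the vector; whether the authors do so is not stated.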

The spectrogram $\mathrm{STFT}_{f,t}$ captures time-varying frequency modulation information based on the Short-Time Fourier Transform (STFT) [3], which is calculated by windowing the signal $x(n)$ with a sliding window and then computing the Discrete Fourier Transform of the windowed signal:

$$\mathrm{STFT}_{f,t} = \sum_{l = 0}^{L - 1} w(l)\, x(l + t N_{\mathrm{step}})\, e^{-j 2 \pi f l / L} \tag{3}$$

where $w(l)$ is the window function of length L, $l = 0, 1, \dots, L - 1$; $N_{\mathrm{step}}$ represents the step size; the entire signal $x(n)$ of length N is divided into T segments, $T = (N - L) / N_{\mathrm{step}} + 1$; $f = 0, 1, \dots, L - 1$ is the frequency index; and $t = 0, 1, \dots, T - 1$ is the time index. STFT is a widely used time–frequency domain method for micro-Doppler analysis in radar signals. Mainstream radar target classification methods leverage time–frequency domain information, typically using the spectrogram $\mathrm{STFT}_{f,t}$ as the 2D input image.
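Eq. (3) can be written out directly as a loop over segments. The sketch below is only illustrative: the Hann window, the window length L = 64, the step size of 16, and the test tone are assumed values, since the source does not specify them here.

```python
import numpy as np

def stft(x, L, n_step, window=None):
    """Short-Time Fourier Transform following Eq. (3); returns an (L, T) complex array."""
    x = np.asarray(x)
    N = len(x)
    if window is None:
        window = np.hanning(L)  # assumed window; the source does not name one
    T = (N - L) // n_step + 1   # number of segments
    out = np.empty((L, T), dtype=complex)
    for t in range(T):
        segment = x[t * n_step : t * n_step + L]
        out[:, t] = np.fft.fft(window * segment)  # DFT of the windowed segment
    return out

# Hypothetical input: a tone at 1/8 cycles per sample, which falls on bin 8 when L = 64.
N = 256
x = np.exp(2j * np.pi * 0.125 * np.arange(N))
S = stft(x, L=64, n_step=16)
```

With N = 256, L = 64, and a step of 16, the formula for T gives (256 - 64) / 16 + 1 = 13 time frames.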

In this paper, we propose utilizing the spectrum $X_{f}$ as a 1D input vector in the frequency domain, in contrast to the spectrogram $\mathrm{STFT}_{f,t}$ in the time–frequency domain. We examine the micro-Doppler differences in spectrogram and spectrum for various radar targets, using human walking and running as a comparative example. Figure 2 illustrates the micro-Doppler distributions of radar returns for walking and running individuals, presenting both the spectrogram $\mathrm{STFT}_{f,t}$ for a prolonged observation time and the Doppler spectrum $X_{f}$ for a one-CPI time. The left images in Figure 2a,b depict spectrograms derived from 30 frames of radar returns for human running and walking. These two images last approximately 2.5 s, revealing a stable and periodic micro-Doppler phenomenon with sinusoidal-like changes in velocity during a gait. It is evident that walking and running motions exhibit distinct micro-Doppler signatures owing to their different gait patterns. Walking typically manifests a slower pattern and weaker micro-Doppler component, in contrast to the faster pattern and stronger micro-Doppler component associated with running.

Figure 2. Micro-Doppler differences of human motion in the spectrogram $\mathrm{STFT}_{f,t}$ and spectrum $X_{f}$. (a) Human running. (b) Human walking.

However, obtaining more pronounced micro-Doppler differences with a long observation time, as shown in the left pictures, may not be practical for the pulsed radar system used in this study. It is impossible to achieve continuous accumulation for a moving target due to discontinuous observation and a short dwell time, especially in the Track-And-Scan (TAS) mode. Therefore, achieving one-CPI recognition becomes more meaningful in engineering applications. Moreover, despite the limitations in observation time, the Doppler spectrum can provide valuable information about the main Doppler shift and micro-Doppler sidebands. It effectively highlights unique micro-Doppler signatures associated with different micro-motions, while also having the capability to distinguish various radar targets.

Clutter in radar returns often produces unwanted peaks at the zero-Doppler frequency. We employ the CLEAN technique [44] to remove it. As illustrated in Figure 3, the CLEAN algorithm removes clutter from the raw Doppler spectrum while preserving the micro-Doppler components without distorting them. The resulting spectrum after clutter removal is referred to as the micro-Doppler spectrum and serves as the input for the proposed framework.

Figure 3. An example of Doppler spectrum through clutter removal with one CPI for human running. (a) Raw Doppler spectrum. (b) Doppler spectrum after clutter removal.

3.2. 1D-BM3D-DnCNN Model for Denoising

In this subsection, we present an improved 1D-DnCNN model, combined with the 1D-BM3D algorithm (referred to as 1D-BM3D-DnCNN), as an effective solution for denoising the 1D micro-Doppler spectrum.

3.2.1. 1D-DnCNN Denoising Subnetwork

The utilization of residual learning in DnCNN allows the network to efficiently estimate noise from the noisy input. Additionally, the integration of batch normalization further enhances the training speed and contributes to the overall denoising efficacy. The DnCNN model adopts a 1D structure, as illustrated in Figure 4.

Figure 4. The structure details of 1D-DnCNN.

The denoising model training involves inputting pairs of noisy and clean spectra, both represented as 1D series rather than 2D images. The convolution operations in the DnCNN model are adjusted for 1D input. The shape of the input samples for both noisy and clean spectra is $B \times H \times 1$, where B represents the batch size, and H is the length of the radar spectrum. It is stated in [25] that increasing the depth expands the receptive field of the network, enabling it to utilize contextual information from a larger image region. To balance performance and computational efficiency, a minimal depth of 17 was set in [25], demonstrating that DnCNN can effectively compete with leading denoising methods. We consider it valuable to verify whether 1D-DnCNN with a similarly reduced depth, compared to 2D-CNN, can still achieve good denoising performance. Therefore, we applied 1D convolution to keep the dimension identical between the 1D input and output data of $H \times 1$ and set the depth of the Conv+BN+ReLU module to 15 in our work, as shown in Figure 4.

Once fully trained, the denoising network, denoted as F, establishes a mapping between noisy and clean data. The denoising result is expressed as $\hat{X}_{f} = F(X_{f})$, where $\hat{X}_{f}$ represents the denoised spectrum. The output of the denoising subnetwork, maintaining the size of $B \times H \times 1$, is then fed into the subsequent classification subnetwork, which is outlined in the following sections.

3.2.2. 1D-BM3D Algorithm

For the simulated dataset, training pairs can be easily generated using simulated data. However, obtaining clean data to serve as ground truth for the measured dataset is challenging. Some studies utilize simulated samples as labels for the measured samples. In our work, we propose applying the 1D-BM3D algorithm to pre-denoise the measured data, using the denoised samples as labels for the measured dataset.

BM3D is widely recognized as one of the most effective denoising algorithms for images. In this work, we adapt the algorithm for one-dimensional radar signals. The description of 1D-BM3D Algorithm Steps is illustrated in Figure 5.

Figure 5. Description of the 1D-BM3D Algorithm Steps.

The modified process begins with block division, where the radar signal is segmented into overlapping blocks of fixed length to ensure sufficient redundancy for effective noise suppression. The 1D micro-Doppler spectrum $X_{f}$ is divided into multiple overlapping 1D signal blocks. Given that the 1D spectrum $X_{f}$ has a length of $N = 512$, we set the length of each signal block to $K = 32$ with a stride of 1. The total number of signal blocks is then calculated as $(N - K) / \mathrm{stride} + 1 = 481$. The i-th block can be represented as

$$X_{f,i} = \{ X_{f}(k) \mid k \in [k_{i}, k_{i} + K - 1] \} \tag{4}$$

where $k_{i}$ is the starting position of the i-th block.
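The block division step can be sketched with NumPy's sliding-window view. The function name `divide_into_blocks` and the ramp used as a stand-in spectrum are illustrative assumptions; the block length of 32 and stride of 1 come from the text.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def divide_into_blocks(spectrum, block_len=32, stride=1):
    """Split a 1D spectrum into overlapping blocks as in Eq. (4); row i starts at k_i = i * stride."""
    windows = sliding_window_view(np.asarray(spectrum), block_len)
    return windows[::stride]

# Stand-in spectrum of length N = 512 (a ramp, so each block's contents are easy to check).
spectrum = np.arange(512, dtype=float)
blocks = divide_into_blocks(spectrum)
```

The resulting array has (512 - 32) / 1 + 1 = 481 rows, matching the block count stated above.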

In the block matching step, similar blocks are identified within a predefined search window. For each reference block $X_{f,i}$, the similarity is evaluated using the Euclidean distance between blocks. The set of matched similar blocks is

$$S_{i} = \{ X_{f,j} \mid d(X_{f,i}, X_{f,j}) < \tau \} \tag{5}$$

where $d(X_{f,i}, X_{f,j})$ is the distance between blocks $X_{f,i}$ and $X_{f,j}$, and $\tau$ is the similarity threshold. The reference block $X_{f,i}$ and its matched blocks $S_{i}$ are stacked into a two-dimensional array $Y_{i}$:

$$Y_{i} = \begin{bmatrix} X_{f,i} \\ X_{f,j_{1}} \\ X_{f,j_{2}} \\ \vdots \\ X_{f,j_{m}} \end{bmatrix} \tag{6}$$

where $X_{f,j_{1}}, X_{f,j_{2}}, \dots, X_{f,j_{m}}$ are the similar blocks in $S_{i}$.
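A minimal sketch of Eqs. (5) and (6) follows. The function name `match_blocks`, the optional `search_radius` parameter standing in for the predefined search window, the ordering of matches by distance, and the toy blocks are all assumptions for illustration; the source gives no values for the threshold or the window size.

```python
import numpy as np

def match_blocks(blocks, ref_index, tau, search_radius=None):
    """Find blocks within Euclidean distance tau of the reference block and stack them.
    Returns (indices, group) with the reference block in the first row, as in Eq. (6)."""
    ref = blocks[ref_index]
    lo, hi = 0, len(blocks)
    if search_radius is not None:  # hypothetical search window around the reference block
        lo = max(0, ref_index - search_radius)
        hi = min(len(blocks), ref_index + search_radius + 1)
    dist = np.linalg.norm(blocks[lo:hi] - ref, axis=1)   # d(X_{f,i}, X_{f,j})
    matched = lo + np.flatnonzero(dist < tau)            # Eq. (5)
    matched = matched[matched != ref_index]              # the reference goes first, not twice
    order = np.argsort(dist[matched - lo])               # closest matches first (assumed order)
    indices = np.concatenate(([ref_index], matched[order]))
    return indices, blocks[indices]

# Toy blocks: rows 2 and 3 are close to row 0, rows 1 and 4 are not.
toy_blocks = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.1, 0.0, 0.0, 0.0],
    [0.0, 0.2, 0.0, 0.0],
    [5.0, 5.0, 5.0, 5.0],
])
indices, group = match_blocks(toy_blocks, ref_index=0, tau=0.5)
```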

During collaborative filtering, the grouped blocks are transformed into a sparsity-promoting domain with the Discrete Cosine Transform (DCT), and noise reduction is performed using a thresholding operation. The two-dimensional array $Y_{i}$ undergoes DCT to obtain its transform domain representation:

$$\hat{Y}_{i} = T(Y_{i}) \tag{7}$$

where $T(\cdot)$ is the transform operation. The transformed two-dimensional array $\hat{Y}_{i}$ is represented as

$$\hat{Y}_{i} = \begin{bmatrix} \hat{X}_{f,i} \\ \hat{X}_{f,j_{1}} \\ \hat{X}_{f,j_{2}} \\ \vdots \\ \hat{X}_{f,j_{m}} \end{bmatrix} \tag{8}$$

where $\hat{X}_{f,i} = T(X_{f,i})$ is the transformed reference block, and $\hat{X}_{f,j_{1}}, \hat{X}_{f,j_{2}}, \dots, \hat{X}_{f,j_{m}}$ are the transformed similar blocks. In the transform domain, a hard thresholding operation is applied to $\hat{Y}_{i}$ to remove noise components. The filtered result is

$$\hat{Y}_{i}^{\mathrm{filtered}} = F(\hat{Y}_{i}) \tag{9}$$

where $F(\cdot)$ is the hard thresholding filter. The filtering formula is given by

$$\hat{Y}_{i}^{\mathrm{filtered}}(m) = \begin{cases} \hat{Y}_{i}(m) & \text{if } |\hat{Y}_{i}(m)| > \Theta \\ 0 & \text{otherwise} \end{cases} \tag{10}$$

where $\Theta$ is the threshold parameter.
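The transform, threshold, and inverse-transform chain can be sketched with an explicit orthonormal DCT-II matrix, which keeps the example free of extra dependencies. Applying a separable 2D DCT over the stacked group is an assumption about what $T(\cdot)$ denotes, and the threshold value, noise level, and constant toy group are illustrative choices, not values from the source.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C: C @ v is the DCT of v, and C.T @ C is the identity."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

def collaborative_filter(group, theta):
    """Transform the stacked group, hard-threshold the coefficients, and transform back."""
    rows, cols = group.shape
    Cr, Cc = dct_matrix(rows), dct_matrix(cols)
    coeffs = Cr @ group @ Cc.T                                 # Eq. (7), assumed separable 2D DCT
    filtered = np.where(np.abs(coeffs) > theta, coeffs, 0.0)   # Eq. (10)
    denoised = Cr.T @ filtered @ Cc                            # inverse transform of the filtered array
    return denoised, int(np.count_nonzero(filtered))

# Toy group: four identical flat blocks plus small noise (assumed values).
rng = np.random.default_rng(0)
clean = np.ones((4, 8))
noisy = clean + 0.05 * rng.normal(size=clean.shape)
denoised, kept = collaborative_filter(noisy, theta=0.5)
```

Because the group is nearly constant, only the DC coefficient exceeds the threshold, and the reconstruction is the group mean. Real spectra are less sparse, so the threshold would have to be tuned to the noise level.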

Finally, in the aggregation step, the denoised blocks are combined to reconstruct the signal. The inverse transform is applied to the filtered two-dimensional array to obtain the denoised signal blocks:

$$Y_{i}^{\mathrm{denoised}} = T^{-1}(\hat{Y}_{i}^{\mathrm{filtered}}) \tag{11}$$

The denoised signal blocks $Y_{i}^{\mathrm{denoised}}$ are then aggregated to reconstruct the original one-dimensional signal. Since the blocks are overlapping, the final signal is obtained by weighted averaging:

$$\tilde{X}_{f}(k) = \frac{\sum_{i} w_{i}(k) \cdot \tilde{X}_{f,i}(k)}{\sum_{i} w_{i}(k)} \tag{12}$$

where $w_{i}(k)$ is the weight function and $\tilde{X}_{f,i}(k)$ is the denoised estimate of the i-th block at position k. The label $\tilde{X}_{f}$ is generated by applying the 1D-BM3D algorithm to denoise the measured noisy spectrum $X_{f}$, thereby forming the training pair.
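The weighted averaging of overlapping blocks can be sketched as two accumulators, one for the weighted sum and one for the weights. Using a single scalar weight per block (uniform by default) is an assumption; the source only says that $w_{i}(k)$ is a weight function. The sine test signal is also illustrative.

```python
import numpy as np

def aggregate_blocks(denoised_blocks, starts, length, weights=None):
    """Weighted average of overlapping block estimates, position by position."""
    K = denoised_blocks.shape[1]
    if weights is None:
        weights = np.ones(len(denoised_blocks))  # assumed uniform weights
    numerator = np.zeros(length)
    denominator = np.zeros(length)
    for block, k_i, w_i in zip(denoised_blocks, starts, weights):
        numerator[k_i:k_i + K] += w_i * block    # sum_i w_i(k) * block estimate
        denominator[k_i:k_i + K] += w_i          # sum_i w_i(k)
    return numerator / np.maximum(denominator, 1e-12)

# Sanity check: aggregating unmodified blocks must return the original signal.
signal = np.sin(np.linspace(0.0, 4.0 * np.pi, 512))
K = 32
starts = np.arange(len(signal) - K + 1)
blocks = np.stack([signal[k:k + K] for k in starts])
reconstructed = aggregate_blocks(blocks, starts, len(signal))
```

In a full implementation each block would receive one estimate per group it belongs to; the same accumulators handle that case by adding every estimate at its starting position.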

3.3. Cascaded Strategy with the Joint Loss Function

3.3.1. Cascaded Training Strategy

In our case, the ultimate objective is to accomplish the classification task based on the low-resolution radar. The denoising process proposed in Section 3.2 is implemented to diminish the impact of noise, aiming for improved classification performance. As demonstrated in previous studies [40,42], an increase in SNR correlates with higher recognition accuracy. To perform classification, a classification model is cascaded after the denoising stage. We use AlexNet as the foundational structure for the 1D classification subnetwork, as described in Figure 6. To accommodate the 1D input, adjustments similar to those in the denoising subnetwork are made to the convolution and pooling operations. The modified structure ensures compatibility with the characteristics of 1D input data, allowing for an effective integration of the denoised spectrum into the classification model.

Figure 6. The structure details of 1D-AlexNet.

The joint training strategy of our proposed cascaded framework is illustrated in detail in Figure 7. This framework integrates the denoising subnetwork (1D-BM3D-DnCNN) and the classification subnetwork (1D-AlexNet) into a unified model for joint training. The input data consist of three components: the 1D measured noisy spectrum $X_{f}$, the denoising label $\tilde{X}_{f}$, and the classification label y. The pair $\{X_{f}, \tilde{X}_{f}\}$ constitutes the training set for the denoising subnetwork, where the denoised output $\hat{X}_{f}$ serves as the input to the classification subnetwork. Consequently, the pair $\{\hat{X}_{f}, y\}$ forms the training set for the classification subnetwork. The entire cascaded model is optimized through a joint loss function L, where $L_{d}$ measures the loss of the denoising subnetwork between the denoised output and clean reference $\{\hat{X}_{f}, \tilde{X}_{f}\}$, while $L_{c}$ computes the loss of the classification subnetwork between the predicted class label and the ground truth label $\{\hat{y}, y\}$.

Figure 7. The joint training strategy of the proposed cascaded framework.

3.3.2. Joint Loss Function

Our proposed cascaded framework is designed to enhance overall performance by leveraging the denoising process to better represent the input radar signals, thereby optimizing the effectiveness of the classification subnetwork. There are two loss components adopted in our proposed cascaded framework.

The reconstruction loss of the denoising subnetwork is the mean squared error (MSE) between the clean spectrum and the denoised spectrum, which can be represented as

$$L_{d} = \frac{1}{H} \sum_{i = 0}^{H - 1} \left( (\hat{X}_{f})_{i} - (\tilde{X}_{f})_{i} \right)^{2} \tag{13}$$

where $\tilde{X}_{f}$ is the ground truth spectrum; $\hat{X}_{f}$ is the denoised spectrum, obtained from the output of the denoising subnetwork F with the measured noisy spectrum $X_{f}$ as the input; $(\tilde{X}_{f})_{i}$ indicates the i-th element of $\tilde{X}_{f}$; and $\tilde{X}_{f}$ and $\hat{X}_{f}$ are of size $H \times 1$.

The loss of the classification subnetwork is the cross-entropy loss between the predicted label and the true label. For a multi-class problem, each class is labeled with an index corresponding to the radar target it represents, and the index is represented as a one-hot vector y with M dimensions (as there are M classes in total). For instance, the first class can be represented as $y = [1, 0, 0, \dots, 0]^{T}$. Let the denoised spectrum $\hat{X}_{f}$ be fed into the classification subnetwork, which is denoted as $\Phi$. The output predicted label is then an M-dimensional vector $\hat{y}$, where $\hat{y} = \Phi(\hat{X}_{f})$. The cross-entropy loss between y and $\hat{y}$ can be calculated as

$$L_{c} = - \sum_{i = 0}^{M - 1} y_{i} \log(\hat{y}_{i}) \tag{14}$$

where y is the ground truth label, $\hat{y}_{i}$ represents the probability of the classified target belonging to the i-th class, and $\sum_{i = 0}^{M - 1} \hat{y}_{i} = 1$.

The joint loss is defined as the weighted sum of the reconstruction loss and the classification loss, which can be represented as

$$L = L_{d} + \lambda L_{c} \tag{15}$$

where $\lambda$ is the weight for balancing the losses $L_{d}$ and $L_{c}$.
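As a worked numerical sketch of the joint loss, the snippet below computes the MSE reconstruction term, a cross-entropy term in its standard form $-\sum_i y_i \log \hat{y}_i$, and their weighted sum. The function names, the value $\lambda = 0.5$, the small `eps` guard inside the logarithm, and the toy vectors are assumptions for illustration; the source does not give the value of $\lambda$ in this section.

```python
import numpy as np

def reconstruction_loss(x_hat, x_tilde):
    """L_d: mean squared error between the denoised spectrum and its label."""
    x_hat, x_tilde = np.asarray(x_hat, float), np.asarray(x_tilde, float)
    return float(np.mean((x_hat - x_tilde) ** 2))

def classification_loss(y_true, y_pred, eps=1e-12):
    """L_c: cross-entropy between the one-hot label and the predicted probabilities."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(-np.sum(y_true * np.log(y_pred + eps)))  # eps avoids log(0)

def joint_loss(x_hat, x_tilde, y_true, y_pred, lam=1.0):
    """L = L_d + lambda * L_c."""
    return reconstruction_loss(x_hat, x_tilde) + lam * classification_loss(y_true, y_pred)

# Toy values: one of four spectrum samples is off by 2, so L_d = 4 / 4 = 1.
x_tilde = np.array([1.0, 2.0, 3.0, 4.0])
x_hat = np.array([1.0, 2.0, 3.0, 2.0])
# Six classes; the true class (index 1) is predicted with probability 0.5, so L_c = ln 2.
y_true = np.array([0, 1, 0, 0, 0, 0])
y_pred = np.array([0.1, 0.5, 0.1, 0.1, 0.1, 0.1])
total = joint_loss(x_hat, x_tilde, y_true, y_pred, lam=0.5)
```

In joint training, the gradient of the classification term flows back through the classifier into the denoiser, which is what lets the classification objective shape the denoised output.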

This joint optimization strategy creates a synergistic relationship between the denoising and classification tasks. The denoising subnetwork is trained to generate accurate outputs while preserving discriminative features essential for classification. Simultaneously, the classification subnetwork guides the denoiser to produce representations that enhance classification performance.

4. Experimental Setting

This section describes the experimental settings, including evaluation metrics, data collection, parameter settings, and the noise model.

4.1. Dataset Description

To evaluate the proposed method, six kinds of targets are considered: wheeled vehicle, tracked vehicle, person walking, person running, UAV, and ship. Wheeled and tracked vehicles are two typical targets for the vehicle classification task based on micro-Doppler signatures. In this case, a small van (Iveco Daily) and a tank are used as wheeled and tracked vehicle targets, respectively. The UAV target used in our experiments is a quadcopter drone (DJI Phantom 4), and the ship target is a common medium-sized speedboat.

In our experiment, two datasets are used: a measured dataset and a simulated dataset.

4.1.1. Measured Dataset from X-Band Pulsed-Doppler Radar System

The measured data for the typical targets were collected with a ground surveillance pulsed-Doppler radar. It is an X-band radar system with a transmit power of 15 W and a sampling frequency of 80 MHz. The coverage is 0–360° in azimuth and 0–30° in elevation. The radar antenna has an azimuth beam width of 3° and an elevation beam width of 4.2°. The system has a low range resolution of 15 m due to its narrow bandwidth of 10 MHz. Each frame contains N = 256 coherent pulses with a PRI of 340 μs, so one CPI lasts about 87 ms. The experimental scene with the radar system and target is shown in Figure 8.

Figure 8. Experimental scene with the radar system and target.
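The stated range resolution and CPI duration follow directly from the bandwidth, pulse count, and PRI. The short calculation below reproduces them; the PRF and the Doppler resolution (taken as 1/CPI) are derived quantities that are not quoted in the text and are included only as a consistency check.

```python
C = 299_792_458.0        # speed of light in m/s

bandwidth_hz = 10e6      # narrow bandwidth quoted in the text
num_pulses = 256         # coherent pulses per frame (N)
pri_s = 340e-6           # pulse repetition interval

range_resolution_m = C / (2 * bandwidth_hz)   # delta_R = c / (2B)
cpi_s = num_pulses * pri_s                    # duration of one coherent processing interval
prf_hz = 1.0 / pri_s                          # pulse repetition frequency (derived)
doppler_resolution_hz = 1.0 / cpi_s           # Doppler resolution of one frame (derived)

print(f"range resolution:   {range_resolution_m:.1f} m")
print(f"CPI duration:       {cpi_s * 1e3:.2f} ms")
print(f"PRF:                {prf_hz:.0f} Hz")
print(f"Doppler resolution: {doppler_resolution_hz:.2f} Hz")
```

The short CPI is what limits the Doppler detail available in a single frame, which motivates working on the 1D spectrum rather than a long spectrogram.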

Classifying ground-moving and near-ground flying objects is important for applications such as perimeter security and surveillance. Due to its low resolution and short dwell time, the surveillance radar used in our experiment is often only required to achieve a rough classification of typical targets such as humans, vehicles, and UAVs. In our work, we expanded the typical targets to six distinct classes in order to make the classification more challenging and to test our proposed framework. A total of 6400 frames of six types of targets, namely wheeled vehicle (1300), tracked vehicle (1000), person walking (1100), person running (1000), UAV (1400), and ship (600), were measured by the X-band surveillance radar system for the experiments. These data were collected in several kinds of scenes, with the radar placed on a road or hillside to detect and track the targets. The target detection ranges span 100 m to 5 km.

4.1.2. Simulated Dataset for Denoising Model

We evaluate the denoising performance under the additive white Gaussian noise (AWGN) model, a widely accepted assumption in denoising studies. The noise is drawn from a zero-mean normal distribution whose variance is set by the specified noise level (SNR).

4.2. Parameter Settings

To obtain training and test data, we used stratified sampling (referred to here as hierarchical sampling) when assessing the denoising performance and classification accuracy on both the simulated and measured datasets. Stratified sampling splits a dataset into training and testing subsets such that each subset maintains the same proportion of each class as the original dataset. This ensures that the class distribution in the training and testing sets mirrors that of the original dataset, which helps build more accurate and reliable models, especially for imbalanced datasets, and reduces the bias of purely random sampling, where some classes might be under- or overrepresented. In our work, we varied the test size from 0.2 to 0.8, so that 20% to 80% of the data were used for testing in turn. The “stratify” parameter was set to the label so that the split was stratified on the target class. The random state was set to 42 to ensure the reproducibility of the results. All models were trained for 200 epochs. To evaluate the proposed strategy fairly, the process was repeated 5 times with randomly changed training samples, and the average over these 5 repetitions was taken as the final result.
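The idea of a stratified split can be sketched in a few lines. The parameter names in the text (test size, “stratify”, random state) match those of scikit-learn's `train_test_split`, which we assume was used; the dependency-free function below is a hypothetical re-implementation for illustration only, applied to the class sizes from Section 4.1.1.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_size, seed=42):
    """Return (train_idx, test_idx) so that every class is split in the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)                        # random selection within the class
        n_test = round(len(idxs) * test_size)    # per-class test quota
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return train_idx, test_idx

# Class sizes of the measured dataset (6400 frames in total).
counts = {"wheeled": 1300, "tracked": 1000, "walking": 1100,
          "running": 1000, "uav": 1400, "ship": 600}
labels = [name for name, n in counts.items() for _ in range(n)]

train_idx, test_idx = stratified_split(labels, test_size=0.2, seed=42)
test_counts = Counter(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx))   # 5120 1280
print(test_counts["ship"])             # 120
```

With a test size of 0.2, every class contributes exactly 20% of its frames to the test set, so the smallest class (ship) is not accidentally underrepresented.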

All models in our work were trained and tested in Keras with the TensorFlow backend. The GPU used was an NVIDIA Tesla P100 graphics card with 16 GB of memory.

5. Results

This section assesses the framework’s ability to denoise measured radar signals and accurately classify the radar targets.

5.1. Simulated Radar Spectrum Denoising Training

For the simulated data, AWGN was added to the simulated clean signal to assess the performance of the denoising subnetwork. The denoising subnetwork takes the radar spectra as input and outputs the denoised spectra. We created a training set for the DnCNN model from the clean simulated spectra and the AWGN-corrupted spectra of the six classes. We evaluated the proposed denoising subnetwork under nine noise levels: SNR = −20, −15, −10, −5, 0, 5, 10, 15, and 20 dB. Note that we simulated noise in the complex radar signals by generating Gaussian noise separately for the real and imaginary components. The noisy complex signal was then created by adding these noise components to the original signal.
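The noise-generation step can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' simulation code: the clean signal is a hypothetical single complex tone, the function names are invented for the example, and the noise power is split equally between the real and imaginary parts.

```python
import cmath
import math
import random

def add_awgn(signal, snr_db, rng):
    """Add complex white Gaussian noise so that the expected SNR equals snr_db."""
    signal_power = sum(abs(s) ** 2 for s in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power / 2)   # half of the noise power per component
    return [s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for s in signal]

def measured_snr_db(clean, noisy):
    """Empirical SNR in dB of a noisy signal relative to its clean version."""
    p_signal = sum(abs(c) ** 2 for c in clean)
    p_noise = sum(abs(n - c) ** 2 for c, n in zip(clean, noisy))
    return 10 * math.log10(p_signal / p_noise)

rng = random.Random(0)
num_samples = 4096
# Hypothetical clean signal: one complex tone at normalized frequency 0.1.
clean = [cmath.exp(2j * math.pi * 0.1 * k) for k in range(num_samples)]

for target_snr_db in (-20, -10, 0, 10, 20):
    noisy = add_awgn(clean, target_snr_db, rng)
    print(target_snr_db, round(measured_snr_db(clean, noisy), 2))
```

The empirical SNR fluctuates slightly around the target because the noise realization is finite; the deviation shrinks as the number of samples grows.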

Taking the UAV target as an example, we generated a clean simulated spectrum and AWGN-corrupted spectra under the nine noise levels. In our previous work [45], we simulated the UAV rotor echo model and analyzed the micro-Doppler distribution characteristics of the UAV target, showing that the fuselage produces the main Doppler component and that the micro-motion components are distributed on both sides of it in the frequency spectrum. An example of the original, noisy, and denoised Doppler spectra for the UAV target is presented in Figure 9. The clean spectrum samples were formed by randomly changing the moving speed of the fuselage and the strength of the rotors, as shown in Figure 9a. The noisy spectra were created by adding noise to the clean simulated spectra, and examples at different noise levels are shown in Figure 9b. Figure 9c shows the spectra after processing by the 1D-DnCNN denoising model proposed in this paper. The noise is suppressed at every level, but the denoising quality depends on the SNR of the noisy input: the higher the SNR, the better the restoration after denoising.

Figure 9. An example of the original, noisy, and denoised Doppler spectra for simulated UAV data. (a) Original spectra. (b) Noisy spectra with added noise at SNR levels of −20, −15, −10, −5, 0, 5, 10, 15, and 20 dB. (c) Denoised spectra.

Here, the root mean squared error (RMSE) serves as a metric to quantify the difference between the denoised spectrum and the clean spectrum over the $H$ spectrum points, which can be expressed as

\mathrm{RMSE} = \sqrt{\frac{1}{H} \sum_{i = 0}^{H - 1} {({\hat{X}}_{f,i} - X_{f,i})}^{2}}

(16)

Additionally, the SNR provides a direct measure of how the spectrum is affected by the noise component:

\mathrm{SNR} = 10 \log_{10} \frac{\sum_{i = 0}^{H - 1} {(X_{f,i})}^{2}}{\sum_{i = 0}^{H - 1} {({\hat{X}}_{f,i} - X_{f,i})}^{2}}

(17)

where $X_{f}$ is the ground-truth clean spectrum, and $\hat{X}_{f}$ is the denoised spectrum. These two performance indicators help evaluate the effectiveness of the denoising subnetwork in mitigating the impact of noise on the signal.
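The two metrics can be computed directly from their definitions, as in the sketch below. The four-point spectra are hypothetical toy values chosen so that the denoised error is exactly half the noisy error; they are not taken from the experiments.

```python
import math

def rmse(x_hat, x):
    """Root mean squared error between an estimated and a clean spectrum."""
    h = len(x)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_hat, x)) / h)

def snr_db(x_hat, x):
    """SNR in dB: clean-spectrum energy over residual-error energy."""
    signal_energy = sum(b ** 2 for b in x)
    error_energy = sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return 10 * math.log10(signal_energy / error_energy)

# Hypothetical toy spectra.
x_clean = [0.0, 1.0, 0.0, 0.5]
x_noisy = [0.2, 0.8, -0.2, 0.7]      # every sample is off by 0.2
x_denoised = [0.1, 0.9, -0.1, 0.6]   # every sample is off by 0.1

for name, x_hat in (("noisy", x_noisy), ("denoised", x_denoised)):
    print(f"{name:9s} RMSE = {rmse(x_hat, x_clean):.3f}, "
          f"SNR = {snr_db(x_hat, x_clean):.2f} dB")
```

Halving every error sample halves the RMSE and raises the SNR by 20 log10(2), about 6.02 dB, which shows how the two indicators move together.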

Table 1 lists the SNR and RMSE values before and after denoising, and Figure 10 plots the two evaluation indicators before and after denoising. At every noise level, denoising raises the SNR and lowers the RMSE, which supports the effectiveness of the 1D-DnCNN denoising model. As the input SNR increases, the denoised SNR and RMSE generally continue to improve, but the margin of improvement shrinks: the SNR gain falls from about 12 dB at the −20 dB noise level to about 3.7 dB at the 20 dB noise level.

Table 1. The improvements in SNR and RMSE values before and after denoising.

Figure 10. Comparison of SNR and RMSE before and after denoising.

5.2. Measured Radar Spectrum Denoising Training

For the simulated dataset, we can create a training pair by using the simulated clean data and artificially added noise. However, capturing clean data to serve as ground truth for the measured dataset is quite challenging. In our work, we attempt two training approaches to address this issue:

The first approach utilizes the simulated samples as labels for the measured samples.
The second approach applies the 1D-BM3D algorithm to pre-denoise the measured data, treating the denoised samples as labels for the measured samples.

Figure 11 illustrates how the 1D-DnCNN denoiser is trained using these two approaches. The details of the two approaches and a comparison of their denoising performance are given below.

Figure 11. 1D-DnCNN denoiser using two training approaches.

For the first approach, the simulated data are constructed to reflect the micro-Doppler characteristics specific to the six target classes, aiming for correlation and consistency with the measured target data. This allows us to use the simulated data for 1D-DnCNN model training and ensure its effectiveness on the measured data.

The performance of the 1D-DnCNN denoising model is evaluated based on the classification accuracy achieved on the denoised data using the 1D-AlexNet model. For comparison, we also train autoencoder (AE) and U-Net models and assess their denoising effectiveness. Specifically, the denoised outputs from the 1D-DnCNN, 1D-AE, and 1D-U-Net models are individually fed into the 1D-AlexNet for classification. The split rate, which represents the ratio between the training and testing sets, is set to 0.2/0.8, 0.3/0.7, 0.4/0.6, 0.5/0.5, 0.6/0.4, 0.7/0.3, and 0.8/0.2 in turn. Figure 12 shows the classification performance under varying split rates after applying the three denoising models, demonstrating that the 1D-DnCNN model outperforms the other two models on our measured radar dataset.

Figure 12. Classification results under varying split rates after DnCNN, AE, and Unet denoising models.

In the second approach, the 1D-BM3D algorithm is applied to denoise the 1D measured radar spectrum samples, and the pre-denoised samples from the 1D-BM3D process serve as labels to guide the training of the 1D-DnCNN denoising model.

Using these two training approaches, we train the 1D-DnCNN model. Subsequently, the trained model is applied to the measured samples under both approaches, generating the corresponding denoised outputs, as shown in Figure 11.

An example of both denoising training approaches on the measured Doppler spectra of six classes is presented in Figure 13. Figure 13a displays the measured radar Doppler spectra for the six classes, while Figure 13b,c show the denoised spectra obtained using the two respective training approaches. The visual denoising results clearly show that the second approach performs better, as it not only removes noise but also preserves the micro-Doppler signatures of various targets more effectively. A comparison of the classification performance of these two approaches will be provided in the next subsection.

Figure 13. An example of both denoising training approaches on the measured Doppler spectra of six classes. (a) Measured Doppler spectra. (b) Denoised Doppler spectra using the first training approach. (c) Denoised Doppler spectra using the second training approach.

5.3. Performance of the Proposed Cascaded Framework

Classification is the ultimate objective of our work, and we utilized an AlexNet-based deep network in our framework to perform this task. For the cascaded classification framework, we trained our model on the measured radar dataset obtained from an X-band pulsed-Doppler radar system, as described in Section 4.1.1. The joint loss in our framework is composed of the reconstruction loss from the denoising subnetwork and the classification loss from the 1D-AlexNet subnetwork, as defined in Equation (15), where the weight parameter λ is empirically set to 2. The split rate is set to 0.8/0.2.

The loss and accuracy curves for the training and testing datasets are shown in Figure 14. The final classification accuracy exceeds 95% (95.84% in Table 2 at the 0.8/0.2 split), with only a minimal gap between the training and testing accuracy curves.

Figure 14. Loss and accuracy curves of the proposed cascaded model.

We also investigated how signal denoising can enhance the classification task on the measured dataset from the X-band pulsed-Doppler radar system. The noisy radar spectra of the six target classes were denoised and then fed into the 1D-AlexNet network for classification. To evaluate how different denoising schemes contribute to the classification performance, we experimented with the following cases:

The measured radar spectra were directly fed into the 1D-AlexNet classification network, termed 1D-AlexNet. This scheme served as the baseline.
The measured radar spectra were first denoised using the 1D-BM3D algorithm and then fed into the 1D-AlexNet classification network, termed 1D-BM3D + AlexNet.
The measured radar spectra were denoised using the 1D-DnCNN denoising network trained separately on the simulated dataset, and then fed into the 1D-AlexNet classification network, termed 1D-DnCNN (Separate) + AlexNet.
The measured radar spectra were denoised using the 1D-DnCNN denoising network trained separately on the measured dataset, with the ground truth generated by the 1D-BM3D algorithm, and then fed into the 1D-AlexNet classification network, termed 1D-DnCNN with BM3D (Separate) + AlexNet.
Our proposed scheme: The measured radar spectra were processed through the cascaded framework consisting of the 1D-DnCNN denoising subnetwork based on 1D-BM3D and the 1D-AlexNet classification subnetwork, trained using the joint loss, termed Joint Training (Proposed).

Table 2 presents the classification performance for the five cases described above under varying split rates: 0.2/0.8, 0.3/0.7, 0.4/0.6, 0.5/0.5, 0.6/0.4, 0.7/0.3, and 0.8/0.2. Figure 15 shows the accuracy trend as the training set proportion increases from 0.2 to 0.8. Accuracy generally improves as the training set proportion increases, with only minor fluctuations for some schemes. The baseline 1D-AlexNet scheme yields the lowest accuracy of the five cases at every split rate (79.22–87.08%), highlighting the importance of denoising as a preprocessing step for the classification task on measured radar data. Applying a denoising preprocessing method such as 1D-BM3D + AlexNet, or a separately trained denoiser such as 1D-DnCNN (Separate) + AlexNet, improves the accuracy over the baseline. As shown in Table 2 and Figure 15, our proposed joint training approach achieves the highest accuracy at every split rate, reaching 95.84% at the 0.8/0.2 split, which demonstrates the effectiveness of the cascaded denoising and classification framework.

Table 2. Classification results (%) comparison for five cases.

Figure 15. Comparison of classification performance for the five cases.

6. Discussion

We propose a 1D-based model that demonstrates excellent classification performance when validated on both simulated and measured radar data, outperforming separately trained denoising and classification models. In contrast to most current studies that employ 2D time–frequency spectrograms as input to deep network models, our approach utilizes the 1D micro-Doppler spectrum as input, significantly reducing computational complexity while maintaining competitive accuracy. Table 3 presents a comparison between 1D-based and 2D-based models in terms of parameters, training time, and testing time.

Table 3. Comparison between 1D-based and 2D-based models.

In Table 3, the “1D-DnCNN-AlexNet” model represents our proposed framework, which cascades 1D-DnCNN and 1D-AlexNet and takes the 1D micro-Doppler spectrum as input. For comparison, we also evaluate the “2D-DnCNN-AlexNet” model, which cascades 2D-DnCNN and 2D-AlexNet and processes 2D time–frequency spectrograms as input. The results show that, compared to the 2D-based network, the 1D-DnCNN-AlexNet model has about 6.9 times fewer parameters (32.7 M versus 226.4 M), trains about 6.8 times faster per epoch (14.12 s versus 95.36 s), and tests about 5 times faster (0.56 ms versus 2.81 ms), making it more efficient for real-time applications.

While this paper demonstrates promising results, several limitations should be noted. Our work conducts classification experiments exclusively using real measured data from a narrow-band pulse radar, and the generalization capability of the proposed model across other datasets has yet to be validated. Furthermore, due to the high cost of radar data acquisition, the number of real measured samples used in this study is relatively limited. The scarcity of labeled data also imposes constraints on the training performance of deep learning models. Future work will focus on evaluating the adaptability of the model to various radar systems and environmental conditions to enhance its robustness across a wider range of applications. Additionally, we aim to explore the fine classification of similar targets, such as various types of vehicles or UAVs, which presents a greater challenge given the constraints of low-resolution radar.

7. Conclusions

In this paper, we propose a 1D denoising and classification cascaded network for low-resolution radar target classification, utilizing the 1D micro-Doppler spectrum instead of the 2D micro-Doppler spectrogram commonly used in existing studies. By implementing the joint loss optimization on the whole framework, the denoising network improves the quality of the measured radar micro-Doppler spectrum and enables the classifier to perform more accurately. The experimental results demonstrate that the proposed cascaded framework delivers excellent performance on our measured data. Our proposed approach is designed with practical engineering applications in mind, delivering superior multi-target classification performance for pulsed-Doppler radar systems by enabling rapid decision-making within a single frame while maintaining low computational complexity by utilizing 1D radar signals.

Author Contributions

Conceptualization, B.M.; methodology, B.M.; validation, B.M.; investigation, B.M.; resources, B.M.; writing—original draft preparation, B.M.; writing—review and editing, B.M.; supervision, B.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China under Grant No. 62271367.

Data Availability Statement

The data presented in this study are available from the corresponding author upon request. The data are not publicly available due to privacy restrictions.

Acknowledgments

The authors would like to thank Fei Zhao, Xincheng Liu, and Liang Fang for the experimental data collection.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

CNN	convolutional neural network
SAR	synthetic aperture radar
CPI	coherent processing interval
1D	one-dimensional
2D	two-dimensional
SNR	signal-to-noise ratio
NCSR	nonlocally centralized sparse representation
GSR	group-based sparse representation
BM3D	block matching and 3D filtering
DnCNN	denoising convolutional neural network
FFDNet	fast and flexible denoising network
CBDNet	convolutional blind denoising network
BN	batch normalization
FCN	fully convolutional network
FMNet	feature mapping network
AWGN	additive white Gaussian noise
GAN	generative adversarial network
CNN-Bi-RNN	CNN bidirectional recurrent neural network
HRRP	high range resolution profile
PRI	pulse repetition interval
STFT	short-time Fourier transform
TAS	track-and-scan
DCT	discrete cosine transform
MSE	mean squared error
RMSE	root mean squared error
AE	autoencoder

References

1. Zhang, T.; Zhang, X.; Ke, X.; Liu, C.; Xu, X.; Zhan, X.; Chen, W.; Ahmad, I.; Zhou, Y.; Pan, D.; et al. HOG-ShipCLSNet: A novel deep learning network with HOG feature fusion for SAR ship classification. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5210322.
2. Liu, X.; Wu, Y.; Liang, W.; Cao, Y.; Li, M. High resolution SAR image classification using global-local network structure based on vision transformer and CNN. IEEE Geosci. Remote Sens. Lett. 2022, 19, 4505405.
3. Chen, V.C.; Li, F.; Ho, S.-S.; Wechsler, H. Micro-Doppler effect in radar: Phenomenon, model, and simulation study. IEEE Trans. Aerosp. Electron. Syst. 2006, 42, 2–21.
4. Clemente, C.; Balleri, A.; Woodbridge, K.; Soraghan, J.J. Developments in target micro-Doppler signatures analysis: Radar imaging, ultrasound and through-the-wall radar. EURASIP J. Adv. Signal Process. 2013, 2013, 47.
5. Xu, X.; Feng, C.; Han, L. Classification of radar targets with micro-motion based on RCS sequences encoding and convolutional neural network. Remote Sens. 2022, 14, 5863.
6. Qin, X.; Deng, B.; Wang, H. Micro-Doppler feature extraction of rotating structures of aircraft targets with terahertz radar. Remote Sens. 2022, 14, 3856.
7. Hanif, A.; Muaz, M.; Hasan, A.; Adeel, M. Micro-Doppler based target recognition with radars: A review. IEEE Sens. J. 2022, 22, 2948–2961.
8. Zhu, N.; Hu, J.; Xu, S.; Wu, W.; Zhang, Y.; Chen, Z. Micro-motion parameter extraction for ballistic missile with wideband radar using improved ensemble EMD method. Remote Sens. 2021, 13, 3545.
9. Li, Y.; Du, L.; Liu, H. Hierarchical classification of moving vehicles based on empirical mode decomposition of micro-Doppler signatures. IEEE Trans. Geosci. Remote Sens. 2013, 51, 3001–3013.
10. Molchanov, P.O.; Astola, J.T.; Egiazarian, K.O.; Totsky, A.V. Classification of ground moving targets using bicepstrum-based features extracted from micro-Doppler radar signatures. EURASIP J. Adv. Signal Process. 2013, 2013, 61.
11. Kim, Y.; Ling, H. Human activity classification based on micro-Doppler signatures using a support vector machine. IEEE Trans. Geosci. Remote Sens. 2009, 47, 1328–1337.
12. Ryu, S.-J.; Suh, J.-S.; Baek, S.-H.; Hong, S.; Kim, J.-H. Feature-based hand gesture recognition using an FMCW radar and its temporal feature analysis. IEEE Sens. J. 2018, 18, 7593–7602.
13. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
14. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
15. Amodei, D.; Ananthanarayanan, S.; Anubhai, R.; Bai, J.; Battenberg, E.; Case, C.; Casper, J.; Catanzaro, B.; Cheng, Q.; Chen, G.; et al. Deep speech 2: End-to-end speech recognition in English and Mandarin. In Proceedings of the Machine Learning Research 2016, New York, NY, USA, 19–24 June 2016; pp. 173–182.
16. Pei, J.; Huang, Y.; Huo, W.; Zhang, Y.; Yang, J.; Yeo, T.-S. SAR automatic target recognition based on multiview deep learning framework. IEEE Trans. Geosci. Remote Sens. 2018, 56, 2196–2210.
17. Jiang, W.; Wang, Y.; Li, Y.; Lin, Y.; Shen, W. Radar target characterization and deep learning in radar automatic target recognition: A review. Remote Sens. 2023, 15, 3742.
18. Qi, F.; Lv, H.; Liang, F.; Li, Z.; Yu, X.; Wang, J. MHHT-based method for analysis of micro-Doppler signatures for human finer-grained activity using through-wall SFCW radar. Remote Sens. 2017, 9, 260.
19. Wan, J.; Chen, B.; Xu, B.; Liu, H.; Jin, L. Convolutional neural networks for radar HRRP target recognition and rejection. EURASIP J. Adv. Signal Process. 2019, 2019, 5.
20. Raja Abdullah, R.S.A.; Alnaeb, A.; Ahmad Salah, A.; Abdul Rashid, N.E.; Sali, A.; Pasya, I. Micro-Doppler estimation and analysis of slow moving objects in forward scattering radar system. Remote Sens. 2017, 9, 699.
21. Ma, B.; Egiazarian, K.O.; Chen, B. Low-resolution radar target classification using vision transformer based on micro-Doppler signatures. IEEE Sens. J. 2023, 23, 28474–28485.
22. Dong, W.; Zhang, L.; Shi, G.; Li, X. Nonlocally centralized sparse representation for image restoration. IEEE Trans. Image Process. 2013, 22, 1620–1630.
23. Zhang, J.; Zhao, D.; Gao, W. Group-based sparse representation for image restoration. IEEE Trans. Image Process. 2014, 23, 3336–3351.
24. Dabov, K.; Foi, A.; Katkovnik, V.; Egiazarian, K. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Process. 2007, 16, 2080–2095.
25. Zhang, K.; Zuo, W.; Chen, Y.; Meng, D.; Zhang, L. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 2017, 26, 3142–3155.
26. Zhang, K.; Zuo, W.; Zhang, L. FFDNet: Toward a fast and flexible solution for CNN-based image denoising. IEEE Trans. Image Process. 2018, 27, 4608–4622.
27. Guo, S.; Yan, Z.; Zhang, K.; Zuo, W.; Zhang, L. Toward convolutional blind denoising of real photographs. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 1712–1722.
28. Tang, C.; Li, W.; Vishwakarma, S.; Shi, F.; Julier, S.J.; Chetty, K. FMNet: Latent feature-wise mapping network for cleaning up noisy micro-Doppler spectrogram. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5106612.
29. Tang, C.; Li, W.; Vishwakarma, S.; Shi, F.; Julier, S.; Chetty, K. MDPose: Human skeletal motion reconstruction using WiFi micro-Doppler signatures. IEEE Trans. Aerosp. Electron. Syst. 2024, 60, 157–167.
30. Tang, C.; Li, W.; Vishwakarma, S.; Woodbridge, K.; Julier, S.; Chetty, K. Learning from natural noise to denoise micro-Doppler spectrogram. arXiv 2021, arXiv:2102.06887.
31. Yang, Y.; Wen, P.; Ye, W.; Li, B.; Lang, Y. Blind universal denoising for radar micro-Doppler spectrograms using identical dual learning and reciprocal adversarial training. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 4120–4134.
32. Zhao, L.; He, Q.; Ding, D.; Zhang, S.; Kuang, G.; Liu, L. Selecting pseudo supervision for unsupervised domain adaptive SAR target classification. EURASIP J. Adv. Signal Process. 2022, 2022, 84.
33. Du, L.; Li, L.; Guo, Y.; Wang, Y.; Ren, K.; Chen, J. Two-stream deep fusion network based on VAE and CNN for synthetic aperture radar target recognition. Remote Sens. 2021, 13, 4021.
34. Xue, R.; Bai, X.; Cao, X.; Zhou, F. Sequential ISAR target classification based on hybrid transformer. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5111411.
35. Chen, S.; He, W.; Ren, J.; Jiang, X. Attention-based dual-stream vision transformer for synthetic aperture radar target recognition. IEEE Trans. Geosci. Remote Sens. 2023, 61, 7448–7461.
36. Lv, Q.; Quan, Y.; Feng, W.; Sha, M.; Dong, S.; Xing, M. Radar deception jamming recognition based on weighted ensemble CNN with transfer learning. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5107511.
37. Kiranyaz, S.; Avci, O.; Abdeljaber, O.; Ince, T.; Gabbouj, M.; Inman, D.J. 1D convolutional neural networks and applications: A survey. Mech. Syst. Signal Process. 2021, 151, 107398.
38. Jing, Z.; Li, P.; Wu, B.; Yuan, S.; Chen, Y. An adaptive focal loss function based on transfer learning for few-shot radar signal intra-pulse modulation classification. Remote Sens. 2022, 14, 1950.
39. Chen, H.; Ye, W. Classification of human activity based on radar signal using 1-D convolutional neural network. IEEE Geosci. Remote Sens. Lett. 2020, 17, 1178–1182.
40. Pan, M.; Liu, A.; Yu, Y.; Wang, P.; Li, J.; Liu, Y.; Lv, S.; Zhu, H. Radar HRRP target recognition model based on a stacked CNN–Bi-RNN with attention mechanism. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5100814.
41. Du, M.; Zhong, P.; Cai, X.; Bi, D. DNCNet: Deep radar signal denoising and recognition. IEEE Trans. Aerosp. Electron. Syst. 2022, 58, 3549–3562.
42. Liu, D.; Wen, B.; Jiao, J.; Liu, X.; Wang, Z.; Huang, T.S. Connecting image denoising and high-level vision tasks via deep learning. IEEE Trans. Image Process. 2020, 29, 3695–3706.
43. Yu, R.; Du, Y.; Li, J.; Napolitano, A.; Le Kernec, J. Radar-based human activity recognition using denoising techniques to enhance classification accuracy. IET Radar Sonar Navig. 2024, 18, 277–293.
44. Tsao, J.; Steinberg, B.D. Reduction of sidelobe and speckle artifacts in microwave imaging: The CLEAN technique. IEEE Trans. Antennas Propag. 1988, 36, 543–556.
45. Ma, B.; Chen, B.; Zhang, Z.; Diao, D.; Fang, L.; Zhao, F. Classification of UAV and ground targets by micro-Doppler signatures based on PD surveillance radar. CIE Int. Conf. Radar 2021, 1361–1365.

Figure 1. Overall architecture of the proposed framework.

Figure 2. Micro-Doppler differences of human motion in the spectrogram $STFT_{f, t}$ and spectrum $X_{f}$. (a) Human running. (b) Human walking.

Figure 3. An example of Doppler spectrum through clutter removal with one CPI for human running. (a) Raw Doppler spectrum. (b) Doppler spectrum after clutter removal.

Figure 4. The structure details of 1D-DnCNN.

Figure 5. Description of the 1D-BM3D Algorithm Steps.

Figure 6. The structure details of 1D-AlexNet.

Figure 7. The joint training strategy of the proposed cascaded framework.

Table 1. The improvements in SNR and RMSE values before and after denoising.

	Noise Level (dB)	−20	−15	−10	−5	0	5	10	15	20
SNR (dB)	Original	−12.09	−11.53	−8.72	−4.07	0.91	6.05	11.30	16.68	22.14
SNR (dB)	Denoised	−0.13	0.43	6.16	8.29	12.04	16.07	20.12	22.76	25.86
RMSE	Original	1.6 × 10⁻²	1.5 × 10⁻²	1.1 × 10⁻²	6.7 × 10⁻³	3.8 × 10⁻³	2.1 × 10⁻³	1.2 × 10⁻³	6.3 × 10⁻⁴	3.4 × 10⁻⁴
RMSE	Denoised	4.2 × 10⁻³	4.3 × 10⁻³	2.5 × 10⁻³	1.9 × 10⁻³	1.2 × 10⁻³	7.1 × 10⁻⁴	4.3 × 10⁻⁴	3.0 × 10⁻⁴	2.1 × 10⁻⁴

Table 2. Classification results (%) comparison for five cases.

Split Rate	0.2/0.8	0.3/0.7	0.4/0.6	0.5/0.5	0.6/0.4	0.7/0.3	0.8/0.2
1D-AlexNet	79.22	82.05	83.38	84.69	86.45	85.99	87.08
1D-BM3D + AlexNet	87.90	90.33	90.79	91.96	91.35	91.84	93.06
1D-DnCNN (Separate) + AlexNet	86.60	89.02	89.85	89.82	90.48	91.15	92.39
1D-DnCNN with BM3D (Separate) + AlexNet	83.03	88.06	90.37	91.09	92.09	93.35	94.11
Joint Training (Proposed)	90.00	93.12	93.29	94.19	94.45	94.94	95.84
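
To make the comparison in Table 2 concrete, the sketch below computes, for each split rate, how far the jointly trained model sits above the plain 1D-AlexNet baseline and above the best of the other four cases. The accuracies are transcribed from the table; the dictionary layout and variable names are assumptions made for illustration.

```python
# Accuracies (%) transcribed from Table 2; the structure and names are illustrative.
split_rates = ["0.2/0.8", "0.3/0.7", "0.4/0.6", "0.5/0.5", "0.6/0.4", "0.7/0.3", "0.8/0.2"]
accuracy = {
    "1D-AlexNet": [79.22, 82.05, 83.38, 84.69, 86.45, 85.99, 87.08],
    "1D-BM3D + AlexNet": [87.90, 90.33, 90.79, 91.96, 91.35, 91.84, 93.06],
    "1D-DnCNN (Separate) + AlexNet": [86.60, 89.02, 89.85, 89.82, 90.48, 91.15, 92.39],
    "1D-DnCNN with BM3D (Separate) + AlexNet": [83.03, 88.06, 90.37, 91.09, 92.09, 93.35, 94.11],
    "Joint Training (Proposed)": [90.00, 93.12, 93.29, 94.19, 94.45, 94.94, 95.84],
}

proposed = accuracy["Joint Training (Proposed)"]
baseline = accuracy["1D-AlexNet"]
others = [row for name, row in accuracy.items() if name != "Joint Training (Proposed)"]

# Gain of the proposed model over the undenoised baseline, in percentage points.
gain_over_baseline = [round(p - b, 2) for p, b in zip(proposed, baseline)]

# Margin over the best competing case in each column, in percentage points.
margin_over_best_other = [round(p - max(col), 2) for p, col in zip(proposed, zip(*others))]

for rate, gain, margin in zip(split_rates, gain_over_baseline, margin_over_best_other):
    print(f"{rate}: +{gain:.2f} over baseline, +{margin:.2f} over best other case")
```

The margin over the best competing case is positive at every split rate, which is the numerical form of the statement that joint training gives the highest accuracy in all seven columns.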

Table 3. Comparison between 1D-based and 2D-based models.

Model	Input Shape	Parameters	Training Time per Epoch	Testing Time
2D-DnCNN-AlexNet	[512,512,1]	226,355,985	95.36 s	2.81 ms
1D-DnCNN-AlexNet	[512,1]	32,735,367	14.12 s	0.56 ms
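
The cost difference in Table 3 is easier to judge as ratios. The short sketch below divides each 2D figure by its 1D counterpart; the numbers are transcribed from the table, while the dictionary keys are hypothetical names chosen for this example.

```python
# Figures transcribed from Table 3; the key names are illustrative.
models = {
    "2D-DnCNN-AlexNet": {"parameters": 226_355_985, "train_s_per_epoch": 95.36, "test_ms": 2.81},
    "1D-DnCNN-AlexNet": {"parameters": 32_735_367, "train_s_per_epoch": 14.12, "test_ms": 0.56},
}

two_d = models["2D-DnCNN-AlexNet"]
one_d = models["1D-DnCNN-AlexNet"]

# Ratio of the 2D cost to the 1D cost for each reported quantity.
ratios = {key: two_d[key] / one_d[key] for key in two_d}

for key, value in ratios.items():
    print(f"{key}: 2D / 1D = {value:.2f}")
```

By this calculation the 1D model has about 6.9 times fewer parameters, trains about 6.8 times faster per epoch, and tests about 5.0 times faster than the 2D model.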

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
