Air Battlefield Time Series Data Augmentation Model Based on a Lightweight Denoising Diffusion Probabilistic Model
Abstract
1. Introduction
- (1)
- Data transformation. We convert multivariate time series data into 3-channel images via matrix expansion, and univariate data via Gramian angular fields and Markov transition fields.
- (2)
- Considering the need for miniaturization and intelligence in future combat platforms, depthwise separable convolution is introduced to lighten the denoising diffusion probabilistic model (DDPM), reducing both the model's parameter count and its computational cost.
- (3)
- This paper designs an improved knowledge distillation method with multiple teacher models to accelerate the sampling process.
- (4)
- To validate the practicality and reliability of the generated data, this paper conducts application experiments; the generated data improve the performance of intention-recognition and target-recognition models.
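Contribution (2) rests on the parameter savings of depthwise separable convolution, which factorizes a standard convolution into a per-channel depthwise step and a 1 × 1 pointwise step. A back-of-the-envelope sketch of the saving (bias terms omitted; the channel counts below are illustrative, not the paper's actual U-Net configuration):

```python
def conv_params(k, c_in, c_out):
    # Standard 2D convolution: one k x k x c_in filter per output channel.
    return k * k * c_in * c_out

def dsc_params(k, c_in, c_out):
    # Depthwise separable convolution: a k x k depthwise filter per input
    # channel, followed by a 1 x 1 pointwise convolution mixing channels.
    return k * k * c_in + c_in * c_out

# Hypothetical 3 x 3 layer mapping 64 -> 128 channels.
standard = conv_params(3, 64, 128)   # 73728
separable = dsc_params(3, 64, 128)   # 576 + 8192 = 8768
print(standard, separable, round(standard / separable, 1))
```

For a 3 × 3 kernel the reduction approaches the classic 1/C_out + 1/k² factor, roughly an 8–9× saving here, which is consistent in spirit with the Params and FLOPs reductions reported in the experiment tables.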
2. Related Work
2.1. Data Augmentation Methods Based on Rules
2.2. Data Augmentation Methods Based on Simulation Models
2.3. Data Augmentation Methods Based on Traditional Machine Learning
2.4. Data Augmentation Methods Based on Deep Learning
3. Related Theory
3.1. Depthwise Separable Convolution
3.2. Denoising Diffusion Probabilistic Model
3.3. Knowledge Distillation
4. Methodology
4.1. Data Encoding Module
4.1.1. Multivariate Time Series Data Preprocessing
- (1)
- Normalization Processing
- (2)
- 2D Matrix Embedding
- (3)
- Visualization Processing
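The multivariate preprocessing steps above can be sketched as follows. The paper's exact matrix-expansion layout is not reproduced here; this is a minimal feature-wise min-max normalization followed by stacking into a 2-D matrix, in plain Python:

```python
def min_max_normalize(series):
    """Scale a 1-D sequence to [0, 1]; constant sequences map to all zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0] * len(series)
    return [(v - lo) / (hi - lo) for v in series]

# Toy multivariate sample: 4 timesteps x 3 features (values are illustrative).
sample = [[1.0, 10.0, 5.0],
          [2.0, 20.0, 5.0],
          [3.0, 30.0, 5.0],
          [4.0, 40.0, 5.0]]

# Normalize column-wise (per feature), then restack rows into a 2-D matrix
# that could be replicated across 3 channels for an image-style input.
features = list(zip(*sample))                      # transpose to feature-major
normalized = [min_max_normalize(f) for f in features]
matrix = list(zip(*normalized))                    # back to time-major
print(matrix[0])   # (0.0, 0.0, 0.0)
print(matrix[3])   # (1.0, 1.0, 0.0)
```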
4.1.2. Univariate Time Series Data Preprocessing
- (1)
- Polar Coordinate Transformation
- (2)
- Gramian Angular Fields (GASF/GADF)
- (3)
- Markov Transition Fields (MTFs)
- (4)
- Visualization Processing
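Steps (1) and (2) above can be sketched together: the rescaled series is mapped to polar angles via arccos, and the Gramian angular fields are built from pairwise angle sums (GASF) and differences (GADF). The Markov transition field is built analogously from quantile-bin transition probabilities and is omitted here for brevity. A minimal sketch, assuming a non-constant input series:

```python
import math

def gramian_angular_fields(series):
    """Rescale to [-1, 1], map to polar angles, and build GASF/GADF."""
    lo, hi = min(series), max(series)           # assumes hi > lo
    scaled = [2.0 * (v - lo) / (hi - lo) - 1.0 for v in series]
    phi = [math.acos(x) for x in scaled]        # one polar angle per timestep
    gasf = [[math.cos(a + b) for b in phi] for a in phi]
    gadf = [[math.sin(a - b) for b in phi] for a in phi]
    return gasf, gadf

gasf, gadf = gramian_angular_fields([0.0, 1.0, 2.0, 3.0])
print(round(gasf[0][0], 3))   # cos(2 * acos(-1)) = cos(2*pi) = 1.0
print(round(gadf[1][1], 3))   # sin(0) = 0.0 on the GADF diagonal
```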
4.2. Data Augmentation Module
- (1)
- Forward process. Gaussian noise is gradually added to the original data until they approximate a standard Gaussian distribution.
- (2)
- Reverse process. The LDDPM is fitted by a noise prediction model (U-Net neural network), iteratively denoised, and finally obtains a sample from the data distribution. The inputs of the U-Net neural network are the latent variables at time t, feature after the original data have been extracted, and time t. By training a U-Net neural network to predict the noise at time t in the reverse process, and are obtained, and the next latent variable is sampled. The generated sample can be obtained through repeated iterations.
4.3. Training and Sampling Process
Algorithm 1 Training process
1: repeat
2: x_0 ~ q(x_0)
3: t ~ Uniform({1, …, T})
4: ε ~ N(0, I)
5: Take a gradient step on ∇_θ ‖ε − ε_θ(√(ᾱ_t) x_0 + √(1 − ᾱ_t) ε, t)‖²
6: until converged
Algorithm 2 Sampling process
1: x_T ~ N(0, I)
2: for t = T, …, 1 do
3: z ~ N(0, I) if t > 1, else z = 0
4: x_{t−1} = (1/√α_t) (x_t − (β_t/√(1 − ᾱ_t)) ε_θ(x_t, t)) + σ_t z
5: end for
6: return x_0
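The standard DDPM sampling loop can be translated into code as follows. The `predict_noise` callable is a placeholder for the trained U-Net ε_θ, and σ_t = √β_t is one standard variance choice, so this sketch only exercises the loop structure, not a trained model:

```python
import math
import random

def ddpm_sample(predict_noise, dim, betas, seed=0):
    """Iterative reverse-process sampling. `predict_noise(xt, t)` stands in
    for the trained U-Net noise predictor epsilon_theta."""
    rng = random.Random(seed)
    T = len(betas)
    alpha_bars, ab = [], 1.0
    for b in betas:                  # precompute cumulative products
        ab *= 1.0 - b
        alpha_bars.append(ab)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]       # x_T ~ N(0, I)
    for t in range(T, 0, -1):
        beta = betas[t - 1]
        alpha = 1.0 - beta
        eps = predict_noise(x, t)
        # z ~ N(0, I) while t > 1; z = 0 at the final step.
        z = [rng.gauss(0.0, 1.0) if t > 1 else 0.0 for _ in range(dim)]
        coeff = beta / math.sqrt(1.0 - alpha_bars[t - 1])
        x = [(xi - coeff * ei) / math.sqrt(alpha) + math.sqrt(beta) * zi
             for xi, ei, zi in zip(x, eps, z)]
    return x

# Toy run with an all-zero noise predictor, just to exercise the loop.
sample = ddpm_sample(lambda x, t: [0.0] * len(x), dim=4, betas=[0.01] * 50)
print(len(sample))   # 4
```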
4.4. Reasoning Acceleration Process
5. Experiments and Discussions
- (1)
- Can the LDMKD-DA model effectively extract features from the original sample?
- (2)
- Can the LDMKD-DA model generate high-quality and diverse samples?
- (3)
- Does the generated sample have the same feature distribution as the original sample?
- (4)
- Can the generated samples be used to train specific models?
5.1. Experimental Setup
5.1.1. Experimental Datasets
- 1.
- Air Target Intention Dataset (ATI Dataset)
- 2.
- High-Resolution Range Profile Dataset (HRRP Dataset)
5.1.2. Evaluation Indicators
5.1.3. Hyperparameter Setting
5.2. Multivariate Time Series Data Experiments
5.3. Univariate Time Series Data Experiments
5.4. Application Experiment Analysis
5.4.1. ATI Dataset Application Experiment Analysis
5.4.2. HRRP Dataset Application Experiment Analysis
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
LDMKD-DA | Air battlefield time series data augmentation model based on lightweight denoising diffusion probabilistic model |
DDPM | Denoising diffusion probabilistic model |
GAN | Generative adversarial network |
VAE | Variational autoencoder |
DSC | Depthwise separable convolution |
DW | Depthwise convolution |
PW | Pointwise convolution |
GASF | Gramian summation angular field |
GADF | Gramian difference angular field |
MTF | Markov transition field |
MMD | Maximum mean discrepancy |
Params | Number of parameters |
FLOPs | Floating-point operations (amount of computation) |
Appendix A. Precision, Recall, and F1 Score Values of the ATI Dataset Application Experiment
Model | Precision | Recall | F1 Score |
---|---|---|---|
LDMKD-DA | 96.84 | 96.60 | 96.70 |
DDPM | 97.07 | 97.06 | 97.05 |
VAE | 95.29 | 95.03 | 95.13 |
HVAE | 96.06 | 95.69 | 95.86 |
VQ-VAE | 95.73 | 95.46 | 95.55 |
GAN | 96.75 | 96.63 | 96.64 |
TimeGAN | 96.54 | 96.31 | 96.37 |
SigCWGAN | 96.55 | 96.37 | 96.44 |
Model | Precision | Recall | F1 Score |
---|---|---|---|
LDMKD-DA | 99.22 | 99.22 | 99.22 |
DDPM | 99.45 | 99.45 | 99.45 |
VAE | 97.34 | 97.84 | 97.57 |
HVAE | 98.32 | 97.67 | 97.97 |
VQ-VAE | 98.31 | 97.97 | 98.14 |
GAN | 98.45 | 97.70 | 98.04 |
TimeGAN | 98.86 | 98.42 | 98.62 |
SigCWGAN | 98.81 | 98.57 | 98.68 |
Model | Precision Range | Recall Range | F1 Score Range |
---|---|---|---|
LDMKD-DA | 97.32–98.39 | 97.21–98.42 | 97.26–98.40 |
DDPM | 97.46–98.61 | 97.36–98.64 | 97.40–98.62 |
VAE | 96.78–97.72 | 96.62–97.55 | 96.64–97.63 |
HVAE | 96.85–98.03 | 96.52–97.84 | 96.63–97.91 |
VQ-VAE | 97.38–97.95 | 97.38–97.95 | 97.38–97.95 |
GAN | 96.94–98.23 | 96.59–97.98 | 96.74–98.08 |
TimeGAN | 97.38–97.99 | 97.06–97.79 | 97.17–97.88 |
SigCWGAN | 96.22–98.26 | 95.01–98.01 | 95.31–98.11 |
Appendix B. Precision, Recall, and F1 Score Values of the HRRP Dataset Application Experiment
Model | Precision | Recall | F1 Score |
---|---|---|---|
LDMKD-DA | 88.84 | 88.28 | 88.46 |
DDPM | 89.10 | 89.16 | 89.13 |
VAE | 87.03 | 87.05 | 86.93 |
HVAE | 87.94 | 87.43 | 87.50 |
VQ-VAE | 87.85 | 87.17 | 87.18 |
GAN | 88.49 | 87.32 | 87.39 |
TimeGAN | 88.37 | 88.03 | 88.16 |
SigCWGAN | 88.38 | 88.07 | 88.13 |
Model | Precision | Recall | F1 Score |
---|---|---|---|
LDMKD-DA | 93.05 | 93.09 | 93.06 |
DDPM | 93.84 | 93.81 | 93.81 |
VAE | 90.93 | 90.41 | 90.28 |
HVAE | 92.32 | 92.06 | 92.06 |
VQ-VAE | 92.85 | 92.63 | 92.57 |
GAN | 93.32 | 92.84 | 92.93 |
TimeGAN | 92.11 | 92.02 | 92.02 |
SigCWGAN | 92.40 | 91.77 | 91.82 |
Model | Precision Range | Recall Range | F1 Score Range |
---|---|---|---|
LDMKD-DA | 89.17–92.00 | 88.90–92.01 | 88.99–92.00 |
DDPM | 91.36–92.89 | 91.29–92.91 | 91.29–92.89 |
VAE | 87.93–90.63 | 87.29–90.56 | 87.16–90.57 |
HVAE | 88.48–91.29 | 88.41–91.29 | 88.31–91.25 |
VQ-VAE | 88.01–91.82 | 87.95–91.12 | 87.97–91.20 |
GAN | 89.32–92.22 | 88.20–91.68 | 87.97–91.73 |
TimeGAN | 89.21–91.52 | 88.38–91.21 | 88.56–91.25 |
SigCWGAN | 88.78–92.19 | 88.76–91.92 | 88.66–91.85 |
References
- Fusano, A.; Sato, H.; Namatame, A. Multi-Agent Based Combat Simulation from OODA and Network Perspective. In Proceedings of the 2011 UkSim 13th International Conference on Computer Modelling and Simulation, Cambridge, UK, 30 March–1 April 2011; pp. 249–254. [Google Scholar]
- Bruno, N.; Chaudhuri, S. Flexible database generators. In Proceedings of the 31st International Conference on Very Large Data, Trondheim, Norway, 30 August–2 September 2005; VLDB Endowment: Los Angeles, CA, USA; pp. 1097–1107. [Google Scholar]
- Houkjær, K.; Torp, K.; Wind, R. Simple and realistic data generation. In Proceedings of the 32nd International Conference on Very Large Data Bases, Seoul, Republic of Korea, 12–15 September 2006; VLDB Endowment: Los Angeles, CA, USA; pp. 1243–1246. [Google Scholar]
- Kang, Y.; Hyndman, R.J.; Li, F. GRATIS: GeneRAting TIme Series with diverse and controllable characteristics. Stat. Anal. Data Min. 2020, 13, 354–376. [Google Scholar] [CrossRef]
- Bokde, N.D.; Feijóo, A.; Al-Ansari, N.; Yaseen, Z.M. A Comparison Between Reconstruction Methods for Generation of Synthetic Time Series Applied to Wind Speed Simulation. IEEE Access 2019, 7, 135386–135398. [Google Scholar] [CrossRef]
- Koltuk, F.; Schmidt, E.G. A Novel Method for the Synthetic Generation of Non-I.I.D Workloads for Cloud Data Centers. In Proceedings of the 2020 IEEE Symposium on Computers and Communications (ISCC), Rennes, France, 7–10 July 2020; pp. 1–6. [Google Scholar]
- Arlitt, M.; Marwah, M.; Bellala, G.; Shah, A.; Healey, J.; Vandiver, B. IoTAbench: An Internet of Things Analytics Benchmark. In Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering, New York, NY, USA, 28 January–4 February 2015; Association for Computing Machinery: New York, NY, USA; pp. 133–144. [Google Scholar]
- Shamshad, A.; Bawadi, M.A.; Wan Hussin, W.M.A.; Majid, T.A.; Sanusi, S.A.M. First and second order Markov chain models for synthetic generation of wind speed time series. Energy 2005, 30, 693–708. [Google Scholar] [CrossRef]
- Li, Y.; Hu, B.; Niu, T.; Gao, S.; Yan, J.; Xie, K.; Ren, Z. GMM-HMM-Based Medium- and Long-Term Multi-Wind Farm Correlated Power Output Time Series Generation Method. IEEE Access 2021, 9, 90255–90267. [Google Scholar] [CrossRef]
- Mogren, O. C-RNN-GAN: Continuous recurrent neural networks with adversarial training. arXiv 2016. [Google Scholar] [CrossRef]
- Zhang, X.; Wu, T.; Zhang, Y. Attack Detection and Data Restoration of Remote Estimation Systems Based on D-RCGAN. In Proceedings of the Asian Control Conference, Dalian, China, 5–8 July 2024; pp. 1167–1173. [Google Scholar]
- Jiang, X.; Li, D.; Zhang, H.; Zhou, Y.; Liu, J.; Xiang, X. Weighted Strategy Optimization Approach for Discrete Sequence Generation. In Proceedings of the 2024 4th International Symposium on Artificial Intelligence and Intelligent Manufacturing (AIIM), Chengdu, China, 20–22 December 2024; pp. 843–846. [Google Scholar]
- Ramponi, G.; Protopapas, P.; Brambilla, M.; Janssen, R. T-CGAN: Conditional Generative Adversarial Network for Data Augmentation in Noisy Time Series with Irregular Sampling. arXiv 2019. [Google Scholar] [CrossRef]
- Vijaya, K. Evaluating Financial Risk in the Transition from EONIA to ESTER: A TimeGAN Approach with Enhanced VaR Estimations. Int. J. Innov. Sci. Mod. Eng. 2024, 12, 1–9. [Google Scholar]
- Xu, T.; Wenliang, L.K.; Munn, M.; Acciaio, B. COT-GAN: Generating Sequential Data via Causal Optimal Transport. arXiv 2020. [Google Scholar] [CrossRef]
- Lu, C.; Reddy, C.K.; Wang, P.; Nie, D.; Ning, Y. Multi-Label Clinical Time-Series Generation via Conditional GAN. IEEE Trans. Knowl. Data Eng. 2024, 36, 1728–1740. [Google Scholar] [CrossRef]
- Wang, H.; Zhu, H.; Li, H. Multi-Mode Data Generation and Fault Diagnosis of Bearings Based on STFT-SACGAN. Electronics 2023, 12, 1910. [Google Scholar] [CrossRef]
- Shi, H.; Xu, Y.; Ding, B.; Zhou, J.; Zhang, P. Long-Term Solar Power Time-Series Data Generation Method Based on Generative Adversarial Networks and Sunrise–Sunset Time Correction. Sustainability 2023, 15, 14920. [Google Scholar] [CrossRef]
- Haque, T.; Syed, M.A.B.; Jeong, B.; Bai, X.; Mohan, S.; Paul, S.; Ahmed, I.; Das, S. Towards Efficient Real-Time Video Motion Transfer via Generative Time Series Modeling. arXiv 2025. [Google Scholar] [CrossRef]
- Fraccaro, M.; Sønderby, S.K.; Paquet, U.; Winther, O. Sequential Neural Models with Stochastic Layers. arXiv 2016. [Google Scholar] [CrossRef]
- Li, Y.; Mandt, S. Disentangled Sequential Autoencoder. arXiv 2018. [Google Scholar] [CrossRef]
- Jeon, S.; Seo, J.T. A Synthetic Time-Series Generation Using a Variational Recurrent Autoencoder with an Attention Mechanism in an Industrial Control System. Sensors 2024, 24, 128. [Google Scholar] [CrossRef] [PubMed]
- Zheng, Y.; Zhang, Z.; Cui, R. Few-Shot Learning for Time Series Data Generation Based on Distribution Calibration. In Proceedings of the Web Information Systems and Applications; Xing, C., Fu, X., Zhang, Y., Zhang, G., Borjigin, C., Eds.; Springer International Publishing: Cham, Switzerland, 2021; pp. 198–206. [Google Scholar]
- Yi, H.; Hou, L.; Jin, Y.; Saeed, N.A.; Kandil, A.; Duan, H. Time series diffusion method: A denoising diffusion probabilistic model for vibration signal generation. Mech. Syst. Signal Process. 2024, 216, 111481. [Google Scholar] [CrossRef]
- Adib, E.; Fernandez, A.S.; Afghah, F.; Prevost, J.J. Synthetic ECG Signal Generation Using Probabilistic Diffusion Models. IEEE Access 2023, 11, 75818–75828. [Google Scholar] [CrossRef]
- Wang, S.; Wang, G.; Fu, Q.; Song, Y.; Liu, J.; He, S. STABC-IR: An air target intention recognition method based on bidirectional gated recurrent unit and conditional random field with space-time attention mechanism. Chin. J. Aeronaut. 2023, 36, 316–334. [Google Scholar] [CrossRef]
- Sun, W.; Lu, G.; Zhao, Z.; Guo, T.; Qin, Z.; Han, Y. Regional Time-Series Coding Network and Multi-View Image Generation Network for Short-Time Gait Recognition. Entropy 2023, 25, 837. [Google Scholar] [CrossRef]
- Cao, B.; Xing, Q.; Li, L.; Xing, H.; Song, Z. KGTLIR: An Air Target Intention Recognition Model Based on Knowledge Graph and Deep Learning. Comput. Mater. Contin. 2024, 80, 1251–1275. [Google Scholar] [CrossRef]
- Xu, B.; Chen, B.; Wan, J.; Liu, H.; Jin, L. Target-Aware Recurrent Attentional Network for Radar HRRP Target Recognition. Signal Process. 2019, 155, 268–280. [Google Scholar] [CrossRef]
Method | References | Needs Real Data | Needs Domain-Specific Knowledge | Advantages and Disadvantages |
---|---|---|---|---|
Rule-based methods | [2,3,4] | × | × | Advantages: they do not rely on large amounts of historical data or on training models. Disadvantages: the generated data may be too simple and unrealistic. |
Simulation model-based methods | [5,6] | √ | √ | Advantages: they can simulate the behavior of complex systems and thus generate more realistic data. Disadvantages: they require extensive domain knowledge and are computationally intensive. |
Traditional machine learning-based methods | [7,8,9] | √ | × | Advantages: they take the effect of historical data into account. Disadvantages: the models are too simple and require manual parameterization. |
Deep learning-based methods | GAN [10,11,12,13,14,15,16,17,18] | √ | × | Advantages: they excel in the quality of the samples generated. Disadvantages: training is unstable, suffering from mode collapse and vanishing gradients, and lacks rigorous mathematical derivation. |
| VAE [19,20,21,22,23] | √ | × | Advantages: they rest on rigorous mathematical derivations. Disadvantages: they struggle to generate high-quality samples. |
| Diffusion model [24,25] | √ | × | Advantages: they combine a rigorous mathematical derivation with high-quality sample generation. Disadvantages: sampling is too slow and the number of model parameters is too large. |
Attribute | Details |
---|---|
Source | Air Combat Maneuvering Generator (ACMG) |
Covered scenarios | Plain air defense combat, mountain air defense combat, etc. |
Time distribution | 6 sample periods |
Total samples | 3520 |
Feature types | 8 numerical features and 4 non-numerical features |
Sample distribution | Attack: 600; retreat: 600; penetrate: 600; reconnaissance: 600; interference: 280; surveillance: 600; feint: 240 |
Hyperparameter | Value |
---|---|
Kernel size of DSC | 3 × 3 |
Optimizer | Adam |
Learning rate | 0.0001 |
Batch size | 64 |
Number of diffusion steps | 1000 |
Kernel function of MMD | RBF |
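The MMD values reported in the result tables use an RBF kernel, per the hyperparameter table above. A sketch of the biased MMD² estimator for 1-D samples (the bandwidth parameter `gamma` here is an assumed illustration, not the paper's setting):

```python
import math

def mmd_rbf(x, y, gamma=1.0):
    """Biased squared maximum mean discrepancy between 1-D samples
    with RBF kernel k(a, b) = exp(-gamma * (a - b)^2)."""
    def k(a, b):
        return math.exp(-gamma * (a - b) ** 2)
    def mean_k(s, t):
        return sum(k(a, b) for a in s for b in t) / (len(s) * len(t))
    # MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    return mean_k(x, x) + mean_k(y, y) - 2.0 * mean_k(x, y)

identical = mmd_rbf([0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
shifted = mmd_rbf([0.0, 1.0, 2.0], [5.0, 6.0, 7.0])
print(identical < 1e-12, shifted > 0.9)   # identical samples give MMD ~ 0
```

A lower MMD between generated and original samples indicates closer feature distributions, which is how the tables below should be read.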
Model | FLOPs | Params | MMD |
---|---|---|---|
LDMKD-DA | 1.368 G | 16.382 M | 0.0838 |
DDPM (LDMKD-DA w/o DSC and KD) | 7.278 G | 76.065 M | 0.0731 |
LDMKD-DA w/o DSC | 7.278 G | 76.065 M | 0.0792 |
LDMKD-DA w/o KD | 1.368 G | 16.382 M | 0.0845 |
Model | VAE | HVAE | VQ-VAE | GAN | TimeGAN | SigCWGAN |
---|---|---|---|---|---|---|
MMD | 0.6851 | 0.2278 | 0.1553 | 0.2860 | 0.2788 | 0.2212 |
Model | FLOPs | Params | MMD |
---|---|---|---|
LDMKD-DA | 314.556 M | 2.779 M | 0.1124 |
DDPM (LDMKD-DA w/o DSC and KD) | 627.540 M | 4.575 M | 0.1106 |
LDMKD-DA w/o DSC | 627.540 M | 4.575 M | 0.1115 |
LDMKD-DA w/o KD | 314.556 M | 2.779 M | 0.1136 |
Model | VAE | HVAE | VQ-VAE | GAN | TimeGAN | SigCWGAN |
---|---|---|---|---|---|---|
MMD | 0.5562 | 0.5603 | 0.5341 | 0.1832 | 0.1741 | 0.1544 |
Share and Cite
Cao, B.; Xing, Q.; Li, L.; Shi, J.; Lin, W. Air Battlefield Time Series Data Augmentation Model Based on a Lightweight Denoising Diffusion Probabilistic Model. AI 2025, 6, 192. https://doi.org/10.3390/ai6080192