Multi-Channel Physical Feature Convolution and Tri-Branch Fusion Network for Automatic Modulation Recognition
Abstract
1. Introduction
- Multi-channel physical prior feature fusion input: The original complex signal is converted into multiple physically meaningful channels, including the IQ components (real and imaginary parts), phase, phase difference, amplitude, the second-order spectrum obtained via the fast Fourier transform (FFT), and the fourth-order (higher-order) spectrum. Four CNN channels process these derived representations together with one original-signal channel, emphasizing physical interpretability rather than treating the IQ samples or spectra in isolation (a minimal sketch of the channel construction follows this list).
- Three-branch joint structure (CNN, BiLSTM, ViT): Each branch processes a different feature view, and their outputs are combined for the final decision. The CNN branch uses the multi-channel inputs above; the BiLSTM branch captures modulation periodicity and long-term temporal dependencies; and the ViT branch identifies modulation types from a constellation diagram rendered from the IQ signal. Fusing signal- and image-based cues yields more accurate classification than any single representation alone.
- Path attention fusion module: Instead of simple concatenation, this module adapts to the varying feature dependencies of different modulation types, allowing the model to re-weight its branches under diverse SNR conditions while keeping the fused feature stable. The three path outputs are adaptively weighted to generate the final feature vector for classification and identification.
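As a concrete illustration of the channel construction, here is a minimal NumPy sketch that derives the five raw channels (real, imaginary, phase, phase difference, amplitude) and the three spectral channels from a complex baseband sequence. The channel ordering, the padding of the phase difference, and the absence of normalization are assumptions for illustration, not the paper's released code.

```python
import numpy as np

def physical_feature_channels(x: np.ndarray) -> np.ndarray:
    """Convert a complex baseband sequence x of shape (L,) into
    an (8, L) array of physically meaningful channels."""
    real = x.real
    imag = x.imag
    amplitude = np.abs(x)
    phase = np.angle(x)
    # Phase difference: first-order difference of the unwrapped phase,
    # prepended with the first sample so the length stays L.
    dphase = np.diff(np.unwrap(phase), prepend=phase[0])
    # Spectral channels: FFT magnitudes of x, x^2, and x^4. Squaring and
    # raising to the fourth power expose carrier and symbol-rate lines
    # for many linear modulations.
    spec = np.abs(np.fft.fft(x))
    spec2 = np.abs(np.fft.fft(x ** 2))
    spec4 = np.abs(np.fft.fft(x ** 4))
    return np.stack([real, imag, phase, dphase, amplitude, spec, spec2, spec4])
```

The first five rows of the stack correspond to the raw-feature channels listed in the CNN architecture table below, and the spectral rows feed the FFT branches.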
2. Signal Model and Task
3. The Proposed Method
3.1. IQ Signal Processing for Multi-Channel CNN
3.2. Multi-Channel CNN
3.3. Vision Transformer
3.4. BiLSTM
4. Three-Branch Fusion Classification Method
- IQ signal preprocessing: The raw complex baseband IQ samples are normalized and reshaped into the formats required for sequence-based processing and constellation construction.
- Parallel feature extraction: The preprocessed IQ sequence is fed simultaneously into three branches. The CNN branch extracts local multi-scale time–frequency features, the BiLSTM branch models long-range temporal dependencies and modulation periodicity, and the ViT branch takes the corresponding constellation image as input to learn global spatial structures based on the [CLS] token representation.
- Feature aggregation: The high-level feature vectors produced by the three branches are concatenated into a joint feature representation, which collects complementary information from the time domain, sequence dynamics, and constellation space.
- Adaptive path fusion: The concatenated feature is passed through the path attention module (PAM), which generates branch-wise weights and produces a fused feature by assigning different importance to the CNN, BiLSTM, and ViT representations (see the sketch after this list).
- Classification: The fused feature is fed into the final fully connected layers and softmax to output the posterior probabilities of all modulation types, and the class with the highest probability is taken as the predicted modulation label.
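The aggregation, PAM, and classification steps can be condensed into the following PyTorch sketch. It assumes each branch emits a 512-dimensional vector; the gate's hidden width, the classifier sizes, and the default class count are illustrative choices, not the paper's exact PAM configuration.

```python
import torch
import torch.nn as nn

class PathAttentionFusion(nn.Module):
    """Sketch of adaptive path fusion: branch-wise weights from a small
    gating MLP, followed by the fully connected classifier head."""

    def __init__(self, feat_dim: int = 512, num_classes: int = 11):
        super().__init__()
        # Gate: concatenated branch features -> one weight per branch.
        self.gate = nn.Sequential(
            nn.Linear(3 * feat_dim, 128), nn.ReLU(),
            nn.Linear(128, 3), nn.Softmax(dim=-1),
        )
        self.classifier = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, num_classes),
        )

    def forward(self, f_cnn, f_lstm, f_vit):
        concat = torch.cat([f_cnn, f_lstm, f_vit], dim=-1)    # [B, 1536]
        w = self.gate(concat)                                 # [B, 3]
        stacked = torch.stack([f_cnn, f_lstm, f_vit], dim=1)  # [B, 3, 512]
        fused = (w.unsqueeze(-1) * stacked).sum(dim=1)        # [B, 512]
        return self.classifier(fused)                         # class logits
```

Normalizing the three gate outputs with a softmax keeps the branch weights on a common scale however they shift across SNR conditions; a softmax over the returned logits then gives the posterior probabilities used for the final label.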
4.1. Overall Framework Description
4.2. The Path Attention Module and Classifier
4.3. Loss Function
4.4. Data Flow and Representation
5. Experiments and Results
5.1. Dataset Description
5.2. Experimental Details
5.3. Results and Analysis
5.4. Comparison of Methods
5.5. Ablation Experiment
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Abbreviations

| Abbreviation | Full Form |
|---|---|
| ALRT | Average likelihood ratio test |
| GLRT | Generalized likelihood ratio test |
| HLRT | Hybrid likelihood ratio test |
| ANN | Artificial neural network |
| SVM | Support vector machine |
| HMM | Hidden Markov model |
| CNNs | Convolutional neural networks |
| RNNs | Recurrent neural networks |
| GNNs | Graph neural networks |
| LSTM | Long short-term memory network |
| SPWVD | Smoothed pseudo-Wigner–Ville distribution |
| BJD | Born–Jordan distribution |

| Layer (Name) | Output Shape | Description |
|---|---|---|
| Input | [B, 7, L] | 7-channel IQ signal input |
| TimeConv1 | [B, 64, L/2] | Conv1d (5 → 64, k = 3, p = 1) + BN + ReLU + SELayer + MaxPool (2) (BRNM) |
| TimeConv2 | [B, 128, L/4] | Conv1d (64 → 128, k = 3, p = 1) + BRNM |
| TimeConv3 | [B, 256, L/8] | Conv1d (128 → 256, k = 3, p = 1) + BRNM |
| FFTConv1 | [B, 64, L/2] | Conv1d (2 → 64, k = 3, p = 1) + BRNM |
| FFTConv2 | [B, 128, L/4] | Conv1d (64 → 128, k = 3, p = 1) + BRNM |
| FFTConv3 | [B, 256, L/8] | Conv1d (128 → 256, k = 3, p = 1) + BRNM |
| FFTConv4 | [B, 512, L/16] | Conv1d (256 → 512, k = 3, p = 1) + BRNM |
| FFT2Conv1 | [B, 64, L/2] | Same as FFTConv1, input is FFT of squared IQ |
| FFT2Conv2 | [B, 128, L/4] | Same as FFTConv2 |
| FFT2Conv3 | [B, 256, L/8] | Same as FFTConv3 |
| FFT2Conv4 | [B, 512, L/16] | Same as FFTConv4 |
| FFT4Conv1 | [B, 64, L/2] | Same as FFTConv1, input is FFT of the fourth power of the IQ signal |
| FFT4Conv2 | [B, 128, L/4] | Same as FFTConv2 |
| FFT4Conv3 | [B, 256, L/8] | Same as FFTConv3 |
| FFT4Conv4 | [B, 512, L/16] | Same as FFTConv4 |
| RawFeatures | [B, 5, L] | Raw input channels: real, imag, phase, dphase, amplitude |
| Concat | [B, 1797, min_len] | Concatenate all branches, align time dimension |
| FusionConv | [B, 512, min_len] | Conv1d (1797 → 512, kernel = 1) + BN + ReLU |
| GlobalAvgPool | [B, 512, 1] | AdaptiveAvgPool1d (1) |
| Flatten | [B, 512] | Flatten layer |
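A single table row of the form Conv1d + BN + ReLU + SELayer + MaxPool (the BRNM unit) could be implemented as in the PyTorch sketch below; the squeeze-and-excitation reduction ratio is an assumed value.

```python
import torch
import torch.nn as nn

class SELayer(nn.Module):
    """Channel attention over a 1D feature map (assumed SE form)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                    # x: [B, C, L]
        s = x.mean(dim=-1)                   # squeeze: per-channel average
        return x * self.fc(s).unsqueeze(-1)  # excite: rescale channels

class BRNM(nn.Module):
    """Conv1d + BatchNorm + ReLU + SELayer + MaxPool(2), as in the table."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv1d(in_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm1d(out_ch),
            nn.ReLU(),
            SELayer(out_ch),
            nn.MaxPool1d(2),
        )

    def forward(self, x):  # [B, in_ch, L] -> [B, out_ch, L/2]
        return self.block(x)

# Example: the TimeConv1 row maps [B, 5, L] to [B, 64, L/2].
time_conv1 = BRNM(5, 64)
```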

| Module Name | Input/Output Dimension | Operation/Functional Description |
|---|---|---|
| Constellation Map Generation | IQ signal | Complex mapping, 2D histogram generation, and Gaussian smoothing; converts the IQ sequence into a 2D spatial modulation distribution. |
| Patch Embedding | Input: constellation image | Divide the image into patches, flatten them, and project via an MLP; embeds each patch into a fixed-length vector representation. |
| Add [CLS] + Positional Encoding | Embedded patches | Add a learnable [CLS] token and positional encodings; provides a global representation and spatial position awareness for the Transformer encoder. |
| Transformer Encoder | Embedded sequence | Stacked Transformer encoder layers that model long-range dependencies in the constellation map using self-attention mechanisms. |
| Extract Token | Transformer output | Take the output corresponding to the [CLS] token; produces an image-level global representation of the modulation signal. |
| Branch Output | – | Fused with the CNN and BiLSTM branch outputs; represents global spatial features of modulation patterns. |
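The constellation map generation row can be illustrated with a short NumPy/SciPy sketch. The image resolution and the Gaussian smoothing width below are assumed values, not the paper's reported settings.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def constellation_image(x: np.ndarray, bins: int = 224,
                        sigma: float = 1.0) -> np.ndarray:
    """Map a complex IQ sequence to a smoothed 2D constellation image."""
    # 2D histogram of (I, Q) samples over a symmetric square range.
    r = np.max(np.abs(x)) + 1e-12
    hist, _, _ = np.histogram2d(x.real, x.imag, bins=bins,
                                range=[[-r, r], [-r, r]])
    # Gaussian smoothing turns the sparse sample scatter into a dense map.
    img = gaussian_filter(hist, sigma=sigma)
    # Normalize to [0, 1] before patch embedding.
    return img / (img.max() + 1e-12)
```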

| Branch | Output Dimension | Key Feature |
|---|---|---|
| CNN | 512 | Extracts multi-scale time–frequency features from the IQ signals. |
| BiLSTM | 512 | Models temporal dependencies and modulation periodicity. |
| ViT | 512 | Captures global constellation structures via Transformer attention. |
| PAM + FC1 + FC2 | C (classes) | Fuses multi-branch features adaptively and performs classification. |
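The BiLSTM row's 512-dimensional output is consistent with a bidirectional LSTM of hidden size 256 whose final forward and backward states are concatenated; the layer count and the use of final hidden states in this sketch are assumptions.

```python
import torch
import torch.nn as nn

class BiLSTMBranch(nn.Module):
    """Sketch of the BiLSTM branch: 256 hidden units per direction,
    so the concatenated final states give a 512-d feature."""
    def __init__(self, in_ch: int = 2, hidden: int = 256):
        super().__init__()
        self.lstm = nn.LSTM(in_ch, hidden, num_layers=2,
                            batch_first=True, bidirectional=True)

    def forward(self, x):          # x: [B, L, in_ch] IQ sequence
        _, (h, _) = self.lstm(x)   # h: [num_layers * 2, B, hidden]
        # Concatenate the last layer's forward and backward states.
        return torch.cat([h[-2], h[-1]], dim=-1)  # [B, 512]
```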

| Method | Overall Accuracy | Macro-F1 | Kappa | Time (s/Epoch) | Epochs |
|---|---|---|---|---|---|
| Proposed | 0.590 | 0.592 | 0.574 | 190 | 65 |
| MCLDNN | 0.561 | 0.562 | 0.549 | 162 | 50 |
| MCNet | 0.566 | 0.565 | 0.551 | 81 | 75 |
| CGDNet | 0.505 | 0.510 | 0.483 | 203 | 45 |
| ResNet | 0.535 | 0.537 | 0.518 | 132 | 80 |
| IC-AMCNet | 0.582 | 0.584 | 0.566 | 104 | 70 |
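For reference, the overall accuracy, macro-F1, and Cohen's kappa reported in the table follow their standard definitions and can be reproduced with scikit-learn, assuming that is how they were computed here.

```python
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score

def summarize(y_true, y_pred) -> dict:
    """Standard classification metrics matching the table columns."""
    return {
        "overall_accuracy": accuracy_score(y_true, y_pred),
        "macro_f1": f1_score(y_true, y_pred, average="macro"),
        "kappa": cohen_kappa_score(y_true, y_pred),
    }
```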
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).