Research on Cross-Scene Human Activity Recognition Based on Radar and Wi-Fi Multimodal Fusion
Abstract
1. Introduction
- Development of a multimodal spatiotemporal synchronization acquisition system. A joint sensing platform based on hardware synchronization triggering is designed, integrating millimeter-wave radar and Wi-Fi to construct a heterogeneous signal acquisition system with unified spatiotemporal references.
- Construction of a multiscene cross-modal benchmark dataset. Radar and Wi-Fi CSI data for five common behaviors are collected in three typical indoor scenes: office, laboratory, and lounge, providing a multidimensional dataset for cross-modal algorithm research.
- Proposal of a spatiotemporal collaborative dual-stream fusion network. A dual-branch deep neural network is built using radar micro-Doppler time-frequency spectrograms and Wi-Fi CSI phase gradients. A feature alignment module based on attention gating is designed to achieve adaptive fusion of cross-modal spatiotemporal features. The behavior recognition accuracy reaches 94.8% in complex scenarios.
2. Theoretical Basis
2.1. Wi-Fi Perception and Channel State Information Mechanism
2.1.1. Basic Principles of Wi-Fi Communication
2.1.2. Physical Connotation and Mathematical Representation of CSI
2.1.3. CSI Signal Processing Flow
- Outlier removal: the Hampel filter is used to detect anomalies, with a window length set to 20 sampling points.
- Wavelet denoising: the sym4 wavelet basis is used for a 5-level decomposition, with soft thresholding applied to the high-frequency (detail) coefficients.
- Phase unwrapping: a linear transformation is applied to eliminate phase jumps.
- Time-frequency analysis: time-frequency features are extracted using Continuous Wavelet Transform (CWT), with the Morlet function selected as the mother wavelet.
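The outlier-removal and phase steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the 3-sigma Hampel threshold, the function names, and the reading of the "linear transformation" as removing a best-fit line from the unwrapped phase are all assumptions. Only the window length of 20 samples comes from the text. Wavelet denoising and the CWT are left out because they need a wavelet library.

```python
import numpy as np

def hampel_filter(x, window=20, n_sigmas=3.0):
    """Replace outliers with the local median (Hampel identifier)."""
    x = np.asarray(x, dtype=float)
    y = x.copy()
    half = window // 2
    k = 1.4826  # scales the MAD to a standard-deviation estimate for Gaussian data
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        med = np.median(x[lo:hi])
        mad = k * np.median(np.abs(x[lo:hi] - med))
        if np.abs(x[i] - med) > n_sigmas * mad:
            y[i] = med
    return y

def calibrate_phase(phase):
    """Unwrap phase across subcarriers, then subtract the best-fit line (assumed calibration)."""
    unwrapped = np.unwrap(phase)
    idx = np.arange(len(unwrapped))
    slope, intercept = np.polyfit(idx, unwrapped, 1)
    return unwrapped - (slope * idx + intercept)

# Synthetic CSI amplitude: slow sinusoid with one injected spike at sample 100
t = np.arange(200)
amplitude = np.sin(2 * np.pi * t / 50)
amplitude[100] += 10.0
cleaned = hampel_filter(amplitude, window=20)

# Synthetic wrapped phase with a linear offset across 30 subcarriers
true_phase = 0.8 * np.arange(30) + 0.3
wrapped = np.angle(np.exp(1j * true_phase))
residual = calibrate_phase(wrapped)
```

On the synthetic data the spike at sample 100 is replaced by the local median, and the calibrated phase is flat because the injected phase was purely linear.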
2.1.4. Impact Mechanism of Multipath Effects on Behavior Recognition
2.2. Millimeter-Wave Radar Perception and Micro-Doppler Effect
2.2.1. Working Principle of Frequency-Modulated Continuous Wave Radar
2.2.2. Micro-Doppler Effect Mechanism
2.2.3. Radar Signal Processing Flow
- Range dimension FFT: perform a 2048-point FFT for each frequency sweep cycle, achieving range resolution .
- Doppler dimension FFT: perform FFT on 64 consecutive frequency sweep cycles, achieving velocity resolution .
- Constant False Alarm Rate (CFAR) detection: use the greatest-of cell-averaging (GO-CFAR) algorithm to suppress background noise.
- Phase compensation: correct phase errors based on target tracking results.
- Time-frequency map generation: perform Short-Time Fourier Transform (STFT) on a 2-s data segment to generate a 256 × 256 pixel time-frequency map.
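As a rough illustration of the first three steps in this chain, the sketch below builds a range-Doppler map with a 2048-point range FFT over 64 chirps and then runs a one-dimensional GO-CFAR along the range axis. The FFT sizes come from the text. Everything else (1024 ADC samples per chirp, the synthetic target, the training and guard cell counts, the threshold scale) is a hypothetical choice made only so the example runs. Phase compensation and STFT map generation are not covered.

```python
import numpy as np

N_RANGE_FFT = 2048   # range FFT length, from the text
N_CHIRPS = 64        # chirps per Doppler frame, from the text

def range_doppler_map(frame):
    """frame: complex array (n_chirps, n_samples) of beat-signal samples."""
    rng_fft = np.fft.fft(frame, n=N_RANGE_FFT, axis=1)             # range FFT per chirp
    rd = np.fft.fftshift(np.fft.fft(rng_fft, axis=0), axes=0)      # Doppler FFT across chirps
    return np.abs(rd) ** 2

def go_cfar_1d(power, n_train=8, n_guard=2, scale=8.0):
    """Greatest-of cell-averaging CFAR along a 1-D power profile."""
    power = np.asarray(power, dtype=float)
    hits = np.zeros(len(power), dtype=bool)
    for i in range(n_train + n_guard, len(power) - n_train - n_guard):
        lead = power[i - n_guard - n_train : i - n_guard].mean()
        lag = power[i + n_guard + 1 : i + n_guard + 1 + n_train].mean()
        # Threshold uses the larger of the two training-window averages
        hits[i] = power[i] > scale * max(lead, lag)
    return hits

# Synthetic frame: one target at range bin 300 and Doppler bin +10, plus noise
rs = np.random.default_rng(0)
n_samples = 1024  # hypothetical number of ADC samples per chirp
n = np.arange(n_samples)
m = np.arange(N_CHIRPS)[:, None]
signal = np.exp(2j * np.pi * (300 * n / N_RANGE_FFT + 10 * m / N_CHIRPS))
noise = 0.1 * (rs.standard_normal((N_CHIRPS, n_samples))
               + 1j * rs.standard_normal((N_CHIRPS, n_samples)))

rd = range_doppler_map(signal + noise)
doppler_bin, range_bin = np.unravel_index(np.argmax(rd), rd.shape)
detections = go_cfar_1d(rd[doppler_bin])
```

After `fftshift`, zero Doppler sits at index 32, so the synthetic target at Doppler bin +10 appears at row 42 and range column 300.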
3. Related Work
3.1. Radar-Based Cross-Scene Human Activity Recognition
3.2. Wi-Fi Signal-Based Human Activity Recognition
3.3. Multimodal Fusion for Robust HAR
4. Multimodal Fusion Method Design
4.1. System Architecture Overview
- Front-end heterogeneous signal analysis: the radar signal is processed using STFT to generate time-frequency spectrograms, and Wi-Fi CSI undergoes phase calibration to eliminate carrier frequency offsets.
- Dual-branch feature extraction: the radar branch uses an improved ResNet-50 to extract spatial-frequency domain features, while the Wi-Fi branch uses LSTM to capture temporal dynamic patterns.
- Prior-guided feature fusion: attention-weighted fusion based on Wi-Fi semantic constraints, combined with domain adversarial training, enhances cross-domain generalization.
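The attention-weighted fusion step can be pictured as a learned gate that decides, per feature, how much to trust each branch. The sketch below is an assumption-laden NumPy illustration rather than the paper's module: the gate form `sigmoid([radar; wifi] @ W + b)`, the shared feature dimension of 8, and the random weights are all hypothetical, and the ResNet-50 and LSTM encoders are replaced by random embeddings.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_fusion(radar_feat, wifi_feat, W, b):
    """Fuse two feature batches with an element-wise attention gate.

    gate  = sigmoid([radar; wifi] @ W + b)
    fused = gate * radar + (1 - gate) * wifi
    """
    joint = np.concatenate([radar_feat, wifi_feat], axis=-1)
    gate = sigmoid(joint @ W + b)
    fused = gate * radar_feat + (1.0 - gate) * wifi_feat
    return fused, gate

rs = np.random.default_rng(0)
d = 8                                      # hypothetical shared feature dimension
radar_feat = rs.standard_normal((4, d))    # stand-in for radar-branch embeddings
wifi_feat = rs.standard_normal((4, d))     # stand-in for Wi-Fi-branch embeddings
W = 0.1 * rs.standard_normal((2 * d, d))   # gate weights (would be learned)
b = np.zeros(d)

fused, gate = gated_fusion(radar_feat, wifi_feat, W, b)
```

Because the gate lies strictly between 0 and 1, each fused feature is a convex combination of the radar and Wi-Fi features, so a degraded modality can be down-weighted without being discarded.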
4.2. Multimodal Data Collection and Synchronization
4.2.1. Radar Time-Frequency Feature Construction
4.2.2. Wi-Fi CSI Dynamic Calibration
4.2.3. Multimodal Synchronization
- (1) Hardware Synchronization Layer
- (2) Software Alignment Algorithm
4.3. Feature Extraction and Adaptive Encoding
4.3.1. Radar Spatiotemporal Feature Extraction
4.3.2. Wi-Fi Behavior Semantic Encoding
4.4. Prior Knowledge Guided Decision Fusion
4.4.1. Environmental Perception Feature Modulation
4.4.2. Adversarial Domain-Invariant Fusion
5. Experiments and Results Analysis
5.1. Dataset Construction
5.2. Multimodal Data Collection and Processing
5.3. Evaluation Protocol
5.3.1. Definition of Evaluation Metrics
- 1. Metrics
- 2. Confusion Matrix
- 3. Statistical Significance Test
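For the first two items, a confusion matrix and the scores derived from it can be computed as follows. This excerpt does not state which metrics the paper uses, so overall accuracy and macro-averaged F1 are assumed here, and the labels are made-up toy data for five classes. The significance test is not sketched.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def accuracy_and_macro_f1(cm):
    tp = np.diag(cm).astype(float)
    precision = tp / np.maximum(cm.sum(axis=0), 1)   # guard against empty columns
    recall = tp / np.maximum(cm.sum(axis=1), 1)      # guard against empty rows
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    return tp.sum() / cm.sum(), f1.mean()

# Toy labels for five activity classes (hypothetical data)
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4])
y_pred = np.array([0, 0, 1, 2, 2, 2, 3, 3, 4, 0])

cm = confusion_matrix(y_true, y_pred, n_classes=5)
acc, macro_f1 = accuracy_and_macro_f1(cm)
```

In this toy case 8 of 10 samples are correct, so `acc` is 0.8, while the macro F1 is slightly lower because classes 1 and 4 each lose one sample to another class.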
5.3.2. Data Splitting & Validation
5.3.3. Implementation Details
5.4. Model Architecture
5.5. Experiment Results Analysis
- Radar physical constraints: Doppler-based velocity perception generates false components in metal-furnished environments (e.g., offices). Multipath interference increases time-frequency spectrogram entropy, raising the fall detection error rate to 21.3%. This results from aliasing between target reflections and multipath components in NLoS conditions.
- Wi-Fi phase vulnerability: sudden movements (e.g., falls) cause CSI phase gradients exceeding π/2, inducing phase jumps.
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Cardillo, E.; Caddemi, A. A review on biomedical MIMO radars for vital sign detection and human localization. Electronics 2020, 9, 1497.
- Cardillo, E.; Caddemi, A. Radar range-breathing separation for the automatic detection of humans in cluttered environments. IEEE Sens. J. 2020, 21, 14043–14050.
- Jung, J.; Lim, S.; Kim, B.K.; Lee, S. CNN-Based Driver Monitoring Using Millimeter-Wave Radar Sensor. IEEE Sens. Lett. 2021, 5, 3500404.
- Qiao, X.; Feng, Y.; Shan, T.; Tao, R. Person identification with low training sample based on micro-doppler signatures separation. IEEE Sens. J. 2022, 22, 8846–8857.
- Ding, E.; Zhang, Y.; Xin, Y.; Zhang, L.; Huo, Y.; Liu, Y. A robust and device-free daily activities recognition system using Wi-Fi signals. KSII Trans. Internet Inf. Syst. (TIIS) 2020, 14, 2377–2397.
- Björklund, S.; Petersson, H.; Nezirovic, A.; Guldogan, M.B.; Gustafsson, F. Millimeter-wave radar micro-Doppler signatures of human motion. In Proceedings of the 2011 12th International Radar Symposium (IRS), Leipzig, Germany, 7–9 September 2011; pp. 167–174.
- Yao, Y.; Liu, C.; Zhang, H.; Yan, B.; Jian, P.; Wang, P.; Du, L.; Chen, X.; Han, B.; Fang, Z. Fall detection system using millimeter-wave radar based on neural network and information fusion. IEEE Internet Things J. 2022, 9, 21038–21050.
- Moghaddam, M.G.; Shirehjini, A.A.N.; Shirmohammadi, S. A wifi-based method for recognizing fine-grained multiple-subject human activities. IEEE Trans. Instrum. Meas. 2023, 72, 2520313.
- Ma, Y.; Zhou, G.; Wang, S. WiFi sensing with channel state information: A survey. ACM Comput. Surv. (CSUR) 2019, 52, 1–36.
- Wang, Y.; Wu, K.; Ni, L.M. Wifall: Device-free fall detection by wireless networks. IEEE Trans. Mob. Comput. 2016, 16, 581–594.
- Zhang, J.; Tang, Z.; Li, M.; Fang, D.; Nurmi, P.; Wang, Z. CrossSense: Towards cross-site and large-scale WiFi sensing. In Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, New Delhi, India, 29 October–2 November 2018; pp. 305–320.
- Yousefi, S.; Narui, H.; Dayal, S.; Ermon, S.; Valaee, S. A survey on behavior recognition using WiFi channel state information. IEEE Commun. Mag. 2017, 55, 98–104.
- Ratnam, V.V.; Chen, H.; Chang, H.-H.; Sehgal, A.; Zhang, J. Optimal preprocessing of WiFi CSI for sensing applications. IEEE Trans. Wirel. Commun. 2024, 23, 10820–10833.
- Wang, F.; Song, Y.; Zhang, J.; Han, J.; Huang, D. Temporal unet: Sample level human action recognition using wifi. arXiv 2019, arXiv:1904.11953.
- Cardillo, E.; Li, C.; Caddemi, A. Radar-based monitoring of the worker activities by exploiting range-Doppler and micro-Doppler signatures. In Proceedings of the 2021 IEEE International Workshop on Metrology for Industry 4.0 & IoT (MetroInd4.0&IoT), Rome, Italy, 7–9 June 2021; pp. 412–416.
- Li, X.; Qiu, Y.; Deng, Z.; Liu, X.; Huang, X. Lightweight Multi-Attention Enhanced Fusion Network for Omnidirectional Human Activity Recognition with FMCW Radar. IEEE Internet Things J. 2024, 12, 5755–5768.
- Sun, C.; Wang, S.; Lin, Y. Omnidirectional Human Behavior Recognition Method Based on Frequency-Modulated Continuous-Wave Radar. J. Shanghai Jiaotong Univ. (Sci.) 2024, 1–9.
- Chen, V.C.; Tahmoush, D.; Miceli, W.J. Radar Micro-Doppler Signatures; IET: London, UK, 2014.
- Yu, C.; Xu, Z.; Yan, K.; Chien, Y.-R.; Fang, S.-H.; Wu, H.-C. Noninvasive human activity recognition using millimeter-wave radar. IEEE Syst. J. 2022, 16, 3036–3047.
- Jiang, W.; Miao, C.; Ma, F.; Yao, S.; Wang, Y.; Yuan, Y.; Xue, H.; Song, C.; Ma, X.; Koutsonikolas, D. Towards environment independent device free human activity recognition. In Proceedings of the 24th Annual International Conference on Mobile Computing and Networking, New Delhi, India, 29 October–2 November 2018; pp. 289–304.
- Liang, J.; Hu, D.; Feng, J.; He, R. DINE: Domain Adaptation from Single and Multiple Black-box Predictors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022, New Orleans, LA, USA, 18–24 June 2022; pp. 8003–8013.
- Liu, X.; Yoo, C.; Xing, F.; Oh, H.; El Fakhri, G.; Kang, J.-W.; Woo, J. Deep unsupervised domain adaptation: A review of recent advances and perspectives. APSIPA Trans. Signal Inf. Process. 2022, 11, e25.
- Wilson, G.; Cook, D.J. A survey of unsupervised deep domain adaptation. ACM Trans. Intell. Syst. Technol. (TIST) 2020, 11, 1–46.
- Wang, J.; Zhao, Y.; Ma, X.; Gao, Q.; Pan, M.; Wang, H. Cross-Scenario Device-Free Activity Recognition Based on Deep Adversarial Networks. IEEE Trans. Veh. Technol. 2020, 69, 5416–5425.
- Liang, J.; Hu, D.; Feng, J. Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation. In Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, 13–18 July 2020; pp. 6028–6039.
- Mosharaf, M.; Kwak, J.B.; Choi, W. WiFi-Based Human Identification with Machine Learning: A Comprehensive Survey. Sensors 2024, 24, 6413.
- Chen, J.; Huang, X.; Jiang, H.; Miao, X. Low-cost and device-free human activity recognition based on hierarchical learning model. Sensors 2021, 21, 2359.
- Ding, X.; Jiang, T.; Zhong, Y.; Huang, Y.; Li, Z. Wi-Fi-based location-independent human activity recognition via meta learning. Sensors 2021, 21, 2654.
- Hao, Z.; Niu, J.; Dang, X.; Qiao, Z. WiPg: Contactless action recognition using ambient wi-fi signals. Sensors 2022, 22, 402.
- Wang, F.; Li, Z.; Han, J. Continuous user authentication by contactless wireless sensing. IEEE Internet Things J. 2019, 6, 8323–8331.
- Wei, Z.; Zhang, F.; Chang, S.; Liu, Y.; Wu, H.; Feng, Z. Mmwave radar and vision fusion for object detection in autonomous driving: A review. Sensors 2022, 22, 2542.
- Huang, X.; Tsoi, J.K.; Patel, N. mmWave radar sensors fusion for indoor object detection and tracking. Electronics 2022, 11, 2209.
- Theckedath, D.; Sedamkar, R. Detecting affect states using VGG16, ResNet50 and SE-ResNet50 networks. SN Comput. Sci. 2020, 1, 79.
- Yu, C.; Wang, J.; Chen, Y.; Huang, M. Transfer Learning with Dynamic Adversarial Adaptation Network. In Proceedings of the 2019 IEEE International Conference on Data Mining (ICDM), Beijing, China, 8–11 November 2019; pp. 778–786.
- Zhu, Y.; Zhuang, F.; Wang, J.; Ke, G.; He, Q. Deep Subdomain Adaptation Network for Image Classification. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 1713–1722.
- Cui, S.; Wang, S.; Zhuo, J.; Li, L.; Tian, Q. Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 3941–3950.
Step | Operation | Description |
---|---|---|
Step 1 | Constructing Dual-Modal Temporal Association Matrix | where and are local feature blocks of the time-frequency spectrogram. The weight coefficient is dynamically adjusted by mutual information entropy. |
Step 2 | Hierarchical Path Optimization | The optimal path is solved using dynamic programming: where is a regularization term used to suppress excessive path curvature, and sgn(·) is the sign function. |
Step 3 | Subsampling-Level Compensation | Sub-sample-level compensation is performed using cubic spline interpolation. |
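The dynamic-programming path search in Step 2 resembles dynamic time warping (DTW), which can be sketched as below. The paper's actual association matrix and regularization term did not survive in this text, so the sketch substitutes an absolute-difference cost between two 1-D sequences and a constant `step_penalty` on non-diagonal moves as a stand-in for the curvature regularizer. Both are assumptions, as are the function name and the synthetic signals. Cubic spline compensation (Step 3) is not included.

```python
import numpy as np

def dtw_path(a, b, step_penalty=0.0):
    """Align 1-D sequences a and b.

    Returns (total cost including penalties, list of aligned index pairs).
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j - 1],
                                 D[i - 1, j] + step_penalty,
                                 D[i, j - 1] + step_penalty)
    # Backtrack from the end, preferring the diagonal move on ties
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [(D[i - 1, j - 1], i - 1, j - 1),
                      (D[i - 1, j] + step_penalty, i - 1, j),
                      (D[i, j - 1] + step_penalty, i, j - 1)]
        _, i, j = min(candidates, key=lambda c: c[0])
        path.append((i - 1, j - 1))
    return D[n, m], path[::-1]

# Synthetic streams: b is a copy of a delayed by three samples
a = np.sin(np.linspace(0, 2 * np.pi, 50))
b = np.concatenate([np.full(3, a[0]), a[:-3]])

cost_self, path_self = dtw_path(a, a)
cost_shift, path_shift = dtw_path(a, b)
naive_cost = np.abs(a - b).sum()
```

Aligning a sequence with itself returns zero cost along the diagonal, and the warped cost for the delayed copy is far below the sample-by-sample difference. Raising `step_penalty` pushes the path toward the diagonal, which is the role the text assigns to the regularization term.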
Transfer Direction | Radar Only | Wi-Fi Only | DAAN [34] | DSAN [35] | BNM [36] | Proposed Method |
---|---|---|---|---|---|---|
Lab → Lounge | 72.3 ± 1.5 | 65.1 ± 2.1 | 78.2 ± 1.3 | 81.5 ± 1.0 | 83.7 ± 0.9 | 95.6 ± 0.6 |
Lab → Office | 68.9 ± 1.8 | 60.5 ± 2.4 | 74.8 ± 1.6 | 78.1 ± 1.2 | 80.2 ± 1.1 | 94.2 ± 0.7 |
Office → Lounge | 70.4 ± 1.6 | 62.8 ± 2.3 | 76.5 ± 1.4 | 79.3 ± 1.1 | 81.9 ± 0.8 | 93.8 ± 0.5 |
Office → Lab | 75.1 ± 1.2 | 67.9 ± 1.9 | 80.6 ± 1.0 | 83.4 ± 0.8 | 85.0 ± 0.7 | 96.3 ± 0.4 |
Lounge → Office | 64.7 ± 2.0 | 57.2 ± 2.7 | 70.3 ± 1.7 | 73.8 ± 1.4 | 76.1 ± 1.2 | 92.1 ± 0.8 |
Lounge → Lab | 76.8 ± 1.1 | 69.5 ± 1.8 | 82.1 ± 0.9 | 84.6 ± 0.6 | 86.3 ± 0.5 | 96.9 ± 0.3 |
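Averaging each column of the table gives a compact per-method summary. The snippet below copies the mean values from the six rows above (the ± terms are dropped) and treats them as accuracies in percent, an interpretation supported by the fact that the Proposed Method column averages to the 94.8% quoted in the introduction. Variable names are illustrative.

```python
import numpy as np

methods = ["Radar Only", "Wi-Fi Only", "DAAN", "DSAN", "BNM", "Proposed Method"]
# Rows: Lab→Lounge, Lab→Office, Office→Lounge, Office→Lab, Lounge→Office, Lounge→Lab
acc = np.array([
    [72.3, 65.1, 78.2, 81.5, 83.7, 95.6],
    [68.9, 60.5, 74.8, 78.1, 80.2, 94.2],
    [70.4, 62.8, 76.5, 79.3, 81.9, 93.8],
    [75.1, 67.9, 80.6, 83.4, 85.0, 96.3],
    [64.7, 57.2, 70.3, 73.8, 76.1, 92.1],
    [76.8, 69.5, 82.1, 84.6, 86.3, 96.9],
])

# Mean over the six transfer directions, rounded to one decimal place
mean_acc = dict(zip(methods, acc.mean(axis=0).round(1)))
gain_over_bnm = round(float(mean_acc["Proposed Method"] - mean_acc["BNM"]), 1)
hardest = int(np.argmin(acc[:, -1]))  # row index of the lowest Proposed Method score

print(mean_acc)
print(gain_over_bnm)
```

The column means are 71.4, 63.8, 77.1, 80.1, 82.2, and 94.8, so the proposed method leads the strongest baseline (BNM) by 12.6 points on average. Lounge → Office is the hardest direction for every method.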
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chen, Z.; Sun, Y.; Qu, L. Research on Cross-Scene Human Activity Recognition Based on Radar and Wi-Fi Multimodal Fusion. Electronics 2025, 14, 1518. https://doi.org/10.3390/electronics14081518