Radar-Based Gesture Recognition Using Adaptive Top-K Selection and Multi-Stream CNNs
Abstract
1. Introduction
2. Related Works and Problem Statement
2.1. Related Works
2.2. Dataset and Problem Statement
3. Proposed Methodology
3.1. Range–Doppler Image Generation
3.2. Adaptive Top-K Selection Based RTM and DTM Generation
3.2.1. Compression Methods from RDI to the 1D Vector
- (a) Summation across rows or columns: This method utilizes all available information and is robust to noise. However, strong peaks may be diluted by background energy, leading to blurred features.
- (b) Maximum extraction from each row or column: This approach emphasizes the strongest peaks and yields clearer features, but it may incorrectly highlight noise peaks or body-related reflections.
- (c) Slicing at the maximum peak position: This method suffers from severe information loss, and if the maximum peak originates from noise rather than the gesture, feature distortion may occur.
- (d) Top-K summation after sorting by magnitude: This method preserves information around strong reflectors and mitigates noise effects. However, the balance between information preservation and noise suppression depends on the choice of K.
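The four compression options can be compared side by side with a short NumPy sketch. This is an illustration under assumptions rather than the authors' implementation: the function name `compress_rdi`, its arguments, and the default `k=10` are hypothetical, and the RDI is assumed to be a 2D array whose magnitude is compressed along one axis.

```python
import numpy as np

def compress_rdi(rdi, axis, method="topk", k=10):
    """Collapse a 2D RDI along `axis` into a 1D vector (illustrative sketch)."""
    mag = np.abs(rdi)
    n = mag.shape[axis]
    if method == "sum":      # (a) summation across rows or columns
        return mag.sum(axis=axis)
    if method == "max":      # (b) maximum extraction from each row or column
        return mag.max(axis=axis)
    if method == "slice":    # (c) slice through the position of the global peak
        peak = np.unravel_index(np.argmax(mag), mag.shape)
        return np.take(mag, peak[axis], axis=axis)
    if method == "topk":     # (d) sum of the K largest magnitudes
        k = max(1, min(k, n))
        ordered = np.sort(mag, axis=axis)  # ascending along `axis`
        top = np.take(ordered, np.arange(n - k, n), axis=axis)
        return top.sum(axis=axis)
    raise ValueError(f"unknown method: {method}")

# Toy 2x3 "RDI": compress along axis 1 to get one value per row.
toy = np.array([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]])
profiles = {m: compress_rdi(toy, axis=1, method=m, k=2)
            for m in ("sum", "max", "slice", "topk")}
```

With `k` equal to the full axis length, option (d) reduces to option (a), and with `k=1` it reduces to option (b), which is why the choice of K controls the trade-off described in (d).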
3.2.2. Adaptive Top-K Selection Algorithm
- Peaked distribution (low entropy): A larger K is selected to include surrounding components, thereby diluting sidelobe peaks and enhancing the relative emphasis on vectors containing richer information.
- Diffuse distribution (high entropy): A smaller K is chosen to avoid the inclusion of unnecessary zeros, preserving only the core components and maintaining structural clarity.
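The entropy-to-K behaviour described in these two cases can be sketched as follows. The normalized Shannon entropy is standard, but the linear mapping from entropy to K, the bounds `k_min=5` and `k_max=25`, and the name `adaptive_k` are assumptions made for illustration; the exact mapping used in the paper is not reproduced here.

```python
import numpy as np

def adaptive_k(x, k_min=5, k_max=25, eps=1e-12):
    """Pick K from the normalized entropy of a magnitude vector (sketch).

    Low entropy (peaked)  -> K close to k_max.
    High entropy (diffuse) -> K close to k_min.
    k_min, k_max and the linear mapping are hypothetical choices.
    """
    x = np.abs(np.asarray(x, dtype=float))
    p = x / (x.sum() + eps)              # eps prevents division by zero
    h = -np.sum(p * np.log(p + eps))     # Shannon entropy (natural log)
    h_norm = h / np.log(x.size) if x.size > 1 else 0.0
    h_norm = float(np.clip(h_norm, 0.0, 1.0))
    return int(round(k_max - h_norm * (k_max - k_min)))

peaked = np.array([0.0, 0.0, 1.0, 0.0])  # all energy in one bin
diffuse = np.ones(8)                     # energy spread evenly
k_peaked = adaptive_k(peaked)
k_diffuse = adaptive_k(diffuse)
```

In this sketch a single-bin vector yields the largest K and a uniform vector yields the smallest, matching the direction of the two cases above.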
3.2.3. RTM and DTM Generation
3.3. Multi-Stream EfficientNetV2
3.4. Radar-Specific Data Augmentation
1. RTM Vertical Shift: The RTM is randomly shifted along the range axis (vertical direction), with the shift magnitude defined in pixels. Empty regions created by the shift are filled with zeros, and the same shift is applied to all three receiving channels. This simulates changes in the absolute distance between the sensor and the user.
2. RTM and DTM Horizontal Stretch: The RTM and DTM are scaled along the time axis (horizontal direction) using a random scaling factor. The transformation is centered, and empty regions are zero-padded. The same scaling is applied to all six RTM/DTM images. This simulates variations in gesture repetition cycles and overall motion speed.
3. DTM Vertical Stretch: The DTM is scaled along the Doppler axis (vertical direction) using a random scaling factor. The transformation is centered, and empty regions are zero-padded. The same scaling is applied to all three receiving channels. This reflects instantaneous variations in gesture speed.
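The three augmentations can be sketched with plain NumPy. The shift limit `max_shift=4`, the scaling range `(0.8, 1.2)`, the linear interpolation, and all function names are hypothetical placeholders, since the actual ranges are not given here. Images are assumed to be stored as `(3, height, width)` arrays, one per receiving channel, with the vertical axis first.

```python
import numpy as np

def vertical_shift(img, shift):
    """Shift rows by `shift` pixels (positive = down), zero-filling the gap."""
    h = img.shape[0]
    shift = int(np.clip(shift, -h, h))
    out = np.zeros_like(img)
    if shift >= 0:
        out[shift:] = img[:h - shift]
    else:
        out[:h + shift] = img[-shift:]
    return out

def centered_stretch(img, factor, axis):
    """Scale by `factor` along `axis` about the centre, zero-padding outside."""
    n = img.shape[axis]
    centre = (n - 1) / 2.0
    src = (np.arange(n) - centre) / factor + centre  # output index -> source coordinate
    moved = np.moveaxis(img, axis, -1).astype(float)
    rows = moved.reshape(-1, n)
    out = np.stack([np.interp(src, np.arange(n), r, left=0.0, right=0.0)
                    for r in rows])
    return np.moveaxis(out.reshape(moved.shape), -1, axis)

def augment(rtm, dtm, rng, max_shift=4, scale_range=(0.8, 1.2)):
    """Draw one set of random parameters and apply it to every channel."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    s_time = rng.uniform(*scale_range)   # shared by all six RTM/DTM images
    s_dopp = rng.uniform(*scale_range)   # shared by the three DTM channels
    rtm_aug = np.stack([centered_stretch(vertical_shift(c, shift), s_time, axis=1)
                        for c in rtm])
    dtm_aug = np.stack([centered_stretch(centered_stretch(c, s_time, axis=1),
                                         s_dopp, axis=0) for c in dtm])
    return rtm_aug, dtm_aug

rng = np.random.default_rng(0)
rtm_aug, dtm_aug = augment(np.ones((3, 16, 20)), np.ones((3, 16, 20)), rng)
```

Drawing the random parameters once per sample, outside the per-channel loop, is what keeps the three receiving channels mutually consistent after augmentation.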
4. Experiments
4.1. Experimental Setup
4.2. Training and Evaluation
4.3. Performance Optimization
5. Discussion and Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
1D FFT | One-Dimensional Fast Fourier Transform |
2D CNN | Two-Dimensional Convolutional Neural Network |
3D CNN | Three-Dimensional Convolutional Neural Network |
ADC | Analog-to-Digital Converter |
BN | Batch Normalization |
CA-CFAR | Cell Averaging Constant False Alarm Rate |
CUT | Cell Under Test |
DOA | Direction-of-Arrival |
DTM | Doppler-Time Map |
GFLOPs | Giga Floating-Point Operations |
FMCW | Frequency-Modulated Continuous-Wave |
HCI | Human–Computer Interaction |
HMI | Human–Machine Interface |
IF | Intermediate Frequency |
IoT | Internet of Things |
LSTM | Long Short-Term Memory |
MTI | Moving Target Indicator |
MUSIC | Multiple Signal Classification |
MVDR | Minimum Variance Distortionless Response |
RDI | Range–Doppler Image |
ReLU | Rectified Linear Unit |
RGB | Red, Green, Blue |
RNN | Recurrent Neural Network |
RTM | Range-Time Map |
STFT | Short-Time Fourier Transform |
UWB | Ultra-Wideband |
XGB | Extreme Gradient Boosting |
Parameter | Value |
---|---|
Frequency band | 57.5–64.5 GHz |
Sampling rate | 1 MHz |
Chirp repetition time | 300 µs |
Number of samples | 128 |
Number of chirps | 600 |
Range resolution | 2.1 cm |
Maximum unambiguous range | 1.4 m |
Velocity resolution | 1.5 cm/s |
Maximum unambiguous velocity | 4.3 m/s |
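The derived rows of this table can be approximately reproduced from the primary parameters with the standard FMCW relations. The sketch below rests on several assumptions that are not stated in the table: the chirp repetition time is taken as 300 µs, the wavelength is evaluated at the lower band edge, and the samples are assumed real-valued so that only half of the range bins are unambiguous. Variable names are illustrative.

```python
C = 299_792_458.0              # speed of light [m/s]
f_lo, f_hi = 57.5e9, 64.5e9    # frequency band [Hz]
n_samples, n_chirps = 128, 600
t_chirp = 300e-6               # chirp repetition time [s], assumed to be 300 µs

bandwidth = f_hi - f_lo
range_res = C / (2 * bandwidth)            # range resolution [m]
max_range = range_res * n_samples / 2      # real sampling: N/2 usable range bins

wavelength = C / f_lo                      # assumption: lower band edge
v_max = wavelength / (4 * t_chirp)         # maximum unambiguous velocity [m/s]
v_res = wavelength / (2 * n_chirps * t_chirp)  # velocity resolution [m/s]

print(f"range resolution  : {range_res * 100:.2f} cm")
print(f"max range         : {max_range:.2f} m")
print(f"max velocity      : {v_max:.2f} m/s")
print(f"velocity resolution: {v_res * 100:.2f} cm/s")
```

Under these assumptions the computed values land close to the tabulated ones; small differences are expected from rounding and from the choice of reference frequency for the wavelength.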
Step | Description |
---|---|
Input | Vector , , |
Output | Adaptive K value |
Parameter | : a small positive constant to prevent division by zero |
(1) | Compute for . |
(2) | Compute entropy: . |
(3) | Normalize entropy: . |
(4) | Compute . |
end |
Evaluation Accuracy [%] | (a) Total Sum | (b) Maximum Extraction | (c) Peak Slicing | (d) Top-5 Sum | (d) Top-10 Sum | (d) Top-15 Sum | (d) Top-20 Sum | (d) Top-25 Sum | (e) Adaptive Top-K Sum |
---|---|---|---|---|---|---|---|---|---|
Highest | 96.38 | 97.06 | 92.99 | 97.74 | 98.19 | 97.74 | 97.51 | 96.61 | 98.87 |
Lowest | 95.70 | 96.38 | 90.27 | 96.83 | 97.74 | 96.15 | 96.15 | 95.70 | 98.19 |
Average | 96.06 | 96.65 | 91.36 | 97.29 | 97.96 | 96.97 | 96.79 | 96.20 | 98.60 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Park, J.; Jeong, J. Radar-Based Gesture Recognition Using Adaptive Top-K Selection and Multi-Stream CNNs. Sensors 2025, 25, 6324. https://doi.org/10.3390/s25206324