Article

Adaptive Feedback-Driven Segmentation for Continuous Multi-Label Human Activity Recognition

by Nasreddine Belbekri and Wenguang Wang *
School of Electronic Information Engineering, Beihang University, Beijing 100191, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(6), 2905; https://doi.org/10.3390/app15062905
Submission received: 12 February 2025 / Revised: 3 March 2025 / Accepted: 4 March 2025 / Published: 7 March 2025

Abstract: Radar-based continuous human activity recognition (HAR) in realistic scenarios faces challenges in segmenting and classifying overlapping or concurrent activities. This paper introduces a feedback-driven adaptive segmentation framework for multi-label classification in continuous HAR, leveraging Bayesian optimization (BO) and reinforcement learning (RL) to dynamically adjust segmentation parameters such as segment length and overlap in the data stream, optimizing them based on performance metrics such as accuracy and F1-score. Using a public dataset of continuous human activities, the method trains ResNet18 models on spectrogram, range-Doppler, and range-time representations from a 20% computational subset, then scales the optimized parameters to the full dataset. A comparative analysis against fixed-segmentation baselines shows significant improvements in classification performance, confirming the potential of adaptive segmentation techniques in enhancing the accuracy and efficiency of continuous multi-label HAR systems.

1. Introduction

Human activity recognition (HAR) has emerged as a cornerstone of modern intelligent systems, enabling applications in healthcare, smart homes, and sports analytics. In healthcare [1], HAR systems monitor patients’ daily activities to detect early signs of cognitive decline or physical deterioration [2], facilitating timely interventions. In smart homes, HAR enhances user experience by adapting environments to residents’ behaviors, such as adjusting lighting or temperature. Similarly, in sports analytics, HAR tracks athletes’ movements to optimize performance and prevent injuries.
HAR systems rely on diverse sensors [3,4], including cameras, LiDAR, wearable devices, and radar, each with distinct trade-offs. Cameras provide rich visual data but raise privacy concerns and struggle with occlusions or poor lighting. Wearables like accelerometers capture precise motion data but are intrusive for long-term use. LiDAR offers spatial precision but remains cost-prohibitive for widespread deployment.
Radar has recently emerged as a promising alternative, combining privacy preservation, robustness to lighting/occlusion, and continuous fine-grained motion capture. Early radar-based systems focused on human detection for security [5], but advances in radar technologies and signal processing have expanded applications to healthcare (e.g., vital sign monitoring) and smart homes (e.g., fall detection). Despite these advances, radar-based HAR remains challenging due to complex activity patterns.
A critical ongoing challenge in HAR is accurately identifying continuous activities that mirror real-world human behavior [6]. Although some HAR methods, such as those based on RNNs or LSTMs [7,8], can operate on continuous data streams without explicit segmentation, they are primarily effective for single-activity classification and typically handle only time-dependent signals (e.g., time-range and spectrogram data). In contrast, radar data offer additional representations, such as range-Doppler plots, which are independent of time but can provide details that the spectrogram cannot.
To leverage the benefits of all these features and overcome these limitations, some researchers have advocated for multi-label classification as a strategy to approach the complexity of continuous HAR [9]. However, multi-label classification for continuous data inherently requires segmentation, which remains indispensable for handling such streams effectively. Existing research has employed fixed window segmentation with overlap to manage continuous data streams by dividing them into manageable windows that capture one or multiple activities and the transitions between them. A significant hurdle remains in determining the optimal way to segment data to maximize performance. Fixed window segmentation imposes arbitrary boundaries on continuous data streams, making it crucial to develop a more adaptive segmentation approach that better captures the natural flow of human activities and the transitions between them for multi-label HAR.
To address these limitations, our paper proposes the following:
  • Feedback-driven segmentation framework: A system that dynamically optimizes segmentation parameters (e.g., window length, overlap) using Bayesian optimization (BO) and reinforcement learning (RL). This framework iteratively refines parameters based on performance metrics (accuracy, F1-score), adapting to variable activity durations and overlapping behaviors.
  • Multi-label ResNet18 architecture: A modified ResNet18 architecture tailored for multi-label classification, enhancing accuracy in complex scenarios. This adaptation effectively caters to the nuanced demands of radar-based human activity recognition.
The rest of this paper is organized as follows: Section 2 discusses the evolution of HAR from early techniques to continuous multi-label approaches and segmentation challenges. Section 3 outlines the methodology of our feedback-driven segmentation framework. Section 4 details the experimental setup and results. Finally, Section 5 concludes with a summary of findings and future research directions.

2. Related Work

2.1. Radar-Based HAR

Radar-based HAR has evolved significantly, transitioning from basic motion detection to sophisticated machine learning-driven classification. Initially focused on simple movements, like walking and running, early studies utilized micro-Doppler spectrograms with CNNs for effective classification [10,11]. As the field advanced, the authors of [12] enhanced activity recognition with high-resolution spectrograms, aiming to tap into the full potential of the radar by extracting more diverse movement features. Another study [13] employed micro-Doppler radar data combined with a dual-stream RNN architecture, achieving high accuracy in recognizing human activities and demonstrating significant potential for real-world IoT applications.
Recent innovations have embraced multi-representation fusion—integrating micro-Doppler, range-Doppler, and range-time data to enrich analysis—as in [14,15,16], which fused micro-Doppler and range-time representations. This approach has proven effective in complex scenarios, including healthcare and smart home applications, as demonstrated by [17]. Cutting-edge research now explores three-dimensional radar data representations to capture intricate motion patterns, with [18] employing complex-valued neural networks for raw data processing and [7] proposing hybrid networks for robust multi-domain fusion.
A significant shift in the paradigm is the move towards continuous HAR, analyzing unbounded data streams to better reflect real-world conditions. Pioneering this approach, the authors in [19] introduced an open-source dataset for sequential human activities, employing techniques like SVM and later LSTM networks to handle continuous data [8,20]. This transition from analyzing discrete activities to interpreting continuous human motions has set the stage for real-time, continuous activity recognition in dynamic environments, utilizing advanced signal processing and adaptive learning techniques [21,22].
Expanding beyond gross-motor activities, the authors in [23] applied Bi-LSTM networks to sequential ASL sign recognition, fusing range-Doppler and micro-Doppler data to bridge human–computer interaction gaps. Collectively, these works demonstrate radar’s versatility in handling temporal complexity and fine-grained motion patterns, paving the way for robust, real-world HAR systems.
One recent study utilized distributed radar networks [24], employing a hybrid CNN-RNN classifier for spatio-temporal feature extraction from multiple radars and achieving impressive accuracy on dynamic human activities. Building on this, the authors in [25] enhanced multi-node classification with recurrent networks that accurately capture ongoing human motions, considering both past and future movements.
Another recent study [26] utilized multi-domain fusion vision transformers that integrate data from range-time, range-Doppler, and Doppler-time maps, significantly enhancing accuracy and robustness. Meanwhile, the system in [27] combines multi-domain data fusion with advanced signal processing techniques to offer precise activity segmentation and classification, highlighting radar technology’s capacity to overcome environmental and privacy issues. These developments promise substantial improvements in security and healthcare applications, showcasing the potential of HAR in practical scenarios.
Together, these advancements reflect a significant leap toward practical, privacy-aware HAR systems capable of operating in complex environments like healthcare and smart homes, illustrating the benefits of combining distributed sensing with sophisticated machine learning techniques.

2.2. Multi-Label Classification

Continuous HAR with multi-label classification, which aims to more accurately mirror real-world scenarios, was first introduced by [9] (see Figure 1). This pioneering approach is crucial in radar-based HAR, as it allows for the assignment of multiple activity labels to a single instance of radar data, essential for recognizing continuous or concurrent human motions. In their seminal study, they segmented continuous radar streams into fixed overlapping windows, subsequently applying a classifier to these segmented windows. Utilizing multi-representation inputs (range-time, range-Doppler, and spectrogram), their methodology achieved notable outcomes, including 95.8% accuracy and a 92.08% F1-score on a public dataset, thus demonstrating the efficacy of multi-label classification in managing continuous activities.
In a follow-up study [28], they extended their research by introducing significant technical enhancements. The team employed both signal-level and decision-level fusion techniques to process three distinct data representations. The refined approach greatly enhanced the integration and interpretation of sensor data, maintaining a high accuracy of 95.65% while substantially reducing computational demands. Additionally, the researchers undertook a comparative analysis of two neural network architectures.
Although these authors’ innovations have enhanced continuous HAR with their multi-label classification, challenges remain. Fixed window segmentation is typically predetermined based on known activity durations, limiting adaptability in scenarios where activity durations and transitions vary, potentially leading to suboptimal recognition performance. Additionally, their studies emphasize the ongoing need for improved data stream segmentation methods. In response, our work proposes a feedback-driven segmentation approach as a solution to these issues, aiming to advance HAR precision and adaptability. This method is designed to dynamically adjust segmentation, aligning with the complex nature of human activities in dynamic environments and driving significant progress toward robust HAR systems.

2.3. Segmentation Techniques in HAR

Segmentation is essential in HAR for dividing continuous data streams into precise segments. Traditionally, wearable sensors like accelerometers provided clear, event-triggered data for straightforward boundary detection. However, the shift to radar-based HAR, which captures richer but noisier and more unstructured signals, introduced significant challenges.
  • Dynamic Segmentation: The authors in [29] utilized STA/LTA motion detectors and multi-task learning to enhance radar segmentation, effectively managing activities in mixed-motion scenarios. This built upon earlier methods, like Rényi entropy [30], which improved the detection of transitions in micro-Doppler signatures under noisy conditions.
  • Fixed-Length Windowing: Addressing the limitations of traditional methods in multi-label HAR, refs. [9,28] implemented fixed-length overlapping windows to better handle the classification of multiple concurrent activities. This technique ensures comprehensive coverage of activity boundaries, enabling more effective recognition of complex, overlapping behaviors.
Despite advancements, segmentation in HAR remains challenging. Fixed windows often do not suit varied activity durations or datasets where activities overlap in segments, and traditional methods are not ideal for multi-label classification, which does not require defining the precise start and end of each activity. In response, our paper proposes a feedback-driven segmentation approach tailored for continuous, multi-label HAR. This innovative method dynamically adjusts window sizes based on performance analysis, effectively determining the optimal segment length without prior knowledge of activity durations and offering a scalable solution that more closely reflects the nuances of real-world behavior. Although dynamic segmentation does increase processing time, the performance improvement is substantial, justifying the additional computational cost.

3. Methodology

3.1. Feedback-Driven Adaptive Segmentation

The proposed feedback-driven segmentation system is designed to optimize both computational efficiency and the performance of HAR. The system operates in two distinct phases, as illustrated in Figure 2:
Phase 1: Parameter Optimization
  • A 20% subset of the dataset is used to minimize computational demands while ensuring a diverse representation of activities is maintained.
  • From this subset, three distinct data representations are generated: spectrogram, range-Doppler, and range-time.
  • Individual neural networks are trained on each of these representations to capture modality-specific features.
  • Segmentation parameters, including segment length (L) and overlap (O), are iteratively optimized using BO and RL separately. This optimization is guided by metrics such as accuracy and F1-score [31]. The overlap is set to one-third (≈33%) of the segment length, an optimal value based on [9,28], ensuring that activities occurring at the edges of one window are fully captured in the overlapping portion of the subsequent window.
Phase 2: Full-Scale Application and Result Fusion
  • Once the optimal segmentation parameters are identified, they are applied across the entire dataset using the modified ResNet18 to ensure consistency and robustness in activity segmentation.
  • Classification results from the individual modalities are then fused in a post-processing step. This fusion leverages the strengths of each representation, combining them to produce the final multi-label predictions. This structured approach allows the feedback-driven system to effectively achieve the highest performance.
The cornerstone of our methodology is the feedback-driven adaptive segmentation system, which begins by selecting an initial segmentation configuration tailored to the characteristics of the dataset used. Specifically, the initial segment length is determined based on the data stream duration and the average activity duration present in the dataset, while the overlap ratio is set to one-third (approximately 33%) of the segment length to ensure that activities occurring at segment boundaries are effectively captured. This crucial initial step sets the foundation for the subsequent dynamic adjustments. In the first phase of our methodology, we carry out the following steps:
  • We apply BO and RL separately to dynamically adjust segmentation parameters such as segment length (L) and overlap (O). The effectiveness of these adjustments is then evaluated by comparing the results from both techniques to determine the most effective approach for parameter optimization.
  • The adjustments are governed by a reward system, which is crucial in this phase and evaluates performance metrics including accuracy and F1-score. The reward function used for both optimization techniques is defined as follows:
    Reward = α · Accuracy + β · F1-score
    where α and β are weights determining the relative importance of each metric. In our experiments, these weights were set to α = 0.25 and β = 0.75 to place greater emphasis on the F1-score, which balances precision and recall and is essential for addressing nuances in imbalanced classes, while still accounting for accuracy as a measure of overall correctness. A minimal code sketch of this optimization loop is given below.
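To make the loop concrete, the following Python sketch shows one possible Phase 1 implementation built around Bayesian optimization. It is a minimal illustration under stated assumptions, not the exact code used in this work: scikit-optimize’s gp_minimize stands in for the BO engine, segment_stream and train_and_evaluate are hypothetical placeholders for the windowing and ResNet18 training steps (stubbed here so the sketch runs), and the RL variant would swap the outer optimizer while reusing the same reward.

```python
# Minimal sketch of the feedback-driven Phase 1 loop (assumptions:
# scikit-optimize as the BO engine; segment_stream and train_and_evaluate
# are hypothetical stand-ins for the paper's windowing and training steps).
import numpy as np
from skopt import gp_minimize
from skopt.space import Integer

ALPHA, BETA = 0.25, 0.75   # reward weights used in this work
FS = 1                     # placeholder sample rate (samples per second)
data_stream = np.zeros(120 * FS)  # stand-in for one 120 s radar stream

def segment_stream(stream, seg_len_s, fs=FS):
    """Slice a continuous stream into windows of seg_len_s seconds with
    one-third overlap, i.e., a stride of two-thirds of the window."""
    win = int(seg_len_s * fs)
    stride = max(1, win - win // 3)
    return [stream[s:s + win] for s in range(0, len(stream) - win + 1, stride)]

def train_and_evaluate(segments):
    """Placeholder: in the real system this trains the multi-label ResNet18
    on the 20% subset and returns (accuracy, f1) on held-out segments."""
    return 0.90, 0.85  # stand-in values for illustration only

def objective(params):
    segments = segment_stream(data_stream, params[0])
    accuracy, f1 = train_and_evaluate(segments)
    return -(ALPHA * accuracy + BETA * f1)  # BO minimizes, so negate the reward

result = gp_minimize(objective, [Integer(20, 40, name="L")], n_calls=25,
                     random_state=0)
print("optimal L:", result.x[0], "objective:", result.fun)
```

Negating the reward matches the convention of Figure 6, where the objective function reaches its minimum (−0.88) at the optimal segment length.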

3.2. Multi-Label Classification Using Modified ResNet18

In adapting to the complex demands of continuous HAR, we have modified the ResNet18 architecture, traditionally used for single-label image classification, to support multi-label classification. This adaptation is demonstrated in Figure 3 and is crucial for accurately identifying multiple activities from radar data across three key representations—spectrogram, range-Doppler, and range-time—as shown in Figure 4, each reflecting different aspects of the nuanced realities of human behavior.
  • Final Fully Connected Layer: The original network’s final fully connected layer was replaced with a layer sized to the number of activity labels in our dataset, ensuring that the network’s output dimensionality aligns with the task’s requirements.
  • Softmax Layer: The softmax activation function, which constrains outputs to a probability distribution across classes, is replaced with a sigmoid function. This change allows for independent prediction of each label, which is essential for scenarios where multiple activity types may be present.
  • Classification Output Layer: The standard classification output layer was replaced by a custom binary cross-entropy loss layer. This new layer is tailored to manage the outputs from the sigmoid activation, optimizing the network for scenarios where each instance may belong to multiple classes.
The selection of ResNet18 was informed by its demonstrated efficiency in extracting features from various data types, accommodating both time-specific and general data inputs. This versatility makes ResNet18 an optimal choice for our needs, ensuring robust feature extraction capabilities across a wide range of data.
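As a concrete illustration of the three modifications above, the following PyTorch sketch adapts a stock ResNet18 head for multi-label output. This is a minimal sketch rather than the authors’ exact implementation: torchvision’s resnet18 is assumed as the backbone, num_activities = 9 is taken from the dataset in Section 4.1, and BCEWithLogitsLoss is used as the standard fused sigmoid-plus-binary-cross-entropy formulation.

```python
# Minimal PyTorch sketch of the multi-label ResNet18 head (assumes
# torchvision; num_activities = 9 matches the dataset in Section 4.1).
import torch
import torch.nn as nn
from torchvision.models import resnet18

num_activities = 9
model = resnet18(weights=None)

# 1) Replace the final fully connected layer to match the label set.
model.fc = nn.Linear(model.fc.in_features, num_activities)

# 2) + 3) Replace softmax and cross-entropy with per-label sigmoid and
# binary cross-entropy; BCEWithLogitsLoss fuses the sigmoid into the loss.
criterion = nn.BCEWithLogitsLoss()

logits = model(torch.randn(4, 3, 224, 224))                 # dummy batch
targets = torch.randint(0, 2, (4, num_activities)).float()  # multi-hot labels
loss = criterion(logits, targets)
```

At inference time, torch.sigmoid(logits) yields the independent per-label probabilities that are thresholded and fused as described below.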
Each network outputs a probability vector for all activities (using a threshold set at 0.45, based on preliminary experiments), representing the likelihood of each activity occurring within the segment. In the post-processing step, we fuse the outputs from each individual network by taking the mean of these probability vectors. This fusion method ensures a balanced consideration of each modality’s predictions, enhancing the reliability and accuracy of the final multi-label classification.
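A sketch of this post-processing step, assuming each of the three networks has produced a per-segment probability vector (the 0.45 threshold is the value stated above, and the example vectors are hypothetical):

```python
# Fusion of per-modality probability vectors by averaging, then
# thresholding into a multi-hot prediction (threshold from the text).
import numpy as np

THRESHOLD = 0.45

def fuse_predictions(prob_spectrogram, prob_range_doppler, prob_range_time):
    """Average the three modality outputs and threshold each label."""
    mean_probs = np.mean([prob_spectrogram, prob_range_doppler,
                          prob_range_time], axis=0)
    return (mean_probs >= THRESHOLD).astype(int)

# Hypothetical outputs for one segment over the nine activity classes
p_spec = np.array([0.91, 0.12, 0.58, 0.20, 0.10, 0.05, 0.33, 0.08, 0.15])
p_rd   = np.array([0.84, 0.22, 0.49, 0.14, 0.18, 0.09, 0.41, 0.11, 0.12])
p_rt   = np.array([0.77, 0.15, 0.44, 0.28, 0.09, 0.06, 0.52, 0.17, 0.10])
print(fuse_predictions(p_spec, p_rd, p_rt))  # -> [1 0 1 0 0 0 0 0 0]
```

Averaging before thresholding gives each modality equal weight, so a label survives only if the modalities collectively support it.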
It is important to note, however, that our system is optimized for a predefined set of activities from the dataset used; adapting to new or unknown activities would require additional training data and corresponding model updates. Moreover, the fully connected layer is specifically tailored to the predefined set of activity labels in our dataset. Consequently, if new classes are introduced, the output layer would need to be updated and the model completely retrained to integrate the expanded set of labels.

4. Experimental Setup

4.1. Dataset Used

For our study, we utilized a publicly available dataset [32], which comprises radar measurement data from 15 subjects performing a range of nine different activities. These activities include walking, being stationary (no activity), sitting down, standing up from sitting, bending from sitting, bending from standing, falling from walking, standing up, and falling from standing. Rather than being recorded as isolated examples, the data are captured as continuous 120 s streams that mimic real-world scenarios by including both mixed and single activities. For each subject, seven combined sequences of activities were constructed, with each combination performed four times. This collection strategy ensures that all nine activity classes are well represented across the dataset. These data were collected using a sophisticated network of five distributed radars deployed over a circular baseline with a spacing of approximately 45° between them, as demonstrated in Figure 5.

4.2. Results and Discussion

As explained earlier, we applied the modified ResNet18 to a key dataset subset within our feedback-driven segmentation system. Building on this methodology, we have documented the outcomes for BO and RL in Figure 6 and Figure 7, respectively. These figures showcase the effectiveness of each optimization technique in refining our segmentation parameters and enhancing the overall performance of the system.
The BO results in Figure 6 demonstrate the relationship between segment length (L) and the objective function value across the range L = 20 to L = 40 s. As we can see, the optimal segment length is identified as L = 38, marked by a red star where the objective function reaches its minimum value of −0.88. This indicates that L = 38 s optimally enhances the performance. The initial exploration phase (L = 20 to 30 s) shows wide fluctuations, reflecting BO’s search for high-reward regions, while the convergence phase (L = 35 to 40 s) stabilizes around L = 38 s. This result highlights BO’s capability to navigate complex trade-offs, providing a robust, data-driven approach for optimizing the segmentation of the data.
Figure 7 depicts the optimization of segment length (L) using RL. The optimal segment length was also identified at L = 38 s and marked by a black dot, which indicates where the RL algorithm determined that no further adjustments were necessary to enhance performance. This decision point represents the effectiveness of RL in systematically exploring and selecting the optimal action. The graph effectively illustrates RL’s capability to fine-tune parameters dynamically, ensuring that segmentation is ideally suited for accurate activity recognition.
Following the confirmation that both BO and RL identified the same optimal segment length, we applied this parameter to the full dataset. The comparative results are presented in Table 1, where we contrast the performance using L = 38 s with L = 30 s, which is the segment length utilized in previous research. The results show a clear enhancement in system performance: accuracy improved from 94.66% to 95.58% and F1-score from 89.90% to 91.68%. These improvements not only highlight the efficacy of our feedback-driven segmentation process in refining the accuracy and efficiency of continuous HAR systems but also underscore the substantial potential of the deep network ResNet18 in HAR. This architecture has proven highly effective in handling the complex demands of multi-label classification in continuous HAR, demonstrating its robust capability in advanced applications.
The next series of graphs (Figure 8 and Figure 9), displaying accuracy and F1-score across segment durations from L = 20 to L = 40 s, illustrates the effectiveness of our feedback-driven segmentation system in identifying the optimal segment length. The modified ResNet18 demonstrated its ability to capture multiple activities simultaneously from a diverse range of features extracted from radar data, and L = 38 s emerged as the segment length where accuracy peaks at 95.58% and the F1-score at 91.68%, demonstrating superior classification performance. The primary takeaway is not the segment length itself but the system’s capability to pinpoint it based on classification performance. These consistent results across key metrics underscore the precision of our methodology in adapting to the complexities of real-world scenarios, effectively optimizing parameters to ensure optimal performance without being constrained to a predefined length.

5. Conclusions

This work demonstrates the effectiveness of a feedback-driven segmentation system for continuous multi-label HAR. Both BO and RL independently identified the same optimal segment length, significantly enhancing classification performance when applied across a comprehensive dataset of human activities. Comparative analysis revealed marked improvements over previous segment lengths used in the field, with accuracy and F1-score showing notable increases. Our evaluation further indicates that while both methods yielded similar optimized results, BO proved to be more robust and stable, requiring less hyperparameter fine-tuning; thus, it was chosen as our preferred method.
The use of the modified ResNet18 architecture was instrumental in achieving these results. Its ability to handle complex multi-label scenarios through robust feature extraction from diverse radar data contributed significantly to the improved accuracy and F1-score. These results validate the system’s ability to adaptively refine segmentation parameters in response to continuous human activities, striking a balance between computational efficiency and robustness. Notably, employing a 20% subset for optimization reduced the overall process time by at least a factor of five over the total optimization iterations compared to using the full dataset, while full-dataset validation in the second phase for final classification confirmed the approach’s robustness.

Author Contributions

Conceptualization, N.B. and W.W.; methodology, N.B.; software, N.B.; validation, N.B. and W.W.; formal analysis, N.B.; investigation, N.B.; resources, W.W.; data curation, N.B.; writing—original draft preparation, N.B.; writing—review and editing, N.B. and W.W.; visualization, N.B.; supervision, W.W.; project administration, W.W.; funding acquisition, W.W. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by the Aeronautical Science Foundation of China (No. 2024Z074051002).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset used in this study is publicly available at [32].

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ASL: American Sign Language
Bi-LSTM: Bidirectional LSTM
BO: Bayesian optimization
CNN: Convolutional neural network
HAR: Human activity recognition
LSTM: Long short-term memory
LTA: Long time window
RNN: Recurrent neural network
RL: Reinforcement learning
STA: Short time window
SVM: Support vector machine

References

  1. Fioranelli, F.; Le Kernec, J. Radar sensing for human healthcare: Challenges and results. In Proceedings of the 2021 IEEE Sensors, Virtual, 31 October–4 November 2021; pp. 1–4. [Google Scholar] [CrossRef]
  2. Savvidou, F.; Tegos, S.A.; Diamantoulakis, P.D.; Karagiannidis, G.K. Passive Radar Sensing for Human Activity Recognition: A Survey. IEEE Open J. Eng. Med. Biol. 2024, 5, 700–706. [Google Scholar] [CrossRef] [PubMed]
  3. Islam, M.M.; Nooruddin, S.; Karray, F.; Muhammad, G. Human activity recognition using tools of convolutional neural networks: A state of the art review, data sets, challenges, and future prospects. Comput. Biol. Med. 2022, 149, 106060. [Google Scholar] [CrossRef] [PubMed]
  4. Miazek, P.; Żmudzińska, A.; Karczmarek, P.; Kiersztyn, A. Human Behavior Analysis Using Radar Data: A Survey. IEEE Access 2024, 12, 153188–153202. [Google Scholar] [CrossRef]
  5. Papadopoulos, K.; Jelali, M. A Comparative Study on Recent Progress of Machine Learning-Based Human Activity Recognition with Radar. Appl. Sci. 2023, 13, 12728. [Google Scholar] [CrossRef]
  6. Ullmann, I.; Guendel, R.G.; Kruse, N.C.; Fioranelli, F.; Yarovoy, A. A Survey on Radar-Based Continuous Human Activity Recognition. IEEE J. Microwaves 2023, 3, 938–950. [Google Scholar] [CrossRef]
  7. Ding, W.; Guo, X.; Wang, G. Radar-Based Human Activity Recognition Using Hybrid Neural Network Model With Multidomain Fusion. IEEE Trans. Aerosp. Electron. Syst. 2021, 57, 2889–2898. [Google Scholar] [CrossRef]
  8. Li, H.; Shrestha, A.; Heidari, H.; Le Kernec, J.; Fioranelli, F. Bi-LSTM Network for Multimodal Continuous Human Activity Recognition and Fall Detection. IEEE Sens. J. 2020, 20, 1191–1201. [Google Scholar] [CrossRef]
  9. Ullmann, I.; Guendel, R.G.; Kruse, N.C.; Fioranelli, F.; Yarovoy, A. Radar-Based Continuous Human Activity Recognition with Multi-Label Classification. In Proceedings of the 2023 IEEE Sensors, Vienna, Austria, 29 October–1 November 2023; pp. 1–4. [Google Scholar] [CrossRef]
  10. Kang, S.W.; Jang, M.H.; Lee, S. Identification of Human Motion Using Radar Sensor in an Indoor Environment. Sensors 2021, 21, 2305. [Google Scholar] [CrossRef]
  11. Zhao, Y.; Zhou, H.; Lu, S.; Liu, Y.; An, X.; Liu, Q. Human Activity Recognition Based on Non-Contact Radar Data and Improved PCA Method. Appl. Sci. 2022, 12, 7124. [Google Scholar] [CrossRef]
  12. Biswas, S.; Manavi Alam, A.; Gurbuz, A.C. HRSpecNET: A Deep Learning-Based High-Resolution Radar Micro-Doppler Signature Reconstruction for Improved HAR Classification. IEEE Trans. Radar Syst. 2024, 2, 484–497. [Google Scholar] [CrossRef]
  13. Tan, T.H.; Tian, J.H.; Sharma, A.K.; Liu, S.H.; Huang, Y.F. Human Activity Recognition Based on Deep Learning and Micro-Doppler Radar Data. Sensors 2024, 24, 2530. [Google Scholar] [CrossRef] [PubMed]
  14. Cao, L.; Liang, S.; Zhao, Z.; Wang, D.; Fu, C.; Du, K. Human Activity Recognition Method Based on FMCW Radar Sensor with Multi-Domain Feature Attention Fusion Network. Sensors 2023, 23, 5100. [Google Scholar] [CrossRef] [PubMed]
  15. Li, Z.; Fioranelli, F.; Yang, S.; Zhang, L.; Romain, O.; He, Q.; Cui, G.; Le Kernec, J. Multi-domains based human activity classification in radar. In Proceedings of the IET International Radar Conference (IET IRC 2020), Virtual, 4–6 November 2020; Volume 2020, pp. 1744–1749. [Google Scholar] [CrossRef]
  16. Huang, L.; Lei, D.; Zheng, B.; Chen, G.; An, H.; Li, M. Lightweight Multi-Domain Fusion Model for Through-Wall Human Activity Recognition Using IR-UWB Radar. Appl. Sci. 2024, 14, 9522. [Google Scholar] [CrossRef]
  17. Gurbuz, S.Z.; Amin, M.G. Radar-Based Human-Motion Recognition With Deep Learning: Promising Applications for Indoor Monitoring. IEEE Signal Process. Mag. 2019, 36, 16–28. [Google Scholar] [CrossRef]
  18. Yang, X.; Guendel, R.G.; Yarovoy, A.; Fioranelli, F. Radar-based Human Activities Classification with Complex-valued Neural Networks. In Proceedings of the 2022 IEEE Radar Conference (RadarConf22), New York, NY, USA, 21–25 March 2022; pp. 1–6. [Google Scholar] [CrossRef]
  19. Li, H.; Shrestha, A.; Heidari, H.; Kernec, J.L.; Fioranelli, F. Activities Recognition and Fall Detection in Continuous Data Streams Using Radar Sensor. In Proceedings of the 2019 IEEE MTT-S International Microwave Biomedical Conference (IMBioC), Nanjing, China, 6–8 May 2019; Volume 1, pp. 1–4. [Google Scholar] [CrossRef]
  20. Shrestha, A.; Li, H.; Le Kernec, J.; Fioranelli, F. Continuous Human Activity Classification From FMCW Radar With Bi-LSTM Networks. IEEE Sens. J. 2020, 20, 13607–13619. [Google Scholar] [CrossRef]
  21. Vaishnav, P.; Santra, A. Continuous Human Activity Classification With Unscented Kalman Filter Tracking Using FMCW Radar. IEEE Sens. Lett. 2020, 4, 1–4. [Google Scholar] [CrossRef]
  22. Guendel, R.G.; Fioranelli, F.; Yarovoy, A. Derivative Target Line (DTL) for Continuous Human Activity Detection and Recognition. In Proceedings of the 2020 IEEE Radar Conference (RadarConf20), Florence, Italy, 21–25 September 2020; pp. 1–6. [Google Scholar] [CrossRef]
  23. Kurtoglu, E.; Gurbuz, A.C.; Malaia, E.; Griffin, D.; Crawford, C.; Gurbuz, S.Z. Sequential Classification of ASL Signs in the Context of Daily Living Using RF Sensing. In Proceedings of the 2021 IEEE Radar Conference (RadarConf21), Atlanta, GA, USA, 8–14 May 2021; pp. 1–6. [Google Scholar] [CrossRef]
  24. Zhu, S.; Guendel, R.G.; Yarovoy, A.; Fioranelli, F. Continuous Human Activity Recognition with Distributed Radar Sensor Networks and CNN–RNN Architectures. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–15. [Google Scholar] [CrossRef]
  25. Guendel, R.G.; Fioranelli, F.; Yarovoy, A. Distributed radar fusion and recurrent networks for classification of continuous human activities. IET Radar Sonar Navig. 2022, 16, 1144–1161. [Google Scholar] [CrossRef]
  26. Qu, L.; Li, X.; Yang, T.; Wang, S. Radar-Based Continuous Human Activity Recognition Using Multidomain Fusion Vision Transformer. IEEE Sens. J. 2025, 1. [Google Scholar] [CrossRef]
  27. Feng, X.; Chen, P.; Weng, Y.; Zheng, H. CMDN: Continuous Human Activity Recognition Based on Multi-domain Radar Data Fusion. IEEE Sens. J. 2025, 1. [Google Scholar] [CrossRef]
  28. Ullmann, I.; Guendel, R.G.; Christian Kruse, N.; Fioranelli, F.; Yarovoy, A. Classification Strategies for Radar-Based Continuous Human Activity Recognition With Multiple Inputs and Multilabel Output. IEEE Sens. J. 2024, 24, 40251–40261. [Google Scholar] [CrossRef]
  29. Kurtoğlu, E.; Gurbuz, A.C.; Malaia, E.A.; Griffin, D.; Crawford, C.; Gurbuz, S.Z. ASL Trigger Recognition in Mixed Activity/Signing Sequences for RF Sensor-Based User Interfaces. IEEE Trans. Hum. Mach. Syst. 2022, 52, 699–712. [Google Scholar] [CrossRef]
  30. Kruse, N.; Guendel, R.; Fioranelli, F.; Yarovoy, A. Segmentation of Micro-Doppler Signatures of Human Sequential Activities using Rényi Entropy. In Proceedings of the International Conference on Radar Systems (RADAR 2022), Edinburgh, UK, 24–27 October 2022; Volume 2022, pp. 435–440. [Google Scholar] [CrossRef]
  31. Liu, S.; Wang, B. Optimized Modified ResNet18: A Residual Neural Network for High Resolution. In Proceedings of the 2024 IEEE 4th International Conference on Electronic Technology, Communication and Information (ICETCI), Changchun, China, 24–26 May 2024; pp. 1–5. [Google Scholar] [CrossRef]
  32. Guendel, R.G.; Unterhorst, M.; Fioranelli, F.; Yarovoy, A. Dataset of continuous human activities performed in arbitrary directions collected with a distributed radar network of five nodes. 4TU. ResearchData 2021, 10, 16691500. [Google Scholar] [CrossRef]
Figure 1. Depiction of multi-label classification in (a) computer vision and (b) HAR from time series data. This approach can identify multiple labels simultaneously, such as recognizing a cat, a plant, and a bench in a photograph. Similarly, it can determine that a specific time window, highlighted in orange, contains radar signatures indicative of activities like walking, sitting down, and sitting [25].
Figure 2. The feedback-driven segmentation workflow.
Figure 3. Modified ResNet18 architecture.
Figure 4. The three radar representations used. (a) Range-Time: directly from the data. (b) Range-Doppler: by applying Fourier transform on range-time data along the slow-time dimension. (c) Spectrogram: by applying the short-time Fourier transform (STFT) on range-time.
Figure 5. Data collection configuration: five radars arranged in a semicircle for dataset acquisition [32].
Figure 6. Objective function model of BO.
Figure 7. Reinforcement learning result.
Figure 8. Accuracy across segment durations.
Figure 9. F1-score variation with segment length.
Table 1. Comparative results.

L       Accuracy    F1-Score
30 s    94.66%      89.90%
38 s    95.58%      91.68%