Context-Aware Alerting in Elderly Care Facilities: A Hybrid Framework Integrating LLM Reasoning with Rule-Based Logic
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The paper is interesting and has good potential, but before it is considered for publication there is a list of recommendations that the authors should address:
- Some context should be added to alarm definition. Three alarm categories are defined but lack concrete examples.
- The difference between "low care assistant" and "medium care assistant" should be explained.
- The author assumes that suppression algorithms have "limited interpretability and risk of missing critical events" making them "insufficient for deployment in long-term care (LTC) settings." without a proper source.
- The author assumes that middleware solutions "risk unsafe delays and require complex maintenance" without a proper source.
- The author assumes that existing methods "often overlook behavioral and environmental context, such as nurse-elderly distance and mobility status." Without a proper source.
- More challenges should be presented such as data privacy, budget constraints, and regulatory compliance.
- "very slow or clinically insignificant motion"? – should be defined using examples.
- Tmax not defined
- The algorithm is presented only as pseudocode and does not explain what happens in the case of multiple simultaneous alerts.
- "possibility of abnormal behavior arising", abnormal behavior – should be defined.
- Equation (1) and equation (7) are identical
- The flowchart should show continuous monitoring
- How does the system handle new patients without history?
- The Δt = 1s? How is this value defined?
- Figure 3 lacks clarity
- How does the system distinguish nurses from patients?
- What will happen during the night?
Comments for author File:
Comments.pdf
Author Response
1. Some context should be added to alarm definition. Three alarm categories are defined but lack concrete examples.
Response: We apologize for any inconvenience, but the comment regarding the alarm definition context was not entirely clear to us. The original manuscript already described the alarm context used in our work, and all relevant contexts were defined mathematically in the “Context-Aware Alert System” section 4. We are also uncertain regarding the question about alarm categories and whether the reviewer was referring to the three alarm categories illustrated in Figure 1. To ensure clarity, although these categories were already defined in the “Validation Strategy” subsection 5.2 under “Experimental Analysis” section 5, we have now reiterated and clarified their definitions in the Introduction (page 3) section 1 of the revised manuscript.
Modification: In the figure, the baseline method is the traditional notify-all-nurses approach; the rule-based method covers both (a) rule-based suppression and delay and (b) rule-based suppression, delay, and heuristic validation; and the proposed method is the hybrid model integrating rule-based logic with LLM-driven semantic reasoning.
2. The difference between "low care assistant" and "medium care assistant" should be explained.
Response: We sincerely thank the reviewer for their valuable feedback. In the revised manuscript, we have included a more detailed explanation to clearly differentiate not only between low and medium care assistance levels but also among all care groups and care levels. Additionally, we have provided a reference to the scoring procedure in the “Data Collection” subsection (5.1) under the “Experimental Analysis” section (5) (page 15~17).
Modification: The residents were categorized into three care groups based on their care level: low (1–2), medium (3), and high (4–5) [49]. Low-level residents were generally independent, requiring minimal assistance or occasional reminders for meals, hygiene, or dressing, and were able to maintain an upright posture with minor age-related stooping. Medium-level residents required partial assistance for daily activities and movement, could stand or transfer with assistive devices, and often exhibited a forward-leaning or hunched posture due to reduced core strength. High-level residents were completely dependent on caregivers, mostly immobile, and showed severe postural instability without a consistent upright posture.
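The grouping above can be written as a small lookup. The following is a minimal sketch, assuming integer care levels from 1 to 5; the function name `care_group` is hypothetical and does not come from the manuscript.

```python
def care_group(care_level: int) -> str:
    """Map a care level (1-5) to the care group described in the response.

    Levels 1-2 are "low", level 3 is "medium", levels 4-5 are "high".
    The function name and the error handling are illustrative assumptions.
    """
    if care_level in (1, 2):
        return "low"
    if care_level == 3:
        return "medium"
    if care_level in (4, 5):
        return "high"
    raise ValueError(f"care level must be between 1 and 5, got {care_level}")
```

For example, `care_group(3)` returns `"medium"`, and a value outside 1 to 5 raises `ValueError` rather than being assigned to a group silently.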
3. The author assumes that suppression algorithms have "limited interpretability and risk of missing critical events" making them "insufficient for deployment in long-term care (LTC) settings." without a proper source.
Response: We appreciate the reviewer's valuable observation. The mentioned statement serves as a concluding remark derived from the suppression-related studies already cited in the subsection. Nevertheless, to address the reviewer’s concern, we have included an additional survey reference in the “Alarm Suppression Algorithms” subsection (2.1) under the “Related Works” section (2) on page 5.
4. The author assumes that middleware solutions "risk unsafe delays and require complex maintenance" without a proper source.
Response: We are sorry for the lack of clarity in our initial explanation. The referenced statement serves as a concluding summary based on the delay-related studies already cited within the subsection. However, to address the reviewer’s concern, we have added an additional survey reference in the “Middleware Solutions and Notification Delay Based Strategies” subsection (2.2) under the “Related Works” section (2) in page 5.
5. The author assumes that existing methods "often overlook behavioral and environmental context, such as nurse-elderly distance and mobility status." Without a proper source.
Response: We apologize if the description in the original manuscript led to any misinterpretation. To address this concern, we have included four additional literature references in the “Alarm Validation Approaches” subsection (2.3) under the “Related Works” section (2) on page 5. While prior studies have explored environmental and patient physiological factors for alarm triggering, the contextual relationship between nurses and patients has received limited attention, which our study aims to address.
6. More challenges should be presented such as data privacy, budget constraints, and regulatory compliance.
Response: We appreciate the reviewer’s insightful feedback. While it was beyond the scope of this paper to cover all possible challenges in depth, we have acknowledged and discussed additional issues—including data privacy, budget constraints, and regulatory compliance—in the “Limitations and Future Challenges” subsection 6.2 under “Discussion” section 6 of the revised manuscript as important considerations for future research directions.
Modification: Third, although broader challenges such as data privacy, budget constraints, and regulatory compliance remain important considerations, the proposed method already addresses some of these concerns by utilizing the existing video monitoring infrastructure of the facility without hardware modification. The only additional requirement is a GPU-enabled PC for processing, making deployment cost-effective and minimally intrusive. In addition, the data utilized is skeleton data, and the elderly residents and nurses are identified by unique IDs instead of their names, which to some extent mitigates the privacy issue.
7. "very slow or clinically insignificant motion"? – should be defined using examples.
Response: We acknowledge the reviewer’s thoughtful observations. In the revised manuscript, we have added further clarification and included examples from our case study to define what constitutes “very slow” or “clinically insignificant” motion in the “Problem Statement” section (Section 3, Page 6). Since our study focuses on fall prevention during group activities such as mealtime, rapid or unstable standing attempts are considered clinically significant, whereas slower standing motions or minor lateral body movements are regarded as insignificant for triggering alarms.
Modification: Alarms triggered by very slow or clinically insignificant motion, e.g., trying to stand up slowly or shifting to one side.
8. Tmax not defined
Response: We thank the reviewer for pointing this out. This was a typing mistake on our part. Tmax is actually Tack, the nurse acknowledgement time. A delayed alert can never exceed this time. We have corrected this in Algorithm 1 on page 8.
9. The algorithm is presented only as pseudocode and does not explain what happens in the case of multiple simultaneous alerts.
Response: We thank the reviewer for this valuable observation and apologize for any lack of clarity in the pseudocode explanation. In the algorithm, all abnormal occurrences within each frame are processed collectively—the initialization step represents a set of alarms A = {a1, a2, . . . , aN } rather than a single alert ai. To address this concern, the revised manuscript includes an expanded explanation describing how the proposed framework handles scenarios with multiple simultaneous alerts, ensuring clarity on multi-alert handling. This addition has been included in the “Context-Aware Alert System” section (Section 4, Page 7).
Modification: Anomalous activities are detected for each frame using Facebook Prophet. The alert set is initialized for all the abnormal activities detected in a frame. Each alarm tuple contains information such as time, motion intensity, urgency level, care level, and nurse–resident distance. An empty alert set is initialized to store only the effective alerts. Each alarm is first checked to remove false or unnecessary ones. Alerts with low urgency, high stability, or low care priority are discarded to reduce false alarms and nurse fatigue. For the remaining alerts, a delay time is calculated using factors such as urgency, care level, and distance between nurse and resident. Alerts that would take too long to process (beyond the acceptable acknowledgment time) are discarded. Each valid alert is then evaluated using an LLM, which determines its clinical priority and recommends the most suitable nurse based on the situation. Alerts with low priority scores are ignored, ensuring that only meaningful alerts are sent. The selected nurse receives the alert, and a timer starts to monitor acknowledgment. If no response is received within the set time limit, the alert is automatically escalated to the next available nurse. After acknowledgment, the system updates nurse workload and adds the handled alert to the effective alert set. The final output is a list of all processed and escalated alerts.
10. "possibility of abnormal behavior arising", abnormal behavior – should be defined.
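The multi-alert flow described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' Algorithm 1: every threshold, the delay formula, the field names, and the choice of the least-loaded nurse as the "next available" one are assumptions. The LLM call and the acknowledgment timer are replaced by plain callables (`validate`, `acknowledge`) so the control flow can be followed without any external service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Alarm:
    resident_id: str
    time_s: float
    motion_intensity: float
    urgency: float     # assumed scale 0..1, higher means more urgent
    stability: float   # assumed scale 0..1, higher means more stable posture
    care_level: int    # 1..5
    distance_m: float  # nurse-resident distance in metres


# All constants below are placeholder values chosen for illustration.
URGENCY_MIN = 0.3
STABILITY_MAX = 0.8
CARE_LEVEL_MIN = 2
T_ACK_S = 10.0          # acceptable acknowledgment time
PRIORITY_MIN = 0.5
NURSE_SPEED_MPS = 1.0


def is_suppressed(a: Alarm) -> bool:
    """Discard alarms with low urgency, high stability, or low care priority."""
    return (a.urgency < URGENCY_MIN
            or a.stability > STABILITY_MAX
            or a.care_level < CARE_LEVEL_MIN)


def delay_seconds(a: Alarm) -> float:
    """Placeholder delay built from urgency, care level, and distance."""
    return (1.0 - a.urgency) * T_ACK_S / a.care_level + a.distance_m / NURSE_SPEED_MPS


def process_alarms(
    alarms: List[Alarm],
    nurses: List[str],
    validate: Callable[[Alarm, List[str]], Tuple[float, str]],
    acknowledge: Callable[[str, Alarm], bool],
) -> Tuple[List[Tuple[Alarm, str, bool]], Dict[str, int]]:
    """Process every alarm raised in one frame; return effective alerts and workload."""
    effective: List[Tuple[Alarm, str, bool]] = []
    workload: Dict[str, int] = {n: 0 for n in nurses}
    for a in alarms:
        if is_suppressed(a):
            continue                      # rule-based suppression
        if delay_seconds(a) > T_ACK_S:
            continue                      # delay would exceed acknowledgment time
        priority, nurse = validate(a, nurses)   # stand-in for the LLM step
        if priority < PRIORITY_MIN:
            continue                      # low-priority alerts are ignored
        escalated = False
        if not acknowledge(nurse, a):     # stand-in for the acknowledgment timer
            others = [n for n in nurses if n != nurse]
            if others:                    # escalate to the least-loaded other nurse
                nurse = min(others, key=lambda n: workload[n])
                escalated = True
        workload[nurse] = workload.get(nurse, 0) + 1
        effective.append((a, nurse, escalated))
    return effective, workload
```

Because the loop runs over the whole set of alarms from a frame, simultaneous alerts are handled one after another against a shared workload table, which is one simple way to keep assignments balanced when several residents trigger alarms at once.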
Response: We sincerely appreciate the reviewer’s perspective. In the revised manuscript, we have added a detailed explanation of the abnormal behaviors considered in our study. This description is now included in the “Anomalous Event Detection” subsection (4.1) under the “Context-Aware Alert System” section (Section 4, Page 9).
Modification: Because of asynchrony in real-time monitoring, there is often a noticeable delay between observing an event and responding to it. Instead of depending only on the current motion path, it becomes crucial to predict the person’s next movement and the activity they are likely to perform. Since abnormal behaviors can result from physical or environmental factors, this study addresses the issue by forecasting elderly motion through joint trajectory prediction and activity state estimation, allowing caregivers to act proactively. In our context, abnormal behavior mainly refers to pre-fall conditions such as sudden attempts to stand up or unstable standing posture.
11. Equation (1) and equation (7) are identical
Response: We thank the reviewer for pointing out this redundancy. Both equations were intended to represent the same formulation; however, we agree that repeating them is unnecessary. Accordingly, we have removed Equation (7) in the revised manuscript to avoid duplication.
12. The flowchart should show continuous monitoring
Response: We thank the reviewer for this valuable suggestion. In the revised manuscript, we have modified the flowchart on page 10 to clearly represent the continuous monitoring process, as recommended.
13. How does the system handle new patients without history?
Response: We are grateful for the reviewer’s helpful observations. If the term “history” refers to prior care level information or patient speed data, these parameters are crucial for the effective operation of the proposed system. Without them, the system’s functionality would be significantly limited, as discussed in the original manuscript under the “Discussion” subsection (5.3). To improve clarity, we have further elaborated on this dependency and proposed potential solutions for handling such cases in future research, now detailed in the “Limitations and Future Challenges” subsection (6.2) under the “Discussion” section (Section 6, Page 24).
Modification: First, the current framework depends on prior patient information, such as care level assessment and average movement speed, to ensure accurate context modeling and alert generation. These parameters are essential for personalized monitoring, yet they pose challenges when applied to new residents without existing care history. To address this, future iterations of the system can integrate adaptive initialization modules that automatically estimate baseline parameters through short-term observation or unsupervised clustering of motion patterns during the first few days of monitoring. Such self-calibrating models would allow the system to dynamically personalize its thresholds and alerts for new patients without manual input.
14. The Δt = 1s? How is this value defined?
Response: We appreciate the reviewer’s thoughtful assessment of our paper. The value of Δt = 1s was empirically determined based on our experimental setup and the temporal resolution required for accurate monitoring in our case. We have clarified this explanation in the revised manuscript in the “Rule-Based Suppression Module” subsection (4.3) under the “Context-Aware Alert System” section (4), page 13.
Modification: In our study, we define this as the nurse failing to arrive within Δt = 1s, a value chosen empirically based on the nurse movement speed in our case study. It is adjustable depending on the requirements of the deployment setup.
15. Figure 3 lacks clarity
Response: We thank the reviewer for this observation. We apologize for the earlier inconvenience and have replaced Figure 3 (page 16) with a higher-resolution version in the revised manuscript to improve clarity and readability.
16. How does the system distinguish nurses from patients?
Response: We acknowledge and appreciate the reviewer’s careful feedback. In the revised manuscript, we have clarified that our system differentiates nurses from patients through a customized YOLOv7-based object detection and tracking approach, along with the necessary references for the method. The detailed description of this implementation has been incorporated in the “Context-Aware Alert System” section (Section 4, Page 7).
Modification: We used YOLOv7-based custom object detection for person identification [50] and tracking [51] to identify and track the elderly residents and nurses.
17. What will happen during the night?
Response: We appreciate the reviewer’s thorough review of our manuscript. Our study primarily concentrates on group activity sessions during mealtime; thus, nighttime activities are beyond the current study’s scope. To clarify this limitation, we have explicitly noted the exclusion of nighttime scenarios in the “Limitations and Future Challenges” subsection (6.2) under the “Discussion” section (Section 6, Page 24) of the revised manuscript, and explained how to conduct this for future research.
Modification: Second, the current study is confined to daytime group activity sessions, particularly during mealtime, where multiple residents and nurses interact. Consequently, nighttime monitoring scenarios—such as sleeping posture analysis, nocturnal falls, or nighttime wandering—were beyond the present scope. However, because the proposed model operates on skeleton-based data, which are inherently privacy-preserving, it can be extended for 24-hour monitoring without compromising resident privacy. Future research could integrate low-light skeleton extraction and lightweight temporal models to extend system functionality to nighttime environments while maintaining operational continuity.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for AuthorsThe manuscript entitled “Context-Aware Alerting in Elderly Care Facilities: A Hybrid Framework Integrating LLM Reasoning to Rule-Based Logic” was reviewed.
Below are the reviewer's comments regarding the adjustments that need to be made to the manuscript. I kindly ask that all changes made to the manuscript be highlighted in a different color in the text or highlighted in yellow.
1- Please adjust the manuscript according to the journal template: present the journal logo; mention the authors' affiliations in more detail; check the formatting of the references at the end of the manuscript; avoid using acronyms in the title.
Abstract:
2- Please shorten the introduction so the text is more direct.
3- Clarify the objectives so the reader can fully understand what will be proposed.
4- Explain the acronym LLM in full on line 6.
5- Mention the type of study, the profile of the people assisted by the technology, and the statistical analyses that will be used.
6- In the results, try to present more quantitative information on the effectiveness of the proposal.
7- Try to draw a more assertive conclusion regarding the objectives.
Keywords:
8- Avoid repeating words already presented in the title. Replace them with synonyms.
Introduction:
9- Generally, figures are not used in the introduction. Please scroll down Figure 1 to describe the proposed methodology.
10- There are some acronyms that are not explained in the introduction, such as RQ1, RQ2, RQ3, FRR, FNR. Please check and adjust them.
11- Present the current state of the art regarding similar technologies and how your study will contribute to knowledge gains.
12- Align the objectives at the end of the introduction with those presented in the results.
Related Works:
13- In this topic you are bringing information from the literature that can also be used to discuss the results.
Problem Statement:
14- So far, I haven't found information about the type of study you're conducting, or the population selected for validation. Please describe more clearly how this was conducted. Perhaps a flowchart or graphic summary could help the reader fully understand the study.
Context-Aware Alert System:
15- Adjust the information in the algorithm table and Table 1. Provide a caption for both, and in the case of Table 1, also check the formatting, as the bottom row is missing.
16- Explain more details about both tables in the text.
Experiment and Analysis:
17- Present more clearly the type of study, location, and ethical information.
18- Was a sample size calculation performed to select a minimum number of people?
19- What were the inclusion and exclusion criteria for participants?
20- Create a subtopic about the statistical analyses performed. Describe in detail the tests used, the data presentation format, and the statistical program.
21- In the discussion, you are repeating the results and not citing information from the literature to compare the findings. Please adjust the discussion to explain the results obtained in the study.
Conclusion:
22- The conclusion is too long. Please make it more objective by focusing on addressing the proposed objectives. Also, avoid repeating abbreviations that have already been mentioned in the text.
Author Response
1- Please adjust the manuscript according to the journal template: present the journal logo; mention the authors' affiliations in more detail; check the formatting of the references at the end of the manuscript; avoid using acronyms in the title.
Response: We thank the reviewer for this helpful reminder. These formatting requirements were already followed in the original manuscript; however, to ensure full compliance, we have carefully rechecked all aspects, including the journal logo, detailed author affiliations, reference formatting, and title structure, in the revised version.
Abstract:
2- Please shorten the introduction so the text is more direct.
Response: We appreciate the reviewer for this valuable suggestion. In the revised manuscript, we have shortened and refined the introduction to make the text more concise and focused in page 1.
Modification: The rising demand for elderly care amid ongoing nursing shortages has highlighted the limitations of conventional alert systems, which frequently generate excessive alerts and contribute to alarm fatigue.
3- Clarify the objectives so the reader can fully understand what will be proposed.
Response: We are thankful for the reviewer’s suggestions. In the revised manuscript, we have rewritten the objectives in a clearer and more explicit manner to ensure that readers can fully understand the scope and purpose of the proposed study in page 1.
Modification: The objective of this study is to develop a hybrid, context-aware nurse alerting framework for long-term care (LTC) facilities that minimizes redundant alarms, reduces alarm fatigue, and enhances patient safety and caregiving balance during multi-person care scenarios such as mealtime.
4- Explain the acronym LLM in full on line 6.
Response: We thank the reviewer for pointing this out. In the revised manuscript, we have added the full form of the acronym LLM at its first occurrence for clarity in page 1.
5- Mention the type of study, the profile of the people assisted by the technology, and the statistical analyses that will be used.
Response: We acknowledge and appreciate the reviewer’s careful feedback. In the revised manuscript, we have added the type of study, described the profile of the individuals assisted by the proposed technology, and specified the statistical analyses employed in the Abstract in page 1.
Modification: We conducted an experimental study in a real-world LTC environment involving 28 elderly residents (6 high, 8 medium, and 14 low care levels) and 4 nurses across three rooms over seven days. The proposed system utilizes video-derived skeletal motion, care-level annotations, and dynamic nurse–elderly proximity for decision-making. Statistical analyses were performed using F1 score, accuracy, false positive rate (FPR), and false negative rate (FNR) to evaluate performance improvements.
6- In the results, try to present more quantitative information on the effectiveness of the proposal.
Response: We sincerely appreciate the reviewer’s perspective. In the revised manuscript, we have added additional quantitative information in the Results part of the Abstract to better demonstrate the effectiveness of the proposed approach in page 1.
Modification: Compared to the baseline where all nurses were notified (100% alarm load), the proposed method reduced average alarm load to 27.5%, achieving a 72.5% reduction, with suppression rates reaching 100% in some rooms for some nurses. Performance metrics further validate the system’s effectiveness: macro F1 score improved from 0.18 (baseline) to 0.97, while accuracy rose from 0.21 (baseline) to 0.98. Compared to the baseline error rates (FPR 0.20, FNR 0.79), the proposed method achieved drastically lower values (FPR 0.005, FNR 0.023).
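For readers who want to check how such figures are usually derived, the sketch below shows the standard definitions of alarm load reduction, accuracy, FPR, FNR, and binary macro F1 from confusion counts. The function names are hypothetical, and no confusion counts from the study are reproduced here; only the 100% and 27.5% alarm loads quoted above are used in the usage note. The sketch assumes all denominators are nonzero.

```python
from typing import Dict


def alarm_load_reduction(baseline_pct: float, proposed_pct: float) -> float:
    """Reduction in alarm load, in percentage points relative to the baseline."""
    return baseline_pct - proposed_pct


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 score of one class from its true positives, false positives, false negatives."""
    return 2 * tp / (2 * tp + fp + fn)


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard metrics for a binary alert / no-alert decision.

    Assumes every denominator is nonzero (both classes occur in the data).
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "fpr": fp / (fp + tn),   # false alarms among true negatives
        "fnr": fn / (fn + tp),   # missed events among true positives
        # Macro F1: unweighted mean of the F1 of the positive and negative class.
        "macro_f1": (f1(tp, fp, fn) + f1(tn, fn, fp)) / 2,
    }
```

With the alarm loads quoted above, `alarm_load_reduction(100.0, 27.5)` gives 72.5, matching the reported 72.5% reduction.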
7- Try to draw a more assertive conclusion regarding the objectives.
Response: We are thankful for the reviewer’s suggestions. In the revised manuscript, we have strengthened the conclusion to address the stated objectives more clearly and assertively, and this revision is reflected in the Abstract on page 1.
Modification: These findings demonstrate that the proposed approach effectively minimizes false alarms while maintaining strong operational efficiency. By integrating rule-based mechanisms with LLM-based contextual reasoning, the framework significantly enhances alert accuracy, mitigates alarm fatigue, and promotes safer, more sustainable, and human-centered care practices, making it suitable for practical deployment within real-world long-term care environments.
Keywords:
8- Avoid repeating words already presented in the title. Replace them with synonyms.
Response: We thank the reviewer for this helpful suggestion. In the revised manuscript, we have modified the keywords accordingly by replacing repeated words from the title with appropriate synonyms to improve readability and avoid redundancy in page 1.
Modification: alarm fatigue; Large Language Model; fall detection; context-aware systems; nurse alerting; long-term care; nurse care.
Introduction:
9- Generally, figures are not used in the introduction. Please scroll down Figure 1 to describe the proposed methodology.
Response: We value the reviewer’s insight; however, we have intentionally retained Figure 1 in the Introduction to help readers quickly grasp the motivation and how our approach differs from existing works. The figure also highlights the key components of the proposed method, which supports readers—especially those who review only the Introduction section—to understand the context and significance of the study. Therefore, we respectfully chose not to make the modification.
10- There are some acronyms that are not explained in the introduction, such as RQ1, RQ2, RQ3, FRR, FNR. Please check and adjust them.
Response: We thank the reviewer for pointing this out. In the revised manuscript, we have provided the full forms and explanations for all acronyms. Specifically, RQ1–RQ3 have been renamed as R1–R3, with their meanings clarified upon first mention in the Introduction (pages 3–4). Additionally, the definitions of FRR and FNR have been included in the Abstract (page 1).
11- Present the current state of the art regarding similar technologies and how your study will contribute to knowledge gains.
Response: We thank the reviewer for their valuable observation. The current state of the art and the contribution of our study to advancing existing knowledge were already presented in the “Introduction” and “Related Works” sections of the original manuscript. We have carefully reviewed these sections to ensure that this information is clearly stated and have added further details to the “Introduction” of the revised version on page 4.
Modification: This study advances current elderly monitoring research by introducing a hybrid, context-aware nurse alerting framework that combines rule-based suppression and delay logic with LLM-driven contextual reasoning. Unlike existing systems that rely solely on fixed thresholds or opaque learning models, our approach balances interpretability and adaptability, enabling reliable alert decisions in real-world LTC settings. By integrating spatial (nurse–resident distance), temporal (motion trajectory), and clinical (care-level and mobility) information, the framework ensures alerts are both clinically relevant and context-sensitive. Quantitative results demonstrate clear improvements over the baseline and rule-based methods. In alignment with the study’s objectives, the system (1) reduces alarm fatigue through intelligent suppression, (2) enhances alert precision via LLM-based validation, (3) maintains interpretability, and (4) supports balanced, patient-centered caregiving. Overall, this work contributes a scalable, human-centered AI approach that improves safety and efficiency in LTC environments.
12- Align the objectives at the end of the introduction with those presented in the results.
Response: We acknowledge and appreciate the reviewer’s careful feedback. In the revised manuscript, we have modified the text to ensure that the objectives stated at the end of the Introduction are clearly aligned with those addressed and presented in the Results section in page 4.
Modification: Research question 1 (R1): How well does a hybrid alert framework consisting of rule-based logic and LLM reasoning minimize alert fatigue without compromising care quality?
Contribution: This study introduces a hybrid alerting framework that combines rule-based logic with LLM reasoning to deliver context-aware alerts, aiming to reduce alarm fatigue without compromising care quality. Compared to the baseline scenario where all nurses were alerted (100% alarm load), the proposed framework lowered the average load to 27.5% while achieving an accuracy of 0.98 and a macro F1 score of 0.97, significantly surpassing the baseline (accuracy 0.21, macro F1 score 0.18).
- Research question 2 (R2): To what extent can the inclusion of clinically interpretable information help minimize false alarms in long-term care settings?
Contribution: Our method leverages clinically interpretable parameters like resident care condition, nurse-resident distance, and mobility to generate alerts, minimizing false alarms. Compared to the baseline (FPR 0.20, FNR 0.79), the method achieves markedly lower error rates (FPR 0.005, FNR 0.023).
- Research question 3 (R3): Is a flexible, context-driven alert validation strategy more effective than static rule-based thresholds in addressing the unique needs of elderly care?
Contribution: We introduce an LLM-based validation module that incorporates spatial, temporal, and clinical context, instead of rule-based thresholding. Our model achieved an accuracy of 0.98 and a macro F1 score of 0.97, outperforming the rule-based approach (accuracy 0.78, macro F1 score 0.79).
Related Works:
13- In this topic you are bringing information from the literature that can also be used to discuss the results.
Response: We understand the reviewer’s perspective; however, the referenced literature was not used to discuss the results because the experimental settings and study designs differ significantly from ours, which limits direct comparability. Including such discussions could lead to misleading interpretations. Therefore, we have kept the literature references within the appropriate contextual framework.
Problem Statement:
14- So far, I haven't found information about the type of study you're conducting, or the population selected for validation. Please describe more clearly how this was conducted. Perhaps a flowchart or graphic summary could help the reader fully understand the study.
Response: We thank the reviewer for their valuable effort. We apologize for any confusion, but the original manuscript already contains this information. Detailed population information, including participant characteristics, is presented in the Abstract and in the “Data Collection” subsection 5.1 under the “Experimental Analysis” section 5. Additionally, Figure 1 in the Introduction serves as a graphical summary of the overall study workflow. Furthermore, to strengthen the background on alarm management and false alarm reduction in long-term care (LTC) contexts, we have included additional references—particularly References 45 and 48—which provide further insights into related research.
Context-Aware Alert System:
15- Adjust the information in the algorithm table and Table 1. Provide a caption for both, and in the case of Table 1, also check the formatting, as the bottom row is missing.
Response: We are grateful for the reviewer’s helpful comment. In the revised manuscript, we have carefully reviewed and adjusted the contents of both the algorithm table and Table 1. Captions have been added to both tables, and the formatting issue—specifically the missing bottom row in Table 1—has been corrected to ensure completeness and clarity.
16- Explain more details about both tables in the text.
Response: We appreciate the reviewer’s constructive suggestion. In the revised manuscript, we have added further explanation of the process in the “Context-Aware Alert System” section 4 (pages 7–8). These clarifications provide a clearer understanding of how the tabulated data supports our analysis and conclusions.
Modification: The alert set is initialized with all abnormal activities detected in a frame. Each alarm tuple contains information such as time, motion intensity, urgency level, care level, and nurse–resident distance. A second, empty alert set is initialized to store only the effective alerts. Each alarm is first checked to remove false or unnecessary ones. Alerts with low urgency, high stability, or low care priority are discarded to reduce false alarms and nurse fatigue. For the remaining alerts, a delay time is calculated using factors such as urgency, care level, and distance between nurse and resident. Alerts that would take too long to process (beyond the acceptable acknowledgment time) are discarded. Each valid alert is then evaluated using an LLM, which determines its clinical priority and recommends the most suitable nurse based on the situation. Alerts with low priority scores are ignored, ensuring that only meaningful alerts are sent. The selected nurse receives the alert, and a timer starts to monitor acknowledgment. If no response is received within the set time limit, the alert is automatically escalated to the next available nurse. After acknowledgment, the system updates nurse workload and adds the handled alert to the effective alert set. The final output is a list of all processed and escalated alerts.
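The steps described above can be sketched roughly as follows. This is a minimal illustration, not the manuscript's algorithm: the threshold values, the delay formula, the `stability` field, and the `validate` and `acknowledge` callables (stand-ins for the LLM call and for the acknowledgment timer) are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Alarm:
    time: float              # seconds since session start
    motion_intensity: float
    urgency: float           # 0..1, higher is more urgent
    care_level: float        # 0..1, higher means more care needed
    distance: float          # nurse-resident distance (metres, assumed unit)
    stability: float = 0.0   # 0..1, higher is more stable (hypothetical field)


# All thresholds and weights below are illustrative assumptions.
URGENCY_MIN = 0.3
STABILITY_MAX = 0.8
CARE_MIN = 0.2
T_MAX = 30.0          # acceptable acknowledgment time in seconds
PRIORITY_MIN = 0.5


def is_suppressed(a: Alarm) -> bool:
    """Discard alarms with low urgency, high stability, or low care priority."""
    return (a.urgency < URGENCY_MIN
            or a.stability > STABILITY_MAX
            or a.care_level < CARE_MIN)


def delay_seconds(a: Alarm) -> float:
    """Hypothetical delay: grows as urgency and care level fall and distance rises."""
    return 10.0 * (1 - a.urgency) + 10.0 * (1 - a.care_level) + a.distance


def process_alarms(
    alarms: List[Alarm],
    nurses: List[str],
    validate: Callable[[Alarm], Tuple[float, str]],
    acknowledge: Callable[[str, Alarm], bool],
) -> Tuple[List[Tuple[Alarm, str]], Dict[str, int]]:
    workload = {n: 0 for n in nurses}
    effective: List[Tuple[Alarm, str]] = []
    for alarm in alarms:
        if is_suppressed(alarm):
            continue                      # rule-based suppression
        if delay_seconds(alarm) > T_MAX:
            continue                      # would exceed acknowledgment time
        priority, recommended = validate(alarm)   # stand-in for the LLM step
        if priority < PRIORITY_MIN:
            continue                      # low-priority alerts are not sent
        # Try the recommended nurse first, then escalate to the others in order.
        candidates = [recommended] + [n for n in nurses if n != recommended]
        for nurse in candidates:
            if acknowledge(nurse, alarm):
                workload[nurse] = workload.get(nurse, 0) + 1
                effective.append((alarm, nurse))
                break
    return effective, workload
```

In this sketch an alert that no nurse acknowledges is simply dropped from the effective set; how the actual system handles that case is not stated in the response.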
Experiment and Analysis:
17- Present more clearly the type of study, location, and ethical information.
Response: We thank the reviewer for their valuable recommendations. In the revised manuscript, we have added further clarification regarding the type of study and the study location in the Abstract and in the “Data Collection” subsection 5.1 under the “Experimental Analysis” section 5. The associated ethical approval details were already given in the original manuscript, in the “Institutional Review Board Statement” on page 25.
Modification: The dataset was collected from Global Care, a long-term care (LTC) facility in Japan that specializes in dementia care and operates an in-house video monitoring system. In collaboration with the facility, we accessed recorded footage capturing natural lunchtime activities of residents over seven consecutive days across three dining areas (Rooms 1, 2, and 3) for the case study. Each session lasted approximately 30–60 minutes, corresponding to the regular lunch period. A total of 28 elderly participants aged between 64 and 95 years were included in the study, along with four nurses who provided care support during the sessions.
18- Was a sample size calculation performed to select a minimum number of people?
Response: While we understand the reviewer’s concern, it may not be applicable to our study context as our research is exploratory in nature and focuses primarily on developing and validating a methodological framework rather than conducting a population-based inferential analysis. So, no formal sample size calculation was performed for this study. The dataset was collected from real-world care facilities under ethical constraints, where participant recruitment is limited by the number of available residents who meet inclusion criteria and consent to participate. Therefore, the sample size reflects the practical and ecological conditions of the study setting, which aligns with similar human activity recognition and care-facility-based observational studies. The objective was to demonstrate feasibility and model performance under realistic conditions rather than to generalize statistical outcomes to a broader population.
19- What were the inclusion and exclusion criteria for participants?
Response: We appreciate the reviewer’s thoughtful assessment of our paper. In the revised manuscript, we have added a detailed description of the inclusion and exclusion criteria for participant selection. Specifically, elderly participants were included if they were residents of the long-term care facility, capable of performing daily activities with or without assistance, and provided informed consent (or consent was obtained from their legal guardians). Participants were excluded if they did not wish to take part in the study.
20- Create a subtopic about the statistical analyses performed. Describe in detail the tests used, the data presentation format, and the statistical program.
Response: We appreciate the reviewer’s engagement with our work. In the revised manuscript, we have added a new subtopic, “Validation Strategy” (subsection 5.2 under the “Experimental Analysis” section 5), dedicated to the statistical analyses. This subsection now details the statistical tests employed, the data presentation format (including mean, standard deviation, and percentage representation), and the statistical software used for analysis. These additions ensure greater transparency and reproducibility of the results.
Modification: To rigorously assess the effectiveness of our proposed context-aware nurse alerting system, we employed a dual validation approach: spatially across rooms and temporally across days. This design allowed us to evaluate both the consistency and robustness of the system in a real-world long-term care setting. We performed a comprehensive ablation study using four alerting strategies to dissect the contribution of each component in the pipeline. The Baseline method reflects the conventional ‘notify-all’ approach, where all nurses receive alerts whenever an abnormal event is detected, without regard for contextual relevance or nurse workload. This approach, while exhaustive, tends to overwhelm caregivers and generate unnecessary cognitive load. The R(S+D) strategy builds upon this by introducing rule-based suppression and delay, in which alerts are blocked if contextually insignificant and delayed if premature, mimicking more practical judgment found in human care. Further extending this, the R(S+D+V) method adds a heuristic validation layer, checking the appropriateness of alerts before dispatch to enhance decision precision. Our Proposed Method combines the deterministic logic of suppression and delay with adaptive, semantic validation powered by a large language model (LLM). The rules guide initial alert filtering, while the LLM performs high-level contextual reasoning to confirm whether an alert should be sent. This hybrid approach retains interpretability while enabling flexibility in complex and ambiguous situations.
For quantitative comparison, we computed a set of performance metrics including Accuracy, Precision, Recall, F1-score, False Positives (FP), and False Negatives (FN). These metrics were analyzed both globally and per nurse to capture variations in alert distribution, responsiveness, and error rates. Accuracy was used to assess overall correctness and F1-score the balance between precision and recall, while FP and FN provided insight into the model’s reliability in suppressing false alarms and minimizing missed critical events. We also examined temporal (day-based) and spatial (room-based) variation in these metrics.
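For reference, the metrics listed above can be derived from the confusion counts of a binary alert decision as in the sketch below. The function name and the label encoding (1 = alert warranted, 0 = no alert) are assumptions made for illustration; the manuscript's own evaluation code is not shown in this response.

```python
from typing import Dict, Sequence


def alert_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, F1, FP and FN for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fp": float(fp),
        "fn": float(fn),
    }
```

A macro-averaged score, as reported in the conclusion, would average such per-class values with equal weight for each class.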
All data were tabulated and visualized using bar charts and confusion matrices to illustrate comparative performance among the four methods. Descriptive statistics (mean and standard deviation) were used to summarize results across days and rooms, ensuring comparability between temporal and spatial dimensions. Ground truth annotations were manually created from video recordings, identifying moments when facility staff intervened by alerting nurses in response to genuine abnormal events. This human-labeled data served as the benchmark for both forecasting accuracy and alert validation, ensuring our comparisons reflected actual clinical response behavior.
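The descriptive summary across days or rooms amounts to a mean and standard deviation over per-group scores, which can be sketched as below. The function name is hypothetical, and the use of the sample standard deviation (rather than the population form) is an assumption, since the response does not say which was used.

```python
from statistics import mean, stdev
from typing import Dict, Tuple


def summarize(scores: Dict[str, float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of per-day or per-room scores."""
    values = list(scores.values())
    if not values:
        raise ValueError("at least one score is required")
    # stdev needs two or more points; report 0.0 for a single group.
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread
```

The same function applies to either dimension: pass a mapping keyed by day for the temporal summary, or keyed by room for the spatial one.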
The analyses were conducted using Python (version 3.9) with the NumPy, SciPy, PyCM and scikit-learn libraries for statistical computation and metric evaluation, and Matplotlib/Seaborn for data visualization. This ensured reproducibility, transparency, and rigorous quantitative assessment of the alert system’s performance.
21- In the discussion, you are repeating the results and not citing information from the literature to compare the findings. Please adjust the discussion to explain the results obtained in the study.
Response: We understand the concern of the reviewer; however, the experimental setup and study context of our work differ significantly from those of existing studies, which limits direct comparability and the use of prior results for discussion. Including such comparisons could lead to misleading interpretations. Therefore, we focused the Discussion section on explaining our results in relation to the study’s objectives and methodological framework. Nonetheless, we have revised the text to emphasize the rationale behind the observed outcomes and to make the interpretation of results clearer.
Conclusion:
22- The conclusion is too long. Please make it more objective by focusing on addressing the proposed objectives. Also, avoid repeating abbreviations that have already been mentioned in the text.
Response: We acknowledge and appreciate the reviewer’s careful feedback. In the revised manuscript, we have shortened and refined the conclusion section to make the text more concise and focused. We have also removed abbreviations that were already introduced earlier in the text.
Modification: This study presents a hybrid, context-aware nurse alerting framework for long-term care settings, aimed at mitigating alarm fatigue, caregiver overload, and delayed response in elderly monitoring. By combining rule-based filtering and delay mechanisms with semantic reasoning, the system achieved significant improvements in both efficiency and reliability. Evaluations using real-world video data demonstrated a 72.5% average reduction in nurse alert load and nearly complete elimination of false alarms, with macro F1 and accuracy scores of 0.97 and 0.98, respectively. These results validate the effectiveness of interpretable suppression, adaptive delay, and contextual validation mechanisms proposed in this study. The framework offers a scalable path toward personalized and real-time alerting that enhances both resident safety and caregiver well-being. Future work will extend this approach to incorporate caregiver fatigue, specialization, and interactive AI-human collaboration to further advance human-centered elderly care.
Author Response File:
Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have addressed all the reviewers' recommendations in a very comprehensive way and improved the paper a lot. Based on the detailed answers and the improvements in the paper, I recommend its publication.
Reviewer 2 Report
Comments and Suggestions for Authors
Dear Authors,
Thank you for providing the revised version of the manuscript. After carefully reviewing the text, it was found that the authors followed the reviewer's guidelines and significantly improved the work. Therefore, I suggest that the article be accepted for publication.
