Open Access Article

Peer-Review Record

Establishing Reference Metrics for Respiratory Exercises Through Wearable Sensors: A Comparative Study

Biomechanics 2025, 5(4), 90; https://doi.org/10.3390/biomechanics5040090

by Federico Caramia^1,2, Emanuele D’Angelantonio^1,2,3, Leandro Lucangeli^1,2,3 and Valentina Camomilla^1,2,*

Reviewer 1: Zhen Yuan

Reviewer 2: Kai Guo

Reviewer 3: Alexander Yu Meigal

Reviewer 4: Anonymous

Submission received: 16 August 2025 / Revised: 15 October 2025 / Accepted: 22 October 2025 / Published: 5 November 2025

(This article belongs to the Special Issue Computational Modeling and AI Applications in Injury Biomechanics and Rehabilitation)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript (biomechanics-3851034) addresses wearable sensors in telerehabilitation, focusing on a low-cost IMU to quantify respiratory exercises and to establish reference metrics. The study proposes accelerometer-based indicators and a processing workflow, and it shows feasibility in a home-like setting (high VAS, consistent cycle detection). The application goal is clear and clinically relevant. However, several issues in the quantitative analyses need to be addressed. I recommend minor revision to address the points below.

In Table 2, PeakInsp and PeakExp are defined in the Methods as peaks of a normalized signal (values scaled between 0 and 1). They should therefore be reported without physical units (a.u.), not in [g]. If you also wish to report non-normalized peaks from the raw accelerometer signal, please explain the physical meaning and how they were computed relative to the normalized workflow.
Please clarify the fixation and possible relative rotation of the two stacked sensors across postures (RE1–RE4). You state that the antero-posterior axis (Z-axis) was used. Specify whether any alignment of sensor axes across postures was performed, and whether any gravity-related correction was applied. This is important because differences in axis orientation may affect peak detection on the Z-axis and may contribute to differences between the prototype and the reference device.
The segmentation relies on MATLAB findpeaks with parameters selected through iterative testing and a 30% maximum-peak threshold. Please provide evidence for these choices. Evaluate a simple adaptive rule by comparing the 30% relative threshold with alternatives (e.g., 20% and 40%), and report the impact on segmentation performance across RE1–RE4. This evidence will clarify robustness and support generalization across postures and participants.
The fourth-order Butterworth low-pass filter had an “optimal cut-off frequency” between 0.5 and 1 Hz. Please clarify whether this cut-off was the same for all participants and exercises, or selected per participant/posture. If adaptive, report the distribution of selected values by exercise (RE1–RE4) and justify how the selection was made according to the cited method.
Counting errors concentrated in the seated respiratory exercise with back support. Please discuss plausible mechanisms related to the rib cage motion in this posture (e.g., reduced antero-posterior excursion, changes in the breathing pattern) and how they affect Z-axis peak prominence. Suggest practical improvements for future protocols, such as testing an alternative position on the rib cage, adjusting the elastic band fixation, or using posture-specific adaptive thresholds.

Author Response

Comment 1: In Table 2, PeakInsp and PeakExp are defined in the Methods as peaks of a normalized signal (values scaled between 0 and 1). They should therefore be reported without physical units (a.u.), not in [g]. If you also wish to report non-normalized peaks from the raw accelerometer signal, please explain the physical meaning and how they were computed relative to the normalized workflow.

Response 1: We thank the reviewer for this observation. As correctly noted, PeakInsp and PeakExp were extracted from the normalized signal, where values were scaled between 0 and 1 using range normalization. Therefore, these parameters are dimensionless and should be expressed in arbitrary units (a.u.), not in [g]. We have corrected the labeling in Table 2 accordingly.
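The range normalization mentioned in this response can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: the function name `range_normalize` and the sample values are hypothetical, and the only point is to show why peaks taken from the scaled signal carry no physical unit.

```python
def range_normalize(signal):
    """Scale a sequence of samples to the interval [0, 1] (min-max scaling)."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        # A flat signal has no range; return zeros instead of dividing by zero.
        return [0.0 for _ in signal]
    return [(x - lo) / (hi - lo) for x in signal]


# Hypothetical raw Z-axis samples in g (illustrative values only).
raw_z = [0.98, 1.02, 1.10, 1.04, 0.95]
normalized = range_normalize(raw_z)

# After scaling, the largest sample maps to 1 and the smallest to 0,
# so any peak read from `normalized` is dimensionless (a.u.).
print(normalized)
```

Because the original range in g is divided out, two sensors with different raw amplitudes can produce the same normalized peak value.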

Comment 2: Please clarify the fixation and possible relative rotation of the two stacked sensors across postures (RE1–RE4). You state that the antero-posterior axis (Z-axis) was used. Specify whether any alignment of sensor axes across postures was performed, and whether any gravity-related correction was applied. This is important because differences in axis orientation may affect peak detection on the Z-axis and may contribute to differences between the prototype and the reference device.

Response 2: The two sensors (prototype and reference) were placed one on top of the other, taking advantage of their flat surfaces, and secured with an elastic band above the left lower rib. This setup ensured stable and reproducible positioning and a fixed relative orientation between the two devices across all postures (RE1–RE4). For this reason, a specific axis alignment procedure between the two sensors was not deemed necessary. Regarding signal correction, we confirm that gravity component removal was applied to the acceleration signal. In addition, both sensors underwent calibration procedures for the accelerometer, gyroscope, and magnetometer. A clarification of the sensor positioning and signal correction has been added to the revised manuscript in Section 2.1.

Comment 3: The segmentation relies on MATLAB findpeaks with parameters selected through iterative testing and a 30% maximum-peak threshold. Please provide evidence for these choices. Evaluate a simple adaptive rule by comparing the 30% relative threshold with alternatives (e.g., 20% and 40%), and report the impact on segmentation performance across RE1–RE4. This evidence will clarify robustness and support generalization across postures and participants.

Response 3: The 30% threshold for peak detection was selected based on iterative empirical testing across all participants and exercise postures (RE1–RE4). These preliminary tests showed that this value ensured stable respiratory cycle segmentation without differences in detection performance across the different positions and sensors. However, no formal comparative analysis (e.g., vs. 20% or 40%) was performed, as the primary goal of this study was to validate the feasibility of the segmentation method using a low-cost sensor, within the limits of a small sample size and available resources. We fully agree that testing alternative thresholds or adopting adaptive approaches could further improve generalizability. This will be considered as a direction for future research. A clarification has been added to the Methods section and to the Discussion.
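The kind of threshold comparison requested by the reviewer can be sketched in a few lines of Python. This is a simplified stand-in, not MATLAB's findpeaks and not the authors' routine: the function `find_peaks_relative` and the toy signal are hypothetical, and the sketch only applies a minimum-height rule relative to the signal maximum.

```python
def find_peaks_relative(signal, rel_threshold=0.30):
    """Return indices of local maxima whose height is at least
    rel_threshold times the maximum value of the signal."""
    min_height = rel_threshold * max(signal)
    peaks = []
    for i in range(1, len(signal) - 1):
        # A sample is a local maximum if it rises from the left
        # and does not rise further to the right.
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] >= min_height:
            peaks.append(i)
    return peaks


# Toy normalized signal with four breaths of unequal amplitude
# (peak heights 1.0, 0.35, 0.25 and 0.9); values are illustrative only.
signal = [0.0, 0.5, 1.0, 0.5, 0.0, 0.2, 0.35, 0.2, 0.0,
          0.1, 0.25, 0.1, 0.0, 0.5, 0.9, 0.5, 0.0]

for threshold in (0.20, 0.30, 0.40):
    peaks = find_peaks_relative(signal, threshold)
    print(f"threshold {threshold:.0%}: {len(peaks)} cycles at indices {peaks}")
```

In this toy case the cycle count drops from four to three to two as the threshold rises, which is the sensitivity a formal comparison across RE1–RE4 would quantify on real recordings.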

Comment 4: The fourth-order Butterworth low-pass filter had an “optimal cut-off frequency” between 0.5 and 1 Hz. Please clarify whether this cut-off was the same for all participants and exercises, or selected per participant/posture. If adaptive, report the distribution of selected values by exercise (RE1–RE4) and justify how the selection was made according to the cited method.

Response 4: The cut-off frequency of the fourth-order Butterworth low-pass filter was kept constant across all participants and exercises. This filter, with a cut-off frequency of 1 Hz, was chosen to attenuate noise and reduce movement artifacts while preserving the typical frequency components of quiet breathing. This choice is supported by previous studies [24] that adopted similar filtering strategies for respiratory signal processing using wearable sensors. Given the nature of our signals and the relatively slow frequency of the respiratory movements, the selected filter preserves physiological content while removing higher-frequency motion artifacts. We acknowledge that an adaptive or task-specific cut-off selection could further improve signal quality and segmentation performance. This direction will be considered for future work, where automated frequency optimization techniques may be applied. A clarification has been added to the Limitations section of the manuscript.
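To make the "preserves breathing, removes artifacts" argument concrete, the following Python sketch evaluates the textbook magnitude response of an ideal analog Butterworth low-pass filter, |H(f)| = 1 / sqrt(1 + (f/fc)^(2n)). It is an illustration only: the authors' digital MATLAB implementation will differ in detail, and the 0.25 Hz breathing frequency and 3 Hz artifact frequency are assumed example values, not figures from the study.

```python
import math


def butterworth_gain(freq_hz, cutoff_hz, order):
    """Magnitude response of an ideal analog Butterworth low-pass filter."""
    return 1.0 / math.sqrt(1.0 + (freq_hz / cutoff_hz) ** (2 * order))


def gain_db(gain):
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)


CUTOFF_HZ = 1.0  # cut-off stated in the response
ORDER = 4        # fourth-order filter stated in the response

# 0.25 Hz (about 15 breaths/min) and 3 Hz are assumed example frequencies.
for label, f in (("breathing", 0.25), ("cut-off", 1.0), ("artifact", 3.0)):
    g = butterworth_gain(f, CUTOFF_HZ, ORDER)
    print(f"{label:9s} {f:4.2f} Hz: gain {g:.4f} ({gain_db(g):.1f} dB)")
```

Under these assumptions the breathing component passes almost unchanged, the gain at the cut-off is 1/sqrt(2), and the 3 Hz component is attenuated by roughly 38 dB.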

Comment 5: Counting errors concentrated in the seated respiratory exercise with back support. Please discuss plausible mechanisms related to the rib cage motion in this posture (e.g., reduced antero-posterior excursion, changes in the breathing pattern) and how they affect Z-axis peak prominence. Suggest practical improvements for future protocols, such as testing an alternative position on the rib cage, adjusting the elastic band fixation, or using posture-specific adaptive thresholds.

Response 5: As noted, cycle counting errors were concentrated in the seated exercise with back support (RE2). A plausible explanation is the reduced antero-posterior movement of the rib cage in this posture, due to both spinal support and potential restriction of diaphragmatic excursion, which may lead to lower peak amplitudes along the Z-axis. This reduced prominence may in turn affect the segmentation algorithm, which relies on Z-axis peak detection above a relative threshold. To address this limitation, we agree that posture-specific adaptations should be explored in future work. A discussion of these plausible mechanisms and future directions has been added to the revised manuscript.

Reviewer 2 Report

Comments and Suggestions for Authors

1. The abstract should quantify the main conclusions. It is recommended to include numerical ranges of the key indicators in the abstract and specify the sample size.
2. The introduction should clearly state the verifiable hypothesis of this study (e.g., “a single IMU under four postures achieves TimeRR BA Bias < 0.1 s and LoA < ±1.0 s”).
3. Since sensor orientation relative to gravity changes significantly under different postures (supine/sitting/standing), it should be clarified whether gravity compensation or posture normalization via sensor fusion was performed; otherwise, Z-axis amplitude is not comparable.
4. Figure 4 radar chart is not suitable for presenting BA agreement. It is recommended to use a Bland–Altman difference vs. mean plot as the main figure, with Bias, LoA, and proportional bias regression line indicated.

Author Response

Comment 1: The abstract should quantify the main conclusions, It is recommended to include numerical ranges of the key indicators in the abstract and specify the sample size.

Response 1: Key quantitative information has been reported in the abstract.

Comment 2: The introduction should clearly state the verifiable hypothesis of this study (e.g., “a single IMU under four postures achieves TimeRR BA Bias < 0.1 s and LoA < ±1.0 s”).

Response 2: This suggestion has been added to the introduction.

Comment 3: Since sensor orientation relative to gravity changes significantly under different postures (supine/sitting/standing), it should be clarified whether gravity compensation or posture normalization via sensor fusion was performed; otherwise, Z-axis amplitude is not comparable.

Response 3: We fully agree that sensor orientation relative to gravity may affect Z-axis signal comparability across postures. In response to both reviewers' comments, we specified how gravity correction was applied to the acceleration signal in the revised manuscript. Additionally, the prototype and reference sensors were stacked with fixed orientation, and no axis realignment or sensor fusion was applied. These clarifications have been added in the Methods section (Section 2.2), along with a discussion of the limitations this may introduce in Section 4.3.

Comment 4: Figure 4 radar chart is not suitable for presenting BA agreement. It is recommended to use a Bland–Altman difference vs. mean plot as the main figure, with Bias, LoA, and proportional bias regression line indicated.

Response 4: We appreciate the reviewer’s suggestion regarding the use of Bland–Altman (BA) plots to illustrate agreement between devices. We acknowledge that radar plots are not standard for BA analysis, but in this case we adopted them intentionally as a synthetic and compact visual summary, aiming to present an overview across multiple parameters and exercise conditions (7 parameters × 4 exercises = 28 comparisons). Including individual BA plots for each combination would have significantly increased the length and visual complexity of the main figures. That said, we agree that detailed Bland–Altman difference vs. mean plots provide valuable insight into agreement patterns. Therefore, we have included all 28 BA plots in the Supplementary Materials to allow a comprehensive view of the analysis. A statement clarifying this has been added to the figure caption and main text.
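For readers unfamiliar with the quantities behind each Bland–Altman plot, the Python sketch below computes the bias and the 95% limits of agreement for one parameter. It is a generic illustration, not the authors' analysis code: the function name and the paired values are hypothetical, the sign convention (reference minus prototype) follows the BIAS definition given elsewhere in this record, and the LoA use the conventional bias ± 1.96 SD of the differences.

```python
from statistics import mean, stdev


def bland_altman(reference, prototype):
    """Return (bias, lower LoA, upper LoA) for paired measurements.

    Differences are taken as reference minus prototype; the limits of
    agreement are bias +/- 1.96 times the sample SD of the differences.
    """
    diffs = [r - p for r, p in zip(reference, prototype)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired values for one parameter in one exercise
# (for example, a cycle duration in seconds); not data from the study.
reference = [3.1, 2.9, 3.4, 3.0, 3.2]
prototype = [3.0, 3.0, 3.3, 3.1, 3.1]

bias, loa_low, loa_high = bland_altman(reference, prototype)
print(f"bias = {bias:.3f}, LoA = [{loa_low:.3f}, {loa_high:.3f}]")

# The x-axis of a difference-vs-mean plot is the pairwise mean.
pair_means = [(r + p) / 2 for r, p in zip(reference, prototype)]
```

Repeating this for each of the 7 parameters and 4 exercises yields the 28 bias and LoA pairs that the plots summarize.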

Reviewer 3 Report

Comments and Suggestions for Authors

The manuscript presents a well-designed study with clear objectives, statistical data, and results, as well as identified limitations.

However, one methodological concern arose. At rest, abdominal breathing is normal and predominates over thoracic (rib) breathing. Why were the sensors attached to the rib arch? Would respiratory excursions not have been greater on the abdomen? Was this IMU position related to the respiration task (protocol)? Perhaps breathing according to the protocol is somewhat artificial and requires greater involvement of the intercostal muscles?

A minor comment – line 375 (abbreviations): BIAS definition is not completed – “BIAS Difference between the mean reference and the” (?)

Author Response

Comment 1: The manuscript presents a well-designed study with clear objectives, statistical data, and results, as well as identified limitations. However, one methodological concern arose. At rest, abdominal breathing is normal and predominates over thoracic (rib) breathing. Why were the sensors attached to the rib arch? Would respiratory excursions not have been greater on the abdomen? Was this IMU position related to the respiration task (protocol)? Perhaps breathing according to the protocol is somewhat artificial and requires greater involvement of the intercostal muscles? A minor comment – line 375 (abbreviations): BIAS definition is not completed – “BIAS Difference between the mean reference and the” (?)

Response 1: We agree that abdominal breathing is predominant at rest, particularly in relaxed conditions. However, the respiratory tasks included in our study were not performed in a resting state. Instead, they were part of a structured rehabilitative exercise protocol, which required conscious breathing by the participants. As a result, the breathing pattern was likely more thoracic than abdominal due to increased engagement of the intercostal and accessory muscles, consistent with controlled exercise execution. The chosen sensor placement was based on the results of a prior feasibility study [23], where we compared different anatomical locations. That analysis showed that this rib cage position provided stable, repeatable, and sufficiently prominent respiratory signals for the selected exercises. Additionally, this placement is supported by literature as a commonly used and validated position for monitoring respiratory movements during structured tasks and physical activity [14,15]. A clarification on the sensor placement rationale, as well as the active nature of the respiratory exercises, has been added to the revised manuscript. The definition of BIAS was corrected to “Difference between the mean reference and the prototype sensor values”.

Reviewer 4 Report

Comments and Suggestions for Authors

This paper explores the use of a low-cost inertial sensor to characterize respiratory exercises in older adults, comparing it against a commercial reference IMU. The topic is interesting and clinically relevant, particularly for telerehabilitation and home-based monitoring. The manuscript is well written, but several aspects require clarification, expansion, or methodological strengthening.

Major comments

Sample Size and Representativeness

The sample is small (n = 11, mostly female) and limits generalizability. Please provide a justification (e.g., a priori or post-hoc power analysis) and discuss in more depth how this affects the establishment of reference values.

Validation Approach

The study compares the prototype with another accelerometer, not with a clinical gold standard (spirometry or respiratory inductance plethysmography). While this is acknowledged, it should be highlighted more strongly in the limitations and future work, as it affects the interpretation of construct validity.

Signal Processing Transparency

The filtering and peak detection parameters are described but not fully justified. Please provide clear rationale for the selected cut-off frequencies and thresholds. Ideally, consider providing the MATLAB code or pseudo-code as supplementary material to improve reproducibility.

Statistical Analysis

Effect sizes and confidence intervals should be consistently reported to enhance statistical robustness.

Interpretation of Limits of Agreement (LoA)

Some parameters (e.g., tidal volume, tidal volume variability) show wide LoA. Please discuss whether these levels of disagreement are clinically acceptable for rehabilitation monitoring, or whether they limit the use of the prototype.

Minor comments

Introduction

Strengthen the literature review with the most recent studies on wearable respiratory monitoring:

• Vitazkova, D.; Foltan, E.; Kosnacova, H.; Micjan, M.; Donoval, M.; Kuzma, A.; Vavrinsky, E. Advances in Respiratory Monitoring: A Comprehensive Review of Wearable and Remote Technologies. Biosensors 2024, 14(2), 90. https://doi.org/10.3390/bios14020090
• Massaroni, C.; Nicolò, A.; Lo Presti, D.; Sacchetti, M.; Silvestri, S.; Schena, E. Contact-Based Methods for Measuring Respiratory Rate: A Review. Sensors 2019, 19(4), 908. https://doi.org/10.3390/s19040908
• Hussain, T.; et al. Wearable Sensors for Respiration Monitoring: A Review. Sensors 2023, 23(17), 7518. https://doi.org/10.3390/s23177518
• Monaco, V.; Stefanini, C. Assessing the Tidal Volume through Wearables: A Scoping Review. Sensors 2021, 21(12), 4124. https://doi.org/10.3390/s21124124

Define all abbreviations at first mention (e.g., TRL, LoA).

Methods

Provide more detail on sensor placement (landmarks, distances).

Justify why only the Z-axis was selected for analysis, and address possible errors due to rib orientation.

Discussion

Avoid repeating results; focus instead on practical implications and the potential integration into clinical telerehabilitation.

Author Response

Comment 1: The sample is small (n = 11, mostly female) and limits generalizability. Please provide a justification (e.g., a priori or post-hoc power analysis) and discuss in more depth how this affects the establishment of reference values.

Response 1: We acknowledge that the small sample size (n = 11, predominantly female) limits the generalizability of the findings, and particularly the establishment of robust normative reference values. However, this study was designed as a pilot methodological investigation, primarily aimed at validating the functional feasibility of a low-cost wearable sensor for capturing respiratory metrics in a structured exercise protocol. The reference values reported should therefore be interpreted as preliminary benchmarks, useful for guiding future protocol development and sensor optimization. We have now expanded the discussion of this limitation in the revised manuscript and highlighted the need for larger and more diverse samples in future work to support broader generalization.

Comment 2: The study compares the prototype with another accelerometer, not with a clinical gold standard (spirometry or respiratory inductance plethysmography). While this is acknowledged, it should be highlighted more strongly in the limitations and future work, as it affects the interpretation of construct validity.

Response 2: We agree that the use of a secondary accelerometer does not represent a clinical gold standard such as spirometry or respiratory inductance plethysmography. As such, this limits the strength of construct validity conclusions that can be drawn from this comparison. Although our objective was to assess the consistency of the prototype, we acknowledge that true construct validation requires comparison with clinically established standards. We have now revised the Limitations section to emphasize this aspect more explicitly, and we plan to include spirometry in future studies to strengthen the clinical relevance of our findings.

Comment 3: The filtering and peak detection parameters are described but not fully justified. Please provide clear rationale for the selected cut-off frequencies and thresholds. Ideally, consider providing the MATLAB code or pseudo-code as supplementary material to improve reproducibility.

Response 3: This issue was also raised by another reviewer, and we have clarified in the revised manuscript that the filtering and peak detection parameters (cut-off frequencies and 30% peak threshold) were selected through iterative empirical testing, based on signal clarity and segmentation stability across all postures. The chosen cut-off range (0.5–1 Hz) is consistent with prior literature on normal breathing frequencies. To further support reproducibility, we have now included pseudocode snippets of the key MATLAB routines used for filtering and peak detection in the Supplementary Materials.

Comment 4: Effect sizes and confidence intervals should be consistently reported to enhance statistical robustness.

Response 4: Following the recommendation, we have revised Table 2 to include 95% confidence intervals (CI) alongside the mean and standard deviation values for each parameter and exercise. This addition improves the statistical robustness and interpretability of the results. Since the main comparison between the prototype and reference sensor was based on Bland–Altman analysis (Appendix), we did not include effect sizes (e.g., Cohen’s d) in the table to avoid redundancy and maintain consistency with the chosen validation framework.
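As an illustration of how a 95% CI of a mean can be obtained for a sample of n = 11, the Python sketch below uses the Student t distribution. This is an assumption made for the example: the record does not state which CI method the authors applied, and the function name and data values are hypothetical.

```python
import math
from statistics import mean, stdev

# Two-sided 95% critical value of Student's t for 10 degrees of freedom,
# i.e. a sample of n = 11 participants.
T_CRIT_DF10 = 2.228


def mean_ci95(values, t_crit=T_CRIT_DF10):
    """Return (lower, upper) bounds of the 95% CI of the mean."""
    m = mean(values)
    half_width = t_crit * stdev(values) / math.sqrt(len(values))
    return m - half_width, m + half_width


# Hypothetical values of one parameter for 11 participants
# (illustrative only; not data from the study).
values = [3.0, 3.2, 2.8, 3.1, 2.9, 3.3, 3.0, 2.7, 3.1, 3.0, 2.9]

lower, upper = mean_ci95(values)
print(f"mean = {mean(values):.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

With a sample this small, the t critical value (2.228) is noticeably larger than the normal-approximation value of 1.96, so the resulting interval is wider.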

Comment 5: Some parameters (e.g., tidal volume, tidal volume variability) show wide LoA. Please discuss whether these levels of disagreement are clinically acceptable for rehabilitation monitoring, or whether they limit the use of the prototype.

Response 5: It is true that some parameters, such as tidal volume (TVolume) and tidal volume variability (TVar), showed wider limits of agreement (LoA) compared to others. This may be partially attributed to differences in sensor design and signal amplitude, which affect the estimation of relative volume changes. However, it is important to note that the goal of this prototype is not to replace clinical-grade measurement tools, but to offer a low-cost, accessible solution for rehabilitation monitoring, where the detection of trends and relative changes over time is often more relevant than absolute precision. That said, we acknowledge that further refinement of the algorithm and validation against clinical gold standards (e.g., spirometry) are necessary steps to improve the reliability of volume-related parameters. This has been added to the discussion section.

Comment 6: Strengthen the literature review with the most recent studies on wearable respiratory monitoring

Response 6: Vitazkova et al. 2024 and Massaroni et al. 2019 have already been cited as references. Monaco et al. 2021 and Hussain et al. 2023 have been added.

Comment 7: Define all abbreviations at first mention (e.g., TRL, LoA).

Response 7: Abbreviations have been defined in the dedicated section.

Comment 8: Justify why only the Z-axis was selected for analysis, and address possible errors due to rib orientation.

Response 8: The sensor was placed approximately 2–3 cm above the left lower rib margin in a position chosen to maximize the detection of rib cage excursion during breathing. This anatomical location was identified as optimal in a prior feasibility study [22], which compared multiple sensor placements for respiratory signal clarity and repeatability in the context of structured exercises. The flat surfaces of both the prototype and reference sensors allowed for a stable, stacked configuration, secured using an elastic band to minimize movement artifacts. As previously discussed in response to another comment, we selected the antero-posterior (Z) axis because it is perpendicular to the rib cage at the selected position and is therefore most sensitive to thoracic expansion and contraction during respiration. This approach is also consistent with several previous studies that used single-axis accelerometry for respiratory monitoring [25]. We acknowledge that rib curvature and orientation may introduce small alignment differences between individuals or postures, potentially affecting the absolute amplitude of the Z-axis signal. However, the use of a normalized signal helped reduce this variability. Future studies may consider using multi-axis fusion or orientation-aware corrections to account for anatomical differences more precisely. Additional details on sensor placement and axis selection have been included in the revised Methods and Limitations sections.

Comment 9: Avoid repeating results; focus instead on practical implications and the potential integration into clinical telerehabilitation.

Response 9: In the Discussion, redundant phrases were removed, and the practical applications and results of this study were reinforced.

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

In response to the previous review comments, the author has made good revisions.

Reviewer 4 Report

Comments and Suggestions for Authors

The revised manuscript adequately addresses all reviewer concerns. Limitations are acknowledged and integrated appropriately, methodological justifications have been added, and new references strengthen the contextual background. The inclusion of pseudocode enhances transparency and reproducibility. I recommend acceptance.
