Open Access Article

Peer-Review Record

Task-Dependent Effectiveness of a Quasi-Direct-Drive Upper-Limb Exoskeleton: Shoulder Muscle Offloading Versus Metabolic Cost in Overhead Work

Bioengineering 2026, 13(4), 423; https://doi.org/10.3390/bioengineering13040423

by Yongxuan Hong^1,2,†, Jiying Du^3,†, Sida Du^2, Yue Ma^2, Xiangyang Wang^2 and Chunjie Chen^2,*

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Submission received: 11 February 2026 / Revised: 27 March 2026 / Accepted: 31 March 2026 / Published: 3 April 2026

(This article belongs to the Special Issue Advanced Wearable Sensors for Human Gait Analysis)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This manuscript evaluates a quasi-direct-drive (QDD) upper-limb exoskeleton in seven healthy male participants during simulated static and dynamic overhead tasks, quantifying shoulder muscle activity, metabolic cost, and kinematic stability. The key contribution is the identification of a strong task-dependent dissociation: substantial and unanimous upper trapezius offloading during static holding (−68%, d=3.61) coexists with increased whole-body metabolic cost, particularly during dynamic tasks.

Here are my comments:

The manuscript overstates novelty. Task-dependent performance differences between static and dynamic tasks have been repeatedly reported in both passive and active exoskeleton literature. The added value here is specifically the QDD actuation architecture combined with simultaneous metabolic and sEMG measurements.

The dynamic task pace is self-selected, introducing variability in metabolic cost unrelated to exoskeleton effects.

The anthropometric range is limited: were all participants young, male, and non-industrial workers?

No female participants were included, despite known sex differences in shoulder biomechanics and fatigue characteristics.

No a priori power analysis is reported.

No workload quantification (number of screw cycles, angular excursion range, torque demand) is reported.

Static holding duration (2.5 min) may be insufficient to induce meaningful fatigue in healthy young participants.

Several important cross-disciplinary works are missing, such as Funabot-Sleeve: A Wearable Device Employing McKibben Artificial Muscles for Haptic Sensation in the Forearm.

Baseline-relative normalization removes the ability to interpret absolute muscle load relative to capacity.

Steady state is assumed from the final 2 minutes of each task. No RER stability or VO₂ time-course plots are shown.

The static task may not have achieved metabolic steady state within 2.5 minutes.

Donning/doffing effort is not isolated from task cost.

The IMU location and orientation are insufficiently specified.

Angle SD values (~0.1°) appear unrealistically small, suggesting heavy filtering or resolution limitations.

Jerk calculation via Savitzky-Golay smoothing is not parameter-validated.

Author Response

Comments 1: [The manuscript overstates novelty. Task-dependent performance differences between static and dynamic tasks have been repeatedly reported in both passive and active exoskeleton literature. The added value here is specifically the QDD actuation architecture combined with simultaneous metabolic and sEMG measurements.]

Response 1: [Thank you for this important clarification. We fully agree that task-dependent performance differences are not novel per se. We have revised the Introduction and Discussion (Section 4) to explicitly acknowledge prior work and precisely define our incremental contribution. The revised Discussion now states: "While task-dependent performance differences between static and dynamic conditions have been reported in both passive and active exoskeleton literature [21], the present study extends these observations to the QDD actuation paradigm and, critically, provides simultaneous metabolic calorimetry, surface electromyography, and kinematic assessment within a single protocol—a combination that remains rare in the field." Throughout the manuscript, claims of novelty have been replaced with language accurately framing this study as extending existing observations to QDD actuation with multi-modal physiological assessment.]

Comments 2: [The dynamic task pace is self-selected, introducing variability in metabolic cost unrelated to exoskeleton effects.]

Response 2: [Thank you for raising this valid concern. We acknowledge that self-selected pacing constitutes a confound for metabolic comparisons. In the revised manuscript, we have addressed this in three ways. First, we now report individual completed screw cycle counts and mean shoulder angular excursion under both conditions in Table 3, enabling readers to assess pace variability. The revised Section 2.3 states: "The self-selected pace was chosen to preserve ecological validity, as industrial workers naturally self-regulate manipulation speed; however, this introduces inter- and intra-individual pace variability as a potential confound for metabolic comparisons. To partially address this, completed screw cycles were counted retrospectively from IMU-derived shoulder angle periodicity for each participant under both conditions (Table 3)." Second, we report that mean cycle counts did not differ significantly between conditions (p > 0.05, Wilcoxon signed-rank test) in Section 3.2, providing partial reassurance. Third, we explicitly recommend metronome-paced protocols for future confirmatory studies in the Discussion (Section 4): "Task-adaptive control architectures enabling seamless transitions between high-torque assistance and low-impedance transparency could address the control mismatch limitation, while metronome-paced protocols in future studies would isolate the device's intrinsic metabolic impact from pace-induced variability."]

Comments 3: [The anthropometric range is limited: were all participants young, male, and non-industrial workers?]

Response 3: [We agree this is a critical limitation. The revised manuscript now includes a dedicated paragraph in Section 2.1 explicitly acknowledging the demographic constraints: "All participants were young (24–26 years), male, and university-affiliated, without industrial overhead work experience. This homogeneous sample limits generalizability to the target industrial population." This limitation is reiterated in Sections 3.1, 4, and 5, and we explicitly recommend future recruitment from actual industrial populations spanning 18–60 years of age in the concluding future-work statement.]

Comments 4: [No female participants were included, despite known sex differences in shoulder biomechanics and fatigue characteristics.]

Response 4: [Thank you for this important point. We have added a substantive discussion of sex-based limitations to Section 2.1 of the revised manuscript: "Notably, no female participants were recruited. Sex-based differences in shoulder muscle architecture, fatigue resistance profiles, strength-to-body-weight ratios, and scapulohumeral kinematics are well-documented [20] and may substantially influence both muscle offloading efficacy and metabolic responses. Given that the 5 kg device mass represents approximately 6.7% of the mean participant body mass (74.6 kg) but could constitute 8–10% for lighter females, the metabolic penalty may be disproportionately amplified in female users. Future studies should recruit mixed-sex cohorts from industrial worker populations spanning 18–60 years of age." A supporting reference (Côté, 2012 [20]) on sex/gender differences in work-related neck/shoulder disorders has been added. The Conclusion (Section 5) also now explicitly states that confirmation with female participants is essential.]
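
The mass-ratio argument in this response is simple arithmetic and can be checked directly. The sketch below takes the 5 kg device mass and the 74.6 kg mean body mass from the quoted passage; the lighter body masses are hypothetical values chosen only to illustrate the 8–10% range, and the function name is an assumption of this example.

```python
def device_mass_ratio(device_kg: float, body_kg: float) -> float:
    """Return device mass as a percentage of body mass."""
    return 100.0 * device_kg / body_kg

DEVICE_KG = 5.0  # device mass quoted in the response

# Mean participant body mass quoted in the response
mean_ratio = device_mass_ratio(DEVICE_KG, 74.6)

# Hypothetical lighter users (illustrative values, not study data)
lighter = {body_kg: device_mass_ratio(DEVICE_KG, body_kg)
           for body_kg in (62.5, 55.0, 50.0)}

print(f"mean participant: {mean_ratio:.1f}%")
for body_kg, ratio in lighter.items():
    print(f"{body_kg:.1f} kg user: {ratio:.1f}%")
```

Under these inputs, the 8–10% range corresponds to body masses between 62.5 kg and 50 kg.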

Comments 5: [No a priori power analysis is reported.]

Response 5: [We acknowledge that no a priori power analysis was conducted. The revised Section 2.1 now transparently discloses this and provides comprehensive post-hoc power information: "This study was designed as an exploratory pilot investigation; no formal a priori power analysis was performed. The sample size (n = 7) was determined by participant availability and practical constraints associated with custom exoskeleton fitting. Post-hoc power analysis based on the primary outcome (Upper Trapezius sEMG change during static holding, observed d = 3.61) indicated achieved power > 0.99 with n = 6. For the secondary metabolic outcome (d = 0.98), achieved power was 0.52 with n = 7, confirming that this study was adequately powered for detecting large electromyographic effects but underpowered for metabolic comparisons. A future confirmatory study targeting medium effects (d = 0.8) at α = 0.05 with 80% power would require n ≥ 15 per condition." The manuscript now consistently characterizes all findings as "preliminary" and "exploratory," and the Conclusion explicitly calls for confirmatory studies with n ≥ 15.]
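
The sample-size recommendation quoted above can be approximated with a short calculation. The sketch below is not the authors' stated procedure: it assumes a two-sided paired t-test and uses the normal approximation with a commonly used z²/2 small-sample correction in place of an exact noncentral-t computation. The function name `paired_n_required` is hypothetical.

```python
import math
from statistics import NormalDist

def paired_n_required(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n for a two-sided paired t-test.

    Normal approximation plus a z^2/2 small-sample correction
    (an assumption of this sketch, not the authors' stated method).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)  # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)               # quantile for the target power
    n = ((z_alpha + z_beta) / d) ** 2 + z_alpha ** 2 / 2.0
    return math.ceil(n)

n_needed = paired_n_required(0.8)
print(n_needed)
```

With d = 0.8, α = 0.05, and 80% power this approximation gives 15, which agrees with the n ≥ 15 quoted in the response; dedicated power software may differ slightly for other inputs.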

Comments 6: [No workload quantification (number of screw cycles, angular excursion range, torque demand) is reported.]

Response 6: [Thank you for this constructive suggestion. We have now added comprehensive workload quantification in the revised manuscript. Table 3 reports individual completed screw cycle counts and mean shoulder angular excursion range under both WO and WE conditions for each participant. Section 2.3 specifies screw specifications and estimated torque demand: "Screws were M6 × 20 mm (Grade 8.8 steel), tightened to finger-tight resistance using a 5 mm hex wrench (arm length: 150 mm), with estimated peak manipulation torque of 2–4 N·m." Section 3.2 reports that mean cycle counts did not differ significantly between conditions.]

Comments 7: [Static holding duration (2.5 min) may be insufficient to induce meaningful fatigue in healthy young participants.]

Response 7: [We agree this is a valid concern. The revised manuscript addresses this in two places. Section 2.3 now provides the rationale for the selected duration: "The 2.5-minute duration was selected based on established endurance time guidelines for overhead work with external loads at 90° shoulder flexion [6], where significant EMG amplitude increases indicating fatigue onset have been observed within 1–2 minutes, and was further constrained by the requirement for metabolic gas exchange stabilization within the measurement window." Additionally, the Discussion (Section 4) and Conclusion (Section 5) now explicitly identify this as a limitation requiring future investigation: "The 2.5-minute static holding protocol may not have induced meaningful fatigue in healthy young participants; prolonged or repeated-bout designs are needed to assess whether muscle offloading translates into delayed fatigue onset." Future work recommendations include extended static holding durations (≥5 minutes) with verified VO₂ plateau criteria.]

Comments 8: [Several important cross-disciplinary works are missing, such as Funabot-Sleeve: A Wearable Device Employing McKibben Artificial Muscles for Haptic Sensation in the Forearm.]

Response 8: [Thank you for highlighting this relevant cross-disciplinary work. We have incorporated the suggested reference (Peng et al., 2025) and integrated it meaningfully into the Discussion (Section 4): "Beyond electromagnetic QDD actuators, alternative actuation paradigms such as pneumatic artificial muscles (e.g., McKibben actuators) offer high force-to-weight ratios and inherent compliance for upper-limb wearable devices [29], though their nonlinear force–length characteristics and dependence on external pneumatic supply present different deployment constraints that merit comparative investigation." This addition broadens the discussion beyond electromagnetic QDD to encompass alternative soft actuation paradigms, providing readers with a more comprehensive perspective on wearable assistive device technologies.]

Comments 9: [Baseline-relative normalization removes the ability to interpret absolute muscle load relative to capacity.]

Response 9: [We fully agree with this methodological concern. The revised manuscript now explicitly acknowledges this limitation in Section 2.4: "However, this approach quantifies the proportional change in muscle activation attributable to the exoskeleton but does not indicate absolute muscle demand relative to maximum voluntary capacity. A 68% reduction in a muscle operating at 15% MVC carries different ergonomic implications than the same proportional reduction at 60% MVC." To provide the capacity-normalized context the Reviewer requests, we have added Figure 4, which presents absolute %MVC values under both WO and WE conditions during static holding for the subset of participants with valid MVC data (n = 4). This enables readers to contextualize the proportional reductions relative to maximum voluntary capacity. The Conclusion (Section 5) further recommends that future studies combine both normalization approaches.]
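
The distinction between proportional and absolute load drawn in this response can be made concrete with a small worked example. The sketch uses only the figures named in the quoted passage (a 68% reduction at 15% MVC and at 60% MVC); the function name is an assumption of this example.

```python
def activation_after_reduction(baseline_pct_mvc: float, reduction_pct: float) -> float:
    """Absolute activation (%MVC) remaining after a proportional reduction."""
    return baseline_pct_mvc * (1.0 - reduction_pct / 100.0)

for baseline in (15.0, 60.0):  # the two %MVC levels named in the quoted passage
    remaining = activation_after_reduction(baseline, 68.0)
    print(f"{baseline:.0f}% MVC -> {remaining:.1f}% MVC "
          f"(absolute drop {baseline - remaining:.1f} points)")
```

The same proportional change leaves the muscle at about 4.8% MVC in one case and 19.2% MVC in the other, which is why baseline-relative values alone cannot indicate load relative to capacity.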

Comments 10: [Steady state is assumed from the final 2 minutes of each task. No RER stability or VO₂ time-course plots are shown.]

Response 10: [Thank you for this important methodological point. We have now added three pieces of evidence supporting the steady-state assumption. First, the revised Section 2.4 reports coefficient of variation (CV) verification: "Mean CV was 8.2 ± 3.1% (dynamic task) and 9.4 ± 4.0% (static task), meeting the conventional steady-state criterion of CV < 10%." Second, RER values are now reported: "Mean respiratory exchange ratio (RER) during the analyzed periods was 0.85 ± 0.07 (dynamic) and 0.82 ± 0.05 (static), indicating predominantly aerobic metabolism consistent with sub-maximal conditions." Third, Figure 5 has been added, showing representative breath-by-breath VO₂ time-course plots with the analyzed window highlighted and CV values annotated.]
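
The CV-based steady-state check described in this response can be sketched as follows. The VO₂ values are hypothetical, and the use of the sample standard deviation is an assumption, since the response does not state which estimator was used; only the CV < 10% criterion comes from the quoted text.

```python
from statistics import mean, stdev

def coefficient_of_variation(samples: list[float]) -> float:
    """Sample CV in percent: 100 * SD / mean."""
    return 100.0 * stdev(samples) / mean(samples)

def is_steady_state(vo2_window: list[float], cv_limit_pct: float = 10.0) -> bool:
    """Apply the CV < 10% criterion quoted in the response."""
    return coefficient_of_variation(vo2_window) < cv_limit_pct

# Hypothetical breath-by-breath VO2 values (mL/min) from an analysis window
vo2_window = [410.0, 395.0, 430.0, 402.0, 418.0, 389.0, 425.0, 407.0]
cv = coefficient_of_variation(vo2_window)
print(f"CV = {cv:.1f}%, steady state: {is_steady_state(vo2_window)}")
```

A low CV shows that the window is stable but not that a plateau has been reached, so this check complements rather than replaces inspection of the VO₂ time course.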

Comments 11: [The static task may not have achieved metabolic steady state within 2.5 minutes.]

Response 11: [We agree this concern cannot be fully eliminated. While the CV verification reported in Response 10 provides partial reassurance, the revised Discussion (Section 4) now explicitly addresses this potential confound: "The static task metabolic data may be influenced by incomplete steady-state attainment: although the VO₂ coefficient of variation during the final 2-minute window met the conventional <10% criterion, the total 2.5-minute task duration provided only approximately 30 seconds for metabolic kinetics to stabilize before the analysis window began. The VO₂ on-kinetics time constant for low-intensity static work is typically 30–45 seconds [34], suggesting near-steady-state but not fully stabilized metabolism was captured. If donning the exoskeleton prolongs the metabolic on-transient—due to the additional postural adjustment demands imposed by 5 kg of externally mounted mass—this could asymmetrically inflate the WE condition estimate relative to WO." This limitation is also noted in the recommended future directions (extended static holding durations ≥5 minutes with verified VO₂ plateau criteria).]
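
The timing argument in this response can be illustrated with a first-order model of VO₂ on-kinetics. The sketch below assumes a mono-exponential rise with no time delay, which is a simplification introduced here rather than a model stated by the authors. The 30 s window start, 2.5-minute task length, and 30–45 s time constants are taken from the quoted passage.

```python
import math

def fraction_of_steady_state(t_s: float, tau_s: float) -> float:
    """Mono-exponential on-kinetics: 1 - exp(-t / tau)."""
    return 1.0 - math.exp(-t_s / tau_s)

def mean_fraction_over_window(t_start_s: float, t_end_s: float, tau_s: float) -> float:
    """Average of 1 - exp(-t / tau) over [t_start, t_end] (closed-form integral)."""
    span = t_end_s - t_start_s
    decay = math.exp(-t_start_s / tau_s) - math.exp(-t_end_s / tau_s)
    return 1.0 - tau_s * decay / span

for tau in (30.0, 45.0):  # time-constant range quoted in the response
    at_start = fraction_of_steady_state(30.0, tau)   # analysis window opens at 30 s
    at_end = fraction_of_steady_state(150.0, tau)    # task ends at 2.5 min
    window = mean_fraction_over_window(30.0, 150.0, tau)
    print(f"tau={tau:.0f} s: start {at_start:.0%}, end {at_end:.0%}, "
          f"window mean {window:.0%}")
```

Under these assumptions VO₂ would be roughly half to two-thirds of the way to its plateau when the analysis window opens, which supports the authors' recommendation of longer holding durations.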

Comments 12: [Donning/doffing effort is not isolated from task cost.]

Response 12: [Thank you for raising this concern. The revised Section 2.3 now explicitly describes how donning/doffing metabolic costs were excluded: "Exoskeleton donning (~3 minutes) and doffing (~2 minutes) were completed during a mandatory 15-minute inter-condition recovery period. Metabolic data collection was paused during device transition and resumed only after the participant had returned to quiet standing. The remaining ≥10 minutes of standing rest ensured that donning/doffing metabolic costs were fully excluded from subsequent task-phase analyses and that physiological parameters returned to baseline levels before the next condition commenced." This clarification confirms temporal separation of donning/doffing activity from task-phase metabolic data.]

Comments 13: [The IMU location and orientation are insufficiently specified.]

Response 13: [We agree that IMU specifications were inadequate. Section 2.2 has been substantially expanded with detailed sensor placement information: "The IMU was positioned on the lateral aspect of the dominant upper arm, approximately 5 cm distal to the acromion process, with the sensor's x-axis aligned with the humeral longitudinal axis (pointing distally), y-axis directed anteriorly, and z-axis directed laterally. The sensor was secured using medical-grade double-sided adhesive and reinforced with elastic cohesive bandage to minimize soft-tissue motion artifact. Shoulder flexion/extension angle was derived from gyroscope integration with complementary filter correction (blending coefficient α = 0.98) using the accelerometer-based gravity vector as a drift reference." IMU hardware specifications (triaxial accelerometer: ±16 g, gyroscope: ±2000°/s, 100 Hz, ~0.01° resolution) are also now reported.]
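
The complementary filter named in this response can be sketched in a few lines. This is a generic textbook form, not the authors' implementation: the blending coefficient α = 0.98 and the 100 Hz rate come from the quoted text, while the function name, the input signals, and the 1°/s gyroscope bias are hypothetical. The accelerometer-derived angle is taken as a ready-made input, so the axis conventions described above are not modeled.

```python
def complementary_filter(gyro_rates_dps, accel_angles_deg,
                         dt_s=0.01, alpha=0.98, angle0_deg=0.0):
    """Fuse gyroscope rate and accelerometer angle into one angle estimate.

    Each step blends the gyro-integrated angle (weight alpha) with the
    accelerometer-derived angle (weight 1 - alpha), which bounds gyro drift.
    """
    angle = angle0_deg
    estimates = []
    for rate, accel_angle in zip(gyro_rates_dps, accel_angles_deg):
        angle = alpha * (angle + rate * dt_s) + (1.0 - alpha) * accel_angle
        estimates.append(angle)
    return estimates

# Hypothetical static hold at 90 deg with a 1 deg/s gyro bias, 10 s at 100 Hz
n = 1000
est = complementary_filter([1.0] * n, [90.0] * n, angle0_deg=90.0)
print(f"final estimate: {est[-1]:.2f} deg")
```

In this example, pure gyroscope integration would drift by 10° over the 10 s, whereas the blended estimate settles at an offset of about 0.49°, which is the sense in which the accelerometer acts as a drift reference.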

Comments 14: [Angle SD values (~0.1°) appear unrealistically small, suggesting heavy filtering or resolution limitations.]

Response 14: [Thank you for this astute observation. We agree the small absolute magnitudes require contextualization. The revised Section 2.4 now specifies the high-pass filter applied (4th-order Butterworth, cutoff: 0.1 Hz) and explains that Angle SD captures postural micro-oscillations, not total range of motion. Section 3.4 now includes an explicit caution: "The small absolute magnitudes of angle SD (order of 0.1°) are consistent with participants maintaining a fixed instructed posture, but approach the IMU sensor's noise floor (~0.01° RMS for MEMS gyroscope-derived angles). Consequently, while the relative WO–WE differences and their direction are interpretable, absolute values should be treated with caution, and future studies employing optical motion capture would provide higher-resolution postural sway quantification." The future work section recommends optical motion capture systems to address these resolution limitations.]

Comments 15: [Jerk calculation via Savitzky-Golay smoothing is not parameter-validated.]

Response 15: [We agree that parameter sensitivity should be demonstrated. The revised manuscript now includes a comprehensive parameter sensitivity analysis presented in Figure 6 (four panels). Section 2.4 describes the analysis: "A parameter sensitivity analysis confirmed that the direction and relative magnitude of WO–WE differences in jerk RMS were qualitatively preserved across window lengths of 3, 5, 7, and 9 points, though absolute jerk magnitudes varied by approximately ±15%." Figure 6 demonstrates: (a–b) absolute jerk values across window lengths, (c) percentage change direction consistency, and (d) stable between-subject CV. These results confirm that the reported findings are robust to the specific smoothing parameter chosen.]
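
A window-length sensitivity check of the kind described in this response can be sketched without external libraries. The example below assumes a Savitzky-Golay first-derivative filter with a linear or quadratic local fit, for which the weights reduce to k/Σk²; the response does not state the polynomial order or whether differentiation was done inside the filter, so both are assumptions. The input trace is synthetic and all names are hypothetical; only the window lengths of 3, 5, 7, and 9 points come from the quoted text.

```python
import math

def sg_first_derivative(y, dt_s, window):
    """First derivative via a Savitzky-Golay filter (linear/quadratic local fit).

    For an odd window of half-width m, the weights reduce to k / sum(k^2),
    k = -m..m. Edge samples without a full window are skipped.
    """
    if window < 3 or window % 2 == 0:
        raise ValueError("window must be an odd integer >= 3")
    m = window // 2
    norm = dt_s * sum(k * k for k in range(-m, m + 1))
    return [
        sum(k * y[i + k] for k in range(-m, m + 1)) / norm
        for i in range(m, len(y) - m)
    ]

def rms(values):
    """Root mean square of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Hypothetical angular-acceleration trace (deg/s^2) sampled at 100 Hz:
# a 1 Hz movement component plus a small deterministic 20 Hz ripple
dt = 0.01
accel = [50.0 * math.sin(2 * math.pi * 1.0 * i * dt)
         + 2.0 * math.sin(2 * math.pi * 20.0 * i * dt) for i in range(500)]

jerk_rms_by_window = {w: rms(sg_first_derivative(accel, dt, w)) for w in (3, 5, 7, 9)}
for w, value in jerk_rms_by_window.items():
    print(f"window {w}: jerk RMS = {value:.1f} deg/s^3")
```

Running the same comparison on both WO and WE traces and checking that the sign of the difference is preserved across windows would mirror the robustness claim made for Figure 6.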

Reviewer 2 Report

Comments and Suggestions for Authors

This study evaluates a quasi-direct-drive active upper-limb exoskeleton during simulated industrial overhead tasks and proposes a “metabolic–muscle dissociation” as its central finding. Although the topic is relevant to occupational biomechanics, substantial methodological and analytical limitations undermine the validity of the conclusions.

1. Although seven participants were recruited, effective sample sizes drop to n=3–4 for several key metrics. Extremely large reported effect sizes (e.g., Cohen’s d = 3.61) are unstable and highly sensitive to small-sample variability. No a priori power analysis is reported, and the manuscript characterizes findings as “robust” and “replicable” despite limited statistical power. [Lines 83–84; 235–248]

2. The abandonment of %MVC normalization due to implausible values (>200% MVC in 43% of cases) reflects a fundamental protocol or measurement issue. Replacing this with baseline-relative normalization after data inspection does not constitute methodological validation. The claim that the new approach is “substantially superior” is based on only four overlapping cases and cannot be considered confirmatory evidence. [Lines 199–213]

3. The central interpretation of “metabolic–muscle dissociation” requires confirmation within the same individuals. However, metabolic and sEMG analyses appear to rely on partially different participant subsets. Without explicit within-subject confirmation, the dissociation remains insufficiently supported. [Tables 4-5]

4. Two Anterior Deltoid cases were excluded as “likely artifacts” without pre-specified exclusion criteria or presentation of raw signal validation. These exclusions directly produce the reported 100% responder rate. This raises concerns regarding post-hoc data handling and selective reporting. [Lines 362–364]

5. There appear to be inconsistencies in reported sample sizes between tables and narrative sections, and certain citation–reference alignments should be carefully verified. These issues suggest insufficient internal consistency checking prior to submission.

Overall Recommendation:

The concerns identified originate from study design decisions (sample size, normalization strategy, exclusion criteria) rather than issues that can be fully addressed through revision. A properly powered study (n ≥ 15), with prospectively defined normalization and exclusion criteria and explicit within-subject analyses, would be necessary to substantiate the claims made.

Author Response

Comments 1: [Although seven participants were recruited, effective sample sizes drop to n=3–4 for several key metrics. Extremely large reported effect sizes (e.g., Cohen's d = 3.61) are unstable and highly sensitive to small-sample variability. No a priori power analysis is reported, and the manuscript characterizes findings as "robust" and "replicable" despite limited statistical power.]

Response 1: [Thank you for this rigorous critique. We fully acknowledge that effect sizes from small samples are susceptible to inflation and instability. We have made the following revisions. First, the revised Section 2.1 now transparently states that no a priori power analysis was conducted and provides comprehensive post-hoc power information, including specific sample size recommendations for future confirmatory studies (n ≥ 15 per condition). Second, throughout the revised manuscript, all language characterizing findings as "robust" or "replicable" has been replaced with appropriately cautious terminology. For example, the Conclusion (Section 5) now states: "The high effect size and unanimous participant response within this pilot sample suggest a potentially meaningful pattern; however, the observed effect size magnitude may be inflated by small-sample variability, and replication in larger and more diverse cohorts—including female participants and industrial workers—is essential to confirm generalizability beyond the present homogeneous sample." Third, Section 3.1 now explicitly reports differential statistical power across outcomes: "Post-hoc power analysis confirmed that the achieved sample was adequately powered (>0.99) for the primary electromyographic outcome (Upper Trapezius, d = 3.61) but underpowered (0.52) for metabolic comparisons (d = 0.98), and metabolic results should therefore be interpreted as exploratory." Fourth, regarding the variable effective sample sizes (n = 3–7), we have retained the metric-specific exclusion approach rather than list-wise deletion, as the latter would unnecessarily discard valid data. However, sample sizes are now explicitly reported for every analysis, and all exclusion criteria are fully documented with physiological justification in Section 2.4. We believe these revisions appropriately frame the study as an exploratory pilot investigation while preserving the informational value of the observed patterns.]

Comments 2: [The abandonment of %MVC normalization due to implausible values (>200% MVC in 43% of cases) reflects a fundamental protocol or measurement issue. Replacing this with baseline-relative normalization after data inspection does not constitute methodological validation. The claim that the new approach is "substantially superior" is based on only four overlapping cases and cannot be considered confirmatory evidence.]

Response 2: [We acknowledge this concern as methodologically valid. The revised manuscript addresses this at multiple levels. First, we now explicitly state in Section 2.4 that the normalization adaptation was post-hoc rather than pre-specified: "We acknowledge that the normalization strategy was adapted after observing MVC data quality issues during analysis, rather than being pre-specified in the study protocol. This constitutes a methodological limitation of the present exploratory study." Second, the claim of "substantially superior" has been replaced. Section 3.3 now frames the comparison as providing "partial cross-validation" and states: "Nevertheless, the baseline-relative approach has independent methodological justification: it eliminates MVC reliability dependence, a recognized challenge in shoulder biomechanics research, and has been employed in prior exoskeleton intervention studies [8,9]. The convergent results from both normalization approaches for the overlapping 4-participant subset (Figure 4) provide partial cross-validation, though prospectively defined normalization criteria remain essential for future confirmatory studies." Third, Figure 4 has been added to present absolute %MVC values under both conditions for the 4-participant overlap subset, enabling readers to contextualize proportional changes relative to maximum voluntary capacity. We agree that robust MVC protocols with adequate familiarization should be prospectively defined in future studies, as recommended in the Conclusion (Section 5).]

Comments 3: [The central interpretation of "metabolic–muscle dissociation" requires confirmation within the same individuals. However, metabolic and sEMG analyses appear to rely on partially different participant subsets. Without explicit within-subject confirmation, the dissociation remains insufficiently supported.]

Response 3: [We agree this was a critical gap in the original manuscript. The revised manuscript now includes Table 7, which presents explicit within-subject paired analysis for participants with simultaneously valid Upper Trapezius sEMG and metabolic measurements during static holding (n = 5). Section 3.3 describes the results: "Of these five participants, four exhibited concurrent Upper Trapezius activation reduction (range: −48.8% to −89.8%) and metabolic cost increase (range: +8.8% to +194.7%), confirming within-subject dissociation. The sole exception was case4 (the lightest participant, 60 kg, device/body mass ratio 8.3%), who demonstrated both muscle offloading (UT: −44.0%) and marginal metabolic reduction (−8.0%), suggesting that device-to-body-weight ratio may represent a critical threshold determining whether localized biomechanical benefits translate into systemic energy savings." Table 7 also reports device-to-body-mass ratios for each participant and documents specific exclusion reasons (case2: artifact; case7: equipment malfunction). The Discussion (Section 4) confirms: "This dissociation was verified at the individual level: within the subset of participants with simultaneously valid paired data for both Upper Trapezius sEMG and metabolic cost (n = 5), four of five demonstrated concurrent muscle offloading and metabolic increase (Table 7), confirming that this pattern is not attributable to between-subject confounding from differential data availability." We believe this within-subject confirmation substantively addresses the Reviewer's concern.]

Comments 4: [Two Anterior Deltoid cases were excluded as "likely artifacts" without pre-specified exclusion criteria or presentation of raw signal validation. These exclusions directly produce the reported 100% responder rate. This raises concerns regarding post-hoc data handling and selective reporting.]

Response 4: [We take this concern seriously and have made several revisions to ensure transparency and address potential selective reporting. First, Section 2.4 now specifies exclusion thresholds with explicit physiological justification: "sEMG data were excluded if percentage change exceeded ±400% (indicating probable signal artifact such as electrode detachment) or baseline activation was <5 μV RMS (near noise floor). ... The ±400% sEMG threshold was selected because voluntary muscle activation modulation through exoskeleton intervention is physiologically bounded: the largest sEMG reductions reported in the exoskeleton literature approach −90% to −95% (near-complete offloading) [8,9], while increases exceeding +200–300% without changes in external load would require implausible levels of antagonist co-contraction or reflect non-physiological signal contamination (electrode lift-off, cable motion artifact, electromagnetic interference). Values beyond ±400% therefore almost certainly represent measurement artifacts rather than genuine neuromuscular responses." Second, to eliminate concerns about selective reporting, we now present both inclusive and exclusive analyses in Table 6a. Section 3.3 explicitly compares both: "Including all seven participants with sEMG data, mean activation change was +86.5 ± 228.1% (median: −36.8%) with 5 of 7 participants (71%) demonstrating beneficial reductions. ... Importantly, both the inclusive and exclusive analyses agree on the direction of the group-level effect: the median was negative in both cases (−36.8% vs. −37.3%), and the majority of participants (≥71%) demonstrated beneficial reductions regardless of how extreme values were handled." Third, visual inspection findings from the raw sEMG time series are described: "abrupt amplitude discontinuities temporally unrelated to task phase transitions, consistent with electrode displacement artifacts." 
We acknowledge that the 100% responder rate applies only to the exclusion-applied analysis (5/5) and that the inclusive rate is 71% (5/7). Both values are now transparently reported.]
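
The relationship between the exclusion rules and the two responder rates can be sketched as follows. The thresholds (±400% change, 5 μV RMS baseline) come from the quoted Section 2.4 text; the case values are hypothetical and were chosen only so that the counts match the 5/7 and 5/5 figures reported above. They are not study data.

```python
def is_valid(pct_change: float, baseline_uv_rms: float) -> bool:
    """Apply the two exclusion rules quoted from Section 2.4."""
    return abs(pct_change) <= 400.0 and baseline_uv_rms >= 5.0

def responder_rate(pct_changes: list[float]) -> float:
    """Share of cases with reduced activation (negative percentage change)."""
    return sum(1 for p in pct_changes if p < 0) / len(pct_changes)

# Hypothetical (percentage change, baseline in uV RMS) pairs, not study data
cases = [(-62.0, 40.0), (-37.0, 55.0), (-20.0, 32.0), (-45.0, 48.0),
         (-30.0, 61.0), (520.0, 35.0), (450.0, 29.0)]

inclusive = [p for p, _ in cases]
exclusive = [p for p, b in cases if is_valid(p, b)]

print(f"inclusive: {responder_rate(inclusive):.0%} of {len(inclusive)}")
print(f"exclusive: {responder_rate(exclusive):.0%} of {len(exclusive)}")
```

Reporting both rates side by side, as the revised Table 6a is said to do, lets readers see how much of the headline figure depends on the exclusion rule.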

Comments 5: [There appear to be inconsistencies in reported sample sizes between tables and narrative sections, and certain citation–reference alignments should be carefully verified. These issues suggest insufficient internal consistency checking prior to submission.]

Response 5: [We sincerely apologize for these inconsistencies. The revised manuscript has been thoroughly checked for internal consistency. Specifically, sample sizes are now explicitly reported for every individual analysis with clear annotation of which participants were excluded and why (documented in table footnotes and Section 2.4). All tables (Tables 5, 6, 6a, 7) include detailed footnotes specifying sample composition and exclusion reasons. The reference list has been carefully verified for citation–reference alignment, and all in-text citations have been cross-checked against the bibliography. Participant numbering has been standardized, with a clarifying note in Table 3: "Subject case2 in the original recruitment sequence was excluded from all analyses due to incomplete protocol completion; participant numbering presented here reflects the final analysis cohort (n = 7)." We thank the Reviewer for identifying these issues and believe the revised manuscript now demonstrates adequate internal consistency.]

Reviewer 3 Report

Comments and Suggestions for Authors

The authors address a very interesting subject: objectively quantifying the benefits of using an exoskeleton, with a specific focus on overhead work. The experimental setup is valid, but in the opinion of the reviewer it is severely hampered by two aspects: the very limited number of participants and the lack of any repeated trials on the same subject. In the opinion of the reviewer, further tests are required, and repeated tests cannot be skipped if the consistency of the measured quantities on the same subject is to be assessed. More detailed comments follow:

Introduction: when dealing with kinematic compatibility (R58), it could be worth mentioning a recent work studying body/exoskeleton interaction (https://doi.org/10.1007/s12008-025-02376-6)
Methods: the authors should report the date and number of the ethical approval for this experimental study involving human subjects
Table 1: the label can be improved; 'physical information' is vague, and so is the first column label 'case'
Figure 2: a back view with the linkage module in evidence would be useful
Table 3 and Figure 3: perhaps it would be clearer to repeat phases II, III, IV after 'Recovery phase', specifying that the WO and WE order is randomised
Discussion: the reason for not considering MVC is quite trivial (R207): it has no influence on percentage variation; adding this variable can bring additional noise with no true benefit
Conclusion: stating "Using baseline-relative sEMG normalization to circumvent MVC testing limitations, we obtained definitive evidence for substantial biomechanical benefits during static overhead holding." is somewhat excessive, given the very limited size of the sample

Author Response

Comments 1: [The experimental set up is valid, but in the opinion of the reviewer it is severely hampered by two aspects: the very limited number of participants, and the lack of any repeated trials on the same subject. In the opinion of the reviewer, further tests are required and repeated tests cannot be skipped in order to have an idea of consistency of measured quantities on the same subject.]

Response 1: [We sincerely thank the Reviewer for the positive assessment of the experimental setup and fully acknowledge both limitations. Regarding sample size, the revised Section 2.1 now clearly frames this as an exploratory pilot study, includes post-hoc power analysis, and provides specific recommendations for future confirmatory studies (n ≥ 15 per condition). Regarding the absence of repeated trials, we have added an explicit acknowledgment in the Discussion (Section 4): "Additionally, each participant in the present study performed only a single trial per condition without repeated measurements, precluding assessment of within-subject test-retest reliability. This design limitation means that the observed individual-level variability cannot be decomposed into true inter-individual differences versus measurement noise. Future studies should incorporate a minimum of two to three repeated trials per condition with intraclass correlation coefficient (ICC) reporting to establish measurement consistency and strengthen the evidentiary basis for individual-level responder classification." We agree that repeated trials are essential for establishing measurement consistency and have moved this to the first priority in our future work recommendations. We respectfully note that within the constraints of the current pilot study—involving a custom-fitted exoskeleton with limited battery runtime and participant fatigue considerations—the single-trial design was a practical necessity that is now transparently disclosed. We believe the revised manuscript, with its appropriately cautious interpretation (all findings characterized as "preliminary"), provides valuable pilot data warranting the scaled validation the Reviewer recommends.]
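
The ICC reporting recommended for future work can be sketched as below. This is a minimal illustration, assuming the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1); the response does not say which ICC form the authors intend. The function name and the trial values are hypothetical placeholders, not data from the study.

```python
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    scores: list of rows, one per participant; each row holds that
    participant's k repeated-trial values for one outcome measure.
    """
    n = len(scores)        # number of participants
    k = len(scores[0])     # repeated trials per participant
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares (participants x trials, no replication)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical placeholder values: five participants, three repeated trials each.
trials = [
    [-70.0, -66.0, -72.0],
    [-55.0, -60.0, -52.0],
    [-80.0, -78.0, -83.0],
    [-45.0, -50.0, -47.0],
    [-65.0, -61.0, -68.0],
]
print(round(icc_2_1(trials), 3))
```

With only one trial per condition, as in the present study, the residual mean square cannot be estimated at all, which is why the Reviewer's request cannot be met retrospectively.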

Comments 2: [Introduction: dealing with kinematic compatibility it could be worth mentioning a recent work studying body/exoskeleton interaction (https://doi.org/10.1007/s12008-025-02376-6).]

Response 2: [Thank you for this valuable suggestion. We have incorporated the recommended reference (Pascoletti et al., 2025) into the Introduction (Section 1). The relevant passage now reads: "Recent work has further examined body-exoskeleton interaction dynamics and their implications for kinematic compatibility design [19]." This reference (now [19] in the revised manuscript) strengthens the Introduction by providing additional context on human-exoskeleton interaction modeling, a topic directly relevant to our bio-inspired linkage design.]

Comments 3: [Methods: the authors should report the date and number of ethical approval of this experimental study involving human subjects.]

Response 3: [Thank you for this essential procedural point. The revised manuscript now includes full ethical approval details in the Informed Consent Statement: "The experimental protocol was approved by the Medical Ethics Committee of the Shenzhen Institutes of Advanced Technology (SIAT) under Approval No. SIAT-IRB-240415-H0741 (approved 2024-04-15). Informed consent was obtained from all participants prior to experimentation, and all participants were fully aware of the experimental procedures and content."]

Comments 4: [Table 1 label can be improved: 'physical information' is vague, and so is the first column label 'case'.]

Response 4: [We agree these labels lacked precision. In the revised manuscript, Table 1 has been updated. The table caption now reads: "Table 1. Physical information of seven volunteers." We have further refined the caption to clarify that it contains anthropometric and demographic data. The first column label "Case" has been retained as a participant identifier consistent with the anonymized coding used throughout the manuscript (case1–case7), which is referenced in individual-level analyses (Tables 3, 7, and Figures 7–9). We have ensured that the abbreviation is consistently defined upon first use.]

Comments 5: [Figure 2: a back view with linkage module in evidence would be useful.]

Response 5: [Thank you for this constructive suggestion. We have added a posterior view as Figure 2(c) in the revised manuscript, with the updated caption reading: "(c) Posterior view highlighting the bio-inspired latissimus dorsi linkage module and waist attachment configuration." This additional panel clearly displays the five-segment articulated linkage chain, its anatomical routing along the posterior torso, and the connection to the waist attachment module, providing readers with a comprehensive understanding of the mechanical architecture from all relevant viewing angles.]

Comments 6: [Table 3 and Figure 3: perhaps it would be clearer repeating phases II, III, IV after 'Recovery Phase', specifying that WO and WE order is randomised.]

Response 6: [Thank you for this suggestion to improve protocol clarity. The revised Table 4 (Experimental Protocol Summary) now includes a clarifying note: "Notes: Phases I-II conducted once under WO condition; Phases III-IV repeated under both WO and WE conditions in randomized order." Figure 3 has also been revised to more clearly illustrate the within-subject crossover design with randomized condition order, explicitly showing that the dynamic and static task phases are repeated under both WO and WE conditions separated by the recovery/washout period.]

Comments 7: [Discussion: the reason for not considering MVC is quite trivial: it is not influent on percentage variation; adding this variable can bring additional noise with no true benefit.]

Response 7: [We appreciate the Reviewer's pragmatic perspective. The revised Discussion (Section 3.3) now presents a more streamlined justification for the baseline-relative approach, emphasizing its independent methodological merits rather than framing it primarily as a workaround for MVC data quality issues: "The baseline-relative approach has independent methodological justification: it eliminates MVC reliability dependence, a recognized challenge in shoulder biomechanics research, and has been employed in prior exoskeleton intervention studies [8,9]." As the Reviewer notes, for paired within-subject percentage change calculations, MVC normalization introduces additional noise (from MVC measurement variability) without altering the direction or proportional magnitude of the intervention effect—a point now implicitly supported by the cross-validation showing identical mean reductions (−68%) from both methods. However, following Reviewer #2's concern, we also acknowledge that the adaptation was post-hoc and that absolute %MVC context (provided in Figure 4) retains ergonomic interpretive value regarding demand level.]
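
The Reviewer's point about percentage variation can be checked numerically. The sketch below assumes that the same MVC reference value is used to normalize both conditions for a given participant, in which case it cancels out of the percentage change. The function name and the amplitude values are hypothetical placeholders, not study data.

```python
def percent_change(without_exo, with_exo):
    """Percentage change from the unassisted (WO) to the assisted (WE) condition."""
    return 100.0 * (with_exo - without_exo) / without_exo


# Hypothetical sEMG RMS amplitudes (arbitrary units) and MVC reference amplitude.
emg_wo, emg_we, mvc = 0.40, 0.10, 1.25

# Change computed on raw amplitudes (baseline-relative approach).
raw_change = percent_change(emg_wo, emg_we)

# Change computed after expressing each amplitude as %MVC.
mvc_change = percent_change(100.0 * emg_wo / mvc, 100.0 * emg_we / mvc)

print(round(raw_change, 1), round(mvc_change, 1))
```

The two values agree because the common divisor cancels in the ratio. An error in the MVC measurement therefore changes the absolute %MVC levels but not the within-subject percentage change, provided one MVC value serves both conditions.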

Comments 8: [Conclusion: stating "Using baseline-relative sEMG normalization to circumvent MVC testing limitations, we obtained definitive evidence for substantial biomechanical benefits during static overhead holding." is somehow excessive, given the very limited size of the sample.]

Response 8: [We completely agree. The word "definitive" was inappropriate for a pilot study of this size. The revised Conclusion (Section 5) now reads: "Using baseline-relative sEMG normalization to circumvent MVC testing limitations, we obtained preliminary evidence for substantial biomechanical benefits during static overhead holding." This change is consistent with the cautious framing adopted throughout the revised manuscript. All characterizations of findings have been systematically reviewed and moderated to reflect the exploratory nature of the study and the small sample size. The Conclusion further states: "These findings should be interpreted within the context of a pilot study involving young male participants without industrial experience. The absence of female participants, the limited sample size, and the controlled laboratory environment constrain direct translation to occupational deployment recommendations."]

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

I have reviewed the revised manuscript and I am pleased to report that the authors have made substantial improvements in response to my previous comments. The manuscript now addresses the key concerns I raised, and the revisions enhance both the clarity and depth of the work. I recommend that the manuscript be accepted for publication in its current form.

Author Response

Dear Reviewer,

Thank you very much for your positive evaluation of our revised manuscript and for your recommendation for acceptance. We are grateful for the constructive feedback you provided during the review process, which has substantially strengthened the quality and clarity of our work.

Your comprehensive comments in the previous round were instrumental in improving multiple critical aspects of the manuscript. In particular, we benefited from your guidance on:

Appropriately framing the novelty of our contribution within the existing exoskeleton literature
Enhancing methodological transparency through comprehensive workload quantification (cycle counts, angular excursion, torque specifications)
Validating the metabolic steady-state assumption with coefficient of variation analysis, RER values, and VO₂ time-course plots
Providing capacity-normalized interpretation alongside baseline-relative normalization to enable readers to contextualize proportional reductions relative to maximum voluntary capacity
Expanding IMU sensor specifications and placement details to ensure reproducibility
Demonstrating parameter sensitivity for jerk calculations
Acknowledging critical demographic limitations (young male participants without industrial experience, absence of female participants)

These revisions have transformed the manuscript into a more rigorous, transparent, and appropriately contextualized contribution that honestly represents both the preliminary proof-of-concept findings and the substantial validation work still required before industrial deployment recommendations can be made.

We deeply appreciate the time and effort you dedicated to reviewing our work and providing such detailed, constructive guidance.

Sincerely,

Yongxuan Hong

Reviewer 2 Report

Comments and Suggestions for Authors

The revised manuscript has improved in several important respects. The added acknowledgment of sample-size limitation, the within-subject paired analysis, the VO₂ time-course presentation, and the clearer reporting of exclusion handling are all appreciated. These changes strengthen the manuscript. Although I previously considered the manuscript unsuitable for publication, the current revision has addressed several earlier concerns sufficiently that I now consider major revision, rather than rejection, to be the more appropriate recommendation. However, several important issues still remain.

The authors now acknowledge in Section 2.4 that the baseline-relative normalization was adopted post hoc, which is an important clarification. However, Section 3.1 still describes this approach as “instrumental in maximising sample utilisation,” and Section 3.3 states that it provided “substantially enhanced statistical properties,” even though these comparisons rest on only four overlapping participants. In my view, the manuscript should state more clearly that this normalization comparison is exploratory and that interpretation remains limited by the unresolved MVC measurement issue.
Table 7 is a useful addition because it presents within-subject paired data (n = 5), and the observation that 4 of 5 participants showed concurrent muscle offloading and metabolic increase is informative. However, the wording in Section 3.3 still overstates the evidence by describing this result as “confirming” the dissociation. I suggest replacing this with wording such as “providing preliminary within-subject evidence for.”
Although the revised manuscript is more cautious in several places, the Conclusions still include specific deployment recommendations for applications such as ceiling installation, automotive undercar fastening, and aerospace overhead riveting. For a pilot study based on only six valid EMG observations in young male students without industrial experience, this level of specificity seems premature. The practical implications should be framed more tentatively unless supported by field-based evidence.
There is also still an internal inconsistency that should be corrected. Table 3 presents complete kinematic data for “case2” under both WO and WE conditions, yet the table note states that this participant “was excluded from all analyses due to incomplete protocol completion.” This contradiction should be resolved before the manuscript is considered further.

Author Response

Comments 1: [The authors now acknowledge in Section 2.4 that the baseline-relative normalization was adopted post hoc, which is an important clarification. However, Section 3.1 still describes this approach as "instrumental in maximising sample utilisation," and Section 3.3 states that it provided "substantially enhanced statistical properties," even though these comparisons rest on only four overlapping participants. In my view, the manuscript should state more clearly that this normalization comparison is exploratory and that interpretation remains limited by the unresolved MVC measurement issue.]

Response 1: [Thank you for this critical observation. We fully agree that our language overstated the strength of evidence for the normalization comparison. We have revised the manuscript in three locations to explicitly characterize this comparison as exploratory and acknowledge its limitations. First, in Section 3.1, we changed "instrumental in maximising sample utilisation" to "offered a potential approach to sample utilisation." Second, in Section 3.3, we replaced "substantially enhanced statistical properties" with "appeared to provide improved statistical properties in this exploratory comparison, though interpretation is limited by the small overlapping sample (n=4)." Third, we have added the following clarification to Section 2.4: "It should be noted that the baseline-relative normalization approach was adopted post hoc as an exploratory analysis. Given that the comparison between normalization methods is based on only four overlapping participants and that the MVC measurement issue remains unresolved, interpretation of the relative advantages of this approach should be considered preliminary." These revisions make clear that our normalization comparison represents exploratory rather than confirmatory evidence and that the MVC measurement issue remains a methodological limitation.]

Comments 2: [Table 7 is a useful addition because it presents within-subject paired data (n = 5), and the observation that 4 of 5 participants showed concurrent muscle offloading and metabolic increase is informative. However, the wording in Section 3.3 still overstates the evidence by describing this result as "confirming" the dissociation. I suggest replacing this with wording such as "providing preliminary within-subject evidence for."]

Response 2: [We completely agree that "confirming" was too strong given the small sample size. We have made the following changes in Section 3.3. First instance: changed "Of these five participants, four exhibited concurrent Upper Trapezius activation reduction (range: −48.8% to −89.8%) and metabolic cost increase (range: +8.8% to +194.7%), confirming within-subject dissociation" to "...providing preliminary within-subject evidence for the dissociation." Second instance: changed "These within-subject paired results confirm that the metabolic–muscle dissociation..." to "These within-subject paired results suggest that the metabolic–muscle dissociation..." We have also added the following sentence at the end of that paragraph: "However, given the small sample size (n=5), these findings should be interpreted as preliminary evidence requiring validation in larger studies." This revised wording appropriately reflects the exploratory nature of the findings while preserving their scientific value.]
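
The within-subject criterion described above (muscle activation falls while metabolic cost rises in the same participant) can be sketched as a simple classification. The function name, participant labels, and paired values below are hypothetical placeholders chosen only to illustrate the logic; they are not the values reported in Table 7.

```python
def shows_dissociation(emg_change_pct, metabolic_change_pct):
    """True when muscle activation decreases while metabolic cost increases."""
    return emg_change_pct < 0 and metabolic_change_pct > 0


# Placeholder (EMG change %, metabolic change %) pairs; NOT the study's data.
paired = {
    "P1": (-60.0, 20.0),
    "P2": (-50.0, 10.0),
    "P3": (-85.0, 150.0),
    "P4": (-70.0, 35.0),
    "P5": (-55.0, -5.0),  # offloading without a metabolic increase
}

count = sum(shows_dissociation(e, m) for e, m in paired.values())
print(f"{count} of {len(paired)} participants show concurrent offloading and metabolic increase")
```

A count of this kind is descriptive only. With five participants it cannot establish the dissociation statistically, which is consistent with the softened wording adopted in the revision.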

Comments 3: [Although the revised manuscript is more cautious in several places, the Conclusions still include specific deployment recommendations for applications such as ceiling installation, automotive undercar fastening, and aerospace overhead riveting. For a pilot study based on only six valid EMG observations in young male students without industrial experience, this level of specificity seems premature. The practical implications should be framed more tentatively unless supported by field-based evidence.]

Response 3: [This is an extremely important point, and we acknowledge that our deployment recommendations were inappropriately specific given the pilot nature of this study. We have substantially revised the relevant paragraph in the Conclusions section. The original text "For industrial deployment, the evaluated exoskeleton may be preferentially considered for sustained static overhead postures (ceiling installation, automotive undercar fastening, aerospace overhead riveting requiring >2-minute holding periods), where unanimous shoulder offloading (100% Upper Trapezius responder rate) provides preliminary application rationale" has been replaced with: "These preliminary findings suggest potential relevance for work scenarios involving sustained static overhead postures, such as ceiling installation, automotive undercar fastening, or aerospace overhead riveting tasks requiring extended holding periods (>2 minutes). The observed unanimous shoulder offloading (100% Upper Trapezius responder rate, n=6) within this pilot sample provides initial proof-of-concept." We have also strengthened the cautionary language immediately following: "However, given that these observations are derived from a small-scale laboratory study with young male participants without industrial experience performing simplified simulated tasks, field-based validation studies with experienced workers in authentic work environments are essential before any specific deployment recommendations can be made. Such validation studies should include the full demographic range of the target workforce, including female workers and those aged 30–55 years, and should evaluate device performance under realistic production conditions, shift-length exposures, and task variability." The revised text now clearly positions our findings as preliminary proof-of-concept requiring field validation, rather than as a basis for immediate deployment decisions.]

Comments 4: [There is also still an internal inconsistency that should be corrected. Table 3 presents complete kinematic data for "case2" under both WO and WE conditions, yet the table note states that this participant "was excluded from all analyses due to incomplete protocol completion." This contradiction should be resolved before the manuscript is considered further.]

Response 4: [We sincerely apologize for this confusing wording. The inconsistency arose from ambiguous participant numbering. We have completely rewritten the table note to eliminate this contradiction. The original note stated: "Subject case2 in the original recruitment sequence was excluded from all analyses due to incomplete protocol completion; participant numbering presented here reflects the final analysis cohort (n = 7)." The revised note now reads: "Eight participants were originally recruited. One participant was excluded due to incomplete protocol completion and does not appear in any tables or figures in this manuscript. The remaining seven participants were renumbered sequentially (case1–case7) to form the final analysis cohort presented here; all seven completed the kinematic protocol under both WO and WE conditions." This clarifies that all participants appearing in Table 3—including the participant labeled case2 within the renumbered final cohort—completed the kinematic protocol successfully. The excluded participant from the original recruitment does not appear anywhere in the manuscript under any identifier.]

Round 3

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have satisfactorily addressed all outstanding comments raised in the previous review round. Specifically: (1) the language describing the post-hoc normalization comparison has been appropriately softened and explicitly labeled as exploratory in Sections 2.4, 3.1, and 3.3; (2) the overstatement of "confirming" the metabolic–muscle dissociation has been corrected to "providing preliminary within-subject evidence," with an added caveat noting the small sample size (n=5); (3) deployment recommendations in the Conclusions have been reframed as preliminary proof-of-concept, with clear statements that field-based validation is required before any specific deployment decisions can be made; and (4) the internal inconsistency in the Table 3 footnote regarding participant numbering has been fully resolved. I have no further comment regarding this revised manuscript.
