4.1. Computational and Applied Science Contributions
Safari is designed to support proactive fatigue management through a graduated, physiologically grounded output aligned with the Special Issue’s focus on AI-driven training adaptation. The dual-pathway framing is physiologically meaningful: Fresh and Accumulating states signal the neuromuscular system is functioning normally; Fatigued indicates early neuromuscular complexity degradation (, rising) with emerging metabolic contribution ( falling); Critical signals dual-pathway involvement warranting immediate load reduction.
The Banister adaptive threshold directly embeds training adaptation theory [6, 32]: as fitness surplus is overtaken by fatigue surplus, the detection threshold tightens, implementing progressive session-level sensitisation. This means the same movement pattern that registers as Accumulating early in a session is reclassified as Fatigued later: an earlier warning signal under the simulation conditions.
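The tightening mechanism can be illustrated with a minimal sketch. It is not the Safari implementation: the function name, the time constants, the gains, and the saturating map from fatigue surplus to threshold are all hypothetical placeholders. It also assumes a score convention in which a window is flagged when its anomaly score exceeds the threshold, so that "tightening" means lowering the threshold.

```python
import math

def adaptive_thresholds(loads, base_threshold, tau_fit=42.0, tau_fat=7.0,
                        k_fit=1.0, k_fat=2.0, max_tightening=0.05):
    """Return one detection threshold per window.

    Every default value here is a hypothetical placeholder, not a
    parameter reported in the paper. Time constants are in windows.
    """
    decay_fit = math.exp(-1.0 / tau_fit)
    decay_fat = math.exp(-1.0 / tau_fat)
    fitness = fatigue = 0.0
    thresholds = []
    for load in loads:
        # Banister impulse response: each component decays, then absorbs the new load.
        fitness = fitness * decay_fit + load
        fatigue = fatigue * decay_fat + load
        # Positive only when the weighted fatigue term exceeds the weighted fitness term.
        surplus = max(0.0, k_fat * fatigue - k_fit * fitness)
        # Saturating map (an assumption) keeps the tightening in [0, max_tightening).
        tightening = max_tightening * surplus / (1.0 + surplus)
        thresholds.append(base_threshold * (1.0 - tightening))
    return thresholds

# Five windows of constant load versus five windows of rest.
session = adaptive_thresholds([1.0] * 5, base_threshold=1.0)
rest = adaptive_thresholds([0.0] * 5, base_threshold=1.0)
```

Under these placeholder values the threshold falls from window to window during the loaded session and stays at its base value during rest, so a score just below the threshold early in the session can exceed it later. Over longer horizons the behaviour depends strongly on the chosen time constants and gains.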
The 1.55× entropy discriminability advantage, combined with per-activity AUC of 0.9978 (running) and 0.9608 (jumping), confirms that the dual-pathway entropy triplet captures the biomechanical complexity changes that moment-based features miss. The ablation result (+ alone AUC = 0.9824; adding completing the metabolic dimension to reach 0.9820) demonstrates that both fatigue pathways contribute unique discriminative information.
4.3. Contributions to the Sustainable Development Goals
The Safari framework contributes directly to three United Nations Sustainable Development Goals (SDGs), an alignment that is increasingly required for open-access publication support and research impact evaluation.
SDG 3 – Good Health and Well-Being (Target 3.4). The primary contribution is to athlete health protection. Real-time classification of fatigue into the four-state Fresh, Accumulating, Fatigued, and Critical continuum enables sports scientists and coaches to intervene before biomechanical deterioration reaches injury-risk levels. The pre-symptomatic detection window, in which entropy features detect neuromuscular and metabolic fatigue before RPE rises, is intended to reduce the incidence of overuse and acute musculoskeletal injuries in sprint and jump athletes. Injury prevention in sport contributes to SDG 3 by reducing the health burden of training-related musculoskeletal conditions, which disproportionately affect youth athletes.
SDG 9 – Industry, Innovation and Infrastructure (Target 9.5). The polyhedral compilation approach to eliminating JIT latency on ARM edge devices is a novel engineering contribution. By demonstrating that entropy-based fatigue classification can run within 7.2 ms on a USD 35 Raspberry Pi 4 (hardware accessible to community sport organisations, schools, and university programmes), Safari moves high-performance athlete monitoring from elite laboratory infrastructure toward broadly deployable wearable technology. The open-source simulation protocol and pipeline code [33] further contribute to research infrastructure by providing a reproducible benchmark for future fatigue monitoring studies.
SDG 4 – Quality Education (Target 4.4). This paper demonstrates a research pathway for sport science graduates into computational and interdisciplinary research.
4.5. Limitations and Future Work
The use of a synthetic dataset represents the primary methodological limitation of this study, and its implications deserve explicit treatment beyond a brief caveat.
External validity. The synthetic data generator and the entropy-based detector share structural assumptions: the generator injects phase jitter, amplitude envelope modulation, and spectral drift, which are exactly the changes that entropy measures are designed to detect. The observed AUC-ROC of 0.9820 therefore reflects benchmark performance under those assumptions, not performance on real athletes in uncontrolled field conditions. Real athletes exhibit fatigue trajectories that are non-monotone and confounded by arousal, nutrition, injury history, and inter-session variability, none of which a single-session synthetic benchmark can model. The primary practical claim supported by these results is that the computational framework, including entropy triplet extraction, polyhedral kernel interpolation, OC-SVM free-energy scoring, and Banister adaptive thresholding, is technically sound and algorithmically complete. Whether the AUC achieved on this benchmark transfers to real deployment remains an open empirical question that can only be resolved by real-athlete validation with independent physiological ground truth (RPE, blood lactate, EMG, HRV, force plate).
Circularity. The framework detects the patterns it was designed to detect: phase jitter (targets ), amplitude modulation (targets ), and spectral drift (targets ). The AUC-ROC of 0.9820 (Monte Carlo 95% CI: 0.9726–0.9886) quantifies performance under this controlled condition, not real-world sensitivity. The null control (AUC = 0.500 after label permutation) confirms the features are capturing the injected signal rather than noise, but does not validate the clinical claim that these signals correspond to real neuromuscular and metabolic fatigue.
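The logic of the label-permutation null control can be sketched as follows. This is an illustration on made-up Gaussian scores, not the paper's pipeline or data. The helper names `auc_roc` and `permutation_null_auc`, the sample sizes, and the number of permutations are assumptions chosen for brevity.

```python
import numpy as np

def auc_roc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]  # all positive-negative score pairs
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

def permutation_null_auc(scores, labels, n_perm=200, seed=0):
    """AUC values obtained after shuffling labels, which breaks any score-label link."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels, dtype=bool)
    return np.array([auc_roc(scores, rng.permutation(labels)) for _ in range(n_perm)])

# Hypothetical scores: 100 "fresh" windows and 100 "fatigued" windows.
rng = np.random.default_rng(1)
labels = np.repeat([False, True], 100)
scores = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(3.0, 1.0, 100)])

observed = auc_roc(scores, labels)
null = permutation_null_auc(scores, labels)
```

A null distribution centred on 0.5 shows only that the detector responds to the labelled signal rather than to noise. As noted above, it says nothing about whether that signal resembles real fatigue.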
Absent physiological noise. Real IMU signals from fatigued athletes contain confounders absent from the simulation: sensor displacement from sweating skin, heart rate artefact in the 1–3 Hz band, thermoregulatory movement, motivational fluctuations in movement intensity, and surface changes (track vs. grass vs. indoor). These sources of variability would reduce real-world AUC-ROC relative to the simulated value.
SpEn calibration. The spectral entropy effect size in our simulation ( ) exceeds the range in the published literature ( from Verdel et al. [5]), indicating that the metabolic pathway injection is stronger than the effect observed in real athlete data. SpEn-specific results therefore represent an optimistic upper bound on metabolic discriminability.
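For readers unfamiliar with the feature, spectral entropy is commonly computed as the normalised Shannon entropy of the power spectrum. The sketch below uses that standard definition. The paper's exact estimator, windowing, and sampling rate are not restated here, so the 100 Hz rate, the 2.5 Hz test tone, and the additive-noise stand-in for spectral broadening are assumptions rather than the paper's injection model.

```python
import numpy as np

def spectral_entropy(x):
    """Normalised Shannon entropy of the power spectrum, in [0, 1]."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    power = power[1:]              # drop the DC bin, which mean removal zeroes out
    p = power / power.sum()        # treat the spectrum as a probability distribution
    p = p[p > 0]                   # avoid log(0)
    return float(-(p * np.log(p)).sum() / np.log(power.size))

t = np.arange(512) / 100.0                      # assumed 100 Hz sampling
clean = np.sin(2 * np.pi * 2.5 * t)             # narrow-band movement rhythm
rng = np.random.default_rng(0)
broadened = clean + 0.5 * rng.normal(size=t.size)  # broadband stand-in, not the paper's drift model

se_clean = spectral_entropy(clean)
se_broadened = spectral_entropy(broadened)
```

The broadened signal spreads power across more frequency bins and so yields a higher spectral entropy than the clean tone. A stronger injection therefore produces a larger separation, which is why an oversized simulated effect inflates SpEn discriminability.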
Banister parameter uncertainty. The time constants (, windows) were adapted from the endurance literature and have not been calibrated for high-intensity sprint and jump activities. A sensitivity analysis varying these parameters by ±50% showed threshold tightening between 0.8% and 2.3%, indicating the adaptive threshold mechanism is robust to moderate parameter misspecification.
Sample size. A sample of nine synthetic subjects, with two held out for testing, is insufficient for population-level generalisation claims. The test set (two subjects) provides an indication of between-subject generalisation under simulation but not statistical power for real-world inference.
Precondition for clinical use. The primary limitation of the present study is that the evaluation dataset is computationally simulated. Although fatigue is injected as temporal complexity changes (phase jitter, amplitude modulation, spectral drift) calibrated from published biomechanical effect sizes [4], and the signal properties are designed to match those documented in the PAMAP2 corpus [33], the results reported here constitute a controlled proof-of-concept validation rather than evidence of real-world performance. In particular, the AUC-ROC of 0.9820 is obtained on data whose ground truth is known by construction; it should be interpreted as confirming that the Safari framework correctly identifies the complexity changes it was designed to detect under controlled conditions, not as a claim of equivalent performance on unseen athlete populations. Five specific limitations arise from synthetic evaluation: (1) Known-pattern circularity: The framework detects the temporal complexity changes it was designed to detect. Calibration against published effect sizes (Section 3.7) mitigates but does not eliminate this concern. (2) Missing physiological noise: Real IMU data contains heart-rate artefacts, sweat-induced sensor displacement, and clothing movement that are absent from the simulation. (3) Limited inter-individual variability: Our injection model generates between-subject variability from parameter distributions; real athletes exhibit qualitatively different compensatory strategies under fatigue that our model cannot capture. (4) Banister parameter uncertainty: The time constants (, windows) are adapted from the endurance literature and may require recalibration for high-intensity sprint and jump activities. (5) Absence of longitudinal validation: The Banister threshold dynamics have not been validated against real within-session fatigue accumulation curves. Validation on real athlete data with physiological ground truth (RPE, blood lactate, heart rate variability, EMG) is the immediate priority for future work. This validation is planned as part of a prospective study to be designed and conducted by Koketso Millicent Moroke as part of her graduate research programme, combining her sport science foundation with the computational framework presented here.
The interpolation error for entropy features (mean 3.67% at ) is higher than would be expected for moment features, consistent with entropy’s greater sensitivity to the temporal structure of the window. Non-uniform anchor placement near biomechanically critical window lengths (e.g., near multiples of the dominant stride frequency) may reduce this error in future work.
The Banister model parameters (, windows) were adapted from the endurance literature. Recalibration for high-intensity sprint and jump activities through Bayesian individual parameter estimation [21] is a natural extension.
Future directions include: (i) Riemannian geodesic interpolation on the statistical manifold under the Fisher–Rao metric; (ii) hidden Markov modelling of stride-window length as a latent fatigue-state variable; (iii) extension to convolutional and recurrent neural network feature extractors within the polyhedral framework; and (iv) ultra-low-power microcontroller deployment for multi-day wearable monitoring.