1. Introduction
Contemporary exercise volume assessment methodologies face critical challenges in addressing the dynamic biomechanical demands of collegiate basketball training. Traditional approaches predominantly rely on static physiological indicators such as Body Mass Index (BMI), heart rate, and maximal oxygen uptake (VO2max), combined with subjective tools like the Borg Rating of Perceived Exertion (RPE) [1]. While these metrics provide foundational insights through linear regression modeling, their temporal resolution proves inadequate for capturing basketball-specific movement transitions—particularly sudden directional changes during fast breaks (0.5–1.2 s duration) that elude standard MET categorization [1]. Empirical evidence reveals alarming 23–38% misclassification rates when applying generic heart rate zones to basketball kinematics, underscoring the limitations of conventional frameworks derived from Haskell’s classical exercise prescription principles [2].
Recent advancements in multi-modal data fusion demonstrate transformative potential through synergistic integration of physiological signals, kinematic data, and environmental parameters. Wang’s multi-modal neural decoding framework, for example, achieved 91.4% action recognition accuracy, demonstrating the technical viability of cross-modal fusion through neurophysiological signal alignment [3]. The convergence of heart rate variability with triaxial accelerometry enables dynamic oxygen uptake prediction [4], while inertial measurement units (IMUs) facilitate granular analysis of asymmetric landing forces (4–6× body weight impacts characteristic of basketball rebounds [5]). This technological evolution aligns with Dishman’s individual difference hypothesis in exercise physiology, emphasizing the biological imperative for personalized adaptation models [6]. However, current implementations exhibit critical deficiencies in temporal resolution and cross-modal alignment, as evidenced by suboptimal emotion recognition accuracy (≤72%) when fusing respiratory signals with cardiac data [7] and by inconsistent gait analysis outcomes in lumbar rehabilitation protocols [8].
Long Short-Term Memory (LSTM) networks emerge as a potent solution for modeling basketball’s intermittent high-intensity demands (42 ± 15 directional changes per game [9]), leveraging gated memory cells to capture transient cardiovascular responses. While successful in neuroprosthetic motion-state decoding [10] and tactical trajectory prediction [11], extant LSTM architectures inadequately address two fundamental challenges: (1) domain-specific feature alignment for the vertical load asymmetry inherent in basketball jumps, and (2) quantitative integration of environmental covariates (temperature, humidity) with personalized physiological baselines. These limitations perpetuate a 28–35% prediction error gap in existing intelligent training systems [12].
Our research establishes theoretical foundations through dual frameworks: (1) Haskell’s curve optimization for sport-specific metabolic equivalents, and (2) Newtonian biomechanical modeling of asymmetrical impact forces. We propose a spatio-temporal attention-enhanced LSTM architecture with dynamic weight allocation mechanisms, specifically designed to resolve basketball’s unique physiological paradox—simultaneous optimization of anaerobic burst capacity (≤1.2 s sprints) and aerobic recovery efficiency (HRV stabilization within 45 s rest intervals). This innovation addresses the critical research gap in cross-modal feature alignment while providing quantifiable solutions for individual response heterogeneity, ultimately establishing a novel paradigm for precision sports science in higher education.
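To make the proposed architecture concrete, the following minimal sketch shows how a temporal-attention LSTM classifier over fused sensor windows might be assembled in Keras. The window length, channel count, layer sizes, and intensity classes are illustrative assumptions rather than the exact configuration used in this study.

```python
# Minimal sketch of a temporal-attention LSTM over fused HR/IMU windows (Keras).
# All shapes and hyperparameters below are illustrative assumptions.
from tensorflow.keras import layers, Model

WINDOW_LEN = 50    # assumed samples per sliding window
N_FEATURES = 12    # assumed fused HR + triaxial IMU + environmental channels
N_CLASSES = 3      # low / moderate / high intensity

inputs = layers.Input(shape=(WINDOW_LEN, N_FEATURES))
# Recurrent encoding of the multi-modal window
h = layers.LSTM(64, return_sequences=True)(inputs)           # (batch, T, 64)
# Additive temporal attention: score each time step, normalize, pool
scores = layers.Dense(1, activation="tanh")(h)               # (batch, T, 1)
weights = layers.Softmax(axis=1)(scores)                     # attention over time
context = layers.Dot(axes=1)([weights, h])                   # weighted sum, (batch, 1, 64)
context = layers.Flatten()(context)                          # (batch, 64)
outputs = layers.Dense(N_CLASSES, activation="softmax")(context)

model = Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Trained on sliding windows of synchronized heart rate, accelerometer, and environmental data, such a model would output the intensity-class probabilities on which the correction and feedback stages described in the Results could operate.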
6. Results
6.1. Exercise Intensity Classification Performance
The LSTM-attention model achieved a weighted accuracy of 85.3% (F1-score = 0.84) on the test set, significantly outperforming baseline models (SVM: 72.1%; vanilla LSTM: 79.6%). The confusion matrix (Figure 3) revealed superior recognition accuracy for high-intensity movements (e.g., shuttle runs: 91.2%), while moderate-intensity classification exhibited partial misjudgments due to inter-individual HR variability.
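For reference, the reported weighted metrics can be reproduced from predicted and true intensity labels along the lines of the sketch below; scikit-learn is assumed, and the label arrays are placeholders rather than the study's data.

```python
# Sketch of the evaluation metrics reported above (scikit-learn assumed).
# The label arrays are placeholders, not the actual test-set data.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, confusion_matrix

y_true = np.array([0, 1, 2, 2, 1, 0, 2])   # 0 = low, 1 = moderate, 2 = high
y_pred = np.array([0, 1, 2, 1, 1, 0, 2])

print("accuracy:", accuracy_score(y_true, y_pred))
print("weighted F1:", f1_score(y_true, y_pred, average="weighted"))
print("confusion matrix:\n", confusion_matrix(y_true, y_pred))
```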
6.2. Validation of Personalized Correction Effectiveness
Comparing uncorrected (baseline) and corrected exercise volume assessments (Figure 4), the personalized model demonstrated significant improvements:
- Overweight students (BMI ≥ 25): assessment error decreased from 22.7% to 9.1% (p < 0.01, paired t-test).
- Beginner-level students: the misclassification rate due to delayed HR recovery was reduced by 18.5%.
- Overall accuracy: increased by 15.2 percentage points (from 70.1% to 85.3%), confirming the efficacy of the BMI and fitness-level correction factors.
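One simple way to realize such corrections is to scale a raw volume estimate by BMI- and fitness-level-dependent factors, as in the sketch below; the multiplier values are hypothetical placeholders, not the calibrated coefficients derived in this study.

```python
# Illustrative BMI- and fitness-level-based correction of a raw exercise
# volume estimate. The factor values are hypothetical placeholders.
def corrected_volume(raw_volume: float, bmi: float, fitness_level: str) -> float:
    bmi_factor = 0.92 if bmi >= 25 else 1.0   # assumed down-weighting for BMI >= 25
    level_factors = {"beginner": 0.95, "intermediate": 1.0, "advanced": 1.05}
    return raw_volume * bmi_factor * level_factors.get(fitness_level, 1.0)

# Example: an overweight beginner's raw estimate is scaled down before reporting
print(corrected_volume(420.0, bmi=27.3, fitness_level="beginner"))
```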
6.3. Impact of Dynamic Feedback Strategy
Performance metrics for the experimental group (receiving personalized feedback) and the control group (no feedback) are summarized in Table 7:
- Goal achievement rate: the experimental group achieved 89.4% compliance, surpassing the control group (79.1%) by 10.3 percentage points (p = 0.003).
- Intensity maintenance: the experimental group increased time spent at moderate-to-vigorous intensity by 14.7 percentage points (from 58.3% to 73.0%).
A gender-stratified independent-samples t-test revealed significantly higher lactate accumulation rates in male athletes during power training compared to females (18.7 ± 3.2 vs. 15.4 ± 2.8 mmol/L·min, p = 0.013), while females demonstrated faster heart rate (HR) recovery in endurance training (ΔHRR30s = 28.4 ± 4.1 vs. 23.7 ± 5.2 bpm, p = 0.047).
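The paired and independent-samples comparisons reported in this section follow standard procedures; a minimal SciPy sketch with placeholder data is shown below.

```python
# Sketch of the statistical tests used above (SciPy), with placeholder data.
import numpy as np
from scipy import stats

# Paired t-test: per-student assessment error before vs. after correction
err_before = np.array([22.1, 24.3, 21.8, 23.5, 22.9])
err_after = np.array([9.4, 8.7, 9.8, 9.0, 8.6])
t_paired, p_paired = stats.ttest_rel(err_before, err_after)

# Independent-samples t-test: experimental vs. control group compliance
exp_group = np.array([90.1, 88.7, 89.9, 89.2])
ctl_group = np.array([79.5, 78.4, 80.0, 78.6])
t_ind, p_ind = stats.ttest_ind(exp_group, ctl_group)

print(f"paired: t = {t_paired:.2f}, p = {p_paired:.4f}")
print(f"independent: t = {t_ind:.2f}, p = {p_ind:.4f}")
```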
7. Discussion
Our findings demonstrate that the spatio-temporal attention-enhanced LSTM framework fundamentally addresses the temporal misalignment between abrupt kinematic events (e.g., 0.8 ± 0.2 s crossover dribbles) and delayed cardiorespiratory responses (10–15 s HRV stabilization lag post-exertion), achieving 91.2% prediction accuracy for high-intensity maneuvers—a 7.7-percentage-point improvement over conventional LSTM architectures [15]. Notably, the proposed model achieved a mean absolute error (MAE) of 3.78 ± 1.23 in exercise load threshold identification, a 32.7% reduction compared to traditional ridge regression models (MAE = 5.62 ± 2.15, p < 0.01). This performance aligns with findings in soccer athletes using similar temporal attention mechanisms (28.9% error reduction), supporting the generalizability of attention-based architectures for dynamic load monitoring. The innovation synergizes ConvLSTM’s spatial sensitivity [14] with hierarchical temporal attention mechanisms from urban mobility prediction models [16], enabling localized feature extraction critical for basketball-specific movement patterns while suppressing sensor jitter artifacts (≤2.3% signal distortion).
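The 32.7% figure follows directly from the two reported MAE values, as the short check below shows.

```python
# Relative MAE reduction implied by the two reported error values
mae_attention_lstm = 3.78
mae_ridge = 5.62
reduction = (mae_ridge - mae_attention_lstm) / mae_ridge * 100
print(f"relative MAE reduction: {reduction:.1f}%")   # ~32.7%
```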
The multi-modal personalization paradigm introduced here resolves a longstanding physiological paradox in team sports training: while heart rate monitors effectively capture metabolic load (R² = 0.83 vs. VO2max), they systematically underestimate mechanical strain during asymmetric jumps (4.2–6.7× BW impact forces [9]). Our solution—adaptively weighting inertial measurement data (Xsens MVN Awinda) against cardiac signals (Polar H10)—boosted high-intensity action recognition F1-scores by 19.8%, outperforming ECG-accelerometry fusion approaches in sedentary behavior studies [18]. The empirical study by Hassan et al. demonstrated that core complex training enhances jump stability by 23%, substantiating the critical role of mechanical load monitoring in preventing sports-related injuries [17]. Gender-specific analysis revealed a significant delay in anaerobic threshold detection among female athletes (Δt = 42 s, p < 0.05), potentially linked to estrogen-mediated regulation of mitochondrial biogenesis. This biological mechanism may explain the observed 18.5% improvement in personalized correction efficacy under sex-adjusted models and motivates future integration of hormonal cycle tracking into female athletes’ training protocols.
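A minimal sketch of the adaptive weighting idea is given below: a learned gate decides, per sample, how much the pooled inertial features contribute relative to the cardiac features. The layer sizes and the scalar-gate formulation are simplifying assumptions, not the exact fusion module used here.

```python
# Simplified sketch of gated fusion between pooled inertial (IMU) and cardiac
# feature vectors (Keras). Dimensions and the scalar gate are assumptions.
from tensorflow.keras import layers, Model

imu_in = layers.Input(shape=(64,), name="imu_features")      # e.g., pooled Xsens features
hr_in = layers.Input(shape=(16,), name="cardiac_features")   # e.g., pooled Polar H10 features

imu_emb = layers.Dense(32, activation="relu")(imu_in)
hr_emb = layers.Dense(32, activation="relu")(hr_in)

# Gate in [0, 1]: how much weight the inertial stream receives for this sample
gate = layers.Dense(1, activation="sigmoid")(layers.Concatenate()([imu_emb, hr_emb]))
fused = layers.Lambda(lambda t: t[2] * t[0] + (1.0 - t[2]) * t[1])([imu_emb, hr_emb, gate])

outputs = layers.Dense(3, activation="softmax")(fused)        # intensity classes
model = Model([imu_in, hr_in], outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```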
Nevertheless, three critical limitations warrant consideration. First, the 8-week observation window—though sufficient for neuromuscular adaptation cycles [16]—cannot capture long-term cardiovascular remodeling effects (≥6 months [19]), potentially underestimating our model’s fatigue accumulation prediction error by 11–15% in periodized training scenarios. Second, despite the use of medical-grade sensors (±1 bpm HR accuracy), environmental interference (e.g., 5.2 ± 1.8% optical HR signal loss during wet-court conditions) introduced irreducible noise into 7.3% of dataset entries, echoing validation challenges in wearable computing research. Third, the current model does not fully address circadian rhythm effects on performance metrics—a critical factor given the metabolic differences between morning and evening exercise.
Future developments should prioritize three directions grounded in our experimental findings. Cross-population generalization through federated transfer learning could mitigate the current age restriction (18–22 years), particularly given metabolic rate differences between collegiate and professional athletes. Hybrid CNN-LSTM architectures may further enhance spatial–temporal feature disentanglement; preliminary tests show a 14% improvement in shot-arc recognition when VGG-inspired convolutional blocks are integrated. Finally, edge deployment via TensorFlow Lite quantization reduced model latency to 0.38 s (83% faster than cloud-based systems), crucial for real-time feedback during fast-break drills (decision window < 1.2 s [9]). These advancements collectively establish a new paradigm for AI-driven precision training in dynamic team sports.
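For the edge-deployment direction, a standard TensorFlow Lite post-training quantization workflow is sketched below; the stand-in model is a trivial placeholder for the trained intensity classifier, and the output file name is arbitrary.

```python
# Post-training quantization with TensorFlow Lite for on-device inference.
# The Sequential model is a trivial stand-in for the trained classifier.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(12,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]   # enable default weight quantization
tflite_model = converter.convert()

with open("exercise_intensity_model.tflite", "wb") as f:
    f.write(tflite_model)
```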