4.1. Trajectory Evaluation
We evaluate the proposed method on the CurlTracer dataset [3], which contains 1033 recorded throws sampled at Hz.
Motivated by the characteristics of curling delivery and expert feedback, we adopt a two-checkpoint analysis. Specifically, we report Mean Absolute Error (MAE) and Median Absolute Error (MdAE) at the initial forecast step and at the midpoint of the prediction horizon, in addition to full-trajectory statistics. Both “first-step” and “mid-step” errors are computed per sliding window and then aggregated across all windows in the test split. We also generate continuous trajectory predictions for qualitative inspection (Figure 3).
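For concreteness, the checkpoint aggregation can be expressed as a short sketch; the array names and shapes below are illustrative assumptions, not the exact evaluation code.

```python
import numpy as np

def checkpoint_errors(y_true, y_pred):
    """Two-checkpoint error aggregation over sliding windows.

    y_true, y_pred: arrays of shape (num_windows, horizon, 2)
    holding ground-truth and predicted 2-D positions.
    Returns MAE/MdAE at the first step, the mid step, and over
    the full horizon, aggregated across all test windows.
    """
    # Euclidean error per window and per forecast step
    dist = np.linalg.norm(y_pred - y_true, axis=-1)   # (num_windows, horizon)
    mid = dist.shape[1] // 2                          # midpoint of the horizon

    stats = {}
    for name, errs in {"first": dist[:, 0],
                       "mid": dist[:, mid],
                       "full": dist.ravel()}.items():
        stats[name] = {"MAE": float(np.mean(errs)),
                       "MdAE": float(np.median(errs))}
    return stats
```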
For comparison, we re-implemented the constant-friction pivot–slide model of Shegelski and Lozowski [24]. A single average pivot ratio calibrated on of the throws ( ) was applied to the remaining . The resulting MAE/MdAE of m/ m is an order of magnitude worse than our learning-based method, but it serves as a lightweight, interpretable baseline.
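As an illustration of the calibration protocol (not the pivot–slide dynamics themselves, which follow [24]), the single scalar pivot ratio can be fitted on the training throws by grid search and then applied unchanged to the held-out throws; `simulate_throw` below is a hypothetical stand-in for the physics integrator.

```python
import numpy as np

def calibrate_pivot_ratio(train_throws, simulate_throw,
                          candidates=np.linspace(0.1, 1.0, 91)):
    """Pick the single pivot ratio minimising mean trajectory MAE
    on the training split; simulate_throw(throw, ratio) is assumed
    to return a predicted trajectory aligned with throw["xy"]."""
    best_ratio, best_mae = None, np.inf
    for ratio in candidates:
        errs = [np.mean(np.linalg.norm(
                    simulate_throw(t, ratio) - t["xy"], axis=-1))
                for t in train_throws]
        mae = float(np.mean(errs))
        if mae < best_mae:
            best_ratio, best_mae = ratio, mae
    return best_ratio  # applied unchanged to the held-out split
```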
To clarify comparison fairness for Table 1, all learning-based models (Plain LSTM and Attention-LSTM) were trained and evaluated under the same protocol: identical trajectory-level train/test split, identical pre-processing and feature construction, identical sliding-window settings ( , , stride S), and identical evaluation metrics. Model selection was performed using the same validation-based criterion within this unified pipeline. The only architectural difference between the two learning-based variants is the presence or absence of the attention block. For the physics baseline, calibration was performed on the training split only, and testing was conducted on the held-out split to keep data usage consistent across methods.
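A minimal sketch of the shared sliding-window construction, assuming each trajectory is an array of 2-D positions; `W`, `H`, and `S` stand for the (identical) window length, horizon, and stride used by every model in Table 1.

```python
import numpy as np

def make_windows(traj, W, H, S):
    """Cut one trajectory of shape (T, 2) into input/target windows.

    W: input window length, H: prediction horizon, S: stride.
    """
    X, Y = [], []
    for start in range(0, len(traj) - W - H + 1, S):
        X.append(traj[start:start + W])          # past W positions
        Y.append(traj[start + W:start + W + H])  # next H positions
    return np.asarray(X), np.asarray(Y)
```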
To complement Table 1, we additionally benchmarked three lightweight sequence models (Attention-LSTM, BiLSTM, and BiGRU) in a supplementary multi-seed profiling protocol for edge-oriented comparison. Results are summarized in Table 2.
In this supplementary comparison, BiGRU attains the lowest error, while the proposed Attention-LSTM remains the fastest and most compact of the three architectures and is close to BiLSTM in accuracy. To make this trade-off explicit, we provide an accuracy–latency Pareto visualization (Figure 4). For statistical transparency, Figure 5 further visualizes seed-level mean ± std error bars for the same comparison.
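The seed-level aggregation behind Figure 5 amounts to mean ± std over per-seed errors; a minimal sketch follows, where `results` holds hypothetical per-seed MAE lists rather than the measured values of Table 2.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_seed_errorbars(results):
    """Seed-level mean ± std error bars (cf. Figure 5).

    results: dict mapping model name -> list of per-seed MAE values.
    """
    names = list(results)
    means = [np.mean(results[n]) for n in names]
    stds = [np.std(results[n], ddof=1) for n in names]  # sample std over seeds
    plt.errorbar(range(len(names)), means, yerr=stds, fmt="o", capsize=4)
    plt.xticks(range(len(names)), names)
    plt.ylabel("MAE (m)")
    plt.show()
```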
Removing the self-attention layer (“Plain LSTM”) degrades MAE by approximately , indicating that global temporal context is important for modeling late-stage curl. For reference, the analytic prediction we use is given with m and m, where and compensate for the rink-centric coordinate orientation. Despite relying only on 2-D coordinates and timestamps, the attention-augmented LSTM outperforms the physics model by an order of magnitude, underscoring the value of temporal context in this low-dimensional setting.
To further clarify the attention design choice, we compared four variants under the same data split and evaluation pipeline: no attention (plain), scaled dot-product attention, additive attention, and a two-head multi-head attention variant. Results are summarized in Table 3.
In this comparison, additive attention achieves lower MAE but with substantially higher latency, whereas scaled dot-product attention reduces error relative to the no-attention baseline with only a small latency increase (Keras: 0.334 → 0.253 m, 80.13 → 82.37 ms; TFLite FP32 latency: 10.37 → 10.50 ms). Given the edge-oriented objective of this study, we selected scaled dot-product attention as a practical accuracy–latency compromise.
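For reference, the selected variant corresponds to scaled dot-product self-attention over the LSTM states. The sketch below is a minimal Keras approximation with illustrative layer sizes, not the paper's exact configuration; swapping `layers.Attention` for `layers.AdditiveAttention`, or a two-head `layers.MultiHeadAttention`, yields the other variants in Table 3.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_attention_lstm(W, H, units=64):
    """Minimal Attention-LSTM sketch with 2-D coordinate inputs."""
    inp = layers.Input(shape=(W, 2))                  # past W (x, y) points
    seq = layers.LSTM(units, return_sequences=True)(inp)
    # Scaled dot-product self-attention over the LSTM state sequence
    att = layers.Attention(use_scale=True)([seq, seq])
    ctx = layers.GlobalAveragePooling1D()(att)        # summarise global context
    out = layers.Dense(H * 2)(ctx)                    # H future (x, y) points
    out = layers.Reshape((H, 2))(out)
    return tf.keras.Model(inp, out)
```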
We further examine the accuracy at the first and midpoint checkpoints.
Following Section 3.5, we computed overlap-derived percentile envelopes (5th–95th) on the same held-out windows as descriptive uncertainty indicators. The envelope is tighter in the near horizon and widens toward the long horizon, consistent with the checkpoint trend in point-error statistics. To complement these point estimates, we additionally report compact error-distribution statistics (median/IQR/p90) on held-out windows for deployment reliability in Table 4.
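A compact sketch of these descriptive statistics, assuming per-window, per-step Euclidean errors are available; the envelope here is taken across windows at each horizon step, a simplification of the overlap-derived computation described in Section 3.5.

```python
import numpy as np

def envelope_and_stats(dist):
    """Descriptive uncertainty indicators on held-out windows.

    dist: per-window, per-step Euclidean errors, shape (num_windows, H).
    Returns the 5th-95th percentile envelope per horizon step and the
    median/IQR/p90 summary of the kind reported in Table 4.
    """
    lo, hi = np.percentile(dist, [5, 95], axis=0)    # envelope per step
    q25, q50, q75, p90 = np.percentile(dist.ravel(), [25, 50, 75, 90])
    return (lo, hi), {"median": q50, "IQR": q75 - q25, "p90": p90}
```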
Consistent with the online-use objective, we focus operational interpretation on the near/mid horizons and stop-region guidance. Here, the aggregation sample size is the number of test windows ( ) generated from held-out trajectories; for the split reported in this work, . The three checkpoints (first, mid, and last) are visualized in Figure 6, Figure 7, and Figure 8 to illustrate model behavior across the horizon.
Stopping-point accuracy is shown in Figure 9: predicted stops (purple crosses) versus ground truth (green circles), overlaid with the standard hog line and house markings. In the illustrated case, the deviation is m. Considering the measurement characteristics of a monocular camera pipeline, we regard this level of accuracy as suitable for practical use.
4.2. Real-Time Evaluation
To assess real-time feasibility, we measured computational performance under two representative environments using the same prediction pipeline (84 sliding windows per throw with a stride of 5 frames, ≈0.08 s).
On a Google Colab free-tier instance (AMD EPYC 7B12, 8 threads), a full-trajectory forecast was completed in s. The per-window latency averaged ≈0.07 s.
On a Raspberry Pi 4B (4 GB RAM), the total prediction time was 21 s per throw. The mean per-window latency was ≈0.25 s (including I/O and post-processing). Accordingly, we published updates at an effective cadence of ≈0.25 s while buffering intermediate frames. Adjusting the update rate can further trade freshness for computational headroom as needed. These results indicate that the model maintains acceptable throughput and compute overhead on a low-power edge device, supporting real-time field applications.
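The per-window latency figures above can be reproduced with a simple timing loop of the following form; here `model.predict` stands in for the full pipeline, so I/O and post-processing costs are not captured by this sketch.

```python
import time

def profile_per_window(model, windows):
    """Measure total and mean per-window forecast latency."""
    t0 = time.perf_counter()
    for w in windows:
        model.predict(w[None, ...], verbose=0)   # one sliding window
    total = time.perf_counter() - t0
    return total, total / len(windows)           # seconds, seconds/window
```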
4.3. Lightweighting of Models in Edge AI Scenarios
To support on-device inference in edge AI settings, the trained model was converted into four TFLite variants: FP32 (builtins + select_tf_ops), dynamic-range quantization (DRQ), FP16, and INT8 (full integer quantization with select_tf_ops). All experiments were executed on the same host as the original Keras baseline.
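The four variants correspond to standard `TFLiteConverter` configurations, sketched below under our stated settings; `rep_dataset` is a representative-input generator assumed for INT8 calibration.

```python
import tensorflow as tf

def convert_variants(model, rep_dataset):
    """Produce the four TFLite variants evaluated in Table 5."""
    variants = {}

    # FP32 (builtins + select_tf_ops)
    c = tf.lite.TFLiteConverter.from_keras_model(model)
    c.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
                                   tf.lite.OpsSet.SELECT_TF_OPS]
    variants["fp32"] = c.convert()

    # Dynamic-range quantization (DRQ)
    c = tf.lite.TFLiteConverter.from_keras_model(model)
    c.optimizations = [tf.lite.Optimize.DEFAULT]
    variants["drq"] = c.convert()

    # FP16
    c = tf.lite.TFLiteConverter.from_keras_model(model)
    c.optimizations = [tf.lite.Optimize.DEFAULT]
    c.target_spec.supported_types = [tf.float16]
    variants["fp16"] = c.convert()

    # INT8 (full integer quantization with select_tf_ops)
    c = tf.lite.TFLiteConverter.from_keras_model(model)
    c.optimizations = [tf.lite.Optimize.DEFAULT]
    c.representative_dataset = rep_dataset
    c.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8,
                                   tf.lite.OpsSet.SELECT_TF_OPS]
    variants["int8"] = c.convert()

    return variants
```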
Table 5 reports the window-level MAE (overall and first-step) for each TFLite variant, evaluated on the same test split of windows. To keep the comparison fair, quantization effects in this subsection are interpreted relative to FP32 under the same feature setting. Under this setting, FP32 and DRQ remain close in the early horizon, while FP16 and INT8 exhibit larger deviations. A closer comparison shows that the degradation is not uniform across the horizon. Relative to FP32 (first-step MAE m; window MAE m), FP16 increases to m (+ m) at the first step and m (+ m) over the full window, while INT8 increases to m (+ m) and m (+ m), respectively. This indicates that precision reduction has a moderate short-horizon effect but a larger impact on long-horizon outputs. As a compact stability proxy for online publication, we also report the horizon-gap index from Table 5: FP32 m, DRQ m, FP16 m, INT8 m. A larger value of this index indicates stronger long-horizon output fluctuation, which is consistent with wider overlap-derived spread/percentile envelopes and a potentially higher risk of UI flicker in continuous updates. A plausible mechanism is that low-precision arithmetic introduces discretization and rounding perturbations in recurrent/attention computations, which become more visible as the prediction horizon extends [25]. In our setting, this supports the practical choice of FP32/DRQ when accuracy robustness is prioritized and FP16/INT8 when additional compression is required.
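As a sketch of how such a horizon-gap proxy can be computed (assuming, for illustration, the gap between last-step and first-step MAE over held-out windows; the exact definition accompanies Table 5):

```python
import numpy as np

def horizon_gap(dist):
    """Illustrative horizon-gap proxy.

    dist: per-window, per-step Euclidean errors, shape (num_windows, H).
    Returns last-step MAE minus first-step MAE; a larger value signals
    stronger long-horizon fluctuation.
    """
    first = float(dist[:, 0].mean())
    last = float(dist[:, -1].mean())
    return last - first
```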
We additionally benchmarked the end-to-end inference time for three representative throws. As shown in Table 6, all TFLite runtimes provide substantial speed-ups over the Keras baseline, typically reducing latency by 7–8×. Among them, the FP32 variant offers the best balance between execution time and fidelity, requiring no additional calibration while maintaining accuracy close to the original model.
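The TFLite timing protocol amounts to repeated interpreter invocations over the per-throw windows; a minimal sketch follows, with FP32 input handling and illustrative names rather than the exact benchmark harness.

```python
import time
import numpy as np
import tensorflow as tf

def time_tflite(tflite_bytes, windows, runs=3):
    """Average end-to-end inference time per throw for a TFLite model."""
    interp = tf.lite.Interpreter(model_content=tflite_bytes)
    interp.allocate_tensors()
    inp = interp.get_input_details()[0]
    out = interp.get_output_details()[0]

    t0 = time.perf_counter()
    for _ in range(runs):
        for w in windows:                          # all windows of one throw
            interp.set_tensor(inp["index"], w[None].astype(np.float32))
            interp.invoke()
            _ = interp.get_tensor(out["index"])
    return (time.perf_counter() - t0) / runs       # seconds per throw
```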
Overall, within this lightweight feature setting, the TFLite FP32 and DRQ models preserve most of the in-config predictive performance while reducing inference time to well under per full-trajectory prediction. These results confirm that lightweight deployment is feasible with moderate accuracy degradation, enabling real-time trajectory feedback on embedded devices.