Reliable gesture interfaces are essential for coordinating distributed robot teams in the field. However, models trained in a single domain often perform poorly when confronted with new users, different sensors, or unfamiliar environments. To address this challenge, we propose ReDIaL, a memory-efficient replay-based domain-incremental learning (DIL) framework that adapts to sequential domain shifts while minimizing catastrophic forgetting. Our approach employs a frozen encoder to create a stable latent space and a clustering-based exemplar replay strategy to retain compact, representative samples from prior domains under strict memory constraints. We evaluate the framework on a multi-domain air-marshalling gesture recognition task, where an in-house dataset serves as the initial training domain and the NATOPS dataset provides 20 cross-user domains for sequential adaptation. During each adaptation step, training data from the current NATOPS subject is interleaved with stored exemplars so that prior knowledge is retained while new-domain variability is accommodated. Across 21 sequential domains, our approach
outperforms pooled fine-tuning, incremental fine-tuning, and Experience Replay in the domain-incremental setting, and its accuracy approaches the joint-training upper bound, which represents the ideal case where data from all domains are available simultaneously. These results demonstrate that memory-efficient latent exemplar replay provides both strong adaptation and robust retention, enabling practical and trustworthy gesture-based human–robot interaction in dynamic real-world deployments.
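To make the clustering-based exemplar replay concrete, the sketch below selects a fixed per-domain budget of exemplars by running k-means in the frozen encoder's latent space and keeping the sample nearest each centroid, then interleaves the accumulated exemplars with each new domain's data. This is a minimal illustration under stated assumptions, not the paper's implementation: the `encode` stub, the `budget` value, and the `stream_of_domains` toy data are hypothetical placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical frozen encoder: in a replay-based DIL setup the encoder is
# pretrained and kept fixed so the latent space stays stable across domains.
# A random linear projection stands in here purely to keep the sketch runnable.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 16))

def encode(x):
    """Map raw gesture features of shape (n, 64) to latents of shape (n, 16)."""
    return x @ W

def select_exemplars(x, y, budget=20, seed=0):
    """Keep the sample nearest each k-means centroid in latent space.

    Clustering the frozen-encoder latents into `budget` groups and storing one
    representative per group yields a compact, diverse summary of the domain.
    """
    z = encode(x)
    k = min(budget, len(z))
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(z)
    keep = [
        int(np.argmin(np.linalg.norm(z - c, axis=1)))  # closest sample to centroid c
        for c in km.cluster_centers_
    ]
    return x[keep], y[keep]

# Toy stream of three "domains" (e.g., users), 4 gesture classes each.
stream_of_domains = [
    (rng.normal(loc=d, size=(200, 64)), rng.integers(0, 4, size=200))
    for d in range(3)
]

# Sequential adaptation: interleave the current domain's data with the
# replay buffer accumulated from all previous domains.
buffer_x, buffer_y = [], []
for domain_x, domain_y in stream_of_domains:
    if buffer_x:
        train_x = np.concatenate([domain_x, *buffer_x])
        train_y = np.concatenate([domain_y, *buffer_y])
    else:
        train_x, train_y = domain_x, domain_y
    # ... fine-tune the classifier head on (train_x, train_y) here ...
    ex, ey = select_exemplars(domain_x, domain_y)
    buffer_x.append(ex)
    buffer_y.append(ey)
```

Storing one exemplar per cluster rather than random samples is what keeps the buffer both small and representative: the memory cost grows only with the fixed budget per domain, while the cluster structure preserves the diversity needed to counter catastrophic forgetting during replay.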