1. Introduction
Over the past decade, the hospitality industry has undergone steady digital transformation driven by the adoption of Internet of Things (IoT) devices, advanced analytics, and artificial intelligence (AI) to improve operational efficiency and guest experience [1,2,3,4,5]. Within this context, Digital Twins (DTs), virtual representations of physical environments connected to real-time data and analytics, have emerged as a promising paradigm for hotels [6,7]. DTs can support continuous monitoring of service spaces, scenario-based what-if analysis, and data-driven optimization across operational workflows and management systems [8,9].
Despite this potential, most DT applications in hospitality emphasize building operations, energy management, or customer-facing visualization tools (e.g., 3D room explorers and digital replicas of resorts) [10,11,12]. In contrast, front-desk operations, where guest-facing interactions, service personalization, and human workload converge, remain comparatively underexplored from a DT perspective [13]. This gap is notable because reception services strongly influence perceived service quality and are often constrained by fragmented information systems, manual decision-making, and limited visibility into real-time demand and staff state [14,15].
A core challenge in this domain is the integration and fusion of heterogeneous signals that describe front-desk activity in real time. Traditional Property Management Systems (PMS), Customer Relationship Management (CRM) platforms, and building management systems provide essential yet siloed information about reservations, guest profiles, and infrastructure [16,17]. At the same time, new sensing modalities can capture physical presence, movement, and physiological indicators. Indoor positioning based on Optical Camera Communication (OCC) [18,19] can estimate where people are located in the reception area and how they move across predefined zones [20,21]. Wrist-worn wearables can provide heart rate, heart rate variability, and derived stress or workload proxies, offering indirect insight into staff workload in real time [22,23,24]. However, these streams are rarely integrated into a unified semantic model that supports human-centered operational optimization in hospitality.
In parallel, human-centered AI (HCAI) and ontology-based modeling have been proposed as foundations for DTs that move beyond technical monitoring and explicitly account for human factors, ethical constraints, and semantic interoperability [25,26,27,28,29,30]. Ontologies provide a formal, machine-interpretable representation of guests, staff, spaces, devices, and services, enabling reasoning over events, constraints, and relationships in complex environments such as hotel receptions [31,32]. HCAI principles further guide the design of systems that support staff rather than replace them, improve transparency and fairness, and respect privacy and data protection requirements [33,34,35,36].
In previous work, we proposed an ontology-driven DT framework for hotel front-desk services that models guests, rooms, receptionists, and IoT devices and demonstrated through simulation-based evaluation that this approach can reduce check-in times and improve staff efficiency [37]. That framework emphasized semantic interoperability and reasoning, but it was primarily validated with synthetic or abstracted data and did not incorporate fine-grained positioning information or physiological indicators from real or emulated devices.
This article extends that line of research by introducing a human-centered DT architecture that fuses two complementary sources in real time: (i) spatial events from an OCC-based camera system providing timestamped positions of individuals within the reception area, and (ii) physiological and activity metrics collected from commercial wrist-worn wearables assigned to front-desk staff. Both streams are ingested through a vendor-agnostic, property-oriented REST interface and mapped onto an extended ontology that captures position events, zone semantics, and workload-related indicators. This enables the DT to reason about where people are, how long they remain in specific zones (e.g., queueing, service, or back-office areas), and how demanding the current conditions are for each staff member.
Figure 1 summarizes the architecture and data-to-decision flow.
The resulting system is intended to support real-time and near-real-time front-desk workforce management. By combining spatial occupancy patterns, queue dynamics, and wearable-derived workload proxies, the DT derives key performance indicators (KPIs) such as estimated queue length, time spent in service and waiting zones, and a staff workload index. These KPIs can be used to recommend operational actions (e.g., reallocating staff across service points or opening additional counters) while maintaining a strong privacy focus through pseudonymization and zone-level aggregation rather than individually identifiable tracking [38,39].
Building on the ontology and architectural patterns introduced in our prior DT framework for hospitality front-desk services, this work makes the following contributions:
An extension of an existing ontology for hotel front-desk operations with a space–time event model for people positioning (including PositionEvent and zone semantics) and wearable-derived metrics (e.g., heart rate, heart rate variability, and a stress/workload index) associated with reception staff.
A vendor-agnostic, property-based REST API for ingesting OCC and wearable events into the DT platform, including design elements that support idempotency, versioning, pagination, and privacy-aware data handling suitable for integration with existing hotel information systems.
A people-positioning and workload model that fuses zone transitions and physiological signals to estimate queue- and workload-related KPIs in real time, enabling human-centered decision support for staff allocation and service prioritization.
An instantiation and evaluation through scenarios using real and/or realistically synthesized datasets aligned with the deployed APIs, assessing localization quality, zone classification accuracy, queue estimation error, and workload prediction performance, and comparing the DT-based approach to baselines without semantic fusion.
By combining semantic modeling, multi-sensor data fusion, and human-centered AI principles, this work demonstrates how ontology-based DTs can evolve from structural representations into operational tools for managing guest flows and staffing at the hotel front desk while explicitly addressing privacy, ethical, and organizational constraints in realistic hospitality settings.
2. Related Work
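Digital Twins (DTs) were originally developed for engineering assets and cyber–physical systems in manufacturing, infrastructure, and smart cities, with an emphasis on equipment health monitoring and process optimization [40,41]. More recently, DT concepts have been extended to service-oriented and human-centric contexts, where the objective is to improve service delivery, user experience, and staff performance rather than only technical assets. In hospitality, emerging conceptual models and case studies explore DTs for hotels through the integration of real-time data, predictive analytics, and customer-centric services.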
41]. More recently, DT concepts have been extended to service-oriented and human-centric contexts, where the objective is to improve service delivery, user experience, and staff performance rather than only technical assets. In hospitality, emerging conceptual models and case studies explore DTs for hotels through the integration of real-time data, predictive analytics, and customer-centric services.
Recent service-oriented DT research argues that digital twins can support customer-experience management by coupling real-time data streams with simulation to anticipate issues and test interventions [42]. Broader surveys of DT progress and industry adoption also highlight governance and organizational barriers that become especially salient in people-intensive service settings [43]. Within tourism, recent work on smart destinations emphasizes acceptance, perceived risks, and governance capacity as determinants of whether DT deployments are trusted and sustained [44]. Architectural proposals for product–service systems likewise motivate DT designs that separate low-level device integration from higher-level service models and analytics [45]. For example, Hassan and Eassa propose a Smart Hotel Management Information System (SHMIS), an integrated IoT context-awareness framework intended to enhance guest experience and operational efficiency [46]. Practitioner reports similarly present DTs as a means to replicate hotel facilities virtually, analyze guest journeys, and support scenario planning for operations and marketing. However, much of this work remains at the architectural or facility level and emphasizes guest-facing applications, while the modeling of front-desk workflows, staff workload, and human factors is comparatively limited. Prior research began to address this gap through an ontology-driven DT for hotel front-end services that explicitly models receptionists, guests, resources, and service processes and demonstrates, via simulation, that ontology-based reasoning can reduce waiting time and improve front-desk performance. Compared with these service-oriented DT approaches, the present article advances the state of the art by incorporating fine-grained, real-time indoor positioning and wearable signals into the same semantic framework, with the explicit goal of supporting human-centric management of front-desk staff.
Ontologies provide formal, machine-interpretable representations of complex domains, enabling knowledge sharing and semantic interoperability. Within this line of work, event-based and spatio-temporal ontologies have emerged as an effective approach for modeling situations that unfold over time and space. Event-centric models treat events as first-class entities linked to participants, places, and times and have been applied to domains such as cultural heritage, sensor networks, and urban dynamics [47,48]. Many of these approaches explicitly represent temporal extent and spatial footprint, often adopting a four-dimensional view in which entities are described as space–time regions. Recent work also combines ontological representations with operational data structures to support querying and processing of dynamic events. For example, ref. [49] proposes a spatio-temporal entity-based event data model that integrates object-centric and event-centric views to represent evolving entities and their interactions in space and time. Comparative studies highlight recurring design patterns, such as linking events to time intervals, locations, actors, and thematic roles, and discuss trade-offs between expressiveness and computational tractability. In hospitality and tourism, ontologies have been used mainly to represent tourism products, accessibility, and personalized recommendations [50,51,52], whereas less attention has been paid to formalizing spatio-temporal events that describe how guests and staff move through service spaces such as reception areas. The ontology introduced in prior work begins to address this gap by modeling front-desk processes and IoT interactions, yet its treatment of space and time remains relatively coarse. The present work builds on these foundations by extending the ontology with explicit classes and properties for PositionEvent, semantic zones, and trajectories, enabling the DT to reason about where and when people are located at the front desk and how these patterns relate to workload and service quality.
Indoor positioning systems (IPS) rely on technologies such as Wi-Fi fingerprinting, Bluetooth Low Energy beacons, ultra-wideband, and optical camera communication (OCC) to provide location estimates in environments where satellite-based positioning is not available. IPS solutions have been applied in transportation hubs, hospitals, shopping malls, and event venues to support wayfinding, asset tracking, and people-flow analytics [53,54,55]. In tourism and hospitality, deployments often focus on indoor navigation for guests, location-based personalization, and visitor-flow analytics in large venues such as resorts or airports. OCC has been proposed as a promising technology for high-accuracy indoor localization. Chavez-Burbano et al. present an OCC system for three-dimensional indoor localization using modulated light sources and standard cameras, reporting sub-meter accuracy under realistic conditions. Subsequent work has shown that OCC-based systems can be combined with wearable transmitters and image-processing pipelines to support indoor positioning in domains including healthcare and sports [56]. In hospitality, IPS are also being adopted for indoor navigation and for operational use cases such as coordinating cleaning and maintenance staff and analyzing guest flows through lobbies and shared areas [57]. However, these applications are typically guest-centric and rarely integrate localization data into a broader semantic model of front-desk operations and human resources. The present work leverages OCC-based positioning as a source of space–time events (camera identifier, timestamp, and $(x, y, z)$ coordinates within predefined zones) that can be mapped onto an ontology-driven DT, enabling richer reasoning about queues, occupancy of service zones, and staff deployment at the reception.
Figure 2 illustrates how OCC-based sensing is interpreted within the proposed front-desk DT. A ceiling- or wall-mounted camera observes the service area and decodes OCC identifiers associated with tracked persons, producing timestamped occupancy and positioning events (including camera ID and confidence). These events are forwarded to the backend, mapped to the DT’s canonical 3D reference frame, and classified using predefined semantic zones (e.g., the highlighted queue zone). Zone-aware trajectories and dwell times support real-time estimation of queue dynamics and occupancy, which then feed higher-level KPIs and decision-support recommendations.
Wearable devices such as smart bands and smartwatches provide continuous streams of physiological and activity data, including heart rate, heart rate variability (HRV), step count, and sleep indicators. Systematic reviews report extensive use of wearables for stress monitoring and intervention, often relying on HRV-derived features as proxies for autonomic balance [58,59]. These studies suggest that commercial wearables can support stress awareness and self-regulation and may contribute to early detection of mental health risks while also noting limitations related to signal quality, robustness in real-world conditions, and long-term adherence. In occupational settings, HRV from wearable ECG or photoplethysmography (PPG) sensors has been used to estimate stress and fatigue among shift workers, including nurses, showing associations between physiological indicators, perceived workload, and shift patterns [60,61]. Other work discusses how smartwatches and HRV-based analytics could be integrated into broader stress-management and well-being programs, emphasizing the potential for continuous, unobtrusive monitoring of workload and recovery [62]. Although much of this literature focuses on healthcare and office environments, it provides methodological guidance on feature extraction, labeling strategies, and modeling approaches for mapping wearable signals to stress and workload indices. In hospitality, wearable research has primarily targeted guests (e.g., activity tracking and experience analysis), with limited attention to staff-facing applications for managing human resources and preventing burnout. To the best of our knowledge, no prior work integrates wearable-derived workload indicators with indoor positioning data within an ontology-based DT for hotel front-desk services. By combining OCC-based spatial events and wearable metrics in a unified semantic model, the present work addresses this gap and evaluates how multimodal sensing can support human-centric decision-making for staff allocation and service quality management.
Indoor positioning for hospitality DTs involves practical trade-offs among accuracy, infrastructure cost, device requirements, and environmental sensitivity. UWB deployments typically provide decimeter-level accuracy but require anchors, calibration, and tag management. WiFi RTT can leverage existing access points, but its accuracy varies with access-point geometry and multipath conditions, and it depends on device support. BLE beacons offer low cost and wide availability but usually yield coarser accuracy and require careful calibration. Vision-only tracking avoids tags at the expense of privacy, occlusion sensitivity, and higher computational cost. In this work, OCC is selected as a camera-based modality that can coexist with front-desk visual infrastructure and provide fine-grained, event-level positioning of the reception area when tags are available. Because the DT interfaces are defined in terms of normalized events and zone semantics under a unified ingestion schema, the core DT logic can be preserved if the underlying positioning modality changes.
From a semantic interoperability perspective, the proposed sensing concepts can be aligned with standard sensor ontologies such as SOSA/SSN. In particular, OCC detections and wearable measurements can be interpreted as observations with associated timestamps, results, and provenance, while the front-desk ontology contributes the domain semantics required for staffing, zones, and queue reasoning. This alignment supports interoperability without sacrificing the reception-specific abstractions needed for operational decision support.
3. Materials and Methods
This work follows a design science approach to extend an ontology-based DT for hotel front-desk services with multimodal sensing from an OCC positioning system and staff wearables. The materials comprise the ontology and its extensions, the OCC-derived 3D positioning and zone semantics, the wearable-derived workload indicators, and a modular implementation that connects ingestion, reasoning, and DT state. These elements are exposed through a property-based REST API that provides both raw events and aggregated key performance indicators (KPIs).
The semantic foundation of the DT is a front-desk ontology that models entities such as Guest, Receptionist, Room, ServiceCounter, and Device, together with service processes such as CheckIn, CheckOut, and generic ServiceActivity instances. The baseline ontology was developed following standard ontology engineering practices, with reuse of established vocabularies where feasible and explicit documentation of constraints and axioms. While the baseline supports reasoning about processes and resource utilization, its native representation of physical space is coarse and it does not capture physiological workload. The DT is therefore extended with spatio-temporal observation concepts and workload-related concepts so that spatial behavior and staff physiological state can be queried and reasoned about jointly.
To represent spatio-temporal observations coming from the positioning system, the ontology introduces a PositionEvent concept that captures the observation of a person token at a given time and 3D location. A PositionEvent includes hasTimestamp (UTC instant), hasCameraId (the producing camera identifier), hasPersonToken (a pseudonymous token corresponding to detected_id), posX, posY, and posZ (normalized 3D coordinates derived from object_position_x, object_position_y, and object_position_z), eventType (a semantic label derived from event, e.g., AccessAllowed), hasROI (a region-of-interest descriptor corresponding to roi), and hasConfidence (a numeric score in $[0,1]$). Spatial semantics are represented with Zone entities that correspond to meaningful reception areas such as QueueZone, ServiceZone, BackOfficeZone, and EntranceZone. Zones are configured using a 2D footprint and an optional height interval, which provides a practical 3D representation for reception spaces. Each PositionEvent is assigned a zone label. Sequences of events associated with the same person token are grouped as Trajectory instances, and maximal contiguous sequences that remain in the same zone are represented as ZoneDwellSegment entities, enabling queries about dwell times, transitions, and zone occupancy. To improve interoperability across sensing modalities, the event layer can be aligned with standard observation vocabularies (e.g., W3C SOSA/SSN) by treating each positioning or physiological sample as an observation with a feature of interest, a result, and a timestamp. The manuscript retains domain-specific class names for readability, while the mapping to an observation pattern is documented so that OCC can be replaced by UWB/BLE/WiFi RTT without changing the DT semantics beyond the adapter that populates the canonical event schema.
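As a minimal sketch of the observation alignment described above, the following Python fragment mirrors the PositionEvent properties in a plain data class and maps one event to a SOSA-style observation record. The SOSA term names (`sosa:Observation`, `sosa:madeBySensor`, `sosa:hasFeatureOfInterest`, `sosa:observedProperty`, `sosa:resultTime`, `sosa:hasSimpleResult`) come from the W3C vocabulary; the Python field names, the `"position"` property label, and the choice of which field maps to which term are illustrative assumptions, not the mapping documented by the authors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PositionEvent:
    """Hypothetical Python mirror of the PositionEvent properties."""
    has_timestamp: str      # UTC instant, ISO 8601
    has_camera_id: str
    has_person_token: str   # pseudonymous token (detected_id)
    pos_x: float
    pos_y: float
    pos_z: float
    event_type: str
    has_roi: tuple
    has_confidence: float


def to_sosa_observation(ev: PositionEvent) -> dict:
    """Map a PositionEvent to a SOSA-style observation (assumed mapping)."""
    return {
        "@type": "sosa:Observation",
        "sosa:madeBySensor": ev.has_camera_id,
        "sosa:hasFeatureOfInterest": ev.has_person_token,
        "sosa:observedProperty": "position",  # assumed property label
        "sosa:resultTime": ev.has_timestamp,
        "sosa:hasSimpleResult": [ev.pos_x, ev.pos_y, ev.pos_z],
    }


example = PositionEvent(
    "2025-11-26T16:05:22.558623Z", "CAM01", "110100",
    0.25, 0.17, 0.68, "AccessAllowed", (124, 210, 40, 40), 0.97,
)
observation = to_sosa_observation(example)
```

Under this pattern, an adapter for another positioning modality would only need to emit the same record shape.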
Physiological workload and stress are represented through a PhysiologicalMetric abstraction with subclasses such as HeartRate, HeartRateVariabilityMetric, and ActivityLevel. Each metric is linked to a StaffMember, a time interval, and a numeric value. A derived WorkloadIndex is defined as a continuous measure computed over sliding windows, and WorkloadState is used to provide an interpretable categorical discretization (e.g., Low, Medium, High). These concepts are connected to the spatial layer by associating workload states with trajectory segments and service activities so that the DT can represent where staff are, what they are doing, and how loaded they are within a unified semantic model.
The OCC system produces records with the following fields:
timestamp, camera_id, detected_id,
object_position_x, object_position_y, object_position_z,
event, roi, confidence
A representative example as received is:
2025-11-26T16:05:22.558623Z, CAM01, 110100,
"1.25", "0.85", "1.70", AccessAllowed, [124,210,40,40], "0.97"
During ingestion, timestamps are parsed as UTC instants and the $(x, y, z)$ triplet is mapped from the camera-local frame into a canonical 3D reference frame attached to the reception area. Calibration uses camera intrinsics and extrinsics and a rigid transformation to support alignment across cameras when multiple devices are deployed. The transformed coordinates are normalized to $[0,1]^3$ per camera, simplifying zone configuration and reducing dependence on specific hardware characteristics. Zones are configured as labeled 3D volumes using a polygonal footprint and a height interval. Each event is classified through a point-in-volume test; events that do not fall into any configured zone are assigned to UnknownZone and reserved for diagnostics and calibration validation.
The core spatial processing can be summarized by the following pseudo-code:
for each incoming OCC event e:
    parse and normalize timestamp and (x,y,z)
    transform to canonical frame and normalize to [0,1]^3
    if e.confidence < conf_threshold: continue
    zone = first Zone prism that contains (x,y,z), else UnknownZone
    update last_seen[token] = (t, zone, position)
    expire tokens with now - last_seen[token].t > timeout
    occupancy[zone] = count of non-expired tokens assigned to zone
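The spatial processing summarized above can be sketched in runnable Python as follows. This is a minimal illustration, not the deployed implementation: the `Zone` and `OccupancyTracker` names, the ray-casting footprint test, and the default threshold and timeout values are assumptions introduced here, and positions are assumed to be already normalized to the unit cube.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    label: str
    footprint: tuple        # polygon vertices ((x, y), ...) in normalized coordinates
    z_min: float = 0.0      # height interval of the prism
    z_max: float = 1.0


def point_in_polygon(x, y, polygon):
    """Ray casting: toggle on each polygon edge crossed by a ray going in +x."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_zone(position, zones):
    """Return the label of the first prism containing the point."""
    x, y, z = position
    for zone in zones:
        if zone.z_min <= z <= zone.z_max and point_in_polygon(x, y, zone.footprint):
            return zone.label
    return "UnknownZone"


class OccupancyTracker:
    def __init__(self, zones, conf_threshold=0.5, timeout_s=10.0):
        self.zones = zones
        self.conf_threshold = conf_threshold  # assumed default
        self.timeout_s = timeout_s            # assumed default
        self.last_seen = {}                   # token -> (t, zone, position)

    def update(self, token, t, position, confidence):
        if confidence < self.conf_threshold:
            return  # low-confidence detections are skipped
        self.last_seen[token] = (t, assign_zone(position, self.zones), position)

    def occupancy(self, now):
        # Expire tokens that have not been observed within the timeout.
        self.last_seen = {k: v for k, v in self.last_seen.items()
                          if now - v[0] <= self.timeout_s}
        counts = {}
        for _, zone, _ in self.last_seen.values():
            counts[zone] = counts.get(zone, 0) + 1
        return counts


zones = [
    Zone("QueueZone", ((0.0, 0.0), (0.5, 0.0), (0.5, 0.5), (0.0, 0.5))),
    Zone("ServiceZone", ((0.5, 0.0), (1.0, 0.0), (1.0, 0.5), (0.5, 0.5))),
]
tracker = OccupancyTracker(zones)
tracker.update("a", 0.0, (0.25, 0.25, 0.5), 0.9)
tracker.update("b", 5.0, (0.30, 0.30, 0.5), 0.2)   # dropped: low confidence
tracker.update("c", 8.0, (0.75, 0.25, 0.5), 0.8)
```

In this sketch the instantaneous queue length is `tracker.occupancy(now).get("QueueZone", 0)`.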
For each person token $p$, the system maintains an ordered sequence of events
$$S_p = \big( (t_k, z_k, \mathbf{r}_k) \big)_{k=1}^{N_p},$$
where $t_k$ is the timestamp, $z_k$ is the zone label, and $\mathbf{r}_k$ is the normalized 3D position. Zone transitions correspond to changes in the zone label, and dwell time is computed from entry and exit instants within each contiguous zone segment. These sequences provide the basis for queue-related and service-related indicators, including instantaneous queue length as the number of tokens in QueueZone, occupancy of service zones, distributions of dwell times in queue and service areas, and trajectories connecting entrance, queue, and service counters. The confidence score is stored in PositionEvent.hasConfidence and supports filtering or weighted aggregation; events below a configurable threshold can be excluded from KPI computation, while alternative configurations may down-weight low-confidence contributions.
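To make the dwell-time computation concrete, the sketch below groups one token's time-ordered events into contiguous same-zone segments. It assumes that the entry instant of a segment is the first observation in the zone and the exit instant is the last one; the source does not specify this convention, and the function names are hypothetical.

```python
from itertools import groupby


def dwell_segments(events):
    """Split one token's (t, zone) events into contiguous same-zone segments.

    Returns a list of (zone, t_entry, t_exit, dwell_seconds).
    """
    ordered = sorted(events, key=lambda e: e[0])
    segments = []
    for zone, group in groupby(ordered, key=lambda e: e[1]):
        times = [t for t, _ in group]
        segments.append((zone, times[0], times[-1], times[-1] - times[0]))
    return segments


def total_dwell_by_zone(segments):
    """Sum dwell time per zone over all segments of a trajectory."""
    totals = {}
    for zone, _, _, dwell in segments:
        totals[zone] = totals.get(zone, 0.0) + dwell
    return totals


trace = [
    (0.0, "EntranceZone"),
    (2.0, "QueueZone"),
    (10.0, "QueueZone"),
    (12.0, "ServiceZone"),
    (30.0, "ServiceZone"),
]
segments = dwell_segments(trace)
```

With this convention a zone observed only once yields a zero-length segment, so sparse sampling will underestimate dwell times.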
The positioning model assumes that tracked entities carry a detectable tag that yields detected_id. This assumption is limited to persons of interest whose trajectories are needed at individual level, such as front-desk staff or opt-in participants in controlled pilots; it is not a requirement that all hotel guests carry a token. Guest-flow indicators can be derived from zone-level occupancy and queue estimates and can be complemented with existing operational systems (e.g., check-in logs) without persistent per-guest identification. Any mapping between sensor-facing tokens and internal identities is handled outside the DT under hotel-controlled governance and access control.
Wearable data ingestion is designed for wrist devices that provide heart rate (HR), heart rate variability (HRV) metrics when available, and basic activity indicators. A background job periodically queries the vendor cloud API for registered devices and retrieves data over overlapping windows. Each sample is stored with staffId (internal staff identifier), deviceId, metricType (HR, RMSSD, SDNN, activity, etc.), startTimestamp, endTimestamp, and a numeric value. Identifiers can be pseudonymized at ingestion, with a private mapping table controlled by the hotel. Timestamps are normalized to UTC and can optionally be duplicated in Unix epoch format for interoperability with downstream systems.
Workload estimation is computed per staff member over sliding windows W by extracting a feature vector from available signals, such as HR summary statistics, time-domain HRV features, and activity proxies. Features are normalized using rolling baselines per person to mitigate inter-individual variability. A configurable function yields the WorkloadIndex. A rule-based configuration can be used with thresholds on HR increases and HRV decreases, and a data-driven configuration can be used when a supervised model is calibrated using periodic self-reported workload labels. The index is discretized into WorkloadState categories and combined with spatial context, enabling detection of patterns such as sustained high workload while remaining in high-pressure zones.
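One possible rule-based configuration is sketched below. It is an illustrative assumption, not the authors' calibrated function: the equal weighting, the saturation gains (`hr_gain`, `rmssd_gain`), the mapping to a 0 to 1 scale, and the discretization cut points are all hypothetical, and the per-person baselines are passed in rather than maintained as rolling statistics.

```python
from statistics import mean


def workload_index(hr_window, rmssd_window, hr_baseline, rmssd_baseline,
                   hr_gain=0.30, rmssd_gain=0.50):
    """Rule-based index from a relative HR rise and a relative HRV (RMSSD) drop.

    hr_gain / rmssd_gain are the relative changes at which each term
    saturates (hypothetical values).
    """
    hr_rise = max(0.0, mean(hr_window) / hr_baseline - 1.0)
    hrv_drop = max(0.0, 1.0 - mean(rmssd_window) / rmssd_baseline)
    hr_term = min(1.0, hr_rise / hr_gain)
    hrv_term = min(1.0, hrv_drop / rmssd_gain)
    return 0.5 * hr_term + 0.5 * hrv_term


def workload_state(index, low=0.33, high=0.66):
    """Discretize the index into WorkloadState categories (assumed cut points)."""
    if index < low:
        return "Low"
    if index < high:
        return "Medium"
    return "High"


resting = workload_index([70, 70], [40, 40], hr_baseline=70, rmssd_baseline=40)
moderate = workload_index([84], [30], hr_baseline=70, rmssd_baseline=40)
strained = workload_index([91, 91], [20, 20], hr_baseline=70, rmssd_baseline=40)
```

A data-driven configuration would replace `workload_index` with a supervised model fitted to self-reported labels while keeping the same inputs and discretization step.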
The system is implemented as a pipeline from ingestion to DT-level KPIs. OCC events are received through a webhook or message queue, while wearable metrics are retrieved through periodic API calls. Both streams are validated and normalized before being processed by spatio-temporal reasoning that performs calibration, coordinate normalization, zone assignment, and derivation of trajectories and dwell segments. Wearable metrics are aggregated into workload indices over sliding windows. Normalized observations and derived quantities are mapped to instances of the ontology, and a reasoning component enforces constraints and infers higher-level facts such as overload conditions or unusual movement patterns. The DT state and aggregated KPIs are then exposed via a REST API and consumed by dashboards, monitoring tools, and other hotel systems. The design is modular: sensing components can be replaced as long as they provide the normalized event schema, so the DT core remains stable as new data sources are integrated.
To decouple the DT from specific devices and vendors, the REST API follows a property-based design. The API is centered on observable DT properties such as position events, physiological metrics, snapshots, and KPIs. Ingestion endpoints are idempotent with respect to a composite natural key: {timestamp, camera_id, detected_id} for OCC events and {metricType, staffId, startTimestamp, endTimestamp} for wearable metrics. Duplicate submissions are ignored. Versioning is encoded in the URL path, and all endpoints require token-based authentication handled by the existing authorization service. Pseudonymized identifiers are used systematically, and mappings to real identities remain outside the DT under hotel-controlled policies and applicable data-protection requirements. Query endpoints provide pagination and filtering by time range, zone, staff role, and metric type so that clients can request aggregated KPIs when fine-grained data are not necessary.
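The idempotency rule can be illustrated with a small in-memory sketch. The composite keys are those stated above; the `IdempotentStore` class and its `ingest` method are hypothetical names, and a real deployment would presumably enforce the same keys as a uniqueness constraint in persistent storage.

```python
class IdempotentStore:
    """In-memory store that ignores records whose composite key was already seen."""

    def __init__(self, key_fields):
        self.key_fields = tuple(key_fields)
        self.records = {}

    def ingest(self, record):
        """Return True if the record was stored, False if it was a duplicate."""
        key = tuple(record[field] for field in self.key_fields)
        if key in self.records:
            return False
        self.records[key] = dict(record)
        return True


occ_store = IdempotentStore(("timestamp", "camera_id", "detected_id"))
wearable_store = IdempotentStore(
    ("metricType", "staffId", "startTimestamp", "endTimestamp"))

occ_event = {
    "timestamp": "2025-11-26T16:05:22.558623Z",
    "camera_id": "CAM01",
    "detected_id": "110100",
    "confidence": "0.97",
}
first = occ_store.ingest(occ_event)
retry = occ_store.ingest(dict(occ_event))   # same natural key, e.g. a webhook retry
```

Because duplicates are dropped silently, producers can retry failed submissions and the wearable job can query overlapping windows without inflating counts.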
Key operational parameters are summarized in Table 1 to support reproducibility and facilitate independent re-implementation.
Listing 1 summarizes the core logic of OCC event normalization and zone assignment used by the DT.
Listing 1. OCC normalization, canonical transform, and zone assignment.
Input: raw_event(timestamp, camera_id, detected_id, x, y, z, roi, confidence)
Parse timestamp as UTC instant
Parse x, y, z as float (decimal points); validate required fields
If confidence < threshold: mark low confidence or discard
# Canonical transformation
r_local <- (x, y, z)
r_world <- T_camera_to_world(camera_id) * r_local
r_norm <- normalize_to_unit_cube(r_world, camera_id)  # maps to [0,1]^3
# Zone assignment (prismatic zones)
zone <- UnknownZone
For each zone volume Z in active configuration:
    if point_in_volume(r_norm, Z): zone <- Z.label; break
Create PositionEvent with (timestamp, camera_id, detected_id, r_norm, zone, confidence)
Update last_seen[detected_id] and expire tokens older than timeout
Output: normalized PositionEvent with zone label
To support reproducibility and peer validation, key research assets (including DTDL ontology files, SPARQL and SWRL rules, simulation scripts, and synthetic datasets) are publicly available at https://github.com/msegced/doctorado/ (accessed on 22 January 2026) under an open-source license. The repository includes documentation and a reproducible setup guide.
4. Implementation
The proposed ontology-based DT is implemented as a runtime pipeline that turns heterogeneous sensing streams into a unified, queryable DT state. The implementation covers semantic mapping from raw OCC and wearable records to ontology instances, operational processing for normalization, zone assignment, and state inference, and integration through a property-based REST interface.
Figure 3 summarizes the deployment view: OCC positioning events and wearable metrics are ingested through the API, mapped to ontology individuals, validated and normalized, and then processed through parallel spatial and workload pipelines. The resulting ontology-backed DT state supports KPI computation, rule-based triggers, and downstream consumption by dashboards, human-resource decision support, and simulation modules.
The runtime is designed to tolerate partial observability and degrade gracefully. When OCC events are temporarily missing (occlusions, camera maintenance, network loss), occupancy and queue indicators fall back to the most recent valid observations subject to timeout expiry; prolonged gaps explicitly reduce confidence in KPIs and can suspend recommendations that require reliable positioning. When wearable data are missing (battery, off-wrist, non-compliance), workload indicators are marked as unavailable and recommendations rely on spatial and operational KPIs only. Conflicting signals (e.g., high inferred workload with low queue load) are handled by surfacing the inconsistency to the dashboard (rather than forcing a single interpretation) and by keeping recommendation rules conjunctive and explainable (e.g., requiring both sustained queue pressure and sustained high workload before suggesting staff rotation).
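The degradation and conflict-handling behavior can be sketched as a small conjunctive rule. The window length, queue threshold, action names, and note strings below are hypothetical; only the overall logic (suspend without reliable positioning, fall back to spatial KPIs without wearables, surface inconsistencies instead of acting on them) follows the description above.

```python
def recommend(queue_lengths, workload_states, queue_threshold=4, window=3):
    """Return an action and an explanatory note from recent KPI samples.

    queue_lengths: recent queue-length samples (oldest first).
    workload_states: recent WorkloadState labels, or None when wearable
    data are unavailable.
    """
    recent_q = queue_lengths[-window:]
    if len(recent_q) < window:
        return {"action": None,
                "note": "positioning data insufficient; recommendation suspended"}
    queue_pressure = all(q >= queue_threshold for q in recent_q)

    workload_available = workload_states is not None
    recent_w = workload_states[-window:] if workload_available else []
    sustained_high = (len(recent_w) == window
                      and all(state == "High" for state in recent_w))

    if queue_pressure and sustained_high:
        return {"action": "suggest_staff_rotation",
                "note": "sustained queue pressure and sustained high workload"}
    if queue_pressure:
        note = ("queue pressure without sustained high workload"
                if workload_available
                else "workload unavailable; spatial KPIs only")
        return {"action": "open_additional_counter", "note": note}
    if sustained_high:
        return {"action": None,
                "note": "inconsistent signals: high workload with low queue load"}
    return {"action": None, "note": "no sustained condition"}


both = recommend([5, 6, 5], ["High", "High", "High"])
no_wearable = recommend([5, 6, 5], None)
conflict = recommend([0, 1, 0], ["High", "High", "High"])
gap = recommend([5], ["High", "High", "High"])
```

Returning the note alongside the action keeps each recommendation explainable on the dashboard.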
A dedicated semantic mapping layer transforms raw records from OCC cameras and the wearable provider into instances of the front-desk ontology.
Table 2 summarizes the correspondence between source fields and ontology concepts. The mapping is implemented declaratively through configuration rules that specify JSON-to-ontology correspondences, type conversions (e.g., ISO 8601 [63,64] timestamps to epoch time), unit normalization, and enumeration mappings for categorical fields. This design reduces coupling between upstream schemas and the DT core: when source fields evolve, the mapping configuration can be updated without changing the ontology or the downstream processing logic.
At runtime, OCC events pass through a normalization pipeline that performs parsing, validation, timestamp handling, 3D coordinate transformation, and confidence-aware filtering. Timestamps in ISO 8601 format are interpreted as UTC instants and converted to Unix epoch seconds to simplify time indexing. The $(x, y, z)$ position triplet is transformed from the camera-local frame into the canonical 3D reference frame of the front desk and normalized per camera. Confidence values are stored with each event and can be used to discard detections below a configurable threshold or to retain them with a low-confidence flag, depending on the deployment profile. Wearable metrics follow analogous validation steps, including interval ordering checks and value-range validation.
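The normalization steps described above can be illustrated with a minimal sketch. It is not the prototype's code: apart from detected_id, the field names (timestamp, x, y, z, confidence), the rigid-transform calibration format, and the default threshold are assumptions made for illustration.

```python
from datetime import datetime, timezone

def iso_to_epoch(ts):
    """Interpret an ISO 8601 timestamp as a UTC instant; return epoch seconds."""
    dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    if dt.tzinfo is None:  # assumption: naive timestamps are already UTC
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.timestamp()

def to_canonical(point, rotation, translation):
    """Apply a rigid transform (3x3 rotation, then translation) to a point."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )

def normalize_event(raw, rotation, translation, min_conf=0.5, drop_low=True):
    """Return a normalized event dict, or None when the event is rejected."""
    try:
        token = raw["detected_id"]
        conf = float(raw["confidence"])
        point = (float(raw["x"]), float(raw["y"]), float(raw["z"]))
        epoch = iso_to_epoch(raw["timestamp"])
    except (KeyError, TypeError, ValueError):
        return None  # parsing or validation failure
    if not 0.0 <= conf <= 1.0:
        return None  # value-range validation
    low = conf < min_conf
    if low and drop_low:
        return None  # deployment profile that discards weak detections
    return {
        "detected_id": token,
        "t": epoch,
        "pos": to_canonical(point, rotation, translation),
        "confidence": conf,
        "low_confidence": low,
    }
```

Setting drop_low to False corresponds to the deployment profile that retains weak detections with a low-confidence flag instead of discarding them.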
Zone assignment is performed through a point-in-volume test in the canonical 3D frame. Zones are configured through a management interface that stores polygonal footprints and height intervals in a dedicated configuration store. Zone definitions are versioned so that layout changes are traceable: subsequent events are evaluated with the updated configuration, while historical events remain associated with their original version. Zone occupancy is computed continuously by maintaining, for each person token, the most recent observation and its associated zone, and tokens expire after a configurable timeout to avoid drifting counts when observations stop.
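As an illustration of the point-in-volume test and the timeout-based occupancy count, the following sketch assumes that each zone is a polygonal footprint extruded over a height interval and that the footprint test uses even-odd ray casting. The Zone class, the function names, and the data layout are hypothetical; zone versioning is omitted for brevity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Zone:
    name: str
    footprint: tuple  # polygon vertices ((x, y), ...) in the canonical frame
    z_min: float
    z_max: float

def in_polygon(x, y, poly):
    """Even-odd ray casting: count edge crossings of a ray towards +x."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def assign_zone(pos, zones):
    """Return the name of the first zone containing pos, or None."""
    x, y, z = pos
    for zone in zones:
        if zone.z_min <= z <= zone.z_max and in_polygon(x, y, zone.footprint):
            return zone.name
    return None

def occupancy(last_seen, now, timeout_s):
    """Count tokens per zone; last_seen maps token -> (epoch_s, zone_name).

    Tokens whose latest observation is older than timeout_s are ignored,
    which keeps counts from drifting when observations stop.
    """
    counts = {}
    for t, zone in last_seen.values():
        if zone is not None and now - t <= timeout_s:
            counts[zone] = counts.get(zone, 0) + 1
    return counts
```

Overlapping zones are resolved here by list order, which is one possible convention among several.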
Coarse behavioral states per person token are inferred from zone transitions and time-based rules. The states include Idle, Approaching, WaitingInQueue, InService, and Leaving. State changes are recorded as ontology instances, enabling reconstruction of service episodes and estimation of waiting-time proxies from spatial traces.
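The text does not list the individual transition rules, so the following sketch shows only one plausible rule set that yields the five states. The zone name ServiceZone, the 30-second idle threshold, and the specific rules are assumptions made for illustration.

```python
QUEUE, SERVICE = "QueueZone", "ServiceZone"  # ServiceZone is a hypothetical name

def infer_state(prev_state, zone, dwell_s, idle_after_s=30.0):
    """Map the current zone, dwell time, and previous state to a coarse state."""
    if zone == QUEUE:
        return "WaitingInQueue"
    if zone == SERVICE:
        return "InService"
    if prev_state in ("InService", "Leaving"):
        return "Leaving"  # outside queue/service after having been served
    if dwell_s >= idle_after_s:
        return "Idle"  # time-based rule: long dwell without queueing
    return "Approaching"

def state_changes(observations, idle_after_s=30.0):
    """Return [(epoch_s, state)] at each state change for one token.

    observations: time-ordered (epoch_s, zone) pairs for that token.
    """
    changes = []
    state = None
    zone_prev, zone_entered = object(), None  # sentinel forces first entry
    for t, zone in observations:
        if zone != zone_prev:
            zone_prev, zone_entered = zone, t
        new_state = infer_state(state, zone, t - zone_entered, idle_after_s)
        if new_state != state:
            changes.append((t, new_state))
            state = new_state
    return changes
```

Each returned pair corresponds to one recorded state change, from which a service episode can be reconstructed.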
Cross-stream integration relies on controlled identifier governance across the OCC detected_id token, the wearable deviceId, and the internal staffId. The mapping between these identifiers is stored in a secure, hotel-controlled configuration service that is not exposed through the DT API, and identity resolution is separated from DT analytics through role-based access control and audit logging. Identity resolution is performed only for staff and only when operationally required; guest identities are never resolved within the DT, so guest-related observations remain pseudonymous.
The DT is designed to degrade gracefully under missing or conflicting data. Confidence-aware filtering, token expiry, and zone-level aggregation reduce the impact of transient OCC dropouts, while workload estimation uses sliding-window aggregation and can produce an explicit Unknown workload state when wearable windows are missing. When signals disagree (e.g., low queue occupancy but high workload), recommendations are conditioned on multi-signal persistence rather than on a single snapshot, and the system exposes data-quality indicators so that managers can interpret KPIs conservatively.
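The explicit Unknown state can be sketched as follows. This is a deliberately reduced example that uses only heart rate with a per-person z-score; the function name, the minimum sample count, and the thresholds are hypothetical, and the actual workload index described in this article combines more features.

```python
from statistics import mean

def workload_state(window_hr, baseline_hr, baseline_sd,
                   min_samples=3, medium_at=0.5, high_at=1.5):
    """Classify one sliding window of heart-rate samples for one staff member.

    window_hr may contain None for missing samples. Returns (state, index),
    where index is None when the state is Unknown.
    """
    samples = [v for v in window_hr if v is not None]
    if len(samples) < min_samples or baseline_sd <= 0:
        return "Unknown", None  # too little data to infer workload
    index = (mean(samples) - baseline_hr) / baseline_sd  # per-person z-score
    if index >= high_at:
        return "High", index
    if index >= medium_at:
        return "Medium", index
    return "Low", index
```

Returning Unknown instead of reusing the last known state lets downstream rules skip workload conditions rather than act on stale values.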
The REST API provides the stable integration surface for external consumers. Ingestion endpoints such as /occ/events and /wearable/metrics support both live devices and replay tools for recorded or synthetic datasets. State endpoints such as /frontdesk/state provide real-time snapshots, and KPI endpoints such as /frontdesk/kpis run aggregation queries over fused data with pagination and filtering. This property-based interface supports incremental evolution of the deployment, enabling additional cameras or refined workload models to be introduced without breaking compatibility with the existing authorization and front-end services, provided endpoint semantics remain stable.
5. Evaluation and Scenarios
The evaluation examines how accurately the DT reconstructs the physical and physiological states of the front desk from OCC and wearable data, and how useful the resulting KPIs are for operational decision support. The assessment combines offline evaluation on logged traces for which reference annotations are available with an online-style replay of realistic operating situations designed to stress the system under varying demand and workload conditions.
In addition to logged traces, the evaluation uses realistically synthesized traces to enable controlled stress testing under varied demand and workload conditions. Synthetic traces are generated by sampling guest arrivals from non-homogeneous Poisson processes with configurable peaks, sampling service times from parametric distributions calibrated to front-desk benchmarks, and simulating staff schedules (shift start/end, breaks, part-time availability) with role constraints. OCC event streams are then produced by replaying the resulting trajectories through the same ingestion schema (timestamps, camera IDs, positions, ROI, confidence), including configurable occlusion/dropout and confidence degradation to emulate real capture conditions. Wearable traces are generated by combining baseline per-person physiological profiles with workload-driven perturbations (e.g., HR elevation and HRV reduction under sustained load) and missing-data patterns (battery/off-wrist). Where ground-truthed field traces are available, the same metrics are computed on those segments and reported separately.
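Sampling arrivals from a non-homogeneous Poisson process can be done with the standard thinning method. The sketch below is illustrative rather than the scenario generator used in the evaluation; the piecewise-constant rate function and its parameter values are assumptions.

```python
import random

def sample_arrivals(rate_fn, rate_max, horizon_s, rng):
    """Thinning: arrival times in [0, horizon_s) for intensity rate_fn(t).

    rate_fn(t) is in arrivals per second and must not exceed rate_max.
    """
    arrivals = []
    t = 0.0
    while True:
        t += rng.expovariate(rate_max)  # candidate from a homogeneous process
        if t >= horizon_s:
            return arrivals
        if rng.random() < rate_fn(t) / rate_max:  # accept with prob. rate/rate_max
            arrivals.append(t)

def peaked_rate(t, base=1 / 300, peak=1 / 30,
                peak_start=3600.0, peak_end=5400.0):
    """Hypothetical check-in peak: one arrival per 30 s inside the peak window."""
    return peak if peak_start <= t < peak_end else base

# Usage: a seeded generator makes the scenario repeatable.
arrivals = sample_arrivals(peaked_rate, 1 / 30, 7200.0, random.Random(7))
```

Seeding the generator is what makes a stress scenario replayable across ablation runs.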
Table 3 summarizes the key runtime and evaluation parameters used in the prototype and experimental setup. These parameters define the operational configuration of the OCC-based positioning pipeline, the temporal aggregation of wearable-derived workload indicators, and the decision thresholds applied for workload classification, queue alerts, and short-term prediction. Unless otherwise stated, default values were selected to balance responsiveness and robustness, while remaining fully configurable to support sensitivity analysis and deployment in different front-desk contexts.
Real traces are used when available (e.g., controlled positioning runs and pilot logs), while scenario traces are generated to enable repeatable stress testing of the DT logic and ablation comparisons. Synthetic traces are produced by a scenario generator that samples guest arrivals from peaked or piecewise-stationary processes, samples service times from configurable distributions, simulates zone transitions and dwell times consistent with the reception layout, and injects missingness and noise to emulate sensing imperfections such as intermittent OCC detections and dropped wearable windows. The parameters of these generators are aligned with the schemas and contracts exposed by the deployed ingestion API so that replayed traces exercise the same runtime pipeline as live ingestion.
All quantitative metrics are computed on time-synchronized traces where OCC data, wearable data, and reference labels are available. Spatial accuracy is assessed through controlled experiments in which staff or test subjects follow predefined paths and stop at calibrated markers in the reception area. Let $\mathbf{p}_i$ denote the reference 3D position of sample $i$ in meters, and let $\hat{\mathbf{p}}_i$ denote the corresponding position estimated by the OCC-based pipeline after calibration and normalization. The Euclidean localization error is:
$$e_i = \lVert \hat{\mathbf{p}}_i - \mathbf{p}_i \rVert_2$$
Localization performance is summarized using mean error (ME), median error, and the 95th percentile. Zone classification performance is evaluated by comparing the predicted zone label $\hat{z}_i$ assigned by the DT to each labeled event with a reference zone label $z_i$ obtained from manual annotation or derived from reference positions. Overall accuracy is computed as follows:
$$\mathrm{Acc}_{\mathrm{zone}} = \frac{1}{N} \sum_{i=1}^{N} \mathbb{1}(\hat{z}_i = z_i)$$
where $N$ is the number of labeled events and $\mathbb{1}(\cdot)$ is the indicator function that equals 1 when the condition holds and 0 otherwise. Macro-averaged precision, recall, and F1-score across zones are also reported, together with confusion matrices.
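The localization and zone-classification metrics can be computed with a few lines of standard-library Python. The sketch below assumes a nearest-rank definition of the 95th percentile and macro-averaging over the union of predicted and reference labels; neither convention is specified in the text, and the function names are illustrative.

```python
import math
from statistics import mean

def localization_errors(estimated, reference):
    """Euclidean distance between each estimated and reference 3D position."""
    return [math.dist(p_hat, p) for p_hat, p in zip(estimated, reference)]

def percentile(values, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]

def accuracy(predicted, reference):
    """Fraction of events whose predicted label equals the reference label."""
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)

def macro_f1(predicted, reference):
    """Unweighted mean of per-label F1 over all labels that occur."""
    scores = []
    for label in sorted(set(predicted) | set(reference)):
        pairs = list(zip(predicted, reference))
        tp = sum(p == label and r == label for p, r in pairs)
        fp = sum(p == label and r != label for p, r in pairs)
        fn = sum(p != label and r == label for p, r in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return mean(scores)
```

Mean and median error follow directly from statistics.mean and statistics.median applied to the output of localization_errors.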
Queue-related KPIs are derived from the number of tokens assigned to queue zones and from the duration of the inferred WaitingInQueue behavioral state. The DT's estimated queue length $\hat{Q}(t)$ is compared with a reference queue length $Q(t)$ when reference information is available. The root-mean-square error (RMSE) is computed over a set of evaluation timestamps $T$, with $|T|$ denoting the cardinality of $T$:
$$\mathrm{RMSE}_Q = \sqrt{\frac{1}{|T|} \sum_{t \in T} \left(\hat{Q}(t) - Q(t)\right)^2}$$
Here, $T$ is the set of discrete evaluation instants obtained after time synchronization.
In addition, the mean absolute error (MAE) of waiting-time proxies is computed. The waiting-time proxy is defined as the elapsed time between first entering a queue zone and first entering a service zone; when transaction-system timestamps are available, those timestamps are used as reference values.
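A minimal sketch of the queue-length RMSE and the waiting-time proxy follows. It assumes the proxy uses the first service-zone entry that occurs at or after the first queue-zone entry; the zone names and the data layout are illustrative.

```python
import math

def queue_rmse(estimated, reference):
    """RMSE over instants present in both dicts (instant -> queue length)."""
    instants = sorted(set(estimated) & set(reference))
    squared = sum((estimated[t] - reference[t]) ** 2 for t in instants)
    return math.sqrt(squared / len(instants))

def waiting_time_proxy(zone_trace, queue_zone="QueueZone",
                       service_zone="ServiceZone"):
    """Seconds between first queue entry and the following service entry.

    zone_trace: time-ordered (epoch_s, zone) pairs for one token.
    Returns None when the token never queued or was never served.
    """
    t_queue = next((t for t, z in zone_trace if z == queue_zone), None)
    if t_queue is None:
        return None
    t_service = next(
        (t for t, z in zone_trace if z == service_zone and t >= t_queue), None
    )
    return None if t_service is None else t_service - t_queue

def mae(proxies, references):
    """Mean absolute error over pairs where both values are available."""
    pairs = [(p, r) for p, r in zip(proxies, references)
             if p is not None and r is not None]
    return sum(abs(p - r) for p, r in pairs) / len(pairs)
```

Tokens that leave the queue without being served yield None and are excluded from the MAE rather than counted as zero waiting time.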
Wearable-derived staff workload is evaluated against reference labels obtained from self-reports or expert annotation. For each time window $w$ and staff member, the DT produces a predicted workload state $\hat{s}_w$, which is compared with the corresponding reference label $s_w$. Categorical workload accuracy is computed as follows:
$$\mathrm{Acc}_{\mathrm{WL}} = \frac{1}{|W|} \sum_{w \in W} \mathbb{1}(\hat{s}_w = s_w)$$
where $W$ is the set of evaluated windows pooled over staff members. The macro-averaged F1-score across workload levels is reported together with confusion matrices. When continuous self-reported workload scores are available, the continuous WorkloadIndex is additionally evaluated using correlation and RMSE against those scores.
Three scenario families are defined to assess DT behavior in operating conditions and the utility of DT-derived KPIs for decision support. The arrival-peak scenario models check-in peaks characterized by bursts of guest arrivals, queue growth, and increased workload for receptionists. The human-centered staff management scenario evaluates staff allocation under variable demand by comparing a baseline strategy with static assignments to a DT-informed strategy that triggers temporary reassignment when workload exceeds thresholds while queue conditions remain unfavorable. The near-term queue prediction scenario evaluates short-horizon forecasting of queue conditions from recent DT state by comparing a baseline moving-average predictor with a DT-based predictor that incorporates trajectory and workload features. Performance is measured with RMSE and MAE between predicted and reference queue lengths, and calibration plots can be used to assess probabilities of exceeding operational thresholds.
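The baseline moving-average predictor in the queue prediction scenario can be sketched as a rolling evaluation. The window length, the horizon, and the function names are illustrative assumptions rather than the configured values.

```python
import math

def moving_average_forecast(history, window=5):
    """Flat forecast: mean of the last `window` queue-length samples."""
    recent = history[-window:]
    return sum(recent) / len(recent)

def rolling_evaluation(series, window=5, horizon=1):
    """Return (RMSE, MAE) of the baseline over an evenly sampled series.

    At each step i, the forecast uses series[:i] and is compared with the
    value observed `horizon` samples later.
    """
    errors = []
    for i in range(window, len(series) - horizon + 1):
        predicted = moving_average_forecast(series[:i], window)
        errors.append(predicted - series[i + horizon - 1])
    if not errors:
        raise ValueError("series too short for the given window and horizon")
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    return rmse, mae
```

Any alternative predictor can be evaluated against this baseline by substituting it for moving_average_forecast while keeping the same loop.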
6. Results
This section summarizes the results obtained with the proposed front-desk DT. The results combine quantitative comparisons between a DT-based configuration and baseline strategies across the scenarios defined in Section 5, together with qualitative examples of interpretable recommendations produced from fused OCC positioning and wearable-derived workload information. All experiments used the same ingestion, semantic mapping, and reasoning pipeline as the front-desk prototype. Unless otherwise stated, the numerical results reported below correspond to a 1-week realistically synthesized trace generated to match the deployed schemas and API contracts; the prototype also supports replay of logged traces, and field validation with ground-truthed operational data is planned as future work.
Across scenarios, a baseline configuration (no semantic fusion, static staff assignment, and monitoring based on coarse operational logs) is compared with a DT-based configuration that fuses OCC, wearable, and front-desk process data through ontology-backed mapping and reasoning. Queue-related KPIs follow the definitions in Section 5. The DT estimates queue length $\hat{Q}(t)$ from the number of active tokens classified in QueueZone and from the membership and duration of the inferred WaitingInQueue behavioral state. Waiting time is evaluated using a waiting-time proxy defined as the time a token remains in a queue zone before first entering a service zone. Workload KPIs follow the wearable pipeline described in Section 3 and Section 4.
Figure 4 shows a dashboard view produced by the DT REST API for monitoring front-desk operations. The Occupancy panel reports the number of active tokens and their distribution across semantic zones. The Trajectory panel visualizes queue dynamics as a time series of estimated queue length derived from queue-zone occupancy and/or WaitingInQueue membership. The Workload panel summarizes the distribution of staff workload states inferred from wearables using sliding-window aggregation. The Recommendations panel lists active, human-readable suggestions generated by rule- or policy-based logic over ontology-level facts, together with the triggering conditions. The current dashboard is an early prototype; the figure is presented as a representative snapshot of implemented widgets and API outputs, and future iterations will include deployment screenshots from a live pilot environment.
Table 4 reports representative comparative results for queue-related KPIs in the arrival-peak scenario, the workload-based reassignment scenario, and the queue prediction scenario. On the one-week synthetic trace, the DT-based configuration yields substantially lower queue-length RMSE and lower waiting-time proxy error than the baseline, with the largest differences occurring under high-variance conditions such as arrival peaks. In the queue prediction scenario, the improvement is more modest, reflecting that both predictors are constrained by forecast horizon and intrinsic variability.
The workload-related results are summarized in Table 5. Workload estimation is evaluated against reference labels using categorical accuracy and macro-averaged F1 over Low/Medium/High. Operational summaries include the fraction of time in High workload and a workload-balance measure computed as the standard deviation of per-staff average workload over the evaluation horizon. Each per-staff average is time-normalized, that is, computed only over the windows in which that staff member is scheduled or observed, so that staff with different active durations (e.g., part-time shifts) remain comparable.
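The workload-balance measure can be sketched as follows. The text does not state whether a population or sample standard deviation is used; the population form is assumed here, and the data layout is illustrative.

```python
from statistics import mean, pstdev

def workload_balance(index_by_staff):
    """Standard deviation of time-normalized per-staff average workload.

    index_by_staff maps a staff token to its per-window workload index
    values, with None for windows in which that person was not scheduled
    or observed. Lower values indicate a more even distribution.
    """
    averages = []
    for values in index_by_staff.values():
        observed = [v for v in values if v is not None]
        if observed:  # average only over that person's own active windows
            averages.append(mean(observed))
    return pstdev(averages) if len(averages) >= 2 else 0.0
```

Because each average is taken over a person's own active windows, a part-time shift does not appear as low workload simply because it is short.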
Beyond numerical KPIs, the DT generates human-interpretable recommendations intended for front-desk managers. Recommendations are produced by declarative rules and policies over ontology-level representations of occupancy, zone transitions, dwell segments, inferred queue state, and staff workload. Queue-management recommendations can be triggered when the queue load remains above a threshold for longer than a configured duration while at least one staff member is available and in a low-load state. Workload-balancing recommendations can be triggered when sustained High workload for one staff member co-occurs with predominantly Low/Medium workload for others, suggesting temporary rotation to reduce sustained overload.
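The queue-management rule can be expressed as a conjunctive check with persistence. The sketch below is illustrative: the threshold, the duration, the input layout, and the structure of the returned recommendation are assumptions, not the prototype's rule syntax.

```python
def queue_recommendation(samples, queue_threshold=4, min_duration_s=120.0):
    """Return a recommendation dict with its trigger conditions, or None.

    samples: time-ordered (epoch_s, queue_length, staff) tuples, where
    staff maps a staff token to (available, workload_state).
    """
    above_since = None
    for t, queue_len, _ in samples:
        if queue_len > queue_threshold:
            if above_since is None:
                above_since = t  # start of the current above-threshold run
        else:
            above_since = None  # any dip resets the persistence timer
    if above_since is None:
        return None
    t_last, queue_len, staff = samples[-1]
    sustained_s = t_last - above_since
    if sustained_s <= min_duration_s:
        return None  # pressure has not lasted longer than the configured duration
    candidates = sorted(
        token for token, (available, state) in staff.items()
        if available and state == "Low"
    )
    if not candidates:
        return None  # nobody is both available and in a low-load state
    return {
        "type": "queue_management",
        "candidates": candidates,
        "trigger": {"queue_length": queue_len, "sustained_s": sustained_s,
                    "queue_threshold": queue_threshold},
    }
```

Returning the trigger values with the suggestion is what allows the conditions behind each recommendation to be shown to the manager.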
Overall, the results indicate that the ontology-driven DT can reconstruct and visualize front-desk state in near real time and can generate actionable recommendations that combine spatial behavior, queue dynamics, and staff workload within a unified semantic framework.
7. Discussion
This section discusses the implications of the proposed front-desk DT for hospitality operations, its practical usability under realistic sensing constraints, and its relationship to prior work on digital twins, ontologies, indoor positioning, and wearable-based workload estimation.
Digital transformation in hospitality has traditionally emphasized revenue management, online distribution, and guest-facing services. DTs have recently been explored as an additional enabler for sustainability, experience design, and data-driven decision-making in hotels and tourism environments. Many hospitality DT efforts, however, concentrate on building performance, marketing-oriented visualization, or high-level conceptual models. Day-to-day management of front-desk human resources has received comparatively less attention despite its direct impact on waiting time, perceived service quality, and staff well-being. The DT presented in this work addresses this operational gap by integrating OCC-based 3D positioning with wearable-derived workload indicators within an ontology-driven representation of reception processes.
Practical usefulness depends on whether the DT remains informative when sensing is imperfect, signals are noisy, or visibility conditions change. OCC-based positioning can be affected by occlusions, reflections, and lighting variability. The DT mitigates these effects through confidence-aware processing, zone-level aggregation, and trajectory/dwell abstractions that emphasize semantically meaningful transitions over fine-grained coordinate fluctuations. Wearable-derived workload states are subject to motion artifacts and inter-individual variability; the DT therefore treats workload states as operational indicators rather than clinical measurements and exposes uncertainty through missing-window handling and conservative rule triggers.
The property-based REST API and modular architecture decouple the DT core from specific sensing vendors and allow for evolution of the sensing layer without breaking downstream integrations. This modularity is complemented by explicit governance around pseudonymous identifiers and by designs that allow aggregated KPIs to be consumed when fine-grained data are not necessary.
Relative to prior DT work in hospitality and tourism, the main distinguishing feature is the explicit focus on front-desk operations, human workload, and multimodal fusion of spatial and physiological sensing. The semantic mapping between source fields and ontology properties improves reproducibility and facilitates adaptation to other deployments. Interoperability can be strengthened further by aligning the DT’s sensing concepts with standard sensor ontologies such as SOSA/SSN; the proposed PositionEvent and physiological metrics can be viewed as specializations of observation patterns while preserving the domain-specific front-desk semantics required for staffing and queue reasoning.
The results should be interpreted in light of limitations related to sensing variability and to the current scale of evaluation. Synthetic scenarios enable repeatable stress testing and ablation studies, but they do not replace long-running field pilots that capture real lighting variability, crowding, and operational diversity. Future work therefore includes broader in-hotel validation, uncertainty-aware multi-camera fusion, richer geometric representations for complex lobbies, and co-designed governance processes that ensure workload analytics are used to support staff rather than to enable punitive surveillance.
8. Privacy, Ethics, and Compliance Considerations
The proposed front-desk DT processes data streams that can be sensitive in a workplace context, including indoor positioning traces and physiological measurements derived from staff wearables. The design treats privacy, ethics, and compliance as part of the system requirements. The guiding objective is to enable operational insight and staff support while avoiding unnecessary identifiability, reducing the incentives and technical affordances for surveillance, and ensuring that any real deployment is governed by clear organizational safeguards.
For staff wearables, the intended deployment assumes an explicit informed-consent process that describes what is collected (variables and sampling), for which purposes (operational monitoring and well-being support), who can access which views (role-based access), and how long data are retained. Participation must be voluntary, with an opt-out path that does not entail negative consequences. The link between pseudonymous identifiers and real identities is controlled outside the DT by the hotel under a least-privilege model (e.g., an HR data controller), and the DT API never exposes identity-resolution endpoints. Any use of individual-level information for performance evaluation is explicitly out of scope; the DT is designed to support operational decision-making and staff well-being through aggregated indicators and transparent, inspectable recommendation triggers.
Pseudonymization is the default. OCC and wearable streams are handled through opaque identifiers that have no civil meaning within the DT. In OCC events, detected_id is treated as a sensor-facing token used only to link observations across time. Wearable metrics are ingested using a pseudonymous staffId token for the same purpose. The DT is intentionally structured so that most operational indicators can be computed without resolving real identities. Any linkage between pseudonyms and civil identities is kept outside the DT in hotel-controlled systems under strict access control and auditability.
The DT is designed to reason at the level that is necessary for front-desk decisions. Many KPIs are defined at zone or team level, such as queue length, occupancy per zone, dwell-time distributions, and the number of staff currently in a high workload state. Movement is typically summarized through zone transitions and dwell segments rather than continuous trajectories. For guests, the DT does not require mapping between detected_id and personal identity; guest-flow is represented through anonymous tokens and zone-level aggregates. For staff, individual-level workload indicators can be restricted to authorized roles and replaced by anonymized summaries when group-level analytics are sufficient.
Workplace sensing raises ethical risks beyond data leakage, including power imbalances and function creep. A responsible deployment requires explicit informed-consent procedures and organizational safeguards. Staff should be informed in clear language about what is collected, at what granularity, for what operational purposes, and with what retention period. Consent should be revisitable if purposes change, and opt-out mechanisms should be available without retaliation. The DT is positioned as a supportive operational tool that helps detect overload and allocate assistance rather than as an instrument for punitive evaluation, and any real deployment should be co-designed with stakeholder representation and documented acceptable-use policies.
Governance mechanisms complement technical safeguards. Retention policies should distinguish between raw events and aggregated indicators: raw OCC detections and raw wearable measurements can be kept only for the minimum time needed for operational monitoring and short-term calibration, after which they can be deleted or transformed into aggregated summaries such as per-shift KPIs. Access control is enforced through the REST API, which separates ingestion privileges from analytics and reporting privileges. Endpoints returning more sensitive information can be restricted to specific roles and logged with an auditable trail, while broader consumers can be limited to aggregated KPIs. Recommendations are presented together with their triggering conditions and supporting KPIs, and managers remain responsible for deciding whether and how to act, considering context not visible to the DT.
In summary, the DT combines pseudonymization, minimization of fine-grained exposure, zone-level operational reasoning, and governance requirements to reduce surveillance risk while preserving the operational value of multimodal sensing.
9. Limitations
Although the proposed front-desk DT provides a concrete example of integrating OCC-based positioning with wearable-derived workload indicators in a hospitality setting, it should be interpreted as a research prototype and reference implementation rather than a turnkey commercial solution. Several limitations affect sensing robustness, data completeness, model generalizability, and organizational applicability, and these limitations delineate the scope of the claims supported by the current evaluation.
The robustness of the OCC component depends on reception layout, camera placement, and environmental conditions. Reception areas may contain occlusions caused by guests, luggage, furniture, and temporary structures, as well as reflective surfaces and lighting variability that can degrade detection quality. These factors can create blind spots and reduce confidence scores, affecting zone assignment and downstream KPIs. While confidence-aware filtering and zone-level aggregation mitigate transient noise, environments with persistent occlusion or rapidly changing lighting may require denser camera coverage, stronger calibration, or complementary positioning modalities. The current zone model is well-suited to planar or near-planar reception areas, but more complex lobbies (multi-level mezzanines, stairs/ramps, irregular geometry) require richer spatial representations (e.g., multi-surface zoning, connectivity graphs between levels, and vertical transitions). The ontology and processing pipeline can accommodate this by extending zone geometry and by adding explicit transition constructs (stairs/ramp segments), but this is not yet validated in a multi-level deployment.
In deployments where OCC relies on detectable tokens, practicality depends on the scope of tagging. The proposed approach assumes tokens for staff and optional pilot participants; it does not require all guests to carry emitters, and guest-related KPIs can be derived from anonymous zone occupancy or transaction logs when tokenization is not feasible.
Wearable signals are affected by motion artifacts, skin contact, and synchronization issues, and data can be missing because of connectivity interruptions, depleted batteries, or devices being removed. Sliding-window aggregation and per-person normalization reduce sensitivity to transient noise, but prolonged gaps can degrade workload inference and reduce the reliability of recommendations. Adoption constraints are therefore non-trivial, and some staff may be reluctant to participate in continuous sensing. The present work assumes a minimum level of compliance for the prototype scenarios and does not explicitly model adversarial or strategic behavior such as altering device usage to influence inferred workload.
The behavioral and workload models are intentionally simplified to support interpretability and operational use. The behavioral state inference captures a small set of reception-relevant states and does not fully represent the diversity of front-desk work. The workload index aggregates a limited set of physiological and activity features and does not incorporate longer-term context such as sleep quality, cumulative fatigue, or psychosocial factors. The rule-based recommendation logic preserves transparency but may not be optimal under all regimes.
The evaluation uses a combination of controlled traces and scenario traces, with a substantial portion of the reported comparative results derived from a one-week synthetic trace designed to stress the DT under repeatable conditions. Synthetic scenarios enable systematic exploration of rare peak-demand patterns, but they do not substitute for a long-running field deployment in a live hotel environment. Claims about robustness of OCC under complex lighting and crowding therefore require validation through pilot studies with ground-truthed field data, including guest-flow logs and verified workload labels.
Scaling to larger properties increases throughput, storage, and reasoning demands and may require stream processing and incremental aggregation. Finally, the operational value of DT outputs depends on organizational readiness and governance: workload indicators must be used to support staff rather than to enable punitive monitoring, and recommendations require training and managerial buy-in to translate into sustained improvements.
These limitations motivate future work on broader field validation, stronger handling of imperfect sensing and degraded modes, richer modeling under human-centered constraints, and co-designed deployment practices that integrate technical measures with organizational safeguards.
10. Conclusions and Future Work
This article presented a human-centered Digital Twin (DT) for hotel front-desk services that combines OCC-based 3D positioning, wearable-derived workload indicators, and an ontology-driven semantic model. Building on prior work on front-desk ontologies and hospitality DTs, the proposed system focuses on queue dynamics, staff workload, and near-real-time decision support in the reception area. The goal is to move beyond static staffing rules and retrospective reporting by providing an integrated operational view that can be acted upon while service conditions evolve.
This work contributes an ontology extension that represents spatio-temporal observations and workload concepts within a unified model. Positioning events, zones, trajectories, and zone dwell segments provide a structured representation of movement and occupancy in the reception space, while physiological metrics, workload indices, and workload states capture interpretable indicators of staff load derived from wearable signals. This semantic foundation enables queries and rules that connect where staff are, what service context they are in, and how workload evolves over time.
The DT specifies an OCC-based spatial processing pipeline that includes calibration, canonical-frame alignment, normalization of 3D coordinates, and zone assignment under confidence-aware handling. In parallel, a wearable workload pipeline aggregates windowed physiological and activity signals, applies per-person normalization, and maps features to a continuous workload index and categorical workload states. A modular implementation and a property-based REST API ingest events, map them to ontology instances, and expose DT state and KPIs through stable, technology-agnostic endpoints, enabling incremental evolution of the sensing infrastructure without breaking downstream integrations.
Future work includes tighter integration with core hotel information systems such as PMS and CRM platforms, which can provide expected arrivals and transaction timestamps to validate waiting-time proxies and enrich operational context. The spatial component can be strengthened through uncertainty-aware multi-camera fusion and richer geometric representations for complex lobbies, including multi-level layouts and temporary structures. Behavioral modeling and recommendation logic can be extended with data-driven transition models and optimization-based policies under strict human-in-the-loop control and fairness constraints. Continued attention to privacy, ethics, and organizational governance remains essential to ensure that these technologies are adopted in ways that are operationally effective and socially responsible.