1. Introduction
Urbanization in the 21st century has moved beyond simple infrastructure digitization toward the “Cognitive City” concept. In this era, the convergence of Internet of Things (IoT) sensors, Digital Twins, and physical systems aims to enhance urban efficiency and livability [1,2,3]. The notion of a smart city is commonly defined as an urban system where investments in human capital, digital infrastructure, and information and communication technologies are integrated to improve economic performance, sustainability, and quality of life [4]. Importantly, this perspective emphasizes the interaction between technological infrastructure and human activity rather than the mere presence of sensors or data collection systems. Within this framework, university campuses serve as effective microcosms of the broader smart city because they possess dense sensing networks and high-performance connectivity layers [5,6]. In this context, the concept of a smart campus has emerged as a localized implementation of smart city principles, where networked infrastructure, IoT systems, and data-driven services support intelligent management of mobility, energy, learning environments, and user experience [7]. These environments support continuous foot traffic and digital services, making them ideal testbeds for analyzing mobility in connected spaces. Consequently, pedestrian mobility here evolves beyond simple origin-destination traversal. It becomes a “phygital” (physical-digital) experience where physical movement is fundamentally coupled with digital connectivity [8,9].
Conventional pedestrian routing algorithms have historically prioritized efficiency. Foundational models, notably Dijkstra’s algorithm [10] and A* search [11], predicate their logic on the assumption that pedestrians function as rational agents optimizing for minimal distance or time. Although computationally robust, these deterministic models often overlook the stochastic and multi-objective nature of human decision-making in complex urban settings [12,13]. Literature in urban design suggests that pedestrians frequently behave as “sensual beings” who engage in trade-offs between path length and environmental factors such as comfort, safety, and aesthetics [14,15]. This behavior is particularly evident in tropical climates, where thermal comfort and shade availability become critical determinants of route choice and often outweigh the utility of the shortest path [16,17].
However, the integration of a “Digital Layer” into pedestrian navigation frameworks remains underexplored. As wireless connectivity evolves into a fundamental urban utility, WiFi hotspots and 5G nodes function as “digital anchors” that shape dwell times and movement trajectories [18,19]. Nevertheless, most routing systems continue to treat connectivity merely as a binary attribute or subordinate it entirely to physical constraints. Research explicitly quantifying the behavioral trade-off between digital utility and physical exertion factors, such as increased walking distance or elevation gain, remains limited [20]. Moreover, capturing these preferences necessitates analyzing the realized connectivity experience along a route rather than relying solely on theoretical coverage maps.
To address this gap, this study presents a data-driven framework designed to infer latent pedestrian mobility preferences using a large-scale tropical university campus as a case study. Rather than equating infrastructure density with urban “smartness,” this study focuses on the behavioral interaction between digital connectivity infrastructure and pedestrian mobility decisions within a digitally instrumented environment. We treat the campus as a representative outdoor environment with dense WiFi infrastructure comparable to emerging smart city districts. Diverging from forward-planning models, we adopt an approach inspired by Inverse Reinforcement Learning (IRL) [21,22]. We utilize Bayesian Optimization [23,24] to approximate the cost function that best explains observed pedestrian trajectories. By constructing a multi-factor semantic graph that integrates OpenStreetMap (OSM) for topology, SRTM for elevation, and retrospective roaming logs for realized connectivity, we empirically measure “detour tolerance.” We define this metric as the additional physical effort pedestrians are willing to expend to maintain stable digital connectivity.
3. Materials and Methods
This study establishes the P-WARP (Personalized WiFi-Aware Route Profiling) framework, a comprehensive computational engine designed to reconstruct and analyze pedestrian navigation behaviors within complex outdoor environments. Unlike conventional routing systems that rely on a single cost metric, P-WARP is architected as a flexible multi-factor inference framework capable of simulating diverse navigation strategies.
To rigorously quantify the influence of digital connectivity, the framework is configured to evaluate two distinct modeling strategies: (1) a Global Static Model serving as an infrastructure-centric baseline, and (2) a Trip-Centric Dynamic Model, the proposed approach, which relies on realized roaming behavior. By optimizing both models within a unified framework, P-WARP captures the behavioral dynamics of digital comfort and tests whether a preference for stable, realized connectivity explains pedestrian movement better than physical distance alone.
The methodology is structured into a four-stage pipeline to ensure a robust and adaptive inference process. The logical sequence of these stages is illustrated in the methodology flowchart (Figure 1), while the overall system design contrasting the Global Static Model and the Trip-Centric Dynamic Model is depicted in the architectural framework (Figure 2). The framework integrates heterogeneous data sources, including OSM, Digital Elevation Models (DEM), and WiFi roaming logs, and its four stages are summarized below:
1. Data Acquisition and Heterogeneous Fusion: Physical geospatial data are consolidated with behavioral mobility logs, and a ground-truth dataset is established for evaluating mobility inference performance.
2. Multi-layered Semantic Graph Construction: A navigable network graph is constructed in which edges represent physical pathways enriched with environmental, topographic, and digital attributes.
3. Connectivity Modeling (Dual-Model Strategy): Two contrasting impedance models are evaluated to isolate the effect of WiFi connectivity: a Global Static Model based on collective infrastructure and a Trip-Centric Dynamic Model based on realized roaming behavior.
4. Parameter Optimization via Bayesian Learning: The routing cost function is calibrated using Bayesian Optimization with 5-fold cross-validation to minimize the discrepancy between reconstructed paths and ground-truth trajectories.
3.1. Study Area
The empirical analysis in this study was conducted on the main campus of Chiang Mai University (CMU), Thailand. The campus represents a dense, digitally connected academic environment that combines extensive pedestrian infrastructure with large-scale wireless network deployment. Covering approximately 14.25 square kilometers, the campus contains academic buildings, residential areas, administrative facilities, and open public spaces connected through a complex network of pedestrian pathways and road segments.
Figure 3 illustrates the spatial structure of the study area. The map presents the outdoor walkable area, building footprints, pedestrian pathways, observed walking trajectories, and the distribution of WiFi access points across the campus. Indoor building areas are treated as structural obstacles within the routing model, while outdoor spaces constitute the walkable environment in which pedestrian movement occurs. Major campus landmarks, including academic buildings, the central food center, and main campus access points, are labeled to provide spatial reference.
The pedestrian network used in this study was derived from OSM data and further refined to represent walkable pathways within the campus environment. This network forms the base topological layer for the routing framework. Observed pedestrian trajectories collected from campus users were spatially aligned with this network to enable trajectory reconstruction experiments. In addition, the locations of WiFi access points were integrated into the spatial model to represent the digital connectivity layer.
The campus therefore provides a suitable micro-scale testbed for examining how digital connectivity factors may shape pedestrian navigation behavior in smart urban environments.
3.2. Data Acquisition and Heterogeneous Fusion
To build the P-WARP framework, we combined two types of information: physical infrastructure data and user movement logs. All datasets were carefully processed and spatially aligned to the geographic extent of the CMU campus. The data acquisition process comprises the following components.
3.2.1. Ground-Truth Mobility Dataset (GPS and WiFi Association Logs)
To evaluate the developed mobility inference framework, a set of ground-truth walking trips was collected using a custom mobile application developed by our team. The study involved 21 volunteers, all of whom were first-year undergraduate students at Chiang Mai University. Recruitment was conducted on a voluntary basis during the first semester of the 2024 academic year. Prior to participation, volunteers were informed about the purpose of the study and the types of data that would be collected.
The application records GPS trajectories and simultaneously logs WiFi association events, including connected Access Point (AP) identifiers (SSID/BSSID) and timestamps. The application was implemented using the Flutter framework and utilized the platform’s GPS library to capture location updates from the participant’s smartphone. Instead of a fixed time-based sampling interval, the logging mechanism was event-driven, meaning that a new data record was created only when the device detected a meaningful change in geographic position. Consequently, the sampling frequency varied dynamically depending on the participant’s movement patterns. A separate lookup table mapping AP identifiers to their geolocations was available for analysis.
To preserve participant autonomy and privacy, data recording was fully user-triggered. Volunteers manually started and stopped recording sessions when logging a walking trip. The collected data were initially stored locally on the participant’s device and were later shared manually with the research team by the volunteers themselves. In addition, an informed consent agreement was presented within the application prior to any data collection. Participants were informed about the nature of the study and their right to withdraw from participation at any time without consequences. Ethical approval for this study was granted by the Chiang Mai University Research Ethics Committee (CMUREC No. 66/039).
During data collection, participants explicitly marked trip boundaries within the application by specifying the start (origin) and stop (destination) of each walking trip. As a result, each trip record contains a time-ordered GPS trajectory together with the sequence of connected APs observed during movement. Across all participants, a total of 71 verified walking trips were recorded and aggregated to form the ground-truth mobility dataset used in this study.
The geometric length of each walking trajectory was computed after projection to UTM Zone 47N (EPSG:32647) as the cumulative Euclidean distance between consecutive GPS samples. Across the 71 verified trips, the mean trip length is 715.72 m with a standard deviation of 377.25 m, indicating moderate variability in trip scale.
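The length computation described above can be sketched as follows. This is a minimal illustration that assumes the GPS samples have already been projected to metric coordinates (for example UTM Zone 47N); the function name and the sample coordinates are hypothetical and not taken from the study.

```python
import math

def trajectory_length(points):
    """Sum of straight-line distances between consecutive projected points (metres)."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )

# Hypothetical projected coordinates (easting, northing) in metres.
trip = [(0.0, 0.0), (30.0, 40.0), (30.0, 100.0)]
print(trajectory_length(trip))  # 50 m + 60 m = 110.0
```

Because the sampling is event-driven, the number of points per trip varies, but the cumulative sum is unaffected by irregular spacing.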
This dataset reflects a realistic operational scenario in which campus or area network administrators possess WiFi association logs from which pedestrian mobility can be inferred.
3.2.2. Spatial Infrastructure and Environmental Grid
The road network topology and building footprints were extracted using the OSMnx library [54]. To account for environmental walkability and pedestrian preferences, we implemented a structured grid-based representation with a spatial resolution of 10 m. To ensure computational efficiency during multi-factor weight assignment and nearest-neighbor spatial queries, an R-tree spatial indexing method was implemented. Each grid cell was assigned an environmental impedance weight ($W^{\text{env}}$) based on land-use characteristics, where lower values represent more desirable walking environments and higher values represent physical or functional obstacles.
To provide methodological transparency, obstacles within the spatial framework are classified into four distinct categories as detailed in Table 1. This classification distinguishes between absolute physical barriers, environmental impediments, and functional constraints.
A critical refinement in our spatial modeling is the treatment of indoor environments as non-navigable zones. Although these spaces can physically be walked through, they are excluded from the routing graph to mitigate the impact of severe GPS signal attenuation and multipath effects typically encountered within buildings. This exclusion ensures that the P-WARP framework focuses on reliable outdoor pedestrian circulation, avoiding erroneous trajectory reconstructions in indoor corridors where sensing accuracy is compromised.
The road network was spatially intersected with the environmental grid such that each road segment inherited the impedance value of the grid cells it traversed. For segments intersecting multiple grid cells, the effective weight was computed based on the proportional length within each cell, preserving fine-grained environmental context in the graph representation.
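The proportional-length rule can be illustrated with a short sketch. It assumes the segment has already been split into pieces, each paired with the weight of the grid cell it falls in; the splitting step, the function name, and the example values are hypothetical.

```python
def effective_impedance(pieces):
    """Length-weighted mean of grid-cell weights along one road segment.

    pieces: list of (length_in_cell_m, cell_weight) tuples.
    """
    total = sum(length for length, _ in pieces)
    if total == 0:
        raise ValueError("segment has zero length")
    return sum(length * weight for length, weight in pieces) / total

# Hypothetical segment: 6 m inside a cell of weight 1.0, 4 m inside a cell of weight 3.0.
segment = [(6.0, 1.0), (4.0, 3.0)]
print(effective_impedance(segment))  # (6*1 + 4*3) / 10 = 1.8
```

A segment lying entirely within one cell simply inherits that cell's weight.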
3.2.3. Topographic Data Integration
Given the undulating terrain of the study area, elevation data were integrated to capture physical effort associated with slope. Elevation values were retrieved using the Open-Elevation API, which provides access to Shuttle Radar Topography Mission (SRTM) data [55]. Elevation was assigned to each graph node, and the absolute vertical grade for each edge was computed to represent topographic impedance.
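As a rough illustration, the edge grade can be computed from the two node elevations. The sketch assumes that "absolute vertical grade" means the absolute elevation difference divided by the edge length, which the text does not state explicitly; the function name and values are hypothetical.

```python
def absolute_grade(elev_u_m, elev_v_m, length_m):
    """Absolute rise over run for an edge; identical in both travel directions."""
    if length_m <= 0:
        raise ValueError("edge length must be positive")
    return abs(elev_v_m - elev_u_m) / length_m

# Hypothetical edge: 5 m of elevation change over a 100 m path.
print(absolute_grade(310.0, 315.0, 100.0))  # 0.05, i.e. a 5% grade
```

Using the absolute value means uphill and downhill traversal are penalized equally under this definition.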
3.2.4. WiFi-Trajectory Fusion and Data Cleaning
Empirical mobility data were initially stored in raw CSV format and converted to Apache Parquet for efficient processing. To ensure data quality, we applied a minimum movement threshold to filter out static noise and GPS drift. The campus network operates under a Single Sign-On (SSO) environment, allowing devices to roam seamlessly across APs without repeated authentication. Consequently, the collected logs represent Active Access Points, where devices maintained successful connections. The set of APs observed across the 71 trips therefore represents the effective digital infrastructure relevant to pedestrian mobility rather than an exhaustive inventory of deployed hardware.
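A minimum-movement filter of the kind described above might look like the following sketch. It assumes projected metric coordinates and keeps a sample only when it lies far enough from the last retained sample; the function name and the 2 m threshold in the example are illustrative assumptions, not the value used in the study.

```python
import math

def drop_static_points(points, min_move_m):
    """Keep a point only if it is at least min_move_m from the last kept point."""
    if not points:
        return []
    kept = [points[0]]
    for x, y in points[1:]:
        kx, ky = kept[-1]
        if math.hypot(x - kx, y - ky) >= min_move_m:
            kept.append((x, y))
    return kept

# Hypothetical track with jitter around two stationary positions.
track = [(0, 0), (0.5, 0), (1, 0), (5, 0), (5.2, 0), (10, 0)]
print(drop_static_points(track, min_move_m=2.0))  # [(0, 0), (5, 0), (10, 0)]
```

Comparing against the last kept point, rather than the previous raw point, prevents slow drift from accumulating unnoticed.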
3.3. Multi-Layered Semantic Graph Construction
The core of the P-WARP framework is the construction of a multi-layered semantic graph, which transforms a standard topological network into a feature-rich environment for pedestrian navigation. The environment is modeled as a weighted directed graph $G = (V, E)$, where $V$ represents the set of nodes and $E$ represents the set of edges. Specifically, each edge $e = (u, v) \in E$ denotes a directional link connecting a source node $u$ to a target node $v$. Unlike conventional routing models that rely solely on distance, our approach enriches each edge with an augmented attribute set across four semantic layers:
Topological and Physical Layer: The graph $G$ is initialized using OSM data, with each edge $e$ attributed with its metric length ($L_{uv}$) and projected UTM coordinates, establishing the fundamental physical layout of the network.
Environmental Walkability Layer: Grid-based environmental weights are projected onto the graph to capture semantic preferences. By calculating the centroid of each edge and executing a spatial join with the environmental grid, we assign a semantic impedance factor ($W^{\text{env}}_{uv}$) to each segment. This ensures the graph inherently favors high-walkability zones (e.g., parks) while penalizing architectural obstacles.
Topographic Impedance Layer (2.5D Enrichment): Terrain variations are integrated by mapping altitude data to each node $v \in V$. Each edge is attributed with an absolute vertical grade ($S_{uv}$), effectively transforming the 2D network into a 2.5D model that reflects the physical exertion required on sloped terrain.
Digital Infrastructure Layer: To facilitate connectivity-aware routing, each edge is spatially associated with the set of nearby Access Points. A WiFi connectivity factor is derived by buffering the edge geometry with a fixed radius. This threshold is adopted as a conservative operational approximation intended to represent a zone of relatively stable association during pedestrian movement, rather than the maximum physical signal propagation range. While WiFi signals may extend beyond this distance under favorable conditions, signal strength, stability, and roaming reliability typically degrade with distance and environmental interference. The fixed-radius buffer therefore serves as a simplified proxy for effective connectivity continuity. We acknowledge that this approach does not model signal attenuation explicitly and does not account for spatial variability in RSSI or interference. Future extensions may incorporate signal-strength-based buffering strategies, probabilistic association surfaces, or adaptive coverage modeling to more precisely represent connectivity variability. This spatial indexing allows the framework to dynamically quantify the digital attractiveness of specific paths based on either the global infrastructure (Model A) or Trip-Centric roaming logs (Model B).
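The buffer-based association in the Digital Infrastructure Layer can be sketched without a GIS library by testing each AP's distance to the edge segment. The sketch assumes straight edges in projected metric coordinates; the function names, AP identifiers, and the 25 m radius in the example are hypothetical and do not reflect the radius used in the study.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def aps_near_edge(edge, access_points, radius_m):
    """IDs of access points that fall inside the round-capped buffer of the edge."""
    a, b = edge
    return [ap_id for ap_id, xy in access_points.items()
            if point_segment_distance(xy, a, b) <= radius_m]

edge = ((0.0, 0.0), (100.0, 0.0))
aps = {"ap1": (50.0, 10.0), "ap2": (50.0, 40.0), "ap3": (120.0, 0.0)}
print(aps_near_edge(edge, aps, radius_m=25.0))  # ['ap1', 'ap3']
```

Passing the full AP inventory corresponds to the global configuration, while passing only the APs logged during one trip corresponds to the trip-centric configuration.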
By the end of this stage, each edge in the graph G is fully characterized by its physical, environmental, and digital properties, providing the foundation for the Bayesian weight optimization process in Stage 4.
3.4. Connectivity Modeling and Dual-Model Strategy
The P-WARP framework employs two distinct weighting strategies to calculate the final traversal cost of an edge. While both models utilize a multi-attribute vector, the source of the digital connectivity parameter and the structural composition of the cost function are specifically designed to address different navigation scenarios while maintaining strict methodological distinction.
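The fundamental cost function $C(u, v)$ applied in both strategies is defined as a weighted linear combination of physical and digital attributes:
\[
C(u, v) = w_{L}\, L_{uv} + w_{S}\, S_{uv} + w_{\text{env}}\, W^{\text{env}}_{uv} + w_{\text{wifi}}\, I^{\text{wifi}}_{uv} \tag{1}
\]
where $u$ and $v$ denote the source and target nodes of a directed edge, while $L_{uv}$, $S_{uv}$, and $W^{\text{env}}_{uv}$ correspond to the metric length, elevation grade, and environmental grid weight defined in Section 3.3. The core distinction lies in the definition of the WiFi impedance term $I^{\text{wifi}}_{uv}$, as detailed below.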
This formulation serves as a scalarization function that converts heterogeneous attributes into a unified traversal cost. The weighting coefficients act as tunable parameters governing the relative importance of each factor, allowing the model to quantify the pedestrian’s tolerance for physical detours in exchange for digital connectivity.
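The scalarization can be illustrated with a short sketch. It assumes each edge carries four numeric attributes (length, grade, environmental weight, WiFi impedance) and that the cost is their weighted sum; the dictionary keys and the numeric values are hypothetical.

```python
FACTORS = ("length", "grade", "env", "wifi")

def edge_cost(attrs, weights):
    """Weighted linear combination of edge attributes into one traversal cost."""
    return sum(weights[name] * attrs[name] for name in FACTORS)

# Hypothetical edge attributes and weight vector.
attrs = {"length": 100.0, "grade": 0.05, "env": 2.0, "wifi": 0.5}
weights = {"length": 1.0, "grade": 10.0, "env": 5.0, "wifi": 20.0}
print(edge_cost(attrs, weights))  # 100 + 0.5 + 10 + 10 = 120.5
```

Raising the WiFi weight relative to the length weight makes a longer but better-connected edge cheaper than a shorter, poorly connected one, which is how the weights encode detour tolerance.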
3.4.1. Model A: Global Static Model (Baseline)
The Global Model provides a standardized routing baseline by evaluating the environment through a static lens. The weighting structure focuses on the physical and infrastructural density of the campus as a whole, as visualized in the multi-layered composition of Figure 4.
All digital connectivity attributes are derived from the Aggregate Observed Inventory (the union of unique APs identified across the dataset), as depicted in Figure 4c. Since the model does not utilize any trip-specific logs or real-time connectivity data during the weighting process, the resulting path selection is based purely on environmental and infrastructural density (Figure 4d), thereby ensuring a leakage-free baseline. The WiFi impedance for the global configuration is defined as:
This formulation treats every Access Point as equally relevant, ignoring individual connection stability or device-specific roaming behaviors. The spatial search uses the same buffer radius defined in the graph construction phase. Consequently, edges with higher AP densities yield lower impedance values, mathematically incentivizing the routing algorithm to prioritize well-connected infrastructure.
3.4.2. Model B: Trip-Centric Dynamic Model
In contrast, the Trip-Centric Model integrates the P-WARP framework’s concept of Digital Comfort within a Single Sign-On (SSO) environment. Since the device maintains a persistent authentication state, it continuously attempts to roam across APs without user intervention.
Consequently, the WiFi impedance is derived dynamically from the retrospective roaming logs to capture the stability of the connection. We consider only the active APs where the device successfully maintained the session:
Unlike static coverage maps, this set of active APs delineates the actual “corridor of connectivity” where the roaming mechanism functioned effectively, acting as a direct proxy for the user’s seamless experience. Analogous to the global strategy, higher realized counts result in lower edge weights, effectively guiding the reconstruction algorithm to align closely with the user’s verified digital footprint.
As summarized in Table 2, the Trip-Centric Dynamic Model is explicitly designed to reconstruct the realized connectivity context of the specific trip. While this configuration incorporates trip-specific logs, it serves a critical analytical purpose: to strictly quantify the behavioral trade-off between physical effort (distance) and digital utility (WiFi stability).
By establishing this theoretical reference where the model is fully aware of the user’s connectivity experience, we can calibrate the weighting parameters of the cost function. This allows us to estimate how much additional distance a user is willing to traverse to maintain a connection, thereby testing the “Detour Tolerance” hypothesis without the noise of potential signal estimation errors.
3.5. Parameter Calibration via Bayesian Optimization
The final component of the methodology involves determining the optimal values for the weighting vector $\mathbf{w}$. Since the relationship between these weights and the resulting pedestrian trajectory is non-linear and computationally expensive to evaluate (requiring a full shortest-path graph traversal for every candidate set), traditional exhaustive methods like grid search are infeasible. Therefore, we employed Bayesian Optimization [23], a sequential model-based optimization strategy designed to efficiently locate the global optimum of black-box functions.
3.5.1. Objective Function: Behavioral Alignment via SAD
The optimization goal is to maximize the behavioral alignment between the model’s reconstructed path ($P_{\text{rec}}$) and the actual ground-truth GPS trajectory ($P_{\text{gt}}$). We used the Symmetric Average Distance (SAD), which measures the geometric deviation between predicted and observed trajectories, as the loss function.
To ensure robust distance calculation, both the reconstructed and ground-truth geometries were projected to the UTM Zone 47N coordinate system. As implemented in our evaluation pipeline, the geometries are discretized into points at 5 m intervals using linear interpolation. The SAD metric is then computed as the average of the directed mean distances:
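\[
\mathrm{SAD}(P_{\text{rec}}, P_{\text{gt}}) = \frac{1}{2}\left( \frac{1}{|P_{\text{rec}}|} \sum_{x \in P_{\text{rec}}} d(x, P_{\text{gt}}) + \frac{1}{|P_{\text{gt}}|} \sum_{y \in P_{\text{gt}}} d(y, P_{\text{rec}}) \right)
\]
where $|P_{\text{rec}}|$ and $|P_{\text{gt}}|$ denote the cardinality (total point count) of the reconstructed and ground-truth trajectory sets, respectively. The term $d(x, Y) = \min_{y \in Y} \lVert x - y \rVert_2$ represents the point-to-set distance, defined as the Euclidean distance from a query point $x$ (where $x \in X$) to its nearest neighbor in the target geometry set $Y$. By minimizing this metric, the optimization process identifies the specific combination of weights $\mathbf{w}$ that best rationalizes the observed route choices.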
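The SAD computation can be sketched in plain Python as follows. The sketch assumes both paths are polylines in projected metric coordinates, resamples them at the 5 m interval mentioned above, and uses a brute-force nearest-neighbor search; the function names are hypothetical and the authors' pipeline may differ in detail.

```python
import math

def resample(line, step_m=5.0):
    """Points every step_m metres along a polyline (linear interpolation), plus the last vertex."""
    pts = [line[0]]
    carried = 0.0  # distance walked since the last emitted point
    for (x1, y1), (x2, y2) in zip(line, line[1:]):
        seg = math.hypot(x2 - x1, y2 - y1)
        if seg == 0:
            continue
        d = step_m - carried
        while d <= seg:
            t = d / seg
            pts.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
            d += step_m
        carried = seg - (d - step_m)
    if pts[-1] != line[-1]:
        pts.append(line[-1])
    return pts

def directed_mean_distance(a, b):
    """Mean distance from each point of a to its nearest neighbour in b."""
    return sum(min(math.hypot(ax - bx, ay - by) for bx, by in b) for ax, ay in a) / len(a)

def sad(rec, gt, step_m=5.0):
    """Symmetric Average Distance: average of the two directed mean distances."""
    p, q = resample(rec, step_m), resample(gt, step_m)
    return 0.5 * (directed_mean_distance(p, q) + directed_mean_distance(q, p))

# Two parallel 20 m paths offset by 10 m.
print(sad([(0, 0), (20, 0)], [(0, 10), (20, 10)]))  # 10.0
```

For long trajectories a spatial index (such as the R-tree mentioned in Section 3.2.2) would replace the brute-force inner loop.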
3.5.2. Optimization Protocol with 5-Fold Cross-Validation
To ensure generalizability and mitigate overfitting, we implemented a 5-fold cross-validation scheme, as visually illustrated in Figure 5. The comprehensive computational procedure, encompassing the dynamic cost update logic and the Bayesian update loop, is formally detailed in Algorithm 1.
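Algorithm 1: P-WARP Parameter Optimization and Path Inference Strategy
Require: Graph $G = (V, E)$, Trajectory Set $\mathcal{T}$, Search Space $\Omega$
Ensure: Optimal Weights $\mathbf{w}^{*}$ and Validation SAD Score
1: Initialize: Split $\mathcal{T}$ into $K$ folds (5-Fold CV)
2: for each fold $k \in \{1, \dots, K\}$ do
3:     $\mathcal{T}_{\text{train}} \leftarrow$ Training set (80%)
4:     $\mathcal{T}_{\text{val}} \leftarrow$ Hold-out set (20%)
5:     Initialize Gaussian Process (GP) Surrogate Model
       ▹ Bayesian Optimization Loop
6:     for $i = 1$ to $N$ do
7:         Select candidate weights $\mathbf{w}_i \in \Omega$ via GP
8:         $\mathcal{S} \leftarrow \emptyset$
9:         for each trip $\tau \in \mathcal{T}_{\text{train}}$ do
10:            Identify Origin and Destination
               ▹ Dynamic Cost Function Update
11:            for each edge $(u, v) \in E$ do
12:                if Model == Global then
13:                    $I^{\text{wifi}}_{uv} \leftarrow$ impedance from the aggregate AP inventory (Equation (2))
14:                else ▹ Trip-Centric
15:                    $I^{\text{wifi}}_{uv} \leftarrow$ impedance using trip logs (Equation (3))
16:                end if
17:                $C(u, v) \leftarrow$ weighted edge cost under $\mathbf{w}_i$
18:            end for
19:            $P_{\text{rec}} \leftarrow$ shortest path from Origin to Destination under $C$
20:            Append $\mathrm{SAD}(P_{\text{rec}}, P_{\text{gt}})$ to $\mathcal{S}$
21:        end for
22:        Update GP with mean of $\mathcal{S}$ (Objective to Minimize)
23:    end for
24:    $\mathbf{w}^{*}_{k} \leftarrow$ candidate with the lowest mean training SAD ▹ Best weights found
25:    Calculate Validation SAD on $\mathcal{T}_{\text{val}}$ using $\mathbf{w}^{*}_{k}$
26: end for
27: return Average Validation SAD and Mean $\mathbf{w}^{*}$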
The dataset of 71 verified trips was randomly partitioned into five subsets (folds). The optimization process for each fold proceeded as follows:
1. Search Space Definition: We defined a bounded hyperparameter search space for the weights. This constraint ensures that physical length serves as a consistent baseline factor, preventing the model from collapsing into zero-cost solutions while allowing semantic factors to vary significantly.
2. Bayesian Search: For each training fold (80% of data), we utilized a Gaussian Process (GP) regressor as the surrogate model. The optimizer performed 15 iterations (5 random initialization steps followed by 10 guided exploration steps using the Expected Improvement acquisition function) to maximize the negative SAD score.
3. Validation: The optimal weights derived from the training phase of fold k were then applied to the corresponding held-out test fold (20%) to compute the final unbiased validation SAD error.
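Two ingredients of the protocol above can be sketched with the standard library: the random 5-fold partition and the Expected Improvement score for a minimization objective. The sketch assumes the surrogate supplies a predictive mean and standard deviation for a candidate weight vector; the function names, the seed, and the example numbers are hypothetical, and no Gaussian Process is fitted here.

```python
import math
import random

def k_fold_indices(n_items, k, seed=0):
    """Shuffle item indices and deal them into k roughly equal folds."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def expected_improvement(mu, sigma, best):
    """EI for minimization: expected amount by which a candidate beats the best loss so far."""
    if sigma <= 0.0:
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf

folds = k_fold_indices(71, 5)
print([len(f) for f in folds])  # [15, 14, 14, 14, 14]

# A candidate predicted below the current best SAD scores higher than one predicted above it.
print(expected_improvement(mu=9.0, sigma=1.0, best=10.0) >
      expected_improvement(mu=11.0, sigma=1.0, best=10.0))  # True
```

With 71 trips the folds cannot be exactly equal, so the 80%/20% split is approximate: each held-out fold contains 14 or 15 trips.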
This process was repeated for both the Global Static Model (Model A) and the Trip-Centric Dynamic Model (Model B) to facilitate a direct performance comparison. The convergence of the optimization process across iterations is visualized in Figure 6, which shows how the algorithm moves through the search space toward lower SAD values.
In Figure 6, the solid curves represent the cumulative minimum SAD value (i.e., the best objective score observed up to a given iteration), while the dashed curves correspond to the objective values of candidate weight configurations proposed by the Expected Improvement acquisition function at each optimization step. The fluctuations observed in the dashed curves reflect the exploratory nature of Bayesian Optimization as it evaluates different regions of the weight search space.
The relatively smoother convergence pattern of the Global Static Model suggests a less complex objective surface under aggregated infrastructure assumptions. In contrast, the Trip-Centric Dynamic Model exhibits larger exploratory variations, indicating higher sensitivity of the objective function to connectivity-related weights and a more intricate cost landscape. Despite these exploratory fluctuations, the best-so-far SAD of both models stabilizes within the allocated 15 iterations, which supports the adequacy of the optimization budget.
3.6. Evaluation Metrics
To provide a comprehensive assessment of the model’s performance beyond the optimization objective (SAD), we employed a suite of geometric similarity metrics. Each metric captures a different aspect of the spatial deviation between the reconstructed path ($P_{\text{rec}}$) and the ground truth ($P_{\text{gt}}$):
3.6.1. Hausdorff Distance
The Hausdorff Distance measures the “worst-case” deviation. It is defined as:
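\[
d_{H}(P_{\text{rec}}, P_{\text{gt}}) = \max\left\{ \sup_{p \in P_{\text{rec}}} \inf_{q \in P_{\text{gt}}} d(p, q),\; \sup_{q \in P_{\text{gt}}} \inf_{p \in P_{\text{rec}}} d(p, q) \right\}
\]
where $d(p, q)$ denotes the Euclidean distance between points $p$ and $q$ of the reconstructed path $P_{\text{rec}}$ and the ground truth $P_{\text{gt}}$. The operators sup and inf represent the supremum and infimum, effectively capturing the maximum of the minimum distances between the two sets. This metric is particularly useful for identifying substantial outliers where the reconstruction diverges significantly from the actual route.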
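For finite point sets the supremum and infimum reduce to max and min, so the metric can be sketched directly. The example assumes discretized trajectories in projected metric coordinates; the function names and coordinates are hypothetical.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from any point of a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

rec = [(0, 0), (10, 0)]
gt = [(0, 0), (10, 0), (10, 8)]  # the ground truth continues 8 m further north
print(hausdorff(rec, gt))  # 8.0
```

A single stray point is enough to set the value, which is why the metric exposes worst-case divergence rather than average fit.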
3.6.2. Fréchet Distance
Often described as the “dog-walking distance,” the Fréchet Distance accounts for the continuity and ordering of points along the curves. It is defined as:
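\[
d_{F}(P_{\text{rec}}, P_{\text{gt}}) = \inf_{\alpha, \beta} \max_{t \in [0, 1]} d\big(P_{\text{rec}}(\alpha(t)),\, P_{\text{gt}}(\beta(t))\big)
\]
where $\alpha$ and $\beta$ are continuous, non-decreasing re-parameterizations mapping the unit interval $[0, 1]$ to the respective trajectories. Unlike Hausdorff distance, which treats paths as unordered point sets, this metric ensures that the reconstructed path follows the same directional sequence as the ground truth.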
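On discretized trajectories, a common stand-in for the continuous definition is the discrete Fréchet distance, computed by dynamic programming over the two point sequences. The sketch below uses that discrete variant as an approximation, which is an assumption about the implementation rather than a statement of the authors' code; the function name is hypothetical.

```python
import math

def discrete_frechet(p, q):
    """Discrete Frechet distance between two ordered point sequences."""
    n, m = len(p), len(q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Advance along p, along q, or along both, keeping the smallest leash so far.
                ca[i][j] = max(min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1]), d)
    return ca[-1][-1]

forward = [(0, 0), (10, 0)]
print(discrete_frechet(forward, [(0, 3), (10, 3)]))  # 3.0
print(discrete_frechet(forward, forward[::-1]))      # 10.0
```

The second call shows the ordering sensitivity: the same path walked in reverse has a Hausdorff distance of zero but a Fréchet distance equal to the path length.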
3.6.3. Dynamic Time Warping (DTW) Normalized Distance
Dynamic Time Warping (DTW) finds the optimal non-linear alignment between two sequences by warping the time dimension. When applied to spatial trajectories, it allows for elastic matching of geometric shapes even if they are slightly shifted or locally distorted. The distance is derived from the optimal warping path
W that minimizes the cumulative cost:
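\[
\mathrm{DTW}_{\text{norm}}(P_{\text{rec}}, P_{\text{gt}}) = \frac{1}{K} \min_{W} \sum_{k=1}^{K} d(w_k)
\]
where $W = (w_1, \dots, w_K)$ represents the sequence of aligned point pairs between $P_{\text{rec}}$ and $P_{\text{gt}}$, $d(w_k)$ is the Euclidean distance between the two points of pair $w_k$, and $K$ is the length of the warping path. We report this normalized value to provide a robust measure of overall shape similarity independent of the total travel distance.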
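A minimal dynamic-programming sketch of the normalized DTW distance is given below. It assumes the standard step pattern (match, insertion, deletion), Euclidean point distances, and normalization by the length of the minimum-cost warping path; ties between equal-cost paths are broken in favor of the shorter path, which is a choice made here for determinism. The function name is hypothetical.

```python
import math

def dtw_normalized(p, q):
    """Cumulative DTW cost divided by the number of aligned pairs on the optimal path."""
    n, m = len(p), len(q)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]   # best cumulative cost for prefixes
    steps = [[0] * (m + 1) for _ in range(n + 1)]    # warping-path length achieving it
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            prev_cost, prev_steps = min(
                (cost[i - 1][j - 1], steps[i - 1][j - 1]),
                (cost[i - 1][j], steps[i - 1][j]),
                (cost[i][j - 1], steps[i][j - 1]),
            )
            cost[i][j] = prev_cost + math.dist(p[i - 1], q[j - 1])
            steps[i][j] = prev_steps + 1
    return cost[n][m] / steps[n][m]

rec = [(0, 0), (5, 0), (10, 0)]
gt = [(0, 2), (5, 2), (10, 2)]
print(dtw_normalized(rec, gt))  # 2.0
```

In the example the optimal alignment pairs the points one to one, giving a cumulative cost of 6 over a path of length 3.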
3.6.4. Length Similarity Coefficient
While geometric metrics capture spatial alignment, they do not explicitly measure whether the model correctly predicts the magnitude of a detour. The Length Similarity Coefficient quantifies the agreement in total travel distance between the reconstructed and actual paths:
where
represents the total metric length of a trajectory. A value closer to
indicates that the reconstructed path length closely matches the ground truth length, suggesting that the model has accurately captured the user’s willingness to traverse a specific distance (e.g., a detour) to satisfy their connectivity needs.
By evaluating the models against a diverse set of metrics, including geometric alignment measures (SAD, Hausdorff, Fréchet, DTW) and path magnitude similarity (the Length Similarity Coefficient), we check that the reported improvements are geometrically consistent and not artifacts of a single measurement technique.
4. Results and Discussion
We evaluate the performance of the P-WARP framework by comparing the Global Static Model (Model A) against the Trip-Centric Dynamic Model (Model B). The evaluation focuses on three key dimensions: quantitative accuracy across multiple geometric metrics, the interpretation of learned behavioral weights, and the visual validation of trajectory reconstructions.
4.1. Quantitative Performance Analysis
To rigorously quantify the capability of each model, we computed a comprehensive set of geometric metrics across all 71 verified trips using the 5-fold cross-validation scheme. While the Symmetric Average Distance (SAD) served as the primary optimization objective, we also calculated Hausdorff Distance, Fréchet Distance, and DTW Normalized Distance to ensure a robust evaluation.
Table 3 summarizes the comparative results. The Global Static Model (Model A) yielded an average SAD error of
. This baseline reflects the limitation of relying on a “one-size-fits-all” infrastructure map, which fails to account for connection failures or device-specific roaming patterns.
In contrast, the Trip-Centric Dynamic Model (Model B), which reconstructs the path using the user’s realized connectivity history, reduced the SAD error to , representing a moderate but consistent improvement of . Furthermore, consistent improvements were observed across all auxiliary metrics (Hausdorff: +5.77%, DTW: +7.38%). These results demonstrate that modeling the effective connectivity (SSO active states) yields trajectory reconstructions that are consistently closer to the ground truth than theoretical coverage models, offering a more realistic representation of pedestrian movement in digital environments.
To further evaluate the robustness of the P-WARP framework, we conducted a sensitivity analysis across varying grid resolutions (5 m, 10 m, and 20 m) and performed statistical validation using a paired t-test (N = 71 trips). As summarized in Table 4, the Trip-Centric model consistently outperforms the Global model across all grid configurations. However, the results are not identical across resolutions, indicating that spatial discretization does have a measurable, albeit modest, impact on model performance.
Specifically, finer grid resolutions (5 m) yield slightly improved reconstruction accuracy (8.27% SAD improvement), while coarser grids (20 m) exhibit a reduction in performance gain (5.63%). This trend suggests that higher-resolution grids better preserve local environmental and connectivity variations, whereas coarser discretization introduces smoothing effects that slightly degrade sensitivity to these factors.
Despite these variations, the overall performance differences remain relatively small, indicating that the P-WARP framework is robust to reasonable changes in spatial resolution. This analysis confirms that the model does not rely on a specific grid configuration and can maintain stable behavior across different levels of spatial granularity.
Based on this trade-off, a 10-m grid is selected as a practical balance between spatial precision and computational efficiency, offering strong performance (6.84% SAD improvement) with moderate computational cost.
To evaluate the statistical significance of the observed improvements, a paired t-test was conducted on the per-trip SAD values. The resulting p-value of 0.359 indicates that the improvement is not statistically significant at the 5% level, likely due to the high variability in individual walking behavior relative to the sample size (N = 71). Nevertheless, the consistent direction of improvement across all grid resolutions, along with corresponding gains in RMSE, points to a practical advantage of incorporating realized connectivity into the routing framework, although this should be read as a trend rather than a statistically confirmed effect.
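The paired test can be sketched with the Python standard library alone. The function below is an illustrative sketch, not the code used in the study: the names `paired_t_test`, `errors_a`, and `errors_b` are hypothetical, the inputs are assumed to be per-trip SAD errors for the two models in matching order, and the two-sided p-value is obtained by numerically integrating the Student t density with Simpson's rule. Where SciPy is available, `scipy.stats.ttest_rel` performs the same test.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Student t density, computed in log space for numerical stability."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def paired_t_test(errors_a, errors_b, steps=2000):
    """Return (t statistic, degrees of freedom, two-sided p-value).

    `steps` must be even (Simpson's rule). The differences must not all
    be equal, otherwise the standard error is zero.
    """
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    df = n - 1
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))

    # Integrate the density over [0, |t|] to get the central probability mass.
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central = total * h / 3

    p_two_sided = max(0.0, 1.0 - 2.0 * central)
    return t, df, p_two_sided
```

Because the test operates on per-trip differences, a few trips with large errors in either direction can inflate the standard error and mask a small average gain, which is one plausible reading of the non-significant result reported above.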
4.2. Analysis of Learned Behavioral Weights
Beyond error reduction, the Bayesian Optimization process provides insight into how different factors are prioritized in pedestrian route choice. The optimal weighting vectors obtained for each model reveal distinct behavioral structures underlying navigation decisions.
For the Global Static Model (Model A), the learned weights exhibit a relatively balanced configuration:
This pattern indicates that, under an infrastructure-centric assumption, physical distance remains a non-negligible cost, while environmental walkability and digital connectivity contribute comparably to route selection. In this setting, pedestrians are implicitly modeled as making trade-offs between minimizing distance and seeking favorable environmental and connectivity conditions.
In contrast, the Trip-Centric Dynamic Model (Model B) reveals a markedly different prioritization:
Here, the weight associated with physical length is reduced to a near-zero value, while the contribution of WiFi connectivity becomes dominant. This shift suggests that when a pedestrian’s realized connectivity history is explicitly accounted for, the marginal cost of additional walking distance diminishes substantially relative to the benefit of maintaining a stable connection.
From a behavioral perspective, this weighting structure implies that pedestrians are willing to deviate from shortest paths when doing so preserves effective connectivity. The learned parameters indicate that distance minimization becomes secondary once reliable roaming conditions are known, and that route choice is instead governed by the continuity of digital access. In effect, the model internalizes a preference for remaining within corridors where connectivity is stable, even if this entails additional physical effort. These results are consistent with the Digital Comfort hypothesis and suggest that pedestrian navigation in digitally instrumented environments is driven less by geometric efficiency than by the quality of the underlying connectivity experience.
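The effect of such a weighting shift on route choice can be illustrated with a toy graph. The sketch below is an assumption-laden example, not the P-WARP implementation: the four-node network, the normalized edge features (length, walkability impedance, WiFi impedance), and both weight vectors are hypothetical placeholders and are not the learned values reported in this section. Routing uses Dijkstra's algorithm on the weighted sum of the features.

```python
import heapq
import math

# Hypothetical edges: (normalized length, walkability impedance, WiFi impedance).
EDGES = {
    ("A", "B"): (0.4, 0.2, 0.6),  # direct corridor, weaker realized connectivity
    ("B", "D"): (0.4, 0.2, 0.6),
    ("A", "C"): (0.9, 0.2, 0.4),  # peripheral detour, stronger connectivity
    ("C", "D"): (0.9, 0.2, 0.4),
}

def build_graph(edges, weights):
    """Collapse each edge's features into one cost using the weight vector."""
    graph = {}
    for (u, v), features in edges.items():
        cost = sum(w * f for w, f in zip(weights, features))
        graph.setdefault(u, []).append((v, cost))
        graph.setdefault(v, []).append((u, cost))  # undirected network
    return graph

def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns the node sequence or None if unreachable."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, math.inf):
            continue  # stale queue entry
        for v, cost in graph.get(u, []):
            nd = d + cost
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

balanced = (1 / 3, 1 / 3, 1 / 3)      # illustrative, distance still matters
wifi_dominant = (0.02, 0.18, 0.80)    # illustrative, near-zero length weight

route_balanced = shortest_path(build_graph(EDGES, balanced), "A", "D")
route_wifi = shortest_path(build_graph(EDGES, wifi_dominant), "A", "D")
```

Under the balanced weights the direct corridor through B is cheapest, whereas under the WiFi-dominant weights the longer route through C becomes preferable, mirroring the detour behavior described above.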
4.3. Visual Inspection of Trajectories
To provide an intuitive understanding of how the P-WARP framework operates in practice, we first examine a representative case study in which the Global Static Model and the Trip-Centric Dynamic Model produced markedly different reconstructions. Quantitative evaluation of this trip highlights substantial variation in performance. The naive Shortest Path baseline exhibited a deviation of from the ground truth, whereas the P-WARP models achieved considerably lower reconstruction errors.
Figure 7 illustrates both the environmental context and the resulting path decisions for this trip. Under the Global Static Model, the aggregate infrastructure view suggests a relatively uniform density of WiFi access points across the area. Guided by this theoretical availability, the model produces a direct path through the central region of the campus. Although this approach improves upon the shortest-path baseline, reducing the deviation to
, it fails to account for the user’s actual roaming behavior and therefore still diverges from the observed trajectory.
In contrast, the Trip-Centric Dynamic Model leverages the user’s realized connectivity history. As shown in Figure 7d, the set of access points with which the device actively associated during the trip is considerably sparser and does not support the central route implied by the global map. Incorporating this trip-specific context, the model correctly infers a peripheral detour that aligns with the user’s movement, achieving a reconstruction error of
. This corresponds to an improvement of
over the Global Static Model and
relative to the Shortest Path baseline. Together, these results demonstrate that pedestrian route choice is shaped not by the mere presence of infrastructure, but by the effectiveness of connectivity experienced along the path.
To demonstrate that this behavior is systematic rather than anecdotal, a second case study is presented in Figure 8. In this instance, static infrastructure information provides no additional explanatory power: both the Shortest Path baseline and the Global Static Model yield identical deviations of
. The Trip-Centric Dynamic Model, however, reduces the reconstruction error to
, representing a consistent improvement of
over the static approaches.
Visual inspection reveals why the static models fail in this case. The global infrastructure map again suggests coverage that would support a direct route, leading the Global Static Model to default to the shortest geometric path. The ground-truth trajectory, however, follows a more complex route weaving through building spaces. The user’s realized connectivity history, shown in Figure 8d, exhibits a strong preference for peripheral corridors where stable connections are maintained. By penalizing regions that appear viable in the aggregate map but lack effective connectivity for the user, the Trip-Centric Dynamic Model deviates from the straight-line solution and more closely approximates the observed movement pattern, even if it does not replicate every local turn.
5. Discussion
This study advances pedestrian routing research by explicitly incorporating the digital layer of urban space into route inference, demonstrating that effective connectivity constitutes a latent but influential component of walkability. The empirical results consistently show that models accounting for realized WiFi connectivity outperform conventional distance-based and infrastructure-centric approaches. More importantly, the learned weighting structures provide insight into how pedestrians implicitly balance physical effort against digital utility, revealing behavioral priorities that are not observable through geometry alone.
5.1. Implications for Pedestrian Routing and Walkability Modeling
The findings suggest that pedestrian navigation cannot be fully explained by minimizing physical distance or travel time, particularly in digitally saturated environments. While walkability has traditionally been associated with physical attributes such as safety, aesthetics, and comfort, the results indicate that digital accessibility functions as an additional, and sometimes dominant, dimension of perceived walkability. In this sense, P-WARP reframes walkability as a hybrid construct, shaped jointly by spatial form and digital continuity.
The pronounced weighting shift observed in the Trip-Centric Dynamic Model, where WiFi impedance dominates and physical length becomes marginal, highlights the existence of implicit detour behavior driven by connectivity needs. This observation aligns with the broader notion of “Digital Comfort,” wherein pedestrians optimize their movement to maintain seamless access to online services, messaging, and cloud-based applications. Such behavior is especially relevant in smart campuses, outdoor commercial districts, and emerging smart city environments where connectivity is assumed but unevenly realized.
5.2. Methodological Contributions and Interpretability
From a methodological perspective, this work contributes a data-driven, inference-based approach that moves beyond forward simulation of pedestrian preferences. By adopting a Bayesian optimization framework inspired by inverse reinforcement learning, P-WARP enables the extraction of interpretable behavioral weights from observed trajectories. Unlike black-box prediction models, the learned parameters offer a transparent mechanism for understanding trade-offs between distance, terrain, environmental context, and digital infrastructure.
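The inference loop can be sketched in a few lines. The example below deliberately substitutes plain random search over the weight simplex for Bayesian optimization, so it illustrates the structure of the problem rather than the optimizer used in this work. All names are hypothetical, and `toy_objective` is a placeholder: in practice the objective would reconstruct each training trip under the candidate weights and return the mean SAD.

```python
import random

def sample_simplex(rng, k):
    """Draw k non-negative weights that sum to 1 (normalized exponentials)."""
    draws = [rng.expovariate(1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]

def search_weights(objective, k, n_iter=200, seed=0):
    """Random search stand-in for Bayesian optimization over k weights."""
    rng = random.Random(seed)
    best_w = [1.0 / k] * k            # start from the uniform weighting
    best_loss = objective(best_w)
    for _ in range(n_iter):
        candidate = sample_simplex(rng, k)
        loss = objective(candidate)
        if loss < best_loss:          # keep the best candidate seen so far
            best_w, best_loss = candidate, loss
    return best_w, best_loss

def toy_objective(weights):
    """Placeholder loss: squared distance to an arbitrary target vector."""
    target = (0.1, 0.2, 0.7)          # hypothetical, for illustration only
    return sum((w - t) ** 2 for w, t in zip(weights, target))

weights, loss = search_weights(toy_objective, k=3)
```

Constraining the weights to be non-negative and to sum to one is what makes them directly comparable across models; a Bayesian optimizer would replace the uniform sampling with a surrogate-guided proposal but leave the objective unchanged.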
The dual-model strategy further strengthens interpretability by providing a principled baseline. The contrast between the Global Static Model and the Trip-Centric Dynamic Model isolates the added explanatory power of realized connectivity, ensuring that performance gains are attributable to behavioral inference rather than model complexity alone. This design choice is particularly important for GIS applications, where reproducibility and explainability are central concerns.
5.3. Limitations and Operational Constraints
Despite these contributions, several limitations define the current scope of the framework. First, the empirical evaluation is based on 71 verified walking trips collected within a university campus. While this dataset is sufficient to show a consistent direction of improvement and to validate the proposed methodology through cross-validation, the gain did not reach statistical significance (p = 0.359), and the dataset remains limited in scale relative to large urban mobility datasets. The inferred weighting structures therefore reflect the behavioral tendencies of a specific demographic within a controlled environment and may require recalibration when applied to more heterogeneous populations or denser urban settings.
Second, the Trip-Centric Dynamic Model relies on retrospective roaming logs to reconstruct realized connectivity. This dependency introduces an inherent “cold start” constraint for real-time deployment, as personalized routing cannot be generated for users or devices without prior connectivity history. In its current form, P-WARP functions primarily as an analytical framework for explaining observed movement patterns rather than as a fully predictive navigation system for first-time users.
Addressing this limitation will require the development of device-agnostic or probabilistic connectivity profiles that can approximate likely roaming behavior based on general signal sensitivity, device class, or contextual network conditions. Such extensions would enable the framework to transition from post hoc behavioral analysis toward anticipatory routing support.
5.4. Broader Applicability and Future Directions
Although the case study focuses on a smart campus, the conceptual framework is applicable to a broader range of urban contexts where WiFi or similar digital infrastructures are publicly accessible, such as pedestrianized city centers, outdoor shopping districts, transport hubs, and mixed-use developments. As cities increasingly deploy municipal WiFi and edge computing infrastructure, the gap between theoretical coverage and realized connectivity is likely to widen, further reinforcing the relevance of connectivity-aware routing models.
Future work will explore the integration of real-time signal indicators, such as RSSI variability and network congestion, to enable dynamic weight updates during navigation. Additionally, the energy implications of continuous connectivity sensing warrant further investigation, particularly in balancing battery consumption against the benefits of maintaining stable digital access during pedestrian movement.
Overall, this study underscores the importance of recognizing digital infrastructure as an integral component of urban space. By embedding realized connectivity into pedestrian route inference, P-WARP offers both methodological and conceptual insights that contribute to the evolving discourse on smart cities, walkability, and human-centered spatial analytics.
6. Conclusions
This study introduced P-WARP, a trip-centric weighted graph framework designed to align pedestrian routing algorithms with the notion of Digital Comfort in digitally connected environments. By contrasting a generic infrastructure-based approach (Global Static Model) with a trip-centric, log-driven model (Trip-Centric Dynamic Model), we quantified the behavioral trade-off between physical distance and effective connectivity within a Single Sign-On (SSO) campus setting. Experimental results, validated using 5-fold cross-validation on 71 verified walking trips, show that incorporating realized connectivity information reduces trajectory reconstruction error (SAD) by 6.84%, although this gain did not reach statistical significance in a paired t-test (p = 0.359). Because the Trip-Centric Dynamic Model draws on retrospective connectivity logs, this figure is best read as an upper bound on the path predictability attainable from connectivity information, and it underscores the limitations of conventional navigation models that omit the digital layer of the built environment.
Beyond performance gains, analysis of the learned weighting parameters provides insight into pedestrian route-choice behavior. The Trip-Centric Dynamic Model consistently assigns a near-zero weight to physical path length () while placing dominant emphasis on WiFi connectivity (). This weighting structure indicates that, when reliable connectivity histories are available, pedestrians are willing to accept additional physical effort in exchange for maintaining stable digital access. These findings empirically support the hypothesis that connectivity continuity is a key determinant of navigation decisions in smart, digitally instrumented spaces.
Nevertheless, several limitations define the current scope of this work. First, the empirical evaluation is based on 71 verified walking trajectories collected within a single university campus environment. While this dataset enables statistically consistent validation under cross-validation and is sufficient to demonstrate the methodological contribution of the proposed framework, it remains modest in scale relative to large urban mobility datasets. The inferred behavioral weights therefore reflect tendencies within a specific demographic and spatial context rather than universal behavioral constants.
Importantly, the trajectory dataset was collected from 21 volunteer participants, all of whom were first-year undergraduate students. As a result, the observed walking behavior may reflect mobility patterns specific to this demographic group, including relatively high digital reliance and familiarity with campus infrastructure. Such sampling characteristics may introduce bias in the estimation of detour tolerance and connectivity preference. Broader datasets covering more diverse user groups, including different age ranges, occupational roles, and levels of digital dependence, would be necessary to fully evaluate the generalizability of the findings.
The learned detour tolerance should therefore be interpreted as context-dependent rather than universal. For example, elderly pedestrians may exhibit greater sensitivity to slope, safety, or physical exertion, potentially reducing their willingness to accept detours for connectivity continuity. Conversely, users with robust mobile data plans or strong cellular coverage may exhibit lower reliance on public WiFi infrastructure, thereby altering the relative weight assigned to connectivity in route choice decisions. Socioeconomic factors, device characteristics, and levels of digital dependence may all influence the observed trade-offs.
Second, the current WiFi impedance formulation relies on realized association counts as a proxy for connectivity stability. Although this captures effective digital continuity under an SSO roaming environment, it does not explicitly incorporate signal quality indicators such as RSSI, signal strength variability, latency, or packet loss. Integrating continuous Quality-of-Service (QoS) measurements would allow a more nuanced representation of digital impedance and may further refine behavioral inference in connectivity-aware routing models.
Third, the topographic impedance model employs absolute vertical grade, treating uphill and downhill traversal symmetrically. In practice, physiological effort and route preference may differ between ascent and descent. Future extensions may incorporate direction-sensitive slope penalties or energy-expenditure-based formulations to better capture asymmetric terrain effects.
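The difference between symmetric and direction-sensitive slope costs can be made concrete. The sketch below contrasts an absolute-grade penalty with Tobler's hiking function, a well-known empirical walking-speed model that is asymmetric in slope; it is offered as one possible formulation and is not part of the current P-WARP model. The function names and the choice of horizontal run as the travelled distance are assumptions made for illustration.

```python
import math

def symmetric_grade(dz, run):
    """Absolute vertical grade: identical for ascent and descent."""
    return abs(dz) / run

def tobler_speed_kmh(dz, run):
    """Tobler's hiking function: walking speed peaks on a gentle downhill."""
    slope = dz / run
    return 6.0 * math.exp(-3.5 * abs(slope + 0.05))

def traversal_time_s(dz, run):
    """Direction-sensitive edge cost: seconds to cover `run` metres."""
    speed_ms = tobler_speed_kmh(dz, run) / 3.6
    return run / speed_ms

# The same 100 m edge with a 5 m elevation change, walked in each direction.
uphill = traversal_time_s(dz=5.0, run=100.0)
downhill = traversal_time_s(dz=-5.0, run=100.0)
```

With an absolute-grade penalty the two directions of this edge cost the same, whereas the time-based cost makes the ascent noticeably more expensive than the descent, so a directed graph would be needed to store both values.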
Building on the limitations discussed above, future research will focus on narrowing the gap between analytical reconstruction and real-time prediction. A central direction is addressing the cold-start problem through the development of device-agnostic connectivity profiles that characterize likely roaming behavior without requiring extensive personal history.
In particular, probabilistic connectivity profiles could be constructed using aggregate infrastructure density and historical network statistics. For example, access point density may be modeled as a spatial stochastic process to estimate the probability of stable association along each graph edge. Historical AP-to-AP transition matrices derived from aggregated roaming logs could approximate expected roaming corridors without requiring user-specific trajectory history. Spatial regression or Gaussian Process interpolation of historical RSSI observations could generate continuous connection-probability surfaces across the network. Additionally, hierarchical Bayesian priors conditioned on device class may account for heterogeneous roaming stability across different hardware types. Such probabilistic formulations would enable predictive routing for new users while preserving the structural interpretability of the P-WARP framework.
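As one concrete instance of these ideas, an AP-to-AP transition matrix can be estimated directly from aggregated roaming sequences. The sketch below is illustrative only: the session format (an ordered list of access point identifiers per device session), the AP names, and the negative-log impedance mapping are all assumptions rather than elements of the existing framework.

```python
import math
from collections import Counter, defaultdict

def transition_probabilities(sessions):
    """Row-normalized AP-to-AP handover frequencies from roaming sequences."""
    counts = defaultdict(Counter)
    for sequence in sessions:
        for src, dst in zip(sequence, sequence[1:]):
            if src != dst:                 # ignore re-associations to the same AP
                counts[src][dst] += 1
    return {
        src: {dst: c / sum(row.values()) for dst, c in row.items()}
        for src, row in counts.items()
    }

def handover_impedance(probabilities, src, dst, floor=1e-6):
    """Hypothetical edge impedance: rarer handovers cost more (-log p)."""
    p = probabilities.get(src, {}).get(dst, 0.0)
    return -math.log(max(p, floor))

# Hypothetical aggregated logs: one ordered AP sequence per anonymized session.
sessions = [
    ["AP1", "AP2", "AP3"],
    ["AP1", "AP2"],
    ["AP1", "AP3"],
]
probs = transition_probabilities(sessions)
```

Because the matrix is built from pooled sessions, it requires no history for the individual being routed, which is the property needed to relax the cold-start constraint.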
In addition to infrastructure-based environmental grids, future work may integrate street view imagery and computer vision techniques to enrich the walkability layer with perceptual and semantic features. Recent advances in automated walkability audits using street view semantic segmentation [56], participatory AI-based frameworks for assessing streetscape inclusivity [57], and cross-view geo-localization using transformer-based feature alignment [58] demonstrate that fine-grained environmental perception can be extracted from large-scale visual data sources. Incorporating such vision-derived features into the P-WARP semantic graph may enable a more comprehensive representation of enclosure, greenery, façade continuity, inclusivity, and spatial coherence. This multimodal integration would further bridge remote sensing, street-level imagery, and connectivity-aware routing within a unified urban analytics framework.
In parallel, we plan to extend P-WARP from a research prototype toward a deployable navigation module. Future iterations will incorporate dynamic graph updates informed by live signal measurements and temporal network conditions, such as congestion during peak activity periods. Given the energy costs associated with continuous WiFi scanning, further investigation will also examine the trade-off between battery consumption and the benefits of maintaining stable, low-power connections during pedestrian movement.
Overall, P-WARP bridges physical navigation and digital infrastructure by explicitly modeling connectivity as a first-class factor in pedestrian routing. By treating effective connectivity as an integral component of walkability, the framework offers actionable insights for campus operators and urban planners and provides a transparent, interpretable methodological foundation for the development of user-centric navigation systems in future smart city environments.