MDPI - Publisher of Open Access Journals

29 pages, 8314 KB

Open Access Article

Prediction-Aware UAV Swarm Crowd Surveillance: Balancing Coverage and Recognition Accuracy

by Yan Lyu, Zhiyu Fan, Xueyong Xu, Di Tang, Guanyu Gao, Weiwei Wu and Yanfeng He

Drones 2026, 10(7), 487; https://doi.org/10.3390/drones10070487 (registering DOI) - 26 Jun 2026

UAV swarms provide a flexible sensing platform for smart-city crowd surveillance, but cooperative aerial monitoring remains challenging due to dynamic pedestrian distributions, partial observability, and the trade-off between visual coverage and recognition accuracy. In particular, flying at higher altitudes increases the field of view but reduces recognition accuracy, while low-altitude flight improves visual quality at the cost of limited coverage. To address these challenges, this paper proposes an environment-aware cooperative navigation framework that integrates spatiotemporal density prediction with multi-agent reinforcement learning. The surveillance area is modeled as a spatiotemporal graph, where sparse and partial UAV observations are used to predict future pedestrian-density maps and confidence intervals. The predicted density and uncertainty, together with empirical recognition error, UAV position, flight height, battery state, and historical observations, are incorporated into MARL-based policy learning. The learned policy enables UAVs to cooperatively adjust movement and altitude decisions under the centralized training and decentralized execution paradigm. Extensive simulations in UAV-based crowd surveillance environments demonstrate that the proposed framework achieves a more favorable coverage–error trade-off than representative heuristic, prediction-based, single-agent reinforcement learning, and multi-agent reinforcement learning baselines. The results show that prediction-aware and accuracy-aware cooperation improves pedestrian-level surveillance performance under dynamic and partially observable crowd distributions.

(This article belongs to the Special Issue UAV Swarm Intelligent Control and Decision-Making)
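
The altitude trade-off described in the abstract above can be illustrated with a toy calculation. This is a sketch under stated assumptions, not the paper's model: the footprint formula is ordinary pinhole-camera geometry, while the 60-degree field of view and the linear error model (`error_rate`, with its `base` and `slope` parameters) are hypothetical values chosen only for illustration.

```python
import math

def footprint_area(height_m: float, fov_deg: float = 60.0) -> float:
    """Ground area (m^2) seen by a downward-facing camera with a square field of view."""
    side = 2.0 * height_m * math.tan(math.radians(fov_deg) / 2.0)
    return side * side

def error_rate(height_m: float, base: float = 0.05, slope: float = 0.004) -> float:
    """Hypothetical recognition error that grows linearly with altitude, capped at 1."""
    return min(1.0, base + slope * height_m)

# Doubling the altitude quadruples the covered area but also raises the assumed error.
for h in (20.0, 40.0, 80.0):
    print(f"h={h:5.1f} m  area={footprint_area(h):9.1f} m^2  error={error_rate(h):.2f}")
```

Any learned policy has to weigh these two quantities against each other; the numbers here only show the direction of the effect.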

22 pages, 12841 KB

Open Access Article

Enhancing Unsupervised Multi-Source Domain Adaptation for Person Re-Identification via Mixture of Experts and Graph-Based Relation

by Hao Li, Yuyang Feng, Xin Zhao, Xuan Li and Tao Zhang

Sensors 2026, 26(12), 3968; https://doi.org/10.3390/s26123968 - 22 Jun 2026

Viewed by 335

Abstract

Person re-identification (re-ID) aims to match pedestrian images across disjoint camera views. Existing multi-source unsupervised domain adaptation (UDA) re-ID methods still face two critical issues: they fail to effectively balance domain-invariant feature learning and domain-specific style preservation and cannot adequately model the implicit correlations among diverse source domains, resulting in limited cross-domain generalization performance. To address these challenges, this paper proposes a novel multi-source UDA re-ID framework equipped with a Mixture of Experts feature extraction (MEFE) network and a Graph-Based Relation (GBR) module. Specifically, the MEFE network integrates mixed Instance and Batch Normalization (MIBN) to extract robust domain-invariant features, while the embedded domain-specific style information (DSI) module compensates for lost domain-specific style details at the feature level. Furthermore, the cascaded Graph Attention and Graph Convolution Networks (GATs/GCNs) in the GBR module adaptively explore implicit feature correlations and achieve effective multi-source feature fusion. Center maximum mean discrepancy loss is adopted to further reduce cross-domain distribution discrepancies. Extensive experiments on large-scale datasets demonstrate that the proposed method achieves state-of-the-art performance and substantially outperforms mainstream UDA re-ID approaches.

(This article belongs to the Special Issue Smart Sensors and Imaging for Face and Gesture Recognition)
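
For readers unfamiliar with maximum mean discrepancy (MMD), the sketch below shows the standard biased estimator of squared MMD with a Gaussian kernel. It is a generic illustration only: the abstract does not specify how the "center" variant is defined, and the kernel choice, the `gamma` value, and the toy feature vectors are all assumptions.

```python
import math

def rbf(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd_squared(source, target, gamma=0.5):
    """Biased estimate of squared MMD between two samples of feature vectors."""
    def mean_kernel(xs, ys):
        return sum(rbf(x, y, gamma) for x in xs for y in ys) / (len(xs) * len(ys))
    return (mean_kernel(source, source) + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))

# Toy 2-D features: the second domain is a shifted copy of the first.
domain_a = [(0.0, 0.1), (0.2, -0.1), (-0.1, 0.0)]
domain_b = [(x + 1.0, y + 1.0) for x, y in domain_a]
print(mmd_squared(domain_a, domain_a), mmd_squared(domain_a, domain_b))
```

A loss of this kind is small when the two feature distributions overlap and grows as they drift apart, which is why it is commonly used to align domains.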

18 pages, 4201 KB

Open Access Article

A Multi-Modal AI System for Detecting Pedestrians Lying on the Road: Simulation-Based Safety and Injury Risk Analysis

by Nick Barua and Masahito Hitosugi

Vehicles 2026, 8(6), 136; https://doi.org/10.3390/vehicles8060136 - 18 Jun 2026

Viewed by 300

Abstract

Introduction: Pedestrians lying on the road—collapsed through medical emergency, intoxication, or displacement following a prior collision—represent a disproportionately lethal and underaddressed category in road traffic safety. Forensic database analyses derived from Japan’s national police records document a fatality rate of 33.0% for collisions involving pedestrians lying on the road, more than double the rate for upright pedestrian collisions. Standard Advanced Driver-Assistance Systems (ADAS) yield a True Positive Rate (TPR) of only 21.4% for detecting pedestrians lying on the road under night conditions—a classification gap of 73.3 percentage points. Methods: In simulation trials, we evaluated the Advanced Falling Object Detection System (AFODS—where “falling object” denotes the low-profile human form at road level, distinguishing the prone pedestrian from the upright postures addressed by conventional ADAS) on a composite dataset of 3200 annotated fall events and 12,000 negative samples (training/validation), with 320 independent controlled simulation trials used for performance evaluation, spanning real-world, forensic-reconstruction, and Total Human Body Model for Safety (THUMS)-validated synthetic scenarios. No physical prototype has been evaluated; all performance data are derived from simulation, and 37.5% of positive samples are synthetically generated. These simulation conditions represent a first feasibility demonstration pending real-world hardware validation. This paper introduces three original contributions absent from prior work: a three-stage quantitative injury-risk model, a formal ISO 26262 Hazard Analysis and Risk Assessment (HARA), and a medicolegal SHAP interpretability framework. The injury-risk model translated detection latency via impact velocity to Head Injury Criterion (HIC) and estimated fatal injury probability (AIS ≥ 5); these model outputs should be interpreted as exploratory estimates pending ATD validation. 
Reporting follows principles consistent with the TRIPOD statement. Results: Under clear daytime conditions, AFODS demonstrated a TPR of 98.2% (95% CI: 97.4–98.8%) in simulation, decreasing to 95.6% under night dry-road conditions and 89.4% under night rain. The system achieved an AUC of 0.981 and a mean end-to-end latency of 46.5 ms, representing a 76.8 percentage-point improvement in simulation over the monocular RGB baseline (p < 0.001). The injury-risk model projects a reduction in estimated fatal head injury probability from 66.2% (Monte Carlo mean) (no detection, 50 km/h full-speed impact) to 0.7% under AFODS worst-case night/rain conditions, and to ≈0% under clear daytime simulation conditions. Conclusions: A 73.3 percentage-point classification gap places pedestrians lying on the road outside the effective detection envelope of current ADAS, compounded by the systematic exclusion of non-upright postures from regulatory test protocols and benchmark datasets. AFODS supports proof-of-concept feasibility under simulation conditions. Three translational steps are required: prototype validation on real-world hardware using instrumented Anthropomorphic Test Devices (ATDs); prone-posture biomechanical injury modelling using HIC and BrIC criteria; and regulatory extension of pedestrian AEB test standards to non-upright scenarios.

(This article belongs to the Topic Safe Automotive Systems: Trends, Opportunities, and Challenges)
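
Two of the figures reported in the abstract above can be checked with simple arithmetic. The sketch below uses only values stated in the abstract (50 km/h, 46.5 ms, 98.2%, 21.4%); the constant-speed assumption and the function name are illustrative and are not taken from the paper's injury-risk model.

```python
def distance_during_latency(speed_kmh: float, latency_ms: float) -> float:
    """Distance (m) covered at constant speed before the detection result is available."""
    speed_ms = speed_kmh / 3.6
    return speed_ms * latency_ms / 1000.0

# Distance travelled at 50 km/h during the reported 46.5 ms end-to-end latency.
travelled = distance_during_latency(50.0, 46.5)

# Gap between the reported daytime TPR and the 21.4% night-time baseline TPR.
tpr_gain = round(98.2 - 21.4, 1)

print(f"{travelled:.3f} m, {tpr_gain} percentage points")
```

Under these assumptions the vehicle covers well under one metre while the system is computing, which gives a sense of scale for the latency figure.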

24 pages, 2077 KB

Open Access Article

Few-Shot Transfer Learning for Cross-City Pedestrian Level-of-Service Mapping Using Spatio-Temporal Graph Models

by Atakilti Brhanu Kiros, Jonathan Dortheimer, Noam Teshuva and Achituv Cohen

Urban Sci. 2026, 10(6), 334; https://doi.org/10.3390/urbansci10060334 - 18 Jun 2026

Viewed by 182

Abstract

Urban planners need scalable ways to monitor pedestrian conditions across heterogeneous cities, but conventional Level-of-Service (LOS) methods are often locally calibrated and difficult to transfer. This study proposes a city-adaptive framework for pedestrian LOS mapping using spatio-temporal graph models and few-shot transfer learning. Pedestrian count data from Melbourne, Dublin, and Zurich were converted into six ordinal LOS classes using city-specific percentile thresholds computed from the training data, yielding a relative congestion measure rather than an absolute cross-city standard. We developed a spatio-temporal graph transformer with an ordinal prediction head and evaluated it under in-domain, zero-shot, few-shot, and domain-adaptive settings. The results show strong in-domain performance in Melbourne (accuracy 79.7%; Acc ± 1 99.1%) and effective adaptation to the city-adaptive ordinal classification task. Few-shot fine-tuning with only 5% labeled target city data recovered 95–99% of in-domain performance, suggesting that small amounts of local supervision can substantially reduce calibration requirements in data-scarce environments. KernelSHAP analysis indicates that short-term temporal lag features dominate predictions across cities, whereas spatial and contextual features vary more strongly with local urban structure. The findings suggest that few-shot transfer learning can support pedestrian LOS estimation in cities with limited labeled data; however, the proposed LOS formulation should be interpreted as a city-specific relative indicator rather than an absolute measure of pedestrian comfort, crowding, or service quality. While the framework was evaluated across three cities, additional validation in diverse urban contexts and against perceptual measures of pedestrian experience remains necessary. Overall, the study contributes a city-adaptive framework for transferable relative LOS prediction rather than a universal cross-city LOS standard. 

(This article belongs to the Special Issue Addressing the Challenges in the Development and Management of Public Spaces in Contemporary Cities)
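
The labelling step described in the abstract above, where counts become six ordinal classes through thresholds computed from training data, can be sketched as follows. The equal-frequency cut points, the sample counts, and the helper names are assumptions; the abstract does not state which percentiles the authors used.

```python
import bisect
import statistics

def fit_thresholds(train_counts, n_classes=6):
    """Cut points from the training data; equal-frequency bins are an assumption."""
    return statistics.quantiles(train_counts, n=n_classes)  # n_classes - 1 cut points

def to_los(count, thresholds):
    """Map a count to an ordinal class 0..n_classes-1 (0 = least congested)."""
    return bisect.bisect_right(thresholds, count)

def accuracy(y_true, y_pred, tolerance=0):
    """Share of predictions within `tolerance` classes of the truth (1 gives Acc ± 1)."""
    hits = sum(abs(t - p) <= tolerance for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

# Hypothetical hourly pedestrian counts for one city.
train = [12, 30, 45, 60, 80, 95, 120, 150, 200, 260, 310, 400]
cuts = fit_thresholds(train)
y_true = [to_los(c, cuts) for c in (10, 70, 130, 500)]
y_pred = [0, 3, 3, 5]  # made-up model output
print(y_true, accuracy(y_true, y_pred), accuracy(y_true, y_pred, tolerance=1))
```

Because the thresholds are refit per city, the same raw count can land in different classes in different cities, which is why the abstract calls the measure relative rather than absolute.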

29 pages, 3581 KB

Open Access Article

A Semantic-Aware Video Offloading Framework for Bandwidth-Efficient Cloud-Based Surveillance

by Neeta Gajanan Kadukar and Diksha Dani

Algorithms 2026, 19(6), 483; https://doi.org/10.3390/a19060483 - 16 Jun 2026

Viewed by 206

Abstract

The proliferation of IoT-based surveillance has caused a sharp rise in video data, straining network bandwidth and cloud storage. Conventional video compression exploits pixel-level redundancy but ignores the semantic importance of content, transmitting large volumes of redundant background. This paper proposes a semantic-aware video offloading framework that improves bandwidth efficiency in cloud-based surveillance. DeepLabV3+ with a ResNet-50 backbone performs semantic segmentation at the edge to extract relevant foreground objects (e.g., pedestrians and vehicles) while suppressing static background. A background reference caching mechanism transmits the static scene once and reuses it at the cloud for full-frame reconstruction, minimizing redundant transmission. On a dataset of 12 surveillance sequences (self-captured videos plus sequences from the CDnet 2014 benchmark), the method achieves up to 74.63% reduction in transmitted data, a 33% improvement in storage efficiency, and a compression ratio of 2.88×, while maintaining an average PSNR of 44.92 dB. Paired t-tests (p < 0.001) and sensitivity analysis across varying scene dynamics and semantic configurations confirm the robustness of the approach, and comparisons indicate clear gains over conventional motion-based offloading in bandwidth efficiency and reconstruction fidelity.

(This article belongs to the Special Issue Artificial Intelligence, Image Processing and Spatial Analytics in Environmental Informatics)
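
The two quality metrics named in the abstract above have standard definitions, sketched here for flat sequences of 8-bit pixel values. The sample pixels and byte counts are made up for illustration, and nothing here reproduces the paper's pipeline.

```python
import math

def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two equally sized pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

def reduction_percent(raw_bytes, sent_bytes):
    """Percentage of data that no longer has to be transmitted."""
    return 100.0 * (1.0 - sent_bytes / raw_bytes)

frame = [100, 120, 130, 140]
rebuilt = [101, 119, 130, 142]
print(f"{psnr(frame, rebuilt):.2f} dB, {reduction_percent(1000, 250):.1f}% less data")
```

Higher PSNR means the reconstructed frame is closer to the original, so the reported 44.92 dB should be read alongside the bandwidth savings rather than on its own.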

28 pages, 8801 KB

Open Access Article

Smartphone and Smartwatch Crowdsensing for Bridge Modal Identification with Convergence Behavior and Bootstrap Uncertainty Analysis

by Furkan Luleci and Sadig Nuraliyev

Infrastructures 2026, 11(6), 204; https://doi.org/10.3390/infrastructures11060204 - 16 Jun 2026

Viewed by 228

Abstract

This study investigates the feasibility, accuracy, and data-sufficiency requirements of smartphone- and smartwatch-based crowdsensing for pedestrian bridge modal identification under real-world conditions. Full-scale experiments were conducted on a bridge across two crowdsensing scenarios with varying dynamic excitation intensities by six pedestrians performing walking, running, and bicycling activities while carrying smartphones and wearing smartwatches. Triaxial acceleration data were collected over 300 s and processed using a framework comprising preprocessing, modal estimation, growing-window convergence analysis, and block-bootstrap uncertainty quantification. Using the full dataset, both devices reliably identified the four consistently detectable bridge modes with average errors of approximately 3% across the scenarios relative to the benchmark. In the convergence analysis, smartwatches consistently produced narrower confidence intervals and more stable early-window estimates, which may be related to their more constrained wearing condition and reduced incidental motion compared to pocket-carried smartphones. Higher pedestrian excitation with additional pedestrians running accelerated the convergence, reducing the required data duration and number of pedestrian passes, albeit with increased uncertainty. The study established data-sufficiency thresholds, showing that reliable modal estimates require in the range of 5–17 walking or running passes, while bicycling passes range from 14 to 28, depending on bridge excitation level and device type. Results demonstrate that commodity smartphones and smartwatches are viable, scalable, and cost-effective platforms for crowdsensed bridge modal identification, provided that uncertainty ranges are properly accounted for and sufficient passes across different pedestrian activities are collected to achieve the desired accuracy.

(This article belongs to the Special Issue Advanced Technologies for Bridge Health Monitoring)
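
The block-bootstrap idea mentioned in the abstract above can be sketched on a synthetic record. This is a deliberately simplified stand-in, not the authors' framework: it uses non-overlapping blocks, averages single-bin Fourier power over resampled blocks, and picks the strongest frequency from a coarse grid. The 2.0 Hz mode, the noise level, the sampling rate, and all function names are assumptions.

```python
import math
import random

def band_power(block, fs, freq):
    """Power of one frequency in a block, from a single-bin discrete Fourier sum."""
    re = sum(x * math.cos(2 * math.pi * freq * i / fs) for i, x in enumerate(block))
    im = sum(x * math.sin(2 * math.pi * freq * i / fs) for i, x in enumerate(block))
    return re * re + im * im

def peak_frequency(spectra, freqs):
    """Frequency with the largest power after averaging the given block spectra."""
    mean_power = [sum(s[k] for s in spectra) / len(spectra) for k in range(len(freqs))]
    return freqs[mean_power.index(max(mean_power))]

def block_bootstrap_ci(signal, fs, block_s, freqs, n_boot=200, seed=0):
    """95% percentile interval for the peak frequency, resampling whole blocks."""
    rng = random.Random(seed)
    size = int(block_s * fs)
    blocks = [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]
    spectra = [[band_power(b, fs, f) for f in freqs] for b in blocks]
    peaks = sorted(
        peak_frequency([rng.choice(spectra) for _ in blocks], freqs)
        for _ in range(n_boot)
    )
    return peaks[int(0.025 * n_boot)], peaks[int(0.975 * n_boot) - 1]

# Synthetic acceleration record: a 2.0 Hz mode plus Gaussian noise (illustrative values).
fs = 20.0
noise = random.Random(1)
signal = [math.sin(2 * math.pi * 2.0 * i / fs) + noise.gauss(0, 0.5) for i in range(1200)]
freqs = [round(1.0 + 0.1 * k, 1) for k in range(21)]
low, high = block_bootstrap_ci(signal, fs, block_s=10.0, freqs=freqs)
print(low, high)
```

Resampling whole blocks rather than individual samples preserves the short-range time correlation of the signal, which is the reason a block bootstrap is preferred over an ordinary one for vibration data.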

21 pages, 4711 KB

Open Access Article

An Integrated Model for Dam Evacuation Under Explosion-Induced Damage: Coupling Physical Damage and Crowd Behavior

by Hongpeng Qiu, Eric Wai Ming Lee, Lingling Hu and Xiangping Xian

Fire 2026, 9(6), 259; https://doi.org/10.3390/fire9060259 - 16 Jun 2026

Viewed by 467

Abstract

This study develops an integrated computational framework to assess the passage efficiency of a dam crest serving as a critical inter-regional corridor following a severe explosion event. The framework combines a physics-based damage model with an agent-based cellular automata (CA) approach that incorporates pedestrian behavioral heterogeneity. The damage model conceptualizes three concentric zones: a complete fragmentation zone (0–1.5 m) with total material disintegration, a primary damage zone (1.5–5 m) following an exponential decay in structural integrity, and a secondary damage zone (5–20 m) governed by a power-law attenuation of fragmentation effects. Pedestrian behavior is parameterized by the Allowable Conflict Coefficient (ACC), the inverse of interpersonal friction, and the Emergency Level (EL), which scales the desired velocity. Extensive simulations under stochastic and targeted impact scenarios reveal a consistent evacuation performance hierarchy: Center (C) > Bottom-Left (BL) > Top-Left (TL) > Bottom-Right (BR) ≈ Top-Right (TR). Exit-proximal damage (TR, BR) increased evacuation time by up to 85% compared with central impacts. Results demonstrate a strong coupling between physical friction and urgency: the “faster-is-faster” effect is maximized under low friction (high ACC), while high friction not only suppresses the benefits of elevated EL but can also induce “faster-is-slower” phenomena under extreme conditions. These findings underscore that optimal evacuation strategies depend critically on both impact location and crowd behavior management, providing actionable insights for emergency planning and highlighting the importance of conflict mitigation in enhancing infrastructure resilience. The proposed framework thus offers a versatile and validated simulation tool for emergency planners to proactively assess and optimize evacuation strategies under various damage scenarios.

(This article belongs to the Special Issue Behavioral Research on Fire Evacuation and Decision-Making Processes)
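
The three-zone damage description in the abstract above maps naturally onto a piecewise function of distance. In the sketch below only the zone radii (1.5 m, 5 m, 20 m) come from the abstract. The decay rate, the power-law exponent, the choice to express the result as a damage level in [0, 1], and the continuity at the 5 m boundary are all assumptions made for illustration.

```python
import math

def damage_level(r_m, decay_rate=0.5, power_exp=2.0):
    """Illustrative damage in [0, 1] at distance r_m from the blast centre.

    Zone radii follow the abstract; decay_rate and power_exp are hypothetical.
    """
    if r_m < 0:
        raise ValueError("distance must be non-negative")
    if r_m <= 1.5:   # complete fragmentation zone
        return 1.0
    if r_m <= 5.0:   # primary zone: exponential decay with distance
        return math.exp(-decay_rate * (r_m - 1.5))
    if r_m <= 20.0:  # secondary zone: power-law attenuation
        edge = math.exp(-decay_rate * (5.0 - 1.5))  # value at the 5 m boundary
        return edge * (5.0 / r_m) ** power_exp
    return 0.0       # outside the modelled zones

for r in (1.0, 3.0, 5.0, 10.0, 25.0):
    print(r, round(damage_level(r), 4))
```

A cellular automaton could read such a field cell by cell to decide which parts of the crest are impassable or slowed, although how the paper couples the two models is not described in the abstract.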

25 pages, 8152 KB

Open Access Article

Nonlinear Effects of Station-Area Environments on Commercial–Employment Composite Vitality: Evidence from Osaka’s Midosuji Line

by Yu Li, Zihao Wang, Minfeng Yao, Yikang Zhang and Qi Zhang

Land 2026, 15(6), 1054; https://doi.org/10.3390/land15061054 - 15 Jun 2026

Viewed by 216

Abstract

Rail-transit station areas concentrate commercial services, employment, and intensive land development, but their vitality is shaped by multiple built-environment conditions rather than rail accessibility alone. Focusing on 20 stations along the Osaka Metro Midosuji Line in Japan, this study uses Japanese chome units, which are small neighborhood-level address and statistical units, within an 800 m pedestrian catchment as analytical units and measures commercial-service agglomeration intensity, employment intensity, and commercial–employment composite vitality. The composite indicator measures the static co-concentration of commercial-service provision and employment carrying capacity, with pedestrian flow, consumption activity, and dwell time treated as separate dimensions of station-area vitality. Ten station-area environmental variables are examined using ordinary least squares (OLS), Lasso, Random Forest, Back-Propagation (BP) Neural Network, and extreme gradient boosting (XGBoost) models, with Shapley additive explanations (SHAP) applied to interpret variable contributions and nonlinear responses. Results show that nonlinear models generally outperform linear models. Development intensity, officially assessed land price, and network distance to the nearest metro station are the most influential variables, showing threshold, marginal, and non-monotonic effects. Split models indicate that commercial-service agglomeration is more sensitive to rail proximity and street-network conditions, whereas employment intensity is more associated with development intensity and land price. These findings support fine-grained station-area renewal and mixed-function planning.

(This article belongs to the Special Issue Transport Planning in Smart Cities and Sustainable Urban Design)

36 pages, 14641 KB

Open Access Article

Physics-Informed Inference of Historical Stair Usage from Geometric Wear Profiles in Heritage Structures

by Jianchao Yu, Yating Zhong, Ziheng Luo, Yuqi Guo and Jufang Hu

Appl. Sci. 2026, 16(12), 6025; https://doi.org/10.3390/app16126025 - 14 Jun 2026

Viewed by 141

Abstract

Wear on historic staircases is often used as evidence for conservation assessment and historical interpretation, yet existing studies are largely descriptive and rarely provide a quantitative explanation of how observed wear relates to long-term pedestrian use. To address this limitation, this paper proposes a physics-constrained inversion framework for analyzing directional preference and wear-related usage regimes from geometric wear profiles of heritage staircases. An Archard-type wear model is extended to account for spatial footfall distribution, cumulative abrasion, material deterioration, and environmental loss, and the reconstruction problem is formulated as an inverse parameter estimation task. Bayesian uncertainty quantification is introduced to estimate posterior distributions, credible intervals, and parameter coupling. A unified workflow is developed for staircase geometry representation, reference surface reconstruction, profile extraction, regularized height field construction, forward simulation, and inverse solution. Nine synthetic scenarios with different usage levels and directional preferences are tested under 1%, 3%, and 5% noise, and the method is further applied to a publicly available three-dimensional heritage staircase model. Under 3% noise, profile correlation coefficients for three representative scenarios reach 0.9646, 0.9807, and 0.9868, indicating strong recoverability of geometric wear morphology under model-consistent conditions. The results indicate that directional preference, material hardness, and some degradation-related parameters are identifiable, whereas pedestrian volume and the wear coefficient show strong compensation. Overall, the proposed framework provides a quantitative basis for identifying directional asymmetry, analyzing parameter identifiability, and supporting geometry-based interpretation in heritage staircase studies.
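
The compensation between pedestrian volume and the wear coefficient reported in the abstract above has a simple reading in the classical Archard law, V = K F s / H, where the number of footfalls and K enter only as a product. The sketch below uses that classical form, not the extended model of the paper, and every numeric value is hypothetical.

```python
import math

def archard_depth(footfalls, wear_coeff, load_n, slide_m, hardness_pa, contact_area_m2):
    """Wear depth (m) from the classical Archard law, spread over a contact area."""
    volume = footfalls * wear_coeff * load_n * slide_m / hardness_pa
    return volume / contact_area_m2

# Twice the traffic with half the wear coefficient yields the same predicted depth,
# so a measured profile alone cannot separate the two parameters.
depth_a = archard_depth(1e6, 1e-4, 700.0, 0.01, 1e9, 0.01)
depth_b = archard_depth(2e6, 5e-5, 700.0, 0.01, 1e9, 0.01)
print(depth_a, depth_b, math.isclose(depth_a, depth_b))
```

Hardness, by contrast, enters on its own in the denominator, and directional preference changes the shape of the profile rather than its overall scale, which is consistent with those parameters being reported as identifiable.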

26 pages, 1547 KB

Open Access Article

Sustainable Urban Accessibility and Retail Choices: Consumer Behaviour Through Discrete Choice Analysis in Southern Italy

by Antonio Russo, Tiziana Campisi, Socrates Basbas, Efstathios Bouhouras and Giovanni Tesoriere

Sustainability 2026, 18(12), 6081; https://doi.org/10.3390/su18126081 - 12 Jun 2026

Viewed by 377

Abstract

Shopping mobility accounts for a significant share of total travel, while the growth of e-commerce is reshaping consumer purchasing behaviour and retail dynamics. Comprehending how territorial and sociodemographic factors shape the choice between physical and digital retail channels is therefore a key issue for transport planning and sustainable urban mobility. In this context, it is important to understand how accessibility to different classes of retailers is configured and how it can impact purchasing choices. Through a discrete choice analysis, this study examines the sociodemographic and territorial determinants of purchasing behaviour, focusing on the clothing market. Four purchase alternatives are considered: medium-sized and small urban retail stores, shopping malls, online purchasing, and no purchase. This multi-alternative framework enables the direct estimation of substitution patterns not only between physical and digital retail, but also between distinct forms of physical retail. Data were collected through a survey conducted in Southern Italy, providing empirical evidence from a territorial setting that is structurally underrepresented in the existing literature. A multinomial logit model and a two-level hierarchical logit model incorporating pedestrian accessibility—measured as walking time from residence to the nearest clothing store—alongside sociodemographic and territorial attributes were calibrated to analyse alternative choice behaviour. The calibrated models show interesting results, highlighting the role of pedestrian accessibility in the choice of clothing stores in city centres. Age, income, and territorial variables further differentiate channel preferences across population segments. The findings offer relevant implications for policymakers, governance managers, urban planners, and researchers concerned with retail location, sustainable accessibility, and consumer behaviour. 
These insights can support planning aligned with the United Nations 2030 Agenda, particularly Sustainable Development Goal 11.

(This article belongs to the Special Issue Sustainable Urban Green Transport and Mobility: Lessons from Practice)
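
The multinomial logit model named in the abstract above assigns each alternative a probability proportional to the exponential of its systematic utility. The sketch below applies that standard formula to the four alternatives listed in the abstract. The utility constants and the walking-time coefficient `beta_walk` are hypothetical and are not estimates from the study.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities: P_i = exp(V_i) / sum_j exp(V_j)."""
    shift = max(utilities.values())  # subtract the maximum for numerical stability
    expo = {alt: math.exp(v - shift) for alt, v in utilities.items()}
    total = sum(expo.values())
    return {alt: e / total for alt, e in expo.items()}

def utilities_for(walk_min, beta_walk=-0.08):
    """Hypothetical systematic utilities; only urban stores depend on walking time here."""
    return {
        "urban_store": 0.5 + beta_walk * walk_min,
        "shopping_mall": 0.2,
        "online": 0.4,
        "no_purchase": -1.0,
    }

near = mnl_probabilities(utilities_for(5))
far = mnl_probabilities(utilities_for(20))
print(round(near["urban_store"], 3), round(far["urban_store"], 3))
```

With a negative walking-time coefficient, the share of urban stores falls as the nearest store gets farther away and the other alternatives absorb the difference, which is the kind of substitution pattern the study sets out to estimate.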

19 pages, 1785 KB

Open Access Article

AI-Driven Urban Traffic Monitoring and Control Using YOLOv11 for Enhanced Throughput

by Benjamin Ilo and Hongwei Zhang

Electronics 2026, 15(12), 2590; https://doi.org/10.3390/electronics15122590 - 12 Jun 2026

Viewed by 186

Abstract

Urban traffic congestion remains a persistent global challenge, contributing to significant economic inefficiencies, elevated greenhouse gas emissions, and diminished quality of life. This paper presents a real-world video-based traffic monitoring study combined with a proposed adaptive signal control framework. In the monitoring component, YOLOv11 object detection was applied directly to footage recorded from an overhead bridge position on a 40 km/h road. The model successfully detected and tracked multiple road-user categories, including cars, trucks, buses, motorcycles, cyclists, and pedestrians, yielding 1041 vehicle detections across 25 unique tracked objects. Vehicle speeds were estimated from inter-frame centroid displacement, and a Region of Interest (ROI) occupancy model was used to classify congestion states as High, Medium, or Free Flow using thresholds grounded in Highway Capacity Manual (HCM) level-of-service criteria. The system detected 11 high-congestion frames (3.8%), 184 medium-congestion frames (63.9%), and 93 free-flow frames (32.3%), consistent with moderate congestion observed during the recording period. In the proposed control component, a Proximal Policy Optimisation (PPO)-based reinforcement learning signal controller is designed around the YOLOv11 detection outputs as its state representation. Based on comparable adaptive traffic signal control studies in the literature, the proposed framework is projected to achieve approximately 25% higher peak-hour throughput, 35% shorter queue lengths, and 32% lower average waiting times relative to a fixed-time signal baseline. The detection accuracy (mAP@0.5 = 93.2%) and inference speed (32 FPS) cited are published YOLOv11 benchmarks used as indicative performance references. This work bridges real-world perception and proposed intelligent control, providing a transparent and reproducible methodology for next-generation smart city traffic management.

(This article belongs to the Special Issue Artificial Intelligence for Advanced Engineering: Techniques, Methods, and Frameworks)
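
The frame percentages in the abstract above can be reproduced from the reported counts, and the speed and congestion steps can be sketched in a few lines. The frame counts come from the abstract; the metres-per-pixel scale, the sample centroids, and both occupancy thresholds are placeholders rather than values from the paper or from the Highway Capacity Manual.

```python
def frame_shares(counts):
    """Percentage of frames in each congestion state, rounded to one decimal."""
    total = sum(counts.values())
    return {state: round(100.0 * n / total, 1) for state, n in counts.items()}

def speed_kmh(p0, p1, metres_per_pixel, fps):
    """Speed from the centroid displacement between two consecutive frames."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    pixels = (dx * dx + dy * dy) ** 0.5
    return pixels * metres_per_pixel * fps * 3.6

def congestion_state(occupancy, medium_at=0.3, high_at=0.7):
    """Classify ROI occupancy in [0, 1]; both thresholds are placeholders."""
    if occupancy >= high_at:
        return "High"
    return "Medium" if occupancy >= medium_at else "Free Flow"

shares = frame_shares({"High": 11, "Medium": 184, "Free Flow": 93})
speed = speed_kmh((100, 200), (103, 204), metres_per_pixel=0.07, fps=32)
print(shares, round(speed, 2), congestion_state(0.45))
```

The reported shares sum over 288 classified frames, so the percentages are internally consistent with the counts given in the abstract.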

17 pages, 1163 KB

Open Access Article

SHARP: A Risk-Constrained Transformer with Closed-Form CVaR Safety Masks for Multi-Robot Task Allocation in Human-Shared Warehouses

by Shengshuo Gong, Qiujie Shen and Oleg O. Varlamov

Mathematics 2026, 14(12), 2096; https://doi.org/10.3390/math14122096 - 11 Jun 2026

Viewed by 154

Abstract

Modern fulfillment centers share floor space with human workers, making warehouse multi-robot task allocation a safety-critical problem. We propose SHARP (Safe Heterogeneous Allocation with Risk Prediction), a Transformer-based constrained reinforcement-learning framework with a closed-form deployment-time safety mask. Under a Gaussian pedestrian belief and fixed closest-approach directions, the mask uses Bonferroni-allocated per-pair CVaR scores; a nonnegative mask score implies a conservative trajectory-level chance constraint under the stated assumptions. We also present an idealized primal–dual surrogate analysis, without claiming global convergence for the nonconvex Transformer/PPO implementation. Expanded experiments use ten training seeds per learned method and deterministic final-checkpoint evaluation on twenty independently generated held-out instances. No statistically significant difference between SHARP and Lagrangian-PPO was detected in any of the four scenarios. The held-out analysis further reveals late-training instability and severe over-conservatism in the dense S40_high scenario. These findings position SHARP as an auditable geometric filtering mechanism, while identifying conservatism and training stability as important limitations for deployment.
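
The closed-form ingredient behind the safety mask in the abstract above is the CVaR of a Gaussian variable, which is mean + std · φ(z_α) / (1 − α) with z_α the α-quantile of the standard normal. The sketch below combines that formula with a Bonferroni split of a total risk budget. How the paper actually builds its per-pair score is not stated in the abstract, so `pair_scores`, the distance-based loss, and all numbers are assumptions.

```python
from statistics import NormalDist

def gaussian_cvar(mean, std, alpha):
    """Closed-form CVaR of a Gaussian loss: mean + std * pdf(z_alpha) / (1 - alpha)."""
    z = NormalDist().inv_cdf(alpha)
    return mean + std * NormalDist().pdf(z) / (1.0 - alpha)

def pair_scores(mean_dists, stds, safe_dist, delta):
    """Per-pair scores with a Bonferroni split of the total risk budget delta.

    The loss is taken as the negative robot-pedestrian distance along a fixed
    direction, so a nonnegative score means the tail-averaged distance still
    exceeds safe_dist. This construction is an assumption, not the paper's.
    """
    alpha = 1.0 - delta / len(mean_dists)  # each pair receives delta / n
    return [-safe_dist - gaussian_cvar(-m, s, alpha) for m, s in zip(mean_dists, stds)]

scores = pair_scores([3.0, 1.0], [0.2, 0.5], safe_dist=0.5, delta=0.05)
allowed = all(s >= 0 for s in scores)  # mask the action if any pair fails
print([round(s, 3) for s in scores], allowed)
```

Splitting the budget evenly across pairs is conservative by construction, which fits the over-conservatism the abstract reports in dense scenarios where many pairs must be checked at once.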

30 pages, 5128 KB

Open AccessArticle

GATE (Ground-Floor Architectural Typology at the Street Edge): A Multi-Resolution Morphometric Framework for Resolving Urban Vitality in a Mid-Sized Turkish City

by Nihansu Banu Albayrak Evren and Ömür Barkul

Buildings 2026, 16(12), 2342; https://doi.org/10.3390/buildings16122342 - 11 Jun 2026

Viewed by 234

Abstract

Urban vitality research treats food-and-beverage venues as aggregate point-of-interest counts, and existing morphometric classification frameworks operate at the building, block or neighbourhood scale, leaving the commercial ground-floor interface without a programme-specific typology. This study develops GATE (Ground-floor Architectural Typology at the street Edge), a 22-variable morphological framework operating at the venue–street–interface scale, and applies it to 85 interfaces across eleven commercial arteries in the core of a mid-sized Turkish city. Ward hierarchical clustering yields a single dendrogram read at macro (k = 3) and micro (k = 7) resolutions, validated through Kruskal–Wallis tests that separate 17 and 19 of the 22 variables, respectively. Three macro types emerge: narrow-fronted apartment-ground-floor venues, detached garden-plot pavilion venues and vertically organised transparent-fronted venues. In the aggregate, Space Syntax Integration, Choice and Shannon diversity show no significant relationship with pedestrian density. Type stratification points to a resolution-dependent moderator effect: the apartment-ground-floor type shows negative Integration and positive Choice coupling, while the transparent vertical type shows positive Integration coupling, producing a directional pattern consistent with Simpson’s paradox. GATE provides one of the first programme-specific venue-level morphological frameworks and establishes an explicit quantitative mapping of the Panerai–Castex analytical typology; future multi-city applications can test its generalisability.

(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)

29 pages, 3529 KB

Open AccessArticle

TrackRefine: A Plug-and-Play Decoupled Enhancement Framework for Online Multi-Object Tracking and Segmentation

by Longfei Qie, Chunlei Chai, Ruixue Wang, Chao Bi, Ruiqi Ma, Aijun Zhang and Jiakui Tang

Sensors 2026, 26(12), 3696; https://doi.org/10.3390/s26123696 - 10 Jun 2026

Viewed by 243

Abstract

Multi-object tracking and segmentation (MOTS) aims to jointly perform pixel-level instance segmentation and temporal identity association for multiple objects in video sequences. Existing online decoupled MOTS methods face several challenges in complex scenarios, including limited front-end mask quality, corruption of memory representations under prolonged occlusion, and unstable data association and trajectory recovery. To address these limitations, we propose TrackRefine, a plug-and-play decoupled enhancement framework. TrackRefine enhances overall performance through back-end refinement without modifying the architecture of the front-end instance segmenter or relying on additional end-to-end joint training. Specifically, we introduce a lightweight Fast GrabCut-based mask refinement module to optimize mask boundaries, a multimodal long-short-term memory bank that integrates appearance, semantic, and shape cues for identity modeling, and a progressive three-stage association strategy for stable matching and long-term trajectory recovery. On MOTS20, TrackRefine achieves 69.4 sMOTSA, 82.7 MOTSA, and 478 Frag; on KITTI MOTS, it achieves 62.4/73.7 sMOTSA and 78.0/85.4 MOTSA for pedestrians and cars, respectively. Extensive experiments with different front-end instance segmenters verify its plug-and-play flexibility and decoupled design, while ablation studies confirm the effectiveness of each core module. These results show that TrackRefine provides an efficient and practical solution for online MOTS in complex scenarios.

(This article belongs to the Special Issue Smart Remote Sensing Images Processing for Sensor-Based Applications)

24 pages, 10534 KB

Open AccessArticle

Trajectory-Driven Road Network Extraction via Coupled Multi-Level Grid Semantics

by Yunfei Zhang, Hongjie Zhu, Baifa Wu, Naisi Sun, Cuifeng Zhang, Tianyu Zhong and Chaoyang Shi

ISPRS Int. J. Geo-Inf. 2026, 15(6), 254; https://doi.org/10.3390/ijgi15060254 - 7 Jun 2026

Viewed by 236

Abstract

Road network extraction and updating are crucial for urban development, map updating, and mobility applications. Existing trajectory-based methods often underutilize grid-level semantic information and neighborhood context, thereby limiting their robustness to noisy, heterogeneous, and cross-city trajectory conditions. This study proposes a supervised framework for trajectory-driven road network extraction by coupling intra-grid movement semantics with inter-grid neighborhood context. Multi-level features, including convex-hull shape descriptors, directional clustering, Dynamic Time Warping (DTW)-based heterogeneity, and neighborhood density differences, are used to train a Random Forest classifier for key-grid detection. The detected key grids are further processed through morphology-aware thinning and Kalman smoothing to generate a topology-preserving and vectorization-ready road skeleton. The model is trained on pedestrian trajectories from Shenzhen and directly transferred to vehicle trajectories in Wuhan and Changsha under a zero-shot setting. Experimental results show that the proposed method achieves a greater correctly extracted road length and competitive length-based precision compared with raster-based reference methods, while feature-importance and ablation analyses confirm the complementary role of neighborhood context. The proposed pipeline is scalable, interpretable, and transferable, supporting trajectory-based road map updating and urban network analysis.
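
The abstract names Dynamic Time Warping (DTW) as the basis of one heterogeneity feature but does not say how it is computed. As a rough illustration of the underlying measure only, here is a minimal sketch of the classic DTW distance between two one-dimensional sequences (for example, heading angles sampled along two trajectories crossing the same grid cell). The function name `dtw_distance` and the absolute-difference local cost are assumptions, not details taken from the paper.

```python
def dtw_distance(a: list[float], b: list[float]) -> float:
    """Classic dynamic-programming DTW with |x - y| as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b[j-1] is reused
                cost[i][j - 1],      # b advances, a[i-1] is reused
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]
```

Because DTW allows one sample to align with several samples in the other sequence, two trajectories that follow the same shape at different speeds score a distance of zero, whereas a pointwise comparison would not even be defined for sequences of unequal length.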

Search Results (826)
