Search Results (998)

Search Parameters:
Keywords = depth vision

20 pages, 12851 KiB  
Article
Evaluation of a Vision-Guided Shared-Control Robotic Arm System with Power Wheelchair Users
by Breelyn Kane Styler, Wei Deng, Cheng-Shiu Chung and Dan Ding
Sensors 2025, 25(15), 4768; https://doi.org/10.3390/s25154768 - 2 Aug 2025
Abstract
Wheelchair-mounted assistive robotic manipulators can provide reach and grasp functions for power wheelchair users. This in-lab study evaluated a vision-guided shared control (VGS) system with twelve users completing two multi-step kitchen tasks: a drinking task and a popcorn-making task. Using a mixed-methods approach, participants compared VGS and manual joystick control, providing performance metrics, qualitative insights, and lessons learned. Data collection included demographic questionnaires, the System Usability Scale (SUS), the NASA Task Load Index (NASA-TLX), and exit interviews. No significant SUS differences were found between control modes, but NASA-TLX scores revealed that VGS control significantly reduced workload during both the drinking task and the popcorn task. VGS control reduced operation time and improved task success but was not universally preferred: six participants preferred VGS, five preferred manual control, and one had no preference. In addition, participants expressed interest in robotic arms for daily tasks and described two main operation challenges: distinguishing wrist orientation from rotation modes and managing depth perception. They also shared perspectives on how a personal robotic arm could complement caregiver support in their home.
(This article belongs to the Special Issue Intelligent Sensors and Robots for Ambient Assisted Living)

18 pages, 1910 KiB  
Article
Hierarchical Learning for Closed-Loop Robotic Manipulation in Cluttered Scenes via Depth Vision, Reinforcement Learning, and Behaviour Cloning
by Hoi Fai Yu and Abdulrahman Altahhan
Electronics 2025, 14(15), 3074; https://doi.org/10.3390/electronics14153074 - 31 Jul 2025
Abstract
Despite rapid advances in robot learning, the coordination of closed-loop manipulation in cluttered environments remains a challenging and relatively underexplored problem. We present a novel two-level hierarchical architecture for a depth-vision-equipped robotic arm that integrates pushing, grasping, and high-level decision making. Central to our approach is a prioritised action-selection mechanism that facilitates efficient early-stage learning via behaviour cloning (BC) while enabling scalable exploration through reinforcement learning (RL). A high-level decision neural network (DNN) selects between grasping and pushing actions, and two low-level action neural networks (ANNs) execute the selected primitive. The DNN is trained with RL, while the ANNs follow a hybrid learning scheme combining BC and RL. Notably, we introduce an automated demonstration generator based on oriented bounding boxes, eliminating the need for manual data collection and enabling precise, reproducible BC training signals. We evaluate our method on a challenging manipulation task involving five closely packed cubic objects. Our system achieves a completion rate (CR) of 100%, an average grasping success (AGS) of 93.1% per completion, and only 7.8 decisions taken for completion (DTC) on average. Comparative analysis against three baselines—a grasping-only policy, a fixed grasp-then-push sequence, and a cloned demonstration policy—highlights the necessity of dynamic decision making and the efficiency of our hierarchical design. In particular, the baselines yield a lower AGS (86.6%) and higher DTC (10.6 and 11.4), underscoring the advantages of content-aware, closed-loop control. These results demonstrate that our architecture supports robust, adaptive manipulation and scalable learning, offering a promising direction for autonomous skill coordination in complex environments.
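
The two-level dispatch described above is easy to picture in code. Below is a minimal PyTorch sketch of the idea, assuming illustrative network sizes and an (x, y, theta) action parameterization; it is not the authors' implementation, only the control flow the abstract describes: a high-level DNN picks the primitive, and the corresponding low-level ANN produces its parameters.

```python
import torch
import torch.nn as nn

class HighLevelDNN(nn.Module):
    """Chooses a primitive: 0 = grasp, 1 = push (illustrative sizes)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 2)

    def forward(self, depth):           # depth: (B, 1, H, W)
        return self.head(self.encoder(depth))

class ActionANN(nn.Module):
    """Maps the depth observation to primitive parameters (x, y, theta)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(16, 3)

    def forward(self, depth):
        return self.head(self.encoder(depth))

dnn, grasp_ann, push_ann = HighLevelDNN(), ActionANN(), ActionANN()

def act(depth):
    """One closed-loop step: high-level choice, then low-level parameters."""
    primitive = dnn(depth).argmax(dim=-1).item()
    params = (grasp_ann if primitive == 0 else push_ann)(depth)
    return ("grasp" if primitive == 0 else "push"), params

obs = torch.randn(1, 1, 96, 96)          # stand-in for a depth heightmap
name, params = act(obs)
print(name, params.shape)
```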

15 pages, 4667 KiB  
Article
Longitudinal High-Resolution Imaging of Retinal Sequelae of a Choroidal Nevus
by Kaitlyn A. Sapoznik, Stephen A. Burns, Todd D. Peabody, Lucie Sawides, Brittany R. Walker and Thomas J. Gast
Diagnostics 2025, 15(15), 1904; https://doi.org/10.3390/diagnostics15151904 - 29 Jul 2025
Abstract
Background: Choroidal nevi are common, benign tumors. These tumors rarely cause adverse retinal sequelae, but when they do, they can lead to disruption of the outer retina and vision loss. In this paper, we used two high-resolution retinal imaging modalities, optical coherence tomography (OCT) and adaptive optics scanning laser ophthalmoscopy (AOSLO), to longitudinally monitor retinal sequelae of a submacular choroidal nevus. Methods: A 31-year-old female with a high-risk choroidal nevus resulting in subretinal fluid (SRF) and a 30-year-old control subject were longitudinally imaged with AOSLO and OCT over 18 and 22 months, respectively. Regions of interest (ROIs), including the macular region (where SRF was present) and the site of laser photocoagulation, were imaged repeatedly over time. The depth of SRF in a discrete ROI was quantified with OCT, and AOSLO images were assessed for visualization of photoreceptors and retinal pigmented epithelium (RPE). Cell-like structures that infiltrated the site of laser photocoagulation were measured and their count was assessed over time. In the control subject, images were assessed for RPE visualization and the presence and stability of cell-like structures. Results: We demonstrate that AOSLO can be used to assess cellular-level changes at small ROIs in the retina over time. We show the response of the retina to SRF and laser photocoagulation. We demonstrate that the RPE can be visualized when SRF is present, which does not appear to depend on the height of retinal elevation. We also demonstrate that cell-like structures, presumably immune cells, are present within and adjacent to areas of SRF on both OCT and AOSLO, and that similar cell-like structures infiltrate areas of retinal laser photocoagulation. Conclusions: Our study demonstrates that dynamic, cellular-level retinal responses to SRF and laser photocoagulation can be monitored over time with AOSLO in living humans. Many retinal conditions exhibit similar findings, and laser photocoagulation is indicated in numerous retinal conditions. AOSLO imaging may provide future opportunities to better understand the clinical implications of such responses in vivo.
(This article belongs to the Special Issue High-Resolution Retinal Imaging: Hot Topics and Recent Developments)

17 pages, 1603 KiB  
Perspective
A Perspective on Quality Evaluation for AI-Generated Videos
by Zhichao Zhang, Wei Sun and Guangtao Zhai
Sensors 2025, 25(15), 4668; https://doi.org/10.3390/s25154668 - 28 Jul 2025
Abstract
Recent breakthroughs in AI-generated content (AIGC) have transformed video creation, empowering systems to translate text, images, or audio into visually compelling stories. Yet reliable evaluation of these machine-crafted videos remains elusive because quality is governed not only by spatial fidelity within individual frames but also by temporal coherence across frames and precise semantic alignment with the intended message. The foundational role of sensor technologies is critical, as they determine the physical plausibility of AIGC outputs. In this perspective, we argue that multimodal large language models (MLLMs) are poised to become the cornerstone of next-generation video quality assessment (VQA). By jointly encoding cues from multiple modalities such as vision, language, sound, and even depth, the MLLM can leverage its powerful language understanding capabilities to assess the quality of scene composition, motion dynamics, and narrative consistency, overcoming the fragmentation of hand-engineered metrics and the poor generalization ability of CNN-based methods. Furthermore, we provide a comprehensive analysis of current methodologies for assessing AIGC video quality, including the evolution of generation models, dataset design, quality dimensions, and evaluation frameworks. We argue that advances in sensor fusion enable MLLMs to combine low-level physical constraints with high-level semantic interpretations, further enhancing the accuracy of visual quality assessment.
(This article belongs to the Special Issue Perspectives in Intelligent Sensors and Sensing Systems)

29 pages, 868 KiB  
Article
Relationship Between Visual Acuity, Colour Vision, Contrast Sensitivity and Stereopsis, and Road Traffic Accidents: A Systematic Review and Meta-Analysis
by Diana García-Lozada, Fanny Rivera-Pinzón and Edgar Ibáñez-Pinilla
Safety 2025, 11(3), 71; https://doi.org/10.3390/safety11030071 - 28 Jul 2025
Abstract
The aim of this study was to evaluate the relationship between visual functions and road traffic accidents (RTAs) through a meta-analysis of observational studies. The analysis included all drivers of motor vehicles, regardless of age, and those using private or public transport. Self-reported visual outcomes were excluded. An increased risk of RTAs with reduced visual acuity was observed for commercial drivers in cross-sectional studies (PR 1.54, 95% CI 1.26–1.88), but not for private drivers in cohort (RR 1.04, 95% CI 0.74–1.46) or case–control studies (OR 1.04, 95% CI 0.78–1.40). A non-statistically significant association between colour vision defects and RTAs was observed in cross-sectional studies (PR 1.50, 95% CI 0.91–2.45). No evidence was found for an increased risk of accidents in people with reduced stereopsis. In older adults with abnormal contrast sensitivity, a weakly increased risk of RTAs was observed in cohort studies. Evidence from low-quality cross-sectional studies suggests an increased risk of RTAs among commercial drivers with reduced visual acuity. The few case–control and cohort studies identified did not show an association between accident occurrence and visual function. Attention needs to be paid to this issue to facilitate high-quality research that can support the development of road safety policies.
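
For readers unfamiliar with how effect sizes like those above are pooled, here is a generic inverse-variance, fixed-effect pooling step in Python. The study values are placeholders, not data from this review, and real meta-analyses (including this one) typically also assess heterogeneity and may use random-effects models:

```python
import math

# (OR, lower 95% CI, upper 95% CI) per study -- placeholder values only
studies = [(1.20, 0.90, 1.60), (1.50, 1.10, 2.05), (0.95, 0.70, 1.29)]

num, den = 0.0, 0.0
for or_, lo, hi in studies:
    log_or = math.log(or_)
    se = (math.log(hi) - math.log(lo)) / (2 * 1.96)   # SE recovered from the CI
    w = 1.0 / se**2                                   # inverse-variance weight
    num += w * log_or
    den += w

pooled_log = num / den
se_pooled = math.sqrt(1.0 / den)
ci = (math.exp(pooled_log - 1.96 * se_pooled),
      math.exp(pooled_log + 1.96 * se_pooled))
print(f"pooled OR = {math.exp(pooled_log):.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```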

19 pages, 1940 KiB  
Article
Linkages Between Sorghum bicolor Root System Architectural Traits and Grain Yield Performance Under Combined Drought and Heat Stress Conditions
by Alec Magaisa, Elizabeth Ngadze, Tshifhiwa P. Mamphogoro, Martin P. Moyo and Casper N. Kamutando
Agronomy 2025, 15(8), 1815; https://doi.org/10.3390/agronomy15081815 - 26 Jul 2025
Abstract
Breeding programs often overlook root traits. We therefore investigated the relevance of sorghum root traits in explaining its adaptation to combined drought and heat stress (CDHS). Six sorghum genotypes (three pre-release lines plus three checks) were established at two low-altitude (<600 masl) locations with a long-term history of very high average temperatures at the beginning of the summer season, under two management regimes (CDHS and well-watered (WW)). At each location, the genotypes were laid out in the field in a randomized complete block design (RCBD) with two replications. Root trait data, namely root diameter (RD), number of roots (NR), number of root tips (NRT), total root length (TRL), root depth (RDP), root width (RW), width–depth ratio (WDR), root network area (RNA), root solidity (RS), lower root area (LRA), root perimeter (RP), root volume (RV), surface area (SA), root holes (RH), and root angle (RA), were gathered using the RhizoVision Explorer software at the pre- and post-flowering stages of growth. Root system architecture (RSA) traits showed significant (p < 0.05) correlations with grain yield (GY) at both growth stages and under both CDHS and WW conditions, and genotypic variation estimates exceeded 50% for all traits. Regression models differed between the pre-flowering (p = 0.013, R2 = 47.15%, predicted R2 = 29.32%) and post-flowering (p = 0.000, R2 = 85.64%, predicted R2 = 73.30%) stages, indicating post-flowering as the optimal stage for relating root traits to yield performance. RD contributed most to the post-flowering regression model, explaining 51.79% of the 85.64% total variation. The Smith–Hazel index identified ICSV111IN and ASAREACA12-3-1 as superior pre-release lines, suitable for commercialization as new varieties. The study demonstrated that root traits (in particular RD, RW, and RP) are linked to crop performance under CDHS conditions and should be incorporated into breeding programs. This approach may accelerate genetic gains not only in sorghum but also in other crops, while offering a nature-based breeding strategy for stress adaptation.

21 pages, 2514 KiB  
Article
Investigations into Picture Defogging Techniques Based on Dark Channel Prior and Retinex Theory
by Lihong Yang, Zhi Zeng, Hang Ge, Yao Li, Shurui Ge and Kai Hu
Appl. Sci. 2025, 15(15), 8319; https://doi.org/10.3390/app15158319 - 26 Jul 2025
Abstract
To address contrast deterioration, detail loss, and color distortion in images captured under haze in scenarios such as intelligent driving and remote sensing, this paper proposes an image defogging algorithm that combines Retinex theory with the dark channel prior. The method builds a two-stage optimization framework: in the first stage, Retinex preprocessing achieves global contrast enhancement, which improves detail in dark areas and the accuracy of the transmittance map and atmospheric light estimates; in the second stage, a compensation model for the dark channel prior is constructed, and a depth-map-guided transmittance correction mechanism is introduced to obtain a refined transmittance map. At the same time, the atmospheric light intensity is computed accurately using the Otsu algorithm and edge constraints, which suppresses the halo artifacts and color deviation that dark channel prior defogging produces in sky regions. Experiments on self-collected data and public datasets show that the algorithm offers better detail preservation (the visible edge ratio improves by at least 0.1305) and color reproduction (the saturated pixel ratio drops to roughly 0) in subjective evaluation, and its average gradient ratio reaches a maximum of 3.8009, a 36–56% improvement over the classical DCP and Tarel algorithms. The method provides a robust image defogging solution for computer vision systems under complex meteorological conditions.
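
For context, the classical dark channel prior pipeline that this paper refines looks roughly as follows. This is a sketch of the standard He et al. DCP baseline, not the paper's two-stage Retinex-augmented method; the patch size, omega, and the atmospheric light heuristic are conventional defaults:

```python
import numpy as np
import cv2

def dark_channel(img, patch=15):
    """Per-pixel min over channels, then a min-filter over a local patch."""
    dc = img.min(axis=2)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (patch, patch))
    return cv2.erode(dc, kernel)

def defog_dcp(img, omega=0.95, t0=0.1):
    """Classical single-image defogging via the dark channel prior."""
    img = img.astype(np.float64) / 255.0
    dc = dark_channel(img)
    # Atmospheric light: mean colour of the brightest 0.1% dark-channel pixels
    n = max(1, int(dc.size * 0.001))
    idx = np.unravel_index(np.argsort(dc, axis=None)[-n:], dc.shape)
    A = img[idx].mean(axis=0)
    # Transmission estimate; omega < 1 keeps a little haze for depth perception
    t = 1.0 - omega * dark_channel(img / A)
    t = np.clip(t, t0, 1.0)[..., None]
    J = (img - A) / t + A                 # invert the haze imaging model
    return np.clip(J * 255, 0, 255).astype(np.uint8)

hazy = cv2.imread("hazy.jpg")
cv2.imwrite("defogged.jpg", defog_dcp(hazy))
```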

54 pages, 1242 KiB  
Review
Optical Sensor-Based Approaches in Obesity Detection: A Literature Review of Gait Analysis, Pose Estimation, and Human Voxel Modeling
by Sabrine Dhaouadi, Mohamed Moncef Ben Khelifa, Ala Balti and Pascale Duché
Sensors 2025, 25(15), 4612; https://doi.org/10.3390/s25154612 - 25 Jul 2025
Abstract
Optical sensor technologies are reshaping obesity detection by enabling non-invasive, dynamic analysis of biomechanical and morphological biomarkers. This review synthesizes recent advances in three key areas: optical gait analysis, vision-based pose estimation, and depth-sensing voxel modeling. Gait analysis leverages optical sensor arrays and video systems to identify obesity-specific deviations, such as reduced stride length and asymmetric movement patterns. Pose estimation algorithms—including markerless frameworks like OpenPose and MediaPipe—track kinematic patterns indicative of postural imbalance and altered locomotor control. Human voxel modeling reconstructs 3D body composition metrics, such as waist–hip ratio, through infrared depth sensing, offering precise, contactless anthropometry. Despite their potential, challenges persist in sensor robustness under uncontrolled environments, algorithmic bias across diverse populations, and scalability for widespread deployment in existing health workflows. Emerging solutions such as federated learning and edge computing aim to address these limitations by enabling multimodal data harmonization and portable, real-time analytics. Future priorities involve standardizing validation protocols to ensure reproducibility, optimizing cost-effectiveness for scalable deployment, and integrating optical systems with wearable technologies for holistic health monitoring. By shifting obesity diagnostics from static metrics to dynamic, multidimensional profiling, optical sensing paves the way for scalable public health interventions and personalized care strategies.
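
As a concrete taste of the markerless pose estimation the review covers, the sketch below uses MediaPipe to track ankle landmarks in a walking video and derive a crude, unit-free stride proxy. The video path and the peak-separation heuristic are illustrative assumptions, not a validated gait metric:

```python
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

def ankle_tracks(video_path):
    """Collect normalised left/right ankle x-positions frame by frame."""
    cap = cv2.VideoCapture(video_path)
    left, right = [], []
    with mp_pose.Pose(static_image_mode=False) as pose:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            res = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if res.pose_landmarks:
                lm = res.pose_landmarks.landmark
                left.append(lm[mp_pose.PoseLandmark.LEFT_ANKLE].x)
                right.append(lm[mp_pose.PoseLandmark.RIGHT_ANKLE].x)
    cap.release()
    return left, right

# Peak ankle separation as a crude, unit-free stride proxy
left, right = ankle_tracks("gait_clip.mp4")   # hypothetical clip
sep = [abs(l - r) for l, r in zip(left, right)]
print("max ankle separation (normalised):", max(sep))
```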

22 pages, 6487 KiB  
Article
An RGB-D Vision-Guided Robotic Depalletizing System for Irregular Camshafts with Transformer-Based Instance Segmentation and Flexible Magnetic Gripper
by Runxi Wu and Ping Yang
Actuators 2025, 14(8), 370; https://doi.org/10.3390/act14080370 - 24 Jul 2025
Abstract
Accurate segmentation of densely stacked, weakly textured objects remains a core challenge in robotic depalletizing for industrial applications. To address this, we propose MaskNet, an instance segmentation network tailored to RGB-D input and designed to enhance recognition under occlusion and low-texture conditions. Built upon a Vision Transformer backbone, MaskNet adopts a dual-branch architecture for the RGB and depth modalities and integrates multi-modal features using an attention-based fusion module. Spatial and channel attention mechanisms further refine the feature representation and improve instance-level discrimination. The segmentation outputs are combined with regional depth to optimize the grasping sequence. Experimental evaluations on camshaft depalletizing tasks show that MaskNet achieves a precision of 0.980, a recall of 0.971, and an F1-score of 0.975, outperforming a YOLO11-based baseline. In real-world trials with a self-designed flexible magnetic gripper, the system maintains a maximum grasping error of 9.85 mm and a 98% task success rate across multiple camshaft types. These results validate the effectiveness of MaskNet in enabling fine-grained perception for robotic manipulation in cluttered, real-world scenarios.
(This article belongs to the Section Actuators for Robotics)
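
The attention-based fusion of RGB and depth features can be sketched compactly. The module below is an illustrative channel-gated fusion in PyTorch under assumed tensor shapes; it is not MaskNet itself, whose exact architecture the paper defines:

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuses RGB and depth feature maps with learned channel attention."""
    def __init__(self, c):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(2 * c, c), nn.ReLU(),
            nn.Linear(c, 2 * c), nn.Sigmoid(),
        )
        self.proj = nn.Conv2d(2 * c, c, kernel_size=1)

    def forward(self, f_rgb, f_depth):           # each: (B, C, H, W)
        f = torch.cat([f_rgb, f_depth], dim=1)   # (B, 2C, H, W)
        w = self.gate(f)[..., None, None]        # per-channel weights in [0, 1]
        return self.proj(f * w)                  # back to (B, C, H, W)

fuse = AttentionFusion(c=64)
out = fuse(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32))
print(out.shape)  # torch.Size([2, 64, 32, 32])
```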

24 pages, 824 KiB  
Article
MMF-Gait: A Multi-Model Fusion-Enhanced Gait Recognition Framework Integrating Convolutional and Attention Networks
by Kamrul Hasan, Khandokar Alisha Tuhin, Md Rasul Islam Bapary, Md Shafi Ud Doula, Md Ashraful Alam, Md Atiqur Rahman Ahad and Md. Zasim Uddin
Symmetry 2025, 17(7), 1155; https://doi.org/10.3390/sym17071155 - 19 Jul 2025
Abstract
Gait recognition is a reliable biometric approach that identifies individuals from their natural walking patterns. It is widely used because gait is difficult to camouflage and does not require a person's cooperation. Face-based person recognition systems often fail to determine an offender's identity when the face is concealed with a helmet or mask to evade identification; in such cases, gait-based recognition is ideal for identifying offenders, and most existing work leverages deep learning (DL) models. However, a single model often fails to capture the full range of refined patterns in the input data in the presence of external factors such as variation in viewing angle, clothing, and carrying conditions. In response, this paper introduces a fusion-based multi-model gait recognition framework that leverages convolutional neural networks (CNNs) and a vision transformer (ViT) in an ensemble manner to enhance recognition performance. CNNs capture spatiotemporal features, while the ViT's multiple attention layers focus on particular regions of the gait image. The first step in the framework is to obtain the Gait Energy Image (GEI) by averaging a height-normalized gait silhouette sequence over a gait cycle, which captures the left–right symmetry of gait. The GEI is then fed through multiple pre-trained models that are fine-tuned to extract deep spatiotemporal features. Three fusion strategies are evaluated: decision-level fusion (DLF), which takes each model's decision and employs majority voting for the final decision; feature-level fusion (FLF), which combines the features from individual models through pointwise addition before recognition; and a hybrid fusion that combines DLF and FLF. The framework was evaluated on three publicly available gait databases: CASIA-B, OU-ISIR D, and the OU-ISIR Large Population dataset. The experimental results demonstrate that the fusion-enhanced framework achieves superior performance.
(This article belongs to the Special Issue Symmetry and Its Applications in Image Processing)
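
The GEI and the two fusion strategies are simple enough to show directly. A minimal NumPy sketch, assuming binary, height-normalized silhouettes and placeholder model outputs:

```python
import numpy as np

def gait_energy_image(silhouettes):
    """GEI: average of height-normalised binary silhouettes over one gait cycle."""
    stack = np.stack([s.astype(np.float32) for s in silhouettes])  # (T, H, W), {0, 1}
    return stack.mean(axis=0)                                      # (H, W), [0, 1]

def decision_level_fusion(predictions):
    """Majority vote over per-model predicted identity labels."""
    values, counts = np.unique(np.asarray(predictions), return_counts=True)
    return values[counts.argmax()]

def feature_level_fusion(features):
    """Pointwise addition of per-model embedding vectors."""
    return np.sum(np.stack(features), axis=0)

cycle = [np.random.randint(0, 2, (128, 88)) for _ in range(30)]  # fake silhouettes
gei = gait_energy_image(cycle)
print(gei.shape, decision_level_fusion([3, 3, 7]))
```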

20 pages, 8104 KiB  
Article
Energy Consumption Analysis of Using Mashrabiya as a Retrofit Solution for a Residential Apartment in Al Ain Square, Al Ain, UAE
by Lindita Bande, Anwar Ahmad, Saada Al Mansoori, Waleed Ahmed, Amna Shibeika, Shama Anbrine and Abdul Rauf
Buildings 2025, 15(14), 2532; https://doi.org/10.3390/buildings15142532 - 18 Jul 2025
Abstract
Al Ain is a fast-developing city. With building typology varying from low-rise to mid-rise, sustainable building design is needed. While the majority of the city's population are Emirati citizens, the percentage of expats is increasing, and expats tend to live in mid-rise buildings. One of the central mid-rise areas is Al Ain Square. This study investigates how an optimized mashrabiya pattern can affect the energy consumption and Predicted Mean Vote (PMV) of a south-facing, three-bedroom apartment occupied by an expat family. The methodology is as follows: case study selection, weather analysis, modeling and validation of the base case scenario, optimization of the mashrabiya pattern, simulation of various scenarios, and results. Analyzing the selected case study is the initial step, beginning with the district, the building typology, and the chosen apartment. The weather analysis motivates the use of the mashrabiya (screen device) and the need to improve energy consumption and thermal comfort. The base case is modeled in Rhino Grasshopper and validated against one year of electricity bills provided by the owner. The optimization of mashrabiya patterns is an innovative process in which various designs are compared and optimized to select the most efficient pattern; simulating the selected scenarios then identifies the optimal one. This study is relevant to industry, academia, and local authorities as an innovative approach to retrofitting buildings. The results suggest that optimized mashrabiya patterns can significantly enhance energy savings, with the hexagonal grid configuration demonstrating the highest efficiency, highlighting the potential for geometry-driven shading optimization tailored to specific climatic and building conditions. In contrast to earlier mashrabiya studies that assess one static pattern, we couple a geometry-agnostic evolutionary solver with a utility-calibrated EnergyPlus model to test thousands of square, hexagonal, and triangular permutations. This workflow uncovers a previously undocumented non-linear depth–perforation interaction and validates a hexagonal screen that reduces annual cooling energy by 12.3%, establishing a replicable, grid-specific retrofit method for hot-arid apartments.

21 pages, 3577 KiB  
Article
Branding Cities Through Architecture: Identify, Formulate, and Communicate the City Image of Amman, Jordan
by Yamen N. Al-Betawi and Heba B. Abu Ehmaid
Architecture 2025, 5(3), 50; https://doi.org/10.3390/architecture5030050 - 18 Jul 2025
Abstract
This research explores the role of architecture in creating an identifiable brand for Amman. It seeks to put forward a vision through which the city of Amman can formulate a clear model for implementing a successful branding strategy. To that end, it examines the concepts associated with branding, city image, and identity, and the extent to which such ideas can be implemented in Amman. The study adopted an inductive approach using in-depth, semi-structured interviews with 35 experts who play central roles in articulating the key values that best reflect the city's identity. A thematic analysis was conducted along theoretical lines, covering the city's message, strategies for formulating the brand, and communication via architecture. The image of Amman shows an obvious split between its historical character and modern global styles, and the city suffers from disorder within its architectural landscape. Amman needs to rethink its identity in order to create a new brand that keeps pace with the times without losing the originality of the place. This calls for re-evaluating the role of iconic buildings and their relationship to their surroundings, enabling them to assume a significant presence, both symbolically and operationally, in expressing the city's personality and promoting its message.

22 pages, 11043 KiB  
Article
Digital Twin-Enabled Adaptive Robotics: Leveraging Large Language Models in Isaac Sim for Unstructured Environments
by Sanjay Nambiar, Rahul Chiramel Paul, Oscar Chigozie Ikechukwu, Marie Jonsson and Mehdi Tarkian
Machines 2025, 13(7), 620; https://doi.org/10.3390/machines13070620 - 17 Jul 2025
Abstract
As industrial automation evolves towards human-centric, adaptable solutions, collaborative robots must overcome challenges in unstructured, dynamic environments. This paper extends our previous work on developing a digital shadow for industrial robots by introducing a comprehensive framework that bridges the gap between physical systems and their virtual counterparts. The proposed framework advances toward a fully functional digital twin by integrating real-time perception and intuitive human–robot interaction capabilities. The framework is applied to a hospital test lab scenario, where a YuMi robot automates the sorting of microscope slides. The system incorporates a RealSense D435i depth camera for environment perception, Isaac Sim for virtual environment synchronization, and a locally hosted large language model (Mistral 7B) for interpreting user voice commands. These components work together to achieve bi-directional synchronization between the physical and digital environments. The framework was evaluated through 20 test runs under varying conditions. A validation study measured the performance of the perception module, simulation, and language interface, with a 60% overall success rate. Additionally, synchronization accuracy between the simulated and physical robot joint movements reached 98.11%, demonstrating strong alignment between the digital and physical systems. By combining local LLM processing, real-time vision, and robot simulation, the approach enables untrained users to interact with collaborative robots in dynamic settings. The results highlight its potential for improving flexibility and usability in industrial automation.
(This article belongs to the Topic Smart Production in Terms of Industry 4.0 and 5.0)
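
The voice-to-action step can be sketched as a thin client around a local model. Everything below is an assumption for illustration: the endpoint URL, the response schema, and the action vocabulary are hypothetical, since the paper does not publish its Mistral 7B serving interface:

```python
import json
import requests

# Hypothetical local endpoint; llama.cpp, Ollama, and vLLM each expose their own API.
LLM_URL = "http://localhost:8000/generate"

PROMPT = """Convert the user's spoken request into JSON with keys
"action" (one of: pick, place, sort) and "target" (a slide tray id).
Request: "{utterance}"
JSON:"""

def parse_command(utterance):
    """Ask the local model for a structured robot command and validate it."""
    resp = requests.post(LLM_URL, json={"prompt": PROMPT.format(utterance=utterance)})
    resp.raise_for_status()
    cmd = json.loads(resp.json()["text"])   # assumes the server returns {"text": ...}
    assert cmd["action"] in {"pick", "place", "sort"}, "unsupported action"
    return cmd

print(parse_command("please sort the slides into tray two"))
```

Constraining the model to a small JSON schema and validating the output before it reaches the robot is what keeps a free-form language interface safe to wire into a control loop.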

21 pages, 9749 KiB  
Article
Enhanced Pose Estimation for Badminton Players via Improved YOLOv8-Pose with Efficient Local Attention
by Yijian Wu, Zewen Chen, Hongxing Zhang, Yulin Yang and Weichao Yi
Sensors 2025, 25(14), 4446; https://doi.org/10.3390/s25144446 - 17 Jul 2025
Abstract
With the rapid development of sports analytics and artificial intelligence, accurate human pose estimation in badminton is becoming increasingly important. However, challenges such as the lack of domain-specific datasets and the complexity of athletes' movements continue to hinder progress in this area. To address these issues, we propose an enhanced pose estimation framework tailored to badminton players, built upon an improved YOLOv8-Pose architecture. In particular, we introduce an efficient local attention (ELA) mechanism that effectively captures fine-grained spatial dependencies and contextual information, thereby significantly improving the keypoint localization accuracy and overall pose estimation performance. To support this study, we construct a dedicated badminton pose dataset comprising 4000 manually annotated samples, captured using a Microsoft Kinect v2 camera. The raw data undergo careful processing and refinement through a combination of depth-assisted annotation and visual inspection to ensure high-quality ground truth keypoints. Furthermore, we conduct an in-depth comparative analysis of multiple attention modules and their integration strategies within the network, offering generalizable insights to enhance pose estimation models in other sports domains. The experimental results show that the proposed ELA-enhanced YOLOv8-Pose model consistently achieves superior accuracy across multiple evaluation metrics, including the mean squared error (MSE), object keypoint similarity (OKS), and percentage of correct keypoints (PCK), highlighting its effectiveness and potential for broader applications in sports vision tasks.
(This article belongs to the Special Issue Computer Vision-Based Human Activity Recognition)
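
An ELA-style block can be sketched in a few lines of PyTorch. This follows the published ELA recipe (directional pooling plus depthwise 1D convolutions and a sigmoid gate), but the kernel size, normalization, and placement inside YOLOv8-Pose are assumptions, not the authors' exact configuration:

```python
import torch
import torch.nn as nn

class EfficientLocalAttention(nn.Module):
    """ELA-style attention: directional pooling + depthwise 1D convs yield
    per-row and per-column gates that reweight the feature map."""
    def __init__(self, channels, k=7):
        super().__init__()
        self.conv = nn.Conv1d(channels, channels, k, padding=k // 2, groups=channels)
        self.norm = nn.GroupNorm(16, channels)  # channels must divide by 16 here
        self.act = nn.Sigmoid()

    def forward(self, x):                       # x: (B, C, H, W)
        b, c, h, w = x.shape
        xh = x.mean(dim=3)                      # pool over width  -> (B, C, H)
        xw = x.mean(dim=2)                      # pool over height -> (B, C, W)
        ah = self.act(self.norm(self.conv(xh))).view(b, c, h, 1)
        aw = self.act(self.norm(self.conv(xw))).view(b, c, 1, w)
        return x * ah * aw                      # gated feature map

ela = EfficientLocalAttention(64)
print(ela(torch.randn(2, 64, 48, 48)).shape)   # torch.Size([2, 64, 48, 48])
```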

19 pages, 5755 KiB  
Article
A Context-Aware Doorway Alignment and Depth Estimation Algorithm for Assistive Wheelchairs
by Shanelle Tennekoon, Nushara Wedasingha, Anuradhi Welhenge, Nimsiri Abhayasinghe and Iain Murray
Computers 2025, 14(7), 284; https://doi.org/10.3390/computers14070284 - 17 Jul 2025
Abstract
Navigating through doorways remains a daily challenge for wheelchair users, often leading to frustration, collisions, or dependence on assistance. These challenges highlight a pressing need for intelligent doorway detection algorithms for assistive wheelchairs that go beyond traditional object detection. This study presents the algorithmic development of a lightweight, vision-based doorway detection and alignment module with contextual awareness. It integrates channel and spatial attention, semantic feature fusion, unsupervised depth estimation, and doorway alignment that offers real-time navigational guidance to the wheelchair's control system. The model achieved a mean average precision of 95.8% and an F1 score of 93% while maintaining computational demands low enough for future deployment on embedded systems. By eliminating the need for depth sensors and enabling contextual awareness, this study offers a robust solution for improving indoor mobility and delivering actionable feedback to support safe, independent doorway traversal for wheelchair users.
(This article belongs to the Special Issue AI for Humans and Humans for AI (AI4HnH4AI))
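
The alignment guidance itself reduces to simple geometry once a doorway box and depth estimate are available. A hedged sketch, assuming a pinhole-style pixels-to-degrees conversion and an illustrative field of view; the thresholds and messages are placeholders, not the paper's controller:

```python
def alignment_guidance(box, frame_width, depth_m, fov_deg=87.0):
    """Turn a detected doorway bounding box into a steering hint.

    box: (x_min, x_max) of the doorway in pixels; depth_m: estimated distance;
    fov_deg: assumed horizontal field of view of the camera.
    """
    door_cx = 0.5 * (box[0] + box[1])
    offset_px = door_cx - frame_width / 2        # + means door is to the right
    deg_per_px = fov_deg / frame_width
    heading_err = offset_px * deg_per_px         # degrees to rotate
    if abs(heading_err) < 2.0:
        return f"aligned; move forward {depth_m:.1f} m"
    side = "right" if heading_err > 0 else "left"
    return f"rotate {abs(heading_err):.1f} deg {side}"

print(alignment_guidance((500, 740), frame_width=1280, depth_m=1.8))
```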