MDPI - Publisher of Open Access Journals

22 pages, 92348 KB

Open AccessArticle

Robust Self-Supervised Monocular Depth Estimation via Intrinsic Albedo-Guided Multi-Task Learning

by Genki Higashiuchi, Tomoyasu Shimada, Xiangbo Kong and Hiroyuki Tomiyama

Appl. Sci. 2026, 16(2), 714; https://doi.org/10.3390/app16020714 - 9 Jan 2026

Self-supervised monocular depth estimation has demonstrated high practical utility, as it can be trained using a photometric image reconstruction loss between the original image and a reprojected image generated from the estimated depth and relative pose, thereby alleviating the burden of large-scale label creation. However, this photometric image reconstruction loss relies on the Lambertian reflectance assumption. Under non-Lambertian conditions such as specular reflections or strong illumination gradients, pixel values fluctuate depending on the lighting and viewpoint, which often misguides training and leads to large depth errors. To address this issue, we propose a multitask learning framework that integrates albedo estimation as a supervised auxiliary task. The proposed framework is implemented on top of representative self-supervised monocular depth estimation backbones, including Monodepth2 and Lite-Mono, by adopting a multi-head architecture in which the shared encoder–decoder branches at each upsampling block into a Depth Head and an Albedo Head. Furthermore, we apply Intrinsic Image Decomposition to generate albedo images and design an albedo supervision loss that uses these albedo maps as training targets for the Albedo Head. We then integrate this loss term into the overall training objective, explicitly exploiting illumination-invariant albedo components to suppress erroneous learning in reflective regions and areas with strong illumination gradients. Experiments on the ScanNetV2 dataset demonstrate that, for the lightweight backbone Lite-Mono, our method achieves an average reduction of 18.5% over the four standard depth error metrics and consistently improves accuracy metrics, without increasing the number of parameters and FLOPs at inference time. Full article
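The training objective described in this abstract, a photometric reconstruction term plus a supervised albedo term, can be sketched roughly as follows. This is an illustration only: the plain L1 form of both terms, the function names, and the `albedo_weight` value are assumptions, since the abstract does not specify the exact loss definitions or weighting.

```python
def l1_mean(a, b):
    """Mean absolute difference between two equally sized flat pixel lists."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def total_loss(target, reprojected, albedo_pred, albedo_target, albedo_weight=0.1):
    """Photometric term plus a weighted albedo supervision term.

    albedo_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    # Self-supervised part: compare the original image with the image
    # reprojected from the estimated depth and relative pose.
    photometric = l1_mean(target, reprojected)
    # Supervised auxiliary part: compare the Albedo Head output with the
    # albedo map produced by intrinsic image decomposition.
    albedo = l1_mean(albedo_pred, albedo_target)
    return photometric + albedo_weight * albedo
```

Because the albedo term only enters the training objective, a head used solely for this term can be dropped at inference, which is consistent with the abstract's claim of unchanged inference cost.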

(This article belongs to the Special Issue Convolutional Neural Networks and Computer Vision)

21 pages, 21514 KB

Open AccessArticle

Robust Geometry–Hue Point Cloud Registration via Hybrid Adaptive Residual Optimization

by Yangmin Xie, Jinghan Zhang, Rijian Xu and Hang Shi

ISPRS Int. J. Geo-Inf. 2026, 15(1), 22; https://doi.org/10.3390/ijgi15010022 - 4 Jan 2026

Viewed by 118

Abstract

Accurate point cloud registration is a fundamental prerequisite for reality-based 3D reconstruction and large-scale spatial modeling. Despite significant international progress, reliable registration in architectural and urban scenes remains challenging due to geometric intricacies arising from repetitive and strongly symmetric structures and photometric variability caused by illumination inconsistencies. Conventional ICP-based and color-augmented methods often suffer from local convergence and color drift, limiting their robustness in large-scale real-world applications. To address these challenges, we propose Hybrid Adaptive Residual Optimization (HARO), a unified framework that organically integrates geometric cues with hue-robust color features. Specifically, RGB data are transformed into a decoupled HSV representation with histogram-matched hue correction applied in overlapping regions, enabling illumination-invariant color modeling. Furthermore, a novel adaptive residual kernel dynamically balances geometric and chromatic constraints, ensuring stable convergence even in structurally complex or partially overlapping scenes. Extensive experiments conducted on diverse real-world datasets, including Subway, Railway, urban, and Office environments, demonstrate that HARO consistently achieves sub-degree rotational accuracy (0.11°) and negligible translation errors relative to the scene scale. These results indicate that HARO provides an effective and generalizable solution for large-scale point cloud registration, successfully bridging geometric complexity and photometric variability in reality-based reconstruction tasks. Full article
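A rough sketch of why a hue-based color term is less sensitive to brightness changes than raw RGB, and how it might be blended with a geometric residual. The fixed `geo_weight` is a stand-in assumption: the abstract describes an adaptive kernel whose form is not given, and the histogram-matched hue correction is omitted here.

```python
import colorsys


def hue_of(rgb):
    """Hue in [0, 1) of an RGB triple with components in [0, 1]."""
    r, g, b = rgb
    return colorsys.rgb_to_hsv(r, g, b)[0]


def hue_residual(rgb_a, rgb_b):
    """Circular hue difference in [0, 0.5]; brightness (V) is ignored."""
    d = abs(hue_of(rgb_a) - hue_of(rgb_b))
    return min(d, 1.0 - d)


def hybrid_residual(geo_residual, rgb_a, rgb_b, geo_weight=0.7):
    """Blend a geometric residual with a hue residual.

    geo_weight is a hypothetical constant; HARO balances the two terms
    dynamically, which this sketch does not attempt to reproduce.
    """
    return geo_weight * geo_residual + (1.0 - geo_weight) * hue_residual(rgb_a, rgb_b)
```

For example, a bright red point and a dim red point have different RGB values but the same hue, so their hue residual is zero.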

(This article belongs to the Topic 3D Computer Vision and Smart Building and City, 3rd Edition)

26 pages, 8467 KB

Open AccessArticle

Low-Light Pose-Action Collaborative Network for Industrial Monitoring in Power Systems

by Qifeng Luo, Heng Zhou, Mianting Wu and Qiang Zhou

Electronics 2026, 15(1), 199; https://doi.org/10.3390/electronics15010199 - 1 Jan 2026

Viewed by 190

Abstract

Recognizing human actions in low-light industrial environments remains a significant challenge for safety-critical applications in power systems. In this paper, we propose a Low-Light Pose-Action Collaborative Network (LPAC-Net), an integrated framework specifically designed for monitoring scenarios in underground electrical vaults and smart power stations. The pipeline begins with a modified Zero-DCE++ module for reference-free illumination correction, followed by pose extraction using YOLO-Pose and a novel rotation-invariant encoding of keypoints optimized for confined industrial spaces. Temporal dependencies are captured through a bidirectional LSTM network with attention mechanisms to model complex operational behaviors. We evaluate LPAC-Net on the newly curated ARID-Fall dataset, enhanced with industrial monitoring scenarios representative of electrical infrastructure environments. Experimental results demonstrate that our method outperforms state-of-the-art models, including DarkLight-R101, DTCM, FRAGNet, and URetinex-Net++, achieving 95.53% accuracy in recognizing worker activities and safety-critical events. Additional studies confirm LPAC-Net’s robustness under keypoint noise and motion blur, highlighting its practical value for intelligent monitoring in challenging industrial lighting conditions typical of underground electrical facilities and automated power stations. Full article
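The abstract does not describe how its rotation-invariant keypoint encoding is built. As a generic illustration of the idea only, normalized pairwise distances between 2D keypoints are unchanged by in-plane rotation and translation; the function names below are hypothetical and this is not claimed to be the LPAC-Net encoding.

```python
import math


def pairwise_distance_encoding(keypoints):
    """Encode 2D keypoints as pairwise distances divided by the largest one.

    The result does not change under in-plane rotation, translation,
    or uniform scaling of the pose.
    """
    n = len(keypoints)
    dists = [
        math.dist(keypoints[i], keypoints[j])
        for i in range(n)
        for j in range(i + 1, n)
    ]
    scale = max(dists) if dists else 0.0
    if scale == 0.0:
        return [0.0] * len(dists)
    return [d / scale for d in dists]


def rotate(points, angle):
    """Rotate 2D points about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]
```

Encoding a pose and a rotated copy of it yields the same vector up to floating-point error, which is the property a downstream temporal model such as the BiLSTM would rely on.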

(This article belongs to the Special Issue AI Applications for Smart Grid)

24 pages, 14385 KB

Open AccessArticle

LDFE-SLAM: Light-Aware Deep Front-End for Robust Visual SLAM Under Challenging Illumination

by Cong Liu, You Wang, Weichao Luo and Yanhong Peng

Machines 2026, 14(1), 44; https://doi.org/10.3390/machines14010044 - 29 Dec 2025

Viewed by 218

Abstract

Visual SLAM systems face significant performance degradation under dynamic lighting conditions, where traditional feature extraction methods suffer from reduced keypoint detection and unstable matching. This paper presents LDFE-SLAM, a novel visual SLAM framework that addresses illumination challenges through a Light-Aware Deep Front-End (LDFE) architecture. Our key insight is that low-light degradation in SLAM is fundamentally a geometric feature distribution problem rather than merely a visibility issue. The proposed system integrates three synergistic components: (1) an illumination-adaptive enhancement module based on EnlightenGAN with geometric consistency loss that restores gradient structures for downstream feature extraction, (2) SuperPoint-based deep feature detection that provides illumination-invariant keypoints, and (3) LightGlue attention-based matching that filters enhancement-induced noise while maintaining geometric consistency. Through systematic evaluation of five method configurations (M1–M5), we demonstrate that enhancement, deep features, and learned matching must be co-designed rather than independently optimized. Experiments on EuRoC and TUM sequences under synthetic illumination degradation show that LDFE-SLAM maintains stable localization accuracy (∼1.2 m ATE) across all brightness levels, while baseline methods degrade significantly (up to 3.7 m). Our method operates normally down to severe lighting conditions (30% ambient brightness and 20–50 lux—equivalent to underground parking or night-time streetlight illumination), representing a 4–6× lower illumination threshold compared to ORB-SLAM3 (200–300 lux minimum). Under severe (25% brightness) conditions, our method achieves a 62% tracking success rate, compared to 12% for ORB-SLAM3, with keypoint detection remaining above the critical 100-point threshold, even under extreme degradation. Full article
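The ATE figures quoted above are typically reported as a root-mean-square error over matched trajectory positions. A minimal sketch, assuming the estimated and ground-truth trajectories are already time-associated and aligned in a common frame (the alignment step used in standard evaluations is omitted here):

```python
import math


def ate_rmse(estimated, ground_truth):
    """Root-mean-square translational error over matched 3D positions.

    Assumes both trajectories are already associated by timestamp and
    expressed in the same coordinate frame.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared = [math.dist(e, g) ** 2 for e, g in zip(estimated, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))
```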

(This article belongs to the Special Issue Robotic Intelligence Development of AI in Robot Perception, Learning, and Decision)

19 pages, 3834 KB

Open AccessArticle

Chamber-Reflection-Aware Image Enhancement Method for Powder Spreading Quality Inspection in Selective Laser Melting

by Zhenxing Huang, Changfeng Yan and Siwei Yang

Appl. Sci. 2026, 16(1), 203; https://doi.org/10.3390/app16010203 - 24 Dec 2025

Viewed by 301

Abstract

In selective laser melting (SLM), real-time visual inspection of powder spreading quality is essential for maintaining dimensional accuracy and mechanical performance. However, reflections from metallic chamber walls introduce non-uniform illumination and reduce local contrast, hindering reliable defect detection. To overcome this problem, a chamber-reflection-aware image enhancement method is proposed, integrating a physical reflection model with a dual-channel deep network. A Gaussian-based curved-surface reflection model is first developed to describe the spatial distribution of reflective interference. The enhancement network then processes the input through two complementary channels: a Retinex-based branch to extract illumination-invariant reflectance components and a principal components analysis (PCA)-based branch to preserve structural information. Furthermore, a noise-aware loss function is designed to suppress the mixed Gaussian–Poisson noise that is inherent in SLM imaging. Experiments conducted on real SLM monitoring data demonstrate that the proposed method significantly improves contrast and defect visibility, outperforming existing enhancement algorithms in peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and natural image quality evaluator (NIQE). The approach provides a physically interpretable and robust preprocessing framework for online SLM quality monitoring. Full article
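Of the metrics listed, PSNR is the simplest to state. A short sketch of the standard definition, assuming 8-bit images flattened to lists of pixel values (the `peak` default of 255 is that assumption):

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel lists."""
    if len(reference) != len(test) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Higher PSNR and SSIM indicate closer agreement with a reference image, whereas NIQE is a no-reference score for which lower values are better.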

(This article belongs to the Topic Applied Computer Vision and Pattern Recognition: 2nd Edition)

25 pages, 3067 KB

Open AccessArticle

Lightweight Attention-Augmented YOLOv5s for Accurate and Real-Time Fall Detection in Elderly Care Environments

by Bibo Yang, Lan Thi Nguyen and Wirapong Chansanam

Sensors 2025, 25(23), 7365; https://doi.org/10.3390/s25237365 - 3 Dec 2025

Viewed by 509

Abstract

Falls among the elderly represent a leading cause of injury and mortality worldwide, necessitating reliable and real-time monitoring solutions. This study aims to develop a lightweight, accurate, and efficient fall detection framework based on an improved YOLOv5s model. The proposed architecture incorporates a Convolutional Block Attention Module (CBAM) to enhance salient feature extraction, optimizes multi-scale feature fusion in the Neck for better small-object detection, and re-clusters anchor boxes tailored to the horizontal morphology of elderly falls. A multi-scene dataset comprising 11,314 images was constructed to evaluate performance under diverse lighting, occlusion, and spatial conditions. Experimental results demonstrate that the improved YOLOv5s achieves a mean average precision (mAP@0.5) of 94.2%, a recall of 92.5%, and a false alarm rate of 4.2%, outperforming baseline YOLOv5s and YOLOv4 models while maintaining real-time detection speed at 32 FPS. These findings confirm that integrating attention mechanisms, adaptive fusion, and anchor optimization significantly enhances robustness and generalization. Although performance slightly declines under extreme lighting or heavy occlusion, this limitation highlights future opportunities for multimodal fusion and illumination-invariant modeling. Overall, the study contributes a scalable and deployable AI framework that bridges the gap between algorithmic innovation and real-world elderly care applications, advancing intelligent and non-intrusive safety monitoring in aging societies. Full article
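Anchor re-clustering is commonly done with k-means over the (width, height) pairs of the labeled boxes, using IoU as the similarity. The sketch below follows that general recipe; it is an assumption that the study used something similar, and the deterministic initialization is chosen here only to keep the example reproducible.

```python
def wh_iou(box, anchor):
    """IoU of two boxes given as (width, height), both anchored at the same corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def cluster_anchors(boxes, k, iterations=50):
    """k-means over (width, height) pairs, assigning each box to its highest-IoU anchor."""
    # Deterministic start: pick k boxes spread over the area-sorted list.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    anchors = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, anchors[i]))
            groups[best].append(box)
        # New anchor = mean width and height of its group (kept as-is if empty).
        updated = [
            (sum(b[0] for b in g) / len(g), sum(b[1] for b in g) / len(g)) if g else anchors[i]
            for i, g in enumerate(groups)
        ]
        if updated == anchors:
            break
        anchors = updated
    return anchors
```

With a dataset dominated by fallen (wide, short) figures, at least some of the resulting anchors end up wider than they are tall, which is the "horizontal morphology" the abstract refers to.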

(This article belongs to the Section Physical Sensors)

29 pages, 3446 KB

Open AccessArticle

QRetinex-Net: A Quaternion Retinex Framework for Bio-Inspired Color Constancy

by Sos Agaian and Vladimir Frants

Appl. Sci. 2025, 15(22), 12336; https://doi.org/10.3390/app152212336 - 20 Nov 2025

Viewed by 453

Abstract

Color constancy, the ability to perceive consistent object colors under varying illumination, is a core function of the human visual system and a persistent challenge in machine vision. Retinex theory models this process by decomposing an image $S$ into reflectance ($R$) and illumination ($I$) components ($S' = R I$). However, conventional Retinex methods suffer from key limitations: independent RGB processing that disrupts inter-channel correlations, weak grounding in color perception models, non-invertible decomposition ($S' \neq S$), and limited biological plausibility. We propose QRetinex-Net, a unified Retinex framework formulated in the quaternion domain, $S = R \otimes I$, where $\otimes$ denotes the Hamilton product. Representing RGB channels as pure quaternions enables holistic color processing, biologically inspired modeling, and invertible image reconstruction. We further introduce the Reflectance Consistency Index (RCI) to quantitatively assess illumination invariance and reflectance stability. Experiments on low-light crack detection, infrared–visible fusion, and face detection under varying lighting demonstrate that QRetinex-Net outperforms RetinexNet, KIND++, U-RetinexNet, and Diff-Retinex, achieving up to 11% performance gains, LPIPS ≈ 0.0001, and RCI ≈ 0.988. Full article
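To make the Hamilton product concrete, the sketch below multiplies quaternions stored as `(w, x, y, z)` tuples and encodes an RGB pixel as a pure quaternion `(0, r, g, b)`. It also shows why a quaternion factorization can be inverted exactly when the reflectance quaternion is nonzero. The helper names and sample pixel values are illustrative assumptions, not code from the paper.

```python
def hamilton(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def inverse(q):
    """Multiplicative inverse: conjugate divided by squared norm."""
    norm_sq = sum(c * c for c in q)
    if norm_sq == 0:
        raise ZeroDivisionError("zero quaternion has no inverse")
    w, x, y, z = q
    return (w / norm_sq, -x / norm_sq, -y / norm_sq, -z / norm_sq)


def pure(rgb):
    """Encode an RGB triple as a pure quaternion (zero real part)."""
    r, g, b = rgb
    return (0.0, r, g, b)


# Hypothetical reflectance and illumination pixels.
R = pure((0.5, 0.25, 0.5))
I = pure((0.8, 0.8, 0.8))
S = hamilton(R, I)                   # compose: S = R (x) I
I_back = hamilton(inverse(R), S)     # recover I from S and R
```

Note that the product of two pure quaternions generally has a nonzero real part; how QRetinex-Net handles that component is not stated in the abstract.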

(This article belongs to the Section Computing and Artificial Intelligence)

26 pages, 7162 KB

Open AccessArticle

A Fractional-Order SSIM-Based Gaussian Loss with Long-Range Memory for Dense VSLAM

by Junyang Zhao, Huixin Zhu, Zhili Zhang, Mingtao Feng, Han Yu and Yuxuan Li

Fractal Fract. 2025, 9(11), 744; https://doi.org/10.3390/fractalfract9110744 - 17 Nov 2025

Cited by 1 | Viewed by 856

Abstract

In dense visual simultaneous localization and mapping (VSLAM), a fundamental challenge is that existing loss functions cannot dynamically balance luminance, contrast, and structural fidelity under photometric variations. Their underlying mechanisms, particularly the conventional Gaussian kernel in SSIM, also suffer from limited receptive fields due to rapid exponential decay, which prevents them from capturing the long-range dependencies essential for global consistency. To address this, we propose a fractional Gaussian field (FGF) that synergizes Caputo derivatives with Gaussian weighting, creating a hybrid kernel that couples power-law decay for long-range memory with local smoothness. This foundational kernel serves as the core component of FGF-SSIM, a novel loss function that adaptively recalibrates luminance, contrast, and structure using fractional-order statistics. The proposed FGF-SSIM is further integrated into a complete 3D Gaussian Splatting (3DGS)-based SLAM system, named FGF-SLAM, where it is employed across both tracking and mapping modules to enhance performance. Extensive evaluations demonstrate state-of-the-art performance across multiple benchmarks. Comprehensive analysis confirms the superior long-range dependency of the fractional kernel, dedicated illumination robustness tests validate the enhanced invariance of FGF-SSIM, and quantitative results on TUM and Replica datasets show significant improvements in reconstruction quality and trajectory estimation. Ablation studies further substantiate the contribution of each proposed component. Full article
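For reference, the standard SSIM that this abstract builds on factors into luminance, contrast, and structure terms, with local statistics computed under a Gaussian window. The block below states that standard form and contrasts a Gaussian weight with a generic power-law weight. The symbols $\sigma_w$ and $a$ are notation introduced here for illustration; the actual FGF kernel and the fractional-order statistics are not given in the abstract and are not reproduced.

```latex
\[
\mathrm{SSIM}(x, y) = l(x, y)^{\alpha}\, c(x, y)^{\beta}\, s(x, y)^{\gamma}
\]
\[
l = \frac{2\mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}, \qquad
c = \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}, \qquad
s = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3}
\]
% Window weights as a function of distance d from the window center:
\[
w_{\mathrm{Gauss}}(d) \propto \exp\!\left(-\frac{d^2}{2\sigma_w^2}\right)
\qquad \text{versus} \qquad
w_{\mathrm{power}}(d) \propto d^{-a}, \quad a > 0
\]
```

The Gaussian weight decays faster than any power of $d$, so distant pixels contribute almost nothing to $\mu$ and $\sigma$; a power-law tail keeps a non-negligible contribution from them, which is the long-range memory the abstract refers to.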

(This article belongs to the Special Issue Advances in Pattern Recognition—Image and Time Series Analyses—through Fractal Geometry and Complexity Theory)

21 pages, 23184 KB

Open AccessArticle

FDC-YOLO: A Blur-Resilient Lightweight Network for Engine Blade Defect Detection

by Xinyue Xu, Fei Li, Lanhui Xiong, Chenyu He, Haijun Peng, Yiwen Zhao and Guoli Song

Algorithms 2025, 18(11), 725; https://doi.org/10.3390/a18110725 - 17 Nov 2025

Viewed by 467

Abstract

The synergy between continuum robots and visual inspection technology provides an efficient automated solution for aero-engine blade defect detection. However, flexible end-effector instability and complex internal illumination conditions cause defect image blurring and defect feature loss, leading existing detection methods to fail in simultaneously achieving both high-precision and high-speed requirements. To address this, this study proposes the real-time defect detection algorithm FDC-YOLO, enabling precise and efficient identification of blurred defects. We design the dynamic subtractive attention sampling module (DSAS) to dynamically compensate for information discrepancies during sampling, which reduces critical information loss caused by multi-scale feature fusion. We design a high-frequency information processing module (HFM) to enhance defect feature representation in the frequency domain, which significantly improves the visibility of defect regions while mitigating blur-induced noise interference. Additionally, we design a classification domain detection head (CDH) to focus on domain-invariant features across categories. Finally, FDC-YOLO achieves 7.9% and 3.5% mAP improvements on the aero-engine blade defect dataset and low-resolution NEU-DET dataset, respectively, with only 2.68 M parameters and 7.0G FLOPs. These results validate the algorithm’s generalizability in addressing low-accuracy issues across diverse blur artifacts in defect detection. Furthermore, this algorithm is combined with the tensegrity continuum robot to jointly construct an automatic defect detection system for aircraft engines, providing an efficient and reliable innovative solution to the problem of internal damage detection in engines. Full article

(This article belongs to the Special Issue Machine Learning for Pattern Recognition (3rd Edition))

26 pages, 13736 KB

Open AccessArticle

Off-Nadir Satellite Image Scene Classification: Benchmark Dataset, Angle-Aware Active Domain Adaptation, and Angular Impact Analysis

by Feifei Peng, Mengchu Guo, Haoqing Hu, Tongtong Yan and Liangcun Jiang

Remote Sens. 2025, 17(22), 3697; https://doi.org/10.3390/rs17223697 - 12 Nov 2025

Viewed by 750

Abstract

Accurate remote sensing scene classification is essential for applications such as environmental monitoring and disaster management. In real-world scenarios, particularly during emergency response and disaster relief operations, acquiring nadir-view satellite images is often infeasible due to cloud cover, satellite scheduling constraints, or dynamic scene conditions. Instead, off-nadir images are frequently captured and can provide enhanced spatial understanding through angular perspectives. However, remote sensing scene classification has primarily relied on nadir-view satellite or airborne imagery, leaving off-nadir perspectives largely unexplored. This study addresses this gap by introducing Off-nadir-Scene10, the first controlled and comprehensive benchmark dataset specifically designed for off-nadir satellite image scene classification. The Off-nadir-Scene10 dataset contains 5200 images across 10 common scene categories captured at 26 different off-nadir angles. All images were collected under controlled single-day conditions, ensuring that viewing geometry was the sole variable and effectively minimizing confounding factors such as illumination, atmospheric conditions, seasonal changes, and sensor characteristics. To effectively leverage abundant nadir imagery for advancing off-nadir scene classification, we propose an angle-aware active domain adaptation method that incorporates geometric considerations into sample selection and model adaptation processes. The method strategically selects informative off-nadir samples while transferring discriminative knowledge from nadir to off-nadir domains. The experimental results show that the method achieves consistent accuracy improvements across three different training ratios: 20%, 50%, and 80%. 
The comprehensive angular impact analysis reveals that models trained on larger off-nadir angles generalize better to smaller angles than vice versa, indicating that exposure to stronger geometric distortions promotes the learning of view-invariant features. This asymmetric transferability primarily stems from geometric perspective effects, as temporal, atmospheric, and sensor-related variations were rigorously minimized through controlled single-day image acquisition. Category-specific analysis demonstrates that angle-sensitive classes, such as sparse residential areas, benefit significantly from off-nadir viewing observations. This study provides a controlled foundation and practical guidance for developing robust, geometry-aware off-nadir scene classification systems. Full article

(This article belongs to the Special Issue New Insights in Remote Sensing Image Interpretation with Deep Learning)

31 pages, 34773 KB

Open AccessArticle

Learning Domain-Invariant Representations for Event-Based Motion Segmentation: An Unsupervised Domain Adaptation Approach

by Mohammed Jeryo and Ahad Harati

J. Imaging 2025, 11(11), 377; https://doi.org/10.3390/jimaging11110377 - 27 Oct 2025

Viewed by 887

Abstract

Event cameras provide microsecond temporal resolution, high dynamic range, and low latency by asynchronously capturing per-pixel luminance changes, thereby introducing a novel sensing paradigm. These advantages render them well-suited for high-speed applications such as autonomous vehicles and dynamic environments. Nevertheless, the sparsity of event data and the absence of dense annotations are significant obstacles to supervised learning for motion segmentation from event streams. Domain adaptation is also challenging due to the considerable domain shift in intensity images. To address these challenges, we propose a two-phase cross-modality adaptation framework that translates motion segmentation knowledge from labeled RGB-flow data to unlabeled event streams. A dual-branch encoder extracts modality-specific motion and appearance features from RGB and optical flow in the source domain. Using reconstruction networks, event voxel grids are converted into pseudo-image and pseudo-flow modalities in the target domain. These modalities are subsequently re-encoded using frozen RGB-trained encoders. Multi-level consistency losses are implemented on features, predictions, and outputs to enforce domain alignment. Our design enables the model to acquire domain-invariant, semantically rich features through the use of shallow architectures, thereby reducing training costs and facilitating real-time inference with a lightweight prediction path. The proposed architecture, alongside the utilized hybrid loss function, effectively bridges the domain and modality gap. We evaluate our method on two challenging benchmarks: EVIMO2, which incorporates real-world dynamics, high-speed motion, illumination variation, and multiple independently moving objects; and MOD++, which features complex object dynamics, collisions, and dense 1kHz supervision in synthetic scenes. 
The proposed UDA framework achieves 83.1% and 79.4% accuracy on EVIMO2 and MOD++, respectively, outperforming existing state-of-the-art approaches, such as EV-Transfer and SHOT, by up to 3.6%. Additionally, it is lighter and faster and also delivers enhanced mIoU and F1 Score. Full article
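An event voxel grid, as mentioned in this abstract, is usually built by binning events along time and accumulating their polarities per pixel. The sketch below uses linear interpolation between neighboring time bins, which is one common construction; it is an assumption that the paper's grids follow this scheme, and the `(t, x, y, polarity)` event layout is chosen here for illustration.

```python
def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate (t, x, y, polarity) events into a [num_bins][height][width] grid.

    Each event's polarity (+1 or -1) is split between the two nearest
    temporal bins in proportion to its normalized timestamp.
    """
    grid = [[[0.0] * width for _ in range(height)] for _ in range(num_bins)]
    if not events:
        return grid
    t_start = min(e[0] for e in events)
    t_end = max(e[0] for e in events)
    span = (t_end - t_start) or 1.0  # avoid division by zero for a single timestamp
    for t, x, y, polarity in events:
        position = (num_bins - 1) * (t - t_start) / span
        lower = int(position)
        frac = position - lower
        grid[lower][y][x] += polarity * (1.0 - frac)
        if lower + 1 < num_bins:
            grid[lower + 1][y][x] += polarity * frac
    return grid
```

The resulting dense tensor can be fed to convolutional reconstruction networks, which is what makes the conversion to pseudo-image and pseudo-flow modalities possible.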

(This article belongs to the Section Image and Video Processing)

13 pages, 16914 KB

Open AccessArticle

Traversal by Touch: Tactile-Based Robotic Traversal with Artificial Skin in Complex Environments

by Adam Mazurick and Alex Ferworn

Sensors 2025, 25(21), 6569; https://doi.org/10.3390/s25216569 - 25 Oct 2025

Viewed by 738

Abstract

We evaluate tactile-first robotic traversal on the Department of Homeland Security (DHS) figure-8 mobility test using a two-way repeated-measures design across various algorithms (three tactile policies—M1 reactive, M2 terrain-weighted, M3 memory-augmented; a monocular camera baseline, CB-V; a tactile histogram baseline, T-VFH; and an optional tactile-informed replanner, T-D* Lite) and lighting conditions (Indoor, Outdoor, and Dark). The platform is the custom-built Eleven robot—a quadruped integrating a joint-mounted tactile tentacle with a tip force-sensitive resistor (FSR; Walfront 9snmyvxw25, China; 0–10 kg range, ≈0.1 N resolution @ 83 Hz) and a woven Galvorn carbon-nanotube (CNT) yarn for proprioceptive bend sensing. Control and sensing are fully wireless via an ESP32-S3, Arduino Nano 33 BLE, Raspberry Pi 400, and a mini VESC controller. Across 660 trials, the tactile stack maintained ∼21 ms (p50) policy latency and mid-80% success across all lighting conditions, including total darkness. The memory-augmented tactile policy (M3) exhibited consistent robustness relative to the camera baseline (CB-V), trailing by only ≈3–4% in Indoor and ≈13–16% in Outdoor and Dark conditions. Pre-specified, two one-sided tests (TOSTs) confirmed no speed equivalence in any M3↔CB-V comparison. Unlike vision-based approaches, tactile-first traversal is invariant to illumination and texture—an essential capability for navigation in darkness, smoke, or texture-poor, confined environments. Overall, these results show that a tactile-first, memory-augmented control stack achieves lighting-independent traversal on DHS benchmarks while maintaining competitive latency and success, trading modest speed for robustness and sensing independence. Full article

(This article belongs to the Special Issue Intelligent Robots: Control and Sensing)

34 pages, 5288 KB

Open AccessArticle

A Video-Based Mobile Palmprint Dataset and an Illumination-Robust Deep Learning Architecture for Unconstrained Environments

by Betül Koşmaz Sünnetci, Özkan Bingöl, Eyüp Gedikli, Murat Ekinci, Ramazan Özgür Doğan, Salih Türk and Nihan Güngör

Appl. Sci. 2025, 15(21), 11368; https://doi.org/10.3390/app152111368 - 23 Oct 2025

Viewed by 964

Abstract

The widespread adoption of mobile devices has made secure and user-friendly biometric authentication critical. However, widely used modalities such as fingerprint and facial recognition show limited robustness under uncontrolled illumination and on heterogeneous devices. In contrast, palmprint recognition offers strong potential because of its rich textural patterns and high discriminative power. This study addresses the limitations of laboratory-based datasets that fail to capture real-world challenges. We introduce MPW-180, a novel dataset comprising videos of 180 participants recorded on their own smartphones in everyday environments. By systematically incorporating diverse illumination conditions (with and without flash) and natural free-hand movements, MPW-180 is the first dataset to adopt a bring-your-own-device paradigm, providing a realistic benchmark for evaluating generalization in mobile biometric models. In addition, we propose PalmWildNet, an SE-block-enhanced deep learning architecture trained with Triplet Loss and a cross-illumination sampling strategy. The experimental results show that conventional methods suffer over 50% performance degradation under cross-illumination conditions. In contrast, our method reduces the Equal Error Rate to 1–2% while maintaining an accuracy above 97%. These findings demonstrate that the proposed framework not only tolerates illumination variability but also learns robust illumination-invariant representations, making it well-suited for mobile biometric authentication. Full article
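The triplet loss mentioned here has a compact standard form. In the sketch below, the cross-illumination pairing described in the comments is one plausible reading of the abstract's sampling strategy rather than the authors' exact procedure, and the `margin` value is a hypothetical default.

```python
import math


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss on embedding vectors (Euclidean distance).

    Assumed cross-illumination sampling: the anchor and positive come from
    the same palm under different lighting (for example, with and without
    flash), while the negative comes from a different palm.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

Minimizing this loss pulls same-identity embeddings together across lighting conditions, which is how an illumination-invariant representation can emerge from training.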

27 pages, 6430 KB

Open AccessArticle

Bayesian–Geometric Fusion: A Probabilistic Framework for Robust Line Feature Matching

by Chenyang Zhang, Yufan Ge and Shuo Gu

Electronics 2025, 14(19), 3783; https://doi.org/10.3390/electronics14193783 - 24 Sep 2025

Viewed by 425

Abstract

Line feature matching is a fundamental and extensively studied subject in the fields of photogrammetry and computer vision. Traditional methods, which rely on handcrafted descriptors and distance-based outlier filtering, frequently encounter challenges related to robustness and a high incidence of outliers. While some approaches leverage point features to assist line feature matching by establishing the invariant geometric constraints between points and lines, this typically results in a considerable computational load. To overcome these limitations, we introduce a novel Bayesian posterior probability framework for line matching that incorporates three geometric constraints: the distance between line feature endpoints, midpoint distance, and angular consistency. Our approach first characterizes inter-image geometric relationships using Fourier representation. Subsequently, we formulate the posterior probability distributions for the distance constraint and the uniform distribution based on the angular consistency constraint. By calculating the joint probability distribution under the three geometric constraints, robust line feature matches are iteratively optimized through the Expectation–Maximization (EM) algorithm. Comprehensive experiments confirm the effectiveness of our approach: (i) it outperforms state-of-the-art (including deep learning-based) algorithms in match count and accuracy across common scenarios; (ii) it exhibits superior robustness to rotation, illumination variation, and motion blur compared to descriptor-based methods; and (iii) it notably reduces computational overhead in comparison to algorithms that involve point-assisted line matching.

(This article belongs to the Section Circuit and Signal Processing)

11 pages, 4334 KB

Open AccessCommunication

Real-Time Object Classification via Dual-Pixel Measurement

by Jianing Yang, Ran Chen, Yicheng Peng, Lingyun Zhang, Ting Sun and Fei Xing

Sensors 2025, 25(18), 5886; https://doi.org/10.3390/s25185886 - 20 Sep 2025

Viewed by 588

Abstract

Rapid and accurate object classification is important in many domains. However, conventional vision-based techniques suffer from several limitations, including high data redundancy and strong dependence on image quality. In this work, we present a high-speed, image-free object classification method based on dual-pixel measurement and normalized central moment invariants. Leveraging the complementary modulation capability of a digital micromirror device (DMD), the proposed system requires only five tailored binary illumination patterns to simultaneously extract geometric features and perform classification. The system can achieve a classification update rate of up to 4.44 kHz, offering significant improvements in both efficiency and accuracy compared to traditional image-based approaches. Numerical simulations verify the robustness of the method under similarity transformations—including translation, scaling, and rotation—while experimental validations further demonstrate reliable performance across diverse object types. This approach enables real-time, low-data-throughput, reconstruction-free classification, offering new potential for optical computing and edge intelligence applications.

(This article belongs to the Special Issue Real-Time Object Detection and Classification Using Advanced Sensing Techniques)

Search Results (124)
