Search Results (135)

Search Parameters:
Keywords = illumination invariance

18 pages, 2875 KB  
Article
Correlations and Kappa Distributions: Numerical Experiment with 3D Collisions and Debye-like Shielding
by David J. McComas, George Livadiotis and Nicholas Sarlis
Entropy 2026, 28(6), 688; https://doi.org/10.3390/e28060688 (registering DOI) - 14 Jun 2026
Abstract
Contrary to the common assumption of Maxwell–Boltzmann (MB) distributions, space plasmas are characterized by kappa distributions and reside in thermodynamic stationary states out of classical thermal equilibrium, owing to the correlations between the charged plasma particles. In this study, we extend prior work to include realistic 3D collisions and Debye-like shielding of the correlations to show how these two processes compete in the development of realistic plasma particle velocity distributions. We modify our prior numerical experiment to incorporate both 3D collisions and correlations that include realistic Debye-like shielding of plasma particles and run it over many collisions until it becomes stationary. While 3D collisions alone produce Maxwell–Boltzmann (MB) distributions of the particles (κ → ∞), introducing correlations drives the distributions to stationary states with finite thermodynamic kappa (κ), where stronger correlations produce lower values of κ, as observed in space plasmas. Further, development of correlation clusters around each collision rapidly produces thermodynamic systems where the Debye length is proportional to 1+1/κ0th, for invariant thermal kappa κ0th, just as predicted by theory. This simple numerical experiment explores much more realistic particle interactions to show how 3D collisions and properly shielded correlations compete to produce stationary states of plasma particle kappa distributions and illuminates how long-range interactions correlate particles over the scale of the Debye lengths. Full article

16 pages, 7030 KB  
Article
DDCATNet: Effective Deep Learning-Based Illumination Color Cast Estimation Approach for Achieving Computational Color Constancy
by Ho-Hyoung Choi
Sensors 2026, 26(11), 3313; https://doi.org/10.3390/s26113313 - 23 May 2026
Viewed by 277
Abstract
Digital camera sensors are designed to capture a wide range of incident illuminants, enabling the creation of high-quality images. However, these sensors lack the capability to differentiate between the color of the source illuminant and the actual color (or original color) of the object being captured. For this reason, the computational color constancy (CCC) was introduced and has been developed over decades. The CCC is an approach to modeling the color perception of the human visual system (HVS) by ensuring accurate object color determination under varying source illuminant conditions. At the core of human visual perception (HVP)-based CCC is attaining higher accuracy in scene illuminant estimation. The emergence of deep convolutional neural networks (DCNNs) was a recent innovation in accurate illuminant estimation, fundamentally transforming the CCC research landscape. Nevertheless, accurate illuminant estimation still remains a huge challenge for both traditional and state-of-the-art (SOTA) approaches. To further advance precision in illuminant estimation, this article presents a novel learning-based illumination color cast estimation approach to HVP-based CCC. Most importantly, the proposed approach is intended to integrate informative features into both channel and spatial regions while preserving long-term dependency feature information with the use of dense skip connections. To achieve these objectives, the proposed Dense Dual Connection Aggregated Transform Network (DDCATNet) architecture is designed to comprise several modules: shallow feature extraction, channel-wise and spatial feature-based Dense Dual Connection (DDC), fusion of the dense channel-wise attention (CA) and spatial attention (SA) branches through a gate mechanism (GM) unit, and aggregate transform. 
It is worth noting that both the CA blocks and the SA blocks in the DDC module are characterized by dense and cascading connections, meant to preserve long-term feature information and modulate different-level feature information at both global and local scales. The densely connected CA branch (DCA) and the densely connected SA branch (DSA) are also highly effective in securing high-contribution information while suppressing redundant data. The GM unit is integrated at the back of the DDC module, fusing the two DCA and DSA branches to ensure the adaptive merging of useful hierarchical feature information and the extraction of more valuable feature information. As a result, the proposed DDCATNet architecture significantly enhanced precision in illuminant estimation, thereby improving performance. In rigorous experiments on a wide range of datasets, the proposed DDCATNet approach outperformed its SOTA counterparts, validating the efficacy and generalization capabilities, as well as robust camera-invariance, across diverse, single- and multi-illuminant datasets and model architectures. Full article
(This article belongs to the Section Sensing and Imaging)

22 pages, 18628 KB  
Article
CISPD: Complementary Illumination–Semantic Prompt Diffusion for Low-Light Remote Sensing Image Enhancement
by Huan Gao, Yuntai Liao, Zongfang Ma and Lin Song
Remote Sens. 2026, 18(9), 1347; https://doi.org/10.3390/rs18091347 - 28 Apr 2026
Viewed by 475
Abstract
When performing nighttime passive visible remote sensing of non-emissive land surfaces, illumination is typically dominated by weak moonlight that varies with lunar phase, producing low-radiance images with degraded textures and thus motivating low-radiance visible remote sensing image enhancement. We propose a Complementary Illumination–Semantic Prompt Diffusion framework (CISPD) that incorporates a semantic-invariant prompt and a self-learned illumination-aware prompt to guide diffusion-based low-light remote sensing image enhancement. During denoising, we sequentially inject two complementary prompts. We first retrieve a self-learned illumination-aware prompt from a learnable pool conditioned on the current latent context to correct non-uniform brightness, and then apply a semantic-invariant prompt extracted from a vision foundation model to reinforce geometric structures and suppress artifacts. To keep the two prompts complementary rather than redundant, we introduce a contrastive constraint that encourages their representations to remain distinct, and the dual prompts jointly steer the diffusion trajectory toward well-exposed results with faithful structures. Experiments on iSAID-dark and darkrs, together with LOLv1 and LOLv2, demonstrate that CISPD achieves the best PSNR and SSIM on iSAID-dark, strong qualitative generalization on darkrs, and competitive quantitative performance on LOLv1 and LOLv2. Full article

28 pages, 21434 KB  
Article
Illumination-Invariant Normalization for Robust rPPG Extraction
by Byeong Seon An, Song Hee Park, Ye Jun Kim, Ye Rin Song, Geum Joon Cho and Eui Chul Lee
Electronics 2026, 15(8), 1683; https://doi.org/10.3390/electronics15081683 - 16 Apr 2026
Viewed by 356
Abstract
Remote photoplethysmography (rPPG) estimates heart rate by analyzing subtle blood-flow-induced color variations from camera videos; however, its performance is highly sensitive to illumination changes caused by variations in light intensity, position, and environmental conditions. To address this limitation, this study proposes a lightweight, training-free brightness normalization method that suppresses illumination-induced luminance fluctuations while preserving physiologically relevant color variations associated with blood perfusion. The proposed approach separates luminance and chrominance components from the frame-mean RGB vector and applies normalization only to the brightness component, thereby maintaining the intrinsic color direction essential for rPPG signal extraction and stabilizing temporal brightness without distorting chrominance relationships. Experimental evaluations show that channel-wise mean values vary only within ±612% with negligible changes in standard deviation, while dynamic range and temporal stability are significantly improved. Furthermore, when combined with an SNR-based signal selection strategy, the proposed method reduces the mean absolute error (MAE) of the CHROM algorithm on the DLCN dataset from approximately 18–19 BPM to 4.87 BPM under complex illumination scenarios, with consistent improvements also observed on the MR-NIRP dataset. These results suggest that the proposed preprocessing method helps preserve blood-flow-induced temporal color variations and improves the robustness of rPPG measurement under diverse illumination conditions. Full article
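The brightness-only normalization described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact luminance/chrominance decomposition or the normalization target, so the choices here are assumptions: brightness is taken as the Euclidean norm of the frame-mean RGB vector, the target brightness is the temporal mean of those norms, and the function name `normalize_brightness` is hypothetical.

```python
import math

def normalize_brightness(frame_means):
    """Rescale each frame-mean RGB vector to a common brightness.

    frame_means: list of (r, g, b) tuples, one per video frame.
    Brightness is taken as the Euclidean norm of the vector (an assumption,
    not the paper's stated definition); the colour direction (vector / norm)
    is left untouched, so channel ratios are preserved.
    """
    norms = [math.sqrt(r * r + g * g + b * b) for r, g, b in frame_means]
    target = sum(norms) / len(norms)  # assumed target: temporal mean brightness
    out = []
    for (r, g, b), n in zip(frame_means, norms):
        scale = target / n if n > 0 else 0.0  # guard against all-black frames
        out.append((r * scale, g * scale, b * scale))
    return out

# Two frames with the same colour direction but different lighting levels.
frames = [(120.0, 90.0, 60.0), (60.0, 45.0, 30.0)]
normalized = normalize_brightness(frames)
```

After normalization both frames carry the same brightness, while the ratio between channels within each frame is unchanged. That is the property the abstract relies on when it says the colour direction is maintained.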

33 pages, 15024 KB  
Article
HFA-Net: Explainable Multi-Scale Deep Learning Framework for Illumination-Invariant Plant Disease Diagnosis in Precision Agriculture
by Muhammad Hassaan Ashraf, Farhana Jabeen, Muhammad Waqar and Ajung Kim
Sensors 2026, 26(7), 2067; https://doi.org/10.3390/s26072067 - 26 Mar 2026
Viewed by 851
Abstract
Robust plant disease detection in real-world agricultural environments remains challenging due to dynamic environmental conditions. Accurate and reliable disease identification is essential for precision agriculture and effective crop management. Although computer vision and Artificial Intelligence (AI) have shown promising results in controlled settings, their performance often drops under lesion scale variability, inter- and intra-class similarity among diseases, class imbalance, and illumination fluctuations. To overcome these challenges, we propose a Heterogeneous Feature Aggregation Network (HFA-Net) that brings together architectural improvements, illumination-aware preprocessing, and training-level enhancements into a single cohesive framework. To extract richer and more discriminative features from the early layers of the network, HFA-Net introduces a multi-scale, multi-level feature aggregation stem. The Reduction-Expansion (RE) mechanism helps preserve important lesion details while adapting to variations in scale. An Illumination-Adaptive Contrast Enhancement (IACE) preprocessing pipeline is designed to address illumination variability in real agricultural environments. Experimental results show that HFA-Net achieves 96.03% accuracy under normal conditions and maintains strong performance under challenging lighting scenarios, achieving 92.95% and 93.07% accuracy in extremely dark and bright environments, respectively. Furthermore, quantitative explainability analysis using perturbation-based metrics demonstrates that the model’s predictions are not only accurate but also faithful to disease-relevant regions. Finally, Grad-CAM-based visual explanations confirm that the model’s predictions are driven by disease-specific regions, enhancing interpretability and practical reliability.
(This article belongs to the Section Smart Agriculture)

29 pages, 4764 KB  
Article
A Two-Level Illumination Correction Network for Digital Meter Reading Recognition in Non-Uniform Low-Light Conditions
by Haoning Fu, Zhiwei Xie, Wenzhu Jiang, Xingjiang Ma and Dongying Yang
J. Imaging 2026, 12(4), 146; https://doi.org/10.3390/jimaging12040146 - 25 Mar 2026
Viewed by 489
Abstract
The automatic reading recognition of digital instruments is crucial for achieving metering automation and intelligent inspection. However, in non-standardized industrial environments, the masking effect caused by the coupling of non-uniform low-light conditions and the reflective surfaces of instrument panels severely degrades the displayed information, significantly limiting the recognition performance. Conventional image processing methods, while aiming to restore the imaging quality of instrument panels through low-light enhancement, inevitably introduce overexposure and indiscriminately amplify background noise during this process. To address the two key challenges of illumination recovery and noise suppression in the process of restoring panel image quality under non-uniform low-light conditions, this paper proposes a coarse-to-fine cascaded perception framework (CFCP). First, a lightweight YOLOv10 detector is employed to coarsely localize the meter reading region under non-uniform illumination conditions. Second, an Adaptive Illumination Correction Module (AICM) is designed to decouple and correct the illumination component at the pixel level, effectively restoring details in dark areas. Then, an Illumination-invariant Feature Perception Module (IFPM) is embedded at the feature level to dynamically perceive illumination-invariant features and filter out noise interference. Finally, the refined detection results are fed into a lightweight sequence recognition network to obtain the final meter readings. Experiments on a self-built industrial digital instrument dataset show that the proposed method achieves 93.2% recognition accuracy, with 17.1 ms latency and only 7.9 M parameters. Full article
(This article belongs to the Special Issue AI-Driven Image and Video Understanding)

18 pages, 4538 KB  
Article
Analytical-Numerical Modeling of Filling-Fraction-Dependent Plasmonic Coupling in Nanostructured Metasurfaces Under Kretschmann Configuration
by Karan K. Singh, Guillermo E. Sánchez-Guerrero, Perla M. Viera-González, Carlos A. Fuentes-Hernandez, María T. Romero de la Cruz, Eduardo Martínez-Guerra, Rodolfo Cortés-Martínez and Edgar Martínez-Guerra
Optics 2026, 7(2), 22; https://doi.org/10.3390/opt7020022 - 24 Mar 2026
Cited by 1 | Viewed by 603
Abstract
Surface plasmon resonance (SPR) sensors based on nanostructured metasurfaces offer enhanced sensitivity through engineered electromagnetic responses. In this study, we present an analytical and numerical investigation of the plasmonic behavior of gold nanopillar (Au-NP) and nanohole (Au-NH) arrays under both p- and s-polarized illumination, combining Effective Medium Theory (EMT) with the Transfer Matrix Method (TMM) to describe the macroscopic optical response of multilayer plasmonic systems. For p-polarization, the nanostructure geometry strongly modulates the real and imaginary parts of the effective permittivity, with nanoholes supporting stronger SPR coupling and reduced optical losses compared to nanopillars. Under s-polarization, the effective permittivity remains largely invariant, primarily driven by the filling fraction. The analysis reveals that polarization-dependent behavior arises from boundary-condition-mediated coupling mechanisms governing surface plasmon excitation, aligning with classical plasmonic theory. Benchmarking against analytical dispersion relations and published experimental data for Au/BK7 systems shows close agreement within ±2°, confirming the physical consistency of the EMT–TMM framework. These results provide a systematic description of how polarization and filling fraction jointly modulate SPR coupling. The results offer a foundation for the rational design of plasmonic coatings and SPR-supporting metasurfaces by elucidating macroscopic coupling trends; however, no quantitative sensor performance metrics, such as refractive index sensitivity or figure of merit, are evaluated in this work.

31 pages, 3479 KB  
Article
MV-S2CD: A Modality-Bridged Vision Foundation Model-Based Framework for Unsupervised Optical–SAR Change Detection
by Yongqi Shi, Ruopeng Yang, Changsheng Yin, Yiwei Lu, Bo Huang, Yongqi Wen, Yihao Zhong and Zhaoyang Gu
Remote Sens. 2026, 18(6), 931; https://doi.org/10.3390/rs18060931 - 19 Mar 2026
Cited by 1 | Viewed by 698
Abstract
Unsupervised change detection (UCD) from heterogeneous bitemporal optical–SAR imagery is challenging due to modality discrepancy, speckle/illumination variations, and the absence of change annotations. We propose MV-S2CD, a vision foundation model (VFM)-based framework that learns a modality-bridged latent space and produces dense change maps in a fully unsupervised manner. To robustly adapt pretrained VFM priors to heterogeneous inputs with minimal task-specific parameters, MV-S2CD incorporates lightweight modality-specific adapters and parameter-efficient low-rank adaptation (LoRA) in high-level layers. A shared projector embeds the two observations into a common geometry, enabling consistent cross-modal comparison and reducing sensor-induced domain shift. Building on the bridged representation, we design a dual-branch change reasoning module that decouples structure-sensitive cues from semantic-consistency cues: a structure pathway preserves fine boundaries and local variations, while a semantic-consistency pathway employs reliability gating and multi-scale context aggregation to suppress pseudo-changes caused by modality-specific nuisances and residual misregistration. For label-free optimization, we develop a difference-centric self-supervision scheme with two perturbation views and reliability-guided pseudo-partitioning, jointly enforcing pseudo-unchanged invariance, pseudo-changed/unchanged separability, and sparsity and edge-preserving regularization. Experiments on three heterogeneous optical–SAR benchmarks demonstrate that MV-S2CD consistently improves the Precision–Recall trade-off and achieves state-of-the-art performance among unsupervised baselines, while remaining backbone-flexible and efficient. Full article

23 pages, 13360 KB  
Article
Lumina-4DGS: Illumination-Robust Four-Dimensional Gaussian Splatting for Dynamic Scene Reconstruction
by Xiaoqiang Wang, Qing Wang, Yang Sun and Shengyi Liu
Sensors 2026, 26(5), 1650; https://doi.org/10.3390/s26051650 - 5 Mar 2026
Viewed by 1104
Abstract
High-fidelity 4D reconstruction of dynamic scenes is pivotal for immersive simulation yet remains challenging due to the photometric inconsistencies inherent in multi-view sensor arrays. Standard 3D Gaussian Splatting (3DGS) strictly adheres to the brightness constancy assumption, failing to distinguish between intrinsic scene radiance and transient brightness shifts caused by independent auto-exposure (AE), auto-white-balance (AWB), and non-linear ISP processing. This misalignment often forces the optimization process to compensate for spectral discrepancies through incorrect geometric deformation, resulting in severe temporal flickering and spatial floating artifacts. To address these limitations, we present Lumina-4DGS, a robust framework that harmonizes spatiotemporal geometry modeling with a hierarchical exposure compensation strategy. Our approach explicitly decouples photometric variations into two levels: a Global Exposure Affine Module that neutralizes sensor-specific AE/AWB fluctuations and a Multi-Scale Bilateral Grid that residually corrects spatially varying non-linearities, such as vignetting, using luminance-based guidance. Crucially, to prevent these powerful appearance modules from masking geometric flaws, we introduce a novel SSIM-Gated Optimization mechanism. This strategy dynamically gates the gradient flow to the exposure modules based on structural similarity. By ensuring that photometric enhancement is only activated when the underlying geometry is structurally reliable, we effectively prioritize geometric accuracy over photometric overfitting. Extensive experiments validate the quantitative superiority of Lumina-4DGS. On the Waymo Open Dataset, our method achieves a state-of-the-art Full Image PSNR of 31.12 dB while minimizing geometric errors to a Depth RMSE of 1.89 m and Chamfer Distance of 0.215 m. 
Furthermore, on our highly challenging self-collected surround-view dataset featuring severe unconstrained illumination shifts, Lumina-4DGS yields a significant 2.13 dB PSNR improvement over recent driving-scene baselines. These results confirm that our framework achieves photorealistic, exposure-invariant novel view synthesis while maintaining superior geometric consistency across heterogeneous camera inputs. Full article
(This article belongs to the Section Optical Sensors)

26 pages, 9177 KB  
Article
DGC_GAN: An Unpaired Method for Cross-Spectral Image Translation from Visible to Thermal Infrared
by Shun Yao, Xiaobing Sun, Bo Song, Yichen Wei, Yuyao Wang, Yiqi Li and Xiao Liu
Remote Sens. 2026, 18(4), 569; https://doi.org/10.3390/rs18040569 - 11 Feb 2026
Viewed by 564
Abstract
Thermal infrared imaging is widely used in applications such as disaster monitoring and target recognition because it remains stable under illumination changes and supports nighttime observation. However, thermal infrared data are expensive to acquire, and the related application scenarios are often sensitive, which leads to limited publicly available thermal infrared datasets and restricts the development of relevant research. Cross-spectral image translation from visible to thermal infrared provides a solution for expanding infrared datasets, but accurate mapping remains difficult because visible light reflection and thermal infrared emission follow different physical mechanisms. This paper proposes a Dual Geometric Cycle Generative Adversarial Network (DGC_GAN) for unpaired visible-to-thermal infrared translation. The proposed method improves cross-spectral mapping accuracy by combining geometric-consistency constraints with cycle-consistency constraints. In addition, disentangled representation learning is introduced to decompose cross-spectral images into a domain-invariant semantic structure space and a domain-specific imaging style space, enabling one-to-many synthesis through the cross-combination of structure and style. Experiments on public aerial datasets, including AVIID and Drone Vehicle, demonstrate that DGC_GAN significantly improves the realism and diversity of generated images compared with other popular unpaired translation methods. Specifically, DGC_GAN achieves FID and KID values of 63.727 and 0.008711 on the Day Road dataset (part of Drone Vehicle), and 69.419 and 0.019352 on the Night Road dataset (part of Drone Vehicle). Moreover, it outperforms other methods on all four evaluation metrics on the AVIID dataset. Furthermore, real drone data collected using a dual-spectrum platform are used to validate the practical usefulness of the proposed method.
(This article belongs to the Section AI Remote Sensing)

22 pages, 7096 KB  
Article
An Improved ORB-KNN-Ratio Test Algorithm for Robust Underwater Image Stitching on Low-Cost Robotic Platforms
by Guanhua Yi, Tianxiang Zhang, Yunfei Chen and Dapeng Yu
J. Mar. Sci. Eng. 2026, 14(2), 218; https://doi.org/10.3390/jmse14020218 - 21 Jan 2026
Viewed by 795
Abstract
Underwater optical images often exhibit severe color distortion, weak texture, and uneven illumination due to light absorption and scattering in water. These issues result in unstable feature detection and inaccurate image registration. To address these challenges, this paper proposes an underwater image stitching method that integrates ORB (Oriented FAST and Rotated BRIEF) feature extraction with a fixed-ratio constraint matching strategy. First, lightweight color and contrast enhancement techniques are employed to restore color balance and improve local texture visibility. Then, ORB descriptors are extracted and matched via a KNN (K-Nearest Neighbors) nearest-neighbor search, and Lowe’s ratio test is applied to eliminate false matches caused by weak texture similarity. Finally, the geometric transformation between image frames is estimated by incorporating robust optimization, ensuring stable homography computation. Experimental results on real underwater datasets show that the proposed method significantly improves stitching continuity and structural consistency, achieving 40–120% improvements in SSIM (Structural Similarity Index) and PSNR (peak signal-to-noise ratio) over conventional Harris–ORB + KNN, SIFT (scale-invariant feature transform) + BF (brute force), SIFT + KNN, and AKAZE (accelerated KAZE) + BF methods while maintaining processing times within one second. These results indicate that the proposed method is well-suited for real-time underwater environment perception and panoramic mapping on low-cost, micro-sized underwater robotic platforms. Full article
(This article belongs to the Section Ocean Engineering)
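The KNN matching and Lowe's ratio test mentioned in the abstract above can be sketched in a few lines of plain Python. This is an illustration of the general technique only, not the authors' implementation: the ratio value 0.75 is an assumed, commonly used setting (the abstract says only that the ratio is fixed), the function names are hypothetical, and the 8-bit toy descriptors stand in for real ORB binary descriptors.

```python
def hamming(a, b):
    """Hamming distance between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")

def ratio_test_matches(query, train, ratio=0.75):
    """Return (query_idx, train_idx) pairs that pass Lowe's ratio test.

    For each query descriptor, the two nearest train descriptors are found
    (k = 2). The best match is kept only if it is clearly closer than the
    runner-up, i.e. d1 < ratio * d2. The default ratio is an assumption.
    """
    matches = []
    for qi, q in enumerate(query):
        ranked = sorted((hamming(q, t), ti) for ti, t in enumerate(train))
        if len(ranked) < 2:
            continue  # the test needs two neighbours
        (d1, best), (d2, _) = ranked[0], ranked[1]
        if d1 < ratio * d2:
            matches.append((qi, best))
    return matches

# Toy 8-bit descriptors (hypothetical values for illustration).
train = [0b11110000, 0b00001111, 0b11000011]
query = [
    0b11110001,  # distance 1 to train[0], 3 to train[2]: distinctive
    0b11100001,  # distance 2 to both train[0] and train[2]: ambiguous
]
matches = ratio_test_matches(query, train)
```

The first query descriptor is kept because its best match is much closer than the second best. The second is discarded because two candidates are equally close, which is the kind of false match that weak, repetitive texture tends to produce.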

22 pages, 92351 KB  
Article
Robust Self-Supervised Monocular Depth Estimation via Intrinsic Albedo-Guided Multi-Task Learning
by Genki Higashiuchi, Tomoyasu Shimada, Xiangbo Kong and Hiroyuki Tomiyama
Appl. Sci. 2026, 16(2), 714; https://doi.org/10.3390/app16020714 - 9 Jan 2026
Viewed by 741
Abstract
Self-supervised monocular depth estimation has demonstrated high practical utility, as it can be trained using a photometric image reconstruction loss between the original image and a reprojected image generated from the estimated depth and relative pose, thereby alleviating the burden of large-scale label creation. However, this photometric image reconstruction loss relies on the Lambertian reflectance assumption. Under non-Lambertian conditions such as specular reflections or strong illumination gradients, pixel values fluctuate depending on the lighting and viewpoint, which often misguides training and leads to large depth errors. To address this issue, we propose a multitask learning framework that integrates albedo estimation as a supervised auxiliary task. The proposed framework is implemented on top of representative self-supervised monocular depth estimation backbones, including Monodepth2 and Lite-Mono, by adopting a multi-head architecture in which the shared encoder–decoder branches at each upsampling block into a Depth Head and an Albedo Head. Furthermore, we apply Intrinsic Image Decomposition to generate albedo images and design an albedo supervision loss that uses these albedo maps as training targets for the Albedo Head. We then integrate this loss term into the overall training objective, explicitly exploiting illumination-invariant albedo components to suppress erroneous learning in reflective regions and areas with strong illumination gradients. Experiments on the ScanNetV2 dataset demonstrate that, for the lightweight backbone Lite-Mono, our method achieves an average reduction of 18.5% over the four standard depth error metrics and consistently improves accuracy metrics, without increasing the number of parameters and FLOPs at inference time. Full article
(This article belongs to the Special Issue Convolutional Neural Networks and Computer Vision)

21 pages, 21514 KB  
Article
Robust Geometry–Hue Point Cloud Registration via Hybrid Adaptive Residual Optimization
by Yangmin Xie, Jinghan Zhang, Rijian Xu and Hang Shi
ISPRS Int. J. Geo-Inf. 2026, 15(1), 22; https://doi.org/10.3390/ijgi15010022 - 4 Jan 2026
Viewed by 771
Abstract
Accurate point cloud registration is a fundamental prerequisite for reality-based 3D reconstruction and large-scale spatial modeling. Despite significant international progress, reliable registration in architectural and urban scenes remains challenging due to geometric intricacies arising from repetitive and strongly symmetric structures and photometric variability caused by illumination inconsistencies. Conventional ICP-based and color-augmented methods often suffer from local convergence and color drift, limiting their robustness in large-scale real-world applications. To address these challenges, we propose Hybrid Adaptive Residual Optimization (HARO), a unified framework that integrates geometric cues with hue-robust color features. Specifically, RGB data are transformed into a decoupled HSV representation with histogram-matched hue correction applied in overlapping regions, enabling illumination-invariant color modeling. Furthermore, a novel adaptive residual kernel dynamically balances geometric and chromatic constraints, ensuring stable convergence even in structurally complex or partially overlapping scenes. Extensive experiments conducted on diverse real-world datasets, including subway, railway, urban, and office environments, demonstrate that HARO consistently achieves sub-degree rotational accuracy (0.11°) and negligible translation errors relative to the scene scale. These results indicate that HARO provides an effective and generalizable solution for large-scale point cloud registration, successfully bridging geometric complexity and photometric variability in reality-based reconstruction tasks.
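
The motivation for working in HSV can be sketched with the standard library alone: scaling an RGB color by a brightness factor leaves its hue unchanged. This is only a minimal illustration of hue as an illumination-robust feature. It does not implement the histogram-matched hue correction or the adaptive residual kernel, and the helper names and sample colors are hypothetical.

```python
import colorsys


def hue(rgb):
    """Hue channel in [0, 1) of an RGB triple with components in [0, 1]."""
    h, _s, _v = colorsys.rgb_to_hsv(*rgb)
    return h


def hue_distance(h1, h2):
    """Circular distance between two hues, since hue wraps around at 1.0."""
    d = abs(h1 - h2) % 1.0
    return min(d, 1.0 - d)


bright = (0.8, 0.4, 0.2)
dim = tuple(0.5 * c for c in bright)  # same surface color at half the brightness

# A uniform brightness change moves the V channel but not the hue.
hue_shift = hue_distance(hue(bright), hue(dim))
rgb_shift = sum(abs(a - b) for a, b in zip(bright, dim))
```

Here `rgb_shift` is large while `hue_shift` is zero up to floating-point error, which is why a hue-based color residual is less sensitive to exposure differences between scans than a raw RGB residual.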

26 pages, 8467 KB  
Article
Low-Light Pose-Action Collaborative Network for Industrial Monitoring in Power Systems
by Qifeng Luo, Heng Zhou, Mianting Wu and Qiang Zhou
Electronics 2026, 15(1), 199; https://doi.org/10.3390/electronics15010199 - 1 Jan 2026
Viewed by 756
Abstract
Recognizing human actions in low-light industrial environments remains a significant challenge for safety-critical applications in power systems. In this paper, we propose a Low-Light Pose-Action Collaborative Network (LPAC-Net), an integrated framework specifically designed for monitoring scenarios in underground electrical vaults and smart power stations. The pipeline begins with a modified Zero-DCE++ module for reference-free illumination correction, followed by pose extraction using YOLO-Pose and a novel rotation-invariant encoding of keypoints optimized for confined industrial spaces. Temporal dependencies are captured through a bidirectional LSTM network with attention mechanisms to model complex operational behaviors. We evaluate LPAC-Net on the newly curated ARID-Fall dataset, enhanced with industrial monitoring scenarios representative of electrical infrastructure environments. Experimental results demonstrate that our method outperforms state-of-the-art models, including DarkLight-R101, DTCM, FRAGNet, and URetinex-Net++, achieving 95.53% accuracy in recognizing worker activities and safety-critical events. Additional studies confirm LPAC-Net’s robustness under keypoint noise and motion blur, highlighting its practical value for intelligent monitoring in challenging industrial lighting conditions typical of underground electrical facilities and automated power stations.
(This article belongs to the Special Issue AI Applications for Smart Grid)
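
The abstract does not describe how its rotation-invariant keypoint encoding is built. One common way to obtain rotation invariance, shown here purely as an assumed stand-in, is to describe a pose by the distances between all pairs of keypoints, which do not change when the pose is rotated in the image plane. The function names and the four-point toy pose are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distance_encoding(keypoints):
    """Distances between every pair of 2D keypoints, in a fixed pair order."""
    return [math.dist(p, q) for p, q in combinations(keypoints, 2)]


def rotate(keypoints, angle_rad):
    """Rotate 2D keypoints about the origin by angle_rad."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x - s * y, s * x + c * y) for x, y in keypoints]


# Toy pose with four keypoints (real pose models produce many more).
pose = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (0.0, 1.5)]

original = pairwise_distance_encoding(pose)
rotated = pairwise_distance_encoding(rotate(pose, math.radians(40)))

# The two encodings agree up to floating-point error.
max_difference = max(abs(a - b) for a, b in zip(original, rotated))
```

An encoding of this kind could then be fed frame by frame into a sequence model such as the bidirectional LSTM the abstract mentions; that step is not shown here.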

24 pages, 14385 KB  
Article
LDFE-SLAM: Light-Aware Deep Front-End for Robust Visual SLAM Under Challenging Illumination
by Cong Liu, You Wang, Weichao Luo and Yanhong Peng
Machines 2026, 14(1), 44; https://doi.org/10.3390/machines14010044 - 29 Dec 2025
Viewed by 1567
Abstract
Visual SLAM systems face significant performance degradation under dynamic lighting conditions, where traditional feature extraction methods suffer from reduced keypoint detection and unstable matching. This paper presents LDFE-SLAM, a novel visual SLAM framework that addresses illumination challenges through a Light-Aware Deep Front-End (LDFE) architecture. Our key insight is that low-light degradation in SLAM is fundamentally a geometric feature distribution problem rather than merely a visibility issue. The proposed system integrates three synergistic components: (1) an illumination-adaptive enhancement module based on EnlightenGAN with geometric consistency loss that restores gradient structures for downstream feature extraction, (2) SuperPoint-based deep feature detection that provides illumination-invariant keypoints, and (3) LightGlue attention-based matching that filters enhancement-induced noise while maintaining geometric consistency. Through systematic evaluation of five method configurations (M1–M5), we demonstrate that enhancement, deep features, and learned matching must be co-designed rather than independently optimized. Experiments on EuRoC and TUM sequences under synthetic illumination degradation show that LDFE-SLAM maintains stable localization accuracy (∼1.2 m ATE) across all brightness levels, while baseline methods degrade significantly (up to 3.7 m). Our method operates normally down to severe lighting conditions (30% ambient brightness and 20–50 lux, equivalent to underground parking or night-time streetlight illumination), representing a 4–6× lower illumination threshold compared to ORB-SLAM3 (200–300 lux minimum). Under severe (25% brightness) conditions, our method achieves a 62% tracking success rate, compared to 12% for ORB-SLAM3, with keypoint detection remaining above the critical 100-point threshold, even under extreme degradation.
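
For readers unfamiliar with the evaluation terms in this abstract, the sketch below shows one conventional way to compute absolute trajectory error (ATE) as a root-mean-square position error, together with a simple linear brightness scaling. It assumes the two trajectories are already time-associated and aligned, and that degradation is a plain multiplicative dimming. Both are simplifications, and the function names and sample values are hypothetical rather than taken from the paper.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of per-pose position errors between two aligned trajectories."""
    squared = [math.dist(e, g) ** 2 for e, g in zip(estimated, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))


def scale_brightness(pixels, factor):
    """Scale 8-bit pixel intensities by factor and clamp to [0, 255]."""
    return [min(255, max(0, round(p * factor))) for p in pixels]


# Toy trajectories: every estimated position is off by (3, 4, 0) metres.
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
estimated = [(x + 3.0, y + 4.0, z) for x, y, z in ground_truth]
error_m = ate_rmse(estimated, ground_truth)

# Dimming a row of pixels to 25% brightness.
dimmed = scale_brightness([200, 100, 0], 0.25)
```

Because each pose in the toy example is displaced by the same 5 m offset, the resulting RMSE is 5 m; real evaluations typically align the trajectories first so that a constant offset of this kind is removed.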
