This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
Article

Robust Self-Supervised Monocular Depth Estimation via Intrinsic Albedo-Guided Multi-Task Learning

1
Graduate School of Science and Engineering, Ritsumeikan University, Kusatsu 525-8577, Shiga, Japan
2
Department of Intelligent Robotics, Faculty of Information Engineering, Toyama Prefectural University, Imizu 939-0398, Toyama, Japan
*
Authors to whom correspondence should be addressed.
Appl. Sci. 2026, 16(2), 714; https://doi.org/10.3390/app16020714
Submission received: 1 December 2025 / Revised: 25 December 2025 / Accepted: 5 January 2026 / Published: 9 January 2026
(This article belongs to the Special Issue Convolutional Neural Networks and Computer Vision)

Abstract

Self-supervised monocular depth estimation has demonstrated high practical utility because it alleviates the burden of large-scale label creation: it is trained with a photometric image reconstruction loss between the original image and a reprojected image generated from the estimated depth and relative pose. However, this photometric image reconstruction loss relies on the Lambertian reflectance assumption. Under non-Lambertian conditions such as specular reflections or strong illumination gradients, pixel values fluctuate depending on the lighting and viewpoint, which often misguides training and leads to large depth errors. To address this issue, we propose a multi-task learning framework that integrates albedo estimation as a supervised auxiliary task. The proposed framework is implemented on top of representative self-supervised monocular depth estimation backbones, including Monodepth2 and Lite-Mono, by adopting a multi-head architecture in which the shared encoder–decoder branches at each upsampling block into a Depth Head and an Albedo Head. Furthermore, we apply Intrinsic Image Decomposition to generate albedo images and design an albedo supervision loss that uses these albedo maps as training targets for the Albedo Head. We then integrate this loss term into the overall training objective, explicitly exploiting illumination-invariant albedo components to suppress erroneous learning in reflective regions and areas with strong illumination gradients. Experiments on the ScanNetV2 dataset demonstrate that, for the lightweight backbone Lite-Mono, our method achieves an average reduction of 18.5% over the four standard depth error metrics and consistently improves accuracy metrics, without increasing the number of parameters or FLOPs at inference time.
Keywords: monocular depth; self supervision; multitask learning
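
The motivation stated in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' implementation: it assumes a simplified image formation model (intensity = albedo × shading + specular), a mean absolute error as a stand-in for the photometric loss, and a weighted sum as one plausible way to combine the two loss terms. The names `Pixel`, `mean_abs_error`, `combined_objective`, and `albedo_weight` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Pixel:
    """Toy surface point under an assumed image formation model."""
    albedo: float    # illumination-invariant reflectance
    shading: float   # diffuse illumination term
    specular: float  # view-dependent highlight (non-Lambertian part)

    def intensity(self) -> float:
        # Assumed model: observed value = albedo * shading + specular
        return self.albedo * self.shading + self.specular


def mean_abs_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference, used here as a stand-in for a photometric loss."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def combined_objective(photometric: float, albedo_supervision: float,
                       albedo_weight: float) -> float:
    """Assumed weighted sum of the two loss terms (the weight is hypothetical)."""
    return photometric + albedo_weight * albedo_supervision


# The same three surface points seen from two viewpoints with perfect
# correspondence, i.e. as if depth and pose were estimated exactly.
# Only the second point carries a highlight, and only in view A.
view_a = [Pixel(0.5, 1.0, 0.0), Pixel(0.25, 1.0, 0.5), Pixel(0.75, 1.0, 0.0)]
view_b = [Pixel(0.5, 1.0, 0.0), Pixel(0.25, 1.0, 0.0), Pixel(0.75, 1.0, 0.0)]

# Comparing raw intensities penalizes the highlight even though geometry is correct.
photo_err = mean_abs_error([p.intensity() for p in view_a],
                           [p.intensity() for p in view_b])

# Comparing albedo is unaffected by the highlight.
albedo_err = mean_abs_error([p.albedo for p in view_a],
                            [p.albedo for p in view_b])

print(f"photometric error: {photo_err:.4f}")
print(f"albedo error:      {albedo_err:.4f}")
```

In this toy setting the intensity-based error is nonzero despite perfect correspondence, while the albedo-based error vanishes, which is the kind of illumination-invariant signal the abstract describes the auxiliary task as exploiting.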

Share and Cite

MDPI and ACS Style

Higashiuchi, G.; Shimada, T.; Kong, X.; Tomiyama, H. Robust Self-Supervised Monocular Depth Estimation via Intrinsic Albedo-Guided Multi-Task Learning. Appl. Sci. 2026, 16, 714. https://doi.org/10.3390/app16020714
