Challenges and Future Trends of 3D Image Sensing, Visualization, and Processing

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: 25 July 2025 | Viewed by 14413

Special Issue Editor


Dr. Miguel Oliveira
Guest Editor
Polytechnic School of Design, Management and Production Technologies Aveiro-Norte, University of Aveiro, Estrada do Cercal, 449 3720-509 Santiago de Riba-Ul, Oliveira de Azeméis, Portugal
Interests: software engineering; AI; 3D modelling and programming; mobile development; distributed systems

Special Issue Information

Dear Colleagues,

The field of 3D image sensing, visualization, and processing has been thriving and growing over the last decade, driven by the increasing demand for accurate and efficient 3D sensing technologies across multiple industries, such as robotics, autonomous vehicles, and virtual reality. Despite this remarkable progress, the field still faces many challenges and is shaped by numerous emerging trends.

Some of the key challenges include improving the accuracy, effectiveness, and robustness of 3D sensing technologies, which implies developing more efficient algorithms for processing and analyzing large amounts of 3D data (big data), and finding ways to make 3D visualization and interaction more intuitive and user-friendly (UI/UX).

Near-future trends point towards the development of more advanced 3D sensors and the integration of 3D sensing into mobile devices and other consumer products. Machine learning, deep learning, and artificial intelligence will play an increasingly important role in processing and analyzing 3D data, while virtual and augmented reality will become more prevalent in industries such as production, gaming, healthcare, and education. In short, the future of 3D image sensing, visualization, and processing is likely to be characterized by continued innovation and growth.

Dr. Miguel Oliveira
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • 3D image sensing
  • visualization
  • processing
  • machine learning
  • deep learning
  • artificial intelligence
  • virtual reality
  • augmented reality
  • Industry 5.0
  • healthcare

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (9 papers)

Research

17 pages, 15387 KiB  
Article
Improving 3D Reconstruction Through RGB-D Sensor Noise Modeling
by Fahira Afzal Maken, Sundaram Muthu, Chuong Nguyen, Changming Sun, Jinguang Tong, Shan Wang, Russell Tsuchida, David Howard, Simon Dunstall and Lars Petersson
Sensors 2025, 25(3), 950; https://doi.org/10.3390/s25030950 - 5 Feb 2025
Cited by 1 | Viewed by 1033
Abstract
High-resolution RGB-D sensors are widely used in computer vision, manufacturing, and robotics. The depth maps from these sensors have inherently high measurement uncertainty that includes both systematic and non-systematic noise. These noisy depth estimates degrade the quality of scans, resulting in less accurate 3D reconstruction, making them unsuitable for some high-precision applications. In this paper, we focus on quantifying the uncertainty in the depth maps of high-resolution RGB-D sensors for the purpose of improving 3D reconstruction accuracy. To this end, we estimate the noise model for a recent high-precision RGB-D structured light sensor called Zivid when mounted on a robot arm. Our proposed noise model takes into account the measurement distance and angle between the sensor and the measured surface. We additionally analyze the effect of background light, exposure time, and the number of captures on the quality of the depth maps obtained. Our noise model seamlessly integrates with well-known classical and modern neural rendering-based algorithms, from KinectFusion to Point-SLAM methods using bilinear interpolation as well as 3D analytical functions. We collect a high-resolution RGB-D dataset and apply our noise model to improve tracking and produce higher-resolution 3D models.
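A distance- and angle-dependent noise model of the kind described above lends itself to a simple weighting scheme for depth fusion. The sketch below is a hypothetical Python/NumPy illustration only: the functional form follows the widely used Kinect-style axial noise model, and the coefficients are placeholders, not the calibrated Zivid values from the paper.

```python
import numpy as np

def depth_noise_sigma(distance_m, angle_rad, a=0.0012, b=0.0019, c=0.0001):
    # Illustrative axial noise model: the standard deviation grows roughly
    # quadratically with distance and rises sharply towards grazing angles.
    # Coefficients are placeholders, not calibrated sensor values.
    return a + b * (distance_m - 0.4) ** 2 + c * angle_rad ** 2 / (np.pi / 2 - angle_rad) ** 2

def fuse_depth(depths, distances, angles):
    # Inverse-variance weighted fusion of repeated depth measurements of the
    # same surface point, so less noisy observations dominate the estimate.
    sigma = depth_noise_sigma(distances, angles)
    weights = 1.0 / sigma ** 2
    return np.sum(weights * depths) / np.sum(weights)

depths = np.array([0.7512, 0.7498, 0.7531])   # metres
distances = np.array([0.75, 0.75, 0.76])      # sensor-to-surface distance
angles = np.deg2rad([10.0, 35.0, 60.0])       # incidence angles
print(fuse_depth(depths, distances, angles))
```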

20 pages, 23119 KiB  
Article
Three-Dimensional Visualization Using Proportional Photon Estimation Under Photon-Starved Conditions
by Jin-Ung Ha, Hyun-Woo Kim, Myungjin Cho and Min-Chul Lee
Sensors 2025, 25(3), 893; https://doi.org/10.3390/s25030893 - 1 Feb 2025
Viewed by 551
Abstract
In this paper, we propose a new method for three-dimensional (3D) visualization that proportionally estimates the number of photons in the background and the object under photon-starved conditions. Photon-counting integral imaging is one of the techniques for 3D image visualization under photon-starved conditions. However, conventional photon-counting integral imaging has the problem that random noise is generated in the background of the image because the same number of photons is estimated across the entire image. In contrast, our proposed method reduces this random noise by estimating a proportional number of photons for the background and the object. In addition, spatial overlapping is applied where photons overlap to obtain enhanced 3D images. To demonstrate the feasibility of our proposed method, we conducted optical experiments and calculated performance metrics such as normalized cross-correlation, peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM). In terms of SSIM, our proposed method achieves about 3.42 times higher values than the conventional method. Therefore, our proposed method can obtain better 3D visualization of objects than conventional photon-counting integral imaging methods under photon-starved conditions.
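The underlying photon-counting idea can be illustrated with a short NumPy sketch: detected counts follow a Poisson distribution whose expected values are allocated either uniformly (conventional) or split between background and object regions. The scene and the 80/20 split below are invented for illustration; this is not the authors' estimation procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def photon_counted(irradiance, expected_photons):
    # Photon-counting model: per-pixel counts are Poisson-distributed with a
    # mean equal to the normalized irradiance scaled by the photon budget.
    rate = irradiance / irradiance.sum()
    return rng.poisson(expected_photons * rate)

# Toy scene: a dim background with a brighter object in the centre.
scene = np.full((64, 64), 0.05)
scene[20:44, 20:44] = 1.0
obj = scene > 0.5

# Uniform photon budget (conventional) vs. a budget split proportionally
# between object and background (illustrative split, not estimated from data).
conventional = photon_counted(scene, expected_photons=2000)
proportional = np.zeros_like(conventional)
proportional[obj] = photon_counted(scene * obj, 1600)[obj]
proportional[~obj] = photon_counted(scene * ~obj, 400)[~obj]
print(conventional.sum(), proportional.sum())
```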

19 pages, 7424 KiB  
Article
Residual Vision Transformer and Adaptive Fusion Autoencoders for Monocular Depth Estimation
by Wei-Jong Yang, Chih-Chen Wu and Jar-Ferr Yang
Sensors 2025, 25(1), 80; https://doi.org/10.3390/s25010080 - 26 Dec 2024
Viewed by 891
Abstract
Precision depth estimation plays a key role in many applications, including 3D scene reconstruction, virtual reality, autonomous driving, and human–computer interaction. Through recent advancements in deep learning technologies, monocular depth estimation, with its simplicity, has surpassed traditional stereo camera systems, bringing new possibilities in 3D sensing. In this paper, using a single camera, we propose an end-to-end supervised monocular depth estimation autoencoder, which contains an encoder that mixes a convolutional neural network with vision transformers and an effective adaptive fusion decoder to obtain high-precision depth maps. In the encoder, we construct a multi-scale feature extractor by mixing residual configurations of vision transformers to enhance both local and global information. In the adaptive fusion decoder, we introduce adaptive fusion modules to effectively merge the features of the encoder and the decoder. Lastly, the model is trained using a loss function that aligns with human perception, enabling it to focus on the depth values of foreground objects. The experimental results demonstrate the effective prediction of the depth map from a single-view color image by the proposed autoencoder, which increases the first accuracy rate by about 28% and reduces the root mean square error by about 27% compared to an existing method on the NYU dataset.
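As a rough illustration of what an encoder–decoder fusion module can look like, the PyTorch sketch below gates an encoder skip feature against a decoder feature with a learned per-pixel weight. It is a generic, hypothetical example, not the paper's exact adaptive fusion module.

```python
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    # Illustrative fusion block: a learned per-pixel gate decides how much of
    # the encoder skip feature versus the decoder feature to keep.
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        self.refine = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, enc_feat, dec_feat):
        g = self.gate(torch.cat([enc_feat, dec_feat], dim=1))
        return self.refine(g * enc_feat + (1.0 - g) * dec_feat)

fusion = AdaptiveFusion(channels=64)
enc = torch.randn(1, 64, 32, 32)   # encoder skip feature
dec = torch.randn(1, 64, 32, 32)   # upsampled decoder feature
print(fusion(enc, dec).shape)      # torch.Size([1, 64, 32, 32])
```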

18 pages, 3855 KiB  
Article
Impact of Camera Settings on 3D Reconstruction Quality: Insights from NeRF and Gaussian Splatting
by Dimitar Rangelov, Sierd Waanders, Kars Waanders, Maurice van Keulen and Radoslav Miltchev
Sensors 2024, 24(23), 7594; https://doi.org/10.3390/s24237594 - 28 Nov 2024
Cited by 1 | Viewed by 1867
Abstract
This paper explores the influence of various camera settings on the quality of 3D reconstructions, particularly in indoor crime scene investigations. Utilizing Neural Radiance Fields (NeRF) and Gaussian Splatting for 3D reconstruction, we analyzed the impact of ISO, shutter speed, and aperture settings on the quality of the resulting 3D reconstructions. By conducting controlled experiments in a meeting room setup, we identified optimal settings that minimize noise and artifacts while maximizing detail and brightness. Our findings indicate that an ISO of 200, a shutter speed of 1/60 s, and an aperture of f/3.5 provide the best balance for high-quality 3D reconstructions. These settings are especially useful for forensic applications, architectural visualization, and cultural heritage preservation, offering practical guidelines for professionals in these fields. The study also highlights the potential for future research to expand on these findings by exploring other camera parameters and real-time adjustment techniques.

18 pages, 6642 KiB  
Article
Enlarged Eye-Box Accommodation-Capable Augmented Reality with Hologram Replicas
by Woonchan Moon and Joonku Hahn
Sensors 2024, 24(12), 3930; https://doi.org/10.3390/s24123930 - 17 Jun 2024
Cited by 1 | Viewed by 1453
Abstract
Augmented reality (AR) technology has been widely applied across a variety of fields, with head-up displays (HUDs) being one of its prominent uses, offering immersive three-dimensional (3D) experiences and interaction with digital content and the real world. AR-HUDs face challenges such as a limited field of view (FOV), a small eye-box, a bulky form factor, and the absence of accommodation cues, often forcing trade-offs between these factors. Recently, optical waveguides based on the pupil replication process have attracted increasing attention as optical elements for their compact form factor and exit-pupil expansion. Despite these advantages, current waveguide displays struggle to integrate visual information with real scenes because they do not produce accommodation-capable virtual content. In this paper, we introduce a lensless accommodation-capable holographic system based on a waveguide. Our system aims to expand the eye-box at the optimal viewing distance that provides the maximum FOV. We devised a formalized computer-generated hologram (CGH) algorithm based on a bold assumption and two constraints and successfully performed numerical observation simulations. In optical experiments, accommodation-capable images with a maximum horizontal FOV of 7.0 degrees were successfully observed within an expanded eye-box of 9.18 mm at an optimal observation distance of 112 mm.
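For readers unfamiliar with computer-generated holography, the sketch below shows a textbook Gerchberg–Saxton iteration for a phase-only hologram in NumPy. It is a generic illustration of CGH computation only; the paper's formalized, waveguide-specific algorithm and its constraints are not reproduced here.

```python
import numpy as np

def gerchberg_saxton(target_amplitude, iterations=50, seed=1):
    # Textbook Gerchberg-Saxton loop: alternate between the hologram plane
    # (phase-only constraint) and the far field (target amplitude constraint).
    rng = np.random.default_rng(seed)
    field = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, target_amplitude.shape))
    for _ in range(iterations):
        far = np.fft.fft2(field)
        far = target_amplitude * np.exp(1j * np.angle(far))
        field = np.exp(1j * np.angle(np.fft.ifft2(far)))
    return np.angle(field)

target = np.zeros((128, 128))
target[48:80, 48:80] = 1.0          # simple square as the desired image
hologram_phase = gerchberg_saxton(target)
print(hologram_phase.shape)
```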

17 pages, 2336 KiB  
Article
Three-Dimensional Segmentation of Equine Paranasal Sinuses in Multidetector Computed Tomography Datasets: Preliminary Morphometric Assessment Assisted with Clustering Analysis
by Marta Borowska, Paweł Lipowicz, Kristina Daunoravičienė, Bernard Turek, Tomasz Jasiński, Jolanta Pauk and Małgorzata Domino
Sensors 2024, 24(11), 3538; https://doi.org/10.3390/s24113538 - 30 May 2024
Cited by 2 | Viewed by 992
Abstract
The paranasal sinuses, a bilaterally symmetrical system of eight air-filled cavities, represent one of the most complex parts of the equine body. This study aimed to extract morphometric measures from computed tomography (CT) images of the equine head and to implement a clustering analysis for the computer-aided identification of age-related variations. Heads of 18 cadaver horses, aged 2–25 years, were CT-imaged and segmented to extract their volume, surface area, and relative density from the frontal sinus (FS), dorsal conchal sinus (DCS), ventral conchal sinus (VCS), rostral maxillary sinus (RMS), caudal maxillary sinus (CMS), sphenoid sinus (SS), palatine sinus (PS), and middle conchal sinus (MCS). Data were grouped into young, middle-aged, and old horse groups and clustered using the K-means clustering algorithm. Morphometric measurements varied according to the sinus position and age of the horses but not the body side. The volume and surface area of the VCS, RMS, and CMS increased with the age of the horses. With accuracy values of 0.72 for RMS, 0.67 for CMS, and 0.31 for VCS, the possibility of the age-related clustering of CT-based 3D images of equine paranasal sinuses was confirmed for RMS and CMS but disproved for VCS.
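The clustering step can be illustrated with a few lines of scikit-learn; the morphometric values and age labels below are invented placeholders, not data from the study.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical morphometric table: one row per horse, with volume (cm^3),
# surface area (cm^2) and relative density (HU) for a single sinus.
features = np.array([
    [ 95.0, 210.0, -620.0],
    [110.0, 230.0, -640.0],
    [140.0, 280.0, -660.0],
    [150.0, 300.0, -655.0],
    [180.0, 340.0, -670.0],
    [185.0, 350.0, -675.0],
])
age_group = np.array([0, 0, 1, 1, 2, 2])   # young / middle-aged / old (toy labels)

# Standardize the features, cluster into three groups, and compare the
# resulting partition against the age groups to judge age-related separability.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(features)
)
print(labels, age_group)
```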

22 pages, 11286 KiB  
Article
Analysis of the Photogrammetric Use of 360-Degree Cameras in Complex Heritage-Related Scenes: Case of the Necropolis of Qubbet el-Hawa (Aswan, Egypt)
by José Luis Pérez-García, José Miguel Gómez-López, Antonio Tomás Mozas-Calvache and Jorge Delgado-García
Sensors 2024, 24(7), 2268; https://doi.org/10.3390/s24072268 - 2 Apr 2024
Cited by 4 | Viewed by 2220
Abstract
This study shows the results of the analysis of the photogrammetric use of 360-degree cameras in complex heritage-related scenes. The goal is to take advantage of the large field of view provided by these sensors and reduce the number of images used to cover the entire scene compared to those needed using conventional cameras. We also try to minimize problems derived from camera geometry and lens characteristics. In this regard, we used a multi-sensor camera composed of six fisheye lenses, applying photogrammetric procedures to several funerary structures. The methodology includes the analysis of several types of spherical images obtained using different stitching techniques and the comparison of the results of image orientation processes considering these images and the original fisheye images. Subsequently, we analyze the possible use of the fisheye images to model complex scenes by reducing the use of ground control points, thus minimizing the need to apply surveying techniques to determine their coordinates. In this regard, we applied distance constraints based on a previous extrinsic calibration of the camera, obtaining results similar to those obtained using a traditional schema based on points. The results have allowed us to determine the advantages and disadvantages of each type of image and configuration, providing several recommendations regarding their use in complex scenes.

25 pages, 11159 KiB  
Article
Multi-Resolution 3D Rendering for High-Performance Web AR
by Argyro-Maria Boutsi, Charalabos Ioannidis and Styliani Verykokou
Sensors 2023, 23(15), 6885; https://doi.org/10.3390/s23156885 - 3 Aug 2023
Cited by 4 | Viewed by 3121
Abstract
In the context of web augmented reality (AR), 3D rendering that maintains visual quality and frame rate requirements remains a challenge. The lack of a dedicated and efficient 3D format often results in the degraded visual quality of the original data and compromises the user experience. This paper examines the integration of web-streamable view-dependent representations of large-sized and high-resolution 3D models in web AR applications. The developed cross-platform prototype exploits the batched multi-resolution structures of the Nexus.js library as a dedicated lightweight web AR format and tests it against common formats and compression techniques. Built with the AR.js and Three.js open-source libraries, it allows the overlay of the multi-resolution models by interactively adjusting the position, rotation, and scale parameters. The proposed method includes real-time view-dependent rendering, geometric instancing, and 3D pose regression for two types of AR: natural feature tracking (NFT) and location-based positioning for large and textured 3D overlays. The prototype achieves up to a 46% speedup in rendering time compared to optimized glTF models, while a 3D model with 34 M vertices is visible in less than 4 s without degraded visual quality in slow 3D networks. The evaluation under various scenes and devices offers insights into how a multi-resolution scheme can be adopted in web AR for high-quality visualization and real-time performance.

15 pages, 12087 KiB  
Article
Computational Integral Imaging Reconstruction via Elemental Image Blending without Normalization
by Eunsu Lee, Hyunji Cho and Hoon Yoo
Sensors 2023, 23(12), 5468; https://doi.org/10.3390/s23125468 - 9 Jun 2023
Cited by 3 | Viewed by 1591
Abstract
This paper presents a novel computational integral imaging reconstruction (CIIR) method using elemental image blending to eliminate the normalization process in CIIR. Normalization is commonly used in CIIR to address uneven overlapping artifacts. By incorporating elemental image blending, we remove the normalization step in CIIR, leading to decreased memory consumption and computational time compared to those of existing techniques. We conducted a theoretical analysis of the impact of elemental image blending on a CIIR method using windowing techniques, and the results showed that the proposed method is superior to the standard CIIR method in terms of image quality. We also performed computer simulations and optical experiments to evaluate the proposed method. The experimental results showed that the proposed method enhances the image quality over that of the standard CIIR method, while also reducing memory usage and processing time.
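For context, the sketch below is a minimal NumPy illustration of conventional shift-and-accumulate CIIR, including the overlap-normalization step that the paper removes through elemental image blending. The 1D elemental-image layout and the parameters are simplified and purely illustrative.

```python
import numpy as np

def ciir_reconstruct(elemental_images, shift_px):
    # Conventional CIIR for a row of elemental images: shift each image by a
    # depth-dependent offset, accumulate, then divide by the overlap count.
    # The final division is the normalization step the paper eliminates.
    h, w = elemental_images[0].shape
    n = len(elemental_images)
    canvas = np.zeros((h, w + (n - 1) * shift_px))
    overlap = np.zeros_like(canvas)
    for k, ei in enumerate(elemental_images):
        x0 = k * shift_px
        canvas[:, x0:x0 + w] += ei
        overlap[:, x0:x0 + w] += 1.0
    return canvas / np.maximum(overlap, 1.0)

elemental_images = [np.random.rand(32, 32) for _ in range(8)]
plane = ciir_reconstruct(elemental_images, shift_px=5)   # one reconstruction depth
print(plane.shape)
```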
