Search Results (77)

Search Parameters:
Keywords = multiple camera stereo

23 pages, 6708 KB  
Article
Feasibility Domain Construction and Characterization Method for Intelligent Underground Mining Equipment Integrating ORB-SLAM3 and Depth Vision
by Siya Sun, Xiaotong Han, Hongwei Ma, Haining Yuan, Sirui Mao, Chuanwei Wang, Kexiang Ma, Yifeng Guo and Hao Su
Sensors 2026, 26(3), 966; https://doi.org/10.3390/s26030966 - 2 Feb 2026
Viewed by 269
Abstract
To address the limited environmental perception capability and the difficulty of achieving consistent and efficient representation of the workspace feasible domain caused by high dust concentration, uneven illumination, and enclosed spaces in underground coal mines, this paper proposes a digital spatial construction and representation method for underground environments by integrating RGB-D depth vision with ORB-SLAM3. First, a ChArUco calibration board with embedded ArUco markers is adopted to perform high-precision calibration of the RGB-D camera, improving the reliability of geometric parameters under weak-texture and non-uniform lighting conditions. On this basis, a “dense–sparse cooperative” OAK-DenseMapper Pro module is further developed; the module improves point-cloud generation using a mathematical projection model, and combines enhanced stereo matching with multi-stage depth filtering to achieve high-quality dense point-cloud reconstruction from RGB-D observations. The dense point cloud is then converted into a probabilistic octree occupancy map, where voxel-wise incremental updates are performed for observed space while unknown regions are retained, enabling a memory-efficient and scalable 3D feasible-space representation. Experiments are conducted in multiple representative coal-mine tunnel scenarios; compared with the original ORB-SLAM3, the number of points in dense mapping increases by approximately 38% on average; in trajectory evaluation on the TUM dataset, the root mean square error, mean error, and median error of the absolute pose error are reduced by 7.7%, 7.1%, and 10%, respectively; after converting the dense point cloud to an octree, the map memory footprint is only about 0.5% of the original point cloud, with a single conversion time of approximately 0.75 s. The experimental results demonstrate that, while ensuring accuracy, the proposed method achieves real-time, efficient, and consistent representation of the 3D feasible domain in complex underground environments, providing a reliable digital spatial foundation for path planning, safe obstacle avoidance, and autonomous operation.
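The octree conversion described here follows the standard probabilistic occupancy-mapping recipe: a log-odds value per voxel, incremental updates for observed space, and unknown space left untouched. As a rough illustration of that update rule only — not the paper's OAK-DenseMapper Pro pipeline — here is a minimal sketch using a flat hash-keyed voxel grid instead of an octree; the increment and clamping constants are assumed, not taken from the paper:

```python
import numpy as np

# Log-odds increments and clamping bounds: assumed values in the spirit
# of OctoMap-style defaults, not taken from the paper.
L_HIT, L_MISS = 0.85, -0.4
L_MIN, L_MAX = -2.0, 3.5

class OccupancyGrid:
    """Probabilistic voxel map: dict from voxel index to log-odds."""

    def __init__(self, resolution=0.05):
        self.res = resolution
        self.logodds = {}

    def key(self, p):
        return tuple(np.floor(np.asarray(p, float) / self.res).astype(int))

    def integrate_endpoints(self, points):
        """Voxel-wise incremental update for observed endpoints only;
        voxels never touched stay unknown (log-odds prior of 0)."""
        for p in points:
            k = self.key(p)
            self.logodds[k] = float(np.clip(
                self.logodds.get(k, 0.0) + L_HIT, L_MIN, L_MAX))

    def occupancy(self, p):
        """Occupancy probability via the logistic of the log-odds."""
        l = self.logodds.get(self.key(p), 0.0)
        return 1.0 - 1.0 / (1.0 + np.exp(l))
```

A full OctoMap-style map would additionally ray-cast the free space between the sensor origin and each endpoint, applying L_MISS along the ray, and would store voxels in an octree rather than a flat hash.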

21 pages, 20581 KB  
Article
Stereo-Based Single-Shot Hand-to-Eye Calibration for Robot Arms
by Pushkar Kadam, Gu Fang, Farshid Amirabdollahian, Ju Jia Zou and Patrick Holthaus
Computers 2026, 15(1), 53; https://doi.org/10.3390/computers15010053 - 13 Jan 2026
Viewed by 292
Abstract
Robot hand-to-eye calibration is a necessary process for a robot arm to perceive and interact with its environment. Past approaches required collecting multiple images using a calibration board placed at different locations relative to the robot. When the robot or camera is displaced from its calibrated position, hand–eye calibration must be redone using the same tedious process. In this research, we developed a novel method that uses a semi-automatic process to perform hand-to-eye calibration with a stereo camera, generating a transformation matrix from the world to the camera coordinate frame from a single image. We use a robot-pointer tool attached to the robot’s end-effector to manually establish a relationship between the world and the robot coordinate frame. Then, we establish the relationship between the camera and the robot using a transformation matrix that maps points observed in the stereo image frame from two-dimensional space to the robot’s three-dimensional coordinate frame. Our analysis of the stereo calibration showed a reprojection error of 0.26 pixels. An evaluation metric was developed to test the camera-to-robot transformation matrix, and the experimental results showed median root mean square errors of less than 1 mm in the x and y directions and less than 2 mm in the z direction in the robot coordinate frame. The results show that, with this work, we contribute a hand-to-eye calibration method that uses three non-collinear points in a single stereo image to map camera-to-robot coordinate-frame transformations.
(This article belongs to the Special Issue Advanced Human–Robot Interaction 2025)
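The core geometric step — aligning three or more non-collinear 3-D points seen in the camera frame with the same points in the robot frame — reduces to a rigid point-set alignment. A minimal sketch of that step using the standard Kabsch/SVD solution (the paper's pointer tool and stereo-matching machinery are not reproduced here):

```python
import numpy as np

def rigid_transform(P, Q):
    """Rotation R and translation t with Q_i ≈ R @ P_i + t, estimated
    from N >= 3 non-collinear 3-D correspondences (Kabsch/SVD method).
    P, Q: (N, 3) arrays of matching points in the two frames."""
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                    # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])  # avoid reflection
    R = Vt.T @ D @ U.T
    t = cq - R @ cp
    return R, t
```

With exactly three non-collinear points, as in the paper, this least-squares solution is exactly determined up to noise.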

24 pages, 4196 KB  
Article
Real-Time Cooperative Path Planning and Collision Avoidance for Autonomous Logistics Vehicles Using Reinforcement Learning and Distributed Model Predictive Control
by Mingxin Li, Hui Li, Yunan Yao, Yulei Zhu, Hailong Weng, Huabiao Jin and Taiwei Yang
Machines 2026, 14(1), 27; https://doi.org/10.3390/machines14010027 - 24 Dec 2025
Viewed by 460
Abstract
In industrial environments such as ports and warehouses, autonomous logistics vehicles face significant challenges in coordinating multiple vehicles while ensuring safe and efficient path planning. This study proposes a novel real-time cooperative control framework for autonomous vehicles, combining reinforcement learning (RL) and distributed model predictive control (DMPC). The RL agent dynamically adjusts the optimization weights of the DMPC to adapt to the vehicle’s real-time environment, while the DMPC enables decentralized path planning and collision avoidance. The system leverages multi-source sensor fusion, including GNSS, UWB, IMU, LiDAR, and stereo cameras, to provide accurate state estimations of vehicles. Simulation results demonstrate that the proposed RL-DMPC approach outperforms traditional centralized control strategies in terms of tracking accuracy, collision avoidance, and safety margins. Furthermore, the proposed method significantly improves control smoothness compared to rule-based strategies. This framework is particularly effective in dynamic and constrained industrial settings, offering a robust solution for multi-vehicle coordination with minimal communication delays. The study highlights the potential of combining RL with DMPC to achieve real-time, scalable, and adaptive solutions for autonomous logistics.
(This article belongs to the Special Issue Control and Path Planning for Autonomous Vehicles)
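To make the weight-adaptation idea concrete, here is a deliberately tiny sketch: a one-dimensional, single-integrator receding-horizon tracker whose tracking-vs-effort weights are exposed as inputs, so an external (e.g., learned) policy can retune them at each step. This is an illustrative simplification, not the paper's DMPC formulation; all names and the model are assumed:

```python
import numpy as np

def mpc_step(x0, ref, dt, w_track, w_effort):
    """One receding-horizon step for the 1-D single integrator
    x_{k+1} = x_k + dt * u_k, minimizing
    sum_k w_track*(x_k - ref_k)^2 + w_effort*u_k^2
    in closed form as a linear least-squares problem.
    Returns only the first control, MPC-style."""
    ref = np.asarray(ref, float)
    N = len(ref)
    L = np.tril(np.ones((N, N)))           # x = x0 + dt * L @ u
    A = np.vstack([np.sqrt(w_track) * dt * L,
                   np.sqrt(w_effort) * np.eye(N)])
    b = np.concatenate([np.sqrt(w_track) * (ref - x0), np.zeros(N)])
    u = np.linalg.lstsq(A, b, rcond=None)[0]
    return u[0]
```

A learned policy in this setting might, for instance, raise w_track when tracking error grows and raise w_effort near other vehicles to smooth the control — the kind of context-dependent retuning the abstract attributes to the RL agent.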

17 pages, 2339 KB  
Article
Robust Direct Multi-Camera SLAM in Challenging Scenarios
by Yonglei Pan, Yueshang Zhou, Qiming Qi, Guoyan Wang, Yanwen Jiang, Hongqi Fan and Jun He
Electronics 2025, 14(23), 4556; https://doi.org/10.3390/electronics14234556 - 21 Nov 2025
Viewed by 722
Abstract
Traditional monocular and stereo visual SLAM systems often fail to operate stably in complex unstructured environments (e.g., weakly textured or repetitively textured scenes) due to feature scarcity from their limited fields of view. In contrast, multi-camera systems can effectively overcome the perceptual limitations of monocular or stereo setups by providing broader field-of-view coverage. However, most existing multi-camera visual SLAM systems are primarily feature-based and thus still constrained by the inherent limitations of feature extraction in such environments. To address this issue, a multi-camera visual SLAM framework based on the direct method is proposed. In the front-end, a detector-free matcher named Efficient LoFTR is incorporated, enabling pose estimation through dense pixel associations to improve localization accuracy and robustness. In the back-end, geometric constraints among multiple cameras are integrated, and system localization accuracy is further improved through a joint optimization process. Through extensive experiments on public datasets and a self-built simulation dataset, the proposed method achieves superior performance over state-of-the-art approaches regarding localization accuracy, trajectory completeness, and environmental adaptability, thereby validating its high robustness in complex unstructured environments.
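Given the dense correspondences a detector-free matcher such as Efficient LoFTR produces, per-pair pose estimation can be bootstrapped with the classic essential-matrix pipeline. A minimal single-camera sketch with OpenCV (the paper's multi-camera constraints and joint back-end optimization are out of scope here):

```python
import cv2
import numpy as np

def relative_pose(pts0, pts1, K):
    """Relative pose (R, unit-norm t) between two views from matched
    pixel coordinates pts0, pts1 ((N, 2) float arrays) and intrinsics K."""
    E, inliers = cv2.findEssentialMat(pts0, pts1, K,
                                      method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts0, pts1, K, mask=inliers)
    return R, t
```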

20 pages, 2797 KB  
Article
Seed 3D Phenotyping Across Multiple Crops Using 3D Gaussian Splatting
by Jun Gao, Chao Zhu, Junguo Hu, Fei Deng, Zhaoxin Xu and Xiaomin Wang
Agriculture 2025, 15(22), 2329; https://doi.org/10.3390/agriculture15222329 - 8 Nov 2025
Viewed by 1726
Abstract
This study introduces a versatile seed 3D reconstruction method that is applicable to multiple crops—including maize, wheat, and rice—and designed to overcome the inefficiency and subjectivity of manual measurements and the high costs of laser-based phenotyping. A panoramic video of the seed is captured and processed through frame sampling to extract multi-view images. Structure-from-Motion (SfM) is employed for sparse reconstruction and camera pose estimation, while 3D Gaussian Splatting (3DGS) is utilized for high-fidelity dense reconstruction, generating detailed point cloud models. The subsequent point cloud preprocessing, filtering, and segmentation enable the extraction of key phenotypic parameters, including length, width, height, surface area, and volume. The experimental evaluations demonstrated a high measurement accuracy, with coefficients of determination (R²) for length, width, and height reaching 0.9361, 0.8889, and 0.946, respectively. Moreover, the reconstructed models exhibit superior image quality, with peak signal-to-noise ratio (PSNR) values consistently ranging from 35 to 37 dB, underscoring the robustness of 3DGS in preserving fine structural details. Compared to conventional multi-view stereo (MVS) techniques, the proposed method achieves significantly improved reconstruction accuracy and visual fidelity. The key outcomes of this study confirm that the 3DGS-based pipeline provides a highly accurate, efficient, and scalable solution for digital phenotyping, establishing a robust foundation for its application across diverse crop species.
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
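Once a dense, segmented seed point cloud exists, the listed phenotypic parameters follow from simple geometry. A minimal sketch of one plausible extraction — PCA axes for length/width/height plus a convex hull for surface area and volume; the convex-hull simplification is an assumption (real seeds are not convex), and the paper's own measurement pipeline may differ:

```python
import numpy as np
from scipy.spatial import ConvexHull

def seed_dimensions(points):
    """Length, width, height (sorted PCA-axis extents), plus convex-hull
    surface area and volume, from an (N, 3) seed point cloud."""
    pts = np.asarray(points, float)
    centered = pts - pts.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    extents = np.ptp(centered @ Vt.T, axis=0)   # spans along PCA axes
    length, width, height = np.sort(extents)[::-1]
    hull = ConvexHull(pts)                      # .area is surface area in 3-D
    return length, width, height, hull.area, hull.volume
```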

16 pages, 4910 KB  
Article
Three-Dimensional Reconstruction of Fragment Shape and Motion in Impact Scenarios
by Milad Davoudkhani and Hans-Gerd Maas
Sensors 2025, 25(18), 5842; https://doi.org/10.3390/s25185842 - 18 Sep 2025
Viewed by 991
Abstract
Photogrammetry-based 3D reconstruction of the shape of fast-moving objects from image sequences presents a complex yet increasingly important challenge. The 3D reconstruction of a large number of fast-moving objects may, for instance, be of high importance in the study of dynamic phenomena such as impact experiments and explosions. In this context, analyzing the 3D shape, size, and motion trajectory of the resulting fragments provides valuable insights into the underlying physical processes, including energy dissipation and material failure. High-speed cameras are typically employed to capture the motion of the resulting fragments. The high cost, the complexity of synchronizing multiple units, and lab conditions often limit the number of high-speed cameras that can be practically deployed in experimental setups. In some cases, only a single high-speed camera will be available or can be used. Challenges such as overlapping fragments, shadows, and dust often complicate tracking and degrade reconstruction quality. These challenges highlight the need for advanced 3D reconstruction techniques capable of handling incomplete, noisy, and occluded data to enable accurate analysis under such extreme conditions. In this paper, we use a combination of photogrammetry, computer vision, and artificial intelligence techniques in order to improve feature detection of moving objects and to enable more robust trajectory and 3D shape reconstruction in complex, real-world scenarios. The focus of this paper is on achieving accurate 3D shape estimation and motion tracking of dynamic objects generated by impact loading using stereo- or monoscopic high-speed cameras. Depending on the object’s rotational behavior and the number of available cameras, two methods are presented, both enabling the successful 3D reconstruction of fragment shapes and motion.
(This article belongs to the Section Sensing and Imaging)
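As a baseline for the moving-object detection discussed above, a background-subtraction-plus-contour pass already yields per-frame fragment centroids; the paper's AI-assisted detection targets exactly the cases (overlap, shadows, dust) where a sketch like this breaks down. All thresholds below are assumed:

```python
import cv2

def fragment_centroids(frames):
    """Per-frame centroids of moving blobs via MOG2 background
    subtraction on a sequence of grayscale or BGR frames."""
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)
    tracks = []
    for frame in frames:
        mask = subtractor.apply(frame)
        # MOG2 marks shadows as 127; keep only confident foreground (255).
        _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        centroids = []
        for c in contours:
            m = cv2.moments(c)
            if m["m00"] > 50:                  # ignore tiny specks (assumed)
                centroids.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
        tracks.append(centroids)
    return tracks
```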

27 pages, 5515 KB  
Article
Optimizing Multi-Camera Mobile Mapping Systems with Pose Graph and Feature-Based Approaches
by Ahmad El-Alailyi, Luca Morelli, Paweł Trybała, Francesco Fassi and Fabio Remondino
Remote Sens. 2025, 17(16), 2810; https://doi.org/10.3390/rs17162810 - 13 Aug 2025
Cited by 1 | Viewed by 3103
Abstract
Multi-camera Visual Simultaneous Localization and Mapping (V-SLAM) increases spatial coverage through multi-view image streams, improving localization accuracy and reducing data acquisition time. Despite its speed and general robustness, V-SLAM often struggles to achieve precise camera poses necessary for accurate 3D reconstruction, especially in complex environments. This study introduces two novel multi-camera optimization methods to enhance pose accuracy, reduce drift, and ensure loop closures. These methods refine multi-camera V-SLAM outputs within existing frameworks and are evaluated in two configurations: (1) multiple independent stereo V-SLAM instances operating on separate camera pairs; and (2) multi-view odometry processing all camera streams simultaneously. The proposed optimizations include (1) a multi-view feature-based optimization that integrates V-SLAM poses with rigid inter-camera constraints and bundle adjustment; and (2) a multi-camera pose graph optimization that fuses multiple trajectories using relative pose constraints and robust noise models. Validation is conducted through two complex 3D surveys using the ATOM-ANT3D multi-camera fisheye mobile mapping system. Results demonstrate survey-grade accuracy comparable to traditional photogrammetry, with reduced computational time, advancing toward near real-time 3D mapping of challenging environments.
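The second proposed optimization — fusing trajectories through relative pose constraints — is a pose-graph problem. A deliberately small SE(2) sketch with a plain least-squares solver (the actual system works in SE(3) with robust noise models, both omitted here):

```python
import numpy as np
from scipy.optimize import least_squares

def wrap(a):
    """Wrap an angle to (-pi, pi]."""
    return (a + np.pi) % (2 * np.pi) - np.pi

def residuals(flat, edges):
    poses = flat.reshape(-1, 3)                 # rows: (x, y, theta)
    res = []
    for i, j, (dx, dy, dth) in edges:           # measured pose of j in i
        xi, yi, thi = poses[i]
        xj, yj, thj = poses[j]
        c, s = np.cos(thi), np.sin(thi)
        px = c * (xj - xi) + s * (yj - yi)      # predicted relative x
        py = -s * (xj - xi) + c * (yj - yi)     # predicted relative y
        res += [px - dx, py - dy, wrap(thj - thi - dth)]
    res += list(poses[0])                       # gauge: pin pose 0 at origin
    return np.asarray(res)

def optimize_graph(initial_poses, edges):
    """edges: list of (i, j, (dx, dy, dtheta)) relative-pose constraints."""
    sol = least_squares(residuals,
                        np.asarray(initial_poses, float).ravel(),
                        args=(edges,))
    return sol.x.reshape(-1, 3)
```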

21 pages, 4909 KB  
Article
Rapid 3D Camera Calibration for Large-Scale Structural Monitoring
by Fabio Bottalico, Nicholas A. Valente, Christopher Niezrecki, Kshitij Jerath, Yan Luo and Alessandro Sabato
Remote Sens. 2025, 17(15), 2720; https://doi.org/10.3390/rs17152720 - 6 Aug 2025
Cited by 1 | Viewed by 2077
Abstract
Computer vision techniques such as three-dimensional digital image correlation (3D-DIC) and three-dimensional point tracking (3D-PT) have demonstrated broad applicability for monitoring the conditions of large-scale engineering systems by reconstructing and tracking dynamic point clouds corresponding to the surface of a structure. Accurate stereophotogrammetry measurements require the stereo cameras to be calibrated to determine their intrinsic and extrinsic parameters by capturing multiple images of a calibration object. This image-based approach becomes cumbersome and time-consuming as the size of the tested object increases. To streamline the calibration and make it scale-insensitive, a multi-sensor system embedding inertial measurement units and a laser sensor is developed to compute the extrinsic parameters of the stereo cameras. In this research, the accuracy of the proposed sensor-based calibration method in performing stereophotogrammetry is validated experimentally and compared with traditional approaches. Tests conducted at various scales reveal that the proposed sensor-based calibration enables reconstructing both static and dynamic point clouds, measuring displacements with an accuracy higher than 95% relative to traditional image-based calibration, while being up to an order of magnitude faster and easier to deploy. The novel approach has broad applications for making static, dynamic, and deformation measurements to transform how large-scale structural health monitoring can be performed.
(This article belongs to the Special Issue New Perspectives on 3D Point Cloud (Third Edition))
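The idea of replacing image-based extrinsic calibration with direct sensing can be sketched as follows: each camera's attitude comes from its IMU and the baseline from the laser sensor. The axis conventions and the assumption that the baseline lies along the left camera's x-axis are mine, not the paper's:

```python
import numpy as np

def rotation_from_rpy(roll, pitch, yaw):
    """Z-Y-X (yaw·pitch·roll) rotation matrix from IMU angles, radians."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    return Rz @ Ry @ Rx

def stereo_extrinsics(rpy_left, rpy_right, baseline_m):
    """Extrinsics (X_right = R @ X_left + t) from the two IMU attitudes
    and the laser-measured baseline, assumed along the left x-axis."""
    R_wl = rotation_from_rpy(*rpy_left)      # left camera to world
    R_wr = rotation_from_rpy(*rpy_right)     # right camera to world
    R = R_wr.T @ R_wl                        # left frame to right frame
    t = R @ np.array([-baseline_m, 0.0, 0.0])
    return R, t
```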

17 pages, 5356 KB  
Article
A Study on the Features for Multi-Target Dual-Camera Tracking and Re-Identification in a Comparatively Small Environment
by Jong-Chen Chen, Po-Sheng Chang and Yu-Ming Huang
Electronics 2025, 14(10), 1984; https://doi.org/10.3390/electronics14101984 - 13 May 2025
Viewed by 1631
Abstract
Tracking across multiple cameras is a complex problem in computer vision. Its main challenges include camera calibration, occlusion handling, camera overlap and field of view, person re-identification, and data association. In this study, we designed a laboratory as a research environment that facilitates our exploration of some of the above challenging issues. This study uses stereo camera calibration and key point detection to reconstruct the three-dimensional key points of the person being tracked, thereby performing person-tracking tasks. The results show that dual-camera 3D spatial tracking provides noticeably more continuous monitoring than a single camera alone. This study adopts four methods of evaluating person similarity, which effectively reduce the creation of spurious person identities. However, depending on how people move through the scene, using all four methods simultaneously may not outperform a single well-chosen assessment method.
(This article belongs to the Collection Computer Vision and Pattern Recognition Techniques)
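The 3-D key-point reconstruction step used for tracking can be sketched in a few lines once both cameras are calibrated: with known 3x4 projection matrices, detected 2-D key points triangulate directly. A minimal OpenCV version (person detection and re-identification are separate problems not shown):

```python
import cv2
import numpy as np

def triangulate_keypoints(P1, P2, kp1, kp2):
    """3-D key points from two calibrated views.
    P1, P2: 3x4 float projection matrices; kp1, kp2: (N, 2) pixel arrays
    of the same key points detected in each view."""
    pts4 = cv2.triangulatePoints(P1, P2,
                                 np.asarray(kp1, float).T,
                                 np.asarray(kp2, float).T)
    return (pts4[:3] / pts4[3]).T            # dehomogenize to (N, 3)
```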

15 pages, 11293 KB  
Article
An Assessment of the Stereo and Near-Infrared Camera Calibration Technique Using a Novel Real-Time Approach in the Context of Resource Efficiency
by Larisa Ivascu, Vlad-Florin Vinatu and Mihail Gaianu
Processes 2025, 13(4), 1198; https://doi.org/10.3390/pr13041198 - 15 Apr 2025
Viewed by 1520
Abstract
This paper provides a comparative analysis of calibration techniques applicable to stereo and near-infrared (NIR) camera systems, with a specific emphasis on the Intel RealSense SR300 alongside a standard 2-megapixel NIR camera. This study investigates the pivotal function of calibration within both stereo vision and NIR imaging applications, which are essential across various domains, including robotics, augmented reality, and low-light imaging. For stereo systems, we scrutinise the conventional method involving a 9 × 6 chessboard pattern utilised to ascertain the intrinsic and extrinsic camera parameters. The proposed methodology consists of three main steps: (1) real-time calibration error classification for stereo cameras, (2) NIR-specific calibration techniques, and (3) a comprehensive evaluation framework. This research introduces a novel real-time evaluation methodology that classifies calibration errors predicated on the pixel offsets between corresponding points in the left and right images. NIR camera calibration techniques, in turn, are modified to address the distinctive properties of near-infrared light. We deliberate on the difficulties encountered in devising NIR–visible calibration patterns and the imperative to consider the spectral response and temperature sensitivity within the calibration procedure. The paper also puts forth an innovative calibration assessment application that is relevant to both systems. For stereo cameras, it evaluates corner detection accuracy in real time across multiple image pairs, whereas for NIR cameras it assesses distortion correction and intrinsic parameter accuracy under varying lighting conditions. Our experiments validate the necessity of routine calibration assessment, as environmental factors may compromise the calibration quality over time. We conclude by underscoring the disparities in the calibration requirements between stereo and NIR systems, thereby emphasising the need for specialised approaches tailored to each domain to guarantee an optimal performance in their respective applications.
(This article belongs to the Special Issue Circular Economy and Efficient Use of Resources (Volume II))
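The real-time classification of calibration errors from pixel offsets between corresponding points can be approximated with chessboard corners in a rectified pair: in a well-calibrated rig, matched corners share image rows. A sketch with assumed thresholds (the abstract does not specify the paper's own classification bands):

```python
import cv2
import numpy as np

PATTERN = (9, 6)        # inner-corner grid of the paper's chessboard

def vertical_offsets(rect_left, rect_right):
    """Row offsets between matched chessboard corners in a rectified
    pair; near-zero offsets mean the calibration still holds. Assumes
    the corner grids are detected in the same order in both views."""
    ok_l, cl = cv2.findChessboardCorners(rect_left, PATTERN)
    ok_r, cr = cv2.findChessboardCorners(rect_right, PATTERN)
    if not (ok_l and ok_r):
        return None
    return np.abs(cl[:, 0, 1] - cr[:, 0, 1])   # per-corner |dy| in pixels

def classify(offsets, good=0.5, usable=1.5):   # thresholds are assumed
    worst = float(np.max(offsets))
    if worst < good:
        return "good"
    return "usable" if worst < usable else "recalibrate"
```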

25 pages, 6410 KB  
Article
Multi-View Stereo Using Perspective-Aware Features and Metadata to Improve Cost Volume
by Zongcheng Zuo, Yuanxiang Li, Yu Zhou and Fan Mo
Sensors 2025, 25(7), 2233; https://doi.org/10.3390/s25072233 - 2 Apr 2025
Viewed by 3403
Abstract
Feature matching is pivotal when using multi-view stereo (MVS) to reconstruct dense 3D models from calibrated images. This paper proposes PAC-MVSNet, which integrates perspective-aware convolution (PAC) and metadata-enhanced cost volumes to address the challenges in reflective and texture-less regions. PAC dynamically aligns convolutional kernels with scene perspective lines, while the use of metadata (e.g., camera pose distance) enables geometric reasoning during cost aggregation. In PAC-MVSNet, we introduce feature matching with long-range tracking that utilizes both internal and external focuses to integrate extensive contextual data within individual images as well as across multiple images. To enhance the performance of the feature matching with long-range tracking, we also propose a perspective-aware convolution module that directs the convolutional kernel to capture features along the perspective lines. This enables the module to extract perspective-aware features from images, improving the feature matching. Finally, we crafted a specific 2D CNN that fuses image priors, thereby integrating keyframes and geometric metadata within the cost volume to evaluate depth planes. Our method represents the first attempt to embed the existing physical model knowledge into a network for completing MVS tasks, which achieved optimal performance using multiple benchmark datasets.
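The cost volume at the center of MVS networks like this one is built by sweeping fronto-parallel depth planes. A minimal classical (non-learned) sketch of that construction — the paper's perspective-aware convolution and metadata fusion are the novelty and are not reproduced — using the usual plane-induced homography; the sign/direction conventions are assumed:

```python
import cv2
import numpy as np

def plane_sweep_cost_volume(ref_img, src_img, K, R, t, depths):
    """Photometric cost volume over fronto-parallel depth planes for a
    grayscale image pair. For each depth d, the source view is warped
    onto the plane z = d via the plane-induced homography
    H = K (R + t n^T / d) K^{-1}, then scored by absolute difference.
    Assumes X_src = R @ X_ref + t and shared intrinsics K."""
    K_inv = np.linalg.inv(K)
    n = np.array([[0.0, 0.0, 1.0]])          # plane normal in ref frame
    ref = ref_img.astype(np.float32)
    h, w = ref.shape
    volume = np.empty((len(depths), h, w), np.float32)
    for i, d in enumerate(depths):
        H = K @ (R + t.reshape(3, 1) @ n / d) @ K_inv
        warped = cv2.warpPerspective(
            src_img, H, (w, h),
            flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
        volume[i] = np.abs(ref - warped.astype(np.float32))
    return volume    # argmin over axis 0 = winner-take-all depth map
```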

19 pages, 39933 KB  
Article
SIFT-Based Depth Estimation for Accurate 3D Reconstruction in Cultural Heritage Preservation
by Porawat Visutsak, Xiabi Liu, Chalothon Choothong and Fuangfar Pensiri
Appl. Syst. Innov. 2025, 8(2), 43; https://doi.org/10.3390/asi8020043 - 24 Mar 2025
Cited by 1 | Viewed by 2985
Abstract
This paper describes a proposed method for preserving tangible cultural heritage by reconstructing a 3D model of cultural heritage using 2D captured images. The input data represent a set of multiple 2D images captured using different views around the object. An image registration technique is applied to configure the overlapping images with the depth of images computed to construct the 3D model. The automatic 3D reconstruction system consists of three steps: (1) Image registration for managing the overlapping of 2D input images; (2) Depth computation for managing image orientation and calibration; and (3) 3D reconstruction using point cloud and stereo-dense matching. We collected and recorded 2D images of tangible cultural heritage objects, such as high-relief and round-relief sculptures, using a low-cost digital camera. The performance analysis of the proposed method, in conjunction with the generation of 3D models of tangible cultural heritage, demonstrates significantly improved accuracy in depth information. This process effectively creates point cloud locations, particularly in high-contrast backgrounds.
(This article belongs to the Special Issue Advancements in Deep Learning and Its Applications)
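The registration and depth steps hinge on SIFT correspondences between overlapping views. A minimal matching sketch with OpenCV and Lowe's ratio test; for a rectified stereo pair, metric depth then follows from z = f·B/disparity:

```python
import cv2

def sift_matches(img1, img2, ratio=0.75):
    """SIFT key points plus Lowe ratio test; returns matched pixel
    coordinate pairs [((x1, y1), (x2, y2)), ...] for two grayscale images."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img1, None)
    k2, d2 = sift.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pairs = []
    for m, n in matcher.knnMatch(d1, d2, k=2):
        if m.distance < ratio * n.distance:    # keep unambiguous matches
            pairs.append((k1[m.queryIdx].pt, k2[m.trainIdx].pt))
    return pairs
```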

21 pages, 16376 KB  
Article
Triple-Camera Rectification for Depth Estimation Sensor
by Minkyung Jeon, Jinhong Park, Jin-Woo Kim and Sungmin Woo
Sensors 2024, 24(18), 6100; https://doi.org/10.3390/s24186100 - 20 Sep 2024
Cited by 2 | Viewed by 2220
Abstract
In this study, we propose a novel rectification method for three cameras using a single image for depth estimation. Stereo rectification serves as a fundamental preprocessing step for disparity estimation in stereoscopic cameras. However, off-the-shelf depth cameras often include an additional RGB camera for creating 3D point clouds. Existing rectification methods only align two cameras, necessitating an additional rectification and remapping process to align the third camera. Moreover, these methods require multiple reference checkerboard images for calibration and aim to minimize alignment errors, but often result in rotated images when there is significant misalignment between two cameras. In contrast, the proposed method simultaneously rectifies three cameras in a single shot without unnecessary rotation. To achieve this, we designed a lab environment with checkerboard settings and obtained multiple sample images from the cameras. The optimization function, designed specifically for rectification in stereo matching, enables the simultaneous alignment of all three cameras while ensuring performance comparable to traditional methods. Experimental results with real camera samples demonstrate the benefits of the proposed method and provide a detailed analysis of unnecessary rotations in the rectified images.
(This article belongs to the Collection 3D Imaging and Sensing System)
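For contrast with the proposed single-shot three-camera method, here is roughly what the conventional two-camera rectification it replaces looks like in OpenCV; a third camera would then need its own additional rectification and remapping pass, which is the overhead the paper targets:

```python
import cv2

def rectify_pair(K1, D1, K2, D2, image_size, R, T):
    """Classic two-view rectification: rectifying rotations R1/R2,
    projections P1/P2, and the remap tables for both cameras.
    K*, D*: intrinsics and distortion; R, T: extrinsics between cameras."""
    R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(
        K1, D1, K2, D2, image_size, R, T, alpha=0)
    map1x, map1y = cv2.initUndistortRectifyMap(
        K1, D1, R1, P1, image_size, cv2.CV_32FC1)
    map2x, map2y = cv2.initUndistortRectifyMap(
        K2, D2, R2, P2, image_size, cv2.CV_32FC1)
    return (map1x, map1y), (map2x, map2y), Q
```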

16 pages, 2676 KB  
Article
Point Cloud Densification Algorithm for Multiple Cameras and Lidars Data Fusion
by Jakub Winter and Robert Nowak
Sensors 2024, 24(17), 5786; https://doi.org/10.3390/s24175786 - 5 Sep 2024
Cited by 1 | Viewed by 3103
Abstract
Fusing data from many sources helps to achieve improved analysis and results. In this work, we present a new algorithm to fuse data from multiple cameras with data from multiple lidars. This algorithm was developed to increase the sensitivity and specificity of autonomous vehicle perception systems, where the most accurate sensors measuring the vehicle’s surroundings are cameras and lidar devices. Perception systems based on data from one type of sensor do not use complete information and have lower quality. The camera provides two-dimensional images; lidar produces three-dimensional point clouds. We developed a method for matching pixels on a pair of stereoscopic images using dynamic programming inspired by an algorithm to match sequences of amino acids used in bioinformatics. We improve the quality of the basic algorithm using additional data from edge detectors. Furthermore, we also improve the algorithm’s performance by limiting the range of candidate pixel matches based on feasible vehicle speeds. We perform point cloud densification in the final step of our method, fusing lidar output data with stereo vision output. We implemented our algorithm in C++ with Python API, and we provided the open-source library named Stereo PCD. This library very efficiently fuses data from multiple cameras and multiple lidars. In the article, we present the results of our approach on benchmark databases in terms of quality and performance. We compare our algorithm with other popular methods.
(This article belongs to the Section Sensing and Imaging)
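The sequence-alignment analogy is quite literal: matching pixels along a rectified scanline pair is a Needleman–Wunsch-style dynamic program, with occlusions playing the role of gaps. A minimal sketch (the paper's edge-detector costs and speed-based search pruning are omitted; the occlusion cost is an assumed constant):

```python
import numpy as np

def dp_scanline_disparity(left_row, right_row, occlusion_cost=20.0):
    """Align one rectified scanline pair by dynamic programming: pixels
    are either matched (intensity-difference cost) or skipped (occluded,
    gap cost), exactly as residues are in sequence alignment."""
    n, m = len(left_row), len(right_row)
    D = np.zeros((n + 1, m + 1))
    D[:, 0] = np.arange(n + 1) * occlusion_cost
    D[0, :] = np.arange(m + 1) * occlusion_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = D[i - 1, j - 1] + abs(float(left_row[i - 1]) -
                                          float(right_row[j - 1]))
            D[i, j] = min(match,
                          D[i - 1, j] + occlusion_cost,
                          D[i, j - 1] + occlusion_cost)
    # Backtrack to recover a per-pixel disparity for the left scanline.
    disp = np.zeros(n)
    i, j = n, m
    while i > 0 and j > 0:
        cost = abs(float(left_row[i - 1]) - float(right_row[j - 1]))
        if np.isclose(D[i, j], D[i - 1, j - 1] + cost):
            disp[i - 1] = (i - 1) - (j - 1)    # disparity = x_left - x_right
            i, j = i - 1, j - 1
        elif np.isclose(D[i, j], D[i - 1, j] + occlusion_cost):
            i -= 1
        else:
            j -= 1
    return disp
```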

16 pages, 3639 KB  
Article
Time-of-Flight Camera Intensity Image Reconstruction Based on an Untrained Convolutional Neural Network
by Tian-Long Wang, Lin Ao, Na Han, Fu Zheng, Yan-Qiu Wang and Zhi-Bin Sun
Photonics 2024, 11(9), 821; https://doi.org/10.3390/photonics11090821 - 30 Aug 2024
Cited by 6 | Viewed by 3143
Abstract
With the continuous development of science and technology, laser ranging has become more efficient, convenient, and widespread, and it is widely used in the fields of medicine, engineering, video games, and three-dimensional imaging. A time-of-flight (ToF) camera is a three-dimensional stereo imaging device with the advantages of small size, small measurement error, and strong anti-interference ability. However, compared to traditional sensors, ToF cameras typically exhibit lower resolution and signal-to-noise ratio due to inevitable noise from multipath interference and mixed pixels during usage. Additionally, in environments with scattering media, the information about objects gets scattered multiple times, making it challenging for ToF cameras to obtain effective object information. To address these issues, we propose a solution that combines ToF cameras with single-pixel imaging theory. Leveraging intensity information acquired by ToF cameras, we apply various reconstruction algorithms to reconstruct the object’s image. Under undersampling conditions, our reconstruction approach yields a higher peak signal-to-noise ratio compared to the raw camera image, significantly improving the quality of the target object’s image. Furthermore, when ToF cameras fail in environments with scattering media, our proposed approach successfully reconstructs the object’s image when the camera is imaging through the scattering medium. This experimental demonstration effectively reduces the noise and direct ambient light generated by the ToF camera itself, while opening up the potential application of ToF cameras in challenging environments, such as scattering media or underwater.
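The single-pixel idea underlying the reconstruction can be shown at full sampling with orthogonal Hadamard patterns, where inversion reduces to one correlation; the paper's interest is the harder undersampled case, which swaps this direct inverse for iterative or learned reconstruction algorithms. The function names and the 32 × 32 resolution below are assumptions for illustration:

```python
import numpy as np
from scipy.linalg import hadamard

def single_pixel_reconstruct(measure, size=32):
    """Fully sampled Hadamard single-pixel imaging. `measure(pattern)`
    returns the bucket-detector reading for one pattern, i.e. the inner
    product of pattern and scene. Since the (symmetric) Hadamard matrix
    satisfies H @ H = N * I, the scene is recovered by one correlation."""
    N = size * size                      # must be a power of two
    H = hadamard(N).astype(np.float32)
    y = np.array([measure(row.reshape(size, size)) for row in H])
    return (H @ y / N).reshape(size, size)

# Usage sketch with a synthetic 32x32 scene:
# scene = np.random.rand(32, 32).astype(np.float32)
# img = single_pixel_reconstruct(lambda p: float(np.sum(p * scene)))
```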