Search Results (3,474)

Search Parameters:
Keywords = 3D cameras

30 pages, 11975 KB  
Article
Structured Light Camera’s Point Clouds Captured and Stitched by Humanoid for 3D Objects Based on ICP Registration Algorithm
by Hong-Yu Lin, Che-Ping Hung, Kuo-Yang Tu and Fang-Tsen Kuo
Biomimetics 2026, 11(7), 449; https://doi.org/10.3390/biomimetics11070449 (registering DOI) - 29 Jun 2026
Abstract
In recent decades, humanoids have become more popular in various applications. However, their applications in human life are more than those in industry. In this paper, a humanoid is used to capture the sets of point clouds of an object for three-dimensional reconstruction. The structured light camera is widely used across diverse 3D scanning applications due to its high resolution, rapid acquisition capability, and adaptability to various material surfaces. Therefore, the humanoid developed by our team holds a structured light camera which captures the point clouds of an object put on a platform for the reconstruction of its 3D digital model. The platform is rotated so that the structured light camera can capture the image of all view angles on the object. Meanwhile, the structured light camera captures point clouds, and the camera of the humanoid recognizes the QR code on the platform so that the sets of point clouds can be distinguished by view angles on the object. Then, the automated registration process of the point cloud sets for a 3D model based on the point-to-plane iterative closest point (ICP) algorithm is proposed. The process incorporates preprocessing techniques, such as downsampling and normal vector estimated from plane, and utilizes the ICP algorithm for registration, ultimately achieving markerless and precision automatic merging of multi-view point cloud data. Experimental results demonstrate that the proposed method with the humanoid can effectively improve the completeness and accuracy of 3D reconstruction models, significantly reduce manual intervention, and enhance the system’s versatility and practical feasibility. Key parameters adjusted for more efficient computation of the ICP algorithm are revealed. In addition, the experimental results of the proposed ICP compared with G-ICP are also included. Full article
(This article belongs to the Special Issue Bio-Inspired Intelligent Robot)
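The point-to-plane ICP registration named in the abstract above can be illustrated with a single linearized alignment step. The sketch below is not the authors' implementation: it assumes that correspondences and target normals are already known (a full ICP loop would re-estimate nearest neighbours on every iteration), it relies on the usual small-angle approximation, and the function name `point_to_plane_step` is hypothetical.

```python
import numpy as np

def point_to_plane_step(src, dst, normals):
    """One linearized point-to-plane step (illustrative; assumes known correspondences).

    Minimizes sum_i (((src_i + w x src_i + t) - dst_i) . n_i)^2 over a small
    rotation vector w and a translation t.
    """
    # (w x p) . n == w . (p x n), so every residual is linear in [w, t].
    A = np.hstack([np.cross(src, normals), normals])   # shape (N, 6)
    b = np.einsum("ij,ij->i", dst - src, normals)      # shape (N,)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[:3], x[3:]                                # rotation vector, translation

# Synthetic check: a pure translation with randomly oriented normals.
rng = np.random.default_rng(0)
src = rng.normal(size=(200, 3))
normals = rng.normal(size=(200, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
true_t = np.array([0.01, -0.02, 0.03])
dst = src + true_t
w_est, t_est = point_to_plane_step(src, dst, normals)
```

In a complete pipeline this step would sit inside a loop that downsamples the clouds, estimates normals, finds closest points, applies the update, and repeats until the residual stops decreasing.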
31 pages, 2412 KB  
Review
3D Particle Field Reconstruction for Tomographic Particle Image Velocimetry Based on a Single Light-Field Camera: A Survey
by Lixia Cao, Wei Gu and Xing Tian
Processes 2026, 14(13), 2101; https://doi.org/10.3390/pr14132101 (registering DOI) - 28 Jun 2026
Viewed by 38
Abstract
Three-dimensional (3D) particle field reconstruction is a core procedure of tomographic particle image velocimetry (Tomo-PIV). Its reconstruction accuracy and efficiency directly determine the ability of the PIV system to characterize various complex flow fields. Compared with traditional multicamera Tomo-PIV, a single light-field camera offers a compact layout, simple calibration, and strong adaptability, making it widely applicable for 3D flow measurement in confined space. This paper systematically reviews recent advances in 3D particle field reconstruction algorithms that use a single light-field camera, including both traditional iterative reconstruction methods and deep learning techniques. First, the imaging mechanism of different light-field cameras, the fundamental theory of light-field Tomo-PIV, and the mathematical foundation of tomographic reconstruction are elaborated to establish a theoretical framework for subsequent algorithm analysis. Next, the advantages, disadvantages, and limitations of traditional iterative reconstruction methods and deep learning techniques are comprehensively analyzed from key dimensions, including reconstruction quality, computational efficiency, inherent defects such as particle elongation and ghost particles, and applicable scenarios. On this basis, the current technical bottlenecks are concluded, including low computational efficiency under high particle concentration, insufficient research on velocity uncertainty quantification, domain mismatch between simulated and experimental datasets, and poor interpretability of deep learning models. Finally, several promising future research directions are discussed, such as the optimization of multiframe correlation-based high-precision reconstruction algorithms, the development of standardized open-source datasets, the interpretability of deep neural networks, and time-resolved flow measurement. 
This study aims to provide a comprehensive algorithmic reference for researchers in the field and facilitate the practical application of light-field Tomo-PIV in engineering fluid mechanics and related disciplines. Full article
(This article belongs to the Section Particle Processes)
17 pages, 7950 KB  
Article
High-Resolution MgB4O7:Ce,Li OSL Foils for Bragg Curve Mapping in Proton Eye Therapy
by Michał Sądel, Leszek Grzanka, Jan Swakoń, Tomasz Horwacik, Damian Wróbel, Sebastian Kusyk, Piotr Płatek and Paweł Bilski
Materials 2026, 19(13), 2751; https://doi.org/10.3390/ma19132751 (registering DOI) - 27 Jun 2026
Viewed by 143
Abstract
By using a PMMA-made therapeutic wedge and a recently developed reusable silicone foil dosimeter based on the optically stimulated luminescence (OSL) of MgB4O7:Ce,Li (MBO) material, direct measurements of the complete proton Bragg curves for two independent clinically relevant proton beams were achieved. The PMMA wedge compensator created a controlled range gradient across the beam field, enabling comprehensive characterisation of Bragg curve features, including the entrance plateau, the maximum of the Bragg peak, and the dosimetrically critical distal fall-off region. Measurements were performed using a dedicated, self-built (3D-printed) optical detection setup equipped with a blue LED (440 nm) that illuminates the MBO foil dosimeter and a highly sensitive electron-multiplication (EMCCD) camera, which simultaneously acquires 2D OSL light from the foil. The prototype technology enables single-shot 2D mapping of the complete Bragg curve. Validation against Monte Carlo (MC) simulations and GafchromicTM EBT3 films demonstrates sub-millimetre accuracy in localising the clinically critical proton parameters: peak-to-plateau, FWHM and distal fall-off. Measurements were performed for two independent therapeutic proton beams with initial energies of 58.8 and 61.1 MeV, routinely used for proton eye-beam treatments at IFJ PAN Krakow. As a proof of concept, the results demonstrate the potential of MBO-based silicone foil technology to reproduce clinically relevant Bragg-curve parameters with accuracy approaching that of the current gold standard for passive 2D dosimetry, GafchromicTM EBT3 films, while systematic differences attributable to optical diffusion, residual LET-dependent quenching, and the dual-foil junction remain to be corrected. Full article
43 pages, 11884 KB  
Article
Quantifying and Improving Stereo Camera Calibration Robustness: An Outlier-Aware Algorithm for Digital Twin Data Acquisition
by Madalina Carbureanu and Florin-Stefan Zamfir
J. Imaging 2026, 12(7), 280; https://doi.org/10.3390/jimaging12070280 - 25 Jun 2026
Viewed by 96
Abstract
As calibration errors have a direct impact on epipolar consistency, rectification accuracy, and metric 3D reconstruction performance, stereo camera calibration is a fundamental requirement for high-accuracy 3D modeling and reliable digital twin data acquisition. Because current calibration workflows (based on pairwise calibration methods) lack systematic data-quality check mechanisms, there is a clear need for more robust data selection strategies. The novelty of the approach lies in a new outlier-aware stereo calibration algorithm (OutAw), a unified multi-stage approach that integrates hard geometric selection, candidate subset generation, multi-criterion ranking, bootstrap stability analysis, and triangulation assessment into a comprehensive and systematic calibration framework. Unlike conventional approaches, OutAw (through its mechanism of detecting and rejecting inconsistent pairs) redefines the calibration strategy from arbitrary to criterion-based data selection. The proposed algorithm is compared with BSC (a baseline OpenCV all-pairs calibration algorithm) and InterFil (an intermediate filtered variant) using 49 stereo pairs (at 1280 × 720 resolution) captured using a planar checkerboard. Using only nine image pairs, OutAw achieved better results (epipolar error 0.5119 px, stereo RMS 0.7666 px) than BSC (epipolar error 1.3687 px, stereo RMS 1.9385 px), representing statistically significant improvements (60.5%, respectively 62.3%). OutAw's geometric consistency was validated by triangulation-based metrics (square-length standard deviation 0.1140 mm and square absolute error 0.1097 mm). Contamination analysis revealed that as the outlier rate increases, the calibration process degrades progressively. The results also highlight that geometric quality-driven image selection is critical for achieving a reliable stereo calibration for digital twin (DT) applications. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
27 pages, 36204 KB  
Article
Full-Field 3D Displacement Measurement of Suspended Ceiling Systems Under Seismic Loading Using a Consumer-Grade Multi-Camera Framework
by Mearge Kahsay Seyfu, Yuan-Sen Yang, Cameron C. W. Flude, David T. Lau, Jeffrey Erochko and Hung-Wei Liu
Sensors 2026, 26(13), 4011; https://doi.org/10.3390/s26134011 - 24 Jun 2026
Viewed by 198
Abstract
Suspended ceiling systems are among the most seismically vulnerable non-structural components in buildings, posing significant life-safety risks and economic losses, yet understanding their full-field kinematic behavior under seismic loading remains a major experimental challenge. Conventional contact sensors offer limited spatial coverage and can alter the dynamic properties of lightweight panels due to mass loading. In contrast, non-contact optical alternatives are rarely feasible in shake-table environments due to restricted viewing angles, extensive areal coverage requirements, and the risk of equipment damage from falling panels. This study proposes an end-to-end three-dimensional displacement measurement framework for large-scale shake-table testing of suspended ceiling systems, employing consumer-grade cameras with purpose-built tools that cover the complete experimental workflow, including motion-based video trimming, semi-automated calibration, a robust multi-stage image-tracking pipeline that maintains trajectory continuity under extreme inter-frame displacements, and a ceiling system motion visualization and analysis tool. The framework was validated through a full-scale shake-table experiment continuously tracking 324 spatial nodes across 81 ceiling panels, achieving an RMSE below 3 mm in all spatial directions and exact peak-frequency agreement in 9 out of 10 test cases. A parallel processing architecture reduced total processing time from over 27 h to under 10 min without GPU acceleration, and six-degree-of-freedom rigid-body analysis resolved the complete panel failure sequence from constrained oscillation through multi-axis rotation to gravitational free fall, a level of kinematic detail unattainable with conventional instrumentation. This framework establishes a practical, scalable foundation for full-field seismic performance assessment of non-structural systems where conventional instrumentation is physically or logistically infeasible. 
Full article
(This article belongs to the Special Issue Advanced Sensors for Image Processing and Analysis)
20 pages, 8158 KB  
Article
IIR-PoinTr: A Framework for Enhancing Pig Body Structure in Pose Point Cloud Completion
by Faming Chang, Mengting Zhou, Zhenwei Yu, Haobo Hu, Benhai Xiong, Fuyang Tian and Xiangfang Tang
Agriculture 2026, 16(13), 1375; https://doi.org/10.3390/agriculture16131375 - 24 Jun 2026
Viewed by 191
Abstract
In precision livestock farming, 3D point clouds provide important data support for analyzing pig behavior and monitoring their health. However, due to environmental occlusions, limited sensor viewpoints, and mutual shielding between pigs, the acquired point clouds are often severely partial, which affects the accuracy of body shape modeling and behavior recognition. To address these challenges, this study constructed a pig pose point cloud dataset using multi-view depth camera acquisition and point cloud registration techniques. Based on this dataset, an improved point cloud completion model, IIR-PoinTr, is proposed to enhance the reconstruction of geometric and topological structures in pig bodies. By strengthening local geometric perception and high-dimensional feature representation, the model improves the reconstruction quality of partial pig point clouds and produces more structurally consistent pig body shapes. Experimental results show that, on the self-constructed pig posture dataset, the proposed method reduces Chamfer Distance (CD-L1) by 3.6%, CD-L2 by 6.9%, and Earth Mover’s Distance (EMD) by 2.0%, while improving the F-score by 5.4% compared with the baseline model. In single-view point cloud completion tasks, the method is capable of reconstructing geometrically consistent pig body structures and increases downstream classification accuracy by 34.9%. These results indicate that the proposed method can improve the reconstruction quality of partial pig point clouds and provide preliminary technical support for posture analysis under occlusion. Full article
23 pages, 1986 KB  
Article
Development, Reliability, and Validity Assessment of a Portable 3D Camera-Based System for Quantifying Postural Sway and Balance
by Vivek Ganesh Sonar, Vibhor Agrawal, Krushal Kalkani, Javad Hashemi and Abhijit Pandya
Sensors 2026, 26(13), 3987; https://doi.org/10.3390/s26133987 - 23 Jun 2026
Viewed by 259
Abstract
Accurate assessment of postural sway is essential for evaluating balance disorders, rehabilitation outcomes, and fall risk. Traditional laboratory-based motion capture systems provide precise center-of-pressure (CoP) measurements, but are expensive, non-portable, and impractical for widespread clinical use. This study describes the development and testing (reliability and validity) of a portable three-dimensional (3D) camera system (Intel RealSense D415) for quantifying sway and balance. Test–retest reliability was evaluated in healthy adults (n = 10; 6 males, 4 females; mean age 22.3 ± 1.6 years), yielding intraclass correlation coefficients ICC = 0.84–0.86 (95% CI: 0.61–0.95). Concurrent validity, established against a laboratory-based optical motion capture system (Optotrak), demonstrated strong correlations with a mean absolute percentage error of 10.5% relative to Optotrak-derived path length measurements and high levels of agreement. Operating at 30 Hz with end-to-end latency of <40 ms, the RealSense-based system provides a reliable, valid, and portable alternative to lab-based systems. Low-cost markerless motion capture systems based on standard RGB cameras have been validated for postural risk assessment, showing good consistency with gold-standard Vicon systems. These preliminary findings suggest that the system shows promise as a low-cost alternative; however, further validation in clinical populations is required before clinical deployment. Full article
(This article belongs to the Section Biomedical Sensors)
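The sway path length and mean absolute percentage error (MAPE) mentioned in the abstract above follow standard definitions, sketched below. This is a generic illustration rather than the authors' code: the helper names `path_length` and `mape` are hypothetical, and the input is assumed to be a sequence of 2D sway positions sampled at a fixed rate.

```python
import numpy as np

def path_length(xy):
    """Total distance travelled along a sampled 2D trajectory."""
    xy = np.asarray(xy, dtype=float)
    steps = np.diff(xy, axis=0)                  # displacement between samples
    return float(np.sum(np.linalg.norm(steps, axis=1)))

def mape(estimate, reference):
    """Mean absolute percentage error of estimates against nonzero references."""
    estimate = np.asarray(estimate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(100.0 * np.mean(np.abs((estimate - reference) / reference)))

# Toy data (not from the study): a unit square traced once, and two trials
# whose camera-based path lengths deviate by 10% from the reference system.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
square_length = path_length(square)
toy_error = mape([9.0, 11.0], [10.0, 10.0])
```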
28 pages, 10424 KB  
Article
Distance-Aware DBSCAN–STM Pipeline with Centralized Point Augmentation for LiDAR-Based Pedestrian Candidate Generation
by Jihwan Yeom, Jinman Kim and Joongjin Kook
Appl. Sci. 2026, 16(13), 6286; https://doi.org/10.3390/app16136286 - 23 Jun 2026
Viewed by 162
Abstract
This paper presents a non-learning-based, seed-dependent, semi-automatic pedestrian candidate generation pipeline for LiDAR point clouds. The proposed method is designed to support 3D annotation workflows by reducing irrelevant candidate clusters while improving the reliability of pedestrian candidate selection under distance-dependent point sparsity. The pipeline integrates distance-aware DBSCAN clustering, Single Template Matching (STM), and Centralized Point Augmentation (CPA). First, LiDAR points within the camera field of view are preprocessed, and pedestrian candidate clusters are generated using DBSCAN parameters configured according to distance intervals. Ground-snapping-based bounding-box refinement and height-based filtering are then applied to improve geometric consistency and reduce non-pedestrian candidates. In the second stage, STM compares PCA-aligned projected silhouettes of candidate clusters with a seed pedestrian template to suppress false positives. To address silhouette instability caused by sparse mid-range pedestrian points, CPA adds centroid-contracted points in the projection-relevant plane before template matching. Experiments on pedestrian-containing frames from the KITTI dataset show that STM improves precision from 27.6% to 60.5% and increases the F1-score from 36.8% to 51.4% compared with the initial DBSCAN-based candidate generation stage. The final CPA configuration improves recall from 44.7% to 46.7% and the overall F1-score from 51.4% to 52.1%, while revealing a precision–recall trade-off. Supplementary IoU analysis shows that the final DBSCAN–STM–CPA configuration maintains meaningful spatial overlap with pedestrian ground-truth boxes, achieving 88.9% at 3D IoU ≥ 0.10 and 81.6% at BEV IoU ≥ 0.25. Runtime analysis further shows that height-based filtering reduces the average per-frame processing time from 151.5 ms to 125.1 ms, while the final CPA configuration introduces only a small overhead, resulting in 126.2 ms per frame. 
These results demonstrate that the proposed DBSCAN–STM–CPA pipeline can provide reliable pedestrian candidates for semi-automatic 3D labeling without requiring class-specific detector training. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
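The precision, recall, and F1 figures in the abstract above are linked by the harmonic-mean definition of the F1-score. The short check below assumes, as the abstract's wording suggests, that the 60.5% precision and the 44.7% recall both describe the STM stage before CPA is applied; the function name `f1_score` is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# STM-stage values as read from the abstract (the pairing is an assumption).
stm_f1 = f1_score(0.605, 0.447)
```

Under that assumption the computed value rounds to the 51.4% F1-score reported for the STM stage.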
15 pages, 4825 KB  
Article
Integrating Visual Perception and Control Strategies in Custom Omnidirectional Mobile Robots
by Radu-Laurențiu Roșca, Andrei-Iulian Iancu, Adrian Burlacu and Cătălin Dosoftei
Sensors 2026, 26(12), 3918; https://doi.org/10.3390/s26123918 - 20 Jun 2026
Viewed by 166
Abstract
Autonomous mobile robots are used in optimizing warehouse logistics, yet achieving precise positioning during docking maneuvers and autonomous planning remains a technical challenge. This study presents a custom vision-based control system designed for an autonomous omnidirectional wheeled robot. The proposed methodology acquires visual feedback using a stereo camera integrated within the Robot Operating System framework. Two visual feedback control laws are formulated and rigorously evaluated: a Classic Position-Based Visual Servoing algorithm, which minimizes pose error using a quaternion-based approach, and a second solution that utilizes Dual Lie Algebra to compute the 3D visual sensor’s velocities, ensuring convergence towards the desired point-feature configuration. Experimental validation reveals that while both methods achieve docking, the dual pose-free approach enables more robust, effortless movement of the robot platform than Classic Position-Based Visual Servoing. Consequently, these findings indicate that integrating depth-based feature recovery with advanced algebraic strategies offers a stable control strategy for automated industrial scenarios. Full article
(This article belongs to the Special Issue Intelligent Sensing for Robotic Control and Visual Perception)
40 pages, 27259 KB  
Article
Monocular 3D Position Estimation of a Moving Vehicle Based on a Kalman-Goldschmidt Adaptive Filter
by Diana Kalita, Pavel Lyakhov, Valery Andreev and Denis Butusov
J. Sens. Actuator Netw. 2026, 15(3), 48; https://doi.org/10.3390/jsan15030048 - 18 Jun 2026
Viewed by 173
Abstract
Determining the 3D position of a vehicle from a 2D image plays a key role in video surveillance, autonomous driving, and spatial localization. However, localization accuracy can significantly degrade in conditions of incomplete or synthetic measurement noise and keypoint jitter. In this paper, we propose a new iterative 3D position estimation algorithm (KGA). This algorithm includes geometric correction and calibration steps for converting from 2D to 3D coordinates; trajectory prediction and correction using a Kalman filter; and adaptive tuning of the filter parameters using the Goldschmidt algorithm. Experiments confirm that KGA outperforms the standard (FK) and modified (MFK) Kalman filters in accuracy and convergence speed, demonstrating robustness to various camera angles and noise levels. The novelty of this approach lies in the integration of the Goldschmidt algorithm into the Kalman filter to create an adaptation mechanism that dynamically adjusts the measurement noise covariance based on instantaneous innovation magnitude. Unlike end-to-end deep learning trackers or nonlinear filters (EKF/UKF), KGA is designed as a lightweight post-processing stage that can be seamlessly integrated into existing detection pipelines while maintaining the low computational footprint required for UAV-based edge deployment. The algorithm is of practical value for computer vision systems requiring accurate and robust tracking under varying observational conditions, with current implementation suitable for offline or buffered processing, and clear pathways to real-time deployment through code optimization. Full article
(This article belongs to the Section Big Data, Computing and Artificial Intelligence)
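The idea of adjusting the measurement noise covariance from the innovation, as described in the abstract above, can be sketched with a scalar Kalman filter. This is a simplified stand-in and not the KGA algorithm: the abstract does not say how the Goldschmidt algorithm enters the tuning, so the sketch substitutes a plain exponential moving average, and the function name `adaptive_kalman_1d` with its parameters `q`, `r0`, and `alpha` is hypothetical.

```python
import numpy as np

def adaptive_kalman_1d(measurements, q=1e-4, r0=1.0, alpha=0.1):
    """Scalar random-walk Kalman filter with innovation-driven measurement noise.

    Assumption: R is tracked as a moving average of (innovation^2 - P), which
    follows from E[innovation^2] = P + R for a consistent filter.
    """
    x, p, r = float(measurements[0]), 1.0, r0
    estimates = []
    for z in measurements:
        p = p + q                                  # predict: uncertainty grows
        nu = z - x                                 # innovation
        r = (1.0 - alpha) * r + alpha * max(nu * nu - p, 1e-6)
        k = p / (p + r)                            # Kalman gain
        x = x + k * nu                             # correct the state
        p = (1.0 - k) * p                          # correct the uncertainty
        estimates.append(x)
    return np.array(estimates)

# A constant signal should be reproduced unchanged.
constant_estimates = adaptive_kalman_1d(np.full(20, 5.0))
```

Large innovations raise the noise estimate and lower the gain, so isolated keypoint jitter pulls the state less than it would with a fixed measurement noise.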
26 pages, 13171 KB  
Article
A Deep Learning Approach for Pixel-Level Material Classification via Hyperspectral Imaging
by Savvas Sifnaios, George Arvanitakis, Fotios K. Konstantinidis, Georgios Tsimiklis, Angelos Amditis and Panayiotis Frangos
J. Imaging 2026, 12(6), 267; https://doi.org/10.3390/jimaging12060267 - 18 Jun 2026
Viewed by 254
Abstract
Recent advancements in computer vision, particularly in detection, segmentation, and classification, have significantly impacted various domains. However, these advancements are still strongly tied to RGB-based systems, which are insufficient for applications in industries such as waste sorting, pharmaceuticals, and defence, where material characterization beyond shape or visible colour is necessary. Hyperspectral (HS) imaging captures spatial and spectral information for each pixel and therefore offers a promising route for material-level classification. This study evaluates the potential of combining HS imaging with deep learning for plastic material classification. The work includes: (i) the design of an experimental setup with a HS line-scan camera, conveyor, and controlled illumination; (ii) the construction of an object-disjoint dataset of HDPE, PET, PP, and PS samples with semi-automated mask generation and Raman spectroscopy-based labelling; and (iii) the development of P1CH, a lightweight pixel-wise 1D convolutional hyperspectral classifier. On object-disjoint test images, P1CH achieved 97.44% all-pixel accuracy. A boundary sensitivity analysis, reported separately because semi-automated labels are uncertain at material/background interfaces, yielded 99.94% accuracy after excluding a pre-defined two-pixel border band. Additional ablation, baseline, and robustness analyses show that the proposed pixel-wise spectral approach is effective for small fragments, visually similar plastics, and overlapping materials, while black or very dark plastics remain challenging under the present camera and illumination configuration. Full article
(This article belongs to the Special Issue Advancement in Hyperspectral Image Processing with Machine Learning)
25 pages, 8924 KB  
Article
3D Localization of Heat Sources Using LiDAR–Thermal Data Fusion and Multisensor Calibration
by Rafał Gasz, Mateusz Pluskota and Krzysztof Schwierz
Sensors 2026, 26(12), 3876; https://doi.org/10.3390/s26123876 - 18 Jun 2026
Viewed by 308
Abstract
Integration of LiDAR and thermal sensing has become increasingly important in robotics, infrastructure diagnostics, environmental monitoring, and autonomous perception systems. LiDAR sensors provide accurate three-dimensional geometric information but do not directly capture thermal properties of observed objects, whereas thermal cameras provide temperature distributions without explicit spatial structure. Fusion of both sensing modalities enables thermally augmented 3D scene reconstruction and spatial localization of temperature anomalies. This paper presents a practical LiDAR–thermal fusion framework for three-dimensional localization of heat sources using an Ouster OS1 LiDAR sensor and a FLIR A70 thermal camera. The proposed framework includes intrinsic thermal-camera calibration, extrinsic LiDAR–thermal calibration, multimodal data synchronization, projection of LiDAR points onto the thermal image plane, and assignment of temperature values to spatial points. Additionally, a dedicated thermally distinguishable calibration target is proposed to enable reliable multimodal feature extraction under low-contrast LWIR imaging conditions. The developed framework was experimentally validated using real radiometric thermal data and LiDAR point clouds acquired under laboratory conditions. Quantitative evaluation demonstrated reprojection errors below 1 pixel and a mean hottest-point localisation error of approximately 4.1 cm at a distance of 12.3 m. The results confirm that accurate spatial localisation of thermal anomalies can be achieved using a geometry-based multimodal fusion approach without relying on computationally expensive learning-based methods. The proposed framework emphasises practical deployment, deterministic calibration, and applicability in scenarios where limited training data or constrained computational resources make learning-based approaches difficult to apply. 
The proposed system may be applied to building energy diagnostics, industrial inspection, technical infrastructure monitoring, and robotic perception systems that require reliable spatial localisation of heat sources under real measurement conditions. Full article
(This article belongs to the Collection 3D Imaging and Sensing System)
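The projection of LiDAR points onto the thermal image plane and the assignment of temperature values, as outlined in the abstract above, can be sketched with an ideal pinhole model. The example is an illustration under stated assumptions, not the authors' pipeline: lens distortion is ignored, the extrinsics `R`, `t` and intrinsics `K` are taken as already calibrated, and the function name `assign_temperatures` and the toy values are hypothetical.

```python
import numpy as np

def assign_temperatures(points_lidar, R, t, K, thermal):
    """Attach a temperature to each LiDAR point that projects into the image.

    points_lidar: (N, 3) points in the LiDAR frame
    R, t: rotation (3, 3) and translation (3,) from the LiDAR to the camera frame
    K: (3, 3) pinhole intrinsics; thermal: (H, W) temperature image
    Points behind the camera or outside the image receive NaN.
    """
    pts_cam = points_lidar @ R.T + t
    temps = np.full(len(points_lidar), np.nan)
    front = pts_cam[:, 2] > 0                      # keep points in front of the camera
    uvw = pts_cam[front] @ K.T
    uv = np.rint(uvw[:, :2] / uvw[:, 2:3]).astype(int)   # nearest pixel (u, v)
    h, w = thermal.shape
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    idx = np.flatnonzero(front)[inside]
    temps[idx] = thermal[uv[inside, 1], uv[inside, 0]]
    return temps

# Toy setup: identity extrinsics and a single hot pixel at the principal point.
K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 24.0], [0.0, 0.0, 1.0]])
thermal = np.zeros((48, 64))
thermal[24, 32] = 80.0
points = np.array([[0.0, 0.0, 2.0], [0.0, 0.0, -1.0], [10.0, 0.0, 1.0]])
temps = assign_temperatures(points, np.eye(3), np.zeros(3), K, thermal)
```

Once every point carries a temperature, the hottest point in the cloud can be located with `np.nanargmax(temps)`.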
20 pages, 13974 KB  
Article
A Perceptual Rate Control Algorithm Based on JND for Screen Content Video
by Huijie Zheng, Jing Chen and Qi Lin
Sensors 2026, 26(12), 3866; https://doi.org/10.3390/s26123866 - 17 Jun 2026
Viewed by 336
Abstract
In video-coding standards, the default rate control algorithm is designed for natural video. However, computer-generated screen content video (SCV) differs markedly from natural video captured by a camera and has distinct statistical characteristics, such as sharp edges, thin lines, and flat areas. As a result, the human visual system (HVS) focuses on different features when viewing screen content video. In sensor data visualization applications such as intelligent display terminals, industrial monitoring and human–computer interaction interfaces, screen content video carries key information collected and reconstructed by image sensors, vision sensors and multimodal sensors, and its edge structures and local details directly affect the interpretation accuracy and application reliability of that information. It is therefore both theoretically and practically important to investigate perceptual rate control methods that integrate video content characteristics with human visual perception properties. In this paper, we propose a perceptual rate control algorithm for screen content video based on a just-noticeable distortion (JND) model that is built on edge profile reconstruction with tolerable variations. First, the target bit rate is allocated at the frame level and the CTU level using a perceptual weight calculated from the JND factor and the reconstructed edge characteristics. Second, an intra rate-distortion (RD) model is established under the constraint of the JND model, with the similarity between reference frames and reconstructed frames taken as feedback. Finally, the proposed rate control algorithm (JND–perceptual rate control (PRC)) is integrated into the existing rate control framework of High-Efficiency Video Coding–Screen Content Coding (HEVC-SCC) to improve coding efficiency. 
The experimental results show that the proposed algorithm achieves better bit rate control precision than the reference platform and improves the R-D performance of screen content video. In particular, compared with the HEVC-SCC reference software, the coding performance is improved by 3.09 dB on average, the bit rate is reduced by 26.51% on average, and the average bit rate mismatch is within 1.159%. Full article
(This article belongs to the Special Issue Intelligent Sensing Technology for Image and Video Processing)

15 pages, 32174 KB  
Article
YOLO-FSEP: An Improved YOLOv8n Algorithm for Sugar Orange Detection in Orchards
by Tianfa Deng, Jinchao Sun, Qingjuan Zhao and Faguo Huang
Sensors 2026, 26(12), 3848; https://doi.org/10.3390/s26123848 - 17 Jun 2026
Viewed by 161
Abstract
To address the challenges of detecting sugar orange fruits in complex natural orchard environments—where fruits are frequently occluded by leaves and branches and may be mutually occluded due to dense growth, leading to missed detections, false positives, and low detection confidence—we propose an improved algorithm based on YOLOv8n, named YOLO-FSEP. A Spatial-Channel Synergistic Attention (SCSA) module is introduced into the main network to enhance feature extraction capabilities; the IoU loss function is replaced with Focal_SIOU to improve the detection accuracy for difficult samples; and an SE attention mechanism is embedded in the detection head, with the addition of a P6 high-resolution detection layer to optimize multi-scale object performance. Experimental results on a self-built sugar orange dataset show that, compared to the baseline YOLOv8n, the improved model achieves a 0.9% increase in accuracy, a 1.3% increase in recall, and a 3.2% increase in mAP50-95, while maintaining an inference speed of 62.6 FPS. To evaluate the model under dynamic conditions, we performed a 200-frame continuous test of the 3D localization pipeline on a laptop with a RealSense D435i camera. The average YOLO inference time was 49.90 ms, post-processing (depth extraction and 3D coordinate conversion) took 0.24 ms, and the total processing time was 50.15 ms. Given that the typical response time for a robotic arm’s single positioning operation is 100–200 ms, this real-time performance meets the dynamic localization requirements of sugar orange harvesting. Full article
(This article belongs to the Special Issue Smart Sensors in Precision Agriculture)
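The post-processing step mentioned in the abstract above (depth extraction and 3D coordinate conversion) can be sketched as a pinhole back-projection of the bounding-box centre. This is an assumed formulation, not the authors' code: the intrinsics `fx`, `fy`, `cx`, `cy`, the median window, and the synthetic depth map are hypothetical, and no camera SDK is used.

```python
import numpy as np

def box_center_depth(depth_map, box, window=2):
    """Return the box centre (u, v) and a robust depth in metres around it.

    The median over a small window ignores zero-valued (invalid) depth pixels.
    """
    x1, y1, x2, y2 = box
    u, v = int((x1 + x2) / 2), int((y1 + y2) / 2)
    patch = depth_map[max(v - window, 0):v + window + 1,
                      max(u - window, 0):u + window + 1]
    valid = patch[patch > 0]
    depth = float(np.median(valid)) if valid.size else None
    return u, v, depth

def pixel_to_camera_xyz(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) at the given depth into camera coordinates."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m])

# Synthetic example (hypothetical values): a flat scene 0.8 m from the camera
depth_map = np.full((480, 640), 0.8)
depth_map[240, 400] = 0.0            # one invalid pixel at the box centre
box = (380, 220, 420, 260)           # detector output as x1, y1, x2, y2

u, v, d = box_center_depth(depth_map, box)
xyz = pixel_to_camera_xyz(u, v, d, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
```

The operation is a handful of arithmetic steps per detection, which is consistent with post-processing taking a small fraction of the reported per-frame time.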

24 pages, 5888 KB  
Article
NeRF-Based Three-Dimensional Reconstruction for Large-Diameter Rescue Shafts
by Hairong Gu, Jiaxi Wang, Chenggang Chen, Wenjuan Yang, Mostak Ahamed and Zujie Zou
Sensors 2026, 26(12), 3847; https://doi.org/10.3390/s26123847 - 17 Jun 2026
Viewed by 158
Abstract
Large-diameter rescue shafts serve as critical infrastructure for emergency response in mining disaster scenarios, and their structural deformation directly affects the safe passage of rescue capsules. In this paper, we investigate three-dimensional (3D) reconstruction techniques for large-diameter rescue shaft environments and develop a Neural Radiance Fields (NeRF)-based reconstruction and deformation assessment scheme. The proposed workflow integrates no-reference signal-to-noise ratio (NR-SNR) image-quality filtering, SfM-based camera-pose estimation, Nerfacto reconstruction, point-cloud export, and circular-section fitting. The NR-SNR retention-ratio experiment shows that retaining approximately 35% high-quality images provides a practical efficiency–quality trade-off for the present dataset, reducing the computational burden of SfM pose estimation while preserving sufficient geometric information for subsequent reconstruction. The reconstructed radiance field is further exported as a dense point cloud and evaluated using relative radius error, circle-fitting residuals, and image-level rendering metrics. Experiments on a simulated large-diameter rescue shaft platform show that the proposed NeRF-based scheme provides favorable geometric measurement applicability and visual reconstruction quality under weak-texture and low-illumination conditions. Compared with conventional MVS and the tested 3DGS baseline, the proposed scheme produces a point-cloud output that is more suitable for subsequent circular-section fitting and deformation-related assessment. In addition, comparison with a representative SDF-based baseline indicates that direct implicit surface recovery remains challenging for the tested hollow cylindrical shaft-wall scene. The results demonstrate the potential of the proposed NeRF-based workflow for rescue-shaft inner-wall reconstruction and engineering-oriented deformation evaluation. Full article
(This article belongs to the Section Sensing and Imaging)
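The circular-section fitting and the two geometric metrics named in the abstract above (relative radius error and circle-fitting residuals) can be illustrated with an algebraic least-squares circle fit. The abstract does not state which fitting method the authors use, so the fit below is an assumption, and the section points and nominal radius are synthetic.

```python
import numpy as np

def fit_circle(xy):
    """Algebraic least-squares circle fit to 2D points of shape (N, 2).

    Solves x^2 + y^2 = 2*a*x + 2*b*y + c for centre (a, b), where
    c = r^2 - a^2 - b^2.
    """
    A = np.column_stack([2 * xy[:, 0], 2 * xy[:, 1], np.ones(len(xy))])
    rhs = (xy ** 2).sum(axis=1)
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    radius = np.sqrt(c + a ** 2 + b ** 2)
    return np.array([a, b]), radius

def circle_residuals(xy, center, radius):
    """Signed radial distance of each point from the fitted circle."""
    return np.linalg.norm(xy - center, axis=1) - radius

# Synthetic horizontal slice of a shaft wall (hypothetical dimensions, metres)
nominal_radius = 0.40
theta = np.linspace(0.0, 2.0 * np.pi, 90, endpoint=False)
section = np.column_stack([1.0 + nominal_radius * np.cos(theta),
                           -2.0 + nominal_radius * np.sin(theta)])

center, radius = fit_circle(section)
residuals = circle_residuals(section, center, radius)
relative_radius_error = abs(radius - nominal_radius) / nominal_radius
```

On a real point cloud the slice would first be extracted at a given height along the shaft axis, and outliers would need handling before the fit.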
