Search Results (92)

Search Parameters:
Keywords = fisheye images

33 pages, 10063 KiB  
Article
Wide-Angle Image Distortion Correction and Embedded Stitching System Design Based on Swin Transformer
by Shiwen Lai, Zuling Cheng, Wencui Zhang and Maowei Chen
Appl. Sci. 2025, 15(14), 7714; https://doi.org/10.3390/app15147714 - 9 Jul 2025
Viewed by 333
Abstract
Wide-angle images often suffer from severe radial distortion, which compromises geometric accuracy and complicates image correction and real-time stitching, especially in resource-constrained embedded environments. To address these issues, this study proposes a wide-angle image correction and stitching framework based on a Swin Transformer, optimized for lightweight deployment on edge devices. The model integrates multi-scale feature extraction, Thin Plate Spline (TPS) control point prediction, and optical flow-guided constraints, balancing correction accuracy and computational efficiency. Experiments on synthetic and real-world datasets show that the method outperforms mainstream algorithms, with PSNR gains of 3.28 dB and 2.18 dB on wide-angle and fisheye images, respectively, while maintaining real-time performance. To validate practical applicability, the model is deployed on a Jetson TX2 NX device, and a real-time dual-camera stitching system is built using C++ and DeepStream. The system achieves 15 FPS at 1400 × 1400 resolution, with a correction latency of 56 ms and a stitching latency of 15 ms, demonstrating efficient hardware utilization and stable performance. This study presents a deployable, scalable, and edge-compatible solution for wide-angle image correction and real-time stitching, offering practical value for applications such as smart surveillance, autonomous driving, and industrial inspection. Full article
(This article belongs to the Special Issue Latest Research on Computer Vision and Image Processing)
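The TPS-based correction step can be pictured with a short sketch: given control points predicted in the distorted image and their corrected target positions, a thin-plate-spline interpolant supplies a dense backward map that is resampled with OpenCV. This is only a minimal illustration of the general technique, written for clarity rather than speed; it is not the authors' Swin Transformer pipeline, and the function name and the use of SciPy are assumptions.

```python
import cv2
import numpy as np
from scipy.interpolate import RBFInterpolator

def tps_correct(image, src_pts, dst_pts):
    """Warp `image` so that control points `src_pts` (in the distorted image)
    move to `dst_pts` (their corrected positions); both are (N, 2) arrays of
    (x, y) coordinates. Illustrative only, not the paper's implementation."""
    h, w = image.shape[:2]
    # Thin-plate-spline interpolation of the backward mapping: for every
    # corrected pixel, find where to sample in the distorted image.
    tps = RBFInterpolator(dst_pts, src_pts, kernel="thin_plate_spline")
    ys, xs = np.mgrid[0:h, 0:w]
    grid = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float64)
    sample = tps(grid).astype(np.float32).reshape(h, w, 2)
    map_x = np.ascontiguousarray(sample[..., 0])
    map_y = np.ascontiguousarray(sample[..., 1])
    return cv2.remap(image, map_x, map_y, interpolation=cv2.INTER_LINEAR)
```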

23 pages, 14051 KiB  
Article
A Novel Method for Water Surface Debris Detection Based on YOLOV8 with Polarization Interference Suppression
by Yi Chen, Honghui Lin, Lin Xiao, Maolin Zhang and Pingjun Zhang
Photonics 2025, 12(6), 620; https://doi.org/10.3390/photonics12060620 - 18 Jun 2025
Viewed by 316
Abstract
Aquatic floating debris detection is a key technological foundation for ecological monitoring and integrated water environment management. It holds substantial scientific and practical value in applications such as pollution source tracing, floating debris control, and maritime navigation safety. However, this field faces ongoing challenges due to water surface polarization. Reflections of polarized light produce intense glare, resulting in localized overexposure, detail loss, and geometric distortion in captured images. These optical artifacts severely impair the performance of conventional detection algorithms, increasing both false positives and missed detections. To overcome these imaging challenges in complex aquatic environments, we propose a novel YOLOv8-based detection framework with integrated polarized light suppression mechanisms. The framework consists of four key components: a fisheye distortion correction module, a polarization feature processing layer, a customized residual network with Squeeze-and-Excitation (SE) attention, and a cascaded pipeline for super-resolution reconstruction and deblurring. Additionally, we developed the PSF-IMG dataset (Polarized Surface Floats), which includes common floating debris types such as plastic bottles, bags, and foam boards. Extensive experiments demonstrate the network’s robustness in suppressing polarization artifacts and enhancing feature stability under dynamic optical conditions. Full article
(This article belongs to the Special Issue Advancements in Optical Measurement Techniques and Applications)
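For readers unfamiliar with the Squeeze-and-Excitation (SE) attention mentioned in this abstract, the sketch below shows a generic SE-augmented residual block in PyTorch. It is a textbook illustration of the mechanism, not the authors' customized network; the channel count, activation choice, and reduction ratio are assumptions.

```python
import torch
import torch.nn as nn

class SEResidualBlock(nn.Module):
    """Residual block with Squeeze-and-Excitation channel attention."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.SiLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # Squeeze: global average pool; Excitation: bottleneck -> channel weights.
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.SiLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        y = self.body(x)
        return x + y * self.se(y)   # reweight channels, then add the skip path
```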

29 pages, 19553 KiB  
Article
Let’s Go Bananas: Beyond Bounding Box Representations for Fisheye Camera-Based Object Detection in Autonomous Driving
by Senthil Yogamani, Ganesh Sistu, Patrick Denny and Jane Courtney
Sensors 2025, 25(12), 3735; https://doi.org/10.3390/s25123735 - 14 Jun 2025
Viewed by 655
Abstract
Object detection is a mature problem in autonomous driving, with pedestrian detection being one of the first commercially deployed algorithms, and it has been extensively studied in the literature. However, object detection is relatively less explored for the fisheye cameras used in surround-view near-field sensing. The standard bounding-box representation fails in fisheye cameras due to heavy radial distortion, particularly in the periphery. In this paper, a generic object detection framework is implemented using the base YOLO (You Only Look Once) detector to systematically explore various object representations on the public WoodScape dataset. First, we implement basic representations, namely the standard bounding box, the oriented bounding box, and the ellipse. Second, we implement a generic polygon and propose a novel curvature-adaptive polygon, which obtains an improvement of 3 mAP (mean average precision) points. A polygon is expensive to annotate and complex to use in downstream tasks, so it is not practical for real-world applications; however, we use it to demonstrate that the accuracy gap between the polygon and the bounding-box representation is large due to the strong distortion in fisheye cameras. This motivates the design of a distortion-aware optimal box representation for fisheye images, in which objects tend to appear banana-shaped near the periphery. We derive a novel representation called a curved box and improve it further by leveraging vanishing-point constraints. The proposed curved box representations outperform the bounding box by 3 mAP points and the oriented bounding box by 1.6 mAP points. In addition, a camera geometry tensor is formulated to adapt to the non-linear distortion characteristics of fisheye cameras, improving performance by a further 1.4 mAP points. Full article
(This article belongs to the Special Issue Design, Communication, and Control of Autonomous Vehicle Systems)
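To see why box edges bend in fisheye imagery, the following sketch maps the edges of an ideal pinhole-image rectangle through a generic equidistant fisheye model (r = f·θ) and returns the resulting curved outline. It only illustrates the geometric effect that motivates curved-box representations; the paper's learned curved-box parameterization and camera geometry tensor are not reproduced here, and the equidistant model, focal length f, and principal point (cx, cy) are assumptions.

```python
import numpy as np

def curved_box_polygon(box, f, cx, cy, n=6):
    """Map an axis-aligned box defined on an ideal (undistorted) pinhole image
    through an equidistant fisheye model (r_d = f * theta) and return the
    curved outline as a polygon. Purely illustrative."""
    x0, y0, x1, y1 = box
    xs = np.linspace(x0, x1, n)
    ys = np.linspace(y0, y1, n)
    # Sample points along the four rectangle edges (corners are repeated).
    edge = np.concatenate([
        np.stack([xs, np.full(n, y0)], 1),
        np.stack([np.full(n, x1), ys], 1),
        np.stack([xs[::-1], np.full(n, y1)], 1),
        np.stack([np.full(n, x0), ys[::-1]], 1),
    ])
    u, v = edge[:, 0] - cx, edge[:, 1] - cy
    r_u = np.hypot(u, v) + 1e-9        # radius in the pinhole image
    theta = np.arctan(r_u / f)         # angle of the incoming ray
    r_d = f * theta                    # equidistant fisheye radius
    scale = r_d / r_u
    return np.stack([cx + u * scale, cy + v * scale], 1)
```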

42 pages, 47882 KiB  
Article
Product Engagement Detection Using Multi-Camera 3D Skeleton Reconstruction and Gaze Estimation
by Matus Tanonwong, Yu Zhu, Naoya Chiba and Koichi Hashimoto
Sensors 2025, 25(10), 3031; https://doi.org/10.3390/s25103031 - 11 May 2025
Viewed by 785
Abstract
Product engagement detection in retail environments is critical for understanding customer preferences through nonverbal cues such as gaze and hand movements. This study presents a system leveraging a 360-degree top-view fisheye camera combined with two perspective cameras, the only sensors required for deployment, effectively capturing subtle interactions even under occlusion or distant camera setups. Unlike conventional image-based gaze estimation methods that are sensitive to background variations and require capturing a person’s full appearance, raising privacy concerns, our approach utilizes a novel Transformer-based encoder operating directly on 3D skeletal keypoints. This innovation significantly reduces privacy risks by avoiding personal appearance data and benefits from ongoing advancements in accurate skeleton estimation techniques. Experimental evaluation in a simulated retail environment demonstrates that our method effectively identifies critical gaze-object and hand-object interactions, reliably detecting customer engagement prior to product selection. Despite yielding slightly higher mean angular errors in gaze estimation compared to a recent image-based method, the Transformer-based model achieves comparable performance in gaze-object detection. Its robustness, generalizability, and inherent privacy preservation make it particularly suitable for deployment in practical retail scenarios such as convenience stores, supermarkets, and shopping malls, highlighting its superiority in real-world applicability. Full article
(This article belongs to the Special Issue Feature Papers in Sensing and Imaging 2025)
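A minimal sketch of the privacy-preserving idea described in this abstract: a Transformer encoder that consumes 3D skeletal keypoints rather than appearance images and regresses a gaze direction. Layer sizes, the joint count, and the pooling scheme are assumptions and do not reflect the authors' architecture.

```python
import torch
import torch.nn as nn

class SkeletonGazeEncoder(nn.Module):
    """Toy Transformer encoder mapping 3D skeletal keypoints to a unit gaze
    direction; a sketch of the general idea, not the paper's model."""
    def __init__(self, n_joints=17, d_model=128, n_heads=4, n_layers=3):
        super().__init__()
        self.embed = nn.Linear(3, d_model)            # one token per joint
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=256,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 3)             # gaze vector

    def forward(self, joints):                        # joints: (B, n_joints, 3)
        tokens = self.encoder(self.embed(joints))
        gaze = self.head(tokens.mean(dim=1))          # pool over joints
        return nn.functional.normalize(gaze, dim=-1)  # unit direction
```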

24 pages, 10571 KiB  
Article
Evaluation of Network Design and Solutions of Fisheye Camera Calibration for 3D Reconstruction
by Sina Rezaei and Hossein Arefi
Sensors 2025, 25(6), 1789; https://doi.org/10.3390/s25061789 - 13 Mar 2025
Cited by 2 | Viewed by 1294
Abstract
The evolution of photogrammetry has been significantly influenced by advancements in camera technology, particularly the emergence of spherical cameras. These devices offer extensive photographic coverage and are increasingly used in many photogrammetry applications due to their user-friendly configuration, especially in their low-cost versions. Despite these advantages, such cameras are subject to high image distortion, which necessitates specialised calibration solutions for fisheye images, the primary geometry of the raw files. This paper evaluates fisheye calibration processes for the effective use of low-cost spherical cameras for 3D reconstruction and for verifying geometric stability. The optical calibration parameters include the focal length, pixel positions, and distortion coefficients. Emphasis was placed on evaluating camera calibration solutions, calibration network design, and the software or toolboxes that support the corresponding geometry and calibration during processing. Accuracy, correctness, computational time, and parameter stability were assessed, together with the influence of the calibration parameters on the accuracy of the 3D reconstruction. The assessment was conducted using a previous case study of graffiti on an underpass in Wiesbaden, Germany. The most robust calibration solution proved to be a two-step calibration process comprising a pre-calibration stage and consideration of the best possible network design. Fisheye undistortion was performed using OpenCV, and the calibration parameters were then refined by self-calibration through bundle adjustment, yielding both the calibration parameters and the 3D reconstruction in Agisoft Metashape. Compared with 3D calibration, self-calibration, and a pre-calibration strategy, the two-step calibration process demonstrated an average improvement of 2826 points in the 3D sparse point cloud and a 0.22 m decrease in the re-projection error derived from the front-lens images of two individual spherical cameras. As part of the quality assessment, the accuracy and correctness of the resulting 3D point cloud and the statistical analysis of the calibration parameters are presented and compared against a 3D point cloud produced by a laser scanner. Full article
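The OpenCV fisheye undistortion step mentioned in the abstract can be sketched as follows, assuming an intrinsic matrix K and four distortion coefficients D obtained from a pre-calibration stage. This is a generic use of OpenCV's fisheye module, not the paper's full two-step workflow.

```python
import cv2
import numpy as np

def undistort_fisheye(img, K, D, balance=0.0):
    """Undistort a fisheye image given intrinsics K (3x3) and distortion
    coefficients D (4,) from a prior calibration step."""
    h, w = img.shape[:2]
    # Estimate a new camera matrix that controls how much of the original
    # field of view is kept (balance in [0, 1]).
    new_K = cv2.fisheye.estimateNewCameraMatrixForUndistortRectify(
        K, D, (w, h), np.eye(3), balance=balance)
    map1, map2 = cv2.fisheye.initUndistortRectifyMap(
        K, D, np.eye(3), new_K, (w, h), cv2.CV_16SC2)
    return cv2.remap(img, map1, map2, interpolation=cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_CONSTANT)
```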

22 pages, 27425 KiB  
Article
Semiautomatic Diameter-at-Breast-Height Extraction from Structure-from-Motion-Based Point Clouds Using a Low-Cost Fisheye Lens
by Mustafa Zeybek
Forests 2025, 16(3), 439; https://doi.org/10.3390/f16030439 - 28 Feb 2025
Viewed by 1552
Abstract
The diameter at breast height (DBH) is a fundamental index used to characterize trees and establish forest inventories. The conventional method of measuring the DBH involves using steel tape meters, rope, and calipers. Alternatively, this study has shown that it can be calculated automatically using image-based algorithms, thus reducing time and effort while remaining cost-effective. The method consists of three main steps: image acquisition using a fisheye lens, 3D point cloud generation using structure-from-motion (SfM)-based image processing, and improved DBH estimation. The results indicate that this proposed methodology is comparable to traditional urban forest DBH measurements, with a root-mean-square error ranging from 0.7 to 2.4 cm. The proposed approach has been evaluated using real-world data, and it has been determined that the F-score assessment metric achieves a maximum of 0.91 in a university garden comprising 74 trees. The successful automated DBH measurements through SfM combined with fisheye lenses demonstrate the potential to improve urban tree inventories. Full article
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)
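The core of image-based DBH estimation can be illustrated with a simple slab-and-circle-fit sketch: take the points of an SfM stem cloud near breast height and fit a circle to their horizontal coordinates. This only shows the general idea, under the assumption of a metric, ground-aligned point cloud; the paper's improved estimator is more involved.

```python
import numpy as np

def estimate_dbh(points, ground_z=0.0, breast_height=1.3, slab=0.05):
    """Rough DBH estimate from a stem point cloud: take a thin slab of points
    around breast height and fit a circle (algebraic least-squares, Kasa fit)
    to their XY coordinates. `points` is an (N, 3) array in metres."""
    z = points[:, 2] - ground_z
    slab_pts = points[np.abs(z - breast_height) < slab, :2]
    x, y = slab_pts[:, 0], slab_pts[:, 1]
    # Solve x^2 + y^2 + a*x + b*y + c = 0 in the least-squares sense.
    A = np.column_stack([x, y, np.ones_like(x)])
    rhs = -(x**2 + y**2)
    (a, b, c), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    radius = np.sqrt((a**2 + b**2) / 4.0 - c)
    return 2.0 * radius   # diameter, in the units of the point cloud
```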

19 pages, 30440 KiB  
Article
A Method for the Calibration of a LiDAR and Fisheye Camera System
by Álvaro Martínez, Antonio Santo, Monica Ballesta, Arturo Gil and Luis Payá
Appl. Sci. 2025, 15(4), 2044; https://doi.org/10.3390/app15042044 - 15 Feb 2025
Cited by 2 | Viewed by 1581
Abstract
LiDAR and camera systems are frequently used together to gain a more complete understanding of the environment in different fields, such as mobile robotics, autonomous driving, or intelligent surveillance. Accurately calibrating the extrinsic parameters is crucial for the accurate fusion of the data captured by both systems, which is equivalent to finding the transformation between the reference systems of both sensors. Traditional calibration methods for LiDAR and camera systems are developed for pinhole cameras and are not directly applicable to fisheye cameras. This work proposes a target-based calibration method for LiDAR and fisheye camera systems that avoids the need to transform images to a pinhole camera model, reducing the computation time. Instead, the method uses the spherical projection of the image, obtained with the intrinsic calibration parameters and the corresponding point cloud for LiDAR–fisheye calibration. Thus, unlike a pinhole-camera-based system, a wider field of view is provided, adding more information, which will lead to a better understanding of the environment itself, as well as enabling using fewer image sensors to cover a wider area. Full article
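The spherical projection used for LiDAR-fisheye calibration can be sketched as below: LiDAR points are transformed with candidate extrinsics (R, t) and mapped to equirectangular image coordinates. The camera-axis convention and image parameterization are assumptions, and the extrinsic optimization over target correspondences is omitted.

```python
import numpy as np

def project_to_spherical(points_lidar, R, t, width, height):
    """Project LiDAR points into an equirectangular (spherical) image using a
    candidate extrinsic transform (R, t). Sketch only; the optimisation of
    R and t against calibration-target correspondences is not shown."""
    p_cam = points_lidar @ R.T + t                        # LiDAR -> camera frame
    x, y, z = p_cam[:, 0], p_cam[:, 1], p_cam[:, 2]
    lon = np.arctan2(x, z)                                # azimuth in [-pi, pi]
    lat = np.arcsin(y / np.linalg.norm(p_cam, axis=1))    # elevation
    u = (lon / (2 * np.pi) + 0.5) * width
    v = (lat / np.pi + 0.5) * height
    return np.stack([u, v], axis=1)                       # (N, 2) pixel coords
```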

30 pages, 6461 KiB  
Article
Comprehensive Comparative Analysis and Innovative Exploration of Green View Index Calculation Methods
by Dongmin Yin and Terumitsu Hirata
Land 2025, 14(2), 289; https://doi.org/10.3390/land14020289 - 30 Jan 2025
Cited by 1 | Viewed by 1229
Abstract
Despite the widespread use of street view imagery for Green View Index (GVI) analyses, variations in sampling methodologies across studies and the potential impact of these differences on the results, including associated errors, remain largely unexplored. This study aims to investigate the effectiveness of various GVI calculation methods, with a focus on analyzing the impact of sampling point selection and coverage angles on GVI results. Through a systematic review of the extensive relevant literature, we synthesized six predominant sampling methods: the four-quadrant view method, six-quadrant view method, eighteen-quadrant view method, panoramic view method, fisheye view method and pedestrian view method. We further evaluated the strengths and weaknesses of each approach, along with their applicability across different research domains. In addition, to address the limitations of existing methods in specific contexts, we developed a novel sampling technique based on three 120° street view images and experimentally validated its feasibility and accuracy. The results demonstrate the method’s high reliability, making it a valuable tool for acquiring and analyzing street view images. Our findings demonstrate that the choice of sampling method significantly influences GVI calculations, underscoring the necessity for researchers to select the optimal approach based on a specific research context. To mitigate errors arising from initial sampling angles, this study introduces a novel concept, the “Green View Circle”, which enhances the precision and applicability of calculations through the meticulous segmentation of observational angles, particularly in complex urban environments. Full article
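At its core, a GVI value for one sampling point is the share of vegetation pixels averaged over the captured views, for example three 120° street-view images. A minimal sketch is given below; the vegetation segmentation itself and the paper's "Green View Circle" angular scheme are outside its scope.

```python
import numpy as np

def green_view_index(vegetation_masks):
    """Green View Index for one sampling point: the fraction of vegetation
    pixels averaged over the captured views (e.g. three 120-degree images).
    `vegetation_masks` is a list of boolean arrays from any segmentation."""
    ratios = [mask.mean() for mask in vegetation_masks]
    return float(np.mean(ratios))
```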

18 pages, 36094 KiB  
Article
Arbitrary Optics for Gaussian Splatting Using Space Warping
by Jakob Nazarenus, Simin Kou, Fang-Lue Zhang and Reinhard Koch
J. Imaging 2024, 10(12), 330; https://doi.org/10.3390/jimaging10120330 - 22 Dec 2024
Viewed by 1683
Abstract
Due to recent advances in 3D reconstruction from RGB images, it is now possible to create photorealistic representations of real-world scenes that only require minutes to be reconstructed and can be rendered in real time. In particular, 3D Gaussian splatting shows promising results, outperforming preceding reconstruction methods while simultaneously reducing the overall computational requirements. The main success of 3D Gaussian splatting relies on the efficient use of a differentiable rasterizer to render the Gaussian scene representation. One major drawback of this method is its underlying pinhole camera model. In this paper, we propose an extension of the existing method that removes this constraint and enables scene reconstructions using arbitrary camera optics such as highly distorting fisheye lenses. Our method achieves this by applying a differentiable warping function to the Gaussian scene representation. Additionally, we reduce overfitting in outdoor scenes by utilizing a learnable skybox, reducing the presence of floating artifacts within the reconstructed scene. Based on synthetic and real-world image datasets, we show that our method is capable of creating an accurate scene reconstruction from highly distorted images and rendering photorealistic images from such reconstructions. Full article
(This article belongs to the Special Issue Geometry Reconstruction from Images (2nd Edition))
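The kind of non-pinhole projection involved can be illustrated with a differentiable equidistant fisheye mapping in PyTorch, which sends camera-frame 3D points to image coordinates via r = f·θ. This is a generic sketch of fisheye optics, not the paper's warping of the Gaussian scene representation itself, and the lens model and parameters are assumptions.

```python
import torch

def equidistant_fisheye_projection(xyz_cam, f, cx, cy):
    """Differentiable equidistant fisheye projection (r = f * theta) of 3D
    points given in the camera frame; returns (..., 2) pixel coordinates."""
    x, y, z = xyz_cam.unbind(-1)
    r_xy = torch.sqrt(x * x + y * y) + 1e-9
    theta = torch.atan2(r_xy, z)          # angle from the optical axis
    r_img = f * theta
    u = cx + r_img * x / r_xy
    v = cy + r_img * y / r_xy
    return torch.stack([u, v], dim=-1)
```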

17 pages, 2373 KiB  
Article
Depth Segmentation Approach for Egocentric 3D Human Pose Estimation with a Fisheye Camera
by Hyeonghwan Shin and Seungwon Kim
Appl. Sci. 2024, 14(24), 11937; https://doi.org/10.3390/app142411937 - 20 Dec 2024
Viewed by 1663
Abstract
In this paper, we propose a novel approach for egocentric 3D human pose estimation using fisheye images captured by a head-mounted display (HMD). Most studies on 3D pose estimation focused on heatmap regression and lifting 2D information to 3D space. This paper addresses the issue of depth ambiguity with highly distorted 2D fisheye images by proposing the SegDepth module, which jointly regresses segmentation and depth maps from the image. The SegDepth module distinguishes the human silhouette, which is directly related to pose estimation through segmentation, and simultaneously estimates depth to resolve the depth ambiguity. The extracted segmentation and depth information are transformed into embeddings and used for 3D joint estimation. In the evaluation, the SegDepth module improves the performance of existing methods, demonstrating its effectiveness and general applicability in improving 3D pose estimation. This suggests that the SegDepth module can be integrated into well-established methods such as Mo2Cap2 and xR-EgoPose to improve 3D pose estimation and provide a general performance improvement. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
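A rough sketch of a joint segmentation-and-depth head in the spirit of the SegDepth module is shown below: a shared convolutional stem with two lightweight branches whose outputs would subsequently be embedded for 3D joint estimation. Channel sizes and layer choices are assumptions.

```python
import torch
import torch.nn as nn

class SegDepthHead(nn.Module):
    """Joint segmentation/depth head: shared features, two prediction
    branches. Illustrative sketch, not the paper's SegDepth module."""
    def __init__(self, in_ch=256, mid_ch=128):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 3, padding=1), nn.ReLU(inplace=True))
        self.seg = nn.Conv2d(mid_ch, 1, 1)     # human-silhouette logits
        self.depth = nn.Conv2d(mid_ch, 1, 1)   # per-pixel depth

    def forward(self, feat):
        h = self.shared(feat)
        return torch.sigmoid(self.seg(h)), self.depth(h)
```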

12 pages, 1842 KiB  
Article
Neural Radiance Fields for Fisheye Driving Scenes Using Edge-Aware Integrated Depth Supervision
by Jiho Choi and Sang Jun Lee
Sensors 2024, 24(21), 6790; https://doi.org/10.3390/s24216790 - 22 Oct 2024
Viewed by 1382
Abstract
Neural radiance fields (NeRF) have become an effective method for encoding scenes into neural representations, allowing photorealistic images of unseen viewpoints to be synthesized from a given set of input images. However, the applicability of traditional NeRF is significantly limited by its assumption that images are captured for object-centric scenes with a pinhole camera. Expanding these boundaries, we focus on driving scenarios using a fisheye camera, which offers the advantage of capturing visual information from a wide field of view. To address the challenges posed by the unbounded and distorted characteristics of fisheye images, we propose an edge-aware integration loss function. This approach leverages sparse LiDAR projections and dense depth maps estimated from a learning-based depth model. The proposed algorithm assigns larger weights to neighboring points whose depth values are similar to the sensor data. Experiments were conducted on the KITTI-360 and JBNU-Depth360 datasets, both public, real-world datasets of driving scenes captured with fisheye cameras. The experimental results demonstrate that the proposed method is effective in synthesizing novel-view images, outperforming existing approaches. Full article
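The weighting idea, trusting sparse LiDAR samples more where the dense learned depth agrees with them, can be sketched as a simple weighted depth loss. The exponential weighting and the sigma parameter below are illustrative assumptions, not the paper's exact edge-aware integration loss.

```python
import torch

def weighted_depth_loss(pred_depth, lidar_depth, dense_depth, sigma=0.5):
    """Depth supervision that weights sparse LiDAR samples by how well the
    dense (learning-based) depth agrees with them. Toy sketch only."""
    valid = lidar_depth > 0                               # sparse LiDAR pixels
    agreement = torch.exp(-torch.abs(dense_depth - lidar_depth) / sigma)
    weights = torch.where(valid, agreement, torch.zeros_like(agreement))
    err = (pred_depth - lidar_depth) ** 2
    return (weights * err).sum() / weights.sum().clamp(min=1e-6)
```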

19 pages, 6649 KiB  
Article
Pipeline Landmark Classification of Miniature Pipeline Robot π-II Based on Residual Network ResNet18
by Jian Wang, Chuangeng Chen, Bingsheng Liu, Juezhe Wang and Songtao Wang
Machines 2024, 12(8), 563; https://doi.org/10.3390/machines12080563 - 16 Aug 2024
Cited by 3 | Viewed by 4264
Abstract
A pipeline robot suitable for miniature pipeline detection, namely π-II, was proposed in this paper. It features six wheel-leg mobile mechanisms arranged in a staggered manner, with a monocular fisheye camera located at the center of the front end. The proposed robot can be used to capture images during detection in miniature pipes with an inner diameter of 120 mm. To efficiently identify the robot’s status within the pipeline, such as navigating in straight pipes, curved pipes, or T-shaped pipes, it is necessary to recognize and classify these specific pipeline landmarks accurately. For this purpose, the residual network model ResNet18 was employed to learn from the images of various pipeline landmarks captured by the fisheye camera. A detailed analysis of image characteristics of some common pipeline landmarks was provided, and a dataset of approximately 908 images was created in this paper. After modifying the outputs of the network model, the ResNet18 was trained according to the proposed datasets, and the final test results indicate that this modified network has a high accuracy rate in classifying various pipeline landmarks, demonstrating a promising application prospect of image detection technology based on deep learning in miniature pipelines. Full article
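Modifying the outputs of ResNet18 for landmark classification typically amounts to replacing its final fully connected layer, as in the torchvision sketch below. The class count is an assumption; the paper defines the actual landmark categories (straight, curved, and T-shaped pipes, among others).

```python
import torch.nn as nn
from torchvision import models

def build_landmark_classifier(num_classes=4, pretrained=True):
    """ResNet18 with its final fully connected layer replaced so that it
    predicts pipeline-landmark classes. The class count is an assumption."""
    weights = models.ResNet18_Weights.DEFAULT if pretrained else None
    model = models.resnet18(weights=weights)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model
```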

15 pages, 3294 KiB  
Article
Implementation of a Small-Sized Mobile Robot with Road Detection, Sign Recognition, and Obstacle Avoidance
by Ching-Chang Wong, Kun-Duo Weng, Bo-Yun Yu and Yung-Shan Chou
Appl. Sci. 2024, 14(15), 6836; https://doi.org/10.3390/app14156836 - 5 Aug 2024
Viewed by 2137
Abstract
In this study, a small-sized mobile robot is designed and implemented within a limited volume of 18 cm × 18 cm × 21 cm. It consists of a CPU, a GPU, a 2D LiDAR (Light Detection And Ranging), and two fisheye cameras, giving the robot strong computing and graphics processing capabilities. In addition, three functions are implemented on this small-sized robot: road detection, sign recognition, and obstacle avoidance. For road detection, we divide the captured image into four areas and use an Intel NUC to perform the road detection calculations. The proposed method significantly reduces the system load while maintaining a high processing speed of 25 frames per second (fps). For sign recognition, we use the YOLOv4-tiny model and a data augmentation strategy to significantly improve the model's performance; the experimental results show that the mean Average Precision (mAP) of the model increased by 52.14%. For obstacle avoidance, a 2D LiDAR-based method with a distance-based filtering mechanism is proposed. The filtering mechanism selects the important data points and assigns them appropriate weights, which effectively reduces the computational complexity and improves the robot's response speed when avoiding obstacles. Experimental results and real-world tests confirm that the proposed methods for all three functions run effectively on the implemented small-sized robot. Full article
(This article belongs to the Special Issue Artificial Intelligence and Its Application in Robotics)
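The distance-based filtering mechanism for the 2D LiDAR can be pictured with a small sketch that keeps returns within a range of interest and weights nearer points more heavily. The specific weighting function and thresholds below are assumptions made for illustration, not the paper's scheme.

```python
import numpy as np

def filter_lidar_scan(ranges, angles, max_range=3.0):
    """Keep 2D LiDAR returns inside a range of interest and weight closer
    obstacles more heavily; returns Cartesian points and their weights."""
    keep = (ranges > 0.05) & (ranges < max_range)
    r, a = ranges[keep], angles[keep]
    weights = 1.0 - r / max_range          # nearer points -> larger weight
    xy = np.stack([r * np.cos(a), r * np.sin(a)], axis=1)
    return xy, weights
```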

17 pages, 6129 KiB  
Article
Improving Otsu Method Parameters for Accurate and Efficient in LAI Measurement Using Fisheye Lens
by Jiayuan Tian, Xianglong Liu, Yili Zheng, Liheng Xu, Qingqing Huang and Xueyang Hu
Forests 2024, 15(7), 1121; https://doi.org/10.3390/f15071121 - 27 Jun 2024
Cited by 2 | Viewed by 1330
Abstract
The leaf area index (LAI) is an essential indicator for assessing vegetation growth and understanding the dynamics of forest ecosystems and is defined as the ratio of the total leaf surface area in the plant canopy to the corresponding surface area below it. LAI has applications for obtaining information on plant health, carbon cycling, and forest ecosystems. Due to their price and portability, mobile devices are becoming an alternative to measuring LAI. In this research, a new method for estimating LAI using a smart device with a fisheye lens (SFL) is proposed. The traditional Otsu method was enhanced to improve the accuracy and efficiency of foreground segmentation. The experimental samples were located in Gansu Ziwuling National Forest Park in Qingyang. In the accuracy parameter improvement experiment, the variance of the average LAI value obtained by using both zenith angle segmentation and azimuth angle segmentation methods was reduced by 50%. The results show that the segmentation of the front and back scenes of the new Otsu method is more accurate, and the obtained LAI values are more reliable. In the efficiency parameter improvement experiment, the time spent is reduced by 17.85% when the enhanced Otsu method is used to ensure that the data anomaly rate does not exceed 10%, which improves the integration of the algorithm into mobile devices and the efficiency of obtaining LAI. This study provides a fast and effective method for the near-ground measurement of forest vegetation productivity and provides help for the calculation of forest carbon sequestration efficiency, oxygen release rate, and forest water and soil conservation ability. Full article
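The segment-wise use of Otsu thresholding on a hemispherical canopy image can be sketched as below: the image is split into zenith-angle rings and each ring is thresholded separately, instead of applying a single global threshold. The ring count and the sky/canopy convention are assumptions, and the subsequent gap-fraction-to-LAI inversion is not shown.

```python
import numpy as np
from skimage.filters import threshold_otsu

def segment_by_zenith_rings(gray_hemi, cx, cy, radius, n_rings=5):
    """Segment-wise Otsu thresholding of a hemispherical (fisheye) canopy
    image: threshold each zenith-angle ring separately and return a boolean
    sky mask (per-ring gap fraction follows as sky.mean() over each ring)."""
    h, w = gray_hemi.shape
    ys, xs = np.mgrid[0:h, 0:w]
    r = np.hypot(xs - cx, ys - cy) / radius          # 0 at zenith, 1 at horizon
    sky = np.zeros_like(gray_hemi, dtype=bool)
    for i in range(n_rings):
        ring = (r >= i / n_rings) & (r < (i + 1) / n_rings)
        if ring.any():
            t = threshold_otsu(gray_hemi[ring])
            sky[ring] = gray_hemi[ring] > t          # bright pixels -> sky
    return sky
```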

18 pages, 17598 KiB  
Article
Fisheye Object Detection with Visual Prompting-Aided Fine-Tuning
by Minwoo Jeon, Gyeong-Moon Park and Hyoseok Hwang
Remote Sens. 2024, 16(12), 2054; https://doi.org/10.3390/rs16122054 - 7 Jun 2024
Cited by 2 | Viewed by 2221
Abstract
Fisheye cameras play a crucial role in various fields by offering a wide field of view, enabling the capture of expansive areas within a single frame. Nonetheless, the radial distortion characteristics of fisheye lenses lead to notable shape deformation, particularly at the edges of the image, posing a significant challenge for accurate object detection. In this paper, we introduce a novel method, ‘VP-aided fine-tuning’, which harnesses the strengths of the pretraining–fine-tuning paradigm augmented by visual prompting (VP) to bridge the domain gap between undistorted standard datasets and distorted fisheye image datasets. Our approach involves two key elements: the use of VPs to effectively adapt a pretrained model to the fisheye domain, and a detailed 24-point regression of objects to fit the unique distortions of fisheye images. This 24-point regression accurately defines the object boundaries and substantially reduces the impact of environmental noise. The proposed method was evaluated against existing object detection frameworks on fisheye images, demonstrating superior performance and robustness. Experimental results also showed performance improvements with the application of VP, regardless of the variety of fine-tuning method applied. Full article
(This article belongs to the Section Remote Sensing Image Processing)
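The 24-point boundary representation can be illustrated by sampling an object's outline at equal angular steps around its centroid, as in the sketch below, which builds such a target from a binary mask. The paper's detector regresses these points directly; the construction here is only for intuition.

```python
import numpy as np

def boundary_24_points(mask):
    """Represent an object mask by 24 boundary points sampled at equal
    angular steps around its centroid. Illustrative target construction."""
    ys, xs = np.nonzero(mask)
    cx, cy = xs.mean(), ys.mean()
    ang = np.arctan2(ys - cy, xs - cx)
    rad = np.hypot(xs - cx, ys - cy)
    points = []
    for k in range(24):
        lo, hi = -np.pi + k * np.pi / 12, -np.pi + (k + 1) * np.pi / 12
        sel = (ang >= lo) & (ang < hi)
        r = rad[sel].max() if sel.any() else 0.0     # farthest pixel in sector
        mid = (lo + hi) / 2
        points.append([cx + r * np.cos(mid), cy + r * np.sin(mid)])
    return np.asarray(points)                        # (24, 2) boundary polygon
```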
