Search Results (50)

Search Parameters:
Keywords = PnPS algorithm

25 pages, 12600 KB  
Article
Underwater Object Recovery Using a Hybrid-Controlled ROV with Deep Learning-Based Perception
by Inés Pérez-Edo, Salvador López-Barajas, Raúl Marín-Prades and Pedro J. Sanz
J. Mar. Sci. Eng. 2026, 14(2), 198; https://doi.org/10.3390/jmse14020198 - 18 Jan 2026
Viewed by 400
Abstract
The deployment of large remotely operated vehicles (ROVs) or autonomous underwater vehicles (AUVs) typically requires support vessels, crane systems, and specialized personnel, resulting in increased logistical complexity and operational costs. In this context, lightweight and modular underwater robots have emerged as a cost-effective alternative, capable of reaching significant depths and performing tasks traditionally associated with larger platforms. This article presents a system architecture for recovering a known object using a hybrid-controlled ROV, integrating autonomous perception, high-level interaction, and low-level control. The proposed architecture includes a perception module that estimates the object pose using a Perspective-n-Point (PnP) algorithm, combining object segmentation from a YOLOv11-seg network with 2D keypoints obtained from a YOLOv11-pose model. In addition, a Natural Language ROS Agent is incorporated to enable high-level command interaction between the operator and the robot. These modules interact with low-level controllers that regulate the vehicle degrees of freedom and with autonomous behaviors such as target approach and grasping. The proposed system is evaluated through simulation and experimental tank trials, including object recovery experiments conducted in a 12 × 8 × 5 m test tank at CIRTESU, as well as perception validation in simulated, tank, and harbor scenarios. The results demonstrate successful recovery of a black box using a BlueROV2 platform, showing that architectures of this type can effectively support operators in underwater intervention tasks, reducing operational risk, deployment complexity, and mission costs.
(This article belongs to the Section Ocean Engineering)
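Several results in this listing estimate a 6D pose from 2D–3D correspondences with a Perspective-n-Point solver. As a generic illustration of the underlying geometry (not the authors' YOLOv11-based pipeline), a minimal Direct Linear Transform PnP can be sketched in NumPy; the intrinsics and synthetic correspondences below are assumptions for the example:

```python
import numpy as np

def project(K, R, t, X):
    """Project 3D points X (N, 3) through intrinsics K with pose (R, t)."""
    x = (K @ (R @ X.T + t.reshape(3, 1))).T
    return x[:, :2] / x[:, 2:3]

def dlt_pnp(K, pts3d, pts2d):
    """Linear PnP via the Direct Linear Transform on >= 6 correspondences.

    Solves for the 3x4 pose matrix [R|t] in normalized camera coordinates
    with an SVD, then rescales and projects the rotation onto SO(3).
    A sketch only: practical solvers (EPnP, etc.) add data normalization
    and a nonlinear refinement of the reprojection error.
    """
    n = pts3d.shape[0]
    # Remove the intrinsics: work with normalized image coordinates.
    xn = np.linalg.solve(K, np.column_stack([pts2d, np.ones(n)]).T).T
    A = []
    for (X, Y, Z), (u, v, _) in zip(pts3d, xn):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    P = Vt[-1].reshape(3, 4)
    # Rescale so det(R) = +1; this also fixes the arbitrary SVD sign.
    d = np.linalg.det(P[:, :3])
    P = P * (np.sign(d) / abs(d) ** (1.0 / 3.0))
    U, _, Vt2 = np.linalg.svd(P[:, :3])
    return U @ Vt2, P[:, 3]  # rotation projected onto SO(3), translation
```

With noise-free correspondences this recovers the exact pose; real pipelines follow it with an iterative reprojection-error refinement.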

38 pages, 7210 KB  
Article
Vision–Geometry Fusion for Measuring Pupillary Height and Interpupillary Distance via RC-BlendMask and Ensemble Regression Trees
by Shishuo Han, Zihan Yang and Huiyu Xiang
Appl. Syst. Innov. 2025, 8(6), 181; https://doi.org/10.3390/asi8060181 - 27 Nov 2025
Viewed by 1000
Abstract
This study proposes an automated, visual–geometric fusion method for measuring pupillary height (PH) and interpupillary distance (PD), aiming to replace manual measurements while balancing accuracy, efficiency, and cost accessibility. To this end, a two-layer Ensemble of Regression Trees (ERT) is used to coarsely localize facial landmarks and the pupil center, which is then refined via direction-aware ray casting and edge-side-stratified RANSAC followed by least-squares fitting; in parallel, an RC-BlendMask instance-segmentation module extracts the lowest rim point of the spectacle lens. Head pose and lens-plane depth are estimated with the Perspective-n-Point (PnP) algorithm to enable pixel-to-millimeter calibration and pose gating, thereby achieving 3D quantification of PH/PD under a single-camera setup. In a comparative study with 30 participants against the Zeiss i.Terminal2, the proposed method achieved mean absolute errors of 1.13 mm (PD), 0.73 mm (PH-L), and 0.89 mm (PH-R); Pearson correlation coefficients were r = 0.944 (PD), 0.964 (PH-L), and 0.916 (PH-R), and Bland–Altman 95% limits of agreement were −2.00 to 2.70 mm (PD), −0.84 to 1.76 mm (PH-L), and −1.85 to 1.79 mm (PH-R). Lens segmentation performance reached a Precision of 97.5% and a Recall of 93.8%, supporting robust PH extraction. Overall, the proposed approach delivers measurement agreement comparable to high-end commercial devices on low-cost hardware, satisfies ANSI Z80.1/ISO 21987 clinical tolerances for decentration and prism error, and is suitable for both in-store dispensing and tele-dispensing scenarios.
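The pupil-refinement step above pairs RANSAC with a least-squares fit. The generic version of that pattern, an algebraic (Kåsa) circle fit inside a RANSAC loop over noisy edge points, might look like the following sketch; the thresholds and synthetic data are illustrative, and the paper's edge-side stratification is omitted:

```python
import numpy as np

def fit_circle(pts):
    """Algebraic (Kasa) least-squares circle fit: returns center, radius."""
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([2 * x, 2 * y, np.ones(len(pts))])
    b = x ** 2 + y ** 2
    (cx, cy, c), *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.array([cx, cy]), np.sqrt(c + cx ** 2 + cy ** 2)

def ransac_circle(pts, iters=200, tol=0.05, rng=None):
    """Sample 3 points, fit, keep the largest inlier set, refit on it."""
    rng = rng if rng is not None else np.random.default_rng(0)
    best = None
    for _ in range(iters):
        sample = pts[rng.choice(len(pts), 3, replace=False)]
        c, r = fit_circle(sample)
        resid = np.abs(np.linalg.norm(pts - c, axis=1) - r)
        inliers = resid < tol
        if best is None or inliers.sum() > best.sum():
            best = inliers
    return fit_circle(pts[best])  # final least-squares refit on inliers
```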

24 pages, 41430 KB  
Article
An Optimal Viewpoint-Guided Visual Indexing Method for UAV Autonomous Localization
by Zhiyang Ye, Yukun Zheng, Zheng Ji and Wei Liu
Remote Sens. 2025, 17(13), 2194; https://doi.org/10.3390/rs17132194 - 25 Jun 2025
Viewed by 2010
Abstract
The autonomous positioning of drone-based remote sensing plays an important role in navigation in urban environments. Due to GNSS (Global Navigation Satellite System) signal occlusion, obtaining precise drone locations is still a challenging issue. Inspired by vision-based positioning methods, we propose an autonomous positioning method based on multi-view reference images rendered from the scene’s 3D geometric mesh and apply a bag-of-words (BoW) image retrieval pipeline to achieve efficient and scalable positioning, without utilizing deep learning-based retrieval or 3D point cloud registration. To minimize the number of reference images, scene coverage quantification and optimization are employed to generate the optimal viewpoints. The proposed method jointly exploits a visual bag-of-words tree to accelerate reference image retrieval and improve retrieval accuracy, and the Perspective-n-Point (PnP) algorithm is utilized to obtain the drone’s pose. Experiments are conducted in real-world urban scenarios, and the results show that positioning errors are decreased, with accuracy ranging from sub-meter to 5 m and an average latency of 0.7–1.3 s; this indicates that our method significantly improves accuracy and latency, offering robust, real-time performance over extensive areas without relying on GNSS or dense point clouds.
(This article belongs to the Section Engineering Remote Sensing)
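The bag-of-words retrieval stage above scores database images by tf-idf-weighted visual-word histograms. A flat-vocabulary sketch of that scoring (the paper uses a vocabulary tree for speed; the toy descriptors and vocabulary here are assumptions) might read:

```python
import numpy as np

def bow_histogram(descs, vocab):
    """Quantize local descriptors to their nearest visual word; return counts."""
    d2 = ((descs[:, None, :] - vocab[None, :, :]) ** 2).sum(-1)
    return np.bincount(d2.argmin(axis=1), minlength=len(vocab)).astype(float)

def tfidf_index(histograms):
    """Build tf-idf weighted, L2-normalized database vectors."""
    H = np.asarray(histograms, dtype=float)
    df = (H > 0).sum(axis=0)                      # document frequency per word
    idf = np.log(len(H) / np.maximum(df, 1))
    V = H * idf
    return V / np.linalg.norm(V, axis=1, keepdims=True), idf

def query(hist, db, idf):
    """Cosine similarity of a query histogram against the database."""
    v = hist * idf
    v = v / np.linalg.norm(v)
    return db @ v
```

The best-scoring reference image then seeds the 2D–3D matching that feeds the PnP pose solve.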

22 pages, 23449 KB  
Article
Enhancing Perception Quality in Remote Sensing Image Compression via Invertible Neural Network
by Junhui Li and Xingsong Hou
Remote Sens. 2025, 17(12), 2074; https://doi.org/10.3390/rs17122074 - 17 Jun 2025
Cited by 2 | Viewed by 1638
Abstract
Despite the impressive performance of existing image compression algorithms, they struggle to balance perceptual quality and high image fidelity. To address this issue, we propose a novel invertible neural network-based remote sensing image compression (INN-RSIC) method. Our approach captures the compression distortion from an existing image compression algorithm and encodes it as Gaussian-distributed latent variables using an INN, ensuring that the distortion in the decoded image remains independent of the ground truth. By using the inverse mapping of the INN, we input the decoded image with randomly resampled Gaussian variables, generating enhanced images with improved perceptual quality. We incorporate channel expansion, Haar transformation, and invertible blocks into the INN to accurately represent compression distortion. Additionally, a quantization module (QM) is introduced to mitigate format conversion impact, enhancing generalization and perceptual quality. Extensive experiments show that INN-RSIC achieves superior perceptual quality and fidelity compared to existing algorithms. As a lightweight plug-and-play (PnP) method, the proposed INN-based enhancer can be easily integrated into existing high-fidelity compression algorithms, enabling flexible and simultaneous decoding of images with enhanced perceptual quality.
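The invertible blocks at the heart of such an INN guarantee an exact inverse mapping by construction. A minimal NICE-style additive coupling layer shows the principle; the tiny random MLP used as the coupling function is an arbitrary stand-in, not the paper's architecture (which also uses channel expansion and Haar transforms):

```python
import numpy as np

class AdditiveCoupling:
    """y1 = x1, y2 = x2 + f(x1): exactly invertible for any function f."""

    def __init__(self, dim, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.half = dim // 2
        # Arbitrary fixed random MLP as the coupling function f.
        self.W1 = rng.normal(0, 0.1, (self.half, hidden))
        self.W2 = rng.normal(0, 0.1, (hidden, dim - self.half))

    def f(self, x1):
        return np.tanh(x1 @ self.W1) @ self.W2

    def forward(self, x):
        x1, x2 = x[..., :self.half], x[..., self.half:]
        return np.concatenate([x1, x2 + self.f(x1)], axis=-1)

    def inverse(self, y):
        y1, y2 = y[..., :self.half], y[..., self.half:]
        return np.concatenate([y1, y2 - self.f(y1)], axis=-1)
```

Because the inverse only subtracts what the forward pass added, reconstruction is exact to machine precision, which is what lets the INN map distortion to latent variables and back without loss.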

29 pages, 2702 KB  
Article
IFMIR-VR: Visual Relocalization for Autonomous Vehicles Using Integrated Feature Matching and Image Retrieval
by Gang Li, Xiaoman Xu, Jian Yu and Hao Luo
Appl. Sci. 2025, 15(10), 5767; https://doi.org/10.3390/app15105767 - 21 May 2025
Viewed by 1282
Abstract
Relocalization technology is an important part of autonomous vehicle navigation. It allows the vehicle to find its position on the map after a reboot. This paper presents a relocalization algorithm framework that uses image retrieval techniques. An integrated matching algorithm is applied during the feature matching process. This improves the accuracy of the vehicle’s relocalization. We use image retrieval to select the most relevant image from the map database. The integrated matching algorithm then finds precise feature correspondences. Using these correspondences and depth information, we calculate the vehicle’s global pose with the Perspective-n-Point (PnP) and Levenberg–Marquardt (L-M) algorithms. This process helps the vehicle determine its position on the map. Experimental results on public datasets show that the proposed framework outperforms existing methods like LightGlue and LoFTR in terms of matching accuracy.
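The PnP result above is polished with Levenberg–Marquardt. The generic L-M loop, damped normal equations with a finite-difference Jacobian, can be sketched on a small 2D rigid-alignment problem; the problem and parameters below are illustrative, not the paper's pose refinement:

```python
import numpy as np

def lm_refine(residual_fn, p0, iters=50, lam=1e-3):
    """Generic Levenberg-Marquardt with a forward-difference Jacobian."""
    p = np.asarray(p0, dtype=float)
    for _ in range(iters):
        r = residual_fn(p)
        J = np.empty((r.size, p.size))
        for j in range(p.size):
            dp = np.zeros_like(p)
            dp[j] = 1e-6
            J[:, j] = (residual_fn(p + dp) - r) / 1e-6
        H = J.T @ J + lam * np.eye(p.size)      # damped normal equations
        step = np.linalg.solve(H, -J.T @ r)
        r_new = residual_fn(p + step)
        if r_new @ r_new < r @ r:
            p, lam = p + step, lam * 0.5        # accept: reduce damping
        else:
            lam *= 10.0                         # reject: increase damping
    return p

def make_residual(src, dst):
    """Residuals of a 2D rigid transform (theta, tx, ty) applied to src."""
    def resid(p):
        th, tx, ty = p
        R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
        return ((src @ R.T + np.array([tx, ty])) - dst).ravel()
    return resid
```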

25 pages, 12377 KB  
Article
Exploiting Weighted Multidirectional Sparsity for Prior Enhanced Anomaly Detection in Hyperspectral Images
by Jingjing Liu, Jiashun Jin, Xianchao Xiu, Wanquan Liu and Jianhua Zhang
Remote Sens. 2025, 17(4), 602; https://doi.org/10.3390/rs17040602 - 10 Feb 2025
Cited by 2 | Viewed by 1083
Abstract
Anomaly detection (AD) is an important topic in remote sensing, aiming to identify unusual or abnormal features within the data. However, most existing low-rank representation methods use the nuclear norm for background estimation and do not consider the different contributions of different singular values. In addition, they overlook the spatial relationships of abnormal regions, particularly failing to fully leverage the 3D structured information of the data. Moreover, noise in practical scenarios can disrupt the low-rank structure of the background, making it challenging to separate anomalies from the background and ultimately reducing detection accuracy. To address these challenges, this paper proposes a weighted multidirectional sparsity regularized low-rank tensor representation method (WMS-LRTR) for AD. WMS-LRTR uses the weighted tensor nuclear norm for background estimation to characterize the low-rank property of the background. Considering the correlation between abnormal pixels across different dimensions, the proposed method introduces a novel weighted multidirectional sparsity (WMS) by unfolding the anomaly tensor along multiple modes to better exploit the sparsity of the anomaly. To improve the robustness of AD, we further embed a user-friendly plug-and-play (PnP) denoising prior to optimize the background modeling under the low-rank structure and facilitate the separation of sparse anomalous regions. Furthermore, an effective iterative algorithm based on the alternating direction method of multipliers (ADMM) is introduced, whose subproblems can be solved quickly by fast solvers or have closed-form solutions. Numerical experiments on various datasets show that WMS-LRTR outperforms state-of-the-art AD methods, demonstrating its better detection ability.
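Weighted nuclear-norm estimation differs from the plain nuclear norm in that each singular value gets its own shrinkage. A toy weighted singular-value thresholding step, with weights inversely proportional to magnitude so that small (noise-dominated) singular values are shrunk hardest, might look like this; the threshold `tau` and the synthetic low-rank data are assumptions:

```python
import numpy as np

def weighted_svt(Y, tau, eps=1e-6):
    """Weighted singular value thresholding.

    Shrinks small singular values (mostly noise) harder than large ones
    (mostly low-rank background structure), a matrix-case sketch of the
    weighting idea; tensor methods apply it per unfolding mode.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    w = tau / (s + eps)              # weight inversely proportional to magnitude
    s_shrunk = np.maximum(s - w, 0.0)
    return (U * s_shrunk) @ Vt       # scale columns of U, recompose
```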

21 pages, 6413 KB  
Article
Targetless Radar–Camera Extrinsic Parameter Calibration Using Track-to-Track Association
by Xinyu Liu, Zhenmiao Deng and Gui Zhang
Sensors 2025, 25(3), 949; https://doi.org/10.3390/s25030949 - 5 Feb 2025
Cited by 1 | Viewed by 4607
Abstract
One of the challenges in calibrating millimeter-wave radar and camera lies in the sparse semantic information of the radar point cloud, making it hard to extract environment features corresponding to the images. To overcome this problem, we propose a track association algorithm for heterogeneous sensors, to achieve targetless calibration between the radar and camera. Our algorithm extracts corresponding points from millimeter-wave radar and image coordinate systems by considering the association of tracks from different sensors, without any explicit target or prior for the extrinsic parameter. Then, perspective-n-point (PnP) and nonlinear optimization algorithms are applied to obtain the extrinsic parameter. In an outdoor experiment, our algorithm achieved a track association accuracy of 96.43% and an average reprojection error of 2.6649 pixels. On the CARRADA dataset, our calibration method yielded a reprojection error of 3.1613 pixels, an average rotation error of 0.8141°, and an average translation error of 0.0754 m. Furthermore, robustness tests demonstrated the effectiveness of our calibration algorithm in the presence of noise. Full article
(This article belongs to the Section Remote Sensors)
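Track-to-track association of the kind described above can be sketched as a greedy gated nearest-neighbor pairing of trajectories by mean distance; the paper's actual association and the gate value here are more elaborate and synthetic respectively, so treat this purely as a pattern illustration:

```python
import numpy as np

def associate_tracks(tracks_a, tracks_b, gate=1.0):
    """Greedily pair tracks (each an array of (T, 2) positions) by mean distance.

    Returns a list of (i, j) index pairs. A sketch only: real systems align
    timestamps across sensors and use statistical gating, not a fixed gate.
    """
    cost = np.array([[np.linalg.norm(a - b, axis=1).mean() for b in tracks_b]
                     for a in tracks_a])
    pairs, used = [], set()
    for i in np.argsort(cost.min(axis=1)):        # most confident tracks first
        for j in np.argsort(cost[i]):
            if int(j) not in used and cost[i, j] < gate:
                pairs.append((int(i), int(j)))
                used.add(int(j))
                break
    return pairs
```

The paired track points then serve as the 2D–3D correspondences fed to the PnP stage.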

14 pages, 2756 KB  
Article
SCOUT: Skull-Corrected Optimization for Ultrasound Transducers
by Zheng Jiang, Michelle Hua, Jacqueline Li, Hieu Le Mau, James Choi, William B. Gormley, Jose M. Amich and Raahil M. Sha
Bioengineering 2024, 11(11), 1144; https://doi.org/10.3390/bioengineering11111144 - 13 Nov 2024
Cited by 1 | Viewed by 3320
Abstract
Transcranial focused ultrasound has been studied for non-invasive and localized treatment of many brain diseases. The biggest challenge for focusing ultrasound onto the brain is the skull, which attenuates ultrasound and changes its propagation direction, leading to pressure drop, focus shift, and defocusing. We presented an optimization algorithm which automatically found the optimal location for placing a single-element focused transducer. At this optimal location, the focus shift was in an acceptable range and the ultrasound was tightly focused. The algorithm simulated the beam profiles of placing the transducer at different locations and compared the results. Locations with a normalized peak-negative pressure (PNP) above threshold were first found. Then, the optimal location was identified as the location with the smallest focal volume. The optimal location found in this study had a normalized PNP of 0.966 and a focal volume 6.8% smaller than without the skull. A Zeta navigation system was used to automatically place the transducer and track the error caused by movement. These results demonstrated that the algorithm could find the optimal transducer location to avoid large focus shift and defocusing. With the Zeta navigation system, our algorithm can help to make transcranial focused ultrasound treatment safer and more successful.
(This article belongs to the Section Biomedical Engineering and Biomaterials)
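The selection rule described in the abstract (threshold the normalized peak-negative pressure, then take the smallest focal volume) reduces to a small filter-and-argmin; the candidate arrays below are synthetic stand-ins for the simulated beam profiles:

```python
import numpy as np

def best_transducer_location(pnp, focal_volume, threshold=0.95):
    """Pick the candidate with the smallest focal volume among those whose
    normalized peak-negative pressure meets the threshold.

    pnp, focal_volume: 1D arrays over candidate placements (one entry per
    simulated location). Returns the index of the chosen location, or None
    if no candidate clears the pressure threshold.
    """
    ok = np.flatnonzero(pnp >= threshold)
    if ok.size == 0:
        return None
    return int(ok[np.argmin(focal_volume[ok])])
```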

25 pages, 11107 KB  
Article
Joint Optimization of the 3D Model and 6D Pose for Monocular Pose Estimation
by Liangchao Guo, Lin Chen, Qiufu Wang, Zhuo Zhang and Xiaoliang Sun
Drones 2024, 8(11), 626; https://doi.org/10.3390/drones8110626 - 30 Oct 2024
Cited by 1 | Viewed by 1848
Abstract
The autonomous landing of unmanned aerial vehicles (UAVs) relies on a precise relative 6D pose between platforms. Existing model-based monocular pose estimation methods need an accurate 3D model of the target and cannot handle its absence. This paper adopts the multi-view geometry constraints within the monocular image sequence to solve the problem and introduces a novel approach to monocular pose estimation that jointly optimizes the target’s 3D model and the relative 6D pose. We propose to represent the target’s 3D model using a set of sparse 3D landmarks. The 2D landmarks are detected in the input image by a trained neural network. Based on the 2D–3D correspondences, the initial pose estimate is obtained by solving the PnP problem. To achieve joint optimization, this paper builds the objective function on the minimization of the reprojection error, with the correction values of the 3D landmarks and the 6D pose as the parameters to be solved. By solving the optimization problem, the joint optimization of the target’s 3D model and the 6D pose is realized. In addition, a sliding window combined with a keyframe extraction strategy is adopted to speed up processing. Experimental results on synthetic and real image sequences show that the proposed method achieves real-time, online, high-precision monocular pose estimation in the absence of an accurate 3D model via the joint optimization of the target’s 3D model and pose.

18 pages, 2495 KB  
Article
An Energy-Efficient Field-Programmable Gate Array (FPGA) Implementation of a Real-Time Perspective-n-Point Solver
by Haobo Lv and Qiongzhi Wu
Electronics 2024, 13(19), 3815; https://doi.org/10.3390/electronics13193815 - 26 Sep 2024
Viewed by 1486
Abstract
Solving the Perspective-n-Point (PnP) problem is difficult in low-power systems due to the high computing workload. To handle this challenge, we present an originally designed FPGA implementation of a PnP solver based on Vivado HLS. A matrix operation library and a matrix decomposition library based on QR decomposition have been developed, upon which the EPnP algorithm has been implemented. To enhance the operational speed of the system, we employed pipeline optimization techniques and adjusted the computational process to shorten the calculation time. The experimental results show that when the number of input data points is 300, the proposed system achieves a processing speed of 45.2 fps with a power consumption of 1.7 W and reaches a peak signal-to-noise ratio of over 70 dB. Our system consumes only 3.9% of the power consumption per calculation compared to desktop-level processors. The proposed system significantly reduces the power consumption required for the PnP solution and is suitable for application in low-power systems.
(This article belongs to the Special Issue System-on-Chip (SoC) and Field-Programmable Gate Array (FPGA) Design)
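The solver above builds EPnP on a QR decomposition library. As a software reference point (not the HLS implementation), a textbook Householder QR can be written in a few lines of NumPy:

```python
import numpy as np

def householder_qr(A):
    """QR decomposition by Householder reflections: returns Q, R with A = Q @ R."""
    m, n = A.shape
    Q = np.eye(m)
    R = A.astype(float).copy()
    for k in range(min(m, n)):
        x = R[k:, k]
        v = x.copy()
        # Choose the sign that avoids cancellation in v[0].
        v[0] += np.copysign(np.linalg.norm(x), x[0])
        nv = np.linalg.norm(v)
        if nv < 1e-12:           # column already zero below the diagonal
            continue
        v /= nv
        H = np.eye(m - k) - 2.0 * np.outer(v, v)   # reflector zeroing x[1:]
        R[k:, :] = H @ R[k:, :]
        Q[:, k:] = Q[:, k:] @ H
    return Q, R
```

Hardware versions typically trade this dense formulation for Givens rotations or fixed-point CORDIC stages, which pipeline better on an FPGA.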

27 pages, 1997 KB  
Article
Robust a Posteriori Error Estimates of Time-Dependent Poisson–Nernst–Planck Equations
by Keli Fu and Tingting Hao
Mathematics 2024, 12(17), 2610; https://doi.org/10.3390/math12172610 - 23 Aug 2024
Cited by 1 | Viewed by 1014
Abstract
The paper considers a posteriori error estimates for fully discrete approximations of time-dependent Poisson–Nernst–Planck (PNP) equations, which provide tools for optimizing the choice of each time step when working with adaptive meshes. The equations are discretized by the backward Euler scheme in time and conforming finite elements in space. By overcoming the coupling of time and space in the fully discrete solution and handling the nonlinearity by taking G-derivatives of the nonlinear system, a computable, robust, effective, and reliable space–time a posteriori error estimate is obtained. The adaptive algorithm constructed from the estimates realizes parallel adaptation of time steps and mesh refinements, which is verified by numerical experiments with a time singular point and adaptive mesh refinement with boundary layer effects.
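For context, the time-dependent PNP system couples drift-diffusion equations for the ion concentrations to a Poisson equation for the potential. In a commonly used normalized form (the paper's precise coefficients and boundary conditions may differ), with concentrations $c_i$, valences $z_i$, and potential $\phi$, the system and its backward Euler discretization at step $n$ with step size $\tau_n$ read:

```latex
\partial_t c_i - \nabla\cdot\left(\nabla c_i + z_i\, c_i \nabla\phi\right) = f_i,
\qquad
-\Delta\phi = \sum_i z_i c_i + f,
\\[4pt]
\frac{c_i^{\,n} - c_i^{\,n-1}}{\tau_n}
  - \nabla\cdot\left(\nabla c_i^{\,n} + z_i\, c_i^{\,n} \nabla\phi^{\,n}\right) = f_i^{\,n},
\qquad
-\Delta\phi^{\,n} = \sum_i z_i c_i^{\,n} + f^{\,n}.
```

The nonlinearity sits in the drift term $z_i c_i \nabla\phi$, which is why the estimates require G-derivatives of the coupled system.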

22 pages, 11273 KB  
Article
Identification and Positioning Method of Bulk Cargo Terminal Unloading Hopper Based on Monocular Vision Three-Dimensional Measurement
by Ziyang Shen, Jiaqi Wang, Yujie Zhang, Luocheng Zheng, Chao Mi and Yang Shen
J. Mar. Sci. Eng. 2024, 12(8), 1282; https://doi.org/10.3390/jmse12081282 - 30 Jul 2024
Cited by 7 | Viewed by 1972
Abstract
Rapid identification and localization of dry bulk cargo hoppers are currently core issues in the automation control of gantry cranes at dry bulk terminals. The current conventional method relies on LiDAR systems for the identification and positioning of bulk unloading hoppers. However, this approach is complex and costly. In contrast, GPS-based positioning solutions for bulk unloading hoppers are prone to damage due to the vibrations generated during the operation process. Therefore, in this paper, a hopper localization system based on monocular camera vision is proposed to locate the position of the bulk unloading hopper. The hopper identification and localization process is divided into three stages. The first stage uses the improved YOLOv5 model to quickly and roughly locate the hopper target. The second stage uses morphological geometrical features to locate the corner points of the hopper target. The third stage determines the three-dimensional coordinates of the hopper target by solving the position of the corner points in the world coordinate system through the PnP (Perspective-n-Point) algorithm. The experimental results show that the method's average positioning accuracy is above 93%, demonstrating its accuracy and effectiveness.

20 pages, 9507 KB  
Article
Sparse SAR Imaging Based on Non-Local Asymmetric Pixel-Shuffle Blind Spot Network
by Yao Zhao, Decheng Xiao, Zhouhao Pan, Bingo Wing-Kuen Ling, Ye Tian and Zhe Zhang
Remote Sens. 2024, 16(13), 2367; https://doi.org/10.3390/rs16132367 - 28 Jun 2024
Viewed by 1585
Abstract
The integration of Synthetic Aperture Radar (SAR) imaging technology with deep neural networks has experienced significant advancements in recent years. Yet, the scarcity of high-quality samples and the difficulty of extracting prior information from SAR data have limited progress in this domain. This study introduces an innovative sparse SAR imaging approach using a self-supervised non-local asymmetric pixel-shuffle blind spot network. This strategy enables the network to be trained without labeled samples, thus solving the problem of the scarcity of high-quality samples. Through an asymmetric pixel-shuffle downsampling (AP) operation, the spatial correlation between pixels is broken so that the blind spot network can adapt to the actual scene. The network also incorporates a non-local module (NLM) into its blind spot architecture, enhancing its capability to analyze a broader range of information and extract more comprehensive prior knowledge from SAR data. Subsequently, Plug and Play (PnP) technology is used to integrate the trained network into the sparse SAR imaging model to solve the regularization term problem. The optimization of the inverse problem is achieved through the Alternating Direction Method of Multipliers (ADMM) algorithm. The experimental results of the unlabeled samples demonstrate that our method significantly outperforms traditional techniques in reconstructing images across various regions.
(This article belongs to the Special Issue Advances in Radar Imaging with Deep Learning Algorithms)
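The plug-and-play ADMM pattern used here, and in several other results in this listing, replaces the proximal step of ADMM with a denoiser. A 1D inpainting toy makes the structure visible; the box-filter "denoiser" is a deliberate stand-in for the trained network, and all names and parameters are illustrative:

```python
import numpy as np

def box_denoise(v, width=5):
    """Stand-in 'denoiser': a simple moving-average filter."""
    return np.convolve(v, np.ones(width) / width, mode="same")

def pnp_admm_inpaint(y, mask, rho=1.0, iters=50):
    """Recover a smooth signal from masked observations y (zeros where
    mask == 0) with plug-and-play ADMM: the z-update is the denoiser."""
    x = y.copy()
    z = y.copy()
    u = np.zeros_like(y)
    for _ in range(iters):
        # x-update: closed form for the diagonal data term ||mask*x - y||^2.
        x = (mask * y + rho * (z - u)) / (mask + rho)
        z = box_denoise(x + u)          # z-update: denoiser acts as the prior
        u = u + x - z                   # dual (scaled multiplier) update
    return z
```

Swapping `box_denoise` for a learned network is the whole "plug-and-play" idea: the data-fit step stays closed-form while the prior is implicit in the denoiser.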

23 pages, 4053 KB  
Article
Hyperspectral Image Denoising Based on Deep and Total Variation Priors
by Peng Wang, Tianman Sun, Yiming Chen, Lihua Ge, Xiaoyi Wang and Liguo Wang
Remote Sens. 2024, 16(12), 2071; https://doi.org/10.3390/rs16122071 - 7 Jun 2024
Cited by 5 | Viewed by 4337
Abstract
To address the problems of noise interference and image blurring in hyperspectral imaging (HSI), this paper proposes a denoising method for HSI based on deep learning and a total variation (TV) prior. The method minimizes the first-order moment distance between the deep prior of a Fast and Flexible Denoising Convolutional Neural Network (FFDNet) and the Enhanced 3D TV (E3DTV) prior, obtaining dual priors that complement and reinforce each other’s advantages. Specifically, the original HSI is initially processed with a random binary sparse observation matrix to achieve a sparse representation. Subsequently, the plug-and-play (PnP) algorithm is employed within the framework of generalized alternating projection (GAP) to denoise the sparsely represented HSI. Experimental results demonstrate that, compared to existing methods, this method shows significant advantages in both quantitative and qualitative assessments, effectively enhancing the quality of HSIs.

13 pages, 2958 KB  
Article
Research on Six-Degree-of-Freedom Refueling Robotic Arm Positioning and Docking Based on RGB-D Visual Guidance
by Mingbo Yang and Jiapeng Liu
Appl. Sci. 2024, 14(11), 4904; https://doi.org/10.3390/app14114904 - 5 Jun 2024
Cited by 7 | Viewed by 2892
Abstract
The main contribution of this paper is the proposal of a six-degree-of-freedom (6-DoF) refueling robotic arm positioning and docking technology based on RGB-D visual guidance, together with in-depth research on and experimental validation of the technology. We have integrated the YOLOv8 algorithm with the Perspective-n-Point (PnP) algorithm to achieve precise detection and pose estimation of the target refueling interface. The focus is on resolving the recognition and positioning challenges of a specialized refueling interface by the 6-DoF robotic arm during the automated refueling process. To capture the unique characteristics of the refueling interface, we developed a dedicated dataset for the specialized refueling connectors, ensuring the YOLO algorithm’s accurate identification of the target interfaces. Subsequently, the detected interface information is converted into precise 6-DoF pose data using the PnP algorithm. These data are used to determine the desired end-effector pose of the robotic arm. The robotic arm’s movements are controlled through a trajectory planning algorithm to complete the refueling gun docking process. An experimental setup was established in the laboratory to validate the accuracy of the visual recognition and the applicability of the robotic arm’s docking posture. The experimental results demonstrate that under general lighting conditions, the recognition accuracy of this docking interface method meets the docking requirements. Compared to traditional vision-guided methods based on OpenCV, this visual guidance algorithm exhibits better adaptability and effectively provides pose information for the robotic arm.
