Search Results (84)

Search Parameters:
Keywords = Intel d-435

31 pages, 34013 KiB  
Article
Vision-Based 6D Pose Analytics Solution for High-Precision Industrial Robot Pick-and-Place Applications
by Balamurugan Balasubramanian and Kamil Cetin
Sensors 2025, 25(15), 4824; https://doi.org/10.3390/s25154824 - 6 Aug 2025
Abstract
High-precision 6D pose estimation for pick-and-place operations remains a critical problem for industrial robot arms in manufacturing. This study introduces an analytics-based solution for 6D pose estimation designed for a real-world industrial application: it enables the Stäubli TX2-60L (manufactured by Stäubli International AG, Horgen, Switzerland) robot arm to pick up metal plates from various locations and place them into a precisely defined slot on a brake pad production line. The system uses a fixed eye-to-hand Intel RealSense D435 RGB-D camera (manufactured by Intel Corporation, Santa Clara, California, USA) to capture color and depth data. A robust software infrastructure developed in LabVIEW (ver. 2019) integrated with the NI Vision (ver. 2019) library processes the images through a series of steps, including particle filtering, equalization, and pattern matching, to determine the X-Y positions and Z-axis rotation of the object. The Z-position of the object is calculated from the camera’s intensity data, while the remaining X-Y rotation angles are determined using the angle-of-inclination analytics method. It is experimentally verified that the proposed analytical solution outperforms a hybrid method (YOLO-v8 combined with PnP/RANSAC algorithms). Experimental results across four distinct picking scenarios demonstrate the proposed solution’s superior accuracy, with position errors under 2 mm, orientation errors below 1°, and a perfect success rate in pick-and-place tasks. Full article
(This article belongs to the Section Sensors and Robotics)
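The sensing side of this pipeline is straightforward to reproduce. Below is a minimal sketch, assuming a connected D435 and the pyrealsense2 package, of grabbing color-aligned depth and reading a Z value at one pixel; the paper’s actual processing runs in LabVIEW/NI Vision, so this fragment only illustrates the camera interface, with stream settings and the probe pixel chosen arbitrarily.

```python
# Minimal sketch: capture aligned color/depth frames from a RealSense D435
# with pyrealsense2 and read the Z-distance at a pixel. Stream settings and
# the pixel of interest are illustrative, not taken from the paper.
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)

align = rs.align(rs.stream.color)  # map depth pixels onto the color image
try:
    frames = align.process(pipeline.wait_for_frames())
    depth = frames.get_depth_frame()
    # Z-distance (meters) at the image center; in the paper's pipeline this
    # pixel would come from NI Vision pattern matching instead.
    z = depth.get_distance(320, 240)
    print(f"Z at (320, 240): {z:.4f} m")
finally:
    pipeline.stop()
```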

24 pages, 8344 KiB  
Article
Research and Implementation of Travel Aids for Blind and Visually Impaired People
by Jun Xu, Shilong Xu, Mingyu Ma, Jing Ma and Chuanlong Li
Sensors 2025, 25(14), 4518; https://doi.org/10.3390/s25144518 - 21 Jul 2025
Viewed by 356
Abstract
Blind and visually impaired (BVI) people face significant challenges in perception, navigation, and safety during travel. Existing infrastructure (e.g., blind lanes) and traditional aids (e.g., walking sticks, basic audio feedback) provide limited flexibility and interactivity for complex environments. To solve this problem, we propose a real-time travel assistance system based on deep learning. The hardware comprises an NVIDIA Jetson Nano controller, an Intel D435i depth camera for environmental sensing, and SG90 servo motors for feedback. To address embedded device computational constraints, we developed a lightweight object detection and segmentation algorithm. Key innovations include a multi-scale attention feature extraction backbone, a dual-stream fusion module incorporating the Mamba architecture, and adaptive context-aware detection/segmentation heads. This design ensures high computational efficiency and real-time performance. The system workflow is as follows: (1) the D435i captures real-time environmental data; (2) the processor analyzes this data, converting obstacle distances and path deviations into electrical signals; (3) servo motors deliver vibratory feedback for guidance and alerts. Preliminary tests confirm that the system can effectively detect obstacles and correct path deviations in real time, suggesting its potential to assist BVI users. However, as this is a work in progress, comprehensive field trials with BVI participants are required to fully validate its efficacy. Full article
(This article belongs to the Section Intelligent Sensors)
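Step (3) of the workflow, turning obstacle distances and path deviations into vibratory cues, can be modeled in a few lines. The sketch below is hypothetical: the thresholds, the linear ramp, and both function names are assumptions, not the paper’s actual control law.

```python
# Hypothetical sketch of the feedback step (3): mapping an obstacle distance
# and a path deviation to servo vibration commands. Thresholds, duty-cycle
# range, and the function names are assumptions, not values from the paper.
def vibration_duty(distance_m: float, d_warn: float = 2.0, d_stop: float = 0.5) -> float:
    """Return a PWM duty cycle in [0, 1]: stronger vibration as the obstacle nears."""
    if distance_m >= d_warn:
        return 0.0                       # nothing close: no vibration
    if distance_m <= d_stop:
        return 1.0                       # imminent obstacle: full-strength alert
    return (d_warn - distance_m) / (d_warn - d_stop)  # linear ramp in between

def side_cue(path_deviation_m: float, dead_band: float = 0.1) -> str:
    """Pick which servo vibrates to steer the user back onto the path."""
    if abs(path_deviation_m) < dead_band:
        return "none"
    return "left" if path_deviation_m > 0 else "right"

print(vibration_duty(1.2), side_cue(-0.3))  # e.g. 0.533... and "right"
```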

16 pages, 6435 KiB  
Article
A Switched-Capacitor-Based Quasi-H7 Inverter for Common-Mode Voltage Reduction
by Thi-Thanh Nga Nguyen, Tan-Tai Tran, Minh-Duc Ngo and Seon-Ju Ahn
Energies 2025, 18(12), 3218; https://doi.org/10.3390/en18123218 - 19 Jun 2025
Viewed by 338
Abstract
This paper proposes a novel three-phase two-level DC-AC inverter with significantly reduced common-mode voltage. The proposed inverter combines a conventional three-phase H7 configuration with a voltage multiplier network, effectively doubling the DC-link voltage relative to the input. Compared to existing solutions, the topology achieves a remarkably low common-mode voltage, limited to only 16.6% of the DC-link voltage. Additionally, the voltage stress across the additional switches remains at half of the DC-link voltage. The paper details the operating principles, mathematical formulation, and circuit-level analysis of the proposed inverter. Simulation results are provided to validate its performance. Furthermore, a hardware prototype has been implemented using a DSP TMS320F28379D microcontroller manufactured by Texas Instruments (Dallas, TX, USA) in conjunction with an Altera Cyclone® IV EP4CE22F17C6N FPGA-based digital control platform manufactured by Intel Corporation (Santa Clara, CA, USA). Experimental results are presented to confirm the effectiveness and feasibility of the proposed design. Full article
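The 16.6% figure corresponds to Vdc/6, which follows from standard two-level inverter analysis once the zero vectors are excluded. The sketch below enumerates the eight switching states to show the bound; it is textbook reasoning, not the paper’s modulation code, and assumes the quasi-H7 suppresses the zero-vector states.

```python
# Sketch of why avoiding the zero vectors bounds the common-mode voltage (CMV):
# for a two-level inverter referenced to the DC-link midpoint, each phase
# outputs ±Vdc/2 and CMV = (va + vb + vc) / 3. Textbook relation, not code
# from the paper.
from itertools import product

VDC = 1.0  # normalized DC-link voltage
for state in product((0, 1), repeat=3):          # (Sa, Sb, Sc) switching state
    phases = [(s - 0.5) * VDC for s in state]    # pole voltages ±Vdc/2
    cmv = sum(phases) / 3
    zero_vector = state in ((0, 0, 0), (1, 1, 1))
    print(state, f"CMV = {cmv:+.3f} Vdc", "(zero vector)" if zero_vector else "")
# Active vectors give CMV = ±Vdc/6 (16.6%); only the zero vectors reach ±Vdc/2,
# which is what H7-type topologies suppress.
```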

17 pages, 5647 KiB  
Article
Solar Photovoltaic Diagnostic System with Logic Verification and Integrated Circuit Design for Fabrication
by Abhitej Divi and Shuza Binzaid
Solar 2025, 5(2), 24; https://doi.org/10.3390/solar5020024 - 30 May 2025
Cited by 1 | Viewed by 1083
Abstract
Solar photovoltaic (PV) panels are among the best solutions for reducing greenhouse gas emissions from fossil fuel combustion, with global capacity now exceeding 714 GW due to rapid technological advances in solar panels (SPs). However, SPs’ efficiency and lifespan remain limited by the absence of advanced fault-detection systems, and they are prone to short circuits (SC), open circuits (OC), and power degradation. Deployment at this scale therefore requires reliable, real-time fault diagnosis to maintain panel performance. Traditional diagnostic methods implemented using MPPT, neural networks, or microcontroller-based systems often rely on complex computational algorithms and are not cost-effective. To address this issue, this paper proposes a diagnostic system composed of six functional blocks. The proposed system was initially verified using an Intel DE-10 Lite FPGA board. Once its functionality was confirmed, an ASIC design was proposed for mass production, offering a significantly lower implementation cost and reduced hardware complexity than prior methods. Different circuit designs were developed for each of the six blocks. All designs were created using Cadence software and TSMC 180 nm technology files. The basic components used in these designs include PMOS transistors with 300 nm channel length and 2 µm width, NMOS transistors with 350 nm channel length and 2 µm width, as well as resistors and capacitors. Differential amplifiers with a gain of 40 dB were used for voltage and current sensing from the SP. The chip activation signal generator circuit was designed with an adjustable frequency and generated 120 MHz and 100 MHz signals in this work. The decision-making block, the Logic Driver Circuit, was implemented using a reduced number of transistors. A custom memory block with a reset switch was also implemented to store the fault value detected at the SP. Finally, the proposed ASIC design was prepared for fabrication; it is highly cost-effective in mass production and does not require complex computational stages. Full article
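To make the decision logic concrete, here is a hypothetical software model of what a fault-classifying block might compute from sensed voltage and current. All thresholds and names are invented; the paper realizes this logic in transistors rather than software.

```python
# Hypothetical software model of the decision-making (Logic Driver) block:
# classify a panel reading as short circuit, open circuit, degradation, or
# normal from sensed voltage and current. All thresholds are invented for
# illustration.
def classify_fault(v: float, i: float, v_oc: float, i_sc: float, p_min: float) -> str:
    if v < 0.05 * v_oc and i > 0.9 * i_sc:
        return "short circuit"        # near-zero voltage, near-maximum current
    if i < 0.02 * i_sc and v > 0.95 * v_oc:
        return "open circuit"         # near-zero current at open-circuit voltage
    if v * i < p_min:
        return "power degradation"    # output power below the acceptable floor
    return "normal"

# Example reading against nominal v_oc = 22 V, i_sc = 8 A, p_min = 60 W:
print(classify_fault(v=17.5, i=5.2, v_oc=22.0, i_sc=8.0, p_min=60.0))  # "normal"
```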

11 pages, 4392 KiB  
Proceeding Paper
Implementation of Autonomous Navigation for Solar-Panel-Cleaning Vehicle Based on YOLOv4-Tiny
by Wen-Chang Cheng and Xu-Dong Chen
Eng. Proc. 2025, 92(1), 31; https://doi.org/10.3390/engproc2025092031 - 28 Apr 2025
Viewed by 361
Abstract
We developed an autonomous navigation system for a solar-panel-cleaning vehicle. The system utilizes the YOLOv4-Tiny object detection model to detect white lines on the solar panels and combines the model with a proportional–integral–derivative (PID) controller to achieve autonomous navigation functionality. The main system platform was built on Raspberry Pi, and the Intel Neural Compute Stick 2 (NCS2) was used for hardware acceleration, which boosted the model’s inference speed from 2 to 8 frames per second (FPS), significantly enhancing the system’s real-time performance. By tuning the PID controller parameters, the system achieved optimal performance with Kp = 11, Ki = 0.01, and Kd = 30, maintaining the average value of the error e(t) at −0.0412 and the standard deviation at 0.1826. The system autonomously followed the white lines on the solar panels, automatically turned when reaching the boundaries, and performed cleaning autonomously. The developed autonomous navigation system effectively improved the efficiency and convenience of solar panel cleaning. Full article
(This article belongs to the Proceedings of 2024 IEEE 6th Eurasia Conference on IoT, Communication and Engineering)
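The control loop is a textbook discrete PID driven by the lateral error between the detected white line and the image center. The sketch below uses the gains reported above; the error source and the meaning of the output are placeholders.

```python
# Minimal sketch of the line-following loop: a discrete PID on the lateral
# error e(t), using the gains reported in the paper (Kp = 11, Ki = 0.01,
# Kd = 30). The error values and the steering output are placeholders.
class PID:
    def __init__(self, kp: float, ki: float, kd: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, err: float, dt: float) -> float:
        self.integral += err * dt
        deriv = (err - self.prev_err) / dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

pid = PID(kp=11.0, ki=0.01, kd=30.0)
# err would come from the YOLOv4-Tiny white-line detection each frame (~8 FPS).
for err in (0.20, 0.12, 0.05, -0.02):
    print(f"steering command: {pid.step(err, dt=1 / 8):+.3f}")
```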

20 pages, 11233 KiB  
Article
Capturing Free Surface Dynamics of Flows over a Stepped Spillway Using a Depth Camera
by Megh Raj K C, Brian M. Crookston and Daniel B. Bung
Sensors 2025, 25(8), 2525; https://doi.org/10.3390/s25082525 - 17 Apr 2025
Viewed by 486
Abstract
Spatio-temporal measurements of turbulent free surface flows remain challenging with in situ point methods. This study explores the application of an inexpensive depth-sensing RGB-D camera, the Intel® RealSense™ D455, to capture detailed water surface measurements of a highly turbulent, self-aerated flow over a stepped spillway. Ambient lighting conditions and various sensor settings, including configurations and parameters affecting data capture and quality, were assessed. A free surface profile was extracted from the 3D measurements and compared against phase detection conductivity probe (PDCP) and ultrasonic sensor (USS) measurements. Measurements in the non-aerated region were influenced by water transparency and a lack of detectable surface features, with flow depths consistently smaller than USS measurements (up to 32.5% less). Measurements in the clear water region also produced a “no data” region, with holes in the depth map due to specular reflections. In the aerated flow region, the camera effectively detected the dynamic water surface, with mean surface profiles close to characteristic depths measured with the PDCP and within one standard deviation of the mean USS flow depths. The flow depths were within 10% of the USS depths and corresponded to depths with 80–90% air concentration levels obtained with the PDCP. Additionally, the depth camera successfully captured temporal fluctuations, allowing for the calculation of time-averaged entrapped air concentration profiles and dimensionless interface frequency distributions. This facilitated a direct comparison with the PDCP and USS sensors, demonstrating that this camera is a practical and cost-effective option for detecting the free surfaces of high-velocity, aerated, and dynamic flows in a stepped chute. Full article
(This article belongs to the Special Issue 3D Reconstruction with RGB-D Cameras and Multi-sensors)
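Post-processing of this kind reduces to statistics over a stack of depth frames. The sketch below, on synthetic data, shows how a mean surface profile and its fluctuation could be computed from camera-to-surface distances, assuming a known camera-to-bed distance; none of the numbers are from the study.

```python
# Sketch of turning a stack of depth frames into a mean free-surface profile
# and its fluctuation, as one might post-process D455 recordings. Array
# shapes and the fixed camera-to-bed distance are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
# frames: (time, rows, cols) camera-to-surface distances in meters (synthetic)
frames = 0.80 + 0.02 * rng.standard_normal((300, 48, 64))

camera_to_bed = 1.00                 # known fixed distance to the chute invert
depths = camera_to_bed - frames      # water depth = bed distance - surface distance

mean_profile = depths.mean(axis=0)   # time-averaged surface, per pixel
fluctuation = depths.std(axis=0)     # RMS surface fluctuation, per pixel
centerline = mean_profile[:, 32]     # streamwise profile along one column
print(centerline[:5].round(3), fluctuation.mean().round(4))
```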

26 pages, 9389 KiB  
Article
Real-Time Data-Driven Method for Bolt Defect Detection and Size Measurement in Industrial Production
by Jinlong Yang and Chul-Hee Lee
Actuators 2025, 14(4), 185; https://doi.org/10.3390/act14040185 - 9 Apr 2025
Cited by 1 | Viewed by 867
Abstract
To enhance the automatic quality monitoring of bolt production, YOLOv10, Intel RealSense D435, and OpenCV were integrated to leverage GPU parallel computing capabilities for defect recognition and size measurement. To improve the model’s effectiveness across various industrial production environments, data augmentation techniques were employed, resulting in a trained model with notable precision, accuracy, and robustness. A high-precision camera calibration method was used, and image processing was accelerated through GPU parallel computing to ensure efficient and real-time target size measurement. In the real-time monitoring system, the average defect prediction time was 0.009241 s, achieving an accuracy of 99% and demonstrating high stability under varying lighting conditions. The average size measurement time was 0.021616 s, and increasing the light intensity could reduce the maximum error rate to 1%. These results demonstrated that the system excelled in real-time performance, accuracy, robustness, and efficiency, effectively addressing the demands of industrial production lines for rapid and precise defect detection and size measurement. In the dynamic and variable context of industrial applications, the system can be optimized and adjusted according to specific production environments and requirements, further enhancing the accuracy of defect detection and size measurement tasks. Full article
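The measurement step rests on the pinhole relation between pixel extent, per-pixel depth, and metric size. A minimal sketch, with made-up intrinsics and box coordinates, follows.

```python
# Sketch of the size-measurement step: converting a bolt's pixel extent to
# millimeters with the pinhole model, using the per-pixel depth from the D435.
# Intrinsics and the detected box are made-up example numbers.
import numpy as np

fx, fy, cx, cy = 615.0, 615.0, 320.0, 240.0   # example D435 color intrinsics (px)

def pixel_to_xy_mm(u: float, v: float, z_mm: float) -> np.ndarray:
    """Back-project an image point at depth z into camera-frame X, Y (mm)."""
    return np.array([(u - cx) * z_mm / fx, (v - cy) * z_mm / fy])

# Endpoints of a bolt's long axis from a YOLO box, both observed at 400 mm:
p1 = pixel_to_xy_mm(290.0, 215.0, 400.0)
p2 = pixel_to_xy_mm(352.0, 215.0, 400.0)
print(f"measured length: {np.linalg.norm(p2 - p1):.2f} mm")  # ~40.3 mm
```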

19 pages, 50560 KiB  
Article
Garment Recognition and Reconstruction Using Object Simultaneous Localization and Mapping
by Yilin Zhang and Koichi Hashimoto
Sensors 2024, 24(23), 7622; https://doi.org/10.3390/s24237622 - 28 Nov 2024
Cited by 1 | Viewed by 897
Abstract
The integration of robotics in the garment industry remains relatively limited, primarily due to the challenges posed by the highly deformable nature of garments. The objective of this study is thus to explore a vision-based garment recognition and environment reconstruction model to facilitate the application of robots in garment processing. Object SLAM (Simultaneous Localization and Mapping) was employed as the core methodology for real-time mapping and tracking. To enable garment detection and reconstruction, two datasets were created: a 2D garment image dataset for instance segmentation model training and a synthetic 3D mesh garment dataset to enhance the DeepSDF (Signed Distance Function) model for generative garment reconstruction. In addition to garment detection, the SLAM system was extended to identify and reconstruct environmental planes, using the CAPE (Cylinder and Plane Extraction) model. The implementation was tested using an Intel RealSense® camera, demonstrating the feasibility of simultaneous garment and plane detection and reconstruction. This study shows improved performance in garment recognition with the 2D instance segmentation models and an enhanced understanding of garment shapes and structures with the DeepSDF model. The integration of CAPE plane detection with SLAM allows for a more robust environment reconstruction that is capable of handling multiple objects. The implementation and evaluation of the system highlight its potential for enhancing automation and efficiency in the garment processing industry. Full article
(This article belongs to the Special Issue Advances in Sensing, Control and Path Planning for Robotic Systems)
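The reconstruction idea behind DeepSDF is that a shape is the zero level set of a learned signed distance function. The sketch below substitutes an analytic sphere SDF for the learned garment decoder and extracts the surface with marching cubes, the standard final step; it illustrates the representation, not the paper’s model.

```python
# Sketch of the idea behind DeepSDF reconstruction: a shape is the zero level
# set of a signed distance function, and a mesh is recovered by marching
# cubes. A sphere SDF stands in for the learned garment decoder here.
import numpy as np
from skimage import measure

n = 64
ax = np.linspace(-1.0, 1.0, n)
x, y, z = np.meshgrid(ax, ax, ax, indexing="ij")
sdf = np.sqrt(x**2 + y**2 + z**2) - 0.6   # signed distance to a sphere, r = 0.6

# Extract the zero level set; DeepSDF would instead evaluate its MLP decoder
# (conditioned on a garment latent code) on this same grid.
verts, faces, normals, _ = measure.marching_cubes(sdf, level=0.0)
print(verts.shape, faces.shape)           # mesh vertices and triangles
```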

32 pages, 11087 KiB  
Article
Path Planning and Motion Control of Robot Dog Through Rough Terrain Based on Vision Navigation
by Tianxiang Chen, Yipeng Huangfu, Sutthiphong Srigrarom and Boo Cheong Khoo
Sensors 2024, 24(22), 7306; https://doi.org/10.3390/s24227306 - 15 Nov 2024
Viewed by 4098
Abstract
This article delineates the enhancement of an autonomous navigation and obstacle avoidance system for a quadruped robot dog. Part one of this paper presents the integration of a sophisticated multi-level dynamic control framework, utilizing Model Predictive Control (MPC) and Whole-Body Control (WBC) from MIT Cheetah. The system employs an Intel RealSense D435i depth camera for depth vision-based navigation, which enables high-fidelity 3D environmental mapping and real-time path planning. A significant innovation is the customization of the EGO-Planner to optimize trajectory planning in dynamically changing terrains, coupled with the implementation of a multi-body dynamics model that significantly improves the robot’s stability and maneuverability across various surfaces. The experimental results show that the RGB-D system exhibits superior velocity stability and trajectory accuracy to the SLAM system, with a 20% reduction in the cumulative velocity error and a 10% improvement in path tracking precision. The experimental results also show that the RGB-D system achieves smoother navigation, requiring 15% fewer iterations for path planning, and a 30% faster success rate recovery in challenging environments. The successful application of these technologies in simulated urban disaster scenarios suggests promising future applications in emergency response and complex urban environments. Part two of this paper presents the development of a robust path planning algorithm for a robot dog on rough terrain based on onboard binocular vision navigation. We use a commercial-off-the-shelf (COTS) robot dog. An optical CCD binocular vision dynamic tracking system is used to provide environment information. Meanwhile, the pose and posture of the robot dog are obtained from the robot’s own sensors, and a kinematics model is established. Then, a binocular vision tracking method is developed to determine the optimal path, provide a proposal (commands to actuators) for the position and posture of the bionic robot, and achieve stable motion on rough terrain. The terrain is assumed to be gently uneven to begin with and subsequently proceeds to a rougher surface. This work consists of four steps: (1) pose and position data are acquired from the robot dog’s own inertial sensors, (2) terrain and environment information is input from onboard cameras, (3) the information is fused (integrated), and (4) path planning and motion control proposals are made. Ultimately, this work provides a robust framework for future developments in the vision-based navigation and control of quadruped robots, offering potential solutions for navigating complex and dynamic terrains. Full article
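The binocular ranging in part two rests on the classic disparity relation Z = f·B/d. A minimal sketch with illustrative numbers:

```python
# Sketch of the binocular ranging relation behind the CCD tracking system in
# part two: depth from disparity, Z = f * B / d. Focal length, baseline, and
# disparity values are illustrative.
def stereo_depth(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth (m) of a matched feature from its left/right pixel disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# 700 px focal length, 12 cm baseline, 21 px disparity -> 4.0 m to the feature
print(stereo_depth(focal_px=700.0, baseline_m=0.12, disparity_px=21.0))
```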

26 pages, 33294 KiB  
Article
RGB-D Camera and Fractal-Geometry-Based Maximum Diameter Estimation Method of Apples for Robot Intelligent Selective Graded Harvesting
by Bin Yan and Xiameng Li
Fractal Fract. 2024, 8(11), 649; https://doi.org/10.3390/fractalfract8110649 - 7 Nov 2024
Cited by 4 | Viewed by 1465
Abstract
Realizing the integration of intelligent fruit picking and grading for apple harvesting robots is an inevitable requirement for the future development of smart agriculture and precision agriculture. Therefore, this study proposes an apple maximum diameter estimation model that fuses depth information from an RGB-D camera. Firstly, the maximum diameter parameters of Red Fuji apples were collected, and the results were statistically analyzed. Then, based on the Intel RealSense D435 RGB-D depth camera and LabelImg software, the depth information of apples and the two-dimensional size information of fruit images were obtained. Furthermore, the relationship between fruit depth information, the two-dimensional size information of fruit images, and the maximum diameter of apples was explored. Based on Origin software, multiple regression analysis and nonlinear surface fitting were used to analyze the correlation between fruit depth, the diagonal length of the fruit bounding rectangle, and the maximum diameter, and a model for estimating the maximum diameter of apples was constructed. Finally, the constructed maximum diameter estimation model was experimentally validated and evaluated on imitation apples in the laboratory and on fruits on Red Fuji trees in modern apple orchards. The experimental results showed that the average maximum relative error of the constructed model on the laboratory imitation apple validation set was ±4.1%, with a coefficient of determination (R²) of 0.98613 and a root mean square error (RMSE) of 3.21 mm. The average maximum diameter estimation relative error on the modern orchard Red Fuji apple validation set was ±3.77%, with a coefficient of determination (R²) of 0.84 and a root mean square error (RMSE) of 3.95 mm. The proposed model can provide a theoretical basis and technical support for the selective apple-picking operation of intelligent robots based on apple size grading. Full article
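The choice of fruit depth and bounding-box diagonal as regressors is motivated by the pinhole model, under which those two quantities determine metric size. The sketch below shows that raw relation with invented numbers; the paper fits a regression surface rather than using it directly.

```python
# Why depth plus the bounding-box diagonal determine fruit size: under the
# pinhole model, a pixel length L at depth Z spans L * Z / f in the world.
# The paper fits a regression surface over (Z, diagonal) instead of using
# this raw relation; numbers below are illustrative.
import math

def diameter_from_box(w_px: float, h_px: float, z_mm: float, f_px: float) -> float:
    diag_px = math.hypot(w_px, h_px)             # diagonal of the bounding box
    return diag_px * z_mm / f_px / math.sqrt(2)  # square-ish box: diagonal ~ sqrt(2)*D

# 90 x 88 px box seen at 600 mm with a 920 px focal length -> ~58 mm apple
print(round(diameter_from_box(90, 88, 600, 920), 1))
```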

25 pages, 8051 KiB  
Article
Dexterous Manipulation Based on Object Recognition and Accurate Pose Estimation Using RGB-D Data
by Udaka A. Manawadu and Naruse Keitaro
Sensors 2024, 24(21), 6823; https://doi.org/10.3390/s24216823 - 24 Oct 2024
Cited by 1 | Viewed by 2255
Abstract
This study presents an integrated system for object recognition, six-degrees-of-freedom pose estimation, and dexterous manipulation using a JACO robotic arm with an Intel RealSense D435 camera. This system is designed to automate the manipulation of industrial valves by capturing point clouds (PCs) from multiple perspectives to improve the accuracy of pose estimation. The object recognition module includes scene segmentation, geometric primitives recognition, model recognition, and a color-based clustering and integration approach enhanced by a dynamic cluster merging algorithm. Pose estimation is achieved using the random sample consensus algorithm, which predicts position and orientation. The system was tested within a 60° field of view, which extended in all directions in front of the object. The experimental results show that the system performs reliably within acceptable error thresholds for both position and orientation when the objects are within a ±15° range of the camera’s direct view. However, errors increased with more extreme object orientations and distances, particularly when estimating the orientation of ball valves. A zone-based dexterous manipulation strategy was developed to overcome these challenges, where the system adjusts the camera position for optimal conditions. This approach mitigates larger errors in difficult scenarios, enhancing overall system reliability. The key contributions of this research include a novel method for improving object recognition and pose estimation, a technique for increasing the accuracy of pose estimation, and the development of a robot motion model for dexterous manipulation in industrial settings. Full article
(This article belongs to the Section Intelligent Sensors)
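RANSAC-style model fitting on point clouds is readily illustrated with Open3D. The sketch below segments a synthetic plane as a stand-in for the scene-segmentation stage; the paper’s full 6-DoF valve pose estimation is more involved than this.

```python
# Sketch of RANSAC model fitting on a point cloud with Open3D: segment a
# noisy plane from outliers. A concrete, runnable instance of the RANSAC
# stage, not the paper's complete pose-estimation pipeline.
import numpy as np
import open3d as o3d

rng = np.random.default_rng(1)
# Synthetic scene: a noisy table plane plus scattered outlier points.
plane = np.c_[rng.uniform(-1, 1, (500, 2)), 0.002 * rng.standard_normal(500)]
outliers = rng.uniform(-1, 1, (100, 3))
pcd = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(np.vstack([plane, outliers])))

model, inlier_idx = pcd.segment_plane(distance_threshold=0.01,
                                      ransac_n=3, num_iterations=1000)
a, b, c, d = model
print(f"plane: {a:.2f}x + {b:.2f}y + {c:.2f}z + {d:.2f} = 0, "
      f"{len(inlier_idx)} inliers")
```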

25 pages, 1511 KiB  
Article
Performance Study of an MRI Motion-Compensated Reconstruction Program on Intel CPUs, AMD EPYC CPUs, and NVIDIA GPUs
by Mohamed Aziz Zeroual, Karyna Isaieva, Pierre-André Vuissoz and Freddy Odille
Appl. Sci. 2024, 14(21), 9663; https://doi.org/10.3390/app14219663 - 23 Oct 2024
Cited by 2 | Viewed by 1440
Abstract
Motion-compensated image reconstruction enables new clinical applications of Magnetic Resonance Imaging (MRI), but it relies on computationally intensive algorithms. This study focuses on the Generalized Reconstruction by Inversion of Coupled Systems (GRICS) program, applied to the reconstruction of 3D images in cases of non-rigid or rigid motion. It uses hybrid parallelization with MPI (Message Passing Interface) and OpenMP (Open Multi-Processing). For clinical integration, GRICS needs to efficiently harness the computational resources of compute nodes. We aim to improve GRICS’s performance without any code modification. This work presents a performance study of GRICS on two CPU architectures: Intel Xeon Gold and AMD EPYC. The roofline model is used to study the software–hardware interaction and quantify the code’s performance. For CPU–GPU comparison purposes, we propose a preliminary MATLAB–GPU implementation of GRICS’s reconstruction kernel. We establish the roofline model of the kernel on two NVIDIA GPU architectures: Quadro RTX 5000 and A100. After the performance study, we propose optimization patterns for the code’s execution on CPUs: first considering only the OpenMP implementation, using thread binding and affinity and appropriate architecture-compilation flags, and then looking for the optimal combination of MPI processes and OpenMP threads in the case of the hybrid MPI–OpenMP implementation. The results show that GRICS performed well on the AMD EPYC CPUs, with an architectural efficiency of 52%. The kernel’s execution was fast on the NVIDIA A100 GPU, but the roofline model reported low architectural efficiency and utilization. Full article
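The roofline bound that anchors the study is min(peak, arithmetic intensity × bandwidth). A minimal sketch with placeholder hardware numbers:

```python
# Sketch of the roofline bound used in the performance study: attainable
# performance is min(peak FLOP rate, arithmetic intensity x memory bandwidth).
# The hardware numbers are round placeholders, not measurements from the paper.
def roofline_gflops(ai_flops_per_byte: float, peak_gflops: float,
                    bandwidth_gbs: float) -> float:
    return min(peak_gflops, ai_flops_per_byte * bandwidth_gbs)

peak, bw = 3000.0, 200.0                 # e.g. a many-core CPU node
for ai in (0.5, 2.0, 15.0, 50.0):
    bound = roofline_gflops(ai, peak, bw)
    regime = "memory-bound" if bound < peak else "compute-bound"
    print(f"AI = {ai:5.1f} flop/byte -> {bound:7.1f} GFLOP/s ({regime})")
```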

15 pages, 3294 KiB  
Article
Implementation of a Small-Sized Mobile Robot with Road Detection, Sign Recognition, and Obstacle Avoidance
by Ching-Chang Wong, Kun-Duo Weng, Bo-Yun Yu and Yung-Shan Chou
Appl. Sci. 2024, 14(15), 6836; https://doi.org/10.3390/app14156836 - 5 Aug 2024
Viewed by 2160
Abstract
In this study, under a limited volume of 18 cm × 18 cm × 21 cm, a small-sized mobile robot is designed and implemented. It consists of a CPU, a GPU, a 2D LiDAR (Light Detection And Ranging), and two fisheye cameras, giving the robot good computing and graphics processing capabilities. In addition, three functions of road detection, sign recognition, and obstacle avoidance are implemented on this small-sized robot. For road detection, we divide the captured image into four areas and use an Intel NUC to perform road detection calculations. The proposed method can significantly reduce the system load and also has a high processing speed of 25 frames per second (fps). For sign recognition, we use the YOLOv4-tiny model and a data augmentation strategy to significantly improve the computing performance of this model. From the experimental results, it can be seen that the mean Average Precision (mAP) of the used model increased by 52.14%. For obstacle avoidance, a 2D LiDAR-based method with a distance-based filtering mechanism is proposed. The distance-based filtering mechanism filters important data points and assigns appropriate weights, which can effectively reduce the computational complexity and improve the robot’s response speed in avoiding obstacles. Experimental results illustrate that the proposed methods for these three functions can be effectively completed on the implemented small-sized robot. Full article
(This article belongs to the Special Issue Artificial Intelligence and Its Application in Robotics)
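A hypothetical reading of the distance-based filtering mechanism: window the scan by range and weight near returns more heavily. The window and the inverse-square weighting below are assumptions, not the paper’s exact scheme.

```python
# Hypothetical sketch of distance-based filtering for a 2D LiDAR: keep only
# returns inside a range window and weight near returns more heavily so close
# obstacles dominate the avoidance decision. Window and weighting law are
# assumptions.
import numpy as np

def filter_scan(ranges_m: np.ndarray, r_min: float = 0.15, r_max: float = 3.0):
    """Return (kept indices, weights) for one 2D LiDAR scan."""
    keep = np.flatnonzero((ranges_m > r_min) & (ranges_m < r_max))
    weights = 1.0 / ranges_m[keep] ** 2    # inverse-square emphasis on near points
    return keep, weights / weights.sum()   # normalized importance weights

scan = np.array([0.05, 0.4, 1.2, 2.5, 3.8, 0.9])   # meters, 6 beams for brevity
idx, w = filter_scan(scan)
print(idx, w.round(3))   # beams 1-3 and 5 kept; beam 1 (0.4 m) weighted most
```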

16 pages, 4465 KiB  
Article
An Advanced Approach to Object Detection and Tracking in Robotics and Autonomous Vehicles Using YOLOv8 and LiDAR Data Fusion
by Yanyan Dai, Deokgyu Kim and Kidong Lee
Electronics 2024, 13(12), 2250; https://doi.org/10.3390/electronics13122250 - 7 Jun 2024
Cited by 8 | Viewed by 4912
Abstract
Accurately and reliably perceiving the environment is a major challenge in autonomous driving and robotics research. Traditional vision-based methods often suffer from varying lighting conditions, occlusions, and complex environments. This paper addresses these challenges by combining a deep learning-based object detection algorithm, YOLOv8, with LiDAR data fusion technology. The combination merges the advantages of the two technologies: YOLOv8 excels in real-time object detection and classification through RGB images, while LiDAR provides accurate distance measurement and 3D spatial information, regardless of lighting conditions. The integration applies the high accuracy and robustness of YOLOv8 in identifying and classifying objects together with the depth data provided by LiDAR. This combination enhances the overall environmental perception, which is critical for the reliability and safety of autonomous systems. However, this fusion brings some research challenges, including data calibration between different sensors, filtering ground points from LiDAR point clouds, and managing the computational complexity of processing large datasets. This paper presents a comprehensive approach to address these challenges. Firstly, a simple algorithm is introduced to filter out ground points from LiDAR point clouds by setting different threshold heights based on the terrain, a step that is essential for accurate object detection. Secondly, YOLOv8, trained on a customized dataset, is utilized for object detection in images, generating 2D bounding boxes around detected objects. Thirdly, a calibration algorithm is developed to transform 3D LiDAR coordinates to image pixel coordinates, which is vital for correlating LiDAR data with image-based object detection results. Fourthly, a method for clustering different objects based on the fused data is proposed, followed by an object tracking algorithm to compute the 3D poses of objects and their relative distances from a robot. The Agilex Scout Mini robot, equipped with a Velodyne 16-channel LiDAR and an Intel D435 camera, is employed for data collection and experimentation. Finally, the experimental results validate the effectiveness of the proposed algorithms and methods. Full article
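The third step, LiDAR-to-pixel calibration, is a rigid transform followed by a perspective projection. The sketch below uses placeholder extrinsics and intrinsics in place of the calibrated values.

```python
# Sketch of the calibration step: projecting 3D LiDAR points into image pixel
# coordinates with an extrinsic rotation/translation and a camera intrinsic
# matrix K. All matrices are illustrative placeholders for calibrated values.
import numpy as np

K = np.array([[615.0, 0.0, 320.0],        # fx, 0, cx
              [0.0, 615.0, 240.0],        # 0, fy, cy
              [0.0, 0.0, 1.0]])
R = np.eye(3)                             # LiDAR-to-camera rotation (placeholder)
t = np.array([0.0, -0.08, 0.05])          # LiDAR-to-camera translation (m)

def lidar_to_pixel(points_xyz: np.ndarray) -> np.ndarray:
    cam = points_xyz @ R.T + t            # into the camera frame
    cam = cam[cam[:, 2] > 0]              # keep points in front of the camera
    uvw = cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3]       # perspective divide -> (u, v)

pts = np.array([[0.5, 0.0, 4.0], [-0.3, 0.1, 2.0]])  # LiDAR points (m)
print(lidar_to_pixel(pts).round(1))
```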

18 pages, 7366 KiB  
Article
Realistic Texture Mapping of 3D Medical Models Using RGBD Camera for Mixed Reality Applications
by Cosimo Aliani, Alberto Morelli, Eva Rossi, Sara Lombardi, Vincenzo Yuto Civale, Vittoria Sardini, Flavio Verdino and Leonardo Bocchi
Appl. Sci. 2024, 14(10), 4133; https://doi.org/10.3390/app14104133 - 13 May 2024
Cited by 5 | Viewed by 1628
Abstract
Augmented and mixed reality in the medical field is becoming increasingly important. The creation and visualization of digital models that closely resemble reality could greatly improve the user experience during augmented or mixed reality activities such as surgical planning and the education, training, and testing of medical students. This study introduces a technique for enhancing a 3D digital model reconstructed from cone-beam computed tomography images with its real coloured texture using an Intel D435 RGBD camera. This method is based on iteratively projecting the two models onto a 2D plane, identifying their contours, and then minimizing the distance between them. Finally, the coloured digital models were displayed in mixed reality through a Microsoft HoloLens 2, and an application to interact with them using hand gestures was developed. The registration error between the two 3D models, evaluated using 30,000 random points, was 1.1 ± 1.3 mm on the x-axis, 0.7 ± 0.8 mm on the y-axis, and 0.9 ± 1.2 mm on the z-axis. This result was achieved in three iterations, with the average registration error on the three axes decreasing from 1.4 mm to 0.9 mm. The heatmap created to visualize the spatial distribution of the error shows that it is uniformly distributed over the surface of the point cloud obtained with the RGBD camera, except for some areas of the nose and ears where the registration error tends to increase. The obtained results suggest that the proposed methodology is effective. In addition, since the RGBD camera used is inexpensive, future approaches based on the simultaneous use of multiple cameras could further improve the results. Finally, the augmented reality visualization of the obtained result is innovative and could provide support in all those cases where the visualization of three-dimensional medical models is necessary. Full article
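The reported per-axis errors are mean ± standard deviation statistics over sampled point pairs. The sketch below reproduces that style of evaluation on synthetic arrays standing in for the two registered models.

```python
# Sketch of the evaluation step: per-axis registration error (mean +/- std)
# over randomly sampled point pairs between the reconstructed model and the
# RGBD point cloud. Synthetic arrays stand in for the two registered models.
import numpy as np

rng = np.random.default_rng(7)
reference = rng.uniform(0, 100, (30_000, 3))                 # mm
registered = reference + rng.normal(0, [1.2, 0.8, 1.0], (30_000, 3))

err = np.abs(registered - reference)                          # per-axis |error|
for axis, name in enumerate("xyz"):
    print(f"{name}-axis: {err[:, axis].mean():.1f} +/- {err[:, axis].std():.1f} mm")
```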
