Search Results (64)

Search Parameters:
Keywords = vision-based grasping system

20 pages, 2689 KB  
Article
Design of a Pill-Sorting and Pill-Grasping Robot System Based on Machine Vision
by Xuejun Tian, Jiadu Ke, Weiguo Wu and Jian Teng
Future Internet 2025, 17(11), 501; https://doi.org/10.3390/fi17110501 - 31 Oct 2025
Viewed by 192
Abstract
We developed a machine vision-based robotic system to address automation challenges in pharmaceutical pill sorting and packaging. The hardware platform integrates a high-resolution industrial camera with an HSR-CR605 robotic arm. Image processing leverages the VisionMaster 4.3.0 platform for color classification and positioning. Coordinate mapping between camera and robot is established through a three-point calibration method, with real-time communication realized via the Modbus/TCP protocol. Experimental validation demonstrates that the system achieves 95% recognition accuracy under conditions of pill overlap ≤ 30% and dynamic illumination of 50–1000 lux, ±0.5 mm picking precision, and a sorting efficiency of 108 pills per minute. These results confirm the feasibility of integrating domestic hardware and algorithms, providing an efficient automated solution for the pharmaceutical industry. This work makes three key contributions: (1) demonstrating a cost-effective domestic hardware-software integration achieving 42% cost reduction while maintaining comparable performance to imported alternatives, (2) establishing a systematic validation methodology under industrially relevant conditions that provides quantitative robustness metrics for pharmaceutical automation, and (3) offering a practical implementation framework validated through multi-scenario experiments that bridges the gap between laboratory research and production-line deployment. Full article
(This article belongs to the Special Issue Advances and Perspectives in Human-Computer Interaction—2nd Edition)
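As an illustration of the three-point calibration step this abstract mentions, the sketch below solves a planar affine map from image pixels to robot workspace coordinates using three non-collinear reference points; all coordinate values are hypothetical, not taken from the paper.

```python
import numpy as np

# Three non-collinear reference points, measured in both frames (hypothetical values).
pix = np.array([[120.0, 340.0], [860.0, 355.0], [500.0, 90.0]])    # image coordinates (u, v)
rob = np.array([[0.212, -0.145], [0.214, 0.168], [0.330, 0.012]])  # robot-frame XY in metres

# Solve [u, v, 1] @ A = [x, y] for the 3x2 affine matrix A (exact for three points).
P = np.hstack([pix, np.ones((3, 1))])
A = np.linalg.solve(P, rob)

def pixel_to_robot(u: float, v: float) -> np.ndarray:
    """Map a detected pill centre from image coordinates to the robot frame."""
    return np.array([u, v, 1.0]) @ A
```

In the paper the mapped coordinates are then sent to the HSR-CR605 controller over Modbus/TCP; that communication layer is omitted here.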

16 pages, 3657 KB  
Article
Development and Performance Evaluation of a Vision-Based Automated Oyster Size Classification System
by Jonghwan Baek, Seolha Kim, Chang-Hee Lee, Myeongsu Jeong, Jin-Ho Suh and Jaeyoul Lee
Inventions 2025, 10(5), 76; https://doi.org/10.3390/inventions10050076 - 27 Aug 2025
Viewed by 685
Abstract
This study presents the development and validation of an automated oyster classification system designed to classify oysters by size and place them into trays for freezing. Addressing limitations in conventional manual processing, the proposed system integrates a vision-based recognition algorithm and a delta robot (parallel robot) equipped with a soft gripper. The vision system identifies oyster size and optimal grasp points using image moment calculations, enhancing the accuracy of classification for irregularly shaped oysters. Experimental tests demonstrated classification and grasping success rates of 99%. A process simulation based on real industrial conditions revealed that seven units of the automated system are required to match the daily output of 7 tons achieved by 60 workers. When compared with a theoretical 100% success rate, the system showed a marginal production loss of 715 oysters and 15 trays. These results confirm the potential of the proposed system to improve consistency, reduce labor dependency, and increase productivity in oyster processing. Future work will focus on gripper design optimization and parameter tuning to further improve system stability and efficiency. Full article
(This article belongs to the Section Inventions and Innovation in Advanced Manufacturing)
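The image-moment calculation this abstract refers to can be sketched as follows: from a binary oyster mask, the zeroth- and first-order moments give area (a size proxy) and centroid (a candidate grasp point), and the second-order central moments give the principal-axis orientation. This is a generic OpenCV sketch, not the authors' code.

```python
import cv2
import numpy as np

def oyster_size_and_grasp(mask: np.ndarray):
    """Area, centroid, and principal-axis angle of a binary oyster mask."""
    m = cv2.moments(mask, binaryImage=True)
    area = m["m00"]                                    # pixel area -> size class
    cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]  # centroid -> candidate grasp point
    angle = 0.5 * np.arctan2(2.0 * m["mu11"], m["mu20"] - m["mu02"])  # principal axis
    return area, (cx, cy), angle
```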

32 pages, 25342 KB  
Article
An End-to-End Computationally Lightweight Vision-Based Grasping System for Grocery Items
by Thanavin Mansakul, Gilbert Tang, Phil Webb, Jamie Rice, Daniel Oakley and James Fowler
Sensors 2025, 25(17), 5309; https://doi.org/10.3390/s25175309 - 26 Aug 2025
Viewed by 1092
Abstract
Vision-based grasping for mobile manipulators poses significant challenges in machine perception, computational efficiency, and real-world deployment. This study presents a computationally lightweight, end-to-end grasp detection framework that integrates object detection, object pose estimation, and grasp point prediction for a mobile manipulator equipped with a parallel gripper. A transformation model is developed to map coordinates from the image frame to the robot frame, enabling accurate manipulation. To evaluate system performance, a benchmark and a dataset tailored to pick-and-pack grocery tasks are introduced. Experimental validation demonstrates an average execution time of under 5 s on an edge device, achieving a 100% success rate on Level 1 and 96% on Level 2 of the benchmark. Additionally, the system achieves an average compute-to-speed ratio of 0.0130, highlighting its energy efficiency. The proposed framework offers a practical, robust, and efficient solution for lightweight robotic applications in real-world environments. Full article
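A common way to realise the image-frame-to-robot-frame transformation described above is to back-project a pixel with its depth through the camera intrinsics and then apply a fixed camera-to-robot extrinsic transform. The intrinsic and extrinsic values below are placeholders, not those of the paper's setup.

```python
import numpy as np

# Hypothetical pinhole intrinsics and camera-to-robot extrinsics.
fx, fy, cx, cy = 615.0, 615.0, 320.0, 240.0
R_cam2robot = np.eye(3)                    # rotation of camera frame in robot base frame
t_cam2robot = np.array([0.40, 0.0, 0.55])  # translation in metres

def deproject_to_robot(u: float, v: float, depth: float) -> np.ndarray:
    """Back-project a pixel with depth (m) and express the 3D point in the robot frame."""
    p_cam = np.array([(u - cx) * depth / fx, (v - cy) * depth / fy, depth])
    return R_cam2robot @ p_cam + t_cam2robot
```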

21 pages, 5469 KB  
Article
Radio Frequency Passive Tagging System Enabling Object Recognition and Alignment by Robotic Hands
by Armin Gharibi, Mahmoud Tavakoli, André F. Silva, Filippo Costa and Simone Genovesi
Electronics 2025, 14(17), 3381; https://doi.org/10.3390/electronics14173381 - 25 Aug 2025
Viewed by 1306
Abstract
Robotic hands require reliable and precise sensing systems to achieve accurate object recognition and manipulation, particularly in environments where vision- or capacitive-based approaches face limitations such as poor lighting, dust, reflective surfaces, or non-metallic materials. This paper presents a novel radiofrequency (RF) pre-touch sensing system that enables robust localization and orientation estimation of objects prior to grasping. The system integrates a compact coplanar waveguide (CPW) probe with fully passive chipless RF resonator tags fabricated using a patented flexible and stretchable conductive ink through additive manufacturing. This approach provides a low-cost, durable, and highly adaptable solution that operates effectively across diverse object geometries and environmental conditions. The experimental results demonstrate that the proposed RF sensor maintains stable performance under varying distances, orientations, and inter-tag spacings, showing robustness where traditional methods may fail. By combining compact design, cost-effectiveness, and reliable near-field sensing independent of object material or lighting conditions, this work establishes RF sensing as a practical and scalable alternative to optical and capacitive systems. The proposed method advances robotic perception by offering enhanced precision, resilience, and integration potential for industrial automation, warehouse handling, and collaborative robotics. Full article

18 pages, 1910 KB  
Article
Hierarchical Learning for Closed-Loop Robotic Manipulation in Cluttered Scenes via Depth Vision, Reinforcement Learning, and Behaviour Cloning
by Hoi Fai Yu and Abdulrahman Altahhan
Electronics 2025, 14(15), 3074; https://doi.org/10.3390/electronics14153074 - 31 Jul 2025
Viewed by 939
Abstract
Despite rapid advances in robot learning, the coordination of closed-loop manipulation in cluttered environments remains a challenging and relatively underexplored problem. We present a novel two-level hierarchical architecture for a depth vision-equipped robotic arm that integrates pushing, grasping, and high-level decision making. Central to our approach is a prioritised action–selection mechanism that facilitates efficient early-stage learning via behaviour cloning (BC), while enabling scalable exploration through reinforcement learning (RL). A high-level decision neural network (DNN) selects between grasping and pushing actions, and two low-level action neural networks (ANNs) execute the selected primitive. The DNN is trained with RL, while the ANNs follow a hybrid learning scheme combining BC and RL. Notably, we introduce an automated demonstration generator based on oriented bounding boxes, eliminating the need for manual data collection and enabling precise, reproducible BC training signals. We evaluate our method on a challenging manipulation task involving five closely packed cubic objects. Our system achieves a completion rate (CR) of 100%, an average grasping success (AGS) of 93.1% per completion, and only 7.8 average decisions taken for completion (DTC). Comparative analysis against three baselines—a grasping-only policy, a fixed grasp-then-push sequence, and a cloned demonstration policy—highlights the necessity of dynamic decision making and the efficiency of our hierarchical design. In particular, the baselines yield lower AGS (86.6%) and higher DTC (10.6 and 11.4) scores, underscoring the advantages of content-aware, closed-loop control. These results demonstrate that our architecture supports robust, adaptive manipulation and scalable learning, offering a promising direction for autonomous skill coordination in complex environments. Full article
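The prioritised, two-level action selection this abstract describes can be sketched as below: a high-level network scores grasp versus push from a depth image, and the chosen low-level network outputs the primitive's parameters. The network sizes, the 84×84 input, and the three-parameter action are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

def mlp(out_dim: int) -> nn.Module:
    """Tiny stand-in network over a flattened 84x84 depth image (assumed input size)."""
    return nn.Sequential(nn.Flatten(), nn.Linear(84 * 84, 256), nn.ReLU(), nn.Linear(256, out_dim))

decision_net = mlp(2)   # high-level DNN: logits for {grasp, push}
grasp_net = mlp(3)      # low-level ANN: e.g. (x, y, angle) for grasping
push_net = mlp(3)       # low-level ANN: e.g. (x, y, angle) for pushing

def act(depth: torch.Tensor):
    """One closed-loop step: choose a primitive, then predict its parameters."""
    primitive = decision_net(depth).argmax(dim=-1).item()  # 0 = grasp, 1 = push
    params = (grasp_net if primitive == 0 else push_net)(depth)
    return primitive, params
```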

22 pages, 6487 KB  
Article
An RGB-D Vision-Guided Robotic Depalletizing System for Irregular Camshafts with Transformer-Based Instance Segmentation and Flexible Magnetic Gripper
by Runxi Wu and Ping Yang
Actuators 2025, 14(8), 370; https://doi.org/10.3390/act14080370 - 24 Jul 2025
Viewed by 926
Abstract
Accurate segmentation of densely stacked and weakly textured objects remains a core challenge in robotic depalletizing for industrial applications. To address this, we propose MaskNet, an instance segmentation network tailored for RGB-D input, designed to enhance recognition performance under occlusion and low-texture conditions. Built upon a Vision Transformer backbone, MaskNet adopts a dual-branch architecture for RGB and depth modalities and integrates multi-modal features using an attention-based fusion module. Further, spatial and channel attention mechanisms are employed to refine feature representation and improve instance-level discrimination. The segmentation outputs are used in conjunction with regional depth to optimize the grasping sequence. Experimental evaluations on camshaft depalletizing tasks demonstrate that MaskNet achieves a precision of 0.980, a recall of 0.971, and an F1-score of 0.975, outperforming a YOLO11-based baseline. In an actual scenario, with a self-designed flexible magnetic gripper, the system maintains a maximum grasping error of 9.85 mm and a 98% task success rate across multiple camshaft types. These results validate the effectiveness of MaskNet in enabling fine-grained perception for robotic manipulation in cluttered, real-world scenarios. Full article
(This article belongs to the Section Actuators for Robotics)
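The attention-based RGB-D fusion module mentioned in the abstract can be sketched, in a generic form, as cross-attention from RGB tokens to depth tokens with a residual connection; the token dimension and head count below are assumptions rather than MaskNet's actual configuration.

```python
import torch
import torch.nn as nn

class RGBDFusion(nn.Module):
    """Generic cross-attention fusion of RGB and depth token sequences (B, N, C)."""
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, rgb_tokens: torch.Tensor, depth_tokens: torch.Tensor) -> torch.Tensor:
        # RGB queries attend to depth keys/values; residual keeps the RGB stream intact.
        fused, _ = self.cross_attn(rgb_tokens, depth_tokens, depth_tokens)
        return self.norm(rgb_tokens + fused)
```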

24 pages, 5534 KB  
Article
Enhancing Healthcare Assistance with a Self-Learning Robotics System: A Deep Imitation Learning-Based Solution
by Yagna Jadeja, Mahmoud Shafik, Paul Wood and Aaisha Makkar
Electronics 2025, 14(14), 2823; https://doi.org/10.3390/electronics14142823 - 14 Jul 2025
Viewed by 1068
Abstract
This paper presents a Self-Learning Robotic System (SLRS) for healthcare assistance using Deep Imitation Learning (DIL). The proposed SLRS solution can observe and replicate human demonstrations, thereby acquiring complex skills without the need for explicit task-specific programming. It incorporates modular components for perception (i.e., advanced computer vision methodologies), actuation (i.e., dynamic interaction with patients and healthcare professionals in real time), and learning. The hybrid-model approach (i.e., deep imitation learning combined with pose estimation algorithms) facilitates autonomous learning and adaptive task execution. Environmental awareness and responsiveness were further enhanced using a Convolutional Neural Network (CNN)-based object detection mechanism built on YOLOv8 (94.3% accuracy and 18.7 ms latency) and pose estimation algorithms, alongside a MediaPipe and Long Short-Term Memory (LSTM) framework for human action recognition. The developed solution was tested and validated in healthcare settings, with the aim of overcoming current challenges such as workforce shortages, ageing populations, and the rising prevalence of chronic diseases. CAD simulation, validation, and verification of the system's functions (i.e., assistive functions, interactive scenarios, and object manipulation) demonstrated the robot's adaptability and operational efficiency, achieving an 87.3% task completion success rate and over 85% grasp success rate. This approach highlights the potential of an SLRS for healthcare assistance. Further work will be undertaken in hospital, care home, and rehabilitation centre environments to generate complete holistic datasets confirming the system's reliability and efficiency. Full article

15 pages, 2355 KB  
Article
Intelligent Detection and Automatic Removal Robot for Skinned Garlic Cloves
by Zhengbo Zhu, Xin Cao, Yawen Xiao, Li Xin, Lei Xin and Shuqian Li
Agriculture 2025, 15(10), 1076; https://doi.org/10.3390/agriculture15101076 - 16 May 2025
Viewed by 548
Abstract
Skinned garlic cloves left over from peeling-machine operations hinder subsequent processing, and their manual removal is harmful to workers' health. In this paper, an intelligent garlic-clove-removal test bench was designed, comprising a hopper, lifter, vibration conveyor, conveyor belt, vision system, removal robot, control cabinet, and frame. A machine vision method was developed to distinguish skinned from unskinned cloves, enabling the test bench to both recognize and remove skinned garlic cloves. Tests were carried out to measure the success rates of the machine vision module and the removal robot and to optimize the bench parameters. The results showed an average machine vision success rate of 99.15% and an average removal robot success rate of 99.13%. The three factors, ranked by influence, were the conveying speed, the conveying volume, and the removal period. Regression analysis showed that at a conveying speed of 0.1 m·s⁻¹, a grasping period of 1.725 s, and a conveying volume of 104.4 kg·h⁻¹, the qualified rate of the finished product was 97.15%; the verification test yielded 97.02%, which was not significantly different from the predicted value. These results support the development of intelligent garlic-clove detection technology and, more broadly, of garlic planting and deep-processing technology. Full article
(This article belongs to the Section Agricultural Technology)

23 pages, 4734 KB  
Article
Optimal Viewpoint Assistance for Cooperative Manipulation Using D-Optimality
by Kyosuke Kameyama, Kazuki Horie and Kosuke Sekiyama
Sensors 2025, 25(10), 3002; https://doi.org/10.3390/s25103002 - 9 May 2025
Viewed by 2808
Abstract
This study proposes a D-optimality-based viewpoint selection method to improve visual assistance for a manipulator by optimizing camera placement. The approach maximizes the information gained from visual observations, reducing uncertainty in object recognition and localization. A mathematical framework utilizing D-optimality criteria is developed to determine the most informative camera viewpoint in real time. The proposed method is integrated into a robotic system where a mobile robot adjusts its viewpoint to support the manipulator in grasping and placing tasks. Experimental evaluations demonstrate that D-optimality-based viewpoint selection improves recognition accuracy and task efficiency. The results suggest that optimal viewpoint planning can enhance perception robustness, leading to better manipulation performance. Although tested in structured environments, the approach has the potential to be extended to dynamic or unstructured settings. This research contributes to the integration of viewpoint optimization in vision-based robotic cooperation, with promising applications in industrial automation, service robotics, and human–robot collaboration. Full article
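The D-optimality criterion amounts to preferring the viewpoint whose observation model carries the most information, i.e. the one maximising the determinant of the Fisher information matrix. A minimal sketch, assuming each candidate viewpoint is summarised by an observation Jacobian J:

```python
import numpy as np

def select_viewpoint(candidate_jacobians: list[np.ndarray]) -> int:
    """Return the index of the viewpoint maximising det(J^T J), the D-optimality score."""
    scores = [np.linalg.det(J.T @ J) for J in candidate_jacobians]
    return int(np.argmax(scores))
```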

19 pages, 11348 KB  
Article
Vision-Based Grasping Method for Prosthetic Hands via Geometry and Symmetry Axis Recognition
by Yi Zhang, Yanwei Xie, Qian Zhao, Xiaolei Xu, Hua Deng and Nianen Yi
Biomimetics 2025, 10(4), 242; https://doi.org/10.3390/biomimetics10040242 - 15 Apr 2025
Cited by 1 | Viewed by 1121
Abstract
This paper proposes a grasping method for prosthetic hands based on object geometry and symmetry axis. The method utilizes computer vision to extract the geometric shape, spatial position, and symmetry axis of target objects and selects appropriate grasping modes and postures through the extracted features. First, grasping patterns are classified based on the analysis of hand-grasping movements. A mapping relationship between object geometry and grasp patterns is established. Then, target object images are captured using binocular depth cameras, and the YOLO algorithm is employed for object detection. The SIFT algorithm is applied to extract the object’s symmetry axis, thereby determining the optimal grasp point and initial hand posture. An experimental platform is built based on a seven-degree-of-freedom (7-DoF) robotic arm and a multi-mode prosthetic hand to conduct grasping experiments on objects with different characteristics. Experimental results demonstrate that the proposed method achieves high accuracy and real-time performance in recognizing object geometric features. The system can automatically match appropriate grasp modes according to object features, improving grasp stability and success rate. Full article
(This article belongs to the Special Issue Human-Inspired Grasp Control in Robotics 2025)
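The symmetry-axis estimate that drives grasp alignment can be approximated with a principal-component fit over detected keypoints; the paper uses SIFT-based symmetry detection, so the PCA sketch below is only a stand-in showing how an axis angle can feed the initial hand posture.

```python
import numpy as np

def dominant_axis(keypoints: np.ndarray):
    """Centroid and dominant-axis angle (degrees) of an N x 2 keypoint array."""
    centre = keypoints.mean(axis=0)
    _, _, vt = np.linalg.svd(keypoints - centre, full_matrices=False)
    axis = vt[0]                                      # first principal direction
    angle = np.degrees(np.arctan2(axis[1], axis[0]))
    return centre, angle                              # grasp point and wrist pre-rotation
```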

19 pages, 4998 KB  
Article
Computer Vision-Based Robotic System Framework for the Real-Time Identification and Grasping of Oysters
by Hao-Ran Qu, Jue Wang, Lang-Rui Lei and Wen-Hao Su
Appl. Sci. 2025, 15(7), 3971; https://doi.org/10.3390/app15073971 - 3 Apr 2025
Cited by 2 | Viewed by 1821
Abstract
This study addresses the labor-intensive and safety-critical challenges of manual oyster processing by innovating an advanced robotic intelligent sorting system. Central to this system is the integration of a high-resolution vision module, dual operational controllers, and the collaborative AUBO-i3 robot, all harmonized through a sophisticated Robot Operating System (ROS) framework. A specialized oyster image dataset was curated and augmented to train a robust You Only Look Once version 8 Oriented Bounding Box (YOLOv8-OBB) model, further enhanced through the incorporation of MobileNet Version 4 (MobileNetV4). This optimization reduced the number of model parameters by 50% and lowered the computational load by 23% in terms of GFLOPS (Giga Floating-point Operations Per Second). In order to capture oyster motion dynamically on a conveyor belt, a Kalman filter (KF) combined with a Low-Pass filter algorithm was employed to predict oyster trajectories, thereby improving noise reduction and motion stability. This approach achieves superior noise reduction compared to traditional Moving Average methods. The system achieved a 95.54% success rate in static gripping tests and an impressive 84% in dynamic conditions. These technological advancements demonstrate a significant leap towards revolutionizing seafood processing, offering substantial gains in operational efficiency, reducing potential contamination risks, and paving the way for a transition to fully automated, unmanned production systems in the seafood industry. Full article
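The conveyor-tracking step, a constant-velocity Kalman filter on the detected oyster centroid followed by a first-order low-pass filter, can be sketched as follows; the timestep, noise covariances, and smoothing factor are assumed values, not the paper's tuning.

```python
import numpy as np

dt, alpha = 0.05, 0.3                                # frame interval (s), low-pass factor
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], float)
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)
Q, R = 1e-3 * np.eye(4), 4.0 * np.eye(2)             # process / measurement noise (assumed)

x, P, smoothed = np.zeros(4), np.eye(4), np.zeros(2)

def track(measurement):
    """One predict/update step on a measured (u, v) centroid, then low-pass smoothing."""
    global x, P, smoothed
    x, P = F @ x, F @ P @ F.T + Q                    # predict
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)     # Kalman gain
    x = x + K @ (np.asarray(measurement, float) - H @ x)
    P = (np.eye(4) - K @ H) @ P                      # update
    smoothed = alpha * x[:2] + (1 - alpha) * smoothed
    return smoothed                                  # smoothed position for grasp timing
```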

29 pages, 5686 KB  
Article
GPTArm: An Autonomous Task Planning Manipulator Grasping System Based on Vision–Language Models
by Jiaqi Zhang, Zinan Wang, Jiaxin Lai and Hongfei Wang
Machines 2025, 13(3), 247; https://doi.org/10.3390/machines13030247 - 19 Mar 2025
Cited by 4 | Viewed by 1688
Abstract
The integration of vision–language models (VLMs) with robotic systems represents a transformative advancement in autonomous task planning and execution. However, traditional robotic arms relying on pre-programmed instructions exhibit limited adaptability in dynamic environments and face semantic gaps between perception and execution, hindering their ability to handle complex task demands. This paper introduces GPTArm, an environment-aware robotic arm system driven by GPT-4V, designed to overcome these challenges through hierarchical task decomposition, closed-loop error recovery, and multimodal interaction. The proposed robotic task processing framework (RTPF) integrates real-time visual perception, contextual reasoning, and autonomous strategy planning, enabling robotic arms to interpret natural language commands, decompose user-defined tasks into executable subtasks, and dynamically recover from errors. Experimental evaluations across ten manipulation tasks demonstrate GPTArm’s superior performance, achieving a success rate of up to 91.4% in standardized benchmarks and robust generalization to unseen objects. Leveraging GPT-4V’s reasoning and YOLOv10’s precise small-object localization, the system surpasses existing methods in accuracy and adaptability. Furthermore, GPTArm supports flexible natural language interaction via voice and text, significantly enhancing user experience in human–robot collaboration. Full article
(This article belongs to the Section Robotics, Mechatronics and Intelligent Machines)
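The hierarchical task decomposition and closed-loop error recovery described above follow a decompose-execute-recover cycle; the sketch below is purely illustrative, and query_vlm, capture_image, and execute are hypothetical stand-ins, not the paper's interfaces.

```python
def run_task(instruction, capture_image, query_vlm, execute, max_retries=3):
    """Decompose a natural-language command into subtasks and recover from failures."""
    subtasks = query_vlm(f"Decompose into arm subtasks: {instruction}", capture_image())
    for subtask in subtasks:
        for _ in range(max_retries):
            if execute(subtask):                     # low-level grasp/move primitive succeeded
                break
            # Ask the VLM for a recovery action based on the current scene.
            subtask = query_vlm(f"'{subtask}' failed; propose a recovery step.", capture_image())
        else:
            raise RuntimeError(f"Could not complete subtask: {subtask}")
```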

25 pages, 13905 KB  
Article
A Framework for Real-Time Autonomous Robotic Sorting and Segregation of Nuclear Waste: Modelling, Identification and Control of Dexter™ Robot
by Mithun Poozhiyil, Omer F. Argin, Mini Rai, Amir G. Esfahani, Marc Hanheide, Ryan King, Phil Saunderson, Mike Moulin-Ramsden, Wen Yang, Laura Palacio García, Iain Mackay, Abhishek Mishra, Sho Okamoto and Kelvin Yeung
Machines 2025, 13(3), 214; https://doi.org/10.3390/machines13030214 - 6 Mar 2025
Viewed by 2203
Abstract
Robots are essential for carrying out tasks in domains such as the nuclear industry, where direct human involvement is limited. However, present-day nuclear robots are not versatile due to limited autonomy and higher costs. This research presents the transformation of the purely teleoperated Dexter™ nuclear robot into an autonomous manipulator for nuclear sort and segregation tasks. The Dexter™ system comprises an arm client manipulator designed to operate in extreme radiation environments and a similar single/dual-arm local manipulator. This paper first presents a kinematic model and a convex optimization-based dynamic model identification of a single-arm Dexter™ manipulator. This model is used for autonomous Dexter™ control through the Robot Operating System (ROS). A new integration framework incorporating vision, AI-based grasp generation, and an intelligent radiological surveying method is presented to enhance the performance of the autonomous Dexter™. The efficacy of the framework is demonstrated on a mock-up nuclear waste test-bed using waste materials similar to those found in the nuclear industry. The experiments demonstrate the effectiveness, generality, and applicability of the proposed framework in overcoming entry barriers for autonomous systems in regulated domains such as the nuclear industry. Full article
(This article belongs to the Special Issue New Trends in Industrial Robots)
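The convex optimization-based dynamic model identification rests on the fact that manipulator dynamics are linear in the inertial parameters, tau = Y(q, q̇, q̈) θ; a plain least-squares sketch (omitting the physical-consistency constraints a full convex formulation would add) looks like this:

```python
import numpy as np

def identify_dynamic_parameters(Y: np.ndarray, tau: np.ndarray) -> np.ndarray:
    """Estimate inertial parameters theta from a stacked regressor Y and joint torques tau."""
    theta, *_ = np.linalg.lstsq(Y, tau, rcond=None)
    return theta
```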

26 pages, 30384 KB  
Article
A Vision-Guided Deep Learning Framework for Dexterous Robotic Grasping Using Gaussian Processes and Transformers
by Suhas Kadalagere Sampath, Ning Wang, Chenguang Yang, Howard Wu, Cunjia Liu and Martin Pearson
Appl. Sci. 2025, 15(5), 2615; https://doi.org/10.3390/app15052615 - 28 Feb 2025
Cited by 3 | Viewed by 2997
Abstract
Robotic manipulation of objects with diverse shapes, sizes, and properties, especially deformable ones, remains a significant challenge in automation, necessitating human-like dexterity through the integration of perception, learning, and control. This study enhances a previous framework combining YOLOv8 for object detection and LSTM networks for adaptive grasping by introducing Gaussian Processes (GPs) for robust grasp predictions and Transformer models for efficient multi-modal sensory data integration. A Random Forest classifier also selects optimal grasp configurations based on object-specific features like geometry and stability. The proposed grasping framework achieved a 95.6% grasp success rate using Transformer-based force modulation, surpassing LSTM (91.3%) and GP (91.3%) models. Evaluation of a diverse dataset showed significant improvements in grasp force modulation, adaptability, and robustness for two- and three-finger grasps. However, limitations were observed in five-finger grasps for certain objects, and some classification failures occurred in the vision system. Overall, this combination of vision-based detection and advanced learning techniques offers a scalable solution for flexible robotic manipulation. Full article
(This article belongs to the Special Issue Recent Advances in Autonomous Systems and Robotics, 2nd Edition)
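One way to obtain the uncertainty-aware grasp predictions attributed to the Gaussian Process component is GP regression from object features to a grasp parameter such as applied force; the features, targets, and kernel length scale below are hypothetical, not the paper's data.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

X = np.array([[0.05, 0.2], [0.08, 0.9], [0.03, 0.1]])  # hypothetical (object width m, stiffness score)
y = np.array([4.0, 7.5, 2.0])                           # hypothetical grasp forces (N)

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.1), normalize_y=True).fit(X, y)
mean, std = gp.predict(np.array([[0.06, 0.5]]), return_std=True)  # prediction with uncertainty
```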

20 pages, 9017 KB  
Article
Machine Vision-Assisted Design of End Effector Pose in Robotic Mixed Depalletizing of Heterogeneous Cargo
by Sebastián Valero, Juan Camilo Martinez, Ana María Montes, Cesar Marín, Rubén Bolaños and David Álvarez
Sensors 2025, 25(4), 1137; https://doi.org/10.3390/s25041137 - 13 Feb 2025
Cited by 5 | Viewed by 1740
Abstract
Automated depalletizing systems aim to offer continuous and efficient operation in warehouse logistics, reducing cycle times and contributing to worker safety. However, most commercially available depalletizing solutions are designed primarily for highly homogeneous cargo arranged in orthogonal configurations. This paper presents a real-time approach for depalletizing heterogeneous pallets with boxes of varying sizes and arbitrary orientations, including configurations where the topmost surfaces of boxes are not necessarily parallel to each other. To accomplish this, we propose an algorithm that leverages deep learning-based machine vision to determine the size, position, and orientation of boxes relative to the horizontal plane of a robot arm from sparse depth data. Using this information, we implement a path planning method that generates collision-free trajectories to enable precise box grasping and placement onto a production line. Validation through both simulated and real-world experiments demonstrates the feasibility and accuracy of this approach in complex industrial settings, highlighting potential improvements in the efficiency and adaptability of automated depalletizing systems. Full article
(This article belongs to the Section Sensors and Robotics)
