Computer Vision Algorithms on a Raspberry Pi 4 for Automated Depalletizing
Abstract
1. Introduction
2. Background and Motivation
2.1. Depalletizing Systems in Industrial Environments
2.2. Computer Vision Techniques for Object Detection and Localization
2.3. Previous Work on Comparing Methods
3. Methodology
3.1. Problem Specifications
3.2. The Object Detection Methods
3.3. Experimental Setup
- Collecting the image data.
- Preprocessing the images and generating feature vectors with each of the four methods individually: pattern matching, Haar cascade, SIFT, and ORB.
- Comparing the outcomes of the algorithms.
- Selecting the most suitable algorithm for a specific depalletizing system.
3.4. Pattern Matching
- Template Generation: Template images are selected and augmented by applying rotations at various angles.
- Image Processing: The system captures a grayscale image to simplify subsequent computations and facilitate robust feature extraction.
- Template Matching: Image patches are compared to the templates using an appropriate similarity metric, discussed below.
- Non-Maximum Suppression: Of a set of overlapping candidate regions, only the top-scoring one is retained.
- Result Visualization
3.5. Scale-Invariant Feature Transform (SIFT)
- Scale-space Extrema Detection;
- Keypoint Localization;
- Orientation Assignment;
- Keypoint Descriptor.
- SIFT Feature Extraction: SIFT key points are detected and descriptors are extracted from both the template and target images; key points are located as difference-of-Gaussians extrema across multiple scales, and each descriptor is a histogram of local intensity gradients.
- Looping Over Image Regions: The code iterates over candidate regions of the target image.
- Matching and Selection: Matches between descriptors are computed, and a region is considered only if a minimum match count is reached.
- Filtering Matches: A ratio test filters the initial set of matches, keeping a match only when the distance to the nearest descriptor is markedly smaller than the distance to the second-nearest one.
- Homography Estimation: A perspective transformation matrix M is estimated from the best-matched key points to align the template with the current region of interest.
- Perspective Transformation: Transformation M is applied to map the template corners into the target image.
- Bounding Box Computation
- Drawing and Visualization: The bounding box and center of each detected object are visualized on the target image.
3.6. Oriented FAST and Rotated BRIEF (ORB)
- implements the ORB detector for feature extraction in both the template and target image;
- applies a sliding window approach with a defined step size for efficient detection;
- employs a Brute-Force Matcher with Hamming distance for descriptor matching;
- filters matches based on a predefined threshold;
- performs non-maximum suppression to merge nearby bounding boxes;
- marks the center of each bounding box and assigns a unique identifier.
3.7. Haar Cascade Classifier
- Prepare Negative Images: A folder for negative images was created by copying images from a source directory. A background file (bg.txt) listed the negative images.
- Resize and Edit Images: Positive and negative images were resized to consistent dimensions (a fixed pixel size was used for the positive images).
- Create Positive Samples: The opencv_createsamples tool generated positive samples and the related information for each positive image, storing each image's data in a separate directory. The default configuration includes parameters such as angles, number of samples, width, and height.
- Merge Vector Files: Positive sample vector files were merged into a single file for training input.
- Train Cascade Classifier: The opencv_traincascade tool trained the classifier.
- Completion: After training, object detection is performed.
4. Results and Discussion
4.1. Accuracy
- Pattern Matching: This achieved high accuracy in object detection, with straightforward configuration by adjusting a single threshold and angle.
- SIFT: This demonstrated efficiency in finding key points, especially effective in rotation scenarios, contributing to its versatility across various applications.
- ORB: This maintained reliable detection accuracy for the front side of objects under certain conditions but showed limitations in recognizing the back part of matchboxes.
- Haar cascade: Despite a time-intensive training process, this exhibited only acceptable accuracy.
4.2. Robustness to Variability
- Pattern Matching: This demonstrated robustness to variability, showcasing resilience to changes in object appearance, lighting, and orientation.
- SIFT: This proved robust against scale, rotation, and illumination changes, contributing to its adaptability in diverse conditions.
- ORB: This displayed limitations in recognizing specific object orientations, impacting its robustness to variability. However, it remained reliable under certain conditions.
- Haar Cascade: This showed resilience to variations in object appearance and lighting conditions, contributing to its effectiveness in real-world scenarios.
4.3. Computing Speed
- Pattern Matching: This achieved fast detection speed, taking only a few seconds to run.
- SIFT: This boasted a fast implementation with efficient key point detection, contributing to its real-time applicability.
- ORB: This exhibited slower execution speed, contrary to expectations for a binary method, suggesting potential performance optimizations.
- Haar Cascade: This demonstrated quick detection post-training, with the inevitable and initial time investment required during the training phase.
4.4. Detection Sensitivity
- Pattern Matching: This exhibited sensitivity to changes in the detection threshold, offering flexibility in configuration.
- SIFT: This showed sensitivity to parameter adjustments, with a relatively quick tuning process.
- ORB: This displayed sensitivity to object orientation, requiring careful parameter tuning for optimal performance.
- Haar Cascade: This required attention to parameters such as the variation settings and rotation angle, making the tuning process time-consuming.
4.5. Resource Consumption
- Pattern Matching, SIFT, and ORB: These demonstrated efficient resource consumption, making them suitable for practical applications.
- Haar Cascade: This required significant computational resources during the training phase, with efficient resource consumption during detection.
- Training Time
- Pattern Matching, SIFT, and ORB: These algorithms do not require training, making them advantageous in scenarios where rapid deployment is needed.
- Haar Classifier: This requires a substantial training time of 3.55 h, indicating an initial setup cost. However, this investment pays off with excellent detection performance.
- Detection Latency
- Haar Classifier: The fastest detection time (0.09 s) highlights its efficiency post-training.
- Pattern Matching: The quick detection time (0.13 s) without the need for training makes it a strong candidate for real-time applications.
- SIFT: The moderate detection time (0.39 s) reflects its computational complexity due to the detailed feature extraction process.
- ORB: Surprisingly, ORB takes the longest detection time (12.06 s), which is unexpected for a binary feature descriptor. This may be attributed to implementation details or the specific test conditions.
- Total Matches
- Pattern Matching and Haar Classifier: Both achieve the highest number of matches (7), indicating high effectiveness in object detection.
- SIFT: Slightly lower total matches (6), reflecting its robustness but also its selective nature.
- ORB: The lowest total matches (4), highlighting potential limitations in detecting all relevant objects, especially in more complex scenarios.
- Precision, Recall, and F1 Score
- Precision: All four algorithms exhibit perfect precision (1.00), indicating that when they do make detections, they are consistently accurate.
- Recall: Pattern matching and the Haar classifier achieve perfect recall (1.00), showing their ability to detect all relevant objects. SIFT has a slightly lower recall (0.86), while ORB has the lowest (0.57), indicating that it misses more objects.
- F1 Score: The F1 Score combines precision and recall into a single index. Pattern matching and the Haar classifier both achieve the highest possible F1 score (1.00). SIFT has a respectable F1 score (0.92), while ORB lags behind at 0.73.
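These figures follow directly from the standard definitions. A quick check, assuming 7 ground-truth objects (consistent with the total-matches counts) and zero false positives:

```python
def prf1(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive
    and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# SIFT: 6 of 7 objects found, no false detections
p, r, f = prf1(tp=6, fp=0, fn=1)   # rounds to 1.00, 0.86, 0.92
# ORB: 4 of 7 objects found, no false detections
p2, r2, f2 = prf1(tp=4, fp=0, fn=3)  # rounds to 1.00, 0.57, 0.73
```

With no false positives, precision is 1.00 for every method, so the F1 score differences are driven entirely by recall.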
4.6. Additional Remarks
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
| Method | Training Time (h) | Latency (s) | Total Matches | Precision | Recall | F1 Score |
|---|---|---|---|---|---|---|
| Pattern Matching | – | 0.13 | 7 | 1.00 | 1.00 | 1.00 |
| Haar Classifier | 3.55 | 0.09 | 7 | 1.00 | 1.00 | 1.00 |
| SIFT | – | 0.39 | 6 | 1.00 | 0.86 | 0.92 |
| ORB | – | 12.06 | 4 | 1.00 | 0.57 | 0.73 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Greco, D.; Fasihiany, M.; Ranjbar, A.V.; Masulli, F.; Rovetta, S.; Cabri, A. Computer Vision Algorithms on a Raspberry Pi 4 for Automated Depalletizing. Algorithms 2024, 17, 363. https://doi.org/10.3390/a17080363