Abstract
Automated guided vehicles (AGVs) have become prevalent over the last decade. However, numerous challenges remain, including path planning, security, and the capacity to operate safely in unstructured environments. This study proposes an obstacle avoidance system that leverages deep action learning (DAL) to address these challenges and meet the requirements of Industry 4.0 for AGVs, such as speed, accuracy, and robustness. In the proposed approach, DAL is integrated into an AGV platform to enhance its visual navigation, object recognition, localization, and decision-making capabilities. DAL itself combines You Only Look Once (YOLOv4), speeded-up robust features (SURF), k-nearest neighbor (kNN), and AGV control for indoor visual navigation. The DAL system triggers SURF to differentiate two consecutive navigation images, and kNN is used to verify visual distance in real time so that the AGV avoids obstacles on the floor while searching for the home position. The testing results show that the proposed system is reliable and meets the needs of advanced AGV operations.
1. Introduction
Achieving the large-scale use of automatic guided vehicles (AGVs) called for by Industry 4.0 is a daunting undertaking. Most AGVs nowadays are used by massive enterprises, such as Amazon, Alibaba, Lotte, Carrefour, Walmart, and Pinduoduo [1,2,3,4,5,6]. AGVs have shown great benefits in the logistics field and have led to a significant reduction in handling and transportation costs. However, to meet the anticipated needs of Society 5.0, further work is required to improve the cost, speed, safety, and versatility of AGV systems [5,6]. Industry 4.0 requirements include interoperability (compatibility), information transparency, technical assistance, and independent decision making. The system built for the AGV in this study addresses the independent decision-making requirement: the AGV decides autonomously when to navigate and when to stop through a DAL mechanism. DAL can only perform optimally with the support of other existing methods, namely YOLO for detection, SURF for route confirmation, and kNN for verifying the AGV against references such as the start position, the home position, and obstacles.
Continuous advancements in wireless connectivity, sensor miniaturization, cloud computing, big data, and analytics have led to a new concept called the Internet of Things (IoT), in which devices collect and exchange information with one another with little or no human intervention [5,7,8]. The integration of IoT and Artificial Intelligence (AI) technology has led to the development of the Artificial Intelligence of Things (AIoT), essentially a more intelligent and capable ecosystem of connected devices and systems. In the context of AGVs, AIoT primarily aims to emulate the execution of human tasks within logistics and storage systems through the utilization of Internet networks and intelligent decision making [8,9,10,11]. Although Industry 4.0 already leverages such AGV technology to a certain extent, its use is generally limited to larger enterprises and structured environments [2].
One of the many components of Industry 4.0 is warehousing, which integrates technology and automation to optimize various tasks in the storage and distribution of goods. Within warehouse environments, AGVs are mainly used for order picking and material transport and are typically guided by line tracking, barcodes, laser sensors, and camera techniques [1]. Such methods work well in structured environments with clearly defined paths, controlled obstacles, predictable workflows, and minimal human interaction [12,13]. However, in environments with a dynamic layout, obstacle variability, and the need for complex decision making, existing AGV systems face significant challenges. Unfortunately, these problems are yet to be adequately resolved in the context of warehousing. More broadly, other studies link Industry 4.0 with machine models and simulations through dynamic, intelligent, flexible, and open applications. Photogrammetry models, primarily assisted by GNSS and ground control point (GCP) navigation, are integral to the flexibility aspect of Industry 4.0 and have also been used to develop a wheeled AGV in which artificial neural networks provide high accuracy [14,15].
In the literature, AIoT approaches have been widely used for applications such as roadside recognition for AGV forklift equipment, unmanned vehicle obstacle detection, autonomous lane detection, crop and weed detection, and collision avoidance [2,16]. The authors in [17] used a recurrent neural network (RNN) to guide musculoskeletal arms and robotic arms to achieve the precise and efficient completion of goal-directed tasks. That work built on the research of Krizhevsky et al. [18], who earlier developed a convolutional neural network (CNN) called AlexNet that is efficient and uses dropout regularization to reduce overfitting [19]. Zhang et al. [5,11,12,20,21] proposed generating features from two consistent domains using Generative Adversarial Networks (GANs); however, the training procedure is slow, which led to an unsupervised cross-domain object detection approach known as CA-FRCNN (Cycle-Consistent Domain Adaptive Faster RCNN). Another approach, You Only Look Once (YOLO), is a real-time object identification method, of which Fang et al. developed the lightweight Tinier-YOLO variant [24]; several other variants of the original YOLO model have also been proposed [22]. In general, the results have shown that YOLO has many advantages over other lightweight models, including real-time processing capabilities, multi-object detection, and customization for specific applications, making it a versatile and efficient choice for many computer vision tasks.
YOLO has found many applications across various industries, including robotics, agriculture, medicine, health, education, and the military. As the detection speed of YOLO has improved over the past few years, the scope of its applications has expanded [23], and there is now growing interest in applying YOLO to AGVs. However, YOLO has a high power consumption and relies on fast GPU cards and complex computational processes. Furthermore, while YOLO facilitates autonomous driving, navigation, and obstacle avoidance, particularly in unstructured environments, there remain many challenging concerns to be overcome [18,19,20,22,23,24].
The reliable navigation of modern AGVs generally depends on the successful recognition and detection of routing markers. Stereo camera systems, such as the D435i, Kinect, or RealSense, provide an effective solution for the detection of fixed objects. However, they suffer from several severe limitations in practical situations, such as vulnerability to environmental conditions, the need to maintain accurate calibration and camera alignment over time, and a high computational complexity. Consequently, the feasibility of using mono camera systems for AGV navigation has gained increasing traction in recent years. In such approaches, a relative baseline is calculated as the AGV moves based on the pixel shift at a fixed point on an object, and the speeded-up robust features (SURF) algorithm is then applied to the pixel shift for navigation purposes. Compared with stereoscopic vision systems, mono camera systems have a lower power consumption, thus facilitating a longer AGV running time. Moreover, in automatic navigation scenarios, the ability to detect random obstacles, narrow positions, and changes in object size, orientation, and type is typically more important than performing depth estimation [7,25,26].
The studies in [3,27,28] examined the roles of AI, robotics, and data mining in AGV navigation and concluded that effective algorithms for navigating indoor spaces rely heavily on the extraction of appropriate local features for performing keyframe selection, localization, and relative posture calculation. Many features and feature processing methods have been proposed, including invariant column segments [4], SIFT (Scale Invariant Feature Transform) [29,30], and FREAK (Fast Retina Keypoint) [29]. It was shown in [31] that feature processing can be accelerated through a bag-of-words (BoW) technique, in which a histogram of visual words is used to represent the quantized image. Term Frequency–Inverse Document Frequency (TF-IDF), a statistics-based method, can also be applied to each histogram bin to quantify the relevance of a particular visual term to any image within the image set [27,32]. However, although local features are theoretically less sensitive to lighting variations and motion blur, indoor environments still pose a significant challenge owing to their extreme visual diversity, the presence of repeated patterns, and the potential for occlusion. Random Sample Consensus (RANSAC) [2,28,33] provides a means of overcoming these problems through more accurate and robust feature matching. However, in complex, unstructured environments, the resulting substantial mismatch ratio increases the computation time required by RANSAC to estimate the relative poses with precision.
In previous studies [34,35,36], the present group developed a region-based CNN (R-CNN) with an eye-in-hand structure that utilized a single camera to estimate depth and object location. In a later study, this approach was extended to the task of object picking, and an action learning (AL) method was proposed to help the manipulator robot learn from its mistakes [32]. Although that system can learn from its actions, its working area coverage is very narrow and its object detection area is fixed, making the AL method less suitable for unstructured areas, varying distances, and dynamic object positions. In the present study, a robust and efficient navigation method is proposed for AGVs by extending AL into deep action learning (DAL) and utilizing SURF and the k-nearest neighbors (kNN) method as feedback guides for the navigation process.
This study’s primary contributions can be summed up as follows:
- A DAL architecture is employed to perform robust and accurate detection of objects in an indoor environment.
- Object localization is performed using a single monocular camera fixed to the AGV. An automated navigation capability is realized through the seamless integration of YOLOv4, SURF, and kNN within a DAL architecture.
- The AGV’s self-navigation performance is enhanced by representing obstacles as points or nodes in the AGV mapping system, thereby improving its ability to plan routes around them.
- The experimental outcomes demonstrate that the suggested system performs robustly and meets the requirements of advanced AGV operations.
The remainder of this paper is organized as follows. The general system design is presented in Section 2, including both the AGV robotic platform and the navigation system. The suggested visual navigation system, the localization, and detection techniques are described in depth in Section 3. Section 4 discusses the visual navigation, obstacle avoidance, safety, and moving obstacle detection issues for the AGV platform in an indoor warehouse environment. The results of the experiment are presented and analyzed in Section 5. Section 6 concludes by offering a brief conclusion and suggesting future research directions.
2. System Design
The proposed system comprises two main components: the physical AGV robotic platform and the navigation system used to control its motion. The navigation system is designed to allow the AGV to operate naturally within an environment containing various objects, such as walls, aisles, shelves, and objects on the floor. For evaluation purposes, in this research the AGV navigates past various obstacles or markers from the starting position to the home position. As the AGV moves, it performs continuous object detection and recognition using a visual navigation system implemented with a DAL architecture that uses YOLOv4 for segmentation and SURF for collision avoidance.
The fundamental elements of the suggested DAL architecture are shown in Figure 1. The brown boxes refer to the simulated localization environment, which contains various objects, including sports cones, gallon water containers, and cardboard boxes. A dataset of images showing these objects and the associated environment was compiled to support the YOLO segmentation process. While the AGV is moving, the paired RGB images captured by the mono camera are used as input to the SURF algorithm (shown in purple) to perform obstacle avoidance according to the object detection results obtained from the kNN-assisted YOLOv4 model. Finally, a set of commands is produced to instruct the AGV to move forward, backward, left, or right, or to stop, as required, to safely reach the designated home position (depicted in white).
Figure 1.
Overall structure of DAL architecture for indoor AGV navigation.
In general, DAL is divided into three main parts; it should be noted that the DAL concept adopts both AL and reinforcement learning (RL). The first part is the environment, which is set up with cones, water containers, cardboard boxes, and other items; a previously collected dataset is also inserted into this otherwise passive environment. In the middle is the visual navigation system. RGB images of the indoor environment are input to this system for detection and recognition by YOLOv4, trained with one of three candidate optimizers: SGDM, RMSProp, and ADAM. The YOLOv4 recognition results are then used by kNN to find connection points between the previous image and the current image, and the resulting shortest distances are used to steer the AGV along a safe route. The RGB input from the AGV is captured continuously and compared: as soon as the AGV moves, the first image is taken, and after the next movement, the following image is taken. The resulting shift in the baseline along the x-axis and/or z-axis means that the AGV effectively behaves like a stereo camera producing a pair of near-identical images.
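To make the kNN step concrete, the following is a minimal sketch assuming the YOLOv4 bounding-box centers of the previous and current frames are available as 2-D pixel coordinates; the coordinates, the shift threshold, and the command names are illustrative placeholders rather than the authors' implementation.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def nearest_point_shift(prev_pts, curr_pts):
    """Match each current point to its nearest neighbor in the previous
    frame and return the distances and per-point pixel shifts."""
    nn = NearestNeighbors(n_neighbors=1).fit(prev_pts)
    dist, idx = nn.kneighbors(curr_pts)
    shifts = curr_pts - prev_pts[idx[:, 0]]   # apparent motion of each point
    return dist[:, 0], shifts

# Hypothetical bounding-box centers detected by YOLOv4 in consecutive frames.
prev_pts = np.array([[320.0, 260.0], [110.0, 250.0]])   # frame t-1
curr_pts = np.array([[322.0, 292.0], [108.0, 251.0]])   # frame t
dist, shifts = nearest_point_shift(prev_pts, curr_pts)

# A rapidly growing vertical shift suggests an obstacle is getting closer;
# the 25-pixel threshold is an assumption used only for illustration.
command = "turn" if np.any(shifts[:, 1] > 25.0) else "forward"
print(command, dist)
```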
SURF processes the near-identical image pair to find link points (nodes) between the first and second images. In principle, there is no limit to the number of matching points, and more matches give a more valid result; however, to keep performance efficient, the number is limited, and ten nodes are sufficient. The navigation algorithm uses these nodes to avoid obstacles, approach the target, turn, and stop. Another algorithm related to DAL tests the accuracy of target detection. The two algorithms are combined to locate the target and determine the AGV's navigation in an indoor context.
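As an illustration of this SURF step, the snippet below is a minimal sketch using OpenCV's non-free contrib module (opencv-contrib-python built with non-free support); the file names and the Hessian threshold are assumptions, and only the ten strongest matches are retained, as described above.

```python
import cv2

# Consecutive frames from the mono camera (placeholder file names).
img1 = cv2.imread("frame_prev.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_curr.jpg", cv2.IMREAD_GRAYSCALE)

# SURF lives in OpenCV's non-free contrib module; the Hessian threshold
# of 400 is an assumed value, not one stated in the paper.
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
kp1, des1 = surf.detectAndCompute(img1, None)
kp2, des2 = surf.detectAndCompute(img2, None)

# Brute-force matching on the float descriptors, keeping only the ten
# strongest links (nodes), as described in the text.
matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:10]

# Pixel coordinates of the matched nodes in both frames, to be handed
# to the navigation algorithm for obstacle avoidance and homing.
nodes = [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in matches]
```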
5. Experimental Results
5.1. Experimental Settings
The present study considered the problem of building an AGV navigation system for SMEs. Thus, the aim was to reduce the overall cost of the devices used while maintaining an acceptable system performance. The proposed system runs on an Intel Core i7-1165G7 CPU @ 2.80 GHz with 16 GB of RAM. To meet the graphical requirements, an NVIDIA® GeForce RTX™ 3060 GPU with 6 GB of GDDR6 memory was installed, and all hardware was mounted onboard a customized four-wheel-drive AGV. The evaluation of the proposed navigation system was limited to the case of detecting three objects and maneuvering as required to reach the home position without contacting any obstacle. The feasibility of the system is evidenced by Figure 9 and Figure 10.
Figure 10.
The YOLOv4 detector iteration RMSEs over training loss in each class for the SGDM (top), ADAM (middle), and RMSProp optimizer (bottom).
5.2. Detection Evaluation
It is necessary to evaluate the metrics considered in the present study: accuracy, precision, recall, F1 score, and average precision (AP); see Equation (24). This evaluation is obtained from the true-positive (TP), false-positive (FP), true-negative (TN), and false-negative (FN) statistics collected for around 560 images. The precision, recall, F1 score, and AP were calculated using a confidence value of 0.85. In the experiments, TP was taken as a detection outcome with an IoU value > 0.75 and a single bounding box, and FP otherwise. The TN criterion was specified as the absence of a double bounding box and an IoU value of <0.75. Finally, the FN output was specified as an incorrect detection output with an IoU value of <0.75 or the absence of a bounding box.
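For clarity, the following is a minimal sketch of how these metrics can be computed from the collected statistics; the IoU helper and the counts in the example are illustrative placeholders, not the paper's actual results.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def detection_metrics(tp, fp, tn, fn):
    """Standard accuracy, precision, recall, and F1 from the four counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Example with placeholder counts (not the paper's statistics):
# a detection counts as TP only if its IoU exceeds 0.75.
print(iou((50, 50, 150, 150), (55, 52, 152, 152)) > 0.75)
print(detection_metrics(tp=420, fp=35, tn=80, fn=25))
```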
As shown in Table 3, the maximum confidence, accuracy, and precision values were obtained for the carton box; the best recall and F1 performance were obtained for the water container; and the best AP and detection speed were obtained for the sports cone, with an average time of µ = 0.004 s and σ = 0.004 s. The training of YOLOv4 was evaluated for three different optimizer schemes: Root Mean Square Propagation (RMSProp), Adaptive Moment Optimization (ADAM), and Stochastic Gradient Descent with Momentum (SGDM), as shown in Figure 10.
Table 3.
Comparison of detection performance of YOLOv4 and DAL with that of other methods.
Based on Figure 10, ADAM was determined to be the best optimizer for the proposed navigation system and was thus used in all the remaining experiments. After successful detection and localization, the starting position was established, as shown in Figure 9 in Section 4.
The practical training of YOLOv4 shown in Figure 10 requires 2500 iterations, after which the final results of the three optimizers are almost the same. SGDM, for example, yields small training-loss values, but over specific iterations (1000–1500) its loss increases while its noise decreases. The RMSProp optimizer performs worse than ADAM. The ADAM optimizer shows the largest training loss of the three, and its loss curve is still noisier than that of the SGDM optimizer. Nevertheless, this performance is sufficient for detection in indoor environments with varied obstacles/targets in terms of orientation, zoom, image angle, and lighting.
Before discussing the detection results in Table 3, it is worth explaining how YOLOv4 carries out detection. The first technical step is preprocessing of the dataset, in which the data are labeled and the bounding boxes are stored in matrix form. Training is then carried out with several settings: the training optimizer (ADAM, RMSProp, or SGDM), an initial learning rate of 0.00001, a squared gradient decay factor (SGDF) of 0.99, a maximum of 20 epochs, and a mini-batch size of 64. If the learning rate is too low, training may take longer; however, if the learning rate is too high, training may produce a suboptimal or distorted output.
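As a hedged illustration of these settings, the sketch below configures the three optimizers with the stated hyperparameters in PyTorch; the placeholder model and the SGDM momentum value of 0.9 are assumptions, since the text does not specify the training framework or the momentum value.

```python
import torch
import torch.nn as nn

# Stand-in for the YOLOv4 detector; the real network is assumed to come
# from an existing YOLOv4 implementation.
model = nn.Conv2d(3, 16, kernel_size=3)

# Settings quoted in the text: initial learning rate 1e-5, squared gradient
# decay factor 0.99, maximum of 20 epochs, mini-batch size of 64.
LR, SQ_DECAY, MAX_EPOCHS, BATCH_SIZE = 1e-5, 0.99, 20, 64

optimizers = {
    # The momentum value of 0.9 for SGDM is an assumption, not stated in the text.
    "SGDM": torch.optim.SGD(model.parameters(), lr=LR, momentum=0.9),
    # In RMSProp, the squared gradient decay factor corresponds to `alpha`.
    "RMSProp": torch.optim.RMSprop(model.parameters(), lr=LR, alpha=SQ_DECAY),
    # In ADAM, it corresponds to the second-moment coefficient `beta2`.
    "ADAM": torch.optim.Adam(model.parameters(), lr=LR, betas=(0.9, SQ_DECAY)),
}
```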
The detection results are strongly influenced by the input image; viewing angle, brightness, initial mini-batch size, learning rate, and maximum epochs all significantly affect detection accuracy and training duration. When the long training process is complete, a training loss graph such as that in Figure 10 is obtained, along with a detector for each optimizer. The training loss is visible throughout training; essentially, the smaller its value, the better the results, and it depends on the chosen initial learning rate. The YOLOv4 training produces three detectors, one for each dataset class: sport cone, water container, and carton box. All images detected by DAL and YOLOv4 were captured under left, right, forward, and backward shift conditions; the case in which the AGV captures images while pivoting or moving diagonally closer or further away has not yet been considered.
After studying several empirical results of applying DAL to conventional wheeled AGVs, it was realized that industrial AGVs move not only forward and backward but also pivot and move diagonally back and forth. Such movement produces images that are difficult to detect, even with YOLOv4. If DAL were applied to an AGV with omni or mecanum wheels, handling the turning speed would be expected to be even more challenging, even with the same DAL. This challenge shows that DAL still has weaknesses, especially in linking images when the AGV slips.
5.3. DAL Navigation Experiments
Table 4 evaluates the successful task completion performance of the AGV when using YOLOv4 only (upper rows) and YOLOv4 with DAL (lower rows). Two groups of attempts were completed for each implementation, with 58 trials each time. In evaluating the performance, “success” was defined as a homing time of ≤0.40 min, and failure otherwise. Overall, the results presented in Table 4 show that including DAL in the navigation scheme increases the mean success rate from 0.902 to 0.952.
Table 4.
Performance effects of DAL vs. YOLOv4.
The experimental trials provide useful insights into the effectiveness of the YOLOv4 detector. Employing a data-driven methodology is essential in this approach: the selection of the detector type is determined using raw data derived from the input image and the YOLOv4 detection. Furthermore, the system demonstrates a rapid response time of just 0.36 min, which enables DAL to swiftly analyze the next frame to improve localization accuracy. Incorrect navigation choices can lead to route complications and require additional time for correction. Finally, the fourth column for YOLOv4 does not use cycles and is therefore marked as not available (n/a).
This DAL navigation experiment does not compare the system with other methods; the comparison in this paper is between DAL-YOLOv4 and YOLOv4 alone, carried out to determine the time consumption and average execution of the system. As shown in Table 4, the system was tested in two large trials for YOLOv4, the first with 54 and the second with 49 attempts, while DAL had 57 and 50 attempts, respectively, yielding success rates of 0.902 for YOLOv4 and 0.952 for DAL.
Based on Table 4, the uniqueness of DAL lies in its cycles. A cycle continues testing until the AGV's navigation goal is achieved. The passing score β determines whether the AGV has reached its target; in this DAL cycle, β is set to 0.87, split across Algorithms 1 and 2, and detection is 0.04 min faster than with YOLOv4 alone. Looking more deeply, DAL architecturally contains YOLOv4, yet empirically it is almost as fast, because the DAL principle, with its four stages of planning, acting, observing, and reflecting, can carry out work from various parts. For example, once the first detection has been carried out, the data are stored in the reflecting stage; if YOLOv4 later fails or detects poorly, these reflected data can be reused. Meanwhile, YOLOv4 works in three stages: backbone, neck, and head (YOLOv3), as shown in Figure 8.
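A minimal sketch of such a cycle is given below, assuming a passing score β = 0.87; the helper functions, the maximum cycle count, and the dummy confidence score are placeholders for the planning, acting, observing, and reflecting behavior described above, not the authors' implementation.

```python
import random

BETA = 0.87        # passing score quoted in the text
MAX_CYCLES = 10    # safety limit per navigation attempt (assumed, not stated)

# --- Placeholder stubs standing in for the real subsystems -----------------
def plan(memory):                     # choose the next maneuver from history
    return "forward" if not memory else "adjust"

def act(action):                      # send the motion command to the AGV
    pass

def observe():                        # YOLOv4 detection + SURF/kNN verification
    return random.uniform(0.5, 1.0)   # dummy confidence score
# ---------------------------------------------------------------------------

def dal_attempt():
    """One navigation attempt: plan-act-observe-reflect until the score
    reaches the passing score beta, or the cycle limit is hit."""
    memory = []                                   # the "reflecting" store
    for cycle in range(1, MAX_CYCLES + 1):
        action = plan(memory)
        act(action)
        score = observe()
        memory.append((action, score))            # reflect: keep data for reuse
        if score >= BETA:                         # target confirmed, stop cycling
            return cycle, True
    return MAX_CYCLES, False

cycles_used, reached_home = dal_attempt()
```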
In the first experiment, DAL took 63 cycles for 57 successful attempts, i.e., six extra cycles across the navigation processes. In the second DAL experiment, there were 72 cycles for 50 attempts, a surplus of 22 cycles. This surplus arises when a single trial contains more than one cycle. Ideally, the numbers of attempts and cycles would be equal, i.e., DAL would never repeat the process; in practice, however, a 1:1 ratio between attempts and cycles appears unattainable.
6. Conclusions
This study developed a visual navigation system for indoor AGVs, and the experimental results demonstrate its robustness. The navigation system, built on a DAL architecture combining YOLOv4, SURF, kNN, and AL, was able to complete visual navigation tasks in an average of 0.36 min, 0.04 min faster than a system using YOLOv4 only. The speed of DAL is largely attributable to the robustness and localization accuracy with which SURF and kNN read each obstacle. DAL runs in cycles whose termination depends on the β setting. Of 107 navigation attempts, nine failed and 135 cycles were required; nevertheless, the system remained 0.04 min faster than YOLOv4. The experimental findings reveal that navigation with DAL allows AGVs to complete indoor driving tasks with an average success rate of 0.952. However, this research is still limited to conventional wheeled robots, even though industrial AGVs are dominated by mecanum or omni wheels. Future research will consider AGVs that use omni or mecanum wheels to perform more complex turning maneuvers with the same DAL, with the aim of proving that DAL can work optimally to minimize slippage.
Author Contributions
Conceptualized the research, M.-S.W.; methodology and software used, M.M. and A.A.; investigation and validation, M.-S.W.; formal analysis, M.M.; resources, A.A.; data curation, A.A. and M.M.; writing—preparing the original draft, M.M.; writing—reviewing and editing the manuscript, A.A., M.M. and M.-S.W.; supervised the analysis, reviewed the manuscript, M.-S.W.; visualization and project administration, A.A. and M.M.; funding acquisition, M.-S.W. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Higher Education Sprout Project of the Ministry of Education, Taiwan, and Ministry of Science and Technology, grant number NSTC 112-2221-E-218-014.
Informed Consent Statement
Not applicable.
Data Availability Statement
The data presented in this study are available in this article.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Digani, V.; Sabattini, L.; Secchi, C. A Probabilistic Eulerian Traffic Model for the Coordination of Multiple AGVs in Automatic Warehouses. IEEE Robot. Autom. Lett. 2016, 1, 26–32. [Google Scholar] [CrossRef]
- Liu, Y.; Ma, X.; Shu, L.; Hancke, G.P.; Abu-Mahfouz, A.M. From Industry 4.0 to Agriculture 4.0: Current Status, Enabling Technologies, and Research Challenges. IEEE Trans. Ind. Inform. 2021, 17, 4322–4334. [Google Scholar] [CrossRef]
- Santos, J.; Rebelo, P.M.; Rocha, L.F.; Costa, P.; Veiga, G. A* Based Routing and Scheduling Modules for Multiple AGVs in an Industrial Scenario. Robotics 2021, 10, 72. [Google Scholar] [CrossRef]
- Li, Z.; Liu, J.; Huang, Z.; Peng, Y.; Pu, H.; Ding, L. Adaptive Impedance Control of Human–Robot Cooperation Using Reinforcement Learning. IEEE Trans. Ind. Electron. 2017, 64, 8013–8022. [Google Scholar] [CrossRef]
- Liu, S.; Xiong, M.; Zhong, W.; Xiong, H. Towards Industrial Scenario Lane Detection: Vision-Based AGV Navigation Methods. In Proceedings of the 2020 IEEE International Conference on Mechatronics and Automation (ICMA), Beijing, China, 13–16 October 2020; pp. 1101–1106. [Google Scholar] [CrossRef]
- Yang, Z.; Yang, X.; Wu, L.; Hu, J.; Zou, B.; Zhang, Y.; Zhang, J. Pre-Inpainting Convolutional Skip Triple Attention Segmentation Network for AGV Lane Detection in Overexposure Environment. Appl. Sci. 2022, 12, 10675. [Google Scholar] [CrossRef]
- Matos, D.; Costa, P.; Lima, J.; Costa, P. Multi AGV Coordination Tolerant to Communication Failures. Robotics 2021, 10, 55. [Google Scholar] [CrossRef]
- Chowdhury, M.E.H.; Khandakar, A.; Ahmed, S.; Al-Khuzaei, F.; Hamdalla, J.; Haque, F.; Reaz, M.B.I.; Al Shafei, A.; Al-Emadi, N. Design, Construction and Testing of IoT Based Automated Indoor Vertical Hydroponics Farming Test-Bed in Qatar. Sensors 2020, 20, 5637. [Google Scholar] [CrossRef] [PubMed]
- Lottes, P.; Behley, J.; Milioto, A.; Stachniss, C. Fully Convolutional Networks with Sequential Information for Robust Crop and Weed Detection in Precision Farming. IEEE Robot. Autom. Lett. 2018, 3, 2870–2877. [Google Scholar] [CrossRef]
- Tokekar, P.; Hook, J.V.; Mulla, D.; Isler, V. Sensor Planning for a Symbiotic UAV and UGV System for Precision Agriculture. IEEE Trans. Robot. 2016, 32, 1498–1511. [Google Scholar] [CrossRef]
- Qadeer, N.; Shah, J.H.; Sharif, M.; Khan, M.A.; Muhammad, G.; Zhang, Y.-D. Intelligent Tracking of Mechanically Thrown Objects by Industrial Catching Robot for Automated In-Plant Logistics 4.0. Sensors 2022, 22, 2113. [Google Scholar] [CrossRef]
- Badrloo, S.; Varshosaz, M.; Pirasteh, S.; Li, J. Image-Based Obstacle Detection Methods for the Safe Navigation of Unmanned Vehicles: A Review. Remote Sens. 2022, 14, 3824. [Google Scholar] [CrossRef]
- Sheng, W.; Thobbi, A.; Gu, Y. An Integrated Framework for Human–Robot Collaborative Manipulation. IEEE Trans. Cybern. 2015, 45, 2030–2041. [Google Scholar] [CrossRef] [PubMed]
- Bozek, P.; Karavaev, Y.L.; Ardentov, A.A.; Yefremov, K.S. Neural network control of a wheeled mobile robot based on optimal trajectories. Int. J. Adv. Robot. Syst. 2020, 17, 172988142091607. [Google Scholar] [CrossRef]
- Urban, R.; Štroner, M.; Kuric, I. The use of onboard UAV GNSS navigation data for area and volume calculation. Acta Montan. Slovaca 2020, 25, 361–374. [Google Scholar] [CrossRef]
- Feng, S.; Sebastian, B.; Ben-Tzvi, P. A Collision Avoidance Method Based on Deep Reinforcement Learning. Robotics 2021, 10, 73. [Google Scholar] [CrossRef]
- Huang, X.; Wu, W.; Qiao, H.; Ji, Y. Brain-Inspired Motion Learning in Recurrent Neural Network With Emotion Modulation. IEEE Trans. Cogn. Dev. Syst. 2018, 10, 1153–1164. [Google Scholar] [CrossRef]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
- Caglayan, A.; Can, A.B. Volumetric Object Recognition Using 3-D CNNs on Depth Data. IEEE Access 2018, 6, 20058–20066. [Google Scholar] [CrossRef]
- Zhang, D.; Li, J.; Xiong, L.; Lin, L.; Ye, M.; Yang, S. Cycle-Consistent Domain Adaptive Faster RCNN. IEEE Access 2019, 7, 123903–123911. [Google Scholar] [CrossRef]
- Zhang, Y.; Song, C.; Zhang, D. Deep Learning-Based Object Detection Improvement for Tomato Disease. IEEE Access 2020, 8, 56607–56614. [Google Scholar] [CrossRef]
- Josef, S.; Degani, A. Deep Reinforcement Learning for Safe Local Planning of a Ground Vehicle in Unknown Rough Terrain. IEEE Robot. Autom. Lett. 2020, 5, 6748–6755. [Google Scholar] [CrossRef]
- Yang, H.; Chen, L.; Chen, M.; Ma, Z.; Deng, F.; Li, M.; Li, X. Tender Tea Shoots Recognition and Positioning for Picking Robot Using Improved YOLO-V3 Model. IEEE Access 2019, 7, 180998–181011. [Google Scholar] [CrossRef]
- Fang, W.; Wang, L.; Ren, P. Tinier-YOLO: A Real-Time Object Detection Method for Constrained Environments. IEEE Access 2020, 8, 1935–1944. [Google Scholar] [CrossRef]
- Divyanth, L.G.; Soni, P.; Pareek, C.M.; Machavaram, R.; Nadimi, M.; Paliwal, J. Detection of Coconut Clusters Based on Occlusion Condition Using Attention-Guided Faster R-CNN for Robotic Harvesting. Foods 2022, 11, 3903. [Google Scholar] [CrossRef] [PubMed]
- Du, Y.-C.; Muslikhin, M.; Hsieh, T.-H.; Wang, M.-S. Stereo Vision-Based Object Recognition and Manipulation by Regions with Convolutional Neural Network. Electronics 2020, 9, 210. [Google Scholar] [CrossRef]
- Cheng, R.; Wang, K.; Bai, J.; Xu, Z. Unifying Visual Localization and Scene Recognition for People With Visual Impairment. IEEE Access 2020, 8, 64284–64296. [Google Scholar] [CrossRef]
- Chalup, S.K.; Murch, C.L.; Quinlan, M.J. Machine Learning With AIBO Robots in the Four-Legged League of RoboCup. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2007, 37, 297–310. [Google Scholar] [CrossRef]
- Ali, M.A.H.; Baggash, M.; Rustamov, J.; Abdulghafor, R.; Abdo, N.A.-D.N.; Abdo, M.H.G.; Mohammed, T.S.; Hasan, A.A.; Abdo, A.N.; Turaev, S.; et al. An Automatic Visual Inspection of Oil Tanks Exterior Surface Using Unmanned Aerial Vehicle with Image Processing and Cascading Fuzzy Logic Algorithms. Drones 2023, 7, 133. [Google Scholar] [CrossRef]
- Semwal, A.; Lee, M.M.J.; Sanchez, D.; Teo, S.L.; Wang, B.; Mohan, R.E. Object-of-Interest Perception in a Reconfigurable Rolling-Crawling Robot. Sensors 2022, 22, 5214. [Google Scholar] [CrossRef]
- Singh, D. Fast-BoW: Scaling Bag-of-Visual-Words Generation. In Proceedings of the 2018 British Machine Vision Conference, Newcastle, UK, 2–6 September 2018. [Google Scholar]
- Feng, M.; Wang, Y.; Liu, J.; Zhang, L.; Zaki, H.F.M.; Mian, A. Benchmark Data Set and Method for Depth Estimation From Light Field Images. IEEE Trans. Image Process. 2018, 27, 3586–3598. [Google Scholar] [CrossRef]
- Dornaika, F.; Horaud, R. Simultaneous robot-world and hand-eye calibration. IEEE Trans. Robot. Autom. 1998, 14, 617–622. [Google Scholar] [CrossRef]
- Ibrahim, Y.; Wang, H.; Liu, J.; Wei, J.; Chen, L.; Rech, P.; Adam, K.; Guo, G. Soft errors in DNN accelerators: A comprehensive review. Microelectron. Reliab. 2020, 115, 113969. [Google Scholar] [CrossRef]
- Muslikhin; Horng, J.-R.; Yang, S.-Y.; Wang, M.-S. Self-Correction for Eye-In-Hand Robotic Grasping Using Action Learning. IEEE Access 2021, 9, 156422–156436. [Google Scholar] [CrossRef]
- Chen, P.-J.; Yang, S.-Y.; Chen, Y.-P.; Muslikhin, M.; Wang, M.-S. Slip Estimation and Compensation Control of Omnidirectional Wheeled Automated Guided Vehicle. Electronics 2021, 10, 840. [Google Scholar] [CrossRef]
- Adam, S.; Busoniu, L.; Babuska, R. Experience Replay for Real-Time Reinforcement Learning Control. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2012, 42, 201–212. [Google Scholar] [CrossRef]
- Sanchez, A.G.; Smart, W.D. Verifiable Surface Disinfection Using Ultraviolet Light with a Mobile Manipulation Robot. Technologies 2022, 10, 48. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

