Object Detection using Deep Learning for Autonomous Intelligent Robots

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (30 September 2019) | Viewed by 54416

Special Issue Editor


Prof. Dr. António J. R. Neves
Guest Editor
Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
Interests: image processing; signal processing; intelligent systems; robotics

Special Issue Information

Dear Colleagues,

Autonomous intelligent robots are dynamic systems consisting of an electronic controller coupled to a mechanical body, and they require an adequate sensory system to perceive the environment in which they operate. Digital cameras are currently among the most commonly used sensors.

As these robots move towards more complex environments and applications, image-understanding algorithms that allow more precise and detailed object recognition become crucial. In this context, in addition to classifying images, it is also necessary to precisely estimate the class and location of the objects contained within them, a problem known as object detection.

The most important advances in object detection have been achieved through improvements in object representation and machine learning models. In recent years, deep neural networks (DNNs) have emerged as a powerful machine learning model. These deep architectures can learn powerful object representations without the need to manually design features.

Deep learning is usually associated with high-complexity processing systems, which poses a challenge for its use in autonomous intelligent robots. However, recent advances in single-board computers and networks make it possible to run these technologies in real time on such systems.

The main aim of this Special Issue is to present novel approaches and results focusing on deep-learning approaches for the vision systems of intelligent robots. Contributions exploring either on-board implementations or distributed vision systems with modules running remotely from the robot are welcome.

Prof. Dr. António J. R. Neves
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Deep learning
  • Neural networks
  • Object detection
  • Image processing
  • Real-time systems
  • Autonomous robots
  • Intelligent robots

Published Papers (6 papers)


Research

Jump to: Review

13 pages, 2488 KiB  
Article
FFESSD: An Accurate and Efficient Single-Shot Detector for Target Detection
by Wenxu Shi, Shengli Bao and Dailun Tan
Appl. Sci. 2019, 9(20), 4276; https://doi.org/10.3390/app9204276 - 12 Oct 2019
Cited by 25 | Viewed by 5209
Abstract
The Single Shot MultiBox Detector (SSD) is one of the fastest algorithms in the current target detection field. It achieves good detection results but suffers from problems such as poor feature extraction in shallow layers and loss of features in deep layers. In this paper, we propose an accurate and efficient target detection method, named Single Shot Object Detection with Feature Enhancement and Fusion (FFESSD), which enhances and exploits the shallow and deep features in the feature pyramid structure of the SSD algorithm. To achieve this, we introduce a Feature Fusion Module and two Feature Enhancement Modules and integrate them into the conventional SSD structure. Experimental results on the PASCAL VOC 2007 dataset demonstrate that FFESSD achieves 79.1% mean average precision (mAP) at 54.3 frames per second (FPS) with a 300 × 300 input, while FFESSD with a 512 × 512 input achieves 81.8% mAP at 30.2 FPS. The proposed network shows state-of-the-art mAP, better than the conventional SSD, Deconvolutional Single Shot Detector (DSSD), Feature-Fusion SSD (FSSD), and other advanced detectors. In extended experiments, FFESSD also outperformed the conventional SSD in fuzzy target detection.
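Editor's note: the abstract describes fusing shallow (high-resolution) and deep (low-resolution) feature maps in an SSD-style pyramid. The following PyTorch snippet is a minimal, hypothetical sketch of such a fusion step, not the authors' exact Feature Fusion or Feature Enhancement Modules; layer widths and sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleFeatureFusion(nn.Module):
    """Toy fusion of a shallow and a deep SSD feature map: upsample the deep
    map, project both to a common width, sum, and refine. Only illustrates the
    general idea behind modules such as FFESSD's Feature Fusion Module."""
    def __init__(self, shallow_ch, deep_ch, out_ch=256):
        super().__init__()
        self.shallow_proj = nn.Conv2d(shallow_ch, out_ch, kernel_size=1)
        self.deep_proj = nn.Conv2d(deep_ch, out_ch, kernel_size=1)
        self.refine = nn.Sequential(
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, shallow, deep):
        # Bring the deep map up to the shallow map's spatial resolution.
        deep_up = F.interpolate(deep, size=shallow.shape[-2:],
                                mode="bilinear", align_corners=False)
        fused = self.shallow_proj(shallow) + self.deep_proj(deep_up)
        return self.refine(fused)

# Example: fuse a 38x38 shallow map with a 19x19 deep map (SSD300-like sizes).
fusion = SimpleFeatureFusion(shallow_ch=512, deep_ch=1024)
out = fusion(torch.randn(1, 512, 38, 38), torch.randn(1, 1024, 19, 19))
print(out.shape)  # torch.Size([1, 256, 38, 38])
```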

21 pages, 9407 KiB  
Article
Autonomous Robotics for Identification and Management of Invasive Aquatic Plant Species
by Maharshi Patel, Shaphan Jernigan, Rob Richardson, Scott Ferguson and Gregory Buckner
Appl. Sci. 2019, 9(12), 2410; https://doi.org/10.3390/app9122410 - 13 Jun 2019
Cited by 9 | Viewed by 5242
Abstract
Invasive aquatic plant species can expand rapidly throughout water bodies and cause severely adverse economic and ecological impacts. While mechanical, chemical, and biological methods exist for the identification and treatment of these invasive species, they are labor-intensive, inefficient, costly, and can cause collateral ecological damage. To address current deficiencies in aquatic weed management, this paper details the development of a small fleet of fully autonomous boats capable of subsurface hydroacoustic imaging (to scan aquatic vegetation), machine learning (for automated weed identification), and herbicide deployment (for vegetation control). These capabilities aim to minimize manual labor and provide more efficient, safe (reduced chemical exposure to personnel), and timely weed management. Geotagged hydroacoustic imagery of three aquatic plant varieties (Hydrilla, Cabomba, and Coontail) was collected and used to create a software pipeline for subsurface aquatic weed classification and distribution mapping. Employing deep learning, the novel software achieved a classification accuracy of 99.06% after training.
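Editor's note: the core of the classification pipeline is a deep network that assigns hydroacoustic image patches to one of three plant classes. The sketch below is a hypothetical three-class CNN classifier under the assumption of single-channel sonar patches; the architecture in the paper may differ substantially.

```python
import torch
import torch.nn as nn

class WeedClassifier(nn.Module):
    """Hypothetical three-class CNN (Hydrilla, Cabomba, Coontail) for
    single-channel hydroacoustic patches; illustrative only."""
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = WeedClassifier()
logits = model(torch.randn(4, 1, 128, 128))  # batch of sonar patches
print(logits.shape)  # torch.Size([4, 3])
```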

17 pages, 666 KiB  
Article
A YOLOv2 Convolutional Neural Network-Based Human–Machine Interface for the Control of Assistive Robotic Manipulators
by Gianluca Giuffrida, Gabriele Meoni and Luca Fanucci
Appl. Sci. 2019, 9(11), 2243; https://doi.org/10.3390/app9112243 - 31 May 2019
Cited by 11 | Viewed by 3168
Abstract
In recent years, the mobility of people with upper limb disabilities who are confined to power wheelchairs has been enhanced by robotic arms. Although modern manipulators offer many functionalities, some users cannot exploit all of them due to reduced manual skills, even when they are able to drive the wheelchair through a suitable Human–Machine Interface (HMI). To address this, this work proposes a low-cost manipulator that performs only simple tasks and can be controlled by three different graphical HMIs. These are empowered by a You Only Look Once (YOLO) v2 Convolutional Neural Network that analyzes the video stream generated by a camera placed on the robotic arm end-effector and recognizes the objects with which the user can interact. Such objects are shown to the user in the HMI surrounded by a bounding box. When the user selects one of the recognized objects, the target position information is exploited by an automatic closed-feedback algorithm that leads the manipulator to perform the desired task autonomously. A test procedure showed that the accuracy in reaching the desired target is 78%. The produced HMIs were appreciated by different user categories, obtaining a mean score of 8.13/10.
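Editor's note: the paper's HMI presents the detector's bounding boxes as selectable targets. A minimal, hypothetical sketch of that selection logic (generic detection records, illustrative confidence threshold, not the authors' code) could look like:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Detection:
    label: str
    confidence: float
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels

def selectable_objects(detections: List[Detection], min_conf: float = 0.5) -> List[Detection]:
    """Keep only detections confident enough to be offered to the user in the HMI."""
    return [d for d in detections if d.confidence >= min_conf]

def target_from_selection(det: Detection) -> Tuple[float, float]:
    """Pixel centre of the selected bounding box; in the paper, target position
    information of this kind drives the manipulator's automatic feedback loop."""
    x_min, y_min, x_max, y_max = det.box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)

dets = [Detection("cup", 0.91, (120, 80, 220, 200)),
        Detection("phone", 0.42, (300, 60, 360, 180))]
shown = selectable_objects(dets)
print(target_from_selection(shown[0]))  # (170.0, 140.0)
```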

13 pages, 2581 KiB  
Article
Quality Index of Supervised Data for Convolutional Neural Network-Based Localization
by Seigo Ito, Mineki Soga, Shigeyoshi Hiratsuka, Hiroyuki Matsubara and Masaru Ogawa
Appl. Sci. 2019, 9(10), 1983; https://doi.org/10.3390/app9101983 - 15 May 2019
Cited by 6 | Viewed by 19423
Abstract
Automated guided vehicles (AGVs) are important in modern factories. The main functions of an AGV are its own localization and object detection, for which both the sensor and the localization method are crucial. For localization, we used a small imaging sensor called a single-photon avalanche diode (SPAD) light detection and ranging (LiDAR) sensor, which uses the time-of-flight principle and arrays of SPADs. The SPAD LiDAR works both indoors and outdoors and is suitable for AGV applications. We utilized a deep convolutional neural network (CNN) as the localization method. For accurate CNN-based localization, the quality of the supervised data is important: the localization results are poor or good depending on whether the supervised training data are noisy or clean. To address this issue, we propose a quality index for supervised data based on correlations between consecutive frames, visualizing the important pixels for CNN-based localization. First, the important pixels for CNN-based localization are determined, and the quality index of the supervised data is defined based on differences in these pixels. We evaluated the quality index in indoor-environment localization using the SPAD LiDAR and compared the localization performance. Our results demonstrate that the index correlates well with the quality of supervised training data for CNN-based localization.
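Editor's note: the abstract defines a quality index from the consistency of "important pixels" across consecutive frames. The snippet below is a hedged, hypothetical version of such an index, using the mean Pearson correlation of per-frame importance maps; the paper's exact definition may differ.

```python
import numpy as np

def frame_consistency_index(importance_maps: np.ndarray) -> float:
    """Hypothetical quality index: mean correlation between the important-pixel
    maps of consecutive frames. High values suggest the pixels the CNN relies on
    (and hence the supervised data) are consistent from frame to frame."""
    scores = []
    for a, b in zip(importance_maps[:-1], importance_maps[1:]):
        scores.append(np.corrcoef(a.ravel(), b.ravel())[0, 1])
    return float(np.mean(scores))

# Example with synthetic importance maps (e.g., saliency of SPAD LiDAR frames).
maps = np.random.rand(3, 32, 32)
print(frame_consistency_index(maps))
```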

20 pages, 5805 KiB  
Article
Estimation of the Lateral Distance between Vehicle and Lanes Using Convolutional Neural Network and Vehicle Dynamics
by Xiang Zhang, Wei Yang, Xiaolin Tang and Zhonghua He
Appl. Sci. 2018, 8(12), 2508; https://doi.org/10.3390/app8122508 - 06 Dec 2018
Cited by 7 | Viewed by 2524
Abstract
To obtain an accurate lateral distance between the vehicle and the lane boundaries during road tests of Lane Departure Warning and Lane Keeping Assist systems, this study proposes a recognition model, called LatDisLanes, that estimates the distance directly by training a deep neural network. The model obtains the distance from two downward-facing cameras without data pre-processing or post-processing. The recognition accuracy is, however, disturbed by the inclination angle of the vehicle; this bias is reduced with a proposed dynamic correction model. Furthermore, since training the model requires a large number of labeled images, an image synthesis algorithm based on Image Quilting is proposed. Experiments on the test data set show that the accuracy of LatDisLanes is 94.78% and 99.94% for allowable errors of 0.46 cm and 2.3 cm, respectively, when the vehicle runs smoothly. Larger errors occur when the inclination angle exceeds 3°, but they can be reduced by the proposed dynamic correction model.
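Editor's note: the abstract attributes the residual bias to the vehicle's inclination (roll) angle. As a purely illustrative toy model, and not the paper's dynamic correction, one first-order geometric adjustment assumes a camera at height h above the road whose ground point shifts sideways by roughly h·tan(roll) when the body rolls:

```python
import math

def corrected_lateral_distance(measured_cm: float, roll_deg: float,
                               camera_height_cm: float) -> float:
    """Toy first-order correction: subtract the sideways shift h*tan(roll) of the
    ground point seen by a down-facing camera mounted at height h. Illustrative
    only; the paper uses a dynamic correction model based on vehicle dynamics."""
    return measured_cm - camera_height_cm * math.tan(math.radians(roll_deg))

# A 3-degree roll with a camera 30 cm above the road shifts the reading ~1.6 cm.
print(corrected_lateral_distance(measured_cm=35.0, roll_deg=3.0, camera_height_cm=30.0))
```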

Review

Jump to: Research

31 pages, 3580 KiB  
Review
Facial Expression Recognition Using Computer Vision: A Systematic Review
by Daniel Canedo and António J. R. Neves
Appl. Sci. 2019, 9(21), 4678; https://doi.org/10.3390/app9214678 - 02 Nov 2019
Cited by 89 | Viewed by 18090
Abstract
Emotion recognition has attracted major attention in numerous fields because of its relevant applications in the contemporary world: marketing, psychology, surveillance, and entertainment are some examples. An emotion can be recognized in several ways; however, this paper focuses on facial expressions and presents a systematic review on the matter. In total, 112 papers published in ACM, IEEE, BASE and Springer between January 2006 and April 2019 regarding this topic were extensively reviewed. The most frequently used methods and algorithms are first introduced and summarized for better understanding, including face detection, smoothing, Principal Component Analysis (PCA), Local Binary Patterns (LBP), Optical Flow (OF), and Gabor filters, among others. This review identified a clear difficulty in translating the high facial expression recognition (FER) accuracy obtained in controlled environments to uncontrolled and pose-variant environments. Future efforts in the FER field should be directed towards multimodal systems that are robust enough to face the adversities of real-world scenarios. A thorough analysis of the research done on FER in computer vision, based on the selected papers, is presented. This review aims not only to become a reference for future research on emotion recognition, but also to provide an overview of the work done on this topic for potential readers.
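Editor's note: among the hand-crafted descriptors the review surveys, Local Binary Patterns are one of the most common. A minimal sketch of extracting a uniform-LBP histogram from a grayscale face crop with scikit-image follows; the parameter values are illustrative defaults, not prescriptions from the review.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(face_gray: np.ndarray, points: int = 8, radius: float = 1.0) -> np.ndarray:
    """Uniform LBP histogram of a grayscale face crop - one of the classic
    hand-crafted FER descriptors discussed in the review."""
    lbp = local_binary_pattern(face_gray, points, radius, method="uniform")
    n_bins = points + 2  # uniform patterns plus one "non-uniform" bin
    hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins), density=True)
    return hist

face = (np.random.rand(64, 64) * 255).astype(np.uint8)  # stand-in for a detected face
print(lbp_histogram(face).shape)  # (10,)
```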
