Vision Sensors for Object Detection and Recognition

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Physical Sensors".

Deadline for manuscript submissions: closed (30 April 2023) | Viewed by 28593

Special Issue Editor


Dr. Chih-Yang Lin
Guest Editor
Department of Mechanical Engineering, National Central University, Taoyuan 320317, Taiwan
Interests: computer vision; image processing; machine learning

Special Issue Information

Dear Colleagues,

Thanks to advances in deep learning and AI, object detection and recognition have been applied rapidly and successfully. These results lay a strong foundation for advanced applications such as computer vision technology, object tracking, and video understanding. In addition, the metaverse is creating a new trend for object detection and recognition: VR, AR, and MR environments are expected to become increasingly popular, and new vision technology is key to their success. This Special Issue of Sensors aims to bring together leading academic research on advanced object detection and recognition methods. It also provides a platform for discussing open concerns, the practical challenges encountered, and the solutions adopted.

The topics of interest for the Special Issue include, but are not limited to, the following:

  • Object detection and recognition in virtual reality, augmented reality, and mixed reality;
  • New deep-learning neural networks for object detection and recognition;
  • New practical applications for object detection and recognition;
  • Sensor and sensing technologies for object detection and recognition;
  • Fusion of multiple vision sensors for object detection and recognition;
  • Challenges and solutions of object detection and recognition;
  • 3D object detection and recognition;
  • Novel ideas and frameworks for developing object detection and recognition systems.

Dr. Chih-Yang Lin
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, authors can proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for the submission of manuscripts are available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • object detection
  • object recognition
  • deep learning
  • vision sensors
  • 3D images
  • virtual reality, augmented reality, and mixed reality
  • metaverse

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found on the MDPI website.

Published Papers (9 papers)


Research


18 pages, 11048 KiB  
Article
Smart Task Assistance in Mixed Reality for Astronauts
by Qingwei Sun, Wei Chen, Jiangang Chao, Wanhong Lin, Zhenying Xu and Ruizhi Cao
Sensors 2023, 23(9), 4344; https://doi.org/10.3390/s23094344 - 27 Apr 2023
Viewed by 1387
Abstract
Mixed reality (MR) registers virtual information and real objects and is an effective way to supplement astronaut training. Spatial anchors are generally used to perform virtual–real fusion in static scenes but cannot handle movable objects. To address this issue, we propose a smart task assistance method based on object detection and point cloud alignment. Specifically, both fixed and movable objects are detected automatically. In parallel, poses are estimated with no dependence on preset spatial position information. First, YOLOv5s is used to detect the object and segment the point cloud of the corresponding structure, called the partial point cloud. Then, an iterative closest point (ICP) algorithm between the partial point cloud and the template point cloud is used to calculate the object’s pose and execute the virtual–real fusion. The results demonstrate that the proposed method achieves automatic pose estimation for both fixed and movable objects without background information and preset spatial anchors. Most volunteers reported that our approach was practical, and it thus expands the application of astronaut training.
(This article belongs to the Special Issue Vision Sensors for Object Detection and Recognition)
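The alignment step described in the abstract above can be illustrated with a toy iterative closest point (ICP) loop. The sketch below is a simplified 2D, pure-Python illustration and not the authors' implementation: the function names (`best_rigid_2d`, `apply_transform`, `icp_2d`), the point sets, and the 2D simplification are assumptions made for this example, whereas the paper works with 3D point clouds.

```python
import math


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired points src onto dst."""
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n
    cdy = sum(p[1] for p in dst) / n
    s_dot = 0.0
    s_cross = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - csx, sy - csy  # centred source point
        bx, by = dx - cdx, dy - cdy  # centred target point
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty


def apply_transform(points, theta, tx, ty):
    """Rotate points by theta (radians) about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(partial, template, iterations=20, tol=1e-12):
    """Align partial to template; return the overall (theta, tx, ty) pose estimate."""
    current = list(partial)
    for _ in range(iterations):
        # Correspondence step: brute-force nearest template point for each point.
        matched = [
            min(template, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
            for p in current
        ]
        # Alignment step: closed-form rigid fit to the current correspondences.
        theta, tx, ty = best_rigid_2d(current, matched)
        current = apply_transform(current, theta, tx, ty)
        if abs(theta) < tol and math.hypot(tx, ty) < tol:
            break
    # The accumulated motion is itself rigid, so one more fit recovers it.
    return best_rigid_2d(partial, current)


# Hypothetical L-shaped template and a slightly rotated, shifted observation of it.
template = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
partial = apply_transform(template, math.radians(5), 0.1, -0.05)
theta, tx, ty = icp_2d(partial, template)
print(f"estimated rotation: {math.degrees(theta):.2f} degrees")
```

The brute-force nearest-neighbour search keeps the example short; a practical 3D pipeline would typically rely on a spatial index and a point cloud library instead.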

15 pages, 3079 KiB  
Article
Dual-View Single-Shot Multibox Detector at Urban Intersections: Settings and Performance Evaluation
by Marta Lenatti, Sara Narteni, Alessia Paglialonga, Vittorio Rampa and Maurizio Mongelli
Sensors 2023, 23(6), 3195; https://doi.org/10.3390/s23063195 - 16 Mar 2023
Cited by 2 | Viewed by 1657
Abstract
The explosion of artificial intelligence methods has paved the way for more sophisticated smart mobility solutions. In this work, we present a multi-camera video content analysis (VCA) system that exploits a single-shot multibox detector (SSD) network to detect vehicles, riders, and pedestrians and triggers alerts to drivers of public transportation vehicles approaching the surveilled area. The evaluation of the VCA system addresses both detection and alert generation performance by combining visual and quantitative approaches. Starting from an SSD model trained for a single camera, we added a second camera with a different field of view (FOV) to improve the accuracy and reliability of the system. Due to real-time constraints, the complexity of the VCA system must be limited, thus calling for a simple multi-view fusion method. According to the experimental test-bed, the use of two cameras achieves a better balance between precision (68%) and recall (84%) than the use of a single camera (i.e., 62% precision and 86% recall). In addition, a system evaluation in temporal terms is provided, showing that missed alerts (false negatives) and wrong alerts (false positives) are typically transitory events. Therefore, adding spatial and temporal redundancy increases the overall reliability of the VCA system.
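One common way to make the "balance between precision and recall" concrete is the F1 score, the harmonic mean of the two. The short sketch below applies it to the figures quoted in the abstract; the choice of F1 and the function name `f1_score` are assumptions for illustration, since the abstract does not state which summary metric the authors used.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values quoted in the abstract.
dual_camera = f1_score(0.68, 0.84)
single_camera = f1_score(0.62, 0.86)

print(f"dual-camera F1:   {dual_camera:.3f}")    # 0.752
print(f"single-camera F1: {single_camera:.3f}")  # 0.721
```

Under this metric the two-camera setup scores higher, because its gain in precision outweighs its small loss in recall.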

14 pages, 4954 KiB  
Article
Visual SLAM for Dynamic Environments Based on Object Detection and Optical Flow for Dynamic Object Removal
by Charalambos Theodorou, Vladan Velisavljevic and Vladimir Dyo
Sensors 2022, 22(19), 7553; https://doi.org/10.3390/s22197553 - 5 Oct 2022
Cited by 12 | Viewed by 3460
Abstract
For a Visual Simultaneous Localization and Mapping (vSLAM) system to operate in dynamic indoor environments, moving objects should be considered because they can affect the stability of the system’s visual odometry and the accuracy of its position estimation. vSLAM can use feature points or a sequence of images as its only source of input to perform localization while simultaneously creating a map of the environment. This paper proposes a vSLAM system based on ORB-SLAM3 and YOLOR. The proposed system, in combination with an object detection model (YOLOX) applied to extracted feature points, is capable of achieving 2–4% better accuracy compared to VPS-SLAM and DS-SLAM. Static feature points such as signs and benches were used to calculate the camera position, and dynamic moving objects were eliminated by using the tracking thread. A custom dataset was used to validate and evaluate the proposed method; it includes indoor and outdoor RGB-D pictures of train stations with dynamic objects and a high density of people, ground truth data, sequence data, video recordings of the train stations, and X, Y, Z data. The results show that ORB-SLAM3 with YOLOR as object detection achieves 89.54% accuracy in dynamic indoor environments compared to previous systems such as VPS-SLAM.
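The idea of keeping static feature points and discarding those on moving objects can be sketched as a simple filter over detector output. This is a hedged illustration only, not the authors' pipeline: the function name `filter_static_keypoints`, the `(label, x1, y1, x2, y2)` detection format, and the choice of which labels count as dynamic are all assumptions.

```python
def filter_static_keypoints(keypoints, detections, dynamic_labels):
    """Keep keypoints that do not fall inside any bounding box of a dynamic class.

    keypoints:      list of (x, y) pixel coordinates
    detections:     list of (label, x1, y1, x2, y2) boxes from an object detector
    dynamic_labels: set of class names treated as moving objects
    """
    static = []
    for x, y in keypoints:
        on_dynamic_object = any(
            label in dynamic_labels and x1 <= x <= x2 and y1 <= y <= y2
            for label, x1, y1, x2, y2 in detections
        )
        if not on_dynamic_object:
            static.append((x, y))
    return static


# Hypothetical frame: one person (dynamic) and one bench (static).
keypoints = [(10, 10), (50, 50), (200, 120)]
detections = [("person", 40, 40, 100, 100), ("bench", 180, 100, 260, 160)]

print(filter_static_keypoints(keypoints, detections, {"person"}))
# [(10, 10), (200, 120)]
```

Only the surviving points would then be passed to pose estimation; a real system might additionally check motion consistency, for example with optical flow as the paper's title suggests.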

18 pages, 3191 KiB  
Article
Novel Spatio-Temporal Continuous Sign Language Recognition Using an Attentive Multi-Feature Network
by Wisnu Aditya, Timothy K. Shih, Tipajin Thaipisutikul, Arda Satata Fitriajie, Munkhjargal Gochoo, Fitri Utaminingrum and Chih-Yang Lin
Sensors 2022, 22(17), 6452; https://doi.org/10.3390/s22176452 - 26 Aug 2022
Cited by 11 | Viewed by 2444
Abstract
Given video streams, we aim to correctly detect unsegmented signs related to continuous sign language recognition (CSLR). Despite the increase in proposed deep learning methods in this area, most of them mainly focus on using only an RGB feature, either the full-frame image or details of hands and face. The scarcity of information for the CSLR training process heavily constrains the capability to learn multiple features using the video input frames. Moreover, exploiting all frames in a video for the CSLR task could lead to suboptimal performance since each frame contains a different level of information, including main features in the inferencing of noise. Therefore, we propose a novel spatio-temporal continuous sign language recognition method using an attentive multi-feature network to enhance CSLR by providing extra keypoint features. In addition, we exploit the attention layer in the spatial and temporal modules to simultaneously emphasize multiple important features. Experimental results on both CSLR datasets demonstrate that the proposed method achieves superior performance in comparison with current state-of-the-art methods by 0.76 and 20.56 for the WER score on the CSL and PHOENIX datasets, respectively.
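The WER (word error rate) reported above is conventionally defined as the number of substitutions, deletions, and insertions needed to turn the predicted sequence into the reference, divided by the reference length. A minimal sketch of that standard definition follows; the function name `word_error_rate` and the example gloss sequences are invented for illustration and do not come from the paper's datasets.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between hypothesis and reference, divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between the first i-1 reference tokens
    # and the first j hypothesis tokens (rolling-row dynamic programming).
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_tok == hyp_tok else 1)
            deletion = prev[j] + 1       # reference token missing from hypothesis
            insertion = curr[j - 1] + 1  # extra token in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical gloss sequences: one substituted sign out of four.
print(word_error_rate("I GO SCHOOL TOMORROW", "I GO HOME TOMORROW"))  # 0.25
```

Lower values are better, and because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.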

23 pages, 13342 KiB  
Article
Optimisation of Deep Learning Small-Object Detectors with Novel Explainable Verification
by Elhassan Mohamed, Konstantinos Sirlantzis, Gareth Howells and Sanaul Hoque
Sensors 2022, 22(15), 5596; https://doi.org/10.3390/s22155596 - 26 Jul 2022
Cited by 2 | Viewed by 1993
Abstract
In this paper, we present a novel methodology based on machine learning for identifying the most appropriate detector from a set of available state-of-the-art object detectors for a given application. Our particular interest is to develop a road map for identifying verifiably optimal selections, especially for challenging applications such as detecting small objects in a mixed-size object dataset. State-of-the-art object detection systems often find the localisation of small-size objects challenging since most are usually trained on large-size objects. These contain abundant information as they occupy a large number of pixels relative to the total image size. This fact is normally exploited by the model during training and inference processes. To dissect and understand this process, our approach systematically examines detectors’ performances using two very distinct deep convolutional networks. The first is the single-stage YOLO V3 and the second is the double-stage Faster R-CNN. Specifically, our proposed method explores and visually illustrates the impact of feature extraction layers, number of anchor boxes, data augmentation, etc., utilising ideas from the field of explainable Artificial Intelligence (XAI). Our results, for example, show that multi-head YOLO V3 detectors trained using augmented data produce better performance even with fewer anchor boxes. Moreover, robustness regarding the detector’s ability to explain how a specific decision was reached is investigated using different explanation techniques. Finally, two new visualisation techniques are proposed, WS-Grad and Concat-Grad, for identifying explanation cues of different detectors. These are applied to specific object detection tasks to illustrate their reliability and transparency with respect to the decision process. It is shown that the proposed techniques can produce high-resolution and comprehensive heatmaps of the image areas that significantly affect detector decisions, compared with the state-of-the-art techniques tested.

16 pages, 11457 KiB  
Article
A Hierarchical Approach for Traffic Sign Recognition Based on Shape Detection and Image Classification
by Eric Hsueh-Chan Lu, Michal Gozdzikiewicz, Kuei-Hua Chang and Jing-Mei Ciou
Sensors 2022, 22(13), 4768; https://doi.org/10.3390/s22134768 - 24 Jun 2022
Cited by 10 | Viewed by 2646
Abstract
In recent years, the development of self-driving cars and their inclusion in our daily life has rapidly transformed from an idea into a reality. One of the main issues that autonomous vehicles must face is the problem of traffic sign detection and recognition. Most works focusing on this problem utilize a two-phase approach. However, a fast-moving car has to quickly detect the sign as seen by humans and recognize the image it contains. In this paper, we chose to utilize two different solutions to solve the tasks of detection and classification separately and compare the results of our method with a novel state-of-the-art detector, YOLOv5. Our approach utilizes the Mask R-CNN deep learning model in the first phase, which aims to detect traffic signs based on their shapes. The second phase uses the Xception model for the task of traffic sign classification. The dataset used in this work is a manually collected dataset of 11,074 Taiwanese traffic signs collected using mobile phone cameras and a GoPro camera mounted inside a car. It consists of 23 classes divided into 3 subclasses based on their shape. The conducted experiments utilized both versions of the dataset, class-based and shape-based. The experimental results show that precision, recall, and mAP can be significantly improved with our proposed approach.

18 pages, 5770 KiB  
Article
Enhancing Precision with an Ensemble Generative Adversarial Network for Steel Surface Defect Detectors (EnsGAN-SDD)
by Fityanul Akhyar, Elvin Nur Furqon and Chih-Yang Lin
Sensors 2022, 22(11), 4257; https://doi.org/10.3390/s22114257 - 2 Jun 2022
Cited by 6 | Viewed by 2492
Abstract
Defects are the primary problem affecting steel product quality in the steel industry. The specific challenges in developing defect detectors involve the vagueness and tiny size of defects. To solve these problems, we propose incorporating a super-resolution technique, a sequential feature pyramid network, and boundary localization. Initially, the ensemble of enhanced super-resolution generative adversarial networks (ESRGAN) was proposed for the preprocessing stage to generate a more detailed contour of the original steel image. Next, in the detector section, the latest state-of-the-art feature pyramid network, known as DetectoRS, utilized the recursive feature pyramid network technique to extract deeper multi-scale steel features by learning the feedback from the sequential feature pyramid network. Finally, Side-Aware Boundary Localization was used to precisely generate the output prediction of the defect detectors. We named our approach EnsGAN-SDD. Extensive experimental studies showed that the proposed methods improved the defect detector’s performance, which also surpassed the accuracy of state-of-the-art methods. Moreover, the proposed EnsGAN achieved better performance and effectiveness in processing time compared with the original ESRGAN. We believe our innovation could significantly contribute to improved production quality in the steel industry.

18 pages, 3892 KiB  
Article
A Deep Learning Method for Foot Progression Angle Detection in Plantar Pressure Images
by Peter Ardhianto, Raden Bagus Reinaldy Subiakto, Chih-Yang Lin, Yih-Kuen Jan, Ben-Yi Liau, Jen-Yung Tsai, Veit Babak Hamun Akbari and Chi-Wen Lung
Sensors 2022, 22(7), 2786; https://doi.org/10.3390/s22072786 - 5 Apr 2022
Cited by 16 | Viewed by 6884
Abstract
Foot progression angle (FPA) analysis is one of the core methods to detect gait pathologies as basic information to prevent foot injury from excessive in-toeing and out-toeing. Deep learning-based object detection can assist in measuring the FPA through plantar pressure images. This study aims to establish a precision model for determining the FPA. The precision detection of FPA can provide information on in-toeing, out-toeing, and rearfoot kinematics to evaluate the effect of physical therapy programs on knee pain and knee osteoarthritis. We analyzed a total of 1424 plantar images with three different You Only Look Once (YOLO) networks: YOLO v3, v4, and v5x, to obtain a suitable model for FPA detection. YOLOv4 showed higher performance of the profile-box, with average precision in the left foot of 100.00% and the right foot of 99.78%, respectively. In addition, in detecting the foot angle-box, the ground truth showed results similar to YOLOv4 (5.58 ± 0.10° vs. 5.86 ± 0.09°, p = 0.013). In contrast, there was a significant difference in FPA between ground truth vs. YOLOv3 (5.58 ± 0.10° vs. 6.07 ± 0.06°, p < 0.001), and ground truth vs. YOLOv5x (5.58 ± 0.10° vs. 6.75 ± 0.06°, p < 0.001). This result implies that deep learning with YOLOv4 can enhance the detection of FPA.

Review


26 pages, 1033 KiB  
Review
Review of IoT Sensor Systems Used for Monitoring the Road Infrastructure
by Kristian Micko, Peter Papcun and Iveta Zolotova
Sensors 2023, 23(9), 4469; https://doi.org/10.3390/s23094469 - 4 May 2023
Cited by 5 | Viewed by 3400
Abstract
An intelligent transportation system is one of the fundamental goals of the smart city concept. The Internet of Things (IoT) concept is a basic instrument to digitalize and automate the process in the intelligent transportation system. Digitalization via the IoT concept enables the automatic collection of data usable for management in the transportation system. The IoT concept includes a system of sensors, actuators, control units and computational distribution among the edge, fog and cloud layers. The study proposes a taxonomy of sensors used for monitoring tasks based on motion detection and object tracking in intelligent transportation system tasks. The sensor taxonomy helps to categorize the sensors based on working principles, installation or maintenance methods and other categories. The sensor categorization enables us to compare the effectiveness of each sensor system. Monitoring tasks are analyzed, categorized, and solved in intelligent transportation systems based on a literature review and focusing on motion detection and object tracking methods. A literature survey of sensor systems used for monitoring tasks in the intelligent transportation system was performed according to sensor and monitoring task categorization. In this review, we analyzed the achieved results to measure, sense, or classify events in intelligent transportation system monitoring tasks. The review conclusions were used to propose an architecture of the universal sensor system for common monitoring tasks based on motion detection and object tracking methods in intelligent transportation tasks. The proposed architecture was built and tested for the first experimental results in the case study scenario. Finally, we propose methods that could significantly improve the results in the following research.
