
Object Recognition with Vision Sensors Based on Machine Learning and Deep Learning

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: closed (5 March 2025) | Viewed by 15664

Special Issue Editors


Dr. Meng Yang
Guest Editor
School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou 510006, China
Interests: image understanding; machine learning; language understanding; open-world semantic understanding

Dr. Jianjun Qian
Guest Editor
School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
Interests: pattern recognition; computer vision

Special Issue Information

Dear Colleagues,

As vision sensors have been deployed on a massive scale, object recognition has achieved significant success thanks to visual big data and advanced machine learning methods (e.g., deep learning). However, several challenges remain, e.g., the closed-world recognition assumption, the requirement for large-scale labeled images, the difficulty of few-shot learning, and poor model interpretability. This Special Issue aims to address object recognition methods with vision sensors designed to meet these challenges.

Dr. Meng Yang
Dr. Jianjun Qian
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • open-world object recognition
  • semi-supervised object recognition
  • weakly supervised object recognition
  • unsupervised object recognition
  • few-shot object recognition
  • interpretable object recognition
  • object recognition based on deep learning
  • object recognition based on machine learning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (6 papers)


Research

18 pages, 4465 KiB  
Article
HYFF-CB: Hybrid Feature Fusion Visual Model for Cargo Boxes
by Juedong Li, Kaifan Yang, Cheng Qiu, Lubin Wang, Yujia Cai, Hailan Wei, Qiang Yu and Peng Huang
Sensors 2025, 25(6), 1865; https://doi.org/10.3390/s25061865 - 17 Mar 2025
Viewed by 194
Abstract
In automatic loading and unloading systems, it is crucial to accurately detect the locations of boxes inside trucks in real time. However, existing methods for box detection have multiple shortcomings and can hardly meet the strict requirements of actual production. When the truck environment is complex, the common models based on convolutional neural networks show limitations in practical box detection: for example, they fail to handle size inconsistency and occlusion of boxes effectively, which reduces detection accuracy. These problems seriously restrict the performance and reliability of automatic loading and unloading systems, preventing them from achieving the desired detection accuracy, speed, and adaptability. There is therefore an urgent need for a new, more effective box detection method. To this end, this paper proposes a new model, HYFF-CB, which incorporates key technologies such as a location attention mechanism, a fusion-enhanced pyramid structure, and a synergistic weighted loss system. After real-time images of a truck are captured by an industrial camera, the HYFF-CB model detects the boxes in the truck, accurately determining their stacking locations and quantity. In rigorous testing against existing models, HYFF-CB showed clear advantages in detection rate. With detection performance that fully meets the application requirements of automatic loading and unloading systems, the HYFF-CB model adapts well to the varied and complex scenarios of automatic loading and unloading.
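The abstract names three components but gives no implementation detail, so the following is only a rough PyTorch sketch of what a location-attention gate combined with top-down pyramid fusion can look like; all module names, layer choices, and sizes are illustrative assumptions, not the authors' HYFF-CB code.

```python
# Illustrative sketch only: location attention + top-down pyramid fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocationAttention(nn.Module):
    """Gate features by likely box locations (coordinate-attention style)."""
    def __init__(self, channels):
        super().__init__()
        self.conv_h = nn.Conv2d(channels, channels, 1)
        self.conv_w = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        # Pool along each spatial axis, then build per-position gates.
        h_att = torch.sigmoid(self.conv_h(x.mean(dim=3, keepdim=True)))  # (B,C,H,1)
        w_att = torch.sigmoid(self.conv_w(x.mean(dim=2, keepdim=True)))  # (B,C,1,W)
        return x * h_att * w_att

class FusionPyramid(nn.Module):
    """Fuse a deep and a shallow backbone stage, then refine with attention."""
    def __init__(self, c_deep, c_shallow, c_out):
        super().__init__()
        self.lateral = nn.Conv2d(c_shallow, c_out, 1)
        self.reduce = nn.Conv2d(c_deep, c_out, 1)
        self.attn = LocationAttention(c_out)

    def forward(self, deep, shallow):
        top_down = F.interpolate(self.reduce(deep), size=shallow.shape[2:], mode="nearest")
        return self.attn(self.lateral(shallow) + top_down)

fpn = FusionPyramid(c_deep=256, c_shallow=128, c_out=64)
out = fpn(torch.randn(1, 256, 20, 20), torch.randn(1, 128, 40, 40))
print(out.shape)  # torch.Size([1, 64, 40, 40])
```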

23 pages, 6672 KiB  
Article
A Real-Time Fish Detection System for Partially Dewatered Fish to Support Selective Fish Passage
by Jonathan Gregory, Scott M. Miehls, Jesse L. Eickholt and Daniel P. Zielinski
Sensors 2025, 25(4), 1022; https://doi.org/10.3390/s25041022 - 9 Feb 2025
Cited by 1 | Viewed by 954
Abstract
Recent advances in fish transportation technologies and deep machine learning-based fish classification have created an opportunity for real-time, autonomous fish sorting through a selective passage mechanism. This research presents a case study of a novel application that uses deep machine learning to detect partially dewatered fish exiting an Archimedes Screw Fish Lift (ASFL). A MobileNet SSD model was trained on images of partially dewatered fish volitionally passing through an ASFL, and this model was then integrated with a network video recorder to monitor video from the ASFL. Additional models were also trained using images from a similar fish scanning device to test the feasibility of this approach for fish classification. Open-source software and edge computing design principles were employed to ensure that the system is capable of fast data processing. The findings demonstrate that such a system integrated with an ASFL can support real-time fish detection. This research contributes to the goal of automated data collection in a selective fish passage system and presents a viable path toward realizing optical fish sorting.
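As a hedged illustration of the deployment pattern the abstract describes (a MobileNet SSD detector watching a recorder's video stream), the sketch below runs torchvision's off-the-shelf SSDLite-MobileNetV3 over frames from a video source. The RTSP URL and confidence threshold are placeholders; the paper's model was custom-trained on ASFL imagery rather than this pretrained checkpoint.

```python
# Illustrative sketch only: MobileNet-SSD inference over a video stream.
import cv2
import torch
from torchvision.models.detection import ssdlite320_mobilenet_v3_large

model = ssdlite320_mobilenet_v3_large(weights="DEFAULT").eval()

cap = cv2.VideoCapture("rtsp://nvr.local/asfl-camera")  # hypothetical stream
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # BGR uint8 -> RGB float tensor in [0, 1], as torchvision detectors expect.
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    tensor = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        detections = model([tensor])[0]
    for box, score in zip(detections["boxes"], detections["scores"]):
        if score > 0.5:  # confidence gate keeps edge-side processing cheap
            x1, y1, x2, y2 = box.int().tolist()
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
cap.release()
```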

19 pages, 11309 KiB  
Article
FSH-DETR: An Efficient End-to-End Fire Smoke and Human Detection Based on a Deformable DEtection TRansformer (DETR)
by Tianyu Liang and Guigen Zeng
Sensors 2024, 24(13), 4077; https://doi.org/10.3390/s24134077 - 23 Jun 2024
Cited by 7 | Viewed by 1837
Abstract
Fire is a significant security threat that can lead to casualties, property damage, and environmental damage. Despite the availability of object-detection algorithms, challenges persist in detecting fires, smoke, and humans: poor performance on small fires and smoke, and a high computational cost that limits deployment. In this paper, we propose an end-to-end object detector for fire, smoke, and human detection based on Deformable DETR (DEtection TRansformer), called FSH-DETR. To process multi-scale fire and smoke features effectively, we propose a novel Mixed Encoder, which integrates SSFI (Separate Single-scale Feature Interaction Module) and CCFM (CNN-based Cross-scale Feature Fusion Module) for multi-scale fire, smoke, and human feature fusion. Furthermore, we enhance the convergence speed of FSH-DETR by incorporating a bounding-box loss function called PIoUv2 (Powerful Intersection over Union), which improves the precision of fire, smoke, and human detection. Extensive experiments on a public dataset demonstrate that the proposed method surpasses state-of-the-art methods in terms of mAP (mean Average Precision), with mAP and mAP50 reaching 66.7% and 84.2%, respectively.
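PIoUv2 belongs to the family of IoU-based bounding-box regression losses. The minimal sketch below implements only the plain 1 − IoU loss from which that family is built, since the paper's specific penalty and scaling terms are not given in the abstract.

```python
# Illustrative sketch only: the base IoU loss that PIoUv2-style losses extend.
import torch

def iou_loss(pred, target, eps=1e-7):
    """pred, target: (N, 4) boxes as (x1, y1, x2, y2). Returns mean 1 - IoU."""
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    return (1.0 - iou).mean()

pred = torch.tensor([[0., 0., 10., 10.]])
target = torch.tensor([[2., 2., 12., 12.]])
print(iou_loss(pred, target))  # ~0.53, since IoU of these boxes is ~0.47
```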

15 pages, 529 KiB  
Article
A Pix2Pix Architecture for Complete Offline Handwritten Text Normalization
by Alvaro Barreiro-Garrido, Victoria Ruiz-Parrado, A. Belen Moreno and Jose F. Velez
Sensors 2024, 24(12), 3892; https://doi.org/10.3390/s24123892 - 16 Jun 2024
Cited by 2 | Viewed by 1628
Abstract
In the realm of offline handwritten text recognition, numerous normalization algorithms have been developed over the years as preprocessing steps applied before automatic recognition models are run on scanned handwritten text images. These algorithms have proven effective in enhancing the overall performance of recognition architectures. However, many of them rely heavily on heuristic strategies that are not seamlessly integrated with the recognition architecture itself. This paper introduces a trainable Pix2Pix model, a specific type of conditional generative adversarial network, as the method to normalize handwritten text images. The algorithm can be seamlessly integrated as the initial stage of any deep learning architecture designed for handwriting recognition tasks, which makes it possible to train the normalization and recognition components as a unified whole while still maintaining some interpretability of each module. Our normalization approach learns from a blend of heuristic transformations applied to text images, aiming to mitigate the impact of intra-personal handwriting variability among different writers. As a result, it achieves slope and slant normalization, alongside other conventional preprocessing objectives such as normalizing the size of text ascenders and descenders. We demonstrate that the proposed architecture replicates, and in certain cases surpasses, the results of a widely used heuristic algorithm across two metrics and when integrated as the first step of a deep recognition architecture.
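For readers unfamiliar with Pix2Pix, the sketch below shows the standard conditional-GAN objective (adversarial term plus L1 reconstruction) that such a normalizer is trained with. The discriminator here is a toy stand-in assuming single-channel text-line images; it does not reproduce the authors' networks.

```python
# Illustrative sketch only: the standard Pix2Pix training objective.
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()

def generator_loss(disc, raw, generated, clean, lambda_l1=100.0):
    """raw: input text image; generated: G(raw); clean: normalized target."""
    # The discriminator judges (input, output) pairs -- the "conditional" part.
    fake_logits = disc(torch.cat([raw, generated], dim=1))
    adv = bce(fake_logits, torch.ones_like(fake_logits))
    return adv + lambda_l1 * l1(generated, clean)

def discriminator_loss(disc, raw, generated, clean):
    real_logits = disc(torch.cat([raw, clean], dim=1))
    fake_logits = disc(torch.cat([raw, generated.detach()], dim=1))
    return 0.5 * (bce(real_logits, torch.ones_like(real_logits))
                  + bce(fake_logits, torch.zeros_like(fake_logits)))

disc = nn.Conv2d(2, 1, 4, stride=2)  # toy PatchGAN stand-in on 1-channel pairs
raw = torch.randn(1, 1, 64, 64)
clean = torch.randn(1, 1, 64, 64)
fake = torch.randn(1, 1, 64, 64)
print(generator_loss(disc, raw, fake, clean).item(),
      discriminator_loss(disc, raw, fake, clean).item())
```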

15 pages, 9386 KiB  
Article
Three-Dimensional Positioning for Aircraft Using IoT Devices Equipped with a Fish-Eye Camera
by Junichi Mori, Makoto Morinaga, Takumi Asakura, Takenobu Tsuchiya, Ippei Yamamoto, Kentaro Nishino and Shigenori Yokoshima
Sensors 2023, 23(22), 9108; https://doi.org/10.3390/s23229108 - 10 Nov 2023
Viewed by 1566
Abstract
Radar is an important sensing technology for the three-dimensional positioning of aircraft. The method requires detecting the object's response to the signal transmitted from the antenna, but accuracy becomes unstable at low altitudes near the antenna due to effects such as obstruction and reflection from surrounding buildings. Accordingly, there is a need for a ground-based positioning method with high accuracy. Among the camera-based positioning methods proposed for this purpose, we have developed a multisite synchronized positioning system using IoT devices equipped with a fish-eye camera and have been investigating its performance. This report describes the technology in detail, together with calibration experiments. A case study was also performed in which flight paths measured by existing GPS positioning were compared with results from the proposed method. Although the results obtained by each method showed individual characteristics, the three-dimensional coordinates were a good match, demonstrating the effectiveness of the positioning technology proposed in this study.
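While the paper's calibration and fish-eye projection details are not reproduced here, the core of multisite camera positioning is triangulation from synchronized sight lines. The toy NumPy sketch below estimates a 3D point as the midpoint of closest approach between two rays; it assumes ideal, already-calibrated cameras, which is a simplification of the paper's pipeline.

```python
# Illustrative sketch only: two-ray triangulation by closest approach.
import numpy as np

def triangulate(o1, d1, o2, d2):
    """o*: camera origins; d*: ray directions. Returns midpoint estimate."""
    d1, d2 = d1 / np.linalg.norm(d1), d2 / np.linalg.norm(d2)
    w0 = o1 - o2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w0, d2 @ w0
    denom = a * c - b * b          # ~0 when the rays are near-parallel
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    return 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))

# Two sites 1 km apart both sight a target at (400, 300, 1000) m.
target = np.array([400.0, 300.0, 1000.0])
o1, o2 = np.array([0.0, 0.0, 0.0]), np.array([1000.0, 0.0, 0.0])
print(triangulate(o1, target - o1, o2, target - o2))  # ~[400. 300. 1000.]
```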

24 pages, 3227 KiB  
Article
An Improved Wildfire Smoke Detection Based on YOLOv8 and UAV Images
by Saydirasulov Norkobil Saydirasulovich, Mukhriddin Mukhiddinov, Oybek Djuraev, Akmalbek Abdusalomov and Young-Im Cho
Sensors 2023, 23(20), 8374; https://doi.org/10.3390/s23208374 - 10 Oct 2023
Cited by 54 | Viewed by 8340
Abstract
Forest fires rank among the costliest and deadliest natural disasters globally. Identifying the smoke generated by forest fires is pivotal to the prompt suppression of developing fires. Nevertheless, existing techniques for detecting forest fire smoke face persistent issues, including a slow identification rate, suboptimal detection accuracy, and difficulty distinguishing smoke originating from small sources. This study presents an enhanced YOLOv8 model customized for unmanned aerial vehicle (UAV) images to address these challenges and attain higher detection accuracy. First, the research incorporates Wise-IoU (WIoU) v3 as a bounding-box regression loss, supplemented by a reasonable gradient allocation strategy that prioritizes samples of common quality; this enhances the model's capacity for precise localization. Second, the conventional convolution in the intermediate neck layer is substituted with the Ghost Shuffle Convolution mechanism, which reduces model parameters and speeds up convergence. Third, recognizing the difficulty of capturing the salient features of forest fire smoke in intricate wooded settings, this study introduces the BiFormer attention mechanism, which directs the model's attention toward the feature intricacies of forest fire smoke while suppressing irrelevant, non-target background information. The experimental findings highlight the enhanced YOLOv8 model's effectiveness in smoke detection, achieving an average precision (AP) of 79.4%, a notable 3.3% improvement over the baseline. The model also achieves robust average precision on small objects (APS) of 71.3% and on large objects (APL) of 92.6%.
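Of the three modifications, the Ghost convolution substitution is the most self-contained to illustrate. The sketch below follows the generic GhostNet recipe (half the output channels from a standard convolution, the rest from a cheap depthwise operation on those features); the exact ratio, kernel sizes, and shuffle step of the paper's Ghost Shuffle Convolution are assumptions here.

```python
# Illustrative sketch only: a Ghost convolution block in the GhostNet style.
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    def __init__(self, c_in, c_out, kernel=1):
        super().__init__()
        c_primary = c_out // 2
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_primary, kernel, padding=kernel // 2, bias=False),
            nn.BatchNorm2d(c_primary), nn.SiLU())
        # Depthwise 5x5 "cheap operation" producing the ghost feature maps.
        self.cheap = nn.Sequential(
            nn.Conv2d(c_primary, c_primary, 5, padding=2, groups=c_primary, bias=False),
            nn.BatchNorm2d(c_primary), nn.SiLU())

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

block = GhostConv(128, 256)
print(block(torch.randn(1, 128, 40, 40)).shape)  # torch.Size([1, 256, 40, 40])
```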
