Search Results (37)

Search Parameters:
Keywords = YOLO V4-Tiny

23 pages, 3908 KiB  
Article
MSUD-YOLO: A Novel Multiscale Small Object Detection Model for UAV Aerial Images
by Xiaofeng Zhao, Hui Zhang, Wenwen Zhang, Junyi Ma, Chenxiao Li, Yao Ding and Zhili Zhang
Drones 2025, 9(6), 429; https://doi.org/10.3390/drones9060429 - 13 Jun 2025
Cited by 1 | Viewed by 803
Abstract
Because objects in UAV aerial images often exhibit multiple scales, small sizes, and complex backgrounds, current models deliver unsatisfactory detection performance. To address these issues, this paper designs a multiscale small object detection model for UAV aerial images, MSUD-YOLO, based on YOLOv10s. First, the model uses an attention scale sequence fusion mode to achieve more efficient multiscale feature fusion; a tiny prediction head is also incorporated to focus the model on low-level features and thus improve its ability to detect small objects. Second, a novel feature extraction module, CFormerCGLU, is designed, which improves feature extraction capability in a lighter way. In addition, the model uses lightweight convolution instead of standard convolution to reduce computation. Finally, the WIoU v3 loss function directs the model's attention toward low-quality examples, thereby improving its object localization ability. Experimental results on the VisDrone2019 dataset show that MSUD-YOLO improves mAP50 by 8.5% compared with YOLOv10s while reducing parameters by 6.3%, verifying its effectiveness for object detection in UAV aerial images in complex environments. Furthermore, compared with several recent UAV object detection algorithms, MSUD-YOLO offers higher detection accuracy at lower computational cost; e.g., mAP50 reaches 43.4% with only 6.766 M parameters.
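As a rough illustration of the WIoU v3 idea mentioned above, the following is a minimal NumPy sketch of an IoU loss with a center-distance penalty and a non-monotonic focusing coefficient. The α/δ defaults are the Wise-IoU paper's reported values; the single-box function is a simplified sketch, not MSUD-YOLO's actual implementation.

```python
import numpy as np

def wiou_v3_loss(pred, gt, running_mean_iou_loss, alpha=1.9, delta=3.0):
    """Illustrative Wise-IoU v3-style loss for one box pair.

    pred, gt: [x1, y1, x2, y2]. alpha/delta are focusing
    hyper-parameters (paper defaults); tune per task.
    """
    # Plain IoU loss.
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    l_iou = 1.0 - inter / (area_p + area_g - inter + 1e-9)

    # WIoU v1 distance penalty: center offset normalised by the
    # smallest enclosing box (treated as a constant, i.e. no gradient).
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_g, cy_g = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    wg = max(pred[2], gt[2]) - min(pred[0], gt[0])
    hg = max(pred[3], gt[3]) - min(pred[1], gt[1])
    r_wiou = np.exp(((cx_p - cx_g) ** 2 + (cy_p - cy_g) ** 2)
                    / (wg ** 2 + hg ** 2 + 1e-9))

    # v3: non-monotonic focusing via the "outlier degree" beta, so very
    # low-quality boxes receive a moderated rather than maximal gain.
    beta = l_iou / (running_mean_iou_loss + 1e-9)
    gain = beta / (delta * alpha ** (beta - delta))
    return gain * r_wiou * l_iou
```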

24 pages, 5775 KiB  
Article
GESC-YOLO: Improved Lightweight Printed Circuit Board Defect Detection Based Algorithm
by Xiangqiang Kong, Guangmin Liu and Yanchen Gao
Sensors 2025, 25(10), 3052; https://doi.org/10.3390/s25103052 - 12 May 2025
Viewed by 664
Abstract
Printed circuit boards (PCBs) are an indispensable part of electronic products, and their quality is crucial to the operational integrity and functional reliability of these products. Existing PCB defect detection models suffer from excessive model size and parameter complexity, leaving them ill-equipped for lightweight deployment on mobile devices. To address this challenge, this paper proposes a lightweight detection model, GESC-YOLO, developed by modifying the YOLOv8n architecture. First, a new lightweight module, C2f-GE, replaces the C2f module of the backbone network, which effectively reduces the parameter count while increasing the number of feature-map channels to enhance the model's feature extraction capability. Second, the neck network employs the lightweight hybrid convolution GSConv, integrated with the VoV-GSCSP module to build the Slim-neck structure; this preserves detection precision while making the model lighter and reducing the number of parameters. Finally, coordinate attention is introduced into the neck network to decompose channel attention and aggregate features, effectively retaining spatial information and thereby improving detection and localization accuracy for tiny defects (defect area less than 1% of the total image area) in PCB images. Experimental results demonstrate that, compared with the original YOLOv8n model, GESC-YOLO boosts the mean Average Precision (mAP) for PCB surface defects by 0.4%, reaching 99%, while reducing model size by 25.4%, parameter count by 28.6%, and computational resource consumption by 26.8%, successfully harmonizing detection precision and model lightweighting.
(This article belongs to the Section Sensing and Imaging)
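The coordinate attention introduced into the GESC-YOLO neck can be sketched in PyTorch roughly as follows. This is the standard formulation from Hou et al. (CVPR 2021); the reduction ratio and activation are assumed defaults, not necessarily the paper's settings.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Coordinate attention: pool along each spatial axis separately,
    so the gate keeps positional information in both directions."""
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # (B, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # (B, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, 1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, 1)
        self.conv_w = nn.Conv2d(mid, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        xh = self.pool_h(x)                      # encode along height
        xw = self.pool_w(x).permute(0, 1, 3, 2)  # encode along width
        y = self.act(self.bn(self.conv1(torch.cat([xh, xw], dim=2))))
        yh, yw = torch.split(y, [h, w], dim=2)
        ah = torch.sigmoid(self.conv_h(yh))                      # (B, C, H, 1)
        aw = torch.sigmoid(self.conv_w(yw.permute(0, 1, 3, 2)))  # (B, C, 1, W)
        return x * ah * aw  # position-aware channel reweighting
```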

10 pages, 2080 KiB  
Proceeding Paper
Tunnel Traffic Enforcement Using Visual Computing and Field-Programmable Gate Array-Based Vehicle Detection and Tracking
by Yi-Chen Lin and Rey-Sern Lin
Eng. Proc. 2025, 92(1), 30; https://doi.org/10.3390/engproc2025092030 - 25 Apr 2025
Viewed by 265
Abstract
Tunnels are narrow, enclosed environments on highways, roads, and city streets, constructed to pass through mountains or beneath crowded urban areas. To prevent accidents in these confined environments, lane changes, slow driving, and speeding are prohibited on single- and multi-lane one-way roads. We developed a foreground detection algorithm based on the K-nearest neighbor (KNN) algorithm and a Gaussian mixture model, using 400 collected images. KNN was applied to the first 200 images, which were processed to remove differences and estimate a high-quality background. Once the background was obtained, it was subtracted from new images to extract the vehicle foreground. The background image was processed with Canny edge detection and the Hough transform to compute road lines. At the same time, the oriented FAST and rotated BRIEF (ORB) algorithm was employed to track vehicles in the foreground image and determine positions and lane deviations. This method enables the calculation of traffic flow and abnormal movements. We accelerated image processing using xfOpenCV on the PYNQ-Z2 and Xilinx FPGA platforms. The developed algorithm does not require pre-labeled training models and can be used during the daytime to automatically collect the required footage. For real-time monitoring, the proposed algorithm increases computation speed tenfold compared with YOLO-v2-tiny while using less than 1% of YOLO's storage space. It operates stably on the PYNQ-Z2 platform with existing surveillance cameras, without additional hardware setup. These advantages make the system more appropriate for smart traffic management than the existing framework.
(This article belongs to the Proceedings of 2024 IEEE 6th Eurasia Conference on IoT, Communication and Engineering)
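A minimal OpenCV sketch of the background-estimation pipeline described above, assuming a hypothetical input clip tunnel.mp4 and illustrative thresholds; the paper's exact KNN/GMM configuration and its FPGA acceleration are not reproduced here.

```python
import numpy as np
import cv2

# Background estimation with OpenCV's KNN subtractor; the history
# length echoes the paper's 200 background frames.
subtractor = cv2.createBackgroundSubtractorKNN(history=200, detectShadows=False)
orb = cv2.ORB_create()

cap = cv2.VideoCapture("tunnel.mp4")  # hypothetical input clip
while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg_mask = subtractor.apply(frame)             # vehicle foreground
    background = subtractor.getBackgroundImage()  # running background estimate

    # Lane geometry from the estimated background: Canny + Hough lines.
    edges = cv2.Canny(cv2.cvtColor(background, cv2.COLOR_BGR2GRAY), 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=80,
                            minLineLength=100, maxLineGap=10)

    # ORB keypoints restricted to the foreground mask support tracking.
    keypoints = orb.detect(frame, fg_mask)
```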

25 pages, 8312 KiB  
Article
Automated Surface Crack Identification of Reinforced Concrete Members Using an Improved YOLOv4-Tiny-Based Crack Detection Model
by Sofía Rajesh, K. S. Jinesh Babu, M. Chengathir Selvi and M. Chellapandian
Buildings 2024, 14(11), 3402; https://doi.org/10.3390/buildings14113402 - 26 Oct 2024
Cited by 7 | Viewed by 1746
Abstract
In recent times, the deployment of advanced structural health monitoring techniques has increased due to aging infrastructural elements. This paper employs a Crack Detection Model (CDM) based on an enhanced You Only Look Once (YOLO) v4-tiny algorithm to accurately identify and classify crack types in reinforced concrete (RC) members. YOLOv4-tiny is faster and more efficient than its predecessors, offering real-time detection with reduced computational complexity; despite its smaller size, it maintains competitive accuracy, making it ideal for applications requiring high-speed processing on resource-limited devices. First, an extensive experimental program was conducted by testing full-scale RC members under different shear span (a) to depth (d) ratios to achieve flexure- and shear-dominant failure modes. The digital images captured from the failure of the RC beams were analyzed using the YOLOv4-tiny-based CDM. Results reveal accurate identification of cracks formed along the depth of the beam at different stages of loading. Moreover, the confidence score attained for all test samples was more than 95%, indicating the accuracy of the developed model in capturing the types of cracks in RC beams. These outcomes encourage the use of the developed CDM in real-time crack detection for critical infrastructural elements.
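For context, a trained YOLOv4-tiny detector like the CDM can be run with OpenCV's DNN module roughly as below. The file names and the 0.95 gate are illustrative, echoing the reported confidence scores rather than the authors' code.

```python
import cv2

# Load hypothetical darknet config/weights for the crack model.
net = cv2.dnn.readNetFromDarknet("yolov4-tiny-crack.cfg",
                                 "yolov4-tiny-crack.weights")
model = cv2.dnn_DetectionModel(net)
model.setInputParams(scale=1 / 255.0, size=(416, 416), swapRB=True)

image = cv2.imread("rc_beam.jpg")  # hypothetical beam failure image
class_ids, scores, boxes = model.detect(image, confThreshold=0.5,
                                        nmsThreshold=0.4)
for cls, score, (x, y, w, h) in zip(class_ids, scores, boxes):
    if score >= 0.95:  # the paper reports >95% confidence on all test samples
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 0, 255), 2)
```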

24 pages, 10749 KiB  
Article
A Long-Term Video Tracking Method for Group-Housed Pigs
by Qiumei Yang, Xiangyang Hui, Yigui Huang, Miaobin Chen, Senpeng Huang and Deqin Xiao
Animals 2024, 14(10), 1505; https://doi.org/10.3390/ani14101505 - 19 May 2024
Cited by 4 | Viewed by 2071
Abstract
Pig tracking provides strong support for refined management in pig farms. However, long, continuous multi-pig tracking is still extremely challenging due to occlusion, distortion, and motion blur in real farming scenarios. This study proposes a long-term video tracking method for group-housed pigs based on improved StrongSORT, which significantly improves pig tracking performance in production scenarios. In addition, this research constructs a 24 h pig tracking video dataset, providing a basis for exploring the effectiveness of long-term tracking algorithms. For object detection, a lightweight pig detection network, YOLO v7-tiny_Pig, improved from YOLO v7-tiny, is proposed to reduce model parameters and improve detection speed. To address the target association problem, the trajectory management method of StrongSORT is optimized for the characteristics of the pig tracking task, reducing identity (ID) switching and improving the stability of the algorithm. The experimental results show that YOLO v7-tiny_Pig maintains detection applicability while reducing parameters by 36.7% compared with YOLO v7-tiny and achieves an average video detection speed of 435 frames per second. For tracking, Higher-Order Tracking Accuracy (HOTA), Multiple Object Tracking Precision (MOTP), and Identification F1 (IDF1) scores reach 83.16%, 97.6%, and 91.42%, respectively. Compared with the original StrongSORT algorithm, HOTA and IDF1 improve by 6.19% and 10.89%, respectively, and Identity Switches (IDSW) are reduced by 69%. Our algorithm can continuously track pigs in real scenarios for up to 24 h, providing technical support for non-contact automatic pig monitoring.
(This article belongs to the Section Pigs)
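The motion-based half of StrongSORT-style track association can be sketched as follows: detections are assigned to tracks by minimizing 1 − IoU with the Hungarian algorithm. This is a simplified stand-in that omits appearance embeddings and the paper's trajectory-management optimizations, and it assumes non-empty track and detection lists.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def box_iou(a, b):
    # a, b: [x1, y1, x2, y2]
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def associate(tracks, detections, iou_thresh=0.3):
    """Hungarian assignment on a 1 - IoU cost matrix; pairs below the
    IoU threshold are rejected so their tracks survive instead of
    switching identity."""
    cost = np.array([[1.0 - box_iou(t, d) for d in detections]
                     for t in tracks])
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols)
               if cost[r, c] <= 1.0 - iou_thresh]
    matched = dict(matches)
    unmatched_tracks = [r for r in range(len(tracks)) if r not in matched]
    unmatched_dets = [c for c in range(len(detections))
                      if c not in matched.values()]
    return matches, unmatched_tracks, unmatched_dets
```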

29 pages, 40648 KiB  
Article
Detection of Crabs and Lobsters Using a Benchmark Single-Stage Detector and Novel Fisheries Dataset
by Muhammad Iftikhar, Marie Neal, Natalie Hold, Sebastian Gregory Dal Toé and Bernard Tiddeman
Computers 2024, 13(5), 119; https://doi.org/10.3390/computers13050119 - 11 May 2024
Cited by 1 | Viewed by 2434
Abstract
Crabs and lobsters are valuable crustaceans that contribute enormously to the seafood needs of the growing human population. This paper presents a comprehensive analysis of single- and multi-stage object detectors for detecting crabs and lobsters in images captured onboard fishing boats. We investigate the speed and accuracy of multiple object detection techniques using a novel dataset, multiple backbone networks, various input sizes, and fine-tuned parameters. We extend our work to train lightweight models suited to fishing boats equipped with low-power hardware. First, we trained Faster R-CNN, SSD, and YOLO with different backbones and tuning parameters; models trained with larger input sizes yielded lower frames per second (FPS) and vice versa, and the base models were highly accurate but costly in computation and run time, whereas the lightweight models adapted well to low-power hardware. Second, we improved the performance of YOLO (v3, v4, and their tiny versions) using custom anchors generated by k-means clustering on our novel dataset. YOLOv4 and its tiny version achieved mean average precision (mAP) of 99.2% and 95.2%, respectively. YOLOv4-tiny trained with the custom anchors precisely detects crabs and lobsters onboard fishing boats at 64 FPS on an NVIDIA GeForce RTX 3070 GPU. The results identify the strengths and weaknesses of each method with respect to the trade-off between speed and accuracy.
(This article belongs to the Special Issue Selected Papers from Computer Graphics & Visual Computing (CGVC 2023))
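Custom anchors of the kind used here are typically generated by k-means under the 1 − IoU distance popularized by YOLOv2; a self-contained NumPy sketch, with illustrative defaults rather than the authors' exact procedure, is shown below.

```python
import numpy as np

def kmeans_anchors(wh, k=6, iters=100, seed=0):
    """Cluster ground-truth box sizes into k anchors.

    wh: (N, 2) array of box widths and heights from training labels.
    Distance is 1 - IoU, computed as if boxes share a top-left corner.
    """
    rng = np.random.default_rng(seed)
    anchors = wh[rng.choice(len(wh), k, replace=False)]
    for _ in range(iters):
        inter = (np.minimum(wh[:, None, 0], anchors[None, :, 0]) *
                 np.minimum(wh[:, None, 1], anchors[None, :, 1]))
        union = (wh[:, None, 0] * wh[:, None, 1] +
                 anchors[None, :, 0] * anchors[None, :, 1] - inter)
        assign = np.argmax(inter / union, axis=1)  # nearest = highest IoU
        new = np.array([wh[assign == i].mean(axis=0)
                        if np.any(assign == i) else anchors[i]
                        for i in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors[np.argsort(anchors.prod(axis=1))]  # sorted by area
```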

22 pages, 7554 KiB  
Article
PDT-YOLO: A Roadside Object-Detection Algorithm for Multiscale and Occluded Targets
by Ruoying Liu, Miaohua Huang, Liangzi Wang, Chengcheng Bi and Ye Tao
Sensors 2024, 24(7), 2302; https://doi.org/10.3390/s24072302 - 4 Apr 2024
Cited by 8 | Viewed by 2834
Abstract
To tackle weak sensing capacity for multi-scale objects, high missed-detection rates for occluded targets, and difficult model deployment in the detection tasks of intelligent roadside perception systems, the PDT-YOLO algorithm, based on YOLOv7-tiny, is proposed. First, we introduce the intra-scale feature interaction module (AIFI) and reconstruct the feature pyramid structure to enhance the detection accuracy of multi-scale targets. Second, a lightweight convolution module (GSConv) is introduced to construct a multi-scale efficient layer aggregation network module (ETG), enhancing feature extraction while keeping the network lightweight. Third, multi-attention mechanisms are integrated to optimize the feature expression of occluded targets in complex scenarios. Finally, Wise-IoU with a dynamic non-monotonic focusing mechanism improves the accuracy and generalization ability of the model. Compared with YOLOv7-tiny, PDT-YOLO improves mAP50 and mAP50:95 by 4.6% and 12.8% on the DAIR-V2X-C dataset, with a parameter count of 6.1 million, and by 15.7% and 11.1% on the IVODC dataset. We deployed PDT-YOLO in an actual traffic environment based on the Robot Operating System (ROS), reaching a detection frame rate of 90 FPS, which meets the needs of roadside object detection and edge deployment in complex traffic scenes.
(This article belongs to the Section Vehicular Sensing)
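The GSConv block used inside the ETG module can be sketched as below, following the common reference formulation from the Slim-neck by GSConv paper; kernel sizes and the activation are assumptions, not necessarily PDT-YOLO's exact variant.

```python
import torch
import torch.nn as nn

class GSConv(nn.Module):
    """GSConv: half the output channels come from a standard
    convolution, half from a cheap depthwise convolution on that
    result, followed by a channel shuffle to mix the two halves."""
    def __init__(self, c_in, c_out, k=3, s=1):
        super().__init__()
        c_half = c_out // 2  # assumes an even channel count
        self.dense = nn.Sequential(
            nn.Conv2d(c_in, c_half, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())
        self.cheap = nn.Sequential(
            nn.Conv2d(c_half, c_half, 5, 1, 2, groups=c_half, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())

    def forward(self, x):
        y1 = self.dense(x)
        y2 = self.cheap(y1)
        y = torch.cat([y1, y2], dim=1)
        # Channel shuffle: interleave dense and depthwise features.
        b, c, h, w = y.shape
        return y.view(b, 2, c // 2, h, w).transpose(1, 2).reshape(b, c, h, w)
```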

17 pages, 12074 KiB  
Article
Research and Design of a Chicken Wing Testing and Weight Grading Device
by Kelin Wang, Zhiyong Li, Chengyi Wang, Bing Guo, Juntai Li, Zhengchao Lv and Xiaoling Ding
Electronics 2024, 13(6), 1049; https://doi.org/10.3390/electronics13061049 - 12 Mar 2024
Cited by 1 | Viewed by 1748
Abstract
This paper introduces a nondestructive inspection and weight grading device for chicken wings to replace traditional manual grading. A two-sided nondestructive quality inspection model for chicken wings, based on the YOLO v7-tiny object detection algorithm, is designed and deployed on a Jetson Xavier NX embedded platform. An STM32 microcontroller serves as the main control platform, and a wing-turning device that adapts to the conveyor belt speed, dynamic weighing, and a high-efficiency intelligent grading unit were developed; the prototype was optimized and verified in experiments. Experiments show that the device can grade four chicken wings per second with a comprehensive accuracy of 98.4%, outperforming traditional grading methods in both efficiency and accuracy.
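The grading step can be pictured as a simple rule that combines the visual inspection verdict with the weigher reading; the weight bins below are hypothetical, purely for illustration.

```python
def grade_wing(defect_free: bool, weight_g: float) -> str:
    """Toy grading rule: reject on a failed two-sided inspection,
    otherwise bin by weight. Thresholds are hypothetical, not the
    paper's calibrated values."""
    if not defect_free:
        return "reject"  # failed YOLO v7-tiny surface inspection
    if weight_g < 60:
        return "grade C"
    if weight_g < 90:
        return "grade B"
    return "grade A"

# e.g. routing decision sent from the Jetson host to the STM32 sorter
print(grade_wing(defect_free=True, weight_g=82.5))  # -> grade B
```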

15 pages, 6945 KiB  
Article
Design and Implementation of Nursing-Secure-Care System with mmWave Radar by YOLO-v4 Computing Methods
by Jih-Ching Chiu, Guan-Yi Lee, Chih-Yang Hsieh and Qing-You Lin
Appl. Syst. Innov. 2024, 7(1), 10; https://doi.org/10.3390/asi7010010 - 19 Jan 2024
Cited by 4 | Viewed by 2972
Abstract
In computer vision and image processing, the shift from traditional cameras to emerging sensing tools for tasks such as gesture recognition and object detection addresses privacy concerns. This study navigates the Integrated Sensing and Communication (ISAC) era, using millimeter-wave signals as radar and a Convolutional Neural Network (CNN) model for event sensing. Our focus is on leveraging deep learning to detect security-critical gestures, converting millimeter-wave parameters into point cloud images, and enhancing recognition accuracy. Because CNNs are computationally demanding, we developed flexible quantization methods that simplify You Only Look Once (YOLO)-v4 operations with an 8-bit fixed-point number representation. Cross-simulation validation showed that CPU-based quantization improves speed by 300% with minimal accuracy loss, and even doubles the YOLO-tiny model's speed in a GPU environment. We established a Raspberry Pi 4-based system that combines the simplified deep learning model with Message Queuing Telemetry Transport (MQTT) Internet of Things (IoT) technology for nursing care. Our quantization method boosted identification speed by nearly 2.9 times, enabling millimeter-wave sensing in embedded systems. Additionally, we implemented hardware-based quantization, directly quantizing data from images or weight files, leading to circuit synthesis and chip design. This work integrates AI with mmWave sensors in the domain of nursing security and hardware implementation to enhance recognition accuracy and computational efficiency. Employing millimeter-wave radar in medical institutions or homes offers a strong answer to the privacy concerns raised by conventional cameras that capture and analyze the appearance of patients or residents.
(This article belongs to the Section Human-Computer Interaction)
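The 8-bit fixed-point representation at the heart of the quantization method can be sketched as follows; the fractional-bit split here is illustrative, whereas the paper derives its quantization per layer.

```python
import numpy as np

def quantize_q(x, frac_bits=5):
    """Symmetric 8-bit fixed-point quantization (Qm.n style):
    scale by 2**frac_bits, round, and clip to the signed int8 range.
    frac_bits is an illustrative choice."""
    scaled = np.round(x * (1 << frac_bits))
    return np.clip(scaled, -128, 127).astype(np.int8)

def dequantize_q(q, frac_bits=5):
    # Recover the approximate real values for accuracy checks.
    return q.astype(np.float32) / (1 << frac_bits)

w = np.array([0.73, -1.2, 0.031], dtype=np.float32)
q = quantize_q(w)
print(q, dequantize_q(q))  # int8 weights and their fixed-point values
```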

22 pages, 13151 KiB  
Article
SFHG-YOLO: A Simple Real-Time Small-Object-Detection Method for Estimating Pineapple Yield from Unmanned Aerial Vehicles
by Guoyan Yu, Tao Wang, Guoquan Guo and Haochun Liu
Sensors 2023, 23(22), 9242; https://doi.org/10.3390/s23229242 - 17 Nov 2023
Cited by 10 | Viewed by 2401
Abstract
Estimating pineapple yield from unmanned aerial vehicle (UAV) photography relies on recognizing and counting pineapple buds. This research proposes the SFHG-YOLO method, with YOLOv5s as the baseline, to address the practical need to identify small objects (pineapple buds) in UAV imagery and the drawbacks of existing algorithms in real-time performance and accuracy. Pineapple buds in the field are small, densely distributed objects, so a lightweight network model that strengthens spatial attention and adaptively fuses context information is used to increase detection accuracy and robustness. The lightweight network is first constructed from MobileNetV3 together with a coordinate attention module. Additionally, to fully exploit feature information across levels and improve the perception of tiny objects, we developed an enhanced spatial attention module and an adaptive context information fusion module. Experiments validated the proposed algorithm's small-object detection performance: SFHG-YOLO improved mAP@0.5 and mAP@0.5:0.95 by 7.4% and 31%, respectively, over the baseline YOLOv5s. Considering model size and computational cost, these findings underscore the superior performance of the proposed technique in detecting high-density small objects, offering a reliable detection approach for estimating pineapple yield.
(This article belongs to the Special Issue Sensor and AI Technologies in Intelligent Agriculture)
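A CBAM-style spatial attention gate, shown below as a stand-in for the paper's enhanced spatial attention module (the authors' exact design differs), illustrates how a per-pixel gate can emphasize small, dense targets.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Channel-wise mean and max maps are fused by a 7x7 convolution
    into a per-pixel gate that reweights the feature map, letting
    tiny targets stand out against the background."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg_map = x.mean(dim=1, keepdim=True)    # (B, 1, H, W)
        max_map, _ = x.max(dim=1, keepdim=True)  # (B, 1, H, W)
        gate = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * gate
```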

17 pages, 5127 KiB  
Article
Deep Learning Neural Network-Based Detection of Wafer Marking Character Recognition in Complex Backgrounds
by Yufan Zhao, Jun Xie and Peiyu He
Electronics 2023, 12(20), 4293; https://doi.org/10.3390/electronics12204293 - 17 Oct 2023
Viewed by 1901
Abstract
Wafer characters record the transfer of important information in industrial production and inspection. Wafer character recognition has usually relied on the traditional template matching method; however, the accuracy and robustness of template matching on complex images are low, which affects production efficiency. An improved model based on YOLO v7-Tiny is proposed for wafer character recognition in complex backgrounds to enhance detection accuracy. To improve the robustness of the detection system, the images used for model training and testing are augmented with brightness changes, rotation, blurring, and cropping. The improved YOLO model adopts several enhancements: an optimized spatial-channel attention model (CBAM-L) for better feature extraction, a neck structure improved with BiFPN to strengthen feature fusion, and an added angle parameter to handle tilted character detection. The experimental results showed that the model reached 99.44% mAP@0.5 and an F1 score of 0.97. In addition, the proposed model has very few parameters, making it suitable for embedded industrial devices with small memory, which is crucial for reducing hardware cost. The results showed that the comprehensive performance of the improved model surpasses several existing state-of-the-art detection models.
(This article belongs to the Special Issue Applications of Deep Learning Techniques)
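The four augmentations named above compose naturally with torchvision; the magnitudes below are illustrative, not the paper's settings.

```python
from torchvision import transforms

# Brightness, rotation, blur, and crop augmentation for PIL images.
augment = transforms.Compose([
    transforms.ColorJitter(brightness=0.4),        # brightness change
    transforms.RandomRotation(degrees=10),         # rotation
    transforms.GaussianBlur(kernel_size=5),        # blurring
    transforms.RandomResizedCrop(size=(640, 640),  # cropping
                                 scale=(0.8, 1.0)),
    transforms.ToTensor(),
])
# augmented = augment(pil_image)  # applied per training image
```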

14 pages, 3312 KiB  
Article
Identification of Individual Hanwoo Cattle by Muzzle Pattern Images through Deep Learning
by Taejun Lee, Youngjun Na, Beob Gyun Kim, Sangrak Lee and Yongjun Choi
Animals 2023, 13(18), 2856; https://doi.org/10.3390/ani13182856 - 8 Sep 2023
Cited by 13 | Viewed by 4466
Abstract
The objective of this study was to identify Hanwoo cattle via a deep-learning model using muzzle images. A total of 9230 images from 336 Hanwoo cattle were used. Images of the same individuals were taken at four different times to avoid overfitted models. Muzzle images were cropped by a YOLO v8-based model trained on 150 manually annotated images. Data blocks composed of images and national livestock traceability numbers were randomly selected and stored as train, validation, and test data. Transfer learning was performed with the tiny, small, and medium versions of EfficientNetV2 using the SGD, RMSProp, Adam, and Lion optimizers. The small version using Lion showed the best validation accuracy of 0.981 at 36 epochs among the 12 transfer-learned models. The top five models by validation accuracy were then evaluated on the test data for practical usage. The small version using Adam showed the best test accuracy of 0.970, but the small version using RMSProp showed the lowest repeated error. The high prediction accuracy in this study demonstrates the potential of muzzle patterns as an identification key for individual cattle.
(This article belongs to the Section Cattle)
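A hedged sketch of this transfer-learning setup with torchvision's EfficientNetV2-S is shown below; note that the tiny/medium variants and the Lion optimizer are not in core PyTorch/torchvision and would come from other libraries.

```python
import torch
import torch.nn as nn
from torchvision import models

# EfficientNetV2-S with its classifier replaced by a 336-way head
# (one class per animal), following the abstract's setup.
num_cattle = 336
model = models.efficientnet_v2_s(
    weights=models.EfficientNet_V2_S_Weights.DEFAULT)
in_features = model.classifier[1].in_features
model.classifier[1] = nn.Linear(in_features, num_cattle)

# One of the four optimizers compared in the paper; lr is illustrative.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
# Standard loop: forward muzzle crops, backprop, track validation accuracy.
```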

20 pages, 4363 KiB  
Article
Deep-Learning-Based Rice Disease and Insect Pest Detection on a Mobile Phone
by Jizhong Deng, Chang Yang, Kanghua Huang, Luocheng Lei, Jiahang Ye, Wen Zeng, Jianling Zhang, Yubin Lan and Yali Zhang
Agronomy 2023, 13(8), 2139; https://doi.org/10.3390/agronomy13082139 - 15 Aug 2023
Cited by 23 | Viewed by 5616
Abstract
Enabling mobile phones to detect rice diseases and insect pests not only solves the problems of low efficiency and poor accuracy in manual detection and reporting, but also helps farmers detect and control them in the field in a timely fashion, thereby ensuring the quality of rice grains. This study examined two improved detection models, Improved You Only Look Once (YOLO) v5s and Improved YOLOv7-tiny, built on lightweight object detection networks, for detecting six high-frequency diseases and insect pests. Improved YOLOv5s incorporates the Ghost module to reduce computation and streamline the model structure, and Improved YOLOv7-tiny incorporates the Convolutional Block Attention Module (CBAM) and the SIoU loss to improve model learning ability and accuracy. First, we evaluated and analyzed the detection accuracy and operational efficiency of the models; then we deployed both methods to a mobile phone and designed an application to further verify their practicality for detecting rice diseases and insect pests. The results showed that Improved YOLOv5s achieved the highest F1 score of 0.931, 0.961 in mean average precision (mAP) (0.5), and 0.648 in mAP (0.5:0.9), while reducing network parameters, model size, and floating-point operations (FLOPs) by 47.5%, 45.7%, and 48.7%, respectively, and increasing inference speed by 38.6% compared with the original YOLOv5s. Improved YOLOv7-tiny outperformed the original YOLOv7-tiny in detection accuracy, second only to Improved YOLOv5s. Probability heat maps of the detection results showed that Improved YOLOv5s performed better on large target areas of rice diseases and insect pests, while Improved YOLOv7-tiny was more accurate on small target areas. On the mobile phone platform, the precision and recall of Improved YOLOv5s at FP16 precision were 0.925 and 0.939, with an inference speed of 374 ms/frame, superior to Improved YOLOv7-tiny. Both improved models accurately identify rice diseases and insect pests, and the mobile phone application built on them provides a reference for fast, efficient field diagnosis.
(This article belongs to the Special Issue The Applications of Deep Learning in Smart Agriculture)
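The Ghost module introduced into Improved YOLOv5s follows GhostNet's primary-plus-cheap-operation design; below is a compact PyTorch sketch with kernel sizes taken from the common reference implementation, not necessarily this paper's configuration.

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Ghost module (GhostNet, Han et al. 2020): a slim primary
    convolution generates half the output channels, and a cheap
    depthwise convolution synthesises the remaining "ghost" maps."""
    def __init__(self, c_in, c_out, kernel=1, cheap_kernel=3):
        super().__init__()
        c_primary = c_out // 2  # assumes an even channel count
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_primary, kernel, 1, kernel // 2, bias=False),
            nn.BatchNorm2d(c_primary), nn.SiLU())
        self.cheap = nn.Sequential(
            nn.Conv2d(c_primary, c_primary, cheap_kernel, 1,
                      cheap_kernel // 2, groups=c_primary, bias=False),
            nn.BatchNorm2d(c_primary), nn.SiLU())

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```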

25 pages, 10359 KiB  
Article
On-Board Multi-Class Geospatial Object Detection Based on Convolutional Neural Network for High Resolution Remote Sensing Images
by Yanyun Shen, Di Liu, Junyi Chen, Zhipan Wang, Zhe Wang and Qingling Zhang
Remote Sens. 2023, 15(16), 3963; https://doi.org/10.3390/rs15163963 - 10 Aug 2023
Cited by 9 | Viewed by 3168
Abstract
Multi-class geospatial object detection in high-resolution remote sensing images has significant potential in domains such as industrial production, military warning, disaster monitoring, and urban planning. However, the traditional remote sensing object detection pipeline involves several time-consuming steps, including image acquisition, image download, ground processing, and object detection, which may be unsuitable for tasks with short timeliness requirements, such as military warning and disaster monitoring. Additionally, the transmission of massive data from satellites to the ground is limited by bandwidth, resulting in time delays and redundant information, such as cloud-covered images. To address these challenges and use information efficiently, this paper proposes a comprehensive on-board multi-class geospatial object detection scheme. First, the satellite imagery is sliced into tiles, and the PID-Net (Proportional-Integral-Derivative Network) method detects and filters out cloud-covered tiles. Subsequently, our Manhattan Intersection over Union (MIOU) loss-based YOLO (You Only Look Once) v7-Tiny method detects remote sensing objects in the remaining tiles. Finally, the detection results are mapped back to the original image, and a truncated NMS (Non-Maximum Suppression) method filters out repeated and noisy boxes. To validate the reliability of the scheme, this paper creates a new dataset, DOTA-CD (Dataset for Object Detection in Aerial Images-Cloud Detection). Experiments conducted on both ground and on-board equipment using the AIR-CD, DOTA, and DOTA-CD datasets demonstrate the effectiveness of our method.
(This article belongs to the Special Issue Convolutional Neural Network Applications in Remote Sensing II)
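Mapping tile detections back to full-image coordinates and suppressing duplicates can be sketched as below; plain NMS stands in for the paper's truncated variant, whose handling of boxes clipped at tile borders is omitted.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Plain greedy NMS over (N, 4) boxes in [x1, y1, x2, y2] format."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_o = ((boxes[order[1:], 2] - boxes[order[1:], 0]) *
                  (boxes[order[1:], 3] - boxes[order[1:], 1]))
        iou = inter / (area_i + area_o - inter)
        order = order[1:][iou <= iou_thresh]
    return keep

def merge_tile_detections(tile_results):
    """tile_results: list of (x_offset, y_offset, boxes, scores) per
    tile; boxes are shifted back to full-image coordinates, then NMS
    removes the duplicates created by overlapping tiles."""
    all_boxes = [b + np.array([ox, oy, ox, oy])
                 for ox, oy, b, _ in tile_results]
    all_scores = [s for _, _, _, s in tile_results]
    boxes = np.concatenate(all_boxes)
    scores = np.concatenate(all_scores)
    keep = nms(boxes, scores)
    return boxes[keep], scores[keep]
```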

16 pages, 7026 KiB  
Article
Borno-Net: A Real-Time Bengali Sign-Character Detection and Sentence Generation System Using Quantized Yolov4-Tiny and LSTMs
by Nasima Begum, Rashik Rahman, Nusrat Jahan, Saqib Sizan Khan, Tanjina Helaly, Ashraful Haque and Nipa Khatun
Appl. Sci. 2023, 13(9), 5219; https://doi.org/10.3390/app13095219 - 22 Apr 2023
Cited by 6 | Viewed by 3263
Abstract
Sign language is the most commonly used form of communication for persons with hearing or speech difficulties. However, people without hearing impairment often cannot understand these signs, so persons with disabilities experience difficulties expressing their emotions or needs. A sign-character detection and text generation system is therefore needed to mitigate this issue. In this paper, we propose an end-to-end system that detects Bengali sign characters from input images or video frames and generates meaningful sentences. The proposed system consists of two phases. In the first phase, a quantization technique for the YoloV4-Tiny detection model is proposed for detecting 49 different sign characters, comprising 36 Bengali alphabet characters, 10 numeric characters, and 3 special characters; the detection model localizes hand signs and predicts the corresponding character. In the second phase, a Long Short-Term Memory (LSTM) model generates meaningful text from the characters predicted by the detection model. The system is trained on the BdSL 49 dataset, which contains approximately 14,745 images across the 49 classes. The proposed quantized YoloV4-Tiny model achieves a mAP of 99.7%, and the proposed language model achieves an overall accuracy of 99.12%. In addition, a performance analysis among the YoloV4, YoloV4 Tiny, and YoloV7 models is provided in this research.
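The two-phase hand-off can be pictured as an LSTM over the detected character IDs; the sketch below uses illustrative layer sizes and an assumed vocabulary size, not the authors' architecture.

```python
import torch
import torch.nn as nn

class SignLanguageLSTM(nn.Module):
    """Maps a sequence of detected sign-character IDs (49 classes in
    BdSL 49) to output-token logits for sentence generation."""
    def __init__(self, num_chars=49, vocab_size=2000, embed=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(num_chars, embed)
        self.lstm = nn.LSTM(embed, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, char_ids):          # (B, T) detected character IDs
        h, _ = self.lstm(self.embed(char_ids))
        return self.head(h)               # (B, T, vocab) next-token logits

model = SignLanguageLSTM()
chars = torch.randint(0, 49, (1, 12))     # e.g. one detected sign sequence
logits = model(chars)
```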
