MDPI - Publisher of Open Access Journals

26 pages, 20666 KB

Open Access Article

DRC²-Net: A Context-Aware and Geometry-Adaptive Network for Lightweight SAR Ship Detection

by Abdelrahman Yehia, Naser El-Sheimy, Ashraf Helmy, Ibrahim Sh. Sanad and Mohamed Hanafy

Sensors 2025, 25(22), 6837; https://doi.org/10.3390/s25226837 - 8 Nov 2025

Cited by 1 | Viewed by 748

Synthetic Aperture Radar (SAR) ship detection remains challenging due to background clutter, target sparsity, and fragmented or partially occluded ships, particularly at small scales. To address these issues, we propose the Deformable Recurrent Criss-Cross Attention Network (DRC²-Net), a lightweight and efficient detection framework built upon the YOLOX-Tiny architecture. The model incorporates two SAR-specific modules: a Recurrent Criss-Cross Attention (RCCA) module to enhance contextual awareness and reduce false positives and a Deformable Convolutional Networks v2 (DCNv2) module to capture geometric deformations and scale variations adaptively. These modules expand the Effective Receptive Field (ERF) and improve feature adaptability under complex conditions. DRC²-Net is trained on the SSDD and iVision-MRSSD datasets, encompassing highly diverse SAR imagery including inshore and offshore scenes, variable sea states, and complex coastal backgrounds. The model maintains a compact architecture with 5.05 M parameters, ensuring strong generalization and real-time applicability. On the SSDD dataset, it outperforms the YOLOX-Tiny baseline with AP@50 of 93.04% (+0.9%), $AP_{s}$ of 91.15% (+1.31%), $AP_{m}$ of 88.30% (+1.22%), and $AP_{l}$ of 89.47% (+13.32%). On the more challenging iVision-MRSSD dataset, it further demonstrates improved scale-aware detection, achieving higher AP across small, medium, and large targets. These results confirm the effectiveness and robustness of DRC²-Net for multi-scale ship detection in complex SAR environments, consistently surpassing state-of-the-art detectors.

(This article belongs to the Special Issue Artificial Intelligence in Computer Vision: Methods and Applications—2nd Edition)

19 pages, 1929 KB

Open Access Article

Detection and Classification of Defects on Metal Surfaces Based on a Lightweight YOLOX-Tiny COCO Network

by João Duarte, Manuel Fernandes Claro, Pedro M. A. Vitoriano, Tito G. Amaral and Vitor Fernão Pires

Eng 2025, 6(11), 302; https://doi.org/10.3390/eng6110302 - 1 Nov 2025

Viewed by 2105

Abstract

The detection of metallic surface defects is an essential task to control the quality of industrial products. During the production of metal materials, several defect types may appear on the surface, accompanied by a large amount of background texture information, leading to false or missed detections during small-defect detection. Computer vision is a crucial method for the automatic detection of defects. Yet, this remains a challenging problem, requiring the continuous development of new approaches and algorithms. Furthermore, many industries require fast and real-time detection. In this paper, a lightweight deep learning model is presented for implementation on embedded devices to perform in real time. The YOLOX-Tiny model is used for detecting and classifying metallic surface defect types. YOLOX-Tiny has 5.06 M parameters and only 6.45 GFLOPs, yet performs well despite being smaller than its counterparts. Extensive experiments on the dataset demonstrate that the proposed model is robust and can meet the accuracy requirements for metallic defect detection.

(This article belongs to the Special Issue Emerging Trends and Technologies in Manufacturing Engineering)

16 pages, 1934 KB

Open Access Article

Research on Obtaining Pepper Phenotypic Parameters Based on Improved YOLOX Algorithm

by Yukang Huo, Rui-Feng Wang, Chang-Tao Zhao, Pingfan Hu and Haihua Wang

AgriEngineering 2025, 7(7), 209; https://doi.org/10.3390/agriengineering7070209 - 2 Jul 2025

Cited by 10 | Viewed by 1511

Abstract

Pepper is a vital crop with extensive agricultural and industrial applications. Accurate phenotypic measurement, including plant height and stem diameter, is essential for assessing yield and quality, yet manual measurement is time-consuming and labor-intensive. This study proposes a deep learning-based phenotypic measurement method for peppers. A Pepper-mini dataset was constructed using offline augmentation. To address challenges in multi-plant growth environments, an improved YOLOX-tiny detection model incorporating a CA attention mechanism was developed, achieving a mAP of 95.16%. A detection box filtering method based on Euclidean distance was introduced to identify target plants. Further processing using HSV threshold segmentation, morphological operations, and connected component denoising enabled accurate region selection. Measurement algorithms were then applied, yielding high correlations with true values: $R^{2} = 0.973$ for plant height and $R^{2} = 0.842$ for stem diameter, with average errors of 0.443 cm and 0.0765 mm, respectively. This approach demonstrates a robust and efficient solution for automated phenotypic analysis in pepper cultivation.
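
The abstract does not say which reference point the Euclidean-distance box filter uses. As a minimal sketch, assuming the target plant is the detection whose box center lies closest to a chosen reference point (here the image center, which is an assumption), the filtering step could look like the following. The names `box_center` and `select_target_box` are hypothetical and do not come from the paper.

```python
import math

def box_center(box):
    """Center (x, y) of a box given as (x1, y1, x2, y2) in pixels."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def select_target_box(boxes, reference_point):
    """Return the box whose center is nearest (Euclidean) to reference_point.

    Assumption: the reference point marks where the target plant is
    expected, e.g. the image center. Returns None if there are no boxes.
    """
    if not boxes:
        return None
    return min(boxes, key=lambda b: math.dist(box_center(b), reference_point))

# Three hypothetical detections in a 720 x 300 image.
boxes = [(10, 20, 110, 220), (300, 40, 420, 260), (600, 30, 700, 240)]
image_center = (360.0, 150.0)
target = select_target_box(boxes, image_center)
```

Only the selected box would then be passed on to the segmentation and measurement stages; the other detections are treated as neighboring plants.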

(This article belongs to the Special Issue Sensing and Monitoring in Modern Agriculture: New Technologies for Improving Crop Management)

17 pages, 12823 KB

Open Access Article

Remote Sensing Small Object Detection Network Based on Multi-Scale Feature Extraction and Information Fusion

by Junsuo Qu, Tong Liu, Zongbing Tang, Yifei Duan, Heng Yao and Jiyuan Hu

Remote Sens. 2025, 17(5), 913; https://doi.org/10.3390/rs17050913 - 5 Mar 2025

Cited by 5 | Viewed by 3334

Abstract

Nowadays, object detection algorithms are widely used in various scenarios. However, some special scenarios place additional demands on small object detection. Small objects offer fewer usable features, unbalanced samples, stricter positioning accuracy requirements, and fewer datasets, so small object detection is more complex than general object detection, and the detection performance of existing models on small objects is not ideal. Therefore, this paper takes YOLOXs as the benchmark network and enhances the feature information on small objects by improving the network's structure, so as to improve the model's detection performance on small objects. The research is presented as follows. A neck network based on an FPN and its variants is prone to information loss when fusing features from non-adjacent layers. To address this, the paper proposes a feature fusion and distribution module, which replaces the deep-to-shallow information transmission path in the neck network of YOLOXs. This method first fuses and extracts the feature layers used by the backbone network for prediction to obtain global feature information containing objects of multiple sizes. Then, the global feature information is distributed to each prediction branch so that high-level semantic and fine-grained information are integrated more efficiently, helping the model learn discriminative information on small objects and classify them correctly. Finally, the method was tested on the VisDrone2021 dataset. Its images have a standard size of 1080p (1920 × 1080), and its videos are usually recorded at 30 frames per second (fps), so the dataset offers high spatial and temporal resolution and suits both the detection of objects of various sizes and dynamic object detection tasks. When the module was integrated into a YOLOXs network (named the FE-YOLO network) together with the three improvements to the feature layer, channel number, and max pooling, the mAP and APs increased by 1.0% and 0.8%, respectively. Compared with YOLOv5m, YOLOv7-Tiny, FCOS, and other advanced models, it obtains the best performance.

21 pages, 10344 KB

Open Access Article

Efficient Deployment of Peanut Leaf Disease Detection Models on Edge AI Devices

by Zekai Lv, Shangbin Yang, Shichuang Ma, Qiang Wang, Jinti Sun, Linlin Du, Jiaqi Han, Yufeng Guo and Hui Zhang

Agriculture 2025, 15(3), 332; https://doi.org/10.3390/agriculture15030332 - 2 Feb 2025

Cited by 13 | Viewed by 3518

Abstract

The intelligent transformation of crop leaf disease detection has driven the use of deep neural network algorithms to develop more accurate disease detection models. In resource-constrained environments, the deployment of crop leaf disease detection models on the cloud introduces challenges such as communication latency and privacy concerns. Edge AI devices offer lower communication latency and enhanced scalability. To achieve the efficient deployment of crop leaf disease detection models on edge AI devices, a dataset of 700 images depicting peanut leaf spot, scorch spot, and rust diseases was collected. The YOLOX-Tiny network was utilized to conduct deployment experiments with the peanut leaf disease detection model on the Jetson Nano B01. The experiments initially focused on three aspects of efficient deployment optimization: the fusion of rectified linear unit (ReLU) and convolution operations, the integration of Efficient Non-Maximum Suppression for TensorRT (EfficientNMS_TRT) to accelerate post-processing within the TensorRT model, and the conversion of model formats from number of samples, channels, height, width (NCHW) to number of samples, height, width, and channels (NHWC) in the TensorFlow Lite model. Additionally, experiments were conducted to compare the memory usage, power consumption, and inference latency between the two inference frameworks, as well as to evaluate the real-time video detection performance using DeepStream. The results demonstrate that the fusion of ReLU activation functions with convolution operations reduced the inference latency by 55.5% compared to the use of the Sigmoid linear unit (SiLU) activation alone. In the TensorRT model, the integration of the EfficientNMS_TRT module accelerated post-processing, leading to a reduction in the inference latency of 19.6% and an increase in the frames per second (FPS) of 20.4%. 
In the TensorFlow Lite model, conversion to the NHWC format decreased the model conversion time by 88.7% and reduced the inference latency by 32.3%. These three efficient deployment optimization methods effectively decreased the inference latency and enhanced the inference efficiency. Moreover, a comparison between the two frameworks revealed that TensorFlow Lite exhibited memory usage reductions of 15% to 20% and power consumption decreases of 15% to 25% compared to TensorRT. Additionally, TensorRT achieved inference latency reductions of 53.2% to 55.2% relative to TensorFlow Lite. Consequently, TensorRT is deemed suitable for tasks requiring strong real-time performance and low latency, whereas TensorFlow Lite is more appropriate for scenarios with constrained memory and power resources. Additionally, the integration of DeepStream and EfficientNMS_TRT was found to optimize memory and power utilization, thereby enhancing the speed of real-time video detection. A detection rate of 28.7 FPS was achieved at a resolution of 1280 × 720. These experiments validate the feasibility and advantages of deploying crop leaf disease detection models on edge AI devices.
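
The NCHW-to-NHWC conversion mentioned in this abstract is a reordering of the same tensor elements in memory. The sketch below is only an illustration of the two index formulas on a flat Python list; it is not the conversion tooling used in the paper, and the function name `nchw_to_nhwc` is hypothetical.

```python
def nchw_to_nhwc(flat, n, c, h, w):
    """Reorder a flat, row-major NCHW buffer into a flat NHWC buffer."""
    if len(flat) != n * c * h * w:
        raise ValueError("buffer length does not match the given shape")
    out = [None] * len(flat)
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    # NCHW offset: channel varies slowest within a sample.
                    src = ((ni * c + ci) * h + hi) * w + wi
                    # NHWC offset: channel varies fastest (innermost).
                    dst = ((ni * h + hi) * w + wi) * c + ci
                    out[dst] = flat[src]
    return out

# One sample, two channels, one row, three columns.
# Channel 0 holds 0, 1, 2 and channel 1 holds 10, 11, 12.
nchw = [0, 1, 2, 10, 11, 12]
nhwc = nchw_to_nhwc(nchw, n=1, c=2, h=1, w=3)
```

In the NHWC result the two channel values of each pixel sit next to each other, which is the layout TensorFlow Lite expects by default.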

(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

20 pages, 3497 KB

Open Access Article

Enhancing Autonomous Driving Safety: A Robust Stacking Ensemble Model for Traffic Sign Detection and Recognition

by Yichen Wang, Jie Wang and Qianjin Wang

Sustainability 2024, 16(19), 8597; https://doi.org/10.3390/su16198597 - 3 Oct 2024

Cited by 3 | Viewed by 2417

Abstract

Accurate detection and classification of traffic signs play a vital role in ensuring driver safety and supporting advancements in autonomous driving technology. This paper introduces a novel approach for traffic sign detection and recognition by integrating the Faster RCNN and YOLOX-Tiny models using a stacking ensemble technique. The innovative ensemble methodology creatively merges the strengths of both models, surpassing the limitations of individual algorithms and achieving superior performance in challenging real-world scenarios. The proposed model was evaluated on the CCTSDB dataset and the MTSD dataset, demonstrating competitive performance compared to traditional algorithms. All experiments were conducted using Python 3.8 on the same system equipped with an NVIDIA GTX 3060 12G graphics card. Our results show improved accuracy and efficiency in recognizing traffic signs in various real-world scenarios, including distant, close, complex, moderate, and simple settings, achieving a 4.78% increase in mean Average Precision (mAP) compared to Faster RCNN and improving Frames Per Second (FPS) by 8.1% and mAP by 6.18% compared to YOLOX-Tiny. Moreover, the proposed model exhibited notable precision in challenging scenarios such as ultra-long-distance detections, shadow occlusions, motion blur, and complex environments with diverse sign categories. These findings not only showcase the model’s robustness but also serve as a cornerstone in propelling the evolution of autonomous driving technology and sustainable development of future transportation. The results presented in this paper could potentially be integrated into advanced driver-assistance systems and autonomous vehicles, offering a significant step forward in enhancing road safety and traffic management.

16 pages, 5276 KB

Open Access Article

Comparative Study of Lightweight Target Detection Methods for Unmanned Aerial Vehicle-Based Road Distress Survey

by Feifei Xu, Yan Wan, Zhipeng Ning and Hui Wang

Sensors 2024, 24(18), 6159; https://doi.org/10.3390/s24186159 - 23 Sep 2024

Cited by 7 | Viewed by 2467

Abstract

Unmanned aerial vehicles (UAVs) are effective tools for identifying road anomalies with limited detection coverage due to the discrete spatial distribution of roads. Despite computational, storage, and transmission challenges, existing detection algorithms can be improved to support this task with robustness and efficiency. In this study, the K-means clustering algorithm was used to calculate the best prior anchor boxes; Faster R-CNN (region-based convolutional neural network), YOLOX-s (You Only Look Once version X-small), YOLOv5-s, YOLOv7-tiny, YOLO-MobileNet, and YOLO-RDD models were built based on image data collected by UAVs. YOLO-MobileNet is the most lightweight model but performed worst in accuracy. YOLO-RDD (road distress detection) performed best, with a mean average precision (mAP) of 0.701 at an Intersection over Union (IoU) threshold of 0.5, and achieved relatively high accuracy in detecting all four types of distress. The YOLO-RDD model most successfully detected potholes, with an AP of 0.790. Significant or severe distresses were better identified, and minor cracks were relatively poorly identified. The YOLO-RDD model achieved an 85% computational reduction compared to YOLOv7-tiny while maintaining high detection accuracy.

(This article belongs to the Special Issue Toward Green and Intelligent Transportation Infrastructure: Road Non-destructive Testing and Structural Health Monitoring Technologies)

19 pages, 5556 KB

Open Access Article

AFMSFFNet: An Anchor-Free-Based Feature Fusion Model for Ship Detection

by Yuxin Zhang, Chunlei Dong, Lixin Guo, Xiao Meng, Yue Liu and Qihao Wei

Remote Sens. 2024, 16(18), 3465; https://doi.org/10.3390/rs16183465 - 18 Sep 2024

Cited by 1 | Viewed by 2032

Abstract

This paper aims to improve a small-scale object detection model to achieve detection accuracy matching or even surpassing that of complex models. Efforts are made in the module design phase to minimize parameter count as much as possible, thereby providing the potential for rapid detection of maritime targets. This paper introduces an innovative Anchor-Free-based Multi-Scale Feature Fusion Network (AFMSFFNet), which mitigates the problems of missed detections and false positives, particularly in inshore or small target scenarios. Leveraging YOLOX-Tiny as the foundational architecture, our proposed AFMSFFNet incorporates a novel Adaptive Bidirectional Fusion Pyramid Network (AB-FPN) for efficient multi-scale feature fusion, enhancing the saliency representation of targets and reducing interference from complex backgrounds. Simultaneously, the designed Multi-Scale Global Attention Detection Head (MGAHead) utilizes a larger receptive field to learn object features, generating high-quality reconstructed features for enhanced semantic information integration. Extensive experiments conducted on publicly available Synthetic Aperture Radar (SAR) image ship datasets demonstrate that AFMSFFNet outperforms the traditional baseline models in detection performance. The results indicate an improvement of 2.32% in detection accuracy compared to the YOLOX-Tiny model. Additionally, AFMSFFNet achieves a Frames Per Second (FPS) of 78.26 on SSDD, showcasing superior efficiency compared to well-established networks such as Faster R-CNN and CenterNet, with efficiency improvements ranging from 4.7 to 6.7 times. This research provides a valuable solution for efficient ship detection in complex backgrounds, demonstrating the efficacy of AFMSFFNet through quantitative improvements in accuracy and efficiency compared to existing models.

15 pages, 4485 KB

Open Access Article

Image Recognition and Classification of Farmland Pests Based on Improved Yolox-Tiny Algorithm

by Yuxue Wang, Hao Dong, Songyu Bai, Yang Yu and Qingwei Duan

Appl. Sci. 2024, 14(13), 5568; https://doi.org/10.3390/app14135568 - 26 Jun 2024

Cited by 3 | Viewed by 2140

Abstract

In order to rapidly detect pest types in farmland and mitigate their adverse effects on agricultural production, we proposed an improved Yolox-tiny-based target detection method for farmland pests. This method enhances the detection accuracy of farmland pests by limiting downsampling and incorporating the Convolutional Block Attention Module (CBAM). In the experiments, images of pests common to seven types of farmland and particularly harmful to crops were processed through the original Yolox-tiny model after preprocessing and partial target expansion for comparative training and testing. The results indicate that the improved Yolox-tiny model increased the average precision by 7.18 percentage points, from 63.55% to 70.73%, demonstrating enhanced precision in detecting farmland pest targets compared to the original model.

15 pages, 598 KB

Open Access Article

Rice Diseases Identification Method Based on Improved YOLOv7-Tiny

by Duoguan Cheng, Zhenqing Zhao and Jiang Feng

Agriculture 2024, 14(5), 709; https://doi.org/10.3390/agriculture14050709 - 29 Apr 2024

Cited by 18 | Viewed by 3284

Abstract

The accurate and rapid identification of rice diseases is crucial for enhancing rice yields. However, this task encounters several challenges: (1) Complex background problem: The rice background in a natural environment is complex, which interferes with rice disease recognition; (2) Disease region irregularity problem: Some rice diseases exhibit irregular shapes, and their target regions are small, making them difficult to detect; (3) Classification and localization problem: Rice disease recognition employs identical features for both classification and localization tasks, thereby affecting the training effect. To address these problems, an enhanced rice disease recognition model based on an improved YOLOv7-Tiny is proposed. Specifically, to reduce the interference of complex backgrounds, the YOLOv7-Tiny model’s backbone network is enhanced by incorporating the Convolutional Block Attention Module (CBAM); to address the irregularity of disease regions, the RepGhost bottleneck module, which is based on structural reparameterization techniques, is introduced; finally, to resolve the classification and localization issue, a lightweight YOLOX decoupled head is proposed. The experimental results demonstrate that: (1) The enhanced YOLOv7-Tiny model achieved an F1 score of 0.894 and a mAP@.5 of 0.922 on the rice pest and disease dataset, exceeding the original YOLOv7-Tiny model by 3.1 and 2.2 percentage points, respectively. (2) In comparison to the YOLOv3-Tiny, YOLOv4-Tiny, YOLOv5-S, YOLOX-S, and YOLOv7-Tiny models, the enhanced YOLOv7-Tiny model achieved higher F1 scores and mAP@.5. The improved YOLOv7-Tiny model has a single-image inference time of 26.4 ms, satisfying the requirement for real-time identification of rice diseases and facilitating deployment in embedded devices.

(This article belongs to the Section Crop Protection, Diseases, Pests and Weeds)

18 pages, 4616 KB

Open Access Article

Seatbelt Detection Algorithm Improved with Lightweight Approach and Attention Mechanism

by Liankui Qiu, Jiankun Rao and Xiangzhe Zhao

Appl. Sci. 2024, 14(8), 3346; https://doi.org/10.3390/app14083346 - 16 Apr 2024

Cited by 3 | Viewed by 4150

Abstract

Precise and rapid detection of seatbelts is an essential research field for intelligent traffic management. In order to improve the detection precision of seatbelts and speed up inference, a lightweight seatbelt detection algorithm is proposed. Firstly, by adding the G-ELAN module designed in this paper to the YOLOv7-tiny network, the structure is optimized and the number of parameters is reduced, and the ResNet is compressed with the channel pruning approach to decrease computational overhead. Then, the Mish activation function is utilized to replace the Leaky ReLU in the neck to enhance the non-linear capacity of the network. Finally, the triplet attention module is integrated into the model after pruning to compensate for the performance reduction caused by the previous stage and improve overall detection precision. The experimental results based on the self-built seatbelt dataset showed that, compared to the initial network, the Mean Average Precision (mAP) achieved by the proposed GM-YOLOv7 was improved by 3.8%, while the volume and the computation amount were lowered by 20% and 24.6%, respectively. Compared with YOLOv3, YOLOX, and YOLOv5, the mAP of GM-YOLOv7 increased by 22.4%, 4.6%, and 4.2%, respectively, and the number of computational operations decreased by 25%, 63%, and 38%, respectively. In addition, the accuracy of the improved RST-Net increased to 98.25%, while the parameter value was reduced by 48% compared to the basic model, effectively improving the detection performance and realizing a lightweight structure.

(This article belongs to the Special Issue Deep Learning for Object Detection)

13 pages, 2974 KB

Open Access Article

High-Precision Detection for Sandalwood Trees via Improved YOLOv5s and StyleGAN

by Yu Zhang, Jiajun Niu, Zezhong Huang, Chunlei Pan, Yueju Xue and Fengxiao Tan

Agriculture 2024, 14(3), 452; https://doi.org/10.3390/agriculture14030452 - 11 Mar 2024

Cited by 19 | Viewed by 3294

Abstract

An algorithm model based on computer vision is one of the critical technologies for agriculture and forestry planting. In this paper, a vision algorithm model based on StyleGAN and improved YOLOv5s is proposed to detect sandalwood trees from unmanned aerial vehicle remote sensing data, and this model has excellent adaptability to complex environments. To enhance feature expression ability, a CA (coordinate attention) module with dimensional information is introduced, which can both capture target channel information and keep correlation information between long-range pixels. To improve the training speed and test accuracy, SIOU (structural similarity intersection over union) is proposed to replace the traditional loss function; it fully considers the direction matching degree between the prediction box and the real box. To improve the generalization ability of the model, StyleGAN is introduced to augment the remote sensing data of sandalwood trees and to improve the sample balance of different flight heights. The experimental results show that the average accuracy of sandalwood tree detection increased from 93% to 95.2% through YOLOv5s model improvement; then, on that basis, the accuracy increased by another 0.4% via data generation from the StyleGAN algorithm model, finally reaching 95.6%. Compared with the mainstream lightweight models YOLOv5-mobilenet, YOLOv5-ghost, YOLOXs, and YOLOv4-tiny, the accuracy of this method is 2.3%, 2.9%, 3.6%, and 6.6% higher, respectively. The size of the trained sandalwood tree model is 14.5 MB, and the detection time is 17.6 ms. Thus, the algorithm demonstrates the advantages of having high detection accuracy, a compact model size, and a rapid processing speed, making it suitable for integration into edge computing devices for on-site real-time monitoring.

(This article belongs to the Special Issue Advanced Image Processing in Agricultural Applications)

22 pages, 18514 KB

Open Access Article

Efficient and Lightweight Automatic Wheat Counting Method with Observation-Centric SORT for Real-Time Unmanned Aerial Vehicle Surveillance

by Jie Chen, Xiaochun Hu, Jiahao Lu, Yan Chen and Xin Huang

Agriculture 2023, 13(11), 2110; https://doi.org/10.3390/agriculture13112110 - 7 Nov 2023

Cited by 11 | Viewed by 3228

Abstract

The number of wheat ears per unit area is crucial for assessing wheat yield, but automated wheat ear counting still faces significant challenges due to factors like lighting, orientation, and density variations. Departing from most static image analysis methodologies, this study introduces Wheat-FasterYOLO, an efficient real-time model designed to detect, track, and count wheat ears in video sequences. This model uses FasterNet as its foundational feature extraction network, significantly reducing the model’s parameter count and improving the model’s inference speed. We also incorporate deformable convolutions and dynamic sparse attention into the feature extraction network to enhance its ability to capture wheat ear features while reducing the effects of intricate environmental conditions. To address information loss during up-sampling and strengthen the model’s capacity to extract wheat ear features across varying feature map scales, we integrate a path aggregation network (PAN) with the content-aware reassembly of features (CARAFE) up-sampling operator. Furthermore, the incorporation of the Kalman filter-based target-tracking algorithm, Observation-centric SORT (OC-SORT), enables real-time tracking and counting of wheat ears within expansive field settings. Experimental results demonstrate that Wheat-FasterYOLO achieves a mean average precision (mAP) score of 94.01% with a small memory usage of 2.87 MB, surpassing popular detectors such as YOLOX and YOLOv7-Tiny. With the integration of OC-SORT, the composite higher order tracking accuracy (HOTA) and counting accuracy reached 60.52% and 91.88%, respectively, while maintaining a frame rate of 92 frames per second (FPS). This technology has promising applications in wheat ear counting tasks.

(This article belongs to the Special Issue Computational, AI and IT Solutions Helping Agriculture)

23 pages, 4318 KB

Open AccessArticle

Automatic Speaker Positioning in Meetings Based on YOLO and TDOA

by Chen-Chiung Hsieh, Men-Ru Lu and Hsiao-Ting Tseng

Sensors 2023, 23(14), 6250; https://doi.org/10.3390/s23146250 - 8 Jul 2023

Cited by 2 | Viewed by 2875

Abstract

In recent years, many events have been held via video conference due to the worldwide impact of the COVID-19 epidemic, typically using a webcam together with a computer and an Internet connection. However, such a network camera cannot turn automatically or keep the frame locked on the speaker. Therefore, this study uses the object detector YOLO to capture the upper body of every person in the frame and to judge whether each person’s mouth is open or closed. At the same time, the Time Difference of Arrival (TDOA) is used to detect the angle of the sound source. The person’s position obtained by YOLO is then mapped back to spatial coordinates using the distance between the person and the camera, and the spatial coordinates are used to calculate the angle between the person and the camera through inverse trigonometric functions. Finally, the angle obtained from the camera and the angle of the sound source obtained from the microphone array are matched for positioning. The experimental results show that the recall rate of positioning through YOLOX-Tiny reached 85.2%, and the recall rate of TDOA alone reached 88%. Integrating YOLOX-Tiny and TDOA for positioning, the recall rate reached 86.7%, the precision rate reached 100%, and the accuracy reached 94.5%. Therefore, the method proposed in this study can locate the speaker, and it performs better than using only one source.
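
The angle-matching step can be sketched as follows. This is a rough illustration, not the authors' implementation: it assumes a pinhole camera with a known focal length in pixels, a far-field sound source, a single two-microphone pair, and a speed of sound of 343 m/s. All function names, parameters, and the matching tolerance are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def lateral_offset_m(pixel_x, center_x, focal_px, distance_m):
    """Back-project a horizontal pixel position to metres (pinhole model)."""
    return (pixel_x - center_x) * distance_m / focal_px


def camera_angle_deg(offset_m, distance_m):
    """Horizontal angle of a person relative to the camera's optical axis."""
    return math.degrees(math.atan2(offset_m, distance_m))


def tdoa_angle_deg(delay_s, mic_spacing_m):
    """Far-field direction of arrival from the delay between two microphones."""
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against measurement noise
    return math.degrees(math.asin(ratio))


def match_speaker(person_angles, sound_angle, tolerance_deg=10.0):
    """Index of the person closest in angle to the sound source, or None."""
    if not person_angles:
        return None
    best = min(range(len(person_angles)),
               key=lambda i: abs(person_angles[i] - sound_angle))
    if abs(person_angles[best] - sound_angle) <= tolerance_deg:
        return best
    return None


if __name__ == "__main__":
    # Three detected people, 2 m away, 640-pixel-wide image, assumed focal length.
    angles = [camera_angle_deg(lateral_offset_m(x, 320, 500, 2.0), 2.0)
              for x in (100, 320, 560)]
    sound = tdoa_angle_deg(delay_s=0.00012, mic_spacing_m=0.1)
    print(angles, sound, match_speaker(angles, sound))
```

The tolerance makes the match fail explicitly when no detected person lies near the sound direction, which is one simple way for the two sources to veto each other.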

(This article belongs to the Special Issue Computer Vision in AI for Robotics Development)

16 pages, 6577 KB

Open AccessArticle

LPDNet: A Lightweight Network for SAR Ship Detection Based on Multi-Level Laplacian Denoising

by Congxia Zhao, Xiongjun Fu, Jian Dong, Cheng Feng and Hao Chang

Sensors 2023, 23(13), 6084; https://doi.org/10.3390/s23136084 - 1 Jul 2023

Cited by 9 | Viewed by 2836

Abstract

Intelligent ship detection based on synthetic aperture radar (SAR) is vital in maritime situational awareness. Deep learning methods have great advantages in SAR ship detection. However, existing methods do not strike a balance between lightweight design and accuracy. In this article, we propose an end-to-end lightweight SAR target detection algorithm, the multi-level Laplacian pyramid denoising network (LPDNet). Firstly, an intelligent denoising method based on the multi-level Laplacian transform is proposed. Through Convolutional Neural Network (CNN)-based threshold suppression, the denoising adapts to every SAR image via back-propagation, which makes the denoising process supervised. Secondly, channel modeling is proposed to combine spatial-domain and frequency-domain information. This multi-dimensional information enhances detection performance. Thirdly, the Convolutional Block Attention Module (CBAM) is introduced into the feature fusion module of the basic framework (YOLOX-Tiny) so that different weights are given to each pixel of the feature map to highlight the effective features. Experiments on SSDD and AIR SARShip-1.0 demonstrate that the proposed method achieves 97.14% AP at 24.68 FPS and 92.19% AP at 23.42 FPS, respectively, with only 5.1 M parameters, which verifies the accuracy, efficiency, and lightweight nature of the proposed method.
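
To make the idea of multi-level Laplacian denoising concrete, the sketch below builds a simple Laplacian pyramid with NumPy, shrinks each detail layer with a soft threshold, and reconstructs the image. It is only an illustration under stated assumptions: the thresholds here are fixed, hand-chosen numbers, whereas the abstract describes thresholds produced by a CNN and learned through back-propagation; the choice of soft thresholding, 2x2 average pooling, and nearest-neighbour up-sampling is also an assumption. All names are hypothetical.

```python
import numpy as np


def downsample(img):
    """Halve resolution by averaging 2x2 blocks (height and width must be even)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(img):
    """Double resolution by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


def build_pyramid(img, levels):
    """Return (detail layers from fine to coarse, coarsest approximation)."""
    details = []
    current = img.astype(np.float64)
    for _ in range(levels):
        coarse = downsample(current)
        details.append(current - upsample(coarse))  # what the coarse level misses
        current = coarse
    return details, current


def soft_threshold(x, t):
    """Shrink magnitudes by t and zero out anything smaller than t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def reconstruct(details, coarse):
    """Invert build_pyramid by adding detail layers back, coarse to fine."""
    current = coarse
    for detail in reversed(details):
        current = upsample(current) + detail
    return current


def denoise(img, thresholds):
    """Threshold each detail layer; one threshold per pyramid level."""
    details, coarse = build_pyramid(img, len(thresholds))
    shrunk = [soft_threshold(d, t) for d, t in zip(details, thresholds)]
    return reconstruct(shrunk, coarse)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = rng.normal(size=(8, 8))
    print(np.abs(denoise(noisy, [0.5, 0.25])).mean())
```

Image height and width must be divisible by 2 raised to the number of levels. With all thresholds set to zero the pyramid reconstructs the input exactly, which is a convenient sanity check before any thresholds are tuned or learned.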

(This article belongs to the Collection Remote Sensing Image Processing)

Search Results (28)
