Search Results (28)

Search Parameters:
Keywords = vehicle-to-network (V2N)

20 pages, 7048 KB  
Article
Enhanced Lightweight Object Detection Model in Complex Scenes: An Improved YOLOv8n Approach
by Sohaya El Hamdouni, Boutaina Hdioud and Sanaa El Fkihi
Information 2025, 16(10), 871; https://doi.org/10.3390/info16100871 - 8 Oct 2025
Viewed by 817
Abstract
Object detection has a vital impact on the analysis and interpretation of visual scenes. It is widely utilized in various fields, including healthcare, autonomous driving, and vehicle surveillance. However, complex scenes containing small, occluded, and multiscale objects present significant difficulties for object detection. This paper introduces a lightweight object detection algorithm, utilizing YOLOv8n as the baseline model, to address these problems. Our method focuses on four steps. Firstly, we add a layer for small object detection to enhance the feature expression capability of small objects. Secondly, to handle complex forms and appearances, we employ the C2f-DCNv2 module. This module integrates advanced DCNv2 (Deformable Convolutional Networks v2) by substituting the final C2f module in the backbone. Thirdly, we designed the CBAM, a lightweight attention module. We integrate it into the neck section to address missed detections. Finally, we use Ghost Convolution (GhostConv) as a light convolutional layer. This alternates with ordinary convolution in the neck. It ensures good detection performance while decreasing the number of parameters. Experimental performance on the PASCAL VOC dataset demonstrates that our approach lowers the number of model parameters by approximately 9.37%. The mAP@0.5:0.95 increased by 0.9%, recall (R) increased by 0.8%, mAP@0.5 increased by 0.3%, and precision (P) increased by 0.1% compared to the baseline model. To better evaluate the model’s generalization performance in real-world driving scenarios, we conducted additional experiments using the KITTI dataset. Compared to the baseline model, our approach yielded a 0.8% improvement in mAP@0.5 and 1.3% in mAP@0.5:0.95. This result indicates strong performance in more dynamic and challenging conditions. Full article
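The parameter savings from GhostConv can be illustrated with back-of-the-envelope counts. This is a sketch based on the GhostNet formulation, not the authors' exact configuration: the ratio `s=2` and the 3x3 "cheap" depthwise kernel are assumptions, and the function names are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Parameters of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def ghost_conv_params(c_in, c_out, k, s=2, d=3):
    """Approximate parameters of a Ghost convolution: a primary convolution
    producing c_out/s intrinsic feature maps, followed by cheap d x d
    depthwise operations that generate the remaining maps."""
    m = c_out // s                      # intrinsic feature maps
    primary = c_in * m * k * k          # standard convolution part
    cheap = m * (s - 1) * d * d        # depthwise "cheap" operations
    return primary + cheap

std = conv_params(128, 256, 3)          # ordinary neck convolution
ghost = ghost_conv_params(128, 256, 3)  # GhostConv replacement
print(std, ghost, 1 - ghost / std)      # ghost needs roughly half the parameters
```

Alternating such layers with ordinary convolutions, as the abstract describes, is what trims the overall parameter count while leaving most of the feature extraction intact.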

26 pages, 1971 KB  
Article
Dynamic Allocation of C-V2X Communication Resources Based on Graph Attention Network and Deep Reinforcement Learning
by Zhijuan Li, Guohong Li, Zhuofei Wu, Wei Zhang and Alessandro Bazzi
Sensors 2025, 25(16), 5209; https://doi.org/10.3390/s25165209 - 21 Aug 2025
Viewed by 1165
Abstract
Vehicle-to-vehicle (V2V) and vehicle-to-network (V2N) communications are two key components of intelligent transport systems (ITSs) that can share spectrum resources through in-band overlay. V2V communication primarily supports traffic safety, whereas V2N primarily focuses on infotainment and information exchange. Achieving reliable V2V transmission alongside high-rate V2N services in resource-constrained, dynamically changing traffic environments poses a significant challenge for resource allocation. To address this, we propose a novel reinforcement learning (RL) framework, termed Graph Attention Network (GAT)-Advantage Actor–Critic (GAT-A2C). In this framework, we construct a graph based on V2V links and their potential interference relationships. Each V2V link is represented as a node, and edges connect nodes that may interfere. The GAT captures key interference patterns among neighboring vehicles while accounting for real-time mobility and channel variations. The features generated by the GAT, combined with individual link characteristics, form the environment state, which is then processed by the RL agent to jointly optimize the resource block allocation and the transmission power for both V2V and V2N communications. Simulation results demonstrate that the proposed method substantially improves V2N rates and V2V communication success ratios under various vehicle densities. Furthermore, the approach exhibits strong scalability, making it a promising solution for future large-scale intelligent vehicular networks operating in dynamic traffic scenarios. Full article
<imports>
</imports>
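The graph-construction step described in the abstract (nodes are V2V links, edges join links that may interfere) can be sketched as follows. The proximity criterion, coordinates, and radius are placeholders: the paper's actual interference condition would depend on resource-block sharing and channel state, which are not reproduced here.

```python
import math

def build_interference_graph(links, radius):
    """Each V2V link is a node; connect two nodes when their transmitters
    are within the interference radius (hypothetical proximity criterion)."""
    adj = {i: set() for i in links}
    ids = list(links)
    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            i, j = ids[a], ids[b]
            (x1, y1), (x2, y2) = links[i], links[j]
            if math.hypot(x1 - x2, y1 - y2) <= radius:
                adj[i].add(j)   # edges are undirected:
                adj[j].add(i)   # interference is mutual
    return adj

# transmitter positions of four V2V links (made-up coordinates, metres)
links = {"L0": (0, 0), "L1": (40, 0), "L2": (200, 0), "L3": (230, 10)}
graph = build_interference_graph(links, radius=100.0)
print(graph)  # L0-L1 and L2-L3 interfere; the two pairs are far apart
```

A GAT layer would then attend over each node's neighbours in this adjacency structure to produce the per-link state features the RL agent consumes.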

14 pages, 1721 KB  
Article
Informational and Topological Characterization of CO and O3 Hourly Time Series in the Mexico City Metropolitan Area During the 2019–2023 Period: Insights into the Impact of the COVID-19 Pandemic
by Alejandro Ramirez-Rojas, Paulina Rebeca Cárdenas-Moreno, Israel Reyes-Ramírez, Michele Lovallo and Luciano Telesca
Appl. Sci. 2025, 15(16), 8775; https://doi.org/10.3390/app15168775 - 8 Aug 2025
Viewed by 381
Abstract
The main anthropogenic sources of air pollution in big cities are vehicular traffic and industrial activities. The emissions of primary pollutants are produced directly from the combustion of fossil fuels of vehicles and industry, whilst the secondary pollutants, such as tropospheric ozone (O3), are produced from precursors like Carbon monoxide (CO), among others, and meteorological factors such as radiation. In this study, we analyze the time series of CO and O3 concentrations monitored by the RAMA program between 2019 and 2023 in the southwest of the Mexico City Metropolitan Area, encompassing the COVID-19 lockdown period declared from March to September–October 2020. After removing cyclic patterns and normalizing the data, we applied informational and topological methods to investigate variability changes in the concentration time series, particularly in response to the lockdown. Following the onset of lockdown measures in March 2020—which led to a significant reduction in industrial activity and vehicular traffic—the informational quantities NX and Fisher Information Measure (FIM) for CO revealed significant shifts during the lockdown, while these metrics remained stable for O3. Also, the coefficient of variation of the degree CVk, which was defined for the network constructed for each series by the Visibility Graph, showed marked changes for CO but not for O3. The combined informational and topological analysis highlighted distinct underlying structures: CO exhibited localized, intermittent emission patterns leading to greater structural complexity, while O3 displayed smoother, less organized variability. Also, the temporal variation of the FIM and NX provides a means to monitor the evolving statistical behavior of the CO and O3 time series over time. 
Finally, the Visibility Graph (VG) method shows a behavioral trend similar to that shown by the informational quantifiers, revealing a significant change during the lockdown for CO, although remaining almost stable for O3. Full article
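The two topological tools named above, the natural Visibility Graph and the coefficient of variation of the node degree CVk, can be sketched in a few lines. The hourly series below is invented for illustration, not RAMA data.

```python
import statistics

def visibility_graph(series):
    """Natural visibility graph: samples a and c are connected when every
    intermediate sample b lies strictly below the straight line joining
    (a, y_a) and (c, y_c)."""
    n = len(series)
    adj = {i: set() for i in range(n)}
    for a in range(n):
        for c in range(a + 1, n):
            visible = all(
                series[b] < series[a] + (series[c] - series[a]) * (b - a) / (c - a)
                for b in range(a + 1, c)
            )
            if visible:
                adj[a].add(c)
                adj[c].add(a)
    return adj

def degree_cv(adj):
    """Coefficient of variation of the degree distribution, CV_k = std / mean."""
    degrees = [len(neighbours) for neighbours in adj.values()]
    return statistics.pstdev(degrees) / statistics.fmean(degrees)

hourly = [3.0, 1.2, 4.5, 1.0, 2.2, 5.1, 0.8, 2.9]  # illustrative concentrations
g = visibility_graph(hourly)
print(degree_cv(g))
```

Consecutive samples are always mutually visible, so the graph is connected by construction; intermittent spikes (as the paper reports for CO) create hub nodes and hence a larger CVk than a smooth series would.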

17 pages, 1597 KB  
Article
Harmonized Autonomous–Human Vehicles via Simulation for Emissions Reduction in Riyadh City
by Ali Louati, Hassen Louati and Elham Kariri
Future Internet 2025, 17(8), 342; https://doi.org/10.3390/fi17080342 - 30 Jul 2025
Viewed by 975
Abstract
The integration of autonomous vehicles (AVs) into urban transportation systems has significant potential to enhance traffic efficiency and reduce environmental impacts. This study evaluates the impact of different AV penetration scenarios (0%, 10%, 30%, 50%) on traffic performance and carbon emissions along Prince Mohammed bin Salman bin Abdulaziz Road in Riyadh, Saudi Arabia. Using microscopic simulation (SUMO) based on real-world datasets, we assess key performance indicators such as travel time, stop frequency, speed, and CO2 emissions. Results indicate notable improvements with increasing AV deployment, including up to 25.5% reduced travel time and 14.6% lower emissions at 50% AV penetration. Coordinated AV behavior was approximated using adjusted simulation parameters and Python-based APIs, effectively modeling vehicle-to-vehicle (V2V), vehicle-to-infrastructure (V2I), and vehicle-to-network (V2N) communications. These findings highlight the benefits of harmonized AV–human vehicle interactions, providing a scalable and data-driven framework applicable to smart urban mobility planning. Full article
(This article belongs to the Section Smart System Infrastructure and Applications)

36 pages, 1587 KB  
Article
Analysis of MCP-Distributed Jammers and 3D Beam-Width Variations for UAV-Assisted C-V2X Millimeter-Wave Communications
by Mohammad Arif, Wooseong Kim, Adeel Iqbal and Sung Won Kim
Mathematics 2025, 13(10), 1665; https://doi.org/10.3390/math13101665 - 19 May 2025
Cited by 2 | Viewed by 625
Abstract
Jamming devices introduce unwanted signals into the network to disrupt primary communications. The effectiveness of these jamming signals mainly depends on the number and distribution of the jammers. The impact of clustered jamming has not previously been investigated for unmanned aerial vehicle (UAV)-assisted cellular vehicle-to-everything (C-V2X) communications that consider multiple roads in the given region. Also, exploiting three-dimensional (3D) beam-width variations for a millimeter waveband antenna in the presence of jamming for vehicular node (V-N) links has not been evaluated, which influences the UAV-assisted C-V2X system’s performance. The novelty of this paper resides in analyzing the impact of clustered jamming for UAV-assisted C-V2X networks and quantifying the effects of fluctuating antenna 3D beam width on the V-N performance by exploiting millimeter waves. To this end, we derive the analytical expressions for coverage of a typical V-N linked with a line-of-sight (LOS) UAV, non-LOS UAV, macro base station (MBS), and recipient V-N for UAV-assisted C-V2X networks by exploiting beam-width variations in the presence of jammers. The results show network performance in terms of coverage and spectral efficiencies by setting V-Ns equal to 3 km−2, MBSs equal to 3 km−2, and UAVs equal to 6 km−2. The findings indicate that the performance of millimeter waveband UAV-assisted C-V2X communications is decreased by introducing clustered jamming in the given region. Specifically, the coverage performance of the network decreases by 25.5% at a −10 dB SIR threshold in the presence of clustered jammers. The performance further declines with increasing variations in the antenna 3D beam width. Therefore, network designers must consider advanced counter-jamming techniques when jamming signals, along with beam-width fluctuations, are anticipated in vehicular networks. Full article
(This article belongs to the Section D1: Probability and Statistics)

17 pages, 3343 KB  
Article
Comparative Analysis of YOLO Series Algorithms for UAV-Based Highway Distress Inspection: Performance and Application Insights
by Ziyi Yang, Xin Lan and Hui Wang
Sensors 2025, 25(5), 1475; https://doi.org/10.3390/s25051475 - 27 Feb 2025
Cited by 15 | Viewed by 2861
Abstract
Because established unmanned aerial vehicle (UAV) highway distress detection (HDD) faces the dual challenges of accuracy and efficiency, this paper conducts a comparative study on the application of the YOLO (You Only Look Once) series of algorithms in UAV-based HDD to provide a reference for model selection. YOLOv5-l and v9-c achieved the highest detection accuracy, with YOLOv5-l performing well in mean and classification detection precision and recall, while YOLOv9-c showed poor performance in these aspects. In terms of detection efficiency, YOLOv10-n, v7-t, and v11-n achieved the highest levels, while YOLOv5-n, v8-n, and v10-n had the smallest model sizes. Notably, YOLOv11-n was the best-performing model in terms of combined detection efficiency, model size, and computational complexity, making it a promising candidate for embedded real-time HDD. YOLOv5-s and v11-s were found to balance detection accuracy and model lightweightness, although their efficiency was only average. When comparing t/n and l/c versions, the changes in the backbone network of YOLOv9 had the greatest impact on detection accuracy, followed by the network depth_multiple and width_multiple of YOLOv5. The relative compression degrees of YOLOv5-n and YOLOv8-n were the highest, and v9-t achieved the greatest efficiency improvement in UAV HDD, followed by YOLOv10-n and v11-n. Full article
(This article belongs to the Special Issue AI and Smart Sensors for Intelligent Transportation Systems)

27 pages, 5623 KB  
Article
Torque Ripple Minimization for Switched Reluctance Motor Drives Based on Harris Hawks–Radial Basis Function Approximation
by Jackson Oloo and Szamel Laszlo
Energies 2025, 18(4), 1006; https://doi.org/10.3390/en18041006 - 19 Feb 2025
Cited by 3 | Viewed by 1002
Abstract
Switched reluctance motor drives are becoming attractive for electric vehicle propulsion systems due to their simple and cheap construction. However, their operation is degraded by torque ripples due to the salient nature of the stator and rotor poles. There are several methods of mitigating torque ripples in switched reluctance motors (SRMs). Apart from changing the geometrical design of the motor, the less costly technique involves the development of an adaptive switching strategy. By selecting suitable turn-on and turn-off angles, torque ripples in SRMs can be significantly reduced. This work combines the benefits of Harris Hawks Optimization (HHO) and Radial Basis Functions (RBFs) to search and estimate optimal switching angles. An objective function is developed under constraints and the HHO is utilized to perform search stages for optimal switching angles that guarantee minimal torque ripples at every speed and current operating point. In this work, instead of storing the θon, θoff  values in a look-up table, the values are passed on to an RBF model to learn the nonlinear relationship between the columns of data from the HHO and hence transform them into high-dimensional outputs. The values are used to train an enhanced neural network (NN) in an adaptive switching strategy to address the nonlinear magnetic characteristics of the SRM. The proposed method is implemented on a current chopping control-based SRM 8/6, 600 V model. Percentage torque ripples are used as the key performance index of the proposed method. A fuzzy logic switching angle compensation strategy is implemented in numerical simulations to validate the performance of the HHO-RBF method. Full article
(This article belongs to the Special Issue Advanced Electric Powertrain Technologies for Electric Vehicles)
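The RBF stage described above, replacing a θon/θoff look-up table with a smooth interpolant over (speed, current) operating points, can be sketched as follows. The operating points, angles, and kernel width are invented for illustration, and a normalized Gaussian RBF (a weighted average) stands in for the paper's trained RBF model.

```python
import math

# hypothetical (speed rpm, current A) -> (theta_on, theta_off) samples,
# e.g. the output of an HHO search at a few operating points
centers = [((500, 10), (38.0, 62.0)),
           ((1500, 20), (35.5, 58.5)),
           ((3000, 30), (32.0, 55.0))]

def rbf_angles(speed, current, width=400.0):
    """Normalized Gaussian RBF: a smooth weighted average of the stored
    optimal angles, so any operating point gets angles without a table."""
    weights = on_sum = off_sum = 0.0
    for (s, c), (t_on, t_off) in centers:
        w = math.exp(-((speed - s) ** 2 + (current - c) ** 2) / (2 * width ** 2))
        weights += w
        on_sum += w * t_on
        off_sum += w * t_off
    return on_sum / weights, off_sum / weights

print(rbf_angles(1000, 15))  # blends the two nearest operating points
```

The smoothness is the point: unlike a discrete table, the controller gets continuous angle commands between the optimized operating points.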

18 pages, 7157 KB  
Article
Proportional-Switch Adjustment Process-Based Day-by-Day Evolution Model for Mixed Traffic Flow in an Autonomous Driving Environment
by Yihao Huang, Han Zhang and Aiwu Kuang
World Electr. Veh. J. 2025, 16(1), 53; https://doi.org/10.3390/wevj16010053 - 20 Jan 2025
Cited by 1 | Viewed by 1078
Abstract
Given the rapid development of technologies such as new energy vehicles, autonomous driving, and vehicle-to-everything (V2X) communication, a mixed traffic flow comprising connected and autonomous vehicles (CAVs) and human-driven vehicles (HDVs) is anticipated to emerge. This necessitates the development of a daily dynamic evolution model for mixed traffic flow to address the dynamic traffic management needs of urban environments characterized by mixed traffic. The daily dynamic evolution model can capture the temporal evolution of traffic flow in road networks, with a focus on the daily path choice behavior of travelers and the evolving traffic flow in the network. First, based on the travel characteristics of CAVs and HDVs, the user group in a connected autonomous driving environment is classified into three categories, each adhering to the system optimal (SO) criterion, the user equilibrium (UE) criterion, or the stochastic user equilibrium (SUE) criterion. Next, the pure HDV traffic capacity BPR (Bureau of Public Roads) function is adapted into a heterogeneous traffic flow travel time function to compute the travel time cost for mixed traffic flow. Based on the energy consumption calculation formula for HDVs, the impact of CAVs is fully considered to establish the travel energy consumption cost for both CAVs and HDVs. The total individual travel cost for CAVs and HDVs encompasses both travel time cost and energy consumption cost. Furthermore, a daily dynamic evolution model for mixed traffic flow in a connected autonomous driving environment is developed using the proportional-switch adjustment process (PAP) model. The fundamental properties of the model are validated. Finally, numerical simulations on an N-dimensional (N-D) network confirm the validity and effectiveness of the daily evolution model for mixed traffic flow. 
A sensitivity analysis of traveler responses in the daily evolution model reveals that, as the sensitivity of CAVs to impedance changes increases, the fluctuations in mixed traffic flow during the early stages of evolution become more pronounced, and the time required to reach a mixed-equilibrium state decreases. Therefore, the PAP-based daily dynamic evolution model for mixed traffic flow effectively captures the evolution process of CAV and HDV mixed traffic flow and supports urban traffic management in a connected autonomous driving environment. Full article
(This article belongs to the Special Issue Vehicle Safe Motion in Mixed Vehicle Technologies Environment)
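The proportional-switch adjustment process (PAP) at the core of the model above can be illustrated on a toy two-route network. The BPR parameters, demand, step size, and day count are all invented; this is a one-dimensional simplification of the paper's multi-class, multi-path model.

```python
def bpr(t0, flow, cap):
    """BPR travel-time function: t = t0 * (1 + 0.15 * (v / c)**4)."""
    return t0 * (1 + 0.15 * (flow / cap) ** 4)

def pap_evolve(demand=10.0, days=2000, alpha=0.005):
    """Day-by-day PAP: each day, flow switches from the costlier route to
    the cheaper one in proportion to (flow on the costlier route) times
    (the cost difference), until route costs equalize."""
    x1 = demand / 2                      # route 1: t0=10, cap=5
    for _ in range(days):
        x2 = demand - x1                 # route 2: t0=12, cap=8
        c1, c2 = bpr(10, x1, 5), bpr(12, x2, 8)
        if c1 > c2:
            x1 -= alpha * x1 * (c1 - c2)
        else:
            x1 += alpha * x2 * (c2 - c1)
        x1 = min(max(x1, 0.0), demand)   # keep the flow feasible
    return x1, bpr(10, x1, 5), bpr(12, demand - x1, 8)

x1, c1, c2 = pap_evolve()
print(x1, c1, c2)  # route costs approach each other (user equilibrium)
```

Because the daily switch is proportional to the cost gap, the adjustment shrinks as equilibrium is approached, which is what gives PAP models their stable day-to-day convergence.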

20 pages, 9894 KB  
Article
Estimation of Strawberry Canopy Volume in Unmanned Aerial Vehicle RGB Imagery Using an Object Detection-Based Convolutional Neural Network
by Min-Seok Gang, Thanyachanok Sutthanonkul, Won Suk Lee, Shiyu Liu and Hak-Jin Kim
Sensors 2024, 24(21), 6920; https://doi.org/10.3390/s24216920 - 28 Oct 2024
Cited by 3 | Viewed by 1655
Abstract
Estimating canopy volumes of strawberry plants can be useful for predicting yields and establishing advanced management plans. Therefore, this study evaluated the spatial variability of strawberry canopy volumes using a ResNet50V2-based convolutional neural network (CNN) model trained with RGB images acquired through manual unmanned aerial vehicle (UAV) flights equipped with a digital color camera. A preprocessing method based on the You Only Look Once v8 Nano (YOLOv8n) object detection model was applied to correct image distortions influenced by fluctuating flight altitude under a manual maneuver. The CNN model was trained using actual canopy volumes measured using a cylindrical case and small expanded polystyrene (EPS) balls to account for internal plant spaces. Estimated canopy volumes using the CNN with flight altitude compensation closely matched the canopy volumes measured with EPS balls (nearly 1:1 relationship). The model achieved a slope, coefficient of determination (R2), and root mean squared error (RMSE) of 0.98, 0.98, and 74.3 cm3, respectively, corresponding to an 84% improvement over the conventional paraboloid shape approximation. In the application tests, the canopy volume map of the entire strawberry field was generated, highlighting the spatial variability of the plant’s canopy volumes, which is crucial for implementing site-specific management of strawberry crops. Full article
(This article belongs to the Special Issue Feature Papers in Smart Agriculture 2024)
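The 84% improvement quoted above is measured against the conventional paraboloid shape approximation, which treats a plant canopy as a solid paraboloid enclosing half the volume of the cylinder with the same base radius and height. A minimal sketch of that baseline, with hypothetical plant dimensions:

```python
import math

def paraboloid_volume(radius_cm, height_cm):
    """Conventional canopy approximation: a solid paraboloid of revolution
    has half the volume of its enclosing cylinder, V = (pi/2) * r^2 * h."""
    return 0.5 * math.pi * radius_cm ** 2 * height_cm

# hypothetical canopy footprint and height measured from a UAV image
print(paraboloid_volume(radius_cm=10.0, height_cm=12.0))
```

The CNN in the paper learns the mapping from imagery to measured volume directly, which is how it can correct for the internal air spaces this geometric formula ignores.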

23 pages, 3739 KB  
Article
The Shared Experience Actor–Critic (SEAC) Approach for Allocating Radio Resources and Mitigating Resource Collisions in 5G-NR-V2X Mode 2 Under Aperiodic Traffic Conditions
by Sawera Aslam, Daud Khan and KyungHi Chang
Sensors 2024, 24(20), 6769; https://doi.org/10.3390/s24206769 - 21 Oct 2024
Cited by 2 | Viewed by 2151
Abstract
5G New Radio (NR)-V2X, standardized by 3GPP Release 16, includes a distributed resource allocation Mode, known as Mode 2, that allows vehicles to autonomously select transmission resources using either sensing-based semi-persistent scheduling (SB-SPS) or dynamic scheduling (DS). In unmanaged 5G-NR-V2X scenarios, SB-SPS loses effectiveness with aperiodic and variable data. DS, while better for aperiodic traffic, faces challenges due to random selection, particularly in high traffic density scenarios, leading to increased collisions. To address these limitations, this study models the Cellular V2X network as a decentralized multi-agent networked Markov decision process (MDP), where each vehicle agent uses the Shared Experience Actor–Critic (SEAC) technique to optimize performance. The superiority of SEAC over SB-SPS and DS is demonstrated through simulations, showing that the SEAC with an N-step approach achieves an average improvement of approximately 18–20% in enhancing reliability, reducing collisions, and improving resource utilization under high vehicular density scenarios with aperiodic traffic patterns. Full article
(This article belongs to the Special Issue Advanced Vehicular Ad Hoc Networks: 2nd Edition)

23 pages, 5682 KB  
Article
IV-YOLO: A Lightweight Dual-Branch Object Detection Network
by Dan Tian, Xin Yan, Dong Zhou, Chen Wang and Wenshuai Zhang
Sensors 2024, 24(19), 6181; https://doi.org/10.3390/s24196181 - 24 Sep 2024
Cited by 9 | Viewed by 4300
Abstract
With the rapid growth in demand for security surveillance, assisted driving, and remote sensing, object detection networks with robust environmental perception and high detection accuracy have become a research focus. However, single-modality image detection technologies face limitations in environmental adaptability, often affected by factors such as lighting conditions, fog, rain, and obstacles like vegetation, leading to information loss and reduced detection accuracy. We propose an object detection network that integrates features from visible light and infrared images—IV-YOLO—to address these challenges. This network is based on YOLOv8 (You Only Look Once v8) and employs a dual-branch fusion structure that leverages the complementary features of infrared and visible light images for target detection. We designed a Bidirectional Pyramid Feature Fusion structure (Bi-Fusion) to effectively integrate multimodal features, reducing errors from feature redundancy and extracting fine-grained features for small object detection. Additionally, we developed a Shuffle-SPP structure that combines channel and spatial attention to enhance the focus on deep features and extract richer information through upsampling. Regarding model optimization, we designed a loss function tailored for multi-scale object detection, accelerating the convergence speed of the network during training. Compared with the current state-of-the-art Dual-YOLO model, IV-YOLO achieves mAP improvements of 2.8%, 1.1%, and 2.2% on the Drone Vehicle, FLIR, and KAIST datasets, respectively. On the Drone Vehicle and FLIR datasets, IV-YOLO has a parameter count of 4.31 M and achieves a frame rate of 203.2 fps, significantly outperforming YOLOv8n (5.92 M parameters, 188.6 fps on the Drone Vehicle dataset) and YOLO-FIR (7.1 M parameters, 83.3 fps on the FLIR dataset), which had previously achieved the best performance on these datasets. 
This demonstrates that IV-YOLO achieves higher real-time detection performance while maintaining lower parameter complexity, making it highly promising for applications in autonomous driving, public safety, and beyond. Full article
(This article belongs to the Section Sensor Networks)

17 pages, 4164 KB  
Article
G-YOLO: A Lightweight Infrared Aerial Remote Sensing Target Detection Model for UAVs Based on YOLOv8
by Xiaofeng Zhao, Wenwen Zhang, Yuting Xia, Hui Zhang, Chao Zheng, Junyi Ma and Zhili Zhang
Drones 2024, 8(9), 495; https://doi.org/10.3390/drones8090495 - 18 Sep 2024
Cited by 13 | Viewed by 3948
Abstract
A lightweight infrared target detection model, G-YOLO, based on an unmanned aerial vehicle (UAV) is proposed to address the issues of low accuracy in target detection of UAV aerial images in complex ground scenarios and large network models that are difficult to apply to mobile or embedded platforms. Firstly, the YOLOv8 backbone feature extraction network is improved and designed based on the lightweight network, GhostBottleneckV2, and the remaining part of the backbone network adopts the depth-separable convolution, DWConv, to replace part of the standard convolution, which effectively retains the detection effect of the model while greatly reducing the number of model parameters and calculations. Secondly, the neck structure is improved by the ODConv module, which adopts an adaptive convolutional structure to adaptively adjust the convolutional kernel size and step size, which allows for more effective feature extraction and detection based on targets at different scales. At the same time, the neck structure is further optimized using the attention mechanism, SEAttention, to improve the model’s ability to learn global information of input feature maps, which is then applied to each channel of each feature map to enhance the useful information in a specific channel and improve the model’s detection performance. Finally, the introduction of the SlideLoss loss function enables the model to calculate the differences between predicted and actual truth bounding boxes during the training process, and adjust the model parameters based on these differences to improve the accuracy and efficiency of object detection. The experimental results show that compared with YOLOv8n, the G-YOLO reduces the missed and false detection rates of infrared small target detection in complex backgrounds. 
The number of model parameters is reduced by 74.2%, the number of floating-point operations is reduced by 54.3%, and the FPS is improved by 71, which improves the detection efficiency of the model; the average accuracy (mAP) reaches 91.4%, which verifies the validity of the model for UAV-based infrared small target detection. Furthermore, the FPS of the model reaches 556, making it suitable for wider and more complex detection tasks such as small targets, long-distance targets, and other complex scenes. Full article

22 pages, 7164 KB  
Article
LettuceNet: A Novel Deep Learning Approach for Efficient Lettuce Localization and Counting
by Aowei Ruan, Mengyuan Xu, Songtao Ban, Shiwei Wei, Minglu Tian, Haoxuan Yang, Annan Hu, Dong Hu and Linyi Li
Agriculture 2024, 14(8), 1412; https://doi.org/10.3390/agriculture14081412 - 20 Aug 2024
Cited by 3 | Viewed by 2180
Abstract
Traditional lettuce counting relies heavily on manual labor, which is laborious and time-consuming. In this study, a simple and efficient method for localizing and counting lettuce is proposed, based only on lettuce field images acquired by an unmanned aerial vehicle (UAV) equipped with an RGB camera. In this method, a new lettuce counting model, called LettuceNet, is developed based on a weakly supervised deep learning (DL) approach. The LettuceNet network adopts a lightweight design that relies only on point-level labeled images to train and accurately predict the number and location of high-density lettuce (i.e., clusters of lettuce with small planting spacing, high leaf overlap, and unclear boundaries between adjacent plants). The proposed LettuceNet is thoroughly assessed in terms of localization and counting accuracy, model efficiency, and generalizability using the Shanghai Academy of Agricultural Sciences-Lettuce (SAAS-L) and the Global Wheat Head Detection (GWHD) datasets. The results demonstrate that LettuceNet achieves superior counting accuracy, localization, and efficiency when employing the enhanced MobileNetV2 as the backbone network. Specifically, the counting accuracy metrics, including mean absolute error (MAE), root mean square error (RMSE), normalized root mean square error (nRMSE), and coefficient of determination (R2), reach 2.4486, 4.0247, 0.0276, and 0.9933, respectively, and the F-score for localization accuracy is 0.9791. Moreover, LettuceNet is compared with other widely used plant counting methods, including the Multi-Column Convolutional Neural Network (MCNN), Dilated Convolutional Neural Networks (CSRNet), Scale Aggregation Network (SANet), TasselNet Version 2 (TasselNetV2), and Focal Inverse Distance Transform Maps (FIDTM).
The results indicate that the proposed LettuceNet performs best across all evaluated metrics, with a 13.27% higher R2 and a 72.83% lower nRMSE than SANet, the second most accurate method. In summary, the proposed LettuceNet demonstrates strong performance in localizing and counting high-density lettuce, showing great potential for field application. Full article
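The counting-accuracy metrics reported above (MAE, RMSE, nRMSE, R2) are standard and easy to reproduce. A minimal sketch follows, assuming nRMSE is the RMSE normalized by the mean true count (conventions vary between normalizing by the mean, the range, or the standard deviation):

```python
import math

def counting_metrics(y_true, y_pred):
    """Return (MAE, RMSE, nRMSE, R^2) for per-image plant-count predictions."""
    n = len(y_true)
    errors = [p - t for p, t in zip(y_pred, y_true)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_t = sum(y_true) / n
    nrmse = rmse / mean_t                           # normalized by mean true count (assumption)
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    ss_tot = sum((t - mean_t) ** 2 for t in y_true) # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, nrmse, r2
```

For example, true counts [100, 120, 140] against predictions [102, 118, 143] give MAE ≈ 2.33 and R2 ≈ 0.979.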

23 pages, 30652 KB  
Article
EUAVDet: An Efficient and Lightweight Object Detector for UAV Aerial Images with an Edge-Based Computing Platform
by Wanneng Wu, Ao Liu, Jianwen Hu, Yan Mo, Shao Xiang, Puhong Duan and Qiaokang Liang
Drones 2024, 8(6), 261; https://doi.org/10.3390/drones8060261 - 13 Jun 2024
Cited by 14 | Viewed by 3749
Abstract
Crafting an edge-based real-time object detector for unmanned aerial vehicle (UAV) aerial images is challenging because of limited computational resources and the small size of detected objects. Existing lightweight object detectors often prioritize speed over detecting extremely small targets. To better balance this trade-off, this paper proposes an efficient and low-complexity object detector for edge computing platforms deployed on UAVs, termed EUAVDet (Edge-based UAV Object Detector). Specifically, an efficient feature downsampling module and a novel multi-kernel aggregation block are first introduced into the backbone network to retain more feature details and capture richer spatial information. Subsequently, an improved feature pyramid network with a faster ghost module is incorporated into the neck network to fuse multi-scale features with fewer parameters. Experimental evaluations on the VisDrone, SeaDronesSeeV2, and UAVDT datasets demonstrate the effectiveness and plug-and-play capability of the proposed modules. Compared with the state-of-the-art YOLOv8 detector, EUAVDet achieves better performance on nearly all metrics, including parameters, FLOPs, mAP, and FPS. The smallest version, EUAVDet-n, contains only 1.34 M parameters and achieves over 20 FPS on the Jetson Nano. The algorithm strikes a better balance between detection accuracy and inference speed, making it suitable for edge-based UAV applications. Full article
(This article belongs to the Special Issue Advances in Perception, Communications, and Control for Drones)
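The abstract does not detail the "faster ghost module," but the parameter savings of ghost-style convolution can be illustrated with the original GhostNet formulation, in which a small primary convolution produces a fraction of the output channels and cheap depthwise operations generate the remaining "ghost" features. The function names and default kernel sizes below are illustrative assumptions, not EUAVDet's actual configuration:

```python
def conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k

def ghost_conv_params(c_in, c_out, k=1, d=3, ratio=2):
    """Weights of a ghost-style module: a primary conv producing
    c_out/ratio intrinsic channels, plus (ratio - 1) cheap d x d
    depthwise ops that generate the remaining ghost channels."""
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k       # ordinary convolution
    cheap = (ratio - 1) * intrinsic * d * d  # depthwise ghost features
    return primary + cheap
```

With c_in = 64 and c_out = 128, a standard 3×3 convolution needs 73,728 weights, while the ghost variant with a 1×1 primary convolution needs only 4,672, which is how such modules fuse multi-scale features "with fewer parameters."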

17 pages, 4725 KB  
Article
ITD-YOLOv8: An Infrared Target Detection Model Based on YOLOv8 for Unmanned Aerial Vehicles
by Xiaofeng Zhao, Wenwen Zhang, Hui Zhang, Chao Zheng, Junyi Ma and Zhili Zhang
Drones 2024, 8(4), 161; https://doi.org/10.3390/drones8040161 - 19 Apr 2024
Cited by 34 | Viewed by 4820
Abstract
A UAV infrared target detection model, ITD-YOLOv8, based on YOLOv8, is proposed to address missed and false detections caused by complex ground backgrounds and uneven target scales in UAV aerial infrared image target detection, as well as high computational complexity. Firstly, an improved YOLOv8 backbone feature extraction network is designed based on the lightweight network GhostHGNetV2. It can effectively capture target feature information at different scales, improving target detection accuracy in complex environments while remaining lightweight. Secondly, the VoVGSCSP module enhances the neck structure, improving the model's perceptual ability by incorporating global contextual information and multiscale features. At the same time, a lightweight convolutional operation called AXConv is introduced to replace the regular convolutional module; replacing traditional fixed-size convolution kernels with kernels of different sizes effectively reduces model complexity. Then, to further reduce missed and false detections, the CoordAtt attention mechanism is introduced in the neck of the model to weight the channel dimensions of the feature map, allowing the network to focus on important feature information and thereby improving the accuracy and robustness of object detection. Finally, the use of XIoU as the bounding-box loss function enhances the precision of target localization. The experimental findings demonstrate that ITD-YOLOv8, in comparison to YOLOv8n, effectively reduces missed and false detections when detecting multi-scale small targets in complex backgrounds. Additionally, it achieves a 41.9% reduction in model parameters and a 25.9% decrease in floating-point operations, while the mean average precision (mAP) reaches 93.5%, confirming the model's applicability for infrared target detection on unmanned aerial vehicles (UAVs).
Full article
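CoordAtt (coordinate attention) factorizes pooling into two direction-aware descriptors, one along the height and one along the width, and uses them to reweight the feature map. The sketch below is a simplified illustration that omits the module's learned 1×1 convolutions and channel reduction, so it shows only the pooling-and-gating structure, not the trained behavior:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def coord_att(x):
    """Simplified coordinate attention on a (C, H, W) feature map:
    direction-aware pooled descriptors become per-row and per-column
    gates that reweight the input (the learned transforms of the
    real module are omitted for brevity)."""
    c, h, w = x.shape
    pool_h = x.mean(axis=2)               # (C, H): average over width
    pool_w = x.mean(axis=1)               # (C, W): average over height
    gate_h = sigmoid(pool_h)[:, :, None]  # (C, H, 1) row gates
    gate_w = sigmoid(pool_w)[:, None, :]  # (C, 1, W) column gates
    return x * gate_h * gate_w            # broadcast reweighting
```

Because the gates retain positional information along each axis, this style of attention helps localize targets, which is why such mechanisms are placed in the neck to suppress missed and false detections.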
