MDPI - Publisher of Open Access Journals

44 pages, 12613 KB

Open Access Article

Quantum Theory of a Single Photon in an Arbitrary Medium

by Ashot S. Gevorkyan, Aleksandr V. Bogdanov and Vladimir V. Mareev

Particles 2026, 9(2), 58; https://doi.org/10.3390/particles9020058 - 18 May 2026

Viewed by 98

The quantum motion of a photon in an arbitrary medium was considered within the framework of the gauge symmetry group $SU(2) \otimes U(1)$ using the Yang–Mills (Y-M) equations for Abelian fields. A system of second-order partial differential equations (PDEs) for the vector wave function of a photon is derived using the first-order Y-M equations as identities. The full wave function of a photon was defined as the arithmetic mean of the components of the wave function. In a particular case, an equation is obtained for its full wave function, taking into account the structure of space-time in a plane perpendicular to the direction of propagation of the photon. The quantum state of a photon in a nanowaveguide was investigated, and it is shown that under certain conditions, it is reduced to the problem of two coupled 1D quantum harmonic oscillators (QHO) with variable frequencies. An explicit expression is obtained for the wave function of a photon, which is characterized by two vibrational quantum numbers. A quantum theory of a photon for a dissipative medium has been developed taking into account the processes of absorption and emission of photons. The mathematical expectation (ME) of the photon wave function is constructed as the product of two 2D integral representations in which the integrand is the solution of a system of two coupled second-order PDEs. The ME of the probability amplitude of the transition of a single-photon state into one of the two-photon entangled Bell states is constructed. Finally, it was proven that, in addition to frequency, spin, momentum and polarization, the photon also has a spatial structure responsible for the cross sections of processes in which this massless fundamental particle participates.

(This article belongs to the Special Issue Selected Papers from “The Modern Physics of Compact Stars and Relativistic Gravity 2025”)

16 pages, 2602 KB

Open Access Article

A Feature-Enhanced Network for Vegetable Disease Detection in Complex Environments

by Xuewei Wang and Jun Liu

Plants 2026, 15(8), 1182; https://doi.org/10.3390/plants15081182 - 11 Apr 2026

Viewed by 537

Abstract

Accurate vegetable disease detection in complex cultivation environments remains challenging because early lesions are often small, low-contrast, and easily confounded by cluttered backgrounds. To address this issue, we propose VDD-Net, a feature-enhanced detection network based on YOLOv10 for robust vegetable disease detection in protected agriculture. The proposed framework integrates three modules: a receptive field enhancement (RFE) module to improve local perception of small lesions, an adaptive channel fusion (ACF) module to strengthen multi-scale feature aggregation and suppress background interference, and a global context attention (GCA) module to capture long-range dependencies and improve contextual discrimination. Experiments on a custom vegetable disease dataset showed that VDD-Net achieved an mAP@0.5 of 95.2% with only 7.78 M parameters. To further evaluate robustness, zero-shot cross-domain testing was conducted on the PlantDoc dataset, where VDD-Net achieved an mAP@0.5 of 76.5%, outperforming the baseline and showing improved generalization to natural scenes. In addition, after TensorRT optimization and FP16 quantization, the model maintained real-time inference on edge platforms, reaching 89.3 FPS on Jetson AGX Orin and 24.2 FPS on Jetson Nano. These results indicate that VDD-Net provides a practical balance among detection accuracy, cross-domain robustness, and deployment efficiency for intelligent disease monitoring in modern agriculture.

(This article belongs to the Special Issue Combined Stresses on Plants: From Mechanisms to Adaptations)

24 pages, 2830 KB

Open Access Article

Real-Time Radar-Based Hand Motion Recognition on FPGA Using a Hybrid Deep Learning Model

by Taher S. Ahmed, Ahmed F. Mahmoud, Magdy Elbahnasawy, Peter F. Driessen and Ahmed Youssef

Sensors 2026, 26(1), 172; https://doi.org/10.3390/s26010172 - 26 Dec 2025

Viewed by 1134

Abstract

Radar-based hand motion recognition (HMR) presents several challenges, including sensor interference, clutter, and the limitations of small datasets, which collectively hinder the performance and real-time deployment of deep learning (DL) models. To address these issues, this paper introduces a novel real-time HMR framework that integrates advanced signal pre-processing, a hybrid convolutional neural network–support vector machine (CNN–SVM) architecture, and efficient hardware deployment. The pre-processing pipeline applies filtration, squared absolute value computation, and normalization to enhance radar data quality. To improve the robustness of DL models against noise and clutter, time-series radar signals are transformed into binarized images, providing a compact and discriminative representation for learning. A hybrid CNN-SVM model is then utilized for hand motion classification. The proposed model achieves a high classification accuracy of 98.91%, validating the quality of the extracted features and the efficiency of the proposed design. Additionally, it reduces the number of model parameters by approximately 66% relative to the most accurate recurrent baseline (CNN–GRU–SVM) and by up to 86% relative to CNN–BiLSTM–SVM, while achieving the highest SVM test accuracy of 92.79% across all CNN–RNN variants that use the same binarized radar images. For deployment, the model is quantized and implemented on two System-on-Chip (SoC) FPGA platforms—the Xilinx Zynq ZCU102 Evaluation Kit and the Xilinx Kria KR260 Robotics Starter Kit—using the Vitis AI toolchain. The system achieves end-to-end accuracies of 96.13% (ZCU102) and 95.42% (KR260). On the ZCU102, the system achieved a 70% reduction in execution time and a 74% improvement in throughput compared to the PC-based implementation. On the KR260, it achieved a 52% reduction in execution time and a 10% improvement in throughput relative to the same PC baseline. 
Both implementations exhibited minimal accuracy degradation relative to a PC-based setup: approximately 1% on ZCU102 and 2% on KR260. These results confirm the framework’s suitability for real-time, accurate, and resource-efficient radar-based hand motion recognition across diverse embedded environments.
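
The pre-processing chain named in this abstract (filtration, squared absolute value, normalization, conversion to binarized images) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the moving-average filter, the global min-max normalization, the 0.5 threshold, and the names `moving_average` and `preprocess` are all assumptions, because the abstract does not specify them.

```python
def moving_average(signal, window=3):
    """Smooth a 1D sequence with a centered moving average.

    Assumed stand-in for the unspecified "filtration" step; edge samples
    are averaged over the part of the window that exists.
    """
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        chunk = signal[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def preprocess(frames, window=3, threshold=0.5):
    """Turn rows of real or complex radar samples into a 0/1 image.

    Steps: filter each row, take |x|^2, min-max normalize over the whole
    frame set, then binarize at `threshold` (a hypothetical value).
    """
    power = [[abs(s) ** 2 for s in moving_average(row, window)] for row in frames]
    flat = [p for row in power for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on constant input
    return [[1 if (p - lo) / span >= threshold else 0 for p in row] for row in power]


frames = [[0, 0, 10, 0, 0], [1 + 1j] * 5]
image = preprocess(frames)
```

Each row of `image` is one binarized time series; stacking rows gives the compact 2D input that a CNN feature extractor could consume.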

(This article belongs to the Special Issue Sensor Systems for Gesture Recognition (3rd Edition))

58 pages, 8484 KB

Open Access Review

Recent Real-Time Aerial Object Detection Approaches, Performance, Optimization, and Efficient Design Trends for Onboard Performance: A Survey

by Nadin Habash, Ahmad Abu Alqumsan and Tao Zhou

Sensors 2025, 25(24), 7563; https://doi.org/10.3390/s25247563 - 12 Dec 2025

Cited by 2 | Viewed by 3343

Abstract

The rising demand for real-time perception in aerial platforms has intensified the need for lightweight, hardware-efficient object detectors capable of reliable onboard operation. This survey provides a focused examination of real-time aerial object detection, emphasizing algorithms designed for edge devices and UAV onboard processors, where computation, memory, and power resources are severely constrained. We first review the major aerial and remote-sensing datasets and analyze the unique challenges they introduce, such as small objects, fine-grained variation, multiscale variation, and complex backgrounds, which directly shape detector design. Recent studies addressing these challenges are then grouped, covering advances in lightweight backbones, fine-grained feature representation, multi-scale fusion, and optimized Transformer modules adapted for embedded environments. The review further highlights hardware-aware optimization techniques, including quantization, pruning, and TensorRT acceleration, as well as emerging trends in automated NAS tailored to UAV constraints. We discuss the adaptation of large pretrained models, such as CLIP-based embeddings and compressed Transformers, to meet onboard real-time requirements. By unifying architectural strategies, model compression, and deployment-level optimization, this survey offers a comprehensive perspective on designing next-generation detectors that achieve both high accuracy and true real-time performance in aerial applications.

(This article belongs to the Special Issue Image Processing and Analysis in Sensor-Based Object Detection)

18 pages, 5522 KB

Open Access Article

Automated Detection of Methane Leaks by Combining Infrared Imaging and a Gas-Faster Region-Based Convolutional Neural Network Technique

by Jinhui Zuo, Zhengqiang Li, Wenbin Xu, Jinxin Zuo and Zhipeng Rong

Sensors 2025, 25(18), 5714; https://doi.org/10.3390/s25185714 - 12 Sep 2025

Cited by 3 | Viewed by 2304

Abstract

Gas leaks threaten ecological and social safety. Non-contact infrared imaging enables large-scale, real-time measurements; however, in complex environments, weak signals from small leaks can hinder reliable detection. This study proposes a novel automated methane leak detection method based on infrared imaging and a Gas-Faster Region-based convolutional neural network (Gas R-CNN) to classify leakage amounts (≥30 mL/min). An uncooled infrared imaging system was employed to capture gas leak images containing leak volume features. We developed the Gas R-CNN model for gas leakage detection. This model introduces a multiscale feature network to improve leak feature extraction and enhancement, and it incorporates region-of-interest alignment to address the mismatch caused by double quantization. Feature extraction was enhanced by integrating ResNet50 with an efficient channel attention mechanism. Image enhancement techniques were applied to expand the dataset diversity. Leak detection capabilities were validated using the IOD-Video dataset, while the constructed gas dataset enabled the first quantitative leak assessment. The experimental results demonstrated that the model can accurately detect the leakage area and classify leakage amounts, enabling the quantitative analysis of infrared images. The proposed method achieved average precisions of 0.9599, 0.9647, and 0.9833 for leak rates of 30, 100, and 300 mL/min, respectively.

(This article belongs to the Section Optical Sensors)

28 pages, 7302 KB

Open Access Article

A Prototype of a Lightweight Structural Health Monitoring System Based on Edge Computing

by Yinhao Wang, Zhiyi Tang, Guangcai Qian, Wei Xu, Xiaomin Huang and Hao Fang

Sensors 2025, 25(18), 5612; https://doi.org/10.3390/s25185612 - 9 Sep 2025

Cited by 5 | Viewed by 2837

Abstract

Bridge Structural Health Monitoring (BSHM) is vital for assessing structural integrity and operational safety. Traditional wired systems are limited by high installation costs and complexity, while existing wireless systems still face issues with cost, synchronization, and reliability. Moreover, cloud-based methods for extreme event detection struggle to meet real-time and bandwidth constraints in edge environments. To address these challenges, this study proposes a lightweight wireless BSHM system based on edge computing, enabling local data acquisition and real-time intelligent detection of extreme events. The system consists of wireless sensor nodes for front-end acceleration data collection and an intelligent hub for data storage, visualization, and earthquake recognition. Acceleration data are converted into time–frequency images to train a MobileNetV2-based model. With model quantization and Neural Processing Unit (NPU) acceleration, efficient on-device inference is achieved. Experiments on a laboratory steel bridge verify the system’s high acquisition accuracy, precise clock synchronization, and strong anti-interference performance. Compared with inference on a general-purpose ARM CPU running the unquantized model, the quantized model deployed on the NPU achieves a 26× speedup in inference, a 35% reduction in power consumption, and less than 1% accuracy loss. This solution provides a cost-effective, reliable BSHM framework for small-to-medium-sized bridges, offering local intelligence and rapid response with strong potential for real-world applications.

(This article belongs to the Section Fault Diagnosis & Sensors)

26 pages, 4311 KB

Open Access Article

YOLOv13-Cone-Lite: An Enhanced Algorithm for Traffic Cone Detection in Autonomous Formula Racing Cars

by Zhukai Wang, Senhan Hu, Xuetao Wang, Yu Gao, Wenbo Zhang, Yaoyao Chen, Hai Lin, Tingting Gao, Junshuo Chen, Xianwu Gong, Binyu Wang and Weiyu Liu

Appl. Sci. 2025, 15(17), 9501; https://doi.org/10.3390/app15179501 - 29 Aug 2025

Cited by 6 | Viewed by 3913

Abstract

This study introduces YOLOv13-Cone-Lite, an enhanced algorithm based on YOLOv13s, designed to meet the stringent accuracy and real-time performance demands for traffic cone detection in autonomous formula racing cars on enclosed tracks. We improved detection accuracy by refining the network architecture. Specifically, the DS-C3k2_UIB module, an advanced iteration of the Universal Inverted Bottleneck (UIB), was integrated into the backbone to boost small object feature extraction. Additionally, the Non-Maximum Suppression (NMS)-free ConeDetect head was engineered to eliminate post-processing delays. To accommodate resource-limited onboard terminals, we minimized superfluous parameters through structural reparameterization pruning and performed 8-bit integer (INT8) quantization using the TensorRT toolkit, resulting in a lightweight model. Experimental findings show that YOLOv13-Cone-Lite achieves an mAP₅₀ of 92.9% (a 4.5% enhancement over the original YOLOv13s), a frame rate of 68 Hz (double the original model’s speed), and a parameter size of 8.7 MB (a 52.5% reduction). The proposed algorithm effectively addresses challenges such as intricate lighting and long-range detection of small objects and offers the automotive industry a framework to develop more efficient onboard perception systems, while informing object detection in other closed autonomous environments such as factory campuses. Notably, the model is optimized for enclosed tracks, and its generalization to open traffic needs further validation.

20 pages, 2004 KB

Open Access Communication

Towards Open-Set NLP-Based Multi-Level Planning for Robotic Tasks

by Peteris Racinskis, Oskars Vismanis, Toms Eduards Zinars, Janis Arents and Modris Greitans

Appl. Sci. 2024, 14(22), 10717; https://doi.org/10.3390/app142210717 - 19 Nov 2024

Cited by 3 | Viewed by 3388 | Correction

Abstract

This paper outlines a conceptual design for a multi-level natural language-based planning system and describes a demonstrator. The main goal of the demonstrator is to serve as a proof-of-concept by accomplishing end-to-end execution in a real-world environment, and showing a novel way of interfacing an LLM-based planner with open-set semantic maps. The target use-case is executing sequences of tabletop pick-and-place operations using an industrial robot arm and RGB-D camera. The demonstrator processes unstructured user prompts, produces high-level action plans, queries a map for object positions and grasp poses using open-set semantics, then uses the resulting outputs to parametrize and execute a sequence of action primitives. In this paper, the overall system structure, high-level planning using language models, low-level planning through action and motion primitives, as well as the implementation of two different environment modeling schemes (2.5 or fully 3-dimensional) are described in detail. The impacts of quantizing image embeddings on object recall are assessed and high-level planner performance is evaluated using a small reference scene data set. We observe that, for the simple constrained test command data set, the high-level planner is able to achieve a total success rate of 96.40%, while the semantic maps exhibit maximum recall rates of 94.69% and 92.29% for the 2.5d and 3d versions, respectively.

(This article belongs to the Special Issue Digital Technologies Enabling Modern Industries)

20 pages, 691 KB

Open Access Article

DiscHAR: A Discrete Approach to Enhance Human Activity Recognition in Cyber Physical Systems: Smart Homes

by Ishrat Fatima, Asma Ahmad Farhan, Maria Tamoor, Shafiq ur Rehman, Hisham Abdulrahman Alhulayyil and Fawaz Tariq

Computers 2024, 13(11), 300; https://doi.org/10.3390/computers13110300 - 19 Nov 2024

Cited by 4 | Viewed by 2075

Abstract

The main challenges in smart home systems and cyber-physical systems stem from insufficient data and unclear interpretation, so considerable work remains in this field. In this work, we propose a practical approach called Discrete Human Activity Recognition (DiscHAR), based on prior research, to enhance Human Activity Recognition (HAR). Our goal is to generate diverse data to build better models for activity classification. To tackle overfitting, which often occurs with small datasets, we generate data and convert them into discrete forms, improving classification accuracy. Our methodology includes advanced techniques like the R-Frame method for sampling and the Mixed-up approach for data generation. We apply K-means vector quantization to categorize the data, and through the elbow method, we determine the optimal number of clusters. The discrete sequences are converted into one-hot encoded vectors and fed into a CNN model to ensure precise recognition of human activities. Evaluations on the OPP79, PAMAP2, and WISDM datasets show that our approach outperforms existing models, achieving 89% accuracy for OPP79, 93.24% for PAMAP2, and 100% for WISDM. These results demonstrate the model’s effectiveness in identifying complex activities captured by wearable devices. Our work combines theory and practice to address ongoing challenges in this field, aiming to improve the reliability and performance of activity recognition systems in dynamic environments.
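
The discretization step in this abstract (K-means vector quantization, an elbow check on the number of clusters, then one-hot encoding) can be illustrated with a small sketch. It is an assumption-laden toy, not the DiscHAR code: the function names, the seeding of centroids from the first `k` points, and the sample `windows` data are all hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; the first k points seed the centroids (a simplification)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:  # assignments have stabilized
            break
        centroids = updated
    return centroids


def quantize(points, centroids):
    """Vector quantization: replace each vector by the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c])) for p in points]


def inertia(points, centroids):
    """Within-cluster sum of squared distances, the quantity plotted in an elbow curve."""
    return sum(sq_dist(p, centroids[c]) for p, c in zip(points, quantize(points, centroids)))


def one_hot(codes, k):
    """Encode each cluster index as a length-k indicator vector."""
    return [[1 if i == c else 0 for i in range(k)] for c in codes]


# Hypothetical 2D sensor feature windows forming two obvious groups.
windows = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2)]
inertias = {k: inertia(windows, kmeans(windows, k)) for k in (1, 2, 3)}
centroids = kmeans(windows, k=2)
codes = quantize(windows, centroids)
encoded = one_hot(codes, k=2)
```

In an elbow analysis, `inertias` would be inspected for the value of `k` after which the drop flattens; on this toy data the large fall from `k=1` to `k=2` marks the elbow. The rows of `encoded` are the kind of one-hot sequence a CNN classifier could take as input.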

14 pages, 6513 KB

Open Access Article

YOLO-Chili: An Efficient Lightweight Network Model for Localization of Pepper Picking in Complex Environments

by Hailin Chen, Ruofan Zhang, Jialiang Peng, Hao Peng, Wenwu Hu, Yi Wang and Ping Jiang

Appl. Sci. 2024, 14(13), 5524; https://doi.org/10.3390/app14135524 - 25 Jun 2024

Cited by 7 | Viewed by 2961

Abstract

Currently, few deep models are applied to pepper-picking detection, and existing generalized neural networks face issues such as large model parameters, prolonged training times, and low accuracy. To address these challenges, this paper proposes the YOLO-chili target detection algorithm for chili pepper detection. Initially, the classical target detection algorithm YOLOv5 serves as the benchmark model. We introduce an adaptive spatial feature pyramid structure that combines the attention mechanism and the concept of multi-scale prediction to enhance the model’s detection capabilities for occluded and small target peppers. Subsequently, we incorporate a three-channel attention mechanism module to improve the algorithm’s long-distance recognition ability and reduce interference from redundant objects. Finally, we employ a quantized pruning method to reduce model parameters and achieve lightweight processing. Applying this method to our custom chili pepper dataset, we achieve an average precision (AP) value of 93.11% for chili pepper detection, with an accuracy rate of 93.51% and a recall rate of 92.55%. The experimental results demonstrate that YOLO-chili enables accurate and real-time pepper detection in complex orchard environments.

28 pages, 2032 KB

Open Access Article

A Comparative Study of Preprocessing and Model Compression Techniques in Deep Learning for Forest Sound Classification

by Thivindu Paranayapa, Piumini Ranasinghe, Dakshina Ranmal, Dulani Meedeniya and Charith Perera

Sensors 2024, 24(4), 1149; https://doi.org/10.3390/s24041149 - 9 Feb 2024

Cited by 31 | Viewed by 4660

Abstract

Deep-learning models play a significant role in modern software solutions: they handle complex tasks, improve accuracy, automate processes, and adapt to diverse domains, contributing to advancements in various industries. This paper presents a comparative study of deep-learning techniques that can be deployed on resource-constrained edge devices. As a novel contribution, we analyze the performance of seven Convolutional Neural Network models in the context of data augmentation, feature extraction, and model compression using acoustic data. The results show that the best performers achieve a favorable trade-off between model accuracy and size when compressed with weight and filter pruning followed by 8-bit quantization. Following the study workflow on the forest sound dataset, MobileNet-v3-small and ACDNet achieved accuracies of 87.95% and 85.64%, respectively, while maintaining compact sizes of 243 KB and 484 KB, respectively. The study therefore concludes that CNNs can be optimized and compressed for deployment on resource-constrained edge devices to classify forest environment sounds.
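
To make the 8-bit quantization step concrete, the sketch below maps floating-point weights to unsigned 8-bit integers with a scale and zero point, then maps them back. It assumes a generic per-tensor affine scheme; the abstract does not say which scheme or toolchain the authors used, and the names `quantize_uint8` and `dequantize` are illustrative only.

```python
def quantize_uint8(weights):
    """Affine (asymmetric) quantization of a list of floats to integers in [0, 255].

    The range is widened to include 0.0 so that zero is exactly representable,
    which is a common convention rather than something stated in the abstract.
    """
    lo = min(min(weights), 0.0)
    hi = max(max(weights), 0.0)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for an all-zero tensor
    zero_point = round(-lo / scale)
    q = [max(0, min(255, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the 8-bit codes."""
    return [(v - zero_point) * scale for v in q]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale, zero_point = quantize_uint8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight now occupies one byte instead of four (for float32), which is where the roughly fourfold size reduction of 8-bit quantization comes from; the price is a rounding error of about half a quantization step per weight, reported here as `max_error`.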

(This article belongs to the Section Internet of Things)

15 pages, 465 KB

Open Access Article

A Study on Determining the Optimal Feedback Rate in Distributed Block Diagonalization with Limited Feedback for Dense Cellular Networks

by Taehwi Kim and Moonsik Min

Mathematics 2024, 12(3), 460; https://doi.org/10.3390/math12030460 - 31 Jan 2024

Cited by 2 | Viewed by 1251

Abstract

In this study, we explore a downlink cellular network where each base station (BS) engages in simultaneous communication with multiple users through spatial division multiple access (SDMA). The positions of both BSs and users are established through independent random point processes, effectively representing the cellular network. SDMA utilizes block diagonalization (BD) at each BS, employing multiple receive antennas for each user. To implement BD, users quantize and provide feedback on their downlink channels to their respective BSs. The net spectral efficiency, measuring the effective rate accounting for both downlink and uplink resource usage, serves as a performance metric. In prior research, the optimal feedback rate in terms of maximizing net spectral efficiency has been approximated in this scenario. The corresponding approximations effectively illustrate the asymptotic behavior of the optimal number as a function of the length of the coherent channel block. However, the accuracy of the approximation diminishes when the length of the coherent channel block is relatively small. Given that the length of the coherent channel block can assume relatively small values depending on wireless environments, achieving a precise estimate across the entire range of the coherent block length holds significant importance. Consequently, this paper focuses primarily on enhancing the accuracy of the approximation for the optimal feedback rate. In order to achieve a more precise estimation, we analyze the derivative of the net spectral efficiency, which encompasses two functions that demonstrate distinct growth rates. In contrast to prior studies, both functions are rigorously approximated through mathematical analysis. As a result, the proposed approximation significantly improves the accuracy compared to previous studies, particularly when dealing with short coherent channel block lengths. 
Moreover, this approximation generally achieves near-optimal performance, regardless of system parameters.

15 pages, 19203 KB

Open Access Article

Improved Faster Region-Based Convolutional Neural Networks (R-CNN) Model Based on Split Attention for the Detection of Safflower Filaments in Natural Environments

by Zhenguo Zhang, Ruimeng Shi, Zhenyu Xing, Quanfeng Guo and Chao Zeng

Agronomy 2023, 13(10), 2596; https://doi.org/10.3390/agronomy13102596 - 11 Oct 2023

Cited by 22 | Viewed by 3425

Abstract

The accurate acquisition of safflower filament information is a prerequisite for robotic picking operations. To detect safflower filaments accurately under varying illumination, branch and leaf occlusion, and weather conditions, an improved Faster R-CNN model for filaments was proposed. Because safflower filaments appear dense and small in the images, the model uses ResNeSt-101, which has a residual network structure, as the backbone feature extraction network to enhance the expressive power of the extracted features. Region of Interest (ROI) Align then replaces ROI Pooling to reduce the feature errors caused by double quantization. In addition, partitioning around medoids (PAM) clustering was used to optimize the scale and number of the network's initial anchors and thereby improve the detection accuracy for small safflower filaments. The test results showed that the mean Average Precision (mAP) of the improved Faster R-CNN reached 91.49%. Compared with Faster R-CNN, YOLOv3, YOLOv4, YOLOv5, and YOLOv6, the improved Faster R-CNN increased the mAP by 9.52%, 2.49%, 5.95%, 3.56%, and 1.47%, respectively. The mAP of safflower filament detection was higher than 91% on sunny, cloudy, and overcast days and under sunlight, backlight, branch and leaf occlusion, and dense occlusion. The improved Faster R-CNN can accurately detect safflower filaments in natural environments and can provide technical support for the recognition of small-sized crops.
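
The difference between ROI Pooling and ROI Align mentioned in this abstract comes down to how a fractional coordinate on the feature map is read. The sketch below is a simplified illustration under stated assumptions, not the paper's code: `rounded_sample` stands in for the coordinate snapping that causes quantization error, `bilinear_sample` shows the interpolation that ROI Align uses instead, and a full ROI Align layer would additionally average several such samples per output bin.

```python
import math


def bilinear_sample(feature, y, x):
    """Read a 2D feature map (list of rows) at fractional (y, x) by bilinear interpolation."""
    h, w = len(feature), len(feature[0])
    y = min(max(y, 0.0), h - 1.0)  # clamp to the valid grid
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = math.floor(y), math.floor(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def rounded_sample(feature, y, x):
    """Pooling-style lookup: snap the coordinate to the nearest grid cell first."""
    return feature[round(y)][round(x)]


feature = [[0.0, 1.0],
           [2.0, 3.0]]
aligned = bilinear_sample(feature, 0.25, 0.75)
snapped = rounded_sample(feature, 0.25, 0.75)
```

For small, dense targets the gap between `aligned` and `snapped` matters, because a shift of a fraction of a feature-map cell can correspond to many pixels in the input image.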

(This article belongs to the Special Issue AI, Sensors and Robotics for Smart Agriculture)

24 pages, 10282 KB

Open Access Article

Research on Identification and Detection of Transmission Line Insulator Defects Based on a Lightweight YOLOv5 Network

by Zhilong Yu, Yanqiao Lei, Feng Shen, Shuai Zhou and Yue Yuan

Remote Sens. 2023, 15(18), 4552; https://doi.org/10.3390/rs15184552 - 15 Sep 2023

Cited by 29 | Viewed by 3393

Abstract

Transmission line fault detection using drones provides real-time assessment of the operational status of transmission equipment, and therefore it has immense importance in ensuring stable functioning of the transmission lines. Currently, identification of transmission line equipment relies predominantly on manual inspections that are susceptible to the influence of natural surroundings, resulting in sluggishness and a high rate of false detections. In view of this, in this study, we propose an insulator defect recognition algorithm based on a YOLOv5 model with a new lightweight network as the backbone network, combining noise reduction and target detection. First, we propose a new noise reduction algorithm, i.e., the adaptive neighborhood-weighted median filtering (NW-AMF) algorithm. This algorithm employs a weighted summation technique to determine the median value of the pixel point’s neighborhood, effectively filtering out noise from the captured aerial images. Consequently, this approach significantly mitigates the adverse effects of varying noise levels on target detection. Subsequently, the RepVGG lightweight network structure is improved to the newly proposed lightweight structure called RcpVGG-YOLOv5. This structure facilitates single-branch inference, multi-branch training, and branch normalization, thereby improving the quantization performance while simultaneously striking a balance between target detection accuracy and speed. Furthermore, we propose a new loss function, i.e., Focal EIOU, to replace the original CIOU loss function. This optimization incorporates a penalty on the edge length of the target frame, which improves the contribution of the high-quality target gradient. This modification effectively addresses the issue of imbalanced positive and negative samples for small targets, suppresses background positive samples, and ultimately enhances the accuracy of detection. 
Finally, to align more closely with real-world engineering applications, the dataset used in this study consists of machine patrol images captured by the Unmanned Aerial Systems (UAS) of the Yunnan Power Supply Bureau Company. The experimental results show that the proposed algorithm improves both accuracy and inference speed compared with YOLOv5s, YOLOv7, and YOLOv8. Specifically, it achieves a 3.7% increase in accuracy and a 48.2% increase in inference speed over YOLOv5s, a 2.7% accuracy improvement and a 33.5% increase in inference speed over YOLOv7, and a 1.5% accuracy improvement and a 13.1% increase in inference speed over YOLOv8. Ablation experiments validate the effectiveness of the proposed algorithm. Consequently, the method is practically applicable to detection in aerial images of transmission lines in complex environments. In future research, we recommend continuing to collect aerial images for iterative training, optimizing the model further, and investigating in depth the challenges of detecting small targets. Such work is important for the advancement of transmission line detection.
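The Focal EIOU loss mentioned in this abstract can be illustrated with a minimal sketch. It assumes the formulation commonly given for Focal-EIoU: an EIoU term (1 minus IoU, plus a normalized center-distance penalty, plus separate width and height penalties, which correspond to the edge-length penalty described above) weighted by IoU raised to a power gamma. The function name `focal_eiou_loss`, the `(x1, y1, x2, y2)` box layout, and the default `gamma=0.5` are assumptions made for illustration, not details taken from the paper, and this is not the authors' implementation.

```python
def focal_eiou_loss(pred, target, gamma=0.5, eps=1e-9):
    """Focal-EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; gamma=0.5 is an assumed default.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of the predicted and target boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared center distance, normalized by the squared enclosing diagonal.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Edge-length penalties: width and height differences, each normalized
    # by the matching side of the enclosing box.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    eiou = 1.0 - iou + center_term + width_term + height_term

    # Focal weighting: boxes with higher IoU contribute more to the loss.
    return (iou ** gamma) * eiou
```

In this form the weight `iou ** gamma` is zero for boxes that do not overlap at all, so such pairs contribute nothing to the loss. That is one way to read the abstract's point about raising the contribution of high-quality targets.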

(This article belongs to the Special Issue State of the Art in Object Detection Based on Computer Vision and Image Processing)

20 pages, 8263 KB

Open AccessArticle

A Thermal Infrared Pedestrian-Detection Method for Edge Computing Devices

by Shuai You, Yimu Ji, Shangdong Liu, Chaojun Mei, Xiaoliang Yao and Yujian Feng

Sensors 2022, 22(17), 6710; https://doi.org/10.3390/s22176710 - 5 Sep 2022

Cited by 7 | Viewed by 4611

Abstract

Thermal imaging pedestrian-detection systems perform well across different lighting scenarios but struggle with weak texture, object occlusion, and small objects. Meanwhile, large high-performance models have higher latency on edge devices with limited computing power. To address these problems, we propose a real-time thermal imaging pedestrian-detection method for edge computing devices. First, we use multi-scale mosaic data augmentation to enhance the diversity and texture of objects, which alleviates the impact of complex environments. Then, a parameter-free attention mechanism is introduced into the network to enhance features while barely increasing the computing cost of the network. Finally, we accelerate multi-channel video detection on edge computing devices through quantization and multi-threading techniques. Additionally, we create a high-quality thermal infrared dataset to facilitate the research. Comparative experiments against other methods on the self-built dataset, YDTIP, and on three public datasets show that our method has certain advantages.

(This article belongs to the Section Intelligent Sensors)

Search Results (17)