Search Results (69)

Search Parameters:
Keywords = orin

30 pages, 7223 KiB  
Article
Smart Wildlife Monitoring: Real-Time Hybrid Tracking Using Kalman Filter and Local Binary Similarity Matching on Edge Network
by Md. Auhidur Rahman, Stefano Giordano and Michele Pagano
Computers 2025, 14(8), 307; https://doi.org/10.3390/computers14080307 - 30 Jul 2025
Viewed by 131
Abstract
Real-time wildlife monitoring on edge devices poses significant challenges due to limited power, constrained bandwidth, and unreliable connectivity, especially in remote natural habitats. Conventional object detection systems often transmit redundant data of the same animals detected across multiple consecutive frames as part of a single event, resulting in increased power consumption and inefficient bandwidth usage. Furthermore, maintaining consistent animal identities in the wild is difficult due to occlusions, variable lighting, and complex environments. In this study, we propose a lightweight hybrid tracking framework built on the YOLOv8m deep neural network, combining motion-based Kalman filtering with Local Binary Pattern (LBP) similarity for appearance-based re-identification using texture and color features. To handle ambiguous cases, we further incorporate Hue-Saturation-Value (HSV) color space similarity. This approach enhances identity consistency across frames while reducing redundant transmissions. The framework is optimized for real-time deployment on edge platforms such as the NVIDIA Jetson Orin Nano and Raspberry Pi 5. We evaluate our method against state-of-the-art trackers using event-based metrics such as MOTA, HOTA, and IDF1, with a focus on occlusion handling, trajectory analysis, and counting of detected animals during both day and night. Our approach significantly enhances tracking robustness, reduces ID switches, and provides more accurate detection and counting compared to existing methods. When transmitting time-series data and detected frames, it achieves up to 99.87% bandwidth savings and 99.67% power reduction, making it highly suitable for edge-based wildlife monitoring in resource-constrained environments. Full article
(This article belongs to the Special Issue Intelligent Edge: When AI Meets Edge Computing)
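As a rough illustration of the appearance-matching step described above, the sketch below compares LBP texture histograms of two detection crops; the helper names, parameters, and threshold are illustrative assumptions, not the authors' implementation.

```python
# Illustrative only: LBP-histogram appearance similarity for re-identification.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray_patch, points=8, radius=1):
    """Normalized LBP histogram of a grayscale detection crop."""
    lbp = local_binary_pattern(gray_patch, points, radius, method="uniform")
    n_bins = points + 2  # "uniform" LBP produces P + 2 distinct codes
    hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins), density=True)
    return hist

def appearance_similarity(patch_a, patch_b):
    """Histogram intersection in [0, 1]; higher means more similar texture."""
    ha, hb = lbp_histogram(patch_a), lbp_histogram(patch_b)
    return float(np.minimum(ha, hb).sum())

# A Kalman-predicted box proposes a match; the track is confirmed only if the
# texture similarity also clears a threshold (value assumed for illustration).
SIM_THRESHOLD = 0.7
```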

20 pages, 766 KiB  
Article
Accelerating Deep Learning Inference: A Comparative Analysis of Modern Acceleration Frameworks
by Ishrak Jahan Ratul, Yuxiao Zhou and Kecheng Yang
Electronics 2025, 14(15), 2977; https://doi.org/10.3390/electronics14152977 - 25 Jul 2025
Viewed by 282
Abstract
Deep learning (DL) continues to play a pivotal role in a wide range of intelligent systems, including autonomous machines, smart surveillance, industrial automation, and portable healthcare technologies. These applications often demand low-latency inference and efficient resource utilization, especially when deployed on embedded or edge devices with limited computational capacity. As DL models become increasingly complex, selecting the right inference framework is essential to meeting performance and deployment goals. In this work, we conduct a comprehensive comparison of five widely adopted inference frameworks: PyTorch, ONNX Runtime, TensorRT, Apache TVM, and JAX. All experiments are performed on the NVIDIA Jetson AGX Orin platform, a high-performance computing solution tailored for edge artificial intelligence workloads. The evaluation considers several key performance metrics, including inference accuracy, inference time, throughput, memory usage, and power consumption. Each framework is tested using a wide range of convolutional and transformer models and analyzed in terms of deployment complexity, runtime efficiency, and hardware utilization. Our results show that certain frameworks offer superior inference speed and throughput, while others provide advantages in flexibility, portability, or ease of integration. We also observe meaningful differences in how each framework manages system memory and power under various load conditions. This study offers practical insights into the trade-offs associated with deploying DL inference on resource-constrained hardware. Full article
(This article belongs to the Special Issue Hardware Acceleration for Machine Learning)
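For readers who want to reproduce this kind of comparison, a minimal latency and throughput measurement loop with ONNX Runtime might look like the sketch below; the model path, input shape, and iteration counts are placeholders, not values from the paper.

```python
# Illustrative latency/throughput measurement with ONNX Runtime.
import time
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "model.onnx",  # placeholder model path
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
input_name = sess.get_inputs()[0].name
x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed input shape

for _ in range(20):                      # warm-up iterations
    sess.run(None, {input_name: x})

n_runs = 200
start = time.perf_counter()
for _ in range(n_runs):
    sess.run(None, {input_name: x})
elapsed = time.perf_counter() - start

print(f"mean latency: {1000 * elapsed / n_runs:.2f} ms, "
      f"throughput: {n_runs / elapsed:.1f} inferences/s")
```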

18 pages, 8446 KiB  
Article
Evaluation of Single-Shot Object Detection Models for Identifying Fanning Behavior in Honeybees at the Hive Entrance
by Tomyslav Sledevič
Agriculture 2025, 15(15), 1609; https://doi.org/10.3390/agriculture15151609 - 25 Jul 2025
Viewed by 271
Abstract
Thermoregulatory fanning behavior in honeybees is a vital indicator of colony health and environmental response. This study presents a novel dataset of 18,000 annotated video frames containing 57,597 instances capturing fanning behavior at the hive entrance across diverse conditions. Three state-of-the-art single-shot object detection models (YOLOv8, YOLO11, YOLO12) are evaluated using standard RGB input and two motion-enhanced encodings: Temporally Stacked Grayscale (TSG) and Temporally Encoded Motion (TEM). Results show that models incorporating temporal information via TSG and TEM significantly outperform RGB-only input, achieving up to 85% mAP@50 with real-time inference capability on high-performance GPUs. Deployment tests on the Jetson AGX Orin platform demonstrate feasibility for edge computing, though with accuracy–speed trade-offs in smaller models. This work advances real-time, non-invasive monitoring of hive health, with implications for precision apiculture and automated behavioral analysis. Full article
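A minimal sketch of a temporally stacked grayscale (TSG)-style input follows: three consecutive grayscale frames are stacked as the channels of a single image so that a single-shot detector can see short-term motion. The frame spacing, video path, and preprocessing are assumptions, not the paper's exact pipeline.

```python
# Illustrative TSG-style encoding of short-term motion for a detector input.
import cv2
import numpy as np

def stack_grayscale_frames(frames_bgr):
    """frames_bgr: list of 3 consecutive BGR frames (H, W, 3) -> (H, W, 3) uint8."""
    assert len(frames_bgr) == 3
    grays = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames_bgr]
    return np.stack(grays, axis=-1)  # channel i holds frame t-2+i

cap = cv2.VideoCapture("hive_entrance.mp4")  # placeholder path
buffer = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    buffer.append(frame)
    if len(buffer) == 3:
        tsg_input = stack_grayscale_frames(buffer)
        # feed tsg_input to the detector here
        buffer.pop(0)
cap.release()
```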

30 pages, 4239 KiB  
Article
Real-Time Object Detection for Edge Computing-Based Agricultural Automation: A Case Study Comparing the YOLOX and YOLOv12 Architectures and Their Performance in Potato Harvesting Systems
by Joonam Kim, Giryeon Kim, Rena Yoshitoshi and Kenichi Tokuda
Sensors 2025, 25(15), 4586; https://doi.org/10.3390/s25154586 - 24 Jul 2025
Viewed by 279
Abstract
In this paper, we present a case study, covering both implementation experience and a methodological framework, through a comprehensive comparative analysis of the YOLOX and YOLOv12 object detection models for agricultural automation systems deployed on the Jetson AGX Orin edge computing platform. We examined the architectural differences between the models and their impact on detection capabilities in data-imbalanced potato-harvesting environments. Both models were trained on identical datasets with images capturing potatoes, soil clods, and stones, and their performances were evaluated through 30 independent trials under controlled conditions. Statistical analysis confirmed that YOLOX achieved a significantly higher throughput (107 vs. 45 FPS, p < 0.01) and superior energy efficiency (0.58 vs. 0.75 J/frame) than YOLOv12, meeting real-time processing requirements for agricultural automation. Although both models achieved an equivalent overall detection accuracy (F1-score, 0.97), YOLOv12 demonstrated specialized capabilities for challenging classes, achieving 42% higher recall for underrepresented soil clod objects (0.725 vs. 0.512, p < 0.01) and superior precision for small objects (0–3000 pixels). Architectural analysis identified YOLOv12's residual efficient layer aggregation network backbone and area attention mechanism as key enablers of balanced precision–recall characteristics, which were particularly valuable for addressing agricultural data imbalance. However, NVIDIA Nsight profiling revealed implementation inefficiencies in the YOLOv12 multiprocess architecture, which prevented the theoretical advantages from being fully realized in edge computing environments. These findings provide empirically grounded guidelines for model selection in agricultural automation systems, highlighting the critical interplay between architectural design, implementation efficiency, and application-specific requirements. Full article
(This article belongs to the Section Smart Agriculture)
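The energy-efficiency figures quoted above relate average board power to throughput as J/frame = watts / FPS; the quick check below uses implied power values, which are illustrative and not reported in the abstract.

```python
# Back-of-the-envelope check: energy per frame from average power and throughput.
def joules_per_frame(avg_power_w, fps):
    return avg_power_w / fps

print(joules_per_frame(62.0, 107))  # ~0.58 J/frame, consistent with the YOLOX figure
print(joules_per_frame(33.8, 45))   # ~0.75 J/frame, consistent with the YOLOv12 figure
```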

26 pages, 78396 KiB  
Article
SWRD–YOLO: A Lightweight Instance Segmentation Model for Estimating Rice Lodging Degree in UAV Remote Sensing Images with Real-Time Edge Deployment
by Chunyou Guo and Feng Tan
Agriculture 2025, 15(15), 1570; https://doi.org/10.3390/agriculture15151570 - 22 Jul 2025
Viewed by 297
Abstract
Rice lodging severely affects crop growth, yield, and mechanized harvesting efficiency. The accurate detection and quantification of lodging areas are crucial for precision agriculture and timely field management. However, Unmanned Aerial Vehicle (UAV)-based lodging detection faces challenges such as complex backgrounds, variable lighting, and irregular lodging patterns. To address these issues, this study proposes SWRD–YOLO, a lightweight instance segmentation model that enhances feature extraction and fusion using advanced convolution and attention mechanisms. The model employs an optimized loss function to improve localization accuracy, achieving precise lodging area segmentation. Additionally, a grid-based lodging ratio estimation method is introduced, dividing images into fixed-size grids to calculate local lodging proportions and aggregate them for robust overall severity assessment. Evaluated on a self-built rice lodging dataset, the model achieves 94.8% precision, 88.2% recall, 93.3% mAP@0.5, and 91.4% F1 score, with real-time inference at 16.15 FPS on an embedded NVIDIA Jetson Orin NX device. Compared to the baseline YOLOv8n-seg, precision, recall, mAP@0.5, and F1 score improved by 8.2%, 16.5%, 12.8%, and 12.8%, respectively. These results confirm the model’s effectiveness and potential for deployment in intelligent crop monitoring and sustainable agriculture. Full article
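A minimal sketch of the grid-based lodging ratio idea follows: a binary lodging mask is split into fixed-size cells, each cell's lodged fraction is computed, and the per-cell ratios are aggregated into an overall severity score. The grid size and the aggregation rule here are assumptions, not the paper's parameters.

```python
# Illustrative grid-based lodging ratio estimation from a segmentation mask.
import numpy as np

def grid_lodging_ratios(lodging_mask, cell=64):
    """lodging_mask: (H, W) bool array from the segmentation model."""
    h, w = lodging_mask.shape
    ratios = []
    for y in range(0, h, cell):
        for x in range(0, w, cell):
            block = lodging_mask[y:y + cell, x:x + cell]
            ratios.append(block.mean())   # fraction of lodged pixels in this cell
    return np.array(ratios)

def overall_severity(lodging_mask, cell=64):
    """Mean of per-cell ratios as a simple field-level severity estimate."""
    return float(grid_lodging_ratios(lodging_mask, cell).mean())
```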

28 pages, 5813 KiB  
Article
YOLO-SW: A Real-Time Weed Detection Model for Soybean Fields Using Swin Transformer and RT-DETR
by Yizhou Shuai, Jingsha Shi, Yi Li, Shaohao Zhou, Lihua Zhang and Jiong Mu
Agronomy 2025, 15(7), 1712; https://doi.org/10.3390/agronomy15071712 - 16 Jul 2025
Cited by 1 | Viewed by 448
Abstract
Accurate weed detection in soybean fields is essential for enhancing crop yield and reducing herbicide usage. This study proposes a YOLO-SW model, an improved version of YOLOv8, to address the challenges of detecting weeds that are highly similar to the background in natural environments. The research stands out for its novel integration of three key advancements: the Swin Transformer backbone, which leverages local window self-attention to achieve linear O(N) computational complexity for efficient global context capture; the CARAFE dynamic upsampling operator, which enhances small target localization through context-aware kernel generation; and the RT-DETR encoder, which enables end-to-end detection via IoU-aware query selection, eliminating the need for complex post-processing. Additionally, a dataset of six common soybean weeds was expanded to 12,500 images through simulated fog, rain, and snow augmentation, effectively resolving data imbalance and boosting model robustness. The experimental results highlight both the technical superiority and practical relevance: YOLO-SW achieves 92.3% mAP@50 (3.8% higher than YOLOv8), with recognition accuracy and recall improvements of 4.2% and 3.9%, respectively. Critically, on the NVIDIA Jetson AGX Orin platform, it delivers a real-time inference speed of 59 FPS, making it suitable for seamless deployment on intelligent weeding robots. This low-power, high-precision solution not only bridges the gap between deep learning and precision agriculture but also enables targeted herbicide application, directly contributing to sustainable farming practices and environmental protection. Full article
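As an illustration of the kind of weather augmentation used to expand the dataset, the sketch below blends an image toward a light haze color to simulate fog; the blending scheme, density value, and file paths are assumptions, and the paper's augmentation pipeline may differ.

```python
# Illustrative synthetic-fog augmentation via alpha blending with a haze color.
import cv2
import numpy as np

def add_fog(image_bgr, density=0.4, haze_color=(230, 230, 230)):
    """density in [0, 1]: 0 = original image, 1 = pure haze."""
    haze = np.zeros_like(image_bgr)
    haze[:] = haze_color
    return cv2.addWeighted(image_bgr, 1.0 - density, haze, density, 0)

img = cv2.imread("soybean_field.jpg")           # placeholder path
foggy = add_fog(img, density=0.5)
cv2.imwrite("soybean_field_fog.jpg", foggy)
```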

21 pages, 3250 KiB  
Article
Deploying Optimized Deep Vision Models for Eyeglasses Detection on Low-Power Platforms
by Henrikas Giedra, Tomyslav Sledevič and Dalius Matuzevičius
Electronics 2025, 14(14), 2796; https://doi.org/10.3390/electronics14142796 - 11 Jul 2025
Viewed by 489
Abstract
This research addresses the optimization and deployment of convolutional neural networks for eyeglasses detection on low-power edge devices. Multiple convolutional neural network architectures were trained and evaluated using the FFHQ dataset, which contains annotated eyeglasses in the context of faces with diverse facial features and eyewear styles. Several post-training quantization techniques, including Float16, dynamic range, and full integer quantization, were applied to reduce model size and computational demand while preserving detection accuracy. The impact of model architecture and quantization methods on detection accuracy and inference latency was systematically evaluated. The optimized models were deployed and benchmarked on Raspberry Pi 5 and NVIDIA Jetson Orin Nano platforms. Experimental results show that full integer quantization reduces model size by up to 75% while maintaining competitive detection accuracy. Among the evaluated models, MobileNet architectures achieved the most favorable balance between inference speed and accuracy, demonstrating their suitability for real-time eyeglasses detection in resource-constrained environments. These findings enable efficient on-device eyeglasses detection, supporting applications such as virtual try-ons and IoT-based facial analysis systems. Full article
(This article belongs to the Special Issue Convolutional Neural Networks and Vision Applications, 4th Edition)
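The three post-training quantization modes mentioned above map directly onto the TensorFlow Lite converter options shown in the sketch below; the saved-model path and calibration loop are placeholders, and the abstract does not state which toolchain the authors actually used.

```python
# Illustrative only: dynamic-range, Float16, and full-integer post-training
# quantization with the TensorFlow Lite converter. Paths are placeholders.
import numpy as np
import tensorflow as tf

SAVED_MODEL_DIR = "eyeglasses_detector_saved_model"  # placeholder

# Dynamic-range quantization (int8 weights, float activations at runtime).
converter = tf.lite.TFLiteConverter.from_saved_model(SAVED_MODEL_DIR)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
dynamic_range_model = converter.convert()

# Float16 quantization.
converter = tf.lite.TFLiteConverter.from_saved_model(SAVED_MODEL_DIR)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
fp16_model = converter.convert()

# Full-integer quantization, which needs a small calibration set.
def representative_data_gen():
    for _ in range(100):
        # Replace with real preprocessed images from the training distribution.
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model(SAVED_MODEL_DIR)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
int8_model = converter.convert()
```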

21 pages, 2573 KiB  
Article
Predictive Optimal Control Mechanism of Indoor Temperature Using Modbus TCP and Deep Reinforcement Learning
by Hongkyun Kim, Muhammad Adnan Ejaz, Kyutae Lee, Hyun-Mook Cho and Do Hyeun Kim
Appl. Sci. 2025, 15(13), 7248; https://doi.org/10.3390/app15137248 - 27 Jun 2025
Viewed by 450
Abstract
This study proposes a predictive optimal control system for indoor temperature regulation that combines deep reinforcement learning with the Modbus TCP communication protocol. The architecture consists of distributed room-level units and a centralized AI controller for efficient HVAC management in single-family residences and small buildings. The system uses an LSTM model to forecast temperature trends and a DQN to select optimized control actions from the predicted states, sensor readings, and user preferences. InfluxDB gathers and stores real-time environmental data such as temperature, humidity, and power consumption, which the AI controller processes to infer control commands that balance energy efficiency and thermal comfort. Experiments on an NVIDIA Jetson Orin Nano and a Raspberry Pi 4, using 8761 data points gathered hourly over 2023 in Cheonan, Korea, demonstrated the efficacy of the system. A hysteresis-based power control mechanism was added to limit device wear from repeated switching. The results indicate that the AI-based control system closely maintains target temperature setpoints with negligible deviations, confirming that it is a scalable, cost-efficient solution for intelligent climate management in buildings. Full article
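A minimal sketch of the hysteresis idea mentioned above: the actuator only toggles once the temperature leaves a band around the setpoint, which avoids rapid on/off switching. The band width and function names are assumptions, not the paper's parameters.

```python
# Illustrative hysteresis band around a temperature setpoint.
def hysteresis_heating_command(temp_c, setpoint_c, heater_on, band_c=0.5):
    """Return the new heater state given the current temperature reading."""
    if temp_c <= setpoint_c - band_c:
        return True          # well below the setpoint: switch (or keep) on
    if temp_c >= setpoint_c + band_c:
        return False         # well above the setpoint: switch (or keep) off
    return heater_on         # inside the band: keep the previous state

state = False
for reading in [20.1, 20.4, 21.6, 22.6, 22.4, 21.4]:
    state = hysteresis_heating_command(reading, setpoint_c=22.0, heater_on=state)
    print(reading, "->", "ON" if state else "OFF")
```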

21 pages, 18182 KiB  
Article
AgriLiteNet: Lightweight Multi-Scale Tomato Pest and Disease Detection for Agricultural Robots
by Chenghan Yang, Baidong Zhao, Madina Mansurova, Tianyan Zhou, Qiyuan Liu, Junwei Bao and Dingkun Zheng
Horticulturae 2025, 11(6), 671; https://doi.org/10.3390/horticulturae11060671 - 12 Jun 2025
Viewed by 453
Abstract
Real-time detection of tomato pests and diseases is essential for precision agriculture and demands high accuracy, speed, and energy efficiency from edge-computing agricultural robots. This study proposes AgriLiteNet (Lightweight Networks for Agriculture), a lightweight neural network integrating MobileNetV3 for local feature extraction and a streamlined Swin Transformer for global modeling. AgriLiteNet is further enhanced by a lightweight channel–spatial mixed attention module and a feature pyramid network, enabling the detection of nine tomato pests and diseases, including small targets like spider mites, dense targets like bacterial spot, and large targets like late blight. It achieves a mean average precision at an IoU threshold of 0.5 (mAP@0.5) of 0.98735, which is comparable to Suppression Mask R-CNN (0.98955) and Cas-VSwin Transformer (0.98874), and exceeds the performance of YOLOv5n (0.98249) and GMC-MobileV3 (0.98143). With 2.0 million parameters and 0.608 GFLOPs, AgriLiteNet delivers an inference speed of 35 frames per second and power consumption of 15 watts on the NVIDIA Jetson Orin NX, surpassing Suppression Mask R-CNN (8 FPS, 22 W) and Cas-VSwin Transformer (12 FPS, 20 W). The model's efficiency and compact design make it highly suitable for deployment in agricultural robots, supporting sustainable farming through precise pest and disease management. Full article

31 pages, 8417 KiB  
Article
A Unified and Resource-Aware Framework for Adaptive Inference Acceleration on Edge and Embedded Platforms
by Yiyang Wang and Jing Zhao
Electronics 2025, 14(11), 2188; https://doi.org/10.3390/electronics14112188 - 28 May 2025
Viewed by 974
Abstract
Efficient and scalable inference is essential for deploying large-scale generative models across diverse hardware platforms, especially in real-time or resource-constrained scenarios. To address this, we propose a novel unified and resource-aware inference optimization framework that uniquely integrates three complementary techniques: sensitivity-aware mixed-precision quantization, heterogeneous sparse attention for reducing attention complexity, and capacity-aware dynamic expert routing for input-adaptive computation. This framework distinctively achieves fine-grained adaptivity by dynamically adjusting computation paths based on token complexity and hardware conditions, offering substantial performance gains and execution flexibility across diverse platforms, including edge devices like Jetson Orin. Implemented using PyTorch 1.13 and ONNX Runtime, our framework demonstrates significant reductions in inference latency and memory usage, alongside substantial throughput improvements in language and image generation tasks, outperforming existing baselines even under constrained GPU environments. Qualitative analyses reveal its fine-grained adaptivity, while robustness tests confirm stable behavior under resource fluctuation and input noise, offering an interpretable optimization approach suitable for heterogeneous deployments. Future work will explore reinforcement-based routing and multimodal inference. Full article
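As a generic illustration of capacity-aware expert routing (one of the three techniques listed above), the sketch below sends each token to its best-scoring expert while capping how many tokens an expert may accept per batch; this is a standard mixture-of-experts gating pattern, not the framework's actual routing code.

```python
# Illustrative capacity-aware top-1 expert routing.
import torch

def route_tokens(gate_logits, capacity):
    """gate_logits: (num_tokens, num_experts) -> list of token indices per expert."""
    num_tokens, num_experts = gate_logits.shape
    scores = gate_logits.softmax(dim=-1)
    preferred = scores.argmax(dim=-1)                 # top-1 expert per token
    assignments = [[] for _ in range(num_experts)]
    # Higher-confidence tokens claim expert slots first.
    order = scores.max(dim=-1).values.argsort(descending=True)
    for t in order.tolist():
        e = preferred[t].item()
        if len(assignments[e]) < capacity:
            assignments[e].append(t)                  # overflowing tokens are dropped
    return assignments

gates = torch.randn(16, 4)            # toy example: 16 tokens, 4 experts
print(route_tokens(gates, capacity=5))
```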

15 pages, 62527 KiB  
Article
Towards Intelligent Pruning of Vineyards by Direct Detection of Cutting Areas
by Elia Pacioni, Eugenio Abengózar, Miguel Macías Macías, Carlos J. García-Orellana, Ramón Gallardo and Horacio M. González Velasco
Agriculture 2025, 15(11), 1154; https://doi.org/10.3390/agriculture15111154 - 27 May 2025
Viewed by 508
Abstract
The development of robots for automatic pruning of vineyards using deep learning techniques seems feasible in the medium term. In this context, it is essential to propose and study solutions that can be deployed on portable hardware with artificial intelligence capabilities but reduced computing power. In this paper, we propose a novel approach to vineyard pruning based on the direct detection of cutting areas in real time, comparing the performance of Mask R-CNN and YOLOv8. The studied object segmentation architectures segment the image by locating the trunk as well as pruned and unpruned vine shoots. Our study analyzes the performance of both frameworks in terms of segmentation efficiency and inference time on a Jetson AGX Orin GPU. To compare segmentation efficiency, we used the mAP50 and per-category AP50 metrics. Our results show that YOLOv8 is superior in both segmentation efficiency and inference time. Specifically, YOLOv8-S exhibits the best trade-off between efficiency and inference time, showing an mAP50 of 0.883 and an AP50 of 0.748 for the shoot class, with an inference time of around 55 ms on a Jetson AGX Orin. Full article

29 pages, 21305 KiB  
Article
Collaborative Optimization of Model Pruning and Knowledge Distillation for Efficient and Lightweight Multi-Behavior Recognition in Piglets
by Yizhi Luo, Kai Lin, Zixuan Xiao, Yuankai Chen, Chen Yang and Deqin Xiao
Animals 2025, 15(11), 1563; https://doi.org/10.3390/ani15111563 - 27 May 2025
Viewed by 537
Abstract
In modern intensive pig farming, accurately monitoring piglet behavior is crucial for health management and improving production efficiency. However, the complexity of existing models demands high computational resources, limiting the application of piglet behavior recognition in farming environments. In this study, the piglet multi-behavior-recognition approach is divided into three stages. In the first stage, the LAMP pruning algorithm is used to prune and optimize redundant channels, resulting in the lightweight YOLOv8-Prune. In the second stage, based on YOLOv8, the AIFI module and the Gather–Distribute mechanism are incorporated, resulting in YOLOv8-GDA. In the third stage, using YOLOv8-GDA as the teacher model and YOLOv8-Prune as the student model, knowledge distillation is employed to further enhance detection accuracy, thus obtaining the YOLOv8-Piglet model, which strikes a balance between detection accuracy and speed. Compared to the baseline model, YOLOv8-Piglet significantly reduces model complexity while improving detection performance, with a 6.3% increase in precision, an 11.2% increase in recall, and an mAP@0.5 of 91.8%. The model was deployed on the NVIDIA Jetson Orin NX edge computing platform for evaluation. The average inference time was reduced from 353.9 ms to 163.2 ms, a 53.8% reduction in processing time. This study achieves a balance between model compression and recognition accuracy through the collaborative optimization of pruning and knowledge distillation. Full article
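A minimal sketch of the temperature-scaled logit-distillation term used in teacher–student training is shown below; applying it to a detector such as YOLOv8 additionally requires matching predictions between teacher and student, which is omitted here, and the temperature value is an assumption.

```python
# Illustrative knowledge-distillation loss between teacher and student logits.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between softened teacher and student class distributions."""
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_student = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * (t * t)

student = torch.randn(8, 10, requires_grad=True)   # toy logits: 8 samples, 10 classes
teacher = torch.randn(8, 10)
loss = distillation_loss(student, teacher)
loss.backward()
```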

16 pages, 15339 KiB  
Article
MLKD-Net: Lightweight Single Image Dehazing via Multi-Head Large Kernel Attention
by Jiwon Moon and Jongyoul Park
Appl. Sci. 2025, 15(11), 5858; https://doi.org/10.3390/app15115858 - 23 May 2025
Viewed by 432
Abstract
Haze significantly degrades image quality by reducing contrast and blurring object boundaries, which impairs the performance of computer vision systems. Among various approaches, single-image dehazing remains particularly challenging due to the absence of depth information. While Vision Transformer (ViT)-based models have achieved remarkable results by leveraging multi-head attention and large effective receptive fields, their high computational complexity limits their applicability in real-time and embedded systems. To address this limitation, we propose MLKD-Net, a lightweight CNN-based model that incorporates a novel Multi-Head Large Kernel Block (MLKD), which is based on the Multi-Head Large Kernel Attention (MLKA) mechanism. This structure preserves the benefits of large receptive fields and a multi-head design while also ensuring compactness and computational efficiency. MLKD-Net achieves a PSNR of 37.42 dB on the SOTS-Outdoor dataset while using 90.9% fewer parameters than leading Transformer-based models. Furthermore, it demonstrates real-time performance with 55.24 ms per image (18.2 FPS) on the NVIDIA Jetson Orin Nano in TensorRT-INT8 mode. These results highlight its effectiveness and practicality for resource-constrained, real-time image dehazing applications. Full article
(This article belongs to the Section Robotics and Automation)
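As a generic illustration of the large-kernel-attention idea behind the MLKA mechanism, the sketch below approximates a large receptive field with a depthwise convolution, a depthwise dilated convolution, and a pointwise convolution, and uses the result to gate the input; the kernel sizes and the multi-head split in MLKD-Net may differ from this simplified form.

```python
# Illustrative large-kernel-attention style block (single head, generic form).
import torch
import torch.nn as nn

class LargeKernelAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, 5, padding=2, groups=channels)
        self.dw_dilated = nn.Conv2d(channels, channels, 7, padding=9,
                                    dilation=3, groups=channels)
        self.pw = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        attn = self.pw(self.dw_dilated(self.dw(x)))
        return x * attn                      # attention-weighted features

x = torch.randn(1, 32, 64, 64)
print(LargeKernelAttention(32)(x).shape)     # torch.Size([1, 32, 64, 64])
```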

21 pages, 7067 KiB  
Article
A Lightweight and Rapid Dragon Fruit Detection Method for Harvesting Robots
by Fei Yuan, Jinpeng Wang, Wenqin Ding, Song Mei, Chenzhe Fang, Sunan Chen and Hongping Zhou
Agriculture 2025, 15(11), 1120; https://doi.org/10.3390/agriculture15111120 - 23 May 2025
Cited by 1 | Viewed by 615
Abstract
Dragon fruit detection in natural environments remains challenged by limited accuracy and deployment difficulties, primarily due to variable lighting and occlusions from branches. To enhance detection accuracy and satisfy the deployment constraints of edge devices, we propose YOLOv10n-CGD, a lightweight and efficient dragon fruit detection method designed for robotic harvesting applications. The method builds upon YOLOv10 and integrates Gated Convolution (gConv) into the C2f module, forming a novel C2f-gConv structure that effectively reduces model parameters and computational complexity. In addition, a Global Attention Mechanism (GAM) is inserted between the backbone and the feature fusion layers to enrich semantic representations and improve the detection of occluded fruits. Furthermore, the neck network integrates a Dynamic Sample (DySample) operator to enhance the spatial restoration of high-level semantic features. The experimental results demonstrate that YOLOv10n-CGD significantly improves performance while reducing model size from 5.8 MB to 4.5 MB—a 22.4% decrease. The mAP improves from 95.1% to 98.1%, with precision and recall reaching 97.1% and 95.7%, respectively. The observed improvements are statistically significant (p < 0.05). Moreover, detection speeds of 44.9 FPS and 17.2 FPS are achieved on Jetson AGX Orin and Jetson Nano, respectively, demonstrating strong real-time capabilities and suitability for deployment. In summary, YOLOv10n-CGD enables high-precision, real-time dragon fruit detection while preserving model compactness, offering robust technical support for future robotic harvesting systems and smart agricultural terminals. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

23 pages, 8052 KiB  
Article
Embedded Vision System for Thermal Face Detection Using Deep Learning
by Isidro Robledo-Vega, Scarllet Osuna-Tostado, Abraham Efraím Rodríguez-Mata, Carmen Leticia García-Mata, Pedro Rafael Acosta-Cano and Rogelio Enrique Baray-Arana
Sensors 2025, 25(10), 3126; https://doi.org/10.3390/s25103126 - 15 May 2025
Viewed by 733
Abstract
Face detection technology is essential for surveillance and security projects; however, algorithms designed to detect faces in color images often struggle in poor lighting conditions. In this paper, we describe the development of an embedded vision system designed to detect human faces by analyzing images captured with thermal infrared sensors, thereby overcoming the limitations imposed by varying illumination conditions. All variants of the Ultralytics YOLOv8 and YOLO11 models were trained on the Terravic Facial IR database and tested on the Charlotte-ThermalFace database; the YOLO11 model achieved slightly higher performance metrics. We compared the performance of two embedded system boards: the NVIDIA Jetson Orin Nano and the NVIDIA Jetson Xavier NX, while running the trained model in inference mode. The NVIDIA Jetson Orin Nano performed better in terms of inference time. The developed embedded vision system based on these platforms accurately detects faces in thermal images in real-time. Full article
(This article belongs to the Special Issue AI-Based Computer Vision Sensors & Systems)
