Search Results (302)

Search Parameters:
Keywords = accurate camera pose

18 pages, 12946 KiB  
Article
High-Resolution 3D Reconstruction of Individual Rice Tillers for Genetic Studies
by Jiexiong Xu, Jiyoung Lee, Gang Jiang and Xiangchao Gan
Agronomy 2025, 15(8), 1803; https://doi.org/10.3390/agronomy15081803 - 25 Jul 2025
Viewed by 155
Abstract
The architecture of rice tillers plays a pivotal role in yield potential, yet conventional phenotyping methods have struggled to capture these intricate three-dimensional (3D) structures with high fidelity. In this study, a 3D model reconstruction method was developed specifically for rice tillers to overcome the challenges posed by their slender, feature-poor morphology in multi-view stereo-based 3D reconstruction. By applying strategically designed colorful reference markers, high-resolution 3D tiller models of 231 rice landraces were reconstructed. Accurate phenotyping was achieved by introducing ScaleCalculator, a software tool that integrated depth images from a depth camera to calibrate the physical sizes of the 3D models. The high efficiency of the 3D model-based phenotyping pipeline was demonstrated by extracting the following seven key agronomic traits: flag leaf length, panicle length, first internode length below the panicle, stem length, flag leaf angle, second leaf angle from the panicle, and third leaf angle. Genome-wide association studies (GWAS) performed with these 3D traits identified numerous candidate genes, nine of which had been previously confirmed in the literature. This work provides a 3D phenomics solution tailored for slender organs and offers novel insights into the genetic regulation of complex morphological traits in rice.
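A minimal numpy sketch of the scale-calibration idea described above (not the authors' ScaleCalculator): recover the metric scale of an up-to-scale reconstruction by comparing inter-marker distances in model units against the same distances measured with a depth camera. All names and values are hypothetical.

```python
import numpy as np

def metric_scale(model_pts: np.ndarray, metric_pts: np.ndarray) -> float:
    """Least-squares scale s minimizing ||s*d_model - d_metric|| over all
    pairwise marker distances."""
    def pairwise(p):
        diff = p[:, None, :] - p[None, :, :]
        return np.linalg.norm(diff, axis=-1)[np.triu_indices(len(p), k=1)]
    d_model, d_metric = pairwise(model_pts), pairwise(metric_pts)
    return float(d_model @ d_metric / (d_model @ d_model))

# Hypothetical example: three reference markers seen in both coordinate frames.
markers_model = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
markers_depth_m = markers_model * 0.013            # ground truth: 1 unit = 13 mm
s = metric_scale(markers_model, markers_depth_m)
print(f"scale: 1 model unit = {s * 1000:.1f} mm")  # -> 13.0 mm
```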

18 pages, 2592 KiB  
Article
A Minimal Solution for Binocular Camera Relative Pose Estimation Based on the Gravity Prior
by Dezhong Chen, Kang Yan, Hongping Zhang and Zhenbao Yu
Remote Sens. 2025, 17(15), 2560; https://doi.org/10.3390/rs17152560 - 23 Jul 2025
Viewed by 165
Abstract
High-precision positioning is the foundation for the functionality of various intelligent agents. In complex environments, such as urban canyons, relative pose estimation using cameras is a crucial step in high-precision positioning. To take advantage of the ability of an inertial measurement unit (IMU) to provide relatively accurate gravity prior information over a short period, we propose a minimal solution method for the relative pose estimation of a stereo camera system assisted by the IMU. We rigidly connect the IMU to the camera system and use it to obtain the rotation matrices in the roll and pitch directions for the entire system, thereby reducing the minimum number of corresponding points required for relative pose estimation. In contrast to classic pose-estimation algorithms, our method can also calculate the camera focal length, which greatly expands its applicability. We constructed a simulated dataset and used it to compare and analyze the numerical stability of the proposed method and the impact of different levels of noise on algorithm performance. We also collected real-scene data using a drone and validated the proposed algorithm. The results on real data reveal that our method exhibits smaller errors in calculating the relative pose of the camera system compared with two classic reference algorithms. It achieves higher precision and stability and can provide a comparatively accurate camera focal length.
(This article belongs to the Section Urban Remote Sensing)
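A hedged sketch of the idea behind gravity-aided relative pose, not the paper's minimal solver: once IMU roll and pitch rotate both views into a gravity-aligned frame, the remaining relative rotation is a single yaw angle, so the epipolar constraint x2^T [t]_x R_z(yaw) x1 = 0 has only three unknowns (yaw plus the bearing of the unit translation), and three correspondences suffice in the minimal case. The sketch below solves the over-determined version on synthetic data.

```python
import numpy as np
from scipy.optimize import least_squares

def Rz(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def skew(t):
    return np.array([[0, -t[2], t[1]], [t[2], 0, -t[0]], [-t[1], t[0], 0]])

def residuals(params, x1, x2):
    # Gravity-aligned epipolar residual: E = [t]_x R_z(yaw), t a unit vector.
    yaw, az, el = params
    t = np.array([np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)])
    E = skew(t) @ Rz(yaw)
    return np.einsum("ni,ij,nj->n", x2, E, x1)

# Synthetic gravity-aligned pair: X_cam2 = R_z(yaw) X_cam1 + t.
rng = np.random.default_rng(0)
yaw_true, t_true = 0.3, np.array([1.0, 0.5, 0.2])
X1 = rng.uniform([-1, -1, 4], [1, 1, 8], (6, 3))
X2 = X1 @ Rz(yaw_true).T + t_true
x1, x2 = X1 / X1[:, 2:], X2 / X2[:, 2:]        # normalized image coordinates

sol = least_squares(residuals, x0=[0.0, 0.4, 0.1], args=(x1, x2))
print(f"yaw: {sol.x[0]:.3f} (true {yaw_true})")
```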

25 pages, 8560 KiB  
Article
Visual Point Cloud Map Construction and Matching Localization for Autonomous Vehicle
by Shuchen Xu, Kedong Zhao, Yongrong Sun, Xiyu Fu and Kang Luo
Drones 2025, 9(7), 511; https://doi.org/10.3390/drones9070511 - 21 Jul 2025
Viewed by 294
Abstract
Collaboration between autonomous vehicles and drones can enhance the efficiency and connectivity of three-dimensional transportation systems. When satellite signals are unavailable, vehicles can achieve accurate localization by matching rich ground environmental data to digital maps, simultaneously providing auxiliary localization information for drones. However, conventional digital maps suffer from high construction costs, easy misalignment, and low localization accuracy. Thus, this paper proposes a visual point cloud map (VPCM) construction and matching localization method for autonomous vehicles. We fuse multi-source information from vehicle-mounted sensors and the regional road network to establish a geographically referenced, high-precision VPCM. In the absence of satellite signals, we segment the prior VPCM along the road network based on real-time localization results, which accelerates matching and reduces the probability of mismatches. Simultaneously, by continuously introducing matching constraints between the real-time point cloud and the prior VPCM through an improved iterative closest point (ICP) matching method, the proposed solution can effectively suppress the drift error of the odometry and output accurate fused localization results based on pose-graph optimization. Experiments on the KITTI datasets demonstrate the effectiveness of the proposed method, which can autonomously construct the high-precision prior VPCM. The localization strategy achieves sub-meter accuracy and reduces the average error per frame by 25.84% compared to similar methods. The method's reusability and localization robustness under lighting and environment changes are further verified on a campus dataset. Compared to a similar camera-based method, the matching success rate increased by 21.15%, and the average localization error decreased by 62.39%.
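The matching step above builds on iterative closest point alignment; below is the textbook point-to-point ICP building block (not the paper's improved variant), with a synthetic check that a known rigid motion is recovered.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(src, dst, iters=20):
    """Point-to-point ICP: align src onto dst, return accumulated R, t."""
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(dst)
    cur = src.copy()
    for _ in range(iters):
        _, idx = tree.query(cur)               # nearest-neighbor matches
        d = dst[idx]
        mu_s, mu_d = cur.mean(0), d.mean(0)
        H = (cur - mu_s).T @ (d - mu_d)        # 3x3 cross-covariance
        U, _, Vt = np.linalg.svd(H)
        S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        dR = Vt.T @ S @ U.T                    # Kabsch rotation, reflection-safe
        dt = mu_d - dR @ mu_s
        cur = cur @ dR.T + dt
        R, t = dR @ R, dR @ t + dt             # accumulate the increments
    return R, t

# Hypothetical check: recover a known rigid motion between two point clouds.
rng = np.random.default_rng(1)
dst = rng.normal(size=(500, 3))
a = 0.2                                        # yaw of the true motion
Rzm = np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])
src = (dst - [0.1, 0.3, 0.0]) @ Rzm            # so that dst = Rzm @ src + t
R, t = icp(src, dst)
print(f"recovered yaw {np.arctan2(R[1, 0], R[0, 0]):.3f}, t {np.round(t, 3)}")
```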

23 pages, 5173 KiB  
Article
Improvement of Cooperative Localization for Heterogeneous Mobile Robots
by Efe Oğuzhan Karcı, Ahmet Mustafa Kangal and Sinan Öncü
Drones 2025, 9(7), 507; https://doi.org/10.3390/drones9070507 - 19 Jul 2025
Viewed by 324
Abstract
This research focuses on enhancing cooperative localization for heterogeneous mobile robots composed of a quadcopter and an unmanned ground vehicle. The study employs sensor fusion techniques, particularly the Extended Kalman Filter, to fuse data from various sensors, including GPS receivers, IMUs, and cameras. By integrating these sensors and optimizing fusion strategies, the research aims to improve the precision and reliability of cooperative localization in complex and dynamic environments. The primary objective is to develop a practical framework for cooperative localization that addresses the challenges posed by the differences in mobility and sensing capabilities among heterogeneous robots. Sensor fusion is used to compensate for the limitations of individual sensors, providing more accurate and robust localization results. Moreover, a comparative analysis of different sensor combinations and fusion strategies helps to identify the optimal configuration for each robot. The work also covers the improvement of cooperative localization, path planning, and collaborative tasks for heterogeneous robots. The findings have broad applications in fields such as autonomous transportation, agricultural operation, and disaster response, where the cooperation of diverse robotic platforms is crucial for mission success.
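For readers unfamiliar with the fusion step, here is a toy linear Kalman filter with a GPS-style position update; the Extended variant used for camera and IMU measurements linearizes nonlinear models with Jacobians at each step. All noise values are illustrative.

```python
import numpy as np

dt = 0.1
F = np.eye(4)
F[0, 2] = F[1, 3] = dt                          # constant-velocity model
Q = np.diag([1e-4, 1e-4, 1e-2, 1e-2])           # process noise (illustrative)
H = np.array([[1.0, 0, 0, 0], [0, 1.0, 0, 0]])  # GPS observes position only
R = np.diag([0.5, 0.5])                         # GPS noise covariance (m^2)

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    y = z - H @ x                               # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)              # Kalman gain
    return x + K @ y, (np.eye(4) - K @ H) @ P

x, P = np.zeros(4), np.eye(4)
for z in ([1.0, 0.1], [1.2, 0.2], [1.4, 0.3]):  # synthetic GPS fixes
    x, P = predict(x, P)
    x, P = update(x, P, np.array(z))
print("fused position:", x[:2])
```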

21 pages, 9749 KiB  
Article
Enhanced Pose Estimation for Badminton Players via Improved YOLOv8-Pose with Efficient Local Attention
by Yijian Wu, Zewen Chen, Hongxing Zhang, Yulin Yang and Weichao Yi
Sensors 2025, 25(14), 4446; https://doi.org/10.3390/s25144446 - 17 Jul 2025
Viewed by 370
Abstract
With the rapid development of sports analytics and artificial intelligence, accurate human pose estimation in badminton is becoming increasingly important. However, challenges such as the lack of domain-specific datasets and the complexity of athletes’ movements continue to hinder progress in this area. To address these issues, we propose an enhanced pose estimation framework tailored to badminton players, built upon an improved YOLOv8-Pose architecture. In particular, we introduce an efficient local attention (ELA) mechanism that effectively captures fine-grained spatial dependencies and contextual information, thereby significantly improving the keypoint localization accuracy and overall pose estimation performance. To support this study, we construct a dedicated badminton pose dataset comprising 4000 manually annotated samples, captured using a Microsoft Kinect v2 camera. The raw data undergo careful processing and refinement through a combination of depth-assisted annotation and visual inspection to ensure high-quality ground truth keypoints. Furthermore, we conduct an in-depth comparative analysis of multiple attention modules and their integration strategies within the network, offering generalizable insights to enhance pose estimation models in other sports domains. The experimental results show that the proposed ELA-enhanced YOLOv8-Pose model consistently achieves superior accuracy across multiple evaluation metrics, including the mean squared error (MSE), object keypoint similarity (OKS), and percentage of correct keypoints (PCK), highlighting its effectiveness and potential for broader applications in sports vision tasks.
(This article belongs to the Special Issue Computer Vision-Based Human Activity Recognition)
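A hedged PyTorch sketch of an efficient-local-attention-style block of the kind the abstract describes: pool along each spatial axis, run a shared 1D convolution, and gate the feature map. The paper's exact module may differ in details.

```python
import torch
import torch.nn as nn

class ELA(nn.Module):
    """Efficient-local-attention-style gating (sketch). channels must be
    divisible by the GroupNorm group count (16 here)."""
    def __init__(self, channels: int, kernel_size: int = 7):
        super().__init__()
        pad = kernel_size // 2
        self.conv = nn.Conv1d(channels, channels, kernel_size,
                              padding=pad, groups=channels, bias=False)
        self.norm = nn.GroupNorm(16, channels)
        self.act = nn.Sigmoid()

    def forward(self, x):                       # x: (B, C, H, W)
        b, c, h, w = x.shape
        ah = self.act(self.norm(self.conv(x.mean(dim=3))))  # attention over H
        aw = self.act(self.norm(self.conv(x.mean(dim=2))))  # attention over W
        return x * ah.view(b, c, h, 1) * aw.view(b, c, 1, w)

feat = torch.randn(2, 64, 80, 80)               # e.g. a YOLOv8 neck feature map
print(ELA(64)(feat).shape)                      # torch.Size([2, 64, 80, 80])
```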

25 pages, 9517 KiB  
Article
YOLOv8n-SSDW: A Lightweight and Accurate Model for Barnyard Grass Detection in Fields
by Yan Sun, Hanrui Guo, Xiaoan Chen, Mengqi Li, Bing Fang and Yingli Cao
Agriculture 2025, 15(14), 1510; https://doi.org/10.3390/agriculture15141510 - 13 Jul 2025
Cited by 1 | Viewed by 299
Abstract
Barnyard grass is a major noxious weed in paddy fields. Accurate and efficient identification of barnyard grass is crucial for precision field management. However, existing deep learning models generally suffer from high parameter counts and computational complexity, limiting their practical application in field scenarios. Moreover, the morphological similarity, overlapping, and occlusion between barnyard grass and rice pose challenges for reliable detection in complex environments. To address these issues, this study constructed a barnyard grass detection dataset using high-resolution images captured by a drone equipped with a high-definition camera in rice experimental fields in Haicheng City, Liaoning Province. A lightweight field barnyard grass detection model, YOLOv8n-SSDW, was proposed to enhance detection precision and speed. Based on the baseline YOLOv8n model, a novel Separable Residual Coord Conv (SRCConv) was designed to replace the original convolution module, significantly reducing parameters while maintaining detection accuracy. The Spatio-Channel Enhanced Attention Module (SEAM) was introduced and optimized to improve sensitivity to barnyard grass edge features. Additionally, the lightweight and efficient Dysample upsampling module was incorporated to enhance feature map resolution. A new WIoU loss function was developed to improve bounding box classification and regression accuracy. Comprehensive performance analysis demonstrated that YOLOv8n-SSDW outperformed state-of-the-art models. Ablation studies confirmed the effectiveness of each improvement module. The final fused model achieved lightweight performance while improving detection accuracy, with a 2.2% increase in mAP_50, 3.8% higher precision, 0.6% higher recall, 10.6% fewer parameters, 9.8% lower FLOPs, and an 11.1% reduction in model size compared to the baseline. Field tests using drones combined with ground-based computers further validated the model’s robustness in real-world complex paddy environments. The results indicate that YOLOv8n-SSDW exhibits excellent accuracy and efficiency. This study provides valuable insights for barnyard grass detection in rice fields.
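Not the paper's SRCConv, but the separable-convolution idea such lightweight modules build on: factoring a KxK convolution into a depthwise KxK plus a pointwise 1x1 cuts parameters by roughly a factor of K*K for wide layers.

```python
import torch.nn as nn

def params(m):
    return sum(p.numel() for p in m.parameters())

c, k = 128, 3
standard = nn.Conv2d(c, c, k, padding=1, bias=False)
separable = nn.Sequential(
    nn.Conv2d(c, c, k, padding=1, groups=c, bias=False),  # depthwise 3x3
    nn.Conv2d(c, c, 1, bias=False),                        # pointwise 1x1
)
print(params(standard), params(separable))   # 147456 vs 17536
```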

23 pages, 8966 KiB  
Article
Object-Specific Multiview Classification Through View-Compatible Feature Fusion
by Javier Perez Soler, Jose-Luis Guardiola, Nicolás García Sastre, Pau Garrigues Carbó, Miguel Sanchis Hernández and Juan-Carlos Perez-Cortes
Sensors 2025, 25(13), 4127; https://doi.org/10.3390/s25134127 - 2 Jul 2025
Viewed by 323
Abstract
Multi-view classification (MVC) typically focuses on categorizing objects into distinct classes by employing multiple perspectives of the same objects. However, in numerous real-world applications, such as industrial inspection and quality control, there is an increasing need to distinguish particular objects from a pool of similar ones while simultaneously disregarding unknown objects. In these scenarios, relying on a single image may not provide sufficient information to effectively identify the scrutinized object, as different perspectives may reveal distinct characteristics that are essential for accurate classification. Most existing approaches operate within closed-set environments and are focused on generalization, which makes them less effective in distinguishing individual objects from others. These limitations are particularly problematic in industrial quality assessment, where distinguishing between specific objects and discarding unknowns is crucial. To address this challenge, we introduce a View-Compatible Feature Fusion (VCFF) method that utilizes images from predetermined positions as an accurate solution for multi-view classification of specific objects. Unlike other approaches, VCFF explicitly integrates pose information during the fusion process. It does not merely use pose as auxiliary data but employs it to align and selectively fuse features from different views. This mathematically explicit fusion of rotations, based on relative poses, allows VCFF to effectively combine multi-view information, enhancing classification accuracy. Through experimental evaluations, we demonstrate that the proposed VCFF method outperforms state-of-the-art MVC algorithms, especially in open-set scenarios, where the set of possible objects is not fully known in advance. Remarkably, VCFF achieves an average precision of 1.0 using only 8 cameras, whereas existing methods require 20 cameras to reach a maximum of 0.95. In terms of AUC-ROC under the constraint of fewer than 3σ false positives, a critical metric in industrial inspection, current state-of-the-art methods achieve up to 0.72, while VCFF attains a perfect score of 1.0 with just eight cameras. Furthermore, our approach delivers highly accurate rotation estimation, maintaining an error margin slightly above 2° when sampling at 4° intervals.
(This article belongs to the Special Issue Sensors for Object Detection, Pose Estimation, and 3D Reconstruction)
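A purely illustrative toy of pose-aware fusion, not the paper's VCFF: each view contributes an embedding plus a known camera rotation, and embeddings are rotated into a shared frame before pooling, so fusion respects relative poses rather than averaging blindly. The 3-vector block structure of the features is an assumption made for the demo.

```python
import numpy as np

def fuse(embeddings, rotations):
    """embeddings: (V, D) with D divisible by 3; rotations: (V, 3, 3),
    each view's camera-to-reference rotation. Rotate every 3-vector block
    of each embedding into the shared frame, then average across views."""
    v, d = embeddings.shape
    blocks = embeddings.reshape(v, d // 3, 3)
    aligned = np.einsum("vij,vkj->vki", rotations, blocks).reshape(v, d)
    return aligned.mean(axis=0)

def random_rotation(rng):
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return q * np.sign(np.linalg.det(q))        # force a proper rotation

rng = np.random.default_rng(2)
emb = rng.normal(size=(8, 96))                  # 8 cameras, 96-D descriptors
Rs = np.stack([random_rotation(rng) for _ in range(8)])
print(fuse(emb, Rs).shape)                      # (96,)
```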

23 pages, 9135 KiB  
Article
Stone Detection on Agricultural Land Using Thermal Imagery from Unmanned Aerial Systems
by Florian Thürkow, Mike Teucher, Detlef Thürkow and Milena Mohri
AgriEngineering 2025, 7(7), 203; https://doi.org/10.3390/agriengineering7070203 - 1 Jul 2025
Viewed by 566
Abstract
Stones in agricultural fields pose a recurring challenge, particularly due to their potential to damage agricultural machinery and disrupt field operations. As modern agriculture moves toward automation and precision farming, efficient stone detection has become a critical concern. This study explores the potential of thermal imaging as a non-invasive method for detecting stones under varying environmental conditions. A series of controlled laboratory experiments and field investigations confirmed the assumption that stones exhibit higher surface temperatures than the surrounding soil, especially when soil moisture is high and air temperatures are cooling rapidly. This temperature difference is attributed to the higher thermal inertia of stones, which allows them to absorb and retain heat longer than soil, as well as to the evaporative cooling from moist soil. These findings demonstrate the viability of thermal cameras as a tool for stone detection in precision farming. Incorporating this technology with GPS mapping enables the generation of accurate location data, facilitating targeted stone removal and reducing equipment damage. This approach aligns with the goals of sustainable agricultural engineering by supporting field automation, minimizing mechanical inefficiencies, and promoting data-driven decisions. Thermal imaging thereby contributes to the evolution of next-generation agricultural systems.
(This article belongs to the Special Issue Recent Trends and Advances in Agricultural Engineering)
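An illustrative detection step consistent with the reported finding that stones run warmer than moist soil: threshold the thermal frame a few standard deviations above the background and keep large connected blobs. Function names and values are hypothetical.

```python
import numpy as np
from scipy import ndimage

def detect_stones(thermal: np.ndarray, k: float = 2.0, min_px: int = 20):
    """thermal: 2D array of surface temperatures (deg C). Returns pixel
    centroids of warm blobs at least min_px pixels in size."""
    mask = thermal > thermal.mean() + k * thermal.std()
    labels, n = ndimage.label(mask)
    sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
    keep = [i + 1 for i, s in enumerate(sizes) if s >= min_px]
    return ndimage.center_of_mass(mask, labels, keep)

# Synthetic frame: ~18 C soil with two ~24 C "stones".
rng = np.random.default_rng(3)
frame = np.full((120, 160), 18.0) + rng.normal(0, 0.2, (120, 160))
frame[40:50, 60:72] = 24.0
frame[90:97, 20:30] = 24.0
print(detect_stones(frame))   # two centroids; map to GPS via the platform pose
```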

19 pages, 3319 KiB  
Article
Frailty-Focused Movement Monitoring: A Single-Camera System Using Joint Angles for Assessing Chair-Based Exercise Quality
by Teng Qi, Miyuki Iwamoto, Dongeun Choi, Noriyuki Kida and Noriaki Kuwahara
Sensors 2025, 25(13), 3907; https://doi.org/10.3390/s25133907 - 23 Jun 2025
Viewed by 388
Abstract
Ensuring that older adults perform chair-based exercises (CBEs) correctly is essential for improving physical outcomes and reducing the risk of injury, particularly in home and community rehabilitation settings. However, evaluating the correctness of movements accurately and objectively outside clinical environments remains challenging. In this study, camera-based methods have been used to evaluate practical exercise quality. A single-camera system utilizing MediaPipe pose estimation was used to capture joint angle data as twenty older adults performed eight CBEs. Simultaneously, surface electromyography (sEMG) recorded muscle activity. Participants were guided to perform both proper and commonly observed incorrect forms of each movement. Statistical analyses compared joint angles and sEMG signals, and a support vector machine (SVM) was trained to classify movement correctness. The analysis showed that correct executions consistently produced distinct joint angle patterns and significantly higher sEMG activity than incorrect ones (p < 0.001). After modifying the selection of joint angle features for Movement 5 (M5), the classification accuracy improved to 96.26%. Including M5, the average classification accuracy across all eight exercises reached 97.77%, demonstrating the overall robustness and consistency of the proposed approach. In contrast, high variability across individuals made sEMG less reliable as a standalone indicator of correctness. The strong classification performance based on joint angles highlights the potential of this approach for real-world applications. While sEMG signals confirmed the physiological differences between correct and incorrect executions, their individual variability limits their generalizability as a sole criterion. Joint angle data derived from a simple single-camera setup can effectively distinguish movement quality in older adults, offering a low-cost, user-friendly solution for real-time feedback in home and community settings. This approach may help support independent exercise and reduce reliance on professional supervision.
(This article belongs to the Section Intelligent Sensors)
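A sketch of the two building blocks the abstract describes, on hypothetical data: a joint angle computed from three pose keypoints (as MediaPipe provides) and an SVM separating correct from incorrect executions using angle features.

```python
import numpy as np
from sklearn.svm import SVC

def joint_angle(a, b, c):
    """Angle at b (degrees) formed by points a-b-c, e.g. hip-knee-ankle."""
    v1, v2 = np.asarray(a) - b, np.asarray(c) - b
    cosang = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

print(joint_angle([0, 0], [0, 1], [1, 1]))     # -> 90.0

# Toy training set: rows of per-repetition angle features, 1 = correct form.
rng = np.random.default_rng(4)
correct = rng.normal([90, 170, 45], 5, (40, 3))
wrong = rng.normal([70, 150, 70], 5, (40, 3))
X = np.vstack([correct, wrong])
y = np.array([1] * 40 + [0] * 40)
clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict([[88, 168, 47]]))            # -> [1]
```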

22 pages, 6243 KiB  
Review
A Review on UAS Trajectory Estimation Using Decentralized Multi-Sensor Systems Based on Robotic Total Stations
by Lucas Dammert, Tomas Thalmann, David Monetti, Hans-Berndt Neuner and Gottfried Mandlburger
Sensors 2025, 25(13), 3838; https://doi.org/10.3390/s25133838 - 20 Jun 2025
Viewed by 468
Abstract
In our contribution, we conduct a thematic literature review on trajectory estimation using a decentralized multi-sensor system based on robotic total stations (RTS) with a focus on unmanned aerial system (UAS) platforms. While RTS are commonly used for trajectory estimation in areas where GNSS is not sufficiently accurate or is unavailable, they are rarely used for UAS trajectory estimation. Extending the RTS with integrated camera images allows for UAS pose estimation (position and orientation). We review existing research on the entire RTS measurement process, including time synchronization, atmospheric refraction, prism interaction, and RTS-based image evaluation. Additionally, we focus on integrated trajectory estimation using UAS onboard measurements such as IMU and laser scanning data. Although many existing articles address individual steps of the decentralized multi-sensor system, we demonstrate that a combination of existing works related to UAS trajectory estimation and RTS calibration is needed to allow for trajectory estimation at sub-cm and sub-0.01 gon accuracies, and we identify the challenges that must be addressed. Investigations into the use of RTS for kinematic tasks must be extended to realistic distances (approx. 300–500 m) and speeds (>2.5 m/s). In particular, image acquisition with the integrated camera must be extended by a time synchronization approach. As to the estimation of UAS orientation based on RTS camera images, the results of initial simulation studies must be validated by field tests, and existing approaches for integrated trajectory estimation must be adapted to optimally integrate RTS data.
(This article belongs to the Section Sensors and Robotics)

22 pages, 8644 KiB  
Article
Privacy-Preserving Approach for Early Detection of Long-Lie Incidents: A Pilot Study with Healthy Subjects
by Riska Analia, Anne Forster, Sheng-Quan Xie and Zhiqiang Zhang
Sensors 2025, 25(12), 3836; https://doi.org/10.3390/s25123836 - 19 Jun 2025
Viewed by 633
Abstract
(1) Background: Detecting long-lie incidents, where individuals remain immobile after a fall, is essential for timely intervention and preventing severe health consequences. However, most existing systems focus only on fall detection, neglect post-fall monitoring, and raise privacy concerns, especially in real-time, non-invasive applications. (2) Methods: This study proposes a lightweight, privacy-preserving long-lie detection system utilizing thermal imaging and a soft-voting ensemble classifier. A low-resolution thermal camera captured simulated falls and activities of daily living (ADL) performed by ten healthy participants. Human pose keypoints were extracted using MediaPipe, followed by the computation of five handcrafted postural features. The top three classifiers, automatically selected based on cross-validation performance, formed the soft-voting ensemble. Long-lie conditions were identified through post-fall immobility monitoring over a defined period, using rule-based logic on posture stability and duration. (3) Results: The ensemble model achieved high classification performance, with accuracy, precision, recall, and F1 score all at 0.98. Real-time deployment on a Raspberry Pi 5 demonstrated that the system is capable of accurately detecting long-lie incidents based on continuous monitoring over 15 min with minimal posture variation. (4) Conclusions: The proposed system introduces a novel approach to long-lie detection by integrating privacy-aware sensing, interpretable posture-based features, and efficient edge computing. It demonstrates strong potential for deployment in homecare settings. Future work includes validation with older adults and integration of vital sign monitoring for comprehensive assessment.
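A sketch of the ensemble-selection step described in (2): rank candidate classifiers by cross-validation and soft-vote the top three. The synthetic features stand in for the paper's five postural features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=5, random_state=0)
candidates = {
    "svm": SVC(probability=True),              # soft voting needs predict_proba
    "rf": RandomForestClassifier(),
    "lr": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(),
    "tree": DecisionTreeClassifier(),
}
scores = {n: cross_val_score(m, X, y, cv=5).mean() for n, m in candidates.items()}
top3 = sorted(scores, key=scores.get, reverse=True)[:3]
ensemble = VotingClassifier([(n, candidates[n]) for n in top3], voting="soft")
ensemble.fit(X, y)
print(top3, ensemble.predict(X[:3]))
```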

28 pages, 12681 KiB  
Article
MM-VSM: Multi-Modal Vehicle Semantic Mesh and Trajectory Reconstruction for Image-Based Cooperative Perception
by Márton Cserni, András Rövid and Zsolt Szalay
Appl. Sci. 2025, 15(12), 6930; https://doi.org/10.3390/app15126930 - 19 Jun 2025
Viewed by 448
Abstract
Recent advancements in cooperative 3D object detection have demonstrated significant potential for enhancing autonomous driving by integrating roadside infrastructure data. However, deploying comprehensive LiDAR-based cooperative perception systems remains prohibitively expensive and requires precisely annotated 3D data to function robustly. This paper proposes an improved multi-modal method that integrates LiDAR-based shape references into a previously mono-camera-based semantic vertex reconstruction framework, enabling robust and cost-effective monocular and cooperative pose estimation after the reconstruction. A novel camera-LiDAR loss function is proposed that combines re-projection loss from a multi-view camera system with LiDAR shape constraints. Experimental evaluations conducted on the Argoverse dataset and in real-world experiments demonstrate significantly improved shape-reconstruction robustness and accuracy, thereby improving pose estimation performance. The effectiveness of the algorithm is proven through a real-world smart valet parking application, evaluated in our university parking area with real vehicles. Our approach allows accurate 6DOF pose estimation using an inexpensive IP camera without requiring context-specific training, thereby advancing the state of the art in monocular and cooperative image-based vehicle localization.
(This article belongs to the Special Issue Advances in Autonomous Driving and Smart Transportation)
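A hedged numpy sketch of a combined camera-LiDAR objective of the kind the abstract names: multi-view re-projection error on semantic vertices plus a nearest-neighbor term pulling the shape onto LiDAR points. The projection model and weighting are illustrative, not the paper's.

```python
import numpy as np
from scipy.spatial import cKDTree

def project(P, X):                               # P: 3x4 camera, X: (N, 3)
    Xh = np.hstack([X, np.ones((len(X), 1))])
    x = Xh @ P.T
    return x[:, :2] / x[:, 2:]                   # pinhole projection

def loss(vertices, cams, obs2d, lidar, w_lidar=0.1):
    reproj = sum(np.sum((project(P, vertices) - uv) ** 2)
                 for P, uv in zip(cams, obs2d))  # multi-view image term
    d, _ = cKDTree(lidar).query(vertices)        # chamfer-style shape term
    return reproj + w_lidar * np.sum(d ** 2)

# Tiny synthetic check with one identity camera: loss near zero.
V = np.array([[0.0, 0.0, 4.0], [0.5, 0.2, 4.0]])
P = np.hstack([np.eye(3), np.zeros((3, 1))])
print(loss(V, [P], [project(P, V)], lidar=V + 0.01))
```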

35 pages, 21267 KiB  
Article
Unmanned Aerial Vehicle–Unmanned Ground Vehicle Centric Visual Semantic Simultaneous Localization and Mapping Framework with Remote Interaction for Dynamic Scenarios
by Chang Liu, Yang Zhang, Liqun Ma, Yong Huang, Keyan Liu and Guangwei Wang
Drones 2025, 9(6), 424; https://doi.org/10.3390/drones9060424 - 10 Jun 2025
Viewed by 1223
Abstract
In this study, we introduce an Unmanned Aerial Vehicle (UAV) centric visual semantic simultaneous localization and mapping (SLAM) framework that integrates RGB-D cameras, inertial measurement units (IMUs), and a 5G-enabled remote interaction module. Our system addresses three critical limitations in existing approaches: (1) distance constraints in remote operations; (2) static map assumptions in dynamic environments; and (3) high-dimensional perception requirements for UAV-based applications. By combining YOLO-based object detection with epipolar-constraint-based dynamic feature removal, our method achieves real-time semantic mapping while rejecting motion artifacts. The framework further incorporates a dual-channel communication architecture to enable seamless human-in-the-loop control over UAV-Unmanned Ground Vehicle (UGV) teams in large-scale scenarios. Experimental validation across indoor and outdoor environments indicates that the system can achieve a detection rate of up to 75 frames per second (FPS) on an NVIDIA Jetson AGX Xavier using YOLO-FASTEST, ensuring the rapid identification of dynamic objects. In dynamic scenarios, the localization accuracy attains an average absolute pose error (APE) of 0.1275 m, outperforming state-of-the-art methods like Dynamic-VINS (0.211 m) and ORB-SLAM3 (0.148 m) on the EuRoC MAV Dataset. The dual-channel communication architecture (Web Real-Time Communication (WebRTC) for video and Message Queuing Telemetry Transport (MQTT) for telemetry) reduces bandwidth consumption by 65% compared to traditional TCP-based protocols. Moreover, our hybrid dynamic feature filtering can reject 89% of dynamic features in occluded scenarios, guaranteeing accurate mapping in complex environments. Our framework represents a significant advancement in enabling intelligent UAVs/UGVs to navigate and interact in complex, dynamic environments, offering real-time semantic understanding and accurate localization.
(This article belongs to the Special Issue Advances in Perception, Communications, and Control for Drones)
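A sketch of epipolar-constraint dynamic-feature rejection as described above: fit a fundamental matrix to feature matches with RANSAC, then drop matches that sit far from their epipolar line (likely moving objects). The OpenCV calls are standard; thresholds are illustrative.

```python
import cv2
import numpy as np

def reject_dynamic(pts1: np.ndarray, pts2: np.ndarray, thresh_px: float = 2.0):
    """Return a boolean mask of matches consistent with a static scene."""
    F, _ = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.999)
    lines = cv2.computeCorrespondEpilines(pts1.reshape(-1, 1, 2), 1, F)
    lines = lines.reshape(-1, 3)                 # ax + by + c = 0 in image 2
    d = np.abs(np.sum(lines[:, :2] * pts2, axis=1) + lines[:, 2])
    return d < thresh_px                         # small epipolar distance

rng = np.random.default_rng(5)
pts1 = rng.uniform(0, 640, (50, 2)).astype(np.float32)
pts2 = pts1 + [5.0, 0.0]                         # static scene: pure shift
pts2[:3] += 40.0                                 # three "dynamic" outliers
static = reject_dynamic(pts1, pts2.astype(np.float32))
print(np.where(~static)[0])                      # -> [0 1 2]
```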

19 pages, 8306 KiB  
Article
Plant Sam Gaussian Reconstruction (PSGR): A High-Precision and Accelerated Strategy for Plant 3D Reconstruction
by Jinlong Chen, Yingjie Jiao, Fuqiang Jin, Xingguo Qin, Yi Ning, Minghao Yang and Yongsong Zhan
Electronics 2025, 14(11), 2291; https://doi.org/10.3390/electronics14112291 - 4 Jun 2025
Viewed by 580
Abstract
Plant 3D reconstruction plays a critical role in precision agriculture and plant growth monitoring, yet it faces challenges such as complex background interference, difficulties in capturing intricate plant structures, and slow reconstruction speed. In this study, we propose PlantSamGaussianReconstruction (PSGR), a novel method that integrates Grounding SAM with 3D Gaussian Splatting (3DGS) techniques. PSGR employs Grounding DINO and SAM for accurate plant-background segmentation, uses algorithms such as the Scale-Invariant Feature Transform (SIFT) for camera pose estimation and sparse point cloud generation, and leverages 3DGS for plant reconstruction. Furthermore, a 3D-2D projection-guided optimization strategy is introduced to enhance segmentation precision. Experimental results on various multi-view plant image datasets demonstrate that PSGR effectively removes background noise in diverse environments, accurately captures plant details, and achieves peak signal-to-noise ratio (PSNR) values exceeding 30 dB in most scenarios, outperforming the original 3DGS approach. Moreover, PSGR reduces training time by up to 26.9%, significantly improving reconstruction efficiency. These results suggest that PSGR is an efficient, scalable, and high-precision solution for plant modeling.
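A sketch of the SIFT-based pose step that precedes splatting: match SIFT features across two views, estimate the essential matrix, and recover the relative camera pose with standard OpenCV calls. The intrinsics and file names are placeholders.

```python
import cv2
import numpy as np

def relative_pose(img1, img2, K):
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img1, None)
    k2, d2 = sift.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(d1, d2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # Lowe ratio
    p1 = np.float32([k1[m.queryIdx].pt for m in good])
    p2 = np.float32([k2[m.trainIdx].pt for m in good])
    E, mask = cv2.findEssentialMat(p1, p2, K, cv2.RANSAC, 0.999, 1.0)
    _, R, t, _ = cv2.recoverPose(E, p1, p2, K, mask=mask)
    return R, t          # relative rotation and unit-scale translation

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])  # made-up intrinsics
# Usage (hypothetical files):
# R, t = relative_pose(cv2.imread("a.png", 0), cv2.imread("b.png", 0), K)
```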

36 pages, 2706 KiB  
Article
Towards Intelligent Assessment in Personalized Physiotherapy with Computer Vision
by Victor García and Olga C. Santos
Sensors 2025, 25(11), 3436; https://doi.org/10.3390/s25113436 - 29 May 2025
Viewed by 726
Abstract
Effective physiotherapy requires accurate and personalized assessments of patient mobility, yet traditional methods can be time-consuming and subjective. This study explores the potential of open-source computer vision algorithms, specifically YOLO Pose, to support automated, vision-based analysis in physiotherapy settings using information collected from optical sensors such as cameras. By extracting skeletal data from video input, the system enables objective evaluation of patient movements and rehabilitation progress. The visual information is then analyzed to propose a semantic framework that facilitates a structured interpretation of clinical parameters. Preliminary results indicate that YOLO Pose provides reliable pose estimation, offering a solid foundation for future enhancements, such as the integration of natural language processing (NLP) to improve patient interaction through empathetic, AI-driven support.
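A minimal sketch of the skeletal-data extraction described above using the ultralytics YOLO Pose API; the model name, video path, and the knee-angle use case are placeholders.

```python
import numpy as np
from ultralytics import YOLO

model = YOLO("yolov8n-pose.pt")                # pretrained COCO pose weights

def keypoints_from_video(path):
    """Yield per-frame (num_people, 17, 2) pixel keypoints."""
    for result in model(path, stream=True):    # stream=True: frame by frame
        if result.keypoints is not None:
            yield result.keypoints.xy.cpu().numpy()

# Hypothetical use: track right-knee flexion (COCO ids: 12 hip, 14 knee, 16 ankle).
for kpts in keypoints_from_video("session.mp4"):
    if len(kpts) == 0:
        continue                               # nobody detected in this frame
    hip, knee, ankle = kpts[0][12], kpts[0][14], kpts[0][16]
    v1, v2 = hip - knee, ankle - knee
    cosang = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
    print(f"right knee angle: {np.degrees(np.arccos(np.clip(cosang, -1, 1))):.1f} deg")
```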
