Search Results (302)

Search Parameters:
Keywords = accurate camera pose

18 pages, 12946 KiB  
Article
High-Resolution 3D Reconstruction of Individual Rice Tillers for Genetic Studies
by Jiexiong Xu, Jiyoung Lee, Gang Jiang and Xiangchao Gan
Agronomy 2025, 15(8), 1803; https://doi.org/10.3390/agronomy15081803 - 25 Jul 2025
Viewed by 155
Abstract
The architecture of rice tillers plays a pivotal role in yield potential, yet conventional phenotyping methods have struggled to capture these intricate three-dimensional (3D) structures with high fidelity. In this study, a 3D model reconstruction method was developed specifically for rice tillers to overcome the challenges posed by their slender, feature-poor morphology in multi-view stereo-based 3D reconstruction. By applying strategically designed colorful reference markers, high-resolution 3D tiller models of 231 rice landraces were reconstructed. Accurate phenotyping was achieved by introducing ScaleCalculator, a software tool that integrated depth images from a depth camera to calibrate the physical sizes of the 3D models. The high efficiency of the 3D model-based phenotyping pipeline was demonstrated by extracting the following seven key agronomic traits: flag leaf length, panicle length, first internode length below the panicle, stem length, flag leaf angle, second leaf angle from the panicle, and third leaf angle. Genome-wide association studies (GWAS) performed with these 3D traits identified numerous candidate genes, nine of which had been previously confirmed in the literature. This work provides a 3D phenomics solution tailored for slender organs and offers novel insights into the genetic regulation of complex morphological traits in rice.
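A minimal numpy sketch of the scale-calibration idea described above (not the authors' ScaleCalculator): recover the metric scale of an up-to-scale reconstruction by comparing inter-marker distances in model units against the same distances measured with a depth camera. All names and values are hypothetical.

```python
import numpy as np

def metric_scale(model_pts: np.ndarray, metric_pts: np.ndarray) -> float:
    """Least-squares scale s minimizing ||s*d_model - d_metric|| over all
    pairwise marker distances."""
    def pairwise(p):
        diff = p[:, None, :] - p[None, :, :]
        return np.linalg.norm(diff, axis=-1)[np.triu_indices(len(p), k=1)]
    d_model, d_metric = pairwise(model_pts), pairwise(metric_pts)
    return float(d_model @ d_metric / (d_model @ d_model))

# Hypothetical example: three reference markers seen in both coordinate frames.
markers_model = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
markers_depth_m = markers_model * 0.013            # ground truth: 1 unit = 13 mm
s = metric_scale(markers_model, markers_depth_m)
print(f"scale: 1 model unit = {s * 1000:.1f} mm")  # -> 13.0 mm
```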

18 pages, 2592 KiB  
Article
A Minimal Solution for Binocular Camera Relative Pose Estimation Based on the Gravity Prior
by Dezhong Chen, Kang Yan, Hongping Zhang and Zhenbao Yu
Remote Sens. 2025, 17(15), 2560; https://doi.org/10.3390/rs17152560 - 23 Jul 2025
Viewed by 165
Abstract
High-precision positioning is the foundation for the functionality of various intelligent agents. In complex environments, such as urban canyons, relative pose estimation using cameras is a crucial step in high-precision positioning. To take advantage of the ability of an inertial measurement unit (IMU) to provide relatively accurate gravity prior information over a short period, we propose a minimal solution method for the relative pose estimation of a stereo camera system assisted by the IMU. We rigidly connect the IMU to the camera system and use it to obtain the rotation matrices in the roll and pitch directions for the entire system, thereby reducing the minimum number of corresponding points required for relative pose estimation. In contrast to classic pose-estimation algorithms, our method can also calculate the camera focal length, which greatly expands its applicability. We constructed a simulated dataset and used it to compare and analyze the numerical stability of the proposed method and the impact of different levels of noise on algorithm performance. We also collected real-scene data using a drone and validated the proposed algorithm. The results on real data reveal that our method exhibits smaller errors in calculating the relative pose of the camera system compared with two classic reference algorithms. It achieves higher precision and stability and can provide a comparatively accurate camera focal length.
(This article belongs to the Section Urban Remote Sensing)
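A hedged sketch of the idea behind gravity-aided relative pose, not the paper's minimal solver: once IMU roll and pitch rotate both views into a gravity-aligned frame, the remaining relative rotation is a single yaw angle, so the epipolar constraint x2^T [t]_x R_z(yaw) x1 = 0 has only three unknowns (yaw plus the bearing of the unit translation), and three correspondences suffice in the minimal case. The sketch below solves the over-determined version on synthetic data.

```python
import numpy as np
from scipy.optimize import least_squares

def Rz(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def skew(t):
    return np.array([[0, -t[2], t[1]], [t[2], 0, -t[0]], [-t[1], t[0], 0]])

def residuals(params, x1, x2):
    # Gravity-aligned epipolar residual: E = [t]_x R_z(yaw), t a unit vector.
    yaw, az, el = params
    t = np.array([np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)])
    E = skew(t) @ Rz(yaw)
    return np.einsum("ni,ij,nj->n", x2, E, x1)

# Synthetic gravity-aligned pair: X_cam2 = R_z(yaw) X_cam1 + t.
rng = np.random.default_rng(0)
yaw_true, t_true = 0.3, np.array([1.0, 0.5, 0.2])
X1 = rng.uniform([-1, -1, 4], [1, 1, 8], (6, 3))
X2 = X1 @ Rz(yaw_true).T + t_true
x1, x2 = X1 / X1[:, 2:], X2 / X2[:, 2:]        # normalized image coordinates

sol = least_squares(residuals, x0=[0.0, 0.4, 0.1], args=(x1, x2))
print(f"yaw: {sol.x[0]:.3f} (true {yaw_true})")
```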

25 pages, 8560 KiB  
Article
Visual Point Cloud Map Construction and Matching Localization for Autonomous Vehicle
by Shuchen Xu, Kedong Zhao, Yongrong Sun, Xiyu Fu and Kang Luo
Drones 2025, 9(7), 511; https://doi.org/10.3390/drones9070511 - 21 Jul 2025
Viewed by 294
Abstract
Collaboration between autonomous vehicles and drones can enhance the efficiency and connectivity of three-dimensional transportation systems. When satellite signals are unavailable, vehicles can achieve accurate localization by matching rich ground environmental data to digital maps, simultaneously providing auxiliary localization information for drones. However, conventional digital maps suffer from high construction costs, easy misalignment, and low localization accuracy. Thus, this paper proposes a visual point cloud map (VPCM) construction and matching localization method for autonomous vehicles. We fuse multi-source information from vehicle-mounted sensors and the regional road network to establish a geographically referenced, high-precision VPCM. In the absence of satellite signals, we segment the prior VPCM along the road network based on real-time localization results, which accelerates matching and reduces the probability of mismatches. Simultaneously, by continuously introducing matching constraints between the real-time point cloud and the prior VPCM through an improved iterative closest point (ICP) matching method, the proposed solution can effectively suppress the drift error of the odometry and output accurate fused localization results based on pose-graph optimization. Experiments on the KITTI datasets demonstrate the effectiveness of the proposed method, which can autonomously construct the high-precision prior VPCM. The localization strategy achieves sub-meter accuracy and reduces the average error per frame by 25.84% compared to similar methods. The method's reusability and localization robustness under lighting and environment changes are further verified on a campus dataset. Compared to a similar camera-based method, the matching success rate increased by 21.15%, and the average localization error decreased by 62.39%.
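The matching step above builds on iterative closest point alignment; below is the textbook point-to-point ICP building block (not the paper's improved variant), with a synthetic check that a known rigid motion is recovered.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(src, dst, iters=20):
    """Point-to-point ICP: align src onto dst, return accumulated R, t."""
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(dst)
    cur = src.copy()
    for _ in range(iters):
        _, idx = tree.query(cur)               # nearest-neighbor matches
        d = dst[idx]
        mu_s, mu_d = cur.mean(0), d.mean(0)
        H = (cur - mu_s).T @ (d - mu_d)        # 3x3 cross-covariance
        U, _, Vt = np.linalg.svd(H)
        S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        dR = Vt.T @ S @ U.T                    # Kabsch rotation, reflection-safe
        dt = mu_d - dR @ mu_s
        cur = cur @ dR.T + dt
        R, t = dR @ R, dR @ t + dt             # accumulate the increments
    return R, t

# Hypothetical check: recover a known rigid motion between two point clouds.
rng = np.random.default_rng(1)
dst = rng.normal(size=(500, 3))
a = 0.2                                        # yaw of the true motion
Rzm = np.array([[np.cos(a), -np.sin(a), 0], [np.sin(a), np.cos(a), 0], [0, 0, 1]])
src = (dst - [0.1, 0.3, 0.0]) @ Rzm            # so that dst = Rzm @ src + t
R, t = icp(src, dst)
print(f"recovered yaw {np.arctan2(R[1, 0], R[0, 0]):.3f}, t {np.round(t, 3)}")
```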

23 pages, 5173 KiB  
Article
Improvement of Cooperative Localization for Heterogeneous Mobile Robots
by Efe Oğuzhan Karcı, Ahmet Mustafa Kangal and Sinan Öncü
Drones 2025, 9(7), 507; https://doi.org/10.3390/drones9070507 - 19 Jul 2025
Viewed by 324
Abstract
This research focuses on enhancing cooperative localization for heterogeneous mobile robots composed of a quadcopter and an unmanned ground vehicle. The study employs sensor fusion techniques, particularly the Extended Kalman Filter, to fuse data from various sensors, including GPS receivers, IMUs, and cameras. By integrating these sensors and optimizing fusion strategies, the research aims to improve the precision and reliability of cooperative localization in complex and dynamic environments. The primary objective is to develop a practical framework for cooperative localization that addresses the challenges posed by the differences in mobility and sensing capabilities among heterogeneous robots. Sensor fusion is used to compensate for the limitations of individual sensors, providing more accurate and robust localization results. Moreover, a comparative analysis of different sensor combinations and fusion strategies helps to identify the optimal configuration for each robot. The work also covers the improvement of cooperative localization, path planning, and collaborative tasks for heterogeneous robots. The findings have broad applications in fields such as autonomous transportation, agricultural operation, and disaster response, where the cooperation of diverse robotic platforms is crucial for mission success.
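For readers unfamiliar with the fusion step, here is a toy linear Kalman filter with a GPS-style position update; the Extended variant used for camera and IMU measurements linearizes nonlinear models with Jacobians at each step. All noise values are illustrative.

```python
import numpy as np

dt = 0.1
F = np.eye(4)
F[0, 2] = F[1, 3] = dt                          # constant-velocity model
Q = np.diag([1e-4, 1e-4, 1e-2, 1e-2])           # process noise (illustrative)
H = np.array([[1.0, 0, 0, 0], [0, 1.0, 0, 0]])  # GPS observes position only
R = np.diag([0.5, 0.5])                         # GPS noise covariance (m^2)

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    y = z - H @ x                               # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)              # Kalman gain
    return x + K @ y, (np.eye(4) - K @ H) @ P

x, P = np.zeros(4), np.eye(4)
for z in ([1.0, 0.1], [1.2, 0.2], [1.4, 0.3]):  # synthetic GPS fixes
    x, P = predict(x, P)
    x, P = update(x, P, np.array(z))
print("fused position:", x[:2])
```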

21 pages, 9749 KiB  
Article
Enhanced Pose Estimation for Badminton Players via Improved YOLOv8-Pose with Efficient Local Attention
by Yijian Wu, Zewen Chen, Hongxing Zhang, Yulin Yang and Weichao Yi
Sensors 2025, 25(14), 4446; https://doi.org/10.3390/s25144446 - 17 Jul 2025
Viewed by 370
Abstract
With the rapid development of sports analytics and artificial intelligence, accurate human pose estimation in badminton is becoming increasingly important. However, challenges such as the lack of domain-specific datasets and the complexity of athletes’ movements continue to hinder progress in this area. To address these issues, we propose an enhanced pose estimation framework tailored to badminton players, built upon an improved YOLOv8-Pose architecture. In particular, we introduce an efficient local attention (ELA) mechanism that effectively captures fine-grained spatial dependencies and contextual information, thereby significantly improving the keypoint localization accuracy and overall pose estimation performance. To support this study, we construct a dedicated badminton pose dataset comprising 4000 manually annotated samples, captured using a Microsoft Kinect v2 camera. The raw data undergo careful processing and refinement through a combination of depth-assisted annotation and visual inspection to ensure high-quality ground truth keypoints. Furthermore, we conduct an in-depth comparative analysis of multiple attention modules and their integration strategies within the network, offering generalizable insights to enhance pose estimation models in other sports domains. The experimental results show that the proposed ELA-enhanced YOLOv8-Pose model consistently achieves superior accuracy across multiple evaluation metrics, including the mean squared error (MSE), object keypoint similarity (OKS), and percentage of correct keypoints (PCK), highlighting its effectiveness and potential for broader applications in sports vision tasks.
(This article belongs to the Special Issue Computer Vision-Based Human Activity Recognition)
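A hedged PyTorch sketch of an efficient-local-attention-style block of the kind the abstract describes: pool along each spatial axis, run a shared 1D convolution, and gate the feature map. The paper's exact module may differ in details.

```python
import torch
import torch.nn as nn

class ELA(nn.Module):
    """Efficient-local-attention-style gating (sketch). channels must be
    divisible by the GroupNorm group count (16 here)."""
    def __init__(self, channels: int, kernel_size: int = 7):
        super().__init__()
        pad = kernel_size // 2
        self.conv = nn.Conv1d(channels, channels, kernel_size,
                              padding=pad, groups=channels, bias=False)
        self.norm = nn.GroupNorm(16, channels)
        self.act = nn.Sigmoid()

    def forward(self, x):                       # x: (B, C, H, W)
        b, c, h, w = x.shape
        ah = self.act(self.norm(self.conv(x.mean(dim=3))))  # attention over H
        aw = self.act(self.norm(self.conv(x.mean(dim=2))))  # attention over W
        return x * ah.view(b, c, h, 1) * aw.view(b, c, 1, w)

feat = torch.randn(2, 64, 80, 80)               # e.g. a YOLOv8 neck feature map
print(ELA(64)(feat).shape)                      # torch.Size([2, 64, 80, 80])
```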

25 pages, 9517 KiB  
Article
YOLOv8n-SSDW: A Lightweight and Accurate Model for Barnyard Grass Detection in Fields
by Yan Sun, Hanrui Guo, Xiaoan Chen, Mengqi Li, Bing Fang and Yingli Cao
Agriculture 2025, 15(14), 1510; https://doi.org/10.3390/agriculture15141510 - 13 Jul 2025
Cited by 1 | Viewed by 299
Abstract
Barnyard grass is a major noxious weed in paddy fields. Accurate and efficient identification of barnyard grass is crucial for precision field management. However, existing deep learning models generally suffer from high parameter counts and computational complexity, limiting their practical application in field scenarios. Moreover, the morphological similarity, overlapping, and occlusion between barnyard grass and rice pose challenges for reliable detection in complex environments. To address these issues, this study constructed a barnyard grass detection dataset using high-resolution images captured by a drone equipped with a high-definition camera in rice experimental fields in Haicheng City, Liaoning Province. A lightweight field barnyard grass detection model, YOLOv8n-SSDW, was proposed to enhance detection precision and speed. Based on the baseline YOLOv8n model, a novel Separable Residual Coord Conv (SRCConv) was designed to replace the original convolution module, significantly reducing parameters while maintaining detection accuracy. The Spatio-Channel Enhanced Attention Module (SEAM) was introduced and optimized to improve sensitivity to barnyard grass edge features. Additionally, the lightweight and efficient Dysample upsampling module was incorporated to enhance feature map resolution. A new WIoU loss function was developed to improve bounding box classification and regression accuracy. Comprehensive performance analysis demonstrated that YOLOv8n-SSDW outperformed state-of-the-art models. Ablation studies confirmed the effectiveness of each improvement module. The final fused model achieved lightweight performance while improving detection accuracy, with a 2.2% increase in mAP_50, 3.8% higher precision, 0.6% higher recall, 10.6% fewer parameters, 9.8% lower FLOPs, and an 11.1% reduction in model size compared to the baseline. Field tests using drones combined with ground-based computers further validated the model’s robustness in real-world complex paddy environments. The results indicate that YOLOv8n-SSDW exhibits excellent accuracy and efficiency. This study provides valuable insights for barnyard grass detection in rice fields.
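Not the paper's SRCConv, but the separable-convolution idea such lightweight modules build on: factoring a KxK convolution into a depthwise KxK plus a pointwise 1x1 cuts parameters by roughly a factor of K*K for wide layers.

```python
import torch.nn as nn

def params(m):
    return sum(p.numel() for p in m.parameters())

c, k = 128, 3
standard = nn.Conv2d(c, c, k, padding=1, bias=False)
separable = nn.Sequential(
    nn.Conv2d(c, c, k, padding=1, groups=c, bias=False),  # depthwise 3x3
    nn.Conv2d(c, c, 1, bias=False),                        # pointwise 1x1
)
print(params(standard), params(separable))   # 147456 vs 17536
```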

23 pages, 8966 KiB  
Article
Object-Specific Multiview Classification Through View-Compatible Feature Fusion
by Javier Perez Soler, Jose-Luis Guardiola, Nicolás García Sastre, Pau Garrigues Carbó, Miguel Sanchis Hernández and Juan-Carlos Perez-Cortes
Sensors 2025, 25(13), 4127; https://doi.org/10.3390/s25134127 - 2 Jul 2025
Viewed by 323
Abstract
Multi-view classification (MVC) typically focuses on categorizing objects into distinct classes by employing multiple perspectives of the same objects. However, in numerous real-world applications, such as industrial inspection and quality control, there is an increasing need to distinguish particular objects from a pool of similar ones while simultaneously disregarding unknown objects. In these scenarios, relying on a single image may not provide sufficient information to effectively identify the scrutinized object, as different perspectives may reveal distinct characteristics that are essential for accurate classification. Most existing approaches operate within closed-set environments and are focused on generalization, which makes them less effective in distinguishing individual objects from others. These limitations are particularly problematic in industrial quality assessment, where distinguishing between specific objects and discarding unknowns is crucial. To address this challenge, we introduce a View-Compatible Feature Fusion (VCFF) method that utilizes images from predetermined positions as an accurate solution for multi-view classification of specific objects. Unlike other approaches, VCFF explicitly integrates pose information during the fusion process. It does not merely use pose as auxiliary data but employs it to align and selectively fuse features from different views. This mathematically explicit fusion of rotations, based on relative poses, allows VCFF to effectively combine multi-view information, enhancing classification accuracy. Through experimental evaluations, we demonstrate that the proposed VCFF method outperforms state-of-the-art MVC algorithms, especially in open-set scenarios, where the set of possible objects is not fully known in advance. Remarkably, VCFF achieves an average precision of 1.0 using only 8 cameras, whereas existing methods require 20 cameras to reach a maximum of 0.95. In terms of AUC-ROC under the constraint of fewer than 3σ false positives, a critical metric in industrial inspection, current state-of-the-art methods achieve up to 0.72, while VCFF attains a perfect score of 1.0 with just eight cameras. Furthermore, our approach delivers highly accurate rotation estimation, maintaining an error margin slightly above 2° when sampling at 4° intervals.
(This article belongs to the Special Issue Sensors for Object Detection, Pose Estimation, and 3D Reconstruction)
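A purely illustrative toy of pose-aware fusion, not the paper's VCFF: each view contributes an embedding plus a known camera rotation, and embeddings are rotated into a shared frame before pooling, so fusion respects relative poses rather than averaging blindly. The 3-vector block structure of the features is an assumption made for the demo.

```python
import numpy as np

def fuse(embeddings, rotations):
    """embeddings: (V, D) with D divisible by 3; rotations: (V, 3, 3),
    each view's camera-to-reference rotation. Rotate every 3-vector block
    of each embedding into the shared frame, then average across views."""
    v, d = embeddings.shape
    blocks = embeddings.reshape(v, d // 3, 3)
    aligned = np.einsum("vij,vkj->vki", rotations, blocks).reshape(v, d)
    return aligned.mean(axis=0)

def random_rotation(rng):
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    return q * np.sign(np.linalg.det(q))        # force a proper rotation

rng = np.random.default_rng(2)
emb = rng.normal(size=(8, 96))                  # 8 cameras, 96-D descriptors
Rs = np.stack([random_rotation(rng) for _ in range(8)])
print(fuse(emb, Rs).shape)                      # (96,)
```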

23 pages, 9135 KiB  
Article
Stone Detection on Agricultural Land Using Thermal Imagery from Unmanned Aerial Systems
by Florian Thürkow, Mike Teucher, Detlef Thürkow and Milena Mohri
AgriEngineering 2025, 7(7), 203; https://doi.org/10.3390/agriengineering7070203 - 1 Jul 2025
Viewed by 566
Abstract
Stones in agricultural fields pose a recurring challenge, particularly due to their potential to damage agricultural machinery and disrupt field operations. As modern agriculture moves toward automation and precision farming, efficient stone detection has become a critical concern. This study explores the potential of thermal imaging as a non-invasive method for detecting stones under varying environmental conditions. A series of controlled laboratory experiments and field investigations confirmed the assumption that stones exhibit higher surface temperatures than the surrounding soil, especially when soil moisture is high and air temperatures are cooling rapidly. This temperature difference is attributed to the higher thermal inertia of stones, which allows them to absorb and retain heat longer than soil, as well as to the evaporative cooling from moist soil. These findings demonstrate the viability of thermal cameras as a tool for stone detection in precision farming. Incorporating this technology with GPS mapping enables the generation of accurate location data, facilitating targeted stone removal and reducing equipment damage. This approach aligns with the goals of sustainable agricultural engineering by supporting field automation, minimizing mechanical inefficiencies, and promoting data-driven decisions. Thermal imaging thereby contributes to the evolution of next-generation agricultural systems.
(This article belongs to the Special Issue Recent Trends and Advances in Agricultural Engineering)
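An illustrative detection step consistent with the reported finding that stones run warmer than moist soil: threshold the thermal frame a few standard deviations above the background and keep large connected blobs. Function names and values are hypothetical.

```python
import numpy as np
from scipy import ndimage

def detect_stones(thermal: np.ndarray, k: float = 2.0, min_px: int = 20):
    """thermal: 2D array of surface temperatures (deg C). Returns pixel
    centroids of warm blobs at least min_px pixels in size."""
    mask = thermal > thermal.mean() + k * thermal.std()
    labels, n = ndimage.label(mask)
    sizes = ndimage.sum(mask, labels, index=range(1, n + 1))
    keep = [i + 1 for i, s in enumerate(sizes) if s >= min_px]
    return ndimage.center_of_mass(mask, labels, keep)

# Synthetic frame: ~18 C soil with two ~24 C "stones".
rng = np.random.default_rng(3)
frame = np.full((120, 160), 18.0) + rng.normal(0, 0.2, (120, 160))
frame[40:50, 60:72] = 24.0
frame[90:97, 20:30] = 24.0
print(detect_stones(frame))   # two centroids; map to GPS via the platform pose
```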

19 pages, 3319 KiB  
Article
Frailty-Focused Movement Monitoring: A Single-Camera System Using Joint Angles for Assessing Chair-Based Exercise Quality
by Teng Qi, Miyuki Iwamoto, Dongeun Choi, Noriyuki Kida and Noriaki Kuwahara
Sensors 2025, 25(13), 3907; https://doi.org/10.3390/s25133907 - 23 Jun 2025
Viewed by 388
Abstract
Ensuring that older adults perform chair-based exercises (CBEs) correctly is essential for improving physical outcomes and reducing the risk of injury, particularly in home and community rehabilitation settings. However, evaluating the correctness of movements accurately and objectively outside clinical environments remains challenging. In this study, camera-based methods have been used to evaluate practical exercise quality. A single-camera system utilizing MediaPipe pose estimation was used to capture joint angle data as twenty older adults performed eight CBEs. Simultaneously, surface electromyography (sEMG) recorded muscle activity. Participants were guided to perform both proper and commonly observed incorrect forms of each movement. Statistical analyses compared joint angles and sEMG signals, and a support vector machine (SVM) was trained to classify movement correctness. The analysis showed that correct executions consistently produced distinct joint angle patterns and significantly higher sEMG activity than incorrect ones (p < 0.001). After modifying the selection of joint angle features for Movement 5 (M5), the classification accuracy improved to 96.26%. Including M5, the average classification accuracy across all eight exercises reached 97.77%, demonstrating the overall robustness and consistency of the proposed approach. In contrast, high variability across individuals made sEMG less reliable as a standalone indicator of correctness. The strong classification performance based on joint angles highlights the potential of this approach for real-world applications. While sEMG signals confirmed the physiological differences between correct and incorrect executions, their individual variability limits their generalizability as a sole criterion. Joint angle data derived from a simple single-camera setup can effectively distinguish movement quality in older adults, offering a low-cost, user-friendly solution for real-time feedback in home and community settings. This approach may help support independent exercise and reduce reliance on professional supervision.
(This article belongs to the Section Intelligent Sensors)
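A sketch of the two building blocks the abstract describes, on hypothetical data: a joint angle computed from three pose keypoints (as MediaPipe provides) and an SVM separating correct from incorrect executions using angle features.

```python
import numpy as np
from sklearn.svm import SVC

def joint_angle(a, b, c):
    """Angle at b (degrees) formed by points a-b-c, e.g. hip-knee-ankle."""
    v1, v2 = np.asarray(a) - b, np.asarray(c) - b
    cosang = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

print(joint_angle([0, 0], [0, 1], [1, 1]))     # -> 90.0

# Toy training set: rows of per-repetition angle features, 1 = correct form.
rng = np.random.default_rng(4)
correct = rng.normal([90, 170, 45], 5, (40, 3))
wrong = rng.normal([70, 150, 70], 5, (40, 3))
X = np.vstack([correct, wrong])
y = np.array([1] * 40 + [0] * 40)
clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict([[88, 168, 47]]))            # -> [1]
```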

22 pages, 6243 KiB  
Review
A Review on UAS Trajectory Estimation Using Decentralized Multi-Sensor Systems Based on Robotic Total Stations
by Lucas Dammert, Tomas Thalmann, David Monetti, Hans-Berndt Neuner and Gottfried Mandlburger
Sensors 2025, 25(13), 3838; https://doi.org/10.3390/s25133838 - 20 Jun 2025
Viewed by 468
Abstract
In our contribution, we conduct a thematic literature review on trajectory estimation using a decentralized multi-sensor system based on robotic total stations (RTS) with a focus on unmanned aerial system (UAS) platforms. While RTS are commonly used for trajectory estimation in areas where GNSS is not sufficiently accurate or is unavailable, they are rarely used for UAS trajectory estimation. Extending the RTS with integrated camera images allows for UAS pose estimation (position and orientation). We review existing research on the entire RTS measurement process, including time synchronization, atmospheric refraction, prism interaction, and RTS-based image evaluation. Additionally, we focus on integrated trajectory estimation using UAS onboard measurements such as IMU and laser scanning data. Although many existing articles address individual steps of the decentralized multi-sensor system, we demonstrate that a combination of existing works related to UAS trajectory estimation and RTS calibration is needed to allow for trajectory estimation at sub-cm and sub-0.01 gon accuracies, and we identify the challenges that must be addressed. Investigations into the use of RTS for kinematic tasks must be extended to realistic distances (approx. 300–500 m) and speeds (>2.5 m/s). In particular, image acquisition with the integrated camera must be extended by a time synchronization approach. As to the estimation of UAS orientation based on RTS camera images, the results of initial simulation studies must be validated by field tests, and existing approaches for integrated trajectory estimation must be adapted to optimally integrate RTS data.
(This article belongs to the Section Sensors and Robotics)

22 pages, 8644 KiB  
Article
Privacy-Preserving Approach for Early Detection of Long-Lie Incidents: A Pilot Study with Healthy Subjects
by Riska Analia, Anne Forster, Sheng-Quan Xie and Zhiqiang Zhang
Sensors 2025, 25(12), 3836; https://doi.org/10.3390/s25123836 - 19 Jun 2025
Viewed by 633
Abstract
(1) Background: Detecting long-lie incidents, where individuals remain immobile after a fall, is essential for timely intervention and preventing severe health consequences. However, most existing systems focus only on fall detection, neglect post-fall monitoring, and raise privacy concerns, especially in real-time, non-invasive applications. (2) Methods: This study proposes a lightweight, privacy-preserving long-lie detection system utilizing thermal imaging and a soft-voting ensemble classifier. A low-resolution thermal camera captured simulated falls and activities of daily living (ADL) performed by ten healthy participants. Human pose keypoints were extracted using MediaPipe, followed by the computation of five handcrafted postural features. The top three classifiers, automatically selected based on cross-validation performance, formed the soft-voting ensemble. Long-lie conditions were identified through post-fall immobility monitoring over a defined period, using rule-based logic on posture stability and duration. (3) Results: The ensemble model achieved high classification performance, with accuracy, precision, recall, and F1 score all at 0.98. Real-time deployment on a Raspberry Pi 5 demonstrated that the system is capable of accurately detecting long-lie incidents based on continuous monitoring over 15 min with minimal posture variation. (4) Conclusions: The proposed system introduces a novel approach to long-lie detection by integrating privacy-aware sensing, interpretable posture-based features, and efficient edge computing. It demonstrates strong potential for deployment in homecare settings. Future work includes validation with older adults and integration of vital sign monitoring for comprehensive assessment.
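A sketch of the ensemble-selection step described in (2): rank candidate classifiers by cross-validation and soft-vote the top three. The synthetic features stand in for the paper's five postural features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=5, random_state=0)
candidates = {
    "svm": SVC(probability=True),              # soft voting needs predict_proba
    "rf": RandomForestClassifier(),
    "lr": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(),
    "tree": DecisionTreeClassifier(),
}
scores = {n: cross_val_score(m, X, y, cv=5).mean() for n, m in candidates.items()}
top3 = sorted(scores, key=scores.get, reverse=True)[:3]
ensemble = VotingClassifier([(n, candidates[n]) for n in top3], voting="soft")
ensemble.fit(X, y)
print(top3, ensemble.predict(X[:3]))
```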

28 pages, 12681 KiB  
Article
MM-VSM: Multi-Modal Vehicle Semantic Mesh and Trajectory Reconstruction for Image-Based Cooperative Perception
by Márton Cserni, András Rövid and Zsolt Szalay
Appl. Sci. 2025, 15(12), 6930; https://doi.org/10.3390/app15126930 - 19 Jun 2025
Viewed by 448
Abstract
Recent advancements in cooperative 3D object detection have demonstrated significant potential for enhancing autonomous driving by integrating roadside infrastructure data. However, deploying comprehensive LiDAR-based cooperative perception systems remains prohibitively expensive and requires precisely annotated 3D data to function robustly. This paper proposes an improved multi-modal method that integrates LiDAR-based shape references into a previously mono-camera-based semantic vertex reconstruction framework, enabling robust and cost-effective monocular and cooperative pose estimation after the reconstruction. A novel camera-LiDAR loss function is proposed that combines re-projection loss from a multi-view camera system with LiDAR shape constraints. Experimental evaluations conducted on the Argoverse dataset and in real-world experiments demonstrate significantly improved shape-reconstruction robustness and accuracy, thereby improving pose estimation performance. The effectiveness of the algorithm is proven through a real-world smart valet parking application, evaluated in our university parking area with real vehicles. Our approach allows accurate 6DOF pose estimation using an inexpensive IP camera without requiring context-specific training, thereby advancing the state of the art in monocular and cooperative image-based vehicle localization.
(This article belongs to the Special Issue Advances in Autonomous Driving and Smart Transportation)
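A hedged numpy sketch of a combined camera-LiDAR objective of the kind the abstract names: multi-view re-projection error on semantic vertices plus a nearest-neighbor term pulling the shape onto LiDAR points. The projection model and weighting are illustrative, not the paper's.

```python
import numpy as np
from scipy.spatial import cKDTree

def project(P, X):                               # P: 3x4 camera, X: (N, 3)
    Xh = np.hstack([X, np.ones((len(X), 1))])
    x = Xh @ P.T
    return x[:, :2] / x[:, 2:]                   # pinhole projection

def loss(vertices, cams, obs2d, lidar, w_lidar=0.1):
    reproj = sum(np.sum((project(P, vertices) - uv) ** 2)
                 for P, uv in zip(cams, obs2d))  # multi-view image term
    d, _ = cKDTree(lidar).query(vertices)        # chamfer-style shape term
    return reproj + w_lidar * np.sum(d ** 2)

# Tiny synthetic check with one identity camera: loss near zero.
V = np.array([[0.0, 0.0, 4.0], [0.5, 0.2, 4.0]])
P = np.hstack([np.eye(3), np.zeros((3, 1))])
print(loss(V, [P], [project(P, V)], lidar=V + 0.01))
```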

35 pages, 21267 KiB  
Article
Unmanned Aerial Vehicle–Unmanned Ground Vehicle Centric Visual Semantic Simultaneous Localization and Mapping Framework with Remote Interaction for Dynamic Scenarios
by Chang Liu, Yang Zhang, Liqun Ma, Yong Huang, Keyan Liu and Guangwei Wang
Drones 2025, 9(6), 424; https://doi.org/10.3390/drones9060424 - 10 Jun 2025
Viewed by 1223
Abstract
In this study, we introduce an Unmanned Aerial Vehicle (UAV) centric visual semantic simultaneous localization and mapping (SLAM) framework that integrates RGB-D cameras, inertial measurement units (IMUs), and a 5G-enabled remote interaction module. Our system addresses three critical limitations in existing approaches: (1) distance constraints in remote operations; (2) static map assumptions in dynamic environments; and (3) high-dimensional perception requirements for UAV-based applications. By combining YOLO-based object detection with epipolar-constraint-based dynamic feature removal, our method achieves real-time semantic mapping while rejecting motion artifacts. The framework further incorporates a dual-channel communication architecture to enable seamless human-in-the-loop control over UAV-Unmanned Ground Vehicle (UGV) teams in large-scale scenarios. Experimental validation across indoor and outdoor environments indicates that the system can achieve a detection rate of up to 75 frames per second (FPS) on an NVIDIA Jetson AGX Xavier using YOLO-FASTEST, ensuring the rapid identification of dynamic objects. In dynamic scenarios, the localization accuracy attains an average absolute pose error (APE) of 0.1275 m, outperforming state-of-the-art methods like Dynamic-VINS (0.211 m) and ORB-SLAM3 (0.148 m) on the EuRoC MAV Dataset. The dual-channel communication architecture (Web Real-Time Communication (WebRTC) for video and Message Queuing Telemetry Transport (MQTT) for telemetry) reduces bandwidth consumption by 65% compared to traditional TCP-based protocols. Moreover, our hybrid dynamic feature filtering can reject 89% of dynamic features in occluded scenarios, guaranteeing accurate mapping in complex environments. Our framework represents a significant advancement in enabling intelligent UAVs/UGVs to navigate and interact in complex, dynamic environments, offering real-time semantic understanding and accurate localization.
(This article belongs to the Special Issue Advances in Perception, Communications, and Control for Drones)
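A sketch of epipolar-constraint dynamic-feature rejection as described above: fit a fundamental matrix to feature matches with RANSAC, then drop matches that sit far from their epipolar line (likely moving objects). The OpenCV calls are standard; thresholds are illustrative.

```python
import cv2
import numpy as np

def reject_dynamic(pts1: np.ndarray, pts2: np.ndarray, thresh_px: float = 2.0):
    """Return a boolean mask of matches consistent with a static scene."""
    F, _ = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.999)
    lines = cv2.computeCorrespondEpilines(pts1.reshape(-1, 1, 2), 1, F)
    lines = lines.reshape(-1, 3)                 # ax + by + c = 0 in image 2
    d = np.abs(np.sum(lines[:, :2] * pts2, axis=1) + lines[:, 2])
    return d < thresh_px                         # small epipolar distance

rng = np.random.default_rng(5)
pts1 = rng.uniform(0, 640, (50, 2)).astype(np.float32)
pts2 = pts1 + [5.0, 0.0]                         # static scene: pure shift
pts2[:3] += 40.0                                 # three "dynamic" outliers
static = reject_dynamic(pts1, pts2.astype(np.float32))
print(np.where(~static)[0])                      # -> [0 1 2]
```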

19 pages, 8306 KiB  
Article
Plant Sam Gaussian Reconstruction (PSGR): A High-Precision and Accelerated Strategy for Plant 3D Reconstruction
by Jinlong Chen, Yingjie Jiao, Fuqiang Jin, Xingguo Qin, Yi Ning, Minghao Yang and Yongsong Zhan
Electronics 2025, 14(11), 2291; https://doi.org/10.3390/electronics14112291 - 4 Jun 2025
Viewed by 580
Abstract
Plant 3D reconstruction plays a critical role in precision agriculture and plant growth monitoring, yet it faces challenges such as complex background interference, difficulties in capturing intricate plant structures, and slow reconstruction speed. In this study, we propose PlantSamGaussianReconstruction (PSGR), a novel method that integrates Grounding SAM with 3D Gaussian Splatting (3DGS) techniques. PSGR employs Grounding DINO and SAM for accurate plant-background segmentation, uses algorithms such as the Scale-Invariant Feature Transform (SIFT) for camera pose estimation and sparse point cloud generation, and leverages 3DGS for plant reconstruction. Furthermore, a 3D-2D projection-guided optimization strategy is introduced to enhance segmentation precision. Experimental results on various multi-view plant image datasets demonstrate that PSGR effectively removes background noise in diverse environments, accurately captures plant details, and achieves peak signal-to-noise ratio (PSNR) values exceeding 30 dB in most scenarios, outperforming the original 3DGS approach. Moreover, PSGR reduces training time by up to 26.9%, significantly improving reconstruction efficiency. These results suggest that PSGR is an efficient, scalable, and high-precision solution for plant modeling.
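A sketch of the SIFT-based pose step that precedes splatting: match SIFT features across two views, estimate the essential matrix, and recover the relative camera pose with standard OpenCV calls. The intrinsics and file names are placeholders.

```python
import cv2
import numpy as np

def relative_pose(img1, img2, K):
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img1, None)
    k2, d2 = sift.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(d1, d2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # Lowe ratio
    p1 = np.float32([k1[m.queryIdx].pt for m in good])
    p2 = np.float32([k2[m.trainIdx].pt for m in good])
    E, mask = cv2.findEssentialMat(p1, p2, K, cv2.RANSAC, 0.999, 1.0)
    _, R, t, _ = cv2.recoverPose(E, p1, p2, K, mask=mask)
    return R, t          # relative rotation and unit-scale translation

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])  # made-up intrinsics
# Usage (hypothetical files):
# R, t = relative_pose(cv2.imread("a.png", 0), cv2.imread("b.png", 0), K)
```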

36 pages, 2706 KiB  
Article
Towards Intelligent Assessment in Personalized Physiotherapy with Computer Vision
by Victor García and Olga C. Santos
Sensors 2025, 25(11), 3436; https://doi.org/10.3390/s25113436 - 29 May 2025
Viewed by 726
Abstract
Effective physiotherapy requires accurate and personalized assessments of patient mobility, yet traditional methods can be time-consuming and subjective. This study explores the potential of open-source computer vision algorithms, specifically YOLO Pose, to support automated, vision-based analysis in physiotherapy settings using information collected from optical sensors such as cameras. By extracting skeletal data from video input, the system enables objective evaluation of patient movements and rehabilitation progress. The visual information is then analyzed to propose a semantic framework that facilitates a structured interpretation of clinical parameters. Preliminary results indicate that YOLO Pose provides reliable pose estimation, offering a solid foundation for future enhancements, such as the integration of natural language processing (NLP) to improve patient interaction through empathetic, AI-driven support.
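A minimal sketch of the skeletal-data extraction described above using the ultralytics YOLO Pose API; the model name, video path, and the knee-angle use case are placeholders.

```python
import numpy as np
from ultralytics import YOLO

model = YOLO("yolov8n-pose.pt")                # pretrained COCO pose weights

def keypoints_from_video(path):
    """Yield per-frame (num_people, 17, 2) pixel keypoints."""
    for result in model(path, stream=True):    # stream=True: frame by frame
        if result.keypoints is not None:
            yield result.keypoints.xy.cpu().numpy()

# Hypothetical use: track right-knee flexion (COCO ids: 12 hip, 14 knee, 16 ankle).
for kpts in keypoints_from_video("session.mp4"):
    if len(kpts) == 0:
        continue                               # nobody detected in this frame
    hip, knee, ankle = kpts[0][12], kpts[0][14], kpts[0][16]
    v1, v2 = hip - knee, ankle - knee
    cosang = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
    print(f"right knee angle: {np.degrees(np.arccos(np.clip(cosang, -1, 1))):.1f} deg")
```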
