Search Results (89)

Search Parameters:
Keywords = panning and tilting

31 pages, 4219 KB  
Article
Airborne Intelligent System for Abnormal Pig Behavior Identification and Locking
by Yun Wang, Haopu Li, Zhihui Xiong, Yuanmeng Hu, Guangying Hu and Zhenyu Liu
Animals 2026, 16(10), 1506; https://doi.org/10.3390/ani16101506 - 14 May 2026
Viewed by 362
Abstract
Intensive pig farming presents substantial challenges for individual health monitoring due to high stocking densities, complex occlusion scenarios, and the need for continuous real-time surveillance. Existing monitoring approaches rely heavily on manual inspection, which is labor-intensive and prone to delayed detection of abnormal behaviors and disease symptoms. This study proposes an embedded intelligent monitoring system integrating a pan-tilt gimbal platform with an improved multi-object tracking and anomaly detection framework for automated pig health surveillance. The system employs a modified Periodfill_DeepSORT algorithm that incorporates a ReID network with appearance features and motion prediction trajectories to maintain identity consistency under occlusion and re-entry scenarios. For anomaly detection, a lightweight YOLOv8-based network was trained on 772 abnormal samples across three behavioral categories: movement abnormalities, postural abnormalities, and disease-related abnormalities. Experimental results demonstrate that the Periodfill_DeepSORT algorithm achieves a Multiple Object Tracking Accuracy (MOTA) of 95.34%, a Multiple Object Tracking Precision (MOTP) of 94.77%, and an IDF1 score of 96.88%, with only 12 identity switches across 2000 frames involving 12 targets, 27 fewer than the standard DeepSORT algorithm. In occlusion scenarios, MOTA improved from 61.1% to 78.3%. The anomaly detection network achieves an overall detection accuracy of 94.5%, representing an 8.8 percentage point improvement over the baseline model, with recognition accuracies of 96.2% for movement abnormalities, 94.1% for postural abnormalities, and 92.8% for disease-related abnormalities. The system operates at 90 frames per second on embedded hardware with a power consumption of 3.2 watts and a startup time of approximately 1 s, with gimbal angle errors maintained within 3°. These results demonstrate the system’s effectiveness and practical feasibility for real-time intelligent health monitoring in intensive livestock farming environments. Full article
(This article belongs to the Section Pigs)

19 pages, 12417 KB  
Article
Interleaved Sparse–Dense Scanning for Low-Latency Obstacle Detection and 3D Mapping on an Embedded Robotic Platform
by Syed Khubaib Ali, Ali A. Al-Temeemy and Pan Cao
Sensors 2026, 26(9), 2732; https://doi.org/10.3390/s26092732 - 28 Apr 2026
Viewed by 804
Abstract
LiDAR is widely used in robotics because it provides reliable range data for navigation and mapping. On a small embedded robot, however, there is a practical conflict between scan resolution and reaction speed. Dense scans provide better environmental detail, but they take too long for fast obstacle avoidance, whereas sparse scans are faster but can miss obstacles if the spacing between adjacent rays is too large. This paper presents an Interleaved Sparse–Dense Scanning method for a servo-actuated single-point time-of-flight LiDAR mounted on an embedded mobile robot. A dense nested pan–tilt sweep is used for three-dimensional mapping, while a sparse forward scan is inserted between dense rows for obstacle detection and motion control. A geometric model is derived to relate sensing range, beam spacing, and minimum detectable object width. That model is then linked to zone-based safety constraints and to the distance the robot can travel before the next obstacle update. For the robot used in this study, the resulting sparse configuration is a 7-point forward scan over a 180° field of view. Experiments in a real indoor environment showed that this configuration reliably detected target blocking obstacles and reduced decision latency by 6.2 times compared with waiting for a complete dense scan before each navigation update. The proposed method provides a practical balance between reactive obstacle avoidance and useful 3D mapping on a low-cost embedded platform, while making the system’s timing and safety limits explicit. Full article
(This article belongs to the Collection 3D Imaging and Sensing System)
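The relationship between range, beam spacing, and detectable object width described in the abstract above can be illustrated with a simple chord model. This is a minimal sketch under stated assumptions, not the paper's derivation: it assumes the 7 rays are evenly spaced and include both edges of the 180° field of view, and it treats the straight-line gap between adjacent ray endpoints as the width an obstacle must exceed to be guaranteed a hit. The function names are hypothetical.

```python
import math


def angular_spacing_deg(num_points: int, fov_deg: float) -> float:
    """Angle between adjacent rays, assuming rays sit on both edges of the field of view."""
    if num_points < 2:
        raise ValueError("need at least two rays")
    return fov_deg / (num_points - 1)


def ray_gap_m(range_m: float, spacing_deg: float) -> float:
    """Chord length between the endpoints of two adjacent rays at the given range."""
    return 2.0 * range_m * math.sin(math.radians(spacing_deg) / 2.0)


# 7 rays over 180 degrees gives 30 degrees between neighbours (assumed layout).
spacing = angular_spacing_deg(7, 180.0)
for r in (0.25, 0.5, 1.0):
    # Objects narrower than this gap can fall between two rays at range r.
    print(f"range {r:.2f} m -> gap {ray_gap_m(r, spacing):.3f} m")
```

Under this model the gap grows linearly with range, which is why a sparse scan is only safe inside a bounded detection zone.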

28 pages, 5422 KB  
Article
Vision-Guided Dual-Loop Control of a Truck-Mounted Electric Water Cannon for Autonomous Fire Suppression
by Zhiyuan Chen and Chaofeng Liu
Appl. Sci. 2026, 16(7), 3469; https://doi.org/10.3390/app16073469 - 2 Apr 2026
Viewed by 440
Abstract
Fire trucks equipped with truck-mounted electric water cannons are key mobile firefighting assets for urban and industrial fire response. However, due to the inherent mechanical inertia of the cannon body, its low-frequency motion response cannot match high-frequency control commands, making the system prone to oscillations and control instability. To address this command–execution frequency mismatch, this paper proposes a decoupled dual closed-loop control architecture for truck-mounted electric water cannons on mobile fire trucks: the fast loop is used for fire-source tracking and rapid localization, while the slow loop is used for water-jet aiming alignment. In the fast loop, a 2-D quadrant positioning rule drives the pan–tilt unit to achieve rapid fire tracking and accurate centering. In the slow loop, Kalman-filter-based state estimation and delay-aligned prediction generate feedforward aiming commands; these commands are fused with error feedback and further processed through command limiting and trajectory optimization, ultimately producing smooth and executable angle references. The visual perception module ran at 58 FPS, satisfying the real-time requirement of the proposed system. In five repeated extinguishment tests under controlled open-site conditions, the proposed method successfully completed all trials and reduced the mean extinguishment time to 13.55 s, compared with 15.83 s for the incremental-PID baseline and 23.76 s for the coupled proportional baseline, while also showing smoother correction and less redundant oscillation. Full article
(This article belongs to the Section Mechanical Engineering)

25 pages, 3673 KB  
Systematic Review
Recent Advances in Multi-Camera Computer Vision for Industry 4.0 and Smart Cities: A Systematic Review
by Carlos Julio Fierro-Silva, Carolina Del-Valle-Soto, Samih M. Mostafa and José Varela-Aldás
Algorithms 2026, 19(4), 249; https://doi.org/10.3390/a19040249 - 25 Mar 2026
Viewed by 1482
Abstract
The rapid deployment of surveillance cameras in urban, industrial, and domestic environments has intensified the need for intelligent systems capable of analyzing video streams beyond the limitations of single-camera setups. Unlike traditional single-camera approaches, multi-camera systems expand spatial coverage, reduce blind spots, and enable consistent tracking of people and objects across non-overlapping views, thereby improving robustness against occlusions and viewpoint changes. This article presents a comprehensive review of multi-camera vision systems published between 2020 and 2025, covering application domains including public security and biometrics, intelligent transportation, smart cities and IoT, healthcare monitoring, precision agriculture, industry and robotics, pan–tilt–zoom (PTZ) camera networks, and emerging areas such as retail and forensic analysis. The review synthesizes predominant technical approaches, including deep-learning-based detection, multi-target multi-camera tracking (MTMCT), re-identification (Re-ID), spatiotemporal fusion, and edge computing architectures. Persistent challenges are identified, particularly in inter-camera data association, scalability, computational efficiency, privacy preservation, and dataset availability. Emerging trends such as distributed edge AI, cooperative camera networks, and active perception are discussed to outline future research directions toward scalable, privacy-aware, and intelligent multi-camera infrastructures. Full article

30 pages, 22493 KB  
Article
H-CoRE: A Cooperative Framework for Heterogeneous Multi-Robot Exploration and Inspection
by Simone D’Angelo, Francesca Pagano, Riccardo Caccavale, Vincenzo Scognamiglio, Alessandro De Crescenzo, Pasquale Merone, Stefano Ciaravino, Alberto Finzi and Vincenzo Lippiello
Drones 2026, 10(4), 232; https://doi.org/10.3390/drones10040232 - 25 Mar 2026
Viewed by 1268
Abstract
This paper presents the H-CoRE (Heterogeneous Cooperative Multi-Robot Execution) framework designed to enable autonomous multi-robot operations in GNSS-denied environments. Built on an ROS 2-based architecture, H-CoRE enables collaborative, structured task execution through standardized software stacks. Each robot’s stack combines a high-level executive system with an agent-specific motion layer and leverages multi-sensor fusion for localization and mapping. The framework is inherently reconfigurable, allowing individual agents to operate autonomously or as part of a multi-robot team for collaborative missions. In the considered scenario, the system integrates aerial and ground vehicles, a fixed pan–tilt–zoom camera, and a human supervisory interface within a unified, modular infrastructure. The proposed system has been deployed in indoor, GNSS-denied environments, demonstrating autonomous navigation, cooperative area coverage, and real-time information sharing across multiple agents. Experimental results confirm the effectiveness of H-CoRE in maintaining general awareness and mission continuity, paving the way for future applications in search-and-rescue, inspection, and exploration tasks. Full article

14 pages, 18688 KB  
Article
Outdoor Motion Capture at Scale
by Michael Zwölfer, Martin Mössner, Helge Rhodin and Werner Nachbauer
Sensors 2026, 26(6), 1951; https://doi.org/10.3390/s26061951 - 20 Mar 2026
Viewed by 597
Abstract
Capturing kinematic data in outdoor sports is challenging, as motions span large capture volumes and occur under difficult environmental conditions. Video-based approaches, particularly with pan–tilt–zoom cameras, offer a practical solution, but the extensive manual post-processing required limits their use to short sequences and few athletes. This study presents a motion capture pipeline that automates the detection of both reference points and sport-specific keypoints to overcome this limitation. The field test employed eight cameras covering a 250×80×30 m capture volume with nearly 300 reference points. Ten state-certified ski instructors performed eight standardized maneuvers. Reference points were localized through a hybrid approach combining YOLO object detection and ArUco marker identification. AlphaPose was fine-tuned on a new manually annotated dataset to detect skier-specific keypoints (e.g., skis, poles) alongside anatomical landmarks. Continuous frame-wise calibration and 3D reconstruction were performed using Direct Linear Transformation. Evaluation compared automated detections with manual annotations. Automated reference point detection achieved a mean localization error of 4.1 pixels (0.1% of 4K width) and reduced 3D segment-length variation by 23%. The skier-specific keypoint model reached 98% PCK, mAP of 0.97, and an MPJPE of 10.3 pixels while lowering 3D segment-length variation by 0.5 cm compared to manual digitization and 0.6 cm relative to a pretrained model. Replacing manual digitization with automated detection improves accuracy and facilitates kinematic data collection in large outdoor fields with many athletes and trials. The approach also enables the creation of sport-specific datasets valuable for biomechanical research and training next-generation 3D pose estimation models. Full article
(This article belongs to the Special Issue Advanced Sensors in Biomechanics and Rehabilitation—2nd Edition)

19 pages, 37608 KB  
Article
ZoomPatch: An Adaptive PTZ Scheduling Framework for Small Object Video Analytics
by Shutong Chen, Binhua Liang and Yan Chen
Appl. Sci. 2026, 16(6), 2934; https://doi.org/10.3390/app16062934 - 18 Mar 2026
Viewed by 350
Abstract
Accurate detection of small objects in video analytics is limited by low pixel resolution and insufficient visual cues. While software-based enhancements often fail to recover missing details, Pan–Tilt–Zoom (PTZ) cameras can physically increase spatial resolution through optical zoom. However, mechanical latency and configuration complexity hinder their real-time applicability. We propose ZoomPatch, a real-time video analytics framework tailored for small object detection. ZoomPatch actively schedules PTZ adjustments to capture optically enhanced subframes of regions of interest (ROIs) and fuses inference results back to the global reference frame. Specifically, it introduces a dynamic Cycle Length Proposer to adapt analysis cycles based on scene motion, and a Mixed Integer Linear Programming (MILP)-based Configuration Decider to determine the optimal sequence of pan, tilt, and zoom adjustments under time budget constraints. Simulation-based experimental evaluations across diverse workloads demonstrate that ZoomPatch significantly outperforms fixed-perspective, super-resolution (SR), and greedy baselines. Notably, in the detection task using YOLOv10, ZoomPatch improves the F1-score from 0.33 to 0.47 (a 42% increase) compared to the fixed-perspective baseline. Furthermore, ZoomPatch yields performance gains of 30% and 7% over the SR baseline (0.36) and the greedy baseline (0.44). Full article
(This article belongs to the Section Computing and Artificial Intelligence)
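The relative gains quoted in the abstract above can be rechecked from the reported F1 scores. The sketch below is only an arithmetic check, assuming the gains are relative improvements over each baseline; because the published scores are rounded to two decimals, the recomputed percentages agree with the quoted 42%, 30%, and 7% only approximately. The helper name is hypothetical.

```python
def relative_gain_pct(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


zoompatch_f1 = 0.47
baselines = {"fixed-perspective": 0.33, "super-resolution": 0.36, "greedy": 0.44}

for name, f1 in baselines.items():
    # Printed values are recomputed from rounded scores, so expect small mismatches.
    print(f"{name}: {relative_gain_pct(zoompatch_f1, f1):.1f}% relative F1 gain")
```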

17 pages, 2743 KB  
Article
Research on Motion Trajectory Correction Method for Wall-Climbing Robots Based on External Visual Localization System
by Haolei Ru, Meiping Sheng, Fei Gao, Zhanghao Li, Jiahui Qi, Lei Cheng, Kuo Su, Jiahao Zhang and Jiangjian Xiao
Sensors 2026, 26(3), 773; https://doi.org/10.3390/s26030773 - 23 Jan 2026
Viewed by 477
Abstract
To reduce manual operation and enhance the intelligence of the high-altitude maintenance wall-climbing robot during its operation, path planning and autonomous navigation need to be implemented. Due to non-uniform magnetic adhesion between the wall-climbing robot and the steel plate, often caused by variations in steel thickness or surface pitting, the wall-climbing robot may experience motion deviations and deviate from its planned trajectory. In order to obtain the actual deviation from the expected trajectory, it is necessary to accurately locate the wall-climbing robot. This allows for the generation of precise control signals, enabling trajectory correction and ensuring high-precision autonomous navigation. Therefore, this paper proposes an external visual localization system based on a pan–tilt laser tracker unit. The system utilizes a zoom camera to track an AprilTag marker and drives the pan–tilt platform, while a laser rangefinder provides high-accuracy distance measurement. The robot’s three-dimensional (3D) pose is ultimately calculated by fusing the visual and ranging data. However, due to the limited tracking speed of the pan–tilt mechanism relative to the robot’s movement, we introduce an Extended Kalman Filter (EKF) to robustly predict the robot’s true spatial coordinates. The robot’s three-dimensional coordinates are periodically compared with the predefined route coordinates to calculate the deviation. This comparison generates closed-loop control signals for the robot’s movement direction and speed. Finally, based on the LoRa communication protocol, closed-loop control of the robot’s movement direction and speed is achieved through the upper-level computer, ensuring that the robot returns to the predefined track. Extensive comparative experiments demonstrate that the localization system achieves stable localization with an accuracy better than 0.025 m on a 6 m × 2.5 m steel structure surface. Based on this high-precision positioning and motion correction, the robot’s motion deviation is kept within 0.1 m, providing a reliable pose reference for precise motion control and high-reliability operation in complex structural environments. Full article

33 pages, 11440 KB  
Article
A Vision-Assisted Acoustic Channel Modeling Framework for Smartphone Indoor Localization
by Can Xue, Huixin Zhuge and Zhi Wang
Sensors 2026, 26(2), 717; https://doi.org/10.3390/s26020717 - 21 Jan 2026
Viewed by 491
Abstract
Conventional acoustic time-of-arrival (TOA) estimation in complex indoor environments is highly susceptible to multipath reflections and occlusions, resulting in unstable measurements and limited physical interpretability. This paper presents a smartphone-based indoor localization method built on vision-assisted acoustic channel modeling, and develops a fusion anchor integrating a pan–tilt–zoom (PTZ) camera and a near-ultrasonic signal transmitter to explicitly perceive indoor geometry, surface materials, and occlusion patterns. First, vision-derived priors are constructed on the anchor side based on line-of-sight reachability, orientation consistency, and directional risk, and are converted into soft anchor weights to suppress the impact of occlusion and pointing mismatch. Second, planar geometry and material cues reconstructed from camera images are used to generate probabilistic room impulse response (RIR) priors that cover the direct path and first-order reflections, where environmental uncertainty is mapped into path-dependent arrival-time variances and prior probabilities. Finally, under the RIR prior constraints, a path-wise posterior distribution is built from matched-filter outputs, and an adaptive fusion strategy is applied to switch between maximum a posteriori (MAP) and minimum mean square error (MMSE) estimators, yielding debiased TOA measurements with calibratable variances for downstream localization filters. Experiments in representative complex indoor scenarios demonstrate mean localization errors of 0.096 m and 0.115 m in static and dynamic tests, respectively, indicating improved accuracy and robustness over conventional TOA estimation. Full article

23 pages, 3778 KB  
Article
Deep Learning-Driven Design and Analysis of an Autonomous Robotic System for In-Pipe Inspection
by Ambigai Rajasekaran, Uma Mohan, Sethuramalingam Prabhu, Shaik Ayman Hameed Baig, Shaik Pasha, Srinivasan Sridhar, Utsav Jain, Arvind Sekhar, Aryan Dwivedi and Praneeth Kasiraju
Algorithms 2026, 19(1), 1; https://doi.org/10.3390/a19010001 - 19 Dec 2025
Viewed by 1613
Abstract
This paper presents an intelligent robotic system for in-pipe inspection that integrates a novel mechanical design, deep learning-based defect detection, and high-fidelity simulation for real-time validation. Unlike existing solutions, the proposed system combines a Mecanum wheel-based mobile platform with a modular arm and an advanced pan-tilt camera, enabling navigation and inspection of pipes ranging from 100 mm to 500 mm in diameter. A comprehensive dataset of 53,486 images, including 27,000 annotated defect instances across six critical classes, was used to train a YOLOv11-based detection framework. The model achieved high accuracy with a precision of 0.9, recall of 0.8, mAP@0.5 of 0.9, and mAP@0.5:0.95 of 0.6, outperforming previous YOLO versions, SSD, RCNN, and DinoV2 by 26% in mAP. Real-time testing on a Raspberry Pi Camera 3 Wide IR module validated robust detection under realistic conditions. This work contributes a mechanically adaptable robot, an optimized deep learning inspection framework, and an integrated simulation-to-deployment workflow, providing a scalable and autonomous solution for industrial pipeline inspection. Full article
(This article belongs to the Special Issue AI Applications and Modern Industry)

29 pages, 6001 KB  
Article
Vision-Based Geolocation of Moving Ground Targets Using Kalman Filtering with a Gimbal Camera on Board a UAV
by Jaemin Kim, Youngrun Kim, SuHyeon Kim, Hyeongjun Cho and Dongwon Jung
Aerospace 2025, 12(12), 1065; https://doi.org/10.3390/aerospace12121065 - 30 Nov 2025
Cited by 1 | Viewed by 1986
Abstract
Unmanned aerial vehicles (UAVs) are vital for surveillance missions requiring the geolocation of moving ground targets, yet small, resource-constrained platforms often lack integrated, robust systems that can handle disturbances such as wind, occlusions, and noise. This paper presents an integrated, end-to-end vision-based geolocation pipeline specifically designed for embedded deployment on resource-constrained UAVs with gimbal cameras. Starting from a rough initial position estimate, pan/tilt angles are computed to orient the gimbal, and then a visual tracking module combining object detection (via Tiny-YOLO) and feedback control (using CSRT) centers the target in the frame. The target’s absolute position is derived from UAV inertial data and gimbal angles. To mitigate noisy or unavailable direct geolocation due to disturbances or visual lock loss, Kalman filtering is integrated with a unicycle-based motion model. Both an extended Kalman filter (EKF) and unscented Kalman filter (UKF) are evaluated and tuned in high-fidelity simulations, with the UKF demonstrating superior performance by reducing the 2D position RMSE by 33% compared to the EKF in occlusion scenarios. The system is implemented on embedded hardware and validated through real flight tests, establishing the operational capability of vision-based surveillance on small UAV platforms. Full article
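The first step in the abstract above, computing pan/tilt angles from a rough target position, can be sketched with basic trigonometry. This is an illustrative simplification, not the authors' method: it assumes both positions are given in a local north-east-down (NED) frame, that pan is measured clockwise from north, that tilt is positive downward, and that the UAV is level (its attitude is ignored). The function name is hypothetical.

```python
import math


def pan_tilt_to_target(uav_ned, target_ned):
    """Return (pan_deg, tilt_deg) pointing from the UAV to the target.

    Both arguments are (north, east, down) tuples in metres in the same
    local frame. UAV attitude is ignored in this sketch (assumption).
    """
    d_north = target_ned[0] - uav_ned[0]
    d_east = target_ned[1] - uav_ned[1]
    d_down = target_ned[2] - uav_ned[2]

    horizontal = math.hypot(d_north, d_east)
    pan = math.degrees(math.atan2(d_east, d_north))       # clockwise from north
    tilt = math.degrees(math.atan2(d_down, horizontal))   # positive looks down
    return pan, tilt


# UAV at 100 m altitude (down = -100), target on the ground 100 m to the north.
pan_deg, tilt_deg = pan_tilt_to_target((0.0, 0.0, -100.0), (100.0, 0.0, 0.0))
print(f"pan {pan_deg:.1f} deg, tilt {tilt_deg:.1f} deg")
```

A real pipeline would additionally rotate this line-of-sight vector by the UAV's roll, pitch, and yaw before commanding the gimbal.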

22 pages, 2422 KB  
Article
Data-Driven Forward Kinematics for Robotic Spatial Augmented Reality: A Deep Learning Framework Using LSTM and Attention
by Sooyoung Jang, Hanul Yum and Ahyun Lee
Actuators 2025, 14(12), 569; https://doi.org/10.3390/act14120569 - 25 Nov 2025
Cited by 1 | Viewed by 737
Abstract
Robotic Spatial Augmented Reality (RSAR) systems present a unique control challenge as their end-effector is a projection, whose final position depends on both the actuator’s pose and the external environment’s geometry. Accurately controlling this projection first requires predicting the 6-DOF pose of a projector-camera unit from joint angles; however, loose kinematic specifications in many RSAR setups make precise analytical models unavailable for this task. This study proposes a novel deep learning model combining Long Short-Term Memory (LSTM) and an Attention Mechanism (LSTM–Attention) to accurately estimate the forward kinematics of a 2-axis Pan-Tilt actuator. To ensure a fair evaluation of intrinsic model performance, a simulation framework using Unity and the Unified Robot Description Format (URDF) was developed to generate a noise-free benchmark dataset. The proposed model utilizes a multi-task learning architecture with a geodesic distance loss function to optimize 3-dimensional position and 4-dimensional quaternion rotation separately. Quantitative results show that the proposed LSTM–Attention model achieved the lowest errors (Position MAE: 18.00 mm; Rotation MAE: 3.723 deg), consistently outperforming baseline models like Random Forest by 9.5% and 17.6%, respectively. Qualitative analysis further confirmed its superior stability and outlier suppression. The proposed LSTM–Attention architecture proves to be an effective and accurate methodology for modeling the complex non-linear kinematics of RSAR systems. Full article
(This article belongs to the Special Issue Advanced Learning and Intelligent Control Algorithms for Robots)

9 pages, 1449 KB  
Proceeding Paper
Modeling and Control of a Pan–Tilt Servo System for Face Tracking Using Deep Learning and PID
by Mihnea Dimitrie Doloiu, Ioan-Alexandru Spulber, Ilie Indreica, Gigel Măceșanu, Bogdan Sibisan and Tiberiu-Teodor Cociaș
Eng. Proc. 2025, 113(1), 75; https://doi.org/10.3390/engproc2025113075 - 19 Nov 2025
Viewed by 2077
Abstract
This paper presents a comprehensive modeling and control strategy for a pan–tilt (PT) servo system designed for real-time object tracking (specifically face detection) using deep learning and PID control. The system integrates a YOLO-based neural network to detect and localize the target within an image, mapping its coordinates from 3D space onto the 2D image plane through a mathematically defined geometric camera model. A complete mathematical representation of the pan–tilt mechanism is developed, accounting for all relevant forces and system components. Based on this model, a PID controller is designed, and its parameters are identified and implemented using the Ziegler–Nichols tuning method. Experimental results demonstrate that the system effectively tracks objects in real time, exhibiting minimal latency and precise motor responses. These findings suggest that the proposed approach is well-suited for practical applications, including security surveillance, assistive technologies, and interactive robotics. Full article
(This article belongs to the Proceedings of The Sustainable Mobility and Transportation Symposium 2025)
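The Ziegler–Nichols tuning mentioned in the abstract above can be illustrated with the classic closed-loop rules for a PID controller (Kp = 0.6·Ku, Ti = Tu/2, Td = Tu/8). The sketch below is a generic textbook form, not the paper's implementation: the ultimate gain `ku` and oscillation period `tu` are made-up values, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


def ziegler_nichols_pid(ku: float, tu: float):
    """Classic Ziegler-Nichols PID gains from ultimate gain ku and period tu (seconds)."""
    kp = 0.6 * ku
    ki = kp / (tu / 2.0)   # Kp / Ti, with Ti = Tu / 2
    kd = kp * (tu / 8.0)   # Kp * Td, with Td = Tu / 8
    return kp, ki, kd


@dataclass
class PID:
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, error: float, dt: float) -> float:
        """Return the control output for the current error (e.g. pixel offset of the face)."""
        self.integral += error * dt
        # No derivative term on the first sample, since there is no previous error yet.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical identification result for one servo axis.
kp, ki, kd = ziegler_nichols_pid(ku=2.0, tu=0.5)
pan_pid = PID(kp, ki, kd)
command = pan_pid.update(error=10.0, dt=0.1)  # target is 10 units off-centre
print(kp, ki, kd, command)
```

In a pan–tilt tracker one such controller would typically run per axis, with the error taken as the offset between the detected face centre and the image centre.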

19 pages, 2549 KB  
Article
Optimal Aerial Imaging Parameters for UAV-Based Inspection and Maintenance of Photovoltaic Installations
by Eleftherios G. Vourkos, Eftychios G. Christoforou, Andreas S. Panayides, Soteris A. Kalogirou and Rafaela A. Agathokleous
Energies 2025, 18(21), 5818; https://doi.org/10.3390/en18215818 - 4 Nov 2025
Viewed by 1354
Abstract
Unmanned Aerial Vehicles (UAVs) equipped with thermal and RGB cameras and enhanced by deep learning offer a powerful solution for autonomous photovoltaic (PV) system inspection. However, defect detection performance depends on flight parameters such as altitude, camera angles, speed, and solar position. This study examines the impact of various UAV flight parameters on the accurate detection of critical PV defects including hotspots, dirt from bird droppings, dust accumulation, and cell failures. For this purpose, two datasets were developed, comprising over 38,000 thermal infrared and RGB images. Using the YOLOv11 model, 21 flight configurations varying in altitude, camera tilt and pan angles, speed, and solar position were evaluated at four different times of day to assess the combined ambient and geometric effects on detection accuracy. Results indicate that low-altitude flights enhance small-object detection, while higher altitudes improve coverage at the expense of fine-detail accuracy. Dust detection is most effective when the camera aligns with the sun, whereas steep midday tilts cause reflective false positives. Thermal defect detection performs best during morning flights with moderate tilt angles. These findings emphasize the need to balance accuracy, coverage, efficiency, and safety, offering practical guidelines for effective and scalable PV inspection and maintenance. Full article

17 pages, 1217 KB  
Article
An Internet of Things Approach to Vision-Based Livestock Monitoring: PTZ Cameras for Dairy Cow Identification
by Niken Prasasti Martono, Ryota Tsukamoto and Hayato Ohwada
Telecom 2025, 6(4), 82; https://doi.org/10.3390/telecom6040082 - 3 Nov 2025
Cited by 1 | Viewed by 2268
Abstract
The Internet of Things (IoT) offers promising solutions for smart agriculture, particularly in the monitoring of livestock. This paper proposes a contactless, low-cost system for individual cow identification and monitoring in a dairy barn using a single Pan–Tilt–Zoom (PTZ) camera and a YOLOv8 deep learning model. The PTZ camera periodically scans the barn, capturing images that are processed to detect and recognize a specific target cow among the herd without any wearable sensors. The system embeds barn area metadata in each image, allowing it to estimate the cow’s location and compute the frequency of its presence in predefined zones. We fine-tuned a YOLOv8 object detection model to distinguish the target cow, achieving high precision in identification. Experimental results in a real barn environment demonstrate that the system can identify an individual cow with 85.96% Precision and 68.06% Recall, and the derived spatial occupancy patterns closely match ground truth observations. Compared to conventional methods requiring multiple fixed cameras or RFID-based wearables, the proposed approach significantly reduces equipment costs and animal handling stress. It should be noted that the present work serves as a proof-of-concept for targeted cow tracking that identifies and follows a specific individual within a herd rather than a fully generalized multi-cow identification system. Full article