Search Results (512)

Search Parameters:
Keywords = video filtering

19 pages, 3130 KiB  
Article
Deep Learning-Based Instance Segmentation of Galloping High-Speed Railway Overhead Contact System Conductors in Video Images
by Xiaotong Yao, Huayu Yuan, Shanpeng Zhao, Wei Tian, Dongzhao Han, Xiaoping Li, Feng Wang and Sihua Wang
Sensors 2025, 25(15), 4714; https://doi.org/10.3390/s25154714 - 30 Jul 2025
Viewed by 183
Abstract
The conductors of high-speed railway OCSs (Overhead Contact Systems) are susceptible to galloping under natural forces such as strong wind, rain, and snow, which causes conductor fatigue damage and significantly compromises train operational safety. Monitoring the galloping status of conductors is therefore crucial, and instance segmentation, by delineating the pixel-level contour of each conductor, can significantly aid the identification and study of galloping phenomena. This work builds on the YOLO11-seg model and introduces an instance segmentation approach for galloping video and image sensor data of OCS conductors. Targeting the stripe-like distribution of OCS conductors in the data, the algorithm employs four-direction Sobel filters to extract edge features in the horizontal, vertical, and two diagonal orientations; these features are fused with the original convolutional branch to form the FDSE (Four-Direction Sobel Enhancement) module. The algorithm also integrates the ECA (Efficient Channel Attention) mechanism for adaptive enhancement of conductor features and uses the FL (Focal Loss) function to mitigate the class imbalance between positive and negative samples, improving the model's sensitivity to conductors. Segmentation results from neighboring frames are then compared via mask-difference analysis to automatically detect conductor galloping locations and highlight their contours, clearly depicting galloping characteristics. Experimental results demonstrate that the enhanced YOLO11-seg model achieves 85.38% precision, 77.30% recall, 84.25% AP@0.5, an 81.14% F1-score, and a real-time processing speed of 44.78 FPS. Combined with the galloping visualization module, it can issue real-time alerts of conductor galloping anomalies, providing robust technical support for railway OCS safety monitoring.
(This article belongs to the Section Industrial Sensors)
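The FDSE module's four-direction edge extraction can be pictured with plain Sobel kernels. Below is a minimal numpy/scipy sketch with assumed 3×3 kernels for the horizontal, vertical, and two diagonal directions; the paper's exact filters and their fusion with the convolutional branch are not reproduced.

```python
import numpy as np
from scipy.ndimage import convolve

# Assumed 3x3 Sobel kernels: horizontal, vertical, and the two diagonals.
SOBEL = {
    "horizontal": np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], float),
    "vertical":   np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float),
    "diag_45":    np.array([[0, 1, 2], [-1, 0, 1], [-2, -1, 0]], float),
    "diag_135":   np.array([[-2, -1, 0], [-1, 0, 1], [0, 1, 2]], float),
}

def four_direction_edges(gray: np.ndarray) -> np.ndarray:
    """Stack edge responses from the four directions as feature channels."""
    responses = [convolve(gray, k, mode="nearest") for k in SOBEL.values()]
    return np.stack(responses, axis=0)  # shape: (4, H, W)

edges = four_direction_edges(np.random.rand(64, 64))
print(edges.shape)  # (4, 64, 64)
```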

12 pages, 557 KiB  
Article
Advancing Diagnostics with Semi-Automatic Tear Meniscus Central Area Measurement for Aqueous Deficient Dry Eye Discrimination
by Hugo Pena-Verdeal, Jacobo Garcia-Queiruga, Belen Sabucedo-Villamarin, Carlos Garcia-Resua, Maria J. Giraldez and Eva Yebra-Pimentel
Medicina 2025, 61(8), 1322; https://doi.org/10.3390/medicina61081322 - 22 Jul 2025
Viewed by 193
Abstract
Background and Objectives: To clinically validate a semi-automatic measurement of the Tear Meniscus Central Area (TMCA) for differentiating between Non-Aqueous Deficient Dry Eye (Non-ADDE) and Aqueous Deficient Dry Eye (ADDE) patients. Materials and Methods: A total of 120 volunteer participants were included in the study. Following TFOS DEWS II diagnostic criteria, a battery of tests was conducted for dry eye diagnosis: the Ocular Surface Disease Index questionnaire, tear film osmolarity, tear film break-up time, and corneal staining. Additionally, lower tear meniscus videos were captured with Tearscope illumination and, separately, with fluorescein using slit-lamp blue light and a yellow filter. Tear meniscus height was measured from the Tearscope videos to differentiate Non-ADDE from ADDE participants, while the TMCA was obtained from the fluorescein videos. Both parameters were analyzed using the open-source software NIH ImageJ. Results: Receiver Operating Characteristic analysis showed that the semi-automatic TMCA evaluation had significant diagnostic capability to differentiate between Non-ADDE and ADDE participants, with an optimal cut-off value of 54.62 mm² (Area Under the Curve = 0.714 ± 0.051, p < 0.001; specificity: 71.7%; sensitivity: 68.9%). Conclusions: The semi-automatic TMCA evaluation showed promising preliminary results as a diagnostic tool for distinguishing between ADDE and Non-ADDE individuals.
(This article belongs to the Special Issue Advances in Diagnosis and Therapies of Ocular Diseases)
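The cut-off reported above comes from ROC analysis. As a hedged illustration of how such a threshold can be derived, the sketch below applies Youden's J statistic to synthetic TMCA values; the data and group parameters are invented, not the study's measurements.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

rng = np.random.default_rng(0)
tmca = np.concatenate([rng.normal(60, 12, 70),   # hypothetical Non-ADDE areas
                       rng.normal(48, 10, 50)])  # hypothetical ADDE areas
is_adde = np.concatenate([np.zeros(70), np.ones(50)])

# ADDE is flagged by *smaller* menisci, so score with the negated area.
fpr, tpr, thresholds = roc_curve(is_adde, -tmca)
best = np.argmax(tpr - fpr)   # Youden's J = sensitivity + specificity - 1
print("AUC:", auc(fpr, tpr))
print("cut-off (mm^2):", -thresholds[best])
print("sensitivity:", tpr[best], "specificity:", 1 - fpr[best])
```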

31 pages, 4668 KiB  
Article
BLE Signal Processing and Machine Learning for Indoor Behavior Classification
by Yi-Shiun Lee, Yong-Yi Fanjiang, Chi-Huang Hung and Yung-Shiang Huang
Sensors 2025, 25(14), 4496; https://doi.org/10.3390/s25144496 - 19 Jul 2025
Viewed by 309
Abstract
Smart home technology enhances quality of life, particularly for in-home care and health monitoring. While video-based methods provide accurate behavior analysis, privacy concerns drive interest in non-visual alternatives. This study proposes a Bluetooth Low Energy (BLE)-enabled indoor positioning and behavior recognition system that integrates machine learning techniques to support sustainable, privacy-preserving health monitoring. Key optimizations include (1) a vertically mounted Data Collection Unit (DCU) for improved height positioning, (2) synchronized data collection to reduce discrepancies, (3) Kalman filtering to smooth RSSI signals, and (4) AI-based RSSI analysis for enhanced behavior recognition. Experiments in a real home environment used a smart wristband to assess BLE signal variations across different activities (standing, sitting, lying down). The results show that the proposed system reliably tracks user locations and identifies behavior patterns. This research supports elderly care, remote health monitoring, and non-invasive behavior analysis, providing a privacy-preserving solution for smart healthcare applications.
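Optimization (3), RSSI smoothing with a Kalman filter, reduces in one dimension to a few lines. The sketch below is a generic scalar filter with assumed noise parameters, not the paper's tuned configuration.

```python
# Minimal 1-D Kalman filter for smoothing a BLE RSSI stream (in dBm).
def kalman_smooth(rssi_stream, q=0.05, r=4.0):
    """q: assumed process noise; r: assumed measurement noise."""
    x, p = rssi_stream[0], 1.0           # initial state and covariance
    out = [x]
    for z in rssi_stream[1:]:
        p = p + q                        # predict (static-state model)
        k = p / (p + r)                  # Kalman gain
        x = x + k * (z - x)              # update with the new RSSI sample
        p = (1 - k) * p
        out.append(x)
    return out

print(kalman_smooth([-70, -74, -68, -90, -71, -69]))  # the -90 outlier is damped
```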

27 pages, 6541 KiB  
Article
Multi-Object-Based Efficient Traffic Signal Optimization Framework via Traffic Flow Analysis and Intensity Estimation Using UCB-MRL-CSFL
by Zainab Saadoon Naser, Hend Marouane and Ahmed Fakhfakh
Vehicles 2025, 7(3), 72; https://doi.org/10.3390/vehicles7030072 - 11 Jul 2025
Viewed by 415
Abstract
Traffic congestion has increased significantly in today's rapidly urbanizing world, affecting people's daily lives. Traffic signal control systems (TSCSs) play an important role in alleviating congestion by optimizing traffic light timings and improving road efficiency, yet traditional TSCSs neglect pedestrians, cyclists, and other non-monitored road users, degrading traffic signal optimization (TSO). This framework therefore proposes a multi-object-based traffic flow analysis and intensity estimation model for efficient TSO using Upper Confidence Bound Multi-agent Reinforcement Learning Cubic Spline Fuzzy Logic (UCB-MRL-CSFL). Real-time traffic videos first undergo frame conversion and redundant-frame removal, followed by preprocessing. Lanes are then detected, and objects are detected using Temporal Context You Only Look Once (TC-YOLO). Object counting in each lane is carried out using the Cumulative Vehicle Motion Kalman Filter (CVMKF), followed by queue detection using Vehicle Density Mapping (VDM). Traffic flow is then analyzed by Feature Variant Optical Flow (FVOF), followed by traffic intensity estimation, and emergency vehicles are identified by their siren light colors. Lastly, UCB-MRL-CSFL optimizes the traffic signals based on the identified emergency vehicles, pedestrian information, and traffic intensity. The proposed framework thus outperforms conventional TSO methodologies by accounting for pedestrians, cyclists, and other road users, with higher computational efficiency (94.45%).
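The internals of UCB-MRL-CSFL are not spelled out in the abstract. Purely as an illustration of its Upper Confidence Bound ingredient, the sketch below applies the textbook UCB1 rule to choosing among hypothetical signal phases, with negative queue length standing in for the reward; none of this is the authors' formulation.

```python
import math, random

def ucb1_select(counts, values, t):
    """Pick the phase maximizing mean reward + sqrt(2 ln t / n); try untried phases first."""
    for i, n in enumerate(counts):
        if n == 0:
            return i
    return max(range(len(counts)),
               key=lambda i: values[i] + math.sqrt(2 * math.log(t) / counts[i]))

phases = 4                                   # hypothetical signal phases
counts, values = [0] * phases, [0.0] * phases
for t in range(1, 201):
    a = ucb1_select(counts, values, t)
    reward = -random.uniform(0, 10)          # stand-in for negative queue length
    counts[a] += 1
    values[a] += (reward - values[a]) / counts[a]   # running mean per phase
print(counts)
```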

17 pages, 7292 KiB  
Article
QP-Adaptive Dual-Path Residual Integrated Frequency Transformer for Data-Driven In-Loop Filter in VVC
by Cheng-Hsuan Yeh, Chi-Ting Ni, Kuan-Yu Huang, Zheng-Wei Wu, Cheng-Pin Peng and Pei-Yin Chen
Sensors 2025, 25(13), 4234; https://doi.org/10.3390/s25134234 - 7 Jul 2025
Viewed by 369
Abstract
As AI-enabled embedded systems such as smart TVs and edge devices demand efficient video processing, Versatile Video Coding (VVC/H.266) becomes essential for bandwidth-constrained Multimedia Internet of Things (M-IoT) applications. However, its block-based coding often introduces compression artifacts. While CNN-based methods effectively reduce these artifacts, maintaining robust performance across varying quantization parameters (QPs) remains challenging, and recent QP-adaptive designs such as QA-Filter show promise but are still limited. This paper proposes DRIFT, a QP-adaptive in-loop filtering network for VVC. DRIFT combines a lightweight frequency fusion CNN (LFFCNN) for local enhancement with a Swin Transformer-based global skip connection for capturing long-range dependencies. LFFCNN leverages octave convolution and introduces a novel residual block (FFRB) that integrates multiscale extraction, QP adaptivity, frequency fusion, and spatial-channel attention. A QP estimator (QPE) is further introduced to mitigate double enhancement in inter-coded frames. Experimental results demonstrate that DRIFT achieves BD-rate reductions of 6.56% (intra) and 4.83% (inter), with a gain of up to 10.90% on the BasketballDrill sequence. Additionally, LFFCNN reduces the model size by 32% while slightly improving coding performance over QA-Filter.
(This article belongs to the Special Issue Multimodal Sensing Technologies for IoT and AI-Enabled Systems)
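LFFCNN's octave convolution splits feature maps into high- and low-frequency paths, with the low path running at half resolution. The PyTorch sketch below is a minimal generic octave convolution (alpha is the assumed low-frequency channel ratio), not the paper's FFRB block.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OctaveConv(nn.Module):
    """High/low-frequency channel split; low path at half spatial resolution."""
    def __init__(self, cin, cout, alpha=0.5, k=3):
        super().__init__()
        lin, lout = int(alpha * cin), int(alpha * cout)
        hin, hout = cin - lin, cout - lout
        p = k // 2
        self.hh = nn.Conv2d(hin, hout, k, padding=p)  # high -> high
        self.hl = nn.Conv2d(hin, lout, k, padding=p)  # high -> low
        self.lh = nn.Conv2d(lin, hout, k, padding=p)  # low  -> high
        self.ll = nn.Conv2d(lin, lout, k, padding=p)  # low  -> low

    def forward(self, xh, xl):
        yh = self.hh(xh) + F.interpolate(self.lh(xl), scale_factor=2, mode="nearest")
        yl = self.ll(xl) + self.hl(F.avg_pool2d(xh, 2))
        return yh, yl

conv = OctaveConv(64, 64)
xh, xl = torch.randn(1, 32, 64, 64), torch.randn(1, 32, 32, 32)
yh, yl = conv(xh, xl)
print(yh.shape, yl.shape)  # (1, 32, 64, 64) (1, 32, 32, 32)
```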

21 pages, 4859 KiB  
Article
Improvement of SAM2 Algorithm Based on Kalman Filtering for Long-Term Video Object Segmentation
by Jun Yin, Fei Wu, Hao Su, Peng Huang and Yuetong Qixuan
Sensors 2025, 25(13), 4199; https://doi.org/10.3390/s25134199 - 5 Jul 2025
Viewed by 526
Abstract
The Segment Anything Model 2 (SAM 2) has achieved state-of-the-art performance in pixel-level object segmentation for both static and dynamic visual content. Its streaming memory architecture maintains spatial context across video sequences, yet it struggles with long-term tracking due to its static inference framework: SAM 2's fixed temporal window indiscriminately retains historical frames, failing to account for frame quality or dynamic motion patterns. This leads to error propagation and tracking instability in challenging scenarios involving fast-moving objects, partial occlusions, or crowded environments. To overcome these limitations, this paper proposes SAM2Plus, a zero-shot enhancement framework that integrates Kalman filter prediction, dynamic quality thresholds, and adaptive memory management. The Kalman filter models object motion under physical constraints to predict trajectories and dynamically refine segmentation states, mitigating positional drift during occlusions or velocity changes. Dynamic thresholds, combined with multi-criteria evaluation metrics (e.g., motion coherence, appearance consistency), prioritize high-quality frames while adaptively balancing confidence scores and temporal smoothness, reducing ambiguities among similar objects in complex scenes. SAM2Plus further employs an optimized memory system that prunes outdated or low-confidence entries while retaining temporally coherent context, ensuring constant computational resources even for arbitrarily long videos. Extensive experiments on two video object segmentation (VOS) benchmarks demonstrate SAM2Plus's superiority over SAM 2: it achieves an average improvement of 1.0 in J&F across all 24 direct comparisons, with gains exceeding 2.3 points on the SA-V and LVOS datasets for long-term tracking. The method delivers real-time performance and strong generalization without fine-tuning or additional parameters, effectively addressing occlusion recovery and viewpoint changes. By unifying motion-aware, physics-based prediction with spatial segmentation, SAM2Plus bridges the gap between static and dynamic reasoning, offering a scalable solution for real-world applications such as autonomous driving and surveillance systems.
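The Kalman prediction that bridges occlusions can be illustrated with a constant-velocity filter over an object's center point. The numpy sketch below uses assumed noise covariances and is a generic filter, not SAM2Plus's implementation; during an occlusion one would run only the predict step.

```python
import numpy as np

# State: [cx, cy, vx, vy]; constant-velocity model with dt = 1 frame.
F = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], float)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], float)
Q, R = np.eye(4) * 0.01, np.eye(2) * 1.0   # assumed noise levels

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    return x + K @ (z - H @ x), (np.eye(4) - K @ H) @ P

x, P = np.array([100.0, 50.0, 0.0, 0.0]), np.eye(4)
for z in [np.array([102.0, 51.0]), np.array([104.5, 52.2])]:
    x, P = predict(x, P)
    x, P = update(x, P, z)
print(x[:2], x[2:])   # filtered position and estimated velocity
```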

21 pages, 5105 KiB  
Article
A Dynamic Kalman Filtering Method for Multi-Object Fruit Tracking and Counting in Complex Orchards
by Yaning Zhai, Ling Zhang, Xin Hu, Fanghu Yang and Yang Huang
Sensors 2025, 25(13), 4138; https://doi.org/10.3390/s25134138 - 2 Jul 2025
Viewed by 496
Abstract
With the rapid development of agricultural intelligence in recent years, automatic fruit detection and counting technologies have become increasingly important for optimizing orchard management and advancing precision agriculture. However, existing deep learning-based models are designed primarily for static, single-frame images and therefore fail to meet the large-scale detection and counting demands of the dynamically changing scenes of modern orchards. To address these challenges, this paper proposes a multi-object fruit tracking and counting method that integrates an improved YOLO-based object detection algorithm with a dynamically optimized Kalman filter. The improved YOLO detection model, with its optimized network structure, provides high-quality detections for the subsequent tracking task. A modified Kalman filter with a variable forgetting factor then dynamically adjusts the weighting of historical data, enabling the model to adapt to changes in observation and motion noise. Moreover, fruit targets are associated using a combined strategy based on Intersection over Union (IoU) and Re-Identification (Re-ID) features, improving the accuracy and stability of object matching and enabling continuous tracking and precise counting of fruits in video sequences. Experimental results on fruit video sequences show that the proposed method performs robust, continuous tracking (MOTA of 95.0% and HOTA of 82.4%). For fruit counting, the method attains a coefficient of determination of 0.85 and a root-mean-square error (RMSE) of 1.57, exhibiting high accuracy and stability in fruit detection, tracking, and counting under complex orchard environments.
(This article belongs to the Special Issue AI-Based Computer Vision Sensors & Systems)
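The combined IoU + Re-ID association can be sketched as a weighted cost matrix solved with the Hungarian algorithm. The weight `w` and the feature dimensionality below are assumptions for illustration, not the paper's settings.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """Boxes as [x1, y1, x2, y2]."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def associate(tracks, dets, t_feats, d_feats, w=0.6):
    """Cost = w * (1 - IoU) + (1 - w) * cosine distance of Re-ID features."""
    cost = np.zeros((len(tracks), len(dets)))
    for i, (tb, tf) in enumerate(zip(tracks, t_feats)):
        for j, (db, df) in enumerate(zip(dets, d_feats)):
            cos = tf @ df / (np.linalg.norm(tf) * np.linalg.norm(df) + 1e-9)
            cost[i, j] = w * (1 - iou(tb, db)) + (1 - w) * (1 - cos)
    return linear_sum_assignment(cost)   # matched (track_idx, det_idx) pairs

rows, cols = associate([[0, 0, 10, 10]], [[1, 1, 11, 11]],
                       [np.ones(8)], [np.ones(8)])
print(list(zip(rows, cols)))  # [(0, 0)]
```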

20 pages, 119066 KiB  
Article
Coarse-Fine Tracker: A Robust MOT Framework for Satellite Videos via Tracking Any Point
by Hanru Shi, Xiaoxuan Liu, Xiyu Qi, Enze Zhu, Jie Jia and Lei Wang
Remote Sens. 2025, 17(13), 2167; https://doi.org/10.3390/rs17132167 - 24 Jun 2025
Viewed by 264
Abstract
Traditional Multiple Object Tracking (MOT) methods for satellite videos mostly follow the Detection-Based Tracking (DBT) framework. However, the DBT framework assumes that all objects are correctly recognized and localized by the detector; in practice, the low resolution of satellite videos, small objects, and complex backgrounds inevitably degrade detector performance. To alleviate the impact of detector degradation on tracking, we propose Coarse-Fine Tracker, a framework that for the first time integrates the MOT framework with the Tracking Any Point (TAP) method CoTracker, leveraging TAP's persistent point-correspondence modeling to compensate for detector failures. Coarse-Fine Tracker divides the satellite video into sub-videos. For each sub-video, we first use ByteTrack to track the detector outputs, referred to as coarse tracking, which involves the Kalman filter and box-level motion features. Given the small size of objects in satellite videos, we treat each object as a point to be tracked and use CoTracker to track the center point of each object, referred to as fine tracking, by computing appearance-feature similarity between each point and its neighboring points. Finally, a Consensus Fusion Strategy eliminates mismatched detections in the coarse tracking results by checking their geometric consistency against the fine tracking results, and recovers missed objects via linear interpolation or linear fitting. The method is validated on the VISO and SAT-MTB datasets. Experimental results on VISO show that the tracker achieves a multi-object tracking accuracy (MOTA) of 66.9, a multi-object tracking precision (MOTP) of 64.1, and an IDF1 score of 77.8, surpassing the detector-only baseline by 11.1% in MOTA while reducing ID switches by 139. Comparative experiments with ByteTrack demonstrate the robustness of our tracking method when detector performance deteriorates.
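Recovering missed objects by linear interpolation, as in the Consensus Fusion Strategy, amounts to filling frame gaps in a track. A minimal numpy sketch (the track values are invented):

```python
import numpy as np

def fill_gaps(frames, xs, ys):
    """Linearly interpolate missing per-frame positions of one track."""
    full = np.arange(frames[0], frames[-1] + 1)
    return full, np.interp(full, frames, xs), np.interp(full, frames, ys)

# Track observed at frames 0-2 and 5; frames 3-4 were missed detections.
f, x, y = fill_gaps([0, 1, 2, 5], [10, 12, 14, 20], [5, 5, 6, 9])
print(list(zip(f, x, y)))   # positions for frames 3 and 4 are filled in
```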

19 pages, 11127 KiB  
Article
Drone State Estimation Based on Frame-to-Frame Template Matching with Optimal Windows
by Seokwon Yeom
Drones 2025, 9(7), 457; https://doi.org/10.3390/drones9070457 - 24 Jun 2025
Viewed by 398
Abstract
The flight capability of drones expands the surveillance area and allows drones to serve as mobile platforms, so estimating the kinematic state of a drone is important. In this paper, the kinematic state of a mini drone in flight is estimated from the video captured by its onboard camera, using a novel frame-to-frame template-matching technique. The instantaneous velocity of the drone is measured through image-to-position conversion and frame-to-frame template matching with optimal windows. Multiple templates are defined by their corresponding windows in a frame; the size and location of the windows are obtained by minimizing the sum of the least-square errors between a piecewise linear regression model and the nonlinear image-to-position conversion function. The displacement between two consecutive frames is obtained via frame-to-frame template matching that minimizes the sum of normalized squared differences, and the kinematic state of the drone is estimated by a Kalman filter based on the velocity computed from this displacement. The Kalman filter is augmented to simultaneously estimate the state and the velocity bias of the drone, and a zero-order hold scheme reuses measurements for faster processing. In the experiments, two 150 m long roadways were tested, one in an urban environment and the other in a suburban environment. A mini drone starts from a hovering state, reaches top speed, and then continues to fly at a nearly constant speed, capturing video 10 times on each road from a height of 40 m at a 60-degree camera tilt angle. The proposed method achieves average distance errors at the low-meter level after the flight.
(This article belongs to the Special Issue Intelligent Image Processing and Sensing for Drones, 2nd Edition)
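The displacement measurement can be illustrated with OpenCV's normalized-squared-difference template matching between consecutive frames. The fixed window below is an assumption; choosing optimal windows is the paper's contribution and is not reproduced here.

```python
import cv2
import numpy as np

def frame_displacement(prev, curr, win):
    """Match a window from the previous frame inside the current frame.

    win: (x, y, w, h) template window in `prev`; the normalized squared
    difference is minimized, mirroring the matching criterion above.
    """
    x, y, w, h = win
    template = prev[y:y + h, x:x + w]
    scores = cv2.matchTemplate(curr, template, cv2.TM_SQDIFF_NORMED)
    _, _, min_loc, _ = cv2.minMaxLoc(scores)      # best match = minimum score
    return min_loc[0] - x, min_loc[1] - y         # pixel displacement (dx, dy)

prev = (np.random.rand(240, 320) * 255).astype(np.uint8)
curr = np.roll(prev, (3, 5), axis=(0, 1))         # synthetic shift: 5 px right, 3 px down
print(frame_displacement(prev, curr, (100, 80, 64, 64)))  # ~ (5, 3)
```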

23 pages, 6358 KiB  
Article
Optimization of Sorghum Spike Recognition Algorithm and Yield Estimation
by Mengyao Han, Jian Gao, Cuiqing Wu, Qingliang Cui, Xiangyang Yuan and Shujin Qiu
Agronomy 2025, 15(7), 1526; https://doi.org/10.3390/agronomy15071526 - 23 Jun 2025
Viewed by 342
Abstract
In the natural field environment, the high planting density of sorghum and severe occlusion among spikes substantially increase the difficulty of sorghum spike recognition, resulting in frequent false positives and false negatives. Detection models suited to this environment demand high computational power, making real-time detection of sorghum spikes on mobile devices difficult. This study proposes a detection-tracking scheme based on an improved YOLOv8s-GOLD-LSKA model with optimized DeepSort, aiming to enhance yield estimation accuracy in complex field scenarios. By integrating the GOLD module's dual-branch multi-scale feature fusion and the LSKA attention mechanism, a lightweight detection model is developed. The improved DeepSort algorithm enhances tracking robustness under occlusion by optimizing the confidence-threshold filtering (0.46), frame-skipping count, and cascade matching strategy (n = 3, max_age = 40). Combined with the five-point sampling method, the average dry weight of sorghum spikes (0.12 kg) was used to enable rapid yield estimation. The results demonstrate that the improved model achieved a mAP of 85.86% (a 6.63% increase over the original YOLOv8), an F1-score of 81.19%, and a model size reduced to 7.48 MB, with a detection speed of 0.0168 s per frame. The optimized tracking system attained a MOTA of 67.96% and ran at 42 FPS. Image- and video-based yield estimation accuracies reached 89–96% and 75–93%, respectively, with single-frame latency as low as 0.047 s. By optimizing the full detection-tracking-yield pipeline, this solution addresses missed detections of small objects, ID switches under occlusion, and real-time processing in complex scenarios. Its lightweight, high-efficiency design is well suited for deployment on UAVs and mobile terminals, providing robust technical support for intelligent sorghum monitoring and precision agriculture management.
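The yield arithmetic is simple once a reliable spike count exists: multiply the count by the 0.12 kg average dry weight quoted above and scale by the surveyed fraction. All plot-scale numbers in the sketch below are hypothetical.

```python
# Back-of-envelope yield estimate from a tracked spike count.
spikes_counted = 1350          # hypothetical unique track IDs over surveyed rows
avg_dry_weight_kg = 0.12       # per-spike average from five-point sampling (paper)
sampled_fraction = 0.2         # assumed fraction of the field surveyed

field_yield_kg = spikes_counted * avg_dry_weight_kg / sampled_fraction
print(f"estimated field yield: {field_yield_kg:.0f} kg")   # 810 kg
```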

22 pages, 4426 KiB  
Article
High-Radix Taylor-Optimized Tone Mapping Processor for Adaptive 4K HDR Video at 30 FPS
by Xianglong Wang, Zhiyong Lai, Lei Chen and Fengwei An
Sensors 2025, 25(13), 3887; https://doi.org/10.3390/s25133887 - 22 Jun 2025
Viewed by 353
Abstract
High Dynamic Range (HDR) imaging captures vivid, lifelike visual effects that are crucial for fields such as computer vision, photography, and medical imaging. However, real-time processing of HDR content remains challenging due to the computational complexity of tone mapping algorithms and the inherent limitations of Low Dynamic Range (LDR) capture systems. This paper presents an adaptive HDR tone mapping processor that achieves high computational efficiency and robust image quality under varying exposure conditions. By integrating an exposure-adaptive factor into a bilateral filtering framework, we dynamically optimize parameters for consistent performance across fluctuating illumination. Further, we introduce a high-radix Taylor expansion technique to accelerate floating-point logarithmic and exponential operations, significantly reducing resource overhead while maintaining precision. The proposed architecture, implemented on a Xilinx XCVU9P FPGA, operates at 250 MHz and processes 4K video at 30 frames per second (FPS), outperforming state-of-the-art designs in both throughput and hardware efficiency. Experimental results demonstrate superior image fidelity, with an average Tone Mapping Quality Index (TMQI) of 0.9314 and 43% fewer logic resources than existing solutions, enabling real-time HDR processing for high-resolution applications.
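The high-radix Taylor idea, a small lookup table over the leading mantissa bits plus a short Taylor series for the residual, can be checked numerically. The float sketch below uses an assumed radix-16 table and a 3-term series; the paper's fixed-point hardware pipeline is not reproduced.

```python
import math

# Precomputed radix-16 table: log2(1 + k/16) for the top 4 mantissa bits.
TABLE = [math.log2(1 + k / 16) for k in range(16)]

def log2_taylor(x: float) -> float:
    half_m, e = math.frexp(x)             # x = half_m * 2**e, half_m in [0.5, 1)
    m = 2 * half_m - 1                    # mantissa fraction in [0, 1), exponent e-1
    k = int(m * 16)                       # top 4 bits -> radix-16 table index
    u = (m - k / 16) / (1 + k / 16)       # residual, |u| < 1/16
    taylor = (u - u * u / 2 + u ** 3 / 3) / math.log(2)   # 3-term Taylor series
    return (e - 1) + TABLE[k] + taylor

x = 1234.567
print(log2_taylor(x), math.log2(x))       # agree to roughly 1e-5
```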

20 pages, 2223 KiB  
Article
ChatGPT-Based Model for Controlling Active Assistive Devices Using Non-Invasive EEG Signals
by Tais da Silva Mota, Saket Sarkar, Rakshith Poojary and Redwan Alqasemi
Electronics 2025, 14(12), 2481; https://doi.org/10.3390/electronics14122481 - 18 Jun 2025
Viewed by 572
Abstract
With an anticipated 3.6 million Americans living with limb loss by 2050, the demand for active assistive devices is rapidly increasing. This study investigates the feasibility of leveraging a ChatGPT-based (Version 4o) model to predict motion from input electroencephalogram (EEG) signals, enabling non-invasive control of active assistive devices. To achieve this goal, three objectives were set. First, the model's capability to derive accurate mathematical relationships from numerical datasets was validated to establish a foundational level of computational accuracy. Next, synchronized arm motion videos and EEG signals were introduced, allowing the model to filter, normalize, and classify EEG data in relation to distinct text-based arm motions. Finally, the integration of marker-based motion capture data provided motion information essential for inverse kinematics applications in robotic control. The combined findings highlight the potential of ChatGPT-generated machine learning systems to effectively correlate multimodal data streams and serve as a robust foundation for intuitive, non-invasive control of assistive technologies using EEG signals. Future work will focus on applying the model to real-time control applications while expanding the dataset's diversity to enhance accuracy and performance, with the ultimate aim of improving the independence and quality of life of individuals who rely on active assistive devices.
(This article belongs to the Special Issue Advances in Intelligent Control Systems)

21 pages, 12445 KiB  
Article
Parkinson’s Disease Detection via Bilateral Gait Camera Sensor Fusion Using CMSA-Net and Implementation on Portable Device
by Jinxuan Wang, Hua Huo, Wei Liu, Changwei Zhao, Shilu Kang and Lan Ma
Sensors 2025, 25(12), 3715; https://doi.org/10.3390/s25123715 - 13 Jun 2025
Viewed by 476
Abstract
The annual increase in the incidence of Parkinson's disease (PD) underscores the critical need for effective detection methods and devices. Gait video features captured by camera sensors, a crucial biomarker for PD, are well suited for detection and show promise for the development of portable devices. We therefore developed a single-step segmentation method based on Savitzky–Golay (SG) filtering and a sliding-window peak selection function, along with a Cross-Attention Fusion with Mamba-2 and Self-Attention Network (CMSA-Net), and introduced a loss function based on Maximum Mean Discrepancy (MMD) to further enhance the fusion process. We evaluated our method on a dual-view gait video dataset collected in collaboration with a hospital, comprising 304 healthy control (HC) samples and 84 PD samples, achieving an accuracy of 89.10% and an F1-score of 81.11%, the best detection performance among the compared methods. Based on these methodologies, we designed a simple, user-friendly portable PD detection device. The device offers several operating modes, including single-view, dual-view, and prior-information correction, enabling it to adapt to diverse environments such as residential and elder-care settings and demonstrating strong practical applicability.
(This article belongs to the Section Sensing and Imaging)
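The single-step segmentation front end (SG filtering plus peak selection) maps directly onto SciPy primitives. The sketch below runs on a synthetic gait-like signal; the window sizes and thresholds are assumptions, not the paper's tuned values.

```python
import numpy as np
from scipy.signal import savgol_filter, find_peaks

# Synthetic "ankle separation" gait signal: one oscillation per stride.
t = np.linspace(0, 10, 500)
signal = np.abs(np.sin(2 * np.pi * 1.0 * t)) + 0.1 * np.random.randn(500)

smooth = savgol_filter(signal, window_length=21, polyorder=3)   # SG smoothing
peaks, _ = find_peaks(smooth, distance=20, height=0.5)          # window-gated peaks
print(f"{len(peaks)} step events, first at samples {peaks[:5]}")
```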

22 pages, 4957 KiB  
Article
OITrack: Multi-Object Tracking for Small Targets in Satellite Video via Online Trajectory Completion and Iterative Expansion over Union
by Weishan Lu, Xueying Wang, Wei An, Chao Xiao, Qian Yin and Guoliang Zhang
Remote Sens. 2025, 17(12), 2042; https://doi.org/10.3390/rs17122042 - 13 Jun 2025
Viewed by 454
Abstract
Multi-object tracking (MOT) in satellite videos presents significant challenges, including small target sizes, dense distributions, and complex motion patterns. To address these issues, we propose OITrack, an improved tracking framework that integrates a Trajectory Completion Module (TCM), an Adaptive Kalman Filter (AKF), and an Iterative Expansion Intersection over Union (I-EIoU) strategy. Specifically, TCM enhances temporal continuity by compensating for missing trajectories, AKF improves tracking robustness by dynamically adjusting observation noise, and I-EIoU optimizes target association for more accurate small-object matching. Experimental evaluations on the VIdeo Satellite Objects (VISO) dataset demonstrate that OITrack outperforms existing MOT methods across multiple key metrics, achieving a Multiple Object Tracking Accuracy (MOTA) of 57.0% and an Identity F1 Score (IDF1) of 67.5% while reducing False Negatives (FN) to 29,170 and Identity Switches to 889. These results indicate that the method effectively improves tracking accuracy while minimizing identity mismatches, enhancing overall robustness.
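The abstract does not detail I-EIoU, but the underlying expansion trick is easy to show: inflate both boxes about their centers until tiny, near-miss targets overlap enough to associate. The sketch below is a generic illustration under assumed expansion steps, not the paper's formulation.

```python
def expand(box, r):
    """Grow [x1, y1, x2, y2] about its center by factor r."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    w, h = (box[2] - box[0]) * r / 2, (box[3] - box[1]) * r / 2
    return [cx - w, cy - h, cx + w, cy + h]

def iou(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def iterative_expanded_iou(a, b, start=1.0, step=0.5, max_r=3.0):
    """Expand both boxes until they overlap or the expansion cap is hit."""
    r = start
    while r <= max_r:
        v = iou(expand(a, r), expand(b, r))
        if v > 0:
            return v, r
        r += step
    return 0.0, max_r

# Two 4-px targets that miss by a few pixels associate at r = 2.5.
print(iterative_expanded_iou([10, 10, 14, 14], [18, 10, 22, 14]))
```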

23 pages, 4973 KiB  
Article
Detection of Electric Network Frequency in Audio Using Multi-HCNet
by Yujin Li, Tianliang Lu, Shufan Peng, Chunhao He, Kai Zhao, Gang Yang and Yan Chen
Sensors 2025, 25(12), 3697; https://doi.org/10.3390/s25123697 - 13 Jun 2025
Viewed by 550
Abstract
With the increasing application of the electrical network frequency (ENF) in forensic audio and video analysis, ENF signal detection has emerged as a critical technology. However, the high-pass filtering commonly employed in modern communication scenarios, while effectively removing infrasound to enhance communication quality at reduced cost, results in a substantial loss of fundamental-frequency information, degrading the performance of existing detection methods. To tackle this issue, this paper introduces Multi-HCNet, a deep learning model tailored for ENF signal detection in high-pass-filtered environments. The model incorporates an array of high-order harmonic filters (AFB), which compensates for the loss of the fundamental frequency by capturing high-order harmonic components. Additionally, a grouped multi-channel adaptive attention mechanism (GMCAA) is proposed to precisely distinguish between multiple frequency signals, proving particularly effective at differentiating 50 Hz and 60 Hz fundamental-frequency signals. Furthermore, a sine activation function (SAF) better matches the periodic nature of ENF signals, enhancing the model's capacity to capture periodic oscillations. Experimental results indicate that, after hyperparameter optimization, Multi-HCNet performs well across varied experimental conditions. Compared with existing approaches, it significantly improves ENF detection accuracy in complex environments, achieving a peak accuracy of 98.84%, and maintains an average detection accuracy above 80% under high-pass filtering. These findings demonstrate that the model can still detect ENF signals effectively even when fundamental-frequency information is lost, offering a solution for ENF detection under extreme conditions of fundamental-frequency absence, and its ability to distinguish 50 Hz from 60 Hz fundamental signals provides robust support for the practical deployment and extension of ENF applications.
(This article belongs to the Section Sensor Networks)
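The harmonic-filter-bank idea, recovering ENF evidence from higher harmonics when the 50/60 Hz fundamental has been removed, can be sketched with narrow band-pass filters. The SciPy sketch below uses an assumed sampling rate, bandwidth, and harmonic set; it is not the paper's AFB design.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 8000          # assumed sampling rate (Hz)

def harmonic_filter_bank(audio, base=50.0, harmonics=(2, 3, 4), bw=1.0):
    """Narrow band-pass filters at higher ENF harmonics; the fundamental
    is assumed lost to high-pass filtering."""
    bands = []
    for k in harmonics:
        f = k * base
        sos = butter(4, [f - bw, f + bw], btype="bandpass", fs=FS, output="sos")
        bands.append(sosfiltfilt(sos, audio))
    return np.stack(bands)   # one channel per harmonic

t = np.arange(FS * 2) / FS
# Weak 3rd-harmonic ENF trace (~150 Hz) buried in noise.
audio = 0.01 * np.sin(2 * np.pi * 150.2 * t) + 0.1 * np.random.randn(t.size)
print(harmonic_filter_bank(audio).shape)   # (3, 16000)
```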
