Search Results (63)

Search Parameters:
Keywords = camera module v3

16 pages, 3321 KB  
Article
Evaluating the X2000: A Novel Integrated Platform for Rapid ADAS Development
by Michael Giuliani and George Pappas
Electronics 2026, 15(3), 679; https://doi.org/10.3390/electronics15030679 - 4 Feb 2026
Viewed by 385
Abstract
In this work, we present the design and evaluation of the X2000, a new development kit created to simplify and accelerate research for advanced driver-assistance systems (ADAS). The X2000 is a complete ADAS development kit for the Ford Mach-E. It includes a forward-facing vehicle-mounted camera, vehicle-mounted AI computer, controller area network flexible data-rate (CAN-FD) and 12 V power connections, and a CAN-FD interface to the vehicle’s forward radar. Central to the kit is a novel ADAS software architecture designed for readability and extensibility. Included in the design are software modules for the following: (1) camera and radar interfacing; (2) image processing; (3) AI model inference; (4) data logging; (5) steering and velocity planning; (6) low-level vehicle controls for steering, acceleration, and braking; (7) lane centering visualization to the car’s 17-inch touchscreen. To build on a proven system, the X2000 integrates the AI model, planner, low-level controls, and radar interfacing software from Openpilot. We build on the excellent work of the Openpilot team while creating a highly simplified system. Openpilot features 17 software processes and 77 inter-process messages, while the X2000 uses 6 processes and 7 inter-process messages.

20 pages, 5171 KB  
Article
LGD-DeepLabV3+: An Enhanced Framework for Remote Sensing Semantic Segmentation via Multi-Level Feature Fusion and Global Modeling
by Xin Wang, Xu Liu, Adnan Mahmood, Yaxin Yang and Xipeng Li
Sensors 2026, 26(3), 1008; https://doi.org/10.3390/s26031008 - 3 Feb 2026
Viewed by 250
Abstract
Remote sensing semantic segmentation encounters several challenges, including scale variation, the coexistence of inter-class similarity and intra-class diversity, difficulties in modeling long-range dependencies, and shadow occlusions. Slender structures and complex boundaries present particular segmentation difficulties, especially in high-resolution imagery acquired by satellite and aerial cameras, UAV-borne optical sensors, and other imaging payloads. These sensing systems deliver large-area coverage with fine ground sampling distance, which magnifies domain shifts between different sensors and acquisition conditions. This work builds upon DeepLabV3+ and proposes complementary improvements at three stages: input, context, and decoder fusion. First, to mitigate the interference of complex and heterogeneous data distributions on network optimization, a feature-mapping network is introduced to project raw images into a simpler distribution before they are fed into the segmentation backbone. This approach facilitates training and enhances feature separability. Second, although Atrous Spatial Pyramid Pooling (ASPP) aggregates multi-scale context, it remains insufficient for modeling long-range dependencies. Therefore, a routing-style global modeling module is incorporated after ASPP to strengthen global relation modeling and ensure cross-region semantic consistency. Third, considering that the fusion between shallow details and deep semantics in the decoder is limited and prone to boundary blurring, a fusion module is designed to facilitate deep interaction and joint learning through cross-layer feature alignment and coupling. The proposed model improves the mean Intersection over Union (mIoU) by 8.83% on the LoveDA dataset and by 6.72% on the ISPRS Potsdam dataset compared to the baseline. Qualitative results further demonstrate clearer boundaries and more stable region annotations, while the proposed modules are plug-and-play and easy to integrate into camera-based remote sensing pipelines and other imaging-sensor systems, providing a practical accuracy–efficiency trade-off.
(This article belongs to the Section Smart Agriculture)
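The mIoU gains reported above are measured as per-class intersection over union, averaged across the classes present. A minimal sketch of the metric (not the authors' evaluation code; function name and signature are illustrative):

```python
import numpy as np

def mean_iou(pred, target, n_classes):
    """mIoU over label maps: per-class intersection / union, averaged."""
    ious = []
    for c in range(n_classes):
        p, t = (pred == c), (target == c)
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both prediction and target: skip it
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))
```

For example, with predictions `[0, 0, 1, 1]` against targets `[0, 1, 1, 1]`, class 0 scores 1/2, class 1 scores 2/3, giving mIoU 7/12.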

19 pages, 2984 KB  
Article
Development and Field Testing of an Acoustic Sensor Unit for Smart Crossroads as Part of V2X Infrastructure
by Yury Furletov, Dinara Aptinova, Mekan Mededov, Andrey Keller, Sergey S. Shadrin and Daria A. Makarova
Smart Cities 2026, 9(1), 17; https://doi.org/10.3390/smartcities9010017 - 21 Jan 2026
Viewed by 250
Abstract
Improving city crossroads safety is a critical problem for modern smart transportation systems (STS). This article presents the results of developing, upgrading, and comprehensively experimentally testing an acoustic monitoring system prototype designed for rapid accident detection. Unlike conventional camera- or lidar-based approaches, the proposed solution uses passive sound source localization, allowing it to operate effectively with no direct visibility and in adverse weather conditions. Generalized Cross-Correlation with Phase Transform (GCC-PHAT) algorithms were used to develop a hardware–software complex featuring four microphones, a multichannel audio interface, and a computation module. This study focuses on the gradual upgrading of the algorithm to reduce the mean localization error in real-life urban conditions. Laboratory and complex field tests were conducted on an open-air testing ground of a university campus. During these tests, the system demonstrated that it can accurately determine the coordinates of a sound source imitating accidents (sirens, collisions). The analysis confirmed that the system satisfies the V2X infrastructure integration response time requirement (<200 ms). The results suggest that the system can be used as part of smart transportation systems.
(This article belongs to the Section Physical Infrastructures and Networks in Smart Cities)
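The GCC-PHAT core of the system above estimates the time delay between one microphone pair by whitening the cross-power spectrum so that only phase (and thus delay) information survives. A minimal single-pair sketch under that standard formulation — the function name and parameters are illustrative, not the authors' implementation:

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (seconds) of `sig` relative to `ref` via GCC-PHAT."""
    n = len(sig) + len(ref)          # FFT length covering the full linear correlation
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    r = SIG * np.conj(REF)
    r /= np.abs(r) + 1e-12           # PHAT weighting: keep phase, discard magnitude
    cc = np.fft.irfft(r, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # reorder so index 0 corresponds to lag -max_shift
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / float(fs)
```

With four microphones, pairwise delays from this estimator feed a geometric solver that triangulates the source position.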

25 pages, 3861 KB  
Article
Semantically Guided 3D Reconstruction and Body Weight Estimation Method for Dairy Cows
by Jinshuo Zhang, Xinzhong Wang, Hewei Meng, Junzhu Huang, Xinran Zhang, Kuizhou Zhou, Yaping Li and Huijie Peng
Agriculture 2026, 16(2), 182; https://doi.org/10.3390/agriculture16020182 - 11 Jan 2026
Viewed by 292
Abstract
To address the low efficiency and stress-inducing nature of traditional manual weighing for dairy cows, this study proposes a semantically guided 3D reconstruction and body weight estimation method for dairy cows. First, a dual-viewpoint Kinect V2 camera synchronous acquisition system captures top-view and side-view point cloud data from 150 calves and 150 lactating cows. Subsequently, the CSS-PointNet++ network model was designed. Building upon PointNet++, it incorporates a Convolutional Block Attention Module (CBAM) and an Attention-Weighted Hybrid Pooling Module (AHPM) to achieve precise semantic segmentation of the torso and limbs in the side-view point cloud. Based on this, point cloud registration algorithms were applied to align the dual-view point clouds. Missing parts were mirrored and completed using semantic information to achieve 3D reconstruction. Finally, a body weight estimation model was established based on volume and surface area through surface reconstruction. Experiments demonstrate that CSS-PointNet++ achieves an Overall Accuracy (OA) of 98.35% and a mean Intersection over Union (mIoU) of 95.61% in semantic segmentation tasks, representing improvements of 2.2% and 4.65% over PointNet++, respectively. In the weight estimation phase, the BP neural network (BPNN) delivers optimal performance: for the calf group, the Mean Absolute Error (MAE) was 1.8409 kg, Root Mean Square Error (RMSE) was 2.4895 kg, Mean Relative Error (MRE) was 1.49%, and Coefficient of Determination (R2) was 0.9204; for the lactating cow group, MAE was 12.5784 kg, RMSE was 14.4537 kg, MRE was 1.75%, and R2 was 0.8628. This method enables 3D reconstruction and body weight estimation of cows during walking, providing an efficient and precise body weight monitoring solution for precision farming.
(This article belongs to the Section Farm Animal Production)
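The four error measures quoted above (MAE, RMSE, MRE, R2) follow their standard definitions. A minimal sketch of how they are computed for a batch of weight predictions — illustrative only, not the authors' code:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, MRE%, R2) for predicted vs. true weights."""
    err = y_pred - y_true
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    mre = np.mean(np.abs(err) / y_true) * 100.0   # mean relative error, percent
    ss_res = np.sum(err ** 2)                     # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, mre, r2
```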

19 pages, 4836 KB  
Article
Experimental Study of Pouch-Type Battery Cell Thermal Characteristics Operated at High C-Rates
by Marius Vasylius, Deivydas Šapalas, Benas Dumbrauskas, Valentinas Kartašovas, Audrius Senulis, Artūras Tadžijevas, Pranas Mažeika, Rimantas Didžiokas, Ernestas Šimkutis and Lukas Januta
Batteries 2026, 12(1), 14; https://doi.org/10.3390/batteries12010014 - 28 Dec 2025
Viewed by 644
Abstract
This paper investigates pouch-type lithium-ion battery cells with a nominal voltage of 3.7 V and a nominal capacity of 57 Ah. A numerical model of the cell was developed and implemented using the NTGK method, which accurately predicts electrochemical and thermal processes. The results of numerical modeling matched the experimental battery cell temperature measurements, with an average deviation of about 4.5%; the model can therefore be considered reliable for further engineering research and the construction of battery modules. In the experimental part of the paper, the battery cell was loaded at various C-rates (from 0.5 to 2 C) and monitored using heat flux sensors, thermocouples, and a thermal imaging camera. The studies revealed that the highest temperature occurs in the tab area of the cells. The temperature on the face of the cell surface exceeds 35 °C already at a load of 1.35 C, which accelerates cell degradation and reduces the number of cycles. Thermal imaging revealed uneven temperature distribution, whereby the top of the cell heats up more than the bottom and the temperature gradient can reach 2–4 °C. It was observed that during faster charge/discharge modes the temperature rises from the tabs of the cell, and during slower ones more in the middle of the face surface. The studies highlight the need for additional cooling solutions, especially cooling of the upper cell face, to ensure durability and uniform heat distribution.

13 pages, 14872 KB  
Article
Efficient Weather Perception via a Lightweight Network with Multi-Scale Feature Learning, Channel Attention, and Soft Voting
by Che-Cheng Chang, Po-Ting Wu, Ting-Yu Tsai and Jhe-Wei Lin
Electronics 2026, 15(1), 4; https://doi.org/10.3390/electronics15010004 - 19 Dec 2025
Viewed by 389
Abstract
Autonomous driving technology is advancing rapidly, particularly in vision-based approaches that use cameras to perceive the driving environment, which is the most human-like perception method. However, one of the key challenges that smart vehicles face is adapting to various weather conditions, which can significantly impact visual perception and vehicular control strategies. Ideally, control strategies should adjust dynamically in real time, taking prevailing weather conditions into account, to ensure safe and efficient driving. In this study, we propose a lightweight weather perception model that incorporates multi-scale feature learning, channel attention mechanisms, and a soft voting ensemble strategy. This enables the model to capture various visual patterns, emphasize critical information, and integrate predictions across multiple modules for improved robustness. Benchmark comparisons are conducted using several well-known deep learning networks, including EfficientNet-B0, ResNet50, SqueezeNet, MobileNetV3-Large, MobileNetV3-Small, and LSKNet. Finally, using both public datasets and real-world video recordings from roads in Taiwan, our model demonstrates superior computational efficiency while maintaining high predictive accuracy. For example, our model achieves 98.07% classification accuracy with only 0.4 million parameters and 0.19 GFLOPs, surpassing several well-known CNNs in computational efficiency. Compared with EfficientNet-B0, which has similar accuracy (98.37%) but requires over ten times more parameters and four times more FLOPs, our model offers a much lighter and faster alternative.
(This article belongs to the Section Computer Science & Engineering)
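The soft voting ensemble strategy mentioned above averages the class-probability vectors from several prediction heads before taking the argmax, instead of counting hard votes. A minimal sketch (illustrative function name; the optional weights are an assumption, not the paper's configuration):

```python
import numpy as np

def soft_vote(prob_list, weights=None):
    """Average class probabilities from several heads and pick the argmax."""
    probs = np.stack(prob_list)                 # shape (n_heads, n_classes)
    if weights is not None:
        w = np.asarray(weights, dtype=float)
        avg = (probs * w[:, None]).sum(axis=0) / w.sum()  # weighted average
    else:
        avg = probs.mean(axis=0)                # plain soft vote
    return int(np.argmax(avg)), avg
```

Note that soft voting can overturn a hard-vote majority: below, two of three heads prefer class 1, and the averaged probabilities agree, but a head that is very confident in class 0 could have flipped the result.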

23 pages, 3778 KB  
Article
Deep Learning-Driven Design and Analysis of an Autonomous Robotic System for In-Pipe Inspection
by Ambigai Rajasekaran, Uma Mohan, Sethuramalingam Prabhu, Shaik Ayman Hameed Baig, Shaik Pasha, Srinivasan Sridhar, Utsav Jain, Arvind Sekhar, Aryan Dwivedi and Praneeth Kasiraju
Algorithms 2026, 19(1), 1; https://doi.org/10.3390/a19010001 - 19 Dec 2025
Viewed by 689
Abstract
This paper presents an intelligent robotic system for in-pipe inspection that integrates a novel mechanical design, deep learning-based defect detection, and high-fidelity simulation for real-time validation. Unlike existing solutions, the proposed system combines a Mecanum wheel-based mobile platform with a modular arm and advanced pan-tilt camera, enabling navigation and inspection of pipes ranging from 100 mm to 500 mm in diameter. A comprehensive dataset of 53,486 images, including 27,000 annotated defect instances across six critical classes, was used to train a YOLOv11-based detection framework. The model achieved high accuracy with a precision of 0.9, recall of 0.8, mAP@0.5 of 0.9, and mAP@0.5:0.95 of 0.6, outperforming previous YOLO versions, SSD, RCNN, and DinoV2 by 26% in mAP. Real-time testing on a Raspberry Pi Camera 3 Wide IR module validated robust detection under realistic conditions. This work contributes a mechanically adaptable robot, an optimized deep learning inspection framework, and an integrated simulation-to-deployment workflow, providing a scalable and autonomous solution for industrial pipeline inspection.
(This article belongs to the Special Issue AI Applications and Modern Industry)
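The mAP@0.5 and mAP@0.5:0.95 figures above are both defined over the intersection-over-union (IoU) between predicted and ground-truth boxes: a detection counts as correct when its IoU clears the threshold. A minimal sketch of the underlying IoU computation for axis-aligned boxes:

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)   # overlap area
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```

mAP@0.5:0.95 simply averages the resulting average precision over IoU thresholds from 0.5 to 0.95 in steps of 0.05.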

12 pages, 2668 KB  
Communication
Image Sensing for Motorcycle Active Safety Warning System: Using YOLO and Heuristic Weighting Mechanism
by Yaw-Jen Chang, Ming-Cheng Hsu and Wen-Yung Liang
Sensors 2025, 25(23), 7214; https://doi.org/10.3390/s25237214 - 26 Nov 2025
Cited by 1 | Viewed by 720
Abstract
This paper presents an active safety warning system for two-wheeled motorcycles that integrates YOLO v4 image recognition technology with a heuristic weighting mechanism (HWM) model to calculate risk scores and thus alert riders. The system’s analytical core is based on the NVIDIA Jetson TX2 module, with a camera mounted on the left-side rearview mirror of the motorcycle. YOLO is used to identify the type of approaching vehicle and measure the distance between the vehicle and the motorcycle. Moreover, the HWM model takes inputs such as vehicle type, spacing between the motorcycle and the vehicle, motorcycle speed, and distance from the intersection to generate potential risk scores. After training, the YOLO model for vehicle recognition achieved a mean Average Precision (mAP) of 92.78% at an Intersection over Union (IoU) threshold of 0.5. Additionally, the camera mounted at a 30° angle could clearly capture vehicles approaching from the left rear side of the motorcycle, achieving the highest vehicle recognition rate. Moreover, the HWM model generates a reasonable risk score to advise the rider to decelerate when the motorcycle is traveling at high speed with a vehicle approaching from behind, thereby reducing the risk of an accident and enhancing the safety of the motorcyclist.
(This article belongs to the Section Sensing and Imaging)
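A heuristic weighting mechanism of the kind described combines the four inputs (vehicle type, spacing, speed, intersection distance) into a weighted risk score. The sketch below is purely illustrative: the weights, normalization ranges, and vehicle-type table are hypothetical placeholders, not the paper's HWM parameters:

```python
def risk_score(vehicle_type, gap_m, speed_kmh, dist_to_intersection_m):
    """Toy weighted-sum risk score in [0, 100] (illustrative weights only)."""
    type_weight = {"truck": 1.0, "car": 0.7, "motorcycle": 0.5}.get(vehicle_type, 0.6)
    gap_term = max(0.0, 1.0 - gap_m / 30.0)                   # closer vehicle -> higher risk
    speed_term = min(speed_kmh / 100.0, 1.0)                  # faster -> higher risk
    junction_term = max(0.0, 1.0 - dist_to_intersection_m / 50.0)
    return 100.0 * (0.4 * type_weight * gap_term
                    + 0.35 * speed_term
                    + 0.25 * junction_term)
```

The intent is only to show the shape of such a heuristic: each factor is normalized to [0, 1] and the weighted sum grows monotonically with proximity and speed.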

22 pages, 3239 KB  
Article
Feature-Level Vehicle-Infrastructure Cooperative Perception with Adaptive Fusion for 3D Object Detection
by Shuangzhi Yu, Jiankun Peng, Shaojie Wang, Di Wu and Chunye Ma
Smart Cities 2025, 8(5), 171; https://doi.org/10.3390/smartcities8050171 - 14 Oct 2025
Cited by 1 | Viewed by 1872
Abstract
As vehicle-centric perception struggles with occlusion and dense traffic, vehicle-infrastructure cooperative perception (VICP) offers a viable route to extend sensing coverage and robustness. This study proposes a feature-level VICP framework that fuses vehicle- and roadside-derived visual features via V2X communication. The model integrates four components: regional feature reconstruction (RFR) for transferring region-specific roadside cues, context-driven channel attention (CDCA) for channel recalibration, uncertainty-weighted fusion (UWF) for confidence-guided weighting, and point sampling voxel fusion (PSVF) for efficient alignment. Evaluated on the DAIR-V2X-C benchmark, our method consistently outperforms state-of-the-art feature-level fusion baselines, achieving improved AP3D and APBEV (reported settings: 16.31% and 21.49%, respectively). Ablations show that RFR provides the largest single-module gain (+3.27% AP3D and +3.85% APBEV), UWF yields substantial robustness gains, and CDCA offers modest calibration benefits. The framework enhances occlusion handling and cross-view detection while reducing dependence on explicit camera calibration, supporting more generalizable cooperative perception.
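Uncertainty-weighted fusion of the kind named above typically weights each source by the inverse of its estimated uncertainty, so the more confident feature dominates. A deliberately simplified per-element sketch under that standard inverse-variance formulation (not the paper's UWF module):

```python
import numpy as np

def uncertainty_weighted_fusion(feat_v, var_v, feat_i, var_i):
    """Fuse vehicle and infrastructure features, weighting by inverse variance."""
    w_v = 1.0 / (var_v + 1e-8)   # low variance -> high confidence -> high weight
    w_i = 1.0 / (var_i + 1e-8)
    return (w_v * feat_v + w_i * feat_i) / (w_v + w_i)
```

With equal variances this reduces to a plain average; as one source's variance grows, the fused value converges to the other source's feature.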

26 pages, 7995 KB  
Article
Smart Home Control Using Real-Time Hand Gesture Recognition and Artificial Intelligence on Raspberry Pi 5
by Thomas Hobbs and Anwar Ali
Electronics 2025, 14(20), 3976; https://doi.org/10.3390/electronics14203976 - 10 Oct 2025
Viewed by 3714
Abstract
This paper outlines the process of developing a low-cost system for home appliance control via real-time hand gesture classification using computer vision and a custom lightweight machine learning model. This system strives to enable those with speech or hearing disabilities to interface with smart home devices in real time using hand gestures, much as voice-activated ‘smart assistants’ currently allow. The system runs on a Raspberry Pi 5 to enable future IoT integration and reduce costs, and uses the official camera module v2 and 7-inch touchscreen. Frame preprocessing uses MediaPipe to assign hand coordinates and NumPy tools to normalise them. A machine learning model then predicts the gesture. The model, a feed-forward network consisting of five fully connected layers, was built using Keras 3 and compiled with TensorFlow Lite. Training data utilised the HaGRIDv2 dataset, reduced from its original 23 one- and two-handed gestures to 15 one-handed gestures. When used to train the model, validation metrics of 0.90 accuracy and 0.31 loss were returned. The system can control both analogue and digital hardware via GPIO pins and, when recognising a gesture, averages 20.4 frames per second with no observable delay.
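The NumPy normalisation step described above has to make the landmark coordinates invariant to where the hand sits in the frame and how large it appears. One common scheme, sketched here as an assumption (the paper does not spell out its exact normalisation), translates all landmarks to the wrist origin and rescales by the largest coordinate magnitude:

```python
import numpy as np

def normalise_landmarks(landmarks):
    """Translate landmarks to the wrist origin and scale to unit max magnitude.

    `landmarks` is an (n, 2) array of (x, y) points; MediaPipe Hands puts the
    wrist at index 0 and provides 21 landmarks per hand.
    """
    pts = np.asarray(landmarks, dtype=float)
    pts = pts - pts[0]              # wrist becomes the origin
    scale = np.abs(pts).max()       # largest coordinate magnitude
    if scale > 0:
        pts = pts / scale           # scale-invariant in [-1, 1]
    return pts.flatten()            # flat vector for the dense network
```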

24 pages, 6407 KB  
Article
Lightweight SCC-YOLO for Winter Jujube Detection and 3D Localization with Cross-Platform Deployment Evaluation
by Meng Zhou, Yaohua Hu, Anxiang Huang, Yiwen Chen, Xing Tong, Mengfei Liu and Yunxiao Pan
Agriculture 2025, 15(19), 2092; https://doi.org/10.3390/agriculture15192092 - 8 Oct 2025
Cited by 1 | Viewed by 846
Abstract
Harvesting winter jujubes is a key step in production, yet traditional manual approaches are labor-intensive and inefficient. To overcome these challenges, we propose SCC-YOLO, a lightweight method for winter jujube detection, 3D localization, and cross-platform deployment, aiming to support intelligent harvesting. In this study, RGB-D cameras were integrated with an improved YOLOv11 network optimized by ShuffleNetV2, CBAM, and a redesigned C2f_WTConv module, which enables joint spatial–frequency feature modeling and enhances small-object detection in complex orchard conditions. The model was trained on a diversified dataset with extensive augmentation to ensure robustness. In addition, the original localization loss was replaced with DIoU to improve bounding box regression accuracy. A robotic harvesting system was developed, and an Eye-to-Hand calibration-based 3D localization pipeline was implemented to map fruit coordinates to the robot workspace for accurate picking. To validate engineering applicability, the SCC-YOLO model was deployed on both desktop (PyTorch and ONNX Runtime) and mobile (NCNN with Vulkan+FP16) platforms, and FPS, latency, and stability were comparatively analyzed. Experimental results showed that SCC-YOLO improved mAP by 5.6% over YOLOv11, significantly enhanced detection precision and robustness, and achieved real-time performance on mobile devices while maintaining peak throughput on high-performance desktops. Field and laboratory tests confirmed the system’s effectiveness for detection, localization, and harvesting efficiency, demonstrating its adaptability to diverse deployment environments and its potential for broader agricultural applications.
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
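The DIoU localization loss mentioned above augments plain IoU with a penalty on the normalized distance between box centres, which speeds up regression when boxes do not overlap. A minimal sketch following the standard DIoU definition (loss = 1 − IoU + ρ²/c², where ρ is the centre distance and c the diagonal of the smallest enclosing box):

```python
def diou_loss(a, b):
    """Distance-IoU loss for two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    iou = inter / union
    # squared distance between box centres
    d2 = (((a[0] + a[2]) - (b[0] + b[2])) ** 2
          + ((a[1] + a[3]) - (b[1] + b[3])) ** 2) / 4.0
    # squared diagonal of the smallest enclosing box
    c2 = ((max(a[2], b[2]) - min(a[0], b[0])) ** 2
          + (max(a[3], b[3]) - min(a[1], b[1])) ** 2)
    return 1.0 - iou + d2 / c2
```

Unlike IoU loss, the centre-distance term still provides a gradient for disjoint boxes, which is the property that improves bounding-box regression.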

25 pages, 5456 KB  
Article
A Lightweight Hybrid Detection System Based on the OpenMV Vision Module for an Embedded Transportation Vehicle
by Xinxin Wang, Hongfei Gao, Xiaokai Ma and Lijun Wang
Sensors 2025, 25(18), 5724; https://doi.org/10.3390/s25185724 - 13 Sep 2025
Cited by 1 | Viewed by 1389
Abstract
Aiming at the real-time object detection requirements of the intelligent control system for laboratory item transportation in mobile embedded unmanned vehicles, this paper proposes a lightweight hybrid detection system based on the OpenMV vision module. The system adopts a two-stage detection mechanism: in long-distance scenarios (>32 cm), fast target positioning is achieved through red threshold segmentation based on the HSV (hue, saturation, value) color space; at close range (≤32 cm), it switches to a lightweight deep learning model for fine-grained recognition to reduce invalid computations. By integrating the MobileNetV2 backbone network with the FOMO (Faster Objects, More Objects) object detection algorithm, the FOMO MobileNetV2 model is constructed, achieving an average classification accuracy of 94.1% on a self-built multi-dimensional dataset (covering two variables, light intensity and object distance, with 820 samples), a 26.5% improvement over the baseline MobileNetV2. In terms of hardware, multiple functional components are integrated: an OLED display, a Bluetooth communication unit, an ultrasonic sensor, an OpenMV H7 Plus camera, and a servo pan-tilt. Target tracking is realized through the PID control algorithm, and the embedded terminal achieves real-time processing performance of 55 fps. Experimental results show that the system can effectively identify and track the detection targets set in the laboratory in real time. The designed unmanned vehicle system provides a practical solution for the automated, low-power transportation of small items in the laboratory environment.
(This article belongs to the Section Vehicular Sensing)
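The two-stage mechanism above boils down to a distance-gated switch: cheap colour thresholding while the target is far, CNN inference only up close. A minimal sketch of that control logic; the 32 cm threshold comes from the abstract, while the HSV red bounds are illustrative placeholders (red wraps around hue 0, so it needs two hue bands):

```python
def choose_stage(distance_cm, threshold_cm=32.0):
    """Stage switch: colour-threshold search far away, CNN recognition up close."""
    return "hsv_segmentation" if distance_cm > threshold_cm else "cnn_recognition"

def in_red_range(h, s, v):
    """Rough HSV test for red pixels (hue in 0-179, OpenCV-style; bounds illustrative)."""
    return (h <= 10 or h >= 170) and s >= 100 and v >= 60
```

The ultrasonic sensor supplies `distance_cm`, so the expensive model only runs on the minority of frames where fine-grained recognition is actually needed.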

44 pages, 7582 KB  
Article
Continuous Authentication in Resource-Constrained Devices via Biometric and Environmental Fusion
by Nida Zeeshan, Makhabbat Bakyt, Naghmeh Moradpoor and Luigi La Spada
Sensors 2025, 25(18), 5711; https://doi.org/10.3390/s25185711 - 12 Sep 2025
Cited by 1 | Viewed by 3007
Abstract
Continuous authentication allows devices to keep checking that the active user is still the rightful owner instead of relying on a single login. However, current methods can be tricked by forged faces, can reveal personal data, or can drain the battery. Additionally, the environment in which the user operates plays a vital role in determining the user’s online security: through attacks such as impersonation and replay, the user or the device can easily be compromised. We present a lightweight system that pairs face recognition with environmental sensing; that is, the phone re-validates the user when the surrounding light or noise changes. A convolutional network turns each captured face into a 128-bit code, which is combined with a random “nonce” and protected by hashing. A camera–microphone module monitors light and sound to decide when to sample again, reducing unnecessary checks. We verified the protocol with formal security tools (Scyther v1.1.3) and confirmed resistance to replay, interception, deepfake, and impersonation attacks. Across 2700 authentication cycles on a Snapdragon 778G testbed, the median decision time decreased from 61.2 ± 3.4 ms to 42.3 ± 2.1 ms (p < 0.01, paired t-test). Data usage per authentication cycle fell by an average of 24.7% ± 1.8%, and mean energy consumption per cycle decreased from 21.3 mJ to 19.8 mJ (∆ = 6.6 mJ, 95% CI: 5.9–7.2). These differences were consistent across varying lighting (≤50, 50–300, >300 lux) and noise conditions (30–55 dB SPL). These results show that smart-sensor-triggered face recognition can offer secure and energy-efficient continuous verification, supporting smart imaging and deep-learning-based face recognition.
(This article belongs to the Section Environmental Sensing)
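The nonce-plus-hash protection described above binds each face code to a fresh random value, so an intercepted digest cannot be replayed. A minimal sketch of the idea using SHA-256 — function names are illustrative, and the paper's actual protocol carries more state than this:

```python
import hashlib
import secrets

def protect_template(face_code):
    """Bind a 128-bit face code to a fresh nonce via SHA-256 (replay-resistance sketch)."""
    nonce = secrets.token_bytes(16)                       # fresh randomness per cycle
    digest = hashlib.sha256(nonce + face_code).hexdigest()
    return nonce, digest

def verify(face_code, nonce, digest):
    """Recompute the digest; a stale or foreign face code fails the check."""
    return hashlib.sha256(nonce + face_code).hexdigest() == digest
```

Because the verifier issues a new nonce every cycle, a captured (nonce, digest) pair is useless for the next authentication round.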

35 pages, 10185 KB  
Article
Int.2D-3D-CNN: Integrated 2D and 3D Convolutional Neural Networks for Video Violence Recognition
by Wimolsree Getsopon, Sirawan Phiphitphatphaisit, Emmanuel Okafor and Olarik Surinta
Mathematics 2025, 13(16), 2665; https://doi.org/10.3390/math13162665 - 19 Aug 2025
Cited by 1 | Viewed by 2222
Abstract
Intelligent video analysis tools have advanced significantly, with numerous cameras installed in various locations to enhance security and monitor unusual events. However, the effective detection and monitoring of violent incidents often depend on manual effort and time-consuming analysis of recorded footage, which can delay timely interventions. Deep learning has emerged as a powerful approach for extracting critical features essential to identifying and classifying violent behavior, enabling the development of accurate and scalable models across diverse domains. This study presents the Int.2D-3D-CNN architecture, which integrates a two-dimensional convolutional neural network (2D-CNN) and 3D-CNNs for video-based violence recognition. Compared to traditional 2D-CNN and 3D-CNN models, the proposed Int.2D-3D-CNN model presents improved performance on the Hockey Fight, Movie, and Violent Flows datasets. The architecture captures both static and dynamic characteristics of violent scenes by integrating spatial and temporal information. Specifically, the 2D-CNN component employs lightweight MobileNetV1 and MobileNetV2 to extract spatial features from individual frames, while a simplified 3D-CNN module with a single 3D convolution layer captures motion and temporal dependencies across sequences. Evaluation results highlight the robustness of the proposed model in accurately distinguishing violent from non-violent videos under diverse conditions. The Int.2D-3D-CNN model achieved accuracies of 98%, 100%, and 98% on the Hockey Fight, Movie, and Violent Flows datasets, respectively, indicating strong potential for violence recognition applications.
(This article belongs to the Special Issue Applications of Deep Learning and Convolutional Neural Network)

21 pages, 9749 KB  
Article
Enhanced Pose Estimation for Badminton Players via Improved YOLOv8-Pose with Efficient Local Attention
by Yijian Wu, Zewen Chen, Hongxing Zhang, Yulin Yang and Weichao Yi
Sensors 2025, 25(14), 4446; https://doi.org/10.3390/s25144446 - 17 Jul 2025
Cited by 4 | Viewed by 3912
Abstract
With the rapid development of sports analytics and artificial intelligence, accurate human pose estimation in badminton is becoming increasingly important. However, challenges such as the lack of domain-specific datasets and the complexity of athletes’ movements continue to hinder progress in this area. To address these issues, we propose an enhanced pose estimation framework tailored to badminton players, built upon an improved YOLOv8-Pose architecture. In particular, we introduce an efficient local attention (ELA) mechanism that effectively captures fine-grained spatial dependencies and contextual information, thereby significantly improving the keypoint localization accuracy and overall pose estimation performance. To support this study, we construct a dedicated badminton pose dataset comprising 4000 manually annotated samples, captured using a Microsoft Kinect v2 camera. The raw data undergo careful processing and refinement through a combination of depth-assisted annotation and visual inspection to ensure high-quality ground truth keypoints. Furthermore, we conduct an in-depth comparative analysis of multiple attention modules and their integration strategies within the network, offering generalizable insights to enhance pose estimation models in other sports domains. The experimental results show that the proposed ELA-enhanced YOLOv8-Pose model consistently achieves superior accuracy across multiple evaluation metrics, including the mean squared error (MSE), object keypoint similarity (OKS), and percentage of correct keypoints (PCK), highlighting its effectiveness and potential for broader applications in sports vision tasks.
(This article belongs to the Special Issue Computer Vision-Based Human Activity Recognition)
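The OKS metric cited above scores each predicted keypoint by a Gaussian of its distance to the ground truth, scaled by the object's area and a per-keypoint tolerance, then averages over visible keypoints. A minimal COCO-style sketch (illustrative signature; `kappas` are the per-keypoint falloff constants):

```python
import numpy as np

def oks(pred, gt, visible, area, kappas):
    """Object keypoint similarity for one instance (COCO-style formula)."""
    d2 = np.sum((np.asarray(pred, float) - np.asarray(gt, float)) ** 2, axis=1)
    k2 = np.asarray(kappas, float) ** 2
    e = d2 / (2.0 * area * k2 + 1e-12)    # area acts as the squared object scale
    vis = np.asarray(visible, dtype=bool)
    return float(np.mean(np.exp(-e)[vis]))  # average only over labeled keypoints
```

Perfectly localized keypoints give OKS 1.0, and the score decays smoothly as predictions drift, more slowly for keypoints with larger tolerance constants.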
