Search Results (1,732)

Search Parameters:
Keywords = video camera

21 pages, 4392 KiB  
Article
Visualization of Kinetic Parameters of a Droplet Nucleation Boiling on Smooth and Micro-Pillar Surfaces with Inclined Angles
by Yi-Nan Zhang, Guo-Qing Huang, Lu-Ming Zhao and Hong-Xia Chen
Energies 2025, 18(15), 4152; https://doi.org/10.3390/en18154152 - 5 Aug 2025
Abstract
The evaporation dynamics of droplets on smooth and inclined micro-pillar surfaces were experimentally investigated. The surface temperature was increased from 50 °C to 120 °C at inclination angles of 0°, 30°, 45°, and 60°. The dynamic parameters, including contact area, nucleation density, stable bubble diameter, and droplet asymmetry, were recorded using two high-speed video cameras, and the corresponding evaporation performance was analyzed. Experimental results showed that the inclination angle had a stronger influence on evaporation from micro-pillar surfaces than from smooth surfaces, and that the enhancement provided by the micro-pillars correlated positively with increasing inclination angle. This angular dependence arises from inclination-induced tail elongation and the corresponding asymmetry of the droplets. By defining a one-dimensional asymmetry factor (ε) and a volume asymmetry factor (γ), it was shown that although the asymmetric thickness of the droplets reduces the nucleation density and stable bubble diameter, the droplet asymmetry significantly increases the heat exchange area, resulting in a 37% improvement in the evaporation rate of micro-pillar surfaces and about a 15% increase in their enhancement relative to smooth surfaces as the inclination angle increased from 0° to 60°. These results indicate that asymmetry changes the heat transfer conditions, specifically through a significant increase in the wetted area and deformation of the liquid film, which are the direct enhancement mechanisms of inclined micro-pillar surfaces. Full article
(This article belongs to the Special Issue Advancements in Heat Transfer and Fluid Flow for Energy Applications)

23 pages, 3055 KiB  
Article
A Markerless Approach for Full-Body Biomechanics of Horses
by Sarah K. Shaffer, Omar Medjaouri, Brian Swenson, Travis Eliason and Daniel P. Nicolella
Animals 2025, 15(15), 2281; https://doi.org/10.3390/ani15152281 - 5 Aug 2025
Viewed by 77
Abstract
The ability to quantify equine kinematics is essential for clinical evaluation, research, and performance feedback. However, current methods are challenging to implement. This study presents a motion capture methodology for horses, where three-dimensional, full-body kinematics are calculated without instrumentation on the animal, offering a more scalable and labor-efficient approach when compared with traditional techniques. Kinematic trajectories are calculated from multi-camera video data. First, a neural network identifies skeletal landmarks (markers) in each camera view and the 3D location of each marker is triangulated. An equine biomechanics model is scaled to match the subject’s shape, using segment lengths defined by markers. Finally, inverse kinematics (IK) produces full kinematic trajectories. We test this methodology on a horse at three gaits. Multiple neural networks (NNs), trained on different equine datasets, were evaluated. All networks predicted over 78% of the markers within 25% of the length of the radius bone on test data. Root-mean-square-error (RMSE) between joint angles predicted via IK using ground truth marker-based motion capture data and network-predicted data was less than 10 degrees for 25 to 32 of 35 degrees of freedom, depending on the gait and data used for network training. NNs trained over a larger variety of data improved joint angle RMSE and curve similarity. Marker prediction error, the average distance between ground truth and predicted marker locations, and IK marker error, the distance between experimental and model markers, were used to assess network, scaling, and registration errors. The results demonstrate the potential of markerless motion capture for full-body equine kinematic analysis. Full article
(This article belongs to the Special Issue Advances in Equine Sports Medicine, Therapy and Rehabilitation)
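A step worth making concrete is the triangulation stage: each skeletal landmark detected in two or more calibrated views is lifted to 3D, typically with a direct linear transform (DLT). The sketch below is a generic version of that step under assumed inputs (calibrated 3x4 projection matrices and per-view pixel detections), not the authors' implementation.

```python
import numpy as np

def triangulate_dlt(proj_mats, pixels):
    """Triangulate one landmark from >=2 views via the direct linear transform.

    proj_mats : list of 3x4 camera projection matrices (intrinsics @ extrinsics)
    pixels    : list of (u, v) detections of the same landmark, one per view
    Returns the 3D point in world coordinates.
    """
    rows = []
    for P, (u, v) in zip(proj_mats, pixels):
        # Each view adds two linear constraints on the homogeneous point X:
        # u * (P[2] @ X) = P[0] @ X   and   v * (P[2] @ X) = P[1] @ X
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # Least-squares solution: right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

# Hypothetical usage: P1, P2, P3 are calibrated projection matrices and (u_i, v_i)
# are the network's detections of the same marker in each camera view.
# marker_xyz = triangulate_dlt([P1, P2, P3], [(u1, v1), (u2, v2), (u3, v3)])
```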

27 pages, 13385 KiB  
Article
In-Field Load Acquisitions on a Variable Chamber Round Baler Using Instrumented Hub Carriers and a Dynamometric Towing Pin
by Filippo Coppola, Andrea Ruffin and Giovanni Meneghetti
Appl. Sci. 2025, 15(15), 8579; https://doi.org/10.3390/app15158579 - 1 Aug 2025
Viewed by 121
Abstract
In this work, the load spectra acting in the vertical direction on the hub carriers and in the horizontal longitudinal direction on the drawbar of a trailed variable chamber round baler were evaluated. To this end, each hub carrier was instrumented with appropriately calibrated strain gauge bridges. Similarly, the baler was equipped with a dynamometric towing pin, instrumented with strain gauge sensors and calibrated in the laboratory, which replaced the original pin connecting the baler and the tractor during the in-field load acquisitions. In both cases, the calibration tests returned the relationship between applied forces and output signals of the strain gauge bridges. Multiple in-field load acquisitions were carried out under typical maneuvers and operating conditions. The synchronous acquisition of video from an onboard camera and of a Global Positioning System (GPS) signal made it possible to observe the behaviour of the baler in correspondence with particular trends in the vertical and horizontal loads and to identify the most demanding maneuver with respect to the fatigue resistance of the baler. Finally, through the application of a rainflow cycle counting algorithm according to ASTM E1049-85, the load spectrum for each maneuver was derived. Full article
(This article belongs to the Section Mechanical Engineering)
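As a rough illustration of the final step, a load spectrum can be derived from a measured force history with rainflow cycle counting; the sketch below assumes the open-source rainflow Python package, which implements the ASTM E1049-85 procedure, and uses a synthetic signal in place of the calibrated hub-carrier or towing-pin channel.

```python
import numpy as np
import rainflow  # pip install rainflow; ASTM E1049-85 rainflow counting

# Synthetic stand-in for one calibrated force channel (kN) sampled in the field.
t = np.linspace(0.0, 60.0, 6000)
load = (5.0 * np.sin(2 * np.pi * 0.8 * t)
        + 2.0 * np.sin(2 * np.pi * 4.0 * t)
        + np.random.normal(0.0, 0.3, t.size))

# Bin the counted cycles into a load spectrum of (load range, number of cycles).
for load_range, cycles in rainflow.count_cycles(load, binsize=1.0):
    print(f"{load_range:5.1f} kN range bin : {cycles:6.1f} cycles")
```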

23 pages, 4510 KiB  
Article
Identification and Characterization of Biosecurity Breaches on Poultry Farms with a Recent History of Highly Pathogenic Avian Influenza Virus Infection Determined by Video Camera Monitoring in the Netherlands
by Armin R. W. Elbers and José L. Gonzales
Pathogens 2025, 14(8), 751; https://doi.org/10.3390/pathogens14080751 - 30 Jul 2025
Viewed by 478
Abstract
Biosecurity measures applied on poultry farms with a recent history of highly pathogenic avian influenza virus infection were monitored using 24 h/7 days-per-week video monitoring. Definitions of biosecurity breaches were based on internationally acknowledged norms. Farms of four different production types (two broiler, two layer, two broiler breeder, and one duck farm) were selected. Observations of entry to and exit from the anteroom revealed a high degree of biosecurity breaches on six poultry farms, whereas one farm showed good biosecurity practices, strictly maintaining the separation between clean and potentially contaminated areas in the anteroom. Hand washing with soap and water and/or use of disinfectant lotion was rarely observed at entry to the anteroom and was almost absent at exit. Egg transporters did not disinfect forklift wheels when entering the egg-storage room, nor did they change or properly disinfect footwear. The egg-storage room was not cleaned and disinfected by the farmer after egg transport. Similarly, footwear and trolley wheels were not disinfected when young broilers or ducklings were introduced to the poultry unit. Biosecurity breaches were also observed when bedding material was introduced on the duck farm. This study shows a need for an engaging awareness and training campaign for poultry farmers and their co-workers, as well as for transporters, to promote good biosecurity practices. Full article

30 pages, 37977 KiB  
Article
Text-Guided Visual Representation Optimization for Sensor-Acquired Video Temporal Grounding
by Yun Tian, Xiaobo Guo, Jinsong Wang and Xinyue Liang
Sensors 2025, 25(15), 4704; https://doi.org/10.3390/s25154704 - 30 Jul 2025
Viewed by 266
Abstract
Video temporal grounding (VTG) aims to localize a semantically relevant temporal segment within an untrimmed video based on a natural language query. The task continues to face challenges arising from cross-modal semantic misalignment, which is largely attributed to redundant visual content in sensor-acquired video streams, linguistic ambiguity, and discrepancies in modality-specific representations. Most existing approaches rely on intra-modal feature modeling, processing video and text independently throughout the representation learning stage. However, this isolation undermines semantic alignment by neglecting the potential of cross-modal interactions. In practice, a natural language query typically corresponds to spatiotemporal content in video signals collected through camera-based sensing systems, encompassing a particular sequence of frames and its associated salient subregions. We propose a text-guided visual representation optimization framework tailored to enhance semantic interpretation over video signals captured by visual sensors. This framework leverages textual information to focus on spatiotemporal video content, thereby narrowing the cross-modal gap. Built upon the unified cross-modal embedding space provided by CLIP, our model leverages video data from sensing devices to structure representations and introduces two dedicated modules to semantically refine visual representations across spatial and temporal dimensions. First, we design a Spatial Visual Representation Optimization (SVRO) module to learn spatial information within individual frames. It selects salient patches related to the text, capturing finer-grained visual details. Second, we introduce a Temporal Visual Representation Optimization (TVRO) module to learn temporal relations across frames. A temporal triplet loss is employed in TVRO to enhance attention on text-relevant frames and capture clip semantics. Additionally, a self-supervised contrastive loss is introduced at the clip–text level to improve inter-clip discrimination by maximizing semantic variance during training. Experiments on the widely used Charades-STA, ActivityNet Captions, and TACoS benchmark datasets demonstrate that our method outperforms state-of-the-art methods across multiple metrics. Full article
(This article belongs to the Section Sensing and Imaging)
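The underlying cross-modal signal can be illustrated with the public CLIP weights: embed the query and a set of sampled frames in the shared space and keep the frames that score highest against the text. This is only a schematic of text-guided frame relevance, not the paper's SVRO/TVRO modules; the model name and inputs are assumptions.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def rank_frames_by_query(frames, query, top_k=8):
    """Score each frame (PIL.Image) against a text query; return top-k frame indices."""
    inputs = processor(text=[query], images=frames, return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    # Cosine similarity between the query embedding and every frame embedding.
    text_emb = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    img_emb = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    sims = (img_emb @ text_emb.T).squeeze(-1)          # one score per frame
    return sims.topk(min(top_k, len(frames))).indices.tolist()

# Hypothetical usage with frames sampled from an untrimmed video:
# frames = [Image.open(p) for p in sampled_frame_paths]
# keep = rank_frames_by_query(frames, "a person opens the refrigerator")
```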

22 pages, 554 KiB  
Systematic Review
Smart Homes: A Meta-Study on Sense of Security and Home Automation
by Carlos M. Torres-Hernandez, Mariano Garduño-Aparicio and Juvenal Rodriguez-Resendiz
Technologies 2025, 13(8), 320; https://doi.org/10.3390/technologies13080320 - 30 Jul 2025
Viewed by 466
Abstract
This review examines advancements in smart home security through the integration of home automation technologies. Various security systems, including surveillance cameras, smart locks, and motion sensors, are analyzed, highlighting their effectiveness in enhancing home security. These systems enable users to monitor and control their homes in real time, providing an additional layer of security. The review also considers how these security systems can enhance the quality of life for users by providing greater convenience and control over their domestic environment. The ability to receive instant alerts and access video recordings from anywhere allows users to respond quickly to unexpected situations, thereby increasing their sense of security and well-being. Additionally, the challenges and future trends in this field are addressed, emphasizing the importance of designing solutions that are intuitive and easy to use. As technology continues to evolve, it is crucial for developers and manufacturers to focus on creating products that seamlessly integrate into users' daily lives, facilitating their adoption and use. This comprehensive state-of-the-art review, based on the Scopus database, provides a detailed overview of the current status and future potential of smart home security systems. It highlights how ongoing innovation in this field can lead to the development of more advanced and efficient solutions that not only protect homes but also enhance the overall user experience. Full article
(This article belongs to the Special Issue Smart Systems (SmaSys2024))

16 pages, 5245 KiB  
Article
Automatic Detection of Foraging Hens in a Cage-Free Environment with Computer Vision Technology
by Samin Dahal, Xiao Yang, Bidur Paneru, Anjan Dhungana and Lilong Chai
Poultry 2025, 4(3), 34; https://doi.org/10.3390/poultry4030034 - 30 Jul 2025
Viewed by 227
Abstract
Foraging behavior in hens is an important indicator of animal welfare. It involves both the search for food and exploration of the environment, which provides necessary enrichment. In addition, it has been inversely linked to damaging behaviors such as severe feather pecking. Conventional studies rely on manual observation to investigate foraging location, duration, timing, and frequency. However, this approach is labor-intensive, time-consuming, and subject to human bias. Our study developed computer vision-based methods to automatically detect foraging hens in a cage-free research environment and compared their performance. A cage-free room was divided into four pens, two larger pens measuring 2.9 m × 2.3 m with 30 hens each and two smaller pens measuring 2.3 m × 1.8 m with 18 hens each. Cameras were positioned vertically, 2.75 m above the floor, recording the videos at 15 frames per second. Out of 4886 images, 70% were used for model training, 20% for validation, and 10% for testing. We trained multiple You Only Look Once (YOLO) object detection models from YOLOv9, YOLOv10, and YOLO11 series for 100 epochs each. All the models achieved precision, recall, and mean average precision at 0.5 intersection over union (mAP@0.5) above 75%. YOLOv9c achieved the highest precision (83.9%), YOLO11x achieved the highest recall (86.7%), and YOLO11m achieved the highest mAP@0.5 (89.5%). These results demonstrate the use of computer vision to automatically detect complex poultry behavior, such as foraging, making it more efficient. Full article
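For context, training and validating one of the listed detectors on a custom dataset follows the standard Ultralytics workflow; the checkpoint name, dataset YAML, and image size below are placeholders rather than the authors' exact configuration.

```python
from ultralytics import YOLO  # pip install ultralytics

# Placeholder dataset config: paths to train/val/test images and a single "hen" class.
model = YOLO("yolo11m.pt")                    # pretrained checkpoint as a starting point
model.train(data="foraging_hens.yaml", epochs=100, imgsz=640)

metrics = model.val()                         # precision, recall, mAP@0.5, mAP@0.5:0.95
print(metrics.box.map50)                      # mean average precision at IoU 0.5
```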

33 pages, 11684 KiB  
Article
Face Spoofing Detection with Stacking Ensembles in Work Time Registration System
by Rafał Klinowski and Mirosław Kordos
Appl. Sci. 2025, 15(15), 8402; https://doi.org/10.3390/app15158402 - 29 Jul 2025
Viewed by 139
Abstract
This paper introduces a passive face-authenticity detection system, designed for integration into an employee work time registration platform. The system is implemented as a stacking ensemble of multiple models. Each model independently assesses whether a camera is capturing a live human face or a spoofed representation, such as a photo or video. The ensemble comprises a convolutional neural network (CNN), a smartphone bezel-detection algorithm to identify faces displayed on electronic devices, a face context analysis module, and additional CNNs for image processing. The outputs of these models are aggregated by a neural network that delivers the final classification decision. We examined various combinations of models within the ensemble and compared the performance of our approach against existing methods through experimental evaluation. Full article
(This article belongs to the Special Issue Application of Artificial Intelligence in Image Processing)
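The stacking step itself amounts to training a small meta-model on the base models' outputs. A minimal scikit-learn sketch is shown below under assumed inputs (one score per base model per image); the scores and labels are synthetic placeholders, not the paper's models.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Assumed inputs: each row holds the scores of the base models for one face image
# (e.g., CNN liveness score, bezel-detector score, face-context score, ...),
# and y marks whether the face was live (1) or spoofed (0).
rng = np.random.default_rng(0)
X = rng.random((2000, 4))                  # placeholder base-model scores
y = (X.mean(axis=1) > 0.5).astype(int)     # placeholder labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# A small neural network aggregates the base-model outputs into the final decision.
meta = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
meta.fit(X_tr, y_tr)
print("held-out accuracy:", meta.score(X_te, y_te))
```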

18 pages, 4836 KiB  
Article
Deep Learning to Analyze Spatter and Melt Pool Behavior During Additive Manufacturing
by Deepak Gadde, Alaa Elwany and Yang Du
Metals 2025, 15(8), 840; https://doi.org/10.3390/met15080840 - 28 Jul 2025
Viewed by 459
Abstract
To capture the complex metallic spatter and melt pool behavior during the rapid interaction between the laser and metal material, high-speed cameras are applied to record the laser powder bed fusion process and generate a large volume of image data. In this study, four deep learning algorithms are applied: YOLOv5, Fast R-CNN, RetinaNet, and EfficientDet. They are trained on the recorded videos to learn and extract information on spatter and melt pool behavior during the laser powder bed fusion process. The well-trained models achieved high accuracy and low loss, demonstrating strong capability in accurately detecting and tracking spatter and melt pool dynamics. A stability index is proposed and calculated based on the melt pool length change rate; a greater index value reflects a more stable melt pool. We found that more spatters were detected for the unstable melt pool, while fewer spatters were found for the stable melt pool. The size of a spatter particle affects its initial ejection speed: large spatters are ejected slowly, while small spatters are ejected rapidly. In addition, more than 58% of detected spatters have their initial ejection angle in the range of 60–120°. These findings provide a better understanding of spatter and melt pool dynamics and behavior, uncover the influence of melt pool stability on spatter formation, and demonstrate the correlation between the spatter size and its initial ejection speed. This work will contribute to the extraction of important information from high-speed recorded videos for additive manufacturing to reduce waste, lower cost, enhance part quality, and increase process reliability. Full article
(This article belongs to the Special Issue Machine Learning in Metal Additive Manufacturing)
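The abstract defines the stability index only as being based on the melt pool length change rate, so the sketch below uses one plausible form (the inverse of the mean absolute change rate) purely for illustration; the formula and the sample lengths are assumptions, not the authors' definition.

```python
import numpy as np

def stability_index(pool_lengths, fps):
    """Illustrative stability index: inverse of the mean absolute melt pool length
    change rate, so steadier melt pools score higher. This is an assumed form,
    not the formula used in the paper."""
    lengths = np.asarray(pool_lengths, dtype=float)   # melt pool length per frame (um)
    change_rate = np.abs(np.diff(lengths)) * fps      # um per second between frames
    return 1.0 / (change_rate.mean() + 1e-9)

# Hypothetical per-frame melt pool lengths extracted by a detector from high-speed video.
print(stability_index([120, 122, 119, 121, 120, 123], fps=20000))
```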

15 pages, 2636 KiB  
Article
Chest Compression Skill Evaluation System Using Pose Estimation and Web-Based Application
by Ryota Watanabe, Jahidul Islam, Xin Zhu, Emiko Kaneko, Ken Iseki and Lei Jing
Appl. Sci. 2025, 15(15), 8252; https://doi.org/10.3390/app15158252 - 24 Jul 2025
Viewed by 296
Abstract
It is critical to provide life-sustaining treatment to out-of-hospital cardiac arrest (OHCA) patients before ambulance care arrives. However, incorrectly performed resuscitation maneuvers reduce the victims' chances of survival and recovery, so rescuers must train regularly and learn the correct technique. To facilitate regular chest compression training, this study aims to improve the accuracy of a chest compression evaluation system based on pose estimation and to develop a web application. To analyze and enhance accuracy, YOLOv8 pose estimation was used to measure compression depth, recoil, and tempo, and its accuracy was compared with that of a training manikin equipped with a built-in evaluation system. We conducted comparative experiments with different camera angles and heights to optimize the accuracy of the evaluation. The experimental results showed that an angle of 30 degrees and a height of 50 cm produced the best accuracy. For the web application, a system was designed that allows users to upload videos for analysis and obtain the corresponding compression parameters. A usability evaluation of the application confirmed its ease of use and accessibility, and positive feedback was obtained. In conclusion, these findings suggest that optimizing recording conditions significantly improves the accuracy of pose-based chest compression evaluation. Future work will focus on enhancing real-time feedback functionality and improving the user interface of the web application. Full article
(This article belongs to the Special Issue Machine Learning in Biomedical Applications)
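In principle, compression tempo and depth can be read off a tracked wrist keypoint: peaks in the vertical trajectory give the rate, and the peak-to-trough excursion (after pixel-to-centimetre calibration) gives the depth. The sketch below assumes per-frame wrist coordinates have already been extracted with a pose model such as YOLOv8-pose; the calibration factor and frame rate are placeholders.

```python
import numpy as np
from scipy.signal import find_peaks

def compression_metrics(wrist_y_px, fps, cm_per_px):
    """Estimate compression rate and depth from a wrist keypoint trajectory.

    wrist_y_px : per-frame vertical pixel coordinate of the wrist (image y grows downward)
    fps        : video frame rate
    cm_per_px  : calibration factor from a known reference length in the scene
    """
    y = np.asarray(wrist_y_px, dtype=float)
    peaks, _ = find_peaks(y, distance=int(0.25 * fps))     # bottom of each compression
    troughs, _ = find_peaks(-y, distance=int(0.25 * fps))  # top of each recoil
    duration_min = len(y) / fps / 60.0
    rate_per_min = len(peaks) / duration_min
    depth_cm = (np.mean(y[peaks]) - np.mean(y[troughs])) * cm_per_px
    return rate_per_min, depth_cm

# Hypothetical usage with keypoints from a pose model:
# rate, depth = compression_metrics(wrist_y, fps=30, cm_per_px=0.12)
```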

23 pages, 13739 KiB  
Article
Traffic Accident Rescue Action Recognition Method Based on Real-Time UAV Video
by Bo Yang, Jianan Lu, Tao Liu, Bixing Zhang, Chen Geng, Yan Tian and Siyu Zhang
Drones 2025, 9(8), 519; https://doi.org/10.3390/drones9080519 - 24 Jul 2025
Viewed by 427
Abstract
Low-altitude drones, which are unimpeded by traffic congestion or urban terrain, have become a critical asset in emergency rescue missions. To address the current lack of emergency rescue data, UAV aerial videos were collected to create an experimental dataset for action classification and localization annotation. A total of 5082 keyframes were labeled with 1–5 targets each, and 14,412 instances of data were prepared (including flight altitude and camera angles) for action classification and position annotation. To mitigate the challenges posed by high-resolution drone footage with excessive redundant information, we propose the SlowFast-Traffic (SF-T) framework, a spatio-temporal sequence-based algorithm for recognizing traffic accident rescue actions. For more efficient extraction of target–background correlation features, we introduce the Actor-Centric Relation Network (ACRN) module, which employs temporal max pooling to enhance the time-dimensional features of static backgrounds, significantly reducing redundancy-induced interference. Additionally, smaller ROI feature map outputs are adopted to boost computational speed. To tackle class imbalance in incident samples, we integrate a Class-Balanced Focal Loss (CB-Focal Loss) function, effectively resolving rare-action recognition in specific rescue scenarios. We replace the original Faster R-CNN with YOLOX-s to improve the target detection rate. On our proposed dataset, the SF-T model achieves a mean average precision (mAP) of 83.9%, which is 8.5% higher than that of the standard SlowFast architecture while maintaining a processing speed of 34.9 tasks/s. Both accuracy-related metrics and computational efficiency are substantially improved. The proposed method demonstrates strong robustness and real-time analysis capabilities for modern traffic rescue action recognition. Full article
(This article belongs to the Special Issue Cooperative Perception for Modern Transportation)
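The class-balanced focal loss referenced in the abstract reweights each class by the inverse of its effective number of samples and combines that weight with the focal term. The PyTorch sketch below follows the published formulation (Cui et al., 2019) rather than the authors' code; the per-class counts in the usage comment are made up.

```python
import torch
import torch.nn.functional as F

def cb_focal_loss(logits, targets, samples_per_class, beta=0.9999, gamma=2.0):
    """Class-Balanced Focal Loss for multi-class logits.

    logits            : (N, C) raw scores
    targets           : (N,) integer class labels
    samples_per_class : (C,) training-sample count per class
    """
    counts = torch.as_tensor(samples_per_class, dtype=torch.float, device=logits.device)
    # Effective number of samples per class: (1 - beta^n) / (1 - beta)
    effective_num = 1.0 - torch.pow(beta, counts)
    weights = (1.0 - beta) / effective_num
    weights = weights / weights.sum() * len(counts)       # normalize so weights average to 1

    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    pt = probs.gather(1, targets.unsqueeze(1)).squeeze(1)          # prob of the true class
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    focal = (1.0 - pt) ** gamma * (-log_pt)                        # focal term per sample
    return (weights[targets] * focal).mean()

# Hypothetical usage with made-up per-class counts:
# loss = cb_focal_loss(model_out, labels, samples_per_class=[5200, 310, 87, 1400, 900])
```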

27 pages, 6578 KiB  
Article
Evaluating Neural Radiance Fields for ADA-Compliant Sidewalk Assessments: A Comparative Study with LiDAR and Manual Methods
by Hang Du, Shuaizhou Wang, Linlin Zhang, Mark Amo-Boateng and Yaw Adu-Gyamfi
Infrastructures 2025, 10(8), 191; https://doi.org/10.3390/infrastructures10080191 - 22 Jul 2025
Viewed by 365
Abstract
An accurate assessment of sidewalk conditions is critical for ensuring compliance with the Americans with Disabilities Act (ADA), particularly to safeguard mobility for wheelchair users. This paper presents a novel 3D reconstruction framework based on neural radiance fields (NeRF), which utilizes monocular video input from consumer-grade cameras to generate high-fidelity 3D models of sidewalk environments. The framework enables automatic extraction of ADA-relevant geometric features, including the running slope, the cross slope, and vertical displacements, facilitating an efficient and scalable compliance assessment process. A comparative study is conducted across three surveying methods—manual measurements, LiDAR scanning, and the proposed NeRF-based approach—evaluated on four sidewalks and one curb ramp. Each method was assessed based on accuracy, cost, time, level of automation, and scalability. The NeRF-based approach achieved high agreement with LiDAR-derived ground truth, delivering an F1 score of 96.52%, a precision of 96.74%, and a recall of 96.34% for ADA compliance classification. These results underscore the potential of NeRF to serve as a cost-effective, automated alternative to traditional and LiDAR-based methods, with sufficient precision for widespread deployment in municipal sidewalk audits. Full article
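Once a metric 3D model of the sidewalk exists, the ADA-relevant quantities reduce to simple geometry: fit a plane to the walking surface and express its gradient along and across the direction of travel as percentages. The numpy sketch below is a generic version of that computation; the point cloud, travel direction, and the commonly cited limits (about 5% running slope and 2% cross slope) are assumptions, not the paper's pipeline.

```python
import numpy as np

def sidewalk_slopes(points, travel_dir):
    """Estimate running and cross slope (%) of a sidewalk patch.

    points     : (N, 3) metric x, y, z samples of the walking surface
    travel_dir : 2D vector (x, y) along the path of travel
    """
    pts = np.asarray(points, dtype=float)
    # Least-squares plane z = a*x + b*y + c fitted to the surface samples.
    A = np.c_[pts[:, 0], pts[:, 1], np.ones(len(pts))]
    (a, b, c), *_ = np.linalg.lstsq(A, pts[:, 2], rcond=None)
    grad = np.array([a, b])                       # surface gradient (rise per unit run)
    t = np.asarray(travel_dir, dtype=float)
    t = t / np.linalg.norm(t)
    n = np.array([-t[1], t[0]])                   # perpendicular (cross) direction
    running = abs(grad @ t) * 100.0
    cross = abs(grad @ n) * 100.0
    return running, cross

# Hypothetical check against commonly cited ADA limits (~5% running, ~2% cross):
# run_pct, cross_pct = sidewalk_slopes(cloud_xyz, travel_dir=(1.0, 0.0))
# compliant = run_pct <= 5.0 and cross_pct <= 2.083
```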

18 pages, 2545 KiB  
Article
Reliable Indoor Fire Detection Using Attention-Based 3D CNNs: A Fire Safety Engineering Perspective
by Mostafa M. E. H. Ali and Maryam Ghodrat
Fire 2025, 8(7), 285; https://doi.org/10.3390/fire8070285 - 21 Jul 2025
Viewed by 534
Abstract
Despite recent advances in deep learning for fire detection, much of the current research prioritizes model-centric metrics over dataset fidelity, particularly from a fire safety engineering perspective. Commonly used datasets are often dominated by fully developed flames, mislabel smoke-only frames as non-fire, or lack intra-video diversity due to redundant frames from limited sources. Some works treat smoke detection alone as early-stage detection, even though many fires (e.g., electrical or chemical) begin with visible flames and no smoke. Additionally, attempts to improve model applicability through mixed-context datasets—combining indoor, outdoor, and wildland scenes—often overlook the unique false alarm sources and detection challenges specific to each environment. To address these limitations, we curated a new video dataset comprising 1108 annotated fire and non-fire clips captured via indoor surveillance cameras. Unlike existing datasets, ours emphasizes early-stage fire dynamics (pre-flashover) and includes varied fire sources (e.g., sofa, cupboard, and attic fires), realistic false alarm triggers (e.g., flame-colored objects, artificial lighting), and a wide range of spatial layouts and illumination conditions. This collection enables robust training and benchmarking for early indoor fire detection. Using this dataset, we developed a spatiotemporal fire detection model based on the mixed convolutions ResNets (MC3_18) architecture, augmented with Convolutional Block Attention Modules (CBAM). The proposed model achieved 86.11% accuracy, 88.76% precision, and 84.04% recall, along with low false positive (11.63%) and false negative (15.96%) rates. Compared to its CBAM-free baseline, the model exhibits notable improvements in F1-score and interpretability, as confirmed by Grad-CAM++ visualizations highlighting attention to semantically meaningful fire features. These results demonstrate that effective early fire detection is inseparable from high-quality, context-specific datasets. Our work introduces a scalable, safety-driven approach that advances the development of reliable, interpretable, and deployment-ready fire detection systems for residential environments. Full article
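CBAM, the attention module named in the abstract, applies channel attention followed by spatial attention to a feature map. The PyTorch sketch below is the standard 2D form for illustration; the paper attaches CBAM to a 3D MC3_18 backbone, which is not reproduced here.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module (channel then spatial attention), 2D form."""
    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super().__init__()
        # Channel attention: shared MLP over average- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, kernel_size=1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1, bias=False),
        )
        # Spatial attention: 7x7 conv over channel-wise average and max maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size=spatial_kernel,
                                 padding=spatial_kernel // 2, bias=False)

    def forward(self, x):
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)                       # channel attention
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))             # spatial attention

# Hypothetical usage on a (N, C, H, W) feature map:
# attn = CBAM(channels=256); refined = attn(torch.randn(2, 256, 14, 14))
```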

28 pages, 8982 KiB  
Article
Decision-Level Multi-Sensor Fusion to Improve Limitations of Single-Camera-Based CNN Classification in Precision Farming: Application in Weed Detection
by Md. Nazmuzzaman Khan, Adibuzzaman Rahi, Mohammad Al Hasan and Sohel Anwar
Computation 2025, 13(7), 174; https://doi.org/10.3390/computation13070174 - 18 Jul 2025
Viewed by 317
Abstract
The United States leads the world in corn production and consumption, with an estimated value of USD 50 billion per year. There is a pressing need for the development of novel and efficient techniques aimed at enhancing the identification and eradication of weeds in a manner that is both environmentally sustainable and economically advantageous. Weed classification for autonomous agricultural robots is a challenging task for a single-camera-based system due to noise, vibration, and occlusion. To address this issue, we present a multi-camera-based system with decision-level sensor fusion to improve on the limitations of a single-camera-based system. This study uses a convolutional neural network (CNN) pre-trained on the ImageNet dataset, which was subsequently re-trained on a limited weed dataset to classify three weed species frequently encountered in corn fields: Xanthium strumarium (Common Cocklebur), Amaranthus retroflexus (Redroot Pigweed), and Ambrosia trifida (Giant Ragweed). The test results showed that the re-trained VGG16 with a transfer-learning-based classifier exhibited acceptable accuracy (99% training, 97% validation, 94% testing accuracy), and its inference time for weed classification from the video feed was suitable for real-time implementation. However, the accuracy of CNN-based classification from a single-camera video feed was found to deteriorate due to noise, vibration, and partial occlusion of weeds, and test results show that single-camera classification is not always reliable enough for the spray system of an agricultural robot (AgBot). To improve the accuracy of the weed classification system and to overcome the shortcomings of single-sensor-based classification from a CNN, an improved Dempster–Shafer (DS)-based decision-level multi-sensor fusion algorithm was developed and implemented. The proposed algorithm improves on CNN-based weed classification when the weed is partially occluded. It can also detect whether a sensor within an array of sensors is faulty and improves the overall classification accuracy by penalizing the evidence from a faulty sensor. Overall, the proposed fusion algorithm showed robust results in challenging scenarios, overcoming the limitations of a single-sensor-based system. Full article
(This article belongs to the Special Issue Moving Object Detection Using Computational Methods and Modeling)
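Dempster's rule of combination, which underlies the fusion step, can be illustrated for simple mass functions from two cameras: multiply agreeing masses, treat disagreeing ones as conflict K, and renormalize by 1 - K. This is a textbook sketch with made-up numbers, not the authors' modified algorithm with faulty-sensor penalization.

```python
def combine_dempster(m1, m2):
    """Combine two mass functions over the same singleton hypotheses (e.g. weed classes).

    m1, m2 : dicts mapping class name -> belief mass; any mass not assigned to a class
             is treated as mass on the full frame of discernment ("unknown").
    """
    classes = set(m1) | set(m2)
    theta1 = 1.0 - sum(m1.values())          # residual "unknown" mass for sensor 1
    theta2 = 1.0 - sum(m2.values())
    combined = {}
    conflict = 0.0
    for a in classes:
        for b in classes:
            if a == b:
                combined[a] = combined.get(a, 0.0) + m1.get(a, 0.0) * m2.get(b, 0.0)
            else:
                conflict += m1.get(a, 0.0) * m2.get(b, 0.0)
    # Mass one sensor assigns to "unknown" supports whatever the other sensor says.
    for a in classes:
        combined[a] = combined.get(a, 0.0) + m1.get(a, 0.0) * theta2 + theta1 * m2.get(a, 0.0)
    norm = 1.0 - conflict
    return {a: v / norm for a, v in combined.items()}

# Two cameras, one partially occluded: the fused belief favours the agreeing class.
cam1 = {"cocklebur": 0.7, "pigweed": 0.2, "ragweed": 0.1}
cam2 = {"cocklebur": 0.4, "pigweed": 0.1, "ragweed": 0.1}   # 0.4 left as "unknown"
print(combine_dempster(cam1, cam2))
```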

18 pages, 7391 KiB  
Article
Reliable QoE Prediction in IMVCAs Using an LMM-Based Agent
by Michael Sidorov, Tamir Berger, Jonathan Sterenson, Raz Birman and Ofer Hadar
Sensors 2025, 25(14), 4450; https://doi.org/10.3390/s25144450 - 17 Jul 2025
Viewed by 291
Abstract
Face-to-face interaction is one of the most natural forms of human communication. Unsurprisingly, Video Conferencing (VC) Applications have experienced a significant rise in demand over the past decade. With the widespread availability of cellular devices equipped with high-resolution cameras, Instant Messaging Video Call Applications (IMVCAs) now constitute a substantial portion of VC communications. Given the multitude of IMVCA options, maintaining a high Quality of Experience (QoE) is critical. While content providers can measure QoE directly through end-to-end connections, Internet Service Providers (ISPs) must infer QoE indirectly from network traffic—a non-trivial task, especially when most traffic is encrypted. In this paper, we analyze a large dataset collected from WhatsApp IMVCA, comprising over 25,000 s of VC sessions. We apply four Machine Learning (ML) algorithms and a Large Multimodal Model (LMM)-based agent, achieving mean errors of 4.61%, 5.36%, and 13.24% for three popular QoE metrics: BRISQUE, PIQE, and FPS, respectively. Full article
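As a hedged sketch of the ISP-side problem the paper addresses, a QoE metric such as FPS can be regressed from features computable on encrypted traffic; the feature names, synthetic data, and model choice below are illustrative assumptions, not the paper's ML pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_percentage_error

# Illustrative per-second traffic features an ISP could compute without decrypting payloads:
# [throughput_kbps, mean_packet_size, packet_rate, inter_arrival_jitter_ms]
rng = np.random.default_rng(1)
X = rng.random((5000, 4)) * [2000, 1200, 400, 50]           # placeholder feature matrix
fps = 5 + 25 * (X[:, 0] / 2000) + rng.normal(0, 1, 5000)    # placeholder FPS target

X_tr, X_te, y_tr, y_te = train_test_split(X, fps, test_size=0.2, random_state=1)
model = RandomForestRegressor(n_estimators=200, random_state=1).fit(X_tr, y_tr)
print("mean error:", mean_absolute_percentage_error(y_te, model.predict(X_te)))
```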
