Search Results (5,963)

Search Parameters:
Keywords = video based

29 pages, 1483 KiB  
Article
Empowering Independence for Visually Impaired Museum Visitors Through Enhanced Accessibility
by Theresa Zaher Nasser, Tsvi Kuflik and Alexandra Danial-Saad
Sensors 2025, 25(15), 4811; https://doi.org/10.3390/s25154811 - 5 Aug 2025
Abstract
Museums serve as essential cultural centers, yet their mostly visual exhibits restrict access for blind and partially sighted (BPS) individuals. While recent technological advances have started to bridge this gap, many accessibility solutions focus mainly on basic inclusion rather than promoting independent exploration. This research addresses this limitation by creating features that enable visitors’ independence through customizable interaction patterns and self-paced exploration. It improved upon existing interactive tangible user interfaces (ITUIs) by enhancing their audio content and adding more flexible user control options. A mixed-methods approach evaluated the ITUI’s usability, ability to be used independently, and user satisfaction. Quantitative data were gathered using ITUI-specific satisfaction, usability, comparison, and general preference scales, while insights were obtained through notes taken during a think-aloud protocol as participants interacted with the ITUIs, direct observation, and analysis of video recordings of the experiment. The results showed a strong preference for a Pushbutton-based ITUI, which scored highest in usability (M = 87.5), perceived independence (72%), and user control (76%). Participants stressed the importance of tactile interaction, clear feedback, and customizable audio features like volume and playback speed. These findings underscore the vital role of user control and precise feedback in designing accessible museum experiences. Full article

23 pages, 3055 KiB  
Article
A Markerless Approach for Full-Body Biomechanics of Horses
by Sarah K. Shaffer, Omar Medjaouri, Brian Swenson, Travis Eliason and Daniel P. Nicolella
Animals 2025, 15(15), 2281; https://doi.org/10.3390/ani15152281 - 5 Aug 2025
Abstract
The ability to quantify equine kinematics is essential for clinical evaluation, research, and performance feedback. However, current methods are challenging to implement. This study presents a motion capture methodology for horses, where three-dimensional, full-body kinematics are calculated without instrumentation on the animal, offering a more scalable and labor-efficient approach when compared with traditional techniques. Kinematic trajectories are calculated from multi-camera video data. First, a neural network identifies skeletal landmarks (markers) in each camera view and the 3D location of each marker is triangulated. An equine biomechanics model is scaled to match the subject’s shape, using segment lengths defined by markers. Finally, inverse kinematics (IK) produces full kinematic trajectories. We test this methodology on a horse at three gaits. Multiple neural networks (NNs), trained on different equine datasets, were evaluated. All networks predicted over 78% of the markers within 25% of the length of the radius bone on test data. Root-mean-square-error (RMSE) between joint angles predicted via IK using ground truth marker-based motion capture data and network-predicted data was less than 10 degrees for 25 to 32 of 35 degrees of freedom, depending on the gait and data used for network training. NNs trained over a larger variety of data improved joint angle RMSE and curve similarity. Marker prediction error, the average distance between ground truth and predicted marker locations, and IK marker error, the distance between experimental and model markers, were used to assess network, scaling, and registration errors. The results demonstrate the potential of markerless motion capture for full-body equine kinematic analysis. Full article
(This article belongs to the Special Issue Advances in Equine Sports Medicine, Therapy and Rehabilitation)
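The joint-angle criterion reported in this abstract (RMSE under 10 degrees between marker-based and network-predicted trajectories) reduces to a short calculation. A minimal sketch; the trajectories below are hypothetical, not the study's data:

```python
import math

def joint_angle_rmse(truth, pred):
    """Root-mean-square error between two joint-angle trajectories (degrees)."""
    assert len(truth) == len(pred)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))

# Hypothetical hip-flexion angles (degrees) over 5 frames: marker-based
# ground truth vs. network-predicted values fed through inverse kinematics.
truth = [10.0, 12.0, 15.0, 13.0, 11.0]
pred = [11.0, 13.0, 14.0, 13.0, 10.0]
rmse = joint_angle_rmse(truth, pred)
within_threshold = rmse < 10.0  # the abstract's acceptance criterion
```

In the study this comparison is made per degree of freedom (35 in total), with most falling under the 10-degree threshold.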

19 pages, 1010 KiB  
Article
Online Video Streaming from the Perspective of Transaction Cost Economics
by Amit Malhan, Pankaj Chaudhary and Robert Pavur
J. Theor. Appl. Electron. Commer. Res. 2025, 20(3), 199; https://doi.org/10.3390/jtaer20030199 - 4 Aug 2025
Abstract
In recent years, online streaming has encountered the challenge of retaining its user base. This study considers the role of transaction cost economics theory in consumer choices to continue subscribing. Participants respond to their top three streaming services, resulting in 797 responses, accounting for multiple selections by each respondent. Respondents could choose their top three services from a list of Netflix, Disney, Hulu, Amazon Prime Video, HBO Max, and Apple TV+. The study’s conclusions highlight the impact of uncertainty, a negative measure of streaming quality, on online subscription-based video streaming. Additionally, asset specificity, reflecting uniqueness and exclusive content, is found to be positively related to continuing a subscription. This research distinguishes itself by examining individuals who are already subscribers to provide insights and guidance through the lens of Transaction Cost Economics, to help marketing professionals seeking a deeper understanding of consumer behavior in the online streaming landscape. Full article

19 pages, 1109 KiB  
Article
User Preference-Based Dynamic Optimization of Quality of Experience for Adaptive Video Streaming
by Zixuan Feng, Yazhi Liu and Hao Zhang
Electronics 2025, 14(15), 3103; https://doi.org/10.3390/electronics14153103 - 4 Aug 2025
Abstract
With the rapid development of video streaming services, adaptive bitrate (ABR) algorithms have become a core technology for ensuring optimal viewing experiences. Traditional ABR strategies, predominantly rule-based or reinforcement learning-driven, typically employ uniform quality assessment metrics that overlook users’ subjective preference differences regarding factors such as video quality and stalling. To address this limitation, this paper proposes an adaptive video bitrate selection system that integrates preference modeling with reinforcement learning. By incorporating a preference learning module, the system models and scores user viewing trajectories, using these scores to replace conventional rewards and guide the training of the Proximal Policy Optimization (PPO) algorithm, thereby achieving policy optimization that better aligns with users’ perceived experiences. Simulation results on DASH network bandwidth traces demonstrate that the proposed optimization method improves overall Quality of Experience (QoE) by over 9% compared to other mainstream algorithms. Full article
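The paper replaces a hand-crafted reward with learned preference scores. As a reference point, the conventional QoE reward that preference-based approaches compete with is typically a linear combination of chunk quality, rebuffering, and quality switches. A sketch of that baseline formulation (the weights are illustrative, not the paper's):

```python
def linear_qoe(bitrates, rebuffer_times, mu=4.3, tau=1.0):
    """Classic linear QoE for ABR: total chunk quality minus a rebuffering
    penalty (weight mu) and a quality-switching smoothness penalty (weight tau).
    bitrates and rebuffer_times are per-chunk values (Mbps, seconds)."""
    quality = sum(bitrates)
    rebuffer_penalty = mu * sum(rebuffer_times)
    smoothness_penalty = tau * sum(
        abs(b1 - b0) for b0, b1 in zip(bitrates, bitrates[1:]))
    return quality - rebuffer_penalty - smoothness_penalty

# Four chunks: one downswitch to 1.5 Mbps and a 0.5 s stall
score = linear_qoe([3.0, 3.0, 1.5, 3.0], [0.0, 0.5, 0.0, 0.0])
```

A preference-learning module, as described in the abstract, would instead score whole viewing trajectories and use those scores as the PPO reward, so that the trade-off between quality and stalling reflects individual users rather than fixed weights.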

10 pages, 903 KiB  
Article
Gender Differences in Visual Information Perception Ability: A Signal Detection Theory Approach
by Yejin Lee and Kwangtae Jung
Appl. Sci. 2025, 15(15), 8621; https://doi.org/10.3390/app15158621 - 4 Aug 2025
Abstract
The accurate perception of visual stimuli in human–machine systems is crucial for improving system safety, usability, and task performance. The widespread adoption of digital technology has significantly increased the importance of visual interfaces and information. Therefore, it is essential to design visual interfaces and information with user characteristics in mind to ensure accurate perception of visual information. This study employed the Cognitive Perceptual Assessment for Driving (CPAD) to evaluate and compare gender differences in the ability to perceive visual signals within complex visual stimuli. The experimental setup included a computer with CPAD installed, along with a touch monitor, mouse, joystick, and keyboard. The participants included 11 male and 20 female students, with an average age of 22 for males and 21 for females. Prior to the experiment, participants were instructed to determine whether a signal stimulus was present: if a square, presented as the signal, was included in the visual stimulus, they moved the joystick to the left; otherwise, they moved it to the right. Each participant performed a total of 40 trials. The entire experiment was recorded on video to measure overall response times. The experiment measured the number of correct detections of signal presence, response times, the number of misses (failing to detect the signal when present), and false alarms (detecting the signal when absent). The analysis of experimental data revealed no significant differences in perceptual ability or response times for visual stimuli between genders. However, males demonstrated slightly superior perceptual ability and marginally shorter response times compared to females. Analyses of sensitivity and response bias, based on signal detection theory, also indicated a slightly higher perceptual ability in males. In conclusion, although these differences were not statistically significant, males demonstrated a slightly better perception ability for visual stimuli. The findings of this study can inform the design of information, user interfaces, and visual displays in human–machine systems, particularly in light of the recent trend of increased female participation in the industrial sector. Future research will focus on diverse types of visual information to further validate these findings. Full article
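The sensitivity and response-bias analyses mentioned in the abstract follow standard signal detection theory: sensitivity d′ is the difference of the z-transformed hit and false-alarm rates, and criterion c is minus their average. A minimal sketch using only the Python standard library (the trial counts are hypothetical, not the study's data):

```python
from statistics import NormalDist

def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Sensitivity d' and response bias c for a yes/no detection task,
    per the standard signal detection theory formulas."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -(z(hit_rate) + z(fa_rate)) / 2
    return d_prime, criterion

# Hypothetical counts from 40 trials (20 signal-present, 20 signal-absent)
d, c = sdt_measures(hits=18, misses=2, false_alarms=3, correct_rejections=17)
```

Extreme rates (0 or 1) make the z-transform diverge; in practice a log-linear or 1/(2N) correction is applied before transforming.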

16 pages, 612 KiB  
Article
Examination of Step Kinematics Between Children with Different Acceleration Patterns in Short-Sprint Dash
by Ilias Keskinis, Vassilios Panoutsakopoulos, Evangelia Merkou, Savvas Lazaridis and Eleni Bassa
Biomechanics 2025, 5(3), 60; https://doi.org/10.3390/biomechanics5030060 - 4 Aug 2025
Abstract
Background/Objectives: Sprinting is a fundamental locomotor skill and a key indicator of lower limb strength and anaerobic power in early childhood. The aim of the study was to examine possible differences in the step kinematic parameters and their contribution to sprint speed between children with different patterns of speed development. Methods: 65 prepubescent male and female track athletes (33 males and 32 females; 6.9 ± 0.8 years old) were examined in a maximal 15 m short sprint running test, where photocells measured time for each 5 m segment. At the last 5 m segment, step length, frequency, and velocity were evaluated via a video analysis method. The symmetry angle was calculated for the examined step kinematic parameters. Results: Based on the speed at the final 5 m segment of the test, two groups were identified, the maximum sprint phase (MAX) and the acceleration phase (ACC) group. Speed was significantly (p < 0.05) higher in ACC in the final 5 m segment, while there was a significant (p < 0.05) interrelationship between step length and frequency in ACC but not in MAX. No other differences were observed. Conclusions: The difference observed in the interrelationship between speed and step kinematic parameters between ACC and MAX highlights the importance of identifying the speed development pattern to apply individualized training stimuli for the optimization of training that can lead to better conditioning and wellbeing of children involved in sports with requirements for short-sprint actions. Full article
(This article belongs to the Collection Locomotion Biomechanics and Motor Control)

12 pages, 480 KiB  
Article
A Novel Deep Learning Model for Predicting Colorectal Anastomotic Leakage: A Pioneer Multicenter Transatlantic Study
by Miguel Mascarenhas, Francisco Mendes, Filipa Fonseca, Eduardo Carvalho, Andre Santos, Daniela Cavadas, Guilherme Barbosa, Antonio Pinto da Costa, Miguel Martins, Abdullah Bunaiyan, Maísa Vasconcelos, Marley Ribeiro Feitosa, Shay Willoughby, Shakil Ahmed, Muhammad Ahsan Javed, Nilza Ramião, Guilherme Macedo and Manuel Limbert
J. Clin. Med. 2025, 14(15), 5462; https://doi.org/10.3390/jcm14155462 - 3 Aug 2025
Abstract
Background/Objectives: Colorectal anastomotic leak (CAL) is one of the most severe postoperative complications in colorectal surgery, impacting patient morbidity and mortality. Current risk assessment methods rely on clinical and intraoperative factors, but no real-time predictive tool exists. This study aimed to develop an artificial intelligence model based on intraoperative laparoscopic recording of the anastomosis for CAL prediction. Methods: A convolutional neural network (CNN) was trained with annotated frames from colorectal surgery videos across three international high-volume centers (Instituto Português de Oncologia de Lisboa, Hospital das Clínicas de Ribeirão Preto, and Royal Liverpool University Hospital). The dataset included a total of 5356 frames from 26 patients, 2007 with CAL and 3349 showing normal anastomosis. Four CNN architectures (EfficientNetB0, EfficientNetB7, ResNet50, and MobileNetV2) were tested. The models’ performance was evaluated using their sensitivity, specificity, accuracy, and area under the receiver operating characteristic (AUROC) curve. Heatmaps were generated to identify key image regions influencing predictions. Results: The best-performing model achieved an accuracy of 99.6%, AUROC of 99.6%, sensitivity of 99.2%, specificity of 100.0%, PPV of 100.0%, and NPV of 98.9%. The model reliably identified CAL-positive frames and provided visual explanations through heatmaps. Conclusions: To our knowledge, this is the first AI model developed to predict CAL using intraoperative video analysis. Its accuracy suggests the potential to redefine surgical decision-making by providing real-time risk assessment. Further refinement with a larger dataset and diverse surgical techniques could enable intraoperative interventions to prevent CAL before it occurs, marking a paradigm shift in colorectal surgery. Full article
(This article belongs to the Special Issue Updates in Digestive Diseases and Endoscopy)
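The sensitivity, specificity, PPV, and NPV figures reported above all derive from the four confusion-matrix counts of frame-level classification. A minimal sketch of those definitions (the counts below are hypothetical, chosen only to illustrate the arithmetic, not the paper's test-split data):

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts:
    tp/fn over CAL-positive frames, tn/fp over normal-anastomosis frames."""
    return {
        "sensitivity": tp / (tp + fn),   # recall on leak-positive frames
        "specificity": tn / (tn + fp),   # recall on normal frames
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts for a held-out frame set
m = diagnostic_metrics(tp=198, fp=0, tn=335, fn=2)
```

With zero false positives, specificity and PPV are exactly 1.0, mirroring the pattern of the reported results (specificity and PPV of 100.0%).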

26 pages, 18583 KiB  
Article
Transforming Pedagogical Practices and Teacher Identity Through Multimodal (Inter)action Analysis: A Case Study of Novice EFL Teachers in China
by Jing Zhou, Chengfei Li and Yan Cheng
Behav. Sci. 2025, 15(8), 1050; https://doi.org/10.3390/bs15081050 - 3 Aug 2025
Abstract
This study investigates the evolving pedagogical strategies and professional identity development of two novice college English teachers in China through a semester-long classroom-based inquiry. Drawing on Norris’s Multimodal (Inter)action Analysis (MIA), it analyzes 270 min of video-recorded lessons across three instructional stages, supported by visual transcripts and pitch-intensity spectrograms. The analysis reveals each teacher’s transformation from textbook-reliant instruction to student-centered pedagogy, facilitated by multimodal strategies such as gaze, vocal pitch, gesture, and head movement. These shifts unfold across the following three evolving identity configurations: compliance, experimentation, and dialogic enactment. Rather than following a linear path, identity development is shown as a negotiated process shaped by institutional demands and classroom interactional realities. By foregrounding the multimodal enactment of self in a non-Western educational context, this study offers insights into how novice EFL teachers navigate tensions between traditional discourse norms and reform-driven pedagogical expectations, contributing to broader understandings of identity formation in global higher education. Full article

24 pages, 1751 KiB  
Article
Robust JND-Guided Video Watermarking via Adaptive Block Selection and Temporal Redundancy
by Antonio Cedillo-Hernandez, Lydia Velazquez-Garcia, Manuel Cedillo-Hernandez, Ismael Dominguez-Jimenez and David Conchouso-Gonzalez
Mathematics 2025, 13(15), 2493; https://doi.org/10.3390/math13152493 - 3 Aug 2025
Abstract
This paper introduces a robust and imperceptible video watermarking framework designed for blind extraction in dynamic video environments. The proposed method operates in the spatial domain and combines multiscale perceptual analysis, adaptive Just Noticeable Difference (JND)-based quantization, and temporal redundancy via multiframe embedding. Watermark bits are embedded selectively in blocks with high perceptual masking using a QIM strategy, and the corresponding DCT coefficients are estimated directly from the spatial domain to reduce complexity. To enhance resilience, each bit is redundantly inserted across multiple keyframes selected based on scene transitions. Extensive simulations over 21 benchmark videos (CIF, 4CIF, HD) validate that the method achieves superior performance in robustness and perceptual quality, with an average Bit Error Rate (BER) of 1.03%, PSNR of 50.1 dB, SSIM of 0.996, and VMAF of 97.3 under compression, noise, cropping, and temporal desynchronization. The system outperforms several recent state-of-the-art techniques in both quality and speed, requiring no access to the original video during extraction. These results confirm the method’s viability for practical applications such as copyright protection and secure video streaming. Full article
(This article belongs to the Section E: Applied Mathematics)
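The temporal redundancy described in this abstract, where each watermark bit is embedded in several keyframes and recovered by combining the per-frame extractions, can be sketched as a majority vote followed by the Bit Error Rate (BER) computation used in the evaluation. The QIM embedding itself is omitted; the bit vectors are illustrative:

```python
def majority_vote_bits(copies):
    """Recover each watermark bit by majority vote across redundant
    extractions of the same bit string from different keyframes."""
    n_bits = len(copies[0])
    return [int(sum(c[i] for c in copies) > len(copies) / 2)
            for i in range(n_bits)]

def bit_error_rate(original, extracted):
    """Fraction of recovered bits that differ from the embedded watermark."""
    errors = sum(o != e for o, e in zip(original, extracted))
    return errors / len(original)

watermark = [1, 0, 1, 1, 0, 0, 1, 0]
# Three noisy extractions of the same watermark from different keyframes
copies = [
    [1, 0, 1, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 0, 1, 0],  # bit 2 flipped by an attack
    [1, 0, 1, 1, 0, 1, 1, 0],  # bit 5 flipped by an attack
]
recovered = majority_vote_bits(copies)
ber = bit_error_rate(watermark, recovered)
```

A single corrupted extraction here yields a nonzero BER, but the vote across three keyframes recovers the watermark exactly, which is the rationale for multiframe embedding.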

21 pages, 4252 KiB  
Article
AnimalAI: An Open-Source Web Platform for Automated Animal Activity Index Calculation Using Interactive Deep Learning Segmentation
by Mahtab Saeidifar, Guoming Li, Lakshmish Macheeri Ramaswamy, Chongxiao Chen and Ehsan Asali
Animals 2025, 15(15), 2269; https://doi.org/10.3390/ani15152269 - 3 Aug 2025
Abstract
Monitoring the activity index of animals is crucial for assessing their welfare and behavior patterns. However, traditional methods for calculating the activity index, such as pixel intensity differencing of entire frames, are found to suffer from significant interference and noise, leading to inaccurate results. These classical approaches also do not support group or individual tracking in a user-friendly way, and no open-access platform exists for non-technical researchers. This study introduces an open-source web-based platform that allows researchers to calculate the activity index from top-view videos by selecting individual or group animals. It integrates Segment Anything Model2 (SAM2), a promptable deep learning segmentation model, to track animals without additional training or annotation. The platform accurately tracked Cobb 500 male broilers from weeks 1 to 7 with a 100% success rate, IoU of 92.21% ± 0.012, precision of 93.87% ± 0.019, recall of 98.15% ± 0.011, and F1 score of 95.94% ± 0.006, based on 1157 chickens. Statistical analysis showed that tracking 80% of birds in week 1, 60% in week 4, and 40% in week 7 was sufficient (r ≥ 0.90; p ≤ 0.048) to represent the group activity in respective ages. This platform offers a practical, accessible solution for activity tracking, supporting animal behavior analytics with minimal effort. Full article
(This article belongs to the Section Animal Welfare)
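The activity index concept in this abstract is, at its core, frame-to-frame pixel-intensity differencing; the platform's contribution is restricting the differencing to segmented animals rather than whole frames. A toy sketch of that contrast on 2x2 grayscale "frames" (values and mask are illustrative):

```python
def activity_index(frame_a, frame_b, mask=None):
    """Mean absolute pixel-intensity difference between consecutive frames.
    With a segmentation mask, only masked (animal) pixels contribute,
    excluding background interference from the measurement."""
    total, count = 0, 0
    for r in range(len(frame_a)):
        for c in range(len(frame_a[0])):
            if mask is None or mask[r][c]:
                total += abs(frame_a[r][c] - frame_b[r][c])
                count += 1
    return total / count

f1 = [[10, 10], [200, 10]]
f2 = [[10, 30], [120, 10]]          # animal moved; one background pixel flickered
bird_mask = [[0, 0], [1, 0]]        # segmentation: only bottom-left is the animal
whole_frame = activity_index(f1, f2)            # diluted by static background
masked = activity_index(f1, f2, bird_mask)      # attributes motion to the animal
```

In the platform, the mask would come from SAM2 segmentations of individual or grouped birds rather than being hand-specified.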

10 pages, 1055 KiB  
Article
Artificial Intelligence and Hysteroscopy: A Multicentric Study on Automated Classification of Pleomorphic Lesions
by Miguel Mascarenhas, Carla Peixoto, Ricardo Freire, Joao Cavaco Gomes, Pedro Cardoso, Inês Castro, Miguel Martins, Francisco Mendes, Joana Mota, Maria João Almeida, Fabiana Silva, Luis Gutierres, Bruno Mendes, João Ferreira, Teresa Mascarenhas and Rosa Zulmira
Cancers 2025, 17(15), 2559; https://doi.org/10.3390/cancers17152559 - 3 Aug 2025
Abstract
Background/Objectives: The integration of artificial intelligence (AI) in medical imaging is rapidly advancing, yet its application in gynecologic use remains limited. This proof-of-concept study presents the development and validation of a convolutional neural network (CNN) designed to automatically detect and classify endometrial polyps. Methods: A multicenter dataset (n = 3) comprising 65 hysteroscopies was used, yielding 33,239 frames and 37,512 annotated objects. Still frames were extracted from full-length videos and annotated for the presence of histologically confirmed polyps. A YOLOv1-based object detection model was used with a 70–20–10 split for training, validation, and testing. Primary performance metrics included recall, precision, and mean average precision at an intersection over union (IoU) ≥ 0.50 (mAP50). Frame-level classification metrics were also computed to evaluate clinical applicability. Results: The model achieved a recall of 0.96 and precision of 0.95 for polyp detection, with a mAP50 of 0.98. At the frame level, mean recall was 0.75, precision 0.98, and F1 score 0.82, confirming high detection and classification performance. Conclusions: This study presents a CNN trained on multicenter, real-world data that detects and classifies polyps simultaneously with high diagnostic and localization performance, supported by explainable AI features that enhance its clinical integration and technological readiness. Although currently limited to binary classification, this study demonstrates the feasibility and potential of AI to reduce diagnostic subjectivity and inter-observer variability in hysteroscopy. Future work will focus on expanding the model’s capabilities to classify a broader range of endometrial pathologies, enhance generalizability, and validate performance in real-time clinical settings. Full article
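The mAP50 figure above counts a detection as correct when its intersection over union (IoU) with the ground-truth box is at least 0.50. A minimal sketch of that criterion (the box coordinates are hypothetical):

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Hypothetical ground-truth vs. predicted polyp bounding boxes (pixels)
gt = (100, 100, 200, 200)
pred = (120, 110, 210, 205)
iou = box_iou(gt, pred)
hit = iou >= 0.50  # counted as a true positive at the mAP50 threshold
```

Precision and recall are then computed over these hit/miss decisions, and mAP50 averages precision across recall levels at this IoU threshold.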

24 pages, 23817 KiB  
Article
Dual-Path Adversarial Denoising Network Based on UNet
by Jinchi Yu, Yu Zhou, Mingchen Sun and Dadong Wang
Sensors 2025, 25(15), 4751; https://doi.org/10.3390/s25154751 - 1 Aug 2025
Abstract
Digital image quality is crucial for reliable analysis in applications such as medical imaging, satellite remote sensing, and video surveillance. However, traditional denoising methods struggle to balance noise removal with detail preservation and lack adaptability to various types of noise. We propose a novel three-module architecture for image denoising, comprising a generator, a dual-path-UNet-based denoiser, and a discriminator. The generator creates synthetic noise patterns to augment training data, while the dual-path-UNet denoiser uses multiple receptive field modules to preserve fine details and dense feature fusion to maintain global structural integrity. The discriminator provides adversarial feedback to enhance denoising performance. This dual-path adversarial training mechanism addresses the limitations of traditional methods by simultaneously capturing both local details and global structures. Experiments on the SIDD, DND, and PolyU datasets demonstrate superior performance. We compare our architecture with the latest state-of-the-art GAN variants through comprehensive qualitative and quantitative evaluations. These results confirm the effectiveness of noise removal with minimal loss of critical image details. The proposed architecture enhances image denoising capabilities in complex noise scenarios, providing a robust solution for applications that require high image fidelity. By enhancing adaptability to various types of noise while maintaining structural integrity, this method provides a versatile tool for image processing tasks that require preserving detail. Full article
(This article belongs to the Section Sensing and Imaging)
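Denoising results on benchmarks like SIDD and DND are conventionally reported as PSNR (and SSIM); PSNR is a direct function of the mean squared error against the clean reference. A minimal stdlib sketch on toy 2x2 grayscale images (values illustrative):

```python
import math

def psnr(clean, denoised, peak=255.0):
    """Peak signal-to-noise ratio in dB between two grayscale images,
    given as nested lists of pixel intensities."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(clean, denoised)
             for a, b in zip(row_a, row_b)]
    mse = sum(diffs) / len(diffs)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(peak ** 2 / mse)

clean = [[50, 60], [70, 80]]
denoised = [[52, 60], [70, 79]]  # small residual errors after denoising
value = psnr(clean, denoised)
```

Higher is better; small residual errors already yield values in the high-40s dB range here, which is why benchmark gains of even a few tenths of a dB are considered meaningful.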

20 pages, 4569 KiB  
Article
Lightweight Vision Transformer for Frame-Level Ergonomic Posture Classification in Industrial Workflows
by Luca Cruciata, Salvatore Contino, Marianna Ciccarelli, Roberto Pirrone, Leonardo Mostarda, Alessandra Papetti and Marco Piangerelli
Sensors 2025, 25(15), 4750; https://doi.org/10.3390/s25154750 - 1 Aug 2025
Abstract
Work-related musculoskeletal disorders (WMSDs) are a leading concern in industrial ergonomics, often stemming from sustained non-neutral postures and repetitive tasks. This paper presents a vision-based framework for real-time, frame-level ergonomic risk classification using a lightweight Vision Transformer (ViT). The proposed system operates directly on raw RGB images without requiring skeleton reconstruction, joint angle estimation, or image segmentation. A single ViT model simultaneously classifies eight anatomical regions, enabling efficient multi-label posture assessment. Training is supervised using a multimodal dataset acquired from synchronized RGB video and full-body inertial motion capture, with ergonomic risk labels derived from RULA scores computed on joint kinematics. The system is validated on realistic, simulated industrial tasks that include common challenges such as occlusion and posture variability. Experimental results show that the ViT model achieves state-of-the-art performance, with F1-scores exceeding 0.99 and AUC values above 0.996 across all regions. Compared to a previous CNN-based system, the proposed model improves classification accuracy and generalizability while reducing complexity and enabling real-time inference on edge devices. These findings demonstrate the model’s potential for unobtrusive, scalable ergonomic risk monitoring in real-world manufacturing environments. Full article
(This article belongs to the Special Issue Secure and Decentralised IoT Systems)

11 pages, 441 KiB  
Article
Medical Education: Are Reels a Good Deal in Video-Based Learning?
by Daniel Humberto Pozza, Fani Lourença Neto, José Tiago Costa-Pereira and Isaura Tavares
Educ. Sci. 2025, 15(8), 981; https://doi.org/10.3390/educsci15080981 - 31 Jul 2025
Abstract
Based on our question, “Are reels/short-videos the real deal in video-based learning?” this study explores the effectiveness of short (around 2 min) video-based learning in engaging medical students from the second-largest Portuguese medical school. With the increasing integration of digital tools in education, video content has emerged as a dynamic method to enhance learning experiences. This cross-sectional survey was conducted using anonymous self-administered questionnaires, prepared with reference to previous studies, and distributed to 264 informed students who voluntarily agreed to participate. This sample represented 75.5% of the students attending the classes. The questionnaires included topics related to the 65 short videos about practical classes, as well as the students’ learning preferences. The collected data were analyzed using descriptive and comparative statistics. The students considered that the content and format of the videos were adequate (99.6% and 100%, respectively). Specifically, the videos helped the students to better understand the practical classes, consolidate and retain the practical content, and simplify the study for the exams. Additionally, the videos were praised for their high-quality audiovisual content, being innovative, complete, concise, short and/or adequate, or better than other formats such as printed information. The combination of written and audiovisual support materials for teaching and studying is important and has been shown to improve students’ performance. This pedagogical methodology is well-suited for the current generation of students, aiding not only in study and exam preparation but also in remote learning. Full article
(This article belongs to the Special Issue Higher Education Development and Technological Innovation)
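The survey's descriptive statistics boil down to simple proportions over questionnaire responses. A minimal sketch, assuming hypothetical counts (the study's raw per-item data are not given in the abstract; only the 264 respondents and 75.5% coverage figures are from the source):

```python
def pct(count: int, total: int) -> float:
    """Percentage of `count` out of `total`, rounded to one decimal."""
    return round(100.0 * count / total, 1)

respondents = 264          # from the abstract
cohort_size = 350          # hypothetical number of students attending classes
coverage = pct(respondents, cohort_size)

# Hypothetical per-item agreement counts for two questionnaire items.
content_adequate = pct(263, respondents)   # "content adequate"
format_adequate = pct(264, respondents)    # "format adequate"
print(coverage, content_adequate, format_adequate)
```

With a cohort of roughly 350 attending students, 264 respondents give about 75% coverage, consistent with the reported figure.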
28 pages, 5699 KiB  
Article
Multi-Modal Excavator Activity Recognition Using Two-Stream CNN-LSTM with RGB and Point Cloud Inputs
by Hyuk Soo Cho, Kamran Latif, Abubakar Sharafat and Jongwon Seo
Appl. Sci. 2025, 15(15), 8505; https://doi.org/10.3390/app15158505 - 31 Jul 2025
Abstract
Recently, deep learning algorithms have been increasingly applied in construction for activity recognition, particularly for excavators, to automate processes and enhance safety and productivity through continuous monitoring of earthmoving activities. These algorithms analyze construction videos to classify excavator activities for earthmoving purposes. However, previous studies have focused solely on single-source external videos, which limits the activity recognition capabilities of deep learning algorithms. This paper introduces a novel multi-modal deep learning-based methodology for recognizing excavator activities, utilizing multi-stream input data. It processes point clouds and RGB images using a two-stream convolutional neural network long short-term memory (CNN-LSTM) method to extract spatiotemporal features, enabling the recognition of excavator activities. A comprehensive dataset comprising 495,000 video frames of synchronized RGB and point cloud data was collected across multiple construction sites under varying conditions. The dataset encompasses five key excavator activities: Approach, Digging, Dumping, Idle, and Leveling. To assess the effectiveness of the proposed method, the performance of the two-stream CNN-LSTM architecture is compared with that of single-stream CNN-LSTM models trained separately on the same RGB and point cloud datasets. The results demonstrate that the proposed multi-stream approach achieved an accuracy of 94.67%, outperforming existing state-of-the-art single-stream models, which achieved 90.67% accuracy for the RGB-based model and 92.00% for the point cloud-based model. These findings underscore the potential of the proposed activity recognition method, making it highly effective for automatic real-time monitoring of excavator activities, thereby laying the groundwork for future integration into digital twin systems for proactive maintenance and intelligent equipment management. Full article
(This article belongs to the Special Issue AI-Based Machinery Health Monitoring)
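The core idea of a two-stream model is that each modality produces its own per-clip class scores, which are then fused into a single prediction. A minimal NumPy sketch of score-level (late) fusion over the five activity classes named in the abstract; the fusion weight, logit values, and function names are hypothetical, not the paper's architecture (which learns spatiotemporal features with CNN-LSTM streams before fusion):

```python
import numpy as np

# The five excavator activities from the abstract.
ACTIVITIES = ["Approach", "Digging", "Dumping", "Idle", "Leveling"]

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fuse_streams(rgb_logits, cloud_logits, w_rgb=0.5):
    """Score-level fusion: weighted average of the two streams'
    class probabilities, then argmax (hypothetical weighting)."""
    probs = w_rgb * softmax(rgb_logits) + (1.0 - w_rgb) * softmax(cloud_logits)
    return ACTIVITIES[int(np.argmax(probs))]

# Toy per-clip logits: the RGB stream is torn between Digging and
# Dumping, while the point-cloud stream strongly favours Digging.
rgb = np.array([0.1, 2.0, 1.9, 0.0, 0.2])
cloud = np.array([0.0, 3.0, 0.5, 0.1, 0.1])
print(fuse_streams(rgb, cloud))  # Digging
```

Late fusion like this is one simple way to combine modalities; feature-level fusion, where intermediate representations are concatenated before the classifier, is the other common design choice.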