Search Results (1,010)

Search Parameters:
Keywords = media training

18 pages, 3705 KB  
Article
Cross-Platform Multi-Modal Transfer Learning Framework for Cyberbullying Detection
by Weiqi Zhang, Chengzu Dong, Aiting Yao, Asef Nazari and Anuroop Gaddam
Electronics 2026, 15(2), 442; https://doi.org/10.3390/electronics15020442 - 20 Jan 2026
Viewed by 119
Abstract
Cyberbullying and hate speech increasingly appear in multi-modal social media posts, where images and text are combined in diverse and fast-changing ways across platforms. These posts differ in style, vocabulary, and layout, and labeled data are sparse and noisy, which makes it difficult to train detectors that are both reliable and deployable under tight computational budgets. Many high-performing systems rely on large vision-language backbones, full-parameter fine-tuning, online retrieval, or model ensembles, which raises training and inference costs. We present a parameter-efficient cross-platform multi-modal transfer learning framework for cyberbullying and hateful content detection. Our framework has three components. First, we perform domain-adaptive pretraining of a compact ViLT backbone on in-domain image-text corpora. Second, we apply parameter-efficient fine-tuning that updates only bias terms, a small subset of LayerNorm parameters, and the classification head, leaving the inference computation graph unchanged. Third, we use noise-aware knowledge distillation from a stronger teacher built from pretrained text and CLIP-based image-text encoders, where only high-confidence, temperature-scaled predictions are used as soft labels during training, and teacher models and any retrieval components are used only offline. We evaluate primarily on Hateful Memes and use IMDB as an auxiliary text-only benchmark to show that the deployment-aware PEFT + offline-KD recipe can still be applied when other modalities are unavailable. On Hateful Memes, our student updates only 0.11% of parameters and retains about 96% of the AUROC of full fine-tuning.
(This article belongs to the Special Issue Data Privacy and Protection in IoT Systems)
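The bias-plus-LayerNorm fine-tuning recipe this abstract describes (a BitFit-style scheme extended with LayerNorm parameters and the head) fits in a few lines of PyTorch. The sketch below is a minimal illustration under assumed parameter-naming conventions (a head module named "classifier", LayerNorm parameters whose names contain "norm"), not the authors' released code:

```python
import torch.nn as nn

def mark_peft_trainable(model: nn.Module) -> float:
    """Freeze everything except biases, LayerNorm parameters, and the
    classification head; return the fraction of trainable parameters."""
    for name, param in model.named_parameters():
        is_bias = name.endswith(".bias")
        is_norm = "norm" in name.lower()          # assumed LayerNorm naming
        is_head = name.startswith("classifier")   # assumed head module name
        param.requires_grad = is_bias or is_norm or is_head
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return trainable / sum(p.numel() for p in model.parameters())

# Toy stand-in for a compact transformer backbone plus head.
model = nn.Sequential()
model.add_module("encoder", nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2))
model.add_module("classifier", nn.Linear(64, 2))
print(f"trainable fraction: {mark_peft_trainable(model):.4%}")
```

On a ViLT-scale backbone the same mask should leave well under 1% of parameters trainable, in line with the 0.11% reported above.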

32 pages, 122293 KB  
Article
Hybrid Negation: Enhancing Sentiment Analysis for Complex Sentences
by Miftahul Qorib and Paul Cotae
Appl. Sci. 2026, 16(2), 1000; https://doi.org/10.3390/app16021000 - 19 Jan 2026
Viewed by 152
Abstract
A wealth of valuable information is available on the Internet, and many individuals rely on mass media as their primary source of information. Various views, comments, expressions, and opinions on social networks have been a tremendous source of information. Harvesting free, resourceful information through social media makes text mining a powerful tool for analyzing public opinions on various issues across diverse social networks. Various research projects have implemented text sentiment analysis through machine and deep learning approaches. Social media text often expresses sentiment through complex syntax and negation (e.g., implicit and double negation and nested clauses), which many classifiers mishandle. We propose hybrid negation, a clause-aware approach that combines (i) explicit/implicit/double-negation rules, (ii) dependency-based scope detection, (iii) a TextBlob back-off for phrase polarity, and (iv) an MLP-learned clause-weighting module that aggregates clause-level scores. Across 156,539 tweets (three-class sentiment), we evaluate six negation strategies and 228 model configurations with and without SMOTE (applied strictly within training folds). Hybrid negation achieves 98.582% accuracy, 98.196% precision, 98.189% recall, and 98.193% F1 with BERT, outperforming rule-only and antonym/synonym baselines. Ablations show each component contributes to the model’s performance, with dependency scope and double negations offering the largest gains. Per-class results, confidence intervals, and paired tests with multiple-comparison control confirm statistically significant improvements. We release code and preprocessing scripts to support reproducibility.
(This article belongs to the Topic Applications of NLP, AI, and ML in Software Engineering)
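As a rough sketch of the rule-plus-back-off idea (the paper additionally uses dependency-based scope detection and an MLP clause-weighting module, neither reproduced here; the negator list and comma-based clause split are simplifying assumptions):

```python
from textblob import TextBlob

NEGATORS = {"not", "no", "never", "n't", "hardly"}  # assumed cue list

def clause_polarity(clause: str) -> float:
    """Rule layer: an odd count of negation cues flips the lexicon score;
    an even count (double negation) cancels out."""
    tokens = clause.lower().split()
    n_neg = sum(tok in NEGATORS for tok in tokens)
    content = " ".join(t for t in tokens if t not in NEGATORS)
    base = TextBlob(content).sentiment.polarity  # TextBlob back-off on content words
    return -base if n_neg % 2 == 1 else base

def sentence_polarity(sentence: str) -> float:
    """Naive comma-based clause split; the paper learns clause weights
    with an MLP instead of this uniform average."""
    clauses = [c for c in sentence.split(",") if c.strip()]
    return sum(clause_polarity(c) for c in clauses) / max(len(clauses), 1)

print(sentence_polarity("the plot was not good, but the acting was great"))
```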

21 pages, 2529 KB  
Article
Continual Learning for Saudi-Dialect Offensive-Language Detection Under Temporal Linguistic Drift
by Afefa Asiri and Mostafa Saleh
Information 2026, 17(1), 99; https://doi.org/10.3390/info17010099 - 18 Jan 2026
Viewed by 156
Abstract
Offensive-language detection systems that perform well at a given point in time often degrade as linguistic patterns evolve, particularly in dialectal Arabic social media, where new terms emerge and familiar expressions shift in meaning. This study investigates temporal linguistic drift in Saudi-dialect offensive-language detection through a systematic evaluation of continual-learning approaches. Building on the Saudi Offensive Dialect (SOD) dataset, we designed test scenarios incorporating newly introduced offensive terms, context-shifting expressions, and varying proportions of historical data to assess both adaptation and knowledge retention. Eight continual-learning configurations—Experience Replay (ER), Elastic Weight Consolidation (EWC), Low-Rank Adaptation (LoRA), and their combinations—were evaluated across five test scenarios. Results show that models without continual learning experience a 13.4-percentage-point decline in F1-macro on evolved patterns. In our experiments, Experience Replay achieved a relatively favorable balance, maintaining 0.812 F1-macro on historical data and 0.976 on contemporary patterns (KR = −0.035; AG = +0.264), though with increased memory and training time. EWC showed moderate retention (KR = −0.052) with comparable adaptation (AG = +0.255). On the SimuReal test set—designed with realistic class imbalance and only 5% drift terms—ER achieved 0.842 and EWC achieved 0.833, compared to the original model’s 0.817, representing modest improvements under realistic conditions. LoRA-based methods showed lower adaptation in our experiments, likely reflecting the specific LoRA configuration used in this study. Further investigation with alternative settings is warranted.
(This article belongs to the Special Issue Social Media Mining: Algorithms, Insights, and Applications)
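Experience Replay, the best-balanced configuration above, amounts to mixing a bounded memory of past-period examples into every new-period training batch. A minimal sketch with reservoir sampling (buffer size and the toy examples are assumptions, not the paper's setup):

```python
import random

class ReplayBuffer:
    """Bounded memory of past (text, label) pairs for experience replay."""
    def __init__(self, capacity: int = 1000):
        self.capacity, self.data, self.seen = capacity, [], 0

    def add(self, example):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:  # reservoir sampling keeps a uniform sample of all items seen
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = example

    def sample(self, k: int):
        return random.sample(self.data, min(k, len(self.data)))

# Assumed usage: blend replayed old-period examples into each new-period batch.
buffer = ReplayBuffer()
for old_example in [("old offensive variant", 1), ("benign greeting", 0)]:
    buffer.add(old_example)
new_batch = [("newly coined insult", 1)]
train_batch = new_batch + buffer.sample(k=2)
print(train_batch)
```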

23 pages, 1503 KB  
Article
Hallucination-Aware Interpretable Sentiment Analysis Model: A Grounded Approach to Reliable Social Media Content Classification
by Abdul Rahaman Wahab Sait and Yazeed Alkhurayyif
Electronics 2026, 15(2), 409; https://doi.org/10.3390/electronics15020409 - 16 Jan 2026
Viewed by 180
Abstract
Sentiment analysis (SA) has become an essential tool for analyzing social media content in order to monitor public opinion and support digital analytics. Although transformer-based SA models exhibit remarkable performance, they lack mechanisms to mitigate hallucinated sentiment, which refers to the generation of unsupported or overconfident predictions without explicit linguistic evidence. To address this limitation, this study presents a hallucination-aware SA model by incorporating semantic grounding, interpretability-congruent supervision, and neuro-symbolic reasoning within a unified architecture. The proposed model is based on a fine-tuned Open Pre-trained Transformer (OPT) model, using three fundamental mechanisms: a Sentiment Integrity Filter (SIF), a SHapley Additive exPlanations (SHAP)-guided regularization technique, and a confidence-based lexicon-deep fusion module. The experimental analysis was conducted on two multi-class sentiment datasets that contain Twitter (now X) and Reddit posts. On Dataset 1, the proposed model achieved an average accuracy of 97.6% and a hallucination rate of 2.3%, outperforming current transformer-based and hybrid sentiment models. On Dataset 2, the framework demonstrated strong external generalization with an accuracy of 95.8% and a hallucination rate of 3.4%, significantly lower than state-of-the-art methods. These findings indicate that hallucination mitigation can be incorporated into transformer optimization without performance degradation, offering a deployable, interpretable, and linguistically grounded social media SA framework that enhances the reliability of neural language-understanding systems.
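The confidence-based lexicon-deep fusion component can be pictured as a gate: the transformer's prediction is accepted only when its confidence clears a threshold, otherwise the decision must be supported by explicit lexical evidence. A hedged sketch (threshold, toy lexicon, and label set are illustrative; the paper's SIF and SHAP-guided regularization are not reproduced):

```python
LEXICON = {"great": 1.0, "awful": -1.0, "fine": 0.3}  # toy polarity lexicon (assumption)

def lexicon_score(text: str) -> float:
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def fuse(probs: dict, text: str, tau: float = 0.75) -> str:
    """If the model's top-class probability is below tau, treat the prediction
    as potentially hallucinated and require lexicon support instead."""
    label, conf = max(probs.items(), key=lambda kv: kv[1])
    if conf >= tau:
        return label
    lex = lexicon_score(text)
    if lex > 0.1:
        return "positive"
    if lex < -0.1:
        return "negative"
    return "neutral"  # no explicit linguistic evidence: abstain to neutral

print(fuse({"positive": 0.55, "negative": 0.30, "neutral": 0.15}, "the service was fine"))
```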

12 pages, 589 KB  
Article
Inclusive and Sustainable Digital Innovation Within the Amara Berri System
by Ana Belén Olmos Ortega, Cristina Medrano Pascual, Rosa Ana Alonso Ruiz, María García Pérez and María Ángeles Valdemoros San Emeterio
Sustainability 2026, 18(2), 947; https://doi.org/10.3390/su18020947 - 16 Jan 2026
Viewed by 178
Abstract
The current debate on digital education is at a crossroads between the need for technological innovation and the growing concern about the impact of passive screen use. In this context, identifying sustainable pedagogical models that integrate Information and Communication Technologies (ICT) in a meaningful and inclusive way is an urgent need. This article presents a case study of the Amara Berri System (ABS), aiming to analyze how inclusive and sustainable digital innovation is operationalized within the system and whether teachers’ length of service is associated with the implementation and perceived impact of inclusive ICT practices. The investigation is based on a mixed-methods sequential design. A questionnaire was administered to a sample of 292 teachers to collect data on their practices and perceptions. Subsequently, a focus group with eight teachers was conducted to further explore the meaning of their practices. Quantitative results show that the implementation and positive evaluation of inclusive ICT practices correlate significantly with teachers’ seniority within the system, which suggests that the model is formative in itself. Qualitative analysis shows that ICTs are not an end in themselves within the ABS, but an empowering tool for the students. The “Audiovisual Media Room”, managed by students, functions as a space for social and creative production that gives technology a pedagogical purpose. The study concludes that the sustainability of digital innovation requires coherence with the pedagogical project. Findings offer valuable implications for the design of teacher training contexts that foster the integration of technology within a framework of truly inclusive education.
(This article belongs to the Special Issue Sustainable Digital Education: Innovations in Teaching and Learning)

27 pages, 11232 KB  
Article
Aerokinesis: An IoT-Based Vision-Driven Gesture Control System for Quadcopter Navigation Using Deep Learning and ROS2
by Sergei Kondratev, Yulia Dyrchenkova, Georgiy Nikitin, Leonid Voskov, Vladimir Pikalov and Victor Meshcheryakov
Technologies 2026, 14(1), 69; https://doi.org/10.3390/technologies14010069 - 16 Jan 2026
Viewed by 235
Abstract
This paper presents Aerokinesis, an IoT-based software–hardware system for intuitive gesture-driven control of quadcopter unmanned aerial vehicles (UAVs), developed within the Robot Operating System 2 (ROS2) framework. The proposed system addresses the challenge of providing an accessible human–drone interaction interface for operators in scenarios where traditional remote controllers are impractical or unavailable. The architecture comprises two hierarchical control levels: (1) high-level discrete command control utilizing a fully connected neural network classifier for static gesture recognition, and (2) low-level continuous flight control based on three-dimensional hand keypoint analysis from a depth camera. The gesture classification module achieves an accuracy exceeding 99% using a multi-layer perceptron trained on MediaPipe-extracted hand landmarks. For continuous control, we propose a novel approach that computes Euler angles (roll, pitch, yaw) and throttle from 3D hand pose estimation, enabling intuitive four-degree-of-freedom quadcopter manipulation. A hybrid signal filtering pipeline ensures robust control signal generation while maintaining real-time responsiveness. Comparative user studies demonstrate that gesture-based control reduces task completion time by 52.6% for beginners compared to conventional remote controllers. The results confirm the viability of vision-based gesture interfaces for IoT-enabled UAV applications.
(This article belongs to the Section Information and Communication Technologies)
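The continuous-control idea, deriving roll, pitch, and yaw from 3-D hand pose, can be illustrated with plain geometry on a few keypoints. The landmark choice and axis conventions below are assumptions rather than the paper's exact mapping:

```python
import numpy as np

def hand_euler_angles(wrist, middle_mcp, index_mcp, pinky_mcp):
    """Estimate roll/pitch/yaw (radians) from four 3-D hand keypoints."""
    forward = middle_mcp - wrist        # palm "forward" axis (wrist to middle knuckle)
    across = pinky_mcp - index_mcp      # axis across the knuckles
    forward = forward / np.linalg.norm(forward)
    across = across / np.linalg.norm(across)
    pitch = np.arcsin(-forward[1])              # tilt up/down
    yaw = np.arctan2(forward[0], forward[2])    # turn left/right
    roll = np.arctan2(across[1], across[0])     # rotation about the forward axis
    return roll, pitch, yaw

# Flat hand pointing along +z: all three angles come out ~0.
w = np.array([0.0, 0.0, 0.0])
m = np.array([0.0, 0.0, 1.0])
i = np.array([-0.3, 0.0, 0.9])
p = np.array([0.3, 0.0, 0.9])
print(hand_euler_angles(w, m, i, p))
```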

18 pages, 3987 KB  
Article
Low-Latency Autonomous Surveillance in Defense Environments: A Hybrid RTSP-WebRTC Architecture with YOLOv11
by Juan José Castro-Castaño, William Efrén Chirán-Alpala, Guillermo Alfonso Giraldo-Martínez, José David Ortega-Pabón, Edison Camilo Rodríguez-Amézquita, Diego Ferney Gallego-Franco and Yeison Alberto Garcés-Gómez
Computers 2026, 15(1), 62; https://doi.org/10.3390/computers15010062 - 16 Jan 2026
Viewed by 260
Abstract
This article presents the Intelligent Monitoring System (IMS), an AI-assisted, low-latency surveillance platform designed for defense environments. The study addresses the need for real-time autonomous situational awareness by integrating high-speed video transmission with advanced computer vision analytics in constrained network settings. The IMS employs a hybrid transmission architecture based on RTSP for ingestion and WHEP/WebRTC for distribution, orchestrated via MediaMTX, with the objective of achieving end-to-end latencies below one second. The methodology includes a comparative evaluation of video streaming protocols (JPEG-over-WebSocket, HLS, WebRTC, etc.) and AI frameworks, alongside the modular architectural design and prolonged experimental validation. The detection module integrates YOLOv11 models fine-tuned on the VisDrone dataset to optimize performance for small objects, aerial views, and dense scenes. Experimental results, obtained through over 300 h of operational tests using IP cameras and aerial platforms, confirmed the stability and performance of the chosen architecture, maintaining latencies close to 500 ms. The YOLOv11 family was adopted as the primary detection framework, providing an effective trade-off between accuracy and inference performance in real-time scenarios. The YOLOv11n model was trained and validated on a Tesla T4 GPU, and YOLOv11m will be validated on the target platform in subsequent experiments. The findings demonstrate the technical viability and operational relevance of the IMS as a core component for autonomous surveillance systems in defense, satisfying strict requirements for speed, stability, and robust detection of vehicles and pedestrians.
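On the detection side, wiring a YOLOv11 model to an RTSP feed republished by MediaMTX takes only a few lines with the Ultralytics API. This is a sketch of the pattern, not the IMS code; the stream path is hypothetical (8554 is MediaMTX's default RTSP port):

```python
from ultralytics import YOLO

# yolo11n: the nano YOLOv11 variant the authors report training on a Tesla T4.
model = YOLO("yolo11n.pt")

# MediaMTX republishes camera feeds over RTSP; the path name is hypothetical.
stream_url = "rtsp://mediamtx-host:8554/cam1"

# stream=True yields results frame by frame instead of buffering the source.
for result in model.predict(source=stream_url, stream=True, conf=0.25):
    people = [b for b in result.boxes if model.names[int(b.cls)] == "person"]
    if people:
        print(f"{len(people)} person detection(s) in frame")
```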

30 pages, 6201 KB  
Article
AFAD-MSA: Dataset and Models for Arabic Fake Audio Detection
by Elsayed Issa
Computation 2026, 14(1), 20; https://doi.org/10.3390/computation14010020 - 14 Jan 2026
Viewed by 187
Abstract
As generative speech synthesis produces near-human synthetic voices and reliance on online media grows, robust audio-deepfake detection is essential to fight misuse and misinformation. In this study, we introduce the Arabic Fake Audio Dataset for Modern Standard Arabic (AFAD-MSA), a curated corpus of authentic and synthetic Arabic speech designed to advance research on Arabic deepfake and spoofed-speech detection. The synthetic subset is generated with four state-of-the-art proprietary text-to-speech and voice-conversion models. Rich metadata—covering speaker attributes and generation information—is provided to support reproducibility and benchmarking. To establish reference performance, we trained three AASIST models and compared their performance to two baseline transformer detectors (Wav2Vec 2.0 and Whisper). On the AFAD-MSA test split, AASIST-2 achieved perfect accuracy, surpassing the baseline models. However, its performance declined under cross-dataset evaluation. These results underscore the importance of dataset construction: detectors generalize best when exposed to diverse attack types. In addition, continual or contrastive training that interleaves bona fide speech with large, heterogeneous spoofed corpora will further improve detectors’ robustness.
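A minimal sketch of the kind of Wav2Vec 2.0 baseline described above: a sequence-classification head over a pretrained encoder. The checkpoint name and label convention are assumptions, and the classification head is randomly initialized until fine-tuned on a corpus such as AFAD-MSA:

```python
import torch
from transformers import AutoFeatureExtractor, Wav2Vec2ForSequenceClassification

name = "facebook/wav2vec2-base"  # assumed base checkpoint
extractor = AutoFeatureExtractor.from_pretrained(name)
# Labels assumed as {0: bona fide, 1: spoof}; the head needs fine-tuning.
model = Wav2Vec2ForSequenceClassification.from_pretrained(name, num_labels=2)

waveform = torch.randn(16000)  # one second of 16 kHz audio as a stand-in
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print("spoof probability:", logits.softmax(-1)[0, 1].item())
```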

27 pages, 80350 KB  
Article
Pose-Based Static Sign Language Recognition with Deep Learning for Turkish, Arabic, and American Sign Languages
by Rıdvan Yayla, Hakan Üçgün and Mahmud Abbas
Sensors 2026, 26(2), 524; https://doi.org/10.3390/s26020524 - 13 Jan 2026
Viewed by 236
Abstract
Advancements in artificial intelligence have significantly enhanced communication for individuals with hearing impairments. This study presents a robust cross-lingual Sign Language Recognition (SLR) framework for Turkish, American, and Arabic sign languages. The system utilizes the lightweight MediaPipe library for efficient hand landmark extraction, ensuring stable and consistent feature representation across diverse linguistic contexts. Datasets were meticulously constructed from nine public-domain sources (four Arabic, three American, and two Turkish). The final training data comprise curated image datasets, with frames for each language carefully selected from varying angles and distances to ensure high diversity. A comprehensive comparative evaluation was conducted across three state-of-the-art deep learning architectures—ConvNeXt (CNN-based), Swin Transformer (ViT-based), and Vision Mamba (SSM-based)—all applied to identical feature sets. The evaluation demonstrates the superior performance of contemporary vision Transformers and state space models in capturing subtle spatial cues across diverse sign languages. Our approach provides a comparative analysis of model generalization capabilities across three distinct sign languages, offering valuable insights for model selection in pose-based SLR systems.
(This article belongs to the Special Issue Sensor Systems for Gesture Recognition (3rd Edition))
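The MediaPipe front end described above reduces each frame to 21 hand landmarks, i.e., a 63-dimensional (x, y, z) feature vector for the downstream classifiers. A minimal sketch using the MediaPipe Hands solution (the input file name is hypothetical):

```python
import cv2
import mediapipe as mp
import numpy as np

hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)

def extract_features(image_bgr):
    """Return the 21 hand landmarks flattened to a (63,) vector, or None."""
    result = hands.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if not result.multi_hand_landmarks:
        return None  # no hand detected in this frame
    lm = result.multi_hand_landmarks[0].landmark
    return np.array([[p.x, p.y, p.z] for p in lm]).flatten()

img = cv2.imread("sign_sample.jpg")  # hypothetical input frame
if img is not None:
    feats = extract_features(img)
    print(None if feats is None else feats.shape)  # (63,) when a hand is found
```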

20 pages, 266 KB  
Article
Skills Ecosystem and the Role of School Management for Sustainable Development of Dual Education
by Svetlana Alexandrova and Veneta Krasteva
Societies 2026, 16(1), 20; https://doi.org/10.3390/soc16010020 - 12 Jan 2026
Viewed by 316
Abstract
The article presents an analysis of the mechanisms used by a vocational high school in Bulgaria to develop dual training and implement it sustainably. It focuses on the school management’s leadership role in the network of different stakeholders, demonstrating the importance of this aspect in the entire process of developing dual education. Apart from the case analysis of the Bulgarian vocational high school’s successful implementation of dual learning, the research strategy includes examining regulatory documents, evaluation reports and publications in media and by companies, as well as analyzing the attitudes among key stakeholders. An overview of the challenges facing dual education in Bulgaria is also provided. Based on the case study findings, the factors supporting the implementation and sustainability of the dual system have been identified. We conclude that the long-term development of the dual education model depends on the understanding that the formation of professional skills is a dynamic process, requiring attention to the needs of the local environment, adaptability to current changes and active participation by all stakeholders. The role of school leadership—with regard to both its motivation and activity—has proven to be essential, and therefore it should not be overlooked when creating state incentives to support dual training.
24 pages, 930 KB  
Article
Developing Science Communication Competence in Initial Teacher Training
by Dieter Reynaldo Fuentes-Cancell, Odiel Estrada-Molina and Mónica Gutiérrez-Ortega
Educ. Sci. 2026, 16(1), 86; https://doi.org/10.3390/educsci16010086 - 7 Jan 2026
Viewed by 293
Abstract
This study examines the development of scientific dissemination skills in initial teacher education through a sequential explanatory mixed-methods design (QUANTITATIVE→QUALITATIVE). The purpose was to explore how the integration of Project-Based Learning (PBL) and Experiential Learning (EL) fosters the acquisition of cognitive, communicative, media–digital, and ethical–social competencies related to scientific communication. Seventy-nine students from Early Childhood Education (n = 36) and Primary Education (n = 43) degrees at the University of Valladolid participated during the 2024–2025 academic year. In the quantitative phase, a validated questionnaire was administered to assess four dimensions of competence, while the qualitative phase included systematic observations and focus groups. Data analysis combined descriptive and inferential statistics with thematic analysis and convergent integration. The results showed significant improvements in all dimensions, particularly in communicative and media–digital skills, with qualitative evidence explaining the mechanisms underlying this progress. The integration of findings revealed the transformation of students from passive recipients to active mediators of scientific knowledge. It is concluded that the combination of PBL and EL constitutes an effective pedagogical framework for promoting responsible scientific dissemination in higher education and reinforcing the social responsibility of teacher training.

22 pages, 5177 KB  
Article
Tensor-Train-Based Elastic Wavefield Decomposition in VTI Media
by Youngjae Shin
Appl. Sci. 2026, 16(2), 569; https://doi.org/10.3390/app16020569 - 6 Jan 2026
Viewed by 274
Abstract
Elastic wavefield decomposition into quasi-compressional (qP) and quasi-shear-vertical (qSV) modes is essential for elastic imaging and inversion in VTI media, but becomes computationally expensive when polarization vectors vary strongly in space. I propose a tensor-train (TT) representation of mixed-domain decomposition projectors, constructed via TT-cross with a single user-specified tolerance and applied efficiently using FFT-based operations. A residual-orthogonal strategy extracts qSV from the residual wavefield after qP removal to suppress mode leakage. The method is implemented in Python/PyTorch with GPU acceleration. Numerical experiments on three 2D VTI models (a two-layer benchmark, a BP 2007 benchmark subset, and an Overthrust-based structurally complex model) demonstrate reconstruction errors of 0.094–0.89% for TT, compared to 1.67–6.44% for a conventional CUR low-rank approach (4–46× improvement), with consistently lower cross-talk and near-unity energy ratios. Time-domain receiver traces further confirm that TT yields smaller reconstruction residual spikes and reduced cross-mode leakage than CUR. Runtime tests show that CUR can be faster on smaller grids, whereas TT with GPU acceleration becomes competitive and can outperform CUR for larger models. The TT representation scales linearly with tensor order, with cost O(d·Ns·r²), enabling practical extension to higher-dimensional projector tensors where conventional methods become impractical.
(This article belongs to the Special Issue Exploration Geophysics and Seismic Surveying)
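The storage argument behind the O(d·Ns·r²) scaling is easy to see in a toy tensor-train: a d-way tensor is held as d small cores instead of n^d entries. The sketch below uses random cores rather than the paper's TT-cross construction:

```python
import numpy as np

def tt_random(d, n, r, seed=0):
    """d cores of shape (r_left, n, r_right): O(d * n * r^2) numbers total."""
    rng = np.random.default_rng(seed)
    ranks = [1] + [r] * (d - 1) + [1]
    return [rng.standard_normal((ranks[k], n, ranks[k + 1])) for k in range(d)]

def tt_to_full(cores):
    """Contract the train left to right to recover the full tensor."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))  # chain contraction
    return out.squeeze(axis=(0, -1))

cores = tt_random(d=4, n=8, r=3)
print(tt_to_full(cores).shape)      # (8, 8, 8, 8): 4096 entries in full form
print(sum(c.size for c in cores))   # 192 entries in TT form
```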

23 pages, 6094 KB  
Systematic Review
Toward Smart VR Education in Media Production: Integrating AI into Human-Centered and Interactive Learning Systems
by Zhi Su, Tse Guan Tan, Ling Chen, Hang Su and Samer Alfayad
Biomimetics 2026, 11(1), 34; https://doi.org/10.3390/biomimetics11010034 - 4 Jan 2026
Viewed by 610
Abstract
Smart virtual reality (VR) systems are becoming central to media production education, where immersive practice, real-time feedback, and hands-on simulation are essential. This review synthesizes the integration of artificial intelligence (AI) into human-centered, interactive VR learning for television and media production. Searches in Scopus, Web of Science, IEEE Xplore, ACM Digital Library, and SpringerLink (2013–2024) identified 790 records; following PRISMA screening, 94 studies met the inclusion criteria and were synthesized using a systematic scoping review approach. Across this corpus, common AI components include learner modeling, adaptive task sequencing (e.g., RL-based orchestration), affect sensing (vision, speech, and biosignals), multimodal interaction (gesture, gaze, voice, haptics), and growing use of LLM/NLP assistants. Reported benefits span personalized learning trajectories, high-fidelity simulation of studio workflows, and more responsive feedback loops that support creative, technical, and cognitive competencies. Evaluation typically covers usability and presence, workload and affect, collaboration, and scenario-based learning outcomes, leveraging interaction logs, eye tracking, and biofeedback. Persistent challenges include latency and synchronization under multimodal sensing, data governance and privacy for biometric/affective signals, limited transparency/interpretability of AI feedback, and heterogeneous evaluation protocols that impede cross-system comparison. We highlight essential human-centered design principles—teacher-in-the-loop orchestration, timely and explainable feedback, and ethical data governance—and outline a research agenda to support standardized evaluation and scalable adoption of smart VR education in the creative industries.
(This article belongs to the Special Issue Biomimetic Innovations for Human–Machine Interaction)

20 pages, 1508 KB  
Article
Bidirectional Translation of ASL and English Using Machine Vision and CNN and Transformer Networks
by Stefanie Amiruzzaman, Md Amiruzzaman, Raga Mouni Batchu, James Dracup, Alexander Pham, Benjamin Crocker, Linh Ngo and M. Ali Akber Dewan
Computers 2026, 15(1), 20; https://doi.org/10.3390/computers15010020 - 4 Jan 2026
Viewed by 320
Abstract
This study presents a real-time, bidirectional system for translating American Sign Language (ASL) to and from English using computer vision and transformer-based models to enhance accessibility for deaf and hard of hearing users. Leveraging publicly available sign language and text–to-gloss datasets, the system integrates MediaPipe-based holistic landmark extraction with CNN- and transformer-based architectures to support translation across video, text, and speech modalities within a web-based interface. In the ASL-to-English direction, the sign-to-gloss model achieves a 25.17% word error rate (WER) on the RWTH-PHOENIX-Weather 2014T benchmark, which is competitive with recent continuous sign language recognition systems, and the gloss-level translation attains a ROUGE-L score of 79.89, indicating strong preservation of sign content and ordering. In the reverse English-to-ASL direction, the English-to-Gloss transformer trained on ASLG-PC12 achieves a ROUGE-L score of 96.00, demonstrating high-fidelity gloss sequence generation suitable for landmark-based ASL animation. These results highlight a favorable accuracy-efficiency trade-off achieved through compact model architectures and low-latency decoding, supporting practical real-time deployment.
(This article belongs to the Section AI-Driven Innovations)
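Both reported metrics operate on gloss token sequences: WER is edit distance normalized by reference length, and ROUGE-L is an F-score over the longest common subsequence. A self-contained sketch (the gloss strings are made up):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance over token lists by dynamic programming."""
    d = [[i + j if 0 in (i, j) else 0 for j in range(len(hyp) + 1)]
         for i in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1,
                          d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]))
    return d[-1][-1]

def wer(ref: str, hyp: str) -> float:
    r, h = ref.split(), hyp.split()
    return edit_distance(r, h) / len(r)

def rouge_l_f1(ref: str, hyp: str) -> float:
    r, h = ref.split(), hyp.split()
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]  # LCS table
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            dp[i][j] = (dp[i - 1][j - 1] + 1 if r[i - 1] == h[j - 1]
                        else max(dp[i - 1][j], dp[i][j - 1]))
    lcs = dp[-1][-1]
    prec, rec = lcs / len(h), lcs / len(r)
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

ref, hyp = "WEATHER TOMORROW RAIN", "TOMORROW RAIN"
print(wer(ref, hyp), round(rouge_l_f1(ref, hyp), 3))  # 0.333..., 0.8
```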

20 pages, 3425 KB  
Article
Sensing Through Tissues Using Diffuse Optical Imaging and Genetic Programming
by Ganesh M. Balasubramaniam, Ami Hauptman and Shlomi Arnon
Sensors 2026, 26(1), 318; https://doi.org/10.3390/s26010318 - 3 Jan 2026
Viewed by 456
Abstract
Diffuse optical imaging (DOI) uses scattered light to non-invasively sense and image highly diffuse media, including biological tissues such as the breast and brain. Despite its clinical potential, widespread adoption remains limited because physical constraints, limited available datasets, and conventional reconstruction algorithms struggle with the strongly nonlinear, ill-posed inverse problem posed by multiple photon scattering. We introduce Diffuse optical Imaging using Genetic Programming (DI-GP), a physics-guided and fully interpretable genetic programming framework for DOI. Grounded in the diffusion equation, DI-GP evolves closed-form symbolic mappings that enable fast and accurate 2-D reconstructions in strongly scattering media. Unlike deep neural networks, Genetic Programming (GP) naturally produces symbolic expressions, explicit rules, and transparent computational pipelines—an increasingly important capability as regulatory and high-stakes domains (e.g., FDA/EMA, medical imaging regulation) demand explainable and auditable AI systems, and where training data are often scarce. DI-GP delivers substantially faster inference and improved qualitative and quantitative reconstruction performance compared to analytical baselines. We validate the approach in both simulations and tabletop experiments, recovering targets without prior knowledge of shape or location at depths exceeding ~25 transport mean-free paths. Additional experiments demonstrate centimeter-scale imaging in tissue-like media, highlighting the promise of DI-GP for non-invasive deep-tissue imaging and its potential as a foundation for practical DOI systems.
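Unlike DI-GP's physics-guided setup, generic GP libraries already illustrate how evolution yields closed-form, inspectable expressions. A sketch with gplearn fitting a toy intensity-versus-distance curve (the forward model and all hyperparameters are assumptions, not the authors' configuration):

```python
import numpy as np
from gplearn.genetic import SymbolicRegressor  # generic GP library, not DI-GP

# Toy forward model standing in for diffuse light intensity vs. source-detector
# distance: y ~ exp(-mu * d) / d. With a rational function set, GP evolves an
# interpretable approximation of this curve.
rng = np.random.default_rng(0)
d = rng.uniform(1.0, 5.0, size=(200, 1))
y = np.exp(-0.8 * d[:, 0]) / d[:, 0] + rng.normal(0, 1e-3, 200)

gp = SymbolicRegressor(population_size=500, generations=15,
                       function_set=("add", "sub", "mul", "div"),
                       parsimony_coefficient=0.001, random_state=0)
gp.fit(d, y)
print(gp._program)  # evolved symbolic expression, inspectable unlike a neural net
```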
