Search Results (39)

Search Parameters:
Keywords = video-based language assessment

17 pages, 1603 KiB  
Perspective
A Perspective on Quality Evaluation for AI-Generated Videos
by Zhichao Zhang, Wei Sun and Guangtao Zhai
Sensors 2025, 25(15), 4668; https://doi.org/10.3390/s25154668 - 28 Jul 2025
Viewed by 188
Abstract
Recent breakthroughs in AI-generated content (AIGC) have transformed video creation, empowering systems to translate text, images, or audio into visually compelling stories. Yet reliable evaluation of these machine-crafted videos remains elusive because quality is governed not only by spatial fidelity within individual frames but also by temporal coherence across frames and precise semantic alignment with the intended message. Sensor technologies play a foundational role here, as they determine the physical plausibility of AIGC outputs. In this perspective, we argue that multimodal large language models (MLLMs) are poised to become the cornerstone of next-generation video quality assessment (VQA). By jointly encoding cues from multiple modalities such as vision, language, sound, and even depth, an MLLM can leverage its powerful language understanding capabilities to assess the quality of scene composition, motion dynamics, and narrative consistency, overcoming the fragmentation of hand-engineered metrics and the poor generalization of CNN-based methods. Furthermore, we provide a comprehensive analysis of current methodologies for assessing AIGC video quality, including the evolution of generation models, dataset design, quality dimensions, and evaluation frameworks. We argue that advances in sensor fusion enable MLLMs to combine low-level physical constraints with high-level semantic interpretations, further enhancing the accuracy of visual quality assessment.
(This article belongs to the Special Issue Perspectives in Intelligent Sensors and Sensing Systems)
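To make the MLLM-as-evaluator idea concrete, here is a minimal sketch that samples frames from a generated clip and asks a vision-language model for quality ratings. The model name ("gpt-4o"), prompt wording, and OpenAI-style API are illustrative assumptions, not the authors' evaluation system.

```python
# Hedged sketch of the "MLLM as video-quality judge" idea: sample a few frames
# from a generated clip and ask a vision-language model for quality ratings.
# Model name, prompt, and API shape are assumptions, not the paper's system.
import base64
import cv2
from openai import OpenAI

def sample_frames(path: str, n: int = 4) -> list[str]:
    """Grab n evenly spaced frames and return them as base64-encoded JPEGs."""
    cap = cv2.VideoCapture(path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for i in range(n):
        cap.set(cv2.CAP_PROP_POS_FRAMES, i * total // n)
        ok, frame = cap.read()
        if ok:
            _, buf = cv2.imencode(".jpg", frame)
            frames.append(base64.b64encode(buf.tobytes()).decode())
    cap.release()
    return frames

client = OpenAI()
content = [{"type": "text", "text": (
    "Rate this AI-generated video from 1-5 on (a) per-frame spatial fidelity, "
    "(b) temporal coherence across the ordered frames, and (c) alignment with "
    "the prompt 'a dog surfing at sunset'. Justify each score briefly.")}]
for b64 in sample_frames("generated_clip.mp4"):  # hypothetical file name
    content.append({"type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{b64}"}})

reply = client.chat.completions.create(
    model="gpt-4o", messages=[{"role": "user", "content": content}])
print(reply.choices[0].message.content)
```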

24 pages, 6881 KiB  
Article
Sign Language Anonymization: Face Swapping Versus Avatars
by Marina Perea-Trigo, Manuel Vázquez-Enríquez, Jose C. Benjumea-Bellot, Jose L. Alba-Castro and Juan A. Álvarez-García
Electronics 2025, 14(12), 2360; https://doi.org/10.3390/electronics14122360 - 9 Jun 2025
Viewed by 528
Abstract
The visual nature of Sign Language datasets raises privacy concerns that hinder data sharing, which is essential for advancing deep learning (DL) models in Sign Language recognition and translation. This study evaluated two anonymization techniques, realistic avatar synthesis and face swapping (FS), designed to anonymize signers' identities while preserving the semantic integrity of signed content. A novel metric, Identity Anonymization with Expressivity Preservation (IAEP), is introduced to assess the balance between effective anonymization and the preservation of the facial expressivity crucial for Sign Language communication. In addition, the quality evaluation included the LPIPS and FID metrics, which measure perceptual similarity and visual quality. A survey with deaf participants further complemented the analysis, providing valuable insight into the practical usability and comprehension of anonymized videos. The results show that while FS achieved acceptable anonymization and preserved semantic clarity, avatar-based anonymization struggled with comprehension. These findings highlight the need for further research on protecting privacy while preserving Sign Language understandability, both for dataset accessibility and for the anonymous participation of deaf people in digital content.
(This article belongs to the Special Issue Application of Machine Learning in Graphics and Images, 2nd Edition)
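As a concrete illustration of the perceptual-similarity evaluation the abstract mentions, the sketch below computes an LPIPS distance between an original and an anonymized frame using the open-source lpips package; the file paths are hypothetical placeholders.

```python
# Minimal sketch: LPIPS perceptual similarity between an original signer frame
# and its anonymized counterpart (pip install lpips). File names are
# hypothetical, not from the paper's dataset.
import lpips
import torch
import torchvision.transforms as T
from PIL import Image

loss_fn = lpips.LPIPS(net="alex")  # AlexNet backbone, the package default

to_tensor = T.Compose([
    T.Resize((256, 256)),
    T.ToTensor(),                              # scales to [0, 1]
    T.Normalize(mean=[0.5] * 3, std=[0.5] * 3),  # lpips expects [-1, 1]
])

original = to_tensor(Image.open("signer_frame.png").convert("RGB")).unsqueeze(0)
anonymized = to_tensor(Image.open("anonymized_frame.png").convert("RGB")).unsqueeze(0)

with torch.no_grad():
    distance = loss_fn(original, anonymized)  # lower = more perceptually similar
print(f"LPIPS distance: {distance.item():.4f}")
```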

33 pages, 2413 KiB  
Article
Synergizing STEM and ELA: Exploring How Small-Group Interactions Shape Design Decisions in an Engineering Design-Based Unit
by Deana M. Lucas, Emily M. Haluschak, Christine H. McDonnell, Siddika Selcen Guzey, Greg J. Strimel, Morgan M. Hynes and Tamara J. Moore
Educ. Sci. 2025, 15(6), 716; https://doi.org/10.3390/educsci15060716 - 7 Jun 2025
Viewed by 556
Abstract
While small-group learning through engineering design activities has been shown to enhance student achievement, motivation, and problem-solving skills, much of the existing research in this area focuses on undergraduate engineering education. Therefore, this study examines how small-group interactions influence design decisions within a sixth-grade engineering design-based English Language Arts unit for multilingual learners. Multilingual learners make up 21% of the U.S. school-aged population and benefit from early STEM opportunities that shape future educational and career trajectories. Grounded in constructivist learning theories, the research explores collaborative learning in the engineering design process, using a comparative case study design. Specifically, this study explores student interactions and group dynamics in two small groups (Group A and Group B) engaged in a board game design challenge incorporating microelectronics. Video recordings serve as the primary data source, allowing for an in-depth analysis of verbal and nonverbal interactions. The study employed Social Interdependence Theory to examine how group members collaborate, negotiate roles, and make design decisions. Themes such as positive interdependence, group accountability, promotive interaction, and individual responsibility are used to assess how cooperation influences final design choices. Three key themes emerged: Roles and Dynamics, Conflict, and Teacher Intervention. Group A and Group B exhibited distinct collaboration patterns, with Group A demonstrating stronger leadership dynamics that shaped decision-making, while Group B encountered challenges related to engagement and resource control. The results demonstrate the importance of small-group interactions in shaping design decisions and emphasize the role of group dynamics and teacher intervention in supporting multilingual learners' engagement and success in an integrated STEM curriculum.
(This article belongs to the Special Issue STEM Synergy: Advancing Integrated Approaches in Education)

36 pages, 2706 KiB  
Article
Towards Intelligent Assessment in Personalized Physiotherapy with Computer Vision
by Victor García and Olga C. Santos
Sensors 2025, 25(11), 3436; https://doi.org/10.3390/s25113436 - 29 May 2025
Viewed by 740
Abstract
Effective physiotherapy requires accurate and personalized assessments of patient mobility, yet traditional methods can be time-consuming and subjective. This study explores the potential of open-source computer vision algorithms, specifically YOLO Pose, to support automated, vision-based analysis in physiotherapy settings using information collected from optical sensors such as cameras. By extracting skeletal data from video input, the system enables objective evaluation of patient movements and rehabilitation progress. The visual information is then analyzed to propose a semantic framework that facilitates a structured interpretation of clinical parameters. Preliminary results indicate that YOLO Pose provides reliable pose estimation, offering a solid foundation for future enhancements, such as the integration of natural language processing (NLP) to improve patient interaction through empathetic, AI-driven support.
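For readers unfamiliar with YOLO Pose, the following sketch shows how per-frame skeletal keypoints of the kind the study relies on can be extracted with the open-source ultralytics package; the checkpoint name and video path are illustrative, not the authors' exact setup.

```python
# Minimal sketch: extracting skeletal keypoints from a physiotherapy video
# with a pretrained YOLO pose model (pip install ultralytics). The weights
# file and video path are illustrative assumptions.
from ultralytics import YOLO

model = YOLO("yolov8n-pose.pt")  # small pretrained pose-estimation checkpoint

# stream=True processes the video frame by frame instead of loading it whole
for result in model("patient_session.mp4", stream=True):
    if result.keypoints is None:
        continue
    # keypoints.xy: (num_persons, 17, 2) pixel coordinates in COCO order
    for person in result.keypoints.xy:
        left_shoulder, right_shoulder = person[5], person[6]  # COCO indices
        # Downstream analysis (joint angles, range of motion, symmetry)
        # would consume these coordinates as objective mobility measures.
        print("shoulder span (px):",
              float((right_shoulder - left_shoulder).norm()))
```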

32 pages, 806 KiB  
Systematic Review
Safety and Efficacy of Different Therapeutic Interventions for Primary Progressive Aphasia: A Systematic Review
by Abdulrahim Saleh Alrasheed, Reem Ali Alshamrani, Abdullah Ali Al Ameer, Reham Mohammed Alkahtani, Noor Mohammad AlMohish, Mustafa Ahmed AlQarni and Majed Mohammad Alabdali
J. Clin. Med. 2025, 14(9), 3063; https://doi.org/10.3390/jcm14093063 - 29 Apr 2025
Viewed by 1293
Abstract
Background: Primary progressive aphasia (PPA) is a neurodegenerative disorder that worsens over time without appropriate treatment. Although referral to a speech and language pathologist is essential for diagnosing language deficits and developing effective treatment plans, there is no scientific consensus regarding the most effective treatment. Thus, our study aims to assess the efficacy and safety of various therapeutic interventions for PPA. Methods: Google Scholar, PubMed, Web of Science, and the Cochrane Library databases were systematically searched to identify articles assessing different therapeutic interventions for PPA. To ensure comprehensive coverage, the search strategy employed specific medical subject headings. The primary outcome measure was language gain; the secondary outcome assessed overall therapeutic effects. Data on study characteristics, patient demographics, PPA subtypes, therapeutic modalities, and treatment patterns were collected. Results: Fifty-seven studies with 655 patients were included. For naming and word finding, errorless learning therapy, lexical retrieval cascade (LRC), semantic feature training, smartphone-based cognitive therapy, picture-naming therapy, and repetitive transcranial magnetic stimulation (rTMS) maintained effects for up to six months. rTMS, video-implemented script training for aphasia (VISTA), and structured oral reading therapy improved speech fluency. Transcranial stimulation alone enhanced auditory verbal comprehension, whereas transcranial direct current stimulation (tDCS) combined with language or cognitive therapy improved repetition abilities. Phonological and orthographic treatments improved reading accuracy across PPA subtypes. tDCS combined with speech therapy enhanced mini-mental state examination (MMSE) scores and cognitive function. Several therapies, including smartphone-based cognitive therapy and VISTA therapy, demonstrated sustained language improvements over six months. Conclusions: Various therapeutic interventions offer potential benefits for individuals with PPA. However, due to heterogeneity in study designs and administration methods, small sample sizes, and the lack of standardized measurement methods, drawing a firm conclusion is difficult. Further studies are warranted to establish evidence-based treatment protocols.

22 pages, 5937 KiB  
Article
Uncrewed Aerial Vehicle-Based Automatic System for Seat Belt Compliance Detection at Stop-Controlled Intersections
by Gideon Asare Owusu, Ashutosh Dumka, Adu-Gyamfi Kojo, Enoch Kwasi Asante, Rishabh Jain, Skylar Knickerbocker, Neal Hawkins and Anuj Sharma
Remote Sens. 2025, 17(9), 1527; https://doi.org/10.3390/rs17091527 - 25 Apr 2025
Viewed by 579
Abstract
Transportation agencies often rely on manual surveys to monitor seat belt compliance; however, these methods are limited by surveyor fatigue, reduced visibility due to tinted windows or low lighting, and restricted geographic coverage, making manual surveys prone to errors and unrepresentative of the broader driving population. This paper presents an automated seat belt detection system leveraging the YOLO11 neural network on video footage captured by a tethered uncrewed aerial vehicle (UAV). The objectives are to (1) develop a robust system for detecting seat belt use at stop-controlled intersections, (2) evaluate factors affecting detection accuracy, and (3) demonstrate the potential of UAV-based compliance monitoring. The model was tested in real-world scenarios at a single-lane and a complex multi-lane stop-controlled intersection in Iowa. Three studies examined key factors influencing detection accuracy: (i) seat belt–shirt color contrast, (ii) sunlight direction, and (iii) vehicle type. System performance was compared against manual video review and large language model (LLM)-assisted analysis, with assessments focused on accuracy, resource requirements, and computational efficiency. The model achieved a mean average precision (mAP) of 0.902, maintained high accuracy across the three studies, and outperformed manual methods in reliability and efficiency while offering a scalable, cost-effective alternative to LLM-based solutions.
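The detection loop at the core of such a system might look like the sketch below, which runs a YOLO11-family model over UAV footage via the ultralytics package; the custom weights file and class labels are hypothetical stand-ins for the authors' trained model.

```python
# Hedged sketch: running a fine-tuned YOLO11 detector over tethered-UAV
# footage to flag seat belt use (pip install ultralytics). The checkpoint
# name, video path, and class labels are assumptions, not the paper's.
from ultralytics import YOLO

model = YOLO("seatbelt_yolo11.pt")  # assumed custom-trained checkpoint

# conf=0.5 drops low-confidence boxes; stream=True iterates frame by frame
for result in model("intersection_uav.mp4", stream=True, conf=0.5):
    for box in result.boxes:
        label = model.names[int(box.cls)]  # e.g., "belted" / "unbelted"
        print(f"{label}: confidence {float(box.conf):.2f}")
```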

28 pages, 3886 KiB  
Article
Assessment and Improvement of Avatar-Based Learning System: From Linguistic Structure Alignment to Sentiment-Driven Expressions
by Aru Ukenova, Gulmira Bekmanova, Nazar Zaki, Meiram Kikimbayev and Mamyr Altaibek
Sensors 2025, 25(6), 1921; https://doi.org/10.3390/s25061921 - 19 Mar 2025
Viewed by 1013
Abstract
This research investigates the improvement of learning systems that utilize avatars by shifting from elementary language compatibility to emotion-driven interactions. An assessment of various instructional approaches indicated marked differences in overall effectiveness, with the system showing steady but slight improvements and little variation, suggesting its potential for consistent use. Analysis through one-way ANOVA identified noteworthy disparities in post-test results across different teaching strategies; however, pairwise comparisons with Tukey's HSD did not reveal significant group differences. Within-group variation and limited sample sizes likely reduced statistical power. Effect-size evaluation showed that the traditional approach had an edge over the avatar-based method, with video-recorded lessons displaying more moderate differences. The novelty of the system might account for its initially lower effectiveness, as students may need time to adjust. Participants emphasized the importance of emotional authenticity and cultural adaptation, including incorporating a Kazakh accent, to boost the system's success. In response, the system was designed with sentiment-driven gestures and facial expressions to improve engagement and personalization. These findings show the potential of emotionally intelligent avatars to foster deeper learning experiences and underscore the importance of fine-tuning the system for widespread adoption in a modern educational context.
(This article belongs to the Section Sensing and Imaging)
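The statistical pipeline the abstract describes, a one-way ANOVA omnibus test followed by Tukey's HSD pairwise comparisons, can be reproduced in a few lines of Python; the scores below are placeholder data for illustration, not the study's results.

```python
# Sketch of the reported analysis: one-way ANOVA across teaching conditions,
# then Tukey's HSD post hoc comparisons. Scores are placeholder data only.
import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

traditional = np.array([78, 82, 75, 88, 80, 73])
video_lesson = np.array([74, 79, 71, 83, 77, 70])
avatar_based = np.array([70, 76, 68, 81, 72, 66])

# Omnibus test: do the condition means differ at all?
f_stat, p_value = stats.f_oneway(traditional, video_lesson, avatar_based)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")

# Post hoc: which pairs differ? (can be non-significant even when the
# omnibus ANOVA is, as the study observed)
scores = np.concatenate([traditional, video_lesson, avatar_based])
groups = ["traditional"] * 6 + ["video"] * 6 + ["avatar"] * 6
print(pairwise_tukeyhsd(scores, groups, alpha=0.05))
```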

12 pages, 201 KiB  
Review
Advances in Autism Spectrum Disorder (ASD) Diagnostics: From Theoretical Frameworks to AI-Driven Innovations
by Christine K. Syriopoulou-Delli
Electronics 2025, 14(5), 951; https://doi.org/10.3390/electronics14050951 - 27 Feb 2025
Cited by 3 | Viewed by 2698
Abstract
This study provides a comprehensive analysis of the evolution of Autism Spectrum Disorder (ASD) diagnostics, tracing its progression from psychoanalytic origins to the integration of advanced artificial intelligence (AI) technologies. Drawing on scientific databases such as PubMed, Scopus, and Google Scholar, the study explores how theoretical frameworks, including psychoanalysis, behavioral psychology, cognitive development, and neurobiological paradigms, have shaped diagnostic methodologies over time. Each paradigm's associated assessment tools, such as the Autism Diagnostic Observation Schedule (ADOS) and the Vineland Adaptive Behavior Scales, are discussed in relation to their scientific advancements and limitations. Emerging technologies, particularly AI, are highlighted for their transformative impact on ASD diagnostics. The application of AI in areas such as video analysis, Natural Language Processing (NLP), and biodata integration demonstrates significant progress in precision, accessibility, and inclusivity. Ethical considerations, including algorithmic transparency, data security, and inclusivity for underrepresented populations, are critically examined alongside the challenges of scalability and equitable implementation. Additionally, neurodiversity-informed approaches are emphasized for their role in reframing autism as a natural variation of human cognition and behavior, advocating for strength-based, inclusive diagnostic frameworks. This synthesis underscores the interplay between evolving theoretical models, technological advancements, and the growing focus on compassionate, equitable diagnostic practices. It concludes by advocating for continued innovation, interdisciplinary collaboration, and ethical oversight to further refine ASD diagnostics and improve outcomes for individuals across the autism spectrum.
(This article belongs to the Special Issue Robotics: From Technologies to Applications)
18 pages, 12629 KiB  
Article
Leveraging AI-Generated Virtual Speakers to Enhance Multilingual E-Learning Experiences
by Sergio Miranda and Rosa Vegliante
Information 2025, 16(2), 132; https://doi.org/10.3390/info16020132 - 11 Feb 2025
Cited by 2 | Viewed by 1252
Abstract
The growing demand for accessible and effective e-learning platforms has led to an increased focus on innovative solutions to address the challenges posed by the diverse linguistic backgrounds of learners. This paper explores the use of AI-generated virtual speakers to enhance multilingual e-learning experiences. The study employs a system developed with Google Sheets and Google Apps Script to create and manage multilingual courses, integrating AI-powered virtual speakers to deliver content in learners' native languages. The e-learning platform used is a customized Moodle, and three courses were developed for a European ERASMUS+ project: "Mental Wellbeing in Mining", "Rescue in the Mine", and "Risk Assessment". The study involved 147 participants from various educational and professional backgrounds. The main findings indicate that AI-generated virtual speakers significantly improve the accessibility of e-learning content. Participants preferred content in their native language and found AI-generated videos effective and engaging. The study concludes that AI-generated virtual speakers offer a promising approach to overcoming linguistic barriers in e-learning, providing personalized and adaptive learning experiences. Future research should focus on addressing ethical considerations, such as data privacy and algorithmic bias, and expanding the user base to include more languages and proficiency levels.
(This article belongs to the Special Issue Advancing Educational Innovation with Artificial Intelligence)

15 pages, 2004 KiB  
Article
YouTube as a Source of Information for Dietary Guidance and Advisory Content in the Management of Non-Alcoholic Fatty Liver Disease
by Kagan Tur
Healthcare 2025, 13(4), 351; https://doi.org/10.3390/healthcare13040351 - 7 Feb 2025
Viewed by 892
Abstract
Background/Objectives: Fatty liver disease (FLD), particularly non-alcoholic fatty liver disease (NAFLD), is a growing global health concern that underscores the need for effective dietary management strategies. With over 25% of patients seeking dietary advice through platforms like YouTube, the quality and reliability of this information remain critical. However, the disparity in educational value and engagement metrics between professional and non-professional content remains underexplored. This study evaluates YouTube's role in disseminating dietary advice for FLD management, focusing on content reliability, engagement metrics, and the educational value of videos. Methods: This cross-sectional study systematically analyzed 183 YouTube videos on FLD and dietary advice. Videos were selected based on relevance, English language, and non-promotional content. Scoring systems, including DISCERN, Global Quality Score (GQS), and the Video Information and Quality Index (VIQI), were employed to assess reliability, quality, and educational value. Engagement metrics such as views, likes, dislikes, and interaction rates were analyzed across uploader categories, including healthcare professionals, patients, and undefined sources. Results: Videos uploaded by healthcare professionals demonstrated significantly higher DISCERN scores (4.2 ± 0.8) and GQS ratings (4.1 ± 0.6) compared to patient-generated content (DISCERN: 2.8 ± 0.9; GQS: 3.0 ± 0.7). However, patient-generated videos achieved higher engagement rates, with median views reaching 340,000 (IQR: 15,000–1,000,000) compared to 450,050 (IQR: 23,000–1,800,000) for professional videos. Nutritional recommendations spanned diverse approaches, including low-carb diets, Mediterranean diets, and guidance to avoid processed foods and sugars. A significant proportion of videos lacked evidence-based content, particularly among non-professional uploads. Conclusions: YouTube represents a widely accessed but inconsistent source of dietary advice for FLD. While healthcare professional videos exhibit higher reliability and educational value, patient-generated content achieves broader engagement, revealing a critical gap in trusted, accessible dietary guidance. These findings highlight the need for clinicians and content creators to collaborate in curating and disseminating evidence-based content, ensuring patients receive accurate, actionable advice for managing FLD.

31 pages, 4490 KiB  
Review
Uncovering Research Trends on Artificial Intelligence Risk Assessment in Businesses: A State-of-the-Art Perspective Using Bibliometric Analysis
by Juan Carlos Muria-Tarazón, Juan Vicente Oltra-Gutiérrez, Raúl Oltra-Badenes and Santiago Escobar-Román
Appl. Sci. 2025, 15(3), 1412; https://doi.org/10.3390/app15031412 - 30 Jan 2025
Viewed by 1721
Abstract
This paper presents a quantitative vision of the study of artificial intelligence risk assessment in business, based on a bibliometric analysis of the most relevant publications. The main goal is to determine whether the risk assessment of artificial intelligence systems used in businesses is really a subject of increasing interest and to identify the most influential and productive sources of scientific research in this area. Data were collected from the Web of Science Core Collection, one of the most complete and prestigious databases. Regarding the temporal evolution of publications and citations, this study evidences rapid growth in the number of publications (a compound annual growth rate of 31.20% from 2018 to 2024 inclusive), showing the subject's strong attraction for researchers and responding to the need for systematic risk assessment processes in organizations that use AI, in order to mitigate potential harms, ensure regulatory compliance, and enhance trust in and adoption of AI systems. Especially since the surge of large language models like ChatGPT or Gemini, AI has been revolutionizing the dynamics of human–computer interaction using natural language, video, and audio. However, as the scientific community initiates rigorous studies on AI risk assessment within organizational contexts, it is imperative to consider critical issues such as data privacy, ethics, bias, and hallucinations to ensure the successful integration and interaction of AI systems with human operators. Furthermore, this paper constitutes a starting point for any researcher new to this topic, indicating new challenges to be addressed by researchers interested in AI, along with the most relevant literature, authors, and journals in this research area.
(This article belongs to the Special Issue Emerging Technologies of Human-Computer Interaction)
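For context, the quoted growth rate follows the standard compound-annual-growth formula over the six annual intervals between 2018 and 2024, where n_2018 and n_2024 stand in for the yearly publication counts in the study's dataset:

```latex
% CAGR over the 2018--2024 window (six annual intervals); n_{2018} and
% n_{2024} denote yearly publication counts from the study's dataset.
\mathrm{CAGR} = \left( \frac{n_{2024}}{n_{2018}} \right)^{1/6} - 1 = 0.3120
\quad \Longrightarrow \quad
n_{2024} = n_{2018} \times 1.3120^{6} \approx 5.1\, n_{2018}
```

That is, a 31.20% compound annual rate implies publication volume roughly quintupled over the window.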

17 pages, 248 KiB  
Article
“Video Killed the Radio Star”: Transitioning from an Audio- to a Video-Based Exam in Hungarian Language Classes for International Medical Students
by Gabriella Hild, Anna Dávidovics, Vilmos Warta and Timea Németh
Educ. Sci. 2025, 15(2), 161; https://doi.org/10.3390/educsci15020161 - 28 Jan 2025
Viewed by 1059
Abstract
This action research examines the transition from audio- to video-based tasks in the final Medical Hungarian exam for international medical students, aiming to better align assessment with real-life language needs and enhance student motivation. Conducted at a Hungarian medical university with 61 second-year students, the study uses a mixed-methods approach. Quantitative data from a questionnaire and qualitative insights from focus group interviews reveal students' experiences with the video-based exam tasks and preparatory materials. The results indicate a positive reception of the Practice Test Book and the new video exam format, with visual cues like body language aiding in comprehension and engagement. Students found that the video-based tasks closely mirrored clinical interactions, strengthening the relevance of language skills in professional contexts. Preparatory materials, including lead-in exercises, were well-received by students and seen as effective in improving readiness for the exam. The study suggests that the shift from audio- to video-based assessment can bridge classroom learning with real-world application, potentially serving as a model for other non-traditional study abroad settings in Languages Other Than English (LOTEs), especially as purely audio-based communication has become less prevalent in today's world.
28 pages, 1980 KiB  
Article
Developing and Validating a Video-Based Measurement Instrument for Assessing Teachers’ Professional Vision of Language-Stimulation Interactions in the ECE Classroom
by Lien Dorme, Anne-Lotte Stevens, Wendelien Vantieghem, Kris Van den Branden and Ruben Vanderlinde
Educ. Sci. 2025, 15(2), 155; https://doi.org/10.3390/educsci15020155 - 26 Jan 2025
Viewed by 1503
Abstract
This study reports on the development and validation of a video-based instrument to assess early childhood education (ECE) teachers' professional vision (PV) of language-stimulation (LS) interactions. PV refers to noticing and reasoning about key classroom interactions, a skill that can be trained and that distinguishes experts from novices. The instrument targets the PV of three LS strategies: language input (LI), opportunities for language production (OLP), and feedback (FB). The instrument measures noticing through comparative judgement (CJ) and reasoning through multiple-choice items. Construct validity was assessed following the AERA framework with three samples: professionals (n = 22), pre-service teachers (n = 107), and a mixed sample of in- and pre-service teachers (n = 6). Reliability and validity were confirmed, with strong reliability scores for the CJ aggregated "master" rank orders (SRR: 0.827–0.866). Think-aloud procedures demonstrated that respondents' decisions during CJ were mainly based on LS-relevant video features; decisions unrelated to LS require further study. Multiple-choice reasoning items were developed from professionals' open-ended feedback. Pre-service teachers' reasoning scores showed no significant predictors. Using real classroom videos, this instrument provides an ecologically valid, scalable tool for assessing teachers' professional vision of LS interactions and offers a foundation for professional development programs aimed at addressing the theory–practice gap in early language education.
(This article belongs to the Special Issue Enhancing the Power of Video in Teacher Education)

27 pages, 2436 KiB  
Article
Seeing the Sound: Multilingual Lip Sync for Real-Time Face-to-Face Translation
by Amirkia Rafiei Oskooei, Mehmet S. Aktaş and Mustafa Keleş
Computers 2025, 14(1), 7; https://doi.org/10.3390/computers14010007 - 28 Dec 2024
Cited by 3 | Viewed by 4155
Abstract
Imagine a future where language is no longer a barrier to real-time conversations, enabling instant and lifelike communication across the globe. As cultural boundaries blur, the demand for seamless multilingual communication has become a critical technological challenge. This paper addresses the lack of robust solutions for real-time face-to-face translation, particularly for low-resource languages, by introducing a comprehensive framework that not only translates language but also replicates voice nuances and synchronized facial expressions. Our research tackles the primary challenge of achieving accurate lip synchronization across culturally diverse languages, filling a significant gap in the literature by evaluating the generalizability of lip sync models beyond English. Specifically, we develop a novel evaluation framework combining quantitative lip sync error metrics and qualitative assessments by human observers. This framework is applied to assess two state-of-the-art lip sync models with different architectures for Turkish, Persian, and Arabic, using a newly collected dataset. Based on these findings, we propose and implement a modular system that integrates language-agnostic lip sync models with neural networks to deliver a fully functional face-to-face translation experience. Inference-time analysis shows this system achieves highly realistic, face-translated talking heads in real time, with an inference time as low as 0.381 s. This transformative framework is primed for deployment in immersive environments such as VR/AR, Metaverse ecosystems, and advanced video conferencing platforms. It offers substantial benefits to developers and businesses aiming to build next-generation multilingual communication systems for diverse applications. While this work focuses on three languages, its modular design allows scalability to additional languages. However, further testing in broader linguistic and cultural contexts is required to confirm its universal applicability, paving the way for a more interconnected and inclusive world where language ceases to hinder human connection.
(This article belongs to the Special Issue Computational Science and Its Applications 2024 (ICCSA 2024))

14 pages, 3115 KiB  
Article
Improving Web Readability Using Video Content: A Relevance-Based Approach
by Ehsan Elahi, Jorge Morato and Ana Iglesias
Appl. Sci. 2024, 14(23), 11055; https://doi.org/10.3390/app142311055 - 27 Nov 2024
Viewed by 1210
Abstract
With the increasing integration of multimedia elements into webpages, videos have emerged as a popular medium for enhancing user engagement and knowledge retention. However, irrelevant or poorly placed videos can hinder readability and distract users from the core content of a webpage. This paper proposes a novel approach leveraging natural language processing (NLP) techniques to assess the relevance of video content on educational websites, thereby enhancing readability and user engagement. By using a cosine similarity-based relevance scoring method, we measured the alignment between video transcripts and webpage text, aiming to improve the user's comprehension of complex topics presented on educational platforms. Our results demonstrated a strong correlation between automated relevance scores and user ratings, with an improvement of over 35% in relevance alignment. The methodology was evaluated across 50 educational websites representing diverse subjects, including science, mathematics, and language learning. We conducted a two-phase evaluation process: an automated scoring phase using cosine similarity, followed by a user study with 100 participants who rated the relevance of videos to webpage content. The findings support the significance of integrating NLP-driven video relevance assessments for enhanced readability on educational websites, highlighting the potential for broader applications in e-learning.
(This article belongs to the Special Issue AI Horizons: Present Status and Visions for the Next Era)
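A minimal sketch of the cosine-similarity relevance scoring the abstract describes: TF-IDF vectors for the webpage text and a video transcript, compared with scikit-learn. The example strings and preprocessing are illustrative and may differ from the paper's pipeline.

```python
# Sketch of cosine-similarity relevance scoring between a webpage's text and
# a video transcript (pip install scikit-learn). Example strings are
# illustrative placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

webpage_text = "Photosynthesis converts light energy into chemical energy ..."
video_transcript = "In this video we explore how plants turn sunlight into sugar ..."

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform([webpage_text, video_transcript])

# 0 = no shared vocabulary, 1 = identical term distribution
relevance = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
print(f"Relevance score: {relevance:.3f}")
```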
