Search Results (1,191)

Search Parameters:
Keywords = human–AI systems

16 pages, 972 KiB  
Article
Automated Assessment of Word- and Sentence-Level Speech Intelligibility in Developmental Motor Speech Disorders: A Cross-Linguistic Investigation
by Micalle Carl and Michal Icht
Diagnostics 2025, 15(15), 1892; https://doi.org/10.3390/diagnostics15151892 - 28 Jul 2025
Abstract
Background/Objectives: Accurate assessment of speech intelligibility is necessary for individuals with motor speech disorders. Transcription or scaled rating methods by naïve listeners are the most reliable tasks for these purposes; however, they are often resource-intensive and time-consuming within clinical contexts. Automatic speech recognition (ASR) systems, which transcribe speech into text, have been increasingly utilized for assessing speech intelligibility. This study investigates the feasibility of using an open-source ASR system to assess speech intelligibility in Hebrew and English speakers with Down syndrome (DS). Methods: Recordings from 65 Hebrew- and English-speaking participants were included: 33 speakers with DS and 32 typically developing (TD) peers. Speech samples (words, sentences) were transcribed using Whisper (OpenAI) and by naïve listeners. The proportion of agreement between ASR transcriptions and those of naïve listeners was compared across speaker groups (TD, DS) and languages (Hebrew, English) for word-level data. Further comparisons for Hebrew speakers were conducted across speaker groups and stimuli (words, sentences). Results: The strength of the correlation between listener and ASR transcription scores varied across languages, and was higher for English (r = 0.98) than for Hebrew (r = 0.81) for speakers with DS. A higher proportion of listener–ASR agreement was demonstrated for TD speakers, as compared to those with DS (0.94 vs. 0.74, respectively), and for English, in comparison to Hebrew speakers (0.91 for English DS speakers vs. 0.74 for Hebrew DS speakers). Listener–ASR agreement for single words was consistently higher than for sentences among Hebrew speakers. Speakers’ intelligibility influenced word-level agreement among Hebrew- but not English-speaking participants with DS. 
Conclusions: ASR performance for English closely approximated that of naïve listeners, suggesting potential near-future clinical applicability within single-word intelligibility assessment. In contrast, a lower proportion of agreement between human listeners and ASR for Hebrew speech indicates that broader clinical implementation may require further training of ASR models in this language. Full article
(This article belongs to the Special Issue Evaluation and Management of Developmental Disabilities)
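The word-level listener–ASR agreement reported above can be illustrated with a minimal sketch (not the authors' actual pipeline): the proportion of listener-transcribed words that an ASR hypothesis reproduces, here via a longest-matching-subsequence word alignment. The transcripts are invented for illustration.

```python
from difflib import SequenceMatcher

def word_agreement(listener: str, asr: str) -> float:
    """Proportion of listener-transcribed words that the ASR hypothesis
    reproduces, via a longest-matching-subsequence word alignment."""
    ref = listener.lower().split()
    hyp = asr.lower().split()
    if not ref:
        return 0.0
    matched = sum(b.size for b in SequenceMatcher(None, ref, hyp).get_matching_blocks())
    return matched / len(ref)

# Identical transcripts agree perfectly; one substituted word drops one of five.
print(word_agreement("the big dog ran home", "the big dog ran home"))   # → 1.0
print(word_agreement("the big dog ran home", "the big frog ran home"))  # → 0.8
```

Averaging this score over a speaker's word or sentence stimuli yields the kind of per-group agreement proportions compared in the abstract.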
42 pages, 1131 KiB  
Article
A Hybrid Human-AI Model for Enhanced Automated Vulnerability Scoring in Modern Vehicle Sensor Systems
by Mohamed Sayed Farghaly, Heba Kamal Aslan and Islam Tharwat Abdel Halim
Future Internet 2025, 17(8), 339; https://doi.org/10.3390/fi17080339 - 28 Jul 2025
Abstract
Modern vehicles are rapidly transforming into interconnected cyber–physical systems that rely on advanced sensor technologies and pervasive connectivity to support autonomous functionality. Yet, despite this evolution, standardized methods for quantifying cybersecurity vulnerabilities across critical automotive components remain scarce. This paper introduces a novel hybrid model that integrates expert-driven insights with generative AI tools to adapt and extend the Common Vulnerability Scoring System (CVSS) specifically for autonomous vehicle sensor systems. Following a three-phase methodology, the study conducted a systematic review of 16 peer-reviewed sources (2018–2024), applied CVSS version 4.0 scoring to 15 representative attack types, and evaluated four freely accessible generative AI models—ChatGPT, DeepSeek, Gemini, and Copilot—on a dataset of 117 annotated automotive-related vulnerabilities. Expert validation from 10 domain professionals reveals that Light Detection and Ranging (LiDAR) sensors are the most vulnerable (9 distinct attack types), followed by Radio Detection And Ranging (radar) (8) and ultrasonic (6). Network-based attacks dominate (104 of 117 cases), with 92.3% of the dataset exhibiting low attack complexity and 82.9% requiring no user interaction. The most severe attack vectors, as scored by experts using CVSS, include eavesdropping (7.19), Sybil attacks (6.76), and replay attacks (6.35). Evaluation of large language models (LLMs) showed that DeepSeek achieved an F1 score of 99.07% on network-based attacks, while all models struggled with minority classes such as high complexity (e.g., ChatGPT F1 = 0%, Gemini F1 = 15.38%). 
The findings highlight the potential of integrating expert insight with AI efficiency to deliver more scalable and accurate vulnerability assessments for modern vehicular systems. This study offers actionable insights for vehicle manufacturers and cybersecurity practitioners, aiming to inform strategic efforts to fortify sensor integrity, optimize network resilience, and ultimately enhance the cybersecurity posture of next-generation autonomous vehicles. Full article
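The minority-class F1 collapse reported above (e.g., ChatGPT F1 = 0% on high-complexity cases) is easy to reproduce on toy data. The sketch below, with invented labels, shows how per-class F1 drops to zero when a model never predicts the rare class even though the majority-class F1 stays high; it is an illustration, not the paper's evaluation code.

```python
def f1_for_class(y_true, y_pred, label):
    """Per-class F1: harmonic mean of precision and recall for one label."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives: precision/recall (and F1) are zero
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Invented labels: 9 low-complexity cases, 1 high-complexity case the model misses.
truth = ["low"] * 9 + ["high"]
preds = ["low"] * 10
print(f1_for_class(truth, preds, "high"))           # → 0.0
print(round(f1_for_class(truth, preds, "low"), 3))  # → 0.947
```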
27 pages, 27337 KiB  
Article
Gest-SAR: A Gesture-Controlled Spatial AR System for Interactive Manual Assembly Guidance with Real-Time Operational Feedback
by Naimul Hasan and Bugra Alkan
Machines 2025, 13(8), 658; https://doi.org/10.3390/machines13080658 - 27 Jul 2025
Abstract
Manual assembly remains essential in modern manufacturing, yet the increasing complexity of customised production imposes significant cognitive burdens and error rates on workers. Existing Spatial Augmented Reality (SAR) systems often operate passively, lacking adaptive interaction, real-time feedback, and gesture-based control. In response, we present Gest-SAR, a SAR framework that integrates a custom MediaPipe-based gesture classification model to deliver adaptive light-guided pick-to-place assembly instructions and real-time error feedback within a closed-loop interaction instance. In a within-subject study, ten participants completed standardised Duplo-based assembly tasks using Gest-SAR, paper-based manuals, and tablet-based instructions; performance was evaluated via assembly cycle time, selection and placement error rates, cognitive workload assessed by NASA-TLX, and usability via post-experiment questionnaires. Quantitative results demonstrate that Gest-SAR significantly reduces cycle times (Mean = 3.95 min) compared to Paper (Mean = 7.89 min, p < 0.01) and Tablet (Mean = 6.99 min, p < 0.01). It also achieved sevenfold lower average error rates while lowering perceived cognitive workload (p < 0.05 for mental demand) compared to conventional modalities. In total, 90% of users preferred SAR over the paper and tablet modalities. These outcomes indicate that natural hand-gesture interaction coupled with real-time visual feedback enhances both the efficiency and accuracy of manual assembly. By embedding AI-driven gesture recognition and AR projection into a human-centric assistance system, Gest-SAR advances the collaborative interplay between humans and machines, aligning with Industry 5.0 objectives of resilient, sustainable, and intelligent manufacturing. Full article
(This article belongs to the Special Issue AI-Integrated Advanced Robotics Towards Industry 5.0)
29 pages, 429 KiB  
Article
Matching Game Preferences Through Dialogical Large Language Models: A Perspective
by Renaud Fabre, Daniel Egret and Patrice Bellot
Appl. Sci. 2025, 15(15), 8307; https://doi.org/10.3390/app15158307 - 25 Jul 2025
Abstract
This perspective paper explores the future potential of “conversational intelligence” by examining how Large Language Models (LLMs) could be combined with GRAPHYP’s network system to better understand human conversations and preferences. Using recent research and case studies, we propose a conceptual framework that could make AI reasoning transparent and traceable, allowing humans to see and understand how AI reaches its conclusions. We present the conceptual perspective of “Matching Game Preferences through Dialogical Large Language Models (D-LLMs),” a proposed system that would allow multiple users to share their different preferences through structured conversations. This approach envisions personalizing LLMs by embedding individual user preferences directly into how the model makes decisions. The proposed D-LLM framework would require three main components: (1) reasoning processes that could analyze different search experiences and guide performance, (2) classification systems that would identify user preference patterns, and (3) dialogue approaches that could help humans resolve conflicting information. This perspective framework aims to create an interpretable AI system where users could examine, understand, and combine the different human preferences that influence AI responses, detected through GRAPHYP’s search experience networks. The goal of this perspective is to envision AI systems that would not only provide answers but also show users how those answers were reached, making artificial intelligence more transparent and trustworthy for human decision-making. Full article
26 pages, 673 KiB  
Article
Mathematical Modeling and Structural Equation Analysis of Acceptance Behavior Intention to AI Medical Diagnosis Systems
by Kai-Chao Yao and Sumei Chiang
Mathematics 2025, 13(15), 2390; https://doi.org/10.3390/math13152390 - 25 Jul 2025
Abstract
This study builds on Davis’ TAM by integrating environmental and psychological variables relevant to AI medical diagnostics. This study developed a mathematical theoretical model called the “AI medical diagnosis-acceptance evaluation model” (AMD-AEM) to better understand acceptance behavior intention. Using mathematical modeling, we established reflective measurement model indicators and structural equation relationships, where linear structural equations illustrate the interactions among latent variables. In 2025, we collected empirical data from 2380 patients and medical staff who have experience with AI diagnostic systems in teaching hospitals in central Taiwan. Smart PLS 3 was employed to validate the AMD-AEM model. The results reveal that perceived usefulness (PU) and information quality (IQ) are the primary predictors of acceptance behavior intention (ABI). Additionally, perceived ease of use (PE) indirectly influences ABI through PU and attitude toward use (ATU). AI emotional perception (AEP) notably shows a significant positive relationship with ATU, highlighting that warm and positive human–AI interactions are crucial for user acceptance. IQ was identified as a mediating variable, with variance accounted for (VAF) coefficient analysis confirming its complete mediation effect on the path from ATU to ABI. This indicates that information quality enhances user attitudes and directly increases acceptance behavior intention. The AMD-AEM model demonstrates an excellent fit, providing valuable insights for academia and the healthcare industry. Full article
(This article belongs to the Special Issue Statistical Analysis: Theory, Methods and Applications)
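The VAF coefficient used above is conventionally computed as the indirect effect of a mediated path divided by the total effect, with values above roughly 0.8 commonly read as full mediation in the PLS-SEM literature. A minimal sketch with invented path coefficients (not the study's estimates):

```python
def vaf(a: float, b: float, c_prime: float) -> float:
    """Variance Accounted For: indirect effect (a*b) over total effect (a*b + c'),
    where a and b are the mediated path coefficients and c' is the direct path."""
    indirect = a * b
    return indirect / (indirect + c_prime)

# Invented coefficients: with the direct path c' at zero, VAF = 1 (complete mediation).
print(vaf(0.5, 0.6, 0.0))  # → 1.0
print(vaf(0.4, 0.5, 0.2))  # → 0.5 (partial mediation)
```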
51 pages, 5654 KiB  
Review
Exploring the Role of Digital Twin and Industrial Metaverse Technologies in Enhancing Occupational Health and Safety in Manufacturing
by Arslan Zahid, Aniello Ferraro, Antonella Petrillo and Fabio De Felice
Appl. Sci. 2025, 15(15), 8268; https://doi.org/10.3390/app15158268 - 25 Jul 2025
Abstract
The evolution of Industry 4.0 and the emerging paradigm of Industry 5.0 have introduced disruptive technologies that are reshaping modern manufacturing environments. Among these, Digital Twin (DT) and Industrial Metaverse (IM) technologies are increasingly recognized for their potential to enhance Occupational Health and Safety (OHS). However, a comprehensive understanding of how these technologies integrate to support OHS in manufacturing remains limited. This study systematically explores the transformative role of DT and IM in creating immersive, intelligent, and human-centric safety ecosystems. Following the PRISMA guidelines, a Systematic Literature Review (SLR) of 75 peer-reviewed studies from the SCOPUS and Web of Science databases was conducted. The review identifies key enabling technologies such as Virtual Reality (VR), Augmented Reality (AR), Extended Reality (XR), Internet of Things (IoT), Artificial Intelligence (AI), Cyber-Physical Systems (CPS), and Collaborative Robots (COBOTS), and highlights their applications in real-time monitoring, immersive safety training, and predictive hazard mitigation. A conceptual framework is proposed, illustrating a synergistic digital ecosystem that integrates predictive analytics, real-time monitoring, and immersive training to enhance the OHS. The findings highlight both the transformative benefits and the key adoption challenges of these technologies, including technical complexities, data security, privacy, ethical concerns, and organizational resistance. This study provides a foundational framework for future research and practical implementation in Industry 5.0. Full article
14 pages, 3995 KiB  
Article
Future Illiteracies—Architectural Epistemology and Artificial Intelligence
by Mustapha El Moussaoui
Architecture 2025, 5(3), 53; https://doi.org/10.3390/architecture5030053 - 25 Jul 2025
Abstract
In the age of artificial intelligence (AI), architectural practice faces a paradox of immense potential and creeping standardization. As humans are increasingly relying on AI-generated outputs, architecture risks becoming a spectacle of repetition—a shuffling of data that neither truly innovates nor progresses vertically in creative depth. This paper explores the critical role of data in AI systems, scrutinizing the training datasets that form the basis of AI’s generative capabilities and the implications for architectural practice. We argue that when architects approach AI passively, without actively engaging their own creative and critical faculties, they risk becoming passive users locked in an endless loop of horizontal expansion without meaningful vertical growth. By examining the epistemology of architecture in the AI age, this paper calls for a paradigm where AI serves as a tool for vertical and horizontal growth, contingent on human creativity and agency. Only by mastering this dynamic relationship can architects avoid the trap of passive, standardized design and unlock the true potential of AI. Full article
(This article belongs to the Special Issue AI as a Tool for Architectural Design and Urban Planning)
24 pages, 331 KiB  
Perspective
Strategy for the Development of Cartography in Bulgaria with a 10-Year Planning Horizon (2025–2035) in the Context of Industry 4.0 and 5.0
by Temenoujka Bandrova, Davis Dinkov and Stanislav Vasilev
ISPRS Int. J. Geo-Inf. 2025, 14(8), 289; https://doi.org/10.3390/ijgi14080289 - 25 Jul 2025
Abstract
This strategic document outlines Bulgaria’s roadmap for modernizing its cartographic sector from 2025 to 2035, addressing outdated geospatial infrastructure and standards, the lack of standardized digital practices and coordinated digital infrastructure, and fragmented data management systems. The strategy was developed in accordance with the national methodology for strategic planning and through preliminary consultations with key stakeholders, including research institutions, business organizations, and public institutions. It aims to build a human-centered, data-driven geospatial framework aligned with global standards such as ISO 19100 and the EU INSPIRE Directive. Core components include: (1) modernization of the national geodetic system, (2) adoption of remote sensing and AI technologies, (3) development of interactive, web-based geospatial platforms, and (4) implementation of quality assurance and certification standards. A SWOT analysis highlights key strengths—such as existing institutional expertise—and critical challenges, including outdated legislation and insufficient coordination. The strategy emphasizes the need for innovation, regulatory reform, inter-institutional collaboration, and sustained investment. It ultimately positions Bulgarian cartography as a strategic contributor to national sustainable development and digital transformation. Full article
31 pages, 960 KiB  
Review
Generative AI as a Pillar for Predicting 2D and 3D Wildfire Spread: Beyond Physics-Based Models and Traditional Deep Learning
by Haowen Xu, Sisi Zlatanova, Ruiyu Liang and Ismet Canbulat
Fire 2025, 8(8), 293; https://doi.org/10.3390/fire8080293 - 24 Jul 2025
Abstract
Wildfires increasingly threaten human life, ecosystems, and infrastructure, with events like the 2025 Palisades and Eaton fires in Los Angeles County underscoring the urgent need for more advanced prediction frameworks. Existing physics-based and deep-learning models struggle to capture dynamic wildfire spread across both 2D and 3D domains, especially when incorporating real-time, multimodal geospatial data. This paper explores how generative artificial intelligence (AI) models—such as GANs, VAEs, and transformers—can serve as transformative tools for wildfire prediction and simulation. These models offer superior capabilities in managing uncertainty, integrating multimodal inputs, and generating realistic, scalable wildfire scenarios. We adopt a new paradigm that leverages large language models (LLMs) for literature synthesis, classification, and knowledge extraction, conducting a systematic review of recent studies applying generative AI to fire prediction and monitoring. We highlight how generative approaches uniquely address challenges faced by traditional simulation and deep-learning methods. Finally, we outline five key future directions for generative AI in wildfire management, including unified multimodal modeling of 2D and 3D dynamics, agentic AI systems and chatbots for decision intelligence, and real-time scenario generation on mobile devices, along with a discussion of critical challenges. Our findings advocate for a paradigm shift toward multimodal generative frameworks to support proactive, data-informed wildfire response. Full article
(This article belongs to the Special Issue Fire Risk Assessment and Emergency Evacuation)
17 pages, 1310 KiB  
Article
IHRAS: Automated Medical Report Generation from Chest X-Rays via Classification, Segmentation, and LLMs
by Gabriel Arquelau Pimenta Rodrigues, André Luiz Marques Serrano, Guilherme Dantas Bispo, Geraldo Pereira Rocha Filho, Vinícius Pereira Gonçalves and Rodolfo Ipolito Meneguette
Bioengineering 2025, 12(8), 795; https://doi.org/10.3390/bioengineering12080795 - 24 Jul 2025
Abstract
The growing demand for accurate and efficient Chest X-Ray (CXR) interpretation has prompted the development of AI-driven systems to alleviate radiologist workload and reduce diagnostic variability. This paper introduces the Intelligent Humanized Radiology Analysis System (IHRAS), a modular framework that automates the end-to-end process of CXR analysis and report generation. IHRAS integrates four core components: (i) deep convolutional neural networks for multi-label classification of 14 thoracic conditions; (ii) Grad-CAM for spatial visualization of pathologies; (iii) SAR-Net for anatomical segmentation; and (iv) a large language model (DeepSeek-R1) guided by the CRISPE prompt engineering framework to generate structured diagnostic reports using SNOMED CT terminology. Evaluated on the NIH ChestX-ray dataset, IHRAS demonstrates consistent diagnostic performance across diverse demographic and clinical subgroups, and produces high-fidelity, clinically relevant radiological reports with strong faithfulness, relevancy, and alignment scores. The system offers a transparent and scalable solution to support radiological workflows while highlighting the importance of interpretability and standardization in clinical Artificial Intelligence applications. Full article
34 pages, 15050 KiB  
Article
Story Forge: A Card-Based Framework for AI-Assisted Interactive Storytelling
by Yaojiong Yu, Gianni Corino and Mike Phillips
Electronics 2025, 14(15), 2955; https://doi.org/10.3390/electronics14152955 - 24 Jul 2025
Abstract
The application of artificial intelligence has significantly advanced interactive storytelling. However, current research has predominantly concentrated on the content generation capabilities of AI, primarily following a one-way ‘input-direct generation’ model. This has led to limited practicality in AI story writing, mainly due to the absence of investigations into user-driven creative processes. Consequently, users often perceive AI-generated suggestions as unhelpful and unsatisfactory. This study introduces a novel creative tool named Story Forge, which incorporates a card-based interactive narrative approach. By utilizing interactive story element cards, the tool facilitates the integration of narrative components with artificial intelligence-generated content to establish an interactive story writing framework. To evaluate the efficacy of Story Forge, two tests were conducted with a focus on user engagement, decision-making, narrative outcomes, the replay value of meta-narratives, and their impact on the users’ emotions and self-reflection. In the comparative assessment, the participants were randomly assigned to either the experimental group or the control group, in which they would use either a web-based AI story tool or Story Forge for story creation. Statistical analyses, including independent-sample t-tests, p-values, and effect size calculation (Cohen’s d), were employed to validate the effectiveness of the framework design. The findings suggest that Story Forge enhances users’ intuitive creativity, real-time story development, and emotional expression while empowering their creative autonomy. Full article
(This article belongs to the Special Issue Innovative Designs in Human–Computer Interaction)
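The effect-size statistic cited above, Cohen's d for two independent groups, divides the difference in group means by the pooled standard deviation. A minimal sketch with invented scores (not the study's data):

```python
from statistics import mean, stdev

def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * stdev(group_a) ** 2
                  + (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5

# Invented scores: the group means sit two pooled standard deviations apart.
print(cohens_d([1, 2, 3], [3, 4, 5]))  # → -2.0
```

By the usual convention, |d| near 0.2 is a small effect, 0.5 medium, and 0.8 or more large.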
15 pages, 1758 KiB  
Article
Eye-Guided Multimodal Fusion: Toward an Adaptive Learning Framework Using Explainable Artificial Intelligence
by Sahar Moradizeyveh, Ambreen Hanif, Sidong Liu, Yuankai Qi, Amin Beheshti and Antonio Di Ieva
Sensors 2025, 25(15), 4575; https://doi.org/10.3390/s25154575 - 24 Jul 2025
Abstract
Interpreting diagnostic imaging and identifying clinically relevant features remain challenging tasks, particularly for novice radiologists who often lack structured guidance and expert feedback. To bridge this gap, we propose an Eye-Gaze Guided Multimodal Fusion framework that leverages expert eye-tracking data to enhance learning and decision-making in medical image interpretation. By integrating chest X-ray (CXR) images with expert fixation maps, our approach captures radiologists’ visual attention patterns and highlights regions of interest (ROIs) critical for accurate diagnosis. The fusion model utilizes a shared backbone architecture to jointly process image and gaze modalities, thereby minimizing the impact of noise in fixation data. We validate the system’s interpretability using Gradient-weighted Class Activation Mapping (Grad-CAM) and assess both classification performance and explanation alignment with expert annotations. Comprehensive evaluations, including robustness under gaze noise and expert clinical review, demonstrate the framework’s effectiveness in improving model reliability and interpretability. This work offers a promising pathway toward intelligent, human-centered AI systems that support both diagnostic accuracy and medical training. Full article
(This article belongs to the Section Sensing and Imaging)
26 pages, 2261 KiB  
Article
Real-Time Fall Monitoring for Seniors via YOLO and Voice Interaction
by Eugenia Tîrziu, Ana-Mihaela Vasilevschi, Adriana Alexandru and Eleonora Tudora
Future Internet 2025, 17(8), 324; https://doi.org/10.3390/fi17080324 - 23 Jul 2025
Abstract
In the context of global demographic aging, falls among the elderly remain a major public health concern, often leading to injury, hospitalization, and loss of autonomy. This study proposes a real-time fall detection system that combines a modern computer vision model, YOLOv11 with integrated pose estimation, and an Artificial Intelligence (AI)-based voice assistant designed to reduce false alarms and improve intervention efficiency and reliability. The system continuously monitors human posture via video input, detects fall events based on body dynamics and keypoint analysis, and initiates a voice-based interaction to assess the user’s condition. Depending on the user’s verbal response or the absence thereof, the system determines whether to trigger an emergency alert to caregivers or family members. All processing, including speech recognition and response generation, is performed locally to preserve user privacy and ensure low-latency performance. The approach is designed to support independent living for older adults. Evaluation of 200 simulated video sequences acquired by the development team demonstrated high precision and recall, along with a decrease in false positives when incorporating voice-based confirmation. In addition, the system was also evaluated on an external dataset to assess its robustness. Our results highlight the system’s reliability and scalability for real-world in-home elderly monitoring applications. Full article
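As an illustration of the kind of keypoint rule such a pose-based detector can apply (a hypothetical heuristic, not the paper's actual logic), a fall can be flagged when the shoulder-to-hip axis tilts far from the vertical image axis:

```python
import math

def is_fallen(keypoints, angle_thresh_deg: float = 60.0) -> bool:
    """Flag a fall when the torso axis (shoulder midpoint to hip midpoint)
    tilts beyond the threshold from vertical. keypoints maps joint names to
    (x, y) image coordinates, with y growing downward."""
    sx = (keypoints["left_shoulder"][0] + keypoints["right_shoulder"][0]) / 2
    sy = (keypoints["left_shoulder"][1] + keypoints["right_shoulder"][1]) / 2
    hx = (keypoints["left_hip"][0] + keypoints["right_hip"][0]) / 2
    hy = (keypoints["left_hip"][1] + keypoints["right_hip"][1]) / 2
    # Angle of the torso axis relative to vertical: 0° standing, 90° lying flat.
    angle = math.degrees(math.atan2(abs(hx - sx), abs(hy - sy)))
    return angle > angle_thresh_deg

# Invented keypoints: an upright pose versus a horizontal (lying) pose.
upright = {"left_shoulder": (10, 0), "right_shoulder": (20, 0),
           "left_hip": (10, 100), "right_hip": (20, 100)}
lying = {"left_shoulder": (0, 48), "right_shoulder": (0, 52),
         "left_hip": (100, 48), "right_hip": (100, 52)}
print(is_fallen(upright))  # → False
print(is_fallen(lying))    # → True
```

A production system would additionally gate on the speed of the posture change before opening the voice-confirmation dialogue, as the abstract describes.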
39 pages, 2929 KiB  
Article
A Risk-Based Analysis of Lightweight Drones: Evaluating the Harmless Threshold Through Human-Centered Safety Criteria
by Tamer Savas
Drones 2025, 9(8), 517; https://doi.org/10.3390/drones9080517 - 23 Jul 2025
Abstract
In recent years, the rapid development of lightweight Unmanned Aerial Vehicle (UAV) technology under 250 g has begun to challenge the validity of existing mass-based safety classifications. The commonly used 250 g threshold for defining “harmless” UAVs has become a subject requiring more detailed evaluations, especially as new models with increased speed and performance enter the market. This study aims to reassess the adequacy of the current 250 g mass limit by conducting a comprehensive analysis using human-centered injury metrics, including kinetic energy (KE), Blunt Criterion (BC), Viscous Criterion (VC), and the Abbreviated Injury Scale (AIS). Within this scope, an extensive dataset of commercial UAV models under 500 g was compiled, with a particular focus on the sub-250 g segment. For each model, KE, BC, VC, and AIS values were calculated using publicly available technical data and validated physical models. The results were compared against established injury thresholds, such as 14.9 J (AIS-3 serious injury), 25 J (“harmless” threshold), and 33.9 J (AIS-4 severe injury). Furthermore, new recommendations were developed for regulatory authorities, including energy-based classification systems and mission-specific dynamic threshold mechanisms. According to the findings of this study, most UAVs under 250 g continue to remain below the current “harmless” threshold values. However, some next-generation high-speed UAV models are approaching or exceeding critical KE levels, indicating a need to reassess existing regulatory approaches. Additionally, the strong correlation between both BC and VC metrics with AIS outcomes demonstrates that these indicators are complementary and valuable tools for assessing injury risk. In this context, the adoption of an energy-based supplementary classification and dynamic, mission-based regulatory frameworks is recommended. Full article
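The kinetic-energy thresholds cited in the abstract are straightforward to apply via KE = 0.5 m v^2. The sketch below (illustrative speeds, not figures from the paper) shows how a sub-250 g drone can stay below the serious-injury threshold at moderate speed yet exceed the AIS-4 threshold at higher speed:

```python
def kinetic_energy_joules(mass_kg: float, speed_mps: float) -> float:
    """Impact kinetic energy: KE = 0.5 * m * v**2."""
    return 0.5 * mass_kg * speed_mps ** 2

def injury_band(ke_joules: float) -> str:
    """Map impact kinetic energy to the injury thresholds cited in the abstract."""
    if ke_joules < 14.9:
        return "below AIS-3 serious-injury threshold (14.9 J)"
    if ke_joules < 25.0:
        return "between AIS-3 threshold and 25 J 'harmless' limit"
    if ke_joules < 33.9:
        return "above 25 J 'harmless' limit"
    return "above AIS-4 severe-injury threshold (33.9 J)"

# A 249 g drone: 12.45 J at 10 m/s, but 49.8 J at 20 m/s.
print(injury_band(kinetic_energy_joules(0.249, 10)))
print(injury_band(kinetic_energy_joules(0.249, 20)))
```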
26 pages, 338 KiB  
Article
ChatGPT as a Stable and Fair Tool for Automated Essay Scoring
by Francisco García-Varela, Miguel Nussbaum, Marcelo Mendoza, Carolina Martínez-Troncoso and Zvi Bekerman
Educ. Sci. 2025, 15(8), 946; https://doi.org/10.3390/educsci15080946 - 23 Jul 2025
Abstract
The evaluation of open-ended questions is typically performed by human instructors using predefined criteria to uphold academic standards. However, manual grading presents challenges, including high costs, rater fatigue, and potential bias, prompting interest in automated essay scoring systems. While automated essay scoring tools can assess content, coherence, and grammar, discrepancies between human and automated scoring have raised concerns about their reliability as standalone evaluators. Large language models like ChatGPT offer new possibilities, but their consistency and fairness in feedback remain underexplored. This study investigates whether ChatGPT can provide stable and fair essay scoring: specifically, whether identical student responses receive consistent evaluations across multiple AI interactions using the same criteria. The study was conducted in two marketing courses at an engineering school in Chile, involving 40 students. Results showed that ChatGPT, when given no guidance or only minimal guidance, produced volatile grades and shifting criteria. Incorporating the instructor's rubric reduced this variability but did not eliminate it. Only after providing an example-rich rubric, a standardized output format, low temperature settings, and a normalization process based on decision tables did ChatGPT-4o demonstrate consistent and fair grading. Based on these findings, we developed a scalable algorithm that automatically generates normalized rubrics and decision tables for new questions with minimal human input, extending the accessibility and reliability of automated assessment. Full article
(This article belongs to the Section Technology Enhanced Education)
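The decision-table normalization the abstract describes can be illustrated as a deterministic post-processing step applied to the model's raw rubric scores. The criteria names, the level-to-point mapping, and the 1.0 to 7.0 grade scale below are hypothetical assumptions (the study's actual tables are not published in the abstract); only the idea of mapping raw rubric levels to a final grade through a fixed lookup table comes from the source.

```python
# Illustrative decision-table normalization for rubric-based essay scores.
# Criteria, level-to-point mapping, and the 1.0-7.0 grade scale are
# hypothetical stand-ins, not the study's actual tables.

RUBRIC_CRITERIA = ["argument", "evidence", "clarity"]

# Decision table: raw rubric level (1-4) -> normalized points in [0, 1].
DECISION_TABLE = {1: 0.0, 2: 0.5, 3: 0.8, 4: 1.0}


def normalize(raw_levels: dict) -> float:
    """Collapse per-criterion rubric levels into one grade on a 1.0-7.0 scale."""
    points = sum(DECISION_TABLE[raw_levels[c]] for c in RUBRIC_CRITERIA)
    return round(1.0 + 6.0 * points / len(RUBRIC_CRITERIA), 1)
```

Because the lookup is fixed, two runs of the language model that land on the same rubric levels always yield the same final grade, which is the stability property the study is after: the table absorbs residual wording variation in the model's output.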