Search Results (342)

Search Parameters:
Keywords = rubric

20 pages, 1185 KB  
Review
Chronic Cholecystitis: Anatomical Variants, Pediculitis, and a Candidate Preoperative Framework for Difficult Laparoscopic Cholecystectomy
by Georgiana-Andreea Marinescu, Sarmis Marian Sandulescu, Dumitru Radulescu, Oana Taisescu, Emil-Tiberius Trasca, Elena-Irina Caluianu, Dorin Mercut, Razvan Mercut, Eleonora Daniela Ciupeanu-Calugaru, Alexandru Stefarta, Patricia-Mihaela Radulescu and Citto Taisescu
Diagnostics 2026, 16(8), 1201; https://doi.org/10.3390/diagnostics16081201 - 17 Apr 2026
Viewed by 201
Abstract
Preoperative risk stratification for laparoscopic cholecystectomy (LC) remains imperfect, particularly in patients with chronic inflammatory remodeling and biliary anatomic variants. Existing tools often focus on acute presentations or intraoperative variables, leaving uncertainty about how congenital anatomy, recurrent biliary colic, and cystic pediculitis interact. We synthesize a hypothesis-generating conceptual framework and propose an illustrative candidate preoperative rubric for future validation. We performed a structured narrative review of PubMed, Scopus, and Web of Science (January 1990–December 2024; last search: 15 December 2024). Eligible primary studies evaluated clinical history, imaging-defined anatomy, inflammatory biomarkers, and/or operative outcomes (conversion, intraoperative complications, or operative difficulty) in the setting of LC. Acute cholecystitis and chronic/elective cohorts were interpreted separately during the narrative synthesis. Two reviewers screened titles/abstracts and assessed full texts using predefined inclusion/exclusion criteria; due to heterogeneity, no meta-analysis and no formal risk-of-bias tool were applied. The literature supports a plausible vicious cycle in which biliary anatomic variants may impair drainage and promote stasis, recurrent biliary colic, and chronic inflammation, ultimately leading to fibrosis/pediculitis and a “frozen” Calot’s triangle. We translate these signals into an illustrative candidate rubric (0–16 points) spanning three domains: clinical history (0–6), imaging (0–6), and inflammatory biomarkers (0–4). Weights and cut-offs (low: 0–4; moderate: 5–9; high: 10–16) were chosen a priori for conceptual clarity and are not data-derived. This review provides a conceptual map and a candidate variable set to support hypothesis generation, standardized data collection, and staged validation. The rubric is not validated and must not be used for clinical decision-making. Planned next steps include feasibility-oriented derivation, followed by prospective multicenter external validation and impact assessment. Full article
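The arithmetic of the candidate rubric described in this abstract can be sketched as follows. Only the domain ranges (0–6, 0–6, 0–4) and the three bands come from the abstract; the function names, dictionary keys, and validation behaviour are hypothetical, and the abstract does not say which variables earn points within each domain. As the authors state, the rubric is not validated, so this is an illustration of the scoring structure only.

```python
# Illustrative only: domain maxima and bands are copied from the abstract.
# All names below are hypothetical. Not for clinical use.
DOMAIN_MAX = {"clinical_history": 6, "imaging": 6, "biomarkers": 4}
BANDS = (("low", 0, 4), ("moderate", 5, 9), ("high", 10, 16))


def total_score(points):
    """Sum per-domain points after checking each lies in its stated range."""
    if set(points) != set(DOMAIN_MAX):
        raise ValueError("expected exactly the three rubric domains")
    for domain, value in points.items():
        if not 0 <= value <= DOMAIN_MAX[domain]:
            raise ValueError(f"{domain} must be in 0..{DOMAIN_MAX[domain]}")
    return sum(points.values())


def band(total):
    """Map a 0-16 total onto the a priori low/moderate/high bands."""
    for name, low, high in BANDS:
        if low <= total <= high:
            return name
    raise ValueError("total must be in 0..16")


example = {"clinical_history": 3, "imaging": 4, "biomarkers": 2}
print(total_score(example), band(total_score(example)))  # prints: 9 moderate
```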

20 pages, 660 KB  
Article
Rapid AI-Assisted Instructional Design: Using Agentic LLM Tools to Develop UDL-Aligned Curricula for Student Veterans and Multilingual Learners
by John C. Chick and Laura T. Morello
Appl. Sci. 2026, 16(8), 3871; https://doi.org/10.3390/app16083871 - 16 Apr 2026
Viewed by 182
Abstract
Background/Context: Creating instructional materials that authentically meet the needs of marginalized learner groups such as student veterans, multilingual adult learners, and first-generation doctoral students demands consistent application of Universal Design for Learning (UDL) principles coupled with meaningful content expertise about those learners’ traits, access needs, and lived experiences. Faculty at teaching-intensive institutions face persistent constraints of time, knowledge, and course load that make systematic UDL implementation difficult. Objective: This practitioner-scholar case study examines whether HAIST-structured agentic LLM-assisted instructional design can produce UDL-aligned materials for student veterans and multilingual learners at a quality level and time frame realistic for under-resourced faculty. Methodology: Drawing from the Human-AI Symbiotic Theory (HAIST) and UDL guidelines, we document four AI-assisted cycles of instructional design at a Hispanic-Serving Institution. Outcomes related to UDL alignment were measured using a rubric adapted from CAST Guidelines 2.2. Results: Across four materials, initial AI generation averaged 61.4% UDL alignment (SD = 8.7%); following iterative calibration, this rose to 84.2% (SD = 5.3%). The largest gains occurred in the Engagement category. Conclusions: These descriptive findings, interpreted as exploratory rather than inferential given the single-site case study design and n = 4 materials, suggest that HAIST-structured AI-assisted design has the potential to produce accessible materials for underserved learner populations in time frames feasible for working faculty. Learner outcome data were not collected in this study; future quasi-experimental work is needed to assess the effectiveness of these materials with target learner populations. Full article
21 pages, 632 KB  
Article
A Qualitative Case Study of Socio-Scientific Reasoning in the En-ROADS Climate Simulation
by Shuvra Rahman, Gillian Roehrig and Heba EL-Deghaidy
Sustainability 2026, 18(8), 3873; https://doi.org/10.3390/su18083873 - 14 Apr 2026
Viewed by 497
Abstract
Addressing climate change requires an understanding not only of science concepts but also the social, economic, and political factors that influence decision making. Thus, this study investigated the development of socio-scientific reasoning related to climate change action. This case study explored the six dimensions of socio-scientific reasoning (complexity, perspective-taking, inquiry, skepticism, affordance of science, and multiple perspective-taking) of twenty undergraduate students as they engaged with decision making about climate action. Data were collected from classroom worksheets reflecting small group decision making and individual student reflections. Data were analyzed using a rubric that categorized the level of students’ socio-scientific reasoning across the six dimensions. These categorizations were further supported by qualitative interpretation of students’ responses. The findings indicate strong performance in complexity and perspective-taking, while inquiry, skepticism, and the affordance of science were less consistently demonstrated. The study contributes to understanding how simulation-based learning can support the development of SSR and highlights the importance of structured pedagogical design in fostering higher order reasoning in climate education. Full article

30 pages, 2210 KB  
Review
Dynamic Response-Based Bridge Monitoring and Structural Assessment: A Structured Scoping Review and Evidence Inventory
by Muhammad Ziad Bacha, Mario Lucio Puppio, Marco Zucca and Mauro Sassu
Infrastructures 2026, 11(4), 134; https://doi.org/10.3390/infrastructures11040134 - 10 Apr 2026
Viewed by 264
Abstract
Dynamic response measurements support bridge monitoring and structural assessment because they are obtainable under operational loading and are sensitive to changes in stiffness, boundary conditions, and mass distribution. This article presents a structured scoping review of dynamic-response-based bridge monitoring and assessment. It covers damage-sensitive indicators, stiffness/capacity proxy inference, interpretation under operational and extreme loading, sensing with acquisition (contact, and indirect/drive-by), and data processing, machine learning and digital-twin integration for decision support. Evidence was identified through targeted searches in Scopus and The Lens with duplicate resolution in Zotero. The cited studies are compiled into a traceable evidence inventory linked to method families and decision objectives. The synthesis shows that global modal properties enable change screening but are highly confounded by environmental/operational variability. Localization and state characterization typically require denser or higher-fidelity sensing and signal conditioning. Finally, capacity-related inference using calibrated conversion models or machine learning (ML) surrogates remains context-bounded and validation-dependent. This review provides an end-to-end pipeline, evidence-maturity rubric, and conservative failure-mode checks with escalation logic that tie SHM outputs to inspection and analysis rather than direct condition declarations for bridge owners. This review is intentionally scoped and does not claim PRISMA-style comprehensiveness. Full article

25 pages, 32705 KB  
Article
Controlling the Art School: Ideologies of Materials and a Speculative Vision for Hybrid Arts Education
by Dylan Yamada-Rice
Arts 2026, 15(4), 73; https://doi.org/10.3390/arts15040073 - 8 Apr 2026
Viewed by 486
Abstract
In responding to the special issue’s call to examine the shifting space of materiality, this article uses creative writing, hand-drawn comics, and speculative fiction/design as a form of research by practice to critique changes in UK Higher Arts Education in relation to art materials. It shows how embedded neoliberal structures, which have been documented to negatively impact HE staff and the arts in general, now also extend to prioritising some art materials and excluding others. A speculative vision is offered as an alternative in which a nomadic higher arts education is put forward, one that encourages the use of hybrid art materials. The means chosen to make the arguments presented are analogue methods of drawing, cutting, printing, sewing and writing to strengthen the point that digital materials are currently prioritised in UK arts education due to HE’s entanglement with agendas entwined with Big Tech and most recently the military. The format is also deliberately experimental to move away from common ways of presenting research and theory that have become formulaic as academics are pushed to meet the ideals of the Research Excellence Framework, another neoliberal rubric. Full article

16 pages, 410 KB  
Article
Observing Instructional Practice: Can We Consistently Measure Teaching Quality Constructs?
by Mark White, Armin Jentsch, Jennifer Luoto and Kirsti Klette
Educ. Sci. 2026, 16(4), 583; https://doi.org/10.3390/educsci16040583 - 7 Apr 2026
Viewed by 313
Abstract
Classroom observation systems can be an important tool for understanding teaching quality. However, the wide range of possible lessons that can be observed raises concerns about whether fixed observation rubrics can measure the intended teaching quality constructs equally well in each lesson. This paper argues for exploring measurement invariance across lessons in observation systems, adopting an understanding of measurement invariance that emphasises the alignment between a theoretical construct and the measurement of that construct. We conceptualise teaching quality using the Protocol for Language Arts Teaching Observation (PLATO). PLATO’s focus on individual items is misaligned to traditional measurement invariance approaches, so we emphasise the importance of detailed, qualitative considerations of how rubric operationalisations of a construct may or may not capture the intended teaching quality construct equally well across lessons with different characteristics. We discuss the affordances and limitations of this way of considering measurement invariance and argue for the importance of ensuring measurement invariance across lessons. Full article
(This article belongs to the Special Issue Recent Advances in Measuring Teaching Quality)

23 pages, 1629 KB  
Article
AI-Based Automated Scoring Layer Using Large Language Models and Semantic Analysis
by Anastasia Vangelova and Veska Gancheva
Appl. Sci. 2026, 16(7), 3537; https://doi.org/10.3390/app16073537 - 4 Apr 2026
Viewed by 933
Abstract
Automated scoring of open-ended questions is an important research direction in educational technology and artificial intelligence, as manual grading is time-consuming and often subject to inter-rater variation. This paper proposes an AI-based framework for automated scoring that combines large language models (LLMs), Retrieval-Augmented Generation (RAG), analytical rubrics, and structured machine-readable output within a Moodle-supported e-learning environment. The framework is designed to support context-grounded and criterion-based evaluation by combining the student response, retrieved instructional context, and rubric-defined scoring criteria within a controlled assessment workflow. The proposed approach aims to improve the consistency, traceability, and practical applicability of automated scoring for open-ended responses. To examine its performance, an experimental study was conducted in a real university setting involving a five-task open-ended examination. AI-generated scores were compared with independent human scores using agreement, reliability, correlation, and error metrics. The results indicate a strong level of agreement between automated and expert scoring within the tested setting, together with relatively low average deviation. These findings suggest that the proposed framework has practical potential for supporting automated assessment in digital learning environments, while also highlighting the importance of careful interpretation within the scope of the experimental design. Full article
(This article belongs to the Special Issue Application of Semantic Web Technologies for E-Learning)

21 pages, 932 KB  
Systematic Review
Problem Design Characteristics in School-Based PBL: A PRISMA-Informed Review of Korean K-12 Cases
by Hyunwook Kim and Jino Kim
Educ. Sci. 2026, 16(4), 553; https://doi.org/10.3390/educsci16040553 - 1 Apr 2026
Viewed by 399
Abstract
Problem-based learning (PBL) relies on the problem as the generative trigger for inquiry, collaboration, and assessment, yet school-based reports often provide limited guidance on how problems are actually designed. This PRISMA-informed review analyzed 24 PBL cases published in Korea Citation Index (KCI) journals (2020–2025) to characterize enacted problem design in Korean K–12 settings. Using a literature-grounded framework covering authenticity, cognitive demand, collaboration, and pedagogical alignment, five expert coders conducted calibration and consensus coding of 12 elements on a 0–3 rubric. The findings showed that Perspective Integration, Analysis Requirements, and Assessment Opportunities were most strongly represented, whereas Curriculum Integration and Cognitive Conflict were less frequently advanced. Correlation and triad analyses further indicated recurring design patterns centered on collaboration and authenticity, while cognitively destabilizing configurations were rare. These results suggest that Korean K–12 PBL tends to emphasize socially supported, contextually meaningful inquiry. By contrast, the rarely observed cognitive-demand triad suggests a tendency to avoid problems requiring strong conceptual destabilization and deep analysis. These findings identify context-bounded design patterns and offer practical guidance for designing PBL problems that better balance socio-cognitive support, conceptual challenge, and curricular coherence. Full article
(This article belongs to the Section Curriculum and Instruction)

20 pages, 1454 KB  
Article
AI-Supported Adaptive Simulation for Diagnostic Disclosure in Medical Students: A Randomized Controlled Trial
by Brenda Ofelia Jay-Jímenez, Diego Alberto Martínez-Islas, Axel Tonatiuh Marroquin-Aguilar, Fernanda Avelino-Vivas, Dafne Montserrat Solis-Galván, Alexis Arturo Laguna-González, Bruno Manuel García-García, Eduardo Minaya-Pérez, Efren Quiñones-Lara, Ismael Martínez-Bonilla, Adolfo René Méndez-Cruz and Héctor Iván Saldívar-Cerón
Int. Med. Educ. 2026, 5(2), 35; https://doi.org/10.3390/ime5020035 - 1 Apr 2026
Viewed by 450
Abstract
Diagnostic disclosure is a complex communication task that requires learners to integrate interpersonal attunement, structured information delivery, and condition-specific reasoning in real time. We conducted a randomized controlled trial comparing conventional diagnostic communication training with the same training supplemented by an AI-supported adaptive virtual patient simulation designed to provide additional deliberate practice and individualized, just-in-time feedback. Eighty undergraduate medical students were randomized 1:1 and completed standardized-patient encounters involving disclosure of a new diagnosis of type 2 diabetes mellitus before and after training. Performance was assessed by blinded physician raters using an adapted Kalamazoo rubric. Among students with complete pre–post data (conventional training, n = 25; AI-supported training, n = 26), both groups showed substantial improvement. Mean gains were numerically larger in the AI-supported group, with small-to-moderate standardized effects across selected communication domains; however, baseline-adjusted group-by-time interactions did not reach conventional thresholds for statistical significance, indicating that any added mean effects beyond conventional training remain uncertain. Exploratory person-level analyses suggested greater heterogeneity of improvement in the AI-supported condition, including a higher density of large gains in higher-order communication components. These findings should therefore be interpreted as exploratory rather than confirmatory. AI-supported adaptive simulation appears feasible as an adjunct to communication training, but adequately powered studies are needed to clarify effect magnitude, mechanisms, and generalizability across training contexts. Full article

18 pages, 313 KB  
Article
Positioning Generative AI in EFL Peer Feedback: Training Feedback Literacy and Enabling Uptake in Speaking Classes
by Bradley Irwin and Theron Muller
Educ. Sci. 2026, 16(4), 544; https://doi.org/10.3390/educsci16040544 - 1 Apr 2026
Viewed by 663
Abstract
Peer feedback is widely used in English as a foreign language (EFL) higher education, yet its benefits are often limited by uneven feedback quality and learners’ difficulty in interpreting and using comments. This theoretical paper synthesizes research on peer feedback, student feedback literacy, and recent developments in generative artificial intelligence (GenAI) to propose a theory-informed design framework that positions GenAI as Trainer and Synthesizer in L2 speaking peer feedback. Building on feedback literacy as a set of capacities (appreciating feedback, making judgments, managing affect, and taking action), the paper argues that speaking tasks create distinct constraints, including time pressure, fleeting performance, and heightened affect, which make real-time peer feedback promising but pedagogically challenging. To address these challenges, here we introduce two complementary roles for GenAI in peer feedback workflows: a Trainer that supports feedback quality through calibration with exemplars, rubric-guided practice, and feedback-on-feedback; and a Synthesizer that aggregates peer input into concise, actionable guidance linked to criteria and learning goals. The conceptual proposal specifies key design principles (e.g., transparency, learner agency, teacher-in-the-loop oversight, and privacy-conscious data practices) and outlines researchable propositions for evaluating learning, engagement, and equity outcomes. The paper concludes with implications for task design, training sequences, and responsible classroom implementation. Full article
20 pages, 1145 KB  
Article
GenAI-Supported Flipped Learning in Preservice Chemistry Teacher Education: Lesson-Design Performance, Learning Attitude, Self-Regulated Learning, and Critical Thinking Awareness
by Jun Zhang, Xinyue Deng, Tong Wu and Kai Wang
Behav. Sci. 2026, 16(4), 514; https://doi.org/10.3390/bs16040514 - 29 Mar 2026
Viewed by 506
Abstract
This quasi-experimental study compared GenAI-supported flipped learning (AI-FL) with reading-based flipped learning (R-FL) in an 11-week preservice chemistry course. Two intact classes completed the same topics and identical in-class activities, differing only in pre-class preparation through guided GenAI-based interactive learning or assigned readings. The study examined lesson-design performance, learning attitude, self-regulated learning, and critical thinking awareness. After controlling for pretest scores, the reading-based flipped learning group showed stronger lesson-design performance, whereas the GenAI-supported group reported more positive learning attitudes. No significant group differences were observed for self-regulated learning or critical thinking awareness. These findings suggest that, in this course context, GenAI-supported pre-class learning may enhance learners’ attitudes but does not necessarily improve rubric-aligned lesson-design performance compared with reading-based preparation. Full article
(This article belongs to the Section Educational Psychology)

27 pages, 7770 KB  
Article
Structured Data Visualization Instruction in Graduate Education: An Empirical Study of Conceptual and Procedural Development
by Simón Gutiérrez de Ravé, Eduardo Gutiérrez de Ravé and Francisco José Jiménez-Hornero
Educ. Sci. 2026, 16(4), 533; https://doi.org/10.3390/educsci16040533 - 27 Mar 2026
Viewed by 516
Abstract
Information visualization is a crucial yet often underdeveloped research skill in graduate education. This study examined how practice-based visualization instruction enhances graduate students’ conceptual understanding and procedural competence in scientific graph construction. Forty-two first-year graduate students participated in a ten-week instructional program combining diagnostic assessment, guided exercises, and a complex graph replication task. Conceptual and procedural competence were evaluated using validated analytic rubrics to ensure reliability and depth of analysis. Results showed substantial improvement in students’ ability to select suitable chart types, label axes accurately, and apply coherent color schemes. Consistent with the study’s hypotheses, significant gains were observed in conceptual understanding (H1) and technical execution (H2), and a moderate positive correlation between the two domains (H3) confirmed that stronger conceptual grasp aligned with higher visualization proficiency. Iterative feedback and guided reflection supported the integration of theory and practice. However, challenges in detailed annotation and multivariable coordination persisted. Overall, structured, practice-based visualization training enhanced methodological competence and communication clarity. Embedding such experiential learning within graduate curricula can strengthen visualization literacy and support the development of research independence. Full article
(This article belongs to the Section Higher Education)

13 pages, 651 KB  
Article
AI-Generated Exercise Prescriptions for At-Risk Populations: Safety and Feasibility of a Large Language Model Assessed by Expert Evaluation
by Minkyung Choi, Jaeyong Park, Myeounggon Lee, Jaewon Beom, Se Young Jung and Kihyuk Lee
J. Clin. Med. 2026, 15(6), 2457; https://doi.org/10.3390/jcm15062457 - 23 Mar 2026
Viewed by 531
Abstract
Background/Objectives: In exercise science and sports medicine, the potential use of large language models for generating personalized exercise programs is being explored. However, the practical applicability of AI-generated exercise prescriptions has not yet been sufficiently validated, particularly in complex clinical contexts. This study aimed to evaluate their practical utility under expert supervision. Methods: Exercise prescription outputs generated by a large language model (Gemini 2.5, Google LLC) were analyzed using clinical cases incorporating complex exercise-related considerations. Three levels of prompt structuring were applied. Experts evaluated the outputs using a structured rubric assessing safety, feasibility, guideline alignment, and personalization. Inter-expert agreement was assessed using intraclass correlation coefficients (ICC), and expert-specific internal consistency was evaluated using Cronbach’s alpha. Results: AI-generated exercise prescriptions demonstrated a certain level of structural completeness. However, inter-expert agreement was low (ICC (2,3) = 0.139), whereas expert-specific internal consistency was high (Cronbach’s alpha > 0.92). Prompt structuring from Stage 1 to Stage 2 was associated with improved mean scores in safety and guideline alignment. Additional structuring did not consistently yield further improvements. Conclusions: AI-generated exercise prescriptions may have practical potential as supportive decision-making tools when expert involvement is assumed. Nonetheless, expert judgments did not converge toward a single evaluative standard, reflecting the inherently expert-dependent nature of exercise prescription. Full article

36 pages, 5956 KB  
Article
A Knowledge-Augmented Two-Stage Workflow for Architectural Concept-to-Massing Generation and Evaluation
by Shangci Sun and Yao Fu
Buildings 2026, 16(6), 1265; https://doi.org/10.3390/buildings16061265 - 23 Mar 2026
Viewed by 397
Abstract
Large language models (LLMs) and diffusion-based image generators can rapidly produce architectural ideas and imagery, yet translating conceptual narratives into massing composition is often implicit and difficult to reproduce. In this paper, we present a knowledge-augmented two-stage workflow for architectural concept-to-massing generation and evaluation. The outputs are represented as axonometric massing proxy images, which serve as 2D visual proxies for early-stage massing refinement rather than editable 3D models. The workflow integrates a prototype library and Knowledge Graph (KG) routing to map narrative cues into executable strategy and operation tokens and compile stage-specific prompts. Stage 1 produces structural concept sketches emphasizing legible composition, while Stage 2 generates axonometric massing proxy images conditioned on Stage 1 sketches to stabilize composition across candidates. Under a fixed sampling budget, candidates are ranked using a rubric-based scoring protocol with Top-K selection, and evaluation signals can be written back to update prompt compilation iteratively. Across diverse project briefs, ablation studies demonstrate that knowledge augmentation improves constraint compliance and composition readability while maintaining controlled diversity for early exploration. We report expert ratings together with paired statistical tests to support reproducible comparisons. Full article

33 pages, 2332 KB  
Article
EvalHack: Answer-Side Prompt Injection for Probing LLM Exam-Grading Panel Stability
by Catalin Anghel, Marian Viorel Craciun, Adina Cocu, Andreea Alexandra Anghel, Antonio Stefan Balau, Adrian Istrate and Aurelian-Dumitrache Anghele
Information 2026, 17(3), 297; https://doi.org/10.3390/info17030297 - 18 Mar 2026
Viewed by 467
Abstract
Large language models are increasingly used as automated graders, yet their reliability under answer-side manipulation and their behavior in multi-model panels remain insufficiently understood. This paper introduces EvalHack, a matrix benchmark in which a fixed committee of four LLMs grades university-level machine learning exam answers under a strict integer-only contract (0–10) grounded in instructor-authored rubric artifacts. The dataset comprises 100 students answering 10 short, open-ended items (1000 answers). For each answer, the evaluation includes a clean version and two content-preserving adversarial variants that operate only on the student text: A1, a visible coercive suffix appended to the answer, and A2, a stealth variant that uses Unicode control characters (e.g., zero-width and bidirectional marks) to embed an instruction. EvalHack instruments the full grading pipeline, recording item-level member scores, the committee aggregate, within-panel disagreement, and discrepancies to human grades. Empirically, answer-side edits induce systematic score inflation and stronger top-end concentration, with edited answers clustering near the upper end of the scale. Within-panel disagreement, measured as the range between the highest and lowest member score, varies across conditions, with median Consistency Spread values of 3.0 (clean), 2.0 (A1), and 6.0 (A2). Compared to human graders, the panel is more lenient on average (MAE = 1.897; bias human − panel = −1.345). Finally, grouping items by disagreement shows that low-disagreement items exhibit smaller human-panel errors, indicating that within-panel spread can serve as a practical uncertainty signal for routing difficult answers to human review or to larger/more specialized panels. Full article
(This article belongs to the Section Artificial Intelligence)
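The panel metrics named in this abstract can be sketched in a few lines. The definitions of Consistency Spread (highest minus lowest member score) and of bias as human minus panel follow the abstract. The mean as the committee aggregate, the routing threshold, the function names, and the toy scores are all assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean, median


def consistency_spread(member_scores):
    """Within-panel disagreement: highest minus lowest member score."""
    return max(member_scores) - min(member_scores)


def panel_aggregate(member_scores):
    """Committee score. Assumption: a plain mean of the member scores."""
    return mean(member_scores)


def mean_absolute_error(human, panel):
    """Average absolute gap between human and panel grades."""
    return mean(abs(h - p) for h, p in zip(human, panel))


def signed_bias(human, panel):
    """Mean of (human - panel); negative means the panel grades higher."""
    return mean(h - p for h, p in zip(human, panel))


# Toy data (hypothetical): three answers, four panel members, 0-10 integers.
panel_scores = [[7, 8, 8, 9], [5, 9, 6, 10], [10, 10, 10, 10]]
human_scores = [6, 7, 9]

spreads = [consistency_spread(s) for s in panel_scores]
aggregates = [panel_aggregate(s) for s in panel_scores]

# Hypothetical routing rule: send high-disagreement items to human review.
REVIEW_THRESHOLD = 4
needs_review = [i for i, s in enumerate(spreads) if s >= REVIEW_THRESHOLD]

print(spreads, median(spreads), needs_review)
print(mean_absolute_error(human_scores, aggregates),
      signed_bias(human_scores, aggregates))
```

On this toy data the spreads are 2, 5, and 0, so only the second answer would be routed for review, and the negative bias corresponds to a panel that is more lenient than the human grader.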
