Search Results (11,140)

Search Parameters:
Keywords = language model

26 pages, 2948 KB  
Article
A Multimodal Model- and Retrieval-Guided Framework for BIM Model Cost Estimation
by Hassan Al-Derham, Ruchika Jagannath Chaudhari, Lu Gao and Ahmed Senouci
Buildings 2026, 16(11), 2103; https://doi.org/10.3390/buildings16112103 (registering DOI) - 25 May 2026
Abstract
BIM model-based construction cost estimation requires reliable linkage between model-derived building information and estimator-facing cost records. However, BIM models and structured cost databases use different descriptive logics: BIM model data primarily describe what a building component is in the model, whereas cost records primarily describe how that component is constructed, measured, and priced. When BIM model names are non-standard or properties are incomplete, this mismatch may lead to ambiguous cost item selection, particularly when candidate records differ in unit basis, material assembly, thickness, finish, fire rating, or performance requirements. To address this problem, this study proposes a multimodal model- and retrieval-guided framework for BIM model-based cost estimation. The framework converts BIM model content into standardized estimator-readable descriptions, retrieves cost database candidate entries, applies rule-based checks for unit, material, thickness, finish, and fire rating consistency, and produces reviewable cost item selections for database-based cost calculation. The method uses a multimodal model to supplement and standardize component information, while cost records remain the authority for unit prices rather than being replaced by model-generated estimates. The framework was evaluated using a BIM example containing 7374 building elements across 21 model element types, together with a structured cost database containing approximately 11,500 pricing records. The full workflow reduced unmatched categories and improved pricing coverage relative to direct cost item retrieval. The results indicated that the proposed method can improve the technical appropriateness and coverage of cost item selection. 
The study contributes a reviewable workflow that integrates BIM model content, multimodal description standardization, cost database candidate retrieval, rule-based specification filtering, and database-grounded cost synthesis for selecting justified cost items under practical estimating ambiguity.
(This article belongs to the Special Issue Digital Technologies in Construction and Built Environment)
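The rule-based consistency checks mentioned in the abstract above could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `CostRecord` fields, the `filter_candidates` helper, the 5 mm thickness tolerance, and all sample records are hypothetical, and the finish check is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostRecord:
    # Hypothetical cost database entry; field names are assumptions.
    code: str
    unit: str
    material: str
    thickness_mm: float
    fire_rating_min: int
    unit_price: float


def filter_candidates(component, candidates, thickness_tol_mm=5.0):
    """Keep only records consistent with the component on unit, material,
    thickness (within a tolerance), and fire rating (record must meet it)."""
    kept = []
    for rec in candidates:
        if rec.unit != component["unit"]:
            continue
        if rec.material != component["material"]:
            continue
        if abs(rec.thickness_mm - component["thickness_mm"]) > thickness_tol_mm:
            continue
        if rec.fire_rating_min < component["fire_rating_min"]:
            continue
        kept.append(rec)
    return kept


# Standardized description of one BIM component (hypothetical values).
component = {"unit": "m2", "material": "gypsum board",
             "thickness_mm": 100.0, "fire_rating_min": 60}

candidates = [
    CostRecord("W-101", "m2", "gypsum board", 100.0, 60, 42.0),
    CostRecord("W-102", "m2", "gypsum board", 150.0, 60, 55.0),  # wrong thickness
    CostRecord("W-103", "m", "gypsum board", 100.0, 60, 12.0),   # wrong unit basis
    CostRecord("W-104", "m2", "gypsum board", 100.0, 30, 38.0),  # fire rating too low
]

selected = filter_candidates(component, candidates)

# The unit price comes from the database record, not from a model estimate.
quantity_m2 = 250.0
cost = quantity_m2 * selected[0].unit_price
```

In this sketch the price is read from the surviving record, mirroring the abstract's point that cost records remain the authority for unit prices.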

28 pages, 1703 KB  
Article
Temporal Obfuscation Testing for LLM Structural Reasoning: From Single-Day Dealer Constraints to Persistent Market Regimes
by Christopher Regan and Ying Xie
J. Risk Financial Manag. 2026, 19(6), 382; https://doi.org/10.3390/jrfm19060382 (registering DOI) - 25 May 2026
Abstract
Deploying large language models (LLMs) for domain-specific analysis raises a critical validation challenge: distinguishing genuine structural reasoning from training data memorization. We address this through temporal obfuscation testing, which strips calendar dates, ticker symbols, and contextual markers from input sequences, forcing models to reason from numerical structure alone. Applying this framework to options dealer gamma exposure (GEX) patterns across two temporal scales, we validate detection using 2221 evaluations (1412 real windows plus 809 synthetic controls) spanning 2020–2025. At the single-day scale, obfuscation testing achieves 71.5% detection of dealer hedging patterns with 91.2% predictive accuracy; raw strike-level data outperforms pre-calculated GEX metrics by 30.8 percentage points (92.3% vs. 61.5%), establishing that parametric aggregation represents lossy compression of structural signal. At the multi-day scale, 30-day regime detection achieves 81.2% detection in 2024 (95% CI [75.8, 86.1]%) versus 12.1% in 2020 (95% CI [8.1, 16.6]%), a 69.1 percentage point separation (φ = 0.69, Fisher’s exact p = 1.8 × 10⁻⁵²), with 0% false positives on synthetic controls. Multi-year analysis reveals regime evolution tracking zero-days-to-expiration (0DTE) adoption, with detection rising from 3.7% (2021) to 100% (2024) and GEX magnitude growing from $3.0B to $20.3B. Stable detection despite collapsing profitability (Sharpe 1.8 → 0.1) confirms structural market mechanics rather than exploitable inefficiencies, establishing temporal obfuscation as a generalizable methodology for validating LLM reasoning in quantitative domains.
(This article belongs to the Section Financial Technology and Innovation)
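The obfuscation step described in the abstract above might look roughly like the sketch below. The record layout (`date`, `ticker`, `strike`, `open_interest`) and the `obfuscate` helper are hypothetical and not taken from the paper; the sketch only shows the general idea of replacing calendar dates with relative day offsets and ticker symbols with anonymous labels while leaving the numerical structure untouched.

```python
from datetime import date


def obfuscate(rows):
    """Replace calendar dates with day offsets from the first row's window
    and ticker symbols with anonymous labels; keep numeric fields as-is."""
    first_day = min(r["date"] for r in rows)
    labels = {}  # ticker -> anonymous label, assigned in order of appearance
    out = []
    for r in rows:
        label = labels.setdefault(r["ticker"], f"ASSET_{len(labels) + 1}")
        out.append({
            "day": (r["date"] - first_day).days,
            "asset": label,
            "strike": r["strike"],
            "open_interest": r["open_interest"],
        })
    return out


# Hypothetical input rows, for illustration only.
rows = [
    {"date": date(2024, 3, 1), "ticker": "SPY", "strike": 510.0, "open_interest": 1200},
    {"date": date(2024, 3, 4), "ticker": "SPY", "strike": 515.0, "open_interest": 900},
    {"date": date(2024, 3, 4), "ticker": "QQQ", "strike": 440.0, "open_interest": 300},
]

masked = obfuscate(rows)
```

After this transformation, no row carries a calendar date or a recognizable symbol, so a model prompted with `masked` has only relative timing and numeric values to reason from.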

39 pages, 1013 KB  
Article
Reframing Internal Audit Through Emerging AI Technologies: Toward an Integration Framework and Assessment Model
by Ionut-Florin Anica-Popa, Cătălin-Georgel Tudor, Liana-Elena Anica-Popa and Marinela Vrîncianu
Electronics 2026, 15(11), 2280; https://doi.org/10.3390/electronics15112280 (registering DOI) - 25 May 2026
Abstract
The rapid adoption of artificial intelligence (AI) technologies into organizational information systems is reshaping internal audit practices and governance mechanisms. Nevertheless, notable gaps persist in both the academic literature and professional practice in integrating AI technologies within the internal audit function and in assessing its maturity level. To address these limitations, this study employs a two-stage methodological approach. First, a Systematic Literature Review (SLR) was conducted following PRISMA 2020 and PRISMA-S guidelines to synthesize existing knowledge and identify structural gaps in AI integration within the internal audit function (IAF). Second, drawing on the SLR findings, the study follows a theory-building and conceptual artifact-design approach. Three instruments are developed and assessed: the Internal Audit–Artificial Intelligence Integration Framework (IA-AIIF), the Internal Audit–AI Integration Assessment Cube (IA-AI Cube), and a Hierarchical Weighted Scoring Model (HWSM). These instruments enable the multidimensional evaluation of AI integration across technical, operational, and governance dimensions. They may offer guidance for both practitioners and researchers advancing AI-enabled approaches within the IAF. The findings suggest practical, managerial, and theoretical contributions by supporting AI integration in internal audit practice, outlining implementation-oriented recommendations for AI adoption, and advancing the body of knowledge at the intersection of internal audit and emerging AI technologies.
(This article belongs to the Special Issue Generative AI and Its Transformative Potential, 2nd Edition)

28 pages, 4453 KB  
Article
Layered Network Diffusion of Misinformation on YouTube: A Multi-Level Analysis of Video and Channel Interactions
by Md Irfanuzzaman Khan, Benedict Sheehy and Bruce Baer Arnold
Platforms 2026, 4(2), 9; https://doi.org/10.3390/platforms4020009 (registering DOI) - 25 May 2026
Abstract
Misinformation has become a persistent feature of contemporary digital information environments. Platform designs and business models often privilege attention, engagement, and repeated exposure over epistemic quality. However, misinformation does not diffuse uniformly across platform structures. This study examines how contested claims in a South Korean social policy controversy circulate on YouTube. The analysis focuses on unfounded allegations regarding permanent employment offers to part-time workers at Incheon International Airport across two analytic levels: (1) a videoclip network, in which video-to-video ties are formed through shared commenters over time, and (2) a channel network, in which channel-to-channel ties are formed through shared commenters over time. Drawing on YouTube Data API records, we employ a mixed computational approach that integrates social network analysis, speech-to-text transcription, natural language processing, semantic network analysis, and automated content classification. Videos are classified as misinformation or non-misinformation based on the presence of demonstrably incorrect claims or corrective content. We compare network structure, diffusion patterns, and engagement dynamics across these two layers. The results reveal pronounced layer-specific differences. Misinformation diffuses more extensively within the channel network, which exhibits higher density and stronger cross-channel interconnectedness, suggesting that creator-level infrastructures function as stabilising conduits for the circulation of false claims. By contrast, diffusion pathways at the videoclip level show comparatively weaker differentiation between misinformation and non-misinformation content. Engagement patterns also diverge: misinformation videos attract significantly more likes, while message format and channel attributes are less consistently distinguishing. 
From a theoretical standpoint, this study advances a multi-layer diffusion perspective on platform-mediated misinformation by demonstrating how platform architectures shape the visibility, persistence, and amplification of false claims. The findings highlight the importance of intervention strategies that move beyond individual content moderation toward creator- and network-level governance mechanisms, with implications for the design of platform features, recommendation systems, and misinformation mitigation tools.

26 pages, 3152 KB  
Article
Ethical Coordination of LLM Multi-Agent Systems
by J. de Curtò, I. de Zarzà and Carlos T. Calafate
Electronics 2026, 15(11), 2278; https://doi.org/10.3390/electronics15112278 (registering DOI) - 25 May 2026
Abstract
Embedding large language model (LLM) coordinators in production electronic systems (connected vehicles, multi-robot fabrics, IoT control loops, telecommunications orchestration) demands a pre-delivery filter stage that preserves ethical guarantees under adversarial influence at deployment scale. We present a constitutional governance layer that filters compiled influence policies before they reach a heterogeneous population of grounded LLM agents whose hybrid decision model combines a game-theoretic base probability with an LLM-evaluated narrative shift attenuated by per-agent resistance. Four experiments on a Barabási–Albert scale-free network of 30 agents powered by Llama-3.3-70B-Instruct show that the filter holds an Ethical Cooperation Score (ECS) of 0.176 (multi-seed mean 0.163, 95% confidence interval (CI) [0.150, 0.174]) against an unconstrained baseline of ECS = 0, enforced by a hard integrity gate (1.000 vs. 0.000). We surface an autonomy paradox in which unconstrained agents resist manipulation more forcefully (0.856 vs. 0.728) yet collapse to ECS = 0, establishing that system-level integrity cannot be delegated to agent-level defence. The advantage is monotonic in resistance (+0.174 to +0.183), seed-stable (Cliff’s δ = 1.0, complete separation), topology- and backbone-invariant across five contemporary LLMs, robust to alternative ECS formulations, and reproduces at N = 100. Against constitutional artificial intelligence (CAI) critique-revise and LlamaGuard-style safety-classifier baselines, the framework matches the integrity floor and adds a measurable margin on the secondary risk surface (burst timing, composite manipulation risk). The filter runs at 0.78 μs/call (1.3 × 10⁶ decisions/s/core), supporting always-on deployment as a stateless, model-agnostic component of LLM agent pipelines in adversarially contested electronic systems.
(This article belongs to the Special Issue AI-Powered Natural Language Processing Applications)

19 pages, 3811 KB  
Article
Understanding and Mitigating Multilingual Bias in LLM-Driven Verilog Code Generation via Hard-Example In-Context Learning
by Guang Yang
Electronics 2026, 15(11), 2275; https://doi.org/10.3390/electronics15112275 (registering DOI) - 25 May 2026
Abstract
Large language models (LLMs) are increasingly adopted for Verilog code generation, yet existing benchmarks assume English-only prompts, overlooking the linguistic diversity of the global FPGA engineering community. We introduce Multi-VerilogEval, the first multilingual Verilog benchmark, built from 156 unique underlying tasks instantiated in four languages (English, Japanese, Hindi, and Mongolian), yielding 624 language-specific test cases. Our evaluation of four representative LLMs reveals a silent failure pattern: syntactic correctness remains high (∼90%) across languages, but functional correctness degrades by up to 23.9% for non-English prompts in open-source and domain-specific models, while commercial models remain near-parity. Hidden-state analysis suggests that multilingual bias is associated with persistent cross-lingual representation divergence throughout the network, which becomes most pronounced in the final layers that directly drive token generation. As fine-tuning and common prompt-based mitigations remain impractical or unreliable for multilingual RTL, we propose HE-ICL (Hard-Example In-Context Learning), a train-free method that constructs few-shot hard-example demonstrations from cross-lingually difficult cases. HE-ICL closes 80–100% of the multilingual gap without any parameter updates, achieving near-parity with or exceeding the English reference level across all evaluated HE-ICL settings.
(This article belongs to the Section Artificial Intelligence)

20 pages, 701 KB  
Article
A Generative AI Architecture Integrating Retrieval-Augmented Generation and Low-Rank Adaptation for Knowledge-Intensive Medical Reasoning
by Ming-Hseng Tseng, Yu-Chuan Chen and Wei-Ting Chen
Future Internet 2026, 18(6), 280; https://doi.org/10.3390/fi18060280 (registering DOI) - 25 May 2026
Abstract
Large language models (LLMs) have demonstrated strong potential in medical knowledge applications; however, their reliability in knowledge-intensive medical reasoning remains limited due to hallucination, inadequate domain grounding, and unstable inference behavior. These limitations are particularly pronounced in tasks of professional medical reasoning that require strict logical consistency and authoritative knowledge support. This study proposes a generative AI architecture that integrates RAG (Retrieval-Augmented Generation) with parameter-efficient supervised fine-tuning based on Low-Rank Adaptation (LoRA) to improve reasoning stability and diagnostic accuracy in complex medical domains. The architecture combines internalized domain reasoning learned through LoRA-based fine-tuning with external knowledge grounding enabled by a dynamic RAG mechanism, allowing the model to selectively retrieve domain-specific knowledge only when it is semantically relevant and evidence supported. To validate the proposed architecture, a large-scale real-world dataset comprising 11,476 multiple-choice questions from Taiwan’s national Traditional Chinese Medicine (TCM) licensing examinations (2005–2025) is constructed as a representative case study of knowledge-intensive medical reasoning. The experimental results show that the baseline LLM achieves an accuracy of 61.0%. Incorporating RAG improves accuracy to 89.0%, while the combined LoRA-based fine-tuning and RAG architecture further increases accuracy to 90.1%, with reduced variance across repeated evaluations. Statistical analysis using McNemar’s test confirms that the performance improvements introduced by the retrieval mechanism are highly significant. The results demonstrate that integrating parameter-efficient fine-tuning with dynamically controlled retrieval is critical to balancing reasoning stability and knowledge enhancement in generative AI systems. 
Beyond the specific medical case study examined in this work, the proposed architecture offers a reproducible and extensible framework for developing reliable generative AI systems in other knowledge-intensive professional reasoning and educational domains.
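For readers unfamiliar with McNemar's test, which the abstract above cites for comparing two systems on the same questions, the exact two-sided version can be sketched as follows. The abstract does not report the discordant-pair counts, so the values of `b` and `c` below are purely hypothetical, and `mcnemar_exact` is an illustrative helper rather than the authors' analysis code.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: questions the first system answered correctly and the second did not
    c: questions the second system answered correctly and the first did not
    Under the null hypothesis, each discordant pair is equally likely to
    fall on either side, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical discordant counts, for illustration only.
p_value = mcnemar_exact(b=1, c=9)
```

Only the questions on which the two systems disagree enter the test; questions both systems get right or both get wrong carry no information about which system is better.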

20 pages, 451 KB  
Article
Active Learning and Feedback in EFL Teacher Education Through AI-Supported Flipped Classrooms
by Paola Cabrera-Solano, Luz Castillo-Cuesta and Cesar Ochoa-Cueva
Educ. Sci. 2026, 16(6), 827; https://doi.org/10.3390/educsci16060827 (registering DOI) - 25 May 2026
Abstract
This study examines the integration of generative Artificial Intelligence (AI) tools within a Flipped Classroom model to enhance active learning and feedback processes in an English as a Foreign Language (EFL) teaching program. The participants were 242 pre-service EFL teachers enrolled in upper-level courses at a private university in southern Ecuador. Adopting a mixed-methods, design-based research approach, the study incorporated a diagnostic survey, written reflections, a post-intervention survey, and focus groups. These instruments explored students’ prior knowledge, perceptions, and experiences regarding AI-supported learning. Findings showed that AI tools such as ChatGPT, Gemini, and Copilot strengthened students’ linguistic accuracy, writing performance, self-regulation, and understanding of pedagogical concepts. AI-generated feedback complemented teacher feedback by providing immediate and clear guidance, promoting iterative revision and deeper engagement with course content. Participants reported increased autonomy, improved time management, and greater readiness to integrate AI into future teaching practices. The results indicate that AI-supported flipped instruction fosters meaningful learning, enhances feedback quality, and develops both linguistic and pedagogical competencies.
(This article belongs to the Section Higher Education)

26 pages, 3887 KB  
Article
Bigger Isn’t Always Better: Choosing the Right Size Large Language Model for Locally Hosted School Settings
by Cecilia Ka Yuk Chan, Wei Dai, Kepan Cao, Alan T. Y. Poon and Tom Colloton
Appl. Sci. 2026, 16(11), 5268; https://doi.org/10.3390/app16115268 - 25 May 2026
Abstract
The rapid integration of large language models (LLMs) into education has shifted research focus from questions of capability (what LLMs can do and how accurately) to questions of deployability, including how they can be operated effectively for many learners at once. In school environments, system reliability, scalability, and real-time responsiveness are critical, as delays or interruptions can directly reduce learner engagement, particularly during synchronous activities. This study evaluates the performance of open-source LLaMA models ranging from 1 billion to 70 billion parameters across one-, dual-, triple-, and quad-GPU configurations suitable for educational settings. Performance is assessed using four key indicators: success rate (percentage of completed requests), generation speed (tokens per second), throughput (completed responses per second), and latency (time until full response generation). These metrics were measured under progressively increasing numbers of simultaneous users to identify system capacity limits and trade-offs between model size, responsiveness, and scalability. The results indicate that smaller models (1B–3B) deliver faster, more stable performance under concurrent use, while larger models (8B–70B) experience substantial slowdowns and reduced reliability, even on high-end GPU systems. These findings suggest that effective educational deployment should prioritize empirical performance and infrastructure compatibility over model size alone. The paper concludes by proposing a practical framework to guide educators, administrators, and developers in selecting and configuring locally hosted GPU systems that balance model capability, response speed, and resource efficiency for real-time applications such as AI tutors, classroom chatbots, and automated feedback tools.
(This article belongs to the Special Issue Innovative Applications of Artificial Intelligence in Education)
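The four indicators named in the abstract above could be computed from per-request logs along the lines of the sketch below. The `RequestLog` layout, the `summarize` helper, and the aggregation choices (pooled tokens per second, mean latency, throughput over the whole observation window) are assumptions made for illustration; the paper may define or aggregate them differently.

```python
from dataclasses import dataclass


@dataclass
class RequestLog:
    # Hypothetical log entry for one request in a load test.
    start_s: float   # time the request was sent
    end_s: float     # time the full response arrived (or the request failed)
    tokens: int      # tokens generated
    completed: bool  # whether a full response was returned


def summarize(logs):
    """Compute the four indicators; assumes at least one completed request."""
    done = [r for r in logs if r.completed]
    window_s = max(r.end_s for r in logs) - min(r.start_s for r in logs)
    latencies = [r.end_s - r.start_s for r in done]
    return {
        "success_rate_pct": 100.0 * len(done) / len(logs),
        "tokens_per_s": sum(r.tokens for r in done) / sum(latencies),
        "responses_per_s": len(done) / window_s,
        "mean_latency_s": sum(latencies) / len(latencies),
    }


# Hypothetical load-test results for four simultaneous users.
logs = [
    RequestLog(0.0, 2.0, 100, True),
    RequestLog(0.0, 4.0, 200, True),
    RequestLog(1.0, 5.0, 200, True),
    RequestLog(1.0, 5.0, 0, False),  # timed out
]

report = summarize(logs)
```

Repeating such a summary while increasing the number of simultaneous users is one way to locate the point at which success rate or latency starts to degrade.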

20 pages, 1913 KB  
Review
Informed Consent in AI-Augmented Dentistry and Dental Research: A Scoping Review
by Tamara Mihut, Corina Marilena Cristache, Luminita Oancea and Victor Nimigean
Dent. J. 2026, 14(6), 320; https://doi.org/10.3390/dj14060320 - 25 May 2026
Abstract
Background/Objectives: Artificial intelligence (AI) is increasingly used in dental diagnostics, treatment planning, documentation, and research. However, there is limited synthesis of how informed consent should be understood and operationalized in AI-augmented dentistry. This scoping review aimed to map the existing literature on informed consent in AI-assisted dental care and dental research, identify conceptual and practical gaps, and synthesize key domains relevant to ethically robust implementation. Methods: This review was conducted in accordance with PRISMA-ScR and the review question was developed using the Population–Concept–Context framework. Searches were performed in PubMed, Web of Science, and ClinicalKey, supplemented by Google Scholar and reference list screening. English-language sources published between January 2015 and January 2026 were considered if they addressed informed consent, patient information, autonomy, transparency, accountability, or governance in relation to AI use in dentistry or dental research. Results: Of 2624 records identified, 30 sources were included. The reviewed literature consistently emphasized the importance of disclosing AI involvement, clarifying clinician accountability, communicating uncertainty and bias, distinguishing clinical care from research-related consent, and addressing secondary data use. Most included sources were conceptual, ethical, regulatory, or narrative in nature, with limited empirical evidence on implementation or patient outcomes. Conclusions: The available literature suggests that informed consent in AI-augmented dentistry should extend beyond traditional clinician–patient models to explicitly address AI involvement, human oversight, and data governance. Based on recurring themes across the included sources, we propose the ACCOUNT-AI framework as a conceptual synthesis to support future research, policy development, and implementation efforts.

24 pages, 1136 KB  
Article
RIB-Guard: A Risk-Aware Information Bottleneck Defense for Black-Box Large Language Models
by Muen Cai, Yuan Shen, Xiong Luo and Jian Hu
Entropy 2026, 28(6), 585; https://doi.org/10.3390/e28060585 - 24 May 2026
Abstract
Large language models (LLMs) remain vulnerable to jailbreak attacks, especially in black-box settings where target-model gradients and internal tokenization are inaccessible. Recent information bottleneck-based defenses cast prompt protection as a compression problem, but existing methods still rely heavily on white-box optimization and the intrinsic alignment strength of the protected model. To address these limitations, we propose RIB-Guard, a safety-aware information bottleneck defense for black-box LLMs. RIB-Guard learns a token-level masking policy that extracts a minimally safety-sufficient prompt via reinforcement learning using only black-box feedback. In addition, it introduces an independent lightweight safety head to estimate residual jailbreak risk and provide model-agnostic safety guidance during training. The proposed framework jointly balances prompt compactness, benign utility preservation, and residual risk suppression within a unified objective. Experimental results on direct single-turn harmful and benign prompt settings show that RIB-Guard improves jailbreak robustness while maintaining competitive benign utility. By extending information bottleneck-based prompt protection from white-box to black-box settings, RIB-Guard provides a step toward safety-aware information-theoretic front-end defense for black-box LLMs.
(This article belongs to the Special Issue The Information Bottleneck Method: Theory and Applications)
26 pages, 782 KB  
Article
Agentic Patterns for Decentralized Network Protocol Configuration
by Ahmed Twabi, Yepeng Ding and Tohru Kondo
Electronics 2026, 15(11), 2270; https://doi.org/10.3390/electronics15112270 - 24 May 2026
Abstract
Tool-augmented large language model agents are increasingly proposed for network configuration, but routing protocols differ in the control-plane state each commanded router can observe. This difference creates a specific problem for multi-agent orchestration: agents may coordinate more, yet still fail when correct verification depends on peer- or remote-router evidence. We study this interaction through 350 controlled runs on RIP, OSPF, and BGP tasks implemented with FRRouting and Containerlab, comparing a single-agent baseline with multi-agent orchestration patterns across language models. Protocol-centric trace metrics, including spatial coverage, coordination tax, and cross-router verification gap, are combined with intent-property scores and model-balanced bootstrap analysis. The results show that observability explains performance more clearly than orchestration patterns: multi-agent templates trail the baseline on local RIP feedback, show only small and uncertain gains on single-area OSPF troubleshooting, and remain near zero on stricter multi-area OSPF and BGP tasks where peer-side verification gaps are often complete. The main contribution is therefore a protocol-centered account of when agentic orchestration helps, when it adds coordination cost, and why current architectures face a cross-router verification ceiling.
25 pages, 1045 KB  
Article
ADL-KG: Diacritic-Aware Knowledge Graph Prompting for Arabic LLM Question Answering
by Narimene Ayat, Fouzi Harrag, Nassir Harrag and Khaled Shaalan
Computation 2026, 14(6), 121; https://doi.org/10.3390/computation14060121 - 24 May 2026
Abstract
Arabic’s complex morphological system and the optional use of short vowels (tashkīl) introduce substantial lexical ambiguity, posing significant challenges for Large Language Models (LLMs). While diacritics enhance linguistic precision, LLMs trained predominantly on undiacritized corpora often exhibit performance degradation when processing fully diacritized inputs due to representation shifts and tokenization inconsistencies. To address this limitation, we propose the Arabic Diacritic Lexical Knowledge Graph (ADL-KG), a structured framework that links diacritized and undiacritized forms through integrated lexical, morphological, and semantic knowledge. Building upon this resource, we introduce Diacritic-Aware Knowledge Graph Prompting (DA-KGP), a prompt augmentation strategy that injects explicit linguistic features into LLM inputs to facilitate robust interpretation of diacritized Arabic text. The framework is evaluated on the Arabic Reading Comprehension Dataset under zero-shot and few-shot question answering across AraGPT2-base, BLOOMZ-560M, SILMA-v1, and LLaMA 3.1-8B. Performance is assessed using Exact Match, BLEU, ROUGE-1, and BERTScore-F1. Experimental results show that fully diacritized prompts significantly degrade baseline performance, whereas DA-KGP consistently mitigates this effect by improving semantic alignment across diverse architectures. For AraGPT2-base, KG augmentation improves average BERTScore-F1 by +5.96 points. SILMA-v1 achieves the strongest lexical improvements, reaching 21.57 BLEU and 81.31% BERTScore-F1 in the KG-enhanced two-shot configuration. LLaMA 3.1-8B achieves the highest overall semantic performance with 82.54% BERTScore-F1 under KG-enhanced prompting, while BLOOMZ-560M also demonstrates statistically significant semantic gains through structured augmentation. 
These findings demonstrate that morphologically informed prompting and structured lexical grounding provide an effective and parameter-efficient strategy for improving the robustness and semantic fidelity of Arabic LLMs under fully diacritized input conditions. Full article
48 pages, 1363 KB  
Article
SyMPRep: A Symbolic Math Problem Representation Framework for Structured and Controllable Problem Transformation
by Hyuk Namgoong, Yerim Han and Sangkeun Jung
Appl. Sci. 2026, 16(11), 5256; https://doi.org/10.3390/app16115256 - 24 May 2026
Abstract
Mathematical problem transformation is a teaching-and-learning strategy that extends conceptual understanding and problem-solving ability by expressing the same concept across diverse situations. It has recently attracted attention in artificial intelligence as a tool for data augmentation, difficulty control, and model evaluation. However, existing approaches struggle to jointly represent and control how core mathematical elements (such as operational structure, quantitative relations, and conditions) are preserved or modified. This limitation is particularly evident in natural-language problems, where intertwined components make it difficult to perform targeted partial transformations or verify structural validity. To address these challenges, we propose the Symbolic Math Problem Representation Framework (SyMPRep), which represents the relationships among sentences, conditions, questions, quantities, units, and operations in a symbolic structure. It classifies free-form instructions into predefined categories and decomposes problems into constituent elements, enabling transformation over an explicit abstraction structure. This allows problem transformation to be treated as a controllable, traceable, and recoverable structural operation rather than surface rewriting. Experiments on GSM8K and Math500 show that SyMPRep achieves stable alignment and recoverability, and confirm that the main challenge lies in structural control rather than surface fluency. Ablation results highlight the importance of symbolic schema and show that different metrics capture distinct aspects of transformation quality. In downstream applications, answer-invariant transformations yield modest improvements on easier problems, while human evaluation indicates that the generated problems are coherent and suitable for educational use. These findings suggest that SyMPRep serves as a representation-driven interface for controllable structural transformation. Full article
28 pages, 12534 KB  
Article
Temporal Dynamics of Postharvest Quality in Carrot Genotypes: A Multidimensional Analysis of Physicochemical, Biofunctional, Spectral, and Sensory Attributes
by Paola Andrea Ospina-Sanchez, Juan Camilo Henao-Rojas, Luz Marina Melgarejo and Joaquin Guillermo Ramirez-Gil
Horticulturae 2026, 12(6), 657; https://doi.org/10.3390/horticulturae12060657 - 24 May 2026
Abstract
Postharvest quality of carrot (Daucus carota L.) is determined by the interaction between genotype and storage environment, yet systematic comparative evidence across pigmented genotypes with contrasting biochemical profiles remains scarce. This study evaluated the postharvest behavior of five carrot genotypes (6KUR, 14BER, yellow, white, and purple) under refrigeration (4 °C) and room temperature (15 °C) over 30 days, integrating physicochemical, spectral, and consumer-based assessment. Variables included color, fresh weight loss, respiration rate, firmness, titratable acidity, total soluble solids, and β-carotene quantification by spectrophotometry. Non-destructive monitoring was performed using Vis/NIR reflectance (350–1900 nm) with spectral indices sensitive to anthocyanin and carotenoid content (CRI1, CRI2, mARI) and tissue structural integrity (NDVI), and consumer perception (~60 participants per evaluation) was characterized through natural language processing of open-ended responses. Refrigeration significantly reduced β-carotene degradation (~15–20% loss vs. 50–60% at room temperature) and better preserved overall quality across genotypes. Purple carrots demonstrated superior postharvest stability across multiple traits, whereas white carrots showed the greatest susceptibility to quality loss. Spectral indices exhibited genotype-dependent temporal variation, particularly in pigmented roots, supporting their potential for non-destructive pigment monitoring during storage. Consumer descriptors reflected a progressive decline in desirable sensory attributes under both conditions. These findings support the integration of physicochemical, spectral, and sensory approaches for comprehensive postharvest characterization of genotypically diverse carrot germplasm, and identify priority genotypes and trait combinations for future predictive modeling studies. Full article
(This article belongs to the Section Postharvest Biology, Quality, Safety, and Technology)