Search Results (8,423)

Search Parameters:
Keywords = language modelling

9 pages, 493 KB  
Technical Note
Rapid Agrichemical Inventory via Video Documentation and Large Language Model Identification
by Michael Anastario, Cynthia Armendáriz-Arnez, Lillian Shakespeare Largo, Talia Gordon and Elizabeth F. S. Roberts
Int. J. Environ. Res. Public Health 2025, 22(10), 1527; https://doi.org/10.3390/ijerph22101527 (registering DOI) - 5 Oct 2025
Abstract
Background: This technical note presents a methodological approach to agrichemical inventory documentation. It complements exposure assessments in field settings with time-restricted observational periods. Conducted in Michoacán, Mexico, this method leverages large language model (LLM) capabilities for categorizing agrichemicals from brief video footage. Method: Given time-limited access to a storage shed housing various agrichemicals, a short video was recorded and processed into 31 screenshots. Using OpenAI’s ChatGPT (model: GPT-4o®), agrichemicals in each image were identified and categorized as fertilizers, herbicides, insecticides, fungicides, or other substances. Results: Human validation revealed that the LLM accurately identified 75% of agrichemicals, with human verification correcting entries. Conclusions: This rapid identification method builds upon behavioral methods of exposure assessment, facilitating initial data collection in contexts where researcher access to hazardous materials may be time limited and would benefit from the efficiency and cross-validation offered by this method. Further refinement of this LLM-assisted approach could optimize accuracy in the identification of agrichemical products and expand its application to complement exposure assessments in field-based research, particularly as LLM technologies rapidly evolve. Most importantly, this Technical Note illustrates how field researchers can strategically harness LLMs under real-world time constraints, opening new possibilities for rapid observational approaches to exposure assessment. Full article
22 pages, 1443 KB  
Article
Leveraging Symmetry in Multi-Agent Code Generation: A Cross-Verification Collaboration Protocol for Competitive Programming
by Aoyu Song and Afizan Azman
Symmetry 2025, 17(10), 1660; https://doi.org/10.3390/sym17101660 (registering DOI) - 5 Oct 2025
Abstract
Competitive programming has emerged as a critical benchmark for evaluating large language models (LLMs) in solving algorithmic problems under competitive conditions. Existing methods, such as the Sequential One-Agent Pipeline (SOP) approach, suffer from significant limitations, including the inability to effectively manage semantic drift across multiple stages, a lack of coordinated adversarial testing, and suboptimal final solutions. These issues lead to high rates of wrong answer (WA) and time-limit exceeded (TLE) errors, especially on complex problems. In this paper, we propose the Cross-Verification Collaboration Protocol (CVCP), a multi-agent framework that integrates symmetry detection, symmetry-guided adversarial testing, Round-Trip Review Protocol (RTRP), and Asynchronous Voting Resolution (AVR) to address these shortcomings. We evaluate our method on the CodeELO dataset, showing significant improvements in performance, with Elo Ratings increasing by up to 7.1% and Pass Rates for hard problems improving by as much as 1.8 times compared to the SOP baseline. Full article
(This article belongs to the Section Computer)
16 pages, 2489 KB  
Article
Sentence-Level Silent Speech Recognition Using a Wearable EMG/EEG Sensor System with AI-Driven Sensor Fusion and Language Model
by Nicholas Satterlee, Xiaowei Zuo, Kee Moon, Sung Q. Lee, Matthew Peterson and John S. Kang
Sensors 2025, 25(19), 6168; https://doi.org/10.3390/s25196168 (registering DOI) - 5 Oct 2025
Abstract
Silent speech recognition (SSR) enables communication without vocalization by interpreting biosignals such as electromyography (EMG) and electroencephalography (EEG). Most existing SSR systems rely on high-density, non-wearable sensors and focus primarily on isolated word recognition, limiting their practical usability. This study presents a wearable SSR system capable of accurate sentence-level recognition using single-channel EMG and EEG sensors with real-time wireless transmission. A moving window-based few-shot learning model, implemented with a Siamese neural network, segments and classifies words from continuous biosignals without requiring pauses or manual segmentation between word signals. A novel sensor fusion model integrates both EMG and EEG modalities, enhancing classification accuracy. To further improve sentence-level recognition, a statistical language model (LM) is applied as post-processing to correct syntactic and lexical errors. The system was evaluated on a dataset of four military command sentences containing ten unique words, achieving 95.25% sentence-level recognition accuracy. These results demonstrate the feasibility of sentence-level SSR using wearable sensors through a window-based few-shot learning model, sensor fusion, and ML applied to limited simultaneous EMG and EEG signals. Full article
(This article belongs to the Special Issue Advanced Sensing Techniques in Biomedical Signal Processing)
22 pages, 6212 KB  
Article
VLA-MP: A Vision-Language-Action Framework for Multimodal Perception and Physics-Constrained Action Generation in Autonomous Driving
by Maoning Ge, Kento Ohtani, Yingjie Niu, Yuxiao Zhang and Kazuya Takeda
Sensors 2025, 25(19), 6163; https://doi.org/10.3390/s25196163 (registering DOI) - 5 Oct 2025
Abstract
Autonomous driving in complex real-world environments requires robust perception, reasoning, and physically feasible planning, which remain challenging for current end-to-end approaches. This paper introduces VLA-MP, a unified vision-language-action framework that integrates multimodal Bird’s-Eye View (BEV) perception, vision-language alignment, and a GRU-bicycle dynamics cascade adapter for physics-informed action generation. The system constructs structured environmental representations from RGB images and LiDAR, aligns scene features with natural language instructions through a cross-modal projector and large language model, and converts high-level semantic hidden-state outputs into executable and physically consistent trajectories. Experiments on the LMDrive dataset and CARLA simulator demonstrate that VLA-MP achieves high performance across the LangAuto benchmark series, with best driving scores of 44.3, 63.5, and 78.4 on LangAuto, LangAuto-Short, and LangAuto-Tiny, respectively, while maintaining high infraction scores of 0.89–0.95, outperforming recent VLA methods such as LMDrive and AD-H. Visualization and video results further validate the framework’s ability to follow complex language-conditioned instructions, adapt to dynamic environments, and prioritize safety. These findings highlight the potential of combining multimodal perception, language reasoning, and physics-aware adapters for robust and interpretable autonomous driving. Full article
(This article belongs to the Special Issue Large AI Models for Positioning and Perception in Autonomous Driving)
26 pages, 3424 KB  
Systematic Review
Rethinking Blockchain Governance with AI: The VOPPA Framework
by Catalin Daniel Morar, Daniela Elena Popescu, Ovidiu Constantin Novac and David Ghiurău
Computers 2025, 14(10), 425; https://doi.org/10.3390/computers14100425 (registering DOI) - 4 Oct 2025
Abstract
Blockchain governance has become central to the performance and resilience of decentralized systems, yet current models face recurring issues of participation, coordination, and adaptability. This article offers a structured analysis of governance frameworks and highlights their limitations through recent high-impact case studies. It then examines how artificial intelligence (AI) is being integrated into governance processes, ranging from proposal summarization and anomaly detection to autonomous agent-based voting. In response to existing gaps, this paper proposes the Voting Via Parallel Predictive Agents (VOPPA) framework, a multi-agent architecture aimed at enabling predictive, diverse, and decentralized decision-making. Strengthening blockchain governance will require not just decentralization but also intelligent, adaptable, and accountable decision-making systems. Full article
14 pages, 588 KB  
Protocol
The Silent Cognitive Burden of Chronic Pain: Protocol for an AI-Enhanced Living Dose–Response Bayesian Meta-Analysis
by Kevin Pacheco-Barrios, Rafaela Machado Filardi, Edward Yoon, Luis Fernando Gonzalez-Gonzalez, Joao Victor Ribeiro, Joao Pedro Perin, Paulo S. de Melo, Marianna Leite, Luisa Silva and Alba Navarro-Flores
J. Clin. Med. 2025, 14(19), 7030; https://doi.org/10.3390/jcm14197030 (registering DOI) - 4 Oct 2025
Abstract
Background: Chronic pain affects nearly one in five adults worldwide and is increasingly recognized not only as a disease but as a potential risk factor for neurocognitive decline and dementia. While some evidence supports this association, existing systematic reviews are static and rapidly outdated, and none have leveraged advanced methods for continuous updating and robust uncertainty modeling. Objective: This protocol describes a living systematic review with dose–response Bayesian meta-analysis, enhanced by artificial intelligence (AI) tools, to synthesize and maintain up-to-date evidence on the prospective association between any type of chronic pain and subsequent cognitive decline. Methods: We will systematically search PubMed, Embase, Web of Science, and preprint servers for prospective cohort studies evaluating chronic pain as an exposure and cognitive decline as an outcome. Screening will be semi-automated using natural language processing models (ASReview), with human oversight for quality control. Bayesian hierarchical meta-analysis will estimate pooled effect sizes and accommodate between-study heterogeneity. Meta-regression will explore study-level moderators such as pain type, severity, and cognitive domain assessed. If data permit, a dose–response meta-analysis will be conducted. Living updates will occur biannually using AI-enhanced workflows, with results transparently disseminated through preprints and peer-reviewed updates. Results: This is a protocol; results will be disseminated in future reports. Conclusions: This living Bayesian systematic review aims to provide continuously updated, methodologically rigorous evidence on the link between chronic pain and cognitive decline. The approach integrates innovative AI tools and advanced meta-analytic methods, offering a template for future living evidence syntheses in neurology and pain research. Full article
(This article belongs to the Section Anesthesiology)
18 pages, 46866 KB  
Article
SATrack: Semantic-Aware Alignment Framework for Visual–Language Tracking
by Yangyang Tian, Liusen Xu, Zhe Li, Liang Jiang, Cen Chen and Huanlong Zhang
Electronics 2025, 14(19), 3935; https://doi.org/10.3390/electronics14193935 (registering DOI) - 4 Oct 2025
Abstract
Visual–language tracking often faces challenges like target deformation and confusion caused by similar objects. These issues can disrupt the alignment between visual inputs and their textual descriptions, leading to cross-modal semantic drift and feature-matching errors. To address these issues, we propose SATrack, a Semantic-Aware Alignment framework for visual–language tracking. Specifically, we first propose the Semantically Aware Contrastive Alignment module, which leverages attention-guided semantic distance modeling to identify hard negative samples that are semantically similar but carry different labels. This helps the model better distinguish confusing instances and capture fine-grained cross-modal differences. Secondly, we design the Cross-Modal Token Filtering strategy, which leverages attention responses guided by both the visual template and the textual description to filter out irrelevant or weakly related tokens in the search region. This helps the model focus more precisely on the target. Finally, we propose a Confidence-Guided Template Memory mechanism, which evaluates the prediction quality of each frame using convolutional operations and confidence thresholding. High-confidence frames are stored to selectively update the template memory, enabling the model to adapt to appearance changes over time. Extensive experiments show that SATrack achieves a 65.8% success rate on the TNL2K benchmark, surpassing the previous state-of-the-art UVLTrack by 3.1% and demonstrating superior robustness and accuracy. Full article
(This article belongs to the Special Issue Deep Perception in Autonomous Driving, 2nd Edition)
23 pages, 548 KB  
Article
Symmetry- and Asymmetry-Aware Dual-Path Retrieval and In-Context Learning-Based LLM for Equipment Relation Extraction
by Mingfei Tang, Liang Zhang, Zhipeng Yu, Xiaolong Shi and Xiulei Liu
Symmetry 2025, 17(10), 1647; https://doi.org/10.3390/sym17101647 (registering DOI) - 4 Oct 2025
Abstract
Relation extraction in the equipment domain often exhibits asymmetric patterns, where entities participate in multiple overlapping relations that break the expected structural symmetry of semantic associations. Such asymmetry increases task complexity and reduces extraction accuracy in conventional approaches. To address this issue, we propose a symmetry- and asymmetry-aware dual-path retrieval and in-context learning-based large language model. Specifically, the BGE-M3 embedding model is fine-tuned for domain-specific adaptation, and a multi-level retrieval database is constructed to capture both global semantic symmetry at the sentence level and local asymmetric interactions at the relation level. A dual-path retrieval strategy, combined with Reciprocal Rank Fusion, integrates these complementary perspectives, while task-specific prompt templates further enhance extraction accuracy. Experimental results show that our approach mitigates the challenges posed by overlapping and asymmetric relations while exploiting the latent symmetry of semantic structures, achieving an F1-score of 88.53%, a 1.86% improvement over the strongest baseline (GPT-RE). Full article
(This article belongs to the Special Issue Symmetry and Its Applications in Computer Vision)
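The abstract above names Reciprocal Rank Fusion as the step that merges the two retrieval paths but gives no details. As a minimal sketch of the general technique (not the authors' implementation), each candidate receives a score of 1 / (k + rank) from every ranked list it appears in, and the summed scores determine the fused order. The function name, the example identifiers, and the constant k = 60 (a commonly used default) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list (illustrative sketch).

    rankings: list of lists, each ordered from best to worst.
    k: smoothing constant; 60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by id for determinism.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical outputs of a sentence-level path and a relation-level path.
sentence_path = ["ex3", "ex1", "ex7"]
relation_path = ["ex1", "ex9", "ex3"]
fused = reciprocal_rank_fusion([sentence_path, relation_path])
print(fused)
```

Items that rank well in both lists rise to the top, which is why the method suits combining a global (sentence-level) and a local (relation-level) view without calibrating their raw similarity scores against each other.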
81 pages, 4442 KB  
Systematic Review
From Illusion to Insight: A Taxonomic Survey of Hallucination Mitigation Techniques in LLMs
by Ioannis Kazlaris, Efstathios Antoniou, Konstantinos Diamantaras and Charalampos Bratsas
AI 2025, 6(10), 260; https://doi.org/10.3390/ai6100260 - 3 Oct 2025
Abstract
Large Language Models (LLMs) exhibit remarkable generative capabilities but remain vulnerable to hallucinations—outputs that are fluent yet inaccurate, ungrounded, or inconsistent with source material. To address the lack of methodologically grounded surveys, this paper introduces a novel method-oriented taxonomy of hallucination mitigation strategies in text-based LLMs. The taxonomy organizes over 300 studies into six principled categories: Training and Learning Approaches, Architectural Modifications, Input/Prompt Optimization, Post-Generation Quality Control, Interpretability and Diagnostic Methods, and Agent-Based Orchestration. Beyond mapping the field, we identify persistent challenges such as the absence of standardized evaluation benchmarks, attribution difficulties in multi-method systems, and the fragility of retrieval-based methods when sources are noisy or outdated. We also highlight emerging directions, including knowledge-grounded fine-tuning and hybrid retrieval–generation pipelines integrated with self-reflective reasoning agents. This taxonomy provides a methodological framework for advancing reliable, context-sensitive LLM deployment in high-stakes domains such as healthcare, law, and defense. Full article
(This article belongs to the Section AI Systems: Theory and Applications)
23 pages, 5798 KB  
Article
Application of Generative AI in Financial Risk Prediction: Enhancing Model Accuracy and Interpretability
by Kai-Chao Yao, Hsiu-Chu Hung, Ching-Hsin Wang, Wei-Lun Huang, Hui-Ting Liang, Tzu-Hsin Chu, Bo-Siang Chen and Wei-Sho Ho
Information 2025, 16(10), 857; https://doi.org/10.3390/info16100857 - 3 Oct 2025
Abstract
This study explores the application of generative artificial intelligence (AI) in financial risk forecasting, aiming to assess its potential in enhancing both the accuracy and interpretability of predictive models. Traditional methods often struggle with the complexity and nonlinearity of financial data, whereas generative AI—such as large language models and generative adversarial networks (GANs)—offers novel solutions to these challenges. The study begins with a comprehensive review of current research on generative AI in financial risk prediction, with a focus on its roles in data augmentation and feature extraction. It then investigates techniques such as Generative Adversarial Explanation (GAX) to evaluate their effectiveness in improving model interpretability. Case studies demonstrate the practical value of generative AI in real-world financial forecasting and quantify its contribution to predictive accuracy. Furthermore, the study identifies key challenges—including data quality, model training costs, and regulatory compliance—and proposes corresponding mitigation strategies. The findings suggest that generative AI can significantly improve the accuracy and interpretability of financial risk models, though its adoption must be carefully managed to address associated risks. This study offers insights and guidance for future research in applying generative AI to financial risk forecasting. Full article
(This article belongs to the Special Issue Modeling in the Era of Generative AI)
22 pages, 2445 KB  
Article
The Construction of a Design Method Knowledge Graph Driven by Multi-Source Heterogeneous Data
by Jixing Shi, Kaiyi Wang, Zhongqing Wang, Zhonghang Bai and Fei Hu
Appl. Sci. 2025, 15(19), 10702; https://doi.org/10.3390/app151910702 - 3 Oct 2025
Abstract
To address the fragmentation and weak correlation of knowledge in the design method domain, this paper proposes a framework for constructing a knowledge graph driven by multi-source heterogeneous data. The process involves collecting multi-source heterogeneous data and subsequently utilizing text mining and natural language processing techniques to extract design themes and method elements. A “theme–stage–attribute” three-dimensional mapping model is established to achieve semantic coupling of knowledge. The BERT-BiLSTM-CRF (Bidirectional Encoder Representations from Transformers-Bidirectional Long Short-Term Memory-Conditional Random Field) model is employed for entity recognition and relation extraction, while the Sentence-BERT (Sentence Bidirectional Encoder Representations from Transformers) model is used to perform multi-source knowledge fusion. The Neo4j graph database facilitates knowledge storage, visualization, and querying, forming the basis for developing a prototype of a design method recommendation system. The framework’s effectiveness was validated through experiments on extraction performance and knowledge graph quality. The results demonstrate that the framework achieves an F1 score of 91.2% for knowledge extraction, an 8.44% improvement over the baseline. The resulting graph’s node and relation coverage reached 94.1% and 91.2%, respectively. In complex semantic query tasks, the framework shows a significant advantage over traditional classification systems, achieving a maximum F1 score of 0.97. It can effectively integrate dispersed knowledge in the field of design methods and support method matching throughout the entire design process. This research is of significant value for advancing knowledge management and application in innovative product design. Full article
23 pages, 730 KB  
Article
She Wants Safety, He Wants Speed: A Mixed-Methods Study on Gender Differences in EV Consumer Behavior
by Qi Zhu and Qian Bao
Systems 2025, 13(10), 869; https://doi.org/10.3390/systems13100869 - 3 Oct 2025
Abstract
Against the backdrop of the rapid proliferation of electric vehicles (EVs), gender-oriented behavioral mechanisms remain underexplored, particularly the unique pathways of female users in usage experience, value assessment, and purchase decision-making. This study constructs an integrated framework based on the Stimulus–Organism–Response (SOR) model, leveraging social media big data to analyze in depth how gender differences influence EV users’ purchase intentions. By integrating natural language processing techniques, grounded theory coding, and structural equation modeling (SEM), this study models and analyzes 272,083 pieces of user-generated content (UGC) from Chinese social media platforms, identifying key functional and emotional factors shaping female users’ perceptions and attitudes. The results reveal that esthetic value, safety, and intelligent features more strongly drive emotional responses among female users’ decisions through functional cognition, with gender significantly moderating the pathways from perceived attributes to emotional resonance and cognitive evaluation. This study further confirms the dual mediating roles of functional cognition and emotional experience and identifies a masking (suppression) effect for the ‘intelligent perception’ variable. Methodologically, it develops a novel hybrid paradigm that integrates data-driven semantic mining with psychological behavioral modeling, enhancing the ecological validity of consumer behavior research. Practically, the findings provide empirical support for gender-sensitive EV product design, personalized marketing strategies, and community-based service innovations, while also discussing research limitations and proposing future directions for cross-cultural validation and multimodal analysis. Full article
23 pages, 838 KB  
Article
Applied with Caution: Extreme-Scenario Testing Reveals Significant Risks in Using LLMs for Humanities and Social Sciences Paper Evaluation
by Hua Liu, Ling Dai and Haozhe Jiang
Appl. Sci. 2025, 15(19), 10696; https://doi.org/10.3390/app151910696 - 3 Oct 2025
Abstract
The deployment of large language models (LLMs) in academic paper evaluation is increasingly widespread, yet their trustworthiness remains debated; to expose fundamental flaws often masked under conventional testing, this study employed extreme-scenario testing to systematically probe the lower performance boundaries of LLMs in assessing the scientific validity and logical coherence of papers from the humanities and social sciences (HSS). Through a highly credible quasi-experiment, 40 high-quality Chinese papers from philosophy, sociology, education, and psychology were selected, for which domain experts created versions with implanted “scientific flaws” and “logical flaws”. Three representative LLMs (GPT-4, DeepSeek, and Doubao) were evaluated against a baseline of 24 doctoral candidates, following a protocol progressing from ‘broad’ to ‘targeted’ prompts. Key findings reveal poor evaluation consistency, with significantly low intra-rater and inter-rater reliability for the LLMs, and limited flaw detection capability, as all models failed to distinguish between original and flawed papers under broad prompts, unlike human evaluators; although targeted prompts improved detection, LLM performance remained substantially inferior, particularly in tasks requiring deep empirical insight and logical reasoning. The study proposes that LLMs operate on a fundamentally different “task decomposition-semantic understanding” mechanism, relying on limited text extraction and shallow semantic comparison rather than the human process of “worldscape reconstruction → meaning construction and critique”, resulting in a critical inability to assess argumentative plausibility and logical coherence. 
It concludes that current LLMs possess fundamental limitations in evaluations requiring depth and critical thinking, are not reliable independent evaluators, and that over-trusting them carries substantial risks, necessitating rational human-AI collaborative frameworks, enhanced model adaptation through downstream alignment techniques like prompt engineering and fine-tuning, and improvements in general capabilities such as logical reasoning. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
29 pages, 10807 KB  
Article
From Abstraction to Realization: A Diagrammatic BIM Framework for Conceptual Design in Architectural Education
by Nancy Alassaf
Sustainability 2025, 17(19), 8853; https://doi.org/10.3390/su17198853 - 3 Oct 2025
Abstract
The conceptual design phase in architecture establishes the foundation for subsequent design decisions and influences up to 80% of a building’s lifecycle environmental impact. While Building Information Modeling (BIM) demonstrates transformative potential for sustainable design, its application during conceptual design remains constrained by perceived technical complexity and limited support for abstract thinking. This research examines how BIM tools can facilitate conceptual design through diagrammatic reasoning, thereby bridging technical capabilities with creative exploration. A mixed-methods approach was employed to develop and validate a Diagrammatic BIM (D-BIM) framework. It integrates diagrammatic reasoning, parametric modeling, and performance evaluation within BIM environments. The framework defines three core relationships—dissection, articulation, and actualization—which enable transitions from abstract concepts to detailed architectural forms in Revit’s modeling environments. Using Richard Meier’s architectural language as a structured test case, a 14-week quasi-experimental study with 19 third-year architecture students assessed the framework’s effectiveness through pre- and post-surveys, observations, and artifact analysis. Statistical analysis revealed significant improvements (p < 0.05) with moderate to large effect sizes across all measures, including systematic design thinking, diagram utilization, and academic self-efficacy. Students demonstrated enhanced design iteration, abstraction-to-realization transitions, and performance-informed decision-making through quantitative and qualitative assessments during early design stages. However, the study’s limitations include a small, single-institution sample, the absence of a control group, a focus on a single architectural language, and the exploratory integration of environmental analysis tools. 
Findings indicate that the framework repositions BIM as a cognitive design environment that supports creative ideation while integrating structured design logic and performance analysis. The study advances Education for Sustainable Development (ESD) by embedding critical, systems-based, and problem-solving competencies, demonstrating BIM’s role in sustainability-focused early design. This research provides preliminary evidence that conceptual design and BIM are compatible when supported with diagrammatic reasoning, offering a foundation for integrating competency-based digital pedagogy that bridges creative and technical dimensions of architectural design.
(This article belongs to the Special Issue Advances in Engineering Education and Sustainable Development)

17 pages, 782 KB  
Article
DAPO: Mobility-Aware Joint Optimization of Model Partitioning and Task Offloading for Edge LLM Inference
by Hao Feng, Gan Huang, Nian Zhou, Feng Zhang, Yuming Liu, Xiumin Zhou and Junchen Liu
Electronics 2025, 14(19), 3929; https://doi.org/10.3390/electronics14193929 - 3 Oct 2025
Abstract
Deploying Large Language Models (LLMs) in edge environments faces two major challenges: (i) the conflict between limited device resources and high computational demands, and (ii) the dynamic impact of user mobility on model partitioning and task offloading decisions. To address these challenges, this paper proposes the Dynamic Adaptive Partitioning and Offloading (DAPO) framework, an intelligent solution for multi-user, multi-edge Mobile Edge Intelligence (MEI) systems. DAPO employs a Deep Deterministic Policy Gradient (DDPG) algorithm to jointly optimize the model partition point and the task offloading destination. By mapping continuous policy outputs onto valid discrete actions, DAPO efficiently addresses the high-dimensional hybrid action space and dynamically adapts to user mobility. Through extensive simulations, we demonstrate that DAPO outperforms baseline strategies and mainstream RL methods, achieving up to 27% lower latency and 18% lower energy consumption compared to PPO and A2C, while maintaining fast convergence and scalability in dynamic mobile environments.
(This article belongs to the Special Issue Towards Efficient and Reliable AI at the Edge)
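The abstract above does not say how DAPO maps continuous policy outputs onto discrete actions. As a rough illustration only, the sketch below shows one common way to do it: it assumes a two-dimensional actor output in [-1, 1] (for example, after a tanh layer) and bins each component uniformly. The names `to_bin`, `map_action`, `num_layers`, and `reachable` are hypothetical and do not come from the paper.

```python
from typing import Sequence, Tuple


def to_bin(x: float, n_bins: int) -> int:
    """Map x in [-1, 1] to an integer bin in [0, n_bins - 1]."""
    x = max(-1.0, min(1.0, x))            # clip to the assumed tanh range
    idx = int((x + 1.0) / 2.0 * n_bins)   # scale to [0, n_bins]
    return min(idx, n_bins - 1)           # x == 1.0 falls into the last bin


def map_action(
    raw: Sequence[float], num_layers: int, reachable: Sequence[int]
) -> Tuple[int, int]:
    """Turn a 2-D continuous action into (partition_point, server_id).

    Assumption: partition_point in [0, num_layers] is the number of model
    layers kept on the device; the remaining layers are offloaded.
    server_id is chosen only from `reachable` (servers currently in range
    of the mobile user), so the resulting discrete action is always valid.
    """
    if not reachable:
        raise ValueError("no reachable edge server")
    partition_point = to_bin(raw[0], num_layers + 1)
    server_id = reachable[to_bin(raw[1], len(reachable))]
    return partition_point, server_id


if __name__ == "__main__":
    # A 32-layer model and three servers currently in range (IDs 2, 5, 7).
    print(map_action([0.0, 0.0], num_layers=32, reachable=[2, 5, 7]))
```

Restricting the second component to the reachable set is one simple way to keep actions valid as the user moves; the paper may handle validity differently.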