MDPI - Publisher of Open Access Journals

23 pages, 3781 KiB

Open AccessArticle

Evaluating Urban Visual Attractiveness Perception Using Multimodal Large Language Model and Street View Images

by Qianyu Zhou, Jiaxin Zhang and Zehong Zhu

Buildings 2025, 15(16), 2970; https://doi.org/10.3390/buildings15162970 - 21 Aug 2025

Visual attractiveness perception—an individual’s capacity to recognise and evaluate the visual appeal of urban scene safety—has direct implications for well-being, economic vitality, and social cohesion. However, most empirical studies rely on single-source metrics or algorithm-centric pipelines that under-represent human perception. Addressing this gap, [...] Read more.

Visual attractiveness perception—an individual’s capacity to recognise and evaluate the visual appeal of urban scene safety—has direct implications for well-being, economic vitality, and social cohesion. However, most empirical studies rely on single-source metrics or algorithm-centric pipelines that under-represent human perception. Addressing this gap, we introduce a fully reproducible, multimodal framework that measures and models this domain-specific facet of human intelligence by coupling Generative Pre-trained Transformer 4o (GPT-4o) with 1000 Street View images. The pipeline first elicits pairwise aesthetic judgements from GPT-4o, converts them into a latent attractiveness scale via Thurstone’s law of comparative judgement, and then validates the scale against 1.17 M crowdsourced ratings from MIT’s Place Pulse 2.0 benchmark (Spearman ρ = 0.76, p < 0.001). Compared with a Siamese CNN baseline (ρ = 0.60), GPT-4o yields both higher criterion validity and an 88% reduction in inference time, underscoring its superior capacity to approximate human evaluative reasoning. In this study, we introduce a standardised and reproducible streetscape evaluation pipeline using GPT-4o. We then combine the resulting attractiveness scores with network-based accessibility modelling to generate a “aesthetic–accessibility map” of urban central districts in Chongqing, China. Cluster analysis reveals four statistically distinct street types—Iconic Core, Liveable Rings, Transit-Rich but Bland, and Peripheral Low-Appeal—providing actionable insights for landscape design, urban governance, and tourism planning. Full article

(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)

► Show Figures

Figure 1

23 pages, 10088 KiB

Open AccessArticle

Development of an Interactive Digital Human with Context-Sensitive Facial Expressions

by Fan Yang, Lei Fang, Rui Suo, Jing Zhang and Mincheol Whang

Sensors 2025, 25(16), 5117; https://doi.org/10.3390/s25165117 - 18 Aug 2025

Viewed by 277

Abstract

With the increasing complexity of human–computer interaction scenarios, conventional digital human facial expression systems show notable limitations in handling multi-emotion co-occurrence, dynamic expression, and semantic responsiveness. This paper proposes a digital human system framework that integrates multimodal emotion recognition and compound facial expression [...] Read more.

With the increasing complexity of human–computer interaction scenarios, conventional digital human facial expression systems show notable limitations in handling multi-emotion co-occurrence, dynamic expression, and semantic responsiveness. This paper proposes a digital human system framework that integrates multimodal emotion recognition and compound facial expression generation. The system establishes a complete pipeline for real-time interaction and compound emotional expression, following a sequence of “speech semantic parsing—multimodal emotion recognition—Action Unit (AU)-level 3D facial expression control.” First, a ResNet18-based model is employed for robust emotion classification using the AffectNet dataset. Then, an AU motion curve driving module is constructed on the Unreal Engine platform, where dynamic synthesis of basic emotions is achieved via a state-machine mechanism. Finally, Generative Pre-trained Transformer (GPT) is utilized for semantic analysis, generating structured emotional weight vectors that are mapped to the AU layer to enable language-driven facial responses. Experimental results demonstrate that the proposed system significantly improves facial animation quality, with naturalness increasing from 3.54 to 3.94 and semantic congruence from 3.44 to 3.80. These results validate the system’s capability to generate realistic and emotionally coherent expressions in real time. This research provides a complete technical framework and practical foundation for high-fidelity digital humans with affective interaction capabilities. Full article

(This article belongs to the Special Issue Emotion Recognition Based on Sensors (3rd Edition))

► Show Figures

Figure 1

14 pages, 4593 KiB

Open AccessArticle

Fine-Tuned Large Language Models for High-Accuracy Prediction of Band Gap and Stability in Transition Metal Sulfides

by Zimo Zhao, Lin Hu and Honghui Wang

Materials 2025, 18(16), 3793; https://doi.org/10.3390/ma18163793 - 13 Aug 2025

Viewed by 377

Abstract

This study presents a fine-tuned Large Language Model approach for predicting band gap and stability of transition metal sulfides. Our method processes textual descriptions of crystal structures directly, eliminating the need for complex feature engineering required by traditional ML and GNN approaches. Using [...] Read more.

This study presents a fine-tuned Large Language Model approach for predicting band gap and stability of transition metal sulfides. Our method processes textual descriptions of crystal structures directly, eliminating the need for complex feature engineering required by traditional ML and GNN approaches. Using a strategically selected dataset of 554 compounds from the Materials Project database, we fine-tuned GPT-3.5-turbo through nine consecutive iterations. Performance metrics improved significantly, with band gap prediction R² values increasing from 0.7564 to 0.9989, while stability classification achieved F1 > 0.7751. The fine-tuned model demonstrated superior generalization ability compared to both GPT-3.5 and GPT-4.0 models, maintaining high accuracy across diverse material structures. This approach is particularly valuable for new material systems with limited experimental data, as it can extract meaningful features directly from text descriptions and transfer knowledge from pre-training to domain-specific tasks without relying on extensive numerical datasets. Full article

► Show Figures

Graphical abstract

24 pages, 2572 KiB

Open AccessArticle

DIALOGUE: A Generative AI-Based Pre–Post Simulation Study to Enhance Diagnostic Communication in Medical Students Through Virtual Type 2 Diabetes Scenarios

by Ricardo Xopan Suárez-García, Quetzal Chavez-Castañeda, Rodrigo Orrico-Pérez, Sebastián Valencia-Marin, Ari Evelyn Castañeda-Ramírez, Efrén Quiñones-Lara, Claudio Adrián Ramos-Cortés, Areli Marlene Gaytán-Gómez, Jonathan Cortés-Rodríguez, Jazel Jarquín-Ramírez, Nallely Guadalupe Aguilar-Marchand, Graciela Valdés-Hernández, Tomás Eduardo Campos-Martínez, Alonso Vilches-Flores, Sonia Leon-Cabrera, Adolfo René Méndez-Cruz, Brenda Ofelia Jay-Jímenez and Héctor Iván Saldívar-Cerón

Eur. J. Investig. Health Psychol. Educ. 2025, 15(8), 152; https://doi.org/10.3390/ejihpe15080152 - 7 Aug 2025

Viewed by 1089

Abstract

DIALOGUE (DIagnostic AI Learning through Objective Guided User Experience) is a generative artificial intelligence (GenAI)-based training program designed to enhance diagnostic communication skills in medical students. In this single-arm pre–post study, we evaluated whether DIALOGUE could improve students’ ability to disclose a type [...] Read more.

DIALOGUE (DIagnostic AI Learning through Objective Guided User Experience) is a generative artificial intelligence (GenAI)-based training program designed to enhance diagnostic communication skills in medical students. In this single-arm pre–post study, we evaluated whether DIALOGUE could improve students’ ability to disclose a type 2 diabetes mellitus (T2DM) diagnosis with clarity, structure, and empathy. Thirty clinical-phase students completed two pre-test virtual encounters with an AI-simulated patient (ChatGPT, GPT-4o), scored by blinded raters using an eight-domain rubric. Participants then engaged in ten asynchronous GenAI scenarios with automated natural-language feedback. Seven days later, they completed two post-test consultations with human standardized patients, again evaluated with the same rubric. Mean total performance increased by 36.7 points (95% CI: 31.4–42.1; p < 0.001), and the proportion of high-performing students rose from 0% to 70%. Gains were significant across all domains, most notably in opening the encounter, closure, and diabetes specific explanation. Multiple regression showed that lower baseline empathy (β = −0.41, p = 0.005) and higher digital self-efficacy (β = 0.35, p = 0.016) independently predicted greater improvement; gender had only a marginal effect. Cluster analysis revealed three learner profiles, with the highest-gain group characterized by low empathy and high digital self-efficacy. Inter-rater reliability was excellent (ICC ≈ 0.90). These findings provide empirical evidence that GenAI-mediated training can meaningfully enhance diagnostic communication and may serve as a scalable, individualized adjunct to conventional medical education. Full article

(This article belongs to the Special Issue Application of Artificial Intelligence in Heath, Psychology and Education)

► Show Figures

Graphical abstract

10 pages, 616 KiB

Open AccessCommunication

Brief Prompt-Engineering Clinic Substantially Improves AI Literacy and Reduces Technology Anxiety in First-Year Teacher-Education Students: A Pre–Post Pilot Study

by Roberto Carlos Davila-Moran, Juan Manuel Sanchez Soto, Henri Emmanuel Lopez Gomez, Manuel Silva Infantes, Andres Arias Lizares, Lupe Marilu Huanca Rojas and Simon Jose Cama Flores

Educ. Sci. 2025, 15(8), 1010; https://doi.org/10.3390/educsci15081010 - 6 Aug 2025

Viewed by 635

Abstract

Generative AI tools such as ChatGPT are reshaping educational practice, yet first-year teacher-education students often lack the prompt-engineering skills and confidence required to use them responsibly. This pilot study examined whether a concise three-session clinic on prompt engineering could simultaneously boost AI literacy [...] Read more.

Generative AI tools such as ChatGPT are reshaping educational practice, yet first-year teacher-education students often lack the prompt-engineering skills and confidence required to use them responsibly. This pilot study examined whether a concise three-session clinic on prompt engineering could simultaneously boost AI literacy and reduce technology anxiety in prospective teachers. Forty-five freshmen in a Peruvian teacher-education program completed validated Spanish versions of a 12-item AI-literacy scale and a 12-item technology-anxiety scale one week before and after the intervention; normality-checked pre–post differences were analysed with paired-samples t-tests, Cohen’s d, and Pearson correlations. AI literacy rose by 0.70 ± 0.46 points (t (44) = −6.10, p < 0.001, d = 0.91), while technology anxiety fell by 0.58 ± 0.52 points (t (44) = −3.82, p = 0.001, d = 0.56); individual gains were inversely correlated (r = −0.46, p = 0.002). These findings suggest that integrating micro-level prompt-engineering clinics in the first semester can help future teachers engage critically and comfortably with generative AI and guide curriculum designers in updating teacher-training programs. Full article

(This article belongs to the Special Issue ChatGPT as Educative and Pedagogical Tool: Perspectives and Prospects)

► Show Figures

Figure 1

25 pages, 1751 KiB

Open AccessReview

Large Language Models for Adverse Drug Events: A Clinical Perspective

by Md Muntasir Zitu, Dwight Owen, Ashish Manne, Ping Wei and Lang Li

J. Clin. Med. 2025, 14(15), 5490; https://doi.org/10.3390/jcm14155490 - 4 Aug 2025

Viewed by 744

Abstract

Adverse drug events (ADEs) significantly impact patient safety and health outcomes. Manual ADE detection from clinical narratives is time-consuming, labor-intensive, and costly. Recent advancements in large language models (LLMs), including transformer-based architectures such as Bidirectional Encoder Representations from Transformers (BERT) and Generative Pretrained [...] Read more.

Adverse drug events (ADEs) significantly impact patient safety and health outcomes. Manual ADE detection from clinical narratives is time-consuming, labor-intensive, and costly. Recent advancements in large language models (LLMs), including transformer-based architectures such as Bidirectional Encoder Representations from Transformers (BERT) and Generative Pretrained Transformer (GPT) series, offer promising methods for automating ADE extraction from clinical data. These models have been applied to various aspects of pharmacovigilance and clinical decision support, demonstrating potential in extracting ADE-related information from real-world clinical data. Additionally, chatbot-assisted systems have been explored as tools in clinical management, aiding in medication adherence, patient engagement, and symptom monitoring. This narrative review synthesizes the current state of LLMs in ADE detection from a clinical perspective, organizing studies into categories such as human-facing decision support tools, immune-related ADE detection, cancer-related and non-cancer-related ADE surveillance, and personalized decision support systems. In total, 39 articles were included in this review. Across domains, LLM-driven methods have demonstrated promising performances, often outperforming traditional approaches. However, critical limitations persist, such as domain-specific variability in model performance, interpretability challenges, data quality and privacy concerns, and infrastructure requirements. By addressing these challenges, LLM-based ADE detection could enhance pharmacovigilance practices, improve patient safety outcomes, and optimize clinical workflows. Full article

(This article belongs to the Section Pharmacology)

► Show Figures

Figure 1

10 pages, 426 KiB

Open AccessProceeding Paper

Guiding or Misleading: Challenges of Artificial Intelligence-Generated Content in Heuristic Teaching: ChatGPT

by Ping-Kuo A. Chen

Eng. Proc. 2025, 103(1), 1; https://doi.org/10.3390/engproc2025103001 - 4 Aug 2025

Viewed by 326

Abstract

Artificial intelligence (AI)-generated content (AIGC) is an innovative technology that utilizes machine learning, AI models, reward modeling, and natural language processing (NLP) to create diverse digital content such as videos, images, and text. It has the potential to support various human activities with [...] Read more.

Artificial intelligence (AI)-generated content (AIGC) is an innovative technology that utilizes machine learning, AI models, reward modeling, and natural language processing (NLP) to create diverse digital content such as videos, images, and text. It has the potential to support various human activities with significant implications in teaching and learning, facilitating heuristic teaching for educators. By using AIGC, teachers can create extensive knowledge content and effectively design instructional strategies to guide students, aligning with heuristic teaching. However, incorporating AIGC into heuristic teaching has controversies and concerns, which potentially mislead outcomes. Nevertheless, leveraging AIGC greatly benefits teachers in enhancing heuristic teaching. When integrating AIGC to support heuristic teaching, challenges and risks must be acknowledged and addressed. These challenges include the need for users to possess sufficient knowledge reserves to identify incorrect information and content generated by AIGC, the importance of avoiding excessive reliance on AIGC, ensuring users maintain control over their actions rather than being driven by AIGC, and the necessity of scrutinizing and verifying the accuracy of information and knowledge generated by AIGC to preserve its effectiveness. Full article

► Show Figures

Figure 1

8 pages, 192 KiB

Open AccessBrief Report

Accuracy and Safety of ChatGPT-3.5 in Assessing Over-the-Counter Medication Use During Pregnancy: A Descriptive Comparative Study

by Bernadette Cornelison, David R. Axon, Bryan Abbott, Carter Bishop, Cindy Jebara, Anjali Kumar and Kristen A. Root

Pharmacy 2025, 13(4), 104; https://doi.org/10.3390/pharmacy13040104 - 30 Jul 2025

Viewed by 869

Abstract

As artificial intelligence (AI) becomes increasingly utilized to perform tasks requiring human intelligence, patients who are pregnant may turn to AI for advice on over-the-counter (OTC) medications. However, medications used in pregnancy may pose profound safety concerns limited by data availability. This study [...] Read more.

As artificial intelligence (AI) becomes increasingly utilized to perform tasks requiring human intelligence, patients who are pregnant may turn to AI for advice on over-the-counter (OTC) medications. However, medications used in pregnancy may pose profound safety concerns limited by data availability. This study focuses on a chatbot’s ability to accurately provide information regarding OTC medications as it relates to patients that are pregnant. A prospective, descriptive design was used to compare the responses generated by the Chat Generative Pre-Trained Transformer 3.5 (ChatGPT-3.5) to the information provided by UpToDate^®. Eighty-seven of the top pharmacist-recommended OTC drugs in the United States (U.S.) as identified by Pharmacy Times were assessed for safe use in pregnancy using ChatGPT-3.5. A piloted, standard prompt was input into ChatGPT-3.5, and the responses were recorded. Two groups independently rated the responses compared to UpToDate on their correctness, completeness, and safety using a 5-point Likert scale. After independent evaluations, the groups discussed the findings to reach a consensus, with a third independent investigator giving final ratings. For correctness, the median score was 5 (interquartile range [IQR]: 5–5). For completeness, the median score was 4 (IQR: 4–5). For safety, the median score was 5 (IQR: 5–5). Despite high overall scores, the safety errors in 9% of the evaluations (n = 8), including omissions that pose a risk of serious complications, currently renders the chatbot an unsafe standalone resource for this purpose. Full article

(This article belongs to the Special Issue AI Use in Pharmacy and Pharmacy Education)

17 pages, 609 KiB

Open AccessArticle

GPT-Based Text-to-SQL for Spatial Databases

by Hui Wang, Li Guo, Yubin Liang, Le Liu and Jiajin Huang

ISPRS Int. J. Geo-Inf. 2025, 14(8), 288; https://doi.org/10.3390/ijgi14080288 - 24 Jul 2025

Viewed by 494

Abstract

Text-to-SQL for spatial databases enables the translation of natural language questions into corresponding SQL queries, allowing non-experts to easily access spatial data, which has gained increasing attention from researchers. Previous research has primarily focused on rule-based methods. However, these methods have limitations when [...] Read more.

Text-to-SQL for spatial databases enables the translation of natural language questions into corresponding SQL queries, allowing non-experts to easily access spatial data, which has gained increasing attention from researchers. Previous research has primarily focused on rule-based methods. However, these methods have limitations when dealing with complicated or unknown natural language questions. While advanced machine learning models can be trained, they typically require large labeled training datasets, which are severely lacking for spatial databases. Recently, Generative Pre-Trained Transformer (GPT) models have emerged as a promising paradigm for Text-to-SQL tasks in relational databases, driven by carefully designed prompts. In response to the severe lack of datasets for spatial databases, we have created a publicly available dataset that supports both English and Chinese. Furthermore, we propose a GPT-based method to construct prompts for spatial databases, which incorporates geographic and spatial database knowledge into the prompts and requires only a small number of training samples, such as 1, 3, or 5 examples. Extensive experiments demonstrate that incorporating geographic and spatial database knowledge into prompts improves the accuracy of Text-to-SQL tasks for spatial databases. Our proposed method can help non-experts access spatial databases more easily and conveniently. Full article

► Show Figures

Figure 1

19 pages, 460 KiB

Open AccessArticle

Refining Text2Cypher on Small Language Model with Reinforcement Learning Leveraging Semantic Information

by Quoc-Bao-Huy Tran, Aagha Abdul Waheed, Syed Mudasir and Sun-Tae Chung

Appl. Sci. 2025, 15(15), 8206; https://doi.org/10.3390/app15158206 - 23 Jul 2025

Viewed by 380

Abstract

Text2Cypher is a text-to-text task that converts natural language questions into Cypher queries. Recent research by Neo4j on Text2Cypher demonstrates that fine-tuning a baseline language model (a pretrained and instruction-tuned generative model) using a comprehensive Text2Cypher dataset can effectively enhance query generation performance. [...] Read more.

Text2Cypher is a text-to-text task that converts natural language questions into Cypher queries. Recent research by Neo4j on Text2Cypher demonstrates that fine-tuning a baseline language model (a pretrained and instruction-tuned generative model) using a comprehensive Text2Cypher dataset can effectively enhance query generation performance. However, the improvement is still insufficient for effectively learning the syntax and semantics of complex natural texts, particularly when applied to unseen Cypher schema structures across diverse domains during training. To address this challenge, we propose a novel refinement training method based on baseline language models, employing reinforcement learning with Group Relative Policy Optimization (GRPO). This method leverages extracted semantic information, such as key-value properties and triple relationships from input texts during the training process. Experimental results of the proposed refinement training method applied to a small-scale baseline language model (SLM) like Qwen2.5-3B-Instruct demonstrate that it achieves competitive execution accuracy scores on unseen schemas across various domains. Furthermore, the proposed method significantly outperforms most baseline LMs with larger parameter sizes in terms of Google-BLEU and execution accuracy scores over Neo4j’s comprehensive Text2Cypher dataset, with the exception of colossal LLMs such as GPT4o, GPT4o-mini, and Gemini. Full article

► Show Figures

Figure 1

22 pages, 1805 KiB

Open AccessArticle

A Hybrid Semantic and Multi-Attention Mechanism Approach for Detecting Vulnerabilities in Smart Contract Code

by Zhenxiang He, Yanling Liu and Xiaohui Sun

Symmetry 2025, 17(7), 1161; https://doi.org/10.3390/sym17071161 - 21 Jul 2025

Viewed by 521

Abstract

Driven by blockchain technology, numerous industries are increasingly adopting smart contracts to enhance efficiency, reduce costs, and improve transparency. As a result, ensuring the security of smart contracts has become critical. Traditional detection methods often suffer from low efficiency, are prone to missing [...] Read more.

Driven by blockchain technology, numerous industries are increasingly adopting smart contracts to enhance efficiency, reduce costs, and improve transparency. As a result, ensuring the security of smart contracts has become critical. Traditional detection methods often suffer from low efficiency, are prone to missing complex vulnerabilities, and have limited accuracy. Although deep learning approaches address some of these challenges, issues with both accuracy and efficiency remain in current solutions. To overcome these limitations, this paper proposes a symmetry-inspired solution that harmonizes bidirectional and generative semantic patterns. First, we generate distinct feature extraction segments for different vulnerabilities. We then use the Bidirectional Encoder Representations from Transformers (BERT) module to extract original semantic features from these segments and the Generative Pre-trained Transformer (GPT) module to extract generative semantic features. Finally, the two sets of semantic features are fused using a multi-attention mechanism and input into a classifier for result prediction. Our method was tested on three datasets, achieving F1 scores of 93.33%, 93.65%, and 92.31%, respectively. The results demonstrate that our approach outperforms most existing methods in smart contract detection. Full article

(This article belongs to the Section Computer)

► Show Figures

Figure 1

26 pages, 15354 KiB

Open AccessArticle

Transforming Physics Teacher Training Through ChatGPT: A Study on Usability and Impact

by Marcos Guerrero-Zambrano, Leonor Sanchez-Alvarado, Bryan Valarezo-Chamba and Erick Lamilla-Rubio

Educ. Sci. 2025, 15(7), 887; https://doi.org/10.3390/educsci15070887 - 11 Jul 2025

Viewed by 878

Abstract

Teacher training in Physics often faces challenges related to engaging students and conveying abstract concepts effectively. Generative AI tools, such as ChatGPT, present transformative opportunities for designing innovative and tailored educational activities. This study investigates the impact of ChatGPT on pre-service Physics teacher [...] Read more.

Teacher training in Physics often faces challenges related to engaging students and conveying abstract concepts effectively. Generative AI tools, such as ChatGPT, present transformative opportunities for designing innovative and tailored educational activities. This study investigates the impact of ChatGPT on pre-service Physics teacher training, focusing on its usability, effectiveness, and influence on participant satisfaction. Utilizing a quantitative research approach, two Likert-scale surveys were administered to 24 prospective Physics teachers in Ecuador, both before and after an intervention workshop. The workshop introduced participants to ChatGPT’s features and its applications in designing playful, Physics-focused learning activities. Results indicated a significant increase in familiarity with AI tools, enhanced activity design quality, and high satisfaction rates. Notably,

79 %

of participants highlighted ChatGPT’s utility in adapting activities to diverse learning levels, and

83 %

acknowledged its efficiency in reducing preparation time. These findings underscore ChatGPT’s potential to revolutionize Physics education by facilitating the creation of personalized and engaging learning resources. Future research should explore larger sample sizes and longitudinal impacts to fully realize the implications of AI-driven tools in educational contexts. Full article

(This article belongs to the Topic Artificial Intelligence in Early Childhood Education)

► Show Figures

Figure 1

22 pages, 989 KiB

Open AccessArticle

A Second-Classroom Personalized Learning Path Recommendation System Based on Large Language Model Technology

by Qiankun Yang and Changyong Liang

Appl. Sci. 2025, 15(14), 7655; https://doi.org/10.3390/app15147655 - 8 Jul 2025

Viewed by 757

Abstract

To address the limitations of existing learning path recommendation methods—such as poor adaptability, weak personalization, and difficulties in processing long sequences of student behavior and interest data—this paper proposes a personalized learning path recommendation system for the second classroom based on large language [...] Read more.

To address the limitations of existing learning path recommendation methods—such as poor adaptability, weak personalization, and difficulties in processing long sequences of student behavior and interest data—this paper proposes a personalized learning path recommendation system for the second classroom based on large language model (LLM) technology, with a focus on integrating the pre-trained model GPT-4. The goal is to improve recommendation accuracy and personalization by leveraging GPT-4’s strong long-sequence modeling capability. The system fuses students’ multimodal data (e.g., physiological signals, facial expressions, activity levels, and emotional states), extracts deep features using GPT-4, and generates tailored learning paths based on individual feature vectors. It also incorporates incremental learning and self-attention mechanisms to enable real-time feedback and dynamic adjustments. A generative adversarial network (GAN) is introduced to enhance diversity and innovation in recommendations. The experimental results show that the system achieves a personalized recommendation accuracy of over 92%, with coverage and recall rates exceeding 91% and 93%, respectively. Feedback adjustment time remains within 1.5 s, outperforming mainstream models. This study provides a novel and effective technical framework for personalized learning in the second classroom, promoting both efficient resource utilization and student development. Full article

(This article belongs to the Special Issue Advanced Models and Algorithms for Recommender Systems)

► Show Figures

Figure 1

20 pages, 1535 KiB

Open AccessArticle

Multi-Agentic LLMs for Personalizing STEM Texts

by Michael Vaccaro, Mikayla Friday and Arash Zaghi

Appl. Sci. 2025, 15(13), 7579; https://doi.org/10.3390/app15137579 - 6 Jul 2025

Viewed by 727

Abstract

Multi-agent large language models promise flexible, modular architectures for delivering personalized educational content. Drawing on a pilot randomized controlled trial with middle school students (n = 23), we introduce a two-agent GPT-4 framework in which a Profiler agent infers learner-specific preferences and [...] Read more.

Multi-agent large language models promise flexible, modular architectures for delivering personalized educational content. Drawing on a pilot randomized controlled trial with middle school students (n = 23), we introduce a two-agent GPT-4 framework in which a Profiler agent infers learner-specific preferences and a Rewrite agent dynamically adapts science passages via an explicit message-passing protocol. We implement structured system and user prompts as inter-agent communication schemas to enable real-time content adaptation. The results of an ordinal logistic regression analysis hinted that students may be more likely to prefer texts aligned with their profile, demonstrating the feasibility of multi-agent system-driven personalization and highlighting the need for additional work to build upon this pilot study. Beyond empirical validation, we present a modular multi-agent architecture detailing agent roles, communication interfaces, and scalability considerations. We discuss design best practices, ethical safeguards, and pathways for extending this framework to collaborative agent networks—such as feedback-analysis agents—in K-12 settings. These results advance both our theoretical and applied understanding of multi-agent LLM systems for personalized learning. Full article

(This article belongs to the Special Issue Future Horizons in Multi-Agent Systems: Pioneering Trends and Breakthrough Innovations)

► Show Figures

Figure 1

13 pages, 972 KiB

Open AccessArticle

Assessing ChatGPT-v4 for Guideline-Concordant Inflammatory Bowel Disease: Accuracy, Completeness, and Temporal Drift

by Oguz Ozturk, Mucahit Ergul, Yavuz Cagir, Ali Atay, Kadir Can Acun, Orhan Coskun, Ilyas Tenlik, Muhammed Bahaddin Durak and Ilhami Yuksel

J. Clin. Med. 2025, 14(13), 4599; https://doi.org/10.3390/jcm14134599 - 29 Jun 2025

Viewed by 672

Abstract

Background/Objectives: Chat Generative Pretrained Transformer (ChatGPT) is a useful resource for individuals working in the healthcare field. This paper will include descriptions of several ways in which ChatGPT-4 can achieve greater accuracy in its diagnosis and treatment plans for ulcerative colitis (UC) and [...] Read more.

Background/Objectives: Chat Generative Pretrained Transformer (ChatGPT) is a useful resource for individuals working in the healthcare field. This paper will include descriptions of several ways in which ChatGPT-4 can achieve greater accuracy in its diagnosis and treatment plans for ulcerative colitis (UC) and Crohn’s disease (CD) by following the guidelines set out by the European Crohn’s and Colitis Organization (ECCO). Methods: The survey, which comprised 102 questions, was developed to assess the precision and consistency of respondents’ responses regarding the UC and CD. The questionnaire incorporated true/false and multiple-choice questions, with the objective of simulating real-life scenarios and adhering to the ECCO guidelines. We employed Likert scales to assess the responses. The inquiries were put to ChatGPT-4 on the initial day, the 15th day, and the 180th day. Results: The 51 true or false items demonstrated stability over a six-month period, with an initial accuracy of 92.8% at baseline, 92.8% on the 15th day, and peaked to 98.0% on the 180th day. This finding suggests a negligible effect size. The accuracy of the multiple-choice questions was initially 90.2% on Day 1, reached its highest point at 92.2% on Day 15, and then decreased to 84.3% on Day 180. However, the reliability of the data was found to be suboptimal, and the impact was deemed negligible. A modest, transient increase in performance was observed at 15 days, which subsequently diminished by 180 days, resulting in negligible effect sizes. Conclusions: ChatGPT-4 demonstrates potential as a clinical decision support system for UC and CD, but its assessment is marked by temporal variability and the inconsistent execution of various tasks. Essential initiatives that should be carried out before involving artificial intelligence (AI) technology in IBD trials are routine revalidation, multi-rater comparisons, prompt standardization, and the cultivation of a comprehensive understanding of the model’s limitations. Full article

(This article belongs to the Section Gastroenterology & Hepatopancreatobiliary Medicine)

► Show Figures

Figure 1

Search Results (181)

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Saved Queries

Search Filter Reset All

Years

Feature Papers

Subjects

Journals

Article Types

Countries / Regions

Search Results (181)

Further Information

Guidelines

MDPI Initiatives

Follow MDPI