Search Results (528)

Search Parameters:
Keywords = TURBO

16 pages, 3476 KB  
Article
ROboMC: A Portable Multimodal System for eHealth Training and Scalable AI-Assisted Education
by Marius Cioca and Adriana-Lavinia Cioca
Inventions 2025, 10(6), 103; https://doi.org/10.3390/inventions10060103 - 11 Nov 2025
Abstract
AI-based educational chatbots can expand access to learning, but many remain limited to text-only interfaces and fixed infrastructures, while purely generative responses raise concerns about reliability and consistency. In this context, we present ROboMC, a portable multimodal system that combines a validated knowledge base with generative responses (OpenAI) and supports both text and voice interaction, ensuring reliability and flexibility in diverse educational scenarios. The system, developed in Django, integrates two response pipelines: local search using normalized keywords and fuzzy matching in the LocalQuestion database, and fallback to the generative model GPT-3.5-Turbo (OpenAI, San Francisco, CA, USA) with a prompt adapted exclusively for Romanian and an explicit disclaimer. All interactions are logged in AutomaticQuestion for later analysis, supported by a semantic encoder (SentenceTransformer paraphrase-multilingual-MiniLM-L12-v2, Hugging Face Inc., New York, NY, USA) that makes the search tolerant to variations in phrasing. Voice output is managed through gTTS (Google LLC, Mountain View, CA, USA) with integrated audio playback, while portability is achieved through deployment on a Raspberry Pi 4B (Raspberry Pi Foundation, Cambridge, UK) with microphone, speaker, and battery power. Voice input is enabled through a cloud-based speech-to-text component, the Google Web Speech API (Google LLC; language = “ro-RO”), accessed via the Python SpeechRecognition library (Anthony Zhang, open-source project), allowing users to interact by speaking. Preliminary tests showed average latencies of 120–180 ms for validated responses on a laptop and 250–350 ms on the Raspberry Pi, and 2.5–3.5 s on a laptop versus 4–6 s on the Raspberry Pi for generative responses, timings considered acceptable for real educational scenarios.
A small-scale usability study (N ≈ 35) indicated good acceptability (SUS ~80/100), with participants valuing the balance between validated and generative responses, the voice integration, and the hardware portability. Although system validation was carried out in the eHealth context, its architecture allows extension to any educational field: depending on the content introduced into the validated database, ROboMC can be adapted to medicine, engineering, social sciences, or other disciplines, relying on ChatGPT only when no clear match is found in the local base, making it a scalable and interdisciplinary solution. Full article
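The local-first pipeline with a generative fallback described in the ROboMC abstract can be sketched in a few lines. This is a minimal illustration under assumed names (`LOCAL_QA`, `normalize`, and `answer` are invented here), not the authors' Django code:

```python
# Minimal sketch of a local-first QA pipeline with a generative fallback.
# LOCAL_QA stands in for the validated LocalQuestion database; in the real
# system the fallback would call GPT-3.5-Turbo with a Romanian prompt.
import difflib

LOCAL_QA = {
    "what is ehealth": "eHealth is the use of digital technologies in healthcare.",
    "what is telemedicine": "Telemedicine is remote clinical care via telecommunications.",
}

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace for keyword comparison."""
    return " ".join(text.lower().split())

def answer(question: str, cutoff: float = 0.6) -> tuple[str, str]:
    """Return (source, answer): fuzzy-match the validated base first, and
    fall back to the generative model only when no match clears the cutoff."""
    key = normalize(question)
    match = difflib.get_close_matches(key, list(LOCAL_QA), n=1, cutoff=cutoff)
    if match:
        return ("local", LOCAL_QA[match[0]])
    return ("generative", "[would call GPT-3.5-Turbo here]")
```

A query such as `answer("What is eHealth?")` resolves locally despite differing punctuation and casing, which is the tolerance to phrasing variation the abstract attributes to its matching layer.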

19 pages, 1763 KB  
Article
Research on the Automatic Generation of Information Requirements for Emergency Response to Unexpected Events
by Yao Li, Chang Guo, Zhenhai Lu, Chao Zhang, Wei Gao, Jiaqi Liu and Jungang Yang
Appl. Sci. 2025, 15(22), 11953; https://doi.org/10.3390/app152211953 - 11 Nov 2025
Abstract
Making scientific and correct decisions is critical when responding to emergency events, and the generation of information requirements is an essential premise for such decisions. Taking earthquakes as a type of unexpected event, this paper constructs a large-language-model-driven system for automatically generating information requirements for earthquake response. This research explores how different departments interact during an earthquake emergency response, how information flows between them, and how the information requirement process operates. The system is designed from three points of view: building a knowledge base, designing and developing prompts, and designing the system architecture. During the experiments, four Large Language Models (LLMs) served as backbone architectures: chatGLM (GLM-4.6), Spark (SparkX1.5), ERNIE Bot (4.5 Turbo), and DeepSeek (V3.2). Following the designed system process, information requirements are generated from real-world cases and then compared with the information requirements gathered by experts. In the comparison, a “keyword weighted matching + text structure feature fusion” method was used to calculate semantic similarity. True positives, false positives, and false negatives are used to quantify differences and to compute precision, recall, and the F1-score. The experimental results show that all four LLMs achieved a precision and recall of over 90% in earthquake information extraction, with their F1-scores all exceeding 85%. This verifies the feasibility of the analytical method adopted in this research. Through comparative analysis, chatGLM exhibited the best performance, with an F1-score of 93.2%.
Finally, Python is used to script the aforementioned processes and produce complete comparison charts for visualization and verification of the test results. In the course of the research, Protégé was also used to build the knowledge requirements ontology, making it easy to display and inspect. This research is particularly useful for emergency management departments, earthquake emergency response teams, and those working on intelligent emergency information systems or on automated information requirement generation using technologies such as LLMs. It provides practical support for optimizing rapid decision-making in earthquake emergency response. Full article
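The precision/recall/F1 comparison step described above reduces to standard formulas over true-positive, false-positive, and false-negative counts; the sketch below uses illustrative counts, not the paper's data:

```python
# Precision, recall, and F1 from TP/FP/FN counts, as used when comparing
# generated information requirements against expert-gathered ones.
def prf1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    precision = tp / (tp + fp)   # fraction of generated items that are correct
    recall = tp / (tp + fn)      # fraction of expert items that were generated
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1
```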

19 pages, 3351 KB  
Article
A Multi-Point Preliminary Design Method for Centrifugal Compressor Stages of Fuel Cell-Based Propulsion Systems
by Alessandro Cappiello, Viviane Ciais and Matteo Pini
Int. J. Turbomach. Propuls. Power 2025, 10(4), 39; https://doi.org/10.3390/ijtpp10040039 - 3 Nov 2025
Viewed by 228
Abstract
The successful implementation of an airborne propulsion system based on hydrogen-powered fuel cell technology highly depends on the development of an efficient, lightweight and compact air supply compressor. Meeting these requirements by designing the compressor using conventional single-point preliminary design methods can be challenging, due to the very wide range of corrected mass flow rate and pressure ratio values that the air supply compressor must be able to accommodate. This article presents a multi-point design methodology for the preliminary design of centrifugal compressors of air supply systems. The method is implemented in an in-house code, called TurboSim, and enables single- and multi-objective constrained optimization of vaneless centrifugal compressors. An automatic design-point selection method is also available. The accuracy of the compressor lumped-parameter model is validated against experimental data obtained on a high-pressure-ratio single-stage vaneless centrifugal compressor from the literature. Subsequently, the design methodology is applied to optimize the compressor of the air supply system of an actual fuel cell powertrain. The results, compared to those obtained with a more conventional single-point design method, show that the multi-point method provides compressor designs that feature superior performance and that better comply with the specified constraints at the target operating points. Full article

17 pages, 402 KB  
Article
Training a Team of Language Models as Options to Build an SQL-Based Memory
by Seokhan Lee and Hanseok Ko
Appl. Sci. 2025, 15(21), 11399; https://doi.org/10.3390/app152111399 - 24 Oct 2025
Viewed by 276
Abstract
Despite the rapid progress in the capabilities of large language models, they still lack a reliable and efficient method of storing and retrieving new information conveyed over the course of their interaction with users upon deployment. In this paper, we use reinforcement learning methods to train a team of smaller language models, which we frame as options, on reward-respecting subtasks, to learn to use SQL commands to store and retrieve relevant information to and from an external SQL database. In particular, we train a storage language model on a subtask for distinguishing between user and assistant in the dialogue history, to learn to store any relevant facts that may be required to answer future user queries. We then train a retrieval language model on a subtask for querying a sufficient number of fields, to learn to retrieve information from the SQL database that could be useful in answering the current user query. We find that training our models on their respective subtasks results in much higher performance than training them to directly optimize the reward signal and that the resulting team of language models is able to achieve performance on memory tasks comparable to existing methods that rely on language models orders of magnitude larger in size. In particular, we were able to achieve a 36% gain in accuracy over a prompt engineering baseline and a 13% gain over a strong baseline that uses the much larger GPT-3.5 Turbo on the MSC-Self-Instruct dataset. Full article
(This article belongs to the Topic Challenges and Solutions in Large Language Models)
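The store/retrieve division of labor described above can be made concrete with plain SQL against SQLite. The table schema and function names below are assumptions for illustration, with the two language models' outputs reduced to hand-written SQL statements:

```python
# Sketch of an SQL-backed external memory: a storage role inserts facts,
# a retrieval role queries them back. Schema and names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memory (speaker TEXT, fact TEXT)")

def store(speaker: str, fact: str) -> None:
    # What the storage LM would emit after reading the dialogue history.
    conn.execute("INSERT INTO memory VALUES (?, ?)", (speaker, fact))

def retrieve(keyword: str) -> list[str]:
    # What the retrieval LM would emit to answer the current user query.
    rows = conn.execute("SELECT fact FROM memory WHERE fact LIKE ?", (f"%{keyword}%",))
    return [r[0] for r in rows]
```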

32 pages, 6188 KB  
Article
Siyasat: AI-Powered AI Governance Tool to Generate and Improve AI Policies According to Saudi AI Ethics Principles
by Dabiah Alboaneen, Shaikha Alhajri, Khloud Alhajri, Muneera Aljalal, Noura Alalyani, Hajer Alsaadan, Zainab Al Thonayan and Raja Alyafer
Computers 2025, 14(11), 452; https://doi.org/10.3390/computers14110452 - 22 Oct 2025
Viewed by 871
Abstract
The rapid development of artificial intelligence (AI) and growing reliance on generative AI (GenAI) tools such as ChatGPT and Bing Chat have raised concerns about risks, including privacy violations, bias, and discrimination. AI governance is viewed as a solution, and in Saudi Arabia, the Saudi Data and Artificial Intelligence Authority (SDAIA) has introduced the AI Ethics Principles. However, many organizations face challenges in aligning their AI policies with these principles. This paper presents Siyasat, an Arabic web-based governance tool designed to generate and enhance AI policies based on SDAIA’s AI Ethics Principles. Powered by GPT-4-turbo and a Retrieval-Augmented Generation (RAG) approach, the tool uses a dataset of ten AI policies and SDAIA’s official ethics document. The results show that Siyasat achieved a BERTScore of 0.890 and Self-BLEU of 0.871 in generating AI policies, while in improving AI policies, it scored 0.870 and 0.980, showing strong consistency and quality. The paper contributes a practical solution to support public, private, and non-profit sectors in complying with Saudi Arabia’s AI Ethics Principles. Full article

23 pages, 572 KB  
Article
Zero-Shot Classification of Illicit Dark Web Content with Commercial LLMs: A Comparative Study on Accuracy, Human Consistency, and Inter-Model Agreement
by Víctor-Pablo Prado-Sánchez, Adrián Domínguez-Díaz, Luis De-Marcos and José-Javier Martínez-Herráiz
Electronics 2025, 14(20), 4101; https://doi.org/10.3390/electronics14204101 - 19 Oct 2025
Viewed by 986
Abstract
This study evaluates the zero-shot classification performance of eight commercial large language models (LLMs), GPT-4o, GPT-4o Mini, GPT-3.5 Turbo, Claude 3.5 Haiku, Gemini 2.0 Flash, DeepSeek Chat, DeepSeek Reasoner, and Grok, using the CoDA dataset (n = 10,000 Dark Web documents). Results show strong macro-F1 scores across models, led by DeepSeek Chat (0.870), Grok (0.868), and Gemini 2.0 Flash (0.861). Alignment with human annotations was high, with Cohen’s Kappa above 0.840 for top models and Krippendorff’s Alpha reaching 0.871. Inter-model consistency was highest between Claude 3.5 Haiku and GPT-4o (κ = 0.911), followed by DeepSeek Chat and Grok (κ = 0.909), and Claude 3.5 Haiku with Gemini 2.0 Flash (κ = 0.907). These findings confirm that state-of-the-art LLMs can reliably classify illicit content under zero-shot conditions, though performance varies by model and category. Full article
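Cohen's Kappa, the agreement statistic quoted above, corrects raw label agreement for chance; a compact version, with invented labels:

```python
# Cohen's Kappa between two annotators (e.g., a model and a human):
# kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
# p_e the agreement expected by chance from each annotator's label rates.
from collections import Counter

def cohens_kappa(a: list[str], b: list[str]) -> float:
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[k] * cb[k] for k in ca) / (n * n)
    return (p_o - p_e) / (1 - p_e)
```

Perfect agreement gives 1.0, while agreement no better than chance gives 0.0, which is why values above 0.84 indicate strong alignment.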

27 pages, 1859 KB  
Article
Strengths and Weaknesses of Artificial Intelligence in Exploring Asbestos History and Regulations Across Countries
by Alessandro Croce, Francesca Ugo, Annalisa Roveta, Carlotta Bertolina, Caterina Rinaudo, Antonio Maconi and Marinella Bertolotti
Geosciences 2025, 15(10), 395; https://doi.org/10.3390/geosciences15100395 - 12 Oct 2025
Viewed by 484
Abstract
Asbestos, consisting of six natural mineral fibrous silicate phases, was widely utilized in industrial development during the 20th century and has left a global legacy of health, environmental, and regulatory challenges. Its remarkable properties (e.g., heat resistance, sound absorption, and tensile strength) made it a useful material in numerous applications. However, scientific research revealed its serious health risks in the early 1900s, with growing evidence during the 1960s, and nowadays its role in the development of different diseases (e.g., respiratory diseases, such as lung cancer, mesothelioma, and asbestosis) is well defined. Mapping this complex history requires integrating heterogeneous and often inconsistent information from nearly 200 countries. In this study, we tested the use of generative artificial intelligence (AI) tools as exploratory and comparative instruments to support the collection of asbestos-related data worldwide. Using Google Gemini (version 2.5 flash) and OpenAI ChatGPT (GPT-4-turbo variant), we gathered historical, medical, and regulatory information and then systematically verified and contextualized it with expert analysis. This dual approach allowed us to assess both the global asbestos situation and the reliability, advantages, and limitations of AI-assisted research. Our results highlight how AI can accelerate data collection and provide useful first drafts while underscoring the necessity of human expertise for validation, interpretation, and critical integration. This study, therefore, contributes a dual perspective: a comprehensive overview of the asbestos legacy across countries and a methodological reflection on the opportunities and pitfalls of employing AI in geoscientific and environmental research. Full article
(This article belongs to the Section Natural Hazards)

12 pages, 1253 KB  
Article
Rapid Nanopore Sequencing of Positive Blood Cultures Using Automated Benzyl-Alcohol Extraction Improves Time-Critical Sepsis Management
by Chi-Sheng Tai, Hsing-Yi Chung, Tai-Han Lin, Chih-Kai Chang, Cherng-Lih Perng, Po-Shiuan Hsieh, Hung-Sheng Shang and Ming-Jr Jian
Antibiotics 2025, 14(10), 1001; https://doi.org/10.3390/antibiotics14101001 - 9 Oct 2025
Viewed by 575
Abstract
Background/Objective: Timely identification of bloodstream pathogens is critical for sepsis management; however, PCR inhibitors such as sodium polyanetholesulfonate (SPS) in blood culture broth compromise nucleic acid recovery and long read sequencing. We assessed whether coupling a benzyl alcohol SPS-removal step to the fully automated LabTurbo AIO extractor improves Oxford Nanopore-based pathogen detection. Methods: Thirteen positive blood culture broths were pre-treated with benzyl alcohol and divided: half volumes were purified on the LabTurbo AIO; paired aliquots underwent manual QIAamp extraction. DNA purity was evaluated by NanoDrop and Qubit. Barcoded libraries were sequenced on MinION R9.4.1 flow cells for 6 h. Results: Automated eluates showed a median A260/A280 of 1.92 and A260/A230 of 1.96, versus 1.80 and 1.48 for manual extracts. The automated workflow generated 1.69 × 106 total reads compared with 3.9 × 105 reads for manual extraction. The median N50 read length increased from 5.9 kb to 8.7 kb, and the median proportion of reads classified to species increased from 62% to 84%. The hands-on time was <5 min and the sample-to-answer turnaround was <8 h, compared with >9 h and 90 min for the manual protocol, respectively. Conclusions: Benzyl alcohol SPS removal integrated into the LabTurbo AIO extractor yielded purer DNA, longer reads, and higher read counts, enhancing nanopore sequencing depth and accuracy while compressing diagnostic turnaround to a single working day. This represents a practical advance for rapid blood culture pathogen identification in critical care settings. Full article
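The N50 read-length statistic reported in the results has a simple definition: the length L such that reads of at least L cover half the sequenced bases. A sketch with made-up read lengths:

```python
# N50: sort read lengths descending and accumulate until at least half
# the total bases are covered; the length reached at that point is N50.
def n50(lengths: list[int]) -> int:
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0
```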

21 pages, 1270 KB  
Article
Performance and Uncertainty Analysis of Digital vs. Analog Pressure Scanners Under Static and Dynamic Conditions
by Roxana Nicolae, Constantin-Daniel Oancea, Rares Secareanu and Daniel Lale
Eng 2025, 6(10), 263; https://doi.org/10.3390/eng6100263 - 4 Oct 2025
Viewed by 370
Abstract
Dynamic pressure measurement is an important component in the turbo engine testing process. This paper presents a comparative analysis between two types of multichannel electronic pressure measurement systems, commonly known as pressure scanners, used for this purpose: ZOC17/8Px, with analog amplification per channel, and MPS4264, a modern digital system with integrated A/D conversion. The study was conducted in two stages: a metrological verification and validation in static mode, using a high-precision pressure standard, and an experimental stage in dynamic mode, where data was acquired from a turbojet engine test stand, in constant engine speed mode. The signal stability of the pressure scanners was statistically analyzed by determining the coefficient of variation in the signal and the frequency spectrum (FFT) for each channel of the pressure scanners. Furthermore, comprehensive uncertainty budgets were calculated for both systems. The results highlight the superior stability and reduced uncertainty of the MPS4264 pressure scanner, attributing its enhanced performance to digital integration and a higher resilience to external noise. The findings support the adoption of modern digital systems for dynamic applications and provide a robust metrological basis for the optimal selection of measurement systems. Full article
(This article belongs to the Section Electrical and Electronic Engineering)
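The per-channel stability statistic used in the study, the coefficient of variation, is the signal's standard deviation divided by its mean; the sample values in the test are synthetic, not measured pressures:

```python
# Coefficient of variation of a sampled signal: std / mean. Lower values
# indicate a more stable channel, the comparison criterion in the study.
import statistics

def coefficient_of_variation(signal: list[float]) -> float:
    return statistics.stdev(signal) / statistics.fmean(signal)
```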

15 pages, 1726 KB  
Article
Nano Oil Additive Improves Internal Combustion Engine Efficiency and Life Expectancy
by Ding Lou, Jordan Morrison, Greg Christensen, Craig Bailey, Rose Gerani, Aaron Nardi and Rob Hrabe
Lubricants 2025, 13(10), 427; https://doi.org/10.3390/lubricants13100427 - 24 Sep 2025
Viewed by 1051
Abstract
Internal combustion engines remain a predominant source of global energy consumption, contributing substantially to both operational costs and greenhouse gas emissions. This work evaluates a nanomaterial-based engine oil additive that reduces friction and wear and increases torque, horsepower, and fuel efficiency. This novel nano oil additive contains functionalized carbon nanotubes and hexagonal boron nitride nanosheets that are dispersed in base oil using a proprietary ultrasonication process. Block-on-ring tests performed by multiple testing facilities demonstrated up to a 17% decrease in coefficient of friction and up to a 78% decrease in wear compared to the base oil after treatment with the nano oil additive. The additive's enhancement of thermal properties was also evaluated, with increases of up to 17 °C in thermal stability. Additionally, the nano oil additive increased torque and horsepower by an average of 7% in motorcycles and 2.4% in pickup trucks. Most importantly, the nano oil additive improved fuel economy in both gasoline and diesel engines, with laboratory tests reporting 3–5% increases and practical field tests on a commercial truck fleet reporting an average 6% increase. The improved engine efficiency leads to reduced turbo temperature in heavy diesel engines and prolonged engine life expectancy, and will contribute significantly to global environmental sustainability. Full article
(This article belongs to the Special Issue Recent Advances in Automotive Powertrain Lubrication)

17 pages, 7481 KB  
Article
A Real-Time Advisory Tool for Supporting the Use of Helmets in Construction Sites
by Ümit Işıkdağ, Handan Aş Çemrek, Seda Sönmez, Yaren Aydın, Gebrail Bekdaş and Zong Woo Geem
Information 2025, 16(10), 824; https://doi.org/10.3390/info16100824 - 24 Sep 2025
Viewed by 1066
Abstract
In the construction industry, occupational health and safety plays a critical role in preventing occupational accidents and increasing productivity. In recent years, computer vision and artificial intelligence-based systems have made significant contributions to improving these processes through automatic detection and tracking of objects. The aim of this study was to fine-tune object detection models and integrate them with Large Language Models for (i) accurate detection of personal protective equipment (PPE), focusing specifically on helmets, and (ii) real-time recommendations based on the detections to support the use of helmets on construction sites. To achieve the first objective, large YOLOv8/v11/v12 models were trained using a helmet dataset consisting of 16,867 images. The dataset was divided into two classes: “Head (No Helmet)” and “Helmet”. Once trained, the model was able to analyze an image from a construction site and detect and count the people with and without helmets. To fulfil the second objective, a tool was developed to provide advice to workers in real time. The tool counts people based on video feeds or a series of images and provides recommendations on occupational safety (based on the detections from the video feed and images) through an OpenAI GPT-3.5-turbo Large Language Model with a Streamlit-based GUI. YOLO enables quick and accurate detections, and the OpenAI model API responds similarly quickly; their combination enables near-real-time responses to the user over the web. The paper elaborates on the fine-tuning of the detection model with the helmet dataset and the development of the real-time advisory tool. Full article
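The counting step that feeds the advisory logic can be sketched as a tally over per-image detections. The detection tuples, confidence threshold, and function names here are invented; only the two class labels come from the dataset description:

```python
# Tally confident detections per class and decide whether a safety
# advisory should be generated. Detections are (label, confidence) pairs
# as a YOLO model might return them; the examples are invented.
def tally(detections: list[tuple[str, float]], min_conf: float = 0.5) -> dict[str, int]:
    counts = {"Helmet": 0, "Head (No Helmet)": 0}
    for label, conf in detections:
        if conf >= min_conf and label in counts:
            counts[label] += 1
    return counts

def needs_advisory(counts: dict[str, int]) -> bool:
    # In the tool, this condition would trigger an LLM-generated recommendation.
    return counts["Head (No Helmet)"] > 0
```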

11 pages, 1112 KB  
Article
Thoracic MRI in Pediatric Oncology: Feasibility and Image Quality of Post-Contrast Free-Breathing Radial 3D T1 Weighted Imaging
by Patricia Tischendorf, Marc-David Künnemann, Tobias Krähling, Jan Hendrik Lange, Walter Heindel and Laura Beck
Biomedicines 2025, 13(9), 2302; https://doi.org/10.3390/biomedicines13092302 - 19 Sep 2025
Viewed by 936
Abstract
Objectives: To compare the feasibility and image quality of a post-contrast free-breathing radial stack-of-stars 3D T1w turbo-field echo Dixon sequence (3D T1w VANE mDIXON) with a conventional cartesian breath-hold 3D T1w fast-field echo mDIXON sequence in pediatric oncology patients undergoing chest MRI. Methods: A total of 48 children (34 females; mean age 5.3 ± 3.7 years) underwent contrast-enhanced chest MRI, with 24 examined using the 3D T1w VANE mDIXON sequence and 24 with a conventional breath-hold 3D T1w mDIXON sequence. Image quality was independently assessed by three radiologists using a 5-point scale. Signal-to-noise ratio (SNR) was measured at two anatomical sites, a homogeneous paraspinal muscle region (SNRmuscle) and the liver apex (SNRliver), while avoiding vessels and signal inhomogeneities. The presence of respiratory artifacts, total imaging time, and the need for general anesthesia or sedation were recorded. Interobserver agreement was determined using Fleiss’s kappa (ϰ), and mean SNR values were compared between groups using an independent samples t-test. Results: The 3D T1w VANE mDIXON sequence yielded significantly higher SNRmuscle and SNRliver (530 ± 120; 570 ± 110 vs. 370 ± 110; 400 ± 90; p < 0.001), improved diagnostic image quality by approximately 25%, and reduced respiratory artifacts by about 23%. Interobserver agreement was almost perfect. Importantly, the need for general anesthesia was significantly reduced using the 3D T1w VANE mDIXON (p < 0.001). Conclusions: Free-breathing 3D T1w VANE mDIXON chest MRI is a feasible and effective imaging approach for pediatric oncology patients, offering superior image quality and reducing the need for general anesthesia compared to conventional methods. Full article
(This article belongs to the Special Issue Pediatric Tumors: Diagnosis, Pathogenesis, Treatment, and Outcome)
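The SNR figures above follow the usual region-of-interest definition, mean signal divided by its standard deviation; a toy computation with synthetic pixel values:

```python
# SNR of a homogeneous region of interest: mean intensity / standard deviation.
import statistics

def snr(roi: list[float]) -> float:
    return statistics.fmean(roi) / statistics.stdev(roi)
```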

12 pages, 211 KB  
Article
A Comparative Study of Large Language Models in Programming Education: Accuracy, Efficiency, and Feedback in Student Assignment Grading
by Andrija Bernik, Danijel Radošević and Andrej Čep
Appl. Sci. 2025, 15(18), 10055; https://doi.org/10.3390/app151810055 - 15 Sep 2025
Viewed by 1251
Abstract
Programming education traditionally requires extensive manual assessment of student assignments, which is both time-consuming and resource-intensive for instructors. Recent advances in large language models (LLMs) open opportunities for automating this process and providing timely feedback. This paper investigates the application of artificial intelligence (AI) tools for preliminary assessment of undergraduate programming assignments. A multi-phase experimental study was conducted across three computer science courses: Introduction to Programming, Programming 2, and Advanced Programming Concepts. A total of 315 Python assignments were collected from the Moodle learning management system, with 100 randomly selected submissions analyzed in detail. AI evaluation was performed using ChatGPT-4 (GPT-4-turbo), Claude 3, and Gemini 1.5 Pro models, employing structured prompts aligned with a predefined rubric that assessed functionality, code structure, documentation, and efficiency. Quantitative results demonstrate high correlation between AI-generated scores and instructor evaluations, with ChatGPT-4 achieving the highest consistency (Pearson coefficient 0.91) and the lowest average absolute deviation (0.68 points). Qualitative analysis highlights AI’s ability to provide structured, actionable feedback, though variability across models was observed. The study identifies benefits such as faster evaluation and enhanced feedback quality, alongside challenges including model limitations, potential biases, and the need for human oversight. Recommendations emphasize hybrid evaluation approaches combining AI automation with instructor supervision, ethical guidelines, and integration of AI tools into learning management systems. The findings indicate that AI-assisted grading can improve efficiency and pedagogical outcomes while maintaining academic integrity. Full article
16 pages, 1697 KB  
Article
Enhancing Ancient Ceramic Knowledge Services: A Question Answering System Using Fine-Tuned Models and GraphRAG
by Zhi Chen and Bingxiang Liu
Information 2025, 16(9), 792; https://doi.org/10.3390/info16090792 - 11 Sep 2025
Abstract
To address the challenges of extensive domain expertise and deficient semantic comprehension in the digital preservation of ancient ceramics, this paper proposes a knowledge question answering (QA) system integrating Low-Rank Adaptation (LoRA) fine-tuning and Graph Retrieval-Augmented Generation (GraphRAG). First, textual information of ceramic images is generated using the GLM-4V-9B model. These texts are then enriched with domain literature to produce ancient ceramic QA pairs via ERNIE 4.0 Turbo, culminating in a high-quality dataset of 2143 curated question–answer groups after manual refinement. Second, LoRA fine-tuning was employed on the Qwen2.5-7B-Instruct foundation model, significantly enhancing its question-answering proficiency specifically for the ancient ceramics domain. Finally, the GraphRAG framework is integrated, combining the fine-tuned large language model with knowledge graph path analysis to augment multi-hop reasoning capabilities for complex queries. Experimental results demonstrate performance improvements of 24.08% in ROUGE-1, 34.75% in ROUGE-2, 29.78% in ROUGE-L, and 4.52% in BERTScore_F1 over the baseline model. This evidence shows that the synergistic implementation of LoRA fine-tuning and GraphRAG delivers significant performance enhancements for ceramic knowledge systems, establishing a replicable technical framework for intelligent cultural heritage knowledge services. Full article
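The reported gains are expressed in ROUGE scores, which measure n-gram overlap between a generated answer and a reference answer. A minimal sketch of ROUGE-1 F1 (unigram overlap, whitespace tokenization as a simplification; production evaluations typically use a dedicated ROUGE library):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

ROUGE-2 and ROUGE-L follow the same precision/recall pattern over bigrams and the longest common subsequence, respectively; BERTScore replaces exact token matching with embedding similarity.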

17 pages, 1583 KB  
Article
Comparative Analysis of AI Models for Python Code Generation: A HumanEval Benchmark Study
by Ali Bayram, Gonca Gokce Menekse Dalveren and Mohammad Derawi
Appl. Sci. 2025, 15(18), 9907; https://doi.org/10.3390/app15189907 - 10 Sep 2025
Abstract
This study conducts a comprehensive comparative analysis of six contemporary artificial intelligence models for Python code generation using the HumanEval benchmark. The evaluated models include GPT-3.5 Turbo, GPT-4 Omni, Claude 3.5 Sonnet, Claude 3.7 Sonnet, Claude Sonnet 4, and Claude Opus 4. A total of 164 Python programming problems were utilized to assess model performance through a multi-faceted methodology incorporating automated functional correctness evaluation via the Pass@1 metric, cyclomatic complexity analysis, maintainability index calculations, and lines-of-code assessment. The results indicate that Claude Sonnet 4 achieved the highest performance with a success rate of 95.1%, followed closely by Claude Opus 4 at 94.5%. Across all metrics, Anthropic's Claude models consistently outperformed OpenAI's GPT models by margins exceeding 20%. Statistical analysis further confirmed the existence of significant differences between the model families (p < 0.001). Anthropic's Claude models were observed to generate more sophisticated and maintainable solutions with superior syntactic accuracy. In contrast, OpenAI GPT models tended to adopt simpler strategies but exhibited notable limitations in terms of reliability. These findings offer evidence-based insights to guide the selection of AI-powered coding assistants in professional software development contexts. Full article
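Pass@1 is a special case of the unbiased pass@k estimator introduced with the HumanEval benchmark (Chen et al., 2021): given n sampled completions per problem of which c pass the unit tests, pass@k = 1 − C(n−c, k)/C(n, k). A minimal sketch under that definition:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples, c of which pass the tests."""
    if n - c < k:
        return 1.0  # every size-k draw must contain a passing sample
    return 1.0 - comb(n - c, k) / comb(n, k)

def mean_pass_at_1(results):
    """Benchmark-level Pass@1: mean over (n, c) pairs, one per problem."""
    return sum(pass_at_k(n, c, 1) for n, c in results) / len(results)
```

With k = 1 and a single sample per problem, this reduces to the fraction of the 164 problems whose generated solution passes its tests, which is how success rates such as 95.1% are typically obtained.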
