Search Results (371)

Search Parameters:
Keywords = domain expertise

38 pages, 1194 KiB  
Review
Transforming Data Annotation with AI Agents: A Review of Architectures, Reasoning, Applications, and Impact
by Md Monjurul Karim, Sangeen Khan, Dong Hoang Van, Xinyue Liu, Chunhui Wang and Qiang Qu
Future Internet 2025, 17(8), 353; https://doi.org/10.3390/fi17080353 - 2 Aug 2025
Abstract
Data annotation serves as a critical foundation for artificial intelligence (AI) and machine learning (ML). Recently, AI agents powered by large language models (LLMs) have emerged as effective solutions to longstanding challenges in data annotation, such as scalability, consistency, cost, and limitations in domain expertise. These agents facilitate intelligent automation and adaptive decision-making, thereby enhancing the efficiency and reliability of annotation workflows across various fields. Despite the growing interest in this area, a systematic understanding of the role and capabilities of AI agents in annotation is still lacking. This paper seeks to fill that gap by providing a comprehensive review of how LLM-driven agents support advanced reasoning strategies, adaptive learning, and collaborative annotation efforts. We analyze agent architectures, integration patterns within workflows, and evaluation methods, along with real-world applications in sectors such as healthcare, finance, technology, and media. Furthermore, we evaluate current tools and platforms that support agent-based annotation, addressing key challenges such as quality assurance, bias mitigation, transparency, and scalability. Lastly, we outline future research directions, highlighting the importance of federated learning, cross-modal reasoning, and responsible system design to advance the development of next-generation annotation ecosystems.

16 pages, 1873 KiB  
Systematic Review
A Systematic Review of GIS Evolution in Transportation Planning: Towards AI Integration
by Ayda Zaroujtaghi, Omid Mansourihanis, Mohammad Tayarani, Fatemeh Mansouri, Moein Hemmati and Ali Soltani
Future Transp. 2025, 5(3), 97; https://doi.org/10.3390/futuretransp5030097 - 1 Aug 2025
Abstract
Previous reviews have examined specific facets of Geographic Information Systems (GIS) in transportation planning, such as transit-focused applications and open-source geospatial tools. Addressing this gap, this study offers the first systematic, PRISMA-guided longitudinal evaluation of GIS integration in transportation planning, spanning thematic domains, data models, methodologies, and outcomes from 2004 to 2024. By conducting a mixed-methods analysis of 241 peer-reviewed articles, this study delineates major trends, such as increased emphasis on sustainability, equity, stakeholder involvement, and the incorporation of advanced technologies. Prominent domains include land use–transportation coordination, accessibility, artificial intelligence, real-time monitoring, and policy evaluation. Expanded data sources, such as real-time sensor feeds and 3D models, alongside sophisticated modeling techniques, enable evidence-based, multifaceted decision-making. However, challenges like data limitations, ethical concerns, and the need for specialized expertise persist, particularly in developing regions. Future geospatial innovations should prioritize the responsible adoption of emerging technologies, inclusive capacity building, and environmental justice to foster equitable and efficient transportation systems. This review highlights GIS’s evolution from a supplementary tool to a cornerstone of data-driven, sustainable urban mobility planning, offering insights for researchers, practitioners, and policymakers to advance transportation strategies that align with equity and sustainability goals.

12 pages, 1346 KiB  
Article
A Language Vision Model Approach for Automated Tumor Contouring in Radiation Oncology
by Yi Luo, Hamed Hooshangnejad, Xue Feng, Gaofeng Huang, Xiaojian Chen, Rui Zhang, Quan Chen, Wil Ngwa and Kai Ding
Bioengineering 2025, 12(8), 835; https://doi.org/10.3390/bioengineering12080835 - 31 Jul 2025
Abstract
Background: Lung cancer ranks as the leading cause of cancer-related mortality worldwide. The complexity of tumor delineation, crucial for radiation therapy, requires expertise often unavailable in resource-limited settings. Artificial Intelligence (AI), particularly with advancements in deep learning (DL) and natural language processing (NLP), offers potential solutions yet is challenged by high false positive rates. Purpose: The Oncology Contouring Copilot (OCC) system is developed to leverage oncologist expertise for precise tumor contouring using textual descriptions, aiming to increase the efficiency of oncological workflows by combining the strengths of AI with human oversight. Methods: Our OCC system initially identifies nodule candidates from CT scans. Employing Language Vision Models (LVMs) like GPT-4V, OCC then effectively reduces false positives with clinical descriptive texts, merging textual and visual data to automate tumor delineation, designed to elevate the quality of oncology care by incorporating knowledge from experienced domain experts. Results: The deployment of the OCC system resulted in a 35.0% reduction in the false discovery rate, a 72.4% decrease in false positives per scan, and an F1-score of 0.652 across our dataset for unbiased evaluation. Conclusions: OCC represents a significant advance in oncology care, particularly through the use of the latest LVMs, improving contouring results by (1) streamlining oncology treatment workflows by optimizing tumor delineation and reducing manual processes; (2) offering a scalable and intuitive framework to reduce false positives in radiotherapy planning using LVMs; (3) introducing novel medical language vision prompt techniques to minimize LVM hallucinations with ablation study; and (4) conducting a comparative analysis of LVMs, highlighting their potential in addressing medical language vision challenges. Full article
(This article belongs to the Special Issue Novel Imaging Techniques in Radiotherapy)
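
The detection metrics reported in this abstract (false discovery rate, false positives per scan, F1-score) follow directly from candidate-level counts before and after the LVM filtering step. Below is a minimal sketch with invented counts, not the authors' data or code; the real system reports a 35.0% FDR reduction and an F1-score of 0.652.

```python
# Illustrative detection metrics for a candidate-filtering pipeline (synthetic counts).
def detection_metrics(tp, fp, fn, n_scans):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "false_discovery_rate": fp / (tp + fp),      # 1 - precision
        "false_positives_per_scan": fp / n_scans,
        "f1": 2 * precision * recall / (precision + recall),
    }

before = detection_metrics(tp=80, fp=160, fn=30, n_scans=50)
after = detection_metrics(tp=75, fp=45, fn=35, n_scans=50)   # LVM rejects many false candidates
print("before filtering:", before)
print("after filtering: ", after)
```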

16 pages, 628 KiB  
Article
Beyond the Bot: A Dual-Phase Framework for Evaluating AI Chatbot Simulations in Nursing Education
by Phillip Olla, Nadine Wodwaski and Taylor Long
Nurs. Rep. 2025, 15(8), 280; https://doi.org/10.3390/nursrep15080280 - 31 Jul 2025
Abstract
Background/Objectives: The integration of AI chatbots in nursing education, particularly in simulation-based learning, is advancing rapidly. However, there is a lack of structured evaluation models, especially to assess AI-generated simulations. This article introduces the AI-Integrated Method for Simulation (AIMS) evaluation framework, a dual-phase approach adapted from the FAITA model, designed to evaluate both prompt design and chatbot performance in the context of nursing education. Methods: This simulation-based study explored the application of an AI chatbot in an emergency planning course. The AIMS framework was developed and applied, consisting of six prompt-level domains (Phase 1) and eight performance criteria (Phase 2). These domains were selected based on current best practices in instructional design, simulation fidelity, and emerging AI evaluation literature. To assess the chatbot’s educational utility, the study employed a scoring rubric for each phase and incorporated a structured feedback loop to refine both prompt design and chatbot interaction. To demonstrate the framework’s practical application, the researchers configured an AI tool referred to in this study as “Eval-Bot v1”, built using OpenAI’s GPT-4.0, to apply Phase 1 scoring criteria to a real simulation prompt. Insights from this analysis were then used to anticipate Phase 2 performance and identify areas for improvement. Participants (three individuals)—all experienced healthcare educators and advanced practice nurses with expertise in clinical decision-making and simulation-based teaching—reviewed the prompt and Eval-Bot’s score to triangulate findings. Results: Simulated evaluations revealed clear strengths in the prompt’s alignment with course objectives and its capacity to foster interactive learning. Participants noted that the AI chatbot supported engagement and maintained appropriate pacing, particularly in scenarios involving emergency planning decision-making. However, challenges emerged in areas related to personalization and inclusivity. While the chatbot responded consistently to general queries, it struggled to adapt tone, complexity, and content to reflect diverse learner needs or cultural nuances. To support replication and refinement, a sample scoring rubric and simulation prompt template are provided. When evaluated using the Eval-Bot tool, moderate concerns were flagged regarding safety prompts and inclusive language, particularly in how the chatbot navigated sensitive decision points. These gaps were linked to predicted performance issues in Phase 2 domains such as dialog control, equity, and user reassurance. Based on these findings, revised prompt strategies were developed to improve contextual sensitivity, promote inclusivity, and strengthen ethical guidance within chatbot-led simulations. Conclusions: The AIMS evaluation framework provides a practical and replicable approach for evaluating the use of AI chatbots in simulation-based education. By offering structured criteria for both prompt design and chatbot performance, the model supports instructional designers, simulation specialists, and developers in identifying areas of strength and improvement. The findings underscore the importance of intentional design, safety monitoring, and inclusive language when integrating AI into nursing and health education. As AI tools become more embedded in learning environments, this framework offers a thoughtful starting point for ensuring they are applied ethically, effectively, and with learner diversity in mind.
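
The dual-phase structure described above (six prompt-level domains, eight performance criteria) maps naturally onto a simple rubric record. The sketch below is purely illustrative: criteria other than the ones the abstract names (safety prompts, inclusivity, dialog control, equity, user reassurance) are hypothetical placeholders, and the 1-5 scale is an assumption rather than the published rubric.

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class PhaseScores:
    """Rubric scores for one AIMS evaluation phase (criterion names partly hypothetical)."""
    scores: dict = field(default_factory=dict)   # criterion -> rating on an assumed 1-5 scale

    def total(self) -> float:
        return mean(self.scores.values())

# Phase 1: prompt-design domains; Phase 2: chatbot-performance criteria.
phase1 = PhaseScores({"objective_alignment": 5, "simulation_fidelity": 4, "interactivity": 4,
                      "inclusivity": 2, "safety_prompts": 3, "pacing": 4})
phase2 = PhaseScores({"dialog_control": 3, "equity": 2, "user_reassurance": 3,
                      "accuracy": 4, "engagement": 5, "adaptivity": 2,
                      "cultural_sensitivity": 2, "ethical_guidance": 3})

print("Phase 1 (prompt design) mean:", round(phase1.total(), 2))
print("Phase 2 (performance) mean:  ", round(phase2.total(), 2))
print("flag for revision:", [k for k, v in {**phase1.scores, **phase2.scores}.items() if v <= 2])
```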

24 pages, 739 KiB  
Article
CPEL: A Causality-Aware, Parameter-Efficient Learning Framework for Adaptation of Large Language Models with Case Studies in Geriatric Care and Beyond
by Jinzhong Xu, Junyi Gao, Xiaoming Liu, Guan Yang, Jie Liu, Yang Long, Ziyue Huang and Kai Yang
Mathematics 2025, 13(15), 2460; https://doi.org/10.3390/math13152460 - 30 Jul 2025
Abstract
Adapting Large Language Models (LLMs) to specialized domains like geriatric care remains a significant challenge due to the limited availability of domain-specific data and the difficulty of achieving efficient yet effective fine-tuning. Current methods often fail to effectively harness domain-specific causal insights, which are crucial for understanding and solving complex problems in low-resource domains. To address these challenges, we propose Causality-Aware, Parameter-Efficient Learning (CPEL), a novel framework that leverages domain-specific causal relationships to guide a multi-layer, parameter-efficient fine-tuning process for more effective domain adaptation. By embedding causal reasoning into the model’s adaptation pipeline, CPEL enables efficient specialization in the target domain while maintaining strong task-specific performance. Specifically, the Causal Prompt Generator of CPEL extracts and applies domain-specific causal structures, generating adaptive prompts that effectively guide the model’s learning process. Complementing this, the MPEFT module employs a dual-adapter mechanism to balance domain-level adaptation with downstream task optimization. This cohesive design ensures that CPEL achieves resource efficiency while capturing domain knowledge in a structured and interpretable manner. Based on this framework, we delved into its application in the field of geriatric care and trained a specialized large language model (Geriatric Care LLaMA) tailored for the aged-care domain, leveraging its capacity to efficiently integrate domain expertise. Experimental results from question-answering tasks demonstrate that CPEL improves ROUGE scores by 9–14% compared to mainstream LLMs and outperforms frontier models by 1–2 points in auto-scoring tasks. In summary, CPEL demonstrates robust generalization and cross-domain adaptability, highlighting its scalability and effectiveness as a transformative solution for domain adaptation in specialized, resource-constrained fields.
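
The abstract does not give implementation details of the MPEFT dual-adapter mechanism or the Causal Prompt Generator, so the following is only a generic sketch of parameter-efficient fine-tuning with a single LoRA adapter plus a causal-knowledge prompt prefix. The base checkpoint, LoRA hyperparameters, and the example causal relations are all assumptions for illustration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

base = "meta-llama/Llama-2-7b-hf"   # placeholder base model, not necessarily the one used in the paper
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Generic LoRA adapter (the paper's MPEFT module uses a dual-adapter design instead).
config = LoraConfig(task_type=TaskType.CAUSAL_LM, r=8, lora_alpha=16,
                    lora_dropout=0.05, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, config)
model.print_trainable_parameters()

# "Causal prompt": a prefix derived from domain causal relations, prepended to each training example.
causal_prefix = ("Known causal relations: reduced mobility -> higher fall risk; "
                 "polypharmacy -> adverse drug interactions.\n")
example = causal_prefix + "Question: How should fall risk be managed for this patient? Answer:"
inputs = tokenizer(example, return_tensors="pt")   # fed to a standard fine-tuning loop
```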

11 pages, 15673 KiB  
Article
Automating GIS-Based Cloudburst Risk Mapping Using Generative AI: A Framework for Scalable Hydrological Analysis
by Alexander Adiyasa, Andrea Niccolò Mantegna and Irma Kveladze
Hydrology 2025, 12(8), 196; https://doi.org/10.3390/hydrology12080196 - 23 Jul 2025
Abstract
Accurate dynamic hydrological models are often too complex and costly for the rapid, broad-scale screening necessitated for proactive land-use planning against increasing cloudburst risks. This paper demonstrates the use of GPT-4 to develop a GUI-based Python 3.13.2 application for geospatial flood risk assessments. The study used instructive prompt techniques to script a traditional stream and catchment delineation methodology, further embedding it with a custom GUI. The resulting application demonstrates high performance, processing a 29.63 km2 catchment at a 1 m resolution in 30.31 s, and successfully identifying the main upstream contributing areas and flow paths for a specified area of interest. While its accuracy is limited by terrain data artifacts causing stream breaks, this study demonstrates how human–AI collaboration, with the LLM acting as a coding assistant guided by domain expertise, can empower domain experts and facilitate the development of advanced GIS-based decision-support systems. Full article
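
The stream and catchment delineation workflow that the GPT-4-assisted application scripts rests on standard terrain routines such as D8 flow-direction analysis. Below is a minimal, illustrative NumPy version on a tiny synthetic DEM; it is not the application's code, and the DEM values are invented.

```python
import numpy as np

def d8_flow_direction(dem: np.ndarray) -> np.ndarray:
    """For each interior cell, return the index (0-7) of the steepest downslope
    neighbour, or -1 for pits and flat cells (illustrative D8 routing only)."""
    offsets = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]
    rows, cols = dem.shape
    direction = np.full((rows, cols), -1, dtype=int)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            drops = []
            for dr, dc in offsets:
                dist = np.hypot(dr, dc)               # 1 for cardinal, sqrt(2) for diagonal
                drops.append((dem[r, c] - dem[r + dr, c + dc]) / dist)
            k_max = int(np.argmax(drops))
            if drops[k_max] > 0:                      # only route flow downhill
                direction[r, c] = k_max
    return direction

# Tiny synthetic DEM sloping toward the lower-right corner
dem = np.add.outer(np.arange(5, 0, -1), np.arange(5, 0, -1)).astype(float)
print(d8_flow_direction(dem))
```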

13 pages, 388 KiB  
Article
Benchmarking ChatGPT-3.5 and OpenAI o3 Against Clinical Pharmacists: Preliminary Insights into Clinical Accuracy, Sensitivity, and Specificity in Pharmacy MCQs
by Esraa M. Alsaudi, Sireen A. Shilbayeh and Rana K Abu-Farha
Healthcare 2025, 13(14), 1751; https://doi.org/10.3390/healthcare13141751 - 19 Jul 2025
Abstract
Objective: This proof-of-concept study aimed to evaluate and compare the clinical performance of two AI language models (ChatGPT-3.5 and OpenAI o3) in answering clinical pharmacy multiple-choice questions (MCQs), benchmarked against responses from specialist clinical pharmacists in Jordan, including academic preceptors and hospital-based clinicians. Methods: A total of 60 clinical pharmacy MCQs were developed based on current guidelines across four therapeutic areas: cardiovascular, endocrine, infectious, and respiratory diseases. Each item was reviewed by academic and clinical experts and then pilot-tested with five pharmacists to determine clarity and difficulty. Two ChatGPT models—GPT-3.5 and OpenAI o3—were tested using a standardized prompt for each MCQ, entered in separate sessions to avoid memory retention. Their answers were classified as true/false positives or negatives and retested after two weeks to assess reproducibility. Simultaneously, 25 licensed pharmacists (primarily from one academic institution and several hospitals in Amman) completed the same MCQs using validated references (excluding AI tools). Accuracy, sensitivity, specificity, and Cohen’s Kappa were used to compare AI and human performance, with statistical analysis conducted using appropriate tests at a significance level of p ≤ 0.05. Results: OpenAI o3 achieved the highest accuracy (83.3%), sensitivity (90.0%), and specificity (70.0%), outperforming GPT-3.5 (70.0%, 77.5%, 55.0%) and pharmacists (69.7%, 77.0%, 55.0%). AI performance declined significantly with increasing question difficulty. OpenAI o3 showed the highest accuracy in the cardiovascular domain (93.3%), while GPT-3.5 performed best in infectious diseases (80.0%). Reproducibility was higher for GPT-3.5 (81.6%, κ = 0.556) than OpenAI o3 (76.7%, κ = 0.364). Over two test rounds, GPT-3.5’s accuracy remained stable, whereas OpenAI o3’s accuracy decreased from 83.3% to 70.0%, indicating some variability. Conclusions: OpenAI o3 shows strong promise as a clinical decision-support tool in pharmacy, especially for low- to moderate-difficulty questions. However, inconsistencies in reproducibility and limitations in complex cases highlight the importance of cautious, supervised integration alongside human expertise. Full article
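
The reported accuracy, sensitivity, specificity, and Cohen's kappa can be reproduced from binary answer classifications with a few lines of scikit-learn. The sketch below uses randomly generated responses rather than the study's 60 MCQs, so the numbers it prints are meaningless; it only shows how the metrics fit together.

```python
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score, confusion_matrix

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=60)                     # hypothetical keyed answers (1 = "positive" option)
round1 = np.where(rng.random(60) < 0.8, y_true, 1 - y_true)   # model answers, session 1
round2 = np.where(rng.random(60) < 0.8, y_true, 1 - y_true)   # model answers, session 2

tn, fp, fn, tp = confusion_matrix(y_true, round1, labels=[0, 1]).ravel()
print("accuracy:   ", round(accuracy_score(y_true, round1), 3))
print("sensitivity:", round(tp / (tp + fn), 3))
print("specificity:", round(tn / (tn + fp), 3))
# Reproducibility across the two sessions as chance-corrected agreement
print("kappa:      ", round(cohen_kappa_score(round1, round2), 3))
```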

33 pages, 2593 KiB  
Article
Methodological Exploration of Ontology Generation with a Dedicated Large Language Model
by Maria Assunta Cappelli and Giovanna Di Marzo Serugendo
Electronics 2025, 14(14), 2863; https://doi.org/10.3390/electronics14142863 - 17 Jul 2025
Abstract
Ontologies are essential tools for representing, organizing, and sharing knowledge across various domains. This study presents a methodology for ontology construction supported by large language models (LLMs), with an initial application in the automotive sector. Specifically, a user preference ontology for adaptive interfaces in autonomous machines was developed using ChatGPT-4o. Based on this case study, the results were generalized into a reusable methodology. The proposed workflow integrates classical ontology engineering methodologies with the generative and analytical capabilities of LLMs. Each phase follows well-established steps: domain definition, term elicitation, class hierarchy construction, property specification, formalization, population, and validation. A key innovation of this approach is the use of a guiding table that translates domain knowledge into structured prompts, ensuring consistency across iterative interactions with the LLM. Human experts play a continuous role throughout the process, refining definitions, resolving ambiguities, and validating outputs. The ontology was evaluated in terms of logical consistency, structural properties, semantic accuracy, and inferential completeness, confirming its correctness and coherence. Additional validation through SPARQL queries demonstrated its reasoning capabilities. This methodology is generalizable to other domains, if domain experts adapt the guiding table to the specific context. Despite the support provided by LLMs, domain expertise remains essential to guarantee conceptual rigor and practical relevance. Full article
(This article belongs to the Special Issue Role of Artificial Intelligence in Natural Language Processing)
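
Validation through SPARQL queries, as mentioned above, can be scripted with rdflib. The sketch below is a generic competency-style check (listing classes that lack an rdfs:label); the file name and the query are assumptions, not the authors' ontology or validation suite.

```python
from rdflib import Graph

g = Graph()
g.parse("user_preference_ontology.ttl", format="turtle")   # hypothetical ontology file

# Generic validation query: every owl:Class should carry a human-readable label.
query = """
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?cls
WHERE {
    ?cls a owl:Class .
    FILTER NOT EXISTS { ?cls rdfs:label ?lbl }
}
"""
for row in g.query(query):
    print("class without a label:", row.cls)
```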

38 pages, 5791 KiB  
Article
Hybrid Gaussian Process Regression Models for Accurate Prediction of Carbonation-Induced Steel Corrosion in Cementitious Mortars
by Teerapun Saeheaw
Buildings 2025, 15(14), 2464; https://doi.org/10.3390/buildings15142464 - 14 Jul 2025
Abstract
Steel corrosion prediction in concrete infrastructure remains a critical challenge for durability assessment and maintenance planning. This study presents a comprehensive framework integrating domain expertise with advanced machine learning for carbonation-induced corrosion prediction. Four Gaussian Process Regression (GPR) variants were systematically developed: Baseline GPR with manual optimization, Expert Knowledge GPR employing domain-driven dual-kernel architecture, GPR with Automatic Relevance Determination (GPR-ARD) for feature selection, and GPR-OptCorrosion featuring specialized multi-component composite kernels. The models were trained and validated using 180 carbonated mortar specimens with 15 systematically categorized variables spanning mixture, material, environmental, and electrochemical parameters. GPR-OptCorrosion achieved superior performance (R2 = 0.9820, RMSE = 1.3311 μA/cm2), representing 44.7% relative improvement in explained variance over baseline methods, while Expert Knowledge GPR and GPR-ARD demonstrated comparable performance (R2 = 0.9636 and 0.9810, respectively). Contrary to conventional approaches emphasizing electrochemical indicators, automatic relevance determination revealed supplementary cementitious materials (silica fume and fly ash) as dominant predictive factors. All advanced models exhibited excellent generalization (gaps < 0.02) and real-time efficiency (<0.006 s), with probabilistic uncertainty quantification enabling risk-informed infrastructure management. This research contributes to advancing machine learning applications in corrosion engineering and provides a foundation for predictive maintenance strategies in concrete infrastructure. Full article
(This article belongs to the Section Building Materials, and Repair & Renovation)
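
For readers unfamiliar with composite kernels and automatic relevance determination (ARD) in Gaussian Process Regression, the scikit-learn sketch below shows the general pattern on synthetic data: one RBF length scale per input dimension, a noise term, and probabilistic predictions. It does not reproduce the paper's GPR-OptCorrosion kernel, features, or dataset.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel, ConstantKernel as C
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)
X = rng.random((180, 15))                                  # 180 specimens x 15 synthetic predictors
y = 5 * X[:, 0] - 3 * X[:, 3] + rng.normal(0, 0.3, 180)    # synthetic corrosion-current target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# ARD: one RBF length scale per feature; small learned length scales mark influential inputs.
kernel = C(1.0) * RBF(length_scale=np.ones(X.shape[1])) + WhiteKernel(noise_level=0.1)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0).fit(X_tr, y_tr)

mean, std = gpr.predict(X_te, return_std=True)             # predictive mean and uncertainty
print("R2:", round(r2_score(y_te, mean), 3), "| mean predictive std:", round(float(std.mean()), 3))
print("learned ARD length scales:", gpr.kernel_.k1.k2.length_scale.round(2))
```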

18 pages, 1760 KiB  
Article
Integrating 68Ga-PSMA-11 PET/CT with Clinical Risk Factors for Enhanced Prostate Cancer Progression Prediction
by Joanna M. Wybranska, Lorenz Pieper, Christian Wybranski, Philipp Genseke, Jan Wuestemann, Julian Varghese, Michael C. Kreissl and Jakub Mitura
Cancers 2025, 17(14), 2285; https://doi.org/10.3390/cancers17142285 - 9 Jul 2025
Abstract
Background/Objectives: This study evaluates whether combining 68Ga-PSMA-11-PET/CT derived imaging biomarkers with clinical risk factors improves the prediction of early biochemical recurrence (eBCR) or clinical progress in patients with high-risk prostate cancer (PCa) after primary treatment, using machine learning (ML) models. Methods: We analyzed data from 93 high-risk PCa patients who underwent 68Ga-PSMA-11 PET/CT and received primary treatment at a single center. Two predictive models were developed: a logistic regression (LR) model and an ML derived probabilistic graphical model (PGM) based on a naïve Bayes framework. Both models were compared against each other and against the CAPRA risk score. The models’ input variables were selected based on statistical analysis and domain expertise including a literature review and expert input. A decision tree was derived from the PGM to translate its probabilistic reasoning into a transparent classifier. Results: The five key input variables were as follows: binarized CAPRA score, maximal intraprostatic PSMA uptake intensity (SUVmax), presence of bone metastases, nodal involvement at common iliac bifurcation, and seminal vesicle infiltration. The PGM achieved superior predictive performance with a balanced accuracy of 0.73, sensitivity of 0.60, and specificity of 0.86, substantially outperforming both the LR (balanced accuracy: 0.50, sensitivity: 0.00, specificity: 1.00) and CAPRA (balanced accuracy: 0.59, sensitivity: 0.20, specificity: 0.99). The decision tree provided an explainable classifier with CAPRA as a primary branch node, followed by SUVmax and specific PET-detected tumor sites. Conclusions: Integrating 68Ga-PSMA-11 imaging biomarkers with clinical parameters, such as CAPRA, significantly improves models to predict progression in patients with high-risk PCa undergoing primary treatment. The PGM offers superior balanced accuracy and enables risk stratification that may guide personalized treatment decisions. Full article
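
The paper's probabilistic graphical model is a naïve Bayes formulation; as a rough stand-in, the sketch below compares logistic regression with a naïve Bayes classifier on synthetic versions of the five input variables and reports balanced accuracy, sensitivity, and specificity. The feature distributions and labels are invented purely for illustration and do not reflect the cohort of 93 patients.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import balanced_accuracy_score, confusion_matrix

rng = np.random.default_rng(2)
n = 93
X = np.column_stack([
    rng.integers(0, 2, n),        # binarized CAPRA score
    rng.lognormal(2.0, 0.5, n),   # intraprostatic SUVmax
    rng.integers(0, 2, n),        # bone metastases on PET
    rng.integers(0, 2, n),        # nodal involvement at the common iliac bifurcation
    rng.integers(0, 2, n),        # seminal vesicle infiltration
])
y = rng.integers(0, 2, n)         # synthetic eBCR / progression label

for name, clf in [("logistic regression", LogisticRegression(max_iter=1000)),
                  ("naive Bayes        ", GaussianNB())]:
    pred = cross_val_predict(clf, X, y, cv=5)
    tn, fp, fn, tp = confusion_matrix(y, pred, labels=[0, 1]).ravel()
    print(name, "| balanced acc:", round(balanced_accuracy_score(y, pred), 2),
          "| sens:", round(tp / (tp + fn), 2), "| spec:", round(tn / (tn + fp), 2))
```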

32 pages, 6788 KiB  
Article
Knee Osteoarthritis Detection and Classification Using Autoencoders and Extreme Learning Machines
by Jarrar Amjad, Muhammad Zaheer Sajid, Ammar Amjad, Muhammad Fareed Hamid, Ayman Youssef and Muhammad Irfan Sharif
AI 2025, 6(7), 151; https://doi.org/10.3390/ai6070151 - 8 Jul 2025
Abstract
Background/Objectives: Knee osteoarthritis (KOA) is a prevalent disorder affecting both older adults and younger individuals, leading to compromised joint function and mobility. Early and accurate detection is critical for effective intervention, as treatment options become increasingly limited as the disease progresses. Traditional diagnostic methods rely heavily on the expertise of physicians and are susceptible to errors. The demand for utilizing deep learning models in order to automate and improve the accuracy of KOA image classification has been increasing. In this research, a unique deep learning model is presented that employs autoencoders as the primary mechanism for feature extraction, providing a robust solution for KOA classification. Methods: The proposed model differentiates between KOA-positive and KOA-negative images and categorizes the disease into its primary severity levels. Levels of severity range from “healthy knees” (0) to “severe KOA” (4). Symptoms range from typical joint structures to significant joint damage, such as bone spur growth, joint space narrowing, and bone deformation. Two experiments were conducted using different datasets to validate the efficacy of the proposed model. Results: The first experiment used the autoencoder for feature extraction and classification, which reported an accuracy of 96.68%. Another experiment using autoencoders for feature extraction and Extreme Learning Machines for actual classification resulted in an even higher accuracy value of 98.6%. To test the generalizability of the Knee-DNS system, we utilized the Butterfly iQ+ IoT device for image acquisition and Google Colab’s cloud computing services for data processing. Conclusions: This work represents a pioneering application of autoencoder-based deep learning models in the domain of KOA classification, achieving remarkable accuracy and robustness. Full article
(This article belongs to the Special Issue AI in Bio and Healthcare Informatics)
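
As a rough illustration of the feature-extraction-plus-ELM pipeline, the sketch below trains an Extreme Learning Machine (random hidden projection followed by a closed-form ridge readout) on stand-in features of the kind an autoencoder bottleneck would produce. The data, dimensions, and hyperparameters are invented; this is not the Knee-DNS system.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(3)
Z = rng.normal(size=(1000, 64))            # stand-in autoencoder bottleneck features of knee X-rays
y = rng.integers(0, 5, size=1000)          # KOA severity grades 0 ("healthy") to 4 ("severe")
Z_tr, Z_te, y_tr, y_te = train_test_split(Z, y, random_state=0)

# Extreme Learning Machine: fixed random hidden layer + linear readout solved in closed form.
n_hidden = 512
W = rng.normal(size=(Z.shape[1], n_hidden))
b = rng.normal(size=n_hidden)

def hidden(Z):
    return np.tanh(Z @ W + b)

readout = Ridge(alpha=1.0).fit(hidden(Z_tr), np.eye(5)[y_tr])   # one-hot targets
pred = readout.predict(hidden(Z_te)).argmax(axis=1)
print("accuracy:", accuracy_score(y_te, pred))
```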

27 pages, 13752 KiB  
Article
Robust Watermarking of Tiny Neural Networks by Fine-Tuning and Post-Training Approaches
by Riccardo Adorante, Alessandro Carra, Marco Lattuada and Danilo Pietro Pau
Symmetry 2025, 17(7), 1094; https://doi.org/10.3390/sym17071094 - 8 Jul 2025
Abstract
Because neural networks pervade many industrial domains and are increasingly complex and accurate, the trained models themselves have become valuable intellectual properties. Developing highly accurate models demands increasingly higher investments of time, capital, and expertise. Many of these models are commonly deployed in cloud services and on resource-constrained edge devices. Consequently, safeguarding them is critically important. Neural network watermarking offers a practical solution to address this need by embedding a unique signature, either as a hidden bit-string or as a distinctive response to specially crafted “trigger” inputs. This allows owners to subsequently prove model ownership even if an adversary attempts to remove the watermark through attacks. In this manuscript, we adapt three state-of-the-art watermarking methods to “tiny” neural networks deployed on edge platforms by exploiting symmetry-related properties that ensure robustness and efficiency. In the context of machine learning, “tiny” is broadly used as a term referring to artificial intelligence techniques deployed in low-energy systems in the mW range and below, e.g., sensors and microcontrollers. We evaluate the robustness of the selected techniques by simulating attacks aimed at erasing the watermark while preserving the model’s original performances. The results before and after attacks demonstrate the effectiveness of these watermarking schemes in protecting neural network intellectual property without degrading the original accuracy. Full article
(This article belongs to the Section Computer)
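
One of the watermarking families the paper adapts relies on trigger inputs that elicit owner-chosen outputs. The PyTorch sketch below shows only the verification side on an untrained toy model; the architecture, trigger set, and decision threshold are placeholders rather than the paper's schemes.

```python
import torch
import torch.nn as nn

# Tiny classifier standing in for an edge-deployed model (illustrative only).
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))

# Trigger set: out-of-distribution inputs paired with owner-chosen target labels.
# After watermark embedding (fine-tuning on these pairs), the owner's model should
# reproduce the target labels; an unrelated model should not.
trigger_inputs = torch.rand(32, 1, 28, 28)
trigger_labels = torch.randint(0, 10, (32,))

@torch.no_grad()
def watermark_match_rate(model, inputs, labels):
    preds = model(inputs).argmax(dim=1)
    return (preds == labels).float().mean().item()

rate = watermark_match_rate(model, trigger_inputs, trigger_labels)
print("trigger agreement:", rate, "-> claim ownership only above a preset threshold, e.g. 0.9")
```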

28 pages, 1987 KiB  
Article
LLM-as-a-Judge Approaches as Proxies for Mathematical Coherence in Narrative Extraction
by Brian Keith
Electronics 2025, 14(13), 2735; https://doi.org/10.3390/electronics14132735 - 7 Jul 2025
Abstract
Evaluating the coherence of narrative sequences extracted from large document collections is crucial for applications in information retrieval and knowledge discovery. While mathematical coherence metrics based on embedding similarities provide objective measures, they require substantial computational resources and domain expertise to interpret. We propose using large language models (LLMs) as judges to evaluate narrative coherence, demonstrating that their assessments correlate with mathematical coherence metrics. Through experiments on two data sets—news articles about Cuban protests and scientific papers from visualization conferences—we show that the LLM judges achieve Pearson correlations up to 0.65 with mathematical coherence while maintaining high inter-rater reliability (ICC > 0.92). The simplest evaluation approach achieves performance comparable to the more complex approaches, even outperforming them on focused data sets and retaining over 90% of their performance on the more diverse data sets, while using fewer computational resources. Our findings indicate that LLM-as-a-judge approaches are effective as a proxy for mathematical coherence in the context of narrative extraction evaluation.
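
A common embedding-based coherence measure of the kind the judges are compared against is the average cosine similarity between consecutive items in an extracted narrative, correlated with judge ratings via Pearson's r. The sketch below uses random embeddings and random judge scores as placeholders, so the printed correlation is meaningless; it only shows the computation.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics.pairwise import cosine_similarity

def narrative_coherence(embeddings: np.ndarray) -> float:
    """Mean cosine similarity of consecutive documents in an extracted narrative."""
    sims = [cosine_similarity(embeddings[i:i + 1], embeddings[i + 1:i + 2])[0, 0]
            for i in range(len(embeddings) - 1)]
    return float(np.mean(sims))

rng = np.random.default_rng(4)
narratives = [rng.normal(size=(6, 384)) for _ in range(20)]   # stand-ins for sentence-embedding matrices
math_scores = [narrative_coherence(e) for e in narratives]
judge_scores = rng.uniform(1, 5, size=20)                     # stand-in 1-5 LLM judge ratings

r, p = pearsonr(math_scores, judge_scores)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
```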

15 pages, 4430 KiB  
Article
A Comprehensive Approach to Instruction Tuning for Qwen2.5: Data Selection, Domain Interaction, and Training Protocols
by Xungang Gu, Mengqi Wang, Yangjie Tian, Ning Li, Jiaze Sun, Jingfang Xu, He Zhang, Ruohua Xu and Ming Liu
Computers 2025, 14(7), 264; https://doi.org/10.3390/computers14070264 - 5 Jul 2025
Abstract
Instruction tuning plays a pivotal role in aligning large language models with diverse tasks, yet its effectiveness hinges on the interplay of data quality, domain composition, and training strategies. This study moves beyond qualitative assessment to systematically quantify these factors through extensive experiments on data selection, data mixture, and training protocols. By quantifying performance trade-offs, we demonstrate that the implicit method SuperFiltering achieves an optimal balance, whereas explicit filters can induce capability conflicts. A fine-grained analysis of cross-domain interactions quantifies a near-linear competition between code and math, while showing that tool use data exhibits minimal interference. To mitigate these measured conflicts, we compare multi-task, sequential, and multi-stage training strategies, revealing that multi-stage training significantly reduces Conflict Rates while preserving domain expertise. Our findings culminate in a unified framework for optimizing instruction tuning, offering actionable, data-driven guidelines for balancing multi-domain performance and enhancing model generalization, thus advancing the field by providing a methodology to move from intuition to systematic optimization. Full article
(This article belongs to the Special Issue Natural Language Processing (NLP) and Large Language Modelling)

22 pages, 3183 KiB  
Article
Surrogate Modeling for Building Design: Energy and Cost Prediction Compared to Simulation-Based Methods
by Navid Shirzadi, Dominic Lau and Meli Stylianou
Buildings 2025, 15(13), 2361; https://doi.org/10.3390/buildings15132361 - 5 Jul 2025
Abstract
Designing energy-efficient buildings is essential for reducing global energy consumption and carbon emissions. However, traditional physics-based simulation models require substantial computational resources, detailed input data, and domain expertise. To address these limitations, this study investigates the use of three machine learning-based surrogate models—Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Multilayer Perceptron (MLP)—trained on a synthetic dataset of 2000 EnergyPlus-simulated building design scenarios to predict both energy use intensity (EUI) and cost estimates for midrise apartment buildings in the Toronto area. All three models exhibit strong predictive performance, with R2 values exceeding 0.9 for both EUI and cost. XGBoost achieves the best performance in cost prediction on the testing dataset with a root mean squared error (RMSE) of 5.13 CAD/m2, while MLP outperforms others in EUI prediction with a testing RMSE of 0.002 GJ/m2. In terms of computational efficiency, the surrogate models significantly outperform a physics-based simulation model, with MLP running approximately 340 times faster and XGBoost and RF achieving over 200 times speedup. This study also examines the effect of training dataset size on model performance, identifying a point of diminishing returns where further increases in data size yield minimal accuracy gains but substantially higher training times. To enhance model interpretability, SHapley Additive exPlanations (SHAP) analysis is used to quantify feature importance, revealing how different model types prioritize design parameters. A parametric design configuration analysis further evaluates the models’ sensitivity to changes in building envelope features. Overall, the findings demonstrate that machine learning-based surrogate models can serve as fast, accurate, and interpretable alternatives to traditional simulation methods, supporting efficient decision-making during early-stage building design. Full article
(This article belongs to the Section Building Energy, Physics, Environment, and Systems)
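
To make the surrogate-modeling workflow concrete, the sketch below trains an XGBoost cost surrogate and an MLP EUI surrogate on synthetic design samples and inspects feature importance with SHAP. The feature set, targets, and hyperparameters are assumptions for illustration, not the study's 2000 EnergyPlus-simulated scenarios.

```python
import numpy as np
import shap
from xgboost import XGBRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error

rng = np.random.default_rng(5)
X = rng.random((2000, 8))                                   # stand-in envelope/design parameters
eui = 0.5 - 0.3 * X[:, 1] + 0.1 * X[:, 0] + rng.normal(0, 0.01, 2000)   # synthetic EUI (GJ/m2)
cost = 900 + 400 * X[:, 1] + rng.normal(0, 10, 2000)                    # synthetic cost (CAD/m2)

X_tr, X_te, e_tr, e_te, c_tr, c_te = train_test_split(X, eui, cost, random_state=0)

xgb = XGBRegressor(n_estimators=300, max_depth=4).fit(X_tr, c_tr)                    # cost surrogate
mlp = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000).fit(X_tr, e_tr)       # EUI surrogate

cost_pred = xgb.predict(X_te)
print("cost R2:", round(r2_score(c_te, cost_pred), 3),
      "| RMSE:", round(mean_squared_error(c_te, cost_pred) ** 0.5, 2))
print("EUI R2: ", round(r2_score(e_te, mlp.predict(X_te)), 3))

# Feature importance for the tree-based surrogate via SHAP
shap_values = shap.TreeExplainer(xgb).shap_values(X_te)
print("mean |SHAP| per feature:", np.abs(shap_values).mean(axis=0).round(1))
```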
