MDPI - Publisher of Open Access Journals

50 pages, 1686 KB

Open Access Review

Data Foundations for Medical AI: Provenance, Reliability and Limitations of Russian Clinical NLP Resources

by Arsenii Litvinov, Lev Malishevskii, Evgeny Karpulevich, Iaroslav Bespalov, Yaroslav Nedumov, Sergey Zhdanov, Ivan Oseledets, Evgeniy Shlyakhto and Arutyun Avetisyan

Informatics 2026, 13(3), 45; https://doi.org/10.3390/informatics13030045 - 20 Mar 2026

Abstract

Russian-language resources for medical natural language processing (NLP) are expanding rapidly; however, their fragmentation, uneven curation, and limited clinical reliability hinder the development of safe machine learning systems for prognosis, prevention, and precision medicine. We provide the first systematic survey of Russian medical NLP datasets and analyze their suitability for clinically meaningful tasks as defined by the MedHELM taxonomy. We additionally perform expert clinical validation of three representative public corpora—RuMedPrimeData (real outpatient notes), MedSyn (synthetic clinical notes), and RuMedNLI (translated natural language inference)—assessing clinical plausibility, diagnosis accuracy, and logical consistency. Experts identified substantial reliability issues: across randomly sampled subsets of each corpus, only approximately 20% of RuMedPrimeData records, fewer than 15% of MedSyn records, and approximately 55% of RuMedNLI pairs met essential quality criteria, which can hinder downstream ML systems built on these data. To support robust applications—ranging from medical chatbots and triage assistants to predictive and preventive models—we outline practical requirements for high-quality datasets: coordinated, expert-validated, machine-readable corpora aligned with clinical guidelines and insurance logic, standardized de-identification, and transparent provenance. Strengthening these data foundations will enable the development of reliable, reproducible, and clinically relevant AI systems suitable for real-world healthcare applications. Full article

(This article belongs to the Special Issue From Data to Evidence: Transformative AI for Real-World Data)

23 pages, 458 KB

Open Access Article

Automated Generation and Evaluation of Interactive-Fiction Serious Games with Open-Weight LLMs

by Finn Rogosch and Andreas Schrader

Appl. Sci. 2026, 16(6), 2932; https://doi.org/10.3390/app16062932 - 18 Mar 2026

Viewed by 46

Abstract

This work investigates whether open-weight large language models can automatically generate runnable and educationally faithful serious games in a constrained, text-only interactive-fiction (IF) setting. The target games are station-based single-player serious games for knowledge assessment, implemented as IF in a structured, machine-readable text format, and used here as a first step towards later ambient scenarios. A fully automated pipeline called SINE (Serious Interactive Narrative Engine) is evaluated with four prompting strategies, grammar-guided decoding, deterministic validation, and a repair agent. Across a staged evaluation with 240 seeds and increasing complexity, finalist configurations reach success rates between roughly 68% and 86% on the joint criterion of compilation, playability, and learning-goal fidelity. Repair iterations proved central to robustness, whereas grammar masking on top of reasoning prompts did not consistently improve outcomes. The study provides a reproducible benchmark setup, open artifacts, and a constrained generation pipeline as a basis for later extensions toward broader serious game scenarios. Full article

(This article belongs to the Special Issue Artificial Intelligence in Education: Latest Advances and Prospects)

16 pages, 6152 KB

Open Access Article

DisasterReliefGPT: Multimodal AI for Autonomous Disaster Impact Assessment and Crisis Communication

by Lekshmi Chandrika Reghunath, Athikkal Sudhir Abhishek, Arjun Changat, Arjun Unnikrishnan, Ayush Kumar Rai, Christian Napoli and Cristian Randieri

Technologies 2026, 14(3), 179; https://doi.org/10.3390/technologies14030179 - 16 Mar 2026

Viewed by 164

Abstract

This work proposes DisasterReliefGPT, a multimodal AI system that automates crisis communication and post-disaster assessment. The system integrates three tightly coupled components: a vision module called DisasterOCS for structural damage detection in satellite images, a Large Vision–Language Model (LVLM) for enhanced visual understanding and contextual reasoning, and a Large Language Model (LLM) that produces detailed, clear assessment reports. DisasterOCS relies on a ResNet34-based encoder with partial weight sharing and event-specific decoders, coupled with a custom MultiCrossEntropyDiceLoss function for multi-class segmentation on pre- and post-disaster image pairs. On the benchmark xBD dataset, the system reaches an F1-damage score of 78.8%, identifying destroyed buildings with 81.3% precision, while undamaged structures are identified at 90.7%. By combining these components, emergency responders can immediately provide reliable, readable damage assessments that directly support urgent decision-making.

11 pages, 1275 KB

Open Access Article

Optical Coherence Tomography (OCT) Evaluation of Thermal Tissue Alterations After Diode Laser Excision of Oral Leukoplakia (OL)

by Alessio Gambino, Alessandro Magliano, Giorgia El Haddad, Marta Bezzi, Adriana Cafaro, Dora Karimi, Roberto Broccoletti and Paolo Giacomo Arduino

Dent. J. 2026, 14(3), 168; https://doi.org/10.3390/dj14030168 - 12 Mar 2026

Viewed by 135

Abstract

Objectives: Oral leukoplakia (OL) is the most prevalent oral potentially malignant disorder and requires accurate diagnosis, safe excision, and reliable margin evaluation to minimize recurrence and malignant transformation. Diode laser excision is increasingly adopted due to its precision and favorable clinical outcomes; however, laser-induced thermal effects at surgical margins raise concerns regarding tissue integrity and histopathological reliability. This study aimed to evaluate optical coherence tomography (OCT) as a real-time, high-resolution, non-invasive imaging modality for assessing peri-incisional thermal effects during diode laser excision of non-dysplastic OL. The primary objective was to validate OCT for ultrastructural and morphometric tissue analysis while ensuring preservation of diagnostic readability. Methods: A single-center observational case series was conducted at the University of Turin. Thirty patients with clinically and histopathologically confirmed oral leukoplakia without epithelial dysplasia were enrolled and allocated to two groups: 15 lesions excised using a 980 nm diode laser in continuous-wave contact mode (laser group) and 15 lesions removed by conventional scalpel biopsy (control group). Laser excisions were performed with standardized parameters and a circumferential safety margin of 5 mm. Immediately after excision, specimens underwent ex vivo spectral-domain OCT (SD-OCT) imaging to evaluate the epithelial and connective tissue microarchitecture at surgical margins and central lesion areas. OCT acquisition sites were precisely correlated with histological sections. Quantitative OCT measurements of epithelial thickness, lamina propria thickness, and laser-induced thermal alterations were compared with corresponding histological findings. 
Results: OCT consistently provided high-resolution visualization of oral mucosal microarchitecture in both groups, allowing clear identification of epithelial stratification, basement membrane continuity, and lamina propria organization. In the laser group, OCT detected superficial optical alterations at the surgical margins consistent with laser-induced thermal effects, while deeper tissue layers remained structurally readable. Histological analysis revealed mean epithelial and connective tissue thermal alterations of 288.9 μm and 430.3 μm, respectively. OCT-derived measurements showed high concordance with histology, with an overall agreement of 88.5% and no statistically significant differences between OCT and histological assessments. Importantly, laser-induced thermal effects did not impair definitive histopathological diagnosis in any specimen. Comparison with the control group confirmed preserved tissue architecture in scalpel-excised samples and highlighted OCT sensitivity in detecting laser-related structural remodeling. Conclusions: OCT proved to be a reliable, non-invasive imaging technique for real-time assessment of diode laser-induced thermal effects during OL excision. The technique accurately delineated tissue microstructure and surgical margins without compromising histopathological interpretation. Integration of OCT into the laser-assisted management of oral potentially malignant disorders may enhance surgical precision, optimize margin control, reduce diagnostic uncertainty, and support individualized follow-up strategies. Full article

(This article belongs to the Special Issue Optical Coherence Tomography (OCT) in Dentistry)

13 pages, 233 KB

Open Access Article

Quality and Usability of Prostate Cancer Information Generated by Artificial Intelligence Chatbots: A Comparative Analysis

by Abdullah Al-Khanaty, Jordan Santucci, David Hennes, Niranjan Sathianathen, Carlos Delgado, Karan Sharma, Eoin Dinneen, Kieran Sandhu, David Chen, Renu Eapen, Daniel Moon, Gregory Jack, Jeremy Goad, Shankar Siva, Muhammad Ali, Damien Bolton, Nathan Lawrentschuk, Declan G. Murphy and Marlon Perera

Cancers 2026, 18(6), 906; https://doi.org/10.3390/cancers18060906 - 11 Mar 2026

Viewed by 171

Abstract

Background: Artificial intelligence chatbots are increasingly used by patients to obtain health information, including for prostate cancer. While these platforms offer accessible and conversational responses, concerns remain regarding the quality, usability, and clinical relevance of AI-generated content. This study comparatively evaluated patient-directed prostate cancer information generated by commonly used AI chatbots. Methods: Standardised prostate cancer-related prompts were developed using Google Trends and authoritative healthcare resources. Identical queries were submitted to five publicly accessible AI chatbots: ChatGPT 5.2, Google Gemini, Claude AI, Microsoft Copilot, and Perplexity. Responses were independently assessed by two blinded reviewers using the DISCERN instrument for information quality and the Patient Education Materials Assessment Tool for printable materials (PEMAT-P) for understandability and actionability. Inter-rater reliability was assessed using intraclass correlation coefficients (ICCs). Readability was evaluated using the Flesch–Kincaid Reading Ease score. Descriptive statistics were used for comparative and pooled analyses. Results: Overall information quality was moderate, with a pooled median (interquartile range [IQR]) DISCERN score of 56.5 (53.0–61.0). Higher mean DISCERN scores were observed for ChatGPT 5.2 and Microsoft Copilot, whereas lower scores were observed for Claude and Perplexity. PEMAT-P understandability was consistently high across platforms, with a pooled median (IQR) score of 91.7% (83.3–91.7%). In contrast, PEMAT-P actionability was uniformly poor, with a pooled median (IQR) score of 0% (0–0%). Readability analysis demonstrated moderate complexity, with a pooled median (IQR) Flesch–Kincaid Reading Ease score of 50.4 (49.2–52.5) and a median word count of 666 (657–1022). Inter-rater reliability was good for PEMAT understandability (ICC 0.841) and moderate for DISCERN (ICC 0.712). 
Conclusions: AI chatbots provide highly understandable but only moderate-quality patient-directed prostate cancer information, with a consistent lack of actionable guidance. Although content quality varied across platforms, significant limitations remain in evidence transparency and practical patient support. Future development should prioritise integration of evidence-based resources and actionable decision-support tools to enhance the role of AI chatbots in prostate cancer education.
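The readability metric reported in this abstract can be illustrated with a minimal sketch of the standard Flesch Reading Ease formula, 206.835 − 1.015 × (words / sentences) − 84.6 × (syllables / words). This is not the authors' tooling: the function names are hypothetical, and the syllable counter is a rough vowel-group heuristic assumed here for illustration, so scores will differ somewhat from dictionary-based tools.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count runs of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing silent "e" as non-syllabic when another vowel run exists.
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
```

Calling `flesch_reading_ease(response_text)` on each chatbot response and taking the median across responses would mirror the pooled summary described above, under the stated heuristic.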

(This article belongs to the Special Issue Artificial Intelligence in Urological Oncology: Applications in Imaging, Prognostics, and Precision Disease Management)

11 pages, 540 KB

Open Access Article

Evaluating Large Language Models for Diagnostic Accuracy and Health Information Quality in Oral Mucosal Diseases

by Melisa Iacob, Ayham Qawas, Ramesh Balasubramaniam, Agnieszka M. Frydrych and Omar Kujan

J. Pers. Med. 2026, 16(3), 129; https://doi.org/10.3390/jpm16030129 - 27 Feb 2026

Viewed by 314

Abstract

Background: Multimodal large language model (MLLM)-based systems capable of generating health-related information and diagnostic suggestions are increasingly used for health information retrieval; however, their accuracy, readability, and quality in oral healthcare remain unclear. Oral mucosal diseases comprise a heterogeneous group of conditions affecting the oral lining, ranging from benign and reactive lesions to potentially malignant and malignant disorders. Objective: This study evaluated and compared the diagnostic performance, readability, and information quality of MLLMs with traditional search engines included as comparator platforms, in diagnosing oral mucosal diseases. Methods: A cross-sectional observational study was conducted using 100 validated oral mucosal case scenarios representing benign, malignant, potentially malignant, infectious, and reactive oral lesions. Each scenario was entered into ChatGPT 3.5, ChatGPT 4.5 (Plus), Microsoft Copilot (smart), Grok (xAI), Claude (Sonnet 4.5), DeepSeek v3.1, and search engines Google, Bing, and Yahoo. Diagnostic accuracy, Positive Predictive Value (PPV), and Negative Predictive Value (NPV) were compared against reference diagnoses. Information quality was assessed using the DISCERN tool, and readability was evaluated using Flesch–Kincaid Reading Ease (FRES) and Grade Level (FKGL) scores. Statistical analyses included Cochran’s Q and McNemar tests (p < 0.05). Results: ChatGPT 4.5 demonstrated the highest overall diagnostic accuracy (88.5%), PPV (92%), and NPV (88%), followed by DeepSeek v3.1 and Claude (Sonnet 4.5). Traditional search engines performed poorly (accuracy 18–55%). MLLMs achieved higher DISCERN scores (2.84–3.20) but lower readability (FKGL = 11–14) than search engines (FKGL = 6–7). No platform met the recommended sixth-grade reading level for consumer health information. 
Conclusions: MLLMs, particularly ChatGPT Plus (GPT-4.5), outperformed conventional search engines in diagnostic accuracy and content quality but produced complex, less-readable text. Future AI development should prioritise improving clinical accuracy alongside readability and transparency to ensure equitable access to reliable oral health information.
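A minimal sketch of how accuracy, PPV, and NPV follow from confusion counts, assuming each platform's output is scored as a binary decision against the reference diagnosis. The abstract does not describe the authors' exact computation, so the class name, function names, and the counts in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion counts against reference diagnoses (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def accuracy(c: ConfusionCounts) -> float:
    """Share of all cases classified correctly."""
    return _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)


def ppv(c: ConfusionCounts) -> float:
    """Positive Predictive Value: share of positive calls that are correct."""
    return _ratio(c.tp, c.tp + c.fp)


def npv(c: ConfusionCounts) -> float:
    """Negative Predictive Value: share of negative calls that are correct."""
    return _ratio(c.tn, c.tn + c.fn)
```

For example, `ConfusionCounts(tp=40, fp=10, tn=45, fn=5)` (made-up numbers, not study data) can be passed to each function to obtain the three metrics for one platform.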

(This article belongs to the Special Issue Application of Artificial Intelligence in Personalized Medicine: Diagnosis and Treatment)

3 pages, 157 KB

Open Access Data Descriptor

Normative Physical Fitness Profiles and Sex Differences in University Students of Sport Sciences: An Open Dataset of Anthropometrics, Flexibility, Strength, and Jump Performance

by Julio Martín-Ruiz and Laura Ruiz-Sanchis

Data 2026, 11(2), 34; https://doi.org/10.3390/data11020034 - 7 Feb 2026

Viewed by 393

Abstract

This Data Descriptor provides an open, anonymized dataset describing anthropometric and physical fitness outcomes in undergraduate students enrolled in a Physical Activity and Sport Sciences degree program. The dataset included 156 participants (28 females and 128 males) and reported sex, age, body mass, stature, and body mass index, alongside standardized field-based tests covering flexibility, muscular endurance, strength, and jump performance. Hip flexibility was assessed using the Thomas test on both sides. Trunk extensor endurance was measured using the Biering–Sørensen test, and upper-body strength–endurance was assessed using a dead-hang test. Upper limb strength was recorded as elbow flexion strength. Lower limb power was evaluated using vertical jump tests, including Abalakov, squat jump, and countermovement jump, and a derived indicator (IE) was provided to facilitate comparisons across jump modalities. The data are distributed as a machine-readable CSV file accompanied by a detailed data dictionary describing the variables, units, and missingness. The dataset is intended to support the reproducible reporting of normative fitness profiles in sports science students, facilitate teaching and benchmarking in exercise science contexts, and enable secondary analyses exploring associations between anthropometry and physical performance. For reproducible inferential comparisons, users may apply Welch’s two-sample t-test for sex-based differences. Full article
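The Welch's two-sample t-test mentioned above can be sketched in a few lines. This is an illustrative standard-library implementation, not code shipped with the dataset; the function name and the sample values in the usage note are hypothetical. It returns the t statistic and the Welch–Satterthwaite degrees of freedom only; obtaining a p-value additionally requires the t-distribution CDF.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_test(a: list[float], b: list[float]) -> tuple[float, float]:
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n1, n2 = len(a), len(b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    # Per-group sample variance divided by group size (unequal variances allowed).
    q1 = variance(a) / n1
    q2 = variance(b) / n2
    if q1 + q2 == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(a) - mean(b)) / sqrt(q1 + q2)
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return t, df
```

With the dataset's unbalanced groups (28 females, 128 males), a call such as `welch_t_test(female_values, male_values)` on one column avoids the equal-variance assumption of Student's test. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and also reports a p-value.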

(This article belongs to the Special Issue Big Data and Data-Driven Research in Sports)

19 pages, 694 KB

Open Access Article

A Spanish Language Proficiency Dataset for AI Evaluation

by Anselmo Peñas, Álvaro Rodrigo, Javier Fruns-Jiménez, Inés Soria-Pastor, Sergio Moreno-Álvarez, Alberto Pérez and Julio Reyes-Montesinos

Information 2026, 17(2), 159; https://doi.org/10.3390/info17020159 - 5 Feb 2026

Viewed by 398

Abstract

Benchmarking Spanish reading comprehension remains challenging due to the scarcity of proficiency-calibrated resources grounded in authentic human assessments. We introduce IC-UNED-RC-ES, a benchmark comprising more than 6000 items derived from Instituto Cervantes examinations, converted to a machine-readable format while preserving exam structure, proficiency levels, and scoring criteria. Unlike many existing resources, IC-UNED-RC-ES includes a diverse set of exercise formats, combining common multiple-choice questions with new formats such as matching and fill-in-the-gap, which support a broader assessment of reading skills. The benchmark supports evaluation at both the item and exam levels and includes an exercise taxonomy with category-specific metrics. Baseline results with current AI systems reveal a strong difficulty effect (a 15-point drop from lower to advanced levels) and substantial variation across exercise types, with inference- and discourse-heavy categories reaching only 41%. IC-UNED-RC-ES provides a human-aligned, interpretable testbed for diagnosing strengths and weaknesses in Spanish reading comprehension and for tracking progress across model generations. Full article

(This article belongs to the Section Artificial Intelligence)

26 pages, 619 KB

Open Access Article

Benchmarking LLM-as-a-Judge Models for 5W1H Extraction Evaluation

by José Cassola-Bacallao, José Morales-Donaire, Paula Hernández-Montoya and Brian Keith-Norambuena

Electronics 2026, 15(3), 659; https://doi.org/10.3390/electronics15030659 - 3 Feb 2026

Viewed by 486

Abstract

Evaluating 5W1H (Who, What, When, Where, Why, and How) information extraction systems remains challenging, as traditional information retrieval metrics like ROUGE and BLEU fail to capture semantic accuracy and narrative coherence. The LLM-as-a-Judge paradigm offers a promising alternative, yet systematic comparisons of judge models for this task are lacking. This study benchmarks multiple large language models, including state-of-the-art models such as GPT, Claude, and Gemini, as evaluators of 5W1H extractions from Spanish news articles. We assess judge performance across six quality criteria: Factual Accuracy, Completeness, Relevance and Conciseness, Clarity and Readability, Faithfulness to Source, and Overall Coherence. Our analysis examines inter-judge agreement, score distribution patterns, criterion-level variance, and the relationship between evaluation quality and computational cost. Using two Spanish-language corpora (BASSE and FLARES), we identify which criteria exhibit consistent cross-model agreement and which prove most sensitive to judge selection. The main contribution of this work is providing the first systematic benchmark of LLM-as-a-Judge models for 5W1H extraction evaluation in Spanish, validated against expert journalistic judgment. Results reveal that all evaluated models achieve alignment levels above 90% across all metrics. Specifically, Claude Sonnet 4.5 emerges as the most accurate evaluator with a Global Judgment Acceptance Rate (JAR) of 99.79%. Furthermore, meta-evaluation with human experts demonstrates a substantial inter-annotator agreement of κ = 0.6739. Finally, we provide recommendations for judge model selection based on task requirements and resource constraints, contributing practical guidance for researchers implementing LLM-based evaluation pipelines for information extraction tasks.
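For readers unfamiliar with the κ statistic, the following is a minimal sketch of Cohen's kappa for two annotators. The abstract does not state which kappa variant was used (a multi-rater variant such as Fleiss' kappa is equally possible), so the two-rater form, the function name, and the labels in the usage note are assumptions made for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if the raters labelled independently at their own rates.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1 - expected)
```

A call such as `cohen_kappa(["accept", "accept", "reject", "reject"], ["accept", "reject", "reject", "reject"])` compares two hypothetical experts' accept/reject judgments of the same four items.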

(This article belongs to the Special Issue Multimodal Learning for Multimedia Content Analysis and Understanding)

16 pages, 795 KB

Open Access Article

Financial Information Quality Between Numerical Accuracy and Comprehensibility: Effects on Investment Decisions in the Context of the Bucharest Stock Exchange

by Daniela Mogîldea and Mihai Carp

Int. J. Financial Stud. 2026, 14(2), 34; https://doi.org/10.3390/ijfs14020034 - 3 Feb 2026

Viewed by 427

Abstract

The informational efficiency of stock prices is conditioned by the level of quality of financial reports, contributing to an accurate assessment of the company’s future performance. By approaching informational quality from two perspectives, we conducted an analysis of the impact of faithful representation and readability of annual reports on the reaction of the Romanian capital market, measured by annual stock returns (SR) and cumulative abnormal returns (CAR). The findings revealed an accentuated concern of investors regarding the faithful representation of the firm’s financial results (both at the time of financial statements’ publication and at the year-end) and a diminished significance of the comprehensibility level of financial information in the investment decision-making process. The annual reports of a sample of firms listed on the BSE between 2017 and 2023 have an increased level of linguistic complexity, which entails processing costs, and are intended for sophisticated users with financial expertise. Along with the specialized language, the extensive length of reports delays the incorporation of all information into the stock price, decreasing the informational efficiency of the market. This empirical study applies several indices to assess the readability and conciseness of financial information (FOG index, Flesch–Kincaid index, Flesch Reading Ease Score, and report length) and contributes to the expanding literature by providing a useful basis for future analysis of the influence of financial report quality on investors’ perceptions. Full article
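Of the readability indices listed above, the FOG index is the simplest to state. The sketch below applies the standard Gunning FOG formula, 0.4 × (words / sentences + 100 × complex words / words), to pre-computed counts. How the authors tokenised reports or identified complex words is not described in the abstract, so the function name and the convention that "complex" means three or more syllables are assumptions.

```python
def gunning_fog(total_words: int, total_sentences: int, complex_words: int) -> float:
    """Gunning FOG index from pre-computed counts.

    complex_words is assumed to count words of three or more syllables.
    The result approximates the years of schooling needed to read the text.
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    if not 0 <= complex_words <= total_words:
        raise ValueError("complex_words must be between 0 and total_words")
    average_sentence_length = total_words / total_sentences
    percent_complex = 100 * complex_words / total_words
    return 0.4 * (average_sentence_length + percent_complex)
```

Because both terms grow with longer sentences and more polysyllabic vocabulary, a long annual report written in specialised language will score high, which is consistent with the processing-cost argument in the abstract.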

(This article belongs to the Special Issue Accounting and Financial/Non-financial Reporting Developments)

13 pages, 413 KB

Open Access Article

Exploring the Balance Between Artificial Intelligence and Human Expertise in Shaping Breast Reconstruction Outcomes: A Comparative Reflection Study

by Ioan Constantin Pop, Maximilian Vlad Muntean, Vlad Alexandru Gata, Radu Alexandru Ilies, Delia Nicoara, Claudiu Ioan Filip, Vasile Pop and Patriciu Andrei Achimas-Cadariu

J. Clin. Med. 2026, 15(3), 1170; https://doi.org/10.3390/jcm15031170 - 2 Feb 2026

Viewed by 255

Abstract

Background/Objectives: Artificial intelligence (AI) has shown potential in patient education and integration into clinical decision support systems. However, its performance in counseling patients on breast reconstruction remains underexplored. This study compares AI-generated answers with expert surgeon responses to common patient questions (derived from clinical scenarios) in domains such as oncological justification, reconstructive options, and postoperative care. Methods: We conducted an observer-blinded study using five real-world clinical scenarios in oncologic and reconstructive breast surgery. Both ChatGPT-5 (October 2025 version) and a senior board-certified plastic surgeon responded to frequently asked questions, which were split into three domains: (1) oncological and surgical justification; (2) reconstruction options and outcomes; and (3) the postoperative period. The answers were evaluated by another senior plastic surgeon using a four-grade ordinal scoring system (1 = unsatisfactory, 4 = excellent), which assessed accuracy, completeness, safety, nuance, and alignment with current guidelines. Results: Across a total of 40 questions, the average AI response score was 3.1 ± 0.6. By domain, scores were lowest for oncological justification (2.8 ± 0.7) and higher for reconstruction options/outcomes and postoperative care (both 3.2 ± 0.4). No AI response was graded as unsatisfactory (score 1). Responses graded 4 (15%) were considered comprehensive, accurate, and patient-friendly. Conclusions: Overall, ChatGPT-5 provides satisfactory, readable, and medically accurate answers to basic patient questions on breast reconstruction, with some limitations in nuanced oncological justification.

(This article belongs to the Special Issue Plastic and Reconstructive Surgery: Clinical Advances and Future Opportunities)

18 pages, 436 KB

Open Access Article

Cross-Cultural Adaptation and Validation of the Simplified Diabetes Knowledge Test (Arabic Version) for Insulin-Dependent Diabetic Patients: A Cross-Sectional Study in Iraq

by Shaymaa Abdalwahed Abdulameer and Mohanad Naji Sahib

J. Clin. Med. 2026, 15(3), 1164; https://doi.org/10.3390/jcm15031164 - 2 Feb 2026

Viewed by 338

Abstract

Background/Objectives: Diabetes is a major metabolic disorder and a rapidly growing public health problem worldwide. The most effective way to reduce diabetic complications is adequate knowledge about the condition. Hence, the primary objective of this study was to evaluate the psychometric properties of the Simplified Diabetes Knowledge Test – Arabic version (SDKT-A) among Iraqi insulin-dependent diabetic patients. The secondary objectives were to assess the associated independent variables and to relate the risk of atherosclerosis and cardiovascular events, estimated with atherogenic indices and lipid ratios, to SDKT-A scores. Methods: A cross-sectional, descriptive study was conducted in primary healthcare clinics. The SDKT was translated into Arabic using forward–backward translation, reconciliation, and pilot testing. Thereafter, the psychometric properties of the SDKT-A were evaluated against several criteria. The atherogenic indices, namely Castelli risk indices I and II (CRI-I and CRI-II), triglyceride/HDL ratio, non-HDL-C ratio, atherogenic coefficient (AC), and triglyceride–total cholesterol–body weight index (TCBI), were calculated using specific formulas. Results: The SDKT-A questionnaire showed acceptable readability and validity. Cronbach’s alpha was 0.662 (95% confidence interval, 0.59–0.73). The Pearson correlation coefficient for test–retest reliability was 0.659. The item difficulty index for most items was between 0.237 and 0.877. The point biserial correlation values ranged from 0.028 to 0.535, with a Ferguson’s sigma value of 0.962. Content validation showed a significant content validity ratio (CVR) for most of the questions, ranging from 0.8 to 1. The content validity index (CVI) for the SDKT-A was 0.98, indicating good agreement between experts. In addition, exploratory factor analysis with promax rotation identified four domains for the final 20 items of the SDKT-A, which explained 41.83% of the total scale variance. The mean SDKT-A score was 11.09 ± 3.40. The total SDKT-A score was positively and significantly correlated with education level (r = 0.322, p < 0.01) and negatively and significantly correlated with glycemic control, age, CRI-I, CRI-II, triglyceride/HDL ratio, AC, non-HDL-C ratio, and TCBI. Furthermore, glycemic control (HbA1c) was positively and significantly correlated with the preventive measures factor (r = 0.175, p < 0.05) and negatively and significantly correlated with the lifestyle and modification factor (r = −0.169, p < 0.05), the diet and monitoring factor (r = −0.158, p < 0.05), and the awareness factor (r = −0.149, p < 0.05). Conclusions: This study showed acceptable psychometric properties for the SDKT-A and low levels of diabetes knowledge in the sample population. Comprehensive and interactive educational programs on lifestyle and modification, diet and monitoring, and awareness are warranted in primary healthcare centers in Iraq. Full article
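
For readers unfamiliar with the internal-consistency statistic reported in this abstract, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the toy response matrix are illustrative assumptions; they are not taken from the study, and the abstract does not describe the software the authors used.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    `scores` is a list of rows, one per respondent, each holding k item scores.
    Hypothetical helper for illustration; not the study's own code.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(col) for col in zip(*scores)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data (made up): four respondents, three dichotomous items.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
alpha = cronbach_alpha(toy)
```

Because the same variance estimator is used in the numerator and the denominator, the result does not depend on whether sample or population variance is chosen.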

(This article belongs to the Section Endocrinology & Metabolism)

33 pages, 7521 KB

Open Access Article

Convergent Radiation Algorithm for Multi-Attribute Group Decision-Making with Circular Intuitionistic Fuzzy Numbers

by Xiqi Li, Junda Qiu, Jiali Tang, Jie Zhang, Qi Liu, Taiji Li and Yongjie Guo

Axioms 2026, 15(2), 89; https://doi.org/10.3390/axioms15020089 - 26 Jan 2026

Abstract

This paper proposes a novel method, the Convergent Radiation Algorithm (CRA), for multi-attribute group decision-making (MAGDM) in circular intuitionistic fuzzy settings. The approach seeks geometric consensus among experts, with uncertainties and hesitancies expressed via circular intuitionistic fuzzy numbers (CIFNs). First, the experts’ qualitative judgments are converted into a geometric space in which each assessment is represented as a spatial point, so that distances between points reflect differences of opinion. These points are then gradually combined by a radiation–reflection–convergence mechanism, which iteratively finds the Optimal Consensus Point (OCP) that minimizes the overall weighted divergence across the evaluations. After that, a projection-based scoring method is used to locate good and bad optimal solutions, and the alternatives are ranked by comparing their projection distances. A numerical example using data from the Hubei agro-ecological zone demonstrates that the proposed method captures collective agreement, shows consistent convergence behavior, and yields readable and reliable decision results. Full article

(This article belongs to the Special Issue Advances in Fuzzy Preference Relations and Decision-Making Methods with Applications)

20 pages, 822 KB

Open Access Article

Dermatology “AI Babylon”: Cross-Language Evaluation of AI-Crafted Dermatology Descriptions

by Emmanouil Karampinis, Christina-Marina Zoumpourli, Christina Kontogianni, Theofanis Arkoumanis, Dimitra Koumaki, Dimitrios Mantzaris, Konstantinos Filippakis, Maria-Myrto Papadopoulou, Melpomeni Theofili, Nkechi Anne Enechukwu, Nomtondo Amina Ouédraogo, Alexandros Katoulis, Efterpi Zafiriou and Dimitrios Sgouros

Medicina 2026, 62(1), 227; https://doi.org/10.3390/medicina62010227 - 22 Jan 2026

Abstract

Background and Objectives: Dermatology relies on a complex terminology encompassing lesion types, distribution patterns, colors, and specialized sites such as hair and nails, while dermoscopy adds a further descriptive framework, making interpretation subjective and challenging. Our study aims to evaluate the ability of a chatbot (Gemini 2) to generate dermatology descriptions across multiple languages and image types, and to assess the influence of prompt language on readability, completeness, and terminology consistency. Our research is based on the concept that non-English prompts are not mere translations of the English prompts but are independently generated texts that reflect medical and dermatological knowledge learned from non-English material used in the chatbot’s training. Materials and Methods: Five macroscopic and five dermoscopic images of common skin lesions were used. Images were uploaded to Gemini 2 with language-specific prompts requesting short paragraphs describing visible features and possible diagnoses. A total of 2400 outputs were analyzed for readability using the LIX score and the CLEAR (comprehensiveness, accuracy, evidence-based content, appropriateness, and relevance) assessment, while terminology consistency was evaluated via SNOMED CT mapping across English, French, German, and Greek outputs. Results: English and French descriptions were found to be harder to read and more sophisticated, while SNOMED CT mapping revealed the largest terminology mismatch in German and the smallest in French. English texts and macroscopic images achieved the highest accuracy, completeness, and readability based on the CLEAR assessment, whereas dermoscopic images and non-English texts presented greater challenges. Conclusions: Overall, partial terminology inconsistencies and cross-lingual variations highlighted that the language of the prompt plays a critical role in shaping AI-generated dermatology descriptions. Full article
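
As an aside on the readability metric named in this abstract, the sketch below computes the LIX score in its commonly cited form: average sentence length plus the percentage of long words (more than six letters). The tokenization rules here (letter runs as words; `.`, `!`, `?` as sentence ends) and the function name `lix_score` are simplifying assumptions for illustration; the abstract does not state how the authors tokenized their texts.

```python
import re


def lix_score(text):
    """LIX = words / sentences + 100 * long_words / words.

    Assumed tokenization (not from the study): a word is a run of Unicode
    letters, a sentence ends at '.', '!' or '?', and a long word has more
    than six letters.
    """
    words = re.findall(r"[^\W\d_]+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    long_words = [w for w in words if len(w) > 6]
    return len(words) / len(sentences) + 100 * len(long_words) / len(words)


# Made-up sample: 7 words, 2 sentences, 3 long words.
sample = "The cat sat. Dermatology terminology is complicated."
score = lix_score(sample)
```

Higher values indicate text that is harder to read; because the word pattern matches Unicode letters, the same function can be applied to French, German, or Greek text under the same assumptions.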

(This article belongs to the Special Issue Dermato-Engineering and AI Assessment in Dermatology Practice)

35 pages, 22348 KB

Open Access Article

Performance Assessment of Portable SLAM-Based Systems for 3D Documentation of Historic Built Heritage

by Valentina Bonora and Martina Colapietro

Sensors 2026, 26(2), 657; https://doi.org/10.3390/s26020657 - 18 Jan 2026

Abstract

The rapid and reliable geometric documentation of historic built heritage is a key requirement for a wide range of conservation, analysis, and risk assessment activities. In recent years, portable and wearable Simultaneous Localization and Mapping (SLAM)-based systems have emerged as efficient tools for fast 3D data acquisition, offering significant advantages in terms of operational speed, accessibility, and flexibility. This paper presents an experimental performance assessment of three portable SLAM-based mobile mapping systems applied to the 3D documentation of historic religious buildings. Two historic parish churches in the Lunigiana region (Italy) are used as case studies to evaluate the systems under real-world conditions. The analysis focuses on key performance indicators relevant to metric documentation, including georeferencing accuracy, 3D model accuracy, point cloud density and resolution, and model completeness. The results highlight the capabilities and limitations of the tested systems, showing that all instruments can efficiently capture the primary geometries of complex historic buildings, while differences emerge in terms of accuracy, data consistency, and readability of architectural details. Although the work is framed within a broader research project addressing seismic vulnerability of historic structures, this contribution specifically focuses on the experimental evaluation of SLAM-based surveying performance. The results demonstrate that portable SLAM systems provide reliable geometric datasets suitable for preliminary documentation tasks and for supporting further multidisciplinary analyses, representing a valuable resource for the rapid 3D documentation of historic built heritage. Full article

(This article belongs to the Special Issue Innovative Approaches in 3D Sensing and Imaging Technologies for Cultural Heritage)
