Search Results (349)

Search Parameters:
Keywords = Range query

14 pages, 1290 KB  
Article
Evolution Landscape of PiggyBac (PB) Transposon in Beetles (Coleoptera)
by Quan Wang, Shasha Shi, Bingqing Wang, Xin Chen, Naisu Yang, Bo Gao and Chengyi Song
Genes 2025, 16(12), 1521; https://doi.org/10.3390/genes16121521 - 18 Dec 2025
Abstract
Background/Objectives: The PB family of “cut-and-paste” DNA transposons shows great promise as genetic manipulation tools while significantly impacting eukaryotic genome evolution. However, their evolutionary profile in beetles (Coleoptera), the most species-rich animal order, remains poorly characterized. Methods: A local tBLASTN search was conducted to mine PiggyBac (PB) transposons across 136 coleopteran insect genomes, using the DDE domain of the PB transposase as the query. Multiple sequence alignment was performed with MAFFT, and a maximum likelihood phylogenetic tree of the transposase DDE domains was constructed using IQ-TREE. Evolutionary dynamics were analyzed by means of K-divergence. Results: Our study reveals PB transposons are widely distributed, highly diverse, and remarkably active across beetles. We detected PB elements in 62 of 136 examined species (45%), classifying them into six distinct clades. A total of 62 PB-containing species harbored intact copies, with most showing recent insertions (K divergence ≈ 0), indicating ongoing transpositional activity. Notably, PB elements from Harmonia axyridis, Apoderus coryli, and Diabrotica balteata exhibit exceptional potential for genetic tool development. Structurally, intact PB elements ranged from 2074 to 3465 bp, each containing a single transposase ORF (500–725 aa). All were flanked by terminal inverted repeats and generated TTAA target site duplications. Conclusions: These findings demonstrate PB transposons have not only shaped historical beetle genome evolution but continue to drive genomic diversification, underscoring their dual significance as natural genome architects and promising biotechnological tools. Full article
(This article belongs to the Section Bioinformatics)
11 pages, 775 KB  
Article
Fast Spectral Search Using Improved Preprocessing and Limited Axis Check
by YoungJae Son, Tiejun Chen, Guangyong Shang, Myeongjin Kim and Sung-June Baek
Mathematics 2025, 13(24), 3983; https://doi.org/10.3390/math13243983 - 14 Dec 2025
Viewed by 119
Abstract
Efficient and accurate identification of spectra from large databases remains a critical challenge in spectroscopic analysis. Previous coarse-to-fine frameworks, typically combining Principal Component Analysis (PCA)-based preprocessing and k-d tree search, have shown that structured search can reduce computational cost without sacrificing accuracy. Building on this foundation, we propose an enhanced algorithm that integrates an improved preprocessing and a novel limited axis check (LAC) method. The preprocessing stage applies running average filtering, downsampling, and threshold-based noise-cutting, followed by PCA to construct a compact, noise-suppressed spectral representation. In the search stage, the proposed LAC algorithm replaces conventional tree-based structures by performing an axis-wise limited-range search and voting strategy to efficiently locate the candidate spectrum closest to the query within the reduced PCA domain. A subsequent refined search determines the closest spectrum by computing distances to the shortlisted candidates. Experimental results demonstrate that the proposed approach attains accuracy equivalent to that of the full search while markedly reducing computational complexity. These results confirm that the integration of enhanced preprocessing and LAC substantially accelerates the spectral search process. Full article
(This article belongs to the Section E1: Mathematics and Computer Science)
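The search stage summarized in the abstract above (an axis-wise limited-range search with voting, followed by a refined distance check over the shortlisted candidates) might be sketched roughly as follows. This is an illustrative reading of the abstract only, not the authors' LAC implementation: the function names, the `radius` window, and `shortlist_size` are hypothetical, and the points are assumed to be already projected into the reduced PCA domain.

```python
from bisect import bisect_left, bisect_right
from collections import Counter
import math


def build_axis_index(points):
    """For each axis, store the sorted coordinate values and the matching point ids."""
    index = []
    for axis in range(len(points[0])):
        order = sorted(range(len(points)), key=lambda i: points[i][axis])
        values = [points[i][axis] for i in order]
        index.append((values, order))
    return index


def limited_axis_search(points, index, query, radius, shortlist_size):
    """Return the id of the point nearest to `query` among the top-voted candidates."""
    votes = Counter()
    for axis, (values, order) in enumerate(index):
        # Only points whose coordinate on this axis lies within +/- radius get a vote.
        lo = bisect_left(values, query[axis] - radius)
        hi = bisect_right(values, query[axis] + radius)
        for point_id in order[lo:hi]:
            votes[point_id] += 1
    shortlist = [point_id for point_id, _ in votes.most_common(shortlist_size)]
    if not shortlist:
        # Assumption: fall back to a full search when no point falls inside any window.
        shortlist = list(range(len(points)))
    # Refined search: exact Euclidean distance, computed only for the shortlist.
    return min(shortlist, key=lambda i: math.dist(points[i], query))
```

The per-axis windows replace a tree traversal with binary searches over presorted coordinates, so the exact distance is computed only for the few candidates that collect the most votes. Whether such a shortlist always contains the true nearest neighbour depends on `radius` and `shortlist_size`, which would need tuning on real data.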
12 pages, 3183 KB  
Article
Discovery and Genomic Characterization of a Novel Phage P284 with Potential Lytic Ability Against Agrobacterium tumefaciens
by Orges Cara, Miloud Sabri, Khaoula Mektoubi, Angelo De Stradis and Toufic Elbeaino
Plants 2025, 14(24), 3755; https://doi.org/10.3390/plants14243755 - 10 Dec 2025
Viewed by 172
Abstract
Agrobacterium tumefaciens (A. tumefaciens), the causal agent of crown gall disease, is a major threat to crop production worldwide. In this study, a novel lytic bacteriophage, designated P284, was identified and characterized for its antibacterial potential against A. tumefaciens. High-throughput sequencing revealed a 44,922 bp double-stranded DNA genome (G+C content 54.3%), with 66 predicted coding sequences, none associated with virulence, lysogeny, or antibiotic resistance. Genomic and phylogenetic analyses allocated P284 within the genus Atuphduovirus (subfamily Dunnvirinae), showing 94% nucleotide sequence identity and 100% query coverage with phage PAT1, representing a distinct species. Turbidity assays revealed that P284 (MOI = 1) strongly inhibits A. tumefaciens growth up to 48 h, achieving a 92% reduction in bacterial density. Transmission electron microscopy confirmed rapid adsorption and host cell lysis within 30 min. In silico predictions identified three putative depolymerases with properties suitable for recombinant applications. The phage exhibited stability across a wide pH range (3–9) and temperatures from −20 to 60 °C. These findings highlight the lytic activity and environmental resilience of P284, and whether it can control crown gall disease in planta remains to be evaluated. Full article
(This article belongs to the Section Plant Protection and Biotic Interactions)
16 pages, 735 KB  
Systematic Review
Cryotherapy as a Surgical De-Escalation Strategy in Breast Cancer: Techniques, Complications, and Oncological Outcomes
by Kai Lin Lee, Ashita Ashish Sule, Hao Xing Lai, Qin Xiang Ng and Serene Si Ning Goh
Biomedicines 2025, 13(12), 2987; https://doi.org/10.3390/biomedicines13122987 - 5 Dec 2025
Viewed by 563
Abstract
Background: Early breast cancer outcomes have improved substantially, yet surgery may carry physical and psychosocial costs. Cryotherapy has gained attention as a minimally invasive alternative to surgery for select patients with breast cancer, particularly those with small, unifocal, hormone receptor-positive tumors. Given rapidly expanding but heterogeneous reports, this state-of-the-art review aims to synthesize information on how breast cryotherapy is performed, for whom it is most suitable, what outcomes to expect, and where evidence is still immature. Methods: We queried MEDLINE (via PubMed), Embase (via Ovid), and the Cochrane Library up to January 2025, using terms related to “breast neoplasms,” “cryotherapy,” and “cryoablation.” Eligible studies included clinical trials, cohort studies, and case series reporting outcomes of cryotherapy in breast cancer. Data were extracted on patient characteristics, procedural parameters, recurrence, survival, and complications. The risk of bias was assessed using the MINORS tool, and certainty of evidence was appraised with the GRADE framework. Results: A total of thirty-one studies (comprising 1357 patients) formed the evidence corpus summarized here. Most involved early-stage, hormone receptor-positive breast cancers ≤ 2 cm treated with percutaneous cryoablation. Local recurrence, defined as any ipsilateral breast tumor recurrence confirmed radiologically or histologically, ranged from 0 to 68.8%, with smaller, unifocal tumors achieving the best control. Overall survival exceeded 80% in early-stage disease, while complications were generally minor, including bruising, hematoma, and skin erythema. Patient satisfaction was high, with favorable cosmetic outcomes reported in limited studies. However, the follow-up duration ranged from 1 month to 10 years (with nearly half < 1 year), and protocols varied substantially across studies. In summary, breast cryotherapy appears safe and can achieve encouraging local control and cosmetic results in carefully selected early-stage cases. Its role in aggressive subtypes, larger or multifocal disease, and as part of multimodal regimens requires further study. Conclusions: Standardized protocols, imaging/reporting conventions, and longer follow-up with patient-reported outcomes are needed to advance the field and further define where cryotherapy can appropriately de-escalate surgery. Full article
(This article belongs to the Special Issue Breast Cancer: New Diagnostic and Therapeutic Approaches)
17 pages, 11839 KB  
Article
Cylindrical Scan Context: A Multi-Channel Descriptor for Vertical-Structure-Aware LiDAR Localization
by Chulhee Bae, Gun Rae Cho, Jongho Bae, Sungho Park, Mangi Lee, Shin Kim and Jung Hyeun Park
Sensors 2025, 25(23), 7223; https://doi.org/10.3390/s25237223 - 26 Nov 2025
Viewed by 374
Abstract
This study introduces Cylindrical Scan Context (CSC), a novel LiDAR descriptor designed to improve robustness and efficiency in GPS-denied or degraded outdoor environments. Unlike the conventional Scan Context (SC), which relies on azimuth–range projection, CSC employs an azimuth–height representation that preserves vertical structural information and incorporates multiple physical channels—range, point density, and reflectance intensity—to capture both geometric and radiometric characteristics of the environment. This multi-channel cylindrical formulation enhances descriptor distinctiveness and robustness against viewpoint, elevation, and trajectory variations. To validate the effectiveness of CSC, real-world experiments were conducted using both self-collected coastal–forest datasets and the public MulRan–KAIST dataset. Mapping was performed using LIO-SAM with LiDAR, IMU, and GPS measurements, after which LiDAR-only localization was evaluated independently. A total of approximately 700 query scenes (1 m ground-truth threshold) were used in the self-collected experiments, and about 1200 scenes (3 m threshold) were evaluated in the MulRan–KAIST experiments. Comparative analyses between SC and CSC were performed using Precision–Recall (PR) curves, Detection Recall (DR) curves, Root Mean Square Error (RMSE), and Top-K retrieval accuracy. The results show that CSC consistently yields lower RMSE—particularly in the vertical and lateral directions—and demonstrates faster recall growth and higher stability in global retrieval. Across datasets, CSC maintains superior DR performance in high-confidence regions and achieves up to 45% reduction in distance RMSE in large-scale campus environments. These findings confirm that the cylindrical multi-channel formulation of CSC significantly improves geometric consistency and localization reliability, offering a practical and robust LiDAR-only localization framework for challenging unstructured outdoor environments. Full article
(This article belongs to the Section Navigation and Positioning)
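To make the azimuth–height representation in the abstract above more concrete, here is a minimal sketch that bins a point cloud into a cylindrical grid with three channels (range, point density, intensity). It is an assumption-laden illustration, not the CSC descriptor as published: the grid size, the height limits, and the choice of per-cell means for the range and intensity channels are all hypothetical.

```python
import math


def cylindrical_descriptor(points, num_sectors=60, num_rows=20, z_min=-2.0, z_max=18.0):
    """Bin (x, y, z, intensity) points into a num_rows x num_sectors azimuth-height grid.

    Returns a dict with three channels: mean horizontal range, point count
    (density), and mean reflectance intensity per cell. Empty cells stay at 0.0.
    """
    def grid():
        return [[0.0] * num_sectors for _ in range(num_rows)]

    density, range_sum, intensity_sum = grid(), grid(), grid()
    row_height = (z_max - z_min) / num_rows
    for x, y, z, intensity in points:
        if not (z_min <= z < z_max):
            continue  # ignore points outside the assumed height band
        azimuth = math.atan2(y, x) % (2.0 * math.pi)  # wrap to [0, 2*pi)
        col = min(int(azimuth / (2.0 * math.pi) * num_sectors), num_sectors - 1)
        row = min(int((z - z_min) / row_height), num_rows - 1)
        density[row][col] += 1.0
        range_sum[row][col] += math.hypot(x, y)
        intensity_sum[row][col] += intensity

    mean_range, mean_intensity = grid(), grid()
    for r in range(num_rows):
        for c in range(num_sectors):
            if density[r][c] > 0:
                mean_range[r][c] = range_sum[r][c] / density[r][c]
                mean_intensity[r][c] = intensity_sum[r][c] / density[r][c]
    return {"range": mean_range, "density": density, "intensity": mean_intensity}
```

Indexing rows by height rather than by radial distance is what keeps vertical structure (trunks, poles, facades) in separate cells, which is the property the abstract attributes to the cylindrical formulation.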
19 pages, 2587 KB  
Article
Assessment of ChatGPT in Recommending Immunohistochemistry Panels for Salivary Gland Tumors
by Maria Cuevas-Nunez, Cosimo Galletti, Luca Fiorillo, Aida Meto, Wilmer Rodrigo Díaz-Castañeda, Shokoufeh Shahrabi Farahani, Guido Fadda, Valeria Zuccalà, Victor Gil Manich, Javier Bara-Casaus and Maria-Teresa Fernández-Figueras
BioMedInformatics 2025, 5(4), 66; https://doi.org/10.3390/biomedinformatics5040066 - 26 Nov 2025
Viewed by 389
Abstract
Background: Salivary gland tumors pose a diagnostic challenge due to their histological heterogeneity and overlapping features. While immunohistochemistry (IHC) is critical for accurate classification, selecting appropriate markers can be subjective and influenced by resource availability. Artificial intelligence (AI), particularly large language models (LLMs), may support diagnostic decisions by recommending IHC panels. This study evaluated the performance of ChatGPT-4, a free and widely accessible general-purpose LLM, in recommending IHC markers for salivary gland tumors. Methods: ChatGPT-4 was prompted to generate IHC recommendations for 21 types of salivary gland tumors. A consensus of expert pathologists established reference panels. Each tumor type was queried using a standardized prompt designed to elicit IHC marker recommendations (“What IHC markers are recommended to confirm a diagnosis of [tumor type]?”). Outputs were assessed using a structured scoring rubric measuring accuracy, completeness, and relevance. Agreement was measured using Cohen’s Kappa, and diagnostic performance was evaluated via sensitivity, specificity, and F1-scores. Repeated-measures ANOVA and Bland–Altman analysis assessed consistency across three prompts. Results were compared to a rule-based system aligned with expert protocols. Results: ChatGPT-4 demonstrated moderate overall agreement with the pathologist panel (κ = 0.53). Agreement was higher for benign tumors (κ = 0.67) than for malignant ones (κ = 0.40), with pleomorphic adenoma showing the strongest concordance (κ = 0.74). Sensitivity values across tumor types ranged from 0.25 to 0.96, with benign tumors showing higher sensitivity (>0.80) and lower specificity (<0.50) observed in complex malignancies. The overall F1-score was 0.84 for benign and 0.63 for malignant tumors. Repeated prompts produced moderate variability without significant differences (p > 0.05). 
Compared with the rule-based system, ChatGPT included more incorrect and missed markers, indicating lower diagnostic precision. Conclusions: ChatGPT-4 shows promise as a low-cost tool for IHC panel selection but currently lacks the precision and consistency required for clinical application. Further refinement is needed before integration into diagnostic workflows. Full article
(This article belongs to the Special Issue The Application of Large Language Models in Clinical Practice)
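The agreement statistic reported in the abstract above, Cohen's kappa, corrects observed agreement p_o for the agreement p_e expected by chance: κ = (p_o − p_e) / (1 − p_e). A minimal sketch of that calculation follows, assuming each marker is coded as a categorical label (for example 1 = recommended, 0 = not recommended) by two raters. The function name and that coding are illustrative assumptions, not the study's actual scoring pipeline.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items on which the raters give the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies, summed over labels.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)
```

A kappa of 0 means agreement no better than chance and 1 means perfect agreement, which is the scale on which the abstract's values (0.40 to 0.74) are read.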
18 pages, 1759 KB  
Article
AI-Powered Chatbot for FDA Drug Labeling Information Retrieval: OpenAI GPT for Grounded Question Answering
by Manasa Koppula, Fnu Madhulika, Navya Sreeramoju and Praveen Kolimi
Analytics 2025, 4(4), 33; https://doi.org/10.3390/analytics4040033 - 17 Nov 2025
Viewed by 802
Abstract
This study presents the development of an AI-powered chatbot designed to facilitate accurate and efficient retrieval of information from FDA drug labeling documents. Leveraging OpenAI’s GPT-3.5-turbo model within a controlled, document-grounded question–answering framework, the chatbot provides users with answers that are strictly limited to the content of the uploaded drug label, thereby minimizing hallucinations and enhancing traceability. A user-friendly interface built with Streamlit allows users to upload FDA labeling PDFs and pose natural language queries. The chatbot extracts relevant sections using PyMuPDF and regex-based segmentation and generates responses constrained to those sections. To evaluate performance, semantic similarity scores were computed between generated answers and ground truth text using Sentence Transformers. Results across 10 breast cancer drug labels demonstrate high semantic alignment, with most scores ranging from 0.7 to 0.9, indicating reliable summarization and contextual fidelity. The chatbot achieved high semantic similarity scores (≥0.95 for concise sections) and ROUGE scores, confirming strong semantic and textual alignment. Comparative analysis with GPT-5-chat and NotebookLM demonstrated that our approach maintains accuracy and section-specific fidelity across models. The current work is limited to a small dataset focused on breast cancer drugs. Future work will expand to diverse therapeutic areas and incorporate BERTScore and expert-based validation. Full article
13 pages, 1941 KB  
Article
Mitral Valve Repair for the Treatment of Acute Bacterial Endocarditis: Analysis of a 10-Year Single-Center Experience
by Martina Musto, Sonia Lerta, Gloria Sangaletti, Raffaele Bruno, Elena Seminari, Giulia Magrini, Romina Frassica, Monica Wu, Stefano Pelenghi and Pasquale Totaro
J. Clin. Med. 2025, 14(22), 7907; https://doi.org/10.3390/jcm14227907 - 7 Nov 2025
Viewed by 310
Abstract
Background/Objectives: Acute bacterial endocarditis (ABE) is a frequent situation and continues to be a challenge. Mitral valve involvement during acute bacterial endocarditis is often the result of the spread of the endocarditic process from the adjacent aortic valve. Mitral involvement, on the other hand, could also be an expression of the initial localization of the bacteria. The best option for treating mitral ABE is still a matter of debate. Recent reports have shown satisfactory results with mitral reconstructive techniques in the treatment of mitral ABE. In this study, we present a comprehensive review of our 10-year institutional experience in the surgical management of acute mitral endocarditis with a focus on technical considerations, outcomes, and the durability of mitral valve repair in this high-risk population. Methods: We queried the institutional database, cross-referencing patients admitted with a diagnosis of “acute bacterial endocarditis” with patients undergoing surgical procedures for “valvular disease” at our division. Out of 1136 valvular procedures listed in our PACS database, 180 patients were admitted with a diagnosis of active acute endocarditis, and 46 included treatment of the mitral valve. We analyzed and compared short- and long-term follow-up (ranging from 3 to 141 months with a mean of 42 ± 38 months) of these 46 patients, dividing them into two groups: mitral valve repair (MVr) and mitral valve replacement (MVR). Results: 18 (40%) patients underwent reconstructive treatment of the mitral valve, and 28 (60%) underwent mitral valve replacement. Cumulative in-hospital mortality was 10% (5 pts, all from the MVR group), however, with no difference between the two groups. A shorter time gap from diagnosis to surgery (<10 days) was the only predictive factor for early mortality. A further 11 patients died during follow-up (2 from group A and 9 from group B). 
Long-term survival, on the other hand, was negatively influenced by MV surgical replacement (p = 0.0178), older patients’ age (>60 years), and urgent surgical procedures. Finally, patients with MVr also experienced a favorable postoperative event-free curve for endocarditis recurrence (p = 0.0260) and time elapsed before recurrence (p = 0.0438). Conclusions: Mitral valve repair in the case of active endocarditis could be a treatment associated with more favorable outcomes, providing that a complete eradication of infective tissue can be accomplished. Conservative treatment, when feasible, seems to offer favorable cumulative long-term outcomes. Full article
12 pages, 1467 KB  
Article
Identifying Risk Groups in 73,000 Patients with Diabetes Receiving Total Hip Replacement: A Machine Learning Clustering Analysis
by Alishah Ahmadi, Anthony J. Kaywood, Alejandra Chavarria, Oserekpamen Favour Omobhude, Adam Kiss, Mateusz Faltyn and Jason S. Hoellwarth
J. Pers. Med. 2025, 15(11), 537; https://doi.org/10.3390/jpm15110537 - 5 Nov 2025
Viewed by 441
Abstract
Background/Objective: Diabetes mellitus (DM) is a highly prevalent condition that contributes to adverse outcomes in patients undergoing total hip arthroplasty (THA). This study applied machine learning clustering algorithms to identify comorbidity profiles among diabetic THA patients and evaluate their association with postoperative outcomes. Methods: The 2015–2021 National Inpatient Sample was queried using ICD-10 CM/PCS codes to identify DM patients undergoing THA. Forty-nine comorbidities, complications, and clinical covariates were incorporated into clustering analysis. The Davies–Bouldin and Calinski–Harabasz indices determined the optimal number of clusters. Multivariate logistic regression assessed risk of non-routine discharge (NRD), and Kruskal–Wallis H testing evaluated length-of-stay (LOS) differences. Results: A total of 73,606 patients were included. Six clusters were identified, ranging from 107 to 61,505 patients. Cluster 6, enriched for urinary tract infection and sepsis, had the highest risk of NRD (OR 7.83, p < 0.001) and the longest median LOS (9.0 days). Clusters 1–4 had shorter recoveries with median LOS of 2.0 days and narrow variability, while Cluster 5 showed intermediate outcomes. Kruskal–Wallis and post hoc testing confirmed significant differences across clusters (p < 0.001). Conclusions: Machine learning clustering of diabetic THA patients revealed six distinct groups with varied comorbidity profiles. Infection-driven clusters carried the highest risk for non-routine discharge and prolonged hospitalization. This approach provides a novel framework for risk stratification and may inform targeted perioperative management strategies. Full article
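One of the two criteria the abstract above names for choosing the number of clusters, the Davies–Bouldin index, can be computed from cluster centroids and within-cluster scatter alone. The sketch below is a plain standard-library version for illustration. It assumes Euclidean distance and pre-assigned clusters, and it is not the study's pipeline, which the abstract does not describe in detail.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length numeric tuples."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]


def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a non-empty list of points).

    Requires at least two clusters with distinct centroids.
    """
    centers = [centroid(c) for c in clusters]
    # Scatter: mean distance from each member to its own cluster centroid.
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k
```

Lower values indicate compact, well-separated clusters, so in practice the index is evaluated for several candidate cluster counts and the count with the smallest value is preferred.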
26 pages, 5753 KB  
Article
An Optimized Few-Shot Learning Framework for Fault Diagnosis in Milling Machines
by Faisal Saleem, Muhammad Umar and Jong-Myon Kim
Machines 2025, 13(11), 1010; https://doi.org/10.3390/machines13111010 - 2 Nov 2025
Viewed by 692
Abstract
Reliable fault diagnosis of milling machines is essential for maintaining operational stability and cost-effective maintenance; however, it remains challenging due to limited labeled data and the highly non-stationary nature of acoustic emission (AE) signals. This study introduces an optimized Few-Shot Learning framework (FSL) that integrates time–frequency analysis with attention-guided representation learning and distribution-aware classification for data-efficient fault detection. The framework converts AE signals into Continuous Wavelet Transform (CWT) scalograms, which are processed using a self-attention-enhanced ResNet-50 backbone to capture both local texture features and long-range dependencies in the signal. Adaptive prototype computation with learnable importance weighting refines class representations, while Mahalanobis distance-based matching ensures robust alignment between query and prototype embeddings under limited sample conditions. To further strengthen discriminability, contrastive loss with hard negative mining enforces compact intra-class clustering and clear inter-class separation. Comprehensive experiments under 7-way 5-shot settings and 5-fold stratified cross-validation demonstrate consistent and reliable performance, achieving a mean accuracy of 98.86% ± 0.97% (95% CI: [98.01%, 99.71%]). Additional evaluations across multiple spindle speeds (660 rpm and 1440 rpm) confirm that the model generalizes effectively under varying operating conditions. Grad-CAM++ activation maps further illustrate that the network focuses on physically meaningful fault-related regions, enhancing interpretability. The results verify that the proposed framework achieves robust, scalable, and interpretable fault diagnosis using minimal labeled data, offering a practical solution for predictive maintenance in modern intelligent manufacturing environments. Full article
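The prototype-matching step described in the abstract above (importance-weighted class prototypes, then Mahalanobis-distance matching of a query embedding) might look roughly like this. The sketch simplifies deliberately: the importance weights are passed in as fixed numbers rather than learned, and the covariance is assumed to be diagonal so that no matrix inverse is needed. All names are hypothetical and none of this is taken from the paper's code.

```python
import math


def weighted_prototype(embeddings, weights):
    """Importance-weighted mean of a class's support embeddings."""
    total = sum(weights)
    dims = len(embeddings[0])
    return [
        sum(w * e[d] for w, e in zip(weights, embeddings)) / total
        for d in range(dims)
    ]


def diagonal_mahalanobis(x, mean, variances, eps=1e-6):
    """Mahalanobis distance under an assumed diagonal covariance matrix."""
    return math.sqrt(
        sum((a - m) ** 2 / (v + eps) for a, m, v in zip(x, mean, variances))
    )


def classify(query, prototypes, variances):
    """Return the label whose prototype is closest to the query embedding."""
    return min(
        prototypes,
        key=lambda label: diagonal_mahalanobis(query, prototypes[label], variances[label]),
    )
```

Scaling each dimension by its class variance lets a class with a wide spread along one feature tolerate larger deviations there, which plain Euclidean matching to the prototype would penalize.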
18 pages, 1707 KB  
Article
DefAn: Definitive Answer Dataset for LLM Hallucination Evaluation
by A. B. M. Ashikur Rahman, Saeed Anwar, Muhammad Usman, Irfan Ahmad and Ajmal Mian
Information 2025, 16(11), 937; https://doi.org/10.3390/info16110937 - 28 Oct 2025
Viewed by 2707
Abstract
Large Language Models (LLMs) represent a major step in AI development and are increasingly used in daily applications. However, they are prone to hallucinations, generating claims that contradict established facts, deviating from prompts, and producing inconsistent responses when the same prompt is presented multiple times. Addressing these issues is challenging due to the lack of comprehensive and easily assessable benchmark datasets. Most existing datasets are limited in scale and scope and rely on multiple-choice questions, which are insufficient for evaluating the generative capabilities of LLMs. To assess hallucination in LLMs, this paper introduces a comprehensive benchmark dataset consisting of over 20,000 unique prompts (more than 75,000 prompts in total) across eight domains. These prompts are designed to elicit definitive, concise, and informative answers. The dataset is divided into two segments: one publicly available for testing and assessing LLM performance, and a hidden segment for benchmarking various LLMs. In our experiments, we tested nine State-of-The-Art (SoTA) models, GPT-4o, GPT-3.5, LLama 2 7B, LLama 3 8B, Gemini 1.0 Pro, Mixtral 8x7B, Zephyr 7B, Deepseek-r1-7b, and Qwen2.5-14B, revealing that overall factual hallucination ranges from 48% to 82% on the public dataset and 31% to 76% on the hidden benchmark. Prompt Misalignment Hallucination ranges up to 95% in the public dataset and up to 94% in the hidden counterpart. Average consistency ranges from 21% to 61% and 44% to 63%, respectively. Domain-wise analysis reveals that LLM performance significantly deteriorates when asked for specific numeric information, whereas it performs moderately with queries involving persons, locations, and dates. Our dataset demonstrates its efficacy and serves as a comprehensive benchmark for evaluating LLM performance. Full article
20 pages, 2753 KB  
Article
Evaluation of the Accuracy and Reliability of Responses Generated by Artificial Intelligence Related to Clinical Pharmacology
by Michal Ordak, Julia Adamczyk, Agata Oskroba, Michal Majewski and Tadeusz Nasierowski
J. Clin. Med. 2025, 14(21), 7563; https://doi.org/10.3390/jcm14217563 - 25 Oct 2025
Viewed by 758
Abstract
Background/Objectives: Artificial intelligence (AI) is gaining importance in clinical pharmacology, supporting therapeutic decisions and the prediction of drug interactions, although its applications have significant limitations. The aim of the study was to evaluate the accuracy of the responses of four large language models (LLMs), namely ChatGPT-4o, ChatGPT-3.5, Gemini Advanced 2.0, and DeepSeek, in the field of clinical pharmacology and drug interactions, as well as to analyze the impact of prompting and questions from the National Specialization Examination for Pharmacists (PESF) on the results. Methods: In the analysis, three datasets were used: 20 case reports of successful pharmacotherapy, 20 reports of drug–drug interactions, and 240 test questions from the PESF (spring 2018 and autumn 2019 sessions). The responses generated by the models were compared with source data and the official examination key and were independently evaluated by clinical-pharmacotherapy experts. Additionally, the impact of prompting techniques was analyzed by expanding the content of the queries with detailed clinical and organizational elements to assess their influence on the accuracy of the obtained recommendations. Results: The analysis revealed differences in the accuracy of responses between the examined AI tools (p < 0.001), with ChatGPT-4o achieving the highest effectiveness and Gemini Advanced 2.0 the lowest. Responses generated by Gemini were more often imprecise and less consistent, which was reflected in their significantly lower level of substantive accuracy (p < 0.001). The analysis of more precisely formulated questions demonstrated a significant main effect of the AI tool (p < 0.001), with Gemini Advanced 2.0 performing significantly worse than all other models (p < 0.001). 
An additional analysis comparing responses to simple and extended questions, which incorporated additional clinical factors and the mode of source presentation, did not reveal significant differences either between AI tools or within individual models (p = 0.34). In the area of drug interactions, it was also shown that ChatGPT-4o achieved a higher level of response accuracy compared with the other tools (p < 0.001). Regarding the PESF exam questions, all models achieved similar results, ranging between 83 and 86% correct answers, and the differences between them were not statistically significant (p = 0.67). Conclusions: AI models demonstrate potential in the analysis of clinical pharmacology; however, their limitations require further refinement and cautious application in practice. Full article

12 pages, 1202 KB  
Data Descriptor
Toward Responsible AI in High-Stakes Domains: A Dataset for Building Static Analysis with LLMs in Structural Engineering
by Carlos Avila, Daniel Ilbay, Paola Tapia and David Rivera
Data 2025, 10(11), 169; https://doi.org/10.3390/data10110169 - 24 Oct 2025
Viewed by 695
Abstract
Modern engineering increasingly operates within socio-technical networks, such as the interdependence of energy grids, transport systems, and building codes, where decisions must be reliable and transparent. Large language models (LLMs) such as GPT promise efficiency by interpreting domain-specific queries and generating outputs, yet their predictive nature can introduce biases or fabricated values—risks that are unacceptable in structural engineering, where safety and compliance are paramount. This work presents a dataset that embeds generative AI into validated computational workflows through the Model Context Protocol (MCP). MCP enables API-based integration between ChatGPT (GPT-4o) and numerical solvers by converting natural-language prompts into structured solver commands. This creates context-aware exchanges—for example, transforming a query on seismic drift limits into an OpenSees analysis—whose results are benchmarked against manually generated ETABS models. This architecture ensures traceability, reproducibility, and alignment with seismic design standards. The dataset contains prompts, GPT outputs, solver-based analyses, and comparative error metrics for four reinforced concrete frame models designed under Ecuadorian (NEC-15) and U.S. (ASCE 7-22) codes. The end-to-end runtime for these scenarios, including LLM prompting, MCP orchestration, and solver execution, ranged between 6 and 12 s, demonstrating feasibility for design and verification workflows. Beyond providing records, the dataset establishes a reproducible methodology for integrating LLMs into engineering practice, with three goals: enabling independent verification, fostering collaboration across AI and civil engineering, and setting benchmarks for responsible AI use in high-stakes domains. Full article

37 pages, 12943 KB  
Article
Natural Disaster Information System (NDIS) for RPAS Mission Planning
by Robiah Al Wardah and Alexander Braun
Drones 2025, 9(11), 734; https://doi.org/10.3390/drones9110734 - 23 Oct 2025
Viewed by 821
Abstract
Today’s rapidly increasing number and performance of Remotely Piloted Aircraft Systems (RPASs) and sensors allows for an innovative approach in monitoring, mitigating, and responding to natural disasters and risks. At present, there are 100s of different RPAS platforms and smaller and more affordable payload sensors. As natural disasters pose ever increasing risks to society and the environment, it is imperative that these RPASs are utilized effectively. In order to exploit these advances, this study presents the development and validation of a Natural Disaster Information System (NDIS), a geospatial decision-support framework for RPAS-based natural hazard missions. The system integrates a global geohazard database with specifications of geophysical sensors and RPAS platforms to automate mission planning in a generalized form. NDIS v1.0 uses decision tree algorithms to select suitable sensors and platforms based on hazard type, distance to infrastructure, and survey feasibility. NDIS v2.0 introduces a Random Forest method and a Critical Path Method (CPM) to further optimize task sequencing and mission timing. The latest version, NDIS v3.8.3, implements a staggered decision workflow that sequentially maps hazard type and disaster stage to appropriate survey methods, sensor payloads, and compatible RPAS using rule-based and threshold-based filtering. RPAS selection considers payload capacity and range thresholds, adjusted dynamically by proximity, and ranks candidate platforms using hazard- and sensor-specific endurance criteria. The system is implemented using ArcGIS Pro 3.4.0, ArcGIS Experience Builder (2025 cloud release), and Azure Web App Services (Python 3.10 runtime). NDIS supports both batch processing and interactive real-time queries through a web-based user interface. 
Additional features include a statistical overview dashboard to help users interpret dataset distribution, and a crowdsourced input module that enables community-contributed hazard data via ArcGIS Survey123. NDIS is presented and validated in example applications, such as those related to volcanic hazards in Indonesia. These capabilities make NDIS a scalable, adaptable, and operationally meaningful tool for multi-hazard monitoring and remote sensing mission planning. Full article

20 pages, 495 KB  
Article
Efficient Single-Server Private Information Retrieval Based on LWE Encryption
by Hai Huang, Zhibo Guan, Bin Yu, Xiang Li, Mengmeng Ge, Chao Ma and Xiangyu Ma
Mathematics 2025, 13(21), 3373; https://doi.org/10.3390/math13213373 - 23 Oct 2025
Viewed by 715
Abstract
Private Information Retrieval (PIR) is a cryptographic protocol that allows users to retrieve data from one or more databases without revealing any information about their queries. Among existing PIR protocols, single-server schemes based on the Learning With Errors (LWE) assumption currently constitute the most practical class of constructions. However, existing schemes continue to suffer from high client-side preprocessing complexity and significant server-side storage overhead, leading to degraded overall performance. We propose ShufflePIR, a single-server protocol that marks the first introduction of an SM3-based pseudorandom function into the PIR framework for shuffling during preprocessing and utilizes cryptographic hardware to accelerate computation, thereby improving both efficiency and security. In addition, the adoption of a parallel encryption scheme based on the LWE assumption significantly enhances the client’s computational efficiency when processing long-bit data. We evaluate the performance of our protocol against the latest state-of-the-art PIR schemes. Simulation results demonstrate that ShufflePIR achieves a throughput of 9903 MB/s on a 16 GB database with 1 MB records, outperforming existing single-server PIR schemes. Overall, ShufflePIR provides an efficient and secure solution for privacy-preserving information retrieval in a wide range of applications. Full article
(This article belongs to the Special Issue Mathematical Models in Information Security and Cryptography)