MDPI - Publisher of Open Access Journals

19 pages, 458 KB

Open Access Article

Learning Selective Deferral Policies for Reliable Medical Text Classification

by Tahani Albalawi and Amani Alzahrani

Technologies 2026, 14(6), 359; https://doi.org/10.3390/technologies14060359 (registering DOI) - 13 Jun 2026

Medical text classification is an important task in biomedical natural language processing, but prediction errors remain problematic in high-stakes settings where reliability matters in addition to accuracy. To address this challenge, this paper proposes a learned selective deferral framework for biomedical sentence classification that allows uncertain predictions to be deferred under constrained review budgets. The framework combines a transformer-based classifier with uncertainty estimation, temperature scaling, and a learned deferral policy that predicts the likelihood of model error from multiple signals, including confidence, entropy, calibration-aware features, and Monte Carlo Dropout descriptors. Deferral decisions are applied under fixed budgets to improve the use of limited review capacity. Experiments on the PubMed 200k RCT dataset show that budget-constrained deferral reduces system-level risk. Using PubMedBERT as the primary backbone, deferring 20% of the highest-risk cases reduces system risk from 0.1108 to 0.0360. Compared with a calibrated confidence-threshold baseline, the learned policy provides modest but generally favorable improvements, with statistical significance observed at the 20% budget. Additional experiments across PubMedBERT, BioBERT, and SciBERT suggest that the framework transfers across biomedical transformer backbones, while calibration improves the reliability of confidence estimates and learned policies outperform random deferral. Full article
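
The budget-constrained deferral described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the toy data, and the definition of system risk (errors on non-deferred cases divided by all cases, with deferred cases assumed to be resolved correctly by a reviewer) are assumptions.

```python
def defer_under_budget(risk_scores, is_error, budget):
    """Defer the `budget` fraction of cases with the highest predicted risk.

    Returns (deferred_indices, system_risk). System risk here is the share of
    all cases that the model keeps AND gets wrong; deferred cases are assumed
    to be handled correctly by a human reviewer (an assumption of this sketch).
    """
    n = len(risk_scores)
    n_defer = int(budget * n)
    # Rank case indices from highest to lowest predicted error risk.
    ranked = sorted(range(n), key=lambda i: risk_scores[i], reverse=True)
    deferred = set(ranked[:n_defer])
    remaining_errors = sum(1 for i in range(n) if i not in deferred and is_error[i])
    return deferred, remaining_errors / n


# Toy data, invented for illustration (not taken from the paper).
risk = [0.91, 0.05, 0.40, 0.10, 0.75, 0.02, 0.30, 0.08, 0.60, 0.15]
errors = [True, False, False, False, True, False, True, False, False, False]

_, risk_no_deferral = defer_under_budget(risk, errors, budget=0.0)
deferred, risk_at_20 = defer_under_budget(risk, errors, budget=0.2)
print(risk_no_deferral, sorted(deferred), risk_at_20)
```

In the toy data, deferring the two highest-risk cases removes two of the three errors, which mirrors the kind of risk reduction the abstract reports at a 20% budget.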

24 pages, 327 KB

Open Access Article

AI-Driven Dental Procedure Coding: A Multi-Model Framework for CDT Extraction from Clinical Text

by Pranav Annareddy, Ali Noori, Deepthi Kollipara and Prashanti Manda

Dent. J. 2026, 14(6), 339; https://doi.org/10.3390/dj14060339 - 2 Jun 2026

Viewed by 196

Abstract

Background and Objectives: Dental procedure coding is essential for accurate billing, reimbursement, and clinical documentation, yet it remains largely manual, time-consuming, and error-prone. While natural language processing (NLP) has enabled significant advances in automated medical coding, limited work has focused on the dental domain, particularly the assignment of Code on Dental Procedures and Nomenclature (CDT) codes from free-text clinical notes. This study aims to develop and evaluate an artificial intelligence framework that integrates large language models (LLMs) and traditional deep learning methods to automate CDT code extraction from narrative dental documentation. Methods: We evaluated three LLM-based strategies—zero-shot prompting, QLoRA fine-tuning, and parameter-efficient fine-tuning (PEFT) using LoRA—alongside a supervised Bidirectional GRU (Bi-GRU) classifier. Experiments were conducted using a synthetic dataset designed to emulate real-world dental encounters. Structured JSON output schemas, few-shot prompting, and scalable batch inference pipelines were employed to ensure consistent and interpretable predictions. Model performance was assessed using micro- and macro-averaged F1 scores, precision, recall, exact-match accuracy, and Hamming loss. Results: The zero-shot LLM achieved the highest micro-F1 score (0.9614) and perfect recall for frequent CDT codes, demonstrating strong baseline reasoning without task-specific training; however, performance declined for rare procedures and diagnostic code hallucinations were common. Fine-tuning improved domain alignment, with the non-quantized PEFT LoRA model outperforming QLoRA across all metrics, though both fine-tuned LLMs showed tendencies to over-generate plausible but incorrect codes. The Bi-GRU model achieved balanced performance (micro-F1 = 0.9362, macro-F1 = 0.9377) with minimal hallucinations but occasionally missed context-dependent procedures. 
Conclusions: These findings highlight complementary strengths between LLM-based and supervised approaches. LLMs provide strong contextual understanding and rapid deployment, while traditional models offer stable and precise multi-label classification. This work supports the development of hybrid, schema-constrained systems for scalable dental procedure coding. Full article
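
The multi-label metrics named in this abstract (micro- and macro-averaged F1, exact-match accuracy, Hamming loss) can be computed as in the sketch below. The label names "A", "B", "C" and the toy predictions are placeholders invented for illustration; they are not CDT codes or data from the study.

```python
def multilabel_metrics(y_true, y_pred, labels):
    """Return micro-F1, macro-F1, exact-match accuracy and Hamming loss.

    y_true and y_pred are lists of label sets, one set per document.
    """
    def f1(tp, fp, fn):
        denom = 2 * tp + fp + fn
        return 2 * tp / denom if denom else 0.0

    counts = []  # one (tp, fp, fn) triple per label
    for label in labels:
        tp = sum(label in t and label in p for t, p in zip(y_true, y_pred))
        fp = sum(label not in t and label in p for t, p in zip(y_true, y_pred))
        fn = sum(label in t and label not in p for t, p in zip(y_true, y_pred))
        counts.append((tp, fp, fn))

    tp_all, fp_all, fn_all = (sum(c[i] for c in counts) for i in range(3))
    return {
        # Micro: pool counts over labels, so frequent labels dominate.
        "micro_f1": f1(tp_all, fp_all, fn_all),
        # Macro: average per-label F1, so rare labels weigh equally.
        "macro_f1": sum(f1(*c) for c in counts) / len(labels),
        # Exact match: the whole predicted set must equal the true set.
        "exact_match": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        # Hamming loss: fraction of wrong (document, label) decisions.
        "hamming_loss": (fp_all + fn_all) / (len(y_true) * len(labels)),
    }


y_true = [{"A", "B"}, {"A"}, {"C"}, {"A", "C"}]
y_pred = [{"A", "B"}, {"A", "B"}, set(), {"A", "C"}]
scores = multilabel_metrics(y_true, y_pred, ["A", "B", "C"])
print(scores)
```

Because micro-averaging pools counts across labels while macro-averaging weights every label equally, a gap between the two usually signals weaker performance on rare labels, consistent with the decline on rare procedures noted in the results.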

11 pages, 296 KB

Open Access Article

Automating Systematic Reviews in Clinical Psychiatry: Comparing Domain Experts and NLP-Based Text Mining

by Cyril S. Ku, Daniel Weiner, Meera Wells, Andrew Huang and Morgan R. Peltier

Information 2026, 17(5), 463; https://doi.org/10.3390/info17050463 - 9 May 2026

Viewed by 355

Abstract

Objective: This study examines the potential of natural language processing and text mining to automate the systematic review process in clinical psychiatry, which traditionally relies on domain experts and can be time-consuming and prone to human bias and error. The study compares the classification of review articles by domain experts with that produced by machine algorithms. Methods: Using data from PubMed, 160 abstracts related to “transcranial magnetic stimulation” and “autism” were classified into “treatment” and “non-treatment” categories by both human reviewers and a computer algorithm. The computer algorithm, which employed topic modeling in text mining, was compared to human reviewers, including two psychiatrists, a biostatistician, and a medical student. Results: The accuracy of human classifications ranged from 68% to 85%, with inter-rater reliability (Kappa statistic) between 0.40 (fair to moderate) and 0.64 (substantial). Intra-rater reliability, tested by reclassification after three months, varied from 0.38 to 0.82. Conclusions: The findings highlight the consistency and reproducibility of computational approaches compared to human classification, which exhibited both inter-rater and intra-rater variability. Differences in reviewer performance were observed; however, these patterns should be interpreted cautiously, as the study was not designed to directly assess cognitive or decision-making processes. Full article
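
The Kappa statistic reported here corrects raw agreement for agreement expected by chance. A minimal sketch follows, assuming pairwise Cohen's kappa (the abstract does not say which kappa variant was used) and using invented ratings, with "T" for treatment and "N" for non-treatment.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n ** 2
    # Undefined (division by zero) if both raters use a single identical label.
    return (observed - expected) / (1 - expected)


# Invented ratings for ten abstracts (not data from the study).
rater_1 = ["T", "T", "T", "N", "N", "N", "T", "N", "T", "N"]
rater_2 = ["T", "T", "N", "N", "N", "T", "T", "N", "T", "N"]
kappa = cohen_kappa(rater_1, rater_2)
print(round(kappa, 3))
```

Here the raters agree on 8 of 10 items, but half of that agreement is expected by chance with balanced labels, so kappa is about 0.6 rather than 0.8.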

(This article belongs to the Special Issue Natural Language Processing (NLP) with Applications and Natural Language Understanding (NLU), 2nd Edition)

14 pages, 1106 KB

Open Access Article

An Ecological Analysis of Online Medical Consumption Discourse Among Visually Impaired Individuals Using a Theory-Driven LLM Approach

by Woo-Hyuk Kim and Eunhye Park

Healthcare 2026, 14(9), 1132; https://doi.org/10.3390/healthcare14091132 - 23 Apr 2026

Viewed by 246

Abstract

Background: This study examines how medical consumption is discussed in online communities among individuals who are blind or visually impaired using the Social Ecological Model (SEM) to capture multilevel healthcare experiences within digital discourse. Methods: A total of 428 posts and comments were collected from Reddit’s r/Blind community. Term frequency–inverse document frequency keyword extraction and a theory-driven LLM-based classification approach were applied to categorize texts into five SEM levels: intrapersonal, interpersonal, institutional, community, and public policy. Results: The findings show that intrapersonal (44.4%) and public policy (29.8%) levels were the most prominent, indicating a strong emphasis on personal coping experiences alongside structural constraints in healthcare access. Institutional-level discourse accounted for 15.8%, whereas interpersonal (6.2%) and community (3.8%) discourse were relatively limited. Keywords and qualitative analyses revealed themes related to emotional adaptation, social support, service accessibility, mobility constraints, and welfare policy barriers. The Jaccard similarity analysis indicated stronger associations between institutional and policy levels, whereas community-level discourse remained relatively distinct. Conclusions: These findings highlight the importance of understanding healthcare experiences, both individually and structurally, in online environments. This study also demonstrated the potential of integrating LLM-based classification with theory-driven frameworks to enable an interpretable and scalable analysis of complex health-related discourse. Full article
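
The Jaccard similarity analysis mentioned above compares the overlap of two sets against their union. The sketch below applies it to keyword sets per SEM level; the keyword sets are hypothetical and chosen only to show an overlapping pair and a disjoint pair.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two keyword collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical keyword sets per level (not extracted from the study's data).
institutional = {"clinic", "appointment", "insurance", "access"}
public_policy = {"insurance", "benefits", "access", "disability"}
community = {"group", "volunteer", "event"}

sim_inst_policy = jaccard(institutional, public_policy)
sim_inst_community = jaccard(institutional, community)
print(round(sim_inst_policy, 3), sim_inst_community)
```

A higher score for the institutional and policy pair than for the institutional and community pair is the pattern the abstract describes as a stronger association between those two levels.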

16 pages, 417 KB

Open Access Article

How Different Medical Practices Are Associated with Types of Patient Complaints in Russian Clinics

by Irina Evgenievna Kalabikhina, Anton Vasilyevich Kolotusha and Vadim Sergeevich Moshkin

Healthcare 2026, 14(8), 1027; https://doi.org/10.3390/healthcare14081027 - 13 Apr 2026

Cited by 1 | Viewed by 543

Abstract

Background/Objectives: Patient-Reported Experience Measures (PREMs) help us understand how patients perceive healthcare quality. Yet most studies look at complaints in isolation, without tying them to the structural features of medical practice. This study asks whether the nature of clinical work—shaped by diagnostic pathways, interaction patterns, and professional focus—predicts what patients complain about. Methods: We analyzed 18,492 negative reviews from infodoctor.ru, collected between 2012 and 2023 across 16 Russian cities with populations over one million. We used a mix of methods: machine learning (logistic regression) to classify complaints as medical (M-type) or organizational (O-type), statistical tests (chi-square, proportion analysis), and expert validation by nine independent specialists. We also built a novel multidimensional classification of medical practices based on three criteria: diagnostic pathway length, frequency and duration of patient interaction, and whether the work is mainly technical or communicative. Results: Technical specialties received far more medical complaints than communicative ones (39.8% vs. 29.3%, p < 0.001), while communicative specialties received more organizational complaints (45.7% vs. 35.0%, p < 0.001). Specialties that manage chronic conditions over the long term had the highest share of organizational complaints (41.6%). At the city level, the share of communicative specialists correlated negatively with complaints per capita (r = −0.541, p = 0.0306). We found no meaningful gender differences in complaint patterns. Conclusions: The type of medical practice systematically shapes what patients complain about. Technical specialties draw criticism on clinical quality; communicative specialties draw criticism on how care is organized. Long-term care faces challenges rooted more in administrative friction than in clinical competence. 
These findings show that PREMs, when analyzed through a practice-based lens, can support targeted quality improvement—moving from simply tracking complaints to acting on them in specialty-specific ways. Full article
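
The city-level correlation reported above (r = −0.541, p = 0.0306) can be checked with the standard t-test for a Pearson correlation. The sketch assumes that all 16 cities entered the correlation (n = 16) and that the p-value is two-sided; neither detail is stated in the abstract. The p-value is obtained by integrating the Student t density numerically, so no external library is needed.

```python
import math


def correlation_t(r, n):
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def t_pdf(x, df):
    """Density of the Student t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value using Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central_mass = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central_mass


n_cities = 16  # assumption: every city in the sample was used
t_stat = correlation_t(-0.541, n_cities)
p_value = two_sided_p(t_stat, df=n_cities - 2)
print(round(t_stat, 3), round(p_value, 4))
```

Under these assumptions the test gives t of roughly −2.41 on 14 degrees of freedom and a two-sided p-value of about 0.03, in line with the reported value; the small difference is what one would expect from r being rounded to three decimals.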

(This article belongs to the Special Issue Patient-Reported Measures: 2nd Edition)

16 pages, 1185 KB

Open Access Article

Leveraging Large Language Models for Automated Extraction of Abdominal Aortic Aneurysm Features from Radiology Reports

by Praneel Mukherjee, Ryan C. Lee, Roham Hadidchi, Sonya Henry, Michael Coard, Matthew Davis, Yossef Rubinov, Ha Nguyen-Luong, Leah Katz and Tim Q. Duong

Diagnostics 2026, 16(7), 1083; https://doi.org/10.3390/diagnostics16071083 - 3 Apr 2026

Viewed by 573

Abstract

Background/Objectives. Abdominal computed tomography (CT) radiology reports contain critical information for abdominal aortic aneurysm (AAA) management, including aneurysm presence, size, rupture status, and prior repair. However, this information is often embedded within lengthy, heterogeneous reports, making manual extraction inefficient. We evaluated the performance of multiple large language models (LLMs) for automated extraction of AAA-related findings from radiology reports. Methods. We retrospectively analyzed 500 abdominal CT reports mentioning AAA from an urban academic health system (2020–2024). Ground truth labels were established by manual review. Four open-source LLMs (Qwen2.5-7B-Instruct, Llama3-Med42-8B, GPT-OSS-20B, and MedGemma-27B-text-it) were evaluated for extraction of aneurysm presence, size, morphology, rupture status, impending rupture, and prior aortic repair. Model outputs were compared with ground truth using exact-match accuracy, and inter-model agreement was assessed using Fleiss’ kappa. Reasoning traces were examined to characterize correct and incorrect model behavior. Results. Accuracy for identifying AAA presence ranged from 0.90 to 0.95 (κ = 0.851), and prior aortic repair from 0.90 to 0.97 (κ = 0.793). Accuracy for aneurysm size ranged from 0.67 to 0.88 (κ = 0.340), with the low κ attributed to class imbalance or dimension misselection. Rupture and impending rupture were identified with accuracies exceeding 0.90 across models, though agreement was lower (κ = 0.485 and 0.589), reflecting low event prevalence. Larger models (GPT-OSS-20B, MedGemma-27B) generally outperformed smaller models. Reasoning analysis revealed strengths in measurement prioritization but recurrent errors, including dimension misselection, over-inference of prior repair, and conservative classification of rupture-related findings.
Conclusions. LLMs can accurately extract clinically relevant AAA information from radiology reports with interpretable reasoning, with larger and medically trained models outperforming smaller or general-purpose models. Performance varies by task and model, underscoring the need for careful validation and human-in-the-loop deployment in clinical settings. Full article
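
Fleiss' kappa, used above for inter-model agreement, generalizes chance-corrected agreement to more than two raters. The sketch below treats four models as raters and uses invented count tables; it also shows why kappa can be low even when raw agreement is high, which is the low-prevalence effect the results mention for rupture.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa.

    counts[i][j] is the number of raters that assigned item i to category j.
    Every item must be rated by the same number of raters.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Overall share of ratings falling in each category.
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters) for j in range(n_cats)]
    # Per-item agreement: share of rater pairs that agree on the item.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_item) / n_items
    p_exp = sum(p * p for p in p_cat)
    return (p_bar - p_exp) / (1 - p_exp)


# Invented tables: 4 raters, columns = [finding present, finding absent].
balanced = [[4, 0], [4, 0], [0, 4], [3, 1], [0, 4]]
rare_event = [[0, 4]] * 9 + [[1, 3]]

kappa_balanced = fleiss_kappa(balanced)
kappa_rare = fleiss_kappa(rare_event)
print(round(kappa_balanced, 3), round(kappa_rare, 3))
```

In the second table the raters agree on 95% of rating pairs, yet kappa is close to zero because almost all of that agreement is expected by chance when one category dominates.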

(This article belongs to the Special Issue Large Language Models in Medical Diagnostics: Advancing Clinical Practice, Research, and Patient Care)

15 pages, 701 KB

Open Access Article

Digital Medical Catalog: Harnessing AI for Automated Classification and Analysis of Medical Data

by Jeremie Biringanine Ruvunangiza and Carlos Alberto Valderrama Sakuyama

AI Med. 2026, 1(2), 10; https://doi.org/10.3390/aimed1020010 - 3 Apr 2026

Viewed by 704

Abstract

The exponential growth of unstructured medical data, particularly clinical notes and diagnostic reports, presents mounting challenges for healthcare knowledge extraction and utilization. This study introduces the Digital Medical Catalog (DMC), a framework that automates the conversion of clinical narratives into an auditable, semantically structured knowledge base. The framework combines BioClinicalBERT embeddings, c-TF-IDF statistical grounding, and semantic clustering, enabling high-fidelity classification (Macro F1 = 0.877 ± 0.012), traceable topic labeling, and temporal trend analysis. By demonstrating that semantic representation methods, reinforced with statistical grounding, are essential for large-scale medical text processing, this work establishes a foundation for privacy-preserving data governance and real-time intelligence within modern healthcare infrastructures. Full article

25 pages, 7234 KB

Open Access Article

Quantum-Enhanced Multimodal Fusion Networks for Integrated Cancer Diagnosis: Combining CT, Genomics, and Clinical Records

by Sandeep Gupta, Kanad Ray, Shamim Kaiser, Sazzad Hossain and Jocelyn Faubert

Algorithms 2026, 19(4), 279; https://doi.org/10.3390/a19040279 - 2 Apr 2026

Viewed by 779

Abstract

Cancer diagnosis is one of the hardest problems in modern medicine and involves integrating different data sources such as medical images, genomic profiles, and clinical records. Traditional machine learning methods have difficulty handling the high dimensionality and complex correlations of multimodal medical data. To address this, we propose a Quantum-Enhanced Multimodal Fusion Network (QEMFN) framework that applies quantum computing principles to move beyond traditional image–text matching, fusing CT imaging with genomic sequencing data and EHR information. Our approach uses variational quantum circuits for feature encoding, quantum kernel methods for cross-modal attention, and hybrid quantum–classical architectures for final classification. We implement the framework using the Google Cirq quantum computing library and validate it on publicly available datasets including TCIA (The Cancer Imaging Archive), TCGA (The Cancer Genome Atlas), and the MIMIC-III clinical database. The matched multimodal cohort comprises 847 lung cancer patients, 623 colorectal cancer patients, and 401 liver cancer patients with complete imaging, genomic, and clinical records, assembled via de-identified patient ID linkage across the three archives. The experiment takes steps toward the realization of quantum-enhanced diagnostic systems and offers a path for subsequent experimental confirmation. We theoretically analyze the potential quantum advantage, present implementation details using Cirq, and describe a roadmap to clinical translation for quantum-enhanced diagnostic tools. Full article

18 pages, 3239 KB

Open Access Article

LPA-Tuning CLIP: An Improved CLIP-Based Classification Model for Intestinal Polyps

by Zumin Wang, Jun Gao, Wenhao Ping, Jing Qin and Changqing Ji

Sensors 2026, 26(6), 1764; https://doi.org/10.3390/s26061764 - 11 Mar 2026

Viewed by 481

Abstract

Background and Objective: Accurate classification of intestinal polyps is crucial for preventing colorectal cancer but is hindered by visual similarity among subtypes and endoscopic variability. While deep learning aids in diagnosis, single-modal models face efficiency–accuracy trade-offs and ignore pathological semantics. We propose a multimodal framework that integrates endoscopic images with structured pathological descriptions to bridge this gap. Methods: We propose LPA-Tuning CLIP, which incorporates three key innovations: replacing CLIP’s instance-level contrastive loss with cross-modal projection matching (CMPM) with ID loss to explicitly optimize intraclass compactness and interclass separation through label-aware image-text similarity matrices; introducing structured clinical semantic templates that encode WHO diagnostic criteria into hierarchical text prompts for consistent pathology annotations; and developing medical-aware augmentation that preserves lesion features while reducing domain shifts. Results: The experimental results demonstrate that our proposed method achieves an accuracy of 85.8% and an F1 score of 0.862 on the internal test set, establishing a new state-of-the-art performance for intestinal polyp classification. Conclusions: This study proposes a multimodal polyp classification paradigm that achieves 85.8% accuracy on three-subtype classification via endoscopic image-pathology text joint representation learning, outperforming unimodal baselines by 8.7% and a multimodal baseline by 4.3%. Full article

(This article belongs to the Special Issue AI and Intelligent Sensors for Medical Imaging)

23 pages, 2717 KB

Open Access Article

Ensemble-Based Multi-Class and Multi-Label Text Classification for Noisy Clinical Dialogues

by Małgorzata Lucińska, Małgorzata Płaza, Justyna Kęczkowska, Kacper Kurek, Karol Wykrota, Stanisław Deniziak, Karol Twardowski, Zbigniew Koruba and Mirosław Płaza

Appl. Sci. 2026, 16(6), 2645; https://doi.org/10.3390/app16062645 - 10 Mar 2026

Viewed by 490

Abstract

Multi-class and multi-label classification of medical dialogues remains a challenging task due to high linguistic variability and transcription noise. This study proposes an ensemble approach based on three fine-tuned Polish T5 (Text-to-Text Transfer Transformer) models trained on partially overlapping clinical dialogue datasets. The models are evaluated exclusively on low-quality, highly noisy, automatically transcribed conversations to assess real-world robustness. The results demonstrate that the ensemble of models improves classification stability and outperforms the best single model, increasing the F1-score by 21.8% for internal medicine dialogues and by 44.9% for paediatric interviews. The proposed method shows potential for practical deployment in clinical decision support and automated medical documentation systems. Full article
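
One common way to combine several multi-label classifiers is a per-label majority vote, sketched below. The abstract does not state how the three T5 models are combined, so the voting rule, the function name, and the example labels are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote_multilabel(predictions, min_votes=2):
    """Keep every label predicted by at least `min_votes` of the models.

    predictions is a list with one collection of labels per model.
    """
    votes = Counter(label for labels in predictions for label in set(labels))
    return {label for label, n in votes.items() if n >= min_votes}


# Hypothetical outputs of three models for the same noisy transcript.
model_outputs = [{"cough", "fever"}, {"fever"}, {"fever", "rash"}]
ensemble_labels = majority_vote_multilabel(model_outputs)
print(ensemble_labels)
```

A vote like this tends to suppress labels that a single model produces from transcription noise, which is one plausible source of the stability gain an ensemble can provide.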

(This article belongs to the Special Issue AI for Medical Systems: Algorithms, Applications, and Challenges)

30 pages, 1413 KB

Open Access Article

Enhancing Post-Editing of Kazakh Translations Using Fine-Tuned Large Language Models

by Akbayan Bekarystankyzy, Diana Rakhimova, Aliya Zhiger, Assel Sakatay, Nazym Zhumakhan, Aigerim Yerimbetova, Dina Oralbekova and Mussa Turdalyuly

Algorithms 2026, 19(3), 199; https://doi.org/10.3390/a19030199 - 6 Mar 2026

Viewed by 828

Abstract

Machine translation for low-resource languages such as Kazakh remains a complex task due to the scarcity of training data, intricate morphological structures, and culturally specific linguistic characteristics. This study presents the first extensive exploration of fine-tuning large language models for automated post-editing of Kazakh translations. We introduce KazPE, a carefully curated and annotated dataset that includes 10,008 training sentences and 311 test sentences spanning six domains: medical, scientific, journalistic, oral, fiction, and legal. The dataset features detailed error classifications across 9 linguistic categories. Our method fine-tunes GPT-4.1 mini using supervised learning to enhance translation quality by systematically correcting targeted errors. According to human evaluations, conducted on a continuous 0–1 scale, the fine-tuned model achieves an average quality score of 0.84, surpassing the baseline score of 0.80, corresponding to a 5% relative improvement. The greatest improvements are observed in handling morphological and lexical errors, as well as in domain-specific texts, particularly in the legal (+17%) and medical (+22%) domains. The translations were also evaluated with the automatic metrics BLEU, TER, and METEOR. The fine-tuned model improves on all three, indicating better n-gram overlap with reference texts, fewer required edits, and closer lexical and semantic alignment. Comprehensive error analysis shows that the fine-tuning process effectively mitigates challenges related to Kazakh’s agglutinative morphology and specialized terminology, while preserving accuracy on already correct sentences. This research establishes the first structured evaluation framework for Kazakh translation post-editing and offers valuable guidance for enhancing machine translation in morphologically rich, low-resource languages.
To facilitate further progress in Turkic language processing, we publicly release the KazPE dataset, trained models, and evaluation framework. Full article

(This article belongs to the Special Issue AI Applications and Modern Industry)

14 pages, 4655 KB

Open Access Article

Fine-Tuning a Small Vision Language Model Using Synthetic Data for Explaining Bacterial Skin Disease Images

by Shiwan Zhang, Abdurrahim Yilmaz, Gulsum Gencoglan and Burak Temelkuran

Diagnostics 2026, 16(4), 603; https://doi.org/10.3390/diagnostics16040603 - 18 Feb 2026

Viewed by 1492

Abstract

Background/Objectives: Vision language models (VLMs) show strong potential for medical image understanding, but their large scale often limits practical deployment. This study investigates whether a compact VLM can be effectively adapted for dermatology, with a focus on explaining bacterial skin disease images. Methods: We curate a dataset derived from PMC-OA using the BIOMEDICA dataset and construct PMC-derma-VQA-bacteria by pairing images with inherited figure captions and synthetically generated question–answer (QA) supervision produced by Google’s Gemini model. SmolVLM is fine-tuned under three supervision settings: QA-only, caption-only, and a combined QA+caption strategy. The models are evaluated on a held-out test set for both text-generation quality and diagnostic classification performance. Results: QA-only supervision yields the best report-generation performance, while the combined QA+caption setting achieves the highest classification accuracy (70.20%). Conclusions: Synthetic QA supervision can meaningfully enhance compact VLMs for medical image understanding and diagnostic support in dermatology. Full article

(This article belongs to the Special Issue Artificial Intelligence in Skin Disorders 2025)

11 pages, 401 KB

Open Access Article

Comparative Evaluation of Rule-Based and Transformer-Based Text-Mining Methods for Detecting SGLT2 Inhibitor Mentions in Unstructured Clinical Free Text

by Attila Csaba Nagy

Technologies 2026, 14(2), 122; https://doi.org/10.3390/technologies14020122 - 15 Feb 2026

Viewed by 480

Abstract

Much of the patient data recorded in electronic health records is stored as unstructured free text. Extracting medication information from such data is essential, particularly for antidiabetic drugs such as sodium–glucose cotransporter-2 (SGLT2) inhibitors, but remains challenging due to spelling variability, abbreviations, and non-standard documentation practices. This study compared four text-mining approaches, simple keyword search, regular expression–based matching, fuzzy string matching, and a transformer-based token classification baseline, for detecting SGLT2 inhibitor mentions in Hungarian clinical narratives. Clinical documents were obtained from the University of Debrecen Clinical Centre and covered patients with type 2 diabetes mellitus (ICD-10: E11) from 2018 and 2019. Searches targeted both generic and brand names and SGLT-related abbreviations. In the 2019 dataset (n = 5383), simple keyword search identified 1.49% of documents as containing an SGLT2 inhibitor mention, compared with 7.21% using regular expressions, 8.55% using fuzzy matching, and 0.71% using the transformer-based baseline. Mean execution times were 0.07 s, 1.64 s, 5.13 s, and 34.71 s, respectively. Method performance was further evaluated against a manually annotated reference set from 2018 using confusion matrices and standard classification metrics. Fuzzy string matching achieved the highest recall and F1-score, while regular expression-based matching provided a strong balance between precision and recall. The transformer-based baseline showed high precision but substantially lower recall in the absence of domain-specific fine-tuning. Overall, similarity-based fuzzy matching offered the most favorable balance between detection performance and computational efficiency for identifying SGLT2 inhibitor mentions in unstructured Hungarian clinical text. Full article
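
Three of the four approaches compared above (keyword search, regular expressions, fuzzy string matching) can be contrasted in a few lines. The sketch is illustrative only: the drug list contains three generic SGLT2 inhibitor names, while the study also searched brand names and abbreviations; the regular expression, the 0.85 similarity threshold, and the use of Python's `difflib` for fuzzy matching are assumptions, and the example sentences are in English rather than Hungarian.

```python
import re
from difflib import SequenceMatcher

# Illustrative list of generic names; not the study's search vocabulary.
DRUG_NAMES = ["empagliflozin", "dapagliflozin", "canagliflozin"]

# Matches any token containing "gliflozin" plus "SGLT2", "SGLT-2", "SGLT 2".
PATTERN = re.compile(r"\b\w*gliflozin\w*\b|\bsglt[\s-]?2\b", re.IGNORECASE)


def keyword_match(text):
    """Exact substring search for each known drug name."""
    lowered = text.lower()
    return any(name in lowered for name in DRUG_NAMES)


def regex_match(text):
    """Pattern search that also covers the class abbreviation."""
    return PATTERN.search(text) is not None


def fuzzy_match(text, threshold=0.85):
    """Token-level similarity search that tolerates misspellings."""
    tokens = re.findall(r"\w+", text.lower())
    return any(
        SequenceMatcher(None, token, name).ratio() >= threshold
        for token in tokens
        for name in DRUG_NAMES
    )


notes = [
    "Patient started on empagliflozin 10 mg",
    "Continue SGLT-2 inhibitor",
    "dapaglifozin 10mg daily",  # misspelled on purpose
]
for note in notes:
    print(note, keyword_match(note), regex_match(note), fuzzy_match(note))
```

The misspelled note is caught only by the fuzzy matcher and the abbreviation only by the regular expression, which illustrates why the methods in the study detect different shares of documents and why the more tolerant ones cost more time per document.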

18 pages, 8768 KB

Open Access Article

Implementation and Evaluation of the RECAP Framework: A Quality Improvement Initiative

by Courtenay R. Bruce, Natalie N. Zuniga-Georgy, Nathan Way, Lenis Sosa, Emmanuel Javaluyas, Terrell L. Williams, Swetha Mulpur and Gail Vozzella

Nurs. Rep. 2026, 16(2), 56; https://doi.org/10.3390/nursrep16020056 - 9 Feb 2026

Viewed by 721

Abstract

Background: Narration of care (NOC) refers to a nurse’s ability to explain the purposes, goals, and objectives of nursing tasks. In this project, NOC refers to real-time verbal explanation of nursing tasks and should not be confused with the Nursing Outcomes Classification, which uses the same acronym. Although NOC is recognized as a critical skill, little research exists on how to teach it or evaluate its use. A companion article describes the development of a NOC framework. This article focuses on implementation and observed changes during rollout. Objective: We aimed to describe the implementation of a discussion-based course designed to teach nurses and patient care assistants (PCAs), collectively referred to as nursing staff, how to effectively narrate care, and to assess changes observed during the implementation period. Methods: We used a mixed-methods, pre- and post-implementation design across seven hospitals over six months (February–August 2023). Quantitative data included pre–post comparisons of Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) scores (baseline: 2022; follow-up: 2024) and structured observations of nurse–patient interactions. Qualitative data from free-text course evaluations were thematically analyzed to contextualize quantitative findings. Integration occurred by comparing themes with observed practice gaps and patient experience trends. Results: Course Evaluations: In total, 7341 staff completed the course; 4185 evaluations were submitted. Ninety-five percent reported increased knowledge and rated the course highly. Common strategies cited included teach-back, reducing anxiety through NOC, active listening, and building personal connections. HCAHPS Comparisons: Five domains improved significantly post-implementation: care transitions (4.6, p = 0.001), cleanliness (3.9, p = 0.024), communication about medications (2.3, p = 0.042), discharge communication (2.7, p = 0.002), and restfulness (2.5, p = 0.015). Practice Observations: In total, 1281 observations were conducted. Observations indicated frequent use of several NOC-aligned behaviors and opportunities to improve narration of the environment and resolution of patient concerns. Conclusions: Improvements in patient experience measures and observed practices coincided with the course rollout. However, given the pre–post, uncontrolled design, causality cannot be inferred.

(This article belongs to the Special Issue Advancing Nursing Practice Through Innovative Education)

9 pages, 550 KB

Open Access Opinion

Is Cannabidiol (CBD) a Non-Psychoactive Phytocannabinoid?

by Eliana Rodrigues

Psychoactives 2026, 5(1), 4; https://doi.org/10.3390/psychoactives5010004 - 3 Feb 2026

Viewed by 1431

Abstract

Interest in psychoactive substances, including psychedelics, is rapidly expanding in medical, academic, and popular circles. Despite the classifications established within the psychopharmacological scientific community, certain plants, animals, and fungi, as well as the substances obtained from them, have been misclassified by both the media and academic circles. This opinion piece presents arguments to answer the following question: Is CBD a non-psychoactive phytocannabinoid? Hundreds of robust scientific studies published in recent years involving CBD have strengthened its clinical use in the treatment of seizures, anxiety, psychosis, schizophrenia, post-traumatic stress disorder, and addiction. As part of the arguments to answer the question posed, this text provides a historical overview of the classifications of psychoactive substances available to date, and offers reflections on these terminologies and a proposed classification of psychedelics.
