MDPI - Publisher of Open Access Journals

16 pages, 3953 KiB

Open Access Article

Skin Lesion Classification Using Hybrid Feature Extraction Based on Classical and Deep Learning Methods

by Maryem Zahid, Mohammed Rziza and Rachid Alaoui

BioMedInformatics 2025, 5(3), 41; https://doi.org/10.3390/biomedinformatics5030041 - 16 Jul 2025

Viewed by 271

This paper proposes a hybrid method for skin lesion classification combining deep learning features with conventional descriptors such as HOG, Gabor, SIFT, and LBP. Features of interest were extracted within the tumor area using the proposed fusion methods. We tested and compared features obtained from different deep learning models coupled to HOG-based features. Dimensionality reduction and performance improvement were achieved by Principal Component Analysis, after which SVM was used for classification. The compared methods were tested on the reference database skin cancer-malignant-vs-benign. The results show a significant improvement in terms of accuracy due to complementarity between the conventional and deep learning-based methods. Specifically, the addition of HOG descriptors led to an accuracy increase of 5% for EfficientNetB0, 7% for ResNet50, 5% for ResNet101, 1% for NASNetMobile, 1% for DenseNet201, and 1% for MobileNetV2. These findings confirm that feature fusion significantly enhances performance compared to the individual application of each method.

19 pages, 1400 KiB

Open Access Article

Identifying Themes in Social Media Discussions of Eating Disorders: A Quantitative Analysis of How Meaningful Guidance and Examples Improve LLM Classification

by Apoorv Prasad, Setayesh Abiazi Shalmani, Lu He, Yang Wang and Susan McRoy

BioMedInformatics 2025, 5(3), 40; https://doi.org/10.3390/biomedinformatics5030040 - 11 Jul 2025

Viewed by 377

Abstract

Background: Social media represents a unique opportunity to investigate the perspectives of people with eating disorders at scale. One forum alone, r/EatingDisorders, now has 113,000 members worldwide. Where a manual analysis might sample a few dozen items, automatic classification using large language models (LLMs) can analyze thousands of posts in less than a day. Methods: Here, we compare multiple strategies for invoking an LLM, including ones that provide examples (few-shot) and annotation guidelines, to classify eating disorder content across 14 predefined themes using Llama3.1:8b on 6850 social media posts. In addition to standard metrics, we calculate four novel dimensions of classification quality: a Category Divergence Index, confidence scores (overall model certainty), focus scores (a measure of decisiveness for selected subsets of themes), and dominance scores (primary theme identification strength). Results: By every measure, invoking an LLM without extensive guidance and examples (zero-shot) is insufficient. Zero-shot had worse mean category divergence (7.17 versus 3.17), whereas few-shot yielded higher mean confidence (0.42 versus 0.27) and higher mean dominance (0.81 versus 0.46). Overall, a few-shot approach improved quality measures across nearly 90% of predictions. Conclusions: These findings suggest that LLMs, if invoked with expert instructions and helpful examples, can provide instantaneous high-quality annotation, enabling automated mental health content moderation systems or future clinical research.

38 pages, 1738 KiB

Open Access Article

AI-Driven Bayesian Deep Learning for Lung Cancer Prediction: Precision Decision Support in Big Data Health Informatics

by Natalia Amasiadi, Maria Aslani-Gkotzamanidou, Leonidas Theodorakopoulos, Alexandra Theodoropoulou, George A. Krimpas, Christos Merkouris and Aristeidis Karras

BioMedInformatics 2025, 5(3), 39; https://doi.org/10.3390/biomedinformatics5030039 - 9 Jul 2025

Viewed by 476

Abstract

Lung-cancer incidence is projected to rise by 50% by 2035, underscoring the need for accurate yet accessible risk-stratification tools. We trained a Bayesian neural network on 300 annotated chest-CT scans from the public LIDC–IDRI cohort, integrating clinical metadata. Hamiltonian Monte-Carlo sampling (10 000 posterior draws) captured parameter uncertainty; performance was assessed with stratified five-fold cross-validation and on three independent multi-centre cohorts. On the locked internal test set, the model achieved 99.0% accuracy, AUC = 0.990 and macro-F1 = 0.987. External validation across 824 scans yielded a mean AUC of 0.933 and an expected calibration error < 0.034, while eliminating false positives for benign nodules and providing voxel-level uncertainty maps. Uncertainty-aware Bayesian deep learning delivers state-of-the-art, well-calibrated lung-cancer risk predictions from a single CT scan, supporting personalised screening intervals and safe deployment in clinical workflows.
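
The expected calibration error quoted above is commonly computed by binning predictions by confidence and averaging the gap between accuracy and mean confidence in each bin. The abstract does not state which variant the authors used, so the following is only a minimal sketch of the standard equal-width-bin form; the function name, the bin count, and the toy inputs are assumptions for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width binned ECE: sum over bins of (bin size / N) * |accuracy - mean confidence|.

    confidences: predicted probabilities of the chosen class, each in [0, 1].
    correct: 1/0 (or True/False) flags saying whether each prediction was right.
    n_bins is an assumed default; the abstract does not report a bin count.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp the index so that a confidence of exactly 1.0 lands in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Hypothetical toy data, not taken from the paper.
toy_conf = [0.9, 0.9, 0.9, 0.9, 0.6, 0.6]
toy_correct = [1, 1, 1, 0, 1, 0]
toy_ece = expected_calibration_error(toy_conf, toy_correct)
```

A lower value means that stated confidences track observed accuracy more closely; a perfectly calibrated set of predictions gives 0.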

21 pages, 3053 KiB

Open Access Article

An Effective Approach for Wearable Sensor-Based Human Activity Recognition in Elderly Monitoring

by Youssef Errafik, Younes Dhassi, Mohamed Baghrous and Adil Kenzi

BioMedInformatics 2025, 5(3), 38; https://doi.org/10.3390/biomedinformatics5030038 - 9 Jul 2025

Viewed by 329

Abstract

Technological advancements and AI-based research have significantly influenced our daily lives. Human activity recognition (HAR) is a key area at the intersection of various AI technologies and application domains. In this study, we present our novel time series classification approach for monitoring the physical behaviors of the elderly and patients. This approach, which integrates supervised and unsupervised methods with generative models, has been validated for HAR, showing promising results. Our method was specifically adapted for healthcare and surveillance applications, enhancing the classification of physical behaviors in the elderly. The hybrid approach proved its effectiveness on the HAR70+ dataset, surpassing traditional recurrent convolutional network-based approaches. We further evaluated the surveillance system for the elderly (Surv-Sys-Elderly) model on the HARTH and HAR70+ datasets, achieving an accuracy of 94.3% on the HAR70+ dataset for recognizing elderly behaviors, highlighting its robustness and suitability for both clinical and domestic environments.

(This article belongs to the Topic Computational Intelligence and Bioinformatics (CIB))

40 pages, 2828 KiB

Open Access Review

Generative Artificial Intelligence in Healthcare: Applications, Implementation Challenges, and Future Directions

by Syed Arman Rabbani, Mohamed El-Tanani, Shrestha Sharma, Syed Salman Rabbani, Yahia El-Tanani, Rakesh Kumar and Manita Saini

BioMedInformatics 2025, 5(3), 37; https://doi.org/10.3390/biomedinformatics5030037 - 7 Jul 2025

Viewed by 1440

Abstract

Generative artificial intelligence (AI) has been rapidly transforming healthcare systems since the advent of OpenAI's ChatGPT in 2022. It encompasses a class of machine learning techniques designed to create new content and is classified into large language models (LLMs) for text generation and image-generating models for creating or enhancing visual data. These generative AI models have shown widespread applications in clinical practice and research. Such applications range from medical documentation and diagnostics to patient communication and drug discovery. These models are capable of generating text messages, answering clinical questions, interpreting CT scan and MRI images, assisting in rare diagnoses, discovering new molecules, and providing medical education and training. Early studies have indicated that generative AI models can improve efficiency, reduce administrative burdens, and enhance patient engagement, although most findings are preliminary and require rigorous validation. However, the technology also raises serious concerns around accuracy, bias, privacy, ethical use, and clinical safety. Regulatory bodies, including the FDA and EMA, are beginning to define governance frameworks, while academic institutions and healthcare organizations emphasize the need for transparency, supervision, and evidence-based implementation. Generative AI is not a replacement for medical professionals but a potential partner, augmenting decision-making, streamlining communication, and supporting personalized care. Its responsible integration into healthcare could mark a paradigm shift toward more proactive, precise, and patient-centered systems.

(This article belongs to the Special Issue Integrating Health Informatics and Artificial Intelligence for Advanced Medicine)

29 pages, 3896 KiB

Open Access Article

Self-Explaining Neural Networks for Food Recognition and Dietary Analysis

by Zvinodashe Revesai and Okuthe P. Kogeda

BioMedInformatics 2025, 5(3), 36; https://doi.org/10.3390/biomedinformatics5030036 - 2 Jul 2025

Viewed by 381

Abstract

Food pattern recognition plays a crucial role in modern healthcare by enabling automated dietary monitoring and personalised nutritional interventions, particularly for vulnerable populations with complex dietary needs. Current food recognition systems struggle to balance high accuracy with interpretability and computational efficiency when analysing complex meal compositions in real-world settings. We developed a novel self-explaining neural architecture that integrates specialised attention mechanisms with temporal modules within a streamlined framework. Our methodology employs hierarchical feature extraction through successive convolution operations, multi-head attention mechanisms for pattern classification, and bidirectional LSTM networks for temporal analysis. The architecture incorporates self-explaining components utilising attention-based mechanisms and interpretable concept encoders to maintain transparency. We evaluated our model on the FOOD101 dataset using 5-fold cross-validation, ablation studies, and comprehensive computational efficiency assessments. Training employed multi-objective optimisation with adaptive learning rates and specialised loss functions designed for dietary pattern recognition. Experiments demonstrate our model’s superior performance, achieving 94.1% accuracy with only 29.3 ms inference latency and 3.8 GB memory usage, representing a 63.3% parameter reduction compared to baseline transformers. The system maintains detection rates above 84% in complex multi-item recognition scenarios, whilst feature attribution analysis achieved scores of 0.89 for primary components. Cross-validation confirmed consistent performance with accuracy ranging from 92.8% to 93.5% across all folds. This research advances automated dietary analysis by providing an efficient, interpretable solution for food recognition with direct applications in nutritional monitoring and personalised healthcare, particularly benefiting vulnerable populations who require transparent and trustworthy dietary guidance.

23 pages, 2769 KiB

Open Access Article

Exploring CBC Data for Anemia Diagnosis: A Machine Learning and Ontology Perspective

by Amira S. Awaad, Yomna M. Elbarawy, H. Mancy and Naglaa E. Ghannam

BioMedInformatics 2025, 5(3), 35; https://doi.org/10.3390/biomedinformatics5030035 - 2 Jul 2025

Viewed by 411

Abstract

Background: Anemia, a common health disorder affecting populations globally, demands timely and accurate diagnosis for treatment to be effective. The aim of this paper is to detect and classify four types of anemia: hgb, iron-deficiency, folate-deficiency, and B12-deficiency anemia. Methods: This paper proposes an ontology-enhanced machine learning (ML) framework to classify types of anemia from CBC data obtained from Kaggle, which contains 15,300 patient records. It evaluates the effects of classical versus deep classifiers on imbalanced and oversampled training samples. The classifiers tested include KNN, SVM, DT, RF, CNN, CNN+SVM, CNN+RF, and XGBoost. A further contribution is the use of ontological reasoning via SPARQL queries to semantically enrich clinical features with categories like “Low Hemoglobin” or “Macrocytic MCV”. These semantic features were then used in both classical (SVM) and deep hybrid models (CNN+SVM). Results: Ontology-enhanced and CNN hybrid models perform competitively when paired with ROS or ADASYN, but their performance degrades significantly on the original dataset. Ontology-enhanced models showed large performance gains: Onto-CNN+SVM achieved an F1-score of 1.00 for all four types of anemia under ROS sampling, while Onto-SVM exhibited more than 20% improvement in F1-scores for minority categories like folate and B12 when compared to baseline models, except XGBoost. Conclusions: Ontology-driven knowledge coalescence has been shown to improve classification results; however, XGBoost consistently outperformed all other classifiers across all data conditions, making it the most robust and reliable model for clinically relevant decision-support systems in anemia diagnosis.

26 pages, 2124 KiB

Open Access Article

Integrating Boruta, LASSO, and SHAP for Clinically Interpretable Glioma Classification Using Machine Learning

by Mohammad Najeh Samara and Kimberly D. Harry

BioMedInformatics 2025, 5(3), 34; https://doi.org/10.3390/biomedinformatics5030034 - 30 Jun 2025

Viewed by 648

Abstract

Background: Gliomas represent the most prevalent and aggressive primary brain tumors, requiring precise classification to guide treatment strategies and improve patient outcomes. Purpose: This study aimed to develop and evaluate a machine learning-driven approach for glioma classification by identifying the most relevant genetic and clinical biomarkers while demonstrating clinical utility. Methods: A dataset from The Cancer Genome Atlas (TCGA) containing 23 features was analyzed using an integrative approach combining Boruta, Least Absolute Shrinkage and Selection Operator (LASSO), and SHapley Additive exPlanations (SHAP) for feature selection. The refined feature set was used to train four machine learning models: Random Forest, Support Vector Machine, XGBoost, and Logistic Regression. Comprehensive evaluation included class distribution analysis, calibration assessment, and decision curve analysis. Results: The feature selection approach identified 13 key predictors, including IDH1, TP53, ATRX, PTEN, NF1, EGFR, NOTCH1, PIK3R1, MUC16, and CIC mutations, along with age at diagnosis and race. XGBoost achieved the highest AUC (0.93), while Logistic Regression recorded the highest testing accuracy (88.09%). Class distribution analysis revealed excellent GBM detection (Average Precision 0.840–0.880) with minimal false negatives (5–7 cases). Calibration analysis demonstrated reliable probability estimates (Brier scores 0.103–0.124), and decision curve analysis confirmed substantial clinical utility with net benefit values of 0.36–0.39 across clinically relevant thresholds. Conclusions: The integration of feature selection techniques with machine learning models enhances diagnostic precision, interpretability, and clinical utility in glioma classification, providing a clinically ready framework that bridges computational predictions with evidence-based medical decision-making.
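
The Brier scores and net benefit values reported above follow standard definitions, which can be sketched as below. This is an illustrative reading of the two metrics, not the authors' evaluation code; the function names, the 0.5 threshold, and the toy inputs are assumptions.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def net_benefit(probs, labels, threshold):
    """Decision-curve net benefit at a probability threshold pt.

    Standard form: TP/N - (FP/N) * pt / (1 - pt), where a case is treated as
    positive when its predicted probability is at least pt.
    """
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - (fp / n) * (threshold / (1 - threshold))


# Hypothetical toy predictions, not taken from the paper.
toy_probs = [0.9, 0.8, 0.3, 0.6]
toy_labels = [1, 1, 0, 0]
toy_brier = brier_score(toy_probs, toy_labels)
toy_nb = net_benefit(toy_probs, toy_labels, threshold=0.5)
```

Lower Brier scores indicate better-calibrated probabilities, while a higher net benefit at a given threshold indicates more true positives gained per weighted false positive.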

12 pages, 1637 KiB

Open Access Article

Identification of a New Lung Cancer Biomarker Signature Using Data Mining and Preliminary In Vitro Validation

by Ferid Ben Ali, Denis Mustafov, Maria Braoudaki, Sola Adeleke and Iosif Mporas

BioMedInformatics 2025, 5(2), 32; https://doi.org/10.3390/biomedinformatics5020032 - 11 Jun 2025

Viewed by 741

Abstract

Background: Lung adenocarcinoma is one of the major subtypes of non-small cell lung cancer, and identifying biomarkers is essential for early diagnosis. This study aims to identify potential biomarkers for lung adenocarcinoma through in silico and preliminary in vitro analysis. Methods: Bioinformatics analysis, in parallel with data mining analysis, was performed on microarray data from lung adenocarcinoma samples to identify potent gene biomarkers associated with this lung cancer type. These genes were then validated in vitro using RT-qPCR analysis in cancerous (Calu-3) and non-cancerous (MRC-5) cell lines. Moreover, these genes were used in machine learning-based analysis to classify lung adenocarcinoma samples from controls. The analysis includes three experiments: the bioinformatic (in silico), in vitro, and machine learning analyses. Results: The three experiments identified four genes, namely, SLC15A1, GPR123 (ADGRA1), KCNAB2, and KNDC1, as key biomarkers and the most relevant gene features for distinguishing lung adenocarcinoma from control. Conclusions: This study identifies four biomarkers associated with lung adenocarcinoma through bioinformatics, in vitro and machine learning analyses. These four genes show strong potential for further investigation in clinical research.

20 pages, 2269 KiB

Open Access Article

Voice as a Health Indicator: The Use of Sound Analysis and AI for Monitoring Respiratory Function

by Nicki Lentz-Nielsen, Lars Maaløe, Pascal Madeleine and Stig Nikolaj Blomberg

BioMedInformatics 2025, 5(2), 31; https://doi.org/10.3390/biomedinformatics5020031 - 7 Jun 2025

Viewed by 1209

Abstract

Background: Chronic obstructive pulmonary disease (COPD) is projected to be the third-leading cause of death by 2030. Traditional spirometry for the monitoring of the forced expiratory volume in one second (FEV1) can provoke discomfort and anxiety. This study aimed to validate AI models using daily audio recordings as an alternative for FEV1 estimation in home settings. Methods: Twenty-three participants with moderate to severe COPD recorded daily audio readings of standardized texts and measured their FEV1 using spirometry over nine months. Participants also recorded biomarkers (heart rate, temperature, oxygen saturation) via a tablet application. Various machine learning models were trained using acoustic features extracted from 2053 recordings, with K-nearest neighbors, random forest, XGBoost, and linear models evaluated using 10-fold cross-validation. Results: The K-nearest neighbors model achieved a root mean square error of 174 mL/s on the validation data. The limit of agreement (LoA) ranged from −333.21 to 347.26 mL/s. Despite an error range of −1252 to 1435 mL/s, most predictions fell within the LoA, indicating good performance in estimating the FEV1. Conclusions: The predictive model showed promising results, with a narrower LoA compared to traditional unsupervised spirometry methods. The AI models effectively used audio to predict the FEV1, suggesting a viable non-invasive approach for COPD monitoring that could enhance patient comfort and accessibility in home settings.
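
The root mean square error and limits of agreement mentioned above can be illustrated with a short sketch. It assumes the usual Bland-Altman form (mean difference ± 1.96 sample standard deviations), which the abstract does not confirm; the function names and the toy values are hypothetical.

```python
import math
import statistics


def rmse(predicted, measured):
    """Root mean square error between predicted and reference values."""
    return math.sqrt(
        sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(measured)
    )


def limits_of_agreement(predicted, measured):
    """Bland-Altman 95% limits: mean difference +/- 1.96 * sample SD of the differences."""
    diffs = [p - m for p, m in zip(predicted, measured)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical toy FEV1 values in the abstract's units, not data from the study.
toy_predicted = [2100, 1900, 2050, 1950]
toy_measured = [2000, 2000, 2000, 2000]
toy_rmse = rmse(toy_predicted, toy_measured)
toy_lower, toy_upper = limits_of_agreement(toy_predicted, toy_measured)
```

The RMSE summarises the typical size of an error, whereas the limits of agreement describe the interval in which roughly 95% of individual differences are expected to fall if the differences are approximately normal.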

22 pages, 2518 KiB

Open Access Article

Anticancer Effects of Pleurotus salmoneostramineus Protein Hydrolysate on HepG2 Cells and In Silico Characterization of Structural Effects of Chromoprotein-Derived Peptides on the Mitochondrial Uncoupling Protein 2 (UCP2)

by Erica K. Ventura-García, Mónica A. Valdez-Solana, Claudia Avitia-Domínguez, Guadalupe García-Arenas, Alfredo Téllez-Valencia, Nagamani Balagurusamy and Erick Sierra-Campos

BioMedInformatics 2025, 5(2), 29; https://doi.org/10.3390/biomedinformatics5020029 - 26 May 2025

Viewed by 1446

Abstract

Background: Pleurotus salmoneostramineus is acknowledged as a reliable source of high-quality protein, with its protein concentrates, hydrolysates, and peptides potentially offering health benefits to humans. However, studies validating the medicinal effects of P. salmoneostramineus proteins, particularly the pink chromoprotein, are currently absent. Methods: This study explores anticancer peptides from the chromoprotein of P. salmoneostramineus, evaluating their ability to bind UCP2 via in silico analysis. Additionally, it assesses the effect of the P. salmoneostramineus protein hydrolysate (PSPs) on HepG2 cell proliferation and mitochondrial metabolism, focusing on uncoupling protein activity. Results: Eight peptides were identified as potential UCP2 inhibitors. According to the mACPpred2.0 and CSM-peptides servers, the peptides TSMQSSL, QEGQKL, SEDSGEA, and GRNSL exhibit promising anticancer properties. These anticancer peptides yielded the following docking scores (kcal/mol) when tested against UCP2: TSMQSSL (−166.75), QEGQKL (−126.06), SEDSGEA (−99.93), and GRNSL (−137.93). Molecular dynamics simulations showed that the peptides establish stable interactions with UCP2 through salt bridges, hydrophobic interactions, and hydrogen bonds, implying that hydrogen bonding with RRR88 and FVW92 causes conformational changes in UCP2. Moreover, the outcomes of this study indicated that PSPs possess an antiproliferative effect on HepG2 cells and lower mitochondrial bioenergetics, especially UCP2 activity. Conclusions: These findings suggest that peptides from P. salmoneostramineus can inhibit UCP2, offering a promising approach for cancer prevention, playing therapeutic roles in treatment, and providing a basis for designing peptide-based cancer therapies.

17 pages, 1852 KiB

Open Access Article

A Tutorial Toolbox to Simplify Bioinformatics and Biostatistics Analyses of Microbial Omics Data in an Island Context

by Isaure Quétel, Sourakhata Tirera, Damien Cazenave, Nina Allouch, Chloé Baum, Yann Reynaud, Degrâce Batantou Mabandza, Virginie Nerrière, Serge Vedy, Matthieu Pot, Sébastien Breurec, Anne Lavergne, Séverine Ferdinand, Vincent Guerlais and David Couvin

BioMedInformatics 2025, 5(2), 27; https://doi.org/10.3390/biomedinformatics5020027 - 19 May 2025

Viewed by 1254

Abstract

Background: Bioinformatics is increasingly used in various scientific works. Large amounts of heterogeneous data are being generated these days, and it is difficult to interpret and analyze these data effectively. Several software tools have been developed to facilitate the handling and analysis of biological data, based on specific needs. Methods: The Galaxy web platform is one of these software tools, allowing free access to users and facilitating the use of thousands of tools. Other software tools, such as Bioconda or Jupyter Notebook, facilitate the installation of tools and their dependencies. In addition to these tools, RStudio can be mentioned as a powerful interface that facilitates the use of the R programming language for data analysis and statistics. Results: The aim of this study is to provide the scientific community with guides on how to perform bioinformatics/biostatistical analyses in a simpler manner. With this work, we also try to democratize well-documented software tools to make them suitable for both bioinformaticians and non-bioinformaticians. We believe that user-friendly guides and real-life/concrete examples will provide end-users with suitable and easy-to-use methods for their bioinformatics analysis needs. Furthermore, tutorials and usage examples are available on our dedicated GitHub repository. Conclusions: These tutorials/examples (in English and/or French) could be used as pedagogical tools to promote bioinformatics analysis and offer potential solutions to several bioinformatics needs. Special emphasis is placed on microbial omics data analysis.

16 pages, 1470 KiB

Open Access Article

Decision Trees for the Analysis of Gene Expression Levels of COVID-19: An Association with Alzheimer’s Disease

by Jesús Alberto Torres-Sosa, Gonzalo Emiliano Aranda-Abreu, Nicandro Cruz-Ramírez and Sonia Lilia Mestizo-Gutiérrez

BioMedInformatics 2025, 5(2), 26; https://doi.org/10.3390/biomedinformatics5020026 - 9 May 2025

Viewed by 1105

Abstract

COVID-19 has caused millions of deaths around the world. The respiratory system is the main target of this disease, but it has also been reported to attack the central nervous system, creating a neuroinflammatory environment with the release of proinflammatory cytokines. There are several studies suggesting a possible relationship between Alzheimer’s disease and COVID-19. Therefore, in this study, machine learning microarray analysis was performed to identify key genes in COVID-19 that may be associated with Alzheimer’s disease. The dataset is identified as GSE177477, containing 47 samples. The Bioconductor oligo package in the RStudio (version 4.3.3) environment was used to process and normalize the data. Subsequently, one-way ANOVA was used to obtain differentially expressed genes. We used decision tree generation to classify the 47 samples. The study identified 1856 differentially expressed genes. Three decision trees were generated, in which three genes (DNAJC16, TREM1, and UCP2) were identified that differentiated patients. The best decision tree obtained an accuracy of 72.34%, with a sensitivity of 72.34% and a specificity of 86.17%. The genes identified with the decision trees may be involved in processes like those of Alzheimer’s disease, such as inflammation, amyloid pathologies, and processes related to type 2 diabetes mellitus.
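
Accuracy, sensitivity, and specificity such as those reported above are derived from a confusion matrix. The abstract does not give the matrix or the number of classes, so the sketch below uses a hypothetical three-class matrix and one-vs-rest rates; the function names are illustrative, and the code assumes every class has at least one true sample and at least one true negative.

```python
def accuracy(confusion):
    """Fraction of samples on the diagonal of confusion[true][predicted]."""
    total = sum(sum(row) for row in confusion)
    return sum(confusion[i][i] for i in range(len(confusion))) / total


def per_class_rates(confusion):
    """One-vs-rest (sensitivity, specificity) for each class.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    rates = []
    for c in range(k):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # true c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(k)) - tp  # predicted c, truly otherwise
        tn = total - tp - fn - fp
        rates.append((tp / (tp + fn), tn / (tn + fp)))
    return rates


# Hypothetical confusion matrix, not the one from the study.
toy_confusion = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]
toy_accuracy = accuracy(toy_confusion)
toy_rates = per_class_rates(toy_confusion)
```

Averaging the per-class values gives macro sensitivity and specificity, which is one common way to summarise a multi-class decision tree with single figures.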

23 pages, 3289 KiB

Open Access Article

Performance Comparison of Large Language Models for Efficient Literature Screening

by Maria Teresa Colangelo, Stefano Guizzardi, Marco Meleti, Elena Calciolari and Carlo Galli

BioMedInformatics 2025, 5(2), 25; https://doi.org/10.3390/biomedinformatics5020025 - 7 May 2025

Viewed by 1804

Abstract

Background: Systematic reviewers face a growing body of biomedical literature, making early-stage article screening increasingly time-consuming. In this study, we assessed six large language models (LLMs), namely OpenHermes, Flan T5, GPT-2, Claude 3 Haiku, GPT-3.5 Turbo, and GPT-4o, for their ability to identify randomized controlled trials (RCTs) in datasets of increasing difficulty. Methods: We first retrieved articles from PubMed and used all-mpnet-base-v2 to measure semantic similarity to known target RCTs, stratifying the collection into quartiles of descending relevance. Each LLM then received either verbose or concise prompts to classify articles as “Accepted” or “Rejected”. Results: Claude 3 Haiku, GPT-3.5 Turbo, and GPT-4o consistently achieved high recall, though their precision varied in the quartile with the highest similarity, where false positives increased. By contrast, smaller or older models struggled to balance sensitivity and specificity, with some over-including irrelevant studies or missing key articles. Importantly, multi-stage prompts did not guarantee performance gains for weaker models, whereas single-prompt approaches proved effective for advanced LLMs. Conclusions: These findings underscore that both model capability and prompt design strongly affect classification outcomes, suggesting that newer LLMs, if properly guided, can substantially expedite systematic reviews.
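
The stratification step described above (ranking articles by semantic similarity and splitting them into quartiles) can be sketched as follows. The sketch assumes cosine similarity against a single target embedding and uses tiny hand-made vectors in place of real all-mpnet-base-v2 embeddings; the function names and data are hypothetical, and the authors' exact procedure may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def stratify_into_quartiles(embeddings, target):
    """Rank article ids by similarity to the target (descending) and split into four groups.

    embeddings: dict mapping an article id to its embedding vector.
    When the count is not divisible by four, the earlier groups get the extra items.
    """
    ranked = sorted(
        embeddings,
        key=lambda article_id: cosine_similarity(embeddings[article_id], target),
        reverse=True,
    )
    size, extra = divmod(len(ranked), 4)
    quartiles, start = [], 0
    for q in range(4):
        end = start + size + (1 if q < extra else 0)
        quartiles.append(ranked[start:end])
        start = end
    return quartiles


# Hypothetical 2-D stand-ins for sentence embeddings.
toy_target = [1.0, 0.0]
toy_embeddings = {
    "a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.8, 0.2], "d": [0.7, 0.3],
    "e": [0.6, 0.4], "f": [0.5, 0.5], "g": [0.4, 0.6], "h": [0.3, 0.7],
}
toy_quartiles = stratify_into_quartiles(toy_embeddings, toy_target)
```

The first group then holds the articles most similar to the target, which according to the abstract is where the stronger models produced more false positives.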

19 pages, 6148 KiB

Open Access Article

Subject-Independent Cuff-Less Blood Pressure Monitoring via Multivariate Analysis of Finger/Toe Photoplethysmography and Electrocardiogram Data

by Seyedmohsen Dehghanojamahalleh, Peshala Thibbotuwawa Gamage, Mohammad Ahmed, Cassondra Petersen, Brianna Matthew, Kesha Hyacinth, Yasith Weerasinghe, Ersoy Subasi, Munevver Mine Subasi and Mehmet Kaya

BioMedInformatics 2025, 5(2), 24; https://doi.org/10.3390/biomedinformatics5020024 - 4 May 2025

Viewed by 799

Abstract

(1) Background: Blood pressure (BP) variability is an important risk factor for cardiovascular diseases, yet existing BP monitoring methods often require periodic cuff-based measurements, raising concerns about their accuracy and convenience. This study aims to develop a subject-independent, cuff-less BP estimation method using finger and toe photoplethysmography (PPG) signals combined with an electrocardiogram (ECG) without the need for an initial cuff-based measurement. (2) Methods: A customized measurement system was used to record 80 readings from human subjects. Fifteen features with the highest dependency on the reference BP, including time and morphological characteristics of PPG and subject information, were analyzed. A multivariate regression model was employed to estimate BP. (3) Results: The results showed that incorporating toe PPG signals improved the accuracy of BP estimation, reducing the mean absolute error (MAE). Using both finger and toe PPG signals resulted in an MAE of 9.63 ± 12.54 mmHg for systolic BP and 6.76 ± 8.38 mmHg for diastolic BP, providing the lowest MAE compared to previous methods. (4) Conclusions: This study is the first to integrate toe PPG for more accurate BP estimation and proposes a method that does not require an initial cuff-based BP measurement, offering a promising approach for non-invasive, continuous BP monitoring.

Search Results (102)
