BioMedInformatics, Volume 4, Issue 4 (December 2024) – 17 articles

Cover Story: Medical Imaging (MI) is crucial for diagnosing, planning, and monitoring diseases. While AI models, such as machine and deep learning, show promise in supporting MI analysis, their black-box nature raises concerns in critical medical applications. Explainable AI (XAI) addresses these issues by promoting transparency and trustworthiness. Traditional post hoc methods, like saliency maps, offer insights but have often been criticized for providing unfaithful and incomplete explanations. This has raised interest in interpretability-by-design models, such as Part-Prototype (PP) networks. These models classify images using a case-based reasoning approach, identifying prototypical parts representative of specific classes. This review delves into PP networks, their applications in MI, and open challenges.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view a paper in PDF format, click on the "PDF Full-text" link and use the free Adobe Reader to open it.
26 pages, 2762 KiB  
Article
Uncovering the Diagnostic Power of Radiomic Feature Significance in Automated Lung Cancer Detection: An Integrative Analysis of Texture, Shape, and Intensity Contributions
by Sotiris Raptis, Christos Ilioudis and Kiki Theodorou
BioMedInformatics 2024, 4(4), 2400-2425; https://doi.org/10.3390/biomedinformatics4040129 - 18 Dec 2024
Cited by 1 | Viewed by 1505
Abstract
Background: Lung cancer remains the leading cause of cancer death worldwide, and early detection substantially improves patient survival. Standard diagnostic methods are insensitive, especially in the early stages. This paper discusses radiomic features that can improve diagnostic accuracy in automated lung cancer detection, considering the key feature categories of texture, shape, and intensity extracted from CT DICOM images. Methods: We developed and compared the performance of two machine learning models—a DenseNet-201 CNN and XGBoost—trained on radiomic features to distinguish malignant tumors from benign ones. Feature importance was analyzed using SHAP and permutation importance techniques, which enhance both the global and case-specific interpretability of the models. Results: Features reflecting tumor heterogeneity and morphology, including GLCM entropy, shape compactness, and surface-area-to-volume ratio, performed best in diagnosis, with DenseNet-201 reaching an accuracy of 92.4% and XGBoost 89.7%. The feature interpretability analysis supports their potential for early detection and for boosting diagnostic confidence. Conclusions: This work identifies the most important radiomic features and quantifies their diagnostic significance through a feature selection process supported by stability analysis, providing a blueprint for feature-driven model interpretability in clinical applications. Radiomic features have great value in the automated diagnosis of lung cancer, especially when combined with machine learning models, and may improve early detection and enable personalized diagnostic strategies for precision oncology. Full article
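The feature-importance step described in this abstract can be illustrated with a short sketch (an illustration only, not the authors' code; the file name and column names are hypothetical): train a gradient-boosted classifier on a radiomic feature table, then rank features with SHAP and permutation importance.

```python
# Hypothetical radiomics table with texture/shape/intensity columns and a binary label.
import pandas as pd
import shap
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

df = pd.read_csv("radiomics_features.csv")            # assumed file
X, y = df.drop(columns="malignant"), df["malignant"]  # assumed label column
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = XGBClassifier(n_estimators=300, max_depth=4, eval_metric="logloss")
model.fit(X_tr, y_tr)

# Global and per-case attributions from a tree explainer (SHAP).
shap_values = shap.TreeExplainer(model).shap_values(X_te)

# Model-agnostic cross-check: permutation importance on the held-out split.
perm = permutation_importance(model, X_te, y_te, n_repeats=20, random_state=0)
ranking = pd.Series(perm.importances_mean, index=X.columns).sort_values(ascending=False)
print(ranking.head(10))  # e.g., GLCM entropy, compactness, surface-to-volume ratio
```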
26 pages, 6531 KiB  
Article
Analysis of Regions of Homozygosity: Revisited Through New Bioinformatic Approaches
by Susana Valente, Mariana Ribeiro, Jennifer Schnur, Filipe Alves, Nuno Moniz, Dominik Seelow, João Parente Freixo, Paulo Filipe Silva and Jorge Oliveira
BioMedInformatics 2024, 4(4), 2374-2399; https://doi.org/10.3390/biomedinformatics4040128 - 16 Dec 2024
Cited by 2 | Viewed by 1908
Abstract
Background: Runs of homozygosity (ROHs), continuous homozygous regions across the genome, are often linked to consanguinity, with their size and frequency reflecting shared parental ancestry. Homozygosity mapping (HM) leverages ROHs to identify genes associated with autosomal recessive diseases. Whole-exome sequencing (WES) improves HM by detecting both ROHs and disease-causing variants. Methods: To streamline personalized multigene panel creation using WES and ROHs, we developed a methodology integrating the ROHMMCLI and HomozygosityMapper algorithms and, optionally, Human Phenotype Ontology (HPO) terms, implemented in a Django web application. Using a dataset of 12,167 WES samples, we performed the first ROH profiling of the Portuguese population. Clustering models were applied to predict consanguinity from ROH features. Results: These resources were applied to the genetic characterization of two siblings with epilepsy, myoclonus, and dystonia, pinpointing CSTB as the disease-causing gene. Using the 2021 Census population distribution, we created a representative sample (3941 WES) and measured genome-wide autozygosity (F_ROH). The Portalegre, Viseu, Bragança, Madeira, and Vila Real districts presented the highest F_ROH scores. Multidimensional scaling showed that ROH count and sum were key predictors of consanguinity, achieving a test F1-score of 0.96 with additional features. Conclusions: This study contributes new bioinformatics tools for ROH analysis in a clinical setting and provides unprecedented population-level ROH data for Portugal. Full article
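As a simple illustration of the autozygosity measure mentioned above (a sketch under assumptions, not the authors' pipeline; the genome-length constant is approximate), F_ROH for one sample is the fraction of the autosomal genome covered by detected ROH segments.

```python
# F_ROH = (sum of ROH segment lengths) / (total autosomal genome length).
AUTOSOMAL_LENGTH_BP = 2_880_000_000  # approximate human autosomal length; assumption

def f_roh(roh_segments, genome_length=AUTOSOMAL_LENGTH_BP):
    """roh_segments: iterable of (chrom, start, end) tuples for one sample."""
    covered = sum(end - start for _, start, end in roh_segments)
    return covered / genome_length

# Toy example with two ROH segments; ROH count and sum are the features
# highlighted as consanguinity predictors in the abstract.
sample_rohs = [("chr2", 10_000_000, 25_000_000), ("chr7", 5_000_000, 9_000_000)]
roh_count = len(sample_rohs)
roh_sum = sum(end - start for _, start, end in sample_rohs)
print(f"ROH count={roh_count}, ROH sum={roh_sum:,} bp, F_ROH={f_roh(sample_rohs):.4f}")
```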
36 pages, 25720 KiB  
Article
Early Breast Cancer Detection Based on Deep Learning: An Ensemble Approach Applied to Mammograms
by Youness Khourdifi, Alae El Alami, Mounia Zaydi, Yassine Maleh and Omar Er-Remyly
BioMedInformatics 2024, 4(4), 2338-2373; https://doi.org/10.3390/biomedinformatics4040127 - 13 Dec 2024
Cited by 4 | Viewed by 3681
Abstract
Background: Breast cancer is one of the leading causes of death in women, making early detection through mammography crucial for improving survival rates. However, human interpretation of mammograms is often prone to diagnostic errors. This study addresses the challenge of improving the accuracy of breast cancer detection by leveraging advanced machine learning techniques. Methods: We propose an extended ensemble deep learning model that integrates three state-of-the-art convolutional neural network (CNN) architectures: VGG16, DenseNet121, and InceptionV3. The model utilizes multi-scale feature extraction to enhance the detection of both benign and malignant masses in mammograms. This ensemble approach is evaluated on two benchmark datasets: INbreast and CBIS-DDSM. Results: The proposed ensemble model achieved significant performance improvements. On the INbreast dataset, the ensemble model attained an accuracy of 90.1%, recall of 88.3%, and an F1-score of 89.1%. For the CBIS-DDSM dataset, the model reached 89.5% accuracy and 90.2% specificity. The ensemble method outperformed each individual CNN model, reducing both false positives and false negatives, thereby providing more reliable diagnostic results. Conclusions: The ensemble deep learning model demonstrated strong potential as a decision support tool for radiologists, offering more accurate and earlier detection of breast cancer. By leveraging the complementary strengths of multiple CNN architectures, this approach can improve clinical decision making and enhance the accessibility of high-quality breast cancer screening. Full article
(This article belongs to the Topic Computational Intelligence and Bioinformatics (CIB))
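The ensembling idea described in this abstract can be sketched as soft voting over the three CNNs (an illustrative sketch, not the published model; the model objects and image batch are assumed to exist and to be preprocessed appropriately).

```python
import numpy as np

def ensemble_predict(models, images):
    """Soft-voting ensemble: average each model's predicted class probabilities."""
    probs = np.stack([m.predict(images, verbose=0) for m in models], axis=0)
    return probs.mean(axis=0)

# Assumed fine-tuned Keras models and a preprocessed mammogram batch:
# mean_probs = ensemble_predict([vgg16_model, densenet121_model, inception_v3_model], images)
# predicted_class = mean_probs.argmax(axis=1)  # 0 = benign, 1 = malignant (assumed encoding)
```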
17 pages, 860 KiB  
Review
Artificial Intelligence in Wound Care: A Narrative Review of the Currently Available Mobile Apps for Automatic Ulcer Segmentation
by Davide Griffa, Alessio Natale, Yuri Merli, Michela Starace, Nico Curti, Martina Mussi, Gastone Castellani, Davide Melandri, Bianca Maria Piraccini and Corrado Zengarini
BioMedInformatics 2024, 4(4), 2321-2337; https://doi.org/10.3390/biomedinformatics4040126 - 11 Dec 2024
Cited by 4 | Viewed by 5364
Abstract
Introduction: Chronic ulcers significantly burden healthcare systems, requiring precise measurement and assessment for effective treatment. Traditional methods, such as manual segmentation, are time-consuming and error-prone. This review evaluates the potential of artificial intelligence (AI)-powered mobile apps for automated ulcer segmentation and their application in clinical settings. Methods: A comprehensive literature search was conducted across the PubMed, CINAHL, Cochrane, and Google Scholar databases. The review focused on mobile apps that use fully automatic AI algorithms for wound segmentation. Apps requiring additional hardware or lacking sufficient technical documentation were excluded. Key technological features, clinical validation, and usability were analysed. Results: Ten mobile apps were identified, showing varying levels of segmentation accuracy and clinical validation. However, many apps did not publish sufficient information on the segmentation methods or algorithms used, and most lacked details on the databases employed for training their AI models. Additionally, several apps were unavailable in public repositories, limiting their accessibility and independent evaluation. These factors challenge their integration into clinical practice despite promising preliminary results. Discussion: AI-powered mobile apps offer significant potential for improving wound care by enhancing diagnostic accuracy and reducing the burden on healthcare professionals. Nonetheless, the lack of transparency regarding segmentation techniques, unpublished training databases, and the limited availability of many apps in public repositories remain substantial barriers to widespread clinical adoption. Conclusions: AI-driven mobile apps for ulcer segmentation could revolutionise chronic wound management, but overcoming limitations related to transparency, data availability, and accessibility is essential for their successful integration into healthcare systems. Full article
(This article belongs to the Section Imaging Informatics)
12 pages, 1808 KiB  
Article
Implementation of Automatic Segmentation Framework as Preprocessing Step for Radiomics Analysis of Lung Anatomical Districts
by Alessandro Stefano, Fabiano Bini, Nicolò Lauciello, Giovanni Pasini, Franco Marinozzi and Giorgio Russo
BioMedInformatics 2024, 4(4), 2309-2320; https://doi.org/10.3390/biomedinformatics4040125 - 11 Dec 2024
Cited by 1 | Viewed by 1504
Abstract
Background: The advent of artificial intelligence has significantly impacted radiology, with radiomics emerging as a transformative approach that extracts quantitative data from medical images to improve diagnostic and therapeutic accuracy. This study aimed to enhance the radiomic workflow by applying deep learning, through transfer learning, for the automatic segmentation of lung regions in computed tomography scans as a preprocessing step. Methods: Leveraging a pipeline articulated in (i) patient-based data splitting, (ii) intensity normalization, (iii) voxel resampling, (iv) bed removal, (v) contrast enhancement and (vi) model training, a DeepLabV3+ convolutional neural network (CNN) was fine tuned to perform whole-lung-region segmentation. Results: The trained model achieved high accuracy, Dice coefficient (0.97) and BF (93.06%) scores, and it effectively preserved lung region areas and removed confounding anatomical regions such as the heart and the spine. Conclusions: This study introduces a deep learning framework for the automatic segmentation of lung regions in CT images, leveraging an articulated pipeline and demonstrating excellent performance of the model, effectively isolating lung regions while excluding confounding anatomical structures. Ultimately, this work paves the way for more efficient, automated preprocessing tools in lung cancer detection, with potential to significantly improve clinical decision making and patient outcomes. Full article
(This article belongs to the Section Imaging Informatics)
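For reference, the Dice coefficient reported in this abstract can be computed for a predicted lung mask as follows (a generic NumPy definition, not the authors' evaluation code).

```python
import numpy as np

def dice_coefficient(pred_mask, true_mask, eps=1e-7):
    """Dice = 2*|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)

# Example with toy 2x2 masks (1 = lung, 0 = background); overlap of 1 voxel out of 3 gives ~0.667.
print(dice_coefficient(np.array([[1, 1], [0, 0]]), np.array([[1, 0], [0, 0]])))
```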
3 pages, 189 KiB  
Editorial
Should We Expect a Second Wave of AlphaFold Misuse After the Nobel Prize?
by Alexandre G. de Brevern
BioMedInformatics 2024, 4(4), 2306-2308; https://doi.org/10.3390/biomedinformatics4040124 - 6 Dec 2024
Viewed by 1574
Abstract
AlphaFold (AF) was the first deep learning tool to achieve exceptional fame in the field of biology [...] Full article
19 pages, 2807 KiB  
Article
Quantifying Lenition as a Diagnostic Marker for Parkinson’s Disease and Atypical Parkinsonism
by Ratree Wayland, Rachel Meyer, Ruhi Reddy, Kevin Tang and Karen W. Hegland
BioMedInformatics 2024, 4(4), 2287-2305; https://doi.org/10.3390/biomedinformatics4040123 - 29 Nov 2024
Viewed by 1709
Abstract
Objective: This study aimed to evaluate lenition, a phonological process involving consonant weakening, as a diagnostic marker for differentiating Parkinson’s Disease (PD) from Atypical Parkinsonism (APD). Early diagnosis is critical for optimizing treatment outcomes, and lenition patterns in stop consonants may provide valuable insights into the distinct motor speech impairments associated with these conditions. Methods: Using Phonet, a machine learning model trained to detect phonological features, we analyzed the posterior probabilities of continuant and sonorant features from the speech of 142 participants (108 PD, 34 APD). Lenition was quantified based on deviations from expected values, and linear mixed-effects models were applied to compare phonological patterns between the two groups. Results: PD patients exhibited more stable articulatory patterns, particularly in preserving the contrast between voiced and voiceless stops. In contrast, APD patients showed greater lenition, particularly in voiceless stops, coupled with increased articulatory variability, reflecting a more generalized motor deficit. Conclusions: Lenition patterns, especially in voiceless stops, may serve as non-invasive markers for distinguishing PD from APD. These findings suggest potential applications in early diagnosis and tracking disease progression. Future research should expand the analysis to include a broader range of phonological features and contexts to improve diagnostic accuracy. Full article
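A mixed-effects comparison like the one described could look roughly like this (a sketch with assumed file and variable names, not the study's code), with speaker treated as a random effect.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format table: one row per stop-consonant token.
# Columns: speaker, group (PD/APD), voicing (voiced/voiceless), continuant_prob (Phonet posterior).
df = pd.read_csv("lenition_tokens.csv")

# Fixed effects: diagnostic group, voicing, and their interaction; random intercept per speaker.
model = smf.mixedlm("continuant_prob ~ group * voicing", data=df, groups=df["speaker"])
result = model.fit()
print(result.summary())
```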
16 pages, 4205 KiB  
Article
A Network Analysis Approach to Detect and Differentiate Usher Syndrome Types Using miRNA Expression Profiles: A Pilot Study
by Rama Krishna Thelagathoti, Wesley A. Tom, Chao Jiang, Dinesh S. Chandel, Gary Krzyzanowski, Appolinaire Olou and Rohan M. Fernando
BioMedInformatics 2024, 4(4), 2271-2286; https://doi.org/10.3390/biomedinformatics4040122 - 26 Nov 2024
Cited by 2 | Viewed by 1130
Abstract
Background: Usher syndrome (USH) is a rare genetic disorder that affects both hearing and vision. It presents in three clinical types—USH1, USH2, and USH3—with varying onset, severity, and disease progression. Existing diagnostics primarily rely on genetic profiling to identify variants in USH genes; however, accurate detection before symptom onset remains a challenge. MicroRNAs (miRNAs), which regulate gene expression, have been identified as potential biomarkers for disease. The aim of this study is to develop a data-driven system for the identification of USH using miRNA expression profiles. Methods: We collected microarray miRNA-expression data from 17 samples, representing four patient-derived USH cell lines and a non-USH control. Supervised feature selection was utilized to identify key miRNAs that differentiate USH cell lines from the non-USH control. Subsequently, a network model was constructed by measuring pairwise correlations based on these identified features. Results: The proposed system effectively distinguished between control and USH samples, demonstrating high accuracy. Additionally, the model could differentiate between the three USH types, reflecting its potential and sensitivity beyond the primary identification of affected subjects. Conclusions: This approach can be used to detect USH and differentiate between USH subtypes, suggesting its potential as a future base model for the identification of Usher syndrome. Full article
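The network-construction step can be sketched as follows (illustrative only; the expression file, correlation measure, and threshold are assumptions): compute pairwise correlations between samples over the selected miRNAs and connect highly correlated samples.

```python
import pandas as pd
import networkx as nx

expr = pd.read_csv("mirna_expression.csv", index_col=0)  # rows: selected miRNAs, columns: samples
corr = expr.corr(method="pearson")                        # sample-by-sample correlation matrix

G = nx.Graph()
G.add_nodes_from(corr.columns)
threshold = 0.9  # illustrative cutoff
for i, a in enumerate(corr.columns):
    for b in corr.columns[i + 1:]:
        if corr.loc[a, b] >= threshold:
            G.add_edge(a, b, weight=float(corr.loc[a, b]))

# Connected components can then be inspected for grouping of control vs. USH1/USH2/USH3 samples.
print(list(nx.connected_components(G)))
```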
20 pages, 578 KiB  
Review
Systematic Review of Deep Learning Techniques in Skin Cancer Detection
by Carolina Magalhaes, Joaquim Mendes and Ricardo Vardasca
BioMedInformatics 2024, 4(4), 2251-2270; https://doi.org/10.3390/biomedinformatics4040121 - 14 Nov 2024
Cited by 4 | Viewed by 3124
Abstract
Skin cancer is a serious health condition, as it can locally evolve into disfiguring states or metastasize to different tissues. Early detection of this disease is critical because it increases the effectiveness of treatment, which contributes to improved patient prognosis and reduced healthcare costs. Visual assessment and histopathological examination are the gold standards for diagnosing these types of lesions. Nevertheless, these processes are strongly dependent on dermatologists' experience, with excision advised only when cancer is suspected by a physician. Multiple approaches have surfaced over the last few years, particularly those based on deep learning (DL) strategies, with the goal of assisting medical professionals in the diagnostic process and ultimately diminishing diagnostic uncertainty. This systematic review focused on the analysis of relevant studies based on DL applications for skin cancer diagnosis. The qualitative assessment included 164 records relevant to the topic. The AlexNet, ResNet-50, VGG-16, and GoogLeNet architectures are considered the top choices for obtaining the best classification results, and multiclass classification approaches are the current trend. Public databases are key elements in this area and should be maintained and improved to facilitate scientific research. Full article
28 pages, 5036 KiB  
Article
Optimal Feature Selection and Classification for Parkinson’s Disease Using Deep Learning and Dynamic Bag of Features Optimization
by Aarti, Swathi Gowroju, Mst Ismat Ara Begum and A. S. M. Sanwar Hosen
BioMedInformatics 2024, 4(4), 2223-2250; https://doi.org/10.3390/biomedinformatics4040120 - 12 Nov 2024
Cited by 1 | Viewed by 1874
Abstract
Parkinson's Disease (PD) is a neurological condition that worsens with time and is characterized by symptoms such as cognitive impairment, bradykinesia, stiffness, and tremors. PD is attributed to the impairment of the brain cells responsible for producing dopamine, a substance that regulates communication between brain cells; these dopamine-producing cells support adaptation, control, and smooth movement. Convolutional Neural Networks are used to extract distinctive visual characteristics from numerous graphomotor sample representations generated by both PD and control participants. The proposed method presents an optimal feature selection technique based on Deep Learning (DL) and the Dynamic Bag of Features Optimization Technique (DBOFOT). Our method combines neural network-based feature extraction with a strong optimization technique to dynamically choose the most relevant characteristics from biological data. Advanced DL architectures are then used to classify the chosen features, ensuring high computational efficiency and accuracy. The framework's adaptability to different datasets further highlights its versatility and potential for further medical applications. With an accuracy of 0.93, the model correctly classifies 93% of cases overall. It has a recall of 0.89 for the Parkinson's class, meaning that 89% of real Parkinson's patients are correctly identified. For Class 0 (Healthy), the recall is 0.75, meaning that 75% of the real healthy cases are properly categorized, while the precision drops to 0.64 for this class, indicating a larger false positive rate. Full article
10 pages, 243 KiB  
Article
Association Between Social Determinants of Health and Patient Portal Utilization in the United States
by Elizabeth Ayangunna, Gulzar H. Shah, Hani Samawi, Kristie C. Waterfield and Ana M. Palacios
BioMedInformatics 2024, 4(4), 2213-2222; https://doi.org/10.3390/biomedinformatics4040119 - 12 Nov 2024
Cited by 1 | Viewed by 1531
Abstract
(1) Background: Differences in health outcomes across populations are due to disparities in access to the social determinants of health (SDoH), such as educational level, household income, and internet access. With several positive outcomes reported with patient portal use, examining the associated social determinants of health is imperative. Objective: This study analyzed the association between social determinants of health—education, health insurance, household income, rurality, and internet access—and patient portal use among adults in the United States before and after the COVID-19 pandemic. (2) Methods: The research used a quantitative, retrospective study design and secondary data from the combined cycles 1 to 4 of the Health Information National Trends Survey 5 (N = 14,103) and 6 (N = 5958). Descriptive statistics and logistic regression were conducted to examine the association between the variables operationalizing SDoH and the use of patient portals. (3) Results: Forty-percent (40%) of respondents reported using a patient portal before the pandemic, and this increased to 61% in 2022. The multivariable logistic regression showed higher odds of patient portal utilization by women compared to men (AOR = 1.56; CI, 1.32–1.83), those with at least a college degree compared to less than high school education (AOR = 2.23; CI, 1.29–3.83), and annual family income of USD 75,000 and above compared to those <USD 20,000 (AOR = 1.59; CI, 1.18–2.15). Those with access to the internet and health insurance also had significantly higher odds of using their patient portals. However, those who identified as Hispanic and non-Hispanic Black and residing in a rural area rather than urban (AOR = 0.72; CI, 0.54–0.95) had significantly lower odds of using their patient portals even after the pandemic. (4) Conclusions: The social determinants of health included in this study showed significant influence on patient portal utilization, which has implications for policymakers and public health stakeholders tasked with promoting patient portal utilization and its benefits. Full article
(This article belongs to the Special Issue Editor-in-Chief's Choices in Biomedical Informatics)
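The adjusted odds ratios reported above come from multivariable logistic regression; a bare-bones version might look like this (hypothetical file and variable names, no survey weighting, not the HINTS analysis code).

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Assumed columns: portal_use (0/1), sex, education, income, rural (0/1), internet (0/1), insured (0/1).
df = pd.read_csv("hints_subset.csv")

fit = smf.logit("portal_use ~ sex + education + income + rural + internet + insured", data=df).fit()
ci = fit.conf_int()
odds_ratios = pd.DataFrame({
    "AOR": np.exp(fit.params),      # exponentiated coefficients = adjusted odds ratios
    "CI_low": np.exp(ci[0]),
    "CI_high": np.exp(ci[1]),
})
print(odds_ratios)  # e.g., AOR > 1 for internet access, < 1 for rural residence
```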
12 pages, 2751 KiB  
Article
Impact of Data Pre-Processing Techniques on XGBoost Model Performance for Predicting All-Cause Readmission and Mortality Among Patients with Heart Failure
by Qisthi Alhazmi Hidayaturrohman and Eisuke Hanada
BioMedInformatics 2024, 4(4), 2201-2212; https://doi.org/10.3390/biomedinformatics4040118 - 1 Nov 2024
Cited by 3 | Viewed by 3587
Abstract
Background: Heart failure poses a significant global health challenge, with high rates of readmission and mortality. Accurate models to predict these outcomes are essential for effective patient management. This study investigates the impact of data pre-processing techniques on XGBoost model performance in predicting all-cause readmission and mortality among heart failure patients. Methods: A dataset of 168 features from 2008 heart failure patients was used. Pre-processing included handling missing values, categorical encoding, and standardization. Four imputation techniques were compared: Mean, Multivariate Imputation by Chained Equations (MICEs), k-nearest Neighbors (kNNs), and Random Forest (RF). XGBoost models were evaluated using accuracy, recall, F1-score, and Area Under the Curve (AUC). Robustness was assessed through 10-fold cross-validation. Results: The XGBoost model with kNN imputation, one-hot encoding, and standardization outperformed others, with an accuracy of 0.614, recall of 0.551, and F1-score of 0.476. The MICE-based model achieved the highest AUC (0.647) and mean AUC (0.65 ± 0.04) in cross-validation. All pre-processed models outperformed the default XGBoost model (AUC: 0.60). Conclusions: Data pre-processing, especially MICE with one-hot encoding and standardization, improves XGBoost performance in heart failure prediction. However, moderate AUC scores suggest further steps are needed to enhance predictive accuracy. Full article
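One of the pre-processing variants compared in this abstract (kNN imputation, one-hot encoding, standardization, then XGBoost with 10-fold cross-validation) can be sketched as a scikit-learn pipeline; the file and label column names are assumptions, not the study's data.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import KNNImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from xgboost import XGBClassifier

df = pd.read_csv("heart_failure_cohort.csv")  # assumed file
y = df.pop("readmitted_or_died")              # assumed binary outcome column
numeric_cols = df.select_dtypes(include="number").columns
categorical_cols = df.columns.difference(numeric_cols)

preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", KNNImputer(n_neighbors=5)), ("scale", StandardScaler())]), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])
model = Pipeline([("prep", preprocess), ("xgb", XGBClassifier(eval_metric="logloss"))])

# 10-fold cross-validated AUC, mirroring the robustness check described in the abstract.
auc = cross_val_score(model, df, y, cv=10, scoring="roc_auc")
print(f"mean AUC = {auc.mean():.3f} ± {auc.std():.3f}")
```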
16 pages, 4868 KiB  
Article
Drosophila Eye Gene Regulatory Network Inference Using BioGRNsemble: An Ensemble-of-Ensembles Machine Learning Approach
by Abdul Jawad Mohammed and Amal Khalifa
BioMedInformatics 2024, 4(4), 2186-2200; https://doi.org/10.3390/biomedinformatics4040117 - 29 Oct 2024
Viewed by 1412
Abstract
Background: Gene regulatory networks (GRNs) are complex gene interactions essential for organismal development and stability, and they are crucial for understanding gene-disease links in drug development. Advances in bioinformatics, driven by genomic data and machine learning, have significantly expanded GRN research, enabling deeper insights into these interactions. Methods: This study proposes and demonstrates the potential of BioGRNsemble, a modular and flexible approach for inferring gene regulatory networks from RNA-Seq data. Integrating the GENIE3 and GRNBoost2 algorithms, the BioGRNsemble methodology focuses on providing trimmed-down sub-regulatory networks consisting of transcription and target genes. Results: The methodology was successfully tested on a Drosophila melanogaster Eye gene expression dataset. Our validation analysis using the TFLink online database yielded 3703 verified predicted gene links, out of 534,843 predictions. Conclusion: Although the BioGRNsemble approach presents a promising method for inferring smaller, focused regulatory networks, it encounters challenges related to algorithm sensitivity, prediction bias, validation difficulties, and the potential exclusion of broader regulatory interactions. Improving accuracy and comprehensiveness will require addressing these issues through hyperparameter fine-tuning, the development of alternative scoring mechanisms, and the incorporation of additional validation methods. Full article
(This article belongs to the Section Applied Biomedical Data Science)
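The per-target regression idea underlying GENIE3-style inference can be sketched with a generic tree ensemble (a conceptual stand-in using scikit-learn, not the BioGRNsemble implementation; the file name and transcription-factor list are placeholders).

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

expr = pd.read_csv("eye_expression.csv", index_col=0)  # rows: samples, columns: genes (assumed layout)
tf_names = ["tf_gene_1", "tf_gene_2", "tf_gene_3"]      # placeholder transcription-factor list

links = []
for target in expr.columns:
    predictors = [tf for tf in tf_names if tf != target]
    rf = RandomForestRegressor(n_estimators=200, random_state=0)
    rf.fit(expr[predictors], expr[target])              # regress target expression on TF expression
    links += [(tf, target, w) for tf, w in zip(predictors, rf.feature_importances_)]

ranked = pd.DataFrame(links, columns=["TF", "target", "importance"]).sort_values("importance", ascending=False)
print(ranked.head())
# An ensemble-of-ensembles would repeat this with a second inference method
# (e.g., gradient boosting, as in GRNBoost2) and merge the two ranked link lists.
```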
13 pages, 511 KiB  
Article
Addressing Semantic Variability in Clinical Outcome Reporting Using Large Language Models
by Fatemeh Shah-Mohammadi and Joseph Finkelstein
BioMedInformatics 2024, 4(4), 2173-2185; https://doi.org/10.3390/biomedinformatics4040116 - 28 Oct 2024
Viewed by 1401
Abstract
Background/Objectives: Clinical trials frequently employ diverse terminologies and definitions to describe similar outcomes, leading to ambiguity and inconsistency in data interpretation. Addressing the variability in clinical outcome reporting and integrating semantically similar outcomes is important in healthcare and clinical research. Variability in outcome reporting not only hinders the comparability of clinical trial results but also poses significant challenges for evidence synthesis, meta-analysis, and evidence-based decision-making. Methods: This study investigates the reduction of variability in outcome measure reporting by comparing a rule-based approach and a large-language-model-based approach. The rule-based approach leverages well-known ontologies, while the second approach exploits Sentence-BERT (SBERT, sentence-level bidirectional encoder representations from transformers) to identify semantically similar outcomes, with a Generative Pre-trained Transformer (GPT) used to refine the results. Results: Only a relatively low percentage of outcomes could be linked to established rule-based ontologies. Analysis of outcomes by word count highlighted the absence of ontological linkage for three-word outcomes, indicating potential gaps in semantic representation. Conclusions: Large language models (LLMs) were able to identify similar outcomes, even those longer than three words, suggesting a crucial role for them in outcome harmonization efforts, potentially reducing redundancy and enhancing data interoperability. Full article
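The SBERT step can be illustrated in a few lines (the model name, example outcomes, and threshold are assumptions, not values from the paper): embed each outcome description and flag pairs with high cosine similarity as candidate matches.

```python
from sentence_transformers import SentenceTransformer, util

outcomes = [
    "all-cause mortality at 30 days",
    "death from any cause within one month",
    "hospital readmission within 30 days",
]

model = SentenceTransformer("all-MiniLM-L6-v2")              # assumed pretrained SBERT model
embeddings = model.encode(outcomes, convert_to_tensor=True)
similarity = util.cos_sim(embeddings, embeddings)            # pairwise cosine similarity matrix

for i in range(len(outcomes)):
    for j in range(i + 1, len(outcomes)):
        score = float(similarity[i][j])
        if score > 0.7:  # illustrative threshold
            print(f"candidate match: '{outcomes[i]}' ~ '{outcomes[j]}' (cosine {score:.2f})")
```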
24 pages, 1013 KiB  
Review
Part-Prototype Models in Medical Imaging: Applications and Current Challenges
by Lisa Anita De Santi, Franco Italo Piparo, Filippo Bargagna, Maria Filomena Santarelli, Simona Celi and Vincenzo Positano
BioMedInformatics 2024, 4(4), 2149-2172; https://doi.org/10.3390/biomedinformatics4040115 - 28 Oct 2024
Cited by 1 | Viewed by 2042
Abstract
Recent developments in Artificial Intelligence have increasingly focused on explainability research. The potential of Explainable Artificial Intelligence (XAI) in producing trustworthy computer-aided diagnosis systems and its usage for knowledge discovery are gaining interest in the medical imaging (MI) community to support the diagnostic process and the discovery of image biomarkers. Most of the existing XAI applications in MI are focused on interpreting the predictions made using deep neural networks, typically including attribution techniques with saliency map approaches and other feature visualization methods. However, these are often criticized for providing incorrect and incomplete representations of the black-box models’ behaviour. This highlights the importance of proposing models intentionally designed to be self-explanatory. In particular, part-prototype (PP) models are interpretable-by-design computer vision (CV) models that base their decision process on learning and identifying representative prototypical parts from input images, and they are gaining increasing interest and results in MI applications. However, the medical field has unique characteristics that could benefit from more advanced implementations of these types of architectures. This narrative review summarizes existing PP networks, their application in MI analysis, and current challenges. Full article
(This article belongs to the Special Issue Advances in Quantitative Imaging Analysis: From Theory to Practice)
16 pages, 1475 KiB  
Article
Improvement of Statistical Models by Considering Correlations among Parameters: Local Anesthetic Agent Simulator for Pharmacological Education
by Toshiaki Ara and Hiroyuki Kitamura
BioMedInformatics 2024, 4(4), 2133-2148; https://doi.org/10.3390/biomedinformatics4040114 - 14 Oct 2024
Cited by 1 | Viewed by 1363
Abstract
Background: To elucidate the effects of local anesthetic agents (LAs), guinea pigs are used in pharmacological education. Herein, we aimed to develop a simulator for LAs. Previously, we developed a statistical model to simulate the effects of LAs and estimated their parameters (mean [μ] and logarithm of the standard deviation [logσ]) based on the results of animal experiments. The results of the Monte Carlo simulation were similar to those of the animal experiments. However, the drug parameter values varied widely among individuals, because that simulation did not consider correlations among parameters. Methods: In this study, we set the correlations among these parameters and performed Monte Carlo simulations. Results: Weakly negative correlations were observed between μ and logσ (r_{μ,logσ}). In contrast, weakly positive correlations were observed among the μ values (r_μ) and among the logσ values (r_logσ). In the Monte Carlo simulation, the variability in duration was substantial for small r_{μ,logσ} values, and the correlation of duration between two drugs was strong for large r_μ and r_logσ values. When parameters were generated considering the correlations among them, the correlation of duration among the drugs became larger. Conclusions: These results suggest that generating parameters while accounting for the correlations among them is important for reproducing the results of animal experiments in simulations. Full article
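The parameter-generation step described above can be sketched with a multivariate normal draw (illustrative values only; the means, standard deviations, and correlation are made up, not the estimates from the paper).

```python
import numpy as np

rng = np.random.default_rng(0)
mean = [30.0, 1.0]         # population mean of mu (duration, min) and of log(sigma); assumed values
sd = np.array([5.0, 0.2])  # between-individual SDs; assumed values
r = -0.3                   # weakly negative correlation between mu and log(sigma); assumed value

cov = np.array([[sd[0] ** 2,        r * sd[0] * sd[1]],
                [r * sd[0] * sd[1], sd[1] ** 2]])
params = rng.multivariate_normal(mean, cov, size=1000)  # one (mu, log sigma) pair per virtual individual
mu, log_sigma = params[:, 0], params[:, 1]

# One simulated anesthesia duration per virtual individual.
durations = rng.normal(mu, np.exp(log_sigma))
print(f"sample r(mu, log sigma) = {np.corrcoef(mu, log_sigma)[0, 1]:.2f}, "
      f"mean duration = {durations.mean():.1f} min")
```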
16 pages, 525 KiB  
Article
Evaluating COVID-19 Vaccine Efficacy Using Kaplan–Meier Survival Analysis
by Waleed Hilal, Michael G. Chislett, Yuandi Wu, Brett Snider, Edward A. McBean, John Yawney and Stephen Andrew Gadsden
BioMedInformatics 2024, 4(4), 2117-2132; https://doi.org/10.3390/biomedinformatics4040113 - 12 Oct 2024
Viewed by 2405
Abstract
Analyses of COVID-19 vaccines have become a major focus of pandemic-related research, as jurisdictions around the world encourage vaccination as the most assured method to curtail the need for stringent public health measures. Kaplan–Meier models, a form of survival analysis, provide a statistical approach to improve the understanding of time-to-event probabilities. In epidemiology and the study of vaccines, survival analyses can be implemented to quantify the probability of testing positive for SARS-CoV-2 given a population's vaccination status. In this study, a large proportion of Ontario COVID-19 testing data is used to derive Kaplan–Meier probability curves for individuals who received two doses of a vaccine during a period of peak Delta variant cases, and again for those receiving three doses during a peak period of the Omicron variant. Data consisting of 614,470 individuals with two doses of a COVID-19 vaccine and 49,551 individuals with three doses show that recipients of the Moderna vaccine are slightly less likely to test positive for the virus in the 38-day period following their last vaccination than recipients of the Pfizer vaccine, although the difference between the two is marginal in most age groups. This result is largely consistent across both the Delta variant period (two doses) and the Omicron variant period (three doses). The evaluated probabilities of testing positive align with the publicly reported efficacies of the mRNA vaccines, supporting the conclusion that Kaplan–Meier methods are a justifiable and useful approach for addressing vaccine-related questions in the COVID-19 landscape. Full article
(This article belongs to the Special Issue Editor's Choices Series for Methods in Biomedical Informatics Section)
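The Kaplan–Meier comparison can be sketched with the lifelines package (hypothetical file and column names, not the Ontario dataset): estimate the probability of remaining test-negative over the 38-day window after the last dose, stratified by vaccine brand.

```python
import pandas as pd
from lifelines import KaplanMeierFitter

# Assumed columns: days_to_positive_or_censor, tested_positive (1 = positive test, 0 = censored), brand.
df = pd.read_csv("vaccinated_cohort.csv")

kmf = KaplanMeierFitter()
for brand, sub in df.groupby("brand"):
    kmf.fit(sub["days_to_positive_or_censor"], event_observed=sub["tested_positive"], label=brand)
    # Survival here is the probability of still testing negative at the end of follow-up.
    print(kmf.survival_function_.tail(1))
```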