Editor's Choice Series for the Applied Biomedical Data Science Section

A special issue of BioMedInformatics (ISSN 2673-7426). This special issue belongs to the section "Applied Biomedical Data Science".

Deadline for manuscript submissions: closed (31 December 2024) | Viewed by 23824

Special Issue Editor


Prof. Dr. Jörn Lötsch
Guest Editor
1. Medical Faculty, Institute of Clinical Pharmacology, Goethe-University, Frankfurt am Main, Germany
2. Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, Frankfurt am Main, Germany
Interests: data science; pain; clinical pharmacology

Special Issue Information

Dear Colleagues,

The Editor's Choice Series for the Applied Biomedical Data Science Section presents a curated collection showcasing pioneering methodologies driving the intersection of data science and biomedical research. This Special Issue focuses on innovative approaches, techniques, and tools applied in the realm of data science to unravel complex biological phenomena, facilitate medical discoveries, and improve healthcare outcomes.

Encompassing fields such as big data analytics, machine learning applications, computational modeling, data-driven diagnostics, and bioinformatics, this series offers an extensive exploration of methodologies transforming biomedical data into actionable insights. From predictive modeling for disease prognosis to novel algorithms enhancing drug discovery, these articles will illuminate the impactful role of data science in shaping the future of healthcare.

Submissions for brief reports are not accepted for this Special Issue. Instead, comprehensive studies, methodological analyses, and practical applications at the nexus of data science and biomedical research are encouraged.

Prof. Dr. Jörn Lötsch
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. BioMedInformatics is an international peer-reviewed open access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1000 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • big data analytics
  • machine learning applications
  • computational modeling
  • data-driven diagnostics
  • bioinformatics
  • predictive modeling
  • disease prognosis
  • drug discovery
  • biomedical data analysis
  • healthcare innovation

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information can be found in MDPI's Special Issue policies.

Published Papers (9 papers)


Research


15 pages, 2611 KiB  
Article
ELIPF: Explicit Learning Framework for Pre-Emptive Forecasting, Early Detection and Curtailment of Idiopathic Pulmonary Fibrosis Disease
by Tagne Poupi Theodore Armand, Md Ariful Islam Mozumder, Kouayep Sonia Carole, Opeyemi Deji-Oloruntoba, Hee-Cheol Kim and Simeon Okechukwu Ajakwe
BioMedInformatics 2024, 4(3), 1807-1821; https://doi.org/10.3390/biomedinformatics4030099 - 1 Aug 2024
Cited by 3 | Viewed by 1193
Abstract
(1) Background: Among lung diseases, idiopathic pulmonary fibrosis (IPF) appears to be the most common type and causes scarring (fibrosis) of the lungs. IPF disease patients are recommended to undergo lung transplants, or they may witness progressive and irreversible lung damage that will subsequently lead to death. In cases of irreversible damage, it becomes important to predict the patient’s mortality status. Traditional healthcare does not provide sophisticated tools for such predictions. Still, because artificial intelligence has effectively shown its capability to manage crucial healthcare situations, it is possible to predict patients’ mortality using machine learning techniques. (2) Methods: This research proposed a soft voting ensemble model applied to the top 30 best-fit clinical features to predict mortality risk for patients with idiopathic pulmonary fibrosis. Five machine learning algorithms were used for it, namely random forest (RF), support vector machine (SVM), gradient boosting machine (GBM), XGBoost (XGB), and multi-layer perceptron (MLP). (3) Results: A soft voting ensemble method applied with the combined results of the classifiers showed an accuracy of 79.58%, sensitivity of 86%, F1-score of 84%, prediction error of 0.19, and responsiveness of 0.47. (4) Conclusions: Our proposed model will be helpful for physicians to make the right decision and keep track of the disease, thus reducing the mortality risk, improving the overall health condition of patients, and managing patient stratification.
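
As a rough illustration of the soft voting idea mentioned in the Methods, the sketch below averages per-class probabilities from several classifiers and picks the class with the highest mean. It is not the authors' implementation: the function names, the two-class layout, and all probability values are hypothetical and chosen only to show the mechanics.

```python
from typing import Optional, Sequence


def soft_vote(probabilities: Sequence[Sequence[float]],
              weights: Optional[Sequence[float]] = None) -> list:
    """Return the (optionally weighted) mean class probabilities for one sample.

    `probabilities` holds one row per classifier and one column per class.
    """
    if weights is None:
        weights = [1.0] * len(probabilities)  # equal weight per classifier
    total = sum(weights)
    n_classes = len(probabilities[0])
    return [
        sum(w * row[c] for w, row in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]


def predict(probabilities: Sequence[Sequence[float]],
            weights: Optional[Sequence[float]] = None) -> int:
    """Return the index of the class with the highest averaged probability."""
    averaged = soft_vote(probabilities, weights)
    return max(range(len(averaged)), key=averaged.__getitem__)


# Hypothetical outputs of five classifiers for a single patient,
# as [P(class 0), P(class 1)]. These numbers are invented for illustration.
model_outputs = [
    [0.40, 0.60],  # RF
    [0.55, 0.45],  # SVM
    [0.30, 0.70],  # GBM
    [0.35, 0.65],  # XGB
    [0.60, 0.40],  # MLP
]

averaged = soft_vote(model_outputs)
label = predict(model_outputs)
print(averaged, label)
```

Unlike hard (majority) voting, soft voting lets a confident classifier outweigh several uncertain ones, which is why it needs models that output probabilities rather than labels only.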

20 pages, 1519 KiB  
Article
Flow Analysis of Mastectomy Patients Using Length of Stay: A Single-Center Study
by Teresa Angela Trunfio and Giovanni Improta
BioMedInformatics 2024, 4(3), 1725-1744; https://doi.org/10.3390/biomedinformatics4030094 - 19 Jul 2024
Cited by 1 | Viewed by 1133
Abstract
Background: Malignant breast cancer is the most common cancer affecting women worldwide. The COVID-19 pandemic appears to have slowed the diagnostic process, leading to an increased use of invasive approaches such as mastectomy. The increased use of a surgical procedure pushes towards an objective analysis of patient flow with measurable quality indicators such as length of stay (LOS) in order to optimize it. Methods: In this work, different regression and classification models were implemented to analyze the total LOS as a function of a set of independent variables (age, gender, pre-op LOS, discharge ward, year of discharge, type of procedure, presence of hypertension, diabetes, cardiovascular disease, respiratory disease, secondary tumors, and surgery with complications) extracted from the discharge records of patients undergoing mastectomy at the ‘San Giovanni di Dio e Ruggi d’Aragona’ University Hospital of Salerno (Italy) in the years 2011–2021. In addition, the impact of COVID-19 was assessed by statistically comparing data from patients discharged in 2018–2019 with those discharged in 2020–2021. Results: The results obtained generally show the good performance of the regression models in characterizing the particular case studies. Among the models, the best at predicting the LOS from the set of variables described above was polynomial regression, with an R² value above 0.689. The classification algorithms that operated on a LOS divided into 3 arbitrary classes also proved to be good tools, reaching 79% accuracy with the voting classifier. Among the independent variables, both implemented models showed that the ward of discharge, year of discharge, type of procedure and complications during surgery had the greatest impact on LOS. The final focus to assess the impact of COVID-19 showed a statistically significant increase in surgical complications. Conclusion: Through this study, it was possible to validate the use of regression and classification models to characterize the total LOS of mastectomy patients. LOS proves to be an excellent indicator of performance, and through its analysis with advanced methods, such as machine learning algorithms, it is possible to understand which of the demographic and organizational variables collected have a significant impact and thus build simple predictors to support healthcare management.

20 pages, 742 KiB  
Article
Ensemble of HMMs for Sequence Prediction on Multivariate Biomedical Data
by Richard Fechner, Jens Dörpinghaus, Robert Rockenfeller and Jennifer Faber
BioMedInformatics 2024, 4(3), 1672-1691; https://doi.org/10.3390/biomedinformatics4030090 - 3 Jul 2024
Viewed by 1187
Abstract
Background: Biomedical data are usually collections of longitudinal data assessed at certain points in time. Clinical observations assess the presence and severity of symptoms, which are the basis for the description and modeling of disease progression. Deciphering potential underlying unknowns from the distinct observation would substantially improve the understanding of pathological cascades. Hidden Markov Models (HMMs) have been successfully applied to the processing of possibly noisy continuous signals. We apply ensembles of HMMs to categorically distributed multivariate time series data, leaving space for expert domain knowledge in the prediction process. Methods: We use an ensemble of HMMs to predict the loss of free walking ability as one major clinical deterioration in the most common autosomal dominantly inherited ataxia disorder worldwide. Results: We present a prediction pipeline that processes data paired with a configuration file, enabling us to train, validate and query an ensemble of HMMs. In particular, we provide a theoretical and practical framework for multivariate time-series inference based on HMMs that includes constructing multiple HMMs, each to predict a particular observable variable. Our analysis is conducted on pseudo-data, but also on biomedical data based on Spinocerebellar ataxia type 3 disease. Conclusions: We find that the model shows promising results for the data we tested. The strength of this approach is that HMMs are well understood, probabilistic and interpretable models, setting it apart from most Deep Learning approaches. We publish all code and evaluation pseudo-data in an open-source repository.

28 pages, 4958 KiB  
Article
Diagnostic Tool for Early Detection of Rheumatic Disorders Using Machine Learning Algorithm and Predictive Models
by Godfrey A. Mills, Dzifa Dey, Mohammed Kassim, Aminu Yiwere and Kenneth Broni
BioMedInformatics 2024, 4(2), 1174-1201; https://doi.org/10.3390/biomedinformatics4020065 - 8 May 2024
Cited by 2 | Viewed by 2470
Abstract
Background: Rheumatic diseases are chronic diseases that affect joints, tendons, ligaments, bones, muscles, and other vital organs. Detection of rheumatic diseases is a complex process that requires careful analysis of heterogeneous content from clinical examinations, patient history, and laboratory investigations. Machine learning techniques have made it possible to integrate such techniques into the complex diagnostic process to identify inherent features that lead to disease formation, development, and progression for remedial measures. Methods: An automated diagnostic tool using a multilayer neural network computational engine is presented to detect rheumatic disorders and the type of underlying disorder for therapeutic strategies. Rheumatic disorders considered are rheumatoid arthritis, osteoarthritis, and systemic lupus erythematosus. The detection system was trained and tested using 70% and 30%, respectively, of a labelled synthetic dataset of 100,000 records containing both single and multiple disorders. Results: The detection system was able to detect and predict underlying disorders with an accuracy of 97.48%, sensitivity of 96.80%, and specificity of 97.50%. Conclusion: The good performance suggests that this solution is robust enough and can be implemented for screening patients for intervention measures. This is a much-needed solution in environments with limited specialists, as the solution promotes task-shifting from the specialist level to the primary healthcare physicians.
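
For readers less familiar with the reported metrics, the following minimal sketch shows how accuracy, sensitivity, and specificity are conventionally derived from binary predictions. The function name and the ten toy labels are hypothetical and are not taken from the study; the sketch only illustrates the standard definitions.

```python
def screening_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity and specificity for binary labels (1 = disorder)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Share of true cases that were detected; None if there are no true cases.
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        # Share of healthy records correctly cleared; None if there are none.
        "specificity": tn / (tn + fp) if (tn + fp) else None,
    }


# Invented toy labels: 4 records with a disorder, 6 without.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

metrics = screening_metrics(y_true, y_pred)
print(metrics)
```

In a screening setting, sensitivity is often the figure to watch first, since a missed case (false negative) is usually costlier than an unnecessary referral.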

10 pages, 424 KiB  
Article
Hearables: In-Ear Multimodal Data Fusion for Robust Heart Rate Estimation
by Marek Żyliński, Amir Nassibi, Edoardo Occhipinti, Adil Malik, Matteo Bermond, Harry J. Davies and Danilo P. Mandic
BioMedInformatics 2024, 4(2), 911-920; https://doi.org/10.3390/biomedinformatics4020051 - 1 Apr 2024
Cited by 1 | Viewed by 1806
Abstract
Background: Ambulatory heart rate (HR) monitors that acquire electrocardiogram (ECG) and/or photoplethysmogram (PPG) signals from the torso, wrists, or ears are notably less accurate in tasks associated with high levels of movement compared to clinical measurements. However, a reliable estimation of HR can be obtained through data fusion from different sensors. These methods are especially suitable for multimodal hearable devices, where heart rate can be tracked from different modalities, including electrical ECG, optical PPG, and sounds (heart tones). Combined information from different modalities can compensate for single source limitations. Methods: In this paper, we evaluate the possible application of data fusion methods in hearables. We assess data fusion for heart rate estimation from simultaneous in-ear ECG and in-ear PPG, recorded on ten subjects while performing 5-min sitting and walking tasks. Results: Our findings show that data fusion methods provide a similar level of mean absolute error as the best single-source heart rate estimation but with much lower intra-subject variability, especially during walking activities. Conclusion: We conclude that data fusion methods provide more robust HR estimation than a single cardiovascular signal. These methods can enhance the performance of wearable devices, especially multimodal hearables, in heart rate tracking during physical activity.

12 pages, 5186 KiB  
Article
Genetic Optimization in Uncovering Biologically Meaningful Gene Biomarkers for Glioblastoma Subtypes
by Petros Paplomatas, Ioanna-Efstathia Douroumi, Panagiotis Vlamos and Aristidis Vrahatis
BioMedInformatics 2024, 4(1), 811-822; https://doi.org/10.3390/biomedinformatics4010045 - 8 Mar 2024
Viewed by 1536
Abstract
Background: Glioblastoma multiforme (GBM) is a highly aggressive brain cancer known for its challenging survival rates; it is characterized by distinct subtypes, such as the proneural and mesenchymal states. The development of targeted therapies is critically dependent on a thorough understanding of these subtypes. Advances in single-cell RNA-sequencing (scRNA-seq) have opened new avenues for identifying subtype-specific gene biomarkers, which are essential for innovative treatments. Methods: This study introduces a genetic optimization algorithm designed to select a precise set of genes that clearly differentiate between the proneural and mesenchymal GBM subtypes. By integrating differential gene expression analysis with gene variability assessments, our dual-criterion strategy ensures the selection of genes that are not only differentially expressed between subtypes but also exhibit consistent variability patterns. This approach enhances the biological relevance of identified biomarkers. We applied this algorithm to scRNA-seq data from GBM samples, focusing on the discovery of subtype-specific gene biomarkers. Results: The application of our genetic optimization algorithm to scRNA-seq data successfully identified significant genes that are closely associated with the fundamental characteristics of GBM. These genes show a strong potential to distinguish between the proneural and mesenchymal subtypes, offering insights into the molecular underpinnings of GBM heterogeneity. Conclusions: This study introduces a novel approach for biomarker discovery in GBM that is potentially applicable to other complex diseases. By leveraging scRNA-seq data, our method contributes to the development of targeted therapies, highlighting the importance of precise biomarker identification in personalized medicine.

12 pages, 1363 KiB  
Article
Classification and Explanation of Iron Deficiency Anemia from Complete Blood Count Data Using Machine Learning
by Siddartha Pullakhandam and Susan McRoy
BioMedInformatics 2024, 4(1), 661-672; https://doi.org/10.3390/biomedinformatics4010036 - 1 Mar 2024
Cited by 10 | Viewed by 5717
Abstract
Background: Currently, discriminating Iron Deficiency Anemia (IDA) from other anemias requires an expensive test (serum ferritin). Complete Blood Count (CBC) tests are less costly and more widely available. Machine learning models have not yet been applied to discriminating IDA but do well for similar tasks. Methods: We constructed multiple machine learning methods to classify IDA from CBC data using a US NHANES dataset of over 19,000 instances, calculating accuracy, precision, recall, and precision-recall AUC (PR AUC). We validated the results using an unseen dataset from Kenya, using the same model. We calculated ranked feature importance to explain the global behavior of the model. Results: Our model classifies IDA with a PR AUC of 0.87 and recall/sensitivity of 0.98 and 0.89 for the original dataset and an unseen Kenya dataset, respectively. The explanations indicate that low blood level of hemoglobin, higher age, and higher Red Blood Cell distribution width were most critical. We also found that optimization made only minor changes to the explanations and that the features used remained consistent with professional practice. Conclusions: The overall high performance and consistency of the results suggest that the approach would be acceptable to health professionals and would support enhancements to current automated CBC analyzers.

17 pages, 1859 KiB  
Article
Assessment of Voice Disorders Using Machine Learning and Vocal Analysis of Voice Samples Recorded through Smartphones
by Michele Giuseppe Di Cesare, David Perpetuini, Daniela Cardone and Arcangelo Merla
BioMedInformatics 2024, 4(1), 549-565; https://doi.org/10.3390/biomedinformatics4010031 - 19 Feb 2024
Cited by 7 | Viewed by 3712
Abstract
Background: The integration of edge computing into smart healthcare systems requires the development of computationally efficient models and methodologies for monitoring and detecting patients’ healthcare statuses. In this context, mobile devices, such as smartphones, are increasingly employed for the purpose of aiding diagnosis, treatment, and monitoring. Notably, smartphones are widely pervasive and readily accessible to a significant portion of the population. These devices empower individuals to conveniently record and submit voice samples, thereby potentially facilitating the early detection of vocal irregularities or changes. This research focuses on the creation of diverse machine learning frameworks based on vocal samples captured by smartphones to distinguish between pathological and healthy voices. Methods: The investigation leverages the publicly available VOICED dataset, comprising 58 healthy voice samples and 150 samples from voices exhibiting pathological conditions, and machine learning techniques for the classification of healthy and diseased patients through the employment of Mel-frequency cepstral coefficients. Results: Through cross-validated two-class classification, the fine k-nearest neighbor exhibited the highest performance, achieving an accuracy rate of 98.3% in identifying healthy and pathological voices. Conclusions: This study holds promise for enabling smartphones to effectively identify vocal disorders, offering a multitude of advantages for both individuals and healthcare systems, encompassing heightened accessibility, early detection, and continuous monitoring.

Review


12 pages, 255 KiB  
Review
Exploring the Role of ChatGPT in Oncology: Providing Information and Support for Cancer Patients
by Maurizio Cè, Vittoria Chiarpenello, Alessandra Bubba, Paolo Florent Felisaz, Giancarlo Oliva, Giovanni Irmici and Michaela Cellina
BioMedInformatics 2024, 4(2), 877-888; https://doi.org/10.3390/biomedinformatics4020049 - 25 Mar 2024
Cited by 9 | Viewed by 4085
Abstract
Introduction: Oncological patients face numerous challenges throughout their cancer journey while navigating complex medical information. The advent of AI-based conversational models like ChatGPT (San Francisco, OpenAI) represents an innovation in oncological patient management. Methods: We conducted a comprehensive review of the literature on the use of ChatGPT in providing tailored information and support to patients with various types of cancer, including head and neck, liver, prostate, breast, lung, pancreas, colon, and cervical cancer. Results and Discussion: Our findings indicate that, in most instances, ChatGPT responses were accurate, dependable, and aligned with the expertise of oncology professionals, especially for certain subtypes of cancers like head and neck and prostate cancers. Furthermore, the system demonstrated a remarkable ability to comprehend patients’ emotional responses and offer proactive solutions and advice. Nevertheless, these models have also shown notable limitations and cannot serve as a substitute for the role of a physician under any circumstances. Conclusions: Conversational models like ChatGPT can significantly enhance the overall well-being and empowerment of oncological patients. Both patients and healthcare providers must become well-versed in the advantages and limitations of these emerging technologies.
