Data Science for Medical Informatics 2nd Edition

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Applied Biosciences and Bioengineering".

Deadline for manuscript submissions: closed (30 November 2023) | Viewed by 6366

Special Issue Editor


Prof. Dr. Toralf Kirsten
Guest Editor
Medical Data Science Department, Leipzig University, 04109 Leipzig, Germany
Interests: data integration in medical and life sciences in general; infrastructures for distributed privacy-preserving data analyses; data science in medical research; FAIR data points

Special Issue Information

Dear Colleagues,

This Special Issue is a continuation of our previous Special Issue titled "Data Science for Medical Informatics".

Medical data science is a rapidly growing field and will transform healthcare and medical research. Despite the availability of a broad spectrum of analysis methods, including powerful machine learning algorithms, the unavailability and low quality of data are the biggest barriers to developing successful AI-based medical applications.

Most medical data still reside in non-digital forms or as text, or are poorly documented and cannot be semantically interpreted. FAIR (Findable, Accessible, Interoperable, Reusable) data are key to closing the gap between advances in machine learning and the translation of AI into clinical practice. FAIR data management and analysis lead to reproducible, interpretable, and transferable data science and have a positive impact on data quality.

This Special Issue addresses this gap by collecting best practices for FAIR data in medical data science. We welcome submissions presenting new approaches and methods for improving the FAIRness and quality of data, reproducible and privacy-preserving analyses, as well as success stories of the translation and impact of AI in medical practice.

Prof. Dr. Toralf Kirsten
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • FAIR data points
  • data quality
  • reproducible and transferable AI
  • case studies for demonstrating impact of AI in patient care
  • best practices for data science in medical sciences
  • privacy-preserving data analysis

Published Papers (4 papers)


Research

12 pages, 1414 KiB  
Article
Collaborative Semantic Annotation Tooling (CoAT) to Improve Efficiency and Plug-and-Play Semantic Interoperability in the Secondary Use of Medical Data: Concept, Implementation, and First Cross-Institutional Experiences
by Thomas Wiktorin, Daniel Grigutsch, Felix Erdfelder, Andrew J. Heidel, Frank Bloos, Danny Ammon, Matthias Löbe and Sven Zenker
Appl. Sci. 2024, 14(2), 820; https://doi.org/10.3390/app14020820 - 18 Jan 2024
Viewed by 563
Abstract
The cross-institutional secondary use of medical data benefits from structured semantic annotation, which ideally enables the matching and merging of semantically related data items from different sources and sites. While numerous medical terminologies and ontologies, as well as some tooling, exist to support such annotation, cross-institutional data usage based on independently annotated datasets is challenging for multiple reasons: the annotation process is resource intensive and requires a combination of medical and technical expertise since it often requires judgment calls to resolve ambiguities resulting from the non-uniqueness of potential mappings to various levels of ontological hierarchies and relational and representational systems. The divergent resolution of such ambiguities can inhibit joint cross-institutional data usage based on semantic annotation since data items with related content from different sites will not be identifiable based on their respective annotations if different choices were made without further steps such as ontological inference, which is still an active area of research. We hypothesize that a collaborative approach to the semantic annotation of medical data can contribute to more resource-efficient and high-quality annotation by utilizing prior annotational choices of others to inform the annotation process, thus both speeding up the annotation itself and fostering a consensus approach to resolving annotational ambiguities by enabling annotators to discover and follow pre-existing annotational choices. Therefore, we performed a requirements analysis for such a collaborative approach, defined an annotation workflow based on the requirement analysis results, and implemented this workflow in a prototypical Collaborative Annotation Tool (CoAT). 
We then evaluated its usability and present first inter-institutional experiences with this novel approach to promote practically relevant interoperability driven by use of standardized ontologies. In both single-site usability evaluation and the first inter-institutional application, the CoAT showed potential to improve both annotation efficiency and quality by seamlessly integrating collaboratively generated annotation information into the annotation workflow, warranting further development and evaluation of the proposed innovative approach.
(This article belongs to the Special Issue Data Science for Medical Informatics 2nd Edition)

15 pages, 394 KiB  
Article
Document Difficulty Aspects for Medical Practitioners: Enhancing Information Retrieval in Personalized Search Engines
by Sameh Frihat, Catharina Lena Beckmann, Eva Maria Hartmann and Norbert Fuhr
Appl. Sci. 2023, 13(19), 10612; https://doi.org/10.3390/app131910612 - 23 Sep 2023
Cited by 1 | Viewed by 816
Abstract
Timely and relevant information enables clinicians to make informed decisions about patient care outcomes. However, discovering related and understandable information from the vast medical literature is challenging. To address this problem, we aim to enable the development of search engines that meet the needs of medical practitioners by incorporating text difficulty features. We collected a dataset of 209 scientific research abstracts from different medical fields, available in both English and German. To determine the difficulty aspects of readability and technical level of each abstract, 216 medical experts annotated the dataset. We used a pre-trained BERT model, fine-tuned to our dataset, to develop a regression model predicting those difficulty features of abstracts. To highlight the strength of this approach, the model was compared to readability formulas currently in use. Analysis of the dataset revealed that German abstracts are more technically complex and less readable than their English counterparts. Our baseline model showed greater efficacy than current readability formulas in predicting domain-specific readability aspects. Conclusion: Incorporating these text difficulty aspects into the search engine will provide healthcare professionals with reliable and efficient information retrieval tools. Additionally, the dataset can serve as a starting point for future research.
(This article belongs to the Special Issue Data Science for Medical Informatics 2nd Edition)

8 pages, 486 KiB  
Communication
Enrichment of Spatial eGenes Colocalized with Type 2 Diabetes Mellitus Genome-Wide Association Study Signals in the Lysosomal Pathway
by Younyoung Kim and Chaeyoung Lee
Appl. Sci. 2023, 13(18), 10447; https://doi.org/10.3390/app131810447 - 19 Sep 2023
Cited by 1 | Viewed by 755
Abstract
Genome-wide association studies (GWAS) have identified genetic markers associated with type 2 diabetes mellitus (T2DM). Additionally, tissue-specific expression quantitative trait loci (eQTL) studies have revealed regulatory elements influencing gene expression in specific tissues. We performed enrichment analyses using spatial eGenes corresponding to known T2DM GWAS signals to uncover T2DM pathological pathways. T2DM GWAS signals were obtained from the GWAS Catalog, and spatial eQTL data from T2DM-associated tissues, including visceral adipose tissue, liver, skeletal muscle, and pancreas, were sourced from the Genotype-Tissue Expression Consortium. The eGenes were enriched in Kyoto Encyclopedia of Genes and Genomes biological pathways using the Benjamini–Hochberg method. Colocalization analysis of 2857 independent T2DM GWAS signals identified 556 eGenes in visceral adipose tissue, 176 in liver, 715 in skeletal muscle, and 384 in pancreas (P_FDR < 0.05 where P_FDR is the false discovery rate). These eGenes showed enrichment in various pathways (P_BH < 0.05 where P_BH is the corrected P for the Benjamini–Hochberg multiple testing), especially the lysosomal pathway in pancreatic tissue. Unlike the mTOR pathway in T2DM autophagy dysregulation, the role of lysosomes remains poorly understood. The enrichment analysis of spatial eGenes associated with T2DM GWAS signals highlights the importance of the lysosomal pathway in autophagic termination. Thus, investigating the processes involving autophagic termination associated with lysosomes is a priority for understanding T2DM pathogenesis.
(This article belongs to the Special Issue Data Science for Medical Informatics 2nd Edition)

18 pages, 1147 KiB  
Article
Prediction of Intensive Care Unit Length of Stay in the MIMIC-IV Dataset
by Lars Hempel, Sina Sadeghi and Toralf Kirsten
Appl. Sci. 2023, 13(12), 6930; https://doi.org/10.3390/app13126930 - 8 Jun 2023
Cited by 1 | Viewed by 3799
Abstract
Accurately estimating the length of stay (LOS) of patients admitted to the intensive care unit (ICU) in relation to their health status helps healthcare management allocate appropriate resources and better plan for the future. This paper presents predictive models for the LOS of ICU patients from the MIMIC-IV database based on typical demographic and administrative data, as well as early vital signs and laboratory measurements collected on the first day of ICU stay. The goal of this study was to demonstrate a practical, stepwise approach to predicting patient’s LOS in the ICU using machine learning and early available typical clinical data. The results show that this approach significantly improves the performance of models for predicting actual LOS in a pragmatic framework that includes only data with short stays predetermined by a prior classification.
(This article belongs to the Special Issue Data Science for Medical Informatics 2nd Edition)
