Data Science in Health Care

A special issue of Big Data and Cognitive Computing (ISSN 2504-2289).

Deadline for manuscript submissions: closed (20 October 2023) | Viewed by 24118

Special Issue Editors


Guest Editor
Department of Software and Information Systems Engineering, Faculty of Engineering Sciences, Ben-Gurion University of the Negev, Marcus Campus, Beer-Sheva 8410501, Israel
Interests: big biomedical data; machine learning; medical informatics; population genetics

Guest Editor
Department of Software and Information Systems Engineering, Faculty of Engineering Sciences, Ben-Gurion University of the Negev, Marcus Campus, Beer-Sheva 8410501, Israel
Interests: data science in medicine; AI in medicine; medical decision-support systems; temporal data mining; automated therapy planning; knowledge representation

Guest Editor
Center for Applied Scientific Computing, Division of Supercomputing, Korea Institute of Science and Technology Information (KISTI), University of Science and Technology (UST), 245 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea
Interests: data science in medicine; functional medicine; NGS based analysis; temporal medical data mining; bioinformatics; drug repositioning

Special Issue Information

Dear Colleagues,

Over the past decade, much health care data has been digitized, and the proportion of machine-comprehensible data is continuously increasing. In addition, the volume of data being recorded is tremendous. There is a clear demand for tools and methods for mining clinical data and for generating new insights that can improve patients’ health and health care systems, such as through patient clustering, patient state classification, prognosis prediction, alert systems, and therapy suggestions. Specific applications include clinical decision-support systems for multiple types of medical personnel, as well as for patients, supporting tasks such as automated diagnosis support, automated therapy suggestions, prognosis determination, and intelligent monitoring and alerting.

Some of the challenges include handling the high volumes of data, the variety of sources of data, differences in standards of care across sites, data sharing while preserving data privacy, and analysis of unstructured data. New technologies and platforms also need to be effectively incorporated into useful clinical decision-support systems. Examples include the increasing emergence of remote home-based care, in which caregivers (or automated decision-support systems) need to rely on remote monitoring and on remote communication with patients and/or their sensors. We are entering an era of big data analysis and artificial intelligence capabilities for developing applications for health care and for health care platforms.

This Special Issue will include original theoretical and empirical studies, reviews, and opinions in the fields of data mining, machine learning, and AI in the context of health care. Its purpose is to serve the community as a unique and valuable reference source for researchers interested in data science and artificial intelligence as applied to the various facets of medical care and medical research.

Dr. Nadav Rappoport
Prof. Dr. Yuval Shahar
Dr. Hyojung Paik
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Big Data and Cognitive Computing is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • data mining
  • machine learning
  • big data
  • artificial intelligence
  • medicine
  • health care
  • medical informatics
  • clinical data science

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (5 papers)


Research

40 pages, 8198 KiB  
Article
The Semantic Adjacency Criterion in Time Intervals Mining
by Alexander Shknevsky, Yuval Shahar and Robert Moskovitch
Big Data Cogn. Comput. 2023, 7(4), 173; https://doi.org/10.3390/bdcc7040173 - 9 Nov 2023
Viewed by 1863
Abstract
We propose a new pruning constraint when mining frequent temporal patterns to be used as classification and prediction features, the Semantic Adjacency Criterion [SAC], which filters out temporal patterns that contain potentially semantically contradictory components, exploiting each medical domain’s knowledge. We have defined three SAC versions and tested them within three medical domains (oncology, hepatitis, diabetes) and a frequent-temporal-pattern discovery framework. Previously, we had shown that using SAC enhances the repeatability of discovering the same temporal patterns in similar proportions in different patient groups within the same clinical domain. Here, we focused on SAC’s computational implications for pattern discovery, and for classification and prediction, using the discovered patterns as features, by four different machine-learning methods: Random Forests, Naïve Bayes, SVM, and Logistic Regression. Using SAC resulted in a significant reduction, across all medical domains and classification methods, of up to 97% in the number of discovered temporal patterns, and in the runtime of the discovery process, of up to 98%. Nevertheless, the highly reduced set of only semantically transparent patterns, when used as features, resulted in classification and prediction models whose performance was at least as good as the models resulting from using the complete temporal-pattern set.
(This article belongs to the Special Issue Data Science in Health Care)

15 pages, 1058 KiB  
Article
White Blood Cell Classification Using Multi-Attention Data Augmentation and Regularization
by Nasrin Bayat, Diane D. Davey, Melanie Coathup and Joon-Hyuk Park
Big Data Cogn. Comput. 2022, 6(4), 122; https://doi.org/10.3390/bdcc6040122 - 21 Oct 2022
Cited by 9 | Viewed by 5540
Abstract
Accurate and robust human immune system assessment through white blood cell evaluation requires computer-aided tools with pathologist-level accuracy. This work presents a multi-attention leukocyte subtype classification method by leveraging fine-grained and spatial locality attributes of white blood cells. The proposed framework comprises three main components: texture-aware/attention map generation blocks, attention regularization, and attention-based data augmentation. The developed framework is applicable to general CNN-based architectures and enhances decision making by paying specific attention to the discriminative regions of a white blood cell. The performance of the proposed method/model was evaluated through an extensive set of experiments and validation. The obtained results demonstrate the superior performance of the model, which achieved 99.69% accuracy, compared to other state-of-the-art approaches. The proposed model is a good alternative and complement to existing computer diagnosis tools to assist pathologists in evaluating white blood cells from blood smear images.

13 pages, 2235 KiB  
Article
Deep Learning-Based Computer-Aided Classification of Amniotic Fluid Using Ultrasound Images from Saudi Arabia
by Irfan Ullah Khan, Nida Aslam, Fatima M. Anis, Samiha Mirza, Alanoud AlOwayed, Reef M. Aljuaid, Razan M. Bakr and Nourah Hasan Al Qahtani
Big Data Cogn. Comput. 2022, 6(4), 107; https://doi.org/10.3390/bdcc6040107 - 3 Oct 2022
Cited by 8 | Viewed by 3040
Abstract
Amniotic Fluid (AF) refers to a protective liquid surrounding the fetus inside the amniotic sac, serving multiple purposes, and hence is a key indicator of fetal health. Determining the AF levels at an early stage helps to ascertain the maturation of lungs and gastrointestinal development, etc. Low AF entails the risk of premature birth, perinatal mortality, and thereby admission to an intensive care unit (ICU). Moreover, AF level is also a critical factor in determining early deliveries. Hence, AF detection is a vital measurement required during early ultrasound (US), and its automation is essential. The detection of AF is usually a time-consuming process as it is patient specific. Furthermore, its measurement and accuracy are prone to errors as it heavily depends on the sonographer’s experience. Automating this process by developing robust, precise, and effective methods for detection will therefore be beneficial to the healthcare community. In this paper, we utilized transfer learning models to classify the AF levels as normal or abnormal using the US images. The dataset consisted of 166 US images of pregnant women and was preprocessed before training the model. Five transfer learning models, namely, Xception, Densenet, InceptionResNet, MobileNet, and ResNet, were applied. The results showed that MobileNet achieved an overall accuracy of 0.94. Overall, the proposed study produces an effective result in successfully classifying the AF levels, thereby building automated, effective models reliant on transfer learning in order to aid sonographers in evaluating fetal health.

13 pages, 1267 KiB  
Article
Optimizing Operation Room Utilization—A Prediction Model
by Benyamine Abbou, Orna Tal, Gil Frenkel, Robyn Rubin and Nadav Rappoport
Big Data Cogn. Comput. 2022, 6(3), 76; https://doi.org/10.3390/bdcc6030076 - 6 Jul 2022
Cited by 10 | Viewed by 7277
Abstract
Background: Operating rooms are the core of hospitals. They are a primary source of revenue and are often seen as one of the bottlenecks in the medical system. Many efforts are made to increase throughput, reduce costs, and maximize income, as well as to optimize clinical outcomes and patient satisfaction. We trained a predictive model on the length of surgeries to improve the productivity and utility of operating rooms in general hospitals. Methods: We collected clinical and administrative data for the last 10 years from two large general public hospitals in Israel. We trained a machine learning model to give the expected length of surgery using pre-operative data. These data included diagnoses, laboratory tests, risk factors, demographics, procedures, anesthesia type, and the main surgeon’s level of experience. We compared our model to a naïve model that represented current practice. Findings: Our prediction model achieved better performance than the naïve model and explained almost 70% of the variance in surgery durations. Interpretation: A machine learning-based model can be a useful approach for increasing operating room utilization. Among the most important factors were the type of procedures and the main surgeon’s level of experience. The model enables the harmonizing of hospital productivity through wise scheduling and matching suitable teams for a variety of clinical procedures for the benefit of the individual patient and the system as a whole.
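
As a reading aid for the "explained almost 70% of the variance" figure, the sketch below shows how a coefficient of determination (R²) is commonly computed and how a model can be compared against a baseline. The durations are invented toy values, and the constant-mean baseline is only a stand-in assumption: the abstract does not say how the paper's naïve model was defined.

```python
def r_squared(actual: list[float], predicted: list[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy surgery durations in minutes (hypothetical, not taken from the paper).
actual_minutes = [60.0, 90.0, 120.0, 150.0]
# Hypothetical model predictions for the same four surgeries.
model_minutes = [70.0, 85.0, 110.0, 155.0]
# Stand-in baseline (an assumption): always predict the overall mean duration.
naive_minutes = [sum(actual_minutes) / len(actual_minutes)] * len(actual_minutes)

r2_model = r_squared(actual_minutes, model_minutes)
r2_naive = r_squared(actual_minutes, naive_minutes)
print(f"model R^2 = {r2_model:.3f}, baseline R^2 = {r2_naive:.3f}")
```

A baseline that always predicts the mean scores an R² of zero by construction, so a model's R² can be read as the share of duration variance it explains beyond that baseline.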

16 pages, 3495 KiB  
Article
A Simple Free-Text-like Method for Extracting Semi-Structured Data from Electronic Health Records: Exemplified in Prediction of In-Hospital Mortality
by Eyal Klang, Matthew A. Levin, Shelly Soffer, Alexis Zebrowski, Benjamin S. Glicksberg, Brendan G. Carr, Jolion Mcgreevy, David L. Reich and Robert Freeman
Big Data Cogn. Comput. 2021, 5(3), 40; https://doi.org/10.3390/bdcc5030040 - 29 Aug 2021
Cited by 5 | Viewed by 4709
Abstract
The Epic electronic health record (EHR) is a commonly used EHR in the United States. This EHR contains large semi-structured “flowsheet” fields. Flowsheet fields lack a well-defined data dictionary and are unique to each site. We evaluated a simple free-text-like method to extract these data. As a use case, we demonstrate this method in predicting mortality during emergency department (ED) triage. We retrieved demographic and clinical data for ED visits from the Epic EHR (1/2014–12/2018). Data included structured, semi-structured flowsheet records and free-text notes. The study outcome was in-hospital death within 48 h. Most of the data were coded using a free-text-like Bag-of-Words (BoW) approach. Two machine-learning models were trained: gradient boosting and logistic regression. Term frequency-inverse document frequency was employed in the logistic regression model (LR-tf-idf). An ensemble of LR-tf-idf and gradient boosting was evaluated. Models were trained on years 2014–2017 and tested on year 2018. Among 412,859 visits, the 48-h mortality rate was 0.2%. LR-tf-idf showed AUC 0.98 (95% CI: 0.98–0.99). Gradient boosting showed AUC 0.97 (95% CI: 0.96–0.99). An ensemble of both showed AUC 0.99 (95% CI: 0.98–0.99). In conclusion, a free-text-like approach can be useful for extracting knowledge from large amounts of complex semi-structured EHR data.
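
To make the "free-text-like" idea in the abstract above more concrete, the following is a minimal sketch of treating semi-structured key/value records as plain text and weighting the resulting bag-of-words tokens with tf-idf. Everything here is an assumption for illustration: the example records, the tokenizer, and the unsmoothed idf formula are hypothetical and are not taken from the paper, whose actual preprocessing is not described in the abstract.

```python
import math
from collections import Counter


def tokenize(record: str) -> list[str]:
    """Treat a semi-structured record as free text: drop separators, split on whitespace."""
    return record.lower().replace(":", " ").replace(";", " ").split()


def tf_idf(docs: list[str]) -> list[dict[str, float]]:
    """Return one {token: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears at least once.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks)
        vectors.append({
            tok: (cnt / total) * math.log(n_docs / doc_freq[tok])
            for tok, cnt in counts.items()
        })
    return vectors


# Hypothetical flowsheet-like entries (invented for illustration only).
records = [
    "pain score: 7; mental status: alert",
    "pain score: 2; mental status: confused",
    "pain score: 7; mental status: alert",
]
vectors = tf_idf(records)
print(sorted(vectors[1].items(), key=lambda kv: -kv[1])[:2])
```

Tokens that occur in every record (such as the field names) receive zero weight, while rarer values such as "confused" are weighted up; sparse vectors of this kind could then be passed to a classifier such as logistic regression.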
