Special Issue "Data Technology Applications in Life, Diseases, and Health"

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 October 2021.

Special Issue Editors

Prof. Dr. Keun Ho Ryu
Guest Editor
1. Department of Computer Science, Chungbuk National University, Cheongju 28644, Korea
2. Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City 700000, Vietnam
Interests: big data and databases; data mining; biomedical informatics and bioinformatics; deep learning and interdisciplinary applications
Dr. Erdenebileg Batbaatar
Guest Editor
1. Department of Computer Science, Chungbuk National University, Cheongju 28644, South Korea
2. Research and Development Center, Ellexi, Seoul 06764, South Korea
Interests: software engineering; data mining; big data analysis; bioinformatics; healthcare; vision; speech; natural language processing; machine learning; deep learning

Special Issue Information

Dear Colleagues,

The goal of this Special Issue is to explore how emerging technology solutions and real-world applications in human life, disease, cancer, healthcare, and hospitals can help human beings to lead healthy lives and enhance their wellbeing. Specifically, innovative contributions that either solve or advance the understanding of issues related to emerging technologies and applications, as well as reports of practical experience in the real world, are very welcome.

This Special Issue seeks solutions that combine state-of-the-art devices, computer software, model-based approaches for exploiting the huge volume of health and bio data, and the available Internet of Things resources, while ensuring that these systems are explainable to domain experts. It also seeks new methods that more generally describe the successful application of emerging technologies and spectra, and of science and engineering, to issues such as disease, cancer, knowledge discovery, databases, sensor devices and user interfaces, software design, and system implementation in the medical, healthcare, biology, and wellbeing domains. The main idea is to cover applications of emerging technologies and spectra, and science and engineering issues, that address all facets of real-world solutions, from databases to disease and human health technology, from the perspective of wellbeing and healthy living.

The general idea behind this Special Issue is to disseminate knowledge about healthy, disease-free human life drawn from various engineering, scientific, and social settings that exploit big data, new models, and emerging technologies.

This Special Issue will include papers that span a wide range of topics in the fields of applied medical systems and software, medical informatics, healthcare, bioinformatics, and databases, ranging from methodological aspects to theoretical and technological views. More specifically, this Special Issue will cover emerging and real-world application research topics concerning new trends in applied databases, AI, applications, and emerging technologies, including management, design, algorithms, models, hardware and software and their interfaces, and real-world solutions such as machine learning, deep learning, knowledge discovery, feature selection, data analytics, big data platforms, and system design and implementation, all of which are technologies related to the healthy lives of human beings.

A variety of modern real-life settings along with academic and industrial contexts could benefit from the dissemination of these advances and novel paradigms covering all facets of the databases, models and systems, and applicable technologies. Industries and modern applications could share their experience in exploiting approaches to models and systems for academic and industrial solutions keeping pace with the latest technologies. Academics could identify open research issues coming from industrial and real-life contexts to continuously support methodological and technological solutions.

TOPICS OF INTEREST

This Special Issue welcomes technical, experimental, methodological, and data-analytical contributions, as well as development and implementation work, focused on real-world problems and systems and on general applications of AI, data mining, and data analytics methodologies in emerging technology solutions and real-world applications related to life, disease, cancer, healthcare, and hospitals that can help human beings to lead healthy lives. Topics include, but are not limited to, the following:

  • Data mining and knowledge discovery in healthcare
  • Machine and deep learning approaches for disease and health data
  • Decision support systems for healthcare and wellbeing
  • Regression and forecasting for medical and/or biomedical signals
  • Healthcare and wellness information systems
  • Medical signal and image processing and techniques
  • Applications of AI techniques in healthcare and wellbeing systems
  • Medical data, knowledge bases, and informatics
  • Intelligent computing and platforms in medicine and healthcare
  • Biomedical applications
  • Biomedical text mining
  • Deep learning and methods to explain disease prediction
  • Big data frameworks and architectures for applied medical and health data
  • Visualization and interactive interfaces related to healthcare systems
  • Recommending and decision-making models and systems based on AI and data mining technologies
  • Machine learning and deep learning applications for life, disease, cancer, healthcare, and hospitals
  • Querying and filtering on heterogeneous, multi-source streaming life and health data
  • Life situation awareness and social network analysis
  • Internet of things and data management for human life
  • Data and applications for human life; data and applications for technology improvement
  • Emerging technologies and applications of data, databases, big data, data mining, AI, models, systems, and semantic techniques in the following sectors: human life, disease, cancer, healthcare, hospitals, etc.

Dr. Keun Ho Ryu
Dr. Erdenebileg Batbaatar
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2000 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Disease and Cancer
  • Healthcare
  • Databases and Big Data
  • Bio, Medical, and Health Informatics
  • Human Life and Wellbeing
  • Emerging Technologies
  • Applications

Published Papers (19 papers)

Research

Open Access Article
A Partially Interpretable Adaptive Softmax Regression for Credit Scoring
Appl. Sci. 2021, 11(7), 3227; https://doi.org/10.3390/app11073227 - 03 Apr 2021
Viewed by 352
Abstract
Credit scoring is the process of determining whether a borrower will succeed or fail in repaying a loan, using the borrower's qualitative and quantitative characteristics. In recent years, machine learning algorithms have become widely studied in the development of credit scoring models. Although efficiently classifying good and bad borrowers is a core objective of a credit scoring model, there is still a need for a model that can explain the relationship between input and output. In this work, we propose a novel partially interpretable adaptive softmax (PIA-Soft) regression model to achieve both state-of-the-art predictive performance and a marginal interpretation of the relationship between input and output. We augment softmax regression with neural networks to make it adaptive for each borrower. Our PIA-Soft model consists of two main components: linear (softmax regression) and non-linear (neural network). The linear part explains the fundamental relationship between input and output variables. The non-linear part serves to improve the prediction performance by identifying the non-linear relationship between features for each borrower. The experimental results on public benchmark datasets show that our proposed model not only outperformed the machine learning baselines but also produced explanations that are logically consistent with the real world.
(This article belongs to the Special Issue Data Technology Applications in Life, Diseases, and Health)

Open Access Article
Identification of Statin’s Action in a Small Cohort of Patients with Major Depression
Appl. Sci. 2021, 11(6), 2827; https://doi.org/10.3390/app11062827 - 22 Mar 2021
Viewed by 278
Abstract
Statins are widely used as an effective therapy for ischemic vascular disorders and employed for primary and secondary prevention in cardiac and cerebrovascular diseases. Their hemostatic mechanism has also been shown to induce changes in cerebral blood flow that may result in neurocognitive improvement in subjects with Major Depressive Disorder. Behavioral data, various blood tests, and resting-state brain perfusion data were obtained at the start of this study and three months post-therapy from a small cohort of participants diagnosed with Major Depressive Disorder. Subjects received either rosuvastatin (10 mg) or placebo with their standard selective serotonin reuptake inhibitor therapy. At the end of the study, patients using rosuvastatin reported more positive mood changes than placebo users. However, standard statistical tests revealed no significant differences in any non-behavioral variables before and after the study. In contrast, feature selection techniques identified a small set of variables that may be affected by statin use and contribute to mood improvement. Classification models built to assess the distinguishability between the two groups showed an accuracy higher than 85% using only five selected features: two peripheral platelet activation markers, perfusion abnormality in the left inferior temporal gyrus, Attention Switching Task Reaction latency, and serum phosphorus levels. Thus, using machine learning tools, we could identify factors that may be causing self-reported mood improvement in patients due to statin use, possibly suggesting a regulatory role of statins in the pathogenesis of clinical depression.

Open Access Article
Three-Dimensional Tooth Model Reconstruction Using Statistical Randomization-Based Particle Swarm Optimization
Appl. Sci. 2021, 11(5), 2363; https://doi.org/10.3390/app11052363 - 07 Mar 2021
Viewed by 290
Abstract
The registration between images is a crucial part of 3-D tooth model reconstruction. In this paper, we introduce a registration method that uses our proposed statistical randomization-based particle swarm optimization (SR-PSO) algorithm with the iterative closest point (ICP) method to find the optimal affine transform between images. Hierarchical registration is also used, since several consecutive images are involved in the registration. We applied this algorithm to scanned commercial regular-tooth and orthodontic-tooth models. The results demonstrated that the final 3-D images provided good visualization to human eyes, with mean-squared errors of 7.37 µm² and 7.41 µm² for the two models, respectively. Compared with the particle swarm optimization (PSO) algorithm with the ICP method, the results from the proposed algorithm are much better.

Open Access Article
Comparing Ensemble-Based Machine Learning Classifiers Developed for Distinguishing Hypokinetic Dysarthria from Presbyphonia
Appl. Sci. 2021, 11(5), 2235; https://doi.org/10.3390/app11052235 - 03 Mar 2021
Viewed by 244
Abstract
It is essential to understand the voice characteristics of the normal aging process to accurately distinguish presbyphonia from neurological voice disorders. This study developed the best ensemble-based machine learning classifier that could distinguish hypokinetic dysarthria from presbyphonia using classification and regression tree (CART), random forest, gradient boosting algorithm (GBM), and XGBoost, and compared the prediction performance of the models. The subjects of this study were 76 elderly patients diagnosed with hypokinetic dysarthria and 174 patients with presbyphonia. The accuracy, sensitivity, and specificity of the developed models were compared to assess their prediction performance. The results showed that random forest had the best prediction performance on the test dataset (accuracy = 0.83, sensitivity = 0.90, specificity = 0.80, and area under the curve (AUC) = 0.85). The main predictors for detecting hypokinetic dysarthria were, in order of importance, Cepstral peak prominence (CPP), jitter, shimmer, L/H ratio, L/H ratio_SD, CPP max (dB), CPP min (dB), and CPPF0. Among them, CPP was the most important predictor for identifying hypokinetic dysarthria.

Open Access Article
Multidimensional Emotion Recognition Based on Semantic Analysis of Biomedical EEG Signal for Knowledge Discovery in Psychological Healthcare
Appl. Sci. 2021, 11(3), 1338; https://doi.org/10.3390/app11031338 - 02 Feb 2021
Viewed by 505
Abstract
The electroencephalogram (EEG), as a biomedical signal, is widely applied in the medical field, for example in the detection of Alzheimer's disease and Parkinson's disease. Moreover, by analyzing EEG-based emotions, the mental status of an individual can be revealed for further analysis of the psychological causes of some diseases such as cancer; mental status is considered a vital factor in the induction of certain diseases. Therefore, once emotional status can be correctly analyzed from the EEG signal, more healthcare-oriented applications can be developed. Currently, to achieve efficiency and accuracy, many EEG-based emotion recognition methods extract features by analyzing the overall characteristics of the signal, along with a channel selection optimization strategy to minimize information redundancy. These methods have proved effective; however, a big challenge remains when only single-channel information is used for the emotion recognition task. Therefore, to recognize multidimensional emotions from single-channel information, an emotion quantification analysis (EQA) method is proposed to objectively analyze the semantic similarity between emotions in the valence-arousal domain, and a multidimensional emotion recognition (EMER) model is proposed to recognize multidimensional emotions from partial fluctuation pattern (PFP) features of single-channel information. The results show that, even though semantically similar emotions are shown to have similar change patterns in EEG signals, each single channel of the 4 frequency bands can, on its own, efficiently recognize 20 different emotions with an average accuracy above 93%.

Open Access Article
Generalizability of Deep Learning System for the Pathologic Diagnosis of Various Cancers
Appl. Sci. 2021, 11(2), 808; https://doi.org/10.3390/app11020808 - 16 Jan 2021
Cited by 1 | Viewed by 340
Abstract
Deep learning (DL)-based approaches in tumor pathology help to overcome the limitations of subjective visual examination by pathologists and improve diagnostic accuracy and objectivity. However, it is unclear how a DL system trained to discriminate normal/tumor tissues in a specific cancer would perform on other tumor types. Herein, we cross-validated DL-based normal/tumor classifiers separately trained on tissue slides of cancers from the bladder, lung, colon and rectum, stomach, bile duct, and liver. Furthermore, we compared the differences between classifiers trained on frozen or formalin-fixed paraffin-embedded (FFPE) tissues. The area under the receiver operating characteristic (ROC) curve (AUC) ranged from 0.982 to 0.999 when the tissues were analyzed by classifiers trained on the same tissue preparation modalities and cancer types. However, the AUCs could drop to 0.476 and 0.439 when classifiers trained for different tissue modalities and cancer types were applied. Overall, optimal performance could be achieved only when the tissue slides were analyzed by classifiers trained on the same preparation modalities and cancer types.

Open Access Article
Evaluation of ECG Features for the Classification of Post-Stroke Survivors with a Diagnostic Approach
Appl. Sci. 2021, 11(1), 192; https://doi.org/10.3390/app11010192 - 28 Dec 2020
Viewed by 426
Abstract
Stroke is considered a major cause of death and neurological disorders commonly associated with elderly people. Electrocardiogram (ECG) signals are used as a powerful tool in diagnosing stroke, and the analysis of ECG signals has become the focus of stroke research. ECG changes and autonomic dysfunction are reportedly seen in patients with stroke. This study aimed to analyze ECG features and develop a classification model, with highly ranked ECG features as input variables, based on machine-learning techniques for diagnosing stroke disease. The study included 52 stroke patients (mean age 72.7 years, 63% male) and 80 control subjects (mean age 75.5 years, 39% male) for a total of 132 elderly subjects. Resting ECG signals in the lying-down position were measured using the BIOPAC MP150 system. The ECG signals were denoised using the discrete wavelet transform (DWT) method, and features such as heart rate variability (HRV), indices of time and spectral domains, and statistical and impulsive metrics, in addition to fiducial features, were extracted and analyzed. Our results showed that the values of the HRV variables were lower in the stroke group, revealing autonomic dysfunction in stroke patients. A statistically significant difference (p < 0.05) was observed between the groups in the low-frequency (LF)/high-frequency (HF) ratio, the time interval measured after the S wave to the beginning of the T wave (ST), and the time interval measured from the beginning of the Q wave to the end of the T wave (QT). Our study also highlighted some statistically significant risk factors of stroke, such as age, male sex, and dyslipidemia (p < 0.05). The k-nearest neighbors (KNN) model showed better classification results (accuracy 96.6%, precision 94.3%, recall 99.1%, and F1-score 96.6%) than the random forest, support vector machine (SVM), Naïve Bayes, and logistic regression models. Thus, our study reported some of the notable ECG changes in the study participants and also indicated that ECG could aid in diagnosing stroke disease.
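As a quick consistency check on the KNN figures quoted in this abstract, the F1-score is the harmonic mean of precision and recall. The minimal Python sketch below recomputes it from the reported precision and recall. The helper name `f1_score` is hypothetical and is not taken from the article.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract for the KNN model.
reported_precision = 0.943
reported_recall = 0.991

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.966"
```

The recomputed value agrees with the reported F1-score of 96.6%.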

Open Access Article
A Model for Assessing the Quantitative Effects of Heterogeneous Affinity in Malaria Transmission along with Ivermectin Mass Administration
Appl. Sci. 2020, 10(23), 8696; https://doi.org/10.3390/app10238696 - 04 Dec 2020
Viewed by 312
Abstract
Using an agent-based model of malaria, we present numerical evidence that in communities of individuals having an affinity varying within a broad range of values, disease transmission may increase up to 300%. Moreover, our findings provide new insight into how to combine different strategies for the prevention of malaria transmission. In particular, we uncover a relationship between the level of heterogeneity and the level of conventional and unconventional anti-malarial drug administration (ivermectin and gametocidal agents), which, when taken together, define a control parameter that tunes between disease persistence and elimination. Finally, we also provide evidence that the entomological inoculation rate and the product of the parasite and sporozoite rates are both good indicators of malaria incidence in the presence of heterogeneity in disease transmission, and may offer a possible improvement in that setting over classical standard measures such as the basic reproductive number.

Open Access Article
A Deep Learning Approach with Feature Derivation and Selection for Overdue Repayment Forecasting
Appl. Sci. 2020, 10(23), 8491; https://doi.org/10.3390/app10238491 - 27 Nov 2020
Cited by 1 | Viewed by 362
Abstract
Risk control has always been a major challenge in finance. Overdue repayment is a frequently encountered discreditable behavior in online lending. Motivated by the powerful capabilities of deep neural networks, we propose a fusion deep learning approach, namely AD-MBLSTM, based on the deep neural network (DNN), multi-layer bi-directional long short-term memory (LSTM) (BiLSTM), and the attention mechanism for forecasting overdue repayment behavior from historical repayment records. Furthermore, we present a novel feature derivation and selection method for the data preprocessing procedure. Visualization and interpretability improvement work is also implemented to explore the critical time points and causes of overdue repayment behavior. In addition, we present a new dataset originating from a practical application scenario in online lending. We evaluate our proposed framework on the dataset and compare its performance with various general machine learning models and neural network models. Comparison results and the ablation study demonstrate that our proposed model outperforms many effective general machine learning models by a large margin, and that each indispensable sub-component plays an active role.

Open Access Article
Identification of Metabolic Syndrome Based on Anthropometric, Blood and Spirometric Risk Factors Using Machine Learning
Appl. Sci. 2020, 10(21), 7741; https://doi.org/10.3390/app10217741 - 01 Nov 2020
Viewed by 488
Abstract
Metabolic syndrome (MS) is an aggregation of coexisting conditions that can indicate an individual's high risk of major diseases, including cardiovascular disease, stroke, cancer, and type 2 diabetes. We conducted a cross-sectional survey to evaluate potential risk factor indicators by identifying relationships between MS and anthropometric and spirometric factors, along with blood parameters, among Korean adults. A total of 13,978 subjects were enrolled from the Korea National Health and Nutrition Examination Survey. Statistical analysis was performed using a complex sampling design to represent the entire Korean population. We conducted binary logistic regression analysis to evaluate and compare potential associations of all included factors. We constructed prediction models based on Naïve Bayes and logistic regression algorithms. The performance of the prediction models was evaluated using the area under the curve (AUC) and calibration curves. Among all factors, triglyceride exhibited a strong association with MS in both men (odds ratio (OR) = 2.711, 95% confidence interval (CI) [2.328–3.158]) and women (OR = 3.515 [3.042–4.062]). Regarding anthropometric factors, the waist-to-height ratio demonstrated a strong association in men (OR = 1.511 [1.311–1.742]), whereas waist circumference was the strongest indicator in women (OR = 2.847 [2.447–3.313]). Forced expiratory volume in 6 s and forced expiratory flow 25–75% were strongly associated with MS in both men (OR = 0.822 [0.749–0.903]) and women (OR = 1.150 [1.060–1.246]). The wrapper-based logistic regression prediction model showed the highest predictive power in both men and women (AUC = 0.868 and 0.932, respectively). Our findings revealed that several factors were associated with MS and suggested the potential of employing machine learning models to support the diagnosis of MS.

Open Access Article
A Comparative Analysis of Machine Learning Methods for Class Imbalance in a Smoking Cessation Intervention
Appl. Sci. 2020, 10(9), 3307; https://doi.org/10.3390/app10093307 - 09 May 2020
Cited by 3 | Viewed by 919
Abstract
Smoking is one of the major public health issues and has a significant impact on premature death. In recent years, numerous decision support systems based on machine learning methods have been developed to deal with smoking cessation. However, the inevitable class imbalance is considered a major challenge in deploying such systems. In this paper, we present an empirical comparison of machine learning techniques for dealing with the class imbalance problem in the prediction of smoking cessation intervention among the Korean population. For the class imbalance problem, the objective of this paper is to improve prediction performance by using synthetic oversampling techniques, namely the synthetic minority over-sampling technique (SMOTE) and adaptive synthetic sampling (ADASYN). This is achieved through an experimental design that comprises three components. First, the best representative features are selected in two phases: the lasso method and multicollinearity analysis. Second, newly balanced data are generated using the SMOTE and ADASYN techniques. Third, machine learning classifiers are applied to construct prediction models for all subjects and for each gender. To assess the effectiveness of the prediction models, the f-score, type I error, type II error, balanced accuracy, and geometric mean indices are used. Comprehensive analysis demonstrates that Gradient Boosting Trees (GBT), Random Forest (RF), and multilayer perceptron neural network (MLP) classifiers achieved the best performance for all subjects and for each gender when SMOTE and ADASYN were used. SMOTE with the GBT and RF models also provides feature importance scores that enhance the interpretability of the decision-support system. In addition, the presented synthetic oversampling techniques combined with machine learning models are shown to outperform baseline models in smoking cessation prediction.
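The core idea behind the SMOTE oversampling named in this abstract can be sketched in a few lines: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The Python sketch below is illustrative only and is not the authors' implementation. The function name `smote_sketch`, its parameters, and the toy data are all assumptions, and ADASYN's density-based weighting is not covered.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric tuples (hypothetical toy data).
    k: number of nearest minority neighbours to choose from.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)

    def dist2(a, b):
        # Squared Euclidean distance between two points.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point, excluding itself.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: dist2(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Random position along the segment from base to the chosen neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy minority class with two features (illustrative values only).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority, n_new=4)
print(len(new_points))  # prints 4
```

Because every synthetic point is a convex combination of two existing minority samples, it always lies within the bounding box of the minority class, which is why SMOTE fills in the minority region rather than extrapolating beyond it.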
(This article belongs to the Special Issue Data Technology Applications in Life, Diseases, and Health)
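The core idea behind SMOTE-style oversampling mentioned in the abstract above can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function name smote_like, the toy data, and the parameters k and seed are assumptions made for the example. Each synthetic point is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbours.

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of base within the minority class (excluding base itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical two-feature minority class (illustrative values only).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like(minority, n_new=4)
print(new_points)
```

ADASYN follows the same interpolation principle but generates more synthetic points around minority samples that are harder to learn; that weighting step is not shown here.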

Open AccessArticle
Assessment of Anthropometric and Body Composition Risk Factors in Patients with both Hypertension and Stroke in the Korean Population
Appl. Sci. 2020, 10(9), 3046; https://doi.org/10.3390/app10093046 - 27 Apr 2020
Cited by 1 | Viewed by 520
Abstract
The association of hypertension or stroke with anthropometric and body composition indices has been evaluated for each condition individually but not for patients with both conditions. Here, we compared these indices in patients with both hypertension and stroke and evaluated the best indicators for identifying patients with both diseases in the Korean population. Data were obtained from the Korea National Health and Nutrition Examination Survey (KNHANES) conducted from 2008 to 2011. Data analysis was carried out using a complex sampling design that considered the weighting for personal analysis to represent the whole population in Korea. Binary logistic regression was conducted to evaluate potential associations, and areas under the curve were calculated to compare the predictive power of all variables for identifying patients with hypertension or both hypertension and stroke. Among all hypertension-related factors, waist-to-height ratio (WHtR) exhibited a strong association in men (odds ratio (OR) = 1.390 [1.127–1.714]), whereas trunk-fat mass (OR = 1.613 [1.237–2.104]) and thoracic spine bone mineral density (BMD) (OR = 1.250 [1.044–1.496]) represented the best indicators in women. Comparison of anthropometric and body composition indices in patients with both diseases revealed that left arm BMD and left leg fat mass (LLF) were strongly associated in both men (OR = 0.504 [0.320–0.793]) and women (OR = 0.391 [0.208–0.734]). However, among patients with both hypertension and stroke, WHtR (OR = 1.689 [1.080–2.641]) and LLF (OR = 0.391 [0.208–0.734]) were the best risk predictors in men and women, respectively. Our findings suggest that the best indicators among patients with hypertension or both hypertension and stroke may differ between men and women.
(This article belongs to the Special Issue Data Technology Applications in Life, Diseases, and Health)
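Odds ratios with bracketed intervals, as reported in the abstract above, are conventionally obtained from a logistic regression coefficient and its standard error. The sketch below shows that conversion under the assumption of a Wald-type 95% confidence interval (z = 1.96); the abstract does not state the interval type, and the coefficient and standard error used here are illustrative values, not figures from the study.

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic-regression coefficient and its standard error
    into an odds ratio with a Wald confidence interval (95% when z = 1.96)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Illustrative values only; they are not taken from the study.
beta, se = 0.33, 0.107
odds_ratio, lower, upper = odds_ratio_with_ci(beta, se)
print(f"OR = {odds_ratio:.3f} [{lower:.3f}-{upper:.3f}]")
```

An odds ratio above 1 indicates that a higher value of the index is associated with higher odds of the condition, and an interval that excludes 1 indicates statistical significance at the chosen level.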

Open AccessArticle
Computer-Assisted Relevance Assessment: A Case Study of Updating Systematic Medical Reviews
Appl. Sci. 2020, 10(8), 2845; https://doi.org/10.3390/app10082845 - 20 Apr 2020
Viewed by 565
Abstract
It is becoming more challenging for health professionals to keep up to date with current research. To save time, many experts perform evidence syntheses on systematic reviews instead of primary studies. Consequently, reviews need to be updated to include new evidence, which requires a significant amount of effort and delays the update process. This effort can be significantly reduced by applying computer-assisted techniques to identify relevant studies. In this study, we followed a “human-in-the-loop” approach by engaging medical experts in a controlled user experiment to update systematic reviews. The primary outcome of interest was the comparison of performance levels achieved when judging full abstracts versus single sentences accompanied by Natural Language Inference labels. The experiment included post-task questionnaires to collect participants’ feedback on the usability of the computer-assisted suggestions. The findings lead us to conclude that sentence-level relevance assessment achieves higher recall.
(This article belongs to the Special Issue Data Technology Applications in Life, Diseases, and Health)

Open AccessArticle
A Sequential Emotion Approach for Diagnosing Mental Disorder on Social Media
Appl. Sci. 2020, 10(5), 1647; https://doi.org/10.3390/app10051647 - 01 Mar 2020
Cited by 2 | Viewed by 717
Abstract
Mental disorders affect numerous individuals; however, mental health care remains in a passive state in which only a minority of individuals actively seek professional help. Owing to the rapid development of social networks, individuals accustomed to expressing their raw feelings on social media include patients who are suffering great pain from mental disorders. To distinguish individuals who merely feel sad from those who have mental disorders, the symptoms of mental disorder are taken into consideration. These symptoms consistently appear as regular patterns, such as shifts between emotions or the repetition of one representative emotion over a certain period. We propose a Mental Disorder Identification Model (MDI-Model) to identify the four most commonly occurring mental disorders in the world: anxiety disorder, bipolar disorder, depressive disorder, and obsessive-compulsive disorder (OCD). The MDI-Model compares users’ sequential emotion patterns to identify mental disorders and detect those who are at high risk. Tweets of users diagnosed with mental disorders were analyzed to evaluate the accuracy of the MDI-Model; furthermore, the tweets of users from six different occupations were analyzed to verify the precision and predict the tendency toward mental disorder among the different occupations. Results show that the MDI-Model can efficiently diagnose users with high precision across different mental statuses: severe, moderate, and mild stages, a tendency toward mental disorder, and a mentally healthy status.
(This article belongs to the Special Issue Data Technology Applications in Life, Diseases, and Health)

Open AccessArticle
Multi-Task Topic Analysis Framework for Hallmarks of Cancer with Weak Supervision
Appl. Sci. 2020, 10(3), 834; https://doi.org/10.3390/app10030834 - 24 Jan 2020
Viewed by 763
Abstract
The hallmarks of cancer represent an essential concept for discovering novel knowledge about cancer and for capturing the complexity of cancer. Owing to the lack of topic analysis frameworks optimized specifically for cancer data, studies on topic modeling in cancer research still face a substantial challenge. Recently, deep learning (DL) based approaches have been successfully employed to learn semantic and contextual information from scientific documents using word embeddings according to the hallmarks of cancer (HoC). However, those approaches are only applicable to labeled data, and only a comparatively small number of documents are labeled by experts. In the real world, a massive number of unlabeled documents are available online. In this paper, we present a multi-task topic analysis (MTTA) framework to analyze cancer hallmark-specific topics from documents. The MTTA framework consists of three main subtasks: (1) cancer hallmark learning (CHL), used to learn cancer hallmarks from existing labeled documents; (2) weak label propagation (WLP), used to classify a large number of unlabeled documents with the model pre-trained in the CHL task; and (3) topic modeling (ToM), used to discover topics for each hallmark category. In the CHL task, we employed a convolutional neural network (CNN) with pre-trained word embeddings that represent semantic meanings obtained from a large unlabeled corpus. In the ToM task, we employed latent topic models such as latent Dirichlet allocation (LDA) and probabilistic latent semantic analysis (PLSA) to capture the semantic information learned by the CNN model for topic analysis. To evaluate the MTTA framework, we collected a large number of documents related to lung cancer in a case study. We also conducted a comprehensive performance evaluation of the MTTA framework, comparing it with several approaches.
(This article belongs to the Special Issue Data Technology Applications in Life, Diseases, and Health)
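The flow from weak label propagation to per-hallmark topic discovery described in the abstract above can be sketched in a few lines. Everything in this example is a hypothetical stand-in: a keyword lookup replaces the pre-trained CNN classifier, simple word counts replace a real topic model such as LDA or PLSA, and the documents are invented for illustration. It shows only how data moves through the stages, not the authors' method.

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for the pre-trained CHL classifier: a keyword lookup.
HALLMARK_KEYWORDS = {
    "angiogenesis": "inducing angiogenesis",
    "metastasis": "activating invasion and metastasis",
}

def predict_hallmark(doc):
    """Return a weak hallmark label for a document, or None if nothing matches."""
    for keyword, hallmark in HALLMARK_KEYWORDS.items():
        if keyword in doc.lower():
            return hallmark
    return None

def propagate_and_summarise(unlabeled_docs, top_n=2):
    """Weakly label documents, group them by hallmark, and list frequent words."""
    groups = defaultdict(list)
    for doc in unlabeled_docs:
        label = predict_hallmark(doc)
        if label is not None:
            groups[label].append(doc)
    # Word counts stand in for a real topic model such as LDA or PLSA.
    return {
        label: [w for w, _ in Counter(" ".join(docs).lower().split()).most_common(top_n)]
        for label, docs in groups.items()
    }

# Invented example documents (not from the study's corpus).
docs = [
    "Tumor angiogenesis depends on VEGF signalling",
    "VEGF inhibitors reduce angiogenesis in lung tumors",
    "Metastasis of lung cancer cells to bone",
]
topics = propagate_and_summarise(docs)
print(topics)
```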

Open AccessArticle
A Deep Learning Model for Estimation of Patients with Undiagnosed Diabetes
Appl. Sci. 2020, 10(1), 421; https://doi.org/10.3390/app10010421 - 06 Jan 2020
Cited by 3 | Viewed by 1750
Abstract
A screening model for undiagnosed diabetes mellitus (DM) is important for early medical care. Insufficient research has been carried out on developing a screening model for undiagnosed DM using machine learning techniques. Thus, the primary objective of this study was to develop a screening model for patients with undiagnosed DM using a deep neural network. We conducted a cross-sectional study using data from the Korean National Health and Nutrition Examination Survey (KNHANES) 2013–2016. A total of 11,456 participants were selected, excluding those with diagnosed DM, age < 20 years, or missing data. KNHANES 2013–2015 was used as a training dataset and analyzed to develop a deep learning model (DLM) for undiagnosed DM. The DLM was evaluated with 4444 participants who were surveyed in the 2016 KNHANES. The DLM was constructed using seven non-invasive variables (NIV): age, waist circumference, body mass index, gender, smoking status, hypertension, and family history of diabetes. The model showed appropriate performance (area under the curve (AUC): 80.11) compared with existing screening models. The DLM developed in this study for patients with undiagnosed diabetes could contribute to early medical care.
(This article belongs to the Special Issue Data Technology Applications in Life, Diseases, and Health)
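The area under the ROC curve reported in the abstract above can be understood as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The following minimal sketch computes it by pairwise comparison; the function name and the scores and labels are illustrative assumptions, not data from the study. The result lies between 0 and 1, whereas the abstract's figure of 80.11 is presumably expressed on a percentage scale.

```python
def auc_by_pairs(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical model scores and true labels (1 = undiagnosed DM, 0 = no DM).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
auc = auc_by_pairs(scores, labels)
print(round(auc, 4))
```

The pairwise form is quadratic in the number of samples, so it suits small illustrations; rank-based formulas give the same value more efficiently on large datasets.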

Open AccessArticle
Consumer-Driven Usability Test of Mobile Application for Tea Recommendation Service
Appl. Sci. 2019, 9(19), 3961; https://doi.org/10.3390/app9193961 - 20 Sep 2019
Viewed by 749
Abstract
The rapidly growing interest in healthy lifestyles and the health benefits of foods, together with the growing tea-consuming population, is driving the growth of the tea industry. In particular, the growing preference among Millennials for premium blended tea is leading the growth of the tea market. In this paper, we study the feasibility of recommendation services for blended tea, which has not been addressed well by existing recommender systems. To this end, we design TeaPick™, a mobile application that suggests a blend of tea suited to the user’s preferences, including desired health benefits. To evaluate the application and its recommendations, we conduct both a usability test and a consumer acceptance test with 31 participants. Our user study shows that the participants were positive about the recommendation service provided by our application and were generally satisfied with the recommended tea.
(This article belongs to the Special Issue Data Technology Applications in Life, Diseases, and Health)

Other

Open AccessProject Report
A Study of the Effectiveness Verification of Computer-Based Dementia Assessment Contents (Co-Wis): Non-Randomized Study
Appl. Sci. 2020, 10(5), 1579; https://doi.org/10.3390/app10051579 - 26 Feb 2020
Viewed by 847
Abstract
Computer-based neuropsychological assessments have many advantages over traditional neuropsychological assessments. However, limited data are available on the validity and reliability of computer-based assessments. The purpose of this study was to examine the reliability and validity of computer-based dementia assessment contents (Co-Wis). This study recruited 113 participants from Yeungnam University Medical Center in Daegu from June 2019 to December 2019 and received ethical approval. Participants were evaluated using standard and objective dementia cognitive test tools such as the Korean version of the Mini-Mental State Examination (K-MMSE), the Clinical Dementia Rating Scale (CDR), and the Standardized Seoul Neuropsychological Screening Battery-II (SNSB-II). To verify the effectiveness of Co-Wis, concurrent validity, test–retest reliability (Pearson’s correlation coefficients), construct validity (factor analysis), and signal detection analysis (ROC curve) were used. In most of the Co-Wis subtests, the concurrent validity and test–retest reliability showed statistically significant correlations (p < 0.05, p < 0.01). The factor analysis showed that Co-Wis assessed most major cognitive areas (Tucker–Lewis Index (TLI) = 0.876, Comparative Fit Index (CFI) = 0.897, RMSEA = 0.88). Thus, Co-Wis appears to be clinically applicable, with high reliability and validity. In the future, we should develop tests to evaluate both standard data and big data-based machine learning.
(This article belongs to the Special Issue Data Technology Applications in Life, Diseases, and Health)
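Test–retest reliability, which the abstract above measures with Pearson's correlation coefficients, compares scores from the same participants across two sessions. A minimal sketch of that calculation follows; the function name and the scores are invented for illustration and do not come from the Co-Wis study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical subtest scores for five participants at two sessions.
first_session = [10, 12, 15, 18, 20]
second_session = [11, 12, 16, 17, 21]
r = pearson_r(first_session, second_session)
print(round(r, 3))
```

A coefficient close to 1 indicates that participants keep roughly the same relative standing across sessions, which is what a reliable assessment is expected to show.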

Open AccessProject Report
PROMISE CLIP Project: A Retrospective, Multicenter Study for Prostate Cancer that Integrates Clinical, Imaging and Pathology Data
Appl. Sci. 2019, 9(15), 2982; https://doi.org/10.3390/app9152982 - 25 Jul 2019
Cited by 3 | Viewed by 1110
Abstract
Many medical demands still need to be resolved for prostate cancer (PCa), including better diagnosis and predictive medicine. To accomplish this, diverse medical data need to be integrated, and intelligent software (SW) based on various types of medical data needs to be developed. Various types of information technology have been used to address these medical demands of PCa. We initiated the PROstate Medical Intelligence System Enterprise-Clinical, Imaging, and Pathology (PROMISE CLIP) project, a multicenter, big data study to develop PCa SW for patients with PCa and clinicians. We integrated the clinical data of 7257 patients, the imaging data of 610 patients, and 39,000 cores of pathology digital scanning data from four tertiary hospitals in South Korea. We developed the PROMISE CLIP registry based on the integrated clinical, imaging, and pathology data. Related intelligent SW has been developed to help patients and clinicians decide on the best treatment option. The PROMISE CLIP study provides guidelines for intelligent SW development to address medical demands for PCa. The PROMISE CLIP registry plays an important role in advancing PCa research and care.
(This article belongs to the Special Issue Data Technology Applications in Life, Diseases, and Health)