Search Results (868)

Search Parameters:
Keywords = Naïve Bayes Classifier

10 pages, 2733 KB  
Proceeding Paper
Mild Cognitive Impairment Identification System Based on Physiological Characteristics and Interactive Games
by Ming-An Chung, Zhi-Xuan Zhang, Jun-Hao Zhang, Chia-Chun Hsu, Yi-Ju Yao, Jin-Hong Chou, Ming-Chun Hsieh, Sung-Yun Chai, Shang-Jui Huang, Kai-Xiang Chen, Chia-Wei Lin and Pin-Han Chen
Eng. Proc. 2026, 128(1), 19; https://doi.org/10.3390/engproc2026128019 - 10 Mar 2026
Abstract
As the global aging population increases, the early detection and prevention of Alzheimer’s disease (AD) have become important public health priorities. To address the subjectivity and poor timeliness of traditional assessment methods, this paper proposes a multimodal dementia prevention system that combines physiological sensing, a gamified interface, and a classification model. The system includes an interactive joystick to measure pulse and blood pressure. A Chinese music game app increases the participation of the elderly and reduces their sense of rejection through gamified interaction. After the physiological data were standardized by Z-score, they were input into three small-sample classifiers (Gaussian Naïve Bayes, Fisher Linear Discriminant Analysis, and Logistic Regression) for the binary classification of AD. System performance was evaluated using leave-one-out cross-validation. Experimental results show that Logistic Regression performed best in situations with extremely small samples and class imbalance, with an F1-score of 0.700, higher than the other two classifiers. Integrating dynamic features and model-fusion techniques could further enhance the system's clinical potential for the early prediction of dementia.
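The pipeline this abstract describes (Z-score standardization, a Gaussian Naïve Bayes classifier, leave-one-out evaluation) can be sketched in plain Python. This is an illustrative reimplementation under my own assumptions, not the authors' code; the function names and toy data are invented.

```python
import math

def zscore(column):
    """Standardize one feature column to zero mean, unit variance."""
    mean = sum(column) / len(column)
    var = sum((x - mean) ** 2 for x in column) / len(column)
    std = math.sqrt(var) or 1.0  # guard against constant columns
    return [(x - mean) / std for x in column]

def fit_gaussian_nb(X, y):
    """Estimate per-class priors and per-feature Gaussian parameters."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        stats = []
        for j in range(len(X[0])):
            col = [r[j] for r in rows]
            m = sum(col) / len(col)
            v = sum((c - m) ** 2 for c in col) / len(col) + 1e-9  # variance smoothing
            stats.append((m, v))
        model[label] = (len(rows) / len(y), stats)
    return model

def predict(model, x):
    """Pick the class with the highest log-posterior."""
    best, best_lp = None, -math.inf
    for label, (prior, stats) in model.items():
        lp = math.log(prior)
        for xj, (m, v) in zip(x, stats):
            lp += -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
        if lp > best_lp:
            best, best_lp = label, lp
    return best

def loo_predictions(X, y):
    """Leave-one-out: train on n-1 samples, predict the held-out one."""
    preds = []
    for i in range(len(X)):
        Xtr, ytr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        preds.append(predict(fit_gaussian_nb(Xtr, ytr), X[i]))
    return preds
```

On well-separated toy data `loo_predictions` recovers every held-out label; with real physiological features the F1-score, not raw accuracy, is the figure the paper reports.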

46 pages, 990 KB  
Review
Machine Learning for Outdoor Thermal Comfort Assessment and Optimization: Methods, Applications and Perspectives
by Giouli Mihalakakou, John A. Paravantis, Alexandros Romeos, Sonia Malefaki, Paraskevas N. Georgiou and Athanasios Giannadakis
Sustainability 2026, 18(5), 2600; https://doi.org/10.3390/su18052600 - 6 Mar 2026
Abstract
Urban environments face increasing thermal stress from climate change and the Urban Heat Island effect, with significant implications for livability, public health, and energy sustainability. Outdoor thermal comfort, defined as the state in which conditions are perceived as acceptable, depends on interactions among meteorological, morphological, physiological, and behavioral factors. This review synthesizes the application of machine learning (ML) to outdoor thermal comfort assessment into a practice-oriented taxonomy. Research spans diverse climates and urban forms, using inputs across environmental and human domains. Supervised learning dominates. Regression approaches (linear regression, support vector regression, random forest, gradient boosting) and classification algorithms (decision trees, support vector machines, K-nearest neighbors, Naïve Bayes, random forest classifiers) are widely used to predict thermal indices such as the Physiological Equivalent Temperature and Universal Thermal Climate Index, or to classify subjective responses including thermal sensation, comfort, and acceptability. Unsupervised learning (clustering, principal component analysis) supports identification of microclimatic zones and perceptual clusters, while deep learning (multilayer perceptrons, convolutional and recurrent neural networks, generative adversarial networks) achieves superior accuracy for complex, high-dimensional, and spatiotemporal data. Algorithms such as random forests, support vector machines, and gradient boosting consistently show strong performance for both indices and subjective responses when integrating multi-domain inputs. Semi-supervised and reinforcement learning remain underexplored but offer promise for leveraging large-scale sensor data and enabling adaptive, real-time comfort management. The review concludes with a roadmap emphasizing explainable artificial intelligence, scalable surrogate modeling, and integration with simulation-based optimization and parametric design tools.

23 pages, 10789 KB  
Article
Statistical Feature Engineering for Robot Failure Detection: A Comparative Study of Machine Learning and Deep Learning Classifiers
by Sertaç Savaş
Sensors 2026, 26(5), 1649; https://doi.org/10.3390/s26051649 - 5 Mar 2026
Abstract
Industrial robots are widely used in critical tasks such as assembly, welding, and material handling as core components of modern manufacturing systems. For the reliable operation of these systems, early and accurate detection of execution failures is crucial. In this study, a comprehensive comparison of machine learning and deep learning methods is conducted for the classification of robot execution failures using data acquired from force–torque sensors. Three different feature engineering approaches are proposed. The first is a Baseline approach that includes 90 raw time-series features. The second is the Domain-6 approach, which consists of 6 basic statistical features per sensor (36 in total). The third is the Domain-12 approach, which comprises 12 comprehensive statistical features per sensor (72 in total). The domain features include the mean, standard deviation, minimum, maximum, range, slope, median, skewness, kurtosis, RMS, energy, and IQR. In total, ten classification algorithms are evaluated, comprising eight machine learning methods: Support Vector Machines (SVM), Random Forest (RF), k-Nearest Neighbors (KNN), Artificial Neural Network (ANN), Naive Bayes (NB), Decision Trees (DT), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM), as well as two deep learning models: a One-Dimensional Convolutional Neural Network (CNN-1D) and Long Short-Term Memory (LSTM). For the traditional machine learning algorithms, 5 × 5 nested cross-validation is used, whereas for the deep learning models, 5-fold cross-validation with a 20% validation split is employed. To ensure statistical reliability, all experiments are repeated over 30 independent runs. The experimental results demonstrate that feature engineering has a decisive impact on classification performance. The single highest accuracy across all feature sets (93.85% ± 0.90) is achieved by the Naive Bayes classifier using the Baseline features. The Domain-12 feature set provides consistent, substantial improvements across many algorithms. The results are reported using accuracy, precision, recall, and F1-score metrics and are supported by confusion matrices. Finally, permutation feature importance analysis indicates that the skewness features of the Fx and Fy sensors are the most critical variables for failure detection. Overall, these findings show that time-domain statistical features offer an effective approach for robot failure classification.
(This article belongs to the Section Fault Diagnosis & Sensors)
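The twelve domain statistics listed in the abstract (mean through IQR) translate directly into a per-channel feature extractor. A stdlib-only sketch; the exact definitions here (population moments, slope from a least-squares fit against sample index, linear-interpolation quantiles) are my assumptions, not necessarily the paper's.

```python
import math

def domain_features(signal):
    """Domain-12-style statistical summary of one sensor channel."""
    n = len(signal)
    mean = sum(signal) / n
    var = sum((x - mean) ** 2 for x in signal) / n
    std = math.sqrt(var)
    s = sorted(signal)

    def quantile(q):
        # linear interpolation between order statistics
        idx = q * (n - 1)
        lo, hi = int(idx), min(int(idx) + 1, n - 1)
        return s[lo] + (idx - lo) * (s[hi] - s[lo])

    # slope of a least-squares line fitted against the sample index
    t_mean = (n - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (x - mean) for t, x in enumerate(signal)) / denom
    m3 = sum((x - mean) ** 3 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    return {
        "mean": mean, "std": std, "min": s[0], "max": s[-1],
        "range": s[-1] - s[0], "slope": slope,
        "median": quantile(0.5),
        "skewness": m3 / std ** 3 if std else 0.0,
        "kurtosis": m4 / var ** 2 if var else 0.0,
        "rms": math.sqrt(sum(x * x for x in signal) / n),
        "energy": sum(x * x for x in signal),
        "iqr": quantile(0.75) - quantile(0.25),
    }
```

Applying this to each of the six force–torque channels would yield 72 features per sample, matching the Domain-12 count.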

14 pages, 2065 KB  
Article
Automated Early Detection of Skin Cancer Using a CNN-ViT-Attention-Based Hybrid Model
by Zekiye Kanat, Merve Kesim Onal, Harun Bingol, Serpil Sener, Engin Avci and Muhammed Yildirim
Biomedicines 2026, 14(3), 583; https://doi.org/10.3390/biomedicines14030583 - 5 Mar 2026
Abstract
Background/Objectives: Skin cancer is a serious disease: as the cancerous tissue deepens, there is a risk that the cancer will spread to other parts of the body. Early diagnosis is therefore important because it allows treatment to begin sooner. This study proposes a hybrid model for the early diagnosis of skin cancer. Methods: The proposed model was developed using Convolutional Neural Network (CNN) and Vision Transformer (ViT) architectures together with k-Nearest Neighbors (KNN), Support Vector Machine (SVM), Naive Bayes (NB), Neural Network, Decision Tree (DT), and Logistic Regression (LR) classifiers. Furthermore, the proposed model was fine-tuned to improve its diagnostic performance. Two attention mechanisms, channel and spatial, were used together in the proposed model. The HAM10000 dataset was used in the experiments. Class weighting was applied to balance the classes in the dataset. Results: The proposed model was also compared with CNN and ViT architectures frequently used in the literature. Among these models, the highest accuracy, 95.1%, was obtained with the proposed model. Conclusions: The proposed model could serve as a decision support system for dermatologists in the diagnosis of skin cancer.

14 pages, 2336 KB  
Article
Limitations of Retrospective Machine Learning Models for Predicting Tracheostomy After Cardiac Surgery
by Felix Wiesmueller, Johannes Rösch, Stephan Kersting and Thomas Strecker
Diagnostics 2026, 16(5), 771; https://doi.org/10.3390/diagnostics16050771 - 4 Mar 2026
Abstract
Background/Objectives: Early tracheostomy seems favorable in patients undergoing prolonged ventilation after surgery. Hence, predicting tracheostomy after cardiac surgery is essential. Recently proposed prediction models aim to support this decision-making process, but their diagnostic validity across other patient populations remains uncertain. Methods: A retrospective single-center study was performed at a university hospital. The patient sample included consecutive patients who underwent cardiac surgery between 2010 and 2020. Patients who underwent tracheostomy after cardiac surgery were assigned to the intervention group; patients who had not undergone tracheostomy were randomly assigned to the control group. An existing model was evaluated by receiver operating characteristic curve analysis. Four sets of risk features were chosen based on regression analysis, lasso regularization, random forest, or clinical domain knowledge. New models were developed using machine learning methods: random forest, naïve Bayes, nearest neighbor, and deep learning. Multiple models were trained with each feature set and then assessed using confusion matrices on an independent test set. Results: A total of 4744 patients were included in this study, 118 of whom were in the tracheostomy group. Diagnostic accuracy of the existing model showed insufficient discrimination (area under the curve (AUC) = 0.57). Likewise, the newly developed models also showed overall poor diagnostic discrimination across all feature sets and algorithms. Conclusions: This study shows the diagnostic limitations of retrospective clinical data for the prediction of tracheostomy, thereby informing the design of future prospective diagnostic studies. Training new models should not rely on retrospective data alone. Instead, prospective data collection and the integration of physiological or imaging-based diagnostics could contribute to the development of a good classifier.
(This article belongs to the Special Issue Artificial Intelligence for Clinical Diagnostic Decision Making)

21 pages, 1301 KB  
Article
Predicting 30-Day Readmission Risks in Breast Cancer Patients: An Explainable Machine Learning Approach
by Mlondolozi Mqadi, Elliot Mbunge and Tebogo Makaba
Appl. Sci. 2026, 16(5), 2467; https://doi.org/10.3390/app16052467 - 4 Mar 2026
Abstract
Hospital readmission within 30 days remains a significant challenge in oncology practice, contributing to higher healthcare costs, treatment delays, and poorer patient outcomes. Existing predictive models for breast cancer readmission are often limited by inadequate interpretability and generalisability. This study develops and evaluates an explainable machine learning (ML) framework to predict 30-day hospital readmissions among breast cancer patients, with specific emphasis on methodological transparency and avoidance of information leakage. A retrospective dataset including demographic, clinical, and treatment-related variables such as age, comorbidity burden, ECOG performance status, baseline neutrophil count, and dosage adjustments was analysed. Multiple ML classifiers were evaluated, including Logistic Regression, Support Vector Machine, Naïve Bayes, K-Nearest Neighbours, Decision Tree, Random Forest, and XGBoost, using repeated stratified cross-validation (5 × 10 folds). Class imbalance was addressed using SMOTE applied strictly within the training folds to prevent data leakage. Out-of-fold performance metrics included ROC-AUC, PR-AUC, calibration curves, and Brier scores. Random Forest demonstrated the strongest discrimination, with a specificity of 0.57 ± 0.33, the highest among all models, and a ROC-AUC of 0.68 ± 0.17, reasonable for the small, imbalanced dataset. For interpretability, each model was refit on the full dataset and analysed using Shapley Additive Explanations (SHAP), Partial Dependence Plots (PDP), and LIME. Comorbidity burden and ECOG performance status consistently emerged as the most influential predictors across all explainability techniques, aligning with established clinical evidence. The findings highlight the feasibility of applying explainable ML methods to small, imbalanced oncology datasets and demonstrate their potential to support early clinical risk identification in breast cancer care.
(This article belongs to the Section Computing and Artificial Intelligence)
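"SMOTE applied strictly within the training folds" means synthetic minority samples are interpolated only from training-fold points, so no synthetic near-copy of a test point can leak into evaluation. A minimal stdlib sketch of the interpolation step; the function name and defaults are mine, and it assumes at least two minority samples.

```python
import random

def smote(minority, k=2, n_new=4, rng=None):
    """Generate synthetic minority samples by interpolating between a
    point and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = rng or random.Random(0)  # deterministic by default

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: dist2(base, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> nb
        synthetic.append([b + gap * (n - b) for b, n in zip(base, nb)])
    return synthetic
```

Each synthetic point lies on the segment between a minority sample and one of its nearest minority neighbours, which is why oversampling must happen after, not before, the train/test split.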

19 pages, 4439 KB  
Proceeding Paper
Comparative Analysis of Machine-Learning and Deep-Learning Approaches for Accurate Animal Disease Prediction and Health Risk Assessment
by Bhagyashree Panigrahy, Akhil Subudhi, Tanushree Harichandan, Neelamadhab Padhy and Rasmita Panigrahi
Eng. Proc. 2026, 124(1), 52; https://doi.org/10.3390/engproc2026124052 - 2 Mar 2026
Abstract
Effective, efficient, and early animal disease prediction is a challenging task. Identifying and reducing animal health risks is important for preventing disease outbreaks and improving cattle management. This study presents machine-learning and hybrid deep-learning models for animal risk prediction. We employed six classifiers (Support Vector Machine, Logistic Regression, Decision Tree, K-Nearest Neighbors, Gaussian Naive Bayes, and Random Forest) along with two feature-enhanced hybrid variants (RF–CNN and RF–ANN), eight models in total, to detect risks to animal health early. Our main objective is to develop and evaluate robust ML models for predicting animal health risks. We also present a comparative study of the conventional and hybrid models to construct a decision support system for early disease prediction. The experimental work reveals that RF obtained the highest accuracy of 95.77%, a macro F1-score of 0.9343, and a weighted F1-score of 0.9515. We also conducted statistical tests to confirm the robustness of the model for animal disease prediction. The proposed framework provides a scalable, interpretable decision-support system for real-world animal health monitoring and early disease intervention.
(This article belongs to the Proceedings of The 6th International Electronic Conference on Applied Sciences)

31 pages, 5508 KB  
Article
An Edge–Fog–Cloud IoT Framework for Real-Time Cardiac Monitoring and Rapid Clinical Alerts in Hospital Wards
by Tehseen Baig, Nauman Riaz Chaudhry, Reema Choudhary, Pankaj Yadav, Younus Ahamad Shaik and Ayesha Rashid
Future Internet 2026, 18(3), 130; https://doi.org/10.3390/fi18030130 - 2 Mar 2026
Abstract
Continuously monitoring cardiac patients in general hospital wards remains difficult because of manual charting and slow clinical reaction to a worsening physiological state. This paper outlines an edge- and fog-based Internet of Things (IoT) healthcare system that acquires, processes, and prioritizes patients' vital signs in real time to minimize alert latency and speed up clinical interventions. Wearable 12-lead ECG sensors transmit physiological measurements, such as heart rate, blood pressure, and oxygen saturation, to an intelligent edge service, where preprocessing, threshold-based triage, and machine learning ECG classification are performed; selective synchronization of physiological data with a cloud backend and data delivery to clinicians are provided by a mobile application. The proposed architecture combines a ribbon-like streaming scheme, Flask-based gateway services, and Firebase Firestore to coordinate scalable mobile/cloud multi-client data dissemination. To capture borderline clinical deterioration, which often goes unnoticed by conventional threshold systems, physiological parameters are classified into normal, alarming, emergency, and a new state, average. The Pan–Tompkins++ peak detection algorithm and multiple edge-resident classifiers, such as random forest, XGBoost, decision tree, naive Bayes, K-nearest neighbor, and support vector machine, are used to analyze the ECG waveforms. Experimental analysis on PhysioNet datasets and tests in real wards show that the ensemble models reach an ECG classification precision of 91.96 percent and that snapshot-driven mobile alerts decrease routine patient evaluation time by several minutes, to an average of 15.23 ± 2.71 s. These results suggest that edge-centric IoT systems are appropriate for latency-critical hospital settings and that fog-based coordination is useful for next-generation smart healthcare systems.
(This article belongs to the Special Issue Edge and Fog Computing for the Internet of Things, 2nd Edition)
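The four-state triage (normal, average, alarming, emergency) can be modelled as a normal band per vital plus an escalation margin, with the new "average" state falling just outside the band. The band values and margins below are invented placeholders for illustration, not clinical thresholds, and the function names are mine.

```python
SEVERITY = ["normal", "average", "alarming", "emergency"]

def classify_vital(value, normal, margin):
    """Grade one vital: inside the normal band -> normal; each further
    `margin` units outside the band raises severity by one level."""
    lo, hi = normal
    dev = max(lo - value, value - hi, 0)          # distance outside the band
    level = min(int(dev // margin) + (1 if dev > 0 else 0), 3)
    return SEVERITY[level]

def triage(vitals, bands):
    """Overall patient state = the most severe individual vital.
    bands maps vital name -> ((low, high), margin)."""
    levels = [SEVERITY.index(classify_vital(v, *bands[k]))
              for k, v in vitals.items()]
    return SEVERITY[max(levels)]
```

A reading slightly outside its band lands in "average", the borderline state the paper adds precisely because plain in/out thresholds would miss it.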

21 pages, 1352 KB  
Article
Raman Spectroscopy Assisted by Machine Learning Algorithms for the Prediction of Different Types of Oral Cancer Cells
by Maria Lasalvia, Vito Capozzi and Giuseppe Perna
Appl. Sci. 2026, 16(5), 2380; https://doi.org/10.3390/app16052380 - 28 Feb 2026
Abstract
Oral squamous cell carcinoma (OSCC) cytology involves extracting a cell sample consisting of single cells or small clusters of cells from a patient's head and neck area in order to identify abnormal morphological characteristics after staining. This method is used to screen for early cancer and for the formation of metastases within the oral cavity. OSCC diagnosis therefore depends partly on pathologists' skills and on laboratory instrumentation. The use of Raman spectroscopy could support diagnoses performed using traditional methods, providing information based on the cellular biochemical environment. Technical drawbacks related to the low signal-to-noise ratio of Raman spectroscopy and the need to obtain diagnostic information within a reasonable time frame have recently led to the analysis of Raman spectra using machine learning (ML) methods in order to obtain reliable information about the correct attribution of unknown cellular spectra. We therefore used Raman micro-spectroscopy combined with machine learning methods to build classification models that allow the diagnosis of different grades of OSCC in cell samples. The Raman spectra were analysed in the 980–1800 cm−1 range by focusing the laser beam onto the nucleus and cytoplasm regions of single cells from different cell lines modelling healthy (HaCaT) and cancer (Cal-27, SAS and HSC-3) cytological samples. We considered six classification algorithms (k-Nearest Neighbours, Logistic Regression, Naïve Bayes, artificial Neural Network, Random Forest and Support Vector Machine) to classify unknown Raman spectra. We report two classification tasks: a 4-level classification, which encompasses healthy cells, two different types of cancer cells, and one type of metastatic cells, and a 3-level classification, which includes healthy cells, non-metastatic cancer cells, and metastatic cancer cells. Our findings show that both the Neural Network and Support Vector Machine algorithms applied to Raman spectra measured in the cytoplasm region achieve sensitivity, precision and F1-score values larger than 90% in the 3-level classification, whereas the Support Vector Machine performs better than the Neural Network in the 4-level classification. These results contribute to increasing confidence in the clinical translation of ML-assisted Raman spectroscopy as a tool to support conventional cytological techniques.
(This article belongs to the Section Optics and Lasers)

15 pages, 2774 KB  
Article
A Prediction Model for Uncoating Receptor Usage in Human Enteroviruses Based on Amino Acid Sequences and a Naive Bayes Algorithm
by Yongtao Jia, Zhenyu Xie, Guoying Zhu and Changzheng Dong
Viruses 2026, 18(2), 236; https://doi.org/10.3390/v18020236 - 13 Feb 2026
Abstract
This study constructed a bioinformatics prediction algorithm for human enterovirus uncoating receptors based on amino acid sequences and physicochemical properties. Based on the availability of uncoating receptor information and three-dimensional (3D) structural data, human enterovirus serotypes were classified into training, validation, and prediction datasets. Using the amino acid sequences of receptor-binding sites and their physicochemical properties as model features, a prediction model was constructed using the Naive Bayes algorithm and a bioinformatic network analysis method. The results showed that both the training and validation datasets achieved a prediction accuracy of 100%. Among the 56 serotypes in the prediction dataset, the vast majority utilized seven known types of uncoating receptors (e.g., SCARB2, CAR, and ICAM-1), while a minority of serotypes may share the same novel, unknown receptor. This study indicates that uncoating receptors can be accurately predicted from the amino acid sequences and physicochemical properties of human enteroviruses. Furthermore, the three-dimensional structural features at receptor-binding sites can be reflected through the corresponding amino acid sequences and their physicochemical properties. This study facilitates more in-depth investigation of enterovirus pathogenic mechanisms and provides important insights for the development of vaccines and antiviral drugs.
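A per-position categorical Naive Bayes over aligned residues is one simple way to realize a sequence-based model like the one described above. This sketch omits the physicochemical-property features the paper also uses; all names and the toy data are illustrative.

```python
import math
from collections import Counter

def fit_seq_nb(seqs, labels):
    """Per-class log-prior plus per-position residue counts,
    for equal-length (aligned) sequences."""
    model = {}
    for c in set(labels):
        rows = [s for s, y in zip(seqs, labels) if y == c]
        counts = [Counter(s[i] for s in rows) for i in range(len(seqs[0]))]
        model[c] = (math.log(len(rows) / len(seqs)), counts, len(rows))
    return model

def predict_seq(model, seq, alphabet_size=20):
    """Class with the highest Laplace-smoothed log-posterior."""
    best, best_lp = None, -math.inf
    for c, (log_prior, counts, n) in model.items():
        lp = log_prior + sum(
            math.log((counts[i][aa] + 1) / (n + alphabet_size))
            for i, aa in enumerate(seq))
        if lp > best_lp:
            best, best_lp = c, lp
    return best
```

Laplace smoothing keeps a residue never seen at a position in one class from zeroing out that class entirely, which matters with the small per-receptor serotype counts involved here.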

24 pages, 652 KB  
Article
Multi-Objective Harris Hawks Optimization with NSGA-III for Feature Selection in Student Performance Prediction
by Nabeel Al-Milli
Computers 2026, 15(2), 112; https://doi.org/10.3390/computers15020112 - 6 Feb 2026
Abstract
Student performance is an important factor in the success of any education process; as a result, early detection of students at risk is critical for enabling timely and effective educational interventions. However, most educational datasets are complex and do not have a stable number of features. We therefore propose a new algorithm called MOHHO-NSGA-III, a multi-objective feature-selection framework that jointly optimizes classification performance, feature subset compactness, and prediction stability across cross-validation folds. The algorithm combines Harris Hawks Optimization (HHO), which balances exploration and exploitation, with NSGA-III, which preserves solution diversity along the Pareto front. Moreover, a diversity-management strategy is introduced to mitigate premature convergence. We validated the algorithm on the Portuguese and Mathematics datasets from the UCI Student Performance repository. Selected features were evaluated with five classifiers (k-NN, Decision Tree, Naive Bayes, SVM, LDA) through 10-fold cross-validation repeated over 21 independent runs. MOHHO-NSGA-III consistently selected 12 out of 30 features (a 60% reduction) while achieving 4.5% higher average accuracy than the full feature set (Wilcoxon test; p<0.01 across all classifiers). The most frequently selected features were past failures, absences, and family support, aligning with educational research on student success factors. This suggests the proposed algorithm produces not just accurate but also interpretable models suitable for deployment in institutional early warning systems.
(This article belongs to the Section AI-Driven Innovations)

28 pages, 2032 KB  
Article
Addressing Class Imbalance in Fetal Health Classification: Rigorous Benchmarking of Multi-Class Resampling Methods on Cardiotocography Data
by Zainab Subhi Mahmood Hawrami, Mehmet Ali Cengiz and Emre Dünder
Diagnostics 2026, 16(3), 485; https://doi.org/10.3390/diagnostics16030485 - 5 Feb 2026
Abstract
Background/Objectives: Fetal health is essential in prenatal care, influencing both maternal and fetal outcomes. Cardiotocography (CTG) monitors uterine contractions and fetal heart rate, yet manual interpretation exhibits significant inter-examiner variability. Machine learning offers automated alternatives; however, class imbalance in CTG datasets, where pathological cases constitute less than 10% of samples, leads to poor detection of minority classes. This study aims to provide the first systematic benchmark comparing five resampling strategies across seven classifier families for multi-class CTG classification, evaluated using imbalance-aware metrics rather than overall accuracy alone. Methods: Seven machine learning models were employed: Naïve Bayes (NB), Random Forest (RF), Linear Discriminant Analysis (LDA), k-Nearest Neighbors (KNN), Linear Support Vector Machine (SVM), Multinomial Logistic Regression (MLR), and Multi-Layer Perceptron (MLP). To address class imbalance, we evaluated the original unbalanced dataset (base) and five resampling methods: SMOTE, BSMOTE, ADASYN, NearMiss, and SCUT. Performance was evaluated on a held-out test set using Balanced Accuracy (BACC), Macro-F1, the Macro-Matthews Correlation Coefficient (Macro-MCC), and Macro-Averaged ROC-AUC. We also report per-class ROC curves. Results: Among all models, RF proved most reliable. Training on the original distribution (base) yielded the highest BACC (0.9118), whereas RF combined with BSMOTE provided the strongest class-balanced performance (Macro-MCC = 0.8533, Macro-F1 = 0.9073) with a near-perfect ROC-AUC (approximately 0.986–0.989). Overall, resampling effects proved model dependent. While some classifiers achieved optimal performance on the natural class distribution, oversampling techniques, particularly SMOTE and BSMOTE, demonstrated significant improvements in minority class discrimination and class-balanced metrics across multiple model families. Notably, certain models benefited substantially from resampling, exhibiting enhanced Macro-F1, BACC, and minority class recall without sacrificing overall accuracy. Conclusions: These findings establish robust, model-agnostic baselines for CTG-based fetal health screening. They highlight that strategic oversampling can translate improved minority class discrimination into clinically meaningful performance gains, supporting deployment in cost-sensitive and threshold-aware clinical settings.
(This article belongs to the Special Issue Artificial Intelligence in Biomedical Diagnostics and Analysis 2025)

28 pages, 2256 KB  
Article
A Moving Window-Based Feature Extraction Method for Gearbox Fault Detection Using Vibration Signals
by Ietezaz ul Hassan, Krishna Panduru, Daniel Riordan and Joseph Walsh
Machines 2026, 14(2), 178; https://doi.org/10.3390/machines14020178 - 4 Feb 2026
Viewed by 295
Abstract
Early gearbox defect detection is imperative for reducing unplanned downtime, ensuring reliability and efficiency, and minimizing maintenance expenses. In recent years, with the rise of Artificial Intelligence (AI) and digital transformation, gearbox defect detection using AI has gained popularity. Machine learning (ML) classifiers [...] Read more.
Early gearbox defect detection is imperative for reducing unplanned downtime, ensuring reliability and efficiency, and minimizing maintenance expenses. In recent years, with the rise of Artificial Intelligence (AI) and digital transformation, gearbox defect detection using AI has gained popularity. Machine learning (ML) classifiers are widely used and shift gearbox condition monitoring from manual inspection to automated systems. This work proposes a moving window-based method for extracting statistical features from recorded vibration signals from the gearbox. The extracted features were used to train traditional ML classifiers. Moving window sizes of 300, 400, 500, 600, 700, and 800 were applied to extract statistical features from the publicly available benchmark dataset. The six moving window sizes yielded six datasets, one per window size. The generated datasets were partitioned using the K-fold cross-validation method to train and test ML models. This study explored and evaluated seven prominent ML classifiers: Decision Tree, Random Forest, Support Vector Machine (SVM), Naïve Bayes, K-Nearest Neighbor (KNN), Gradient Boosting Classifier (GBC), and Logistic Regression. The experimental results demonstrated that SVM, Logistic Regression, and GBC can outperform other ML classifiers. The experimental results in terms of accuracy, precision, and recall revealed that classifier performance improves as the size of the moving window used for feature extraction increases. Full article
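The moving-window feature extraction described here can be sketched as follows. This is a generic illustration of the technique, not the paper's code: the exact feature set and window stride are assumptions, and the feature names (`mean`, `std`, `rms`, `peak`) are my own labels for common vibration statistics:

```python
import math

def moving_window_features(signal, window, step=None):
    """Slide a fixed-size window over a 1-D vibration signal and
    compute simple statistical features per window."""
    step = step or window  # non-overlapping windows by default
    rows = []
    for start in range(0, len(signal) - window + 1, step):
        w = signal[start:start + window]
        n = len(w)
        mean = sum(w) / n
        var = sum((x - mean) ** 2 for x in w) / n
        rows.append({
            "mean": mean,
            "std": math.sqrt(var),
            "rms": math.sqrt(sum(x * x for x in w) / n),
            "peak": max(abs(x) for x in w),
        })
    return rows
```

Calling this with `window=300` through `window=800` on the same recording produces the six feature datasets the study describes, one table of per-window statistics per window size.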
(This article belongs to the Section Machines Testing and Maintenance)

19 pages, 1502 KB  
Proceeding Paper
Machine Learning-Based Prognostic Modeling of Thyroid Cancer Recurrence
by Duppala Rohan, Kasaraneni Purna Prakash, Yellapragada Venkata Pavan Kumar, Gogulamudi Pradeep Reddy, Maddikera Kalyan Chakravarthi and Pradeep Reddy Challa
Eng. Proc. 2026, 124(1), 13; https://doi.org/10.3390/engproc2026124013 - 3 Feb 2026
Viewed by 551
Abstract
Thyroid cancer is the most common type of endocrine cancer. Most cases are called differentiated thyroid cancer (DTC), which includes papillary, follicular, and hurthle cell types. DTC usually grows slowly and has a good prognosis, especially when found early and treated with surgery, [...] Read more.
Thyroid cancer is the most common type of endocrine cancer. Most cases are called differentiated thyroid cancer (DTC), which includes papillary, follicular, and Hürthle cell types. DTC usually grows slowly and has a good prognosis, especially when found early and treated with surgery, radioactive iodine, and thyroid hormone therapy. However, the cancer can recur, sometimes years after treatment. This recurrence can appear as abnormal blood tests or as lumps in the neck or other parts of the body. Being able to predict and detect these recurrences early is important for improving patient care and planning follow-up treatment. To this end, this research explores different machine learning algorithms and neural networks to effectively predict DTC recurrence. A total of 17 models were evaluated: 16 classifiers, namely, logistic regression, random forest, k-nearest neighbours, Gaussian naïve Bayes, multi-layered perceptron, extreme gradient boosting, adaptive boosting, gradient boosting classifier, extra tree classifier (ETC), light gradient boosting machine, categorical boosting, Bernoulli naïve Bayes, complement naïve Bayes, multinomial naïve Bayes, histogram-based gradient boosting, and nearest centroid, together with an artificial neural network. Among the classifiers, ETC performed best with 95.3% accuracy, 95.1% precision, 87.92% recall, 98.18% specificity, 91.21% F1-score, 98.84% AUROC and 97.66% AUPRC on the first dataset, and 99.47% accuracy, 94.83% precision, 98.62% sensitivity, 99.54% specificity, 96.65% F1-score, 99.95% AUROC, and 99.37% AUPRC on the second dataset. To improve model interpretability, Shapley Additive Explanations (SHAP) was also used to explain the contribution of each clinical feature to the model's predictions, allowing for transparent, patient-specific insights into which factors were most important for predicting recurrence, thereby supporting the proposed model's clinical relevance. Full article
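Gaussian naïve Bayes, one of the classifiers in this study's lineup (and a recurring baseline across these results), is simple enough to sketch from first principles. The class below is an illustrative toy implementation, not the study's code; real experiments would use a tested library such as scikit-learn's `GaussianNB`:

```python
import math

class TinyGaussianNB:
    """Toy Gaussian naive Bayes: per class, fit one Gaussian per
    feature, then classify by maximum log-posterior."""

    def fit(self, X, y):
        self.stats = {}
        for c in set(y):
            rows = [x for x, t in zip(X, y) if t == c]
            n = len(rows)
            means = [sum(col) / n for col in zip(*rows)]
            # Small variance floor avoids division by zero.
            vars_ = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
            self.stats[c] = (n / len(y), means, vars_)
        return self

    def predict(self, X):
        preds = []
        for x in X:
            best, best_lp = None, -math.inf
            for c, (prior, means, vars_) in self.stats.items():
                lp = math.log(prior)  # log prior
                for v, m, s2 in zip(x, means, vars_):
                    # Log of the Gaussian likelihood for this feature.
                    lp += -0.5 * math.log(2 * math.pi * s2) \
                          - (v - m) ** 2 / (2 * s2)
                if lp > best_lp:
                    best, best_lp = c, lp
            preds.append(best)
        return preds
```

The "naïve" assumption is the per-feature independence inside each class; it is rarely true of clinical variables, yet the model often remains a useful baseline, which is why it appears in so many of the comparisons on this page.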
(This article belongs to the Proceedings of The 6th International Electronic Conference on Applied Sciences)

15 pages, 2051 KB  
Article
Interpretable Multi-Model Framework for Early Warning of SME Loan Delinquency
by Ardak Akhmetova, Assem Shayakhmetova and Nurken Abdurakhmanov
Risks 2026, 14(2), 25; https://doi.org/10.3390/risks14020025 - 31 Jan 2026
Viewed by 491
Abstract
The rapid expansion of small and medium enterprise (SME) lending has intensified the need for accurate and interpretable credit risk forecasting. Financial institutions must anticipate potential business loan delinquency to maintain portfolio stability and meet regulatory standards. This study proposes an interpretable multi-model [...] Read more.
The rapid expansion of small and medium enterprise (SME) lending has intensified the need for accurate and interpretable credit risk forecasting. Financial institutions must anticipate potential business loan delinquency to maintain portfolio stability and meet regulatory standards. This study proposes an interpretable multi-model framework that integrates statistical (correlation screening and ordinary least squares regression), probabilistic (Gaussian Naïve Bayes), and classical time-series (SARIMA) methods to balance explanatory insight and predictive accuracy in delinquency forecasting. Ordinary least squares regression is used to quantify the direction and strength of each driver and yields statistically significant coefficients (β ≈ 1.336 for the overdue 15+ days bucket, p < 10⁻²²). The Naïve Bayes classifier provides a probabilistic early-warning signal with an out-of-sample accuracy of 55%, precision of 43%, recall of 75%, and ROC AUC of 0.371. Finally, a seasonal ARIMA model fitted on the selected regressors achieves a mean absolute percentage error (MAPE) of 7.6% and an out-of-sample R² of 0.49, demonstrating competitive forecasting performance while maintaining interpretability. The results show that the framework offers actionable insights for risk managers by identifying key risk drivers, providing probabilistic alarms, and generating calibrated point forecasts. The proposed approach contributes to the development of intelligent and explainable forecasting and control systems for modern financial institutions. Full article
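The MAPE figure quoted for the SARIMA model follows the standard definition of mean absolute percentage error; the helper below is my own illustrative sketch of that formula, not code from the paper:

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent.
    Assumes no actual value is zero (division by the actual)."""
    assert len(actual) == len(forecast) and len(actual) > 0
    return 100.0 * sum(abs((a - f) / a)
                       for a, f in zip(actual, forecast)) / len(actual)
```

Because each error is scaled by the actual value, a MAPE of 7.6% reads directly as "forecasts miss by about 7.6% on average", which is why it is a common headline metric for interpretable forecasting frameworks like this one.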
(This article belongs to the Special Issue AI for Financial Risk Perception)
