MDPI - Publisher of Open Access Journals

24 pages, 2667 KiB

Open Access Article

Transformer-Driven Fault Detection in Self-Healing Networks: A Novel Attention-Based Framework for Adaptive Network Recovery

by Parul Dubey, Pushkar Dubey and Pitshou N. Bokoro

Mach. Learn. Knowl. Extr. 2025, 7(3), 67; https://doi.org/10.3390/make7030067 - 16 Jul 2025

Viewed by 98

Abstract

Fault detection and remaining useful life (RUL) prediction are critical tasks in self-healing network (SHN) environments and industrial cyber–physical systems. These domains demand intelligent systems capable of handling dynamic, high-dimensional sensor data. However, existing optimization-based approaches often struggle with imbalanced datasets, noisy signals, and delayed convergence, limiting their effectiveness in real-time applications. This study utilizes two benchmark datasets—EFCD and SFDD—which represent electrical and sensor fault scenarios, respectively. These datasets pose challenges due to class imbalance and complex temporal dependencies. To address this, we propose a novel hybrid framework combining Attention-Augmented Convolutional Neural Networks (AACNN) with transformer encoders, enhanced through Enhanced Ensemble-SMOTE for balancing the minority class. The model captures spatial features and long-range temporal patterns and learns effectively from imbalanced data streams. The novelty lies in the integration of attention mechanisms and adaptive oversampling in a unified fault-prediction architecture. Model evaluation is based on multiple performance metrics, including accuracy, F1-score, MCC, RMSE, and score*. The results show that the proposed model outperforms state-of-the-art approaches, achieving up to 97.14% accuracy and a score* of 0.419, with faster convergence and improved generalization across both datasets. Full article
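The abstract reports MCC alongside accuracy for imbalanced fault data. The following is a minimal sketch of why the two can disagree; the confusion counts are hypothetical and are not taken from the EFCD or SFDD datasets.

```python
import math

def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when a row or column of the matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical test set: 95 healthy samples, 5 faults,
# and a degenerate model that always predicts "healthy".
print(accuracy(tp=0, fp=0, fn=5, tn=95))  # 0.95
print(mcc(tp=0, fp=0, fn=5, tn=95))       # 0.0
```

Accuracy rewards the degenerate model, while MCC stays at zero until the minority class is actually detected.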

14 pages, 574 KiB

Open Access Article

Ki-67 as a Predictor of Metastasis in Adrenocortical Carcinoma: Artificial Intelligence Insights from Retrospective Imaging Data

by Andrew J. Goulian and David S. Yee

J. Clin. Med. 2025, 14(14), 4829; https://doi.org/10.3390/jcm14144829 - 8 Jul 2025

Viewed by 242

Abstract

Background/Objectives: Adrenocortical carcinoma (ACC) is a rare, aggressive malignancy with poor prognosis, particularly in metastatic cases. The Ki-67 proliferation index is a recognized marker of tumor aggressiveness, yet its role in guiding diagnostic imaging and surgical decision-making remains underexplored. This study evaluates Ki-67’s predictive value for metastasis at diagnosis, leveraging artificial intelligence (AI) to inform personalized, minimally invasive strategies for ACC management. Methods: We retrospectively analyzed 53 patients with histologically confirmed ACC from the Adrenal-ACC-Ki67-Seg dataset in The Cancer Imaging Archive. All patients had Ki-67 indices from surgical specimens and preoperative contrast-enhanced CT scans. Descriptive statistics, t-tests, ANOVA, and multivariable logistic regression evaluated associations between Ki-67, tumor size, age, and metastasis. Random Forest classifiers—with and without the Synthetic Minority Oversampling Technique (SMOTE)—were developed to predict metastasis. A Ki-67-only model served as a baseline comparator. Model performance was assessed using the area under the curve (AUC) and DeLong’s test. Results: Patients with metastatic disease had significantly higher Ki-67 indices (mean 39.4% vs. 21.6%, p < 0.05). Logistic regression identified Ki-67 as the sole significant predictor (OR = 1.06, 95% CI: 1.01–1.12). The Ki-67-only model achieved an AUC of 0.637, while the SMOTE-enhanced Random Forest achieved an AUC of 0.994, significantly outperforming all others (p < 0.001). Conclusions: Ki-67 is significantly associated with metastasis at ACC diagnosis and demonstrates independent predictive value in regression analysis. However, integration with machine learning models incorporating tumor size and age significantly improves overall predictive accuracy, supporting AI-assisted risk stratification and precision imaging strategies in adrenal cancer care. Full article

(This article belongs to the Special Issue Recent Advances in Imaging and Interventional Techniques for Renal and Adrenal Diseases)
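SMOTE, as used in the abstract above, creates synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The following is a minimal pure-Python sketch of that idea; the function name, the parameters `k` and `seed`, and the toy points are illustrative assumptions, not the authors' pipeline.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each placed on the segment between a
    random minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from x to neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, neighbour)))
    return synthetic

# Hypothetical two-feature minority class.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6)
```

A common precaution with any oversampler is to apply it to the training folds only, so that synthetic points derived from a sample never appear in the data used to score the model.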

15 pages, 7876 KiB

Open Access Article

Fine-Scale Risk Mapping for Dengue Vector Using Spatial Downscaling in Intra-Urban Areas of Guangzhou, China

by Yunpeng Shen, Zhoupeng Ren, Junfu Fan, Jianpeng Xiao, Yingtao Zhang and Xiaobo Liu

Insects 2025, 16(7), 661; https://doi.org/10.3390/insects16070661 - 25 Jun 2025

Viewed by 474

Abstract

Generating fine-scale risk maps for mosquito-borne disease vectors is essential for guiding spatially targeted vector control interventions in urban settings, given limited public health resources. This study aimed to generate fine-scale risk maps for dengue vectors using routine vector surveillance data collected at the township scale. We integrated monthly township-specific Breteau Index (BI) data from Guangzhou city (2019 to 2020) with covariates extracted from remote sensing imagery and other geospatial datasets to develop an original random forest (RF) model for predicting hotspot areas (BI ≥ 5). We implemented three data resampling techniques (undersampling, oversampling, and hybrid sampling) to improve the model’s performance and evaluated it using the ROC-AUC, Recall, Specificity, and G-means metrics. Finally, we generated downscaled risk maps for BI hotspot areas at a 1000 m grid scale by applying the optimal model to fine-scale input data. Our findings indicate the following: (1) data resampling techniques significantly improved the prediction accuracy of the original RF model, demonstrating robust spatial downscaling capabilities for fine-scale grids; (2) the spatial distribution of BI hotspot areas within townships exhibits significant heterogeneity. The fine-scale risk mapping approach overcomes the limitations of previous coarse-scale risk maps and provides critical evidence for policymakers to better understand the distribution of BI hotspot areas, facilitating pixel-level spatially targeted vector control interventions in intra-urban areas. Full article

(This article belongs to the Special Issue Control and Surveillance of Mosquitoes to Reduce the Spread of Mosquito-Borne Disease)
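Recall, Specificity, and G-mean, the metrics named in the abstract above, can all be computed from a binary confusion matrix. The following sketch uses hypothetical counts for hotspot (BI ≥ 5) versus non-hotspot townships; the numbers are not from the study.

```python
import math

def recall(tp, fn):
    """Share of true hotspots that the model flags."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Share of true non-hotspots that the model leaves unflagged."""
    return tn / (tn + fp)

def g_mean(tp, fn, tn, fp):
    """Geometric mean of recall and specificity."""
    return math.sqrt(recall(tp, fn) * specificity(tn, fp))

# Hypothetical counts: 20 hotspots (8 found), 180 non-hotspots (171 correct).
print(recall(tp=8, fn=12))
print(specificity(tn=171, fp=9))
print(g_mean(tp=8, fn=12, tn=171, fp=9))
```

Because G-mean is a product, a model that ignores the rare hotspot class scores near zero even when its specificity is high, which is why it suits resampling comparisons.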

24 pages, 4961 KiB

Open Access Article

A Small-Sample Scenario Optimization Scheduling Method Based on Multidimensional Data Expansion

by Yaoxian Liu, Kaixin Zhang, Yue Sun, Jingwen Chen and Junshuo Chen

Algorithms 2025, 18(6), 373; https://doi.org/10.3390/a18060373 - 19 Jun 2025

Viewed by 293

Abstract

Currently, deep reinforcement learning (DRL) has been widely applied to energy system optimization and scheduling, and DRL methods rely heavily on historical data. The lack of historical operation data in new integrated energy systems leads to insufficient DRL training samples, which easily causes underfitting and insufficient exploration of the decision space and thus reduces the accuracy of the scheduling plan. In addition, conventional data-driven methods also struggle to accurately predict renewable energy output when training data are insufficient, which further degrades scheduling performance. Therefore, this paper proposes a small-sample scenario optimization scheduling method based on multidimensional data expansion. Firstly, based on spatial correlation, the daily power curves of PV power plants with measured power are screened, and the meteorological similarity is calculated using the multi-kernel maximum mean discrepancy (MK-MMD) to generate historical renewable energy output data for the target distributed PV system through the capacity conversion method; secondly, based on the existing daily load data of different types, the load historical data are generated using the stochastic and simultaneous sampling methods to construct the full historical dataset; subsequently, for the sample imbalance problem in the small-sample scenario, an oversampling method is used to augment the scarce samples, and the XGBoost PV output prediction model is established; finally, the optimal scheduling model is transformed into a Markov decision process, which is solved using the Deep Deterministic Policy Gradient (DDPG) algorithm. The effectiveness of the proposed method is verified through numerical examples. Full article

(This article belongs to the Section Algorithms for Multidisciplinary Applications)

15 pages, 640 KiB

Open Access Article

Interpretable Machine Learning for Serum-Based Metabolomics in Breast Cancer Diagnostics: Insights from Multi-Objective Feature Selection-Driven LightGBM-SHAP Models

by Emek Guldogan, Fatma Hilal Yagin, Hasan Ucuzal, Sarah A. Alzakari, Amel Ali Alhussan and Luca Paolo Ardigò

Medicina 2025, 61(6), 1112; https://doi.org/10.3390/medicina61061112 - 19 Jun 2025

Viewed by 763

Abstract

Background and Objectives: Breast cancer accounts for 12.5% of all new cancer cases in women worldwide. Early detection significantly improves survival rates, but traditional biomarkers like CA 15-3 and HER2 lack sensitivity and specificity, particularly for early-stage disease. Advances in metabolomics and machine learning, particularly explainable artificial intelligence (XAI), offer new opportunities for identifying robust biomarkers and improving diagnostic accuracy. This study aimed to identify and validate serum-based metabolic biomarkers for breast cancer using advanced metabolomic profiling techniques and a Light Gradient Boosting Machine (LightGBM) model. Additionally, SHapley Additive exPlanations (SHAP) were applied to enhance model interpretability and biological insight. Materials and Methods: The study included 103 breast cancer patients and 31 healthy controls. Serum samples underwent liquid and gas chromatography–time-of-flight mass spectrometry (LC-TOFMS and GC-TOFMS). Mutual Information (MI), Sparse Partial Least Squares (sPLS), Boruta, and Multi-Objective Feature Selection (MOFS) approaches were applied to the data for biomarker discovery. LightGBM, AdaBoost, and Random Forest were employed for classification, and class imbalance was addressed with the Synthetic Minority Oversampling Technique (SMOTE). SHAP analysis ranked metabolites based on their contribution to model predictions. Results: Compared to other feature selection approaches, the MOFS approach was more robust in terms of predictive performance, and metabolites identified by this method were used in subsequent analyses for biomarker discovery. LightGBM outperformed the AdaBoost and Random Forest models, achieving 86.6% accuracy, 89.1% sensitivity, 84.2% specificity, and an F1-score of 87.0%. SHAP analysis identified 2-Aminobutyric acid, choline, and coproporphyrin as the most influential metabolites, with dysregulation of these markers associated with breast cancer risk.
Conclusions: This study is among the first to integrate SHAP explainability with metabolomic profiling, bridging computational predictions and biological insights for improved clinical adoption. This study demonstrates the effectiveness of combining metabolomics with XAI-driven machine learning for breast cancer diagnostics. The identified biomarkers not only improve diagnostic accuracy but also reveal critical metabolic dysregulations associated with disease progression. Full article

(This article belongs to the Special Issue Recent Advances in Diagnosis and Therapy of Gynecologic and Breast Cancers)

27 pages, 2815 KiB

Open Access Article

Machine Learning-Augmented Triage for Sepsis: Real-Time ICU Mortality Prediction Using SHAP-Explained Meta-Ensemble Models

by Hülya Yilmaz Başer, Turan Evran and Mehmet Akif Cifci

Biomedicines 2025, 13(6), 1449; https://doi.org/10.3390/biomedicines13061449 - 12 Jun 2025

Viewed by 686

Abstract

Background/Objectives: Optimization algorithms are critical in many fields and dynamical systems because they help identify the best possible solutions to complex problems while improving efficiency, cutting costs, and boosting performance. Metaheuristic optimization algorithms, in turn, are inspired by natural phenomena and offer practical solutions to complex optimization problems. Because such problems arise across many disciplines, these algorithms have been applied successfully to classification and feature selection tasks, including bio-inspired approaches to diagnosing certain health problems. Sepsis continues to pose a significant threat to patient survival, particularly among individuals admitted to intensive care units from emergency departments. Traditional scoring systems, including qSOFA, SIRS, and NEWS, often fall short of delivering the precision necessary for timely and effective clinical decision-making. Methods: In this study, we introduce a novel, interpretable machine learning framework designed to predict in-hospital mortality in sepsis patients upon intensive care unit admission. Utilizing a retrospective dataset from a tertiary university hospital encompassing patient records from January 2019 to June 2024, we extracted comprehensive clinical and laboratory features. To address class imbalance and missing data, we employed the Synthetic Minority Oversampling Technique and systematic imputation methods, respectively. Our hybrid modeling approach integrates ensemble-based ML algorithms with deep learning architectures, optimized through the Red Piranha Optimization algorithm for feature selection and hyperparameter tuning. The proposed model was validated through internal cross-validation and external testing on the MIMIC-III dataset.
Results: The proposed model demonstrates superior predictive performance over conventional scoring systems, achieving an area under the receiver operating characteristic curve of 0.96, a Brier score of 0.118, and a recall of 81. Conclusions: These results underscore the potential of AI-driven tools to enhance clinical decision-making processes in sepsis management, enabling early interventions and potentially reducing mortality rates. Full article

(This article belongs to the Section Molecular and Translational Medicine)
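The Brier score quoted in the results above measures how close predicted probabilities are to the observed outcomes (0 is perfect, lower is better). The following is a minimal sketch with hypothetical mortality predictions; none of the numbers come from the study.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    if len(probabilities) != len(outcomes):
        raise ValueError("inputs must have the same length")
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical predicted mortality risks and observed outcomes (1 = died).
predicted = [0.9, 0.2, 0.7, 0.1]
observed = [1, 0, 0, 0]
print(brier_score(predicted, observed))

# An uninformative model that always says 0.5 scores 0.25.
print(brier_score([0.5] * 4, observed))
```

Unlike AUROC, which only depends on how patients are ranked, the Brier score also penalizes poorly calibrated probabilities, so the two metrics complement each other.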

29 pages, 3354 KiB

Open Access Article

Enhancing Heart Attack Prediction: Feature Identification from Multiparametric Cardiac Data Using Explainable AI

by Muhammad Waqar, Muhammad Bilal Shahnawaz, Sajid Saleem, Hassan Dawood, Usman Muhammad and Hussain Dawood

Algorithms 2025, 18(6), 333; https://doi.org/10.3390/a18060333 - 2 Jun 2025

Viewed by 780

Abstract

Heart attack is a leading cause of mortality, necessitating timely and precise diagnosis to improve patient outcomes. However, timely diagnosis remains a challenge due to the complex and nonlinear relationships between clinical indicators. Machine learning (ML) and deep learning (DL) models have the potential to predict cardiac conditions by identifying complex patterns within data, but their “black-box” nature restricts interpretability, making it challenging for healthcare professionals to comprehend the reasoning behind predictions. This lack of interpretability limits their clinical trust and adoption. The proposed approach addresses this limitation by integrating predictive modeling with Explainable AI (XAI) to ensure both accuracy and transparency in clinical decision-making. The proposed study enhances heart attack prediction using the University of California, Irvine (UCI) dataset, which includes various heart analysis parameters collected through electrocardiogram (ECG) sensors, blood pressure monitors, and biochemical analyzers. Due to class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to enhance the representation of the minority class. After preprocessing, various ML algorithms were employed, among which Artificial Neural Networks (ANN) achieved the highest performance with 96.1% accuracy, 95.7% recall, and 95.7% F1-score. To enhance the interpretability of ANN, two XAI techniques, specifically SHapley Additive Explanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME), were utilized. This study incrementally benchmarks SMOTE, ANN, and XAI techniques such as SHAP and LIME on standardized cardiac datasets, emphasizing clinical interpretability and providing a reproducible framework for practical healthcare implementation. These techniques enable healthcare practitioners to understand the model’s decisions, identify key predictive features, and enhance clinical judgment. 
By bridging the gap between AI-driven performance and practical medical implementation, this work contributes to making heart attack prediction both highly accurate and interpretable, facilitating its adoption in real-world clinical settings. Full article

(This article belongs to the Section Algorithms for Multidisciplinary Applications)

24 pages, 2518 KiB

Open Access Article

Enhanced Multi-Model Machine Learning-Based Dementia Detection Using a Data Enrichment Framework: Leveraging the Blessing of Dimensionality

by Khomkrit Yongcharoenchaiyasit, Sujitra Arwatchananukul, Georgi Hristov and Punnarumol Temdee

Bioengineering 2025, 12(6), 592; https://doi.org/10.3390/bioengineering12060592 - 30 May 2025

Viewed by 595

Abstract

The early diagnosis of dementia, a progressive condition impairing memory, cognition, and functional ability in older adults, is essential for timely intervention and improved patient outcomes. This study proposes a novel multiclass classification that differentiates dementia from other comorbid conditions, specifically cardiovascular diseases, including heart failure and aortic valve disorder, by leveraging the “blessing of dimensionality” to enhance predictive performance while ensuring feature accessibility. Using a dataset of 26,474 electronic health records from two hospitals in Chiang Rai, Thailand, the proposed framework introduced clinically informed feature augmentation to enhance model generalizability. Furthermore, the borderline synthetic minority oversampling technique was employed to address class imbalance, enhancing the model’s performance for minority classes. This study systematically evaluated a suite of machine learning models, including extreme gradient boosting, gradient boosting, random forest, support vector machine, decision trees, k-nearest neighbors, extra trees, and TabNet, across both the original and enriched datasets, with the latter integrating augmented features and synthetic data. Predictive performance was assessed using accuracy, precision, recall, F1 score, area under the receiver operating characteristic curve, and area under the precision–recall curve. The results revealed that all the models exhibited consistent performance improvements with the enriched dataset, affirming the value of dimensionality when guided by domain expertise. Full article

(This article belongs to the Section Biosignal Processing)

24 pages, 3487 KiB

Open Access Article

A Convolutional Mixer-Based Deep Learning Network for Alzheimer’s Disease Classification from Structural Magnetic Resonance Imaging

by M. Krithika Alias Anbu Devi and K. Suganthi

Diagnostics 2025, 15(11), 1318; https://doi.org/10.3390/diagnostics15111318 - 23 May 2025

Viewed by 543

Abstract

Objective: Alzheimer’s disease (AD) is a neurodegenerative disorder that severely impairs cognitive function across age groups ranging from the early to the late sixties. It progresses from mild to severe stages, so an accurate diagnostic tool is necessary for effective intervention and treatment planning. Methods: This work proposes a novel AD classification architecture that integrates depthwise separable convolutional layers with traditional convolutional layers to efficiently extract features from structural magnetic resonance imaging (sMRI) scans. The model combines strong feature extraction with lightweight operation, reducing the number of parameters without compromising accuracy. It is trained from scratch with optimized weight initialization, resulting in faster convergence and improved generalization. Class imbalance is a major challenge in medical imaging datasets and often results in biased models that generalize poorly to underrepresented disease stages. A hybrid sampling approach combining SMOTE (synthetic minority oversampling technique) with ENN (edited nearest neighbors) handles the class imbalance inherent in the datasets. An explainable activation space occlusion sensitivity map (ASOP) pixel attribution method is employed to highlight the critical regions of input images that influence the classification decisions across different stages of AD. Results and Conclusions: The proposed model outperformed several state-of-the-art transfer learning architectures, including VGG19, DenseNet201, EfficientNetV2S, MobileNet, ResNet152, InceptionV3, and Xception. It achieves noteworthy results in disease stage classification, with an accuracy of 98.87%, an F1 score of 98.86%, a precision of 98.80%, and recall of 98.69%. These results demonstrate the effectiveness of the proposed model for classifying stages of AD progression. Full article

(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
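In a SMOTE plus ENN pipeline, the ENN step cleans the oversampled data by dropping samples whose label disagrees with most of their nearest neighbours. The following is a simplified sketch of that cleaning rule only, assuming Euclidean distance, `k=3`, and removal from every class; the data are hypothetical 2-D points rather than image features.

```python
from collections import Counter

def edited_nearest_neighbours(X, y, k=3):
    """Return indices of samples whose label matches the majority label
    of their k nearest neighbours (the sample itself is excluded)."""
    keep = []
    for i, xi in enumerate(X):
        # Other sample indices, ordered by squared distance to sample i.
        order = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(xi, X[j])),
        )
        votes = Counter(y[j] for j in order[:k])
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return keep

# Two clusters plus one class-1 point sitting inside the class-0 cluster.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
     (5, 5), (5, 6), (6, 5), (6, 6)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]
kept = edited_nearest_neighbours(X, y)
print(kept)  # [0, 1, 2, 3, 5, 6, 7, 8]
```

The point at index 4 is removed because all three of its nearest neighbours belong to class 0, which is the kind of overlapping sample that interpolation-based oversampling can introduce.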

16 pages, 1447 KiB

Open Access Article

Noise Suppressed Image Reconstruction for Quanta Image Sensors Based on Transformer Neural Networks

by Guanjie Wang and Zhiyuan Gao

J. Imaging 2025, 11(5), 160; https://doi.org/10.3390/jimaging11050160 - 17 May 2025

Viewed by 518

Abstract

The photon detection capability of quanta image sensors (QIS) makes them an optimal choice for low-light imaging. To address the Poisson noise in QIS reconstruction caused by their spatio-temporal oversampling characteristics, a deep learning-based noise suppression reconstruction method is proposed in this paper. The proposed neural network integrates convolutional neural networks and Transformers. Its architecture combines the Anscombe transformation with serial and parallel modules to enhance denoising performance and adaptability across various scenarios. Experimental results demonstrate that the proposed method effectively suppresses noise in QIS image reconstruction. Compared with representative methods such as TD-BM3D, QIS-Net and DPIR, our approach achieves up to 1.2 dB improvement in PSNR, demonstrating superior reconstruction quality. Full article

(This article belongs to the Section Image and Video Processing)
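The Anscombe transformation mentioned in the abstract maps Poisson counts to values with roughly unit variance, so that a denoiser built for Gaussian-like noise can be applied. The following is a minimal sketch, assuming the plain algebraic inverse and a simple Knuth-style Poisson sampler written here only for the demonstration; it says nothing about how the paper's network uses the transform.

```python
import math
import random

def anscombe(count):
    """Variance-stabilizing transform for Poisson-distributed counts."""
    return 2.0 * math.sqrt(count + 3.0 / 8.0)

def inverse_anscombe(value):
    """Plain algebraic inverse (it is biased at very low counts)."""
    return (value / 2.0) ** 2 - 3.0 / 8.0

def poisson_sample(rng, lam):
    """Knuth's multiplication method; adequate for moderate lam."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= limit:
            return k - 1

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

rng = random.Random(0)
counts = [poisson_sample(rng, 20.0) for _ in range(20000)]
print(variance(counts))                         # close to the mean, 20
print(variance([anscombe(c) for c in counts]))  # close to 1
```

After the transform the noise level no longer depends on the brightness, which is the property that makes it a convenient preprocessing step for photon-counting data.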

20 pages, 1198 KiB

Open Access Article

Mitigating Class Imbalance in Network Intrusion Detection with Feature-Regularized GANs

by Jing Li, Wei Zong, Yang-Wai Chow and Willy Susilo

Future Internet 2025, 17(5), 216; https://doi.org/10.3390/fi17050216 - 13 May 2025

Viewed by 523

Abstract

Network Intrusion Detection Systems (NIDS) often suffer from severe class imbalance, where minority attack types are underrepresented, leading to degraded detection performance. To address this challenge, we propose a novel augmentation framework that integrates Soft Nearest Neighbor Loss (SNNL) into Generative Adversarial Networks (GANs), including WGAN, CWGAN, and WGAN-GP. Unlike traditional oversampling methods (e.g., SMOTE, ADASYN), our approach improves feature-space alignment between real and synthetic samples, enhancing classifier generalization on rare classes. Experiments on NSL-KDD, CSE-CIC-IDS2017, and CSE-CIC-IDS2018 show that SNNL-augmented GANs consistently improve minority-class F1-scores without degrading overall accuracy or majority-class performance. UMAP visualizations confirm that SNNL produces more compact and class-consistent sample distributions. We also evaluate the computational overhead, finding the added cost moderate. These results demonstrate the effectiveness and practicality of SNNL as a general enhancement for GAN-based data augmentation in imbalanced NIDS tasks. Full article

(This article belongs to the Special Issue Intrusion Detection and Resiliency in Cyber-Physical Systems and Networks)
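Soft Nearest Neighbor Loss measures how entangled classes are in a feature space: for each point it compares the distance-weighted mass of same-class neighbours with the mass of all neighbours. The following pure-Python sketch follows the usual definition of the loss; the temperature value, the small `eps` stabilizer, and the toy points are assumptions, and how the paper wires the loss into WGAN training is not shown.

```python
import math

def soft_nearest_neighbor_loss(X, y, temperature=1.0, eps=1e-12):
    """Average over points of -log(same-class weight / total weight),
    where each weight is exp(-squared distance / temperature)."""
    total = 0.0
    for i in range(len(X)):
        same_class, everyone = 0.0, 0.0
        for j in range(len(X)):
            if i == j:
                continue
            d2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            weight = math.exp(-d2 / temperature)
            everyone += weight
            if y[j] == y[i]:
                same_class += weight
        # eps guards against log(0) when a point has no close same-class peer.
        total += -math.log((same_class + eps) / (everyone + eps))
    return total / len(X)

points = [(0.0, 0.0), (0.0, 0.1), (3.0, 3.0), (3.0, 3.1)]
compact = soft_nearest_neighbor_loss(points, [0, 0, 1, 1])    # classes separated
entangled = soft_nearest_neighbor_loss(points, [0, 1, 0, 1])  # classes mixed
print(compact < entangled)  # True
```

A low value means each point's close neighbours share its label, which matches the compact, class-consistent sample distributions the abstract describes.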

27 pages, 6543 KiB

Open Access Article

Driver Injury Prediction and Factor Analysis in Passenger Vehicle-to-Passenger Vehicle Collision Accidents Using Explainable Machine Learning

by Peng Liu, Weiwei Zhang, Xuncheng Wu, Wenfeng Guo and Wangpengfei Yu

Vehicles 2025, 7(2), 42; https://doi.org/10.3390/vehicles7020042 - 3 May 2025

Viewed by 574

Abstract

Vehicle accidents, particularly PV-PV collisions, result in significant property damage and driver injuries, causing substantial economic losses and health risks. Most existing studies focus on macro-level predictions, such as accident frequency, but lack detailed collision-level analysis, which limits the precision of severity prediction. This study investigates various accident-related factors, including environmental conditions, vehicle attributes, driver characteristics, pre-crash scenarios, and collision dynamics. Data from NHTSA’s CRSS and FARS datasets were integrated and balanced using random over-sampling and under-sampling techniques to address severity-level data imbalances. The mRMR algorithm was employed for feature selection to minimize redundancy and identify key features. Five advanced machine learning models were evaluated for severity prediction, with XGBoost achieving the best performance: 84.9% accuracy, 84.85% precision, 84.90% recall, and an F1-score of 84.87%. SHAP analysis was utilized to interpret the model and conduct a comprehensive analysis of accident features, including their importance, dependencies, and combined effects on severity prediction. This study achieved high accuracy in predicting accident severity across all levels in PV-PV collisions. Moreover, by integrating the SHAP model interpretation method, we conducted detailed feature analysis at global, local, and individual case levels, thereby filling the gap in PV-PV accident severity prediction and feature analysis. Full article

(This article belongs to the Special Issue Novel Solutions for Transportation Safety)
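Combined random over-sampling and under-sampling, as described in the abstract, can be sketched as resampling every severity class to a common size. The function name, the `target` parameter, and the class labels below are illustrative assumptions; the actual class sizes in CRSS and FARS are not given in the abstract.

```python
import random
from collections import Counter, defaultdict

def resample_to_target(X, y, target, seed=0):
    """Under-sample classes larger than `target` (without replacement) and
    over-sample smaller ones (keeping all originals, adding random repeats)."""
    rng = random.Random(seed)
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    X_out, y_out = [], []
    for label, rows in rows_by_class.items():
        if len(rows) >= target:
            chosen = rng.sample(rows, target)
        else:
            chosen = rows + rng.choices(rows, k=target - len(rows))
        X_out.extend(chosen)
        y_out.extend([label] * target)
    return X_out, y_out

# Hypothetical crash records (row ids) with three injury-severity levels.
X = list(range(100))
y = ["none"] * 80 + ["minor"] * 15 + ["severe"] * 5
X_bal, y_bal = resample_to_target(X, y, target=20)
print(Counter(y_bal))
```

Random repeats add no new information about rare severe cases, so duplicated rows should stay out of any held-out evaluation split.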

22 pages, 3438 KiB

Open Access Article

A High-Accuracy Advanced Persistent Threat Detection Model: Integrating Convolutional Neural Networks with Kepler-Optimized Bidirectional Gated Recurrent Units

by Guangwu Hu, Maoqi Sun and Chaoqin Zhang

Electronics 2025, 14(9), 1772; https://doi.org/10.3390/electronics14091772 - 27 Apr 2025

Viewed by 754

Abstract

Advanced Persistent Threat (APT) refers to a highly targeted, sophisticated, and prolonged form of cyberattack, typically directed at specific organizations or individuals. The primary objective of such attacks is the theft of sensitive information or the disruption of critical operations. APT attacks are characterized by their stealth and complexity, often resulting in significant economic losses. Furthermore, these attacks may lead to intelligence breaches and operational interruptions, and may even jeopardize national security and political stability. Given the covert nature and extended duration of APT attacks, current detection solutions suffer from high detection difficulty and insufficient accuracy. To address these limitations, this paper proposes a high-accuracy APT attack detection model, CNN-KOA-BiGRU, which integrates Convolutional Neural Networks (CNN), Bidirectional Gated Recurrent Units (BiGRU), and the Kepler optimization algorithm (KOA). The model first uses a CNN to extract spatial features from network traffic data, then applies a BiGRU to capture temporal dependencies and long-term memory, thereby forming comprehensive temporal features. Simultaneously, KOA is employed to optimize the BiGRU network structure, achieving globally optimal feature weights and enhancing detection accuracy. Additionally, this study combines two sampling techniques, the Synthetic Minority Over-sampling Technique (SMOTE) and Tomek links, to mitigate classification bias caused by dataset imbalance. Evaluation results on the CSE-CIC-IDS2018 dataset demonstrate that the CNN-KOA-BiGRU model achieves superior performance in detecting APT attacks, with an average accuracy of 98.68%. This surpasses existing methods on the same dataset, including CNN (93.01%), CNN-BiGRU (97.77%), and Graph Convolutional Network (GCN) (95.96%). Specifically, the proposed model improves accuracy by 5.67 percentage points over CNN, 0.91 over CNN-BiGRU, and 2.72 over GCN, an average improvement of 3.1 percentage points over these methods. Full article

(This article belongs to the Special Issue Advanced Technologies in Edge Computing and Applications)
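
The abstract above pairs SMOTE with Tomek links for rebalancing. As a rough illustration of the Tomek-link half only, the sketch below finds pairs of points that are each other's nearest neighbour but carry different labels; such pairs are the candidates a cleaning step would remove. It is a minimal pure-Python sketch under stated assumptions, not the authors' code: the function names `nearest` and `tomek_links` and the toy data are hypothetical, and distances are plain Euclidean.

```python
import math

def nearest(idx, points):
    """Index of the point closest to points[idx], excluding itself (Euclidean)."""
    return min(
        (j for j in range(len(points)) if j != idx),
        key=lambda j: math.dist(points[idx], points[j]),
    )

def tomek_links(points, labels):
    """Return pairs (i, j), i < j, that are mutual nearest neighbours with different labels."""
    links = []
    for i in range(len(points)):
        j = nearest(i, points)
        # A Tomek link needs opposite classes and mutual nearest-neighbour status.
        if i < j and labels[i] != labels[j] and nearest(j, points) == i:
            links.append((i, j))
    return links

# Hypothetical toy data: only the first two points form a Tomek link.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0), (9.0, 9.0)]
labels = [0, 1, 0, 0, 1]
links = tomek_links(points, labels)
```

In a combined scheme, oversampling would typically run first and the linked points would then be dropped to clean the class boundary; whether one or both points of each pair are removed is a design choice the abstract does not specify.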

27 pages, 3764 KiB

Open AccessArticle

Effective Epileptic Seizure Detection with Hybrid Feature Selection and SMOTE-Based Data Balancing Using SVM Classifier

by Hany F. Atlam, Gbenga Ebenezer Aderibigbe and Muhammad Shahroz Nadeem

Appl. Sci. 2025, 15(9), 4690; https://doi.org/10.3390/app15094690 - 23 Apr 2025

Cited by 2 | Viewed by 943

Abstract

Epileptic seizures, a leading cause of global morbidity and mortality, pose significant challenges in timely diagnosis and management. Epilepsy, a chronic neurological disorder characterized by recurrent and unpredictable seizures, affects over 70 million people worldwide, according to the World Health Organization (WHO). Despite significant advances in medical science, accurate and timely diagnosis of epileptic seizures remains a challenge, with misdiagnosis rates reported to be as high as 30%. The consequences of misdiagnosis or delayed diagnosis can be severe, leading to increased morbidity, mortality, and reduced quality of life for patients. Therefore, this paper presents a novel approach to enhancing epileptic seizure detection that integrates the Synthetic Minority Over-Sampling Technique (SMOTE) for data balancing with a hybrid feature selection technique combining Principal Component Analysis (PCA) and Discrete Wavelet Transform (DWT). The proposed model aims to improve the accuracy and reliability of seizure detection systems by addressing data imbalance and extracting discriminative features from electroencephalogram (EEG) signals. Experimental results demonstrate substantial performance gains, with the Support Vector Machine (SVM) classifier achieving 97.30% accuracy, 99.62% Area Under the Curve (AUC), and a 93.08% F1 score, outperforming the results of existing studies in the literature. The results demonstrate the effectiveness of the proposed model in advancing seizure detection systems and its potential to improve diagnostic capabilities and patient outcomes. Full article

(This article belongs to the Special Issue Smart Healthcare: Techniques, Applications and Prospects)
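
For readers unfamiliar with the SMOTE step mentioned in the abstract above, the core idea is to create a synthetic minority sample by interpolating between a minority point and one of its nearest minority-class neighbours. The following is a minimal pure-Python sketch of that interpolation, assuming Euclidean distance; the function name `smote_sample`, the parameter `k`, and the toy points are hypothetical and not taken from the paper.

```python
import math
import random

def smote_sample(minority, k=2, rng=None):
    """Create one synthetic point on the segment between a minority point and a near neighbour."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    i = rng.randrange(len(minority))
    x = minority[i]
    # k nearest minority-class neighbours of x, excluding x itself.
    neighbours = sorted(
        (p for j, p in enumerate(minority) if j != i),
        key=lambda p: math.dist(x, p),
    )[:k]
    nb = rng.choice(neighbours)
    u = rng.random()  # interpolation factor in [0, 1)
    return tuple(a + u * (b - a) for a, b in zip(x, nb))

# Hypothetical minority-class feature vectors.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
synthetic = smote_sample(minority, k=2, rng=random.Random(42))
```

Because each synthetic point lies on a segment joining two existing minority points, it stays inside the region the minority class already occupies, which is why oversampling of this kind is usually applied to the training split only.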

14 pages, 734 KiB

Open AccessArticle

MWMOTE-FRIS-INFFC: An Improved Majority Weighted Minority Oversampling Technique for Solving Noisy and Imbalanced Classification Datasets

by Dong Zhang, Xiang Huang, Gen Li, Shengjie Kong and Liang Dong

Appl. Sci. 2025, 15(9), 4670; https://doi.org/10.3390/app15094670 - 23 Apr 2025

Viewed by 450

Abstract

Industrial data from fault diagnosis and good-product testing are often highly noisy and imbalanced, and such samples are very difficult to analyze. Oversampling has proved to be a simple remedy for imbalanced data, but it offers no significant resistance to noise. To solve the binary classification problem for high-noise imbalanced data, this study introduces an enhanced majority-weighted minority oversampling technique, MWMOTE-FRIS-INFFC, designed specifically for noisy, imbalanced classification data sets. The method uses Euclidean distance to assign sample weights and synthesizes new samples from the minority-class samples with larger weights, thereby addressing data scarcity in smaller class clusters. The fuzzy rough instance selection (FRIS) method is then used to eliminate synthetic minority samples with low clustering membership, which effectively reduces the overfitting tendency caused by synthetic oversampling. In addition, integrating the iterative noise filter based on the fusion of classifiers (INFFC) helps mitigate noise in both the raw and the synthetic data. On this basis, a series of experiments is designed in which the proposed MWMOTE-FRIS-INFFC algorithm is applied to improve the performance of 6 oversampling algorithms on 8 data sets. Full article

(This article belongs to the Special Issue Fuzzy Control Systems: Latest Advances and Prospects)
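
The abstract above says that Euclidean distance is used to weight samples before synthesis but does not give the formula. The sketch below shows one simple, assumed scheme, inverse distance to the nearest majority point, so that minority samples near the class boundary receive larger weights. It is a deliberately simplified stand-in and not the MWMOTE weighting itself; the function name `closeness_weights` and the toy data are hypothetical.

```python
import math

def closeness_weights(minority, majority, eps=1e-9):
    """Normalized weights: minority points nearer to the majority class get larger weight."""
    raw = []
    for p in minority:
        d = min(math.dist(p, q) for q in majority)  # distance to nearest majority point
        raw.append(1.0 / (d + eps))                 # eps guards against division by zero
    total = sum(raw)
    return [w / total for w in raw]

# Hypothetical toy data: the first minority point is 1 unit from the majority point,
# the second is 4 units away, so the first receives four times the weight.
minority = [(0.0, 0.0), (5.0, 0.0)]
majority = [(1.0, 0.0)]
weights = closeness_weights(minority, majority)
```

Weights of this kind could then serve as sampling probabilities when choosing which minority points seed new synthetic samples.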

Search Results (144)
