Search Results (1,082)

Search Parameters:
Keywords = imbalance data learning

22 pages, 2952 KiB  
Article
Raw-Data Driven Functional Data Analysis with Multi-Adaptive Functional Neural Networks for Ergonomic Risk Classification Using Facial and Bio-Signal Time-Series Data
by Suyeon Kim, Afrooz Shakeri, Seyed Shayan Darabi, Eunsik Kim and Kyongwon Kim
Sensors 2025, 25(15), 4566; https://doi.org/10.3390/s25154566 - 23 Jul 2025
Abstract
Ergonomic risk classification during manual lifting tasks is crucial for the prevention of workplace injuries. This study addresses the challenge of classifying lifting task risk levels (low, medium, and high risk, labeled as 0, 1, and 2) using multi-modal time-series data comprising raw facial landmarks and bio-signals (electrocardiography [ECG] and electrodermal activity [EDA]). Classifying such data presents inherent challenges due to multi-source information, temporal dynamics, and class imbalance. To overcome these challenges, this paper proposes a Multi-Adaptive Functional Neural Network (Multi-AdaFNN), a novel method that integrates functional data analysis with deep learning techniques. The proposed model introduces a novel adaptive basis layer composed of micro-networks tailored to each individual time-series feature, enabling end-to-end learning of discriminative temporal patterns directly from raw data. The Multi-AdaFNN approach was evaluated across five distinct dataset configurations: (1) facial landmarks only, (2) bio-signals only, (3) full fusion of all available features, (4) a reduced-dimensionality set of 12 selected facial landmark trajectories, and (5) the same reduced set combined with bio-signals. Performance was rigorously assessed using 100 independent stratified splits (70% training and 30% testing) and optimized via a weighted cross-entropy loss function to manage class imbalance effectively. The results demonstrated that the integrated approach, fusing facial landmarks and bio-signals, achieved the highest classification accuracy and robustness. Furthermore, the adaptive basis functions revealed specific phases within lifting tasks critical for risk prediction. These findings underscore the efficacy and transparency of the Multi-AdaFNN framework for multi-modal ergonomic risk assessment, highlighting its potential for real-time monitoring and proactive injury prevention in industrial environments.
(This article belongs to the Special Issue (Bio)sensors for Physiological Monitoring)
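
The weighted cross-entropy loss the abstract describes is a standard remedy for class imbalance. A minimal PyTorch sketch, assuming inverse-frequency class weights (the class counts below are illustrative placeholders, not the paper's):

```python
import torch
import torch.nn as nn

# Hypothetical per-class sample counts for risk levels 0/1/2.
class_counts = torch.tensor([520.0, 310.0, 95.0])
weights = class_counts.sum() / (len(class_counts) * class_counts)

criterion = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 3)          # a batch of 8 predictions over 3 classes
labels = torch.randint(0, 3, (8,))  # ground-truth risk levels
loss = criterion(logits, labels)    # minority-class errors are up-weighted
```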

25 pages, 1047 KiB  
Article
Integrated Blockchain and Federated Learning for Robust Security in Internet of Vehicles Networks
by Zhikai He, Rui Xu, Binyu Wang, Qisong Meng, Qiang Tang, Li Shen, Zhen Tian and Jianyu Duan
Symmetry 2025, 17(7), 1168; https://doi.org/10.3390/sym17071168 - 21 Jul 2025
Abstract
The Internet of Vehicles (IoV) operates in an environment characterized by asymmetric security threats, where centralized vulnerabilities create a critical imbalance that can be disproportionately exploited by attackers. This study addresses this imbalance by proposing a symmetrical security framework that integrates Blockchain and Federated Learning (FL) to restore equilibrium in the Vehicle–Road–Cloud ecosystem. The evolution toward sixth-generation (6G) technologies amplifies both the potential of vehicle-to-everything (V2X) communications and its inherent security risks. The proposed framework achieves a delicate balance between robust security and operational efficiency. By leveraging blockchain’s symmetrical and decentralized distribution of trust, the framework ensures data and model integrity. Concurrently, the privacy-preserving approach of FL balances the need for collaborative intelligence with the imperative of safeguarding sensitive vehicle data. A novel Cloud Proxy Re-Encryption Offloading (CPRE-IoV) algorithm is introduced to facilitate efficient model updates. The architecture employs a partitioned blockchain and a smart contract-driven FL pipeline to symmetrically neutralize threats from malicious nodes. Finally, extensive simulations validate the framework’s effectiveness in establishing a resilient and symmetrically secure foundation for next-generation IoV networks.
(This article belongs to the Section Computer)
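
The paper's CPRE-IoV algorithm is its own contribution; as a generic illustration of the federated-learning side only, a plain FedAvg-style aggregation step (not the authors' method) can be sketched as:

```python
import copy
import torch.nn as nn

def fedavg(global_model: nn.Module, client_states: list, client_sizes: list):
    """Average client weights, weighted by each client's local dataset size."""
    total = float(sum(client_sizes))
    avg_state = copy.deepcopy(client_states[0])
    for key in avg_state:
        avg_state[key] = sum(
            state[key].float() * (n / total)
            for state, n in zip(client_states, client_sizes)
        )
    global_model.load_state_dict(avg_state)
    return global_model

# Three simulated vehicles contribute local copies of a tiny model.
model = nn.Linear(10, 2)
clients = [copy.deepcopy(model).state_dict() for _ in range(3)]
fedavg(model, clients, client_sizes=[120, 80, 200])
```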

29 pages, 5526 KiB  
Article
Dynamic Machine Learning-Based Simulation for Preemptive Supply-Demand Balancing Amid EV Charging Growth in the Jamali Grid 2025–2060
by Joshua Veli Tampubolon, Rinaldy Dalimi and Budi Sudiarto
World Electr. Veh. J. 2025, 16(7), 408; https://doi.org/10.3390/wevj16070408 - 21 Jul 2025
Abstract
The rapid uptake of electric vehicles (EVs) in the Jawa–Madura–Bali (Jamali) grid produces highly variable charging demands that threaten the supply–demand balance. To forestall instability, we developed a predictive simulation based on long short-term memory (LSTM) networks that combines historical generation and consumption patterns with models of EV population growth and initial charging time (ICT). We introduce a novel supply–demand balance score to quantify weekly and annual deviations between projected supply and demand curves, then use this metric to guide the machine-learning model in optimizing the annual growth rate (AGR) and preventing supply–demand imbalance. Relative to a business-as-usual baseline, our approach improves balance scores by 64% and projects up to a 59% reduction in charging load by 2060. These results demonstrate the promise of data-driven demand-management strategies for maintaining grid reliability during large-scale EV integration.
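
As a rough sketch of the LSTM forecasting component: the synthetic series, window length, and layer sizes below are assumptions for illustration; the paper's actual inputs and architecture are not reproduced here.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy stand-in for an hourly load series.
series = np.sin(np.linspace(0, 100, 2000)) + 0.1 * np.random.randn(2000)
window = 24  # look back one day to predict the next hour

X = np.stack([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]
X = X[..., None]  # shape: (samples, timesteps, features)

model = keras.Sequential([
    layers.LSTM(64, input_shape=(window, 1)),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```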

15 pages, 4874 KiB  
Article
A Novel 3D Convolutional Neural Network-Based Deep Learning Model for Spatiotemporal Feature Mapping for Video Analysis: Feasibility Study for Gastrointestinal Endoscopic Video Classification
by Mrinal Kanti Dhar, Mou Deb, Poonguzhali Elangovan, Keerthy Gopalakrishnan, Divyanshi Sood, Avneet Kaur, Charmy Parikh, Swetha Rapolu, Gianeshwaree Alias Rachna Panjwani, Rabiah Aslam Ansari, Naghmeh Asadimanesh, Shiva Sankari Karuppiah, Scott A. Helgeson, Venkata S. Akshintala and Shivaram P. Arunachalam
J. Imaging 2025, 11(7), 243; https://doi.org/10.3390/jimaging11070243 - 18 Jul 2025
Abstract
Accurate analysis of medical videos remains a major challenge in deep learning (DL) due to the need for effective spatiotemporal feature mapping that captures both spatial detail and temporal dynamics. Despite advances in DL, most existing models in medical AI focus on static images, overlooking critical temporal cues present in video data. To bridge this gap, a novel DL-based framework is proposed for spatiotemporal feature extraction from medical video sequences. As a feasibility use case, this study focuses on gastrointestinal (GI) endoscopic video classification. A 3D convolutional neural network (CNN) is developed to classify upper and lower GI endoscopic videos using the hyperKvasir dataset, which contains 314 lower and 60 upper GI videos. To address data imbalance, 60 matched pairs of videos are randomly selected across 20 experimental runs. Videos are resized to 224 × 224, and the 3D CNN captures spatiotemporal information. A 3D version of the parallel spatial and channel squeeze-and-excitation (P-scSE) is implemented, and a new block called the residual with parallel attention (RPA) block is proposed by combining P-scSE3D with a residual block. To reduce computational complexity, a (2 + 1)D convolution is used in place of full 3D convolution. The model achieves an average accuracy of 0.933, precision of 0.932, recall of 0.944, F1-score of 0.935, and AUC of 0.933. It is also observed that the integration of P-scSE3D increased the F1-score by 7%. This preliminary work opens avenues for exploring various GI endoscopic video-based prospective studies.
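
The (2 + 1)D factorization mentioned above replaces one full 3D convolution with a spatial convolution followed by a temporal one, as in R(2+1)D networks. A minimal PyTorch sketch (channel sizes and clip dimensions are illustrative):

```python
import torch
import torch.nn as nn

class Conv2Plus1D(nn.Module):
    """Factorize a 3x3x3 convolution into a 1x3x3 spatial conv
    followed by a 3x1x1 temporal conv."""
    def __init__(self, in_ch, mid_ch, out_ch):
        super().__init__()
        self.spatial = nn.Conv3d(in_ch, mid_ch, kernel_size=(1, 3, 3), padding=(0, 1, 1))
        self.relu = nn.ReLU(inplace=True)
        self.temporal = nn.Conv3d(mid_ch, out_ch, kernel_size=(3, 1, 1), padding=(1, 0, 0))

    def forward(self, x):  # x: (batch, channels, frames, height, width)
        return self.temporal(self.relu(self.spatial(x)))

clip = torch.randn(2, 3, 16, 112, 112)  # two 16-frame RGB clips
out = Conv2Plus1D(3, 45, 64)(clip)      # spatiotemporal features
```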

18 pages, 265 KiB  
Article
AI in Biodiversity Education: The Bias in Endangered Species Information and Its Implications
by Luis de Pedro Noriega, Javier Bobo-Pinilla, Jaime Delgado-Iglesias, Roberto Reinoso-Tapia, Ana María Gallego and Susana Quirós-Alpera
Sustainability 2025, 17(14), 6554; https://doi.org/10.3390/su17146554 - 18 Jul 2025
Abstract
The use of AI-generated content in education is increasing significantly, but its reliability for teaching the natural sciences and, more specifically, biodiversity-related content remains understudied. The need to address this question is substantial, considering the relevance of biodiversity conservation to human sustainability and the recurrent presence of these topics in the educational curriculum, at least in Spain. The present article tests for biases in some of the most widely used AI tools (ChatGPT-4.5, DeepSeek-V3, Gemini) when asked a relevant and objective research question related to biodiversity. The results revealed both taxonomic and geographic biases in all the lists of endangered species provided by these tools when compared to IUCN Red List data. These imbalances may contribute to the perpetuation of plant blindness, zoocentrism, and Western centrism in classrooms, especially at levels where educators lack specialized training. In summary, the present study highlights the potentially harmful impact that AI’s cultural and social biases may have on biodiversity education and Sustainable Development Goals-aligned learning, and points to an urgent need for model refinement (using scientific datasets) and teacher AI literacy to mitigate misinformation.
(This article belongs to the Special Issue Sustainable Education in the Age of Artificial Intelligence (AI))
24 pages, 2667 KiB  
Article
Transformer-Driven Fault Detection in Self-Healing Networks: A Novel Attention-Based Framework for Adaptive Network Recovery
by Parul Dubey, Pushkar Dubey and Pitshou N. Bokoro
Mach. Learn. Knowl. Extr. 2025, 7(3), 67; https://doi.org/10.3390/make7030067 - 16 Jul 2025
Abstract
Fault detection and remaining useful life (RUL) prediction are critical tasks in self-healing network (SHN) environments and industrial cyber–physical systems. These domains demand intelligent systems capable of handling dynamic, high-dimensional sensor data. However, existing optimization-based approaches often struggle with imbalanced datasets, noisy signals, and delayed convergence, limiting their effectiveness in real-time applications. This study utilizes two benchmark datasets—EFCD and SFDD—which represent electrical and sensor fault scenarios, respectively. These datasets pose challenges due to class imbalance and complex temporal dependencies. To address this, we propose a novel hybrid framework combining Attention-Augmented Convolutional Neural Networks (AACNN) with transformer encoders, enhanced through Enhanced Ensemble-SMOTE for balancing the minority class. The model captures spatial features and long-range temporal patterns and learns effectively from imbalanced data streams. The novelty lies in the integration of attention mechanisms and adaptive oversampling in a unified fault-prediction architecture. Model evaluation is based on multiple performance metrics, including accuracy, F1-score, MCC, RMSE, and score*. The results show that the proposed model outperforms state-of-the-art approaches, achieving up to 97.14% accuracy and a score* of 0.419, with faster convergence and improved generalization across both datasets.
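
The paper's Enhanced Ensemble-SMOTE is a custom variant; for orientation, plain SMOTE from imbalanced-learn works as follows (synthetic data stands in for the EFCD/SFDD sets):

```python
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Synthetic stand-in for an imbalanced fault dataset.
X, y = make_classification(n_samples=1000, weights=[0.95, 0.05], random_state=0)
print(Counter(y))              # e.g. {0: 950, 1: 50}

# SMOTE interpolates new minority samples between nearest neighbors.
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print(Counter(y_res))          # minority class synthetically balanced
```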

24 pages, 3235 KiB  
Article
A Cost-Sensitive Small Vessel Detection Method for Maritime Remote Sensing Imagery
by Zhuhua Hu, Wei Wu, Ziqi Yang, Yaochi Zhao, Lewei Xu, Lingkai Kong, Yunpei Chen, Lihang Chen and Gaosheng Liu
Remote Sens. 2025, 17(14), 2471; https://doi.org/10.3390/rs17142471 - 16 Jul 2025
Abstract
Vessel detection technology based on marine remote sensing imagery is of great importance. However, it often faces challenges, such as small vessel targets, cloud occlusion, insufficient data volume, and severely imbalanced class distribution in datasets. These issues result in conventional models failing to meet the accuracy requirements for practical applications. In this paper, we first construct a novel remote sensing vessel image dataset that includes various complex scenarios and enhance the data volume and diversity through data augmentation techniques. Secondly, we address the class imbalance between foreground (small vessels) and background in remote sensing imagery from two perspectives: the sensitivity of IoU metrics to small object localization errors and the innovative design of a cost-sensitive loss function. Specifically, at the dataset level, we select vessel targets appearing in the original dataset as templates and randomly copy–paste several instances onto arbitrary positions. This enriches the diversity of target samples per image and mitigates the impact of data imbalance on the detection task. At the algorithm level, we introduce the Normalized Wasserstein Distance (NWD) to compute the similarity between bounding boxes. This enhances the importance of small target information during training and strengthens the model’s cost-sensitive learning capabilities. Ablation studies reveal that detection performance is optimal when the weight assigned to the NWD metric in the model’s loss function matches the overall proportion of small objects in the dataset. Comparative experiments show that the proposed NWD-YOLO achieves Precision, Recall, and AP50 scores of 0.967, 0.958, and 0.971, respectively, meeting the accuracy requirements of real-world applications.
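
The Normalized Wasserstein Distance treats each box as a 2D Gaussian and maps the Wasserstein distance to a (0, 1] similarity. A minimal sketch following the published NWD formulation (the constant c is dataset-dependent; 12.8 here is an arbitrary placeholder, and the paper's loss weighting scheme is not reproduced):

```python
import math

def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between boxes given as (cx, cy, w, h).
    Each box is modeled as a 2D Gaussian with covariance diag(w^2/4, h^2/4)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Squared 2-Wasserstein distance between the two Gaussians.
    w2_sq = (ax - bx) ** 2 + (ay - by) ** 2 + ((aw - bw) ** 2 + (ah - bh) ** 2) / 4
    return math.exp(-math.sqrt(w2_sq) / c)

# Two nearby small boxes still score high similarity, unlike IoU.
print(nwd((10, 10, 8, 8), (12, 11, 8, 6)))
```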

35 pages, 8048 KiB  
Article
Characterization and Automated Classification of Underwater Acoustic Environments in the Western Black Sea Using Machine Learning Techniques
by Maria Emanuela Mihailov
J. Mar. Sci. Eng. 2025, 13(7), 1352; https://doi.org/10.3390/jmse13071352 - 16 Jul 2025
Abstract
Growing concern over anthropogenic underwater noise, highlighted by initiatives like the Marine Strategy Framework Directive (MSFD) and its Technical Group on Underwater Noise (TG Noise), emphasizes regions like the Western Black Sea, where increasing activities threaten marine habitats. This region is experiencing rapid growth in maritime traffic and resource exploitation, which is intensifying concerns over the noise impacts on its unique marine habitats. While machine learning offers promising solutions, a research gap persists in comprehensively evaluating diverse ML models within an integrated framework for complex underwater acoustic data, particularly concerning real-world data limitations like class imbalance. This paper addresses this by presenting a multi-faceted framework using passive acoustic monitoring (PAM) data from fixed locations (50–100 m depth). Acoustic data are processed using advanced signal processing (broadband Sound Pressure Level (SPL), Power Spectral Density (PSD)) for feature extraction (Mel-spectrograms for deep learning; PSD statistical moments for classical/unsupervised ML). The framework evaluates Convolutional Neural Networks (CNNs), Random Forest, and Support Vector Machines (SVMs) for noise event classification, alongside Gaussian Mixture Models (GMMs) for anomaly detection. Our results demonstrate that the CNN achieved the highest classification accuracy of 0.9359, significantly outperforming Random Forest (0.8494) and SVM (0.8397) on the test dataset. These findings emphasize the capability of deep learning in automatically extracting discriminative features, highlighting its potential for enhanced automated underwater acoustic monitoring.
(This article belongs to the Section Ocean Engineering)
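
Mel-spectrogram features of the kind fed to the CNN can be computed with librosa; a minimal sketch with synthetic audio in place of a PAM recording (the FFT and mel parameters are assumptions, not the paper's settings):

```python
import numpy as np
import librosa

# One second of synthetic audio as a stand-in for a hydrophone recording.
sr = 16000
y = np.random.randn(sr).astype(np.float32)

mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024,
                                     hop_length=512, n_mels=64)
mel_db = librosa.power_to_db(mel, ref=np.max)  # log scale, typical CNN input
print(mel_db.shape)  # (64 mel bands, time frames)
```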

28 pages, 5813 KiB  
Article
YOLO-SW: A Real-Time Weed Detection Model for Soybean Fields Using Swin Transformer and RT-DETR
by Yizhou Shuai, Jingsha Shi, Yi Li, Shaohao Zhou, Lihua Zhang and Jiong Mu
Agronomy 2025, 15(7), 1712; https://doi.org/10.3390/agronomy15071712 - 16 Jul 2025
Abstract
Accurate weed detection in soybean fields is essential for enhancing crop yield and reducing herbicide usage. This study proposes YOLO-SW, an improved version of YOLOv8, to address the challenges of detecting weeds that are highly similar to the background in natural environments. The research stands out for its novel integration of three key advancements: the Swin Transformer backbone, which leverages local window self-attention to achieve linear O(N) computational complexity for efficient global context capture; the CARAFE dynamic upsampling operator, which enhances small target localization through context-aware kernel generation; and the RT-DETR encoder, which enables end-to-end detection via IoU-aware query selection, eliminating the need for complex post-processing. Additionally, a dataset of six common soybean weeds was expanded to 12,500 images through simulated fog, rain, and snow augmentation, effectively resolving data imbalance and boosting model robustness. The experimental results highlight both the technical superiority and practical relevance: YOLO-SW achieves 92.3% mAP@50 (3.8% higher than YOLOv8), with recognition accuracy and recall improvements of 4.2% and 3.9%, respectively. Critically, on the NVIDIA Jetson AGX Orin platform, it delivers a real-time inference speed of 59 FPS, making it suitable for seamless deployment on intelligent weeding robots. This low-power, high-precision solution not only bridges the gap between deep learning and precision agriculture but also enables targeted herbicide application, directly contributing to sustainable farming practices and environmental protection.
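
Simulated-weather expansion of this kind could be reproduced with standard augmentation libraries; a hedged sketch using albumentations (the paper's exact simulation pipeline is not described, and the probabilities below are placeholders):

```python
import numpy as np
import albumentations as A

# Apply one of fog, rain, or snow to most images, leaving some untouched.
augment = A.Compose([
    A.OneOf([
        A.RandomFog(p=1.0),
        A.RandomRain(p=1.0),
        A.RandomSnow(p=1.0),
    ], p=0.8),
])

image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)
augmented = augment(image=image)["image"]
```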

19 pages, 20865 KiB  
Article
Vegetation Baseline and Urbanization Development Level: Key Determinants of Long-Term Vegetation Greening in China’s Rapidly Urbanizing Region
by Ke Zeng, Mengyao Ci, Shuyi Zhang, Ziwen Jin, Hanxin Tang, Hongkai Zhu, Rui Zhang, Yue Wang, Yiwen Zhang and Min Liu
Remote Sens. 2025, 17(14), 2449; https://doi.org/10.3390/rs17142449 - 15 Jul 2025
Abstract
Urban vegetation shows significant spatial differences due to the combined effects of natural and human factors, yet fine-scale evolutionary patterns and their cross-scale feedback mechanisms remain poorly understood. This study focuses on the Yangtze River Delta (YRD), the top economic area in China. By integrating data from multiple Landsat sensors, we built a high-resolution framework to track vegetation dynamics from 1990 to 2020. It generates annual 30-m Enhanced Vegetation Index (EVI) data and uses a new Vegetation Green-Brown Balance Index (VBI) to measure changes between greening and browning. We combined Mann-Kendall trend analysis with machine-learning-based attribution analysis to examine vegetation changes across different city types and urban-rural gradients. Over 30 years, the YRD’s annual EVI increased by 0.015 per decade, with greening areas 3.07 times larger than browning areas. Spatially, urban centers show strong greening, while peri-urban areas experience remarkable browning. Vegetation changes showed a city-size effect: larger cities had higher browning proportions but stronger greening trends in their urban cores. Cluster analysis finds four main evolution types, revealing imbalances in grey-green infrastructure allocation. The 1990 vegetation baseline is the main factor driving the long-term trend of vegetation greenness, while socioeconomic and climate drivers have different impacts depending on city size and position on the urban-rural continuum. In areas with low urbanization levels, climate factors matter more than human factors. These multi-scale patterns challenge traditional urban greening ideas, highlighting the need for vegetation governance that adapts to specific spatial conditions and city-unique evolution paths.
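
The Mann-Kendall trend test used here reduces to a sign-count statistic over all pairs of observations; a minimal sketch (toy EVI values; significance testing via the variance of S is omitted):

```python
import numpy as np

def mann_kendall_s(series):
    """Mann-Kendall S statistic: positive for an increasing trend,
    negative for a decreasing one."""
    x = np.asarray(series, dtype=float)
    s = 0
    for i in range(len(x) - 1):
        s += np.sign(x[i + 1:] - x[i]).sum()
    return s

evi = [0.31, 0.32, 0.30, 0.34, 0.35, 0.37, 0.36, 0.40]  # toy annual EVI values
print(mann_kendall_s(evi))  # > 0 suggests a greening trend
```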

22 pages, 1906 KiB  
Article
Explainable and Optuna-Optimized Machine Learning for Battery Thermal Runaway Prediction Under Class Imbalance Conditions
by Abir El Abed, Ghalia Nassreddine, Obada Al-Khatib, Mohamad Nassereddine and Ali Hellany
Thermo 2025, 5(3), 23; https://doi.org/10.3390/thermo5030023 - 15 Jul 2025
Abstract
Modern energy storage systems for both power and transportation rely heavily on lithium-ion batteries (LIBs). However, their safety is threatened by a potentially hazardous failure mode known as thermal runaway (TR). Predicting and classifying the causes of TR can greatly enhance the safety of power and transportation systems. This paper presents an advanced machine learning method for forecasting and classifying the causes of TR. A generative model for synthetic data generation was used to handle class imbalance in the dataset. Hyperparameter optimization was conducted using Optuna for four classifiers: Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), tabular network (TabNet), and Extreme Gradient Boosting (XGBoost). A three-fold cross-validation approach was used to guarantee a robust evaluation. An open-source database of LIB failure events was used for model training and testing. The XGBoost model outperforms the other models across all TR categories, achieving 100% accuracy and a high recall (1.00). Model results were interpreted using SHapley Additive exPlanations analysis to investigate the most significant factors among TR predictors. The findings show that important TR indicators include energy adjusted for heat and weight loss, heater power, average cell temperature upon activation, and heater duration. These findings guide the design of safer battery systems and preventive monitoring systems for real applications. They can help experts develop more efficient battery management systems, thereby improving the performance and longevity of battery-operated devices. By enhancing predictive knowledge of temperature-driven failure mechanisms in LIBs, the study directly advances the thermal analysis and energy storage safety domains.
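
Optuna-driven tuning of one of the four classifiers (XGBoost) might be wired up as below; the search space, trial count, and synthetic data are illustrative, not the paper's configuration:

```python
import optuna
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, weights=[0.8, 0.2], random_state=0)

def objective(trial):
    params = {
        "max_depth": trial.suggest_int("max_depth", 2, 8),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "n_estimators": trial.suggest_int("n_estimators", 50, 400),
    }
    model = xgb.XGBClassifier(**params, eval_metric="logloss")
    # Three-fold cross-validation, mirroring the evaluation protocol above.
    return cross_val_score(model, X, y, cv=3, scoring="f1").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params)
```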

24 pages, 4383 KiB  
Article
Predicting Employee Attrition: XAI-Powered Models for Managerial Decision-Making
by İrem Tanyıldızı Baydili and Burak Tasci
Systems 2025, 13(7), 583; https://doi.org/10.3390/systems13070583 - 15 Jul 2025
Abstract
Background: Employee turnover poses a multi-faceted challenge to organizations by undermining productivity, morale, and financial stability while rendering recruitment, onboarding, and training investments wasteful. Traditional machine learning approaches often struggle with class imbalance and lack transparency, limiting actionable insights. This study introduces an Explainable AI (XAI) framework to achieve both high predictive accuracy and interpretability in turnover forecasting. Methods: Two publicly available HR datasets (IBM HR Analytics, Kaggle HR Analytics) were preprocessed with label encoding and MinMax scaling. Class imbalance was addressed via GAN-based synthetic data generation. A three-layer Transformer encoder performed binary classification, and SHapley Additive exPlanations (SHAP) analysis provided both global and local feature attributions. Model performance was evaluated using accuracy, precision, recall, F1 score, and ROC AUC metrics. Results: On the IBM dataset, the Generative Adversarial Network (GAN) Transformer model achieved 92.00% accuracy, 96.67% precision, 87.00% recall, 91.58% F1, and 96.32% ROC AUC. On the Kaggle dataset, it reached 96.95% accuracy, 97.28% precision, 96.60% recall, 96.94% F1, and 99.15% ROC AUC, substantially outperforming classical resampling methods (ROS, SMOTE, ADASYN) and recent literature benchmarks. SHAP explanations highlighted JobSatisfaction, Age, and YearsWithCurrManager as top predictors in IBM and number project, satisfaction level, and time spend company in Kaggle. Conclusion: The proposed GAN Transformer SHAP pipeline delivers state-of-the-art turnover prediction while furnishing transparent, actionable insights for HR decision-makers. Future work should validate generalizability across diverse industries and develop lightweight, real-time implementations.
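
SHAP attributions of the kind reported here can be produced for any fitted model; a minimal sketch with a gradient-boosted stand-in (the paper's model is a Transformer, for which SHAP's slower model-agnostic explainers would be used instead):

```python
import shap
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=300, n_features=8, random_state=0)
model = xgb.XGBClassifier(n_estimators=50).fit(X, y)

explainer = shap.TreeExplainer(model)          # fast, exact for tree ensembles
shap_values = explainer.shap_values(X)         # local attribution per sample/feature
shap.summary_plot(shap_values, X, show=False)  # global importance ranking
```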

26 pages, 3020 KiB  
Article
Data-Driven Loan Default Prediction: A Machine Learning Approach for Enhancing Business Process Management
by Xinyu Zhang, Tianhui Zhang, Lingmin Hou, Xianchen Liu, Zhen Guo, Yuanhao Tian and Yang Liu
Systems 2025, 13(7), 581; https://doi.org/10.3390/systems13070581 - 15 Jul 2025
Abstract
Loan default prediction is a critical task for financial institutions, directly influencing risk management, loan approval decisions, and profitability. This study evaluates the effectiveness of machine learning models, specifically XGBoost, Gradient Boosting, Random Forest, and LightGBM, in predicting loan defaults. The research investigates the following question: How effective are machine learning models in predicting loan defaults compared to traditional approaches? A structured machine learning pipeline is developed, including data preprocessing, feature engineering, class imbalance handling (SMOTE and class weighting), model training, hyperparameter tuning, and evaluation. Models are assessed using accuracy, F1-score, ROC AUC, precision–recall curves, and confusion matrices. The results show that Gradient Boosting achieves the highest overall classification performance (accuracy = 0.8887, F1-score = 0.8084, recall = 0.8021), making it the most effective model for identifying defaulters. XGBoost exhibits superior discriminatory power with the highest ROC AUC (0.9714). A cost-sensitive threshold-tuning procedure is embedded to align predictions with regulatory loss weights to support audit requirements.
(This article belongs to the Special Issue Data-Driven Methods in Business Process Management)
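
The cost-sensitive threshold tuning mentioned at the end can be sketched as a grid search over probability cutoffs; the 5:1 false-negative-to-false-positive cost ratio below is an assumed stand-in for the regulatory loss weights:

```python
import numpy as np

def tune_threshold(y_true, p_default, cost_fn=5.0, cost_fp=1.0):
    """Pick the cutoff minimizing expected cost, with a missed default (FN)
    weighted more heavily than a false alarm (FP)."""
    thresholds = np.linspace(0.01, 0.99, 99)
    costs = []
    for t in thresholds:
        pred = (p_default >= t).astype(int)
        fn = np.sum((pred == 0) & (y_true == 1))
        fp = np.sum((pred == 1) & (y_true == 0))
        costs.append(cost_fn * fn + cost_fp * fp)
    return thresholds[int(np.argmin(costs))]

# Toy labels and correlated default probabilities.
y_true = np.random.binomial(1, 0.2, 1000)
p_default = np.clip(y_true * 0.6 + np.random.rand(1000) * 0.5, 0, 1)
print(tune_threshold(y_true, p_default))
```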

22 pages, 6194 KiB  
Article
KidneyNeXt: A Lightweight Convolutional Neural Network for Multi-Class Renal Tumor Classification in Computed Tomography Imaging
by Gulay Maçin, Fatih Genç, Burak Taşcı, Sengul Dogan and Turker Tuncer
J. Clin. Med. 2025, 14(14), 4929; https://doi.org/10.3390/jcm14144929 - 11 Jul 2025
Abstract
Background: Renal tumors, encompassing benign, malignant, and normal variants, represent a significant diagnostic challenge in radiology due to their overlapping visual characteristics on computed tomography (CT) scans. Manual interpretation is time-consuming and susceptible to inter-observer variability, emphasizing the need for automated, reliable classification systems to support early and accurate diagnosis. Method and Materials: We propose KidneyNeXt, a custom convolutional neural network (CNN) architecture designed for the multi-class classification of renal tumors using CT imaging. The model integrates multi-branch convolutional pathways, grouped convolutions, and hierarchical feature extraction blocks to enhance representational capacity. Transfer learning with ImageNet-1K pretraining and fine-tuning was employed to improve generalization across diverse datasets. Performance was evaluated on three CT datasets: a clinically curated retrospective dataset (3199 images), the Kaggle CT KIDNEY dataset (12,446 images), and the KAUH: Jordan dataset (7770 images). All images were preprocessed to 224 × 224 resolution without data augmentation and split into training, validation, and test subsets. Results: Across all datasets, KidneyNeXt demonstrated outstanding classification performance. On the clinical dataset, the model achieved 99.76% accuracy and a macro-averaged F1 score of 99.71%. On the Kaggle CT KIDNEY dataset, it reached 99.96% accuracy and a 99.94% F1 score. Finally, evaluation on the KAUH dataset yielded 99.74% accuracy and a 99.72% F1 score. The model showed strong robustness against class imbalance and inter-class similarity, with minimal misclassification rates and stable learning dynamics throughout training. Conclusions: The KidneyNeXt architecture offers a lightweight yet highly effective solution for the classification of renal tumors from CT images. Its consistently high performance across multiple datasets highlights its potential for real-world clinical deployment as a reliable decision support tool. Interpretability was addressed using Grad-CAM visualizations, which provide class-specific attention maps highlighting the regions contributing to the model’s predictions. Future work may explore the integration of clinical metadata and multimodal imaging to further enhance diagnostic precision and interpretability.
(This article belongs to the Special Issue Artificial Intelligence and Deep Learning in Medical Imaging)
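
KidneyNeXt itself is a custom architecture, but the ImageNet-1K pretrain-and-fine-tune recipe it uses follows a familiar pattern; a sketch with a torchvision ResNet-18 as a stand-in for the three-class task:

```python
import torch.nn as nn
from torchvision import models

# Load ImageNet-1K pretrained weights, then swap in a new 3-class head
# (benign / malignant / normal).
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 3)

# Optionally freeze the pretrained backbone and fine-tune only the head.
for name, param in model.named_parameters():
    if not name.startswith("fc."):
        param.requires_grad = False
```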

24 pages, 2469 KiB  
Article
Generative and Contrastive Self-Supervised Learning for Virulence Factor Identification Based on Protein–Protein Interaction Networks
by Yalin Yao, Hao Chen, Jianxin Wang and Yeru Wang
Microorganisms 2025, 13(7), 1635; https://doi.org/10.3390/microorganisms13071635 - 10 Jul 2025
Abstract
Virulence factors (VFs), produced by pathogens, enable pathogenic microorganisms to invade, colonize, and damage host cells. Accurate VF identification advances the understanding of pathogenic mechanisms and provides novel anti-virulence targets. Existing models primarily utilize protein sequence features while overlooking systematic protein–protein interaction (PPI) information, despite pathogenesis typically resulting from coordinated protein–protein actions. Moreover, a severe imbalance exists between virulence and non-virulence proteins, so existing models trained on sampling-balanced datasets fail to incorporate the proteins’ inherent distributional characteristics, restricting generalization to real-world imbalanced data. To address these challenges, we propose a novel Generative and Contrastive self-supervised learning framework for Virulence Factor identification (GC-VF) that transforms VF identification into an imbalanced node classification task on graphs generated from PPI networks. The framework encompasses two core modules: the generative attribute reconstruction module learns attribute-space representations via feature reconstruction, capturing intrinsic data patterns and reducing noise; the local contrastive learning module employs node-level contrastive learning to precisely capture local features and contextual information, avoiding global aggregation losses while ensuring node representations truly reflect inherent characteristics. Comprehensive benchmark experiments demonstrate that GC-VF outperforms baseline methods on naturally imbalanced datasets, exhibiting higher accuracy and stability, and provides a potential solution for accurate VF identification.
(This article belongs to the Section Molecular Microbiology and Immunology)
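
The local contrastive module is the authors' own design; as a generic reference point, node-level contrastive learning is often implemented as an InfoNCE loss over two augmented views of the same nodes (a sketch under that assumption, not GC-VF itself):

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.5):
    """Node-level InfoNCE: the same node under two views is a positive pair;
    all other nodes in the batch serve as negatives."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature   # (n_nodes, n_nodes) similarity matrix
    labels = torch.arange(z1.size(0))    # diagonal entries are the positives
    return F.cross_entropy(logits, labels)

z1 = torch.randn(32, 64)  # embeddings of 32 nodes under view 1
z2 = torch.randn(32, 64)  # embeddings of the same nodes under view 2
loss = info_nce(z1, z2)
```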
