Search Results (439)

Search Parameters:
Keywords = imbalanced data distribution

22 pages, 3201 KB  
Article
Research on Pipeline Magnetic Flux Leakage Testing Defect Classification Based on Generate Expansion and Dual-Channel Vision Transformer
by Xulai Zhu, Yuxiang Zhang, Qiansheng Fang, Jin Jiang, Nana Zhang, Shiheng Tang and Gongquan Zhang
Appl. Sci. 2026, 16(12), 6214; https://doi.org/10.3390/app16126214 (registering DOI) - 19 Jun 2026
Viewed by 68
Abstract
Magnetic flux leakage (MFL) testing is a vital non-destructive testing method used to identify defects in oil and gas pipelines and critical components. However, variations in defect geometry and testing conditions can lead to inaccurate data and imbalanced feature distributions, which compromise detection outcomes. To address these challenges, this paper presents a defect classification approach for MFL testing based on generating expansion and the Dual-Channel Vision Transformer (DC-ViT). First, COMSOL finite element software (version 6.1) was used to simulate magnetic flux leakage for different types of pipeline defects. Axial and radial dual-channel signals were extracted to create the initial dataset. Next, a Conditional Variational Autoencoder (CVAE) was used for Generate Expansion to effectively mitigate sample scarcity and defect category imbalance. Finally, the DC-ViT model was constructed and trained using the Generate Expansion dataset as input to achieve multidimensional feature fusion and classification prediction for defects. Experimental results demonstrate 97.97% detection accuracy. The DC-ViT model outperforms traditional convolutional neural networks and single-channel models in terms of accuracy, precision, recall, and F1-score. These results validate the method’s effectiveness and robustness in complex defect scenarios and offer a novel approach to magnetic leakage signal detection. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
25 pages, 956 KB  
Article
Knowledge Graph-Driven Graph Neural Networks for Equipment Fault Prediction in Maglev Train Systems
by Chunlong Yu, Yi Peng, Kunyan Li, Jianyu Guo, Yi Wang and JingJing Chen
Appl. Sci. 2026, 16(12), 6205; https://doi.org/10.3390/app16126205 (registering DOI) - 19 Jun 2026
Viewed by 68
Abstract
Equipment fault prediction in maglev train systems poses substantial challenges: fault events are inherently rare, class distributions are severely imbalanced, and individual equipment units are subject to complex spatial and functional couplings that single-device statistical approaches fundamentally cannot capture. To address these challenges, this study proposes a Knowledge Graph-driven Graph Neural Network (KG-GNN) framework. A fault knowledge graph encompassing equipment, fault, temporal, and environmental entities is constructed to unify multi-source maintenance data. Graph connectivity is established via three spatial relation types (co-location, co-zone, and co-level), with edge weights derived from Laplacian-smoothed Lift scores quantifying fault co-occurrence strength. A two-layer GATv2Conv-based graph attention network is designed: the first layer employs four-head attention with explicit edge-weight integration to capture heterogeneous neighborhood influences, while the second layer produces compact node embeddings via single-head attention. A Top-20 sparsification strategy suppresses weak-association noise, and training under severe class imbalance is stabilized through Focal Loss and F2-Score-guided early stopping. On the test set, the proposed method achieves an F2-Score of 0.5703, Recall of 0.6825, and AUC-ROC of 0.9329 (single-run evaluation); multi-seed evaluation (5 seeds) yields F2 = 0.5645 ± 0.0035, Recall = 0.6789 ± 0.0095, and AUC-ROC = 0.9298 ± 0.0026, outperforming the MLP baseline by 18.3% in F2-Score and substantially exceeding GCN (F2 = 0.1476 ± 0.0176) and GATConv (F2 = 0.4284 ± 0.0097). Ablation studies confirm the individual contributions of authentic graph topology, precise edge weighting, and graph sparsification to overall performance. Full article
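The KG-GNN abstract above names Focal Loss as its remedy for severe class imbalance. As a minimal sketch of the general idea, the binary form below scales cross-entropy by (1 - p_t)^gamma so that well-classified examples contribute little to the total loss. The function name `binary_focal_loss` and the defaults gamma = 2 and alpha = 0.25 are assumptions for illustration; the abstract does not report the values the authors used.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Focal loss for a single example (illustrative sketch).

    p     : predicted probability of the positive class
    y     : true label, 0 or 1
    gamma : focusing parameter (assumed default, not from the abstract)
    alpha : weight of the positive class (assumed default)
    """
    # Probability the model assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    # Class-balancing weight for the true class.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t)^gamma shrinks the loss of easy, confident examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    print("easy positive:", binary_focal_loss(0.9, 1))
    print("hard positive:", binary_focal_loss(0.1, 1))
```

With gamma set to 0 the expression reduces to alpha-weighted cross-entropy, which is a quick way to sanity-check an implementation.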
34 pages, 3776 KB  
Article
Spatial Coupling Characteristics and Driving Mechanisms of Population–Land–Housing Based on Multi-Source Data: A Case Study of Guangzhou, China
by Chunshan Zhou, Shuyuan Liu, Huiming Huang, Xiong He and Xiaodie Yuan
Land 2026, 15(6), 1085; https://doi.org/10.3390/land15061085 - 18 Jun 2026
Viewed by 86
Abstract
Against the backdrop of the transition of new-type urbanization towards high-quality development, the triple contradictions of population agglomeration, land constraints, and housing supply-demand imbalance have become increasingly prominent. The conventional binary framework of human–land relations can no longer meet the requirements of coordinated development within human settlement systems, creating an urgent need to examine the multi-system interactions among population, land, and housing in order to resolve spatial mismatch. Taking Guangzhou as a case study, this research integrates 2020 population census data, land-use data from the European Space Agency (ESA), housing-price data from the Anjuke platform, and multi-source data on related influencing factors, and conducts a systematic empirical analysis by combining coupling coordination analysis, a relative development model, and the geographical detector. The findings reveal that the coupling coordination level of population, land and housing in Guangzhou exhibits a concentric, ring-shaped distribution pattern with central agglomeration and peripheral decline. The relative development among the three systems can be classified into matching types including the core-differentiated type, the peripheral-imbalanced type, and the surrounding-equilibrium type. With respect to influencing factors, all pairwise interactions are of the bi-factor enhancement type, and the driving mechanism displays a three-stage dynamic evolution. This study enriches research on human–land relations, provides precise guidance for optimizing spatial allocation and alleviating housing mismatch conflicts in Guangzhou, and offers transferable practical experience for comparable cities in China seeking to advance the high-quality development of new-type urbanization. Full article
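The Guangzhou study above relies on coupling coordination analysis across three subsystems (population, land, housing). The abstract does not give the formulas, so the following is only a sketch of one commonly used formulation: a coupling degree C based on the ratio of the geometric to the arithmetic mean, a comprehensive index T as a weighted sum, and a coordination degree D = sqrt(C * T). The function name, the equal default weights, and the assumption that each subsystem score is already normalized to (0, 1] are all illustrative.

```python
def coupling_coordination(u, weights=None):
    """Return (C, T, D) for normalized subsystem scores u (illustrative sketch).

    u       : list of subsystem scores, each assumed to lie in (0, 1]
    weights : weights for the comprehensive index T (equal if omitted)
    """
    n = len(u)
    if weights is None:
        weights = [1.0 / n] * n
    product = 1.0
    for value in u:
        product *= value
    # Coupling degree: geometric mean over arithmetic mean, in [0, 1].
    c = n * product ** (1.0 / n) / sum(u)
    # Comprehensive development index: weighted sum of the scores.
    t = sum(w * value for w, value in zip(weights, u))
    # Coupling coordination degree.
    d = (c * t) ** 0.5
    return c, t, d


if __name__ == "__main__":
    # Hypothetical scores for population, land and housing in one spatial unit.
    print(coupling_coordination([0.8, 0.6, 0.4]))
```

Equal scores give C = 1 (perfect coupling), while a unit where one subsystem dominates the others produces a C well below 1.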
26 pages, 1399 KB  
Article
A Node-Adaptive Feature Fusion Network for Drug–Target Interaction Prediction Based on Multi-View Graphs
by Lin Xie, Hongmei Xu, Pinglu Zhang, Jianshe Xiong and Jing Li
Biomolecules 2026, 16(6), 908; https://doi.org/10.3390/biom16060908 (registering DOI) - 18 Jun 2026
Viewed by 131
Abstract
Existing drug–target interaction (DTI) prediction methods still face challenges caused by sparse interaction data, complex multi-source relationships, and imbalanced information contributions among different nodes. In this study, we propose NAFF-DTI, a node-level adaptive feature fusion network based on multi-view graphs. The model uniformly represents drug similarity, target similarity, and known drug–target interactions as multiple relational views, and learns node representations through graph encoding and cross-view representation learning. To more effectively utilize heterogeneous relational information, NAFF-DTI introduces cross-view feature discrepancy modeling and a node-level adaptive fusion mechanism to dynamically adjust the contribution of different views according to node structural characteristics. Experimental results show that NAFF-DTI achieves the best AUC and AUPR on all five benchmark datasets. Compared with the strongest baseline for each dataset and metric, NAFF-DTI achieves average relative improvements of 3.81% in AUC and 3.23% in AUPR. It can also improve the utilization of multi-source information, maintain relatively stable prediction under different data distributions, and prioritize biologically plausible candidate drug–target associations from the unannotated candidate space. These results indicate that NAFF-DTI can provide computational support for DTI candidate prioritization and repurposing-oriented hypothesis generation. Full article
25 pages, 5791 KB  
Article
MSS-MambaNet: A Mamba Framework for Building Extraction from Multi-Phase Disaster Imagery
by Xin Liang, Huijiao Qiao, Yanda Chen and Jin Zhang
Sensors 2026, 26(12), 3868; https://doi.org/10.3390/s26123868 (registering DOI) - 17 Jun 2026
Viewed by 343
Abstract
Building extraction from disaster scenes is critical for emergency response and post-disaster assessment. Unlike conventional static remote sensing imagery, multi-phase disaster imagery contains scenes spanning early, middle, and late disaster stages, where building morphology, class distribution, and boundary characteristics exhibit significant cross-phase heterogeneity. Such phase-dependent variations substantially increase the difficulty of stable semantic segmentation, particularly under complex damage conditions. To address these challenges, we propose MSS-MambaNet for building extraction from multi-phase disaster imagery. A multi-scale architecture is designed to overcome the limitations of single-scale scanning in Mamba, enabling more effective perception of diverse building morphologies. To enhance feature discrimination, a Dual-Domain Cross-Gated Fusion (DDCGF) module is introduced through complementary interactions between spatial and frequency-domain representations. In addition, a Pixel-Aware Dynamic Weighting (PADW) strategy is developed to adaptively emphasize imbalanced foreground pixels and ambiguous boundary regions, thereby improving segmentation consistency under complex disaster conditions. Extensive experiments demonstrate that MSS-MambaNet consistently outperforms state-of-the-art methods, achieving an average mIoU of 92.78% and mF1 of 96.25% with only 12.37 M parameters. These results indicate that the proposed method effectively handles the heterogeneity of multi-phase data, providing a stable and efficient solution for building extraction from multi-phase disaster imagery. Full article
26 pages, 62623 KB  
Article
Semi-Supervised Traffic Sign Detection with Dynamic Pseudo-Label Selection and Gated Feature Fusion-Based Proposal Refinement
by Chenhui Xia, Yeqin Shao, Meiqin Che and Guoqing Yang
Sensors 2026, 26(12), 3836; https://doi.org/10.3390/s26123836 - 16 Jun 2026
Viewed by 167
Abstract
Accurate traffic sign detection is important for the safety of autonomous driving systems. However, fully supervised methods require a large amount of manual annotation, which is cost-prohibitive and time-consuming. Semi-supervised methods employ a small amount of labeled data and a large amount of unlabeled data to train the models, hence largely reducing the annotation costs. However, these methods have the following challenges: (1) with an imbalanced long-tail class distribution of traffic signs, they tend to achieve poor performance on tail classes; (2) they often fail to detect small traffic signs. To solve these issues, we propose a Semi-Supervised Traffic Sign Detection method with Dynamic Pseudo-Label Selection and Gated Feature Fusion-based Proposal Refinement. Firstly, we design a Class Distribution-based Dynamic Pseudo-Label Selection module (CD-DPLS) to select pseudo-labels for different classes based on the class distribution information, which reduces the tendency to select more pseudo-labels from head classes instead of tail classes, thereby improving the tail class detection performance. Secondly, we employ a Gated Feature Fusion-based Proposal Refinement strategy (GFF-PR) to refine detection proposals by fusing different-scale features with a gating mechanism, which facilitates the detection of small traffic signs. In addition, we use an Adaptive-Weight Focal Loss (AWFL), with which the weight of each pseudo-label is determined by the ratio between its classification confidence and the corresponding class-specific classification-confidence threshold. Experiments on traffic sign datasets demonstrate that the proposed method outperforms state-of-the-art semi-supervised approaches, with mAP50 scores of 10.8% and 34.9% using only 1% and 10% labeled data, respectively. Full article
(This article belongs to the Section Intelligent Sensors)
33 pages, 4129 KB  
Article
Optimization of Empty Railcar Distribution at the Loading End of a Heavy-Haul Railway Based on Deep Reinforcement Learning
by Liang Ma and Yuanli Bao
Future Transp. 2026, 6(3), 127; https://doi.org/10.3390/futuretransp6030127 - 14 Jun 2026
Viewed by 109
Abstract
In heavy-haul railway systems, effective empty railcar distribution (ERD) can optimize composition planning and meet empty railcar requirements (ERRs) at all loading ends, thereby improving the efficiency of train operations. To solve practical challenges such as the supply–demand imbalance of empty trains, redundant loading and unloading cycles, and prolonged waiting times, this study establishes a multi-objective 0–1 integer programming model for ERD at the loading end of a heavy-haul railway. The model can simultaneously maximize the fulfilment of all ERRs, minimize the ERD delay time, and reduce the waiting time in the heavy-train combination problem under complex constraints, including the passing capacity of sections, combination capacity of stations, and ERR at the loading end. While traditional optimization methods such as mathematical programming or heuristic algorithms partially address these issues, they are ineffective under dynamic constraints and state-space explosion. Furthermore, traditional reinforcement learning-based methods, such as Q-learning, exhibit limitations in railway scheduling due to the state-space explosion problem and inadequate model generalization. To overcome these limitations, this study proposes a framework in which the ERD at the loading end of the heavy-haul railway is formalized as a Markov decision process and optimized using deep Q-network (DQN) reinforcement learning. In addition, this study proposes an experience data fusion mechanism that integrates the empirical rules of the dispatchers through a modular architecture, achieving real-time constraint compliance while maintaining scalability for practical implementation. The NSGA-II genetic algorithm for multi-objective problems is used as a baseline to evaluate the performance of the DQN algorithm. The experimental results demonstrate that the DQN algorithm can fully meet ERRs with zero delay and produce optimal schemes for train combinations. NSGA-II, by contrast, presents superior performance in minimizing the combination waiting time and same-destination train combinations. The DQN algorithm can also identify superior ERD strategies in the expanded action and state spaces, enabling the effective handling of complex constraint-based ERD. Full article
23 pages, 2158 KB  
Article
DynamicFU: Contribution-Aware Dynamic Federated Unlearning for Industrial IoT
by Ziang Wu, Buzhen He, Zhiwei Si, Xiuheng Liao and Chunhua Su
Sensors 2026, 26(12), 3714; https://doi.org/10.3390/s26123714 - 11 Jun 2026
Viewed by 180
Abstract
The Industrial Internet of Things (IIoT) increasingly relies on federated learning (FL) to enable collaborative model training without directly sharing raw traffic data across industrial sites. However, in practical IIoT deployments, clients may later request the removal of their data contributions from a trained federated model due to regulatory requirements, such as the General Data Protection Regulation (GDPR), ownership transfer, or internal data-governance policies. Such practical requirements create a strong demand for federated unlearning in IIoT applications. Furthermore, IIoT deployments often exhibit highly imbalanced client data distributions, resulting in substantially different contributions of individual clients to the global model. Nevertheless, most existing federated unlearning methods adopt a uniform unlearning strategy and fail to account for such client-level contribution gaps. To address this issue, we propose DynamicFU, a contribution-aware dynamic federated unlearning framework for IIoT deployments. The proposed method evaluates the target client from parameter-level, data-level, and performance-level perspectives and adaptively determines the unlearning strength by dynamically adjusting the number of unlearning rounds. Experimental results on public IIoT datasets show that DynamicFU substantially improves unlearning efficiency, achieving up to 22.89× speedup over Full Retrain while maintaining comparable effectiveness. Full article
18 pages, 3324 KB  
Article
Entropy-Constrained M2ANet for Early Fault Prediction of Wind Turbines
by Jingchan Lv and Zhihai Yao
Entropy 2026, 28(6), 666; https://doi.org/10.3390/e28060666 - 11 Jun 2026
Viewed by 156
Abstract
Early fault prediction of wind turbines is critical for ensuring wind farm safety and reducing operation and maintenance costs. However, the latent and progressive nature of incipient faults, together with concurrent failures across multiple subsystems, makes accurate root-cause identification challenging. In addition, severe class imbalance between normal and faulty samples further degrades prediction performance, particularly for minority fault types. To address these challenges, this paper proposes a novel fault prediction model, M2ANet, using SCADA data within a 30-min pre-fault window. The model combines a dual-memory module with progressive dilated convolutions to efficiently capture multi-scale temporal dependencies from high-dimensional operational variables. An entropy-bias penalty is further introduced into the loss function to adaptively regularize the predicted probability distribution, alleviating overconfidence under imbalanced data conditions and improving the recognition of minority faults. Experiments on a real-world wind farm dataset show that M2ANet achieves an overall accuracy of 90.73% and a weighted F1-score of 90.62% in multi-class fault prediction, outperforming 10 representative baseline models. In addition to these aggregate metrics, per-class evaluation confirms the model’s robustness under class imbalance. Notably, for yaw system faults, which account for only 1.9% of the samples, M2ANet achieves a recall of 95.92% with a 30-min-ahead warning. These results demonstrate its effectiveness and reliability for early fault prediction in practical wind turbine applications. Full article
28 pages, 12346 KB  
Article
Feature-Embedded Transformer-Based Classification of Steel Plate Defects for Robust Industrial Process Inspection
by Bowen Dong, Xinyu Zhang, Chaoya Yan, Weiyan Zhu, Lingmin Hou, Yifan Feng and Lixing Lin
Processes 2026, 14(12), 1892; https://doi.org/10.3390/pr14121892 - 10 Jun 2026
Viewed by 155
Abstract
Robust defect classification is critical for intelligent process inspection and quality control in steel manufacturing, but it remains challenging when industrial tabular data are small, imbalanced, statistically skewed, and characterized by nonlinear inter-feature dependencies. This study proposes a robust steel plate defect classification framework based on a feature-embedded Transformer. A quantile-based transformation is first introduced to regularize heterogeneous and heavy-tailed process descriptors. Each numerical variable is then represented as a learnable feature token and processed by a Transformer encoder to model contextual interactions among positional, geometric, luminosity-related, and morphological attributes. Experiments were conducted on the Steel Plates Faults dataset, containing 1941 samples, 27 input features, and 7 defect categories. On the held-out test set, the model achieved an accuracy of 0.735, remaining competitive with XGBoost (0.794) and Random Forest (0.783). SHAP and self-attention analyses further indicate that the model captures distributed and interaction-aware defect representations, providing an interpretable solution for robust industrial defect classification. Full article
24 pages, 3428 KB  
Article
Sustainable and Reliable Operation of EV Charging Infrastructure: A Lightweight Prototype-Driven Contrastive Learning Framework for Fault Diagnosis Under Class-Imbalanced Conditions
by Zhengyu Lei, Baowen Xing, Jingrui Liu, Yuxin Yang, Tianyuan Miao and Yingjie Lu
Sustainability 2026, 18(11), 5783; https://doi.org/10.3390/su18115783 - 5 Jun 2026
Viewed by 357
Abstract
With the rapid growth of transportation electrification and smart energy systems, the reliable operation of electric vehicle (EV) charging infrastructure has become an important issue for sustainable transport, since charging faults may interrupt service and shorten equipment lifetime. However, practical charging environments are often characterized by heterogeneous operating conditions and severely imbalanced fault distributions, which limit the effectiveness of conventional fault diagnosis methods. To address these challenges, this study proposes a lightweight Proto-Contrastive Discriminative Learning (PCDL) framework for intelligent fault diagnosis in EV charging systems. The proposed method combines supervised contrastive learning with a prototype-distance discrimination mechanism to improve the identification of rare abnormal states under long-tailed data conditions. Heterogeneous charging features, including discrete control signals and continuous total harmonic distortion (THD) indicators, are projected into a discriminative embedding space, while anomaly detection is performed according to the relative distances between samples and class prototypes. Experimental results on a publicly available EV charging-pile monitoring dataset, containing 122,144 samples with four discrete control/safety features and two THD-based power-quality features, demonstrate that the proposed framework maintains stable detection performance under imbalance ratios of 1:1, 1:10, and 1:100. Under the most challenging 1:100 condition, the proposed method achieves an F1-score of 84.21%, representing a 29.08% improvement over the strongest baseline method. In addition, the framework requires only approximately 11 KB of memory and maintains CPU inference latency below 6.3 ms, demonstrating strong potential for real-time deployment on resource-constrained edge devices. These results suggest that the proposed framework can provide a lightweight diagnostic tool for practical charging stations and support safer and more reliable EV charging operation. Full article
(This article belongs to the Section Energy Sustainability)
27 pages, 1381 KB  
Article
Federated Learning for Breast Cancer Classification: A Comparative Study of Aggregation Methods
by Nadjat Saàdia Lachemi, Medjeded Merati and Saïd Mahmoudi
Information 2026, 17(6), 545; https://doi.org/10.3390/info17060545 - 2 Jun 2026
Viewed by 215
Abstract
Federated Learning (FL) allows healthcare institutions to collaboratively develop machine learning models while safeguarding patient data, making it ideal for privacy-sensitive medical imaging. This study explores the effects of data heterogeneity on federated breast cancer classification using MobileNetV2 across five simulated clients. Five aggregation methods—FedAvg, FedProx, FedNova, FedDyn, and SCAFFOLD—were assessed under various data distributions, including balanced, imbalanced, non-homogeneous, and non-IID. Results indicate that aggregation performance is significantly affected by data distribution; FedAvg excels in balanced settings but falters in heterogeneity, whereas FedProx shows robustness in extreme non-IID cases, achieving up to 98.466% accuracy. FedDyn and SCAFFOLD also demonstrate adaptability but are less consistent in severe imbalance scenarios. Beyond accuracy, recall and robustness under extreme non-IID conditions were analyzed to assess clinical reliability in cancer detection. These results underscore the necessity of choosing suitable aggregation methods for effective medical federated learning. Full article
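Of the five aggregation methods compared above, FedAvg is the simplest: the server averages client parameters, weighting each client by its number of local samples. The sketch below shows only that aggregation step on flat parameter lists. The function name `fedavg` and the toy inputs are illustrative, and nothing here reflects the study's MobileNetV2 setup or its other aggregators.

```python
def fedavg(client_weights, client_sizes):
    """Sample-size-weighted average of client parameters (illustrative sketch).

    client_weights : list of flat parameter lists, one per client
    client_sizes   : number of local training samples per client
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        # Each parameter is averaged with weights proportional to client data size.
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


if __name__ == "__main__":
    # Two hypothetical clients; the second holds three times as much data.
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

Because the weights follow data volume, a client with a large but skewed local dataset pulls the global model toward its own distribution, which is one reason heterogeneity-aware alternatives such as FedProx are studied.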
16 pages, 2925 KB  
Article
SABI: Self-Adaptive Bias for Imbalanced Data Classification
by Suchan Choi, Jinyoung Oh and Jeong-Won Cha
Appl. Sci. 2026, 16(11), 5486; https://doi.org/10.3390/app16115486 - 1 Jun 2026
Viewed by 130
Abstract
Class imbalance remains a significant challenge in classification, often leading to poor generalization on underrepresented classes. While oversampling methods mitigate this issue by replicating minority class instances to balance class distributions, they typically overlook the informativeness of individual samples. In this paper, we propose an entropy-guided data selection strategy that dynamically prioritizes samples exhibiting frequent prediction changes during training, that is, those with high predictive entropy. Such uncertain samples are expected to contribute more effectively to the learning process. Moreover, we incorporate a credal set-based weighting scheme that adjusts class-wise selection probabilities according to global imbalance severity, quantified using the Gini coefficient. This adjustment penalizes overrepresented classes while increasing the sampling probability of rare but uncertain examples. Experiments on benchmark datasets show that the proposed method improves overall classification performance across imbalanced data settings and yields a more balanced trade-off across head, body, and tail classes. Full article
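The SABI abstract above combines two quantities: the Gini coefficient of the class distribution as a measure of global imbalance, and the entropy of a sample's predictions during training as a measure of uncertainty. The sketch below computes both in their textbook forms. How the paper turns them into credal-set weights is not stated in the abstract and is not attempted here; the function names and the idea of storing one predicted label per epoch are assumptions.

```python
import math
from collections import Counter


def gini(counts):
    """Gini coefficient of class counts: 0 means perfectly balanced classes."""
    n = len(counts)
    total = sum(counts)
    # Mean absolute difference over all ordered pairs, normalized by the mean.
    diff_sum = sum(abs(a - b) for a in counts for b in counts)
    return diff_sum / (2.0 * n * total)


def prediction_entropy(history):
    """Shannon entropy of the labels predicted for one sample across epochs."""
    freq = Counter(history)
    steps = len(history)
    return -sum((c / steps) * math.log(c / steps) for c in freq.values())


if __name__ == "__main__":
    # Hypothetical long-tailed class counts.
    print("gini:", gini([900, 90, 10]))
    # Hypothetical per-epoch predictions for a stable and an unstable sample.
    print("stable:", prediction_entropy([2, 2, 2, 2]))
    print("unstable:", prediction_entropy([0, 2, 1, 2]))
```

A sample whose predicted label never changes has zero entropy, so a selection rule based on this score would rarely pick it, regardless of its class.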
23 pages, 3604 KB  
Article
Spectrum-Aware Generative Model for Small-Sample Motor Fault Diagnosis
by Lijing Wang, Ying Xie, Yuchen Yang, Chunsong Han and Qi Zhao
Actuators 2026, 15(6), 299; https://doi.org/10.3390/act15060299 - 28 May 2026
Viewed by 255
Abstract
This paper proposes a spectrum-aware generative learning framework for intelligent motor fault diagnosis under small-sample conditions. To address the challenges of insufficient labeled fault data and imbalanced distributions in motor systems, a hybrid model integrating a generative adversarial network (GAN) with an attention-enhanced deep neural network is developed. First, vibration signals of the motor are transformed into time–frequency representations to capture discriminative spectral features. Then, the GAN is employed to augment minority classes and improve data diversity, while the SE (squeeze-and-excitation) mechanism enhances feature extraction by emphasizing critical fault-related components. Finally, a deep classifier is trained on the augmented dataset for fault identification. Experimental results on benchmark datasets demonstrate that the proposed method achieves superior diagnostic accuracy and robustness compared with several state-of-the-art approaches, especially under severe data scarcity and imbalance scenarios. The results indicate that the proposed framework effectively improves generalization performance and provides a reliable solution for intelligent motor fault diagnosis in practical industrial applications. Full article
26 pages, 13861 KB  
Article
Construction Safety Risk Prediction Using SAE-SMOTE Data Augmentation and Adaptive Weighted Naive Bayes
by Qifei Wang, Jian Li, Shuai Liu and Jinbo Yao
Buildings 2026, 16(11), 2156; https://doi.org/10.3390/buildings16112156 - 28 May 2026
Viewed by 269
Abstract
Safety risk prediction in construction is a fundamental basis for accident prevention, and its scientific rigor directly affects the effectiveness of safety management decision-making. However, traditional approaches rely heavily on subjective judgment in weight determination, while accident datasets are often constrained by limited sample sizes and imbalanced class distributions, which hinder model training performance. To address these limitations, this study proposes an improved model based on data augmentation, termed SAE-SMOTE-AWBN (SSA). The proposed approach employs SMOTE to alleviate data imbalance, integrates a stacked autoencoder (SAE) to enhance feature representation, and utilizes AWBN to perform probabilistic inference of risk factors alongside adaptive weight adjustment. Results obtained from five-fold cross-validation indicate that, compared with the 58.6% prediction accuracy achieved by traditional methods, the SSA model improves accident prediction accuracy to 78.3 ± 2.4%, thereby demonstrating the effectiveness and applicability of the proposed approach. Full article
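The SSA abstract above uses SMOTE to alleviate class imbalance. As a rough sketch of the core SMOTE step only, the function below picks a minority sample, finds its nearest minority neighbours, and places a synthetic point at a random position on the segment between the sample and one of those neighbours. The function name `smote_sample`, the default k, and the toy data are illustrative; the stacked autoencoder and the AWBN stages of the paper are not represented.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Generate one synthetic minority point by interpolation (illustrative sketch).

    minority : list of feature vectors (lists of floats), at least two
    k        : number of nearest neighbours to choose from (assumed default)
    rng      : optional random.Random instance for reproducibility
    """
    if rng is None:
        rng = random.Random(0)
    x = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to x.
    neighbours = sorted(
        (p for p in minority if p is not x),
        key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
    )[:k]
    nb = rng.choice(neighbours)
    # Interpolate at a random position between x and the chosen neighbour.
    u = rng.random()
    return [a + u * (b - a) for a, b in zip(x, nb)]


if __name__ == "__main__":
    # Hypothetical minority-class samples with two features.
    pts = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(smote_sample(pts, rng=random.Random(1)))
```

Every synthetic point lies on a line segment between two real minority samples, so the method adds density inside the minority region without inventing values outside it.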