Search Results (659)

Search Parameters:
Keywords = class-imbalanced

21 pages, 2093 KB  
Article
Dual-Stream Time-Series Transformer-Based Encrypted Traffic Data Augmentation Framework
by Daeho Choi, Yeog Kim, Changhoon Lee and Kiwook Sohn
Appl. Sci. 2025, 15(18), 9879; https://doi.org/10.3390/app15189879 - 9 Sep 2025
Abstract
We propose a Transformer-based data augmentation framework with a time-series dual-stream architecture to address performance degradation in encrypted network traffic classification caused by class imbalance between attack and benign traffic. The proposed framework independently processes the complete flow’s sequential packet information and statistical characteristics by extracting and normalizing a local channel (comprising packet size, inter-arrival time, and direction) and a set of six global flow-level statistical features. These are used to generate a fixed-length multivariate sequence and an auxiliary vector. The sequence and vector are then fed into an encoder-only Transformer that integrates learnable positional embeddings with a FiLM + context token-based injection mechanism, enabling complementary representation of sequential patterns and global statistical distributions. Large-scale experiments demonstrate that the proposed method reduces reconstruction RMSE and additional feature restoration MSE by over 50%, while improving accuracy, F1-Score, and AUC by 5–7%p compared to classification on the original imbalanced datasets. Furthermore, the augmentation process achieves practical levels of processing time and memory overhead. These results show that the proposed approach effectively mitigates class imbalance in encrypted traffic classification and offers a promising pathway to achieving more robust model generalization in real-world deployment scenarios. Full article
(This article belongs to the Special Issue AI-Enabled Next-Generation Computing and Its Applications)

28 pages, 1433 KB  
Article
Class-Adaptive Weighted Broad Learning System with Hybrid Memory Retention for Online Imbalanced Classification
by Jintao Huang, Yu Wang and Mengxin Wang
Electronics 2025, 14(17), 3562; https://doi.org/10.3390/electronics14173562 - 8 Sep 2025
Abstract
Data stream classification is a critical challenge in data mining, where models must rapidly adapt to evolving data distributions and concept drift in real time. While extreme learning machines offer fast training and strong generalization, most existing methods struggle to jointly address multi-class imbalance, concept drift, and the high cost of label acquisition in streaming settings. In this paper, we present the Adaptive Broad Learning System for Online Imbalanced Classification (ABLS-OIC), which introduces three core innovations: (1) a Class-Adaptive Weight Matrix (CAWM) that dynamically adjusts sample weights according to class distribution, sample density, and difficulty; (2) a Hybrid Memory Retention Mechanism (HMRM) that selectively retains representative samples based on importance and diversity; and (3) a Multi-Objective Adaptive Optimization Framework (MAOF) that balances classification accuracy, class balance, and computational efficiency. Extensive experiments on ten benchmark datasets with varying imbalance ratios and drift patterns show that ABLS-OIC consistently outperforms state-of-the-art methods, with improvements of 5.9% in G-mean, 6.3% in F1-score, and 3.4% in AUC. Furthermore, a real-world credit fraud detection case study demonstrates the practical effectiveness of ABLS-OIC, highlighting its value for early detection of rare but critical events in dynamic, high-stakes applications. Full article
(This article belongs to the Special Issue Advances in Data Mining and Its Applications)

14 pages, 685 KB  
Proceeding Paper
Predictive Analysis of Voice Pathology Using Logistic Regression: Insights and Challenges
by Divya Mathews Olakkengil and Sagaya Aurelia P
Eng. Proc. 2025, 107(1), 28; https://doi.org/10.3390/engproc2025107028 - 27 Aug 2025
Viewed by 481
Abstract
Voice pathology diagnosis is essential for the timely detection and management of voice disorders, which can significantly impact an individual’s quality of life. This study employed logistic regression to evaluate the predictive power of variables that include age, severity, loudness, breathiness, pitch, roughness, strain, and gender on a binary diagnosis outcome (Yes/No). The analysis was performed on the Perceptual Voice Qualities Database (PVQD), a comprehensive dataset containing voice samples with perceptual ratings. Two widely used voice quality assessment tools, CAPE-V (Consensus Auditory-Perceptual Evaluation of Voice) and GRBAS (Grade, Roughness, Breathiness, Asthenia, Strain), were employed to annotate voice qualities, ensuring systematic and clinically relevant perceptual evaluations. The model revealed that age (odds ratio: 1.033, p < 0.001), loudness (odds ratio: 1.071, p = 0.005), and gender (male) (odds ratio: 1.904, p = 0.043) were statistically significant predictors of voice pathology. In contrast, severity and voice quality-related features like breathiness, pitch, roughness, and strain did not show statistical significance, suggesting their limited predictive contributions within this model. While the results provide valuable insights, the study underscores notable limitations of logistic regression. The model assumes a linear relationship between the independent variables and the log odds of the outcome, which restricts its ability to capture complex, non-linear patterns within the data. Additionally, logistic regression does not inherently account for interactions between predictors or feature dependencies, potentially limiting its performance in more intricate datasets. Furthermore, a fixed classification threshold (0.5) may lead to misclassification, particularly in datasets with imbalanced classes or skewed predictor distributions. 
These findings highlight that although logistic regression serves as a useful tool for identifying significant predictors, its results are dataset-dependent and cannot be generalized across diverse populations. Future research should validate these findings using heterogeneous datasets and employ advanced machine learning techniques to address the limitations of logistic regression. Integrating non-linear models or feature interaction analyses may enhance diagnostic accuracy, ensuring more reliable and robust voice pathology predictions. Full article
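The point about a fixed 0.5 classification threshold can be illustrated with a minimal sketch. The scores and labels below are hypothetical and are not taken from the PVQD study; the helper names `recall_at` and `precision_at` are introduced here only for illustration.

```python
# Hypothetical predicted probabilities from some classifier on an
# imbalanced set: 7 negatives (label 0) and 3 positives (label 1).
probs = [0.05, 0.10, 0.15, 0.20, 0.22, 0.30, 0.35, 0.45, 0.40, 0.62]
labels = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]


def recall_at(threshold, probs, labels):
    """Recall of the positive class when predicting 1 for p >= threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= threshold)
    fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p < threshold)
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision_at(threshold, probs, labels):
    """Precision of the positive class when predicting 1 for p >= threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= threshold)
    fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= threshold)
    return tp / (tp + fp) if (tp + fp) else 0.0


# With the default cut-off, two of the three positives are missed.
recall_default = recall_at(0.5, probs, labels)
# Lowering the cut-off recovers them in this toy data.
recall_lowered = recall_at(0.4, probs, labels)
precision_lowered = precision_at(0.4, probs, labels)
```

In practice the cut-off would be chosen on held-out validation data, trading precision against recall, rather than read off the evaluation set as in this toy case.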

40 pages, 30645 KB  
Article
From Data to Diagnosis: A Novel Deep Learning Model for Early and Accurate Diabetes Prediction
by Muhammad Mohsin Zafar, Zahoor Ali Khan, Nadeem Javaid, Muhammad Aslam and Nabil Alrajeh
Healthcare 2025, 13(17), 2138; https://doi.org/10.3390/healthcare13172138 - 27 Aug 2025
Viewed by 521
Abstract
Background: Diabetes remains a major global health challenge, contributing significantly to premature mortality due to its potential progression to organ failure if not diagnosed early. Traditional diagnostic approaches are subject to human error, highlighting the need for modern computational techniques in clinical decision support systems. Although these systems have successfully integrated deep learning (DL) models, they still encounter several challenges, such as a lack of intricate pattern learning, imbalanced datasets, and poor interpretability of predictions. Methods: To address these issues, the temporal inception perceptron network (TIPNet), a novel DL model, is designed to accurately predict diabetes by capturing complex feature relationships and temporal dynamics. An adaptive synthetic oversampling strategy is utilized to reduce severe class imbalance in an extensive diabetes health indicators dataset consisting of 253,680 instances and 22 features, providing a diverse and representative sample for model evaluation. The model’s performance and generalizability are assessed using a 10-fold cross-validation technique. To enhance interpretability, explainable artificial intelligence techniques are integrated, including local interpretable model-agnostic explanations and Shapley additive explanations, providing insights into the model’s decision-making process. Results: Experimental results demonstrate that TIPNet achieves improvement scores of 3.53% in accuracy, 3.49% in F1-score, 1.14% in recall, and 5.95% in the area under the receiver operating characteristic curve. Conclusions: These findings indicate that TIPNet is a promising tool for early diabetes prediction, offering accurate and interpretable results. The integration of advanced DL modeling with oversampling strategies and explainable AI techniques positions TIPNet as a valuable resource for clinical decision support, paving the way for its future application in healthcare settings. Full article

27 pages, 2279 KB  
Article
HQRNN-FD: A Hybrid Quantum Recurrent Neural Network for Fraud Detection
by Yao-Chong Li, Yi-Fan Zhang, Rui-Qing Xu, Ri-Gui Zhou and Yi-Lin Dong
Entropy 2025, 27(9), 906; https://doi.org/10.3390/e27090906 - 27 Aug 2025
Viewed by 560
Abstract
Detecting financial fraud is a critical aspect of modern intelligent financial systems. Despite the advances brought by deep learning in predictive accuracy, challenges persist—particularly in capturing complex, high-dimensional nonlinear features. This study introduces a novel hybrid quantum recurrent neural network for fraud detection (HQRNN-FD). The model utilizes variational quantum circuits (VQCs) incorporating angle encoding, data reuploading, and hierarchical entanglement to project transaction features into quantum state spaces, thereby facilitating quantum-enhanced feature extraction. For sequential analysis, the model integrates a recurrent neural network (RNN) with a self-attention mechanism to effectively capture temporal dependencies and uncover latent fraudulent patterns. To mitigate class imbalance, the synthetic minority over-sampling technique (SMOTE) is employed during preprocessing, enhancing both class representation and model generalizability. Experimental evaluations reveal that HQRNN-FD attains an accuracy of 0.972 on publicly available fraud detection datasets, outperforming conventional models by 2.4%. In addition, the framework exhibits robustness against quantum noise and improved predictive performance with increasing qubit numbers, validating its efficacy and scalability for imbalanced financial classification tasks. Full article
(This article belongs to the Special Issue Quantum Computing in the NISQ Era)

16 pages, 601 KB  
Article
UAV Airborne Network Intrusion Detection Method Based on Improved Stratified Sampling and Ensemble Learning
by Lin Lin, Hongjuan Ge, Yuefei Zhou and Runzong Shangguan
Drones 2025, 9(9), 604; https://doi.org/10.3390/drones9090604 - 27 Aug 2025
Viewed by 359
Abstract
UAV airborne network intrusion detection faces challenges due to highly imbalanced datasets, where normal samples significantly outnumber intrusion instances. This paper proposes an improved stratified sampling and ensemble learning (ISSEL) method to address this issue. The method improves upon traditional stratified sampling by clustering normal samples and performing distance-based sampling from cluster centers to ensure better feature space representation. Subsequently, five tree models, namely, decision tree, extra tree, random forest, gradient boosting tree, and XGBoost, are utilized to train each subset. The model prediction results are then integrated using an adaptive weighting strategy based on the F1 score. The experimental results on the MIL-STD-1553B data bus demonstrated that the ISSEL method maintained a high accuracy rate of 99.42% while significantly enhancing the recognition ability for minority-class attacks. The precision, recall, and F1 score reached 98.94%, 97.62%, and 98.28%, respectively. These results validate the effectiveness of the ISSEL method in handling imbalanced datasets, highlighting its potential application in the field of airborne network intrusion detection. Full article
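The idea of weighting ensemble members by their F1 score can be sketched as follows. This is a generic illustration, not the ISSEL weighting rule, whose exact form the abstract does not give. It assumes binary labels, weights proportional to each model's validation F1, and a 0.5 cut-off on the weighted vote. The function names and the toy predictions are hypothetical.

```python
def f1_score(y_true, y_pred):
    """Binary F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_weighted_vote(val_true, val_preds, test_preds):
    """Combine per-model 0/1 predictions using validation-F1 weights.

    val_preds and test_preds hold one prediction list per model.
    """
    weights = [f1_score(val_true, preds) for preds in val_preds]
    total = sum(weights)
    if total == 0:
        # Fall back to equal weights if every model scores zero.
        weights = [1.0] * len(weights)
        total = float(len(weights))
    weights = [w / total for w in weights]
    n_samples = len(test_preds[0])
    combined = []
    for i in range(n_samples):
        score = sum(w * preds[i] for w, preds in zip(weights, test_preds))
        combined.append(1 if score >= 0.5 else 0)
    return combined, weights


# Toy validation labels and predictions from three hypothetical models.
val_true = [1, 1, 0, 0, 1, 0]
val_preds = [
    [1, 1, 0, 0, 1, 0],  # model A: perfect on validation
    [1, 0, 0, 0, 0, 0],  # model B: misses two positives
    [0, 0, 1, 1, 0, 1],  # model C: always wrong
]
test_preds = [
    [1, 0, 1],
    [0, 0, 1],
    [0, 1, 0],
]
combined, weights = f1_weighted_vote(val_true, val_preds, test_preds)
```

Because model C has zero validation F1, its lone vote for the second test sample carries no weight, which is the behaviour an F1-based weighting is meant to produce.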

14 pages, 3720 KB  
Proceeding Paper
A Novel Data-Driven Framework for Automated Migraines Classification Using Ensemble Learning
by Muhammad Owais Butt, Azka Mir and Alun Sujjada
Eng. Proc. 2025, 107(1), 25; https://doi.org/10.3390/engproc2025107025 - 26 Aug 2025
Viewed by 332
Abstract
Migraines are recurring and highly painful headaches with multiple associated symptoms that severely affect millions of people around the world. This condition is considered quite serious from a neurologist’s perspective because it is highly debilitating. Effective treatment of migraines begins with diagnosis, but the subjective nature of clinical evaluations, along with class imbalance in patient datasets, makes this difficult. This paper attempts to tackle these issues by developing a machine-learning framework for automated migraine classification, using a Kaggle dataset of 400 samples with 23 independent attributes and 1 dependent attribute representing different types of migraines. Our framework starts with a detailed cleansing of the data, which includes filtering out all missing values. Then, SMOTE (Synthetic Minority Oversampling Technique) is applied to address the imbalanced dataset. This is followed by optimized feature selection through forward selection and cross-validation with Naïve Bayes. Supervised machine-learning classifiers such as Random Forest (RF), decision tree (DT), K-nearest Neighbors (KNN), and Naïve Bayes (NB) are evaluated and voted on to predict the outcome. Full article

25 pages, 4100 KB  
Article
An Adaptive Unsupervised Learning Approach for Credit Card Fraud Detection
by John Adejoh, Nsikak Owoh, Moses Ashawa, Salaheddin Hosseinzadeh, Alireza Shahrabi and Salma Mohamed
Big Data Cogn. Comput. 2025, 9(9), 217; https://doi.org/10.3390/bdcc9090217 - 25 Aug 2025
Viewed by 777
Abstract
Credit card fraud remains a major cause of financial loss around the world. Traditional fraud detection methods that rely on supervised learning often struggle because fraudulent transactions are rare compared to legitimate ones, leading to imbalanced datasets. Additionally, the models must be retrained frequently, as fraud patterns change over time and require new labeled data for retraining. To address these challenges, this paper proposes an ensemble unsupervised learning approach for credit card fraud detection that combines Autoencoders (AEs), Self-Organizing Maps (SOMs), and Restricted Boltzmann Machines (RBMs), integrated with an Adaptive Reconstruction Threshold (ART) mechanism. The ART dynamically adjusts anomaly detection thresholds by leveraging the clustering properties of SOMs, effectively overcoming the limitations of static threshold approaches in machine learning and deep learning models. The proposed models, AE-ASOMs (Autoencoder—Adaptive Self-Organizing Maps) and RBM-ASOMs (Restricted Boltzmann Machines—Adaptive Self-Organizing Maps), were evaluated on the Kaggle Credit Card Fraud Detection and IEEE-CIS datasets. Our AE-ASOM model achieved an accuracy of 0.980 and an F1-score of 0.967, while the RBM-ASOM model achieved an accuracy of 0.975 and an F1-score of 0.955. Compared to models such as One-Class SVM and Isolation Forest, our approach demonstrates higher detection accuracy and significantly reduces false positive rates. In addition to its performance, the model offers considerable computational efficiency with a training time of 200.52 s and memory usage of 3.02 megabytes. Full article

16 pages, 1827 KB  
Review
Disease Prediction in Cattle: A Mixed-Methods Review of Predictive Modeling Studies
by Lilli Heinen, Robert L. Larson and Brad J. White
Animals 2025, 15(17), 2481; https://doi.org/10.3390/ani15172481 - 23 Aug 2025
Viewed by 341
Abstract
Predictive models use historical data to predict a future event and can be applied to a wide variety of tasks. A broader evaluation of the cattle literature is required to better understand predictive model performance across various health challenges and to understand data types utilized to train models. This narrative review aims to describe predictive model performance in greater detail across various disease outcomes, input data types, and algorithms with a specific focus on accuracy, sensitivity, specificity, and positive and negative predictive values. A secondary goal is to address important areas for consideration for future work in the beef cattle sector. In total, 19 articles were included. Broad categories of disease were covered, including respiratory disease, bovine tuberculosis, and others. Various input data types were reported, including demographic data, images, and laboratory test results, among others. Several algorithms were utilized, including neural networks, linear models, and others. Accuracy, sensitivity, and specificity values ranged widely across disease outcome and algorithm categories. Negative predictive values were greater than positive predictive values for most disease outcomes. This review highlights the importance of utilizing several performance metrics and concludes that future work should address prevalence of outcomes and class-imbalanced data. Full article
(This article belongs to the Special Issue Artificial Intelligence Applications for Veterinary Medicine)

13 pages, 1341 KB  
Proceeding Paper
Predicting Nurse Stress Levels Using Time-Series Sensor Data and Comparative Evaluation of Classification Algorithms
by Ayşe Çiçek Korkmaz, Adem Korkmaz and Selahattin Koşunalp
Eng. Proc. 2025, 104(1), 30; https://doi.org/10.3390/engproc2025104030 - 22 Aug 2025
Viewed by 273
Abstract
This study proposes a machine learning-based framework for classifying occupational stress levels among nurses using physiological time-series data collected from wearable sensors. The dataset comprises multimodal signals including electrodermal activity (EDA), heart rate (HR), skin temperature (TEMP), and tri-axial accelerometer measurements (X, Y, Z), which are labeled into three categorical stress levels: low (0), medium (1), and high (2). To enhance the usability of the raw data, a resampling process was performed to aggregate the measurements into one-minute intervals, followed by the application of the Synthetic Minority Over-sampling Technique (SMOTE) to mitigate severe class imbalance. Subsequently, a comparative classification analysis was conducted using four supervised learning algorithms: Random Forest, XGBoost, k-Nearest Neighbors (k-NN), and LightGBM. Model performances were evaluated based on accuracy, weighted F1-score, and confusion matrices to ensure robustness across imbalanced class distributions. Additionally, temporal pattern analyses by the day of the week and the hour of the day revealed significant trends in stress variation, underscoring the influence of circadian and organizational factors. Among the models tested, ensemble-based methods, particularly Random Forest and XGBoost with optimized hyperparameters, demonstrated a superior predictive performance. These findings highlight the feasibility of integrating real-time, sensor-driven stress monitoring systems into healthcare environments to support proactive workforce management and improve care quality. Full article

20 pages, 2239 KB  
Article
Lightweight Financial Fraud Detection Using a Symmetrical GAN-CNN Fusion Architecture
by Yiwen Yang, Chengjun Xu and Guisheng Tian
Symmetry 2025, 17(8), 1366; https://doi.org/10.3390/sym17081366 - 21 Aug 2025
Viewed by 584
Abstract
With the rapid development of information technology and the deep integration of the Internet platform, the scale and form of financial transactions continue to grow and expand, significantly improving users’ payment experience and life efficiency. However, while financial transactions bring convenience, they also expose many security risks, such as money laundering, forged checks, and other frequently occurring financial fraud, which seriously threaten the stability and security of the financial system. Due to the imbalance between the proportion of normal and abnormal transactions in the data, most of the existing deep learning-based methods still have obvious deficiencies in learning small-sample classes, context modeling, and computational complexity control. To address these deficiencies, this paper proposes a symmetrical structure-based GAN-CNN model for lightweight financial fraud detection. The symmetrical structure can improve the feature extraction and fusion ability and enhance the model’s recognition effect for complex fraud patterns. Synthetic fraud samples are generated based on a GAN to alleviate category imbalance. Multi-scale convolution and attention mechanisms are designed to extract local and global transaction features, and adaptive aggregation and context encoding modules are introduced to improve computational efficiency. We conducted numerous replicate experiments on two public datasets, YelpChi and Amazon. The results showed that on the Amazon dataset with a 50% training ratio, compared with the CNN-GAN model, the accuracy of our model was improved by 1.64%, and the number of parameters was reduced by approximately 88.4%. Compared with the hybrid CNN-LSTM–attention model under the same setting, the accuracy was improved by 0.70%, and the number of parameters was reduced by approximately 87.6%.
The symmetry-based lightweight architecture proposed in this work is novel in terms of structural design, and the experimental results show that it is both efficient and accurate in detecting imbalanced transactions. Full article
(This article belongs to the Section Computer)

21 pages, 2544 KB  
Article
Towards Fair Graph Neural Networks via Counterfactual and Balance
by Zhiguo Xiao, Yangfan Zhou, Dongni Li and Ke Wang
Information 2025, 16(8), 704; https://doi.org/10.3390/info16080704 - 19 Aug 2025
Viewed by 753
Abstract
In recent years, graph neural networks (GNNs) have shown powerful performance in processing non-Euclidean data. However, similar to other machine-learning algorithms, GNNs can amplify data bias in high-risk decision-making systems, which can easily lead to unfairness in the final decision-making results. At present, a large number of studies focus on solving the fairness problem of GNNs, but the existing methods mostly rely on building complex model architectures or on techniques borrowed from outside the GNN field. To this end, this paper proposes FairCNCB (Fair Graph Neural Network based on Counterfactual and Category Balance) to address the problem of class imbalance in minority sensitive attribute groups. First, we conduct a causal analysis of fair representation and employ the adversarial network to generate counterfactual node samples, effectively mitigating bias induced by sensitive attributes. Second, we calculate the weights for minority sensitive attribute groups and reconstruct the loss function to achieve the fairness of sensitive attribute classes among different groups. The synergy between the two modules optimizes GNNs from multiple dimensions and significantly improves the performance of GNNs in terms of fairness. Experimental results on three datasets show the effectiveness and fairness of FairCNCB. The performance metrics (such as AUC, F1, and ACC) have been improved by approximately 2%, and the fairness metrics (Δsp, Δeo) have been enhanced by approximately 5%. Full article
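For readers unfamiliar with the two fairness metrics named above, the following minimal sketch computes them under their commonly used definitions. It assumes a binary sensitive attribute `s`, takes the statistical parity gap as |P(ŷ=1 | s=0) − P(ŷ=1 | s=1)|, and takes the equal opportunity gap as the same difference restricted to samples with true label 1. The abstract does not state its formulas, so these definitions are an assumption, and the toy data and function names are hypothetical.

```python
def positive_rate(preds):
    """Fraction of predictions equal to 1 (0.0 for an empty group)."""
    return sum(preds) / len(preds) if preds else 0.0


def delta_sp(y_pred, s):
    """Statistical parity gap between the groups s == 0 and s == 1."""
    group0 = [p for p, a in zip(y_pred, s) if a == 0]
    group1 = [p for p, a in zip(y_pred, s) if a == 1]
    return abs(positive_rate(group0) - positive_rate(group1))


def delta_eo(y_true, y_pred, s):
    """Equal opportunity gap: the same comparison, restricted to y_true == 1."""
    group0 = [p for t, p, a in zip(y_true, y_pred, s) if a == 0 and t == 1]
    group1 = [p for t, p, a in zip(y_true, y_pred, s) if a == 1 and t == 1]
    return abs(positive_rate(group0) - positive_rate(group1))


# Toy node labels, predictions, and sensitive attribute values.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
s = [0, 0, 0, 0, 1, 1, 1, 1]

sp_gap = delta_sp(y_pred, s)
eo_gap = delta_eo(y_true, y_pred, s)
```

Lower values indicate more similar treatment of the two groups, so an improvement in these metrics corresponds to a smaller gap.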

32 pages, 2983 KB  
Article
TS-SMOTE: An Improved SMOTE Method Based on Symmetric Triangle Scoring Mechanism for Solving Class-Imbalanced Problems
by Shihao Song and Sibo Yang
Symmetry 2025, 17(8), 1326; https://doi.org/10.3390/sym17081326 - 14 Aug 2025
Viewed by 392
Abstract
The imbalanced classification problem is a key research topic in machine learning, as the relevant algorithms tend to focus on the features and patterns of the majority class while learning the minority class insufficiently, resulting in unsatisfactory performance. Scholars have attempted to solve this problem and proposed many ideas at the data and algorithm levels. The SMOTE (Synthetic Minority Over-sampling Technique) method is an effective approach at the data level. In this paper, we propose an oversampling method based on SMOTE and a symmetric regular-triangle scoring mechanism. This method uses symmetrical triangles to flatten the plane, and then establishes a suitable scoring mechanism to select the minority samples that participate in the synthesis. After selecting the minority samples, it conducts multiple linear interpolations according to the established rules to generate new minority samples. In the experimental section, we select 30 imbalanced datasets to test the performance of the proposed method and some classical oversampling methods under different indicators. In order to demonstrate the performance of these oversampling methods with classifiers, we select three different classifiers and test their performance. The experimental results show that the TS-SMOTE method has the best performance. Full article
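The linear-interpolation step that SMOTE-style methods share can be sketched as below. This is plain SMOTE, not TS-SMOTE: the triangle scoring mechanism is not reproduced, and base samples are picked uniformly at random. The function name `smote_sketch`, the parameter `k`, and the toy points are hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours.

    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy 2-D minority class.
minority = [(1.0, 1.0), (1.5, 1.2), (2.0, 1.0), (1.2, 1.8)]
new_points = smote_sketch(minority, n_new=5)
```

Each synthetic point lies on the segment between two existing minority samples, so it stays inside the region the minority class already occupies. Variants such as the one described above differ mainly in how the participating samples are selected.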
(This article belongs to the Special Issue Advances in Neural Network/Deep Learning and Symmetry/Asymmetry)

30 pages, 1637 KB  
Article
Life Insurance Fraud Detection: A Data-Driven Approach Utilizing Ensemble Learning, CVAE, and Bi-LSTM
by Markapurapu John Dana Ebinezer and Bondalapu Chaitanya Krishna
Appl. Sci. 2025, 15(16), 8869; https://doi.org/10.3390/app15168869 - 12 Aug 2025
Viewed by 508
Abstract
Insurance fraud detection is a significant challenge due to rising numbers of fraudulent claims, class imbalance, and the growing complexity of fraudulent behaviour. Traditional machine learning models often struggle to generalize effectively when applied to high-dimensional and imbalanced datasets. This study proposes a data-driven framework for intelligent fraud detection employing three distinct modelling strategies: chaotic variational autoencoders (CVAEs), bidirectional long short-term memory (Bi-LSTM), and a hybrid random forest + Bi-LSTM technique. This study aims to evaluate and compare the effectiveness of generative, sequential, and ensemble-based models in identifying rare fraudulent claims within created datasets of 4000 life insurance applications containing 83 features. Following extensive preprocessing and model training, CVAEs achieved the highest accuracy (83.75%) but failed to detect many fraudulent cases due to their low recall (3.28%). The Bi-LSTM model outperformed the CVAEs in recall (5.98%) and F1-score, effectively capturing temporal dependencies within the data. The hybrid RF + Bi-LSTM model matched Bi-LSTM in recall but showed more stable ROC and precision–recall curves, indicating robustness and interpretability. This hybrid approach balances the strengths of feature-driven and sequential modelling, making it suitable for operational deployment. While Bi-LSTM achieved the best statistical performance, the hybrid model offers enhanced reliability in threshold-sensitive fraud applications. Full article

18 pages, 18060 KB  
Article
A Cross-Modal Multi-Layer Feature Fusion Meta-Learning Approach for Fault Diagnosis Under Class-Imbalanced Conditions
by Haoyu Luo, Mengyu Liu, Zihao Deng, Zhe Cheng, Yi Yang, Guoji Shen, Niaoqing Hu, Hongpeng Xiao and Zhitao Xing
Actuators 2025, 14(8), 398; https://doi.org/10.3390/act14080398 - 11 Aug 2025
Viewed by 403
Abstract
In practical applications, intelligent diagnostic methods for actuator-integrated gearboxes in industrial driving systems encounter challenges such as the scarcity of fault samples and variable operating conditions, which undermine diagnostic accuracy. This paper introduces a multi-layer feature fusion meta-learning (MLFFML) approach to address fault diagnosis problems in cross-condition scenarios with class imbalance. First, meta-training is performed to develop a mature fault diagnosis model on the source domain, obtaining cross-domain meta-knowledge; subsequently, meta-testing is conducted on the target domain, extracting meta-features from limited fault samples and abundant healthy samples to rapidly adjust model parameters. For data augmentation, this paper proposes a frequency-domain weighted mixing (FWM) method that preserves the physical plausibility of signals while enhancing sample diversity. Regarding the feature extractor, this paper integrates shallow and deep features by replacing the first layer of the feature extraction module with a dual-stream wavelet convolution block (DWCB), which transforms actuator vibration or acoustic signals into the time-frequency space to flexibly capture fault characteristics and fuses information from both amplitude and phase aspects; following the convolutional network, an encoder layer of the Transformer network is incorporated, containing multi-head self-attention mechanisms and feedforward neural networks to comprehensively consider dependencies among different channel features, thereby achieving a larger receptive field compared to other methods for actuation system monitoring. Furthermore, this paper experimentally investigates cross-modal scenarios where vibration signals exist in the source domain while only acoustic signals are available in the target domain, specifically validating the approach on industrial actuator assemblies. Full article