Search Results (1,686)

Search Parameters:
Keywords = robust representation learning

25 pages, 4248 KB  
Article
A Spatial Post-Multiscale Fusion Entropy and Multi-Feature Synergy Model for Disturbance Identification of Charging Stations
by Hui Zhou, Xiujuan Zeng, Tong Liu, Wei Wu, Bolun Du and Yinglong Diao
Energies 2026, 19(8), 1837; https://doi.org/10.3390/en19081837 - 8 Apr 2026
Abstract
The large-scale integration and grid connection of renewable energy sources and charging stations introduce a multitude of nonlinear and impact loads, resulting in more severe distortion and higher complexity of disturbance signals in power systems. As a consequence, power quality disturbances (PQDs) in active distribution networks, including overvoltage and harmonics, display greater randomness and diversity, which increases the challenge of PQD identification. To tackle this problem, this study presents a dual-channel early-fusion approach for PQD recognition based on Spatial Post-MultiScale Fusion Entropy (SMFE). SMFE is used as an entropy-based feature-construction pipeline in which a time–frequency representation is formed prior to spatial post-multiscale aggregation to produce a compact complexity map complementary to waveform morphology. Subsequently, a dual-channel model is constructed by integrating waveform-morphology input with SMFE-derived complexity features for joint learning. By leveraging the ConvNeXt architecture and a Squeeze-and-Excitation (SE) mechanism, a multimodal channel-recalibration model is implemented to emphasize informative feature responses during PQD recognition. Experimental verification with simulated signals shows that the proposed approach achieves an identification accuracy of 97.83% under an SNR of 30 dB, indicating robust performance under the tested noise settings.
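As a rough illustration of the Squeeze-and-Excitation (SE) channel recalibration named in the abstract above, the sketch below applies SE gating to two small feature maps. It is not the authors' ConvNeXt-based model: the function name `se_recalibrate`, the two toy channels, and all weight values are hypothetical, and biases are left out for brevity.

```python
import math

def se_recalibrate(feature_maps, w_reduce, w_expand):
    """Squeeze-and-Excitation over a list of 2-D channel maps.

    feature_maps: list of C channels, each a list of rows of floats.
    w_reduce: R x C weights (squeeze vector -> bottleneck), hypothetical values.
    w_expand: C x R weights (bottleneck -> per-channel gate), hypothetical values.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_maps
    ]
    # Excitation: bottleneck layer with ReLU, then expansion with sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[v * g for v in row] for row in ch]
        for ch, g in zip(feature_maps, gates)
    ]
    return scaled, gates

# Two hypothetical channels standing in for the waveform and SMFE inputs.
waveform_channel = [[1.0, 2.0], [3.0, 4.0]]
entropy_channel = [[0.5, 0.5], [0.5, 0.5]]
scaled, gates = se_recalibrate(
    [waveform_channel, entropy_channel],
    w_reduce=[[0.4, -0.2]],    # C=2 -> R=1
    w_expand=[[1.0], [-1.0]],  # R=1 -> C=2
)
```

Each gate lies strictly between 0 and 1, so a channel is attenuated rather than removed; with these made-up weights the first channel receives the larger gate.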

25 pages, 7549 KB  
Article
Unseen-Crop Plant Disease Classification via Disentangled Representation Learning
by Zhenzhen Wu, Jianli Guo, Wei Hou, Kun Zhou, Kerang Cao and Hoekyung Jung
Electronics 2026, 15(8), 1553; https://doi.org/10.3390/electronics15081553 - 8 Apr 2026
Abstract
Deep learning has accelerated progress in plant disease recognition, providing strong technical support for early diagnosis and precision management. However, models often lack robustness and generalization when confronted with novel crops absent from the training set, leading to a marked performance drop in cross-unseen-crop scenarios. Cross-crop generalization for plant disease recognition requires models to identify known disease categories in crop domains never observed during training. A central challenge is that disease symptoms are strongly coupled with crop-specific appearance cues, which severely degrades generalization. Here, TDC (Text-guided feature Disentanglement Contrast) is introduced as a feature-disentanglement framework for cross-crop plant disease recognition. The proposed method employs a dual-branch visual encoder to separately capture disease semantic representations and crop-domain representations, and it leverages a frozen CLIP text encoder to use disease and crop prompts for text-guided semantic anchoring. A semantic-anchor-only contrastive disentanglement strategy is further formulated under a hybrid label space, where crop-branch features are incorporated as stop-gradient hard negatives to suppress semantic–domain information leakage and strengthen the intra-class aggregation of the same disease across crops. Residual domain-discriminative cues are mitigated via domain-adversarial learning. During inference, only the disease branch is retained for classification, improving generalization while reducing deployment overhead. Experiments demonstrate that under the PlantVillage cross-crop setting, the method achieves 98.04% and 74.29% Top-1 accuracy on seen and unseen crop domains, respectively. Moreover, it attains 81.99% on a real-world field dataset of strawberry powdery mildew and 76.31% on a low-illumination degradation set, validating robustness under realistic imaging distribution shifts.
(This article belongs to the Special Issue Advances in Data-Driven Artificial Intelligence, 2nd Edition)

33 pages, 3919 KB  
Article
BiLSTM Guided LPA Planning, Re-Planning, and Backtracking for Effective and Efficient Emergency Evacuation
by Ramzi Djemai, Hamza Kheddar, Mohamed Chahine Ghanem, Karim Ouazzane and Erivelton Nepomuceno
Smart Cities 2026, 9(4), 65; https://doi.org/10.3390/smartcities9040065 - 7 Apr 2026
Abstract
Emergency evacuation in complex and dynamic building environments requires robust and adaptive routing strategies capable of responding to evolving hazards, blocked passages, and changing crowd behaviour. Most existing evacuation planners rely on static geometric representations and lack semantic awareness of the environment, limiting their ability to perform informed re-planning and backtracking when routes become unsafe. This paper proposes a neuro-symbolic evacuation planning framework that integrates Lifelong Planning A* (LPA*) with ontology-driven semantic reasoning and a Bidirectional Long Short-Term Memory (BiLSTM) prediction model. The building’s spatial and semantic knowledge is represented using the Web Ontology Language (OWL) and Resource Description Framework (RDF), enabling automated inference of implicit connections and enforcement of safety policies. The BiLSTM model learns temporal patterns from ontology-consistent evacuation trajectories and provides guidance for remaining-cost estimation and early prediction of routes likely to require backtracking, which is combined with a bounded semantic heuristic to preserve admissibility and optimality guarantees. Simulation results in a multi-floor academic building show that the proposed BiLSTM-guided semantic LPA* framework reduces average evacuation time by up to 9.6%, decreases node expansions by up to 32%, and increases evacuation success rates to 96.2% compared with a purely semantic baseline. The BiLSTM model also achieves strong predictive performance, with a test AUC of 0.92 for backtracking prediction and a next-state accuracy of 87.1%. The proposed framework is designed to support explainable, policy-compliant, and incrementally adaptable evacuation guidance under rapidly evolving emergency conditions.
14 pages, 16245 KB  
Article
Aging State Classification of Lithium-Ion Batteries in a Low-Dimensional Latent Space
by Limei Jin, Franz Philipp Bereck, Rüdiger-A. Eichel, Josef Granwehr and Christoph Scheurer
Batteries 2026, 12(4), 127; https://doi.org/10.3390/batteries12040127 - 7 Apr 2026
Abstract
Battery datasets, whether gathered experimentally or through simulation, are typically high-dimensional and complex, which complicates the direct interpretation of degradation behavior or anomaly detection. To overcome these limitations, this study introduces a framework that compresses battery signals into a low-dimensional representation using an autoencoder, enabling the extraction of informative features for state analysis. A central component of this work is the systematic comparison of latent representations obtained from two fundamentally different data sources: frequency-domain impedance data and time-domain voltage-current data. The close agreement of aging trajectories in both representations suggests that information traditionally derived from impedance analysis can also be captured directly from raw time-series signals. To better approximate real operating conditions, synthetic datasets are augmented with stochastic perturbations. In this context, latent spaces learned from idealized periodic inputs are contrasted with those derived from permuted and noise-contaminated signals. The resulting low-dimensional features are subsequently evaluated through a support vector machine with both linear and nonlinear kernel functions, allowing the categorization of battery states into fresh, aged and damaged conditions. The results demonstrate that the progression of battery degradation is consistently reflected in the latent space, independent of the input domain or signal quality. This robustness indicates that the proposed approach can effectively capture essential aging characteristics even under non-ideal conditions. Consequently, this framework provides a basis for developing advanced diagnostic strategies, including the design of pseudo-random excitation profiles for improved battery state assessment and optimized operational control.

32 pages, 6103 KB  
Article
An Optimal Deep Hybrid Framework with Selective Kernel U-Net for Skin Lesion Detection and Classification
by Guzal Gulmirzaeva, Robert Hudec, Baxtiyorjon Akbaraliev and Batirbek Samandarov
Bioengineering 2026, 13(4), 427; https://doi.org/10.3390/bioengineering13040427 - 6 Apr 2026
Viewed by 36
Abstract
Early and accurate detection of skin cancer is critical for reducing mortality rates, particularly for malignant melanoma. Automated analysis of dermoscopic images has gained significant attention due to its potential to support clinical diagnosis and overcome the limitations of manual inspection. Motivated by challenges such as image noise, low contrast, lesion variability, and redundant feature representation, this study proposes an optimal deep hybrid framework for skin lesion detection and classification. The objective of this work is to design a robust and efficient system that integrates advanced preprocessing, precise segmentation, optimal feature selection, and accurate classification. Initially, contrast enhancement using Contrast Limited Adaptive Histogram Equalization (CLAHE) and noise reduction using Wiener filtering are applied to improve image quality. Lesion regions are then segmented using a Selective Kernel U-Net (SK-UNet), which adaptively captures multi-scale spatial information. Subsequently, discriminative color, texture, and shape features are extracted and optimized using the Fossa Optimization Algorithm (FOA) to eliminate redundancy. A hybrid one-dimensional Convolutional Neural Network–Gated Recurrent Unit (1D-CNN–GRU) classifier is employed for final classification, learning both spatial and sequential feature patterns. Experimental evaluation on the ISIC and DermMNIST datasets demonstrates that the proposed framework achieves classification accuracies of 97.6% and 95.6%, respectively, outperforming several existing methods. The results confirm that the proposed hybrid framework provides reliable, accurate, and scalable skin cancer diagnosis, highlighting its potential for assisting clinical decision-making and early detection.
(This article belongs to the Special Issue Deep Learning for Medical Applications: Challenges and Opportunities)

29 pages, 547 KB  
Article
MRHL: Multi-Relational Hypergraph Learning for Next POI Recommendation
by Sai Zhao, Caisen Chen and Shuai He
Electronics 2026, 15(7), 1528; https://doi.org/10.3390/electronics15071528 - 6 Apr 2026
Viewed by 53
Abstract
With the rapid advancement of location-based services, next Point-of-Interest (POI) recommendation has emerged as a critical task in personalized mobility modeling and recommendation systems. It aims to predict users’ future locations based on their historical trajectories, thereby enhancing the personalization and intelligence of recommendation systems. Despite the promising progress, two key challenges remain insufficiently addressed. First, many existing methods overlook the dynamic evolution of user trajectories across multiple perspectives, resulting in entangled representations that fail to capture user intent accurately. Second, they often ignore the latent synergy across diverse perspectives, which limits the effective utilization of complementary information for recommendation. To address these issues, we propose a novel framework called MRHL. MRHL constructs multiple hypergraphs to represent distinct views of user behavior, including interaction frequency, time decay, and geographical proximity. An enhanced hypergraph convolutional network is employed to effectively model the high-order relationships within them. We propose a cascaded enhancement fusion mechanism that progressively integrates multi-view hypergraph representations to enrich the semantic information of user representations. In addition, a multi-relational contrastive learning strategy is developed to capture the consistent signals across different views, thereby enhancing the robustness and discriminative capability of user and POI representations. Extensive experiments on three public datasets consistently demonstrate that MRHL outperforms a range of strong baselines.
(This article belongs to the Special Issue Advances in Deep Learning for Graph Neural Networks)

29 pages, 1303 KB  
Article
An Enhanced Traffic Classifier Based on Self-Supervised Feature Learning
by Shaoqing Jiang, Xin Luo, Hongyi Wang, Gang Chen and Hongwei Zhao
Appl. Sci. 2026, 16(7), 3493; https://doi.org/10.3390/app16073493 - 3 Apr 2026
Viewed by 142
Abstract
Encrypted network traffic classification is an important research topic in the field of network security. Although deep learning-based methods have made progress, they still face three main challenges: first, the semantic information in encrypted traffic is inadequately represented, making it difficult for existing methods to effectively capture the hierarchical interaction relationships between packet-level and flow-level features; second, models rely on large amounts of labeled data for supervised training, resulting in high training costs and limited generalization ability in new scenarios; third, in existing self-supervised methods, the functions of the encoder and decoder are coupled, which restricts the full potential of the encoder’s representation learning. To address these issues, this paper proposes an Enhanced Traffic Classifier (ETC) based on self-supervised feature learning. The model first constructs a multi-level interactive traffic representation matrix, converting raw traffic into structured grayscale images that fuse packet-level and flow-level temporal features, thereby addressing the problem of missing semantic information. On this basis, an improved Masked Image Modeling Vision Transformer architecture is adopted. Through a three-stage decoupled design of encoder–regressor–decoder, the encoder focuses solely on feature extraction, the regressor performs masked representation prediction, and the decoder is only responsible for image reconstruction, thereby fully unleashing the encoder’s feature learning capability. Furthermore, during the fine-tuning stage, an Attentive Probing classification mechanism is introduced to replace the traditional linear classification head. By using learnable class query vectors to dynamically focus on semantic regions relevant to the classification target, the model’s recognition accuracy and robustness are further improved.
Experiments are conducted on five public datasets, including USTC-TFC2016 and CICIoT2022, as well as a self-built Human-Internet dataset. The results show that ETC significantly outperforms mainstream methods such as YaTC and ET-BERT in core metrics including accuracy and F1-score, while also demonstrating strong generalization in few-shot scenarios.

15 pages, 2437 KB  
Article
A Hybrid Self-ONN and Vision Mamba Architecture for Robust Radio Interference Recognition in GNSS Applications
by Nursultan Meirambekuly, Margulan Ibraimov, Bakyt Khaniyev, Beibit Karibayev, Alisher Skabylov, Nursultan Uzbekov, Sungat Koishybay, Timur Dautov, Ainur Khaniyeva and Bagdat Kozhakhmetova
Electronics 2026, 15(7), 1498; https://doi.org/10.3390/electronics15071498 - 3 Apr 2026
Viewed by 200
Abstract
Radio-frequency interference (RFI) poses a critical challenge for modern high-precision Global Navigation Satellite System (GNSS) applications, as both intentional and unintentional interference can significantly degrade positioning accuracy and reliability. With increasingly sophisticated interference sources, robust and computationally efficient automatic recognition methods are required for next-generation GNSS receivers. Although deep learning approaches show strong potential for interference detection, their high computational cost often limits deployment in resource-constrained navigation hardware. This paper proposes a hybrid deep learning architecture for radio interference recognition in high-precision GNSSs. The framework employs a dual-branch design integrating complementary signal representations. A Self-Organizing Operational Neural Network (Self-ONN) extracts nonlinear temporal features from raw one-dimensional signals, while a Vision Mamba state-space model processes two-dimensional time-frequency spectrograms obtained via Short-Time Fourier Transform (STFT). The fused features enable accurate classification of diverse interference types with high computational efficiency. Experiments on a synthetic dataset demonstrate that the proposed model achieves 99.83% accuracy and F1-score, outperforming ResNet18, VGG16, and Vision Transformer while reducing computational complexity by up to 42% and improving inference speed by up to 35%, supporting its applicability for intelligent interference monitoring in GNSS receivers.
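The abstract above feeds one branch with time-frequency spectrograms obtained via STFT. The following minimal sketch shows how such a spectrogram can be computed from a one-dimensional signal. The frame length, hop size, Hann window, and test tone are illustrative assumptions, not settings reported by the authors, and the naive DFT stands in for the FFT a real receiver pipeline would use.

```python
import cmath
import math

def stft_magnitude(signal, frame_len, hop):
    """Return a list of frames; each frame holds |DFT| for bins 0..frame_len//2."""
    # Hann window reduces spectral leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT for clarity; an FFT would be used in practice.
            coeff = sum(
                segment[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            spectrum.append(abs(coeff))
        frames.append(spectrum)
    return frames

# Hypothetical test tone: 4 cycles per 16-sample frame, so energy peaks in bin 4.
tone = [math.cos(2 * math.pi * 4 * n / 16) for n in range(64)]
spectrogram = stft_magnitude(tone, frame_len=16, hop=8)
peak_bins = [frame.index(max(frame)) for frame in spectrogram]
```

The result is a frames-by-bins grid of magnitudes, which is the kind of two-dimensional input an image-style model can consume; for the stationary tone used here every frame peaks in the same frequency bin.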

22 pages, 4903 KB  
Article
A Robust Lithium-Ion Battery Capacity Prediction Framework Using Multi-Point Voltage Temporal Features and an OOF-Trained Adaptive Gating Mechanism
by Lun-Yi Lung, Bo-Hao Zhou and Cheng-Chien Kuo
Energies 2026, 19(7), 1745; https://doi.org/10.3390/en19071745 - 2 Apr 2026
Viewed by 209
Abstract
Accurate capacity prediction is paramount for ensuring the operational safety and reliability of lithium-ion battery management systems (BMS). Nevertheless, contemporary data-driven approaches often grapple with limited feature representation—frequently relying solely on aggregate charging duration or noise measures—which compromises the robustness of these approaches. To address these limitations, this study proposes a robust framework integrating multi-point voltage temporal sampling (MVTS) with an adaptive gated hybrid ensemble learning strategy. The MVTS method is first used to extract high-dimensional geometric features from the constant-current (CC) charging phase (3.9 V–4.15 V), effectively capturing subtle degradation patterns. Subsequently, an unsupervised isolation forest algorithm is incorporated for automated anomaly detection and rectification, thereby augmenting data stability prior to training. In the fusion stage, a heterogeneous hybrid model comprising eXtreme gradient boosting (XGBoost) and long short-term memory (LSTM) is constructed. An adaptive gating mechanism based on random forest (RF) is added to dynamically weight the base learners. To mitigate data leakage during the stacking process, this study employs an out-of-fold (OOF) training strategy based on leave-one-battery-out (LOBO) cross-validation to generate unbiased meta-features for the gating model. This mechanism dynamically modulates fusion weights contingent upon the multi-point voltage features and model discrepancies, thereby accommodating diverse aging stages and capacity degradation patterns. Experimental results from the NASA battery aging dataset demonstrate that the proposed framework significantly outperforms single-model baselines in terms of RMSE and R², exhibiting superior adaptability and predictive precision.
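The out-of-fold, leave-one-battery-out idea in the abstract above can be sketched in a few lines. This is only an illustration of the general technique: the helper names, the straight-line base learner, and the toy capacity values are hypothetical and stand in for the XGBoost and LSTM learners and the NASA data used in the paper.

```python
def leave_one_group_out_oof(features, targets, groups, fit, predict):
    """Out-of-fold predictions where each fold holds out one whole group."""
    oof = [None] * len(targets)
    for held_out in sorted(set(groups)):
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        # The model never sees the held-out battery's samples while fitting.
        model = fit([features[i] for i in train_idx], [targets[i] for i in train_idx])
        for i in test_idx:
            oof[i] = predict(model, features[i])
    return oof

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in for a base learner)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept

# Hypothetical toy data: cycle index -> capacity (Ah) for three batteries.
cycles = [10, 50, 90, 10, 50, 90, 10, 50, 90]
capacity = [1.95, 1.80, 1.65, 1.90, 1.76, 1.60, 1.97, 1.83, 1.68]
battery_id = ["B1", "B1", "B1", "B2", "B2", "B2", "B3", "B3", "B3"]

oof_predictions = leave_one_group_out_oof(
    cycles, capacity, battery_id, fit_line, predict_line
)
```

Because every prediction for a battery comes from a model fitted without that battery, the resulting values can serve as meta-features for a second-stage model without leaking the target it is meant to predict.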

33 pages, 10259 KB  
Article
Multimodal Remote Sensing Image Classification Based on Dynamic Group Convolution and Bidirectional Guided Cross-Attention Fusion
by Lu Zhang, Yaoguang Yang, Zhaoshuang He, Guolong Li, Feng Zhao, Wenqiang Hua, Gongwei Xiao and Jingyan Zhang
Remote Sens. 2026, 18(7), 1066; https://doi.org/10.3390/rs18071066 - 2 Apr 2026
Viewed by 170
Abstract
The synergistic integration of Hyperspectral Imaging (HSI) and Light Detection and Ranging (LiDAR) data has become a pivotal strategy in remote sensing for precise land-cover classification. However, existing multimodal deep learning frameworks frequently suffer from intrinsic limitations, including rigid feature extraction protocols, underutilization of LiDAR-derived textural information, and asymmetric fusion mechanisms that fail to balance the contribution of spectral and elevation features effectively. To address these challenges, this paper proposes a novel framework named DGC-BCAF, which integrates Dynamic Group Convolution and Bidirectional Guided Cross-Attention Fusion to achieve adaptive feature representation and robust cross-modal interaction. First, a Dynamic Group Convolution (DGConv) module embedded within a ResNet18 backbone is designed to function as the central spatial context extractor. Unlike traditional group convolution, this module learns a dynamic relationship matrix to automatically group input channels, thereby facilitating flexible and context-aware feature representation that adapts to complex spatial distributions. Second, to overcome the insufficient exploitation of elevation data, we introduce a dedicated LiDAR texture encoding branch. This branch innovatively fuses Gray-Level Co-occurrence Matrix (GLCM) statistical features with multi-scale convolutional representations, capturing both geometric height information and fine-grained surface textural details that are critical for distinguishing objects with similar elevations. Finally, central to our architecture is the Bidirectional Cross-Attention Fusion (BCAF) module. Unlike standard unidirectional fusion approaches, BCAF employs LiDAR geometry to guide the selection of salient spectral bands, while simultaneously utilizing spectral signatures to emphasize informative LiDAR channels. This mutual guidance ensures a balanced contribution from both modalities.
Extensive experiments conducted on three benchmark datasets—Houston 2013, Trento, and MUUFL—demonstrate that DGC-BCAF consistently outperforms state-of-the-art methods in terms of overall accuracy, average accuracy, and Kappa coefficient. The results confirm that the proposed adaptive grouping and bidirectional guidance strategies significantly improve classification performance, particularly in distinguishing spectrally similar materials and delineating complex urban structures.
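The GLCM statistical features mentioned in the abstract above can be illustrated with a small sketch that builds a co-occurrence matrix for one pixel offset and derives two common texture statistics. The 3×3 patches, the four gray levels, and the choice of contrast and homogeneity are assumptions made for illustration; the abstract does not state which GLCM statistics or offsets the authors use.

```python
def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for one offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count both directions so the matrix is symmetric
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]

def contrast(p):
    """Weights each co-occurrence by the squared gray-level difference."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))

def homogeneity(p):
    """Rewards co-occurrences of equal or similar gray levels."""
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(len(p)) for j in range(len(p)))

# Hypothetical quantized height patches: a flat roof and a rough surface.
flat = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
rough = [[0, 3, 0], [3, 0, 3], [0, 3, 0]]
p_flat = glcm(flat, levels=4)
p_rough = glcm(rough, levels=4)
```

For the flat patch the contrast is 0 and the homogeneity is 1, while the alternating patch gives a contrast of 9 and a homogeneity of 0.1, which is the kind of separation that lets texture statistics tell apart surfaces at similar elevations.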

38 pages, 1145 KB  
Article
Transfer Learning Strategies for Comic Character Recognition in Low-Data Regimes: A Comparative Study
by Marco Parrillo, Luigi Laura and Alessandro Manna
Future Internet 2026, 18(4), 192; https://doi.org/10.3390/fi18040192 - 2 Apr 2026
Viewed by 207
Abstract
Image classification in low-data regimes remains a challenging problem, particularly in stylized visual domains where intra-class similarity and inter-class feature overlap limit discriminative capacity. This study presents a systematic evaluation of regularization and transfer learning strategies for multi-class comic character recognition under constrained data conditions. Four convolutional architectures are compared: (i) a baseline CNN trained from scratch, (ii) a regularized CNN incorporating data augmentation, dropout, and early stopping, (iii) a pretrained ResNet-50 used as a fixed feature extractor, and (iv) a partially fine-tuned ResNet-50 with selective layer unfreezing. Experiments are conducted on a custom four-class dataset exhibiting moderate class imbalance, evaluated using both a fixed 70/20/10 split and 5-fold cross-validation to assess generalization stability. Results indicate that shallow CNN architectures suffer from substantial overfitting, even when regularization is applied, whereas transfer learning significantly improves macro-averaged F1-score and out-of-distribution detection performance. Cross-validated results, the primary basis for inference given the dataset scale, show that both ResNet-50 strategies achieve equivalent mean accuracy of 95.0% (SD: ±0.4% for feature extraction, ±0.8% for fine-tuning; paired t = 0.00, p = 1.000), while shallow CNN architectures reach only 81–87%. Under a single fixed 70/20/10 partition (n = 69 test samples, 95% CI: ±9–12%), fine-tuning nominally reaches 98.5%; crucially, cross-validation deflates this figure to parity with feature extraction, confirming it reflects favorable partitioning rather than genuine architectural superiority. The primary finding is therefore that frozen ResNet-50 feature extraction is the recommended strategy: it matches fine-tuning in cross-validated generalization while requiring 15× fewer trainable parameters and exhibiting lower fold-to-fold variance.
The findings demonstrate that pretrained deep residual representations transfer effectively to stylized comic imagery and that evaluation protocol selection critically impacts perceived performance in small datasets. These results provide practical guidelines for robust model selection in domain-specific, limited-data image classification tasks.
(This article belongs to the Special Issue Innovations in Artificial Intelligence and Neural Networks)
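The abstract above compares two strategies with a paired t-test over cross-validation folds. A minimal sketch of that statistic follows; the per-fold accuracies are invented for illustration and are not the paper's results, and the function name `paired_t` is hypothetical.

```python
import math
import statistics

def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched fold scores."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the per-fold differences
    return statistics.mean(diffs) / (sd / math.sqrt(n)), n - 1

# Hypothetical per-fold accuracies (not the paper's numbers).
frozen = [0.950, 0.955, 0.945, 0.950, 0.950]
fine_tuned = [0.960, 0.940, 0.950, 0.945, 0.955]
t_stat, dof = paired_t(fine_tuned, frozen)
spread_frozen = statistics.stdev(frozen)
spread_fine_tuned = statistics.stdev(fine_tuned)
```

With these made-up scores both strategies average 0.95, so the t statistic is essentially zero, while the fine-tuned scores vary more from fold to fold. That is the pattern the abstract describes: equal mean accuracy but lower fold-to-fold variance for the frozen feature extractor.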

23 pages, 8650 KB  
Article
GAFR-Net: A Graph Attention and Fuzzy-Rule Network for Interpretable Breast Cancer Image Classification
by Lin-Guo Gao and Suxing Liu
Electronics 2026, 15(7), 1487; https://doi.org/10.3390/electronics15071487 - 2 Apr 2026
Viewed by 235
Abstract
Accurate classification of breast cancer histopathology images is essential for early diagnosis and effective clinical management. However, conventional deep learning models often exhibit performance degradation under limited labeled data and lack interpretability, which restricts their clinical applicability. To address these challenges, we propose GAFR-Net, a robust and interpretable Graph Attention and Fuzzy-Rule Network designed for histopathology image classification under scarce supervision (defined here as less than 10% labeled data). GAFR-Net constructs a similarity-driven graph to model inter-sample relationships and employs a multi-head graph attention mechanism to capture complex relational representations among heterogeneous tissue structures. Meanwhile, a differentiable fuzzy-rule module integrates intrinsic topological descriptors—such as node degree, clustering coefficient, and label consistency—into explicit and human-readable diagnostic rules. This architecture establishes transparent IF–THEN inference mappings that emulate the heuristic reasoning process of clinical experts, thereby enhancing model interpretability without relying on post-hoc explanation techniques. Extensive experiments conducted on three public benchmark datasets—BreakHis, Mini-DDSM, and ICIAR2018—demonstrate that GAFR-Net consistently surpasses state-of-the-art methods across multiple magnifications and classification settings. These results highlight the strong generalization capability and practical potential of GAFR-Net as a trustworthy decision-support framework for weakly supervised medical image analysis.
(This article belongs to the Special Issue Advances in Machine Learning for Image Classification)
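The topological descriptors named in the abstract above (node degree, clustering coefficient, label consistency) can be illustrated on a toy similarity graph. The following is a minimal sketch, not the authors' implementation: the k-nearest-neighbour construction, the cosine similarity, and all function names are assumptions made for illustration, since the abstract does not say how the graph is built.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 for zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def knn_graph(features, k):
    """Assumed construction: link each sample to its k most similar samples.

    Returns an undirected adjacency map {node: set(neighbours)}.
    """
    n = len(features)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        sims = sorted(
            ((cosine(features[i], features[j]), j) for j in range(n) if j != i),
            reverse=True,
        )
        for _, j in sims[:k]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def degree(adj, i):
    """Number of neighbours of node i."""
    return len(adj[i])


def clustering_coefficient(adj, i):
    """Fraction of neighbour pairs of node i that are themselves linked."""
    nbrs = list(adj[i])
    d = len(nbrs)
    if d < 2:
        return 0.0
    links = sum(
        1 for a in range(d) for b in range(a + 1, d) if nbrs[b] in adj[nbrs[a]]
    )
    return 2.0 * links / (d * (d - 1))


def label_consistency(adj, labels, i):
    """Fraction of node i's neighbours that share its label."""
    if not adj[i]:
        return 0.0
    return sum(1 for j in adj[i] if labels[j] == labels[i]) / len(adj[i])


# Toy data: three samples near one direction, two near another.
features = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 0, 1, 1]
adj = knn_graph(features, k=1)
descriptors = [
    (degree(adj, i), clustering_coefficient(adj, i), label_consistency(adj, labels, i))
    for i in range(len(features))
]
```

Each tuple in `descriptors` is the kind of per-node input a rule such as "IF degree is high AND label consistency is high THEN ..." could consume; how GAFR-Net actually fuzzifies these values is not described in the abstract.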

23 pages, 23579 KB  
Article
Image-Based Waste Classification Using a Hybrid Deep Learning Architecture with Transfer Learning and Edge AI Deployment
by Domen Verber, Teodora Grneva and Jani Dugonik
Mathematics 2026, 14(7), 1176; https://doi.org/10.3390/math14071176 - 1 Apr 2026
Viewed by 296
Abstract
Growing amounts of municipal waste and the need for efficient recycling demand automated and accurate classification systems. This paper investigates deep learning approaches for multi-class waste sorting based on image data, comparing three widely used convolutional neural network architectures (ResNet-50, EfficientNet-B0, and MobileNet V3) with a custom hybrid model (CustomNet). The dataset comprises 13,933 RGB images across 10 waste categories, combining publicly available samples from the Kaggle Garbage Classification dataset (61.1%) with images collected in house (38.9%). The three glass sub-categories (brown, green, and white glass) were merged into a single glass class to ensure consistent class representation across all dataset splits. Preprocessing steps include normalization, resizing, and extensive data augmentation to improve robustness and mitigate class imbalance. Transfer learning is applied to pretrained models, while CustomNet integrates feature representations from multiple backbones using projection layers and attention mechanisms. Performance is evaluated using accuracy, macro-F1, and ROC–AUC on a held-out test set. Statistical significance was assessed using paired t-tests and Wilcoxon signed-rank tests with Bonferroni correction across five-fold cross-validation runs. The results show that CustomNet achieves 97.79% accuracy, a macro-F1 score of 0.973, and a ROC–AUC of 0.992. CustomNet significantly outperforms EfficientNet-B0 and MobileNet V3 (p < 0.001, Bonferroni corrected), and it achieves performance parity with ResNet-50 (p = 0.383) at a substantially lower parameter count in the classification head (9.7 M vs. 25.6 M). These findings indicate that combining multiple feature extractors with attention mechanisms improves classification performance, supports qualitative model explainability via saliency visualization (Grad-CAM), and enables practical deployment on heterogeneous Edge AI platforms. Inference benchmarking on an NVIDIA Jetson Orin Nano demonstrated real-world deployment feasibility at 86.70 ms per image (11.5 FPS).
(This article belongs to the Special Issue The Application of Deep Neural Networks in Image Processing)
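The deployment figures quoted in the abstract above can be cross-checked with simple arithmetic. The sketch below only reuses numbers stated in the abstract (86.70 ms per image, 9.7 M vs. 25.6 M parameters); the function names are hypothetical, and single-image (batch size 1) inference is assumed when converting latency to throughput.

```python
def fps_from_latency_ms(latency_ms):
    """Frames per second for single-image inference at the given latency."""
    return 1000.0 / latency_ms


def relative_reduction(smaller, larger):
    """Fractional reduction when going from `larger` to `smaller`."""
    return 1.0 - smaller / larger


# 86.70 ms per image corresponds to roughly 11.5 frames per second.
fps = round(fps_from_latency_ms(86.70), 1)

# 9.7 M parameters versus 25.6 M is roughly a 62% reduction.
reduction_pct = round(100 * relative_reduction(9.7, 25.6), 1)

print(fps, reduction_pct)
```

The reported 11.5 FPS is consistent with the stated latency under the batch-size-1 assumption.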

8 pages, 540 KB  
Proceeding Paper
A Federated Learning Approach for Privacy-Preserving Automated Signature Verification
by Haris Veraros, Fotios Zantalis, Stylianos Katsoulis, Elias N. Zois and Grigorios Koulouras
Eng. Proc. 2026, 124(1), 100; https://doi.org/10.3390/engproc2026124100 - 1 Apr 2026
Viewed by 295
Abstract
The growing interconnectivity of digital systems has led to the massive collection and centralization of sensitive data, raising serious concerns about confidentiality and compliance with privacy regulations. Biometric authentication systems, such as offline signature verification, are particularly vulnerable. Federated learning (FL) provides a promising framework by enabling model training without exposing raw client data. However, keeping data strictly localized inherently creates severe data scarcity, which is a significant barrier to building robust deep learning (DL) models. This work investigates the feasibility of a privacy-preserving writer-dependent (WD) offline signature verification (OSV) system within an FL framework. To make local training viable under these constraints, we integrate complementary techniques into the federated pipeline: data augmentation is utilized to increase local sample diversity, while transfer learning provides robust pre-trained feature representations, drastically reducing the volume of data required for effective local fine-tuning. The proposed WD-OSV system was trained and evaluated on the popular CEDAR signature dataset, where it achieved an average area under the curve (AUC) of 0.8893 and an average binary accuracy (ACC) of 80.12% as preliminary results.
(This article belongs to the Proceedings of The 6th International Electronic Conference on Applied Sciences)
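The abstract above does not state which aggregation rule the federated pipeline uses. As a rough illustration of how a server can combine client updates without seeing raw data, the sketch below assumes FedAvg-style averaging weighted by local sample count; the function name and the flat-list parameter format are hypothetical.

```python
def federated_average(client_params, client_sizes):
    """Weighted average of client parameter vectors (FedAvg-style, assumed).

    client_params: list of equally long lists of floats, one per client.
    client_sizes:  number of local training samples held by each client.
    """
    total = sum(client_sizes)
    n_params = len(client_params[0])
    averaged = [0.0] * n_params
    for params, size in zip(client_params, client_sizes):
        weight = size / total  # clients with more data contribute more
        for idx, value in enumerate(params):
            averaged[idx] += weight * value
    return averaged


# Two hypothetical clients: one with 1 local sample, one with 3.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_params)
```

Only parameter vectors and sample counts leave each client in this scheme, which is the property the abstract relies on for privacy; the signature images themselves stay local.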

22 pages, 2730 KB  
Article
Ensemble Learning Based on Bagging and Hybrid Sampling for Food Safety Risk Prediction
by Dafang Li, Zhengyong Zhang, Qingchun Wu and Xin Chen
Foods 2026, 15(7), 1176; https://doi.org/10.3390/foods15071176 - 31 Mar 2026
Viewed by 191
Abstract
Food safety sampling inspections are critical for risk prevention in complex supply chains, yet the extremely low frequency of high-risk samples poses substantial challenges for accurate risk prediction. To address the limitations of conventional machine learning models under severe class imbalance, this study proposes a unified Bagging–Stacking framework that integrates stacking ensembles, bagging, and SMOTE–Tomek hybrid resampling to enhance minority-class detection in food safety risk prediction. The stacking ensemble serves as the core of the framework, combining five tree-based base learners with Logistic Regression as the meta-learner to enhance classification robustness. Balanced bootstrap subsets generated through bagging and SMOTE–Tomek hybrid resampling further improve minority-class representation, while a probability-based threshold optimization mechanism is incorporated to refine high-risk classification. Experiments on real-world inspection data show that the proposed framework substantially improves high-risk recall while simultaneously increasing precision, yielding the highest F1 among all compared models. It also maintains stable overall performance across varying test set proportions, indicating consistent generalization under different evaluation conditions. SHAP analysis identifies storage conditions, production month, shelf life, package, and food category as key contributors to risk prediction, aligning with established mechanisms of food safety risk formation. Overall, the proposed framework provides accurate, robust, and interpretable support for food safety risk prediction, offering practical value for proactive risk prevention and more efficient regulatory resource allocation.
(This article belongs to the Section Food Engineering and Technology)
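The probability-based threshold optimization mentioned in the abstract above can be sketched as a search over candidate cut-offs on predicted high-risk probabilities. The abstract does not specify the objective or the search grid, so the example below assumes F1 on the high-risk class as the objective and a 0.01 to 0.99 grid; the function names and the toy data are hypothetical.

```python
def f1_at_threshold(y_true, probs, threshold):
    """F1 of the positive (high-risk) class when predicting 1 if p >= threshold."""
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    fn = sum(1 for y, p in zip(y_true, probs) if p < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, probs, candidates=None):
    """Return the candidate threshold with the highest F1 (first one on ties)."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]  # assumed search grid
    return max(candidates, key=lambda t: f1_at_threshold(y_true, probs, t))


# Toy imbalanced data: six low-risk samples, two high-risk samples.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
probs = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.40, 0.60]

default_f1 = f1_at_threshold(y_true, probs, 0.5)
tuned = best_threshold(y_true, probs)
tuned_f1 = f1_at_threshold(y_true, probs, tuned)
print(default_f1, tuned, tuned_f1)
```

On this toy data the default 0.5 cut-off misses one of the two high-risk samples, while a lower cut-off recovers it at the cost of one false positive. In practice the threshold would be chosen on a validation split rather than on the test data.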