Search Results (6,478)

Search Parameters:
Keywords = deep convolutional neural network (CNN)

24 pages, 1009 KB  
Article
An Improved Method for Anomalous Traffic Detection in SDN Based on Gated Feature Fusion
by Ruize Gu, Xiaoying Wang, Fangfang Cui, Guoqing Yang, Shuai Liu and Panpan Qi
Future Internet 2026, 18(5), 270; https://doi.org/10.3390/fi18050270 - 20 May 2026
Abstract
Existing anomalous traffic detection methods based on feature fusion in Software-Defined Networking (SDN) lack adaptability in weight allocation mechanisms. Consequently, their detection accuracy and model generalization capabilities fail to meet practical security requirements. To solve these limitations, this paper proposes a refined detection method based on hybrid feature selection and gated fusion. First, the framework employs XGBoost combined with the Recursive Feature Elimination (RFE) algorithm. This process identifies shallow statistical features with high discriminative power. Simultaneously, the method utilizes a 1D Convolutional Neural Network (1D-CNN) integrated with a Squeeze-and-Excitation (SE) block to extract deep temporal semantic features. Subsequently, a tailored gated fusion mechanism incorporating linear projection layers for feature alignment adaptively integrates these two categories of features. The fused features are then input into a Multilayer Perceptron (MLP) to execute anomalous traffic detection. Experimental results demonstrate that the proposed method achieves superior performance. Specifically, on the InSDN Dataset, the binary and multi-classification accuracy rates reach 99.91% and 99.88%. Similarly, the accuracy rates on the NSL-KDD dataset are 99.78% and 99.76%. Finally, we established a local simulation environment. Experimental results demonstrate that our method attains an average precision exceeding 93% for anomalous traffic detection in simulated real scenarios. Full article
(This article belongs to the Section Cybersecurity)
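The abstract above describes a gated fusion step in which linear projection layers align shallow statistical features with deep 1D-CNN features before a gate blends them. The abstract does not give the exact formulation, so the following is only a minimal sketch of one common form, a sigmoid gate producing a per-dimension convex combination. The feature sizes, the random weights, and the helper names (`linear`, `random_layer`, `gated_fusion`) are assumptions made for illustration, not the authors' implementation.

```python
import math
import random

def linear(x, weights, bias):
    """Dense layer: W @ x + b, with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def sigmoid(v):
    """Element-wise logistic function."""
    return [1.0 / (1.0 + math.exp(-z)) for z in v]

def random_layer(n_out, n_in, rng):
    """Hypothetical random initialisation; a trained model would learn these."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def gated_fusion(shallow, deep, proj_s, proj_d, gate):
    # Project both feature vectors into a shared dimension (feature alignment).
    s = linear(shallow, *proj_s)
    d = linear(deep, *proj_d)
    # The gate sees both aligned vectors and emits one weight in (0, 1) per dimension.
    g = sigmoid(linear(s + d, *gate))
    # Convex combination: g weights the shallow branch, 1 - g the deep branch.
    fused = [gi * si + (1.0 - gi) * di for gi, si, di in zip(g, s, d)]
    return fused, g

rng = random.Random(0)
dim = 4                                  # shared dimension (assumed)
proj_s = random_layer(dim, 6, rng)       # 6 shallow features (assumed size)
proj_d = random_layer(dim, 8, rng)       # 8 deep features (assumed size)
gate = random_layer(dim, 2 * dim, rng)   # gate reads the concatenated projections

shallow = [rng.random() for _ in range(6)]
deep = [rng.random() for _ in range(8)]
fused, g = gated_fusion(shallow, deep, proj_s, proj_d, gate)
print(len(fused), all(0.0 < gi < 1.0 for gi in g))
```

In a full pipeline the fused vector would then be passed to the MLP classifier mentioned in the abstract. Because the gate is computed per sample, the balance between the two feature families can differ from one traffic flow to the next.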
20 pages, 4618 KB  
Article
A Deep Shale Gas Reservoir Rock Brittleness Index Prediction Method Based on a CNN-BiGRU-Attention Hybrid Model
by Feng Deng, Jin Wu, Chengyong Li, Liuting Chen, Yiding Wang and Yang Zeng
Appl. Sci. 2026, 16(10), 5112; https://doi.org/10.3390/app16105112 - 20 May 2026
Abstract
Hydraulic fracturing is a key technology for the commercial exploitation of deep shale gas reservoirs, and accurate prediction of rock-mechanical parameters is essential for optimizing these operations. Conventional approaches primarily rely on empirical formulas based on longitudinal and transverse wave velocities; however, obtaining transverse wave data is challenging, and these formulas often lack accuracy. Conventional machine learning algorithms also exhibit limited predictive performance and generalization due to the intrinsic heterogeneity of rock-mechanical data. Therefore, to address the extreme heterogeneity and complex nonlinear logging responses inherent in deep shale gas reservoirs in the Zigong (ZG) block, this study proposes a geology-tailored deep learning framework, CNN-BiGRU-AT. Unlike generic machine learning applications, this architecture is specifically designed to decode complex stratigraphic signals: the convolutional neural network (CNN) module extracts multi-scale spatial features to capture abrupt lithological transitions; the bidirectional gated recurrent units (BiGRUs) analyze the continuous depth-sequential dependencies of overlying and underlying strata; and the attention mechanism (AT) dynamically regulates the weight allocation of critical input geophysical parameters, thereby delivering geophysically informative and highly robust predictive performance. This paper employs the CNN-BiGRU-AT model to predict the brittleness index (BI), using the ZG block as an example. The results demonstrate that the coefficient of determination (R²) for the brittleness index on the test dataset reached 0.969, representing a 12% improvement over conventional models. The high accuracy of this model satisfies the precision requirements for predicting rock-mechanical parameters, thereby offering reliable theoretical support for optimizing hydraulic fracturing operations in deep shale gas reservoirs. Full article
29 pages, 1438 KB  
Article
Stability-Driven Feature Extraction–Kolmogorov–Arnold Network-Driven Ensemble Framework for Reliable Breast Cancer Detection
by Abdul Rahaman Wahab Sait and Yazeed Alkhurayyif
Electronics 2026, 15(10), 2207; https://doi.org/10.3390/electronics15102207 - 20 May 2026
Abstract
Breast cancer screening is a fundamentally probabilistic diagnostic task that requires precise identification of complex imaging characteristics from diverse patient cohorts. Despite improvements in deep learning techniques, current automatic tools are typically trained on well-curated datasets and do not generalize to heterogeneous data, thereby limiting their application. This study aims to address these shortcomings by introducing a more effective and generalizable framework for breast cancer classification that focuses on the stability of features, the learning of complementary representations, and improved decision modeling. The proposed methodology incorporates stability-driven feature extraction (SDFE) with a multi-branch architecture that consists of EfficientNetV2 (Convolutional neural networks (CNNs)), EfficientFormer (Vision transformers (ViTs)), and multi-layer perceptron (MLP)-Mixer models to extract various feature representations. To improve non-linear decision boundaries, it uses a Kolmogorov–Arnold Network (KAN)-based classification head and selects the most credible prediction via an adaptive voting mechanism. This model is trained using patient-level splitting on the VinDr-Mammo dataset, evaluated using five-fold cross-validation, and subsequently externally validated on the CBIS-DDSM dataset. Experimental findings demonstrate the consistent performance of the proposed model, with accuracies of 94.5% in cross-validation, 93.3% on the VinDr-Mammo test set, and 94.6% on CBIS-DDSM, surpassing other recent state-of-the-art solutions. It demonstrates enhanced robustness and cross-dataset generalization, offering a scalable, consistent framework for breast cancer classification that supports the development of computer-aided diagnostic systems. Full article
34 pages, 11332 KB  
Article
Artificial Intelligence for Autonomous Vehicles: Robustness Analysis in Complex Urban Traffic Scenarios
by Brandon Quezada-Godoy, Antonio Guerrero-González, Francisco García-Córdova, Francisco Lloret-Abrisqueta and Antonio Martínez-Espinosa
Electronics 2026, 15(10), 2204; https://doi.org/10.3390/electronics15102204 - 20 May 2026
Abstract
Autonomous driving in complex urban environments remains challenging due to perception uncertainty, dynamic multi-agent interactions, and control instability under adverse conditions. Despite advances in individual components, systematic evaluations of fully integrated modular pipelines under compounded urban disturbances remain scarce. This work presents a modular autonomous driving framework in CARLA Town10HD, integrating Convolutional Neural Network (CNN)-based perception using ResNet-18, global path planning via A* algorithm, and two control strategies: a classical Proportional–Integral–Derivative (PID) controller and a Deep Q-Network (DQN) agent with adaptive geometric steering assistance. A structured protocol assessed robustness across five scenarios: Heavy Rain, Dense Fog, Nighttime Driving, Dense Traffic, and Combined Extreme Conditions. The perception module achieved F1-scores close to 0.99 for traffic-sign, pedestrian, and lane classification; results reflect synthetic CARLA data and should not be interpreted as real-world generalization. The PID controller produced smoother trajectories with lower steering oscillations, while the DQN agent achieved faster traversal times at the cost of higher control variability. Route efficiency remained around 0.96 under isolated disturbances and decreased to 0.52 under compounded conditions, confirming sensitivity to multi-factor complexity. This study contributes a reproducible multi-scenario benchmark quantifying stability–adaptability trade-offs between classical and learning-based control, identifying scenario generalization and simulation-to-reality transfer as key future directions. Full article
(This article belongs to the Special Issue Electronic Architecture for Autonomous Vehicles)
18 pages, 5937 KB  
Article
Portable Holonomic Educational Robot Platform for Home Laboratory—Study Case: AI-Based Electromyography Control
by Erick Alexander Noboa, Lourdes Ruiz, György Eigner and Péter Galambos
Technologies 2026, 14(5), 308; https://doi.org/10.3390/technologies14050308 - 20 May 2026
Abstract
The post-pandemic evolution of education involving mechatronics and machine learning has shifted the demand for robotic hardware from centralized laboratories to accessible laboratories in home environments. This paper presents a portable three-wheeled holonomic robotic platform designed for remote research and home office experimentation. The proposed system utilizes a modular design and low-cost philosophy comprising a custom embedded control system driven by an ESP32-WROOM microcontroller, which manages a closed-loop PID velocity controller using Hall effect feedback from three DC micromotors. In contrast, external nodes allow the reception, conditioning, and classification of 8-channel surface electromyography (sEMG) data sampled at 500 Hz. To address the non-stationarity and stochastic noise in raw sEMG signals, this study implements a hybrid Deep Learning (DL) architecture that complements 2D Convolutional Neural Networks (CNN) for spatial feature extraction with Long Short-Term Memory (LSTM) networks for temporal context awareness. This model decodes the neuromuscular intent of the user into real-time holonomic velocity vectors, achieving validation accuracies of 80.51% for horizontal movement, 84.86% for vertical translation, and 99.56% for the Fist/no-Fist state. By synthesizing advanced AI-based teleoperation with a portable design, this study establishes a scalable framework for the next generation of “laboratory-at-home” educational tools and research regardless of physical location. Full article
18 pages, 1186 KB  
Article
Autonomous Reinforcement Learning-Based Intrusion Detection for IoT Cyber Defense
by Ammar Odeh
Digital 2026, 6(2), 41; https://doi.org/10.3390/digital6020041 - 19 May 2026
Abstract
The rapid proliferation of Internet of Things (IoT) devices has dramatically expanded the attack surface for cyber threats, exposing critical infrastructure to sophisticated intrusion attempts that traditional static intrusion detection systems (IDS) fail to counter effectively. This paper proposes an autonomous reinforcement learning (RL)-based IDS framework for dynamic IoT networks, capable of adaptive, real-time threat detection without human intervention. The proposed system integrates a Deep Q-Network (DQN) agent with a hybrid convolutional neural network–long short-term memory (CNN-LSTM) feature extractor to identify and classify malicious network traffic across 33 attack categories. We evaluate the framework on two recent, publicly available benchmark datasets: CICIoT2023, comprising 8.94 GB of traffic from 105 real IoT devices, and CIC IoT-DIAD 2024, a flow-based dataset with diverse attack and benign scenarios. Experimental results demonstrate superior detection performance compared to baseline classifiers, including SVM, Random Forest, and standalone deep learning models, with improved F1-score, reduced false alarm rate (FAR), and lower detection latency. The reward-shaping strategy explicitly penalizes false positives, addressing a key limitation of prior RL-based IDS approaches. This work contributes a scalable, dataset-agnostic autonomous defense architecture suitable for real-world IoT deployment. Full article
(This article belongs to the Special Issue Intelligent and Autonomous Cyber Defense Systems)
41 pages, 1712 KB  
Review
Machine Learning-Based Optimization for Renewable Energy Systems: A Comprehensive Review
by Mohammad Shehab, Afaf Edinat, Mariam Al Ghamri, Mamdouh Gomaa, Fatima Alhaj, Israa Wahbi Kamal and Ahmed E. Fakhry
Algorithms 2026, 19(5), 405; https://doi.org/10.3390/a19050405 - 18 May 2026
Viewed by 63
Abstract
Machine learning (ML) has become a key enabling technology for optimizing renewable energy systems and supporting global sustainability objectives. This paper presents a comprehensive review of recent advances in ML-based optimization techniques applied to clean and renewable energy systems, with particular emphasis on wind energy, hybrid energy systems, energy storage, and intelligent energy management. A systematic literature review covering peer-reviewed publications from 2021 to 2025 was conducted, resulting in the analysis of 138 high-quality journal and conference studies. The reviewed studies were categorized according to evolutionary algorithm-based hybrid models, classical neural networks, and deep learning architectures, including Convolutional Neural Network (CNN), LSTMs, GRUs, and attention-based models. The analysis demonstrates that hybrid ML–metaheuristic frameworks significantly enhance forecasting accuracy, system reliability, fault diagnosis, and multi-objective optimization compared to traditional methods. These intelligent approaches directly contribute to Sustainable Development Goals SDG-7 (Affordable and Clean Energy), SDG-9 (Industry, Innovation, and Infrastructure), and SDG-13 (Climate Action). Key challenges and future research directions are discussed, highlighting the need for scalable, explainable, and real-time ML solutions to enable resilient, low-carbon, and sustainable energy systems. Full article
(This article belongs to the Section Evolutionary Algorithms and Machine Learning)
44 pages, 45556 KB  
Article
Clinicopathological Characteristics and Prediction of Overall Survival and Death Within 2 Years in Diffuse Large B-Cell Lymphoma Based on Histological Images and Deep Learning
by Joaquim Carreras
Biomedicines 2026, 14(5), 1134; https://doi.org/10.3390/biomedicines14051134 - 17 May 2026
Viewed by 182
Abstract
Background: Diffuse large B-cell lymphoma (DLBCL) is one of the most frequent lymphomas. To date, it is not possible to identify which DLBCL patients will have an aggressive clinical evolution only by using hematoxylin and eosin (H&E) histological images. Methods: This study predicted the prognosis of DLBCL using H&E images, computer vision and deep learning. The series included 114 DLBCL cases, split into 2 prognostic groups according to overall survival, and 44 cases of reactive lymphoid tissue. Results: The curve fitting and slope analysis showed a point of inflection at 2 years (24 months), which differentiated patients with aggressive clinical evolution (“Dead < 2 years”, b1 = −0.024) from the rest with moderate clinical evolution (“Others”, b1 = −0.003). Twenty different convolutional neural networks (CNNs) were used, and explainable artificial intelligence (XAI) was also applied. The final model based on DarkNet-19 predicted prognosis groups with high performance (test set accuracy = 96.3%). The other performance parameters were precision (94.5%), recall (95.0%), false positive rate (3.1%), specificity (96.9%), and F1 score (94.7%). XAI, including grad-CAM, occlusion sensitivity, and image-LIME, confirmed that the CNN focused on the correct areas. Hybrid partitioning to prevent information leakage with patient-based analysis, image classification between DLBCL and 44 cases of reactive lymphoid tissue, and hyperparameter tuning were also successfully performed. Correlation with the clinicopathological characteristics found that the Dead < 2 years group was correlated with stage III–IV, International Prognostic Index (IPI) High + High/intermediate, progressive disease, non-GCB cell-of-origin, CD10−, BCL2+, and Epstein–Barr virus (EBER)+. Analysis of the microenvironment, immune checkpoint, cell cycle, and germinal center markers showed that Dead < 2 years had higher IL10, PD-L1, and CD163 levels and lower E2F1 protein expression. 
No differences were found for Ki67, CSF1R, CASP8, TNFAIP8, LMO2, MYC, MDM2, CDK6, and TP53 markers at a quantitative level. Conclusions: The DLBCL overall survival can be predicted using H&E histological images and deep learning using the 2-year (24 months) point (similar to POD24). This trained CNN can be used as a pretrained model for transfer learning in the future. Full article
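The performance figures quoted in this abstract can be cross-checked against each other, since F1 is the harmonic mean of precision and recall and specificity is the complement of the false positive rate. The sketch below assumes the reported values are single pooled figures; if they are averaged per class, the identities need not hold exactly. The function name `f1_score` is chosen here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions."""
    return 2 * precision * recall / (precision + recall)

precision, recall = 0.945, 0.950   # values reported in the abstract
false_positive_rate = 0.031        # value reported in the abstract

f1 = f1_score(precision, recall)
specificity = 1 - false_positive_rate
print(f"F1 = {f1:.1%}, specificity = {specificity:.1%}")
# prints "F1 = 94.7%, specificity = 96.9%"
```

Both results agree with the F1 score (94.7%) and specificity (96.9%) stated in the abstract.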
27 pages, 7263 KB  
Article
LEViM-Net: A Lightweight EfficientViM Network for Earthquake Building Damage Assessment
by Qing Ma, Dongpu Wu, Yichen Zhang, Jiquan Zhang, Jinyuan Xu and Yechi Yao
Remote Sens. 2026, 18(10), 1592; https://doi.org/10.3390/rs18101592 - 15 May 2026
Viewed by 123
Abstract
Building damage and collapse are the main sources of serious casualties and financial losses during earthquakes, which are among the most destructive natural disasters that endanger human life and property. Therefore, quick and precise post-earthquake building damage assessment is essential for risk assessment and emergency action. Convolutional neural networks (CNNs) primarily concentrate on local features and frequently ignore global contextual information within and across buildings, despite the fact that deep learning-based techniques allow automated damage identification. Transformer-based approaches, on the other hand, are good at capturing global dependencies, but their large memory and processing costs restrict their usefulness. As a result, existing networks still struggle to achieve an effective balance between accuracy and efficiency. To address this issue, this study proposes a lightweight and efficient network for post-earthquake building damage assessment. Specifically, we develop a two-stage method based on EfficientViM with an encoder–decoder architecture. In the encoder, Mamba is introduced to extract multi-scale change features with long-range dependencies, leveraging the state space model to preserve global modeling capability while significantly reducing computational complexity. In the decoder, two lightweight modules are designed to further enhance discriminative capability and computational efficiency. The network finally outputs building localization and pixel-level building damage, respectively. Experiments were conducted on four earthquake events from the BRIGHT dataset using a three-for-training and one-for-testing cross-event rotation evaluation strategy. The results demonstrate that LEViM-Net requires only 30.94 M parameters and 27.10 G FLOPs. In addition, for the Türkiye earthquake event, the proposed method achieves an F1 score of 80.49%, an overall accuracy (OA) of 88.17%, and a mean intersection over union (mIoU) of 49.73%. 
The proposed model enables efficient remote-sensing-based mapping of macroscopic and image-visible building damage, providing timely support for early-stage emergency response. Full article
(This article belongs to the Special Issue Advances in AI-Driven Remote Sensing for Geohazard Perception)
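The three-for-training, one-for-testing cross-event rotation mentioned in the abstract above amounts to holding out each earthquake event once. A minimal sketch follows, assuming four events; the event identifiers are placeholders rather than the actual BRIGHT event names, and `cross_event_rotation` is a name chosen for illustration.

```python
def cross_event_rotation(events):
    """Yield (train_events, test_event) pairs, holding each event out once."""
    for i, held_out in enumerate(events):
        train = events[:i] + events[i + 1:]
        yield train, held_out

# Placeholder identifiers (hypothetical, not the dataset's real event names).
events = ["event_A", "event_B", "event_C", "event_D"]

folds = list(cross_event_rotation(events))
for train, test in folds:
    print(train, "->", test)
```

Evaluating on an event never seen during training measures cross-event generalization, which is a stricter test than a random split of image tiles drawn from the same events.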
17 pages, 2811 KB  
Article
Efficacy of Spectral-Aided Visual Enhancer in Classification of Esophageal Cancer
by Kok-Yean Koh, Arvind Mukundan, Riya Karmakar, Chaudhary Tirth Atulbhai, Tsung-Hsien Chen, Wei-Chun Weng and Hsiang-Chen Wang
Cancers 2026, 18(10), 1609; https://doi.org/10.3390/cancers18101609 - 15 May 2026
Viewed by 274
Abstract
Background/Objectives: Esophageal cancer is one of the major global causes of cancer mortality, and the 5-year survival rate remains below 20% because many cases are detected late. In this study, a Spectral-Aided Vision Enhancer (SAVE) algorithm was utilized to convert conventional white-light endoscopic images (WLI) into hyperspectral-like narrow-band imaging (NBI) images for machine-learning classification of Dysplasia, Normal, and Squamous Cell Carcinoma (SCC). Methods: A total of 762 WLI images obtained from Kaohsiung Medical University were augmented to 1074 using the Albumentations library, employing vertical flipping, horizontal flipping, and rotations. The SAVE conversion pipeline employs a 24-patch Macbeth color checker for calibration, γ-correction, CIE XYZ transformation, and multivariate regression to interpolate spectral bands, yielding an average color difference of 2.79 (CIEDE2000) from true NBI. The training outcomes and performance metrics illustrate the versatility of the machine learning/deep learning models, namely Random Forest (RF), Support Vector Machine (SVM), and Convolutional Neural Network (CNN), which were trained and evaluated on both the original WLI and SAVE datasets. Performance metrics were analyzed based on precision, recall, accuracy, and F1-score. Results: The CNN achieved an accuracy of 100% on SAVE data, compared to 93% for WLI. The accuracy of RF improved from 91% on WLI to 96% on SAVE, while that of SVM increased from 79% to 84%. These improvements indicate that SAVE amplifies diagnostically valuable spectral variations, resulting in significant enhancements in pre-cancer/SCC sensitivity. Conclusions: The proposed SAVE method demonstrates significant potential for enhancing endoscopic imaging and advancing computer-aided diagnosis in esophageal cancer screening, with applicability in other gastrointestinal imaging scenarios as well. Full article
(This article belongs to the Special Issue Advances in Endoscopic Management of Esophageal Cancer)
30 pages, 1071 KB  
Article
An Enhanced Hybrid CNN–LSTM Model for Improved Precipitation Forecasting
by Huthaifa Al-Omari, Murad A. Yaghi and Layan Alrifai
Algorithms 2026, 19(5), 394; https://doi.org/10.3390/a19050394 - 15 May 2026
Viewed by 87
Abstract
Accurate precipitation forecasting is essential for water resource management, flood early-warning systems, and agriculture, but remains difficult because of the nonlinear and highly variable spatiotemporal nature of rainfall. This paper compares four deep learning architectures (a standalone LSTM, a standalone CNN, a hybrid CNN–LSTM, and a Transformer encoder) against three classical baselines (persistence, day-of-year climatology, and per-grid-point ARIMA) for daily precipitation forecasting over Washington State at lead times of one to four days. A 40-year ERA5 dataset (1985–2024) of near-surface air temperature, mean sea-level pressure, and total precipitation is split into training (1985–2012), validation (2013–2015), and test (2016–2024) periods, with the test years held out completely. Each (model, horizon) is trained with three random seeds and evaluated in physical units (mm/day). On the held-out test period, the hybrid CNN–LSTM achieves the lowest RMSE at every horizon h ≥ 2, with R² = 0.576 ± 0.007 and RMSE = 15.08 ± 0.07 mm/day at h = 4. Diebold–Mariano tests, paired t-tests, and bootstrap 95% confidence intervals confirm that the CNN–LSTM advantage over the LSTM is statistically significant at horizons 2–4 (but not at h = 1), while CNN–LSTM is significantly better than every classical baseline and the Transformer at every horizon. The headline result is reproduced under a rolling-origin temporal cross-validation across three non-overlapping splits (R² ∈ [0.576, 0.590]). Practically, the sub-millisecond inference cost of the CNN–LSTM makes it directly deployable in operational forecasting pipelines used for flood early-warning, irrigation scheduling, and reservoir management, where even modest improvements in 3–4-day-ahead RMSE translate into measurable risk reduction and improved decision lead time for water managers and emergency planners. Full article
(This article belongs to the Special Issue Artificial Intelligence in Sustainable Development)
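The abstract above scores forecasts with RMSE and the coefficient of determination and uses persistence as one of its classical baselines. The sketch below illustrates how those two metrics are computed for a one-day-ahead persistence forecast. The six-value series is invented toy data in mm/day, not ERA5 output, and the helper names `rmse` and `r_squared` are chosen for illustration.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between two equally long sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

# Toy daily precipitation series in mm/day (invented for illustration).
series = [0.0, 2.0, 5.0, 1.0, 0.0, 3.0]

# Persistence at a one-day lead: tomorrow's forecast equals today's value.
observed = series[1:]
predicted = series[:-1]

score_rmse = rmse(observed, predicted)
score_r2 = r_squared(observed, predicted)
print(f"RMSE = {score_rmse:.3f} mm/day, R2 = {score_r2:.3f}")
```

On this toy series the persistence forecast yields a negative coefficient of determination, meaning it does worse than predicting the mean of the observations. Baselines of this kind give a floor that a learned model has to beat.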
30 pages, 7003 KB  
Article
Facial Expression Recognition in Anime and Manga Characters: A Comparative Study of Vision Transformers and Convolutional Neural Networks
by Marco Parrillo, Elia Santoro, Luigi Laura and Valerio Rughetti
Information 2026, 17(5), 484; https://doi.org/10.3390/info17050484 - 15 May 2026
Viewed by 236
Abstract
Facial expression recognition (FER) is a well-established task in computer vision, yet its application to non-photorealistic domains, such as anime and manga, remains largely underexplored. The stylized, exaggerated, and often non-proportional facial features of illustrated characters present unique challenges for deep learning models trained predominantly on realistic imagery. In this work, we construct a balanced dataset of 3000 manga and anime face images spanning six emotion categories (Angry, Embarrassed, Happy, Manic–Euphoric, Sad, Scared) and conduct a systematic comparison of two major deep learning paradigms: Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). Specifically, we evaluate ResNet-18, ResNet-50, ViT-B/16, and ViT-S/16 under four fine-tuning strategies: linear probing, partial fine-tuning, full fine-tuning, and progressive unfreezing, enabling a controlled comparison of both architectural families and transfer learning depth. Our results show that fine-tuning strategy significantly impacts performance: the best configuration (ViT-B/16 with progressive unfreezing) achieves 81.33% test accuracy (single run, seed 42), compared to 61.33% for the weakest linear probe baseline (ViT-S/16), a gap of 20.00 percentage points. To isolate architectural differences from strategy effects, we note that under full fine-tuning, the only strategy applied identically to all four models, ViT-S/16 (76.00%) outperforms ResNet-18 (74.44%) by 1.56 percentage points and ViT-B/16 (74.22%) by 1.78 percentage points, confirming a modest but consistent architectural advantage for Transformers once backbone adaptation is permitted. Vision Transformers benefit disproportionately from fine-tuning, and the relative ranking of architectures changes across fine-tuning regimes. Confusion matrix analysis reveals persistent cross-class confusion between visually similar emotions (e.g., Happy vs. Embarrassed), while the highly distinctive Manic–Euphoric category is consistently well recognized across all architectures. To the best of our knowledge, this is the first work to conduct a controlled multi-architecture, multi-strategy transfer learning benchmark specifically for FER in anime and manga, revealing findings that are not predictable from photographic FER literature and that carry direct practical implications for model selection in non-photorealistic visual recognition tasks. The anime and manga domain provides a uniquely controlled testbed for studying transfer learning under deliberate stylization, where the domain gap from realistic imagery is not an artifact of image degradation or environmental noise but a principled artistic choice with codified visual conventions; observing that fine-tuning depth dominates architectural choice in this domain suggests the same conclusion likely holds in other non-photorealistic transfer scenarios such as medical illustrations, architectural drawings, and synthetic training data. Full article

36 pages, 1533 KB  
Review
Medical Image Segmentation Methods: A Decision-Guided Survey Covering 2D/3D CNNs, Transformers, VLMs, SAM-Based Models and Diffusion Approaches
by Kadir Sabanci, Busra Aslan and Muhammet Fatih Aslan
Bioengineering 2026, 13(5), 555; https://doi.org/10.3390/bioengineering13050555 - 15 May 2026
Viewed by 367
Abstract
Recent advances in medical image segmentation have introduced a wide spectrum of deep learning paradigms, including 2D/3D convolutional neural networks (CNNs), transformer-based architectures, vision-language models (VLMs), prompt-driven foundation models such as Segment Anything Model (SAM), and diffusion-based approaches. Although these methods have demonstrated remarkable performance across MRI, CT, PET, ultrasound, and endoscopic imaging, the rapid proliferation of architectures has created methodological uncertainty regarding optimal model selection under varying clinical and data constraints. Existing surveys primarily focus on architectural categorization, yet provide limited guidance for decision-oriented model selection. This study presents a comprehensive and decision-guided survey that systematically analyzes segmentation paradigms across imaging modalities, task types, dataset characteristics, and evaluation protocols. Beyond taxonomy, we propose a practical model selection framework that links clinical scenarios, such as small lesion detection, multi-organ 3D segmentation, limited-data regimes, and domain shift, to appropriate segmentation strategies. Furthermore, robustness, generalization, annotation variability, and benchmarking reproducibility are critically examined. By integrating architectural taxonomy, cross-modal comparative analysis, and a structured decision framework, this work provides a clinically oriented roadmap for selecting segmentation methods and highlights future research directions toward reliable and reproducible medical AI systems. Full article

31 pages, 2165 KB  
Article
Class Imbalance in IoMT Datasets: Evaluating Balancing Strategies for Learning-Based Attack Detection
by Eren Gencturk, Beste Ustubioglu, Guzin Ulutas and Iraklis Symeonidis
Appl. Sci. 2026, 16(10), 4921; https://doi.org/10.3390/app16104921 - 15 May 2026
Viewed by 315
Abstract
Internet of Medical Things (IoMT) devices are inherently vulnerable to cyberattacks, typically due to their limited processing power and memory capacity. Their widespread use in healthcare poses a significant security risk, threatening patient data privacy and the continuity of services. This study examines the effects of class-balancing strategies on the performance of machine and deep learning models using openly available IoMT datasets. In this context, four balancing methods (RandomUnderSampler, SMOTE, Borderline-SMOTE, and ADASYN) were applied to three open-access IoMT datasets: ECU-IoHT, WUSTL, and CICIoMT2024. Performance analyses were conducted using five machine learning algorithms (AdaBoost, Logistic Regression, Random Forest, XGBoost, and K-Nearest Neighbor (KNN)) and two deep learning algorithms (Convolutional Neural Networks (CNN) and Deep Neural Networks (DNN)). In the highly imbalanced binary setting of the CICIoMT2024 dataset, the combination of RandomUnderSampler and SMOTE under the balanced-training/original-testing scenario produced the strongest improvement, increasing the F1-Score from the unbalanced baseline to 99.87% for Random Forest and 99.86% for XGBoost across repeated runs. However, the benefit of balancing was not universal. In datasets with stronger class separability, such as ECU-IoHT, and in several multi-class settings, the effect of balancing was limited or, in some cases, inferior to the unbalanced baseline. These findings indicate that balancing is most effective under specific conditions, particularly in highly imbalanced binary tasks, and should be validated using class-sensitive metrics rather than overall performance alone. Full article
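The balanced-training/original-testing scenario and the call for class-sensitive metrics can be illustrated with a small sketch. This is not the study's pipeline: it uses plain random undersampling only (no SMOTE), and the toy data, the function names, and the always-benign predictor are assumptions made for illustration.

```python
import random
from collections import Counter


def random_undersample(X, y, seed=0):
    """Downsample every class to the size of the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    n_min = min(len(rows) for rows in by_class.values())
    X_bal, y_bal = [], []
    for label, rows in by_class.items():
        for xi in rng.sample(rows, n_min):
            X_bal.append(xi)
            y_bal.append(label)
    return X_bal, y_bal


def f1_for_class(y_true, y_pred, positive):
    """F1 score of one class, returning 0.0 when it is undefined."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy imbalanced data (hypothetical): label 1 = attack, label 0 = benign.
X_train = list(range(100))
y_train = [1 if i < 5 else 0 for i in X_train]
X_test = list(range(100, 140))
y_test = [1 if i < 102 else 0 for i in X_test]

# Balance the training split only; the test split keeps its original skew.
X_bal, y_bal = random_undersample(X_train, y_train)
train_counts = Counter(y_bal)
test_counts = Counter(y_test)

# A degenerate model that always predicts "benign" looks good on accuracy
# but scores zero on the attack-class F1.
always_benign = [0] * len(y_test)
accuracy = sum(t == p for t, p in zip(y_test, always_benign)) / len(y_test)
attack_f1 = f1_for_class(y_test, always_benign, positive=1)
print(train_counts, test_counts, accuracy, attack_f1)
```

Leaving the test split untouched keeps the reported metrics representative of the class ratio a deployed detector would face.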
(This article belongs to the Section Computing and Artificial Intelligence)

16 pages, 6736 KB  
Article
Hyperparameter Tuning of Inception CNNs Using Genetic Algorithms for Automatic Defect Detection
by Ambra Korra, Anduel Kuqi and Indrit Enesi
Computers 2026, 15(5), 309; https://doi.org/10.3390/computers15050309 - 13 May 2026
Viewed by 185
Abstract
Automated defect detection in industrial casting processes is important for improving product quality while reducing the cost of manual inspection. In this work, two deep convolutional neural network (CNN) architectures, InceptionV3 and InceptionResNetV2, are evaluated for the binary classification of defects in submersible pump impellers. A genetic algorithm (GA) is used to optimize key hyperparameters, including dropout rate, learning rate, and dense layer configuration, while model complexity is assessed through Pareto-based analysis. Single-run optimization results show that InceptionV3 achieves high classification accuracy (99.0%) with lower model complexity than InceptionResNetV2 (98.75%). Repeated experiments using different random seeds demonstrate relatively stable performance across runs, with InceptionV3 achieving an accuracy of 0.9913 ± 0.003 and InceptionResNetV2 achieving 0.9860 ± 0.0076. Additional experiments were conducted using random-search baselines and classification-head ablation studies (Flatten vs. Global Average Pooling). These experiments showed that optimization strategy and architectural design choices influence both predictive performance and computational complexity. The environmental impact of the training process is evaluated using CodeCarbon, with energy consumption ranging from 0.083 to 0.098 kWh and carbon emissions ranging from 2.008 to 2.401 g CO2eq for InceptionV3 and InceptionResNetV2, respectively. Overall, the results suggest that the most effective configuration depends on the evaluated architecture and experimental setting, highlighting the importance of balancing accuracy, model complexity, and computational efficiency in industrial defect detection systems. Full article
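The genetic-algorithm search over dropout rate, learning rate, and dense layer size can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' setup: the search ranges, the operators (tournament selection, uniform crossover, Gaussian mutation, elitism), and especially `toy_fitness` are hypothetical. A real run would replace `toy_fitness` with the validation accuracy of a trained network.

```python
import random

# Hypothetical search space; the abstract does not give the actual ranges.
SPACE = {
    "dropout": (0.0, 0.6),
    "log10_lr": (-5.0, -2.0),
    "dense_units": (64, 1024),
}


def random_individual(rng):
    return {
        "dropout": rng.uniform(*SPACE["dropout"]),
        "log10_lr": rng.uniform(*SPACE["log10_lr"]),
        "dense_units": rng.randint(*SPACE["dense_units"]),
    }


def toy_fitness(ind):
    # Placeholder objective peaked at dropout=0.3, lr=1e-3, 256 units.
    return -((ind["dropout"] - 0.3) ** 2
             + (ind["log10_lr"] + 3.0) ** 2
             + ((ind["dense_units"] - 256) / 256) ** 2)


def crossover(a, b, rng):
    # Uniform crossover: each gene comes from either parent with equal chance.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in SPACE}


def mutate(ind, rng, rate=0.2):
    child = dict(ind)
    for key, (lo, hi) in SPACE.items():
        if rng.random() < rate:
            if key == "dense_units":
                child[key] = rng.randint(lo, hi)
            else:
                step = rng.gauss(0, 0.1 * (hi - lo))
                child[key] = min(hi, max(lo, child[key] + step))
    return child


def tournament(pop, rng, k=3):
    return max(rng.sample(pop, k), key=toy_fitness)


def run_ga(generations=20, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(pop, key=toy_fitness)  # elitism: the best always survives
        history.append(toy_fitness(elite))
        children = [
            mutate(crossover(tournament(pop, rng), tournament(pop, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        pop = [elite] + children
    best = max(pop, key=toy_fitness)
    history.append(toy_fitness(best))
    return best, history


best, history = run_ga()
print(best)
```

Because the elite individual is copied unchanged into each new generation, the best fitness in `history` never decreases. The sketch optimizes a single objective and does not attempt the Pareto-based complexity analysis mentioned in the abstract.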