Search Results (1,194)

Search Parameters:
Keywords = Deep Fully Convolutional Network

27 pages, 12041 KB  
Article
FPGA-Based CNN Acceleration on Zynq-7020 for Embedded Ship Recognition in Unmanned Surface Vehicles
by Abdelilah Haijoub, Aissam Bekkari, Anas Hatim, Mounir Arioua, Mohamed Nabil Srifi and Antonio Guerrero-Gonzalez
Sensors 2026, 26(5), 1626; https://doi.org/10.3390/s26051626 - 5 Mar 2026
Abstract
Unmanned surface vehicles (USVs) increasingly rely on vision-based perception for safe navigation and maritime surveillance, while onboard computing is constrained by strict size, weight, and power (SWaP) budgets. Although deep convolutional neural networks (CNNs) offer strong recognition performance, their computational and memory requirements pose significant challenges for deployment on low-cost embedded platforms. This paper presents a hardware–software co-design architecture and deployment study for CNN acceleration on a heterogeneous ARM–FPGA system, targeting energy-efficient near-sensor processing for embedded maritime applications. The proposed approach exploits a fully streaming hardware architecture in the FPGA fabric, based on line-buffered convolutions and AXI-Stream dataflow, while the ARM processing system is responsible for lightweight configuration, scheduling, and data movement. The architecture was evaluated using representative CNN models trained on a maritime ship dataset. Our experimental results on a Zynq-7020 system-on-chip demonstrate that the proposed co-design strategy achieves a balanced trade-off between throughput, resource utilisation, and power consumption under tight embedded constraints, highlighting its suitability as a practical building block for onboard perception in USVs. Full article
(This article belongs to the Section Vehicular Sensing)
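
The line-buffered streaming convolution at the heart of this architecture can be illustrated with a small software model. The sketch below is an illustrative Python/numpy rendering, not the authors' FPGA implementation: it processes an image as a row stream while keeping only K rows resident, which is the property that makes a line-buffer design memory-efficient in hardware.

```python
import numpy as np

def line_buffered_conv2d(rows, kernel):
    """Stream a KxK sliding-window correlation over image rows, keeping
    only K rows in memory at a time (a software model of an FPGA line
    buffer; CNN "convolutions" conventionally skip the kernel flip)."""
    k = kernel.shape[0]
    buf = []                          # the K-row line buffer
    for row in rows:
        buf.append(np.asarray(row, dtype=float))
        if len(buf) > k:
            buf.pop(0)                # oldest line retires as a new one arrives
        if len(buf) == k:
            window = np.stack(buf)                    # K x W
            w = window.shape[1]
            out = np.empty(w - k + 1)
            for x in range(w - k + 1):
                out[x] = np.sum(window[:, x:x + k] * kernel)
            yield out                 # one output row once the buffer is full
```

Feeding a 5x5 image through a 3x3 kernel yields three output rows, matching a "valid" 2-D correlation, while never holding more than three input rows at once.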

31 pages, 3408 KB  
Article
Grad-CAM Enhanced Explainable Deep Learning for Multi-Class Lung Cancer Classification Using DE-SAMNet Model
by Murat Kılıç, Merve Bıyıklı, Abdulkadir Yelman, Hüseyin Fırat, Hüseyin Üzen, İpek Balikçi Çiçek and Abdulkadir Şengür
Diagnostics 2026, 16(5), 757; https://doi.org/10.3390/diagnostics16050757 - 3 Mar 2026
Abstract
Background/Objectives: Lung cancer (LC) is the leading cause of cancer-related mortality worldwide, making early and accurate diagnosis crucial for improving patient outcomes. Although chest computed tomography (CT) enables detailed assessment of lung abnormalities, manual interpretation is time-consuming, requires expertise, and is prone to diagnostic variability. To address these challenges, this study proposes DE-SAMNet, a hybrid deep learning framework for automated multi-class LC classification from CT scans. Methods: The model integrates two pre-trained convolutional neural networks—DenseNet121 and EfficientNetB0—operating in parallel to extract complementary multi-scale features. A Spatial Attention Module (SAM) is applied to each feature stream to emphasize clinically important regions. Final classification is performed through a compact fusion mechanism involving global average pooling, batch normalization, and a fully connected layer. DE-SAMNet was evaluated on two datasets: a public dataset (IQ-OTH/NCCD) with benign, malignant, and normal cases, and a private clinical dataset including benign, malignant, cystic, and healthy cases. Results: On the public dataset, the model achieved a 99.00% F1-score, 98.41% recall, 99.64% precision, and 99.54% accuracy. On the private dataset, it obtained 95.96% accuracy, 95.99% precision, 96.04% F1-score, and 96.21% recall, outperforming existing approaches. To enhance reliability, explainable AI (XAI) techniques such as Grad-CAM were used to visualize the model’s decision rationale. The resulting heatmaps effectively highlight lesion-specific regions, offering transparency and supporting clinical interpretability. Conclusions: This explainability strengthens trust in automated predictions and demonstrates the clinical potential of the proposed system. Overall, DE-SAMNet delivers a highly accurate and interpretable solution for early LC detection. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
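
Grad-CAM's heat-map construction is simple enough to state in a few lines. The numpy sketch below shows only the core weighting step, with the convolutional layer's activations and gradients supplied directly as arrays rather than obtained from a real network's forward/backward pass.

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM heatmap from a conv layer's activations and gradients.

    feature_maps, gradients: arrays of shape (C, H, W) holding the layer's
    forward activations and the gradient of the target class score with
    respect to those activations."""
    # channel importance: global-average-pool the gradients
    weights = gradients.mean(axis=(1, 2))                       # (C,)
    # weighted combination of feature maps, then ReLU
    cam = np.maximum(np.tensordot(weights, feature_maps, axes=1), 0.0)
    # normalise to [0, 1] for overlaying on the input image
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam
```

The ReLU keeps only regions that positively influence the target class, which is what makes the resulting heatmaps lesion-specific.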

14 pages, 1058 KB  
Article
QCNN-Inspired Variational Circuits for Enhanced Noise Robustness in Quantum Deep Q-Learning
by Louyang Yu, Wenbin Yu, Yadang Chen and Chengjun Zhang
Information 2026, 17(3), 250; https://doi.org/10.3390/info17030250 - 3 Mar 2026
Abstract
Quantum reinforcement learning (QRL) is often evaluated under idealized, noiseless assumptions, yet realistic quantum devices inevitably introduce noise that can severely degrade performance. This paper improves the robustness of quantum deep Q-learning (QDQN) by redesigning the variational quantum circuit (VQC) used in its value-function approximator. Motivated by recent advances in quantum convolutional neural networks (QCNNs), we construct four QCNN-inspired VQC variants (Models A–D) by combining representative QCNN two-qubit building blocks with an explicit fully connected (all-to-all) layer. Using a 10-fold evaluation protocol at a fixed noise level p = 0.005, Model D achieves the best robustness, reducing the mean number of episodes required to reach a target reward from 1981 (baseline) to 1243. Under a stricter success criterion, Model D also doubles the empirically observed noise-tolerance boundary from 0.002 to 0.004. These results indicate that carefully chosen QCNN-style circuit components and connectivity can significantly improve the noise robustness of QDQN-like QRL agents. Full article
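
The fixed depolarizing noise level used here (p = 0.005) has a compact mathematical form. As an illustrative sketch only (the paper's circuits and noise placement are not reproduced), the channel and its cumulative effect on state purity can be modeled as:

```python
import numpy as np

def depolarize(rho, p):
    """Depolarizing channel of strength p on density matrix rho:
    with probability p the state is replaced by the maximally mixed
    state I/d. This is the standard noise model used to stress-test
    variational circuits."""
    d = rho.shape[0]
    return (1.0 - p) * rho + p * np.eye(d) / d

def purity(rho):
    """Tr(rho^2): 1 for a pure state, 1/d for the maximally mixed state."""
    return np.real(np.trace(rho @ rho))
```

Applying the channel once per circuit layer shows why deeper variational circuits are more fragile: purity decays geometrically toward 1/d with the number of noisy layers.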

17 pages, 2481 KB  
Article
Soft Sensor Model of f-CaO Content in Cement Clinker Based on Self-Attention and Time Convolutional Network
by Siyuan Zhou and Le Yang
Information 2026, 17(3), 230; https://doi.org/10.3390/info17030230 - 1 Mar 2026
Abstract
The quality of cement clinker is strongly linked to its free calcium oxide (f-CaO) content. Therefore, real-time detection of f-CaO content is crucial for reducing energy consumption and stabilizing clinker quality. This work presents a Temporal Convolutional Network (TCN) that incorporates a self-attention mechanism for handling coupled time-series data from process variables. This model utilizes TCN to capture the time series coupling relationship among multiple input variables and extract multivariable time series features that affect f-CaO content. On this basis, a self-attention mechanism is introduced to focus on nonlinear features that have a significant impact on the output variable. The self-attention mechanism enhances the model’s ability through three key aspects: dynamic feature weighting, global context awareness, and interpretable feature selection. Combined with TCN’s time feature extraction, a robust f-CaO content prediction framework is constructed. Finally, a mapping relationship between nonlinear features and output is established through a fully connected layer, enabling real-time measurement of f-CaO content. Experimental comparisons with existing deep learning-based soft sensors demonstrate the superior performance of our model. Full article
(This article belongs to the Special Issue New Deep Learning Approach for Time Series Forecasting, 2nd Edition)
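
The self-attention step layered on top of the TCN features follows the standard scaled dot-product form. A minimal numpy sketch, with projection matrices supplied as arguments (in the actual model they would be learned):

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over a sequence of feature vectors.

    x: (T, D) time steps of extracted features (e.g. the TCN's multivariable
    time-series features); wq/wk/wv: (D, D) projection matrices. Each output
    step is a weighted mix over all steps, giving the global context
    awareness and dynamic feature weighting described in the abstract."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(x.shape[1])        # (T, T) similarities
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # row-wise softmax
    return attn @ v
```

With a single time step the softmax collapses to 1, so the output reduces to the value projection alone, a handy sanity check.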

26 pages, 4104 KB  
Article
Deep Convolution–Bidirectional GRU Neural Network Surrogate Model for Productivity Prediction of Multi-Fractured Horizontal Wells
by Tong Zhou, Cong Xiao, Jie Liu and Xianliang Jiang
Energies 2026, 19(5), 1187; https://doi.org/10.3390/en19051187 - 27 Feb 2026
Abstract
A productivity simulation for hydraulically fractured wells with complex fracture geometry involves a heavy computational burden and is therefore not suitable for engineering-scale fracture-optimization designs and production-analysis applications. This paper develops a productivity-prediction surrogate model based on a deep convolution–bidirectional gated recurrent unit temporal network (DC-BiGRU) framework where a deep convolutional neural network is used to extract features from fracture images, while a BiGRU model was designed to fully capture valuable information from the production sequence. Some additional inputs, e.g., cluster spacing and stage spacing, that account for different fracture-placement designs in horizontal wells were also considered. A large number of shale-gas production data samples at different times were generated using a fractured-horizontal-well productivity simulator under diverse hydraulic-fracture geometries and bottom-hole flowing pressures. The surrogate model had relative errors below 10% with an average error of about 6%. Compared to high-fidelity capacity prediction simulators, the computational efficiency of the deep learning surrogate models was improved by two to three orders of magnitude. The runtime of the high-fidelity numerical simulator was about 20 min, while the surrogate model, which was run on an NVIDIA Tesla P100 GPU (NVIDIA, Santa Clara, CA, USA), took less than 1 s, which is almost negligible. The proposed surrogate model resolved the low efficiency of the productivity simulation for complex-fracture hydraulic fracturing wells in unconventional reservoirs, enabling rapid dynamic forecasting of fractured-well productivity. Full article
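
The BiGRU branch of the surrogate can be sketched at the cell level. This illustrative numpy version runs the production sequence forwards and backwards and concatenates the two final hidden states; the weight shapes and single-layer structure are assumptions for illustration, not the paper's exact configuration.

```python
import numpy as np

def gru_cell(x, h, wz, wr, wh):
    """One GRU step: update gate z, reset gate r, candidate state."""
    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))
    xh = np.concatenate([x, h])
    z = sigmoid(wz @ xh)
    r = sigmoid(wr @ xh)
    h_tilde = np.tanh(wh @ np.concatenate([x, r * h]))
    return (1 - z) * h + z * h_tilde

def bigru(seq, params_f, params_b):
    """Bidirectional pass: scan the sequence in both directions and
    concatenate the final hidden states, so both early- and late-time
    production information reaches the prediction head."""
    hf = np.zeros(params_f[0].shape[0])
    hb = np.zeros(params_b[0].shape[0])
    for x in seq:
        hf = gru_cell(x, hf, *params_f)
    for x in reversed(seq):
        hb = gru_cell(x, hb, *params_b)
    return np.concatenate([hf, hb])
```

Because each new state is a convex combination of the previous state and a tanh candidate, hidden values stay bounded in (-1, 1) from a zero start.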

28 pages, 7556 KB  
Article
RFM-Net: A Convolutional Neural Network for Customer Segment Classification
by Kadriye Filiz Balbal and Derya Birant
Appl. Sci. 2026, 16(5), 2223; https://doi.org/10.3390/app16052223 - 25 Feb 2026
Abstract
Customer Segment Classification is a machine learning task in marketing analytics that involves assigning customers to predefined categories using features derived from historical transactional data. However, conventional approaches, such as statistical and clustering-based algorithms, may face challenges in fully capturing the nonlinear relationships in customer data, which can lead to limited insights and suboptimal segmentation outcomes. This paper introduces RFM-Net, an approach that integrates Deep Learning with Recency, Frequency, and Monetary (RFM) analysis for customer segment classification. By leveraging RFM features as input and labeled customer segments as output, we designed a specialized Convolutional Neural Network (CNN) model tailored for classification tasks. In the proposed method, labels are generated by a rule-based logic from RFM scores and then used as supervised ground truth. Accordingly, learning an expert-defined mapping is employed to model customer segmentation, rather than discovering a new segmentation structure. The proposed method enables businesses to classify customers into strategically meaningful segments such as Champions, Loyal Customers, At Risk, and Hibernating, thereby facilitating effective and targeted marketing strategies. Unlike traditional CNN architectures, RFM-Net offers a more compact, lightweight, and computationally efficient model with fewer layers and parameters, supporting improved interpretability and reduced risk of overfitting. Experimental results conducted on a real-world dataset demonstrated the effectiveness of RFM-Net with an accuracy of 94.33%. The results of this study showed a relative average increase of 13.17% compared to the results reported in previous studies on the same dataset. 
The core contribution of this research lies in combining the powerful generalization capabilities of deep learning with the effectiveness of RFM analysis, offering a robust solution for data-driven customer relationship management. Full article
(This article belongs to the Special Issue Exploring AI: Methods and Applications for Data Mining)
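
The rule-based generation of labels from RFM scores can be made concrete. The quintile scoring below is standard practice; the specific segment rule table is a simplified common-convention illustration (using R and F only), not necessarily the authors' exact mapping.

```python
import numpy as np

def rfm_scores(recency, frequency, monetary):
    """Quintile-based R, F, M scores on a 1-5 scale (5 = best).

    Recency is inverted: more recent purchases (smaller values) score
    higher. Ties and bucket edges follow a simple rank-based rule."""
    def quintile(x, reverse=False):
        ranks = np.argsort(np.argsort(np.asarray(x)))   # 0..n-1 ranks
        score = np.clip(1 + (5 * ranks) // len(ranks), 1, 5)
        return 6 - score if reverse else score
    return (quintile(recency, reverse=True),
            quintile(frequency),
            quintile(monetary))

def segment(r, f):
    """Map an (R, F) score pair to a named segment (simplified rule table;
    monetary score omitted here for brevity)."""
    if r >= 4 and f >= 4:
        return "Champions"
    if r >= 3 and f >= 3:
        return "Loyal Customers"
    if r <= 2 and f >= 3:
        return "At Risk"
    return "Hibernating"
```

Labels produced this way become the supervised ground truth for the CNN, which then learns the expert-defined mapping rather than discovering a new segmentation.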

30 pages, 1870 KB  
Article
DL-MFFSSnet: A Multi-Feature Fusion-Based Dynamic Collaborative Spectrum Sensing Method in a Satellite–Terrestrial Converged System
by Chao Tang, Yueyun Chen, Guang Chen, Liping Du, Zhen Wang and Huan Liu
Electronics 2026, 15(4), 905; https://doi.org/10.3390/electronics15040905 - 23 Feb 2026
Abstract
Satellite–terrestrial spectrum sensing plays a crucial role in enhancing spectrum efficiency through reusing spectra. However, in a satellite–terrestrial converged system, the large SNR range, non-Gaussian signal characteristics and noise uncertainty pose significant challenges for spectrum sensing. In this paper, we investigate a downlink spectrum sensing framework where multiple terrestrial BSs act as a secondary system to sense idle satellite spectra through multi-domain feature-level fusion of sensing signals. To enhance the characterization of signal/noise features, we provide a strategy for fusing multiple features, including energy, power spectral density, cyclic autocorrelation function, higher-order moments, sparse ratio, and I/Q samples, constructing two feature tensors: one of statistical features and one of I/Q components. Then, we propose a deep-learning-enabled multi-feature fusion spectrum sensing method (DL-MFFSSnet) based on a dual-branch deep neural network architecture that takes the two constructed feature tensors as inputs. In the statistical feature processing branch, CNN and channel self-attention are incorporated to capture intra-channel correlations and the relative inter-channel contributions of different feature modalities. In the I/Q branch, multi-scale dilated convolutions and spatial self-attention are introduced to analyze dependencies across different temporal positions and multi-scale spatial features. The feature maps extracted from both branches are passed through fully connected layers for deep feature fusion, achieving accurate spectrum sensing. Extensive simulation results demonstrate that the DL-MFFSSnet method outperforms existing state-of-the-art algorithms. Full article
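
A few of the statistical features fused by such a multi-feature sensing front end can be computed directly from I/Q samples. The definitions below are illustrative (the cyclic autocorrelation feature is omitted for brevity, and the sparsity threshold is an assumption):

```python
import numpy as np

def sensing_features(iq, sparsity_thresh=0.1):
    """Statistical features from a 1-D complex array of I/Q samples:
    average energy, peak-to-mean ratio of a periodogram PSD estimate,
    a fourth-order moment, and a sparsity ratio (fraction of
    low-magnitude frequency bins)."""
    n = len(iq)
    energy = np.sum(np.abs(iq) ** 2) / n
    psd = np.abs(np.fft.fft(iq)) ** 2 / n          # periodogram estimate
    psd_peak_ratio = psd.max() / psd.mean()
    m4 = np.mean(np.abs(iq) ** 4)                  # higher-order moment
    sparsity = np.mean(psd < sparsity_thresh * psd.max())
    return np.array([energy, psd_peak_ratio, m4, sparsity])
```

A pure tone concentrates its power in one frequency bin, so its PSD peak ratio and sparsity are near their maxima, whereas white noise spreads power across all bins; this contrast is exactly what makes such features discriminative for occupied-versus-idle decisions.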

27 pages, 7867 KB  
Article
A Multi-Scale Object Detection Network with Integrated Spatial-Channel Collaborative Attention for Remote Sensing Images
by Lijun Ma, Chengjun Xu, Kun Jiao, Wenming Pei, Hongfei Zhang, Lanfeng Liu, Bin Deng and Juan Wu
Sensors 2026, 26(4), 1370; https://doi.org/10.3390/s26041370 - 21 Feb 2026
Abstract
In remote sensing object detection, current models typically employ feature extraction modules and attention mechanisms to tackle issues such as significant scale variations among targets, cluttered backgrounds, and the subtle characteristics of small objects. Nevertheless, existing feature extraction approaches often depend on convolution kernels with fixed sizes, which can blur the contours of large objects and provide inadequate feature representation for small objects. Moreover, many attention mechanisms simply combine spatial and channel attention, without fully considering the deep integration between spatial and channel features, consequently leading to high-dimensional features and considerable computational overhead. To overcome these shortcomings, this paper introduces a multi-scale object detection network with integrated spatial-channel collaborative attention for remote sensing images. This approach enhances feature perception and representation for multi-scale targets, particularly small targets, through the design of the cross-channel multi-scale feature extraction module (CC-MSFE). Furthermore, a new channel-spatial cross-attention mechanism (CSCA) is introduced, comprising the channel attention mechanism (CA), the spatial attention mechanism (SA), and the cross-attention fusion module (CAFM). This design fosters dynamic interaction and joint optimization across channel and spatial dimensions, thereby improving detection accuracy while effectively reducing computational cost. The efficacy of the proposed model is evaluated on three publicly available remote sensing datasets. Experimental results show that the model achieves a mAP of 78.1% on the DIOR dataset and of 90.6% on the HRRSD dataset, outperforming YOLOv11 by 0.7% and 1.4%, respectively. On the RSOD dataset, it attains a mAP of 96.5%, surpassing YOLOv8 by 2.1%. 
In addition, the proposed method maintains a notably lower parameter count and computational complexity compared to existing approaches, achieving an effective balance between detection accuracy and computational efficiency. Full article
(This article belongs to the Section Remote Sensors)

35 pages, 1423 KB  
Review
Analysis of Biological Images and Quantitative Monitoring Using Deep Learning and Computer Vision
by Aaron Gálvez-Salido, Francisca Robles, Rodrigo J. Gonçalves, Roberto de la Herrán, Carmelo Ruiz Rejón and Rafael Navajas-Pérez
J. Imaging 2026, 12(2), 88; https://doi.org/10.3390/jimaging12020088 - 18 Feb 2026
Abstract
Automated biological counting is essential for scaling wildlife monitoring and biodiversity assessments, as manual processing currently limits analytical effort and scalability. This review evaluates the integration of deep learning and computer vision across diverse acquisition platforms, including camera traps, unmanned aerial vehicles (UAVs), and remote sensing. Methodological paradigms ranging from Convolutional Neural Networks (CNNs) and one-stage detectors like You Only Look Once (YOLO) to recent transformer-based architectures and hybrid models are examined. The literature shows that these methods consistently achieve high accuracy—often exceeding 95%—across various taxa, including insect pests, aquatic organisms, terrestrial vegetation, and forest ecosystems. However, persistent challenges such as object occlusion, cryptic species differentiation, and the scarcity of high-quality, labeled datasets continue to hinder fully automated workflows. We conclude that while automated counting has fundamentally increased data throughput, future advancements must focus on enhancing model generalization through self-supervised learning and improved data augmentation techniques. These developments are critical for transitioning from experimental models to robust, operational tools for global ecological monitoring and conservation efforts. Full article

22 pages, 4598 KB  
Article
Deep Learning Based Correction Algorithms for 3D Medical Reconstruction in Computed Tomography and Macroscopic Imaging
by Tomasz Les, Tomasz Markiewicz, Malgorzata Lorent, Miroslaw Dziekiewicz and Krzysztof Siwek
Appl. Sci. 2026, 16(4), 1954; https://doi.org/10.3390/app16041954 - 15 Feb 2026
Abstract
This paper introduces a hybrid two-stage registration framework for reconstructing three-dimensional (3D) kidney anatomy from macroscopic slices, using CT-derived models as the geometric reference standard. The approach addresses the data-scarcity and high-distortion challenges typical of macroscopic imaging, where fully learning-based registration (e.g., VoxelMorph) often fails to generalize due to limited training diversity and large nonrigid deformations that exceed the capture range of unconstrained convolutional filters. In the proposed pipeline, the Optimal Cross-section Matching (OCM) algorithm first performs constrained global alignment—translation, rotation, and uniform scaling—to establish anatomically consistent slice initialization. Next, a lightweight deep-learning refinement network, inspired by VoxelMorph, predicts residual local deformations between consecutive slices. The core novelty of this architecture lies in its hierarchical decomposition of the registration manifold: the OCM acts as a deterministic geometric anchor that neutralizes high-amplitude variance, thereby constraining the learning task to a low-dimensional residual manifold. This hybrid OCM + DL design integrates explicit geometric priors with the flexible learning capacity of neural networks, ensuring stable optimization and plausible deformation fields even with few training examples. Experiments on an original dataset of 40 kidneys demonstrated that the OCM + DL method achieved the highest registration accuracy across all evaluated metrics: NCC = 0.91, SSIM = 0.81, Dice = 0.90, IoU = 0.81, HD95 = 1.9 mm, and volumetric agreement DCVol = 0.89. Compared to single-stage baselines, this represents an average improvement of approximately 17% over DL-only and 14% over OCM-only, validating the synergistic contribution of the proposed hybrid strategy over standalone iterative or data-driven methods. 
The pipeline maintains physical calibration via Hough-based grid detection and employs Bézier-based contour smoothing for robust meshing and volume estimation. Although validated on kidney data, the proposed framework generalizes to other soft-tissue organs reconstructed from optical or photographic cross-sections. By decoupling interpretable global optimization from data-efficient deep refinement, the method advances the precision, reproducibility, and anatomical realism of multimodal 3D reconstructions for surgical planning, morphological assessment, and medical education. Full article
(This article belongs to the Special Issue Engineering Applications of Hybrid Artificial Intelligence Tools)
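
The constrained global alignment stage (translation, rotation, uniform scaling) has a closed-form solution. The numpy sketch below is the classical Umeyama/Procrustes estimator for 2-D point sets, shown as a stand-in for the rigid component of the OCM step rather than the OCM algorithm itself.

```python
import numpy as np

def similarity_align(src, dst):
    """Closed-form similarity transform (rotation r, uniform scale, and
    translation t) aligning 2-D point set `src` onto `dst`, so that
    dst ~= scale * (src @ r.T) + t. Umeyama/Procrustes solution."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    s, d = src - mu_s, dst - mu_d
    u, sig, vt = np.linalg.svd(s.T @ d)            # 2x2 cross-covariance
    sign = np.ones(2)
    if np.linalg.det(vt.T @ u.T) < 0:              # keep a proper rotation
        sign[-1] = -1.0
    r = vt.T @ np.diag(sign) @ u.T
    scale = (sig * sign).sum() / (s ** 2).sum()
    t = mu_d - scale * r @ mu_s
    return scale, r, t
```

Removing this high-amplitude global component deterministically is what constrains the learned refinement network to a low-dimensional residual, per the paper's hierarchical decomposition argument.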

24 pages, 16653 KB  
Article
Evaluation of Compressive Strength of Expanded Polystyrene Concrete Based on Broad Learning System
by Zhenhao Zhou, Wanfen Cao, Qiang Jin and Sen Li
Buildings 2026, 16(4), 795; https://doi.org/10.3390/buildings16040795 - 14 Feb 2026
Abstract
Expanded polystyrene (EPS) concrete, with excellent properties such as light weight, thermal insulation, and soundproofing, is widely applied in construction engineering. However, its complex heterogeneous internal structure makes it difficult to quickly and accurately assess compressive strength. Existing testing methods struggle to meet the real-time demands of on-site quality control in terms of both operational efficiency and accuracy. To address this, the present study proposes a method for predicting the compressive strength of EPS concrete based on image processing and Deep Convolutional Neural Networks (DCNN). By constructing a dataset consisting of 5600 preprocessed concrete slice images and addressing the issue of parameter redundancy in fully connected layers, the Broad Learning System (BLS) was employed to reconstruct and optimize the network architecture, thereby improving computational efficiency and enhancing prediction accuracy. The experimental results indicate that after introducing the BLS and related training optimization mechanisms, the training time was reduced by approximately 15%. Among all models, the BLS-Xception model performed the best, requiring only 1.9 s per training image. The coefficient of determination (R²) on the test set reached 0.95, representing an 18.7% improvement over traditional models. The study also indicates that the appropriate incorporation of coal ash, silica fume, and mineral powder significantly enhances the compressive strength of EPS concrete, with smaller EPS particles contributing more substantially to strength improvement. The model demonstrates excellent accuracy and reliability in predictions, providing an effective method for the rapid, non-destructive evaluation of the compressive strength of EPS concrete on construction sites. Full article

38 pages, 3458 KB  
Article
MERGE: Mammogram-Enhanced Representation via Wavelet-Guided CNNs for Computer-Aided Diagnosis of Breast Cancer
by Omneya Attallah
Mach. Learn. Knowl. Extr. 2026, 8(2), 40; https://doi.org/10.3390/make8020040 - 9 Feb 2026
Abstract
The early and accurate identification of breast cancer is a significant healthcare issue, largely because traditional machine learning approaches rely on handcrafted features that are unable to fully capture the spatial and textural complexity found in mammograms. Even with the advancements made possible through deep learning and improvements in diagnostic performance, most computer-aided diagnosis (CAD) systems based on Convolutional Neural Networks (CNNs) still rely on single-domain features, normally spatial features, while neglecting important spectral and spatial–spectral features, leading to limitations in generalisability, redundancy, and loss of interpretability. Motivated by these limitations, this research proposes MERGE, a novel CAD framework that combines spatial, spectral, and spatial–spectral information in a single multistage architecture taking advantage of three fine-tuned CNN models (ResNet-50, Xception, and Inception). This system utilises the Discrete Stationary Wavelet Transform (DSWT) to enhance spectral–spatial features; the Discrete Cosine Transform (DCT) to fuse the features optimally, resulting in enhanced spatial and spatial–spectral representations; and, finally, Non-Negative Matrix Factorisation (NNMF) for dimensionality reduction. Finally, Linear Discriminant Analysis (LDA), support vector machine (SVM), and k-nearest neighbours (KNN) classifiers provide a robust diagnosis. In evaluations on the INBreast and MIAS datasets, accuracy, sensitivity, specificity, and AUC were all around 99%, with performance surpassing state-of-the-art paradigms. These findings indicate that MERGE holds significant promise as a dependable and effective diagnostic tool, enhancing the consistency and interpretability of breast cancer screening results. Full article
(This article belongs to the Section Learning)
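
DCT-based feature fusion can be illustrated with an orthonormal DCT-II basis. The transform-then-add-then-truncate scheme below is one plausible reading of fusing features in the DCT domain, not the paper's exact procedure; the DCT's energy-compaction property is what makes low-frequency truncation a reasonable reduction step.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis; rows are cosine basis vectors
    (equivalent to scipy.fft.dct with norm='ortho')."""
    k = np.arange(n)[:, None]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    m[0] /= np.sqrt(2.0)
    return m

def dct_fuse(feat_a, feat_b, keep):
    """Fuse two equal-length feature vectors in the DCT domain and keep
    only the `keep` lowest-frequency coefficients."""
    m = dct_matrix(len(feat_a))
    fused = m @ feat_a + m @ feat_b        # transform, then add spectra
    return fused[:keep]
```

For a constant input, all energy lands in the DC coefficient, so truncation is lossless in that limiting case.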

26 pages, 4800 KB  
Article
Porosity and Permeability Estimations from X-Ray Tomography Images and Data Using a Deep Learning Approach
by Edwar Herrera, Oriol Oms and Eduard Remacha
Appl. Sci. 2026, 16(3), 1613; https://doi.org/10.3390/app16031613 - 5 Feb 2026
Abstract
This work presents a novel deep learning workflow for estimating porosity and permeability from combined data, where numerical variables such as high-resolution bulk density (RHOB) and photoelectric factor (PEF) data are integrated with X-ray computed tomography (X-CT) image data, using a dual-energy X-CT approach (DECT). Convolutional neural networks (CNNs) were calibrated with routine core analysis (RCAL) laboratory measurements from one well from the Sinú-San Jacinto Basin (Colombia). The CNN architecture combines two main branches: an image branch, in which a CNN extracts spatial features from normalized X-CT sections using 3 × 3 convolution layers, ReLU activation, batch normalization, and max pooling, and a numerical branch, which processes the input vectors corresponding to RHOB and PEF using fully connected dense layers and dropout regularization. Both branches are concatenated in a fusion layer, from which the model’s final predictions are made. Results indicate a strong correlation between porosity, permeability, RHOB and PEF logs, and CT images. The porosity model achieved excellent predictive performance, with R² = 0.996, MAE = 3.96 × 10⁻³, MSE = 3.82 × 10⁻⁵, and a maximum error of 0.064. The permeability model also performed well, with a linear R² = 0.983, though its metrics reflect the wide dynamic range of permeability. Consequently, artificial neural networks (ANNs) can accurately predict porosity and permeability at depths where no corresponding laboratory data exist, demonstrating excellent predictive capabilities over several rock intervals at high vertical resolution given the X-CT data scale (0.625 mm). Full article
25 pages, 15438 KB  
Article
Day–Night All-Sky Scene Classification with an Attention-Enhanced EfficientNet
by Wuttichai Boonpook, Peerapong Torteeka, Kritanai Torsri, Daroonwan Kamthonkiat, Yumin Tan, Asamaporn Sitthi, Patcharin Kamsing, Chomchanok Arunplod, Utane Sawangwit, Thanachot Ngamcharoensuktavorn and Kijnaphat Suksod
ISPRS Int. J. Geo-Inf. 2026, 15(2), 66; https://doi.org/10.3390/ijgi15020066 - 3 Feb 2026
Viewed by 849
Abstract
All-sky cameras provide continuous hemispherical observations essential for atmospheric monitoring and observatory operations; however, automated classification of sky conditions in tropical environments remains challenging due to strong illumination variability, atmospheric scattering, and overlapping thin-cloud structures. This study proposes EfficientNet-Attention-SPP Multi-scale Network (EASMNet), a physics-aware deep learning framework for robust all-sky scene classification using hemispherical imagery acquired at the Thai National Observatory. The proposed architecture integrates Squeeze-and-Excitation (SE) blocks for radiometric channel stabilization, the Convolutional Block Attention Module (CBAM) for spatial–semantic refinement, and Spatial Pyramid Pooling (SPP) for hemispherical multi-scale context aggregation within a fully fine-tuned EfficientNetB7 backbone, forming a domain-aware atmospheric representation framework. A large-scale dataset comprising 122,660 RGB images across 13 day–night sky-scene categories was curated, capturing diverse tropical atmospheric conditions including humidity, haze, illumination transitions, and sensor noise. Extensive experimental evaluations demonstrate that EASMNet achieves 93% overall accuracy, outperforming representative convolutional (VGG16, ResNet50, DenseNet121) and transformer-based architectures (Swin Transformer, Vision Transformer). Ablation analyses confirm the complementary contributions of hierarchical attention and multi-scale aggregation, while class-wise evaluation yields F1-scores exceeding 0.95 for visually distinctive categories such as Day Humid, Night Clear Sky, and Night Noise. Residual errors are primarily confined to physically transitional and low-contrast atmospheric regimes. These results validate EASMNet as a reliable, interpretable, and computationally feasible framework for real-time observatory dome automation, astronomical scheduling, and continuous atmospheric monitoring, and provide a scalable foundation for autonomous sky-observation systems deployable across diverse climatic regions. Full article
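Two of the attention/aggregation stages named in this abstract, Squeeze-and-Excitation and Spatial Pyramid Pooling, are standard building blocks and can be sketched in isolation. The PyTorch fragment below is an illustrative reconstruction; the channel count, reduction ratio, and pyramid levels are assumptions, not the EASMNet configuration, and the CBAM stage is omitted for brevity.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation channel attention: globally average-pool each
    channel, pass through a small bottleneck MLP, and reweight channels."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.squeeze = nn.AdaptiveAvgPool2d(1)
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.excite(self.squeeze(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # per-channel reweighting

def spatial_pyramid_pool(x, levels=(1, 2, 4)):
    """Spatial Pyramid Pooling: max-pool the feature map onto several grid
    sizes and concatenate, giving a fixed-length multi-scale descriptor."""
    b, c = x.shape[:2]
    parts = [nn.AdaptiveMaxPool2d(g)(x).view(b, -1) for g in levels]
    return torch.cat(parts, dim=1)  # (b, c * sum(g * g for g in levels))

feats = torch.randn(2, 32, 8, 8)        # toy backbone feature map
se_out = SEBlock(32)(feats)             # same shape, channels reweighted
spp_out = spatial_pyramid_pool(se_out)  # (2, 32 * (1 + 4 + 16)) = (2, 672)
print(se_out.shape, spp_out.shape)
```

In a pipeline like the one described, blocks of this kind sit between the backbone's feature maps and the classification head, stabilizing channel responses and aggregating context at several spatial scales.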
12 pages, 1209 KB  
Article
Deep Learning-Based Semantic Segmentation and Classification of Otoscopic Images for Otitis Media Diagnosis and Health Promotion
by Chien-Yi Yang, Che-Jui Lee, Wen-Sen Lai, Kuan-Yu Chen, Chung-Feng Kuo, Chieh Hsing Liu and Shao-Cheng Liu
Diagnostics 2026, 16(3), 467; https://doi.org/10.3390/diagnostics16030467 - 2 Feb 2026
Viewed by 463
Abstract
Background/Objectives: Otitis media (OM), including acute otitis media (AOM) and chronic otitis media (COM), is a common middle ear disease that can lead to significant morbidity if not accurately diagnosed. Otoscopic interpretation remains subjective and operator-dependent, underscoring the need for objective and reproducible diagnostic support. Recent advances in artificial intelligence (AI) offer promising solutions for automated otoscopic image analysis. Methods: We developed an AI-based diagnostic framework consisting of three sequential steps: (1) semi-supervised learning for automatic recognition and semantic segmentation of tympanic membrane structures, (2) region-based feature extraction, and (3) disease classification. A total of 607 clinical otoscopic images were retrospectively collected, including normal ears (n = 220), AOM (n = 157), and COM with tympanic membrane perforation (n = 230). Among these, 485 images were used for training and 122 for independent testing. Semantic segmentation of five anatomically relevant regions was performed using multiple convolutional neural network architectures, including U-Net, PSPNet, HRNet, and DeepLabV3+. Following segmentation, color and texture features were extracted from each region and used to train a neural network-based classifier to differentiate disease states. Results: Among the evaluated segmentation models, U-Net demonstrated superior performance, achieving an overall pixel accuracy of 96.76% and a mean Dice similarity coefficient of 71.68%. The segmented regions enabled reliable extraction of discriminative chromatic and texture features. In the final classification stage, the proposed framework achieved diagnostic accuracies of 100% for normal ears, 100% for AOM, and 91.3% for COM on the independent test set, with an overall accuracy of 96.72%. Conclusions: This study demonstrates that a semi-supervised, segmentation-driven AI pipeline integrating feature extraction and classification can achieve high diagnostic accuracy for otitis media. The proposed framework offers a clinically interpretable and fully automated approach that may enhance diagnostic consistency, support clinical decision-making, and facilitate scalable otoscopic assessment in diverse healthcare screening settings for disease prevention and health education. Full article
(This article belongs to the Special Issue AI-Assisted Diagnostics in Telemedicine and Digital Health)
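The region-based feature extraction step described in this abstract, computing chromatic features from each segmented tympanic-membrane region, can be sketched with NumPy. The helper below is hypothetical: the exact feature set, the five-region labeling convention, and the zero-fill for absent regions are illustrative assumptions, not the authors' published method.

```python
import numpy as np

def region_color_features(image, mask, n_regions=5):
    """For each labeled region in `mask` (labels 1..n_regions), compute the
    per-channel RGB mean and standard deviation -- a minimal stand-in for the
    chromatic features fed to the downstream classifier."""
    feats = []
    for label in range(1, n_regions + 1):
        pixels = image[mask == label]       # (n_pixels, 3) RGB values
        if pixels.size == 0:
            feats.extend([0.0] * 6)         # region absent (e.g. perforation)
            continue
        feats.extend(pixels.mean(axis=0))   # 3 mean-color values
        feats.extend(pixels.std(axis=0))    # 3 color-spread values
    return np.array(feats)                  # fixed length: 6 * n_regions

# Toy otoscopic image and segmentation mask (0 = background, 1..5 = regions)
img = np.random.rand(32, 32, 3)
mask = np.random.randint(0, 6, size=(32, 32))
fv = region_color_features(img, mask)
print(fv.shape)  # (30,)
```

Producing a fixed-length vector regardless of which regions are present is what lets a standard neural-network classifier consume the output directly, even for perforated membranes where a region may be missing.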