Search Results (1,283)

Search Parameters:
Keywords = UNet architecture

31 pages, 11304 KB  
Article
Geo-U-Mamba: A Mamba-Based Framework for Mineral Prospectivity Mapping of Gold Exploration Using Multi-Source Geoscientific Data
by Yuheng Zhou, Yongzhi Wang, Shibo Wen, Guangpeng Zhang and Yong Li
Remote Sens. 2026, 18(13), 2068; https://doi.org/10.3390/rs18132068 (registering DOI) - 24 Jun 2026
Abstract
Modern mineral exploration faces the pivotal challenge of detecting concealed mineral deposits in complex geology, as depleting outcropping ores have driven global exploration to depths where 1000 m deep mining is now commonplace. To address this, this study proposes Geo-U-Mamba, an unsupervised deep learning framework for gold mineral prospectivity mapping. The model integrates multi-source geoscientific data, encompassing geochemistry, remote sensing alteration indicators, topography, and structural distance fields. By incorporating a Mamba-driven four-directional cross-scan mechanism into a U-Net architecture, the framework effectively models the complex nonlinear mapping relationships between metallogenic elements and the geological environment. This approach recognizes gold geochemical anomalies with an 86.11% deposit capture rate, decoupling environmental noise by reconstructing the geochemical background field and extracting anomalies in combination with C-A fractal theory. When applied to China’s Hatu gold belt in Xinjiang, Geo-U-Mamba achieved an AUC of 0.83, consistently outperforming classical baselines such as CAE, U-Net, and ViT. Ultimately, the findings indicate that this framework provides a reliable and high-precision tool for modern mineral exploration, successfully separating mineralization signals from geological backgrounds in complex metallogenic belts to facilitate exploration targeting. Full article

17 pages, 14712 KB  
Article
LLM-Integrated Semantic Deep Learning Framework for Automated Floor Plan Analysis, Area Estimation, and Compliance Assessment of Existing Buildings
by Yuxuan Guo, Xiaodeng Zhou and Su-Kit Tang
Appl. Sci. 2026, 16(13), 6290; https://doi.org/10.3390/app16136290 (registering DOI) - 23 Jun 2026
Viewed by 65
Abstract
The digitization of existing building stock often depends on legacy 2D raster floor plans (scanned drawings, PDF exports, or photographs) because structured building information models are frequently unavailable for older properties. Manual measurement and visual inspection of such documents are time consuming and error prone. This paper presents an integrated deep learning pipeline that extracts semantic information from unstructured two-dimensional floor plan images of existing structures and supports preliminary compliance screening via locally deployed large language models. The pipeline employs YOLOv8 for the localization and classification of 18 architectural symbols and furniture items, and a U-Net with a ResNet34 encoder for the semantic segmentation of walls and interior room spaces. To translate pixel-level predictions into physical metrics, we implement an area calculation module based on user-defined reference scale calibration. An LLM evaluation module, deployed locally via Ollama with a retrieval-augmented generation pipeline, interprets extracted room metrics and flags potential non-compliance against referenced residential design guidelines; it is intended for the assessment of existing layouts rather than generative co-design. We expand a core dataset of 101 manually annotated source floor plans to 303 augmented instances using label-aligned geometric transformations, while reporting generalization in terms of the 101 unique source plans. On the held-out validation split (10 source plans), YOLOv8 achieves 92.3% mAP50 versus 87.2% for a Faster R-CNN reference model on the same data split (detection baselines differ in training epochs and pretraining; see Experiments); U-Net achieves 95.71% mIoU, surpassing DeepLabv3+ (93.2%) under matched segmentation training settings. The system is deployed as an interactive web application for legacy building survey and preliminary regulatory review when only two-dimensional documentation is available. 
Full article
(This article belongs to the Topic AI Agents: Progress, Architecture, and Applications)
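
The area calculation step described in this abstract (pixel-level predictions converted to physical metrics via a user-defined reference scale) can be illustrated with a minimal sketch. The function names, the toy mask, and the 2 px = 1.0 m calibration below are illustrative assumptions, not the authors' implementation.

```python
def pixels_to_area_m2(pixel_count, ref_length_px, ref_length_m):
    """Convert a pixel count to square metres using one reference length."""
    if ref_length_px <= 0 or ref_length_m <= 0:
        raise ValueError("reference lengths must be positive")
    metres_per_pixel = ref_length_m / ref_length_px
    # Area scales with the square of the linear scale factor.
    return pixel_count * metres_per_pixel ** 2


def room_area_m2(mask, room_label, ref_length_px, ref_length_m):
    """Area of one labelled region in a 2D segmentation mask (list of rows)."""
    pixel_count = sum(row.count(room_label) for row in mask)
    return pixels_to_area_m2(pixel_count, ref_length_px, ref_length_m)


# Hypothetical toy mask: 0 = background or wall, 1 = room A, 2 = room B.
toy_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 2, 0],
    [0, 1, 1, 0, 2, 0],
    [0, 0, 0, 0, 0, 0],
]

# Assumed calibration: the user marks a segment 2 px long and declares it 1.0 m.
area_a = room_area_m2(toy_mask, 1, ref_length_px=2, ref_length_m=1.0)
area_b = room_area_m2(toy_mask, 2, ref_length_px=2, ref_length_m=1.0)
print(area_a, area_b)
```

Because the scale factor enters squared, an error in the reference length roughly doubles in relative terms in the reported area, which is why the calibration segment matters more than individual mask pixels.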

14 pages, 4300 KB  
Article
DeepFlare: Weakly Supervised Cross-Modality Translation and Segmentation for Immunohistochemistry and Immunofluorescence Imaging
by Md. Tamim, Aditto Rahman, Redwan Hossain, Tausib Abrar and Riasat Khan
BioMedInformatics 2026, 6(3), 37; https://doi.org/10.3390/biomedinformatics6030037 (registering DOI) - 22 Jun 2026
Viewed by 327
Abstract
Immunohistochemistry (IHC) is a widely used method for detecting specific proteins in tissue samples, helping diagnose diseases such as cancer. Traditional analysis methods rely heavily on human interpretation, which can lead to inconsistencies. In this study, we propose DeepFlare, a weakly supervised deep learning framework for cross-modality translation and segmentation of immunofluorescence and immunohistochemistry images. The proposed method utilizes multiplex immunofluorescence (mpIF) and co-registered IHC images, combined with preprocessing techniques such as affine transformation, stain normalization, noise reduction, and artifact removal. Multiple imaging channels, including hematoxylin, DAPI, Lap2, and nuclear envelope signals, are leveraged to generate segmentation masks using a U-Net++ architecture. The final segmentation mask is obtained through weighted fusion of modality-specific outputs. A generative adversarial network (GAN) is employed to measure translation fidelity between generated and real images. Weakly supervised learning techniques, including image-level supervision and consistency constraints, are applied to enhance performance under limited annotation scenarios. Pretrained pathology foundation encoders such as UNI and Virchow are integrated to extract multi-scale morphological and contextual features. Explainable AI techniques are incorporated to highlight critical regions and refine model attention. Experimental results demonstrate strong performance, achieving an SSIM of 0.7077 for image translation and a Dice score of 0.7424 for segmentation. The integration of the UNI encoder provides marginal improvement over the baseline (0.72 Dice score), indicating limited domain adaptation without fine-tuning on the dataset of 1264 training samples. Full article
(This article belongs to the Section Imaging Informatics)

26 pages, 8518 KB  
Article
CVA-Net: Multi-View 3D Reconstruction for Fringe Projection Profilometry via Cross-View Attention and Sim2Real Learning
by Zuqiong Chen, Xiaopin Zhong and Yibin Tian
Photonics 2026, 13(6), 601; https://doi.org/10.3390/photonics13060601 (registering DOI) - 21 Jun 2026
Viewed by 194
Abstract
Fringe projection profilometry (FPP) is widely used for 3D reconstruction, but conventional single-view FPP systems suffer from inherent occlusions and shadow regions, leading to incomplete surface recovery. In this study, we propose CVA-Net, an end-to-end deep learning framework with cross-view attention (CVA) that directly reconstructs dense depth maps from multi-view fringe patterns. CVA-Net simultaneously processes four fringe images acquired from orthogonal projection directions and leverages a CVA module to explicitly model inter-view dependencies, enabling adaptive fusion of complementary information. A 3D U-Net backbone with attention gates, atrous spatial pyramid pooling (ASPP), and an auxiliary parameter estimation branch further enhances reconstruction accuracy and structural consistency via multitask learning. To support Sim2Real network training, we build a Blender-based digital twin of a multi-view FPP system and generate a large-scale synthetic dataset with perfect ground truth. Extensive experiments on both synthetic and real-world objects demonstrate that CVA-Net significantly outperforms state-of-the-art single-view methods. With a symmetric four-view configuration and fringe period of 8, CVA-Net achieves an MAE of 0.0359 mm, an MSE of 0.0379 mm² and an RMSE of 0.1947 mm, reducing the MAE, MSE, and RMSE by 32.8%, 54.1%, and 32.2%, respectively, compared to the best single-view competitor. Ablation studies validate the contribution of each architectural component, while real-system experiments demonstrate the feasibility of transferring a network trained purely on synthetic data to practical FPP measurements without domain adaptation. Although further improvements are required to enhance reconstruction accuracy under real imaging conditions, the proposed framework provides an effective initial step toward bridging the gap between digital-twin-based training and real-world multi-view FPP applications. 
CVA-Net provides a robust, occlusion-aware solution for multi-view FPP reconstruction. Full article
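
For readers comparing the three depth-error figures quoted in this abstract, the sketch below shows the standard definitions of MAE, MSE, and RMSE on a small set of hypothetical depth samples (the sample values are made up for illustration). It also checks that the reported RMSE is the square root of the reported MSE, which follows from the definitions when both are computed over the same samples.

```python
import math


def error_metrics(predicted, reference):
    """Return (MAE, MSE, RMSE) for two equal-length sequences of depths."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - r for p, r in zip(predicted, reference)]
    mae = sum(abs(e) for e in errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return mae, mse, math.sqrt(mse)


# Hypothetical depth samples in mm (not data from the paper).
mae, mse, rmse = error_metrics([1.0, 2.0, 3.5, 4.0], [1.0, 2.5, 3.0, 5.0])
print(mae, mse, rmse)

# Consistency check on the figures quoted in the abstract:
# sqrt(0.0379 mm^2) should match the reported RMSE of 0.1947 mm.
rmse_from_reported_mse = math.sqrt(0.0379)
print(round(rmse_from_reported_mse, 4))
```

Note the units: MAE and RMSE are in mm, while MSE is in mm².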

10 pages, 4337 KB  
Proceeding Paper
Next-Day Forest Fire Risk Prediction Using Machine Learning and Multimodal Satellite Data
by Prajwal Mohapatra, Swayam Subhankar Sahoo, Adyasha Das and Rururaj Pradhan
Eng. Proc. 2026, 124(1), 120; https://doi.org/10.3390/engproc2026124120 (registering DOI) - 17 Jun 2026
Abstract
Predicting forest fire occurrence is essential for proactive disaster preparedness and environmental protection. We introduce a machine learning-based system that forecasts next-day fire probability at high spatial resolution using satellite-derived, multi-modal geospatial data. In contrast to existing reactive systems that rely on thermal anomaly detection (e.g., MODIS or VIIRS-SNPP), our approach is fully predictive, generating pixel-wise fire risk maps a day in advance. Our study focuses on Uttarakhand, India, which is an ecologically sensitive region that experiences frequent and severe forest fires. We curated a domain-specific geospatial dataset spanning 1 April to 29 May 2016. It includes daily 30 m GeoTIFF images with 10 bands comprising weather (e.g., temperature, wind, precipitation), topography (slope, aspect), fuel map, and fire mask. We constructed this dataset from diverse sources and aligned all bands spatially and temporally. To demonstrate the usefulness of this dataset, we implement a deep convolutional neural network (CNN) using the ResUNet-A architecture, chosen for its robust performance in the semantic segmentation of high-resolution remote sensing data. Our model is trained from scratch to produce high-resolution fire probability maps and classify fire/no-fire pixels. Our solution helps with planning and decision-making for early intervention, especially in areas with high risk. It supports the UN’s SDG 13 (Climate Action) and SDG 15 (Life on Land) by enhancing resilience and conserving ecosystems. The presented dataset and methodology can serve as a benchmark for future research on wildfire risk prediction using Earth observation data. Full article
(This article belongs to the Proceedings of The 6th International Electronic Conference on Applied Sciences)

27 pages, 8573 KB  
Article
LTM-UNet: Linear Transformer–Mamba with Attention-Based U-Net for Context-Aware Breast Ultrasound Image Segmentation
by Shivpratap Singh Kushwah, Santosh Prakash Chouhan, Narinder Singh Punn and Mahua Bhattacharya
Diagnostics 2026, 16(12), 1888; https://doi.org/10.3390/diagnostics16121888 - 17 Jun 2026
Viewed by 226
Abstract
Background/Objectives: Accurate breast lesion segmentation using deep learning models requires precise understanding of both global contextual relevance and finer lesion structure details, which remains a challenge for existing convolutional and transformer-based approaches. This study aims to address these limitations by proposing a new segmentation model capable of improving context-aware dense segmentation tasks for ultrasound images. Method: We propose LTM-UNet, a novel segmentation method integrating transformer-based encoding with state-space-driven decoding in a U-Net-style framework. The architecture utilizes an efficient vision transformer encoder to extract multi-scale global representations. These features are refined through an attention-guided skip-fusion mechanism incorporating spatial-channel attention, which preserves finer spatial details and thereby minimizes the semantic gap between encoder and decoder features. Additionally, a direction-aware decoder based on a state-space model is introduced to efficiently capture long-range dependencies and enhance relevant feature reconstruction. Results: Extensive experiments on benchmark ultrasound medical imaging datasets demonstrate the effectiveness of the proposed method. The model achieves Dice score coefficients of 82.41% on the BUSI dataset and 86.62% on Dataset B (UDIAT), outperforming several existing segmentation approaches in both Dice score coefficient and Intersection-over-Union (IoU) metrics. Conclusions: The integration of efficient transformer-based global feature extraction, attention-enhanced feature fusion, and state-space-driven decoding enables LTM-UNet to effectively capture both structural details and contextual information, resulting in superior segmentation performance compared to existing methods. Full article
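
The two metrics named in this abstract, the Dice score coefficient and Intersection-over-Union, can be sketched on a pair of binary masks as follows. This is a generic illustration with made-up masks. The convention of returning 1.0 when both masks are empty is an assumption, and evaluation code for a given paper may handle that case differently.

```python
def dice_and_iou(pred, target):
    """Dice and IoU for two equal-length flat binary masks (values 0 or 1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p & t for p, t in zip(pred, target))
    pred_sum, target_sum = sum(pred), sum(target)
    union = pred_sum + target_sum - intersection
    if union == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_sum + target_sum)
    iou = intersection / union
    return dice, iou


# Hypothetical six-pixel masks that overlap on two pixels.
pred_mask = [1, 1, 1, 0, 0, 0]
target_mask = [0, 1, 1, 1, 0, 0]
dice, iou = dice_and_iou(pred_mask, target_mask)
print(dice, iou)
```

For a single pair of masks the two scores are linked by IoU = Dice / (2 - Dice), so Dice is never lower than IoU. Averages over a dataset need not satisfy the identity exactly.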

20 pages, 2220 KB  
Article
R2KAN-U-Net: A Novel Architecture Integrating Kolmogorov–Arnold Networks with Residual U-Net for Robust Traffic Sign Segmentation
by Taha Ben-Abbou, Houda El Omrani, Khalid El Fazazy, Mohamed Adnane Mahraz, Hamid Tairi and Jamal Riffi
Sensors 2026, 26(12), 3797; https://doi.org/10.3390/s26123797 - 15 Jun 2026
Viewed by 265
Abstract
Traffic sign segmentation is a fundamental component of intelligent transportation systems and autonomous driving, where reliable pixel-level perception is required under challenging real-world conditions such as illumination variations, occlusion, scale diversity, and complex urban backgrounds. In this work, we propose Residual–Recurrent Kolmogorov–Arnold Network U-Net (R2KAN-U-Net), where “R2” denotes the integration of residual convolutional learning and recurrent KAN-based feature refinement. The proposed architecture combines residual U-Net feature extraction, multi-scale KAN fusion, and recurrent KAN refinement to improve pixel-level traffic sign segmentation under challenging road-scene conditions. The proposed framework integrates three complementary components: (1) residual convolutional blocks for stable feature propagation; (2) a multi-scale KAN fusion bottleneck for capturing contextual information at different receptive fields; and (3) recurrent KAN refinement modules for iterative enhancement of discriminative features. Unlike conventional convolutional architectures, the proposed KAN-based formulation replaces linear transformations with learnable univariate functions, enabling adaptive nonlinear feature modeling. We conduct extensive experiments on a custom dataset containing 9300 annotated urban traffic scene images, as well as on the ADE20K and Cityscapes benchmarks. On the custom dataset, the proposed R2KAN-U-Net achieved a Dice coefficient of 0.92 and an IoU score of 0.89, providing a strong accuracy–efficiency trade-off for traffic-sign foreground segmentation. It achieves competitive segmentation accuracy compared with recent CNN-, transformer-, and state-space-based segmentation models while using fewer parameters and lower computational cost. Additional low-light experiments demonstrate improved segmentation stability, with R2KAN-U-Net achieving the highest low-light Dice score of 0.88 and a competitive low-light IoU of 0.79. 
Furthermore, the proposed architecture maintains competitive computational efficiency with only 24 M parameters, 44.8 G FLOPs, and near-real-time inference at 13 ms per image. The experimental results demonstrate that integrating KAN-based function-space learning with residual and multi-scale feature refinement provides an effective and computationally efficient solution for robust traffic sign segmentation in complex driving environments. Full article
(This article belongs to the Section Sensors and Robotics)

23 pages, 19029 KB  
Article
CETransUNet: An Intelligent Landslide Identification Method Based on Collaborative Optimization of Global Context and Dual Attention Mechanisms
by Tianli Sun, Chengsheng Yang, Jifeng Wu, Zewei Liu, Ziqian Wang and Xiaoqiang Cheng
Remote Sens. 2026, 18(12), 1974; https://doi.org/10.3390/rs18121974 - 13 Jun 2026
Viewed by 222
Abstract
Accurate landslide identification is crucial for enhancing emergency response capabilities during destructive geological hazards. Although deep-learning-based semantic segmentation has demonstrated effectiveness, substantial variations in landslide scales and environmental similarities continue to challenge existing methods. This paper systematically constructs a new co-seismic landslide dataset for the Yarlung Zangbo River basin based on the 2017 Nyingchi earthquake, effectively filling a critical regional data gap. This paper proposes CETransUNet (coordinate attention and edge-guided attention transformer UNet), a novel landslide detection model that integrates ResNet and Transformer architectures. Specifically, a coordinate attention (CA) module is introduced within the skip connections between the encoder and decoder. This module encodes positional information along both horizontal and vertical spatial directions and dynamically re-weights the feature maps, thereby effectively suppressing background noise caused by semantic gaps and enhancing the model’s ability to localize landslide regions. Additionally, an edge-guided attention (EGA) module is incorporated into the decoder. This module extracts explicit edge priors from the input image using a Laplacian operator and imposes geometric constraints on the predictions via a boundary reverse attention mechanism, thereby significantly alleviating boundary ambiguity and morphological distortion of landslides. Evaluations across datasets from the Yarlung Zangbo River, Iburi-Tobu, and Bijie regions demonstrate that CETransUNet significantly outperforms state-of-the-art models—including TransUNet, SegFormer, and SwinUNet—in terms of IoU, MIoU, and F1-score. 
Overall, through the synergistic optimization of the coordinate attention and edge-guided attention modules, the CETransUNet model achieves synchronous enhancement of boundary integrity and geometric precision in complex scenarios, providing a reliable technical solution for large-scale intelligent landslide identification. Full article
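
The edge prior mentioned in this abstract is extracted with a Laplacian operator. The sketch below applies a 4-neighbour discrete Laplacian to a tiny grayscale image. The choice of this particular 3x3 kernel, the zero-valued border, and the absolute-value response are assumptions for illustration, since the abstract does not specify them.

```python
# 4-neighbour discrete Laplacian kernel (assumed; an 8-neighbour variant also exists).
LAPLACIAN_4 = ((0, 1, 0), (1, -4, 1), (0, 1, 0))


def laplacian_edges(image):
    """Absolute Laplacian response of a 2D grayscale image (list of rows).

    Border pixels are left at 0.0 because the kernel does not fit there.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    acc += LAPLACIAN_4[ky][kx] * image[y + ky - 1][x + kx - 1]
            out[y][x] = abs(acc)
    return out


# Hypothetical 5x5 image: a bright 3x3 block on a dark background.
toy_image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
edges = laplacian_edges(toy_image)
for row in edges:
    print(row)
```

The response is zero in the flat centre of the block and non-zero along its boundary, which is the property that makes the Laplacian usable as an explicit boundary cue.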

20 pages, 1567 KB  
Article
Efficient Glare Suppression Network for Nighttime Images with Lightweight Parallel Attention and Ghost Convolution
by Ruoyu Yang, Huaixin Chen, Sijie Luo and Zhixi Wang
Sensors 2026, 26(12), 3773; https://doi.org/10.3390/s26123773 - 12 Jun 2026
Viewed by 376
Abstract
Aiming at the problems of glare interference, local overexposure and detail loss caused by artificial light sources such as vehicle lamps and street lamps in nighttime road scenes, as well as the challenges of existing glare suppression models with large parameters, high computational complexity and difficulty in deploying on edge devices, this paper proposes a lightweight glare suppression network (LGSNet) based on ghost depthwise separable convolution and Lightweight Parallel Attention. Based on the U-Net architecture, the network introduces ghost depthwise separable convolution blocks (GhostDSC) in the encoder and decoder, which generates ghost features through cheap linear transformations by exploiting feature map redundancy, significantly reducing model parameters and computational costs while maintaining feature representation ability. Meanwhile, a Lightweight Parallel Attention (LPA) module is designed in the decoder stage, which integrates channel attention and pixel attention in parallel, enhancing the network’s attention to glare regions and edge details with extremely low parameter increment to improve detail recovery accuracy. In addition, a joint loss function consisting of background loss, glare loss and reconstruction loss is constructed to collaboratively optimize glare suppression and detail preservation. Experimental results on the public Flare7K++ dataset and the self-built nighttime road glare dataset NRGD show that the proposed method has only 7.45 M parameters, much lower than standard U-Net and Uformer. It achieves competitive results on full-reference metrics such as PSNR, SSIM, LPIPS and no-reference metrics such as NIQE, BRISQUE, PIQE, and can effectively suppress various types of glare interference and restore obscured scene details. 
It achieves a superior trade-off between model complexity and enhancement performance, significantly reducing the parameter count and computational overhead compared to heavy baselines, thereby offering a highly efficient solution for resource-aware glare suppression tasks. Full article
(This article belongs to the Section Intelligent Sensors)
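
To give a rough sense of why depthwise separable convolutions reduce parameter counts, the sketch below compares the weights in a standard convolution with those in a depthwise-plus-pointwise pair. It covers only the generic depthwise separable idea. The exact GhostDSC block is not specified in the abstract, and the layer sizes (64 input channels, 128 output channels, 3x3 kernel) are arbitrary assumptions.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Assumed example layer: 64 -> 128 channels with a 3 x 3 kernel.
standard = standard_conv_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)
ratio = separable / standard
print(standard, separable, round(ratio, 4))
```

The ratio equals 1/c_out + 1/k², so for 3x3 kernels and wide layers the separable form needs a little over one ninth of the weights.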
Show Figures

Figure 1

29 pages, 1369 KB  
Review
On Solar Filament Detection Techniques: From Manual to Intelligent
by Yang Hu, Yu Liu, Hai-Tang Li, Abouazza Elmhamdi, Gaofei Zhu, Feiyang Sha, Qiang Liu, Saleh Baltyuor, Delin Tang, Tengfei Song, Huan Zhang, Qing Zhou, Xi Wang and Qiwang Luo
Universe 2026, 12(6), 173; https://doi.org/10.3390/universe12060173 - 11 Jun 2026
Viewed by 220
Abstract
Solar filaments (and their limb counterparts, prominences) are critical tracers of the Sun’s magnetic topology and key precursors to coronal mass ejections (CMEs). Precise identification and continuous tracking of these features are essential for understanding solar eruptive mechanisms and improving space weather forecasting. This systematic review evaluates the evolution of automated detection methodologies, addressing the challenge of processing the exponentially growing volume of high-resolution solar observations. We identify deep learning architectures, particularly U-Net variants and Mask R-CNN, as the most promising current paradigms. Compared to traditional image processing, these data-driven models demonstrate superior robustness against noise and variable observing conditions, achieving high-precision segmentation (>90% accuracy) with sub-second inference speeds. This leap in computational efficiency and accuracy directly facilitates real-time operational monitoring and enables large-scale statistical analysis of filament evolution across solar cycles. We conclude that future breakthroughs lie in developing physics-informed AI and standardized benchmarks to bridge the gap between pixel-level segmentation and physical interpretation, ultimately creating detection systems that are both operationally reliable and scientifically meaningful. Full article
(This article belongs to the Section Solar and Stellar Physics)

22 pages, 19870 KB  
Article
SIG-Net: A Spectral-Index-Guided Network for Red Tide Extraction from Sentinel-2 Multispectral Imagery
by Lei Zhou, Hongping Li, Xiaojun Chen and Zhanqiang Li
Remote Sens. 2026, 18(12), 1928; https://doi.org/10.3390/rs18121928 - 11 Jun 2026
Viewed by 234
Abstract
Red tide events pose substantial threats to marine ecosystems, aquaculture, and coastal public health. Timely and accurate delineation of red tide extent from satellite imagery is therefore essential for operational monitoring and early warning. However, existing deep learning-based semantic segmentation methods generally treat multispectral bands as homogeneous inputs and do not fully exploit the domain knowledge embodied in spectral indices commonly used in traditional remote sensing analysis. To address this limitation, this study proposes a spectral-index-guided network (SIG-Net) that explicitly incorporates spectral-index priors into deep feature extraction through a dual-branch architecture. SIG-Net comprises three components: a spectral encoder based on a Mix Vision Transformer (MiT-B2) that learns spatial-spectral representations from the original Sentinel-2 bands; a lightweight CNN-based index encoder that extracts discriminative features from four spectral indices, namely the red-green index (RGI), blue-green index (BGI), normalized difference vegetation index (NDVI), and the normalized difference Noctiluca index (NDNI) proposed in this study; and a spectral-index-guided fusion (SIGF) module that adaptively integrates multi-scale features from the two branches using spatial-reduction cross-attention and a gated fusion mechanism. Experiments on a Sentinel-2 red tide dataset show that SIG-Net outperforms single-branch baselines, including U-Net, DeepLabV3+, and SegFormer, as well as naive multi-source fusion strategies. Ablation studies further confirm the contributions of the SIGF module, the gating mechanism, and the proposed NDNI to performance improvements. The proposed method provides an effective framework for integrating domain knowledge with deep learning for red tide remote sensing monitoring. Full article
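
Of the four spectral indices listed in this abstract, NDVI has a widely used standard definition, (NIR - Red) / (NIR + Red), which for Sentinel-2 is commonly computed from bands B8 and B4. The sketch below shows only that index. The formulas for RGI, BGI, and the proposed NDNI are not given in the abstract and are deliberately not guessed here, and the reflectance values are hypothetical.

```python
def ndvi(nir, red, eps=1e-12):
    """Normalized difference vegetation index for one pixel.

    nir, red: surface reflectance values (for Sentinel-2, commonly B8 and B4).
    Returns 0.0 when the denominator is effectively zero (assumed convention).
    """
    denom = nir + red
    if abs(denom) < eps:
        return 0.0
    return (nir - red) / denom


def ndvi_map(nir_band, red_band):
    """Apply ndvi() element-wise to two 2D bands given as lists of rows."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical reflectances: one vegetation-like pixel, one water-like pixel.
nir_band = [[0.5, 0.02]]
red_band = [[0.1, 0.04]]
index_map = ndvi_map(nir_band, red_band)
print(index_map)
```

An index map like this has the same spatial shape as the input bands, so it can be stacked as an extra channel for an index branch of the kind the abstract describes.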

21 pages, 3003 KB  
Article
Electromagnetic Imaging of Anisotropic Objects Using a Self-Attention Perceptual Generative Adversarial Network
by Po-Hsiang Chen, Chien-Ching Chiu, Yang-Han Lee and Eng Hock Lim
Sensors 2026, 26(12), 3705; https://doi.org/10.3390/s26123705 - 10 Jun 2026
Viewed by 243
Abstract
Reconstructing high-resolution images of anisotropic targets in microwave imaging remains a challenging problem due to the strong directionality of electromagnetic responses and the inherent nonlinearity of the inverse scattering process. To address these issues, we propose a novel Perceptual Generative Adversarial Network (PGAN) enhanced with a Self-Attention mechanism for anisotropic electromagnetic imaging. The perceptual loss encourages the preservation of high-level structural features, while the Self-Attention module enables the model to capture long-range dependencies and directional correlations that are critical in representing anisotropic material distributions. This joint architecture is trained to refine coarse permittivity estimates obtained from conventional Back-Propagation Schemes (BPSs). Numerical simulations and validation using measured experimental data demonstrate that the proposed method achieves improved reconstruction accuracy and structural similarity compared with the PGAN without SA and U-Net. In particular, PGAN with SA reduces the Root Mean Square Error (RMSE) by 15.1% and improves the Structural Similarity Index Measure (SSIM) by 3.8%, confirming its effectiveness in recovering fine-scale details and enhancing reconstruction quality. These results suggest that the proposed framework offers a promising solution for robust and high-resolution electromagnetic imaging in geophysical and remote sensing applications. Full article
(This article belongs to the Special Issue Antenna and Sensor Technologies for Environmental EMF Sensing)

31 pages, 3749 KB  
Article
Cascaded Dual Stage U-Net with Texture-Aware Feature Fusion for Unified Segmentation and Classification in Echo-Cardiogram Images
by Arakere Nagarajappa Jagadish, Ravikumar Manjunath and Indrakumar Krishnamurthy
Informatics 2026, 13(6), 84; https://doi.org/10.3390/informatics13060084 - 10 Jun 2026
Viewed by 327
Abstract
Accurate, automated analysis of medical images is indispensable for effective diagnosis and treatment planning, particularly for complex multiclass diseases. This paper presents a system that combines a cascaded dual-stage U-Net with texture-based deep learning techniques to improve segmentation and classification precision. The cascaded dual-stage U-Net architecture comprises two parallel encoding-decoding pathways optimized for deep semantic feature extraction. This dual-path design enables the network to recognize lesion edges and intricate structural variations across imaging modalities. To enhance diagnostic performance, texture features are extracted using the Color Co-occurrence Matrix (CCM), which preserves local texture patterns and color relationships, providing helpful context for deep feature extraction. We feed this enriched data into a convolutional neural network (CNN) classifier, which categorizes the images into disease groups. Extensive evaluation on benchmark medical image datasets (MRI, CT, endoscopic images) demonstrates the framework’s superior performance in segmentation accuracy, classification precision, and robustness to noise and distortions. Integrating segmentation and classification in a coherent pipeline increases the reliability and interpretability of the diagnostic process. This technique represents an important step toward the clinical utility of intelligent, automated medical image processing.
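The abstract does not specify how the Color Co-occurrence Matrix is built. As a rough illustration of the general co-occurrence idea, the sketch below counts how often one quantized intensity level appears next to another in a single channel and derives a simple contrast statistic. Treating one channel at a time, the pixel offset, and the function names are all assumptions made for this example, not the authors' method.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Count pairs (i, j) where level i has level j at offset (dx, dy).

    `image` is a list of rows holding integer levels in range(levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Skip pairs whose neighbour falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def contrast(counts):
    """Mean squared level difference over all counted pairs."""
    total = sum(sum(row) for row in counts)
    weighted = sum(
        (i - j) ** 2 * value
        for i, row in enumerate(counts)
        for j, value in enumerate(row)
    )
    return weighted / total


# Hypothetical 3x3 single-channel patch quantized to three levels.
patch = [
    [0, 0, 1],
    [0, 1, 1],
    [2, 2, 2],
]

matrix = cooccurrence_matrix(patch, levels=3)  # horizontal neighbours
patch_contrast = contrast(matrix)
```

A color variant would presumably repeat this per channel or across channel pairs; which of these the paper uses is not stated in the abstract.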

25 pages, 863 KB  
Article
Dual-Domain Symmetry: A Frequency-Aware Residual U-Net for High-Fidelity EEG Artifact Removal
by Jiahao Zhang, Tong Liu, Tianhao Cui, Fanqiang Lin and Yong Jia
Symmetry 2026, 18(6), 988; https://doi.org/10.3390/sym18060988 - 8 Jun 2026
Viewed by 180
Abstract
Electroencephalography (EEG) is a non-invasive technique used to monitor brain activity but is prone to physiological artifacts, especially eye movements (EOG) and muscle contractions (EMG). These artifacts are non-stationary and frequently overlap with neural oscillation bands, making them difficult to separate accurately from genuine EEG activity. Conventional single-domain filters often fail to eliminate such interference, resulting in either residual noise or the unintended suppression of authentic EEG data. To address these limitations, we propose a Frequency-Aware Residual U-Net (FARU-Net), a dual-domain, frequency-aware residual architecture for EEG artifact removal designed to improve restoration fidelity. Unlike models based solely on temporal features, FARU-Net explicitly modulates the spectral properties of the signal in the latent space through a Frequency-aware Bottleneck Module (FBM), while simultaneously refining temporal details. Additionally, Attention Gates (AGs) are integrated into the skip connections to refine feature fusion and reduce residual noise while preserving salient waveform structures. Comparative experiments on the EEGdenoiseNet benchmark demonstrate that FARU-Net achieves strong overall performance for single-channel EEG restoration. Across five independent test groups, the proposed model attains a mean Pearson correlation coefficient (CC) of 0.9681 and a mean signal-to-noise ratio improvement (ΔSNR) of 26.66 dB. These results indicate that the proposed method effectively preserves both waveform morphology and spectral structure compared with conventional U-Net variants and CNN-based models.
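The two reported metrics can be illustrated with a small sketch. It assumes the usual definitions: CC is the Pearson correlation between the clean and restored signals, and ΔSNR is the SNR of the restored signal minus the SNR of the contaminated input, both measured against the clean reference. The abstract does not give the exact formulas, so these definitions, the function names, and the toy signals are assumptions.

```python
import math


def pearson_cc(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def snr_db(clean, observed):
    """SNR in dB, treating (observed - clean) as noise.

    Raises ZeroDivisionError if observed equals clean exactly.
    """
    signal_power = sum(c * c for c in clean)
    noise_power = sum((o - c) ** 2 for o, c in zip(observed, clean))
    return 10.0 * math.log10(signal_power / noise_power)


def delta_snr_db(clean, contaminated, restored):
    """Improvement in SNR achieved by the restoration step."""
    return snr_db(clean, restored) - snr_db(clean, contaminated)


# Hypothetical toy signals, for illustration only.
clean = [1.0, -1.0, 1.0, -1.0]
contaminated = [2.0, 0.0, 2.0, 0.0]   # clean plus a constant offset of 1.0
restored = [1.1, -0.9, 1.1, -0.9]     # clean plus a residual offset of 0.1

cc = pearson_cc(clean, restored)
improvement = delta_snr_db(clean, contaminated, restored)
```

In this toy case the residual error shrinks by a factor of ten in amplitude, which corresponds to an improvement of about 20 dB, while CC stays at essentially 1 because a constant offset does not change the waveform shape.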

18 pages, 7342 KB  
Article
3D Karst Cave Identification Using UKAN-CBAM in Seismic Images of Fractured-Vuggy Reservoir
by Binpeng Yan, Haobo Gao, Rui Pan and Yongliang Wang
Appl. Sci. 2026, 16(12), 5765; https://doi.org/10.3390/app16125765 - 8 Jun 2026
Viewed by 129
Abstract
Accurate identification of karst caves from seismic data is crucial for carbonate reservoir characterization, as these caves often serve as primary hydrocarbon storage spaces and migration pathways. However, it remains challenging due to the highly nonlinear relationship between seismic waveforms and cave geometries, as well as the noise propagation in skip connections inherent to U-Net-based methods. To address these limitations, this paper proposes UKAN-CBAM, a novel 3D network that synergistically integrates Tokenized Kolmogorov–Arnold Network (Tok-KAN) modules and Convolutional Block Attention Modules (CBAM) within a U-shaped encoder–decoder architecture. Unlike U-Net, which relies on linear convolutional kernels, the Tok-KAN modules employ learnable spline-based activation functions to better capture the nonlinear relationships between seismic waveforms and cave geometries. Furthermore, CBAM embedded in each skip connection adaptively recalibrates features along the channel and spatial dimensions, thereby suppressing noise and sharpening cave boundaries. Trained on synthetic data and validated on physical modeling data from the Sichuan Basin and field data from the Tarim Basin, UKAN-CBAM consistently outperforms U-Net, ResUNet, UNet-CBAM, and coherence attributes across multiple evaluation metrics. The proposed network delineates caves with improved continuity and sharper boundaries while reducing false positives, demonstrating strong generalization capability. The results indicate that the synergistic design of KAN’s nonlinear modeling and CBAM’s attention mechanism effectively mitigates the limitations of traditional approaches for karst cave identification.