Search Results (25,471)

Search Parameters:
Keywords = image network

27 pages, 3492 KB  
Article
Filter-Wise Mask Pruning and FPGA Acceleration for Object Classification and Detection
by Wenjing He, Shaohui Mei, Jian Hu, Lingling Ma, Shiqi Hao and Zhihan Lv
Remote Sens. 2025, 17(21), 3582; https://doi.org/10.3390/rs17213582 - 29 Oct 2025
Abstract
Pruning and acceleration have become essential and promising techniques for convolutional neural networks (CNNs) in remote sensing image processing, especially for deployment on resource-constrained devices. However, maintaining model accuracy while achieving satisfactory acceleration remains a challenging and valuable problem. To address this, we introduce a novel filter-wise mask pruning pattern that enforces extra filter-wise structural constraints on pattern-based pruning, combining the benefits of unstructured and structured pruning. The filter-wise mask enhances fine-grained sparsity with more hardware-friendly regularity. We further design an acceleration architecture that optimizes calculation parallelism and memory access, aiming to fully translate weight pruning into hardware performance gain. The proposed pruning method is first validated on classification networks: the pruning rate reaches 75.1% for VGG-16 and 84.6% for ResNet-50 without accuracy compromise. We then apply our method to the widely used object detection model You Only Look Once (YOLO). On the aerial image dataset, the pruned YOLOv5s achieves a pruning rate of 53.43% with a slight accuracy degradation of 0.6%. Meanwhile, we implement the acceleration architecture on a field-programmable gate array (FPGA) to evaluate its practical execution performance. The throughput reaches up to 809.46 MOPS. The pruned network achieves speedups of 2.23× and 4.4× at compression rates of 2.25× and 4.5×, respectively, effectively converting model compression into execution speedup. The proposed pruning and acceleration approach provides crucial technology to facilitate the application of CNNs in remote sensing, especially in scenarios such as on-board real-time processing, emergency response, and low-cost monitoring.
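One way to read the speedup and compression figures quoted in this abstract is as a ratio: how much of the compression shows up as measured speedup. The sketch below assumes, purely for illustration, that the ideal speedup equals the compression rate; the abstract does not define such a metric, and the name `conversion_efficiency` is hypothetical.

```python
# Figures quoted in the abstract: (compression rate, measured speedup).
reported = [(2.25, 2.23), (4.5, 4.4)]


def conversion_efficiency(compression: float, speedup: float) -> float:
    """Fraction of the assumed ideal speedup (= compression rate) achieved."""
    return speedup / compression


# One efficiency value per reported (compression, speedup) pair.
efficiencies = [conversion_efficiency(c, s) for c, s in reported]

for (c, s), e in zip(reported, efficiencies):
    print(f"compression {c}x -> speedup {s}x ({e:.1%} of the assumed ideal)")
```

Under that assumption both configurations recover most of the compression as speedup, which is consistent with the abstract's claim that compression is converted into execution speedup effectively.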
30 pages, 2196 KB  
Article
MFF-ClassificationNet: CNN-Transformer Hybrid with Multi-Feature Fusion for Breast Cancer Histopathology Classification
by Xiaoli Wang, Guowei Wang, Luhan Li, Hua Zou and Junpeng Cui
Biosensors 2025, 15(11), 718; https://doi.org/10.3390/bios15110718 (registering DOI) - 29 Oct 2025
Abstract
Breast cancer is one of the most prevalent malignant tumors among women worldwide, underscoring the urgent need for early and accurate diagnosis to reduce mortality. To address this, a Multi-Feature Fusion Classification Network (MFF-ClassificationNet) is proposed for breast histopathological image classification. The network adopts a two-branch parallel architecture, where a convolutional neural network captures local details and a Transformer models global dependencies. Their features are deeply integrated through a Multi-Feature Fusion module, which incorporates a Convolutional Block Attention Module-Squeeze and Excitation (CBAM-SE) fusion block combining convolutional block attention, squeeze-and-excitation mechanisms, and a residual inverted multilayer perceptron to enhance fine-grained feature representation and category-specific lesion characterization. Experimental evaluations on the BreakHis dataset achieved accuracies of 98.30%, 97.62%, 98.81%, and 96.07% at magnifications of 40×, 100×, 200×, and 400×, respectively, while an accuracy of 97.50% was obtained on the BACH dataset. These results confirm that integrating local and global features significantly strengthens the model’s ability to capture multi-scale and context-aware information, leading to superior classification performance. Overall, MFF-ClassificationNet surpasses conventional single-path approaches and provides a robust, generalizable framework for advancing computer-aided diagnosis of breast cancer.
(This article belongs to the Special Issue AI-Based Biosensors and Biomedical Imaging)
26 pages, 32733 KB  
Article
Contextual-Semantic Interactive Perception Network for Small Object Detection in UAV Aerial Images
by Yiming Xu and Hongbing Ji
Remote Sens. 2025, 17(21), 3581; https://doi.org/10.3390/rs17213581 - 29 Oct 2025
Abstract
Unmanned Aerial Vehicle (UAV)-based aerial object detection has been widely applied in various fields, including logistics, public security, disaster response, and smart agriculture. However, numerous small objects in UAV aerial images are often overwhelmed by large-scale complex backgrounds, making their appearance difficult to distinguish and thereby prone to being missed by detectors. To tackle these issues, we propose a novel Contextual-Semantic Interactive Perception Network (CSIPN) for small object detection in UAV aerial scenarios, which enhances detection performance through scene interaction modeling, dynamic context modeling, and dynamic feature fusion. The core components of the CSIPN include the Scene Interaction Modeling Module (SIMM), the Dynamic Context Modeling Module (DCMM), and the Semantic-Context Dynamic Fusion Module (SCDFM). Specifically, the SIMM introduces a lightweight self-attention mechanism to generate a global scene semantic embedding vector, which then interacts with shallow spatial descriptors to explicitly depict the latent relationships between small objects and complex backgrounds, thereby selectively activating key spatial responses. The DCMM employs two dynamically adjustable receptive-field branches to adaptively model contextual cues and effectively supplement the contextual information required for detecting various small objects. The SCDFM utilizes a dual-weighting strategy to dynamically fuse deep semantic information with shallow contextual details, highlighting features relevant to small object detection while suppressing irrelevant background. Our method achieves mAPs of 37.2%, 93.4%, 50.8%, and 48.3% on the TinyPerson dataset, the WAID dataset, the VisDrone-DET dataset, and our self-built WildDrone dataset, respectively, while using only 25.3M parameters, surpassing existing state-of-the-art detectors and demonstrating its superiority and robustness.
33 pages, 4037 KB  
Article
DCBAN: A Dynamic Confidence Bayesian Adaptive Network for Reconstructing Visual Images from fMRI Signals
by Wenju Wang, Yuyang Cai, Renwei Zhang, Jiaqi Li, Zinuo Ye and Zhen Wang
Brain Sci. 2025, 15(11), 1166; https://doi.org/10.3390/brainsci15111166 - 29 Oct 2025
Abstract
Background: Current fMRI (functional magnetic resonance imaging)-driven brain information decoding techniques for visual image reconstruction face issues such as poor structural fidelity, inadequate model generalization, and unnatural visual image reconstruction in complex scenarios. Methods: To address these challenges, this study proposes a Dynamic Confidence Bayesian Adaptive Network (DCBAN). In this network model, deep nested Singular Value Decomposition is introduced to embed low-rank constraints into the deep learning model layers for fine-grained feature extraction, thus improving structural fidelity. The proposed Bayesian Adaptive Fractional Ridge Regression module, based on singular value space, dynamically adjusts the regularization parameters, significantly enhancing the decoder’s generalization ability under complex stimulus conditions. The constructed Dynamic Confidence Adaptive Diffusion Model module incorporates a confidence network and a time decay strategy, dynamically adjusting the semantic injection strength during the generation phase, further enhancing the details and naturalness of the generated images. Results: The proposed DCBAN method is evaluated on the NSD, outperforming state-of-the-art methods by 8.41%, 0.6%, and 4.8% in PixCorr (0.361), Incep (96.0%), and CLIP (97.8%), respectively, achieving the current best performance in both structural and semantic fMRI visual image reconstruction. Conclusions: The DCBAN proposed in this study offers a novel solution for reconstructing visual images from fMRI signals, significantly enhancing the robustness and generative quality of the reconstructed images.
48 pages, 1608 KB  
Systematic Review
A Systematic Review of Advances in Deep Learning Architectures for Efficient and Sustainable Photovoltaic Solar Tracking: Research Challenges and Future Directions
by Ali Alhazmi, Kholoud Maswadi and Christopher Ifeanyi Eke
Sustainability 2025, 17(21), 9625; https://doi.org/10.3390/su17219625 (registering DOI) - 29 Oct 2025
Abstract
The swift advancement of renewable energy technology has highlighted the need for effective photovoltaic (PV) solar energy tracking systems. Deep learning (DL) has surfaced as a promising method to improve the precision and efficacy of PV solar tracking by utilising complex patterns in meteorological and PV system data. This systematic literature review (SLR) seeks to offer a thorough examination of the progress in deep learning architectures for photovoltaic solar energy tracking over the last decade (2016–2025). The review was structured around four research questions (RQs) aimed at identifying prevalent deep learning architectures, datasets, performance metrics, and issues within the context of deep learning-based PV solar tracking systems. The present research utilised SLR methodology to analyse 64 high-quality publications from reputed academic databases such as IEEE Xplore, Science Direct, Springer, and MDPI. The results indicated that deep learning architectures, including Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Transformer-based models, are extensively employed to improve the accuracy and efficiency of photovoltaic solar tracking systems. Widely utilised datasets comprised meteorological data, photovoltaic system data, time series data, temperature data, and image data. Performance metrics, including Mean Absolute Error (MAE), Mean Squared Error (MSE), and Mean Absolute Percentage Error (MAPE), were employed to assess model efficacy. The significant challenges identified include inadequate data quality, restricted availability, high computing complexity, and issues in model generalisation. Future research should concentrate on enhancing data quality and accessibility, creating generalised models, minimising computational complexity, and integrating deep learning with real-time photovoltaic systems. Resolving these challenges would facilitate advancements in efficient, reliable, and sustainable photovoltaic solar tracking systems, hence promoting the wider adoption of renewable energy technology. This review emphasises the capability of deep learning to transform photovoltaic solar tracking and stresses the necessity for interdisciplinary collaboration to address current limitations.
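The three error metrics named in this abstract have standard definitions, which the following minimal sketch spells out in plain Python. The sample values are made up for illustration and are not taken from the review; MAPE is assumed here to be expressed as a percentage and requires non-zero true values.

```python
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Absolute Error: average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Squared Error: average of (true - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Absolute Percentage Error in percent; true values must be non-zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical PV power readings (true) and model forecasts (predicted).
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 190.0, 400.0]

print(f"MAE={mae(y_true, y_pred):.3f}  MSE={mse(y_true, y_pred):.3f}  "
      f"MAPE={mape(y_true, y_pred):.2f}%")
```

MAE and MSE are in the units of the target (and its square), while MAPE is scale-free, which is one reason studies often report all three together.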
32 pages, 2280 KB  
Article
Symmetry-Aware Feature Representations and Model Optimization for Interpretable Machine Learning
by Mehtab Alam, Abdullah Alourani, Ashraf Ali and Firoj Ahamad
Symmetry 2025, 17(11), 1821; https://doi.org/10.3390/sym17111821 - 29 Oct 2025
Abstract
This paper investigates the role of symmetry and asymmetry in the learning process of modern machine learning models, with a specific focus on feature representation and optimization. We introduce a novel symmetry-aware learning framework that identifies and preserves symmetric properties within high-dimensional datasets, while allowing model asymmetries to capture essential discriminative cues. Through analytical modeling and empirical evaluations on benchmark datasets, we demonstrate how symmetrical transformations of features (e.g., rotation, mirroring, permutation invariance) impact learning efficiency, interpretability, and generalization. Furthermore, we explore asymmetric regularization techniques that prioritize informative deviations from symmetry in model parameters, thereby improving classification and clustering performance. The proposed approach is validated using a variety of classifiers including neural networks and tested across domains such as image recognition, biomedical data, and social networks. Our findings highlight the critical importance of leveraging domain-specific symmetries to enhance both the performance and explainability of machine learning systems.
(This article belongs to the Special Issue Symmetry/Asymmetry in Data Mining & Machine Learning)
22 pages, 4001 KB  
Article
SolPowNet: Dust Detection on Photovoltaic Panels Using Convolutional Neural Networks
by Ömer Faruk Alçin, Muzaffer Aslan and Ali Ari
Electronics 2025, 14(21), 4230; https://doi.org/10.3390/electronics14214230 - 29 Oct 2025
Abstract
In recent years, the widespread adoption of photovoltaic (PV) panels for electricity generation has provided significant momentum toward sustainable energy goals. However, the accumulation of dust and contaminants on panel surfaces markedly reduces efficiency by blocking solar radiation from reaching the surface. Consequently, dust detection has become a critical area of research into the energy efficiency of PV systems. This study proposes SolPowNet, a novel lightweight Convolutional Neural Network (CNN) model that is capable of reliably distinguishing between images of clean and dusty panels. The performance of the proposed model was evaluated by testing it on a dataset containing images of 502 clean panels and 340 dusty panels and comprehensively comparing it with state-of-the-art CNN-based approaches. The experimental results demonstrate that SolPowNet achieves an accuracy of 98.82%, providing 5.88%, 3.57%, 4.7%, 18.82%, and 0.02% higher accuracy than the AlexNet, VGG16, VGG19, ResNet50, and Inception V3 models, respectively. These results reveal that the proposed architecture exhibits more effective classification performance than other CNN models. In conclusion, SolPowNet, with its low computational cost and lightweight structure, enables integration into embedded and real-time applications. Thus, it offers a practical solution for optimizing maintenance planning in photovoltaic systems, managing panel cleaning intervals based on data, and minimizing energy production losses.
17 pages, 3452 KB  
Article
A Deep Regression Model for Tongue Image Color Correction Based on CNN
by Xiyuan Cao, Delong Zhang, Chunyang Jin, Wei Zhang, Zhidong Zhang and Chenyang Xue
J. Imaging 2025, 11(11), 381; https://doi.org/10.3390/jimaging11110381 - 29 Oct 2025
Abstract
Different viewing or shooting situations can affect color authenticity and generally lead to visual inconsistencies for the same images. Deep learning has gained popularity and opened up new avenues for image processing and optimization. In this paper, we propose a novel regression model named TococoNet (Tongue Color Correction Network), which extends a CNN (convolutional neural network) to eliminate the color bias in tongue images. The TococoNet model consists of symmetric encoder–decoder U-Blocks which are connected by an M-Block through concatenation layers for feature fusion at different levels. Initially, we train our model by introducing five common simulated color biases. The various image quality indicators holistically demonstrate that our model achieves accurate color correction for tongue images and surpasses conventional algorithms and shallow networks. Furthermore, we conduct correction experiments by introducing random degrees of color bias, and the model continues to perform well. The model achieves up to 84% correction effectiveness in terms of color distance ΔE for tongue images with varying degrees of random color cast. Finally, we obtain excellent color correction for actual captured images for tongue diagnosis application. Among these, the maximum ΔE can be reduced from 30.38 to 6.05. Overall, the TococoNet model possesses excellent color correction capabilities, which opens promising opportunities for clinical assistance and automatic diagnosis.
(This article belongs to the Section Image and Video Processing)
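The abstract above reports results as a color distance ΔE. As a minimal sketch, the code below uses the CIE76 definition (Euclidean distance in CIELAB); this is an assumption, since the abstract does not say which ΔE formula the authors use. The Lab triples are made-up illustration values, and only the 30.38 and 6.05 figures come from the abstract.

```python
import math
from typing import Tuple

Lab = Tuple[float, float, float]


def delta_e_cie76(lab1: Lab, lab2: Lab) -> float:
    """CIE76 color difference: Euclidean distance between two CIELAB colors."""
    return math.dist(lab1, lab2)


# Hypothetical reference and measured colors in CIELAB (L*, a*, b*).
example_delta_e = delta_e_cie76((50.0, 10.0, 10.0), (53.0, 14.0, 10.0))

# Relative reduction of the maximum Delta E quoted in the abstract.
before, after = 30.38, 6.05
reduction = (before - after) / before

print(f"example Delta E = {example_delta_e:.2f}")
print(f"maximum Delta E reduced by {reduction:.1%}")
```

The relative reduction computed here concerns only the quoted maximum ΔE on captured images; it is a separate quantity from the "up to 84%" correction effectiveness reported for the random color cast experiments.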
19 pages, 20616 KB  
Article
Toward Trustworthy On-Device AI: A Quantization-Robust Parameterized Hybrid Neural Filtering Framework
by Sangwoo Hong, Seung-Wook Kim, Seunghyun Moon and Seowon Ji
Mathematics 2025, 13(21), 3447; https://doi.org/10.3390/math13213447 - 29 Oct 2025
Abstract
Recent advances in deep learning have led to a proliferation of AI services for the general public. Consequently, constructing trustworthy AI systems that operate on personal devices has become a crucial challenge. While on-device processing is critical for privacy-preserving and latency-sensitive applications, conventional deep learning approaches often suffer from instability under quantization and high computational costs. Toward a trustworthy and efficient on-device solution for image processing, we present a hybrid neural filtering framework that combines the representational power of lightweight neural networks with the stability of classical filters. In our framework, the neural network predicts a low-dimensional parameter map that guides the filter’s behavior, effectively decoupling parameter estimation from the final image synthesis. This design enables a truly trustworthy AI system by operating entirely on-device, which eliminates the reliance on servers and significantly reduces computational cost. To ensure quantization robustness, we introduce a basis-decomposed parameterization, a design mathematically proven to bound reconstruction errors. Our network predicts a set of basis maps that are combined via fixed coefficients to form the final guidance. This architecture is intrinsically robust to quantization and supports runtime-adaptive precision without retraining. Experiments on depth map super-resolution validate our approach. Our framework demonstrates exceptional quantization robustness, exhibiting no performance degradation under 8-bit quantization, whereas a baseline suffers a significant 1.56 dB drop. Furthermore, our model’s significantly lower Mean Squared Error highlights its superior stability, providing a practical and mathematically grounded pathway toward trustworthy on-device AI.
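To give some intuition for why combining quantized basis maps with fixed coefficients can keep the error bounded, here is a minimal sketch of a generic triangle-inequality argument. It is not the paper's proof or parameterization: the uniform quantizer, the coefficient values, and the per-pixel basis values are all hypothetical.

```python
from typing import Sequence


def quantize(value: float, bits: int = 8) -> float:
    """Uniformly quantize a value in [0, 1] onto 2**bits levels."""
    levels = 2 ** bits - 1
    return round(value * levels) / levels


def combine(basis_values: Sequence[float], coefficients: Sequence[float]) -> float:
    """Fixed linear combination of basis values (one pixel of the guidance map)."""
    return sum(c * b for c, b in zip(coefficients, basis_values))


BITS = 8
coefficients = [0.5, 0.3, 0.2]            # hypothetical fixed coefficients
basis_values = [0.1234, 0.5678, 0.9012]   # hypothetical basis-map values at one pixel

exact = combine(basis_values, coefficients)
quantized = combine([quantize(b, BITS) for b in basis_values], coefficients)
error = abs(exact - quantized)

# Each quantized value is off by at most half a step, so by the triangle
# inequality the combined error is at most sum(|c_k|) * half_step.
half_step = 0.5 / (2 ** BITS - 1)
bound = sum(abs(c) for c in coefficients) * half_step

print(f"error={error:.6f}  bound={bound:.6f}")
```

Because the coefficients are fixed, the bound depends only on the bit width, which hints at how precision could be changed at run time without retraining.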
23 pages, 3915 KB  
Article
A Comparative Study of Generative Adversarial Networks in Medical Image Processing
by Marwa Mahfodh Abdulqader and Adnan Mohsin Abdulazeez
Eng 2025, 6(11), 291; https://doi.org/10.3390/eng6110291 - 29 Oct 2025
Abstract
The rapid development of Generative Adversarial Networks (GANs) has transformed medical image processing, enabling realistic image synthesis, augmentation, and restoration. This study presents a comparative evaluation of three representative GAN architectures, Pix2Pix, SPADE GAN, and Wasserstein GAN (WGAN), across multiple medical imaging tasks, including segmentation, image synthesis, and enhancement. Experiments were conducted on three benchmark datasets: ACDC (cardiac MRI), Brain Tumor MRI, and CHAOS (abdominal MRI). Model performance was assessed using Fréchet Inception Distance (FID), Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), Dice coefficient, and segmentation accuracy. Results show that SPADE-inpainting achieved the best image fidelity (PSNR ≈ 36 dB, SSIM > 0.97, Dice ≈ 0.94, FID < 0.01), while Pix2Pix delivered the highest segmentation accuracy (Dice ≈ 0.90 on ACDC). WGAN provided stable enhancement and strong visual sharpness on smaller datasets such as Brain Tumor MRI. The findings confirm that no single GAN architecture universally excels across all tasks; performance depends on data complexity and task objectives. Overall, GANs demonstrate strong potential for medical image augmentation and synthesis, though their clinical utility remains dependent on anatomical fidelity and dataset diversity.
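Two of the metrics listed in this abstract, PSNR and the Dice coefficient, are simple enough to sketch directly. The code below uses their standard definitions on flattened pixel lists; the 8-bit peak value and the tiny sample inputs are assumptions for illustration, not data from the study.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], test: Sequence[float], max_value: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio in dB between two equally sized pixel lists."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def dice(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """Dice coefficient between two binary masks given as 0/1 sequences."""
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(mask_a) + sum(mask_b)
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical 4-pixel "images" and segmentation masks.
psnr_value = psnr([0, 128, 255, 64], [0, 130, 250, 64])
dice_value = dice([1, 1, 0, 0], [1, 0, 0, 0])

print(f"PSNR = {psnr_value:.2f} dB, Dice = {dice_value:.3f}")
```

PSNR rewards pixel-level fidelity while Dice rewards region overlap, which is why a model can lead on one metric and trail on the other, as the comparison in the abstract suggests.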
10 pages, 1364 KB  
Article
Automated Detection of Lumbosacral Transitional Vertebrae on Plain Lumbar Radiographs Using a Deep Learning Model
by Donghyuk Kwak, Du Hyun Ro and Dong-Ho Kang
J. Clin. Med. 2025, 14(21), 7671; https://doi.org/10.3390/jcm14217671 (registering DOI) - 29 Oct 2025
Abstract
Background/Objectives: Lumbosacral transitional vertebra (LSTV) is a common anatomical variant, but its identification on plain radiographs is often inconsistent. This inconsistency can lead to clinical complications such as chronic low back pain, misinterpretation of spinal parameters, and an increased risk of wrong-level surgery. This study aimed to develop and validate a deep learning-based artificial intelligence (AI) model for the automated detection of LSTV on plain lumbar radiographs. Methods: This retrospective observational study included a total of 3116 standing lumbar lateral radiographs. The presence or absence of LSTV was definitively established using whole-spine imaging, CT, or MRI. Multiple deep learning architectures, including DINOv2, CLIP (ViT-B/32), and ResNet-50, were initially evaluated for binary classification of LSTV. Among these, the ResNet-50 model with partial fine-tuning achieved the best test performance and was subsequently selected for fivefold cross-validation using the training set. Model performance was assessed using accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUROC), and interpretability was evaluated using gradient-weighted class activation mapping (Grad-CAM). Results: On the independent test set of 313 radiographs, the final model demonstrated robust diagnostic performance. It achieved an accuracy of 76.4%, a sensitivity of 85.1%, a specificity of 61.9%, and an AUROC of 0.84. The model correctly identified 166 out of 195 LSTV cases and 73 out of 118 normal cases. Conclusions: This AI-based system offers a highly accurate and reliable method for the automated detection of LSTV on plain radiographs. It shows strong potential as a clinical decision-support tool to reduce diagnostic errors, improve pre-operative planning, and enhance patient safety.
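The accuracy, sensitivity, and specificity reported in this abstract can be recomputed from the case counts it gives (166 of 195 LSTV cases and 73 of 118 normal cases correctly identified). The sketch below does that arithmetic with the standard confusion-matrix definitions; the variable names are illustrative.

```python
# Counts taken from the abstract's test-set results.
lstv_total, lstv_correct = 195, 166        # positives and true positives
normal_total, normal_correct = 118, 73     # negatives and true negatives

tp = lstv_correct
fn = lstv_total - lstv_correct             # LSTV cases missed
tn = normal_correct
fp = normal_total - normal_correct         # normal cases flagged as LSTV

sensitivity = tp / (tp + fn)               # true positive rate
specificity = tn / (tn + fp)               # true negative rate
accuracy = (tp + tn) / (tp + fn + tn + fp)

print(f"sensitivity={sensitivity:.1%} specificity={specificity:.1%} accuracy={accuracy:.1%}")
```

The recomputed values agree with the percentages stated in the abstract, and the counts sum to the 313 radiographs in the independent test set.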
17 pages, 3889 KB  
Article
STGAN: A Fusion of Infrared and Visible Images
by Liuhui Gong, Yueping Han and Ruihong Li
Electronics 2025, 14(21), 4219; https://doi.org/10.3390/electronics14214219 - 29 Oct 2025
Abstract
The fusion of infrared and visible images provides critical value in computer vision by integrating their complementary information, especially in the field of industrial detection, where it provides a more reliable data basis for subsequent defect recognition. This paper presents STGAN, a novel Generative Adversarial Network framework based on a Swin Transformer for high-quality infrared and visible image fusion. First, the generator employs a Swin Transformer as its backbone for feature extraction within a U-Net architecture, and an improved W-MSA is introduced into the bottleneck layer to enhance local attention and improve the expression ability of cross-modal features. Second, a Markovian discriminator is used for discrimination. Then, the core GAN framework is leveraged to guarantee the retention of both infrared thermal radiation and visible-light texture details in the generated image so as to improve the clarity and contrast of the fused image. Finally, simulation verification showed that six out of seven indicators ranked in the top two, especially key indicators such as PSNR, VIF, MI, and EN, which achieved optimal or suboptimal values. The experimental results on the general dataset show that this method is superior to advanced methods in terms of subjective vision and objective indicators, and it can effectively enhance the fine structure and thermal anomaly information in the image, which gives it great potential in the application of industrial surface defect detection.
23 pages, 3485 KB  
Article
MMA-Net: A Semantic Segmentation Network for High-Resolution Remote Sensing Images Based on Multimodal Fusion and Multi-Scale Multi-Attention Mechanisms
by Xuanxuan Huang, Xuejie Zhang, Longbao Wang, Dandan Yuan, Shufang Xu, Fengguang Zhou and Zhijun Zhou
Remote Sens. 2025, 17(21), 3572; https://doi.org/10.3390/rs17213572 - 28 Oct 2025
Abstract
Semantic segmentation of high-resolution remote sensing images is of great application value in fields like natural disaster monitoring. Current multimodal semantic segmentation methods have improved the model’s ability to recognize different ground objects and complex scenes by integrating multi-source remote sensing data. However, these methods still face challenges such as blurred boundary segmentation and insufficient perception of multi-scale ground objects when achieving high-precision classification. To address these issues, this paper proposes MMA-Net, a semantic segmentation network enhanced by two key modules: a cross-layer multimodal fusion module and a multi-scale multi-attention module. These modules effectively improve the model’s ability to capture detailed features and model multi-scale ground objects, thereby enhancing boundary segmentation accuracy, detail feature preservation, and consistency in multi-scale object segmentation. Specifically, the cross-layer multimodal fusion module adopts a staged fusion strategy to integrate detailed information and multimodal features, realizing detail preservation and modal synergy enhancement. The multi-scale multi-attention module combines cross-attention and self-attention to leverage long-range dependencies and inter-modal complementary relationships, strengthening the model’s feature representation for multi-scale ground objects. Experimental results show that MMA-Net outperforms state-of-the-art methods on the Potsdam and Vaihingen datasets, reaching mIoU values of 88.74% and 84.92%, respectively. Ablation experiments further verify that each proposed module contributes to the final performance.
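The mIoU figure quoted in this abstract is the mean of per-class intersection-over-union scores. As a minimal sketch of the standard definition, the code below derives it from a confusion matrix; the 2×2 matrix is a made-up example and has no connection to the Potsdam or Vaihingen results.

```python
from typing import List


def mean_iou(confusion: List[List[int]]) -> float:
    """Mean IoU from a square confusion matrix (rows: true class, cols: predicted)."""
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                          # true c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp     # predicted c, true other
        denom = tp + fp + fn
        if denom:                                            # skip classes absent everywhere
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical pixel counts for a two-class problem.
confusion = [
    [8, 2],   # true class 0: 8 correct, 2 predicted as class 1
    [1, 9],   # true class 1: 1 predicted as class 0, 9 correct
]
miou = mean_iou(confusion)
print(f"mIoU = {miou:.4f}")
```

Because every class contributes equally to the mean, mIoU penalizes poor segmentation of small or rare ground objects more than overall pixel accuracy does.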

15 pages, 2237 KB  
Article
LPI Radar Waveform Modulation Recognition Based on Improved EfficientNet
by Yuzhi Qi, Lei Ni, Xun Feng, Hongquan Li and Yujia Zhao
Electronics 2025, 14(21), 4214; https://doi.org/10.3390/electronics14214214 - 28 Oct 2025
Abstract
To address the low modulation recognition accuracy of Low Probability of Intercept (LPI) radar waveforms under low Signal-to-Noise Ratio (SNR) conditions, a critical limitation in current radar signal processing research, this study proposes a recognition framework built on an improved EfficientNet model. First, the radar signals are converted into time–frequency images using the Choi–Williams Distribution (CWD). Second, the Simple Attention Module (SimAM) is incorporated into the Mobile Inverted Bottleneck Convolution (MBConv) structure to improve the network’s capacity to extract features from time–frequency images. Specifically, the original serial mechanism within the MBConv structure is replaced with a parallel convolution-and-attention arrangement, further optimizing feature extraction efficiency. Third, the network’s loss function is replaced with Focal Loss to mitigate low recognition rates for specific radar signal types during training: by dynamically adjusting the loss weights of hard-to-recognize samples, it improves the classification accuracy of challenging categories. Simulation experiments were conducted on 13 distinct types of LPI radar signals. The results support the effectiveness of the proposed approach for LPI waveform modulation recognition, with the improved model achieving an overall recognition accuracy of 96.48% on the test set.
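The Focal Loss step can be illustrated with the standard formulation, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The sketch below is pure Python for a single sample; the function name and the default values `gamma=2.0` and `alpha=1.0` are assumptions, since the abstract does not state the settings the authors used.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample.

    probs  -- predicted class probabilities (assumed to sum to 1)
    target -- index of the true class
    gamma  -- focusing parameter; 0 recovers plain cross-entropy
    alpha  -- scalar weighting factor (assumed constant across classes here)
    """
    # Clamp to avoid log(0) for a confidently wrong prediction.
    p_t = min(max(probs[target], 1e-12), 1.0)
    # (1 - p_t)^gamma shrinks the loss of well-classified samples,
    # so hard samples dominate the total.
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
```

With `gamma=0.0` the modulating factor is 1 and the function reduces to ordinary cross-entropy; with larger `gamma`, a sample predicted at 0.9 contributes far less than one predicted at 0.3, which is the down-weighting of easy samples the abstract describes.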

35 pages, 18392 KB  
Article
Assessing the Impacts of Land Cover and Climate Changes on Streamflow Dynamics in the Río Negro Basin (Colombia) Under Present and Future Scenarios
by Blanca A. Botero, Juan C. Parra, Juan M. Benavides, César A. Olmos-Severiche, Rubén D. Vásquez-Salazar, Juan Valdés-Quintero, Sandra Mateus, Jean P. Díaz-Paz, Lorena Díez, Andrés F. García and Oscar E. Cossio
Hydrology 2025, 12(11), 281; https://doi.org/10.3390/hydrology12110281 - 28 Oct 2025
Abstract
Understanding and quantifying the coupled effects of land cover change and climate change on hydrological regimes is critical for sustainable water management in tropical mountainous regions. The Río Negro Basin in eastern Antioquia, Colombia, has undergone rapid urban expansion, agricultural intensification, and deforestation over recent decades, profoundly altering its hydrological dynamics. This study integrates advanced satellite image processing, AI-based land cover modeling, climate change projections, and distributed hydrological simulation to assess future streamflow responses. Multi-sensor satellite data (Landsat, Sentinel-1, Sentinel-2, ALOS) were processed using Random Forest classifiers, intelligent multisensor fusion, and probabilistic neural networks to generate high-resolution land cover maps and scenarios for 2060 (optimistic, trend, and pessimistic), with strict area constraints for urban growth and forest conservation. Future precipitation was derived from MPI-ESM CMIP6 outputs (SSP2-4.5, SSP3-7.0, SSP5-8.5) and statistically downscaled using Empirical Quantile Mapping (EQM) to match the basin scale and the precipitation records of Colombia’s national hydrometeorological service, IDEAM (Instituto de Hidrología, Meteorología y Estudios Ambientales). The TETIS hydrological model was calibrated and validated using observed streamflow records (1998–2023) and subsequently used to simulate hydrological responses under combined land cover and climate scenarios. Results indicate that urban expansion and forest loss significantly increase peak flows (Q90, Q95) and flood risk while decreasing baseflows (Q10, Q30), compromising water availability during dry seasons. Conversely, conservation-oriented scenarios mitigate these effects by enhancing flow regulation and groundwater recharge. The findings highlight that targeted land management can partially offset the negative impacts of climate change, underscoring the importance of integrated land–water planning in the Andes. This work provides a replicable framework for modeling hydrological futures in data-scarce mountainous basins, offering actionable insights for regional authorities, environmental agencies, and national institutions responsible for water security and disaster risk management.
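Empirical Quantile Mapping, the downscaling step named in this abstract, can be sketched as follows: a model value is converted to its non-exceedance probability in the model's historical distribution, and the observed value at that same probability is returned. This is a minimal pure-Python illustration under assumptions (linear interpolation, clamping outside the historical range, hypothetical function names); it is not the authors' implementation, and operational EQM usually adds wet-day handling and tail extrapolation.

```python
from bisect import bisect_left


def _quantile(sorted_vals, p):
    """Linearly interpolated quantile of pre-sorted values, p in [0, 1]."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def _cdf(sorted_vals, x):
    """Empirical non-exceedance probability of x, clamped to [0, 1]."""
    if x <= sorted_vals[0]:
        return 0.0
    if x >= sorted_vals[-1]:
        return 1.0
    i = bisect_left(sorted_vals, x)  # sorted_vals[i-1] < x <= sorted_vals[i]
    lo, hi = sorted_vals[i - 1], sorted_vals[i]
    frac = (x - lo) / (hi - lo)
    return (i - 1 + frac) / (len(sorted_vals) - 1)


def quantile_map(x, model_hist, obs_hist):
    """Bias-correct one model value x against observations.

    model_hist -- model output over the historical reference period
    obs_hist   -- station observations over the same period
    """
    model_sorted = sorted(model_hist)
    obs_sorted = sorted(obs_hist)
    return _quantile(obs_sorted, _cdf(model_sorted, x))
```

As a toy check, with a model history of `[0, 1, 2, 3, 4]` and observations `[0, 2, 4, 6, 8]`, the model value 2.0 sits at the median of the model distribution and maps to the observed median, 4.0.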