Search Results (306)

Search Parameters:
Keywords = multiscale dilated convolution

25 pages, 1197 KB  
Article
A Multi-Scale Feature Fusion Linear Attention Model for Movie Review Sentiment Analysis
by Zi Jiang and Chengjun Xu
Big Data Cogn. Comput. 2025, 9(12), 325; https://doi.org/10.3390/bdcc9120325 - 18 Dec 2025
Viewed by 127
Abstract
Sentiment classification is a key technique for analyzing the emotional tendency of user reviews and is of great significance to movie recommendation systems. However, existing methods often face challenges in practical applications due to complex model structures, low computational efficiency, or difficulties in balancing local details with global contextual features. To address these issues, this paper proposes a Multi-Scale Feature Fusion Linear Attention model (MSFFLA). The model consists of three core modules: the BERT Encoder module for extracting basic semantic features; the Parallel Multi-scale Feature Extraction module (PMFE), which employs multi-branch dilated convolutions to accurately capture local fine-grained features; and the Global Multi-scale Linear Feature Extraction module (MGLFE), which introduces a Multi-Scale Linear Attention mechanism (MSLA) to efficiently model global contextual dependencies with approximately linear computational complexity. Extensive experiments were conducted on three public datasets: SST-2, Amazon Reviews, and MR. The results show that, compared to the state-of-the-art BERT-CondConv model, our model improves accuracy and F1-score by 1.8% and 0.4%, respectively, on the SST-2 dataset, and by 1.5% and 0.3% on the Amazon Reviews dataset. This study not only validates the effectiveness of the proposed model but also provides an efficient and lightweight solution for sentiment classification in movie recommendation systems, demonstrating promising prospects for practical application.
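
The paper does not spell out the MSLA mechanism beyond its roughly linear complexity, but a standard way to obtain linear attention is the positive-feature-map trick (phi(x) = elu(x) + 1) of Katharopoulos et al.; the sketch below illustrates that general idea in PyTorch, with module and parameter names of my own, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearAttention(nn.Module):
    """Kernelized linear attention: O(N*d^2) instead of O(N^2*d).

    Generic sketch of the feature-map trick; the paper's MSLA module
    may differ in detail (e.g., how multiple scales are combined).
    """
    def __init__(self, dim, heads=4):
        super().__init__()
        self.heads = heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.out = nn.Linear(dim, dim)

    def forward(self, x):                                 # x: (B, N, D)
        B, N, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (t.view(B, N, self.heads, -1).transpose(1, 2) for t in (q, k, v))
        q, k = F.elu(q) + 1, F.elu(k) + 1                 # positive feature maps
        kv = torch.einsum('bhnd,bhne->bhde', k, v)        # sum over tokens first
        z = 1 / (torch.einsum('bhnd,bhd->bhn', q, k.sum(dim=2)) + 1e-6)
        out = torch.einsum('bhnd,bhde,bhn->bhne', q, kv, z)
        return self.out(out.transpose(1, 2).reshape(B, N, D))

x = torch.randn(2, 128, 256)                              # (batch, tokens, dim)
print(LinearAttention(256)(x).shape)                      # torch.Size([2, 128, 256])
```

Because the key-value summary is computed once per sequence, the cost grows linearly in the number of tokens rather than quadratically.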

23 pages, 2619 KB  
Article
LITransformer: Transformer-Based Vehicle Trajectory Prediction Integrating Spatio-Temporal Attention Networks with Lane Topology and Dynamic Interaction
by Yuanchao Zhong, Zhiming Gui, Zhenji Gao, Xinyu Wang and Jiawen Wei
Electronics 2025, 14(24), 4950; https://doi.org/10.3390/electronics14244950 - 17 Dec 2025
Viewed by 225
Abstract
Vehicle trajectory prediction is a pivotal technology in intelligent transportation systems. Existing methods encounter challenges in effectively modeling lane topology and dynamic interaction relationships in complex traffic scenarios, limiting prediction accuracy and reliability. This paper presents Lane Interaction Transformer (LITransformer), a lane-informed trajectory prediction framework that builds on spatio-temporal graph attention networks and Transformer-based global aggregation. Rather than introducing entirely new network primitives, LITransformer focuses on two design aspects: (i) a lane topology encoder that fuses geometric and semantic lane features via direction-sensitive, multi-scale dilated graph convolutions, converting vectorized lane data into rich topology-aware representations; and (ii) an Interaction-Aware Graph Attention mechanism (IAGAT) that explicitly models four types of interactions between vehicles and lane infrastructure (V2V, V2N, N2V, N2N), with gating-based fusion of structured road constraints and dynamic spatio-temporal features. The overall architecture employs a Transformer module to aggregate global scene context and a multi-modal decoding head to generate diverse trajectory hypotheses with confidence estimation. Extensive experiments on the Argoverse dataset show that LITransformer achieves a minADE of 0.76 and a minFDE of 1.20, significantly outperforming representative baselines such as LaneGCN and HiVT. These results demonstrate that explicitly incorporating lane topology and interaction-aware spatio-temporal modeling can significantly improve the accuracy and reliability of vehicle trajectory prediction in complex real-world traffic scenarios.
(This article belongs to the Special Issue Autonomous Vehicles: Sensing, Mapping, and Positioning)
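
For readers unfamiliar with the reported metrics: minADE and minFDE score the best of K predicted trajectory hypotheses against the ground truth. A NumPy sketch (array shapes are illustrative; the Argoverse convention picks the best hypothesis by endpoint error):

```python
import numpy as np

def min_ade_fde(preds, gt):
    """preds: (K, T, 2) candidate trajectories; gt: (T, 2) ground truth."""
    dists = np.linalg.norm(preds - gt[None], axis=-1)  # (K, T) per-step errors
    fde = dists[:, -1]                                 # final displacement per hypothesis
    best = fde.argmin()                                # best endpoint (Argoverse convention)
    return dists[best].mean(), fde[best]               # (minADE, minFDE)

K, T = 6, 30
preds = np.cumsum(np.random.randn(K, T, 2) * 0.1, axis=1)
gt = np.cumsum(np.random.randn(T, 2) * 0.1, axis=0)
ade, fde = min_ade_fde(preds, gt)
print(f"minADE={ade:.2f}, minFDE={fde:.2f}")
```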

26 pages, 8544 KB  
Article
Hi-MDTCN: Hierarchical Multi-Scale Dilated Temporal Convolutional Network for Tool Condition Monitoring
by Anying Chai, Zhaobo Fang, Mengjia Lian, Ping Huang, Chenyang Guo, Wanda Yin, Lei Wang, Enqiu He and Siwen Li
Sensors 2025, 25(24), 7603; https://doi.org/10.3390/s25247603 - 15 Dec 2025
Viewed by 254
Abstract
Accurate identification of tool wear conditions is of great significance for extending tool life, ensuring processing quality, and improving production efficiency. Current research shows that signals collected by a single sensor have limited dimensions and cannot comprehensively capture the degradation process of tool wear, while multi-sensor fusion recognition methods cannot effectively handle the complementarity and redundancy between heterogeneous sensor data during feature extraction and fusion. To address these issues, this paper proposes Hi-MDTCN (Hierarchical Multi-scale Dilated Temporal Convolutional Network). Within the network, we propose a hierarchical signal analysis framework that processes the signal in segments. For intra-segment signals, we design a multi-channel one-dimensional convolutional network with an attention mechanism to capture local wear features at different time scales and fuse them into a unified representation. Across segments, we design a Bi-TCN module to further capture long-term dependencies in wear evolution, mining the overall trend of tool wear over time. Hi-MDTCN adopts a dilated convolution mechanism, which achieves an extremely large receptive field without an overly deep network structure. This avoids the problems recurrent neural networks face in long-sequence modeling, such as vanishing gradients, low training efficiency, and poor parallelism, and enables efficient parallel capture of long-range dependencies in time series. Finally, the proposed method is applied to the PHM2010 milling dataset. Experimental results show that the model's tool condition recognition accuracy is higher than that of traditional methods, demonstrating its effectiveness for practical applications.
(This article belongs to the Special Issue Sensing Technologies in Industrial Defect Detection)
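
The claim that dilation yields a very large receptive field without a deep network has a simple closed form: a stack of kernel-size-k causal convolutions with dilations d_1..d_L sees 1 + (k - 1) * sum(d_i) time steps. A minimal sketch, with layer names and dilation schedule as assumptions rather than the paper's configuration:

```python
import torch
import torch.nn as nn

def receptive_field(kernel_size, dilations):
    """RF of stacked dilated convs: 1 + (k - 1) * sum(dilations)."""
    return 1 + (kernel_size - 1) * sum(dilations)

class CausalDilatedConv(nn.Module):
    """One causal dilated Conv1d: left-pad so no future time step leaks in."""
    def __init__(self, ch, kernel_size=3, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(ch, ch, kernel_size, dilation=dilation)

    def forward(self, x):                                  # x: (B, C, T)
        return self.conv(nn.functional.pad(x, (self.pad, 0)))

dilations = [1, 2, 4, 8, 16, 32]
print(receptive_field(3, dilations))                       # 127 steps from 6 layers
net = nn.Sequential(*[CausalDilatedConv(16, 3, d) for d in dilations])
print(net(torch.randn(2, 16, 100)).shape)                  # torch.Size([2, 16, 100])
```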

26 pages, 7430 KB  
Article
PMSAF-Net: A Progressive Multi-Scale Asymmetric Fusion Network for Lightweight and Multi-Platform Thin Cloud Removal
by Li Wang and Feng Liang
Remote Sens. 2025, 17(24), 4001; https://doi.org/10.3390/rs17244001 - 11 Dec 2025
Viewed by 148
Abstract
With the rapid improvement of deep learning, significant progress has been made in cloud removal for remote sensing images (RSIs). However, the practical deployment of existing methods on multi-platform devices faces several limitations, including high computational complexity that prevents real-time processing, substantial hardware resource demands that are unsuitable for edge devices, and inadequate performance in complex cloud scenarios. To address these challenges, we propose PMSAF-Net, a lightweight Progressive Multi-Scale Asymmetric Fusion Network designed for efficient thin cloud removal. The proposed network employs a Dual-Branch Asymmetric Attention (DBAA) module to optimize spatial details and channel dependencies, reducing computation cost while improving feature extraction. A Multi-Scale Context Aggregation (MSCA) mechanism captures multi-level contextual information through hierarchical dilated convolutions, effectively handling clouds of varying scales and complexities. A Refined Residual Block (RRB) minimizes boundary artifacts through reflection padding and residual calibration. Additionally, an Iterative Feature Refinement (IFR) module progressively enhances feature representations via dense cross-stage connections. Extensive experiments on multi-platform datasets show that the proposed method achieves favorable performance against state-of-the-art algorithms. With only 0.32 M parameters, PMSAF-Net maintains low computational costs, demonstrating its strong potential for multi-platform deployment on resource-constrained edge devices.
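
The MSCA idea of aggregating context through hierarchical dilated convolutions resembles the well-known ASPP pattern: parallel 3x3 branches at several dilation rates, fused by a 1x1 convolution. A hedged sketch; the branch count, rates, and fusion used in PMSAF-Net are not specified at this level:

```python
import torch
import torch.nn as nn

class MultiScaleContext(nn.Module):
    """Parallel dilated 3x3 convs fused by a 1x1 conv (ASPP-style sketch)."""
    def __init__(self, ch, rates=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, padding=r, dilation=r) for r in rates)
        self.fuse = nn.Conv2d(ch * len(rates), ch, 1)      # merge scale branches

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

x = torch.randn(1, 32, 64, 64)
print(MultiScaleContext(32)(x).shape)                      # torch.Size([1, 32, 64, 64])
```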

22 pages, 1188 KB  
Article
EFDepth: A Monocular Depth Estimation Model for Multi-Scale Feature Optimization
by Fengchun Liu, Xinying Shao, Chunying Zhang, Liya Wang, Lu Liu and Jing Ren
Sensors 2025, 25(23), 7379; https://doi.org/10.3390/s25237379 - 4 Dec 2025
Viewed by 418
Abstract
To address the accuracy issues in monocular depth estimation caused by insufficient feature extraction and inadequate context modeling, a multi-scale feature optimization model named EFDepth was proposed to improve prediction performance. This framework adopted an encoder–decoder structure: the encoder (EC-Net) was composed of MobileNetV3-E and ETFBlock, and its features were optimized through multi-scale dilated convolution; the decoder (LapFA-Net) combined the Laplacian pyramid and the FMA module to enhance cross-scale feature fusion and output accurate depth maps. Comparative experiments between EFDepth and algorithms including Lite-mono, Hr-depth, and Lapdepth were conducted on the KITTI dataset. The results show that, for the three error metrics, RMSE (Root Mean Square Error), AbsRel (Absolute Relative Error), and SqRel (Squared Relative Error), EFDepth is 1.623, 0.030, and 0.445 lower than the average values of the comparison algorithms, respectively; for the three accuracy metrics, it is 0.052, 0.023, and 0.011 higher than the average values, respectively. Experimental results indicate that EFDepth outperforms the comparison methods on most metrics, providing an effective reference for monocular depth estimation and 3D reconstruction of complex scenes.
(This article belongs to the Section Sensing and Imaging)
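
The error and accuracy metrics quoted above are the standard KITTI depth-evaluation set; the three unnamed accuracy metrics are conventionally the delta < 1.25^i thresholds. A NumPy sketch (the synthetic data is only for demonstration):

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard monocular-depth metrics (KITTI convention)."""
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    sq_rel = np.mean((pred - gt) ** 2 / gt)
    ratio = np.maximum(pred / gt, gt / pred)
    acc = {f"delta<1.25^{i}": np.mean(ratio < 1.25 ** i) for i in (1, 2, 3)}
    return {"RMSE": rmse, "AbsRel": abs_rel, "SqRel": sq_rel, **acc}

gt = np.random.uniform(1.0, 80.0, 10_000)                  # depths in metres
pred = gt * np.random.uniform(0.9, 1.1, gt.shape)          # hypothetical predictions
print(depth_metrics(pred, gt))
```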

20 pages, 55314 KB  
Article
MSFN-YOLOv11: A Novel Multi-Scale Feature Fusion Recognition Model Based on Improved YOLOv11 for Real-Time Monitoring of Birds in Wetland Ecosystems
by Linqi Wang, Lin Ye, Xinbao Chen and Nan Chu
Animals 2025, 15(23), 3472; https://doi.org/10.3390/ani15233472 - 2 Dec 2025
Viewed by 369
Abstract
Intelligent bird species recognition is vital for biodiversity monitoring and ecological conservation. This study tackles the challenge of declining recognition accuracy caused by occlusions and imaging noise in complex natural environments. Focusing on ten representative bird species from the Dongting Lake Wetland, we propose an improved YOLOv11n-based model named MSFN-YOLOv11 that incorporates multi-scale feature fusion. After selecting YOLOv11n as the baseline through comparison with the most stable version of YOLOv8n, we enhance its backbone by introducing an MSFN module, which strengthens global and local feature extraction via parallel dilated convolutions and a channel attention mechanism. Experiments are conducted on a self-built dataset containing 4540 images of ten species with 6824 samples. To simulate real-world conditions, 25% of the samples are augmented using random occlusion, Gaussian noise (σ = 0.2, 0.3, 0.4), and Poisson noise. The improved model achieves a mAP@50 of 96.4% and a mAP@50-95 of 83.2% on the test set. Although mAP@50 improves only slightly (0.3%) over the original YOLOv11, training time is reduced by 18%. The model also demonstrates practical efficacy on dynamic video, attaining an average accuracy of 63.1% on 1920 × 1080 video at 72 fps on an NVIDIA Tesla V100 SXM2 (32 GB). The proposed model provides robust technical support for real-time bird monitoring in wetlands and enhances conservation efforts for endangered species.
(This article belongs to the Special Issue Artificial Intelligence as a Useful Tool in Behavioural Studies)
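
The augmentation protocol (random occlusion plus Gaussian noise at σ = 0.2/0.3/0.4 and Poisson noise) is easy to reproduce; the occlusion size and photon scale below are my assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img, sigma=0.2, occlusion_frac=0.2, photon_scale=255.0):
    """img: float32 in [0, 1], shape (H, W, 3); returns a noisy, occluded copy."""
    h, w = img.shape[:2]
    out = img.copy()
    oh, ow = int(h * occlusion_frac), int(w * occlusion_frac)
    y, x = rng.integers(0, h - oh), rng.integers(0, w - ow)
    out[y:y + oh, x:x + ow] = 0.0                          # random rectangular occlusion
    out += rng.normal(0.0, sigma, out.shape)               # additive Gaussian noise
    out = rng.poisson(np.clip(out, 0, 1) * photon_scale) / photon_scale  # shot noise
    return np.clip(out, 0.0, 1.0).astype(np.float32)

img = rng.random((256, 256, 3), dtype=np.float32)
print(augment(img, sigma=0.3).shape)                       # (256, 256, 3)
```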

18 pages, 1835 KB  
Article
Towards Robust Medical Image Segmentation with Hybrid CNN–Linear Mamba
by Xiao Ma and Guangming Lu
Electronics 2025, 14(23), 4726; https://doi.org/10.3390/electronics14234726 - 30 Nov 2025
Viewed by 348
Abstract
Problem: Medical image segmentation faces critical challenges in balancing global context modeling and computational efficiency. While conventional neural networks struggle with long-range dependencies, Transformers incur quadratic complexity. Although Mamba-based architectures achieve linear complexity, they lack adaptive mechanisms for heterogeneous medical images and demonstrate insufficient local feature extraction capabilities. Method: We propose Linear Context-Aware Robust Mamba (LCAR–Mamba) to address these dual limitations through adaptive resource allocation and enhanced multi-scale extraction. LCAR–Mamba integrates two synergistic modules: the Context-Aware Linear Mamba Module (CALM) for adaptive global–local fusion, and the Multi-scale Partial Dilated Convolution Module (MSPD) for efficient multi-scale feature refinement. Core Innovations: The CALM module implements content-driven resource allocation through four-stage processing: (1) analyzing spatial complexity via gradient and activation statistics, (2) computing allocation weights to dynamically balance global and local processing branches, (3) parallel dual-path processing with linear attention and convolution, and (4) adaptive fusion guided by complexity weights. The MSPD module employs statistics-based channel selection and multi-scale partial dilated convolutions to capture features at multiple receptive scales while reducing computational cost. Key Results: On the ISIC2017 and ISIC2018 datasets, mIoU improvements of 0.81%/1.44% confirm effectiveness across 2D benchmarks. On the Synapse dataset, LCAR–Mamba achieves 85.56% DSC, outperforming the previous best Mamba baseline by 0.48% with 33% fewer parameters. Significance: LCAR–Mamba demonstrates that adaptive resource allocation and statistics-driven multi-scale extraction can address critical limitations in linear-complexity architectures, establishing a promising direction for efficient medical image segmentation.
(This article belongs to the Special Issue Target Tracking and Recognition Techniques and Their Applications)
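
"Partial" convolution here plausibly follows the FasterNet idea: convolve only a slice of the channels and pass the rest through untouched, cutting FLOPs roughly by the slice fraction. A minimal sketch; the paper's MSPD module adds statistics-based channel selection and multiple dilation rates on top of this:

```python
import torch
import torch.nn as nn

class PartialDilatedConv(nn.Module):
    """Dilated 3x3 conv on the first 1/div of channels; identity on the rest."""
    def __init__(self, ch, dilation=2, div=4):
        super().__init__()
        self.cp = ch // div                                # channels actually convolved
        self.conv = nn.Conv2d(self.cp, self.cp, 3,
                              padding=dilation, dilation=dilation)

    def forward(self, x):
        xa, xb = x[:, :self.cp], x[:, self.cp:]            # split along channels
        return torch.cat([self.conv(xa), xb], dim=1)       # ~1/div of the full cost

x = torch.randn(1, 64, 32, 32)
print(PartialDilatedConv(64)(x).shape)                     # torch.Size([1, 64, 32, 32])
```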

20 pages, 6998 KB  
Article
Seismic Data Enhancement for Tunnel Advanced Prediction Based on TSISTA-Net
by Deshan Feng, Mengchen Yang, Xun Wang, Wenxiu Yan, Chen Chen and Xiao Tao
Appl. Sci. 2025, 15(23), 12700; https://doi.org/10.3390/app152312700 - 30 Nov 2025
Viewed by 309
Abstract
Tunnel seismic advanced prediction is a widely used technique in geotechnical engineering due to its non-destructive characteristics and deep detection capability. However, limitations in acquisition space and complex on-site conditions often result in missing traces, damaged channels, and low-resolution data, thereby hindering accurate geological interpretation. Although deep learning models such as U-Net have shown promise in seismic data reconstruction, their emphasis on local features and fixed parameter configurations limits their capacity to capture global and long-range dependencies, thereby constraining reconstruction accuracy. To address these challenges, this study proposes a novel deep unrolling network, TSISTA-Net (Tunnel Seismic Iterative Shrinkage–Thresholding Algorithm Network), specifically designed to improve seismic data quality. Built upon the ISTA-Net architecture, TSISTA-Net incorporates three distinct innovations. First, reflection padding is utilized to minimize boundary artifacts and effectively recover edge information. Second, multi-scale dilated convolutions are employed to extend the receptive field, thereby facilitating the extraction of long-range and multi-scale features from seismic signals. Third, a lightweight and patch-based processing strategy is adopted, guaranteeing high computational efficiency while maintaining reconstruction quality. The effectiveness of the proposed method was validated on both synthetic and real tunnel seismic datasets. On synthetic data, TSISTA-Net achieved a PSNR of 37.28 dB, an SSIM of 0.9667, and an LCCC of 0.9357, outperforming U-Net (35.93 dB, 0.9480, 0.9087) and conventional ISTA-Net (34.04 dB, 0.9167, 0.8878). These results demonstrate superior signal fidelity, structural similarity, and local correlation relative to established baselines. Consistent improvements were also observed on real tunnel datasets, indicating that TSISTA-Net provides an efficient, data-driven solution for tunnel seismic data processing with strong potential for practical engineering applications.
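
ISTA-Net-style deep unrolling starts from the classical iterative shrinkage–thresholding update for min 0.5 * ||Ax - y||^2 + lambda * ||x||_1, namely x <- soft(x - t * A^T (Ax - y), lambda * t), and turns each iteration into a learnable network stage. The classical core, in NumPy (the sparse-recovery toy problem is only illustrative):

```python
import numpy as np

def soft(x, tau):
    """Soft-thresholding: the proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista(A, y, lam=0.05, n_iter=300):
    t = 1.0 / np.linalg.norm(A, 2) ** 2                    # step size 1/L, L = ||A||_2^2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft(x - t * A.T @ (A @ x - y), lam * t)       # gradient step + shrinkage
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(80, 200))
x_true = np.zeros(200)
x_true[rng.choice(200, 10, replace=False)] = rng.normal(size=10)
x_hat = ista(A, A @ x_true)
print(np.count_nonzero(np.abs(x_hat) > 1e-3))              # a sparse estimate
```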

22 pages, 2517 KB  
Article
A Novel Hybrid Framework for Stock Price Prediction Integrating Adaptive Signal Decomposition and Multi-Scale Feature Extraction
by Junqi Su, Raymond Y. K. Lau, Yuefeng Du, Jia Yu and Hui Zhang
Appl. Sci. 2025, 15(23), 12450; https://doi.org/10.3390/app152312450 - 24 Nov 2025
Viewed by 678
Abstract
To address the issue of low prediction accuracy due to the inherent high noise and non-stationary characteristics of stock price series, this paper proposes a novel stock price prediction framework (CVASD-MDCM-Informer) that integrates adaptive signal decomposition with multi-scale feature extraction. The framework first employs a CVASD module, a variational mode decomposition (VMD) method adaptively optimized by the Crested Porcupine Optimizer (CPO) algorithm, to decompose the original stock price series into a series of intrinsic mode functions (IMFs) with different frequency characteristics, effectively separating noise and multi-frequency signals. Subsequently, the decomposed components are input into a prediction network based on Informer. In the feature extraction phase, this paper designs a multi-scale dilated convolution module (MDCM) to replace the standard convolution of the Informer, enhancing the model's ability to capture short-term fluctuations and long-term trends by using convolution kernels with different dilation rates in parallel. Finally, the prediction results of each component are integrated to obtain the final predicted value. Experimental results on three representative industry datasets (Information Technology, Finance, and Consumer Staples) of the US S&P 500 index show that, compared to several advanced baseline models, the proposed framework demonstrates significant advantages across multiple evaluation metrics such as MAE, MSE, and RMSE. Ablation experiments further validate the effectiveness of the two core modules, CVASD and MDCM. The study indicates that the framework can effectively handle complex financial time series, providing a new solution for stock price prediction.
(This article belongs to the Special Issue Advanced Methods for Time Series Forecasting)
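
The MDCM replaces the Informer's standard convolution with parallel kernels at different dilation rates; a hedged 1D sketch of that pattern (the branch count, rates, and fusion are assumptions):

```python
import torch
import torch.nn as nn

class MultiDilationConv1d(nn.Module):
    """Parallel dilated Conv1d branches fused by a pointwise conv."""
    def __init__(self, ch, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(ch, ch, 3, padding=r, dilation=r) for r in rates)
        self.fuse = nn.Conv1d(ch * len(rates), ch, 1)

    def forward(self, x):                                  # x: (B, C, T)
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

x = torch.randn(4, 32, 96)                                 # e.g. 96 trading days
print(MultiDilationConv1d(32)(x).shape)                    # torch.Size([4, 32, 96])
```

Small dilation rates track short-term fluctuations while larger ones widen the temporal view toward long-term trends, matching the stated motivation.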

26 pages, 3916 KB  
Article
Multi-Length Prediction of the Drilling Rate of Penetration Based on TCN–Informer
by Jun Sun, Wendi Huang, Lin Du, Qianyu Yang, Bowen Deng and Xiqiao Chen
Electronics 2025, 14(22), 4538; https://doi.org/10.3390/electronics14224538 - 20 Nov 2025
Viewed by 314
Abstract
The Rate of Penetration (ROP) during drilling is nonstationary and exhibits coupled local fluctuations, which makes it challenging to model for accurate prediction. To address the challenge of modeling multi-scale temporal dependencies in drilling, this study introduces a hybrid TCN–Informer framework. It integrates the causal dilated Temporal Convolutional Network (TCN) for capturing short-term patterns with the Informer's ProbSparse attention mechanism for modeling long-range dependencies. A comprehensive methodology is adopted, which includes a four-stage data preprocessing pipeline featuring per-well z-score standardization and label concatenation, a sliding-window training scheme to address cold-start issues, and an Optuna-based Bayesian search for hyperparameter optimization. The prediction performance of the models was evaluated across various input sequence lengths using the Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Coefficient of Determination (R²). The results show that the proposed TCN–Informer demonstrates superior performance compared to Informer, Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Transformer. Furthermore, the predictions of the TCN–Informer respond more rapidly to abrupt changes in the ROP and yield smoother, more stable results during intervals of stable ROP, validating its effectiveness in capturing both local and global temporal patterns.
(This article belongs to the Special Issue Digital Intelligence Technology and Applications, 2nd Edition)
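
Per-well z-score standardization and sliding-window sample construction can be sketched in a few lines; the feature layout, window length, and horizon below are assumptions for illustration:

```python
import numpy as np

def make_windows(wells, in_len=64, horizon=1):
    """wells: dict well-id -> (T, F) array, ROP assumed in the last column."""
    X, y = [], []
    for arr in wells.values():
        mu, sd = arr.mean(axis=0), arr.std(axis=0) + 1e-8
        z = (arr - mu) / sd                                # per-well z-score
        for i in range(len(z) - in_len - horizon + 1):
            X.append(z[i:i + in_len])                      # input sequence
            y.append(z[i + in_len + horizon - 1, -1])      # future ROP target
    return np.stack(X), np.array(y)

wells = {w: np.random.randn(500, 6) for w in ("A", "B")}
X, y = make_windows(wells)
print(X.shape, y.shape)                                    # (872, 64, 6) (872,)
```

Standardizing within each well keeps scale differences between wells from leaking into the model, and the sliding window supplies training pairs even near the start of a record.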

21 pages, 10713 KB  
Article
Super Resolution of Satellite-Based Land Surface Temperature Through Airborne Thermal Imaging
by Raniero Beber, Salim Malek and Fabio Remondino
Remote Sens. 2025, 17(22), 3766; https://doi.org/10.3390/rs17223766 - 19 Nov 2025
Viewed by 733
Abstract
Urban heat islands (UHIs) pose a significant threat to public health and urban livability. UHI maps are created using satellite thermal data, a crucial source for Earth monitoring and for delivering mitigation strategies. However, there is still a resolution gap between high-resolution optical data and low-resolution satellite thermal imagery. This study introduces a novel deep learning approach, named Dilated Spatio-Temporal U-Net (DST-UNet), to bridge this gap. DST-UNet is a modified U-Net architecture that incorporates dilated convolutions to address the multiscale nature of urban thermal patterns. The model is trained to generate high-resolution, airborne-like thermal maps from available low-resolution satellite imagery and ancillary data. Our results demonstrate that DST-UNet can effectively generalise across different urban environments, enabling municipalities to generate detailed thermal maps with a frequency far exceeding that of traditional airborne campaigns. This framework leverages open-source data from missions like Landsat to provide a cost-effective and scalable solution for continuous, high-resolution urban thermal monitoring, empowering more effective climate resilience and public health initiatives.
(This article belongs to the Special Issue Remote Sensing for Land Surface Temperature and Related Applications)
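
One plausible way such a model consumes its inputs is to upsample the coarse satellite thermal band to the target grid and stack it with high-resolution ancillary channels; this is a hedged guess at the input pipeline, not the authors' documented preprocessing:

```python
import torch
import torch.nn.functional as F

def assemble_input(lst_lowres, ancillary_hires):
    """lst_lowres: (B, 1, h, w) thermal band; ancillary_hires: (B, C, H, W)."""
    H, W = ancillary_hires.shape[-2:]
    lst_up = F.interpolate(lst_lowres, size=(H, W),
                           mode='bilinear', align_corners=False)
    return torch.cat([lst_up, ancillary_hires], dim=1)     # (B, C + 1, H, W)

lst = torch.randn(2, 1, 32, 32)                            # coarse Landsat-style thermal
anc = torch.randn(2, 4, 256, 256)                          # e.g. optical bands, land cover
print(assemble_input(lst, anc).shape)                      # torch.Size([2, 5, 256, 256])
```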

17 pages, 2692 KB  
Article
MSDTCN-Net: A Multi-Scale Dual-Encoder Network for Skin Lesion Segmentation
by Da Li, Xinyang Wu and Qin Wei
Diagnostics 2025, 15(22), 2924; https://doi.org/10.3390/diagnostics15222924 - 19 Nov 2025
Viewed by 461
Abstract
Background/Objectives: Accurate segmentation of skin lesions is essential for early skin cancer detection. However, traditional CNNs are limited in modeling long-range dependencies, leading to poor performance on lesions with complex shapes. Methods: We propose MSDTCN-Net, a dual-encoder network that integrates ConvNeXt and Deformable Transformer to extract both local details and global semantic information. A Squeeze-and-Excitation (SE) mechanism is introduced to adaptively emphasize important channels. To address scale variation in lesions, we design a Multi-Scale Receptive Field (MSRF) module combining multi-branch and dilated convolutions. Furthermore, a Hierarchical Feature Transfer (HFT) mechanism is employed to guide high-level semantics progressively to shallow layers, enhancing boundary reconstruction in the decoder. Results: Extensive experiments on the ISIC 2016, ISIC 2017, ISIC 2018, and PH2 datasets show that MSDTCN-Net achieves competitive performance across metrics including IoU, Dice, and ACC, validating its effectiveness and generalization in skin lesion segmentation. Conclusions: MSDTCN-Net effectively combines local and global feature extraction, multi-scale adaptability, and semantic guidance to achieve high-accuracy skin lesion segmentation, demonstrating its potential in clinical diagnostic applications.
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
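
The Squeeze-and-Excitation mechanism the abstract refers to is standard (Hu et al., 2018): global-average-pool each channel, pass the result through a bottleneck MLP, and gate the channels with the resulting weights. The reduction ratio below is the common default, not a paper-specific value:

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: pooled statistics -> bottleneck MLP -> channel gates."""
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(ch, ch // reduction), nn.ReLU(inplace=True),
            nn.Linear(ch // reduction, ch), nn.Sigmoid())

    def forward(self, x):                                  # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))                    # squeeze, then excite: (B, C)
        return x * w[:, :, None, None]                     # reweight channels

x = torch.randn(2, 64, 32, 32)
print(SEBlock(64)(x).shape)                                # torch.Size([2, 64, 32, 32])
```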

28 pages, 3335 KB  
Article
MDFA-AconvNet: A Novel Multiscale Dilated Fusion Attention All-Convolution Network for SAR Target Classification
by Jiajia Wang, Jun Liu, Pin Zhang, Qi Jia, Xin Yang, Shenyu Du and Xueyu Bai
Information 2025, 16(11), 1007; https://doi.org/10.3390/info16111007 - 19 Nov 2025
Cited by 1 | Viewed by 395
Abstract
Synthetic aperture radar (SAR) features all-weather and all-day imaging capabilities, long-range detection, and high resolution, making it indispensable for battlefield reconnaissance, target detection, and guidance. In recent years, deep learning has emerged as a prominent approach for the classification of SAR image targets, owing to its hierarchical feature extraction, progressive refinement, and end-to-end learning capabilities. However, challenges such as the high cost of SAR data acquisition and the limited number of labeled samples often result in overfitting and poor model generalization. In addition, conventional convolutional layers typically operate with fixed receptive fields, making it difficult to simultaneously capture multiscale contextual information and dynamically focus on salient target features. To address these limitations, this paper proposes a novel architecture: the Multiscale Dilated Fusion Attention All-Convolution Network (MDFA-AconvNet). The model incorporates a multiscale dilated attention mechanism that significantly broadens the receptive field across varying target scales in SAR images without compromising spatial resolution, thereby enhancing multiscale feature extraction. Furthermore, by introducing both channel attention and spatial attention mechanisms, the model is able to selectively emphasize informative feature channels and spatial regions relevant to target recognition. These attention modules are seamlessly integrated into the All-Convolution Network (A-ConvNet) backbone, resulting in comprehensive performance improvements. Extensive experiments on the MSTAR dataset demonstrate that the proposed MDFA-AconvNet achieves a high classification accuracy of 99.38% across ten target classes, markedly outperforming the original A-ConvNet algorithm. These compelling results highlight the model's robustness against target variations and its significant potential for practical deployment, paving the way for more efficient SAR image classification and recognition systems.
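
The spatial-attention idea mentioned above can be illustrated with the common CBAM-style formulation: pool across channels, convolve the pooled maps, and gate each spatial location. A generic sketch only; MDFA-AconvNet's exact modules are not specified at this level:

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Channel-pooled maps -> conv -> sigmoid gate over spatial positions."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                                  # x: (B, C, H, W)
        avg = x.mean(dim=1, keepdim=True)                  # (B, 1, H, W)
        mx = x.amax(dim=1, keepdim=True)
        gate = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * gate                                    # emphasize salient regions

x = torch.randn(2, 32, 64, 64)
print(SpatialAttention()(x).shape)                         # torch.Size([2, 32, 64, 64])
```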

19 pages, 1786 KB  
Article
Path-Routing Convolution and Scalable Lightweight Networks for Robust Underwater Acoustic Target Recognition
by Yue Zhao, Menghan Chen, Yuchen Lu, Liangliang Cheng, Cheng Chen, Yifei Li and Nizar Faisal Alkayem
Sensors 2025, 25(22), 7007; https://doi.org/10.3390/s25227007 - 17 Nov 2025
Viewed by 480
Abstract
Maritime traffic surveillance and ocean environmental protection urgently require the accurate identification of surface vessel types. Although deep learning methods have significantly improved underwater acoustic target recognition performance, existing models suffer from large parameter counts and fail to adapt to the multi-scale spectral features of radiated noise from different vessel types, restricting their practical deployment on power-constrained underwater sensors. To address these challenges, this paper proposes a novel path-routing convolution mechanism that achieves discriminative extraction of cross-scale acoustic features through multi-dilation-rate parallel paths and an adaptive routing strategy, and designs MobilePR-ConvNet, a unified architecture that adapts a single framework to diverse hardware platforms through systematic width scaling. Experiments on the DeepShip and ShipsEar datasets demonstrate that the proposed method achieves recognition accuracies of 98.58% and 97.82%, respectively, and maintains 77.8% accuracy under low-signal-to-noise-ratio (10 dB) conditions. These results validate its cross-dataset generalization in complex marine environments and provide an effective solution for intelligent deployment on resource-constrained underwater devices.
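
One hedged reading of "multi-dilation-rate parallel paths with an adaptive routing strategy" is a set of dilated branches mixed by input-dependent softmax weights, with a MobileNet-style width multiplier providing the systematic width scaling; the actual PR-Conv design may differ:

```python
import torch
import torch.nn as nn

def scaled(ch, width):
    """Width multiplier for targeting different hardware tiers."""
    return max(8, int(ch * width))

class RoutedMultiDilationConv(nn.Module):
    """Parallel dilated paths mixed by input-dependent routing weights."""
    def __init__(self, ch, rates=(1, 2, 4)):
        super().__init__()
        self.paths = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, padding=r, dilation=r) for r in rates)
        self.router = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                    nn.Linear(ch, len(rates)))

    def forward(self, x):                                  # x: (B, C, H, W)
        w = torch.softmax(self.router(x), dim=1)           # (B, n_paths)
        outs = torch.stack([p(x) for p in self.paths], 1)  # (B, n_paths, C, H, W)
        return (w[:, :, None, None, None] * outs).sum(1)

ch = scaled(64, width=0.25)                                # a small-device variant
x = torch.randn(2, ch, 32, 32)
print(RoutedMultiDilationConv(ch)(x).shape)                # torch.Size([2, 16, 32, 32])
```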

17 pages, 2779 KB  
Article
Image Restoration Based on Semantic Prior Aware Hierarchical Network and Multi-Scale Fusion Generator
by Yapei Feng, Yuxiang Tang and Hua Zhong
Technologies 2025, 13(11), 521; https://doi.org/10.3390/technologies13110521 - 13 Nov 2025
Viewed by 468
Abstract
As a fundamental low-level vision task, image restoration plays a pivotal role in reconstructing authentic visual information from corrupted inputs, directly impacting the performance of downstream high-level vision systems. Current approaches frequently exhibit two critical limitations: (1) progressive texture degradation and blurring during iterative refinement, particularly in irregular damage patterns; and (2) structural incoherence when handling cross-domain artifacts. To address these challenges, we present a semantic-aware hierarchical network (SAHN) that synergistically integrates multi-scale semantic guidance with structural consistency constraints. First, we construct a Dual-Stream Feature Extractor: based on a modified U-Net backbone with dilated residual blocks, this skip-connected encoder–decoder module simultaneously captures hierarchical semantic contexts and fine-grained texture details. Second, we propose a semantic prior mapper that establishes spatial–semantic correspondences between damaged areas and multi-scale features via predefined semantic prototypes and adaptive attention pooling. Additionally, we construct a multi-scale fusion generator that employs cascaded association blocks with structural similarity constraints; this unit progressively aggregates features from different semantic levels using deformable convolution kernels, effectively bridging the gap between global structure and local texture reconstruction. Compared to existing methods, our algorithm attains the highest overall PSNR of 34.99 dB and the best visual authenticity (lowest FID of 11.56). Comprehensive evaluations on three datasets demonstrate its leading performance in restoring visual realism.
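
"Predefined semantic prototypes with adaptive attention pooling" suggests learned prototype vectors that attend over spatial features to produce pooled semantic descriptors; the sketch below is speculative, with all names and sizes my own:

```python
import torch
import torch.nn as nn

class PrototypeAttentionPool(nn.Module):
    """Learned prototypes cross-attend over feature-map positions."""
    def __init__(self, dim, n_prototypes=8):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(n_prototypes, dim))
        self.proj = nn.Linear(dim, dim)

    def forward(self, feat):                               # feat: (B, C, H, W)
        B, C, H, W = feat.shape
        tokens = feat.flatten(2).transpose(1, 2)           # (B, H*W, C)
        q = self.prototypes[None].expand(B, -1, -1)        # (B, P, C)
        attn = torch.softmax(
            q @ self.proj(tokens).transpose(1, 2) / C ** 0.5, dim=-1)
        return attn @ tokens                               # (B, P, C) descriptors

feat = torch.randn(2, 64, 16, 16)
print(PrototypeAttentionPool(64)(feat).shape)              # torch.Size([2, 8, 64])
```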
